How weld pull and weld push work

How “weld pull” does its stuff

Obviously, the code is the final word on what happens, but this is intended as reasonable background.

Remember, “weld pull” updates its idea of the bases, and then updates the seams in the weld “to match”.

The main code for this is in welded/pull.py

The very short form

We make sure we have a current copy of the base (in .weld/bases), copy across the changes for each seam in that base to our weld, and commit with an X-Weld-State: Merged message.

  • The local copy of the base is updated, but is not otherwise changed.
  • The relevant seams in the weld are updated to match the equivalent directories in the base.

The short form

  • Check it’s safe to do a weld pull
  • Update the copy of the base in .weld/bases
  • Find the last time weld pull or weld push was done for this base - this is the synchronisation point.
  • Determine which seams have been added/removed/changed in comparison to the newly update base. If nothing has changed, then there is nothing to do.
  • If there are deleted seams, delete them in the weld. If there are added seams, add them in the weld, by copying their files across. If there are changed seams, then replay the appropriate changes in the weld.
  • Commit with an X-Weld-State: Merged message.

In detail

So we’re pulling a base

(You can also pull multiple bases at once, by giving multiple base names on the command line, or use weld pull _all to pull all bases, but these both just work by doing this whole sequence for each base in turn. Note that this can be more confusing, for instance if the Nth base requires remedial action to take to “finish” it, at which point you have to fix the problem, do weld finish, and then give the original weld command again to pull the remaining bases.)

(Pulling an individual seam would in theory be possible, but rather fiddly, and of questionable use anyway, so we’ll go with just pulling bases).

Given the name of a base:

  1. Weld checks that there are no local changes in the weld - specifically, it runs git status in the weld’s base directory (the directory containing the .weld directory). If there are any files in the weld that could be added with git add or committed with git commit, then it will refuse to proceed, suggesting that the user commit or stash the changes first.

    It also checks whether:

    • the user is part way through an unfinished weld pull or weld push
    • the weld could be updated with git pull (it looks at the remote repository to determine this)
    • the current branch is a weld-specific branch (starting with weld-).

    and refuses to proceeed if any of those is true (this is essentially what weld status does, so you can do it beforehand as well).

  2. It finds the last time that weld pull or weld push was done for this base, by looking for the most recent commit with an X-Weld-State: Merged <base-name> or X-Weld-State: Pushed <base-name> message. If there isn’t one (i.e., this is the first weld pull or weld push for this weld), then it uses the X-Weld-State: Init commit instead.

    For weld pull we want to know the last time our base was synchronised with the weld as a whole. Since both weld pull and weld push do this, we can use either as the relevant place to work from.

  3. Weld makes sure its copy of the base is up-to-date:

    1. If it doesn’t yet have a clone of the base, it does:

      $ cd .weld/bases
      $ git clone <base-repository> <base>
      $ cd <base>
      $ git pull
      
    2. If it does have a clone of the base, it does:

      $ cd .weld/bases/<base>
      $ git pull
      

    In either case, it notes the HEAD commit of the base.

  4. It determines which (if any) seams have been deleted, changed or added in the weld (with respect to the now up-to-date base). If all of those lists are empty, there is nothing to do, and the weld pull for this base is finished.

  5. It branches the weld. The branch point is the synchronisation commit that was found earlier (the last weld pull or weld push commit, or else the Init commit).

    (The branch name used is chosen to be unique to this repository, and is currently of the form “weld-merge-<commit-id>-<index>”, where <commit-id> is the first 10 characters of the synchronisation commits SHA1 id, and <index> is chosen to make the branch unique in case that is not enough.)

  6. It then:

    • deletes any deleted seams
    • modifies any modified seams
    • adds any added seams

    within that branch.

    Deleting a seam is easy - it just means deleting the appropriate directory.

    Adding a seam just copies the directory structure for that seam across from the base into the correct place in the weld.

    Modifying a seam uses git diff to determine the appropriate changes in the base, and then replays them in the weld.

    Note

    TODO It occurs to me that the technique use in weld push might be more efficient than this last, if it turns out to be usable - I’d need to think on this further. (Tibs)

  7. It writes .weld/complete.py and .weld/abort.py, which can later be used by the weld finish and weld abort commands if necessary (and which will be deleted if the weld pull of this base doesn’t need user interaction).

  8. It merges the original branch (typically master) onto this temporary branch. This will commonly “just work”, but if anything goes wrong, the weld pull stops with a return code of 1 and a message of the form:

    <merge error message>
    Merge failed
    Either fix your merges and then do 'weld finish',
    or do 'weld abort' to give up.
    
  9. If the merge onto the branch succeeded, or if the user fixes problems and then does weld finish, then the complete.py script is run, which:

    1. changes back to the original branch
    2. calculates the difference between this branch and the temporary branch on which we did our merge
    3. applies that patch to this original branch
    4. makes sure that any changed files are added to the git index (it does this over the entire weld, but that should be OK because nothing else should be changing the weld whilst we’re busy)
    5. commits this whole operation using an appropriate X-Weld-State: Merged <base-name> message.
    6. deletes the complete.py and abort.py scripts

    At the moment, this doesn’t delete the temporary/working branch (which will show as a loop if you look in gitk). Future versions of weld may do so as part of the “complete” phase, but during the current active development it’s thought to be useful to leave the branch visible.

  10. If the merge didn’t succeed, and the user chooses to do weld abort, then the abort.py script is run, which:

    • switches back to the original branch
    • deletes the temporary/working branch
    • deletes the complete.py and abort.py scripts

Also note that the weld- branches are always meant to be local to the current repository - they’re not meant to be pushed anywhere else.

Not having those “remotes/origin/weld-” branches

If you do a weld pull and then do a git push of the weld, in general the transient branches will not be propagated to the weld’s remote.

However, if you clone directly from a “checked out” weld (rather than from a bare repository), then by default all branches are cloned, which is (a) untidy, and (b) mak cause future working branches to have the same name as earlier (remote) working branches.

If you have git version 1.7.10 or later, then you can instead clone a “working” weld using:

$ git clone --single-branch <weld-directory>

to retrieve (in this case) just master (or use -b <branch to name a specific branch).

Of course, unfortunately, if you later do a git pull, then the branches will be fetched for you at that stage, so it’s not a perfect solution. But then maybe you shouldn’t clone a “checked out” weld.

How “weld push” works

Obviously, the code is the final word on what happens, but this is intended as reasonable background.

Again, we’re only going to look at doing “weld push” on a single base - the command line will take more than one base name, or the magic _all, but we’ll ignore that here.

The main code for this is in welded/push.py

The very short form

We make sure we have a current copy of the base (in .weld/bases), copy across the changes for each seam in that base from our weld to the base, commit them all as one change with an X-Weld-State: Pushed message, and push to the base’s origin. We also add an empty X-Weld-State: Pushed commit in the weld, as a marker of when the weld push happened.

  • The base is updated to match its seams in the weld, and pushed to its remote.
  • The weld is marked with when the push happened.

The short form

  • Check it’s safe to do a weld push
  • Update the copy of the base in .weld/bases
  • Determine the last weld push for this base
  • For each seam, work out which files have changed (added, removed, changed) in the weld, since that last weld push
  • Use rsync to make the files in (the corresponding directory in) the base match
  • Commit that in the base, and push it to the base’s remote
  • Add a corresponding X-Weld-State: Pushed commit in the weld

Remember that only seams that are currently “named” in the weld are pushed, since they’re the only seams that are of interest “now” - if the user wanted to push changes to a seam that is not currently in use, then they should have done it when it was in use.

In detail

So doing weld push for a given base name works as follows:

  1. Weld checks that there are no local changes in the weld - specifically, it runs git status in the weld’s base directory (the directory containing the .weld directory). If there are any files in the weld that could be added with git add or committed with git commit, then it will refuse to proceed, suggesting that the user commit or stash the changes first.

    It also checks whether:

    • the user is part way through an unfinished weld pull or weld push
    • the weld could be updated with git pull (it looks at the remote repository to determine this)
    • the current branch is a weld-specific branch (starting with weld-).

    and refuses to proceeed if any of those is true (this is essentially what weld status does, so you can do it beforehand as well).

  2. Weld makes sure its copy of the base is up-to-date:

    1. If it doesn’t yet have a clone of the base, it does:

      $ cd .weld/bases
      $ git clone <base-repository> <base>
      $ cd <base>
      $ git pull
      
    2. If it does have a clone of the base, it does:

      $ cd .weld/bases/<base>
      $ git pull
      
  3. It finds the last time that weld push was done for this base, by looking for the most recent commit with an X-Weld-State: Pushed <base-name message. If there isn’t one (i.e., this is the first weld push for this weld), then it uses the X-Weld-State: Init commit instead.

    Why do we use the last weld push, and not the last weld push or weld pull?

    Consider the following “pseudo git log”:

    6   +   change B to a file in base <fred>
    5   o   X-Weld-State: Merged <fred>/...
    4   -   a change to some irrelevant file(s)
    3   +   change A to a file in base <fred>
    2   o   X-Weld-State: Pushed <fred>/...
    1   x   some common commit
    

    So we last did weld pull at commit 5, and the weld thus contains all the changes from base <fred> up to that point.

    However, we last did a weld push at commit 2, which means that changes 3 and 6 have still to be applied to base <fred>. But change 3 is before our last weld pull, so we definitely want the last push.

    Note

    Remember: weld pull updates the base from its remote, and then brings any changes therein into our weld. It does not propagate any changes in the weld back to the base.

  4. Weld looks up the current seams being used for this base. This tells it which directories (in the weld and in the base) it is interested in.

  5. It looks up all of the changes in the weld since the synchronisation point (using git log --one-line) and remembers them.

    If there aren’t any, it has finished.

  6. It trims out any X-Weld-State commits from that list, and remembers it.

    Again, if there are no changes (left), then it has finished.

  7. As an aid during development (so this may go away later on), it tags the synchronisation commit, using a tag name of the form weld-last-<base-name>-sync-<commit-id>, where <commit-id> is the first 10 characters of its SHA1 commit id.

  8. In the base, it branches at the synchronisation point (remember, X-Weld-State commit messages record the equivalent commit id in the base as well), using a branch name of the form “weld-pushing-<commit-id>-<index>”.

  9. We then update the branch in the base:

    For each seam that the base currently has in our weld:

    1. We use git ls-files in the appropriate seam in the weld to find out which files git is managing.
    2. We do the same in the corresponding directory in the base.
    3. For files which the seam (in the weld) has, but the base does not, we do git rm. We commit that change.
    4. For all the other files in the seam, we just copy them over into the base (actually, we use rsync). It is, of course, quite likely that many of them won’t have changed, but that’s OK. Then we git add all of the files we’ve copied, in the base, and commit that change as well.

    Note

    Any seams that are not in the weld are, by definition, not of interest to us. Even if they were in the weld at the last synchronisation point, the fact they aren’t now means we are not interested in any (possible) intermediate changes - if we cared about such changes, we should have done weld push then.

  10. We prepare a final commit message, and write out the .weld/complete.py and .weld/abort.py files.

  11. We run the complete.py file to finish off our weld push. This:

    • sets the merging indicator (touches a file in the .weld/pushing directory)

    • in the base, if the merging indicator is not set, merges the original branch (normally “master”) into our working branch - this should just proceed with no problems

    • still in the base, merges that back into the original branch

    • still in the base, commits the change using the saved commit message

      The commit message has a summary of the corresponding commits from the weld, as output by git log --one-line.

      If the user specified weld push -edit, then they get the chance to edit the message before it is used.

    • notes the new HEAD commit id in the base

    • adds an X-Weld-State: Pushed <base-name>/<commit-id> commit to the weld, using that commit id (this is, of course, notionally an empty commit).

    • deletes the merging indicator and the saved commit message.

If something did go wrong, then weld finish just does that last item again (which is why we need the merging indicator). weld abort deletes the working branch, and then deletes the merging indicator and saved commit message.