I’m heavily involved in the Enterprise Web Library open-source project, and the other contributors and I sometimes run into situations in which we need to merge two complex branches together. This happens mainly because we don’t always have the time to promptly review and merge each other’s changes; sometimes a contributor will write some code that won’t make it into our canonical repo for several months. But anyway, when merging, we try to always avoid “criss-cross” merges (i.e. merges in which the two branches being merged have multiple Greatest Common Ancestors) since these can result in changes being accidentally undone and other bad things.

After toiling away for the better part of a week, I’ve created a Mercurial revsets alias called nextMerge that takes in two branch heads (merge destination and merge source) and gives you the next ancestor of source that you should merge into destination. It will just return source itself if there is no criss-crossing. If there is criss-crossing, it will tell you to first merge each parent of a merge changeset that would cause a criss-cross, and then it will tell you to actually do the criss-cross merge, but at this point it will be what I call a “no-op” merge, meaning that it should not change destination in any way since all you’re doing is merging in a merge changeset whose parents (i.e. the actual code changes) have already been merged. This no-op merge reduces the number of Greatest Common Ancestors for the original desired merge, and when we get down to just a single GCA, we can safely proceed with the original merge.

Complicated explanation, I know. Drop me a line if you want to know more. But to use nextMerge, just drop the code below into your Mercurial configuration file:

# The first parameter is the merge destination. The second is the set of changesets that you'd ultimately like to merge.
# If multiple changesets are returned, recursion is required. Please rerun with the head changeset as the second argument.
# Implementing this within Mercurial or as an extension would be better because we'd be able to automatically perform the recursion.
nextMerge($1,$2) = nextMergeForSingleHead( $1, oldest( heads( $2 – ancestors($1) ) ) )
# In the following aliases, each parameter should be a single changeset.
# only one term at a time should return results
nextMergeForSingleHead($1,$2) = headMerge($1,$2) | safeMergeCandidateAndAncestors($1,$2) | noOpMerge($1,$2)
# the unmerged head, if only a single GCA exists
headMerge($1,$2) = $2 & firstGcaChangesets($1,$2)
safeMergeCandidateAndAncestors($1,$2) = ancestors( safeMergeCandidate($1,$2) ) & unmergedAncestors($1,$2)
# the oldest unmerged parent of the oldest multi-GCA changeset
safeMergeCandidate($1,$2) = oldest( unmergedAncestors($1,$2) & parents( oldestMultiGcaChangeset($1,$2) ) ) & noOpMergeMask($1,$2)
# The oldest unmerged changeset that descends from the first GCA and one or more other GCAs. Will always be a merge changeset.
oldestMultiGcaChangeset($1,$2) = oldest( ( unmergedAncestors($1,$2) & descendants( firstGca($1,$2) ) ) – firstGcaChangesets($1,$2) )
# the changesets that descend from the first GCA and not from any others
firstGcaChangesets($1,$2) = descendants( firstGca($1,$2) ) – descendants( greatestCommonAncestors($1,$2) – firstGca($1,$2) )
noOpMergeMask($1,$2) = not descendants( ancestors( noOpMergeChangesets($1,$2) ) )
# the oldest head of the no-op merge changesets
noOpMerge($1,$2) = oldest( heads( noOpMergeChangesets($1,$2) ) )
# the GCA-child merge changesets that, when merged into the destination, will eliminate GCAs without changing anything
noOpMergeChangesets($1,$2) = gcaChildMergeChangesets($1,$2) – descendants( unmergedAncestors($1,$2) – merge() )
# the unmerged merge changesets that are children of a GCA
gcaChildMergeChangesets($1,$2) = unmergedAncestors($1,$2) & merge() & children( greatestCommonAncestors($1,$2) )
# The first GCA of the oldest unmerged child of a GCA. If the oldest unmerged child of a GCA is a merge with two GCA parents, one is chosen arbitrarily.
firstGca($1,$2) = first( parents( oldest( children( greatestCommonAncestors($1,$2) ) & unmergedAncestors($1,$2) ) ) & greatestCommonAncestors($1,$2) )
unmergedAncestors($1,$2) = ancestors($2) – ancestors($1)
oldest($1) = first( sort($1,rev) )
# Similar to the *ancestor* predicate, but not limited to one result. Each parameter should be a single changeset.
greatestCommonAncestors($1,$2) = heads( ancestors($1) & ancestors($2) )
# Rationale:
# Merging algorithms depend on finding the Greatest Common Ancestor (GCA) of the two changesets being merged, so they'll work most predictably when this is not
# ambiguous. This means that, in general, we should avoid these "criss-cross" merges. When we cannot avoid them, we should first perform merges into the
# destination that get us in a position to perform a criss-cross merge that consists solely of merge changesets (i.e. a "no-op" merge). We can repeat this
# process until only a single GCA remains, at which point we can safely complete the original merge.
# References:
# http://codicesoftware.blogspot.com/2011/09/merge-recursive-strategy.html
# http://article.gmane.org/gmane.comp.version-control.mercurial.general/29523

view raw
hosted with ❤ by GitHub

Happy merging!