ChatGPT解决这个技术问题 Extra ChatGPT

Git workflow and rebase vs merge questions

I've been using Git now for a couple of months on a project with one other developer. I have several years of experience with SVN, so I guess I bring a lot of baggage to the relationship.

I have heard that Git is excellent for branching and merging, and so far, I just don't see it. Sure, branching is dead simple, but when I try to merge, everything goes all to hell. Now, I'm used to that from SVN, but it seems to me that I just traded one sub-par versioning system for another.

My partner tells me that my problems stem from my desire to merge willy-nilly, and that I should be using rebase instead of merge in many situations. For example, here's the workflow that he's laid down:

clone the remote repository
git checkout -b my_new_feature
..work and commit some stuff
git rebase master
..work and commit some stuff
git rebase master
..finish the feature
git checkout master
git merge my_new_feature

Essentially, create a feature branch, ALWAYS rebase from master to the branch, and merge from the branch back to master. Important to note is that the branch always stays local.

Here is the workflow that I started with

clone remote repository
create my_new_feature branch on remote repository
git checkout -b --track my_new_feature origin/my_new_feature
..work, commit, push to origin/my_new_feature
git merge master (to get some changes that my partner added)
..work, commit, push to origin/my_new_feature
git merge master
..finish my_new_feature, push to origin/my_new_feature
git checkout master
git merge my_new_feature
delete remote branch
delete local branch

There are two essential differences (I think): I use merge always instead of rebasing, and I push my feature branch (and my feature branch commits) to the remote repository.

My reasoning for the remote branch is that I want my work backed up as I'm working. Our repository is automatically backed up and can be restored if something goes wrong. My laptop is not, or not as thoroughly. Therefore, I hate to have code on my laptop that's not mirrored somewhere else.

My reasoning for the merge instead of rebase is that merge seems to be standard and rebase seems to be an advanced feature. My gut feeling is that what I'm trying to do is not an advanced setup, so rebase should be unnecessary. I've even perused the new Pragmatic Programming book on Git, and they cover merge extensively and barely mention rebase.

Anyway, I was following my workflow on a recent branch, and when I tried to merge it back to master, it all went to hell. There were tons of conflicts with things that should have not mattered. The conflicts just made no sense to me. It took me a day to sort everything out, and eventually culminated in a forced push to the remote master, since my local master has all conflicts resolved, but the remote one still wasn't happy.

What is the "correct" workflow for something like this? Git is supposed to make branching and merging super-easy, and I'm just not seeing it.

Update 2011-04-15

This seems to be a very popular question, so I thought I'd update with my two years experience since I first asked.

It turns out that the original workflow is correct, at least in our case. In other words, this is what we do and it works:

clone the remote repository
git checkout -b my_new_feature
..work and commit some stuff
git rebase master
..work and commit some stuff
git rebase master
..finish the feature, commit
git rebase master
git checkout master
git merge my_new_feature

In fact, our workflow is a little different, as we tend to do squash merges instead of raw merges. (Note: This is controversial, see below.) This allows us to turn our entire feature branch into a single commit on master. Then we delete our feature branch. This allows us to logically structure our commits on master, even if they're a little messy on our branches. So, this is what we do:

clone the remote repository
git checkout -b my_new_feature
..work and commit some stuff
git rebase master
..work and commit some stuff
git rebase master
..finish the feature, commit
git rebase master
git checkout master
git merge --squash my_new_feature
git commit -m "added my_new_feature"
git branch -D my_new_feature

Squash Merge Controversy - As several commenters have pointed out, the squash merge will throw away all history on your feature branch. As the name implies, it squashes all the commits down into a single one. For small features, this makes sense as it condenses it down into a single package. For larger features, it's probably not a great idea, especially if your individual commits are already atomic. It really comes down to personal preference.

Github and Bitbucket (others?) Pull Requests - In case you're wondering how merge/rebase relates to Pull Requests, I recommend following all the above steps up until you're ready to merge back to master. Instead of manually merging with git, you just accept the PR. Note that this will not do a squash merge (at least not by default), but non-squash, non-fast-forward is the accepted merge convention in the Pull Request community (as far as I know). Specifically, it works like this:

clone the remote repository
git checkout -b my_new_feature
..work and commit some stuff
git rebase master
..work and commit some stuff
git rebase master
..finish the feature, commit
git rebase master
git push # May need to force push
...submit PR, wait for a review, make any changes requested for the PR
git rebase master
git push # Will probably need to force push (-f), due to previous rebases from master
...accept the PR, most likely also deleting the feature branch in the process
git checkout master
git branch -d my_new_feature
git remote prune origin

I've come to love Git and never want to go back to SVN. If you're struggling, just stick with it and eventually you'll see the light at the end of the tunnel.

Unfortunately, the new Pragmstic Programming book is mostly written from using Git while still thinking in SVN, and in this case it has misled you. In Git, rebase keeps things simple when they can be. Your experience could tell you that your workflow doesn't work in Git, not that Git doesn't work.
I would not recommend squash merging in this case, as it saves no information about what is merged (just like svn, but no mergeinfo here).
Love the note at the bottom, I had a similar experience of struggle with Git, but now struggle to imagine not using it. Thanks for the final explanation too, helped a lot with rebase understanding
After you have finish the feature, shouldn't you rebase one last time before you merge new_feature to master?
Your workflow loses all commit history from the deleted branch :(

E
Edward Anderson

TL;DR

A git rebase workflow does not protect you from people who are bad at conflict resolution or people who are used to a SVN workflow, like suggested in Avoiding Git Disasters: A Gory Story. It only makes conflict resolution more tedious for them and makes it harder to recover from bad conflict resolution. Instead, use diff3 so that it's not so difficult in the first place.

Rebase workflow is not better for conflict resolution!

I am very pro-rebase for cleaning up history. However if I ever hit a conflict, I immediately abort the rebase and do a merge instead! It really kills me that people are recommending a rebase workflow as a better alternative to a merge workflow for conflict resolution (which is exactly what this question was about).

If it goes "all to hell" during a merge, it will go "all to hell" during a rebase, and potentially a lot more hell too! Here's why:

Reason #1: Resolve conflicts once, instead of once for each commit

When you rebase instead of merge, you will have to perform conflict resolution up to as many times as you have commits to rebase, for the same conflict!

Real scenario

I branch off of master to refactor a complicated method in a branch. My refactoring work is comprised of 15 commits total as I work to refactor it and get code reviews. Part of my refactoring involves fixing the mixed tabs and spaces that were present in master before. This is necessary, but unfortunately it will conflict with any change made afterward to this method in master. Sure enough, while I'm working on this method, someone makes a simple, legitimate change to the same method in the master branch that should be merged in with my changes.

When it's time to merge my branch back with master, I have two options:

git merge: I get a conflict. I see the change they made to master and merge it in with (the final product of) my branch. Done.

git rebase: I get a conflict with my first commit. I resolve the conflict and continue the rebase. I get a conflict with my second commit. I resolve the conflict and continue the rebase. I get a conflict with my third commit. I resolve the conflict and continue the rebase. I get a conflict with my fourth commit. I resolve the conflict and continue the rebase. I get a conflict with my fifth commit. I resolve the conflict and continue the rebase. I get a conflict with my sixth commit. I resolve the conflict and continue the rebase. I get a conflict with my seventh commit. I resolve the conflict and continue the rebase. I get a conflict with my eighth commit. I resolve the conflict and continue the rebase. I get a conflict with my ninth commit. I resolve the conflict and continue the rebase. I get a conflict with my tenth commit. I resolve the conflict and continue the rebase. I get a conflict with my eleventh commit. I resolve the conflict and continue the rebase. I get a conflict with my twelfth commit. I resolve the conflict and continue the rebase. I get a conflict with my thirteenth commit. I resolve the conflict and continue the rebase. I get a conflict with my fourteenth commit. I resolve the conflict and continue the rebase. I get a conflict with my fifteenth commit. I resolve the conflict and continue the rebase.

You have got to be kidding me if this is your preferred workflow. All it takes is a whitespace fix that conflicts with one change made on master, and every commit will conflict and must be resolved. And this is a simple scenario with only a whitespace conflict. Heaven forbid you have a real conflict involving major code changes across files and have to resolve that multiple times.

With all the extra conflict resolution you need to do, it just increases the possibility that you will make a mistake. But mistakes are fine in git since you can undo, right? Except of course...

Reason #2: With rebase, there is no undo!

I think we can all agree that conflict resolution can be difficult, and also that some people are very bad at it. It can be very prone to mistakes, which why it's so great that git makes it easy to undo!

When you merge a branch, git creates a merge commit that can be discarded or amended if the conflict resolution goes poorly. Even if you have already pushed the bad merge commit to the public/authoritative repo, you can use git revert to undo the changes introduced by the merge and redo the merge correctly in a new merge commit.

When you rebase a branch, in the likely event that conflict resolution is done wrong, you're screwed. Every commit now contains the bad merge, and you can't just redo the rebase*. At best, you have to go back and amend each of the affected commits. Not fun.

After a rebase, it's impossible to determine what was originally part of the commits and what was introduced as a result of bad conflict resolution.

*It can be possible to undo a rebase if you can dig the old refs out of git's internal logs, or if you create a third branch that points to the last commit before rebasing.

Take the hell out of conflict resolution: use diff3

Take this conflict for example:

<<<<<<< HEAD
TextMessage.send(:include_timestamp => true)
=======
EmailMessage.send(:include_timestamp => false)
>>>>>>> feature-branch

Looking at the conflict, it's impossible to tell what each branch changed or what its intent was. This is the biggest reason in my opinion why conflict resolution is confusing and hard.

diff3 to the rescue!

git config --global merge.conflictstyle diff3

When you use the diff3, each new conflict will have a 3rd section, the merged common ancestor.

<<<<<<< HEAD
TextMessage.send(:include_timestamp => true)
||||||| merged common ancestor
EmailMessage.send(:include_timestamp => true)
=======
EmailMessage.send(:include_timestamp => false)
>>>>>>> feature-branch

First examine the merged common ancestor. Then compare each side to determine each branch's intent. You can see that HEAD changed EmailMessage to TextMessage. Its intent is to change the class used to TextMessage, passing the same parameters. You can also see that feature-branch's intent is to pass false instead of true for the :include_timestamp option. To merge these changes, combine the intent of both:

TextMessage.send(:include_timestamp => false)

In general:

Compare the common ancestor with each branch, and determine which branch has the simplest change Apply that simple change to the other branch's version of the code, so that it contains both the simpler and the more complex change Remove all the sections of conflict code other than the one that you just merged the changes together into

Alternate: Resolve by manually applying the branch's changes

Finally, some conflicts are terrible to understand even with diff3. This happens especially when diff finds lines in common that are not semantically common (eg. both branches happened to have a blank line at the same place!). For example, one branch changes the indentation of the body of a class or reorders similar methods. In these cases, a better resolution strategy can be to examine the change from either side of the merge and manually apply the diff to the other file.

Let's look at how we might resolve a conflict in a scenario where merging origin/feature1 where lib/message.rb conflicts.

Decide whether our currently checked out branch (HEAD, or --ours) or the branch we're merging (origin/feature1, or --theirs) is a simpler change to apply. Using diff with triple dot (git diff a...b) shows the changes that happened on b since its last divergence from a, or in other words, compare the common ancestor of a and b with b. git diff HEAD...origin/feature1 -- lib/message.rb # show the change in feature1 git diff origin/feature1...HEAD -- lib/message.rb # show the change in our branch Check out the more complicated version of the file. This will remove all conflict markers and use the side you choose. git checkout --ours -- lib/message.rb # if our branch's change is more complicated git checkout --theirs -- lib/message.rb # if origin/feature1's change is more complicated With the complicated change checked out, pull up the diff of the simpler change (see step 1). Apply each change from this diff to the conflicting file.


How would merging all conflicts at one go work better than individual commits? I already get problems from merging single commits (especially from people who don't break commits into logical portions AND provide sufficient tests for verification). Also, rebase is not any worse than merge when it comes to backup options, intelligent use of interactive rebase and tools like tortoisegit (which allows selection of which commits to include) will help a lot.
I feel like I addressed the reason in #1. If individual commits are not logically consistent, all the more reason to merge the logically consistent branch, so that you can actually make sense of the conflict. If commit 1 is buggy and commit 2 fixes it, merging commit 1 will be confusing. There are legitimate reasons you might get 15 conflicts in a row, like the one I outlined above. Also your argument for rebase not being worse is somewhat unfounded. Rebase mixes bad merges into the original good commits and doesn't leave the good commits around to let you try again. Merge does.
I completely agree with you nilbus. Great post; that clears some things up. I wonder if rerere would be any help here though. Also, thanks for the suggestion on using diff3, I'm definitely going to switch that one on right now.
+1 for telling me about diff3 alone - how often was looking on an incomprehensible conflict cursing whoever is responsible for not telling me what the common ancestor had to say. Thank you very much.
This should have been the accepted answer. The rebase workflow is also horrible because it hides the fact that there was a huge divergence in the codebase at some point in time, which might be useful to know if you want to understand how the code you are looking at was written. Only small branches that don't conflict should be rebased onto master.
J
James Wright

"Conflicts" mean "parallel evolutions of a same content". So if it goes "all to hell" during a merge, it means you have massive evolutions on the same set of files.

The reason why a rebase is then better than a merge is that:

you rewrite your local commit history with the one of the master (and then reapply your work, resolving any conflict then)

the final merge will certainly be a "fast forward" one, because it will have all the commit history of the master, plus only your changes to reapply.

I confirm that the correct workflow in that case (evolutions on common set of files) is rebase first, then merge.

However, that means that, if you push your local branch (for backup reason), that branch should not be pulled (or at least used) by anyone else (since the commit history will be rewritten by the successive rebase).

On that topic (rebase then merge workflow), barraponto mentions in the comments two interesting posts, both from randyfay.com:

A Rebase Workflow for Git: reminds us to fetch first, rebase:

Using this technique, your work always goes on top of the public branch like a patch that is up-to-date with current HEAD.

(a similar technique exists for bazaar)

Avoiding Git Disasters: A Gory Story: about the dangers of git push --force (instead of a git pull --rebase for instance)


For a technique that allows rebasing and sharing, see softwareswirl.blogspot.com/2009/04/…
randyfay.com/node/91 and randyfay.com/node/89 are wonderful reads. these articles made me understand what was worng with my workflow, and what an ideal workflow would be.
just to get it straight, rebasing from the master branch onto your local is basically to update any history your local may have missed that the master has knowledge of after any sort of merge?
@dtan what I describe here is rebasing local on top of master. You not exactly updating the local history, but rather re-applying local history on top of master in order to solve any conflict within the local branch.
A
Alex Gontmakher

In my workflow, I rebase as much as possible (and I try to do it often. Not letting the discrepancies accumulate drastically reduces the amount and the severity of collisions between branches).

However, even in a mostly rebase-based workflow, there is a place for merges.

Recall that merge actually creates a node that has two parents. Now consider the following situation: I have two independent feature brances A and B, and now want to develop stuff on feature branch C which depends on both A and B, while A and B are getting reviewed.

What I do then, is the following:

Create (and checkout) branch C on top of A. Merge it with B

Now branch C includes changes from both A and B, and I can continue developing on it. If I do any change to A, then I reconstruct the graph of branches in the following way:

create branch T on the new top of A merge T with B rebase C onto T delete branch T

This way I can actually maintain arbitrary graphs of branches, but doing something more complex than the situation described above is already too complex, given that there is no automatic tool to do the rebasing when the parent changes.


You could achieve the same with just rebases. The merge is actually not necessary here (except if you don't want to duplicate the commits - but I hardly see that as an argument).
Indeed I don't want to duplicate the commits. I would like to keep the in-flight structure of my work as clean as possible. But that's a matter of personal taste and is not necessarily right for everybody.
I 100% agree with the first paragraph. (@Edward's answer works where that is not the case, but I'd rather have all project in the world work like you suggest). The rest of the answer seems a bit far-fetched in sense that working on C while A and B are in progress is already sort of risky (at least to the extent it really depends on A and B), and even in the end you would not probably keep the merges (C would get rebased on top of the latest & greatest).
o
ololuki

DO NOT use git push origin --mirror UNDER ALMOST ANY CIRCUMSTANCE.

It does not ask if you're sure you want to do this, and you'd better be sure, because it will erase all of your remote branches that are not on your local box.

http://twitter.com/dysinger/status/1273652486


Or don't do things that you're not sure what the result will be? A machine I used to admin had Instructions to this machine may lead to unintended consequences, loss of work/data, or even death (at the hands of the sysad). Remember that you are solely responsible for the consequences of your actions in the MOTD.
use it if you do have a mirrored repo (although in my case it is now executed by a special user at the source repo on post-receive hook)
P
Pat Notz

In your situation I think your partner is correct. What's nice about rebasing is that to the outsider your changes look like they all happened in a clean sequence all by themselves. This means

your changes are very easy to review

you can continue to make nice, small commits and yet you can make sets of those commits public (by merging into master) all at once

when you look at the public master branch you'll see different series of commits for different features by different developers but they won't all be intermixed

You can still continue to push your private development branch to the remote repository for the sake of backup but others should not treat that as a "public" branch since you'll be rebasing. BTW, an easy command for doing this is git push --mirror origin .

The article Packaging software using Git does a fairly nice job explaining the trade offs in merging versus rebasing. It's a little different context but the principals are the same -- it basically comes down to whether your branches are public or private and how you plan to integrate them into the mainline.


The link to Packaging software using git does not work anymore. I could not find a good link to edit the original answer.
You shouldn't mirror to origin, you should mirror to a third dedicated-backup repository.
T
Taylan Aydinli

I have one question after reading your explanation: Could it be that you never did a

git checkout master
git pull origin
git checkout my_new_feature

before doing the 'git rebase/merge master' in your feature branch?

Because your master branch won't update automatically from your friend's repository. You have to do that with the git pull origin. I.e. maybe you would always rebase from a never-changing local master branch? And then come push time, you are pushing in a repository which has (local) commits you never saw and thus the push fails.


P
Peter Mortensen

Anyway, I was following my workflow on a recent branch, and when I tried to merge it back to master, it all went to hell. There were tons of conflicts with things that should have not mattered. The conflicts just made no sense to me. It took me a day to sort everything out, and eventually culminated in a forced push to the remote master, since my local master has all conflicts resolved, but the remote one still wasn't happy.

In neither your partner's nor your suggested workflows should you have come across conflicts that didn't make sense. Even if you had, if you are following the suggested workflows then after resolution a 'forced' push should not be required. It suggests that you haven't actually merged the branch to which you were pushing, but have had to push a branch that wasn't a descendent of the remote tip.

I think you need to look carefully at what happened. Could someone else have (deliberately or not) rewound the remote master branch between your creation of the local branch and the point at which you attempted to merge it back into the local branch?

Compared to many other version control systems I've found that using Git involves less fighting the tool and allows you to get to work on the problems that are fundamental to your source streams. Git doesn't perform magic, so conflicting changes cause conflicts, but it should make it easy to do the write thing by its tracking of commit parentage.


You are implying that the OP has some undiscovered rebase or mistake in his process, right ?
E
Esteban Herrera

"Even if you’re a single developer with only a few branches, it’s worth it to get in the habit of using rebase and merge properly. The basic work pattern will look like:

Create new branch B from existing branch A

Add/commit changes on branch B

Rebase updates from branch A

Merge changes from branch B onto branch A"

https://www.atlassian.com/git/tutorials/merging-vs-rebasing/


P
Pepe

From what I have observed, git merge tends to keep the branches separate even after merging, whereas rebase then merge combines it into one single branch. The latter comes out much cleaner, whereas in the former, it would be easier to find out which commits belong to which branch even after merging.


B
Bombe

With Git there is no “correct” workflow. Use whatever floats your boat. However, if you constantly get conflicts when merging branches maybe you should coordinate your efforts better with your fellow developer(s)? Sounds like the two of you keep editing the same files. Also, watch out for whitespace and subversion keywords (i.e., “$Id$” and others).


W
WesternGun

I only use rebase workflow, because it is visually clearer(not only in GitKraken, but also in Intellij and in gitk, but I recommend the first one most): you have a branch, it originates from the master, and it goes back to master. When the diagram is clean and beautiful, you will know that nothing goes to hell, ever.

https://i.stack.imgur.com/B85dm.png

My workflow is almost the same from yours, but with only one small difference: I squash commits into one in my local branch before rebase my branch onto the latest changes on master, because:

rebase works on basis of each commit

which means, if you have 15 commits changing the same line as master does, you have to check 15 times if you don't squash, but what matters is the final result, right?

So, the whole workflow is:

Checkout to master and pull to ensure that you have the latest version From there, create a new branch Do your work there, you can freely commit several times, and push to remote, no worries, because it is your branch. If someone tells you, "hey, my PR/MR is approved, now it is merged to master", you can fetch them/pull them. You can do it anytime, or in step 6. After doing all your work, commit them, and if you have several commits, squash them(they are all your work, and how many times you change a line of code does not matter; the only important thing is the final version). Push it or not, it doesn't matter. Checkout to master, pull again to ensure you have the latest master in local. Your diagram should be similar to this:

https://i.stack.imgur.com/GntcK.png

As you can see, you are on your local branch, which originates from an outdated status on master, while master(both local and remote) has moved forward with changes of your colleague.

Checkout back to your branch, and rebase to master. You will now have one commit only, so you solve the conflicts only once.(And in GitKraken, you only have to drag your branch on to master and choose "Rebase"; another reason why I like it.) After that, you will be like:

https://i.stack.imgur.com/tCSGE.png

So now, you have all the changes on the latest master, combined with changes on your branch. You can now push to your remote, and, if you have pushed before, you will have to force push; Git will tell you that you cannot simply fast forward. That's normal, because of the rebase, you have changed the start point of your branch. But you should not fear: wisely use the force, but without fear. In the end, the remote is also your branch so you do not affect master even if you do something wrong. Create PR/MR and wait until it is approved, so master will have your contribution. Congrats! So you can now checkout to master, pull your changes, and delete your local branch in order to clean up the diagram. The remote branch should be deleted too, if this is not done when you merge it into master.

The final diagram is clean and clear again:

https://i.stack.imgur.com/CljEO.png