Update Git submodule to latest commit on origin

I have a project with a Git submodule. It is from an ssh://... URL, and is on commit A. Commit B has been pushed to that URL, and I want the submodule to retrieve the commit, and change to it.

Now, my understanding is that git submodule update should do this, but it doesn't. It doesn't do anything (no output, success exit code). Here's an example:

$ mkdir foo
$ cd foo
$ git init .
Initialized empty Git repository in /.../foo/.git/
$ git submodule add ssh://user@host/git/mod mod
Cloning into mod...
user@host's password: hunter2
remote: Counting objects: 131, done.
remote: Compressing objects: 100% (115/115), done.
remote: Total 131 (delta 54), reused 0 (delta 0)
Receiving objects: 100% (131/131), 16.16 KiB, done.
Resolving deltas: 100% (54/54), done.
$ git commit -m "Hello world."
[master (root-commit) 565b235] Hello world.
 2 files changed, 4 insertions(+), 0 deletions(-)
 create mode 100644 .gitmodules
 create mode 160000 mod
# At this point, ssh://user@host/git/mod changes; submodule needs to change too.
$ git submodule init
Submodule 'mod' (ssh://user@host/git/mod) registered for path 'mod'
$ git submodule update
$ git submodule sync
Synchronizing submodule url for 'mod'
$ git submodule update
$ man git-submodule 
$ git submodule update --rebase
$ git submodule update
$ echo $?
0
$ git status
# On branch master
nothing to commit (working directory clean)
$ git submodule update mod
$ ...

I've also tried git fetch mod, which appears to do a fetch (but can't possibly, because it's not prompting for a password!), but git log and git show deny the existence of new commits. Thus far I've just been rm-ing the module and re-adding it, but this is both wrong in principle and tedious in practice.

David Z's answer seems like the better way of doing this - now that Git has the functionality you need built in via the --remote option, perhaps it would be useful to mark that as the accepted answer rather than the "by hand" approach in Jason's answer?
I'm agreeing highly with @MarkAmery. While Jason gave a working solution, it isn't the intended way to do it, as it leaves the submodule's commit pointer at the wrong commit identifier. The new --remote is definitively a better solution at this point in time, and since this question has been linked to from a Github Gist about submodules, I feel it would be better for incoming readers to see the new answer.
Nice touch with the hunter2 password :o)

Melebius

The git submodule update command actually tells Git that you want your submodules to each check out the commit already specified in the index of the superproject. If you want to update your submodules to the latest commit available from their remote, you will need to do this directly in the submodules.

So in summary:

# Get the submodule initially
git submodule add ssh://bla submodule_dir
git submodule init

# Time passes, submodule upstream is updated
# and you now want to update

# Change to the submodule directory
cd submodule_dir

# Checkout desired branch
git checkout master

# Update
git pull

# Get back to your project root
cd ..

# Now the submodules are in the state you want, so
git commit -am "Pulled down update to submodule_dir"

Or, if you're a busy person:

git submodule foreach git pull origin master

git submodule foreach git pull
@Nicklas In that case, use git submodule foreach git pull origin master.
At this point, with all these corrections to the corrections, I need someone to write an explanatory blog post and point me there. Please.
minor improvement to the 'foreach' approach - you may want to add --recursive in there in case you have submodules within submodules. so: git submodule foreach --recursive git pull origin master.
What if each git submodule has a different default branch?
Peter Mortensen

Git 1.8.2 features a new option, --remote, that will enable exactly this behavior. Running

git submodule update --remote --merge

will fetch the latest changes from upstream in each submodule, merge them in, and check out the latest revision of the submodule. As the documentation puts it:

--remote This option is only valid for the update command. Instead of using the superproject’s recorded SHA-1 to update the submodule, use the status of the submodule’s remote-tracking branch.

This is equivalent to running git pull in each submodule, which is generally exactly what you want.
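Note that --remote moves the submodule's checkout, but the superproject still has to record the new commit with git add and git commit. A minimal sketch of both steps for a single submodule follows. The function name update_submodule_to_remote is made up for illustration, and it assumes you call it from the superproject root with the submodule path (no trailing slash) as its only argument.

```shell
# Hypothetical helper: bring one submodule to the tip of its tracked
# remote branch, then record that commit in the superproject.
update_submodule_to_remote() (
    path=$1

    # Fetch the submodule's remote-tracking branch and merge it in (Git 1.8.2+).
    git submodule update --remote --merge "$path" || exit 1

    # Stage the new submodule commit (the "pointer") in the superproject.
    git add -- "$path" || exit 1

    # Commit only when the recorded commit actually changed.
    if git diff --cached --quiet -- "$path"; then
        echo "$path is already up to date"
    else
        git commit -q -m "Update $path to latest upstream commit" -- "$path"
    fi
)
```

Usage would be something like update_submodule_to_remote mod, followed by git push when you are happy with the result.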


"equivalent to running git pull in each submodule" To clarify, there is no difference (from the user's perspective) between your answer and git submodule foreach git pull?
I wish I could upvote this 10,000X. Why isn't this shown in git's documentation anywhere? Huge oversight.
For me they actually differed quite significantly; foreach git pull only checked them out, but did not update the pointer of the main repo to point to the newer commit of the submodule. Only with --remote it made it point to the latest commit.
why the --merge option? What difference does it makes?
Nowadays, with a mixture of repos using a master or a main branch, git submodule foreach git pull origin master will fail. Therefore git submodule update --remote is the better solution.
Peter Mortensen

In your project parent directory, run:

git submodule update --init

Or if you have recursive submodules run:

git submodule update --init --recursive

Sometimes this still doesn't work, because somehow you have local changes in the local submodule directory while the submodule is being updated.

Most of the time the local change is not one you want to commit. It can happen because of a file deletion in your submodule, for example. If so, reset the local submodule directory, then run this again from your project parent directory:

git submodule update --init --recursive
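Before resetting anything, it can help to see which submodules actually have local changes. Here is a small sketch; list_dirty_submodules is a name I made up, and it only looks at the first level of submodules. It relies on the $sm_path variable that git submodule foreach sets for the command it runs.

```shell
# Hypothetical helper: print the path of every initialized submodule
# whose working tree has local changes (untracked files count too).
list_dirty_submodules() {
    # foreach runs the quoted command inside each submodule;
    # "git status --porcelain" prints nothing for a clean tree.
    git submodule foreach --quiet \
        'test -z "$(git status --porcelain)" || echo "$sm_path"'
}
```

Run it from the project parent directory. For each path it prints, you can inspect the changes and, if they are disposable, discard them with git reset --hard inside that submodule.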

this is the true answer. can i push it to my remote repository somehow?
This works for new submodules! I could update all the others but the folder of new submodules would remain empty until I ran this command.
It doesn't pull changes for existing submodules
This will clone the submodules, but only to the commit specified in the main repo. You need to cd into the submodule folder and run git pull origin <branch_name> to get the latest commit, after running git submodule update --init
Peter Mortensen

Your main project points to a particular commit that the submodule should be at. git submodule update tries to check out that commit in each submodule that has been initialized. The submodule is really an independent repository - just creating a new commit in the submodule and pushing that isn't enough. You also need to explicitly add the new version of the submodule in the main project.

So, in your case, you should find the right commit in the submodule - let's assume that's the tip of master:

cd mod
git checkout master
git pull origin master

Now go back to the main project, stage the submodule and commit that:

cd ..
git add mod
git commit -m "Updating the submodule 'mod' to the latest version"

Now push your new version of the main project:

git push origin master

From this point on, if anyone else updates their main project, then git submodule update for them will update the submodule, assuming it's been initialized.
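To see the "pointer" described above for yourself, you can compare the commit recorded in the main project with the commit the submodule currently has checked out. A minimal sketch, where submodule_pointer_status is a hypothetical helper name and the comparison is against the main project's last commit (HEAD) rather than its index:

```shell
# Hypothetical helper: show the commit the superproject's HEAD records
# for a submodule next to the commit the submodule has checked out.
# Succeeds (exit 0) only when the two match.
submodule_pointer_status() (
    path=$1

    # The gitlink entry stored in the superproject's last commit.
    recorded=$(git rev-parse "HEAD:$path") || exit 1

    # The commit currently checked out inside the submodule.
    checked_out=$(git -C "$path" rev-parse HEAD) || exit 1

    echo "recorded:    $recorded"
    echo "checked out: $checked_out"
    [ "$recorded" = "$checked_out" ]
)
```

After git pull inside mod the two lines differ until you run git add mod and commit in the main project. git submodule status reports a similar mismatch (against the index) with a leading +.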


Peter Mortensen

It seems like two different scenarios are being mixed together in this discussion:

Scenario 1

Using my parent repository's pointers to submodules, I want to check out the commit in each submodule that the parent repository is pointing to, possibly after first iterating through all submodules and updating/pulling these from remote.

This is, as pointed out, done with

git submodule foreach git pull origin BRANCH
git submodule update

Scenario 2, which I think is what OP is aiming at

New stuff has happened in one or more submodules, and I want to 1) pull these changes and 2) update the parent repository to point to the HEAD (latest) commit of this/these submodules.

This would be done by

git submodule foreach git pull origin BRANCH
git add module_1_name
git add module_2_name
......
git add module_n_name
git push origin BRANCH

Not very practical, since you would have to hardcode n paths to all n submodules in e.g. a script to update the parent repository's commit pointers.

It would be cool to have an automated iteration through each submodule, updating the parent repository pointer (using git add) to point to the head of the submodule(s).

For this, I made this small Bash script:

git-update-submodules.sh

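#!/bin/bash

# First argument: path to the parent repository
APP_PATH=$1
shift

if [ -z "$APP_PATH" ]; then
  echo "Missing 1st argument: should be path to folder of a git repo"
  exit 1
fi

# Second argument: branch to update in the parent and in every submodule
BRANCH=$1
shift

if [ -z "$BRANCH" ]; then
  echo "Missing 2nd argument (branch name)"
  exit 1
fi

echo "Working in: $APP_PATH"
# Stop here if the directory does not exist, rather than running git elsewhere
cd "$APP_PATH" || exit 1
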
git checkout $BRANCH && git pull --ff origin $BRANCH

git submodule sync
git submodule init
git submodule update
git submodule foreach "(git checkout $BRANCH && git pull --ff origin $BRANCH && git push origin $BRANCH) || true"

for i in $(git submodule foreach --quiet 'echo $path')
do
  echo "Adding $i to root repo"
  git add "$i"
done

git commit -m "Updated $BRANCH branch of deployment repo to point to latest head of submodules"
git push origin $BRANCH

To run it, execute

git-update-submodules.sh /path/to/base/repo BRANCH_NAME

Elaboration

First of all, I assume that the branch with name $BRANCH (second argument) exists in all repositories. Feel free to make this even more complex.

The first couple of sections check that the arguments are there. Then I pull the parent repository's latest stuff (I prefer to use --ff (fast-forwarding) whenever I'm just doing pulls. I have rebase off, BTW).

git checkout $BRANCH && git pull --ff origin $BRANCH

Then some submodule initializing, might be necessary, if new submodules have been added or are not initialized yet:

git submodule sync
git submodule init
git submodule update

Then I update/pull all submodules:

git submodule foreach "(git checkout $BRANCH && git pull --ff origin $BRANCH && git push origin $BRANCH) || true"

Notice a few things: First of all, I'm chaining some Git commands using && - meaning previous command must execute without error.

After a possible successful pull (if new stuff was found on the remote), I do a push to ensure that a possible merge-commit is not left behind on the client. Again, it only happens if a pull actually brought in new stuff.

Finally, the final || true is ensuring that script continues on errors. To make this work, everything in the iteration must be wrapped in the double-quotes and the Git commands are wrapped in parentheses (operator precedence).

My favourite part:

for i in $(git submodule foreach --quiet 'echo $path')
do
  echo "Adding $i to root repo"
  git add "$i"
done

Iterate all submodules - with --quiet, which removes the 'Entering MODULE_PATH' output. Using 'echo $path' (must be in single-quotes), the path to the submodule gets written to output.

This list of relative submodule paths is captured with command substitution ($(...)) and split on whitespace; the loop then iterates over it and runs git add $i to update the parent repository.

Finally, a commit with some message explaining that the parent repository was updated. If nothing changed, git commit has nothing to record and no commit is created. Push this to origin, and you're done.

I have a script running this in a Jenkins job that chains to a scheduled automated deployment afterwards, and it works like a charm.

I hope this will be of help to someone.
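The for loop above splits on whitespace, so a submodule path containing a space breaks it. A possible variant, sketched here under the assumption that paths contain no newlines, reads one path per line instead. The function name stage_all_submodules is made up, and it uses $sm_path, which is the current name for what the script above calls $path.

```shell
# Hypothetical helper: stage every initialized submodule in the parent
# repository, reading one path per line so spaces in paths survive.
stage_all_submodules() {
    git submodule foreach --quiet 'echo "$sm_path"' |
    while IFS= read -r sm; do
        echo "Adding $sm to root repo"
        git add -- "$sm"
    done
}
```

Call it from the parent repository root in place of the for loop; the commit and push steps stay the same.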


We're using scripts akin to yours; one note: instead of git submodule foreach --quiet 'echo $path' we use git submodule foreach --recursive --quiet pwd inside the for loops. The pwd command prints the proper 'absolute path' for each submodule present; --recursive ensures we visit all submodules, including the submodules-within-submodules-... that may be present in a large project. Both methods cause trouble with directories which include spaces, e.g. /c/Users/Ger/Project\ Files/... hence policy is to never use whitespace anywhere in our projects.
This is nice, and you're right that there's a misunderstanding in some answers about what the question even is, but as pointed out by David Z's excellent answer, your script is unnecessary since the functionality has been built into Git since mid-2013 when they added the --remote option. git submodule update --remote behaves approximately the way that your script does.
@GerHobbelt Thanks. You are right, we have only 1 level of submodules, so I never thought to make it recursive. I won't update the script before I've had a chance to verify it works as expected, but my script would definitely ignore sub-sub-modules. As to spaces in folders, this definitely sounds like something to avoid! :S
@MarkAmery Thanks for your feedback. I see 1 issue, however: not by-argument being able to specify branch for submodules. From git manual: The remote branch used defaults to master, but the branch name may be overridden by setting the submodule.<name>.branch option in either .gitmodules or .git/config (with .git/config taking precedence). I don't want to edit .gitmodules nor .git/config every time I want to do this to another branch than master. But maybe I have missed something? Also, the method seems to enforce recursive merges (thus missing the possibility of a fast-forward).
Last thing: I tried @DavidZ's method, and it doesn't seem to do the exact thing, I set out to do (and which op was asking about): Adding the HEAD commit of submodules to parent (i.e. "updating the pointer"). It does, however, seem to do the sole job very well (and faster) of fetching and merging latest changes in all submodules. Alas, by default only from master branch (unless you edit the .gitmodules file (see above)).
VonC

Note, while the modern form of updating submodule commits would be:

git submodule update --recursive --remote --merge --force

The older form was:

git submodule foreach --quiet git pull --quiet origin

Except... this second form is not really "quiet".

See commit a282f5a (12 Apr 2019) by Nguyễn Thái Ngọc Duy (pclouds).
(Merged by Junio C Hamano -- gitster -- in commit f1c9f6c, 25 Apr 2019)

submodule foreach: fix "<command> --quiet" not being respected

Robin reported that

git submodule foreach --quiet git pull --quiet origin

is not really quiet anymore. It should be quiet before fc1b924 (submodule: port submodule subcommand 'foreach' from shell to C, 2018-05-10, Git v2.19.0-rc0) because parseopt can't accidentally eat options then.

"git pull" behaves as if --quiet is not given. This happens because parseopt in submodule--helper will try to parse both --quiet options as if they are foreach's options, not git-pull's. The parsed options are removed from the command line. So when we do pull later, we execute just this

git pull origin

When calling submodule helper, adding "--" in front of "git pull" will stop parseopt for parsing options that do not really belong to submodule--helper foreach.

PARSE_OPT_KEEP_UNKNOWN is removed as a safety measure. parseopt should never see unknown options or something has gone wrong. There are also a couple usage string update while I'm looking at them.

While at it, I also add "--" to other subcommands that pass "$@" to submodule--helper. "$@" in these cases are paths and less likely to be --something-like-this. But the point still stands, git-submodule has parsed and classified what are options, what are paths. submodule--helper should never consider paths passed by git-submodule to be options even if they look like one.

And Git 2.23 (Q3 2019) fixes another issue: "git submodule foreach" did not protect command line options passed to the command to be run in each submodule correctly, when the "--recursive" option was in use.

See commit 30db18b (24 Jun 2019) by Morian Sonnet (momoson).
(Merged by Junio C Hamano -- gitster -- in commit 968eecb, 09 Jul 2019)

submodule foreach: fix recursion of options

Calling: git submodule foreach --recursive --

Note that, before Git 2.29 (Q4 2020), "git submodule update --quiet" did not squelch underlying "rebase" and "pull" commands.

See commit 3ad0401 (30 Sep 2020) by Theodore Dubois (tbodt).
(Merged by Junio C Hamano -- gitster -- in commit 300cd14, 05 Oct 2020)

submodule update: silence underlying merge/rebase with "--quiet" Signed-off-by: Theodore Dubois

Commands such as

$ git pull --rebase --recurse-submodules --quiet

produce non-quiet output from the merge or rebase. Pass the --quiet option down when invoking "rebase" and "merge".

Also fix the parsing of git submodule update -v.

When e84c3cf3 ("git-submodule.sh: accept verbose flag in cmd_update to be non-quiet", 2018-08-14, Git v2.19.0-rc0 -- merge) taught "git submodule update" to take "--quiet", it apparently did not know how ${GIT_QUIET:+--quiet} works, and reviewers seem to have missed that setting the variable to "0", rather than unsetting it, still results in "--quiet" being passed to underlying commands.


Peter Mortensen

Plain and simple, to fetch the submodules:

git submodule update --init --recursive

And now proceed updating them to the latest master branch (for example):

git submodule foreach git pull origin master

Anonymous

git pull --recurse-submodules

This will pull all the latest commits.


Jobin James

This works for me to update to the latest commits

git submodule update --recursive --remote --init


This question already has a lot of similar, though not identical, answers. It would help if you could explain how yours improves on what's been said here already.
noseratio

In my case, I wanted git to update to the latest and at the same time re-populate any missing files.

The following restored the missing files (thanks to --force which doesn't seem to have been mentioned here), but it didn't pull any new commits:

git submodule update --init --recursive --force

This did:

git submodule update --recursive --remote --merge --force


Peter Mortensen

If you don't know the host branch, do this:

git submodule foreach git pull origin $(git rev-parse --abbrev-ref HEAD)

This gets the current branch of the main Git repository and then, for each submodule, pulls that same branch.
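The command above only works when every submodule has a branch with the same name as the main repository's. If the submodules use different default branches (some master, some main), one option is to ask each submodule which branch its origin/HEAD points to. This is only a sketch: pull_default_branches is a made-up name, and it assumes each submodule was cloned normally so that refs/remotes/origin/HEAD exists.

```shell
# Hypothetical helper: in each submodule, pull whatever branch the
# remote "origin" advertises as its default.
pull_default_branches() {
    git submodule foreach --quiet '
        # Prints e.g. "origin/main"; fails if origin/HEAD was never set.
        default=$(git symbolic-ref --short refs/remotes/origin/HEAD) || exit 1
        # Strip the leading "origin/" to get the plain branch name.
        git pull --quiet origin "${default#origin/}"
    '
}
```

As with the other foreach approaches, you still need to git add the submodules and commit in the main repository afterwards.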


Peter Mortensen

@Jason is correct in a way but not entirely.

update Update the registered submodules, i.e. clone missing submodules and checkout the commit specified in the index of the containing repository. This will make the submodules HEAD be detached unless --rebase or --merge is specified or the key submodule.$name.update is set to rebase or merge.

So, git submodule update does checkout, but it is to the commit in the index of the containing repository. It does not yet know of the new commit upstream at all. So go to your submodule, get the commit you want and commit the updated submodule state in the main repository and then do the git submodule update.


It seems that if I move the submodule to a different commit, and then run git submodule update, update will move the submodule to the commit that is specified in the current HEAD of the superproject. (whatever the most recent commit in the superproject says the subproject should be at — this behavior, after the explanation in Jason's post, seems logical to me) It also appears to fetch, but only in the case that the subproject is on the wrong commit, which was adding to my confusion.
Mohsin Mahmood

If you are looking to checkout master branch for each submodule -- you can use the following command for that purpose:

git submodule foreach git checkout master

Friedrich

For me, none of the git submodule commands worked. But this worked:

cd <path/to/submodule>
git pull

It downloads and thus updates the third party repo. Then

cd <path/to/repo>
git commit -m "update latest version" <relative_path/to/submodule>
git push

which updates your remote repo (with the link to the last commit repo@xxxxxx).


dustinrwh

Here's an awesome one-liner to update everything to the latest on master:

git submodule foreach 'git fetch origin --tags; git checkout master; git pull' && git pull && git submodule update --init --recursive

Thanks to Mark Jaquith


Oleg Kokorin

The simplest way to handle Git projects containing submodules is to always add

--recurse-submodules

at the end of each git command. Example:

git fetch --recurse-submodules

Another:

git pull --recurse-submodules

etc...
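If typing the flag every time gets tedious, Git 2.14 and later have a submodule.recurse setting that turns --recurse-submodules on by default for the commands that support it (pull, fetch, checkout and a few others, but not clone). A minimal sketch, with enable_submodule_recursion being a name I made up:

```shell
# Hypothetical helper: make supported commands in the current
# repository behave as if --recurse-submodules had been passed.
enable_submodule_recursion() {
    git config submodule.recurse true
}
```

Run it once inside the repository; use git config --global instead if you want the same behaviour everywhere.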