Rsync includes a nifty option --cvs-exclude to “ignore files in the same way CVS does”, but CVS has been obsolete for years. Is there any way to make it also exclude files which would be ignored by modern version control systems (Git, Mercurial, Subversion)?
For example, I have lots of Maven projects checked out from GitHub. Typically they include a .gitignore listing at least target, the default Maven build directory (which may be present at top level or in submodules). Since the contents of these directories are entirely disposable, and they can be far larger than source code, I would like to exclude them when using rsync for backups.
Of course I can explicitly --exclude=target/ but that will accidentally suppress unrelated directories that just happen to be named target and are not supposed to be ignored.
And I could supply a complete list of absolute paths for all file names and patterns mentioned in any .gitignore, .hgignore, or svn:ignore property on my disk, but this would be a huge list that would have to be produced by some sort of script.
Since rsync has no built-in support for VCS checkouts other than CVS, is there any good trick for feeding it their ignore patterns? Or some kind of callback system whereby a user script can be asked whether a given file/directory should be included or not?
Update: --filter=':- .gitignore' as suggested by LordJavac seems to work as well for Git as --filter=:C does for CVS, at least on the examples I have found, though it is unclear if the syntax is an exact match. --filter=':- .hgignore' does not work very well for Mercurial; e.g. an .hgignore containing a line like ^target$ (the Mercurial equivalent of Git /target/) is not recognized by rsync as a regular expression. And nothing seems to work for Subversion, for which you would have to parse .svn/dir-prop-base for a 1.6 or earlier working copy, and throw up your hands in dismay for a 1.7 or later working copy.
What does :- mean exactly? What does the colon mean? What about the dash?
The : represents dir-merge (useful if you have .gitignore files deep in the folder tree) and the - represents exclude (a filter can also include).
As mentioned by luksan, you can do this with the --filter switch to rsync. I achieved this with --filter=':- .gitignore' (there's a space before ".gitignore") which tells rsync to do a directory merge with .gitignore files and have them exclude per git's rules. You may also want to add your global ignore file, if you have one. To make it easier to use, I created an alias to rsync which included the filter.
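A minimal sketch of such a wrapper, written as a shell function rather than an alias so that it can take arguments. The name rsync_gitignore is hypothetical and not taken from the answer; the only rsync feature it relies on is the --filter rule quoted above.

```shell
# Hypothetical wrapper name; everything after it is passed straight to rsync.
rsync_gitignore() {
    # ':' = per-directory merge, '-' = treat each line of the file as an exclude,
    # so every .gitignore met during the traversal contributes exclude patterns.
    rsync --filter=':- .gitignore' "$@"
}
```

It would be called like plain rsync, for example rsync_gitignore -a SRC/ DEST/. Patterns that rsync does not interpret the way Git does (such as ! negations) are not handled by this sketch.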
You can use git ls-files to build the list of files excluded by the repository's .gitignore files. https://git-scm.com/docs/git-ls-files
Options:
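--exclude-standard Apply the standard Git exclusions: .git/info/exclude, the .gitignore in each directory, and the user's global exclusion file.
-o Show other (i.e. untracked) files in the output.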
-i Only output ignored files.
--directory Only output the directory path if the entire directory is ignored.
The only thing I left to ignore was .git.
rsync -azP --exclude=.git --exclude-from=<(git -C <SRC> ls-files --exclude-standard -oi --directory) <SRC> <DEST>
rsync -azP --exclude-from="$(git -C SRC ls-files --exclude-standard -oi --directory > /tmp/excludes; echo /tmp/excludes)" SRC DEST
.gitignore (i.e. lines that start with !). It also rsyncs files that you --force added to your repo, which is usually a good thing.
After hours of research I found exactly what I need: to sync the destination folder with the source folder (also deleting files in the destination if they were deleted in the source), without copying the files ignored by .gitignore to the destination, but also without deleting those files in the destination:
rsync -vhra /source/project/ /destination/project/ --include='**.gitignore' --exclude='/.git' --filter=':- .gitignore' --delete-after
In other words, this command completely ignores the files matched by .gitignore, both in the source and in the destination. You can omit the --exclude='/.git' part if you want to copy the .git folder too.
You MUST copy the .gitignore files from the source. If you use LordJavac's command, the .gitignore will not be copied. And if you create a file in the destination folder that should be ignored by .gitignore, this file will be deleted despite .gitignore, because there are no .gitignore files in the destination. But if those files are present, the files matched by .gitignore will not be deleted; they will be ignored, just as expected.
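The command above can be wrapped in a small function for repeated use. This is only a sketch: the function name mirror_keep_ignored is hypothetical, the verbose flags are dropped, and the options are otherwise the ones from the answer.

```shell
# Hypothetical helper: mirror directory $1 into directory $2 while leaving
# files matched by .gitignore untouched on both sides.
mirror_keep_ignored() {
    # --include keeps the .gitignore files themselves in the transfer,
    # --filter applies each .gitignore as per-directory exclude rules,
    # --delete-after postpones deletion until the .gitignore files have arrived,
    # so the receiving side can use them to protect ignored files.
    rsync -a \
        --include='**.gitignore' \
        --exclude='/.git' \
        --filter=':- .gitignore' \
        --delete-after \
        "$1/" "$2/"
}
```

For example, mirror_keep_ignored /source/project /destination/project. Files in the destination that are not ignored and no longer exist in the source are removed, so it is worth trying with rsync's --dry-run option first.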
.gitignore scattered around their directories, which is how most modern Git projects are structured. Glad I scrolled down to here.
foo/* inside .gitignore, rsync will fail to sync src/foo/.* even though that is not part of the Git ignore patterns.
--include '.git'
':- .gitignore' means dir-merge (:), exclude patterns (-) from the file .gitignore. "dir-merge" is short for "per-directory merge", which means "rsync will scan every directory that it traverses for the named file, merging its contents when the file exists into the current list of inherited rules." In my case, I only have one .gitignore, and it's in a parent directory, so the correct option for me is: --filter='.- ../.gitignore', which is a "single-instance" (.) merge.
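As a sketch of the single-instance form, the following function applies one ignore file to a whole transfer. The function name rsync_single_ignore and its argument order are assumptions made for illustration; only the '.-' filter rule comes from the comment above.

```shell
# Hypothetical helper: $1 is the one ignore file to apply,
# the remaining arguments are passed to rsync unchanged.
rsync_single_ignore() {
    local ignore_file=$1
    shift
    # '.' = single-instance merge (the file is read once, up front),
    # '-' = treat every line in it as an exclude pattern.
    rsync --filter=".- $ignore_file" "$@"
}
```

For example, rsync_single_ignore ../.gitignore -a ./ DEST/. Unlike the dir-merge form, the patterns are read once and apply to the whole transfer, and .gitignore files deeper in the tree are not consulted.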
2018 solution confirmed
rsync -ah --delete \
  --include .git --exclude-from="$(git -C SRC ls-files \
  --exclude-standard -oi --directory >.git/ignores.tmp && \
  echo .git/ignores.tmp)" \
  SRC DST
Details: --exclude-from is required instead of --exclude because a multi-line exclude list would not be parsed as a single argument. --exclude-from expects a file name, so the list is written to a file first; it can also read the list from standard input when the file name is -.
The current solution saves the exclude file inside the .git folder to ensure it will not affect git status while keeping it self-contained. If you prefer, you are welcome to use /tmp.
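A variant of the same idea using a temporary file from mktemp, as a sketch only. The function name sync_without_ignored is hypothetical, and prefixing each path with / (so that rsync anchors it at the top of the transfer instead of matching the name at any depth) is an assumption added here, not part of the answer.

```shell
# Hypothetical helper: copy the Git working tree $1 to $2,
# skipping whatever Git reports as ignored.
sync_without_ignored() {
    local src=$1 dst=$2 list status
    list=$(mktemp) || return 1
    # -o: untracked files, -i: only the ignored ones,
    # --directory: print a fully ignored directory once instead of its contents.
    # The sed step anchors each path at the transfer root (an assumption of this sketch).
    git -C "$src" ls-files --exclude-standard -oi --directory |
        sed 's|^|/|' > "$list"
    rsync -a --exclude-from="$list" "$src/" "$dst/"
    status=$?
    rm -f "$list"
    return "$status"
}
```

File names containing rsync wildcard characters such as * or [ would still be interpreted as patterns, so this is not a complete solution. It also handles a single repository per call, not a tree containing many repositories.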
SRC here, but not for the original problem I stated, which is a sprawling directory with thousands of Git repositories as subdirectories at various depths, many of which have idiosyncratic .gitignore files.
--exclude-from=<(git -C SRC ls-files --exclude-standard -oi --directory)
How about rsync --exclude-from='path/.gitignore' --exclude-from='path/myignore.txt' source destination?
It worked for me.
I believe you can have more --exclude-from parameters too.
.gitignore files happen to use a syntax compatible with rsync.
I had a number of very large .gitignore files and none of the "pure rsync" solutions worked for me. I wrote this rsync wrapper script; it fully respects .gitignore rules (including !-style exceptions and .gitignore files in subdirectories) and has worked like a charm for me.
locate -0e .gitignore | (while read -d '' x; do process_git_ignore "$x"; done), but has a lot of issues. Files in the same directory as .gitignore are not correctly separated from the directory name with /. Blank lines and comments are misinterpreted. It chokes on .gitignore files in paths with spaces (never mind the fiendish /opt/vagrant/embedded/gems/gems/rb-fsevent-0.9.4/spec/fixtures/custom 'path/.gitignore from the vagrant package for Ubuntu). Perhaps better done as a Perl script.
rsync, for the specific reason that handling quoting/whitespace is such a pain. If you have an example of a gsync command line that is failing, and the .gitignore files associated with it, I would be happy to take a closer look.
rsync an entire filesystem, with various Git repositories scattered around it. Perhaps your script works fine for the case of synchronizing a single repository.
For mercurial you might use
hg status -i | sed 's/^I //' > /tmp/tmpfile.txt
to collect the list of files which are NOT under mercurial control because of .hgignore restrictions and then run
rsync -avm --exclude-from=/tmp/tmpfile.txt --delete source_dir/ target_dir/
to rsync all files except the ignored ones. Note the -m flag, which makes rsync skip empty directories; it is needed because hg status -i lists only the excluded files, not directories.
Try this:
rsync -azP --delete --filter=":- .gitignore" <SRC> <DEST>
It copies all files to the remote directory except those matched by '.gitignore', and deletes remote files that are not in your current directory.
Check out the MERGE-FILES FILTER RULES section in rsync(1).
It looks like it's possible to create an rsync --filter rule that will include .gitignore files as it traverses the directory structure.
Per the rsync man page, in addition to the standard list of file patterns:
files listed in a $HOME/.cvsignore are added to the list and any files listed in the CVSIGNORE environment variable
So, my $HOME/.cvsignore file looks like this:
.git/
.sass-cache/
to exclude .git and the files generated by Sass.
.git/ directories, perhaps even more strongly than the working copy. What I want to exclude are build products.
rsync man page quoted in this answer describes the --cvs-exclude option, so you have to use it explicitly. 2/ You may create .cvsignore files in any directory to have project-specific ignores, those are read as well. 3/ .git is already ignored when you use --cvs-exclude, according to the manual, so having it in $HOME/.cvsignore seems redundant.
Alternatives:
git ls-files -zi --exclude-standard |rsync -0 --exclude-from=- ...
git ls-files -zi --exclude-per-directory=".gitignore" |...
(rsync only partly understands .gitignore)
Instead of creating exclude filters, you can use git ls-files to select each file to rsync:
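#!/usr/bin/env bash
# Copy only the files tracked by Git from a local repository to an rsync destination.
if [[ $# -ne 2 ]] ; then
    echo "Usage: $(basename "$0") <local source> <rsync destination>"
    exit 1
fi
cd "$1" || exit 1
# List the tracked files, one path per line.
versioned=$(git ls-files --exclude-standard)
# ${versioned} is left unquoted on purpose so that each path becomes its own argument.
rsync --verbose --links --times --relative --protect-args ${versioned} "$2"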
This works even though git ls-files returns newline separated paths. Probably won't work if you have versioned files with spaces in the filenames.
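One possible way around the problem with spaces is to hand the file list to rsync on standard input, NUL-separated, instead of expanding it on the command line. The sketch below assumes this alternative; the function name sync_tracked is hypothetical, while git ls-files -z and rsync's --from0 and --files-from options are standard.

```shell
# Hypothetical helper: copy only the files tracked by Git in $1 to $2.
sync_tracked() {
    local src=$1 dst=$2
    # -z makes git terminate each path with a NUL byte;
    # --from0 tells rsync to expect that, and --files-from=- reads the list from stdin.
    git -C "$src" ls-files -z |
        rsync -a --from0 --files-from=- "$src/" "$dst/"
}
```

For example, sync_tracked ~/projects/foo backup:/srv/foo (both paths are placeholders). Since the paths never pass through shell word splitting, file names with spaces or newlines are transferred intact.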
Short answer
rsync -r --info=progress2 --filter=':- .gitignore' SOURCE DEST/
Parameters meaning:
-r: recursive
--info=...: show progress
--filter=...: exclude by the rules listed in the .gitignore file
--exclude='/.git' --filter="dir-merge,- .gitignore"
rsync -rvv --exclude='.git*' --exclude='/rsync-to-dev.sh' --filter='dir-merge,-n /.gitignore' $DIR/ development.foobar.com:~/test/ .. but although it says [sender] hiding file .gitignore because of pattern .git*, the file is still sent to the destination
--delete option, here is the working command line: rsync --delete-after --filter=":e- .gitignore" --filter "- .git/" -v -a ... . This took me a while... e in filter and --delete-after are both important. I suggest reading the "PER-DIRECTORY RULES AND DELETE" chapter of the rsync man page.
--delete-after to @VasiliNovikov's version of the command. (This seems equivalent to @dboliton's version of the command, except @db uses :e which I think excludes the .gitignore files from being copied, which is not what I wanted.)
rsync from the directory with the .gitignore in it? Or does it pull it from the dir it syncs from? I guess I have to put in the full path to .gitignore to be safe?