Referring to the online docs:
If the pattern does not contain a slash /, Git treats it as a shell glob pattern and checks for a match against the pathname relative to the location of the .gitignore file (relative to the toplevel of the work tree if not from a .gitignore file).
To me, this documentation says that given a pattern 'foo', any file or directory named 'foo' will be ignored only relative to the .gitignore file. I don't read anything explaining its recursive behavior. Shell globs (from what I have read and experienced) are not recursive.
Now further below it explains the double asterisk:
A leading "**" followed by a slash means match in all directories. For example, "**/foo" matches file or directory "foo" anywhere, the same as pattern "foo"
So yes, there is an example in the docs explaining that **/foo is equal to foo, but the recursive behavior remains implicit.
The recursive nature of a rule like "foo" is derived from the way those rules are fetched and applied:
Patterns read from a .gitignore file in the same directory as the path, or in any parent directory, with patterns in the higher level files (up to the toplevel of the work tree) being overridden by those in lower level files down to the directory containing the file.
So even several sub-directories below a .gitignore, the rule "foo" still applies to any "foo" file found in that sub-folder.
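A rough Python sketch of that effect, for illustration only. It is not Git's matcher: it assumes a pattern with no slash, uses Python's fnmatch for the glob part, and the function name is made up here.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath


def slashless_rule_ignores(pattern: str, rel_path: str) -> bool:
    """Rough model of a slash-free .gitignore rule.

    rel_path is relative to the directory holding the .gitignore.
    The rule is tried against every path component, so a match on a
    parent directory also covers everything below it.
    """
    if "/" in pattern:
        raise ValueError("this sketch only models patterns without a slash")
    parts = PurePosixPath(rel_path).parts
    return any(fnmatchcase(part, pattern) for part in parts)


if __name__ == "__main__":
    for path in ("foo", "src/foo", "src/deep/er/foo", "src/foobar"):
        print(path, slashless_rule_ignores("foo", path))
```

The point is only that the depth of "foo" below the .gitignore does not matter, while a different name such as "foobar" is left alone.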
Related
In the documentation of pythonforandroid, at https://python-for-android.readthedocs.io/en/latest/buildoptions/, there is a build option described called blacklist.
--blacklist: The path to a file containing blacklisted patterns that will be excluded from the final APK. Defaults to ./blacklist.txt
However, not a word can be found anywhere about how to use this file and what exactly the patterns are supposed to represent. For instance, is this used to exclude libraries, files, or directories? Do the patterns match file names or contents? What is the syntax of the patterns, or an example of a valid blacklist.txt file?
This file should contain a list of glob patterns, i.e. as implemented by fnmatch, one per line. These patterns are compared against the full filepath of each file in your source dir, probably using a global filepath but I'm not certain about that (it might be relative to the source dir).
For instance, the file could contain the following lines:
*.txt
*/test.jpg
This would prevent all files ending with .txt from being included in the apk, and all files named test.jpg in any subfolder.
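To get a feel for how such patterns behave, here is a small sketch using Python's own fnmatch module. It only assumes that the patterns are applied fnmatch-style to a file path; the helper name and the sample paths are invented, and whether python-for-android passes absolute or relative paths is still the open question mentioned above.

```python
from fnmatch import fnmatchcase

# Hypothetical blacklist.txt contents, copied from the example above.
BLACKLIST = ("*.txt", "*/test.jpg")


def is_blacklisted(path: str, patterns=BLACKLIST) -> bool:
    """True if any pattern matches the whole path.

    Note that fnmatch's "*" also matches "/" characters, unlike "*"
    in a .gitignore pattern that contains a slash.
    """
    return any(fnmatchcase(path, pattern) for pattern in patterns)


if __name__ == "__main__":
    for path in ("docs/notes.txt", "images/test.jpg", "a/b/test.jpg", "main.py"):
        print(path, is_blacklisted(path))
```

Because "*" crosses directory separators here, */test.jpg catches test.jpg at any depth as long as there is at least one directory in front of it.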
If using buildozer, the android.blacklist_src buildozer.spec option can be used to point to your choice of blacklist file.
I am using Eclipse (this is probably irrelevant) and I want to exclude the Maven target folder from commits.
There are lots of notations:
/target/**
*/target/*
/target/**
target/
/target/
What is the difference?
And what is the exact meaning of each of them?
TL;DR: you probably want /target/.
Long
Let's start with a clear definition of the work-tree (from the gitglossary, where it is spelled working tree):
The tree of actual checked out files. The working tree normally contains the contents of the HEAD commit’s tree, plus any local changes that you have made but not yet committed.
We need to keep in mind that what Git stores, and exchanges with other Git repositories, are commits. Each commit freezes, for all time, some set of files so that at any time in the future, you can tell Git get me commit a123456... and get all your files back as of the time you made commit a123456.... (Each commit has a unique, big-and-ugly hash ID like this, which you'll see in git log output and elsewhere.)
Commits vs the work tree
The files inside commits are stored in a special, Git-only, compressed, de-duplicated, and read-only form. I like to call these files freeze-dried. They literally cannot be changed. So they're fine for archival, but completely useless for getting any actual work done. Git therefore needs to be able to extract any given commit, "rehydrating" the freeze-dried files and turning them back into ordinary everyday files that you can see and use and work with. The place you put these files is the work-tree or working tree.
The working tree of course has a top level directory (or folder if you prefer that term), in which you store various files, including your main .gitignore file. That top level directory can have sub-directories (sub-folders) and each sub-folder can have its own .gitignore file too. This is important when you ask about /target vs target, for instance.
Gitignore entries
An entry in a .gitignore file can be in any of the following forms:
name (with no special characters like *)
name.* or *.txt or even name*txt
folder/
folder/*
folder/name
folder/name*txt or any of these variants
folder/subfolder/
folder/subfolder/*
any of the above prefixed with a slash, e.g., /name or /folder/ or /folder/name
any of the above, including prefixed-with-slash, that are then also prefixed with !, e.g., !/folder/name
This is not meant to be an exhaustive list (you have listed several other forms), but rather to illustrate a few basic principles:
A simple file name means any file or directory with this name.
A name suffixed with a slash means any directory (folder) with this name. Entities that are files don't match this kind of entry.
Entries can have embedded slashes—slashes that are not at the front, and not at the rear, such as folder/name.
Entries can have leading slashes, such as /name or /folder/, or both leading slashes and embedded slashes, such as /folder/name.
Entries can have glob characters, * and **, in various places.
Entries can be prefixed with !.
The rules for gitignore entries get pretty complicated, but start out simple enough. Remember that the .gitignore could be in the top level folder of your work-tree, or in some sub-folder!
A plain name, with no embedded or leading slashes, matches any file or folder anywhere from this folder or any of its sub-folders.
A slash-suffixed name, with no embedded or leading slashes, matches any folder (but not file) from this folder or any of its sub-folders.
If an entry has a slash prefix or an embedded slash—either one suffices—the entry matches only files and/or folders in this folder. Hence folder/name and /folder/name mean the same thing: match a file (or folder) named folder/name in this folder—i.e., the place containing the .gitignore file. Do not match the file sub/folder/name, for instance.
If an entry ends with a trailing slash, it only matches folders (regardless of anything else).
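The rules above can be condensed into a small Python sketch that only classifies an entry. It is a simplification: it ignores comments, backslash escapes and the special treatment of **, and the names Entry and parse_entry are made up for this illustration.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    negated: bool   # entry starts with "!"
    anchored: bool  # leading or embedded slash: relative to this .gitignore only
    dir_only: bool  # trailing slash: matches folders, never plain files
    body: str       # the entry without "!", leading slash and trailing slash


def parse_entry(line: str) -> Entry:
    negated = line.startswith("!")
    if negated:
        line = line[1:]
    dir_only = line.endswith("/")
    if dir_only:
        line = line[:-1]
    # Any slash that is left is either leading or embedded.
    anchored = "/" in line
    return Entry(negated, anchored, dir_only, line.lstrip("/"))


if __name__ == "__main__":
    for text in ("name", "target/", "/target/", "folder/name", "!/folder/name"):
        print(text, parse_entry(text))
```

Note how target/ and /target/ differ only in the anchored flag, which is exactly the difference asked about.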
You said:
I want to exclude Maven target folder
This requires answering a sub-question: Where does this Maven target folder exist? Is there only one such folder, or can there be target/ entities in sub-folders? (There's also a separate issue, which is that .gitignore directives don't mean quite what people think they mean, and that you need to pay attention to what's in your index, but we'll leave that for another section.)
If this means: Don't include anything in target at the top level of my work-tree, but do go ahead and include, e.g., files named sub/target/file then you should use:
/target/
as the full rule in the .gitignore in the top level of your work-tree. It's slightly redundant since you already know that /target is a folder, but it expresses clearly that you want to ignore the folder named target in the top level of your work-tree.
If this means: Don't include anything in build-artifacts/target/, then you can put:
build-artifacts/target/
or:
/build-artifacts/target/
into the top-level .gitignore; or you can put:
/target/
into build-artifacts/.gitignore. The entry in build-artifacts/.gitignore needs its leading slash because target/ on its own has no embedded slash and would match at any depth, while the entry in the top-level .gitignore does not require a leading slash because it already has an embedded slash.
If, on the third hand (first foot?), the requirement is to ignore all files in any folder whose folder-path contains a target component—e.g., you not only want to ignore target/file but also sub/target/file2 and sub/target/sub2/file3—then you should use:
target/
as your .gitignore entry, probably at the top level of your work-tree.
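The three cases can be played through with a toy model in Python. It assumes glob-free entries ending in a slash, placed in the top-level .gitignore, and paths given relative to the top of the work-tree; dir_rule_covers is a made-up helper, not anything Git provides.

```python
from pathlib import PurePosixPath


def dir_rule_covers(entry: str, rel_file: str) -> bool:
    """Does a glob-free, slash-terminated entry cover rel_file?"""
    if not entry.endswith("/"):
        raise ValueError("this sketch handles only entries ending in '/'")
    body = entry[:-1]
    # Directories above the file, e.g. ("sub", "target") for sub/target/f.
    dirs = PurePosixPath(rel_file).parent.parts
    if "/" in body:
        # Leading or embedded slash: anchored at the .gitignore's folder.
        want = PurePosixPath(body.lstrip("/")).parts
        return dirs[:len(want)] == want
    # No other slash: a folder with this name at any depth.
    return body in dirs


if __name__ == "__main__":
    files = ("target/file", "sub/target/file2", "sub/target/sub2/file3")
    for entry in ("/target/", "target/"):
        print(entry, [f for f in files if dir_rule_covers(entry, f)])
```

With /target/ only target/file is covered; with target/ all three files are.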
The role of the index / staging-area
The .gitignore files are about things in your work-tree, but Git does not build new commits from your work-tree. Instead, Git builds new commits from an intermediate thing that it calls, variously, the index or the staging area. (These two terms refer to the same entity.)
While the index has some other roles, its main one, especially for our purposes here, is that it holds a copy of every file from the original commit you extracted, or an updated copy or a totally new file. That is, if you extracted a commit that had just the two files file1 and folder/file2, your index would now have copies of file1 and folder/file2 in it.
The copies inside the index are in the same freeze-dried format as the copies inside a commit. The difference is that you can replace the copies in the index—or add to them, or even subtract them away. That is, you can run git add file1 to take the useful version of file1 in your work-tree, freeze-dry it, and stuff that into the index. You can do the same with folder/file2, and you can put new files like folder2/file3 or ./file4 too. What git add does, in short, is to freeze-dry the work-tree version of the file and stuff it into the index.
When you run git commit, Git simply packages up everything that's in the index right then and makes the new commit from that. So that's why you have to git add files all the time: every time you change the work-tree copy, you need to update the index copy, otherwise Git won't save the new version: Git will just re-save the old version again. (To save space, commits that save the same version of an old file really just re-use the old freeze-dried file. They can do that because these files are read-only. It's always safe to locate an old copy and re-use it, because by definition, everything inside Git is frozen for all time. Only the index and work-tree copies can be changed!)
In other words, you can think of the index as the proposed next commit. You copy files into it to update the proposed next commit. To remove a file entirely from the proposed next commit, you use git rm --cached or git rm (without --cached): Git will remove the file from the index, and maybe from the work-tree too, and now your proposed next commit just doesn't have the file at all.
A file can be in the index / staging-area and in the work-tree. That happens all the time. Such a file is called tracked. The contents don't have to match: it's just the fact that the file is in the index right now, and also in the work-tree, that makes the work-tree file tracked.
If a file is tracked—if it's in the index right now—then nothing you do with a .gitignore will affect it at all. To make it not tracked, you have to remove it from the index.
If you remove the file from the index—or if it's already not in the index now because it wasn't in the commit you checked out earlier—then the work-tree copy is untracked. Now the .gitignore entry matters. The .gitignore entry tells Git:
Don't complain about this file. Normally, git status would whine at you, telling you that the file is untracked and, gosh golly gee, shouldn't you git add it? The .gitignore makes Git shut up about that file.
Don't automatically add this file. If you use git add . or git add * or something like that, you're telling Git: add everything. The .gitignore modifies this to be: add everything—except these untracked files that are also ignored, don't add those!
It has a third effect, which is to give Git permission to clobber the work-tree file in some (rare-ish) cases, and to change the way git clean works with -x and -X.
Really, the file should not be called .gitignore, but rather something like .git-dont-whine-about-these-files-and-do-not-auto-add-them-either-and-maybe-occasionally-do-clobber-or-clean-them. But who wants to type that in all the time? So, .gitignore.
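As a toy model of the above (plain Python sets, nothing to do with Git's real data structures; all names are invented), the order of the checks is what matters: the index is consulted first, and the ignore rules only get a say for files that are not in it.

```python
def classify(path, index, is_ignored):
    """Return 'tracked', 'ignored' or 'untracked' for a work-tree path."""
    if path in index:
        return "tracked"  # ignore rules are never consulted for tracked files
    return "ignored" if is_ignored(path) else "untracked"


def add_all(worktree, index, is_ignored):
    """Toy 'add everything': skip only untracked files that are ignored."""
    addable = {p for p in worktree if classify(p, index, is_ignored) != "ignored"}
    return index | addable


if __name__ == "__main__":
    index = {"src/Main.java", "target/old.jar"}
    worktree = index | {"target/new.jar", "notes.md"}

    def ignored(path):
        # Stand-in for a /target/ entry in .gitignore.
        return path.startswith("target/")

    for p in sorted(worktree):
        print(p, classify(p, index, ignored))
    print(sorted(add_all(worktree, index, ignored)))
```

In this model target/old.jar stays tracked despite the ignore rule, which is why it has to be removed from the index first.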
Conclusion
There is even more to know about .gitignore entries, but this is already long enough (maybe too long). The summary version is:
.gitignore only affects untracked files;
it's mainly about shutting up whining, and avoiding auto-adding; and
use a trailing slash to mean directory / folder (whichever word you prefer) and a leading slash to mean as found in this directory. When you have complex entries (with embedded slashes), the leading slash is redundant, but conveys your intent.
If you don't want the leading-slash effect, but do need embedded slashes, you either have to distribute your ignore entries to sub-directories / sub-folders, or use the ** notation (as a leading component) to match any number of path components. Otherwise there's rarely any need for ** at all.
Not covered here: once Git realizes it doesn't have to read a work-tree directory, it doesn't bother reading it. As a result, ignoring a subdirectory generally makes it impossible to un-ignore (with ! rules) anything within the subdirectory.
I can gitignore all files beginning with a hash with \#* and all hidden files beginning with (a dot and) a hash with .\#*. But can I ignore both with one pattern?
This is closer, as I detail in "What pattern does .gitignore follow?":
shopt -s dotglob
Then edit your .gitignore with:
*.\#*
But that would probably not ignore a #foo (i.e., without a .#* extension), only xxx.#yyy.
So two patterns remain the safest setting.
In GitHub's documentation on linguist, the section on using the .gitattributes file says a path can be marked as vendored, and thus ignored in the repository's statistics tracking, with:
special-vendored-path/* linguist-vendored
However, is it possible to have linguist mark directories as vendored that may be nested in directories containing non-vendored code?
I tried adding a line styled as */special-vendored-path/* linguist-vendored to my .gitattributes, but that didn't cause the GitHub code-proportion information to change.
To match a directory inside an arbitrary arborescence of directories, you need double asterisks:
**/special-vendored-path/* linguist-vendored
Note, however, that double asterisks are not needed at the end of paths. For example, test1/* will match test1/test2/test3/file.
I've been using git but still having confusion about the .gitignore file paths.
So, what is the difference between the following two paths in .gitignore file?
tmp/*
public/documents/**/*
I can understand that tmp/* will ignore all the files and folders inside it. Am I right?
But what does that second line path mean?
This does not depend on your shell: Git matches these patterns itself, so what matters is your Git version. In general, * matches any single file or folder:
/a/*/z
matches /a/b/z
matches /a/c/z
doesn't match /a/b/c/z
** matches any string of folders:
/a/**/z
matches /a/b/z
matches /a/b/c/z
matches /a/b/c/d/e/f/g/h/i/z
doesn't match /a/b/c/z/d.pr0n
Combine ** with * to match files in an entire folder tree:
/a/**/z/*.pr0n
matches /a/b/c/z/d.pr0n
matches /a/b/z/foo.pr0n
doesn't match /a/b/z/bar.txt
Update (08-Mar-2016)
Today, I am unable to find a machine where ** does not work as claimed. That includes OSX-10.11.3 (El Capitan) and Ubuntu-14.04.1 (Trusty). Possibly git-ignore has been updated, or possibly recent fnmatch handles ** as people expect. So the accepted answer now seems to be correct in practice.
Original post
The ** has no special meaning in git. It is a feature of bash >= 4.0, via
shopt -s globstar
But git does not use bash. To see what git actually does, you can experiment with git add -nv and files in several levels of sub-directories.
For the OP, I've tried every combination I can think of for the .gitignore file, and nothing works any better than this:
public/documents/
The following does not do what everyone seems to think:
public/documents/**/*.obj
I cannot get that to work no matter what I try, but at least that is consistent with the git docs. I suspect that when people add that to .gitignore, it works by accident, only because their .obj files are precisely one sub-directory deep. They probably copied the double-asterisk from a bash script. But perhaps there are systems where fnmatch(3) can handle the double-asterisk as bash can.
If you're using a shell such as Bash 4, then ** is essentially a recursive version of *, which will match any number of subdirectories.
This makes more sense if you add a file extension to your examples. To match log files immediately inside tmp, you would type:
/tmp/*.log
To match log files anywhere in any subdirectory of tmp, you would type:
/tmp/**/*.log
But testing with git version 1.6.0.4 and bash version 3.2.17(1)-release, it appears that git does not support ** globs at all. The most recent man page for gitignore doesn't mention **, either, so this is either (1) very new, (2) unsupported, or (3) somehow dependent on your system's implementation of globbing.
Also, there's something subtle going on in your examples. This expression:
tmp/*
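...actually means "ignore everything inside the tmp directory that sits next to this .gitignore file, but don't ignore the tmp directory itself". Because the pattern contains a slash, it is anchored to the location of the .gitignore file and does not apply to tmp directories deeper in the tree. Under normal circumstances, you'd probably just write: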
/tmp
...which would ignore a single top-level tmp directory. If you do need to keep the tmp directories around, while ignoring their contents, you should place an empty .gitignore file in each tmp directory to make sure that git actually creates the directory.
Note that the '**', when combined with a sub-directory (**/bar), must have changed from its default behavior, since the release note for git1.8.2 now mentions:
The patterns in .gitignore and .gitattributes files can have **/, as a pattern that matches 0 or more levels of subdirectory.
E.g. "foo/**/bar" matches "bar" in "foo" itself or in a subdirectory of "foo".
See commit 4c251e5cb5c245ee3bb98c7cedbe944df93e45f4:
"foo/**/bar" matches "foo/x/bar", "foo/x/y/bar"... but not "foo/bar".
We make a special case, when foo/**/ is detected (and "foo/" part is already matched), try matching "bar" with the rest of the string.
"Match one or more directories" semantics can be easily achieved using "foo/*/**/bar".
This also makes "**/foo" match "foo" in addition to "x/foo", "x/y/foo"..
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds#gmail.com>
Simon Buchan also commented:
current docs (.gitignore man page) are pretty clear that no subdirectory is needed, x/** matches all files under (possibly empty) x
The .gitignore man page does mention:
A trailing "/**" matches everything inside. For example, "abc/**" matches all files inside directory "abc", relative to the location of the .gitignore file, with infinite depth.
A slash followed by two consecutive asterisks then a slash matches zero or more directories. For example, "a/**/b" matches "a/b", "a/x/b", "a/x/y/b" and so on.
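To make the quoted semantics concrete, here is a simplified Python sketch that turns such a pattern into a regular expression. It is an approximation written for this answer, not Git's wildmatch code: it covers only *, ?, a leading **/, an embedded /**/ and a trailing /**, and it ignores bracket expressions, escapes, negation and directory-only entries. The names to_regex and matches are made up.

```python
import re


def to_regex(pattern: str) -> re.Pattern:
    """Translate a (simplified) gitignore-style pattern to a regex."""
    out = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("/**/", i):
            out.append("/(?:.*/)?")   # zero or more whole directories
            i += 4
        elif i == 0 and pattern.startswith("**/"):
            out.append("(?:.*/)?")    # leading **/ : any directory prefix
            i += 3
        elif pattern.startswith("/**", i) and i + 3 == len(pattern):
            out.append("/.+")         # trailing /** : everything inside
            i += 3
        elif pattern[i] == "*":
            out.append("[^/]*")       # a single * never crosses a slash
            i += 1
        elif pattern[i] == "?":
            out.append("[^/]")
            i += 1
        else:
            out.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(out))


def matches(pattern: str, path: str) -> bool:
    return to_regex(pattern).fullmatch(path) is not None


if __name__ == "__main__":
    for path in ("a/b", "a/x/b", "a/x/y/b"):
        print("a/**/b", path, matches("a/**/b", path))
    print("a/*/z", "a/b/c/z", matches("a/*/z", "a/b/c/z"))
```

The single * is translated so that it stops at a slash, which is the whole difference between a/*/z and a/**/z.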
When ** isn't supported, the "/" is essentially a terminating character for the wildcard, so when you have something like:
public/documents/**/*
it is essentially looking for two wildcard items in between the slashes and does not pick up the slashes themselves. Consequently, this would be the same as:
public/documents/*/*
It doesn't work for me but you could create a new .gitignore in that subdirectory:
tmp/**/*.log
can be replaced by a .gitignore in tmp:
*.log