Using grep and find commands - basic questions to help me sort it out in my simple mind

I am back with a second no-brainer question, but I would like to get this straight in my head.
I have an assignment in which I am charged with providing a command to find a file named test in my home directory (one command using find, and one using grep). I understand that using find is just 'find ~/test', but using grep, wouldn't I have to search out a pattern within the file 'test'? Or is there a way to search for the file (using grep), even if the file is empty?

ls ~ | grep test

I understand that using find is just 'find ~/test'
No. find ~/test only prints ~/test itself if it is a file; if it is a directory, it also matches every file and directory under $HOME/test/. Use find ~ -type f -name test instead.
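To see the difference for yourself, here is a small sketch in a throwaway directory. The directory and file names are made up for illustration, and $demo merely stands in for your home directory:

```shell
# Hypothetical sandbox standing in for $HOME.
demo=$(mktemp -d)
mkdir -p "$demo/test/sub" "$demo/docs"
touch "$demo/test/sub/a.txt" "$demo/docs/test"

# Starting at test/ lists the starting point and everything below it:
# test, test/sub and test/sub/a.txt.
all_under=$(find "$demo/test" | wc -l)

# Searching from the top lists only regular files named exactly "test".
named_test=$(find "$demo" -type f -name test)

echo "entries under test/: $all_under"
echo "files named test:    $named_test"

rm -rf "$demo"
```

The first command never looks at docs/test at all, which is why it is not a reliable way to find a file by name.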

The assignment sounds unclear. But yes, if you give any filenames to grep, it will look at the contents of the files and ignore the names of the files. Perhaps you can grep the output of another command? Maybe ls as @Reese suggested, or maybe a different find command.

ls -R ~ | grep test
Explanation: ls -R ~ will recursively list all files and directories in your home folder. grep test will narrow down that list to files (and directories) that have "test" in their name.
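If you only want names that are exactly "test", grep's -x option (match the whole line) may help. A minimal sketch, again in a made-up sandbox rather than a real home directory:

```shell
# Hypothetical sandbox standing in for $HOME.
demo=$(mktemp -d)
mkdir -p "$demo/work"
touch "$demo/test" "$demo/work/latest.txt" "$demo/work/test"

# Substring match: counts latest.txt as well as both files named test.
loose=$(cd "$demo" && ls -R . | grep -c test)

# -x requires the whole line to be exactly "test".
exact=$(cd "$demo" && ls -R . | grep -cx test)

echo "substring matches: $loose"
echo "exact matches:     $exact"

rm -rf "$demo"
```

Keep in mind that ls -R prints bare names grouped under directory headers, so the grep output alone does not tell you which directory a match lives in, and hidden files are skipped unless you add -a.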

Search for files & file names using silver searcher

Using Silver Searcher, how can I search for:
(non-binary) files with a word or pattern AND
all filenames, with a word or pattern including filenames of binary files.
Other preferences: would like to have case insensitive search and search through dotfiles.
Tried to alias using this without much luck:
alias search="ag -g $1 --smart-case --hidden && ag --smart-case --hidden $1"
According to the man page of ag
-G --file-search-regex PATTERN
Only search files whose names match PATTERN.
You can use the -G option to perform searches on files matching a pattern.
So, to answer your question:
root@apache107:~/rpm-4.12.0.1# ag -G cpio.c size
rpm2cpio.c
21: off_t payload_size;
73: /* Retrieve payload size and compression type. */
76: payload_size = headerGetNumber(h, RPMTAG_LONGARCHIVESIZE);
The above command searches for the word size in all files whose names match the pattern cpio.c.
Reference:
man page of ag version 0.28.0
Note 1:
If you are looking for a string in certain file types, say all C source code, there is an undocumented feature in ag to help you quickly restrict searches to certain file types.
The commands below both look for foo in all php files:
find . -name \*.php -exec grep foo {} +
ag --php foo
While find + grep looks for all .php files, the --php switch in the ag command actually looks for the following file extensions:
.php .phpt .php3 .php4 .php5 .phtml
You can use --cpp for C++ source files, --hh for .h files, --js for JavaScript, etc. Running ag --list-file-types prints the full list.
Try this:
find . | ag "/.*SEARCHTERM[^/]*$"
The command find . will list all files.
We pipe the output of that to the command ag "/.*SEARCHTERM[^/]*$", which matches SEARCHTERM if it's in the filename, and not just the full path.
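The same anchoring trick can be tried without ag. The sketch below swaps in plain grep (not what the answer uses) and a made-up directory tree, with "report" as the hypothetical SEARCHTERM:

```shell
# Hypothetical tree for illustration only.
demo=$(mktemp -d)
mkdir -p "$demo/reports/old" "$demo/src"
touch "$demo/reports/old/notes.txt" "$demo/src/report_gen.py"

# Unanchored: any path that mentions "report", including files that
# merely sit inside a directory called reports.
anywhere=$(cd "$demo" && find . | grep -c 'report')

# Anchored: "report" must appear after the last slash, i.e. in the
# final path component.
basename_only=$(cd "$demo" && find . | grep '/.*report[^/]*$' | sort)

echo "paths mentioning report: $anywhere"
echo "$basename_only"

rm -rf "$demo"
```

The [^/]*$ part is what does the work: nothing but non-slash characters may follow the match, so a hit in a parent directory name no longer drags in everything beneath it.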
Try adding this to your aliases file. Tested with zsh but should work with bash. The problem you encountered in your example is that bash aliases can't take parameters, so you have to first define a function to use the parameter(s) and then assign your alias to that function.
searchfunction() {
    # Quote "$1" so patterns with spaces survive; call ag directly so each result stays on its own line.
    ag -g "$1" --hidden
    ag --hidden -l "$1"
}
alias search=searchfunction
You could modify this example to suit your purpose in a few ways, eg
add/remove the -l flag depending on whether or not you want text results to show the text match or just the filename
add headers to separate text results and filename results
deduplicate results to account for files that match both on filename and text, etc.
[Edit: removed unnecessary --smart-case flag per Pablo Bianchi's comment]
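If ag is not available, the same function-instead-of-alias pattern can be sketched with find and grep. This is a swapped-in illustration, not the ag-based approach from the answer, and the function name search is just a placeholder:

```shell
# Sketch: a function can use "$1", which an alias cannot.
# find and grep stand in for ag here.
search() {
    # Refuse to run without a search term.
    [ -n "$1" ] || { echo "usage: search TERM" >&2; return 1; }

    echo "== file names =="
    # -iname: case-insensitive name match; dotfiles are included by default.
    find . -iname "*$1*" -print

    echo "== file contents =="
    # -r recursive, -I skip binary files, -i ignore case, -l names only.
    grep -rIil -- "$1" .
}
```

Calling search needle from a project root would then list matching file names first and files containing the term second. Note that the term is used as a glob by find and as a regular expression by grep, so special characters may behave differently in the two halves.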
Found this question looking for the same answer myself. It doesn't seem like ag has any native capability to search file and directory names. The answers above from Zach Fogg and Jikku Jose both work, but piping find . can be very slow if you're working in a big directory.
I'd recommend using find directly, which is much faster than piping it through ag:
Linux (GNU version of find)
find -name [pattern]
OSX (BSD version of find, which requires an explicit starting directory)
find . -name [pattern]
If you need more help with find, this guide from Digital Ocean is pretty good. I include this because the man pages for find are outrageously dense if you just want to figure out basic usage.
To add to the previous answers, you can use an "Or" Regular Expression to search within files matching different file extensions.
For example, to search for a string only in C++ header files (.hpp) and Makefiles (.mk):
ag -G '.*\.(hpp|mk)' my_string_to_search
After being unsatisfied with mdfind, find, locate, and other attempts, the following worked for me. It uses tree to get the initial list of files, ag to filter out directories, and then awk to print the matching files themselves.
I wound up using tree because it was more (and more easily) configurable than the other solutions I tried and is fast.
This is a fish function:
function ff --description 'Find files matching given string'
    tree . --prune --matchdirs -P "*$argv*" -I "webpack" -i -f --ignore-case -p |
        ag '\[[^d].*' |
        awk '{print $2}'
end
This gives output similar to the following:
~/temp/hello_world $ ff controller
./app/controllers/application_controller.rb
./config/initializers/application_controller_renderer.rb
~/temp/hello_world $

Recursively replace colons with underscores in Linux

First of all, this is my first post here and I must specify that I'm a total Linux newb.
We have recently bought a QNAP NAS box for the office, on this box we have a large amount of data which was copied off an old Mac XServe machine. A lot of files and folders originally had forward slashes in the name (HFS+ should never have allowed this in the first place), which when copied to the NAS were all replaced with a colon.
I now want to rename all colons to underscores, and have found the following commands in another thread here: pitfalls in renaming files in bash
However, the flavour of Linux that is on this box does not understand the rename command, so I'm having to use mv instead. I have tried using the code below, but this will only work for the files in the current folder, is there a way I can change this to include all subfolders?
for f in *.*; do mv -- "$f" "${f//:/_}"; done
I have found that I can find all the files and folders in question using the find command as follows
Files:
find . -type f -name "*:*"
Folders:
find . -type d -name "*:*"
I have been able to export a list of the results above by using
find . -type f -name "*:*" > files.txt
I tried using the command below but I'm getting an error message from find saying it doesn't understand the exec switch, so is there a way to pipe this all into one command, or could I somehow use the files I exported previously?
find . -depth -name "*:*" -exec bash -c 'dir=${1%/*} base=${1##*/}; mv "$1" "$dir/${base//:/_}"' _ {} \;
Thank you!
Vincent
So your for loop code works, but only in the current dir. Also, you are able to use find to build a file with all the files with : in the filename.
So, as you've already done all this, I would just loop over each line of your file, and perform the same mv command.
Something like this:
for f in `cat files.txt`; do mv $f "${f//:/_}"; done
EDIT:
As pointed out by tripleee, using a while loop is a better solution
E.g.:
while read -r f; do mv "$f" "${f//:/_}"; done <files.txt
Hope this helps.
Will
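A possible variant that skips the intermediate files.txt and also handles folders: pipe find straight into a while loop, so find's -exec is not needed. This is only a sketch; the function name rename_colons is made up, it assumes bash (for the ${var//:/_} substitution) and that no name contains a newline:

```shell
# Sketch: rename ':' to '_' in every file and directory name below
# the current directory, without using find -exec.
rename_colons() {
    # -depth lists a directory's contents before the directory itself,
    # so children are renamed while the parent's old path is still valid.
    find . -depth -name '*:*' | while IFS= read -r path; do
        dir=${path%/*}       # everything before the last slash
        base=${path##*/}     # last path component only
        # Only the last component is rewritten; parents are handled
        # later, when find reaches them.
        mv -- "$path" "$dir/${base//:/_}"
    done
}
```

Run it from the top of the share after cd-ing there. Rewriting only the last component matters: replacing colons in the whole path would point mv at a parent directory that has not been renamed yet. It would be worth trying it on a copy first, since mv will happily overwrite or merge into an existing name such as a_b.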

Why does grep hang when run against the / directory?

My question is in two parts :
1) Why does grep hang when I grep all files under "/" ?
for example :
grep -r 'h' ./
(Note: right before the hang/crash, I see some "no such device or address" messages regarding sockets. Of course, I know that grep shouldn't run against a socket, but I would think that since sockets are just files in Unix, it should return a negative result rather than crashing.)
2) Now, my follow up question : In any case -- how can I grep the whole filesystem? Are there certain *NIX directories which we should leave out when doing this ? In particular, I'm looking for all recently written log files.
As @ninjalj said, if you don't use -D skip, grep will try to read all your device files, socket files, and FIFO files. In particular, on a Linux system (and many Unix systems), it will try to read /dev/zero, which appears to be infinitely long.
You'll be waiting for a while.
If you're looking for a system log, starting from /var/log is probably the best approach.
If you're looking for something that really could be anywhere in your file system, you can do something like this:
find / -xdev -type f -print0 | xargs -0 grep -H pattern
The -xdev argument to find tells it to stay within a single filesystem; this will avoid /proc and /dev (as well as any mounted filesystems). -type f limits the search to ordinary files. -print0 prints the file names separated by null characters rather than newlines; this avoids problems with files having spaces or other funny characters in their names.
xargs reads a list of file names (or anything else) on its standard input and invokes the specified command on everything in the list. The -0 option works with find's -print0.
The -H option to grep tells it to prefix each match with the file name. By default, grep does this only if there are two or more file names on its command line. Since xargs splits its arguments into batches, it's possible that the last batch will have just one file, which would give you inconsistent results.
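Here is a small, safe way to try that pipeline before pointing it at /. The sandbox directory and file names are invented for the demo, and it assumes find and xargs support -print0 and -0:

```shell
# Hypothetical sandbox instead of the real root filesystem.
demo=$(mktemp -d)
mkdir -p "$demo/logs"
printf 'boot ok\nERROR disk full\n' > "$demo/logs/system log.txt"   # name contains a space
printf 'nothing here\n' > "$demo/other.txt"

# NUL-separated names survive the space; -H prints the file name
# even if grep ends up being handed a single file.
matches=$(find "$demo" -xdev -type f -print0 | xargs -0 grep -H 'ERROR')

echo "$matches"

rm -rf "$demo"
```

With a plain find | xargs grep, the file "system log.txt" would be split into two bogus names and grep would complain that neither exists.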
Consider using find ... -name '*.log' to limit the search to files with names ending in .log (assuming your log files have such names), and/or using grep -I ... to skip binary files.
Note that all this depends on GNU-specific features. Some of these options might not be available on MacOS (which is based on BSD) or on other Unix systems. Consult your local documentation, and consider installing GNU findutils (for find and xargs) and/or GNU grep.
Before trying any of this, use df to see just how big your root filesystem is. Mine is currently 268 gigabytes; searching all of it would probably take several hours. A few minutes spent (a) restricting the files you search and (b) making sure the command is correct will be well worth the time you spend.
By default, grep tries to read every file. Use -D skip to skip device files, socket files and FIFO files.
If you keep seeing error messages, then grep is not hanging. Keep iotop open in a second window to see how hard your system is working to pull all the contents off its storage media into main memory, piece by piece. This operation should be slow, or you have a very barebones system.
Now, my follow up question : In any case -- how can I grep the whole filesystem? Are there certain *NIX directories which we should leave out when doing this ? In particular, I'm looking for all recently written log files.
Grepping the whole FS is very rarely a good idea. Try grepping the directory where the log files should have been written; likely /var/log. Even better, if you know anything about the names of the files you're looking for (say, they have the extension .log), then do a find or locate and grep the files reported by those programs.
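As a rough sketch of that find-then-grep idea for "recently written log files": the directory below is a stand-in for /var/log, and the file names and the one-day window are arbitrary choices for the demo.

```shell
# Hypothetical stand-in for /var/log.
demo=$(mktemp -d)
printf 'failed login\n' > "$demo/auth.log"
printf 'failed login\n' > "$demo/old.log"
printf 'failed login\n' > "$demo/notes.txt"
touch -t 200001010000 "$demo/old.log"    # pretend this one is ancient

# Only *.log files modified within the last day are handed to grep;
# -l prints the names of the files that contain the pattern.
recent_hits=$(find "$demo" -type f -name '*.log' -mtime -1 -exec grep -l 'failed' {} +)

echo "$recent_hits"

rm -rf "$demo"
```

Narrowing by name and modification time first means grep only ever opens a handful of files, which is usually far quicker than reading the whole filesystem.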

How do I do a recursive find & replace within an SVN checkout?

How do I find and replace every occurrence of:
foo
with
bar
in every text file under the /my/test/dir/ directory tree (recursive find/replace).
BUT I want to be able to do it safely within an SVN checkout and not touch anything inside the .svn directories
Similar to this but now with the SVN restriction: Awk/Sed: How to do a recursive find/replace of a string?
There are several possibilities:
Using find:
Using find to create a list of all files, and then piping them to sed or the equivalent, as suggested in the answer you reference, is fairly straightforward, and only requires scanning through the files once.
You'd use one of the same answers as from the question you referenced, but adding -path '*/.svn' -prune -o after the find . in order to prune out the SVN directories. See this question for a discussion of using the prune option with find -- although note that they've got the pattern wrong. Thus, to print out all the files, you would use:
find . -path '*/.svn' -prune -o -type f -print
Then, you can pipe that into an xargs call or whatever to do the individual replacements, as suggested in the question you referenced. There is a lot of discussion there about different options, which I won't reproduce here, although I prefer the version from John Zwinck's answer:
find . -path '*/.svn' -prune -o -type f -exec sed -i 's/foo/bar/g' {} +
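Before running this on a real checkout, it may be worth rehearsing in a scratch directory. The sketch below fakes a tiny working copy (the file names are invented) and assumes GNU sed for the -i option:

```shell
# Hypothetical miniature "checkout" for rehearsal.
demo=$(mktemp -d)
mkdir -p "$demo/src/.svn"
echo 'foo here' > "$demo/src/code.txt"
echo 'foo pristine copy' > "$demo/src/.svn/code.txt.svn-base"

# Dry run: print what would be edited. Nothing under .svn should appear.
to_edit=$(find "$demo" -path '*/.svn' -prune -o -type f -print)
echo "$to_edit"

# Real run: same expression, with -print swapped for the sed call.
find "$demo" -path '*/.svn' -prune -o -type f -exec sed -i 's/foo/bar/g' {} +

cat "$demo/src/code.txt"                      # edited
cat "$demo/src/.svn/code.txt.svn-base"        # left alone
```

Swapping -print for -exec only once the dry-run list looks right is a cheap way to avoid corrupting the pristine copies that svn keeps under .svn.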
Using recursive grep:
If you have a system with GNU grep, you can use that to find the list of files as well. This is probably less efficient than find, but it does allow you to only call sed on the files that match, and I personally find the syntax a lot easier to remember (or figure out from manpages):
sed -i 's/foo/bar/g' `grep -l -R --exclude-dir=.svn 'foo' .`
The -l option causes grep to only output the list of file names, rather than the matching lines.
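One caveat with backticks is that file names containing spaces get split into several arguments. A possible variant, assuming GNU grep, xargs and sed, passes the names NUL-separated instead (the sandbox contents are made up):

```shell
# Hypothetical miniature checkout with an awkward file name.
demo=$(mktemp -d)
mkdir -p "$demo/.svn" "$demo/my docs"
echo 'foo one' > "$demo/my docs/a.txt"
echo 'no match' > "$demo/b.txt"
echo 'foo base' > "$demo/.svn/entries"

# -Z ends each file name with a NUL byte and xargs -0 reads them back
# intact; -r stops xargs from running sed when grep found nothing.
grep -rlZ --exclude-dir=.svn 'foo' "$demo" | xargs -0 -r sed -i 's/foo/bar/g'

cat "$demo/my docs/a.txt"     # edited
cat "$demo/.svn/entries"      # left alone
```

As with the backtick version, sed is only run on files that actually contain the string, so the timestamps of unrelated files are not disturbed.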
Using a GUI editor:
Alternatively, if you're using Windows, do what I do -- get a copy of the NoteTab editor (available in a free version), and use its search-and-replace-on-disk command, which ignores hidden .svn directories automatically and just works.
Edit: Corrected find pattern to */.svn instead of .svn, added more details and some other possibilities. However, this depends on your platform and svn version: .svn without */ may be required in some cases, like on CentOS 7.
How about this?
grep -i "search_string" `find . -name "*.some_extension"`
That is a halfway solution: it finds search_string within files that have a specific extension. Once you know which files contain the string, it can easily be extended by piping the list into sed.

Change all with command line

I'm wondering if there is a way to change a specific word in all of the files within the /www/ directory using command line. Just looking for a faster way to change out a specific word so I don't need to open all the files manually! Thanks!
find /www -type f -exec sed -i 's/foo/bar/g' \{\} \;
This line will replace foo with bar every time foo occurs in any file in /www. Be very sure you know what's under /www and what the replacement would do to those files before running it.
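One cautious way to follow that advice is to preview the affected files first and keep backups when replacing. A minimal sketch, using a temporary directory as a stand-in for /www and invented file contents:

```shell
# Hypothetical stand-in for /www.
www=$(mktemp -d)
echo '<p>foo</p>' > "$www/index.html"
echo 'unrelated' > "$www/about.html"

# 1. Preview: which files contain the word at all?
affected=$(grep -rl 'foo' "$www")
echo "$affected"

# 2. Replace, keeping each original next to the edited file as *.bak.
#    Existing *.bak files are skipped so a re-run does not edit backups.
find "$www" -type f ! -name '*.bak' -exec sed -i.bak 's/foo/bar/g' {} +

cat "$www/index.html"        # edited
cat "$www/index.html.bak"    # original content
```

Once you are happy with the result, the backups can be removed with something like find /www -name '*.bak' -delete, again after checking what that would match.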
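If you are on a Mac (using the Terminal app), note that the BSD sed shipped with macOS expects a backup suffix after -i (use -i '' for none), so the command above needs that adjustment. A grep-and-sed approach to find and replace may be what you are looking for.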