How to find all files containing a string? - command-line

I'm using this
# cat *.php* | grep -HRi error_reporting
This is my result
(standard input):$mosConfig_error_reporting = '0';
(standard input):error_reporting(E_ALL);
How can I find out what files contain the results?

Use the -l option to show only the file names:
grep -il "error_reporting" *php*
For recursion, you can use --include to restrict which files are searched:
grep -iRl --include=*php* "error_reporting" *
But if you also want to show line numbers, you need -n, and since -l suppresses normal output, the two don't combine. This is a workaround:
grep -iRn --include="*php*" "error_reporting" * | cut -d: -f-2
or
find . -type f -name "*php*" -exec grep -iHn "error_reporting" {} \; | cut -d: -f-2
The cut part removes the matched text, so the output looks like:
file1:line_of_matching
file2:line_of_matching
...
From man grep:
-l, --files-with-matches
       Suppress normal output; instead print the name of each input
       file from which output would normally have been printed. The
       scanning will stop on the first match. (-l is specified by POSIX.)
--include=GLOB
       Search only files whose base name matches GLOB (using wildcard
       matching as described under --exclude).
-n, --line-number
       Prefix each line of output with the 1-based line number within
       its input file. (-n is specified by POSIX.)
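As a quick illustration of the two modes (the file names and line numbers here are hypothetical):
$ grep -il "error_reporting" *php*
config.php
index.php
$ grep -iRn --include="*php*" "error_reporting" * | cut -d: -f-2
config.php:12
index.php:3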

Related

How to rename all files (without a for loop) in a single-line command?

I want to rename all the files in my home directory (for example, abc to abc_bkp) without using any loops, and it should be a single-line command in Unix (bash).
If the directory contains nothing but files, this should do it:
ls | xargs -I {} mv {} {}_bkp
If it contains subdirectories, links, and other things you don't want to rename, you must filter the output of ls. Here is a crude way to do it; maybe someone can suggest a more elegant approach:
ls -l | grep ^- | awk '{print $NF}' | xargs -I {} mv {} {}_bkp
If you don't want to use loops, then I believe the best way is the find command. Try the following command as a dry run first; once you are satisfied with the results, remove echo to actually perform the renames.
find . -type f -or -type d | xargs -I % echo mv % %_bkp
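Once the dry run looks right, drop the echo. A sketch for the files-only case from the question (assuming GNU find for -maxdepth):
# rename only the plain files directly under the current directory
find . -maxdepth 1 -type f | xargs -I % mv % %_bkp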
-I: From the man xargs page:
-I replace-str
       Replace occurrences of replace-str in the initial-arguments with
       names read from standard input. Also, unquoted blanks do not
       terminate input items; instead the separator is the newline
       character. Implies -x and -L 1.
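A quick way to see what -I does is to feed it a couple of hypothetical names:
$ printf 'a\nb\n' | xargs -I % echo mv % %_bkp
mv a a_bkp
mv b b_bkp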

find file names by searching the last lines for pattern

I have to find, among a large number of large ASCII files, all the files which contain a specific pattern. At the moment I'm doing that with
grep -l <pattern> <files>
and it's very slow.
But I know that the pattern appears in the last 10 lines, if it appears at all. Is there an elegant way to search only the last lines to speed things up, e.g. with awk?
You can simply print the filename yourself while processing:
for f in $files; do
    # grep -q only sets the exit status; print the name on a match
    tail -n 10 "$f" | grep -q "$pattern" && echo "$f"
done
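If the files are spread across a directory tree, the same check can be driven by find; a sketch, assuming the same $pattern variable as above:
# print the name of every file whose last 10 lines match $pattern
find . -type f -exec sh -c '
    tail -n 10 "$1" | grep -q "$2" && printf "%s\n" "$1"
' sh {} "$pattern" \;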
To see only a specific number of lines of a file, the tail syntax is as follows:
tail [+ number] [-l] [-b] [-c] [-r] [-f] [-c number | -n number] [file]
You can then pipe the output to grep to do the matching, i.e.:
tail -n 10 <fileName> | grep <pattern>

Dynamically building an exclude list for both rsync & egrep

I wonder if anyone out there can assist me in trying to solve an issue.
I have written a set of shell scripts with the purpose of auditing remote file systems based on a GOLD build on an audit server.
As part of this, I do the following:
1) Use rsync to work out any new files or directories, any modified or removed files
2) Use find ${source_filesystem} -ls on both local & remote to work out permissions differences
Now as part of this there are certain files or directories that I am excluding, i.e. logs, trace files etc.
So in order to achieve this I use 2 methods:
1) RSYNC - I have an exclude-list that is added using --exclude-from flag
2) find -ls - I use an egrep -v statement to exclude the same things as the rsync exclude-list:
e.g. find -L ${source_filesystem} -ls | egrep -v "$SEXCLUDE_supt"
So my issue is that I have to maintain 2 separate lists, and this is a bit of an admin nightmare.
I am looking for some assistance or advice on whether it is possible to dynamically build a list of exclusions that can be used for both the rsync and the find -ls.
Here is the format of what the exclude lists look like:
RSYNC:
*.log
*.out
*.csv
logs
shared
tracing
jdk*
8.6_Code
rpsupport
dbarchive
inarchive
comms
PR116PICL
**/lost+found*/
dlxwhsr*
regression
tmp
working
investigation
Investigation
dcsserver_weblogic_*.ear
dcswebrdtEAR_weblogic_*.ear
FIND:
SEXCLUDE_supt="\.log|\.out|\.csv|logs|shared|PR116PICL|tracing|lost\+found|jdk|8\.6\_Code|rpsupport|dbarchive|inarchive|comms|dlxwhsr|regression|tmp|working|investigation|Investigation|dcsserver_weblogic_|dcswebrdtEAR_weblogic_"
You don't need to create a second list for your find command. grep can handle a list of patterns using the -f flag. From the manual:
-f FILE, --file=FILE
Obtain patterns from FILE, one per line. The empty file contains zero
patterns, and therefore matches nothing. (-f is specified by POSIX.)
Here's what I'd do:
find -L ${source_filesystem} -ls | grep -Evf your_rsync_exclude_file_here
This should also work for filenames containing spaces. Please let me know how it goes.
In the end the grep -Evf approach was a bit of a nightmare, as rsync doesn't support the same syntax: its excludes are glob patterns, not the regexes grep expects.
So I then pursued my other idea of dynamically building the exclude list for egrep by parsing the rsync exclude-list and building the variable on the fly to pass into egrep.
This is the method I used:
#!/bin/ksh
# Create Signature of current build
AFS=$1
#Create Signature File
crSig()
{
find -L ${SRC} -ls | egrep -v "$SEXCLUDE" | awk '{fws = ""; for (i = 11; i <= NF; i++) fws = fws $i " "; print $3, $6, fws}' | sort >${BASE}/${SIFI}.${AFS}
}
#Setup SRC, TRG & SCROOT
LoadAuditReqs()
{
export SRC=`grep ${AFS} ${CONF}/fileSystem.properties | awk {'print $2'}`
export TRG=`grep ${AFS} ${CONF}/fileSystem.properties | awk {'print $3'}`
export SCROOT=`grep ${AFS} ${CONF}/fileSystem.properties | awk {'print $4'}`
export BEXCLUDE=$(sed -e 's/[*/]//g' -e 's/\([._+-]\)/\\\1/g' ${CONF}/exclude-list.${AFS} | tr "\n" "|")
export SEXCLUDE=$(echo ${BEXCLUDE} | sed 's/\(.*\)|/\1/')
}
#Load Properties File
LoadProperties()
{
. /users/rpapp/rpmonit/audit_tool/conf/environment.properties
}
#Functions
LoadProperties
LoadAuditReqs
crSig
So with these new variables:
export BEXCLUDE=$(sed -e 's/[*/]//g' -e 's/\([._+-]\)/\\\1/g' ${CONF}/exclude-list.${AFS} | tr "\n" "|")
export SEXCLUDE=$(echo ${BEXCLUDE} | sed 's/\(.*\)|/\1/')
The first sed removes "*" and "/", then matches my special characters and prepends "\" to escape them.
tr then replaces each newline with "|", and the final sed strips the trailing "|", producing the $SEXCLUDE variable used by egrep in the crSig function.
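Run against a few hypothetical entries, the conversion behaves like this (using s/|$// here as an equivalent way to drop the trailing bar):
$ printf '*.log\nlost+found\n8.6_Code\n' | sed -e 's/[*/]//g' -e 's/\([._+-]\)/\\\1/g' | tr '\n' '|' | sed 's/|$//'
\.log|lost\+found|8\.6\_Code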
What do you think?

How can I traverse a directory tree using a bash or Perl script?

I am interested in getting into bash scripting and would like to know how you can traverse a Unix directory tree and log the path of any file whose contents match a regex criterion.
It would go like this:
Traverse a large Unix file/folder structure.
If the current file's contents contain a string that matches one or more regexes,
Then append the file's full path to a results text file.
Bash or Perl scripts are fine, although I would prefer how you would do this using a bash script with grep, awk, etc commands.
find . -type f -print0 | xargs -0 grep -l -E 'some_regexp' > /tmp/list.of.files
Important parts:
-type f makes find list only files
-print0 separates the file names with \0 instead of \n, so the pipeline works even when file names contain spaces (see the short demo below)
xargs -0 splits its input on \0 and passes each item as an argument to the command you provided (grep in this example)
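A small demonstration, with a hypothetical file name, of why the NUL separator matters:
$ touch 'name with spaces.txt'
$ find . -type f | xargs -n1 echo              # whitespace splitting breaks the name
./name
with
spaces.txt
$ find . -type f -print0 | xargs -0 -n1 echo   # NUL separation keeps it intact
./name with spaces.txt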
The cool thing about xargs is that if your directory contains a really large number of files, you can speed up the process by parallelizing it:
find . -type f -print0 | xargs -0 -P 5 -L 100 grep -l -E 'some_regexp' > /tmp/list.of.files
This will run grep in 5 parallel copies, each scanning a different batch of up to 100 files.
Use find and grep:
find . -exec grep -l -e 'myregex' {} \; >> outfile.txt
-l on the grep gets just the file name
-e on the grep specifies a regex
{} places each file found by the find command on the end of the grep command
>> outfile.txt appends to the text file
grep -l -R <regex> <location> should do the job.
If you wanted to do this from within Perl, you can take the find commands that people suggested and turn them into a Perl script with find2perl:
If you have:
$ find ...
make that
$ find2perl ...
That outputs a Perl program that does the same thing. From there, if you need to do something that's easy in Perl but hard in shell, you just extend the Perl program.
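For example, the translation step itself is just this (find2perl ships with many Perl distributions):
$ find2perl . -type f -name '*.txt' > findfiles.pl
$ perl findfiles.pl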
find /path -type f -name "*.txt" | awk '
{
    # $0 is a file name; read that file line by line
    while ((getline line < $0) > 0) {
        if (line ~ /pattern/) {
            print $0 ":" line
            # do some other things here
        }
    }
    close($0)   # close each file, or a big tree exhausts file descriptors
}'

Unable to filter rows which contain "Is a directory" by SED/AWK

Running this code gives me the following sample data:
md5deep find * | awk '{ print $1 }'
A sample of the output
/Users/math/Documents/Articles/Number theory: Is a directory
258fe6853b1bfb2d07f512ff6bec52b1
/Users/math/Documents/Articles/Probability and statistics: Is a directory
4811bfb2ad04b9f4318049c01ebb52ef
8aae4ac3694658cf90005dbdea37b4d5
258fe6853b1bfb2d07f512ff6bec52b1
I have tried, unsuccessfully, to filter out the rows which contain "Is a directory" with sed:
md5deep find * | awk '{ print $1 }' | sed s/\/*//g
Its sample output is
/Users/math/Documents/Articles/Number theory: Is a directory
/Users/math/Documents/Articles/Topology: Is a directory
/Users/math/Documents/Articles/useful: Is a directory
How can I filter out each row which contains "Is a directory" with sed/awk?
[clarification]
I want to filter out the rows which contain Is a directory.
I have not used the md5deep tool, but I believe those lines are error messages; they would be going to standard error instead of standard output, and so they are going directly to your terminal instead of through the pipe. Thus, they won't be filtered by your sed command. You could filter them by merging your standard error and standard output streams, though there are better options, covered below.
It looks like (I'm not sure because you are missing the backquotes) you are trying to call
md5deep `find *`
and find is returning all of the files and directories.
Some notes on what you might want to do:
It looks like md5deep has a -r for "recursive" option. So, you may want to try:
md5deep -r *
instead of the find command.
If you do wish to use a find command, you can limit it to only files using -type f, instead of files and directories. Also, you don't need to pass * into a find command (which may confuse find if there are files whose names look like the options that find understands); passing in . will search recursively through the current directory.
find . -type f
In sed if you wish to use slashes in your pattern, it can be a pain to quote them correctly with \. You can instead choose a different character to delimit your regular expression; sed will use the first character after the s command as a delimiter. Your pattern is also lacking a .; in regular expressions, to indicate one instance of any character you use ., and to indicate "zero or more of the preceding expression" you use *, so .* indicates "zero or more of any character" (this is different from glob patterns, in which * alone means "zero or more of any character").
sed "s|/.*||g"
If you really do want to be including your standard error stream in your standard output, so it will pass through the pipe, then you can run:
md5deep `find *` 2>&1 | awk ...
If you just want to ignore stderr, you can redirect that to /dev/null, which is a special file that just discards anything that goes into it:
md5deep `find *` 2>/dev/null | awk ...
In summary, I think the command below will help you with your immediate problem, and the other suggestions listed above may help you if I did not understand what you were looking for:
md5deep -r * | awk '{ print $1 }'
To specifically answer the clarification: how to filter out lines using awk and sed:
awk '/Is a directory/ {next} {print}'
sed '/Is a directory/d'
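Combined with the stream merge from the first answer, a sketch of the full pipeline might look like:
md5deep `find *` 2>&1 | sed '/Is a directory/d' | awk '{ print $1 }'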
Why not use grep instead?
i.e.:
md5deep find * | grep "Is a directory" | awk '{ print $1 }'
Edit: I just re-read your question and if you want to remove the lines with Is a directory, use the -v flag of grep, ie:
md5deep find * | grep -v "Is a directory" | awk '{ print $1 }'
I'm not intimately familiar with md5deep, but this may do something like what you are trying to do.
find . -type f -exec md5sum {} +
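md5sum prints the checksum first, so the question's awk step carries over unchanged:
# output is "<hash>  <path>" per line; keep just the checksum column
find . -type f -exec md5sum {} + | awk '{ print $1 }'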