unix find and replace text in dir and subdirs - perl

I'm trying to change the name of "my-silly-home-page-name.html" to "index.html" in all documents within a given master directory and subdirs.
I saw this: Shell script - search and replace text in multiple files using a list of strings.
And this: How to change all occurrences of a word in all files in a directory
I have tried this:
grep -r "my-silly-home-page-name.html" .
This finds the lines on which the text exists, but now I would like to replace 'my-silly-home-page-name' with 'index'.
How would I do this with sed or perl?
Or do I even need sed/perl?
Something like:
grep -r "my-silly-home-page-name.html" . | sed 's/$1/'index'/g'
?
Also; I am trying this with perl, and I try the following:
perl -i -p -e 's/my-silly-home-page-name\.html/index\.html/g' *
This works, but I get an error when perl encounters directories, saying "Can't do inplace edit: SOMEDIR-NAME is not a regular file, <> line N"
Thanks,
jml

find . -type f -exec \
perl -i -pe's/my-silly-home-page-name(?=\.html)/index/g' {} +
Or if your find doesn't support -exec +,
find . -type f -print0 | xargs -0 \
perl -i -pe's/my-silly-home-page-name(?=\.html)/index/g'
Both pass Perl as many file names per invocation as possible. Both work with any file name, including those that contain newlines.
If you are on Windows and you are using a Windows build of Perl (as opposed to a cygwin build), -i won't work unless you also make a backup of the original. Change -i to -i.bak. You can then delete the backups using
find . -type f -name '*.bak' -delete
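Once the in-place edits are done, re-running the grep from the question is a quick sanity check that nothing was missed (it should print no output):
grep -r "my-silly-home-page-name" .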

This should do the job:
find . -type f -print0 | xargs -0 sed -e 's/my-silly-home-page-name\.html/index\.html/g' -i
Basically it recursively gathers all the files under the given directory (. in the example) with find and, via xargs, runs sed on them with the same substitution command as in the perl command from the question.
Regarding the question about sed vs. perl, I'd say that you should use the one you're more comfortable with since I don't expect huge differences (the substitution command is the same one after all).

There are probably better ways to do this but you can use:
find . -name oldname.html | perl -e 'map { s/[\r\n]//g; $old = $_; s/oldname\.html$/newname.html/; rename $old,$_ } <>';
Fyi, grep searches for a pattern; find searches for files.
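If what you actually want is to rename the files themselves rather than edit references inside other documents, here is a sketch using find alone (assuming a find with -execdir, which GNU and BSD find both provide; the file names are the ones from the question):
find . -type f -name 'my-silly-home-page-name.html' -execdir mv {} index.html \;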

Related

How to rename all the files (without for loop) in a single line command?

I want to rename all the files in my home directory (for example abc) to the format abc_bkp, without using any loops, and it should be a single-line command in unix (bash script).
If the directory contains nothing but files, this should do it:
ls | xargs -I {} mv {} {}_bkp
If it contains subdirectories, links, and other things you don't want to rename, you must filter the output of ls. Here is a crude way to do it; maybe someone can suggest a more elegant approach:
ls -l | grep ^- | cut -d' ' -f 13 | xargs -I {} mv {} {}_bkp
If you don't want to use loops then I believe the best way is the find command. Try the following command as a dry run first; once you are satisfied with the results, remove the echo from it to do the actual renaming.
find -type f -or -type d | xargs -I % echo mv % %_bkp
-I: From the xargs man page:
-I replace-str
Replace occurrences of replace-str in the initial-arguments with names read from standard input. Also, unquoted blanks do not terminate input items; instead the separator is the newline character. Implies -x and -L 1.
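To see what -I does before committing to anything, here is a harmless dry run with echo (the names are made up):
printf '%s\n' abc def | xargs -I % echo mv % %_bkp
mv abc abc_bkp
mv def def_bkp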

Perl: edit file, not just output to shell

I found a little one-liner of perl code that will change the serial in my zone-files on my Bind server.
However, it won't change the actual file; it just prints the output directly to the shell.
This is what I run:
find . -type f -print0 | xargs -0 perl -e "while(<>){ s/\d+(\s*;\s*[sS]erial)/2015050466\1/; print; }"
This gives me the correct output in the shell. If I remove the print; at the end of the Perl code, nothing happens at all, but what I actually want is for the files themselves to be changed to that output.
I'm a total noob when it comes to Perl, so this might be a simple fix; any answer would be appreciated.
I am assuming you want to replace the string inside the files found by find.
The example command below changes, in place (-i), every "foo" to "bar" in all *.txt files under the current directory.
find . -type f -name '*.txt' -print0 | xargs -0 perl -p -i -e 's/foo/bar/g;'
And for your question, you should be able to get it with this command:
find . -type f -print0 | xargs -0 perl -p -i -e 's/\d+(\s*;\s*[sS]erial)/2015050466\1/;'
Note: It is a good habit to always use single quotes rather than double quotes, because inside double quotes a \, $, etc. may be processed by the shell before being passed to Perl. See the Bash manual.
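A quick way to see the difference is to echo the same substitution both ways. Inside double quotes the shell expands $1 (its own first positional parameter, usually empty) before Perl ever sees the code; inside single quotes the text survives intact:
echo "s/\d+/2015050466$1/"
s/\d+/2015050466/
echo 's/\d+/2015050466$1/'
s/\d+/2015050466$1/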

Change multiple files

The following command is correctly changing the contents of 2 files.
sed -i 's/abc/xyz/g' xaa1 xab1
But what I need is to change several such files dynamically, and I do not know the file names in advance. I want to write a command that reads all the files in the current directory whose names start with xa and lets sed change their contents.
I'm surprised nobody has mentioned the -exec argument to find, which is intended for this type of use-case, although it will start a process for each matching file name:
find . -type f -name 'xa*' -exec sed -i 's/asd/dsg/g' {} \;
Alternatively, one could use xargs, which will invoke fewer processes:
find . -type f -name 'xa*' | xargs sed -i 's/asd/dsg/g'
Or more simply use the + exec variant instead of ; in find to allow find to provide more than one file per subprocess call:
find . -type f -name 'xa*' -exec sed -i 's/asd/dsg/g' {} +
Better yet:
for i in xa*; do
sed -i 's/asd/dfg/g' "$i"
done
because nobody knows how many files there are, and it's easy to hit command-line length limits.
Here's what happens when there are too many files:
# grep -c aaa *
-bash: /bin/grep: Argument list too long
# for i in *; do grep -c aaa $i; done
0
... (output skipped)
#
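The limit being hit is the kernel's maximum size for the arguments to a single exec call; on Linux you can inspect it with getconf (typical output shown):
getconf ARG_MAX
2097152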
You could use grep and sed together. This allows you to search subdirectories recursively.
Linux: grep -r -l <old> * | xargs sed -i 's/<old>/<new>/g'
OS X: grep -r -l <old> * | xargs sed -i '' 's/<old>/<new>/g'
For grep:
-r recursively searches subdirectories
-l prints file names that contain matches
For sed:
-i extension (Note: An argument needs to be provided on OS X)
Those commands won't work in the default sed that comes with Mac OS X.
From man 1 sed:
-i extension
Edit files in-place, saving backups with the specified
extension. If a zero-length extension is given, no backup
will be saved. It is not recommended to give a zero-length
extension when in-place editing files, as you risk corruption
or partial content in situations where disk space is exhausted, etc.
Tried
sed -i '.bak' 's/old/new/g' logfile*
and
for i in logfile*; do sed -i '.bak' 's/old/new/g' $i; done
Both work fine.
@PaulR posted this as a comment, but people should view it as an answer (and this answer works best for my needs):
sed -i 's/abc/xyz/g' xa*
This will work for a moderate number of files, probably on the order of tens of thousands, but not on the order of millions.
Another more versatile way is to use find:
sed -i 's/asd/dsg/g' $(find . -type f -name 'xa*')
I use find for similar tasks. It is quite simple: pass its output to sed as the file arguments, like this:
sed -i 's/EXPRESSION/REPLACEMENT/g' `find -name "FILE.REGEX"`
This way you don't have to write complex loops, and it is easy to see which files you are going to change: just run the find by itself before you run sed.
You can search for the text 'xxxx' and replace it with 'yyyy' like this:
grep -Rn 'xxxx' /path | awk -F: '{print $1}' | xargs sed -i 's/xxxx/yyyy/'
There are some good answers above. I thought I'd throw in one more that is succinct and parallelizable, using GNU parallel, which I often prefer to xargs:
parallel sed -i 's/abc/xyz/g' {} ::: xa*
Combine this with the -j N option to run N jobs in parallel.
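For example, a sketch capping it at four concurrent sed processes over the same file set:
parallel -j 4 sed -i 's/abc/xyz/g' {} ::: xa*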
If you are able to run a script, here is what I did for a similar situation:
Using a dictionary/hash map (associative array) and variables for the sed command, we can loop through the array and replace several strings. Including a wildcard in name_pattern allows replacing in place in files matching a pattern (something like name_pattern='File*.txt') in a specific directory (source_dir).
All the changes are written to the logfile in destin_dir.
#!/bin/bash
source_dir=source_path
destin_dir=destin_path
logfile='sedOutput.txt'
name_pattern='File.txt'
echo "--Begin $(date)--" | tee -a $destin_dir/$logfile
echo "Source_DIR=$source_dir destin_DIR=$destin_dir "
declare -A pairs=(
['WHAT1']='FOR1'
['OTHER_string_to replace']='string replaced'
)
for i in "${!pairs[#]}"; do
j=${pairs[$i]}
echo "[$i]=$j"
replace_what=$i
replace_for=$j
echo " "
echo "Replace: $replace_what for: $replace_for"
find "$source_dir" -name "$name_pattern" | xargs sed -i "s/$replace_what/$replace_for/g"
find "$source_dir" -name "$name_pattern" | xargs -I{} grep -n "$replace_for" {} /dev/null | tee -a "$destin_dir/$logfile"
done
echo " "
echo "----End $(date)---" | tee -a $destin_dir/$logfile
First, the pairs array is declared; each pair defines one replacement, so WHAT1 will be replaced with FOR1, and OTHER_string_to replace will be replaced with string replaced, in the file File.txt. In the loop the array is read: the first member of each pair is retrieved as replace_what=$i and the second as replace_for=$j. The find command searches the directory for the filename (which may contain a wildcard) and the sed -i command performs the replacement in place in the file(s) found. Finally I added a grep redirected to the logfile to log the changes made in the file(s).
This worked for me in GNU Bash 4.3 and sed 4.2.2, and is based upon VasyaNovikov's answer for Loop over tuples in bash.
The Silver Searcher Solution
I'm adding another option for those people who don't know about the amazing tool called The Silver Searcher (command line tool is ag).
Note: You can use grep and other tools to do the same thing here, but The Silver Searcher is fantastic :)
TLDR
ag -l 'abc' | xargs sed -i 's/abc/xyz/g'
Install The Silver Searcher
sudo apt install silversearcher-ag # Debian / Ubuntu
sudo pacman -S the_silver_searcher # Arch / EndeavourOS
sudo yum install epel-release the_silver_searcher # RHEL / CentOS
Demo Files
Paste the following into your terminal to create some demonstration files:
mkdir /tmp/food
cd /tmp/food
content="Everybody loves to abc this food!"
echo "$content" > ./milk
echo "$content" > ./bread
mkdir ./fastfood
echo "$content" > ./fastfood/pizza
echo "$content" > ./fastfood/burger
mkdir ./fruit
echo "$content" > ./fruit/apple
echo "$content" > ./fruit/apricot
Using 'ag'
The following ag command will recursively find all the files that contain the string 'abc'. It ignores the .git directory, .gitignore files, and other ignore files:
$ ag 'abc'
milk
1:Everybody loves to abc this food!
bread
1:Everybody loves to abc this food!
fastfood/burger
1:Everybody loves to abc this food!
fastfood/pizza
1:Everybody loves to abc this food!
fruit/apple
1:Everybody loves to abc this food!
fruit/apricot
1:Everybody loves to abc this food!
To just list the files that contain the string 'abc', use the -l switch:
$ ag -l 'abc'
bread
fastfood/burger
fastfood/pizza
fruit/apricot
milk
fruit/apple
Changing Multiple Files
Finally, using xargs and sed, we can replace the 'abc' string with another string:
ag -l 'abc' | xargs sed -i 's/abc/eat/g'
In the above command, ag lists all the files that contain the string 'abc', and xargs passes those file names as arguments to the sed command.
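If any of the matched file names could contain spaces or newlines, a safer variant is NUL-separated output, assuming your ag build has the -0/--null flag (it pairs with xargs -0):
ag -0 -l 'abc' | xargs -0 sed -i 's/abc/eat/g'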

How to find and replace all occurrences of a string recursively in a directory tree? [duplicate]

This question already has answers here:
How can I do a recursive find/replace of a string with awk or sed?
(37 answers)
Closed 1 year ago.
Using just grep and sed, how do I replace all occurrences of:
a.example.com
with
b.example.com
within the text files under the /home/user/ directory tree, recursively finding and replacing all occurrences in all files in sub-directories as well.
Try this:
find /home/user/ -type f | xargs sed -i 's/a\.example\.com/b.example.com/g'
In case you want to ignore dot directories
find . \( ! -regex '.*/\..*' \) -type f | xargs sed -i 's/a\.example\.com/b.example.com/g'
Edit: escaped dots in search expression
Try this:
grep -rl 'SearchString' ./ | xargs sed -i 's/REPLACESTRING/WITHTHIS/g'
grep -rl recursively searches for 'SearchString' under ./ and lists the matching files; sed then performs the replacement in those files.
Ex:
Replacing the name TOM with JERRY, using SWATKATS as the search string, in directory CARTOONNETWORK:
grep -rl 'SWATKATS' CARTOONNETWORK/ | xargs sed -i 's/TOM/JERRY/g'
This will replace TOM with JERRY in all the files and subdirectories under CARTOONNETWORK wherever it finds the string SWATKATS.
On macOS, none of the answers worked for me. I discovered that was due to differences in how sed works on macOS and other BSD systems compared to GNU sed.
In particular, BSD sed takes the -i option but requires a suffix for the backup (an empty suffix is permitted).
grep version from this answer.
grep -rl 'foo' ./ | LC_ALL=C xargs sed -i '' 's/foo/bar/g'
find version from this answer.
find . \( ! -regex '.*/\..*' \) -type f | LC_ALL=C xargs sed -i '' 's/foo/bar/g'
Don't omit the Regex to ignore . folders if you're in a Git repo. I realized that the hard way!
That LC_ALL=C option is to avoid getting sed: RE error: illegal byte sequence if sed finds a byte sequence that is not a valid UTF-8 character. That's another difference between BSD and GNU. Depending on the kind of files you are dealing with, you may not need it.
For some reason that is not clear to me, the grep version found more occurrences than the find one, which is why I recommend using grep.
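If a script has to run on both GNU and BSD/macOS systems, one workaround is to detect the sed flavor first. A minimal sketch, assuming "$f" holds the target file (GNU sed accepts --version, BSD sed does not):
if sed --version >/dev/null 2>&1; then
    sed -i 's/foo/bar/g' "$f"      # GNU sed: suffix optional
else
    sed -i '' 's/foo/bar/g' "$f"   # BSD sed: empty suffix required
fi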
I know this is a really old question, but...
@vehomzzz's answer uses find and xargs when the question explicitly says grep and sed only.
@EmployedRussian and @BrooksMoses tried to say it was a dup of an awk and sed question, but it's not - again, the question explicitly says grep and sed only.
So here is my solution, assuming you are using Bash as your shell:
OLDIFS=$IFS
IFS=$'\n'
for f in `grep -rl a.example.com .` # Use -irl instead of -rl for case insensitive search
do
sed -i 's/a\.example\.com/b.example.com/g' $f # Use /gi instead of /g for case insensitive search
done
IFS=$OLDIFS
If you are using a different shell, let me know and I will try to find a syntax adjustment.
P.S.: Here's a one-liner:
OLDIFS=$IFS;IFS=$'\n';for f in `grep -rl a.example.com .`;do sed -i 's/a\.example\.com/b.example.com/g' $f;done;IFS=$OLDIFS
Sources:
Bash: Iterating over lines in a variable
grep(1) - Linux man page
Official Grep Manual
sed(1) - Linux man page
Official sed Manual
The following command works for me:
find /path/to/dir -name "file.txt" | xargs sed -i 's/string_to_replace/new_string/g'
If the string contains a slash, such as a path like 'path/to/dir', you can use a different delimiter character in the sed expression, like '#' instead of '/'.
For example: 's#string/to/replace#new/string#g'
We can try using the more powerful ripgrep:
rg "BYE_BYE_TEXT" ./ --files-with-matches | xargs sed -i "s/BYE_BYE_TEXT/WELCOME_TEXT/g"
Because ripgrep is good at finding and sed is great at replacing.
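As elsewhere in this thread, NUL-separated file names are safer when paths may contain spaces; this assumes a ripgrep build with the --null flag:
rg --files-with-matches --null "BYE_BYE_TEXT" ./ | xargs -0 sed -i "s/BYE_BYE_TEXT/WELCOME_TEXT/g"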
It is much simpler than that.
for i in $(find . -type f); do sed -i -- 's/search string/target string/g' "$i"; done
find . -type f => iterates over all the files in the folder and its subfolders.
sed -i => replaces the relevant string in each file, if it exists.
Try this command:
cd /home/user/ && find ./ -type f \
-exec sed -i -e 's/a\.example\.com/b.example.com/g' {} \;
The command below will search all the files recursively whose name matches the search pattern and will replace the string:
find /path/to/searchdir/ -name "serachpatter" -type f | xargs sed -i 's/stringone/StrIngTwo/g'
Also, if you want to limit the depth of recursion you can add those limits as well (with GNU find, -mindepth/-maxdepth should come before the other tests):
find /path/to/searchdir/ -mindepth 2 -maxdepth 4 -name "searchpattern" -type f | xargs sed -i 's/stringone/StrIngTwo/g'

How can I traverse a directory tree using a bash or Perl script?

I am interested in getting into bash scripting and would like to know how you can traverse a unix directory and log the path to each file whose contents match a regex criterion.
It would go like this:
Traverse a large unix directory file/folder structure.
If the current file's contents contain a string that matches one or more regexes,
then append the file's full path to a results text file.
Bash or Perl scripts are fine, though I would prefer to see how you would do this with a bash script using grep, awk, etc.
find . -type f -print0 | xargs -0 grep -l -E 'some_regexp' > /tmp/list.of.files
Important parts:
-type f makes the find list only files
-print0 prints the files separated not by \n but by \0 - it is here to make sure it will work in case you have files with spaces in their names
xargs -0 - splits input on \0, and passes each element as an argument to the command you provided (grep in this example)
The cool thing about using xargs is that if your directory really contains a lot of files, you can speed up the process by parallelizing it:
find . -type f -print0 | xargs -0 -P 5 -L 100 grep -l -E 'some_regexp' > /tmp/list.of.files
This will run the grep command in 5 separate copies, each scanning a different set of up to 100 files.
use find and grep
find . -exec grep -l -e 'myregex' {} \; >> outfile.txt
-l on the grep gets just the file name
-e on the grep specifies a regex
{} places each file found by the find command on the end of the grep command
>> outfile.txt appends to the text file
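If your find supports -exec ... +, the same search runs grep on batches of files instead of one process per file; -type f is added here so grep is never pointed at a directory:
find . -type f -exec grep -l -e 'myregex' {} + >> outfile.txt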
grep -l -R <regex> <location> should do the job.
If you wanted to do this from within Perl, you can take the find commands that people suggested and turn them into a Perl script with find2perl:
If you have:
$ find ...
make that
$ find2perl ...
That outputs a Perl program that does the same thing. From there, if you need to do something that is easy in Perl but hard in shell, you just extend the Perl program.
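Newer Perl releases no longer ship find2perl (it moved to CPAN), so here is a minimal hand-rolled sketch using the core File::Find module; the regex and output path are placeholders:
#!/usr/bin/perl
use strict;
use warnings;
use File::Find;

open my $out, '>', '/tmp/list.of.files' or die $!;
find(sub {
    return unless -f;                # files only
    open my $fh, '<', $_ or return;  # skip unreadable files
    while (<$fh>) {
        if (/some_regexp/) {
            print {$out} "$File::Find::name\n";
            last;                    # one hit is enough, like grep -l
        }
    }
}, '.');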
find /path -type f -name "*.txt" | awk '
{
while((getline line<$0)>0){
if(line ~ /pattern/){
print $0":"line
#do some other things here
}
}
}'