Using sed to find and delete across multiple files recursively

How can I do a string match against, for example:
<meta name="keywords" content="
Then delete that whole line every time a match is found?
I'm looking to do this for all files in the current directory and below.
I'm also new to sed.

Try this command:
find . -type f -exec sed -i '/foobar/d' {} \;
Change foobar to what you search for.

In answer to the question "How do I do x to all files recursively?", the answer is to use find. To delete a line with sed, you can either use the non-portable -i option or write a small script that redirects sed's output to a temporary file and then moves that file over the original. For example:
find . -type f -exec sh -c 'f=/tmp/t.$$;
sed "/<meta name=\"keywords\" content=\"/d" "$0" > "$f" && mv "$f" "$0"' {} \;
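Here is a minimal sketch of the find-plus-sed approach, run against a throwaway tree so nothing real is touched. The directory layout, file names, and the .bak suffix are assumptions made up for the demo. It uses -i with an attached suffix because both GNU and BSD sed accept that form, although -i itself is not in POSIX.

```shell
# Build a scratch tree (hypothetical file names) so the demo is safe to run.
demo=$(mktemp -d)
mkdir -p "$demo/site/sub"
printf '%s\n' '<html>' '<meta name="keywords" content="a, b">' '<body>' > "$demo/site/index.html"
printf '%s\n' '<meta name="keywords" content="c">' '<p>kept</p>' > "$demo/site/sub/page.html"

# Delete every line containing the literal string, in all *.html files below $demo.
# -i.bak leaves a backup copy of each edited file next to it.
find "$demo" -type f -name '*.html' \
  -exec sed -i.bak '/<meta name="keywords" content="/d' {} +

# Capture what is left, show it, then remove the scratch tree.
index_after=$(cat "$demo/site/index.html")
page_after=$(cat "$demo/site/sub/page.html")
printf '%s\n' "$index_after" "$page_after"
rm -rf "$demo"
```

Once you are happy with the result on real files, the backups can be removed with something like find . -type f -name '*.bak' -delete (where your find supports -delete).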

Related

sed not working properly with multiple input files

sed -i is creating a backup of all files in subdirectories before editing in place (as expected) but it's not actually editing files in subdirectories.
$ mkdir -p a/b
$ echo "A" > a/a.txt
$ echo "B" > a/b/b.txt
Now I have two text files, one in a and one in a subdirectory of a.
$ sed -i.bac "1s/^/PREPENDED /" a/**/*.txt
Backups are created for both:
$ find a
a
a/a.txt
a/a.txt.bac
a/b
a/b/b.txt
a/b/b.txt.bac
Only a.txt is edited:
$ cat a/a.txt
PREPENDED A
$ cat a/b/b.txt
B
I'm using ZSH (so I have globstar support) and I'm on Mac.
Why is this happening and how can I fix it?
It's happening because this sed numbers lines continuously across all of its input files, so the whole invocation has only a single line 1, which happens to be in a.txt. If you want the edit applied to the first line of each file, you need to invoke sed once per file.
for f in a/**/*.txt
do
sed ... "$f"
done
Since you need to descend through several levels of directories, find is a natural fit and lets you do what you want in a single line. If you are not familiar with find ... -exec '{}' \; it is worth taking a few minutes to look it up (a quick search on startpage.com will do). In your case, the following invocation works well:
find a -type f -name "*.txt" -exec sed -i.bac 's/^/PREPENDED /' '{}' \;
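Here find searches directory a and everything below it for regular files (-type f) matching *.txt, then -exec runs sed -i.bac 's/^/PREPENDED /' once for each file (the file name replaces '{}'), and the escaped \; marks the end of the -exec command. Note that this substitution has no 1 address, so it prefixes every line; it matches the question's 1s/^/PREPENDED / only because each test file contains a single line.
Results: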
$ ls -1 a
b
a.txt
a.txt.bac
$ ls -1 a/b
b.txt
b.txt.bac
$ cat a/a.txt
PREPENDED A
$ cat a/b/b.txt
PREPENDED B
As was correctly pointed out, with recursive globbing available (shopt -s globstar in bash; zsh expands ** by default) it is unnecessary to use find, as the following invocation of sed is sufficient:
sed -i.bac 's/^/PREPENDED /' a/**/*.txt
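To edit only the first line of each file, as the question asked, the 1 address has to be evaluated per file. A rough sketch, assuming a scratch directory and two-line test files invented for the demo:

```shell
# Scratch copy of the question's layout, with a second line added to each file
# so the difference between "first line" and "every line" is visible.
demo=$(mktemp -d)
mkdir -p "$demo/a/b"
printf '%s\n' 'A' 'second line' > "$demo/a/a.txt"
printf '%s\n' 'B' 'second line' > "$demo/a/b/b.txt"

# \; starts one sed process per file, so address 1 means "first line of this file".
find "$demo/a" -type f -name '*.txt' -exec sed -i.bac '1s/^/PREPENDED /' {} \;

# Capture the results, show them, then remove the scratch tree.
a_after=$(cat "$demo/a/a.txt")
b_after=$(cat "$demo/a/b/b.txt")
printf '%s\n' "$a_after" "$b_after"
rm -rf "$demo"
```

Running one sed per file is slower than batching, but it does not depend on how a particular sed counts lines across several input files.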

sed over multiple files in multiple directories

I have the following directory tree:
books>book(i)>cluster.pir
where book(i) is a set of subdirectories numbered 1 to 1023, each containing a file called cluster.pir.
The following sed command:
sed -i '/>/d' ./*.pir
will delete any line in the file containing '>' for any file with a .pir ext, which is great, but my various .pir files are located in their own book(i) directory. How do I get the command to span across all the directories? I have tried:
find ./*.pir -type f -exec sed -i '/>/d' ./*.pir
when starting in the 'book' parent directory, but I get:
find: missing argument to `-exec'
does anyone have any thoughts on this?
Thanks.
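The format for find is:
find . -exec command {} \;
where {} is replaced by the file name.
Edit: In your case the shell glob ./*.pir only matches files in the current directory, so let find do the matching with -name instead:
find . -type f -name '*.pir' -exec sed -i '/>/d' {} \;
This will call sed on every .pir file below the current directory.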
You can add a wildcard to span all directories:
sed -i '/>/d' ./book*/*.pir
I was having trouble using file wildcards with sed on my Mac, and this method worked fine (it relies on word splitting of the find output, so it assumes no file name contains whitespace):
FILE_PATH="/some/path/"
sed -i '' "s|search|replace|g" $(find "${FILE_PATH}" -name '*.ext')
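If directory or file names may contain spaces, a null-delimited pipeline avoids the word-splitting problem. This is only a sketch: the scratch directories "book 1" and "book 2" are made up for the demo, and -print0 and xargs -0 are extensions that GNU and BSD tools provide but POSIX does not require.

```shell
# Scratch tree whose directory names contain spaces (hypothetical layout).
demo=$(mktemp -d)
mkdir -p "$demo/book 1" "$demo/book 2"
printf '%s\n' '>header' 'SEQUENCE' > "$demo/book 1/cluster.pir"
printf '%s\n' 'DATA' '>P1;x' > "$demo/book 2/cluster.pir"

# Names are passed NUL-separated, so whitespace in paths is preserved.
# -i.bak keeps a backup and is accepted by both GNU and BSD sed.
find "$demo" -type f -name '*.pir' -print0 | xargs -0 sed -i.bak '/>/d'

# Capture the results, show them, then remove the scratch tree.
book1_after=$(cat "$demo/book 1/cluster.pir")
book2_after=$(cat "$demo/book 2/cluster.pir")
printf '%s\n' "$book1_after" "$book2_after"
rm -rf "$demo"
```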

unix find and replace text in dir and subdirs

I'm trying to change the name of "my-silly-home-page-name.html" to "index.html" in all documents within a given master directory and subdirs.
I saw this: Shell script - search and replace text in multiple files using a list of strings.
And this: How to change all occurrences of a word in all files in a directory
I have tried this:
grep -r "my-silly-home-page-name.html" .
This finds the lines on which the text exists, but now I would like to substitute 'my-silly-home-page-name' for 'index'.
How would I do this with sed or perl?
Or do I even need sed/perl?
Something like:
grep -r "my-silly-home-page-name.html" . | sed 's/$1/'index'/g'
?
Also; I am trying this with perl, and I try the following:
perl -i -p -e 's/my-silly-home-page-name\.html/index\.html/g' *
This works, but I get an error when perl encounters directories, saying "Can't do inplace edit: SOMEDIR-NAME is not a regular file, <> line N"
Thanks,
jml
find . -type f -exec \
perl -i -pe's/my-silly-home-page-name(?=\.html)/index/g' {} +
Or if your find doesn't support -exec +,
find . -type f -print0 | xargs -0 \
perl -i -pe's/my-silly-home-page-name(?=\.html)/index/g'
Both pass as many file names as possible to each perl invocation. Both work with any file name, including those that contain newlines.
If you are on Windows and you are using a Windows build of Perl (as opposed to a cygwin build), -i won't work unless you also do a backup of the original. Change -i to -i.bak. You can then go and delete the backups using
find . -type f -name '*.bak' -delete
This should do the job:
find . -type f -print0 | xargs -0 sed -e 's/my-silly-home-page-name\.html/index\.html/g' -i
Basically it gathers recursively all the files from the given directory (. in the example) with find and runs sed with the same substitution command as in the perl command in the question through xargs.
Regarding the question about sed vs. perl, I'd say that you should use the one you're more comfortable with since I don't expect huge differences (the substitution command is the same one after all).
If you want to rename the files themselves rather than edit their contents, there are probably better ways to do this, but you can use:
find . -name oldname.html |perl -e 'map { s/[\r\n]//g; $old = $_; s/oldname\.html$/newname.html/; rename $old,$_ } <>';
FYI, grep searches file contents for a pattern; find searches for files.
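The question's idea of combining grep with sed can be done inside find itself: a second -exec runs only if the first one succeeded. A sketch under assumed file names, with a .bak backup suffix chosen for the demo:

```shell
# Scratch tree: one file that mentions the old name, one that does not.
demo=$(mktemp -d)
mkdir -p "$demo/docs/sub"
printf '%s\n' '<a href="my-silly-home-page-name.html">home</a>' > "$demo/docs/nav.html"
printf '%s\n' 'nothing to change here' > "$demo/docs/sub/other.html"

# grep -q succeeds only for files containing the string, so sed (and its
# backup file) is limited to files that actually need the edit.
# '! -name *.bak' keeps freshly written backups out of the walk.
find "$demo" -type f ! -name '*.bak' \
  -exec grep -q 'my-silly-home-page-name\.html' {} \; \
  -exec sed -i.bak 's/my-silly-home-page-name\.html/index.html/g' {} \;

# Capture the results, show them, then remove the scratch tree.
nav_after=$(cat "$demo/docs/nav.html")
if [ -e "$demo/docs/sub/other.html.bak" ]; then other_touched=yes; else other_touched=no; fi
printf '%s\n' "$nav_after" "other.html touched: $other_touched"
rm -rf "$demo"
```

This starts two processes per matching file, so it trades speed for leaving unrelated files (and their timestamps) alone.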

sed to exclude directories

I am trying to run a replacement on many files at once with sed, using * as the file name. However, it tries to process directories too, gives an error, and terminates. Is there a simple way to overcome this?
I'm not sure exactly how you're using sed here but the normal way to process only regular files in UNIX is with the find command, something like:
find . -type f -exec sed 's/Hello/Goodbye/g' {} ';'
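The -type f test restricts you to regular files, not directories or FIFOs or any other sort of filesystem magic. As written, sed prints the result to standard output; add -i (where your sed supports it) to modify the files in place.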
If you run man find on your system, you will see a plethora of other options you can use.
To build on paxdiablo's answer, I cobbled this function together and added it to my bash aliases as rsed ('recursive sed'):
rsed() {
[[ -z $2 ]] && echo "usage: ${FUNCNAME[0]} oldtext newtext" && return
command find . -type f -exec sed -i "s/${1}/${2}/g" {} \;
}
Result:
> cat test/file
Hello how are you?
> rsed "Hello how are you?" "Fine thanks"
> cat test/file
Fine thanks
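One caveat with rsed is that both arguments are pasted straight into the sed expression, so text containing /, &, or regex characters such as . and * will not be treated literally. Below is a rough sketch of a literal-text variant. The names escape_pattern, escape_replacement, and rsed_literal are hypothetical helpers invented here, and the escaping assumes single-line arguments and basic (not extended) regular expressions.

```shell
# Hypothetical helpers: backslash-escape characters that are special in a
# basic regular expression, or in the replacement part of s///.
escape_pattern()     { printf '%s\n' "$1" | sed 's|[][\.*^$/]|\\&|g'; }
escape_replacement() { printf '%s\n' "$1" | sed 's|[\&/]|\\&|g'; }

# Like rsed, but treats both arguments as literal text and keeps .bak backups.
rsed_literal() {
  [ "$#" -eq 2 ] || { echo "usage: rsed_literal oldtext newtext" >&2; return 1; }
  find . -type f ! -name '*.bak' \
    -exec sed -i.bak "s/$(escape_pattern "$1")/$(escape_replacement "$2")/g" {} +
}

# Demo in a scratch directory: only the exact text 'a.b/c*' should change.
demo=$(mktemp -d)
printf '%s\n' 'path a.b/c* and aXb/c*' > "$demo/file"
( cd "$demo" && rsed_literal 'a.b/c*' 'x&y/z' )
literal_after=$(cat "$demo/file")
printf '%s\n' "$literal_after"
rm -rf "$demo"
```

Without the escaping, the pattern a.b/c* would either break the s/// syntax at the slash or, with another delimiter, also match aXb/c.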

sed command to write the name of file to HTML comment

I'm looking for a sed command that, with find, I can take a directory tree of JSP files and write the name of the file in an HTML comment to the top of the file.
This will allow me to review a legacy application's JSP call tree in the HTML source.
I'm thinking it will be a one liner for a talented sed guru...
something like:
find . -name '*.jsp' -exec sed ? ? ? {} \;
Maybe something using xargs is more appropriate, but I think sed is the tool that will do the work.
If you want to use sed, you can try
find -name "*.jsp" -exec sed -i '1i <!-- {} -->' {} \;
Works fine for me in the presence of /.
On Unix the filename will contain slashes (/), which are special characters for sed, so I would recommend this simpler approach that writes the filename at the bottom of the file. The filename is passed to sh as $1 rather than embedded in the script, so quotes or other odd characters in the name cannot break the command:
find . -name '*.jsp' -exec sh -c 'echo "<!-- $1 -->" >> "$1"' sh {} \;
To write the filename at the top of the file use this:
find . -name '*.jsp' -exec sh -c \
'echo "<!-- $1 -->" > "$1.new" && cat "$1" >> "$1.new" && mv "$1.new" "$1"' sh {} \;
N.B. The filename might contain characters that might render your HTML invalid, e.g. &, although I doubt that a JSP could have such a strange name.
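If the command may be run more than once, it helps to skip files that already carry the comment. A minimal sketch, assuming a scratch tree and a hypothetical function name add_headers; it uses a temporary .new file and mv instead of sed -i, so the rewritten file may not keep the original's permissions.

```shell
# Scratch tree with two JSP files (hypothetical names).
demo=$(mktemp -d)
mkdir -p "$demo/web/admin"
printf '%s\n' '<html>home</html>' > "$demo/web/index.jsp"
printf '%s\n' '<html>admin</html>' > "$demo/web/admin/users.jsp"

# Prepend "<!-- path -->" to every *.jsp below the given directory.
add_headers() {
  find "$1" -type f -name '*.jsp' -exec sh -c '
    for f in "$@"; do
      header="<!-- $f -->"
      # Skip files already tagged so the command can be rerun safely.
      [ "$(head -n 1 "$f")" = "$header" ] && continue
      { printf "%s\n" "$header"; cat "$f"; } > "$f.new" && mv "$f.new" "$f"
    done
  ' sh {} +
}

# Run twice: the second pass should leave the files unchanged.
( cd "$demo" && add_headers . && add_headers . )
first_line=$(head -n 1 "$demo/web/admin/users.jsp")
line_count=$(wc -l < "$demo/web/admin/users.jsp" | tr -d ' ')
printf '%s\n' "$first_line" "$line_count"
rm -rf "$demo"
```

The comment records the path exactly as find prints it, so running from a different starting directory produces a different header and the file would be tagged again.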