perl -pe to manipulate filenames - perl

I was trying to do some quick filename cleanup at the shell (zsh, if it matters). Renaming files. (I'm using cp instead of mv just to be safe)
foreach f (\#*.ogg)
cp $f `echo $f | perl -pe 's/\#\d+ (.+)$/"\1"/'`
end
Now, I know there are tools to do stuff like this, but for personal interest I'm wondering how I can do it this way. Right now, I get an error:
cp: target `When.ogg"' is not a directory
Where 'When.ogg' is the last part of the filename. I've tried adding quotes (see above) and escaping the spaces, but nonetheless this is what I get.
Is there a reason I can't use the output of a perl one-liner as the final argument to another command line tool?

It looks like you have a space in the file names being processed, so each of your cp command lines evaluates to something like
cp \#nnnn When.ogg When.ogg
When the cp command sees more than two arguments, the last one must be a target directory name for all the files to be copied to - hence the error message. Because your source filename ($f) contains a space it is being treated as two arguments - cp sees three args, rather than the two you intend.
If you put double quotes around the first $f that should prevent the two 'halves' of the name from being treated as separate file names:
cp "$f" `echo ...

This is what you need in bash, hope it's good for zsh too.
cp "$f" "`echo $f | perl -pe 's/\#\d+ (.+)$/\1/'`"
If the filename contains spaces, you also have to quote the second argument of cp.
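To see why the quotes matter, here is a minimal sketch that counts the arguments a command receives. The count_args helper and the sample filename are made up for illustration. Note that zsh, unlike bash, does not split an unquoted $f by default, but both shells split the output of an unquoted command substitution, so the second argument needs quoting either way.

```shell
# Hypothetical helper: print how many arguments it was given.
count_args() { echo "$#"; }

# Made-up sample name with a space after the "#NN " prefix is removed.
f='#12 So When.ogg'

# Unquoted command substitution: the output is split on whitespace.
n_unquoted=$(count_args $(printf '%s\n' "$f" | sed 's/^#[0-9]* //'))

# Quoted command substitution: the output stays one argument.
n_quoted=$(count_args "$(printf '%s\n' "$f" | sed 's/^#[0-9]* //')")

echo "unquoted: $n_unquoted, quoted: $n_quoted"
# prints: unquoted: 2, quoted: 1
```

With two arguments coming out of the substitution, cp sees three arguments in total and treats the last one as a directory, which matches the error in the question.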

I often use
dir /b ... | perl -nle"$o=$_; s/.../.../; $n=$_; rename $o,$n if !-e $n"
The -l chomps the input.
The -e check is to avoid accidentally renaming all the files to one name. I've done that a couple of times.
In a Unix shell (the foreach form is zsh syntax), that would be
foreach f (...)
echo "$f" | perl -nle'$o=$_; s/.../.../; $n=$_; rename $o,$n if !-e $n'
end
or
find -name '...' -maxdepth 1 \
| perl -nle'$o=$_; s/.../.../; $n=$_; rename $o,$n if !-e $n'
or
find -name '...' -maxdepth 1 -exec \
perl -e'for (@ARGV) {
$o=$_; s/.../.../; $n=$_;
rename $o,$n if !-e $n;
}' {} +
The last supports file names with newlines in them.
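The same "only rename if the target does not exist" guard can be sketched in plain shell. This is an illustration under assumptions, not a replacement for the Perl one-liners: the scratch directory and the three sample files are invented, and sed stands in for the Perl substitution.

```shell
# Work in a scratch directory with made-up sample files.
tmp=$(mktemp -d)
touch "$tmp/#01 When.ogg" "$tmp/#02 Then.ogg" "$tmp/#03 When.ogg"

for old in "$tmp"/\#*.ogg; do
  base=${old##*/}
  # Strip the leading "#NN " prefix from the file name.
  new=$(printf '%s\n' "$base" | sed 's/^#[0-9][0-9]* //')
  # Skip the rename if the target already exists (same role as !-e $n).
  if [ -e "$tmp/$new" ]; then
    echo "skipping $base: $new already exists"
  else
    mv -- "$old" "$tmp/$new"
  fi
done
```

Here "#03 When.ogg" is left alone because "When.ogg" was already created from "#01 When.ogg", which is exactly the accident the -e check guards against.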

Related

How to rename a zero-padded file sequence efficiently in ZSH?

I have a picture sequence named with zero-padded numbers like so:
/path/to/file_07469.jpx
/path/to/file_07470.jpx
/path/to/file_07471.jpx
/path/to/file_07472.jpx
/path/to/file_07473.jpx
/path/to/file_07474.jpx
/path/to/file_07475.jpx
/path/to/file_07476.jpx
/path/to/file_07477.jpx
/path/to/file_07478.jpx
/path/to/file_07479.jpx
/path/to/file_07480.jpx
/path/to/file_07481.jpx
/path/to/file_07482.jpx
This is just an extract. It is thousands of files. I’d like to rename all files from a certain number on, adding / subtracting X. I’d love to use find with a regex.
#!/bin/zsh
shift=-1000
seqnumstart="$(echo "$1" | grep -Eo "\d+")"
bn="$(basename $1)"
bbn="$(echo "${bn%_*}")"
ext="$(echo "${bn##*.}")"
find "$(dirname $1)" -name "$bbn*$ext" -print0 | while read -d $'\0' file
do
seqnum="$(echo "$file" | grep -Eo "\d+")"
seqnum="$(echo "${seqnum#"${seqnum%%[!0]*}"}")"
if [[ "$seqnum" -ge "$seqnumstart" ]]; then
seqnumnew=$(($seqnum + $shift))
seqnumnew=$(printf %05d $seqnumnew)
filenew="$(echo $file | sed -E 's [0-9]+ '$seqnumnew' g')"
mv "$file" "$filenew"
fi
done
How can I improve my code? It is very slow. I'm on a Mac (zsh).
zmv is a utility in zsh that can do a lot of filename manipulation and looping for you. Try this:
zmv -n 'p/file_(<7000-7999>).jpx' 'p/file_$(printf "%05d" $(($1 - 1000))).jpx'
Some of the pieces:
zmv: an autoload function; use autoload -Uz zmv to make it available (this is usually added to .zshrc).
-n: no-op. With this option, zmv will just print what would have happened, giving you an idea if the command is correct. Remove this to actually mv the files.
(...): grouping operator for zmv. This identifies sections in the name that you want to change; this section is referenced in the 'to' argument as $1.
<7000-7999>: glob operator for a range. Note that leading zeroes are not always required.
$(printf "%05d" ...): zero-padding.
$((...)): arithmetic.
$1: reference to the parenthetical value in the 'from' argument. This is where zmv's magic happens - this is substituted for each matching filename.
As you likely know, you'll need to do the renaming in groups or in a specific order to avoid trying to change a name to a name that already exists. zmv will usually halt when it encounters collisions like that.
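The padding arithmetic on its own can be tried in a plain POSIX shell, without zmv. This is a minimal sketch: shift_seq is a hypothetical helper name, and it strips leading zeros first (the same parameter expansion the question uses) so that shells which treat a leading 0 as octal do not misread the number.

```shell
# shift_seq NUMBER OFFSET (hypothetical helper)
# Shift a zero-padded sequence number and pad the result to five digits.
shift_seq() {
  num=$1
  # Drop leading zeros so the value is never read as octal.
  unpadded=${num#"${num%%[!0]*}"}
  # An all-zero input leaves an empty string, so default it to 0.
  printf '%05d\n' $(( ${unpadded:-0} + $2 ))
}

shift_seq 07469 -1000   # prints 06469
shift_seq 00042 1000    # prints 01042
```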
This is much faster:
#!/bin/zsh
shift=1000
seqnumstart="$(echo "$1" | grep -Eo "\d+")"
lastfile="$(find "$(dirname $1)" -name "*.jpx" | sort | tail -1)"
seqnumend="$(echo "$lastfile" | grep -Eo "\d+")"
bn="$(basename $1)"
bbn="$(echo "${bn%_*}")"
#extension
ext="$(echo "${bn##*.}")"
#basepath before the padded number
bp="$(echo "${1%_*}")"
function buildpath {
echo "$bp"_"$1"."$ext"
}
for i in {$seqnumstart..$seqnumend}
do
unpad="$(echo $i | sed 's/^0*//')"
seqnumnew="$(($unpad + $shift))"
seqnumnewpad="$(printf %05d $seqnumnew)"
op="$(buildpath "$i")"
np="$(buildpath "$seqnumnewpad")"
mv "$op" "$np"
done

Fish: batch rename files with spaces in names using mv and sed

I'm trying to rename multiple files from the command line as follows:
for f in *.pdf
mv $f echo $f | sed 's/\( \)\([0-9]\)/-\2/'
end
I got the error:
mv: target 'filename' is not a directory
It is obvious that target's name has spaces and must be enclosed in quotes to be handled by mv command.
What should I do to get this script to work?
Your script will try to move the file and the file called "echo" to the file, and pipe mv's output to sed.
What you want is to run that echo | sed in a command substitution, which fish denotes with ():
for f in *.pdf
mv $f (echo $f | sed 's/\( \)\([0-9]\)/-\2/')
end
You wrote: "It is obvious that target's name has spaces and must be enclosed in quotes to be handled by mv command."
It is not obvious because it doesn't need to be quoted. Fish does not perform word splitting like bash does. The filename is set once, and then $f will always yield the filename as one argument.
Quoting it like "$f" would be entirely superfluous.
For command substitutions, it splits them on newlines only, not spaces, so unless you have a filename with a newline (highly unlikely) you won't have a problem there either.
If you did you'd have to use string collect like
for f in *.pdf
mv $f (echo $f | sed 's/\( \)\([0-9]\)/-\2/' | string collect)
end
to ensure the command substitution results in one argument.

unix find and replace text in dir and subdirs

I'm trying to change the name of "my-silly-home-page-name.html" to "index.html" in all documents within a given master directory and subdirs.
I saw this: Shell script - search and replace text in multiple files using a list of strings.
And this: How to change all occurrences of a word in all files in a directory
I have tried this:
grep -r "my-silly-home-page-name.html" .
This finds the lines on which the text exists, but now I would like to substitute 'my-silly-home-page-name' for 'index'.
How would I do this with sed or perl?
Or do I even need sed/perl?
Something like:
grep -r "my-silly-home-page-name.html" . | sed 's/$1/'index'/g'
?
Also; I am trying this with perl, and I try the following:
perl -i -p -e 's/my-silly-home-page-name\.html/index\.html/g' *
This works, but I get an error when perl encounters directories, saying "Can't do inplace edit: SOMEDIR-NAME is not a regular file, <> line N"
Thanks,
jml
find . -type f -exec \
perl -i -pe's/my-silly-home-page-name(?=\.html)/index/g' {} +
Or if your find doesn't support -exec +,
find . -type f -print0 | xargs -0 \
perl -i -pe's/my-silly-home-page-name(?=\.html)/index/g'
Both pass to Perl as arguments as many names at a time as possible. Both work with any file name, including those that contain newlines.
If you are on Windows and you are using a Windows build of Perl (as opposed to a cygwin build), -i won't work unless you also do a backup of the original. Change -i to -i.bak. You can then go and delete the backups using
find . -type f -name '*.bak' -delete
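Before running an in-place edit over a whole tree, it may help to preview which files would be touched. The sketch below uses the same find -type f / -print0 / xargs -0 plumbing, with grep -l in place of the Perl edit. The scratch directory and its files are invented for the example.

```shell
# Build a small made-up tree, including a directory name with a space.
tmp=$(mktemp -d)
mkdir -p "$tmp/sub dir"
printf '<a href="my-silly-home-page-name.html">home</a>\n' > "$tmp/a.html"
printf '<a href="my-silly-home-page-name.html">home</a>\n' > "$tmp/sub dir/b page.html"
printf 'nothing to change here\n' > "$tmp/c.html"

# -type f skips directories; -print0 / -0 keep names with spaces intact.
matches=$(find "$tmp" -type f -print0 |
  xargs -0 grep -l 'my-silly-home-page-name\.html' | wc -l | tr -d ' ')
echo "files that would be edited: $matches"
# prints: files that would be edited: 2
```

Because directories never reach the command, the "not a regular file" error from the question does not come up.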
This should do the job:
find . -type f -print0 | xargs -0 sed -e 's/my-silly-home-page-name\.html/index\.html/g' -i
Basically it gathers recursively all the files from the given directory (. in the example) with find and runs sed with the same substitution command as in the perl command in the question through xargs.
Regarding the question about sed vs. perl, I'd say that you should use the one you're more comfortable with since I don't expect huge differences (the substitution command is the same one after all).
There are probably better ways to do this but you can use:
find . -name oldname.html | perl -e 'map { s/[\r\n]//g; $old = $_; s/oldname\.html$/newname.html/; rename $old,$_ } <>'
Fyi, grep searches for a pattern; find searches for files.

I want to use sed to replace every occurrence of /dir with $dir (replace / with $) in every script in a directory

use sed to replace every occurrence of /dir with $dir (replace / with $) in every script in a directory.
sed "s#/dir#$dir#g"
The $ keeps being interpreted as a function or variable call.
Is there a way around this?
thanks
Read your shell's friendly manual:
man sh
In the shell, "double quotes" around text allow variable interpretation inside, while 'single quotes' do not, a convention adopted by later languages such as Perl and PHP (but not e.g. JavaScript).
sed 's#/dir#$dir#g' *
To perform the replacement within the scripts do something like
find * -maxdepth 0 -type f | while IFS= read -r f; do mv "$f" "$f.old" && sed 's#/dir#$dir#g' "$f.old" > "$f"; done
or just
perl -pi.old -e 's#/dir#\$dir#g' * # Perl also interpolates variables in s commands
You can simply escape it with a backslash:
sed "s#/dir#\$dir#g"
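The three quoting variants can be compared side by side. In this sketch the value assigned to dir and the input line are made up. The point is only to show which forms let the shell expand $dir before sed sees it.

```shell
dir=/opt/example   # a shell variable that should NOT be expanded here
input='cd /dir && ls /dir'

# Single quotes: the shell leaves $dir alone, sed inserts it literally.
single=$(printf '%s\n' "$input" | sed 's#/dir#$dir#g')
# Double quotes with a backslash: same result as single quotes.
escaped=$(printf '%s\n' "$input" | sed "s#/dir#\$dir#g")
# Double quotes, no backslash: the shell substitutes the variable's value.
double=$(printf '%s\n' "$input" | sed "s#/dir#$dir#g")

printf '%s\n' "single:  $single" "escaped: $escaped" "double:  $double"
# prints:
# single:  cd $dir && ls $dir
# escaped: cd $dir && ls $dir
# double:  cd /opt/example && ls /opt/example
```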
shell approach
for file in file*
do
if [ -f "$file" ]; then
# read -r keeps backslashes; IFS= keeps leading whitespace
while IFS= read -r line
do
case "$line" in
*/dir* ) line=${line//\/dir/\$dir} ;;
esac
printf '%s\n' "$line"
done < "$file" > temp
mv temp "$file"
fi
done

DOS to UNIX path substitution within a file

I have a file that contains this kind of paths:
C:\bad\foo.c
C:\good\foo.c
C:\good\bar\foo.c
C:\good\bar\[variable subdir count]\foo.c
And I would like to get the following file:
C:\bad\foo.c
C:/good/foo.c
C:/good/bar/foo.c
C:/good/bar/[variable subdir count]/foo.c
Note that the non matching path should not be modified.
I know how to do this with sed for a fixed number of subdir, but a variable number is giving me trouble. Actually, I would have to use many s/x/y/ expressions (as many as the max depth... not very elegant).
Maybe with awk, but this kind of magic is beyond my skills.
FYI, I need this trick to correct some gcov binary files on a cygwin platform.
I am dealing with binary files; therefore, I might have the following kind of data:
bindata\bindata%bindataC:\good\foo.c
which should be translated as:
bindata\bindata%bindataC:/good/foo.c
The first \ must not be translated, despite that it is on the same line.
However, I have just checked my .gcno files while editing this text and it looks like all the paths are flanked with zeros, so most of the answers below should fit.
sed -e '/^C:\\good/ s/\\/\//g' input_file.txt
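A quick way to check this command is to feed it the sample paths from the question. This is only a sketch on text input. Because the address is anchored with ^, a path that starts in the middle of a line, as in the bindata example, would not be rewritten.

```shell
# Feed the question's sample paths through the sed command above.
result=$(printf '%s\n' \
  'C:\bad\foo.c' \
  'C:\good\foo.c' \
  'C:\good\bar\foo.c' |
  sed -e '/^C:\\good/ s/\\/\//g')
printf '%s\n' "$result"
# prints:
# C:\bad\foo.c
# C:/good/foo.c
# C:/good/bar/foo.c
```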
I would recommend you look into the cygpath utility, which converts path names from one format to another. For instance on my machine:
$ cygpath `pwd`
/home/jericson
$ cygpath -w `pwd`
D:\root\home\jericson
$ cygpath -m `pwd`
D:/root/home/jericson
Here's a Perl implementation of what you asked for:
$ echo 'C:\bad\foo.c
C:\good\foo.c
C:\good\bar\foo.c
C:\good\bar\[variable subdir count]\foo.c' | perl -pe 's|\\|/|g if /good/'
C:\bad\foo.c
C:/good/foo.c
C:/good/bar/foo.c
C:/good/bar/[variable subdir count]/foo.c
It works directly with the string, so it will work anywhere. You could combine it with cygpath, but it only works on machines that have that path:
perl -pe '$_ = `cygpath -m $_` if /good/'
(Since I don't have C:\good on my machine, I get output like C:goodfoo.c. If you use a real path on your machine, it ought to work correctly.)
You want to substitute '/' for all '\' but only on the lines that match the good directory path. Both sed and awk will let you do this by having a LHS (matching) expression that only picks the lines with the right path.
A trivial sed script to do this would look like:
/[Cc]:\\good/ s/\\/\//g
For a file:
c:\bad\foo
c:\bad\foo\bar
c:\good\foo
c:\good\foo\bar
You will get the output below:
c:\bad\foo
c:\bad\foo\bar
c:/good/foo
c:/good/foo/bar
Here's how I would do it in awk:
# fixpaths.awk
/C:\\good/ {
gsub(/\\/, "/");   # rewrite backslashes on matching lines only
}
{ print > "outfile" }   # write every line, changed or not
Then run it using the command:
awk -f fixpaths.awk paths.txt; mv outfile paths.txt
Or with some help from good ol' Bash:
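#!/bin/bash
# read -r keeps the backslashes; only lines containing C:\good are rewritten
while IFS= read -r LINE
do
if [[ "$LINE" != *'C:\good'* ]]
then
printf '%s\n' "$LINE" >> newfile
else
printf '%s\n' "$LINE" | sed -e 's/\\/\//g' >> newfile
fi
done < file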
try this
sed -re '/\\good\\/ s/\\/\//g' temp.txt
or this
awk -F"\\" '{if($2=="good"){OFS="\/"; $1=$1;} print $0}' temp.txt