Why do I have to specify the -i switch with a backup extension when using ActivePerl? - perl

I cannot get in-place editing Perl one-liners running under ActivePerl to work unless I specify them with a backup extension:
C:\> perl -i -ape "splice (@F, 2, 0, q(inserted text)); $_ = qq(@F\n);" file1.txt
Can't do inplace edit without backup.
The same command with -i.bak or -i.orig works a treat but creates an unwanted backup file in the process.
Is there a way around this?

This is a Windows/MS-DOS limitation. According to perldiag:
You're on a system such as MS-DOS that gets confused if you try reading from a deleted (but still opened) file. You have to say -i.bak, or some such.
Perl's -i implementation causes it to delete file1.txt while keeping an open handle to it, then re-create the file with the same name. This allows you to 'read' file1.txt even though it has been deleted and is being re-created. Unfortunately, Windows/MS-DOS does not allow you to delete a file that has an open handle attached to it, so this mechanism does not work.
Your best shot is to use -i.bak and then delete the backup file. This at least gives you some protection - for example, you could opt not to delete the backup if perl exits with a non-zero exit code. Something like:
perl -i.bak -ape "splice...." file1.txt && del file1.txt.bak

Here is a sample that does the recursive edit and then deletes the backups, both driven by find. It works in, for example, MinGW Git Bash on Windows.
$ find . -name "*.xml" -print0 | xargs -0 perl -p -i.bak -e 's#\s*<property name="blah" value="false" />\s*##g'
$ find . -name "*.bak" -print0 | xargs -0 rm
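The file names are passed between find and xargs NUL-terminated (-print0 / -0) so that names containing spaces are handled. The substitution uses # as its delimiter instead of the usual / so that the slash in the XML search term does not need escaping. This assumes you had no .bak files lying around to begin with, since the second command deletes all of them.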

Related

Command Line Mass Rename Jpg Files

I have a folder full of jpg files which all end with "-x-large.jpg" I would like to rename them all using command line so that it gets rid of the -x-large and just becomes .jpg.
So for example 123-x-large.jpg will become 123.jpg
Can someone tell me how I can do this with the ren command?
Thanks.
for img in *-x-large.jpg; do mv -i -v "$img" "${img%-x-large.jpg}.jpg"; done
This loops on all matching images and moves them into a new file with a truncated name (removing -x-large.jpg from the end) with the .jpg added back to the end of the file name. I'm invoking this interactively with mv -i so you are prompted before overwriting each file. To force overwriting (always say "yes"), change that to mv (remove the -i). To prevent overwriting (always say "no"), change that to mv -n.
Remove the -v (verbose) if you don't want to see each rename happen.
If you have a very large number of these files, the command line will be too long for the above command (since *-x-large.jpg will be expanded onto a command line). You can work around that with find and xargs as follows:
sh <(find . -maxdepth 1 -name '*-x-large.jpg' \
|sed -r 's/(.*)(-x-large.jpg)$/mv -i "\1\2" "\1.jpg"/')
This creates a shell script using bash process substitution, using find to generate a list of all files we want to rename and then piping them through sed to create the mv commands.
(See above for the mv flags. I removed -v because presumably this will be a very long list.)
See the version below if you want to check the script before running it.
The above one-liner requires GNU bash or Korn shell (ksh) as well as GNU sed.
Here's how to do it with neither (in three commands):
find . -maxdepth 1 -name '*-x-large.jpg' \
|sed 's/.*/mv "&" "&/; s/-x-large.jpg$/.jpg"/' > temp.sh
sh temp.sh
rm temp.sh
Posix sed doesn't reliably support capture groups (\(…\) or sed -r to invoke ERE) and therefore we can't expect it to be able to match and recall text, so this version simply writes most of the command and then fixes the ending (the absence of a trailing double quotes in the first replacement is intentional; we add it in the second replacement). Posix shell (/bin/sh proper) doesn't support process substitution, so we dump to a temporary file, evaluate it, and then remove it.
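Another way to avoid both the generated script and a long command line is to let find hand the names to a small inline shell loop that uses the same ${var%suffix} expansion as the first answer. This is only a sketch: it assumes a find with -maxdepth (which the commands above already rely on) and -exec ... {} +, and the scratch directory and sample file names are invented for the demonstration.

```shell
# Scratch directory with sample files, only for demonstration.
demo=$(mktemp -d)
cd "$demo" || exit 1
touch 123-x-large.jpg 'with space-x-large.jpg' keep.jpg

# find passes the matching names in batches; the inline sh loop renames each
# one by stripping the suffix and adding .jpg back.
find . -maxdepth 1 -name '*-x-large.jpg' -exec sh -c '
    for img in "$@"; do
        mv -i "$img" "${img%-x-large.jpg}.jpg"
    done
' sh {} +

ls
```

Because the names are passed as arguments rather than pasted into generated shell text, file names containing quotes or spaces need no special treatment here.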
If we're referring to Windows command-line, then SET /? is your friend. Loads of good info in there.
setlocal ENABLEDELAYEDEXPANSION
set SEARCH_SUFFIX=-x-large.jpg
set REPLACE_SUFFIX=.jpg
for %%A in ("*%SEARCH_SUFFIX%") do (
set OLD_NAME=%%~nxA
set NEW_NAME=!OLD_NAME:%SEARCH_SUFFIX%=%REPLACE_SUFFIX%!
ren "!OLD_NAME!" "!NEW_NAME!"
)
endlocal

Using sed and mv to add characters to files

First off, I'd like to say that I know this is almost an exact duplicate of some posts that I've read, but have not had any luck with referencing.
I have 100+ files that all follow a very strict naming convention of 5_##_<name>.ext My issue was that when originally making these files I failed to realise that 5_100_ and above would mess up my ordering.
I am now trying to append a 0 in front of every number between 01 and 99. I've written a bash script using sed that works for the file contents (the file name is in the file as well):
#!/bin/bash
for fl in *.tcl; do
echo Filename: $fl
#sed -i 's/5_\(..\)_/5_0\1_/g' $fl
done
However, this only changes the contents and not the filename itself. I've read that mv is the solution (rename is simpler but I do not have it on my system). My current incarnation of my multiple attempts is:
mv "$fl" $(echo "$file" | sed -e 's/5_\(..\)_/5_0\1_/g') but it gives me an error: mv: missing destination file operand after <filename>
Again, I'm sorry about the duplicate but I wasn't able to solve my issue by reading it. I'm sure I'm just using the combination of mv and sed incorrectly.
Solution was entered in the comments. I was using $file instead of $fl.
Something like this might be useful:
for n in $(seq 99)
do
prefix2="5_$(printf "%02d" ${n})_"
prefix3="5_$(printf "%03d" ${n})_"
for f in ${prefix2}*.tcl
do
suffix="${f#${prefix2}}"
[[ -r "${prefix3}${suffix}" ]] || mv "${prefix2}${suffix}" "${prefix3}${suffix}"
done
done
Rather than processing every single file, it only looks at the ones that currently have a "5_XX_" prefix, and only renames them if the corresponding "5_XXX_" file doesn't already exist...
#!/bin/bash
for fl in *.tcl
do
NewName="$(echo "${fl}" | sed '/^5_[0-9]\{2\}_/ s/../&0/')"
#echo "Filename: ${fl} -> ${NewName}"
[ ! "${fl}" = "${NewName}" ] && mv "${fl}" "${NewName}"
done
This adds a little safety: it can be run several times on the same folder, because it renames only the files that still need it.
On Linux, where the default sed is not a strict POSIX sed, use sed --posix instead of a plain sed call.
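The rename can also be done without sed at all, using a glob to pick out the two-digit names and shell parameter expansion to build the new name. This is a minimal sketch under the naming convention from the question (5_##_<name>.tcl); the scratch directory and sample names are made up, and it only renames files, it does not touch their contents.

```shell
# Scratch directory with sample files, only for demonstration.
demo=$(mktemp -d)
cd "$demo" || exit 1
touch 5_07_alpha.tcl 5_42_beta.tcl 5_100_gamma.tcl

# The glob matches exactly two digits followed by "_", so 5_100_... and
# already-renamed 5_0NN_... files are left alone.
for fl in 5_[0-9][0-9]_*.tcl; do
    [ -e "$fl" ] || continue            # the glob matched nothing
    new="5_0${fl#5_}"                   # 5_07_alpha.tcl -> 5_007_alpha.tcl
    if [ -e "$new" ]; then
        echo "skipping $fl: $new already exists" >&2
    else
        mv -- "$fl" "$new"
    fi
done

ls
```

Running the loop a second time changes nothing, because no file matches the two-digit pattern any more.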

unix find and replace text in dir and subdirs

I'm trying to change the name of "my-silly-home-page-name.html" to "index.html" in all documents within a given master directory and subdirs.
I saw this: Shell script - search and replace text in multiple files using a list of strings.
And this: How to change all occurrences of a word in all files in a directory
I have tried this:
grep -r "my-silly-home-page-name.html" .
This finds the lines on which the text exists, but now I would like to substitute 'my-silly-home-page-name' for 'index'.
How would I do this with sed or perl?
Or do I even need sed/perl?
Something like:
grep -r "my-silly-home-page-name.html" . | sed 's/$1/'index'/g'
?
Also; I am trying this with perl, and I try the following:
perl -i -p -e 's/my-silly-home-page-name\.html/index\.html/g' *
This works, but I get an error when perl encounters directories, saying "Can't do inplace edit: SOMEDIR-NAME is not a regular file, <> line N"
Thanks,
jml
find . -type f -exec \
perl -i -pe's/my-silly-home-page-name(?=\.html)/index/g' {} +
Or if your find doesn't support -exec +,
find . -type f -print0 | xargs -0 \
perl -i -pe's/my-silly-home-page-name(?=\.html)/index/g'
Both pass as many names at a time as possible to Perl as arguments. Both work with any file name, including names that contain newlines.
If you are on Windows and you are using a Windows build of Perl (as opposed to a cygwin build), -i won't work unless you also do a backup of the original. Change -i to -i.bak. You can then go and delete the backups using
find . -type f -name '*.bak' -delete
This should do the job:
find . -type f -print0 | xargs -0 sed -e 's/my-silly-home-page-name\.html/index\.html/g' -i
Basically it gathers recursively all the files from the given directory (. in the example) with find and runs sed with the same substitution command as in the perl command in the question through xargs.
Regarding the question about sed vs. perl, I'd say that you should use the one you're more comfortable with since I don't expect huge differences (the substitution command is the same one after all).
There are probably better ways to do this but you can use:
find . -name oldname.html |perl -e 'map { s/[\r\n]//g; $old = $_; s/oldname\.html$/newname.html/; rename $old,$_ } <>';
Fyi, grep searches for a pattern; find searches for files.

Sed on AIX does not recognize -i flag

Does sed -i work on AIX?
If not, how can I edit a file "in place" on AIX?
The -i option is a GNU (non-standard) extension to the sed command. It was not part of the classic interface to sed.
You can't edit in situ directly on AIX. You have to do the equivalent of:
sed 's/this/that/' infile > tmp.$$
mv tmp.$$ infile
You can only process one file at a time like this, whereas the -i option permits you to achieve the result for each of many files in its argument list. The -i option simply packages this sequence of events. It is undoubtedly useful, but it is not standard.
If you script this, you need to consider what happens if the command is interrupted; in particular, you do not want to leave temporary files around. This leads to something like:
tmp=tmp.$$ # Or an alternative mechanism for generating a temporary file name
for file in "$@"
do
trap "rm -f $tmp; exit 1" 0 1 2 3 13 15
sed 's/this/that/' "$file" > $tmp
trap "" 0 1 2 3 13 15
mv $tmp "$file"
done
This removes the temporary file if a signal (HUP, INT, QUIT, PIPE or TERM) occurs while sed is running. Once the sed is complete, it ignores the signals while the mv occurs.
You can still enhance this by doing things such as creating the temporary file in the same directory as the source file, instead of potentially making the file in a wholly different file system.
The other enhancement is to allow the command (sed 's/this/that/' in the example) to be specified on the command line. That gets trickier!
You could look up the overwrite (shell) command that Kernighan and Pike describe in their classic book 'The UNIX Programming Environment'.
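As a rough illustration of the two enhancements mentioned above (temporary file next to the source file, sed script given as an argument), here is a minimal sketch. It is not the overwrite script from the book; the function name sed_in_place and the temporary file name are hypothetical, and the demonstration assumes mktemp -d is available.

```shell
#!/bin/sh
# Hypothetical helper, not a standard command.
# Usage: sed_in_place 'SED-SCRIPT' FILE...
sed_in_place() {
    script=$1
    shift
    for file in "$@"; do
        # Temporary file in the same directory as the original, so the final
        # mv stays within one file system.
        tmp="$(dirname "$file")/.sed_in_place.$$"
        trap 'rm -f "$tmp"; exit 1' 1 2 3 13 15
        if sed "$script" "$file" > "$tmp"; then
            trap '' 1 2 3 13 15        # ignore signals during the rename
            # Note: mv replaces the file, so the result has the permissions
            # of the temporary file, not necessarily those of the original.
            mv "$tmp" "$file"
        else
            rm -f "$tmp"
            trap - 1 2 3 13 15
            return 1
        fi
        trap - 1 2 3 13 15             # restore default signal handling
    done
}

# Demonstration in a scratch directory.
demo=$(mktemp -d)
printf 'this and this\n' > "$demo/a.txt"
printf 'only this\n' > "$demo/b.txt"
sed_in_place 's/this/that/g' "$demo/a.txt" "$demo/b.txt"
cat "$demo/a.txt" "$demo/b.txt"
```

If sed fails on a file, that file is left untouched and the function stops with a non-zero status.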
#!/bin/ksh
host_name=$1
perl -pi -e "s/#workerid#/$host_name/g" test.conf
The above replaces every #workerid# inside test.conf with the value of $host_name.
You can also simply install the GNU versions of the Unix commands on AIX:
http://www-03.ibm.com/systems/power/software/aix/linux/toolbox/alpha.html
You can use a here construction with vi:
vi file >/dev/null 2>&1 <<@
:1,$ s/old/new/g
:wq
@
When you want to do things in the vi-edit mode, you will need an ESC.
For an ESC press CTRL-V ESC.
When you use this in a non-interactive mode, vi can complain about the TERM not set. The solution is adding export TERM=vt100 before calling vi.
Another option is to use good old ed, like this:
ed fileToModify <<EOF
,s/^ff/gg/
w
q
EOF
You can use perl to do it:
perl -p -i.bak -e 's/old/new/g' test.txt
This also creates a backup file, test.txt.bak.

How to use multiple files at once using bash

I have a perl script which is used to process some data files from a given directory. I have written below bash script to look for the last updated file in the given directory and process that file.
cd $data_dir
find \( -type f -mtime -1 \) -exec ./script.pl {} \;
Sometimes users copy multiple files to the data dir at once, and then the earlier ones are skipped: the perl script processes only the most recently updated file. Can you please suggest how to fix this in the bash script?
Try
cd $data_dir
find \( -type f -mtime -1 \) -exec ./script.pl {} +
Note the termination of -exec with a + vs your \;
From the man page
-exec command {} +
This variant of the -exec action runs the specified command on the selected files, but the command line is built by appending each selected file name at the end;
Now that you'll have one or more file names passed into your perl script, you can alter your perl script to iterate over each passed in file name.
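To see the difference between the two terminators without involving the real script, a small stand-in command that just reports how many file names it received can be used. The scratch directory and the .dat files below are invented for the demonstration.

```shell
# Scratch directory with three sample files, only for demonstration.
demo=$(mktemp -d)
touch "$demo/a.dat" "$demo/b.dat" "$demo/c.dat"

# \; runs the command once per file.
echo 'with \;'
find "$demo" -type f -exec sh -c 'echo "called with $# file(s)"' sh {} \;

# + appends as many file names as fit onto one command line.
echo 'with +'
find "$demo" -type f -exec sh -c 'echo "called with $# file(s)"' sh {} +
```

The first find prints three lines saying "called with 1 file(s)"; the second prints a single line saying "called with 3 file(s)".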
If I understood the question correctly, you need to process any files that were created or modified in a directory since the last time your script was run.
In my opinion find is not the right tool to determine those files, because it has no notion of which files it has already seen.
Using any of the -atime/-ctime/-mtime options will either produce duplicates if you run your script twice in the specified period, or miss some files if it is not executed at the right time. The timing intricacies of using these options for something like this are not easy to deal with.
I can propose a few alternatives:
a) Use three directories instead of one: incoming/ processing/ done/. Your users should only be allowed to put files in incoming/. You move any files in there to processing/ with a simple mv incoming/* processing/ before running your perl script. Then you move them from processing/ to done/ when it's done.
In my opinion this is the simplest and best solution, and the one used by mail servers etc when dealing with this issue. If I were you and there were not any special circumstances preventing you from doing this, I'd stop reading here.
b) Have your finder script touch a special file (e.g. .timestamp, perhaps in a different directory, so that your users will not tamper with it) when it's done. This will allow your script to remember the last time it was run. Then use
find \( -cnewer .timestamp -o -newer .timestamp \) -type f -exec ./script.pl '{}' ';'
to run your perl script for each file. You should modify your perl script so that it can run repeatedly with a different file name each time. If you can modify it to accept multiple files in one go, you can also run it with
find \( -cnewer .timestamp -o -newer .timestamp \) -type f -exec ./script.pl '{}' +
which will minimise the number of ./script.pl processes. Take care to handle the first run of the find script, when the .timestamp file is missing. A good solution would be to simply ignore it by not using the -*newer options at all in that case. Also keep in mind that there is a race condition where files added after find was started but before touching the timestamp file will not be processed.
c) As a variation of (b), have your script update the timestamp with the time of the processed file that was created/modified most recently. This is tricky, because find cannot order its output on its own. You could use a wrapper around your perl script to handle this:
#!/bin/bash
for i in "$@"; do
find "$i" \( -cnewer .timestamp -o -newer .timestamp \) -exec touch -r '{}' .timestamp ';'
done
./script.pl "$@"
This will update the timestamp if it is called to process a file with a newer mtime or ctime, minimising (but not eliminating) the race condition. It is however somewhat awkward - unavoidable since bash's [[ -nt option seems to only check the mtime. It might be better if your perl script handled that on its own.
d) Have your script store each processed filename and its timestamps somewhere and then skip duplicates. That would allow you to just pass all files in the directory to it and let it sort out the mess. Kinda tricky though...
e) Since you are using Linux, you might want to have a look at inotify and the inotify-tools package - specifically the inotifywait tool. With a bit of scripting it would allow you to process files as they are added to the directory:
inotifywait -e MOVED_TO -e CLOSE_WRITE -m -r testd/ | grep --line-buffered -e MOVED_TO -e CLOSE_WRITE | while read d e f; do ./script.pl "$f"; done
This has no race conditions, as long as your users do not create/copy/move any directories rather than just files.
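Option (b), including the first run when no timestamp file exists yet, might be sketched as follows. The function name process_new and the ".new" suffix are hypothetical, and -cnewer is assumed to be supported by your find, as in the commands above. One deliberate difference from the description in (b): the new timestamp is taken before find starts, so a file that arrives while find is running may be processed twice rather than missed.

```shell
#!/bin/sh
# Hypothetical helper, not part of any package.
# Usage: process_new DATA_DIR TIMESTAMP_FILE COMMAND [ARGS...]
# Runs COMMAND with every regular file in DATA_DIR that was changed since the
# previous run, as recorded by the modification time of TIMESTAMP_FILE.
process_new() {
    dir=$1
    stamp=$2
    shift 2

    # Record "now" first; it becomes the reference point for the next run.
    touch "$stamp.new"

    if [ -e "$stamp" ]; then
        find "$dir" -type f \( -cnewer "$stamp" -o -newer "$stamp" \) \
            -exec "$@" {} +
    else
        # First run: no timestamp yet, so every file counts as new.
        find "$dir" -type f -exec "$@" {} +
    fi

    mv "$stamp.new" "$stamp"
}

# Example call (paths are placeholders):
# process_new "$data_dir" "$HOME/.timestamp" ./script.pl
```

Keeping the timestamp file outside the data directory, as suggested in (b), also prevents find from reporting the timestamp file itself.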
The perl script will only execute against the file which find gives it. Perhaps you should remove the -mtime -1 option from the find command so that it picks up all the files in the directory?