Run a Perl script with options recursively through a directory - perl

Given is a directory with a large number of files.
Also given is a Perl script that I want to run on each file of the directory. But this Perl script has options.
FILES=absolutepathtomyfiles/*
PROGRAMME=absolutepathtoperlscript/script.pl;
for f in $FILES
do
if [[ $f == *.txt ]]; then
absolutepathtoperlscript/script.pl -infile=$f -replace #both necessary options
fi
done

If there are spaces or other unusual characters in the file names, the unquoted variables in your loop can cause problems. You could do this instead:
for file in /absolute/path/to/myfiles/*.txt
do
[[ -f "$file" ]] || continue
/absolute/path/to/perl/script/script.pl -infile="$file" -replace
done
Note the [[ -f "$file" ]] || continue. This says that if $file is not a regular file, skip it. It's equivalent to this:
if [[ ! -f "$file" ]]
then
continue
fi
If this doesn't work, try this:
export PS4="\$LINENO: "
for file in /absolute/path/to/myfiles/*.txt
do
[[ -f "$file" ]] || continue
set -xv # Turn on debugging
/absolute/path/to/perl/script/script.pl -infile="$file" -replace
set +xv # Turn off debugging
done
This will print out your exact command line you're passing to your Perl script and may help you figure out what your issue could be.
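The question title asks for a recursive run, while the loops above only visit a single directory. A minimal sketch of one way to descend into subdirectories with find follows. The scratch directory, the stub.sh stand-in for script.pl, and the run_on_txt_files name are all hypothetical, made up for this illustration.

```shell
#!/bin/sh
# Hypothetical scratch area so the sketch can run anywhere.
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
mkdir -p "$work/files/sub dir"
touch "$work/files/a.txt" "$work/files/sub dir/b c.txt" "$work/files/skip.log"

# Stand-in for script.pl (an assumption): it only records the arguments it receives.
log="$work/calls.log"
cat > "$work/stub.sh" <<EOF
#!/bin/sh
printf '%s\n' "\$*" >> "$log"
EOF
chmod +x "$work/stub.sh"

# Run a program on every *.txt file below a directory, at any depth.
# $1 = directory to search, $2 = program to run.
run_on_txt_files() {
    # find hands each path to a small sh wrapper as a single argument,
    # so names containing spaces survive intact.
    find "$1" -type f -name '*.txt' \
        -exec sh -c '"$0" -infile="$1" -replace' "$2" {} \;
}

run_on_txt_files "$work/files" "$work/stub.sh"
cat "$log"
```

The sh -c wrapper is used because only a lone {} argument is guaranteed to be substituted by every find implementation; with the real script, the second argument would be the path to script.pl.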

Related

How to remove some text in long filename from bunch of files in directory

Can't boot my Windows PC today and I am on 2nd OS Linux Mint. With my limited knowledge on Linux and shell scripts, I really don't have an idea how to do this.
I have a bunch of files in a directory generated from my system, and I need to remove the 12 characters immediately to the left of ".txt".
Sample filenames:
filename1--2c4wRK77Wk.txt
filename2-2ZUX3j6WLiQ.txt
filename3-8MJT42wEGqQ.txt
filename4-sQ5Q1-l3ozU.txt
filename5--Way7CDEyAI.txt
Desired result:
filename1.txt
filename2.txt
filename3.txt
filename4.txt
filename5.txt
Any help would be greatly appreciated.
Here is a programmatic way of doing this while still trying to account for pesky edge cases:
#!/bin/sh
set -e
find . -name "filename*" > /tmp/filenames.list
while read -r FILENAME; do
NEW_FILENAME="$(
echo "$FILENAME" | \
awk -F '.' '{$NF=""; gsub(/ /, "", $0); print}' | \
awk -F '/' '{print $NF}' | \
awk -F '-' '{print $1}'
)"
EXTENSION="$(echo "$FILENAME" | awk -F '.' '{print $NF}')"
if [[ "$EXTENSION" == "backup" ]]; then
continue
else
cp "$FILENAME" "${FILENAME}.backup"
fi
if [[ -z "$EXTENSION" ]]; then
mv "$FILENAME" "$NEW_FILENAME"
else
mv "$FILENAME" "${NEW_FILENAME}.${EXTENSION}"
fi
done < /tmp/filenames.list
Create a List of Files to Edit
First, create a list of the files that you would like to edit (assuming that they all start with filename and live under the current working directory, .):
find . -name "filename*" > /tmp/filenames.list
If they don't start with filename, fret not: you could always use a find command like:
find . -type f > /tmp/filenames.list
Iterate over a list of files
To accomplish this we use a while read loop:
while read -r LINE; do
# perform action
done < file
If you are able to use bash, you could use process substitution instead of a temporary file:
while read -r LINE; do
# perform action
done < <(
find . -type f
)
Create a rename variable
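Next, we create a variable NEW_FILENAME. Using awk, we strip off the file extension: emptying $NF makes awk rebuild the line with spaces where the periods were, and gsub then deletes every space (including any that were already in the name):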
awk -F '.' '{$NF=""; gsub(/ /, "", $0); print}'
We could just use the following though if you know for certain that there aren't multiple periods in the filename:
awk -F '.' '{print $1}'
The leading ./ is stripped off via
awk -F '/' '{print $NF}'
although this could have been easily done via basename
With the following command, we strip everything after the first -:
awk -F '-' '{print $1}'
Creating backups
Feel free to remove this if you deem unnecessary:
if [[ "$EXTENSION" == "backup" ]]; then
continue
else
cp "$FILENAME" "${FILENAME}.backup"
fi
One thing that we definitely don't want is to make backups of backups. The above logic accounts for this.
Renaming the files
One thing that we don't want to do is append a period to a filename that doesn't have an extension. This accounts for that.
if [[ -z "$EXTENSION" ]]; then
mv "$FILENAME" "$NEW_FILENAME"
else
mv "$FILENAME" "${NEW_FILENAME}.${EXTENSION}"
fi
Other things of note
Odds are that your Linux Mint installation has a bash shell, so you could simplify some of these commands. For instance, you could use parameter expansion: "$(echo "$FILENAME" | awk -F '.' '{print $NF}')" would become "${FILENAME##*.}"
[[ is not defined in POSIX sh, so under a strict sh you will likely need to replace [[ with [ (and == with =), but review this document first:
https://mywiki.wooledge.org/BashFAQ/031
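As a sketch of the parameter-expansion route mentioned above, the new name can be computed without awk at all. The strip_id helper name is made up for this example, and it assumes every name has an extension and that the unwanted part is always the 12 characters directly before it, as in the sample names.

```shell
#!/bin/sh
# strip_id NAME: print NAME with the 12 characters before its extension removed.
# The helper name is hypothetical; the expansions themselves are standard POSIX sh.
# Assumes NAME contains at least one period.
strip_id() {
    ext=${1##*.}                 # text after the last period, e.g. "txt"
    base=${1%.*}                 # everything before the last period
    base=${base%????????????}    # drop the trailing 12 characters (12 question marks)
    printf '%s.%s\n' "$base" "$ext"
}

strip_id "filename1--2c4wRK77Wk.txt"
strip_id "filename4-sQ5Q1-l3ozU.txt"

# To rename for real, something like this could be run in the target directory:
# for f in *.txt; do mv -i -- "$f" "$(strip_id "$f")"; done
```

Because nothing here is bash-specific, the same function also works when the script keeps its #!/bin/sh line.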
From the pattern of the filenames, it looks like the part to keep is the first token before "-". After changing to the directory where the files are located, use the following command to rename them:
for srcFile in *.txt; do fileN=$(echo "$srcFile" | cut -d"-" -f1); targetFile="$fileN.txt"; mv -- "$srcFile" "$targetFile"; done
If the above observation is wrong, the following command removes exactly the 12 characters before .txt (4 characters, so 16 are cut from the end):
for srcFile in *.txt; do fileN=$(echo "$srcFile" | rev | cut -c17- | rev); targetFile="$fileN.txt"; mv -- "$srcFile" "$targetFile"; done
The *.txt glob can be narrowed if only some of the files in the current directory should be renamed.

sed command/shell script to read a specific line and update it if needed

I have a file whose contents are similar as below.
name: MyName
age: 25
subject: Math
This file needs to be updated to :
name: MyName
age: "25"
subject: Math
The catch is that the sed command / shell script can run multiple times, but the double quotes must be added only once.
I wrote a script for it and it works, but I would like a simpler solution.
#!/bin/bash
FILE="myfile"
while IFS='' read -r line || [[ -n "$line" ]]; do
if [[ $line =~ 'age:' ]]
then
if ! [[ $line =~ 'age: "' ]]
then
sed 's/\(age:[[:blank:]]*\)\(.*\)/\1"\2"/' -i $FILE
fi
fi
done < $FILE
You can just run sed from the command line with this slightly altered regex, and it will have the same effect as your script
sed -i 's/\(age:[[:blank:]]\+\)\([^"].*\)/\1"\2"/' file
It won't match if the first character after the blank space is a double quote, which is what your script checks for.
Tested it and it works for me.
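To check the "run it many times" requirement, here is a small sketch that applies the substitution twice to a scratch copy and compares the results. It writes through a temporary file instead of using -i and spells \+ as a repeated bracket expression, so it does not depend on GNU sed. The quote_age function name and the scratch file are made up for this example. The [^"[:blank:]] class is a small hardening that is not in the answer above: with two or more blanks after age:, a plain [^"] can match one of the blanks and quote the value a second time.

```shell
#!/bin/sh
# quote_age FILE: wrap the value after "age:" in double quotes unless it is
# already quoted. Hypothetical helper; edits FILE via a temporary copy.
quote_age() {
    tmp=$(mktemp)
    sed 's/\(age:[[:blank:]][[:blank:]]*\)\([^"[:blank:]].*\)/\1"\2"/' "$1" > "$tmp" &&
        mv "$tmp" "$1"
}

file=$(mktemp)
printf 'name: MyName\nage: 25\nsubject: Math\n' > "$file"

quote_age "$file"
first=$(cat "$file")
quote_age "$file"
second=$(cat "$file")

printf '%s\n' "$second"
if [ "$first" = "$second" ]; then
    echo "second run changed nothing"
fi
```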

tail and grep log and mail (linux)

I want to tail a log file, filter it with grep, and send the result by mail, like:
tail -f /var/log/foo.log | grep error | mail -s subject name@example.com
How can I do this?
You want to send an email when emailing errors occur? That might fail ;)
You can however try something like this:
tail -f "$log" |
grep --line-buffered error |
while IFS= read -r line
do
echo "$line" | mail -s subject "$email"
done
This sends one email for every line of grep output.
Save it as a shell script (monitor.sh here) and run it with
nohup ./monitor.sh &
so that it keeps running in the background.
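A minimal sketch of the same idea as a reusable function, so the pipeline can be tried without a working mail setup. It does the matching inside the loop with case, which sidesteps grep's output buffering altogether. The alert_on_errors and record names are invented for the example.

```shell
#!/bin/sh
# alert_on_errors PATTERN COMMAND [ARGS...]
# Reads lines on stdin and runs COMMAND once per line containing PATTERN,
# feeding that line to COMMAND on its stdin.
alert_on_errors() {
    pattern=$1
    shift
    while IFS= read -r line; do
        case $line in
            *"$pattern"*) printf '%s\n' "$line" | "$@" ;;
        esac
    done
}

# Dry run: collect the "mails" in a file instead of sending them.
out=$(mktemp)
record() { cat >> "$out"; }
printf 'ok\nerror: disk full\nfine\nerror: no route\n' | alert_on_errors error record
cat "$out"

# Real use might look like this (address and path are placeholders):
# tail -f /var/log/foo.log | alert_on_errors error mail -s subject name@example.com
```

PATTERN is matched as a plain substring here, not as a regular expression, which is enough for a fixed word like error.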
I'll have a go at this. Perhaps I'll learn something if my icky bash code gets scrutinised. There is a chance there are already a gazillion solutions to do this, but I am not going to find out, as I am sure you have trawled the depths and widths of the cyberocean.
It sounds like what you want can be separated into two bits: 1) at regular intervals, obtain the 'latest tail' of the file; 2) if the latest tail actually exists, send it by e-mail. For the regular intervals in 1), use cron. For obtaining the latest tail, you'll have to keep track of the file size. The bash script below does that and then handles 2), so it can be invoked by cron. It uses the cached file size to compute the chunk of the file it needs to mail.
Note that for a file myfile another file .offset.myfile is created. Also, the script does not allow path components in the file name. Rewrite it, or work around that in the invocation [e.g. (cd /foo/bar && segtail.sh zut), assuming it is called segtail.sh].
#!/usr/local/bin/bash
file=$1
size=0
offset=0
if [[ $file =~ / ]]; then
echo "$0 does not accept path components in the file name" >&2
exit 1
fi
if [[ -e .offset.$file ]]; then
offset=$(<".offset.$file")
fi
if [[ -e $file ]]; then
size=$(stat -c "%s" "$file") # this assumes GNU stat, possibly present as gstat. CHECK!
# (gstat can also be Ganglias Status tool - careful).
fi
if (( $size < $offset )); then # file might have been reduced in size
echo "reset offset to zero" >&2
offset=0
fi
echo $size > ".offset.$file"
if [[ -e $file && $size -gt $offset ]]; then
tail -c +$(($offset+1)) "$file" | head -c $(($size - $offset)) | mail -s "tail $file" foo@bar
fi
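The core of that script, the "what was appended since last time" step, can be sketched on its own. This variant takes the state file as an explicit argument and measures the size with wc -c rather than GNU stat. The emit_new_bytes name and both file arguments are hypothetical.

```shell
#!/bin/sh
# emit_new_bytes FILE STATE
# Print the bytes appended to FILE since the previous call, using STATE to
# remember the size seen last time.
emit_new_bytes() {
    file=$1
    state=$2
    offset=0
    if [ -f "$state" ]; then
        offset=$(cat "$state")
    fi
    size=$(($(wc -c < "$file")))      # arithmetic strips any padding wc adds
    if [ "$size" -lt "$offset" ]; then
        offset=0                      # file shrank (e.g. was rotated): start over
    fi
    printf '%s\n' "$size" > "$state"
    if [ "$size" -gt "$offset" ]; then
        tail -c +"$((offset + 1))" "$file" | head -c "$((size - offset))"
    fi
}

dir=$(mktemp -d)
printf 'first line\n' > "$dir/app.log"
emit_new_bytes "$dir/app.log" "$dir/state"     # prints: first line
printf 'second line\n' >> "$dir/app.log"
emit_new_bytes "$dir/app.log" "$dir/state"     # prints: second line
```

From cron, the output of such a function could be captured and piped to mail only when it is non-empty.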
How about:
grep ERROR catalina.out | mail -s "catalina.out errors" blah@myaddress.com

pipe into conditional on command line

I have a problem and could not figure out whether it is even possible. I am parsing a file with filenames in it, and want to check whether those filenames represent existing files on the system.
I figured out a way to check whether a file exists:
[ -f FILENAME ] && echo "File exists" || echo "File does not exist"
Now my problem is: how can I pipe into the conditional so that it tests all the filenames?
I was trying something like this, but it did not work:
cat myfilenames.txt | xargs command from above without FILENAME
Does anybody know if it is possible?
Thanks, dmeu!
while IFS= read -r file; do
[ -e "$file" ] && echo "$file exists"
done <filelist.txt
I believe what you want is a for loop. This worked for me in bash (I put it in a shell script, but you could probably do it on the command line):
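for i in `cat "$1"` ; do
[ -f "$i" ] && echo "File $i exists" || echo "File $i does not exist"
done
The backticks around the cat execute the command and substitute its output into the loop. Note that the output is split on whitespace, so this only works when the file names contain no spaces.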
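Putting the two answers together, a sketch of a loop that reports both outcomes and keeps names with spaces intact might look like this. The report_files name and the sample list are made up for illustration.

```shell
#!/bin/sh
# report_files: read one file name per line on stdin and say whether each exists.
report_files() {
    # The "|| [ -n "$name" ]" part also handles a last line without a newline.
    while IFS= read -r name || [ -n "$name" ]; do
        if [ -f "$name" ]; then
            printf 'File exists: %s\n' "$name"
        else
            printf 'File does not exist: %s\n' "$name"
        fi
    done
}

# Hypothetical sample data.
dir=$(mktemp -d)
touch "$dir/a.txt" "$dir/with space.txt"
printf '%s\n' "$dir/a.txt" "$dir/with space.txt" "$dir/missing.txt" > "$dir/list"

report_files < "$dir/list"
```

Since the function reads stdin, it can also sit at the end of a pipe, for example cat myfilenames.txt | report_files.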

I want to use sed to replace every occurrence of /dir with $dir (replace / with $) in every script in a directory

use sed to replace every occurrence of /dir with $dir (replace / with $) in every script in a directory.
sed "s#/dir#$dir#g"
The $ keeps being interpreted as a function or variable call.
Is there a way around this?
thanks
Read your shell's friendly manual:
man sh
In the shell, "double quotes" around text allow variable interpretation inside, while 'single quotes' do not, a convention adopted by later languages such as Perl and PHP (but not e.g. JavaScript).
sed 's#/dir#$dir#g' *
To perform the replacement within the scripts do something like
find * -maxdepth 0 -type f | while IFS= read -r f; do mv "$f" "$f.old" && sed 's#/dir#$dir#g' "$f.old" > "$f"; done
or just
perl -pi.old -e 's#/dir#\$dir#g' * # Perl also interpolates variables in s commands, hence the backslash
You can simply escape it with a backslash:
sed "s#/dir#\$dir#g"
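A quick sketch to confirm that the single-quoted and the backslash-escaped forms hand sed the same program, and that an unescaped $dir inside double quotes does not. The sample input and the value given to dir are invented for the demonstration.

```shell
#!/bin/sh
dir=/somewhere/else            # a shell variable that must NOT leak into the output
input='cd /dir && ls /dir/bin'

# Single quotes: the shell passes $dir to sed untouched.
single=$(printf '%s\n' "$input" | sed 's#/dir#$dir#g')
# Double quotes with a backslash: the shell removes the backslash, sed sees $dir.
escaped=$(printf '%s\n' "$input" | sed "s#/dir#\$dir#g")
# Double quotes without a backslash: the shell substitutes the variable's value.
expanded=$(printf '%s\n' "$input" | sed "s#/dir#$dir#g")

printf 'single quotes:    %s\n' "$single"
printf 'escaped dollar:   %s\n' "$escaped"
printf 'unescaped dollar: %s\n' "$expanded"
```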
shell approach
for file in file*
do
if [ -f "$file" ]; then
# rewrite each line, collecting the whole result in temp
while IFS= read -r line
do
case "$line" in
*/dir* ) line=${line//\/dir/\$dir} ;;
esac
printf '%s\n' "$line"
done < "$file" > temp
mv temp "$file"
fi
done