Why isn't this command taking the diff of two directories? - perl

I am asked to diff two directories using Perl but I think something is wrong with my command:
$diff = system("sudo diff -r '/Volumes/$vol1' '/Volumes/$vol2\\ 1/' >> $diff.txt");
It doesn't display any output. Can someone help me with this? Thanks!

It seems that you want to store all differences in a string.
If this is the case, the command in the question is not going to work for a few reasons:
It's hard to tell whether it's intended or not, but the $diff variable is being used to set the filename storing the differences. Perhaps this should be diff.txt, not $diff.txt
The output of the diff command is redirected to $diff.txt, so nothing is displayed on STDOUT. This can be remedied by omitting the >> $diff.txt part. If the output also needs to be stored in a file, consider the tee command:
sudo diff -r dir1/ dir2/ | tee diff.txt
Assigning the result of system to a variable does not capture the command's output; system returns 0 upon success. To quote the documentation:
The return value is the exit status of the program as returned by the wait call.
This means that $diff won't store the differences, but the command exit status. A more sensible approach would be to use backticks. Doing this will allow $diff to store whatever is output to STDOUT by the command:
my $diff = `sudo diff -r dir1/ dir2/ | tee diff.txt`; # Not $diff.txt
Is it a must to use the sudo command? Avoid it if at all possible:
my $diff = `diff -r dir1/ dir2/ | tee diff.txt`; # Not $diff.txt
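If you also need to know whether the command itself succeeded, here is a minimal sketch checking the exit status after the backticks (diff exits 0 when the directories match, 1 when they differ, and 2 or more on error):
my $diff   = `diff -r dir1/ dir2/`;
my $status = $? >> 8;                 # the high byte of $? holds the exit code
if    ($status == 0) { print "No differences found\n" }
elsif ($status == 1) { print $diff }
else                 { warn "diff reported an error (exit status $status)\n" }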
A final recommendation
Let a good CPAN module take care of this task, as backtick calls can only go so far. Some have already been suggested here; they may be well worth a look.

Is sudo diff being prompted for a password?
If possible, take out the sudo from the invocation of diff, and run your script with sudo.

"It doesn't display and output." -- this is becuase you are saving the differences to a file, and then (presumably) not doing anything with that resulting file.
However, I expect "diff two directories using Perl" does not mean "use system() to do it in the shell and then capture the results". Have you considered doing this in the language itself? For example, see Text::Diff. For more nuanced control over what constitutes a "difference", you can simply read in each file and craft your own algorithm to perform the comparisons and compile the similarities and differences.
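For example, a minimal sketch with Text::Diff, comparing two hypothetical files (the paths are placeholders):
use Text::Diff;

my $diff = diff '/Volumes/vol1/config.txt', '/Volumes/vol2/config.txt',
                { STYLE => 'Unified' };
print $diff if length $diff;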

You might want to check out Test::Differences for a more flexible diff implementation.
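For example, a minimal sketch using its eq_or_diff on two hypothetical data structures:
use Test::More tests => 1;
use Test::Differences;

my $got      = { name => 'alice', roles => ['admin'] };
my $expected = { name => 'alice', roles => ['user'] };

eq_or_diff $got, $expected, 'structures match';   # prints a readable diff on failure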

Related

Read command output line by line in sh (no bash)

I am basically looking for a way to do this
list=$(command)
while read -r arg
do
...
done <<< "$list"
Using sh instead of bash. The code as it is doesn't run because of the last line:
syntax error: unexpected redirection
Any fixes?
Edit: I need to edit variables and access them outside the loop, so using | is not acceptable (as it creates a sub-shell with independent scope)
Edit 2: This question is NOT similar to Why does my Bash counter reset after while loop, as I am not using | (as I just noted in the last edit). I am asking for another way of achieving it. (The answers to the linked question only explain why the problem happens but do not provide any solutions that work with sh (no bash).)
There's no purely syntactic way to do this in POSIX sh. You'll need to use either a temporary file for the output of the command, or a named pipe.
mkfifo output          # create a named pipe
command > output &     # run the command in the background, writing to the pipe
while read -r arg; do
...
done < output          # the loop runs in the current shell, so variables persist
rm output              # clean up the pipe
Any reason you can't do this? Should work .. unless you are assigning any variables inside the loop that you want visible when it's done.
command |
while read -r arg
do
...
done

Delete files in a folder using Perl

I want to delete all files in a folder which contain the word TRAR in their filename. I have tried the following:
CONFIG_DIR=`pwd`
VENDOR=ericsson-msc
RELEASE=v1
BASE_DIR=/appl/virtuo/gways
system ("cd /appl/virtuo/gways/config/ericsson-msc/v1/spool/input_d; rm-rf *TRAR");
Remove all your config lines (are they even Perl?)
CONFIG_DIR=`pwd`
VENDOR=ericsson-msc
RELEASE=v1
BASE_DIR=/appl/virtuo/gways
and
system ("cd /appl/virtuo/gways/config/ericsson-msc/v1/spool/input_d; rm -rf *TRAR")
should work but you should really be using perl code (unlink, etc)
I suspect you are confusing the usage of perl with how you will use awk in bash scripts.
As @Steffen Ullrich said, that isn't Perl or Shell. But I'll try to make it a little more Perlish for you:
First, note that
variables in Perl start with a $
strings need "quotes around them"
statements end with a ;
spaces around = are ok and make it all easier to read
so
$CONFIG_DIR = `pwd`;
$VENDOR = "ericsson-msc";
$RELEASE = "v1";
$BASE_DIR = "/appl/virtuo/gways";
Next, see how you can combine these into a single string like this (I'm guessing that's what you want to do)
$DIR_FOR_CLEANING = "$BASE_DIR/config/$VENDOR/$RELEASE/spool/input_d";
Lastly, you should be really careful whenever using the -r option to rm along with a wildcard like *. Look up the man page for rm and see if -r is something you want to do. I don't think you need it here, unless you have directories named *TRAR that you want to recurse into and remove. I'll bet you only have files named *TRAR in that input_d directory.
Also, the way you wrote the command, the cd could fail if that directory doesn't exist, and it would then proceed to recursively remove *TRAR from whatever directory you're running the script from. But you don't need to change directories at all. Try something like this
system ("echo rm -f $DIR_FOR_CLEANING/*TRAR");
If the echo command lists the files you do in fact want it to remove, then remove the "echo" and the rm will start deleting stuff.
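If you want to skip the shell entirely, here is a minimal sketch of the same cleanup using glob and unlink (reusing the $DIR_FOR_CLEANING variable from above):
my @victims = glob "$DIR_FOR_CLEANING/*TRAR";      # files whose names end in TRAR
print "would remove: @victims\n";                  # inspect the list first
foreach my $file (@victims) {
    unlink $file or warn "Could not delete $file: $!";
}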

How do I run the sed command with input and output as the same file?

I'm trying to use the sed command in a shell script where I want to remove lines that read STARTremoveThisComment and lines that read removeThisCommentEND.
I'm able to do it when I copy it to a new file using
sed 's/STARTremoveThisComment//' > test
But how do I do this by using the same file as input and output?
sed -i (or the extended version, --in-place) will automate the process normally done with less advanced implementations, that of sending output to a temporary file and then renaming it back to the original.
The -i is for in-place editing, and you can also provide a backup suffix for keeping a copy of the original:
sed -i.bak 's/STARTremoveThisComment//' fileToChange
sed --in-place=.bak 's/STARTremoveThisComment//' fileToChange
Both of those will keep the original file in fileToChange.bak.
Keep in mind that in-place editing may not be available in all sed implementations but it is in GNU sed which should be available on all variants of Linux, as per your tags.
If you're using a more primitive implementation, you can use something like:
cp oldfile oldfile.bak && sed 'whatever' oldfile >newfile && mv newfile oldfile
You can use the -i flag for in-place editing and the -e flag for specifying a script expression:
sed -i -e 's/pattern_to_search/text_to_replace/' file.txt
To delete lines that match a certain pattern you can use the simpler syntax. Notice the d command:
sed -i '/pattern_to_search/d' file.txt
You really should not use sed for that. This question seems to come up ridiculously often, and it seems very strange that it does since the general solution is so trivial. It seems bizarre that people want to know how to do it in sed, and in python, and in ruby, etc. If you want to have a filter operate on an input and overwrite it, use the following simple script:
#!/bin/sh -e
in=${1?No input file specified}
mv $in ${bak=.$in.bak}
shift
"$#" < $bak > $in
Put that in your path in an executable file named inline, and then the problem is solved in general. For example:
inline input-file sed -e s/foo/bar/g
Now, if you want to add logic to keep multiple backups, or if you have some options to change the backup naming scheme, or whatever, you fix it in one place.
What's the command line option to get 1-up counters on the backup file when processing a file in-place with perl? What about with ruby? Is the option different for gnu-sed? How does awk handle it?
The whole friggin' point of unix is that tools do one thing only. Handling logic for backup files is a second thing, and needs to be factored out. If you are implementing a tool, do not add logic to create backup files. Tell your users to use a 2nd tool for that. Integration is bad. Modularity is good. That is the unix way.
Notice that this script has several problems. The permissions/mode of the input file may be changed, for example. I'm sure there are innumerable other issues. However, by putting the backup logic in a wrapper script, you localize all of these issues and don't have to worry that sed overwrites the files and changes mode, while python keeps the file in place and does not change the inode (I made up those two cases, the point being that not all tools will use the same logic, while the wrapper script will.)
As far as I know it is not possible to use the same file for input and output. One workaround is to write the result to another file and then rename that file over the original:
sed -e 's/try/this/g' input.file > output.file; mv output.file input.file
I suggest using sponge
sponge reads standard input and writes it out to the specified file.
Unlike a shell redirect, sponge soaks up all its input before writing
the output file. This allows constructing pipelines that read from and
write to the same file.
cat test | sed 's/STARTremoveThisComment//' | sponge test

Why does grep hang when run against the / directory?

My question is in two parts :
1) Why does grep hang when I grep all files under "/" ?
for example :
grep -r 'h' ./
(Note: right before the hang/crash, I see some "no such device or address" messages regarding sockets.)
Of course, I know that grep shouldn't run against a socket, but I would think that since sockets are just files in Unix, it should return a negative result, rather than crashing.
2) Now, my follow-up question: In any case -- how can I grep the whole filesystem? Are there certain *NIX directories which we should leave out when doing this? In particular, I'm looking for all recently written log files.
As @ninjalj said, if you don't use -D skip, grep will try to read all your device files, socket files, and FIFO files. In particular, on a Linux system (and many Unix systems), it will try to read /dev/zero, which appears to be infinitely long.
You'll be waiting for a while.
If you're looking for a system log, starting from /var/log is probably the best approach.
If you're looking for something that really could be anywhere in your file system, you can do something like this:
find / -xdev -type f -print0 | xargs -0 grep -H pattern
The -xdev argument to find tells it to stay within a single filesystem; this will avoid /proc and /dev (as well as any mounted filesystems). -type f limits the search to ordinary files. -print0 prints the file names separated by null characters rather than newlines; this avoids problems with files having spaces or other funny characters in their names.
xargs reads a list of file names (or anything else) on its standard input and invokes the specified command on everything in the list. The -0 option works with find's -print0.
The -H option to grep tells it to prefix each match with the file name. By default, grep does this only if there are two or more file names on its command line. Since xargs splits its arguments into batches, it's possible that the last batch will have just one file, which would give you inconsistent results.
Consider using find ... -name '*.log' to limit the search to files with names ending in .log (assuming your log files have such names), and/or using grep -I ... to skip binary files.
Note that all this depends on GNU-specific features. Some of these options might not be available on MacOS (which is based on BSD) or on other Unix systems. Consult your local documentation, and consider installing GNU findutils (for find and xargs) and/or GNU grep.
Before trying any of this, use df to see just how big your root filesystem is. Mine is currently 268 gigabytes; searching all of it would probably take several hours. A few minutes spent (a) restricting the files you search and (b) making sure the command is correct will be well worth the time you spend.
By default, grep tries to read every file. Use -D skip to skip device files, socket files and FIFO files.
If you keep seeing error messages, then grep is not hanging. Keep iotop open in a second window to see how hard your system is working to pull all the contents off its storage media into main memory, piece by piece. This operation is going to be slow unless you have a very barebones system.
Now, my follow-up question: In any case -- how can I grep the whole filesystem? Are there certain *NIX directories which we should leave out when doing this? In particular, I'm looking for all recently written log files.
Grepping the whole FS is very rarely a good idea. Try grepping the directory where the log files should have been written; likely /var/log. Even better, if you know anything about the names of the files you're looking for (say, they have the extension .log), then do a find or locate and grep the files reported by those programs.

How can I remove a file based on its creation date time in Perl?

My webapp is hosted on a unix server using MySQL as database.
I wrote a Perl script to run backups of my database. The Perl script is inside the cgi-bin folder and it is working. I only need to set up the cronjob and run the Perl script once a day.
The backups are stored in a folder named db_backups. However, I also want to add a command inside my Perl script to remove any files inside the db_backups folder that are older than, say, 10 days.
I have searched high and low for unix commands and cannot find anything that matches what I needed.
if (-M $file > 10) { unlink $file }
or, coupled with File::Find::Rule
my $ten_days_ago = time() - 10 * 86400;
my @to_delete = File::Find::Rule->file()
->mtime("<=$ten_days_ago")
->in("/path/to/db_backup");
unlink @to_delete;
On Unix you can't, because the file's creation date is not stored in the filesystem.
You may want to check out stat, and -M (modification time)/-C (inode change time)/-A (access time) if you want a simple expression with relative timestamps (how long ago).
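A quick sketch of those file tests against a hypothetical backup file:
my $file = '/path/to/db_backups/dump.sql';         # hypothetical path
printf "modified %.1f days ago, inode changed %.1f, accessed %.1f\n",
       -M $file, -C $file, -A $file;
my @stat = stat $file;                             # element 9 is the mtime as an epoch timestamp
print "mtime: ", scalar localtime($stat[9]), "\n";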
i have searched high and low for unix commands
and cannot find anything that matches what i needed.
Check out find(1) and xargs(1). Warning: these commands may change your life at the shell prompt.
$ find /path/to/backup -type f -mtime +10 -print0 | xargs -0 echo rm -f
When you're confident that will Do What You Want (tm), remove the echo. It says, roughly, starting in /path/to/backup, descend looking for plain files whose mtime is greater than 10 days, and print their names to xargs, which will pass those names to rm in batches.
(-print0 and its complement -0 are GNU extensions -- you mentioned you were on Linux -- which let you deal with whitespace in filenames safely.)
You should be able to do it without resorting to Unix commands. Loop through the files in your directory, use stat on each file to get its last modification time, then use unlink on the file to delete it if it's older than what you want.
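A minimal sketch of that approach, assuming the db_backups directory path:
my $dir = '/path/to/db_backups';                   # assumed backup directory
opendir my $dh, $dir or die "Cannot open $dir: $!";
for my $name (readdir $dh) {
    my $file = "$dir/$name";
    next unless -f $file;                          # skip . , .. and subdirectories
    if (-M $file > 10) {                           # last modified more than 10 days ago
        unlink $file or warn "Could not delete $file: $!";
    }
}
closedir $dh;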