How to get input from cat over ssh in Perl

I'm trying to cat a remote file over ssh and process it in a local script line by line. So far I've tried this
open(INPUT,"| ssh user@host cat /dir1/dir2/file.dat")
but obviously that only prints file.dat to STDOUT.
I know I can probably just scp the file and process it, but...

You're piping into ssh. I think you want to move the pipe to the other end so you can read the output from that cat command.
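For example (a minimal sketch, reusing the user@host and path from the question):

# Open a read pipe: ssh runs cat remotely and we read its output here.
# The '-|' mode means the filehandle reads the command's output, and the
# list form avoids the local shell entirely.
open(my $in, '-|', 'ssh', 'user@host', 'cat', '/dir1/dir2/file.dat')
    or die "Cannot start ssh: $!";
while (my $line = <$in>) {
    chomp $line;
    # ... process $line ...
}
close($in) or warn "ssh exited with status $?";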

I would use
$file_contents = `ssh user@host cat /dir1/dir2/file.dat`;
@lines = split(/\n/, $file_contents);
# ... process the file contents
That captures the output of the command (i.e. the contents of the file).

Related

awk command in Perl's system does not work

I am writing a small Perl script that executes an awk command.
I am trying to swap two columns in a file that looks like this:
domain1,ip1
domain2,ip2
domain3,ip3
the result should be
ip1,domain1
ip2,domain2
ip3,domain3
The Perl command invoking awk is like this:
system("ssh -p 22 root\@$mainip 'awk -F, '{print $2,$1}' OFS=, /root/archive/ipdomain.txt > /root/ipdom.txt'");
This is the error I get :
awk: cmd. line:1: {print
awk: cmd. line:1: ^ unexpected newline or end of string
any suggestions, please?
With the layered commands and all the multi-level quoting and escaping that need to be done right,† it's no wonder it fails. A complex command like that will always be tricky, but libraries help a lot.
A properly quoted string to run through a shell can be formed with String::ShellQuote ‡
use warnings;
use strict;
use feature 'say';
use String::ShellQuote qw(shell_quote);
die "Usage: $0 file outfile\n" if @ARGV != 2;
my ($file, $out) = @ARGV;
my @cmd_words =
    ( 'ssh', 'hostname', 'awk', q('{print $2 $1}'), $file, '>', $out );
my $cmd = shell_quote @cmd_words;
system($cmd);
Note how the q() form of single quotes lets us pass literal single quotes through nicely.
This swaps the first two words on each line of a file and prints them, using awk, and redirects the output to a file, on a remote host. It works as expected in my tests (with a real hostname). Please adjust as needed.
Another possible improvement would be to use a library for ssh, like Net::OpenSSH.
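For instance, a minimal sketch with Net::OpenSSH (the hostname and file path here are placeholders); the library quotes each argument for the remote shell, so none of the manual escaping is needed:

use Net::OpenSSH;

my $ssh = Net::OpenSSH->new('hostname');
$ssh->error and die "Cannot connect: " . $ssh->error;

# capture() runs the remote command and returns its output lines;
# each argument is quoted for the remote shell by the library.
my @lines = $ssh->capture('awk', '{print $2, $1}', '/path/to/file');
$ssh->error and die "Remote command failed: " . $ssh->error;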
A complete command, like the one in the question, to use in the above program:
my @cmd_words = (
    'ssh', '-p', '22', "root\@$mainip",
    'awk', '-F,', q('{print $2,$1}'), 'OFS=,', $file, '>', $out );
Tested with a file from the question.
The makeVoiceBot answer is informative and gets halfway there, but I then find the need for
system("ssh hostname \"awk '{print \\\$2 \\\$1}' $path\"");
This works in my tests (on systems I ssh to). I try to avoid needing to deal with such quoting and escaping.
† This is a shell command which runs ssh, and then executes a command on the remote system which runs a shell (there) as well, in order to run awk and redirect its output to a file.
A bit more than an "awk command" as the title says.
‡ The library can prepare a command for bash (as of this writing), but one can look at the source for it and adjust it for their own shell, at least. There is also Win32::ShellQuote
I am using a shortened example here
system("ssh localhost 'awk '{print $2,$1}' file.txt'")
system() sees:
ssh localhost 'awk '{print $2,$1}' file.txt'
local shell expands:
ssh
localhost
awk
{print
$2,$1}
file.txt
local shell replaces $1 and $2 (positional args) with empty strings:
ssh
localhost
awk
{print
,}
file.txt
ssh executes:
ssh localhost awk {print ,} file.txt
remote shell gets:
awk
{print
,}
file.txt
So the remote shell runs awk with {print as its program argument, resulting in the described error. To prevent this, the invocation of system() can be changed to:
system("ssh localhost \"awk '{print \$2,\$1}' file.txt\"")
system() sees:
ssh localhost "awk '{print \$2,\$1}' file.txt"
local shell expands:
ssh
localhost
awk '{print \$2,\$1}' file.txt
ssh executes
ssh localhost awk '{print \$2,\$1}' file.txt
remote shell gets
awk
{print \$2,\$1}
file.txt
remote shell expands \ escapes
awk
{print $2,$1}
file.txt
Remote awk now gets {print $2,$1} as its program argument, and executes successfully.
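Another way to sidestep the problem entirely (my sketch, not part of the answer above) is the list form of system(), which involves no local shell at all, so the awk program reaches ssh untouched:

# No local shell: the quoted awk program is passed to ssh as one argument,
# and only the remote shell parses it.
system('ssh', 'localhost', q{awk '{print $2,$1}' file.txt});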

calling awk from perl does not work with unzip redirect

The command below works fine when I run it manually, but when I call it from a Perl script using backticks or system(), it gives this error:
sh: -c: line 0: syntax error near unexpected token `('
Script snapshot:
#Find contents of myFile in zipfile and output the matched records to output.txt
$cmd = "awk -F\"|\" 'NR==FNR{hash[\$0]=1;next} \$237 in hash' $myFILE <(unzip -p $zipfile *XYZ*) >> output.txt";
$result=`$cmd`;
It seems that we cannot use a subshell, i.e. (unzip ...), within a system call from Perl. Please advise, as I have been struggling with this for a couple of days.
This is because your interactive shell is most likely /bin/bash, while Perl runs commands through /bin/sh, which does not support process substitution. Put the unzip command before the awk with a pipe (|) instead. Something like this:
#Find contents of myFile in zipfile and output the matched records to output.txt
$cmd = "unzip -p $zipfile *XYZ* | awk -F\"|\" 'NR==FNR{hash[\$0]=1;next} \$237 in hash' $myFILE >> output.txt";
$result=`$cmd`;
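If the process substitution is genuinely wanted, another option (a hedged sketch; the filenames are placeholders) is to run bash explicitly rather than relying on /bin/sh:

# The list form of system() runs bash directly, and bash supports <(...).
system('bash', '-c',
    q{awk -F'|' 'NR==FNR{hash[$0]=1;next} $237 in hash' myfile <(unzip -p archive.zip '*XYZ*') >> output.txt}
) == 0 or die "Command failed: $?";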

Perl deleting "blank" lines from a csv file

I'm looking to delete blank lines in a CSV file, using Perl.
I'm not too sure how to do this, as these lines aren't exactly "blank" (they're just a bunch of commas).
I'd also like to save the output as a file of the same name, overwriting the original.
How could I go about doing this?
edit: I can't use modules or any source code due to network restrictions...
You can do this using a simple Perl one-liner:
perl -i -ne 'print unless /^[,\s]*$/' <filename>
The -n flag assumes this loop around your program:
while(<>) {
print unless /^[,\s]*$/;
}
and the -i flag means inplace and modifies your input file.
Note: If you are worried about losing your data with -i, you can specify -i.bak and perl will automatically write the original file to your <filename>.bak
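For reference, here is roughly what that one-liner expands to as a standalone script (a sketch; the filename is a placeholder):

#!/usr/bin/perl
$^I = '.bak';            # in-place editing with a .bak backup, as with -i.bak
@ARGV = ('file.csv');    # placeholder filename
while (<>) {             # the implicit loop supplied by -n
    print unless /^[,\s]*$/;   # skip lines that are only commas/whitespace
}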
More of a command line hack,
perl -i -ne 'print if /[^,\r\n]/' file.csv
If you want to put it inside a shell script you can do this ...
#!/bin/sh
perl -i -ne 'print unless /^,+$/' "$@"

storing output of perl command into a variable

This works fine
system("perl -c C:/Users/mytest/scripts/file_name.pm")
This command gives many lines of output in Cygwin and a single "syntax OK" line on CentOS. Since I'll be using Cygwin, what I am trying to do is get this output into a variable and use it later in my program. How can I do it?
Thanks in advance for your time.
Instead of system, use backticks:
my $output = `perl -c C:/Users/mytest/scripts/file_name.pm`;
if you want to also include STDERR output, use:
my $output = `perl -c C:/Users/mytest/scripts/file_name.pm 2>&1`;
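Since perl -c also reports success through its exit status, it can be worth checking $? after the backticks (a small sketch building on the line above):

my $output = `perl -c C:/Users/mytest/scripts/file_name.pm 2>&1`;
if ($? == 0) {
    print "Compile check passed:\n$output";
} else {
    warn "Compile check failed (exit status ", $? >> 8, "):\n$output";
}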

Perl's diamond operator: can it be done in bash?

Is there an idiomatic way to simulate Perl's diamond operator in bash? With the diamond operator,
script.sh | ...
reads stdin for its input and
script.sh file1 file2 | ...
reads file1 and file2 for its input.
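In Perl terms, the behaviour I want is just:

while (<>) {    # reads each file named in @ARGV, or STDIN if @ARGV is empty
    print "read: $_";
}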
One other constraint is that I want to use the stdin in script.sh for something else other than input to my own script. The below code does what I want for the file1 file2 ... case above, but not for data provided on stdin.
command - $@ <<EOF
some_code_for_first_argument_of_command_here
EOF
I'd prefer a Bash solution but any Unix shell is OK.
Edit: for clarification, here is the content of script.sh:
#!/bin/bash
command - $@ <<EOF
some_code_for_first_argument_of_command_here
EOF
I want this to work the way the diamond operator would work in Perl, but it only handles filenames-as-arguments right now.
Edit 2: I can't do anything that goes
cat XXX | command
because the stdin for command is not the user's data. The stdin for command is my data in the here-doc. I would like the user data to come in on the stdin of my script, but it can't be the stdin of the call to command inside my script.
Sure, this is totally doable:
#!/bin/bash
cat "$@" | some_command_goes_here
Users can then call your script with no arguments (or '-') to read from stdin, or multiple files, all of which will be read.
If you want to process the contents of those files (say, line-by-line), you could do something like this:
cat "$@" | while IFS= read -r line; do
    echo "I read: $line"
done
Edit: Changed $* to "$@" to handle spaces in filenames, thanks to a helpful comment.
Kind of cheesy, but how about
cat file1 file2 | script.sh
I am (like everyone else, it seems) a bit confused about exactly what the goal is here, so I'll give three possible answers that may cover what you actually want. First, the relatively simple goal of getting the script to read from either a list of files (supplied on the command line) or from its regular stdin:
if [ $# -gt 0 ]; then
exec < <(cat "$@")
fi
# From this point on, the script's stdin is redirected from the files
# (if any) supplied on the command line
Note: the double-quoted use of "$@" is the best way to avoid problems with funny characters (e.g. spaces) in filenames -- $* and unquoted $@ both mess this up. The <() trick I'm using here is a bash-only feature; it fires off cat in the background to feed data from the files supplied on the command line, and then we use exec to replace the script's stdin with the output from cat.
...but that doesn't seem to be what you actually want. What you seem to really want is to pass the supplied filenames or the script's stdin as arguments to a command inside the script. This requires sort of the opposite process: converting the script's stdin into a file (actually a named pipe) whose name can be passed to the command. Like this:
if [[ $# -gt 0 ]]; then
command "$@" <<EOF
here-doc goes here
EOF
else
command <(cat) <<EOF
here-doc goes here
EOF
fi
This uses <() to launder the script's stdin through cat to a named pipe, which is then passed to command as an argument. Meanwhile, command's stdin is taken from the here-doc.
Now, I think that's what you want to do, but it's not quite what you've asked for, which is to both redirect the script's stdin from the supplied files and pass stdin to the command inside the script. This can be done by combining the above techniques:
if [ $# -gt 0 ]; then
exec < <(cat "$@")
fi
command <(cat) <<EOF
here-doc goes here
EOF
...although I can't think why you'd actually want to do this.
The Perl diamond operator essentially loops across all the command line arguments, treating each as a filename. It opens each file and reads them line-by-line. Here's some bash code that will do approximately the same.
for f in "$@"
do
# Do something with $f, such as...
cat "$f" | command1 | command2
-or-
command1 < "$f"
-or-
# Read $f line-by-line
cat "$f" | while read -r line_from_f
do
# Do stuff with $line_from_f
done
done
You want to take the first argument and do something with it, and then either read from any files specified or stdin if no files?
Personally, I'd suggest using getopt to indicate arguments using the "-a value" syntax to help disambiguate, but that's just me. Here's how I'd do it in bash without getopts:
firstarg=${1:?usage: $0 arg [file1 .. fileN]}
shift
typeset -a files
if [[ ${#@} -gt 0 ]]
then
files=( "$@" )
else
files=( "/dev/stdin" )
fi
for file in "${files[@]}"
do
whatever_you_want < "$file"
done
The :? expansion will die if there are no args specified, since you seem to want at least one arg either way. After grabbing that, shift the args over by one, and then either use the remaining args as your file list, or the bash special filehandle "/dev/stdin" if there were no other args.
I think that the "if no files are specified, use /dev/stdin - otherwise use the files on the command line" piece is probably what you're looking for, but the rest of the code is at least useful for context.
Also a little cheesy, but how about this:
if [[ $# -eq 0 ]]
then
# read from stdin
else
# read from $* (args)
fi
If you need to read and process line-by-line (which is likely) and don't want to copy/paste the same code twice (which is likely), define a function in your script and just pass the lines one-by-one to this function, and process them in said function.
Why not use `cat $*` in the script? For example:
x=`cat $*`
echo $x