Perl system + split + array - perl

My name is Luis and I live in Arg.
I have a problem that I cannot solve.
**IN BASH**
pwd
/home/labs-perl
ls
file1.pl file2.pl
**IN PERL**
my $ls = exec("ls");
my @lsarray = split(/\s+/, $ls);
print "$lsarray[1]\n"; # how do I get this to work? I need the solution. >> file.pl
file1.pl file2.pl # but this is what I see in the shell.

The output you see is not from the print statement; it is the console output of ls. To get the ls output into a variable, use backticks:
my $ls = `ls`;
my @lsarray = split(/\s+/, $ls);
print "$lsarray[1]\n";
This is because exec never returns, so the statements after it are not executed. From perldoc:
The exec function executes a system command and never returns; use
system instead of exec if you want it to return. It fails and returns
false only if the command does not exist and it is executed directly
instead of via your system's command shell
Using system will not help you either, because it does not capture the command's output; hence the backticks. However, using the glob function is better:
my @arr = glob("*");
print $arr[1], "\n";
Also, Perl array indices start at 0, not 1. To get file1.pl you should use print "$lsarray[0]\n".
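To make the difference concrete, here is a minimal sketch that compares backticks with glob. It assumes a Unix-like system with ls available, and it works in a throwaway temporary directory (an assumption made for the example, not part of the question) whose path contains no spaces.

```perl
use strict;
use warnings;
use File::Basename qw(basename);
use File::Temp qw(tempdir);

# Hypothetical sandbox directory, so the sketch does not depend on your files.
my $dir = tempdir(CLEANUP => 1);
for my $name (qw(file1.pl file2.pl)) {
    open my $fh, '>', "$dir/$name" or die "Cannot create $name: $!";
    close $fh;
}

# Backticks run the command and return everything it printed as one string.
my $ls = `ls '$dir'`;
my @lsarray = split /\s+/, $ls;

# glob gives the same names without starting a shell or an external program.
my @arr = map { basename($_) } glob "$dir/*";

# Indices start at 0, so element 0 is the first file.
print "from ls:   $lsarray[0]\n";
print "from glob: $arr[0]\n";
```

Note that splitting the ls output on whitespace breaks file names that contain spaces, which is one more reason to prefer glob.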

It is bad practice to use the shell when you can write something within Perl.
This program displays what I think you want.
chdir '/home/labs-perl' or die $!;
my @dir = glob '*';
print "@dir\n";
Edit
I have just understood better what you need from perreal's post.
To display the first file in the current working directory, just write
print((glob '*')[0], "\n");

print <*> // die("No file found\n"), "\n";
(Though using an iterator in scalar context should usually be avoided if the script will be doing anything further.)


Remove multiple duplicate lines from a file

I have a Perl script, run from crontab, that generates a file full of duplicate entries, because each run rewrites information that was previously written.
I could run sort -u on the file, but I would like to do it at the end of the Perl script itself.
My list
10/10/2017 00:01:39:000;Sagitter
10/11/2017 00:00:01:002;Lupus
10/12/2017 00:03:14:109;Leon
10/12/2017 00:09:00:459;Sagitter
10/13/2017 01:11:03:009;Lupus
12/13/2017 04:29:00:609;Ariet
10/11/2017 00:00:01:002;Lupus
10/12/2017 00:03:14:109;Leon
...
My code
#!/usr/bin/perl
# Libraries
use strict;
use warnings 'all';
%lines = ();
# Remove duplicate
open( TMP_GL_OUTPUT, '>', $OUTPUT_FILE ) or die $!;
while ( <TMP_GL_OUTPUT> ) {
$lines{$_}++;
}
open( OUTFILE, '>', $TMPOUTPUT_FILE ) or die $!;
print OUTFILE keys %lines;
close( OUTFILE );
close( TMP_GL_OUTPUT );
Where am I going wrong? In shell it feels shorter than in Perl.
sort -u $TMPOUTPUT_FILE > $OUTPUT_FILE
As suggested by user ikegamy, I did the following:
move $OUTPUT_FILE, $TMPOUTPUT_FILE; # Move the file aside
run [ 'sort', '-u', '--', $TMPOUTPUT_FILE ], '>', $OUTPUT_FILE; # Remove duplicate
unlink $TMPOUTPUT_FILE;
I think you are asking why your Perl program is longer than your shell script.
First of all, your shell script does something completely different than your Perl program.
Your shell script executes a program and stores its output in a file.
Your Perl program reads a file, manipulates the data it read, and stores the output in a file.
The Perl equivalent to
sort -u -- "$TMPOUTPUT_FILE" > "$OUTPUT_FILE"
is
use IPC::Run qw( run );
run [ 'sort', '-u', '--', $TMPOUTPUT_FILE ], '>', $OUTPUT_FILE;
(There are differences in error handling between these two.)
They're not that different in length.
This brings up the second difference. The shell specializes in executing programs, but Perl is a general purpose language. It would be surprising if it wasn't longer in Perl!
(Now try comparing the size of your Perl program to the source of sort...)
List::Util is a core module.
use List::Util 'uniq';
print for uniq <>;
Your code looks almost OK.
My only suggestion is to chomp each line before you save it as a key in the hash.
The reason is that the last line, if it is not terminated with a \n, may look just the same as one of the previous lines, but without chomp the previous line would contain the terminating \n, whereas the last would not.
The result is that these two lines would be different keys in the hash.
Compare my example program (working, presented below) with yours, there are
no other significant differences, apart from reading from __DATA__ and
writing to the console.
In my program, for demonstration purposes, I put 2 variants of printout,
one with key values (repetition counts) and another, printing just keys.
In your program leave only the second printout.
use strict; use warnings; use feature qw(say);
my %lines;
while(<DATA>) {
chomp;
$lines{$_}++;
}
while(my($key, $val) = each %lines) {
printf "%-32s / %d\n", $key, $val;
}
say '========';
foreach my $key (keys %lines) {
say $key;
}
__DATA__
10/10/2017 00:01:39:000;Sagitter
10/11/2017 00:00:01:002;Lupus
10/12/2017 00:03:14:109;Leon
10/12/2017 00:09:00:459;Sagitter
10/13/2017 01:11:03:009;Lupus
12/13/2017 04:29:00:609;Ariet
10/11/2017 00:00:01:002;Lupus
10/12/2017 00:03:14:109;Leon
Edit
Your code assigns no values to $OUTPUT_FILE and $TMPOUTPUT_FILE, and you didn't even declare these variables, but I assume that you did so in your actual code.
Another detail is that %lines should be preceded with my; otherwise, since you put use strict;, the compiler reports an error.
Edit 2
There is a quicker and shorter solution than yours.
Instead of writing lines to a hash and printing them as late as in
the second step, you can do it in a single loop:
Read the line.
Check whether the hash already contains a key equal to the line just read.
If not, then:
write the line to the hash, to block the printout if the same line occurs again,
print the line.
You can even write this program as a Perl one-liner:
perl -lne"print if !$lines{$_}++" input.txt
The double quotes above are for the Windows cmd. On Linux, use single quotes (apostrophes) instead, so that the shell does not try to interpolate $lines and $_.
The command prints its output to the console. You may of course redirect it to a file by adding > output.txt to the above command.
The code is executed for each input line, which is chomped because of the -l option.
If you are not familiar with other details of Perl one-liners, see perldoc perlrun.
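The same single-loop idea can be sketched as a script that reads one file and writes another. The helper name dedupe_file and the file arguments are hypothetical, chosen only for this illustration. Note that the input is opened with '<' for reading; opening it with '>' would truncate the file before anything could be read from it.

```perl
use strict;
use warnings;

# Copy $in to $out, dropping lines already seen (the first occurrence wins).
sub dedupe_file {
    my ($in, $out) = @_;
    open my $in_fh,  '<', $in  or die "Cannot read $in: $!";
    open my $out_fh, '>', $out or die "Cannot write $out: $!";
    my %seen;
    while (my $line = <$in_fh>) {
        chomp $line;    # so a last line without "\n" still compares equal
        print {$out_fh} "$line\n" unless $seen{$line}++;
    }
    close $in_fh;
    close $out_fh or die "Cannot close $out: $!";
    return scalar keys %seen;    # number of distinct lines
}

# Runs only when two file names are given on the command line.
dedupe_file(@ARGV) if @ARGV == 2;
```

Unlike sort -u, this keeps the lines in their original order. The input and output must be different files.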

Perl open file from command line with wildcard

I am executing my script this way:
./script.pl -f files*
I looked at some other threads (like How can I open a file in Perl using a wildcard in the directory name?)
If I hard-code the file name as shown in that thread, I get my desired result. If I take it from the command line, I do not.
My options subroutine should save all the files I get this way in an array.
my @file;
sub Options{
    my $i=0;
    foreach my $opt (@ARGV){
        switch ($opt){
            case "-f" {
                $i++;
                ### This part does not work:
                @file= glob $ARGV[$i];
                print Dumper("$ARGV[$i]"); #$VAR1 = 'files';
                print Dumper(@file); #$VAR1 = 'files';
            }
        }
        $i++;
    }
}
It seems the execution is interpreted in advance and the wildcard (*) is dropped in the process.
Desired result: All files beginning with files are saved in an array, after execution from the command line.
I hope you get my problem. If not feel free to ask.
Thank you.
Well, first I'd suggest using a module to handle command-line arguments: Getopt::Long, for example.
But otherwise your problem is simpler: your shell is expanding the 'file*' before perl gets it (the shell glob gets there first).
If you quote the pattern:
-f 'file*'
then it'll work properly. You should be able to see this if, for example, you just run:
use Data::Dumper;
print Dumper \@ARGV;
I expect you'll see a much longer list than you thought.
However, I'd also point out - perl has a really nice feature you may be able to use (depending what you're doing with your files).
You can use <>, which automatically opens and reads all files specified on command line (in order).
Since your shell is already expanding the glob files* into a list of filenames, that's what the Perl program gets.
$ perl -E 'say @ARGV' files*
files1files2files3
There's no need to do that in Perl if your shell can do it for you. If all you want is the filenames in an array, you already have @ARGV, which contains them.
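For completeness, here is a rough sketch of how a -f option could collect the file names whether or not the shell has already expanded the wildcard. The helper name parse_files is hypothetical. The 'f=s{1,}' specification (one or more values per option) is a Getopt::Long feature that its documentation describes as experimental, so treat this as one possible approach rather than the definitive one.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Collect every value given after -f; expand any that still contain wildcards.
sub parse_files {
    my @args = @_;
    my @patterns;
    GetOptionsFromArray(\@args, 'f=s{1,}' => \@patterns)
        or die "Usage: $0 -f FILE...\n";
    # Already-expanded names pass through; quoted patterns are globbed here.
    return map { /[*?\[]/ ? glob($_) : $_ } @patterns;
}

my @file = parse_files(@ARGV);
print "$_\n" for @file;
```

With this, both ./script.pl -f files* (expanded by the shell) and ./script.pl -f 'files*' (expanded by Perl) should end up with the same list in @file.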

Perl Capture and Modify STDERR before it prints to a file [duplicate]

I want to execute an external command from within my Perl script, putting the output of both stdout and stderr into a $variable of my choice, and to get the command's exit code into the $? variable.
I went through solutions in perlfaq8 and their forums, but they're not working for me. The strange thing is that I don't get the output of stderr in any case, as long as the exit code is correct.
I'm using Perl version 5.8.8, on Red Hat Linux 5.
Here's an example of what I'm trying:
my $cmd="less";
my $out=`$cmd 2>&1`;
or
my $out=qx($cmd 2>&1);
or
open(PIPE, "$cmd 2>&1|");
When the command runs successfully, I can capture stdout.
I don't want to use additional capture modules. How can I capture the full results of the external command?
This was exactly the challenge that David Golden faced when he wrote Capture::Tiny. I think it will help you do exactly what you need.
Basic example:
#!/usr/bin/env perl
use strict;
use warnings;
use Capture::Tiny 'capture';
my ($stdout, $stderr, $return) = capture {
system( 'echo Hello' );
};
print "STDOUT: $stdout\n";
print "STDERR: $stderr\n";
print "Return: $return\n";
After rereading you might actually want capture_merged to join STDOUT and STDERR into one variable, but the example I gave is nice and general, so I will leave it.
Actually, the proper way to write this is:
#!/usr/bin/perl
$cmd = 'lsss';
my $out=qx($cmd 2>&1);
my $r_c=$?;
print "output was $out\n";
print "return code = ", $r_c, "\n";
You will get 0 if there was no error and a non-zero value otherwise (-1 only if the command could not be started at all).
STDERR is intended to be used for errors or messages that might need to be separated from the STDOUT output stream. Hence, I would not expect any STDERR from the output of a command like less.
If you want both (or either) stream and the return code, you could do:
my $out=qx($cmd 2>&1);
my $r_c=$?;
print "output was $out\n";
print "return code = ", $r_c == -1 ? $r_c : $r_c>>8, "\n";
If the command could not be started at all, the return code will be -1. Otherwise, the correct exit value is the high 8 bits. (Because of the 2>&1 redirection the command is run through the shell, so a mistyped command, such as lsss where you meant less, is typically reported by the shell itself as exit value 127.) See system.
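The three cases that the documentation of system distinguishes for $? can be wrapped in a small helper. This is only a sketch: the name describe_status is made up for the example, and the demo command assumes a Unix-like system with sh.

```perl
use strict;
use warnings;

# Turn a raw $? value into a short human-readable description.
sub describe_status {
    my ($status) = @_;
    return 'failed to start' if $status == -1;
    return 'killed by signal ' . ($status & 127) if $status & 127;
    return 'exit code ' . ($status >> 8);
}

# Demo command (an assumption for this sketch): writes to stderr and exits with 3.
my $out = qx(sh -c 'echo oops >&2; exit 3' 2>&1);
my $r_c = $?;    # save $? at once, before anything else can change it

print "output was $out";
print describe_status($r_c), "\n";
```

Because of the 2>&1, the text written to stderr ends up in $out together with anything written to stdout.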
A frequently given answer to this question is to use a command line containing shell type redirection. However, suppose you want to avoid that, and use open() with a command and argument list, so you have to worry less about how a shell might interpret the input (which might be partly made up of user-supplied values). Then without resorting to packages such as IPC::Open3, the following will read both stdout and stderr:
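my ($child_pid, $child_rc);
$child_pid = open(OUTPUT, '-|');
die "ERROR: Could not fork: $!" unless defined $child_pid;
unless ($child_pid) {
    # Child: send STDERR to the same pipe as STDOUT, then run the program
    open(STDERR, ">&STDOUT");
    exec('program', 'with', 'arguments');
    die "ERROR: Could not execute program: $!";
}
# Parent: read the output first, so the child cannot block on a full pipe
while (<OUTPUT>) {
    # Do something with it
}
close(OUTPUT); # close() on a piped handle waits for the child and sets $?
$child_rc = $? >> 8;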

Perl substitute with regex

When I run this command as a Perl one-liner, it picks up the regular expression, so that can't be bad.
more tagcommands | perl -nle 'print /(\d{8}_\d{9})/' | sort
12012011_000005769
12012011_000005772
12162011_000005792
12162011_000005792
But when I run the script below, it does not pick up the regex.
#!/usr/bin/perl
use strict;
my $switch="12012011_000005777";
open (FILE, "more /home/shortcasper/work/tagcommands|");
my @array_old = (<FILE>) ;
my @array_new = @array_old ;
foreach my $line(@array_new) {
$line =~ s/\d{8}_\d{9}/$switch/g;
print $line;
sleep 1;
}
This is the data that I am feeding into the script
/CASPERBOT/START URL=simplefile:///data/tag/squirrels/squirrels /12012011_000005777N.dart.gz CASPER=SeqRashMessage
/CASPERBOT/ADDSERVER simplefile:///data/tag/squirrels/12012011_0000057770.dart.trans.gz
/CASPERRIP/newApp multistitch CASPER_BIN
/CASPER_BIN/START URLS=simplefile:///data/tag/squirrels /12012011_000005777R.rash.gz?exitOnEOF=false;binaryfile:///data/tag/squirrels/12162011_000005792D.binaryBlob.gz?exitOnEOF=false;simplefile:///data/tag/squirrels/12012011_000005777E.bean.trans.gz?exitOnEOF=false EXTRACTORS=rash;island;rash BINARY=T
You should study your one-liner to see how it works. First check perl -h to learn about the switches used:
-l[octal] enable line ending processing, specifies line terminator
-n assume "while (<>) { ... }" loop around program
The first one is not exactly self-explanatory, but what -l actually does is chomp each line, and then change $\ and $/ to newline. So, your one-liner:
perl -nle 'print /(\d{8}_\d{9})/'
Actually does this:
$\ = "\n";
while (<>) {
chomp;
print /(\d{8}_\d{9})/;
}
A very easy way to see this is to use the Deparse command:
$ perl -MO=Deparse -nle 'print /(\d{8}_\d{9})/'
BEGIN { $/ = "\n"; $\ = "\n"; }
LINE: while (defined($_ = <ARGV>)) {
chomp $_;
print /(\d{8}_\d{9})/;
}
-e syntax OK
So, that's how you transform that into a working script.
I have no idea how you went from that to this:
use strict;
my $switch="12012011_000005777";
open (FILE, "more /home/shortcasper/work/tagcommands|");
my @array_old = (<FILE>) ;
my @array_new = @array_old ;
foreach my $line(@array_new) {
$line =~ s/\d{8}_\d{9}/$switch/g;
print $line;
sleep 1;
}
First of all, why are you opening a pipe from the more command to read a text file? That is like calling a tow truck to fetch you a cab. Just open the file. Or better yet, don't. Just use the diamond operator, like you did the first time.
You don't need to first copy the lines of a file to an array, and then use the array. while(<FILE>) is a simple way to do it.
In your one-liner, you print the regex. Well, you print the return value of the regex. In this script, you print $line. I'm not sure how you thought that would do the same thing.
Your substitution here will replace every such set of numbers with the one in your script. Nothing else.
You should also be aware that sleep 1 may not do what you think. Try this one-liner, for example:
perl -we 'for (1 .. 10) { print "line $_\n"; sleep 1; }'
As you may notice, it can simply wait 10 seconds and then print everything at once, for example when the output is piped or redirected. That's because perl by default buffers standard output, and that buffer is not printed until it is full or flushed (at the latest when the perl execution ends; a terminal is normally flushed at every newline). So, it's a perception problem. Everything works like it should, you just don't see it.
If you absolutely want to have a sleep statement in your script, you'll probably want to autoflush, e.g. STDOUT->autoflush(1);
However, why are you doing that? Is it so you will have time to read the numbers? If so, put that more statement at the end of your one-liner instead:
perl ...... | more
That will pipe the output into the more command, so you can read it at your own pace. Now, for your one-liner:
Always also use -w, unless you specifically want to avoid getting warnings (which basically you never should).
Your one-liner will only print the first match. If you want to print all the matches on a new line:
perl -wnle 'print for /(\d{8}_\d{9})/g'
If you want to print all the matches, but keep the ones from the same line on the same line:
perl -wnle 'print "@a" if @a = /(\d{8}_\d{9})/g'
Well, that should cover it.
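Putting these points together, a script version of the substitution might look like the sketch below. It reads the files named on the command line with the diamond operator instead of piping from more. The helper name switch_ids is made up for this illustration.

```perl
use strict;
use warnings;

my $switch = "12012011_000005777";

# Replace every 8-digit_9-digit ID in a line with $switch.
sub switch_ids {
    my ($line) = @_;
    $line =~ s/\d{8}_\d{9}/$switch/g;
    return $line;
}

# Process the files named on the command line, line by line.
if (@ARGV) {
    print switch_ids($_) while <>;
}
```

It could be run as, for example, perl script.pl /home/shortcasper/work/tagcommands | more, which also lets you read the output at your own pace without any sleep.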
Your open call may be failing (you should always check the result of an open to make sure it succeeded if the rest of the program depends on it) but I believe your problem is in complicating things by opening a pipe from a more command instead of simply opening the file itself. Change the open to simply
open FILE, "/home/shortcasper/work/tagcommands" or die $!;
and things should improve.

Why does STDIN cause my Perl program to freeze?

I am learning Perl and wrote this script to practice using STDIN. When I run the script, it only shows the first print statement on the console. No matter what I type in, including new lines, the console doesn't show the next print statement. (I'm using ActivePerl on a Windows machine.) It looks like this:
$perl script.pl
What is the exchange rate? 90.45
[Cursor stays here]
This is my script:
#!/user/bin/perl
use warnings; use strict;
print "What is the exchange rate? ";
my @exchangeRate = <STDIN>;
chomp(@exchangeRate);
print "What is the value you would like to convert? ";
chomp(my @otherCurrency = <STDIN>);
my @result = @otherCurrency / @exchangeRate;
print "The result is @{result}.\n";
One potential solution I noticed while researching my problem is that I could include use IO::Handle; and flush STDIN; flush STDOUT; in my script. These lines did not solve my problem, though.
What should I do to have STDIN behave normally? If this is normal behavior, what am I missing?
When you do
my @answer = <STDIN>;
...Perl waits for the EOF character (on Unix and Unix-like systems it's Ctrl-D). Then each line you input (separated by linefeeds) goes into the list.
If you instead do:
my $answer = <STDIN>;
...Perl waits for a linefeed, then puts the string you entered into $answer.
I found my problem. I was using the wrong type of variable. Instead of writing:
my @exchangeRate = <STDIN>;
I should have used:
my $exchangeRate = <STDIN>;
with a $ instead of a @.
To end multiline input, you can use Control-D on Unix or Control-Z on Windows.
However, you probably just wanted a single line of input, so you should have used a scalar like other people mentioned. Learning Perl walks you through this sort of stuff.
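A scalar-based version of the script might look like the following sketch. The helper read_value is a made-up name, and the keyboard input is simulated with an in-memory filehandle (an assumption made so that the example runs unattended). For real interactive use, pass \*STDIN to read_value instead.

```perl
use strict;
use warnings;

# Read exactly one line from a filehandle and strip its trailing newline.
sub read_value {
    my ($fh) = @_;
    my $line = <$fh>;    # scalar context: returns as soon as one line is read
    die "No input\n" unless defined $line;
    chomp $line;
    return $line;
}

# Simulated keyboard input; replace $in with \*STDIN for interactive use.
my $typed = "90.45\n100\n";
open my $in, '<', \$typed or die "Cannot open in-memory handle: $!";

print "What is the exchange rate? ";
my $exchange_rate = read_value($in);
print "What is the value you would like to convert? ";
my $other_currency = read_value($in);

my $result = $other_currency / $exchange_rate;
print "\nThe result is $result.\n";
```

Each read happens in scalar context, so it returns after a single line instead of waiting for end of input.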
You could try and enable autoflush.
Either
use IO::Handle;
STDOUT->autoflush(1);
or
$| = 1;
That's why you are not seeing the output printed.
Also, you need to change from arrays '@' to scalar variables '$'
$val = <STDIN>;
chomp($val);