Perl Script to Read File Line By Line and Run Command on Each Line - perl

I found this perl script here, which seems like it will work for my purposes. It opens a Unicode text file and reads each line so that a command can be run on each one. But I cannot figure out how to run a certain ICU command on each line. Can someone help me out? The error I get is (largefile is the script name):
syntax error at ./largefile line 11, near "/ ."
Search pattern not terminated at ./largefile line 11.
#!/usr/bin/perl
use strict;
use warnings;
my $file = 'test.txt';
open my $info, $file or die "Could not open $file: $!";
while( my $line = <$info>) {
do
LD_LIBRARY_PATH=icu/source/lib/ ./a.out "$line" >> newtext.txt
done
}
close $info;
Basically I want to open a large text file and run the command (which normally runs from the command line...I think how I call this in the perl script is the problem, but I don't know how to fix it) "LD_LIBRARY_PATH=icu/source/lib/ ./a.out "$line" >> newtext.txt" on each line so that "newtext.txt" is then populated with all the lines after they have been processed by the script. The ICU part is breaking words for Khmer.
Any help would be much appreciated! I'm not much of a programmer... Thanks!

For executing terminal commands from Perl, the command needs to go through system(), so change it to:
system("LD_LIBRARY_PATH=icu/source/lib/ ./a.out $line >> newtext.txt");

Have you tried backticks:
while (my $line = <$info>) {
    `LD_LIBRARY_PATH=icu/source/lib/ ./a.out "$line" >> newtext.txt`;
    last if $. == 2;
}
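If you would rather have Perl manage the output file instead of relying on the shell's >> redirection, a sketch along these lines (same assumptions about a.out and the $info handle) also works:
open my $out, '>>', 'newtext.txt' or die "Could not open newtext.txt: $!";
while (my $line = <$info>) {
    chomp $line;
    # capture the command's stdout and append it from Perl
    my $result = `LD_LIBRARY_PATH=icu/source/lib/ ./a.out "$line"`;
    print $out $result;
}
close $out;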

Related

Trouble with line iteration using while or for

I am trying to process each line in a file through a perl script, instead of sending the entire file to the perl script and loading so much data into memory at once.
In a shell script, I began what I thought to be line iteration as follows:
while read line
do
perl script.pl --script=options "$line"
done < input
When I do this, how do I save the data to an output file >> output?
while read line
do
perl script.pl --script=options "$line"
done < input
>> output
If it would take less memory to split the file first, I also had trouble with the for statement:
for file in /dev/*
do
split -l 1000 $file prefix
done < input
## Where do I save the output?
for file in /dev/out/*
do
perl script.pl --script=options
etc...
Which is the most memory-efficient way to do this?
You can also process your very big file line by line within the perl script, without loading the entire file into memory. For that you just need to enclose the text of your current perl script (which I hope doesn't read the file into memory any more :) ) in a while loop. For example:
my $line;
while ($line = <>) {
    # your script text here, referring to $line instead of the param variable
}
In this perl script you can also write the results to an output file. Say the result is stored in the variable $res; then you can do it this way:
open (my $fh, ">>", "out") or die "ERROR: $!"; # open a file handle for appending
my $line;
while ($line = <>) {
    # your script text here, referring to $line instead of the param variable
    print $fh $res, "\n"; # write to the file handle
}
close $fh; # close the file handle
try this:
while read line
do
perl script.pl --script=options "$line" >> "out"
done < input
"out" is a name of your output file.
I fixed my issue with:
split -l 100000 input /dev/shm/split/split.input.txt.
find /dev/shm/split/ -type f -name '*.txt.*' -exec perl script.pl --script=options {} + > output
This made my script process the files faster.
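That works because find passes script.pl several of the split files at once, and the implicit while (<>) loop in a Perl script reads every file named on its command line, one line at a time. A rough sketch of what the read loop inside script.pl is assumed to look like:
# <> walks through every filename in @ARGV, line by line
while (my $line = <>) {
    chomp $line;
    # ... process $line ...
    print "$line\n";   # results go to STDOUT, which the caller redirects with "> output"
}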

Tail command used in perl backticks

I'm trying to run a tail command from within a perl script using the usual backticks.
The section in my perl script is as follows:
$nexusTime += nexusUploadTime(`tail $log -n 5`);
So I'm trying to get the last 5 lines of this file but I'm getting the following error when the perl script finishes:
sh: line 1: -n: command not found
Even though when I run the command on the command line it is indeed successful, and I can see the last 5 lines of that particular file.
Not sure what is going on here, or why it works from the command line but through perl it won't recognize the -n option.
Anybody have any suggestions?
$log has an extraneous trailing newline, so you are executing
tail file.log
-n 5 # Tries to execute a program named "-n"
Fix:
chomp($log);
Note that you will run into problems if $log contains shell metacharacters (such as spaces). Fix:
use String::ShellQuote qw( shell_quote );
my $tail_cmd = shell_quote('tail', '-n', '5', '--', $log);
$nexusTime += nexusUploadTime(`$tail_cmd`);
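To see what shell_quote buys you, here is a small illustration with a hypothetical log name containing a space (the exact quoting may vary slightly between versions):
use String::ShellQuote qw( shell_quote );

my $log = "build output.log";   # hypothetical name with a space, just to show the quoting
print shell_quote('tail', '-n', '5', '--', $log), "\n";
# prints something like: tail -n 5 -- 'build output.log'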
ikegami pointed out your error, but I would recommend avoiding external commands whenever possible. They aren't portable and debugging them can be a pain, among other things. You can simulate tail with pure Perl code like this:
use strict;
use warnings;
use File::ReadBackwards;
sub tail {
    my ($file, $num_lines) = @_;
    my $bw = File::ReadBackwards->new($file) or die "Can't read '$file': $!";
    my ($lines, $count);
    while (defined(my $line = $bw->readline) && $num_lines > $count++) {
        $lines .= $line;
    }
    $bw->close;
    return $lines;
}
print tail('/usr/share/dict/words', 5);
Output
ZZZ
zZt
Zz
ZZ
zyzzyvas
Note that if you pass a file name containing a newline, this will fail with
Can't read 'foo
': No such file or directory at tail.pl line 10.
instead of the more cryptic
sh: line 1: -n: command not found
that you got from running the tail utility in backticks.
The answer to this question is to place the option -n 5 before the target file.
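In other words, keeping the original variable names, something like the line below; with the options first, the stray newline at the end of $log falls at the very end of the command, where the shell simply treats it as the end of the line:
$nexusTime += nexusUploadTime(`tail -n 5 $log`);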

Perl search for string and get the full line from text file

I want to search for a string and get the full line from a text file through Perl scripting.
So the text file will be like the following.
data-key-1,col-1.1,col-1.2
data-key-2,col-2.1,col-2.2
data-key-3,col-3.1,col-3.2
Here I want to apply data-key-1 as the search string and get the full line into a Perl variable.
Here I want the exact equivalent of grep "data-key-1" data.csv in the shell.
Some syntax like the following worked while running in the console.
perl -wln -e 'print if /\bAPPLE\b/' your_file
But how can I place it in a script? We can't use the perl command itself inside a script. Is there a way to avoid the loops?
If you knew what the command line options you are giving your one-liner actually do, you'd know exactly what to write inside your perl script. When you read a file, you need a loop. The choice of loop can yield different results performance-wise: using a for loop to read a file is more expensive than using a while loop, because for evaluates the file in list context and pulls all of it into memory before iterating.
Your one-liner:
perl -wln -e 'print if /\bAPPLE\b/' your_file
is basically saying:
-w : Use warnings
-l : Chomp the newline character from each line before processing and place it back during printing.
-n : Create an implicit while(<>) { ... } loop to perform an action on each line
-e : Tell perl interpreter to execute the code that follows it.
print if /\bAPPLE\b/ : Print the entire line if the line contains the word APPLE.
So to use the above inside a perl script, you'd do:
#!/usr/bin/perl
use strict;
use warnings;
open my $fh, '<', 'your_file' or die "Cannot open file: $!\n";
while (<$fh>) {
    next unless /\bAPPLE\b/;
    my $line = $_;
    # do something with $line
}
chomp is not really required here because you are not doing anything with the line other than checking for the existence of a word.
open(my $fh, '<', 'filename') or die "Could not open file: $!";
while (<$fh>) {
    print $_ if ($_ =~ /^data-key-3,/);
}
close $fh;
use strict;
use warnings;
# the file name of your .csv file
my $file = 'data.csv';
# open the file for reading
open(FILE, "<$file") or
    die("Could not open log file. $!\n");
my @final;
# process line by line:
while (<FILE>) {
    my ($line) = $_;
    # remove any trailing space (the newline)
    # not necessary, but again, a good habit
    chomp($line);
    my @result = grep (/data-key-1/, $line);
    push(@final, @result);
}
print @final;
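If all you want is the single matching line in a scalar (the closest equivalent of grep "data-key-1" data.csv), a short sketch like this would do, assuming the data.csv from the question:
open my $fh, '<', 'data.csv' or die "Could not open data.csv: $!";
my ($match) = grep { /^data-key-1,/ } <$fh>;   # first matching line, if any
close $fh;

if (defined $match) {
    chomp $match;
    print "$match\n";
}
Note that <$fh> in list context still reads the whole file into memory behind the scenes, so for very large files the while loop shown above is the safer choice.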

Searching through file on Perl

Okay so I need to read in a file, go through each line and find where there is the string ERROR. This is what I have so far:
open(LOGFILE, "input.txt") or die "Can't find file";
$title = <LOGFILE>;
$\=' ' ;
while (<>){
foreach $title(split){
while (/^ERROR/gm){
print "ERROR in line $.\n";
}
}
}
close LOGFILE;
So the problem that I have is that it only looks at the first word of each line. So if the input is
boo far ERROR
It won't register an error. any help would be greatly appreciated! I'm new to perl so please try and keeps things basic. Thanks!
This is a more elegant approach, and I fixed the regex issue: ^ matches the start of a line.
open(LOGFILE, "input.txt") or die "Can't find file";
while (<LOGFILE>) {
    print "ERROR in line $.\n" if (/ERROR/);
}
close LOGFILE;
Or how about from the command line:
perl -n -e 'print "ERROR in line $.\n" if(/ERROR/);' input.txt
-n implicitly loops for all lines of input
-e executes a line of code
To output to a file:
perl -n -e 'print "ERROR in line $.\n" if(/ERROR/);' input.txt > output.txt
While this is a good/simple example of using Perl, if you're using a Unix shell, grep does what you want with no need for scripting (thanks to TLP in the OP comments):
grep -n ERROR input.txt > output.txt
This actually prints the matching line itself, with its line number.
Of course it won't, because ^ in front of your regexp means "start of line". Remove it and it will catch ERROR anywhere in the line. You shouldn't do any splitting tricks either. You need to find ERROR anywhere? Then just write it like so:
while (<>) {
    if (/ERROR/) {
        print "ERROR in line $.\n";
    }
}
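A quick check against the example line from the question shows the difference the anchor makes:
my $line = "boo far ERROR";
print "matches\n"  if     $line =~ /ERROR/;    # prints: ERROR can be anywhere in the line
print "no match\n" unless $line =~ /^ERROR/;   # prints: ^ ties the match to the start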

execute command line command from Perl?

Hi all,
I need to have this command line command executed from a Perl file:
for file in *.jpg ; do mv $file `echo $file | sed 's/\(.*\.\)jpg/\1jp4/'` ; done
So I added the folders and tried:
system "bash -c 'for file in $mov0to/*.jp4 ; do mv $file `echo $basedir/internal/0/$file | sed 's/\(.*\.\)jp4/\1jpg/'` ; done'";
But all I get is:
sh: Syntax error: "(" unexpected
No file specified
I am on Kubuntu 10.4
Thanks,
Jan
I can think of many better ways of doing this, but ideally you want pure Perl:
use File::Copy qw( move );
opendir DIR, $mov0to or die "Unable to open $mov0to: $!";
foreach my $file ( readdir DIR ) {
    my $out = $file;
    next unless $out =~ s/\.jp4$/.jpg/;
    move "$mov0to/$file", "$basedir/internal/0/$out"
        or die "Unable to move $mov0to/$file -> $basedir/internal/0/$out: $!";
}
closedir DIR;
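An equivalent sketch using glob instead of opendir/readdir, under the same assumptions about $mov0to and $basedir, in case that reads more naturally:
use File::Copy qw( move );
use File::Basename qw( basename );

for my $file ( glob "$mov0to/*.jp4" ) {
    ( my $out = basename($file) ) =~ s/\.jp4$/.jpg/;   # swap the extension on the bare name
    move $file, "$basedir/internal/0/$out"
        or die "Unable to move $file: $!";
}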
If you insist on doing the lifting in Bash, you should be doing:
system qq(bash -c 'for file in $mov0to/*.jp4 ; do mv \$file $basedir/internal/0/\${file\/%.jp4\/.jpg} ; done');
All of the variables inside the double quotes will get interpolated, the $file that you're using in the bash for loop in particular. We don't have enough context to know where the other variables ($mov0to and $basedir) come from or what they contain, so they might have the same problem. The double quotes are also eating your backslashes in the sed part.
Andy White is right, you'd have less of a mess if the bash commands were in a separate script.
It's most likely a problem with nested quotes. I would recommend either moving that bash command into its own (bash) script file, or writing the loop/logic directly in Perl code, making simpler system calls if you need to.
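If you do keep the bash loop, one way to sidestep the quoting mess entirely is the list form of system(), which skips the outer shell; here rename.sh is a hypothetical helper script that holds the original for/mv loop and takes the two directories as arguments:
# rename.sh is hypothetical; it would contain the for/mv loop and
# read its source and destination directories from $1 and $2.
system('bash', 'rename.sh', $mov0to, "$basedir/internal/0") == 0
    or die "rename.sh failed: exit status $?";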