Handling and generating files in Perl code - perl

I have perl code (called aggregator.pl) that reads some data from a file called 'testdata.csv' through the command
open (F,'testdata.csv');
my @lines=<F>;
close(F);
opens a new file handle for the output
open (F1,'>>testdata_aggregates.csv');
and appends to this output file 'testdata_aggregates.csv' the results of some calculations.
To launch my perl code I simply type in my command prompt:
$ perl aggregator.pl
Now I have several data files, named e.g. 20100909.csv or 20100910.csv. I would like to change my Perl code so that, when I launch it from the command prompt, I tell Perl the name of the input file it should use (say, '20100909.csv') and it names the output file accordingly (say, '20100909_aggregates.csv'; basically, adding _aggregates to the input filename).
Any advice on how to change my code, and how should I launch the new code so that it receives the name of the input file?

Just accept parameters via @ARGV.
Your script should open with:
use strict;
use warnings;
use autodie;
die "Usage: $0 Input_File Output_File\n" if @ARGV != 2;
my ($infile, $outfile) = @ARGV;
And later in your file
open (F, '<', $infile);
# ...
open (F1,'>>', $outfile);
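The question also asks for the output name to be derived from the input name. A minimal sketch of that, assuming the input files end in .csv; the helper name aggregates_name is made up for illustration and is not part of the original script:

```perl
use strict;
use warnings;

# Hypothetical helper: turn "20100909.csv" into "20100909_aggregates.csv".
# Names without a .csv suffix simply get "_aggregates" appended.
sub aggregates_name {
    my ($infile) = @_;
    my $outfile = $infile;
    $outfile =~ s/\.csv\z/_aggregates.csv/ or $outfile .= '_aggregates';
    return $outfile;
}

# Possible use at the top of aggregator.pl (sketch, one argument only):
#   die "Usage: $0 Input_File\n" if @ARGV != 1;
#   my $infile  = $ARGV[0];
#   my $outfile = aggregates_name($infile);
```

With this variant the script would be launched as: perl aggregator.pl 20100909.csv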

One would usually rewrite such an application that it reads from STDIN and simply writes to STDOUT. When the program is then invoked on the command line, redirection operators can be used to specify the files:
$ perl aggregator.pl <testdata.csv > testdata_aggregates.csv
$ perl aggregator.pl <20100909.csv > 20100909_aggregates.csv
...
What changes inside the script? We don't open a file F, instead: my @lines = <>. We don't print to F1, instead we print to STDOUT, which is selected implicitly: print F1 "foo\n" becomes print "foo\n".

Related

How to make output file of one script, as an input file for the another script?

I am currently working on a project in which I am using Perl to create a command-line version of an online tool.
There are nine modules in total (each module has its own Perl script).
The command-line application should work as follows:
Out of these nine modules, the user can select any number of modules (in short, a pipeline should be built).
After the first selected module runs, output files are generated.
The output file of the first module should be taken as the input file by the next module the user selected.
My question is how to make the output file of the first module the input file for the next selected module.
It would be a great help if you could answer this, as I am new to Perl programming.
Thanking you!
Tamar is right. You can use the pipe operator, "|". It works whether you are on Windows or a Unix-based operating system.
Here's a simple example of what you're doing:
Code to output data
out.pl
#!/usr/bin/perl
use strict;
use warnings;
my $file = "output.txt";
my $data = "gasp";
unless(-e $file){
open(my $fh, '>', $file) or die "Cannot open $file: $!";
print $fh $data;
close $fh;
}
Code that takes input file
in.pl
#!/usr/bin/perl
use strict;
use warnings;
my $gaspage = <STDIN>;
chomp $gaspage;
print $gaspage."\n";
Then you just run the commands below, either from within your Perl application or in the terminal:
perl out.pl
cat output.txt | perl in.pl
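For a chain of modules, each script can be written as a filter that reads from one handle and writes to another, so any selection of modules can be piped together. A minimal sketch; the name run_stage and the upper-casing transformation are placeholders, not anything from the actual tool:

```perl
use strict;
use warnings;

# Hypothetical pipeline step: copies input to output, upper-casing each line.
# A real module would do its own processing inside the loop.
sub run_stage {
    my ($in, $out) = @_;
    while (my $line = <$in>) {
        print {$out} uc $line;
    }
    return;
}

# In a real module script the last line would be:
#   run_stage(\*STDIN, \*STDOUT);
```

Two such scripts (file names assumed) could then be chained as: perl module1.pl < input.txt | perl module2.pl > result.txt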

How to read a .conf file in Perl

I just created a text file, test.conf, with some information. How can I read it in Perl?
I am new to Perl and I am not sure what I need to do.
I tried the following:
C:\Perl\Perl_Project>perl
#!/usr/local/bin/perl
open (MYFILE, 'test.conf');
while (<MYFILE>)
{ chomp; print "$_\n"; }
close (MYFILE);
I installed Perl on my laptop, which runs Windows 7, and I am using the command line.
Instead of typing the program at the command line, write it in a file (any editor will do; I would suggest Notepad++) and save it as myprogram.pl in the same directory as your .conf file.
use warnings;
use strict;
open my $fh, "<", "test.conf" or die $!;
while (<$fh>)
{
chomp;
print "$_\n";
}
close $fh;
Now open a command prompt, go to the directory that contains both myprogram.pl and test.conf, and execute your program by typing:
perl myprogram.pl
Alternatively, give the full path of the input file inside the program; you can then run the program from any directory by giving its full path:
perl path\to\myprogram.pl
Side note: Always put use warnings; and use strict; at the top of your program, and always open files with a lexical filehandle, the three-argument form of open, and error handling.
This is an extended comment more than an answer, as I believe @serenesat has given you everything you need to execute your program.
When you do "command line" Perl, it's typically stuff that is relatively brief or trivial, such as:
perl -e "print 2 ** 16"
Anything that goes beyond a few lines, and you're probably better off putting that in a file and having Perl run the file. You certainly can put larger programs on the command line, but when it comes to going back in and editing lines, it becomes more of a hassle than a shortcut.
Also, for what it's worth the -n and -p parameters allow you to process the contents of a stream, meaning you could do something like this:
perl -ne "print if /oracle/i" test.conf

Perl File pointers

I have a question concerning these two files:
1)
use strict;
use warnings;
use v5.12;
use externalModule;
my $file = "test.txt";
unless(unlink $file){ # unlink is UNIX equivalent of rm
say "DEBUG: No test.txt present";
}
unless (open FILE, '>>'.$file) {
die("Unable to create $file");
}
say FILE "This name is $file"; # print newline automatically
unless(externalModule::external_function($file)) {
say "error with external_function";
}
print FILE "end of file\n";
close FILE;
and external module (.pm)
use strict;
use warnings;
use v5.12;
package externalModule;
sub external_function {
my $file = $_[0]; # first argument
say "externalModule line 11: $file";
# create a new file handler
unless (open FILE, '>>'.$file) {
die("Unable to create $file");
}
say FILE "this comes from an external module";
close FILE;
1;
}
1; # return true
Now,
In the first perl script line 14:
# create a new file handler
unless (open FILE, '>>'.$file) {
die("Unable to create $file");
}
If I would have
'>'.$file
instead, then the string printed by the external module will not be displayed in the final test.txt file.
Why is that??
Kind Regards
'>' means open the file for output, possibly overwriting it ("clobbering"). '>>' means append to it if it already exists.
BTW, it is recommended to use 3 argument form of open with lexical file-handles:
open my $FH, '>', $file or die "Cannot open $file: $!\n";
If you use >$file in your main function, it will write to the start of the file, and buffer output as well. So after your external function returns, the "end of file" will be appended to the buffer, and the buffer flushed -- with the file pointer still at position 0 in the file, so you'll just overwrite the text from the external function. Try a much longer text in the external function, and you'll see that the last part of it remains, with the first part getting overwritten.
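The buffering effect described above can be reproduced on a scratch file. This is a sketch only: clobber_demo and the sample strings are made up, and it assumes Perl's default buffered output on a regular file (the short strings stay in the buffer until close).

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: a "main" handle opened with $mode ('>' or '>>')
# and a second handle that appends to the same file in between.
sub clobber_demo {
    my ($mode) = @_;
    my (undef, $path) = tempfile(UNLINK => 1);

    open my $main, $mode, $path or die "Cannot open $path: $!";
    print {$main} "main: first line\n";    # stays in $main's buffer for now

    open my $other, '>>', $path or die "Cannot open $path: $!";
    print {$other} "other\n";
    close $other or die "Cannot close $path: $!";   # written to disk here

    print {$main} "main: last line\n";
    close $main or die "Cannot close $path: $!";    # buffer flushed here

    open my $in, '<', $path or die "Cannot read $path: $!";
    local $/;                               # slurp the whole file
    my $content = <$in>;
    close $in;
    return $content;
}

print "--- '>' ---\n",  clobber_demo('>');
print "--- '>>' ---\n", clobber_demo('>>');
```

With '>' the flush at close starts writing at offset 0, on top of what the second handle wrote; with '>>' every write is appended, so the second handle's line survives.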
The two-argument open with a bareword file handle is very old syntax and is not recommended in modern Perl. The three-argument version, which was "only" introduced in Perl 5.6 about 10 years ago, is preferred. So is using a lexical variable for a file handle, rather than an uppercase bareword.
Anyway, >> means open for append, whereas > means open for write, which will remove any existing data in the file.
You're clobbering your file when you reopen it once more. The > means open the file for writing, and delete the old file if it exists and create a new one. The >> means open the file for writing, but append the data if the file already exists.
As you can see, it's very hard to pass FILE back and forth between your module and your program.
The latest syntax is to use lexically scoped variables for file handles:
use autodie;
# Here I'm using the newer three parameter version of the open statement.
# I'm also using '$fh' instead of 'FILE' to store the pointer to the open
# file. This makes it easier to pass the file handle to various functions.
#
# I'm also using "autodie". This causes my program to die due to certain
# errors, so if I forget to test, I don't cause problems. I can use `eval`
# to test for an error if I don't want to die.
open my $fh, ">>", $file; # No die if it doesn't open thx to autodie
# Now, I can pass the file handle to whatever my external module needs
# to do with my file. I no longer have to pass the file name and have
# my external module reopen the file
externalModule::external_function( $fh );
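On the module side, the function then has to accept a handle rather than a file name. A minimal sketch of how that could look; the module is shown in the same file only to keep the example self-contained:

```perl
use strict;
use warnings;
use v5.12;

package externalModule;

# Takes an already-open, writable file handle instead of a file name,
# so nothing is reopened and all output goes through one buffer.
sub external_function {
    my ($fh) = @_;
    say {$fh} "this comes from an external module";
    return 1;
}

package main;

# Caller side (sketch): open once, hand the handle over, keep writing.
#   open my $fh, '>', $file or die "Cannot open $file: $!";
#   say {$fh} "This name is $file";
#   externalModule::external_function($fh);
#   say {$fh} "end of file";
#   close $fh;
```

Because only one handle is ever open on the file, the lines end up in the order they were printed, whether the file was opened with '>' or '>>'.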

how to create a script from a perl script which will use bash features to copy a directory structure

Hi, I have written a Perl script which copies an entire directory structure from source to destination. Now the Perl script also has to create a restore script that undoes what it has done, that is, a shell script which uses bash features to restore the contents from the destination back to the source. I am struggling to find the right function or command to copy recursively (recursion is not a requirement), but I want exactly the same structure as before.
Below is how I am trying to create a file called restore to do the restoration.
I am particularly looking for the algorithm.
Also, restore should restore the structure to a directory given on its command line if one is supplied; if not, you can assume the default input supplied to the Perl script:
$source
$target
In this case we want to copy from target to source.
So we have two different parts in one script:
1. It copies from source to destination.
2. It creates a script file which undoes what part 1 has done.
I hope this makes it clear.
unless(open FILE, '>'."$source/$file")
{
# Die with error message
# if we can't open it.
die "\nUnable to create $file\n";
}
# Write some text to the file.
print FILE "#!/bin/sh\n";
print FILE "$1=$target;\n";
print FILE "cp -r \n";
# close the file.
close FILE;
# here we change the permissions of the file
chmod 0755, "$source/$file";
The last problem I have is that I cannot get a literal $1 into my restore file, because Perl treats it as one of its own variables.
I need it to read the command-line input when I run restore, e.g. with ./restore /home/xubuntu/User, $0 is ./restore and $1 is /home/xubuntu/User.
First off, the standard way in Perl for doing this:
unless(open FILE, '>'."$source/$file") {
die "\nUnable to create $file\n";
}
is to use the or operator:
open my $file_fh, ">", "$source/$file"
or die qq(Unable to create "$file": $!);
It's just easier to understand.
A more modern way would be use autodie; which will handle all IO problems when opening or writing to files.
use strict;
use warnings;
use autodie;
open my $file_fh, '>', "$source/$file";
You should look at the Perl Modules File::Find, File::Basename, and File::Copy for copying files and directories:
use File::Find;
use File::Basename;
use File::Copy;
my @file_list;
find ( sub {
return unless -f;
push @file_list, $File::Find::name;
},
$directory );
Now, @file_list will contain all the files in $directory.
for my $file ( @file_list ) {
my $directory = dirname $file;
mkdir $directory unless -d $directory;
copy $file, ...;
}
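Note that autodie will also terminate your program if mkdir fails. The copy function comes from File::Copy rather than being a Perl built-in, so as far as I know autodie does not cover it; check its return value yourself (copy ... or die ...).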
I didn't fill in the copy command because where you want to copy and how may differ. Also you might prefer use File::Copy qw(cp); and then use cp instead of copy in your program. The copy command will create a file with default permissions while the cp command will copy the permissions.
You didn't explain why you wanted a bash shell command. I suspect you wanted to use it for the directory copy, but you can do that in Perl anyway. If you still need to create a shell script, the easiest way is via a here-document:
print {$file_fh} <<END_OF_SHELL_SCRIPT;
Your shell script goes here
and it can contain as many lines as you need.
Since there are no quotes around `END_OF_SHELL_SCRIPT`,
Perl variables will be interpolated
This is the last line. The END_OF_SHELL_SCRIPT marks the end
END_OF_SHELL_SCRIPT
close $file_fh;
See Here-docs in Perldoc.
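To get a literal $1 into the generated script, escape the dollar sign inside the interpolating here-doc. A rough sketch of a restore-script generator; the function name restore_script is hypothetical, and it assumes the paths contain no double quotes or other characters special to the shell:

```perl
use strict;
use warnings;

# Hypothetical generator: returns the text of a restore script that copies
# $target back to the directory given as its first argument, or to $source
# when no argument is supplied.
sub restore_script {
    my ($source, $target) = @_;
    # "\$" keeps a literal dollar sign for the shell; $source and $target
    # are interpolated by Perl when the script text is built.
    return <<"END_OF_SHELL_SCRIPT";
#!/bin/sh
dest="\${1:-$source}"
cp -r "$target/." "\$dest"
END_OF_SHELL_SCRIPT
}

print restore_script('/home/xubuntu/User', '/backup/User');
```

The returned text can be printed to the restore file handle and made executable with chmod 0755, as in the question.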
First, I see that you want to generate a copy script. If you only need to copy files, you can use:
system("cp -r /sourcepath /targetpath");
Second, if you need to copy subfolders, you can use -r switch, can't you?

How can I get a filehandle for a Perl program's output?

I have an encrypted file X1 and a Perl program P1 that decrypts X1. I parse the decrypted file using a Perl program P2.
X1 --P1 (decrypter)--> X2 (plain text file) --P2 (parser)--> parse output
My parser is based on XML::Parser. It can work with a filehandle to the decrypted file. At the moment I produce X2, store it in the file system, and then read and parse it in P2. Is there a way to get a filehandle on P1's output directly and use it in P2 to parse without needing a temporary file?
Say you're using very weak encryption:
#! /usr/bin/perl
print <<EOXML;
<doc>
<elem attr="Hello, world!" />
</doc>
EOXML
Using open $fh, "-|", ... will create a pipe connected to the standard output of a child process:
#! /usr/bin/perl
use warnings;
use strict;
open my $decrypted, "-|", "./decrypt"
or die "$0: open: $!";
while (<$decrypted>) {
print "got: $_";
}
Output:
got: <doc>
got: <elem attr="Hello, world!" />
got: </doc>
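The same pipe open also has a list form that bypasses the shell, which helps when the decrypter takes arguments. A small sketch, using a child perl as a stand-in for the decrypter; lines_from_command is a made-up helper name:

```perl
use strict;
use warnings;

# Hypothetical helper: run a command and return its standard output as lines.
sub lines_from_command {
    my @cmd = @_;
    # List form of the pipe open: no shell is involved, so the arguments
    # need no shell quoting.
    open my $pipe, '-|', @cmd or die "Cannot start $cmd[0]: $!";
    my @lines = <$pipe>;
    close $pipe or die "$cmd[0] failed (status $?)";
    return @lines;
}

# Stand-in for the decrypter: the current perl ($^X) printing a tiny document.
my @xml = lines_from_command($^X, '-e', 'print qq{<doc>\n</doc>\n}');
print "got: $_" for @xml;
```

Instead of slurping the lines, the open handle itself could be handed to whatever code expects a filehandle on the decrypted data.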
I'm not sure I entirely understand what you're getting at, but it sounds like you just want to use pipes. You can do that at the shell by redirecting one program's STDOUT to another's STDIN
$ foo | bar
Or you can do it within Perl by opening a pipe directly to another program.
See also IPC::Open3 if you need control over STDERR.
Why don't you just make your programs read from STDIN and write to STDOUT and pipe the commands together on the command line?