How to pass a whole file to a Perl script?

I want to have a Perl script that receives a file and does some computation based on it.
Here is my attempt:
Perl.pl
#!/usr/bin/perl
use strict;
use warnings;
my $book = <STDIN>;
print $book;
Here is my execution of the script:
./Perl.pl < textFile
My script only prints the first line of textFile. How can I load all of textFile into my variable $book?
I want the file to be passed in that way; I do not want to use Perl's open(...).

Assigning a value from a file handle to a scalar pulls it one line at a time.
You can either:
use a while loop to append the lines one by one until there are none left or
set $/ (to undef) to change your script's idea of what constitutes a line. There is an example of the latter in perldoc perlvar (read it as it explains best practises for changing it).
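As a minimal sketch of the second option: localizing $/ inside a small helper keeps the change from leaking into the rest of the program. The helper name slurp_handle and the in-memory demo text are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the entire remaining contents of a filehandle as one string.
# (slurp_handle is a hypothetical helper name, not a built-in.)
sub slurp_handle {
    my ($fh) = @_;
    local $/;                 # undef: "a line" now means "the whole input"
    my $content = <$fh>;      # a single read returns everything that is left
    return defined $content ? $content : '';
}

# Demonstration on an in-memory filehandle so the sketch runs on its own.
my $text = "first line\nsecond line\n";
open my $demo, '<', \$text or die "cannot open in-memory handle: $!";
my $book = slurp_handle($demo);
close $demo;
print $book;
```

With the redirect from the question (./Perl.pl < textFile), the call would be my $book = slurp_handle(\*STDIN); instead of the in-memory handle.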

You can also use Path::Class to make this easy. It is a wrapper around many file manipulation modules.
For your purpose:
#! /usr/bin/perl
use Path::Class qw/file/;
my $file = file(shift @ARGV);
print $file->slurp;
You can run it by:
./slurp.pl textFile

The answer you're looking for is in the Perl FAQ.
How can I read in an entire file all at once?

Related

How to print the output to the same file you are reading from in Perl?

I have a perl script which reads a file, changes the required thing and then prints the output of the file on the console.
I want the output to be updated in the same file from where it is picking the data.
How can this be done?
You can use the -i switch, or the $^I special variable.
perl -i.backup -pe 's/change me/something else/'
or
#!/usr/bin/perl
use warnings;
use strict;
$^I = '.backup';
while (<>) {
...
print;
}
Note that it only works for the special file handle *ARGV used by the diamond operator. It creates a new file behind the scenes, anyway.
See perlrun and perlvar.
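A small self-contained sketch of the $^I approach, assuming you want the edit wrapped in a subroutine. Localizing $^I and @ARGV together keeps the in-place mode from affecting any other use of <> in the program. The name edit_in_place and the file demo.txt are hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Replace every literal occurrence of $from with $to inside $file,
# keeping the original as "$file.backup".
sub edit_in_place {
    my ($file, $from, $to) = @_;
    local ($^I, @ARGV) = ('.backup', $file);   # in effect only inside this sub
    while (<>) {
        s/\Q$from\E/$to/g;
        print;                                 # goes to the new file, not the console
    }
    return;
}

# Demonstration on a throwaway file (hypothetical name demo.txt).
my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/demo.txt";
open my $out, '>', $file or die "cannot write $file: $!";
print {$out} "please change me\n";
close $out or die "cannot close $file: $!";

edit_in_place($file, 'change me', 'something else');
```

After the call, demo.txt holds the edited text and demo.txt.backup holds the original.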

Calling one Perl program from another

I have two Perl files and I want to call one file from the other with arguments.
First file a.pl
$OUTFILE = "C://programs/perls/$ARGV[0]";
# this should be some out file created inside work like C://programs/perls/abc.log
Second File abc.pl
require "a.pl" "abc.log";
# $OUTFILE is a variable inside a.pl and want to append current file's name as log.
I want it to create an output file with the name of log as that of current file.
One more constraint I have is to use $OUTFILE in both a.pl and abc.pl.
If there is any better approach please suggest.
The require keyword only takes one argument. That's either a file name or a package name. Your line
require "a.pl" "abc.log";
is wrong. It gives a syntax error along the lines of String found where operator expected.
You can require one .pl file from another .pl, but that is very old-fashioned, badly written Perl code.
If neither file defines a package then the code is implicitly placed in the main package. You can declare a package variable in the outside file and use it in the one that is required.
In abc.pl:
use strict;
use warnings;
# declare a package variable
our $OUTFILE = "C://programs/perls/filename";
# load and execute the other program
require 'a.pl';
And in a.pl:
use strict;
use warnings;
# do something with $OUTFILE, like use it to open a file handle
print $OUTFILE;
If you run this, it will print
C://programs/perls/filename
You should convert the Perl file you want to call into a Perl module:
Hello.pm
#!/usr/bin/perl
package Hello;
use strict;
use warnings;
sub printHello {
print "Hello $_[0]\n"
}
1;
Then you can call it:
test.pl
#!/usr/bin/perl
use strict;
use warnings;
# you have to put the current directory to the module search path
use lib (".");
use Hello;
Hello::printHello("a");
I tested it in Git Bash on Windows; you may have to make some modifications for your environment.
This way you can pass as many arguments as you like, and you don't have to hunt through the called file for variables that are used there but may never have been initialized (I think that approach is less safe; for example, sometimes you will delete something you didn't really mean to). The disadvantage is that you need to learn a bit about Perl modules, but I think it is definitely worth it.
A second approach would be to use exec or system (you can pass arguments that way too, if forking a child process is acceptable), but that is another story.
I would do this another way. Have the program take the name of the log file as a command-line parameter:
% perl a.pl name-of-log-file
Inside a.pl, open that file to append to it then output whatever you like. Now you can run it from many other sorts of places besides another Perl program.
# a.pl
my $log_file = $ARGV[0] // 'default_log_name';
open my $fh, '>>:utf8', $log_file or die ...;
print { $fh } $stuff_to_output;
But you could also call it from another Perl program. $^X is the path to the currently running perl, and this uses system in the slightly safer list form:
system $^X, 'a.pl', $name_of_log_file;
How you get something into $name_of_log_file is up to you. In your example you already knew the value in your first program.
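If you also want to know whether the called program succeeded, a sketch like the following checks the value that system returns. The wrapper name run_perl_script is made up; the status decoding follows the conventions described for $? in perlvar.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Run another Perl script with the same perl binary and return its exit code.
# (run_perl_script is a hypothetical helper name.)
sub run_perl_script {
    my ($script, @args) = @_;
    my $status = system $^X, $script, @args;     # list form: no shell involved
    die "could not start $script: $!\n" if $status == -1;
    die "$script died from signal " . ($status & 127) . "\n" if $status & 127;
    return $status >> 8;                         # the child's exit code
}
```

A call such as run_perl_script('a.pl', $name_of_log_file) == 0 or die "a.pl failed\n"; would then stop the caller when a.pl reports an error.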

How to execute a script from another so that it also sets variables for the caller script

I reviewed many examples online about running another process (a Perl script, a shell command, or a program), but did not find any that fit my needs.
(As the answers received so far show that my request was not understood, I will try to state it briefly, leaving everything written earlier as an example of what I have already tried.)
I need:
- In the caller script, to set parameters for the second script before calling it (so I could not use do script2.pl, as it is executed before the first script starts to run);
- In the second script, to set some variables that will be used in the caller script (so it is not useful to run the second script with system() or backticks);
- and, as I need to use those variables in the first script, to come back to the first script after the second one completes.
(I hope it is now clearer what I need.)
(I reviewed system(), backticks, exec() and open(), and none of them is suitable.)
I would like to run another Perl script from a first one without exiting (as exec() does) and without capturing the STDOUT of the called script (as backticks do), but with its output printed as in the initial script (as system() does), although I do not need the return status (as system() gives);
but I would like the called script to set some variables that will be accessible in the calling script (set, of course, with our @set_var;).
My attempt (which I cannot make do what I need) is:
Script1 is something like:
...
if($condition)
{ local $0 = 'script2.pl';
local @ARGV = ('first-arg', 'second_arg');
do script2.pl;
}
print "set array is: '@set_var'\n";
...
The 'script2' would have something like:
#!/usr/bin/perl
...
print "having input parameters: '@ARGV'\n";
... # all script activities
our @set_var = ($val1, $val2, $val3);
exit 0;
The problem in my code is that the do ... command is executed at the beginning of the first script's run and not at the place that has been prepared for it (by setting some local ... variables).
I tried using eval "do script2.pl":
- now it is executed in the proper place, but it does not set @set_var in the first script's process.
Is there any way to do this the way I would like?
(I understand that I could rewrite script2.pl, wrapping the whole processing in a function (say, main()), load it with require() and execute main(); that would do everything as I want. But I would like to leave the second script as-is, so that it stays executable from the shell by itself, as it is now.
... and I do not like passing values through a flat file.)
Does anybody have an idea how to satisfy my whim?
This works just fine:
script2.pl
use strict;
our @set_var = ("foo","bar");
script1.pl
use strict;
our @set_var;
do './script2.pl';
print "@set_var\n";
$ perl script1.pl
foo bar
But it does not if you use:
script2.pl
use strict;
our @set_var = ("foo","bar");
exit 0;
There is only a single perl process in this example, so calling exit, even from the second script, exits your program.
If you don't want to remove the exit call in the second script, we can work around that with some CORE::GLOBAL namespace hacking. The gist is to redirect the exit function to your own custom function that you can manipulate when the second script runs.
script1.pl
BEGIN { *CORE::GLOBAL::exit = *my_exit };
use strict;
sub my_exit { goto &CORE::exit }
our @set_var;
{
local *my_exit = sub { warn "Not exiting" };
do './script2.pl';
}
print "@set_var\n";
script2.pl
use strict;
our @set_var = ("foo","bar");
exit 0;
$ perl script1.pl
Not exiting at script1.pl line 7.
foo bar
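One thing to keep in mind with do FILE is that failures are silent unless you check for them. The following is a rough sketch, modeled on the checks suggested in perldoc -f do, that also passes arguments through a localized @ARGV. The helper name run_in_place and the generated stand-in for script2.pl are made up for the demonstration, and the loaded script is assumed to end with a defined value and not to call exit.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);

our @set_var;   # package variable the loaded script is expected to fill

# Run $script inside the current process with its own @ARGV, then come back.
# (run_in_place is a hypothetical helper name.)
sub run_in_place {
    my ($script, @args) = @_;
    local @ARGV = @args;                 # what the script sees as its arguments
    my $result = do $script;             # pass './name.pl' or a full path
    die "cannot compile or run $script: $@" if $@;
    die "cannot read $script: $!"           unless defined $result;
    return $result;
}

# Stand-in for script2.pl, written to a temporary file for the demonstration.
my ($fh, $script2) = tempfile(SUFFIX => '.pl', UNLINK => 1);
print {$fh} q{our @set_var = (@ARGV, 'done'); 1;};
close $fh or die "cannot close $script2: $!";

run_in_place($script2, 'first-arg', 'second_arg');
print "set array is: '@set_var'\n";
```

Because @ARGV is localized inside the helper, the caller's own arguments are untouched once the second script has finished.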
(OK, in the end I asked the question myself and am answering it myself, too!)
(After additional review, I realized that the 'mod' solution does use it, but I did not understand the advice!
I am sorry: it was my mistake to step over the real solution!)
The solution to my question is simple! It is:
do EXPR;
That way
- the second script is executed at the place where it is called, so anything defined and set in the first one is usable in the second one;
- it prints to STDOUT everything it should print (the second script, that is);
- any variables or objects that are defined in the second script's process are accessible in the first one after coming back; and
- control is returned to the position immediately after the second script's execution, and the first script's commands continue to be processed!
Simple! I am just amazed that I forgot about the do ... command. I have used it more than once already!
And I am very disappointed by this forum!
While it is badly designed for displaying communication, its participants, instead of reviewing the Perl issue, are much more concerned with moderating others and teaching them how to live in such a nice forum!
I am not really sure what you are trying to do exactly, but something along these lines should be very close.
test.pl
#!/usr/bin/perl
use strict;
use warnings;
use feature 'say';
use IPC::System::Simple qw(system);
say $0;
system($^X, "sample.pl", @ARGV);
$ perl test.pl first-arg second-arg
test.pl
sample.pl
$VAR1 = [
'first-arg',
'second-arg'
];
sample.pl
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;
use feature 'say';
say $0;
print Dumper \@ARGV;
I used the module IPC::System::Simple. You can also capture the output of the script (sample.pl) through IPC::System::Simple::capture.
Update: Maybe you can use Storable. This way you can pass new values from script 2 (sample.pl) back to script 1 (test.pl).
test.pl
#!/usr/bin/perl
use strict;
use warnings;
use Storable;
use Data::Dumper;
use feature 'say';
use IPC::System::Simple qw(system);
say $0;
system($^X, "sample.pl", @ARGV);
my $hashref = retrieve('sample');
print Dumper $hashref;
__END__
$ perl test.pl first-arg second-arg
test.pl
sample.pl
$VAR1 = [
'first-arg',
'second-arg'
];
$VAR1 = {
'arg1' => 'test1',
'arg2' => 'test2'
};
sample.pl
#!/usr/bin/perl
use strict;
use warnings;
use Storable;
use Data::Dumper;
use feature 'say';
say $0;
print Dumper \@ARGV;
my %hashArgs = ( arg1 => 'test1',
arg2 => 'test2', );
store \%hashArgs, 'sample';

Finding the standard out for a perl program

I'm redirecting standard out for a perl program. Example:
perl run_program.pl > /log/run_program.log
Is there a way to know what the standard out is? In this case I'm looking to get the value '/log/run_program.log'.
If it's not possible, is there another/better way to get the same result?
Thanks in advance!
EDIT: The reason I'm not setting STDOUT in the program is that I'm calling a bunch of .pm files that have print lines that I want to go to STDOUT without having to pass the file to them.
On my system, you can use
readlink("/proc/$$/fd/1")
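A slightly fuller sketch of that idea, assuming a Linux-style /proc filesystem. On other systems the link does not exist and the function returns nothing. The name stdout_target is made up, and the result can be something other than a file path (for example a terminal device or a pipe: entry).

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Linux-specific: /proc/<pid>/fd/1 is a symlink to whatever STDOUT points at.
# (stdout_target is a hypothetical helper name.)
sub stdout_target {
    my $link = "/proc/$$/fd/1";
    if (-l $link) {
        return readlink $link;   # e.g. a log file path, a tty, or "pipe:[...]"
    }
    return;                      # no /proc entry: we cannot tell
}

my $target = stdout_target();
if (defined $target) {
    print STDERR "STDOUT goes to: $target\n";
}
else {
    print STDERR "cannot tell where STDOUT goes on this system\n";
}
```

The message is printed to STDERR so that it does not end up in the redirected log itself.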
Just to let you know, you might be able to use the select command to redefine the FD for the default output:
use strict;
use warnings;
use autodie;
open my $output_fd, ">", "/log/run_program.log";
my $old_default_fd = select( $output_fd );
print "I'm now going into /log/run_program.log\n";
select($old_default_fd); # Restore the default when you no longer need it
This may work with most of your Perl modules. Just hope that they're not doing something stupid like:
print STDOUT "Ha, ha. I'm still going to STDOUT.\n";
I hate it when Perl modules print stuff.
<soapbox>
To you Perl Module writers:
Perl modules should not be printing (unless that's their main purpose). You should instead return what you want to print and let the caller decide what to do with the output.
</soapbox>
For the first part of your question, no. There's no way for the perl program to know where STDOUT is directed to.
The redirection happens external to the program, and is "wired up" before the perl process even starts. STDOUT could be pointed to a device, a file, or another process (a pipe).
The whole purpose of redirection from stdout to a file is to adapt a program which typically writes to stdout and redirect it to a file. The OS doesn't give you the name of the file, because it figures your program is too stupid to know what to do with a file name.
So your best bet is to get it as my $file_name = shift; and open it yourself. (A shift in the mainline pulls from @ARGV.)
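Here is a sketch of what "open it yourself" could look like when modules print to STDOUT: the program takes the log path as an argument and reopens STDOUT onto it, so every plain print (and print STDOUT) lands in the file. The helper names redirect_stdout and restore_stdout are made up; the dup-and-reopen pattern itself is the one shown in perldoc -f open.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Point STDOUT at $path (appending) and return a duplicate of the old STDOUT.
sub redirect_stdout {
    my ($path) = @_;
    open my $saved, '>&', \*STDOUT or die "cannot duplicate STDOUT: $!";
    open STDOUT, '>>', $path       or die "cannot redirect STDOUT to $path: $!";
    return $saved;
}

# Put STDOUT back to where it pointed before redirect_stdout().
sub restore_stdout {
    my ($saved) = @_;
    close STDOUT              or die "cannot close redirected STDOUT: $!";
    open STDOUT, '>&', $saved or die "cannot restore STDOUT: $!";
    return;
}

# The log path would come from the command line, for example:
#   perl run_program.pl /log/run_program.log
my $log_path = shift @ARGV;
if (defined $log_path) {
    my $saved = redirect_stdout($log_path);
    print "this line, and any print in a loaded module, lands in $log_path\n";
    restore_stdout($saved);
}
```

Since the program now receives the file name itself, it also knows the value the question was asking for.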
Give this idea a try:
...
my $log_path = "/log/run_program.log"; # or derive it from $0 in some manner
open my $log_handler, ">", $log_path or die "Cannot open $log_path: $!";
...
Now you could write a myprint subroutine that calls print $log_handler and use it throughout the program. You could also have a look at OVERRIDING CORE FUNCTIONS in perlsub, although as far as I can tell print is one of the built-ins that cannot be overridden that way, so a declaration like this would probably not intercept existing print calls:
...
use subs 'print';
sub print { } # redefine here
...

Perl Porter Stemmer

I was checking out this Porter stemmer. The comments below say I should change the first line. To what exactly? I tried everything, but the stemmer isn't working. What would a good example be?
#!/usr/local/bin/perl -w
#
# Perl implementation of the porter stemming algorithm
# described in the paper: "An algorithm for suffix stripping, M F Porter"
# http://www.muscat.com/~martin/stem.html
#
# Daniel van Balen (vdaniel@ldc.usb.ve)
#
# October-1999
#
# To Use:
#
# Put the line "use porter;" in your code. This will import the subroutine
# porter into your current name space (by default this is Main:: ). Make
# sure this file, "porter.pm" is in your @INC path (it includes the current
# directory).
# Afterwards use by calling "porter(<word>)" where <word> is the word to strip.
# The stripped word will be the returned value.
#
# REMEMBER TO CHANGE THE FIRST LINE TO POINT TO THE PATH TO YOUR PERL
# BINARY
#
The code I am writing is as follows:
use Lingua::StopWords qw(getStopWords);
use Main::porter;
my $stopwords = getStopWords('en');
@stopwords = grep { $stopwords->{$_} } (keys %$stopwords);
chdir("c:/perl/input");
@files = <*>;
foreach $file (@files)
{
open (input, $file);
while (<input>)
{
open (output,">>c:/perl/normalized/".$file);
chomp;
porter<$_>;
for my $stop (@stopwords)
{
s/\b\Q$stop\E\b//ig;
}
$_ =~s/<[^>]*>//g;
$_ =~ s/[[:punct:]]//g;
print output "$_\n";
}
}
close (input);
close (output);
The code gives no errors except it is not stemming anything!!!
That comment block is full of incorrect advice.
A #! line in a .pm file has no effect. It's a common mistake. The #! line tells Unix which interpreter to run the program with if and only if you run the file as a command line program.
./somefile # uses #! to determine what to run somefile with
/usr/bin/perl somefile # runs somefile with /usr/bin/perl regardless of #!
The #! line does nothing in a module, a .pm file which you use. Perl is already running at that point. The line is nothing but a comment.
The second problem is that your default namespace is main not Main. Casing matters.
Moving on to your code, use Main::porter; should not work. It should be use porter. You should get an error message like Can't locate Main/porter.pm in @INC (@INC contains: ...). If that code runs, perhaps you moved porter.pm into a Main/ directory? Move it out, it will confuse the importing of the porter function.
porter<$_>; says "try to read a line from the filehandle $_ and pass that into porter". $_ isn't a filehandle, it's a line from the file you just opened. You want porter($_) to pass the line into the porter function. If you turn on warnings (add use warnings to the top of your script) Perl will warn you about mistakes like that.
You'll also presumably want to do something with the return value from porter, otherwise it will truly do nothing. my @whatever_porter_returns = porter($_).
Likely one or more of your chdir or open calls have silently failed, so your program may have no input. Unfortunately, Perl does not let you know when this happens; you have to check. Normally you add an or die $! after the function to check for the error. This is busy work and easy to forget. Instead, you can use autodie, which will automatically produce an error if any system call like chdir or open fails.
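A minimal sketch of the autodie approach mentioned above. The function first_line and the path /no/such/dir/input.txt are made up; the path is there only to show what a failed open looks like once autodie is in effect.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use autodie;    # open, close, chdir and friends now die with a message on failure

# Return the first line of $path without its trailing newline.
sub first_line {
    my ($path) = @_;
    open my $in, '<', $path;     # no "or die" needed: autodie raises the error
    my $line = <$in>;
    close $in;
    chomp $line if defined $line;
    return $line;
}

# A path that should not exist, used only to show the failure mode.
my $ok = eval { first_line('/no/such/dir/input.txt'); 1 };
print "caught: $@" unless $ok;
```

Without the eval, the program would simply stop at the failing open with a message naming the file and the reason, which is usually what you want in a script like the one in the question.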
With that stuff fixed your code should work, or at least produce useful error messages.
Finally, there are many stemming modules on CPAN which are likely to be higher quality than the one you've found with documentation and tests and updates and all that. Lingua::Stem and Text::English specifically use the porter algorithm. You might want to give those a shot.