How to properly declare global variables in Perl? - perl

I am trying to understand variable scope and properly declaring variables in Perl, and I am having a hard time.
The code below basically reads in an excel file, parses it, and spits it out to a new excel file.
However, I am trying to read one of the headers, and if the header matches my string, I want to record that column number, and use it later in the code.
I am getting a "Use of uninitialized value $site_name_col in print at ./parser.pl line 38."
Line 38 is "print $site_name_col;"
I realize this print statement is outside the {} where the variable was initially initialized, but it was declared as a global variable at the beginning of the code, so what gives?
#!/usr/bin/perl -w
use strict;
use warnings;
use vars qw($site_name_col);
use Spreadsheet::WriteExcel;
use Spreadsheet::ParseExcel;
my ($fname1) = @ARGV;
my $parser = Spreadsheet::ParseExcel->new();
my $workbook = $parser->parse($fname1);
my $new_workbook = Spreadsheet::WriteExcel->new('formated_list.xls', $fname1);
if (!defined $workbook) {
die $parser->error(), ".\n";
}
for my $worksheet ( $workbook->worksheets() ) {
my ($wsheet_name) = $worksheet->get_name();
my $new_worksheet = $new_workbook->add_worksheet($wsheet_name);
my ($row_min, $row_max) = $worksheet->row_range();
my ($col_min, $col_max) = $worksheet->col_range();
for my $row ($row_min .. $row_max) {
for my $col ($col_min .. $col_max) {
my $cell = $worksheet->get_cell($row, $col);
next unless $cell;
print "Row, Col = ($row, $col)\n";
if ( $cell->value() =~ /Site Name/ ) {
$site_name_col = $col;
}
print $site_name_col;
$new_worksheet->write($row, $col, $cell->value());
}
}
}
$new_workbook->close();

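use vars qw() is no longer recommended. To declare a global variable, use our $my_var.
Your problem probably comes from the condition $cell->value() =~ /Site Name/. It is probably never met, so your variable never gets a value. Note also that the print runs for every cell, so any cell visited before the header matches will trigger the warning as well.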

I recognize this post is a little old, but for those still coming to this page years later (like myself):
I imagine the Excel worksheets you were reading in may not have been created by you, so you may encounter casing issues, and regexes are case sensitive, of course. Either uppercase or lowercase the data during the check: if (lc($cell->value()) =~ /site name/) ...
Use our! There are lots of reasons for one to have a global. site_name would seem to be something all files might need...
Jarett
edit:
this will work much better:
if ($cell->value() =~ /site name/i) { print $col; }
no need to print outside the if statement at all...saves printing nothing many...many times....

Just to clarify what others have already said, a variable declared at the top of a file with my is accessible and usable by your entire file. There is no reason for a global variable in this case.
When would you want a global?
You want a variable to be accessible by another piece of code outside of your file. For example, a module might provide a global variable that is accessible by files that call the module.
You have multiple packages within one file. In which case, you would need a global variable for something accessed by both packages. It would be rather unusual to do this, however.
It is pretty clear that you aren't doing either of those things, so you should just stick with my. If you do want to declare a global, the correct way to do so is with our. There are some important subtleties to that command, explained in the linked documentation.
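To make the distinction concrete, here is a minimal sketch contrasting a file-scoped my variable with a package variable declared with our. The package name My::Settings and the helper subs record_column and recorded_column are hypothetical names invented for this illustration; they are not part of the original script.

```perl
use strict;
use warnings;

# File-scoped lexical: visible everywhere below this line in this file,
# including inside subroutines, but not from other files.
my $site_name_col;

sub record_column {
    my ($col) = @_;
    $site_name_col = $col;
    return;
}

sub recorded_column {
    return $site_name_col;
}

# Package (global) variable: stored in the symbol table, so other
# packages and files can reach it by its fully qualified name.
{
    package My::Settings;    # hypothetical package name
    our $verbose = 1;
}

record_column(3);
print "column: ", recorded_column(), "\n";
print "verbose: $My::Settings::verbose\n";
```

For a single-file script like the one in the question, the lexical alone is enough; the our form only matters once code outside the file needs the value.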

You don't need to declare a global variable in this case; a local (lexical) variable is enough. See the examples below.
if ( $cell->value() =~ /Site Name/ ) {
my $site_name_col = $col;
print $site_name_col;
}
OR
my $site_name_col = ''; # default value
if ( $cell->value() =~ /Site Name/ ) {
$site_name_col = $col;
}
print $site_name_col;

Related

How are these quoted strings replaced with the values in perl .pm file?

Below is the Perl code in a .pm file which is supposed to replace the specified strings (the ones in "quotes") with some values. But it's not happening. Can anyone explain what is happening in this code?
package SomePackage;
require Exporter;
@ISA = qw(Exporter);
@EXPORT = qw(send_request, create_mmd_and_transfer, update_mmd_file);
sub send_request {
my ( $service, $action, $torole ) = @_;
my ( $seller_request_mmd );
my $replace_contents = ();
$replace_contents{"REPLACE_Service"} = $service;
$replace_contents{"REPLACE_RequestAction"} = $action;
$replace_contents{"REPLACE_TradingPartner"} = $torole;
$replace_contents{"REPLACE_Requestxml"} = "Request.xml";
create_mmd_and_transfer( \%replace_contents, $seller_request_mmd, "/MMD.xml" );
}
sub create_mmd_and_transfer {
my $local_replace_contents = shift;
my $input_mmd = shift;
my $local_output_mmd = shift;
my $output_mmd = shift;
update_mmd_file( "$input_mmd", "temp_mmd_file.xml", $local_replace_contents );
}
sub update_mmd_file {
my $input_file = shift;
my $output_file = shift;
my $contents = shift;
open( MMD_FILE, "<$input_file" )
or main::error_exit(" Cannot open MMD file template $input_file \n $input_file not found int the Templates folder \n Please place the same and then run the script ");
open( TEMP_MMD_FILE, ">$output_file" );
while ( <MMD_FILE> ) {
s/^M//g; # Getrid of the ^Ms
foreach my $content ( keys( %$contents ) ) {
my $exact_value = ${%$contents}{$content};
if ( $main::test_scenario =~ /^Invalid Request Action \a\n\d Service/
and ( $content =~ /REPLACE_Service|REPLACE_RequestAction/i ) ) {
}
else {
if ( $exact_value ne "" ) {
s/$content/$exact_value/g;
}
}
}
print TEMP_MMD_FILE;
}
close MMD_FILE;
close TEMP_MMD_FILE;
}
The following will not make your script work; it just creates a better base for future questions.
Before you even think about posting a Perl question here:
1.) add to the top of your script:
use strict;
use warnings;
If you post code here without these two lines, nobody will bother even trying to read it.
2.) use perl -c SomePackage.pm to check the syntax. If it tells you: SomePackage.pm syntax OK - you can start thinking about posting a question here. ;)
Some basic problems with your script:
package SomePackage;
use strict; # see the above
use warnings;
require Exporter;
# @ISA and @EXPORT are package variables, so under strict declare them with `our`
our @ISA = qw(Exporter);
#the use warnings will warn you about the following line
# @EXPORT = qw(send_request, create_mmd_and_transfer, update_mmd_file);
#the correct one is without commas
our @EXPORT = qw(send_request create_mmd_and_transfer update_mmd_file); #not saying anything about the @EXPORT rudeness. :)
#my $replace_contents = ();
#the $replace_contents is a scalar. Below you are using a hash. So,
my %replace_contents;
#or use the scalar, but then the lines below should use the hashref notation, e.g.
# $replace_contents->{"REPLACE_Service"} = $service;
# you decide. :)
# the seller_request_mmd contains undef here.
create_mmd_and_transfer( \%replace_contents, $seller_request_mmd, "/MMD.xml");
# also, below, the subroutine definition expects 4 arguments.
# indicates a problem...
# using 2-arg open is not the best practice.
# Also, you should use lexical filehandles
# open (MMD_FILE, "<$input_file")
# better
open (my $mmd_file, '<', $input_file)
# of course, you need change every MMD_FILE to $mmd_file
# check the result of the open and die if not successful
# or you can use the
use autodie;
# instead of $exact_value = ${%$contents}{$content};
# you probably want
my $exact_value = $contents->{$content};
Indent your code!
All the above are just about the syntactic problems and not solving anything about the "logic" of your code.
PS: I am still a beginner myself, so others will surely find many more problems with the above code.
Ok. Here's what I've done to test this.
Firstly, you didn't give us an input file or the code that you use to call the module. So I invented them. I made the simplest possible input file:
REPLACE_Service
REPLACE_RequestAction
REPLACE_TradingPartner
REPLACE_Requestxml
And this driver program:
#!/usr/bin/perl
use strict;
use warnings;
use SomePackage;
send_request('foo', 'bar', 'baz');
sub error_exit {
die @_;
}
The first time, I ran it, I got this error:
Undefined subroutine &main::send_request called at test line 8.
That was because your @EXPORT line was wrong. You had:
@EXPORT = qw(send_request, create_mmd_and_transfer, update_mmd_file);
But the point of qw(...) is that you don't need the commas. So I corrected it to:
@EXPORT = qw(send_request create_mmd_and_transfer update_mmd_file);
Then I re-ran the program and got this error:
Cannot open MMD file template
not found int the Templates folder
Please place the same and then run the script at test line 11.
That looked like there was something missing. I changed the error message, adding indicators of where the variable interpolation was supposed to happen:
open( MMD_FILE, "<$input_file" )
or main::error_exit(" Cannot open MMD file template <$input_file> \n <$input_file> not found int the Templates folder \n Please place the same and then run the script ");
Then the error message looked like this:
Cannot open MMD file template <>
<> not found int the Templates folder
Please place the same and then run the script at test line 11.
So it seems clear that the $input_file variable isn't set in the update_mmd_file() subroutine. Tracing that variable back, we see that this value is originally the $seller_request_mmd variable in send_request(). But in send_request() you declare $seller_request_mmd but you never give it a value. So let's do that:
my ( $seller_request_mmd ) = 'test_input.txt';
Now, when I run your program, it runs to completion without any errors. And I find a new temp_mmd_file.xml is generated. But it is exactly the same as the input file. So more investigation is needed.
Digging into the update_mmd_file() subroutine, we find this interesting line:
my $exact_value = ${%$contents}{$content};
I think you're trying to extract a value from $contents, which is a hash reference. But your syntax is wrong. You were probably aiming at:
my $exact_value = ${$contents}{$content};
But most Perl programmers prefer the arrow notation for working with reference look-ups.
my $exact_value = $contents->{$content};
Making that change and re-running the program, I get an output file that contains:
foo
bar
baz
Request.xml
Which is exactly what I expected. So the program now works.
But there is still a lot of work to do. As you have been told repeatedly, you should always add:
use strict;
use warnings;
to your code. That will find a lot of potential problems in your code - which you should fix.
To be honest, this feels to me like you were trying to run before you could walk. I'd recommend spending some time working through a good introductory Perl book before taking on any more Perl work.
And there was a lot of useful information missing from your question. It wouldn't have taken as long to get to the solution if you had shown us your driver program and your input data.
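The core fix above (arrow notation on a hash reference, then a substitution per key) can be sketched in isolation. This is only an illustration under assumptions: fill_placeholders is a hypothetical helper name, it works on a string rather than on file handles, and the \Q...\E quoting is an addition so that keys are matched literally.

```perl
use strict;
use warnings;

# Replace every placeholder key found in $text with its value from the
# hash reference $replacements. Empty values are skipped, mirroring the
# `ne ""` check in the original loop.
sub fill_placeholders {
    my ($text, $replacements) = @_;
    for my $placeholder (sort keys %$replacements) {
        my $value = $replacements->{$placeholder};    # arrow notation
        next if !defined $value || $value eq '';
        $text =~ s/\Q$placeholder\E/$value/g;         # \Q..\E: match literally
    }
    return $text;
}

my %replace_contents = (
    REPLACE_Service       => 'foo',
    REPLACE_RequestAction => 'bar',
);
print fill_placeholders("REPLACE_Service\nREPLACE_RequestAction\n",
                        \%replace_contents);
```

Passing \%replace_contents hands the sub a reference, which is why the lookup inside uses $replacements->{...} rather than $replacements{...}.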

Data::Dumper wraps second word's output

I'm experiencing a rather odd problem while using Data::Dumper to try and check on my importing of a large list of data into a hash.
My Data looks like this in another file.
##Product ID => Market for product
ABC => Euro
XYZ => USA
PQR => India
Then in my script, I'm trying to read in my list of data into a hash like so:
open(CONFIG_DAT_H, "<", $config_data);
while(my $line = <CONFIG_DAT_H>) {
if($line !~ /^\#/) {
chomp($line);
my @words = split(/\s*\=\>\s/, $line);
%product_names->{$words[0]} = $words[1];
}
}
close(CONFIG_DAT_H);
print Dumper (%product_names);
My parsing is working for the most part that I can find all of my data in the hash, but when I print it using the Data::Dumper it doesn't print it properly. This is my output.
$VAR1 = 'ABC';
';AR2 = 'Euro
$VAR3 = 'XYZ';
';AR4 = 'USA
$VAR5 = 'PQR';
';AR6 = 'India
Does anybody know why the Dumper is printing the '; characters over the first two letters on my second column of data?
There is one unclear thing in the code: is *product_names a hash or a hashref?
If it is a hash, you should use the $product_names{key} syntax, not %product_names->{key}, and you need to pass a reference to Data::Dumper, so Dumper(\%product_names).
If it is a hashref then it should be labelled with the correct sigil, so $product_names->{key} and Dumper($product_names).
As noted by mob, if your input ends in anything other than \n it needs to be cleaned up more explicitly, say with s/\s*$// per the comment. See the answer by ikegami.
I'd also like to add that the loop can be simplified by losing the if branch:
open my $config_dat_h, "<", $config_data or die "Can't open $config_data: $!";
while (my $line = <$config_dat_h>)
{
next if $line =~ /^\#/; # or /^\s*\#/ to account for possible spaces
# ...
}
I have changed to the lexical filehandle, the recommended practice with many advantages. I have also added a check for open, which should always be in place.
Hmm... this appears wrong to me, even if you're using Perl 6:
%product_names->{$words[0]} = $words[1];
I don't know Perl 6 very well, but in Perl 5 the element access should look like the line below, assuming that %product_names exists and is declared:
$product_names{...} = ... ;
If you could expose the full code, I can help to solve this problem.
The file uses CR LF as line endings. This would become evident by adding the following to your code:
local $Data::Dumper::Useqq = 1;
You could convert the file to use unix line endings (seeing as you are on a unix system). This can be achieved using the dos2unix utility.
dos2unix config.dat
Alternatively, replace
chomp($line);
with the more flexible
$line =~ s/\s+\z//;
Note: %product_names->{$words[0]} makes no sense. It happens to do what you want in old versions of Perl, but it rightfully throws an error in newer versions. $product_names{$words[0]} is the proper syntax for accessing the value of an element of a hash.
Tip: You should be using print Dumper(\%product_names); instead of print Dumper(%product_names);.
Tip: You might also find local $Data::Dumper::Sortkeys = 1; useful. Data::Dumper has such bad defaults :(
Tip: Using split(/\s*=>\s*/, $line, 2) instead of split(/\s*=>\s*/, $line) would permit the value to contain =>.
Tip: You shouldn't use global variables without reason. Use open(my $CONFIG_DAT_H, ...) instead of open(CONFIG_DAT_H, ...), and replace other instances of CONFIG_DAT_H with $CONFIG_DAT_H.
Tip: Using next if $line =~ /^#/; would avoid a lot of indenting.
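Pulling several of these tips together, here is a minimal sketch of the parsing loop. It is an illustration rather than the asker's actual script: parse_product_lines is a hypothetical helper that takes lines as a list instead of reading a file, and the sample lines are made up to carry CR LF endings.

```perl
use strict;
use warnings;
use Data::Dumper;

# Parse "KEY => VALUE" lines into a hash, tolerating CR LF line endings.
sub parse_product_lines {
    my (@lines) = @_;
    my %product_names;
    for my $line (@lines) {
        next if $line =~ /^\s*#/;    # skip comment lines
        $line =~ s/\s+\z//;          # strips "\n", "\r\n" and trailing blanks
        next if $line eq '';
        my ($id, $market) = split /\s*=>\s*/, $line, 2;
        $product_names{$id} = $market;
    }
    return \%product_names;
}

my $products = parse_product_lines(
    "##Product ID => Market for product\r\n",
    "ABC => Euro\r\n",
    "XYZ => USA\r\n",
);

$Data::Dumper::Useqq    = 1;    # show control characters such as \r visibly
$Data::Dumper::Sortkeys = 1;    # stable key order
print Dumper($products);
```

With Useqq enabled, a stray carriage return would show up as \r inside the dumped string instead of silently moving the cursor, which is what produced the overwritten output in the question.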

How do I pass in a variable from one function into another in perl

I am initializing a variable within one function and would like to pass this variable into another function. This variable holds a char value.
I have tried passing in the referencing and dereferencing, declaring the variables outside of the function, and using local.
I've also looked in perlmonks, perl by example, googled and looked through this site for a solution but to no avail. I'm just starting out with perl programming so any help will be appreciated!
Sounds to me like you need to read through some documentation, not just google around. I would suggest http://www.perl.org/books/beginning-perl/.
use strict;
use warnings;
sub foo {
my $char = 'A';
bar($char);
}
sub bar {
my ($bar_char) = @_;
print "bar got char $bar_char\n";
}
foo();
If you pass a parameter by reference (see below), it can be modified by the first function and you can then pass it to another function:
#!/usr/bin/perl
sub f {
$c = shift;
$$c='m';
}
$c='a';
f(\$c);
print $c;
This will print 'm'
Is there a reason why your first function cannot return this variable?
my $config_variable = function1( $param1 );
function2 ( $config_variable, $param2 );
You can also pass more than one variable back too:
my ( $config_variable, $value ) = function1( $param1 );
my $value2 = function2( $param1, $config_variable );
This would be the best way. However, you can use globally defined variables and they can be used from function to function:
#! /usr/bin/env perl
#
use strict;
use warnings;
my $value;
func1();
func2();
sub func1 {
$value = "foo";
}
sub func2 {
print "Value = $value\n";
}
Note that I declared $value outside of both functions, so it's global in the entire file - even in the subroutines. Now, func1 can set it, and func2 can print it.
The technical term for this is: A terrible, awful, evil idea and you should never, ever think of doing it.
This is because a particular variable you think is set to one value suddenly and mysteriously changes values without any apparent reason. Doing this for one variable is bad enough, but if you use this as a crutch, you'll end up with dozens of variables that are impossible to track through your program.
If you find yourself doing this quite a bit, you may need to rethink your code logic.
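For comparison, the same two functions can be written without any shared variable. This is a minimal sketch that reuses the func1 and func2 names from the example above; the bodies are made up for illustration.

```perl
use strict;
use warnings;

# func1 creates the value and hands it back to its caller.
sub func1 {
    my $char = 'A';
    return $char;
}

# func2 receives the value as an argument and never touches outer variables.
sub func2 {
    my ($char) = @_;
    return "Value = $char";
}

my $value = func1();
print func2($value), "\n";
```

Each function now depends only on its arguments, so a surprising value can be traced to exactly one call site.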

Perl while loops not working

I'm quite new to perl and apologies if this has already been answered in a previous discussion. I have a script that needs to use the declared variables outside the loops, but only one loop is working, even though I have declared the variables outside of the loop, the code is:
my $sample;
open(IN, 'ls /*_R1_*.gz |');
while (my $sample = <IN>) {
chomp $sample;
print "sample = $sample\n";
my $fastq1="${sample}"; #need to use fastq1 later on hence it's declared here
my $sample2;
open(IN, 'ls /*_R2_*.gz |');
while (my $sample2 = <IN>) {
chomp $sample2;
print "sample2 = $sample2\n";
my $fastq2="${sample2}"; #need to use fastq2 later on hence it's declared here
}
}
Sample2 works but sample1 does not, only the first sample is output and then the loop goes onto sample2, the output is:
sample =/sample1_R1_001.fastq.gz
sample2 =/sample1_R2_001.fastq.gz
sample2 =/sample2_R2_001.fastq.gz
sample2 =/sample3_R2_001.fastq.gz
etc..
Can anyone figure this out?
Thanks
From your comments, I assume that your problem is probably that you declare $fastq1 and $fastq2 inside the loop. That means they will be out of scope outside the loops, and not accessible. You need something like:
my ($fastq1, $fastq2);
while ( ... ) {
....
$fastq1 = $sample;
}
Note that this will only save the last value in the loop of that variable. The others will of course be overwritten each loop iteration. If you have more values to save, use an array or hash.
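As a minimal sketch of the array approach, the following collects every matching name instead of keeping only the last one. The collect_reads helper and the file names are hypothetical; in a real script the list would more likely come from glob or readdir.

```perl
use strict;
use warnings;

# Collect every name that carries the given read tag ("R1" or "R2").
sub collect_reads {
    my ($tag, @names) = @_;
    my @matches;
    for my $name (@names) {
        push @matches, $name if $name =~ /_\Q$tag\E_.*\.gz\z/;
    }
    return @matches;
}

# Hypothetical file names standing in for a directory listing.
my @files = qw(
    sample1_R1_001.fastq.gz sample1_R2_001.fastq.gz
    sample2_R1_001.fastq.gz sample2_R2_001.fastq.gz
);

# Declared outside any loop, so both arrays stay in scope afterwards.
my @fastq1 = collect_reads('R1', @files);
my @fastq2 = collect_reads('R2', @files);

print "R1: @fastq1\n";
print "R2: @fastq2\n";
```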
Some other notes on your code.
You should always use
use strict;
use warnings;
Not doing so is a very bad idea, as it will only hide the errors and warnings, not solve them.
my $sample;
You declare this variable twice.
open(IN, 'ls /*_R1_*.gz |');
This is just bad on all possible levels:
System calls are always the least desirable option, unless no alternatives exist
Perl has many ways of reading file names
Parsing the output of ls is fragile and not portable
Piping the result of the system command through open is compounding the other flaws with this approach.
Recommended solution: Use either opendir + readdir or glob:
for my $files (</*_R1_*.gz>) { ... }
# or
opendir my $dh, "/" or die $!;
while (my $file = readdir $dh) {
next unless $file =~ /_R1_.*\.gz$/;
...
}
my $fastq1 = "${sample}";
You do not need to quote a variable, nor do you need the curly braces.
When you declare the variable with my inside a loop, it only retains its value for that single loop iteration. Since you never use this variable, I assume you meant to use it outside the loop. But it will be out of scope there.
This can be written
my $fastq1 = $sample;
But you probably want to declare those variables outside your while loops, or they will be out of scope there. You should know that this will only save the last value for these variables, of course.
Also, as Rohit says, your loops are nested, which I assume is not what you wanted. This is most likely because you do not use a proper text editor to write your code, so your indentation is all messed up, and it is hard to see where one loop ends. Follow Rohit's advice there.
You are closing the first while loop after the end of the 2nd while loop. Because of that, your 2nd while loop becomes a part of your 1st while loop, wherein you are re-assigning the filehandle IN to a different command's output. And since you exhaust it in the inner while loop, your outer while loop never runs again.
You should close the brace before starting the next while:
while(my $sample = <IN>){
chomp $sample;
print "sample = $sample\n";
my $fastq1="${sample}";
} # You need this
my $sample2;
open(IN, 'ls /data_n2/vmistry/Fluidigm_Exome/300bp_fastq/*_R2_*.gz |');
while(my $sample2 = <IN>){
chomp $sample2;
print "sample2 = $sample2\n";
my $fastq2="${sample2}";
}
# } # Remove this

Can you hook the opening of the DATA handle?

Can you hook the opening of the DATA handle for a module while Perl is still compiling? And by that I mean is there a way that I can insert code that will run after Perl has opened the DATA glob for reading but before the compilation phase has ceased.
Failing that, can you at least see the raw text after __DATA__ before the compiler opens it up?
In response to Ikegami, on recent scripts that I have been working on, I have been using __DATA__ section + YAML syntax to configure the script. I've also been building up a vocabulary of YAML configuration handlers where the behavior is requested by use-ing the modules. And in some scripts that are quick-n-dirty, but not quite enough to forgo strict, I wanted to see if I could expose variables from the YAML specification.
It's been slightly annoying though just saving data in the import subs and then waiting for an INIT block to process the YAML. But it's been doable.
The file handle in DATA is none other than the handle the parser uses to read the code found before __DATA__. If that code is still being compiled, then __DATA__ hasn't been reached, then the handle hasn't been stored in DATA.
You could do something like the following instead:
open(my $data_fh, '<', \<<'__EOI__');
.
. Hunk of text readable via $data_fh
.
__EOI__
I don’t know where you want the hook. Probably in UNITCHECK.
use warnings;
sub i'm {
print "in @_\n";
print scalar <DATA>;
}
BEGIN { i'm "BEGIN" }
UNITCHECK { i'm "UNITCHECK" }
CHECK { i'm "CHECK" }
INIT { i'm "INIT" }
END { i'm "END" }
i'm "main";
exit;
__END__
Data line one.
Data line two.
Data line three.
Data line four.
Data line five.
Data line six.
Produces this when run:
in BEGIN
readline() on unopened filehandle DATA at /tmp/d line 5.
in UNITCHECK
Data line one.
in CHECK
Data line two.
in INIT
Data line three.
in main
Data line four.
in END
Data line five.
You can use any of the before runtime but after compilation blocks to change the *DATA handle. Here is a short example using INIT to change *DATA to uc.
while (<DATA>) {
print;
}
INIT { # after compile time, so DATA is opened, but before runtime.
local $/;
my $file = uc <DATA>;
open *DATA, '<', \$file;
}
__DATA__
hello,
world!
prints:
HELLO,
WORLD!
Which one of the blocks to use depends on other factors in your program. More detail about the various timed blocks can be found on the perlmod manpage.
I'm afraid not, if I got your question right. It's written in The Doc:
Note that you cannot read from the DATA filehandle in a BEGIN block:
the BEGIN block is executed as soon as it is seen (during
compilation), at which point the corresponding __DATA__ (or __END__)
token has not yet been seen.
There's another way, though: read the file with DATA section as a normal text file, parse this section, then require the script file itself (which will be done at run-time). Don't know whether it'll be relevant in your case. )
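The first half of that idea, reading the section as plain text, might look like the sketch below. It rests on assumptions: read_data_section is a hypothetical helper, and the demonstration writes a throwaway file with File::Temp rather than reading a real script.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Return everything after the first __DATA__ or __END__ line of a source
# file, or nothing if the file has no such marker.
sub read_data_section {
    my ($path) = @_;
    open my $fh, '<', $path or die "Can't open $path: $!";
    my $found = 0;
    while (my $line = <$fh>) {
        if ($line =~ /^__(?:DATA|END)__\s*$/) {
            $found = 1;
            last;
        }
    }
    return unless $found;
    local $/;                  # slurp the rest of the file in one read
    my $text = <$fh>;
    close $fh;
    return defined $text ? $text : '';
}

# Throwaway file standing in for a script with a data section.
my ($tmp_fh, $tmp_path) = tempfile(UNLINK => 1);
print {$tmp_fh} "print 'hi';\n__DATA__\nk: v\n";
close $tmp_fh;

print read_data_section($tmp_path);
```

Because this treats the script as an ordinary text file, it can run at any phase, including inside a BEGIN block or an import sub, at the cost of opening the file a second time.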
perlmod says:
CHECK code blocks are run just after the initial Perl compile phase ends and before the run time begins, in LIFO order.
Maybe you are looking for something like this?
CHECK {
say "Reading from <DATA> ...";
while (<DATA>) {
print;
$main::count++;
};
}
say "Read $main::count lines from <DATA>";
__DATA__
1
2
3
4
5
This produces the following output:
Reading from <DATA> ...
1
2
3
4
5
Read 5 lines from <DATA>
I found out that ::STDIN actually gives me access to the stream '-'. And that I can save the current location, through tell( $inh ) and then seek() it when I'm done.
By using that method, I could read the __DATA__ section in the import sub!
sub import {
my ( $caller, $file ) = ( caller 0 )[0,1];
my $yaml;
if ( $file eq '-' ) {
my $place = tell( ::STDIN );
local $RS;
$yaml = <::STDIN>;
seek( ::STDIN, $place, 0 );
}
else {
open( my $inh, '<', $file );
local $_ = '';
while ( defined() and !m/^__DATA__$/ ) { $_ = <$inh>; }
local $RS;
$yaml = <$inh>;
close $inh;
}
if ( $yaml ) {
my ( $config ) = YAML::XS::Load( $yaml );;
no strict 'refs';
while ( my ( $n, $v ) = each %$config ) {
*{"$caller\::$n"} = ref $v ? $v : \$v;
}
}
return;
}
This worked on Strawberry Perl 5.16.2, so I don't know how portable this is. But right now, to me, this is working.
Just a background. I used to do a bit of programming with Windows Script Files. One thing I liked about the wsf format was that you could specify globally useful objects outside of the code. <object id="xl" progid="Application.Excel" />. I have always liked the look of programming by specification and letting some modular handler sort the data out. Now I can get a similar behavior through a YAML handler: excel: !ActiveX: Excel.Application.
This works for me.
The test is here, in case you're interested:
use strict;
use warnings;
use English qw<$RS>;
use Test::More;
use data_mayhem; # <-- that's my module.
is( $k, 'Excel.Application' );
is( $l[1], 'two' );
{ local $RS;
my $data = <DATA>;
isnt( $data, '' );
say $data
}
done_testing;
__DATA__
---
k : !ActiveX Excel.Application
l :
- one
- two
- three