Perl Data::Dumper output processing - perl

I am using the Data::Dumper API to parse an HTML table.
Here is the Perl code:
print Dumper $row;
Here is the output:
$VAR1 = [
'Info1',
'Info2',
'Info3',
];
Question:
1. I want to modify Info1, Info2, etc. before writing them into a SQL table. How do I access them from the above output?
Something like $row->{var1}->? I've tried a couple of options and nothing worked.

This is an old question, with an answer that was never selected.
Ways to update an arrayref
Element by array reference:
$row->[0] = 'foo';
$row->[1] = 'bar';
$row->[2] = 'baz';
List assignment:
($row->[0], $row->[1], $row->[2]) = ('foo','bar','baz');
Array list assignment:
@{$row} = ('foo','bar','baz');
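As a minimal sketch of the question's actual goal (changing each field before it is written to a table), the elements can also be modified in place through the reference. The sample values and the trim/upper-case step are assumptions for illustration only, and the database write is left out.

```perl
use strict;
use warnings;

# Hypothetical row, shaped like the Dumper output in the question.
my $row = [ ' Info1 ', 'Info2', ' Info3' ];

# $_ is aliased to each element, so changes are written back into @$row.
for (@{$row}) {
    s/^\s+|\s+$//g;   # trim surrounding whitespace
    $_ = uc $_;       # example transformation
}

# A single element is reached with the arrow and an index.
$row->[1] = 'CHANGED';

print join(',', @{$row}), "\n";   # prints INFO1,CHANGED,INFO3
```

Note the square brackets: $row->{var1} would treat $row as a hash reference, whereas the Dumper output shows an array reference.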

Related

what is the best way to assign a value inside a nested hash?

My code constructs a hash %oprAtnNOW that contains a hash ref, where both the keys and values inside
$oprAtnNOW{opt} are determined at run time.
The sample code below demonstrates that a single command suffices to extract a value from the anonymous hash referenced by $oprAtnNOW{opt}.
But assigning a value doesn't work like that.
When I try to assign the string wolf to the key Dog, something very strange happens.
When I use Dumper to look at the result, it appears that the value assigned was 'wolf, with a single quote pasted to the start of the string;
and when I use print to look at it, it looks like SCALAR(something).
(The end of my code demonstrates that 'wolf does not print out as SCALAR(something), so Dumper has something else in mind.)
So my sample code contains a workaround:
dereference the anonymous inner hash; assign key and value in the now named, temporary hash; clobber the previous $oprAtnNOW{opt}
with a reference to the temporary, named hash.
Why does the direct method yield such a strange result?
What is the true content of this SCALAR thing?
Is there a way to do this with a single command, without my multi-step workaround?
#!/usr/bin/perl
use strict; use warnings;
use Data::Dumper qw(Dumper);
$Data::Dumper::Sortkeys = 1;
my %oprAtnNOW;
${$oprAtnNOW{opt}{Dog}} = 'wolf';
print Dumper {%oprAtnNOW};
print join('', '$oprAtnNOW{opt}{Dog}==', $oprAtnNOW{opt}{Dog}, "\n",);
{
my %tmp_oprAtnNOW_opt = %{$oprAtnNOW{opt}} if(defined $oprAtnNOW{opt});
$tmp_oprAtnNOW_opt{Dog} = 'wolf'; # will clobber any previous value for Dog
$oprAtnNOW{opt} = {%tmp_oprAtnNOW_opt};
}
print Dumper {%oprAtnNOW};
print join('', '$oprAtnNOW{opt}{Dog}==', $oprAtnNOW{opt}{Dog}, "\n",);
my $teststring = join('', "\x27", 'wolf',);
print "teststring==$teststring\n";
You want
$oprAtnNOW{opt}{Dog} = 'wolf';
$BLOCK = EXPR; expects BLOCK to return a reference to a scalar.
Thanks to autovivification, one is created for you if needed. In other words,
${$oprAtnNOW{opt}{Dog}} = 'wolf';
is short for
${ $oprAtnNOW{opt}{Dog} //= \my $anon } = 'wolf';
which could also be written as
my $anon = 'wolf';
$oprAtnNOW{opt}{Dog} = \$anon;
This is not what you want. You don't want to assign a reference to the hash; you want to assign the string wolf. To achieve that, you can use
$oprAtnNOW{opt}{Dog} = 'wolf';
This is short for
$oprAtnNOW{opt}->{Dog} = 'wolf';
aka
${ $oprAtnNOW{opt} }{Dog} = 'wolf';
The latter is of the form
$BLOCK{Dog} = 'wolf';
which, like $NAME{Dog}, is an assignment to a hash element.
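The difference between the two forms can be checked with ref. This is a small sketch reusing the question's %oprAtnNOW; the helper variable names are made up for the illustration. When the value is a scalar reference, Dumper renders it as \'wolf', which is most likely the stray quote seen in the question, and print shows it as SCALAR(0x...).

```perl
use strict;
use warnings;

my %oprAtnNOW;

# Scalar dereference: autovivifies a scalar reference as the hash value.
${ $oprAtnNOW{opt}{Dog} } = 'wolf';
my $kind_before = ref $oprAtnNOW{opt}{Dog};     # 'SCALAR'
my $inner_value = ${ $oprAtnNOW{opt}{Dog} };    # 'wolf'

# Plain element assignment: the value is the string itself.
$oprAtnNOW{opt}{Dog} = 'wolf';
my $kind_after = ref $oprAtnNOW{opt}{Dog};      # '' (not a reference)

print "before: $kind_before holding $inner_value\n";
print "after:  '$kind_after' value $oprAtnNOW{opt}{Dog}\n";
```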

Perl find out if X is an element in an array

I don't know why such small things are not working for me in Perl. I am sorry for that.
I have been trying for around 2 hours but I couldn't get the results.
my $technologies = 'json.jquery..,php.linux.';
my @techarray = split(',',$technologies);
#my @techarray = [
# 'json.jquery..',
# 'php.linux.'
# ];
my $search_id = 'json.jquery..';
check_val(@techarray, $search_id);
And I am using an "if" to search for the above item in the array, but it is not working for me.
sub check_val{
my @techarray = shift;
my $search_id = shift;
if (grep {$_ eq $search_id} @techarray) {
print "It is there \n";
}else{
print "It is not there \n";
}
}
Output: It always goes to the else branch and prints "It is not there". :(
Any ideas? Have I made a stupid mistake?
You are using an anonymous array [ ... ] there, which as a scalar (reference) is then assigned to @techarray, as its only element. It is like @arr = 'a';. An array is defined by ( ... ).
A remedy is to either define an array, my @techarray = ( ... ), or to properly define an arrayref and then dereference when you search
my $rtecharray = [ .... ];
if (grep {$_ eq $search_id} @$rtecharray) {
# ....
}
For all kinds of list manipulations have a look at List::Util and List::MoreUtils.
Updated to changes in the question, as the sub was added
This has something else, which is more instructive.
When you pass an array to a function, it is passed as a flat list of its elements. In the function, the first shift picks up the first element,
and the second shift picks up the second one.
So the search runs over an array containing only the element 'json.jquery..', looking for the string 'php.linux.'; the intended $search_id, the third argument, is never read.
Instead, you can pass a reference,
check_val(\@techarray, $search_id);
and use it as such in the function.
Note that if you pass the array and get arguments in the function as
my (@array, $search_id) = @_; # WRONG
you are in fact getting all of @_ into @array.
See, for example, this post (passing to function) and this post (returning from function).
In general I'd recommend passing lists by reference.
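Put together, a version of the question's sub that takes a reference might look like the sketch below. Returning 1 or 0 instead of printing inside the sub, and the 'python.' search value, are choices made for this illustration rather than part of the original code.

```perl
use strict;
use warnings;

# Takes an array reference and a search string; returns 1 if found, else 0.
sub check_val {
    my ($techarray_ref, $search_id) = @_;
    return (grep { $_ eq $search_id } @{$techarray_ref}) ? 1 : 0;
}

my $technologies = 'json.jquery..,php.linux.';
my @techarray    = split /,/, $technologies;

my $found   = check_val(\@techarray, 'json.jquery..');
my $missing = check_val(\@techarray, 'python.');

print $found   ? "It is there\n" : "It is not there\n";
print $missing ? "It is there\n" : "It is not there\n";
```

Because the reference is a single scalar, it occupies exactly one slot in the argument list and the search string stays the second argument regardless of how many elements the array holds.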

What is this perl object and how do I iterate through it?

I have a perl object that was returned to me whose data I can't seem to extract. If I run Data::Dumper->Dump on it as:
Data::Dumper->Dump($message_body)
I get:
$VAR1 = 'SBM Message
';
$VAR2 = '--SBD.Boundary.605592468
';
$VAR3 = 'Content-Type: text/plain;charset=US-ASCII
';
$VAR4 = 'Content-Disposition: inline
If I execute the line:
print $message_body;
I get:
ARRAY(0x9145668)
I would think this is an array. However, when I try to iterate through it, there only seems to be a single element. How do I extract each of the elements from this? By the way, this is basically the body of a mail message extracted using the MIME::Parser package. It was created using the following:
my $parser = new MIME::Parser;
my $entity = $parser->parse($in_fh); # Where $in_fh points to a mail message
$message_body = $entity->body;
Try below foreach loop.
foreach my $item (@{$message_body})
{
print $item."\n";
}
$message_body is an ARRAY reference. Hence you need to dereference it and then iterate through each element using the foreach loop.
Read:
http://perlmeme.org/howtos/using_perl/dereferencing.html and http://www.thegeekstuff.com/2010/06/perl-array-reference-examples/
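A self-contained sketch of the same idea, with a hand-built array reference standing in for $entity->body (the three lines are sample data, not real parser output):

```perl
use strict;
use warnings;

# Stand-in for $entity->body: an array reference of lines (hypothetical data).
my $message_body = [
    "SBM Message\n",
    "Content-Type: text/plain;charset=US-ASCII\n",
    "Content-Disposition: inline\n",
];

my $kind  = ref $message_body;          # 'ARRAY'
my $count = scalar @{$message_body};    # number of lines

for my $line (@{$message_body}) {
    chomp(my $copy = $line);            # strip the newline from a copy only
    print "[$copy]\n";
}
print "$kind with $count elements\n";
```

Iterating over $message_body itself, without the @{ ... }, loops once over the reference, which would explain seeing only a single element.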
Data::Dumper is only a poor man's choice to see the content.
To see all the gory internal details use Devel::Peek instead.
use Devel::Peek;
Dump $message_body;

In Perl, how to use 'defined' function on elements of two-dimensional array?

I am trying to check if an element is defined, using defined function in Perl.
Code :
$mylist[0][0]="wqeqwe";
$mylist[0][1]="afasf";
$mylist[1][0]="lkkjh";
print scalar(@mylist), "\n";
if (defined($mylist[2][0])){print "TRUE\n";}
print scalar(@mylist), "\n";
Output
2
3
Before using the defined function, there were two elements in the first dimension of @mylist. After using it, the number of elements increases to 3.
How can I use the defined function without adding new elements?
First check that the first-level reference exists.
if ( defined($mylist[2]) && defined($mylist[2][0]) ) {
print "TRUE\n";
}
What you've encountered is called autovivification: under some circumstances, Perl creates complex data structures when you use them as if they already existed.
It's interesting to note that there's a non-core pragma called autovivification, and that if you run your code under no autovivification; your problem will go away.
When you refer to $mylist[2][0], perl's autovivification creates the array element $mylist[2].
To prevent this, you can check this element first:
if ( (defined $mylist[2]) && (defined $mylist[2][0]) )
defined($mylist[2][0])
is equivalent to
defined($mylist[2]->[0])
which is short for
defined( ( $mylist[2] //= [] )->[0])
due to autovivification. You can disable autovivification using the autovivification pragma.
no autovivification;
if (defined($mylist[2][0]))
Or you can avoid evaluating code that would trigger it.
if (defined($mylist[2]) && defined($mylist[2][0]))
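The effect of the guard can be observed by recording the array size after each kind of test. This sketch uses only core Perl; the variable names for the recorded sizes are made up for the illustration.

```perl
use strict;
use warnings;

my @mylist;
$mylist[0][0] = "wqeqwe";
$mylist[0][1] = "afasf";
$mylist[1][0] = "lkkjh";

# Guarded test: $mylist[2] is only read, never dereferenced, so nothing is created.
my $guarded      = ( defined($mylist[2]) && defined($mylist[2][0]) );
my $size_guarded = scalar @mylist;

# Unguarded test: dereferencing $mylist[2] autovivifies it as [].
my $unguarded      = defined($mylist[2][0]);
my $size_unguarded = scalar @mylist;

print "guarded: $size_guarded, unguarded: $size_unguarded\n";   # guarded: 2, unguarded: 3
```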
Actually because it's autovivification, you can check it easily with Data::Dumper, before and after using defined.
use Data::Dumper;
my #mylist;
$mylist[0][0]="wqeqwe";
$mylist[0][1]="afasf";
$mylist[1][0]="lkkjh";
print Dumper(@mylist);
Output before
$VAR1 = ['wqeqwe', 'afasf'];
$VAR2 = [ 'lkkjh'];
if (defined($mylist[2][0])){print "TRUE\n";}
print Dumper(@mylist);
Output after
$VAR1 = [ 'wqeqwe','afasf' ];
$VAR2 = ['lkkjh'];
$VAR3 = [];

XML parsing using perl

I tried to research a simple question I have but couldn't find an answer. I am trying to get data from the web, which is in XML, and parse it using Perl. I know how to loop over repeating elements, but I am stuck when an element is not repeated (I know this might be silly). If the elements are repeating, I put them in an array and get the data. But when there is only a single element, it throws an error saying 'Not an array reference'. I want my code to parse both cases (single and multiple elements). The code I am using is as follows:
use LWP::Simple;
use XML::Simple;
use Data::Dumper;
open (FH, ">:utf8","xmlparsed1.txt");
my $db1 = "pubmed";
my $query = "13054692";
my $q = 16354118; #for multiple MeSH terms
my $xml = new XML::Simple;
$urlxml = "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=$db1&id=$query&retmode=xml&rettype=abstract";
$dataxml = get($urlxml);
$data = $xml->XMLin("$dataxml");
#print FH Dumper($data);
foreach $e(@{$data->{PubmedArticle}->{MedlineCitation}->{MeshHeadingList}->{MeshHeading}})
{
print FH $e->{DescriptorName}{content}, ' $$ ';
}
Also, can I do something such that the separator $$ will not get printed after the last element?
I also tried the following code:
$mesh = $data->{PubmedArticle}->{MedlineCitation}->{MeshHeadingList}->{MeshHeading};
while (my ($key, $value) = each(%$mesh)){
print FH "$value";
}
But, this prints all the childnodes and I just want the content node.
Perl's XML::Simple will take a single item and return it as a scalar, and if the value repeats it sends it back as an array reference. So, to make your code work, you just have to force MeshHeading to always return an array reference:
$data = $xml->XMLin("$dataxml", ForceArray => [qw( MeshHeading )]);
I think you missed the part of "perldoc XML::Simple" that talks about the ForceArray option:
check out ForceArray because you'll almost certainly want to turn it on
Then you will always get an array, even if the array contains only one element.
As others have pointed out, the ForceArray option will solve this particular problem. However you'll undoubtedly strike another problem soon after due to XML::Simple's assumptions not matching yours. As the author of XML::Simple, I strongly recommend you read Stepping up from XML::Simple to XML::LibXML - if nothing else it will teach you more about XML::Simple.
Since $data->{PubmedArticle}-> ... ->{MeshHeading} can be either a string or an array reference depending on how many <MeshHeading> tags are present in the document, you need to examine the value's type with ref and conditionally dereference it. Since I am unaware of any terse Perl idioms for doing this, your best bet is to write a function:
sub toArray {
my $meshes = shift;
if (!defined $meshes) { return () }
elsif (ref $meshes eq 'ARRAY') { return @$meshes }
else { return ($meshes) }
}
and then use it like so:
foreach my $e (toArray($data->{PubmedArticle}->{MedlineCitation}->{MeshHeadingList}->{MeshHeading})) { ... }
To prevent ' $$ ' from being printed after the last element, instead of looping over the list, concatenate all the elements together with join:
print FH join ' $$ ', map { $_->{DescriptorName}{content} }
toArray($data->{PubmedArticle}->{MedlineCitation}->{MeshHeadingList}->{MeshHeading});
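The helper and the join can be tried without fetching any XML. In this sketch the two data structures are hand-written stand-ins for what XML::Simple might return for one and for several MeshHeading elements; the term names are sample values.

```perl
use strict;
use warnings;

# Normalise "one item or an array reference of items" into a list.
sub toArray {
    my $meshes = shift;
    return ()       if !defined $meshes;
    return @$meshes if ref $meshes eq 'ARRAY';
    return ($meshes);
}

# Hypothetical structures in the two shapes described above.
my $single = { DescriptorName => { content => 'Humans' } };
my $many   = [
    { DescriptorName => { content => 'Animals' } },
    { DescriptorName => { content => 'Mice' } },
];

my $one_line  = join ' $$ ', map { $_->{DescriptorName}{content} } toArray($single);
my $many_line = join ' $$ ', map { $_->{DescriptorName}{content} } toArray($many);

print "$one_line\n$many_line\n";
```

Since join only places the separator between elements, a single heading prints with no ' $$ ' at all and several headings print with none after the last one.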
This is a place where XML::Simple is being...simple. It deduces whether there's an array or not by whether something occurs more than once. Read the doc and look for the ForceArray option to address this.
To only include the ' $$ ' between elements, replace your loop with
print FH join ' $$ ', map $_->{DescriptorName}{content}, @{$data->{PubmedArticle}->{MedlineCitation}->{MeshHeadingList}->{MeshHeading}};