Perl: overwrite structure member value - perl

I am creating $input with this code:
push(#{$input->{$step}},$time);, then I save it in an xml file, and at the next compiling, I read it from that file. When i print it, i get the structure bellow.
if(-e $file)
my $input =XMLin($file...);
print Dumper $input;
and I get this structure
$VAR1 = {
'opt' => {
'step820' => '0',
'step190' => '0',
'step124' => '0',
}
};
for each step with it's time..
push(#{$input->{$step}},$time3);
XmlOut($file, $input);
If I run the program again, I get this structure:
$VAR1 = {
'opt' => {
'step820' => '0',
'step190' => '0',
'step124' => '0',
'opt' => {
'step820' => '0',
'step190' => '0',
'step124' => '0'
}
}
I just need to overwrite the values of steps(ex:$var1->opt->step820 = 2). How can i do that?

I just need to overwrite the values of steps(ex:$var1->opt->step820 = 2). How can i do that?
$input->{opt}->{step820} = 2;

I'm going to say what I always do, whenever someone posts something asking about XML::Simple - and that is that XML::Simple is deceitful - it isn't simple at all.
Why is XML::Simple "Discouraged"?
So - in your example:
#!/usr/bin/env perl
use strict;
use warnings;
use XML::Twig;
my $xml= XML::Twig->new->parsefile($file);
$xml -> get_xpath('./opt/step820',0)->set_text("2");
$xml -> print;
The problem is that XML::Simple is only any good for parsing the type of XML that you didn't really need XML for in the first place.
For more simple examples - have you considered using JSON for serialisation? As it more directly reflects the hash/array structure of native perl data types.
That way you can instead:
print {$output_fh} to_json ( $myconfig, {pretty=>1} );
And read it back in:
my $myconfig = from_json ( do { local $/; <$input_fh> });
Something like:
#!/usr/bin/env perl
use strict;
use warnings;
use JSON;
my $input;
my $time = 0;
foreach my $step ( qw ( step820 step190 step124 ) ) {
push(#{$input->{$step}},$time);
}
print to_json ( $input, {pretty=>1} );
Giving resultant JSON of:
{
"step190" : [
0
],
"step820" : [
0
],
"step124" : [
0
]
}
Although actually, I'd probably:
foreach my $step ( qw ( step820 step190 step124 ) ) {
$input->{$step} = $time;
}
print to_json ( $input, {pretty=>1} );
Which gives;
{
"step190" : 0,
"step124" : 0,
"step820" : 0
}
JSON uses very similar conventions to perl - in that {} denote key value pairs (hashes) and [] denote arrays.

Look at the RootName option of XMLout. By default, when "XMLout()" generates XML, the root element will be named 'opt'. This option allows you to specify an alternative name.
Specifying either undef or the empty string for the RootName option will produce XML with no root elements.

Related

Convert comma separated values to key value pair Perl

I have an array of states in the format
('AL','Alabama','AK','Alaska','AR','Arkansas'...)
which I want formatted like:
[{'AL' => 'Alabama'},...]
This is primarily so that I can more easily loop through using the HTML::Template module (https://metacpan.org/pod/HTML::Template#TMPL_LOOP)
I'm fairly new to perl, so unsure about how to do this sort of action and can't find something similar enough.
Wouldn't the following make more sense for HTML::Template?
states => [ { id => 'AL', name => 'Alabama' }, ... ]
This would allow you to use the following template:
<TMPL_LOOP NAME=states>
<TMPL_VAR NAME=name> (<TMPL_VAR NAME=id>)
</TMPL_LOOP>
To achieve that, you can use the following:
use List::Util 1.29 qw( pairmap );
states => [ pairmap { +{ id => $a, name => $b } } #states ]
That said, you're probably generating HTML.
<select name="state">
<TMPL_LOOP NAME=states>
<option value="<TMPL_VAR NAME=id_html>"><TMPL_VAR NAME=name_html></option>
</TMPL_LOOP>
</select>
To achieve that, you can use the following:
use List::Util 1.29 qw( pairmap );
{
my %escapes = (
'&' => '&',
'<' => '<',
'>' => '>',
'"' => '"',
"'" => ''',
);
sub text_to_html(_) { $_[0] =~ s/([&<>"'])/$escapes{$1}/rg }
}
states => [ pairmap { +{ id_html => $a, name_html => $b } } map text_to_html, #states ]
use List::Util 1.29;
#state_hashes = List::Util::pairmap { +{ $a => $b } } #states;
Unless you need to keep this hash around for later use I think that simply looping through the elements two at a time would be simpler. You can accomplish this type of looping easily with splice:
my #states = ('AL','Alabama','AK','Alaska','AR','Arkansas'...);
while (my ($code, $name) = splice(#states, 0, 2)) {
# operations here
}
Alternatively, you can use this same approach to create the data structure you want:
my #states = ('AL','Alabama','AK','Alaska','AR','Arkansas'...);
my #state_hashes = ();
while (my ($code, $name) = splice(#states, 0, 2)) {
push #state_hashes, { $code => $name };
}
# do w/e you want with #state_hashes
Note: splice will remove elements from #states
bundle_by from List::UtilsBy can easily create this format:
use strict;
use warnings;
use List::UtilsBy 'bundle_by';
my #states = ('AL', 'Alabama', 'AK', 'Alaska', 'AR', 'Arkansas', ... );
my #hashes = bundle_by { +{#_} } 2, #states;
map solution with a few perlish things
my #states = ('AL','Alabama','AK','Alaska','AR','Arkansas','VT','Vermont');
my %states;
map { $states{$states[$_]} = $states[$_+1] unless $_%2 } 0..$#states;

Returning a hash of the Parsed document (using Twig in Perl) to be used for processing in other subs

I am failing terribly to return a Hash of the Parsed XML document using twig - in order to use it in OTHER subs for performing several validation checks. The goal is to do abstraction and create re-usable blocks of code.
XML Block:
<?xml version="1.0" encoding="utf-8"?>
<Accounts locale="en_US">
<Account>
<Id>abcd</Id>
<OwnerLastName>asd</OwnerLastName>
<OwnerFirstName>zxc</OwnerFirstName>
<Locked>false</Locked>
<Database>mail</Database>
<Customer>mail</Customer>
<CreationDate year="2011" month="8" month-name="fevrier" day-of-month="19" hour-of-day="15" minute="23" day-name="dimanche"/>
<LastLoginDate year="2015" month="04" month-name="avril" day-of-month="22" hour-of-day="11" minute="13" day-name="macredi"/>
<LoginsCount>10405</LoginsCount>
<Locale>nl</Locale>
<Country>NL</Country>
<SubscriptionType>free</SubscriptionType>
<ActiveSubscriptionType>free</ActiveSubscriptionType>
<SubscriptionExpiration year="1980" month="1" month-name="janvier" day-of-month="1" hour-of-day="0" minute="0" day-name="jeudi"/>
<SubscriptionMonthlyFee>0</SubscriptionMonthlyFee>
<PaymentMode>Undefined</PaymentMode>
<Provision>0</Provision>
<InternalMail>asdf#asdf.com</InternalMail>
<ExternalMail>fdsa#zxczxc.com</ExternalMail>
<GroupMemberships>
<Group>werkgroep X.Y.Z.</Group>
</GroupMemberships>
<SynchroCount>6</SynchroCount>
<LastSynchroDate year="2003" month="12" month-name="decembre" day-of-month="5" hour-of-day="12" minute="48" day-name="mardi"/>
<HasActiveSync>false</HasActiveSync>
<Company/>
</Account>
<Account>
<Id>mnbv</Id>
<OwnerLastName>cvbb</OwnerLastName>
<OwnerFirstName>bvcc</OwnerFirstName>
<Locked>true</Locked>
<Database>mail</Database>
<Customer>mail</Customer>
<CreationDate year="2012" month="10" month-name="octobre" day-of-month="10" hour-of-day="10" minute="18" day-name="jeudi"/>
<LastLoginDate/>
<LoginsCount>0</LoginsCount>
<Locale>fr</Locale>
<Country>BE</Country>
<SubscriptionType>free</SubscriptionType>
<ActiveSubscriptionType>free</ActiveSubscriptionType>
<SubscriptionExpiration year="1970" month="1" month-name="janvier" day-of-month="1" hour-of-day="1" minute="0" day-name="jeudi"/>
<SubscriptionMonthlyFee>0</SubscriptionMonthlyFee>
<PaymentMode>Undefined</PaymentMode>
<Provision>0</Provision>
<InternalMail/>
<ExternalMail>qweqwe#qwe.com</ExternalMail>
<GroupMemberships/>
<SynchroCount>0</SynchroCount>
<LastSynchroDate year="1970" month="1" month-name="janvier" day-of-month="1" hour-of-day="1" minute="0" day-name="jeudi"/>
<HasActiveSync>false</HasActiveSync>
<Company/>
</Account>
</Accounts>
Perl Block:
my $file = shift || (print "NOTE: \tYou didn't provide the name of the file to be checked.\n" and exit);
my $twig = XML::Twig -> new ( twig_roots => { 'Account' => \& parsing } ); #'twig_roots' mode builds only the required sub-trees from the document while ignoring everything outside that twig.
$twig -> parsefile ($file);
sub parsing {
my ( $twig, $accounts ) = #_;
my %hash = #_;
my $ref = \%hash; #because was getting an error of Odd number of hash elements
return $ref;
$twig -> purge;
It gives a hash reference - which I'm unable to deference properly (even after doing thousands of attempts).
Again - just need a single clean function (sub) for doing the Parsing and returning the hash of all elements ('Accounts' in this case) - to be used in other other function (valid_sub) for performing the validation checks.
I'm literally stuck at this point - and will HIGHLY appreciate your HELP.
Such a hash is not created by Twig, you have to create it yourself.
Beware: Commands after return will never be reached.
#!/usr/bin/perl
use warnings;
use strict;
use XML::Twig;
use Data::Dumper;
my $twig = 'XML::Twig'->new(twig_roots => { Account => \&account });
$twig->parsefile(shift);
sub account {
my ($twig, $account) = #_;
my %hash;
for my $ch ($account->children) {
if (my $text = $ch->text) {
$hash{ $ch->name } = $text;
} else {
for my $attr (keys %{ $ch->atts }) {
$hash{ $ch->name }{$attr} = $ch->atts->{$attr};
}
}
}
print Dumper \%hash;
$twig -> purge;
validate(\%hash);
}
Handling of nested elements (e.g. GroupMemberships) left as an exercise to the reader.
And for validation:
sub validate {
my $account = shift;
if ('abcd' eq $account->{Id}) {
...
}
}
The problem with downconverting XML into hashes, is that XML is fundamentally a more complicated data structure. Each element has properties, children and content - and it's ordered - where hashes... don't.
So I would suggest that you not do what you're doing, and instead of passing a hash, use an XML::Twig::Elt and pass that into your validation.
Fortunately, this is exactly what XML::Twig passes to it's handlers:
## this is fine:
sub parsing {
my ( $twig, $accounts ) = #_;
but this is nonsense - think about what's in #_ at this point - it's references to XML::Twig objects - two of them, you've just assigned them.
my %hash = #_;
And this doesn't makes sense as a result
my $ref = \%hash; #because was getting an error of Odd number of hash elements
And where are you returning it to? (this is being called when XML::Twig is parsing)
return $ref;
#this doesn't happen, you've already returned
$twig -> purge;
But bear in mind - you're returning it to your twig proces that's parsing, that's ... discarding the return code. So that's not going to do anything anyway.
I would suggest instead you 'save' the $accounts reference and use that for your validation - just pass it into your subroutines to validate.
Or better yet, configure up a set of twig_handlers that do this for you:
my %validate = ( 'Account/Locked' => sub { die if $_ -> trimmed_text eq "true" },
'Account/CreationDate' => \&parsing,
'Account/ExternalMail' => sub { die unless $_ -> text =~ m/\w+\#\w+\.\w+ }
);
my $twig = XML::Twig -> new ( twig_roots => \%validate );
You can either die if you want to discard the whole lot, or use things like cut to remove an invalid entry from a document as you parse. (and maybe paste it into a seperate doc).
But if you really must turn your XML into a perl data structure - first read this for why it's a terrible idea:
Why is XML::Simple "Discouraged"?
And then, if you really want to carry on down that road, look at the simplify option of XML::Twig:
sub parsing {
my ( $twig, $accounts ) = #_;
my $horrible_hacky_hashref = $accounts->simplify(forcearray => 1, keyattr => [], forcecontent => 1 );
print Dumper \$horrible_hacky_hashref;
$twig -> purge;
#do something with it.
}
Edit:
To expand:
XML::Twig::Elt is a subset of XML::Twig - it's the 'building block' of an XML::Twig data structure - so in your example above, $accounts is.
sub parsing {
my ( $twig, $accounts ) = #_;
print Dumper $accounts;
}
You will get a lot of data if you do this, because you're dumping the whole data structure - which is effectively a daisy chain of XML::Twig::Elt objects.
$VAR1 = \bless( {
'parent' => bless( {
'first_child' => ${$VAR1},
'flushed' => 1,
'att' => {
'locale' => 'en_US'
},
'gi' => 6,
....
'att' => {},
'last_child' => ${$VAR1}->{'first_child'}->{'next_sibling'}->{'next_sibling'}->{'next_sibling'}->{'next_sibling'}->{'next_sibling'}->{'next_sibling'}->{'next_sibling'}->{'next_sibling'}->{'next_sibling'}->{'next_sibling'}->{'next_sibling'}->{'next_sibling'}->{'next_sibling'}->{'next_sibling'}->{'next_sibling'}->{'next_sibling'}->{'next_sibling'}->{'next_sibling'}->{'next_sibling'}->{'next_sibling'}->{'next_sibling'}->{'next_sibling'}->{'next_sibling'},
'gi' => 7
}, 'XML::Twig::Elt' );
But it already encapsulates the information you need, as well as the structure you require - that's why XML::Twig is using it. And is in no small part going to illustrate why forcing your data into a hash/array, you're going to lose data.

Read config hash-like data into perl hash

I have a config file config.txt like
{sim}{time}{end}=63.1152e6;
{sim}{output}{times}=[2.592e6,31.5576e6,63.1152e6];
{sim}{fluid}{comps}=[ ['H2O','H_2O'], ['CO2','CO_2'],['NACL','NaCl'] ];
I would like to read this into a perl hash,
my %h=read_config('config.txt');
I have checked out module Config::Hash , but it does not offer the same input file format.
Can roll your own. Uses Data::Diver for traversing the hash, but could do that manually as well.
use strict;
use warnings;
use Data::Diver qw(DiveVal);
my %hash;
while (<DATA>) {
chomp;
my ($key, $val) = split /\s*=\s*/, $_, 2;
my #keys = $key =~ m/[^{}]+/g;
my $value = eval $val;
die "Error in line $., '$val': $#" if $#;
DiveVal(\%hash, #keys) = $value;
}
use Data::Dump;
dd \%hash;
__DATA__
{sim}{time}{end}=63.1152e6;
{sim}{output}{times}=[2.592e6,31.5576e6,63.1152e6];
{sim}{fluid}{comps}=[ ['H2O','H_2O'], ['CO2','CO_2'],['NACL','NaCl'] ];
Outputs:
{
sim => {
fluid => { comps => [["H2O", "H_2O"], ["CO2", "CO_2"], ["NACL", "NaCl"]] },
output => { times => [2592000, 31557600, 63115200] },
time => { end => 63115200 },
},
}
Would be better if you could come up with a way to not utilize eval, but not knowing your data, I can't accurately suggest an alternative.
Better Alternative, use a JSON or YAML
If you're picking the data format yourself, I'd advise using JSON or YAML for saving and loading your config data.
use strict;
use warnings;
use JSON;
my %config = (
sim => {
fluid => { comps => [["H2O", "H_2O"], ["CO2", "CO_2"], ["NACL", "NaCl"]] },
output => { times => [2592000, 31557600, 63115200] },
time => { end => 63115200 },
},
);
my $string = encode_json \%config;
## Save the string to a file, and then load below:
my $loaded_config = decode_json $string;
use Data::Dump;
dd $loaded_config;

parse all arguments and store to hash

How can i parse all the arbitrary arguments to a hash without specifying the argument names inside my perl script.
Running command with below argument should give hash like below.
-arg1=first --arg2=second -arg3 -arg4=2.0013 -arg5=100
{
'arg2' => 'second',
'arg1' => 'first',
'arg4' => '2.0013',
'arg3' => 1,
'arg5' => 100
};
This can be achieved using Getopt::Long as below
GetOptions(\%hash,
"arg1=s",
"arg2=s",
"arg3",
"arg4=f",
"arg5=i");
However, my argument list is too long and i don't want to specify argument names in GetOptions.
So a call to GetOptions with only hash as a parameter should figure out what arguments are (and their type integer/string/floats/lone arguments) and just create a hash.
There are a lot of Getopt modules. The following are some that will just slurp everything into a hash like you desire:
Getopt::Mini
Getopt::Whatever
Getopt::Casual
I personally would never do something like this though, and have no real world experience with any of these modules. I'd always aim to validate every script for both error checking and as a means to self-document what the script is doing and uses.
Try this:
#!/usr/bin/perl
use warnings;
use strict;
use Data::Dumper;
sub getOptions {
my (%opts, #args);
while (#_) {
my $opt = shift;
if ($opt =~ /^-/) {
if ($opt =~ /-+([^=]+)(?:=(.+))?/) {
$opts{$1} = $2 ? $2 : 1;
}
}
else {
push #args, $opt;
}
}
return (\%opts, \#args);
}
my ($opts, $args) = getOptions(#ARGV);
print Dumper($opts, $args);
Testing:
$ perl t.pl -arg1=first --arg2=second -arg3 -arg4=2.0013 -arg5=100 datafile
$VAR1 = {
'arg2' => 'second',
'arg1' => 'first',
'arg4' => '2.0013',
'arg3' => 1,
'arg5' => '100'
};
$VAR2 = [
'datafile'
];
This will work as expected for your example,
my %hash = map { s/^-+//; /=/ ? split(/=/, $_, 2) : ($_ =>1) } #ARGV;

Hash doesn't print in Perl

I have a hash:
while( my( $key, $value ) = each %sorted_features ){
print "$key: $value\n";
}
but I cannot obtain the correct value for $value. It gives me:
intron: ARRAY(0x3430440)
source: ARRAY(0x34303b0)
exon: ARRAY(0x34303f8)
sig_peptide: ARRAY(0x33f0a48)
mat_peptide: ARRAY(0x3430008)
Why is it?
Your values are array references. You need to do something like
while( my( $key, $value ) = each %sorted_features ) {
print "$key: #$value\n";
}
In other words, dereference the reference. If you are unsure what your data looks like, a good idea is to use the Data::Dumper module:
use Data::Dumper;
print Dumper \%sorted_features;
You will see something like:
$VAR1 = {
'intron' => [
1,
2,
3
]
};
Where { denotes the start of a hash reference, and [ an array reference.
You can use also Data::Dumper::Pertidy which runs the output of Data::Dump through Perltidy.
#!/usr/bin/perl -w
use strict;
use Data::Dumper::Perltidy;
my $data = [{title=>'This is a test header'},{data_range=>
[0,0,3, 9]},{format => 'bold' }];
print Dumper $data;
Prints:
$VAR1 = [
{ 'title' => 'This is a test header' },
{ 'data_range' => [ 0, 0, 3, 9 ] },
{ 'format' => 'bold' }
];
Your hash values are array references. You need to write additional code to display the contents of these arrays, but if you are just debugging then it is probably simpler to use Data::Dumper like this
use Data::Dumper;
$Data::Dumper::Useqq = 1;
print Dumper \%sorted_features;
And, by the way, the name %sorted_features of your hash worries me. hashes are inherently unsorted, and the order that each retrieves the elements is essentially random.