Read config hash-like data into perl hash - perl

I have a config file config.txt like
{sim}{time}{end}=63.1152e6;
{sim}{output}{times}=[2.592e6,31.5576e6,63.1152e6];
{sim}{fluid}{comps}=[ ['H2O','H_2O'], ['CO2','CO_2'],['NACL','NaCl'] ];
I would like to read this into a perl hash,
my %h=read_config('config.txt');
I have checked out module Config::Hash , but it does not offer the same input file format.

Can roll your own. Uses Data::Diver for traversing the hash, but could do that manually as well.
use strict;
use warnings;
use Data::Diver qw(DiveVal);
my %hash;
while (<DATA>) {
chomp;
my ($key, $val) = split /\s*=\s*/, $_, 2;
my #keys = $key =~ m/[^{}]+/g;
my $value = eval $val;
die "Error in line $., '$val': $#" if $#;
DiveVal(\%hash, #keys) = $value;
}
use Data::Dump;
dd \%hash;
__DATA__
{sim}{time}{end}=63.1152e6;
{sim}{output}{times}=[2.592e6,31.5576e6,63.1152e6];
{sim}{fluid}{comps}=[ ['H2O','H_2O'], ['CO2','CO_2'],['NACL','NaCl'] ];
Outputs:
{
sim => {
fluid => { comps => [["H2O", "H_2O"], ["CO2", "CO_2"], ["NACL", "NaCl"]] },
output => { times => [2592000, 31557600, 63115200] },
time => { end => 63115200 },
},
}
Would be better if you could come up with a way to not utilize eval, but not knowing your data, I can't accurately suggest an alternative.
Better Alternative, use a JSON or YAML
If you're picking the data format yourself, I'd advise using JSON or YAML for saving and loading your config data.
use strict;
use warnings;
use JSON;
my %config = (
sim => {
fluid => { comps => [["H2O", "H_2O"], ["CO2", "CO_2"], ["NACL", "NaCl"]] },
output => { times => [2592000, 31557600, 63115200] },
time => { end => 63115200 },
},
);
my $string = encode_json \%config;
## Save the string to a file, and then load below:
my $loaded_config = decode_json $string;
use Data::Dump;
dd $loaded_config;

Related

Perl: overwrite structure member value

I am creating $input with this code:
push(#{$input->{$step}},$time);, then I save it in an xml file, and at the next compiling, I read it from that file. When i print it, i get the structure bellow.
if(-e $file)
my $input =XMLin($file...);
print Dumper $input;
and I get this structure
$VAR1 = {
'opt' => {
'step820' => '0',
'step190' => '0',
'step124' => '0',
}
};
for each step with it's time..
push(#{$input->{$step}},$time3);
XmlOut($file, $input);
If I run the program again, I get this structure:
$VAR1 = {
'opt' => {
'step820' => '0',
'step190' => '0',
'step124' => '0',
'opt' => {
'step820' => '0',
'step190' => '0',
'step124' => '0'
}
}
I just need to overwrite the values of steps(ex:$var1->opt->step820 = 2). How can i do that?
I just need to overwrite the values of steps(ex:$var1->opt->step820 = 2). How can i do that?
$input->{opt}->{step820} = 2;
I'm going to say what I always do, whenever someone posts something asking about XML::Simple - and that is that XML::Simple is deceitful - it isn't simple at all.
Why is XML::Simple "Discouraged"?
So - in your example:
#!/usr/bin/env perl
use strict;
use warnings;
use XML::Twig;
my $xml= XML::Twig->new->parsefile($file);
$xml -> get_xpath('./opt/step820',0)->set_text("2");
$xml -> print;
The problem is that XML::Simple is only any good for parsing the type of XML that you didn't really need XML for in the first place.
For more simple examples - have you considered using JSON for serialisation? As it more directly reflects the hash/array structure of native perl data types.
That way you can instead:
print {$output_fh} to_json ( $myconfig, {pretty=>1} );
And read it back in:
my $myconfig = from_json ( do { local $/; <$input_fh> });
Something like:
#!/usr/bin/env perl
use strict;
use warnings;
use JSON;
my $input;
my $time = 0;
foreach my $step ( qw ( step820 step190 step124 ) ) {
push(#{$input->{$step}},$time);
}
print to_json ( $input, {pretty=>1} );
Giving resultant JSON of:
{
"step190" : [
0
],
"step820" : [
0
],
"step124" : [
0
]
}
Although actually, I'd probably:
foreach my $step ( qw ( step820 step190 step124 ) ) {
$input->{$step} = $time;
}
print to_json ( $input, {pretty=>1} );
Which gives;
{
"step190" : 0,
"step124" : 0,
"step820" : 0
}
JSON uses very similar conventions to perl - in that {} denote key value pairs (hashes) and [] denote arrays.
Look at the RootName option of XMLout. By default, when "XMLout()" generates XML, the root element will be named 'opt'. This option allows you to specify an alternative name.
Specifying either undef or the empty string for the RootName option will produce XML with no root elements.

Read ini files without section names

I want to make a configuration file which hold some objects, like this (where of course none of the paramaters can be considered as a primary key)
param1=abc
param2=ghj
param1=bcd
param2=hjk
; always the sames parameters
This file could be read, lets say with Config::IniFiles, because it has a direct transcription into ini file, like this
[0]
param1=abc
param2=ghj
[1]
param1=bcd
param2=hjk
with, for example, something like
perl -pe 'if (m/^\s*$/ || !$section ) print "[", ($section++ || 0) , "]"'
And finish with
open my $fh, '<', "/path/to/config_file.ini" or die $!;
$cfg = Config::IniFiles->new( -file => $fh );
(...parse here the sections starting with 0.)
But, I here ask me some question about the thing becoming quite complex....
(A) Is There a way to transform the $fh, so that it is not required to execute the perl one-liner BEFORE reading the file sequentially? So, to transform the file during perl is actually reading it.
or
(B) Is there a module to read my wonderfull flat database? Or something approching? I let myslef said, that Gnu coreutils does this kind of flat file reading, but I cannot remember how.
You can create a simple subclass of Config::INI::Reader:
package MyReader;
use strict;
use warnings;
use base 'Config::INI::Reader';
sub new {
my $class = shift;
my $self = $class->SUPER::new( #_ );
$self->{section} = 0;
return $self;
}
sub starting_section { 0 };
sub can_ignore { 0 };
sub parse_section_header {
my ( $self, $line ) = #_;
return $line =~ /^\s*$/ ? ++$self->{section} : undef ;
}
1;
With your input this gives:
% perl -MMyReader -MData::Dumper -e 'print Dumper( MyReader->read_file("cfg") )'
$VAR1 = {
'1' => {
'param2' => 'hjk',
'param1' => 'bcd'
},
'0' => {
'param2' => 'ghj',
'param1' => 'abc'
}
};
You can use a variable reference instead of a file name to create a filehandle that reads from it:
use strict;
use warnings;
use autodie;
my $config = "/path/to/config_file.ini";
my $content = do {
local $/;
open my $fh, "<", $config;
"\n". <$fh>;
};
# one liner replacement
my $section = 0;
$content =~ s/^\s*$/ "\n[". $section++ ."]" /mge;
open my $fh, '<', \$content;
my $cfg = Config::IniFiles->new( -file => $fh );
# ...
You can store the modified data in a real file or a string variable, but I suggest that you use paragraph mode by setting the input record separator $/ to the empty string. Like this
use strict;
use warnings;
{
local $/ = ''; # Read file in "paragraphs"
my $section = 0;
while (<DATA>) {
printf "[%d]\n", $section++;
print;
}
}
__DATA__
param1=abc
param2=ghj
param1=bcd
param2=hjk
output
[0]
param1=abc
param2=ghj
[1]
param1=bcd
param2=hjk
Update
If you read the file into a string, adding section identifiers as above, then you can read the result directly into a Config::IniFiles object using a string reference, for instance
my $config = Config::IniFiles->new(-file => \$modified_contents)
This example shows the tie interface, which results in a Perl hash that contains the configuration information. I have used Data::Dump only to show the structure of the resultant hash.
use strict;
use warnings;
use Config::IniFiles;
my $config;
{
open my $fh, '<', 'config_file.ini' or die "Couldn't open config file: $!";
my $section = 0;
local $/ = '';
while (<$fh>) {
$config .= sprintf "[%d]\n", $section++;
$config .= $_;
}
};
tie my %config, 'Config::IniFiles', -file => \$config;
use Data::Dump;
dd \%config;
output
{
# tied Config::IniFiles
"0" => {
# tied Config::IniFiles::_section
param1 => "abc",
param2 => "ghj",
},
"1" => {
# tied Config::IniFiles::_section
param1 => "bcd",
param2 => "hjk",
},
}
You may want to perform operations on a flux of objects (as Powershell) instead of a flux of text, so
use strict;
use warnings;
use English;
sub operation {
# do something with objects
...
}
{
local $INPUT_RECORD_SEPARATOR = '';
# object are separated with empty lines
while (<STDIN>) {
# key value
my %object = ( m/^ ([^=]+) = ([[:print:]]*) $ /xmsg );
# key cannot have = included, which is the delimiter
# value are printable characters (one line only)
operation ( \%object )
}
A like also other answers.

get nbest key-value pairs hash table in Perl

I have this script that use a hash table:
#!/usr/bin/env perl
use strict; use warnings;
my $hash = {
'cat' => {
"félin" => '0.500000',
'chat' => '0.600000',
'chatterie' => '0.300000'
'chien' => '0.01000'
},
'rabbit' => {
'lapin' => '0.600000'
},
'canteen' => {
"ménagère" => '0.400000',
'cantine' => '0.600000'
}
};
my $text = "I love my cat and my rabbit canteen !\n";
foreach my $word (split "\s+", $text) {
print $word;
exists $hash->{$word}
and print "[" . join(";", keys %{ $hash->{$word} }) . "]";
print " ";
}
For now, I have this output:
I love my cat[chat;félin;chatterie;chien] and my rabbit[lapin] canteen[cantine;ménagère] !
I need to have the nbest key value according to the frequencies (stored in my hash). For example, I want to have the 3 best translations according to the frequencies like this:
I love my cat[chat;félin;chatterie] and my rabbit[lapin] canteen[cantine;ménagère] !
How can I change my code to take into account the frequencies of each values and also to print the nbest values ?
Thanks for your help.
The tidiest way to do this is to write a subroutine that returns the N most frequent translations for a given word. I have written best_n in the program below to do that. It uses rev_nsort_by from List::UtilsBy to do the sort succinctly. It isn't a core module, and so may well need to be installed.
I have also used an executable substitution to modify the string in-place.
use utf8;
use strict;
use warnings;
use List::UtilsBy qw/ rev_nsort_by /;
my $hash = {
'cat' => {
'félin' => '0.500000',
'chat' => '0.600000',
'chatterie' => '0.300000',
'chien' => '0.01000',
},
'rabbit' => {
'lapin' => '0.600000',
},
'canteen' => {
'ménagère' => '0.400000',
'cantine' => '0.600000',
}
};
my $text = "I love my cat and my rabbit canteen !\n";
$text =~ s{(\S+)}{
$hash->{$1} ? sprintf '[%s]', join(';', best_n($1, 3)) : $1;
}ge;
print $text;
sub best_n {
my ($word, $n) = #_;
my $item = $hash->{$word};
my #xlate = rev_nsort_by { $item->{$_} } keys %$item;
$n = $n > #xlate ? $#xlate : $n - 1;
#xlate[0..$n];
}
output
I love my [chat;félin;chatterie] and my [lapin] [cantine;ménagère] !

Hash doesn't print in Perl

I have a hash:
while( my( $key, $value ) = each %sorted_features ){
print "$key: $value\n";
}
but I cannot obtain the correct value for $value. It gives me:
intron: ARRAY(0x3430440)
source: ARRAY(0x34303b0)
exon: ARRAY(0x34303f8)
sig_peptide: ARRAY(0x33f0a48)
mat_peptide: ARRAY(0x3430008)
Why is it?
Your values are array references. You need to do something like
while( my( $key, $value ) = each %sorted_features ) {
print "$key: #$value\n";
}
In other words, dereference the reference. If you are unsure what your data looks like, a good idea is to use the Data::Dumper module:
use Data::Dumper;
print Dumper \%sorted_features;
You will see something like:
$VAR1 = {
'intron' => [
1,
2,
3
]
};
Where { denotes the start of a hash reference, and [ an array reference.
You can use also Data::Dumper::Pertidy which runs the output of Data::Dump through Perltidy.
#!/usr/bin/perl -w
use strict;
use Data::Dumper::Perltidy;
my $data = [{title=>'This is a test header'},{data_range=>
[0,0,3, 9]},{format => 'bold' }];
print Dumper $data;
Prints:
$VAR1 = [
{ 'title' => 'This is a test header' },
{ 'data_range' => [ 0, 0, 3, 9 ] },
{ 'format' => 'bold' }
];
Your hash values are array references. You need to write additional code to display the contents of these arrays, but if you are just debugging then it is probably simpler to use Data::Dumper like this
use Data::Dumper;
$Data::Dumper::Useqq = 1;
print Dumper \%sorted_features;
And, by the way, the name %sorted_features of your hash worries me. hashes are inherently unsorted, and the order that each retrieves the elements is essentially random.

How to put data from CSV file to Perl hash

I have Perl and CSV file with something like:
"Name","Lastname"
"Homer","Simpsons"
"Ned","Flanders"
In this CSV file I have header in the first line and in other lines there are
data.
I want to convert this CSV file to such Perl data:
[
{
Lastname => "Simpsons",
Name => "Homer",
},
{
Lastname => "Flanders",
Name => "Ned",
},
]
I've written the function that users Text::CSV and doing what I need.
Here is the sample script:
#!/usr/bin/perl
use strict;
use warnings FATAL => 'all';
use 5.010;
use utf8;
use open qw(:std :utf8);
use Text::CSV;
sub read_csv {
my ($filename) = #_;
my #first_line;
my $result;
my $csv = Text::CSV->new ({ binary => 1, auto_diag => 1 });
open my $fh, "<:encoding(utf8)", $filename or die "$filename: $!";
while (my $row = $csv->getline ($fh)) {
if (not #first_line) {
#first_line = #{$row};
} else {
push #{$result}, { map { $first_line[$_] => $row->[$_] } 0..$#first_line };
}
}
close $fh;
return $result;
}
my $data = read_csv('sample.csv');
This works fine but this function I want to use in several scripts. I'm
greatly suprised that Text::CSV doesn't have this feature.
My question. What should I do to simplify solving such tasks in the future for
me and others?
Should I use some Perl module from CPAN, should I try to add this function to
Text::CSV, or something else?
Huh? Why so complicated? First, we fetch the header outside of the loop:
my $headers = $csv->getline($fh) or die "no header";
Assign these to be the column names:
$csv->column_names(#$headers);
Then, each call to getline_hr will provide a hashref:
while (my $hashref = $csv->getline_hr($fh)) {
push #$result, $hashref;
}
We can also use getline_hr_all:
$result = $csv->getline_hr_all($fh);
In other words, it ain't complex, most pieces are already provided by Text::CSV, and it can be done in very few lines.
Also, a module like this seems to already exist: Text::CSV::Slurp. (note: reverse dependency search through metacpan is awesome)
It's probably not a standard feature because different people will want their CSV files parsed into different data structures.
Why not create your own module that wraps this function?
package CSVRead;
use strict;
use warnings;
use 5.010;
use open qw(:std :utf8);
use Text::CSV;
require Exporter;
our #ISA = qw(Exporter);
our #EXPORT = qw(read_csv);
sub read_csv {
my ($filename) = #_;
my #first_line;
my $result;
my $csv = Text::CSV->new ({ binary => 1, auto_diag => 1 });
open my $fh, "<:encoding(utf8)", $filename or die "$filename: $!";
while (my $row = $csv->getline ($fh)) {
if (not #first_line) {
#first_line = #{$row};
} else {
push #{$result}, { map { $first_line[$_] => $row->[$_] } 0..$#first_line };
}
}
close $fh;
return $result;
}
Then, use it like this:
#!/usr/bin/perl
use strict;
use warnings;
use 5.010;
use Data::Dumper;
use CSVRead;
my $data = read_csv('sample.csv');
say Dumper $data;