How to ignore hash's key sequence in perl - perl

I has a file that contain information like below
1:2=14
3:4=15
2:1=16
4:3=17
I would like to create a hash that treat key {1:2} same with key {2:1}
$hash{1:2} = $hash{2:1}
so when i print all element in $hash{1:2} or $hash{2:1} it give me both 14 and 16 as the result. Is it a simple way to do that?
Below are my planning to create the key for hash
for ( my $i = 0; $i <=$#file ; $i++) {
my $line = $file[$i];
my #key = split ("=",$line);
my $hash{$key[0]} = $key[1];
}

There is no built-in way to do that. If your keys follow the same pattern, you can wrap it in a function, then compute all keys that fit your algorithm and get the values.
my %hash = (
'1:2' => 14,
'3:4' => 15,
'2:1' => 16,
'4:3' => 17,
);
sub get_elements {
my ($key) = #_;
return $hash{$key}, $hash{reverse $key}; # or $key =~ s/(\d+):(\d+)/$2:$1/r
}
print join ' ', get_elements('1:2');
This uses string reverse, and obviously only works if the parts of the key are only one digit. If you can have numbers greater than 9 you will have to split the string and re-assemble it, or use a substitution to switch them around. My example uses the /r modifier, which needs Perl 5.14 or higher.
If you want to however build a data structure when reading your file that takes care of this automatically, you can do that too. Use an array reference instead of the simple values inside your hash, and assign the same reference to all keys that you want to be equal.
use strict;
use warnings;
use feature 'say';
use Data::Dumper;
my %hash;
while (my $line = <DATA>) {
chomp $line;
my ($key, $value) = split /=/, $line;
my ($left, $right) = split /:/, $key;
# $hash{$key} //= ( $hash{"$right:$left"} // [] ); # needs 5.10
$hash{$key} = ( $hash{"$right:$left"} || [] )
unless exists $hash{$key};
push #{ $hash{$key} }, $value;
}
say "#{ $hash{'1:2'} }";
say "#{ $hash{'2:1'} }";
print Dumper \%hash;
__DATA__
1:2=14
3:4=15
2:1=16
4:3=17
The output of this code is
14 16
14 16
The Data::Dumper structure looks like this, which explains that both keys 2:1 and 1:2 point to the same array reference. That means that if you push another value into one of them, it will end up in the other as well, because they are actually the same thing.
$VAR1 = {
'4:3' => [
'15',
'17'
],
'3:4' => $VAR1->{'4:3'},
'2:1' => [
'14',
'16'
],
'1:2' => $VAR1->{'2:1'}
};
The only downside of this is that you cannot get the order of the elements back. This data structure looses the knowledge that 1:2 had the value 14 and 2:1 had 16 initially. If you want 2:1 to output 16 14 this will not work.

Related

Maintain order within a hash of hashes, and output as .csv

Here's my code:
my %hash = (
'2564' => {
'st_responsible' => 'mname1',
'critical' => '',
'last_modified_by' => 'teamname1',
'transstatus' => '',
'rt_res' => 'pname1'
},
'2487' => {
'st_responsible' => 'mname2',
'critical' => '',
'last_modified_by' => 'teamname2',
'transstatus' => '',
'rt_res' => ''
}
);
print "xnum,st_responsible,critical,last_modified_by,transstatus,rt_res\n";
foreach my $x_number (sort keys %hash)
{
print "$x_number";
foreach my $element (keys %{$hash{$x_number}})
{
print ",$hash{$x_number}{$element}";
}
print "\n";
}
Expected output
xnum,st_responsible,critical,last_modified_by,transstatus,rt_res
2487,mname2,,teamname2,,
2564,mname1,,teamname1,,pname1
Actual output
xnum,st_responsible,critical,last_modified_by,transstatus,rt_res
2487,mname2,,,teamname2,
2564,mname1,,,teamname1,pname1
Please help in letting me know as to how exactly do I preserve the order of this data structure, and then write this to a CSV file.
I would suggest that for this, you'd be better off doing this with a slice, which is a way of extracting a list of values from a hash in a particular order?
#configure field order
my #order = qw ( st_responsible critical last_modified_by transstatus rt_res );
#print header row
print join (",", "xnum", #order ),"\n";
#iterate the rows
foreach my $key ( sort keys %hash ) {
#extract hash slice and join it with commas
print join ( ",", $key, #{$hash{$key}}{#order} ),"\n";
}
This gives:
xnum,st_responsible,critical,last_modified_by,transstatus,rt_res
2487,mname2,,teamname2,,
2564,mname1,,teamname1,,pname1
You can consider Text::CSV - but I'd suggest in this scenario it's overkill, best used when you've got quotes and quoted field separators to worry about. (And you don't).
If you have to deal with not just empty keys, but missing ones, you can make use of map:
my #order = qw ( st_responsible critical last_modified_by
transstatus missing rt_res extra_field_here );
print join (",", "xnum", #order ),"\n";
foreach my $key ( sort keys %hash ) {
print join ( ",", $key, map { $_ // '' } #{$hash{$key}}{#order} ),"\n";
}
(Otherwise you'll get a warning about an undef value).
Perl doesn't guarantee the order of items in the hash, this is the root cause of the issue. Even two different hashes with the same keys can have different order of keys. It may also differ from platform to platform and architecture and perl version.
You need to define another array with the list of keys which you want to print in correct order.
my #keys = qw(st_responsible critical last_modified_by transstatus rt_res);
foreach my $element (#keys) {
... print the value
}
EDIT: As you're trying to write CSV file, consider using Text::CSV which takes care about special characters, correct formatting and things like that.
There's probably a slicker way of achieving this, but give this a go:
use warnings;
use strict;
open my $csv_out, '>', 'out.csv' or die $!;
my #keys = qw(2487 2564);
my #vals = qw(st_responsible critical last_modified_by transstatus rt_res);
print $csv_out "xnum,st_responsible,critical,last_modified_by,transstatus,rt_res\n";
for my $key (#keys){
print $csv_out "$key,";
for my $vals (#vals){
$vals eq $vals[-1] ? print $csv_out "$hash{$key}{$vals}\n" : print $csv_out "$hash{$key}{$vals},";
}
}
This will print out comma-separated values to a csv file out.csv maintaining your original order (by iterating over arrays). If it's the last value it will print a newline.
--- OUTPUT ---
xnum,st_responsible,critical,last_modified_by,transstatus,rt_res
2487,mname2,,teamname2,,
2564,mname1,,teamname1,,pname1

Perl list all keys in hash with identical values

If I have a colon-delimited file name FILE and I do:
cat FILE|perl -F: -lane 'my %hash = (); $hash{#F[0]} = #F[2]'
to assign the first and 3rd tokens as the key => value pairs for the hash..
1) Is that a sane way to assign key value pairs to a hash?
2) What is the simplest way to now find all keys with shared values and list them?
Assume FILE looks like:
Mike:34:Apple:Male
Don:23:Corn:Male
Jared:12:Apple:Male
Beth:56:Maize:Female
Sam:34:Apple:Male
David:34:Apple:Male
Desired Output: Keys with value "Apple": Mike,Jared,David,Sam
Your example won't work as you want because the -n option puts a while loop around your one-line program, so the hash you declare is created and destoyed for every record in the file. You could get around that by not declaring the hash, and so making it a persistent package variable which will retain all values stored in it.
You can then write push #{ $hash{$F[2]} }, $F[0] but notice that it should be $F[0] etc. and not #F[0], and I have used push to create a list of column 1 values for each column 3 value instead of just a list of one-to-one values relating each column 1 value with its column 3 value.
To clarify, your method produces a hash looking like this, which has to be searched to produce the display that you want.
(
Beth => "Maize",
David => "Apple",
Don => "Corn",
Jared => "Apple",
Mike => "Apple",
Sam => "Apple",
)
while mine creates this, which as you can see is pretty much already in the form you want.
(
Apple => ["Mike", "Jared", "Sam", "David"],
Corn => ["Don"],
Maize => ["Beth"],
)
But I think this problem is a bit too big to be solved with a one-line Perl program. The solution below expects the path to the input file as a command-line parameter, like this
> perl prog.pl colons.csv
but it will default to myfile.csv if no file is specified.
use strict;
use warnings;
our #ARGV = 'myfile.csv' unless #ARGV;
my %data;
while (<>) {
my #fields = split /:/;
push #{ $data{$fields[2]} }, $fields[0];
}
while (my ($k, $v) = each %data) {
next unless #$v > 1;
printf qq{Keys with value "%s": %s\n}, $k, join ', ', #$v;
}
output
Keys with value "Apple": Mike, Jared, Sam, David
use strict;
use warnings;
open my $in, '<', 'in.txt';
my %data;
while(<$in>){
chomp;
my #split = split/:/;
$data{$split[0]} = $split[2];
}
my $query = 'Apple';
print "Keys with value $query = ";
foreach my $name (keys %data){
print "$name " if $data{$name} eq $query;
}
print "\n";
Arrays are used to hold list of values, so use an array.
perl -F: -lane'
push #{ $h{$F[2]} }, $F[0];
END {
for my $fruit (keys %h) {
next if #{ $h{$fruit} } < 2;
print "$fruit: ", join(",", #{ $h{$fruit} });
}
}
' FILE
The END block is executed on exit. In it, we iterate over the keys of the hash. If the value of the current hash element is an array with only one element, it's skipped. Otherwise, we prints the key followed by contents of the array referenced by the hash element.
Here is another way:
perl -F: -lane'
push #{ $h{$F[2]} }, $F[0];
}{
print "$_: ", join(",", #{ $h{$_} }) for grep { #{$h{$_}} > 1 } keys %h;
' file
We read each line and create hash of arrays using third column as key and first column as list of values for matching key. In the END block we iterate over our hash using grep and filter keys whose array count greater than 1 and print the key followed by array elements.
It doesn't have to be a one liner,
Good. It's not going to be...
Is that a sane way to assign key value pairs to a hash?
You're simply assigning the key value pairs as:
$hash{"key"} = "value";
Which is about as simple as it gets. There might be a way of doing it via map. However, the main issue I see is what should happen if you have duplicate keys.
Let's say your file looks like this:
Mike:34:Apple:Male
Don:23:Corn:Male
Jared:12:Apple:Male
Beth:56:Maize:Female
Sam:34:Apple:Male
David:34:Apple:Male # Note this entry is here twice!
David:35:Wheat:Male # Note this entry is here twice!
Let's do a simple assignment loop:
my %hash;
while my $line ( <$fh> ) {
chomp $line;
my ($name, $age, $category, $sex) = split /:/, $line;
$hash{$name} = $category;
}
When you get to $hash{David}, it will first be set to Apple, but then you change the value to Wheat. There are four ways you can handle this:
Use whatever the last value is. No change in the loop.
Use the first value and ignore subsequent values. Simple enough to do.
If that happens, it's an error. Abort the program and report the error.
Keep all values.
This last one is the most interesting because it involves a reference to an array as the values for your hash:
my %hash;
while my $line ( <$fh> ) {
chomp $line;
my ($name, $age, $category, $sex) = split /:/, $line;
$hash{$name} = [] if not exists $hash{$name}; # I'm making this an array reference
push #{ $hash{$name} }, $category;
}
Now, each value in my hash is a reference to an array:
my #values = #{ $hash{David} ); # The values of David...
print "David is in categories " . join ( ", ", #values ) . "\n";
This will print out David is in categories Wheat, Apple
What is the simplest way to now find all keys with shared values and list them?
The easiest way is to create a second hash that's keyed by your value. In this hash, you will need to use an array reference. Let's assume no duplicate names for now:
my %hash;
my %indexed_hash;
while my $line ( <$fh> ) {
chomp $line;
my ($name, $age, $category, $sex) = split /:/, $line;
$hash{$name} = $category;
my $indexed_hash{$category} = [] if not exist $indexed_hash{$category};
push #{ $indexed_hash{$category} }, $name;
}
Now, if I want to find all the duplicates of Apple:
my #names = #{ $indexed_hash{Apple} };
print "The following are in 'Apple': " . join ( ", " #names ) . "\n";
Since we're getting into references, we could take things a step further and store all of your values of your file in your hash. Again, for simplicity, I am assuming that you will have one and only one entry per name:
my %hash;
while my $line ( <$fh> ) {
chomp $line;
my ($name, $age, $category, $sex) = split /:/, $line;
$hash{$name}->{AGE} = $age;
$hash{$name}->{CATEGORY} = $category;
$hash{$name}->{SEX} = $sex;
}
for my $name ( sort keys %hash ) {
print "$name Information:\n";
print " Age: " . $hash{$name}->{AGE} . "\n";
printf "Category: %s\n", $hash{$name}->{CATEGORY};
print " Sex: #{[$hash{$name}->{SEX}]}\n\n";
}
That last two statements are easier ways of interpolating complex data structures into a string. The printf is fairly clear. The second #{[...]} is a neat little trick.
What have you tried?
If you reverse the hash into a list of value => key pairs then use List::Util's pairs() against the list, you can transform the hash into a hash of values => key arrayrefs. i.e. ( foo => [ 'bar', 'baz' ] ), grep {#{$hash{$_}} > 1} keys %hash, and print the results.

Build hash of hash in perl

I'm new to using perl and I'm trying to build a hash of a hash from a tsv. My current process is to read in a file and construct a hash and then insert it into another hash.
my %hoh = ();
while (my $line = <$tsv>)
{
chomp $line;
my %hash;
my #data = split "\t", $line;
my $id;
my $iter = each_array(#columns, #data);
while(my($k, $v) = $iter->())
{
$hash{$k} = $v;
if($k eq 'Id')
{
$id = $v;
}
}
$hoh{$id} = %hash;
}
print "dump: ", Dumper(%hoh);
This outputs:
dump
$VAR1 = '1234567890';
$VAR2 = '17/32';
$VAR3 = '1234567891';
$VAR4 = '17/32';
.....
Instead of what I would expect:
dump
{
'1234567890' => {
'k1' => 'v1',
'k2' => 'v2',
'k3' => 'v3',
'k4' => 'v4',
'id' => '1234567890'
},
'1234567891' => {
'k1' => 'v1',
'k2' => 'v2',
'k3' => 'v3',
'k4' => 'v4',
'id' => '1234567891'
},
........
};
My limited understanding is that when I do $hoh{$id} = %hash; its inserting in a reference to %hash? What am I doing wrong? Also is there a more succint way to use my columns and data array's as key,value pairs into my %hash object?
-Thanks in advance,
Niru
To get a reference, you have to use \:
$hoh{$id} = \%hash;
%hash is the hash, not the reference to it. In scalar context, it returns the string X/Y wre X is the number of used buckets and Y the number of all the buckets in the hash (i.e. nothing useful).
To get a reference to a hash variable, you need to use \%hash (as choroba said).
A more succinct way to assign values to columns is to assign to a hash slice, like this:
my %hoh = ();
while (my $line = <$tsv>)
{
chomp $line;
my %hash;
#hash{#columns} = split "\t", $line;
$hoh{$hash{Id}} = \%hash;
}
print "dump: ", Dumper(\%hoh);
A hash slice (#hash{#columns}) means essentially the same thing as ($hash{$columns[0]}, $hash{$columns[1]}, $hash{$columns[2]}, ...) up to however many columns you have. By assigning to it, I'm assigning the first value from split to $hash{$columns[0]}, the second value to $hash{$columns[1]}, and so on. It does exactly the same thing as your while ... $iter loop, just without the explicit loop (and it doesn't extract the $id).
There's no need to compare each $k to 'Id' inside a loop; just store it in the hash as a normal field and extract it afterwards with $hash{Id}. (Aside: Is your column header Id or id? You use Id in your loop, but id in your expected output.)
If you don't want to keep the Id field in the individual entries, you could use delete (which removes the key from the hash and returns the value):
$hoh{delete $hash{Id}} = \%hash;
Take a look at the documentation included in Perl. The command perldoc is very helpful. You can also look at the Perldoc webpage too.
One of the tutorials is a tutorial on Perl references. It all help clarify a lot of your questions and explain about referencing and dereferencing.
I also recommend that you look at CPAN. This is an archive of various Perl modules that can do many various tasks. Look at Text::CSV. This module will do exactly what you want, and even though it says "CSV", it works with tab separated files too.
You missed putting a slash in front of your hash you're trying to make a reference. You have:
$hoh{$id} = %hash;
Probably want:
$hoh{$id} = \%hash;
also, when you do a Data::Dumper of a hash, you should do it on a reference to a hash. Internally, hashes and arrays have similar structures when a Data::Dumper dump is done.
You have:
print "dump: ", Dumper(%hoh);
You should have:
print "dump: ", Dumper( \%hoh );
My attempt at the program:
#! /usr/bin/env perl
#
use warnings;
use strict;
use autodie;
use feature qw(say);
use Data::Dumper;
use constant {
FILE => "test.txt",
};
open my $fh, "<", FILE;
#
# First line with headers
#
my $line = <$fh>;
chomp $line;
my #headers = split /\t/, $line;
my %hash_of_hashes;
#
# Rest of file
#
while ( my $line = <$fh> ) {
chomp $line;
my %line_hash;
my #values = split /\t/, $line;
for my $index ( ( 0..$#values ) ) {
$line_hash{ $headers[$index] } = $values[ $index ];
}
$hash_of_hashes{ $line_hash{id} } = \%line_hash;
}
say Dumper \%hash_of_hashes;
You should only store a reference to a variable if you do so in the last line before the variable goes go of scope. In your script, you declare %hash inside the while loop, so placing this statement as the last in the loop is safe:
$hoh{$id} = \%hash;
If it's not the last statement (or you're not sure it's safe), create an anonymous structure to hold the contents of the variable:
$hoh{$id} = { %hash };
This makes a copy of %hash, which is slower, but any subsequent changes to it will not effect what you stored.

CSV into hash

I have a csv with the first column a label followed by comma separated values:
LabelA,45,56,78,90
LabelB,56,65,43,32
LabelC,56,87,98,45
I'd like the first column (LabelA etc) to be the Key in a hash with the numeric values in an array.
I can read the file into an array or scalar but I'm not sure what to do after that. Suggestions??
Edit:
Ok, so it looks like this assigns the value to a key ..but what about the comma delimited numbers in my example? Where are they going? Are they in %hash ? If so could you maybe dumb down your explanation even further? Thanks.
Personally, I like the Text::CSV_XS and IO::File module:
use Text::CSV_XS;
use IO::File;
# Usage example:
my $hash_ref = csv_file_hashref('some_file.csv');
foreach my $key (sort keys %{$hash_ref}){
print qq{$key: };
print join q{,}, #{$hash_ref->{$key}};
print qq{\n};
}
# Implementation:
sub csv_file_hashref {
my ($filename) = #_;
my $csv_fh = IO::File->new($filename, 'r');
my $csv = Text::CSV_XS->new ();
my %output_hash;
while(my $colref = $csv->getline ($csv_fh))
{
$output_hash{shift #{$colref}} = $colref;
}
return \%output_hash;
}
Well, let's assume that there are no special characters and so on.
First you open the file:
open my $fh, '<', 'some.file.csv' or die "Cannot open: $!";
Then you read from it in loop:
while (my $line = <$fh>) {
Afterwards, you remove trailing white characters (\n and others):
$line =~ s/\s*\z//;
And split it into array:
my #array = split /,/, $line;
When it's in array, you get first element off the array:
my $key = shift #array;
And store it in hash:
$hash{$key} = \#array;
(\#array means reference to array).
Whole code:
my %hash;
open my $fh, '<', 'some.file.csv' or die "Cannot open: $!";
while (my $line = <$fh>) {
$line =~ s/\s*\z//;
my #array = split /,/, $line;
my $key = shift #array;
$hash{$key} = \#array;
}
close $fh;
See perlfunc split and perldsc.
Read each line.
Chomp it.
Split it on commas.
Use the first value in the result as the key to your HoA.
The other values become the array.
Store a ref to the array in the hash under the key.
...
Profit!!!
Make a hash of array references:
Your data structure should look like this:
my %foo = (
LabelA => [ 2, 3, 56, 78, 90 ],
LabelB => [ 65, 45, 23, 34, 87 ],
LabelC => [ 67, 34, 56, 67, 98 ],
);
Text::CSV::Hashify
Turn a CSV file into a Perl hash:
# Simple functional interface
use Text::CSV::Hashify;
$hash_ref = hashify('/path/to/file.csv', 'primary_key');
# Object-oriented interface
use Text::CSV::Hashify;
$obj = Text::CSV::Hashify->new( {
file => '/path/to/file.csv',
format => 'hoh', # hash of hashes, which is default
key => 'id', # needed except when format is 'aoh'
max_rows => 20, # number of records to read; defaults to all
... # other key-value pairs possible for Text::CSV
} );
# all records requested
$hash_ref = $obj->all;
I think this can also work easier.
The $refhashvariable will be a reference to an array of hashes.
Each hash contain the headers (as hashkey) and the values of one CSV line.
The array contains the hashes for all lines in the CSV.
use Text::CSV_XS qw( csv );
$refhashvariable = csv(
in => "$input_csv_filename",
sep => ';',
headers => "auto"
); # as array of hash
This works for me. I have not tried en case the CSV does not have headers.

How can I convert these strings to a hash in Perl?

I wish to convert a single string with multiple delimiters into a key=>value hash structure. Is there a simple way to accomplish this? My current implementation is:
sub readConfigFile() {
my %CONFIG;
my $index = 0;
open(CON_FILE, "config");
my #lines = <CON_FILE>;
close(CON_FILE);
my #array = split(/>/, $lines[0]);
my $total = #array;
while($index < $total) {
my #arr = split(/=/, $array[$index]);
chomp($arr[1]);
$CONFIG{$arr[0]} = $arr[1];
$index = $index + 1;
}
while ( ($k,$v) = each %CONFIG ) {
print "$k => $v\n";
}
return;
}
where 'config' contains:
pub=3>rec=0>size=3>adv=1234 123 4.5 6.00
pub=1>rec=1>size=2>adv=111 22 3456 .76
The last digits need to be also removed, and kept in a separate key=>value pair whose name can be 'ip'. (I have not been able to accomplish this without making the code too lengthy and complicated).
What is your configuration data structure supposed to look like? So far the solutions only record the last line because they are stomping on the same hash keys every time they add a record.
Here's something that might get you closer, but you still need to figure out what the data structure should be.
I pass in the file handle as an argument so my subroutine isn't tied to a particular way of getting the data. It can be from a file, a string, a socket, or even the stuff below DATA in this case.
Instead of fixing things up after I parse the string, I fix the string to have the "ip" element before I parse it. Once I do that, the "ip" element isn't a special case and it's just a matter of a double split. This is a very important technique to save a lot of work and code.
I create a hash reference inside the subroutine and return that hash reference when I'm done. I don't need a global variable. :)
use warnings;
use strict;
use Data::Dumper;
readConfigFile( \*DATA );
sub readConfigFile
{
my( $fh ) = shift;
my $hash = {};
while( <$fh> )
{
chomp;
s/\s+(\d*\.\d+)$/>ip=$1/;
$hash->{ $. } = { map { split /=/ } split />/ };
}
return $hash;
}
my $hash = readConfigFile( \*DATA );
print Dumper( $hash );
__DATA__
pub=3>rec=0>size=3>adv=1234 123 4.5 6.00
pub=1>rec=1>size=2>adv=111 22 3456 .76
This gives you a data structure where each line is a separate record. I choose the line number of the record ($.) as the top-level key, but you can use anything that you like.
$VAR1 = {
'1' => {
'ip' => '6.00',
'rec' => '0',
'adv' => '1234 123 4.5',
'pub' => '3',
'size' => '3'
},
'2' => {
'ip' => '.76',
'rec' => '1',
'adv' => '111 22 3456',
'pub' => '1',
'size' => '2'
}
};
If that's not the structure you want, show us what you'd like to end up with and we can adjust our answers.
I am assuming that you want to read and parse more than 1 line. So, I chose to store the values in an AoH.
#!/usr/bin/perl
use strict;
use warnings;
my #config;
while (<DATA>) {
chomp;
push #config, { split /[=>]/ };
}
for my $href (#config) {
while (my ($k, $v) = each %$href) {
print "$k => $v\n";
}
}
__DATA__
pub=3>rec=0>size=3>adv=1234 123 4.5 6.00
pub=1>rec=1>size=2>adv=111 22 3456 .76
This results in the printout below. (The while loop above reads from DATA.)
rec => 0
adv => 1234 123 4.5 6.00
pub => 3
size => 3
rec => 1
adv => 111 22 3456 .76
pub => 1
size => 2
Chris
The below assumes the delimiter is guaranteed to be a >, and there is no chance of that appearing in the data.
I simply split each line based on '>'. The last value will contain a key=value pair, then a space, then the IP, so split this on / / exactly once (limit 2) and you get the k=v and the IP. Save the IP to the hash and keep the k=v pair in the array, then go through the array and split k=v on '='.
Fill in the hashref and push it to your higher-scoped array. This will then contain your hashrefs when finished.
(Having loaded the config into an array)
my #hashes;
for my $line (#config) {
my $hash; # config line will end up here
my #pairs = split />/, $line;
# Do the ip first. Split the last element of #pairs and put the second half into the
# hash, overwriting the element with the first half at the same time.
# This means we don't have to do anything special with the for loop below.
($pairs[-1], $hash->{ip}) = (split / /, $pairs[-1], 2);
for (#pairs) {
my ($k, $v) = split /=/;
$hash->{$k} = $v;
}
push #hashes, $hash;
}
The config file format is sub-optimal, shall we say. That is, there are easier formats to parse and understand. [Added: but the format is already defined by another program. Perl is flexible enough to deal with that.]
Your code slurps the file when there is no real need.
Your code only pays attention to the last line of data in the file (as Chris Charley noted while I was typing this up).
You also have not allowed for comment lines or blank lines - both are a good idea in any config file and they are easy to support. [Added: again, with the pre-defined format, this is barely relevant, but when you design your own files, do remember it.]
Here's an adaptation of your function into somewhat more idiomatic Perl.
#!/bin/perl -w
use strict;
use constant debug => 0;
sub readConfigFile()
{
my %CONFIG;
open(CON_FILE, "config") or die "failed to open file ($!)\n";
while (my $line = <CON_FILE>)
{
chomp $line;
$line =~ s/#.*//; # Remove comments
next if $line =~ /^\s*$/; # Ignore blank lines
foreach my $field (split(/>/, $line))
{
my #arr = split(/=/, $field);
$CONFIG{$arr[0]} = $arr[1];
print ":: $arr[0] => $arr[1]\n" if debug;
}
}
close(CON_FILE);
while (my($k,$v) = each %CONFIG)
{
print "$k => $v\n";
}
return %CONFIG;
}
readConfigFile; # Ignores returned hash
Now, you need to explain more clearly what the structure of the last field is, and why you have an 'ip' field without the key=value notation. Consistency makes life easier for everybody. You also need to think about how multiple lines are supposed to be handled. And I'd explore using a more orthodox notation, such as:
pub=3;rec=0;size=3;adv=(1234,123,4.5);ip=6.00
Colon or semi-colon as delimiters are fairly conventional; parentheses around comma separated items in a list are not an outrageous convention. Consistency is paramount. Emerson said "A foolish consistency is the hobgoblin of little minds, adored by little statesmen and philosophers and divines", but consistency in Computer Science is a great benefit to everyone.
Here's one way.
foreach ( #lines ) {
chomp;
my %CONFIG;
# Extract the last digit first and replace it with an end of
# pair delimiter.
s/\s*([\d\.]+)\s*$/>/;
$CONFIG{ip} = $1;
while ( /([^=]*)=([^>]*)>/g ) {
$CONFIG{$1} = $2;
}
print Dumper ( \%CONFIG );
}