Build and Access a Complex Data Structure Using Perl - perl

I have a large one-dimensional hash with lots of data that I need to structure in such a way that I can sort it easily into a format that is the same each time the code executes.
Original Hash Data:
{
'datetime' => 'datetime value',
'param_name' => 'param name',
'param_value' => 'param value',
'category' => 'category name'
}
Current Data Structure:
{
'datetime value' => {
'category' => {
'param_name' => 'param name',
'param_value' => 'param value'
}
}
}
I can almost build this structure in code, except for every category, there could be multiple param_names and param_values with the same key name.
The problem I have is that if there are multiple param names/values, only the last pair is saved in the new data structure.
I know that keys have to be unique, so I'm not quite sure how to resolve this as of yet.
Once the structure is built, I then need to understand how to sort the data based on datetime, then param_name so that the order is always the same in the output.

Looking at the difference between your first and second example, I think you have your structure a bit off. I think this matches more of what you want:
{
DATE_TIME => date_time_value,
PARAMETERS => {
param_name1 => parameter_value1,
param_name2 => parameter_value2
}
}
This way, the structure with data may look like this:
{
DATE_TIME => "10/31/2031 12:00am",
PARAMETERS => {
COLOR => "red",
SIZE => "Really big",
NAME => "Herman",
}
}
Usually, you think of objects having fields which contain values. Think of a row of a SQL table or a spreadsheet. You have columns with headings, and rows that contain the value.
Let's take an employee. They have a name, age, job, and a phone number:
{
NAME => "Bob Smith",
AGE => "None of your business",
JOB => "Making your life miserable",
PHONE => "555-1212"
}
Unlike a table, each entry could contain other structure. For example, people usually have more than one phone number, and we might want to store the last name separate from the first name:
{
NAME => {
FIRST => "Bob",
LAST => "Smith"
},
AGE => "None of your business",
JOB => "Making your life miserable",
PHONE => {
CELL => "555.1234",
WORK => "555.1212"
}
}
Then we have the people who have multiple phones of the same type. For example, Bob has two cell phones. In this case, we'll make each phone type field an array of values:
{
NAME => {
FIRST => "Bob",
LAST => "Smith",
},
AGE => "None of your business",
JOB => "Making your life miserable",
PHONE => {
CELL => ["555.1234", "555.4321"],
WORK => ["555.1212"]
}
}
And to initialize it:
my $person = {};
$person->{NAME}->{FIRST} = "Bob";
$person->{NAME}->{LAST} = "Smith";
$person->{AGE} = "None of your business";
$person->{JOB} = "Making your life miserable";
$person->{PHONE}->{CELL}->[0] = "555.1234";
$person->{PHONE}->{CELL}->[1] = "555.4321";
$person->{PHONE}->{WORK}->[0] = "555.1212";
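Reading the values back follows the same path you used to store them. A minimal sketch, using a trimmed copy of the Bob structure; the variable names `$full_name` and `@phone_lines` are only illustrative:

```perl
use strict;
use warnings;

my $person = {
    NAME  => { FIRST => "Bob", LAST => "Smith" },
    PHONE => { CELL => [ "555.1234", "555.4321" ], WORK => ["555.1212"] },
};

# The arrow between adjacent subscripts is optional.
my $full_name = "$person->{NAME}{FIRST} $person->{NAME}{LAST}";

# Walk every phone type, then every number stored under that type.
my @phone_lines;
for my $type ( sort keys %{ $person->{PHONE} } ) {
    for my $number ( @{ $person->{PHONE}{$type} } ) {
        push @phone_lines, "$type: $number";
    }
}

print "$full_name\n";
print "$_\n" for @phone_lines;
```

Sorting the phone types keeps the output order stable, since hash keys come back in no particular order.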

I think it seems appropriate to have a params hash where the keys are all of the names and the values are the actual values. It seems like that is what you want.
my %hash = (
'datetime value' => {
'category' => {
'params' => {
'param-name1' => 'param-value1',
'param-name2' => 'param-value2',
'param-name3' => 'param-value3',
# etc.
}
}
}
);
After this restructuring it should be pretty easy to sort based on whatever you would like.
Alphabetically by key:
my @alphabetic_keys = sort keys %{ $hash{'datetime value'}{category}{params} };
By key length:
my @by_length_keys = sort { length($a) <=> length($b) } keys %{ $hash{'datetime value'}{category}{params} };

Assuming category names are unique, I would suggest the following data structure:
{
'datetime value 1' => {
'category name 1' => {
'param name 1' => ['param value 1', 'param value 2', ...],
'param name 2' => ['param value 3', 'param value 4', ...],
etc...
},
'category name 2' => {
'param...' => [ value... ]
},
},
'datetime value 2' => {
etc...
}
}
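A minimal sketch of building that structure from flat records and printing it in a repeatable order. It assumes the records arrive as a list of hash refs (the name `@records` and the sample values are made up) and that the datetime strings sort chronologically as plain strings, as ISO-style timestamps do:

```perl
use strict;
use warnings;

# Hypothetical flat records, one per parameter reading.
my @records = (
    { datetime => '2031-10-31 00:00', category => 'engine', param_name => 'temp', param_value => 90 },
    { datetime => '2031-10-31 00:00', category => 'engine', param_name => 'temp', param_value => 95 },
    { datetime => '2031-10-31 00:00', category => 'engine', param_name => 'rpm',  param_value => 3000 },
    { datetime => '2031-10-30 12:00', category => 'cabin',  param_name => 'temp', param_value => 21 },
);

# datetime => category => param_name => [ values ].
# push appends, so a repeated param_name no longer overwrites earlier values.
my %data;
for my $r (@records) {
    push @{ $data{ $r->{datetime} }{ $r->{category} }{ $r->{param_name} } }, $r->{param_value};
}

# Sorting the keys at every level gives the same order on every run.
my @lines;
for my $dt ( sort keys %data ) {
    for my $cat ( sort keys %{ $data{$dt} } ) {
        for my $name ( sort keys %{ $data{$dt}{$cat} } ) {
            push @lines, join ' | ', $dt, $cat, $name, join( ',', @{ $data{$dt}{$cat}{$name} } );
        }
    }
}
print "$_\n" for @lines;
```

If the datetime values are in a format such as "10/31/2031 12:00am", a plain string sort will not be chronological, and the outer sort would need a comparison on parsed dates instead.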

Related

Inserting one hash into another using Perl

I've tried many different versions of using push and splice, but can't seem to combine two hashes as needed. Trying to insert the second hash into the first inside the 'Item' array:
(
ItemData => { Item => { ItemNum => 2, PriceList => "25.00", UOM => " " } },
)
(
Alternate => {
Description => "OIL FILTER",
InFile => "Y",
MfgCode => "FRA",
QtyAvailable => 29,
Stocked => "Y",
},
)
And I need to insert the second 'Alternate' hash into the 'Item' array of the first hash for this result:
(
ItemData => {
Item => {
Alternate => {
Description => "OIL FILTER",
InFile => "Y",
MfgCode => "FRA",
QtyAvailable => 29,
Stocked => "Y",
},
ItemNum => 2,
PriceList => "25.00",
UOM => " ",
},
},
)
Can someone suggest how I can accomplish this?
Assuming you have two hash references, this is straightforward.
my $item = {
'ItemData' => {
'Item' => {
'PriceList' => '25.00',
'UOM' => ' ',
'ItemNum' => '2'
}
}
};
my $alt = {
'Alternate' => {
'MfgCode' => 'FRA',
'Description' => 'OIL FILTER',
'Stocked' => 'Y',
'InFile' => 'Y',
'QtyAvailable' => '29'
}
};
$item->{ItemData}->{Item}->{Alternate} = $alt->{Alternate};
The trick here is not to actually merge $alt into some part of $item, but to only take the specific part you want and put it where you want it. We take the Alternate key from $alt and put its content into a new Alternate key inside the guts of $item.
Adam Millerchip pointed out in a since-deleted comment that this is not a copy. If you alter any of the keys inside of $alt->{Alternate} after sticking it into $item, the data will be changed inside of $item as well because we are dealing with references.
$item->{ItemData}->{Item}->{Alternate} = $alt->{Alternate};
$alt->{Alternate}->{InFile} = 'foobar';
This will actually also change the value of $item->{ItemData}->{Item}->{Alternate}->{InFile} to foobar as seen below.
$VAR1 = {
'ItemData' => {
'Item' => {
'ItemNum' => '2',
'Alternate' => {
'Stocked' => 'Y',
'MfgCode' => 'FRA',
'InFile' => 'foobar',
'Description' => 'OIL FILTER',
'QtyAvailable' => '29'
},
'UOM' => ' ',
'PriceList' => '25.00'
}
}
};
References are supposed to do that, because they only reference something. That's what's good about them.
To make a real copy, you need to dereference and create a new anonymous hash reference.
# %{ ... } dereferences the hash; the outer { ... } creates a new anonymous hash reference
$item->{ItemData}->{Item}->{Alternate} = { %{ $alt->{Alternate} } };
This will create a shallow copy. The values directly inside of the Alternate key will be copies, but if they contain references, those will not be copied, but referenced.
If you do want to merge larger data structures where more than the content of one key needs to be merged, take a look at Hash::Merge instead.
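To see the difference between a shallow copy and a full copy, here is a small sketch. The `Stores` key and its values are invented for illustration, and `dclone` from the core Storable module is one way (not the only one) to copy nested data:

```perl
use strict;
use warnings;
use Storable qw(dclone);

# Hypothetical data: Alternate holds a nested array reference.
my $alt = { Alternate => { MfgCode => 'FRA', Stores => [ 'north', 'south' ] } };

my $shallow = { %{ $alt->{Alternate} } };     # copies only the top-level pairs
my $deep    = dclone( $alt->{Alternate} );    # copies nested structures too

# Change the original after both copies were taken.
$alt->{Alternate}{MfgCode} = 'XYZ';
push @{ $alt->{Alternate}{Stores} }, 'east';

# The shallow copy keeps its own MfgCode but still shares the Stores array.
print "shallow: $shallow->{MfgCode}, ", scalar @{ $shallow->{Stores} }, " stores\n";
print "deep:    $deep->{MfgCode}, ",    scalar @{ $deep->{Stores} },    " stores\n";
```

Both copies keep `FRA`, but only the deep copy is unaffected by the `push`, because the shallow copy holds the same array reference as the original.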

For each object in array in perl

I am trying to loop through each object in an array in Perl and I think I am making an obvious error.
my @members_array = [
{
id => 1234,
email => 'first@example.com',
}, {
id => 4321,
email => 'second@example.com',
}
];
use Data::Dumper;
for my $member ( @members_array ) {
print Dumper( $member );
}
Expected output for first iteration
{
id => 1234,
email => 'first@example.com',
}
Actual output for first iteration
[{
'email' => 'first@example.com',
'id' => 1234
}, {
'email' => 'second@example.com',
'id' => 4321
}];
How do I loop through these elements in the array? Thanks!
[ ... ] is used to create an array reference; you need to use ( ... ) to create an array :
my @members_array = (
{
id => 1234,
email => 'first@example.com',
}, {
id => 4321,
email => 'second@example.com',
}
);
And then the rest of your code will work just fine.
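If you would rather keep the square brackets, the other option is to store the reference in a scalar and dereference it in the loop. A minimal sketch; the names `$members` and `@ids` are only for illustration:

```perl
use strict;
use warnings;

# Square brackets build an array reference, so it goes into a scalar.
my $members = [
    { id => 1234, email => 'first@example.com' },
    { id => 4321, email => 'second@example.com' },
];

# @$members (or @{ $members }) dereferences it back into a list of hash refs.
my @ids;
for my $member (@$members) {
    push @ids, $member->{id};
    print "$member->{id}: $member->{email}\n";
}
```

Either form works; what matters is that the sigil on the variable matches what the brackets create.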

How can I join a nested Perl hash?

I have a Perl hash, where I store information about LUNs. It has the following structure:
my %luns = (
360000 => {
Devices => [
{ Major_Minor => "8:144",
SCSI_Address => "1:0:0:8",
SCSI_Device => "sdj",
SCSI_Host => "host1",
},
{ Major_Minor => "129:48",
SCSI_Address => "3:0:0:8",
SCSI_Device => "sder",
SCSI_Host => "host3",
},
],
DM_Device => "dm-13",
Size => "45G",
WWID => 360000,
},
360001 => {
Devices => [
{ Major_Minor => "70:144",
SCSI_Address => "1:0:1:39",
SCSI_Device => "sddb",
SCSI_Host => "host1",
},
{ Major_Minor => "135:48",
SCSI_Address => "3:0:1:39",
SCSI_Device => "sdij",
SCSI_Host => "host3",
},
],
DM_Device => "dm-53",
Size => "200G",
WWID => 360000,
},
);
How can I use join to get a comma-separated list of all SCSI_Devices, for example, of 360000?
You're working with a Hash of Hash of Array of Hash. To learn how to work with such structures, I recommend reading perldsc - Perl Data Structures Cookbook.
In this instance, the following loop will print out each of your device lists:
for my $id ( sort { $a <=> $b } keys %luns ) {
my @devices = map { $_->{SCSI_Device} } @{ $luns{$id}{Devices} };
print "$id - @devices\n";
}
Outputs:
360000 - sdj sder
360001 - sddb sdij
You say you want a list of values for LUN 360000, so for a start you need
$luns->{360000}
which is another hash with a Devices element, which has an array reference as a value, and DM_Device, Size, and WWID elements, whose values are simple scalars.
So presumably you want the list that is
$luns->{360000}{Devices}
which is an array of references to hashes, each of which has Major_Minor, SCSI_Address, SCSI_Device, and SCSI_Host elements.
It sounds like you want the SCSI_Device element, and map is the ideal tool to help you with this
my @scsi_devices = map { $_->{SCSI_Device} } @{ $luns->{360000}{Devices} };
That last step is a big leap, and it may help to separate it in your code. For instance, you can copy the reference to the list of devices for 360000, like this
my $devices = $luns->{360000}{Devices};
and extract the SCSI_Device from each of the hashes in that array with
my @scsi_devices = map { $_->{SCSI_Device} } @$devices;
Either way, the array reference must be dereferenced and the required element from each hash in that array must be extracted.
To get a CSV record, unless the data may contain commas or double quotes, you simply need to join the result of that map
print join(',', @scsi_devices), "\n";
output
sdj,sder
Although I think this falls short of what you actually need. If this isn't clear then please ask.
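If the values could contain commas or double quotes, a plain join is no longer enough. The following is a rough sketch of the usual CSV quoting convention; `csv_field` is a hypothetical helper and the sample values are made up to trigger the quoting. For real data a dedicated CSV module is the safer choice.

```perl
use strict;
use warnings;

# Quote a field only when it contains a comma, a double quote, or a newline.
# Embedded double quotes are doubled, as in the common CSV convention.
sub csv_field {
    my ($value) = @_;
    return $value unless $value =~ /[",\n]/;
    ( my $escaped = $value ) =~ s/"/""/g;
    return qq{"$escaped"};
}

my @scsi_devices = ( 'sdj', 'sd,er', 'say "hi"' );    # made-up values
my $record = join ',', map { csv_field($_) } @scsi_devices;
print "$record\n";
```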

How can I look and search for a key inside a heavily nested hash?

I am trying to check if a BIG hash has any keys from small hash and see if they exist, and if they do modify the BigHash with updated values from small hash.
So the lookup hash would look like this :
%configure = (
CommonParameter => {
'SibSendOverride' => 'true',
'SibOverrideEnabledFlag' => 'true',
'SiPosition' => '8',
'Period' => '11'
}
);
But the BigHash is very very nested.. The key/hash CommonParameter from the small hash configure is there in the BigHash.
Can somebody help/suggest some ideas for me please?
Here is an example BigHash :
%BigHash = (
'SibConfig' => {
'CELL' => {
'Sib9' => {
'HnbName' => 'HnbName',
'CommonParameter' => {
'SibSendOverride' => 'false',
'SibMaskOverrideEnabledFlag' => 'false',
'SiPosition' => '0',
'Period' => '8'
}
}
}
},
)
I hope I was clear in my question. Trying to modify values of heavily nested BigHash based on Lookup Hash if those keys exist.
Can somebody help me? I am not approaching this in the right way. Is there a neat little key lookup function or something available perhaps?
Give Data::Search a try.
use Data::Search;
# $BigHash is assumed to be a reference to the nested hash, e.g. \%BigHash
@results = Data::Search::datasearch(
data => $BigHash, search => 'keys',
find => 'CommonParameter',
return => 'hashcontainer');
foreach $result (@results) {
# result is a hashref that has 'CommonParameter' as a key
if ($result->{CommonParameter}{AnotherKey} ne $AnotherValue) {
print STDERR "AnotherKey was ", $result->{CommonParameter}{AnotherKey},
" ... fixing\n";
$result->{CommonParameter}{AnotherKey} = $AnotherValue;
}
}
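If you would rather not add a module, a short recursive walk does the same job. This is only a sketch under a few assumptions: the data are trimmed versions of the hashes in the question, `apply_updates` is a hypothetical helper name, only nested hashes are followed (arrays are skipped), and only sub-keys that already exist in the big hash are overwritten:

```perl
use strict;
use warnings;

my %configure = (
    CommonParameter => { SibSendOverride => 'true', SiPosition => '8', Period => '11' },
);
my %BigHash = (
    SibConfig => { CELL => { Sib9 => {
        HnbName         => 'HnbName',
        CommonParameter => { SibSendOverride => 'false', SiPosition => '0', Period => '8' },
    } } },
);

# Walk $node recursively. Wherever a key from $lookup is found holding a hash,
# overwrite only the sub-keys that already exist there. Returns the number of
# values that were replaced.
sub apply_updates {
    my ( $node, $lookup ) = @_;
    return 0 unless ref $node eq 'HASH';
    my $changed = 0;
    for my $key ( keys %$node ) {
        my $child = $node->{$key};
        if ( ref $child eq 'HASH' && ref $lookup->{$key} eq 'HASH' ) {
            for my $sub ( grep { exists $child->{$_} } keys %{ $lookup->{$key} } ) {
                $child->{$sub} = $lookup->{$key}{$sub};
                $changed++;
            }
        }
        $changed += apply_updates( $child, $lookup );
    }
    return $changed;
}

my $count = apply_updates( \%BigHash, \%configure );
print "updated $count values\n";
```

Because the walk visits every hash in the structure, every CommonParameter found at any depth is updated, not just the first one.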

Array of hashes

In Perl, I have an array of hashes like this:
0 HASH(0x98335e0)
'title' => 1177
'author' => 'ABC'
'quantity' => '-100'
1 HASH(0x832a9f0)
'title' => 1177
'author' => 'ABC'
'quantity' => '100'
2 HASH(0x98335e0)
'title' => 1127
'author' => 'DEF'
'quantity' => '5100'
3 HASH(0x832a9f0)
'title' => 1277
'author' => 'XYZ'
'quantity' => '1030'
Now I need to accumulate the quantity where title and author are the same.
In the above structure, the hashes with title = 1177 and author = 'ABC' can have their quantities accumulated into one, and the entire structure should look like the one below.
0 HASH(0x98335e0)
'title' => 1177
'author' => 'ABC'
'quantity' => 0
1 HASH(0x98335e0)
'title' => 1127
'author' => 'DEF'
'quantity' => '5100'
2 HASH(0x832a9f0)
'title' => 1277
'author' => 'XYZ'
'quantity' => '1030'
What is the best way to do this accumulation so that it is optimised? The number of array elements can be very large. I don't mind adding an extra key to the hash to help with this, but I don't want n lookups. Kindly advise.
my %sum;
for (@a) {
$sum{ $_->{author} }{ $_->{title} } += $_->{quantity};
}
my #accumulated;
foreach my $author (keys %sum) {
foreach my $title (keys %{ $sum{$author} }) {
push @accumulated => { title => $title,
author => $author,
quantity => $sum{$author}{$title},
};
}
}
Not sure whether map makes it look nicer:
my @accumulated =
map {
my $author = $_;
map +{ author => $author,
title => $_,
quantity => $sum{$author}{$_},
},
keys %{ $sum{$author} };
}
keys %sum;
If you don't want N lookups, then you need a hash function -- however you need to store them with that hash function. By the time you have them in a list (or array), it's too late. You either get lucky, all the time, or you're going to have N lookups.
Or insert them into the hash below. A hybrid solution is to store a locator as item 0 in the list/array.
my $lot = get_lot_from_whatever();
my $tot = $list[0]{ $lot->{author} }{ $lot->{title} };
if ( $tot ) {
$tot->{quantity} += $lot->{quantity};
}
else {
push @list, $list[0]{ $lot->{author} }{ $lot->{title} } = $lot;
}
previous
First of all we'll reformat that to make it readable.
[ { title => 1177, author => 'ABC', quantity => '-100' }
, { title => 1177, author => 'ABC', quantity => '100' }
, { title => 1127, author => 'DEF', quantity => '5100' }
, { title => 1277, author => 'XYZ', quantity => '1030' }
]
Next, you need to break down the problem. You want quantities of things grouped
by author and title. So you need those things to uniquely identify those lots.
To repeat, you want a combination of names to identify entities. Thus, you
will need a hash that identifies things by names.
Since we have two things, a double hash is a good way to do it.
my %hash;
foreach my $lot ( @list ) {
$hash{ $lot->{author} }{ $lot->{title} } += $lot->{quantity};
}
# consolidated by hash
To turn this back into a list, we need to unbundle the levels.
my @consol
= sort { $a->{author} cmp $b->{author} || $a->{title} cmp $b->{title} }
map {
my ( $author, $titles ) = @$_; # $_ is [ $author, {...} ]
map { +{ title => $_, author => $author, quantity => $titles->{$_} } }
keys %$titles;
}
map { [ $_ => $hash{$_} ] } # group and freeze a pair
keys %hash
;
# consolidated in a list.
And there you have it back, I even sorted it for you. Of course you could also
sort this by--publishers being what they are--descending quantities.
sort { $b->{quantity} <=> $a->{quantity}
|| $a->{author} cmp $b->{author}
|| $a->{title} cmp $b->{title}
}
I think it is important to step back and consider the source of the data. If the data are coming from a database, then you should write the SQL query so that it gives you one row for each author/title combination with the total quantity in the quantity field. If you are reading the data from a file, then you should either read it directly into a hash or use Tie::IxHash if order is important.
Once you have the data in an array of hashrefs, as you do, you will have to create an auxiliary data structure and do a whole series of lookups. Their cost may well dominate the running time of your program (which hardly matters if it runs for 15 minutes once a day), and you might run into memory issues.
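For completeness, here is a sketch of the auxiliary-structure approach done in a single pass: one hash lookup per element, with the first-seen order of the lots preserved. The names `%seen` and `@merged` are illustrative, and joining author and title with a NUL byte assumes neither field ever contains one:

```perl
use strict;
use warnings;

my @list = (
    { title => 1177, author => 'ABC', quantity => '-100' },
    { title => 1177, author => 'ABC', quantity => '100' },
    { title => 1127, author => 'DEF', quantity => '5100' },
    { title => 1277, author => 'XYZ', quantity => '1030' },
);

my ( %seen, @merged );
for my $lot (@list) {
    # Composite key; "\0" as separator is an assumption about the data.
    my $key = join "\0", $lot->{author}, $lot->{title};
    if ( my $first = $seen{$key} ) {
        $first->{quantity} += $lot->{quantity};    # fold into the first match
    }
    else {
        push @merged, $seen{$key} = {%$lot};       # copy, so @list stays untouched
    }
}

printf "%s %s %d\n", @{$_}{qw(author title quantity)} for @merged;
```

The extra memory is one hash entry per distinct author/title pair, which is the trade-off the paragraph above describes.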