How do I work globally on a complex data structure using subroutines? - perl

Specifically my data structure looks like this
{
"SomeGuy" => {
date_and_time => "11-04-2013",
Id => 7,
nr => 52,
picks => [
{ "This is an option" => "Option3" },
{ "This is another option" => "Option4" },
{ "This is another option" => "Option1" },
{ "And another one" => "Something" },
],
time_of => "06:11 AM",
total => 1,
},
"Another Guy" => { ... },
}
This is output via Data::Dump. The actual data structure contains a lot more records like "SomeGuy". The structure is identical for all of them.
I populate this data structure this way:
$guys->{$profile}{options}[$total++]{$var} = $var2;
$guys->{$profile}{Id} = $i;
$guys->{$profile}{date_and_time} = get_date($Time[0]);
$guys->{$profile}{time_of} = $Time[1];
$guys->{$profile}{total} = keys (% {$guys->{$profile}{options}[0]});
$guys->{$profile}{nr} = $pNr;
Having all this, what I want to do next is operate on this data structure. I repeat that there are many, many records in the data structure.
When I output the contents of it I get it in jumbled order, as to say, not in the order that it was populated. I've tried this with Data::Dumper, Data::Dump and iterating through the records myself by hand.
I know that the methods in the Data namespace are notorious for this, that's why Data::Dumper provides a way to sort through a subroutine and Data::Dump provides a default one.
So I have the data structure. It looks like I expect it to be, I have all the data as I knew it should look in it, valid. I want to sort the records according to their Id field. My thinking is that I have to use a subroutine and essentially pass a reference of the data structure to it and do the sorting there.
sub sortt {
my $dref = shift @_;
foreach my $name ( sort { $dref->{$a}{Id} <=> $dref->{$b}{Id} } keys %$dref ) {
print "$data->{$name}{Id}: $name \n";
}
}
This gets called with (in the same scope where the structure is populated, so no worries there):
sortt(\$guys);
The error is:
Not a HASH reference at perlprogram.pl line 452
So I go and use ref in the subroutine to make sure I'm passing an actual reference. And it says REF.
Next I go into desperate mode and try some stupid things like calling it with:
sortt(\%$guys)
But if I'm not mistaken, this just sends a copy to the subroutine and sorts that copy locally, so no use there.
It's no use if I make a copy and return it from the subroutine, I just want to pass a reference of my data structure and sort it and have it reflect those changes globally (or in the calling scope per se). How would I do this?

Leaving aside your syntax problem with "Not a ref", you are approaching the problem from the wrong end in the first place. I'll leave small syntactic details to others (see Ikegami's comment).
You can NOT sort them at all, because $guys is a hash, not an array. Hashes are NOT ever sorted in Perl. If you want to sort it, you have three solutions:
1. Store an ordered list of names as a separate array.
my @ordered_names = sort { $guys->{$a}{Id} <=> $guys->{$b}{Id} } keys %$guys;
Then you use the array for ordering and go back to the hashref for individual records.
2. Add the name to the individual guy's hash, instead of the outer hash reference. $guys should then be an array reference. The downside to this approach is that you can't look up a person's record by their name any more - if that functionality is needed, use #1.
@codnodder's answer shows how to do that, if you don't care about accessing records by name.
3. Use a Tie::* module (Tie::IxHash, Tie::Hash::Sorted). NOT recommended since it's slower.

Perl hashes are inherently unordered. There is no way you can sort them, or reorder them at all. You have to write code to access the elements in the order you want.
Your sortt subroutine does nothing but print the ID and the name of each hash element, sorted by the ID. Except that it doesn't, because you are trying to use the variable $data when you have actually set up $dref. That is likely to be the cause of your Not a HASH reference error, although unless you show your entire code, or at least indicate which is perlprogram.pl line 452, then we cannot help further.
The best way to do what you want is to create an array of hash keys, which you can sort in whatever order you want. Like this
my @names_by_id = sort { $guys->{$a}{Id} <=> $guys->{$b}{Id} } keys %$guys;
Then you can use this to access the hash in sorted order, like this, which prints the same output as your sortt is intended to, but formatted a little more nicely
for my $name (@names_by_id) {
printf "%4d: %s\n", $guys->{$name}{Id}, $name;
}
If you want to do anything else with the hash elements in sorted order then you have to use this technique.

$guys is already a hash ref, so you just need sortt($guys)
If you want a sorted data structure, you need something like this:
my @guys_sorted =
map { +{ $_ => $guys->{$_} } } # the leading + marks an anonymous hash, not a block
sort { $guys->{$a}{Id} <=> $guys->{$b}{Id} } keys %$guys;
print(Dumper(\@guys_sorted));
Or, in a sub:
sub sortt {
# returns a SORTED ARRAY of HASHREFS
my $ref = shift;
return map { +{ $_ => $ref->{$_} } } # the leading + marks an anonymous hash, not a block
sort { $ref->{$a}{Id} <=> $ref->{$b}{Id} } keys %$ref;
}
print(Dumper([sortt($guys)]));
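On the first point of this answer: the question's call sortt(\$guys) passes a reference to the scalar $guys, which is why ref reported REF. A minimal sketch with made-up records follows; the helper bump_nr and the sample data are illustrative assumptions, not part of the original program. It also shows that changes made through the hash reference are visible in the calling scope without copying anything.

```perl
use strict;
use warnings;

# Illustrative data only; the real structure has many more fields.
my $guys = {
    'SomeGuy'     => { Id => 7, nr => 52 },
    'Another Guy' => { Id => 9, nr => 99 },
};

# $guys already holds a hash reference; \$guys is a reference to that scalar.
my $kind_direct  = ref $guys;     # 'HASH'
my $kind_wrapped = ref \$guys;    # 'REF'

# Hypothetical helper: changes made through the reference are visible
# to the caller, because no copy of the hash is made.
sub bump_nr {
    my ($dref) = @_;
    $_->{nr}++ for values %$dref;
    return;
}

bump_nr($guys);
print "$kind_direct $kind_wrapped $guys->{SomeGuy}{nr}\n";    # HASH REF 53
```

So the sub should be called as bump_nr($guys) or sortt($guys), with no extra backslash.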

That's a pretty complex data structure. If you commonly use structures like this in your program, I suggest you take your Perl programming skills up a notch, and look into learning a bit of Object Oriented Perl.
Object Oriented Perl is fairly straightforward: Your object is merely that data structure you've previously created. Methods are merely subroutines that work with that object. Most methods are getter/setter methods to set up your structure.
Initially, it's a bit more writing, but once you get the hang of it, the extra writing is easily compensated by the savings in debugging and maintaining your code.
Object Oriented Perl does two things: It first makes sure that your structure is correct. For example:
$some_guy->{Picks}->[2]->{"this is an option"} = "Foo!";
Whoops! That should have been {picks}. Imagine trying to find that error in your code.
In OO-Perl, if I mistyped a method's name, the program will immediately pick it up:
$some_guy->picks(
{
"This is an option" => "Foo!",
"This is option 2" => "Bar!",
}
)
If I had $some_guy->Picks, I would have gotten a runtime error.
It also makes you think of your structure as an object. For example, what are you sorting on? You're sorting on your Guys' IDs, and the guys are stored in a hash called %guys.
# $a and $b are hash keys from `%guys`.
# $guys{$a} and $guys{$b} represent the guy objects.
# I can use the id method to get the guys' IDs
sub sort_guys_by_id {
$guys{$a}->id <=> $guys{$b}->id; # numeric comparison. That was easy!
}
Take a look at the tutorial. You'll find yourself writing better programs with fewer errors.
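As a rough sketch of what such a class could look like, using only core Perl: the package name Guy, its constructor arguments, and the accessor names are assumptions made for illustration, not something taken from the question.

```perl
use strict;
use warnings;

package Guy;    # hypothetical class name

sub new {
    my ($class, %args) = @_;
    my $self = { name => $args{name}, id => $args{id}, picks => [] };
    return bless $self, $class;
}

# Read-only accessors
sub name { $_[0]{name} }
sub id   { $_[0]{id} }

# Getter/setter: with arguments, replaces the list of picks
sub picks {
    my ($self, @picks) = @_;
    $self->{picks} = [@picks] if @picks;
    return @{ $self->{picks} };
}

package main;

# Index the objects by name, as in the original hash
my %guys = map { ($_->name, $_) } (
    Guy->new(name => 'SomeGuy',     id => 7),
    Guy->new(name => 'Another Guy', id => 9),
    Guy->new(name => 'My Guy',      id => 1),
);

$guys{SomeGuy}->picks({ 'This is an option' => 'Option3' });

# Numeric comparison on the id accessor
my @names_by_id = sort { $guys{$a}->id <=> $guys{$b}->id } keys %guys;
print join(', ', @names_by_id), "\n";    # My Guy, SomeGuy, Another Guy

# A mistyped method name fails loudly at run time
my $ok = eval { $guys{SomeGuy}->Picks; 1 };
print "Picks failed: $@" unless $ok;
```

The mistyped hash key from earlier would have silently created a new entry; the mistyped method call dies with a message naming the missing method.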

With your heart set on a sorted data structure, I recommend the following. It is a simple array of hashes and, rather than using the name string as the key for a single-element hash, it adds a new name key to each guy's data.
I hope that, with the Data::Dump output, it is self-explanatory. It is sorted by the Id field as you requested, but it still has the disadvantage that a separate index array would allow any ordering at all without modifying or copying the original hash data.
use strict;
use warnings;
use Data::Dump;
my $guys = {
"SomeGuy " => {
date_and_time => "11-04-2013",
Id => 7,
nr => 52,
picks => [
{ "This is an option" => "Option3" },
{ "This is another option" => "Option4" },
{ "This is another option" => "Option1" },
{ "And another one" => "Something" },
],
time_of => "06:11 AM",
total => 1,
},
"Another Guy" => { Id => 9, nr => 99, total => 6 },
"My Guy" => { Id => 1, nr => 42, total => 3 },
};
my @guys_sorted = map {
my $data = $guys->{$_};
$data->{name} = $_;
$data;
}
sort {
$guys->{$a}{Id} <=> $guys->{$b}{Id}
} keys %$guys;
dd \@guys_sorted;
output
[
{ Id => 1, name => "My Guy", nr => 42, total => 3 },
{
date_and_time => "11-04-2013",
Id => 7,
name => "SomeGuy ",
nr => 52,
picks => [
{ "This is an option" => "Option3" },
{ "This is another option" => "Option4" },
{ "This is another option" => "Option1" },
{ "And another one" => "Something" },
],
time_of => "06:11 AM",
total => 1,
},
{ Id => 9, name => "Another Guy", nr => 99, total => 6 },
]
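For comparison, the separate-index alternative mentioned at the top of this answer might look like the sketch below. The data is a trimmed-down version of the sample above, and the array names are made up for illustration. Each index is only a list of keys, so several orderings can coexist while the hash itself is neither copied nor modified.

```perl
use strict;
use warnings;

# Trimmed-down copy of the sample data
my $guys = {
    'SomeGuy'     => { Id => 7, nr => 52, total => 1 },
    'Another Guy' => { Id => 9, nr => 99, total => 6 },
    'My Guy'      => { Id => 1, nr => 42, total => 3 },
};

# Ascending by Id
my @by_id = sort { $guys->{$a}{Id} <=> $guys->{$b}{Id} } keys %$guys;

# Descending by total ($b and $a swapped)
my @by_total = sort { $guys->{$b}{total} <=> $guys->{$a}{total} } keys %$guys;

print "by Id:    ", join(', ', @by_id),    "\n";
print "by total: ", join(', ', @by_total), "\n";

# Records are still reachable by name
print "SomeGuy has nr $guys->{SomeGuy}{nr}\n";
```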

Related

A dictionary inside of a dictionary in PowerShell

So, I'm rather new to PowerShell and just can't figure out how to use the arrays/lists/hashtables. I basically want to do the following portrayed by Python:
entries = {
'one' : {
'id': '1',
'text': 'ok'
},
'two' : {
'id': '2',
'text': 'no'
}
}
for entry in entries:
print(entries[entry]['id'])
Output:
1
2
But how does this work in PowerShell? I've tried the following:
$entries = @{
one = @{
id = "1";
text = "ok"
};
two = @{
id = "2";
text = "no"
}
}
And now I can't figure out how to access the information.
foreach ($entry in $entries) {
Write-Host $entries[$entry]['id']
}
=> Error
PowerShell prevents implicit iteration over dictionaries to avoid accidental "unrolling".
You can work around this and loop through the contained key-value pairs by calling GetEnumerator() explicitly:
foreach($kvp in $entries.GetEnumerator()){
Write-Host $kvp.Value['id']
}
For something closer to the python example, you can also extract the key values and iterate over those:
foreach($key in $entries.get_Keys()){
Write-Host $entries[$key]['id']
}
Note: You'll find that iterating over $entries.Keys works too, but I strongly recommend never using that, because PowerShell resolves dictionary keys via property access, so you'll get unexpected behavior if the dictionary contains an entry with the key "Keys":
$entries = @{
Keys = 'a','b'
a = 'discoverable'
b = 'also discoverable'
c = 'you will never find me'
}
foreach($key in $entries.Keys){ # suddenly resolves to just `'a', 'b'`
Write-Host $entries[$key]
}
You'll see only the output:
discoverable
also discoverable
Not the Keys or c entries
To complement Mathias R. Jessen's helpful answer with a more concise alternative that takes advantage of member-access enumeration:
# Implicitly loops over all entry values and from each
# gets the 'Id' entry value from the nested hashtable.
$entries.Values.Id # -> 2, 1
Note: As with .Keys vs. .get_Keys(), you may choose to routinely use .get_Values() instead of .Values to avoid problems with keys literally named Values.

Trouble adding to hash in perl

I am reading a file, replacing data, and returning the output as JSON. When I try to add a new item to the hash I get the following error: "Not a HASH reference". When I use ref() I am getting HASH as the type.
I have tried.
my $json_data = decode_json($template);
$json_data->{CommandCenters}{NewItem} = ["haha","moredata"];
Gives the not a hash reference error
The $json_data is below.
{
"Location":"Arkansas",
"CommandCenters": [
{
"secretary": "jill",
"janitor": "mike"
}
],
}
I am looking for the following output after I add the element.
{
"Location":"Arkansas",
"city": "little rock"
"CommandCenters": [
{
"secretary": "jill",
"janitor": "mike"
},
{
"NewItem":["whatever","more data"]
}
],
}
If I use $json_data->{CommandCenters}[0]{NewItem} = ['whatever','sure']; I do not get an error but I get unexpected results.
The data is added but in the incorrect slot.
"commandcenters":
[
"secretary":"jill",
"janitor": "mike",
"newitem":
[
"whatever","sure"
],
]
To add a new element to an array, use push. As we're dealing with an array reference, we need to dereference it first.
push @{ $json_data->{CommandCenters} }, { NewItem => ["haha", "moredata"] };
When I try to add a new item to the hash I get the following error: "Not a HASH reference". When I use ref() I am getting HASH as the type.
Attention to detail is a vital skill for a successful programmer. And you're missing something subtle here.
When you use ref(), I assume you're passing it your $json_data variable. And that is, indeed, a hash reference. But the line that generates your Not a HASH reference is this line:
$json_data->{CommandCenters}{NewItem} = ["haha","moredata"];
And that's not just treating $json_data as a hash reference ($json_data->{...}) it's also treating $json_data->{CommandCenters} as a hash reference. And that's where your problem is. $json_data->{CommandCenters} is an array reference, not a hash reference. It's generated from the bit of your JSON that looks like this:
"CommandCenters": [
{
"secretary": "jill",
"janitor": "mike"
}
]
And those [ .. ] mark it as an array not a hash. You can't add a new key/value pair to an array; you need to push() new data to the end of the array. Something like:
push @{ $json_data->{CommandCenters} }, { NewItem => ["haha", "moredata"] };
That will leave you with this data structure:
$VAR1 = {
'CommandCenters' => [
{
'janitor' => 'mike',
'secretary' => 'jill'
},
{
'NewItem' => [
'haha',
'moredata'
]
}
],
'Location' => 'Arkansas'
};
And encode_json() will turn that into the JSON that you want.
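Putting the pieces together, a minimal round trip might look like this. It is a sketch that assumes the core JSON::PP module and a heredoc in place of the file that the question reads; the trailing comma from the question's sample is dropped, since JSON::PP rejects it unless its relaxed option is enabled.

```perl
use strict;
use warnings;
use JSON::PP;    # core module since Perl 5.14

# Stand-in for the template file from the question
my $template = <<'JSON';
{
  "Location": "Arkansas",
  "CommandCenters": [
    { "secretary": "jill", "janitor": "mike" }
  ]
}
JSON

my $json_data = decode_json($template);

# CommandCenters is an array reference, so append with push
push @{ $json_data->{CommandCenters} }, { NewItem => [ 'haha', 'moredata' ] };

# canonical sorts hash keys so the output is stable between runs
my $out = JSON::PP->new->canonical->encode($json_data);
print "$out\n";
```

Using the object interface with canonical is only for repeatable output; plain encode_json works too, but the key order may then vary.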

Assigning array/hash to data structure with arrow operator. Perl

Just asking because I can't find the answer I'm looking for...
What is the difference between these lines? I'm under the assumption that there isn't a difference.
I'm just unsure if $stash->{hashdata} automatically becomes a reference.
my %data = { thing => 1, otherthing => 2 };
$stash->{hashdata} = \%data;
$stash->{hashdata} = { thing => 1, otherthing => 2 };
{ ... } is the syntax for a hash reference (similarly, [ ... ] is for array references).
When you assign something to a hash, it is interpreted as a list of alternating keys/values. If the list has an odd number of elements (such as 1), you get this warning:
Odd number of elements in hash assignment at ...
... unless it's only a single value that is a reference, in which case you get:
Reference found where even-sized list expected at ...
In any case, the last element is interpreted as a key with a corresponding value of undef.
Thus, if you try to assign a reference to a hash:
my %data = { ... };
A warning is emitted and the code behaves as if you had written:
my %data = ({ ... } => undef);
Hash keys are always strings, so the reference is implicitly stringified, yielding something like "HASH(0xdeadbeef)":
my %data = ('HASH(0xdeadbeef)' => undef);
This is never what you want.
The equivalent of
$stash->{hashdata} = { thing => 1, otherthing => 2 };
with a named hash would look like:
my %data = ( thing => 1, otherthing => 2 );
$stash->{hashdata} = \%data;
Note: There is no reference in the first line. We're assigning a plain list to %data.
In fact, you can think of { LIST } as syntactic sugar for:
do { my %tmp = LIST; \%tmp }
The block limits the scope of %tmp to this location in the code; the do keyword turns the block into an expression that returns the result of the last statement in the block.
This is an error.
my %data = { thing => 1, otherthing => 2 };
It should be.
my %data = ( thing => 1, otherthing => 2 );
What is the difference between these two?
$stash->{hashdata} = \%data;
$stash->{hashdata} = { thing => 1, otherthing => 2 };
The first one means any changes to %data will also happen in $stash->{hashdata} and vice versa, because it has a reference to %data.
The second means %data and $stash->{hashdata} are independent. Changes in one will not happen in the other.
Yes, there is a big difference between the two.
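A small sketch of that difference. The keys shared and separate are made up for illustration, and { %data } is used here to build the anonymous hash from the same key/value list, which makes a shallow copy.

```perl
use strict;
use warnings;

my %data  = ( thing => 1, otherthing => 2 );
my $stash = {};

# Case 1: store a reference to the named hash
$stash->{shared} = \%data;

# Case 2: build a new anonymous hash from the same key/value list
$stash->{separate} = { %data };

# Change the named hash afterwards
$data{thing} = 42;

print "shared:   $stash->{shared}{thing}\n";      # 42
print "separate: $stash->{separate}{thing}\n";    # 1
```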

What is this `job_desc_msg_t` format that I need to submit jobs to SLURM via the Perl API?

The Perl API for SLURM indicates that to submit a job with the API requires that we give it a "job description" ($job_desc or $job_desc_msg), which has the structure job_desc_msg_t but it doesn't tell what job_desc_msg_t is.
update: I found it in slurm.h starting at line 1162, so I'm guessing that I will need to pass in a hash with a similar structure.
That's exactly what you must do according to the man page.
Typically, C structures are converted to (maybe blessed) Perl hash
references, with field names as hash keys. Arrays in C are converted to
arrays in Perl. For example, there is a structure "job_info_msg_t":
typedef struct job_info_msg {
time_t last_update; /* time of latest info */
uint32_t record_count; /* number of records */
job_info_t *job_array; /* the job records */
} job_info_msg_t;
This will be converted to a hash reference with the following
structure:
{
last_update => 1285847672,
job_array => [ {account => 'test', alloc_node => 'ln0', alloc_sid => 1234, ...},
{account => 'debug', alloc_node => 'ln2', alloc_sid => 5678, ...},
...
]
}
Note the missing of the "record_count" field in the hash. It can be
derived from the number of elements in array "job_array".
To pass parameters to the API functions, use the corresponding hash
references, for example:
$rc = $slurm->update_node({node_names => 'node[0-7]', node_state => NODE_STATE_DRAIN});
Please see "<slurm/slurm.h>" for the definition of the structures.

Populating a HoH, whose outer hash keys are in one array, and inner hash values are in another

Explanation
One array holds a set of keys. The values to these keys are inner hashes. The keys of these inner hashes, in this case are numbers (like array indices). Another array holds the values of the inner hash.
Question:
How can you populate the outer hash keys with the correct corresponding values (ie. correct inner hash)?
Constraints
I'd prefer a solution utilizing slices, map or grep. Eliminating cascading for loops
I realize it should be an HoA. But this is only for me to learn, it has no functional value...
Working Code:
This code works as I want but I would like to use more advanced techniques:
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;
my %register=();
my @classNames = ('Science_class', 'Math_class');
my @Science_class_student_names = ('George', 'Lisa', 'Mathias'); #prob from file
my @Math_class_student_names = ('Martin', 'Anna', 'Peter', 'George'); #prob from file
foreach my $className (@classNames) {
my $array_name = $className.'_'.'student_names';
if ($array_name =~ /Science/) {
foreach (0..$#Science_class_student_names ) {
$register{$className}{$_ + 1} = $Science_class_student_names[$_];
}
}
elsif ($array_name =~ /Math/) {
foreach (0..$#Math_class_student_names ) {
$register{$className}{$_ + 1} = $Math_class_student_names[$_];
}
}
}
print Dumper(\%register);
Ideas
A hash slice works for direct key-value pairs, but the intermediate keys are throwing me off. Trying something like: @register{@classNames} = map{$count => $student}
One idea I had, before the if statements was if there was a way to use a string in the name of an array: $#($array_name)student_names but that doesn't work.
Another would be to separately create an array of all the inner hash keys, use a slice and then put that hash into the outer hash.
The only other idea I had was using an AoA to hold all the 'inner hash value' arrays (i.e. my @studentNames = (\@Science_class_student_names, \@Math_class_student_names);), but I haven't gotten anywhere with that yet.
It's easier to work on it a layer at a time. For the inner layer, it seems you want
1 => George,
2 => Lisa,
...
So start by figuring out how to do that.
map { $_+1 => $Science_class_student_names[$_] }
0..$#Science_class_student_names
So you end up with
$register{'Science_class'} = {
map { $_+1 => $Science_class_student_names[$_] }
0..$#Science_class_student_names
};
$register{'Math_class'} = {
map { $_+1 => $Math_class_student_names[$_] }
0..$#Math_class_student_names
};
If you generalise the inner layer, you get
for (
[ 'Science_class' => \@Science_class_student_names ],
[ 'Math_class' => \@Math_class_student_names ],
) {
my ($class_name, $student_names) = @$_;
$register{$class_name} = {
map { $_+1 => $student_names->[$_] }
0..$#$student_names
};
}
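Since the question asked about slices, here is one possible variation. It assumes the name lists are kept in a hash of arrays (called %students_of here, a made-up name) instead of separately named arrays, and it fills each inner hash with a single hash-slice assignment.

```perl
use strict;
use warnings;

# Assumed layout: class name => array of student names
my %students_of = (
    Science_class => [ 'George', 'Lisa', 'Mathias' ],
    Math_class    => [ 'Martin', 'Anna', 'Peter', 'George' ],
);

my %register;
for my $class_name (keys %students_of) {
    my $names = $students_of{$class_name};

    # Hash slice on the inner hash: keys 1..N receive the N names in order.
    # $register{$class_name} is autovivified as a hash reference here.
    @{ $register{$class_name} }{ 1 .. scalar @$names } = @$names;
}

print "$register{Math_class}{4}\n";    # George
```

Keeping the lists in a hash also removes the need to build a variable name from a string, which was one of the ideas in the question.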