Initialize empty Array of Hashes of a given length - one-liner - perl

I'd like to pre-initialize the elements of an array of hashes so that when it comes time to fill in the data, I don't need to check for the existence of various members and initialize them on every loop iteration. If I can, I'd like to pre-initialize the general form of the data structure, which should look like this:
$num_sections = 4;
# Some magic to initialize $VAR1 to this:
$VAR1 = {
    'sections' => [
        {},
        {},
        {},
        {}
    ]
};
I would like this to work,
$ph->{sections} ||= [({} x $num_sections )];
but it results in
$VAR1 = {
'sections' => 'HASH(0x21b6110)HASH(0x21b6110)HASH(0x21b6110)HASH(0x21b6110)'
};
And no amount of playing with the () list context and the {} empty hash reference seems to make it work.
This works but it's not quite a one-liner
unless ($ph->{sections})
{
    push @{ $ph->{sections} }, {} foreach (1..$num_sections);
}
There's probably some perl magic that I can use to add the unless to the end, but I haven't quite figured it out.
I feel I'm so close, but I just can't quite get it.
Update: Oleg points out that this probably isn't necessary at all. See comments below.

If the left-hand side of x is not in parens, x repeats the string on its LHS and returns the concatenation of those strings.
If the left-hand side of x is in parens, x repeats the value on its LHS and returns the copies.
This latter approach is closer to what you want, but it's still wrong as you'll end up with multiple references to a single hash. You want to create not only new references, but new hashes as well. For that, you can use the following:
$ph->{sections} ||= [ map { +{} } 1..$num_sections ];
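To make the shared-reference point concrete, here is a minimal sketch. The variable names (@shared, @fresh and the two flags) are made up for illustration; only the x operator and map come from the answer above.

```perl
use strict;
use warnings;

my $num_sections = 4;

# Parenthesized LHS: x repeats the list, but {} is evaluated only once,
# so every element is the same hash reference.
my @shared = ({}) x $num_sections;
$shared[0]{name} = 'first';
my $shared_leaks = exists $shared[1]{name} ? 1 : 0;    # the change shows up in slot 1

# map runs its block once per element, so each {} is a new hash.
my @fresh = map { +{} } 1 .. $num_sections;
$fresh[0]{name} = 'first';
my $fresh_leaks = exists $fresh[1]{name} ? 1 : 0;      # slot 1 is unaffected

print "shared: $shared_leaks, fresh: $fresh_leaks\n";  # shared: 1, fresh: 0
```

The leading + in +{} is there so that Perl parses the braces as an anonymous hash rather than as an empty block.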

Related

Raku: Trouble Accessing Value of a Multidimensional Hash

I am having issues accessing the value of a 2-dimensional hash. From what I can tell online, it should be something like: %myHash{"key1"}{"key2"} #Returns value
However, I am getting the error: "Type Array does not support associative indexing."
Here's a Minimal Reproducible Example.
my %hash = key1-dim1 => key1-dim2 => 42, key2-dim1 => [42, 42];
say %hash{'key1-dim1'}{'key1-dim2'}; # 42
say %hash{'key2-dim1'}{'foo bar'}; # Type Array does not support associative indexing.
Here's another reproducible example, but longer:
my @tracks = 'Foo Bar', 'Foo Baz';
my %counts;
for @tracks -> $title {
    $_ = $title;
    my @words = split(/\s/, $_);
    if (@words.elems > 1) {
        my $i = 0;
        while (@words.elems - $i > 1) {
            my %wordHash = ();
            %wordHash.push: (@words[$i + 1] => 1);
            %counts.push: (@words[$i] => %wordHash);
            say %counts{@words[$i]}{@words[$i+1]}; #===============CRASHES HERE================
            say %counts.kv;
            $i = $i + 1;
        }
    }
}
In my code above, the problem line where the 2-d hash value is accessed will work once in the first iteration of the for-loop. However, it always crashes with that error on the second time through. I've tried replacing the array references in the curly braces with static key values in case something was weird with those, but that did not affect the result. I can't seem to find what exactly is going wrong by searching online.
I'm very new to raku, so I apologize if it's something that should be obvious.
After a second element is pushed to the same key of the Hash, the value under that key is an Array. The easiest way to see this is to print the Hash before the crash:
say "counts: " ~ %counts.raku;
#first time: counts: {:aaa(${:aaa(1)})}
#second time: counts: {:aaa($[{:aaa(1)}, {:aaa(1)}])}
The square brackets are indicating an array.
Maybe BagHash already does some of the work for you. See also "Raku sets without borders".
my @tracks = 'aa1 aa2 aa2 aa3', 'bb1 bb2', 'cc1';
for @tracks -> $title {
    my $n = BagHash.new: $title.words;
    $n.raku.say;
}
#("aa2"=>2,"aa1"=>1,"aa3"=>1).BagHash
#("bb1"=>1,"bb2"=>1).BagHash
#("cc1"=>1).BagHash
Let me first explain the minimal example:
my %hash = key1-dim1 => key1-dim2 => 42,
key2-dim1 => [42, 42];
say %hash{'key1-dim1'}{'key1-dim2'}; # 42
say %hash{'key2-dim1'}{'key2-dim2'}; # Type Array does not support associative indexing.
The problem is that the value associated with key2-dim1 isn't itself a hash but is instead an Array. Arrays (and all other Positionals) only support indexing by position -- by integer. They don't support indexing by association -- by string or object key.
Hopefully that explains that bit. See also a search of SO using the [raku] tag plus 'Type Array does not support associative indexing'.
Your longer example throws an error at this line -- not immediately, but eventually:
say %counts{...}{...}; # Type Array does not support associative indexing.
The hash %counts is constructed by the previous line:
%counts.push: ...
Excerpting the doc for Hash.push:
If a key already exists in the hash ... old and new value are both placed into an Array
Example:
my %h = a => 1;
%h.push: (a => 1); # a => [1,1]
Now consider that the following code would have the same effect as the example from the doc:
my %h;
say %h.push: (a => 1); # {a => 1}
say %h.push: (a => 1); # {a => [1,1]}
Note how the first .push of a => 1 results in a 1 value for the a key of the %h hash, while the second .push of the same pair results in a [1,1] value for the a key.
A similar thing is going on in your code.
In your code, you're pushing the value %wordHash into the @words[$i] key of the %counts hash.
The first time you do this the resulting value associated with the @words[$i] key in %counts is just the value you pushed -- %wordHash. This is just like the first push of 1 above resulting in the value associated with the a key, from the push, being 1.
And because %wordHash is itself a hash, you can associatively index into it. So %counts{...}{...} works.
But the second time you push a value to the same %counts key (i.e. when the key is %counts{@words[$i]}, with @words[$i] set to a word/string/key that is already held by %counts), then the value associated with that key will not end up being associated with %wordHash but instead with [%wordHash, %wordHash].
And you clearly do get such a second time in your code, if the @tracks you are feeding in have titles that begin with the same word. (I think the same is true even if the duplication isn't the first word but instead later ones. But I'm too confused by your code to be sure what the exact broken combinations are. And it's too late at night for me to try to understand it, especially given that it doesn't seem important anyway.)
So when your code then evaluates %counts{@words[$i]}{@words[$i+1]}, it is the same as [%wordHash, %wordHash]{...}. Which doesn't make sense, so you get the error you see.
Hopefully the foregoing has been helpful.
But I must say I'm both confused by your code, and intrigued as to what you're actually trying to accomplish.
I get that you're just learning Raku, and that what you've gotten from this SO might already be enough for you, but Raku has a range of nice high level hash like data types and functionality, and if you describe what you're aiming at we might be able to help with more than just clearing up Raku wrinkles that you and we have been dealing with thus far.
Regardless, welcome to SO and Raku. :)
Well, this one was kind of funny and surprising. You can't go wrong if you follow the other question, however, here's a modified version of your program:
my @tracks = ['love is love','love is in the air', 'love love love'];
my %counts;
for @tracks -> $title {
    $_ = $title;
    my @words = split(/\s/, $_);
    if (@words.elems > 1) {
        my $i = 0;
        while (@words.elems - $i > 1) {
            my %wordHash = ();
            %wordHash{@words[$i + 1]} = 1;
            %counts{@words[$i]} = %wordHash;
            say %counts{@words[$i]}{@words[$i+1]}; # The buck stops here
            say %counts.kv;
            $i = $i + 1;
        }
    }
}
Please check the line where it crashed before. Can you spot the difference? It was kind of a (un)lucky thing that you used i as a loop variable... i is a complex number in Raku. So it was crashing because it couldn't use complex numbers to index an array. You simply had dropped the $.
You can use sigilless variables in Raku, as long as they're not i, or e, or any of the other constants that are already defined.
I've also made a couple of changes to better reflect the fact that you're building a Hash and not an array of Pairs, as Lukas Valle said.

Hash value is not re-initialized when loop is terminated with 'last' keyword

Consider the following nested loops:
my %deleted_documents_names = map { $_ => 1 }
    $self->{MANUAL}->get_deleted_documents();
while ($sth->fetch) {
    .....
    .....
    .....
    while (my ($key, $value) = each(%deleted_documents_names)) {
        if ($document_name eq $key) {
            $del_status = 1;
            last;
        }
    }
    if ($del_status == 1) { next; }
    .....
    .....
    .....
    .....
}
Now, I take a sample case where three values (A,B,C) will be compared against two values (B,C).
First scan:
A compared to B
A compared to C
Second scan:
B compared to B
Loop is terminated.
Third scan:
C is compared with C.
In this case, C should be compared first with B, since B is the first value, but that comparison is skipped: the scan resumes from the element after the one that was found equal. If I remove the last statement and let the inner loop run over every element, it all works fine, but I need to find out why $key refers to the next value rather than the first one when the loop is restarted after being terminated with the last keyword.
Any help will be appreciated.
Use
keys %deleted_documents_names ; # Reset the "each" iterator.
See keys.
But, why are you iterating over the hash? Why don't you just
if (exists $deleted_documents_names{$document_name}) {
each() is a function that returns key-value pairs from a hash until it reaches the end. It is not aware of the scope it was called in, and doesn't know anything about your while loop logic. See the documentation here.
It can be reset by calling keys %hash or values %hash.
Update: however, as Choroba points out, you don't really need this loop. Your loop and accompanying logic could be replaced by this:
next if (exists $deleted_documents_names{$document_name});
(Hashes are designed with a structure that allows a key to be quickly found. In fact, this structure is what gives them the name "hashes". So doing it this way will be much more efficient than looping through all elements and testing each one).
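Both suggestions can be tried on the A, B, C example from the question. This is only a sketch with made-up data: %deleted stands in for %deleted_documents_names, and @log is a hypothetical array used to record the results.

```perl
use strict;
use warnings;

my %deleted = map { $_ => 1 } qw(B C);
my @log;

for my $name (qw(A B C)) {
    keys %deleted;    # void-context keys resets the each() iterator
    my $found = 0;
    while (my ($key) = each %deleted) {
        if ($name eq $key) { $found = 1; last; }    # leaves the iterator mid-hash
    }
    # The direct lookup gives the same answer without any loop.
    push @log, "$name:$found:" . (exists $deleted{$name} ? 1 : 0);
}

print "@log\n";    # A:0:0 B:1:1 C:1:1
```

Without the keys %deleted line, the scan for C would start wherever the scan for B stopped, which is the behaviour described in the question.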

Having difficulty in understanding a piece of Perl code

I am studying an existing Perl program, which includes the following line of code:
@{$labels->{$set}->{"train"}->{"negative"}} = (@{$labels->{$set}->{"train"}->{"negative"}}, splice(@shuffled, 0, 2000));
I am very confused on how to understand this piece of code.
It is not valid Perl as written:
@{$labels->{$set}->{"train"}->{"negative"}} = (@{$labels->{$set}->{"train"}->{"negative"}};
                                              ^
Syntax error - open parenthesis without a matching close parenthesis
If you ignore the open parenthesis, then the LHS and the RHS expressions are identical; it assigns an array value to itself. The arrows -> and {} notations mostly mean that you're dealing with an array ref at the end of a hash reference to a hash reference to a hash reference, or thereabouts. It is a nasty piece of code to have to understand, at best, but the structure may make more sense in the bigger context of the whole program (and the whole program will be considerably bigger, so I don't recommend posting it here).
Double check your copy'n'paste. If that is actually in the Perl script, then it can't be being compiled, much less executed, so you'll have to work out how and why that line is not operative.
The revised expression has the RHS:
(@{$labels->{$set}->{"train"}->{"negative"}}, splice(@shuffled, 0, 2000));
The parentheses build a list; the first term is the original array, and the second term is the result of splice applied to the array @shuffled. So the splice removes the first 2000 elements from @shuffled (fewer if the array is shorter), and the expression as a whole adds the removed elements to the end of the array identified by the complex expression on the LHS.
It would probably be more efficiently written as:
push @{$labels->{$set}->{"train"}->{"negative"}}, splice(@shuffled, 0, 2000);
It's also more economical on the typing, and a lot more economical on the brain cells.
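As a quick check that the two forms behave the same, here is a scaled-down sketch. The data is invented: $labels is pre-filled with one element 'x', $set is 'demo', and the batches are 2 elements instead of 2000.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for the real program's data.
my @shuffled = (1 .. 10);
my $set      = 'demo';
my $labels   = { $set => { train => { negative => ['x'] } } };

# List-assignment form, as in the original line.
@{ $labels->{$set}{train}{negative} } =
    (@{ $labels->{$set}{train}{negative} }, splice(@shuffled, 0, 2));

# push form: appends in place without rebuilding the array.
push @{ $labels->{$set}{train}{negative} }, splice(@shuffled, 0, 2);

my @negative = @{ $labels->{$set}{train}{negative} };
print "negative: @negative; left: ", scalar(@shuffled), "\n";
# negative: x 1 2 3 4; left: 6
```

Each statement moves two elements off the front of @shuffled and onto the end of the target array, so the two forms are interchangeable here.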
The statement:
@{$labels->{$set}->{"train"}->{"negative"}} =
    (@{$labels->{$set}->{"train"}->{"negative"}}, splice(@shuffled, 0, 2000));
Does a number of things at once. It can also be written, slightly more verbosely:
my @array = @{$labels->{$set}->{"train"}->{"negative"}};
my @values = @shuffled[0..1999]; # get the first 2000 values
splice @shuffled, 0, 2000; # delete the values after use
@array = (@array, @values); # add the values to the array
@{$labels->{$set}->{"train"}->{"negative"}} = @array;
As you'll notice, the LENGTH argument to splice is a count of elements to remove, not the index of the last element, which is why the slice above ends at 1999 rather than 2000.
As Jonathan pointed out, it is much simpler to use push.
push @{$labels->{$set}{"train"}{"negative"}}, splice(@shuffled, 0, 2000);
Documentation: splice
$labels is a reference to a hash of hashes with three depths (HoHoH). The lookup $labels->{$set}->{"train"}->{"negative"} returns an array reference.
Hope that helps a bit ..

Is there a name for this sort?

What is the name for the sort used in this answer? I Googled for "perfect insertion sort" but didn't find anything. Here is the code from that answer:
#this is O(n) instead of O(n log n) or worse
sub perfect_insert_sort {
    my $h = shift;
    my @k;
    for my $k (keys %$h) {
        $k[$h->{$k}{order}] = $k;
    }
    return @k;
}
I think I probably should have named that perfect_bucket_sort instead of perfect_insertion_sort.
This isn't insertion sort; in fact it's not even a comparison sort, because the theoretical lower bound for those is Ω(n log n) comparisons.
So it's probably bucket sort; also notice there are no comparisons made :)
It's not really a sort at all. It is, in fact, primarily a map or a transformation. This is an example of the data structure they have:
my $hash = {
    foo => { order => 3 },
    bar => { order => 20 },
    baz => { order => 66 },
};
It's simply a translation of 'order' to elements in an array. For example, if you pass in this $hash to perfect_insert_sort, it will return a 67 element array, with three items (one at index 3, one at 20, and one at 66) and the rest being undef, and entirely in an unsorted order.
Nothing about that function does any sorting of any kind. If there is any sorting going on in that other answer, it's happening before the function is called.
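The 67-element claim is easy to check. The sketch below reuses the function and the example hash from above; @out and @defined are names introduced here just for the demonstration.

```perl
use strict;
use warnings;

sub perfect_insert_sort {
    my $h = shift;
    my @k;
    for my $k (keys %$h) {
        $k[ $h->{$k}{order} ] = $k;    # place each key at its stored index
    }
    return @k;
}

my $hash = {
    foo => { order => 3 },
    bar => { order => 20 },
    baz => { order => 66 },
};

my @out     = perfect_insert_sort($hash);
my @defined = grep { defined } @out;    # drop the undef gaps

print scalar(@out), " slots, ", scalar(@defined), " filled: @defined\n";
# 67 slots, 3 filled: foo bar baz
```

No two keys are ever compared with each other; every key is written straight to the index it already carries.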
@downvoter:
And looking at the other answer, the sorting happens at insertion time. THAT component might be considered a sort. This subroutine, however, does not create that order - it merely reconstitutes it.
Take a look at the classical definition for a sort:
The output is in nondecreasing order (each element is no smaller than the previous element according to the desired total order)
The output is a permutation, or reordering, of the input.
Part 2 is certainly being satisfied: there is a transformation of the hash structure to a list going on. However, part 1 is not satisfied. There is no determining of order going on. The order has been predetermined during insertion. If this were a 'sort', then the following would also be a sort:
my @data = ... ;
my $index = -1;
my %stored = map { ++$index; $_ => { order => $index } } @data;
my @remade_data;
@remade_data[(map { $stored{$_}{order} } keys %stored)] = keys %stored;
As you can see, there is no sorting going on in that chunk of code, merely transformation.
I think it's nothing but an insertion sort.

Modifying hash within a hash in Perl

What is the shortest amount of code to modify a hash within a hash in the following instances:
$hash{'a'} = { 1 => 'one',
               2 => 'two' };
(1) Add a new key to the inner hash of 'a' (ex: c => 4 in the inner hash of 'a')
(2) Changing a value in the inner hash (ex: change the value of 1 to 'ONE')
Based on the question, you seem new to perl, so you should look at perldoc perlop among others.
The values in your %hash are scalars that hold hashrefs. You can dereference a hashref using the -> operator, e.g. $hashref = {foo=>42}; $hashref->{foo}. You can do the same with the values in the hash: $hash{a}->{1}. When you chain the indexes, though, there's some syntactic sugar for an implicit -> between them, so you can just do $hash{a}{1} = 'ONE' and so on.
This question probably also will give you some useful leads.
$hash{a}{c} = 4;
$hash{a}{1} = "ONE";
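Put together as a runnable sketch (the $summary string is only there to show the result, and the starting data is the hash from the question with its values quoted):

```perl
use strict;
use warnings;

my %hash;
$hash{a} = { 1 => 'one', 2 => 'two' };

$hash{a}{c}   = 4;        # (1) add a key; the arrow between subscripts is implied
$hash{a}->{1} = 'ONE';    # (2) change a value, with the arrow written out

my $summary = join ',', map { "$_=$hash{a}{$_}" } sort keys %{ $hash{a} };
print "$summary\n";       # 1=ONE,2=two,c=4
```

Both spellings address the same inner hash, so which one to use is a matter of taste.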