How do I use a map function to create a custom hash from an array in Perl?

I have some JavaScript:
var $things = [
{
"id": "1",
"image": "one.png"
},
{
"id": "2",
"image": "two.png"
},
];
It gets converted to a Perl array ($thingsJSON is a string representation of the above):
my $coder = JSON::XS->new->utf8;
my $things = $coder->decode($thingsJSON);
I want a map function on $things that will return a collection ($args) that looks like this:
(
image => "one.png",
image => "two.png"
)
I want to be able to pass the above as $args to another function:
$Bar->find($args)

Assuming what you actually want is an array containing a list you can pass to that function, which expects key/value pairs, this would work:
@args = map { (image => $_->{image}) } @$things;
For clarity, this is the same as
@args = map { ('image', $_->{image}) } @$things;
That is, map just returns a list in which every even-indexed element (0, 2, ...) is the string "image" and every odd-indexed element is the value of the image key of some element in @$things.
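As a minimal sketch of the above, assuming $things already holds the decoded structure from the question (the data is written inline here instead of going through JSON::XS), the flat list can be inspected like this:

```perl
use strict;
use warnings;

# Stand-in for the decoded JSON from the question.
my $things = [
    { id => "1", image => "one.png" },
    { id => "2", image => "two.png" },
];

# Each element of @$things contributes two items to the output list.
my @args = map { ( image => $_->{image} ) } @$things;
print join( ", ", @args ), "\n";    # image, one.png, image, two.png

# Caveat: assigned to a hash, repeated keys collapse and the last one wins.
my %as_hash = @args;
print scalar( keys %as_hash ), " key: image => $as_hash{image}\n";    # 1 key: image => two.png
```

Whether find() wants the flat list (@args) or a reference to it (\@args) depends on that method's interface, which the question does not show.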

Related

Trouble adding to hash in perl

I am reading a file replacing data and returning output to json. When I try to add a new item to the hash I get the following error. Not a HASH reference When I use ref() I am getting HASH as the type.
I have tried:
my $json_data = decode_json($template);
$json_data->{CommandCenters}{NewItem} = ["haha","moredata"];
This gives the Not a HASH reference error.
The $json_data is below.
{
"Location":"Arkansas",
"CommandCenters": [
{
"secretary": "jill",
"janitor": "mike"
}
]
}
I am looking for the following output after I add the element.
{
"Location":"Arkansas",
"city": "little rock",
"CommandCenters": [
{
"secretary": "jill",
"janitor": "mike"
},
{
"NewItem":["whatever","more data"]
}
]
}
If I use $json_data->{CommandCenters}[0]{NewItem} = ['whatever','sure']; I do not get an error but I get unexpected results.
The data is added but in the incorrect slot.
"commandcenters":
[
"secretary":"jill",
"janitor": "mike",
"newitem":
[
"whatever","sure"
],
]
To add a new element to an array, use push. As we're dealing with an array reference, we need to dereference it first.
push @{ $json_data->{CommandCenters} }, { NewItem => ["haha", "moredata"] };
When I try to add a new item to the hash I get the following error. Not a HASH reference When I use ref() I am getting HASH as the type.
Attention to detail is a vital skill for a successful programmer. And you're missing something subtle here.
When you use ref(), I assume you're passing it your $json_data variable. And that is, indeed, a hash reference. But the line that generates your Not a HASH reference is this line:
$json_data->{CommandCenters}{NewItem} = ["haha","moredata"];
And that's not just treating $json_data as a hash reference ($json_data->{...}) it's also treating $json_data->{CommandCenters} as a hash reference. And that's where your problem is. $json_data->{CommandCenters} is an array reference, not a hash reference. It's generated from the bit of your JSON that looks like this:
"CommandCenters": [
{
"secretary": "jill",
"janitor": "mike"
}
]
And those [ .. ] mark it as an array not a hash. You can't add a new key/value pair to an array; you need to push() new data to the end of the array. Something like:
push @{ $json_data->{CommandCenters} }, { NewItem => ["haha", "moredata"] };
That will leave you with this data structure:
$VAR1 = {
'CommandCenters' => [
{
'janitor' => 'mike',
'secretary' => 'jill'
},
{
'NewItem' => [
'haha',
'moredata'
]
}
],
'Location' => 'Arkansas'
};
And encode_json() will turn that into the JSON that you want.
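For a runnable end-to-end version, here is a sketch that uses the core JSON::PP module (an assumption; the question does not say which JSON module provides decode_json) and an inline $template in place of the file contents:

```perl
use strict;
use warnings;
use JSON::PP;    # core module; exports decode_json and encode_json

# Inline stand-in for the template read from the file.
my $template = '{"Location":"Arkansas","CommandCenters":[{"secretary":"jill","janitor":"mike"}]}';

my $json_data = decode_json($template);

# The top level is a hash reference, but CommandCenters is an array reference.
print ref($json_data), "\n";                      # HASH
print ref( $json_data->{CommandCenters} ), "\n";  # ARRAY

# Dereference the array and append a new hash to it.
push @{ $json_data->{CommandCenters} }, { NewItem => [ "haha", "moredata" ] };

# canonical() sorts keys so the output is stable; pretty() adds indentation.
print JSON::PP->new->canonical->pretty->encode($json_data);
```

Checking ref() on each level you index into, not just on the top-level variable, is a quick way to find which step of a chained access has the wrong type.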

Multiple values in query

I am trying to get stats for categories, where I want to specify multiple categories. I have query params like this
$query_params = json_decode('{
"aggregated_by": "day",
"limit": 500,
"categories": "category1 category2",
"start_date": "2018-08-12",
"end_date": "2018-08-13"}');
Obviously the way that I am defining multiple categories is wrong, but what is the correct way ? I tried array, comma/space separated values but with no luck.
You need to name each category in a distinct argument. In your example, it would probably look like:
$query_params = json_decode('{
"aggregated_by": "day",
"limit": 500,
"categories": "category1",
"categories": "category2",
"start_date": "2018-08-12",
"end_date": "2018-08-13"}');
So I solved this by passing the parameters as array:
$query = array('categories' => ['cat1','cat2']);
And then I modified sendgrid client buildUrl method like this:
private function buildUrl($queryParams = null)
{
$path = '/' . implode('/', $this->path);
if (isset($queryParams)) {
$queryParams = \http_build_query($queryParams);
$path .= '?' . preg_replace('/%5B(?:[0-9]|[1-9][0-9]+)%5D=/', '=', $queryParams);
}
return sprintf('%s%s%s', $this->host, $this->version ?: '', $path);
}
I added the preg_replace part so the url is correctly built without including array key from nested array.

How do I work globally on a complex data structure using subroutines?

Specifically my data structure looks like this
{
"SomeGuy" => {
date_and_time => "11-04-2013",
Id => 7,
nr => 52,
picks => [
{ "This is an option" => "Option3" },
{ "This is another option" => "Option4" },
{ "This is another option" => "Option1" },
{ "And another one" => "Something" },
],
time_of => "06:11 AM",
total => 1,
},
"Another Guy" => { ... },
}
This is output via Data::Dump. The actual data structure contains a lot more records like "SomeGuy". The structure is identical for all of them.
I populate this data structure this way:
$guys->{$profile}{options}[$total++]{$var} = $var2;
$guys->{$profile}{Id} = $i;
$guys->{$profile}{date_and_time} = get_date($Time[0]);
$guys->{$profile}{time_of} = $Time[1];
$guys->{$profile}{total} = keys (% {$guys->{$profile}{options}[0]});
$guys->{$profile}{nr} = $pNr;
Having all this, what I want to do next is operate on this data structure. I repeat that there are many, many records in the data structure.
When I output the contents of it I get it in jumbled order, as to say, not in the order that it was populated. I've tried this with Data::Dumper, Data::Dump and iterating through the records myself by hand.
I know that the methods in the Data namespace are notorious for this, that's why Data::Dumper provides a way to sort through a subroutine and Data::Dump provides a default one.
So I have the data structure. It looks like I expect it to be, I have all the data as I knew it should look in it, valid. I want to sort the records according to their Id field. My thinking is that I have to use a subroutine and essentially pass a reference of the data structure to it and do the sorting there.
sub sortt {
my $dref = shift @_;
foreach my $name ( sort { $dref->{$a}{Id} <=> $dref->{$b}{Id} } keys %$dref ) {
print "$data->{$name}{Id}: $name \n";
}
}
This gets called with (in the same scope where the structure is populated, so no worries there):
sortt(\$guys);
The error is:
Not a HASH reference at perlprogram.pl line 452
So I go and use ref in the subroutine to make sure I'm passing an actual reference. And it says REF.
Next I go into desperate mode and try some stupid things like calling it with:
sortt(\%$guys)
But if I'm not mistaken this just sends a copy to the subroutine and just sorts that copy locally, so no use there.
It's no use if I make a copy and return it from the subroutine, I just want to pass a reference of my data structure and sort it and have it reflect those changes globally (or in the calling scope per se). How would I do this?
Leaving aside your syntax problem with "Not a ref", you are approaching the problem from the wrong end in the first place. I'll leave small syntactic details to others (see Ikegami's comment).
You can NOT sort them at all, because $guys is a hash reference, not an array. Hashes are NEVER ordered in Perl. If you want ordering, you have three solutions:
1. Store an ordered list of names as a separate array.
my @ordered_names = sort { $guys->{$a}{Id} <=> $guys->{$b}{Id} } keys %$guys;
Then you use the array for ordering and go back to the hashref for individual records.
2. Add the name to the individual guy's hash, instead of using it as a key of the outer hash reference. $guys should then be an array reference. The downside to this approach is that you can't look up a person's record by their name any more; if that functionality is needed, use #1.
@codnodder's answer shows how to do that, if you don't care about accessing records by name.
3. Use a Tie::* module (Tie::IxHash, Tie::Hash::Sorted). NOT recommended since it's slower.
Perl hashes are inherently unordered. There is no way you can sort them, or reorder them at all. You have to write code to access the elements in the order you want.
Your sortt subroutine does nothing but print the ID and the name of each hash element, sorted by the ID. Except that it doesn't, because you are trying to use the variable $data when you have actually set up $dref. That is likely to be the cause of your Not a HASH reference error, although unless you show your entire code, or at least indicate which is perlprogram.pl line 452, then we cannot help further.
The best way to do what you want is to create an array of hash keys, which you can sort in whatever order you want. Like this
my @names_by_id = sort { $guys->{$a}{Id} <=> $guys->{$b}{Id} } keys %$guys;
Then you can use this to access the hash in sorted order, like this, which prints the same output as your sortt is intended to, but formatted a little more nicely
for my $name (@names_by_id) {
printf "%4d: %s\n", $guys->{$name}{Id}, $name;
}
If you want to do anything else with the hash elements in sorted order then you have to use this technique.
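Putting the pieces together, here is a small self-contained sketch with trimmed-down sample data. The helper name names_by_id is hypothetical; it receives the hash reference itself (not a reference to it) and returns the keys in Id order, leaving the hash untouched:

```perl
use strict;
use warnings;

# Trimmed-down sample data in the shape used by the question.
my $guys = {
    "SomeGuy"     => { Id => 7, nr => 52 },
    "Another Guy" => { Id => 9, nr => 99 },
    "My Guy"      => { Id => 1, nr => 42 },
};

# Hypothetical helper: returns the names ordered by their Id field.
sub names_by_id {
    my ($dref) = @_;
    my @names = sort { $dref->{$a}{Id} <=> $dref->{$b}{Id} } keys %$dref;
    return @names;
}

# $guys is already a reference; taking \$guys wraps it a second time.
print ref($guys),  "\n";    # HASH
print ref(\$guys), "\n";    # REF

my @order = names_by_id($guys);
printf "%4d: %s\n", $guys->{$_}{Id}, $_ for @order;
```

The sorted array is only an index: lookups by name such as $guys->{"My Guy"} still work, and other orderings can be built the same way without copying the records.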
$guys is already a hash ref, so you just need sortt($guys)
If you want a sorted data structure, you need something like this:
my @guys_sorted =
map { { $_ => $guys->{$_} } }
sort { $guys->{$a}{Id} <=> $guys->{$b}{Id} } keys %$guys;
print(Dumper(\@guys_sorted));
Or, in a sub:
sub sortt {
# returns a SORTED ARRAY of HASHREFS
my $ref = shift;
return map { { $_ => $ref->{$_} } }
sort { $ref->{$a}{Id} <=> $ref->{$b}{Id} } keys %$ref;
}
print(Dumper([sortt($guys)]));
That's a pretty complex data structure. If you commonly use structures like this in your program, I suggest you take your Perl programming skills up a notch, and look into learning a bit of Object Oriented Perl.
Object Oriented Perl is fairly straight forward: Your object is merely that data structure you've previously created. Methods are merely subroutines that work with that object. Most methods are getter/setter methods to set up your structure.
Initially, it's a bit more writing, but once you get the hang of it, the extra writing is easily compensated by the savings in debugging and maintaining your code.
Object Oriented Perl does two things: It first makes sure that your structure is correct. For example:
$some_guy->{Picks}->[2]->{"this is an option"} = "Foo!";
Whoops! That should have been {picks}. Imagine trying to find that error in your code.
In OO-Perl, if I mistyped a method's name, the program will immediately pick it up:
$some_guy->picks(
{
"This is an option" => "Foo!",
"This is option 2" => "Bar!",
}
)
If I had $some_guy->Picks, I would have gotten a runtime error.
It also makes you think of your structure as an object. For example, what are you sorting on? You're sorting on your Guys' IDs, and the guys are stored in a hash called %guys.
# $a and $b are hash keys from `%guys`.
# $guys{$a} and $guys{$b} represent the guy objects.
# I can use the id method to get the guys' IDs
sub sort_guys_by_id {
$guys{$a}->id <=> $guys{$b}->id; # numeric comparison of IDs. That was easy!
}
Take a look at the tutorial. You'll find yourself writing better programs with fewer errors.
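To make that concrete, here is a minimal sketch of such a class. The package name Guy, its constructor arguments, and the accessor names are all hypothetical, chosen only to mirror the fields in the question:

```perl
use strict;
use warnings;

package Guy;    # hypothetical class name

sub new {
    my ( $class, %args ) = @_;
    my $self = { id => $args{id}, picks => $args{picks} || [] };
    return bless $self, $class;
}

# Read-only accessor for the ID.
sub id { return $_[0]{id} }

# Combined getter/setter for the picks array reference.
sub picks {
    my $self = shift;
    $self->{picks} = shift if @_;
    return $self->{picks};
}

package main;

my %guys = (
    "SomeGuy"     => Guy->new( id => 7 ),
    "Another Guy" => Guy->new( id => 9 ),
    "My Guy"      => Guy->new( id => 1 ),
);

# Sort the names by the objects' IDs.
my @sorted = sort { $guys{$a}->id <=> $guys{$b}->id } keys %guys;
print join( ", ", @sorted ), "\n";    # My Guy, SomeGuy, Another Guy

# A misspelled method name dies at runtime instead of silently creating a key.
my $ok = eval { $guys{"SomeGuy"}->Picks( [] ); 1 };
print "Caught: $@" unless $ok;
```

The eval is only there so the sketch can show the error and keep running; in real code you would normally let the exception surface.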
With your heart set on a sorted data structure, I recommend the following. It is a simple array of hashes and, rather than using the name string as the key for a single-element hash, it adds a new name key to each guy's data.
I hope that, with the Data::Dump output, it is self-explanatory. It is sorted by the Id field as you requested, but it is less flexible than a separate index array, which would allow any ordering at all without modifying or copying the original hash data.
use strict;
use warnings;
use Data::Dump;
my $guys = {
"SomeGuy " => {
date_and_time => "11-04-2013",
Id => 7,
nr => 52,
picks => [
{ "This is an option" => "Option3" },
{ "This is another option" => "Option4" },
{ "This is another option" => "Option1" },
{ "And another one" => "Something" },
],
time_of => "06:11 AM",
total => 1,
},
"Another Guy" => { Id => 9, nr => 99, total => 6 },
"My Guy" => { Id => 1, nr => 42, total => 3 },
};
my @guys_sorted = map {
my $data = $guys->{$_};
$data->{name} = $_;
$data;
}
sort {
$guys->{$a}{Id} <=> $guys->{$b}{Id}
} keys %$guys;
dd \@guys_sorted;
output
[
{ Id => 1, name => "My Guy", nr => 42, total => 3 },
{
date_and_time => "11-04-2013",
Id => 7,
name => "SomeGuy ",
nr => 52,
picks => [
{ "This is an option" => "Option3" },
{ "This is another option" => "Option4" },
{ "This is another option" => "Option1" },
{ "And another one" => "Something" },
],
time_of => "06:11 AM",
total => 1,
},
{ Id => 9, name => "Another Guy", nr => 99, total => 6 },
]

How to access this._id in map function in MongoDB MapReduce?

I'm doing a MapReduce in Mongo to generate a reverse index of tokens for some documents. I am having trouble accessing document's _id in the map function.
Example document:
{
"_id" : ObjectId("4ea42a2c6fe22bf01f000d2d"),
"attributes" : {
"name" : "JCDR 50W38C",
"upi-tokens" : [
"50w38c",
"jcdr"
]
},
"sku" : "143669259486830515"
}
(The field attributes['upi-tokens'] is a list of text tokens I want to create a reverse index for.)
Map function (source of the problem):
m = function () {
this.attributes['upi-tokens'].forEach(
function (token) { emit(token, {ids: [ this._id ]} ); }
); }
Reduce function:
r = function (key, values) {
var results = new Array;
for (v in values) {
results = results.concat(v.ids);
}
return {ids:results};
}
MapReduce call:
db.offers.mapReduce(m, r, { out: "outcollection" } )
PROBLEM: The resulting collection has null values everywhere I'd expect an id, instead of actual ObjectID strings.
Possible reason:
I was expecting the following 2 functions to be equivalent, but they aren't.
m1 = function (d) { print(d['_id']); }
m2 = function () { print(this['_id']); }
Now I run:
db.offers.find().forEach(m1)
db.offers.find().forEach(m2)
The difference is that m2 prints undefined for each document while m1 prints the ids as desired. I have no clue why.
Questions:
How do I get the _id of the current object in the map function for use in MapReduce? this._id or this['_id'] doesn't work.
Why exactly aren't m1 and m2 equivalent?
Got it to work... I made quite simple JS mistakes:
The inner forEach() in the map function seems to overwrite the 'this' object; this is no longer the main document (which has an _id) but the iterated object inside the loop...
...or it was simply because in JS the for..in loop only returns the keys, not values, i.e.
for (v in values) {
now requires
values[v]
to access the actual array value. Duh...
The way I circumvented mistake #1 is by using for..in loop instead of ...forEach() loop in the map function:
m = function () {
for (t in this.attributes['upi-tokens']) {
var token = this.attributes['upi-tokens'][t];
emit (token, { ids: [ this._id ] });
}
}
That way "this" refers to what it needs to.
Could also do:
var that = this;
this.attributes['upi-tokens'].forEach( function (d) {
...
that._id...
...
});
probably would work just fine.
Hope this helps someone.

Populating a HoH, whose outer hash keys are in one array, and inner hash values are in another

Explanation
One array holds a set of keys. The values to these keys are inner hashes. The keys of these inner hashes, in this case are numbers (like array indices). Another array holds the values of the inner hash.
Question:
How can you populate the outer hash keys with the correct corresponding values (ie. correct inner hash)?
Constraints
I'd prefer a solution utilizing slices, map or grep, eliminating the cascading for loops.
I realize it should be an HoA. But this is only for me to learn, it has no functional value...
Working Code:
This code works as I want but I would like to use more advanced techniques:
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;
my %register=();
my @classNames = ('Science_class', 'Math_class');
my @Science_class_student_names = ('George', 'Lisa', 'Mathias'); #prob from file
my @Math_class_student_names = ('Martin', 'Anna', 'Peter', 'George'); #prob from file
foreach my $className (@classNames) {
my $array_name = $className.'_'.'student_names';
if ($array_name =~ /Science/) {
foreach (0..$#Science_class_student_names ) {
$register{$className}{$_ + 1} = $Science_class_student_names[$_];
}
}
elsif ($array_name =~ /Math/) {
foreach (0..$#Math_class_student_names ) {
$register{$className}{$_ + 1} = $Math_class_student_names[$_];
}
}
}
print Dumper(\%register);
Ideas
A hash slice works for direct key-value pairs, but the intermediate keys are throwing me off. Trying something like: @register{@classNames} = map{$count => $student}
One idea I had, before the if statements was if there was a way to use a string in the name of an array: $#($array_name)student_names but that doesn't work.
Another would be to separately create an array of all the inner hash keys, use a slice and then put that hash into the outer hash.
The only other idea I had was using an AoA to hold all the 'inner hash value' arrays (i.e. my @studentNames = (\@Science_class_student_names, \@Math_class_student_names);) but I haven't gotten anywhere with that yet.
It's easier to work on it a layer at a time. For the inner layer, it seems you want
1 => George,
2 => Lisa,
...
So start by figuring out how to do that.
map { $_+1 => $Science_class_student_names[$_] }
0..$#Science_class_student_names
So you end up with
$register{'Science_class'} = {
map { $_+1 => $Science_class_student_names[$_] }
0..$#Science_class_student_names
};
$register{'Math_class'} = {
map { $_+1 => $Math_class_student_names[$_] }
0..$#Math_class_student_names
};
If you generalise the inner layer, you get
for (
[ 'Science_class' => \@Science_class_student_names ],
[ 'Math_class' => \@Math_class_student_names ],
) {
my ($class_name, $student_names) = @$_;
$register{$class_name} = {
map { $_+1 => $student_names->[$_] }
0..$#$student_names
};
}
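Since the question asked for slices and map, here is one possible variation as a sketch. It assumes the student lists are kept in a single hash keyed by class name (%students_for is a name introduced here for illustration) rather than in separately named arrays, and it fills each inner hash with a hash slice:

```perl
use strict;
use warnings;

# Assumed layout: one hash mapping each class name to its student list.
my %students_for = (
    Science_class => [ 'George', 'Lisa', 'Mathias' ],
    Math_class    => [ 'Martin', 'Anna', 'Peter', 'George' ],
);

my %register = map {
    my $names = $students_for{$_};
    my %inner;
    @inner{ 1 .. @$names } = @$names;    # hash slice: keys 1..N, values in list order
    ( $_ => \%inner );                   # one outer key/value pair per class
} keys %students_for;

print "$register{Science_class}{2}\n";    # Lisa
print "$register{Math_class}{4}\n";       # George
```

Keeping the lists in a hash avoids having to build variable names from strings, which is what the $array_name idea in the question was reaching for.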