Iterating over data structure - perl

I am trying to iterate over this data structure:
$deconstructed->{data}->{workspaces}[0]->{workspace}->{facts}[0]->{code}
where fact[0] is increasing. It's several files I am processing so the number of {facts}[x] varies.
I thought this might work but it doesn't seem to be stepping up the $iter var:
foreach $iter(@{$deconstructed->{data}->{workspaces}[0]->{workspace}->{facts}}){
print $deconstructed->{data}->{workspaces}[0]->{workspace}->{facts}[$iter]->{code}."\n";
}
I'm totally digging data structures but this one is stumping me. Any advice what might be wrong here?

$iter is being set to the content of each item in the array, not the index. For example:
my $a = [ 'a', 'b', 'c' ];
for my $i (@$a) {
print "$i\n";
}
...prints:
a
b
c
Try:
foreach $iter (@{$deconstructed->{data}->{workspaces}[0]->{workspace}->{facts}}){
print $iter->{code}."\n";
}

$iter is not going to be an index that you can subscript the array with, it is rather the current element of the array. So I guess you should be fine with:
$iter->{code}

Your $iter contains the data structure. What you basically want is:
foreach my $elem ( @{$deconstructed->{data}->{workspaces}[0]->{workspace}->{facts}} ){
print $elem->{code};
}
or:
foreach my $iter ( 0 .. $#{$deconstructed->{data}->{workspaces}[0]->{workspace}->{facts}} ){
print $deconstructed->{data}->{workspaces}[0]->{workspace}->{facts}[$iter]->{code}."\n";
}

Since you are looping over the array, your misnamed $iter is the value you are looking for, not an index.
If you want to loop over the indexes instead, do:
foreach $iter ( 0 .. $#{$deconstructed->{data}->{workspaces}[0]->{workspace}->{facts}} ) {
print "Index $iter: ",
$deconstructed->{data}->{workspaces}[0]->{workspace}->{facts}[$iter]->{code}."\n";
}
Also note that you can drop -> between two [] or {}:
$deconstructed->{data}{workspaces}[0]{workspace}{facts}[$iter]{code}
I recommend reading http://perlmonks.org/?node=References+quick+reference.

When you have ugly data structures like this, make an interface for it so your life is easier:
foreach my $fact ( $data_obj->facts ) { # make some lightweight class for this
....;
}
Even without that, consider using a reference to just the part of the data structure you need so you don't think about the rest:
my $facts = $deconstructed->{data}{workspaces}[0]{workspace}{facts};
foreach my $fact ( @$facts ) {
print "Thing is $fact->{code}\n";
}
It's just a reference, so you're not recreating anything. Since you only have to think about the parts beyond the facts key, the problem doesn't look as hard.
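To make the difference between looping over elements and looping over indexes concrete, here is a minimal self-contained sketch. The sample structure and the codes A1, B2 and C3 are invented for illustration; only the shape follows the question.

```perl
use strict;
use warnings;

# Hypothetical sample data shaped like the structure in the question.
my $deconstructed = {
    data => {
        workspaces => [
            { workspace => { facts => [
                { code => 'A1' },
                { code => 'B2' },
                { code => 'C3' },
            ] } },
        ],
    },
};

# Keep a reference to just the array of facts.
my $facts = $deconstructed->{data}{workspaces}[0]{workspace}{facts};

# Looping over the elements: each $_ is a hash reference.
my @by_element = map { $_->{code} } @$facts;

# Looping over the indexes: $#$facts is the last valid index.
my @by_index = map { $facts->[$_]{code} } 0 .. $#$facts;

print "@by_element\n";
print "@by_index\n";
```

Both loops collect the same three codes; the element form is usually easier to read when the index itself is not needed.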

Related

Multidimension array in perl

I am working on a short script in which two to three variables are linked with each other.
Example:
my @batch;
my @case;
my @type = {
back => "sticker",
front => "no sticker",
};
for (my $i=0; $i<$#batch; $i++{
for (my $j=0; $j<$#batch; $j++{
if ($batch[$i]=="health" && $case[$i]$j]=="pain"){
$type[$i][$j]->back = "checked";
}
}
}
In this short code I want to use @type as $type[$i][$j]->back and $type[$i][$j]->front, but I am getting an error that the array reference is not defined. Can anyone help me fix this?
Perl two-dimensional arrays are just arrays of arrays: each element of the top level array contains a (reference to) another array. The best reference for this is perldoc perlreftut
From what I can understand, you want an array of arrays of hashes. $type[$i][$j]->back and $type[$i][$j]->front are method calls in Perl, and what you want is $type[$i][$j]{back} and $type[$i][$j]{front}.
use strict;
use warnings;
my @batch;
my @case;
# Populate @batch and @case
my @type;
for my $i (0 .. $#batch) {
for my $j (0 .. $#{ $case[$i] } ) {
if ($batch[$i] eq 'health' and $case[$i][$j] eq 'pain') {
$type[$i][$j]{back} = 'checked';
}
}
}
But I am very worried about your design. @type will be full of undefined elements, with only occasional ones set to checked. A proper fix depends entirely on what you need to do with @type once you have built it.
I hope this helps
Perl doesn't have multidimensional variables. To emulate multidimensional arrays, you can use what are called references. A reference is a way of referring to a memory location of another Perl structure such as an array or hash.
References allows you to build up more complex structures. For example, you could have an array and instead of each element in the array having a distinct value, it could point to another array. Using this, I can treat my array of arrays as a two dimensional array. But it's not a two dimensional array.
In a two dimensional array, each column ($j) has the same length. That's guaranteed. In Perl, what you have is each row ($i), pointing to a different array of columns ($j), and each of those column arrays could have a different number of elements (or even none at all! That inner array $j may not even be defined!).
Therefore, I have to check each column and see exactly how many values it might have:
for my $i ( 0..$#array ) {
if ( ref $array[$i] ne "ARRAY" ) {
die qq(There is no sub array! for \$array[$i]!\n);
}
my @temp_j_array = @{ $array[$i] }; # This is how you dereference a reference
for my $j ( 0..$#temp_j_array ) {
# Here be dragons...
}
}
Note that I have to see exactly how many columns are in my inner ($j) array before I can go through it.
By the way, notice how I use .. to index my arrays. It's a lot cleaner than using that three part for loop which is very error prone. For example, should you check $i < $#array or $i <= $#array? See the difference?
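The off-by-one risk can be seen in a small sketch. The three-element array is made up for illustration; the first loop uses the same `$i < $#array` condition as the loops in the question.

```perl
use strict;
use warnings;

my @array = ('a', 'b', 'c');   # last index ($#array) is 2

# C-style loop with "<" against $#array stops one element early.
my @short;
for (my $i = 0; $i < $#array; $i++) {
    push @short, $array[$i];
}

# The range 0 .. $#array visits every index.
my @full;
for my $i (0 .. $#array) {
    push @full, $array[$i];
}

print scalar(@short), " vs ", scalar(@full), "\n";
```

The first loop never reaches the last element, which is exactly the kind of mistake the range form avoids.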
Since you're already dealing with a very complex structure (an array of arrays), I'm going to make it even more complex: (An array of arrays of hashes). This added complexity allows me to get rid of three separate variables. Instead of trying to keep @batch, @case and @type in sync with each other, I can make these keys to my inner most hash:
my @structure = ... # Some sort of structure...
for my $i ( 0..$#structure ) {
my @temp = @{ $structure[$i] }; # This is a reference to an array. Dereference it.
for my $j ( 0..$#temp ) {
if ( $structure[$i]->[$j]->{batch} eq "health"
and $structure[$i]->[$j]->{case} eq "pain" ) {
$structure[$i]->[$j]->{back} = "checked";
}
}
}
This is a very common way to use Perl references to build more complex data structures:
my %employees; # Keyed by employee number:
$employees{1001}->{NAME} = "Bob";
$employees{1001}->{JOB} = "Yes man";
$employees{1002}->{NAME} = "Susan";
$employees{1002}->{JOB} = "sycophant";
You had some syntax errors, and were using the numeric comparison operator (==) where the string comparison operator (eq) is needed.

Perl need the right grep operator to match value of variable

I want to see if I have repeated items in my array, there are over 16.000 so will automate it
There may be other ways but I started with this and, well, would like to finish it unless there is a straightforward command. What I am doing is shifting and pushing from one array into another and this way, check the destination array to see if it is "in array" (like there is such a command in PHP).
So, I got this sub routine and it works with literals, but it doesn't with variables. It is because of the 'eq' or whatever I should need. The 'sourcefile' will contain one or more of the words of the destination array.
# Here I just fetch my file
$listamails = <STDIN>;
# Remove the newlines filename
chomp $listamails;
# open the file, or exit
unless ( open(MAILS, $listamails) ) {
print "Cannot open file \"$listamails\"\n\n";
exit;
}
# Read the list of mails from the file, and store it
# into the array variable @sourcefile
@sourcefile = <MAILS>;
# Close the handle - we've read all the data into @sourcefile now.
close MAILS;
my @destination = ('hi', 'bye');
sub in_array
{
my ($destination,$search_for) = @_;
return grep {$search_for eq $_} @$destination;
}
for($i = 0; $i <=100; $i ++)
{
$elemento = shift @sourcefile;
if(in_array(\@destination, $elemento))
{
print "it is";
}
else
{
print "it aint there";
}
}
Well, if instead of including the $elemento in there I put a 'hi' it does work and also I have printed the value of $elemento which is also 'hi', but when I put the variable, it does not work, and that is because of the 'eq', but I don't know what else to put. If I put == it complains that 'hi' is not a numeric value.
When you want distinct values think hash.
my %seen;
@seen{ @array } = ();
if (keys %seen == @array) {
print "\@array has no duplicate values\n";
}
It's not clear what you want. If your first sentence is the only one that matters ("I want to see if I have repeated items in my array"), then you could use:
my %seen;
if (grep ++$seen{$_} >= 2, @array) {
say "Has duplicates";
}
You said you have a large array, so it might be faster to stop as soon as you find a duplicate.
my %seen;
for (@array) {
if (++$seen{$_} == 2) {
say "Has duplicates";
last;
}
}
By the way, when looking for duplicates in a large number of items, a strategy based on sorting is much faster than searching the whole list once for every item. After sorting the items, all duplicates will be right next to each other, so to tell if something is a duplicate, all you have to do is compare it with the previous one:
my @sorted = sort @sourcefile;
for (my $i = 1; $i < @sorted; ++$i) { # Start at 1 because we'll check the previous one
print "$sorted[$i] is a duplicate!\n" if $sorted[$i] eq $sorted[$i - 1];
}
This will print multiple dupe messages if there are multiple dupes, but you can clean it up.
As eugene y said, hashes are definitely the way to go here. Here's a direct translation of the code you posted to a hash-based method (with a little more Perlishness added along the way):
my @destination = ('hi', 'bye');
my %in_array = map { $_ => 1 } @destination;
for my $i (0 .. 100) {
$elemento = shift @sourcefile;
if(exists $in_array{$elemento})
{
print "it is";
}
else
{
print "it aint there";
}
}
Also, if you mean to check all elements of @sourcefile (as opposed to testing the first 101 elements) against @destination, you should replace the for line with
while (@sourcefile) {
Also also, don't forget to chomp any values read from a file! Lines read from a file have a linebreak at the end of them (the \r\n or \n mentioned in comments on the initial question), which will cause both eq and hash lookups to report that otherwise-matching values are different. This is, most likely, the reason why your code is failing to work correctly in the first place and changing to use sort or hashes won't fix that. First chomp your input to make it work, then use sort or hashes to make it efficient.
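A short sketch of why the trailing newline matters. The three lines below are invented stand-ins for what reading from the MAILS handle would return; the lookup hash is built the same way as above.

```perl
use strict;
use warnings;

my %in_array = map { $_ => 1 } ('hi', 'bye');

# Simulated lines as they would come from <MAILS>, newline included.
my @sourcefile = ("hi\n", "hello\n", "bye\n");

# Without chomp nothing matches, because "hi\n" is not equal to "hi".
my @raw_hits = grep { exists $in_array{$_} } @sourcefile;

# chomp removes the trailing newline from every element in place.
chomp @sourcefile;
my @hits = grep { exists $in_array{$_} } @sourcefile;

print scalar(@raw_hits), " match(es) before chomp, ", scalar(@hits), " after\n";
```

Note that chomp only removes the input record separator (normally "\n"). If the file has Windows line endings and is read on Unix, a leftover "\r" would still break the comparison; stripping with s/\r?\n\z// is one way to handle both.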

Converting code to perl sub, but not sure I'm doing it right

I'm working from a question I posted earlier (here), and trying to convert the answer to a sub so I can use it multiple times. Not sure that it's done right though. Can anyone provide a better or cleaner sub?
I have a good deal of experience programming, but my primary language is PHP. It's frustrating to know how to execute in one language, but not be able to do it in another.
sub search_for_key
{
my ($args) = @_;
foreach $row(@{$args->{search_ary}}){
print "@$row[0] : @$row[1]\n";
}
my $thiskey = NULL;
my @result = map { $args->{search_ary}[$_][0] } # Get the 0th column...
grep { @$args->{search_in} =~ /$args->{search_ary}[$_][1]/ } # ... of rows where the
0 .. $#array; # first row matches
$thiskey = @result;
print "\nReturning: " . $thiskey . "\n";
return $thiskey;
}
search_for_key({
'search_ary' => $ref_cam_make,
'search_in' => 'Canon EOS Rebel XSi'
});
---Edit---
From the answers so far, I've cobbled together the function below. I'm new to Perl, so I don't really understand much of the syntax. All I know is that it throws an error (Not an ARRAY reference at line 26.) about that grep line.
Since I seem to not have given enough info, I will also mention that:
I am calling this function like this (which may or may not be correct):
search_for_key({
'search_ary' => $ref_cam_make,
'search_in' => 'Canon EOS Rebel XSi'
});
And $ref_cam_make is an array I collect from a database table like this:
$ref_cam_make = $sth->fetchall_arrayref;
And it is in the structure like this (if I understood how to make the associative fetch work properly, I would like to use it like that instead of by numeric keys):
Reference Array
Associative
row[1][cam_make_id]: 13, row[1][name]: Sony
Numeric
row[1][0]: 13, row[1][1]: Sony
row[0][0]: 19, row[0][1]: Canon
row[2][0]: 25, row[2][1]: HP
sub search_for_key
{
my ($args) = @_;
foreach my $row(@{$args->{search_ary}}){
print "@$row[0] : @$row[1]\n";
}
print grep { $args->{search_in} =~ @$args->{search_ary}[$_][1] } @$args->{search_ary};
}
You are moving in the direction of a 2D array, where the [0] element is some sort of ID number and the [1] element is the camera make. Although reasonable in a quick-and-dirty way, such approaches quickly lead to unreadable code. Your project will be easier to maintain and evolve if you work with richer, more declarative data structures.
The example below uses hash references to represent the camera brands. An even nicer approach is to use objects. When you're ready to take that step, look into Moose.
use strict;
use warnings;
demo_search_feature();
sub demo_search_feature {
my @camera_brands = (
{ make => 'Canon', id => 19 },
{ make => 'Sony', id => 13 },
{ make => 'HP', id => 25 },
);
my @test_searches = (
"Sony's Cyber-shot DSC-S600",
"Canon cameras",
"Sony HPX-32",
);
for my $ts (@test_searches){
print $ts, "\n";
my @hits = find_hits($ts, \@camera_brands);
print ' => ', cb_stringify($_), "\n" for @hits;
}
}
sub cb_stringify {
my $cb = shift;
return sprintf 'id=%d make=%s', $cb->{id}, $cb->{make};
}
sub find_hits {
my ($search, $camera_brands) = @_;
return grep { $search =~ $_->{make} } @$camera_brands;
}
This whole sub is really confusing, and I'm a fairly regular perl user. Here are some blanket suggestions.
Do not create your own undef ever -- use undef then return at the bottom return $var // 'NULL'.
Do not ever do this: foreach $row, because foreach my $row is less prone to create problems. Localizing variables is good.
Do not needlessly concatenate, for it offends the style god: not this, print "\nReturning: " . $thiskey . "\n";, but print "\nReturning: $thiskey\n";, or if you don't need the first \n: say "Returning: $thiskey"; (5.10 and later, with use feature 'say')
greping over 0 .. $#array; is categorically lame, just grep over the array: grep {} @{$foo[0]}, and with that code being so complex you almost certainly don't want grep (though I don't understand what you're doing to be honest.). Check out perldoc -q first -- in short grep doesn't stop until the end.
Lastly, do not assign an array to a scalar: $thiskey = @result; is an implicit $thiskey = scalar @result; (see perldoc -q scalar) for more info. What you probably want is to return the array reference. Something like this (which eliminates $thiskey)
printf "\nReturning: %s\n", join ', ', @result;
@result ? \@result : 'NULL';
If you're intending to return whether a match is found, this code should work (inefficiently). If you're intending to return the key, though, it won't -- the scalar value of @result (which is what you're getting when you say $thiskey = @result;) is the number of items in the list, not the first entry.
$thiskey = @result; should probably be changed to $thiskey = $result[0];, if you want mostly-equivalent functionality to the code you based this off of. Note that it won't account for multiple matches anymore, though, unless you return @result in its entirety, which kinda makes more sense anyway.
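The scalar-versus-list distinction is easy to check in isolation. In this sketch the two ids in @result are placeholders chosen for illustration, not output from the original function.

```perl
use strict;
use warnings;

my @result = (19, 13);       # hypothetical matching ids

my $count   = @result;       # scalar context: number of elements
my ($first) = @result;       # list context: first element
my $same    = $result[0];    # explicit subscript, same value as $first

print "count=$count first=$first same=$same\n";
```

The parentheses around $first are what switch the assignment to list context; without them you get the count.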

Matching elements from 2 arrays in perl

Right now I am attempting to synchronize two data files that are listed by date so that i can make comparisons later on. However I can not seem to print out only the lines where the dates match. At this point I have separated out the data for each file into 2 arrays. I need to find only the dates that are in both arrays and print them out. Any suggestions would be much appreciated.
Here is a sample set of the raw data that I am working with, each file is in the same format:
09/11/2009,00:56:00,51.602,47.894,87,88,0,1032
09/12/2009,00:56:00,57.794,55.796,93,54,0,1023.6
09/13/2009,00:56:00,64.292,62.204,93,66,0,1014.4
09/14/2009,00:56:00,61.592,55.4,80,25,0,1009.6
09/15/2009,00:56:00,58.604,53.798,84,31,0,1009.1
09/16/2009,00:56:00,53.6,48.902,84,45,0,1017
I have split the date into an array for each file. My ultimate goal is to only print lines of code where both files have data. So to do this I wanted to compare the 2 arrays with the elements being the dates.
My initial code looked like this:
foreach $bdate(@bdate){
while (<PL>){
chomp;
@arr = split (/,/);
$pday=$arr[1];
push @pdate, $pday;
if ($bdate eq $pdate){
print "$bdate,$pday\n";
}
}
One way (of many) would be to iterate once through each array, building a hash as follows;
for (@array1, @array2) {
$dates{$_}++;
}
Then you can print the keys that correspond to values of 2 or more;
print $_,"\n" for grep {$dates{$_} > 1} keys %dates;
(untested, written on a machine with no perl)
...and a quick CPAN search turns up List::Compare, with this example;
$lc = List::Compare->new(\@Llist, \@Rlist);
@intersection = $lc->get_intersection;
Here's example from perlfaq4 (simplified a bit):
my (@intersection, %count);
for my $element (@array1, @array2) { $count{$element}++ }
for my $element (keys %count) {
push @intersection, $element if $count{$element} > 1;
}
More idiomatic version:
my (%union, %isect);
for my $e (@array1, @array2) { $union{$e}++ && $isect{$e}++ }
my @intersection = keys %isect;
Both methods assume that each element is unique in a given array.
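If a date can appear more than once within one file, the counting approach would report it even when the other file lacks it. A sketch of one way around that follows; the helper name intersection and the sample dates are assumptions made for illustration.

```perl
use strict;
use warnings;

# Return the values present in both array references, even when an
# array repeats a value. Order follows the second array.
sub intersection {
    my ($left, $right) = @_;
    my %in_left = map { $_ => 1 } @$left;   # membership lookup for the left side
    my %seen;                               # suppress repeats in the result
    return grep { $in_left{$_} && !$seen{$_}++ } @$right;
}

my @array1 = ('09/11/2009', '09/11/2009', '09/12/2009');
my @array2 = ('09/12/2009', '09/13/2009');

my @common = intersection(\@array1, \@array2);
print "@common\n";
```

Here 09/11/2009 occurs twice in the first array only, so it is correctly left out, whereas a plain count over both arrays would have reached 2 for it.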

How can I create multidimensional arrays in Perl?

I am a bit new to Perl, but here is what I want to do:
my @array2d;
while(<FILE>){
push(@array2d[$i], $_);
}
It doesn't compile since @array2d[$i] is not an array but a scalar value.
How should I declare @array2d as an array of array?
Of course, I have no idea of how many rows I have.
To make an array of arrays, or more accurately an array of arrayrefs, try something like this:
my @array = ();
foreach my $i ( 0 .. 10 ) {
foreach my $j ( 0 .. 10 ) {
push @{ $array[$i] }, $j;
}
}
It pushes the value onto a dereferenced arrayref for you. You should be able to access an entry like this:
print $array[3][2];
Change your "push" line to this:
push(@{$array2d[$i]}, $_);
You are basically making $array2d[$i] an array by surrounding it by the @{}... You are then able to push elements onto this array of array references.
Have a look at perlref and perldsc to see how to make nested data structures, like arrays of arrays and hashes of hashes. Very useful stuff when you're doing Perl.
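As a sketch of building an array of arrays when the number of rows is unknown, the example below splits comma-separated lines into rows. The @lines array is an invented stand-in for lines read from a file handle, and splitting on commas is an assumption about the data.

```perl
use strict;
use warnings;

# Stand-in for lines read from a file handle.
my @lines = ("1,2,3\n", "4,5\n", "6\n");

my @array2d;
for my $line (@lines) {
    chomp $line;
    # Each row is an anonymous array reference; no row counter needed.
    push @array2d, [ split /,/, $line ];
}

print scalar(@array2d), " rows\n";
print "$array2d[1][0]\n";
```

Rows may have different lengths, so check the size of each inner array before indexing into it.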
There's really no difference between what you wrote and this:
@{$array2d[$i]} = <FILE>;
I can only assume you're iterating through files.
To avoid keeping track of a counter, you could do this:
...
push @array2d, [ <FILE> ];
...
That says: 1) create a reference to a new anonymous array, 2) store all the lines of FILE in it, 3) push it onto @array2d.
Another simple way is to use a hash table and use the two array indices to make a hash key:
$two_dimensional_array{"$i $j"} = $val;
If you're just trying to store a file in an array you can also do this:
open(FILE, "<somefile.txt") or die "Cannot open somefile.txt: $!";
@array = <FILE>;
close (FILE);