Let's assume I have an array @atom. I am pushing three elements $a, $b, $c (residue name, chain and residue number respectively, fetched from a PDB file) into that array. For instance, $b has values AAAAAAAAA, BBBBBBB, CCCCCCC. How do I empty the array every time $b changes?
The array is populated as follows:
push(@atom,"$a $b $c");
I'm not sure why you are using an array when you're only storing a single value in it. You think you are storing three values, but you are putting those into a single string before storing them in the array.
To store three values in your array, you might use code like this:
my @atom = ($residue_name, $chain, $residue_number);
(Notice that I have also changed your variable names. $a, $b and $c are terrible names for variables and $a and $b are special variables for Perl and should not be used in random code.)
I don't really know what you are doing here, but it seems to me that it might make more sense to store this data in a hash.
my %atom = (
residue_name => $residue_name,
chain => $chain,
residue_number => $residue_number,
);
Of course, that's only a guess as I don't know what you need to do with your data - but an important part of programming is to get your data structures right.
But let's assume for now that you're still using your original array and you want to a) see if the $chain variable has changed its value and b) empty the array at that point. You would need to write code something like this:
my @atom = ($residue_name, $chain, $residue_number);
# Store the current value of $chain
my $original_chain = $chain;
Then, later on, you need to check the value has changed and take appropriate action.
if ($chain ne $original_chain) {
    @atom = ();
}
Of course, this is all just the sketchiest of suggestions. I have no idea how your code is structured.
Assuming $a, $b and $c are read in a loop and pushed into an array:
my $last_b = '';
while (...) {
    # read $a $b $c
    if ($b ne $last_b) {
        @atom = ();   # reset @atom to an empty array
    }
    push @atom, ...
    $last_b = $b;
}
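The loop above can be fleshed out into something runnable. This is only a sketch: the @records list, the %per_chain hash and the record layout are made up for illustration, since the question does not show how the PDB file is actually read.

```perl
use strict;
use warnings;

# Hypothetical records: [residue_name, chain, residue_number].
my @records = (
    [ 'ALA', 'A', 1 ],
    [ 'GLY', 'A', 2 ],
    [ 'SER', 'B', 1 ],
    [ 'THR', 'B', 2 ],
    [ 'VAL', 'C', 1 ],
);

my @atom;          # entries collected for the current chain
my %per_chain;     # chain => number of entries, filled when a chain ends
my $last_chain;    # undef until the first record has been read

for my $record (@records) {
    my ($residue_name, $chain, $residue_number) = @$record;

    if (defined $last_chain && $chain ne $last_chain) {
        $per_chain{$last_chain} = scalar @atom;   # use the finished chain
        @atom = ();                               # then empty the array
    }

    push @atom, "$residue_name $chain $residue_number";
    $last_chain = $chain;
}

# The final chain never triggers the "changed" branch, so handle it here.
$per_chain{$last_chain} = scalar @atom if defined $last_chain;

print "chain $_ has $per_chain{$_} entries\n" for sort keys %per_chain;
```

Whatever needs to be done with a finished chain has to happen before the array is emptied, and once more after the loop for the last chain.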
Let's say I have a large hash and I want to iterate over its contents. The standard idiom would be something like this:
while (($key, $value) = each(%{$hash_ref})) {
    # do something
}
However, if I understand my perl correctly this is actually doing two things. First the
%{$hash_ref}
is translating the ref into list context. Thus returning something like
(key1, value1, key2, value2, key3, value3 etc)
which will be stored in my stack's memory. Then the each function will run, eating the first two values in memory (key1 and value1) and returning them to my while loop to process.
If my understanding of this is right, that means that I have effectively copied my entire hash into my stack's memory only to iterate over the new copy, which could be expensive for a large hash, both because the data is iterated over twice and because of potential cache misses if both copies can't be held in memory at once. It seems pretty inefficient. I'm wondering if this is what really happens, or if I'm either misunderstanding the actual behavior or the compiler optimizes away the inefficiency for me?
Follow up questions, assuming I am correct about the standard behavior.
Is there a syntax to avoid copying the hash by iterating over its values in the original hash? If not for a hash, is there one for the simpler array?
Does this mean that in the above example I could get inconsistent values between the copy of my hash and my actual hash if I modify the $hash_ref content within my loop, resulting in $value having a different value than $hash_ref->{$key}?
No, the syntax you quote does not create a copy.
This expression:
%{$hash_ref}
is exactly equivalent to:
%$hash_ref
and assuming the $hash_ref scalar variable does indeed contain a reference to a hash, then adding the % on the front is simply 'dereferencing' the reference - i.e. it resolves to a value that represents the underlying hash (the thing that $hash_ref was pointing to).
If you look at the documentation for the each function, you'll see that it expects a hash as an argument. Putting the % on the front is how you provide a hash when what you have is a hashref.
If you wrote your own subroutine and passed a hash to it like this:
my_sub(%$hash_ref);
then on some level you could say that the hash had been 'copied', since inside the subroutine the special @_ array would contain a list of all the key/value pairs from the hash. However even in that case, the elements of @_ that hold the values are actually aliases for the values in the hash (only the keys are copies). You'd only actually get a full copy if you did something like: my @args = @_.
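A small sketch of that difference between aliasing and copying. The subroutine names shout_values and shout_copy and the sample hashes are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: modifies the value slots of @_ in place.
sub shout_values {
    for my $i (0 .. $#_) {
        $_[$i] = uc $_[$i] if $i % 2;   # odd positions hold the values
    }
    return;
}

# Hypothetical helper: copies @_ first, so the caller's hash is untouched.
sub shout_copy {
    my @args = @_;
    $_ = uc($_) for @args;
    return @args;
}

my %h = (abc => 'def', ghi => 'jkl');
shout_values(%h);                 # values in %h are changed through the aliases

my %k = (abc => 'def');
my @result = shout_copy(%k);      # %k keeps its original value

print "h: $h{abc} $h{ghi}\n";
print "k: $k{abc}, result: @result\n";
```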
Perl's builtin each function is declared with the prototype '+' which effectively coerces a hash (or array) argument into a reference to the underlying data structure.
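As an aside, from version 5.14 the each function could also take a reference to a hash directly. That was an experimental feature and it was removed again in Perl 5.24, so it only applies to the versions in between. There, instead of:
($key, $value) = each(%{$hash_ref})
you could simply say:
($key, $value) = each($hash_ref)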
No copy is created by each (though you do copy the returned values into $key and $value through assignment). The hash itself is passed to each.
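To illustrate, here is a minimal sketch with made-up data. It iterates over a hash through its reference and updates existing entries during the loop; assigning to keys that already exist is safe while each is running, whereas adding or deleting keys is not.

```perl
use strict;
use warnings;

my $hash_ref = { apple => 1, banana => 2, cherry => 3 };   # made-up data

my $seen = 0;
while (my ($key, $value) = each %$hash_ref) {
    # $value is a copy of the stored value; the hash itself was not copied.
    $hash_ref->{$key} = $value * 10;   # updates the original hash in place
    $seen++;
}

print "$_ => $hash_ref->{$_}\n" for sort keys %$hash_ref;
```

Because $value is a copy made by the assignment, it can differ from $hash_ref->{$key} after the hash entry is modified inside the loop body.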
each is a little special. It supports the following syntaxes:
each HASH
each ARRAY
As you can see, it doesn't accept an arbitrary expression. (That would be each EXPR or each LIST). The reason for that is to allow each(%foo) to pass the hash %foo itself to each rather than evaluating it in list context. each can do that because it's an operator, and operators can have their own parsing rules. However, you can do something similar with the \% prototype.
use Data::Dumper;
sub f { print(Dumper(@_)); }
sub g(\%) { print(Dumper(@_)); } # Similar to each
my %h = (a=>1, b=>2);
f(%h); # Evaluates %h in list context.
print("\n");
g(%h); # Passes a reference to %h.
Output:
$VAR1 = 'a'; # 4 args, the keys and values of the hash
$VAR2 = 1;
$VAR3 = 'b';
$VAR4 = 2;
$VAR1 = { # 1 arg, a reference to the hash
'a' => 1,
'b' => 2
};
%{$h_ref} is the same as %h, so all of the above applies to %{$h_ref} too.
Note that the hash isn't copied even if it is flattened. The keys are "copied", but the values are returned directly.
use Data::Dumper;
my %h = (abc=>"def", ghi=>"jkl");
print(Dumper(\%h));
$_ = uc($_) for %h;
print(Dumper(\%h));
Output:
$VAR1 = {
'abc' => 'def',
'ghi' => 'jkl'
};
$VAR1 = {
'abc' => 'DEF',
'ghi' => 'JKL'
};
You can read more about this here.
I am a first timer with Perl and I have to make changes to a Perl script and I have come across the following:
my %summary ;
for my $id ( keys %trades ) {
    my ( $sym, $isin, $side, $type, $usrOrdrNum, $qty ) = @{$trades{$id}} ;
    $type = "$side $type" ;
    $summary{$sym}{$type} += $qty ;
    $summary{$sym}{'ISIN'} = $isin ;
}
The portion I do not understand is $summary{$sym}{$type} += $qty ;. What is the original author trying to do here?
This piece of code populates the %summary hash with a summary of the data in %trades. Each trade is an array with multiple fields which are unpacked inside the loop. I.e. $sym is the value of the first array field of the current trade, and $qty the last field.
$summary{$sym} accesses the $sym field in the %summary hash. The entry named $type in the $summary{$sym} field is then accessed. If the field does not exist, it is created. If $summary{$sym} does not hold a hashref, one is created there, so everything Just Works. (technical term: autovivification)
$var += $x adds $x to $var, so $summary{$sym}{$type} holds a sum of all $qty values with the same $sym and $type after the loop finishes.
The $summary{$sym}{ISIN} field will hold the $isin value of the last trade with name $sym (I suspect they are the same for all such trades).
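To see the effect, here is the same loop run against a few made-up trades. The ids, symbols, ISIN strings and quantities are invented for the example.

```perl
use strict;
use warnings;

# Hypothetical trades: id => [sym, isin, side, type, usrOrdrNum, qty].
my %trades = (
    1 => [ 'ABC', 'XX0000000001', 'BUY',  'LIMIT',  'u1', 100 ],
    2 => [ 'ABC', 'XX0000000001', 'BUY',  'LIMIT',  'u2',  50 ],
    3 => [ 'ABC', 'XX0000000001', 'SELL', 'LIMIT',  'u3',  30 ],
    4 => [ 'XYZ', 'XX0000000002', 'BUY',  'MARKET', 'u4',  10 ],
);

my %summary;
for my $id (keys %trades) {
    my ($sym, $isin, $side, $type, $usrOrdrNum, $qty) = @{ $trades{$id} };
    $type = "$side $type";
    $summary{$sym}{$type} += $qty;    # inner hash and entry are autovivified
    $summary{$sym}{ISIN}   = $isin;
}

for my $sym (sort keys %summary) {
    for my $field (sort keys %{ $summary{$sym} }) {
        print "$sym / $field = $summary{$sym}{$field}\n";
    }
}
```

The two BUY LIMIT trades for ABC end up as a single entry holding their combined quantity.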
Perl has three different built-in data types:
Scalars (as in $foo).
Arrays (as in @foo).
Hashes (as in %foo).
The problem is that each of these deals with single bits of data. Sure, there can be lots of items in arrays and hashes, but they are lots of single bits of data.
Let's say I want to keep track of people. People have a first name, last name, phone, etc. Let's define a person:
my %person;
$person{FIRST_NAME} = "Bob";
$person{LAST_NAME} = "Smith";
$person{PHONE_NUMBER} = "555-1234";
Okay, now I need to store another person. Do I create another hash? What if I could have, say an array of hashes with each hash representing a single person?
Perl allows you to do this by making a reference to the hash:
my @list;
push @list, \%person;
The \%person is my reference to the memory location that contains my hash. $list[0] points to that memory location and allows me to access that person through dereferencing.
Now, my array contains my person. I can create a second one, using a new hash so that Bob's data is not overwritten (pushing \%person a second time would just store another reference to the same hash):
my %person2;
$person2{FIRST_NAME} = "Susan";
$person2{LAST_NAME} = "Brown";
$person2{PHONE_NUMBER} = "555-9876";
push @list, \%person2;
Okay, how do I reference my person. In Perl, you dereference by putting the correct sigil in front of your reference. For example:
my $person_ref = $list[0];        # Reference to Bob's hash
my %person = %{ $person_ref };    # Dereference to Bob's hash. %person is now (a copy of) Bob.
Several things, I'm doing a lot of moving data from one variable to another, and I am not really using those variables. Let's eliminate the variables, or at least their names:
my @list;
push @list, {}; # Anonymous hash in my list
$list[0] still points to a reference to a hash, but I never had to give that hash a name. Now, how do I put Bob's information into it?
If $list[0] is a reference to a hash, I could dereference it by putting %{...} around it!
%person = %{ $list[0] }; #Person is an empty hash, but you get the idea
Let's fill up that hash!
${ $list[0] }{FIRST_NAME} = "Bob";
${ $list[0] }{LAST_NAME} = "Smith";
${ $list[0] }{PHONE_NUMBER} = "555-1234";
That's easy to read...
Fortunately, Perl provides a bit of syntactic sweetener. This is the same:
$list[0]->{FIRST_NAME} = "Bob";
$list[0]->{LAST_NAME} = "Smith";
$list[0]->{PHONE_NUMBER} = "555-1234";
The -> operator points to the dereference you're doing.
Also, in certain circumstances, I don't need the {...} curly braces. Think of it like math operations where there's an order of precedence:
(3 x 4) + (5 x 8)
is the same as:
3 x 4 + 5 x 8
In one, I specify the order of operations; in the other, I don't:
The original adding names into a hash reference stored in a list:
${ $list[0] }{FIRST_NAME} = "Bob";
${ $list[0] }{LAST_NAME} = "Smith";
${ $list[0] }{PHONE_NUMBER} = "555-1234";
Can be rewritten as:
$list[0]{FIRST_NAME} = "Bob";
$list[0]{LAST_NAME} = "Smith";
$list[0]{PHONE_NUMBER} = "555-1234";
(And I didn't have to do push @list, {}; first. I just wanted to emphasize that this was a reference to a hash.)
Thus:
$trades{$id}
Is a reference to an array of data.
Think of it as this way:
my @list = qw(a bunch of data);
$trades{$id} = \@list;
And to dereference that reference to a list, I do this:
@{ $trades{$id} }
See Mark's Short Tutorial About References.
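Putting those pieces together for the array case, a short sketch with invented ids and field values shows the equivalent ways of getting at the data behind $trades{$id}:

```perl
use strict;
use warnings;

my %trades;
my @fields = qw(ABC XX0000000001 BUY LIMIT u1 100);          # made-up values
$trades{42} = \@fields;                                      # reference to a named array
$trades{43} = [ qw(XYZ XX0000000002 SELL MARKET u2 25) ];    # anonymous array

my @copy  = @{ $trades{42} };      # the whole array, copied
my $qty_a = ${ $trades{42} }[5];   # explicit dereference of one element
my $qty_b = $trades{42}->[5];      # arrow form
my $qty_c = $trades{42}[5];        # arrow may be omitted between subscripts

print scalar(@copy), " fields, qty $qty_a $qty_b $qty_c\n";
print "other trade qty $trades{43}[5]\n";
```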
$summary{$sym}{$type} is a scalar inside a hashref inside a hash.
+= is an operator that takes the left hand side, adds the right hand side to it, then assigns the result back to the left hand side.
$qty is the value to add to the previously stored value.
$summary{$sym}{$type} += $qty ; #is the same as
#$summary{$sym}{$type} = $summary{$sym}{$type} + $qty;
#This line calculates total of the values from the hash %trades ($trades{$id}[5];).
A good way to inspect data structures in Perl, if you are a newbie, is to use the Perl debugger.
You can run the script as:
perl -d <scriptname>
And then within the debugger (you will see a prompt like this)
DB<1>
type the following to run to the line you want to inspect:
DB<1> c <linenumber>
Then you can use x to examine variables, for example:
DB<2> x %trades
DB<3> x $trades{$id}
DB<4> print Dumper \%trades
This way you can actually see what's inside the hash, or even a hash of hashes. (The Dumper command only works if the script has loaded Data::Dumper.)
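The same inspection can be done without the debugger by dumping the structure from the script itself. A minimal sketch, with a single made-up trade:

```perl
use strict;
use warnings;
use Data::Dumper;

$Data::Dumper::Sortkeys = 1;   # stable key order in the output
$Data::Dumper::Indent   = 1;   # compact indentation

# Made-up data in the same shape as %trades.
my %trades = ( 7 => [ 'ABC', 'XX0000000001', 'BUY', 'LIMIT', 'u1', 100 ] );

my $dump = Dumper(\%trades);   # pass a reference to keep the structure in one piece
print $dump;

# ref() reports what kind of reference a value holds.
print "type of \$trades{7}: ", ref($trades{7}), "\n";
```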
It computes the sum of the values in the last field (qty) for each combination of sym, side and type.
If the hash was a SQL table instead (and why not - something like DBD::CSV may come in handy here) with fields id, sym, isin, side, type, usrOrdrNum, qty, the code would translate to something like
SELECT sym, CONCAT(side,' ',type) AS type, SUM(qty), isin
FROM trades
GROUP BY sym, CONCAT(side,' ',type);
So, here's the deal. I have an array, let's call it
@array = ('string1','string2','string3','string4');
etc., etc. I have no way of knowing how large the array is or what the contents are, specifically, except that it is an array of strings.
I also have a variable which needs to be changed depending on the size and contents of the array.
Here's a sample easy assignment of that variable, along with the array that would have generated the assignment:
@array = ('string1','string2','string3');
$var = Some::Obj1(Some::Obj2('string1'),
Some::Obj2('string2'),
Some::Obj2('string3'));
Then, if for instance, I had the following @array,
@array = ('string1','string2','string3','string4','string5');
My assignment would need to look like this:
$var = Some::Obj1(Some::Obj2('string1'),
Some::Obj2('string2'),
Some::Obj2('string3'),
Some::Obj2('string4'),
Some::Obj2('string5'));
Can you guys think of any way that something like this could be implemented?
Well, if all you need is to turn some strings into a list of objects inside an object... Why not map?
my @array = ('string1','string2','string3','string4','string5');
my $var = Some::Obj1(map { Some::Obj2($_) } @array);
Yes, you just do
$var = Some::Obj1(map(Some::Obj2($_), @array));
That produces the exact same result as the code you wrote:
$var = Some::Obj1(Some::Obj2('string1'),
Some::Obj2('string2'),
Some::Obj2('string3'),
Some::Obj2('string4'),
Some::Obj2('string5'));
Of course, it goes without saying that you should use either my or our before the variable as appropriate if you are initializing it for the first time. If you wish to perform more complicated operations using map, an entire block of code can be enclosed in braces and the comma omitted, i.e.,
map {operation 1; operation 2; ...; final operation stored as result;} @array
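Both forms of map can be tried out with stand-in functions. In this sketch make_obj1 and make_obj2 are hypothetical replacements for Some::Obj1 and Some::Obj2, whose real behaviour the question does not show.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for Some::Obj1 and Some::Obj2 from the question.
sub make_obj2 { my ($s) = @_;     return { name => $s }; }
sub make_obj1 { my (@items) = @_; return { items => [@items] }; }

my @array = ('string1', 'string2', 'string3');

# Expression form: note the comma after the expression.
my $var = make_obj1(map(make_obj2($_), @array));

# Block form: several statements, the last one is the result; no comma.
my $var2 = make_obj1(map { my $s = uc $_; make_obj2($s) } @array);

print scalar(@{ $var->{items} }), " items, first is $var->{items}[0]{name}\n";
print "block form, last is $var2->{items}[2]{name}\n";
```

However long @array is, map produces one inner object per element, so the call never needs to be written out by hand.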
Okay, not sure where to ask this, but I'm a beginner programmer, using Perl. I need to create an array of arrays, but I'm not sure if it would be better to use array/hash references, or an array of hashes, or a hash of arrays, etc.
I need an array of matches: @totalmatches
Each match contains 6 elements (strings):
@matches = ($chapternumber, $sentencenumber, $sentence, $grammar_relation, $argument1, $argument2);
I need to push each of these elements into the @matches array/hash/reference, and then push that array/hash/reference into the @totalmatches array.
The matches are found based on searching a file and selecting the strings based on meeting the criteria.
QUESTIONS
Which data structure would you use?
Can you push an array into another array, as you would push an element into an array? Is this an efficient method?
Can you push all 6 elements simultaneously, or have to do 6 separate pushes?
When working with 2-D, to loop through would you use:
foreach (@totalmatches) {
    foreach (@matches) {
        ...
    }
}
Thanks for any advice.
Which data structure would you use?
An array for an ordered set of things. A hash for a set of named things.
Can you push an array into another array, as you would push an element into an array? Is this an efficient method?
If you try to push an array (1) into an array (2), you'll end up pushing all the elements of 1 into 2. That is why you would push an array ref in instead.
Can you push all 6 elements simultaneously, or have to do 6 separate pushes?
Look at perldoc -f push
push ARRAY,LIST
You can push a list of things in.
When working with 2-D, to loop through would you use:
Nested foreach is fine, but that syntax wouldn't work. You have to access the values you are dealing with.
for my $arrayref (@outer) {
    for my $item (@$arrayref) {
        $item ...
    }
}
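As a rough end-to-end sketch of the points above, with invented match data: each match is pushed as one anonymous array reference, so a single push is enough per match, and the outer loop receives one reference at a time.

```perl
use strict;
use warnings;

my @totalmatches;

# Made-up matches: chapter, sentence number, sentence, relation, arg1, arg2.
push @totalmatches, [ 1, 3, 'The cat sat.', 'nsubj', 'sat',  'cat'  ];
push @totalmatches, [ 2, 7, 'Dogs bark.',   'nsubj', 'bark', 'Dogs' ];

my @lines;
for my $match (@totalmatches) {           # $match is an array reference
    push @lines, join ' | ', @$match;     # @$match is the inner array
}
print "$_\n" for @lines;

# Pushing the arrays themselves would flatten them into one long list.
my @flat;
push @flat, @{ $totalmatches[0] }, @{ $totalmatches[1] };
print scalar(@flat), " elements when flattened\n";
```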
Do not push one array into another array.
Lists just join with each other into a new list.
Use list of references.
# create an anonymous hash ref for each match
$one_match_ref = {
    chapternumber    => $chapternumber_value,
    sentencenumber   => $sentencenumber_value,
    sentence         => $sentence_value,
    grammar_relation => $grammar_relation_value,
    arg1             => $argument1,
    arg2             => $argument2
};
# add the reference of match into array.
push @all_matches, $one_match_ref;
# list of keys of interest
@keys = qw(chapternumber sentencenumber sentence grammar_relation arg1 arg2);
# walk through all the matches.
foreach $ref (@all_matches) {
    foreach $key (@keys) {
        $val = $$ref{$key};
    }
    # or pick up some specific keys
    my $arg1 = $$ref{arg1};
}
Which data structure would you use?
An array... I can't really justify that choice, but I can't imagine what you would use as keys if you used a hash.
Can you push an array into another array, as you would push an element into an array? Is this an efficient method?
Here's the thing; in Perl, arrays can only contain scalar values - the ones which start with $. Something like...
@matrix = ();
@row = ();
$matrix[0] = @row; # FAIL! (stores the number of elements in @row)
... won't work. You will have to instead use a reference to the array:
@matrix = ();
@row = ();
$matrix[0] = \@row;
Or equally:
push(@matrix, \@row);
Can you push all 6 elements simultaneously, or have to do 6 separate pushes?
If you use references, you need only push once... and since you don't want to concatenate arrays (you need an array of arrays) you're stuck with no alternatives ;)
When working with 2-D, to loop through would you use:
I'd use something like:
for ($i = 0; $i < @matrix; $i++) {
    @row = @{ $matrix[$i] }; # de-reference
    for ($j = 0; $j < @row; $j++) {
        print "| $row[$j] ";
    }
    print "|\n";
}
Which data structure would you use?
Some fundamental container properties:
An array is a container for ordered scalars.
A hash is a container for scalars obtained by a unique key (there can be no duplicate keys in the hash). The order of values added later is not available anymore.
I would use the same structure as ZhangChn proposed.
Use a hash for each match.
The details of the match then can be accessed by descriptive names instead of plain numerical indices. i.e. $ref->{'chapternumber'} instead of $matches[0].
Take references of these anonymous hashes (which are scalars) and push them into an array in order to preserve the order of the matches.
To dereference items from the data structure
get an item from the array which is a hash reference
retrieve any matching detail you need from the hash reference
This question has been asked about PHP both here and here, and I have the same question for Perl. Given a function that returns a list, is there any way (or what is the best way) to immediately index into it without using a temporary variable?
For example:
my $comma_separated = "a,b,c";
my $a = split (/,/, $comma_separated)[0]; #not valid syntax
I see why the syntax in the second line is invalid, so I'm wondering if there's a way to get the same effect without first assigning the return value to a list and indexing from that.
Just use parentheses to define your list and then index it to pull your desired element(s):
my $a = (split /,/, $comma_separated)[0];
Just like you can do this:
($a, $b, $c) = @array;
You can do this:
my($a) = split /,/, $comma_separated;
my $a on the LHS (left hand side) is treated as scalar context. my($a) is list context. It's a single-element list, so it gets just the first element returned from split.
It has the added benefit of auto-limiting the split, so there's no wasted work if $comma_separated is large.
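The two approaches side by side, as a small sketch. The variable names are chosen for the example and deliberately avoid $a and $b, which sort uses.

```perl
use strict;
use warnings;

my $comma_separated = "a,b,c";

my $first       = (split /,/, $comma_separated)[0];      # list slice, one element
my ($also)      = split /,/, $comma_separated;           # list assignment, first element
my $last        = (split /,/, $comma_separated)[-1];     # negative index counts from the end
my ($one, $two) = (split /,/, $comma_separated)[0, 2];   # several elements at once

print "$first $also $last $one $two\n";
```

The parentheses around the call are what turn it into a list that can be sliced; the subscript can hold a single index or a list of indices.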