I have an array like this:
my #array = qw( zero one two three four five six seven eigth nine);
How to output a subarray consisting of strings of length 4 from #array. For example, if the string is equal to 4, the new array will be output as #subarray = ( zero four five nine )
The built-in function grep serves as the "filter" operation in Perl, capable of filtering a list based on a regular expression or an arbitrary block.
If given a block, grep will call the block for each element of the list, setting the implicit variable $_ to the current value. It will keep the values which return truthy. So your filter would look like
my #subarray = grep { length == 4 } #array;
length, like a lot of built-in Perl functions, can take an argument (length($a), etc.), but when called without one it automatically takes the implicit variable $_ as an argument.
You can also pass it a regular expression. This is mainly useful if you're worried your coworkers like you too much and want to make some enemies.
my #subarray = grep(/^.{4}$/, #array);
Related
I want to split a scalar by whitespaces and save the result in an ArrayReference.
use strict;
use warnings;
use Data::Dumper;
my $name = 'hans georg mustermann';
my $array = split ' ', $name;
print Dumper($array); #$VAR1 = 3;
So it seems $array is now a scalar with the size resulted by the split operation.
When i change the code to my $array = [split ' ', $name]; the variable $array is now a ArrayReference and contains all 3 strings.
I just don't understand this behavior. Would be really great if someone could explain it to me or post a good documentation about these things, as i don't know how to search for this topic.
Thank you in advance
What you see here is called "context". The documentation about this is rather scattered. You also want to take a look at this tutorial about "scalar vs list context" https://perlmaven.com/scalar-and-list-context-in-perl
If you assign the result of split (or any subroutine calls) to an array, it's list context:
my #arr = split ' ', $name;
#=> #arr = ('hans', 'georg', 'mustermann');
What your example code shows is assigning them to a scalar -- and therefore it's under "scalar context".
Since, naturally, multiple things cannot fit into one position, some sort of summarization needs to be done. In the case of split function, perl5 has defined that the number of elements in the result of split shall be the best.
Check the documentation of the split function: https://perldoc.pl/functions/split -- which actually defines the behaviour under scalar context as well as list context.
Also take a glance at the documentation of all builtin functions at https://perldoc.pl/functions -- you'll find the behaviour definition under "list context" and "scalar context" for most of them -- although many of them are not returning "the size of lists" but rather something else.
That's called context.
The partial expression split ' ', $name evaluates to a list. The partial expression $array = LIST coerces the list to a scalar value, namely counting the number of elements in the list. That's the default behaviour of lists in scalar context.
You should write #array = LIST instead, using an array variable, not a scalar variable, in order to preserve the list values.
If you read the documentation for split(), you'll find the bit that explains what the function returns.
Splits the string EXPR into a list of strings and returns the list in list context, or the size of the list in scalar context.
You're calling the function in scalar context (because you're assigning the result of the call to a scalar variable) so you're getting the size of the list.
If you want to get the list, then you need to store it either in a list of variables:
my ($forename, $middlename, $surname) = split ' ', $name;
Or (more usually) in an array:
my #name_parts = split ' ', $name;
But actually, you say that you want an array reference. You can do that by calling split() inside an anonymous array constructor ([ ... ]) and assigning the result of that call to a scalar variable.
my $name_parts = [ split ' ', $name ];
In Perl, I know of three ways to test objects for equality: ==, eq, and ~~. All of these tell me that 1 is equal to "1" in Perl 5.
However, 1 and "1" are not the same thing. How can I compare two objects so that 1 equals 1, "1" equals "1", 1 does not equal "1" or 2, and "1" does not equal "01"? Answers for both Perls would be appreciated.
Don't. In Perl, one is one. Polymorphism based on a value's type is bound to fail. That's why Perl has two comparison operators, and that's why ~~ is broken[1].
For example, the scalar returned by !0 contains three values, one stored as an integer, one stored as a floating point number, and one stored as a string.
For example, an object of class package Foo; use overload '0+' => sub { 1 }, fallback => 1; might not contain one at all, but it's considered to be one in numerical and string contexts.
That's why it's still flagged as experimental.
Serializers like Data::Dumper::Dumper and JSON::encode_json can treat scalars internally stored as numbers differently from scalars internally stored as strings, so you can compare serialized output:
use Data::Dumper;
$x = 1;
$y = "1";
$same = Dumper($x) eq Dumper($y);
You can use this snippet as a starting point for your comparison function:
my $num = 42;
my $str = "42";
say B::svref_2object(\$num)->FLAGS;
say B::svref_2object(\$str)->FLAGS;
You will see that the flags of both variables differ. Read perldoc B for more in-depth information.
You may also find the source code of JSON::PP interesting. Search for "sub value_to_json".
This question already has answers here:
How can I verify that a value is present in an array (list) in Perl?
(8 answers)
Closed 9 years ago.
I'm still feeling my way though perl and so there's probably a simple way of doing this but I can find it. I want to compare a single value say A or E to an array that may or may not contain that value, eg A B C D and then perform an action if they match. How should I set this up?
Thanks.
You filter each element of the array to see if it is the element you are looking for and then use the resulting array as a boolean value (not empty = true, empty = false):
#filtered_array = grep { $_ eq 'A' } #array;
if (#filtered_array) {
print "found it!\n";
}
If you store the list in an array then the only way is to examine each element individually in a loop, using grep, or for or any from List::MoreUtils. (grep is the worst of these, as it searches the entire array, even if a match has been found early on.) This is fine if the array is small, but you will hit performance probelms if the array has a significant size and you have to check it frequently.
You can speed things up by representing the same list in a hash, when a check for membership is just a single key lookup.
Alternatively, if the list is enormous, then it is best kept in a database, using SQLite.
Are you stuck on arrays?
Whenever in Perl you're talk about quickly looking up data, you should think in terms of hashes. A hash is a collection of data like an array, but it is keyed, and looking up the key is a very fast operation in Perl.
There's nothing that says the keys to your hash can't be your data, and it is very common in Perl to index an array with a hash in order to quickly search for values.
This turns your array #array into a hash called %arrays_hash.
use strict;
use warnings;
use feature qw(say);
use autodie;
my #array = qw(Alpha Beta Delta Gamma Ohm);
my %array_index;
for my $entry ( #array ) {
$array_index{$entry} = 1; # Doesn't matter. As long as it isn't blank or zero
}
Now, looking up whether or not your data is in your array is very quick. Just simply see if it's a key in your %array_index:
my $item = "Delta"; # Is this in my initial array?
if ( $array_index{$item} ) {
say "Yes! Item '$item' is in my array.";
}
else {
say "No. Item '$item' isn't in my array. David sad.";
}
This is so common, that you'll see a lot of programs that use the map command to index the array. Instead of that for loop, I could have done this:
my %array_index = ( map { $_ => 1 } #array );
or
my %array_index;
map { $array_index{$_} = 1 } #array;
You'll see both. The first one is a one liner. The map command takes each entry in the array, and puts it in $_. Then, it returns the results into an array. Thus, the map will return an array with your data in the even positions (0, 2, 4 8...) and a 1 in the odd positions (1, 3, 5...).
The second one is more literal and easier to understand (or about as easy to understand in a map command). Again, each item in your #array is being assigned to $_, and that is being used as the key in my %array_index hash.
Whether or not you want to use hashes depend upon the length of your array, and how many items of input you'll be searching for. If you're simply searching whether a single item is in your array, I'd probably use List::Utils or List::MoreUtils, or use a for loop to search each value of my array. If I am doing this for multiple values, I am better off with a hash.
What are the differences between #variable and $variable in Perl?
I have read code with the symbol $ and the symbol # before a variable name.
For example:
$info = "Caine:Michael:Actor:14, Leafy Drive";
#personal = split(/:/, $info);
What are the difference between a variable containing $ as opposed to #?
It isn't really about the variable, but more about the context how the variable is used. If you put a $ in front of the variable name, then it is used in scalar context, if you have a # that means you use the variable in list context.
my #arr; defines variable arr as array
when you want to access one individual element (that is a scalar context), you have to use $arr[0]
You can find more about Perl contexts here: http://www.perlmonks.org/?node_id=738558
All your knowledge about Perl will be crashed with mountains, when you don't feel context of this language.
As many people, you use in your speech single value (scalars) and many things in a set.
So, the difference between all of them:
i have a cat. $myCatName = 'Snowball';
it jump on bed where sit #allFriends = qw(Fred John David);
And you can count them $count = #allFriends;
but can't count them at all cause list of names not countable: $nameNotCount = (Fred John David);
So, after all:
print $myCatName = 'Snowball'; # scalar
print #allFriends = qw(Fred John David); # array! (countable)
print $count = #allFriends; # count of elements (cause array)
print $nameNotCount = qw(Fred John David); # last element of list (uncountable)
So, list is not the same, as an array.
Interesting feature is slices where your mind will play a trick with you:
this code is a magic:
my #allFriends = qw(Fred John David);
$anotherFriendComeToParty =qq(Chris);
$allFriends[#allFriends] = $anotherFriendComeToParty; # normal, add to the end of my friends
say #allFriends;
#allFriends[#allFriends] = $anotherFriendComeToParty; # WHAT?! WAIT?! WHAT HAPPEN?
say #allFriends;
so, after all things:
Perl have an interesting feature about context. your $ and # are sigils, that help Perl know, what you want, not what you really mean.
$ like s, so scalar
# like a, so array
Variables that start $ are scalars, a single value.
$name = "david";
Variables that start # are arrays:
#names = ("dracula", "frankenstein", "dave");
If you refer to a single value from the array, you use the $
print "$names[1]"; // will print frankenstein
From perldoc perlfaq7
What are all these $#%&* punctuation signs, and how do I know when to use them?
They are type specifiers, as detailed in
perldata:
$ for scalar values (number, string or reference)
# for arrays
% for hashes (associative arrays)
& for subroutines (aka functions, procedures, methods)
* for all types of that symbol name. In version 4 you used them like
pointers, but in modern perls you can just use references.
$ is for scalar variables(in your case a string variable.)
# is for arrays.
split function will split the variable passed to it acoording to the delimiter mentioned(:) and put the strings in the array.
Variable name starts with $ symbol called scalar variable.
Variable name starts with # symbol called array.
$var -> can hold single value.
#var -> can hold bunch of values ie., it contains list of scalar values.
Please explain this apparently inconsistent behaviour:
$a = b, c;
print $a; # this prints: b
$a = (b, c);
print $a; # this prints: c
The = operator has higher precedence than ,.
And the comma operator throws away its left argument and returns the right one.
Note that the comma operator behaves differently depending on context. From perldoc perlop:
Binary "," is the comma operator. In
scalar context it evaluates its left
argument, throws that value away, then
evaluates its right argument and
returns that value. This is just like
C's comma operator.
In list context, it's just the list
argument separator, and inserts both
its arguments into the list. These
arguments are also evaluated from left
to right.
As eugene's answer seems to leave some questions by OP i try to explain based on that:
$a = "b", "c";
print $a;
Here the left argument is $a = "b" because = has a higher precedence than , it will be evaluated first. After that $a contains "b".
The right argument is "c" and will be returned as i show soon.
At that point when you print $a it is obviously printing b to your screen.
$a = ("b", "c");
print $a;
Here the term ("b","c") will be evaluated first because of the higher precedence of parentheses. It returns "c" and this will be assigned to $a.
So here you print "c".
$var = ($a = "b","c");
print $var;
print $a;
Here $a contains "b" and $var contains "c".
Once you get the precedence rules this is perfectly consistent
Since eugene and mugen have answered this question nicely with good examples already, I am going to setup some concepts then ask some conceptual questions of the OP to see if it helps to illuminate some Perl concepts.
The first concept is what the sigils $ and # mean (we wont descuss % here). # means multiple items (said "these things"). $ means one item (said "this thing"). To get first element of an array #a you can do $first = $a[0], get the last element: $last = $a[-1]. N.B. not #a[0] or #a[-1]. You can slice by doing #shorter = #longer[1,2].
The second concept is the difference between void, scalar and list context. Perl has the concept of the context in which your containers (scalars, arrays etc.) are used. An easy way to see this is that if you store a list (we will get to this) as an array #array = ("cow", "sheep", "llama") then we store the array as a scalar $size = #array we get the length of the array. We can also force this behavior by using the scalar operator such as print scalar #array. I will say it one more time for clarity: An array (not a list) in scalar context will return, not an element (as a list does) but rather the length of the array.
Remember from before you use the $ sigil when you only expect one item, i.e. $first = $a[0]. In this way you know you are in scalar context. Now when you call $length = #array you can see clearly that you are calling the array in scalar context, and thus you trigger the special property of an array in list context, you get its length.
This has another nice feature for testing if there are element in the array. print '#array contains items' if #array; print '#array is empty' unless #array. The if/unless tests force scalar context on the array, thus the if sees the length of the array not elements of it. Since all numerical values are 'truthy' except zero, if the array has non-zero length, the statement if #array evaluates to true and you get the print statement.
Void context means that the return value of some operation is ignored. A useful operation in void context could be something like incrementing. $n = 1; $n++; print $n; In this example $n++ (increment after returning) was in void context in that its return value "1" wasn't used (stored, printed etc).
The third concept is the difference between a list and an array. A list is an ordered set of values, an array is a container that holds an ordered set of values. You can see the difference for example in the gymnastics one must do to get particular element after using sort without storing the result first (try pop sort { $a cmp $b } #array for example, which doesn't work because pop does not act on a list, only an array).
Now we can ask, when you attempt your examples, what would you want Perl to do in these cases? As others have said, this depends on precedence.
In your first example, since the = operator has higher precedence than the ,, you haven't actually assigned a list to the variable, you have done something more like ($a = "b"), ("c") which effectively does nothing with the string "c". In fact it was called in void context. With warnings enabled, since this operation does not accomplish anything, Perl attempts to warn you that you probably didn't mean to do that with the message: Useless use of a constant in void context.
Now, what would you want Perl to do when you attempt to store a list to a scalar (or use a list in a scalar context)? It will not store the length of the list, this is only a behavior of an array. Therefore it must store one of the values in the list. While I know it is not canonically true, this example is very close to what happens.
my #animals = ("cow", "sheep", "llama");
my $return;
foreach my $animal (#animals) {
$return = $animal;
}
print $return;
And therefore you get the last element of the list (the canonical difference is that the preceding values were never stored then overwritten, however the logic is similar).
There are ways to store a something that looks like a list in a scalar, but this involves references. Read more about that in perldoc perlreftut.
Hopefully this makes things a little more clear. Finally I will say, until you get the hang of Perl's precedence rules, it never hurts to put in explicit parentheses for lists and function's arguments.
There is an easy way to see how Perl handles both of the examples, just run them through with:
perl -MO=Deparse,-p -e'...'
As you can see, the difference is because the order of operations is slightly different than you might suspect.
perl -MO=Deparse,-p -e'$a = a, b;print $a'
(($a = 'a'), '???');
print($a);
perl -MO=Deparse,-p -e'$a = (a, b);print $a'
($a = ('???', 'b'));
print($a);
Note: you see '???', because the original value got optimized away.