I have a certain array from which I want to remove the last three element (constant number, because I want a list of subdirectory which are in a directory where there is 3 files).
Is there a better and nicer solution than this one ?
my #list = (`ls --group-directories-first`);
pop #list;
pop #list;
pop #list;
You can use splice:
splice #list, -3;
Where -3 denotes an offset of 3 from the end. This will remove elements from that offset and onward.
It's worth noting that parsing output from ls is a horrible idea. You can do the exact same thing with Perl code:
my #list = grep -d, glob "*";
The -d is a file check, which checks if the current argument is a directory.
splice is also the solution I'd advise.
However, since you're working with the end of an array and just wanting to destroy those trailing elements instead of saving them to another structure, you could also just edit the last array index:
$#list -= 3;
The above is reduces the size of the #list array by 3 elements. Here's another example:
my #a = (1..10);
$#a -= 4;
print "#a\n";
# Prints: 1 2 3 4 5 6
Related
I seem to have come across several different ways to find the size of an array. What is the difference between these three methods?
my #arr = (2);
print scalar #arr; # First way to print array size
print $#arr; # Second way to print array size
my $arrSize = #arr;
print $arrSize; # Third way to print array size
The first and third ways are the same: they evaluate an array in scalar context. I would consider this to be the standard way to get an array's size.
The second way actually returns the last index of the array, which is not (usually) the same as the array size.
First, the second ($#array) is not equivalent to the other two. $#array returns the last index of the array, which is one less than the size of the array.
The other two (scalar #arr and $arrSize = #arr) are virtually the same. You are simply using two different means to create scalar context. It comes down to a question of readability.
I personally prefer the following:
say 0+#array; # Represent #array as a number
I find it clearer than
say scalar(#array); # Represent #array as a scalar
and
my $size = #array;
say $size;
The latter looks quite clear alone like this, but I find that the extra line takes away from clarity when part of other code. It's useful for teaching what #array does in scalar context, and maybe if you want to use $size more than once.
This gets the size by forcing the array into a scalar context, in which it is evaluated as its size:
print scalar #arr;
This is another way of forcing the array into a scalar context, since it's being assigned to a scalar variable:
my $arrSize = #arr;
This gets the index of the last element in the array, so it's actually the size minus 1 (assuming indexes start at 0, which is adjustable in Perl although doing so is usually a bad idea):
print $#arr;
This last one isn't really good to use for getting the array size. It would be useful if you just want to get the last element of the array:
my $lastElement = $arr[$#arr];
Also, as you can see here on Stack Overflow, this construct isn't handled correctly by most syntax highlighters...
To use the second way, add 1:
print $#arr + 1; # Second way to print array size
All three give the same result if we modify the second one a bit:
my #arr = (2, 4, 8, 10);
print "First result:\n";
print scalar #arr;
print "\n\nSecond result:\n";
print $#arr + 1; # Shift numeration with +1 as it shows last index that starts with 0.
print "\n\nThird result:\n";
my $arrSize = #arr;
print $arrSize;
Example:
my #a = (undef, undef);
my $size = #a;
warn "Size: " . $#a; # Size: 1. It's not the size
warn "Size: " . $size; # Size: 2
The “Perl variable types” section of the perlintro documentation contains
The special variable $#array tells you the index of the last element of an array:
print $mixed[$#mixed]; # last element, prints 1.23
You might be tempted to use $#array + 1 to tell you how many items there are in an array. Don’t bother. As it happens, using #array where Perl expects to find a scalar value (“in scalar context”) will give you the number of elements in the array:
if (#animals < 5) { ... }
The perldata documentation also covers this in the “Scalar values” section.
If you evaluate an array in scalar context, it returns the length of the array. (Note that this is not true of lists, which return the last value, like the C comma operator, nor of built-in functions, which return whatever they feel like returning.) The following is always true:
scalar(#whatever) == $#whatever + 1;
Some programmers choose to use an explicit conversion so as to leave nothing to doubt:
$element_count = scalar(#whatever);
Earlier in the same section documents how to obtain the index of the last element of an array.
The length of an array is a scalar value. You may find the length of array #days by evaluating $#days, as in csh. However, this isn’t the length of the array; it’s the subscript of the last element, which is a different value since there is ordinarily a 0th element.
From perldoc perldata, which should be safe to quote:
The following is always true:
scalar(#whatever) == $#whatever + 1;
Just so long as you don't $#whatever++ and mysteriously increase the size or your array.
The array indices start with 0.
and
You can truncate an array down to nothing by assigning the null list () to it. The following are equivalent:
#whatever = ();
$#whatever = -1;
Which brings me to what I was looking for which is how to detect the array is empty. I found it if $#empty == -1;
There are various ways to print size of an array. Here are the meanings of all:
Let’s say our array is my #arr = (3,4);
Method 1: scalar
This is the right way to get the size of arrays.
print scalar #arr; # Prints size, here 2
Method 2: Index number
$#arr gives the last index of an array. So if array is of size 10 then its last index would be 9.
print $#arr; # Prints 1, as last index is 1
print $#arr + 1; # Adds 1 to the last index to get the array size
We are adding 1 here, considering the array as 0-indexed. But, if it's not zero-based then, this logic will fail.
perl -le 'local $[ = 4; my #arr = (3, 4); print $#arr + 1;' # prints 6
The above example prints 6, because we have set its initial index to 4. Now the index would be 5 and 6, with elements 3 and 4 respectively.
Method 3:
When an array is used in a scalar context, then it returns the size of the array
my $size = #arr;
print $size; # Prints size, here 2
Actually, method 3 and method 1 are same.
Use int(#array) as it threats the argument as scalar.
To find the size of an array use the scalar keyword:
print scalar #array;
To find out the last index of an array there is $# (Perl default variable). It gives the last index of an array. As an array starts from 0, we get the size of array by adding one to $#:
print "$#array+1";
Example:
my #a = qw(1 3 5);
print scalar #a, "\n";
print $#a+1, "\n";
Output:
3
3
As numerous answers pointed out, the first and third way are the correct methods to get the array size, and the second way is not.
Here I expand on these answers with some usage examples.
#array_name evaluates to the length of the array = the size of the array = the number of elements in the array, when used in a scalar context.
Below are some examples of a scalar context, such as #array_name by itself inside if or unless, of in arithmetic comparisons such as == or !=.
All of these examples will work if you change #array_name to scalar(#array_name). This would make the code more explicit, but also longer and slightly less readable. Therefore, more idiomatic usage omitting scalar() is preferred here.
my #a = (undef, q{}, 0, 1);
# All of these test whether 'array' has four elements:
print q{array has four elements} if #a == 4;
print q{array has four elements} unless #a != 4;
#a == 4 and print q{array has four elements};
!(#a != 4) and print q{array has four elements};
# All of the above print:
# array has four elements
# All of these test whether array is not empty:
print q{array is not empty} if #a;
print q{array is not empty} unless !#a;
#a and print q{array is not empty};
!(!#a) and print q{array is not empty};
# All of the above print:
# array is not empty
Let's consider simple Perl code:
my #x = ( 1, 5, 9);
for my $i ( 0 .. $#x ) {
splice( #x, $i, 1 ) if ( $x[$i] >= 5 );
}
print "#x";
Output is not correct, 1 9 but there must be 1
If we run code with -w flag it prints warning
Use of uninitialized value within #x in numeric ge (>=) at splice.pl line 5.
So, it's not good practise to use conditional splice and better to push result in new variable?
The problem isn't your use of conditional splice per se, it's your loop. The most obvious problem, and the one that causes your warning, is that you're running off of the end of the array. for my $i ( 0 .. $#x ) sets the iteration endpoint to $#x before the loop starts, but after you splice one or more elements out, the last index of the array will be smaller. You could fix that using a C-style for loop, instead of the range-style loop, but I don't recommend it — keep reading.
The next problem is that after you splice an element out of the array, you continue the loop with $i one higher... but because you spliced an element out of the array, the next element that you haven't seen yet is in $x[$i], not $x[$i+1]. You say "Output is correct, 1 9", but shouldn't 9 have been removed, since it's more than 5? You could fix this using redo after splice to go through the loop again without incrementing $i, but I don't recommend that either.
So it is possible to fix your loop which uses splice in place so that it will work correctly, but the result would be pretty complicated. Unless there's a compelling reason to do it differently, I would recommend using simply
#x = grep { $_ < 5 } #x;
There's no problem with assigning the result to the same array as the source, and there is no loop management or other housekeeping for you to do.
I have a 3 dimendional array like:
$a[0][0]=1; $a[0][1]=2; $a[0][2]=1;
$a[1][0]=1; $a[1][1]=3; $a[1][2]=-1;
$a[2][0]=2; $a[2][1]=4; $a[2][2]=1;
I would like to erase the line that has as the first two elements the values 1 and 3 (in this case the whole a[1][0 .. 2] elements. The output I would like to obtain is:
$a[0][0]=1; $a[0][1]=2; $a[0][2]=1;
$a[1][0]=2; $a[1][1]=4; $a[1][2]=1;
I am looking for a general solution for this problem. With one condition I would use the grep function, but I don't know how to do it with 2 conditions..
There's no such thing as a 2D array in Perl. #a is just an array of references. So you're not trying to delete $a[1][0 .. 2], but just $a[1].
Yet you can't really delete from an array. Using splice, you can shift all the elements around,
for my $i (reverse 0 .. $#a) {
splice(#a, $i, 1) if $a[$i][0]==1 && $a[$i][1]==3;
}
But it's usually simpler and more efficient to remove the unwanted elements using grep and assigning the rest back to the array.
#a = grep { !( $_->[0]==1 && $_->[1]==3 ) } #a;
use Data::Dumper;
$a[0][0]=1; $a[0][1]=2; $a[0][2]=1;
$a[1][0]=1; $a[1][1]=3; $a[1][2]=-1;
$a[2][0]=2; $a[2][1]=4; $a[2][2]=1;
print Dumper(\#a);
#b=grep(!($_->[0]==1 && $_->[1]==3),#a);
print Dumper(\#b);
I have the following code:
#!/usr/bin/perl
# splits.pl
use strict;
use warnings;
use diagnostics;
my $pivotfile = "myPath/Internal_Splits_Pivot.txt";
open PIVOTFILE, $pivotfile or die $!;
while (<PIVOTFILE>) { # loop through each line in file
next if ($. == 1); # skip first line (contains business segment code)
next if ($. == 2); # skip second line (contains transaction amount text)
my #fields = split('\t',$_); # split fields for line into an array
print scalar(grep $_, #fields), "\n";
}
Given that the data in the text file is this:
4 G I M N U X
Transaction Amount Transaction Amount Transaction Amount Transaction Amount Transaction Amount Transaction Amount Transaction Amount
0000-13-I21 600
0001-8V-034BLA 2,172 2,172
0001-8V-191GYG 13,125 4,375
0001-9W-GH5B2A -2,967.09 2,967.09 25.00
I would expect the output from the perl script to be: 2 3 3 4 given the amount of defined elements in each line. The file is a tab delimited text file with 8 columns.
Instead I get 3 4 3 4 and I have no idea why!
For background, I am using Counting array elements in Perl as the basis for my development, as I am trying to count the number of elements in the line to know if I need to skip that line or not.
I suspect you have spaces mixed with the tabs in some places, and your grep test will consider " " true.
What does:
use Data::Dumper;
$Data::Dumper::Useqq=1;
print Dumper [<PIVOTFILE>];
show?
The problem should be in this line:
my #fields = split('\t',$_); # split fields for line into an array
The tab character doesn't get interpolated. And your file doesn't seem to be tab-only separated, at least here on SO. I changed the split regex to match arbitrary whitespace, ran the code on my machine and got the "right" result:
my #fields = split(/\s+/,$_); # split fields for line into an array
Result:
2
3
3
4
As a side note:
For background, I am using Counting array elements in Perl as the basis for my development, as I am trying to count the number of elements in the line to know if I need to skip that line or not.
Now I understand why you use grep to count array elements. That's important when your array contains undefined values like here:
my #a;
$a[1] = 42; # #a contains the list (undef, 42)
say scalar #a; # 2
or when you manually deleted entries:
my #a = split /,/ => 'foo,bar'; # #a contains the list ('foo', 'bar')
delete $a[0]; # #a contains the list (undef, 'bar')
say scalar #a; # 2
But in many cases, especially when you're using arrays to just store list without operating on single array elements, scalar #a works perfectly fine.
my #a = (1 .. 17, 1 .. 25); # (1, 2, ..., 17, 1, 2, .., 25)
say scalar #a; # 42
It's important to understand, what grep does! In your case
print scalar(grep $_, #fields), "\n";
grep returns the list of true values of #fields and then you print how many you have. But sometimes this isn't what you want/expect:
my #things = (17, 42, 'foo', '', 0); # even '' and 0 are things
say scalar grep $_ => #things # 3!
Because the empty string and the number 0 are false values in Perl, they won't get counted with that idiom. So if you want to know how long an array is, just use
say scalar #array; # number of array entries
If you want to count true values, use this
say scalar grep $_ => #array; # number of true values
But if you want to count defined values, use this
say scalar grep defined($_) => #array; # number of defined values
I'm pretty sure you already know this from the other answers on the linked page. In hashes, the situation is a little bit more complex because setting something to undef is not the same as deleteing it:
my %h = (a => 0, b => 42, c => 17, d => 666);
$h{c} = undef; # still there, but undefined
delete $h{d}; # BAM! $h{d} is gone!
What happens when we try to count values?
say scalar grep $_ => values %h; # 1
because 42 is the only true value in %h.
say scalar grep defined $_ => values %h; # 2
because 0 is defined although it's false.
say scalar grep exists $h{$_} => qw(a b c d); # 3
because undefined values can exist. Conclusion:
know what you're doing instead of copy'n'pasting code snippets :)
There are not only tabs, but there are spaces as well.
trying out with splitting by space works
Look below
#!/usr/bin/perl
# splits.pl
use strict;
use warnings;
use diagnostics;
while (<DATA>) { # loop through each line in file
next if ($. == 1); # skip first line (contains business segment code)
next if ($. == 2); # skip second line (contains transaction amount text)
my #fields = split(" ",$_); # split fields by SPACE
print scalar(#fields), "\n";
}
__DATA__
4 G I M N U X
Transaction Amount Transaction Amount Transaction Amount Transaction Amount Transaction Amount Transaction Amount Transaction Amount
0000-13-I21 600
0001-8V-034BLA 2,172 2,172
0001-8V-191GYG 13,125 4,375
0001-9W-GH5B2A -2,967.09 2,967.09 25.00
Output
2
3
3
4
Your code works for me. The problem may be that the input file contains some "hidden" whitespace fields (eg. other whitespace than tabs). For instance
A<tab><space><CR> gives two fields, A and <space><CR>
A<tab>B<tab><CR> gives three, A, B, <CR> (remember, the end of line is part of the input!)
I suggest you to chomp every line you use; other than that, you will have to clean the array from whitespace-only fields. Eg.
scalar(grep /\S/, #fields)
should do it.
A lot of great help on this question, and quickly too!
After a long, drawn-out learning process, this is what I came up with that worked quite well, with intended results.
#!/usr/bin/perl
# splits.pl
use strict;
use warnings;
use diagnostics;
my $pivotfile = "myPath/Internal_Splits_Pivot.txt";
open PIVOTFILE, $pivotfile or die $!;
while (<PIVOTFILE>) { # loop through each line in file
next if ($. == 1); # skip first line (contains business segment code)
next if ($. == 2); # skip second line (contains transaction amount text)
chomp $_; # clean line of trailing \n and white space
my #fields = split(/\t/,$_); # split fields for line into an array
print scalar(grep $_, #fields), "\n";
}
I wanted to chop off all but the first five elements of an array, so I stupidly did:
#foo = #foo[ 0 .. 4 ];
and heartily praised my own cleverness. But that broke once #foo ended up with only three elements, because then I ended up with two undefs on the end, instead of a three-element array. So I changed it to:
#foo = #foo > 5 ? #foo[ 0 .. 4 ] : #foo;
This works but is kinda ugly. Is there a better idiom for saying "give me everything up to the first five elements of the array?"
You can set the last index of an array to shorten or lengthen it. Like your code you'll need to check to make sure your not creating undef elements.
$#foo = 4 if $#foo > 4;
If you don't care about mutations (implied by the self-referential lhs #foo = something referencing #foo) use the two-argument splice(), see perldoc -f splice for more info.
splice ARRAY,OFFSET
Removes the elements designated by OFFSET and LENGTH from an array, and replaces them with the elements of LIST, if any. In list context, returns the elements removed from the array. In scalar context, returns the last element removed, or "undef" if no elements are removed. The array grows or shrinks as necessary. If OFFSET is negative then it starts that far from the end of the array. If LENGTH is omitted, removes everything from OFFSET onward. If LENGTH is negative, removes the elements from OFFSET onward except for -LENGTH elements at the end of the array. If both OFFSET and LENGTH are omitted, removes everything. If OFFSET is past the end of the array, perl issues a warning, and splices at the end of the array.
Then watch the effect:
#_ = 1..10;
splice #_, 5;
say for #_;
#_ = 1..3;
splice #_, 5;
say for #_;
If you're using warnings, and I hope you'll have to check for the length (as in Axeman's suggestion) or disable the noisy warning (splice() offset past end of array):
{
no warnings 'misc';
splice #_, 5;
}
Yet another way:
#foo = splice(#foo, 0, 5);
Unlike the other suggestion for splice, this doesn't trigger a warning; the 5 explicitly means "up to 5".
This isn't that graceful, but you can express it like this:
#foo[ 0..( $#foo > 4 ? 4 : $#foo ) ];
A generalized min function might look better.
use List::Util qw<min>;
#foo[ 0..min( $#foo, 4 ) ];
But if you just want to get rid of everything else then you just need to splice off the rest:
splice( #foo, 5 ) if 5 < #foo;