Perl get array count so can start foreach loop at a certain array element - perl

I have a file that I am reading in. I'm using perl to reformat the date. It is a comma seperated file. In one of the files, I know that element.0 is a zipcode and element.1 is a counter. Each row can have 1-n number of cities. I need to know the number of elements from element.3 to the end of the line so that I can reformat them properly. I was wanting to use a foreach loop starting at element.3 to format the other elements into a single string.
Any help would be appreciated. Basically I am trying to read in a csv file and create a cpp file that can then be compiled on another platform as a plug-in for that platform.
Best Regards
Michael Gould

you can do something like this to get the fields from a line:
my #fields = split /,/, $line;
To access all elements from 3 to the end, do this:
foreach my $city (#fields[3..$#fields])
{
#do stuff
}
(Note, based on your question I assume you are using zero-based indexing. Thus "element 3" is the 4th element).
Alternatively, consider Text::CSV to read your CSV file, especially if you have things like escaped delimiters.

Well if your line is being read into an array, you can get the number of elements in the array by evaluating it in scalar context, for example
my $elems = #line;
or to be really sure
my $elems = scalar(#line);
Although in that case the scalar is redundant, it's handy for forcing scalar context where it would otherwise be list context. You can also find the index of the last element of the array with $#line.
After that, if you want to get everything from element 3 onwards you can use an array slice:
my #threeonwards = #line[3 .. $#line];

Related

What does $variable{$2}++ mean in Perl?

I have a two-column data set in a tab-separated .txt file, and the perl script reads it as FH and this is the immediate snippet of code that follows:
while(<FH>)
{
chomp;
s/\r//;
/(.+)\t(.+)/;
$uniq_tar{$2}++;
$uniq_mir{$1}++;
push#{$mir_arr{$1}},$2;
push #{$target{$2}} ,$1;
}
When I try to print any of the above 4 variables, it says the variables are uninitialized.
And, when I tried to print $uniq_tar{$2}++; and $uniq_mir{$1}++;
It just prints some numbers which I cannot understand.
I would just like to know what this part of code evaluate in general?
$uniq_tar{$2}++;
The while loop puts each line of your file, in turn, into Perl's special variable $_.
/.../ is the match operator. By default it works on $_.
/(.*)\t(.*)/ is a regular expression inside the match operator. If the regex matches what is in $_, then the bits of the matching string that are inside the two pairs of parentheses are stored in Perl's special variables $1 and $2.
You have hashes called %uniq_tar and %uniq_mir. You access individual elements in a hash using the $hashname{key}. So, $uniq_tar{$1} is finding the value in %uniq_tar associated with the key that is stored in $1 (that is - the part of your record before the first tab).
$variable++ increments the number in $variable. So $uniq_tar{$1}++ increments the value that we found in the previous paragraph.
So, as zdim says, it's a frequency counter. You read each line in the file, and extract the bits of data before and after the first tab in the line. You then increment the values in two hashes to count the number of occurences of each of the strings.

perl to hardcode a static value in a field

I am still learning perl and have all most got a program written. My question, as simple as it may be, is if I want to hardcode a string to a field would the below do that? Thank you :).
$out[45]="VUS";
In the other lines I use the below to define the values that are passed into the `$[out], but the one in question is hardcoded and the others come from a split.
my #vals = split/\t/; # this splits the line at tabs
my #mutations=split/,/,$vals[9]; # splits on comma to create an array of mutations
my ($gene,$transcript,$exon,$coding,$aa);
for (#mutations)
{
($gene,$transcript,$exon,$coding,$aa) = split/\:/; # this takes col AB and splits it at colons
grep {$transcript eq $_} keys %nms or next;
}
my #out=($.,#colsleft,$_,#colsright);
$out[2]=$gene;
$out[3]=$nms{$transcript};
$out[4]=$transcript;
$out[15]=$coding;
$out[17]=$aa;
Your line of code: $out[45]="VUS"; is correct in that it is defining that 46th element of the array #out to the string, "VUS". I am trying to understand from your code, however why you would want to do that? Usually, it is better practice to not hardcode if at all possible. You want to make it your goal to make your program as dynamic as possible.

Declaring a Perl array and assigning it values by an array-slice

I was trying to split a string and rearrange the results, all in a single statement:
my $date_str = '15/5/2015';
my #directly_assigned_date_array[2,1,0] = split ('/', $date_str);
This resulted in:
syntax error at Array_slice_test.pl line 16, near "#directly_assigned_date_array["
Why is that an error?
The following works well though:
my #date_array;
#date_array[2,1,0] = split ('/', $date_str);
#vol7ron offered a different way to do it:
my #rvalue_array = (split '/', $date_str)[2,1,0];
And it indeed does the job, but it looks unintuitive, to me at least.
As you are just reversing the splitted array you can accomplish the same using this single statement: #date_array = reverse(split('/',$date_str));
Others here know much more about Perl internals than myself, but I assume it cannot perform the operation because an array slice is referencing an element of an array, which does not yet exist. Because the array has not yet been declared, it wouldn't know what address to reference.
my #array = ( split '/', $date_str )[2,1,0];
This works because split returns values in list context. Lists and arrays are very similar in Perl. You could think of an array as a super list, with extra abilities. However you choose to think of it, you can perform a list slice just like an array slice.
In the above code, you're taking the list, then reordering it using the slice and then assigning that to array. It may feel different to think about at first, but it shouldn't be too hard. Generally, you want your data operations (modifications and ordering) to be performed on the rhs of the assignment and your lhs to be the receiving end.
Keep in mind that I've also dropped some parentheses and used Perl's smart order of operation interpreting to reduce the syntax. The same code might otherwise look like the following (same operations, just more fluff):
my #array = ( split( '/', $date_str ) )[2,1,0];
As #luminos mentioned, since you only have 3 elements you're manually reversing it, you could use a reverse function; again we can make use of Perl's magic order of operation and drop the parentheses here:
my #array = reverse split '/', $date_str;
But in this case it might be too magical, so depending on your coding practice guidelines, you may want to include a set of parentheses for the split or reverse, if it increases readability and comprehension.

In Perl, what is the difference between accessing an array element using #a[$i] as opposed to using $a[$i]?

Basic syntax tutorials I followed do not make this clear:
Is there any practical/philosophical/context-dependent/tricky difference between accessing an array using the former or latter subscript notation?
$ perl -le 'my #a = qw(io tu egli); print $a[1], #a[1]'
The output seems to be the same in both cases.
$a[...] # array element
returns the one element identified by the index expression, and
#a[...] # array slice
returns all the elements identified by the index expression.
As such,
You should use $a[EXPR] when you mean to access a single element in order to convey this information to the reader. In fact, you can get a warning if you don't.
You should use #a[LIST] when you mean to access many elements or a variable number of elements.
But that's not the end of the story. You asked for practical and tricky (subtle?) differences, and there's one noone mentioned yet: The index expression for an array element is evaluated in scalar context, while the index expression for an array slice is evaluated in list context.
sub f { return #_; }
$a[ f(4,5,6) ] # Same as $a[3]
#a[ f(4,5,6) ] # Same as $a[4],$a[5],$a[6]
If you turn on warnings (which you always should) you would see this:
Scalar value #a[0] better written as $a[0]
when you use #a[1].
The # sigil means "give me a list of something." When used with an array subscript, it retrieves a slice of the array. For example, #foo[0..3] retrieves the first four items in the array #foo.
When you write #a[1], you're asking for a one-element slice from #a. That's perfectly OK, but it's much clearer to ask for a single value, $a[1], instead. So much so that Perl will warn you if you do it the first way.
The first yields a scalar variable while the second gives you an array slice .... Very different animals!!

loading data from file into 2d array

I am just starting with perl and would like some help with arrays please.
I am reading lines from a data file and splitting the line into fields:
open (INFILE, $infile);
do {
my $linedata = <INFILE>;
my #data= split ',',$linedata;
....
} until eof;
I then want to store the individual field values (in #data) in and array so that the array looks like the input data file ie, the first "row" of the array contains the first line of data from INFILE etc.
Each line of data from the infile contains 4 values, x,y,z and w and once the data are all in the array, I have to pass the array into another program which reads the x,y,z,w and displays the w value on a screen at the point determined by the x,y,z value. I can not pas the data to the other program on a row-by-row basis as the program expects the data to in a 2d matrtix format.
Any help greatly appreciated.
Chris
That's not really that difficult, you just need to store the splits, not in their own separate list, but in an array, taking up a slot of a larger array:
my #all_data;
while (my $linedata = <INFILE>) {
push # creates the next (n) slot(s) in an array
#all_data
, [ split ',',$linedata ]
# ^ we're pushing an *array* not just additional elements.
;
}
However, if you're just trying to read a commonly-known concept as a comma-separated values format, then have a look at something like Text::CSV, because the full capabilities of CSV is more than splitting on commas.