(3 lines) from bash to perl? - perl

I have these three lines in bash that work really nicely. I want to add them to an existing Perl script, but I have never used Perl before.
Could somebody rewrite them for me? I tried to use them as they are and it didn't work.
Note that $SSH_CLIENT is an environment variable that shows up when you type set in bash (Linux).
users[210]=radek # where 210 is the last octet of my Mac's IP
octet=($SSH_CLIENT) # split the value on spaces
somevariable=${users[${octet[0]##*.}]} # extract the last octet from the IP address and look up the user

These might work for you. I noted my assumptions with each line.
my %users = ( 210 => 'radek' );
I assume that you wanted a sparse array. Hashes are the standard implementation of sparse arrays in Perl.
my @octet = split ' ', $ENV{SSH_CLIENT}; # split the value on spaces
I assume that you still wanted to use the environment variable SSH_CLIENT
my ( $some_var ) = $octet[0] =~ /\.(\d+)$/;
You want the last set of digits from the '.' to the end.
The parens around the variable put the assignment into list context.
In list context, a match creates a list of all the "captured" sequences.
Assigning to a list of scalars takes only as many values from the match as there are scalars on the left-hand side; here that is the single captured value.
As for your question in the comments, you can get the variable out of the hash, by:
$db = $users{ $some_var };
# OR--this one's kind of clunky...
$db = $users{ [ $octet[0] =~ /\.(\d+)$/ ]->[0] };
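Putting the pieces above together, a minimal sketch might look like the following. The sub name user_for_client is made up for illustration, and it assumes SSH_CLIENT has the usual "ip port port" layout; it returns undef when the variable is unset or the octet has no entry in %users.

```perl
use strict;
use warnings;

# Lookup table from the question: last octet of the client IP => user name.
my %users = ( 210 => 'radek' );

# Hypothetical helper: return the user for an SSH_CLIENT-style string
# ("ip port port"), or undef if nothing usable is found.
sub user_for_client {
    my ($ssh_client) = @_;
    return unless defined $ssh_client;
    my ($ip) = split ' ', $ssh_client;        # first field is the client IP
    return unless defined $ip;
    my ($last_octet) = $ip =~ /\.(\d+)$/;      # list context yields the capture
    return unless defined $last_octet;
    return $users{$last_octet};                # undef if the octet is unknown
}

my $user = user_for_client( $ENV{SSH_CLIENT} );
print defined $user ? "user: $user\n" : "no matching user\n";
```

Checking for undef at each step keeps the script quiet under warnings when it is run outside an SSH session.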

Say you have already gotten your IP in a string,
$macip = "10.10.10.123";
@s = split /\./, $macip;
print $s[-1]; #get last octet
If you don't know Perl and you are required to use it for work, you will have to learn it. Surely you are not going to come to SO and ask every time you need it in Perl right?


Check if an array contains a time - Perl

How do I check if an array contains a time value? I've tried checking like this:
if ( @time =~ /$_:$_:$_/)
But it didn't work. Any ideas?
P.S.: The time is given like this: HH:MM:SS
Matching the time
To check for HH:MM:SS with a regular expression match, the simplest pattern would be
/\d\d:\d\d:\d\d/
If you only want this, add anchors for start (^) and end ($) of the string.
/^\d\d:\d\d:\d\d$/
If you want to make sure that your digits are only 0 to 9 and not digits from other scripts (which \d can also match in Unicode strings), use a character class.
/^[0-9]{2}:[0-9]{2}:[0-9]{2}$/
If you also want to make sure the time is a valid time, things get more complicated.
You might want to read perlre and perlretut. The tag wiki on Regular Expressions here on Stack Overflow has a lot of useful information and links to tools as well.
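As a rough illustration of the "valid time" case, here is one possible pattern. It assumes a 24-hour clock with hours 00-23 and minutes and seconds 00-59, and it deliberately ignores leap seconds; the name is_valid_time is made up for this sketch.

```perl
use strict;
use warnings;

# Hours 00-23, minutes and seconds 00-59, ASCII digits only.
# \A and \z anchor the pattern to the very start and end of the string.
my $valid_time = qr/\A(?:[01][0-9]|2[0-3]):[0-5][0-9]:[0-5][0-9]\z/;

# Hypothetical helper: returns 1 for a valid HH:MM:SS string, 0 otherwise.
sub is_valid_time {
    my ($string) = @_;
    return $string =~ $valid_time ? 1 : 0;
}

for my $candidate ( '23:59:59', '24:00:00', '12:60:00' ) {
    printf "%s %s\n", $candidate, is_valid_time($candidate) ? 'ok' : 'invalid';
}
```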
On arrays and scalars
However, you cannot pattern match against an array as a whole. In Perl, a variable with a $ as its sigil is called a scalar and represents a single value. That's the only thing you can pattern match against. An array starts with an @ sigil, and in the scalar context that =~ imposes it evaluates to its number of elements.
What you can do is match against every element in your array. For that, you have to iterate the array.
A very verbose way to do that would be:
my $matches;
foreach my $time (@times) {
    ++$matches if $time =~ m/\d\d:\d\d:\d\d/;
}
A more Perlish way would be to use grep.
my $matches = grep { m/\d\d:\d\d:\d\d/ } @times;
This makes use of the fact that grep in scalar context returns the number of matching elements rather than the list itself. If all you want is to know whether any of the elements matched, this is enough.
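To make the scalar versus list context difference concrete, here is a small sketch with made-up sample data. The pattern is anchored so that only whole-string HH:MM:SS values count.

```perl
use strict;
use warnings;

my @times = ( '12:30:45', 'not a time', '07:05:00' );   # made-up sample data

# Scalar context: grep returns how many elements matched.
my $count = grep { /^\d\d:\d\d:\d\d$/ } @times;

# List context: grep returns the matching elements themselves.
my @matches = grep { /^\d\d:\d\d:\d\d$/ } @times;

print "matched $count of ", scalar(@times), " elements\n";
print "$_\n" for @matches;
```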
What your code did
The $_ variable is called the topic in Perl, and often contains some kind of default value for certain operators, if no other value is specified. Depending on where in your program you used your line of code, you are matching the number of elements in @time (because of scalar context, see above) against a pattern built up of the content of $_ and colons.
if (
    @time    # number of elements in array @time
    =~       # because this operator forces scalar context
    /
        $_   # value of $_ based on surrounding code, or undef
        :    # a literal colon
        $_   # see above
        :    # a literal colon
        $_   # see above
    /x       # (I added /x to allow comments so this compiles)
) { ... }

perl to hardcode a static value in a field

I am still learning Perl and have almost got a program written. My question, as simple as it may be, is: if I want to hardcode a string into a field, would the line below do that? Thank you :).
$out[45]="VUS";
In the other lines I use the code below to define the values that are passed into @out, but the one in question is hardcoded and the others come from a split.
my @vals = split /\t/; # this splits the line at tabs
my @mutations = split /,/, $vals[9]; # splits on comma to create an array of mutations
my ($gene,$transcript,$exon,$coding,$aa);
for (@mutations)
{
    ($gene,$transcript,$exon,$coding,$aa) = split /:/; # this takes col AB and splits it at colons
    grep {$transcript eq $_} keys %nms or next;
}
my @out = ($., @colsleft, $_, @colsright);
$out[2]=$gene;
$out[3]=$nms{$transcript};
$out[4]=$transcript;
$out[15]=$coding;
$out[17]=$aa;
Your line of code, $out[45]="VUS";, is correct: it sets the 46th element of the array @out (index 45) to the string "VUS". I am trying to understand from your code, however, why you would want to do that. It is usually better practice not to hardcode values if you can avoid it, because hardcoded values make the program harder to adapt when the input changes.

Perl $1 giving uninitialized value error

I am trying to extract a part of a string and put it into a new variable. The string I am looking at is:
maker-scaffold_26653|ref0016423-snap-gene-0.1
(inside a $gene_name variable)
and the thing I want to match is:
scaffold_26653|ref0016423
I'm using the following piece of code:
my $gene_name;
my $scaffold_name;
if ($gene_name =~ m/scaffold_[0-9]+\|ref[0-9]+/) {
$scaffold_name = $1;
print "$scaffold_name\n";
}
I'm getting the following error when trying to execute:
Use of uninitialized value $scaffold_name in concatenation (.) or string
I know that the pattern is right, because if I use $' instead of $1 I get
-snap-gene-0.1
I'm at a bit of a loss: why will $1 not work here?
If you want to use a value from the match, you have to put parentheses () around the part of the regex you want to capture.
To expand on Jens' answer, () in a regex signifies an anonymous capture group. The content matched by each capture group is stored in $1, $2, $3 and so on, numbered from left to right, so for example,
/(..):(..):(..)/
on an HH:MM:SS time string will store hours, minutes, and seconds in $1, $2, $3 respectively. Naturally this begins to become unwieldy and is not self-documenting, so you can assign the results to a list instead:
my ($hours, $mins, $secs) = $time =~ m/(..):(..):(..)/;
So your example could bypass the numbered $1-style variables by doing direct assignment:
my ($scaffold_name) = $gene_name =~ m/(scaffold_[0-9]+[|]ref[0-9]+)/;
# $scaffold_name now contains 'scaffold_26653|ref0016423'
You can even get rid of the ugly =~ binding by using for as a topicalizer:
my $scaffold_name;
for ($gene_name) {
($scaffold_name) = m/(scaffold_\d+[|]ref\d+)/;
print $scaffold_name;
}
If things start to get more complex, I prefer to use named capture groups (introduced in Perl v5.10.0):
$gene_name =~ m{
(?<scaffold_name> # ?<name> creates a named capture group
scaffold_\d+? # 'scaffold' and its trailing digits
[|] # Literal pipe symbol
ref\d+ # 'ref' and its trailing digits
)
}xms; # The x flag lets us write more readable regexes
print $+{scaffold_name}, "\n";
The results of named capture groups are stored in the magic hash %+. Access is done just like any other hash lookup, with the capture group names as the keys. %+ is scoped in the same way the numbered variables ($1, $2, ...) are, so it can be used as a drop-in replacement for them in most situations.
It's overkill for this particular example, but as regexes get larger and more complicated, this saves you the trouble of either scrolling all the way back up and counting anonymous capture groups from left to right to find which numbered variable holds the capture you wanted, or scanning across a long list assignment to find where to add a new variable for a capture that got inserted in the middle.
My personal rule of thumb is to assign the results of anonymous captures to descriptively named, lexically scoped variables for three or fewer captures, then switch to named captures, comments, and indentation in regexes when more are necessary.

Perl "else" statement not executing

I use an ActivePerl script to take in CSV files and create XML files that I load into a database. These are userid database entries, name, address, etc. We've always used the home phone number field to generate an initial password (which we encourage the users to change immediately!). The proliferation of cellphones means I have a bunch of people with no home phone, so I want to use the cell phone field when the home phone field is empty.
My input fields look like this:
# 0 Firstname
# 1 Lastname
# 2 VP (voicepart)
# 3 Address
# 4 City
# 5 State
# 6 Zip
# 7 Phone
# 8 Mobile
# 9 Email
Here's the Perl code I've worked up to create the password - the create_password subroutine is working when there's a value in field 7:
my $pass_word = '';
my $pass_word = create_password($fields[7]);
if (my $pass_word = '') {
print "Use the cell phone number \n";
my $pass_word = create_password($fields[8]);
}
The "print" statement is to tell me what it thinks it's doing.
This looks to me like it should work, but the "if" statement never fires. The print statement doesn't print, and nobody with a value only in field 8 ever gets a password generated. There must be something wrong with the way I'm testing the value of $pass_word but I can't see it. Should I be testing the values of $fields[7] and $fields[8] instead of the variable value? How DO you test a Perl variable for null value if this doesn't work?
You have several problems in your code.
First of all, once you have declared a variable with my, you must not put my in front of it again when you use it; every additional my declares a new variable.
Secondly, for this line:
if (my $pass_word = '')
I think you meant
if ($pass_word == '')
(my is removed, as discussed in the first point)
= is the assignment operator. The assignment returns the value assigned to $pass_word, which is '' here, and the empty string is false, so this condition is never true.
But still, == is not correct here. In Perl, we use eq to compare two strings; == is used to compare numbers.
So, remove all the my except the first one, and use eq to compare your strings.
You've got two major problems in here.
First one is your string equality test. In Perl, strings are compared for equality using operator eq (as in $string eq 'something'). = is the assignment operator.
Second one is your (ab)use of my. Each my declares a new variable that “hides” the previous one, so in effect you can never re-use the value you stored earlier; you get a fresh variable every time.
Replace = with eq in your if clause; remove all but the first uses of my, and you should be set!
my declares a new variable which hides the variable with the same name in the surrounding scope. Remove the excessive use of my.
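With both fixes applied, the logic might look like the sketch below. The question does not show create_password, so the version here is a hypothetical stand-in that returns the last four digits of a phone number (or '' for an empty field); password_for and the sample record are also made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the real create_password(), which the question
# does not show: last four digits of the number, or '' for an empty field.
sub create_password {
    my ($phone) = @_;
    return '' unless defined $phone && length $phone;
    ( my $digits = $phone ) =~ s/\D//g;          # keep digits only
    return length($digits) > 4 ? substr( $digits, -4 ) : $digits;
}

# Hypothetical wrapper: home phone (field 7) first, mobile (field 8) as fallback.
sub password_for {
    my (@fields) = @_;
    my $pass_word = create_password( $fields[7] );     # declared once with my
    if ( $pass_word eq '' ) {                          # eq compares strings
        print "Use the cell phone number\n";
        $pass_word = create_password( $fields[8] );    # plain assignment, no my
    }
    return $pass_word;
}

# Made-up record with an empty home phone in field 7.
my @fields = ( 'Ann', 'Lee', 'S', '1 Main St', 'Town', 'NY', '10001',
               '', '555-867-5309', 'ann@example.com' );
print password_for(@fields), "\n";
```

Testing the result of create_password rather than $fields[7] directly keeps the fallback working even if the home phone field contains only whitespace or punctuation, at least with the stand-in shown here.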

Transform data to array with Perl

How do I transform my data to an array with Perl?
Here is my data:
my $data =
"203.174.38.128203.174.38.129203.174.38.1" .
"30203.174.38.131203.174.38.132203.174.38" .
".133203.174.38.134173.174.38.135203.174." .
"38.136203.174.38.137203.174.38.142";
And I want to transform it to be array like this
my @array = (
"203.174.38.128",
"203.174.38.129",
"203.174.38.130",
"203.174.38.131",
"203.174.38.132",
"203.174.38.133",
"203.174.38.134",
"173.174.38.135",
"203.174.38.136",
"203.174.38.137",
"203.174.38.142"
);
Anyone know how to do that with Perl?
If the first part of IP logged is always 203, it's kinda easy:
my @arr = split /(?<=\d)(?=203\.)/, $data;
In the example given it's not, but the first part is always 3-digit, and the second part is always 174, so it's enough to do...
my @arr = split /(?<=\d)(?=\d{3}\.174\.)/, $data;
... to get the correct result.
But please understand that it's close to impossible to give a more generic (and bulletproof) solution here - when these 'marker' parts are... too dynamic. For example, take this string...
11.11.11.22222.11.11.11
The question is, where to split it? Should it be 11.11.11.22; 222.11.11.11? Or 11.11.11.222; 22.11.11.11? Both are quite valid IPs, if you ask me. And it could get even worse, with trying to split '2222' part (can be '2; 222', '22; 22' and even '222; 2').
You can, for example, make a rule: "split each sequence of > 3 digits followed by a dot sign so that the second part of this split would always start from 3 digits":
my @arr = split /(?<=\d)(?=\d{3}\.)/, $data;
... but this will obviously fail to work properly in the ambiguous cases mentioned earlier IF there are IPs with two- or even one-digit first octet in your datastring.
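To see what the "second part starts with three digits" rule does with the ambiguous string from above, here is a small sketch. It only shows which of the two readings this particular rule picks; it says nothing about which reading is actually right.

```perl
use strict;
use warnings;

# The ambiguous string discussed above.
my $ambiguous = '11.11.11.22222.11.11.11';

# Split at every position that follows a digit and is followed by
# exactly three digits and a dot.
my @arr = split /(?<=\d)(?=\d{3}\.)/, $ambiguous;

print "$_\n" for @arr;
# prints:
# 11.11.11.22
# 222.11.11.11
```

The other reading (11.11.11.222 and 22.11.11.11) is never produced, which is exactly the limitation described above.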
If you write a regex that will match any valid value for one of the numbers in the quartet, then you can just search for them all and recombine them in sets of four. This
/25[0-5]|2[0-4]\d|1\d\d|[1-9]\d|\d/
matches 250-255 or 200-249 or 100-199 or 10-99 or 0-9, and a program to use it is shown below.
There is no way to know which option to take if there is more than one way to split the string, and this solution assigns the longest value to the first of the two ip addresses. For instance, 1.1.1.1234.1.1.1 will split as 1.1.1.123 and 4.1.1.1
use strict;
use warnings;
use feature 'say';
my $data =
"203.174.38.128203.174.38.129203.174.38.1" .
"30203.174.38.131203.174.38.132203.174.38" .
".133203.174.38.134173.174.38.135203.174." .
"38.136203.174.38.137203.174.38.142";
my $byte = qr/25[0-5]|2[0-4]\d|1\d\d|[1-9]\d|\d/;
my @bytes = $data =~ /($byte)/g;
my @addresses;
push @addresses, join('.', splice(@bytes, 0, 4)) while @bytes;
say for @addresses;
output
203.174.38.128
203.174.38.129
203.174.38.130
203.174.38.131
203.174.38.132
203.174.38.133
203.174.38.134
173.174.38.135
203.174.38.136
203.174.38.137
203.174.38.142
Using your sample, it looks like you have 3 digits for the first and last octet. That would suggest using this pattern:
/(\d{3}\.\d{1,3}\.\d{1,3}\.\d{3})/
Add that with a /g switch and it will pull every one.
However, if you have a larger and divergent set of data than what you show for your sample, somebody should have separated the ips before dumping them into this string. If they are separate data points, they should have some separation.
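For the sample data from the question, the /g approach could be sketched as follows. It relies on the assumption stated above that the first and last octets always have three digits, so it is not a general solution.

```perl
use strict;
use warnings;

my $data =
    "203.174.38.128203.174.38.129203.174.38.1" .
    "30203.174.38.131203.174.38.132203.174.38" .
    ".133203.174.38.134173.174.38.135203.174." .
    "38.136203.174.38.137203.174.38.142";

# In list context, a /g match with one capture group returns every capture.
my @ips = $data =~ /(\d{3}\.\d{1,3}\.\d{1,3}\.\d{3})/g;

print scalar(@ips), " addresses found\n";
print "$_\n" for @ips;
```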