Is there a one-liner to get the first element of a split? - perl

Instead of writing:
@holder = split /\./,"hello.world";
print @holder[0];
is it possible to just do a one-liner to just get the first element of the split? Something like:
print (split /\./,"hello.world")[0]
I get the following error when I try the second example:
print (...) interpreted as function at test.pl line 3.
syntax error at test.pl line 3, near ")["

You should have tried your hunch. That’s how to do it.
my $first = (split /\./, "hello.world")[0];
You could use a list-context assignment that grabs the first field only.
my($first) = split /\./, "hello.world";
To print it, use
print +(split /\./, "hello.world")[0], "\n";
or
print ((split(/\./, "hello.world"))[0], "\n");
The plus sign is there because of a syntactic ambiguity. It signals that everything that follows is an argument to print. The perlfunc documentation on print explains:
Be careful not to follow the print keyword with a left parenthesis unless you want the corresponding right parenthesis to terminate the arguments to the print; put parentheses around all arguments (or interpose a +, but that doesn't look as good).
In the case above, I find the case with + much easier to write and read. YMMV.
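As a small illustration of the list-slice syntax used above, here is a sketch that assumes a hypothetical three-field sample string. It shows that the parenthesized split can be indexed like any list, including with negative or multiple indices.

```perl
use strict;
use warnings;

my $string = 'alpha.beta.gamma';   # hypothetical sample input

# A parenthesized list can be subscripted directly (a "list slice").
my $first         = (split /\./, $string)[0];
my $last          = (split /\./, $string)[-1];
my ($head, $tail) = (split /\./, $string)[0, -1];

# The unary plus stops print from treating the parentheses as its argument list.
print +(split /\./, $string)[0], "\n";
print "$first $last $head $tail\n";
```

The slice is evaluated on the full list that split returns, so every field is still computed before the others are discarded.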

If you insist on using split for this then you could potentially be splitting a long string into multiple fields, only to discard all but the first. The third parameter to split should be used to limit the number of fields into which to divide the string.
my $string = 'hello.world';
print((split(/\./, $string, 2))[0]);
But I believe a regular expression better describes what you want to do, and avoids this problem completely.
Either
my $string = 'hello.world';
my ($first) = $string =~ /([^.]+)/;
or
my $string = 'hello.world';
print $string =~ /([^.]+)/;
will extract the first string of non-dot characters for you.
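Note that the two approaches are not quite interchangeable. The sketch below uses made-up sample strings to show where they differ: with a leading dot, split returns an empty first field, while the regular expression skips ahead to the first run of non-dot characters.

```perl
use strict;
use warnings;

# Hypothetical sample inputs, chosen to show where the two approaches differ.
my @samples = ('hello.world', '.hello.world', 'nodots');

my (%by_split, %by_regex);
for my $string (@samples) {
    # Limit of 2: split stops after the first separator it finds.
    ($by_split{$string}) = split /\./, $string, 2;
    # First run of non-dot characters, wherever it starts.
    ($by_regex{$string}) = $string =~ /([^.]+)/;
}

for my $string (@samples) {
    printf "%-14s split: '%s'  regex: '%s'\n",
        $string, $by_split{$string}, $by_regex{$string};
}
```

Which behaviour is right depends on whether an empty leading field is meaningful in your data.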

I get the following error when I try the second example:
"syntax error at test.pl line 3, near ")["
No, if you have warnings enabled as you should, you get:
print (...) interpreted as function at test.pl line 3.
syntax error at test.pl line 3, near ")["
which should be a big clue as to your problem.

Related

Perl: Find a match, remove the same lines, and to get the last field

Being a Perl newbie, please pardon me for asking this basic question.
I have a text file @server1 that shows a bunch of sentences (white space is the field separator) on many lines in the file.
I needed to match lines with my keyword, remove the same lines, and extract only the last field, so I have tried with:
my @allmatchedlines;
open(output1, "ssh user1@server1 cat /tmp/myfile.txt |");
while(<output1>) {
    chomp;
    @allmatchedlines = $_ if /mysearch/;
}
close(output1);
my @uniqmatchedline = split(/ /, @allmatchedlines);
my $lastfield = $uniqmatchedline[-1];
print "$lastfield\n";
and it gives me the output showing:
1
I don't know why it's giving me just "1".
Could someone please explain why I'm getting "1" and how I can get the last field of the matched line correctly?
Thank you!
my @uniqmatchedline = split(/ /, @allmatchedlines);
You're getting "1" because split takes a scalar, not an array. An array in scalar context returns the number of elements.
You need to split on each individual line. Something like this:
my @uniqmatchedline = map { split(/ /, $_) } @allmatchedlines;
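To see the scalar-context effect in isolation, here is a minimal sketch with one hypothetical matching line. The array is evaluated as its element count, so split ends up splitting the string "1".

```perl
use strict;
use warnings;

my @lines = ('only one matching line here');   # hypothetical data

my $count = @lines;                            # scalar context: number of elements
my @wrong = split / /, @lines;                 # split sees the count, i.e. "1"
my @right = map { split / /, $_ } @lines;      # split each line individually

print "count: $count\n";
print "wrong: @wrong\n";
print "right last field: $right[-1]\n";
```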
There are two issues with your code:
split is expecting a scalar value (string) to split on; if you are passing an array, it will convert the array to scalar (which is just the array length)
You did not have a way to remove same lines
To address these, the following code should work (not tested as no data):
my @allmatchedlines;
open(output1, "ssh user1\@server1 cat /tmp/myfile.txt |");
while(<output1>) {
    chomp;
    push @allmatchedlines, $_ if /mysearch/; # collect every matching line instead of overwriting
}
close(output1);
my %existing;
my @uniqmatchedline = grep !$existing{$_}++, @allmatchedlines; # this will return the unique lines
my @lastfields = map { ((split / /, $_)[-1]) . "\n" } @uniqmatchedline; # this maps the last field in each line into an array
print for @lastfields;
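Since the code above cannot be run without the remote file, here is a self-contained sketch of the same de-duplication idiom. The three lines are invented stand-ins for the ssh output, and the variable names are only illustrative.

```perl
use strict;
use warnings;

# Hypothetical lines standing in for the matched output of the remote file.
my @matched = (
    'job mysearch finished ok',
    'job mysearch finished ok',
    'task mysearch exited failed',
);

my %seen;
# $seen{$_}++ is 0 (false) the first time a line appears, so grep keeps
# only the first occurrence of each line and preserves the original order.
my @unique      = grep { !$seen{$_}++ } @matched;
my @last_fields = map  { (split ' ', $_)[-1] } @unique;

print "$_\n" for @last_fields;
```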
Apart from two errors in the code, I find the statement "remove the same lines and extract only the last field" unclear. Once duplicate matching lines are removed, there may still be multiple distinct sentences with the pattern.
Until a clarification comes, here is code that picks the last field from the last such sentence.
use warnings 'all';
use strict;
use List::MoreUtils qw(uniq);
my $file = '/tmp/myfile.txt';
my $cmd = "ssh user1\@server1 cat $file";
open(my $fh, '-|', $cmd) // die "Error opening $cmd: $!"; # /
my @allmatchedlines;
while (<$fh>) {
    chomp;
    push @allmatchedlines, $_ if /mysearch/;
}
close $fh;
my @unique_matched_lines = uniq @allmatchedlines;
my $lastfield = ( split ' ', $unique_matched_lines[-1] )[-1];
print $lastfield, "\n";
I changed to the three-argument open, with error checking. Recall that open for a process involves a fork and returns the pid, so an "error" does not relate at all to what happened with the command itself. See open. (The # / merely turns off wrong syntax highlighting.) Also note that @ inside "..." indicates an array and thus needs to be escaped.
The (default) pattern ' ' used in split splits on any amount of whitespace. The regex / / turns off this behavior and splits on a single space. You most likely want to use ' '.
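A quick sketch of the difference, using a made-up line with leading and doubled spaces:

```perl
use strict;
use warnings;

my $line = "  alpha  beta";            # hypothetical input: leading and doubled spaces

my @awk_mode     = split ' ', $line;   # special case: skips leading whitespace, splits on runs
my @single_space = split / /, $line;   # literal regex: every single space is a separator

print scalar(@awk_mode), " fields with ' '\n";
print scalar(@single_space), " fields with / /\n";
```

With / /, the empty strings between adjacent spaces become fields of their own, which is rarely what you want when picking out the last word of a sentence.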
For more comments please see the original post below.
The statement @allmatchedlines = $_ if /mysearch/; on every iteration assigns to the array, overwriting whatever has been in it. So you end up with only the last line that matched mysearch. You want push @allmatchedlines, $_ ... to get all those lines.
Also, as shown in the answer by Justin Schell, split needs a scalar so it is taking the length of @allmatchedlines – which is 1 as explained above. You should have
my @words_in_matched_lines = map { split } @allmatchedlines;
When all this is straightened out, you'll have words in the array @uniqmatchedline and if that is the intention then its name is misleading.
To get unique elements of the array you can use the module List::MoreUtils
use List::MoreUtils qw(uniq);
my @unique_elems = uniq @whole_array;

What is "Use of uninitialized value $. in range (or flip)" trying to tell me in Perl

I have the following code snippet in Perl:
my $argsize = @args;
if ($argsize >1){
foreach my $a ($args[1..$argsize-1]) {
$a =~ s/(.*[-+*].*)/\($1\)/; # if there's a math operator, put in parens
}
}
On execution I'm getting "Use of uninitialized value $. in range (or flip)", followed by "Argument "" isn't numeric in array element at...", both pointing to the foreach line.
Can someone help me decipher the error message (and fix the problem(s))? I have an array @args of strings. The code should loop through the second to nth elements (if any exist), and surround individual args with () if they contain a +, -, or *.
I don't think the error stems from the values in args, I think I'm screwing up the range somehow... but I'm failing when args has > 1 element. an example might be:
<"bla bla bla"> <x-1> <foo>
The long and short of it is - your foreach line is broken:
foreach my $a (@args[1..$argsize-1]) {
Works fine. It's because you're using a $, which says 'scalar value', rather than an @, which says array (or list).
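Here is a runnable sketch of the corrected loop using the sample arguments from the question. The loop variable is renamed to $arg (my choice, to stay clear of the special $a), and $#args is used as the last index instead of $argsize-1.

```perl
use strict;
use warnings;

my @args = ('<"bla bla bla">', '<x-1>', '<foo>');   # sample from the question

# @args[...] is an array slice: a list of elements, not a single scalar.
# foreach aliases each slice element, so the substitution edits @args itself.
foreach my $arg (@args[1 .. $#args]) {
    $arg =~ s/(.*[-+*].*)/($1)/;   # if there's a math operator, put in parens
}

print "@args\n";
```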
If you use diagnostics you get:
Use of uninitialized value $. in range (or flip) at
(W uninitialized) An undefined value was used as if it were already
defined. It was interpreted as a "" or a 0, but maybe it was a mistake.
To suppress this warning assign a defined value to your variables.
To help you figure out what was undefined, perl will try to tell you
the name of the variable (if any) that was undefined. In some cases
it cannot do this, so it also tells you what operation you used the
undefined value in. Note, however, that perl optimizes your program
and the operation displayed in the warning may not necessarily appear
literally in your program. For example, "that $foo" is usually
optimized into "that " . $foo, and the warning will refer to the
concatenation (.) operator, even though there is no . in
your program.
You can reproduce this error by:
my $x = 1..3;
Which is actually pretty much what you're doing here - you're trying to assign an array value into a scalar.
There's a load more detail in this question:
What is the Perl context with range operator?
But basically: It's treating it as a range operator, as if you were working your way through a file. You would be able to 'act on' particular lines in the file via this operator.
e.g.:
use Data::Dumper;
while (<DATA>) {
my $x = 2 .. 3;
print Dumper $x;
print if $x;
}
__DATA__
line one
another line
third line
fourth line
That range operator is testing line numbers - and because you have no line numbers (because you're not iterating a file) it errors. (But otherwise - it might work, but you'd get some really strange results ;))
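The same behaviour can be sketched without a __DATA__ section by reading from an in-memory filehandle. The text and variable names here are only illustrative. In scalar context, constant operands of .. are compared against the current input line number $., so the condition is true from line 2 through line 3.

```perl
use strict;
use warnings;

my $text = "line one\nanother line\nthird line\nfourth line\n";
open my $fh, '<', \$text or die "Cannot open in-memory file: $!";

my @picked;
while (my $line = <$fh>) {
    # Scalar-context range ("flip-flop"): 2 and 3 are compared with $.
    push @picked, $line if 2 .. 3;
}
close $fh;

print @picked;
```

Outside of a read loop $. is undefined, which is why the warning in the question names $. even though the code never mentions it.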
But I'd suggest you're doing this quite a convoluted way, and making (potentially?) an error, in that you're starting your array at 1, not zero.
You could instead:
s/(.*[-+*].*)/\($1\)/ for @args;
Which'll have the same result.
If you need to skip the first argument:
my ( $first_arg, @rest ) = @args;
s/(.*[-+*].*)/\($1\)/ for @rest;
But that error at runtime is the result of some of the data you're feeding in. What you've got here though:
use strict;
use warnings;
my @args = ( '<"bla bla bla">', '<x-1>', '<foo>' );
print "Before: @args\n";
s/(.*[-+*].*)/\($1\)/ for @args;
print "After: @args\n";

conditional substitution using hashes

I'm trying for substitution in which a condition will allow or disallow substitution.
I have a string
$string = "There is <tag1>you can do for it. that dosen't mean <tag2>you are fool.There <tag3>you got it.";
Here are two hashes which are used to check condition.
my %tag = ('tag1' => '<you>', 'tag2'=>'<do>', 'tag3'=>'<no>');
my %data = ('you'=>'<you>');
Here is actual substitution in which substitution is allowed for hash tag values not matched.
$string =~ s{(?<=<(.*?)>)(you)}{
if($tag{"$1"} eq $data{"$2"}){next;}
"I"
}sixe;
In this code I want to substitute 'you' with something, on the condition that it is not equal to the hash value given in tag.
Can I use next in substitution?
The problem is that I can't use the /g modifier, and after using next I can't move on to the next substitution.
Also, I can't modify the expression while matching, and with next it doesn't go on to the second match; it stops there.
You can't use a variable-length lookbehind assertion. The only one that is allowed is the special \K marker.
With that in mind, one way to perform this test is the following:
use strict;
use warnings;
while (my $string = <DATA>) {
$string =~ s{<([^>]*)>\K(?!\1)\w+}{I}s;
print $string;
}
__DATA__
There is <you>you can do for it. that dosen't mean <notyou>you are fool.
There is <you>you can do for it. that dosen't mean <do>you are fool.There <no>you got it.
Output:
There is <you>you can do for it. that dosen't mean <notyou>I are fool.
There is <you>you can do for it. that dosen't mean <do>I are fool.There <no>you got it.
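If the decision really has to come from the two hashes in the question, one possible variation (a sketch, not what the code above does) is to combine \K with the /e modifier and return the matched word unchanged when the condition says to skip it. This assumes /g is acceptable, and $result is a name introduced here for illustration.

```perl
use strict;
use warnings;

my $string = "There is <tag1>you can do for it. that dosen't mean <tag2>you are fool.There <tag3>you got it.";
my %tag  = (tag1 => '<you>', tag2 => '<do>', tag3 => '<no>');
my %data = (you => '<you>');

# \K keeps the tag out of the replaced text while $1 still holds the tag name.
# With /e the replacement is Perl code: returning $2 leaves the word as it was,
# which plays the role that "next" was meant to play.
(my $result = $string) =~ s{<([^>]+)>\K(you)}{
    ($tag{$1} // '') eq ($data{$2} // '') ? $2 : 'I'
}ge;

print "$result\n";
```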
It was simple, but it took me two days to work out. I just wrote another substitution that ignores the previous tag, which was cancelled by next:
$string = "There is <tag1>you can do for it. that dosen't mean <tag2>you are fool.There <tag3>you got it.";
my %tag = ('tag1' => '<you>', 'tag2'=>'<do>', 'tag3'=>'<no>');
my %data = ('you'=>'<you>');
my $notag;
$string =~ s{(?<=<(.*?)>)(you)}{
$notag = $2;
if($tag{"$1"} eq $data{"$2"}){next;}
"I"
}sie;
$string =~ s{(?<=<(.*?)>)(?<!$notag)(you)}{
"I"
}sie;

Why do I get a syntax error in my compound if statement?

Why do I get a syntax error in the following script?
print "Enter Sequence:";
$a = <STDIN>;
if ($a=="A")|| ($a== "T")|| ( $a == "C")|| ($a== "G")
{
print $a;
}
else
{
print "Error";
}
First, you have a syntax error: The condition expression of an if statement must be in parens.
The second error is found by using use strict; use warnings;, something you should always do. The error is the use of numerical comparison (==) where string comparison (eq) is called for.
The final problem is that $a will almost surely contain a string ending with a newline, so a chomp is in order.
The immediate problem is that the entire logical expression for an if must be in parentheses.
In addition
You must use eq instead of == for comparing strings
Your input string will have a trailing newline, so it will look like "C\n" and will not match a simple one-character string. You need to chomp the input before you compare it
It is generally better to read from STDIN using <> rather than <STDIN>. That way you can specify an input file on the command line, or read from the STDIN if no input was provided
You must always put use strict and use warnings at the top of your program. That will catch many simple errors that you may otherwise overlook
You shouldn't use $a as a variable name. It is a symbol reserved by Perl itself, and says nothing about the purpose of the variable
It is best to use a regular expression for simple comparisons like this. It makes your code much easier to read and will usually make the execution very much faster
Please take a look at this program, which I think does what you want.
use strict;
use warnings;
print "Enter Sequence: ";
my $input = <>;
chomp $input;
if ( $input =~ /^[ATCG]$/i ) {
print $input, "\n";
}
else {
print "Error";
}
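For comparison, here is a sketch that keeps the original compound condition but applies the fixes listed above: the whole expression forms a single condition, eq replaces ==, and the input is chomped. The helper name is_base and the test inputs are made up so that the example runs without reading STDIN.

```perl
use strict;
use warnings;

# Hypothetical helper: true when the input is exactly one of A, T, C or G.
sub is_base {
    my ($input) = @_;
    chomp $input;   # drop the trailing newline that <STDIN> would leave
    return $input eq 'A' || $input eq 'T' || $input eq 'C' || $input eq 'G';
}

for my $raw ("A\n", "G\n", "X\n", "AT\n") {
    (my $shown = $raw) =~ s/\n\z//;
    printf "%-2s %s\n", $shown, is_base($raw) ? 'ok' : 'Error';
}
```

Unlike the regular expression with /i, this version is case-sensitive, so a lowercase "a" would be rejected.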

Can I replace the binding operator with the smartmatch operator in Perl?

How can I write this with the smartmatch operator (~~)?
use 5.010;
my $string = '12 23 34 45 5464 46';
while ( $string =~ /(\d\d)\s/g ) {
say $1;
}
Interesting. perlsyn states:
Any ~~ Regex pattern match $a =~ /$b/
so, at first glance, it seems reasonable to expect
use strict; use warnings;
use 5.010;
my $string = '12 23 34 45 5464 46';
while ( $string ~~ /(\d\d)\s/g ) {
say $1;
}
to print 12, 23, etc but it gets stuck in a loop, matching 12 repeatedly. Using:
$ perl -MO=Deparse y.pl
yields
while ($string ~~ qr/(\d\d)\s/g) {
say $1;
}
looking at perlop, we notice
qr/STRING/msixpo
Note that 'g' is not listed as a modifier (logically, to me).
Interestingly, if you write:
my $re = qr/(\d\d)\s/g;
perl barfs:
Bareword found where operator expected at C:\Temp\y.pl line 5,
near "qr/(\d\d)\s/g"
syntax error at C:\Temp\y.pl line 5, near "qr/(\d\d)\s/g"
and presumably it should also say something if an invalid expression is used in the code above.
If we go and look at what these two variants get transformed into, we can see the reason for this.
First let's look at the original version.
perl -MO=Deparse -e'while("abc" =~ /(.)/g){print "hi\n"}'
while ('abc' =~ /(.)/g) {
print "hi\n";
}
As you can see there wasn't any changing of the opcodes.
Now if you go and change it to use the smart-match operator, you can see it does actually change.
perl -MO=Deparse -e'while("abc" ~~ /(.)/g){print "hi\n"}'
while ('abc' ~~ qr/(.)/g) {
print "hi\n";
}
It changes it to qr, which doesn't recognize the /g option.
This should probably give you an error, but it doesn't get transformed until after it gets parsed.
The error you should have gotten, and would get if you used qr instead, is:
syntax error at -e line 1, near "qr/(.)/g"
The smart-match feature was never intended to replace the =~ operator. It came out of the process of making given/when work like it does.
Most of the time, when(EXPR) is treated as an implicit smart match of $_.
...
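For contrast, here is a small sketch of what the /g flag does with =~, using the string from the question. In scalar context each successful match resumes where the previous one ended, and in list context all captures are returned at once. The array names are mine.

```perl
use strict;
use warnings;

my $string = '12 23 34 45 5464 46';

# Scalar context: every iteration continues from pos($string),
# and the loop ends when no further match is found.
my @scalar_ctx;
while ($string =~ /(\d\d)\s/g) {
    push @scalar_ctx, $1;
}

# List context: one expression returns every capture.
my @list_ctx = $string =~ /(\d\d)\s/g;

print "@scalar_ctx\n";
print "@list_ctx\n";
```

Note that the pattern picks up 64 out of 5464, and skips the final 46 because no whitespace follows it.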
Is the expected behaviour to output the first match endlessly? Because that's what this code must do in its current form. The problem isn't the smart-match operator. The while loop is endless, because no modification ever occurs to $string. The /g global switch doesn't change the loop itself.
What are you trying to achieve? I'm assuming you want to output the two-digit values, one per line. In which case you might want to consider:
say join("\n", grep { /^\d{2}$/ } split(" ",$string));
To be honest, I'm not sure you can use the smart match operator for this. In my limited testing, it looks like the smart match is returning a boolean instead of a list of matches. The code you posted (using =~) can work without it, however.
What you posted doesn't work because of the while loop. The conditional statement on a while loop is executed before the start of each iteration. In this case, your regex is returning the first value in $string because it is reset at each iteration. A foreach would work however:
my $string = '12 23 34 45 5464 46';
foreach my $number ($string =~ /(\d\d)\s/g) {
print $number."\n";
}