Split and add digits - perl

If I open a file with strings like "233445", how can I then split that string into digits "2 3 3 4 4 5" and add each one to each other "2 + 3 + 3 etc..." and print out the result.
My code so far looks like this:
use strict;
#open (FILE, '<', shift);
#my #strings = <FILE>;
#strings = qw(12243434, 345, 676744); ## or a contents of a file
foreach my $numbers (#strings) {
my #done = split(undef, $numbers);
print "#done\n";
}
But I don't know where to start for the actual add function.

use strict;
use warnings;
my #strings = qw( 12243434 345 676744 );
for my $string (#strings) {
my $sum;
$sum += $_ for split(//, $string);
print "$sum\n";
}
or
use strict;
use warnings;
use List::Util qw( sum );
my #strings = qw( 12243434 345 676744 );
for my $string (#strings) {
my $sum = sum split(//, $string);
print "$sum\n";
}
PS — Always use use strict; use warnings;. It would have detected your misuse of commas in qw, and it would have dected your misuse of undef for split's first argument.

use strict;
my #done;
#open (FILE, '<', shift);
#my #strings = <FILE>;
my #strings = qw(12243434, 345, 676744); ## or a contents of a file
foreach my $numbers (#strings) {
#done = split(undef, $numbers);
print "#done\n";
}
my $tot;
map { $tot += $_} #done;
print $tot, "\n";

No one suggested an eval solution?
my #strings = qw( 12243434 345 676744 );
foreach my $string (#strings) {
my $sum = eval join '+',split //, $string;
print "$sum\n";
}

If your numbers are in a file, a one-liner might be nice:
perl -lnwe 'my $sum; s/(\d)/$sum += $1/eg; print $sum' numbers.txt
Since addition only uses numbers, it is safe to ignore all other characters. So just extract them one at the time with the regex and sum them up.
TIMTOWTDI:
perl -MList::Util=sum -lnwe 'print sum(/\d/g);' numbers.txt
perl -lnwe 'my $a; $a+=$_ for /\d/g; print $a' numbers.txt
Options:
-l auto-chomp input and add newline to print
-n implicit while(<>) loop around program -- open the file name given as argument and read each line into $_.

Related

How to separate an array in Perl based on pattern

I am trying to write a big script but I am stuck on a part. I want to sprit an array based on ".."
From the script I got this:
print #coordinates;
gene complement(872..1288)
my desired output:
complement 872 1288
I tried:
1) my #answer = split(.., #coordinates)
print("#answer\n");
2) my #answer = split /../, #coordinates;
3) print +(split /\../)[-1],[-2],[-3] while <#coordinates>
4) foreach my $anwser ( #coordinates )
{$anwser =~ s/../"\t"/;
print $anwser;}
5) my #answer = split(/../, "complement(872..1288)"); #to see if the printed array is problematic.
which prints:
) ) ) ) ) ) ) ) )
6) my #answer = split /"gene "/, #coordinates; # I tried to "catch" the entire output's spaces and tabs
which prints
0000000000000000000000000000000001000000000100000000
But none of them works. Does anyone has any idea how to step over this issue?
Ps, unfortunately, I can't run my script right now on Linux so I used this website to run my script. I hope this is not the reason why I didn't get my desired output.
my $RE_COMPLEMENT = qr{(complement)\((\d+)\.\.(\d+)\)}msx;
for my $item (#coordinates) {
my ($head, $i, $j) = $item =~ $RE_COMPLEMENT;
if (defined($head) && defined($i) && defined($j)) {
print("$head\t$i\t$j\n");
}
}
split operates on a scalar, not on an array.
my $string = 'gene complement(872..1288)';
my #parts = split /\.\./, $string;
print $parts[0]; # gene complement(872
print $parts[1]; # 1288)
To get the desired output, you can use a substitution:
my $string = 'gene complement(872..1288)';
$string =~ s/gene +|\)//g;
$string =~ s/\.\./ /;
$string =~ s/\(/ /;
Desired effect can be achieved with
use of tr operator to replace '(.)' => ' '
then splitting data string into element on space
storing only required part of array
output elements of array joined with tabulation
use strict;
use warnings;
use feature 'say';
my $data = <DATA>;
chomp $data;
$data =~ tr/(.)/ /;
my #elements = (split ' ', $data)[1..3];
say join "\t", #elements;
__DATA__
gene complement(872..1288)
Or as an alternative solution with only substitutions (without splitting data string into array)
use strict;
use warnings;
use feature 'say';
my $data = <DATA>;
chomp $data;
$data =~ s/gene\s+//;
$data =~ s/\)//;
$data =~ s/[(.]+/\t/g;
say $data;
__DATA__
gene complement(872..1288)
Output
complement 872 1288

Pick up the longest peptide using perl

I want to find out the longest possible protein sequence translated from cds in 6 forward and reverse frame.
This is the example input format:
>111
KKKKKKKMGFSOXLKPXLLLLLLLLLLLLLLLLLMJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJX
>222
WWWMPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPMPPPPPXKKKKKK
I would like to find out all the strings which start from "M" and stop at "X", count the each length of the strings and select the longest.
For example, in the case above:
the script will find,
>111 has two matches:
MGFSOX
MJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJX
>222 has one match:
MPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPMPPPPPX
Then count each match's length, and print the string and number of longest matches which is the result I want:
>111
MJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJX 32
>222
MPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPMPPPPPX 38
But it prints out no answer. Does anyone know how to fix it? Any suggestion will be helpful.
#!/usr/bin/perl -w
use strict;
use warnings;
my #pep=();
my $i=();
my #Xnum=();
my $n=();
my %hash=();
my #k=();
my $seq=();
$n=0;
open(IN, "<$ARGV[0]");
while(<IN>){
chomp;
if($_=~/^[^\>]/){
#pep=split(//, $_);
if($_ =~ /(X)/){
push(#Xnum, $1);
if($n >= 0 && $n <= $#Xnum){
if(#pep eq "M"){
for($i=1; $i<=$#pep; $i++){
$seq=join("",#pep);
$hash{$i}=$seq;
push(#k, $i);
}
}
elsif(#pep eq "X"){
$n=$n+1;
}
foreach (sort {$a cmp $b} #k){
print "$hash{$k[0]}\t$k[0]";
}
}
}
}
elsif($_=~/^\>/){
print "$_\n";
}
}
close IN;
Check out this Perl one-liner
$ cat iris.txt
>111
KKKKKKKMGFSOXLKPXLLLLLLLLLLLLLLLLLMJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJX
>222
WWWMPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPMPPPPPXKKKKKK
$ perl -ne ' if(!/^>/) { print "$p"; while(/(M[^M]+?X)/g ) { if(length($1)>length($x)) {$x=$1 } } print "$x ". length($x)."\n";$x="" } else { $p=$_ } ' iris.txt
>111
MJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJX 32
>222
MPPPPPX 7
$
There's more than one way to do it!
Try this too:
print and next if /^>/;
chomp and my #z = $_ =~ /(M[^X]*X)/g;
my $m = "";
for my $s (#z) {
$m = $s if length $s > length $m
}
say "$m\t" . length $m
Output:
>111
MJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJX 32
>222
MPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPMPPPPPX 38
uses >=5.14 and make sure to run script with perl -n
As a one-liner:
perl -E 'print and next if /^>/; chomp and my #z = $_ =~ /(M[^X]*X)/g; my $m = ""; for my $s (#z) { $m = $s if length $s > length $m } say "$m\t" . length $m' -n data.txt
Here is solution using reduce from List::Util.
Edit: mistakenly used maxstr which gave results but is not what was needed. Have reedited this post to use reduce (correctly) instead.
#!/usr/bin/perl
use strict;
use warnings;
use List::Util qw/reduce/;
open my $fh, '<', \<<EOF;
>111
KKKKKKKMGFSOXLKPXLLLLLLLLLLLLLLLLLMJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJX
>222
WWWMPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPMPPPPPXKKKKKK
EOF
my $id;
while (<$fh>) {
chomp;
if (/^>/) {
$id = $_;
}
else {
my $data = reduce {length($a) > length($b) ? $a : $b} /M[^X]*X/g;
print "$id\n$data\t" . length($data) . "\n" if $data;
}
}
Here's my take on it.
I like fasta files tucked into a hash, with the fasta name as the key. This way you can just add descriptions to it, e.g. base composition etc...
#!/usr/local/ActivePerl-5.20/bin/env perl
use strict;
use warnings;
my %prot;
open (my $fh, '<', '/Users/me/Desktop/fun_prot.fa') or die $!;
my $string = do { local $/; <$fh> };
close $fh;
chomp $string;
my #fasta = grep {/./} split (">", $string);
for my $aa (#fasta){
my ($key, $value) = split ("\n", $aa);
$value =~ s/[A-Z]*(M.*M)[A-Z]/$1/;
$prot{$key}->{'len'} = length($value);
$prot{$key}->{'prot'} = $value;
}
for my $sequence (sort { $prot{$b}->{'len'} <=> $prot{$a}->{'len'} } keys %prot){
print ">" . $sequence, "\n", $prot{$sequence}->{'prot'}, "\t", $prot{$sequence}->{'len'}, "\n";
last;
}
__DATA__
>1232
ASDFASMJJJJJMFASDFSDAFSDDFSA
>2343
AASFDFASMJJJJJJJJJJJJJJMRGQEGDAGDA
Output
>2343
MJJJJJJJJJJJJJJM 16

Split list of delimited lines to hash

The following produces what i want.
#!/usr/bin/env perl
use 5.020;
use warnings;
use Data::Dumper;
sub command {
<DATA>
#in the reality instead of the DATA I have
#qx(some weird shell command what produces output like in the DATA);
}
my #lines = grep { !/^\s*$/ } command();
chomp #lines;
my $data;
#how to write the following nicer - more compact, elegant, etc.. ;)
for my $line (#lines) {
my #arr = split /:/, $line;
$data->{$arr[0]}->{text} = $arr[1];
$data->{$arr[0]}->{par} = $arr[2];
$data->{$arr[0]}->{val} = $arr[3];
}
say Dumper $data;
__DATA__
line1:some text1:par1:val1
line2:some text2:par2:val2
line3:some text3:par3:val3
Wondering how to write the loop in more perlish form. ;)
You can assign to a hash slice:
for my $line (#lines) {
my ($id, #arr) = split /:/, $line;
#{ $data->{$id} }{qw{ text par val }} = #arr;
}
Also, use the following instead of qx, so you don't need to store all the lines in an array:
open my $PIPE, '-|', 'command' or die $!;
while (<$PIPE>) {
# ...
}

How many different ways are there to concatenate two files line by line using Perl?

Suppose file1 looks like this:
bye bye
hello
thank you
And file2 looks like this:
chao
hola
gracias
The desired output is this:
bye bye chao
hello hola
thank you gracias
I myself have already come up with five different approaches to solve this problem. But I think there must be more ways, probably more concise and more elegant ways, and I hope I can learn more cool stuff :)
The following is what I have tried so far, based on what I've learnt from the many solutions of my previous problems. Also, I'm trying to sort of digest or internalize the knowledge I've acquired from the Llama book.
Code 1:
#!perl
use autodie;
use warnings;
use strict;
open my $file1,'<','c:/file1.txt';
open my $file2,'<','c:/file2.txt';
while(defined(my $line1 = <$file1>)
and defined(my $line2 = <$file2>)){
die "Files are different sizes!\n" unless eof(file1) == eof(file2);
$line1 .= $line2;
$line1 =~ s/\n/ /;
print "$line1 \n";
}
Code 2:
#!perl
use autodie;
use warnings;
use strict;
open my $file1,'<','c:/file1.txt';
my #file1 = <$file1>;
open my $file2,'<','c:/file2.txt';
my #file2 =<$file2>;
for (my $n=0; $n<=$#file1; $n++) {
$file1[$n] .=$file2[$n];
$file1[$n]=~s/\n/ /;
print $file1[$n];
}
Code 3:
#!perl
use autodie;
use warnings;
use strict;
open my $file1,'<','c:/file1.txt';
open my $file2,'<','c:/file2.txt';
my %hash;
while(defined(my $line1 = <$file1>)
and defined(my $line2 = <$file2>)) {
chomp $line1;
chomp $line2;
my ($key, $val) = ($line1,$line2);
$hash{$key} = $val;
}
print map { "$_ $hash{$_}\n" } sort keys %hash;
Code 4:
#!perl
use autodie;
use warnings;
use strict;
open my $file1,'<','c:/file1.txt';
open my $file2,'<','c:/file2.txt';
while(defined(my $line1 = <$file1>)
and defined(my $line2 = <$file2>)) {
$line1 =~ s/(.+)/$1 $line2/;
print $line1;
}
Code 5:
#!perl
use autodie;
use warnings;
use strict;
open my $file1,'<','c:/file1.txt';
my #file1 =<$file1>;
open my $file2,'<','c:/file2.txt';
my #file2 =<$file2>;
while ((#file1) && (#file2)){
my $m = shift (#file1);
chomp($m);
my $n = shift (#file2);
chomp($n);
$m .=" ".$n;
print "$m \n";
}
I have tried something like this:
foreach $file1 (#file2) && foreach $file2 (#file2) {...}
But Perl gave me a syntactic error warning. I was frustrated. But can we run two foreach loops simultaneously?
Thanks, as always, for any comments, suggestions and of course the generous code sharing :)
This works for any number of files:
use strict;
use warnings;
use autodie;
my #handles = map { open my $h, '<', $_; $h } #ARGV;
while (#handles){
#handles = grep { ! eof $_ } #handles;
my #lines = map { my $v = <$_>; chomp $v; $v } #handles;
print join(' ', #lines), "\n";
}
close $_ for #handles;
The most elegant way doesn't involve perl at all:
paste -d' ' file1 file2
If I were a golfing man, I could rewrite #FM's answer as:
($,,$\)=(' ',"\n");#_=#ARGV;open $_,$_ for #_;print
map{chomp($a=<$_>);$a} #_=grep{!eof $_} #_ while #_
which you might be able to turn into a one-liner but that is just evil. ;-)
Well, here it is, under 100 characters:
C:\Temp> perl -le "$,=' ';#_=#ARGV;open $_,$_ for #_;print map{chomp($a =<$_>);$a} #_=grep{!eof $_ }#_ while #_" file1 file2
If it is OK to slurp (and why the heck not — we are looking for different ways), I think I have discovered the path the insanity:
#_=#ARGV;chomp($x[$.-1]{$ARGV}=$_) && eof
and $.=0 while<>;print "#$_{#_}\n" for #x
C:\Temp> perl -e "#_=#ARGV;chomp($x[$.-1]{$ARGV}=$_) && eof and $.=0 while<>;print qq{#$_{#_}\n} for #x" file1 file2
Output:
bye bye chao
hello hola
thank you gracias
An easier alternative to your Code 5 which allows for an arbitrary number of lines and does not care if files have different numbers of lines (hat tip #FM):
#!/usr/bin/perl
use strict; use warnings;
use File::Slurp;
use List::AllUtils qw( each_arrayref );
my #lines = map [ read_file $_ ], #ARGV;
my $it = each_arrayref #lines;
while ( my #lines = grep { defined and chomp and length } $it->() ) {
print join(' ', #lines), "\n";
}
And, without using any external modules:
#!perl
use autodie; use warnings; use strict;
my ($file1, $file2) = #ARGV;
open my $file1_h,'<', $file1;
my #file1 = grep { chomp; length } <$file1_h>;
open my $file2_h,'<', $file2;
my #file2 = grep { chomp; length } <$file2_h>;
my $n_lines = #file1 > #file2 ? #file1 : #file2;
for my $i (0 .. $n_lines - 1) {
my ($line1, $line2) = map {
defined $_ ? $_ : ''
} $file1[$i], $file2[$i];
print $line1, ' ', $line2, "\n";
}
If you want to concatenate only the lines that appear in both files:
#!perl
use autodie; use warnings; use strict;
my ($file1, $file2) = #ARGV;
open my $file1_h,'<', $file1;
my #file1 = grep { chomp; length } <$file1_h>;
open my $file2_h,'<', $file2;
my #file2 = grep { chomp; length } <$file2_h>;
my $n_lines = #file1 < #file2 ? #file1 : #file2;
for my $i (0 .. $n_lines - 1) {
print $file1[$i], ' ', $file2[$i], "\n";
}
An easy one with minimal error checking:
#!/usr/bin/perl -w
use strict;
open FILE1, '<file1.txt';
open FILE2, '<file2.txt';
while (defined(my $one = <FILE1>) or defined(my $twotemp = <FILE2>)){
my $two = $twotemp ? $twotemp : <FILE2>;
chomp $one if ($one);
chomp $two if ($two);
print ''.($one ? "$one " : '').($two ? $two : '')."\n";
}
And no, you can't run two loops simultaneous within the same thread, you'd have to fork, but that would not be guaranteed to run synchronously.

How can I extract the values after = in my string with Perl?

I have a string like this
field1=1 field2=2 field3=abc
I want to ouput this as
2,1,abc
Any ideas as to how I can go about this? I can write a small C or Java program to do this, trying I'm trying to find out a simple way to do it in Perl.
use strict;
use warnings;
my $string = 'field1=1 field2=2 field3=abc';
my #values = ($string =~ m/=(\S+)/g);
print join(',', #values), "\n";
#!/usr/bin/perl
use strict;
use warnings;
# Input string
my $string = "field1=1 field2=2 field3=abc";
# Split string into a list of "key=value" strings
my #pairs = split(/\s+/,$string);
# Convert pair strings into hash
my %hash = map { split(/=/, $_, 2) } #pairs;
# Output hash
printf "%s,%s,%s\n", $hash{field2}, $hash{field1}, $hash{field3}; # => 2,1,abc
# Output hash, alternate method
print join(",", #hash{qw(field2 field1 field3)}), "\n";
Use m//g in list context:
#!/usr/bin/perl
use strict;
use warnings;
my $x = "field1=1 field2=2 field3=abc";
if ( my #matches = $x =~ /(?:field[1-3]=(\S+))/g ) {
print join(',', #matches), "\n";
}
__END__
Output:
C:\Temp> klm
1,2,abc
$_='field1=1 field2=2 field3=abc';
$,=',';
say /=(\S+)/g
Let's play Perl golf :D
my $str = 'field1=1 field2=2 field3=abc';
print(join(',', map { (split('=', $_))[1] } split(' ', $str)));
There's several ways you can do that:
Regex match
my $s = "field1=1 field2=2 field3=abc";
$s =~ /field1=(\w*) field2=(\w*) field3=(\w*)$/; //pick out each field
print $1,$2,$3;'
12abc
Split the string on match
my $s = "field1=1 field2=2 field3=abc";
my #arr = split / /, $s; print #arr,"\n"; //make an array of name=value pairs
my #vals = map { #pairs = split /=/, $_; $pairs[1] } #arr; //get the values only from each pair
print #vals'
field1=1field2=2field3=abc
12abc
Split and put in a hash (I think that's the most useful one)
my $s = "field1=1 field2=2 field3=abc";
my #arr = split / /, $s;
my %pairs = map { split=/, $_; } #arr;
print $pairs{field1}, $pairs{field2}, $pairs{field3}
12abc
Assuming your ordering was a typo:
#!/usr/bin/perl
use strict; use warnings;
my $str='a=1 b=2 c=abc';
my #v;
while ($str =~ /=(\S+)/g) {
push #v, $1;
}
print join (',', #v);
Perl is definitely the right tool for this.
#! /usr/bin/perl
$str = "field1=1 field2=2 field3=abc";
$str =~ /field1=(\S+)\ field2=(\S+)\ field3=(\S+)/;
print "$1,$2,$3", "\n";
my $a = "field1=1 field2=2 field3=abc";
my #f = split /\s*\w+=/, $a;
shift(#f);
print join(",", #f), "\n";
$string="field1=1 field2=2 field3=abc";
#s=split /\s+/,$string;
$temp=$s[1];$s[1]=$s[0];$s[0]=$temp;
foreach (#s){s/.*=//; push(#a,$_ );}
print join(",",#a);
If you actually need both the keys and the values. I would put them into a hash. You could just capture both sides of the "=", and put directly into the hash.
use strict;
use warnings;
my $str = 'field1=1 field2=2 field3=abc';
my %fields = $str =~ / (\S+) \s* = \s* (\S+) /xg;
use YAML;
print Dump \%fields
---
field1: 1
field2: 2
field3: abc
For further information please read perldoc perlre.
If you are just a beginner, you may want to read perldoc perlretut.