How can I iterate through nested arrays? - perl

I have created an array as follows:
while (defined ($line = <STDIN>))
{
chomp ($line);
push @stack,($line);
}
Each line has two numbers:
15 6
2 8
How do I iterate over each item in each line?
i.e. I want to print
15
6
2
8
I understand it's something like
foreach (@{stack}) (@stack){
print "?????
}
This is where I am stuck.

See the perldsc documentation. That's the Perl Data Structures Cookbook, which has examples for dealing with arrays of arrays. From what you're doing though, it doesn't look like you need an array of arrays.
For your problem of taking two numbers per line and outputting one number per line, just turn the whitespace into newlines:
while( <> ) {
s/\s+/\n/g; # turn all whitespace runs into newlines
print; # it's ready to print
}
With Perl 5.10 or later, you can use the \h character class, which matches only horizontal whitespace:
while( <> ) {
s/\h+/\n/g; # turn all horizontal whitespace runs into newlines
print; # it's ready to print
}
As a Perl one-liner, that's just:
% perl -pe 's/\h+/\n/g' file.txt
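If the data really does need to be kept as an array of arrays, as perldsc describes, nested loops can walk it. The following is a minimal sketch using the two sample lines from the question; the flatten_rows helper name is made up for illustration and is not part of any library.

```perl
use strict;
use warnings;

# Hypothetical helper: takes a list of array references (one per input line)
# and returns every item in row order.
sub flatten_rows {
    my @rows = @_;
    my @items;
    for my $row (@rows) {          # $row is a reference to one inner array
        for my $item (@$row) {     # dereference it to walk the inner array
            push @items, $item;
        }
    }
    return @items;
}

# Sample data taken from the question: one inner array per line.
my @stack = ( [ 15, 6 ], [ 2, 8 ] );

print "$_\n" for flatten_rows(@stack);
```

The outer loop variable holds a reference, so the inner loop has to dereference it with @$row before iterating.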

#!/usr/bin/perl
use strict;
use warnings;
while ( my $data = <DATA> ) {
my @values = split ' ', $data;
print $_, "\n" for @values;
}
__DATA__
15 6
2 8
Output:
C:\Temp> h
15
6
2
8
Alternatively, if you want to store each line in @stack and print out later:
my @stack = map { [ split ] } grep { chomp; length } <DATA>;
The line above slurps everything coming from the DATA filehandle into a list of lines (because <DATA> happens in list context). The grep chomps each line and filters by length after chomping (to avoid getting any trailing empty lines in the data file -- you can omit it if there are none). The map then splits each line on whitespace and creates an anonymous array reference for each line. Finally, these array references are stored in the elements of @stack. You might want to use Data::Dumper to look at @stack to understand what's going on.
print join("\n", @$_), "\n" for @stack;
Now we loop over each entry in @stack, dereferencing each array in turn, then joining the elements of each array with newlines to print one element per line.
Output:
C:\Temp> h
15
6
2
8
The long way of writing essentially the same thing (with less memory consumption) would be:
my @stack;
while ( my $line = <DATA> ) {
last unless $line =~ /\S/;
my @values = split ' ', $line;
push @stack, \@values;
}
for my $ref ( @stack ) {
print join("\n", @$ref), "\n";
}
Finally, if you wanted to do something other than printing all values, say, sum all the numbers, you should store one value per element of @stack:
use List::Util qw( sum );
my @stack;
while ( my $line = <DATA> ) {
last unless $line =~ /\S/;
my @values = split ' ', $line;
push @stack, @values;
}
printf "The sum is %d\n", sum @stack;
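To see the difference between storing one array reference per line and storing one value per element, Data::Dumper can print both shapes side by side. This is only a sketch: the @lines array stands in for the input, and the @nested and @flat names are chosen here for illustration.

```perl
use strict;
use warnings;
use Data::Dumper;

$Data::Dumper::Indent = 1;    # compact but still readable nesting

my @lines = ( "15 6", "2 8" );    # sample input from the question

my @nested = map { [ split ' ', $_ ] } @lines;    # one array reference per line
my @flat   = map { split ' ', $_ } @lines;        # one scalar per number

print Dumper( \@nested );    # two inner arrays of two values each
print Dumper( \@flat );      # four plain values
```

The nested form keeps track of which numbers shared a line; the flat form is easier to pass to list functions such as sum.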

#!/usr/bin/perl
while ($line = <STDIN>) {
chomp ($line);
push @stack, $line;
}
# prints each line
foreach $line (@stack) {
print "$line\n";
}
# splits each line into items using ' ' as separator
# and prints the items
foreach $line (@stack) {
@items = split / /, $line;
foreach $item (@items) {
print $item . "\n";
}
}
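One thing to watch with split / / is that it splits at every single space, so leading or doubled spaces produce empty items, whereas split ' ' skips leading whitespace and splits on runs. A small sketch with a made-up input line shows the difference:

```perl
use strict;
use warnings;

# Made-up sample line with leading and repeated spaces.
my $line = "  15   6";

my @by_single_space = split / /, $line;    # splits at every single space
my @by_whitespace   = split ' ', $line;    # skips leading space, splits on runs

print scalar(@by_single_space), " fields with / /\n";
print scalar(@by_whitespace),   " fields with ' '\n";
```

With clean input such as "15 6" both forms return the same two items, so the difference only shows up on less tidy data.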

I use 'for' for "C" style loops, and 'foreach' for iterating over lists.

#!/usr/bin/perl
use strict;
use warnings;
open IN, "< read.txt" or
die "Can't read in 'read.txt'!";
my $content = join '', <IN>;
while ($content =~ m/(\d+)/g) {
print "$1\n";
}
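The same numbers can also be collected in one step, because a global match in list context returns every capture. A small sketch, with an inline string standing in for the contents of read.txt:

```perl
use strict;
use warnings;

my $content = "15 6\n2 8\n";    # stands in for the slurped file

# In list context, a /g match returns all captured substrings at once.
my @numbers = $content =~ /(\d+)/g;

print "$_\n" for @numbers;
```

This keeps the values in an array for later use instead of printing them as they are found.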

Related

How to separate an array in Perl based on pattern

I am trying to write a big script but I am stuck on one part. I want to split an array based on "..".
From the script I got this:
print @coordinates;
gene complement(872..1288)
my desired output:
complement 872 1288
I tried:
1) my @answer = split(.., @coordinates)
print("@answer\n");
2) my @answer = split /../, @coordinates;
3) print +(split /\../)[-1],[-2],[-3] while <@coordinates>
4) foreach my $anwser ( @coordinates )
{$anwser =~ s/../"\t"/;
print $anwser;}
5) my @answer = split(/../, "complement(872..1288)"); #to see if the printed array is problematic.
which prints:
) ) ) ) ) ) ) ) )
6) my @answer = split /"gene "/, @coordinates; # I tried to "catch" the entire output's spaces and tabs
which prints
0000000000000000000000000000000001000000000100000000
But none of them works. Does anyone have any idea how to get around this issue?
P.S. Unfortunately, I can't run my script on Linux right now, so I used this website to run it. I hope this is not the reason why I didn't get my desired output.
my $RE_COMPLEMENT = qr{(complement)\((\d+)\.\.(\d+)\)}msx;
for my $item (@coordinates) {
my ($head, $i, $j) = $item =~ $RE_COMPLEMENT;
if (defined($head) && defined($i) && defined($j)) {
print("$head\t$i\t$j\n");
}
}
split operates on a scalar, not on an array.
my $string = 'gene complement(872..1288)';
my @parts = split /\.\./, $string;
print $parts[0]; # gene complement(872
print $parts[1]; # 1288)
To get the desired output, you can use a substitution:
my $string = 'gene complement(872..1288)';
$string =~ s/gene +|\)//g;
$string =~ s/\.\./ /;
$string =~ s/\(/ /;
The desired effect can be achieved as follows:
use the tr operator to replace each of '(', '.' and ')' with a space,
split the data string into elements on whitespace,
keep only the required part of the resulting list,
output the kept elements joined with tabs.
use strict;
use warnings;
use feature 'say';
my $data = <DATA>;
chomp $data;
$data =~ tr/(.)/ /;
my @elements = (split ' ', $data)[1..3];
say join "\t", @elements;
__DATA__
gene complement(872..1288)
Or as an alternative solution with only substitutions (without splitting data string into array)
use strict;
use warnings;
use feature 'say';
my $data = <DATA>;
chomp $data;
$data =~ s/gene\s+//;
$data =~ s/\)//;
$data =~ s/[(.]+/\t/g;
say $data;
__DATA__
gene complement(872..1288)
Output
complement 872 1288
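Since split takes a regular expression as its separator, the line can also be broken apart in a single call by treating whitespace, parentheses and dots as delimiters. This is a sketch against the one sample line from the question; the $label, $start and $end variable names are chosen here for illustration.

```perl
use strict;
use warnings;

my $string = 'gene complement(872..1288)';

# Split on runs of whitespace, parentheses and dots. The first field
# ("gene") is discarded by assigning it to undef.
my ( undef, $label, $start, $end ) = split /[\s().]+/, $string;

print join( "\t", $label, $start, $end ), "\n";
```

Lines that do not follow the same layout would need extra checks, for example testing that $end is defined before printing.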

Attach frequent words in text file

I have two files. The first contains frequent word sequences extracted from a text file.
a.txt :
big pizza
eat big pizza
...
the text file is
b.txt :
i eat big pizza .my big pizza ...
My problem is to add bbb between the words of each sequence from a.txt wherever that sequence occurs in b.txt, and write the result to a new file.
So the result will be
i eatbbbbigbbbpizza.my bigbbbpizza...
Below is my script. It adds bbb only between pairs of words. How can I correct this?
use strict;
use warnings;
use autodie;
my ($f1, $f2) = ('a.txt', 'b.txt');
open( my $fh, $f1 );
my @seq;
foreach ( <$fh> ) {
chomp;
s/^\s+|\s+$//g;
push @seq, $_;
}
close $fh;
open($fh, $f2);
foreach (<$fh> ) {
foreach my $r (@seq) {
my $t = $r =~ s/ /bbb/r;
if (/$r/) {
s/$r/$t/g;
}
}
print ;
}
close $fh;
All that is wrong is your line
my $t = $r =~ s/ /bbb/r;
This substitution runs just once, and so replaces only the first space with bbb.
You need to use a global substitution instead. And while we're changing this line it's best to also replace the space with \h+, which matches any amount of "horizontal space", including both tabs and spaces:
my $t = $r =~ s/\h+/bbb/gr;
As it stands, your code will also find and replace substrings of other entries in @seq if they appear earlier in the array. In this case, that means big pizza will be found first and converted to bigbbbpizza, after which eat big pizza can no longer be found. You need to first sort your array in descending order of length so that longer phrases are found before shorter ones:
@seq = sort { length($b) <=> length($a) } @seq;
Then your program will work a little better.
Here is the modified code.
use strict;
use warnings;
use autodie;
my ($f1, $f2) = ('a.txt', 'b.txt');
open(my $fh, $f1);
my @seq;
foreach (<$fh> )
{
chomp;
s/^\s+|\s+$//g;
push @seq, $_;
}
close $fh;
@seq = sort bylen @seq; # need to sort @seq by length.
open($fh, $f2);
foreach (<$fh> ) {
foreach my $r (@seq) {
my $t = $r =~ s/ /bbb/gr;
s/$r/$t/g; # you may need to take care of cases of extra spaces
}
print;
}
close $fh;
exit 0;
sub bylen {
length($b) <=> length($a);
}
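A possible variation, offered only as a sketch, is to combine all phrases into one alternation and make a single pass over each line. It also passes each phrase through quotemeta so that characters such as "." or "+" are matched literally rather than as regex syntax. The glue_phrases helper name and its argument order are assumptions made for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: joins the words of every known phrase found in $text
# with $glue, using one combined pattern.
sub glue_phrases {
    my ( $text, $glue, @phrases ) = @_;
    return $text unless @phrases;

    # Longest first, so that a longer phrase is preferred over a shorter
    # one starting at the same position.
    my @sorted = sort { length($b) <=> length($a) } @phrases;

    # quotemeta keeps regex metacharacters inside a phrase literal.
    my $alternation = join '|', map { quotemeta } @sorted;

    # For each match, split it into words and rejoin them with the glue.
    $text =~ s{($alternation)}{ join $glue, split ' ', $1 }ge;
    return $text;
}

print glue_phrases( "i eat big pizza .my big pizza ...",
    'bbb', 'big pizza', 'eat big pizza' ), "\n";
```

Because each part of the line is matched only once, text that has already been glued is never rescanned by a shorter phrase.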

Counting and printing location of duplicate words in a line using Perl

I am trying to read from a file and print out the location of duplicate words on each line. I have stored each line in an array, but I am not sure if this is the right way to start.
while (my $fileLine = <$fh>){
my @lineWords = split /\s+/, $fileLine;
print "@\n"
}
#!/usr/bin/perl
use strict;
use warnings;
while (<DATA>){
chomp; # remove end of line chars
my @wordsInLine = split /\s+/, $_;
@wordsInLine = map {lc($_)} @wordsInLine; # convert words to lowercase
my( $word, %wordsInLine, $n );
for $word (@wordsInLine) {
$wordsInLine{$word}++; # use hash %wordsInLine to count occurrences of words
}
for $word (@wordsInLine) {
$n++;
if( (my $count = $wordsInLine{$word}||0) > 1 ) {
print "line $.: Word $n \"$word\" is repeated $count times\n";
delete($wordsInLine{$word}); # do not generate more than one report
# about the same word in single line
}
}
}
__DATA__
This this is a sample sentence
A that That THAT !
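The script above reports the position of the first occurrence of each repeated word. If every position is wanted, one option is a hash of arrays keyed by the lowercased word. This is a sketch under that assumption; the duplicate_positions helper name is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a hash reference mapping each repeated word
# (lowercased) to the list of its 1-based positions in the line.
sub duplicate_positions {
    my ($line) = @_;
    my %positions;
    my $n = 0;
    for my $word ( map { lc } split ' ', $line ) {
        push @{ $positions{$word} }, ++$n;
    }

    # Keep only the words that were seen more than once.
    my @single = grep { @{ $positions{$_} } < 2 } keys %positions;
    delete @positions{@single};
    return \%positions;
}

my $dups = duplicate_positions("This this is a sample sentence");
for my $word ( sort keys %$dups ) {
    print "\"$word\" appears at positions @{ $dups->{$word} }\n";
}
```

Calling the helper once per input line, together with $. for the line number, would give a per-line report similar to the one above.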

Perl: Grep unique value

Basically I wanted to emulate the piped grep operation as we do in shell script, (grep pattern1 |grep pattern2) in my Perl code to make the result unique.
The code below works, but I just wanted to know whether this is the right approach. Please note that I don't want to introduce an inner loop here just for the grep part.
foreach my $LINE ( @ARRAY1 ) {
@LINES = split /\s+/, $LINE;
@RESULT= grep ( /$LINES[0]/, ( grep /$LINES[1]/, @ARRAY2 ) );
...
This is basically the same thing as what you're doing: "for every @ARRAY2 element, check whether it matches ALL elements from @LINES" (stopping as soon as any @LINES element does not match):
use List::Util "none";
my @RESULT= grep { my $s = $_; none { $s !~ /$_/ } @LINES } @ARRAY2;
# alternative: index() is faster for literal values
my @RESULT= grep { my $s = $_; none { index($s, $_) <0 } @LINES } @ARRAY2;
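The double negative above (none of the patterns fails to match) can also be phrased positively with all from List::Util. The following is a sketch with made-up sample data; @patterns and @array2 stand in for @LINES and @ARRAY2, and \Q...\E is added on the assumption that the fields are literal text rather than regular expressions.

```perl
use strict;
use warnings;
use List::Util qw(all);

# Made-up sample data standing in for @LINES and @ARRAY2.
my @patterns = ( 'foo', 'bar' );
my @array2   = ( 'foo bar baz', 'foo only', 'bar then foo', 'neither' );

# Keep the elements that match every pattern. \Q...\E treats each
# pattern as literal text rather than as a regular expression.
my @result = grep {
    my $s = $_;
    all { $s =~ /\Q$_\E/ } @patterns
} @array2;

print "$_\n" for @result;
```

Like none, all stops checking an element as soon as the outcome is known, so it does the same amount of work.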
There is no need to cascade calls to grep -- you can simply "and" the conditions together.
It's also worth saying that you should be using lower-case letters for your identifiers, and split /\s+/ should almost always be split ' '.
Here's what I would write:
for my $line ( @array1 ) {
my @fields = split ' ', $line;
my @result = grep { /$fields[0]/ and /$fields[1]/ } @array2;
...
}
There are different ways to grep/extract unique values from an array in Perl.
##2) Best of all
use Data::Dumper;
my %hash = map { $_ , 1 } @array;
my @uniq = keys %hash;
print "\n Uniq Array:", Dumper(\@uniq);
##3) Costly process as it involves 'greping'
my %saw;
my @out = grep(!$saw{$_}++, @array);
print "\n Uniq Array: @out \n";
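One practical difference between the two approaches is ordering: keys returns the unique values in no particular order, while the grep form keeps the first occurrence of each value in input order. A short sketch with made-up sample data:

```perl
use strict;
use warnings;

my @array = qw( pear apple pear fig apple );    # made-up sample data

# keys %hash returns the unique values in no particular order.
my %hash      = map { $_ => 1 } @array;
my @unordered = keys %hash;

# The %seen idiom keeps the first occurrence of each value, in input order.
my %seen;
my @ordered = grep { !$seen{$_}++ } @array;

print "@ordered\n";
```

Recent versions of List::Util also export a uniq function that preserves input order in the same way.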

Can't retrieve values from hash reversal (Perl)

I've initialized a hash with Names and their class ranking as follows
a=>5,b=>2,c=>1,d=>3,e=>5
I've this code so far
my %Ranks = reverse %Class; #As I need to find out who's ranked first
print "\nFirst place goes to.... ", $Ranks{1};
The code only prints out
"First place goes to...."
I want it to print out
First place goes to....c
Could you tell me where I'm going wrong here?
The class hash prints correctly,
but if I try to print the reversed hash using
foreach $t (keys %Ranks) {
print "\n $t $Ranks{$t}"; }
It prints
5
abc23
cab2
ord
If this helps in any way
FULL CODE
#Script to read from the data file and initialize it into a hash
my %Code;
my %Ranks;
#Check whether the file exists
open(fh, "Task1.txt") or die "The File Does Not Exist!\n", $!;
while (my $line = <fh>) {
chomp $line;
my @fields = split /,/, $line;
$Code{$fields[0]} = $fields[1];
$Class{$fields[0]} = $fields[2];
}
close(fh);
#Prints the dataset
print "Code \t Name\n";
foreach $code ( keys %Code) {
print "$code \t $Code{$code}\n";
}
#Find out who comes first
my %Ranks = reverse %Class;
foreach $t (keys %Ranks)
{
print "\n $t $Ranks{$t}";
}
print "\nFirst place goes to.... ", $Ranks{1}, "\n";
When you want to check what your data structures actually contain, use Data::Dumper:
use Data::Dumper; local $Data::Dumper::Useqq = 1; print(Dumper(\%Class));
You'll find un-chomped newlines.
You need to use chomp. At present your $fields[2] value has a trailing newline.
Change your file read loop to this
while (my $line = <fh>) {
chomp $line;
my @fields = split /,/, $line;
$Code{$fields[0]} = $fields[1];
$Class{$fields[0]} = $fields[2];
}
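Two further points may be worth checking, neither of which is confirmed by the question. First, reverse keeps only one name per rank, and the sample data has two names with rank 5. Second, chomp removes "\n" by default, so a file with "\r\n" line endings read on a Unix-like system would still leave a "\r" on the last field. The sketch below addresses both; the strip_eol helper name is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: removes a trailing "\n" or "\r\n" from a line.
sub strip_eol {
    my ($line) = @_;
    $line =~ s/\r?\n\z//;
    return $line;
}

# Sample data from the question: name => rank.
my %Class = ( a => 5, b => 2, c => 1, d => 3, e => 5 );

# Collect the names for each rank in an array so that ties are not lost.
my %Ranks;
for my $name ( sort keys %Class ) {
    push @{ $Ranks{ $Class{$name} } }, $name;
}

print "First place goes to.... @{ $Ranks{1} }\n";
print "Rank 5 is shared by: @{ $Ranks{5} }\n";
```

In the read loop, $line = strip_eol($line) could take the place of chomp $line if the input file might come from Windows.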