Not getting output while creating Hash of Arrays - perl

I am trying to create hash of arrays. I am taking data from a txt file and converting this into hash of arrays.
Txt file data is as below
group1 : usr1 usr4 usr6
group2 : usr2 usr1 usr5
group3 : usr1 usr2 usr3
so on ......
I want to convert this into a hash of arrays like
%hash = (group1 => [qw(usr1 usr4 usr6)], group2 => [qw(usr2 usr1 usr5)]);
This is the code I am trying:
%hash = ();
open (FH, "2.txt") or die "file not found";
while (<FH>) {
@array = split (":", $_);
$array[1] =~ s/^\s*//;
$array[1] =~ s/\s*$//;
@arrayRef = split (" ", $array[1]);
$hash{$array[0]} = [ @arrayRef ];
#print @array;
#print "\n";
}
close FH;
print $hash{group1}[0];
print @{ $hash{group2}};
I am not getting any output. There is something wrong in the code. Please help me understand it better.

Your code works for me, but the problem is that you are using the key "group1 " (note the trailing space), not "group1" as you think. When you split on the colon, you remember to strip whitespace from the field after it, but not from the field before it. You should probably do:
my @array = split /\s*:\s*/, $_;
Also, you should always use
use strict;
use warnings;
Coding without these two pragmas is difficult and takes much longer.
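To make the trailing-space problem visible, here is a minimal sketch that parses one sample line both ways. The helper names split_naive and split_trimmed are made up for this illustration and are not part of the original code.

```perl
use strict;
use warnings;

# Hypothetical helpers for illustration only.

# Splits on a bare colon, so the key keeps its trailing space.
sub split_naive {
    my ($line) = @_;
    my @fields = split /:/, $line;
    return $fields[0];
}

# Splits on a colon plus any surrounding whitespace, so the key is clean.
sub split_trimmed {
    my ($line) = @_;
    my @fields = split /\s*:\s*/, $line;
    return $fields[0];
}

my $line = "group1 : usr1 usr4 usr6\n";
print "[", split_naive($line),   "]\n";   # brackets expose the trailing space
print "[", split_trimmed($line), "]\n";   # key is exactly "group1"
```

Printing keys between brackets like this is a quick way to spot stray whitespace when a hash lookup unexpectedly returns nothing.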

use strict;
use warnings;
my %hash;
open (my $FH, "<", "2.txt") or die $!;
while (<$FH>) {
my ($key, @array) = split /[:\s]+/, $_;
$hash{$key} = \@array;
}
close $FH;
use Data::Dumper;
print Dumper \%hash;

Related

Perl: Read columns and convert to array

I am new to perl, trying to read a file with columns and creating an array.
I am having a file with following columns.
file.txt
A 15
A 20
A 33
B 20
B 45
C 32
C 78
I want to create an array for each unique item in the first column, holding its values from the second column.
eg:
@A = (15,20,33)
@B = (20,45)
@C = (32,78)
Tried following code, only for printing 2 columns
use strict;
use warnings;
my $filename = $ARGV[0];
open(FILE, $filename) or die "Could not open file '$filename' $!";
my %seen;
while (<FILE>)
{
chomp;
my $line = $_;
my @elements = split (" ", $line);
my $row_name = join "\t", @elements[0,1];
print $row_name . "\n" if ! $seen{$row_name}++;
}
close FILE;
Thanks
Firstly some general Perl advice. These days, we like to use lexical variables as filehandles and pass three arguments to open().
open(my $fh, '<', $filename) or die "Could not open file '$filename' $!";
And then...
while (<$fh>) { ... }
But, given that you have your filename in $ARGV[0], another tip is to use an empty file input operator (<>) which will return data from the files named in @ARGV without you having to open them. So you can remove your open() line completely and replace the while with:
while (<>) { ... }
Second piece of advice - don't store this data in individual arrays. Far better to store it in a more complex data structure. I'd suggest a hash where the key is the letter and the value is an array containing all of the numbers matching that letter. This is surprisingly easy to build:
use strict;
use warnings;
use feature 'say';
my %data; # I'd give this a better name if I knew what your data was
while (<>) {
chomp;
my ($letter, $number) = split; # splits $_ on whitespace by default
push @{ $data{$letter} }, $number;
}
# Walk the hash to see what we've got
for (sort keys %data) {
say "$_ : @{ $data{$_} }";
}
Change the loop to be something like:
while (my $line = <FILE>)
{
chomp($line);
my @elements = split (" ", $line);
push(@{$seen{$elements[0]}}, $elements[1]);
}
This will create/append a list of each item as it is found, and result in a hash where the keys are the left items, and the values are lists of the right items. You can then process or reassign the values as you wish.
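As a rough sketch of the hash-of-arrays approach described in both answers, the snippet below groups a few in-memory lines instead of reading a file (an assumption made to keep it self-contained). The helper name group_by_first_column is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: groups second-column values by first-column name.
sub group_by_first_column {
    my @lines = @_;
    my %grouped;
    for my $line (@lines) {
        my ($name, $value) = split ' ', $line;   # ' ' also discards the newline
        next unless defined $value;              # skip blank or short lines
        push @{ $grouped{$name} }, $value;       # autovivifies the array ref
    }
    return %grouped;
}

my %grouped = group_by_first_column("A 15\n", "A 20\n", "B 45\n");
for my $name (sort keys %grouped) {
    print "$name: ", join(",", @{ $grouped{$name} }), "\n";
}
```

Each value in the hash is an array reference, so @{ $grouped{A} } gives back the list that the question wanted to store in a separate @A array.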

How to randomly pair items in a list

I have a list of Accession numbers that I want to pair randomly using a Perl script below:
#!/usr/bin/perl -w
use List::Util qw(shuffle);
my $file = 'randomseq_acc.txt';
my @identifiers = map { (split /\n/)[1] } <$file>;
chomp @identifiers;
#Shuffle them and put in a hash
@identifiers = shuffle @identifiers;
my %pairs = (@identifiers);
#print the pairs
for (keys %pairs) {
print "$_ and $pairs{$_} are partners\n";
but keep getting errors.
The accession numbers in the file randomseq_acc.txt are:
1094711
1586007
2XFX_C
Q27031.2
P22497.2
Q9TVU5.1
Q4N4N8.1
P28547.2
P15711.1
AAC46910.1
AAA98602.1
AAA98601.1
AAA98600.1
EAN33235.2
EAN34465.1
EAN34464.1
EAN34463.1
EAN34462.1
EAN34461.1
EAN34460.1
I needed to add the closing right curly brace to be able to compile the script.
As arrays are indexed from 0, (split /\n/)[1] returns the second field, i.e. what follows newline on each line (i.e. nothing). Change it to [0] to make it work:
my @identifiers = map { (split /\n/)[0] } <$file>; # Still wrong.
The diamond operator needs a file handle, not a file name. Use open to associate the two:
open my $FH, '<', $file or die $!;
my @identifiers = map { (split /\n/)[0] } <$FH>;
Using split to remove a newline is not common. I'd probably use something else:
map { /(.*)/ } <$FH>
# or
map { chomp; $_ } <$FH>
# or, thanks to ikegami
chomp(my @identifiers = <$FH>);
So, the final result would be something like the following:
#!/usr/bin/perl
use warnings;
use strict;
use List::Util qw(shuffle);
my $filename = '...';
open my $FH, '<', $filename or die $!;
chomp(my @identifiers = <$FH>);
my %pairs = shuffle(@identifiers);
print "$_ and $pairs{$_} are partners\n" for keys %pairs;
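The pairing step can also be tried without a file. The sketch below assumes an even number of unique identifiers (an odd count would leave one key without a partner), and the helper name random_pairs is invented for this example.

```perl
use strict;
use warnings;
use List::Util qw(shuffle);

# Hypothetical helper: shuffles the identifiers and pairs them two by two.
# Assumes an even number of unique identifiers.
sub random_pairs {
    my @ids   = shuffle @_;
    my %pairs = @ids;        # consecutive items become key => value
    return %pairs;
}

my %pairs = random_pairs(qw(1094711 1586007 2XFX_C Q27031.2));
print "$_ and $pairs{$_} are partners\n" for sort keys %pairs;
```

Because the result is random, the partners differ from run to run; only the number of pairs is fixed.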

Extraction and printing of key-value pair from a text file using Perl

I have a text file temp.txt which contains entries like,
cinterim=3534
cstart=517
cstop=622
ointerim=47
ostart=19
ostop=20
Note: key-value pairs may be arranged in new line or all at once in one line separated by space.
I am trying to print and store these values in DB for corresponding keys using Perl. But I am getting many errors and warnings. Right now I am just trying to print those values.
use strict;
use warnings;
open(FILE,"/root/temp.txt") or die "Unable to open file:$!\n";
while (my $line = <FILE>) {
# optional whitespace, KEY, optional whitespace, required ':',
# optional whitespace, VALUE, required whitespace, required '.'
$line =~ m/^\s*(\S+)\s*:\s*(.*)\s+\./;
my @pairs = split(/\s+/,$line);
my %hash = map { split(/=/, $_, 2) } @pairs;
printf "%s,%s,%s\n", $hash{cinterim}, $hash{cstart}, $hash{cstop};
}
close(FILE);
Could somebody provide help to refine my program.
use strict;
use warnings;
open my $fh, '<', '/root/temp.txt' or die "Unable to open file:$!\n";
my %hash = map { split /=|\s+/; } <$fh>;
close $fh;
print "$_ => $hash{$_}\n" for keys %hash;
What this code does:
<$fh> reads a line from our file, or in list context, all lines, and returns them as a list.
Inside map we split each line into a list using the regexp /= | \s+/x. This means: split when you see a = or a sequence of whitespace characters. This is just a condensed and beautified form of your original code.
Then, we assign the list resulting from map to the hash. We can do that because the item count of the list is even. (Input like key key=value or key=value=value yields an odd number of items, so under use warnings Perl reports "Odd number of elements in hash assignment" and the keys and values end up mispaired.)
After that, we print the hash out. In Perl, we can interpolate hash values inside strings directly and don't have to use printf and friends except for special formatting.
The for loop iterates over all keys (returned in the $_ special variable), and $hash{$_} is the corresponding value. This could also have been written as
while (my ($key, $val) = each %hash) {
print "$key => $val\n";
}
where each iterates over all key-value pairs.
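The behaviour with well-formed and malformed input can be checked with a small sketch. It parses a single string rather than a file, and the helper name parse_pairs is hypothetical; the warning is captured through $SIG{__WARN__} so it can be inspected.

```perl
use strict;
use warnings;

# Hypothetical helper: builds a hash from one line and collects any warnings.
sub parse_pairs {
    my ($line) = @_;
    my @warnings;
    local $SIG{__WARN__} = sub { push @warnings, @_ };
    my %hash = map { split /=|\s+/ } $line;   # same split as in the answer
    return (\%hash, \@warnings);
}

my ($ok, $ok_warn) = parse_pairs("cstart=517 cstop=622\n");
print "pairs: ", scalar(keys %$ok), ", warnings: ", scalar(@$ok_warn), "\n";

# Three items instead of four: Perl warns and the pairing goes wrong.
my ($bad, $bad_warn) = parse_pairs("key key=value\n");
print "warning: $bad_warn->[0]" if @$bad_warn;
```

If malformed lines are possible in real input, validating each pair before assigning it (as the while /.../g variant further down does) is the safer design.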
Try this
use warnings;
my %data = ();
open FILE, '<', 'file1.txt' or die $!;
while(<FILE>)
{
chomp;
$data{$1} = $2 while /\s*(\S+)=(\S+)/g;
}
close FILE;
print $_, '-', $data{$_}, $/ for keys %data;
The simplest way is to slurp the entire file into memory and assign key/value pairs to the hash using a regular expression.
This program shows the technique
use strict;
use warnings;
my %data = do {
open my $fh, '<', '/root/temp.txt' or die $!;
local $/;
<$fh> =~ /(\w+)\s*=\s*(\w+)/g;
};
use Data::Dump;
dd \%data;
output
{
cinterim => 3534,
cstart => 517,
cstop => 622,
ointerim => 47,
ostart => 19,
ostop => 20,
}

Perl's Chomp: Chomp is removing the whole word instead of the newline

I am facing issues with perl chomp function.
I have a test.csv as below:
col1,col2
vm1,fd1
vm2,fd2
vm3,fd3
vm4,fd4
I want to print the 2nd field of this csv. This is my code:
#!/usr/bin/perl -w
use strict;
my $file = "test.csv";
open (my $FH, '<', $file);
my @array = (<$FH>);
close $FH;
foreach (@array)
{
my @row = split (/,/,$_);
my $var = chomp ($row[1]); ### <<< this is the problem
print $var;
}
The output of the above code is:
11111
I really don't know where the "1" is coming from. Actually, the last field can be printed as below:
foreach (@array)
{
my @row = split (/,/,$_);
print $row[1]; ### << Note that I am not printing "\n"
}
the output is:
vm_cluster
fd1
fd2
fd3
fd4
Now, I am using these field values as input to the DB, and the DB INSERT statement is failing due to this invisible newline. So I thought chomp would help me here. Instead of chomping, it gives me "11111".
Could you help me understand what I am doing wrong here?
Thanks.
Adding more information after reading loldop's response:
If I write it as below, then it will not print anything (not even the "11111" output mentioned above)
foreach (@array)
{
my @row = split (/,/,$_);
chomp ($row[1]);
my $var = $row[1];
print $var;
}
Meaning, chomp is removing the last string and the trailing new line.
The reason you see only a string of 1s is that you are printing the value of $var, which is the value returned from chomp. chomp doesn't return the trimmed string; it modifies its parameter in place and returns the number of characters removed from the end. Since it removes exactly one "\n" character from each line here, you get a 1 output for each element of the array.
You really should use warnings instead of the -w command-line option, and there is no reason here to read the entire file into an array. But well done on using a lexical filehandle with the three-parameter form of open.
Here is a quick refactoring of your program that will do what you want.
#!/usr/bin/perl
use strict;
use warnings;
my $file = 'test.csv';
open my $FH, '<', $file or die qq(Unable to open "$file": $!);
while (<$FH>) {
chomp;
my @row = split /,/;
print $row[1], "\n";
}
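The return value of chomp described above can be checked directly. This is a minimal sketch; the helper name chomp_demo is made up for the illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: returns chomp's return value and the modified string.
sub chomp_demo {
    my ($text) = @_;
    my $removed = chomp $text;   # chomp edits $text in place
    return ($removed, $text);
}

my ($removed, $field) = chomp_demo("fd1\n");
print "removed=$removed field=$field\n";   # the count first, then the trimmed field
```

So the pattern to use is chomp on the variable first, then read the variable, rather than assigning the result of chomp.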
Although, it was my fault at the beginning.
The chomp function returns 1 here, which is the result of calling the function, not the chomped string.
You can also find a bad example below, but it only works if you use numbers.
Sometimes I use this cheat (don't do that! it is my bad-hack code!)
map{/filter/ && $_;}@all_to_filter;
Instead of this, use
grep{/filter/}@all_to_filter;
foreach (@array)
{
my @row = split (/,/,$_);
my $var = chomp ($row[1]) * $row[1]; ### this is bad code!
print $var;
}
foreach (@array)
{
my @row = split (/,/,$_);
chomp ($row[1]);
my $var = $row[1];
print $var;
}
If you simply want to get rid of new lines you can use a regex:
my $var = $row[1];
$var=~s/\n//g;
So, I was quite frustrated with this easy-looking task bugging me the whole day. I really appreciate everyone who responded.
Finally I ended up using the Text::CSV Perl module and accessing each CSV row as an array reference. There was no need to run chomp after using Text::CSV.
Here is the code:
#!/usr/bin/perl
use warnings;
use strict;
use Text::CSV;
my $csv = Text::CSV->new ( { binary => 1 } ) # should set binary attribute.
or die "Cannot use CSV: ".Text::CSV->error_diag ();
open my $fh, "<:encoding(utf8)", "vm.csv" or die "vm.csv: $!";
<$fh>; ## this is to remove the column headers.
while ( my $row = $csv->getline ($fh) )
{
print $row->[1];
}
and here is the output:
fd1fd2fd3fd4
Later I pulled these individual values and inserted them into the DB.
Thanks everyone.

How can I check if a value is in a list in Perl?

I have a file in which every line is an integer which represents an id. What I want to do is just check whether some specific ids are in this list.
But the code didn't work. It never tells me it exists even if 123 is a line in that file. I don't know why? Help appreciated.
open (FILE, "list.txt") or die ("unable to open !");
my @data=<FILE>;
my %lookup =map {chop($_) => undef} @data;
my $element= '123';
if (exists $lookup{$element})
{
print "Exists";
}
Thanks in advance.
You want to ensure you make your hash correctly. The very outdated chop isn't what you want to use. Use chomp instead, and use it on the entire array at once and before you create the hash:
open my $fh, '<', 'list.txt' or die "unable to open list.txt: $!";
chomp( my @data = <$fh> );
my %lookup = map { $_ => 1 } @data;
With Perl 5.10 and up, you can also use the smart match operator (note that it has been marked experimental since Perl 5.18, so newer perls may warn about it):
my $id = get_id_to_check_for();
open my $fh, '<', 'list.txt' or die "unable to open list.txt: $!";
chomp( my @data = <$fh> );
print "Id found!" if $id ~~ @data;
perldoc -q contain
chop returns the character it chopped, not what was left behind. You perhaps want something like this:
my %lookup = map { substr($_,0,-1) => undef } @data;
However, generally, you should consider using chomp instead of chop to do a more intelligent CRLF removal, so you'd end up with a line like this:
my %lookup =map {chomp; $_ => undef } @data;
Your problem is that chop returns the character chopped, not the resulting string, so you're creating a hash with a single entry for newline. This would be obvious in debugging if you used Data::Dumper to output the resulting hash.
Try this instead:
my @data=<FILE>;
chomp @data;
my %lookup = map {$_ => undef} @data;
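The difference between the two approaches can be seen side by side in a small sketch. It uses in-memory lines instead of list.txt (an assumption to keep it self-contained), and the helper names lookup_with_chop and lookup_with_chomp are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helpers comparing the two ways of building the lookup hash.

# Mirrors the question: chop's return value (the removed character) becomes the key.
sub lookup_with_chop {
    my @lines  = @_;
    my %lookup = map { chop($_) => undef } @lines;
    return %lookup;
}

# Mirrors the fix: strip newlines first, then use the lines themselves as keys.
sub lookup_with_chomp {
    my @lines = @_;
    chomp @lines;
    my %lookup = map { $_ => undef } @lines;
    return %lookup;
}

my @data = ("123\n", "456\n");
my %bad  = lookup_with_chop(@data);
my %good = lookup_with_chomp(@data);
print "chop:  ", (exists $bad{123}  ? "found" : "missing"), "\n";
print "chomp: ", (exists $good{123} ? "found" : "missing"), "\n";
```

With chop, every line contributes the same "\n" key, so the hash ends up with a single entry and the id is never found.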
This should work... it uses first in List::Util to do the searching, and eliminates the initial map (this is assuming you don't need to store the values for something else immediately after). The chomp is done while searching for the value; see perldoc -f chomp.
use List::Util 'first';
open (my $fh, 'list.txt') or die 'unable to open list.txt!';
my @elements = <$fh>;
my $element = '123';
if (first { chomp; $_ eq $element } @elements)
{
print "Exists";
}
This one may not exactly match your specific problem, but if your integer numbers need to be counted, you might even use the good old "canonical" Perl approach:
open my $fh, '<', 'list.txt' or die "unable to open list.txt: $!";
my %lookup;
while( <$fh> ) { chomp; $lookup{$_}++ } # this will count occurrences of each int
my $element = '123';
if( exists $lookup{$element} ) {
print "$element $lookup{$element} times there\n"
}
In some circumstances this might even be faster than solutions that use an intermediate array.
Regards
rbo