Read file into two hashes with Perl - perl

I'm struggling to understand how to read a simple text file into two Perl hashes.
I have a text file like:
George Washington
John Adams
Abraham Lincoln
and I want to create two hashes, one that holds the first names and the other that holds the last names.
I'm looking at doing something like:
my %first;
my %last;
open(my $FH, '<', $file) or die$!;
my $count = 1;
while (<$FH>)
{
chomp;
if count is odd, add to %first
elsif count is even, add to %last
}
close($FH);
but I'm honestly lost. Does anyone have any ideas?

You can get the desired result with the following code.
use strict;
use warnings;
use feature 'say';
use Data::Dumper;
my $count = 0;
my %first;
my %last;
while(<DATA>) {
chomp;
my($f,$l) = split;
$first{$f} = $count;
$last{$l} = $count;
$count++;
}
say Dumper(\%first);
say Dumper(\%last);
__DATA__
George Washington
John Adams
Abraham Lincoln
Output
$VAR1 = {
'George' => 0,
'Abraham' => 2,
'John' => 1
};
$VAR1 = {
'Adams' => 1,
'Lincoln' => 2,
'Washington' => 0
};

Related

Perl Hash Count

I have a table listing users and the gender of their kids on separate lines.
lilly boy
lilly boy
jane girl
lilly girl
jane boy
I want a script to parse the lines and give me a total at the end:
lilly boys=2 girls=1
jane boys=1 girls=1
I tried this with a hash, but I don't know how to approach it:
foreach $lines (@all_lines){
if ($lines =~ /(.+?)\s(.+)/){
$person = $1;
if ($2 =~ /boy/){
$boycount=1;
$girlcount=0;
}
if ($2 =~ /girl/){
$boycount=0;
$girlcount=1;
}
The next part is: if the person doesn't already exist in the hash, add the person and then start a count for boy and girl. (I think this is the correct way, but I'm not sure.)
if (!$hash{$person}){
%hash = (
'$person' => [
{'boy' => "0+$boycount", 'girl' => "0+$girlcount"}
],
);
Now, I don't know how to keep updating the values inside the hash if the person already exists in it.
%hash = (
'$person' => [
{'boys' => $boyscount, 'girls' => $girlscount}
],
);
I am not sure how to keep updating the hash.
You just need to study the Perl Data Structures Cookbook
use strict;
use warnings;
my %person;
while (<DATA>) {
chomp;
my ($parent, $gender) = split;
$person{$parent}{$gender}++;
}
use Data::Dump;
dd \%person;
__DATA__
lilly boy
lilly boy
jane girl
lilly girl
jane boy
use strict;
use warnings;
my %hash;
open my $fh, '<', 'table.txt' or die "Unable to open table: $!";
# Aggregate stats:
while ( my $line = <$fh> ) { # Loop over record by record
chomp $line; # Remove trailing newlines
# split is a better tool than regexes to get the necessary data
my ( $parent, $kid_gender ) = split /\s+/, $line;
$hash{$parent}{$kid_gender}++; # Increment by one
# Take advantage of auto-vivification
}
# Print stats:
for my $parent ( keys %hash ) {
# Default to 0 so a parent with no boys (or no girls) prints without warnings
printf "%s boys=%d girls=%d\n",
$parent, $hash{$parent}{boy} // 0, $hash{$parent}{girl} // 0;
}

GD::Graph with Perl

I have data for each and every student, e.g
Student Name Score
Jack 89
Jill 70
Sandy 40
Now I'm trying to plot these in a bar chart using GD::Graph::Bar, but since I'm pretty new to perl and modules, I see that I can manually declare all the X and Y values from the chart to be plotted.
But since I don't know the names and scores of each student in advance (they are pulled from a text file),
I want to supply the values automatically.
I was thinking hash keys and values would be a good approach, so I placed everything in a hash table, %hash(student name)=(score)
Can anyone help me plot this as a bar chart or guide me? Or would you recommend a different approach?
Thanks
Update
This is the part where I can plot the graph manually by entering the student names.
my $graph = GD::Graph::bars->new(800, 800);
@data = (
["Jack","Jill"],
['30','50'],
);
$graph->set(
x_label => 'Students',
y_label => 'Scores',
title => 'Student Vs. Scores',
y_max_value => 60,
y_tick_number => 8,
y_label_skip => 2
) or die $graph->error;
my $gd = $graph->plot(\@data) or die $graph->error;
open(IMG, '>file.png') or die $!;
binmode IMG;
print IMG $gd->png;
Assuming your data file is as follows, using tab delimiters.
Student Name Score
Jack 89
Jill 70
Sandy 40
You could do something like this, pushing your x axis and y axis values from your data file to arrays.
use strict;
use warnings;
use CGI qw( :standard );
use GD::Graph::bars;
open my $fh, '<', 'data.txt' or die $!;
my (@x, @y);
while (<$fh>) {
next if $. == 1; # skip header line
chomp; # drop the newline so it doesn't end up in the score
my ($name, $score) = split /\t/;
push @x, $name; # 'Student Name' column goes into @x
push @y, $score; # 'Score' column goes into @y
}
close $fh;
my $graph = GD::Graph::bars->new(800, 800);
$graph->set(
x_label => 'Students',
y_label => 'Scores',
title => 'Student Vs. Scores',
) or warn $graph->error;
my @data = (\@x, \@y);
$graph->plot(\@data) or die $graph->error();
print header(-type=>'image/jpeg'), $graph->gd->jpeg;
If you want to plot multiple y-axis series, for example from a further tab-delimited column named Score2, you could do something like this.
my (@x, @y, @y2);
while (<$fh>) {
next if $. == 1;
chomp; # keep the newline out of the last column
push @x, (split /\t/)[0];
push @y, (split /\t/)[1];
push @y2, (split /\t/)[2];
}
And change your @data array to:
my @data = (\@x, \@y, \@y2);
According to the documentation, you need to pass an array of arrays to the plot method of GD::Graph::bars. It sounds like you already have a hash so you need to convert it to an array of arrays. There are a number of ways to do this, but here's an example:
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;
my %hash = (
Larry => 15,
Curly => 16,
Moe => 20
);
my (@names, @scores);
while (my ($name, $score) = each %hash) {
push @names, $name;
push @scores, $score;
}
my @data = (\@names, \@scores);
print Dumper(\@data);
# $VAR1 = [
# [
# 'Moe',
# 'Curly',
# 'Larry'
# ],
# [
# 20,
# 16,
# 15
# ]
# ];
However you do it, make sure you preserve the order in the inner arrays.
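One way to keep the two inner arrays aligned is to fix the key order once and derive both arrays from it. The following is a minimal sketch using the same sample hash as above; sorting alphabetically is just an assumption, and any stable ordering would do.

```perl
use strict;
use warnings;

my %hash = ( Larry => 15, Curly => 16, Moe => 20 );

# Sort the keys once, then look up the scores in that same order,
# so names and scores stay aligned by index.
my @names  = sort keys %hash;
my @scores = @hash{@names};    # hash slice: values in @names order
my @data   = ( \@names, \@scores );

print "$names[$_] => $scores[$_]\n" for 0 .. $#names;
```

The resulting @data has the array-of-arrays shape that the plot method expects, with a predictable bar order on the x axis.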
I adapted the code from the samples directory in GD::Graph:
use warnings;
use strict;
use GD::Graph::bars;
use GD::Graph::Data;
my %students = (
Jack => 89,
Jill => 70,
Sandy => 40,
);
my @scores;
my @names;
for (keys %students) {
push @names, $_;
push @scores, $students{$_};
}
my $data = GD::Graph::Data->new([
[@names],
[@scores],
]) or die GD::Graph::Data->error;
my $my_graph = GD::Graph::bars->new();
$my_graph->set(
x_label => 'Name',
y_label => 'Score',
title => 'A Simple Bar Chart',
) or warn $my_graph->error;
$my_graph->plot($data) or die $my_graph->error();
save_chart($my_graph, 'graph');
sub save_chart {
my $chart = shift or die "Need a chart!";
my $name = shift or die "Need a name!";
local(*OUT);
my $ext = $chart->export_format;
open(OUT, ">$name.$ext") or
die "Cannot open $name.$ext for write: $!";
binmode OUT;
print OUT $chart->gd->$ext();
close OUT;
}

How to print an array that it looks like a hash [duplicate]

I am new to Perl and have to write code that reads the contents of a file into an array and prints output that looks like a hash. Here is an example entry:
my %amino_acids = (F => ["Phenylalanine", "Phe", ["TTT", "TTC"]])
The output should be in exactly the above format.
The lines of the file look like this:
"Methionine":"Met":"M":"AUG":"ATG"
"Phenylalanine":"Phe":"F":"UUU, UUC":"TTT, TTC"
"Proline":"Pro":"P":"CCU, CCC, CCA, CCG":"CCT, CCC, CCA, CCG"
I have to take the last group of codons (after the final colon) and ignore the first group.
Is it your intention to build the equivalent hash? Or do you really want the string format? This program uses Text::CSV to build the hash from the file and then dumps it using Data::Dump so that you have the string format as well.
use strict;
use warnings;
use Text::CSV;
use Data::Dump 'dump';
my $csv = Text::CSV->new({ sep_char => ':' });
open my $fh, '<', 'amino.txt' or die $!;
my %amino_acids;
while (my $data= $csv->getline($fh)) {
$amino_acids{$data->[2]} = [
$data->[0],
$data->[1],
[ $data->[4] =~ /[A-Z]+/g ]
];
}
print '$amino_acids = ', dump \%amino_acids;
output
$amino_acids = {
F => ["Phenylalanine", "Phe", ["TTT", "TTC"]],
M => ["Methionine", "Met", ["ATG"]],
P => ["Proline", "Pro", ["CCT", "CCC", "CCA", "CCG"]],
}
Update
If you really don't want to install modules (it is a very straightforward process and makes the code much more concise and reliable) then this does what you need.
use strict;
use warnings;
open my $fh, '<', 'amino.txt' or die $!;
print "my %amino_acids = (\n";
while (<$fh>) {
chomp;
my @data = /[^:"]+/g;
my @codons = $data[4] =~ /[A-Z]+/g;
printf qq{ %s => ["%s", "%s", [%s]],\n},
@data[2,0,1],
join ', ', map qq{"$_"}, @codons;
}
print ")\n";
output
my %amino_acids = (
M => ["Methionine", "Met", ["ATG"]],
F => ["Phenylalanine", "Phe", ["TTT", "TTC"]],
P => ["Proline", "Pro", ["CCT", "CCC", "CCA", "CCG"]],
)
Assuming you actually want valid perl as the output, this will do it:
open(my $IN, "<input.txt") or die $!;
while(<$IN>){
chomp;
my @tmp = split(':',$_);
if(@tmp != 5){
# error on this line
next;
}
my $group = join('","',split(/,\s*/,$tmp[4]));
print "\$amino_acids{$tmp[2]} = [$tmp[0],$tmp[1],[$group]];\n";
}
close $IN;
Using your sample lines, the output is:
$amino_acids{"M"} = ["Methionine","Met",["ATG"]];
$amino_acids{"F"} = ["Phenylalanine","Phe",["TTT","TTC"]];
$amino_acids{"P"} = ["Proline","Pro",["CCT","CCC","CCA","CCG"]];
@Borodin Thank you very much for your answer. Actually, I don't have to use Text::CSV or Data::Dump. I have to open a file and build the equivalent hash from the file. I am trying to do it without using either; hopefully it will help. Thanks again!
Perl has no special method to print hashes. What you should probably do is create a hash when reading the file:
while (<FILE>) {
chomp; # remove the trailing newline
my @line = split ':'; # split the line into an array
$amino_acids{$line[0]} = [ @line[1..$#line] ]; # copy elements 1..end into a new array ref
}
And then print out the hash one entry at a time:
foreach (keys %amino_acids) {
print "$_ => [", (join ",", @{ $amino_acids{$_} }), "]\n";
}
Note that I didn't compile this, so it may need a small amount of work to get it done.

Perl : change array item that is hashed to a key

I am having some problem with my perl. I hashed a key to an array. Now I want to change things in the array for each key. But I can't find out how this works :
open(DATEBOOK,"<sample.file");
@datebook = <DATEBOOK>;
$person = "Norma";
foreach(@datebook){
@record = ();
@lines = split(":",$_);
$size = @lines;
for ($i=1; $i < $size; $i++){
$record[$i-1] = $lines[$i];
}
$map{$lines[0]}="@record";
}
for(keys%map){ print $map{$_}};
The datebook file :
Tommy Savage:408.724.0140:1222 Oxbow Court, Sunnyvale,CA 94087:5/19/66:34200
Lesle Kerstin:408.456.1234:4 Harvard Square, Boston, MA 02133:4/22/62:52600
JonDeLoach:408.253.3122:123 Park St., San Jose, CA 94086:7/25/53:85100
Ephram Hardy:293.259.5395:235 Carlton Lane, Joliet, IL 73858:8/12/20:56700
Betty Boop:245.836.8357:635 Cutesy Lane, Hollywood, CA 91464:6/23/23:14500
William Kopf:846.836.2837:6937 Ware Road, Milton, PA 93756:9/21/46:43500
Norma Corder:397.857.2735:74 Pine Street, Dearborn, MI 23874:3/28/45:245700
James Ikeda:834.938.8376:23445 Aster Ave., Allentown, NJ 83745:12/1/38:45000
Lori Gortz:327.832.5728:3465 Mirlo Street, Peabody, MA 34756:10/2/65:35200
Barbara Kerz:385.573.8326:832 Ponce Drive, Gary, IN 83756:12/15/46:268500
I tried $map{$_}[1], but that doesn't work. Can anyone give me an example on how this works :) ?
thanks!
First, use strict and use warnings. Always.
Assuming what you want is a hash of arrays, do something like this:
use strict;
use warnings;
open my $datebookfh, '<', 'sample.file' or die $!;
my @datebook = <$datebookfh>;
my %map;
foreach my $row( @datebook ) {
my @record = split /:/, $row;
my $key = shift @record; # throw out first element and save it in $key
$map{$key} = \@record;
}
You can test that you have the correct structure by using Data::Dumper:
use Data::Dumper;
print Dumper( \%map );
The \ operator takes a reference. All hashes and arrays in Perl contain scalars, so compound structures (e.g. hashes of arrays) are really hashes of references to arrays. A reference is like a pointer.
Before going further, you should check out:
Perl reference tutorial
Arrays of arrays
Perl Data Structure Cookbook
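To tie this back to the original question (changing one item in the array stored under a key), here is a minimal sketch. It assumes the hash-of-array-references layout built above; the two sample records are taken from the datebook file, and the replacement phone number is made up for illustration.

```perl
use strict;
use warnings;

# Two sample records in the question's colon-separated format.
my @datebook = (
    "Norma Corder:397.857.2735:74 Pine Street, Dearborn, MI 23874:3/28/45:245700\n",
    "Betty Boop:245.836.8357:635 Cutesy Lane, Hollywood, CA 91464:6/23/23:14500\n",
);

# Build a hash of array references keyed by name.
sub build_map {
    my @rows = @_;
    my %map;
    for my $row (@rows) {
        chomp( my $line = $row );
        my ( $key, @record ) = split /:/, $line;
        $map{$key} = \@record;
    }
    return %map;
}

my %map = build_map(@datebook);

# Index 0 is the phone number; assign through the reference to change it in place.
$map{'Norma Corder'}[0] = '555.000.0000';    # hypothetical new value

# Dereference the whole array with @{ ... } to loop over its fields.
print "$_\n" for @{ $map{'Norma Corder'} };
```

The $map{$_}[1] syntax from the question works once the value stored under each key is an array reference rather than the string produced by "@record".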
Others have given you excellent advice. Here's one other idea to consider: store your data in a hash of hashes rather than a hash of arrays. It makes the data structure more communicative.
# Include these in your Perl scripts.
use strict;
use warnings;
my %data;
# Use lexical file handles, and check whether open() succeeds.
open(my $fh, '<', shift) or die $!;
while (my $line = <$fh>){
chomp $line;
my ($name, $ss, $address, $date, $number) = split /:/, $line;
$data{$name} = {
name => $name,
ss => $ss,
address => $address,
date => $date,
number => $number,
};
}
# Example usage: print info for one person.
my $person = $data{'Betty Boop'};
print $_, ' => ', $person->{$_}, "\n" for keys %$person;

finding the substring present in string and also count the number of occurrences

Could anyone tell me what the mistake is? The program should find the substrings in a given string and count the number of occurrences of each one. The string must be read three letters at a time.
e.g. String: AGAUUUAGA (i.e. for AGA, UUU, AGA)
output: AGA-2
UUU-1
print"Enter the mRNA Sequence\n";
$count=0;
$count1=0;
$seq=<>;
chomp($seq);
$p='';
$ln=length($seq);
$j=$ln/3;
for($i=0,$k=0;$i<$ln,$k<$j;$k++) {
$fra[$k]=substr($seq,$i,3);
$i=$i+3;
if({$fra[$k]} eq AGA) {
$count++;
print"The number of AGA is $count";
} elseif({$fra[$k]} eq UUU) {
$count1++;
print" The number of UUU is $count1";
}
}
This is a Perl FAQ:
perldoc -q count
This code will count the occurrences of your 2 strings:
use warnings;
use strict;
my $seq = 'AGAUUUAGA';
my $aga_cnt = () = $seq =~ /AGA/g;
my $uuu_cnt = () = $seq =~ /UUU/g;
print "The number of AGA is $aga_cnt\n";
print "The number of UUU is $uuu_cnt\n";
__END__
The number of AGA is 2
The number of UUU is 1
If you use strict and warnings, you will get many messages pointing out errors in your code.
Here is another approach which is more scalable:
use warnings;
use strict;
use Data::Dumper;
my $seq = 'AGAUUUAGA';
my %counts;
for my $key (qw(AGA UUU)) {
$counts{$key} = () = $seq =~ /$key/g;
}
print Dumper(\%counts);
__END__
$VAR1 = {
'AGA' => 2,
'UUU' => 1
};
Have a try with this, that avoids overlaps:
#!/usr/bin/perl
use strict;
use warnings;
use 5.10.1;
use Data::Dumper;
my $str = q!AGAUUUAGAGAAGAG!;
my @list = $str =~ /(...)/g;
my ($AGA, $UUU);
foreach(@list) {
$AGA++ if $_ eq 'AGA';
$UUU++ if $_ eq 'UUU';
}
say "number of AGA is $AGA and number of UUU is $UUU";
output:
number of AGA is 2 and number of UUU is 1
This is an example of how quickly you can get things done in Perl. Grouping the strands together as an alternation is one way to make sure there is no overlap. Also, a hash is a great way to count occurrences of a key.
$values{$_}++ foreach $seq =~ /(AGA|UUU)/g;
print "AGA-$values{AGA} UUU-$values{UUU}\n";
However, I generally want to generalize it to something like this, thinking that this might not be the only time you have to do something like this.
use strict;
use warnings;
use English qw<$LIST_SEPARATOR>;
my %values;
my @spans = qw<AGA UUU>;
my $split_regex
= do { local $LIST_SEPARATOR = '|';
qr/(@spans)/
}
;
$values{$_}++ foreach $seq =~ /$split_regex/g;
print join( ' ', map { "$_-$values{$_}" } @spans ), "\n";
You're not clear on how many "AGA" the string "AGAGAGA" contains.
If 2,
my $aga = () = $seq =~ /AGA/g;
my $uuu = () = $seq =~ /UUU/g;
If 3,
my $aga = () = $seq =~ /A(?=GA)/g;
my $uuu = () = $seq =~ /U(?=UU)/g;
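The difference between the two counts is easiest to see on the ambiguous string itself. This is a small sketch that runs both patterns against "AGAGAGA"; the variable names are only illustrative.

```perl
use strict;
use warnings;

my $seq = 'AGAGAGA';

# Plain match: the regex engine resumes after the end of each match,
# so two matches can never share characters.
my $non_overlapping = () = $seq =~ /AGA/g;

# Lookahead: only the leading 'A' is consumed, so the next match
# may start inside the text the previous one looked at.
my $overlapping = () = $seq =~ /A(?=GA)/g;

print "non-overlapping: $non_overlapping\n";    # prints 2
print "overlapping: $overlapping\n";            # prints 3
```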
If I understand you correctly (and certainly that is questionable; almost every answer so far is interpreting your question differently than every other answer):
my %substring;
$substring{$1}++ while $seq =~ /(...)/g; # /g advances through the string; without it the loop never ends
print "There are $substring{UUU} UUU's and $substring{AGA} AGA's\n";
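If the goal is to read the sequence strictly three letters at a time and report every triplet found, a sketch along these lines may be closer to the output format in the question. The function name count_codons is hypothetical, and it assumes that a trailing fragment shorter than three letters should simply be ignored.

```perl
use strict;
use warnings;

# Count triplets read in a fixed frame of three characters.
sub count_codons {
    my ($seq) = @_;
    my %count;
    # In list context, /g returns every captured triplet from left to right;
    # a trailing fragment shorter than three characters does not match.
    $count{$_}++ for $seq =~ /(.{3})/g;
    return \%count;
}

my $count = count_codons('AGAUUUAGA');
print "$_-$count->{$_}\n" for sort keys %$count;
```

For the sample string this prints AGA-2 and UUU-1, one per line, without having to name the triplets in advance.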