I have data for each student, e.g.:
Student Name Score
Jack 89
Jill 70
Sandy 40
Now I'm trying to plot these in a bar chart using GD::Graph::bars. I'm pretty new to Perl and its modules, and so far I can only see how to declare all the X and Y values for the chart manually.
But I don't know the names and scores of the students in advance (they are pulled from a text file),
so I want the values to be filled in automatically.
I was thinking hash keys and values would be a good approach, so I placed everything in a hash: $hash{student name} = score.
Can anyone help me plot this as a bar chart or guide me? Or would you recommend a different approach?
Thanks
Update
This is the part where I can plot the graph manually by entering the student names.
my $graph = GD::Graph::bars->new(800, 800);
my @data = (
["Jack","Jill"],
['30','50'],
);
$graph->set(
x_label => 'Students',
y_label => 'Scores',
title => 'Student Vs. Scores',
y_max_value => 60,
y_tick_number => 8,
y_label_skip => 2
) or die $graph->error;
my $gd = $graph->plot(\@data) or die $graph->error;
open(IMG, '>file.png') or die $!;
binmode IMG;
print IMG $gd->png;
Assuming your data file is as follows, using tab delimiters.
Student Name Score
Jack 89
Jill 70
Sandy 40
You could do something like this, pushing your x axis and y axis values from your data file to arrays.
use strict;
use warnings;
use CGI qw( :standard );
use GD::Graph::bars;
open my $fh, '<', 'data.txt' or die $!;
my (@x, @y);
while (<$fh>) {
next if $. == 1; # skip header line
push @x, (split /\t/)[0]; # push 'Student Names' into @x array
push @y, (split /\t/)[1]; # push 'Score' into @y array
}
close $fh;
my $graph = GD::Graph::bars->new(800, 800);
$graph->set(
x_label => 'Students',
y_label => 'Scores',
title => 'Student Vs. Scores',
) or warn $graph->error;
my @data = (\@x, \@y);
$graph->plot(\@data) or die $graph->error();
print header(-type=>'image/jpeg'), $graph->gd->jpeg;
Giving you for example:
If you want to plot more than one set of y values, say from another tab-delimited column such as Score2, you could do something like this.
my (@x, @y, @y2);
while (<$fh>) {
next if $. == 1;
push @x, (split /\t/)[0];
push @y, (split /\t/)[1];
push @y2, (split /\t/)[2];
}
And change your @data array to:
my @data = (\@x, \@y, \@y2);
And your result would be:
According to the documentation, you need to pass an array of arrays to the plot method of GD::Graph::bars. It sounds like you already have a hash so you need to convert it to an array of arrays. There are a number of ways to do this, but here's an example:
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;
my %hash = (
Larry => 15,
Curly => 16,
Moe => 20
);
my (@names, @scores);
while (my ($name, $score) = each %hash) {
push @names, $name;
push @scores, $score;
}
my @data = (\@names, \@scores);
print Dumper(\@data);
# $VAR1 = [
# [
# 'Moe',
# 'Curly',
# 'Larry'
# ],
# [
# 20,
# 16,
# 15
# ]
# ];
However you do it, make sure you preserve the order in the inner arrays.
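One way to keep the two inner arrays aligned is to fix the key order once and then take a hash slice in that same order. This is only a minimal sketch, assuming alphabetical order of the names is acceptable; `hash_to_series` is a hypothetical helper name, not part of GD::Graph.

```perl
use strict;
use warnings;

# Hypothetical helper: turn a name => score hash into the two parallel
# arrays that plot() expects, in a deterministic (sorted) order.
sub hash_to_series {
    my (%score_for) = @_;
    my @names  = sort keys %score_for;   # fix the order once
    my @scores = @score_for{@names};     # hash slice, same order as @names
    return (\@names, \@scores);
}

my ($names, $scores) = hash_to_series(Larry => 15, Curly => 16, Moe => 20);
my @data = ($names, $scores);            # ready to pass as \@data
print "@$names\n";                       # Curly Larry Moe
print "@$scores\n";                      # 16 15 20
```

Because the scores are read through the already-sorted list of names, the two arrays cannot drift out of step.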
I adapted the code from the samples directory in GD::Graph:
use warnings;
use strict;
use GD::Graph::bars;
use GD::Graph::Data;
my %students = (
Jack => 89,
Jill => 70,
Sandy => 40,
);
my @scores;
my @names;
for (keys %students) {
push @names, $_;
push @scores, $students{$_};
}
my $data = GD::Graph::Data->new([
[@names],
[@scores],
]) or die GD::Graph::Data->error;
my $my_graph = GD::Graph::bars->new();
$my_graph->set(
x_label => 'Name',
y_label => 'Score',
title => 'A Simple Bar Chart',
) or warn $my_graph->error;
$my_graph->plot($data) or die $my_graph->error();
save_chart($my_graph, 'graph');
sub save_chart {
my $chart = shift or die "Need a chart!";
my $name = shift or die "Need a name!";
local(*OUT);
my $ext = $chart->export_format;
open(OUT, ">$name.$ext") or
die "Cannot open $name.$ext for write: $!";
binmode OUT;
print OUT $chart->gd->$ext();
close OUT;
}
Related
I'm struggling to understand how to read a simple text file into two Perl hashes.
I have a text file like:
George Washington
John Adams
Abraham Lincoln
and I want to create two hashes, one that holds the first names and the other that holds the last names.
I'm looking at doing something like:
my %first;
my %last;
open(my $FH, '<', $file) or die $!;
my $count = 1;
while (<$FH>)
{
chomp;
# if count is odd, add to %first
# elsif count is even, add to %last
}
close($FH);
but I'm honestly lost. Does anyone have any ideas?
You can get the desired result with the following code.
use strict;
use warnings;
use feature 'say';
use Data::Dumper;
my $count = 0;
my %first;
my %last;
while(<DATA>) {
chomp;
my($f,$l) = split;
$first{$f} = $count;
$last{$l} = $count;
$count++;
}
say Dumper(\%first);
say Dumper(\%last);
__DATA__
George Washington
John Adams
Abraham Lincoln
Output
$VAR1 = {
'George' => 0,
'Abraham' => 2,
'John' => 1
};
$VAR1 = {
'Adams' => 1,
'Lincoln' => 2,
'Washington' => 0
};
I am trying to read a file containing this into a Perl hash.
I want the first two columns to be the key and the remaining columns to be the values.
Celena Standard F 01/24/94 Cancer
Jeniffer Orlowski F 06/24/86 None
Brent Koehler M 12/05/97 HIV
Mao Schleich M 04/17/60 Cancer
Goldie Moultrie F 04/05/96 None
This is where I got stuck.
open FILE1, "Patient_Info.txt" or die;
my %hash;
while ( my $line = <FILE1> ) {
chomp $line; # remove newline
my ( $key, $value ) = split ' ', $line, 2;
$hash{$key} = $value;
}
my @sorted_keys = sort keys %hash;
my $new = 'Celena';
for my $new ( @sorted_keys ) {
print "$new $hash{$new} \n";
}
The first two fields are joined on '', and the remaining fields are left as an array reference:
use strict;
use warnings;
my %data;
while (<DATA>) {
my @fields = split;
my $key = join('', splice(@fields, 0, 2));
$data{$key} = \@fields;
}
for my $key (sort(keys(%data))) {
printf("%s: %s\n", $key, join(' ', @{$data{$key}}));
}
__DATA__
Celena Standard F 01/24/94 Cancer
Jeniffer Orlowski F 06/24/86 None
Brent Koehler M 12/05/97 HIV
Mao Schleich M 04/17/60 Cancer
Goldie Moultrie F 04/05/96 None
Output:
BrentKoehler: M 12/05/97 HIV
CelenaStandard: F 01/24/94 Cancer
GoldieMoultrie: F 04/05/96 None
JenifferOrlowski: F 06/24/86 None
MaoSchleich: M 04/17/60 Cancer
Some notes on your code
You should
Always use strict and use warnings 'all' at the top of every Perl program that you write
Use lexical file handles, like my $file1 rather than FILE1, as they are much safer and more useful than global ones
Please choose better variable identifiers. In %hash, the % says that it's a hash so you may as well have used %_. Is this a personnel list perhaps? Or a subscriber list?
I can't work out what you're hoping for with my $new = 'Celena', as you never use that variable again
You don't say how you want the data stored in the hash, so I've used an array to store the last three fields
I've added the Data::Dump output so that you can see the structure of the resulting hash, as well as a simple while loop that reproduces the original data (in a different order)
use strict;
use warnings 'all';
use autodie;
my %data = do {
open my $fh, '<', 'patient_info.txt';
map {
my ($first, $second, @info) = split;
"$first $second" => \@info;
} <$fh>;
};
use Data::Dump;
dd \%data;
print "\n";
while ( my ($name, $info) = each %data ) {
print "$name @$info\n";
}
output
{
"Brent Koehler" => ["M", "12/05/97", "HIV"],
"Celena Standard" => ["F", "01/24/94", "Cancer"],
"Goldie Moultrie" => ["F", "04/05/96", "None"],
"Jeniffer Orlowski" => ["F", "06/24/86", "None"],
"Mao Schleich" => ["M", "04/17/60", "Cancer"],
}
Celena Standard F 01/24/94 Cancer
Mao Schleich M 04/17/60 Cancer
Jeniffer Orlowski F 06/24/86 None
Goldie Moultrie F 04/05/96 None
Brent Koehler M 12/05/97 HIV
I have two files. One file has a list of values like so
NC_SNPStest.txt
250
275
375
The other file has space delimited information. Column one is the first value of a range, Column two has the second value of a range, Column 5 has the name of the range, and Column eight has what acts on that range.
promoterstest.txt
20 100 yaaX F yaaX 5147 5.34 Sigma70 99
200 300 yaaA R yaaAp1 6482 6.54 Sigma70 35
350 400 yaaA R yaaAp2 6498 2.86 Sigma70 51
I am trying to write a script that takes each value from file 1 and then parses file 2 line by line to see whether that value falls within the range given by the first two columns.
When the first match is found, I want to print the value from file 1 followed by columns 5 and 8 from the matching line in file 2. If no match is found in file 2, just print the value from file 1 and move on.
It seems like it should be a simple enough task, but I'm having an issue cycling through both files.
This is what I have written:
#!/usr/bin/perl
use warnings;
use strict;
open my $PromoterFile, '<', 'promoterstest.txt' or die $!;
open my $SNPSFile, '<', 'NC_SNPtest.txt' or die $!;
open (FILE, ">PromoterMatchtest.txt");
while (my $SNPS = <$SNPSFile>) {
chomp ($SNPS);
while (my $Cord = <$PromoterFile>) {
chomp ($Cord);
my @CordFile = split(/\s/, $Cord);
my $Lend = $CordFile[0];
my $Rend = $CordFile[1];
my $Promoter = $CordFile[4];
my $SigmaFactor = $CordFile[7];
foreach $a ($SNPS)
{
if ($a >= $Lend && $a <= $Rend)
{
print FILE "$a\t$CordFile[4]\t$CordFile[7]\n";
}
else
{
print FILE "$a\n";
}
}
}
}
close FILE;
close $PromoterFile;
close $SNPSFile;
exit;
So far my output looks like so:
250
250 yaaAp1 Sigma70
250
Where the first line of file 1 is being called and file 2 is being cycled through. But the else command is being used on each line of file 2 and the script never cycles through the other lines of file 1.
Your problem is that you're not resetting your position in the second file. You read one line from $SNPSFile and check it against every line in the second file.
But when you start over, you're already at the end of file, so:
while (my $Cord = <$PromoterFile>) {
Doesn't have anything to read.
A quick fix for this would be to add a seek call in there, but that makes for inefficient code. I'd suggest reading file 2 into an array and referencing that instead.
Here's a first draft rewrite that may help.
#!/usr/bin/perl
use warnings;
use strict;
use Data::Dumper;
open my $PromoterFile, '<', 'promoterstest.txt' or die $!;
open my $SNPSFile, '<', 'NC_SNPtest.txt' or die $!;
open my $output, ">", "PromoterMatchtest.txt" or die $!;
my @data;
while (<$PromoterFile>) {
chomp;
my @CordFile = split;
my $Lend = $CordFile[0];
my $Rend = $CordFile[1];
my $Promoter = $CordFile[4];
my $SigmaFactor = $CordFile[7];
push(
@data,
{ lend => $CordFile[0],
rend => $CordFile[1],
promoter => $CordFile[4],
sigmafactor => $CordFile[7]
}
);
}
print Dumper \@data;
foreach my $value (<$SNPSFile>) {
chomp $value;
my $found = 0;
foreach my $element (@data) {
if ( $value >= $element->{lend}
and $value <= $element->{rend} )
{
#print "Found $value\n";
print {$output} join( "\t",
$value, $element->{promoter}, $element->{sigmafactor} ),
"\n";
$found++;
last;
}
}
if ( not $found ) {
print {$output} $value,"\n";
}
}
close $output;
close $PromoterFile;
close $SNPSFile;
First - we open file2, read in the stuff in it to an array of hashes. (If any of the elements there are unique, we could key off that instead.)
Then we read through SNPSfile one line at a time, looking for each key - printing it if it exists (at least once, on the first hit) and printing just the key if it doesn't.
This generates the output:
250 yaaAp1 Sigma70
275 yaaAp1 Sigma70
375 yaaAp2 Sigma70
Was that what you were aiming for?
Aside from that 'Dumper' statement, which outputs the content of @data thus:
$VAR1 = [
{
'sigmafactor' => 'Sigma70',
'promoter' => 'yaaX',
'lend' => '20',
'rend' => '100'
},
{
'sigmafactor' => 'Sigma70',
'promoter' => 'yaaAp1',
'rend' => '300',
'lend' => '200'
},
{
'promoter' => 'yaaAp2',
'sigmafactor' => 'Sigma70',
'rend' => '400',
'lend' => '350'
}
];
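For comparison, the seek-based quick fix mentioned above could look roughly like the sketch below. It rewinds the range file before every scan, so it re-reads that file once per value, which is why the array approach is usually preferable. `match_with_seek` is a hypothetical helper name, and the in-memory filehandles merely stand in for the real files.

```perl
use strict;
use warnings;

# Hypothetical helper: for each value, rescan the range file from the start.
sub match_with_seek {
    my ($values_fh, $ranges_fh) = @_;
    my @out;
    while (my $value = <$values_fh>) {
        chomp $value;
        seek $ranges_fh, 0, 0 or die "seek failed: $!";   # rewind before each scan
        my $hit;
        while (my $line = <$ranges_fh>) {
            my @f = split ' ', $line;
            if ($value >= $f[0] && $value <= $f[1]) {
                $hit = join "\t", $value, @f[4, 7];        # columns 5 and 8
                last;                                      # first match only
            }
        }
        push @out, defined $hit ? $hit : $value;
    }
    return @out;
}

# In-memory stand-ins for promoterstest.txt and the SNP value file.
my $ranges = "20 100 yaaX F yaaX 5147 5.34 Sigma70 99\n"
           . "200 300 yaaA R yaaAp1 6482 6.54 Sigma70 35\n"
           . "350 400 yaaA R yaaAp2 6498 2.86 Sigma70 51\n";
my $values = "250\n275\n375\n";
open my $ranges_fh, '<', \$ranges or die $!;
open my $values_fh, '<', \$values or die $!;
print "$_\n" for match_with_seek($values_fh, $ranges_fh);
```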
Here's my take on a programming solution. It's important to
Use lexical file handles and the three-parameter form of open
Keep to lower-case letters, digits and underscores for local variables
I have also used the autodie pragma to remove the need to test the status of open explicitly, and the first function from the core library List::Util to make the code clearer and more concise
use strict;
use warnings;
use 5.010;
use autodie;
use List::Util 'first';
my @promoters;
{
open my $fh, '<', 'promoterstest.txt';
while ( <$fh> ) {
my @fields = split;
push @promoters, [ @fields[0,1,4,7] ];
}
}
open my $fh, '<', 'NC_SNPStest.txt';
open my $out_fh, '>', 'PromoterMatchtest.txt';
select $out_fh;
while ( <$fh> ) {
my ($num) = split;
my $match = first { $num >= $_->[0] and $num <= $_->[1] } @promoters;
if ( $match ) {
print join("\t", $num, @{$match}[2,3]), "\n";
}
else {
print $num, "\n";
}
}
output
250 yaaAp1 Sigma70
275 yaaAp1 Sigma70
375 yaaAp2 Sigma70
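To show what `first` is doing in isolation: it returns the first list element for which the block is true, or undef when nothing matches, which is why the `if ( $match )` test works. Here is a small sketch with the ranges hard-coded; `range_name` and the three-element rows are assumptions made for illustration.

```perl
use strict;
use warnings;
use List::Util 'first';

# Each row is [start, end, name]; values mirror the promoter file above.
my @ranges = (
    [ 20,  100, 'yaaX'   ],
    [ 200, 300, 'yaaAp1' ],
    [ 350, 400, 'yaaAp2' ],
);

# Hypothetical helper: name of the first range containing $num, or undef.
sub range_name {
    my ($num) = @_;
    my $match = first { $num >= $_->[0] and $num <= $_->[1] } @ranges;
    return $match ? $match->[2] : undef;
}

for my $num (250, 150) {
    my $name = range_name($num);
    $name = 'no match' unless defined $name;
    print "$num: $name\n";
}
```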
I have two tab-separated files that I need to align together. For example:
File 1: File 2:
AAA 123 BBB 345
BBB 345 CCC 333
CCC 333 DDD 444
(These are large files, potentially thousands of lines!)
What I would like to do is to have the output look like this:
AAA 123
BBB 345 BBB 345
CCC 333 CCC 333
DDD 444
Preferably I would like to do this in Perl, but I'm not sure how. Any help would be greatly appreciated.
If it's just about making a data structure, this can be quite easy.
#!/usr/bin/env perl
# usage: script.pl file1 file2 ...
use strict;
use warnings;
my %data;
while (<>) {
chomp;
my ($key, $value) = split;
push @{$data{$key}}, $value;
}
use Data::Dumper;
print Dumper \%data;
You can then output in any format you like. If it's really about using the files exactly as they are, then it's a little bit more tricky.
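As one possible way to produce the aligned layout from the question, here is a sketch that keeps a separate hash per file and walks the sorted union of keys. It assumes each key occurs at most once per file; `align_rows` is a hypothetical helper, and the hashes stand in for data read from the two files.

```perl
use strict;
use warnings;

# Hypothetical helper: align two key => value hashes side by side.
# A key missing on the left leaves two empty columns; a key missing
# on the right simply ends the row early.
sub align_rows {
    my ($left, $right) = @_;
    my %union = (%$left, %$right);          # only the keys matter here
    my @rows;
    for my $key (sort keys %union) {
        my @l = exists $left->{$key}  ? ($key, $left->{$key})  : ('', '');
        my @r = exists $right->{$key} ? ($key, $right->{$key}) : ();
        push @rows, join("\t", @l, @r);
    }
    return @rows;
}

my %file1 = (AAA => 123, BBB => 345, CCC => 333);
my %file2 = (BBB => 345, CCC => 333, DDD => 444);
print "$_\n" for align_rows(\%file1, \%file2);
```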
Assuming the files are sorted,
sub get {
my ($fh) = @_;
my $line = <$fh>;
return () if !defined($line);
return split(' ', $line);
}
my ($key1, $val1) = get($fh1);
my ($key2, $val2) = get($fh2);
while (defined($key1) && defined($key2)) {
if ($key1 lt $key2) {
print(join("\t", $key1, $val1), "\n");
($key1, $val1) = get($fh1);
}
elsif ($key1 gt $key2) {
print(join("\t", '', '', $key2, $val2), "\n");
($key2, $val2) = get($fh2);
}
else {
print(join("\t", $key1, $val1, $key2, $val2), "\n");
($key1, $val1) = get($fh1);
($key2, $val2) = get($fh2);
}
}
while (defined($key1)) {
print(join("\t", $key1, $val1), "\n");
($key1, $val1) = get($fh1);
}
while (defined($key2)) {
print(join("\t", '', '', $key2, $val2), "\n");
($key2, $val2) = get($fh2);
}
Similar to Joel Berger's answer, but this approach allows you to keep track of whether files did or did not contain a given key:
my %data;
while (my $line = <>){
chomp $line;
my ($k) = $line =~ /^(\S+)/;
$data{$k}{line} = $line;
$data{$k}{$ARGV} = 1;
}
use Data::Dumper;
print Dumper(\%data);
Output:
$VAR1 = {
'CCC' => {
'other.dat' => 1,
'data.dat' => 1,
'line' => 'CCC 333'
},
'BBB' => {
'other.dat' => 1,
'data.dat' => 1,
'line' => 'BBB 345'
},
'DDD' => {
'other.dat' => 1,
'line' => 'DDD 444'
},
'AAA' => {
'data.dat' => 1,
'line' => 'AAA 123'
}
};
As ikegami mentioned, it assumes that the files' contents are arranged as shown in your example.
use strict;
use warnings;
open my $file1, '<file1.txt' or die $!;
open my $file2, '<file2.txt' or die $!;
my $file1_line = <$file1>;
print $file1_line;
while ( my $file2_line = <$file2> ) {
if( defined( $file1_line = <$file1> ) ) {
chomp $file1_line;
print $file1_line;
}
my $tabs = $file1_line ? "\t" : "\t\t";
print "$tabs$file2_line";
}
close $file1;
close $file2;
Reviewing your example, you show some identical key/value pairs in both files. Given this, it looks like you want to show the pair(s) unique to file 1, unique to file 2, and show the common pairs. If this is the case (and you're not trying to match the files' pairs by either keys or values), you can use List::Compare:
use strict;
use warnings;
use List::Compare;
open my $file1, '<file1.txt' or die $!;
my @file1 = <$file1>;
close $file1;
open my $file2, '<file2.txt' or die $!;
my @file2 = <$file2>;
close $file2;
my $lc = List::Compare->new(\@file1, \@file2);
my @file1Only = $lc->get_Lonly; # L(eft array)only
for(@file1Only) { print }
my @bothFiles = $lc->get_intersection;
for(@bothFiles) { chomp; print "$_\t$_\n" }
my @file2Only = $lc->get_Ronly; # R(ight array)only
for(@file2Only) { print "\t\t$_" }
So I have a text file with the following line:
123456789
But then I have a second file:
987654321
So how can I make the first file's contents the keys in a hash, and the second file's values the values? (Each character is a key/value)
Should I store each file into different arrays and then somehow merge them? How would I do that? Anything else?
Honestly, I would give you my code I have tried, but I haven't the slightest idea where to start.
You could use a hash slice.
If each line is a key/value: (s///r requires 5.14, but it can easily be rewritten for earlier versions)
my %h;
@h{ map s/\s+\z//r, <$fh1> } = map s/\s+\z//r, <$fh2>;
If each character is a key/value:
my %h;
{
local $/ = \1;
@h{ grep !/\n/, <$fh1> } = grep !/\n/, <$fh2>;
}
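For readers new to hash slices, the assignment `@h{LIST} = LIST` stores the Nth value under the Nth key in a single statement. Below is a minimal sketch of the per-character case that uses plain strings instead of filehandles; `chars_to_hash` is a hypothetical helper name.

```perl
use strict;
use warnings;

# Hypothetical helper: pair the characters of two strings position by position.
sub chars_to_hash {
    my ($key_string, $value_string) = @_;
    my @keys   = split //, $key_string;
    my @values = split //, $value_string;
    my %h;
    @h{@keys} = @values;   # hash slice: $h{ $keys[$i] } = $values[$i] for each $i
    return %h;
}

my %h = chars_to_hash('123456789', '987654321');
print "$_ => $h{$_}\n" for sort keys %h;
```

If the value string is shorter than the key string, the leftover keys are still created, with undef as their value.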
Just open both files and read them line by line simultaneously:
use strict; use warnings;
use autodie;
my %hash;
open my $keyFile, '<', 'keyfileName';
open my $valueFile, '<', 'valuefileName';
while(my $key = <$keyFile>) {
my $value = <$valueFile>;
chomp for $key, $value;
$hash{$key} = $value;
}
Of course this is just a quick sketch on how it could work.
The OP mentions that each character is a key or value, by this I take it that you mean that the output should be a hash like ( 1 => 9, 2 => 8, ... ). The OP also asks:
Should I store each file into different arrays and then somehow merge them? How would I do that?
This is exactly how this answer works. Here get_chars is a function that reads in each file, splits on every char and returns that array. Then use zip from List::MoreUtils to create the hash.
#!/usr/bin/env perl
use strict;
use warnings;
use List::MoreUtils 'zip';
my ($file1, $file2) = @ARGV;
my @file1chars = get_chars($file1);
my @file2chars = get_chars($file2);
my %hash = zip @file1chars, @file2chars;
use Data::Dumper;
print Dumper \%hash;
sub get_chars {
my $filename = shift;
open my $fh, '<', $filename
or die "Could not open $filename: $!";
my @chars;
while (<$fh>) {
chomp;
push @chars, split //;
}
return @chars;
}
Iterator madness:
#!/usr/bin/env perl
use autodie;
use strict; use warnings;
my $keyfile_contents = join("\n", 'A' .. 'J');
my $valuefile_contents = join("\n", map ord, 'A' .. 'E');
# Use get_iterator($keyfile, $valuefile) to read from physical files
my $each = get_iterator(\ ($keyfile_contents, $valuefile_contents) );
my %hash;
while (my ($k, $v) = $each->()) {
$hash{ $k } = $v;
}
use YAML;
print Dump \%hash;
sub get_iterator {
my ($keyfile, $valuefile) = @_;
open my $keyf, '<', $keyfile;
open my $valf, '<', $valuefile;
return sub {
my $key = <$keyf>;
return unless defined $key;
my $value = <$valf>;
chomp for grep defined, $key, $value;
return $key => $value;
};
}
Output:
C:\temp> yy
---
A: 65
B: 66
C: 67
D: 68
E: 69
F: ~
G: ~
H: ~
I: ~
J: ~
I would write
my %hash = ('123456789' => '987654321');