I have a CSV file like this:
name,email,salary
a,b#b.com,1000
d,e#e.com,2000
Now, I need to transform this to an array of hash-maps in Perl, so when I do something like:
table[1]{"email"}
it returns e#e.com.
The code I wrote is :
open(DATA, "<$file") or die "Cannot open the file\n";
my #table;
#fetch header line
$line = <DATA>;
my #header = split(',',$line);
#fetch data tuples
while($line = <DATA>)
{
my %map;
my #row = split(',',$line);
for($index = 0; $index <= $#header; $index++)
{
$map{"$header[$index]"} = $row[$index];
}
push(#table, %map);
}
close(DATA);
But I am not getting desired results.. Can u help?? Thanks in advance...
This line
push(#table, %map)
should be
push(#table, \%map)
You want table to be a list of hash references; your code adds each key and value in %map to the list as a separate element.
There is no need to reinvent the wheel here. You can do this with the Text::CSV module.
#!/usr/bin/perl
use strict;
use warnings;
use v5.16;
use Text::CSV;
my $csv = Text::CSV->new;
open my $fh, "<:encoding(utf8)", "data.csv" or die "data.csv: $!";
$csv->column_names( $csv->getline ($fh) );
while (my $row = $csv->getline_hr ($fh)) {
say $row->{email};
}
Something like this perhaps:
#!/usr/bin/perl
use strict;
use warnings;
use 5.010;
my #table;
chomp(my $header = <DATA>);
my #cols = split /,/, $header; # Should really use a real CSV parser here
while (<DATA>) {
chomp;
my %rec;
#rec{#cols} = split /,/;
push #table, \%rec;
}
say $table[1]{email};
__END__
name,email,salary
a,b#b.com,1000
d,e#e.com,2000
Related
I want to create a hash with column header as hash key and column values as hash values in Perl.
For example, if my csv file has the following data:
A,B,C,D,E
1,2,3,4,5
6,7,8,9,10
11,12,13,14,15 ...
I want to create a hash as follows:
A=> 1,6,11
B=>2,7,12
c=>3,8,13 ...
So that just by using the header name I can use that column values.
Is there a way in PERL to do this? Please help me.
I was able to store required column values as array using the following script
use strict;
use warnings;
open( IN, "sample.csv" ) or die("Unable to open file");
my $wanted_column = "A";
my #cells;
my #colvalues;
my $header = <IN>;
my #column_names = split( ",", $header );
my $extract_col = 0;
for my $header_line (#column_names) {
last if $header_line =~ m/$wanted_column/;
$extract_col++;
}
while ( my $row = <IN> ) {
last unless $row =~ /\S/;
chomp $row;
#cells = split( ",", $row );
push( #colvalues, $cells[$extract_col] );
}
my $sizeofarray = scalar #colvalues;
print "Size of the coulmn= $sizeofarray";
But I want to do this to all my column.I guess Hash of arrays will be the best solution but I dont know how to implement it.
Text::CSV is a useful helper module for this sort of thing.
use strict;
use warnings;
use Text::CSV;
use Data::Dumper;
my %combined;
open( my $input, "<", "sample.csv" ) or die("Unable to open file");
my $csv = Text::CSV->new( { binary => 1 } );
my #headers = #{ $csv->getline($input) };
while ( my $row = $csv->getline($input) ) {
for my $header (#headers) {
push( #{ $combined{$header} }, shift(#$row) );
}
}
print Dumper \%combined;
Since you requested without a module - you can use split but you need to bear in mind the limitations. CSV format allows for things like commas nested in quotes. split won't handle that case very well.
use strict;
use warnings;
use Data::Dumper;
my %combined;
open( my $input, "<", "sample.csv" ) or die("Unable to open file");
my $line = <$input>;
chomp ( $line );
my #headers = split( ',', $line );
while (<$input>) {
chomp;
my #row = split(',');
for my $header (#headers) {
push( #{ $combined{$header} }, shift(#row) );
}
}
print Dumper \%combined;
Note: Both of these will effectively ignore any extra columns that don't have headers. (And will get confused by duplicate column names).
Another solution by using for loop :
use strict;
use warnings;
my %data;
my #columns;
open (my $fh, "<", "file.csv") or die "Can't open the file : ";
while (<$fh>)
{
chomp;
my #list=split(',', $_);
for (my $i=0; $i<=$#list; $i++)
{
if ($.==1) # collect the columns, if its first line.
{
$columns[$i]=$list[$i];
}
else #collect the data, if its not the first line.
{
push #{$data{$columns[$i]}}, $list[$i];
}
}
}
foreach (#columns)
{
local $"="\,";
print "$_=>#{$data{$_}}\n";
}
Output will be like this :
A=>1,6,11
B=>2,7,12
C=>3,8,13
D=>4,9,14
E=>5,10,15
I feel like I'm missing something rather obvious, but can't find any answers in the documentation. Still new to OOP with Perl, but I'm using Text::CSV to parse a CSV for later use.
How would I go about extracting the first row and pushing the values to array #headers?
Here's what I have so far:
#!/usr/bin/perl
use warnings;
use diagnostics;
use strict;
use Fcntl ':flock';
use Text::CSV;
my $csv = Text::CSV->new({ sep_char => ',' });
my $file = "sample.csv";
my #headers; # Column names
open(my $data, '<:encoding(utf8)', $file) or die "Could not open '$file' $!\n";
while (my $line = <$data>) {
chomp $line;
if ($csv->parse($line)) {
my $r = 0; # Increment row counter
my $field_count = $csv->fields(); # Number of fields in row
# While data exists...
while (my $fields = $csv->getline( $data )) {
# Parse row into columns
print "Row ".$r.": \n";
# If row zero, process headers
if($r==0) {
# Add value to #columns array
push(#headers,$fields->[$c]);
} else {
# do stuff with records...
}
}
$r++
}
close $data;
You'd think that there would be a way to reference the existing fields in the first row.
Pretty much straight from the documentation, for example.
#!/usr/bin/perl
use strict;
use warnings;
use Text::CSV_XS;
my $csv = Text::CSV_XS->new ({ binary => 1, eol => $/ });
my $file = 'o33.txt';
open my $io, "<", $file or die "$file: $!";
my $header = $csv->getline ($io);
print join("-", #$header), "\n\n";
while (my $row = $csv->getline ($io)) {
print join("-", #$row), "\n";
}
__END__
***contents of o33.txt
lastname,firstname,age,gender,phone
mcgee,bobby,27,M,555-555-5555
kincaid,marl,67,M,555-666-6666
hofhazards,duke,22,M,555-696-6969
Prints:
lastname-firstname-age-gender-phone
mcgee-bobby-27-M-555-555-5555
kincaid-marl-67-M-555-666-6666
hofhazards-duke-22-M-555-696-6969
Update: Thinking about your problem, it may be that you want to address the data by its column name. For that, you might be able to use something (also from the docs), like this:
$csv->column_names ($csv->getline ($io));
while (my $href = $csv->getline_hr ($io)) {
print "lastname is: ", $href->{lastname},
" and gender is: ", $href->{gender}, "\n"
}
Note: You can use Text::CSV instead of Text::CSV_XS, as the former is a wrapper around the latter.
Thought I'd post my results for others.
#!/usr/bin/perl
use warnings;
use diagnostics;
use strict;
use Text::CSV;
sub read_csv {
my $csv = Text::CSV->new({ sep_char => ',' });
my $file = shift;
open(my $data, '<:encoding(utf8)', $file) or die "Could not open '$file' $!\n";
# Process Row Zero
my $header = $csv->getline ($data);
my $field_count = $csv->fields();
# Read the rest of the file
while (my $line = <$data>) {
chomp $line;
# Read line if possible
if ($csv->parse($line)) {
my $r = 0;
# While data exists...
while (my $fields = $csv->getline( $data )) {
# Parse row into columns
print Display->H2;
print "Row ".$r.": ".#$fields." columns. \n";
# Print column values
for(my $c=0; $c<#$fields; $c++) {
print #$header[$c]." : ".#$fields[$c]."\n";
}
$r++
}
}
close $data;
}
}
Cheers
I want to replace a pattern 'good' with 'bad' in my file. Currently what I am doing is:
#!/perl/bin/perl
$filename= "abc.txt";
open my $fh, $filename;
my $text = do { local( $/ ); <$fh> };
close $fh;
$text =~s/good/bad/g;
Is there any way I can do this without reading the whole file??
Edit: Suppose I know that there's only 1 'good' in the file.
P.S.,Hi I am new here. Hope I am doing it correctly.
You could use Tie::File:
#!/usr/bin/perl
use strict;
use Tie::File;
my $filename = ...;
tie my #lines, 'Tie::File', $filename or die;
for (#lines) {
s/good/bad/g and last;
}
Too make sure that not the whole file is slurped in you want to read the lines one-by-one, e.g. using:
#!/usr/bin/perl
use strict;
use Tie::File;
my $filename = ...;
tie my #lines, 'Tie::File', $filename or die;
for(my $i=0; ; $i++) {
last if !defined $lines[$i]; # eof
last if $lines[$i] =~ s/good/bad/g;
}
#!/usr/bin/perl
use strict;
use warnings;
use warnings;
use 5.010;
my #names = ("RD", "HD", "MP");
my $flag = 0;
my $filename = 'Sample.txt';
if (open(my $fh, '<', $filename))
{
while (my $row = <$fh>)
{
foreach my $i (0 .. $#names)
{
if( scalar $row =~ / \G (.*?) ($names[$i]) /xg )
{
$flag=1;
}
}
}
if( $flag ==1)
{
say $filename;
}
$flag=0;
}
here i read the content from one file and compare with array values, if file contant matches with array value i just display the file. in the same way how can i access different file from different direcory and compare the array values with same?
Q: How can I access a different file?
A: By specifying a different filename.
By the way: If you are using flags for loop control in Perl, you are doing something wrong. You can specify that this was the last iteration of the loop (in C: break), or that you want to start the next iteration. You can label the loops so that you can break out of as many loops as you like at once:
#!/usr/bin/perl
use 5.010; use warnings;
my #names = qw(RD HD MP);
# unpack command line arguments
my ($filename) = #ARGV;
open my $fh, "<", $filename or die "Oh noes, $filename is bad: $!";
LINE:
while (my $line = <$fh>) {
NAME:
foreach my $name (#names) {
if ($line =~ /\Q$name\E/) { # \QUOT\E the $name to escape everything
say "$filename contains $name";
last LINE;
}
}
}
Other highlights:
using a foreach loop as intended and
removing the (in this context) senseless \G assertion
You can then execute the script as perl script.pl Sample.txt or perl script.pl ../another.dir/foo.bar or whatever.
You can use the ~~ operator in Perl 5.10.
Don't forget to chomp the trailing whitespace.
#!/usr/bin/perl
use 5.010;
use strict;
use warnings;
my #names = ('RD', 'HD', 'MP');
my $other_dir = '/tmp';
my $filename = 'Sample.txt';
if ( open( my $fh, '<', "$other_dir/$filename" ) ) {
ROW:
while ( my $row = <$fh> ) {
chomp $row; # remove trailing \n
if ( $row ~~ #names ) {
say $filename;
last ROW;
}
}
}
close $fh;
Suppose file1 looks like this:
bye bye
hello
thank you
And file2 looks like this:
chao
hola
gracias
The desired output is this:
bye bye chao
hello hola
thank you gracias
I myself have already come up with five different approaches to solve this problem. But I think there must be more ways, probably more concise and more elegant ways, and I hope I can learn more cool stuff :)
The following is what I have tried so far, based on what I've learnt from the many solutions of my previous problems. Also, I'm trying to sort of digest or internalize the knowledge I've acquired from the Llama book.
Code 1:
#!perl
use autodie;
use warnings;
use strict;
open my $file1,'<','c:/file1.txt';
open my $file2,'<','c:/file2.txt';
while(defined(my $line1 = <$file1>)
and defined(my $line2 = <$file2>)){
die "Files are different sizes!\n" unless eof(file1) == eof(file2);
$line1 .= $line2;
$line1 =~ s/\n/ /;
print "$line1 \n";
}
Code 2:
#!perl
use autodie;
use warnings;
use strict;
open my $file1,'<','c:/file1.txt';
my #file1 = <$file1>;
open my $file2,'<','c:/file2.txt';
my #file2 =<$file2>;
for (my $n=0; $n<=$#file1; $n++) {
$file1[$n] .=$file2[$n];
$file1[$n]=~s/\n/ /;
print $file1[$n];
}
Code 3:
#!perl
use autodie;
use warnings;
use strict;
open my $file1,'<','c:/file1.txt';
open my $file2,'<','c:/file2.txt';
my %hash;
while(defined(my $line1 = <$file1>)
and defined(my $line2 = <$file2>)) {
chomp $line1;
chomp $line2;
my ($key, $val) = ($line1,$line2);
$hash{$key} = $val;
}
print map { "$_ $hash{$_}\n" } sort keys %hash;
Code 4:
#!perl
use autodie;
use warnings;
use strict;
open my $file1,'<','c:/file1.txt';
open my $file2,'<','c:/file2.txt';
while(defined(my $line1 = <$file1>)
and defined(my $line2 = <$file2>)) {
$line1 =~ s/(.+)/$1 $line2/;
print $line1;
}
Code 5:
#!perl
use autodie;
use warnings;
use strict;
open my $file1,'<','c:/file1.txt';
my #file1 =<$file1>;
open my $file2,'<','c:/file2.txt';
my #file2 =<$file2>;
while ((#file1) && (#file2)){
my $m = shift (#file1);
chomp($m);
my $n = shift (#file2);
chomp($n);
$m .=" ".$n;
print "$m \n";
}
I have tried something like this:
foreach $file1 (#file2) && foreach $file2 (#file2) {...}
But Perl gave me a syntactic error warning. I was frustrated. But can we run two foreach loops simultaneously?
Thanks, as always, for any comments, suggestions and of course the generous code sharing :)
This works for any number of files:
use strict;
use warnings;
use autodie;
my #handles = map { open my $h, '<', $_; $h } #ARGV;
while (#handles){
#handles = grep { ! eof $_ } #handles;
my #lines = map { my $v = <$_>; chomp $v; $v } #handles;
print join(' ', #lines), "\n";
}
close $_ for #handles;
The most elegant way doesn't involve perl at all:
paste -d' ' file1 file2
If I were a golfing man, I could rewrite #FM's answer as:
($,,$\)=(' ',"\n");#_=#ARGV;open $_,$_ for #_;print
map{chomp($a=<$_>);$a} #_=grep{!eof $_} #_ while #_
which you might be able to turn into a one-liner but that is just evil. ;-)
Well, here it is, under 100 characters:
C:\Temp> perl -le "$,=' ';#_=#ARGV;open $_,$_ for #_;print map{chomp($a =<$_>);$a} #_=grep{!eof $_ }#_ while #_" file1 file2
If it is OK to slurp (and why the heck not — we are looking for different ways), I think I have discovered the path the insanity:
#_=#ARGV;chomp($x[$.-1]{$ARGV}=$_) && eof
and $.=0 while<>;print "#$_{#_}\n" for #x
C:\Temp> perl -e "#_=#ARGV;chomp($x[$.-1]{$ARGV}=$_) && eof and $.=0 while<>;print qq{#$_{#_}\n} for #x" file1 file2
Output:
bye bye chao
hello hola
thank you gracias
An easier alternative to your Code 5 which allows for an arbitrary number of lines and does not care if files have different numbers of lines (hat tip #FM):
#!/usr/bin/perl
use strict; use warnings;
use File::Slurp;
use List::AllUtils qw( each_arrayref );
my #lines = map [ read_file $_ ], #ARGV;
my $it = each_arrayref #lines;
while ( my #lines = grep { defined and chomp and length } $it->() ) {
print join(' ', #lines), "\n";
}
And, without using any external modules:
#!perl
use autodie; use warnings; use strict;
my ($file1, $file2) = #ARGV;
open my $file1_h,'<', $file1;
my #file1 = grep { chomp; length } <$file1_h>;
open my $file2_h,'<', $file2;
my #file2 = grep { chomp; length } <$file2_h>;
my $n_lines = #file1 > #file2 ? #file1 : #file2;
for my $i (0 .. $n_lines - 1) {
my ($line1, $line2) = map {
defined $_ ? $_ : ''
} $file1[$i], $file2[$i];
print $line1, ' ', $line2, "\n";
}
If you want to concatenate only the lines that appear in both files:
#!perl
use autodie; use warnings; use strict;
my ($file1, $file2) = #ARGV;
open my $file1_h,'<', $file1;
my #file1 = grep { chomp; length } <$file1_h>;
open my $file2_h,'<', $file2;
my #file2 = grep { chomp; length } <$file2_h>;
my $n_lines = #file1 < #file2 ? #file1 : #file2;
for my $i (0 .. $n_lines - 1) {
print $file1[$i], ' ', $file2[$i], "\n";
}
An easy one with minimal error checking:
#!/usr/bin/perl -w
use strict;
open FILE1, '<file1.txt';
open FILE2, '<file2.txt';
while (defined(my $one = <FILE1>) or defined(my $twotemp = <FILE2>)){
my $two = $twotemp ? $twotemp : <FILE2>;
chomp $one if ($one);
chomp $two if ($two);
print ''.($one ? "$one " : '').($two ? $two : '')."\n";
}
And no, you can't run two loops simultaneous within the same thread, you'd have to fork, but that would not be guaranteed to run synchronously.