Cast array into list - perl

I have a file that I need to read into a list so I can use it in Template Toolkit. I can do that easily with an array, but I am struggling a bit with the list. Is there a way to cast an array into a list?
# file.txt
foo
bar
zoo
my $filename = shift;
open(my $fh, '<:encoding(UTF-8)', $filename)
    or die "Could not open file '$filename' $!";

while (my $row = <$fh>) {
    chomp $row;
    unshift @yy_array, $row;
}

my $zz_list = ['foo', 'bar', 'zoo'];

say "### Dumper - array ###";
print Dumper \@yy_array;
say "### Dumper - list ###";
print Dumper $zz_list;
### Dumper - array ###
$VAR1 = [
          'zoo',
          'bar',
          'foo'
        ];
### Dumper - list ###
$VAR1 = [
          'foo',
          'bar',
          'zoo'
        ];
###
Any thoughts?

What you call a list is an array reference. You can use the reference operator to get a reference to an array:
my $array_ref = \@array;
Another option is to create a new anonymous array and populate it with the elements of the original array:
my $array_ref = [ @array ];
To get the array back from the reference, you dereference it:
my @arr2 = @{ $array_ref };
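Since the end goal is Template Toolkit, here is a minimal sketch of handing the array reference straight to a template; the inline template text and the variable name rows are only examples, not anything from the original post:

use Template;

# Hypothetical inline template that loops over the rows read from the file.
my $template = "[% FOREACH row IN rows %][% row %]\n[% END %]";

my $tt = Template->new();
$tt->process(\$template, { rows => \@yy_array })
    or die $tt->error();

Template Toolkit is happy with either a named array passed by reference (as here) or an anonymous array built with [...]; inside the template both look the same.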

Related

match variable name in reading each line of a file to create view ddl

I have an input file,
TableName1.Column1
TableName1.Column2
TableName2.Column1
TableName2.Column2
TableName3.Column3 etc
I would like to read each line and work out which columns belong to TableName1, so that I can build view DDL like this: CREATE VIEW TABLENAME1 AS SELECT Column1, Column2 FROM TableName1; and then the next one will be the view for TableName2, and so on.
my $file = "summary.csv";
open (my $FH, '<', $file) or die "Can't open '$file' for read: $!";
my #lines;
while (my $line = <$FH>) {
push (#lines, $line);
}
close $FH or die "Cannot close $file: $!";
my $ln=#lines;
for (my $x=0; $x<$ln; $x++){
print("---Start->\n") if($x == 0);
print "---------------->\n";
my $first = (split /\./, $lines[$x] )[0];
my $second = $first;
print "Second is: $second \n";
if ((split /\./, $lines[$x] )[0] eq $first )
{
print "Same Table: $lines[$x]";
}
else
{
print "Next Table: $lines[$x]";
}
print("---End-->\n") if($x == $ln -1);
}
I'd do it something like this.
Parse the data into a data structure. I'm using an array of anonymous arrays. In the anonymous arrays, the first element is the table name and any other elements are columns.
#!/usr/bin/perl

use strict;
use warnings;
use feature 'say';

my @tables;
my $curr_table = '';

# Note: I used a DATA filehandle to test this. You'll need to
# insert your file-opening code here.
while (<DATA>) {
    chomp;
    my ($table, $column) = split /\./;
    if ($table ne $curr_table) {
        push @tables, [ $table ];
        $curr_table = $table;
    }
    push @{ $tables[-1] }, $column;
}
And then walk the data structure to do whatever you want with the data (here, I'm just displaying it).
for my $t (@tables) {
    my ($table, @columns) = @{ $t };
    say "Table: $table";
    say " * $_" for @columns;
}
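To get the CREATE VIEW statements the question actually asks for, the same walk can emit DDL directly. A minimal sketch over the @tables structure built above (it does no quoting of identifiers, which may matter for real table names):

for my $t (@tables) {
    my ($table, @columns) = @{ $t };
    say "CREATE VIEW $table AS SELECT " . join(', ', @columns) . " FROM $table;";
}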

Perl - need to store the column values into hash

I want to create a hash with column header as hash key and column values as hash values in Perl.
For example, if my csv file has the following data:
A,B,C,D,E
1,2,3,4,5
6,7,8,9,10
11,12,13,14,15 ...
I want to create a hash as follows:
A=> 1,6,11
B=>2,7,12
C=>3,8,13 ...
So that just by using the header name I can access that column's values.
Is there a way in Perl to do this? Please help me.
I was able to store one required column's values as an array using the following script:
use strict;
use warnings;

open( IN, "sample.csv" ) or die("Unable to open file");

my $wanted_column = "A";
my @cells;
my @colvalues;

my $header = <IN>;
my @column_names = split( ",", $header );

my $extract_col = 0;
for my $header_line (@column_names) {
    last if $header_line =~ m/$wanted_column/;
    $extract_col++;
}

while ( my $row = <IN> ) {
    last unless $row =~ /\S/;
    chomp $row;
    @cells = split( ",", $row );
    push( @colvalues, $cells[$extract_col] );
}

my $sizeofarray = scalar @colvalues;
print "Size of the column = $sizeofarray";
But I want to do this for all of my columns. I guess a hash of arrays will be the best solution, but I don't know how to implement it.
Text::CSV is a useful helper module for this sort of thing.
use strict;
use warnings;
use Text::CSV;
use Data::Dumper;

my %combined;

open( my $input, "<", "sample.csv" ) or die("Unable to open file");
my $csv = Text::CSV->new( { binary => 1 } );

my @headers = @{ $csv->getline($input) };
while ( my $row = $csv->getline($input) ) {
    for my $header (@headers) {
        push( @{ $combined{$header} }, shift(@$row) );
    }
}

print Dumper \%combined;
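Once %combined is built, each column can be pulled out by its header name; for example, using column A from the sample data above:

my @column_a = @{ $combined{A} };   # ('1', '6', '11') for the sample data
print "A: @column_a\n";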
Since you asked for a way without a module: you can use split, but you need to bear in mind its limitations. The CSV format allows for things like commas nested inside quotes, and split won't handle that case well.
use strict;
use warnings;
use Data::Dumper;

my %combined;

open( my $input, "<", "sample.csv" ) or die("Unable to open file");

my $line = <$input>;
chomp($line);
my @headers = split( ',', $line );

while (<$input>) {
    chomp;
    my @row = split(',');
    for my $header (@headers) {
        push( @{ $combined{$header} }, shift(@row) );
    }
}

print Dumper \%combined;
Note: Both of these will effectively ignore any extra columns that don't have headers. (And will get confused by duplicate column names).
Another solution, using a for loop:
use strict;
use warnings;

my %data;
my @columns;

open (my $fh, "<", "file.csv") or die "Can't open the file: $!";
while (<$fh>)
{
    chomp;
    my @list = split(',', $_);
    for (my $i = 0; $i <= $#list; $i++)
    {
        if ($. == 1)    # collect the columns, if it's the first line
        {
            $columns[$i] = $list[$i];
        }
        else            # collect the data, if it's not the first line
        {
            push @{ $data{$columns[$i]} }, $list[$i];
        }
    }
}

foreach (@columns)
{
    local $" = ',';
    print "$_=>@{$data{$_}}\n";
}
Output will be like this:
A=>1,6,11
B=>2,7,12
C=>3,8,13
D=>4,9,14
E=>5,10,15

hash doesn't print right while reading from file in perl

I am trying to build a hash in perl by reading a file.
File content is as below:
s1=i1
s2=i2
s3=i3
And my code is as below:
my $FD;
open ($FD, "read") || die "Cant open the file: $!";
while (<$FD>) {
    chomp $_;
    print "\n Line read = $_\n";
    $_ =~ /([0-9a-z]*)=([0-9a-zA-Z]*)/;
    @temp_arr = ($2, $3, $4);
    print "Array = @temp_arr\n";
    $HASH{$1} = \@temp_arr;
    print "Hash now = ";
    foreach (keys %HASH) { print "$_ = $HASH{$_}->[0]\n"; };
}
And my output is as below:
Line read = s1=i1
Array = i1
Hash now = s1 = i1
Line read = s2=i2
Array = i2
Hash now = s2 = i2
s1 = i2
Line read = s3=i3
Array = i3
Hash now = s2 = i3
s1 = i3
s3 = i3
Why is the value for every key printed as i3 in the end?
Because you are putting references to the same array in each value.
Try something like this:
#!/usr/bin/perl

use strict;
use warnings;
use Data::Dumper;

my %result;

open my $fh, '<', 'read' or die $!;
while (my $line = <$fh>) {
    chomp $line;
    my ($key, $value) = split /=/, $line, 2;
    die "$key already exists" if (exists $result{$key});
    $result{$key} = $value;
}

print Dumper(\%result);
Output is:
$VAR1 = {
          's1' => 'i1',
          's3' => 'i3',
          's2' => 'i2'
        };
\@temp_arr is a reference to the global variable @temp_arr. You are re-initializing it repeatedly, but it's still a reference to the same original variable.
You need to lexically scope @temp_arr (my @temp_arr = ($2, $3, $4);) or pass a new reference into the hash ($HASH{$1} = [ $2, $3, $4 ];).
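A minimal corrected version of the original loop, keeping the asker's regex (which has only two capture groups, so only $2 is worth storing):

my %HASH;
open (my $FD, '<', 'read') or die "Can't open the file: $!";
while (<$FD>) {
    chomp;
    next unless /([0-9a-z]*)=([0-9a-zA-Z]*)/;
    $HASH{$1} = [ $2 ];    # a fresh anonymous array on every iteration
}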
Try this:
my $FD;
open ($FD, "read") || die "Cant open the file: $!";
for (<$FD>) {
    chomp $_;
    push(@temp_arr, $1, $2) if ($_ =~ /(.*?)=(.*)/);
}
%HASH = @temp_arr;
print Dumper \%HASH;
Try this.
open (my $FD, "read") || die "Cant open the file: $!";
my %HASH = map {chomp $_; my #x = split /=/, $_; $x[0] => $x[1]} <$FD>;
print "Key: $_ Value: $HASH{$_}\n" for (sort keys %HASH);
Besides the errors in your "open" statement, try keeping it simple first, then make it unreadable.
my ($FD, $a, $b, $k);
$FD = "D:\\Perl\\test.txt";
open (FD, "<$FD") or die "Cant open the file $FD: $!";
while (<FD>) {
    chomp $_;
    print "\n Line read = $_\n";
    ($a, $b) = split('=', $_);
    print "A: $a, B: $b\n";
    $HASH{$a} = "$b";
    print "Hash now ..\n";
    foreach $k (sort keys %HASH) {
        print "Key: $k -- HASH{$k} = $HASH{$k}\n";
    }
}

Retrieving values matching the same ID with perl

This is a simple problem, but I cannot find any working solution for it. I have two files. The first file holds all the IDs I am interested in, for example "tomato" and "cucumber", but also ones I am not interested in, which have no values in the second file. The second file has the following data structure:
tomato red
tomato round
tomato sweet
cucumber green
cucumber bitter
cucumber watery
What I need to get is a file containing all the IDs with all the matching values from the second file, everything tab-separated, like this:
tomato red round sweet
cucumber green bitter watery
What I did so far is create a hash out of the IDs in the first file:
while (<FILE>) {
    chomp;
    @records = split "\t", $_;
    {%hash = map { $records[0] => 1 } @records};
}
And this for the second file:
while (<FILE2>) {
    chomp;
    @records2 = split "\t", $_;
    $key, $value = $records2[0], $records2[1];
    $data{$key} = join("\t", $value);
}
close FILE;
foreach my $key ( keys %data )
{
    print OUT "$key\t$data{$key}\n"
        if exists $hash{$key}
}
Would be grateful for some simple solution for combining all the values matching the same ID! :)
For the first file:
while (<FILE>) {
    chomp;
    @records = split "\t", $_;
    $hash{$records[0]} = 1;
}
and for the second:
while (<FILE2>) {
    chomp;
    @records2 = split "\t", $_;
    ($key, $value) = @records2;
    $data{$key} = [] unless exists $data{$key};
    push @{ $data{$key} }, $value;
}
close FILE;
foreach my $key ( keys %data ) {
    print OUT $key . "\t" . join("\t", @{ $data{$key} }) . "\n" if exists $hash{$key};
}
This seems to do what is needed
use strict;
use warnings;

my %data;

open my $fh, '<', 'file1.txt' or die $!;
while (<$fh>) {
    $data{$1} = {} if /([^\t]+)/;
}

open $fh, '<', 'file2.txt' or die $!;
while (<$fh>) {
    $data{$1}{$2}++ if /^(.+?)\t(.+?)$/ and exists $data{$1};
}

while ( my ($key, $values) = each %data ) {
    print join("\t", $key, keys %$values), "\n";
}
output
tomato sweet round red
cucumber green watery bitter
It's easier if you read the data mapping first.
Also, if you are using Perl, you should consider leveraging one of its main strengths from the get-go: CPAN libraries. For example, reading in the file is as simple as read_file() from File::Slurp, instead of having to open/close the file yourself and then run a while(<>) loop.
use File::Slurp;

my %data;

my @data_lines = File::Slurp::read_file($filename2);
chomp(@data_lines);
foreach my $line (@data_lines) { # Improved version from CyberDem0n's answer
    my ($key, $value) = split("\t", $line);
    $data{$key} ||= [];          # Make sure it's an array reference the first time
    push @{ $data{$key} }, $value;
}

my @id_lines = File::Slurp::read_file($filename1);
chomp(@id_lines);
foreach my $id (@id_lines) {
    print join("\t", ( $id, @{ $data{$id} } ) ) . "\n";
}
A slightly hackier but somewhat shorter version adds the ID to the list of values in the data hash from the get-go:
my @data_lines = File::Slurp::read_file($filename2);
chomp(@data_lines);
foreach my $line (@data_lines) { # Improved version from CyberDem0n's answer
    my ($key, $value) = split("\t", $line);
    $data{$key} ||= [ $key ];    # Seed the list with the ID so it gets printed first
    push @{ $data{$key} }, $value;
}

my @id_lines = File::Slurp::read_file($filename1);
chomp(@id_lines);
foreach my $id (@id_lines) {
    print join("\t", @{ $data{$id} } ) . "\n";   # ID already in %data!
}

perl printing hash of arrays without Data::Dumper

Here is the code; I know it is not perfect Perl. If you have insight on how I can do better, let me know. My main question is: how would I print out the arrays without using Data::Dumper?
#!/usr/bin/perl

use Data::Dumper qw(Dumper);
use strict;
use warnings;

open(MYFILE, "<", "move_headers.txt") or die "ERROR: $!";

# First split the list of files and the headers apart
my @files;
my @headers;
my @file_list = <MYFILE>;

foreach my $source_parts (@file_list) {
    chomp($source_parts);
    my @parts = split(/:/, $source_parts);
    unshift(@files, $parts[0]);
    unshift(@headers, $parts[1]);
}

# Next get a list of unique headers
my @unique_files;
foreach my $item (@files) {
    my $found = 0;
    foreach my $i (@unique_files) {
        if ($i eq $item) {
            $found = 1;
            last;
        }
    }
    if (!$found) {
        unshift @unique_files, $item;
    }
}
@unique_files = sort(@unique_files);

# Now collect the headers in a list per file
my %hash_table;
for (my $i = 0; $i < @files; $i++) {
    unshift @{ $hash_table{"$files[$i]"} }, "$headers[$i]";
}

# Process the list with regex
while ((my $key, my $value) = each %hash_table) {
    if (ref($value) eq "ARRAY") {
        print "$value", "\n";
    }
}
The Perl documentation has a tutorial on "Printing of a HASH OF ARRAYS" (without using Data::Dumper)
perldoc perldsc
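For a hash of arrays, the perldsc recipe boils down to looping over the keys and interpolating (or looping over) each array; a minimal sketch with a hypothetical %HoA:

my %HoA = ( 'file_a.txt' => [ 'Header1', 'Header2' ] );   # hypothetical example data

for my $key ( sort keys %HoA ) {
    print "$key: @{ $HoA{$key} }\n";    # prints the whole array, space-separated
}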
You're doing a couple things the hard way. First, a hash will already uniqify its keys, so you don't need the loop that does that. It appears that you're building a hash of files, with the values meant to be the headers found in those files. The input data is "filename:header", one per line. (You could use a hash of hashes, since the headers may need uniquifying, but let's let that go for now.)
use strict;
use warnings;

open my $files_and_headers, "<", "move_headers.txt" or die "Can't open move_headers: $!\n";

my %headers_for_file;

while (defined(my $line = <$files_and_headers>)) {
    chomp $line;
    my ($file, $header) = split /:/, $line, 2;
    push @{ $headers_for_file{$file} }, $header;
}

# Print the arrays for each file:
foreach my $file (keys %headers_for_file) {
    print "$file: @{ $headers_for_file{$file} }\n";
}
We're letting Perl do a chunk of the work here:
If we add keys to a hash, they're always unique.
If we interpolate an array into a print statement, Perl adds spaces between them.
If we push onto an empty hash element, Perl automatically puts an empty anonymous array in the element and then pushes onto that (a tiny sketch of this follows).
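A minimal sketch of that autovivification behaviour:

my %h;
push @{ $h{missing} }, 'x';    # %h is now ( missing => ['x'] ) with no explicit initialisation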
An alternative to using Data::Dumper is to use Data::Printer:
use Data::Printer;
p $value;
You can also use this to customise the format of the output. E.g. you can have it all in a single line without the indexes (see the documentation for more options):
use Data::Printer {
    index     => 0,
    multiline => 0,
};
p $value;
Also, as a suggestion for getting unique files, put the elements into a hash:
my %unique;
@unique{ @files } = @files;
my @unique_files = sort keys %unique;
Actually, you could even skip that step and put everything into %hash_table in one pass:
my %hash_table;
foreach my $source_parts (@file_list) {
    chomp($source_parts);
    my @parts = split(/:/, $source_parts);
    unshift @{ $hash_table{$parts[0]} }, $parts[1];
}