reading text file and writing to two dimensional array in perl? [closed] - perl

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 5 years ago.
#!/usr/bin/perl
use strict;
use warnings;
use utf8;
use open ':encoding(UTF-8)', ':std';
use List::Util qw( sum );
my $filename = 'data1.txt';
open(my $fh, '<:encoding(UTF-8)', $filename)
or die "Could not open file '$filename' $!";
while (my $row = <$fh>) {
chomp $row;
print "$row\n";
}
my $filename2 = 'data2.txt';
open(my $fh2, '<:encoding(UTF-8)', $filename2)
or die "Could not open file '$filename2' $!";
while (my $row = <$fh2>) {
chomp $row;
print "$row\n";
}
my @last = ();
my %grades = (
Ahmet => {
quiz1 => 97,
quiz2 => 67,
quiz3 => 93,
},
Su => {
quiz1 => 88,
quiz2 => 82,
quiz3 => 99,
});
my %sum;
for my $name (keys %grades){
$sum{$name} = sum(values %{ $grades{$name} });
}
for my $name (sort { $sum{$a} <=> $sum{$b} } keys %sum){
push @last, "$name: $sum{$name}\n";
}
my %grades2 = (
Bugra => {
quiz1 => 33,
quiz2 => 41,
quiz3 => 59,
},
Lale => {
quiz1 => 79,
quiz2 => 31,
quiz3 => 62,
},
);
my %sum2;
for my $name (keys %grades2){
$sum2{$name} = sum(values %{ $grades2{$name} });
}
for my $name (sort { $sum2{$a} <=> $sum2{$b} } keys %sum2){
push @last, "$name: $sum2{$name}\n";
}
my @last1 = sort { lc($a) cmp lc($b) } @last;
print @last1;
This is my code. I want to read values from a text file, with lines such as (marry 10 65 23), and write them to a two-dimensional array. For now I have filled in the data by hand; after reading the text files, it should look like grade1 and grade2 for data1.txt and data2.txt. I can read the values, but I couldn't write them to a two-dimensional array. The result itself is correct.

I read your question as "How do I populate the hashes %grade1 and %grade2 from the files data1.txt and data2.txt?"
I also assume that your files data1.txt and data2.txt have the following structure (whitespace separated):
marry 10 65 23
john 20 30 40
I suggest writing a function that takes the filename as a parameter and returns a reference to a populated hash:
sub read_grades_from_file
{
my $filename = shift;
my $result = {};
open( my $fh, '<:encoding(UTF-8)', $filename )
or die "Could not open file '$filename' $!\n";
while ( my $row = <$fh> ) {
next unless $row =~ /\S/; # skip empty lines
my ( $name, $quiz1, $quiz2, $quiz3 ) = split( ' ', $row );
$result->{$name} = {
quiz1 => $quiz1,
quiz2 => $quiz2,
quiz3 => $quiz3,
};
}
close($fh);
return $result;
}
The function is used as follows:
my $result = read_grades_from_file('data1.txt'); # returns hashref
my %grade1 = %{$result}; # dereference $result to make it a hash
$result = read_grades_from_file('data2.txt');
my %grade2 = %{$result};
The result of read_grades_from_file is a reference to a hash so it has to be de-referenced and then assigned to %grade. Thus the two steps.
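If the copy into a named hash is not needed, the reference can also be used directly with the arrow operator. The following is only a sketch of that variation: it stores each student's quizzes as an array reference (closer to the "two-dimensional" layout asked about), reads from an in-memory string instead of data1.txt, and the names read_grades and total_for are made up for illustration.

```perl
use strict;
use warnings;
use List::Util qw( sum );

# Assumed sample data standing in for data1.txt.
my $sample = "marry 10 65 23\njohn 20 30 40\n";

# Same idea as read_grades_from_file, but reading from any open filehandle
# and keeping the quizzes of each student in an array reference.
sub read_grades {
    my ($fh) = @_;
    my %grades;
    while ( my $row = <$fh> ) {
        next unless $row =~ /\S/;    # skip empty lines
        my ( $name, @quizzes ) = split ' ', $row;
        $grades{$name} = [@quizzes];
    }
    return \%grades;
}

# Sum one student's quizzes straight through the reference (hypothetical helper).
sub total_for {
    my ( $grades, $name ) = @_;
    return sum( @{ $grades->{$name} } );
}

open( my $fh, '<', \$sample ) or die "Could not open in-memory data: $!";
my $grades = read_grades($fh);
close($fh);

for my $name ( sort keys %$grades ) {
    print "$name: ", total_for( $grades, $name ), "\n";
}
```

A single cell is then reachable as $grades->{john}[1], with no intermediate copy of the hash.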

Perhaps the data structures you are using are overly complex. You probably only need a single %grades hash, for example.
The following will take data from space- or tab-separated records from both input files, ignoring comments and empty lines.
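use List::Util qw( sum );
open my $data1, '<', 'data1.txt' or die "Could not open data1.txt: $!";
open my $data2, '<', 'data2.txt' or die "Could not open data2.txt: $!";
my %grades;
while (1) {
my $row1 = <$data1>;
my $row2 = <$data2>;
last unless (defined $row1 or defined $row2);
# chomp only what was actually read; one file may be shorter than the other
chomp $row1 if defined $row1;
chomp $row2 if defined $row2;
if (defined $row1 and $row1 !~ /(^#|^$)/) {
my ($name, @quizzes) = split /[ \t]/, $row1, 4;
$grades{$name}{'grades1'} = sum(@quizzes);
}
if (defined $row2 and $row2 !~ /(^#|^$)/) {
my ($name, @quizzes) = split /[ \t]/, $row2, 4;
$grades{$name}{'grades2'} = sum(@quizzes);
}
}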
To print to STDOUT, you could try the following.
print "Name\tMarks 1\tMarks 2", $/;
for (keys %grades) {
my $name = $grades{$_};
print $_, "\t", $name->{grades1} || '?', "\t", $name->{grades2} || '?', "\t", $/;
}
With data1.txt as
# Grades
Bugra 33 41 59
Mary 10 65 23
Lale 79 31 62
and data2.txt as
# Grades 2
Bugra 49 32 57
Lale 79 31 62
Peter 21 34 42
the output is shown below.
Name Marks 1 Marks 2
Peter ? 97
Lale 172 172
Bugra 133 138
Mary 98 ?
(A '?' indicates that no record exists for the specified student in one of the two input files.)

Related

How to push different row of values into hashes and compare it with foreach loop

I have two files, I need to do comparison to find out the matching and non-matching data. I got two problems now:
Question 1: one of my hashes can only capture the 2nd row of the 'num', i tried to use
push @{hash1{name1}},$x1,$y1,$x2,$y2
but it is still returning the 2nd row of the 'num'.
File1 :
name foo
num 111 222 333 444
name jack
num 999 111 222 333
num 333 444 555 777
File2:
name jack
num 999 111 222 333
num 333 444 555 777
name foo
num 666 222 333 444
This is my code:
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;
my $input1=$ARGV[0];
my $input2=$ARGV[1];
my %hash1;
my %hash2;
my $name1;
my $name2;
my $x1;
my $x2;
my $y2;
my $y1;
open my $fh1,'<', $input1 or die "Cannot open file : $!\n";
while (<$fh1>)
{
chomp;
if(/^name\s+(\S+)/)
{
$name1 = $1;
}
if(/^num\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+)/)
{
$x1 = $1;
$y1 = $2;
$x2 = $3;
$y2 = $4;
}
$hash1{$name1}=[$x1,$y1,$x2,$y2];
}
close $fh1;
print Dumper (\%hash1);
open my $fh2,'<', $input2 or die "Cannot open file : $!\n";
while (<$fh2>)
{
chomp;
if(/^name\s+(\S+)/)
{
$name2 = $1;
}
if(/^num\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+)/)
{
$x1 = $1;
$y1 = $2;
$x2 = $3;
$y2 = $4;
}
$hash2{$name2}=[$x1,$y1,$x2,$y2];
}
close $fh2;
print Dumper (\%hash2);
My output:
$VAR1 = {
'jack' => [
'333',
'444',
'555',
'777'
],
'foo' => [
'111',
'222',
'333',
'444'
]
};
$VAR1 = {
'jack' => [
'333',
'444',
'555',
'777'
],
'foo' => [
'666',
'222',
'333',
'444'
]
};
My expected Output:
$VAR1 = {
'jack' => [
'999',
'111',
'222',
'333',
'333',
'444',
'555',
'777'
],
'foo' => [
'111',
'222',
'333',
'444'
]
};
$VAR1 = {
'jack' => [
'999',
'111',
'222',
'333',
'333',
'444',
'555',
'777'
],
'foo' => [
'666',
'222',
'333',
'444'
]
};
Question 2: I tried to use this foreach loop to do the matching of keys and values and print out in a table format.
I tried this :
print "Name\tx1\tX1\tY1\tX2\tY2\n"
foreach my $k1(keys %hash1)
{
foreach my $k2 (keys %hash2)
{
if($hash1{$name1} == $hash2{$name2})
{
print "$name1,$x1,$y1,$x2,$y2"
}
}
}
but Im getting :
"my" variable %hash2 masks earlier declaration in same scope at script.pl line 67.
"my" variable %hash1 masks earlier declaration in same scope at script.pl line 69.
"my" variable $name1 masks earlier declaration in same scope at script.pl line 69.
"my" variable %hash2 masks earlier declaration in same statement at script.pl line 69.
"my" variable $name2 masks earlier declaration in same scope at script.pl line 69.
syntax error at script.pl line 65, near "$k1("
Execution of script.pl aborted due to compilation errors.
my desired output for matching :
Name x1 y1 x2 y2
jack 999 111 222 333
333 444 555 777
The one direct error is that you assign to a hash element with $hash2{$name2}=[...], which overwrites whatever was at that key before. Thus your output shows only the second set of numbers for jack. You need to push to that arrayref instead. Some comments on the code are below.
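The difference between assigning and pushing can be seen in isolation. This is a small sketch using jack's two rows from the question; the hash names %overwritten and %accumulated are made up for the illustration.

```perl
use strict;
use warnings;

# The two "num" rows that belong to jack in the question's input.
my @rows = ( [ 999, 111, 222, 333 ], [ 333, 444, 555, 777 ] );

my ( %overwritten, %accumulated );
for my $row (@rows) {
    $overwritten{jack} = [@$row];           # replaces the previous array each time
    push @{ $accumulated{jack} }, @$row;    # appends; the array springs into existence
}

print scalar( @{ $overwritten{jack} } ), "\n";    # only the last row survives
print scalar( @{ $accumulated{jack} } ), "\n";    # both rows are kept
```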
Here is some rudimentary (but working) code. Please note and implement the omitted checks.
use warnings;
use strict;
use feature 'say';
my ($f1, $f2) = @ARGV;
die "Usage: $0 file1 file2\n" if not $f1 or not $f2;
my $ds1 = read_file($f1);
my $ds2 = read_file($f2);
compare_data($ds1, $ds2);
sub compare_data {
my ($ds1, $ds2) = @_;
# Add: check whether one has more keys; work with the longer one
foreach my $k (sort keys %$ds1) {
if (not exists $ds2->{$k}) {
say "key $k does not exist in dataset 2";
next;
}
# Add tests: do both datasets have the same "ref" type here?
# If those are arrayrefs, as expected, are they the same size?
my @data = @{$ds1->{$k}};
foreach my $i (0..$#data) {
if ($data[$i] ne $ds2->{$k}->[$i]) {
say "differ for $k: $data[$i] vs $ds2->{$k}->[$i]";
}
}
}
}
sub read_file {
my ($file) = @_;
open my $fh, '<', $file or die "Can't open $file: $!";
my (%data, $name);
while (<$fh>) {
my @fields = split;
if ($fields[0] eq 'name') {
$name = $fields[1];
next;
}
elsif ($fields[0] eq 'num') {
push @{$data{$name}}, @fields[1..$#fields];
}
}
return \%data;
}
I'm leaving it as an exercise to code the desired format of the printout. The above prints
differ for foo: 111 vs 666
Note comments in code to add tests. As you descend into data structures to compare them you need to check whether they carry the same type of data at each level (see ref) and whether they are of the same size (so you wouldn't try to read past the end of an array). Once you get this kind of work under your belt search for modules for this.
I use string comparison (ne) for the data in the arrayrefs since it's not stated firmly that they are numbers. But if they are, as appears to be the case, change ne to != .
Doing a code review would take us too far, but here are a few remarks
When you catch yourself needing such long list of variables think "collections" and reconsider your choice of data structures for the problem. Note that in the example above I didn't need a single scalar variable for data (I used one for temporary storage of the name)
Picking strings apart with a regex is part and parcel of text analysis -- when suitable. Familiarize yourself with other approaches. At this point see split
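The checks that the code comments ask for (same reference type, same size) could be sketched roughly as follows. The helper name shape_mismatch is hypothetical and not part of the code above; it only handles the array-reference case used in this problem.

```perl
use strict;
use warnings;

# Hypothetical helper: returns an empty string when both values are array
# references of equal length, otherwise a short description of the mismatch.
sub shape_mismatch {
    my ( $x, $y ) = @_;
    return 'type differs: ' . ( ref($x) || 'scalar' ) . ' vs ' . ( ref($y) || 'scalar' )
        if ref($x) ne ref($y);
    return 'not an array reference' if ref($x) ne 'ARRAY';
    return 'size differs: ' . @$x . ' vs ' . @$y if @$x != @$y;
    return '';
}

print shape_mismatch( [ 1, 2 ], [ 1, 2, 3 ] ), "\n";
print shape_mismatch( [ 1, 2 ], 'foo' ), "\n";
```

Inside compare_data, such a check would run on $ds1->{$k} and $ds2->{$k} before the element-by-element loop, skipping the key (with a message) when it returns a non-empty string.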

Using perl to cycle through a list of values (x) and compare to another file with value ranges

I have two files. One file has a list of values like so
NC_SNPStest.txt
250
275
375
The other file has space delimited information. Column one is the first value of a range, Column two has the second value of a range, Column 5 has the name of the range, and Column eight has what acts on that range.
promoterstest.txt
20 100 yaaX F yaaX 5147 5.34 Sigma70 99
200 300 yaaA R yaaAp1 6482 6.54 Sigma70 35
350 400 yaaA R yaaAp2 6498 2.86 Sigma70 51
I am trying to write a script that takes the first line from file 1 and then parses file 2 line by line to see if that value falls in the range between the first two columns.
When the first match is found, I want to print the value from file 1 and then the values in file 2 for columns 5 and 8 from the line with the match. If no match is found in File 2 then just print the value from File 1 and move on.
It seems like it should be a simple enough task but I'm having an issue cycling though both files.
This is what I have written:
#!/usr/bin/perl
use warnings;
use strict;
open my $PromoterFile, '<', 'promoterstest.txt' or die $!;
open my $SNPSFile, '<', 'NC_SNPtest.txt' or die $!;
open (FILE, ">PromoterMatchtest.txt");
while (my $SNPS = <$SNPSFile>) {
chomp ($SNPS);
while (my $Cord = <$PromoterFile>) {
chomp ($Cord);
my @CordFile =split(/\s/, $Cord);
my $Lend = $CordFile[0];
my $Rend = $CordFile[1];
my $Promoter = $CordFile[4];
my $SigmaFactor = $CordFile[7];
foreach $a ($SNPS)
{
if ($a >= $Lend && $a <= $Rend)
{
print FILE "$a\t$CordFile[4]\t$CordFile[7]\n";
}
else
{
print FILE "$a\n";
}
}
}
}
close FILE;
close $PromoterFile;
close $SNPSFile;
exit;
So far my output looks like so:
250
250 yaaAp1 Sigma70
250
Where the first line of file 1 is being called and file 2 is being cycled through. But the else command is being used on each line of file 2 and the script never cycles through the other lines of file 1.
Your problem is that you're not resetting your progress through the second file. You read one line from $SNPSFile and check it against every line in the second file.
But when you start over, you're already at the end of file, so:
while (my $Cord = <$PromoterFile>) {
Doesn't have anything to read.
A quick fix for this would be to add a seek command in there, but that makes for inefficient code. I'd suggest instead reading the promoters file into an array, and referencing that instead.
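For reference, the seek-based quick fix might look like the sketch below. It is not the approach recommended here: the promoters data is re-read once per SNP value. The in-memory strings stand in for the two files, and the variable names are assumptions made for the example.

```perl
use strict;
use warnings;

# In-memory stand-ins for the two files, using the sample data from the question.
my $snps      = "250\n275\n375\n";
my $promoters = "20 100 yaaX F yaaX 5147 5.34 Sigma70 99\n"
              . "200 300 yaaA R yaaAp1 6482 6.54 Sigma70 35\n"
              . "350 400 yaaA R yaaAp2 6498 2.86 Sigma70 51\n";

open my $snps_fh,     '<', \$snps      or die $!;
open my $promoter_fh, '<', \$promoters or die $!;

my @report;
while ( my $value = <$snps_fh> ) {
    chomp $value;
    seek( $promoter_fh, 0, 0 ) or die "seek failed: $!";    # rewind before every pass
    my $line = $value;                                       # default: no match found
    while ( my $cord = <$promoter_fh> ) {
        my @f = split ' ', $cord;
        if ( $value >= $f[0] && $value <= $f[1] ) {
            $line = join "\t", $value, $f[4], $f[7];
            last;                                            # stop at the first match
        }
    }
    push @report, $line;
}
print "$_\n" for @report;
```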
Here's a first draft rewrite that may help.
#!/usr/bin/perl
use warnings;
use strict;
use Data::Dumper;
open my $PromoterFile, '<', 'promoterstest.txt' or die $!;
open my $SNPSFile, '<', 'NC_SNPtest.txt' or die $!;
open my $output, ">", "PromoterMatchtest.txt" or die $!;
my @data;
while (<$PromoterFile>) {
chomp;
my @CordFile = split;
my $Lend = $CordFile[0];
my $Rend = $CordFile[1];
my $Promoter = $CordFile[4];
my $SigmaFactor = $CordFile[7];
push(
@data,
{ lend => $CordFile[0],
rend => $CordFile[1],
promoter => $CordFile[4],
sigmafactor => $CordFile[7]
}
);
}
print Dumper \@data;
foreach my $value (<$SNPSFile>) {
chomp $value;
my $found = 0;
foreach my $element (@data) {
if ( $value >= $element->{lend}
and $value <= $element->{rend} )
{
#print "Found $value\n";
print {$output} join( "\t",
$value, $element->{promoter}, $element->{sigmafactor} ),
"\n";
$found++;
last;
}
}
if ( not $found ) {
print {$output} $value,"\n";
}
}
close $output;
close $PromoterFile;
close $SNPSFile;
First - we open file2, read in the stuff in it to an array of hashes. (If any of the elements there are unique, we could key off that instead.)
Then we read through SNPSfile one line at a time, looking for each key - printing it if it exists (at least once, on the first hit) and printing just the key if it doesn't.
This generates the output:
250 yaaAp1 Sigma70
275 yaaAp1 Sigma70
375 yaaAp2 Sigma70
Was that what you were aiming for?
Aside from that, the 'Dumper' statement outputs the content of @data as follows:
$VAR1 = [
{
'sigmafactor' => 'Sigma70',
'promoter' => 'yaaX',
'lend' => '20',
'rend' => '100'
},
{
'sigmafactor' => 'Sigma70',
'promoter' => 'yaaAp1',
'rend' => '300',
'lend' => '200'
},
{
'promoter' => 'yaaAp2',
'sigmafactor' => 'Sigma70',
'rend' => '400',
'lend' => '350'
}
];
Here's my take on a programming solution. It's important to
Use lexical file handles and the three-parameter form of open
Keep to lower-case letters, digits and underscores for local variables
I have also used the autodie pragma to remove the need to test the status of open explicitly, and the first function from the core library List::Util to make the code clearer and more concise
use strict;
use warnings;
use 5.010;
use autodie;
use List::Util 'first';
my @promoters;
{
open my $fh, '<', 'promoterstest.txt';
while ( <$fh> ) {
my @fields = split;
push @promoters, [ @fields[0,1,4,7] ];
}
}
open my $fh, '<', 'NC_SNPStest.txt';
open my $out_fh, '>', 'PromoterMatchtest.txt';
select $out_fh;
while ( <$fh> ) {
my ($num) = split;
my $match = first { $num >= $_->[0] and $num <= $_->[1] } @promoters;
if ( $match ) {
print join("\t", $num, @{$match}[2,3]), "\n";
}
else {
print $num, "\n";
}
}
output
250 yaaAp1 Sigma70
275 yaaAp1 Sigma70
375 yaaAp2 Sigma70

Parsing out text from string

I have a tab-delimited file1:
20 50 80 110
520 590 700 770
410 440 20 50
300 340 410 440
read and put them into an array:
while(<INPUT>)
{
chomp;
push @inputarray, $_;
}
Now I'm looping through another file2:
20, 410, 700
80, 520
300
For each number on each line of file2, I want to search @inputarray for the number. If it exists, I want to grab the corresponding number that follows. For instance, for number 20, I want to grab the number 50. I assume that they are still separated by a tab in the string that exists as an array element in @inputarray.
while(my $line = <INPUT2>)
{
chomp $line;
my @linearray = split("\t", $line);
foreach my $start (@linearray)
{
if (grep ($start, @inputarray))
{
#want to grab the corresponding number
}
}
}
Once grep finds it, I don't know how to grab that array element to find the position of the number and extract the corresponding number, perhaps with the substr function. How do I grab the array element that grep found?
A desired output would be:
line1:
20 50
410 440
700 770
line2:
80 110
520 590
line3:
300 340
IMHO, it would be best to store the numbers from file1 in a hash. Referring to the example content of file1 that you provided above, you could have something like the following:
{
'20' => '50',
'80' => '110',
'520'=> '590',
'700'=> '770',
'410'=> '440',
'20' => '50',
'300'=> '340',
'410' => '440'
}
A sample piece of code will be like
my %inputarray;
while(<INPUT>)
{
my @numbers = split;    # split $_ on whitespace
my $length = scalar @numbers;
# walk the list two at a time: a key, then the number that follows it
for (my $i = 0; $i + 1 < $length; $i += 2)
{
$inputarray{$numbers[$i]} = $numbers[$i+1];
}
}
A demonstration of the above loop
index: 0 1 2 3
numbers: 20 50 80 110
first iteration: $i=0
$inputarray{$numbers[0]} = $numbers[1];
$i = 2; #$i += 2;
second iteration: $i=2
$inputarray{$numbers[2]} = $numbers[3];
And then while parsing file2, you just need to treat the number as the key of %inputarray.
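The lookup step for file2 could be sketched like this, assuming the hash has already been filled with the pairs from file1. The helper name pairs_for_line is made up for the example, and the lines of file2 are given inline rather than read from a filehandle.

```perl
use strict;
use warnings;

# Pairs from file1 in the question, assumed to be already loaded as start => end.
my %inputarray = ( 20 => 50, 80 => 110, 520 => 590, 700 => 770, 410 => 440, 300 => 340 );

# Hypothetical helper: turn one comma-separated line of file2 into
# "start end" strings, silently skipping numbers that are not keys.
sub pairs_for_line {
    my ($line) = @_;
    chomp $line;
    return map  { "$_ $inputarray{$_}" }
           grep { exists $inputarray{$_} }
           split /\s*,\s*/, $line;
}

my $lineno = 0;
for my $line ( "20, 410, 700\n", "80, 520\n", "300\n" ) {
    $lineno++;
    print "line$lineno:\n";
    print "$_\n" for pairs_for_line($line);
}
```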
I believe this gets you close to what you want.
#!/usr/bin/perl -w
my %follows;
open my $file1, "<", $ARGV[0] or die "could not open $ARGV[0]: $!\n";
while (<$file1>)
{
chomp;
my $prev = undef;
foreach my $curr ( split /\s+/ )
{
$follows{$prev} = $curr if defined $prev;    # defined, so that a value of 0 still counts
$prev = $curr;
}
}
close $file1;
open my $file2, "<", $ARGV[1] or die "could not open $ARGV[1]: $!\n";
my $lineno = 1;
while (<$file2>)
{
chomp;
print "line $lineno\n";
$lineno++;
foreach my $val ( split /,\s+/, $_ )
{
print $val, " ", ($follows{$val} // "no match"), "\n";
}
print "\n";
}
If you only want to consider numbers from file1 in pairs, as opposed to seeing which numbers follow what other numbers without taking pair boundaries into account, then you need to change the logic in the first while loop slightly.
#!/usr/bin/perl -w
my %follows;
open my $file1, "<", $ARGV[0] or die "could not open $ARGV[0]: $!\n";
while (<$file1>)
{
chomp;
my $line = $_;
while ( $line =~ s/(\S+)\s+(\S+)\s*// )
{
$follows{$1} = $2;
}
}
close $file1;
open my $file2, "<", $ARGV[1] or die "could not open $ARGV[1]: $!\n";
my $lineno = 1;
while (<$file2>)
{
chomp;
print "line $lineno\n";
$lineno++;
foreach my $val ( split /,\s+/, $_ )
{
print $val, " ", ($follows{$val} // "no match"), "\n";
}
print "\n";
}
If you want to read the input once but check for numbers a lot, you might be better off splitting each input line into individual numbers and adding each number as a key to a hash, with the following number as its value. That makes reading slower and takes more memory, but the second part, where you check for following numbers, will be a breeze thanks to exists and the nature of hashes.
If I understood your question correctly, you could use just one big hash. That is of course assuming that every number is always followed by the same number.

printing hash values in new line using tie

I have a hash with few keys and each key has 20 values.
%test={
a=> 10 14 34 56 ....
b=> 56 67 89 66 ...
..
}
@values= {a,b,..}
I want to tie values from this hash to another file as shown below
my input file.txt
ID
ID
ID
...
expected file.txt
ID ,10 ,56
ID ,14, 67
ID ,34, 89
ID ,56, 66
..
My code right now ties the all the values to the first line of my file. please help formatting it.
my $match = "ID";
tie my @lines, 'Tie::File', 'file.txt' or die "failed : $!";
for my $line (@lines) {
while ( $line =~ /^($match.*)/ ) {
$line = $1 . "," . join ',',@test{@values};
}
}
untie @lines;
right now my output is
file.txt
ID ,10 ,14, 34, 56,... 56, 67, 89, 66....
ID
ID
ID
I'm a bit confused by your question...
You have some template file that only contains ID at the beginning of (n) lines?
And you want to iterate over each $key by $test->{$key}[$line_count]?
Something seems fishy(I think you must be leaving something out) here. There's going to be quite a few ways to go wrong with this design...
Anyways, I think this is what you're going for:
my $match = "ID";
my $test = {
a => [ qw(1 3 5) ],
b => [ qw(2 4 6) ],
};
tie my @lines, 'Tie::File', 'file.txt' or die "failed : $!";
my $i = 0;
for my $line (@lines) {
if( $line =~ /^($match.*)/ ) {
my @stuff = ();
for my $key ( keys %$test ) {
push @stuff, $test->{$key}[$i];
}
$line = $1 . ", " . join(', ', @stuff);
$i++;
}
}
untie @lines;
Assuming that this is what you have/want:
$ cat file.txt
ID
ID
ID
$ test.pl
$ !cat
cat file.txt
ID, 1, 2
ID, 3, 4
ID, 5, 6
Do you simply want
my %test = (
a => [ 10, 14, 34, 56, ... ],
b => [ 56, 67, 89, 66, ... ],
);
for (0..$#{ $test{a} }) {
print(join(',', 'ID', $test{a}[$_], $test{b}[$_]), "\n");
}
You could write to a file instead of STDOUT by creating the file using
open(my $fh, '>', 'file.txt')
or die("Can't create file.txt: $!\n");
and then using
print($fh ...);
but it's better to let the user redirect the output to the file using >file.txt.
Here is my take, although the tie seems superfluous to me:
use strict;
use warnings;
use Tie::File;
my %test=(
a=> [qw(10 14 34 56)],
b=> [qw(56 67 89 66)]
);
my @values= qw(a b);
my $match = "ID";
tie my @lines, 'Tie::File', 'file.txt' or die "failed : $!";
my $i = 0;
for my $line (@lines) {
if ( $line =~ /^($match.*)/ ) {
$line = $1 . "," . join(',', map { $test{$_}->[$i]} @values );
$i++;
}
}
untie @lines;
Output (file.txt):
ID,10,56
ID,14,67
ID,34,89
ID,56,66

merging two files with similar columns

I have a two tab separated files that I need to align together. for example:
File 1:      File 2:
AAA 123      BBB 345
BBB 345      CCC 333
CCC 333      DDD 444
(These are large files, potentially thousands of lines!)
What I would like to do is to have the output look like this:
AAA 123
BBB 345      BBB 345
CCC 333      CCC 333
             DDD 444
Preferably I would like to do this in Perl, but I'm not sure how. Any help would be greatly appreciated.
If it's just about making a data structure, this can be quite easy.
#!/usr/bin/env perl
# usage: script.pl file1 file2 ...
use strict;
use warnings;
my %data;
while (<>) {
chomp;
my ($key, $value) = split;
push @{$data{$key}}, $value;
}
use Data::Dumper;
print Dumper \%data;
You can then output in any format you like. If it's really about using the files exactly as they are, then it's a little bit more tricky.
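One way to get from such a structure to the aligned layout in the question is to remember which file each value came from. The sketch below is a variation on the code above rather than the same code: it uses inline sample rows instead of reading the files, and the names %seen and @lines are made up for the example.

```perl
use strict;
use warnings;

# Sample rows from the question; a real script would read them from the two files.
my @file1 = ( "AAA\t123", "BBB\t345", "CCC\t333" );
my @file2 = ( "BBB\t345", "CCC\t333", "DDD\t444" );

# $seen{$key}[0] holds the value from file 1, $seen{$key}[1] the one from file 2.
my %seen;
my $index = 0;
for my $rows ( \@file1, \@file2 ) {
    for my $row (@$rows) {
        my ( $key, $value ) = split ' ', $row;
        $seen{$key}[$index] = $value;
    }
    $index++;
}

my @lines;
for my $key ( sort keys %seen ) {
    my ( $v1, $v2 ) = @{ $seen{$key} }[ 0, 1 ];
    push @lines, join "\t",
        ( defined $v1 ? ( $key, $v1 ) : ( '', '' ) ),    # empty columns keep file 2 aligned
        ( defined $v2 ? ( $key, $v2 ) : () );
}
print "$_\n" for @lines;
```

This holds every key in memory, which is fine for thousands of lines; it also does not require the inputs to be sorted.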
Assuming the files are sorted,
sub get {
my ($fh) = @_;
my $line = <$fh>;
return () if !defined($line);
return split(' ', $line);
}
my ($key1, $val1) = get($fh1);
my ($key2, $val2) = get($fh2);
while (defined($key1) && defined($key2)) {
if ($key1 lt $key2) {
print(join("\t", $key1, $val1), "\n");
($key1, $val1) = get($fh1);
}
elsif ($key1 gt $key2) {
print(join("\t", '', '', $key2, $val2), "\n");
($key2, $val2) = get($fh2);
}
else {
print(join("\t", $key1, $val1, $key2, $val2), "\n");
($key1, $val1) = get($fh1);
($key2, $val2) = get($fh2);
}
}
while (defined($key1)) {
print(join("\t", $key1, $val1), "\n");
($key1, $val1) = get($fh1);
}
while (defined($key2)) {
print(join("\t", '', '', $key2, $val2), "\n");
($key2, $val2) = get($fh2);
}
Similar to Joel Berger's answer, but this approach allows you to keep track of whether each file did or did not contain a given key:
my %data;
while (my $line = <>){
chomp $line;
my ($k) = $line =~ /^(\S+)/;
$data{$k}{line} = $line;
$data{$k}{$ARGV} = 1;
}
use Data::Dumper;
print Dumper(\%data);
Output:
$VAR1 = {
'CCC' => {
'other.dat' => 1,
'data.dat' => 1,
'line' => 'CCC 333'
},
'BBB' => {
'other.dat' => 1,
'data.dat' => 1,
'line' => 'BBB 345'
},
'DDD' => {
'other.dat' => 1,
'line' => 'DDD 444'
},
'AAA' => {
'data.dat' => 1,
'line' => 'AAA 123'
}
};
As ikegami mentioned, it assumes that the files' contents are arranged as shown in your example.
use strict;
use warnings;
open my $file1, '<file1.txt' or die $!;
open my $file2, '<file2.txt' or die $!;
my $file1_line = <$file1>;
print $file1_line;
while ( my $file2_line = <$file2> ) {
if( defined( $file1_line = <$file1> ) ) {
chomp $file1_line;
print $file1_line;
}
my $tabs = $file1_line ? "\t" : "\t\t";
print "$tabs$file2_line";
}
close $file1;
close $file2;
Reviewing your example, you show some identical key/value pairs in both files. Given this, it looks like you want to show the pair(s) unique to file 1, unique to file 2, and show the common pairs. If this is the case (and you're not trying to match the files' pairs by either keys or values), you can use List::Compare:
use strict;
use warnings;
use List::Compare;
open my $file1, '<file1.txt' or die $!;
my @file1 = <$file1>;
close $file1;
open my $file2, '<file2.txt' or die $!;
my @file2 = <$file2>;
close $file2;
my $lc = List::Compare->new(\@file1, \@file2);
my @file1Only = $lc->get_Lonly; # L(eft array)only
for(@file1Only) { print }
my @bothFiles = $lc->get_intersection;
for(@bothFiles) { chomp; print "$_\t$_\n" }
my @file2Only = $lc->get_Ronly; # R(ight array)only
for(@file2Only) { print "\t\t$_" }