Combining two csv files together in perl

Hi, I'm very new to Perl and have little knowledge of it, but I'm trying to create a script that combines two .csv files into a new one.
#!/usr/bin/env perl
use strict;
use warnings;
use Text::CSV_XS;
my @rows;
{ # Read the CSV file
my $csv = Text::CSV_XS->new() or die "Cannot use Text::CSV_XS ($!)";
my $file = "file.csv";
open my $fh, '<', $file or die "Cannot open $file ($!)";
while (my $row = $csv->getline($fh)) {
push @rows, $row;
}
$csv->eof or $csv->error_diag();
close $fh or die "Failed to close $file ($!)";
}
{ # Gather the data
foreach my $row (@rows) {
foreach my $col (@{$row}) {
$col = uc($col);
}
print "\n";
}
}
# (over)Write the data
# Needs to be changed to ADD data
{
my $csv = Text::CSV_XS->new({ binary => 1, escape_char => undef })
or die "Cannot use Text::CSV ($!)";
my $file = "output.csv";
open my $fh, '>', $file or die "Cannot open $file ($!)";
$csv->eol("\n");
foreach my $row (@rows) {
$csv->print($fh, \@{$row}) or die "Failed to write $file ($!)";
}
close $fh or die "Failed to close $file ($!)";
}
This is my current code. I know it overwrites the data instead of actually adding it to the new file, but this is as far as I managed to get with the limited time and Perl knowledge I have.
The CSV format of both files is the same:
"Header1";"Header2";"Header3";"Header4";"Header5"
"Data1";"Data2";"Data3";"Data4";"Data5"
"Data1";"Data2";"Data3";"Data4";"Data5"
"Data1";"Data2";"Data3";"Data4";"Data5"
"Data1";"Data2";"Data3";"Data4";"Data5"
"Data1";"Data2";"Data3";"Data4";"Data5"

I believe the issue is here:
open my $fh, '>', $file
or die "Cannot open $file ($!)";
If I remember my Perl properly, the line should read:
open my $fh, '>>', $file
or die "Cannot open $file ($!)";
The >> should open the file handle $fh for append instead of for overwrite.
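A minimal sketch of the difference, using only core Perl. The file names (first.csv, second.csv, combined.csv) and the helper append_rows are hypothetical, and the line-based approach assumes no field contains an embedded newline. It truncates the output once with '>', then appends each input with '>>', copying the header only from the first file:

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical sample inputs, created here so the sketch is self-contained.
my %samples = (
    'first.csv'  => qq{"Header1";"Header2"\n"A1";"A2"\n},
    'second.csv' => qq{"Header1";"Header2"\n"B1";"B2"\n},
);
for my $name (sort keys %samples) {
    open my $fh, '>', $name or die "Cannot open $name ($!)";
    print $fh $samples{$name};
    close $fh or die "Failed to close $name ($!)";
}

my $output = 'combined.csv';    # hypothetical output name

# '>' creates the file or truncates an existing one, so start clean once.
open my $init, '>', $output or die "Cannot open $output ($!)";
close $init or die "Failed to close $output ($!)";

# Hypothetical helper: append one input file to $output.
# The header line is copied only when $keep_header is true.
sub append_rows {
    my ($input, $keep_header) = @_;
    open my $in,  '<',  $input  or die "Cannot open $input ($!)";
    open my $out, '>>', $output or die "Cannot open $output ($!)";    # '>>' appends
    while (my $line = <$in>) {
        next if $. == 1 && !$keep_header;    # $. is the current input line number
        print $out $line;
    }
    close $out or die "Failed to close $output ($!)";
    close $in  or die "Failed to close $input ($!)";
}

append_rows('first.csv',  1);    # keep the header from the first file
append_rows('second.csv', 0);    # skip the duplicate header
```

For CSV data with quoted newlines or other edge cases, a parser such as Text::CSV_XS (as in the question) is the safer choice; the open modes work the same way there.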

You could try something like this:
opendir(hand, "DIRPATH") or die "error $!\n";
@files = readdir(hand);
closedir(hand);
foreach (@files) {
    if (/\.csv$/i) {    # if the filename has .csv at the end
        push(@csvfiles, $_);
    }
}
foreach (@csvfiles) {
    $csvfile = $_;
    open(hanr, "DIRPATH" . $csvfile) or die "error $!\n";    # read handle
    open(hanw, ">>DIRPATH" . "outputfile.csv") or die "error $!\n";    # write handle, opened for append
    @lines = ();
    @lines = <hanr>;
    foreach $line (@lines) {
        chomp $line;
        $count++;
        next unless $count;    # skip the header (first line) of every file after the first
        print hanw $line, "\n";
    }
    $count = -1;
    close(hanw);
    close(hanr);
}

Related

Write old fasta header and new to file

I want to extract the old fasta names, which look something like this:
>Bartonella bibbi
AUUCCGGUUGAUCCUGCCGGAGGCCACUGCUAUCGGGGUCCG
The new headers should look like this:
>Seq1
AUUCCGGUUGAUCCUGCCGGAGGCCACUGCUAUCGGGGUCCG
and so on...
The old name, Bartonella bibbi, should be saved together with the new name Seq1 in a new file, and so on. So I've made a start by looking for lines beginning with > and splitting them to get the old name from the resulting array. I don't know how to continue, because I want two things here: first, to put the new name in place, and second, to write the old name together with the new one to a file. I also want an output file with my sequences and the new names. Any input will help!
#!/usr/bin/perl
use warnings;
use strict;
my $infile = $ARGV[0];
open my $IN, '<', $infile or die "Could not open $infile: $!, $?";
while (my $line = <$IN>) {
if ($line =~ /^>/) {
my @header = split (/\>/, $line);
my $oldfasta = "$header[1]";
}
}
So after some edits, this is the current script:
#!/usr/bin/perl
use warnings;
use strict;
my $infile = $ARGV[0];
open my $IN, '<', $infile or die "Could not open $infile: $!, $?";
my $seqid = 1;
my %id;
while (my $line = <$IN>) {
if ($line =~ /^>/) {
$id{"Seq$seqid "} = $line;
print ">Seq$seqid\n";
$seqid++
} else {
print $line;
}
}
my $outfile = 'output';
open my $OUT, '>', $outfile or die "Could not open $outfile: $!, $?"; # overwrites the file $outfile;
print $OUT %id;
This gives me a file that looks like this:
Seq29 >Sulfophobococcus_zilligii
Seq20 >Pyrococcus_shinkaii
and so on.
They are not in order, how do I sort them and get rid of the > in the species name?
You’re simply not printing anything. Once you add a print statement, it should work.
In addition, it’s unclear what you’re using split for. Just increase a counter for the sequence:
#!/usr/bin/perl
use warnings;
use strict;
my $infile = $ARGV[0];
open my $IN, '<', $infile or die "Could not open $infile: $!, $?";
my $seqid = 1;
while (my $line = <$IN>) {
if ($line =~ /^>/) {
print ">Seq$seqid\n";
$seqid++;
} else {
print $line;
}
}
Simply write the new entries as you create them.
#!/usr/bin/perl
use warnings;
use strict;
my $infile = $ARGV[0];
open my $IN, '<', $infile or die "Could not open $infile: $!, $?";
my $outfile = 'output';
open my $OUT, '>', $outfile or die "Could not open $outfile: $!, $?"; # overwrites the file $outfile;
my $seqid = 1;
while (my $line = <$IN>) {
if ($line =~ /^>(.+)/) {
print $OUT "Seq$seqid\t$1\n";
print ">Seq$seqid\n";
$seqid++;
} else {
print $line;
}
}
I tried to fix the indentation but left the gratuitous variable for the $OUT file name.
If you want to keep the mapping in memory for other reasons (maybe to develop this into a much more complex script) using an array instead of a hash would seem like a natural way to keep the entries sorted; the new label is trivially derivable from the array index.
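As a rough sketch of that array-based idea (the in-memory sample records and the names @old_names, $fasta and $renamed are made up for illustration), each old header is pushed onto an array. The new label for index $i is then just Seq followed by $i + 1, and the mapping comes out in input order with the leading > already removed:

```perl
use strict;
use warnings;

# Hypothetical sample input held in a string; a real script would read $ARGV[0].
my $fasta = ">Bartonella bibbi\nAUUCCGG\n>Pyrococcus shinkaii\nGGCCUUA\n";
open my $IN, '<', \$fasta or die "Could not open in-memory input: $!";

my @old_names;       # index 0 holds the old name of Seq1, index 1 of Seq2, ...
my $renamed = '';    # the renamed FASTA text, collected before printing

while (my $line = <$IN>) {
    if ($line =~ /^>(.+)/) {
        push @old_names, $1;    # $1 excludes the leading '>' and the newline
        $renamed .= '>Seq' . scalar(@old_names) . "\n";
    } else {
        $renamed .= $line;
    }
}
close $IN;

print $renamed;

# The mapping is already in input order, so no sorting is needed.
for my $i (0 .. $#old_names) {
    printf "Seq%d\t%s\n", $i + 1, $old_names[$i];
}
```

A hash would need its keys sorted numerically on the part after Seq; the array avoids that step.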

How to read file in Perl and if it doesn't exist create it?

In Perl, I know this method:
open( my $in, "<", "inputs.txt" );
reads a file but it only does so if the file exists.
Doing the other way, the one with the +:
open( my $in, "+>", "inputs.txt" );
writes a file/truncates if it exists so I don't get the chance to read the file and store it in the program.
How do I read files in Perl considering if the file exists or not?
Okay, I've edited my code but still the file isn't being read. The problem is it doesn't enter the loop. Anything mischievous with my code?
open( my $in, "+>>", "inputs.txt" ) or die "Can't open inputs.txt : $!\n";
while (<$in>) {
print "Here!";
my @subjects = ();
my %information = ();
$information{"name"} = $_;
$information{"studNum"} = <$in>;
$information{"cNum"} = <$in>;
$information{"emailAdd"} = <$in>;
$information{"gwa"} = <$in>;
$information{"subjNum"} = <$in>;
for ( $i = 0; $i < $information{"subjNum"}; $i++ ) {
my %subject = ();
$subject{"courseNum"} = <$in>;
$subject{"courseUnt"} = <$in>;
$subject{"courseGrd"} = <$in>;
push @subjects, \%subject;
}
$information{"subj"} = \@subjects;
push @students, \%information;
}
print "FILE LOADED.\n";
close $in or die "Can't close inputs.txt : $!\n";
Use the appropriate file test operator:
use strict;
use warnings;
use autodie;
my $filename = 'inputs.txt';
unless(-e $filename) {
#Create the file if it doesn't exist
open my $fc, ">", $filename;
close $fc;
}
# Work with the file
open my $fh, "<", $filename;
while( my $line = <$fh> ) {
#...
}
close $fh;
But if the file is new (without contents), the while loop won't be processed. It's easier to read the file only if the test is fine:
if(-e $filename) {
# Work with the file
open my $fh, "<", $filename;
while( my $line = <$fh> ) {
#...
}
close $fh;
}
You can use +>> for read/append; it creates the file if it doesn't exist but doesn't truncate it:
open(my $in,"+>>","inputs.txt");
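One caveat worth illustrating: after opening with +>>, the read position may sit at the end of the file, so a read loop can finish immediately even when the file has content. A small sketch, assuming a hypothetical file name demo_inputs.txt, that rewinds with seek before reading:

```perl
use strict;
use warnings;

my $filename = 'demo_inputs.txt';    # hypothetical file name for this sketch
unlink $filename;                    # start from a clean state for the demo

# '+>>' opens for reading and appending and creates the file if it is missing.
open my $fh, '+>>', $filename or die "Can't open $filename: $!";
print $fh "first line\n";            # writes always go to the end of the file

# Rewind the read position to the start of the file before reading.
seek($fh, 0, 0) or die "Can't seek in $filename: $!";

my @lines = <$fh>;
print "Read ", scalar(@lines), " line(s)\n";
close $fh or die "Can't close $filename: $!";
```

The same seek call placed before the while loop in the question's code should let the loop see the existing contents.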
First check whether the file exists. See the sample code below:
#!/usr/bin/perl
use strict;
use warnings;
my $InputFile = $ARGV[0];
if ( -e $InputFile ) {
print "File Exists!";
open FH, "<$InputFile";
my @Content = <FH>;
open OUT, ">outfile.txt";
print OUT @Content;
close(FH);
close(OUT);
} else {
print "File Do not exists!! Create a new file";
open OUT, ">$InputFile";
print OUT "Hello World";
close(OUT);
}

Text::CSV can't create csv file from csv->print

I am trying to write the csv->print output to a file. The filename I want to use is in $_, and the data from the csv->print command prints correctly on screen. I just can't work out how to get it into a file.
$_ = $ARGV[0];
s/^[^=]*=//;
while (@ARGV) {
my $file = shift;
open my $IN, '<', $file or die $!;
my $html = do { local $/; <$IN> };
$te->parse($html);
}
for my $table ($te->tables) {
#print $_."\n";
open (NEWCSV, '>> '.$_);
print NEWCSV $csv->print(*STDOUT{IO}, $_) for $table->rows;
close (NEWCSV);
}
Thanks
Just open a file for output, and use its filehandle instead of *STDOUT{IO}:
open my $FH, '>', 'file.csv';
$csv->print($FH, $_) for $table->rows;

merging two files using perl keeping the copy of original file in other file

I have two files, A.ini and B.ini, and I want to merge both into A.ini.
Example files:
A.ini contains:
a=123
b=xyx
c=434
B.ini contains:
a=abc
m=shank
n=paul
My output in A.ini should look like this:
a=123abc
b=xyx
c=434
m=shank
n=paul
I want this merging to be done in Perl, and I want to keep a copy of the old A.ini somewhere else so I can still use it.
A command line variant:
perl -lne '
($a, $b) = split /=/;
$v{$a} = $v{$a} ? $v{$a} . $b : $_;
END {
print $v{$_} for sort keys %v
}' A.ini B.ini >NEW.ini
How about:
#!/usr/bin/perl
use strict;
use warnings;
my %out;
my $file = 'path/to/A.ini';
open my $fh, '<', $file or die "unable to open '$file' for reading: $!";
while(<$fh>) {
chomp;
my ($key, $val) = split /=/;
$out{$key} = $val;
}
close $fh;
$file = 'path/to/B.ini';
open $fh, '<', $file or die "unable to open '$file' for reading: $!";
while(<$fh>) {
chomp;
my ($key, $val) = split /=/;
if (exists $out{$key}) {
$out{$key} .= $val;
} else {
$out{$key} = $val;
}
}
close $fh;
$file = 'path/to/A.ini';
open $fh, '>', $file or die "unable to open '$file' for writing: $!";
foreach(keys %out) {
print $fh $_,'=',$out{$_},"\n";
}
close $fh;
The two files to be merged can be read in a single pass and don't need to be treated as separate source files. That allows the use of <> to read all files passed as parameters on the command line.
Keeping a backup copy of A.ini is simply a matter of renaming it before writing the merged data to a new file of the same name.
This program appears to do what you need.
use strict;
use warnings;
my $file_a = $ARGV[0];
my (@keys, %values);
while (<>) {
if (/\A\s*(.+?)\s*=\s*(.+?)\s*\z/) {
push @keys, $1 unless exists $values{$1};
$values{$1} .= $2;
}
}
rename $file_a, "$file_a.bak" or die qq(Unable to rename "$file_a": $!);
open my $fh, '>', $file_a or die qq(Unable to open "$file_a" for output: $!);
printf $fh "%s=%s\n", $_, $values{$_} for @keys;
Output (in A.ini):
a=123abc
b=xyx
c=434
m=shank
n=paul

read input file, match and remove data and write remaining lines to a new file

I am stuck trying to get this to write out the contents of the file. What I am trying to do is open an input file, filter out/remove the matched line and write to a new file. Can someone show me how to do this properly? Thanks.
use strict;
use warnings;
use Text::CSV_XS;
my $csv = Text::CSV_XS->new ({ binary => 1 }) or
die "Cannot use CSV: ".Text::CSV_XS->error_diag ();
open my $fh, "<:encoding(UTF-16LE)", "InputFile.txt" or die "cannot open file: $!";
my @rows;
while (my $row = $csv->getline ($fh)) {
my @lines;
shift @lines if $row->[0] =~ m/Global/;
my $newfile = "NewFile.txt";
open(my $newfh, '>', $newfile) or die "Can't open";
print $newfh @lines;
}
$csv->eof or $csv->error_diag ();
close $fh;
Open the output file outside of the loop. As you read each line, decide if you want to keep it. If yes, write to output file. If not, don't do anything.
Something like the following (untested):
use strict;
use warnings;
use Text::CSV_XS;
my ($input_file, $output_file) = qw(InputFile.txt NewFile.txt);
my $csv = Text::CSV_XS->new ({ binary => 1 })
or die sprintf("Cannot use CSV: %s\n", Text::CSV_XS->error_diag);
open my $infh, "<:encoding(UTF-16LE)", $input_file
or die "Cannot open '$input_file': $!";
open my $outfh, '>', $output_file
or die "Cannot open '$output_file': $!";
while (my $row = $csv->getline($infh)) {
next if $row->[0] =~ m/Global/;
unless ( $csv->print($outfh, $row) ) {
die sprintf("Error writing to '%s': %s",
$output_file,
$csv->error_diag
);
}
}
close $outfh
or die "Cannot close '$output_file': $!";
close $infh
or die "Cannot close '$input_file': $!";
$csv->eof
or die "Processing of '$input_file' terminated prematurely";