I want to write multiple files from one file without using an array, to reduce complexity - perl

I want to write multiple files from one file (getting the latest data every time) without using an array, to reduce complexity. I already tried it using an array, but when the amount of data is high it slows down the process.
Kindly give me a hint on how to reduce the complexity of the program.
Input: read a text file from a directory.
Output:
File1.pl - 1 2 3 4 5
File2.pl - 6 7 8 9 10
File3.pl - 11 12 13 14 15
File4.pl - 16 17 18 19 20
I did this using an array:
use feature 'state';
open (DATA, "<", "e:/today.txt");
@array = <DATA>;
$sizeofarray = scalar @array;
print "Total no. of lines in file is :$sizeofarray";
$count = 1;
while ($count <= $sizeofarray)
{
    open($fh, '>', "E:/e$count.txt");
    print $fh "@array[$count-1..($count+3)]\n";
    $count += 5;
}

Store lines in a small buffer, and every fifth line open a new file and write the buffer to it.
use warnings;
use strict;
use feature 'say';

my $infile = shift || 'e:/today.txt';
open my $fh_in, '<', $infile or die "Can't open $infile: $!";

my ($fh_out, @buf);
while (<$fh_in>) {
    push @buf, $_;
    if ($. % 5 == 0) {
        my $file = 'e' . (int $./5) . '.txt';
        open $fh_out, '>', $file or do {
            warn "Can't open $file: $!";
            next;
        };
        print $fh_out $_ for @buf;
        @buf = ();
    }
}

# Write what's left over, if any, after the last batch of five
if (@buf) {
    my $file = 'e' . ( int($./5) + 1 ) . '.txt';
    open $fh_out, '>', $file or die "Can't open $file: $!";
    print $fh_out $_ for @buf;
}

As I observed from your code, you can try this:
use warnings;
use strict;

open (my $fh, "<", "today.txt") or die "Error opening $!";
my $count = 1;
while (my $line = <$fh>)
{
    open my $wh, '>', "e$count.txt" or die "Error creating $!";
    print $wh $line;
    for (1..4) {
        if (my $v = scalar <$fh>) {
            print $wh $v;
        }
        else {
            last;
        }
    }
    $count++;
}

Related

How to split a file of 5000 lines into several files of 200 lines each with Perl?

I have a file of 5000 lines and I want several files of 200 lines each, and I tried this:
#!/usr/bin/perl
use strict;
use warnings;

my $targetfile = '200_lines.txt';
my $filename = '5000_lines.txt';
open ( my $DATA, '<', $filename ) or die "Could not open file '$filename': $!";
while ( my $line = <$DATA> ) {
    my $counter++;
    open (my $FILE, '>>', $targetfile ) or die "Could not open file '$targetfile': $!";
    print $FILE $line;
    close $FILE;
    if ( $counter % 200 == 0 ) {
        $targetfile =~ s/200/$counter/;
    }
}
My $counter variable is still at 1, and I don't know why. So I got just one file, named 200_lines.txt, with 5000 lines.
You can use sprintf to generate new filenames after every 200 lines, and use $. to keep track of the line number in the file.
The script below will split the larger file into smaller files of 200 lines each.
#!/usr/bin/perl
use strict;
use warnings;

open my $rfh, '<', 'file_5000' or die "unable to open file : $! \n";
my $filecount = 0;
my $wfh;
while (<$rfh>) {
    if (($. - 1) % 200 == 0) {
        close($wfh) if $wfh;
        open($wfh, '>', sprintf("file%02d", ++$filecount)) or die $!;
    }
    print $wfh $_;
}
close($rfh);
close($rfh);
$counter is scoped within your while block, so it is reset on each iteration.
You don't really need a counter at all, as you can test $. for the current line number.
Also, don't call your filehandle $DATA; there's a special filehandle called <DATA>.
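If you do want an explicit counter, declare it once before the loop so its value persists across iterations; a minimal sketch, with $input standing in for your read handle:

my $counter = 0;                 # declared once, outside the loop
while ( my $line = <$input> ) {
    $counter++;                  # now it accumulates across iterations
    # ... process $line ...
}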
How about:
#!/usr/bin/perl
use strict;
use warnings;

open( my $input, '<', '5000_lines.txt' ) or die $!;
open( my $output, '>', '200_lines.txt' ) or die $!;

while ( my $line = <$input> ) {
    unless ( $. % 200 ) {
        close($output);
        open( $output, '>', int( $. / 100 + 2 ) . "00_lines.txt" ) or die $!;
    }
    print {$output} $line;
}
close($input);
close($output);
This creates files:
200_lines.txt
400_lines.txt
600_lines.txt
etc.
You have a bunch of errors in your code.
#!/usr/bin/perl
use strict;
use warnings;

# Creating a file with 5000 lines
my $filename = '5000_lines.txt';
open ( my $DATA, '>', $filename ) or die "Could not open file '$filename': $!";
for (my $i = 0; $i < 5000; $i++){
    print $DATA "$i\n";
}
close ( $DATA );

my $targetfile = '200_lines.txt';
open ( $DATA, '<', $filename ) or die "Could not open file '$filename': $!";
my $counter = 0;
my $num = 0;
my $flag = 1;
while ( my $line = <$DATA> ) {
    if ($flag == 1){
        open (FILE, '>', $targetfile . '_' . $num ) or die "Could not open file '${targetfile}_$num': $!";
        $flag = 0;
    }
    print FILE $line;
    $counter++;
    if ( $counter % 200 == 0 ){
        $num = $counter / 200;
        close FILE;
        $flag = 1;
    }
}
close (FILE) unless $flag;
It will break the large file into smaller chunks of 200 lines each.

How can I merge some columns of two files using perl?

I want to merge the first column of input1.txt and the third column of input2.txt. How can I do it? My code doesn't do what I want.
Input1:
1 6
2 7
3 8
4 9
Input2:
a 4 8
b 6 7
c 3 4
d 2 6
Requested output:
1 8
2 7
3 4
4 6
My code:
#!/usr/bin/perl
use strict;
use warnings;

open my $input1, '<', "input1.txt" or die qq{Failed to open "input1.txt" for reading: $!};
open my $input2, '<', "input2.txt" or die qq{Failed to open "input2.txt" for reading: $!};
open my $outfile, '>', "outfile.txt" or die qq{Failed to open "outfile.txt" for writing: $!};

while (<$input1>)
{
    my @columns1 = split;
    print $outfile join("\t", $columns1[0], "\n");
}
while (<$input2>)
{
    my @columns2 = split;
    print $outfile join("\t", $columns2[2], "\n");
}
close($input1);
close($input2);
close($outfile);
Another way to get the requested output is to use one while loop instead of two:
mod.pl
#!/usr/bin/perl
use strict;
use warnings;

open my $input1, '<', "input1.txt" or die qq{Failed to open "input1.txt" for reading: $!};
open my $input2, '<', "input2.txt" or die qq{Failed to open "input2.txt" for reading: $!};
open my $outfile, '>', "outfile.txt" or die qq{Failed to open "outfile.txt" for writing: $!};

while (my $l1 = <$input1>) {
    my $l2 = <$input2>;
    chomp $l1;
    chomp $l2;
    my @columns1 = split(/ /, $l1);
    my @columns2 = split(/ /, $l2);
    print $outfile join("\t", $columns1[1-1], $columns2[3-1]), "\n";
}
close($input1);
close($input2);
close($outfile);
$ perl mod.pl
$ cat outfile.txt
1 8
2 7
3 4
4 6
Do this:
$filename1 = $ARGV[0];   # take input1.txt as the first argument
$filename2 = $ARGV[1];   # take input2.txt as the second argument

my @data1;
my @column1;
open(INPUT_FILE, $filename1)
    or die "Couldn't open $filename1!";
while (<INPUT_FILE>) {
    my $currentLine = $_;              # read the input file one line at a time
    @data1 = split " ", $currentLine;  # split the line on whitespace
    $firstcolumn = $data1[0];          # grab the first column's data
    push @column1, $firstcolumn;       # push it onto an array
}

my @data2;
my @column3;
open(INPUT_FILE, $filename2)
    or die "Couldn't open $filename2!";
while (<INPUT_FILE>) {
    my $currentLine = $_;
    @data2 = split " ", $currentLine;
    $thirdcolumn = $data2[2];          # grab the third column's data
    push @column3, $thirdcolumn;
}

$size = @column1;
open (OUTPUTFILE, '>>outfile.txt');
for ($i = 0; $i < $size; $i++) {
    print OUTPUTFILE "$column1[$i] $column3[$i]\n";   # write each merged row into outfile.txt
}
close(INPUT_FILE);
close (OUTPUTFILE);
When you run your Perl program on the command line, do:
perl yourprogram.pl input1.txt input2.txt
and it should work; the merged columns are written to outfile.txt. I tried the program, opened outfile.txt, and your requested output is in there.
Your code prints the two files serially, but you need to read them in parallel:
#!/usr/bin/perl
use strict;
use warnings;

open my $input1, '<', "input1.txt" or die qq{Failed to open "input1.txt" for reading: $!};
open my $input2, '<', "input2.txt" or die qq{Failed to open "input2.txt" for reading: $!};
open my $outfile, '>', "outfile.txt" or die qq{Failed to open "outfile.txt" for writing: $!};

my ($line1, $line2);
while (1)
{
    $line1 = <$input1> || '';
    $line2 = <$input2> || '';
    last if !$line1 && !$line2;   # stop once both files are exhausted
    my @columns1 = split ' ', $line1;
    my @columns2 = split ' ', $line2;
    print $outfile join("\t", $columns1[0], $columns2[2]), "\n";
}
close($input1);
close($input2);
close($outfile);
It doesn't have to be this complicated. Read the first file's first column into an array, then print it along with the third field of the second file. Unless the files have different numbers of rows, this should work just fine.
perl -lane'
    BEGIN { $x = pop; @col1 = map { (split)[0] } <>; @ARGV = $x }
    print join " ", $col1[$.-1], $F[-1]
' input1 input2
1 8
2 7
3 4
4 6

perl: match lines from one file to another file, then output the current line and the next line to a new file [closed]

Could any of you modify the code so that the sequence names in file 1 are searched for within file 2, and, if there is a match, the matching line and the line after it are copied to an outfile? Right now the code copies only the matched titles to the outfile, not the following line, which is the sequence. Thanks.
for example:
FILE 1 :
SEQUENCE 1 NAME
SEQUENCE 2 NAME
SEQUENCE 3 NAME
FILE 2:
SEQUENCE 1 NAME
AGTCAGTCAGTCAGTCAGTC
SEQUENCE 2 NAME
AAGGGTTTTCCCCCCAAAAA
SEQUENCE 3 NAME
GGGGTTTTTTTTTTAAAAAC
SEQUENCE 4 NAME
AAGTCCCCCCCCCCAAGGTT
etc.
OUTFILE:
SEQUENCE 1 NAME
AGTCAGTCAGTCAGTCAGTC
SEQUENCE 2 NAME
AAGGGTTTTCCCCCCAAAAA
SEQUENCE 3 NAME
GGGGTTTTTTTTTTAAAAAC
code:
use strict;
use warnings;

my $f1 = 'FILE1.fasta';
open FILE1, "$f1" or die "Could not open file \n";
my $f2 = 'FILE2.fasta';
open FILE2, "$f2" or die "Could not open file \n";
my $outfile = $ARGV[1];

my @outlines;
my $n = 0;
foreach (<FILE1>) {
    my $y = 0;
    my $outer_text = $_;
    seek(FILE2, 0, 0);
    foreach (<FILE2>) {
        my $inner_text = $_;
        if ($outer_text eq $inner_text) {
            print "$outer_text\n";
            push(@outlines, $outer_text);
            $n++;
        }
    }
}
open (OUTFILE, ">sequences.fasta") or die "Cannot open $outfile \n";
print OUTFILE @outlines;
close OUTFILE;
For a very large FILE1, the %seen hash could be tied to some DBM storage:
use strict;
use warnings;

my $f1 = 'FILE1.fasta';
open FILE1, "<", $f1 or die $!;
my $f2 = 'FILE2.fasta';
open FILE2, "<", $f2 or die $!;
# my $outfile = $ARGV[1];
open OUTFILE, ">", "sequences.fasta" or die $!;

my %seen;
while (<FILE1>) {
    $seen{$_} = 1;
}
while (<FILE2>) {
    my $next_line = <FILE2>;
    if ($seen{$_}) {
        print OUTFILE $_, $next_line;
    }
}
close OUTFILE;
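A minimal sketch of that tied variant, assuming the DB_File module that ships with most Perl builds is available; the file name seen.db is arbitrary:

use DB_File;

# Tie %seen to an on-disk hash so the keys need not fit in memory.
tie my %seen, 'DB_File', 'seen.db'
    or die "Can't tie seen.db: $!";

while (<FILE1>) {
    $seen{$_} = 1;    # written to the DBM file rather than held in RAM
}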
I would put the contents of file 2 into a hash, then check if each record from file 1 was in the hash:
#!perl
use strict;
use warnings;

my $f2 = 'FILE2.fasta';
open FILE2, "$f2" or die "Could not open file \n";

my $k;
my $v;
my %hash;
while (defined($k = <FILE2>)) {
    chomp $k;
    $v = <FILE2>;
    chomp $v;
    $hash{$k} = $v;
}

my $f1 = 'FILE1.fasta';
open FILE1, "$f1" or die "Could not open file \n";
open (OUTFILE, ">sequences.fasta") or die "Cannot open sequences.fasta\n";
while (<FILE1>) {
    chomp;
    if (exists($hash{$_})) {
        print OUTFILE "$_\n";
        print OUTFILE "$hash{$_}\n";
    }
}
close OUTFILE;

merging two files using perl, keeping a copy of the original file in another file

I have two files, A.ini and B.ini, and I want to merge them into A.ini.
Examples of the files:
A.ini:
a=123
b=xyx
c=434
B.ini contains:
a=abc
m=shank
n=paul
My output in A.ini should be like:
a=123abc
b=xyx
c=434
m=shank
n=paul
I want this merging to be done in Perl, and I want to keep a copy of the old A.ini file somewhere else so I can still use the old copy.
A command line variant:
perl -lne '
    ($a, $b) = split /=/;
    $v{$a} = $v{$a} ? $v{$a} . $b : $_;
    END {
        print $v{$_} for sort keys %v
    }' A.ini B.ini >NEW.ini
How about:
#!/usr/bin/perl
use strict;
use warnings;

my %out;

my $file = 'path/to/A.ini';
open my $fh, '<', $file or die "unable to open '$file' for reading: $!";
while (<$fh>) {
    chomp;
    my ($key, $val) = split /=/;
    $out{$key} = $val;
}
close $fh;

$file = 'path/to/B.ini';
open $fh, '<', $file or die "unable to open '$file' for reading: $!";
while (<$fh>) {
    chomp;
    my ($key, $val) = split /=/;
    if (exists $out{$key}) {
        $out{$key} .= $val;
    } else {
        $out{$key} = $val;
    }
}
close $fh;

$file = 'path/to/A.ini';
open $fh, '>', $file or die "unable to open '$file' for writing: $!";
foreach (keys %out) {
    print $fh $_, '=', $out{$_}, "\n";
}
close $fh;
The two files to be merged can be read in a single pass and don't need to be treated as separate source files. That allows the use of <> to read all files passed as parameters on the command line.
Keeping a backup copy of A.ini is simply a matter of renaming it before writing the merged data to a new file of the same name.
This program appears to do what you need.
use strict;
use warnings;

my $file_a = $ARGV[0];
my (@keys, %values);

while (<>) {
    if (/\A\s*(.+?)\s*=\s*(.+?)\s*\z/) {
        push @keys, $1 unless exists $values{$1};
        $values{$1} .= $2;
    }
}

rename $file_a, "$file_a.bak" or die qq(Unable to rename "$file_a": $!);
open my $fh, '>', $file_a or die qq(Unable to open "$file_a" for output: $!);
printf $fh "%s=%s\n", $_, $values{$_} for @keys;
Output (in A.ini):
a=123abc
b=xyx
c=434
m=shank
n=paul

How to delete the last 5 lines of a file

I have several commands printing text to a file using perl. During these print commands I have an if statement which should delete the last 5 lines of the file I am currently writing to if the statement is true. The number of lines to delete will always be 5.
if ($exists == 0) {
    print(OUTPUT ???) # this should remove the last 5 lines
}
You can use Tie::File:
use Tie::File;

tie my @array, 'Tie::File', $filename or die $!;

if ($exists == 0) {
    $#array -= 5;
}
You can use the same array when printing, but use push instead:
push @array, "line of text";
$ tac file | perl -ne 'print unless 1 .. 5' | tac > file.tailchopped
Only obvious ways I can think of:
1. Lock the file, scan backwards to find a position, and use truncate.
2. Don't print to the file directly; go through a buffer that's at least 5 lines long, and trim the buffer.
3. Print a marker that means "ignore the last five lines".
4. Process all your files before reading them, with a buffer as in #2.
All are pretty fiddly, but that's the nature of flat files I'm afraid.
HTH
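A minimal sketch of the first approach, using only core builtins; it assumes $filename holds the file's path and that the file ends with a newline:

# Scan backwards from EOF counting newlines, then truncate just past
# the 6th newline from the end, which removes the last 5 lines.
# If the file has 5 lines or fewer, it is truncated to empty.
open my $fh, '+<', $filename or die "Can't open $filename: $!";
my $pos      = -s $fh;    # byte offset, starting at end of file
my $newlines = 0;
while ($pos > 0) {
    $pos--;
    seek $fh, $pos, 0 or die "seek failed: $!";
    read $fh, my $char, 1;
    if ($char eq "\n" && ++$newlines > 5) {
        $pos++;           # keep this newline; cut everything after it
        last;
    }
}
truncate $fh, $pos or die "truncate failed: $!";
close $fh;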
As an alternative, print the whole file except the last 5 lines:
open(my $fh, "<", $filename) or die "can't open $filename for reading: $!";
open(my $fh_new, ">", "$filename.new") or die "can't open $filename.new: $!";

my @buffer;        # circular buffer holding the last 5 lines read
my $index = 0;     # current slot in the circular buffer
my $counter = 0;
while (<$fh>) {
    if ($counter++ >= 5) {
        print $fh_new $buffer[$index];   # the oldest buffered line is safe to write
    }
    $buffer[$index++] = $_;              # overwrite the slot just flushed
    $index = 0 if 5 == $index;           # wrap around
}
close $fh;
close $fh_new;

use File::Copy;
move("$filename.new", $filename) or die "Can not copy $filename.new to $filename: $!";
File::ReadBackwards+truncate is the fastest for large files, and probably as fast as anything else for short files.
use File::ReadBackwards qw( );

my $qfn = 'filename';    # assumption: path of the file to trim

my $bfh = File::ReadBackwards->new($qfn)
    or die("Can't read \"$qfn\": $!\n");

# Read the last five lines, stopping early if the file is shorter.
for (1 .. 5) {
    defined( $bfh->readline() )
        or last;
}

my $fh = $bfh->get_handle();
truncate($qfn, tell($fh))
    or die $!;
Tie::File is the slowest, and uses a large amount of memory. Avoid that solution.
You can try something like this:
open FILE, "<", 'filename';
if ($exists == 0) {
    @lines = <FILE>;
    $newLastLine = $#lines - 5;
    @print = @lines[0 .. $newLastLine];
    print "@print";
}
or even shortened:
open FILE, "<", 'filename';
@lines = <FILE>;
if ($exists == 0) {
    print "@lines[0 .. $#lines-5]";
}