I have several commands printing text to a file using perl. During these print commands I have an if statement which should delete the last 5 lines of the file I am currently writing to if the statement is true. The number of lines to delete will always be 5.
if ($exists == 0) {
print(OUTPUT ???) # this should remove the last 5 lines
}
You can use Tie::File:
use Tie::File;
tie my @array, 'Tie::File', 'filename' or die $!;
if ($exists == 0) {
$#array -= 5;
}
You can use the same array when printing, but use push instead:
push @array, "line of text";
$ tac file | perl -ne 'print unless 1 .. 5' | tac > file.tailchopped
The only obvious ways I can think of:
1. Lock the file, scan backwards to find a position, and use truncate.
2. Don't print to the file directly; go through a buffer that's at least 5 lines long, and trim the buffer.
3. Print a marker that means "ignore the last five lines". Process all your files before reading them, using a buffer as in #2.
All are pretty fiddly, but that's the nature of flat files I'm afraid.
HTH
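The buffering idea (go through a buffer at least 5 lines long and trim it) could be sketched roughly as follows. This is a minimal sketch, not a definitive implementation: the helper name make_holdback_writer and its three closures are hypothetical, and it assumes the lines to be dropped are always the most recent ones handed to the writer.

```perl
use strict;
use warnings;

# Hypothetical helper: returns three closures sharing one hold-back buffer.
#   $emit->($line)  queues a line and writes out any line that is no longer
#                   among the last $keep queued lines
#   $drop->()       discards everything still queued (the "last $keep lines")
#   $flush->()      writes out whatever is still queued (call before close)
sub make_holdback_writer {
    my ($fh, $keep) = @_;
    my @pending;
    my $emit = sub {
        push @pending, @_;
        print {$fh} shift @pending while @pending > $keep;
    };
    my $drop  = sub { @pending = () };
    my $flush = sub { print {$fh} splice @pending };
    return ($emit, $drop, $flush);
}

my ($emit, $drop, $flush) = make_holdback_writer(\*STDOUT, 5);
$emit->("line $_\n") for 1 .. 8;   # lines 1-3 are written, 4-8 are held back
$drop->();                         # e.g. when $exists == 0
$flush->();                        # nothing left to write here
```

Because the last five lines never reach the file until they are pushed out of the buffer, there is nothing to delete afterwards; the cost is that the caller must remember to flush before closing the output.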
As an alternative, copy the whole file except the last 5 lines:
open(my $fh, "<", $filename) or die "can't open $filename for reading: $!";
open(my $fh_new, ">", "$filename.new") or die "can't open $filename.new: $!";
my $index = 0; # So we can loop over the buffer
my @buffer;
my $counter = 0;
while (<$fh>) {
if ($counter++ >= 5) {
print $fh_new $buffer[$index];
}
$buffer[$index++] = $_;
$index = 0 if 5 == $index;
}
close $fh;
close $fh_new;
use File::Copy;
move("$filename.new", $filename) or die "Can not copy $filename.new to $filename: $!";
File::ReadBackwards+truncate is the fastest for large files, and probably as fast as anything else for short files.
use File::ReadBackwards qw( );
my $bfh = File::ReadBackwards->new($qfn)
or die("Can't read \"$qfn\": $!\n");
$bfh->readline() or last for 1..5;
my $fh = $bfh->get_handle();
truncate($qfn, tell($fh))
or die $!;
Tie::File is the slowest, and uses a large amount of memory. Avoid that solution.
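If File::ReadBackwards is not installed, the same scan-backwards-then-truncate idea can be sketched with core Perl only. This is a rough sketch rather than the module's implementation: the helper name chop_last_lines and the 4096-byte block size are assumptions made for illustration, and it assumes "\n" line endings and that no other process is writing to the file.

```perl
use strict;
use warnings;

# Hypothetical helper: remove the last $n lines of $path in place.
sub chop_last_lines {
    my ($path, $n) = @_;
    return if $n <= 0;
    open my $fh, '+<:raw', $path or die "Can't open $path: $!";
    my $size = -s $fh;
    my $pos  = $size;   # start of the block currently being scanned
    my $want = $n;      # line boundaries still to find
    my $cut  = 0;       # fewer than $n lines in the file: empty it

    SCAN: while ($pos > 0) {
        my $len = $pos < 4096 ? $pos : 4096;
        $pos -= $len;
        seek $fh, $pos, 0 or die "Can't seek in $path: $!";
        read($fh, my $buf, $len) == $len or die "Short read on $path: $!";
        for (my $i = $len - 1; $i >= 0; $i--) {
            next unless substr($buf, $i, 1) eq "\n";
            # A newline in the very last byte only terminates the last line.
            next if $pos + $i == $size - 1;
            if (--$want == 0) {
                $cut = $pos + $i + 1;   # keep everything up to this newline
                last SCAN;
            }
        }
    }
    truncate($fh, $cut) or die "Can't truncate $path: $!";
    close $fh or die "Can't close $path: $!";
    return $cut;
}
```

Only the tail of the file is read, so the cost depends on the length of the removed lines rather than on the size of the file.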
You can try something like this:
open FILE, "<", 'filename' or die "Can't open filename: $!";
if ($exists == 0){
    @lines = <FILE>;
    $newLastLine = $#lines - 5;
    @print = @lines[0 .. $newLastLine];
    print @print; # no quotes: "@print" would insert a space before every line after the first
}
Or, shortened:
open FILE, "<", 'filename' or die "Can't open filename: $!";
@lines = <FILE>;
if ($exists == 0){
    print @lines[0 .. $#lines-5];
}
I want to split one file into multiple files (picking up the latest data every time) without reading it into an array. I already tried it with an array, but when the input is large it slows the process down.
Please give me a hint on how to reduce the complexity of the program.
Input: read a text file from a directory.
Output:
File1.pl - 1 2 3 4 5 6
File2.pl - 6 7 8 9 10
File3.pl -11 12 13 14 15
File4.pl -16 17 18 19 20
I do this using array:
use feature 'state';
open (DATA,"<","e:/today.txt");
@array=<DATA>;
$sizeofarray=scalar @array;
print "Total no. of lines in file is :$sizeofarray";
$count=1;
while($count<=$sizeofarray)
{
open($fh,'>',"E:/e$count.txt");
print $fh "@array[$count-1..($count+3)]\n";
$count+=5;
}
Store lines in a small buffer; every fifth line, open a new file and write the buffer to it:
use warnings;
use strict;
use feature 'say';
my $infile = shift || 'e:/today.txt';
open my $fh_in, '<', $infile or die "Can't open $infile: $!";
my ($fh_out, @buf);
while (<$fh_in>) {
    push @buf, $_;
    if ($. % 5 == 0) {
        my $file = 'e' . (int $./5) . '.txt';
        open $fh_out, '>', $file or do {
            warn "Can't open $file: $!";
            next;
        };
        print $fh_out $_ for @buf;
        @buf = ();
    }
}
# Write what's left over, if any, after the last batch of five
if (@buf) {
    my $file = 'e' . ( int($./5)+1 ) . '.txt';
    open $fh_out, '>', $file or die "Can't open $file: $!";
    print $fh_out $_ for @buf;
}
Based on your code, you can try this:
use warnings;
use strict;
open (my $fh,"<","today.txt") or die "Error opening $!";
my $count = 1;
while(my $line = <$fh>)
{
open my $wh,'>',"e$count.txt" or die "Error creating $!";
print $wh $line;
for(1..4){
if(my $v = scalar <$fh>){
print $wh $v ;
}
else{
last ;
}
}
$count++;
}
I've got a script that reformats an input file and creates an output file. When I try to read that output file for the second part of the script, it doesn't work. However if I split the script into two parts it works fine and gives me the output that I need. I'm not a programmer and surprised I've got this far - I've been banging my head for days trying to resolve this.
My command for running it is this (BTW the temp.txt was just a brute force workaround for getting rid of the final comma to get my final output file - couldn't find another solution):
c:\perl\bin\perl merge.pl F146.sel temp.txt F146H.txt
Input looks like this (from another software package) ("F146.sel"):
/ Selected holes from the .\Mag_F146_Trimmed.gdb database.
"L12260"
"L12270"
"L12280"
"L12290"
Output looks like this (mods to the text: quotes removed, insert comma, concatenate into one line, remove the last comma) "F146H.txt":
L12260,L12270,L12280,L12290
Then I want to use this as input in the next part of the script, which basically inserts this output into a line of code that I can use in another software package (my "merge.gs" file). This is the output that I get if I split my script into two parts, but it just gives me a blank if I do it as one (see below).
CURRENT Database,"RAD_F146.gdb"
SETINI MERGLINE.OUT="DALL"
SETINI MERGLINE.LINES="L12260,L12270,L12280,L12290"
GX mergline.gx
What follows is my "merge.pl". What have I done wrong?
(actually, the question could be - what haven't I done wrong, as this is probably the most retarded code you've seen in a while. In fact, I bet some of you could get this entire operation done in 10-15 lines of code, instead of my butchered 90. Thanks in advance.)
# this reformats the SEL file to remove the first line and replace the " with nothing
$file = shift ;
$temp = shift ;
$linesH = shift ;
#open (Profiles, ">.\\scripts\\P2.gs")||die "couldn't open output .gs file";
open my $in, '<', $file or die "Can't read old file: Inappropriate I/O control operation";
open my $out, '>', $temp or die "Can't write new file: Inappropriate I/O control operation";
my $firstLine = 1;
while( <$in> )
{
if($firstLine)
{
$firstLine = 0;
}
else{
s/"L/L/g; # replace "L with L
s/"/,/g; # replace " with,
s|\s+||; # concatenates it all into one line
print $out $_;
}
}
close $out;
open (part1, "${temp}")||die "Couldn't open selection file";
open (part2, ">${linesH}")||die "Couldn't open selection file";
printitChomp();
sub printitChomp
{
print part2 <<ENDGS;
ENDGS
}
while ($temp = <part1> )
{
print $temp;
printit();
}
sub printit
{$string = substr (${temp}, 0,-1);
print part2 <<ENDGS;
$string
ENDGS
}
####Theoretically this creates the merge script from the output
####file from the previous loop. However it only seems to work
####if I split this into 2 perl scripts.
open (MergeScript, ">MergeScript.gs")||die "couldn't open output .gs file";
printitMerge();
open (SEL, "${linesH}")||die "Couldn't open selection file";
sub printitMerge
#open .sel file
{
print MergeScript <<ENDGS;
ENDGS
}
#iterate over required files
while ( $line = <SEL> ){
chomp $line;
print STDOUT $line;
printitLines();
}
sub printitLines
{
print MergeScript <<ENDGS;
CURRENT Database,"RAD_F146.gdb"
SETINI MERGLINE.OUT="DALL"
SETINI MERGLINE.LINES="${line}"
GX mergline.gx
ENDGS
}
I think all you were really missing was close(part2); so that the output is flushed to disk before the file is reopened as SEL:
#!/usr/bin/env perl
use strict;
use warnings;
# this reformats the SEL file to remove the first line and replace the " with nothing
my $file = shift;
my $temp = shift;
my $linesH = shift;
open my $in, '<', $file or die "Can't read old file: $!";
open my $out, '>', $temp or die "Can't write new file: $!";
my $firstLine = 1;
while (my $line = <$in>){
print "LINE: $line\n";
if ($firstLine){
$firstLine = 0;
} else {
$line =~ s/"L/L/g; # replace "L with L
$line =~ s/"/,/g; # replace " with,
$line =~ s/\s+//g; # concatenates it all into one line
print $out $line;
}
}
close $out;
open (part1, $temp) || die "Couldn't open selection file";
open (part2, ">", $linesH) || die "Couldn't open selection file";
while (my $temp_line = <part1>){
print "TEMPLINE: $temp_line\n";
my $string = substr($temp_line, 0, -1);
print part2 <<ENDGS;
$string
ENDGS
}
close(part2);
#### this creates the merge script from the output
#### file from the previous loop.
open (MergeScript, ">MergeScript.gs")||die "couldn't open output .gs file";
open (SEL, $linesH) || die "Couldn't open selection file";
#iterate over required files
while ( my $sel_line = <SEL> ){
chomp $sel_line;
print STDOUT $sel_line;
print MergeScript <<"ENDGS";
CURRENT Database,"RAD_F146.gdb"
SETINI MERGLINE.OUT="DALL"
SETINI MERGLINE.LINES="$sel_line"
GX mergline.gx
ENDGS
}
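The effect of the missing close can be seen in isolation. Perl buffers output, so a short line printed to a handle that is still open has usually not reached the file when the same file is opened for reading. The following is a minimal sketch using a temporary file; File::Temp and all variable names here are assumptions for illustration and are not part of the script above.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my ($out, $path) = tempfile(UNLINK => 1);
print {$out} "L12260,L12270,L12280,L12290\n";

# Read while the writer is still open: the text is typically still sitting
# in Perl's output buffer, so there may be nothing to read yet.
open my $early, '<', $path or die "Can't open $path: $!";
my $before = <$early>;
close $early;

close $out or die "Can't close $path: $!";   # flushes the buffer to disk

# Read again after the writer has been closed: the line is now in the file.
open my $late, '<', $path or die "Can't open $path: $!";
my $after = <$late>;
close $late;

print defined $before ? "before close: $before" : "before close: (nothing)\n";
print "after close: $after";
```

Closing the handle is the simplest fix; turning on autoflush for the handle would be another option if it has to stay open.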
And here is an alternative way of doing it:
#!/usr/bin/env perl
use strict;
use warnings;
my $file = shift;
open my $in, '<', $file or die "Can't read old file: $!";
my @lines = <$in>; # read in all the lines
shift @lines; # discard the first line
my $line = join(',', @lines); # join the lines with commas
$line =~ s/[\r\n"]+//g; # remove the quotes and newlines
# print the line into the mergescript
open (MergeScript, ">MergeScript.gs")||die "couldn't open output .gs file";
print MergeScript <<"ENDGS";
CURRENT Database,"RAD_F146.gdb"
SETINI MERGLINE.OUT="DALL"
SETINI MERGLINE.LINES="$line"
GX mergline.gx
ENDGS
I've already posted a question and fixed the problem in my code, but now my "specification has changed" so to say, and now I need to change some things about it.
Here's code that takes all .txt files from the current directory, cuts off the last line of the first file, the first and the last line of every following file, and the first line of the last file, and writes everything to a new file (in other words: merge all files, deleting headers and footers so that the new file has only one header and one footer).
#!/usr/bin/perl
use warnings;
use Cwd;
use Tie::File;
use Tie::Array;
my $cwd = getcwd();
my $buff = '';
# Get all files in cwd.
my @files = grep ( -f ,<*.txt>);
# Cut off header and footer of $files[1] to $files[$#files-1],
# but only footer of $files[0] and header of $files[$#files]
for (my $i = 0; $i <= $#files; $i++) {
print 'Opening ' . $files[$i] . "\n";
tie (@lines, Tie::File, $files[$i]) or die "can't update $file: $!";
splice @lines, 0, 1 unless $i == 0;
splice @lines, -1, 1 unless $i == $#files;
untie @lines;
open (file, "<", $files[$i]) or die "can't update $file: $!";
while (my $line =<file>) {
$buff .= $line;
}
close file;
}
# Write the buffer to a new file.
my $allfilename = $cwd.'/Trace.txt';
print 'Writing all files into new file: ' . $allfilename . "\n";
open $outputfile, ">".$allfilename or die "can't write to new file $outputfile: $!";
# Write the buffer into the output file.
print $outputfile $buff;
close $outputfile;
My problem: I don't want to change the original files, but my code does exactly that and I'm having trouble coming up with a solution. The simplest way (simple meaning not having to change too much code) would now be, to just copy all the files to a tmp directory, messing around with them and leaving the original files untouched. Problem: a simple use of dircopy doesn't do it for me, since you have to give a new tmp dir to the dircopy function, making the code only usable for Windows or UNIX systems (but I need portability).
The next approach would be to make use of the File::Temp module but I'm really having trouble with the docs on this one.
Does anybody have a good idea on this one?
I suspected that you didn't really want your original files modified when I answered your previous question.
I don't understand why you've gone back to accumulating all the text in a buffer before printing it, or why you've removed use strict, which is essential to any well-written Perl code.
Here's my previous solution modified to leave the input data untouched.
use strict;
use warnings;
use Tie::File;
my @files = grep -f, glob '*.txt';
my $all_filename = 'Trace.txt';
open my $out_fh, '>', $all_filename or die qq{Unable to open "$all_filename" for output: $!};
for my $i ( 0 .. $#files ) {
    my $file = $files[$i];
    next if $file eq $all_filename;
    print "Opening $file\n";
    tie my @lines, 'Tie::File', $file or die qq{Can't open "$file": $!};
    my ($start, $end) = (0, $#lines);
    ++$start unless $i == 0;
    --$end unless $i == $#files;
    print $out_fh "$_\n" for @lines[$start..$end];
}
close $out_fh;
#!/usr/bin/env perl
use strict;
use warnings;
use autodie;
my $outfile = 'Trace.txt';
# Get all files in cwd.
my @files = grep { -f && $_ ne $outfile } <*.txt>;
open my $outfh, '>', $outfile;
for my $file (@files) {
    my @lines = do { local @ARGV = $file; <> };
    shift @lines unless $file eq $files[0];
    pop @lines unless $file eq $files[-1];
    print $outfh @lines;
}
Just do not use Tie::File. Or is there a reason you do this, for example all your files together do not fit your memory or something?
A version very close to your current implementation would be something like the following (untested) code. It simply skips the step where you update the file only to reopen and read it afterwards. (Note that this is certainly not a very efficient or elegant way to do this; it just sticks to your implementation as closely as possible.)
#!/usr/bin/perl
use warnings;
use Cwd;
# use Tie::File;
# use Tie::Array;
my $cwd = getcwd();
my $buff = '';
# Get all files in cwd.
my @files = grep ( -f ,<*.txt>);
# Cut off header and footer of $files[1] to $files[$#files-1],
# but only footer of $files[0] and header of $files[$#files]
for (my $i = 0; $i <= $#files; $i++) {
print 'Opening ' . $files[$i] . "\n";
open (my $fh, "<", $files[$i]) or die "can't open $files[$i] for reading: $!";
my @lines = <$fh>;
splice @lines, 0, 1 unless $i == 0;
splice @lines, -1, 1 unless $i == $#files;
foreach my $line (@lines) {
$buff .= $line;
}
}
# Write the buffer to a new file.
my $allfilename = $cwd.'/Trace.txt';
print 'Writing all files into new file: ' . $allfilename . "\n";
open $outputfile, ">".$allfilename or die "can't write to new file $outputfile: $!";
# Write the buffer into the output file.
print $outputfile $buff;
close $outputfile;
Based on Miller's answer, but more suitable for large files because it never holds a whole file in memory:
#!/usr/bin/env perl
use strict;
use warnings;
use autodie;
my $outfile = 'Trace.txt';
# Get all files in cwd.
my @files = grep { -f && $_ ne $outfile } <*.txt>;
open my $outfh, '>', $outfile;
my $counter = 0;
for my $file (@files) {
    open my $fh, '<', $file;
    my ($line, $prev) = ('', '');
    my $l = 0;
    while ($line = <$fh>) {
        print $outfh $prev unless $l++ == 1 and $counter > 0;
        $prev = $line;
    }
    $counter++;
    print $outfh $prev if $counter == @files and $l > 0;
    close $fh;
}
I take a number of lines as user input and then delete that many lines from the file.
I saw this learn.perl.org/faq/perlfaq5.html#How-do-I-count-the-number-of-lines-in-a-file- and then I tried the simple logic below.
Logic:
1. Get the total number of lines
2. Subtract the number entered by the user
3. Print the lines
Here is my code:
#!/usr/bin/perl -w
use strict;
open IN, "<", "Delete_line.txt"
or die " Can not open the file $!";
open OUT, ">", "Update_delete_line.txt"
or die "Can not write in the file $!";
my ($total_line, $line, $number, $printed_line);
print"Enter the number of line to be delete\n";
$number = <STDIN>;
while ($line = <IN>) {
$total_line = $.; # Total number of line in the file
}
$printed_line = $total_line - $number;
while ($line = <IN>) {
print OUT $line unless $.== $printed_line;
}
I am getting neither an error from the code nor any output, and I don't know why.
Can anyone give me some suggestions?
A Perl solution that's efficient for large files requires the use of File::ReadBackwards
use File::ReadBackwards qw( );
my $num_lines = 10;
my $qfn = 'file.txt';
my $pos = do {
my $fh = File::ReadBackwards->new($qfn)
or die $!;
$fh->readline() for 1..$num_lines;
$fh->tell()
};
truncate($qfn, $pos)
or die $!;
This does not read the whole file twice (unlike the OP's method).
This does not read the whole file (unlike the Tie::File solutions).
This does not read the whole file into memory.
Yet another way is to use Tie::File
#!/usr/bin/env perl
use strict;
use warnings;
use Tie::File;
tie my @lines, 'Tie::File', 'myfile' or die "$!\n";
$#lines -= 10;
untie @lines;
This has the advantage of not loading the file into memory while acting like it does.
Here is a solution that passes through a stream and prints all but the last n lines, where n is a command-line argument:
#!/usr/bin/perl
my @cache;
my $n = shift @ARGV;
while(<>) {
    push @cache, $_;
    print shift @cache if @cache > $n;
}
or the one-liner version:
perl -ne'BEGIN{$n=shift@ARGV}push@c,$_;print shift@c if@c>$n' NUMBER
After finishing reading from IN, you have to reopen it or seek IN, 0, 0 to reset its position. You also have to set $. to zero again.
Also, the final condition should be changed to unless $. > $printed_line so you skip all the lines over the threshold.
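Put together, the two corrections (rewind with seek and reset $., then test $. > $printed_line) could look like the following. It is a minimal sketch of the two-pass approach from the question, wrapped in a hypothetical helper named copy_without_last_lines so that the file names become parameters; the caller is assumed to have chomped the number read from STDIN.

```perl
use strict;
use warnings;

# Hypothetical helper: copy $in_path to $out_path without its last $number lines.
sub copy_without_last_lines {
    my ($in_path, $out_path, $number) = @_;
    open my $in,  '<', $in_path  or die "Can not open the file $in_path: $!";
    open my $out, '>', $out_path or die "Can not write in the file $out_path: $!";

    # First pass: count the lines.
    my $total_line = 0;
    while (my $line = <$in>) {
        $total_line = $.;
    }
    my $printed_line = $total_line - $number;

    # Rewind the handle and reset the line counter before the second pass.
    seek $in, 0, 0 or die "Can not rewind $in_path: $!";
    $. = 0;

    # Second pass: print every line up to the threshold, skip the rest.
    while (my $line = <$in>) {
        print {$out} $line unless $. > $printed_line;
    }
    close $in;
    close $out or die "Can not close $out_path: $!";
    return $printed_line > 0 ? $printed_line : 0;
}
```

Called as copy_without_last_lines('Delete_line.txt', 'Update_delete_line.txt', $number), this reads the input twice, which is fine for small files but is the cost the File::ReadBackwards answer avoids.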
The "more fun" answer: use Tie::File!
use strict;
use warnings;
use Tie::File;
tie my @file, 'Tie::File', 'filename' or die "$!";
$#file -= 10;
Just read the file into an array and remove the last n lines:
open my $filehandle, "<", "info.txt" or die "Can't open info.txt: $!";
my @file = <$filehandle>;
splice(@file, -10);
print @file;
Note: This loads the entire file into memory.
You could keep a buffer of the most recent $num_lines_to_trim lines and simply never print whatever is still in it when the input ends.
use English qw<$INPLACE_EDIT>;
{ local @ARGV = $name_of_file_to_edit;
local $INPLACE_EDIT = '.bak';
my @buffer;
for ( 1..$num_lines_to_trim ) {
    push @buffer, scalar <>; # scalar: read one line, not the rest of the file
}
while ( <> ) {
    print shift @buffer;
    push @buffer, $_;
}
}
You could also do this with File::Slurp::edit_file_lines:
use File::Slurp qw(edit_file_lines);
my @buffer;
edit_file_lines {
    push @buffer, $_;
    # edit_file_lines writes back whatever is left in $_, so assign to it
    $_ = @buffer > $num_lines_to_trim ? shift @buffer : '';
} $name_of_file;
my $num_lines = 10;
my $qfn = 'file.txt';
system('head', '-n', -$num_lines, '--', $qfn);
die "Error" if $?;
Easy with a C-style for loop:
#!/usr/bin/perl -w
use strict;
open(my $in,"<","Delete_line.txt") or die "Can not open the file $!";
open(my $out,">","Update_delete_line.txt") or die"Can not write in the file $!";
print"Enter the number of lines to be delete\n";
my $number=<STDIN>;
my @file = <$in>;
for (my $i = 0; $i < $#file - $number + 1; $i++) {
print $out $file[$i];
}
close $in;
close $out;
#
# Reads a file trims the top and the bottom of by passed num of lines
# and return the string
# stolen from : http://stackoverflow.com/a/9330343/65706
# usage :
# my $StrCatFile = $objFileHandler->ReadFileReturnTrimmedStrAtTopBottom (
# $FileToCat , $NumOfRowsToRemoveAtTop , $NumOfRowsToRemoveAtBottom) ;
sub ReadFileReturnTrimmedStrAtTopBottom {
my $self = shift ;
my $file = shift ;
my $NumOfLinesToRemoveAtTop = shift ;
my $NumOfLinesToRemoveAtBottom = shift ;
my @cache ;
my $StrTmp = () ;
my $StrReturn = () ;
my $fh = () ;
open($fh, "<", "$file") or cluck ( "can't open file : $file for reading: $!" ) ;
my $counter = 0;
while (<$fh>) {
if ($. >= $NumOfLinesToRemoveAtTop + 1) {
$StrTmp .= $_ ;
}
}
close $fh;
my $sh = () ;
open( $sh, "<", \$StrTmp) or cluck( "can't open string : $StrTmp for reading: $!" ) ;
while(<$sh>) {
push ( @cache, $_ ) ;
$StrReturn .= shift @cache if @cache > $NumOfLinesToRemoveAtBottom;
}
close $sh ;
return $StrReturn ;
}
#eof ReadFileReturnTrimmedStrAtTopBottom
#
I need to add the header of the main file to all the split files. I am able to get the header in the first split file, but I need it in all of them. Here I am splitting a DAT file. Below is what I have done so far:
#!usr/bin/perl -w
my $chunksize = 25000000; # 25MB
my $filenumber = 0;
my $infile = "Test.dat";
my $outsize = 0;
my $eof = 0;
my $line = $_;
open INFILE, $infile;
open OUTFILE, ">outfile_".$filenumber.".dat";
while (<INFILE>) {
chomp;
if ($outsize > $chunksize) {
close OUTFILE;
$outsize = 0;
$filenumber++;
open (OUTFILE, ">outfile_".$filenumber.".dat")
or die "Can't open outfile_".$filenumber.".dat";
}
print OUTFILE "$_\n";
$outsize += length;
}
close INFILE;
You should always use warnings (in preference to the command-line -w) and use strict. That way many simple errors that you might otherwise have overlooked will be flagged.
Use the three-parameter form of open with lexical filehandles.
Check the result of all open calls, and report errors with a die string that contains the value of $!.
Define constant values with the use constant pragma rather than as Perl variables.
The number of bytes printed to a filehandle can be evaluated using the tell function, so there is no need to keep your own count
To solve your specific problem, you should read and remember the first line of your input file, and print it to new output files every time they are opened
It is easier to keep track of the output files if you open them when you have new data to write and no open file, and close them when they are full or if you have reached the end of the input data
This program demonstrates the ideas and does what is required
use strict;
use warnings;
use constant INFILE => 'Test.dat';
use constant CHUNKSIZE => 25_000_000; # 25MB
open my $infh, '<', INFILE or die $!;
my $header = <$infh>;
my $outfh;
my $filenumber = 0;
while (my $line = <$infh>) {
unless ($outfh) {
my $outfile = "outfile_$filenumber.dat";
open $outfh, '>', $outfile or die "Can't open '$outfile': $!";
print { $outfh } $header;
$filenumber++;
}
print { $outfh } $line;
if (tell $outfh > CHUNKSIZE or eof $infh) {
close $outfh or die $!;
undef $outfh;
}
}
You need to store the header from the input file and print it every time a new file is opened:
use strict;
use warnings;
use autodie;
# initializations ...
open my $in, '<', $infile;
open my $out, '>', "outfile_${filenumber}.dat";
my $header = <$in>; # Save the header...
chomp $header; # ... not strictly necessary
print $out $header, "\n"; # The first file needs the header too
while ( <$in> ) {
    chomp; # Not strictly necessary
    if ( $outsize > $chunksize ) {
        close $out;
        $outsize = 0;
        $filenumber++;
        open $out, '>', "outfile_${filenumber}.dat";
        print $out $header, "\n"; # Prints header at beginning of file
                                  # Newline needed if $header chomped
    }
    print $out $_, "\n"; # Newline needed if $_ chomped
    $outsize += length;
}