Replace last character in a large file - perl

I have a program that pulls data from a number of other files to form a large (~200MB) bulk SQL insert statement
INSERT INTO ...
VALUES
('a','b',1,2,3),
('c','d',4,5,6),
Unfortunately, the last line needs to end on a semicolon instead of a comma. Is there a way to (ideally within my perl program) turn only the very last character from a , into a ;?
Things I've tried:
1) After the file has been finished and closed:
open(DAT,">>$output") || die("Cannot Open File");
seek(DAT, 2, SEEK_END);
print DAT ";";
close(DAT);
This just puts a semicolon at the very end.
2) Calling `perl -p -i -e 's/,$/;/g' $output`; from within my perl program, but this is replacing every comma.
3) While printing the last line, end with a semicolon instead of a comma. This doesn't work however because I don't actually know it was the last line until the line has been written.
4) Copy the whole file into a new file, except the last character is a ; instead of a ,. This is slow however, and thus not ideal.

If you know that the ',' you are replacing is always going to be the second-last byte in the file (the last byte being a "\n"), then you can try this:
use Fcntl qw(SEEK_SET);
my $fsize = -s $filename;
# print $fsize."\n";
open(my $FILE, "+<", $filename) or die $!;
seek $FILE, $fsize-2, SEEK_SET; # or 0 (numeric) instead of SEEK_SET
print $FILE ";";
close $FILE;

You tried
perl -p -i -e 's/,$/;/g'
which applies the replacement to every line in the file. To do it only once, slurp the whole file with the -0777 switch (note that this reads the entire file into memory):
perl -0777 -pi -e 's/,$/;/'
This will only match if the last character is a comma (with optional trailing newline). If you have trailing whitespace or other characters, it will not work.

You are using the wrong offset. SEEK_END adds(!) the offset to the END position, so use "-2" as the offset. Try this:
use strict;
use warnings;
open my $fh, "+<x.txt" or die;
seek $fh, -2, 2;
print $fh ";\n";
close $fh;
Or a little bit more talkative:
use strict;
use warnings;
use Fcntl qw(SEEK_END);
open my $fh, "+<x.txt" or die;
seek $fh, -2, SEEK_END;
print $fh ";\n";
close $fh;
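A slightly more defensive variant of the seek-based approach, as a minimal sketch: it reads only the tail of the file and overwrites a byte only when the last non-whitespace character really is a comma. The helper name replace_trailing_comma and the 16-byte tail window are assumptions made for illustration; they do not come from the answers above.

```perl
use strict;
use warnings;
use Fcntl qw(SEEK_SET);

# Hypothetical helper: change a trailing ',' into ';' in place.
# Returns 1 if a byte was rewritten, 0 otherwise.
sub replace_trailing_comma {
    my ($path) = @_;
    open my $fh, '+<:raw', $path or die "Cannot open $path: $!";

    my $size     = -s $fh;
    my $tail_len = $size < 16 ? $size : 16;    # assumed window size
    if ($tail_len == 0) { close $fh; return 0; }

    # Read only the tail of the file, not the whole thing.
    seek $fh, $size - $tail_len, SEEK_SET or die "seek failed: $!";
    my $tail;
    defined read($fh, $tail, $tail_len) or die "read failed: $!";

    # Look for a comma followed only by whitespace up to end of file.
    if ($tail =~ /,(?=\s*\z)/) {
        my $pos = $size - $tail_len + $-[0];   # absolute offset of the comma
        seek $fh, $pos, SEEK_SET or die "seek failed: $!";
        print {$fh} ';';
        close $fh or die "close failed: $!";
        return 1;
    }
    close $fh;
    return 0;
}
```

Calling replace_trailing_comma($output) after the file has been written and closed would then touch a single byte, regardless of how large the file is.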

Related

Scan a large .gz file and split its strings at a known word (which is repeated in the file) and save all the split strings in a .txt file

I'm trying to write a Perl script that opens and reads a .gz file, splits its contents at a known word ('.EOM') which is repeated many times in that file, and saves all the pieces in a .txt or .tmp file. The .gz file is very large (several GB). I've tried many different ways, but every time I get the following error at the end.
"panic:sv_setpvn called with negative strlen at perl_gz1.pl line 7, line 38417185 "
Here 'perl_gz1.pl' is my Perl file name and 'line 101' is the line where I've written the following code: my @spl=split('.EOM',$join);
I don't know what type of error this is or how to resolve it. Can anyone help? Is there another way to do the same thing without getting this error? Thanks in advance.
I've attached my full code.
I've tried following codes:
use strict ;
use warnings;
my $file = "/nfs/iind/disks/saptak/dsbnatrgd.scntcl.gz";
open(IN, "gzcat $file |",) or die "gunzip $file: $!";
my $join = join('',<IN>);
#print $join;
my @spl=split('.EOM',$join);
print @spl;
close IN;
use IO::Uncompress::Gunzip qw(gunzip $GunzipError) ;
my $input = "/nfs/iind/disks/cpc_disk0025/saptak/dsbnatrgd.scntcl.gz";
my $output = "NEW1.tmp";
gunzip $input => $output or die "gunzip failed: $GunzipError\n";
my $data = join("", "NEW1.tmp");
#use File::Slurp;
#my $data = read_file("NEW1.tmp");
my @spl=split(/.EOM/,$data)
and
use IO::Uncompress::Gunzip qw(gunzip $GunzipError) ;
use IO::File ;
my $input = new IO::File "</nfs/iind/disks/cpc_disk0025/saptak/dsbnatrgd.scntcl.gz" or die "Cannot open 'file1.txt.gz': $!\n" ;
my $buffer ;
gunzip $input => \$buffer or die "gunzip failed: $GunzipError\n";
print $buffer;
my @spl=split(".EOM",$buffer);
But the same error appears every time.
I expect the array @spl to hold the pieces of the file, split at every occurrence of the specified word/string, so that I can keep working with @spl. Instead, no output appears and the error "panic:sv_setpvn called with negative strlen at perl_gz1.pl line 7, line 38417185 " is shown on the screen.

This might be how I would do it if it was a one time job:
zcat dsbnatrgd.scntcl.gz | perl -ne'sub newf{$n||="0000";$n++;open($fh,">","output_$n.txt")||die}$fh||newf();/(.*)\.EOM(.*)/ and print {$fh} $1 and newf() and print {$fh} $2 or print {$fh} $_'
This gives you a new file output_nnnn.txt each time an .EOM is seen somewhere, where nnnn is 0001, 0002 and so on. The .EOM may also appear in the middle of a line; in that case the text before it is kept as the last string in the previous file and the text after it becomes the first string in the next file.
The oneliner explained:
sub newf{
  $n||="0000";
  $n++; # increase the filename counter
  open($fh,">","output_$n.txt")||die # open a new output filehandle
}
$fh||newf(); # 1st input line: create the $fh filehandle if it doesn't exist yet
/(.*)\.EOM(.*)/ # if the input line has an .EOM mark, grab what's before and after it
and print {$fh} $1 # ...and print the "before" part to the current file
and newf() # ...and open a new file
and print {$fh} $2 # ...and print the "after" part to the new file
or print {$fh} $_ # or, if there is no .EOM on the current line, just print it to the current output file
(Or did you mean that the .EOM mark is stored uncompressed inside the .gz file? In that case the .gz file is probably invalid.)
The reason your approach doesn't work might be the very large input. You mentioned that the .gz file is several GB, so the uncompressed input is probably several times bigger than that. My approach here doesn't attempt to keep everything in memory at once, so it doesn't matter how big your file is.
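The same streaming idea can be used inside a Perl script by setting the input record separator $/ to the literal string '.EOM', so that each read returns one chunk instead of the whole file. This is only a sketch: the helper name for_each_record and the sample data are made up for illustration, and the in-memory filehandle stands in for the real decompression pipe.

```perl
use strict;
use warnings;

# Hypothetical helper: read $fh one record at a time, where records are
# separated by the literal string $sep, and pass each record to $callback.
sub for_each_record {
    my ($fh, $sep, $callback) = @_;
    local $/ = $sep;            # record separator: a literal string, not a regex
    my $count = 0;
    while (my $record = <$fh>) {
        chomp $record;          # chomp strips the trailing $/ (the separator)
        $callback->($record);
        $count++;
    }
    return $count;
}

# Demonstration on an in-memory filehandle. For the real file one might
# open a pipe instead, e.g. open(my $fh, '-|', 'gzcat', $file).
my $sample = "first part.EOMsecond\npart.EOMlast";
open my $fh, '<', \$sample or die "Cannot open in-memory handle: $!";
my @records;
for_each_record($fh, '.EOM', sub { push @records, $_[0] });
close $fh;
print scalar(@records), " records\n";
```

In a real run the callback would more likely print each record to an output file than collect it in an array, since collecting everything would bring back the memory problem.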

Need to replace value from one file to another file using perl

I am writing a Perl program that reads a value from one file and substitutes it into another file. The program runs without errors, but the value is not replaced. Please suggest where the error is.
use strict;
use warnings;
open(file1,"address0.txt") or die "Cannot open file.\n";
my $value;
$value=<file1>;
system("perl -p -i.bak -e 's/add/$value/ig' rough.sp");
Here the value which I want to replace exists in address0.txt file. It is a single value 1. I want to place this value in place of add in other file rough.sp.
My rough.sp looks like
Vdd 1 0 add
My address0.txt looks like
1
So output should be like
Vdd 1 0 1
Please help me out. Thanks in advance

Assuming that there is a 1:1 relationship between lines in address0.txt and rough.sp, you can proceed like this:
use strict;
use warnings;
my ($curline_1,$curline_2);
open(file1, "address0.txt") or die "Cannot open file.\n";
open(file2, "rough.sp") or die "Cannot open file.\n";
open(file3, ">out.sp") or die "Cannot open file.\n";
while (<file1>) {
$curline_1 = $_;
chomp($curline_1);
$curline_2 = <file2>;
$curline_2 =~ s/ add/ $curline_1/;
print file3 $curline_2;
}
close(file1);
close(file2);
close(file3);
exit(0);
Explanation:
The code iterates through the lines of your input files in parallel. Note that the lines read include the line terminator. Line contents from the 'address' file are taken as replacement values for the add literal in your .sp file. Line terminators from the 'address' file are removed to avoid introducing additional newlines.
Addendum:
An extension for multi-replacements might look like this:
$curline_1 = $_;
chomp($curline_1);
my @parts = split(/ +/, $curline_1); # splits the line from address0.txt into an array of strings made up of contiguous non-whitespace chars
$curline_2 = <file2>;
$curline_2 =~ s/ add/ $parts[0]/;
$curline_2 =~ s/ sub/ $parts[1]/;
# ...

Perl reading from a file and writing to another using print and indirect handler

I have a file called malwareip.txt containing a list of IPs:
1.1.1.1
2.2.2.2
I need to read from this file and create another file (query.txt) so that the final results be:
ip.dst=1.1.1.1 || ip.dst=2.2.2.2
I have created the following script. However, I see a || at the start of each line, as shown below:
||ip.dst=1.1.1.1
||ip.dst=2.2.2.2
Why am I getting a || before the ip.dst=1.1.1.1?
See my script below. Thanks.
#!/usr/bin/env perl
use strict;
use warnings;
my $filename="malwareip.txt";
open (my $ip, "<" , $filename) || die ("Can't open file malwareip.txt");
my $outputfile="query.txt";
open (my $out, ">" , $outputfile) || die ("CAN'T OPEN FILE query.txt");
my $OR="||";
while ( <$ip> ) {
next if (/^$/);
printf $out "ip.dst=$_$OR";
}
close $out;
close $ip;

Your current output does not make sense, because you cannot get the || at the start of the output unless you print it there. Not even if you happen to have blank lines in your file, because it would still print ip.dst= before that blank line. So, you must be mistaken about getting that output, or about having that code.
Because you forgot to chomp your input, you would normally get output like this
ip.dst=1.1.1.1
||ip.dst=2.2.2.2
||
If you have non-standard line endings, such as a file whose lines end in a bare CR (\r), then each line would overwrite the previous one when displayed, and you would see only one line of output: the last one.
||ip.dst=2.2.2.2
So your output makes no sense, and it cannot be explained until you supply more information.
If I were to do something like this, I would do:
perl -lwe 'chomp(@a = <>); print join "||", grep /\S/, @a;' malwareip.txt > query.txt

You're missing chomp.
You're printing || every time you print, even though you want one less.
You might have CRLF line endings on a system which expects LF line endings. s/\s+\z// will chomp off both LF and CRLF line endings.
my $OR = '';
while ( <$ip> ) {
s/\s+\z//;
next if /^$/;
print $out "${OR}ip.dst=$_";
$OR = ' || ';
}
print $out "\n";
Note the use of print rather than printf. printf expects its first argument to be a format parameter.
print $out "${OR}ip.dst=$_"; # ok
printf $out "%s", "${OR}ip.dst=$_"; # ok
printf $out "%sip.dst=%s", $OR, $_; # ok
printf $out "${OR}ip.dst=$_"; # Not really ok
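An alternative to tracking the separator by hand is to collect the cleaned-up entries and join them, so the separator only ever appears between items. This is a small sketch; the helper name build_query is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: turn raw input lines into one
# "ip.dst=A || ip.dst=B" string.
sub build_query {
    my (@lines) = @_;
    my @terms;
    for my $line (@lines) {
        (my $ip = $line) =~ s/\s+\z//;   # strips LF and CRLF endings alike
        next if $ip eq '';               # skip blank lines
        push @terms, "ip.dst=$ip";
    }
    return join ' || ', @terms;          # separator only between items
}

# Sample input with mixed line endings and a blank line.
print build_query("1.1.1.1\n", "\r\n", "2.2.2.2\r\n"), "\n";
```

With real files, the argument list would simply be the lines read from malwareip.txt, and the result would be printed to the query.txt filehandle.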

What can be wrong with word count program?

I've got a question in my test:
What is wrong with program that counts number of lines and words in file?
open F, $ARGV[0] || die $!;
my @lines = <F>;
my @words = map {split /\s/} @lines;
printf "%8d %8d\n", scalar(@lines), scalar(@words);
close(F);
My conjectures are:
If file does not exist, program won't tell us about that.
If there are punctuation signs in file, program will count them, for example, in
abc cba
, , ,dce
there will be five words, but on the other hand wc outputs the same result, so it might be considered correct behavior.
If F is a large file, it might be better to iterate over lines and not to dump it into lines array.
Do you have any less trivial ideas?

On the first line, you have a precedence problem:
open F, $ARGV[0] || die $!;
is the same as
open F, ($ARGV[0] || die $!);
which means the die is executed if the filename is false, not if the open fails. You wanted to say
open(F, $ARGV[0]) || die $!;
or
open F, $ARGV[0] or die $!;
Also, you should be using the 3 argument form of open, in case $ARGV[0] contains characters that mean something to open.
open F, '<', $ARGV[0] or die $!;
On a different note, splitting on /\s/ means that you get a "word" between consecutive whitespace characters. You probably meant /\s+/, or as amphetamachine suggested, /\W+/, depending on how you want to define a "word".
That still leaves the problem of the empty "word" you get if the line begins with whitespace. You could split on ' ' to suppress that (it's a special case), or you could trim leading whitespace first, or insert a grep { length $_ } to weed out empty "words", or abandon split and use a different method for counting words.
Processing line by line instead of reading the whole file at once would also be a good improvement, but it's not as important as those first two items.
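Putting these points together, a line-by-line counter might look like the sketch below. The function name count_lines_and_words is hypothetical, and the in-memory filehandle with the sample text from the question stands in for a real file opened with the three-argument form.

```perl
use strict;
use warnings;

# Hypothetical helper: count lines and whitespace-separated words on $fh.
sub count_lines_and_words {
    my ($fh) = @_;
    my ($lines, $words) = (0, 0);
    while (my $line = <$fh>) {
        $lines++;
        # split ' ' is the special case: it skips leading whitespace and
        # splits on runs of whitespace, so no empty "words" are produced.
        my @fields = split ' ', $line;
        $words += @fields;
    }
    return ($lines, $words);
}

# Sample text from the question, with leading whitespace on the second line.
my $text = "abc cba\n  , , ,dce\n";
open my $in, '<', \$text or die "Cannot open in-memory handle: $!";
printf "%8d %8d\n", count_lines_and_words($in);
close $in;
```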

Your conjecture #1 is correct: because of the precedence problem, the program will not die if the open fails; the die only runs when the filename itself is false. (See cjm's answer re order of operations.)
you're using a global filehandle, rather than a lexical variable.
you're not using the three-argument form of open.
you could just read from stdin, which gives more flexibility as to input - the user can provide a file, or pipe the input into stdin.
lastly, I wouldn't write my own code to parse words; I'd reach for CPAN, say something like Lingua::EN::Splitter.
use strict; use warnings;
use Lingua::EN::Splitter qw(words);
my ($wordcount, $lines);
while (<>)
{
my $line = $_;
$lines++;
$wordcount += scalar(words $line);
}
printf "%8d %8d\n", $lines, $wordcount;

Note that open F, $ARGV[0] || die $! will not exit if the file doesn't exist, because || binds to $ARGV[0] rather than to the open call; write open(F, $ARGV[0]) || die $! instead.
There are some improvements to be made here:
{local $/; $lines = <F>;} # read all lines at once
my @words = split /\W+/, $lines;

perl + append text between two lines in file

I need to edit file , the main issue is to append text between two known lines in the file
for example I need to append the following text
a b c d e f
1 2 3 4 5 6
bla bla
Between the first_line and the second_line
first_line=")"
second_line="NIC Hr_Nic ("
remark: first_line and second_line argument can get any line or string
How to do this by perl ? ( i write bash script and I need to insert the perl syntax in my script)
lidia

You could read the file in as a single string and then use a regular expression to do the search and replace:
use strict;
use warnings;
# Slurp file myfile.txt into a single string
open(FILE,"myfile.txt") || die "Can't open file: $!";
undef $/;
my $file = <FILE>;
# Set strings to find and insert
my $first_line = ")";
my $second_line = "NIC Hr_Nic (";
my $insert = "hello world";
# Insert our text
$file =~ s/\Q$first_line\E\n\Q$second_line\E/$first_line\n$insert\n$second_line/;
# Write output to output.txt
open(OUTPUT,">output.txt") || die "Can't open file: $!";
print OUTPUT $file;
close(OUTPUT);
By unsetting $/ we put Perl into "slurp mode" and so can easily read the whole file into $file.
We use the s/// operator to do a search and replace using our two search lines as a pattern.
The \Q and \E tell Perl to escape the strings between them, i.e. to ignore any special characters that happen to be in $first_line or $second_line.
You could always write the output over the original file if desired.
The problem as you state it is not solvable with a plain -p -i one-liner, since -p processes the file one line at a time; to insert text between two specific lines you'll need to know about two lines at once (for example by slurping the whole file with -0777, or by remembering the previous line).
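For files too large to slurp comfortably, the "two lines at once" requirement can also be met by remembering the previous line while streaming. The sketch below assumes the two marker lines must match exactly and be adjacent; the helper name insert_between and the 'NIC Lan (' sample line are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: copy lines from $in to $out, inserting $insert
# wherever a line equal to $first is immediately followed by a line
# equal to $second.
sub insert_between {
    my ($in, $out, $first, $second, $insert) = @_;
    my $prev;                              # previous line, without its newline
    while (my $line = <$in>) {
        chomp(my $bare = $line);
        if (defined $prev && $prev eq $first && $bare eq $second) {
            print {$out} $insert, "\n";
        }
        print {$out} $line;
        $prev = $bare;
    }
    return;
}

# Demonstration with in-memory filehandles and made-up sample content.
my $source = "NIC Lan (\n)\nNIC Hr_Nic (\n)\n";
my $result = '';
open my $in,  '<', \$source or die "Cannot open in-memory input: $!";
open my $out, '>', \$result or die "Cannot open in-memory output: $!";
insert_between($in, $out, ')', 'NIC Hr_Nic (', "a b c d e f\n1 2 3 4 5 6\nbla bla");
close $in;
close $out;
print $result;
```

Because the lines are compared with eq rather than a regular expression, no escaping of characters such as ( or ) is needed.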

Well, to concatenate strings you do
my $text = $first_line . $second_line;
or
my $text = $first_line;
$text .= $second_line;

I'm not sure if I understand your question correctly. A "before and after" example of the file content would, I think, be easier. Anyhow, Here's my take on it, using splice instead of a regular expression. We must of course know the line numbers for this to work.
Load the file into an array:
my @lines;
open F, '<', 'filename' or die $!;
push @lines, $_ for <F>;
close F;
Insert the stuff (see perldoc -f splice):
splice @lines, 1, 0, ("stuff\n"); # the existing lines keep their newlines, so the inserted one needs its own
..and you're done. All you need to do now is save the array again:
open F, '>', 'filename' or die $!;
print F @lines;
close F;
