My problem is how to make my script faster (I work with big files).
The script below adds "bbb" between words if those words occur in another file that contains sequences of words.
For example, file2.txt: i eat big pizza .my big pizza ...
file1.txt (sequences):
eat big pizza
big pizza
The result (newfile.txt):
i bigbbbpizza ...
My script:
use strict;
use warnings;
use autodie;
open Newfile, ">./newfile.txt" or die "Cannot create Newfile.txt";
my %replacement;
my ($f1, $f2) = ('file1.txt', 'file2.txt');
open(my $fh, $f1);
my @seq;
foreach (<$fh>) {
    chomp;    # without this, each sequence keeps its newline and never matches
    push @seq, $_;
}
close $fh;
@seq = sort bylen @seq;    # longest sequences first
open($fh, $f2);
foreach (<$fh>) {
    foreach my $r (@seq) {
        my $t = $r;
        $t =~ s/\h+/bbb/g;
        s/$r/$t/g;    # replace the sequence with its "bbb"-joined form
    }
    print Newfile;
}
close $fh;
close Newfile;
exit 0;
sub bylen {
    length($b) <=> length($a);
}

Instead of an array
my @seq;
define your words as a hash.
my %seq;
Instead of pushing the words
push @seq, $_;
store the words in the hash. Precalculate the replacement and move it out of the loop.
my $t = $_;
$t =~ s/\h+/bbb/g;
$seq{$_} = $t;
Precalculate the words in front of the outer loop:
my @seq = keys %seq;
And use hash look-ups to find the replacement in the inner loop:
my $t = $seq{$r};
This might be a bit faster, but do not expect too much.
In most cases it is better to reduce the problem by preparing the input in a way that makes the solution easier. For example, grep -f is much faster than your Perl loops. Use grep to find the lines that need a replacement, and do the replacement with Perl or sed.
Another way is to parallelize the job. You can divide your input into n parts and run n processes on n CPUs in parallel. See the GNU parallel tutorial.

What about a regexp like this? (Beware: the words from the file are interpolated into the pattern unescaped, so this approach can cause security concerns.)
use strict;
use warnings;
open (my $Newfile, '>', 'newfile.txt') or die "Cannot create Newfile.txt: $!";
my ($f1, $f2) = qw(file1.txt file2.txt);
open (my $fh, $f1) or die "Can't open $f1 for reading: $!";
my @seq = map {split ' ', $_ } <$fh>;
close $fh;
# an improvement would be to use a hash to avoid duplicates
my $regexp = '(' . join('|', @seq) . ')';
open($fh, $f2) or die "Can't open $f2 for reading: $!";
foreach my $line (<$fh>) {
    $line =~ s/$regexp/${1}bbb/g;
    print $Newfile $line;
}
close $fh;
close $Newfile;
exit 0;
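The two ideas from the answers above (precomputing each replacement in a hash, and matching with one regexp instead of a nested loop) can be combined. The following is only a sketch under assumptions: build_replacer is a hypothetical helper name, the sequences are passed in as a list rather than read from file1.txt, and quotemeta is added so that sequences are matched literally.

```perl
use strict;
use warnings;

# Hypothetical helper: build a replacement table and one compiled regex
# from a list of word sequences (one sequence per element).
sub build_replacer {
    my @sequences = @_;
    my %replacement;
    for my $seq (@sequences) {
        ( my $joined = $seq ) =~ s/\h+/bbb/g;    # "big pizza" -> "bigbbbpizza"
        $replacement{$seq} = $joined;
    }
    # Longest sequences first, so "eat big pizza" wins over "big pizza".
    my $alternation = join '|',
        map  { quotemeta($_) }
        sort { length $b <=> length $a } keys %replacement;
    return ( qr/($alternation)/, \%replacement );
}

my ( $regex, $replacement ) = build_replacer( 'eat big pizza', 'big pizza' );

my $line = 'i eat big pizza .my big pizza ...';
$line =~ s/$regex/$replacement->{$1}/g;    # one pass, one hash look-up per match
print "$line\n";
```

Each input line is then scanned once, however many sequences there are, instead of once per sequence. The sketch does not handle an empty sequence list, which would produce a pattern that matches the empty string.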


How to combine the code so that the output is on the same line?

Can you help me combine these two programs so that the output is displayed in rows with two columns? The first column is for $1 and the second column is for $2.
Kindly help me to solve this. Thank you :)
This is my code 1.
use strict;
use warnings;
use IO::Uncompress::Gunzip qw(gunzip $GunzipError);
my $input = "par_disp_fabric.all_max_lowvcc_qor.rpt.gz";
my $output = "par_disp_fabric.all_max_lowvcc_qor.txt";
gunzip $input => $output
    or die "gunzip failed: $GunzipError\n";
open (FILE, '<', "$output") or die "Cannot open $output\n";
while (<FILE>) {
    my $line = $_;
    chomp ($line);
    if ($line =~ m/^\s+Timing Path Group \'(\S+)\'/) {
        $line = $1;
        print ("$1\n");
    }
}
close (FILE);
This is my code 2.
my $input = "par_disp_fabric.all_max_lowvcc_qor.rpt.gz";
my $output = "par_disp_fabric.all_max_lowvcc_qor.txt";
gunzip $input => $output
    or die "gunzip failed: $GunzipError\n";
open (FILE, '<', "$output") or die "Cannot open $output\n";
while (<FILE>) {
    my $line = $_;
    chomp ($line);
    if ($line =~ m/^\s+Levels of Logic:\s+(\S+)/) {
        $line = $1;
        print ("$1\n");
    }
}
close (FILE);
This is my output for code 1, which contains 26 lines of data:
The output of code 2 has the same number of lines.
paste is useful for this. Assuming your shell is bash, you can use process substitutions (code1.pl and code2.pl are placeholder names for wherever you saved your two scripts):
paste <(perl code1.pl) <(perl code2.pl)
That emits columns separated by a tab character. For prettier output, you can pipe the output of paste to column:
paste <(perl code1.pl) <(perl code2.pl) | column -t -s $'\t'
And with this, you don't need to try to "merge" your Perl programs.
To combine the two scripts and to output two items of data on the same line, you need to hold on until the end of the file (or until you have both data items) and then output them at once. So you need to combine both loops into one:
use strict;
use warnings;
use IO::Uncompress::Gunzip qw(gunzip $GunzipError);
my $input = "par_disp_fabric.all_max_lowvcc_qor.rpt.gz";
my $output = "par_disp_fabric.all_max_lowvcc_qor.txt";
gunzip $input => $output
    or die "gunzip failed: $GunzipError\n";
open (FILE, '<', "$output") or die "Cannot open $output\n";
my( $levels, $timing );
while (<FILE>) {
    my $line = $_;
    chomp ($line);
    if ($line =~ m/^\s+Levels of Logic:\s+(\S+)/) {
        $levels = $1;
    }
    if ($line =~ m/^\s+Timing Path Group \'(\S+)\'/) {
        $timing = $1;
    }
}
print "$levels, $timing\n";
close (FILE);
You still haven't given us one vital piece of information: what does the input data look like? Most importantly, are the two pieces of information you're looking for on the same line?
[Update: Looking more closely at your regexes, I see it's not possible for both pieces of information to be on the same line, as they are both supposed to be the first item on the line. It would be helpful if you were clearer about that in your question.]
I think this will do the right thing, no matter what the answer to your question is. I've also added the improvements I suggested in my answer to your previous question:
use strict;
use warnings;
use IO::Uncompress::Gunzip qw(gunzip $GunzipError);
my $zipped = "par_disp_fabric.all_max_lowvcc_qor.rpt.gz";
my $unzipped = "par_disp_fabric.all_max_lowvcc_qor.txt";
gunzip $zipped => $unzipped
    or die "gunzip failed: $GunzipError\n";
open (my $fh, '<', $unzipped) or die "Cannot open '$unzipped': $!\n";
my ($levels, $timing);
while (<$fh>) {
    if (m/^\s+Levels of Logic:\s+(\S+)/) {
        $levels = $1;
    }
    if (m/^\s+Timing Path Group \'(\S+)\'/) {
        $timing = $1;
    }
    # If we have both values, then print them out and
    # set the variables to 'undef' for the next iteration.
    # 'defined' is used so that a value of 0 still counts.
    if (defined $levels and defined $timing) {
        print "$levels, $timing\n";
        undef $levels;
        undef $timing;
    }
}
close ($fh);
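The single-loop approach above can also be tried without the gzip file. This is a minimal sketch, not the asker's actual data: pair_values is a hypothetical helper, the sample lines are made up to fit the two regexes, and the timing group is printed first and tab-separated because the question asks for that column order.

```perl
use strict;
use warnings;

# Hypothetical helper: scan report lines once and return one
# "group<TAB>levels" string per completed pair.
sub pair_values {
    my @lines = @_;
    my ( $timing, $levels, @rows );
    for (@lines) {
        $timing = $1 if /^\s+Timing Path Group '(\S+)'/;
        $levels = $1 if /^\s+Levels of Logic:\s+(\S+)/;
        if ( defined $timing and defined $levels ) {
            push @rows, "$timing\t$levels";
            undef $timing;    # reset for the next pair
            undef $levels;
        }
    }
    return @rows;
}

# Made-up sample lines shaped to fit the two regexes; the real report
# layout is not shown in the question.
my @sample = (
    "  Timing Path Group 'clk_a'",
    "  Levels of Logic:          12",
    "  Timing Path Group 'clk_b'",
    "  Levels of Logic:          0",
);
print "$_\n" for pair_values(@sample);
```

Testing with defined rather than plain truth keeps a pair whose level count is 0.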

How to get the output for this code?

I have a file t_code.txt in which I want to replace all occurrences of the strings PIOMUX_UART1_TXD and PIOMUX_UART1_RXD with the strings in @array1 containing TXD and RXD respectively, and then print the result to a new file c_code2.txt, but it's not working.
open my $f6, '<', 't_code.txt' or die $!;
my @lines = <$f6>;
my @newlines;
foreach (@lines) {
    $_ =~ s/PIOMUX_UART1_TXD/ grep ( / TXD / )(@array1)/g;
    $_ =~ s/PIOMUX_UART1_RXD/ grep ( / RXD / )(@array1)/g;
    push(@newlines, $_);
}
open my $output, '>', 'c_code2.txt' or die "Can't open the output file!";
print $output @newlines;
Since @array1 (a dreadful choice of identifier, by the way) doesn't change inside the loop, it is best to build the replacement strings once outside the loop instead of every time you make a replacement.
It isn't clear exactly what string you want to replace PIOMUX_UART1_TXD and PIOMUX_UART1_RXD with, but this code joins all the matching elements of the array with commas and uses that. I hope it's clear how to do something different if you need to.
I've also used a while loop, as there's no need to read the whole file into an array beforehand.
my ($in_file, $out_file) = qw/ t_code.txt c_code2.txt /;
open my $in_fh, '<', $in_file or die qq{Unable to open "$in_file" for reading: $!};
open my $out_fh, '>', $out_file or die qq{Unable to open "$out_file" for writing: $!};
# Build each replacement once, outside the loop
my $txd = join ',', grep /TXD/, @array1;
my $rxd = join ',', grep /RXD/, @array1;
while ( <$in_fh> ) {
    s/PIOMUX_UART1_TXD/$txd/g;
    s/PIOMUX_UART1_RXD/$rxd/g;
    print $out_fh $_;
}
close $out_fh or die $!;
Several problems in the code:
To be able to use code in the replacement part of a substitution, you must use the /e modifier.
In a s/// construct, you can't use / unquoted. Either change the separator, or backslash it.
The replacement part in a substitution is a string. In case of code, it's evaluated in scalar context. grep in scalar context returns the number of matches.
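#! /usr/bin/perl
use warnings;
use strict;
my @array1 = qw( aTXDb cRXDd );
while (<DATA>) {
    s/PIOMUX_UART1_TXD/join q(), grep m=TXD=, @array1/eg;
    s/PIOMUX_UART1_RXD/join q(), grep m=RXD=, @array1/eg;
    print;
}
# The lines after __DATA__ are illustrative sample input
__DATA__
tx = PIOMUX_UART1_TXD;
rx = PIOMUX_UART1_RXD;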
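The point about grep in scalar context is easy to check in isolation. This is a small sketch with made-up array contents and a made-up input string; it only contrasts a bare grep with a join-wrapped grep inside an /e replacement.

```perl
use strict;
use warnings;

my @array1 = qw( aTXDb cTXDd eRXDf );    # illustrative data, not from the question
my $text   = 'pin = PIOMUX_UART1_TXD;';  # illustrative input line

# With /e the replacement is code evaluated in scalar context,
# so a bare grep yields the number of matching elements.
( my $with_count = $text ) =~ s{PIOMUX_UART1_TXD}{ grep(/TXD/, @array1) }e;

# join gives grep list context, so the matching elements are inserted.
( my $with_names = $text ) =~ s{PIOMUX_UART1_TXD}{ join ',', grep(/TXD/, @array1) }e;

print "$with_count\n";    # pin = 2;
print "$with_names\n";    # pin = aTXDb,cTXDd;
```

The s{...}{...} delimiters are used here only so that the / inside the grep pattern needs no escaping.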

Merge txt files in Perl, but modify them before, leaving original files untouched

I've already posted a question and fixed the problem in my code, but now my "specification has changed" so to say, and now I need to change some things about it.
Here's code that takes all .txt files from the current directory, cuts off the last line of the first file, the first and last lines of every following file, and the first line of the last file, then writes everything to a new file (in other words: it merges all files, deleting headers and footers so that the new file has only one header and one footer).
use warnings;
use Cwd;
use Tie::File;
use Tie::Array;
my $cwd = getcwd();
my $buff = '';
# Get all files in cwd.
my @files = grep ( -f ,<*.txt>);
# Cut off header and footer of $files[1] to $files[$#files-1],
# but only footer of $files[0] and header of $files[$#files]
for (my $i = 0; $i <= $#files; $i++) {
    print 'Opening ' . $files[$i] . "\n";
    tie (@lines, Tie::File, $files[$i]) or die "can't update $file: $!";
    splice @lines, 0, 1 unless $i == 0;
    splice @lines, -1, 1 unless $i == $#files;
    untie @lines;
    open (file, "<", $files[$i]) or die "can't update $file: $!";
    while (my $line =<file>) {
        $buff .= $line;
    }
    close file;
}
# Write the buffer to a new file.
my $allfilename = $cwd.'/Trace.txt';
print 'Writing all files into new file: ' . $allfilename . "\n";
open $outputfile, ">".$allfilename or die "can't write to new file $outputfile: $!";
# Write the buffer into the output file.
print $outputfile $buff;
close $outputfile;
My problem: I don't want to change the original files, but my code does exactly that and I'm having trouble coming up with a solution. The simplest way (simple meaning not having to change too much code) would be to copy all the files to a tmp directory and mess around with the copies, leaving the original files untouched. Problem: a simple use of dircopy doesn't do it for me, since you have to give the dircopy function a new tmp dir, which makes the code usable only on Windows or only on UNIX systems (but I need portability).
The next approach would be to make use of the File::Temp module but I'm really having trouble with the docs on this one.
Does anybody have a good idea on this one?
I suspected that you didn't really want your original files modified when I answered your previous question.
I don't understand why you've gone back to accumulating all the text in a buffer before printing it, or why you've removed use strict, which is essential to any well-written Perl code.
Here's my previous solution modified to leave the input data untouched.
use strict;
use warnings;
use Tie::File;
my @files = grep -f, glob '*.txt';
my $all_filename = 'Trace.txt';
open my $out_fh, '>', $all_filename or die qq{Unable to open "$all_filename" for output: $!};
for my $i ( 0 .. $#files ) {
    my $file = $files[$i];
    next if $file eq $all_filename;
    print "Opening $file\n";
    tie my @lines, 'Tie::File', $file or die qq{Can't open "$file": $!};
    my ($start, $end) = (0, $#lines);
    ++$start unless $i == 0;
    --$end unless $i == $#files;
    print $out_fh "$_\n" for @lines[$start..$end];
}
close $out_fh;
#!/usr/bin/env perl
use strict;
use warnings;
use autodie;
my $outfile = 'Trace.txt';
# Get all files in cwd.
my @files = grep { -f && $_ ne $outfile } <*.txt>;
open my $outfh, '>', $outfile;
for my $file (@files) {
    my @lines = do { local @ARGV = $file; <> };
    shift @lines unless $file eq $files[0];
    pop @lines unless $file eq $files[-1];
    print $outfh @lines;
}
Just do not use Tie::File. Or is there a reason you use it, for example that all your files together do not fit in memory?
A version very close to your current implementation would be something like the following (untested) code. It just skips the part where you update the file only to reopen and read it afterwards. (Note that this is certainly not a very efficient or elegant way to do this; it just sticks to your implementation as closely as possible.)
use warnings;
use Cwd;
# use Tie::File;
# use Tie::Array;
my $cwd = getcwd();
my $buff = '';
# Get all files in cwd.
my @files = grep ( -f ,<*.txt>);
# Cut off header and footer of $files[1] to $files[$#files-1],
# but only footer of $files[0] and header of $files[$#files]
for (my $i = 0; $i <= $#files; $i++) {
    print 'Opening ' . $files[$i] . "\n";
    open (my $fh, "<", $files[$i]) or die "can't open $files[$i] for reading: $!";
    my @lines = <$fh>;
    splice @lines, 0, 1 unless $i == 0;
    splice @lines, -1, 1 unless $i == $#files;
    foreach my $line (@lines) {
        $buff .= $line;
    }
}
# Write the buffer to a new file.
my $allfilename = $cwd.'/Trace.txt';
print 'Writing all files into new file: ' . $allfilename . "\n";
open $outputfile, ">".$allfilename or die "can't write to new file $outputfile: $!";
# Write the buffer into the output file.
print $outputfile $buff;
close $outputfile;
Based on Miller's answer, but more suitable for large files, because it holds only one line per file in memory at a time.
#!/usr/bin/env perl
use strict;
use warnings;
use autodie;
my $outfile = 'Trace.txt';
# Get all files in cwd.
my @files = grep { -f && $_ ne $outfile } <*.txt>;
open my $outfh, '>', $outfile;
my $counter = 0;
for my $file (@files) {
    open my $fh, '<', $file;
    my ($line, $prev) = ('', '');
    my $l = 0;
    while ($line = <$fh>) {
        # Print the previous line, skipping the header of every file but the first
        print $outfh $prev unless $l++ == 1 and $counter > 0;
        $prev = $line;
    }
    $counter++;
    # The held-back last line (the footer) is printed only for the last file
    print $outfh $prev if $counter == @files and $l > 0;
    close $fh;
}
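The header and footer rule shared by the answers above can be isolated from the file handling. A minimal sketch, assuming each file's lines are already in memory: merge_parts is a hypothetical helper that takes array references and works on copies, so the inputs are never modified.

```perl
use strict;
use warnings;

# Hypothetical helper: each argument is an array reference holding the
# lines of one file. Nothing is written back to the inputs.
sub merge_parts {
    my @parts = @_;
    my @merged;
    for my $i ( 0 .. $#parts ) {
        my @lines = @{ $parts[$i] };          # copy, so the caller's data stays intact
        shift @lines unless $i == 0;          # drop the header of every part but the first
        pop @lines   unless $i == $#parts;    # drop the footer of every part but the last
        push @merged, @lines;
    }
    return @merged;
}

# Made-up file contents for illustration.
my @merged = merge_parts(
    [ "HEADER\n", "a\n", "FOOTER\n" ],
    [ "HEADER\n", "b\n", "FOOTER\n" ],
    [ "HEADER\n", "c\n", "FOOTER\n" ],
);
print @merged;
```

Indexing by position rather than comparing file names keeps the rule correct even when two parts are identical.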

How to replace string dynamically using perl script

I am trying to solve the issue below.
I have 2 files, Address.txt and File.txt. I want to replace every A/B/C/D in File.txt with the corresponding string value read from Address.txt, using a Perl script. Nothing is replaced in my output file; I get the same content as File.txt.
I tried the code below.
Here is Address.txt file
Here is File.txt
Here is my code :
use strict;
use warnings;
open (MYFILE, 'Address.txt');
foreach (<MYFILE>){
    my @data_new = split/,/sm;
    open INPUTFILE, "<", $ARGV[0] or die $!;
    open OUT, '>ariout.txt' or die $!;
    my $src = $data_new[0];
    my $des = $data_new[1];
    while (<INPUTFILE>) {
        # print "In while :$src \t$des\n";
        $_ =~ s/$src/$des/g;
        print OUT $_;
    }
    close OUT;
    # /usr/bin/perl -p -i -e "s/A/APPLE/g" ARGV[0];
}
close (MYFILE);
If I write $_ =~ s/A/Apple/g;
then the output file is fine and A is replaced with "Apple". But when the values come in dynamically, nothing gets replaced.
Thanks in advance. I am new to the Perl scripting language. Correct me if I am wrong anywhere.
Update 1: I updated the code as shown below. It's working fine now. My question: what is the Big O of this algorithm?
Code:
use warnings;
use strict;
open( my $out_fh, ">", "output.txt" ) || die "Can't open the output file for writing: $!";
open( my $address_fh, "<", "Address.txt" ) || die "Can't open the address file: $!";
my %lookup = map { chomp; split( /,/, $_, 2 ) } <$address_fh>;
open( my $file_fh, "<", "File1.txt" ) || die "Can't open the file.txt file: $!";
while (<$file_fh>) {
    my @line = split;
    for my $char ( @line ) {
        ( exists $lookup{$char} ) ? print $out_fh " $lookup{$char} " : print $out_fh " $char ";
    }
    print $out_fh "\n";
}
Not entirely sure how you want your output formatted. Do you want to keep the rows and columns as is?
I took a similar approach as above but kept the formatting the same as in your 'file.txt' file:
use warnings;
use strict;
open( my $out_fh, ">", "output.txt" ) || die "Can't open the output file for writing: $!";
open( my $address_fh, "<", "address.txt" ) || die "Can't open the address file: $!";
my %lookup = map { chomp; split( /,/, $_, 2 ) } <$address_fh>;
open( my $file_fh, "<", "file.txt" ) || die "Can't open the file.txt file: $!";
while (<$file_fh>) {
    my @line = split;
    for my $char ( @line ) {
        ( exists $lookup{$char} ) ? print $out_fh " $lookup{$char} " : print $out_fh " $char ";
    }
    print $out_fh "\n";
}
That will give you the output:
Here's another option that lets Perl handle the opening and closing of files:
use strict;
use warnings;
my $addresses_txt = pop;
my %hash = map { $1 => $2 if /(.+?),(.+)/ } <>;
push @ARGV, $addresses_txt;
while (<>) {
    my @array;
    push @array, $hash{$_} // $_ for split;
    print "@array\n";
}
Usage: perl script.pl File.txt Addresses.txt [>outFile.txt] (script.pl is a placeholder for whatever name you save the script under)
The last, optional parameter directs output to a file.
Output on your dataset:
The name of the addresses' file is implicitly popped off of @ARGV for use later. Then, a hash is built, using the key/value pairs in File.txt.
The addresses' file is read, splitting each line into its single elements, and the defined-or (//) operator is used to return the defined hash item or the single element, which is then pushed onto @array. Finally, the array is interpolated in a print statement.
Hope this helps!
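The defined-or look-up described above can be tried on its own. A minimal sketch under assumptions: replace_tokens is a hypothetical helper, the table contents are illustrative, and the // operator needs Perl 5.10 or later.

```perl
use strict;
use warnings;

# Hypothetical helper: replace each whitespace-separated token with its
# mapped value, or keep the token when the table has no entry for it.
sub replace_tokens {
    my ( $lookup, $line ) = @_;
    return join ' ', map { $lookup->{$_} // $_ } split ' ', $line;
}

my %lookup = ( A => 'Apple', B => 'Banana' );    # illustrative data only
print replace_tokens( \%lookup, 'A B C D' ), "\n";
```

Because // tests definedness rather than truth, a mapped value such as 0 or the empty string would still be used instead of falling back to the original token.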
First, here is your existing program, rewritten slightly
open the address file
convert the address file to a hash so that the letters are the keys and the strings the values
open the other file
read in the single line in it
split the line into single letters
use the letters to lookup in the hash
use strict;
use warnings;
open(my $a,"Address.txt")||die $!;
my %address=map {split(/,/) } map {split(' ')} <$a>;
open(my $f,"File.txt")||die $!;
my $list=<$f>;
for my $letter (split(' ',$list)) {
    print $address{$letter}."\n" if (exists $address{$letter});
}
To make another file with the substitutions in place, alter the loop that processes $list:
my @output;
for my $letter (split(' ',$list)) {
    if (exists $address{$letter}) {
        push @output, $address{$letter};
    }
    else {
        push @output, $letter;
    }
}
open(my $o,">newFile.txt")||die $!;
print $o "@output";
Your problem is that in every iteration of your foreach loop you overwrite any changes made earlier to the output file.
My solution:
use strict;
use warnings;
open my $replacements, 'Address.txt' or die $!;
my %r;
foreach (<$replacements>) {
    chomp;    # keep the newline out of the replacement value
    my ($k, $v) = split/,/sm;
    $r{$k} = $v;
}
my $re = '(' . join('|', keys %r) . ')';
open my $input, "<", $ARGV[0] or die $!;
while (<$input>) {
    s/$re/$r{$1}/g;    # look up each matched key in the hash
    print;
}
#!/usr/bin/perl -w
# to replace multiple text strings in a file with text from another file
# select text from 1st file, replace in 2nd file
$file1 = 'Address.txt'; $file2 = 'File.txt';
# save the strings by which to replace
%replacement = ();
open IN,"$file1" or die "cant open $file1\n";
while(<IN>)
    {chomp $_;
    @a = split ',',$_;
    $replacement{$a[0]} = $a[1];}
close IN;
open OUT,">replaced_file";
open REPL,"$file2" or die "cant open $file2\n";
while(<REPL>)
    {chomp $_;
    @a = split ' ',$_; @replaced_data = ();
    # replace strings wherever possible
    foreach $i(@a)
        {if(exists $replacement{$i}) {push @replaced_data,$replacement{$i};}
        else {push @replaced_data,$i;}
        }
    print OUT trim(join " ",@replaced_data),"\n";
    }
close REPL; close OUT;
sub trim
    {
    my $str = $_[0];
    $str =~ s/^\s+|\s+$//g;    # strip leading and trailing whitespace
    return $str;
    }

How to search and replace using hash with Perl

I'm new to Perl and I'm afraid I am stuck and wanted to ask if someone might be able to help me.
I have a file with two columns (tab separated) of oldname and newname.
I would like to use the oldname as key and newname as value and store it as a hash.
Then I would like to open a different file (gff file) and replace all the oldnames in there with the newnames and write it to another file.
I have given it my best try but am getting a lot of errors.
If you could let me know what I am doing wrong, I would greatly appreciate it.
Here are how the two files look:
oldname newname(SFXXXX) file:
genemark-scaffold00013-abinit-gene-0.18 SF130001
augustus-scaffold00013-abinit-gene-1.24 SF130002
genemark-scaffold00013-abinit-gene-1.65 SF130003
file to search and replace in (an example of one of the lines):
scaffold00013 maker gene 258253 258759 . - . ID=maker-scaffold00013-augustus-gene-2.187;Name=maker-scaffold00013-augustus-gene-2.187;
Here is my attempt:
use warnings;
use strict;
my $hashfile = $ARGV[0];
my $gfffile = $ARGV[1];
my %names;
my $oldname;
my $newname;
if (!defined $hashfile) {
    die "Usage: $0 hash_file gff_file\n";
}
if (!defined $gfffile) {
    die "Usage: $0 hash_file gff_file\n";
}
###save hashfile with two columns, oldname and newname, into a hash with oldname as key and newname as value.
open(HFILE, $hashfile) or die "Cannot open $hashfile\n";
while (my $line = <HFILE>) {
    my ($oldname, $newname) = split /\t/;
    $names{$oldname} = $newname;
}
close HFILE;
###open gff file and replace all oldnames with newnames from %names.
open(GFILE, $gfffile) or die "Cannot open $gfffile\n";
while (my $line2 = <GFILE>) {
    eval "$line2 =~ s/$oldname/$names{oldname}/g";
    open(OUT, ">SFrenamed.gff") or die "Cannot open SFrenamed.gff: $!";
    print OUT "$line2\n";
}
close OUT;
close GFILE;
Thank you!
Your main problem is that you aren't splitting the $line variable. split /\t/ splits $_ by default, and you haven't put anything in there.
This program builds the hash, and then constructs a regex from all the keys by sorting them in descending order of length and joining them with the | regex alternation operator. The sorting is necessary so that the longest of all possible choices is selected if there are any alternatives.
Every occurrence of the regex is replaced by the corresponding new name in each line of the input file, and the output written to the new file.
use strict;
use warnings;
die "Usage: $0 hash_file gff_file\n" if @ARGV < 2;
my ($hashfile, $gfffile) = @ARGV;
open(my $hfile, '<', $hashfile) or die "Cannot open $hashfile: $!";
my %names;
while (my $line = <$hfile>) {
    chomp $line;    # keep the newline out of the new name
    my ($oldname, $newname) = split /\t/, $line;
    $names{$oldname} = $newname;
}
close $hfile;
# Longest names first; quotemeta so that "." and "-" match literally
my $regex = join '|', map { quotemeta($_) } sort { length $b <=> length $a } keys %names;
$regex = qr/$regex/;
open(my $gfile, '<', $gfffile) or die "Cannot open $gfffile: $!";
open(my $out, '>', 'SFrenamed.gff') or die "Cannot open SFrenamed.gff: $!";
while (my $line = <$gfile>) {
    chomp $line;    # the newline is added back by the print below
    $line =~ s/($regex)/$names{$1}/g;
    print $out $line, "\n";
}
close $out;
close $gfile;
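Why the length sort matters can be shown with two names where one is a prefix of the other. This is only a sketch: build_regex is a hypothetical helper, the names are made up, and quotemeta is an addition so that "." and "-" match literally.

```perl
use strict;
use warnings;

# Hypothetical helper: compile an alternation from the hash keys.
# $longest_first toggles the descending-length sort.
sub build_regex {
    my ( $names, $longest_first ) = @_;
    my @keys = sort keys %$names;    # plain alphabetical order by default
    @keys = sort { length $b <=> length $a } @keys if $longest_first;
    my $alternation = join '|', map { quotemeta($_) } @keys;
    return qr/($alternation)/;
}

# Illustrative names: one key is a prefix of the other.
my %names = ( 'gene-1' => 'SF1', 'gene-1.24' => 'SF2' );

for my $longest_first ( 0, 1 ) {
    my $regex = build_regex( \%names, $longest_first );
    ( my $line = 'ID=gene-1.24;' ) =~ s/$regex/$names{$1}/g;
    print "$line\n";
}
```

Without the sort the shorter key matches first and leaves ".24" behind (ID=SF1.24;); with it the full name is replaced (ID=SF2;). Perl tries alternatives left to right and takes the first one that matches, which is why the order of the joined keys decides the outcome.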
Why are you using an eval? Also, $oldname is going to be undefined in the second while loop, because in the first while loop you redeclare it (and $newname) with my, so the outer variables are never set. Even if you had used the outer variables, they would hold only the very last pair that you processed, which wouldn't be helpful.
Take out the my $oldname and my $newname at the top of your script; they are useless.
Take out the entire eval line. You need to repeat the regex for each thing you want to replace. Try something like:
$line2 =~ s/$_/$names{$_}/g for keys %names;
Also see Borodin's answer. He made one big regex instead of a loop, and caught your lack of the second argument to split.