How do I update a line in a file using Perl?

I need to update a specific line in a file. I use a regular expression to find it, but my replacement ends up running over the next line.
The file dog2cat contains:
This example
The dog is drinking.
It is drinking milk.
My code:
open(FILE,"+<dog2cat");
while(<FILE>)
{
my $line=$_;
if($line =~ /dog/)
{
$line =~ s/dog/cat/;
print FILE $line;
}
}
close FILE;
Finally the file contains:
This example
The dog is drinking.
The cat is drinking.
I want to get:
This example
The cat is drinking.
It is drinking milk.

For simple file update tasks, you can use a Perl "one-liner":
perl -i -pe 's/dog/cat/g' dog2cat
The -i says to edit the file in place. You can add an extension to it if you want to keep a backup of the original; for example, -i.bak saves the original as dog2cat.bak before updating dog2cat.
The -p says to iterate over each line and print it out (with -i, the print goes back to the file).
-e executes the code between the quotes for each line of the file.

Open a second file for output and print to that one. Or print to the screen instead of another file, and when you run your script, pipe the output into a file.
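A minimal sketch of that approach, assuming you write to a hypothetical new file dog2cat.new and rename it over the original once you are satisfied:

#!/usr/bin/perl
use strict;
use warnings;

open my $in,  '<', 'dog2cat'     or die "Cannot open input: $!";
open my $out, '>', 'dog2cat.new' or die "Cannot open output: $!";

while (my $line = <$in>) {
    $line =~ s/dog/cat/;   # change the line if it matches
    print $out $line;      # print every line, changed or not
}

close $out;
close $in;
# rename 'dog2cat.new', 'dog2cat';  # replace the original when happy

Because every line is printed whether or not it was changed, the output file stays in sync with the input and nothing is overwritten in place.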

Related

Perl search for string and get the full line from text file

I want to search for a string and get the full line from a text file using Perl. The text file looks like the following:
data-key-1,col-1.1,col-1.2
data-key-2,col-2.1,col-2.2
data-key-3,col-3.1,col-3.2
Here I want to use data-key-1 as the search string and get the full line into a Perl variable; in other words, the exact equivalent of grep "data-key-1" data.csv in the shell.
Syntax like the following works when run from the console:
perl -wln -e 'print if /\bAPPLE\b/' your_file
But how can I place it in a script? The perl command itself can't go inside a script. Is there a way to avoid the loops?
If you know what the command-line options of your one-liner do, you know exactly what to write inside your Perl script. When you read a file, you need a loop, and the choice of loop matters for performance: using a for loop to read a file is more expensive than using a while loop, because for reads the whole file into memory first (see the sketch below).
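A sketch of that difference, assuming $fh is an already-open read handle and the loop bodies are placeholders:

# for evaluates <$fh> in list context: the entire file is pulled into
# memory before the first iteration runs.
for my $line (<$fh>) {
    # process $line
}

# while evaluates <$fh> in scalar context: one line per iteration,
# so memory use stays constant no matter how large the file is.
while (my $line = <$fh>) {
    # process $line
}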
Your one-liner:
perl -wln -e 'print if /\bAPPLE\b/' your_file
is basically saying:
-w : Use warnings
-l : Chomp the newline character from each line before processing and place it back during printing.
-n : Create an implicit while(<>) { ... } loop to perform an action on each line
-e : Tells the perl interpreter to execute the code that follows it.
print if /\bAPPLE\b/ prints the entire line if it contains the word APPLE.
So to use the above inside a perl script, you'd do:
#!/usr/bin/perl
use strict;
use warnings;

open my $fh, '<', 'your_file' or die "Cannot open file: $!\n";
while (my $line = <$fh>) {
    if ($line =~ /\bAPPLE\b/) {
        # do something with $line
    }
}
close $fh;
chomp is not really required here because you are not doing anything with the line other than checking for the existence of a word.
open my $fh, '<', 'data.csv' or die "Cannot open file: $!";
while (<$fh>) {
    print $_ if $_ =~ /^data-key-3,/;
}
close $fh;
use strict;
use warnings;
# the file name of your .csv file
my $file = 'data.csv';

# open the file for reading
open(FILE, '<', $file)
    or die("Could not open file. $!\n");

my @final;

# process line by line:
while (<FILE>) {
    my ($line) = $_;

    # remove any trailing space (the newline);
    # not necessary, but again, a good habit
    chomp($line);

    my @result = grep(/data-key-1/, $line);
    push(@final, @result);
}
close(FILE);

print @final;

perl script to add line of code only modifies one file

I have this:
perl -pi -e 'print "code I want to insert\n" if $. == 2' *.php
which puts the line code I want to insert on the second line of the file, which is what I need done to every single PHP file.
If I run it in a directory with both PHP files and non-PHP files it does the right thing, but only to one PHP file. I thought *.php would apply it to all PHP files, but it doesn't do it.
How can I write it so it will modify every PHP file in a directory? Bonus if there is an easy way to do this recursively through all directories. I don't mind running the Perl script for each directory as there aren't that many, but don't want to hand edit every single file.
The problem is that the file handle ARGV that Perl uses to read the files passed on the command line is never explicitly closed, so the line number $. just keeps incrementing after the end of the first file and never goes back to one.
Fix this by closing ARGV when it has reached end of file. Perl will reopen it to read the next file in the list, and in doing so reset $. back to one:
perl -i -pe 'print "code I want to insert\n" if $. == 2; close ARGV if eof' *.php
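The same fix works inside a script file. A sketch, where insert.pl is a hypothetical name and the files to edit are passed as arguments (perl insert.pl *.php):

#!/usr/bin/perl
use strict;
use warnings;

$^I = '';   # enable in-place editing; set to '.bak' to keep backups
while (<>) {
    print "code I want to insert\n" if $. == 2;
    print;               # write the current line back to the file
    close ARGV if eof;   # reset $. before the next file starts
}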
If you can use sed, this should work:
sed -si '2i\CODE YOU WANT TO INSERT' *.php
To do it recursively, you might try:
find -name '*.php' -execdir sed -si '2i\CODE YOU WANT TO INSERT' '{}' +
Using File::Find.
Note, I've included 3 sanity checks to verify that things are actually being processed the way that you want.
Initially the script will just print out the found files until you comment out the bare return.
Then the script will save backups unless you uncomment the unlink statement.
Finally, the script will only process a single file until you comment out the exit statement.
These three checks are just so you can verify that everything is working as you desire before editing a whole directory tree.
use strict;
use warnings;
use File::Find;
my $to_insert = "code I want to insert\n";
find(sub {
    return unless -f && /\.php$/;
    print "Edit $File::Find::name\n";
    return; # Comment out once satisfied with the found files

    local $^I = '.bak';
    local @ARGV = $_;
    while (<>) {
        print $to_insert if $. == 2 && $_ ne $to_insert;
        print;
    }
    # unlink "$_$^I"; # Uncomment to delete backups once certain that the first file is processed correctly.
    exit; # Comment out once certain that the first file is processed correctly
}, '.');

How to read/write a named pipe in Perl?

I have a script whose input and output are plugged into named pipes. I try to write something to the first named pipe and to read the result from the second named pipe, but nothing happens.
I used open, then open2, then sysopen, without success:
sysopen(FH, "/home/Moses/enfr_kiid5/pipe_CGI_Uniform", O_RDWR);
sysopen(FH2, "/home/Moses/enfr_kiid5/pipe_Detoken_CGI", O_RDWR);
print FH "test 4242 test 4242" or die "error print";
This raises no error, but it doesn't work: I see no trace of the print, the test sentence is not written into the first named pipe, and trying to read from the second one blocks the process.
Works here.
$ mkfifo pipe
$ cat pipe &
$ perl -e 'open my $f, ">", "pipe"; print $f "test\n"'
test
$ rm pipe
You don't really need fancy sysopen stuff; named pipes are really supposed to behave like regular files, albeit half-duplex. That half-duplex, one-direction-per-handle usage happens to be a difference between your code and mine, worth investigating if you really need the O_RDWR opening pattern.
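A minimal sketch of the one-direction-per-handle pattern with plain open; pipe_in and pipe_out are placeholder names for FIFOs that already exist (via mkfifo), standing in for the question's pipe_CGI_Uniform and pipe_Detoken_CGI:

#!/usr/bin/perl
use strict;
use warnings;
use IO::Handle;

# open on a FIFO blocks until the other end is opened, so this order
# must match what the peer process does, or both sides wait forever.
open my $to_peer,   '>', 'pipe_in'  or die "Cannot open pipe_in: $!";
open my $from_peer, '<', 'pipe_out' or die "Cannot open pipe_out: $!";

$to_peer->autoflush(1);   # don't let the request sit in perl's buffer

print {$to_peer} "test 4242 test 4242\n";   # newline terminates the record
my $reply = <$from_peer>;                   # blocks until a full line arrives
print "got: $reply" if defined $reply;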
You may need to unbuffer your output after opening the pipe:
sysopen(...);
sysopen(...);
$old=select FH;
$|=1;
select $old;
print FH...
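For reference, the same unbuffering can be written more readably with the core IO::Handle module, assuming FH is already open:

use IO::Handle;
FH->autoflush(1);   # same effect as the select/$| dance above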
And, as friedo says, add a newline ("\n") to the end of your print statement!

How to delete a bunch of lines in perl (adapting a known one-liner)?

context: I'm a beginner in Perl and struggling, please be patient, thanks.
the question: there is a one-liner that seems to do the job I want (in a Cygwin console it does fine on my test file). So now I need to turn it into a script, but I can't manage that, unfortunately.
The one-liner in question is provided in the answer by Aki here Delete lines in perl
perl -ne 'print unless /HELLO/../GOODBYE/' <file_name>
Namely, I would like a script that opens my file "test.dat" and removes the lines between the strings HELLO and GOODBYE. Here is what I tried, which fails (the path is fine for Cygwin):
#!/bin/perl
use strict;
use warnings;

open (THEFILE, "+<test.dat") || die "error opening";
my $line;
while ($line = <THEFILE>) {
    next if /hello/../goodbye/;
    print THEFILE $line;
}
close (THEFILE);
Many thanks in advance!
Your one-liner is equivalent to the following
while (<>) {
    print unless /HELLO/../GOODBYE/;
}
Your code does something quite different. You should not attempt to read and write the same file handle; that usually does not do what you think. When you want to quickly edit a file, you can use the -i "in-place edit" switch:
perl -ni -e 'print unless /HELLO/../GOODBYE/' file
Do note that changes to the file are irreversible, so you should make backups. You can use the backup option of that switch, e.g. -i.bak, but be aware that it is not flawless: running the same command twice will still overwrite your backup (by saving to the same backup file name twice).
The simplest and safest way to do it, IMO, is to simply use shell redirection
perl script.pl file.txt > newfile.txt
while using the script shown at the top.
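As a complete file, that script is just this sketch (script.pl being whatever name you give it):

#!/usr/bin/perl
use strict;
use warnings;

# Skip everything from a line matching HELLO through the next line
# matching GOODBYE; print all other lines to standard output.
while (<>) {
    print unless /HELLO/../GOODBYE/;
}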

Read same extension multiple files in one directory in Perl

I currently have an issue with reading files in one directory.
I need to take all the fastq files in a directory, run the script on each file, then put the new files in an ‘Edited_sequences’ folder.
The script I have is:
perl -ne '$i++; if($i<80001){print}' BM2003_TCCCAGAACAAC_L001_R1_001.fastq > ./Edited_sequences/BM2003_TCCCAGAACAAC_L001_R1_001.fastq
It takes the first 80000 lines of one fastq file and outputs the result.
Now if I have, for example, 2000 fastq files, I would need to copy and paste that 2000 times.
I know there is a glob function suited to this situation, but I just do not know how to use it.
Please help me out.
You can use perl to do the copy/paste for you. The first argument, *.fastq, expands to all the fastq files, and the last argument, ./Edited_sequences, is the target folder for the new files:
perl -e '$d = pop; `head -80000 "$_" > "$d/$_"` for @ARGV' *.fastq ./Edited_sequences
glob gets you a list of filenames matching a particular expression. It's frequently used with <> angle brackets, a lot like reading input (you can think of it as reading file names from a directory).
This is a simple example that will print the names of every ".fastq" file in the current directory:
print "$_\n" for <*.fastq>;
The important part is <*.fastq>, which gives us an array of filenames matching that expression (in this case, a file extension). If you need to change which directory your Perl script is working in, you can use chdir.
From there, we can process your files as needed:
while (my $filename = <*.fastq>) {
    open(my $in, '<', $filename) or die $!;
    open(my $out, '>', "./Edited_sequences/$filename") or die $!;

    for (1..80000) {
        my $line = <$in>;
        last unless defined $line;   # the file may have fewer than 80000 lines
        print $out $line;
    }
}
You have two choices:
Use Perl to read in the 2000 files and run it as part of your program
Use the shell to pass each of those 2000 files to your command line
Here's the bash alternative:
for file in *.fastq
do
perl -ne '$i++; if($i<80001){print}' "$file" > "./Edited_sequences/$file"
done
Your same Perl script, but with the shell finding each file. This should work and not overload the command line. The for loop in bash, if handed a glob, can expand it correctly.
However, I always recommend that you don't actually execute the command, but echo the resulting commands into a file:
for file in *.fastq
do
echo "perl -ne '\$i++; if(\$i<80001){print}' \
\"$file\" > \"./Edited_sequences/$file\"" >> myoutput.txt
done
Then, you can look at myoutput.txt to make sure it looks good before you actually do any real harm. Once you've determined that myoutput.txt is a good file, you can execute that as a shell script:
$ bash myoutput.txt