Replace a string in file - perl

My file is like this
DIV=25
FACILITY=11111
and I want to use Perl to replace DIV=25 with DIV=30. Below is my script to do it, but the output file contains DIV=3030
open( IN_IOE, $FILE_NAME ) || die "Cannot open file";
my @line_ioe = <IN_IOE>;
close(IN_IOE);
chomp @line_ioe;
foreach $_ ( @line_ioe ) {
s/DIV=/DIV=30/
}
open( OUT, ">test.txt" );
foreach $_ (@line_ioe) {
print OUT "$_ \n";
}
close(OUT);
The output of my file is
DIV=3030
FACILITY=11111
Can anyone please show me how to replace that line in file with Perl, and point out where I was wrong.

You can do that in one line of Perl at the command line:
perl -pi -e 's/DIV=25/DIV=30/' file.txt

If the file has several lines with different values (e.g. DIV=25, DIV=31, DIV=21) and you want to set all of them, you could do the following.
s/DIV=\d+/DIV=30/g
Here \d+ matches one or more digits, and the g flag replaces every match on the line rather than only the first.

The code you show certainly didn't change DIV=30 into DIV=3030. It didn't do anything at all because you have opened your output file for input
This line
open( OUT, "<test.txt");
should look like this
open OUT, '>', 'test.txt' or die $!;
Also, if you want to replace DIV=30 with DIV=25 then you need to write that. I think it's clear that the substitution
s/DIV=/DIV=25/
will change DIV=30 into DIV=2530. Use this instead
s/DIV=30/DIV=25/
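For the case in the question (turning DIV=25 into DIV=30 from inside a script rather than on the command line), a minimal sketch could look like the following. The helper name set_div is hypothetical, and the sample lines are held in an array so the snippet runs without any files. The key point is that the pattern matches the old digits as well as the DIV= prefix.

```perl
use strict;
use warnings;

# Hypothetical helper: return a copy of the lines with any "DIV=<digits>"
# line rewritten to the new value. Other lines are returned unchanged.
sub set_div {
    my ( $new_value, @lines ) = @_;
    for my $line (@lines) {
        # Match the whole old value, not just the "DIV=" prefix;
        # matching only the prefix leaves the old digits in place.
        $line =~ s/^DIV=\d+$/DIV=$new_value/;
    }
    return @lines;
}

# Sample data taken from the question (already chomped).
my @input = ( "DIV=25", "FACILITY=11111" );
print "$_\n" for set_div( 30, @input );
```

Because @lines is a copy of the arguments, the caller's array is left untouched; the rewritten lines come back as the return value.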

Related

Adding sequence from FASTA file using Perl

I'm still learning Perl and I have a program which is able to take a FASTA file sequence header and print only the species name within square brackets. I want to add to this code to have it also print the entire sequence associated with the species.
Here is my code:
#!/usr/bin/perl
use warnings;
my $file = 'seqs.fasta';
my $tmp = 'newseqs.fasta';
open(OUT, '>', $tmp) or die "Can't open $tmp: $!";
open(IN, '<', $file) or die "Can't open $file: $!";
while(<IN>) {
chomp;
if ( $_ =~ /\[([^]]+)\]/ ) {
print OUT "$1\n";
}
}
close(IN);
close(OUT);
Here is a sample of the original FASTA file I had:
>gi|334187971|ref|NP_001190408.1| Cam-binding protein 60-like G [Arabidopsis thaliana] >gi|332006244|gb|AED93627.1| Cam-binding protein 60-like G [Arabidopsis thaliana]
MKIRNSPSFHGGSGYSVFRARNLTFKKVVKKVMRDQSNNQFMIQMENMIRRIVREEIQRSLQPFLSSSCVSMERSRSETP
SSRSRLKLCFINSPPSSIFTGSKIEAEDGSPLVIELVDATTNTLVSTGPFSSSRVELVPLNADFTEESWTVEGFNRNILT
QREGKRPLLTGDLTVMLKNGVGVITGDIAFSDNSSWTRSRKFRLGAKLTGDGAVEARSEAFGCRDQRGESYKKHHPPCPS
DEVWRLEKIAKDGVSATRLAERKILTVKDFRRLYTIIGAGVSKKTWNTIVSHAMDCVLDETECYIYNANTPGVTLLFNSV
YELIRVSFNGNDIQNLDQPILDQLKAEAYQNLNRITAVNDRTFVGHPQRSLQCPQDPGFVVTCSGSQHIDFQGSLDPSSS
SMALCHKASSSTVHPDVLMSFDNSSTARFHIDKKFLPTFGNSFKVSELDQVHGKSQTVVTKGCIENNEEDENAFSYHHHD
DMTSSWSPGTHQAVETMFLTVSETEEAGMFDVHFANVNLGSPRARWCKVKAAFKVRAAFKEVRRHTTARNPREGL
Currently, the output only pulls the species name Arabidopsis thaliana
However, I want it to print properly in a fasta file as such:
>Arabidopsis thaliana
MKIRNSPSFHGGSGYSVFRARNLTFKKVVKKVMRDQSNNQFMIQMENMIRRIVREEIQRSLQPFLSSSCVSMERSRSETP
SSRSRLKLCFINSPPSSIFTGSKIEAEDGSPLVIELVDATTNTLVSTGPFSSSRVELVPLNADFTEESWTVEGFNRNILT
QREGKRPLLTGDLTVMLKNGVGVITGDIAFSDNSSWTRSRKFRLGAKLTGDGAVEARSEAFGCRDQRGESYKKHHPPCPS
DEVWRLEKIAKDGVSATRLAERKILTVKDFRRLYTIIGAGVSKKTWNTIVSHAMDCVLDETECYIYNANTPGVTLLFNSV
YELIRVSFNGNDIQNLDQPILDQLKAEAYQNLNRITAVNDRTFVGHPQRSLQCPQDPGFVVTCSGSQHIDFQGSLDPSSS
SMALCHKASSSTVHPDVLMSFDNSSTARFHIDKKFLPTFGNSFKVSELDQVHGKSQTVVTKGCIENNEEDENAFSYHHHD
DMTSSWSPGTHQAVETMFLTVSETEEAGMFDVHFANVNLGSPRARWCKVKAAFKVRAAFKEVRRHTTARNPREGL
Could you suggest ways to modify the code to achieve this?
That's because this code:
if ( $_ =~ /\[([^]]+)\]/ ) {
print OUT "$1\n";
}
finds and captures any text inside []. But if the pattern doesn't match, you don't do anything else with the line, such as printing it.
Adding:
else {
print OUT "$_\n";
}
means that a line that doesn't contain [] gets printed unchanged. The newline has to be added back because the line was chomped earlier.
I would also suggest:
Turn on use strict;.
Lexical filehandles are good practice: open ( my $input, '<', $file ) or die $!;
A pattern match applies to $_ by default, so you can write that 'if' as if ( /\[([^]]+)\]/ )
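Put together, those suggestions might look like the sketch below. To keep it self-contained it reads from and writes to in-memory strings instead of seqs.fasta and newseqs.fasta, and the sample header is a shortened, made-up stand-in for the real one; both are assumptions for illustration only.

```perl
use strict;
use warnings;

# Sample data stands in for seqs.fasta so the sketch runs on its own.
my $fasta = <<'END';
>gi|1| Cam-binding protein [Arabidopsis thaliana]
MKIRNSPSFHGG
SSRSRLKLCFIN
END

my $result = '';
open( my $input,  '<', \$fasta )  or die "Can't read sample data: $!";
open( my $output, '>', \$result ) or die "Can't write to string: $!";

while (<$input>) {
    if ( /\[([^]]+)\]/ ) {       # the match applies to $_ implicitly
        print $output ">$1\n";   # header becomes ">Species name"
    }
    else {
        print $output $_;        # no chomp, so the newline is still there
    }
}
close $input;
close $output;

print $result;
```

With real files, the two references \$fasta and \$result would simply be replaced by the file names.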
A couple of general points about your program
You must always use strict as well as use warnings 'all' at the top of every Perl program you write. It will reveal many simple mistakes that you could otherwise easily overlook
You have done well to choose the three-parameter form of open, but you should also use lexical file handles. So this line
open(OUT, '>', $tmp) or die "Can't open $tmp: $!";
should be written as
open my $out_fh, '>', $tmp or die "Can't open $tmp: $!";
It's probably best to supply the input and output file names on the command line, so you don't have to edit your program to run it against different files
I would solve your problem like this. It checks to see if each line is a header that contains a string enclosed in square brackets. The first test is that the line starts with a closing angle bracket >, and the second test is the same as the one in your own program, which captures the bracketed string: the species name.
If both checks pass, the species name is printed with a leading > and a trailing newline; otherwise the line is printed as it is.
This program should be run like this
$ fasta_species.pl seqs.fasta > newseqs.fasta
The dollar is just the Linux prompt character, and this assumes you have put the program in a file named fasta_species.pl. You can omit the > newseqs.fasta to display the output directly on the screen, so that you can see what is being produced without creating an output file and opening it in an editor
use strict;
use warnings 'all';
while ( <> ) {
if ( /^>/ and / \[ ( [^\[\]]+ ) \] /x ) {
print ">$1\n";
}
else {
print;
}
}
output
>Arabidopsis thaliana
MKIRNSPSFHGGSGYSVFRARNLTFKKVVKKVMRDQSNNQFMIQMENMIRRIVREEIQRSLQPFLSSSCVSMERSRSETP
SSRSRLKLCFINSPPSSIFTGSKIEAEDGSPLVIELVDATTNTLVSTGPFSSSRVELVPLNADFTEESWTVEGFNRNILT
QREGKRPLLTGDLTVMLKNGVGVITGDIAFSDNSSWTRSRKFRLGAKLTGDGAVEARSEAFGCRDQRGESYKKHHPPCPS
DEVWRLEKIAKDGVSATRLAERKILTVKDFRRLYTIIGAGVSKKTWNTIVSHAMDCVLDETECYIYNANTPGVTLLFNSV
YELIRVSFNGNDIQNLDQPILDQLKAEAYQNLNRITAVNDRTFVGHPQRSLQCPQDPGFVVTCSGSQHIDFQGSLDPSSS
SMALCHKASSSTVHPDVLMSFDNSSTARFHIDKKFLPTFGNSFKVSELDQVHGKSQTVVTKGCIENNEEDENAFSYHHHD
DMTSSWSPGTHQAVETMFLTVSETEEAGMFDVHFANVNLGSPRARWCKVKAAFKVRAAFKEVRRHTTARNPREGL

Need to replace value from one file to another file using perl

I am writing a Perl program that reads a value from one file and substitutes it into another file. The program runs successfully, but the value doesn't get replaced. Please suggest where the error is.
use strict;
use warnings;
open(file1,"address0.txt") or die "Cannot open file.\n";
my $value;
$value=<file1>;
system("perl -p -i.bak -e 's/add/$value/ig' rough.sp");
Here the value which I want to replace exists in address0.txt file. It is a single value 1. I want to place this value in place of add in other file rough.sp.
My rough.sp looks like
Vdd 1 0 add
My address0.txt looks like
1
So output should be like
Vdd 1 0 1
Please help me out. Thanks in advance
Assuming that there is a 1:1 relationship between lines in address0.txt and rough.sp, you can proceed like this:
use strict;
use warnings;
my ($curline_1,$curline_2);
open(file1, "address0.txt") or die "Cannot open file.\n";
open(file2, "rough.sp") or die "Cannot open file.\n";
open(file3, ">out.sp") or die "Cannot open file.\n";
while (<file1>) {
$curline_1 = $_;
chomp($curline_1);
$curline_2 = <file2>;
$curline_2 =~ s/ add/ $curline_1/;
print file3 $curline_2;
}
close(file1);
close(file2);
close(file3);
exit(0);
Explanation:
The code iterates through the lines of your input files in parallel. Note that the lines read include the line terminator. Line contents from the 'address' file are taken as replacement values for the add literal in your .sp file. Line terminators from the 'address' file are removed to avoid introducing additional newlines.
Addendum:
An extension for multi-replacements might look like this:
$curline_1 = $_;
chomp($curline_1);
my @parts = split(/ +/, $curline_1); # splits the line from address0.txt into an array of strings made up of contiguous non-whitespace chars
$curline_2 = <file2>;
$curline_2 =~ s/ add/ $parts[0]/;
$curline_2 =~ s/ sub/ $parts[1]/;
# ...
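For the single-value case in the question, another option is to do the in-place edit inside the same Perl process instead of launching a second perl through system. The sketch below is one way to do that, under some assumptions: the helper name replace_in_file is hypothetical, and the demo writes throwaway copies of address0.txt and rough.sp into a temporary directory so it can run anywhere. Setting $^I and @ARGV locally gives the same behaviour as the -i.bak switch, and chomp removes the newline that would otherwise end up inside the replacement.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper: replace every case-insensitive match of $word in $file
# with $value, keeping the original as "$file.bak".
sub replace_in_file {
    my ( $file, $word, $value ) = @_;
    local ( $^I, @ARGV ) = ( '.bak', $file );    # same effect as -i.bak
    while (<>) {
        s/\Q$word\E/$value/ig;
        print;                                   # goes to the edited file
    }
    return;
}

# Demo with throwaway copies of the two files from the question.
my $dir = tempdir( CLEANUP => 1 );
my ( $address, $rough ) = ( "$dir/address0.txt", "$dir/rough.sp" );

open( my $fh, '>', $address ) or die "Cannot write $address: $!";
print $fh "1\n";
close $fh;
open( $fh, '>', $rough ) or die "Cannot write $rough: $!";
print $fh "Vdd 1 0 add\n";
close $fh;

# Read the replacement value and strip its trailing newline.
open( $fh, '<', $address ) or die "Cannot open $address: $!";
my $value = <$fh>;
close $fh;
chomp $value;

replace_in_file( $rough, 'add', $value );

# Show the edited file.
open( $fh, '<', $rough ) or die "Cannot open $rough: $!";
print <$fh>;
close $fh;
```

Since no shell is involved, quoting differences between platforms no longer matter.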

Replace a text in perl

I am trying to find a particular string in a file and replace it with another string, writing the change back to the same file. I am using the following code:
open(FILEB,"+<File B (2).txt");
$hostNameA="Any string\n";
foreach $lineB(FILEB)
{
seek(FILEB,-length($lineB),1);
$lineB=~s/$hostNameB/$hostNameA/;
print FILEB $lineB;
}
Basically, my question is how to replace hostNameB with hostNameA in FileB.
If you are working on Linux, there is no need to open the file in Perl at all, and no need to create a backup file. The following script should work:
#!/usr/bin/perl
#Commandline
my $command = "sed -i 's/FOO/BAR/g' /mydir/myfile.txt";
#Execute Command
`$command`;
Above script will replace all occurrences of string 'FOO' with 'BAR' in myfile.txt
Writing to the file through the same handle you are reading from will destroy the original file. Read from one file and write to another instead:
open IN, '<', 'path_to_file' or die $!;
open OUT, '>', 'path_to_replaced_file' or die $!;
while (my $line = <IN>) {
$line =~ s/something/tosomething/g;
print OUT $line;
}
close OUT;
close IN;
# if you wish, backup old file, and rename new file
How about this
$ perl -pi.bak -e 's/hostNameB/hostNameA/g' "File B (2).txt"
This reads "File B (2).txt", applies the substitution s/hostNameB/hostNameA/g to every line, and writes the result back to the same file.
The -i.bak switch keeps a backup of the original file with a .bak extension.
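One detail worth keeping in mind with any of these substitutions: host names usually contain dots, and an unescaped dot in a regular expression matches any character. A small sketch of the difference, using a hypothetical helper replace_literal and made-up host names, with \Q...\E to treat the search string literally:

```perl
use strict;
use warnings;

# Hypothetical helper: replace the literal string $old with $new in every
# line and return the modified copies.
sub replace_literal {
    my ( $old, $new, @lines ) = @_;
    s/\Q$old\E/$new/g for @lines;    # \Q...\E escapes regex metacharacters
    return @lines;
}

# Made-up sample lines; the second one must NOT be changed.
my @lines = ( "server=host.b.example\n", "backup=hostXbYexample\n" );
print replace_literal( 'host.b.example', 'host.a.example', @lines );
```

Without \Q the second line would also be rewritten, because each dot in the pattern would match the X and the Y.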

How do I copy a CSV file, but skip the first line?

I want to write a script that takes a CSV file, deletes its first row and creates a new output csv file.
This is my code:
use Text::CSV_XS;
use strict;
use warnings;
my $csv = Text::CSV_XS->new({sep_char => ','});
my $file = $ARGV[0];
open(my $data, '<', $file) or die "Could not open '$file'\n";
my $csvout = Text::CSV_XS->new({binary => 1, eol => $/});
open my $OUTPUT, '>', "file.csv" or die "Can't able to open file.csv\n";
my $tmp = 0;
while (my $line = <$data>) {
# if ($tmp==0)
# {
# $tmp=1;
# next;
# }
chomp $line;
if ($csv->parse($line)) {
my @fields = $csv->fields();
$csvout->print($OUTPUT, \@fields);
} else {
warn "Line could not be parsed: $line\n";
}
}
On the perl command line I write: c:\test.pl csv.csv and it doesn't create the file.csv output, but when I double click the script it creates a blank CSV file. What am I doing wrong?
Your program isn't ideally written, but I can't tell why it doesn't work if you pass the CSV file on the command line as you have described. Do you get the errors Could not open 'csv.csv' or Can't able to open file.csv? If not then the file must be created in your current directory. Perhaps you are looking in the wrong place?
If all you need to do is to drop the first line then there is no need to use a module to process the CSV data - you can handle it as a simple text file.
If the file is specified on the command line, as in c:\test.pl csv.csv, you can read from it without explicitly opening it using the <> operator.
This program reads the lines from the input file and prints them to the output only if the line counter (the $. variable) isn't equal to one.
use strict;
use warnings;
open my $out, '>', 'file.csv' or die $!;
while (my $line = <>) {
print $out $line unless $. == 1;
}
You don't need any modules for this task, since CSV (comma-separated values) files are simply text files: just open the file and iterate over its lines, writing every line to the output except the one you want to drop (here, the first). Skipping the first line is so simple that it is probably better done with a command-line one-liner than with a dedicated script.
A quick search turns up numerous tutorials about Perl input/output operations; see for example:
http://learn.perl.org/examples/read_write_file.html
PS. Perl scripts ( programs ) usually are not "compiled" into binary file - they are of course "compiled", but, uhm, on the fly - that's why /usr/bin/perl is called rather "interpreter" than "compiler" like gcc or g++. I guess what you're looking for is some editor with syntax highlighting and other development goods - you probably could try Eclipse with perl plugin for that ( cross platform ).
http://www.eclipse.org/downloads/
http://www.epic-ide.org/download.php/
This
user@localhost:~$ cat blabla.csv | perl -ne 'print $_ if $x++; '
skips the first line: $x++ returns the value of $x before incrementing it, so the condition is false for the first line only and true for every later one.
You are missing your first (and only) argument due to Windows.
I think this question will help you: @ARGV is empty using ActivePerl in Windows 7

perl + append text between two lines in file

I need to edit a file; the main issue is to insert text between two known lines in the file
for example I need to append the following text
a b c d e f
1 2 3 4 5 6
bla bla
Between the first_line and the second_line
first_line=")"
second_line="NIC Hr_Nic ("
Remark: the first_line and second_line arguments can hold any line or string
How can I do this in Perl? (I am writing a bash script and need to embed the Perl code in it)
lidia
You could read the file in as a single string and then use a regular expression to do the search and replace:
use strict;
use warnings;
# Slurp file myfile.txt into a single string
open(FILE,"myfile.txt") || die "Can't open file: $!";
undef $/;
my $file = <FILE>;
# Set strings to find and insert
my $first_line = ")";
my $second_line = "NIC Hr_Nic (";
my $insert = "hello world";
# Insert our text
$file =~ s/\Q$first_line\E\n\Q$second_line\E/$first_line\n$insert\n$second_line/;
# Write output to output.txt
open(OUTPUT,">output.txt") || die "Can't open file: $!";
print OUTPUT $file;
close(OUTPUT);
By unsetting $/ we put Perl into "slurp mode" and so can easily read the whole file into $file.
We use the s/// operator to do a search and replace using our two search lines as a pattern.
The \Q and \E tell Perl to escape the strings between them, i.e. to ignore any special characters that happen to be in $first_line or $second_line.
You could always write the output over the original file if desired.
The problem as you state it cannot be solved with the usual line-at-a-time -p -i processing, because to insert text between two specific lines you need to know about two lines at once. You either have to slurp the whole file (as above, or with the -0777 switch) or remember the previous line yourself.
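If slurping the whole file is not desirable, remembering the previous line is enough to spot the pair. The following is a minimal sketch of that idea; the helper name insert_between and the sample file contents are made up for illustration, and the lines are compared exactly, including their newlines.

```perl
use strict;
use warnings;

# Hypothetical helper: return @lines with the lines in @$insert placed
# between every occurrence of $first immediately followed by $second.
sub insert_between {
    my ( $first, $second, $insert, @lines ) = @_;
    my @out;
    my $prev;
    for my $line (@lines) {
        # The pair is complete when the previous line was $first
        # and the current one is $second.
        push @out, @$insert
            if defined $prev and $prev eq $first and $line eq $second;
        push @out, $line;
        $prev = $line;
    }
    return @out;
}

# Made-up file contents; the text is inserted after the first ")" only,
# because only that one is followed by "NIC Hr_Nic (".
my @file = ( "NIC Eth (\n", ")\n", "NIC Hr_Nic (\n", ")\n" );
my @new  = ( "a b c d e f\n", "1 2 3 4 5 6\n", "bla bla\n" );
print insert_between( ")\n", "NIC Hr_Nic (\n", \@new, @file );
```

The same loop works on lines read one at a time from a filehandle, since it only ever needs the current and the previous line.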
Well, to concatenate strings you do
my $text = $first_line . $second_line;
or
my $text = $first_line;
$text .= $second_line;
I'm not sure if I understand your question correctly. A "before and after" example of the file content would, I think, be easier. Anyhow, Here's my take on it, using splice instead of a regular expression. We must of course know the line numbers for this to work.
Load the file into an array:
my @lines;
open F, '<', 'filename' or die $!;
push @lines, $_ for <F>;
close F;
Insert the stuff (see perldoc -f splice):
splice @lines, 1, 0, ('stuff');
..and you're done. All you need to do now is save the array again:
open F, '>', 'filename' or die $!;
print F @lines;
close F;
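When the line number is not known in advance, the index for splice can be looked up from the two marker lines. This is only a sketch under the assumption that the markers match whole lines exactly; find_insert_index is a hypothetical helper and the sample array stands in for the file contents.

```perl
use strict;
use warnings;

# Hypothetical helper: index of the line that equals $second and directly
# follows a line equal to $first, or -1 when the pair does not occur.
sub find_insert_index {
    my ( $first, $second, @lines ) = @_;
    for my $i ( 1 .. $#lines ) {
        return $i if $lines[ $i - 1 ] eq $first and $lines[$i] eq $second;
    }
    return -1;
}

# Made-up file contents, as they would look after reading into an array.
my @lines = ( "(\n", ")\n", "NIC Hr_Nic (\n", ")\n" );
my $at = find_insert_index( ")\n", "NIC Hr_Nic (\n", @lines );

# Insert in front of the second marker line, if the pair was found.
splice @lines, $at, 0, "bla bla\n" if $at >= 0;
print @lines;
```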