Perl - substring keywords - perl

I have a text file where is lot of lines, I need search in this file keywords and if exist write to log file line where is keywords and line one line below and one above the keyword. Now search or write keyword not function if find write all and I dont known how can I write line below and above. Thanks for some advice.
my $vstup = "C:/Users/Omega/Documents/Kontroly/testkontroly/kontroly20220513_154743.txt";
my $log = "C:/Users/Omega/Documents/Kontroly/testkontroly/kontroly.log";
open( my $default_fh, "<", $vstup ) or die $!;
open( my $main_fh, ">", $log ) or die $!;
my $var = 0;
while ( <$default_fh> ) {
if (/\Volat\b/)
$var = 1;
}
if ( $var )
print $main_fh $_;
}
}
close $default_fh;
close $main_fh;

The approach below use one semaphore variable and a buffer variable to enable the desired behavior.
Notice that the pattern used was replaced by 'A` for simplicity testing.
#!/usr/bin/perl
use strict;
use warnings;
my ($in_fh, $out_fh);
my ($in, $out);
$in = 'input.txt';
$out = 'output.txt';
open($in_fh, "< ", $in) || die $!."\n";
open($out_fh, "> ", $out) || die $!;
my $p_next = 0;
my $p_line;
while (my $line = <$in_fh>) {
# print line after occurrence
print $out_fh $line if ($p_next);
if ($line =~ /A/) {
if (defined($p_line)) {
# print previous line
print $out_fh $p_line;
# once printed undefine variable to avoid printing it again in the next loop
undef($p_line);
}
# Print current line if not already printed as the line follows a pattern
print $out_fh $line if (!$p_next);
# toggle semaphore to print the next line
$p_next = 1;
} else {
# pattern not found.
# if pattern was not detected in both current and previous line.
$p_line = $line if (!$p_next);
$p_next = 0;
}
}
close($in_fh);
close($out_fh);

Related

how to combine the code to make the output is on the same line?

Can you help me to combine both of these progeam to display the output in a row with two columns? The first column is for $1 and the second column is $2.
Kindly help me to solve this. Thank you :)
This is my code 1.
#!/usr/local/bin/perl
#!/usr/bin/perl
use strict ;
use warnings ;
use IO::Uncompress::Gunzip qw(gunzip $GunzipError);
my $input = "par_disp_fabric.all_max_lowvcc_qor.rpt.gz";
my $output = "par_disp_fabric.all_max_lowvcc_qor.txt";
gunzip $input => $output
or die "gunzip failed: $GunzipError\n";
open (FILE, '<',"$output") or die "Cannot open $output\n";
while (<FILE>) {
my $line = $_;
chomp ($line);
if ($line=~ m/^\s+Timing Path Group \'(\S+)\'/) {
$line = $1;
print ("$1\n");
}
}
close (FILE);
This is my code 2.
my $input = "par_disp_fabric.all_max_lowvcc_qor.rpt.gz";
my $output = "par_disp_fabric.all_max_lowvcc_qor.txt";
gunzip $input => $output
or die "gunzip failed: $GunzipError\n";
open (FILE, '<',"$output") or die "Cannot open $output\n";
while (<FILE>) {
my $line = $_;
chomp ($line);
if ($line=~ m/^\s+Levels of Logic:\s+(\S+)/) {
$line = $1;
print ("$1\n");
}
}
close (FILE);
this is my output for code 1 which contain 26 line of data:
**async_default**
**clock_gating_default**
Ddia_link_clk
Ddib_link_clk
Ddic_link_clk
Ddid_link_clk
FEEDTHROUGH
INPUTS
Lclk
OUTPUTS
VISA_HIP_visa_tcss_2000
ckpll_npk_npkclk
clstr_fscan_scanclk_pulsegen
clstr_fscan_scanclk_pulsegen_notdiv
clstr_fscan_scanclk_wavegen
idvfreqA
idvfreqB
psf5_primclk
sb_nondet4tclk
sb_nondetl2tclk
sb_nondett2lclk
sbclk_nondet
sbclk_sa_det
stfclk_scan
tap4tclk
tapclk
The output code 1 also has same number of line.
paste is useful for this: assuming your shell is bash, then using process substitutions
paste <(perl script1.pl) <(perl script2.pl)
That emits columns separated by a tab character. For prettier output, you can pipe the output of paste to column
paste <(perl script1.pl) <(perl script2.pl) | column -t -s $'\t'
And with this, you con't need to try and "merge" your perl programs.
To combine the two scripts and to output two items of data on the same line, you need to hold on until the end of the file (or until you have both data items) and then output them at once. So you need to combine both loops into one:
#!/usr/bin/perl
use strict ;
use warnings ;
use IO::Uncompress::Gunzip qw(gunzip $GunzipError);
my $input = "par_disp_fabric.all_max_lowvcc_qor.rpt.gz";
my $output = "par_disp_fabric.all_max_lowvcc_qor.txt";
gunzip $input => $output
or die "gunzip failed: $GunzipError\n";
open (FILE, '<',"$output") or die "Cannot open $output\n";
my( $levels, $timing );
while (<FILE>) {
my $line = $_;
chomp ($line);
if ($line=~ m/^\s+Levels of Logic:\s+(\S+)/) {
$levels = $1;
}
if ($line=~ m/^\s+Timing Path Group \'(\S+)\'/) {
$timing = $1;
}
}
print "$levels, $timing\n";
close (FILE);
You still haven't given us one vital piece of information - what does the input data looks like. Most importantly, are the two pieces of information you're looking for on the same line?
[Update: Looking more closely at your regexes, I see it's possible for both pieces of information to be on the same line - as they are both supposed to be the first item on the line. It would be helpful if you were clearer about that in your question.]
I think this will do the right thing, no matter what the answer to your question is. I've also added the improvements I suggested in my answer to your previous question:
#!/usr/bin/perl
use strict ;
use warnings ;
use IO::Uncompress::Gunzip qw(gunzip $GunzipError);
my $zipped = "par_disp_fabric.all_max_lowvcc_qor.rpt.gz";
my $unzipped = "par_disp_fabric.all_max_lowvcc_qor.txt";
gunzip $zipped => $unzipped
or die "gunzip failed: $GunzipError\n";
open (my $fh, '<', $unzipped) or die "Cannot open '$unzipped': $!\n";
my ($levels, $timing);
while (<$fh>) {
chomp;
if (m/^\s+Levels of Logic:\s+(\S+)/) {
$levels = $1;
}
if (m/^\s+Timing Path Group \'(\S+)\'/) {
$timing = $1;
}
# If we have both values, then print them out and
# set the variables to 'undef' for the next iteration
if ($levels and $timing) {
print "$levels, $timing\n";
undef $levels;
undef $timing;
}
}
close ($fh);

Want to add random string to identifier line in fasta file

I want to add random string to existing identifier line in fasta file.
So I get:
MMETSP0259|AmphidiniumcarteCMP1314aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
Then the sequence on the next lines as normal. I am have problem with i think in the format output. This is what I get:
MMETSP0259|AmphidiniumCMP1314aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
CTTCATCGCACATGGATAACTGTGTACCTGACTaaaaaaaaaaaaaaaaaaaaaaaaaaaaaab
TCTGGGAAAGGTTGCTATCATGAGTCATAGAATaaaaaaaaaaaaaaaaaaaaaaaaaaaaaac
It's added to every line. (I altered length to fit here.) I want just to add to the identifier line.
This is what i have so far:
use strict;
use warnings;
my $currentId = "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa";
my $header_line;
my $seq;
my $uniqueID;
open (my $fh,"$ARGV[0]") or die "Failed to open file: $!\n";
open (my $out_fh, ">$ARGV[0]_longer_ID_MMETSP.fasta");
while( <$fh> ){
if ($_ =~ m/^(\S+)\s+(.*)/) {
$header_line = $1;
$seq = $2;
$uniqueID = $currentId++;
print $out_fh "$header_line$uniqueID\n$seq";
} # if
} # while
close $fh;
close $out_fh;
Thanks very much, any ideas will be greatly appreciated.
Your program isn't working because the regex ^(\S+)\s+(.*) matches every line in the input file. For instance, \S+ matches CTTCATCGCACATGGATAACTGTGTACCTGACT; the newline at the end of the line matches \s+; and nothing matches .*.
Here's how I would encode your solution. It simply appends $current_id to the end of any line that contains a pipe | character
use strict;
use warnings;
use 5.010;
use autodie;
my ($filename) = #ARGV;
my $current_id = 'a' x 57;
open my $in_fh, '<', $filename;
open my $out_fh, '>', "${filename}_longer_ID_MMETSP.fasta";
while ( my $line = <$in_fh> ) {
chomp $line;
$line .= $current_id if $line =~ tr/|//;
print $line, "\n";
}
close $out_fh;
output
MMETSP0259|AmphidiniumCMP1314aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
CTTCATCGCACATGGATAACTGTGTACCTGACT
TCTGGGAAAGGTTGCTATCATGAGTCATAGAAT

Perl: printing original file with changes

I wrote this code and it works fine, it should find lines in which there's no string like 'SID' and append a pipe | at the beginning of the line, so like this: find all lines in which there's no 'SID' and append a pipe | at the beginning of the line. But how I wrote it, I can just output the lines which were changed and have a pipe. What I actually want: leave the file as it is and just append the pipes to the lines which match. Thank you.
#!usr/bin/perl
use strict;
use warnings;
use autodie;
my $fh;
open $fh, '<', 'file1.csv';
my $out = 'file2.csv';
open(FILE, '>', $out);
my $myline = "";
while (my $line = <$fh>) {
chomp $line;
unless ($line =~ m/^SID/) {
$line =~ m/^(.*)$/;
$myline = "\|$1";
}
print FILE $myline . "\n";
}
close $fh;
close FILE;
my file example:
SID,bla
foo bar <- my code adds the pipe to the beginning of this line
output should be like this:
SID,bla
| foo bar
but in my case I only print $myline, I know:
| foo bar
The line
$line =~ m/^(.*)$/
is misguided: all it does is put the contents of $line into $1, so the following statement
$myline = "\|$1"
may as well be
$myline = "|$line"
(The pipe | doesn't need escaping unless it is part of a regular expression.)
Since you are printing $myline at the end of your loop you are never seeing the contents of unmodified lines.
You can fix that by printing $line or $myline according to which one contains the required output, like this
while (my $line = <$fh>) {
chomp $line;
if ($line =~ m/^SID/) {
print "$line\n";
}
else {
my $myline = "|$line";
print "$myline\n";
}
}
or, much more simply, by dropping the intermediate variable and using the default $_ for the input lines, like this
while (<$fh>) {
print '|' unless /^SID/;
print;
}
Note that I have also removed the chomp as it just means you have to put the newline back on the end of the string when you print it.
Instead of creating a new variable $myline, use the one you already have:
while (my $line =<$fh>) {
$line = '|' . $line if $line !~ /^SID/;
print FILE $line;
}
Also, you can use lexical filehandle for the output file as well. Moreover, you should check the return value of open:
open my $OUT, '>', $out or die $!;

open a file and replace a word using perl

I want to open a file and replace a word from a file.
My code is attached here.
open(my $fh, "<", "pcie_7x_v1_7.v") or die "cannot open <pcie_7x_v1_7.v:$!";
while (my $line = <$fh>) {
if ($line =~ timescale 1 ns) {
print $line $msg = "pattern found \n ";
print "$msg";
$line =~ s/`timescale 1ns/`timescale 1ps/;
}
else {
$msg = "pattern not found \n ";
print "$msg";
}
}
File contains pattern timescale 1ns/1ps.
My requirement is to replace timescale 1ns/1ps to be replaced with timescale 1ps/1ps.
At present else condition occurs always.
Update code after receiving comment:
Hi,
Thanks for the quick solution.
I changed the code accordingly, but the result was not successful.
I have attached the updated code here.
Please suggest me if I missed anything here.
use strict;
use warnings;
open(my $fh, "<", "pcie_7x_v1_7.v" )
or die "cannot open <pcie_7x_v1_7.v:$!" ;
open( my $fh2, ">", "cie_7x_v1_7.v2")
or die "cannot open <pcie_7x_v1_7.v2:$!" ;
while(my $line = <$fh> )
{
print $line ;
if ($_ =~ /timescale\s1ns/ )
{
$msg = "pattern found \n " ;
print "$msg" ;
$_ =~ s/`timescale 1ns/`timescale 1ps/g ;
}
else
{
$msg = "pattern not found \n " ;
print "$msg" ;
}
print $fh2 $line ;
}
close($fh) ;
close($fh2) ;
Result:
pattern not found
pattern not found
pattern not found
pattern not found
Regards,
Binu
3rd update:
// File : pcie_7x_v1_7.v
// Version : 1.7
//
// Description: 7-series solution wrapper : Endpoint for PCI Express
//
//--------------------------------------------------------------------------------
//`timescale 1ps/1ps
`timescale 1ns/1ps
(* CORE_GENERATION_INFO = "pcie_7x_v1_7,pcie_7x_v1_7,
You can use a perl oneliner from a command line. No need to write a script.
perl -p -i -e "s/`timescale\s1ns/`timescale 1ps/g" pcie_7x_v1_7.v
-
However,
If you still want to use the script, you are almost there. You just need to fix a couple errors
print $line; #missing
if ($line =~ /timescale\s1ns/) #made it a real regex, this should match now
$line =~ s/`timescale 1ns/`timescale 1ps/g ; #added g to match all occurences in line
after the if-else you must print the line to a file again
for example, open a new file for writing (let's call it 'pcie_7x_v1_7.v.2') at the beginning of your script
open(my $fh2, ">", "pcie_7x_v1_7.v.2" ) or die "cannot open <pcie_7x_v1_7.v.2:$!" ;
then , after the else block just print the line (whether it's changed or not) to the file
print $fh2 $line;
Don't forget to close the filehandles when you're done
close($fh);
close($fh2);
EDIT:
Your main problem was that you used $_ for the check, while you had assigned the line to $line. So you did print $line, but then if ($_ =~ /timescale/. That would never work.
I'm copy pasting your script and made a couple corrections and formatted it a little more dense to better fit in the website. I also removed the if match check as suggested by TLP and directly did the substitution in the if. It has exactly the same result. This works:
use strict;
use warnings;
open(my $fh, "<", "pcie_7x_v1_7.v" )
or die "cannot open <pcie_7x_v1_7.v:$!" ;
open( my $fh2, ">", "pcie_7x_v1_7.v2")
or die "cannot open >pcie_7x_v1_7.v2:$!" ;
while(my $line = <$fh> ) {
print $line;
if ($line =~ s|`timescale 1ns/1ps|`timescale 1ps/1ns|g) {
print "pattern found and replaced\n ";
}
else {
print "pattern not found \n ";
}
print $fh2 $line ;
}
close($fh);
close($fh2);
#now it's finished, just overwrite the old file with the new file
rename "pcie_7x_v1_7.v2", "pcie_7x_v1_7.v";

Why I am not getting "success" with this program?

I have written the following program with the hope of getting success. But I could never get it.
my $fileName = 'myfile.txt';
print $fileName,"\n";
if (open MYFILE, "<", $fileName) {
my $Data;
{
local $/ = undef;
$Data = <MYFILE>;
}
my #values = split('\n', $Data);
chomp(#values);
if($values[2] eq '9999999999') {
print "Success"."\n";
}
}
The content of myfile.txt is
160002
something
9999999999
700021
Try splitting by \s*[\r\n]+
my $fileName = 'myfile.txt';
print $fileName,"\n";
if (open MYFILE, "<", $fileName) {
my $Data;
{
local $/ = undef;
$Data = <MYFILE>;
}
my #values = split(/\s*[\r\n]+/, $Data);
if($values[2] eq '9999999999') {
print "Success";
}
}
If myfile.txt contain carriage return (CR, \r), it will not work as expected.
Another possible cause is trailing spaces before linefeed (LF, \n).
You don't need to read an entire file into an array to check one line. Open the file, skip the lines you don't care about, then play with the line you do care about. When you've done what you need to do, stop reading the file. This way, only one line is ever in memory:
my $fileName = 'myfile.txt';
open MYFILE, "<", $fileName or die "$filename: $!";
while( <MYFILE> ) {
next if $. < 3; # $. is the line number
last if $. > 3;
chomp;
print "Success\n" if $_ eq '9999999999';
}
close MYFILE;
my $fileName = 'myfile.txt';
open MYFILE, "<", $fileName || die "$fileName: $!";
while( $rec = <MYFILE> ) {
for ($rec) { chomp; s/\r//; s/^\s+//; s/\s+$//; } #Remove line-feed and space characters
$cnt++;
if ( $rec =~ /^9+$/ ) { print "Success\n"; last; } #if record matches "9"s only
#print "Success" and live the loop
}
close MYFILE;
#Or you can write: if ($cnt==3 and $rec =~ /^9{10}$/) { print "Success\n"; last; }
#If record 3 matches ten "9"s print "Success" and live the loop.