Match string in file and replace with another string - Perl

I have a file containing lines as follows
#comments abc
#comments xyz
SerialPort=100
Baudrate=9600
Parity=2
Databits=8
Stopbits=1
I also have an array @in = ( SerialPort=500, Baudrate=300, parity=0, Databits=16, Stopbits=0 ), whose elements are read from a browser. I am trying to write a Perl script that matches "SerialPort" in the file and replaces SerialPort=100 with SerialPort=500 from the array, and I want to match all the other elements in a loop. I tried the code below, but it is not working; please help me improve it. I think the regular expression is not matching, so the if condition for the substitution evaluates to false each time, and the file also contains duplicates after the script runs.
#!/usr/bin/perl
$old_file = "/home/work/conf";
open (fd_old, "<", $old_file) || die "cant open file";
@read_file = <fd_old>;
close (fd_old);
@temp = ();
$flag = 0;
foreach $infile ( @read_file )
{
    foreach $rr ( @in )
    {
        ($key, $value ) = split(/=/, $rr );
        if ( $infile =~ s/\b$key\b(.*)/$rr/ )
        {
            push ( @temp , $infile );
            $flag = 0;
        }
        else
        {
            $flag = 1;
        }
    }
    if ( $flag )
    {
        push (@temp, $infile );
    }
}
open ( fd, ">", $old_file ) || die "can't open";
print fd @temp;
close(fd);

Perl 101: use strict; use warnings;.
Prefix variable names with $.
$old_file is undef when you try to open it.
And spell falg correctly, which if you'd turned on those options, you'd have been told about.
Also: When asking questions on SO, it's helpful if you point out what's not working.

@Maruti: Never write a Perl program without use strict; and use warnings;. I have modified your code. Just have a look.
Code:
#!/usr/bin/perl
use strict;
use warnings;

my $old_file = "/home/work/conf";
open (my $fh, "<", $old_file) || die "can't open file";
my @read_file = <$fh>;
close ($fh);

my @temp = ();
my @in = ('SerialPort=500', 'Baudrate=300', 'parity=0', 'Databits=16', 'Stopbits=0');

foreach my $infile ( @read_file )
{
    foreach my $rr ( @in )
    {
        my ($key, $value) = split(/=/, $rr );
        # only touch non-comment lines that contain "key=number"
        if ( $infile =~ m/\b$key\b=\d+/i && $infile !~ /^#/ )
        {
            $infile =~ s/\b$key\b=\d+/$rr/ig;
        }
    }
    push (@temp, $infile );
}

open (my $out, ">", $old_file ) || die "can't open";
foreach my $res (@temp)
{
    print $out $res;
}
close($out);
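For comparison only (this is not from the answers above), here is a minimal single-pass sketch under the same assumptions: the config file uses key=value lines, # starts a comment, and @in holds the values from the browser (the values below are just placeholders). Turning @in into a hash avoids the nested loop and the flag, and it cannot produce duplicate lines:

#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical values; in the real script @in comes from the browser.
my @in = ('SerialPort=500', 'Baudrate=300', 'Parity=0', 'Databits=16', 'Stopbits=0');
my $old_file = "/home/work/conf";

# Build a key => replacement-line lookup from @in.
my %new_value = map { (split /=/, $_, 2)[0] => $_ } @in;

open my $read_fh, '<', $old_file or die "can't open $old_file: $!";
my @lines = <$read_fh>;
close $read_fh;

for my $line (@lines) {
    next if $line =~ /^\s*#/;                  # leave comment lines alone
    my ($key) = $line =~ /^\s*(\w+)\s*=/;      # key on this line, if any
    next unless defined $key && exists $new_value{$key};
    $line = $new_value{$key} . "\n";           # replace the whole line
}

open my $write_fh, '>', $old_file or die "can't write $old_file: $!";
print {$write_fh} @lines;
close $write_fh;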

Related

How to filter columns from CSV file based on names of columns

I am using CSV data like the example below. I don't want the user and timestamp columns from the CSV file. I may add or delete a few columns later.
I didn't find any suitable method in Text::CSV.
Please let me know if any method or module is available.
UniqueId, Name, description, user,timestamp
1,jana,testing,janardar,12-10-2018:00:
sub _filter_common_columns_from_csv {
    my $csvfile = shift;
    my $CSV = Text::CSV_XS->new(
        {
            binary       => 1,
            auto_diag    => 3,
            allow_quotes => 0,
            eol          => $/
        });
    my $_columns ||= do {
        open(my $fh, '<', $csvfile) or die $!;
        my @cols = @{ $CSV->getline($fh) };
        close $fh or die $!;
        for (@cols) { s/^\s+//; s/\s+$//; }
        \@cols;
    };
    my @columns = @{ $_columns };
    my %deleted;
    my @regexes = qw(user timestamp);
    foreach my $regex (@regexes) {
        foreach my $i (0 .. ($#columns - 1)) {
            my $col = $columns[$i];
            $deleted{$i} = $col if $col =~ /$regex/;
        }
    }
    my @wanted_columns = grep { !$deleted{$_} } 0 .. $#columns - 1;
    my $input_temp = "$ENV{HOME}/output/temp_test.csv";
    open my $tem, ">", $input_temp or die "$input_temp: $!";
    open(my $fh, '<', $csvfile) or die $!;
    while (my $row = $CSV->getline($fh)) {
        my @fields = @$row;
        $CSV->print($tem, [ @fields[@wanted_columns] ]) or $CSV->error_diag;
    }
    close $fh or die $!;
    close $tem or die $!;
    return $input_temp;
}
See getline_hr
use warnings;
use strict;
use feature 'say';
use List::MoreUtils qw(any);
use Text::CSV;

my $file = shift @ARGV || die "Usage: $0 filename\n";
my @exclude_cols = qw(user timestamp);

my $csv = Text::CSV->new ( { binary => 1 } )
    or die "Cannot use CSV: " . Text::CSV->error_diag ();

open my $fh, '<', $file or die "Can't open $file: $!";

my @cols = @{ $csv->getline($fh) };
my @wanted_cols = grep {
    my $name = $_;
    not any { $name eq $_ } @exclude_cols;
} @cols;

my $row = {};
$csv->bind_columns (\@{$row}{@cols});

while ($csv->getline($fh)) {
    my @wanted_fields = @$row{ @wanted_cols };
    say "@wanted_fields";
}
The syntax @$row{@wanted_cols} is for a hash slice, which returns a list of values for the keys in @wanted_cols from the hashref $row.
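As a purely illustrative example (the %person data below is made up), a hash slice pulls several values out of a hash, or a hashref, in one expression:

use strict;
use warnings;

my %person = ( name => 'jana', user => 'janardar', role => 'testing' );
my @keys   = qw(name role);

# Hash slice on a plain hash: @person{...}
my @values = @person{@keys};          # ('jana', 'testing')

# Same thing through a hashref, as in the answer above: @$ref{...}
my $ref        = \%person;
my @ref_values = @$ref{@keys};        # ('jana', 'testing')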
An actual example using Text::AutoCSV to remove the given named columns from arbitrary CSV files, as in your posted code (more complicated than the examples in the documentation, which only write specific columns):
#!/usr/bin/perl
use warnings;
use strict;
use Text::AutoCSV qw/remove_accents/;

sub remove_columns {
    my ($infile, $outfile, $drop) = @_;
    my $csv = Text::AutoCSV->new(in_file => $infile, out_file => $outfile);
    # Normalize column names the same way that Text::AutoCSV does
    my %drops = map { my $h = remove_accents $_;
                      $h =~ s/[^[:alnum:]_]//gi;
                      $h = uc $h;
                      $h => 1 } @$drop;
    my @cols = grep { not exists $drops{$_} } $csv->get_fields_names;
    # Hack to avoid reading the file twice.
    $csv->{out_fields} = \@cols;
    $csv->write();
}

remove_columns "in.csv", "out.csv", [ "user", "timestamp" ];
If you want to modify your CSV in other ways, too, and if SQL would be convenient for those modifications, then consider using DBD::CSV.
You can then open a database handle on your CSV file, select the desired columns with a SELECT query, and write the results with Text::CSV or Text::CSV_XS.
For more details, see the DBD::CSV documentation or e.g. this simple wrapper script for querying CSV files.
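A rough sketch of that DBD::CSV approach, untested and using the file and column names from the question (data.csv, UniqueId, Name, description) purely as assumptions:

use strict;
use warnings;
use DBI;
use Text::CSV_XS;

# Assumes the CSV from the question lives in ./data.csv
my $dbh = DBI->connect('dbi:CSV:', undef, undef, {
    f_dir      => '.',
    f_ext      => '.csv/r',
    RaiseError => 1,
}) or die $DBI::errstr;

# Keep only the wanted columns; user and timestamp are simply not selected.
my $sth = $dbh->prepare('SELECT UniqueId, Name, description FROM data');
$sth->execute;

my $csv = Text::CSV_XS->new({ binary => 1, eol => "\n" });
open my $out, '>', 'filtered.csv' or die $!;
$csv->print($out, [qw(UniqueId Name description)]);    # header row
while (my $row = $sth->fetchrow_arrayref) {
    $csv->print($out, $row);
}
close $out;
$dbh->disconnect;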

Duplicate values in column

I have an original file with the following lines:
02-May-2018,AAPL,Sell,0.25,1000
02-May-2018,C,Sell,0.25,2000
02-May-2018,JPM,Sell,0.25,3000
02-May-2018,WFC,Sell,0.25,5000
02-May-2018,AAPL,Sell,0.25,7000
02-May-2018,GOOG,Sell,0.25,8000
02-May-2018,GOOG,Sell,0.25,9000
02-May-2018,C,Sell,0.25,2000
02-May-2018,AAPL,Sell,0.25,3000
I am trying to print the original line whenever the value in the second column appears more than 2 times. For example, if I see AAPL more than 2 times, the desired result should be:
02-May-2018,AAPL,Sell,0.25,1000
02-May-2018,AAPL,Sell,0.25,7000
02-May-2018,AAPL,Sell,0.25,3000
So far, I have written the following, which wrongly prints results multiple times. Can you please help with what I am doing wrong?
open (FILE, "<$TMPFILE") or die "Could not open $TMPFILE";
open (OUT, ">$TMPFILE1") or die "Could not open $TMPFILE1";
%count = ();
@symbol = ();
while ($line = <FILE>)
{
    chomp $line;
    (@data) = split(/,/,$line);
    $count{$data[1]}++;
    @keys = sort {$count{$a} cmp $count{$b}} keys %count;
    for my $key (@keys)
    {
        if ( $count{$key} > 2 )
        {
            print "$line\n";
        }
    }
}
I'd do it something like this - store lines you've seen in a 'buffer' and print them out again if the condition is hit (before continuing to print as you go):
#!/usr/bin/env perl
use strict;
use warnings;

my %buffer;
my %count_of;

while ( my $line = <> ) {
    my ( $date, $ticker, @values ) = split /,/, $line;
    #increment the count
    $count_of{$ticker}++;
    if ( $count_of{$ticker} < 3 ) {
        #count limit not hit, so stash the current line in the buffer.
        $buffer{$ticker} .= $line;
        next;
    }
    #print the buffer if the count has been hit
    if ( $count_of{$ticker} == 3 ) {
        print $buffer{$ticker};
    }
    #only gets to here once the limit is hit, so just print normally.
    print $line;
}
With your input data, this outputs:
02-May-2018,AAPL,Sell,0.25,1000
02-May-2018,AAPL,Sell,0.25,7000
02-May-2018,AAPL,Sell,0.25,3000
Simple answer:
push @{ $lines{(split",")[1]} }, $_ while <>;
print @{ $lines{$_} } for grep @{ $lines{$_} } > 2, sort keys %lines;
perl program.pl inputfile > outputfile
You need to read the input file twice, because you don't know the final counts until you reach the end of the file.
use strict;
use warnings 'all';

my ($TMPFILE, $TMPFILE1) = qw/ infile outfile /;

my %counts;

{
    open my $fh, '<', $TMPFILE or die "Could not open $TMPFILE: $!";
    while ( <$fh> ) {
        my @fields = split /,/;
        ++$counts{$fields[1]};
    }
}

open my $fh, '<', $TMPFILE or die "Could not open $TMPFILE: $!";
open my $out_fh, '>', $TMPFILE1 or die "Could not open $TMPFILE1: $!";

while ( <$fh> ) {
    my @fields = split /,/;
    print $out_fh $_ if $counts{$fields[1]} > 2;
}
output
02-May-2018,AAPL,Sell,0.25,1000
02-May-2018,AAPL,Sell,0.25,7000
02-May-2018,AAPL,Sell,0.25,3000
This should work:
use strict;
use warnings;

open (FILE, "<$TMPFILE") or die "Could not open $TMPFILE";
open (OUT, ">$TMPFILE1") or die "Could not open $TMPFILE1";

my %data;

while ( my $line = <FILE> ) {
    chomp $line;
    my @line = split /,/, $line;
    push(@{$data{$line[1]}}, $line);
}

foreach my $key (keys %data) {
    if (@{$data{$key}} > 2) {
        print "$_\n" foreach @{$data{$key}};
    }
}

Merge two files based on the starting of the line

I want to merge two files into one using Perl. Below are the sample files.
***FILE 1***
XDC123
XDC456
XDC678
BB987
BB654
*** FILE 2 ***
XDC876
XDC234
XDC789
BB456
BB678
And I want the merged file to look like:
***MERGED FILE***
XDC123
XDC456
XDC678
XDC876
XDC234
XDC789
BB987
BB654
BB456
BB678
For the above functionality I have written the Perl script snippet below:
#!/usr/bin/env perl;
use strict;
use warnings;

my $file1 = 'C:/File1';
my $file2 = 'C:/File2';
my $file3 = 'C:/File3';

open( FILEONE, '<$file1' );
open( FILETWO, '<$file2' );
open( FILETHREE, '>$file3' );

while (<FILEONE>) {
    if (/^XDC/) {
        print FILETHREE;
    }
    if (/^BB/) {
        last;
    }
}
while (<FILETWO>) {
    if (/^XDC/) {
        print FILETHREE;
    }
    if (/^BB/) {
        last;
    }
}
while (<FILEONE>) {
    if (/^BB/) {
        print FILETHREE;
    }
}
while (<FILETWO>) {
    if (/^BB/) {
        print FILETHREE;
    }
}
close($file1);
close($file2);
close($file3);
But the merged file that is generated from the above code looks like:
***FILE 3***
XDC123
XDC456
XDC678
XDC876
XDC234
XDC789
BB654
BB678
The first line that starts with BB is missing from both files. Any help on this will be appreciated. Thank you.
The problem is that you iterate each file to the end, but never 'rewind' it if you want to start over.
So your while ( <FILEONE> ) { line consumes (and discards) the first line that matches m/^BB/ - last exits the while loop, but only after it has already read the line.
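As an aside (not from the original answer), if you really did want to re-read the same handle, you can rewind it with seek instead of reopening the file. A small sketch, assuming File1 is any readable text file:

use strict;
use warnings;

open( my $fh, '<', 'File1' ) or die $!;

while (<$fh>) { last if /^BB/; }   # first pass stops at (and consumes) the first BB line

seek( $fh, 0, 0 );                 # rewind to the beginning of the file
while (<$fh>) {
    print if /^BB/;                # second pass sees every BB line, including the consumed one
}
close($fh);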
However, that is assuming you get your open statements right, because:
open( FILEONE, '>$file1' );
actually empties the file, rather than reading from it. So I am assuming you've transposed your code and introduced new errors whilst doing so.
As a style point, you should really use three-argument open with lexical filehandles.
So instead:
#!/usr/bin/env perl
use strict;
use warnings;

my $file1 = 'C:/File1';
my $file2 = 'C:/File2';
my $file3 = 'C:/File3';

my @lines;
foreach my $file ( $file1, $file2 ) {
    open( my $input, '<', $file ) or die $!;
    push( @lines, <$input> );
    close($input);
}

open( my $output, '>', $file3 ) or die $!;
print {$output} sort @lines;
close($output);
(Although as noted in the comments - if that's all you want to do, the unix sort utility is probably sufficient).
However, if you need to preserve the numeric ordering whilst sorting on the alphabetic prefix, you need a slightly different data structure:
#!/usr/bin/env perl
use strict;
use warnings;

my $file1 = 'C:/File1';
my $file2 = 'C:/File2';
my $file3 = 'C:/File3';

my %lines;
foreach my $file ( $file1, $file2 ) {
    open( my $input, '<', $file ) or die $!;
    while ( my $line = <$input> ) {
        my ( $key ) = $line =~ m/^(\D+)/;
        push @{ $lines{$key} }, $line;
    }
    close($input);
}

open( my $output, '>', $file3 ) or die $!;
foreach my $key ( sort keys %lines ) {
    print {$output} @{ $lines{$key} };
}
close($output);

How do I pattern match and keep writing to a new file until another pattern matches

My goal is to find all the lines in a "big.v" file, starting from a "module" pattern match until "endmodule", and print each block to an individual file.
big.v: module test;
<bunch of code>
endmodule
module foo;
<bunch of code>
endmodule
And the individual files would look like:
test.v : module test;
..
endmodule
foo.v: module test1;
..
endmodule
I got most of it working using:
use strict;
use warnings;

#open(my $fh, ">", $f1) || die "Couldn't open '".$f."' for writing because: ".$!;
while (<>) {
    my $line = $_;
    if ($line =~ /(module)(\s+)(\w+)(.*)/) {
        my $modname = $3;
        open(my $fh1, ">", $modname.".v") ;
        print $fh1 $line."\n";
        ## how do i keep writing next lines to this file until following pattern
        if ($line =~ /(endmodule)(\s+)(.*)/) { close $fh1; }
    }
}
Thanks,
There's a useful Perl construct called the 'range operator':
http://perldoc.perl.org/perlop.html#Range-Operators
It works like this:
while ( <$file> ) {
    if ( m/startpattern/ .. m/endpattern/ ) {
        print;
    }
}
So given your example - I think this should do the trick:
my $output;
while ( my $line = <STDIN> ) {
    if ( $line =~ m/module/ .. $line =~ m/endmodule/ ) {
        my ( $modname ) = ( $line =~ m/module\s+(\w+)/ );
        if ( defined $modname ) {
            open ( $output, ">", "$modname.v" ) or warn $!;
        }
        print {$output} $line;
    }
}
Edit: But given your source data, I don't think you actually need a range operator. You could just close and reopen new 'output' files as you go. This assumes that you can 'cut up' your file based on 'module' lines, which isn't necessarily a valid assumption.
But sort of more like this:
use strict;
use warnings;

open ( my $input, "<", "big.v" ) or die $!;

my $output;
while ( my $line = <$input> ) {
    if ( $line =~ m/^\s*module/ ) {
        #start of module line found
        #close filehandle if it's open
        close($output) if defined $output;
        #extract the module name from the line.
        my ($modulename) = ( $line =~ m/module\s+(\w+)/ );
        #open new output file (overwriting)
        open( $output, ">", "$modulename.v" ) or warn $!;
    }
    #this test might not be necessary.
    if ( defined $output ) {
        print {$output} $line;
    }
}

Extracting specific multi-line pipe-delimited records in Perl

I have a file that looks like
NAME|JOHN|TOKYO|JPN
AGE|32|M
INFO|SINGLE|PROFESSIONAL|IT
NAME|MARK|MANILA|PH
AGE|37|M
INFO|MARRIED|PROFESSIONAL|BPO
NAME|SAMANTHA|SYDNEY|AUS
AGE|37|F
INFO|MARRIED|PROFESSIONAL|OFFSHORE
NAME|LUKE|TOKYO|JPN
AGE|27|M
INFO|SINGLE|PROFESSIONAL|IT
I want to separate the records by country. I have stored each line in the array variable @fields
my @fields = split(/\|/, $_ );
making $fields[3] my basis for sorting it. I want it separated into 2 output text files.
OUTPUT TEXT FILE 1:
NAME|JOHN|TOKYO|JPN
AGE|32|M
INFO|SINGLE|PROFESSIONAL|IT
NAME|LUKE|TOKYO|JPN
AGE|27|M
INFO|SINGLE|PROFESSIONAL|IT
OUTPUT TEXT FILE 2
NAME|MARK|MANILA|PH
AGE|37|M
INFO|MARRIED|PROFESSIONAL|BPO
NAME|SAMANTHA|SYDNEY|AUS
AGE|37|F
INFO|MARRIED|PROFESSIONAL|OFFSHORE
Everything that is from JPN goes to output text file 1, and every non-JPN country goes to output text file 2.
Here's the code I am trying to get working:
use strict;
use warnings;
use Data::Dumper;
use Carp qw(croak);

my @fields;
my $tmp_var;
my $count;
my ($line, $i);
my $filename = 'data.txt';

open(my $input_fh, '<', $filename ) or croak "Can't open $filename: $!";
open(OUTPUTA, ">", 'JPN.txt') or die "wsl_reformat.pl: could not open $ARGV[0]";
open(OUTPUTB, ">", 'Non-JPN.txt') or die "wsl_reformat.pl: could not open $ARGV[0]";

my $fh;
while (<$input_fh>) {
    chomp;
    my @fields = split /\|/;
    if ($fields[0] eq 'NAME') {
        for ($i=1; $i < @fields; $i++) {
            if ($fields[3] eq 'JPN') {
                $fh = $_;
                print OUTPUTA $fh;
            }
            else {
                $fh = $_;
                print OUTPUTB $fh;
            }
        }
    }
}
close(OUTPUTA);
close(OUTPUTB);

Still no luck with it :(
Here is the way I think ikegami was suggesting; I've never tried this before, although it gave the correct results.
#!/usr/bin/perl
use strict;
use warnings;

open my $jpn_fh,   ">", 'o33.txt' or die $!;
open my $other_fh, ">", 'o44.txt' or die $!;

my $fh;
while (<DATA>) {
    if (/^NAME/) {
        if (/JPN$/) {
            $fh = $jpn_fh;
        }
        else {
            $fh = $other_fh;
        }
    }
    print $fh $_;
}

close $jpn_fh   or die $!;
close $other_fh or die $!;
__DATA__
NAME|JOHN|TOKYO|JPN
AGE|32|M
INFO|SINGLE|PROFESSIONAL|IT
NAME|MARK|MANILA|PH
AGE|37|M
INFO|MARRIED|PROFESSIONAL|BPO
NAME|SAMANTHA|SYDNEY|AUS
AGE|37|F
INFO|MARRIED|PROFESSIONAL|OFFSHORE
NAME|LUKE|TOKYO|JPN
AGE|27|M
INFO|SINGLE|PROFESSIONAL|IT
You didn't say what you needed help with, so I'm assuming it's coming up with an algorithm. Here's a good one:
Open the file to read.
Open the file for the JPN entries.
Open the file for the non-JPN entries.
While not eof,
    Read a line.
    Parse the line.
    If it's the first line of a record,
        If the person's country is JPN,
            Set the current file handle to the file handle for JPN entries.
        Else,
            Set the current file handle to the file handle for non-JPN entries.
    Print the line to the current file handle.
use feature qw( say );   # needed for say below

my $jpn_qfn   = '...';
my $other_qfn = '...';

open(my $jpn_fh, '>', $jpn_qfn)
    or die("Can't create $jpn_qfn: $!\n");

open(my $other_fh, '>', $other_qfn)
    or die("Can't create $other_qfn: $!\n");

my $fh;
while (<>) {
    chomp;
    my @fields = split /\|/;
    if ($fields[0] eq 'NAME') {
        $fh = $fields[3] eq 'JPN' ? $jpn_fh : $other_fh;
    }
    say $fh $_;
}
#!/usr/bin/env perl

use 5.012;
use autodie;
use strict;
use warnings;

# store per country output filehandles
my %output;

# since this is just an example, read from the __DATA__ section
while (my $line = <DATA>) {
    # split the fields
    my @cells = split /[|]/, $line;

    # if the first field is NAME, this is a new record
    if ($cells[0] eq 'NAME') {
        # get the country code, strip trailing whitespace
        (my $country = $cells[3]) =~ s/\s+\z//;

        # if we haven't created an output file for this
        # country yet, do so
        unless (defined $output{$country}) {
            open my $fh, '>', "$country.out";
            $output{$country} = $fh;
        }
        my $out = $output{$country};

        # output this and the next two lines to the
        # country-specific output file
        print $out $line, scalar <DATA>, scalar <DATA>;
    }
}

close $_ for values %output;
__DATA__
NAME|JOHN|TOKYO|JPN
AGE|32|M
INFO|SINGLE|PROFESSIONAL|IT
NAME|MARK|MANILA|PH
AGE|37|M
INFO|MARRIED|PROFESSIONAL|BPO
NAME|SAMANTHA|SYDNEY|AUS
AGE|37|F
INFO|MARRIED|PROFESSIONAL|OFFSHORE
NAME|LUKE|TOKYO|JPN
AGE|27|M
INFO|SINGLE|PROFESSIONAL|IT
Thanks heaps for your help.
I was able to solve this problem in Perl.
Many thanks.
#!/usr/local/bin/perl
use strict;
use warnings;
use Data::Dumper;
use Carp qw(croak);

my @fields;
my $tmp_var;
my ($rec_type, $country);
my $filename = 'data.txt';

open (my $input_fh, '<', $filename ) or croak "Can't open $filename: $!";
open my $OUTPUTA, ">", 'o33.txt' or die $!;
open my $OUTPUTB, ">", 'o44.txt' or die $!;

my $Combline;
while (<$input_fh>) {
    $_ = _trim($_);
    @fields = split (/\|/, $_);
    $rec_type = $fields[0];
    $country = $fields[3];
    if ($rec_type eq 'NAME') {
        if ($country eq 'JPN') {
            *Combline = $OUTPUTA;
        }
        else {
            *Combline = $OUTPUTB;
        }
    }
    print Combline;
}

close $OUTPUTA or die $!;
close $OUTPUTB or die $!;

sub _trim {
    my $word = shift;
    if ( $word ) {
        $word =~ s/\s*\|/\|/g;   # remove spaces before the pipe delimiters
        $word =~ s/"//g;         # remove double quotes
    }
    return $word;
}
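As a side note (not part of the posted solution): with lexical filehandles, the typeglob assignment is unnecessary; the same switch can be done with an ordinary scalar holding the current handle. A sketch only, assuming the same $input_fh, $OUTPUTA, $OUTPUTB and _trim from the script above:

my $out_fh;
while (<$input_fh>) {
    $_ = _trim($_);
    my @fields = split /\|/;
    if ($fields[0] eq 'NAME') {
        # choose the handle once per record; the AGE and INFO lines reuse it
        $out_fh = $fields[3] =~ /^JPN/ ? $OUTPUTA : $OUTPUTB;
    }
    print {$out_fh} $_;
}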