1=ABC,2=mnz,3=xyz
1=pqr,3=ijk,2=lmn
I have this in a text file. I want to search for 1= and print only the matching fields, 1=ABC and 1=pqr.
Any suggestions in Perl or Unix?
Input:
$ cat grep.in
1=ABC,2=mnz,3=xyz
1=pqr,3=ijk,2=lmn
4=pqr,3=ijk,2=lmn
Command:
$ grep -o '1=[^,]\+' grep.in
1=ABC
1=pqr
Explanations:
You can just use grep on your input.
-o outputs only the part of each line that matches the pattern.
1=[^,]\+ matches strings that start with 1= followed by at least one character that is not a comma (this assumes the value to the right of the = never contains a comma, so the only commas are separators).
If you want to accept an empty value, replace the \+ with *.
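For comparison, the same pattern can be applied from Perl. This is only a minimal sketch, not part of the answer above: the helper name extract_field is made up for illustration, and, like the grep pattern, it assumes values never contain a comma. Unlike the grep pattern, it also requires the key to begin at the start of the line or right after a comma.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return every "KEY=value" field of $line whose key
# is exactly $key. Assumes values never contain a comma.
sub extract_field {
    my ($line, $key) = @_;
    # (?<![^,]) requires start of line or a preceding comma, so that
    # searching for key 1 does not match inside a key such as 11.
    return $line =~ /(?<![^,])(\Q$key\E=[^,\n]+)/g;
}

# Sample lines copied from grep.in above.
my @lines = ("1=ABC,2=mnz,3=xyz", "1=pqr,3=ijk,2=lmn", "4=pqr,3=ijk,2=lmn");
for my $line (@lines) {
    print "$_\n" for extract_field($line, 1);
}
```

Call the helper in list context, since the match returns the list of captured fields.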
It appears that your input data is in CSV format. Here is a Perl solution based on Text::CSV. It will:
parse the CSV content row-wise
print out columns that start with 1=
#!/usr/bin/perl
use warnings;
use strict;
use Text::CSV;
my $csv = Text::CSV->new({
    binary => 1,
    eol    => "\n",
}) or die "CSV\n";
# parse
while (my $row = $csv->getline(\*DATA)) {
    foreach (@{ $row }) {
        print "$_\n" if /^1=/;
    }
}
exit 0;
__DATA__
1=ABC,2=mnz,3=xyz
1=pqr,3=ijk,2=lmn
Test run:
$ perl dummy.pl
1=ABC
1=pqr
Replace DATA with STDIN to read the input from standard input instead.
Related
I am trying to grep for [0] (including the square brackets) in a file using Perl. I tried the following code:
my @output = `grep \"\[0\]\" log `;
But instead of returning only the lines that contain [0], it returns every line that contains 0.
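Your problem is that you need to escape the [ and ] twice. [ ... ] has a special meaning in regexes (it defines a character class), so grep needs to see \[0\]. But backticks process backslash escapes like a double-quoted string, so Perl turns \[ into [ before the shell ever sees the command. Doubling the backslashes leaves one for grep.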
#!/usr/bin/perl
use strict;
use warnings;
my @output = `grep "\\[0\\]" log `;
print for @output;
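To see what the shell actually receives, the two escapings can be compared as ordinary double-quoted strings, which follow the same backslash rules as backticks. This is a small sketch for illustration only; the variable names are made up, and nothing is executed by the shell here.

```perl
use strict;
use warnings;

# Backticks process backslash escapes like a double-quoted string,
# so one layer of backslashes is consumed before the shell runs.
my $single = "grep \"\[0\]\" log";     # what the question's command becomes
my $double = "grep \"\\[0\\]\" log";   # what the corrected command becomes

print "$single\n";   # grep "[0]" log    (character class that matches 0)
print "$double\n";   # grep "\[0\]" log  (literal [0])
```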
But you really don't need to use the external grep command. Perl is great at text processing.
#!/usr/bin/perl
use strict;
use warnings;
while (<>) {
    print if /\[0\]/;
}
My solution reads from any file whose name is given as an argument to the program (or from STDIN).
How to read the text file using perl command line arguments and print the third column using perl?
I'm stuck on taking input from the command line and printing the required column. Please help me choose the right way to get the expected output.
Code which I wrote to take command line input:(map.pl)
use strict;
use warnings 'all';
use Getopt::Long 'GetOptions';
my @files=GetOptions(
    'p_file=s' => \my $p_file,
);
print $p_file ? "p_file = $p_file\n" : "p_file\n";
Output I got for above code:
perl map.pl -p_file cat.txt
p_file = cat.txt
cat.txt:(Input file)
ADG:YUF:TGH
UIY:POG:YTH
GHJUR:"HJKL:GHKIO
Expected output:
TGH
YTH
GHKIO
Perl can automatically read files whose names are given as command line arguments. The command below should produce your expected output:
perl -F: -le 'print $F[2]' cat.txt
-F: sets the field separator to : and, on Perl 5.20 and later, implicitly turns on autosplit mode (-a) and the loop over input lines (-n); on older versions, write perl -F: -lane instead. -l handles line endings on input and output. The code after the -e flag, print $F[2], prints the third field and is executed for each line of the file. Find out more by reading perldoc perlrun.
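Written out as a script, the one-liner corresponds roughly to the code below. This is a sketch, not the answerer's code: the sub name third_field is hypothetical, and the sample lines are copied from cat.txt so that the snippet runs on its own. In a real script you would loop with while (my $line = <>) instead.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return the third colon-separated field of a line,
# or undef if the line has fewer than three fields.
sub third_field {
    my ($line) = @_;
    chomp $line;                   # roughly what -l does on input
    my @F = split /:/, $line;      # roughly what -F: with autosplit does
    return $F[2];
}

# Sample lines standing in for the contents of cat.txt.
my @sample = ("ADG:YUF:TGH\n", "UIY:POG:YTH\n", "GHJUR:\"HJKL:GHKIO\n");
for my $line (@sample) {
    my $field = third_field($line);
    print "$field\n" if defined $field;
}
```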
You'd need to read the file, split each line to get the columns, and print the required column. Here's a demo code snippet that uses the perl -s switch to parse command line arguments. Run it like this: ./map.pl -p_file=cat.txt
#!/usr/bin/perl -s
use strict;
use warnings;
use vars qw[$p_file];
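die("You need to pass a filename as argument") unless(defined($p_file));
die("Filename ($p_file) does not exist") unless(-f $p_file);
print "Proceeding to read file : $p_file\n\n";
open(my $fh,'<',$p_file) or die($!);
# Test the read itself rather than chomp's return value, so that a final
# line without a trailing newline is still processed.
while (my $line = <$fh>) {
    chomp($line);
    next unless length($line);    # skip empty lines
    my @cols = split(/:/,$line);
    print $cols[-1],"\n";         # last column, which is the third one in this input
}
close($fh);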
my @up = `cat abc.txt|head -2|tail -1|cut -d' ' -f1-3`;
Instead of storing the individual fields in the array, it stores the entire output as a single string in the first element.
This is the output I am getting:
$up[0] = 'xxx 12 234'
I want this:
@up = ('xxx', 12, 234)
It looks like you want the first three space-delimited fields of the second line of the file abc.txt.
The problem is that backticks return one line of output in each element of the array, and because cut prints all three fields on a single line, they appear as a single array element.
You could split the value again inside Perl, but when you have the whole of the Perl language available, it's wasteful to use the shell to do something so simple, and you should do everything in Perl.
This program will do as you ask. I've used Data::Dump only so that you can verify that the contents of @up are as you wanted.
use strict;
use warnings 'all';
use Data::Dump;
my @up = do {
    open my $fh, '<', 'abc.txt' or die $!;
    <$fh>; # Skip one line
    (split ' ', <$fh>)[0 .. 2];
};
dd \@up;
output
["xxx", 12, 234]
You can either split the result on whitespace:
my @up = split(/\s+/, `cat abc.txt ...`);
Or you can set the input record separator $/ to a space beforehand. This is less flexible: the separator is just a plain string, so two spaces in a row produce an extra field in the middle, and each element keeps its trailing separator unless you chomp it:
local $/ = " ";
my @up = `cat abc.txt ...`;
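The difference between whitespace-aware and single-space splitting is easy to check inside Perl. The sketch below uses a made-up sample line with a double space in it; it does not call the shell at all.

```perl
use strict;
use warnings;

my $line = "xxx  12 234\n";   # note the two spaces after xxx

# split ' ' is the special "awk mode": leading whitespace is skipped and
# runs of whitespace (including the newline) count as one separator.
my @awk = split ' ', $line;           # ('xxx', '12', '234')

# A single-space pattern treats every space as a separator, so the double
# space yields an empty field, and the newline stays on the last field.
my @single = split / /, $line;        # ('xxx', '', '12', "234\n")

print scalar(@awk), " fields vs ", scalar(@single), " fields\n";
```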
I want to count the number of columns in a row for a CSV file.
row 1 10 columns
row 2 11 columns
etc.
I can print out the value of the last column, but I really just want a count per row.
perl -F, -lane "{print @keys[$_].$F[$_] foreach(-1)}" < testing.csv
I am on a Windows machine
Thanks.
If you have a proper csv file, it can contain embedded delimiters (e.g. 1,"foo,bar",2), in which case a simple split will not be enough. You can use the Text::CSV module fairly easily with a one-liner like this:
Copy/paste version:
perl -MText::CSV -lwe"my $c=Text::CSV->new({sep_char=>','}); while($r=$c->getline(*STDIN)) { print scalar @$r }" < sorted.csv
Readable version:
perl -MText::CSV                              # use Text::CSV module
    -lwe                                      # add newline to print, use warnings
    "my $c = Text::CSV->new();                # set up csv object
    while( $r = $c->getline(*STDIN) ) {       # get lines from stdin
        print scalar @$r                      # print row size
    }" < sorted.csv                           # input file to stdin
If your input can be erratic, Text::CSV->getline might choke on corrupted lines (the while loop is ended), in which case it may be safer to use plain parsing:
perl -MText::CSV -nlwe"
BEGIN { $r = Text::CSV->new() };
$r->parse($_);
print scalar $r->fields
" comma.csv
Note that in this case we use a different input method. This is because while getline() requires a file handle, parse() does not. Since the diamond operator uses either ARGV or STDIN depending on your argument, I find it is better to be explicit.
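To illustrate why a plain split miscounts rows with embedded delimiters, here is a small sketch. As an assumption made only so the snippet runs without installing anything, it uses the core module Text::ParseWords as a stand-in for Text::CSV; it understands simple quoting but is not a full CSV parser, so for real data Text::CSV remains the better choice.

```perl
use strict;
use warnings;
use Text::ParseWords qw(parse_line);   # core module, used only for this demo

my $row = '1,"foo,bar",2';             # example row from the answer above

# Naive split counts the comma inside the quotes as a separator.
my @naive = split /,/, $row;               # 4 pieces

# A quote-aware parser keeps "foo,bar" together (0 = drop the quotes).
my @aware = parse_line(',', 0, $row);      # 3 fields

printf "naive: %d, quote-aware: %d\n", scalar @naive, scalar @aware;
```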
If you don't have commas as part of the fields, you can split the line and count the number of fields:
#! /usr/bin/perl
use strict;
use warnings;
my @cols = split(',', $_);
my $n = @cols;
print "row $. $n columns\n";
You can call it like this:
perl -n script.pl testing.csv
I have a series of strings and their replacements separated by spaces:
a123 b312
c345 d453
I'd like to replace those strings in the left column with those in the right column, and undo the replacements later on. For the first part I could construct a sed command s/.../...;s/.../... but that doesn't consider reversing, and it requires me to significantly alter the input, which takes time. Is there a convenient way to do this?
Listed some example programs, could be anything free for win/lin.
Text editors provide "undo" functionality, but command-line utilities don't. You can write a script to do the replacement, then reverse the replacements file to do the same thing in reverse.
Here's a script that takes a series of replacements in 'replacements.txt' and runs them against the script's input:
#!/usr/bin/perl -w
use strict;
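# Read the replacement pairs, one "from to" pair per line.
open my $repl, '<', 'replacements.txt' or die "Cannot open replacements.txt: $!";
my @replacements;
while (<$repl>) {
    chomp;
    push @replacements, [ split ];
}
close $repl;
# Apply every replacement to each line of input.
while (<>) {
    for my $r (@replacements) { s/$r->[0]/$r->[1]/g }
    print;
}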
If you save this file as 'repl.pl', and you save your file above as 'replacements.txt', you can use it like this:
perl repl.pl input.txt >output.txt
To convert your replacements file into a 'reverse-replacements.txt' file, you can use a simple awk command:
awk '{ print $2, $1 }' replacements.txt >reverse-replacements.txt
Then just modify the Perl script to use the reverse replacements file instead of the forward one.
use strict;
use warnings;
unless (@ARGV == 3) {
    print "Usage: script.pl <reverse_changes?> <rfile> <input>\n";
    exit;
}
my $reverse_changes = shift;
my $rfile = shift;
open my $fh, "<", $rfile or die $!;
my %reps = map split, <$fh>;
if ($reverse_changes) {
    %reps = reverse %reps;
}
my $rx = join "|", keys %reps;
while (<>) {
    s/\b($rx)\b/$reps{$1}/g;
    print;
}
The word boundary checks \b surrounding the replacements prevent partial matches, e.g. turning a12345 into b31245. You may wish to escape metacharacters when building $rx, if any can be present in your replacements.
Usage:
To perform the replacements:
script.pl 0 replace.txt input.txt > output.txt
To reverse changes:
script.pl 1 replace.txt output.txt > output2.txt
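The note about escaping metacharacters in $rx can be sketched as follows. This is an illustrative variant rather than the script above: the sub name apply_replacements is hypothetical, the pairs are hard-coded instead of read from a file, and sorting the keys longest first is an extra precaution that the original does not take.

```perl
use strict;
use warnings;

# Hypothetical helper: apply every KEY => VALUE replacement in %$reps to
# $text, matching whole words only.
sub apply_replacements {
    my ($reps, $text) = @_;
    # quotemeta escapes regex metacharacters in the keys; sorting longest
    # first makes the alternation try longer keys before shorter ones.
    my $rx = join "|",
             map  { quotemeta($_) }
             sort { length($b) <=> length($a) } keys %$reps;
    $text =~ s/\b($rx)\b/$reps->{$1}/g;
    return $text;
}

my %reps    = (a123 => 'b312', c345 => 'd453');
my %reverse = reverse %reps;            # swap keys and values

my $forward = apply_replacements(\%reps, "a123 c345 a12345");
my $back    = apply_replacements(\%reverse, $forward);

print "$forward\n";   # b312 d453 a12345
print "$back\n";      # a123 c345 a12345
```

Reversing only round-trips cleanly when the right-hand strings are unique and do not already occur in the input.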