How to read a text file given as a command-line argument and print a column using Perl?

How do I read a text file passed as a command-line argument and print its third column in Perl?
I'm stuck on taking input from the command line and printing the required column. Help me choose the right way to reach the expected output.
Code I wrote to take the command-line input (map.pl):
use strict;
use warnings 'all';
use Getopt::Long 'GetOptions';
my @files = GetOptions(
    'p_file=s' => \my $p_file,
);
print $p_file ? "p_file = $p_file\n" : "p_file\n";
Output I got for the above code:
perl map.pl -p_file cat.txt
p_file = cat.txt
cat.txt (input file):
ADG:YUF:TGH
UIY:POG:YTH
GHJUR:"HJKL:GHKIO
Expected output:
TGH
YTH
GHKIO

Perl can automatically read files whose names are provided as command-line arguments. The command below should produce your expected output:
perl -F: -le 'print $F[2]' cat.txt
-F: turns on autosplit mode, sets the field separator to :, and, together with the -a and -n switches it implies on recent perls (on older perls you may need to spell this out as -lane), loops over the lines of the input files. -l handles line endings during input and output. The code given to the -e flag ('print $F[2]', which prints the third field) is executed for each line of the file. Find out more by reading perldoc perlrun.
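If you prefer a full script to the one-liner, here is a sketch of roughly what it does (this version assumes the file name is passed as a plain argument rather than through -p_file):

#!/usr/bin/perl
# Long-hand version of the one-liner: read every line of the files
# named on the command line and print the third ':'-separated field.
use strict;
use warnings;

while (my $line = <>) {      # <> reads the files listed in @ARGV in turn
    chomp $line;
    my @fields = split /:/, $line;
    print "$fields[2]\n";    # third column (index 2)
}

Run it as: perl map.pl cat.txt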

You'd need to read the file, split each line to get the columns, and print the required column. Here's a demo code snippet using the perl -s switch to parse command-line arguments. Run it like this: ./map.pl -p_file=cat.txt
#!/usr/bin/perl -s
use strict;
use warnings;
use vars qw[$p_file];

die("You need to pass a filename as argument") unless defined $p_file;
die("Filename ($p_file) does not exist") unless -f $p_file;

print "Proceeding to read file : $p_file\n\n";

open(my $fh, '<', $p_file) or die($!);
while (my $line = <$fh>) {       # read line by line; chomp inside the loop
    chomp $line;                 # so the loop does not stop early
    next unless $line =~ /\S/;   # skip empty lines
    my @cols = split /:/, $line;
    print $cols[-1], "\n";       # last (here: third) ':'-separated column
}
close($fh);
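To tie this back to the original map.pl, here is a possible sketch that keeps the Getopt::Long interface from the question and prints the third column, assuming the ':'-separated layout of cat.txt. Run it as: perl map.pl -p_file cat.txt

#!/usr/bin/perl
use strict;
use warnings 'all';
use Getopt::Long 'GetOptions';

GetOptions('p_file=s' => \my $p_file)
    or die "Usage: $0 -p_file <file>\n";
die "Usage: $0 -p_file <file>\n" unless defined $p_file;
die "Filename ($p_file) does not exist\n" unless -f $p_file;

open my $fh, '<', $p_file or die "Cannot open $p_file: $!";
while (my $line = <$fh>) {
    chomp $line;
    next unless $line =~ /\S/;   # skip blank lines
    my @cols = split /:/, $line;
    print "$cols[2]\n";          # third column
}
close $fh;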

Related

How do I search for and print a matched word in UNIX or Perl?

1=ABC,2=mnz,3=xyz
1=pqr,3=ijk,2=lmn
I have this in a text file. I want to search for 1= and print only the matched words, 1=ABC and 1=pqr.
Any suggestions in Perl or Unix?
Input:
$ cat grep.in
1=ABC,2=mnz,3=xyz
1=pqr,3=ijk,2=lmn
4=pqr,3=ijk,2=lmn
Command:
$ grep -o '1=[^,]\+' grep.in
1=ABC
1=pqr
Explanations:
You can just use grep on your input
-o is to output only the matching pattern
1=[^,]\+ is the regex: it matches strings that start with 1= followed by at least one character that is not a comma (this assumes there is no comma on the right-hand side of the = other than the field separator)
if you want to accept an empty result, you can change the \+ to *
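If you want the same filtering in Perl rather than grep, a one-liner along these lines should behave much the same (a sketch; it prints the first 1=... field found on each line):

perl -ne 'print "$1\n" if /(?:^|,)(1=[^,]*)/' grep.in

The (?:^|,) part makes sure 1= is matched only at the start of a field, so something like 21=foo is not picked up.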
It appears that your input data is in CSV format. Here is a Perl solution based on Text::CSV that parses the CSV content row by row and prints out the fields that start with 1=.
#!/usr/bin/perl
use warnings;
use strict;
use Text::CSV;

my $csv = Text::CSV->new({
    binary => 1,
    eol    => "\n",
}) or die "CSV\n";

# parse the CSV content row-wise
while (my $row = $csv->getline(\*DATA)) {
    foreach (@{ $row }) {
        # print out fields that start with 1=
        print "$_\n" if /^1=/;
    }
}
exit 0;
__DATA__
1=ABC,2=mnz,3=xyz
1=pqr,3=ijk,2=lmn
Test run:
$ perl dummy.pl
1=ABC
1=pqr
Replace DATA with STDIN to read the input from standard input instead.
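If you would rather pass the file name on the command line than use DATA or STDIN, the same loop works on an opened handle. A sketch (the file name is assumed to be the first argument):

#!/usr/bin/perl
use strict;
use warnings;
use Text::CSV;

my $file = shift @ARGV or die "Usage: $0 <file>\n";
open my $fh, '<', $file or die "Cannot open $file: $!";

my $csv = Text::CSV->new({ binary => 1 }) or die "CSV\n";
while (my $row = $csv->getline($fh)) {
    print "$_\n" for grep { /^1=/ } @$row;   # fields starting with 1=
}
close $fh;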

How to calculate the size of a file given as a command-line argument in Perl

I was trying out a sample program that calculates the size of the file given in the command-line arguments. It gives the size correctly when the file name is stored in a variable, but doesn't output a result when the file name comes from the command-line arguments.
#! /usr/bin/perl
use File::stat;
while (<>) {
    if (($_ cmp "\n") == 0) {
        exit 0;
    }
    else {
        my $file_size = stat($_)->size; # $filesize = -s $_;
        print $file_size;
    }
}
I get no output when using the file test operator -s, and I get errors when using the stat module:
Unsuccessful stat on filename containing newline at /usr/share/perl/5.10/File/stat.pm line 49, <> line 1.
Can't call method "size" on an undefined value at 2.pl line 17, <> line 1.
1.txt is the filename I'm giving as input.
#!/usr/bin/perl
for (@ARGV) {
    my $file_size = -s $_;
    print $file_size;
}
Or a similar command-line one-liner:
perl -E 'say "$_, size: ", -s for @ARGV' *
#!/usr/bin/perl -w
$filename = '/path/to/your/file.doc';
$filesize = -s $filename;
print $filesize;
Simple enough, right? First you create a string that contains the path to the file that you want to test, then you use the -s File Test Operator on it. You could easily shorten this to one line using simply:
print -s '/path/to/your/file.doc';
Also, keep in mind that this will always return true if a file is larger than zero bytes, but will be false if the file size is zero. It makes a handy and quick way to check for zero byte files.
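A slightly more defensive sketch that loops over every name given on the command line, skips anything that is not a plain file, and reports the size (the file names are whatever you pass in):

#!/usr/bin/perl
use strict;
use warnings;

for my $file (@ARGV) {
    if (-f $file) {
        printf "%s: %d bytes\n", $file, -s $file;
    }
    else {
        warn "$file is not a plain file\n";
    }
}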

Parsing multiple text files and writing their content using a while loop, the diamond operator <>, and the $ARGV variable in Perl

I have some text files inside a directory and I want to parse their content and write it to a file. So far the code I am using is this:
#!/usr/bin/perl
# The while loop repeats the execution of a block as long as a certain condition is evaluated true
use strict;   # Always!
use warnings; # Always!
my $header = 1; # Flag to tell us to print the header
while (<*.txt>) { # read a line from a file
    if ($header) {
        # This is the first line, print the name of the file
        print "========= $ARGV ========\n";
        # reset the flag to a false value
        $header = undef;
    }
    # Print out what we just read in
    print;
}
continue { # This happens before the next iteration of the loop
    # Check if we finished the previous file
    $header = 1 if eof;
}
When I run this script I am only getting the headers of the files, plus a compiled.txt entry.
I also receive the following message in cmd: Use of uninitialized value $ARGV in concatenation (.) or string at concat.pl line 12.
So I guess I am doing something wrong and $ARGV isn't used at all. Plus, instead of $header I should use something else in order to retrieve the text.
Need some assistance!
<*.txt> does not read a line from a file, even if you say so in a comment. It runs
glob '*.txt'
i.e. the while loop iterates over the file names, not over their contents. Use an empty <> to iterate over the contents of all the files.
BTW, instead of $header = undef, you can use undef $header.
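Putting both points together, one way to keep the structure of the original script and still read the file contents is to put the glob results into @ARGV and let the empty diamond operator do the reading. A sketch based on the code above:

#!/usr/bin/perl
use strict;
use warnings;

@ARGV = glob '*.txt';   # file names go into @ARGV ...
my $header = 1;

while (<>) {            # ... so the empty <> reads their contents line by line
    if ($header) {
        print "========= $ARGV ========\n";   # $ARGV holds the current file name
        undef $header;
    }
    print;
}
continue {
    $header = 1 if eof; # the next read starts a new file
}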
As I understand it, you want to print a header with the filename just before each file's first line, and concatenate them all into a new file. A one-liner could be enough for the task.
It checks for the first line using the $. variable and closes the filehandle to reset its value between different input files:
perl -pe 'printf qq|=== %s ===\n|, $ARGV if $. == 1; close ARGV if eof' *.txt
An example in my machine yields:
=== file1.txt ===
one
=== file2.txt ===
one
two

Perl - count the number of columns per row in a CSV file

I want to count the number of columns in a row for a CSV file.
row 1 10 columns
row 2 11 columns
etc.
I can print out the value of the last column, but I really just want a count per row.
perl -F, -lane "{print @keys[$_].$F[$_] foreach(-1)}" < testing.csv
I am on a Windows machine.
Thanks.
If you have a proper CSV file, it can contain embedded delimiters (e.g. 1,"foo,bar",2), in which case a simple split will not be enough. You can use the Text::CSV module fairly easily with a one-liner like this:
Copy/paste version:
perl -MText::CSV -lwe"my $c=Text::CSV->new({sep_char=>','}); while($r=$c->getline(*STDIN)) { print scalar @$r }" < sorted.csv
Readable version:
perl -MText::CSV # use Text::CSV module
-lwe # add newline to print, use warnings
"my $c = Text::CSV->new(); # set up csv object
while( $r = $c->getline(*STDIN) ) { # get lines from stdin
print scalar @$r # print row size
}" < sorted.csv # input file to stdin
If your input can be erratic, Text::CSV->getline might choke on corrupted lines (ending the while loop early), in which case it may be safer to use plain parsing:
perl -MText::CSV -nlwe"
BEGIN { $r = Text::CSV->new() };
$r->parse($_);
print scalar $r->fields
" comma.csv
Note that in this case we use a different input method. This is because while getline() requires a file handle, parse() does not. Since the diamond operator uses either ARGV or STDIN depending on your argument, I find it is better to be explicit.
If you don't have commas inside the fields, you can split the line and count the number of fields:
#! /usr/bin/perl
use strict;
use warnings;
my @cols = split(',', $_);
my $n = @cols;
print "row $. $n columns\n";
You can call it like this:
perl -n script.pl testing.csv
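If the fields never contain embedded commas, the same count also fits in a one-liner using autosplit (shown with double quotes since you are on Windows; @F holds the fields of the current line and $. the line number):

perl -F, -lane "print qq{row $. }, scalar(@F), qq{ columns}" testing.csv

This prints output in the "row 1 10 columns" format, but like any plain split it will miscount rows with quoted fields such as "foo,bar".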

How to replace ^M with a newline in Perl

My test file has "n" lines, and between each line there is a ^M, which in turn makes the whole file one big string. The code I am working with opens said file and should parse out a header and then the subsequent rows, then search for the directory path and file name. But because the file just ends up as one big string, it doesn't work correctly:
#!/usr/bin/perl
#use strict;
#use warnings;
open (DATA, "<file.txt") or die ("Unable to open file");
my $search_string = "Directory Path";
my $column_search = "Filename";
my $header = <DATA>;
my @header_titles = split /\t/, $header;
my $extract_col = 0;
my $col_search = 0;
for my $header_line (@header_titles) {
    last if $header_line =~ m/$search_string/;
    $extract_col++;
}
for my $header_line (@header_titles) {
    last if $header_line =~ m/$column_search/;
    $col_search++;
}
print "Extracting column $extract_col $search_string\n";
while ( my $row = <DATA> ) {
    last unless $row =~ /\S/;
    chomp $row;
    my @cells = split /\t/, $row;
    $cells[74] =~ s/:/\//g;
    $cells[$extract_col] = $cells[74] . $cells[$col_search];
    print "$cells[$extract_col] \n";
}
When I open the test file in vi I have used
:%s/^M/\r/g
and that removes the ^M's, but how do I do it inside this Perl program? When I tried a test program, inserted s/^M/\r/g, and had it write to a different file, the output came up as a lot of Chinese characters.
If mac2unix isn't working for you, you can write your own mac2unix as a Perl one-liner:
perl -pi -e 'tr/\r/\n/' file.txt
That will likely fail if the file is larger than available memory, though, because a file containing only carriage returns and no newlines is read in as a single record.
For completeness, let's also have a dos2unix:
perl -pi -e 'tr/\r//d' file.txt
and a unix2dos:
perl -pi -e 's/\n/\r\n/g' file.txt
Before you start reading the file, set $/ to "\r". This is set to the linefeed character by default, which is fine for UNIX-style line endings, and almost OK for DOS-style line endings, but useless for the old Mac-style line endings you are seeing. You can also try mac2unix on your input file if you have it installed.
For more, look for "INPUT_RECORD_SEPARATOR" in the perlvar manpage.
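As a sketch of that approach (assuming the file really does use bare carriage returns as line endings, as the ^M characters suggest):

#!/usr/bin/perl
use strict;
use warnings;

$/ = "\r";   # old Mac-style line endings: records end in a carriage return

open my $fh, '<', 'file.txt' or die "Unable to open file: $!";
while (my $row = <$fh>) {
    chomp $row;                  # chomp removes $/, i.e. the trailing CR
    my @cells = split /\t/, $row;
    print scalar(@cells), " fields on this line\n";
}
close $fh;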
Did this file originate on a Windows system? If so, try running the dos2unix command on the file before reading it. You can do this before invoking the Perl script or inside the script before you read the file.
You might want to set $/ (the input record separator) to a carriage return at the beginning of your script, for example:
$/ = "\r";
The ExtUtils::Command module that ships with Perl also provides a dos2unix function you can call from the command line:
perl -MExtUtils::Command -e dos2unix file