Create a hash from the command line in Perl

I have a file with data like below:
4 1
7 12
2 5
4 4
6 67
12 5
From the command line I can split every line into an array like this:
perl -F'\s+' -ane 'print $F[0]' file
This prints the first field of every line.
The command above turns every line into an array.
Can a hash be built in a similar way, with the first field as the key and the second field as its value?

Try this:
perl -MData::Dumper -ane '$X{$F[0]}=$F[1]}{print Dumper \%X' file

Yes, it can be done.
perl -MData::Dumper -e '%a = map { (split)[0,1] } <ARGV>;print Dumper \%a' dt.txt

Related

Can "perl -a" somehow re-join @F using the original whitespace?

My input has a mix of tabs and spaces for readability. I want to modify a field using perl -a, then print out the line in its original form. (The data is from findup, showing me a count of duplicate files and the space they waste.) Input is:
2 * 4096 backup/photos/photo.jpg photos/photo.jpg
2 * 111276032 backup/books/book.pdf book.pdf
The output would convert field 3 to kilobytes, like this:
2 * 4 KB backup/photos/photo.jpg photos/photo.jpg
2 * 108668 KB backup/books/book.pdf book.pdf
In my dream world, this would be my code, since I could just will perl to automatically recombine @F and preserve the original whitespace:
perl -lanE '$F[2]=int($F[2]/1024)." KB"; print;'
In real life, joining with a single space seems like my only option:
perl -lanE '$F[2]=int($F[2]/1024)." KB"; print join(" ", @F);'
Is there any automatic variable which remembers the delimiters? If I had a magic array like that, the code would be:
perl -lanE 'BEGIN{use List::Util "reduce";} $F[2]=int($F[2]/1024)." KB"; print reduce { $a . shift(@magic) . $b } @F;'
No, there is no such magic object. You can do it by hand though
perl -wnE'@p = split /(\s+)/; $p[4] = int($p[4]/1024); print @p' input.txt
The capturing parens in split's pattern mean that the separators are returned as well, so the exact whitespace is kept. Since the separators are now in the array too, the value we want is the fifth element, $p[4].
As it turns out, -F has this same property. Thanks to Сухой27. Then
perl -F'(\s+)' -lanE'$F[4] = int($F[4]/1024); say @F' input.txt
Note: with 5.20.0 "-F now implies -a and -a implies -n". Thanks to ysth.
You could just find the correct part of the line and modify it:
perl -wpE's/^\s*+(?>\S+\s+){2}\K(\S+)/int($1\/1024) . " KB"/e'

perl : how to print after a specific line

For example, my.txt containts
a
b
xx
c
d
I want to print everything starting from the second line after the line that contains xx.
I tried
perl -nle 'if(/xx/){$n=$.};print if $.>($n+1)' my.txt
But it didn't work. It printed lines that come before xx as well.
Until $n is assigned it is undefined and evaluates to 0 in numeric context, so every line with $. > 1 is printed even before xx is seen. This might be what you wanted:
perl -nle 'if(/xx/){$n=$.}; print if defined($n) and $. > $n+1' my.txt

How to pass both variables and files to a perl -p -e command

I have a command line written in perl that executes in Solaris (maybe this is irrelevant as it is UNIX-like) which inserts a "wait" string every 6 lines
perl -pe 'print "wait\n" if ($. % 6 == 0);' file
However, I want to replace that 6 by a parameter ($ARGV[0]), resulting in something like this:
perl -pe 'print "wait\n" if ($. % $ARGV[0] == 0);' file 6
It goes well, giving me the right output, until it finishes reading the file and treats "6" as the next file (even when it understood it as ARGV[0] before).
Is there any way to use the -p option and specify which parameters are files and which ones are not?
Edited: I thought there was a problem with using the -f option but as @ThisSuitIsBlackNot pointed out, I was using it wrongly.
-p, as a superset of -n, wraps the code with a while (<>) { } loop, which reads from the files named on the command line. You need to extract the argument before entering the loop.
perl -e'$n = shift; while (<>) { print "wait\n" if $. % $n == 0; print }' 6 file
or
perl -pe'BEGIN { $n = shift } print "wait\n" if $. % $n == 0' 6 file
Alternatively, you could also use an env var.
N=6 perl -pe'print "wait\n" if $. % $ENV{N} == 0' file

Delete \n characters from line range in text file

Let's say we have a text file with 1000 lines.
How can we delete new line characters from line 20 to 500 (replace them with space for example)?
My try:
sed '20,500p; N; s/\n/ /;' #better not to say anything
All other lines (1-19 && 501-1000) should be preserved as-is.
As I'm familiar with sed, awk or perl solutions are welcomed, but please give an explanation with them as I'm a perl and awk newbie.
You could use something like this (my example is on a slightly smaller scale :-)
$ cat file
1
2
3
4
5
6
7
8
9
10
$ awk '{printf "%s%s", $0, (2<=NR&&NR<=5?FS:RS)}' file
1
2 3 4 5 6
7
8
9
10
The second %s in the printf format specifier is replaced by either the Field Separator (a space by default) or the Record Separator (a newline) depending on whether the Record Number is within the range.
Alternatively:
$ awk '{ORS=(2<=NR&&NR<=5?FS:RS)}1' file
1
2 3 4 5 6
7
8
9
10
Change the Output Record Separator depending on the line number and print every line.
You can pass variables for the start and end if you want, using awk -v start=2 -v end=5 '...'
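Spelled out, the variable-passing form could look like the sketch below. The names start and end are the ones suggested above, and the input is the same ten-line example.

```shell
# Join lines start..end (inclusive) with the following line by switching
# the output record separator to a space for those records.
joined=$(printf '%s\n' 1 2 3 4 5 6 7 8 9 10 |
  awk -v start=2 -v end=5 '{ ORS = (start <= NR && NR <= end ? FS : RS) } 1')
printf '%s\n' "$joined"
```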
This might work for you (GNU sed):
sed -r '20,500{N;s/^(.*)(\n)/\2\1 /;D}' file
or perhaps more readably:
sed ':a;20,500{N;s/\n/ /;ta}' file
Using a perl one-liner to strip the newline:
perl -i -pe 'chomp if 20..500' file
Or to replace it with a space:
perl -i -pe 's/\R/ / if 20..500' file
Explanation:
Switches:
-i: Edit <> files in place (makes backup if extension supplied)
-p: Creates a while(<>){...; print} loop for each “line” in your input file.
-e: Tells perl to execute the code on command line.
Code:
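chomp: Removes the trailing newline from the current line.
20 .. 500: With constant operands, the range operator .. compares them against the current line number $., so the condition is true from line 20 through line 500.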
Here's a perl version:
my $min = 5; my $max = 10;
while (<DATA>) {
    if ($. > $min && $. < $max) {
        chomp;
        $_ .= " ";
    }
    print;
}
__DATA__
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
Output:
1
2
3
4
5
6 7 8 9 10
11
12
13
14
15
It reads from the DATA filehandle (which you can replace with whatever filehandle your application requires) and checks the line number, $.. While the line number is strictly between $min and $max, the line ending is chomped off and a space is added to the end of the line; otherwise, the line is printed as-is.

How to extract a particular column of data in Perl?

I have some data from a unix commandline call
1 ab 45 1234
2 abc 5
4 yy 999 2
3 987 11
I'll use the system() function for the call.
How can I extract the second column of data into an array in Perl? Also, the array size has to be dependent on the number of rows that I have (it will not necessarily be 4).
I want the array to have ("ab", "abc", "yy", 987).
use strict;
use warnings;
my $data = "1 ab 45 1234
2 abc 5
2 abc 5
2 abc 5
4 yy 999 2
3 987 11";
my @second_col = map { (split)[1] } split /\n/, $data;
To get unique values, see perlfaq4. Here's part of the answer provided there:
my %seen;
my @unique = grep { ! $seen{ $_ }++ } @second_col;
You can chain a Perl cmd-line call (aka: one-liner) to your unix script:
perl -lane 'print $F[1]' data.dat
Instead of reading data.dat, you can pipe in the output of your command-line tool:
cat data.dat | perl -lane 'print $F[1]'
Addendum:
The extension for unique-ness of the resulting column is straightforward:
cat data.dat | perl -lane 'print $F[1] unless $seen{$F[1]}++'
or, if you are lazy (employing %_):
cat data.dat | perl -lane 'print unless $_{$_=$F[1]}++'