How to manipulate output in perl - perl

I'm new to Perl and I can't find out whether I can control the output format.
For code like
print "$arOne[$i] => $arTwo[$i]\n";
I want the output to look like
 8 => 9
10 => 25
 7 => 456
If that is possible, how do I do it?

You want to use printf.
printf ("%2d => %-3d\n", $arOne[$i], $arTwo[$i]);
The formatting instructions are embedded between the % and a letter. In your case, you print numbers, so you need the letter d. The number to the left of the d specifies the minimum width (in characters) to reserve for the number. I assumed that the left number has at most two digits, while the right number has at most three digits. That might vary. Finally, the - in front of the 3d tells printf to left-align (rather than right-align) the number.
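As a minimal sketch of the loop the question implies, assuming both arrays have the same length (the sample data is taken from the question; the helper name format_pair is hypothetical and only there to make the format easy to reuse):

```perl
use strict;
use warnings;

my @arOne = (8, 10, 7);
my @arTwo = (9, 25, 456);

# Hypothetical helper: format one pair with the widths discussed above.
sub format_pair {
    my ($left, $right) = @_;
    return sprintf "%2d => %-3d", $left, $right;
}

# Walk both arrays by index and print one aligned line per pair.
for my $i (0 .. $#arOne) {
    print format_pair($arOne[$i], $arTwo[$i]), "\n";
}
```

If the numbers can be wider than two or three digits, the widths in the format string would need to grow accordingly.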

In the spirit of TMTOWTDI-ness, there's also the old facility of perl formats:
#! /usr/bin/perl
use strict;
use warnings;
use List::MoreUtils qw(each_array);
my @arOne = (8, 10, 7);
my @arTwo = (9, 25, 456); # @arTwoDeeTwo ? @ceeThreePO ?
my ($one, $two);
format STDOUT =
@> => @<<
$one,$two
.
# Now write to the format we described above
my $next_pair = each_array(@arOne, @arTwo);
while (($one, $two) = $next_pair->()) {
write;
}
UPDATE
Note that this "report generation" capability is little used in contemporary Perl programming. The printf suggestion is typically more flexible (and less surprising). It seemed a pity, however, not to mention Perl formats in a question about formatting in Perl.

Related

Shortest Perl solution for outputting 4 random words

I have this one-line Unix shell script
for i in 1 2 3 4; do sed "$(tr -dc '0-9' < /dev/urandom | fold -w 5 |
awk '$0>=35&&$0<=65570' | head -1)q;d" "$0"; done | perl -p00e
's/\n(?!\Z)/ /g'
The script has 65K words in it, one per line, from line 35 to 65570. The code and the data are in the same file.
This script outputs 4 space-separated random words from this list with a newline at the end. For example
first fourth third second
How can I make this one-liner much shorter with Perl, keeping the
tr -dc '0-9' < /dev/urandom
part?
Keeping it is important since it provides Cryptographically Secure Pseudo-Random Numbers (CSPRNs) for all Unix OSs. Of course, if Perl can get numbers from /dev/urandom then the tr can be replaced with Perl too, but the numbers from urandom need to stay.
For convenience, I shared the base script with 65K words
65kwords.txt
Please use only core modules. It would be used for generating "human memorable passwords".
Later, the (hashing) iteration count used where we store the passwords would be extremely high, so brute-forcing would be very slow, even with many GPUs/FPGAs.
You mention needing a CSPRN, which makes this a non-trivial exercise: if you need cryptographic randomness, then using built-in stuff (like rand) is not a good choice, as the implementation is highly variable across platforms.
But you've got Rand::Urandom which looks like it does the trick:
By default it uses the getentropy() (only available in > Linux 3.17) and falls back to /dev/arandom then /dev/urandom.
#!/usr/bin/env perl
use strict;
use warnings;
use Rand::Urandom;
chomp ( my @words = <DATA> );
print $words[rand @words], " " for 1..4;
print "\n";
__DATA__
yarn
yard
wound
worst
worry
work
word
wool
wolf
wish
wise
wipe
winter
wing
wind
wife
whole
wheat
water
watch
walk
wake
voice
Failing that though - you can just read bytes from /dev/urandom directly:
#!/usr/bin/env perl
use strict;
use warnings;
my $number_of_words = 4;
chomp ( my @words = <DATA> );
open ( my $urandom, '<:raw', '/dev/urandom' ) or die $!;
my $bytes;
read ( $urandom, $bytes, 2 * $number_of_words ); #2 bytes 0 - 65535
#for testing
#unpack 'n' is n An unsigned short (16-bit)
# unpack 'n*' in a list context returns a list of these.
foreach my $value ( unpack ( "n*", $bytes ) ) {
print $value,"\n";
}
#actually print the words.
#note - this assumes that you have the right number in your list.
# you could add a % @words to the map, e.g. $words[$_ % @words]
#but that will mean wrapping occurs, and will alter the frequency distribution.
#a more robust solution would be to fetch additional bytes if the 'slot' is
#empty.
print join " ", ( map { $words[$_] } unpack ( "n*", $bytes )),"\n";
__DATA__
yarn
yard
wound
worst
#etc.
Note - the above relies on your wordlist having exactly as many entries as two bytes (16 bits) can address, i.e. 65536. If that assumption isn't true, you'll need to deal with 'missed' words. A crude approach would be to take a modulo, but that would mean some wrapping and therefore a not quite even distribution of word picks. Otherwise you can bit-mask and reroll, as indicated below:
On a related point though - have you considered not using a wordlist, and instead using consonant-vowel-consonant groupings?
E.g.:
#!/usr/bin/env perl
use strict;
use warnings;
#uses /dev/urandom to fetch bytes.
#generates consonant-vowel-consonant groupings.
#each are 11.22 bits of entropy, meaning a 4-group is 45 bits.
#( 20 * 6 * 20 = 2400, and log2(2400) = 11.22 bits of entropy )
#log2(2400 ^ 4) = 44.91
#but because it's generated 'true random' it's a known-entropy string.
my $num = 4;
my $format = "CVC";
my %letters = (
V => [qw ( a e i o u y )],
C => [ grep { not /[aeiouy]/ } "a" .. "z" ], );
my %bitmask_for;
foreach my $type ( keys %letters ) {
#find the next power of 2 for the number of 'letters' in the set.
#So - for the '20' letter group, that's 31. (0x1F)
#And for the 6 letter group that's 7. (0x07)
$bitmask_for{$type} = ( 2 << log ( @{$letters{$type}} ) / log 2 ) - 1 ;
}
open( my $urandom, '<:raw', '/dev/urandom' ) or die $!;
for ( 1 .. $num ) {
for my $type ( split //, $format ) {
my $value;
while ( not defined $value or $value >= @{ $letters{$type} } ) {
my $byte;
read( $urandom, $byte, 1 );
#byte is 0-255. Our key space is 20 or 6.
#So rather than modulo, which would lead to an uneven distribution,
#we just bitmask and discard any 'too high'.
$value = (unpack "C", $byte ) & $bitmask_for{$type};
}
print $letters{$type}[$value];
}
print " ";
}
print "\n";
close($urandom);
This generates 3 character CVC symbols, with a known entropy level (11.22 per 'group') for making reasonably robust passwords. (45 bits as opposed to the 64 bits of your original, although obviously you can add extra 'groups' to gain 11.22 bits per time).
This answer is not cryptographically safe!
I would do this completely in Perl. No need for a one-liner. Just grab your word-list and put it into a Perl program.
use strict;
use warnings;
my @words = qw(
first
second
third
fourth
);
print join( q{ }, map { $words[int rand @words] } 1 .. 4 ), "\n";
This grabs four random words from the list and outputs them.
rand @words evaluates @words in scalar context, which gives the number of elements, and returns a random floating-point value from 0 up to, but not including, that number. int cuts off the decimals. The result is used as the index to grab an element out of @words. We repeat this four times with the map statement, where 1 .. 4 is the same as passing the list (1, 2, 3, 4) into map. The list values themselves are ignored; each iteration just picks a random word. map returns a list, which we join on one space. Finally we print the resulting string and a newline.
The word list is created with the quoted words qw() operator, which returns a list of quoted words. It's shorthand so you don't need to type all the quotes ' and commas ,.
If you want to have the word list at the bottom, you could either put the qw() in a sub and call it at the top, or use a __DATA__ section and read from it like a filehandle.
The particular method using tr and fold on /dev/urandom is a lot less efficient than it could be, so let's fix it up a little bit, while keeping the /dev/urandom part.
Assuming that available memory is enough to contain your script (including wordlist):
chomp(@words = <DATA>);
open urandom, "/dev/urandom" or die;
read urandom, $randbytes, 4 * 2 or die;
print join(" ", map $words[$_], unpack "S*", $randbytes), "\n";
__DATA__
word
list
goes
here
This goes for brevity and simplicity without outright obfuscation — of course you could make it shorter by removing whitespace and such, but there's no reason to. It's self-contained and will work with several decades of perls (yes, those bareword filehandles are deliberate :-P)
It still expects exactly 65536 entries in the wordlist, because that way we don't have to worry about introducing bias to the random number choice using a modulus operator. A slightly more ambitious approach might be to read 48 bits (6 bytes) from urandom for each word, turning them into a floating-point value between 0 and 1 (portable to most systems) and multiplying it by the size of the word list, allowing for a word list of any reasonable size.
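One way that "more ambitious approach" could be sketched. This assumes 6 bytes (48 bits) of /dev/urandom per word, chosen so the value fits exactly in a double; the helper name index_from_bytes is hypothetical, and the short @words list is only a placeholder for the real one:

```perl
use strict;
use warnings;

# Hypothetical helper: map 6 random bytes to an index in 0 .. $size - 1.
sub index_from_bytes {
    my ($bytes, $size) = @_;
    # Split the 48 bits into a 16-bit and a 32-bit big-endian integer.
    my ($hi, $lo) = unpack 'nN', $bytes;
    # Fraction in [0, 1); 48 bits fit exactly in a double's 53-bit mantissa.
    my $fraction = ($hi * 2**32 + $lo) / 2**48;
    return int($fraction * $size);
}

my @words = qw(yarn yard wound worst worry);    # placeholder list

open(my $urandom, '<:raw', '/dev/urandom')
    or die "Can't open /dev/urandom: $!";
my @picked;
for (1 .. 4) {
    my $got = read($urandom, my $bytes, 6);
    die "Short read from /dev/urandom" unless defined $got && $got == 6;
    push @picked, $words[ index_from_bytes($bytes, scalar @words) ];
}
close $urandom;
print "@picked\n";
```

For list sizes that do not divide 2**48 evenly, a tiny bias remains, but it is on the order of the list size divided by 2**48.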
A lot of nonsense is talked about password strength, and I think you're overestimating the worth of several of your requirements here.
I don't understand your preoccupation with making your code "much shorter with perl". (Why did you pick Perl?) Savings here can only really be useful to make the script quicker to read and compile, but they will be dwarfed by the half megabyte of data following the code, which must also be read.
In this context, the usefulness to a hacker of a poor random number generator depends on prior knowledge of the construction of the password together with the passwords that have been most recently generated. With a sample of only 65,000 words, even the worst random number generator will show insignificant correlation between successive passwords.
In general, a password is more secure if it is longer, regardless of its contents. Forming a long password out of a sequence of English words is purely a way of making the sequence more memorable.
"Of course later, the (hashing) iteration count ... would be extreme high, so brute-force [hacking?] would be very slow"
This doesn't follow at all. Cracking algorithms won't try to guess the four words you've chosen: they will see only a thirty-character (or so) string consisting only of lower-case letters and spaces, and whose origin is insignificant. It will be no more or less crackable than any other password of the same length with the same character set.
I suggest that you should rethink your requirements and so make things easier for yourself. I don't find it hard to think of four English words, and don't need a program to do it for me. Hint: pilchard is a good one: they never guess that!
If you still insist, then I would write something like this in Perl. I've used only the first 18 lines of your data.
use strict;
use warnings 'all';
use List::Util 'shuffle';
my @s = map /\S+/g, ( shuffle( <DATA> ) )[ 0 .. 3 ];
print "@s\n";
__DATA__
yarn
yard
wound
worst
worry
work
word
wool
wolf
wish
wise
wipe
winter
wing
wind
wife
whole
wheat
output
wind wise winter yarn
You could use Data::Random::rand_words()
perl -MData::Random -E 'say join $/, Data::Random::rand_words(size => 4)'

How to use the Perl format function to print multiple columns

I have a Perl hash like %word. The key is the word and the value is its count. Now I want to display %word like:
the     20    array   10    print      2
a       18    perl     8    function   1
of      12    code     5
I searched and found that Perl's format can solve this, and I read the perlform page, but I still don't know how to do it.
I knew about format and that it could be very handy for generating nice forms... at the time we still had a world where all was monospaced...
So, I researched it a bit and found the following solution:
use strict;
use warnings;
my %word = (
the => 20,
array => 10,
print => 2,
a => 18,
perl => 8,
function => 1,
of => 12,
code => 5,
);
my @word = %word; # turn the hash into a list
format =
@<<<<<<<<<<< @>>>> @<<<<<<<<<<< @>>>> @<<<<<<<<<<< @>>>>~~
shift @word, shift @word, shift @word, shift @word, shift @word, shift @word
.
write;
The nasty problem sits in the ~~, which makes the line repeat, and in the fact that each field in the format line needs a corresponding scalar value... To get those scalar values, I shifted them off the @word array.
There is a lot more to know about format and write.
Have fun!

print function in Perl

This is a perl code I use for compiling pressure data.
$data_ct--;
mkdir "365Days", 0777 unless -d "365Days";
my $file_no = 1;
my $j = $num_levels;
for ($i = 0; $i < $data_ct; $i++) {
if ($j == $num_levels) {
close OUT;
$j = 0;
my $file = "365days/wind$file_no";
$file_no++;
open OUT, "> $file" or die "Can't open $file: $!";
}
{
$wind_direction = (270-atan2($vwind[$i], $uwind[$i])*(180/pi))%360;
}
$wind_speed = sqrt($uwind[$i]*$uwind[$i]+$vwind[$i]*$vwind[$i]);
printf OUT "%.0f %.0f %.1f\n", $level[$i], $wind_direction, $wind_speed;
$j++;
}
$file_no--;
print STDERR "Wrote out $file_no wind files.\n";
print STDERR "Done\n";
The problem I am having is when it prints out the numbers, I want it to be in this format
Level   Wind direction   windspeed
250     320              1.5
870     56               4.6
Right now, when I run the script, the column names do not show up, just the numbers. Can someone show me how to fix the script?
There are several ways to do this in Perl. First, Perl has built in form ability. It's been a part of Perl since version 3.0 (about 20 years old). However, it is rarely used. In fact, it is so rarely used I am not even going to attempt to write an example with it because I'd have to spend way too much time relearning it. It's there and documented.
You can try to figure it out for yourself. Or, maybe some old timer Perl programmer might wake up from his nap and help out. (All bets are off if it's meatloaf night at the old age home, though).
Perl has evolved greatly in the last few decades, and this old forms bit represents a much older way of writing Perl programs. It just isn't pretty.
Another, more popular way to do this is to use the printf function. If you're not familiar with C and its printf, it can be a bit intimidating to use. It depends on formatting codes (the things that start with %) to specify what you want to print (strings, integers, floating point numbers, etc.) and how you want those values formatted.
Fortunately, printf is so useful that most programming languages have their own version of it. Once you learn it, your knowledge is transferable to other places. There's an equivalent sprintf that returns the formatted string so you can store it in a variable.
# Define header and line formats
my $header_fmt = "%-5.5s %-14.14s %-9.9s\n";
my $data_fmt = "%5d %14d %9.1f\n";
# Print my table header
printf $header_fmt, "Level", "Wind direction", "windspeed";
my $level = 230;
my $direction = 120;
my $speed = 32.3;
# Print my table data
printf $data_fmt, $level, $direction, $speed;
This prints out:
Level Wind direction windspeed
  230            120      32.3
I like defining the format of my printed lines all together, so I can tweak them to get what I want. It's a great way to make sure your data line lines up with your header.
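For the sprintf variant mentioned above, here is a small sketch. It takes the same format codes as printf but returns the string instead of printing it (the values and the variable name $line are made up for illustration):

```perl
use strict;
use warnings;

# Build the formatted line first, then decide what to do with it.
my $line = sprintf "%-5s|%5d|%7.2f", "abc", 42, 3.14159;
print "$line\n";    # prints "abc  |   42|   3.14"
```

This is handy when the formatted text has to go somewhere other than a filehandle, for example into a log message or a hash value.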
Okay, Matlock wasn't on tonight, so this crusty old Perl programmer has plenty of time.
In my previous answer, I said there was an old way of doing forms in Perl, but I didn't remember how it went. Well, I spent some time and got you an example of how it works.
Basically, you sort of need globalish variables. I thought you needed our variables for this to work, but I can get my variables to work if I define them on the same level as my format statements. It's not pretty.
You use GLOBS to define your formats with _TOP appended for your headers. Since I'm printing these on STDOUT, I define STDOUT_TOP for my heading and STDOUT for my data lines.
The format must start at the beginning of a column. The lone . on the end ends the format definition. You notice I write the entire thing with just a single write statement. How does it know to print out the heading? Perl tracks the number of lines printed and automatically writes a Form Feed character and a new heading when Perl thinks it's at the bottom of a page. I am assuming Perl uses 66 line pages as a default.
You can in Perl set your own form names via select. Perl uses $= as the number of lines on a page, and $- on the number of lines left. These variables are global, but are set by the selected format via the select statement. You can use IO::Handle for better variable naming.
#! /usr/bin/env perl
use strict;
use warnings;
use feature qw(say);
my @data = (
{
level => 250,
direction => 320,
speed => 1.5,
},
{
level => 870,
direction => 55,
speed => 4.5,
},
);
my $level;
my $direction;
my $speed;
for my $item_ref ( @data ) {
$level = $item_ref->{level};
$direction = $item_ref->{direction};
$speed = $item_ref->{speed};
write;
}
format STDOUT_TOP =
Level Wind Direction Windspeed
===== ============== =========
.
format STDOUT =
@#### @############# @#####.##
$level, $direction, $speed
.
This prints:
Level Wind Direction Windspeed
===== ============== =========
  250            320      1.50
  870             55      4.50
@Gunnerfan: Can you replace the line from your code as shown below?
Your line of code: printf OUT "%.0f %.0f %.1f\n", $level[$i], $wind_direction, $wind_speed;
Replacement code:
if ($i == 0) {
    printf OUT "%-8s %-16s %-10s\n", 'Level', 'Wind direction', 'windspeed';
}
printf OUT "%-8.0f %-16.0f %-10.1f\n", $level[$i], $wind_direction, $wind_speed;

Hash Key and Value in Perl

I have this question in Perl: Read a series of last names and phone numbers from the given input. The names and numbers should be separated by a comma. Then print the names and numbers alphabetically according to last name. Use hashes.
#!usr/bin/perl
my %series = ('Ashok','4365654435' 'Ramnath','4356456546' 'Aniketh','4565467577');
while (($key, $value) = each(sort %series))
{
print $key.",".$value."\n";
}
I am not getting the output. Where am I going wrong? Please help. Thanks in advance
#!usr/bin/perl
my %series = ('Ashok','4365654435' 'Ramnath','4356456546' 'Aniketh','4565467577');
print $_.",".$series{$_}."\n" for sort keys %series;
If I execute any of the above 2 programs, I get the same output as:
String found where operator expected at line 2, near "'4365654435' 'Ramnath'" (Missing operator before 'Ramnath'?)
String found where operator expected at line 2, near "'4356456546' 'Aniketh'" (Missing operator before 'Aniketh'?)
syntax error at line 2, near "'4365654435' 'Ramnath'"
Execution aborted due to compilation errors
But according to the question, I think I cannot store the input as my %series = ('Ashok','4365654435','Ramnath','4356456546','Aniketh','4565467577');
each only operates on hashes. You can't use sort like that, it sorts lists not hashes.
Your loop could be:
foreach my $key (sort keys %series) {
print $key.",".$series{$key}."\n";
}
Or in shorthand:
print $_.",".$series{$_}."\n" for sort keys %series;
In your hash declaration you have:
my %series = ('Ashok','4365654435' 'Ramnath','4356456546' 'Aniketh','4565467577');
This is what generates the errors.
A hash is initialized from an even-sized list of scalars, so you have to put a comma between each pair as well:
my %series = ('Ashok','4365654435', 'Ramnath','4356456546', 'Aniketh','4565467577');
#                                 ^---                    ^---
If you want visual distinction between the pairs, you can use the => operator. This behaves the same as the comma. Additionally, if the left-hand side is a legal bareword, it is treated as a quoted string. Therefore, we could write any of these:
# it is just a comma after all, with autoquoting
my %series = (Ashok => 4365654435 => Ramnath => 4356456546 => Aniketh => 4565467577);
# using it as a visual "pair" constructor
my %series = ('Ashok'=>'4365654435', 'Ramnath'=>'4356456546', 'Aniketh'=>'4565467577');
# as above, but using autoquoting. Numbers don't have to be quoted.
my %series = (
Ashok => 4365654435,
Ramnath => 4356456546,
Aniketh => 4565467577,
);
This last solution is the best. The last comma is optional, but I consider it good style, since it makes it easy to add another entry. You can use autoquoting whenever the bareword on the left would be a legal variable name. E.g. a_bc => 1 is valid, but a bc => 1 is not (whitespace is not allowed in variable names), and +/- => 1 is not allowed (reserved characters). However Ünıçøðé => 1 is allowed when your source code is encoded in UTF-8 and you use utf8 in your script.
Besides what amon and Mat said, I'd like to point out other issues in your code:
your shebang is wrong: it should be #!/usr/bin/perl (notice the first /)
you don't have use strict; and use warnings; in your code - although this is not strictly a mistake, I consider this to be an issue. Those 2 commands will save you from a lot of trouble later on.
PS: you have to use commas between your number and names also, not only between names and numbers - you have to, because otherwise you get a compile error
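Putting the pieces from the answers together, here is a rough sketch of the whole exercise. It assumes one "name,number" pair per input line; the sample lines in @input stand in for real input (for example <STDIN>), and the sub name parse_pairs is hypothetical:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: turn "name,number" lines into a hash.
sub parse_pairs {
    my %series;
    for my $line (@_) {
        chomp(my $copy = $line);
        my ($name, $number) = split /\s*,\s*/, $copy, 2;
        next unless defined $number;    # skip malformed lines
        $series{$name} = $number;
    }
    return %series;
}

# Placeholder input; replace with <STDIN> to read real data.
my @input = ("Ashok,4365654435\n", "Ramnath,4356456546\n", "Aniketh,4565467577\n");
my %series = parse_pairs(@input);

# Print alphabetically by last name.
print "$_,$series{$_}\n" for sort keys %series;
```

The phone numbers are kept as strings here, which avoids any surprises with leading zeros or very long numbers.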

re-order alphabet sorting in Perl

I am trying to fix sorting for the Armenian alphabet, because all standard Unix tools and programming languages sort letters and words correctly for only one of the two major dialects (Western).
Translated into a technical problem, this means re-ordering one of the characters, "ւ", to put it in a different place among the letters, let's say making it the last character, so that words are ordered correctly for the other dialect (Eastern). Linguistically speaking, in the Eastern dialect this "ւ" symbol is not written "standalone" but is part of a letter that's written with 2 chars, "ու". Current sorting puts the letter "ու" before the 2-letter constructs "ոք" or "ոփ".
Basically, it is the same as if you wanted to put e.g. the letter "v" in the place of the letter "z" in the Latin alphabet.
I am trying to use something like
#!/usr/bin/perl -w
use strict;
my (@sortd, @unsortd, $char_u, $char_x);
#@unsortd = qw(աբասի ապուշ ապրուստ թուր թովիչ թոշակ թոք);
@unsortd = qw(ու ոց ոք ոփ);
@sortd = sort {
$char_u = "ւ";
$char_x = split(//, @unsortd);
if ($char_u gt $char_x) {
1;
} else {
return $a cmp $b;
}
} @unsortd;
print "@sortd\n";
but that does not scale for whole words, just 2 letter forms are fixed.
UPDATE: I was able to solve this using tr function to map letters to numbers as shown in Perlmonks
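The tr idea from the update can be sketched roughly as follows. This is not the Perlmonks code; it uses the Latin analogy from the question (moving "v" to the end of the alphabet), and the helper name sort_key and the sample words are illustrative assumptions:

```perl
use strict;
use warnings;

# Hypothetical helper: rewrite a word so that plain string comparison
# follows the custom order a..u, w, x, y, z, v ("v" sorts last).
sub sort_key {
    my ($word) = @_;
    (my $key = $word) =~ tr/abcdefghijklmnopqrstuwxyzv/a-z/;
    return $key;
}

my @unsorted = qw(vase zebra water apple);
my @sorted   = sort { sort_key($a) cmp sort_key($b) } @unsorted;
print "@sorted\n";    # prints "apple water zebra vase"
```

For Armenian text the same approach should apply, provided the script has use utf8 and the input is decoded, so that tr sees characters rather than bytes.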
You should have a look at the Unicode::Collate::Locale module if you haven't done so already.
use Unicode::Collate::Locale;
my $collator = Unicode::Collate::Locale->new(locale => "hy");
@sortd = $collator->sort(@unsortd);
print join("\n", @sortd, '');
This prints:
ու
ոց
ոք
ոփ
(I'm not sure this is the output you're expecting, but that module and Unicode::Collate has quite a lot of information, it might be easier to create a custom collation for your needs based on that rather than rolling your own.)
For standard alphabets, Unicode::Collate::Locale as suggested by @mat should be the first choice.
On the other hand, if you have very specific needs, index can be used as follows. To sort single characters (note that missing characters would sort first, since index returns -1 for them):
my $alphabet_A = "acb";
sub by_A {index($alphabet_A,$a) <=> index($alphabet_A,$b)};
...
my @sorted = sort by_A @unsorted;
For words, one can include a loop in the definition of by_A. For the following to work, min() must be available (List::Util provides one), and the case of words of different lengths still needs fine-tuning:
use List::Util qw(min);
sub by_A {
    # Compare character by character; the first difference decides.
    foreach my $i (0 .. min(length($a), length($b)) - 1) {
        my $cmp = index($alphabet_A, substr($a, $i, 1)) <=> index($alphabet_A, substr($b, $i, 1));
        return $cmp if $cmp;
    }
    # No difference within the common prefix: the words compare as equal here.
    return 0;
}