generate random binary number in perl - perl

I want to generate 64 non-repeating 6-digit values that consist only of 0 and 1 (e.g. 111111, 101111, 000000) using Perl.
I found code that can generate random hex values and tried to modify it, but I think my code is all wrong. This is my code:
use strict;
use warnings;
my %a;
foreach (1 .. 64) {
my $r;
do {
$r = int(rand(2));
}
until (!exists($a{$r}));
printf "%06d\n", $r;
$a{$r}++;
}

Do you mean that you want 64 six-bit numbers, all distinct from each other? If so, then you should just shuffle the list (0, 1, 2, 3, …, 63), because there are exactly 64 six-bit numbers — you just want them in a random order.
And if you want to print them as base-two string, use the %06b format.
use List::Util;
my @list = List::Util::shuffle 0..63;
printf "%06b\n", $_ for @list;

From the comments:
I actually want to generate all possible 6-bit binary numbers. Since writing all the possible combinations by hand is cumbersome and prone to human error, I think it would be a good idea to just generate them by using rand() with no repetition and store them in an array.
This is a horribly inefficient approach to take, thanks to random number collisions.
You get the same result with:
printf ( "%06b\n", $_ ) for 0..63;
If you're after a random order (although, you don't seem to suggest that you do):
use List::Util qw ( shuffle );
printf ( "%06b\n", $_ ) for shuffle (0..63);
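If the goal is to keep the values in an array rather than only print them, the shuffle approach can be sketched as follows. This is a minimal example; the array name @codes is just illustrative, and List::Util is a core module.

```perl
use strict;
use warnings;
use List::Util qw(shuffle);

# All 64 six-bit values, in random order, formatted as binary strings.
my @codes = map { sprintf '%06b', $_ } shuffle 0 .. 63;

print "$_\n" for @codes;
```

Because every value from 0 to 63 appears exactly once in the input list, no duplicate check is needed.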

If you want 64 x 6-bit integers you can call int(rand(64)); 64 times, there's no need to generate each bit separately.
Your code can be modified to work like this:
#!/usr/bin/perl
use strict;
use warnings;
my %a;
foreach (1 .. 64) {
my $r;
do
{
$r = int(rand(64));
} until (!exists($a{$r}));
printf "%06b\n", $r;
$a{$r}++;
}
The values already drawn are stored as keys of a hash, which is how repeats are detected. The %06b format specifier prints each value as a 6-bit binary number.

Related

Shortest Perl solution for outputting 4 random words

I have this one-line Unix shell script
for i in 1 2 3 4; do sed "$(tr -dc '0-9' < /dev/urandom | fold -w 5 |
awk '$0>=35&&$0<=65570' | head -1)q;d" "$0"; done | perl -p00e
's/\n(?!\Z)/ /g'
The script has 65K words in it, one per line, from line 35 to 65570. The code and the data are in the same file.
This script outputs 4 space-separated random words from this list with a newline at the end. For example
first fourth third second
How can I make this one-liner much shorter with Perl, keeping the
tr -dc '0-9' < /dev/urandom
part?
Keeping it is important since it provides Cryptographically Secure Pseudo-Random Numbers (CSPRNs) for all Unix OSs. Of course, if Perl can get numbers from /dev/urandom then the tr can be replaced with Perl too, but the numbers from urandom need to stay.
For convenience, I shared the base script with 65K words
65kwords.txt
Please use only core modules. It would be used for generating "human memorable passwords".
Later, the (hashing) iteration count, where we would use this to store the passwords would be extremely high, so brute-force would be very slow, even with many many GPUs/FPGAs.
You mention needing a CSPRN, which makes this a non trivial exercise - if you need cryptographic randomness, then using built in stuff (like rand) is not a good choice, as the implementation is highly variable across platforms.
But you've got Rand::Urandom which looks like it does the trick:
By default it uses the getentropy() (only available in > Linux 3.17) and falls back to /dev/arandom then /dev/urandom.
#!/usr/bin/env perl
use strict;
use warnings;
use Rand::Urandom;
chomp ( my @words = <DATA> );
print $words[rand @words], " " for 1..4;
print "\n";
__DATA__
yarn
yard
wound
worst
worry
work
word
wool
wolf
wish
wise
wipe
winter
wing
wind
wife
whole
wheat
water
watch
walk
wake
voice
Failing that though - you can just read bytes from /dev/urandom directly:
#!/usr/bin/env perl
use strict;
use warnings;
my $number_of_words = 4;
chomp ( my @words = <DATA> );
open ( my $urandom, '<:raw', '/dev/urandom' ) or die $!;
my $bytes;
read ( $urandom, $bytes, 2 * $number_of_words ); #2 bytes 0 - 65535
#for testing
#unpack 'n' is n An unsigned short (16-bit)
# unpack 'n*' in a list context returns a list of these.
foreach my $value ( unpack ( "n*", $bytes ) ) {
print $value,"\n";
}
#actually print the words.
#note - this assumes that you have the right number in your list.
# you could add a % @words to the map, e.g. $words[$_ % @words]
#but that will mean wrapping occurs, and will alter the frequency distribution.
#a more robust solution would be to fetch additional bytes if the 'slot' is
#empty.
print join " ", ( map { $words[$_] } unpack ( "n*", $bytes )),"\n";
__DATA__
yarn
yard
wound
worst
#etc.
Note - the above relies on the fact that your wordlist is the same size as two bytes (16 bits) - if this assumption isn't true, you'll need to deal with 'missed' words. A crude approach would be to take a modulo, but that would mean some wrapping and therefore not quite truly even distribution of word picks. Otherwise you can bit-mask and reroll, as indicated below:
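For a word list that does not have exactly 65536 entries, the bit-mask-and-reroll idea could look roughly like this. It is only a sketch: the helper name random_index is hypothetical, it assumes at most 65536 words, and it reads 16-bit values from whatever filehandle it is given.

```perl
use strict;
use warnings;

# Hypothetical helper: return a uniformly distributed index in 0 .. $n-1
# using 16-bit values read from $fh, rejecting values that fall outside.
sub random_index {
    my ($fh, $n) = @_;
    die "n must be between 1 and 65536\n" if $n < 1 || $n > 65536;

    # Smallest all-ones mask that covers $n - 1 (e.g. 31 for $n == 20).
    my $mask = 1;
    $mask = ($mask << 1) | 1 while $mask < $n - 1;

    while (1) {
        my $got = read($fh, my $buf, 2);
        die "short read from random source\n" unless defined $got && $got == 2;
        my $value = unpack('n', $buf) & $mask;
        return $value if $value < $n;    # otherwise reroll
    }
}

open(my $urandom, '<:raw', '/dev/urandom') or die "Cannot open /dev/urandom: $!";
my @words = qw(yarn yard wound worst worry);
print join(' ', map { $words[ random_index($urandom, scalar @words) ] } 1 .. 4), "\n";
close($urandom);
```

Masking first keeps the rejection rate below one half, and discarding out-of-range values avoids the bias a plain modulo would introduce.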
On a related point though - have you considered not using a wordlist, and instead using consonant-vowel-consonant groupings?
E.g.:
#!/usr/bin/env perl
use strict;
use warnings;
#uses /dev/urandom to fetch bytes.
#generates consonant-vowel-consonant groupings.
#each are 11.22 bits of entropy, meaning a 4-group is 45 bits.
#( 20 * 6 * 20 = 2400, and log2(2400) = 11.22 bits of entropy )
#log2(2400 ^ 4) = 44.91
#but because it's generated from a true random source it's a known-entropy string.
my $num = 4;
my $format = "CVC";
my %letters = (
V => [qw ( a e i o u y )],
C => [ grep { not /[aeiouy]/ } "a" .. "z" ], );
my %bitmask_for;
foreach my $type ( keys %letters ) {
#find the next power of 2 for the number of 'letters' in the set.
#So - for the '20' letter group, that's 31. (0x1F)
#And for the 6 letter group that's 7. (0x07)
$bitmask_for{$type} = ( 2 << log ( @{$letters{$type}} ) / log 2 ) - 1 ;
}
open( my $urandom, '<:raw', '/dev/urandom' ) or die $!;
for ( 1 .. $num ) {
for my $type ( split //, $format ) {
my $value;
while ( not defined $value or $value >= @{ $letters{$type} } ) {
my $byte;
read( $urandom, $byte, 1 );
#byte is 0-255. Our key space is 20 or 6.
#So rather than modulo, which would lead to an uneven distribution,
#we just bitmask and discard anything 'too high'.
$value = (unpack "C", $byte ) & $bitmask_for{$type};
}
print $letters{$type}[$value];
}
print " ";
}
print "\n";
close($urandom);
This generates 3 character CVC symbols, with a known entropy level (11.22 per 'group') for making reasonably robust passwords. (45 bits as opposed to the 64 bits of your original, although obviously you can add extra 'groups' to gain 11.22 bits per time).
This answer is not cryptographically safe!
I would do this completely in Perl. No need for a one-liner. Just grab your word-list and put it into a Perl program.
use strict;
use warnings;
my @words = qw(
first
second
third
fourth
);
print join( q{ }, map { $words[int rand @words] } 1 .. 4 ), "\n";
This grabs four random words from the list and outputs them.
rand @words evaluates @words in scalar context, which gives the number of elements, and returns a random floating-point value from 0 up to, but not including, that number. int cuts off the decimals. The result is used as the index to grab an element out of @words. We repeat this four times with the map statement, where 1 .. 4 is the same as passing the list (1, 2, 3, 4) into map as an argument. The values of that list are ignored; for each of them a random word is picked instead. map returns a list, which we join on one space. Finally we print the resulting string and a newline.
The word list is created with the quoted words qw() operator, which returns a list of quoted words. It's shorthand so you don't need to type all the quotes ' and commas ,.
If you'd want to have the word list at the bottom you could either put the qw() in a sub and call it at the top, or use a __DATA__ section and read from it like a filehandle.
The particular method using tr and fold on /dev/urandom is a lot less efficient than it could be, so let's fix it up a little bit, while keeping the /dev/urandom part.
Assuming that available memory is enough to contain your script (including wordlist):
chomp(@words = <DATA>);
open urandom, "/dev/urandom" or die;
read urandom, $randbytes, 4 * 2 or die;
print join(" ", map $words[$_], unpack "S*", $randbytes), "\n";
__DATA__
word
list
goes
here
This goes for brevity and simplicity without outright obfuscation — of course you could make it shorter by removing whitespace and such, but there's no reason to. It's self-contained and will work with several decades of perls (yes, those bareword filehandles are deliberate :-P)
It still expects exactly 65536 entries in the wordlist, because that way we don't have to worry about introducing bias to the random number choice using a modulus operator. A slightly more ambitious approach might be to read 48 bytes from urandom for each word, turning it into a floating-point value between 0 and 1 (portable to most systems) and multiplying it by the size of the word list, allowing for a word list of any reasonable size.
A lot of nonsense is talked about password strength, and I think you're overestimating the worth of several of your requirements here.
I don't understand your preoccupation with making your code "much shorter with perl". (Why did you pick Perl?) Savings here can only really be useful to make the script quicker to read and compile, but they will be dwarfed by the half megabyte of data following the code, which must also be read.
In this context, the usefulness to a hacker of a poor random number generator depends on prior knowledge of the construction of the password together with the passwords that have been most recently generated. With a sample of only 65,000 words, even the worst random number generator will show insignificant correlation between successive passwords.
In general, a password is more secure if it is longer, regardless of its contents. Forming a long password out of a sequence of English words is purely a way of making the sequence more memorable.
"Of course later, the (hashing) iteration count ... would be extreme high, so brute-force [hacking?] would be very slow"
This doesn't follow at all. Cracking algorithms won't try to guess the four words you've chosen: they will see only a thirty-character (or so) string consisting only of lower-case letters and spaces, whose origin is insignificant. It will be no more or less crackable than any other password of the same length with the same character set.
I suggest that you should rethink your requirements and so make things easier for yourself. I don't find it hard to think of four English words, and don't need a program to do it for me. Hint: pilchard is a good one: they never guess that!
If you still insist, then I would write something like this in Perl. I've used only the first 18 lines of your data for
use strict;
use warnings 'all';
use List::Util 'shuffle';
my @s = map /\S+/g, ( shuffle( <DATA> ) )[ 0 .. 3 ];
print "@s\n";
__DATA__
yarn
yard
wound
worst
worry
work
word
wool
wolf
wish
wise
wipe
winter
wing
wind
wife
whole
wheat
output
wind wise winter yarn
You could use Data::Random::rand_words()
perl -MData::Random -E 'say join $/, Data::Random::rand_words(size => 4)'

perl-how to treat a string as a binary number?

Read a file that contains an address and a data, like below:
@0, 12345678
@1, 5a5a5a5a
...
My aim is to read the address and the data. The data I read is in hex format, and I need to unpack it into a binary number.
So 12345678 would become 00010010001101000101011001111000
Then I need to unpack that binary number one level further.
So it becomes 00000000000000010000000000010000000000000001000100000001000000000000000100000001000000010001000000000001000100010001000000000000
The way I did it is shown below:
while(<STDIN>) {
if (/\@(\S+)\s+(\S+)/) {
$addr = $1;
$data = $2;
$mem{$addr} = ${data};
}
}
foreach $key (sort {$a <=> $b} (keys %mem)) {
my $str = unpack ('B*', pack ('H*',$mem{$key}));
my $str2 = unpack ('B*', pack ('H*', $str));
printf ("@%x ", $key);
printf ("%s",$str2);
printf ("\n");
}
It works, however, my next step is to do some numeric operation on the transferred bits.
Such as bitwise or and shifting. I tried << and | operator, both are for numbers, not strings. So I don't know how to solve this.
Please leave your comments if you have better ideas. Thanks.
You can employ Bit::Vector module from metaCPAN
use strict;
use warnings;
use Bit::Vector;
my $str = "1111000011011001010101000111001100010000001111001010101000111010001011";
printf "orig str: %72s\n", $str;
#only 72 bits for better view
my $vec = Bit::Vector->new_Bin(72,$str);
printf "vec : %72s\n", $vec->to_Bin();
$vec->Move_Left(2);
printf "left 2 : %72s\n", $vec->to_Bin();
$vec->Move_Right(4);
printf "right 4 : %72s\n", $vec->to_Bin();
prints:
orig str: 1111000011011001010101000111001100010000001111001010101000111010001011
vec : 001111000011011001010101000111001100010000001111001010101000111010001011
left 2 : 111100001101100101010100011100110001000000111100101010100011101000101100
right 4 : 000011110000110110010101010001110011000100000011110010101010001110100010
If you need to do some math with arbitrary precision, you can also use Math::BigInt or use bigint (http://perldoc.perl.org/bigint.html).
Hex and binary are text representations of numbers. Shifting and bit manipulation are numerical operations. You want a number, not text.
my $hex = '5a5a5a5a';
$num = hex($hex); # Convert to number.
$num >>= 1; # Manipulate the number.
$hex = sprintf('%08X', $num); # Convert back to hex.
In a comment, you mention you want to deal with 256 bit numbers. The native numbers don't support that, but you can use Math::BigInt.
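A minimal sketch of what that might look like for a 256-bit value, assuming a reasonably recent Math::BigInt that provides from_hex. The input value and the variable names are only examples.

```perl
use strict;
use warnings;
use Math::BigInt;

# A 256-bit value built from hex text (the value itself is just an example).
my $num = Math::BigInt->from_hex('5a5a5a5a' x 8);

# blsft, brsft and bior modify the object in place, so work on copies.
my $left  = $num->copy->blsft(4);                            # shift left by 4 bits
my $right = $num->copy->brsft(1);                            # shift right by 1 bit
my $ored  = $num->copy->bior( Math::BigInt->from_hex('ff') ); # bitwise OR

print $_->as_hex, "\n" for $left, $right, $ored;
```

Note that a left shift on a Math::BigInt grows the number rather than dropping high bits, so a fixed 256-bit width would need an explicit mask.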
My final solution was to forget about treating them as numbers and just treat them as strings. I use substr and string concatenation instead of shifts. For the OR operation, I add the corresponding bits of the two strings: if the sum is 0 the result bit is 0, otherwise it is 1.
It may not be the best way to solve this problem. But that's the way I finally used.
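The string-based approach described here could be sketched roughly as below. The helper names shift_left, shift_right and bit_or are hypothetical, and the sketch assumes fixed-width strings made only of the characters 0 and 1.

```perl
use strict;
use warnings;

# Shift a bit string left by $n places, keeping its width and filling with zeros.
sub shift_left {
    my ($bits, $n) = @_;
    my $len = length $bits;
    $n = $len if $n > $len;
    return substr($bits, $n) . ('0' x $n);
}

# Shift a bit string right by $n places, keeping its width and filling with zeros.
sub shift_right {
    my ($bits, $n) = @_;
    my $len = length $bits;
    $n = $len if $n > $len;
    return ('0' x $n) . substr($bits, 0, $len - $n);
}

# Bitwise OR of two equal-length bit strings, one character at a time.
sub bit_or {
    my ($x, $y) = @_;
    die "length mismatch\n" unless length($x) == length($y);
    return join '', map {
        substr($x, $_, 1) eq '1' || substr($y, $_, 1) eq '1' ? '1' : '0'
    } 0 .. length($x) - 1;
}

print shift_left('00010010', 2), "\n";
print shift_right('00010010', 2), "\n";
print bit_or('00010010', '01000001'), "\n";
```

This works for any length of bit string, at the cost of being slower than numeric operations on native integers.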

string array sorting issue in perl

I am running below code to sort strings and not getting the expected results.
Code:
use warnings;
use strict;
my @strArray= ("64.0.71","68.0.71","62.0.1","62.0.2","62.0.11");
my @sortedStrArray = sort { $a cmp $b } @strArray;
foreach my $element (@sortedStrArray ) {
print "\n$element";
}
Result:
62.0.1
62.0.11 <--- these two
62.0.2 <---
64.0.71
68.0.71
Expected Result:
62.0.1
62.0.2 <---
62.0.11 <---
64.0.71
68.0.71
"1" is character 0x31. "2" is character 0x32. 0x31 is less than 0x32, so "1" sorts before "2". Your expectations are incorrect.
To obtain the results you desire to obtain, you could use the following:
my @sortedStrArray =
map substr($_, 3),
sort
map pack('CCCa*', split(/\./), $_),
@strArray;
Or for a much wider range of inputs:
use Sort::Key::Natural qw( natsort );
my @sortedStrArray = natsort(@strArray);
cmp is comparing lexicographically (like a dictionary), not numerically. This means it will go through your strings character by character until there is a mismatch. In the case of "62.0.11" vs. "62.0.2", the strings are equal up until "62.0." and then it finds a mismatch at the next character. Since 2 > 1, it sorts "62.0.2" > "62.0.11". I don't know what you are using your strings for or if you have any control over how they're formatted, but if you were to change the formatting to "62.00.02" (every segment has 2 digits) instead of "62.0.2" then they would be sorted as you expect.
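If the stored strings cannot be changed, the same padding idea can be applied only while sorting. The following is a sketch under the assumption that every segment is a non-negative integer of at most five digits; padded_key is a hypothetical helper name.

```perl
use strict;
use warnings;

my @strArray = ("64.0.71", "68.0.71", "62.0.1", "62.0.2", "62.0.11");

# Hypothetical helper: zero-pad every dot-separated segment to five digits,
# so that a plain string comparison orders the segments numerically.
sub padded_key {
    return join '.', map { sprintf '%05d', $_ } split /\./, $_[0];
}

my @sorted = sort { padded_key($a) cmp padded_key($b) } @strArray;
print "$_\n" for @sorted;
```

The original strings are left untouched; only the comparison sees the padded form.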
Schwartzian transform
This uses the Schwartzian transform, named after Randal Schwartz:
First, understand, what you want:
sorting by first number, then second, then third:
let's do it with this:
use warnings;
use strict;
use Data::Dumper;
my @strArray= ("64.0.71","68.0.71","62.0.1","62.0.2","62.0.11");
my @transformedArray = map{[$_,(split(/\./,$_))]}@strArray;
=pod
here @transformedArray has this structure:
$each_element_of_array: [$element_from_original_array, $firstNumber, $secondNumber, $thirdNumber];
for example:
$transformedArray[0] ==== ["64.0.71", 64, 0, 71];
after that we will sort it
first by first number
then: by second number
then: by third number
=cut
my @sortedArray = map{$_->[0]} # keep only the original string.
sort{ $a->[1] <=> $b->[1] # first number
|| $a->[2] <=> $b->[2] # then second number
|| $a->[3] <=> $b->[3] } # then third number
@transformedArray;
print Dumper(\@sortedArray);
Try the Perl module Sort::Versions; it is designed to give you what you expect: http://metacpan.org/pod/Sort::Versions
It supports alpha-numeric version ids as well.

A quick string checksum function in Perl generating values in the 0..2^32-1 range

I'm looking for a Perl string checksum function with the following properties:
Input: Unicode string of undefined length ($string)
Output: Unsigned integer ($hash), for which 0 <= $hash <= 2^32-1 holds (0 to 4294967295, matching the size of a 4-byte MySQL unsigned int)
Pseudo-code:
sub checksum {
my $string = shift;
my $hash;
... checksum logic goes here ...
die unless ($hash >= 0);
die unless ($hash <= 4_294_967_295);
return $hash;
}
Ideally the checksum function should be quick to run and should generate values somewhat uniformly in the target space (0 .. 2^32-1) to avoid collisions. In this application random collisions are totally non-fatal, but obviously I want to avoid them to the extent that it is possible.
Given these requirements, what is the best way to solve this?
Any hash function will be sufficient - simply truncate it to 4-bytes and convert to a number. Good hash functions have a random distribution, and this distribution will be constant no matter where you truncate the string.
I suggest Digest::MD5 because it is the fastest hash implementation that comes with Perl as standard. String::CRC, as Pim mentions, is also implemented in C and should be faster.
Here's how to calculate the hash and convert it to an integer:
use Digest::MD5 qw(md5);
my $str = substr( md5("String-to-hash"), 0, 4 );
print unpack('L', $str); # Convert to 4-byte integer (long)
From perldoc -f unpack:
For example, the following computes the same number as the
System V sum program:
$checksum = do {
local $/; # slurp!
unpack("%32W*",<>) % 65535;
};
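Putting the MD5 truncation idea into the shape of the question's checksum sub might look like this. It is a sketch, not a definitive implementation: it assumes the input is a Perl character string, so it is encoded to UTF-8 first (md5 works on bytes), and it unpacks with 'N' so the result does not depend on the machine's byte order.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5);
use Encode qw(encode_utf8);

sub checksum {
    my ($string) = @_;

    # md5() operates on bytes, so encode the characters first.
    my $digest = md5( encode_utf8($string) );

    # Take the first 4 bytes as a 32-bit unsigned big-endian integer.
    my $hash = unpack 'N', substr( $digest, 0, 4 );

    die unless $hash >= 0;
    die unless $hash <= 4_294_967_295;
    return $hash;
}

print checksum("String-to-hash"), "\n";
print checksum("\x{263A} smiley"), "\n";
```

Both core modules ship with Perl, so nothing extra needs to be installed.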
Don't know how quick it is, but you might try String::CRC.

How can I generate non-repetitive random 4 bytes hex values in Perl?

I want to generate random hex values and those values should not be repetitive
and it should be of 4 bytes (ie: 0x00000000 to 0xffffffff) and the display output
should contain leading zeros.
For example: if I get the value 1 it should not represented as 0x1 but 0x00000001.
I want a minimum of 100 random values. Please tell me: how can I do that in Perl?
To get a random number in the range 0 .. 2**32-1:
my $rand = int(rand(0x100000000));
To print it in hex with leading zeroes:
printf "%08x", $rand;
Do please note this from the Perl man page:
Note: If your rand function consistently returns numbers that
are too large or too small, then your version of Perl was probably compiled with the wrong number of RANDBITS
If that's a concern, do this instead:
printf "%04x%04x", int(rand(0x10000)), int(rand(0x10000));
Note, also, that this does nothing to prevent repetition, although to be honest the chance of a repeating 32 bit number in a 100 number sequence is pretty small.
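To put a number on "pretty small", the chance of at least one repeat can be computed directly. This sketch assumes an ideal uniform generator over all 2**32 values, which rand only approximates.

```perl
use strict;
use warnings;

my $draws = 100;
my $space = 2**32;

# Probability that all draws are distinct: product of (1 - k/space) for k = 0 .. draws-1.
my $p_all_distinct = 1;
$p_all_distinct *= 1 - $_ / $space for 0 .. $draws - 1;

my $p_repeat = 1 - $p_all_distinct;
printf "chance of at least one repeat: %.3g\n", $p_repeat;
```

The result is on the order of one in a million, which is why the duplicate check rarely triggers for this range.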
If it's absolutely essential that you don't repeat, do something like this:
my (%a); # create a hash table for remembering values
foreach (0 .. 99) {
my $r;
do {
$r = int(rand(0x100000000));
} until (!exists($a{$r})); # loop until the value is not found
printf "%08x\n", $r; # print the value
$a{$r}++; # remember that we saw it!
}
For what it's worth, this algorithm shouldn't be used if the range of possible values is less than (or even near to) the number of values required. That's because the random number generator loop will just repeatedly pull out numbers that were already seen.
However in this case where the possible range is so high (2^32) and the number of values wanted so low it'll work perfectly. Indeed with a range this high it's about the only practical algorithm.
perl -e 'printf "%08X\n", int rand 0xFFFFFFFF for 1 .. 100'
Alnitak explained it, but here's a much simpler implementation. I'm not sure why everyone started reaching for do {} while, since that's a really odd choice:
my $max = 0xFFFF_FFFF;
my( %Seen, @numbers );
foreach ( 1 .. 100 )
{
my $rand = int rand( $max + 1 );
redo if $Seen{$rand}++;
push @numbers, $rand;
}
print join "\n", map { sprintf "0x%08x", $_ } @numbers;
Also, as Alnitak pointed out, if you are generating a lot of numbers, that redo might cycle many, many times.
These will only be pseudorandom numbers, but you're not really asking for real random number anyway. That would involve possible repetition. :)
use LWP::Simple "get";
use List::MoreUtils "uniq";
print for uniq map { s/\t//, "0x$_" } split /^/, LWP::Simple::get('http://www.random.org/integers/?num=220&min=0&max=65535&col=2&base=16&format=plain&rnd=date.2009-12-14');
Adjust the url (see the form on http://www.random.org/integers/?mode=advanced) to not always return the same list. There is a minuscule chance of not returning at least 100 results.
Note that this answer is intentionally "poor" as a comment on the poor question. It's not a single question, it's a bunch all wrapped up together, all of which I'd bet have existing answers already (how do I generate a random number in range x, how do I format a number as a hex string with 0x and 0-padding, how do I add only unique values into a list, etc.). It's like asking "How do I write a webserver in Perl?" Without guessing what part the questioner really wants an answer to, you either have to write a tome for a response, or say something like:
perl -MIO::All -e 'io(":80")->fork->accept->(sub { $_[0] < io(-x $1 ? "./$1 |" : $1) if /^GET \/(.*) / })'
To get a random integer:
int(rand(0x100000000))
To format it as 8 hexadecimal digits:
printf "%08x", int(rand(0x100000000))