I am trying to generate a random id for session cookies, one per user session, in Perl. Of course I searched CPAN and Google and found many similar topics, all sharing the same weakness. The most commonly used modules are Digest::SHA and Data::UUID, along with Data::GUID, which internally uses Data::UUID.
Here is code summarizing the most common approaches used by modules on CPAN:
#!/usr/bin/perl
use v5.10;
use Digest::SHA;
use Data::UUID;
use Data::GUID; # uses Data::UUID internally, so no need for both
use Time::HiRes ();

for (1..10) {
    #say generate_sha(1); # 1 = 40 hex chars, 256 = 64, 512 = 128; also 512224, 512256
    say generate_uuid();
    #say generate_guid();
}

sub generate_sha {
    my ($bits) = @_;
    # SHA-1/224/256/384/512
    return Digest::SHA->new($bits)
        ->add($$, +{}, Time::HiRes::time(), rand(Time::HiRes::time()))
        ->hexdigest;
}

sub generate_uuid {
    return Data::UUID->new->create_hex(); # also create_str, create_b64
}

sub generate_guid {
    # uses Data::UUID internally
    return Data::GUID->guid;
}
Here is a sample output from the Data::UUID module:
0x0217C34C6C0710149FE4C7FBB6FA663B
0x0218665F6C0710149FE4C7FBB6FA663B
0x0218781A6C0710149FE4C7FBB6FA663B
0x021889316C0710149FE4C7FBB6FA663B
0x021899E16C0710149FE4C7FBB6FA663B
0x0218AB2B6C0710149FE4C7FBB6FA663B
0x0218BB1D6C0710149FE4C7FBB6FA663B
0x0218CABD6C0710149FE4C7FBB6FA663B
0x0218DB786C0710149FE4C7FBB6FA663B
0x0218ED396C0710149FE4C7FBB6FA663B
The ids generated by these seem to be unique, but my concern is high traffic and concurrency: what happens when even 1,000 (never mind 1,000,000) users connect at the same time, whether from the same process (say, running under FCGI with each FCGI process serving only 10 users) or from separate processes (say, running under CGI)?
For the SHA I used this seed string:
($$, +{}, Time::HiRes::time(), rand(Time::HiRes::time()))
It includes the process id, the address of an anonymous hash reference, and the current time in microseconds from Time::HiRes::time. Are there other ways to build a random string?
I have read suggestions to add the host name and IP address of the remote user, but others point out that proxies can mask both.
I see the Plack::Session::State module uses this simple code to generate ids:
Digest::SHA1::sha1_hex(rand() . $$ . {} . time)
So, the question in short: I want to generate a unique session id, possibly up to 64 bytes long, that is guaranteed to work under high traffic.
You can safely use Data::UUID, and you shouldn't be concerned about duplicates: you will not encounter them.
Also, rand() will not return the same number when called subsequently, even under the assumption that it is called at the same moment in time. A pseudo-random algorithm generates the next number based on its current state and the previously generated values. A true random generator is generally not used alone but in conjunction with a pseudo-random number generator. In either case, the likelihood that subsequently generated numbers repeat between the nearest clock ticks is, in practical terms, negligible. In your example you may want to use rand(2**32).
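For instance, here is a minimal sketch of the question's generate_sha with that tweak applied:

use Digest::SHA;
use Time::HiRes ();

# Same entropy sources as in the question, but with rand(2**32)
# so the random component spans the full 32-bit range.
sub generate_sha {
    my ($bits) = @_;
    return Digest::SHA->new($bits)
        ->add($$, +{}, Time::HiRes::time(), rand(2**32))
        ->hexdigest;
}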
Related
Does Data::UUID generate secure and random sequences? Is it OK to use it to generate a password recovery link?
For example:
use Data::UUID;

my $u = Data::UUID->new;
my $uuid = $u->create_from_name_str(NameSpace_URL, 'www.example.com');
# then add $uuid to db
# and send email to user
Personally I'd use UUID::Tiny, because it is capable of generating version 4 UUIDs, which are more random. However, in either case the modules are just using Perl's rand function, which isn't considered random enough for serious crypto work.
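A minimal sketch of what that looks like with UUID::Tiny:

use UUID::Tiny ':std';

# Version 4 UUIDs are built from (pseudo-)random bits rather than
# a timestamp, MAC address, or name hash.
my $uuid = create_uuid_as_string(UUID_V4);
print "$uuid\n"; # e.g. 0b862dd0-f657-4d34-9b51-8e86e304bef2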
Still, this is likely to be random enough for a typical password-recovery e-mail. Especially if the password recovery link is only kept working for, say, 24 hours and stops working after that.
It really depends on what you're securing though. Is it a forum for posting pictures of your pets dressed in superhero costumes, or is it nuclear launch codes? If you think that your website is likely to be a target for criminal elements, then it might be wise to opt for something stronger.
A fairly good random string with low collision probability can be generated using:
use Crypt::PRNG;
my $string = sprintf(
q/%08x%s/,
time(),
Crypt::PRNG->new->bytes_hex(24),
);
Data::UUID can generate either version 1 (create) or version 3 (create_from_name) UUIDs. Neither of those is random: version 1 is your MAC address plus a timestamp, and version 3 is an MD5 hash of the string you passed in.
How would you do a one-way mapping between a string and a UUID in Perl?
I need to integrate a legacy Perl system that assigns users usernames with a Java system that assigns users a UUID.
(It only needs to be one way, that is, username to UUID; I don't need to go back the other way.)
I was thinking something like this, although I bet there's a much simpler way:
#!/usr/bin/perl
use strict;
use Digest::MD5 qw(md5_hex);

my $username = "bob";
my $hash = md5_hex($username); # 32 hex characters
my $uuid = substr($hash, 0, 8) . "-"
         . substr($hash, 8, 4) . "-"
         . substr($hash, 12, 4) . "-"
         . substr($hash, 16, 4) . "-"
         . substr($hash, 20, 12);
print "$uuid\n";
I would suggest following RFC 4122's guidelines on generating UUIDs from names.
First, generate a random UUID and store it as part of your app / configuration.
Then:
use Data::UUID;
my $ug = Data::UUID->new;
my $namespace = $ug->from_string("65faad2c-7841-4b60-a7f4-560db1c5e683");
my $uuid = $ug->create_from_name_str($namespace, $username);
Where you replace "65faad2c-7841-4b60-a7f4-560db1c5e683" with your own randomly generated UUID.
This is guaranteed to generate valid UUIDs (your MD5 method isn't), and if you ever have another legacy app that needs to be imported into the new system, conflicts will be avoided just by giving that app its own random UUID to use as a seed.
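Since version 3 UUIDs are a pure function of the namespace and the name, the same username always maps to the same UUID, which is exactly what a one-way mapping needs. A quick check:

use Data::UUID;

my $ug        = Data::UUID->new;
my $namespace = $ug->from_string("65faad2c-7841-4b60-a7f4-560db1c5e683");

# Repeated calls with the same (namespace, name) pair return the same UUID.
my $uuid1 = $ug->create_from_name_str($namespace, "bob");
my $uuid2 = $ug->create_from_name_str($namespace, "bob");
print $uuid1 eq $uuid2 ? "stable\n" : "unstable\n"; # prints "stable"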
I am writing a very small URL shortener with Dancer. It uses the REST plugin to store a posted URL in a database with a six character string which is used by the user to access the shortened URL.
Now I am a bit unsure about my random string generation method.
sub generate_random_string {
    my $length_of_randomstring = shift; # the length of
                                        # the random string to generate
    my @chars = ('a'..'z', 'A'..'Z', '0'..'9', '_');
    my $random_string;
    for (1..$length_of_randomstring) {
        # rand @chars will generate a random
        # number between 0 and scalar @chars
        $random_string .= $chars[rand @chars];
    }
    # Start over if the string is already in the database
    return generate_random_string(6)
        if database->quick_select('urls', { shortcut => $random_string });
    return $random_string;
}
This generates a six char string and calls the function recursively if the generated string is already in the DB. I know there are 63^6 possible strings, but this will take some time once the database gathers more entries, and it might even turn into nearly infinite recursion, which I want to prevent.
Are there ways to generate unique random strings, which prevent recursion?
Thanks in advance
We don't really need to be hand-wavy about how many iterations (or recursions) of your function there will be. I believe at every invocation, the number of iterations is geometrically distributed (i.e. the number of trials before the first success is governed by the geometric distribution), which has mean 1/p, where p is the probability of successfully finding an unused string. I believe that p is just 1 - n/63^6, where n is the number of currently stored strings. Therefore, I think that you will need to have stored about 30 billion strings (~63^6/2) in your database before your function recurses on average more than 2 times per call (p = .5).
Furthermore, the variance of the geometric distribution is (1-p)/p^2, so even at 30 billion entries one standard deviation is just sqrt(2). Therefore I expect ~99% of the time the loop will take fewer than 2 + 2*sqrt(2), or about 5, iterations. In other words, I would just not worry too much about it.
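If you want to sanity-check those numbers, the expected iteration count per call is easy to compute directly (a quick sketch):

use strict;
use warnings;

# Expected number of draws until an unused string is found,
# given n of the 63**6 possible strings are already taken.
my $total = 63 ** 6; # ~6.25e10 possible strings
for my $n (1e6, 1e9, $total / 2) {
    my $p = 1 - $n / $total; # probability a single draw is unused
    printf "n = %.0e -> expected iterations = %.4f\n", $n, 1 / $p;
}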
From an academic stance this seems like an interesting program to work on. But if you're on the clock and just need random and distinct strings I'd go with the Data::GUID module.
use strict;
use warnings;
use Data::GUID qw( guid_string );
my $guid = guid_string();
Getting rid of recursion is easy; turn your recursive call into a do-while loop. For instance, split your function into two; the "main" one and a helper. The "main" one simply calls the helper and queries the database to ensure it's unique. Assuming generate_random_string2 is the helper, here's a skeleton:
do {
    $string = generate_random_string2(6);
} while (database->quick_select(...));
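Filled out, the pair might look like this (database->quick_select as in the question; the helper name generate_random_string2 is just illustrative):

# Helper: generate a candidate string with no uniqueness check.
sub generate_random_string2 {
    my ($length) = @_;
    my @chars = ('a'..'z', 'A'..'Z', '0'..'9', '_');
    return join '', map { $chars[rand @chars] } 1 .. $length;
}

# Main: retry until the candidate is not yet in the database.
sub generate_random_string {
    my ($length) = @_;
    my $string;
    do {
        $string = generate_random_string2($length);
    } while (database->quick_select('urls', { shortcut => $string }));
    return $string;
}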
As for limiting the number of iterations before getting a valid string, what about just saving the last generated string and always building your new string as a function of that?
For example, when you start off, you have no strings, so let's just say your string is 'a'. Then the next time you build a string, you take the last built string ('a') and apply a transformation, for instance incrementing the last character. This gives you 'b', and so on. Eventually you get to the highest character you care for (say 'z'), at which point you append an 'a' to get 'za', and repeat.
Now there is no database, just one persistent value that you use to generate the next value. Of course, if you want truly random strings you will have to make the algorithm more sophisticated, but the basic principle is the same (see the sketch after this list):
Your current value is a function of the last stored value.
When you generate a new value, you store it.
Ensure your generation will produce a unique value (one that did not occur before).
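Perl's string auto-increment gives you a transformation like this for free: ++ on an alphanumeric string carries like an odometer, so 'z' becomes 'aa', 'az' becomes 'ba', and so on. A sketch:

# The last issued string is the only persistent state; ++ yields
# the next string, growing in length once the alphabet is exhausted.
my $last = 'a';
for (1 .. 3) {
    print "$last\n"; # a, b, c
    $last++;
}
$last = 'zz';
$last++;
print "$last\n"; # aaa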
I've got one more idea based on using MySQL.
create table string (
    string_id int(10) not null auto_increment,
    string varchar(6) not null default '',
    primary key (string_id)
);

insert into string set string = '';

update string
set string = lpad( hex( last_insert_id() ), 6, uuid() )
where string_id = last_insert_id();

select string from string
where string_id = last_insert_id();
This gives you an incremental hex value which is left padded with non-zero junk.
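From Perl, this could be driven with DBI along these lines (a sketch; the connection details and database name are assumptions):

use DBI;

my $dbh = DBI->connect('dbi:mysql:shortener', 'user', 'pass',
                       { RaiseError => 1 });

# Insert a placeholder row, then rewrite its string column using the
# auto-increment id padded on the left with characters from uuid().
# LAST_INSERT_ID() is per-connection in MySQL, so this is safe here.
$dbh->do(q{insert into string set string = ''});
$dbh->do(q{
    update string
    set string = lpad( hex( last_insert_id() ), 6, uuid() )
    where string_id = last_insert_id()
});
my ($short) = $dbh->selectrow_array(
    q{select string from string where string_id = last_insert_id()}
);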
I know the _id (ObjectID) of some entry; is there any way to get its relative position from the table start / number of records before it, without writing any code?
(The stuff was required for debugging an application which had a messy 'no deletions' policy along with incremental record numbers and in-memory collections.)
UPD: still looking for a native way to do such things, but here are some Perl sweets:
#!/usr/bin/perl -w
use strict;
use MongoDB;
use MongoDB::OID;

# Walk the whole collection, counting documents as we go.
my $ppl = MongoDB::Connection->new(username => "root", password => "toor")
    ->webapp->users->find();
my $c = 0;
while (my $user = $ppl->next) {
    $c++;
    # Report the running position for the _ids we are interested in.
    print "$user->{_id} $c\n" if ($user->{'_id'} =~ /4...6|4...5/);
}
This is not possible. There is no information in an ObjectID that you can reliably use to know how many older documents are in the same collection. The "inc" part of the ObjectId comes close but exact values depend on driver implementation (and can even be random) and would require all writes to come from the same machine to a mongod that's managing a single collection.
TL;DR: No
I hope this question is still on topic, but recently I found a key-value store programmed in Perl. It was pretty simple and RAM based, and I think it had just set and get, plus an 'expire' option for keys. I also think it came as both an XS and a pure Perl version.
I have been searching for quite a while now and I am not sure whether it is on CPAN or whether I saw it on GitHub. Maybe someone knows what I am talking about.
It might be helpful in narrowing things down if you could explain what exactly the module does that is special in that regard. If you're looking to implement something with caching in general, I'd point you towards CHI, which is basically a common API with multiple caching drivers.
Do you mean Cache? It can store key/value pairs in a number of places, including shared memory.
It sounds like you are describing Memcached. There is a Perl interface on CPAN.
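For reference, the set/get/expire trio the question describes maps directly onto the Cache::Memcached interface (a sketch, assuming a local memcached on the default port):

use Cache::Memcached;

my $memd = Cache::Memcached->new({
    servers => ['127.0.0.1:11211'],
});

# set(key, value, expiry-in-seconds); get returns undef once expired.
$memd->set('session:abc', { user_id => 42 }, 3600);
my $session = $memd->get('session:abc');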
I've used Tie::Cache for this in the past with excellent results. It creates a tied hash variable that exhibits LRU behavior when it grows beyond a configured maximum key count.
my $cache_size = 1000;
use vars qw(%cache);
%cache = ();
tie %cache, 'Tie::Cache', $cache_size;
From here, you can store key/value pairs (of course, the value side can be a reference) in %cache, and should its size grow to 1000 keys, the least recently used keys will be deleted as more are added.
In my usage, I store the right-hand side as an arrayref holding the cached value along with a timestamp of when the entry was cached; my cache reference code checks the timestamp and deletes the key without using it if the entry isn't fresh enough:
sub getCacheMatch {
    my $check_value = shift;
    my $timeout = 600; # 10 minutes

    # Check cache for a match.
    my ($result, $time_cached);
    my $now = time();
    my $cache_entry = $cache{$check_value};
    if ($cache_entry) {
        ($result, $time_cached) = @{$cache_entry};
        if ($now - $time_cached > $timeout) {
            # Entry is stale: drop it and report a miss.
            delete $cache{$check_value};
            return undef;
        } else {
            return $result;
        }
    }
}
And I update the cache elsewhere in the code like so:
$cache{$cache_checkstring} = [$value_to_cache, $now];
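Putting the two halves together, a lookup-or-compute pattern might look like this (expensive_lookup is a hypothetical placeholder for whatever produces the value):

# Try the cache first; on a miss or stale entry, recompute and store.
my $value = getCacheMatch($key);
unless (defined $value) {
    $value = expensive_lookup($key); # hypothetical expensive call
    $cache{$key} = [$value, time()];
}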