Map a string (usernames) to UUID's in perl - perl

How would you do a one may mapping between a string and a UUID in perl.
I need integrate a legacy perl system that assigns users usernames, with a java system that assigns users a UUID.
(Only needs to be one way, that is, username to UUID, I don't need to go back the other way)
I was thinking something like this, although I bet theres a much simpler way:
#!/usr/bin/perl
use strict;
use Digest::MD5 qw(md5_hex);
my $username = "bob";
my $hash = md5_hex($username);
my $uuid = substr($hash, 0, 8)."-".substr($hash,8,4)."-".substr($hash,12,4)."-".substr($hash,16,4)."-".substr($hash,20,32);
print "$uuid\n";

I would suggest following RFC 4122's guidelines on generating UUIDs from names.
First, generate a random UUID and store it as part of your app / configuration.
Then:
use Data::UUID;
my $ug = Data::UUID->new;
my $namespace = $ug->from_string("65faad2c-7841-4b60-a7f4-560db1c5e683");
my $uuid = $ug->create_from_name_str($namespace, $username);
Where you replace "65faad2c-7841-4b60-a7f4-560db1c5e683" with your own randomly generated UUID.
This is guaranteed to generate valid UUIDs (your md5 method isn't), and if you ever have another legacy app that needs to be imported into the new system, conflicts will be avoided just by giving that app its own random UUID to use as a seed.

Related

Perl generate random session id for high traffic

I am trying to generate a random id for sessions cookie for every user session in Perl. Of course I searched cpan and google and found many similar topics and same weakness. The most modules used are Digest::SHA and Data::UUID and the module Data::GUID which internally uses Data::UUID.
Here is the code I can summarize the most methods used in modules on cpan:
#!/usr/bin/perl
use v5.10;
use Digest::SHA;
use Data::UUID;
use Data::GUID;# uses Data::UUID internally, so no need for it
use Time::HiRes ();
for (1..10) {
#say generate_sha(1); # 1= 40 bytes, 256=64 bytes, 512=128 bytes, 512224, 512256
say generate_uuid();
#say generate_guid();
}
sub generate_sha {
my ($bits) = #_;
# SHA-1/224/256/384/512
return Digest::SHA -> new($bits) -> add($$, +{}, Time::HiRes::time(), rand(Time::HiRes::time()) ) -> hexdigest;
}
sub generate_uuid {
return Data::UUID->new->create_hex(); #create_str, create_b64
}
sub generate_guid {
# uses Data::UUID internally
return Data::GUID->guid;
}
Here is a sample output form Data::UUID module:
0x0217C34C6C0710149FE4C7FBB6FA663B
0x0218665F6C0710149FE4C7FBB6FA663B
0x0218781A6C0710149FE4C7FBB6FA663B
0x021889316C0710149FE4C7FBB6FA663B
0x021899E16C0710149FE4C7FBB6FA663B
0x0218AB2B6C0710149FE4C7FBB6FA663B
0x0218BB1D6C0710149FE4C7FBB6FA663B
0x0218CABD6C0710149FE4C7FBB6FA663B
0x0218DB786C0710149FE4C7FBB6FA663B
0x0218ED396C0710149FE4C7FBB6FA663B
The id's generated from these seems to be unique, but what I am concerning is about high traffic or concurrency, say what if a 1000 only not saying 1000,000 users connected at the same time either from the same process like running under FCGI (say each FCGI process serving only 10 users) or from separate processes like running under CGI mode.
In the SHA I used this random string:
($$, +{}, Time::HiRes::time(), rand(Time::HiRes::time())
it includes Anonymous hash reference address and the current time in microseconds with Time::HiRes::time. Is there any other ways to make random string.
I have read topics to add the Host name and IP address of the remote user but others say about proxies could be used.
I see Plack::Session::State module uses this simple code to generate id's:
Digest::SHA1::sha1_hex(rand() . $$ . {} . time)
So the question in short I want to generate a unique may be up to 64 bytes long session id guaranteed to work with high traffic.
You can safely use Data::UUID and you shouldn't be concerned about duplicates, you will not encounter them.
Also rand() will not return the same number when called subsequently even under assumption that it is called at the same moment of time. A pseudo random algorithm generates the next number based on its current state and the previously generated values. A true random generator is generally not used alone but in conjunction with a pseudo random number generator. In either case the likelihood of subsequently generated numbers to repeat between the nearest clock ticks in practical terms is negligible. In your example you may want to use rand(2**32).

Is UUID random enought for password recovery link?

Does Data::UUID generates secure and random sequences? Is it ok to use it to generate password recovery link?
For example:
use Data::UUID;
my $u = Data::UUID->new;
my $uuid = $u->create_from_name_str(NameSpace_URL, 'www.example.com');
#then add $uuid to db
#and send email to user
Personally I'd use UUID::Tiny because that's capable of generating version 4 UUIDs, which are more random. However, in either case the modules are just using Perl's rand function which isn't considered random enough for serious crypto work.
Still, this is likely to be random enough for a typical password-recovery e-mail. Especially if the password recovery link is only kept working for, say, 24 hours and stops working after that.
It really depends on what you're securing though. Is it a forum for posting pictures of your pets dressed in superhero costumes, or is it nuclear launch codes? If you think that your website is likely to be a target for criminal elements, then it might be wise to opt for something stronger.
A fairly good random string with low collision probability can be generated using:
use Crypt::PRNG;
my $string = sprintf(
q/%08x%s/,
time(),
Crypt::PRNG->new->bytes_hex(24),
);
Data::UUID can generate either version 1 (create) or version 3 (create_from_name) UUIDs. Neither of those is random. Version 1 is your MAC address plus a timestamp, and version 3 is a MD5 hash of the string you passed in.

How do you use SHA256 to create a token of key,value pairs and a secret signature?

I want to validate some hidden input fields (to make sure they arent changed on submission) with the help of a sha-encoded string of the key value pairs of these hidden fields. I saw examples of this online but I didnt understand how to encode and
decode the values with a dynamic secret value. Can someone help me understand how to do this in perl?
Also which signature type (MD5, SHA1, SHA256, etc), has a good balance of performance and security?
update
So, how do you decode the string once you get it encoded?
What you really need is not a plain hash function, but a message authentication code such as HMAC. Since you say you'd like to use SHA-256, you might like HMAC_SHA256, which is available in Perl via the Digest::SHA module:
use Digest::SHA qw(hmac_sha256_base64);
my $mac = hmac_sha256_base64( $string, $key );
Here, $key is an arbitrary key, which you should keep secret, and $string contains the data you want to sign. To apply this to a more complex data structure (such as a hash of key–value pairs), you first need to convert it to a string. There are several ways to do that; for example, you could use Storable:
use Storable qw(freeze);
sub pairs_to_string {
local $Storable::canonical = 1;
my %hash = #_;
return freeze( \%hash );
}
You could also URL-encoding, as suggested by David Schwartz. The important thing is that, whatever method you use, it should always return the exact same string when given the same hash as input.
Then, before sending the data to the user, you calculate a MAC for them and include it as an extra field in the data. When you receive the data back, you remove the MAC field (and save its value), recalculate the MAC for the remaining fields and compare it to the value you received. If they don't match, someone (or something) has tampered with the data. Like this:
my $key = "secret";
sub mac { hmac_sha256_base64( pairs_to_string(#_), $key ) }
# before sending data to client:
my %data = (foo => "something", bar => "whatever");
$data{mac} = mac( %data );
# after receiving %data back from client:
my $mac = delete $data{mac};
die "MAC mismatch" if $mac ne mac( %data );
Note that there are some potential tricks this technique doesn't automatically prevent, such as replay attacks: once you send the data and MAC to the user, they'll learn the MAC corresponding to the particular set of data, and could potentially replace the fields in a later form with values saved from an earlier form. To protect yourself against such attacks, you should include enough identifying information in the data protected by the MAC to ensure that you can detect any potentially harmful replays. Ideally, you'd want to include a unique ID in every form and check that no ID is ever submitted twice, but that may not always be practical. Failing that, it may be a good idea to include a user ID (so that a malicious user can't trick someone else into submitting their data) and a form ID (so that a user can't copy data from one form to another) and perhaps a timestamp and/or a session ID (so that you can reject old data) in the form (and in the MAC calculation).
I don't know what you mean by "unpack", but you can't get original string from the hash.
Let's understand the problem: you render some hidden fields and you want to make sure that they're submitted unchanged, right? Here's how you can ensure that.
Let's suppose you have two variables:
first: foo
second: bar
You can hash them together with a secret key:
secret_key = "ysEJbKTuJU6u"
source_string = secret_key + "first" + "foo" + "second" + "bar"
hash = MD5(source_string)
# => "1adfda97d28af6535ef7e8fcb921d3f0"
Now you can render your markup:
<input type="hidden" name="first" value="foo" />
<input type="hidden" name="second" value="bar" />
<input type="hidden" name="hash" value="1adfda97d28af6535ef7e8fcb921d3f0">
Upon form submission, you get values of first and second fields, concat them to your secret key in a similar manner and hash again.
If hashes are equal, your values haven't been changed.
Note: never render secret key to the client. And sort key/value pairs before hashing (to eliminate dependency on order).
( disclaimer: I am not a crypto person, so you may just stop reading now)
As for performance/security, even though MD5 was found to have a weakness, it's still pretty usable, IMHO. SHA1 has a theoretical weakness, although no successful attack has been made yet. There are no known weaknesses in SHA-256.
For this application, any of the encryption algorithms is fine. You can pack the values any way you want, so long as it's repeatable. One common method is to pack the fields into a string the same way you would encode them into a URL for a GET request (name=value).
To compute the hash, create a text secret that can be whatever you want. It should be at least 12 bytes long though. Compute the hash of the secret concatenated with the packed fields and append that onto the end.
So, say you picked MD5, a secret of JS90320ERHe2 and you have these fields:
first_name = Jack
last_name = Smith
other_field = 7=2
First, URL encode it:
first_name=Jack&last_name=Smith&other_field=7%3d=2
Then compute the MD5 hash of
JS90320ERHe2first_name=Jack&last_name=Smith&other_field=7%3d=2
Which is 6d0fa69703935efaa183be57f81d38ea. The final encoded field is:
first_name=Jack&last_name=Smith&other_field=7%3d=2&hash=6d0fa69703935efaa183be57f81d38ea
So that's what you pass to the user. To validate it, remove the hash from the end, compute the MD5 hash by concatenating what's left with the secret, and if the hashes match, the field hasn't been tampered with.
Nobody can compute their own valid MD5 because they don't know to prefix the string with.
Note that an adversary can re-use any old valid value set. They just can't create their own value set from scratch or modify an existing one and have it test valid. So make sure you include something in the information so you can verify that it is suitable for the purpose it has been used.

Get number of records in table before given entry / get record absolute ID

I know the _id (ObjectID) of some entry; is there any way to get its relative position from the table start / number of records before it, without writing any code?
*(the stuff was required for debugging some application which ha*d* messy 'no deletions' policy along with incremental record numbers and in-memory collections)*
UPD: still looking for native way to do such things, but here's some perl sweets:
#!/usr/bin/perl -w
use MongoDB;
use MongoDB::OID;
use strict;
my $ppl = MongoDB::Connection->new(username=>"root", password=>"toor")->webapp->users->find();
my $c = 0;
while (my $user = $ppl->next) {
$c++;
print "$user->{_id} $c\n" if ( $user->{'_id'} =~/4...6|4...5/);
}
This is not possible. There is no information in an ObjectID that you can reliably use to know how many older documents are in the same collection. The "inc" part of the ObjectId comes close but exact values depend on driver implementation (and can even be random) and would require all writes to come from the same machine to a mongod that's managing a single collection.
TL;DR : No

How can I map UIDs to user names using Perl library functions?

I'm looking for a way of mapping a uid (unique number representing a system user) to a user name using Perl.
Please don't suggest greping /etc/passwd :)
Edit
As a clarification, I wasn't looking for a solution that involved reading /etc/passwd explicitly. I realize that under the hood any solution would end up doing this, but I was searching for a library function to do it for me.
The standard function getpwuid, just like the same C function, gets user information based on its ID. No uses needed:
my ($name) = getpwuid(1000);
print $name,"\n";
Although it eventually reads /etc/passwd file, using standard interfaces is much more clear for other users to see, let alone it saves you some keystrokes.
Read /etc/passwd and hash the UID to login name.
Edit:
$uid = getpwnam($name);
$name = getpwuid($num);
$name = getpwent();
$gid = getgrnam($name);
$name = getgrgid($num);
$name = getgrent();
As you can see, regardless of which one you pick, the system call reads from /etc/passwd (see this for reference)
Actually I would suggest building a hash based on /etc/passwd :-) This should work well as the user ids are required to be unique.