security code permutations; security methodology - perl

I'm writing a Perl email subscription management app, based on a url containing two keycode parameters. At the time of subscription, a script will create two keycodes for each subscriber that are unique in the database (see below for script sample).
The codes will be created using Digest::SHA qw(sha256_hex). My understanding of it is that one way to ensure that codes are not duplicated in the database is to create a unique prefix in the raw data to be encoded. (see below, also).
Once a person is subscribed, I then have a database record of a person with two "code" fields, each containing values that are unique in the database. Each value is a string of alphanumeric characters that is 64 characters long, using lower case (only?) a-z and 0-9, e.g:
code1: ae7518b42b0514d69ae4e87d7d9f888ad268f4a398e7b88cbaf1dc2542858ba3
code2: 71723cf0aecd27c6bbf73ec5edfdc6ac912f648683470bd31debb1a4fbe429e8
These codes are sent in newsletter emails as parameters in a subscription management url. Thus, the person doesn't have to log in to manage their subscription; but simply click the url.
My question is:
If a subscriber tried to guess the values of the pair of codes for another person, how many possible combinations would there be to not only guess code1 correctly, but also guess code2? I suppose, like the lottery, a person could get lucky and just guess both; but I want to understand the odds against that, and its impact on security.
If the combo is guessed, the person would gain access to the database; thus, I'm trying to determine the level of security this method provides, compared to a more normal method of a username and 8 character password (which generically speaking could be considered two key codes themselves, but much shorter than the 64 characters above.)
I also welcome any feedback about the overall security of this method. I've noticed that many, many email newsletters seem to use similar keycodes, and don't require logging in to unsubscribe, etc. To, the primary issue (besides ease of use) is that a person should not be able to unsubscribe someone else.
Thanks!
Peter (see below for the code generation snippet)
Note that each ID and email would be unique.
The password is a 'system' password, and would be the same for each person.
#
#!/usr/bin/perl
use Digest::SHA qw(sha256_hex);
$clear = `clear`;
print $clear;
srand;
$id = 1;
$email = 'someone#domain.com';
$tag = ':!:';
$password = 'z9.4!l3tv+qe.p9#';
$rand_str = '9' x 15;
$rand_num = int(rand( $rand_str ));
$time = time() * $id;
$key_data = $id . $tag . $password . $rand_num . $time;
$key_code = sha256_hex($key_data);
$email_data = $email . $tag . $password . $time . $rand_num;
$email_code = sha256_hex($email_data);
print qq~
ID: $id
EMAIL: $email
KEY_DATA: $key_data
KEY_CODE: $key_code
EMAIL_DATA: $email_data
EMAIL_CODE: $email_code
~;
exit;
#

This seems like a lot of complexity to guard against a 3rd party unsubscribing someone. Why not generate a random code for each user, and store it in the database alongside the username? The method you are using creates a long string of digits, but there isn't actually much randomness in it. SHA is a deterministic algorithm that thoroughly scrambles bits, but it doesn't add entropy.
For an N bit truly random number, an attacker will only have a 1/(2^N) chance of guessing it right each time. Even with a small amount of entropy, say, 64 bits, your server should be throttling unsubscribe requests from the attacking IP address long before the attacker gets significant odds of succeeding. They'd have better luck guessing the user's email password, or intercepting the unencrypted email in transit.
That is why the unsubscribe codes are usually short. There's no need for a long code, and a long URL is more likely to be truncated or mistyped.

If you're asking how difficult it would be to "guess" two 256-bit "numbers", getting the one specific person you want to hack, that'd be 2^512:1 against. If there are, say, 1000 users in the database, and the attacker doesn't care which one s/he gets, that's 2^512:1000 against - not a significant change in likelihood.
However, it's much simpler than that if your attacker is either in control of (hacked in is close enough) one of the mail servers from your machine to the user's machine, or in control of any of the routers along the way, since your email goes out in plain text. A well-timed hacker who saw the email packet go through would be able to see the URL you've embedded no matter how many bits it is.
As with many security issues, it's a matter of how much effort to put in vs the payoff. Passwords are nice in that users expect them, so it's not a significant barrier to send out URLs that then need a password to enter. If your URL were even just one SHA key combined with the password challenge, this would nearly eliminate a man-in-the-middle attack on your emails. Up to you whether that's worth it. Cheap, convenient, secure. Pick one. :-)
More effort would be to gpg-encrypt your email with the client's public key (not your private one). The obvious downside is that gpg (or pgp) is apparently so little used that average users are unlikely to have it set up. Again, this would entirely eliminate MITM attacks, and wouldn't need a password, as it basically uses the client-side gpg private key password.

You've essentially got 1e15 different possible hashes generated for a given user email id (once combined with other information that could be guessed). You might as well just supply a hex-encoded random number of the same length and require the 'unsubscribe' link to include the email address or user id to be unsubscribed.
I doubt anyone would go to the lengths required to guess a number from 1 to 1e15, especially if you rate limit unsubscribe requests, and send a 'thanks, someone unsubscribed you' email if anyone is unsubscribed, and put a new subsubscription link into that.
A quick way to generate the random string is:
my $hex = join '', map { unpack 'H*', chr(rand(256)) } 1..8;
print $hex, "\n";
b4d4bfb26fddf220
(This gives you 2^64, or about 2*10^19 combinations. Or 'plenty' if you rate limit.)

Related

How can I be sure an email address is unique?

There's a pub in my town whereby, if you sign up to their newsletter using their website and provide a "unique" email address, you get a free drink. On a whim, I decided to sign up a second time using myemail+one#gmail.com. It let me. I'm now sitting on a nice comfy pile of free drink vouchers.
This got me thinking about a system we have here, where the email address is considered the unique identifier. Checking the code, sure enough, if we were offering vouchers in our business, someone else would be sitting pretty.
The basic, stab-in-the-dark, fix is to check for the "+" character and ignore everything after it (up to the #), and compare using that. But I am unsure if this was the intent for the + character. Would that work?
Secondly, are there any other caveats that would allow a user to sign up multiple times with a seemingly different email address, but which actually would always end up in the same mailbox?
This question is language-agnostic.
While using a plus sign as an e-mail address alias is a known feature of gmail, other mailers do either not allow it or use a minus sign instead. '+' is a legitimate character to be used as part of an email address according to the RFC.
The use of '.' is also a gray area. john.doe#gmail.com and johndoe#gmail.com send also both to the same email address and look different.
In order to validate the uniqueness of an email address you will have to prepare a rule base for your application, keep it up to date and still expect surprises...

Encoding a email address that can be used as part of a URL in codeigniter

Is there a way to encode a email address that can be used as a part of a url in codeigniter?. I need to decode back the email address from the url.
What I am trying to do is just a -forgotten password recovery- thing. I send a confirmation link to the user's email address, the link needs to be like ../encodedEmail/forgottenPasswordCode (with the forgottenPasswordCode updated in the db for the user with the submitted email).
When the user visits that link, I decode the email(if the email - forgottenPasswordCode pair is in the table), i allow them to reset their password (and i reset forgottenPasswordCode back to null).
I could just do a loop -checking the table with a select query- (or) -set that forgottenPasswordCode column unique, so i keep generating on a insert failure(would that be a lot faster ?)- until I generate a forgottenPasswordCode that doesn't already exist in the table.
But the guy I do this for would not accept it this way:). He wants the checking be done with the user's email, he thinks its much faster.
I am working with codeigniter, I used its encode() function, it seems to produce characters like '-slashes-' at times that breaks the encoded-email-string.
Any other ideas?
try using bin2hex() and hex2bin() function,
<?php
function hex2bin($str)
{
$bin = "";
$i = 0;
do
{
$bin .= chr(hexdec($str{$i}.$str{($i + 1)}));
$i += 2;
} while ($i < strlen($str));
return $bin;
}
$str = 'email#website.com';
$output = bin2hex($str);
echo $output . '<br/>';
echo hex2bin($output);
?>
Don't put data in the URL that doesn't have some sort of meaning. This leaves two choices:
Send the address as part of a POST. If it's coming from a web form this is the way to go.
Refer to the address in the database using an ID or hashed value. If you need the user to click a link referring to their account, use something that clearly refers to their account. If you need to refer to an instance of a password reset (many systems do this), add a table containing hashes, using that hash in the URL.
Why not just encode it in the URL?
You can see URLs (it's part of the UI), encoded things look weird
URLs represent resources, things in your app (users probably already have IDs)
Encoded email addresses are long (making these URLs harder to work with in things like emails)
Try to keep parameters in URLs to clear references to concepts in your web app (point at one user by ID or plaintext name, for example). Parameters that don't fit in URLs go in POST parameters. If you must use something encoded in a URL, prefer one-way-encoding and database lookups.
Although it may be not optimal design solution to use email as a part URL,
use email as base64 encoded string to avoid any issues with special chars
E.g. Base64 encoded string 'abc-def#example.com' is
YWJjLWRlZkBleGFtcGxlLmNvbQ==
In your case the URL is
../YWJjLWRlZkBleGFtcGxlLmNvbQ==/forgottenPasswordCode
All you need is to decode that string back before usage

Safe non-tamperable URL component in Perl using symmetric encryption?

OK, I'm probably just having a bad Monday, but I have the following need and I'm seeing lots of partial solutions but I'm sure I'm not the first person to need this, so I'm wondering if I'm missing the obvious.
$client has 50 to 500 bytes worth of binary data that must be inserted into the middle of a URL and roundtrip to their customer's browser. Since it's part of the URL, we're up against the 1K "theoretical" limit of a GET URL. Also, $client doesn't want their customer decoding the data, or tampering with it without detection. $client would also prefer not to store anything server-side, so this must be completely standalone. Must be Perl code, and fast, in both encoding and decoding.
I think the last step can be base64. But what are the steps for encryption and hashing that make the most sense?
I have some code in a Cat App that uses Crypt::Util to encode/decode a user's email address for an email verification link.
I set up a Crypt::Util model using Catalyst::Model::Adaptor with a secret key. Then in my Controller I have the following logic on the sending side:
my $cu = $c->model('CryptUtil');
my $token = $cu->encode_string_uri_base64( $cu->encode_string( $user->email ) );
my $url = $c->uri_for( $self->action_for('verify'), $token );
I send this link to the $user->email and when it is clicked on I use the following.
my $cu = $c->model('CryptUtil');
if ( my $id = $cu->decode_string( $cu->decode_string_uri_base64($token) ) ) {
# handle valid link
} else {
# invalid link
}
This is basically what edanite just suggested in another answer. You'll just need to make sure whatever data you use to form the token with that the final $url doesn't exceed your arbitrary limit.
Create a secret key and store it on the server. If there are multiple servers and requests aren't guaranteed to come back to the same server; you'll need to use the same key on every server. This key should be rotated periodically.
If you encrypt the data in CBC (Cipher Block Chaining) mode (See the Crypt::CBC module), the overhead of encryption is at most two blocks (one for the IV and one for padding). 128 bit (i.e. 16 byte) blocks are common, but not universal. I recommend using AES (aka Rijndael) as the block cipher.
You need to authenticate the data to ensure it hasn't been modified. Depending on the security of the application, just hashing the message and including the hash in the plaintext that you encrypt may be good enough. This depends on attackers being unable to change the hash to match the message without knowing the symmetric encryption key. If you're using 128-bit keys for the cipher, use a 256-bit hash like SHA-256 (you can use the Digest module for this). You may also want to include some other things like a timestamp in the data to prevent the request from being repeated multiple times.
I see three steps here. First, try compressing the data. With so little data bzip2 might save you maybe 5-20%. I'd throw in a guard to make sure it doesn't make the data larger. This step may not be worth while.
use Compress::Bzip2 qw(:utilities);
$data = memBzip $data;
You could also try reducing the length of any keys and values in the data manually. For example, first_name could be reduced to fname.
Second, encrypt it. Pick your favorite cipher and use Crypt::CBC. Here I use Rijndael because its good enough for the NSA. You'll want to do benchmarking to find the best balance between performance and security.
use Crypt::CBC;
my $key = "SUPER SEKRET";
my $cipher = Crypt::CBC->new($key, 'Rijndael');
my $encrypted_data = $cipher->encrypt($data);
You'll have to store the key on the server. Putting it in a protected file should be sufficient, securing that file is left as an exercise. When you say you can't store anything on the server I presume this doesn't include the key.
Finally, Base 64 encode it. I would use the modified URL-safe base 64 which uses - and _ instead of + and / saving you from having to spend space URL encoding these characters in the base 64 string. MIME::Base64::URLSafe covers that.
use MIME::Base64::URLSafe;
my $safe_data = urlsafe_b64encode($encrypted_data);
Then stick it onto the URL however you want. Reverse the process for reading it in.
You should be safe on size. Encrypting will increase the size of the data, but probably by less than 25%. Base 64 will increase the size of the data by a third (encoding as 2^6 instead of 2^8). This should leave encoding 500 bytes comfortably inside 1K.
How secure does it need to be? Could you just xor the data with a long random string then add an MD5 hash of the whole lot with another secret salt to detect tampering?
I wouldn't use that for banking data, but it'd probably be fine for most web things...
big

How to make an email bot that replies to users not reply to auto-responses and get itself into mail loops

I have a bot that replies to users. But sometimes when my bot sends its reply, the user or their email provider will auto-respond (vacation message, bounce message, error from mailer-daemon, etc). That is then a new message from the user (so my bot thinks) that it in turn replies to. Mail loop!
I'd like my bot to only reply to real emails from real humans. I'm currently filtering out email that admits to being bulk precedence or from a mailing list or has the Auto-Submitted header equal to "auto-replied" or "auto-generated" (see code below). But I imagine there's a more comprehensive or standard way to deal with this. (I'm happy to see solutions in other languages besides Perl.)
NB: Remember to have your own bot declare that it is autoresponding! Include
Auto-Submitted: auto-reply
in the header of your bot's email.
My original code for avoiding mail loops follows. Only reply if realmail returns true.
sub realmail {
my($email) = #_;
$email =~ /\nSubject\:\s*([^\n]*)\n/s;
my $subject = $1;
$email =~ /\nPrecedence\:\s*([^\n]*)\n/s;
my $precedence = $1;
$email =~ /\nAuto-Submitted\:\s*([^\n]*)\n/s;
my $autosub = $1;
return !($precedence =~ /bulk|list|junk/i ||
$autosub =~ /(auto\-replied|auto\-generated)/i ||
$subject =~ /^undelivered mail returned to sender$/i
);
}
(The Subject check is surely unnecessary; I just added these checks one at a time as problems arose and the above now seems to work so I don't want to touch it unless there's something definitively better.)
RFC 3834 provides some guidance for what you should do, but here are some concrete guidelines:
Set your envelope sender to a different email address than your auto-responder so bounces don't feed back into the system.
I always store in a database a key of when an email response was sent from a specific address to another address. Under no circumstance will I ever respond to the same address more than once in a 10 minute period. This alone stopped all loops, but doesn't ensure nice behavior (auto-responses to mailing lists are annoying).
Make sure you add any permutation of header that other people are matching on to stop loops. Here's the list I use:
X-Loop: autoresponder
Auto-Submitted: auto-replied
Precedence: bulk (autoreply)
Here are some header regex's I use to avoid loops and to try to play nice:
/^precedence:\s+(?:bulk|list|junk)/i
/^X-(?:Loop|Mailing-List|BeenThere|Mailman)/i
/^List-/i
/^Auto-Submitted:/i
/^Resent-/i
I also avoid responding if any of these are the envelop senders:
if ($sender eq ""
|| $sender =~ /^(?:request|owner|admin|bounce|bounces)-|-(?:request|owner|admin|bounce|bounces)\#|^(?:mailer-daemon|postmaster|daemon|majordomo|ma
ilman|bounce)\#|(?:listserv|listsrv)/i) {
That really sounds like something that's probably available as a module from CPAN, but I didn't find anything clearly relevant in five minutes of searching. Mail::Lite::Mbox::Processor looks like it might do what you want:
Mail::Lite::Message::Matcher is a
framework for automated mail
processing. For example you have a
mail server and you have a need to
process some types of incoming mail
messages automatically. For example,
you can extract automated
notifications, invoices, alerts etc.
from your mail flow and perform some
tasks based on content of those
messages.
but its docs are sparse enough that it isn't immediately obvious whether it provides those example functions itself or if you have to provide the code to drive them.
In any case, though, if you haven't already checked CPAN, that's where I would start if I wanted to do something like this.
My answer here only deals with bounces which is more straightforward.
Using DSN (Delivery Status Notification) identifier will help you detect a DSN/bounced message. It should go to Return-Path and not Reply-To.
Here's a sample of a typical DSN message. The header information includes the message id, content type has specific values (delivery-status) etc.
Not able to provide you any codes in perl, just my 2 cents of idea.
PS: Do note that not all mail servers or MTA conforms to this, but I guess most do.
There should be a standard way of dealing with this, but the problem is that you'd have to assume that systems that send auto-replies comply to that standard, when most the time, they just don't.
How do you get the address that you reply to? I hope you aren't using the From: header. Check the Reply-to: header first and if that doesn't exist, use the Return-path:.
But whatever you do, you will simply have to keep a log of what you sent to whom and throttle your bot to some sensible value of messages per time.

Passing Perl Data Structures as Serialized GET Strings to a Perl CGI program

I want to pass a serialized Perl data structure as a GET variable to a CGI application. I tried Data::Serializer as my first option. Unfortunately the serialized string is too long for my comfort, in addition to containing options joined by '^' (a caret).
Is there a way I can create short encoded strings from perl data structures so that I can safely pass them as GET variables to a perl CGI application?
I would also appreciate being told that this (serialized, encoded strings) is a bad way to pass complex data structures to web applications and suggestions on how I could accomplish this
If you need to send URL's to your users that contains a few key datapoints and you want to ensure it can't be forged you can do this with a Digest (such as from Digest::SHA) and a shared secret. This lets you put the data out there in your messages without needing to keep a local database to track it all. My example doesn't include a time element, but that would be easy enough to add in if you want.
use Digest::SHA qw(sha1_base64);
my $base_url = 'http://example.com/thing.cgi';
my $email = 'guy#somewhere.com';
my $msg_id = '123411';
my $secret = 'mysecret';
my $data = join(":", $email, $msg_id, $secret);
my $digest = sha1_base64($data);
my $url = $base_url . '?email=' . $email . '&msg_id=' . $msg_id' . '&sign=' . $digest;
Then send it along.
In your "thing.cgi" script you just need to extract the parameters and see if the digest submitted in the script matches the one you locally regenerate (using $email and $msg_id, and of course your $secret). If they don't match, don't authorize them, if they do then you have a legitimately authorized request.
Footnote:
I wrote the "raw" methods into Data::Serializer to make translating between serializers much easier and that in fact does help with going between languages (to a point). But that of course is a separate discussion as you really shouldn't ever use a serializer for exchanging data on a web form.
One of the drawbacks of the approach — using a perl-specific serializer, that is — is that if you ever want to communicate between the client and server using something other than perl it will probably be more work than something like JSON or even XML would be. The size limitations of GET requests you've already run in to, but that's problematic for any encoding scheme.
It's more likely to be a problem for the next guy down the road who maintains this code than it is for you. I have a situation now where a developer who worked on a large system before I did decided to store several important bits of data as perl Storable objects. Not a horrible decision in and of itself, but it's making it more difficult than it should be to access the data with tools that aren't written in perl.
Passing serialized encoded strings is a bad way to pass complex data structures to web applications.
If you are trying to pass state from page to page, you can use server side sessions which would only require you to pass around a session key.
If you need to email a link to someone, you can still create a server-side session with a reasonable expiry time (you'll also need to decide if additional authentication is necessary) and then send the session id in the link. You can/should expire the session immediately once the requested action is taken.