Including email in restful URIs - rest

In my application I have URIs that include the userid to identify user private resources. And userids are emails, for example:
/users/user2#example.com/private-resource
It's a good practice to put email in the URI, including characters like . and #?
Or should I use some other type of userid? Like an hash for example?

If the email can function as fixed identifier it would be ok.
The thing is that most of the time applications will allow users to change emails; in this case it would be more bullet proof to control the ID-space yourself, e.g. by using a surrogate key. (Because the email (if users can change them) is not the identity but merely a property of the resource).
Another argument against emails is - as #Rob points out - a potential security issue.

There's no problem with that... just make sure you URL-encode the appropriate characters.
(Email's kind of sensitive information, though, you might want to pass a hash or surrogate key instead just to protect your users' email addresses if the URLs get passed around.)

though having '-' in uri is absolutely fine.
use of '#' , '#' are discouraged
however, as long as email id is an id e.g. http://example.org/user/mail/{email}/1234
guess thats fine.

Related

How can I be sure an email address is unique?

There's a pub in my town whereby, if you sign up to their newsletter using their website and provide a "unique" email address, you get a free drink. On a whim, I decided to sign up a second time using myemail+one#gmail.com. It let me. I'm now sitting on a nice comfy pile of free drink vouchers.
This got me thinking about a system we have here, where the email address is considered the unique identifier. Checking the code, sure enough, if we were offering vouchers in our business, someone else would be sitting pretty.
The basic, stab-in-the-dark, fix is to check for the "+" character and ignore everything after it (up to the #), and compare using that. But I am unsure if this was the intent for the + character. Would that work?
Secondly, are there any other caveats that would allow a user to sign up multiple times with a seemingly different email address, but which actually would always end up in the same mailbox?
This question is language-agnostic.
While using a plus sign as an e-mail address alias is a known feature of gmail, other mailers do either not allow it or use a minus sign instead. '+' is a legitimate character to be used as part of an email address according to the RFC.
The use of '.' is also a gray area. john.doe#gmail.com and johndoe#gmail.com send also both to the same email address and look different.
In order to validate the uniqueness of an email address you will have to prepare a rule base for your application, keep it up to date and still expect surprises...

Create URL based on entered data in iOS

If user entered say google then
i need to add http://www.google.com ie missing part.
User may enter any thing say google.in or www.google or anything.
Now goal to complete the left over url as we check url using regex like this:
NSString *urlRegEx = #"(http|https)://((\\w)*|([0-9]*)|([-|_])*)+([\\.|/]((\\w)*|([0-9]*)|([-|_])*))+";
That given url is valid or not
Don't forget the third level domain names with suffixes like .co.uk, .co.us, .com.co, etc.
A fully qualified domain name must have at least one dot. If it doesn't have at least one dot, then you might add .com to the end.
If it does have at least one dot, then it gets more complicated. .google could be a top level domain in the future, though it isn't now. Perhaps you want to keep a white list of all "valid" first and second level domain names. You evaluate the entered domain name from the right until it stops matching domains from your list. The remainder is the "registered" domain name and any sub domains. If you don't find any matches, then add .com.
Alternatively, rather than parsing the domain name, you could just try to resolve it, and if it doesn't resolve, then add .com and try again.
I think I know what you're asking however other than the regex how are you actually validating that the URL is valid? It seems as though you're making an incorrect assumption that all URLs follow a common syntax. As as example, http://www.www.extra-www.org/ is a valid URL so if you apply your regex (as I understand your intent) the user would get to http://www.extra-www.org which may not be the same site as the one the user wanted (even though in this case it is because it forwards). Another example is http://www.www.com... if the user enters "www" your regex will kill it. A final example is if a site doesn't have its DNS registered WITH the "www" - your regex will incorrectly add the "www" piece.
EDIT: what happens if the user needs https as opposed to http and only enters "google"?
Haven't tried this one myself yet, but an NSDataDetector for the type NSTextCheckingTypeLink should be able to do you a decent job.

security code permutations; security methodology

I'm writing a Perl email subscription management app, based on a url containing two keycode parameters. At the time of subscription, a script will create two keycodes for each subscriber that are unique in the database (see below for script sample).
The codes will be created using Digest::SHA qw(sha256_hex). My understanding of it is that one way to ensure that codes are not duplicated in the database is to create a unique prefix in the raw data to be encoded. (see below, also).
Once a person is subscribed, I then have a database record of a person with two "code" fields, each containing values that are unique in the database. Each value is a string of alphanumeric characters that is 64 characters long, using lower case (only?) a-z and 0-9, e.g:
code1: ae7518b42b0514d69ae4e87d7d9f888ad268f4a398e7b88cbaf1dc2542858ba3
code2: 71723cf0aecd27c6bbf73ec5edfdc6ac912f648683470bd31debb1a4fbe429e8
These codes are sent in newsletter emails as parameters in a subscription management url. Thus, the person doesn't have to log in to manage their subscription; but simply click the url.
My question is:
If a subscriber tried to guess the values of the pair of codes for another person, how many possible combinations would there be to not only guess code1 correctly, but also guess code2? I suppose, like the lottery, a person could get lucky and just guess both; but I want to understand the odds against that, and its impact on security.
If the combo is guessed, the person would gain access to the database; thus, I'm trying to determine the level of security this method provides, compared to a more normal method of a username and 8 character password (which generically speaking could be considered two key codes themselves, but much shorter than the 64 characters above.)
I also welcome any feedback about the overall security of this method. I've noticed that many, many email newsletters seem to use similar keycodes, and don't require logging in to unsubscribe, etc. To, the primary issue (besides ease of use) is that a person should not be able to unsubscribe someone else.
Thanks!
Peter (see below for the code generation snippet)
Note that each ID and email would be unique.
The password is a 'system' password, and would be the same for each person.
#
#!/usr/bin/perl
use Digest::SHA qw(sha256_hex);
$clear = `clear`;
print $clear;
srand;
$id = 1;
$email = 'someone#domain.com';
$tag = ':!:';
$password = 'z9.4!l3tv+qe.p9#';
$rand_str = '9' x 15;
$rand_num = int(rand( $rand_str ));
$time = time() * $id;
$key_data = $id . $tag . $password . $rand_num . $time;
$key_code = sha256_hex($key_data);
$email_data = $email . $tag . $password . $time . $rand_num;
$email_code = sha256_hex($email_data);
print qq~
ID: $id
EMAIL: $email
KEY_DATA: $key_data
KEY_CODE: $key_code
EMAIL_DATA: $email_data
EMAIL_CODE: $email_code
~;
exit;
#
This seems like a lot of complexity to guard against a 3rd party unsubscribing someone. Why not generate a random code for each user, and store it in the database alongside the username? The method you are using creates a long string of digits, but there isn't actually much randomness in it. SHA is a deterministic algorithm that thoroughly scrambles bits, but it doesn't add entropy.
For an N bit truly random number, an attacker will only have a 1/(2^N) chance of guessing it right each time. Even with a small amount of entropy, say, 64 bits, your server should be throttling unsubscribe requests from the attacking IP address long before the attacker gets significant odds of succeeding. They'd have better luck guessing the user's email password, or intercepting the unencrypted email in transit.
That is why the unsubscribe codes are usually short. There's no need for a long code, and a long URL is more likely to be truncated or mistyped.
If you're asking how difficult it would be to "guess" two 256-bit "numbers", getting the one specific person you want to hack, that'd be 2^512:1 against. If there are, say, 1000 users in the database, and the attacker doesn't care which one s/he gets, that's 2^512:1000 against - not a significant change in likelihood.
However, it's much simpler than that if your attacker is either in control of (hacked in is close enough) one of the mail servers from your machine to the user's machine, or in control of any of the routers along the way, since your email goes out in plain text. A well-timed hacker who saw the email packet go through would be able to see the URL you've embedded no matter how many bits it is.
As with many security issues, it's a matter of how much effort to put in vs the payoff. Passwords are nice in that users expect them, so it's not a significant barrier to send out URLs that then need a password to enter. If your URL were even just one SHA key combined with the password challenge, this would nearly eliminate a man-in-the-middle attack on your emails. Up to you whether that's worth it. Cheap, convenient, secure. Pick one. :-)
More effort would be to gpg-encrypt your email with the client's public key (not your private one). The obvious downside is that gpg (or pgp) is apparently so little used that average users are unlikely to have it set up. Again, this would entirely eliminate MITM attacks, and wouldn't need a password, as it basically uses the client-side gpg private key password.
You've essentially got 1e15 different possible hashes generated for a given user email id (once combined with other information that could be guessed). You might as well just supply a hex-encoded random number of the same length and require the 'unsubscribe' link to include the email address or user id to be unsubscribed.
I doubt anyone would go to the lengths required to guess a number from 1 to 1e15, especially if you rate limit unsubscribe requests, and send a 'thanks, someone unsubscribed you' email if anyone is unsubscribed, and put a new subsubscription link into that.
A quick way to generate the random string is:
my $hex = join '', map { unpack 'H*', chr(rand(256)) } 1..8;
print $hex, "\n";
b4d4bfb26fddf220
(This gives you 2^64, or about 2*10^19 combinations. Or 'plenty' if you rate limit.)

Is it possible to include comments inside a non email host name?

I am working on a more complete email validator in java and came across an interesting ability to embed comments within an email both in the "username" and "address" portions.
The following snippet from http://www.dominicsayers.com/isemail/ has this to say about comments within an email.
Comments are text surrounded by parentheses (like these). These are OK but don't form part of the address. In other words mail sent to first.last#example.com will go to the same place as first(a).last(b)#example(c).com(d). Strange but true.
Do emails like this really exist ?
Is it possible to form hosts like this ?
I tried entering an url such as "google(ignore).com" but firefox and some other browsers failed and i was wondering is this because its it wrong or is it because they dont know about host name comments ?
That syntax -- comments within an addr-spec -- was indeed permissible by the original email RFC, RFC 822. However, the placement of comments like you'd like to use them was deprecated when that RFC was revised by RFC 2822... 10 years ago. It's still marked as obsolete in the current version, RFC 5322. There's no good excuse for emitting anything using that syntax.
Address parsers are supposed to be backwards-compatible in order to cover all conceivable cases, including 10-years-deprecated bits like the one you're trying to take advantage of here. But I'll bet that many, many receiving mail agents will fail to properly parse out those comments. So even though you may have technically found a loophole via the "obsolete addressing" section of the RFC, it's not likely to do you much good in practice.
As for HTTP, the syntax rules aren't the same as email syntax rules. As you're seeing, the comment section from RFC 822 isn't applicable.
Just because you can do it in the spec, doesn't mean that you should. For example, Gmail will not accept that comment format for address.
Second, (to your last point), paren-comments being allowed in email addresses doesn't mean that they work for URLs.
Finally, my advice: I'd tailor the completeness of your validator to your requirements. If you're writing a new MTA (mail transfer agent), you'll probably have to do it all. If you're writing a validator for a user input, keep it simple:
look for one #,
make sure you have stuff before (username) and after (domain name),
make sure you have a "dot" in the hostname string,
[extra credit] do a DNS lookup of the hostname to make sure it resolves.

Encoding a email address that can be used as part of a URL in codeigniter

Is there a way to encode a email address that can be used as a part of a url in codeigniter?. I need to decode back the email address from the url.
What I am trying to do is just a -forgotten password recovery- thing. I send a confirmation link to the user's email address, the link needs to be like ../encodedEmail/forgottenPasswordCode (with the forgottenPasswordCode updated in the db for the user with the submitted email).
When the user visits that link, I decode the email(if the email - forgottenPasswordCode pair is in the table), i allow them to reset their password (and i reset forgottenPasswordCode back to null).
I could just do a loop -checking the table with a select query- (or) -set that forgottenPasswordCode column unique, so i keep generating on a insert failure(would that be a lot faster ?)- until I generate a forgottenPasswordCode that doesn't already exist in the table.
But the guy I do this for would not accept it this way:). He wants the checking be done with the user's email, he thinks its much faster.
I am working with codeigniter, I used its encode() function, it seems to produce characters like '-slashes-' at times that breaks the encoded-email-string.
Any other ideas?
try using bin2hex() and hex2bin() function,
<?php
function hex2bin($str)
{
$bin = "";
$i = 0;
do
{
$bin .= chr(hexdec($str{$i}.$str{($i + 1)}));
$i += 2;
} while ($i < strlen($str));
return $bin;
}
$str = 'email#website.com';
$output = bin2hex($str);
echo $output . '<br/>';
echo hex2bin($output);
?>
Don't put data in the URL that doesn't have some sort of meaning. This leaves two choices:
Send the address as part of a POST. If it's coming from a web form this is the way to go.
Refer to the address in the database using an ID or hashed value. If you need the user to click a link referring to their account, use something that clearly refers to their account. If you need to refer to an instance of a password reset (many systems do this), add a table containing hashes, using that hash in the URL.
Why not just encode it in the URL?
You can see URLs (it's part of the UI), encoded things look weird
URLs represent resources, things in your app (users probably already have IDs)
Encoded email addresses are long (making these URLs harder to work with in things like emails)
Try to keep parameters in URLs to clear references to concepts in your web app (point at one user by ID or plaintext name, for example). Parameters that don't fit in URLs go in POST parameters. If you must use something encoded in a URL, prefer one-way-encoding and database lookups.
Although it may be not optimal design solution to use email as a part URL,
use email as base64 encoded string to avoid any issues with special chars
E.g. Base64 encoded string 'abc-def#example.com' is
YWJjLWRlZkBleGFtcGxlLmNvbQ==
In your case the URL is
../YWJjLWRlZkBleGFtcGxlLmNvbQ==/forgottenPasswordCode
All you need is to decode that string back before usage