Play! hash password returns bad result

I'm using Play 1.2.1 and want to hash my users' passwords. I thought Crypto.passwordHash would do the job, but it doesn't: the passwordHash documentation says it returns an MD5 password hash. I created some user accounts in a fixture, where I put an MD5 password hash:
...
User(admin):
login: admin
password: f1682b54de57d202ba947a0af26399fd
fullName: Administrator
...
The problem is that when I try to log in with something like this:
user.password.equals(Crypto.passwordHash(password))
it doesn't work. So I put a log statement in my authentify method:
Logger.info("\nUser hashed password is %s " +
"\nPassed password is %s " +
"\nHashed passed password is %s",
user.password, password, Crypto.passwordHash(password));
And the password hashes are indeed different, but hey! The output of the passwordHash method isn't even an MD5 hash:
15:02:16,164 INFO ~
User hashed password is f1682b54de57d202ba947a0af26399fd
Passed password is <you don't have to know this :P>
Hashed passed password is 8WgrVN5X0gK6lHoK8mOZ/Q==
How about that? How can I fix it? Or do I have to implement my own solution?

Crypto.passwordHash returns a Base64-encoded password hash, while you are comparing it to a hex-encoded one.

MD5 outputs a sequence of 16 bytes, each byte having (potentially) any value between 0 and 255 (inclusive). When you want to print the value, you need to convert the bytes to a sequence of "printable characters". There are several possible conventions, the two main ones being hexadecimal and Base64.
In hexadecimal notation, each byte value is represented as two "hexadecimal digits": such a digit is either a decimal digit ('0' to '9') or a letter (from 'a' to 'f', case is irrelevant). The 16 bytes thus become 32 characters.
In Base64 encoding, each group of three successive bytes is encoded as four characters, taken from a list of 64 possible characters (digits, lowercase letters, uppercase letters, '+' and '/'). One or two final '=' signs may be added so that the encoded string consists of a number of characters that is a multiple of 4.
Here, '8WgrVN5X0gK6lHoK8mOZ/Q==' is the Base64 encoding of a sequence of 16 bytes, the first one having value 241, the second one 104, then 43, and so on. In hexadecimal notation, the first byte would be represented by 'f1', the second by '68', the third by '2b'... and the hexadecimal notation of the complete sequence of 16 bytes is then 'f1682b54de57d202ba947a0af26399fd', the value that you expected.
The play.libs.Codec class contains methods for decoding and encoding Base64 and hexadecimal notations. It also contains Codec.hexMD5() which performs MD5 hashing and returns the value in hexadecimal notation instead of Base64.
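For instance, a minimal sketch of the fixed comparison (assuming the Play 1.x Codec helpers decodeBASE64 and byteToHexString exist alongside hexMD5, per the description above):
import play.libs.Codec;
import play.libs.Crypto;

public class PasswordCheck {
    // Option 1: Codec.hexMD5 hashes and hex-encodes in one step,
    // matching the hex digests stored in the fixture.
    static boolean check(String storedHex, String candidate) {
        return storedHex.equals(Codec.hexMD5(candidate));
    }
    // Option 2: keep Crypto.passwordHash, but re-encode its Base64
    // output as hex before comparing.
    static boolean checkViaRecode(String storedHex, String candidate) {
        byte[] digest = Codec.decodeBASE64(Crypto.passwordHash(candidate));
        return storedHex.equals(Codec.byteToHexString(digest));
    }
}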

As Nickolay said, you are comparing hex vs. Base64 strings. Also, I would recommend using BCrypt for this rather than Play's Crypto tool.
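A sketch with the standalone jBCrypt library (org.mindrot:jbcrypt — an extra dependency assumed here, not something Play's Crypto provides):
import org.mindrot.jbcrypt.BCrypt;

public class BCryptSketch {
    public static void main(String[] args) {
        // gensalt() picks a random salt, which is embedded in the hash
        // string itself, so only the hash needs to be stored.
        String hashed = BCrypt.hashpw("s3cret", BCrypt.gensalt());
        // checkpw() re-hashes the candidate with the embedded salt.
        System.out.println(BCrypt.checkpw("s3cret", hashed)); // true
        System.out.println(BCrypt.checkpw("nope", hashed));   // false
    }
}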


Base64 SHA-512 hash not working as intended

Hello, I'm trying to get the Base64-encoded value of a SHA-512 hash. I want my output to match the output from this site, but I can't seem to reproduce it when I go step by step. For example:
The string admin gives x61Ey612Kl2gpFL56FT9weDnpSo4AV8j8+qx2AuTHdRyY036xxzTTrw10Wq3+4qQyB+XURPWx1ONxp3Y3pB37A== when I use the site above.
When I try it step by step, I use a SHA-512 hash generator on admin, which results in
and then I use a Base64 encoder on that which gives me QzdBRDQ0Q0JBRDc2MkE1REEwQTQ1MkY5RTg1NEZEQzFFMEU3QTUyQTM4MDE1RjIzRjNFQUIxRDgwQjkzMURENDcyNjM0REZBQzcxQ0QzNEVCQzM1RDE2QUI3RkI4QTkwQzgxRjk3NTExM0Q2Qzc1MzhEQzY5REQ4REU5MDc3RUM=
which is different. How do I obtain the first output above?
There are two different transformations in play here: the SHA-512 hash of an input, and the Base64 encoding of an input. They can be combined or used alone.
C7AD44CBAD762A5DA0A452F9E854FDC1E0E7A52A38015F23F3EAB1D80B931DD472634DFAC71CD34EBC35D16AB7FB8A90C81F975113D6C7538DC69DD8DE9077EC is the SHA-512 hash of the text admin represented in uppercase hexadecimal.
QzdBRDQ0Q0JBRDc2MkE1REEwQTQ1MkY5RTg1NEZEQzFFMEU3QTUyQTM4MDE1RjIzRjNFQUIxRDgwQjkzMURENDcyNjM0REZBQzcxQ0QzNEVCQzM1RDE2QUI3RkI4QTkwQzgxRjk3NTExM0Q2Qzc1MzhEQzY5REQ4REU5MDc3RUM= is the SHA-512 hash of the text admin represented in uppercase hexadecimal and then encoded with Base64.
x61Ey612Kl2gpFL56FT9weDnpSo4AV8j8+qx2AuTHdRyY036xxzTTrw10Wq3+4qQyB+XURPWx1ONxp3Y3pB37A== is the SHA-512 hash of the text admin encoded with Base64. There was no intermediate transformation to hexadecimal.
In other words, x61Ey612Kl2gpFL56FT9weDnpSo4AV8j8+qx2AuTHdRyY036xxzTTrw10Wq3+4qQyB+XURPWx1ONxp3Y3pB37A== is the Base64 encoding of the hash output bytes, and QzdBRDQ0Q0JBRDc2MkE1REEwQTQ1MkY5RTg1NEZEQzFFMEU3QTUyQTM4MDE1RjIzRjNFQUIxRDgwQjkzMURENDcyNjM0REZBQzcxQ0QzNEVCQzM1RDE2QUI3RkI4QTkwQzgxRjk3NTExM0Q2Qzc1MzhEQzY5REQ4REU5MDc3RUM= is the Base64 encoding of the hash output text (in uppercase hexadecimal).
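Both results can be reproduced with the plain JDK (a sketch assuming Java 8+ for java.util.Base64):
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.Base64;

public class Sha512Base64 {
    public static void main(String[] args) throws Exception {
        byte[] digest = MessageDigest.getInstance("SHA-512")
                .digest("admin".getBytes(StandardCharsets.UTF_8));
        // Base64 of the raw 64 digest bytes -> x61Ey612Kl2g...pB37A==
        System.out.println(Base64.getEncoder().encodeToString(digest));
        // Hex text first, then Base64 of that text -> QzdBRDQ0...RUM=
        StringBuilder hex = new StringBuilder();
        for (byte b : digest) hex.append(String.format("%02X", b));
        System.out.println(Base64.getEncoder()
                .encodeToString(hex.toString().getBytes(StandardCharsets.US_ASCII)));
    }
}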

Are MD5 hashes always either capital or lowercase?

I'm passing an HMAC-MD5 encoded parameter into a form and the vendor is rejecting it as invalid. However, it matches what their hash generator gives me, except for the capitalization of the letters. What I did to get around this was use an LCase() call. I'm wondering if this will cause me trouble later. ColdFusion generates the hashed string in capital letters; the vendor always seems to use lowercase. Is it always one or the other, or will they ever be mixed?
MD5, like every other hash function, produces binary output; in the case of MD5, 16 bytes.
Because raw bytes are difficult to handle, they are encoded to a string. For MD5 they are usually encoded as 32 lowercase hexadecimal digits, so every byte is represented by 2 characters.
Whether the target system accepts upper- or lowercase encodings (or both) is up to the system; it is unrelated to the hash function, and both are different representations of the same MD5 hash. So, to answer your question: format the output as the target system requires.
While RFC 1321, The MD5 Message-Digest Algorithm, doesn't discuss hexadecimal string encoding, its test suite does show results in lowercase.
The MD5 test suite (driver option "-x") should print the following results:
MD5 test suite:
MD5 ("") = d41d8cd98f00b204e9800998ecf8427e
MD5 ("a") = 0cc175b9c0f1b6a831c399e269772661
MD5 ("abc") = 900150983cd24fb0d6963f7d28e17f72
MD5 ("message digest") = f96b697d7cb7938d525a2f31aaf161d0
MD5 ("abcdefghijklmnopqrstuvwxyz") = c3fcd3d76192e4007dfb496cca67e13b
MD5 ("ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789") =
d174ab98d277d9f5a5611c2c9f419d9f
MD5 ("123456789012345678901234567890123456789012345678901234567890123456
78901234567890") = 57edf4a22be3c955ac49da2e2107b67a
Lowercase is simply the outcome of the C printf() format specifier %02x, not a requirement: the RFC says "should print", not "must print".
Ref: RFC-1321 Appendix A.5 Test suite
A hex string can contain digits in the 0-9 range and letters in the a-f or A-F range, so you should anticipate both upper- and lowercase versions.
If you're really stuck trying to interface between two highly opinionated systems, force upper or lower case depending on your requirements.
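A minimal illustration of comparing digests across case conventions (Java here; the LCase() you already use plays the same role in ColdFusion):
public class HexCaseDemo {
    public static void main(String[] args) {
        String fromColdFusion = "D41D8CD98F00B204E9800998ECF8427E"; // MD5("")
        String fromVendor     = "d41d8cd98f00b204e9800998ecf8427e"; // same digest
        // Same 16 bytes, different case: compare case-insensitively...
        System.out.println(fromColdFusion.equalsIgnoreCase(fromVendor)); // true
        // ...or normalize once, at the boundary, to what the vendor expects.
        System.out.println(fromColdFusion.toLowerCase().equals(fromVendor)); // true
    }
}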

Correct SHA256 implementation with UTF-8 characters

I'm running into issues comparing SHA256 hashes generated by different languages/functions.
For example, SHA256("í") either returns:
f3df1f9c358ae8eceb8fce7c00614288d113ad55315f4ebb909774a7daadfc84
-or-
127035a8ff26256ea0541b5add6dcc3ecdaeea603e606f84e0fd63492fbab2c5
Which of the above hashes is correct for a string of one character, and what's the correct way of handling UTF-8 strings?
Which of the above hashes is correct for a string of one character
There is no "correct" answer. What's being hashed is the bytes, not the "character". What bytes are hashed exactly depends on the encoding of the string.
"í" in Windows-1252 is byte ED, which hashes as:
f3df1f9c358ae8eceb8fce7c00614288d113ad55315f4ebb909774a7daadfc84
"í" in UTF-8 is bytes C3 AD, which hashes as:
127035a8ff26256ea0541b5add6dcc3ecdaeea603e606f84e0fd63492fbab2c5
"í" in UTF-16LE is bytes ED 00, which hashes as:
430e2ca27910b5ee6e0ec56a12b81325c763376cb8e25a60362dce9444424f95
How exactly that works in various programming languages depends on the languages and the encodings they use for strings.
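For example, in Java the charset is chosen explicitly when turning the string into bytes (a sketch; windows-1252 is assumed to be available on the JVM, as it usually is):
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;

public class CharsetHashes {
    static String sha256Hex(byte[] input) throws Exception {
        StringBuilder sb = new StringBuilder();
        for (byte b : MessageDigest.getInstance("SHA-256").digest(input))
            sb.append(String.format("%02x", b));
        return sb.toString();
    }
    public static void main(String[] args) throws Exception {
        String s = "í";
        System.out.println(sha256Hex(s.getBytes(Charset.forName("windows-1252")))); // f3df1f9c...
        System.out.println(sha256Hex(s.getBytes(StandardCharsets.UTF_8)));          // 127035a8...
        System.out.println(sha256Hex(s.getBytes(StandardCharsets.UTF_16LE)));       // 430e2ca2...
    }
}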

Why does a base64 encoded string have an = sign at the end

I know what Base64 encoding is and how to calculate Base64 encoding in C#. However, I have seen several times that when I convert a string into Base64, there is an = at the end.
A few questions came up:
Does a base64 string always end with =?
Why does an = get appended at the end?
Q: Does a Base64 string always end with =?
A: No. (The word usb is Base64-encoded as dXNi.)
Q: Why does an = get appended at the end?
A: As a short answer:
The last character (the = sign) is added only as padding in the final step of encoding a message whose byte count is not a multiple of 3.
You will not have an = sign if your string has a multiple of 3 characters, because Base64 encoding takes each group of three bytes (a character = 1 byte) and represents it as four printable characters from the ASCII standard.
Example:
(a) If you want to encode
ABCDEFG <=> [ABC] [DEF] [G]
Base64 deals with the first block (producing 4 characters) and the second likewise, as they are complete. But for the third, it will add a double == to the output in order to complete the 4 needed characters. Thus, the result will be QUJD REVG Rw== (without spaces).
[ABC] => QUJD
[DEF] => REVG
[G] => Rw==
(b) If you want to encode ABCDEFGH <=> [ABC] [DEF] [GH], it will similarly add one = at the end of the output to get 4 characters.
The result will be QUJD REVG R0g= (without spaces).
[ABC] => QUJD
[DEF] => REVG
[GH] => R0g=
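The same worked examples, reproduced with the JDK's encoder (assuming Java 8+ for java.util.Base64):
import java.util.Base64;

public class PaddingDemo {
    public static void main(String[] args) {
        Base64.Encoder enc = Base64.getEncoder();
        System.out.println(enc.encodeToString("ABCDEFG".getBytes()));   // QUJDREVGRw==
        System.out.println(enc.encodeToString("ABCDEFGH".getBytes()));  // QUJDREVGR0g=
        System.out.println(enc.encodeToString("ABCDEFGHI".getBytes())); // QUJDREVGR0hJ (9 bytes, no padding)
    }
}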
It serves as padding.
A more complete answer is that a Base64-encoded string doesn't always end with =; it will end with one or two = signs only if they are required to pad the string out to the proper length.
From Wikipedia:
The final '==' sequence indicates that the last group contained only one byte, and '=' indicates that it contained two bytes.
Thus, this is some sort of padding.
It's defined in RFC 2045 as a special padding character, used when fewer than 24 bits are available at the end of the encoded data.
No.
To pad the Base64-encoded string to a multiple of 4 characters in length, so that it can be decoded correctly.
The equals sign (=) is used as padding in certain forms of base64 encoding. The Wikipedia article on base64 has all the details.
It's padding. From http://en.wikipedia.org/wiki/Base64:
In theory, the padding character is not needed for decoding, since the
number of missing bytes can be calculated from the number of Base64
digits. In some implementations, the padding character is mandatory,
while for others it is not used. One case in which padding characters
are required is concatenating multiple Base64 encoded files.
http://www.hcidata.info/base64.htm
Encoding "Mary had" to Base 64
In this example we are using a simple text string ("Mary had") but the principle holds no matter what the data is (e.g. graphics file). To convert each 24 bits of input data to 32 bits of output, Base 64 encoding splits the 24 bits into 4 chunks of 6 bits. The first problem we notice is that "Mary had" is not a multiple of 3 bytes - it is 8 bytes long. Because of this, the last group of bits is only 4 bits long. To remedy this we add two extra bits of '0' and remember this fact by putting a '=' at the end. If the text string to be converted to Base 64 was 7 bytes long, the last group would have had 2 bits. In this case we would have added four extra bits of '0' and remember this fact by putting '==' at the end.
= is a padding character. If the input stream's length is not a multiple of 3, the padding character is added. Some decoders require it: without the padding, they would misinterpret how many bits of the last group are significant.
Better and deeper explanation here: https://base64tool.com/detect-whether-provided-string-is-base64-or-not/
The equals sign or double equals serves as padding. It's a stupid concept defined in RFC 2045 and it is actually superfluous: any decent parser can encode and decode a Base64 string without knowing about padding, just by counting up the number of characters and filling in the rest if the size isn't divisible by 3 or 4, respectively. This actually leads to difficulties every now and then, because some parsers expect padding while others blatantly ignore it. My MPU's Base64 decoder, for example, needs padding, but it receives a non-padded Base64 string over the network. This led to erroneous parsing, and I had to account for it myself.
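To illustrate the mismatch: the JDK's basic decoder tolerates missing padding, and its encoder can be told to omit it for peers that reject '=' (assuming Java 8+ for java.util.Base64):
import java.util.Base64;

public class UnpaddedDemo {
    public static void main(String[] args) {
        // The basic decoder accepts both padded and unpadded input.
        byte[] padded   = Base64.getDecoder().decode("R0g=");
        byte[] unpadded = Base64.getDecoder().decode("R0g");
        System.out.println(new String(padded).equals(new String(unpadded))); // true
        // The encoder can omit padding for decoders that reject '='.
        System.out.println(Base64.getEncoder().withoutPadding()
                .encodeToString("GH".getBytes())); // R0g
    }
}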

Verifying salted hashes with Perl's unpack()

I'm trying to verify salted passwords with Perl and am stuck with unpack.
I've got a salted hashed password, e.g. for SHA256: SSHA256 = SHA256('password' + 'salt') + 'salt'
Base64 encoded, that becomes '{SSHA256}eje4XIkY6sGakInA+loqtNzj+QUo3N7sEIsj3fNge5lzYWx0'.
I store this string in my user database. When a user logs in, I need to separate the salt from the hash, hash the supplied password with the salt, and compare the result to the one retrieved from the DB. This is where I'm stuck: I don't seem to have the right unpack template to separate the hash (8-bit binary, fixed length, in this case 32 bytes) from the salt (8-bit binary, variable length).
I have tried something like
my ($hash, $salt) = unpack('N32 N*', $data);
but that doesn't seem to work out.
My question is: How can I unpack this hash (after it has been Base64 decoded) to get the fixed length hash in one and the variable length salt in another variable?
I think you're needlessly re-inventing the wheel.
You could use e.g. Crypt::SaltedHash to easily verify it, for instance:
my $password_entered = $cgi->param('password');
my $valid = Crypt::SaltedHash->validate($salted, $password_entered);
A longer example, showing how to use Crypt::SaltedHash to generate the salted password in the first place, too:
my $csh = Crypt::SaltedHash->new(algorithm => 'SHA-256');
$csh->add('secretpassword');
my $salted = $csh->generate;
# $salted will contain the salted hash (Crypt::SaltedHash picks random
# salt for you automatically)
# for example:
DB x $salted = $csh->generate;
0 '{SSHA256}H1WaxHcyAB81iyIPwib/cCUtjqCm2sxQNA1QvGeh/iT3m51w'
# validating that against the plaintext 'secretpassword' shows it's right:
DB x Crypt::SaltedHash->validate($salted, 'secretpassword');
0 1
# and trying it with an incorrect password:
DB x Crypt::SaltedHash->validate($salted, 'wrongpassword');
0 ''
No reason to re-invent all of this yourself.
You seem to be doing RFC 2307 the hard way, and you've also managed to introduce bugs: those + signs do not mean what you think (Perl concatenates strings with the . operator, not +).
Subclass Authen::Passphrase::SaltedDigest instead.
Not sure the whole picture is present, but the unpack template you specified, 'N32 N*', asks for unsigned 32-bit big-endian integers (see the pack docs), not bytes.
To get the fixed-length hash in one variable and the variable-length salt in another, unpack two byte strings instead:
my ($hash, $salt) = unpack('a32 a*', $data);
Here 'a32' takes the first 32 bytes (the SHA-256 digest) and 'a*' takes whatever remains (the salt).