Base64 SHA-512 hash not working as intended - hash

Hello I'm trying to get the Base64 encoded value of a SHA512 hash. I want my output to match the output using this site but I can't seem to get it when I try step by step. For example,
The string admin gives x61Ey612Kl2gpFL56FT9weDnpSo4AV8j8+qx2AuTHdRyY036xxzTTrw10Wq3+4qQyB+XURPWx1ONxp3Y3pB37A== when I use the site above.
When I try it step by step, I use a SHA-512 hash generator on admin which results in C7AD44CBAD762A5DA0A452F9E854FDC1E0E7A52A38015F23F3EAB1D80B931DD472634DFAC71CD34EBC35D16AB7FB8A90C81F975113D6C7538DC69DD8DE9077EC
and then I use a Base64 encoder on that which gives me QzdBRDQ0Q0JBRDc2MkE1REEwQTQ1MkY5RTg1NEZEQzFFMEU3QTUyQTM4MDE1RjIzRjNFQUIxRDgwQjkzMURENDcyNjM0REZBQzcxQ0QzNEVCQzM1RDE2QUI3RkI4QTkwQzgxRjk3NTExM0Q2Qzc1MzhEQzY5REQ4REU5MDc3RUM=
which is different. How do I obtain the first output above?

There's two different transformations in play here: the SHA-512 hash of an input and the Base64 encoding of an input. They can be combined or used alone.
C7AD44CBAD762A5DA0A452F9E854FDC1E0E7A52A38015F23F3EAB1D80B931DD472634DFAC71CD34EBC35D16AB7FB8A90C81F975113D6C7538DC69DD8DE9077EC is the SHA-512 hash of the text admin represented in uppercase hexadecimal.
QzdBRDQ0Q0JBRDc2MkE1REEwQTQ1MkY5RTg1NEZEQzFFMEU3QTUyQTM4MDE1RjIzRjNFQUIxRDgwQjkzMURENDcyNjM0REZBQzcxQ0QzNEVCQzM1RDE2QUI3RkI4QTkwQzgxRjk3NTExM0Q2Qzc1MzhEQzY5REQ4REU5MDc3RUM= is the SHA-512 hash of the text admin represented in uppercase hexadecimal and then encoded with Base64.
x61Ey612Kl2gpFL56FT9weDnpSo4AV8j8+qx2AuTHdRyY036xxzTTrw10Wq3+4qQyB+XURPWx1ONxp3Y3pB37A== is the SHA-512 hash of the text admin in encoded with Base64. There was no intermediate transformation to hexadecimal.
In other words, x61Ey612Kl2gpFL56FT9weDnpSo4AV8j8+qx2AuTHdRyY036xxzTTrw10Wq3+4qQyB+XURPWx1ONxp3Y3pB37A== is the Base64 encoding of the hash output bytes, and QzdBRDQ0Q0JBRDc2MkE1REEwQTQ1MkY5RTg1NEZEQzFFMEU3QTUyQTM4MDE1RjIzRjNFQUIxRDgwQjkzMURENDcyNjM0REZBQzcxQ0QzNEVCQzM1RDE2QUI3RkI4QTkwQzgxRjk3NTExM0Q2Qzc1MzhEQzY5REQ4REU5MDc3RUM= is the Base64 encoding of the hash output text (in uppercase hexadecimal).

Related

Perl Digest Bcrypt, generating a proper hash

I have written a test program that generates a Bcrypt hash. This hash later needs to be verified by a PHP backend.
This is my perl code:
use Digest;
#use Data::Entropy::Algorithms qw(rand_bits);
#my $bcrypt = Digest->new('Bcrypt', cost=>10, salt=>rand_bits(16*8));
my $bcrypt = Digest->new('Bcrypt', cost=>10, salt=>'1111111111111111');
my $settings = $bcrypt->settings(); # save for later checks.
my $pass_hash = $bcrypt->add('bob')->b64digest;
print $settings.$pass_hash."\n";
This prints
$2a$10$KRCvKRCvKRCvKRCvKRCvKOoFxCE1d/OZTKQqhet3bKOq6ZVIACXBU
This does not validate as a proper hash if I use an online bcrypt tool such as https://bcrypt-generator.com
Can someone point out the error? Thanks.
Figured out the problem. I have to use bcrypt_b64digest instead of b64digest. I wish the perl documentation was clearer in which one needs to be used so that other bcrypt implementations can "get it".
my $pass_hash = $bcrypt->add('bob')->bcrypt_b64digest;
From https://metacpan.org/pod/Digest::Bcrypt#bcrypt_b64digest
Same as "digest", but will return the digest base64 encoded using the
alphabet that is commonly used with bcrypt. The length of the returned
string will be 31 and will only contain characters from the ranges
'0'..'9', 'A'..'Z', 'a'..'z', '+', and '.'
The base64 encoded string returned is not padded to be a multiple of 4
bytes long. Note: This is bcrypt's own non-standard base64 alphabet,
It is not compatible with the standard MIME base64 encoding.

Correct SHA256 implementation with UTF-8 characters

I'm running into issues comparing SHA256 hashes generated by different languages/functions.
For example, SHA256("í") either returns:
f3df1f9c358ae8eceb8fce7c00614288d113ad55315f4ebb909774a7daadfc84
-or-
127035a8ff26256ea0541b5add6dcc3ecdaeea603e606f84e0fd63492fbab2c5
Which of the above hash is correct for a string of one character, and what's the correct way of handling UTF-8 strings?
Which of the above hash is correct for a string of one character
There is no "correct" answer. What's being hashed is the bytes, not the "character". What bytes are hashed exactly depends on the encoding of the string.
"í" in Windows-1252 is byte ED, which hashes as:
f3df1f9c358ae8eceb8fce7c00614288d113ad55315f4ebb909774a7daadfc84
"í" in UTF-8 is bytes C3 AD, which hashes as:
127035a8ff26256ea0541b5add6dcc3ecdaeea603e606f84e0fd63492fbab2c5
"í" in UTF-16LE is bytes ED 00, which hashes as:
430e2ca27910b5ee6e0ec56a12b81325c763376cb8e25a60362dce9444424f95
How exactly that works in various programming languages depends on the languages and the encodings they use for strings.

Correct Hashing Algorithm/Function

Are there any secure hashing algorithms/functions that give all the letters and numbers, and not just 0-9,a-f.
So the output could contain: 0-9, a-z, A-Z and even some symbols.
Any hashing algorithm, really.
Hexadecimal is just a common representation for them. Look at this code snippet (using perl, because you didn't tag a programming language):
use Digest::MD5 qw/md5 md5_hex/;
use MIME::Base64;
my $str = 'Foobar';
# Hexadecimal representation
print md5_hex($str),"\n";
# Base64 encoded representation
print encode_base64(md5($str));
Output:
89d5739baabbbe65be35cbe61c88e06d
idVzm6q7vmW+NcvmHIjgbQ==
The first output is the hexadecimal representation of the MD5 digest of the string; the second is the Base64 encoded representation of the raw digest.
This would work with any digesting algorithm. It does not, however, affect how secure the underlying algorithm actually is.
Use your favorite hashing algorithm/function and convert the output to base64. A mechanism to do that in Java is here: how to convert hex to base64.
Note that the hash value will still be the same, but the presentation will be different. If there's a reason you want to use a fuller symbol set, perhaps you could edit your question.

How to decode a Base64 string?

I have a normal string in Powershell that is from a text file containing Base64 text; it is stored in $x. I am trying to decode it as such:
$z = [System.Text.Encoding]::Unicode.GetString([System.Convert]::FromBase64String($x));
This works if $x was a Base64 string created in Powershell (but it's not). And this does not work on the $x Base64 string that came from a file, $z simply ends up as something like 䐲券.
What am I missing? For example, $x could be YmxhaGJsYWg= which is Base64 for blahblah.
In a nutshell, YmxhaGJsYWg= is in a text file then put into a string in this Powershell code and I try to decode it but end up with 䐲券 etc.
Isn't encoding taking the text TO base64 and decoding taking base64 BACK to text? You seem be mixing them up here. When I decode using this online decoder I get:
BASE64: blahblah
UTF8: nVnV
not the other way around. I can't reproduce it completely in PS though. See sample below:
PS > [System.Text.Encoding]::UTF8.GetString([System.Convert]::FromBase64String("blahblah"))
nV�nV�
PS > [System.Convert]::ToBase64String([System.Text.Encoding]::UTF8.GetBytes("nVnV"))
blZuVg==
EDIT I believe you're using the wrong encoder for your text. The encoded base64 string is encoded from UTF8(or ASCII) string.
PS > [System.Text.Encoding]::UTF8.GetString([System.Convert]::FromBase64String("YmxhaGJsYWg="))
blahblah
PS > [System.Text.Encoding]::Unicode.GetString([System.Convert]::FromBase64String("YmxhaGJsYWg="))
汢桡汢桡
PS > [System.Text.Encoding]::ASCII.GetString([System.Convert]::FromBase64String("YmxhaGJsYWg="))
blahblah
There are no PowerShell-native commands for Base64 conversion - yet (as of PowerShell [Core] 7.1), but adding dedicated cmdlets has been suggested in GitHub issue #8620.
For now, direct use of .NET is needed.
Important:
Base64 encoding is an encoding of binary data using bytes whose values are constrained to a well-defined 64-character subrange of the ASCII character set representing printable characters, devised at a time when sending arbitrary bytes was problematic, especially with the high bit set (byte values > 0x7f).
Therefore, you must always specify explicitly what character encoding the Base64 bytes do / should represent.
Ergo:
on converting TO Base64, you must first obtain a byte representation of the string you're trying to encode using the character encoding the consumer of the Base64 string expects.
on converting FROM Base64, you must interpret the resultant array of bytes as a string using the same encoding that was used to create the Base64 representation.
Examples:
Note:
The following examples convert to and from UTF-8 encoded strings:
To convert to and from UTF-16LE ("Unicode") instead, substitute [Text.Encoding]::Unicode for [Text.Encoding]::UTF8
Convert TO Base64:
PS> [Convert]::ToBase64String([Text.Encoding]::UTF8.GetBytes('Motörhead'))
TW90w7ZyaGVhZA==
Convert FROM Base64:
PS> [Text.Encoding]::Utf8.GetString([Convert]::FromBase64String('TW90w7ZyaGVhZA=='))
Motörhead
This page shows up when you google how to convert to base64, so for completeness:
$b = [System.Text.Encoding]::UTF8.GetBytes("blahblah")
[System.Convert]::ToBase64String($b)
Base64 encoding converts three 8-bit bytes (0-255) into four 6-bit bytes (0-63 aka base64). Each of the four bytes indexes an ASCII string which represents the final output as four 8-bit ASCII characters. The indexed string is typically 'A-Za-z0-9+/' with '=' used as padding. This is why encoded data is 4/3 longer.
Base64 decoding is the inverse process. And as one would expect, the decoded data is 3/4 as long.
While base64 encoding can encode plain text, its real benefit is encoding non-printable characters which may be interpreted by transmitting systems as control characters.
I suggest the original poster render $z as bytes with each bit having meaning to the application. Rendering non-printable characters as text typically invokes Unicode which produces glyphs based on your system's localization.
Base64decode("the answer to life the universe and everything") = 00101010
If anyone would like to do it with a pipe in Powershell (like a filter) (e.g. read file contents and decode it), it can be achieved with a one-liner like that:
Get-Content base64.txt | %{[Text.Encoding]::UTF8.GetString([Convert]::FromBase64String($_))}
I had issues with spaces showing in between my output and there was no answer online at all to fix this issue. I literally spend many hours trying to find a solution and found one from playing around with the code to the point that I almost did not even know what I typed in at the time that I got it to work. Here is my fix for the issue: [System.Text.Encoding]::UTF8.GetString(([System.Convert]::FromBase64String($base64string)|?{$_}))
Still not a "built-in", but published to gallery, authored by MS:
https://github.com/powershell/textutility
TextUtility
ConvertFrom-Base64
Return a string decoded from base64.
ConvertTo-Base64
Return a base64 encoded representation of a string.

Play! hash password returns bad result

I'm using Play 1.2.1. I want to hash my users password. I thought that Crypto.passwordHash will be good, but it isn't. passwordHash documentation says it returns MD5 password hash. I created some user accounts in fixture, where I put md5 password hash:
...
User(admin):
login: admin
password: f1682b54de57d202ba947a0af26399fd
fullName: Administrator
...
The problem is, when I try to log in, with something like this:
user.password.equals(Crypto.passwordHash(password))
and it doesn't work. So I put a log statement in my autentify method:
Logger.info("\nUser hashed password is %s " +
"\nPassed password is %s " +
"\nHashed passed password is %s",
user.password, password, Crypto.passwordHash(password));
And the password hashes are indeed different, but hey! The output of passwordHash method isn't even an MD5 hash:
15:02:16,164 INFO ~
User hashed password is f1682b54de57d202ba947a0af26399fd
Passed password is <you don't have to know this :P>
Hashed passed password is 8WgrVN5X0gK6lHoK8mOZ/Q==
How about that? How to fix it? Or maybe I have to implement my own solution?
Crypto.passwordHash returns base64-encoded password hash, while you are comparing to hex-encoded.
MD5 outputs a sequence of 16 bytes, each byte having (potentially) any value between 0 and 255 (inclusive). When you want to print the value, you need to convert the bytes to a sequence of "printable characters". There are several possible conventions, the two main being hexadecimal and Base64.
In hexadecimal notation, each byte value is represented as two "hexadecimal digits": such a digit is either a decimal digit ('0' to '9') or a letter (from 'a' to 'f', case is irrelevant). The 16 bytes thus become 32 characters.
In Base64 encoding, each group of three successive bytes is encoded as four characters, taken in a list of 64 possible characters (digits, lowercase letters, uppercase letters, '+' and '/'). One or two final '=' signs may be added so that the encoded string consists in a number of characters which is multiple of 4.
Here, '8WgrVN5X0gK6lHoK8mOZ/Q==' is the Base64 encoding of a sequence of 16 bytes, the first one having value 241, the second one 104, then 43, and so on. In hexadecimal notation, the first byte would be represented by 'f1', the second by '68', the third by '2b'... and the hexadecimal notation of the complete sequence of 16 bytes is then 'f1682b54de57d202ba947a0af26399fd', the value that you expected.
The play.libs.Codec class contains methods for decoding and encoding Base64 and hexadecimal notations. It also contains Codec.hexMD5() which performs MD5 hashing and returns the value in hexadecimal notation instead of Base64.
as Nickolay said you are comparing Hex vs Base-64 strings. Also, I would recommend using BCrypt for that, not the Crypto tool of Play.