Correct Hashing Algorithm/Function - hash

Are there any secure hashing algorithms/functions whose output uses all letters and digits, not just 0-9 and a-f?
So the output could contain 0-9, a-z, A-Z, and even some symbols.

Any hashing algorithm, really.
Hexadecimal is just a common representation for them. Look at this code snippet (using Perl, because you didn't tag a programming language):
use Digest::MD5 qw/md5 md5_hex/;
use MIME::Base64;
my $str = 'Foobar';
# Hexadecimal representation
print md5_hex($str),"\n";
# Base64 encoded representation
print encode_base64(md5($str));
Output:
89d5739baabbbe65be35cbe61c88e06d
idVzm6q7vmW+NcvmHIjgbQ==
The first output is the hexadecimal representation of the MD5 digest of the string; the second is the Base64 encoded representation of the raw digest.
This would work with any digesting algorithm. It does not, however, affect how secure the underlying algorithm actually is.

Use your favorite hashing algorithm/function and convert the output to base64. A mechanism to do that in Java is here: how to convert hex to base64.
Note that the hash value will still be the same, but the presentation will be different. If there's a reason you want to use a fuller symbol set, perhaps you could edit your question.
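As a rough sketch of the advice above, the following assumes Perl with the core Digest::SHA and MIME::Base64 modules. SHA-256 and the input 'abc' are chosen purely for illustration; the answers above do not name a specific algorithm. The digest bytes stay the same, only the alphabet used to print them changes:

```perl
use strict;
use warnings;
use Digest::SHA qw(sha256 sha256_hex);
use MIME::Base64 qw(encode_base64 decode_base64);

# Illustrative input; any string works.
my $str = 'abc';

my $raw = sha256($str);              # 32 raw bytes
my $hex = sha256_hex($str);          # 64 characters from 0-9, a-f
my $b64 = encode_base64($raw, '');   # 44 characters from A-Z, a-z, 0-9, +, /, =

# URL-safe variant: swap '+' and '/' for '-' and '_', then drop the padding.
(my $b64url = $b64) =~ tr{+/}{-_};
$b64url =~ s/=+\z//;

print "$hex\n$b64\n$b64url\n";
```

Decoding the Base64 string gives back exactly the raw digest, which is one way to confirm that only the presentation differs.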

Related

Perl Digest Bcrypt, generating a proper hash

I have written a test program that generates a Bcrypt hash. This hash later needs to be verified by a PHP backend.
This is my perl code:
use Digest;
#use Data::Entropy::Algorithms qw(rand_bits);
#my $bcrypt = Digest->new('Bcrypt', cost=>10, salt=>rand_bits(16*8));
my $bcrypt = Digest->new('Bcrypt', cost=>10, salt=>'1111111111111111');
my $settings = $bcrypt->settings(); # save for later checks.
my $pass_hash = $bcrypt->add('bob')->b64digest;
print $settings.$pass_hash."\n";
This prints
$2a$10$KRCvKRCvKRCvKRCvKRCvKOoFxCE1d/OZTKQqhet3bKOq6ZVIACXBU
This does not validate as a proper hash if I use an online bcrypt tool such as https://bcrypt-generator.com
Can someone point out the error? Thanks.
Figured out the problem. I have to use bcrypt_b64digest instead of b64digest. I wish the Perl documentation were clearer about which one needs to be used so that other bcrypt implementations can "get it".
my $pass_hash = $bcrypt->add('bob')->bcrypt_b64digest;
From https://metacpan.org/pod/Digest::Bcrypt#bcrypt_b64digest
Same as "digest", but will return the digest base64 encoded using the
alphabet that is commonly used with bcrypt. The length of the returned
string will be 31 and will only contain characters from the ranges
'0'..'9', 'A'..'Z', 'a'..'z', '+', and '.'
The base64 encoded string returned is not padded to be a multiple of 4
bytes long. Note: This is bcrypt's own non-standard base64 alphabet,
It is not compatible with the standard MIME base64 encoding.
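To make the alphabet difference concrete, here is a small sketch that uses only the core MIME::Base64 module. It assumes that bcrypt's encoding groups bits the same way as standard Base64 and differs only in its alphabet (./A-Za-z0-9) and the missing padding; it does not call Digest::Bcrypt and is not taken from that module's source. The salt is the one from the question:

```perl
use strict;
use warnings;
use MIME::Base64 qw(encode_base64);

# The 16-byte salt from the question above.
my $salt = '1111111111111111';

# Standard (MIME) Base64, without a trailing newline.
my $mime = encode_base64($salt, '');

# Translate position by position into bcrypt's alphabet, then drop '=' padding.
(my $bcrypt_style = $mime) =~ tr{A-Za-z0-9+/}{./A-Za-z0-9};
$bcrypt_style =~ s/=+\z//;

print "$mime\n";           # MTExMTExMTExMTExMTExMQ==
print "$bcrypt_style\n";   # KRCvKRCvKRCvKRCvKRCvKO
```

The second line matches the salt portion of the settings string printed in the question ($2a$10$KRCvKRCvKRCvKRCvKRCvKO), which suggests why a digest in the standard alphabet does not verify elsewhere.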

Base64 SHA-512 hash not working as intended

Hello I'm trying to get the Base64 encoded value of a SHA512 hash. I want my output to match the output using this site but I can't seem to get it when I try step by step. For example,
The string admin gives x61Ey612Kl2gpFL56FT9weDnpSo4AV8j8+qx2AuTHdRyY036xxzTTrw10Wq3+4qQyB+XURPWx1ONxp3Y3pB37A== when I use the site above.
When I try it step by step, I use a SHA-512 hash generator on admin which results in C7AD44CBAD762A5DA0A452F9E854FDC1E0E7A52A38015F23F3EAB1D80B931DD472634DFAC71CD34EBC35D16AB7FB8A90C81F975113D6C7538DC69DD8DE9077EC
and then I use a Base64 encoder on that which gives me QzdBRDQ0Q0JBRDc2MkE1REEwQTQ1MkY5RTg1NEZEQzFFMEU3QTUyQTM4MDE1RjIzRjNFQUIxRDgwQjkzMURENDcyNjM0REZBQzcxQ0QzNEVCQzM1RDE2QUI3RkI4QTkwQzgxRjk3NTExM0Q2Qzc1MzhEQzY5REQ4REU5MDc3RUM=
which is different. How do I obtain the first output above?
There are two different transformations in play here: the SHA-512 hash of an input and the Base64 encoding of an input. They can be combined or used alone.
C7AD44CBAD762A5DA0A452F9E854FDC1E0E7A52A38015F23F3EAB1D80B931DD472634DFAC71CD34EBC35D16AB7FB8A90C81F975113D6C7538DC69DD8DE9077EC is the SHA-512 hash of the text admin represented in uppercase hexadecimal.
QzdBRDQ0Q0JBRDc2MkE1REEwQTQ1MkY5RTg1NEZEQzFFMEU3QTUyQTM4MDE1RjIzRjNFQUIxRDgwQjkzMURENDcyNjM0REZBQzcxQ0QzNEVCQzM1RDE2QUI3RkI4QTkwQzgxRjk3NTExM0Q2Qzc1MzhEQzY5REQ4REU5MDc3RUM= is the SHA-512 hash of the text admin represented in uppercase hexadecimal and then encoded with Base64.
x61Ey612Kl2gpFL56FT9weDnpSo4AV8j8+qx2AuTHdRyY036xxzTTrw10Wq3+4qQyB+XURPWx1ONxp3Y3pB37A== is the SHA-512 hash of the text admin encoded with Base64. There was no intermediate transformation to hexadecimal.
In other words, x61Ey612Kl2gpFL56FT9weDnpSo4AV8j8+qx2AuTHdRyY036xxzTTrw10Wq3+4qQyB+XURPWx1ONxp3Y3pB37A== is the Base64 encoding of the hash output bytes, and QzdBRDQ0Q0JBRDc2MkE1REEwQTQ1MkY5RTg1NEZEQzFFMEU3QTUyQTM4MDE1RjIzRjNFQUIxRDgwQjkzMURENDcyNjM0REZBQzcxQ0QzNEVCQzM1RDE2QUI3RkI4QTkwQzgxRjk3NTExM0Q2Qzc1MzhEQzY5REQ4REU5MDc3RUM= is the Base64 encoding of the hash output text (in uppercase hexadecimal).
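The two pipelines can be reproduced side by side. This is a sketch in Perl (chosen here for illustration, since the question does not name a language), using the core Digest::SHA and MIME::Base64 modules:

```perl
use strict;
use warnings;
use Digest::SHA qw(sha512 sha512_hex);
use MIME::Base64 qw(encode_base64);

my $input = 'admin';

my $raw = sha512($input);          # 64 raw bytes
my $hex = uc sha512_hex($input);   # 128 characters of uppercase hex text

# Base64 of the hash bytes: the 88-character form (starts with x61Ey612).
my $b64_of_bytes = encode_base64($raw, '');

# Base64 of the hex text: the 172-character form (starts with QzdBRDQ0).
my $b64_of_hex = encode_base64($hex, '');

# Starting from hex text, undo the hex step before encoding.
my $recovered = encode_base64(pack('H*', $hex), '');

print "$b64_of_bytes\n$b64_of_hex\n";
```

If only the hex string is available, pack('H*', ...) converts it back to bytes, so $recovered equals $b64_of_bytes.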

Are MD5 hashes always either capital or lowercase?

I'm passing an HMAC-MD5 encoded parameter into a form and the vendor is returning it as invalid. However, it matches what their hash generator gives me, with the exception of capitalization on the letters. What I did to get around this was use an lcase command. I'm wondering if this will cause me trouble later. Coldfusion generates the hashed string in capital letters, the vendor always seems to use lowercase; is it always one or the other or will they ever be mixed?
MD5, like every other hash function, produces binary output; in the case of MD5 it is 16 bytes.
Because those bytes are difficult to handle, they are encoded as a string. MD5 digests are usually encoded as 32 lowercase hexadecimal digits, so every byte is represented by 2 characters.
Whether the target system accepts uppercase encodings, lowercase encodings, or both is up to that system and is unrelated to the hash function; both are different representations of the same MD5 hash. So to answer your question, format the output as the target system requires.
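A short sketch of that point, written in Perl rather than ColdFusion purely for illustration and using the core Digest::MD5 module: the digest is 16 bytes, and the letter case is decided only when those bytes are formatted as text.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5 md5_hex);

my $raw = md5('abc');             # 16 raw bytes

my $lower = unpack('H*', $raw);   # hex text with a-f
my $upper = uc $lower;            # same digest, hex text with A-F

printf "%d bytes\n%s\n%s\n", length($raw), $lower, $upper;
```

Both strings decode to the same 16 bytes, so which one to send depends only on what the receiving system accepts.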
While RFC-1321 MD5 Message-Digest Algorithm doesn't discuss hexadecimal string encoding, the test suite does show results in lowercase.
The MD5 test suite (driver option "-x") should print the following results:
MD5 test suite:
MD5 ("") = d41d8cd98f00b204e9800998ecf8427e
MD5 ("a") = 0cc175b9c0f1b6a831c399e269772661
MD5 ("abc") = 900150983cd24fb0d6963f7d28e17f72
MD5 ("message digest") = f96b697d7cb7938d525a2f31aaf161d0
MD5 ("abcdefghijklmnopqrstuvwxyz") = c3fcd3d76192e4007dfb496cca67e13b
MD5 ("ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789") =
d174ab98d277d9f5a5611c2c9f419d9f
MD5 ("123456789012345678901234567890123456789012345678901234567890123456
78901234567890") = 57edf4a22be3c955ac49da2e2107b67a
Lowercase is simply the outcome of C/C++ printf() format specifier %02x, not a requirement: "should print", not "must print".
Ref: RFC-1321 Appendix A.5 Test suite
A hex string can contain anything in the 0-9 and a-f, A-F range, so you should anticipate both upper and lower-case versions.
If you're really stuck trying to interface between two highly opinionated systems, force upper or lower case depending on your requirements.
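One way to anticipate both cases is to normalize before comparing. The helper below is hypothetical (the name same_digest is not from any library) and is sketched in Perl for illustration only:

```perl
use strict;
use warnings;

# Hypothetical helper: true if two hex digests denote the same bytes,
# regardless of letter case. Non-hex input is rejected.
sub same_digest {
    my ($left, $right) = @_;
    for ($left, $right) {
        return 0 unless defined($_) && /\A[0-9A-Fa-f]+\z/;
    }
    return lc($left) eq lc($right) ? 1 : 0;
}

print same_digest('900150983CD24FB0D6963F7D28E17F72',
                  '900150983cd24fb0d6963f7d28e17f72') ? "match\n" : "differ\n";
```

When sending a digest to a system that insists on one case, apply lc or uc once at the boundary rather than throughout the code.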

How can I work with raw bytes in perl

Documentation all directs me to unicode support, yet I don't think my request has anything to do with Unicode. I want to work with raw bytes within the context of a single scalar; I need to be able to figure out its length (in bytes), take substrings of it (in bytes), write the bytes to disc, and over the network. Is there an easy way to do this, without treating the bytes as any sort of encoding in perl?
EDIT
More explicitly,
my $data = "Perl String, unsure of encoding and don't need to know";
my @data_chunked_into_1024_bytes_each = #???
Perl strings are, conceptually, strings of characters, which are positive 32-bit integers that (normally) represent Unicode code points. A byte string, in Perl, is just a string in which all the characters have values less than 256.
(That's the conceptual view. The internal representation is somewhat more complicated, as the perl interpreter tries to store byte strings — in the above sense — as actual byte strings, while using a generalized UTF-8 encoding for strings that contain character values of 256 or higher. But this is all supposed to be transparent to the user, and in fact mostly is, except for some ugly historical corner cases like the bitwise not (~) operator.)
As for how to turn a general string into a byte string, that really depends on what the string you have contains and what the byte string is supposed to contain:
If your string already is a string of bytes (e.g. if you read it from a file in binary mode), then you don't need to do anything. The string shouldn't contain any characters above 255 to begin with, and if it does, that's an error and will probably be reported as such by the encryption code.
Similarly, if your string is supposed to encode text in the ASCII or ISO-8859-1 encodings (which encode the 7- and 8-bit subsets of Unicode respectively), then you don't need to do anything: any characters up to 255 are already correctly encoded, and any higher values are invalid for those encodings.
If your input string contains (Unicode) text that you want to encode in some other encoding, then you'll need to convert the string to that encoding. The usual way to do that is by using the Encode module, like this:
use Encode;
my $byte_string = encode( "name of encoding", $text_string );
Obviously, you can convert the byte string back to the corresponding character string with:
use Encode;
my $text_string = decode( "name of encoding", $byte_string );
For the special case of the UTF-8 encoding, it's also possible to use the built-in utf8::encode() function instead of Encode::encode():
utf8::encode( $string );
which does essentially the same thing as:
use Encode;
$string = encode( "utf8", $string );
Note that, unlike Encode::encode(), the utf8::encode() function modifies the input string directly. Also note that the "utf8" above refers to Perl's extended UTF-8 encoding, which allows values outside the official Unicode range; for strictly standards-compliant UTF-8 encoding, use "utf-8" with a hyphen (see Encode documentation for the gory details). And, yes, there's also a utf8::decode() function that does pretty much what you'd expect.
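The difference between the copying and in-place forms can be seen in a short sketch. The sample character is arbitrary, and "UTF-8" (with a hyphen) is used here for the strict variant:

```perl
use strict;
use warnings;
use Encode qw(encode decode);

my $text = "\x{f6}";                   # one character: o with diaeresis

my $bytes = encode('UTF-8', $text);    # returns a new byte string
my $back  = decode('UTF-8', $bytes);   # back to a character string

my $in_place = $text;
utf8::encode($in_place);               # modifies $in_place itself

printf "%d character, %d bytes\n", length($text), length($bytes);
```

After the calls, $text still holds one character, while $bytes and $in_place both hold the two bytes 0xC3 0xB6.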
If I understood your question correctly, what you want is the pack/unpack functions: http://perldoc.perl.org/functions/pack.html
As long as your string doesn't contain characters above codepoint 255, it will mostly work as a plain byte string, with length and substr operating on bytes. Additionally, most output functions like print expect octets/bytes by default and will actually complain if you try to stuff anything else into them.
You may need to explicitly encode/decode your output if it is known to be in some encoding, but more details can only be added if you ask another specific question for each problematic part of your program.
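For the chunking asked about in the question, one possible sketch uses unpack with a repeated group. The sample data and the choice to UTF-8 encode a copy first (so that no character exceeds 255) are assumptions made for illustration:

```perl
use strict;
use warnings;

# Sample data standing in for the scalar from the question.
my $data = 'x' x 2500;

# Work on a copy that is guaranteed to hold only bytes.
my $bytes = $data;
utf8::encode($bytes);

# Split into pieces of at most 1024 bytes; the last piece may be shorter.
my @chunks = unpack('(a1024)*', $bytes);

printf "%d chunks, sizes: %s\n",
    scalar(@chunks), join(', ', map { length } @chunks);
```

If the scalar already holds raw bytes (for example, data read from a binmode filehandle), the utf8::encode step should be skipped, because it would re-encode any byte above 127.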

Convert a UTF8 string to ASCII in Perl

I've tried everything Google and StackOverflow have recommended (that I could find) including using Encode. My code works but it just uses UTF8 and I get the wide character warnings. I know how to work around those warnings but I'm not using UTF8 for anything else so I'd like to just convert it and not have to adapt the rest of my code to deal with it. Here's my code:
my $xml = XMLin($content);
# Populate the @titles array with each item title.
my @titles;
for my $item (@{$xml->{channel}->{item}}) {
    my $title = Encode::decode_utf8($item->{title});
    #my $title = $item->{title};
    #utf8::downgrade($title, 1);
    Encode::from_to($title, 'utf8', 'iso-8859-1');
    push @titles, $title;
}
return @titles;
Commented out you can see some of the other things I've tried. I'm well aware that I don't know what I'm doing here. I just want to end up with a plain old ASCII string though. Any ideas would be greatly appreciated. Thanks.
The answer depends on how you want to use the title. There are 3 basic ways to go:
Bytes that represent a UTF-8 encoded string.
This is the format that should be used if you want to store the UTF-8 encoded string outside your application, be it on disk or sending it over the network or anything outside the scope of your program.
A string of Unicode characters.
The concept of characters is internal to Perl. When you call Encode::decode_utf8, Perl attempts to convert a sequence of bytes into a string of characters. The Perl VM (and the programmer writing Perl code) cannot externalize that concept except through decoding UTF-8 bytes on input and encoding them to UTF-8 bytes on output. For example, suppose your program receives two bytes of input that you know represent UTF-8 encoded character(s), say 0xC3 0xB6. In that case decode_utf8 returns a string that contains one character, ö, instead of two bytes.
You can then proceed to manipulate that string in Perl. To illustrate the difference further, consider the following code:
use feature 'say';
use Encode qw(decode_utf8);
my $bytes = "\xC3\xB6";
say length($bytes); # prints "2"
my $string = decode_utf8($bytes);
say length($string); # prints "1"
The special case of ASCII, a subset of UTF-8.
ASCII is a very small subset of Unicode, in which each character is represented by a single byte. Converting Unicode to ASCII is an inherently lossy operation, as most Unicode characters are not ASCII characters. When coercing a Unicode string to ASCII, you are forced either to drop every character that is not in ASCII or to map each such character to its closest ASCII equivalent (which isn't possible in the vast majority of cases).
Since you have wide character warnings, it means that you're trying to manipulate (possibly output) Unicode characters that cannot be represented as ASCII or ISO-8859-1.
If you do not need to manipulate the title from your XML document as a string, I'd suggest you leave it as UTF-8 bytes (I'd mention that you should be careful not to mix bytes and characters in strings). If you do need to manipulate it, then decode, manipulate, and on output encode it in UTF-8.
For further reading, please use perldoc to study perlunitut, perlunifaq, perlunicode, perluniintro, and Encode.
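The lossy nature of the ASCII conversion can be illustrated with a sketch. It assumes the title is already a decoded character string; the sample value and the accent-stripping approach via the core Unicode::Normalize module are illustrative choices, not something the answer above prescribes:

```perl
use strict;
use warnings;
use Encode qw(encode);
use Unicode::Normalize qw(NFD);

# "cafe" with an acute accent on the e, as a character string.
my $title = "caf\x{e9}";

# Option 1: encode directly; characters outside ASCII become '?'.
my $lossy = encode('ascii', $title);

# Option 2: decompose, strip combining marks, then encode.
(my $stripped = NFD($title)) =~ s/\p{Mn}//g;
my $ascii = encode('ascii', $stripped);

print "$lossy\n$ascii\n";   # caf? and cafe
```

Option 2 only helps for characters that decompose into an ASCII base letter plus combining marks; anything else still ends up as '?'.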
Although this is an old question, I just spent several hours (!) trying to do more or less the same thing! That is: read data from a UTF-8 XML file, and convert that data into the Windows-1252 codepage (I could also have used Latin1, ISO-8859-1 etc.) in order to be able to create filenames with accented letters.
After much experimentation, and even more searching, I finally managed to get the conversion working. The "trick" is to use Encode::encode instead of Encode::decode.
For example, given the code in the original question, the correct (or at least one :-) way to convert from UTF-8 would be:
my $title = Encode::encode("Windows-1252", $item->{title});
or
my $title = Encode::encode("ISO-8859-1", $item->{title});
or
my $title = Encode::encode("<your-favourite-codepage-here>", $item->{title});
I hope this helps others having similar problems!
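A small sketch of how the target codepage matters, assuming the input is already a character string (if it is still UTF-8 bytes, it would need Encode::decode first). The sample characters are chosen for illustration, and "cp1252" is Encode's name for Windows-1252:

```perl
use strict;
use warnings;
use Encode qw(encode);

my $accented = "\x{e9}";     # e with acute accent
my $euro     = "\x{20AC}";   # euro sign

# Both codepages map the accented letter to the single byte 0xE9.
my $e_cp1252 = encode('cp1252', $accented);
my $e_latin1 = encode('ISO-8859-1', $accented);

# Only Windows-1252 has a euro sign (byte 0x80); ISO-8859-1 falls back to '?'.
my $euro_cp1252 = encode('cp1252', $euro);
my $euro_latin1 = encode('ISO-8859-1', $euro);

printf "%vX %vX %vX %vX\n", $e_cp1252, $e_latin1, $euro_cp1252, $euro_latin1;
```

Characters that the chosen codepage cannot represent are silently replaced, so it is worth checking the result when titles may contain symbols beyond accented Latin letters.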
You can use the following line to simply get rid of the warning. This assumes that you want to use UTF8, which shouldn't normally be a problem.
binmode(STDOUT, ":encoding(utf8)");
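To see what such an encoding layer does, the sketch below writes to an in-memory filehandle instead of STDOUT so that the resulting bytes can be inspected. The buffer and the sample character are assumptions for illustration:

```perl
use strict;
use warnings;

my $text = "\x{263A}";   # WHITE SMILING FACE, a character above 255

# In-memory file standing in for STDOUT, with a UTF-8 encoding layer.
my $buffer = '';
open(my $out, '>:encoding(UTF-8)', \$buffer) or die "open failed: $!";
print {$out} $text;      # no "Wide character" warning
close($out) or die "close failed: $!";

printf "%d character became %d bytes\n", length($text), length($buffer);
```

Without the layer, printing the same character to a plain filehandle triggers the "Wide character" warning, because Perl has to guess how to turn it into bytes.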