I want to hash sensitive information (1 field) in my SAS data set using MD5. But after hashing the data looks awkward, i.e. all special characters. Is this the right way to use a hash function?
My Code:
data md5;
set sashelp.class (obs=2);
md5 = md5(strip(name));
keep name md5;
put _all_;
run;
My Output:
Name=Alfred Sex=M Age=14 Height=69 Weight=112.5 md5=�p?ޞ��\�rT]( _ERROR_=0 _N_=1
Name=Alice Sex=F Age=13 Height=56.5 Weight=84 md5=dH���/�x{�͇!K8 _ERROR_=0 _N_=2
That's correct, you just need to apply a hexadecimal format $hex32. so it's readable. MD5 is 128-bit Hash but there's a better hashing called SHA256() which is 256-bit hash.
Code:
data md5;
set sashelp.class (obs=2);
format md5 $hex32.;
md5 = md5(strip(name));
keep name md5;
put _all_;
run;
Output:
Name=Alfred md5=86703FDE9E87DD5C0F8E1072545D0128
Name=Alice md5=64489C85DC2FE0787B85CD87214B3810
Note:
You can also add a SALT or PEPPER values to your string for added security; These are string concatenate to the beginning or end of your string.
Related
Hello I'm trying to get the Base64 encoded value of a SHA512 hash. I want my output to match the output using this site but I can't seem to get it when I try step by step. For example,
The string admin gives x61Ey612Kl2gpFL56FT9weDnpSo4AV8j8+qx2AuTHdRyY036xxzTTrw10Wq3+4qQyB+XURPWx1ONxp3Y3pB37A== when I use the site above.
When I try it step by step, I use a SHA-512 hash generator on admin which results in C7AD44CBAD762A5DA0A452F9E854FDC1E0E7A52A38015F23F3EAB1D80B931DD472634DFAC71CD34EBC35D16AB7FB8A90C81F975113D6C7538DC69DD8DE9077EC
and then I use a Base64 encoder on that which gives me QzdBRDQ0Q0JBRDc2MkE1REEwQTQ1MkY5RTg1NEZEQzFFMEU3QTUyQTM4MDE1RjIzRjNFQUIxRDgwQjkzMURENDcyNjM0REZBQzcxQ0QzNEVCQzM1RDE2QUI3RkI4QTkwQzgxRjk3NTExM0Q2Qzc1MzhEQzY5REQ4REU5MDc3RUM=
which is different. How do I obtain the first output above?
There's two different transformations in play here: the SHA-512 hash of an input and the Base64 encoding of an input. They can be combined or used alone.
C7AD44CBAD762A5DA0A452F9E854FDC1E0E7A52A38015F23F3EAB1D80B931DD472634DFAC71CD34EBC35D16AB7FB8A90C81F975113D6C7538DC69DD8DE9077EC is the SHA-512 hash of the text admin represented in uppercase hexadecimal.
QzdBRDQ0Q0JBRDc2MkE1REEwQTQ1MkY5RTg1NEZEQzFFMEU3QTUyQTM4MDE1RjIzRjNFQUIxRDgwQjkzMURENDcyNjM0REZBQzcxQ0QzNEVCQzM1RDE2QUI3RkI4QTkwQzgxRjk3NTExM0Q2Qzc1MzhEQzY5REQ4REU5MDc3RUM= is the SHA-512 hash of the text admin represented in uppercase hexadecimal and then encoded with Base64.
x61Ey612Kl2gpFL56FT9weDnpSo4AV8j8+qx2AuTHdRyY036xxzTTrw10Wq3+4qQyB+XURPWx1ONxp3Y3pB37A== is the SHA-512 hash of the text admin in encoded with Base64. There was no intermediate transformation to hexadecimal.
In other words, x61Ey612Kl2gpFL56FT9weDnpSo4AV8j8+qx2AuTHdRyY036xxzTTrw10Wq3+4qQyB+XURPWx1ONxp3Y3pB37A== is the Base64 encoding of the hash output bytes, and QzdBRDQ0Q0JBRDc2MkE1REEwQTQ1MkY5RTg1NEZEQzFFMEU3QTUyQTM4MDE1RjIzRjNFQUIxRDgwQjkzMURENDcyNjM0REZBQzcxQ0QzNEVCQzM1RDE2QUI3RkI4QTkwQzgxRjk3NTExM0Q2Qzc1MzhEQzY5REQ4REU5MDc3RUM= is the Base64 encoding of the hash output text (in uppercase hexadecimal).
I'm passing an HMAC-MD5 encoded parameter into a form and the vendor is returning it as invalid. However, it matches what their hash generator gives me, with the exception of capitalization on the letters. What I did to get around this was use an lcase command. I'm wondering if this will cause me trouble later. Coldfusion generates the hashed string in capital letters, the vendor always seems to use lowercase; is it always one or the other or will they ever be mixed?
MD5 as every other hash function will produce binary output, in case of MD5 it is 16 bytes.
Because those bytes are difficult to handle, they are encoded to a string. In case of MD5 they are usually encoded to 32 lowercase hexadecimal digits, so every byte is represented by 2 characters.
Whether the target system accepts upper- or lowercase encodings or both is up to the system, it is unrelated to the hash function, both are different representations of a the same MD5 hash. So to answer your question, format the output as the target system requires it.
While RFC-1321 MD5 Message-Digest Algorithm doesn't discuss hexadecimal string encoding, the test suite does show results in lowercase.
The MD5 test suite (driver option "-x") should print the following results:
MD5 test suite:
MD5 ("") = d41d8cd98f00b204e9800998ecf8427e
MD5 ("a") = 0cc175b9c0f1b6a831c399e269772661
MD5 ("abc") = 900150983cd24fb0d6963f7d28e17f72
MD5 ("message digest") = f96b697d7cb7938d525a2f31aaf161d0
MD5 ("abcdefghijklmnopqrstuvwxyz") = c3fcd3d76192e4007dfb496cca67e13b
MD5 ("ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789") =
d174ab98d277d9f5a5611c2c9f419d9f
MD5 ("123456789012345678901234567890123456789012345678901234567890123456
78901234567890") = 57edf4a22be3c955ac49da2e2107b67a
Lowercase is simply the outcome of C/C++ printf() format specifier %02x, not a requirement: "should print", not "must print".
Ref: RFC-1321 Appendix A.5 Test suite
A hex string can contain anything in the 0-9 and a-f, A-F range, so you should anticipate both upper and lower-case versions.
If you're really stuck trying to interface between two highly opinionated systems, force upper or lower case depending on your requirements.
Ive been trying to write the md5-digest algorithm in erlang and have no clue how to implement this step,
1. creating 16 octet MD5 hash of X where X is a string.
Can someone help ?
Does this mean this:
Create a 16 byte(32-hex digits) of base - 8(octet) which is md5 of X. ?
Thank you!
Using crypto module and hash function, you can calculate the MD5 which is a 16 byte digest algorithm.
crypto:hash(Type, Data) -> Digest
Type = md5
Data = iodata()
Digest = binary()
It gets a md5 atom as Type and an iodata() as Data, and returns a binary() as Digest. Following code snippet is a simple example:
crypto:hash(md5, "put-your-string-here").
Check crypto documentation for more information.
Also for converting the returned binary value to hex string, there is no function in standard library, but it is as simple as few lines of code which is well explained in this thread.
This md5 module from the epop package calculates the md5 and returns it as a hex string.
epop_md5:string("put-your-string-here").
Say I want to create a password that is purely random, call it randPass of size 32 characters.
Is there any advantage in terms of security of using this directly as a password, vs hashing and salting it (like md5(randPass + salt)?
I mean at the end of the day, they will both be a 32 character long random characters.
Here is a dummy example:
salt = SFZYCr4Ul1zz1rhurksC67AugGIYOKs5;
randPass = VgQK1AOlXYiNwfe74RlU8e8E4szC4UXK;
Then the md5(randPass + salt) = md5(VgQK1AOlXYiNwfe74RlU8e8E4szC4UXKSFZYCr4Ul1zz1rhurksC67AugGIYOKs5) becomes
hash = dddbc2cbda808beeb7e64ce578ef4020
The main advantage of a RANDOM salt is that you cannot run a dictionary attack against the hash table since each password should have a different salt, thus "Password" and salt "jfadljfadiufosd38120809321" turns into "Passwordjfadljfadiufosd38120809321" which is definitely not in a pre-computed dictionary md5 hash dictionary so you cannot do a reverse lookup and figure out the users password.
By using a hash you are limiting the characters in the password.
Hash characters are: range(0,9) and range('a','f').
More characters is better.
If the password is to be submitted to a web page then the symbols should not include those commonly used in sql injection. (e.g. ",',%,\)
To eliminate symbols change range('!','#') to range('0','9')
Set your criteria for how many and what characters and symbols are allowed.
This algorithm uses upper case, lower case, numeric and symbols.
Length of 32 characters.
$characters = array_merge(
range('a','z'),
range('A','Z'),
range('!','#'));
shuffle($characters);
shuffle($characters);
$characters = array_flip($characters);
$ndx = 33;
$pass = '';
while($ndx-->1){
$pass .= array_rand($characters);
}
echo $pass;
I'm trying to verify salted passwords with Perl and am stuck with unpack.
I've got a salted hashed password, e.g. for SHA256: SSHA256 = SHA256('password' + 'salt') + 'salt'
Base64 encoded that gets '
{SSHA256}eje4XIkY6sGakInA+loqtNzj+QUo3N7sEIsj3fNge5lzYWx0'.
I store this string in my user database. When a user logs in need to separate the salt from the hash to hash the supplied password with the salt and compare the result to the one retrieved from the db. This is where I'm stuck. I don't seem to have the right unpack template separate the hash (8-bit binary, fixed length, in this case 32 byte) from the salt (8-bit binary, variable length).
I have tried something like
my ($hash, $salt) = unpack('N32 N*', $data);
but that doesn't seem to work out.
My question is: How can I unpack this hash (after it has been Base64 decoded) to get the fixed length hash in one and the variable length salt in another variable?
I think you're needlessly re-inventing the wheel.
You could use e.g. Crypt::SaltedHash to easily verify it, for instance:
my $password_entered = $cgi->param('password');
my $valid = Crypt::SaltedHash->validate($salted, $password_entered);
A longer example, showing using Crypt::SaltedHash to generate the salted password in the first instance, too:
my $csh = Crypt::SaltedHash->new(algorithm => 'SHA-256');
$csh->add('secretpassword');
my $salted = $csh->generate;
# $salted will contain the salted hash (Crypt::SaltedHash picks random
# salt for you automatically)
# for example:
DB x $salted = $csh->generate;
0 '{SSHA256}H1WaxHcyAB81iyIPwib/cCUtjqCm2sxQNA1QvGeh/iT3m51w'
# validating that against the plaintext 'secretpassword' shows it's right:
DB x Crypt::SaltedHash->validate($salted, 'secretpassword');
0 1
# and trying it with an incorrect password:
DB x Crypt::SaltedHash->validate($salted, 'wrongpassword');
0 ''
No reason to re-invent all of this yourself.
You seem to be doing RFC2307 the hard way and also manage to introduce bugs. Those + do not mean what you think.
Subclass Authen::Passphrase::SaltedDigest instead.
Not sure the whole picture is present, but the unpack template you have specified -'N32 N*'- is for 32 unsigned long (32-bit) (big-endian) integers (see pack docs).
Looks like you may instead need unsigned chars: '32C C*'