Password hashing and salting

I'm trying to wrap my head around the logic of hashing passwords with MD5/SHA combined with salting.
I understand the concept of a user providing a text password, appending a random string (salt) to it, and hashing the final string with whatever hash function you want.
This is where I lose the thread.
Say that in my database of users I have usernames and hashed passwords generated with a random salt value.
When the user goes to log into the system and enters their password, how do I obtain the correct salt to check the password's validity?
If the salt is randomly generated to begin with, I can't recalculate it.
Do I have to store the salt with the username/password record? If I query the database for the salt value by username, it would seem that defeats the purpose of salting.
How do I obtain the correct salt when it comes time to validate the supplied password?

From Wikipedia, Salt (cryptography):
A new salt is randomly generated for each password. In a typical setting, the salt and the password are concatenated and processed with a cryptographic hash function, and the resulting output (but not the original password) is stored with the salt in a database.
You store it alongside the hash; its purpose is to defeat precomputed dictionary attacks.

The salt is stored in the database, so you can use the same salt to verify the password. Today's libraries often include the salt in the resulting hash value, like this (result of the PHP function password_hash()):
$2y$10$nOUIs5kJ7naTuTFkBy1veuK0kSxUFXfuaOKdOKf9xYT0KKIGSJwFa
 |  |  |                     |
 |  |  |                     hash-value = K0kSxUFXfuaOKdOKf9xYT0KKIGSJwFa
 |  |  |
 |  |  salt = nOUIs5kJ7naTuTFkBy1veu (22 characters)
 |  |
 |  cost-factor = 10 = 2^10 iterations
 |
 hash-algorithm = 2y = BCrypt
This 60-character string can be stored in a single database field. The verifying function can extract the salt from this string. The salt is not a secret; it fulfills its purpose even when it is known.
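To make the extraction step concrete, here is a minimal Python sketch that splits the example string into its parts by position. The helper name split_bcrypt_hash is hypothetical, and a real application should call its library's verify function rather than parse the string by hand.

```python
def split_bcrypt_hash(stored: str) -> dict:
    """Split a 60-character BCrypt string into its parts (illustrative helper)."""
    # Layout: $<algorithm>$<cost>$<22-character salt><31-character hash>
    _, algorithm, cost, salt_and_hash = stored.split("$")
    return {
        "algorithm": algorithm,
        "cost": int(cost),
        "salt": salt_and_hash[:22],
        "hash": salt_and_hash[22:],
    }


parts = split_bcrypt_hash("$2y$10$nOUIs5kJ7naTuTFkBy1veuK0kSxUFXfuaOKdOKf9xYT0KKIGSJwFa")
print(parts["salt"])  # nOUIs5kJ7naTuTFkBy1veu
```

Everything needed to recompute the hash from a login attempt sits in that one field, which is why no separate salt column is required.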
Please note that algorithms like MD5 and SHA-* are not appropriate to hash passwords, because they are too fast. Instead use an algorithm with a cost factor like BCrypt or PBKDF2. For more information you can have a look at my tutorial about safely storing passwords.

Yes, you store the salt. Salting defeats pregenerated rainbow tables; the salt does not need to be secret, only unique per password and unpredictable.
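As a rough illustration of storing the salt next to the hash, the following Python sketch keeps both in the same user record and reuses the stored salt at login. It uses PBKDF2 from the standard library; the record layout, the function names, and the iteration count are assumptions made for this example, not a recommendation for production values.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # assumed work factor for this sketch; tune for your hardware


def create_record(username: str, password: str) -> dict:
    salt = os.urandom(16)  # a new random salt for every password
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    # The salt is stored in the clear, right beside the hash
    return {"username": username, "salt": salt, "hash": digest}


def check_password(record: dict, attempt: str) -> bool:
    # Reuse the stored salt so the same password yields the same digest
    digest = hashlib.pbkdf2_hmac("sha256", attempt.encode(), record["salt"], ITERATIONS)
    return hmac.compare_digest(digest, record["hash"])


record = create_record("alice", "correct horse")
print(check_password(record, "correct horse"))  # True
print(check_password(record, "wrong guess"))    # False
```

Reading the salt back by username does not weaken anything: an attacker who steals the table still has to run the slow hash separately for every guess against every user.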

Related

When using werkzeug.security.generate_password_hash(), what is the difference between the terminology of "iterations" and "salt_length"?

I'm looking at the documentation for werkzeug.security.generate_password_hash() and it mentions that you can specify the salt_length. I understand that the salt_length refers to the number of times that the salt is added to the most recent hash and hashed to create a new hash, but I'm not sure what the iterations in the documentation refers to. When I print the output of the generate_password_hash method, it returns "pbkdf2:sha256:150000$randomsalt$resultinghash" and I'm assuming the 150000 is the iterations value, but I have no idea what this means or how it affects the output. Can someone please explain this to me? And according to the documentation, does this mean that salt_length=8 is the default value?
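Salt length is the length of the random salt string.
Iterations is the number of times the underlying hash function is applied, with each round feeding on the output of the previous one. The salt is mixed in at the start; it is not added again on every round.
Example: let the password be "hash" and the salt be "salt", so the salt length is 4.
If I chain the hash 10 times (a simplified illustration, not the exact PBKDF2 construction):
import hashlib
digest = b"hash"
salt = b"salt"
# Mix the salt in once, then feed each digest back into the hash function
digest = hashlib.sha256(salt + digest).digest()
for i in range(9):
    digest = hashlib.sha256(digest).digest()
stretched_hash = digest
Iterations is 10 and salt_length is 4.
From the docs, the method can be provided as "pbkdf2:<method>[:iterations]",
where method can be md5, sha256, or another supported hash. Iterations can be specified in this string too. And yes, 150000 is the iterations value. The more iterations, the more work each password guess costs an attacker, so the password is harder to crack.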
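For reference, this is how PBKDF2 (as specified in RFC 8018) actually uses the iteration count: the salt enters only the first HMAC call, every later round applies HMAC to the previous round's output, and all rounds are XORed together. The sketch below is a hand-written single-block version for illustration only (the function name is hypothetical), checked against Python's built-in hashlib.pbkdf2_hmac.

```python
import hashlib
import hmac


def pbkdf2_single_block(password: bytes, salt: bytes, iterations: int) -> bytes:
    """First PBKDF2-HMAC-SHA256 output block, written out for illustration."""
    # Round 1: the salt (plus a 4-byte block index) is mixed in here, and only here
    u = hmac.new(password, salt + (1).to_bytes(4, "big"), hashlib.sha256).digest()
    result = int.from_bytes(u, "big")
    # Rounds 2..iterations: feed the previous output back in and XOR it into the result
    for _ in range(iterations - 1):
        u = hmac.new(password, u, hashlib.sha256).digest()
        result ^= int.from_bytes(u, "big")
    return result.to_bytes(32, "big")


manual = pbkdf2_single_block(b"hash", b"salt", 10)
library = hashlib.pbkdf2_hmac("sha256", b"hash", b"salt", 10)
print(manual == library)  # True
```

Raising the iteration count lengthens the loop, so the cost of each guess grows linearly for the legitimate server and the attacker alike.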

Reversing a password-hashing function knowing plaintext + output

I have a password that has been stored and I'd like to figure out how it's been 'transformed' to be stored in my database.
The plaintext password is:
k4oK203$
And the password as it is stored 'crypted' in my database is:
6xqmRr0QNUrc0uvwGchWqA==
How would I go about figuring out which transformation (base64? sha1? md5? etc.) was used to turn the plaintext password into the database value?
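One possible starting point, sketched in Python: the trailing "==" suggests Base64, so decode the stored value, look at its length, and compare it against a few candidate digests of the plaintext. The candidate list is only a guess at common schemes. An empty result means none of them applies, for example because a salt, a key, or a different algorithm is involved.

```python
import base64
import hashlib

plaintext = "k4oK203$"
stored = "6xqmRr0QNUrc0uvwGchWqA=="

raw = base64.b64decode(stored)
print(len(raw))  # 16 bytes, i.e. 128 bits

# Hypothetical candidates: unsalted digests of the plaintext in common encodings
candidates = {
    "md5/utf-8": hashlib.md5(plaintext.encode("utf-8")).digest(),
    "md5/utf-16-le": hashlib.md5(plaintext.encode("utf-16-le")).digest(),
    "sha1 truncated to 16 bytes": hashlib.sha1(plaintext.encode("utf-8")).digest()[:16],
}
matches = [name for name, value in candidates.items() if value == raw]
print(matches)
```

Sixteen bytes is the size of an MD5 digest but also of a single AES block, so the length narrows the options without settling the question.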

Data hashing in Pentaho

Can anyone suggest the best options in Pentaho for my requirement? We need to convert the first_name and last_name attributes into hashes and load the hash values for these columns into the user table to support the business reports. The reports do not need the actual values of these columns; the reporting code only checks for NULL values in first_name and last_name and validates the length of these fields.
I tried converting the fields to hashes using the Add checksum transformation but wasn't sure which type of checksum to use (CRC 32, ADLER 32, MD5, SHA-1). Any suggestions?
The source and target databases are PostgreSQL, in case that matters.
Thanks in advance.
Hashing and encryption are not the same thing.
It seems you want a one-way hash. What hash you choose depends mainly on how much you care about collisions. If you don't care that multiple names could generate the same hash, a short fast hash like CRC32 is fine. If you do care about collisions then I'd use at least MD5.
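To give a feel for the difference in output size, here is a small Python sketch (outside Pentaho, purely illustrative) that computes both checksums for a sample name. With only 32 bits, roughly 77,000 distinct values are enough for about a 50% chance of at least one CRC32 collision, whereas accidental MD5 collisions are not a practical concern at that scale.

```python
import hashlib
import zlib

name = "Alice".encode("utf-8")  # sample value, not taken from the question

crc = zlib.crc32(name)                   # 32-bit unsigned integer
md5_hex = hashlib.md5(name).hexdigest()  # 128 bits as 32 hex characters

print(f"{crc:08x}")
print(md5_hex)
```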

Is there a way to disable Redshift password requirements?

According to Amazon Redshift docs, the passwords must be at least 8 chars, and contain at least one uppercase letter, one lowercase letter, and one number.
Is there a way to disable this for a database?
We do not need such stringent requirements.
Also, the docs are unclear, but if I don't specify VALID UNTIL 'something' then it is valid forever, right? (The docs say you can also use VALID UNTIL 'infinity' but don't explain what would happen if you don't include VALID UNTIL at all)
You cannot modify the Redshift password criteria.
If you are referring to ALTER USER ... [VALID UNTIL], the validity date is not a required field. The password will remain valid forever.
By using the md5 function, you can get around the length/character requirements:
https://docs.aws.amazon.com/redshift/latest/dg/r_CREATE_USER.html
In particular, quoting from the above page:
To specify an MD5 password, follow these steps:
1. Concatenate the password and user name.
   For example, for password ez and user user1, the concatenated string is ezuser1.
2. Convert the concatenated string into a 32-character MD5 hash string. You can use any MD5 utility to create the hash string. The following example uses the Amazon Redshift MD5 Function and the concatenation operator ( || ) to return a 32-character MD5-hash string.
   select md5('ez' || 'user1');
   md5
   153c434b4b77c89e6b94f12c5393af5b
3. Concatenate 'md5' in front of the MD5 hash string and provide the concatenated string as the md5hash argument.
   create user user1 password 'md5153c434b4b77c89e6b94f12c5393af5b';
4. Log on to the database using the user name and password.
   For this example, log on as user1 with password ez.
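If you would rather build the md5hash argument outside the database, the same steps can be sketched in Python. The helper name redshift_md5_password is hypothetical; it only mirrors the concatenate, hash, and prefix steps quoted above.

```python
import hashlib


def redshift_md5_password(password: str, username: str) -> str:
    """Return 'md5' + MD5(password + username), following the quoted steps."""
    digest = hashlib.md5((password + username).encode("utf-8")).hexdigest()
    return "md5" + digest


value = redshift_md5_password("ez", "user1")
print(f"create user user1 password '{value}';")
```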

BCrypt: How to determine whether two hashes refer to the same password

I wonder how BCrypt can check the correctness of an entered password if the generated hash is different on each run.
Example:
Given password: "password123"
Let's say I hash the given password 10 times and receive:
$2a$10$Uw0LDj343yp1tIpouRwHGeWflT3.QjDp9DeJ2XiwTIHf1T.pjEy0i
$2a$10$uYWUCEnh4gn00w57VSrYjej.UvhzBL8Wf2doTAGSGfhUMtuGr5bha
$2a$10$cJi3XOkRxxicDjEBibNhNOg5MGM.G/.p70KE75.44ayPQo8kCDxUu
$2a$10$qLcN2obMThH544U967JM5OS0vtcfP.Iq1.f0mZdvWfyeIoWHyr422
$2a$10$5/JssXqJyGHeMQlB4pr7zebTRFSt/2iwYJHF5f7.jdlTxbH4c9Sjq
$2a$10$La1UQKu306aNWkhhfhC5XeX7mfcnfbSchBIpLG6O57gur/U/n/fua
$2a$10$xTzEGVfc1D1UHFeMO95ktOJGFT79ybKUKN.z.MidMjP1XfAeElNEi
$2a$10$i9Y.1Ix6PL1bDwoTYtC49.Y0LKpar/S5qC1SkzFB4vnafikOhHSga
$2a$10$FJNTj5xeVbIcMaf9EhodHu9jJLrJL53QHQK9OuemwMh3WuTfxXEqu
$2a$10$OXMToK5CXeNtRHC3w7eqe.Mr7p4fJanbE28E2Y3MHh6f6cq1chyE6
Assume that I store the first hash in my database and a user tries to log in a few hours later with the correct password. The hash generated during that login attempt is totally different from the hash stored in my database.
How does BCrypt determine whether the two hashes refer to the same password?
The hash-values in your example contain all the necessary information to do the verification:
$2y$10$nOUIs5kJ7naTuTFkBy1veuK0kSxUFXfuaOKdOKf9xYT0KKIGSJwFa
 |  |  |                     |
 |  |  |                     hash-value = K0kSxUFXfuaOKdOKf9xYT0KKIGSJwFa
 |  |  |
 |  |  salt = nOUIs5kJ7naTuTFkBy1veu
 |  |
 |  cost-factor = 10 = 2^10 iterations
 |
 hash-algorithm = 2y = BCrypt
As you can see, this string contains the algorithm, the cost factor, and the salt. With these parameters you can calculate a comparable hash value from the login password. In PHP you can use the function password_verify() to verify the password; it extracts the cost factor and the salt automatically.
// Check whether the hash of the entered login password matches the stored hash.
// The salt and the cost factor will be extracted from $existingHashFromDb.
$isPasswordCorrect = password_verify($password, $existingHashFromDb);
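The same idea can be reproduced in Python with only the standard library. This sketch uses PBKDF2 as a stand-in for BCrypt, and the "$"-separated string layout is made up for the example. It shows that hashing the same password twice gives two different strings, yet each verifies, because the verifier reads the salt and cost back out of the stored string instead of generating new ones.

```python
import base64
import hashlib
import hmac
import os


def hash_password(password: str, iterations: int = 100_000) -> str:
    salt = os.urandom(16)  # fresh random salt, so every call gives a different string
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
    # Pack algorithm label, cost, salt, and digest into one storable string
    return "$".join([
        "pbkdf2-sha256",
        str(iterations),
        base64.b64encode(salt).decode(),
        base64.b64encode(digest).decode(),
    ])


def verify_password(password: str, stored: str) -> bool:
    # Recover the cost and salt from the stored string, then recompute
    _, iterations, salt_b64, digest_b64 = stored.split("$")
    salt = base64.b64decode(salt_b64)
    expected = base64.b64decode(digest_b64)
    actual = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, int(iterations))
    return hmac.compare_digest(actual, expected)


first = hash_password("password123")
second = hash_password("password123")
print(first != second)                         # True: the salts differ
print(verify_password("password123", first))   # True
print(verify_password("password123", second))  # True
print(verify_password("letmein", first))       # False
```

So a login check never compares two freshly generated hashes; it recomputes one hash using the parameters embedded in the stored value.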