In a tutorial about hashing and salting passwords I saw the hash+salt being performed multiple times by using a for loop.
$salt = dechex(mt_rand(0, 2147483647)) . dechex(mt_rand(0, 2147483647));
$password = hash('sha256', $_POST['password'] . $salt);
for ($round = 0; $round < 65536; $round++)
{
    $password = hash('sha256', $password . $salt);
}
What is the advantage of using such a method? Is it more secure against for example brute force methods?
Also: Should I be considering another hashing algorithm other than sha256? I know there is no clear-cut answer because it's likely to depend on many factors such as degree of safety, speed etc. But are there any recommendations for let's say a fairly simple website?
The reason to iterate the hashing many times is to slow down the calculation. Today (in 2013) you can calculate about 1.4 billion SHA-256 hashes per second with common hardware, so you can brute-force a whole English dictionary of about 500,000 words in a fraction of a millisecond.
That's why one should use a slow key-derivation function like BCrypt or PBKDF2 to hash passwords. Using some milliseconds for a login is no problem, but brute-forcing with only 1000 words per second is not practicable.
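To make that arithmetic concrete, here is a small Python sketch. The 1.4 billion hashes per second, the 500,000-word dictionary and the 1000 guesses per second are the figures from this answer; the function and constant names are my own.

```python
def seconds_to_try_all(candidates: int, guesses_per_second: float) -> float:
    """Time needed to test every candidate password once."""
    return candidates / guesses_per_second


DICTIONARY_WORDS = 500_000   # dictionary size quoted in the answer
FAST_SHA256_RATE = 1.4e9     # plain SHA-256 hashes per second (2013 figure)
SLOW_KDF_RATE = 1_000        # guesses per second against a slow KDF

fast = seconds_to_try_all(DICTIONARY_WORDS, FAST_SHA256_RATE)
slow = seconds_to_try_all(DICTIONARY_WORDS, SLOW_KDF_RATE)

print(f"plain SHA-256: {fast * 1000:.2f} ms")    # 0.36 ms
print(f"slow KDF:      {slow / 60:.1f} minutes")  # 8.3 minutes
print(f"slowdown:      {slow / fast:,.0f}x")
```

The dictionary itself is small either way; what matters is that the same slowdown factor applies to every larger search space the attacker tries next.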
PHP 5.5 will have its own functions password_hash() and password_verify() ready, to simplify generating BCrypt hashes. I strongly recommend using this excellent API, or its compatibility pack for earlier PHP versions. The usage is very straightforward:
// Hash a new password for storing in the database.
// The function automatically generates a cryptographically safe salt.
$hashToStoreInDb = password_hash($password, PASSWORD_BCRYPT);
// Check if the hash of the entered login password, matches the stored hash.
// The salt and the cost factor will be extracted from $existingHashFromDb.
$isPasswordCorrect = password_verify($password, $existingHashFromDb);
If you are interested in a more detailed answer, you may have a look at my tutorial about safely storing passwords.
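For readers outside PHP, the idea that the salt and cost factor travel inside the stored string can be sketched in Python. This is not how password_hash() works internally (that uses BCrypt); it is a stand-in built on PBKDF2 from the standard library, and the `iterations$salt$hash` layout, the function names and the iteration count are assumptions of mine.

```python
import hashlib
import hmac
import secrets

ITERATIONS = 200_000  # assumed cost; tune it for your own hardware


def hash_password(password: str, iterations: int = ITERATIONS) -> str:
    """Return 'iterations$salt_hex$hash_hex' using a fresh random 128-bit salt."""
    salt = secrets.token_bytes(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return f"{iterations}${salt.hex()}${digest.hex()}"


def verify_password(password: str, stored: str) -> bool:
    """Re-derive the hash with the salt and cost embedded in the stored string."""
    iterations_text, salt_hex, hash_hex = stored.split("$")
    digest = hashlib.pbkdf2_hmac(
        "sha256",
        password.encode("utf-8"),
        bytes.fromhex(salt_hex),
        int(iterations_text),
    )
    # Constant-time comparison avoids leaking how many bytes matched.
    return hmac.compare_digest(digest, bytes.fromhex(hash_hex))
```

Because each stored string carries its own parameters, raising ITERATIONS later does not break verification of older entries.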
http://www.openwall.com/phpass/ is one of the more widely used solutions; I would look at implementing that.
The reason loops like that exist is to slow down the hashing process. Reverse lookup tables take longer to build because there are more calculations in place.
By having that loop there you have slowed down someone scanning for passwords by a factor of up to 65536.
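A rough Python rendering of the tutorial's loop shows where that factor comes from. The function name, the extra call counter and the sample password and salt are mine; the logic is meant to mirror the PHP in the question (hex digest, salt appended each round).

```python
import hashlib
from typing import Tuple


def stretched_hash(password: str, salt: str, rounds: int = 65536) -> Tuple[str, int]:
    """Mirror the tutorial's loop and also report how many SHA-256 calls it made."""
    digest = hashlib.sha256((password + salt).encode("utf-8")).hexdigest()
    calls = 1
    for _ in range(rounds):
        # Each round hashes the previous hex digest together with the salt.
        digest = hashlib.sha256((digest + salt).encode("utf-8")).hexdigest()
        calls += 1
    return digest, calls


digest, calls = stretched_hash("hunter2", "a1b2c3d4")
print(calls)  # 65537 hash calls for a single password guess
```

Every guess the attacker makes costs the same 65537 calls, which is the whole point of the loop.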
I'm creating a new login system for my single page app. This system will require an administrator to create accounts for the users. Once they set up an account for a user, I will send the user an email where they have to enter their information, such as a security question and password. So I have done some research and looked over our existing system. There is a hash function that is used together with a salt. I read a few articles and there is a lot of debate about hashes being vulnerable. I also see that in this case the hashed password is stored as well as the salt, in separate columns. Is it good practice to store the salt in the DB? Also, is there a better way to store passwords in a database? Here is an example of the logic that I found:
<cfset password = trim(FORM.password)>
<cfset salt = randomSalt()> <!--- randomSalt() is a function that generates a random salt. --->
<cfset totPW = password & salt>
<cfset hashedPW = hash(totPW,"SHA-256")>
I'm currently using ColdFusion 2016. I'm not sure if there is some better way to encrypt the password in CF. If anyone can provide a useful resource or example, please let me know. Thanks.
Generally speaking, hashing is still fine these days. What matters is which hashing algorithm you use (and how many iterations). In case of a database leak, you want to make it as difficult as possible to match inputs (common-password or dictionary-based attacks). Salting does help a little bit, and so does having an irregular salt pattern or a hidden number of iterations based on the username etc. Having different hashing strategies helps as long as the attacker doesn't know how your hashing is implemented. (Accessing the database server is one thing, accessing your source code another.) It's about causing effort for the attacker.
Now about hashing algorithms: SHA-2 is easier to attack than, for example, bcrypt because it can be computed efficiently on GPUs. Increasing the number of iterations makes each guess cost the attacker more time. bcrypt isn't supported by hash(), but at least SHA-512 is. hash() supports iterations (see docs). A rule of thumb is to pick an iteration count that takes at least a second to process on your server hardware.
On a side note: Don't trim the password input, people might intend to use leading/trailing whitespaces.
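The rule of thumb about picking an iteration count by measuring it on your own server can be sketched as follows. This is Python rather than CFML, it uses PBKDF2-HMAC-SHA512 as a stand-in for an iterated SHA-512, and the function name and starting count are assumptions of mine.

```python
import hashlib
import time


def calibrate_iterations(target_seconds: float, start: int = 10_000) -> int:
    """Double the iteration count until one derivation takes at least target_seconds."""
    iterations = start
    while True:
        began = time.perf_counter()
        hashlib.pbkdf2_hmac("sha512", b"benchmark password", b"benchmark salt", iterations)
        if time.perf_counter() - began >= target_seconds:
            return iterations
        iterations *= 2
```

Calling `calibrate_iterations(1.0)` once on the production hardware gives a count to store in configuration; it should be re-measured when the hardware changes.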
I have been reading on correct procedures for storing and checking user passwords, and am puzzling a little with salts.
I get the idea that they are to prevent use of such tools as rainbow tables, but to me the idea of storing the salt along with the hash seems a potential security problem. Too much data in one place for my liking.
The idea I have that I would like to run by folks is to use a 'lucky number' to create a salt from a portion of the password hash.
Basically, along with choosing a password, a user would choose a 'lucky number' as well. This number would then be used as the starting index to retrieve a salt from the hashed pass.
So a very basic example would be something like this.
Pass = "Password"
Lucky Number = "4"
Pass Hash = "00003gebdksjh2h4"
Salt Length = "5"
Resulting Salt = "3gebd"
My thinking is that, as the 'lucky number' would not need to be stored, the salt would also take computational time to work out and so make any attacks even more difficult. Plus it means storing slightly less data.
It wouldn't add more security. A rainbow table is a precomputed map from dictionary words (and combinations of them) to their hashes. By using a more-or-less distinct salt (which in general shouldn't be found in a dictionary) for each password hash you compute, you force the attacker to generate a new set of rainbow tables for each entry, which would be prohibitively computationally expensive. At no point (except when generating the table) does the attacker calculate hashes, so your strategy fails here.
On the other hand when using a regular dictionary attack the attacker only needs a constant-time computation to calculate the salt, which he would have to compute only once for many millions or more combinations of hashes he would have to generate. It would only work if it was more computationally expensive than generating all that, but then you'd have to make the same computation each time a user wants to log in, which is not feasible.
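A minimal sketch of why the 'lucky number' salt adds so little, assuming the scheme works roughly as the question describes (SHA-256 stands in for the unnamed hash, and all names here are hypothetical). The salt is a function of the password alone, so equal passwords still collide, and an attacker only pays a small constant factor per dictionary word.

```python
import hashlib
from typing import List, Optional, Tuple

SALT_LENGTH = 5  # as in the question's example


def lucky_hash(password: str, lucky_number: int) -> str:
    """Proposed scheme: the salt is a slice of the password's own hash."""
    base = hashlib.sha256(password.encode("utf-8")).hexdigest()
    salt = base[lucky_number:lucky_number + SALT_LENGTH]
    return hashlib.sha256((password + salt).encode("utf-8")).hexdigest()


def crack(stored: str, dictionary: List[str]) -> Optional[Tuple[str, int]]:
    """Dictionary attack: try every possible lucky number for every word."""
    for word in dictionary:
        for lucky in range(64 - SALT_LENGTH + 1):  # at most 60 start offsets
            if lucky_hash(word, lucky) == stored:
                return word, lucky
    return None


alice = lucky_hash("Password", 4)
bob = lucky_hash("Password", 4)
print(alice == bob)  # True: same password and number give the same stored value
print(crack(alice, ["letmein", "Password"]))
```

With a random stored salt, by contrast, alice and bob would get different values and a table built for one entry would be useless for the other.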
Disclaimer: there are many similar questions on SO, but I am looking for a practical suggestion instead of just general principles. Also, feel free to point out implementations of the "ideal" algorithm (PHP would be nice ;), but please provide specifics (how it works).
What is the best way to calculate hash string of a password for storing in a database? I know I should:
use salt
iterate hashing process multiple times (hash chaining for key stretching)
I was thinking of using such algorithm:
x = md5( salt + password);
repeat N-times:
x = md5( salt + password + x );
I am sure this is quite safe, but there are a few questions that come to mind:
would it be beneficial to include username in salt?
I have decided to use a common salt for all users, any downside in this?
what is the recommended minimum salt length, if any?
should I use md5, sha or something else?
is there anything wrong with the above algorithm / any suggestions?
... (feel free to provide more :)
I know the decisions necessarily depend on the situation, but I am looking for a solution that would:
provide as much security as possible
be fast enough ( < 0.5 second on a decent machine )
So, what would the ideal algorithm look like, preferably in pseudo-code?
The "ideal" password hashing function, right now, is bcrypt. It includes salt processing and a configurable number of iterations. There is a free opensource PHP implementation.
Second best would be PBKDF2, which relies on an underlying hash function and is somewhat similar to what you suggest. There are technical reasons why bcrypt is "better" than PBKDF2.
As for your specific questions:
1. would it be beneficial to include username in salt?
Not really.
2. I have decided to use a common salt for all users, any downside in this?
Yes: it removes the benefits of having a salt. The salt's sole reason to exist is to be unique for each hashed password. This prevents an attacker from attacking two hashed passwords with less effort than twice that of attacking one hashed password. Salts must be unique. Even having a per-user salt is bad: the salt must also be changed when a user changes his password. The kind of optimization that an attacker may apply when a salt is reused / shared includes (but is not limited to) tables of precomputed hashes, such as rainbow tables.
3. what is the recommended minimum salt length, if any?
A salt must be unique. Uniqueness is a hard property to maintain. But by using long enough random salts (generated with a good random number generator, preferably a cryptographically strong one), you get uniqueness with a high enough probability. 128-bit salts are long enough.
4. should I use md5, sha or something else?
MD5 is bad for public relations. It has known weaknesses, which may or may not apply to a given usage, and it is very hard to "prove" with any kind of reliability that these weaknesses do not apply to a specific situation. SHA-1 is better, but not "good", because it also has weaknesses, albeit much less serious ones than MD5's. SHA-256 is a reasonable choice. As was pointed out above, for password hashing, you want a function which does not scale well on parallel architectures such as GPU, and SHA-256 scales well, which is why the Blowfish-derivative used in bcrypt is preferable.
5. is there anything wrong with the above algorithm / any suggestions?
It is homemade. That's bad. The trouble is that there is no known test for the security of a cryptographic algorithm. The best we can hope for is to let a few hundred professional cryptographers try to break an algorithm for a few years -- if they cannot, then we can say that although the algorithm is not really "proven" to be secure, at least weaknesses must not be obvious. Bcrypt has been around, widely deployed, and analyzed for 12 years. You cannot beat that by yourself, even with the help of StackOverflow.
As a professional cryptographer myself, I would raise a suspicious eyebrow at the use of simple concatenation in MD5 or even SHA-256: these are Merkle–Damgård hash functions, which is fine for collision resistance but does not provide a random oracle (there is the so-called "length extension attack"). In PBKDF2, the hash function is not used directly, but through HMAC.
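The answers to questions 2 and 3 can be illustrated with a short Python sketch. It uses PBKDF2-HMAC-SHA256 from the standard library (so the hash is applied through HMAC rather than by plain concatenation); the function name, the iteration count and the sample password are assumptions of mine.

```python
import hashlib
import secrets


def derive(password: str, salt: bytes, iterations: int = 100_000) -> bytes:
    """PBKDF2-HMAC-SHA256 over the password with the given salt."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)


COMMON_SALT = b"one-salt-for-everyone"

# Shared salt: two users with the same password get identical hashes.
shared_a = derive("correct horse", COMMON_SALT)
shared_b = derive("correct horse", COMMON_SALT)

# Fresh 128-bit random salt per hashed password.
salt_a = secrets.token_bytes(16)
salt_b = secrets.token_bytes(16)
unique_a = derive("correct horse", salt_a)
unique_b = derive("correct horse", salt_b)

print(shared_a == shared_b)  # True: one cracked hash reveals both accounts
print(unique_a == unique_b)  # False, unless two random 128-bit salts collide
```

Generating a new salt inside the code path that hashes a password, rather than once per user, also covers the case where a user changes his password.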
I tend to use a fixed application salt, the username and the password
Example...
string prehash = "mysaltvalue" + "myusername" + "mypassword";
The benefit here is that people using the same password don't end up with the same hash value, and it prevents people with access to the database from copying their password hash over another user's - of course, if you can access the DB you don't really need to hack a login to get the data ;)
IMO, salt length doesn't matter too much; the hashed value length is always going to be 32 anyway (using MD5 - which again is what I would use)
I would say in terms of security, this password encryption is enough; the most important thing is to make sure your application/database has no security leaks in it!
Also, I wouldn't bother with repeated hashing, no point in my opinion. Somebody would have to know your algorithm to try to hack it that way, and then it doesn't matter if it is hashed once or many times: if they know it, they know it
How is bcrypt stronger than, say,
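import hashlib

def md5lots(password, salt, rounds):
    # Recursively re-hash the previous result together with the salt.
    if rounds < 1:
        return password
    newpass = hashlib.md5((password + salt).encode()).hexdigest()
    return md5lots(newpass, salt, rounds - 1)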
I get the feeling, given its hype, that more intelligent people than me have figured out that bcrypt is better than this. Could someone explain the difference in 'smart layman' terms?
The principal difference - MD5 and other hash functions designed to verify data have been designed to be fast, and bcrypt() has been designed to be slow.
When you are verifying data, you want the speed, because you want to verify the data as fast as possible.
When you are trying to protect credentials, the speed works against you. An attacker with a copy of a password hash will be able to execute many more brute force attacks because MD5 and SHA1, etc, are cheap to execute.
bcrypt in contrast is deliberately expensive. This matters little when there are one or two tries to authenticate by the genuine user, but is much more costly to brute-force.
There are three significant differences between bcrypt and hashing multiple times with MD5:
The size of the output: 128 bits (16 bytes) for MD5, while a stored bcrypt hash is a 60-character string (a 184-bit hash plus the salt, cost and version prefix). If you store millions of hashes in a database, this has to be taken into account.
Collisions and preimage attacks are possible against MD5.
Bcrypt can be configured to iterate more and more as CPUs become more and more powerful.
Hence, using salting-and-stretching with MD5 is not as safe as using bcrypt. This issue can be solved by selecting a better hash function than MD5.
For example, if SHA-256 is selected, the output size will be 256-bits (32-bytes). If the salting-and-stretching can be configured to increase the number of iterations like bcrypt, then there is no difference between both methods, except the amount of space required to store result hashes.
You are effectively talking about implementing PBKDF2 (Password-Based Key Derivation Function 2). It serves the same purpose as BCrypt: you can lengthen the amount of CPU time it takes to derive a password hash. The advantage of this over something like BCrypt is that, by knowing how many 'iterations' you have put the password through, when you need to increase it you could do it without resetting all the passwords in the database. Just have your algorithm pick up the end result as if it were at the nth iteration (where n is the previous iteration count) and keep going!
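The "pick up where you left off" idea can be sketched for a plain hash chain like the one in the question. Note the assumption: this holds when each round depends only on the previous digest and the salt. As far as I know, standard PBKDF2 also feeds the password into every round, so there the stored value alone would not be enough. The names below are hypothetical.

```python
import hashlib


def chain(value: bytes, salt: bytes, rounds: int) -> bytes:
    """Plain hash chain: each round hashes only the previous digest plus the salt."""
    for _ in range(rounds):
        value = hashlib.sha256(value + salt).digest()
    return value


salt = b"per-password-salt"
stored_at_1000 = chain(b"my password", salt, 1000)

# Upgrade the stored value from 1000 to 5000 rounds without knowing the password.
upgraded = chain(stored_at_1000, salt, 4000)
from_scratch = chain(b"my password", salt, 5000)

print(upgraded == from_scratch)  # True
```

The iteration count has to be stored next to each hash so that the login code knows how many rounds to apply.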
It is recommended that you use a proper PBKDF2 library instead of creating your own because, let's face it, as with all cryptography, the only way you know if something is safe is if it has been 'tested' by the interwebs. (see here)
Systems that use this method:
.NET has a library already implemented. See it here
Mac, Linux and Windows file encryption uses many-iteration (10,000+) versions of this method to secure their file systems.
Wi-Fi networks are often secured using this method of encryption
Thanks for asking the question; it forced me to research the method I was using for securing my passwords.
TTD
Although this question is already answered, I would like to point out a subtle difference between BCrypt and your hashing loop. I will ignore the deprecated MD5 algorithm and the exponential cost factor, because you could easily improve these in your question.
You are calculating a hash value and then using the result to calculate the next hash value. If you look at the implementation of BCrypt, you can see that each iteration uses the resulting hash value as well as the original password (key).
Eksblowfish(cost, salt, key)
    state = InitState()
    state = ExpandKey(state, salt, key)
    repeat (2^cost)
        state = ExpandKey(state, 0, key)
        state = ExpandKey(state, 0, salt)
    return state
This is the reason you cannot take a Bcrypt-hashed password and continue iterating: you would have to know the original password. I cannot prove it, but I suppose this makes Bcrypt safer than a simple hashing loop.
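The effect of mixing the key into every round can be imitated in a few lines of Python. This is not Eksblowfish; it is a loose stand-in that uses HMAC-SHA256 keyed with the password, and all names are hypothetical. It only illustrates that continuing from a stored state requires the original password.

```python
import hashlib
import hmac


def keyed_chain(password: bytes, salt: bytes, rounds: int, state: bytes = b"") -> bytes:
    """Each round is keyed with the password, so the state alone is not enough."""
    state = state or salt  # start from the salt unless resuming from a stored state
    for _ in range(rounds):
        state = hmac.new(password, state, hashlib.sha256).digest()
    return state


salt = b"per-password-salt"
stored = keyed_chain(b"my password", salt, 1000)
target = keyed_chain(b"my password", salt, 5000)

# Resuming works only when the real password is supplied again.
print(keyed_chain(b"my password", salt, 4000, stored) == target)  # True
print(keyed_chain(b"a guess", salt, 4000, stored) == target)      # False
```

In practice this means the cost of such hashes can only be raised when the user next logs in and the password is available in memory.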
Strictly speaking, bcrypt actually encrypts the text:
OrpheanBeholderScryDoubt
64 times.
But it does it with a key that was derived from your password and some randomly generated salt.
Password hashing is not hashing
The real virtue of "password hashing algorithms" (like bcrypt) is that they use a lot of RAM.
SHA2 is designed to be fast. If you're a real-time web server and you want to validate file integrity, you want something that runs extraordinarily fast, with extraordinarily low resource usage. That is the antithesis of password hashing.
SHA2 is designed to be fast
SHA2 can operate with 128 bytes of RAM
SHA2 is easily implementable in hardware
I own a USB stick device that can calculate 330 million hashes per second
in fact, I own 17 of them
If you perform a "fast" hash multiple times (e.g. 10,000 is a common recommendation for PBKDF2), then you're not really adding any security.
What you need is a hash that is difficult to implement in hardware. What you need is a hash that is hard to parallelize on a GPU.
Over the last few decades we've learned that RAM is the key to slowing down password hashing attempts. Custom hardware shines at performing raw computation (in fact, only 1% of your CPU is dedicated to computation - the rest is dedicated to jitting the machine instructions into something faster: pre-fetching, out-of-order execution, branch prediction, cache). The way to stymie custom hardware is to make the algorithm touch a lot of RAM.
SHA2: 128 bytes
bcrypt: 4 KB
scrypt (configurable): 16 MB in LiteCoin
Argon2 (configurable): 64 MB in documentation examples
Password hashing does not mean simply using a fast hash multiple times.
A modern recommended bcrypt cost factor is 12; so that it takes about 250 ms to compute.
you would have to perform about 330,000 iterations of SHA2 to equal that time cost on a modern single-core CPU
But then we get back to my 2.5W, USB, SHA2 stick and its 330 Mhashes/sec. In order to defend against that, it would have to be 83M iterations.
If you try to add only CPU cost, you're losing.
You have to add memory cost
bcrypt is 21 years old, and it only uses 4KB. But it is still ~infinitely better than any amount of MD5, SHA-1, or SHA2 hashing.
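The iteration arithmetic above and the idea of adding memory cost can be sketched in Python. The 330 Mhashes/sec and 250 ms figures come from this answer; the helper names are mine, the 128 * N * r formula is the usual approximation for scrypt's working memory, and the parameters N=2^14, r=8, p=1 are a commonly used setting rather than a recommendation. hashlib.scrypt needs a Python build linked against OpenSSL 1.1 or later.

```python
import hashlib
import secrets


def iterations_to_match(hashes_per_second: float, target_seconds: float) -> int:
    """Chained fast hashes needed to keep the given hardware busy for target_seconds."""
    return round(hashes_per_second * target_seconds)


def scrypt_memory_bytes(n: int, r: int) -> int:
    """Approximate scrypt working memory: 128 * N * r bytes."""
    return 128 * n * r


print(iterations_to_match(330e6, 0.25))        # 82500000, the ~83M quoted above
print(scrypt_memory_bytes(2**14, 8) // 2**20)  # 16 (MiB)

# A memory-hard derivation: every guess has to touch about 16 MiB of RAM.
salt = secrets.token_bytes(16)
key = hashlib.scrypt(
    b"my password", salt=salt, n=2**14, r=8, p=1,
    maxmem=64 * 1024 * 1024, dklen=32,
)
```

Raising n doubles both the time and the memory per guess, which is what makes massively parallel hardware expensive to build against it.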
In a web application written in Perl and using PostgreSQL the users have username and password. What would be the recommended way to store the passwords?
Encrypting them using the crypt() function of Perl and a random salt? That would limit the useful length of passwords to 8 characters and would require fetching the stored password in order to compare it to the one given by the user when authenticating (to fetch the salt that was attached to it).
Is there a built-in way in PostgreSQL to do this?
Should I use Digest::MD5?
Don't use SHA1 or SHA256, as most other people are suggesting. Definitely don't use MD5.
SHA1/256 and MD5 are both designed to create checksums of files and strings (and other datatypes, if necessary). Because of this, they're designed to be as fast as possible, so that the checksum is quick to generate.
This speed makes it much easier to brute-force passwords, as a well-written program can easily generate millions of hashes every second.
Instead, use a slow algorithm that is specifically designed for passwords. They're designed to take a little bit longer to generate, with the upside being that bruteforce attacks become much harder. Because of this, the passwords will be much more secure.
You won't experience any significant performance disadvantages if you're only looking at encrypting individual passwords one at a time, which is the normal implementation of storing and checking passwords. It's only in bulk where the real difference is.
I personally like bcrypt. There should be a Perl version of it available, as a quick Google search yielded several possible matches.
MD5 is commonly used, but SHA1/SHA256 is better. Still not the best, but better.
The problem with all of these general-purpose hashing algorithms is that they're optimized to be fast. When you're hashing your passwords for storage, though, fast is just what you don't want - if you can hash the password in a microsecond, then that means an attacker can try a million passwords every second if they get their hands on your password database.
But you want to slow an attacker down as much as possible, don't you? Wouldn't it be better to use an algorithm which takes a tenth of a second to hash the password instead? A tenth of a second is still fast enough that users won't generally notice, but an attacker who has a copy of your database will only be able to make 10 attempts per second - it will take them 100,000 times longer to find a working set of login credentials. Every hour that it would take them at a microsecond per attempt becomes 11 years at a tenth of a second per attempt.
So, how do you accomplish this? Some folks fake it by running several rounds of MD5/SHA digesting, but the bcrypt algorithm is designed specifically to address this issue. I don't fully understand the math behind it, but I'm told that it's based on the creation of Blowfish frames, which is inherently slow (unlike MD5 operations which can be heavily streamlined on properly-configured hardware), and it has a tunable "cost" parameter so that, as Moore's Law advances, all you need to do is adjust that "cost" to keep your password hashing just as slow in ten years as it is today.
I like bcrypt the best, with SHA2(256) a close second. I've never seen MD5 used for passwords but maybe some apps/libraries use that. Keep in mind that you should always use a salt as well. The salt itself should be completely unique for each user and, in my opinion, as long as possible. I would never, ever use just a hash against a string without a salt added to it. Mainly because I'm a bit paranoid and also so that it's a little more future-proof.
Having a delay before a user can try again and auto-lockouts (with auto-admin notifications) is a good idea as well.
The pgcrypto module in PostgreSQL has built-in support for password hashing that is pretty smart about storage, generation, multiple algorithms etc. See http://www.postgresql.org/docs/current/static/pgcrypto.html, the section on Password Hashing Functions. You can also see the pgcrypto section of http://www.hagander.net/talks/hidden%20gems%20of%20postgresql.pdf.
Use SHA1 or SHA256 hashing with salting. That's the way to go for storing passwords.
If you don't use a password recovery mechanism (not password reset), I think using a hashing mechanism is better than trying to encrypt the password. You can just check the hashes without any security risk. Even you don't know the user's password.
I would suggest storing it as a salted md5 hash.
INSERT INTO user (password) VALUES (md5('some_salt'||'the_password'));
You could calculate the md5 hash in Perl if you wish; it doesn't make much difference unless you are micro-optimizing.
You could also use sha1 as an alternative, but I'm unsure if Postgres has a native implementation of this.
I usually discourage the use of a dynamic random salt, as it is yet another field that must be stored in the database. Plus, if your tables were ever compromised, the salt becomes useless.
I always go with a one-time randomly generated salt and store this in the application source, or a config file.
Another benefit of using a md5 or sha1 hash for the password is you can define the password column as a fixed width CHAR(32) or CHAR(40) for md5 and sha1 respectively.