I've always been curious... Which is better when salting a password for hashing: prefix, or postfix? Why? Or does it matter, so long as you salt?
To explain: We all (hopefully) know by now that we should salt a password before we hash it for storage in the database [Edit: So you can avoid things like what happened to Jeff Atwood recently]. Typically this is done by concatenating the salt with the password before passing it through the hashing algorithm. But the examples vary... Some examples prepend the salt before the password. Some examples add the salt after the password. I've even seen some that try to put the salt in the middle.
So which is the better method, and why? Is there a method that decreases the chance of a hash collision? My Googling hasn't turned up a decent analysis on the subject.
Edit: Great answers folks! I'm sorry I could only pick one answer. :)
Prefix or suffix is irrelevant; it's only about adding some entropy and length to the password.
You should consider those three things:
The salt has to be different for every password you store. (This is quite a common misunderstanding.)
Use a cryptographically secure random number generator.
Choose a long enough salt. Think about the birthday problem.
There's an excellent answer by Dave Sherohman to another question why you should use randomly generated salts instead of a user's name (or other personal data). If you follow those suggestions, it really doesn't matter where you put your salt in.
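To make the three points concrete, here is a minimal Python sketch. The 16-byte salt length and the helper names are my own assumptions, and plain SHA-256 is used only to keep the illustration short; it is not a recommendation for real password storage.

```python
import hashlib
import secrets

SALT_BYTES = 16  # assumed length (128 bits); not taken from the answer above


def new_salt() -> bytes:
    """Return a fresh random salt from the operating system's CSPRNG."""
    return secrets.token_bytes(SALT_BYTES)


def salted_digest(password: str, salt: bytes, prefix: bool = True) -> str:
    """Hash salt+password (prefix) or password+salt (suffix). Illustration only."""
    data = password.encode("utf-8")
    material = salt + data if prefix else data + salt
    return hashlib.sha256(material).hexdigest()


salt_a, salt_b = new_salt(), new_salt()
# Same password, two different salts: the stored values differ.
print(salted_digest("hunter2", salt_a) != salted_digest("hunter2", salt_b))
```

Either position gives a digest the attacker could not have precomputed; what matters is that `new_salt()` is called once per stored password.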
I think it's all semantics. Putting it before or after doesn't matter except against a very specific threat model.
The fact that it's there is supposed to defeat rainbow tables.
The threat model I alluded to would be the scenario where the adversary can have rainbow tables of common salts appended/prepended to the password. (Say the NSA) You're guessing they either have it appended or prepended but not both. That's silly, and it's a poor guess.
It'd be better to assume that they have the capacity to store these rainbow tables, but not, say, tables with strange salts interspersed in the middle of the password. In that narrow case, I would conjecture that interspersed would be best.
Like I said. It's semantics. Pick a different salt per password, a long salt, and include odd characters in it like symbols and ASCII codes: ©¤¡
The real answer, which nobody seems to have touched upon, is that both are wrong. If you are implementing your own crypto, no matter how trivial a part you think you're doing, you are going to make mistakes.
HMAC is a better approach, but even then if you're using something like SHA-1, you've already picked an algorithm which is unsuitable for password hashing due to its design for speed. Use something like bcrypt or possibly scrypt and take the problem out of your hands entirely.
Oh, and don't even think about comparing the resulting hashes for equality with your programming language's or database's string comparison utilities. Those compare character by character and short-circuit as false as soon as a character differs. So now attackers can use statistical methods on the response times to try to work out what the hash is, a character at a time.
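For the comparison point, a short sketch in Python: the standard library's `hmac.compare_digest` is designed to take time that does not depend on where the inputs differ. The wrapper name `hashes_match` is just a hypothetical helper.

```python
import hmac


def hashes_match(stored_hex: str, candidate_hex: str) -> bool:
    """Compare two hex digests without short-circuiting on the first mismatch."""
    return hmac.compare_digest(
        stored_hex.encode("ascii"), candidate_hex.encode("ascii")
    )


print(hashes_match("abc123", "abc123"))
print(hashes_match("abc123", "abc124"))
```

Other languages usually ship an equivalent; the idea is the same wherever a secret-derived value is compared.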
It shouldn't make any difference. The hash will be no more easily guessable wherever you put the salt. Hash collisions are both rare and unpredictable, by virtue of being intentionally non-linear. If it made a difference to the security, that would suggest a problem with the hashing, not the salting.
If using a cryptographically secure hash, it shouldn't matter whether you pre- or postfix; a point of hashing is that a single bit change in the source data (no matter where) should produce a different hash.
What is important, though, is using long salts, generating them with a proper cryptographic PRNG, and having per-user salts. Storing the per-user salts in your database is not a security issue; using a single site-wide salt is.
First of all, the term "rainbow table" is consistently misused. A "rainbow" table is just a particular kind of lookup table, one that allows a particular kind of data compression on the keys. By trading computation for space, a lookup table that would take 1000 TB can be compressed a thousand times so that it can be stored on a smaller drive.
You should be worried about hash to password lookup tables, rainbow or otherwise.
#onebyone.livejournal.com:
The attacker has 'rainbow tables' consisting not of the hashes of dictionary words, but of the state of the hash computation just before finalising the hash calculation.
It could then be cheaper to brute-force a password file entry with postfix salt than prefix salt: for each dictionary word in turn you would load the state, add the salt bytes into the hash, and then finalise it. With prefixed salt there would be nothing in common between the calculations for each dictionary word.
For a simple hash function that scans linearly through the input string, such as a simple linear congruential generator, this is a practical attack. But a cryptographically secure hash function is deliberately designed to have multiple rounds, each of which uses all the bits of the input string, so that computing the internal state just prior to the addition of the salt is not meaningful after the first round. For example, SHA-1 has 80 rounds.
Moreover, password hashing algorithms like PBKDF2 compose their hash function multiple times (it is recommended to iterate PBKDF2 a minimum of 1000 times, each iteration applying SHA-1 twice), making this attack doubly impractical.
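A minimal sketch of the iterated construction, using Python's standard `hashlib.pbkdf2_hmac`. The choice of SHA-256 and the iteration count are my assumptions, not figures from this answer; tune the count for your own hardware.

```python
import hashlib
import secrets

ITERATIONS = 100_000  # assumed work factor, not a figure from the text


def derive(password: str, salt: bytes, iterations: int = ITERATIONS) -> bytes:
    """PBKDF2-HMAC-SHA256: the salt is mixed in through HMAC on every iteration."""
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations
    )


salt = secrets.token_bytes(16)
print(derive("correct horse", salt).hex())
```

Because the salt goes in through HMAC rather than by concatenation, the prefix-versus-postfix question does not arise at this level.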
Use a BCrypt hash if the platform has a provider. I love how you don't have to worry about creating the salts, and you can raise the work factor to make the hashes even stronger if you want.
Inserting the salt an arbitrary number of characters into the password is the least expected case, and therefore the most "secure" socially, but it's really not very significant in the general case as long as you're using long, unique-per-password strings for salts.
I know there are many questions on SALT and hashing passwords, but I have yet to find a tutorial to walk me through this in VS using the MVC pattern.
I currently have a DB created with a user table containing three columns:
userID(PK, int, not null)
password(varchar(45), not null)
loginID(varchar(8), null)
The password is saved as a visible string in the DB. After researching the issue, I assume password is easiest as binary instead of varchar. Does anyone know of a good tutorial to implement hashing and SALT into my program? One that clearly defines this in terms of the MVC pattern is preferred.
MVC doesn't have anything to do with salting your passwords, although someone might point to the proper libraries that might be used with your tech stack.
Salting involves using a specific sequence, and appending that to the end of user passwords, and then hashing that data.
The reason this is done is that a hash algorithm applied to a well-known string is easily reversed by lookup. A person could, for example, run well-known hash algorithms against a whole dictionary and compare the results to the stored user passwords to determine what each was hashed from. While a good hash function is a one-way function (i.e. you can't find the input from the output), with a dictionary that maps hashes back to inputs you could easily do it for well-known strings and string combinations.
For example, the password password has a well known hash. When you attach a random sequence to the end (or start) and then hash that, it's a significantly less common hash as a result, and then it's significantly harder to reverse.
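To illustrate in Python (SHA-256 and the helper names are my own choices for the sketch; the same holds for any unsalted hash):

```python
import hashlib
import secrets


def unsalted(password: str) -> str:
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


def salted(password: str, salt: bytes) -> str:
    # Salt appended to the end, as described above.
    return hashlib.sha256(password.encode("utf-8") + salt).hexdigest()


# Without a salt, every system that hashes "password" stores the same value,
# so one dictionary entry reverses all of them.
print(unsalted("password") == unsalted("password"))

# With a random salt per record, two hashes of the same password differ.
first = salted("password", secrets.token_bytes(16))
second = salted("password", secrets.token_bytes(16))
print(first == second)
```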
Sorry for not having the specific technologies related, but I wanted to communicate the general higher level concept of it since the over-focus on the technologies loses the bigger picture.
Is there really a point in salting a password?
If a program does all the processing of a salt server-side, does it really make a brute-force or other attack any more difficult? The code is only going to apply the salt to whatever is entered by a user.
Do I have this all wrong?
Yes, there is a point in salting a password.
The point is that each password has its own salt, so that an attacker can't make use of dictionaries and rainbow tables to brute force all passwords at once.
The salt doesn't make it harder to crack a single password¹, but it removes the benefit from attempting to crack multiple passwords at once. An attacker has to brute force one password at a time.
¹ At least not enough to be a good reason to use it. Using better passwords works much better.
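A toy Python model of that difference, counting hash computations. The four-word dictionary, the use of SHA-256 and the function names are all assumptions made for the sketch.

```python
import hashlib
import secrets
from typing import List, Tuple

DICTIONARY = ["123456", "password", "letmein", "qwerty"]  # toy word list


def h(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def crack_unsalted(stored: List[str]) -> Tuple[int, int]:
    """Return (hash computations, passwords recovered) for unsalted hashes."""
    table = {h(w.encode()): w for w in DICTIONARY}  # one pass, reused for everyone
    return len(DICTIONARY), sum(1 for digest in stored if digest in table)


def crack_salted(stored: List[Tuple[bytes, str]]) -> Tuple[int, int]:
    """Same attack when every record carries its own salt."""
    work = found = 0
    for salt, digest in stored:
        for word in DICTIONARY:  # the dictionary must be redone per record
            work += 1
            if h(salt + word.encode()) == digest:
                found += 1
                break
    return work, found


passwords = ["qwerty"] * 3
unsalted_db = [h(p.encode()) for p in passwords]
salts = [secrets.token_bytes(16) for _ in passwords]
salted_db = [(s, h(s + p.encode())) for s, p in zip(salts, passwords)]

print(crack_unsalted(unsalted_db))  # (4, 3)
print(crack_salted(salted_db))      # (12, 3)
```

Each individual password is no harder to guess; the attacker simply loses the ability to share work across records.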
In a word, yes.
Salting a password adds a level of complexity to the string and confuses humans, and makes dictionary attacks less likely to succeed.
Brute force can still crack this password however, hence the need for a randomly generated salt.
Salts are typically generated as byte arrays, which are then fed into a function that combines the salt and the password into one string at intervals. See my answer here.
The hashes may be leaked without the salt (common scenario: the database gets dumped, but a salt is present in PHP source that does not leak).
You are right in a way but ... the most significant protection from SALT is that if the hashes ever do get released into the wild then reverse hash lookups are much much harder.
Hash a word and then put the hash result into your favourite search engine to see what I mean.
Hearing about all the recent hacks at big tech firms made me wonder about how they store passwords.
I know salting + hashing is accepted as being generally secure, but every example I've seen of salting has the salt key hard-coded into the password script, which is generally stored on the same server.
So is it a logical solution to hash the user's password initially, pass that hash to a "salting server" or some function stored off-site, then pass back the salted hash?
The way I'm looking at it is, if an intruder gains access to the server or database containing the stored passwords, they won't immediately have access to the salt key.
No -- salt remains effective even if known to the attacker.
The idea of salt is that it makes a dictionary attack on a large number of users more difficult. Without salt, the attacker hashes all the words in a dictionary, and sees which match with your users' hashed passwords. With salt, he has to hash each word in the dictionary many times over (once for each possible salt value) to be certain of having one that fits each user.
This multiplication by several thousand (or possibly several million, depending on how large a salt you use) increases the time to hash all the values, and the storage needed to store the results -- to the point that (you hope) it's impractical.
I should add, however, that in many (most?) cases, a very large salt doesn't really add a lot of security. The problem is that if you use, say, a 24 bit salt (~16 million possible values) but have only, say, a few hundred users, the attacker can collect the salt values you're actually using ahead of time, then do his dictionary attack for only those values instead of the full ~16 million potential values. In short, your 24-bit salt adds only a tiny bit of difficulty beyond what a ~8 bit salt would have provided.
OTOH, for a large server (Google, Facebook, etc.) the story is entirely different -- a large salt becomes quite beneficial.
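The arithmetic behind that can be sketched in a few lines of Python. The function name is hypothetical, and the model simply assumes the attacker has already seen which salts are in use:

```python
def attack_multiplier(salt_bits: int, distinct_salts_in_use: int) -> int:
    """How many times the dictionary must be hashed to cover a stolen table."""
    return min(2 ** salt_bits, distinct_salts_in_use)


print(attack_multiplier(24, 300))     # 300
print(attack_multiplier(8, 300))      # 256
print(attack_multiplier(24, 10**9))   # 16777216
```

With a few hundred users, the 24-bit salt costs the attacker 300 passes against 256 for an 8-bit salt; with a billion users the full 2^24 range comes into play.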
Salting is useful even if intruder knows the salt.
If passwords are NOT salted, it is possible to use widely available precomputed rainbow tables to quickly attack your passwords.
If your password table is salted, it becomes very difficult to precompute rainbow tables - it is impractical to create a rainbow table for every possible salt.
If you use a random salt that is different for every password entry, and put it in plaintext right next to it, it becomes very difficult for an intruder to attack your passwords, short of a brute-force attack.
Salting passwords protects passwords against attacks where the attacker has a list of hashed passwords. There are some common hashing algorithms that hackers have tables for that allow them to look up a hash and retrieve the password. For this to work, the hacker has to have broken into the password storage and stolen the hashes.
If the passwords are salted, then the attacker must re-generate their hash tables, using the hashing algorithm and the salt. Depending on the hashing algorithm, this can take some time. To speed things up, hackers also use lists of the most common passwords and dictionary words. The idea of the salt is to slow an attacker down.
The best approach is to use a different salt for each password, make it long and random, and it's OK to store the salt next to each password. This really slows an attacker down, because they would have to run their hash table generation for each individual password, for every combination of common passwords and dictionary words. This would make it implausible for an attacker to deduce strong passwords.
I had read a good article on this, which I can't find now. But Googling 'password salt' gives some good results. Have a look at this article.
I would like to point out, that the scheme you described with the hard-coded salt, is actually not a salt, instead it works like a key or a pepper. Salt and pepper solve different problems.
A salt should be generated randomly for every password, and can be stored together with the hashed password in the database. It can be stored in plain text, and fulfills its purpose even when known to the attacker.
A pepper is a secret key, that will be used for all passwords. It will not be stored in the database, instead it should be deposited in a safe place. If the pepper is known to the attacker, it becomes useless.
I tried to explain the differences in a small tutorial, maybe you want to have a look there.
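As a rough Python sketch of how the two can be combined: the environment variable name `APP_PEPPER`, the HMAC-then-PBKDF2 layering and the iteration count are all assumptions of mine, one possible arrangement rather than the scheme from the tutorial.

```python
import hashlib
import hmac
import os
import secrets

# Hypothetical: the pepper is read from an environment variable here; it is
# kept outside the database. The fallback value is for local testing only.
PEPPER = os.environ.get("APP_PEPPER", "dev-only-pepper").encode("utf-8")
ITERATIONS = 100_000  # assumed work factor


def _derive(password: str, salt: bytes) -> bytes:
    # Pepper: a secret key shared by all passwords, applied through HMAC.
    peppered = hmac.new(PEPPER, password.encode("utf-8"), hashlib.sha256).digest()
    # Salt: random per password, stored in plain text next to the hash.
    return hashlib.pbkdf2_hmac("sha256", peppered, salt, ITERATIONS)


def hash_password(password: str) -> tuple:
    salt = secrets.token_bytes(16)
    return salt, _derive(password, salt)


def verify(password: str, salt: bytes, stored: bytes) -> bool:
    return hmac.compare_digest(_derive(password, salt), stored)


salt, digest = hash_password("s3cret")
print(verify("s3cret", salt, digest))
```

The database row holds `salt` and `digest`; the pepper never goes there.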
Makes sense. Seems like more effort than it's worth for an attacker (unless it's a site of significant worth or importance).
All sites, small or large, important or not, should treat password hashing as high importance.
As long as each hash has its own large random salt then yes, it does become mostly impracticable. If every hash uses the same static salt, an attacker can build one rainbow table for that salt and weed out the users' hashes that used password1, for example.
Using a good hashing algorithm is important as well (using MD5 or SHA1 is nearly like using plaintext with the multi-GPU setups these days). Use scrypt; if not, then bcrypt; or if you have to use PBKDF2, then you need the rounds to be very high.
Disclaimer: there are many similar questions on SO, but I am looking for a practical suggestion instead of just general principles. Also, feel free to point out implementations of the "ideal" algorithm (PHP would be nice ;), but please provide specifics (how it works).
What is the best way to calculate hash string of a password for storing in a database? I know I should:
use salt
iterate hashing process multiple times (hash chaining for key stretching)
I was thinking of using such algorithm:
x = md5( salt + password);
repeat N-times:
x = md5( salt + password + x );
I am sure this is quite safe, but there are a few questions that come to mind:
would it be beneficial to include username in salt?
I have decided to use a common salt for all users, any downside in this?
what is the recommended minimum salt length, if any?
should I use md5, sha or something else?
is there anything wrong with the above algorithm / any suggestions?
... (feel free to provide more :)
I know the decisions necessarily depend on the situation, but I am looking for a solution that would:
provide as much security as possible
be fast enough ( < 0.5 second on a decent machine )
So, what would the ideal algorithm look like, preferably in pseudo-code?
The "ideal" password hashing function, right now, is bcrypt. It includes salt processing and a configurable number of iterations. There is a free open-source PHP implementation.
Second best would be PBKDF2, which relies on an underlying hash function and is somewhat similar to what you suggest. There are technical reasons why bcrypt is "better" than PBKDF2.
As for your specific questions:
1. would it be beneficial to include username in salt?
Not really.
2. I have decided to use a common salt for all users, any downside in this?
Yes: it removes the benefits of having a salt. The salt's sole reason to exist is to be unique for each hashed password. This prevents an attacker from attacking two hashed passwords with less effort than twice that of attacking one hashed password. Salts must be unique. Even having a per-user salt is bad: the salt must also be changed when a user changes his password. The kind of optimization that an attacker may apply when a salt is reused / shared includes (but is not limited to) tables of precomputed hashes, such as rainbow tables.
3. what is the recommended minimum salt length, if any?
A salt must be unique. Uniqueness is a hard property to maintain. But by using long enough random salts (generated with a good random number generator, preferably a cryptographically strong one), you get uniqueness with a high enough probability. 128-bit salts are long enough.
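To put a rough number on "high enough probability", here is the usual birthday-bound approximation in Python. The function name and the example sizes are mine, and the formula is only meaningful while the result is well below 1.

```python
def salt_collision_probability(num_salts: int, salt_bits: int) -> float:
    """Birthday-bound approximation: p is about n(n-1) / 2^(bits+1)."""
    return num_salts * (num_salts - 1) / 2 ** (salt_bits + 1)


# A billion random 128-bit salts: a repeat is vanishingly unlikely.
print(salt_collision_probability(10**9, 128))

# Ten thousand random 32-bit salts: a repeat is already plausible.
print(salt_collision_probability(10**4, 32))
```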
4. should I use md5, sha or something else?
MD5 is bad for public relations. It has known weaknesses, which may or may not apply to a given usage, and it is very hard to "prove" with any kind of reliability that these weaknesses do not apply to a specific situation. SHA-1 is better, but not "good", because it also has weaknesses, albeit much less serious ones than MD5's. SHA-256 is a reasonable choice. As was pointed out above, for password hashing, you want a function which does not scale well on parallel architectures such as GPU, and SHA-256 scales well, which is why the Blowfish-derivative used in bcrypt is preferable.
5. is there anything wrong with the above algorithm / any suggestions?
It is homemade. That's bad. The trouble is that there is no known test for security of a cryptographic algorithm. The best we can hope for is to let a few hundred professional cryptographers try to break an algorithm for a few years -- if they cannot, then we can say that although the algorithm is not really "proven" to be secure, at least weaknesses must not be obvious. Bcrypt has been around, widely deployed, and analyzed for 12 years. You cannot beat that by yourself, even with the help of StackOverflow.
As a professional cryptographer myself, I would raise a suspicious eyebrow at the use of simple concatenation in MD5 or even SHA-256: these are Merkle–Damgård hash functions, which is fine for collision resistance but does not provide a random oracle (there is the so-called "length extension attack"). In PBKDF2, the hash function is not used directly, but through HMAC.
I tend to use a fixed application salt, the username and the password
Example...
string prehash = "mysaltvalue" + "myusername" + "mypassword";
The benefit here is that people using the same password don't end up with the same hash value, and it prevents people with access to the database from copying their password hash over another user's - of course, if you can access the DB you don't really need to hack a login to get the data ;)
IMO, salt length doesn't matter too much, the hashed value length is always going to be 32 anyway (using MD5 - which again is what I would use)
I would say in terms of security, this password encryption is enough, the most important thing is to make sure your application/database has no security leaks in it!
Also, I wouldn't bother with repeated hashing, no point in my opinion. Somebody would have to know your algorithm to try to hack it that way, and then it doesn't matter if it is hashed once or many times: if they know it, they know it.
If someone is purposely trying to modify two files to have the same hash, what are ways to stop them? Can md5 and sha1 prevent the majority case?
I was thinking of writing my own and I figure even if I don't do a good job if the user doesn't know my hash he may not be able to fool mine.
What's the best way to prevent this?
MD5 is generally considered insecure if hash collisions are a major concern. SHA1 is likewise no longer considered acceptable by the US government. There was a competition to find a replacement hash algorithm, but the recommendation at the moment is to use the SHA2 family - SHA-256, SHA-384 or SHA-512. [Update: 2012-10-02 NIST has chosen Keccak to be the SHA-3 algorithm.]
You can try to create your own hash — it would probably not be as good as MD5, and 'security through obscurity' is likewise not advisable.
If you want security, hash with multiple hash algorithms. Being able to simultaneously create files that have hash collisions using a number of algorithms is excessively improbable. [And, in the light of comments, let me make it clear: I mean publish both the SHA-256 and the Whirlpool values for the file — not combining hash algorithms to create a single value, but using separate algorithms to create separate values. Generally, a corrupted file will fail to match any of the algorithms; if, perchance, someone has managed to create a collision value using one algorithm, the chance of also producing a second collision in one of the other algorithms is negligible.]
The Public TimeStamp uses an array of algorithms. See, for example, sqlcmd-86.00.tgz for an illustration.
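A small Python sketch of publishing separate digests under several algorithms. I've paired SHA-256 with SHA3-256 because both are always available in `hashlib`; that pairing is my substitution for the SHA-256 and Whirlpool combination mentioned above, and the helper names are hypothetical.

```python
import hashlib

ALGORITHMS = ("sha256", "sha3_256")  # assumption: stand-ins for SHA-256 + Whirlpool


def digests(data: bytes) -> dict:
    """Compute one separate digest per algorithm (not a combined value)."""
    return {name: hashlib.new(name, data).hexdigest() for name in ALGORITHMS}


def matches_all(data: bytes, published: dict) -> bool:
    """A file is accepted only if every published digest matches."""
    current = digests(data)
    return all(current[name] == published.get(name) for name in ALGORITHMS)


published = digests(b"release contents")
print(matches_all(b"release contents", published))
print(matches_all(b"tampered contents", published))
```

A forger would need one file that collides under every listed algorithm at once.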
If the user doesn't know your hashing algorithm he also can't verify your signature on a document that you actually signed.
The best option is to use public-key one-way hashing algorithms that generate the longest hash. SHA-256 creates a 256-bit hash, so a forger would have to try 2^255 different documents (on average) before they created one that matched a given document, which is pretty secure. If that's still not secure enough for you, there's SHA-512.
Also, I think it's worth mentioning that a good low-tech way to protect yourself against forged digitally-signed documents is to simply keep a copy of anything you sign. That way, if it comes down to a dispute, you can show that the original document you signed was altered.
There is a hierarchy of difficulty (for an attacker) here. It is easier to find two files with the same hash than to generate one to match a given hash, and easier to do the latter if you don't have to respect form/content/length restrictions.
Thus, if it is possible to use a well-defined document structure and lengths, you can make an attacker's life a bit harder no matter what underlying hash you use.
Why are you trying to create your own hash algorithm? What's wrong with SHA1HMAC?
Yes, there are repeats for hashes.
Any hash that is shorter than the plaintext is necessarily less information. That means there will be some repeats. The key for hashes is that the repeats are hard to reverse-engineer.
Consider CRC32 - commonly used as a hash. It's a 32-bit quantity. Because there are more than 2^32 messages in the universe, then there will be repeats with CRC32.
The same idea applies to other hashes.
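You can see the repeats for yourself with a short Python search. The messages here are SHA-256 digests of a counter, purely my own way of getting varied 32-byte inputs; by the birthday bound a CRC32 repeat typically shows up after something on the order of 2^16 messages.

```python
import hashlib
import zlib


def find_crc32_collision():
    """Generate distinct messages until two of them share a CRC32 value."""
    seen = {}
    i = 0
    while True:
        # Varied 32-byte message derived from the counter (an arbitrary choice).
        msg = hashlib.sha256(str(i).encode("ascii")).digest()
        crc = zlib.crc32(msg)
        if crc in seen:
            return seen[crc], msg
        seen[crc] = msg
        i += 1


a, b = find_crc32_collision()
print(a != b, zlib.crc32(a) == zlib.crc32(b))
```

The same search against a 256-bit hash would need on the order of 2^128 messages, which is why the repeats exist but are not found in practice.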
This is called a "hash collision", and the best way to avoid it is to use a strong hash function. With MD5 it is relatively easy to artificially build colliding files, as seen here. Similarly, there is a known, relatively efficient method for computing colliding SHA1 files, although in this case "relatively efficient" still takes hundreds of hours of compute time.
Generally, MD5 and SHA1 are still expensive to crack, but not impossible. If you're really worried about it, use a stronger hash function, like SHA256.
Writing your own isn't actually a good idea unless you're a pretty expert cryptographer. Most of the simple ideas have been tried and there are well-known attacks against them.
If you really want to learn more about it, have a look at Schneier's Applied Cryptography.
I don't think coming up with your own hash algorithm is a good choice.
Another good option is to use salted MD5. For example, the string "acidzom!##" is appended to the input before it is passed to the MD5 function.
There is also a good reading at Slashdot.