I have always stored passwords in the database as MD5 hashes, but there are many websites that can recover the original data from an MD5 hash (using rainbow tables).
Would it be safer to modify the output of the MD5 function (e.g. omit the last character of the MD5 output to create a new hash)? Or is there some logic behind MD5 that makes it safer than any modified version?
No, this does little to make your passwords more secure. It adds a bit of "security by obscurity", but when we hash passwords, we prepare for the case where the attacker knows both the hashes and the algorithm.
The problem with MD5 in general, and with its derivations, is that they can be calculated far too fast. With common hardware you can calculate about 8 billion MD5 hashes per second, which makes brute-forcing too easy. Today's password-cracking tools do not only handle plain MD5 hashes; they also support derivations, e.g. md5(strtoupper(md5($pass))), out of the box.
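To illustrate why a modified MD5 buys so little, here is a minimal sketch of a dictionary attack against two home-made variants. The function names and the tiny word list are made up for illustration; a real cracking tool does the same thing with billions of candidates per second.

```python
import hashlib


def md5_hex(text: str) -> str:
    """Plain MD5 of a string, as lowercase hex."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def truncated_md5(password: str) -> str:
    """Hypothetical variant from the question: drop the last hex character."""
    return md5_hex(password)[:-1]


def nested_md5(password: str) -> str:
    """Python equivalent of md5(strtoupper(md5($pass)))."""
    return md5_hex(md5_hex(password).upper())


def dictionary_attack(stolen_hash, variant, wordlist):
    """Try every candidate with the same variant the site uses."""
    for candidate in wordlist:
        if variant(candidate) == stolen_hash:
            return candidate
    return None


# Illustrative word list; the attacker only needs to know the algorithm.
wordlist = ["123456", "letmein", "password1", "qwerty"]

print(dictionary_attack(truncated_md5("letmein"), truncated_md5, wordlist))  # letmein
print(dictionary_attack(nested_md5("qwerty"), nested_md5, wordlist))  # qwerty
```

The attacker's loop is unchanged; only the one-line variant function differs, which is why obscuring the algorithm does not slow the attack down.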
To store passwords securely you need a slow hash function with a cost factor, such as BCrypt, PBKDF2 or SCrypt. Of course, each password should be salted with its own unique salt.
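As a rough sketch of what this looks like in practice, the following uses PBKDF2 from Python's standard library with a fresh random salt per password. The iteration count and the helper names are assumptions chosen for illustration, not recommended values; tune the cost for your own hardware.

```python
import hashlib
import hmac
import os

# Assumed cost factor for illustration only; measure and tune it yourself.
ITERATIONS = 200_000


def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return (salt, digest); the salt is random and unique per password."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the hash with the stored salt and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return hmac.compare_digest(candidate, expected)


salt, digest = hash_password("correct horse")
print(verify_password("correct horse", salt, digest))  # True
print(verify_password("wrong guess", salt, digest))    # False
```

Both the salt and the digest are stored next to the user record; only the password itself is never stored.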
Perhaps you should consider a different hashing algorithm instead?
https://security.stackexchange.com/questions/4789/most-secure-password-hash-algorithms
Related
I am working on a project that has to store users' passwords. With that password you can gain access to a user's achievements and so on, so it's really important that nobody can recover the password even if they break into the database. My problem is which hashing function to choose, in terms of both security and efficiency.
Right now I am using SHA-256 with a salt and a pepper, but I read that a slow hashing function like Bcrypt with a cost factor of 12 can be superior. If it is, how much more security do I gain from using a pepper as well, given that a hash function like Bcrypt is already really time consuming?
My question is: which hash function should I use, based on the assumption that I do not expect to get hacked by a global hacking organization with supercomputers?
Plain cryptographic hash functions like SHA-2 or SHA-3 don't cut it anymore for password hashing, because they are too efficient to compute, which means an attacker who controls many CPU cores can compute large tables relatively quickly. In practice the feasibility of this still depends mostly on the length of the passwords (and to some extent on the character set used in those passwords), but you should still not use these unless you really know why that will be acceptable.
Instead, you should be using a hash or key derivation function more suitable for password hashing. PBKDF2, bcrypt, scrypt and Argon2 are all acceptable candidates, with somewhat different strengths and weaknesses. For a regular web app, it almost doesn't matter which of these you choose, and all will be more secure than plain hashes.
The difference will be very real if your database is compromised: properly hashed passwords will likely not be revealed, while at least some passwords hashed with SHA likely will, despite the salt and pepper.
Hearing about all the recent hacks at big tech firms made me wonder how they store passwords.
I know salting + hashing is accepted as being generally secure, but every example I've seen of salting has the salt key hard-coded into the password script, which is generally stored on the same server.
So is it a logical solution to hash the user's password initially, pass that hash to a "salting server" or some function stored off-site, then pass back the salted hash?
The way I'm looking at it is, if an intruder gains access to the server or database containing the stored passwords, they won't immediately have access to the salt key.
No -- salt remains effective even if known to the attacker.
The idea of salt is that it makes a dictionary attack on a large number of users more difficult. Without salt, the attacker hashes all the words in a dictionary, and sees which match your users' hashed passwords. With salt, he has to hash each word in the dictionary many times over (once for each possible salt value) to be certain of having one that fits each user.
This multiplication by several thousand (or possibly several million, depending on how large a salt you use) increases the time to hash all the values, and the storage needed to store the results -- to the point that (you hope) it's impractical.
I should add, however, that in many (most?) cases, a very large salt doesn't really add a lot of security. The problem is that if you use, say, a 24-bit salt (~16 million possible values) but have only, say, a few hundred users, the attacker can collect the salt values you're actually using ahead of time, then do his dictionary attack for only those values instead of the full ~16 million potential values. In short, your 24-bit salt adds only a tiny bit of difficulty beyond what a ~8-bit salt would have provided.
OTOH, for a large server (Google, Facebook, etc.) the story is entirely different -- a large salt becomes quite beneficial.
Salting is useful even if intruder knows the salt.
If passwords are NOT salted, an attacker can use widely available precomputed rainbow tables to attack your passwords quickly.
If your password table is salted, it becomes very difficult to precompute rainbow tables: it is impractical to create a rainbow table for every possible salt.
If you use a random salt that is different for every password entry, and store it in plaintext right next to the hash, it becomes very difficult for an intruder to attack your passwords by any means short of brute force.
Salting passwords protects passwords against attacks where the attacker has a list of hashed passwords. There are some common hashing algorithms that hackers have tables for that allow them to look up a hash and retrieve the password. For this to work, the hacker has to have broken into the password storage and stolen the hashes.
If the passwords are salted, then the attacker must re-generate their hash tables, using the hashing algorithm and the salt. Depending on the hashing algorithm, this can take some time. To speed things up, hackers also use lists of the most common passwords and dictionary words. The idea of the salt is to slow an attacker down.
The best approach is to use a different salt for each password and make it long and random; it's ok to store the salt next to each password. This really slows an attacker down, because they would have to run their hash table generation for each individual password, for every combination of common passwords and dictionary words. This would make it implausible for an attacker to deduce strong passwords.
I had read a good article on this, which I can't find now. But Googling 'password salt' gives some good results. Have a look at this article.
I would like to point out that the scheme you described, with the hard-coded salt, is actually not a salt; it works like a key or a pepper. Salt and pepper solve different problems.
A salt should be generated randomly for every password, and can be stored together with the hashed password in the database. It can be stored in plain text, and fulfills its purpose even when known to the attacker.
A pepper is a secret key, that will be used for all passwords. It will not be stored in the database, instead it should be deposited in a safe place. If the pepper is known to the attacker, it becomes useless.
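The two roles can be sketched side by side. This is only one possible way to combine them (pepper applied as an HMAC key before a salted PBKDF2); the pepper value, the iteration count and the function names are placeholders, and in a real deployment the pepper would come from somewhere outside the database, such as a config file or key store.

```python
import hashlib
import hmac
import os

# Placeholder secret; in practice it is loaded from outside the database.
PEPPER = b"hypothetical-secret-kept-outside-the-database"
ITERATIONS = 100_000  # assumed cost, for illustration only


def apply_pepper(password: str, pepper: bytes) -> bytes:
    """Mix the server-side secret into the password with HMAC-SHA256."""
    return hmac.new(pepper, password.encode("utf-8"), hashlib.sha256).digest()


def hash_password(password: str, pepper: bytes = PEPPER) -> tuple[bytes, bytes]:
    """Per-password random salt (stored) plus global pepper (not stored)."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", apply_pepper(password, pepper), salt, ITERATIONS)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes, pepper: bytes = PEPPER) -> bool:
    candidate = hashlib.pbkdf2_hmac("sha256", apply_pepper(password, pepper), salt, ITERATIONS)
    return hmac.compare_digest(candidate, expected)


salt, digest = hash_password("s3cret")
print(verify_password("s3cret", salt, digest))                          # True
print(verify_password("s3cret", salt, digest, pepper=b"wrong-pepper"))  # False
```

An attacker who steals only the database gets the salts and digests but cannot test guesses without the pepper; once the pepper leaks too, the per-password salt is what still forces a separate attack on each hash.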
I tried to explain the differences in a small tutorial, maybe you want to have a look there.
Makes sense. Seems like more effort than it's worth for an attacker (unless it's a site of significant worth or importance).
All sites, small or large, important or not, should treat password hashing as a high priority.
As long as each hash has its own large random salt then yes, it does become mostly impracticable. If every hash uses the same static salt, you can build a rainbow table to weed out the hashes of users who chose password1, for example.
Using a good hashing algorithm is important as well (using MD5 or SHA1 is nearly like using plaintext with the multi-GPU setups these days). Use scrypt, if not then bcrypt, or if you have to use PBKDF2 then you need the number of rounds to be very high.
I've a legacy app where passwords are hashed using MD5 without salt. I'd like to switch to SHA1 with salt, but I'd like to keep current users' passwords.
My plan is to change hashing function to sha1(md5(password) + salt). I'll be able to batch process all existing hashes using sha1(<existing_pass> + salt).
Is it safe to keep md5 in this case?
Is it ok to have one single salt for all users?
It's not a good idea to keep md5, read this question: Use SHA-512 and salt to hash an MD5 hashed password?.
It's better to have one salt for each user. With the same salt, users with the same password will have the same hash, and a rainbow table can be created for all your passwords at the same time.
As for question 1 I'm not quite sure but it seems to be OK.
For question 2: It is never OK to have the same salt for all users. Salt has two functions: to prevent the use of pre-generated hashes / rainbow tables to search a leaked database, and to prevent the attacker from hashing a dictionary once and searching the whole database with the result. A common salt will work in the first case, making existing rainbow tables unusable, but won't prevent a cracker from mounting a dictionary attack. If the cracker knows the global salt, he can take a list of frequent passwords, hash them once and grep the entire database. If the salt is generated per user, this scenario isn't possible.
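A small sketch of the difference, using made-up users and a made-up global salt. SHA-1 appears here only because the question uses it; this is not a recommendation of SHA-1 for passwords.

```python
import hashlib
import os


def salted_sha1(password: str, salt: bytes) -> str:
    """sha1(salt + password), mirroring the scheme in the question."""
    return hashlib.sha1(salt + password.encode("utf-8")).hexdigest()


# Made-up user table for illustration.
users = {"alice": "password1", "bob": "password1", "carol": "tr0ub4dor"}

# Case 1: one global salt. A single guess is hashed once and then
# compared against every row of the leaked table.
GLOBAL_SALT = b"one-salt-for-everyone"
shared = {name: salted_sha1(pw, GLOBAL_SALT) for name, pw in users.items()}
guess = salted_sha1("password1", GLOBAL_SALT)
hits = sorted(name for name, h in shared.items() if h == guess)
print(hits)  # ['alice', 'bob']

# Case 2: a random salt per user. Identical passwords no longer share a
# hash, so every guess has to be re-hashed for every single user.
per_user_salts = {name: os.urandom(16) for name in users}
per_user = {name: salted_sha1(pw, per_user_salts[name]) for name, pw in users.items()}
print(per_user["alice"] == per_user["bob"])  # False
```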
In a web application written in Perl and using PostgreSQL the users have username and password. What would be the recommended way to store the passwords?
Encrypting them using the crypt() function of Perl and a random salt? That would limit the useful length of passwords to 8 characters and would require fetching the stored password in order to compare it to the one given by the user when authenticating (to fetch the salt that was attached to it).
Is there a built-in way in PostgreSQL to do this?
Should I use Digest::MD5?
Don't use SHA1 or SHA256, as most other people are suggesting. Definitely don't use MD5.
SHA1/256 and MD5 are both designed to create checksums of files and strings (and other datatypes, if necessary). Because of this, they're designed to be as fast as possible, so that the checksum is quick to generate.
This speed makes it much easier to brute-force passwords, as a well-written program can easily generate millions of hashes every second, and GPU-based crackers reach billions.
Instead, use a slow algorithm that is specifically designed for passwords. They're designed to take a little bit longer to generate, with the upside being that bruteforce attacks become much harder. Because of this, the passwords will be much more secure.
You won't experience any significant performance disadvantage if you're only hashing individual passwords one at a time, which is the normal way of storing and checking passwords. It's only in bulk that the real difference shows.
I personally like bcrypt. There should be a Perl version of it available, as a quick Google search yielded several possible matches.
MD5 is commonly used, but SHA1/SHA256 is better. Still not the best, but better.
The problem with all of these general-purpose hashing algorithms is that they're optimized to be fast. When you're hashing your passwords for storage, though, fast is just what you don't want - if you can hash the password in a microsecond, then that means an attacker can try a million passwords every second if they get their hands on your password database.
But you want to slow an attacker down as much as possible, don't you? Wouldn't it be better to use an algorithm which takes a tenth of a second to hash the password instead? A tenth of a second is still fast enough that users won't generally notice, but an attacker who has a copy of your database will only be able to make 10 attempts per second - it will take them 100,000 times longer to find a working set of login credentials. Every hour that it would take them at a microsecond per attempt becomes 11 years at a tenth of a second per attempt.
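The arithmetic behind those figures can be checked in a few lines. The two rates are simply the ones assumed in the paragraphs above (one microsecond versus a tenth of a second per hash).

```python
# Attempts per second at each speed, as assumed in the text above.
FAST_RATE = 1_000_000  # one hash per microsecond
SLOW_RATE = 10         # one hash per tenth of a second

slowdown = FAST_RATE // SLOW_RATE
print(slowdown)  # 100000

# Work an attacker gets done in one hour at the fast rate...
attempts_in_one_hour = FAST_RATE * 3600

# ...and how long the same number of attempts takes at the slow rate.
seconds_at_slow_rate = attempts_in_one_hour / SLOW_RATE
SECONDS_PER_YEAR = 365.25 * 24 * 3600
years = seconds_at_slow_rate / SECONDS_PER_YEAR
print(round(years, 1))  # 11.4
```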
So, how do you accomplish this? Some folks fake it by running several rounds of MD5/SHA digesting, but the bcrypt algorithm is designed specifically to address this issue. I don't fully understand the math behind it, but I'm told that it's based on Blowfish's key setup, which is inherently slow (unlike MD5 operations, which can be heavily streamlined on properly-configured hardware), and it has a tunable "cost" parameter so that, as Moore's Law advances, all you need to do is adjust that "cost" to keep your password hashing just as slow in ten years as it is today.
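To show how a tunable cost can be raised over time without forcing password resets, here is a sketch that stores the cost next to each hash and upgrades it at the next successful login. It uses PBKDF2 from the Python standard library as a stand-in for bcrypt (which is not in the standard library), with 2**cost iterations to mimic bcrypt's logarithmic cost; the record layout, names and cost values are all assumptions.

```python
import hashlib
import hmac
import os

CURRENT_COST = 15  # hypothetical setting: work is 2**cost iterations


def hash_password(password: str, cost: int = CURRENT_COST) -> dict:
    """Hash with a fresh salt and remember which cost was used."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 2 ** cost)
    return {"cost": cost, "salt": salt, "digest": digest}


def verify_password(password: str, record: dict) -> bool:
    """Verify using the cost stored in the record, not the current one."""
    candidate = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), record["salt"], 2 ** record["cost"]
    )
    return hmac.compare_digest(candidate, record["digest"])


def login(password: str, record: dict, current_cost: int = CURRENT_COST):
    """Return (ok, record); the record is upgraded if its cost is outdated."""
    if not verify_password(password, record):
        return False, record
    if record["cost"] < current_cost:
        # The plaintext is only available at login, so rehash now.
        record = hash_password(password, current_cost)
    return True, record


old_record = hash_password("hunter2", cost=10)  # hashed years ago
ok, new_record = login("hunter2", old_record, current_cost=12)
print(ok, old_record["cost"], new_record["cost"])  # True 10 12
```

Because the cost travels with each record, old and new hashes can coexist in the same table while users gradually migrate.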
I like bcrypt the best, with SHA2(256) a close second. I've never seen MD5 used for passwords but maybe some apps/libraries use that. Keep in mind that you should always use a salt as well. The salt itself should be completely unique for each user and, in my opinion, as long as possible. I would never, ever use just a hash against a string without a salt added to it. Mainly because I'm a bit paranoid and also so that it's a little more future-proof.
Having a delay before a user can try again and auto-lockouts (with auto-admin notifications) is a good idea as well.
The pgcrypto module in PostgreSQL has built-in support for password hashing that is pretty smart about storage, generation, multi-algorithm etc. See http://www.postgresql.org/docs/current/static/pgcrypto.html, the section on Password Hashing Functions. You can also see the pgcrypto section of http://www.hagander.net/talks/hidden%20gems%20of%20postgresql.pdf.
Use SHA1 or SHA256 hashing with salting. That's the way to go for storing passwords.
If you don't need a password recovery mechanism (as opposed to password reset), I think hashing is better than trying to encrypt the password. You can just compare the hashes without any security risk, and even you don't know the user's password.
I would suggest storing it as a salted md5 hash.
INSERT INTO "user" (password) VALUES (md5('some_salt' || 'the_password'));
You could calculate the md5 hash in perl if you wish, it doesn't make much difference unless you are micro-optimizing.
You could also use sha1 as an alternative, but I'm unsure if Postgres has a native implementation of this.
I usually discourage the use of a dynamic random salt, as it is yet another field that must be stored in the database. Plus, if your tables were ever compromised, the salt becomes useless.
I always go with a one-time randomly generated salt and store this in the application source, or a config file.
Another benefit of using a md5 or sha1 hash for the password is you can define the password column as a fixed width CHAR(32) or CHAR(40) for md5 and sha1 respectively.
I remember a guy telling me that if I let him change 4 bytes he can make a file have any checksum he wants (CRC-32).
I heard mention of salting a hash. I am wondering: if someone made his file match my file, would salting the MD5 or SHA-1 hash change the result so that the two files no longer collide? Or does it only change the final hash value?
You are mixing up two different uses of hash values:
Checksumming for guarding against random (non-malicious) errors.
Computing cryptographic message digests for storing passwords, signing messages, certificates ...
CRCs are a good choice for the first application, but totally unsuited for the second, because it is easy to compute a collision (in math-speak: CRCs are linear). This is what your friend is essentially telling you.
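The linearity can be seen directly with zlib.crc32 from the Python standard library. For three messages of equal length, the CRC of their XOR equals the XOR of their CRCs, so CRC values can be manipulated algebraically without any searching. The messages below are arbitrary examples.

```python
import zlib


def xor_bytes(*chunks: bytes) -> bytes:
    """XOR several equal-length byte strings together."""
    out = bytearray(len(chunks[0]))
    for chunk in chunks:
        for i, value in enumerate(chunk):
            out[i] ^= value
    return bytes(out)


# Arbitrary example messages; they must all have the same length.
a = b"pay alice $100.."
b = b"pay mallory $999"
c = b"0123456789abcdef"
assert len(a) == len(b) == len(c)

combined = zlib.crc32(xor_bytes(a, b, c))
separate = zlib.crc32(a) ^ zlib.crc32(b) ^ zlib.crc32(c)
print(combined == separate)  # True
```

A cryptographic hash is designed so that no such simple relation between inputs and outputs exists.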
MD5 and SHA1 are cryptographic hashes intended for the second kind of application. However, MD5 has been broken and SHA1 is considered weak these days: MD5 collisions can now be found in seconds on ordinary hardware, and a practical SHA1 collision has been demonstrated.
As for salt, it makes the computation of the cryptographic hash local by mixing in some random non-secret value; this value is called the salt. This prevents computing global tables which make it easy to compute possible values (e.g. passwords) from the hash value. The computation of the tables is extremely expensive, but without salt the cost would be amortized over many cracked passwords.
The attack (against CRC-32) is irrelevant if the hash you are using is not CRC-32 - MD5 and SHA-1 are not vulnerable to that kind of attack (yet).
The current attacks against MD5 are where an attacker creates two documents with the same hash.
Salts are used for password verification: they hinder an attacker performing an offline attack against the password database. Each user's password has a salt attached to the plain-text before hashing, so a pre-computed rainbow table of plaintext <-> hashed text is useless.
Adding salt to your hash function doesn't really serve any purpose if the digest function has been compromised, because the salt will have to be made public to be used, and the attacker can adjust their file to factor this in too.
The solution to this problem is to use a secure hash function. MD5 has been shown to be vulnerable to hash collisions, and practical collisions have since been demonstrated for SHA-1 as well, so prefer a SHA-2 function such as SHA-256.
Salting is usually used in password hashes to avoid dictionary attacks. There are plenty of web-based reverse hash dictionaries where you enter the hash (say: 1a79a4d60de6718e8e5b326e338ae533) and get back the text: "example". With salt, this becomes next to impossible. If you prepend a random salt to the password, the dictionary attack becomes more difficult.
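A reverse hash dictionary is essentially a precomputed lookup table. The toy version below (three words instead of billions, names made up for illustration) shows why an unsalted hash is found immediately while a salted hash of the same word is not in the table.

```python
import hashlib
import os


def md5_hex(data: bytes) -> str:
    return hashlib.md5(data).hexdigest()


# Toy stand-in for an online reverse-hash dictionary: hash -> plaintext.
common_words = ["example", "password", "letmein"]
reverse_table = {md5_hex(word.encode("utf-8")): word for word in common_words}

# Unsalted: the stored hash is a direct key into the table.
unsalted_hash = md5_hex(b"example")
print(reverse_table.get(unsalted_hash))  # example

# Salted: the random salt is prepended before hashing, so the stored
# hash no longer matches anything that was precomputed.
salt = os.urandom(16)
salted_hash = md5_hex(salt + b"example")
print(reverse_table.get(salted_hash))  # None
```

The salted hash can still be attacked by hashing guesses together with the known salt, which is why a slow hash function is needed in addition to the salt.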
As for collisions, I don't think you need to worry about entire files having the same MD5 or SHA-1 hash; it's not important. The important use of the hash is to prove that the file you receive is the same as the file that was approved by someone who is an authority on the file. If you add salt to the file, you need to send the salt so the user can verify the hash.
This actually makes it easier for the attacker to spoof your file because he can provide a false salt along with the false file. The user can usually tell if the file is faked because it no longer serves the purpose it is supposed to. But how is the user supposed to know the difference between the correct salt and the attacker's salt?