I had to hash my password, and for that I used bcrypt.
Following a tutorial, I found this code:
const bcrypt = require('bcrypt');

const saltRounds = 10;
bcrypt.hash(password, saltRounds).then(hashedPassword => {
  // ...
});
I googled what saltRounds is and what its purpose is, until I found in
What are Salt Rounds and how are Salts stored in Bcrypt? that you can control the time it takes to hash your passwords.
Does this mean you can control the strength of the hashed passwords with just one property, saltRounds?
Yes, more work for bcrypt does indeed increase password security. The whole purpose of increasing work factors is to provide that increase.
Higher bcrypt work factors improve password security at the individual user level, because it's harder to crack each user's hash individually (if the attacker is interested in just one specific account). No hash / factors will protect an extremely weak password, but stronger hashing has a better chance of extending protection "downward" through a weaker / intermediate range of password strengths.
Stronger work factors also increase password security at the aggregate security level (for the entire target user set). This is because higher bcrypt work factors make attacking the entire set at once materially harder - especially as the number of users grows. The more compute / cores / memory are required, the more expensive (in time and resources) it is for the attacker.
The trade-off, of course, is that more work for the attacker (harder to crack) also means more work for the defender / maintainer (higher resource cost for valid authentications). It's important to tune your work factor to be the highest that both you and your users can tolerate, including worst-case scenarios like an "authentication storm" (where all of your users need to re-authenticate simultaneously). A bcrypt cost of 12, and a target per-user delay of .5 seconds, is often a good balance of these factors - but you need to assess (and test!) this for your own use case.
Bonus advice: make sure that your library and supporting code can support multiple costs simultaneously. As hardware gets faster, you'll want to increase your work factor for new/reset passwords, while simultaneously supporting existing ones. Fortunately, most libraries handle this transparently - but again, test for your own use case.
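As a rough illustration of supporting multiple costs at once (a sketch using Python's bcrypt package rather than the Node.js library from the question; TARGET_COST and the helper name are made up for the example), the cost factor is embedded in each stored hash, so you can detect an outdated one at login and rehash while the plaintext is available:

import bcrypt

TARGET_COST = 12  # hypothetical: the cost you currently want for new hashes

def verify_and_maybe_rehash(password: str, stored_hash: bytes):
    # checkpw reads the salt and cost embedded in stored_hash itself.
    if not bcrypt.checkpw(password.encode(), stored_hash):
        return False, stored_hash
    # The cost factor is part of the hash string, e.g. b"$2b$10$..." -> 10.
    stored_cost = int(stored_hash.split(b"$")[2])
    if stored_cost < TARGET_COST:
        # The password is known to be correct here, so upgrade the hash.
        stored_hash = bcrypt.hashpw(password.encode(), bcrypt.gensalt(rounds=TARGET_COST))
    return True, stored_hash

Because the salt and cost travel inside the stored string, old and new hashes can coexist in the same column.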
On our message board, we use password matching to help detect members with multiple registrations and enforce our rules against malicious puppet accounts. It worked well when we had SHA256 hashes and a per-site salt. But we recently had a humbling security breach in which a number of password hashes fell to a dictionary attack. So we forced a password change, and switched to bcrypt + per-user salts.
Of course, now password matching doesn't work anymore. I don't have a formal education in cryptography or computer science so I wanted to ask if there's a secure way to overcome this problem. Somebody I work with suggested a second password field using a loose hashing algorithm which intentionally has lots of collisions, but it seems to me that this would either lead to tons of false positives, or else reduce the search space too much to be secure. My idea was to stick with bcrypt, but store a second password hash which uses a per-site salt and an extremely high iteration count (say 10+ seconds to generate on modern hardware). That way users with the same password would have the same hash, but it couldn't be easily deduced with a dictionary attack.
I'm just wondering if there's an obvious problem with this, or if someone more knowledgeable than me has any suggestions for a better way to approach things? It seems to me like it would work, but I've learned that there can be a lot of hidden gotchas when it comes to security. :P Thanks!
Short Answer
Any algorithm that would allow you to detect whether or not 2 users had the same password would also allow an attacker to detect whether or not 2 users had the same password. This is, effectively, a precomputation attack. Therefore, your problem is not securely solvable.
Example
Assume I've compromised your password database.
Assume I've figured out how your hashes are calculated.
If I can apply your password transformation algorithm to "password" and quickly tell which users use "password" as their password, then the system is vulnerable to a form of precomputation attack.
If I must do an expensive calculation to determine the password for each individual user, and work spent to calculate User A's password does not make calculating User B's password easier, then the system is secure (against these types of attacks).
Further Consideration
Your idea of using a per-site salt with bcrypt and a high iteration count may seem attractive at first, but it just can't scale. Even at 10 seconds per hash, that's 6 password guesses per minute, 360 per hour, 8,640 per day, or about 3 million per year - and that's just one machine. Throw a botnet of machines, or some GPUs, at the problem and suddenly that number goes through the roof: just 300 machines/cores/GPUs could knock out 2.5 million guesses in a day.
Because you would be using the same salt for everyone, you're allowing the attacker to crack all of your users' passwords at once. By sticking with a per-user salt only, the attacker can effectively only attempt to crack a single user's password at a time.
The short answer given above makes the assumption that the attacker has the same access as the server at all times, which is probably not reasonable. If the server is compromised in a permanent manner (owned by the attacker) then no scheme can save you - the attacker can retrieve all passwords as they are set by the user. The model is more normally that an attacker is able to access your server for a limited period of time, some point after it has gone live. This introduces an opportunity to perform the password matching that you've asked about without providing information that is useful to an attacker.
If at sign-up or password change your server has access to the password in plain text, then the server could iterate through all the user accounts on the system, hashing the new password with each user's individual salt, and testing to see if they were the same.
This doesn't introduce any weaknesses, but it would only be useful to you if your algorithm for preventing multiple fake accounts can use this as a one-time input ("this password matches these accounts").
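For instance (a minimal Python sketch with the bcrypt package; the function and variable names are only illustrative), since each bcrypt hash embeds its own salt, checkpw simply redoes the work against each stored hash:

import bcrypt

def accounts_sharing_password(new_password: str, user_hashes: dict) -> list:
    # user_hashes maps user id -> stored bcrypt hash (illustrative structure).
    matches = []
    for user_id, stored_hash in user_hashes.items():
        # checkpw rehashes the candidate with the salt embedded in stored_hash.
        if bcrypt.checkpw(new_password.encode(), stored_hash):
            matches.append(user_id)
    return matches

Note that this costs one full bcrypt verification per existing account, so it becomes expensive as the user base grows.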
Storing that information for later analysis would obviously be a weakness (for if an attacker can obtain your database of passwords, they can probably also obtain this list of accounts with the same password). A middle ground might be to store the information for daily review - reducing the total useful information available to an attacker who temporarily compromises your storage.
All of this is moot if the salting and hashing occurs client-side - then the server can't carry out the test.
I'm a little confused about password safe-keeping.
Let's say I've got a database with a user-account table.
And this is the place where I keep passwords.
At the moment I'm using salted SHA-1.
I read that Blowfish-based functions are better than SHA-1 because they need more time to process a request.
Is there any reason not to use salted SHA-1 and just limit the login attempt count to some reasonable number (for example, 50 times per hour) as a 'firewall' against brute-force attacks?
A person who is working with this database has no need to brute-force anything, because
he can change records with queries.
By Blowfish-based function, you surely mean the bcrypt hash function. As you already stated, bcrypt is designed to be slow (to need some computing time); that's its only advantage over other, faster hash functions, but it is crucial.
With an off-the-shelf GPU, you are able to calculate about 3 giga hash values per second, so you can brute-force a whole English dictionary of 5,000,000 words in less than 2 milliseconds. Even though SHA-1 is a safe hash function in general, that speed makes it inappropriate for hashing passwords.
bcrypt has a cost factor, which can be adapted to future, faster hardware. The cost factor determines how many iterations of hashing are performed. Recently I wrote a tutorial about hashing passwords; I would invite you to have a look at it.
Your point about restricting login attempts makes sense, but the hashing should protect the passwords in case the attacker gets access to the database (SQL injection). Of course you can limit the login attempts, but that has nothing to do with hashing; you could even store the passwords in plaintext in this scenario.
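If you want a feel for the cost factor, here is a quick (and entirely hardware-dependent) timing sketch with Python's bcrypt package; each increment of the cost roughly doubles the time:

import time
import bcrypt

password = b"correct horse battery staple"
for cost in (10, 12, 14):
    start = time.perf_counter()
    bcrypt.hashpw(password, bcrypt.gensalt(rounds=cost))
    print(f"cost={cost}: {time.perf_counter() - start:.3f}s")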
Storing passwords with Blowfish is more secure than with SHA-1 because, as of now, there is no reported practical method of recovering the value behind a Blowfish-protected string. SHA-1, on the other hand, does have published attacks against it, and as a fast hash it is cheap to brute-force. You cannot trust SHA-1 to prevent someone from recovering the data it protects.
If you are open to suggestions, I don't see a need for two-way encryption at all when you are storing passwords. Hashing your users' passwords with a salted SHA-256 scheme may be an option. Allowing your users to reset their own passwords via email is generally considered good policy, and it results in a data set that cannot be easily cracked.
If you do require two-way encryption for any reason, then aside from Blowfish, AES-256 (Rijndael) and Twofish are also currently secure enough to handle sensitive data. Don't forget that you are free to use multiple algorithms to store encrypted data.
On the note of brute forcing: it has little to do with encrypted database storage. You are looking at a full security model when you refer to methods of attack. Using a deprecated algorithm and "making up for it" by implementing policies to reduce the ease of attack is not considered a mature approach to security.
In Short
Use one way hashing for storing passwords, allow users to reset via email
Don't be afraid to use multiple methods to store encrypted data
If you must use an encryption/decryption scheme, keep your keys safe and only use proven algorithms
Preventing brute force attacks is a good mindset, but it will only slow someone down or encourage them to search for other points of entry
Don't take this as gospel: when it comes to security everyone has different requirements, the more research you do the better your methods will become. If you don't completely encapsulate your sensitive data with a full-on security policy, you may get a nasty surprise down the track.
Source: Wikipedia, http://eprint.iacr.org/2005/010
Is there any reason not to use salted SHA-1 and just limit the login attempt count to some reasonable number (for example, 50 times per hour) as a 'firewall' against brute-force attacks?
If you don't hash your passwords with a decent algorithm, you are failing basic security precautions.
Why isn't 'just' blocking login attempts safe?
Well, besides the fact that you would need to block EVERY possible entrance, e.g.:
ssh
webservices (your webapp, phpmyadmin, openpanel, etcetera)
ftp
lots more
You would also need to trust every user that has access to the database and server. I wouldn't like people to be able to read my password, but what I dislike even more is you deciding that for me, metaphorically speaking :-)
Maybe someone else can shed light on the Blowfish vs. SHA discussion, although I doubt that part is a Stack-worthy question.
How is bcrypt stronger than, say,
import hashlib

def md5(data):
    return hashlib.md5(data.encode()).hexdigest()

def md5lots(password, salt, rounds):
    if rounds < 1:
        return password
    else:
        newpass = md5(password + salt)
        return md5lots(newpass, salt, rounds - 1)
I get the feeling, given its hype, that more intelligent people than me have figured out that bcrypt is better than this. Could someone explain the difference in 'smart layman' terms?
The principal difference - MD5 and other hash functions designed to verify data have been designed to be fast, and bcrypt() has been designed to be slow.
When you are verifying data, you want the speed, because you want to verify the data as fast as possible.
When you are trying to protect credentials, the speed works against you. An attacker with a copy of a password hash will be able to execute many more brute force attacks because MD5 and SHA1, etc, are cheap to execute.
bcrypt in contrast is deliberately expensive. This matters little when there are one or two tries to authenticate by the genuine user, but is much more costly to brute-force.
There are three significant differences between bcrypt and hashing multiple times with MD5:
The size of the output: 128 bits (16 bytes) for MD5, versus a 60-character string for bcrypt (which encodes the salt, the cost factor, and the hash). If you store millions of hashes in a database, this has to be taken into account.
Collision attacks are practical against MD5, and its preimage resistance has also been weakened, at least in theory.
Bcrypt can be configured to iterate more and more as CPUs become more and more powerful.
Hence, using salting-and-stretching with MD5 is not as safe as using bcrypt. This issue can be solved by selecting a better hash function than MD5.
For example, if SHA-256 is selected, the output size will be 256-bits (32-bytes). If the salting-and-stretching can be configured to increase the number of iterations like bcrypt, then there is no difference between both methods, except the amount of space required to store result hashes.
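For what that "salting-and-stretching" loop might look like, here is a naive Python sketch for illustration only; a vetted construction such as bcrypt, PBKDF2, scrypt or Argon2 remains the better choice in practice:

import hashlib
import os

def stretch_sha256(password: str, salt: bytes, iterations: int = 100_000) -> bytes:
    # Naive salting-and-stretching: feed each digest back into SHA-256.
    digest = salt + password.encode()
    for _ in range(iterations):
        digest = hashlib.sha256(digest).digest()
    return digest

salt = os.urandom(16)  # per-user random salt, stored alongside the hash
stored = stretch_sha256("hunter2", salt)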
You are effectively talking about implementing PBKDF2, a Password-Based Key Derivation Function. Effectively it is the same kind of thing as bcrypt, the advantage being that you can lengthen the amount of CPU time it takes to derive a password. The advantage of this over something like bcrypt is that, by recording how many 'iterations' you have put the password through, when you need to increase that number you could do so without resetting all the passwords in the database. Just have your algorithm pick up the end result as if it were at the nth iteration (where n is the previous iteration count) and keep going!
It is recommended that you use a proper PBKDF2 library instead of creating your own, because, let's face it, as with all cryptography, the only way you know something is safe is if it has been 'tested' by the interwebs (see here).
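For reference, Python's standard library already ships PBKDF2; a minimal sketch (the iteration count is just an example value, to be tuned and stored alongside the hash):

import hashlib
import os

salt = os.urandom(16)
iterations = 600_000  # example value only
derived = hashlib.pbkdf2_hmac("sha256", b"hunter2", salt, iterations)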
Systems that use this method:
.NET has a library already implemented. See it here
Mac, Linux and Windows file encryption use high-iteration-count (10,000+) versions of this key-derivation method to secure their file systems.
Wi-Fi networks are often secured using this kind of key derivation as well.
Source
Thanks for asking the question; it forced me to research the method I was using for securing my passwords.
TTD
Although this question is already answered, I would like to point out a subtle difference between bcrypt and your hashing loop. I will ignore the deprecated MD5 algorithm and the exponential cost factor, because you could easily improve these aspects of your question.
You are calculating a hash value and then using the result to calculate the next hash value. If you look at the implementation of bcrypt, you can see that each iteration uses the resulting hash value as well as the original password (key).
Eksblowfish(cost, salt, key)
    state = InitState()
    state = ExpandKey(state, salt, key)
    repeat (2^cost)
        state = ExpandKey(state, 0, key)
        state = ExpandKey(state, 0, salt)
    return state
This is the reason you cannot take a bcrypt-hashed password and continue iterating: you would have to know the original password to do so. I cannot prove it, but I suppose this makes bcrypt safer than a simple hashing loop.
Strictly speaking, bcrypt actually encrypts the text:
OrpheanBeholderScryDoubt
64 times.
But it does it with a key that was derived from your password and some randomly generated salt.
Password hashing is not hashing
The real virtue of "password hashing algorithms" (like bcrypt) is that they use a lot of RAM.
SHA2 is designed to be fast. If you're a real-time web server and you want to validate file integrity, you want something that runs extraordinarily fast with extraordinarily low resource usage. That is the antithesis of password hashing.
SHA2 is designed to be fast
SHA2 can operate with 128 bytes of RAM
SHA2 is easily implementable in hardware
I own a USB stick device that can calculate 330 million hashes per second
in fact, I own 17 of them
If you perform a "fast" hash multiple times (e.g. 10,000 iterations is a common recommendation for PBKDF2), then you're not really adding any security.
What you need is a hash that is difficult to implement in hardware. What you need is a hash that is hard to parallelize on a GPU.
Over the last few decades we've learned that RAM is the key to slowing down password-hashing attempts. Custom hardware shines at performing raw computation (in fact, only 1% of your CPU is dedicated to computation - the rest is dedicated to turning the machine instructions into something faster: prefetching, out-of-order execution, branch prediction, caching). The way to stymie custom hardware is to make the algorithm have to touch a lot of RAM.
SHA2: 128 bytes
bcrypt: 4 KB
scrypt (configurable): 16 MB in LiteCoin
Argon2 (configurable): 64 MB in documentation examples
Password hashing does not mean simply using a fast hash multiple times.
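To make the memory knob concrete, here is a hedged sketch using the argon2-cffi package (the parameter values are illustrative, not a recommendation):

from argon2 import PasswordHasher  # argon2-cffi package

# memory_cost is in kibibytes, so 64 * 1024 means 64 MiB touched per hash.
ph = PasswordHasher(time_cost=3, memory_cost=64 * 1024, parallelism=4)
hashed = ph.hash("hunter2")
ph.verify(hashed, "hunter2")  # raises an exception on mismatch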
A modern recommended bcrypt cost factor is 12; so that it takes about 250 ms to compute.
You would have to perform about 330,000 iterations of SHA2 to equal that time cost on a modern single-core CPU.
But then we get back to my 2.5W USB SHA2 stick and its 330 Mhashes/sec. In order to defend against that, it would have to be 83M iterations.
If you try to add only CPU cost, you're losing.
You have to add memory cost
bcrypt is 21 years old, and it only uses 4KB. But it is still ~infinitely better than any amount of MD5, SHA-1, or SHA2 hashing.
I am designing a storage cloud software on top of a LAMP stack.
Files could have an internal ID, but it would have many advantages to store them not with an incrementing ID as the filename in the servers' filesystems, but using a hash as the filename.
Also, hashes as identifiers in the database would have a lot of advantages if the currently centralized database should ever be sharded or decentralized, or if some sort of master-master high-availability environment should be set up. But I am not sure about that yet.
Clients can store files under any string (usually some sort of path and filename).
This string is guaranteed to be unique, because on the first level there is something like "buckets" that users have to register, as in Amazon S3 and Google Storage.
My plan is to store files under the hash of the client-side-defined path.
This way the storage server can serve the file directly, without needing to ask the database for the ID, because it can calculate the hash and thus the filename on the fly.
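Roughly what I have in mind (a Python sketch just to illustrate the idea; the real system is PHP on a LAMP stack, and the names are made up):

import hashlib

def storage_filename(bucket: str, client_path: str) -> str:
    # The bucket/path string is already globally unique; its hash becomes the filename.
    key = f"{bucket}/{client_path}"
    return hashlib.sha1(key.encode("utf-8")).hexdigest()

print(storage_filename("my-bucket", "photos/2024/cat.jpg"))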
But I am afraid of collisions. I currently think about using SHA1 hashes.
I heard that Git also uses hashes as revision identifiers.
I know that the chances of collisions are really really low, but possible.
I just cannot judge this. Would you or would you not rely on hash for this purpose?
I could also use some normalized encoding of the path, maybe Base64, as the filename, but I really do not want that because it could get messy, paths could get too long, and there could be other complications.
Assuming you have a hash function with "perfect" properties (and cryptographic hash functions approach that), the theory that applies is the same theory that applies to birthday attacks. What this says is that, given a maximum number of files, you can make the collision probability as small as you want by using a larger hash digest size. SHA-1 has 160 bits, so for any practical number of files the probability of collision is going to be just about zero. If you look at the table in the link you'll see that a 128-bit hash with 10^10 files has a collision probability of about 10^-18.
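As a back-of-the-envelope check (a sketch using the standard birthday-bound approximation p ≈ n^2 / 2^(b+1), not the exact figures from the linked table):

def collision_probability(n_files: int, hash_bits: int) -> float:
    # Birthday-bound approximation: p ~= n^2 / 2^(bits + 1)
    return n_files ** 2 / 2 ** (hash_bits + 1)

print(collision_probability(10**10, 128))  # ~1.5e-19
print(collision_probability(10**10, 160))  # ~3.4e-29 for a 160-bit hash like SHA-1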
As long as the probability is low enough I think the solution is good. Compare with the probability of the planet being hit by an asteroid, undetectable errors in the disk drive, bits flipping in your memory etc. - as long as those probabilities are low enough we don't worry about them because they'll "never" happen. Just take enough margin and make sure this isn't the weakest link.
One thing to be concerned about is the choice of the hash function and its possible vulnerabilities. Is there any other authentication in place, or does the user simply present a path and retrieve a file?
If you think about an attacker trying to brute-force the scenario above, they would need to request 2^18 files before they can get some other random file stored in the system (again assuming a 128-bit hash and 10^10 files; you'll have far fewer files and a longer hash). 2^18 is a pretty big number, and the speed at which you can brute-force this is limited by the network and the server. A simple lock-the-user-out-after-x-attempts policy can completely close this hole (which is why many systems implement this sort of policy). Building a secure system is complicated and there will be many points to consider, but this sort of scheme can be perfectly secure.
Hope this is useful...
EDIT: another way to think about this is that practically every encryption or authentication system relies on certain events having very low probability for its security. E.g. I might get lucky and guess a prime factor of a 512-bit RSA key, but it is so unlikely that the system is considered very secure.
Whilst the probability of a collision might be vanishingly small, imagine serving a highly confidential file from one customer to their competitor just because there happens to be a hash collision.
= end of business
I'd rather use hashing for things that were less critical when collisions DO occur ;-)
If you have a database, store the files under GUIDs - so not an incrementing index, but a proper globally unique identifier. They work nicely when it comes to distributed shards / high availability etc.
Imagine the worst-case scenario and assume it will happen the week after you are featured in Wired magazine as an amazing startup... that's a good stress test for the algorithm.
I've spent the past 2 hours reading up on salting passwords, making sure that I understood the idea. I was hoping some of you could share your knowledge on my conclusions.
If I'm an attacker, and I gain access to a user database, I could just take all the per-user salts present in the table and use those to create my rainbow tables. For big tables this could take a long time. If I could cut the list down to users of interest (admins, mods) I could use much bigger dictionary lists to create the rainbow tables, raising my percentage of hits...
If this is true then it seems that salting really doesn't do all that much to help. It only marginally slows down an attacker.
I know ideally you would want to force complex passwords and salt them with unique and random strings, but forcing complex passwords can annoy users (I know it annoys me), so a lot of sites don't do it. It seems sites are doing their users a disservice with this, and that forcing complex passwords is a lot more important than a good salting method.
I guess this isn't so much a question as a request for others' knowledge of the situation.
The point of a salt is that an attacker can no longer use a pre-existing dictionary to attack any user in your system. They have to create a brand new dictionary for each user using that user's salt, which takes time and effort. If you learn about a breach before dictionaries are created for all users of your system, you have time to act. (Alert users that their log-in credentials must be changed, which should generate a new random salt.)
I would say that you should use both a salt and the most complex password (pass phrase, really) that your users will tolerate. Even still, salting is a fundamental security measure, and you can't really afford to do without it.
Is keeping properly hydrated more important than breathing?
I tend to favor an approach that uses a salt per user, global salt (salt per algorithm), and modest password complexity rules (8+ characters with some combination of at least 2 uppercase/digit/punctuation characters) for most web sites. Using salts requires the generation of a rainbow table per account you want to break -- assuming unique salts per user. Using a global salt requires that you both compromise the DB and the application server. In my case, these are always two separate systems. Using password complexity rules helps to protect against simple, easy to guess passwords being used.
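As a sketch of how a per-user salt can be combined with a global salt (assuming Python's bcrypt and hmac; the constant name is illustrative, and the secret would live on the application server, not in the database):

import hashlib
import hmac
import bcrypt

PEPPER = b"application-server-secret"  # hypothetical global salt, kept off the DB server

def hash_password(password: str) -> bytes:
    # Mix in the server-side secret first, then bcrypt with a random per-user salt.
    peppered = hmac.new(PEPPER, password.encode(), hashlib.sha256).hexdigest()
    return bcrypt.hashpw(peppered.encode(), bcrypt.gensalt())

def check_password(password: str, stored: bytes) -> bool:
    peppered = hmac.new(PEPPER, password.encode(), hashlib.sha256).hexdigest()
    return bcrypt.checkpw(peppered.encode(), stored)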
For accounts with more privileges, you may want to enforce greater password complexity. For example, admins in our AD forest are required to have a minimum 15 character passwords. It's actually easier than shorter passwords because it pretty much forces you to use a pass phrase rather than a password.
You also want to instruct your users in how to create good password, or better yet, pass phrases and to be aware of various social engineering attacks that circumvent all of your technical means of protecting your data.
Okay, let's look at real figures:
A single Nvidia 9800GTX can calculate 350 million MD5 hashes per second. At this rate, the entire 8-character keyspace of lower- and uppercase alphanumeric characters will be done in about 7 days; the 7-character keyspace, in about two hours. Applying salting will only double or triple these times, depending on your algorithm.
Cheap modern GPUs will easily boast one billion MD5 hashes per second. Determined people typically link up about 6 of these and get 6 billion per second, rendering the 9-character keyspace obsolete in 26 days.
Note that I'm talking about brute force here, as preimage attacks may or may not apply after this level of complexity.
Now if you want to defend against professional attackers, there is no reason they can't get 1 trillion hashes per second; they'd just use specialized hardware or a farm of cheap GPU machines, whichever is cheaper.
And boom, your 10-character keyspace is done in 9.7 days, but 11-character passwords take 602 days. Notice that at this point, adding 10 or 20 special characters to the allowed character list will only bring the cracking time of a 10-character keyspace up to 43 or 159 days, respectively.
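Those figures are easy to reproduce yourself (a quick sketch, assuming the same 1 trillion guesses per second):

def days_to_exhaust(charset_size: int, length: int, rate: float) -> float:
    return charset_size ** length / rate / 86_400  # 86,400 seconds per day

rate = 1_000_000_000_000  # 1 trillion guesses per second
print(days_to_exhaust(62, 10, rate))  # ~9.7 days
print(days_to_exhaust(62, 11, rate))  # ~602 days
print(days_to_exhaust(72, 10, rate))  # ~43 days  (62 chars + 10 specials)
print(days_to_exhaust(82, 10, rate))  # ~159 days (62 chars + 20 specials)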
See, the problem with password hashing is that it only delays your futile doom. If you want something really strong, but still as naive as storing hashed passwords, go for PBKDF2.
Then there is still one more problem: will the user use this "strong" password you forced him to use on all his other sites? If he doesn't save his passwords in a master password file, he most certainly will. And those other sites won't use the same strong hashing algorithms you use, defeating the purpose of your system. I can't really see why you would want your hashes to be super strong if it isn't to stop users from using the same password on multiple sites; if an attacker has access to your hashes, you have most likely already lost.
On the other hand, like I will repeat and repeat again to people asking questions about how "secure" their hashing scheme is, just use client certificates, and all your problems are solved. It becomes impossible for users to use the same credentials on multiple sites, attackers cannot break your credentials without modifying them, users cannot easily have their credentials stolen if they store them on a smart card, etc etc.
To naively answer your question: a strong password is only backed by a strong hashing algorithm.
With the sole exception of a requirement for a long string, every constraint reduces the size of password phase space. Constraints cause a decrease in complexity, not an increase.
If this seems counterintuitive, consider that you are providing a bunch of reliable assumptions for the cracker. Let me illustrate this point with a true story from my misspent youth:
In the early days of twin-primes encryption, processors were so slow that people tended to use int32 arithmetic for speed. This allowed me to assume the primes were between 0 and four billion. People always picked large primes because conventional wisdom held that bigger was better. So I used a pre-computed dictionary of primes and worked down from a known ceiling, knowing that people generally chose keys close to that ceiling. This generally allowed me to crack their key in about 30 seconds.
Insist on long pass phrases, and use salting, with no other constraints.
When people say "sophisticated" techniques they often mean complicated. A transformation can be very complicated and yet be commutative, and unfortunately if you don't know what that means then you're not in a position to assess the merits of the technique. Complexity of an algorithm lends only security through obscurity, which is a bit like getting a house out of town and not locking the doors.
Use sophisticated techniques like salthash to keep your users' private information safe.
But don't obstruct your users. Offer suggestions, but don't get in their way.
It's up to your users to pick good passwords. It's up to you to suggest how to pick good passwords, and to accept any password given and keep the user's information as safe as the password given permits.
Both salted passwords and complex passwords are necessary for real security. Typically rainbow tables aren't created on the fly to attack a specific site, but are rather precomputed. It would be far more efficient to simply brute force a password than to generate a look up table based on a particular salt value.
That being said, the purpose of a hash is to ensure that an attacker can't recover a password if your database is compromised. It does nothing to prevent an attacker from guessing an easy password.
Requiring password complexity is really a matter of the type of site and the type of data you are protecting. It will annoy some users, and cause others to write their password on a Post-it note and stick it to their monitor. I'd say it is absolutely essential to use a strong hash and salt on your end; neglecting to do so not only exposes your site, but completely compromises every user who recycles username/password combinations.
So in my opinion salting is mandatory regardless of the security level of your site. Enforced password complexity is good for high security sites - but is definitely more situational. It won't guarantee good security practices on the part of your users. I'll also add that requiring a secure password for a site that doesn't require it can do more harm than good as it is more likely that a user will recycle a high-security password that they use on other more essential sites.