How to securely detect accounts with matching passwords?


On our message board, we use password matching to help detect members with multiple registrations and enforce our rules against malicious puppet accounts. It worked well when we had SHA256 hashes and a per-site salt. But we recently had a humbling security breach in which a number of password hashes fell to a dictionary attack. So we forced a password change, and switched to bcrypt + per-user salts.
Of course, now password matching doesn't work anymore. I don't have a formal education in cryptography or computer science so I wanted to ask if there's a secure way to overcome this problem. Somebody I work with suggested a second password field using a loose hashing algorithm which intentionally has lots of collisions, but it seems to me that this would either lead to tons of false positives, or else reduce the search space too much to be secure. My idea was to stick with bcrypt, but store a second password hash which uses a per-site salt and an extremely high iteration count (say 10+ seconds to generate on modern hardware). That way users with the same password would have the same hash, but it couldn't be easily deduced with a dictionary attack.
I'm just wondering if there's an obvious problem with this, or if someone more knowledgeable than me has any suggestions for a better way to approach things? It seems to me like it would work, but I've learned that there can be a lot of hidden gotchas when it comes to security. :P Thanks!

Short Answer
Any algorithm that would allow you to detect whether or not 2 users had the same password would also allow an attacker to detect whether or not 2 users had the same password. This is, effectively, a precomputation attack. Therefore, your problem is not securely solvable.
Example
Assume I've compromised your password database.
Assume I've figured out how your hashes are calculated.
If I can apply your password transformation algorithm to "password" and quickly tell which users use "password" as their password, then the system is vulnerable to a form of precomputation attack.
If I must do an expensive calculation to determine the password for each individual user, and work spent to calculate User A's password does not make calculating User B's password easier, then the system is secure (against this type of attack).
Further Consideration
Your idea of using a per-site salt with bcrypt and a high iteration count may seem attractive at first, but it just can't scale. Even at 10 seconds per hash, that's 6 password guesses per minute, 360 per hour, 8,640 per day, or about 3.15M per year (that's a lot). And that's just one machine. Throw a botnet of machines or some GPUs at that problem and suddenly that number goes through the roof. Just 300 machines/cores/GPUs could knock out about 2.6M guesses in a day.
Because you would be using the same salt for each one, you're allowing the attacker to crack all of your users' passwords at once. If you stick with a per-user salt only, the attacker can effectively only attempt to crack a single user's password at a time.
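The difference between a shared salt and per-user salts can be sketched by counting how many slow hash evaluations an attacker needs. This is only an illustration, not the poster's actual scheme: it uses PBKDF2 from Python's standard library as a stand-in for bcrypt, a deliberately tiny iteration count so it runs quickly, and made-up user names, passwords and guesses.

```python
import hashlib
import os

ITERATIONS = 1_000  # deliberately tiny for the demo; real systems use far more


def slow_hash(password: str, salt: bytes) -> bytes:
    # PBKDF2-HMAC-SHA256 stands in for bcrypt here (assumption for this sketch).
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)


def crack_shared_salt(stored: dict, site_salt: bytes, guesses: list):
    """With one site-wide salt, a single hash per guess tests every user."""
    evaluations = 0
    cracked = {}
    for guess in guesses:
        digest = slow_hash(guess, site_salt)
        evaluations += 1
        for user, user_hash in stored.items():
            if user_hash == digest:
                cracked[user] = guess
    return cracked, evaluations


def crack_per_user_salt(stored: dict, guesses: list):
    """With per-user salts, each guess must be re-hashed for every user."""
    evaluations = 0
    cracked = {}
    for user, (salt, user_hash) in stored.items():
        for guess in guesses:
            evaluations += 1
            if slow_hash(guess, salt) == user_hash:
                cracked[user] = guess
                break
    return cracked, evaluations


# Hypothetical users, passwords and guess list, for illustration only.
passwords = {"alice": "hunter2", "bob": "password", "carol": "correct horse"}
guesses = ["123456", "password", "hunter2", "letmein"]

site_salt = os.urandom(16)
shared = {user: slow_hash(pw, site_salt) for user, pw in passwords.items()}

per_user = {}
for user, pw in passwords.items():
    salt = os.urandom(16)
    per_user[user] = (salt, slow_hash(pw, salt))

shared_cracked, shared_cost = crack_shared_salt(shared, site_salt, guesses)
per_user_cracked, per_user_cost = crack_per_user_salt(per_user, guesses)
print(shared_cost, per_user_cost)
```

In the shared-salt case the attacker's cost depends only on the number of guesses. In the per-user case it grows with the number of accounts as well, which is the point the answer is making.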

The short answer given above makes the assumption that the attacker has the same access as the server at all times, which is probably not reasonable. If the server is compromised in a permanent manner (owned by the attacker) then no scheme can save you - the attacker can retrieve all passwords as they are set by the user. The model is more normally that an attacker is able to access your server for a limited period of time, some point after it has gone live. This introduces an opportunity to perform the password matching that you've asked about without providing information that is useful to an attacker.
If at sign-up or password change your server has access to the password in plain text, then the server could iterate through all the user accounts on the system, hashing the new password with each user's individual salt, and testing to see if they were the same.
This doesn't introduce any weaknesses, but it would only be useful to you if your algorithm for preventing multiple fake accounts can use this as a one-time input ("this password matches these accounts").
Storing that information for later analysis would obviously be a weakness (for if an attacker can obtain your database of passwords, they can probably also obtain this list of accounts with the same password). A middle ground might be to store the information for daily review - reducing the total useful information available to an attacker who temporarily compromises your storage.
All of this is moot if the salting and hashing occurs client-side - then the server can't carry out the test.
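The sign-up-time check described in this answer could look roughly like the sketch below. It assumes the server sees the plain-text password at that moment and that each account stores a `(salt, hash)` pair. PBKDF2 from Python's standard library stands in for bcrypt, and the account names and the `accounts_sharing_password` helper are hypothetical.

```python
import hashlib
import hmac
import os

ITERATIONS = 1_000  # tiny for the demo; a real deployment would use far more


def hash_password(password: str, salt: bytes) -> bytes:
    # Stand-in for bcrypt (assumption): PBKDF2-HMAC-SHA256 with a per-user salt.
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)


def make_account(password: str) -> tuple:
    """Create the (salt, hash) pair that would be stored for a new account."""
    salt = os.urandom(16)
    return salt, hash_password(password, salt)


def accounts_sharing_password(new_password: str, accounts: dict) -> list:
    """Return the names of existing accounts whose stored hash matches new_password.

    accounts maps username -> (salt, stored_hash). The plain-text password is
    only available while the sign-up or password-change request is handled.
    """
    matches = []
    for username, (salt, stored_hash) in accounts.items():
        candidate = hash_password(new_password, salt)
        if hmac.compare_digest(candidate, stored_hash):
            matches.append(username)
    return matches


# Hypothetical accounts, for illustration only.
accounts = {
    "puppet_master": make_account("s3cret-sauce"),
    "innocent_user": make_account("another password"),
    "sock_puppet_1": make_account("s3cret-sauce"),
}

matches = accounts_sharing_password("s3cret-sauce", accounts)
print(matches)
```

Note that this costs one slow hash per existing account for every sign-up or password change, so on a large board it would probably have to run as a background job rather than inside the request.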

Related

Best practice for storing password in database?

I'm creating a new login system for my single-page app. This system will require an administrator to create accounts for the users. Once an account has been set up for a user, I will send them an email where they have to enter their information, such as a security question and password. I have done some research and looked over our existing system. There is a hash function that is used together with a salt. I read a few articles, and there is a lot of debate about hashes being vulnerable. I also see that in this case the hashed password is stored as well as the salt, in separate columns. Is it good practice to store the salt in the DB? Also, is there a better way to store passwords in a database? Here is an example of the logic that I found:
<cfset password = trim(FORM.password)>
<cfset salt = randomSalt()> <!--- randomSalt() is a function that generates a random salt. --->
<cfset totPW = password & salt>
<cfset hashedPW = hash(totPW,"SHA-256")>
I'm currently using ColdFusion 2016. I'm not sure if there is a better way to encrypt the password in CF. If anyone can provide a useful resource or example, please let me know. Thanks.

Generally speaking, hashing is still fine these days. The thing that matters here is what hashing algorithm you use (and how many iterations). In case of database leaks, you want to make it as difficult as possible to match inputs (common password/dictionary-based attacks). Salting does help a little bit, and so does having an irregular salt pattern or a hidden number of iterations based on the username etc. Having different hashing strategies helps as long as the attacker doesn't know how your hashing is implemented. (Accessing the database server is one thing, accessing your source code another.) It's about causing effort to the attacker.
Now about hashing algorithms: SHA-2 is easier to attack than for example bcrypt due to being targetable by GPUs. The number of iterations on the hash will take the attacker more time for each password. bcrypt isn't supported by hash(), but at least SHA-512 is. hash() supports iterations (see docs). Rule of thumb is having an iteration count that takes at least a second to process on your server hardware.
On a side note: Don't trim the password input, people might intend to use leading/trailing whitespaces.
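As a rough sketch of the salt-plus-iterations idea, written in Python rather than ColdFusion: the example below uses PBKDF2-HMAC-SHA512 from the standard library, which is not the same construction as calling `hash()` with an iteration count. The `create_record` and `verify` names and the iteration count are assumptions for illustration. The salt is stored next to the hash, as in the existing system, and the password is not trimmed.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # assumed value; tune it to your own server hardware


def create_record(password: str, iterations: int = ITERATIONS) -> dict:
    """Hash a password with a fresh random salt and return what to store."""
    salt = os.urandom(16)
    # The password is used exactly as typed: no trimming of whitespace.
    digest = hashlib.pbkdf2_hmac("sha512", password.encode("utf-8"), salt, iterations)
    return {"salt": salt, "iterations": iterations, "hash": digest}


def verify(password: str, record: dict) -> bool:
    """Recompute the hash with the stored salt and compare in constant time."""
    digest = hashlib.pbkdf2_hmac(
        "sha512", password.encode("utf-8"), record["salt"], record["iterations"]
    )
    return hmac.compare_digest(digest, record["hash"])


record = create_record(" my passphrase ")
print(verify(" my passphrase ", record))  # same input, including the spaces
print(verify("my passphrase", record))    # trimmed input is a different password
```

Storing the salt in its own column is fine: it is not a secret, and its job is only to make each user's hash unique.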

Why to use blowfish for passwords?

I'm a little confused about password-safe-keeping.
Let's say I've got database with user-account table.
And this is the place where I keep passwords.
At this time I'm using salted SHA-1.
I read that Blowfish-based functions are better than SHA-1 because they need more time to process a request.
Is there any reason not to use salted SHA-1 and just limit the login attempt count to some reasonable number (for example, 50 times per hour) as a 'firewall' against brute-force attacks?
A person who is working with this database has no need to brute-force anything because he can change records with queries.

With blowfish based function, you surely mean the BCrypt hash function. As you already stated BCrypt is designed to be slow (need some computing time), that's the only advantage over other fast hash functions, but this is crucial.
With an off-the-shelf GPU, you are able to calculate about 3 Giga hash values per second, so you can brute-force a whole English dictionary with 5'000'000 words in less than 2 milliseconds. Even if SHA-1 is a safe hash function, that makes it inappropriate for hashing passwords.
BCrypt has a cost factor, which can be adapted to future, and therefore faster, hardware. The cost factor determines how many iterations of hashing are performed. Recently I wrote a tutorial about hashing passwords; I would invite you to have a look at it.
Your point about restricting login attempts makes sense, but the hashing should protect the passwords in case the attacker has access to the database (SQL-injection). Of course you can limit the login attempts, but that has nothing to do with hashing, you could even store the passwords plaintext in this scenario.
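The arithmetic behind these figures is easy to check. The sketch below uses the numbers quoted in this answer for the fast hash. The rate of 10 hashes per second for a slow hash is an assumed figure for comparison, and `bcrypt_rounds` only reflects the fact that bcrypt's cost parameter is a base-2 logarithm of its iteration count.

```python
def seconds_to_exhaust(candidates: int, hashes_per_second: float) -> float:
    """Time needed to hash every candidate once at a given rate."""
    return candidates / hashes_per_second


def bcrypt_rounds(cost: int) -> int:
    """bcrypt's cost factor is logarithmic: each increment doubles the work."""
    return 2 ** cost


DICTIONARY_WORDS = 5_000_000

# Figures from the answer: about 3 billion SHA-1 hashes per second on a GPU.
fast_seconds = seconds_to_exhaust(DICTIONARY_WORDS, 3e9)

# Assumed figure for comparison: a slow hash tuned to 10 evaluations per second.
slow_seconds = seconds_to_exhaust(DICTIONARY_WORDS, 10)

print(f"fast hash: {fast_seconds * 1000:.2f} ms")
print(f"slow hash: {slow_seconds / 86_400:.1f} days")
print(f"cost 12 means {bcrypt_rounds(12)} rounds, cost 13 means {bcrypt_rounds(13)}")
```

The same dictionary costs the attacker milliseconds against a fast hash and days against a slow one, for each salted password hash under attack.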

Storing passwords in Blowfish is more secure than SHA-1 because, as of now, there has been no reported method of obtaining the value of a Blowfish-encrypted string. SHA-1, on the other hand, does have reported methods of obtaining data from encrypted strings. You cannot trust SHA-1 to prevent someone from obtaining its data.
If you are open to suggestion, I don't see a need to work with two-way encryption at all as you are storing passwords. Hashing your users passwords with a salted SHA-256 method may be an option. Allowing your users to reset their own passwords via Email is generally considered a good policy, and it results in a data set that cannot be easily cracked.
If you do require two-way encryption for any reason, aside from Blowfish, AES-256 (Rijndael) or Twofish are also currently secure enough to handle sensitive data. Don't forget that you are free to use multiple algorithms to store encrypted data.
On the note of brute forcing, it has little to do with encrypted database storage. You are looking at a full security model when you refer to methods of attack. Using a deprecated algorithm and "making up for it" by implementing policies to prevent ease of attack is not considered a mature approach to security.
In Short
Use one way hashing for storing passwords, allow users to reset via email
Don't be afraid to use multiple methods to store encrypted data
If you must use an encryption/decryption scheme, keep your keys safe and only use proven algorithms
Preventing brute force attacks is a good mindset, but it will only slow someone down or encourage them to search for other points of entry
Don't take this as gospel: when it comes to security everyone has different requirements, the more research you do the better your methods will become. If you don't completely encapsulate your sensitive data with a full-on security policy, you may get a nasty surprise down the track.
Source: Wikipedia, http://eprint.iacr.org/2005/010

Is there any reason why not to use salted sha1 and just limit login
attempt count to some reasonable number (for example 50times per hour)
as a 'firewall' to bruteforce attacks?
If you don't encrypt your passwords with any decent algorithm you are failing basic security precautions.
Why isn't 'just' blocking login attempts safe?
Well, besides the fact that you would need to block EVERY possible entrance, e.g.:
ssh
webservices (your webapp, phpmyadmin, openpanel, etcetera)
ftp
lots more
You would also need to trust every user that has access to the database and server. I wouldn't like people to read my password, but what I dislike even more is you deciding for me, metaphorically speaking :-)
Maybe someone else can shed light on the Blowfish vs SHA discussion, although I doubt that part is a stackworthy formatted question

Find Encryption method and key from before and after encryption string

I have a database with roughly 4,000 members. I need to decrypt all their passwords so that I may re-encrypt them using MD5 and put into a new system. I have one account with before encryption and after encryption password strings so with this I assume I should be able to figure out a method and key range. I do not have access to the original source code or I would not be here.
I should mention that existing passwords vary dramatically.
User1 - qwDyP1db9iOPI
User2 - $1$.WXtDyLg$yxBPAWxzd.srEWJZfqZAY/
User3 - do8sLMoVVqxKQ
User4 - TKmnAlPe
How do I approach such a problem?

Typically, you can't decrypt the passwords, because the passwords are not stored. What you are seeing is hashes of the password - there's no efficient way to find a password that matches a particular hash.
You can either force everyone to change their password once you have the configuration changed to use MD5 hashing, or you can generate new temporary passwords for everyone to get all the hashes replaced at once, but then they should all change them again anyway.
Those look like they probably came from Linux (the one that starts with $1$ is already an MD5 hash, the others look like the older crypt() DES variant style).
That's not always the case when dealing with application passwords - for example, some databases use considerably weaker credential storage - but Unix/Linux flavors have not stored actual passwords for a very long time, if ever.
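The format guess made in this answer can be turned into a small classifier. This is a hedged sketch that relies only on the well-known modular crypt prefixes and on the 13-character shape of traditional DES crypt output. It identifies the likely scheme, not the password, and `guess_scheme` is a hypothetical helper name.

```python
import re

# Traditional DES crypt: 2 salt characters + 11 hash characters, all from [./0-9A-Za-z].
DES_CRYPT = re.compile(r"[./0-9A-Za-z]{13}")

# Well-known modular crypt format prefixes.
PREFIXES = {
    "$1$": "MD5-crypt",
    "$2a$": "bcrypt",
    "$2b$": "bcrypt",
    "$2y$": "bcrypt",
    "$5$": "SHA-256-crypt",
    "$6$": "SHA-512-crypt",
}


def guess_scheme(stored: str) -> str:
    """Guess which crypt() variant produced a stored password string."""
    for prefix, name in PREFIXES.items():
        if stored.startswith(prefix):
            return name
    if DES_CRYPT.fullmatch(stored):
        return "traditional DES crypt"
    return "unknown"


# The four samples from the question.
samples = [
    "qwDyP1db9iOPI",
    "$1$.WXtDyLg$yxBPAWxzd.srEWJZfqZAY/",
    "do8sLMoVVqxKQ",
    "TKmnAlPe",
]
for sample in samples:
    print(sample, "->", guess_scheme(sample))
```

In this sketch the fourth sample, at 8 characters, does not fit either pattern, so it is reported as unknown.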

Let's think. Suppose I can hack my way into your computer / DB (not an impossible task). So I have what you have - a list of all users' passwords, but encrypted. But most probably I can register on your system myself, and I know my own password. I can find its encrypted form in the DB, either by my username or by the latest entry, so now I have the clear + encrypted pair! If I could do what you are asking, I'd have all passwords and full access to all accounts. That would mean that encrypting passwords, or data in general, has virtually no additional security effect, and all I should care about is preventing unauthorized access to the data. But this is not true, so you can't :)
BTW, MD5 is not an encryption but a hash, or more correctly a digest.

Looks like twalberg is right for the most part, but I'd like to go a step further to give you some practical advice:
1) Look at http://www.openwall.com/john/ (John the Ripper) for help getting the original passwords out of the database you have. Note that the passwords generated may not be the original passwords (it is likely, but not certain).
Once you've gotten all of the passwords using this tool, think to yourself: Anyone could do this. Your users might use these passwords somewhere else. If someone broke into your site, they could expose the people who trust you. Your users could blame you if someone used this information to break into their other accounts. Having the cleartext passwords around is risky for you.
I recommend you take this opportunity to have your users change their passwords. Those that no longer use their accounts should probably have to go through a small verification step to regain access, so you can disable their accounts now.
2) When you generate the new password database, I recommend you at a minimum use a secure hash (E.g. SHA1 or SHA2) along with a per-user salt (random value prepended to the generated hash output in a standard way) to help secure the database from accidental exposure.
While this answer isn't efficient, it is probably practical for the most part for your current database if your actual password database has even approximately the proportions you represent.

If you're switching away from your current system don't switch to plain MD5 - there are "rainbow tables" to crack a lot of common passwords; instead use salted/double MD5 or SHA

Thanks all for your input. I will just have everyone change their passwords. I have no intention of keeping anyone's password. The goal was to not have all 4000 users have to change their passwords after migrating to the new system, but it seems unavoidable. As davka pointed out, I have an account in this database, know my password and can see the encrypted version, so I thought that would be sufficient to develop a script that decrypts the passwords and re-enters them into the new database. Thanks again.

Preventing preimage attack on limited set of values

I have asked about the cost of running a preimage attack on the hashes of social security numbers. The excellent answer I got was that the type of social security numbers only has 366,000,000 hashes, which would make it easy to run a preimage attack.
My question now is whether it is possible to avoid a preimage attack altogether. My scenario is that several clients need to store the social security number on a central server. The hashing must be consistent between the clients. The clients could communicate with online web services.

Your problem is similar to what must be done when using passwords. Passwords fit in human brains and, as such, cannot be very difficult to guess.
There are two complementary ways to mitigate risks when using low-entropy secrets:
Use iterated/repeated hashing to make each "guess" more expensive for the attacker.
Use salts to prevent cost sharing. The attacker shall pay the full dictionary search attack for every single attacked password/SSN.
One way to make hashing more expensive is to hash the concatenation of n copies of the data, with n as big as possible (depending on the computing power of the clients, and, ultimately, the patience of the user). For instance, for (dummy) SSN "123456789", use H(123456789123456789123456789...123456789). You would count n in millions here; on a basic PC, SHA-256 can easily process a hundred megabytes per second.
A salt is a piece of public data which is used along with the data to hash (the SSN), and which is different for each user. A salt need not be secret, but it should not be reused (or at least not often). Since SSNs tend to be permanent (an individual has a unique SSN for his whole life), you can use the user name as the salt (this contrasts with passwords, where a user can change his password, and should use a new salt for every new password). Hence, if user Bob Smith has SSN 123456789, you would end up using: H("Bob Smith 123456789 Bob Smith 123456789 Bob Smith 123456789... Smith 123456789") with enough repetitions to make the process sufficiently slow.
Assuming you can make the user wait for one second (it is difficult to make a user wait for more) on a not-so-new computer, it can be expected that even a determined attacker will have trouble trying more than a few hundred SSN per second. The cost of cracking a single SSN will be counted in weeks, and, thanks to the use of the user name as a salt, the attacker will have no shortcut (e.g. salting defeats precomputed tables, including the much-hyped "rainbow tables").
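A minimal sketch of the repeated-concatenation scheme described above, using SHA-256 from Python's standard library. The separator layout (a trailing space after each copy) and the `slow_ssn_hash` name are assumptions for illustration. The repetition count in the demo is kept small so it runs instantly, whereas the answer suggests counting it in millions.

```python
import hashlib


def slow_ssn_hash(user_name: str, ssn: str, repetitions: int) -> str:
    """Hash n copies of "<user name> <ssn> " fed into one SHA-256 computation.

    The user name acts as the salt. Feeding the copies one by one gives the
    same digest as hashing the full concatenation, without building it in memory.
    """
    unit = f"{user_name} {ssn} ".encode("utf-8")
    hasher = hashlib.sha256()
    for _ in range(repetitions):
        hasher.update(unit)
    return hasher.hexdigest()


# Dummy data from the answer; 1,000 repetitions is far too low for real use.
bob = slow_ssn_hash("Bob Smith", "123456789", 1_000)
alice = slow_ssn_hash("Alice Jones", "123456789", 1_000)
print(bob)
print(bob == alice)  # same SSN, different salt, different hash
```

Because every client derives the salt from the user name and uses the same repetition count, the result stays consistent between clients, which is the requirement stated in the question.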

Is forcing complex passwords "more important" than salting?

I've spent the past 2 hours reading up on salting passwords, making sure that I understood the idea. I was hoping some of you could share your knowledge on my conclusions.
If I'm an attacker, and I gain access to a user database, I could just take all the per-user salts present in the table and use those to create my rainbow tables. For big tables this could take a long time. If I could cut the list down to users of interest (admins, mods) I could use much bigger dictionary lists to create the rainbow tables, raising my percentage of hits...
If this is true then it seems that salting really doesn't do all that much to help. It only marginally slows down an attacker.
I know ideally you would want to force complex passwords and salt them with unique and random strings, but forcing complex passwords can annoy users (I know it annoys me), so a lot of sites don't do it. It seems sites are doing their users a disservice with this, and that forcing complex passwords is a lot more important than a good salting method.
I guess this isn't so much a question as a request for others' knowledge on the situation.

The point of a salt is that an attacker can no longer use a pre-existing dictionary to attack any user in your system. They have to create a brand new dictionary for each user using that user's salt, which takes time and effort. If you learn about a breach before dictionaries are created for all users of your system, you have time to act. (Alert users that their log-in credentials must be changed, which should generate a new random salt.)
I would say that you should use both a salt and the most complex password (pass phrase, really) that your users will tolerate. Even still, salting is a fundamental security measure, and you can't really afford to do without it.

Is keeping properly hydrated more important than breathing?

I tend to favor an approach that uses a salt per user, global salt (salt per algorithm), and modest password complexity rules (8+ characters with some combination of at least 2 uppercase/digit/punctuation characters) for most web sites. Using salts requires the generation of a rainbow table per account you want to break -- assuming unique salts per user. Using a global salt requires that you both compromise the DB and the application server. In my case, these are always two separate systems. Using password complexity rules helps to protect against simple, easy to guess passwords being used.
For accounts with more privileges, you may want to enforce greater password complexity. For example, admins in our AD forest are required to have a minimum 15 character passwords. It's actually easier than shorter passwords because it pretty much forces you to use a pass phrase rather than a password.
You also want to instruct your users in how to create good password, or better yet, pass phrases and to be aware of various social engineering attacks that circumvent all of your technical means of protecting your data.
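The per-user salt plus global salt combination from this answer might be wired together as in the sketch below. The answer does not say how the two salts are combined, so the construction here (HMAC with the global salt, then PBKDF2 with the per-user salt) is an assumption, as are the names and the iteration count. The global salt is meant to live in the application server's configuration, not in the database.

```python
import hashlib
import hmac
import os

# Hypothetical value; in practice it would be loaded from application server
# configuration so that a database dump alone does not reveal it.
GLOBAL_SALT = b"example-global-salt-kept-outside-the-db"

ITERATIONS = 100_000  # assumed value


def hash_password(password: str, user_salt: bytes,
                  global_salt: bytes = GLOBAL_SALT) -> bytes:
    """Combine a global salt and a per-user salt (one possible construction)."""
    # Step 1: key the password with the global salt via HMAC-SHA256.
    keyed = hmac.new(global_salt, password.encode("utf-8"), hashlib.sha256).digest()
    # Step 2: stretch the result with the per-user salt stored in the database.
    return hashlib.pbkdf2_hmac("sha256", keyed, user_salt, ITERATIONS)


user_salt = os.urandom(16)
stored = hash_password("correct horse battery staple", user_salt)

# Same inputs reproduce the stored hash; changing either salt does not.
same = hash_password("correct horse battery staple", user_salt)
other_global = hash_password("correct horse battery staple", user_salt, b"different")
other_user = hash_password("correct horse battery staple", os.urandom(16))
print(same == stored, other_global == stored, other_user == stored)
```

An attacker holding only the database has the per-user salts and hashes but cannot test guesses without the global salt, which is why the answer keeps the two on separate systems.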

Okay, let's look at real figures:
A single Nvidia 9800GTX can calculate 350 million MD5 hashes / second. At this rate, the entire keyspace of 8-character passwords made of lowercase and uppercase alphanumeric characters will be done in about 7 days. 7 characters, under three hours. Applying salting will only double or triple these times depending on your algorithm.
Cheap modern GPUs will easily boast one billion MD5 hashes / second. Determined people typically link up about 6 of these, and get 6 billion / second, rendering the 9 character keyspace obsolete in 26 days.
Note that I'm talking about brute force here, as preimage attacks may or may not apply after this level of complexity.
Now if you want to defend against professional attackers, there is no reason they can't get 1 trillion hashes / second, they'd just use specialized hardware or a farm of some cheap GPU machines, whichever is cheaper.
And boom, your 10 character keyspace is done in 9.7 days, but then 11 character passwords take 602 days. Notice that at this point, adding 10 or 20 special characters to the allowed character list will only bring the cracking time of a 10 character keyspace to 43 or 159 days, respectively.
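These figures follow from dividing the keyspace size by the hash rate. The short sketch below reproduces them, assuming a 62-character alphabet (lowercase, uppercase, digits) and passwords of exactly the stated length; `days_to_exhaust` is a hypothetical helper.

```python
SECONDS_PER_DAY = 86_400


def days_to_exhaust(charset_size: int, length: int, hashes_per_second: float) -> float:
    """Days needed to try every password of exactly `length` characters."""
    return charset_size ** length / hashes_per_second / SECONDS_PER_DAY


scenarios = [
    ("8 chars, 62 symbols, 350 million/s", 62, 8, 350e6),
    ("9 chars, 62 symbols, 6 billion/s", 62, 9, 6e9),
    ("10 chars, 62 symbols, 1 trillion/s", 62, 10, 1e12),
    ("11 chars, 62 symbols, 1 trillion/s", 62, 11, 1e12),
    ("10 chars, 72 symbols, 1 trillion/s", 72, 10, 1e12),
    ("10 chars, 82 symbols, 1 trillion/s", 82, 10, 1e12),
]
for label, charset, length, rate in scenarios:
    print(f"{label}: {days_to_exhaust(charset, length, rate):.1f} days")
```

Each extra character multiplies the time by the alphabet size (62 here), while adding 10 symbols to the alphabet only multiplies the 10-character time by about 4.5, which is the point made above about length versus special characters.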
See, the problem with password hashing is that it only reduces time until your futile doom. If you want something really strong, but still as naive as stored hashed passwords are, go for PBKDF2.
Then there is still one more problem: will the user use this "strong" password you forced him to use on all his other sites? If he doesn't save them in a master password file, he most certainly will. And those other sites won't use the same strong hashing algorithms you use, defeating the purpose of your system. I can't really see why you want your hashes to be super strong if it isn't to stop users from using the same password on multiple sites; if an attacker has access to your hashes, you most likely already lost.
On the other hand, like I will repeat and repeat again to people asking questions about how "secure" their hashing scheme is, just use client certificates, and all your problems are solved. It becomes impossible for users to use the same credentials on multiple sites, attackers cannot break your credentials without modifying them, users cannot easily have their credentials stolen if they store them on a smart card, etc etc.
To naively answer your question: a strong password is only backed by a strong hashing algorithm.

With the sole exception of a requirement for a long string, every constraint reduces the size of password phase space. Constraints cause a decrease in complexity, not an increase.
If this seems counterintuitive, consider that you are providing a bunch of reliable assumptions for the cracker. Let me illustrate this point with a true story from my misspent youth:
In the early days of twin-primes encryption, processors were so slow that people tended to use int32 arithmetic for speed. This allowed me to assume the primes were between 0 and four billion. People always picked large primes because conventional wisdom held that bigger was better. So I used a pre-computed dictionary of primes and worked down from a known ceiling, knowing that people generally chose keys close to that ceiling. This generally allowed me to crack their key in about 30 seconds.
Insist on long pass phrases, and use salting, with no other constraints.
When people say "sophisticated" techniques they often mean complicated. A transformation can be very complicated and yet be commutative, and unfortunately if you don't know what that means then you're not in a position to assess the merits of the technique. Complexity of algorithm lends only security by obscurity, which is a bit like getting a house out of town and not locking the doors.
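The claim that composition constraints shrink the password space can be checked by counting. The sketch below assumes 8-character passwords over the 94 printable ASCII symbols, 10 of which are digits, and a single hypothetical rule ("must contain at least one digit"). It only counts the raw space and says nothing about which passwords people actually choose.

```python
CHARSET_SIZE = 94   # printable ASCII characters, excluding space
DIGITS = 10
LENGTH = 8

# Every possible 8-character password.
unconstrained = CHARSET_SIZE ** LENGTH

# Passwords with no digit at all: these are forbidden by the rule.
without_digit = (CHARSET_SIZE - DIGITS) ** LENGTH

# Passwords the rule still allows.
with_digit_rule = unconstrained - without_digit

excluded_fraction = without_digit / unconstrained
print(f"allowed without rule: {unconstrained:,}")
print(f"allowed with rule:    {with_digit_rule:,}")
print(f"excluded by the rule: {excluded_fraction:.1%}")
```

Under these assumptions the single rule removes roughly two fifths of the candidates an attacker would otherwise have to consider, and the attacker knows that in advance.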

Use sophisticated techniques like salthash to keep your users' private information safe.
But don't obstruct your users. Offer suggestions, but don't get in their way.
It's up to your users to pick good passwords. It's up to you to suggest how to pick good passwords, and to accept any password given and keep the user's information as safe as the password given permits.

Both salted passwords and complex passwords are necessary for real security. Typically rainbow tables aren't created on the fly to attack a specific site, but are rather precomputed. It would be far more efficient to simply brute force a password than to generate a look up table based on a particular salt value.
That being said, the purpose of a hash is to ensure that an attacker can't recover a password if your database is compromised. It does nothing to prevent an attacker from guessing an easy password.
Requiring password complexity is really a matter of the type of site/ type of data you are protecting. It will annoy some users, and cause others to write their password out on a post it and stick it to their monitor. I'd say it is absolutely essential to use a strong hash and salt on your end- neglecting to do so exposes not only your site, but completely compromises every user who recycles username/ password combinations.
So in my opinion salting is mandatory regardless of the security level of your site. Enforced password complexity is good for high security sites - but is definitely more situational. It won't guarantee good security practices on the part of your users. I'll also add that requiring a secure password for a site that doesn't require it can do more harm than good as it is more likely that a user will recycle a high-security password that they use on other more essential sites.
