MD5 hashing collision in username hashing [duplicate]


This question already has answers here:
What is the clash rate for md5? [closed]
(2 answers)
Closed 9 years ago.
This question does not need any code, it's just a conceptual thing about MD5 hashing.
My app manages a community of users.
I use MD5 hashing to reduce a user nickname of arbitrary length to a hash. I expect the MD5 of every nick to be different, because this MD5(nick) will be kind of my user ID for every user.
Is this always true? I'm sure I'm missing something and there can be collisions in the long term (millions of users === millions of different nicks with different lengths)

MD5 collisions for random data (e.g. usernames) are rare enough that you'd probably never see one by accident. The problem is that MD5's collision resistance has been broken, so an attacker could easily generate a pair of usernames that have the same hash, with whatever security and/or functionality implications that would have for your design.
The usual way to generate a short identifier in your situation is to simply associate each username with a sequentially-generated number in the account database. The application uses the number internally, and only references the username when it needs to display something to a user.
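To put rough numbers on "rare enough", and to illustrate the sequential-ID approach, here is a minimal sketch. The birthday-bound formula p ≈ n² / 2^(bits+1) is a standard approximation for accidental collisions only (it says nothing about deliberate ones). `UserRegistry` is a hypothetical in-memory stand-in for an accounts table with an auto-increment key, not part of any real library.

```python
import itertools


def accidental_collision_probability(n_users: int, hash_bits: int = 128) -> float:
    """Birthday-bound approximation, valid while the result is much less than 1."""
    return n_users ** 2 / 2 ** (hash_bits + 1)


class UserRegistry:
    """Hypothetical stand-in for a database table with a sequential primary key."""

    def __init__(self) -> None:
        self._next_id = itertools.count(1)
        self._id_by_nick = {}

    def register(self, nick: str) -> int:
        # Uniqueness is enforced on the nickname itself, not on a hash of it.
        if nick in self._id_by_nick:
            raise ValueError(f"nickname already taken: {nick!r}")
        user_id = next(self._next_id)
        self._id_by_nick[nick] = user_id
        return user_id


# Ten million random nicknames: the chance of any accidental MD5 collision
# comes out on the order of 1e-25.
p = accidental_collision_probability(10_000_000)
print(f"{p:.2e}")

registry = UserRegistry()
print(registry.register("alice"), registry.register("bob"))
```

The sequential ID cannot collide by construction, which is why it sidesteps both the accidental and the adversarial case.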

Related

Rainbow tables and John the Ripper [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question does not appear to be about programming within the scope defined in the help center.
Closed 2 years ago.
I am working on a uni project and I have to present the tool "John the Ripper" and the usage of "Rainbow tables" with it.
I played around with the different modes of "John the Ripper" and searched the concept of the "Rainbow tables".
The problem is that I cannot understand how these two are connected and how, if possible, can I use my own "Rainbow tables" in the decryption of the password hash?

They solve the same problem, but in opposite directions:
Password-cracking software like JtR dynamically performs hashing of large lists of candidate plaintexts until a plaintext is found that produces a hash that matches the target hash. If no candidate plaintext produces a match, then the original plaintext has not been discovered and the hash has not been "cracked".
Rainbow tables compare a given hash to a large (but finite) list of precomputed hashes. If a matching hash is not already present in the rainbow table, the plaintext cannot be discovered with that table.
This is the classic "time/memory trade-off" concept. Cracking takes more computation power and time, but less storage. Rainbow tables take less computation power and time, but much more storage (often terabytes in size).
And because modern GPUs can attempt billions of unsalted candidate passwords per second, rainbow tables are only more useful than GPU-based attacks in a very specific and constrained set of circumstances:
The length of the password and its composition (whether specific characters are required, etc.) are known in advance, and small enough to be generated in bulk and stored in a rainbow table (often no more than 9 or 10 characters, depending on composition)
The password was probably randomly generated (because most non-randomly-generated candidates, of much greater length and complexity, can be attempted on GPU)
So unless you're a pentester with specific knowledge that a high-value password was randomly generated but is also relatively short (which would be rare in practice), rainbow tables are largely outdated.
It also makes no sense to "build a rainbow table" on the fly for a new target because the speed of using a rainbow table is only achievable after it has been built. You can simply run through the equivalent GPU attack faster ... and still have your 4TB of disk space available for something else.
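The two directions can be sketched in a few lines. This is a toy illustration under stated assumptions: unsalted MD5, a three-word candidate list, and a plain precomputed dictionary standing in for the table. A real rainbow table compresses that dictionary with hash chains and reduction functions, which is not shown here, and the function names are made up for the example.

```python
import hashlib


def md5_hex(text: str) -> str:
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def crack_by_hashing(target_hash, candidates):
    """Cracker-style: hash each candidate on the fly until one matches."""
    for candidate in candidates:
        if md5_hex(candidate) == target_hash:
            return candidate
    return None  # not cracked with this candidate list


def build_lookup_table(candidates):
    """Table-style: pay the hashing cost once, up front, and store the results."""
    return {md5_hex(candidate): candidate for candidate in candidates}


wordlist = ["letmein", "password", "hunter2"]
target = md5_hex("hunter2")

# Compute-heavy, storage-light.
print(crack_by_hashing(target, wordlist))

# Storage-heavy, lookup is a single dictionary access.
table = build_lookup_table(wordlist)
print(table.get(target))

# Either way, a plaintext outside the candidate set is never found.
print(table.get(md5_hex("not-in-the-list")))
```

Both approaches recover only plaintexts that were in the candidate set; the difference is whether the hashing cost is paid per attack or once in advance.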

Hashing functions on passwords [duplicate]

This question already has answers here:
How long to brute force a salted SHA-512 hash? (salt provided)
(3 answers)
protect hash code? [closed]
(2 answers)
Closed 3 years ago.
I have a question regarding the purpose of hashing passwords. I understand that hash functions are a one-way pseudo-random algorithm that turns a string into a seemingly random n-bit string (depending on the hash). Sure, this means that they cannot be reversed to find the original string and the passwords do not need to be stored as plain text in a database. But if the hashed passwords were to be obtained or leaked in any way, what is stopping someone from running the same hash function over candidate passwords to crack them? There are a number of hash generators online for MD5, SHA-1 and SHA-256 that anyone can (potentially) use to brute-force or dictionary attack a list of hashed passwords if they wanted to.
Maybe the idea is that one does not know what hashing function was used to generate the hashes? But even then the length of the hash itself could give it away. Maybe it's because hashes take a while to compute? But couldn't online generators speed up the process by mapping lists of words to certain hashes?
Any help or understanding would be greatly appreciated!

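Salting the hash prevents the use of rainbow tables and other precomputed lookups, and it forces an attacker to brute-force each hash separately instead of attacking the whole list at once.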
From Wikipedia:
In cryptography, a salt is random data that is used as an additional
input to a one-way function that "hashes" data, a password or
passphrase. Salts are used to safeguard passwords in storage.
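A minimal sketch of why a salt defeats a precomputed table, assuming a single round of SHA-256 purely for illustration (a fast hash like this is not a recommendation for real password storage). The `precomputed` dictionary is a stand-in for an attacker's table of unsalted hashes, and `hash_password` is a hypothetical helper.

```python
import hashlib
import os


def hash_password(password: str, salt: bytes) -> str:
    # The salt is stored next to the hash; it does not need to be secret.
    return hashlib.sha256(salt + password.encode("utf-8")).hexdigest()


# Attacker's table, built in advance from common passwords with no salt.
precomputed = {
    hashlib.sha256(word.encode("utf-8")).hexdigest(): word
    for word in ["password", "letmein"]
}

unsalted = hash_password("password", b"")
salt = os.urandom(16)  # fresh random salt for this one password
salted = hash_password("password", salt)

print(unsalted in precomputed)  # the unsalted hash is found in the table
print(salted in precomputed)    # the salted hash is not
```

Verification still works because the stored salt is fed back in: `hash_password("password", salt)` reproduces `salted`.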

How to decrypt SHA-512 hashed data [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 6 years ago.
Update:
SHA-512 is one-way, so I will not attempt to crack passwords; it is easier to simply reset a password. If someone knows a way to get the original password from SHA-512 hashed text, please let me know. Otherwise, I am moving on from this question. Thank you for all of your answers.
Original Question:
I have read a lot of articles that state that SHA-512 hashing cannot be unhashed. However, there is a source code for the various SHA-1 + algorithms here: https://tls.mbed.org/sha-512-source-code.
I would like to know if it is possible to reverse this coding, in a way, to decrypt SHA-512 hashed text. Linux encrypts their passwords with SHA-512 hashing. As a systems administrator, I would prefer to simply decrypt or unhash this information as needed, rather than guessing whether a password is correct or incorrect and see if the hash matches. Creating new passwords can cause a lot of extra time and money. If you do not feel comfortable publishing this information and would like to discuss it privately, feel free to request my contact information.
Thank you!

Why do you not believe what you have read?
Cryptographic hash functions can not be reversed.
Thought experiment: You have 200 bytes you pass to SHA512, out come 64 bytes. Something has been lost. How do you regain what is lost?
In a similar manner, if you have an integer, say 123, and take it mod 10, the result is 3. Now reverse that: it could have been any of 3, 13, 23, 33, 123, 9343453, or any other number ending in 3.
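The same thought experiment in code, as a small sketch: many inputs share one output, so the output alone cannot identify the input. The helper name `inputs_with_remainder` is made up for this example.

```python
import hashlib


def inputs_with_remainder(remainder: int, limit: int):
    """All non-negative integers below `limit` that give `remainder` mod 10."""
    return [n for n in range(limit) if n % 10 == remainder]


# Every one of these maps to 3; nothing in "3" says which one you started from.
print(inputs_with_remainder(3, 50))

# SHA-512 behaves the same way at a larger scale: any input, 64 bytes out.
print(len(hashlib.sha512(b"x" * 200).digest()))
print(len(hashlib.sha512(b"x" * 2_000_000).digest()))
```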

I have read a lot of articles that state that SHA-512 hashing cannot be unhashed.
Yes. That is the definition of "hash". This has nothing to do with SHA-512. The definition of a hash function is that it cannot be reversed. Period. If it can be reversed, it's not a hash.
I would like to know if it is possible to reverse this coding, in a way, to decrypt SHA-512 hashed text.
No, you can't decrypt it, because it isn't encrypted, it's hashed.
Linux encrypts their passwords with SHA-512 hashing.
No, it doesn't. It hashes them, it doesn't encrypt them.
As a systems administrator, I would prefer to simply decrypt or unhash this information as needed, rather than guessing whether a password is correct or incorrect and see if the hash matches. Creating new passwords can cause a lot of extra time and money. If you do not feel comfortable publishing this information and would like to discuss it privately, feel free to request my contact information.
As a systems administrator, if you don't understand the difference between encryption and hashing, please tell me where you work, so that I never ever accidentally become a customer of yours! The Pigeonhole Principle is so simple and obvious that it can be understood by a child.

The articles you have read are correct.
However, if, for example, a user picks a dictionary word and you aren't salting your hashes, then those hashes are open to dictionary attacks. That is why no one worth their salt (pun intended) would use a hash algorithm without a salt.
Frankly I find it unlikely a systems administrator would need to get a password, as generally they have impersonation rights.

Using an ASIC to brute force MD5

Is it possible to use an Application Specific Integrated Circuit (ASIC) to brute force MD5 hashes and thus reverse them down to their original form? I know there could be multiple collisions, but leaving that aside, would it be possible? The idea interests me because I happen to have ASIC Miner Block Erupters which are ASIC's used to generate the SHA-256 hash, but why not MD5?
Thanks in advance.

This is a very old question, but while working with a client and working to convince them that they couldn't use MD5 to hash passwords and needed to upgrade to something more secure, this post came up in the discussion.
While the accepted answer is technically correct, one doesn't have to calculate all possible MD5 hashes to break a password; one only has to rotate strings and positions in a methodical fashion to land on actual passwords. If we assume 8 characters in length and the common rule of uppercase, lowercase, and digits at minimum, that's only 218 trillion combinations.
Within the narrow confines of the answer, yes, it is completely impractical to brute force md5 collisions, but it is absolutely feasible to throw random smaller data sets at MD5 records and see what matches you get. Put simply, to calculate every possible MD5 for a set of passwords 5 characters in length containing letters, numbers and special characters might take two hours at 1 Mh/s.
I did that exact thing using a MacBook and some hastily written code for the aforementioned client. Within the span of the 45 minutes it took to explain the problem, and for them to point to this answer as a reason that they didn't need to bother, I had already gotten almost a thousand of the horrifyingly insecure passwords stored in their database.
Long story short, I just don't want people reading this answer and thinking that passwords hashed using MD5 are impossible to crack.
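The figures above can be checked with a few lines of arithmetic. This sketch assumes "letters, numbers and special characters" means the 94 printable ASCII characters excluding space, and takes the 1 MH/s rate from the text; `keyspace` is a hypothetical helper.

```python
import string

ALPHANUMERIC = string.ascii_letters + string.digits   # 62 symbols
PRINTABLE = ALPHANUMERIC + string.punctuation         # 94 symbols (assumption: no space)


def keyspace(alphabet_size: int, length: int) -> int:
    """Number of distinct strings of exactly `length` symbols."""
    return alphabet_size ** length


# 8 characters of upper, lower and digits: roughly 218 trillion candidates.
print(f"{keyspace(len(ALPHANUMERIC), 8):,}")

# 5 printable characters at one million hashes per second: about two hours.
hours = keyspace(len(PRINTABLE), 5) / 1_000_000 / 3600
print(f"{hours:.2f} hours")
```

Both numbers are tiny next to the 2^128 possible MD5 outputs, which is the point being made: the attacker enumerates plausible passwords, not hashes.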

A brute force attack is futile as there are 2^128 MD5 hashes. If you could compute 10^18 (that's a billion times a billion) hashes per second it would still take billions of years to find a single collision (unless you are extraordinarily lucky). Terahashes per second is not nearly enough. 2^128 hashes at 1 terahash per second is on the order of 10^26 seconds, which is about 10^19 years.
MD5 is broken, but broken does not imply "feasible to brute force", only "feasible to attack in some manner (probably more sophisticated than brute force)".
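A quick check of the orders of magnitude quoted above, as a sketch. The rates are the ones named in the text, the year length (365.25 days) is an assumption, and `years_to_enumerate` is a made-up helper name.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # assumption: Julian year


def years_to_enumerate(hash_bits: int, hashes_per_second: float) -> float:
    """Time to try every possible output of a `hash_bits`-bit hash once."""
    return 2 ** hash_bits / hashes_per_second / SECONDS_PER_YEAR


# One terahash per second: on the order of 1e19 years.
print(f"{years_to_enumerate(128, 1e12):.2e}")

# 10^18 hashes per second: still on the order of 1e13 years.
print(f"{years_to_enumerate(128, 1e18):.2e}")
```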

Salting Your Password: Best Practices? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 2 years ago.
I've always been curious... Which is better when salting a password for hashing: prefix, or postfix? Why? Or does it matter, so long as you salt?
To explain: We all (hopefully) know by now that we should salt a password before we hash it for storage in the database [Edit: So you can avoid things like what happened to Jeff Atwood recently]. Typically this is done by concatenating the salt with the password before passing it through the hashing algorithm. But the examples vary... Some examples prepend the salt before the password. Some examples add the salt after the password. I've even seen some that try to put the salt in the middle.
So which is the better method, and why? Is there a method that decreases the chance of a hash collision? My Googling hasn't turned up a decent analysis on the subject.
Edit: Great answers folks! I'm sorry I could only pick one answer. :)

Prefix or suffix is irrelevant, it's only about adding some entropy and length to the password.
You should consider those three things:
The salt has to be different for every password you store. (This is quite a common misunderstanding.)
Use a cryptographically secure random number generator.
Choose a long enough salt. Think about the birthday problem.
There's an excellent answer by Dave Sherohman to another question on why you should use randomly generated salts instead of a user's name (or other personal data). If you follow those suggestions, it really doesn't matter where you put your salt.

I think it's all semantics. Putting it before or after doesn't matter except against a very specific threat model.
The fact that it's there is supposed to defeat rainbow tables.
The threat model I alluded to would be the scenario where the adversary can have rainbow tables of common salts appended/prepended to the password. (Say the NSA) You're guessing they either have it appended or prepended but not both. That's silly, and it's a poor guess.
It'd be better to assume that they have the capacity to store these rainbow tables, but not, say, tables with strange salts interspersed in the middle of the password. In that narrow case, I would conjecture that interspersed would be best.
Like I said. It's semantics. Pick a different salt per password, a long salt, and include odd characters in it like symbols and ASCII codes: ©¤¡

The real answer, which nobody seems to have touched upon, is that both are wrong. If you are implementing your own crypto, no matter how trivial a part you think you're doing, you are going to make mistakes.
HMAC is a better approach, but even then if you're using something like SHA-1, you've already picked an algorithm which is unsuitable for password hashing due to its design for speed. Use something like bcrypt or possibly scrypt and take the problem out of your hands entirely.
Oh, and don't even think about comparing the resulting hashes for equality with your programming language's or database's string comparison utilities. Those compare character by character and short-circuit as false as soon as a character differs. So now attackers can use statistical timing methods to try to work out what the hash is, a character at a time.
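In Python, for example, the standard library provides `hmac.compare_digest` for exactly this purpose. The sketch below assumes the digests are hex strings; the SHA-256 calls and the literal salt are placeholders to produce something to compare, not a suggested password scheme, and `digests_match` is a hypothetical wrapper.

```python
import hashlib
import hmac


def digests_match(stored_hex: str, candidate_hex: str) -> bool:
    # compare_digest takes time independent of where the inputs first differ,
    # unlike ==, which can stop at the first mismatching character.
    return hmac.compare_digest(
        stored_hex.encode("ascii"), candidate_hex.encode("ascii")
    )


# Placeholder digests, for demonstration only.
stored = hashlib.sha256(b"example-salt" + b"hunter2").hexdigest()
good = hashlib.sha256(b"example-salt" + b"hunter2").hexdigest()
bad = hashlib.sha256(b"example-salt" + b"hunter3").hexdigest()

print(digests_match(stored, good))
print(digests_match(stored, bad))
```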

It shouldn't make any difference. The hash will be no more easily guessable wherever you put the salt. Hash collisions are both rare and unpredictable, by virtue of being intentionally non-linear. If it made a difference to the security, that would suggest a problem with the hashing, not the salting.

If using a cryptographically secure hash, it shouldn't matter whether you pre- or postfix; a point of hashing is that a single bit change in the source data (no matter where) should produce a different hash.
What is important, though, is using long salts, generating them with a proper cryptographic PRNG, and having per-user salts. Storing the per-user salts in your database is not a security issue; using a single site-wide salt is.
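A short sketch of the per-user point, assuming Python's `secrets` module as the cryptographic random source and a single SHA-256 round purely to keep the example small. `salted_digest` and its `position` parameter are hypothetical names introduced for illustration.

```python
import hashlib
import secrets


def salted_digest(password: str, salt: bytes, position: str = "prefix") -> str:
    pw = password.encode("utf-8")
    data = salt + pw if position == "prefix" else pw + salt
    return hashlib.sha256(data).hexdigest()


site_salt = secrets.token_bytes(16)   # one salt shared by everyone
alice_salt = secrets.token_bytes(16)  # one salt per user
bob_salt = secrets.token_bytes(16)

# Shared salt: two users with the same password get the same hash, so
# cracking one cracks both, and one table covers the whole site.
shared_equal = salted_digest("hunter2", site_salt) == salted_digest("hunter2", site_salt)

# Per-user salts: the same password yields unrelated hashes.
per_user_equal = salted_digest("hunter2", alice_salt) == salted_digest("hunter2", bob_salt)

# Prefix and suffix give different digests, but neither is easier to guess.
prefix = salted_digest("hunter2", alice_salt, "prefix")
suffix = salted_digest("hunter2", alice_salt, "suffix")

print(shared_equal, per_user_equal, prefix != suffix)
```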

First of all, the term "rainbow table" is consistently misused. A "rainbow" table is just a particular kind of lookup table, one that allows a particular kind of data compression on the keys. By trading computation for space, a lookup table that would take 1000 TB can be compressed a thousand times so that it can be stored on a smaller drive.
You should be worried about hash to password lookup tables, rainbow or otherwise.
@onebyone.livejournal.com:
The attacker has 'rainbow tables' consisting not of the hashes of dictionary words, but of the state of the hash computation just before finalising the hash calculation.
It could then be cheaper to brute-force a password file entry with postfix salt than prefix salt: for each dictionary word in turn you would load the state, add the salt bytes into the hash, and then finalise it. With prefixed salt there would be nothing in common between the calculations for each dictionary word.
For a simple hash function that scans linearly through the input string, such as a simple linear congruential generator, this is a practical attack. But a cryptographically secure hash function is deliberately designed to have multiple rounds, each of which uses all the bits of the input string, so that computing the internal state just prior to the addition of the salt is not meaningful after the first round. For example, SHA-1 has 80 rounds.
Moreover, password hashing algorithms like PBKDF2 apply their hash function many times (it is recommended to iterate PBKDF2 a minimum of 1000 times, each iteration applying SHA-1 twice), making this attack doubly impractical.
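For reference, Python's standard library exposes PBKDF2 as `hashlib.pbkdf2_hmac`. The sketch below is illustrative only: the choice of SHA-256 and the iteration count of 100,000 are assumptions made for this example (the 1000 quoted above is a historical minimum, and current guidance should be checked before picking a value), and `derive_key` is a hypothetical helper.

```python
import hashlib
import secrets

ITERATIONS = 100_000  # assumption for illustration; tune to your hardware and guidance


def derive_key(password: str, salt: bytes, iterations: int = ITERATIONS) -> bytes:
    # PBKDF2 takes the salt as its own parameter, so the prefix-versus-suffix
    # question never comes up in application code.
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)


salt = secrets.token_bytes(16)
key = derive_key("hunter2", salt)

print(len(key))                                             # digest size of SHA-256
print(derive_key("hunter2", salt) == key)                   # same inputs, same key
print(derive_key("hunter2", secrets.token_bytes(16)) == key)  # new salt, new key
```

Each guess now costs the attacker `ITERATIONS` HMAC computations instead of one hash, which is the slowdown the paragraph above describes.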

Use a BCrypt hash if the platform has a provider. I love how you don't have to worry about creating the salts, and you can make the hashes even stronger if you want.

Inserting the salt an arbitrary number of characters into the password is the least expected case, and therefore the most "secure" socially, but it's really not very significant in the general case as long as you're using long, unique-per-password strings for salts.
