How to extract salt from sha512 digest? - hash

I hashed (SHA-512) the password string "hello" using the salt string "world" and saved the resulting string in a file.
hex: 2b83319d3e78544e4430c4f5621968fee8b6ffa1254678b2c6fb98f7f79ff16afee2da909a7bb741488ca3bacbbf6cec8fd226c5a52eef805ea65a352e2ece8e
base64: K4MxnT54VE5EMMT1Yhlo/ui2/6ElRniyxvuY9/ef8Wr+4tqQmnu3QUiMo7rLv2zsj9ImxaUu74Beplo1Li7Ojg==
Now in my program I have the above encoded value of the salted "hello" and the fresh password string "hello". I have to hash "hello" again using the same salt and compare the output. Is it possible to extract the salt from the above output?

You cannot retrieve the "salt" from a hash. A hash function is a one-way function that cannot be reversed (only brute-forced).
Since you're using SHA-512 and the output is 512 bits long (64 bytes, or 128 hex characters), there is simply no room where something like a salt could be stored. When you create hashes using additional data such as a salt, you need to either store it yourself or use a function that produces a string that encodes such additional data into the output.
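As a quick check of the sizes involved, the following Python sketch decodes the two strings from the question using only the standard library. The variable names are illustrative.

```python
import base64

# The two encodings posted in the question
hex_digest = "2b83319d3e78544e4430c4f5621968fee8b6ffa1254678b2c6fb98f7f79ff16afee2da909a7bb741488ca3bacbbf6cec8fd226c5a52eef805ea65a352e2ece8e"
b64_digest = "K4MxnT54VE5EMMT1Yhlo/ui2/6ElRniyxvuY9/ef8Wr+4tqQmnu3QUiMo7rLv2zsj9ImxaUu74Beplo1Li7Ojg=="

raw_from_hex = bytes.fromhex(hex_digest)
raw_from_b64 = base64.b64decode(b64_digest)

# Both strings decode to the same 64 bytes (512 bits): exactly one
# SHA-512 output, with no extra bytes that could hold a salt.
print(len(raw_from_hex), len(raw_from_b64), raw_from_hex == raw_from_b64)
```

The hex and base64 forms are two spellings of the same digest, so neither of them carries the salt.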
If you're hashing passwords or other easily brute-forceable data, use many iterations of such a hash function, because only one iteration is not enough. It is common to use PBKDF2, bcrypt or scrypt for these use cases.
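A minimal sketch of the "encode the salt into the output" approach, using PBKDF2 from Python's standard library. The record layout (`scheme$iterations$salt$hash`), the function names and the iteration count are assumptions made for this example, not a standard format.

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000  # assumed work factor for illustration; tune for your hardware

def hash_password(password, salt=None, iterations=ITERATIONS):
    """Return a self-describing record that carries the salt with the hash."""
    if salt is None:
        salt = os.urandom(16)  # fresh random salt for each password
    derived = hashlib.pbkdf2_hmac("sha512", password.encode("utf-8"), salt, iterations)
    return f"pbkdf2_sha512${iterations}${salt.hex()}${derived.hex()}"

def verify_password(password, stored):
    """Re-derive the hash with the stored salt and compare in constant time."""
    _scheme, iterations, salt_hex, expected_hex = stored.split("$")
    derived = hashlib.pbkdf2_hmac(
        "sha512", password.encode("utf-8"), bytes.fromhex(salt_hex), int(iterations)
    )
    return hmac.compare_digest(derived, bytes.fromhex(expected_hex))

record = hash_password("hello")
print(verify_password("hello", record))  # correct password
print(verify_password("world", record))  # wrong password
```

Because the salt travels inside the stored string, the verifier never has to extract anything from the hash itself.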

The salt, when concatenated with the user's password, in essence becomes part of the password. There is the part of this compound password you know (the salt) and the part the user knows. There is no known way to identify even a single bit of the password or the salt from the output of the better hash functions. They are, for any outside party, supposed to be indistinguishable from randomness and are not reversible.
So if you have a good salt and did not store it, you will never find the concatenated string that became the compound input to the hash. Or at least not without brute-forcing it, which would take nearly forever.

Related

Can we find the hashing type of the password when it is stored in the system?

Isn't the password hashing type stored with the hash?
Otherwise how would the system verify the password without knowing the hash type?
Yes, the system must recognize the hash type.
Either there is only one hash type (and the code working with the hashes implicitly assumes the hash type), or else there is a mix of hash types (and the hash type may still be stored in the code, or else stored with the hash in some way).
But the system doesn't need to expose this information to the user. In the case of Facebook, their password storage approach is public enough that Alec Muffett gave a public recorded talk about it. Many other systems do not disclose their password hashing methodology.
Sometimes, the hashing methodology can be deduced, most famously in the case of descrypt (which truncates at eight characters) and bcrypt (which truncates at 72 bytes). In both cases, if a password that is longer than the truncation length and agrees with the real one only up to that length is still accepted, the password hash being used can be inferred.

How is correctness of the input checked when hash has a salt in it? (Different hashes - the same input)

By adding salt to the hash – an element of randomness that will make the same input create a different hash, but still usable to check the correctness of the input when entered again.
And how is that?
How is it possible to check the correctness of the input when the same input creates a different hash (a hash with salt)?
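One way to see it: the salt is generated once, stored next to the hash, and fed back in at verification time. A minimal sketch, assuming the salt is simply prepended to the password before a single SHA-512 pass (kept simple for illustration; the function names are hypothetical, and a real system would use a slow password hash):

```python
import hashlib
import hmac
import os

def make_record(password):
    """Pick a random salt and keep it together with the resulting digest."""
    salt = os.urandom(16)
    digest = hashlib.sha512(salt + password.encode("utf-8")).digest()
    return salt, digest

def check(password, record):
    """Recompute the digest with the *stored* salt and compare."""
    salt, digest = record
    candidate = hashlib.sha512(salt + password.encode("utf-8")).digest()
    return hmac.compare_digest(candidate, digest)

first = make_record("hello")
second = make_record("hello")

# Same input, different random salts, so the stored digests differ ...
print(first[1] != second[1])
# ... yet each record still verifies, because its own salt is reused.
print(check("hello", first), check("hello", second))
```

The randomness only enters when the record is created; every later check is deterministic because it reads the salt back from storage.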

Identifying password similarity without storing in plain text?

One of my SaaS software vendors requires me to change passwords every 90 days, which is good.
What surprises me though, is that the password change screen errors with a note that my new password is too similar to an old password.
This most often happens if I change less than three or four of the characters within a password.
If it were an exact match to an old password, I would have confidence that they are hashing my password, and comparing the hashes. The "similarity" matching makes me think they are storing and comparing the plaintext versions.
Is it possible to determine "similarity" by comparing one hash to another, or is this vendor more likely storing my password in plain-text?
It's possible. Whenever you change the password, the software could create hash codes for all combinations of the same password with a few characters masked or removed.
If your password is hello, it could create hash codes for _ello, h_llo, he_lo, hel_o, hell_, __llo, _e_lo, _ell_, he_l_, he__o, etc.
The next time you change the password, it can create the same set of combinations of that password, and compare to all the previous hash codes. If there is a match, only a few characters were changed.
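The masking idea can be sketched as follows. This assumes at most two masked characters, an underscore as the mask symbol, and plain SHA-256 for brevity; the function names are made up for the example, and a real system would need a salted, slow hash and should weigh how much these extra hashes leak.

```python
import hashlib
from itertools import combinations

def masked_variants(password, max_masked=2):
    """All versions of the password with 1..max_masked characters replaced by '_'."""
    variants = set()
    for count in range(1, max_masked + 1):
        for positions in combinations(range(len(password)), count):
            chars = list(password)
            for i in positions:
                chars[i] = "_"
            variants.add("".join(chars))
    return variants

def variant_hashes(password, max_masked=2):
    """Hash every masked variant; only these hashes would be stored."""
    return {
        hashlib.sha256(v.encode("utf-8")).hexdigest()
        for v in masked_variants(password, max_masked)
    }

stored = variant_hashes("hello")               # saved at the previous change
print(bool(stored & variant_hashes("hallo")))  # one character changed
print(bool(stored & variant_hashes("world")))  # mostly different
```

Any overlap between the two hash sets means the new password differs from the old one in at most `max_masked` positions.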
It's a lot simpler to just save the passwords in plain text, of course.
This depends whether they are checking all old passwords, or just your last one.
The last one will be available in memory if you had to enter your old password in order to set a new one. A form usually asks for three inputs: old password, new password and confirm new password.
If they are storing your last few passwords in hashed form, they would be able to check these for an exact match, and they could check your previous password for similarity by running a comparison algorithm on the old password that you just re-entered.
In all likelihood they are storing the plain text. With a good hashing algorithm there should be no correlation between the original content and the hash value (that is what makes it good).
It is possible they are storing some characteristics of the original password to use as reference. For example the counts of characters, any numeric value, etc., and then comparing to that but I doubt it.
One way to do this is by reducing the space of the password.
For example, if you think that "Hello" and "h3LL0" are similar, then you can make a reduce() function that changes the string to uppercase and changes all vowels and digits to #. Both "Hello" and "h3LL0" will be reduced to "H#LL#".
In the database you need to store hash() of the current password and hash(reduce()) of the current and all previous passwords.
You can design any policy of similarity you want, as long as you can make a suitable reduce() function.
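A sketch of such a reduce() function, following the rule described above (uppercase everything, then turn vowels and digits into #). The helper `reduced_hash` and the choice of SHA-256 are assumptions for illustration only.

```python
import hashlib

def reduce(password):
    """Uppercase the password, then replace every vowel and digit with '#'."""
    return "".join(
        "#" if ch in "AEIOU0123456789" else ch
        for ch in password.upper()
    )

def reduced_hash(password):
    """Hash of the reduced form; this is what would be kept for old passwords."""
    return hashlib.sha256(reduce(password).encode("utf-8")).hexdigest()

print(reduce("Hello"), reduce("h3LL0"))
print(reduced_hash("Hello") == reduced_hash("h3LL0"))
```

Two passwords count as "similar" exactly when their reduced forms, and therefore their reduced hashes, are equal.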

What kind of encrypted data is this?

A friend of mine asked this, and I was thinking of asking it here too.
"What kind of data is this, and how is it encrypted or decrypted?"
My friend told me he got this from facebook.
d9ca6435295fcd89e85bd56c2fd51ccc
It looks like it could be an md5 hash.
Basically a hash is a one-way function. The idea is that you take some input data and run it through the algorithm to create a value (such as the string above) that has a low probability of collisions (i.e., two input values hashing to the same string).
You cannot decrypt a hash because there is not enough information in the resultant string to go back. However, it may be possible for someone to figure out your input values if you use a 'weak' hashing algorithm and do not apply proper techniques such as salting the hash.
I don't know how Facebook uses hashes, but a common use for a hash might be to uniquely identify a page. For example, if you had a private image on a page, you might ask to generate a link to the image that you can email to friends. That link might use a hash as part of the URL since the value can be computed quickly, is reasonably unique, and has a low probability of a third party figuring it out.
This is actually a large topic that I am by no means doing justice to. I suggest googling around for hash, md5, etc. to learn more, if you are so inclined.
It is a sequence of 128 bits, encoded as a lower-case hex string.
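A small sketch of that observation: count the bits in a hex string and see which common digest size it is consistent with. The lookup table and function name are assumptions for this example, and a matching length is only a hint, never proof of which algorithm (if any) produced the value.

```python
import string

# Output sizes of a few common hash functions, in bits
DIGEST_BITS = {128: "MD5", 160: "SHA-1", 256: "SHA-256", 512: "SHA-512"}

def describe_hex(value):
    """Return (bit_length, candidate_hash_name) or None if not a valid hex string."""
    if not value or len(value) % 2 or any(c not in string.hexdigits for c in value):
        return None
    bits = len(value) * 4  # each hex character encodes 4 bits
    return bits, DIGEST_BITS.get(bits)

print(describe_hex("d9ca6435295fcd89e85bd56c2fd51ccc"))
```

For the string in the question this reports 128 bits, which fits MD5 but equally fits any other 128-bit value.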
If you are talking about a Facebook API key, there is no deeper meaning to decode from the bits. The keys are created at random by Facebook and assigned to a particular application to identify it. Each application gets a different set of random bits for its API key.
This appears to be the hexadecimal representation of:
- a 16-byte encryption block, or
- some 128-bit hash code, or even
- just some plain random / identifying number.
(Hexadecimal? Note how there are only the digits 0 through 9 and the letters a through f.)
While the MD5 Hash guess suggested by others is quite plausible, it could be just about anything...
If it is a hash or an identifying / randomly assigned number, its meaning is external to the code itself.
For example it could be a key to be used to locate records in a database, or a value to be compared with the result of the hash function applied to the user supplied password etc.
If it is an encrypted value, its meaning (the decrypted value) is directly found within the code, but it could be just about anything. Also, assuming it was produced with a modern encryption algorithm, it could take a phenomenal amount of effort to crack the code (if that is possible at all).

Am I misunderstanding what a hash salt is?

I am working on adding hash digest generating functionality to our code base. I wanted to use a String as a hash salt so that a pre-known key/passphrase could be prepended to whatever it was that needed to be hashed. Am I misunderstanding this concept?
A salt is a random element which is added to the input of a cryptographic function, with the goal of impacting the processing and output in a distinct way upon each invocation. The salt, as opposed to a "key", is not meant to be confidential.
One century ago, cryptographic methods for encryption or authentication were "secret". Then, with the advent of computers, people realized that keeping a method completely secret was difficult, because this meant keeping software itself confidential. Something which is regularly written to a disk, or incarnated as some dedicated hardware, has trouble being kept confidential. So the researchers split the "method" into two distinct concepts: the algorithm (which is public and becomes software and hardware) and the key (a parameter to the algorithm, present in volatile RAM only during processing). The key concentrates the secret and is pure data. When the key is stored in the brain of a human being, it is often called a "password" because humans are better at memorizing words than bits.
Then the key itself was split later on. It turned out that, for proper cryptographic security, we needed two things: a confidential parameter, and a variable parameter. Basically, reusing the same key for distinct usages tends to create trouble; it often leaks information. In some cases (especially stream ciphers, but also for hashing passwords), it leaks too much and leads to successful attacks. So there is often a need for variability, something which changes every time the cryptographic method runs. Now the good part is that most of the time, variability and secret need not be merged. That is, we can separate the confidential from the variable. So the key was split into:
- the secret key, often called "the key";
- a variable element, usually chosen at random, and called "salt" or "IV" (for "Initialization Vector", sometimes "Initial Value") depending on the algorithm type.
Only the key needs to be secret. The variable element needs to be known by all involved parties but it can be public. This is a blessing because sharing a secret key is difficult; systems used to distribute such a secret would find it expensive to accommodate a variable part which changes every time the algorithm runs.
In the context of storing hashed passwords, the explanation above becomes the following:
"Reusing the key" means that two users happen to choose the same password. If passwords are simply hashed, then both users will get the same hash value, and this will show. Here is the leakage.
Similarly, without a salt, an attacker could use precomputed tables for fast lookup; he could also attack thousands of passwords in parallel. This still uses the same leak, only in a way which demonstrates why this leak is bad.
Salting means adding some variable data to the hash function input. That variable data is the salt. The point of the salt is that two distinct users should use, as much as possible, distinct salts. But password verifiers need to be able to recompute the same hash from the password, hence they must have access to the salt.
Since the salt must be accessible to verifiers but need not be secret, it is customary to store the salt value along with the hash value. For instance, on a Linux system, I may use this command:
openssl passwd -1 -salt "zap" "blah"
This computes a hashed password, with the hash function MD5, suitable for usage in the /etc/passwd or /etc/shadow file, for the password "blah" and the salt "zap" (here, I choose the salt explicitly, but under practical conditions it should be selected randomly). The output is then:
$1$zap$t3KZajBWMA7dVxwut6y921
in which the dollar signs serve as separators. The initial "1" identifies the hashing method (MD5). The salt is in there, in cleartext notation. The last part is the hash function output.
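To make the layout concrete, here is a short Python sketch that splits such a string into its parts. The function name and the returned keys are made up for the example, and it only handles the simple three-field form shown above; other schemes add further fields.

```python
def parse_crypt(entry):
    """Split a '$scheme$salt$hash' string into its three fields."""
    # The string starts with '$', so the first piece of the split is empty.
    _, scheme, salt, digest = entry.split("$")
    return {"scheme": scheme, "salt": salt, "hash": digest}

fields = parse_crypt("$1$zap$t3KZajBWMA7dVxwut6y921")
print(fields)
```

A verifier reads the scheme and salt from the stored string, hashes the candidate password with them, and compares the result with the last field.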
There is a specification (somewhere) on how the salt and password are sent as input to the hash function (at least in the glibc source code, possibly elsewhere).
Edit: in a "login-and-password" user authentication system, the "login" could act as a passable salt (two distinct users will have distinct logins) but this does not capture the situation of a given user changing his password (whether the new password is identical to an older password will leak).
You are understanding the concept perfectly. Just make sure the prepended salt is repeatable each and every time.
If I'm understanding you correctly, it sounds like you've got it right. The pseudocode for the process looks something like:
string saltedValue = plainTextValue + saltString;
// or: string saltedValue = saltString + plainTextValue;
Hash(saltedValue);
The salt just adds another level of complexity for people trying to get at your information.
And it's even better if the salt is different for each hashed phrase, since each salt requires its own rainbow table.
It's worth mentioning that even though the salt should be different for each password usage, your salt should in NO WAY be computed FROM the password itself! Doing so defeats the purpose of the salt, because identical passwords would again produce identical hashes.