If I create a 64-bit hash by concatenating a 32-bit hash of the odd bytes of a string with a 32-bit hash of the even bytes of that string, how does that compare to a proper 64-bit hash of the full string?
I have the FNV-1a hash in mind, but it's really a general question about hash strength.
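To make the question concrete, here is a minimal sketch of the two constructions in Python. The function names (`fnv1a_32`, `fnv1a_64`, `split_hash_64`) and the choice to put the odd-position hash in the high half are assumptions made for illustration; the constants are the standard FNV offset bases and primes. One thing the sketch makes visible: two inputs that differ only at even positions always share the 32-bit half computed from the odd positions, which a hash of the full string would not be expected to do.

```python
# Standard FNV-1a parameters (offset basis and prime) for 32 and 64 bits.
FNV32_OFFSET = 0x811C9DC5
FNV32_PRIME = 0x01000193
FNV64_OFFSET = 0xCBF29CE484222325
FNV64_PRIME = 0x100000001B3


def fnv1a_32(data: bytes) -> int:
    """32-bit FNV-1a: XOR each byte in, then multiply by the prime."""
    h = FNV32_OFFSET
    for byte in data:
        h ^= byte
        h = (h * FNV32_PRIME) & 0xFFFFFFFF
    return h


def fnv1a_64(data: bytes) -> int:
    """64-bit FNV-1a over the full input."""
    h = FNV64_OFFSET
    for byte in data:
        h ^= byte
        h = (h * FNV64_PRIME) & 0xFFFFFFFFFFFFFFFF
    return h


def split_hash_64(data: bytes) -> int:
    """Hypothetical construction from the question: hash the bytes at even
    and odd indices separately, then concatenate the two 32-bit results."""
    even_half = fnv1a_32(data[0::2])  # bytes at indices 0, 2, 4, ...
    odd_half = fnv1a_32(data[1::2])   # bytes at indices 1, 3, 5, ...
    return (odd_half << 32) | even_half


a = b"hello world"
b = b"hellp world"  # differs from `a` only at index 4, an even position

print(f"split(a) = {split_hash_64(a):016x}")
print(f"split(b) = {split_hash_64(b):016x}")
print(f"full(a)  = {fnv1a_64(a):016x}")
print(f"full(b)  = {fnv1a_64(b):016x}")
```

In the split construction the high 32 bits of the two results are identical, because the odd-position bytes of `a` and `b` are the same.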
I am developing a project with an Arduino and I want to use a hash function on the data generated by a temperature sensor.
To be more specific, I want to use the SHA-1 hash.
See https://en.wikipedia.org/wiki/SHA-1 and you will notice that an 8-bit controller that stores integers in little-endian order is not the optimal platform for your idea, since SHA-1 operates on 32-bit big-endian words.
The available RAM (2 kB) of an ATmega328 should be sufficient, as long as you do not need too much of it for the raw data.
So, have fun ;)
My main concern is rather the "why".
What's wrong with a CRC checksum or similar, or possibly a private hash algorithm, to ensure data integrity?
I understand that hash functions like MD5 can be used to tell whether two files (or sets of data) are identical or not. Changing even a single bit changes the hash value of a file. Apart from this, can any other information be obtained by comparing two hash values, such as the degree to which the two files differ or the location of the changes? Are there any hash functions that can be used to get this information?
None, if the hash function is cryptographically secure.
If you are presented with two hashes coming from two files, the only thing you can tell is if the files are exactly, bit for bit, identical (same hash) or not.
Two properties of a good hash function are that every bit of the hash depends on multiple bits of the message, and that changing a single bit of the message results in a completely different hash, to the extent that this second hash cannot be distinguished from any other possible hash.
Even with a somewhat vulnerable hash function like MD5, the main thing an attacker could do is create a second document that hashes to the same value (a collision), not infer how closely two documents are related. For that to be possible, the hash function would have to be quite weak.
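The effect described above can be observed directly. The following is a small sketch using Python's standard `hashlib`; the helper names `flip_bit` and `differing_bits` and the sample message are made up for illustration, and SHA-256 is used here only as an example of a cryptographic hash. It flips one bit of a message and counts how many bits of the digest change; for a good hash this is typically around half of them, and the positions of the changed digest bits say nothing about where the message was modified.

```python
import hashlib


def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Return a copy of `data` with exactly one bit inverted."""
    buf = bytearray(data)
    buf[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(buf)


def differing_bits(a: bytes, b: bytes) -> int:
    """Count the bit positions in which two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


original = b"The quick brown fox jumps over the lazy dog"
modified = flip_bit(original, 0)  # the two inputs differ in a single bit

digest_original = hashlib.sha256(original).digest()
digest_modified = hashlib.sha256(modified).digest()

changed = differing_bits(digest_original, digest_modified)
print(f"{changed} of {len(digest_original) * 8} digest bits differ")
```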
I know the string and the algorithm used to generate an MD5 hash value. Is it possible to get the string back from the generated hash?
A hash is, by definition, a one-way value; it is not encrypted and it is not guaranteed to be unique. Mathematically, consider it nearly impossible to get back the original string from the hash.
Exceptions would be:
a vulnerability in the hashing algorithm (this happened with MD5, whose weaknesses make collisions easy to construct, but recovering an input from its hash is still difficult)
brute forcing (guessing) the string until you find a matching hash
using lookup tables of well-known phrases/words, and their associated hash values, eg: https://crackstation.net/
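The lookup-table approach in the last item can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the word list is a tiny made-up sample and the helper names (`md5_hex`, `build_lookup`, `reverse_lookup`) are hypothetical; real services precompute tables over billions of candidate strings. Note that nothing is being decrypted: the table only recognises hashes of strings that were hashed in advance.

```python
import hashlib
from typing import Dict, Iterable, Optional


def md5_hex(text: str) -> str:
    """MD5 digest of a UTF-8 string as 32 hexadecimal digits."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def build_lookup(words: Iterable[str]) -> Dict[str, str]:
    """Precompute a hash -> word table for a list of candidate strings."""
    return {md5_hex(word): word for word in words}


def reverse_lookup(table: Dict[str, str], digest: str) -> Optional[str]:
    """Return a known string with this hash, or None if it was never precomputed."""
    return table.get(digest.lower())


# Tiny illustrative word list (an assumption, not a real dictionary).
WORDS = ["password", "letmein", "123456", "qwerty"]
table = build_lookup(WORDS)

print(reverse_lookup(table, md5_hex("letmein")))       # found: it is in the list
print(reverse_lookup(table, md5_hex("x9!Lq#27vTz")))   # not in the list, so not found
```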
No; hashing is, by definition, a one-way process.
The original string cannot be derived from the hash without brute forcing different strings until one is found that generates an identical hash.
This process can take a very long time, though databases of known hashes exist which can speed up the process.
You should also know that two different strings can have the same hash. This is called a hash collision.
MD5 is a cryptographic hash function. It produces a 128-bit hash value, usually written as a 32-digit hexadecimal number. It is used to verify data integrity.
No, you cannot get the actual value back from the hash value. I think you are looking for an encryption and decryption mechanism.
Does MD5 hash the string or encrypt it? If it hashes it, then it is, as they say, a one-way hash function, and the original string (or data) cannot be recovered from the hash produced, because it is only meant for authentication. Then how can we explain the online websites for MD5 decryption? I actually tried one, and it returned the original string. Here's a site that does this: http://www.md5decrypter.co.uk/
How is this possible?
MD5 is a hash algorithm, meaning that it maps an arbitrary-length string to a string of some fixed length. The intent is to make it so that it is hard to start with the output of an MD5 hash and to recover some particular input that would hash to that output. Because there are infinitely many strings and finitely many outputs, it is not an encryption function, and given just the output it's impossible to determine which input produced that output.
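The counting argument above (infinitely many inputs, finitely many outputs) can be made tangible by shrinking the output space. The sketch below is an illustration only: it truncates MD5 to 16 bits, which is an assumption made so that a collision shows up quickly, and the names `truncated_md5` and `find_collision` and the `input-N` strings are hypothetical. With only 65,536 possible outputs, two different inputs with the same output must appear within at most 65,537 tries.

```python
import hashlib
from typing import Dict, Tuple


def truncated_md5(text: str, bits: int = 16) -> int:
    """Top `bits` bits of the MD5 digest, as an integer."""
    digest = hashlib.md5(text.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") >> (128 - bits)


def find_collision(bits: int = 16) -> Tuple[str, str]:
    """Hash distinct strings until two of them share a truncated hash.

    Termination is guaranteed by the pigeonhole principle: after
    2**bits + 1 distinct inputs, some output must have repeated.
    """
    seen: Dict[int, str] = {}
    counter = 0
    while True:
        candidate = f"input-{counter}"
        h = truncated_md5(candidate, bits)
        if h in seen:
            return seen[h], candidate
        seen[h] = candidate
        counter += 1


first, second = find_collision(16)
print(first, second, hex(truncated_md5(first)))
```

Given only the shared output, there is no way to tell which of the two inputs produced it. The same holds for full 128-bit MD5; collisions are just far harder to stumble upon by chance.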
However, MD5 has many cryptographic weaknesses and has been superseded by a variety of other hash functions (the SHA family). I would strongly suggest not using MD5 if cryptographic security is desired, since there are much better algorithms out there.
Hope this helps!
MD5 is a cryptographic hash function. It maps a variable-length string to a 128-bit hash value. It's one-way, but hashes of common strings can be cracked quickly using rainbow tables. Not to mention, the site you posted says it has
a total of just over 8.7 billion unique decrypted MD5 hashes...
so it can check against those first before it even needs to try to crack it.
They don't "decrypt"; they find a string that matches your hash, which is not the same thing, although when you limit yourself to common English words it could very well be the original.
To understand what's going on, you have to consider the count of possible MD5 hashes, 2^128, which is more than the count of words in English (2^16?) but much less than the count of all possible string values, 2^(number of bits the internet has and then some).
When you map from a smaller set into a bigger one (English -> MD5), it's likely that all values will be different, but the other way around isn't true.
Bottom line: use a password that isn't a string that can be found by Google anywhere on the net.
I create a GUID (as a string) and get the hash of it. Can I consider this hash to be unique?
Not as reliably unique as the GUID itself, no.
Just to expand: you are drastically reducing your uniqueness by going from 16 bytes (2^128 possible values) to 4 bytes (2^32 possible values).
As pointed out in the comments, the hash size makes a difference. The 4-byte figure was an assumption (a horrible one at best, I know) that this would be used in .NET, where the default hash size is 4 bytes (int). So you can replace what I said above with whatever byte size your hash may be.
Nope.
See here, if you want a mini GUID: https://devblogs.microsoft.com/oldnewthing/20080627-00/?p=21823
In a word, no.
Let's assume that your hash has fewer bits than the GUID. By the pigeonhole principle, more than one GUID must map to some hash value, simply because there are fewer hashes than GUIDs.
If we assume that the hash has a larger number of bits than the GUID, there is still a very small, but nonzero, chance of a collision, assuming you're using a good hash function.
No hash function that reduces an arbitrary sized data block to a fixed size number of bits will produce a 1-to-1 mapping between the two. There will always exist a chance of having two different data blocks be reduced to the same sequence of bits in the hash.
Good hash algorithms minimize the likelihood of this happening, and generally, the more bits in the hash, the lower the chance of a collision.
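To put rough numbers on "the more bits, the lower the chance", the standard birthday approximation p ≈ 1 - exp(-n(n-1) / (2 · 2^b)) can be evaluated for n items and a b-bit hash. The sketch below assumes an ideal hash whose outputs are uniformly distributed; the function name `collision_probability` and the sample sizes are made up for illustration.

```python
import math


def collision_probability(n_items: int, hash_bits: int) -> float:
    """Approximate probability that at least two of `n_items` values collide
    under an ideal, uniformly distributed `hash_bits`-bit hash
    (birthday approximation)."""
    space = 2 ** hash_bits
    exponent = -n_items * (n_items - 1) / (2 * space)
    # expm1 keeps precision when the probability is extremely small.
    return -math.expm1(exponent)


for bits in (32, 64, 128):
    for n in (10_000, 77_163, 1_000_000):
        p = collision_probability(n, bits)
        print(f"{bits:>3}-bit hash, {n:>9,} items: p = {p:.3e}")
```

Under these assumptions, a 32-bit hash reaches roughly even odds of at least one collision at about 77,000 items, while a 128-bit value stays at a negligible collision probability for any realistic number of items.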
It's not guaranteed to be, due to hash collisions. The GUID itself is almost guaranteed to be.
For practical reasons you probably can assume that a hash is unique, but why not use the GUID itself?
No, and I wouldn't assume uniqueness of any hash value. That shouldn't matter, because hash values don't need to be unique; they just need to be evenly distributed across their range. The more even the distribution, the fewer collisions occur (in the hashtable). Fewer collisions mean better hashtable performance.
FYI: for a good description of how hash tables work, read the accepted answer to "What are hashtables and hashmaps and their typical use cases?"
If you use a cryptographic hash (MD5, SHA1, RIPEMD160), the hash will be unique in practice (modulo collisions, which are very improbable on random inputs such as GUIDs; SHA1 has been used, e.g., for digital signatures, and MD5 also resists accidental collisions on random inputs, although deliberately constructed collisions are known for both). Though, why do you want to hash a GUID?