Can the same PDF content have different hash values?

Is it possible for the same PDF content to have different hash values?
Database value "Hello World" => convert to PDF => generate hash => save hash to the database
Now, one year later:
Scenario 1:
Take the same PDF => generate hash => compare the generated hash with the hash saved in the database
Will the two hash values be the same or different?
Scenario 2:
Generate a new PDF:
Database value "Hello World" => convert to a new PDF => generate hash => compare the generated hash with the hash saved in the database
Will the two hash values be the same or different?
Is there any possibility that the two hashes will differ in the Corda blockchain?

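The hash of a file is computed over the entire file, byte for byte (including all metadata), not just the visible content.
Hence two files with the same visible content do not necessarily have the same digital fingerprint. In scenario 1 the file is byte-for-byte unchanged, so the hash will be the same. In scenario 2 the hash may well differ, because a newly generated PDF typically embeds metadata such as a creation timestamp. There is no guarantee that two files with the same content produce the same hash.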
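To illustrate both scenarios, here is a minimal sketch in Python. The byte strings are hypothetical stand-ins for PDF files (they are not real PDFs), and SHA-256 is an assumption, since the question does not say which hash function is used.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of the given bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()


# Hypothetical stand-ins for two generated files: the visible content is
# identical, but the embedded "creation date" metadata differs.
pdf_original = b"created=2023-01-01\nHello World"
pdf_regenerated = b"created=2024-01-01\nHello World"

# Scenario 1: hashing the very same bytes again gives the same hash.
same_file_matches = sha256_hex(pdf_original) == sha256_hex(pdf_original)

# Scenario 2: a regenerated file with different metadata gives a different hash.
regenerated_matches = sha256_hex(pdf_original) == sha256_hex(pdf_regenerated)
```

If the goal is to detect changes to the content only, one option is to hash the source value ("Hello World") rather than the generated file.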

Related

How to count collisions in chaining hash (with linked list)?

I have two ".txt" files: one contains a lot of strings to be inserted into a hash table, and the other contains strings to be searched for in that hash table.
I'm writing code that creates the hash table, inserts the strings from the first ".txt" into it using a hash function, and then searches for each of the strings from the second ".txt" in that hash table.
The task is to display the time spent finding all the strings in the hash table (no problem), the number of strings found (no problem) and the collision count (here is the problem).
I'm using chained hashing with linked lists, counting the collisions while inserting elements into the hash table. I found two ways of counting collisions, and both appear to make sense to me.
First one: once I generate the key, check whether that index is NULL; if not, do "collision++" once and insert the element at the end of the linked list.
Second one: once I generate the key, check whether that index is NULL; if not, do "collision++" for each element already in the linked list (that is, while the current position in the list is not NULL).
Which one is more appropriate?
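The two counting schemes can be compared side by side with a small sketch. Everything here is hypothetical: the hash function `simple_hash`, the table size and the sample words are made up for illustration, and Python lists stand in for the linked lists.

```python
def simple_hash(word: str, table_size: int) -> int:
    """Toy hash function (an assumption): sum of character codes modulo table size."""
    return sum(ord(ch) for ch in word) % table_size


def count_collisions(words, table_size):
    """Insert words into a chained table and count collisions both ways."""
    buckets = [[] for _ in range(table_size)]
    per_insert = 0   # first way: +1 whenever the target bucket is non-empty
    per_element = 0  # second way: +1 for every element already in the chain
    for word in words:
        chain = buckets[simple_hash(word, table_size)]
        if chain:
            per_insert += 1
            per_element += len(chain)
        chain.append(word)  # insert at the end of the chain
    return per_insert, per_element


counts = count_collisions(["ab", "ba", "abc", "cab", "bca"], 7)
```

The first counter reports how many insertions landed in an occupied bucket; the second reports how many existing elements those insertions had to walk past, so it grows faster when chains get long.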

Any way to get original data from hashed values in Snowflake?

I have a table which uses the Snowflake hash function to store values in some columns.
Is there any way to reverse the hash function and get the original values from the table?
As per the documentation, the function is "not a cryptographic hash function" and will always return the same result for the same input expression.
Example:
select hash(1) always returns -4730168494964875235
select hash('a') always returns -947125324004678632
select hash('1234') always returns -4035663806895772878
I was wondering if there is any way to reverse the hashing and get the original input expression from the hashed values.
I think these disclaimers are for preventing potential legal disputes:
Cryptographic hash functions have a few properties which this function
does not, for example:
The cryptographic hashing of a value cannot be inverted to find the
original value.
It's not possible to reverse a hash value in general. Consider that even a very long text is represented as a 64-bit value: obviously the data is not preserved. On the other hand, with a brute-force technique you may find an input value that produces the hash, and that can be counted as reversing the hash value.
For example, if you store the hash values for all the numbers between 0 and 5000 in a table and I come to you with the hash value '-7875472545445966613', you can look that value up in your table and say it belongs to the number 1000.
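The lookup-table idea can be sketched as follows. Snowflake's HASH is not available outside Snowflake, so `hash64` below is a stand-in (an assumption: a signed 64-bit value derived from BLAKE2b); it will not produce the same numbers as Snowflake, but the brute-force principle is the same.

```python
import hashlib


def hash64(value) -> int:
    """Stand-in 64-bit hash (NOT Snowflake's HASH): signed integer from BLAKE2b."""
    digest = hashlib.blake2b(repr(value).encode("utf-8"), digest_size=8).digest()
    return int.from_bytes(digest, "big", signed=True)


# Precompute the hash of every candidate input: the numbers 0..5000.
lookup = {hash64(n): n for n in range(0, 5001)}


def reverse_hash(hash_value: int):
    """Return the candidate that produced hash_value, or None if it is not in the table."""
    return lookup.get(hash_value)


recovered = reverse_hash(hash64(1000))
```

This only works when the set of possible inputs is small enough to enumerate; an input outside the precomputed range simply is not found.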

Key, Value, Hash and Hash function for HashTable

I'm having trouble understanding what the Hash Function does and doesn't do, as well as what exactly a Bucket is.
From my understanding:
A HashTable is a data structure that maps keys to values using a Hash Function.
A HashFunction is meant to map data from an array of arbitrary/unknown size to a data array of fixed size.
There can be duplicate Values in the original data array, but this is irrelevant.
Each Value will have a unique Key. Thus, each Key has exactly 1 Value.
The HashFunction will generate a HashCode for each (Value, Key) pair. However, Collisions can occur in which multiple (Value, Key) pairs map to the same HashCode.
This can be remedied by using either Chaining/Open Addressing methods.
The HashCode is the index value indicating the position of a particular entry from the original data array within the Bucket array.
The Bucket array is the fixed data array constructed that will contain the entries from the original array.
My questions:
How are the Keys generated for each value? Is the HashFunction meant to generate both Key and HashCode values for each entry? Does each Bucket thus contain only one entry (assuming a Chaining implementation to remedy Collision)?
How are the Keys generated for each value?
The key is not generated; you provide it, and it serves as the input to the hash function, which in turn converts that key into an index of the hash table. Simply put:
H(key)=index
so the value you are looking for is:
hash_table[index] = value
Is the HashFunction meant to generate HashCode values for each entry?
It all depends on the implementation of the hash function and the hash table. Some hash functions generate a hash code from the provided key and then, for example, take it modulo size, where size is the size of the hash table, in order to get the index. Others convert the key directly into an index. In either case the ultimate goal of the hash function is to find the location of the searched data within the hash table in constant time.
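As a minimal sketch of the "hash code, then modulo size" variant: the table size, the use of CRC-32 as the hash code, and the helper names are all assumptions chosen for illustration, and collisions are deliberately ignored here (a later entry simply overwrites an earlier one at the same index).

```python
import zlib

TABLE_SIZE = 8
hash_table = [None] * TABLE_SIZE  # the bucket array


def hash_code(key: str) -> int:
    """Turn the caller-provided key into an integer hash code (CRC-32 here)."""
    return zlib.crc32(key.encode("utf-8"))


def index_for(key: str) -> int:
    """H(key) = index: reduce the hash code to a valid position in the table."""
    return hash_code(key) % TABLE_SIZE


def put(key: str, value) -> None:
    """Store the (key, value) pair at the computed index, ignoring collisions."""
    hash_table[index_for(key)] = (key, value)


def get(key: str):
    """Look up the value at the computed index, checking the stored key matches."""
    entry = hash_table[index_for(key)]
    if entry is not None and entry[0] == key:
        return entry[1]
    return None


put("apple", 3)
```

Storing the key alongside the value lets a lookup tell a real match apart from a different key that happens to map to the same index.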
Does each Bucket thus contain only one entry (assuming a Chaining implementation to remedy Collision)?
Ideally each key should be mapped to a unique index, but usually that is not the case, since the number of buckets (i.e. indices) is far smaller than the number of possible keys. With chaining, a bucket can therefore hold several entries, and the average chain length per bucket is (number of keys stored) / (number of buckets).
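The average chain length can be checked with a short sketch. The key names, the bucket count and the use of CRC-32 are assumptions for illustration only.

```python
import zlib


def build_chains(keys, num_buckets):
    """Distribute keys over num_buckets chains using CRC-32 modulo num_buckets."""
    buckets = [[] for _ in range(num_buckets)]
    for key in keys:
        buckets[zlib.crc32(key.encode("utf-8")) % num_buckets].append(key)
    return buckets


keys = [f"key{i}" for i in range(20)]  # 20 hypothetical keys
buckets = build_chains(keys, 4)        # only 4 buckets

# Average chain length = number of keys / number of buckets.
average_chain_length = sum(len(chain) for chain in buckets) / len(buckets)
longest_chain = max(len(chain) for chain in buckets)
```

With 20 keys in 4 buckets the average is 5 entries per bucket regardless of how well the hash function spreads the keys; only the lengths of the individual chains depend on the hash function.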

Redis HMSET Documentation: What does it mean by 'hash'?

Redis HMSET command documentation describes it as:
"Sets the specified fields to their respective values in the hash stored at key. This command overwrites any existing fields in the hash. If key does not exist, a new key holding a hash is created."
What does the word 'hash' mean in this case? Does it mean a hash table, or a hash code computed for the given field/value pairs? I would like to think it means the former, i.e. a hash table, but I would still like to clarify, as the documentation is not explicit.
Hash refers to the Redis Hash Data-Type:
Redis Hashes are maps between string fields and string values, so they
are the perfect data type to represent objects (e.g. A User with a
number of fields like name, surname, age, and so forth)
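The behaviour described in the HMSET documentation can be modelled in plain Python as a map from keys to field/value maps. This is only an in-memory model of the semantics, not Redis client code; `store`, `hmset` and the sample user data are hypothetical.

```python
# Model of the Redis keyspace: each key holds a "hash", i.e. a field -> value map.
store = {}


def hmset(key: str, mapping: dict) -> None:
    """Set fields in the hash stored at key, creating the hash if key does not exist."""
    store.setdefault(key, {}).update(mapping)


hmset("user:1", {"name": "Ada", "surname": "Lovelace"})
# Existing fields are overwritten, new fields are added.
hmset("user:1", {"age": "36", "name": "Augusta"})
```

After the second call the hash at "user:1" holds three fields, with "name" overwritten by the newer value.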

Query several hashes from Redis efficiently

I want to keep some object data in Redis and display all the objects in a table.
In SQL I would just get the entire row for every object and display it in a view.
In Redis, I don't want to query each hash separately, since that would be unbearably slow.
Assuming I know the hash keys and the hash names I want to pull, is there a way to do this efficiently?
I'm not sure why you believe querying each hash would be unbearably slow. If you loop through your hash keys and do an HMGET for each with the field names you should be good, provided you pipeline the requests.
Alternatively, you could do this in a Lua script that accepts (some of) the key names as KEYS and the fields as ARGV, returning the answer in whatever format you need.
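The pipelined HMGET loop might look like the sketch below, assuming the redis-py client. The function name `fetch_rows` and the example keys and fields are hypothetical; `client` is expected to be a `redis.Redis` instance (or anything exposing the same `pipeline()` interface).

```python
def fetch_rows(client, hash_keys, field_names):
    """Fetch the given fields of several hashes in a single round trip.

    All HMGET commands are queued on a pipeline and sent together,
    instead of paying one network round trip per hash.
    """
    pipe = client.pipeline(transaction=False)
    for key in hash_keys:
        pipe.hmget(key, field_names)  # queued, not sent yet
    results = pipe.execute()          # one round trip, one result list per key
    return {
        key: dict(zip(field_names, values))
        for key, values in zip(hash_keys, results)
    }


# Hypothetical usage, assuming a local Redis server and the redis-py package:
#   import redis
#   client = redis.Redis(decode_responses=True)
#   rows = fetch_rows(client, ["user:1", "user:2"], ["name", "age"])
```

Fields that do not exist come back as None, so every returned row has the same set of columns, which is convenient for rendering a table.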
Store all the hash keys in a set; let's call it 'hashkeyset'.
Use the 'sort' command to retrieve all the hash values:
sort hashkeyset get *->field0 get *->field1 ... get *->fieldN
You can find more about 'sort' at this link: http://redis.io/commands/sort