Storing Argon2 hash in database - hash

This is how I am using Argon2:
import argon2

# Configure the hasher once; it generates a fresh random salt on every hash() call.
step1 = argon2.PasswordHasher(time_cost=16, memory_cost=2**15, parallelism=2,
                              hash_len=32, salt_len=16, encoding='utf-8')
step2 = step1.hash('password1')
print(step2)
# $argon2id$v=19$m=32768,t=16,p=2$vruz5GwPq3vNO9SOlf1O4w$ahmCvQcgB+MqUrWdYGLbLB4G7ZOGP5bgcYxaDM/AaLo
I store the output obtained in step2 as a single unit in one column with character set utf8mb4 and collation utf8mb4_unicode_520_ci.
I have no separate column for the salt, since the hash string already contains it.
Is this a proper way to store an Argon2 hash?
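
To illustrate why no separate salt column is needed, here is a minimal sketch that splits the string from step2 into its parts using only the standard library. The function name parse_argon2_hash and the dictionary keys are made up for this illustration; they are not part of the argon2 package.

```python
import base64


def parse_argon2_hash(encoded):
    """Split a "$"-delimited Argon2 string into its components (illustrative helper)."""
    # The leading "$" yields an empty first field, which is discarded.
    _, variant, version, params, salt_b64, digest_b64 = encoded.split("$")
    # Parameters look like "m=32768,t=16,p=2".
    costs = {key: int(value)
             for key, value in (item.split("=") for item in params.split(","))}

    def unpadded_b64decode(text):
        # The encoded string omits base64 padding, so restore it before decoding.
        return base64.b64decode(text + "=" * (-len(text) % 4))

    return {
        "variant": variant,
        "version": int(version.split("=")[1]),
        "memory_kib": costs["m"],
        "time_cost": costs["t"],
        "parallelism": costs["p"],
        "salt": unpadded_b64decode(salt_b64),
        "digest": unpadded_b64decode(digest_b64),
    }


stored = ("$argon2id$v=19$m=32768,t=16,p=2$"
          "vruz5GwPq3vNO9SOlf1O4w$ahmCvQcgB+MqUrWdYGLbLB4G7ZOGP5bgcYxaDM/AaLo")
fields = parse_argon2_hash(stored)
print(len(fields["salt"]), len(fields["digest"]))  # prints: 16 32
```

In practice you would not parse the string yourself. The same PasswordHasher object reads the salt and parameters back out of the stored string when you call step1.verify(stored, 'password1').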

Related

Any way to get original data from hashed values in Snowflake?

I have a table which uses the snowflake hash function to store values in some columns.
Is there any way to reverse the hash function and get the original values from the table?
As per the documentation, the function is "not a cryptographic hash function" and will always return the same result for the same input expression.
Example :
select hash(1) always returns -4730168494964875235
select hash('a') always returns -947125324004678632
select hash('1234') always returns -4035663806895772878
I was wondering if there is any way to reverse the hashing and get the original input expression from the hashed values.
I think these disclaimers are for preventing potential legal disputes:
Cryptographic hash functions have a few properties which this function
does not, for example:
The cryptographic hashing of a value cannot be inverted to find the
original value.
It's not possible to reverse a hash value in general. When even a very long text is represented as a 64-bit value, the original data obviously cannot be preserved. On the other hand, with a brute-force technique you may find a value that produces the hash, and that can be counted as reversing the hash value.
For example, if you store the hash values of all numbers between 0 and 5000 in a table, then when I come with the hash value '-7875472545445966613', you can look it up in your table and say it belongs to the number 1000.
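
The lookup-table idea can be sketched roughly like this. Note that stand_in_hash is a hypothetical 64-bit hash built from blake2b purely for illustration. It is NOT Snowflake's HASH function and will not reproduce the values shown in the question. In Snowflake you would fill the table with the output of hash() itself.

```python
import hashlib


def stand_in_hash(value):
    """Hypothetical signed 64-bit hash; NOT Snowflake's HASH function."""
    digest = hashlib.blake2b(str(value).encode("utf-8"), digest_size=8).digest()
    return int.from_bytes(digest, "big", signed=True)


# Precompute hash -> input for every candidate value (0..5000).
lookup = {stand_in_hash(n): n for n in range(5001)}


def reverse_by_lookup(hash_value):
    """Return the candidate input that produced hash_value, or None if unknown."""
    return lookup.get(hash_value)


print(reverse_by_lookup(stand_in_hash(1000)))  # 1000
```

This only works when the set of possible inputs is small enough to enumerate. For arbitrary strings the candidate space is far too large to precompute.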

MD5/SHA Field Dataset in Data Fusion

I need to concatenate a few string values in order to obtain the SHA256 hash of the resulting string. I've seen Data Fusion has a plugin to do the job:
The documentation however is very poor and nothing I've tried seems to work. I created a table in BQ with the string fields I need to concatenate, but the output is the same as the input. Can anyone provide an example of how to use this plugin?
EDIT
Below I present the example,
This is what the workflow looks like:
For the testing purposes, I added one column with the following string:
2022-01-01T00:00:00+01:00
And here's the output:
You can use Wrangler to concatenate the string values.
I tried your scenario adding Wrangler to the Pipeline:
Joining 2 Columns:
I named the column new_col, using , as delimiter:
Output:
What you described can be achieved by 2 Wranglers:
The first Wrangler will be what @angela-b described. Use the merge directive to create a new column with the concatenation of two columns. Example directive that joins columns a and b using , as the delimiter and stores the result in column a_b:
merge a b a_b ,
The second Wrangler will use the hash directive which will hash the column in place using a specified algorithm. Example of a directive that hashes column a_b using MD5:
hash :a_b 'MD5' true
Remember to set the last parameter encode to true so that you get a string output instead of a byte array.
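
If you want to sanity-check the pipeline output outside Data Fusion, the two steps can be approximated in plain Python. This is only a sketch: merge_and_hash is a made-up helper, and it assumes UTF-8 input and a hex-encoded digest, which may not match Wrangler's encoding exactly.

```python
import hashlib


def merge_and_hash(left, right, delimiter=",", algorithm="md5"):
    """Join two values with a delimiter, then hash the merged string (hex output)."""
    merged = delimiter.join([left, right])  # same idea as: merge a b a_b ,
    return hashlib.new(algorithm, merged.encode("utf-8")).hexdigest()


# Hypothetical sample values for columns a and b.
print(merge_and_hash("x", "y"))                      # MD5 of "x,y"
print(merge_and_hash("x", "y", algorithm="sha256"))  # SHA-256 of "x,y"
```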

SAME content of PDF have DIFFERENT hash values

Is it possible for the same PDF content to have different hash values?
Database value "Hello World" => convert into PDF => generate hash => save hash into database
Now, after one year:
Scenario 1:
Generate hash from the same PDF => compare the generated hash with the hash value saved in the database
Will both hash values be the same or different?
Scenario 2:
Generate a new PDF again
Database value "Hello World" => convert into new PDF => generate hash => compare the generated hash with the hash value saved in the database
Will both hash values be the same or different?
Is there any possibility that the two hashes will be different in the Corda blockchain?
The hash of a file is generated from the entire file (including all metadata), not just the visible content. The same unchanged file always produces the same hash.
Hence two files having the same visible content do not necessarily have the same digital fingerprint. There is no guarantee that a newly generated file with the same content will produce the same hash.
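
A small sketch of the effect, assuming SHA-256 as the hash. The byte strings below are simplified stand-ins, not real PDF files, and the creation dates are invented for illustration.

```python
import hashlib


def sha256_hex(data):
    """Hash the full byte content of a file, metadata included."""
    return hashlib.sha256(data).hexdigest()


content = b"Hello World"
# Stand-in "files": same visible content, different (hypothetical) creation dates.
first_pdf = b"%PDF-1.4\n/CreationDate (D:20230101000000)\n" + content
second_pdf = b"%PDF-1.4\n/CreationDate (D:20240101000000)\n" + content

print(sha256_hex(first_pdf) == sha256_hex(first_pdf))   # True: same bytes (scenario 1)
print(sha256_hex(first_pdf) == sha256_hex(second_pdf))  # False: metadata differs (scenario 2)
```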

Password hashing and salting

I'm trying to wrap my head around the logic of hashing passwords with MD5/SHA combined with salting.
I understand the concept of a user providing a text password, appending a random string (salt) to the text password, and hashing the final string with whatever hashing method you want.
This is where I lose the concept.
Say in my database of users, I have usernames, and hashed passwords generated with the random salt value.
When the user goes to log into a system, and they enter their password, how do I obtain the correct salt to check the password validity?
If the salt is randomly generated to begin with, I can't recalculate it.
Do I have to store the salt with the username/password record? If I query the database for the salt value by username, it would seem that defeats the purpose of having the salting.
How do I obtain the correct salt when it comes time to validate the supplied password?
From Wikipedia, Salt (cryptography)
A new salt is randomly generated for each password. In a typical setting, the salt and the password are concatenated and processed with a cryptographic hash function, and the resulting output (but not the original password) is stored with the salt in a database
You store it with the hash, to prevent dictionary attacks.
The salt is stored in the database, so you can use the same salt to verify the password. Today's libraries often include the salt in the resulting hash value like this (result of the PHP function password_hash()):
$2y$10$nOUIs5kJ7naTuTFkBy1veuK0kSxUFXfuaOKdOKf9xYT0KKIGSJwFa
 |  |  |                     |
 |  |  |                     hash-value = K0kSxUFXfuaOKdOKf9xYT0KKIGSJwFa
 |  |  |
 |  |  salt = nOUIs5kJ7naTuTFkBy1veu (22 characters)
 |  |
 |  cost-factor = 10 = 2^10 iterations
 |
 hash-algorithm = 2y = BCrypt
This 60 character string can be stored into a single field in the database. The verifying function can extract the salt from this string. The salt is not a secret, it fulfills its purpose even when it is known.
Please note that algorithms like MD5 and SHA-* are not appropriate to hash passwords, because they are too fast. Instead use an algorithm with a cost factor like BCrypt or PBKDF2. For more information you can have a look at my tutorial about safely storing passwords.
Yes, you store the salt. Salting is used to prevent pregenerated rainbow tables, it is not required to be secret, just unpredictable.
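
To make the login flow concrete, here is a rough sketch using PBKDF2 from the Python standard library. The function names and the iteration count are assumptions chosen for illustration, not a recommendation. In a real system a maintained password-hashing library is the safer choice.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # illustrative cost factor (assumption); tune for your hardware


def hash_password(password, salt=None):
    """Return (salt, digest); a fresh random salt is generated when none is given."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password, stored_salt, stored_digest):
    """Re-hash the supplied password with the stored salt and compare."""
    _, candidate = hash_password(password, stored_salt)
    return hmac.compare_digest(candidate, stored_digest)


# Registration: store both values in the user's record.
salt, digest = hash_password("correct horse")
# Login: read the salt back from the record and reuse it.
print(verify_password("correct horse", salt, digest))  # True
print(verify_password("wrong guess", salt, digest))    # False
```

The salt is read from the same record as the digest, which is exactly why it has to be stored. It does not need to be secret.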

Hashing of timestamp

I need a hash function (maybe I should not call it a "hash" function) that:
1. is used for hashing timestamps only;
2. has a reverse function, so that I can restore the timestamp from the hash value;
3. does not generate duplicate hash values;
4. whether or not it is a hash function, is nearly as fast as a hash function.
PS: About the data type of the timestamp: imagine it as a 4-byte "long" type in C.
Is that possible?
(I need the timestamp to be a secret. In fact, I need the hash value as a session id and the original timestamp as an index in my database. Whenever a user requests something with the session id, I can recover the timestamp and use it as an index to get the request info.)
If you can skip #2 MurmurHash might be a good option:
https://sites.google.com/site/murmurhash/
If you do need #2, so that you must encrypt/decrypt, there are standard implementations of the various algorithms for most languages (AES, for instance). This will be much slower than hashing.
If you don't actually need this to secure the data (which begs the question: why bother at all with any conversion?) and just want to make some non-timestamp-looking string that is easily reversible (by you -- and anyone else) then check this question:
Rot13 for numbers
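
For that last, non-secure case, one possible sketch is a simple bijection on 32-bit integers: multiply by an odd constant modulo 2^32 and XOR with a mask. The constants and function names below are arbitrary choices for illustration. Anyone who knows them can reverse the mapping, so this is obfuscation only and does not keep the timestamp secret.

```python
MASK32 = 0xFFFFFFFF
MULTIPLIER = 0x9E3779B1   # arbitrary odd constant, so it is invertible mod 2**32
XOR_MASK = 0x5BF03635     # arbitrary mask (assumption)
INVERSE = pow(MULTIPLIER, -1, 1 << 32)  # modular inverse; needs Python 3.8+


def scramble(timestamp):
    """Map a 32-bit timestamp to a different-looking 32-bit value (one-to-one)."""
    return ((timestamp * MULTIPLIER) & MASK32) ^ XOR_MASK


def unscramble(value):
    """Exactly undo scramble()."""
    return ((value ^ XOR_MASK) * INVERSE) & MASK32


ts = 1_700_000_000
token = scramble(ts)
print(unscramble(token) == ts)  # True
```

Because multiplication by an odd number is invertible modulo 2^32, two different timestamps never map to the same value, which covers requirements 2 to 4 but not secrecy.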