I have a database of integers that are similar in that they all share the same first three digits:
7537463746
7536735325
7538236775
7538273826
...
Each one is associated with a user, and they are all semi-public: they are sent as a form of peer discovery, but not shared directly. I don't want the bare integers to be accessible, so I thought about hashing them with a one-way hash function like MD5.
Since the output is not reversible, unlike that of an encryption or compression algorithm, it looks great. But there's a problem: obtaining the integer database is easy and inevitable, so looping through the integers, hashing each one, and comparing the results to the hashes sent through peer communication would be trivial for a malicious user.
The schema is something like this:
user1[hash(integer1),hash(integer2)...] -> |server hash database| ->
↓
↓
hash(integer1) = user8
hash(integer2) = user40
A malicious user will obtain user1's integer data by social engineering or other means, and hash all of it to see whether the values are in the database by adding them to his own peer data.
Now, is there a hashing algorithm that avoids this situation? I need the peers to communicate without giving out their integer data, while both still map the same integer to the same unique hash. Alternatively, is key signing the only solution? I would like to avoid it since it would make the whole system slower.
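To make the threat concrete, here is a minimal Python sketch of the enumeration attack described in the question. It assumes the identifiers are 10-digit integers starting with 753 and that peers exchange unsalted MD5 hex digests; the function names and the narrowed search range are illustrative only.

```python
import hashlib
from typing import Iterable, Optional


def md5_hex(value: int) -> str:
    """Unsalted MD5 of the decimal representation, as in the question."""
    return hashlib.md5(str(value).encode("ascii")).hexdigest()


def recover_integer(target_hash: str, candidates: Iterable[int]) -> Optional[int]:
    """Hash every candidate and return the one matching target_hash, if any."""
    for candidate in candidates:
        if md5_hex(candidate) == target_hash:
            return candidate
    return None


# The full space for the prefix 753 is only 10**7 values
# (7530000000 .. 7539999999); a narrower range keeps the demo fast.
leaked = md5_hex(7538273826)
found = recover_integer(leaked, range(7538270000, 7538280000))
print(found)  # 7538273826
```

With a fixed three-digit prefix, the whole candidate space is small enough to hash exhaustively, which is why an unkeyed hash alone does not hide the integers.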
Have you thought about salting your MD5? That means mixing in some sort of secret key that only your application knows. This is always good practice. So rather than doing this...
md5($userId)
You would append the "salt" inside of the MD5 like this...
md5($userId . 'this is a secret shhh!')
Now they can't get the integer from the MD5 unless they also learn the secret.
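A hedged Python sketch of the same idea, with HMAC-SHA256 swapped in for the md5-plus-suffix construction above, since HMAC is the usual way to key a hash. The secret value and the `keyed_id` helper are assumptions for illustration, and this only helps if the hashing happens where the secret lives (for example on the server), not on untrusted peers.

```python
import hashlib
import hmac

# Assumption: a secret known only to the server, loaded from configuration.
SERVER_SECRET = b"replace-with-a-long-random-secret"


def keyed_id(user_integer: int, secret: bytes = SERVER_SECRET) -> str:
    """HMAC-SHA256 of the integer; without the secret it cannot be recomputed."""
    message = str(user_integer).encode("ascii")
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


token = keyed_id(7537463746)
# Same input and secret give the same token, so the server can still match peers.
print(token == keyed_id(7537463746))
# A different secret yields an unrelated token, so enumeration without it fails.
print(token == keyed_id(7537463746, secret=b"attacker-guess"))
```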
Consider this code:
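const crypto = require("crypto");

const hashPassword = function (plainText) {
  // createHmac(algorithm, key): the secret belongs in the key argument.
  return crypto
    .createHmac("sha256", process.env.Secret_hash_Password)
    .update(plainText)
    .digest("hex");
};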
As you may have noticed, this is a simple hashing function using crypto.
Now consider this code excerpt:
bcrypt.compare(password, user.password, (err, isMatch) => {....})
As you may have noticed, this is a simple hash comparison function using bcryptjs.
As I believe everyone will agree, the second is the more secure.
Now consider the problem:
I have a key to store in Mongo, and this key is sensitive information, so I have decided to hash it so that no one can recover it. This key is used for Mongo searches; it is information that only the user has, a sort of password.
Solution: use the first code. Although you cannot reverse it, hashing the same input always gives the same result.
Problem: my solution uses a technique that is well known to be easily attacked: someone who somehow gained access to the server just needs to try several inputs, and once they get the same output, they've got it! This is a well-known flaw of my solution.
Desired solution: use the second code with mongo.
Discussion: I could simply fetch the whole collection with find({}) and apply, say, forEach and bcrypt.compare. Nonetheless, I know from my studies that Mongo is optimized for search, e.g. it uses indexes. It would be nice to be able to pass bcrypt.compare as a custom function to the Mongo search engine.
It was suggested: "Increase the bcrypt salt rounds." I cannot use a salt, since that would change the stored key, and the hash would differ whenever I need to compare. bcrypt.compare exists to overcome that, but mongo/mongoose queries do not have such an internal engine.
What I have in my head, in pseudocode:
Model.findOne({bcrypt.compare (internalID, internalID')}) //return when true
Where bcrypt.compare(internalID, internalID') would be a sort of callback function: on each search, mongo would call this function with internalID', the internalID currently under comparison, and return the document for which it produces true.
Any suggestion, comment, or anything?
PS. I am using mongoose.
From what I understand, you don't ever want anyone to know the patient ids (they should not be discoverable from real-life patient ids), not even the database admin (and of course hackers).
I think your design is a bit messed up.
Firstly, indexes use a B-tree data structure for faster lookup, so you have to provide the exact string for the lookup, and given your requirement that ids be stored as salted, irreversible hashes, indexes won't work. So you'll have to iterate over every patient id for that doctor and compare until you get a true result, which is pretty compute-intensive and frankly bad design.
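To illustrate the exact-match point, here is a small Python sketch (not mongoose code). A plain dict stands in for an indexed collection, HMAC-SHA256 stands in for a deterministic keyed hash, and PBKDF2 with a random salt stands in for bcrypt; `INDEX_KEY` and both helper names are hypothetical.

```python
import hashlib
import hmac
import os

INDEX_KEY = b"hypothetical-application-secret"  # assumption: kept outside the database


def blind_index(internal_id: str) -> str:
    """Deterministic keyed hash: equal ids always map to the same stored value."""
    return hmac.new(INDEX_KEY, internal_id.encode("utf-8"), hashlib.sha256).hexdigest()


def salted_hash(internal_id: str) -> bytes:
    """Random-salt hash (PBKDF2 standing in for bcrypt): differs on every call."""
    salt = os.urandom(16)
    return salt + hashlib.pbkdf2_hmac("sha256", internal_id.encode("utf-8"), salt, 1000)


# A dict plays the role of an indexed collection keyed by the stored hash.
collection = {blind_index("patient-42"): {"name": "example record"}}

# Exact-match lookup works because the query hash equals the stored hash.
print(collection.get(blind_index("patient-42")))
# Two salted hashes of the same id do not match, so they cannot serve as a lookup key.
print(salted_hash("patient-42") == salted_hash("patient-42"))
```

The trade-off is that the deterministic variant is only as safe as the key: anyone holding `INDEX_KEY` can enumerate guesses, which is the flaw the question already points out for the HMAC approach.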
There are multiple ways to approach this problem, depending on your level of trust and paranoia.
I think using cryptojs is the correct solution. Now you have to add some randomness to the key. Basically you hash the id with cryptojs, but instead of supplying the key yourself, you could take the secret key from the doctor and then hash every id with that key. You will then have to unhash and re-hash every patient id every time the doctor changes the secret key (using some sort of message queue).
You could also hash the secret key entered by the doctor before saving it, and will then have to unhash (twice!) every time the doctor wants to look up by patientId.
Depending on the number of users you expect your application to serve, if the number is low enough, my solution would work. But with too many users, you'd have to increase compute resources, and should probably invest in some security measures instead of my overkill solution. Why would you be losing the secret key to hackers anyway?
Good luck.
I need to recompute the hash of private data to prove the integrity of the data. When private data collections are used, the private data is stored in SideDBs and the hash of the data on the ledger, according to the documentation. Basically the question splits into two subquestions:
How to access the hash of the private data?
Which method to use to recompute the hash that is saved on the ledger?
Thanks in advance.
I use Hyperledger Fabric v1.4.2 with private data. I followed the marbles example.
I expect to be able to calculate the private data hash and verify that it corresponds to the hash saved in the ledger.
To get the SHA256 hash (using the Fabric 1.4.x contract API) use:
let pdHashBytes = await ctx.stub.getPrivateDataHash(collectionA, readKey);
let actual_hash = pdHashBytes.toString('hex');
You can calculate the hash of the private data that was written, here on Ubuntu, as shown below
echo -n "{\"name\":\"Joe\",\"quantity\":999}" |shasum -a 256
and verify they match. So those are the mechanics of reading the private data hash and verifying it. Now let's add information about salting, as mentioned elsewhere in this post.
For most uses of private data, you'll most likely use a random salt so the private data cannot be brute force attacked in the permissioned blockchain network (between agreed parties). The salt is passed along in the same transient field as the private data. And (later on), it will need to be included with the private data itself, when recalculating the private data hash. See https://hyperledger-fabric.readthedocs.io/en/release-1.4/private-data-arch.html#protecting-private-data-content
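A rough Python equivalent of the shasum check, plus a salted variant. It assumes the ledger hash is SHA-256 over the exact bytes of the stored value and that the salt is carried as an extra field of the JSON document; the field name `salt` and the helper name are made up for illustration.

```python
import hashlib
import json
import secrets


def private_data_hash(value: bytes) -> str:
    """SHA-256 hex digest of the exact bytes that were stored as private data."""
    return hashlib.sha256(value).hexdigest()


# Same bytes as the `echo -n ... | shasum -a 256` example.
plain = b'{"name":"Joe","quantity":999}'
print(private_data_hash(plain))

# Salted variant: a random value stored with the data makes guessing infeasible.
record = {"name": "Joe", "quantity": 999, "salt": secrets.token_hex(16)}
salted = json.dumps(record, separators=(",", ":")).encode("utf-8")
print(private_data_hash(salted))
```

Whoever recomputes the hash later needs the same bytes, salt included, so serialization details such as key order and whitespace have to match what was originally written.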
Don't use it; private data is a security hole.
It amazes me that nobody has mentioned this before, so I guess I had better point it out now before more damage is done.
The logic behind private data is simple: it puts data in a local embedded data store and puts a hash of that data on the blockchain.
The issue is that a cryptographic hash is not an encryption mechanism: the same data hashed by anyone using the same hashing algorithm (which is also highly standardized) will always produce the same hash! This is exactly what hash functions are designed for, and that’s why we use hashes in digital signatures to allow anyone to validate signed data.
However, this also means anyone can “decrypt” the data behind the hash by using dictionary attack.
Hashing is cheap: the cost of each hash on a normal laptop CPU core is about 3 microseconds, so I can create 1 billion candidate hashes within one hour on a single laptop CPU core and compare them to the hashes on the Hyperledger Fabric DLT.
And I am just talking about using a single core on my laptop, not even 50% of its power.
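The arithmetic behind that estimate, as a tiny Python sketch. The 3 microsecond figure is the one quoted above; the 10-digit phone number space is my own assumption for a second data point.

```python
def brute_force_seconds(candidates: int, microseconds_per_hash: int = 3) -> float:
    """Time to hash every candidate on one core at the quoted per-hash cost."""
    return candidates * microseconds_per_hash / 1_000_000


# One billion candidates at 3 microseconds each: 3000 s, i.e. 50 minutes.
print(brute_force_seconds(10**9) / 60)
# Every possible 10-digit phone number: 30000 s, a little over 8 hours.
print(brute_force_seconds(10**10) / 3600)
```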
Why is it dangerous? Because if an attacker is connected to a blockchain system, the attacker knows the range of the data being hashed (e.g. trade ID, item name, bank name, address, cell phone number), so they can easily mount a dictionary attack to recover the true data behind the hash.
How about adding salt to each piece of data to be hashed? Well, that’s one thing Hyperledger Fabric didn’t do.
In their defense, Hyperledger didn’t implement salt because it is difficult to pass salts to counterparties. You can’t use the DLT to pass the salt value because attackers would see it, so you have to create another P2P connection with the counterparty. If you need to create a connection with every counterparty, what’s the point of using blockchain in the first place?
It’s just scary that so many people are using this security hole.
Is it possible to take an original hash value, and then decode it back to the original string?
hash('sha256', $login_password_for_login) gets me a hash, as shown below, but I'd like to go from the hash value back to the original string.
With $login_password_for_login = 12345, the hash function gives me this:
5994471abb01112afcc18159f6cc74b4f511b99806da59b3caf5a9c173cacfc5
I'd like to be able to retrieve the original number or string that I had for the login password. How do I reverse the hash and get that original string?
You don't 'decrypt' the hashes because hashing is not encryption.
As for undoing the hash function to get the original string, there is no way to go from hash to original item, as hashing is a one-direction action. You can take an item and get a hash, but you can't take the hash and get the original item.
Note that hashes should NOT be confused with encryption; encryption is a different process where you can take an item, encrypt it with some type of key (either a preshared symmetric key or an asymmetric key pair, as with PGP), and then later decrypt it. Hashes do not work that way.
In comments, you indicate that you're trying to save a passcode in the database. The problem is, you don't want someone who can breach the DB to immediately be able to decrypt passcodes, which is why hashing is so attractive.
The idea, then, is that you would use salted hashes, storing the salt on a per-user basis in the DB as its own field, and storing the salted hash of the original password string alongside it.
Then, to verify that a password was entered properly, get the salt from the DB, take the user's input for the password, and compute the salted hash of that input using the salt from the DB. Compare the resulting hash to the salted hash stored in the DB. If they match, you have a validated password; if they don't match, it's invalid.
This way, no decryption of passwords is readily doable, which means that in a data breach of your site the passwords cannot easily be retrieved. (This doesn't rule out someone breaching your database, copying the data, and trying to brute-force the passwords, but depending on what you enforce for password complexity and the effort a hacker is actually willing to put in to get credentials, this is less likely to happen.)
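Since no language was specified, here is a minimal Python sketch of the store-and-verify flow just described. It uses PBKDF2-HMAC-SHA256 from the standard library as the salted hash, which is my choice rather than anything stated in the question; the iteration count and function names are assumptions.

```python
import hashlib
import hmac
import os
from typing import Tuple

ITERATIONS = 100_000  # assumption: tune to your hardware


def hash_password(password: str) -> Tuple[bytes, bytes]:
    """Return (salt, salted_hash) to store in the user's database record."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password: str, salt: bytes, stored: bytes) -> bool:
    """Re-hash the input with the stored salt and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return hmac.compare_digest(candidate, stored)


salt, stored = hash_password("12345")
print(verify_password("12345", salt, stored))  # True
print(verify_password("54321", salt, stored))  # False
```

Because each user gets a fresh random salt, two users with the same password end up with different stored hashes.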
I'd write an example of this in a language I understand, but as you don't define what language you're working with, it's not going to be possible for me to write a useful example for you here.
That said, if you're working with PHP, you may find this document on crackstation.net about doing secure salted password hashing properly useful; there are already PHP implementations that do this properly, so you wouldn't have to write your own code, supposedly.
Hashes cannot be decrypted, as they are not encryption.
Although the output of a hash function often looks similar to that of an encryption function, hashing is actually an extremely lossy form of data compression. When I say "extremely lossy", I mean "all of the original data is destroyed in order to get a fixed length." Since none of your original data remains, you cannot decrypt a hash.
That being said, hashes can be used to emulate encryption. When a person registers, you make a tuple containing the hashes of their username and password. Then, when somebody tries to log in, you compare the hashes like this*:
import hashlib
from login_info import logins  # This is an array containing the tuples.


def hasher(string: str) -> str:
    # hashlib only accepts bytes, so encode the string first.
    return hashlib.sha256(string.encode("utf-8")).hexdigest()


def login(username: str, password: str) -> int:  # Returns 0 if login correct, else 1.
    user = hasher(username)
    pwd = hasher(password)  # "pass" is a reserved word in Python
    for stored_user, stored_pwd in logins:
        if stored_user == user:
            # Username found: the outcome depends only on the password hash.
            return 0 if stored_pwd == pwd else 1
    # Only fail after every entry has been checked.
    return 1
* Nota Bene: I am using Python 3 for the example, as my PHP and Javascript are a little out of practice.
EDIT: On second thought, it is actually possible to (somewhat) "decrypt" a hash. Basically, you take the hash and look it up in a precomputed table of hashes of likely inputs (a lookup or rainbow table) to see whether any entry matches. This is why you should always salt password hashes.
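A short Python sketch of that lookup, assuming plain SHA-256 and a made-up four-entry word list in place of a real precomputed table. It also shows why a random salt makes the precomputed table useless.

```python
import hashlib
import os


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


# Precomputed once, reusable against every unsalted SHA-256 password database.
common_passwords = ["123456", "password", "12345", "qwerty"]
lookup = {sha256_hex(p.encode("utf-8")): p for p in common_passwords}

# Unsalted hash: a single dictionary lookup reveals the password.
unsalted = sha256_hex(b"12345")
print(lookup.get(unsalted))  # 12345

# Salted hash: the precomputed table no longer contains the stored value.
salt = os.urandom(16)
salted = sha256_hex(salt + b"12345")
print(lookup.get(salted))  # None
```

A salt does not stop an attacker from guessing against one specific hash when the salt is known; it only forces the work to be redone for every user.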
I'm in the middle of development and I need to test my login/password api.
The problem is that the password is encrypted in the database.
I have the following information.
key
iteration
salt
Is this enough to recover the password?
By the way I can edit these values as well if that will help.
I think you misunderstood how a password API works. You cannot reverse a properly hashed password, but you can validate an entered password against the stored hash.
To validate the entered password, you need to calculate the hash again, with the same parameters you used to create the first hash. Then you can compare the two hashes; if they match, the password was the same.
You cannot reverse PBKDF2, but you could brute-force the common passwords to see if any of them matches. If a random salt is used every time, then you will need to do that for each password independently. If a large iteration count is used then prepare for it to take very long.
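As a sketch of what such a guessing loop looks like in Python: the stored values below are invented stand-ins for the key, iteration and salt columns, and HMAC-SHA-1 with the default 20-byte output is an assumption about how the hash was produced.

```python
import hashlib
import hmac
from typing import Iterable, Optional


def derive(password: str, salt: bytes, iterations: int) -> bytes:
    """PBKDF2-HMAC-SHA1 with the parameters stored next to the hash."""
    return hashlib.pbkdf2_hmac("sha1", password.encode("utf-8"), salt, iterations)


def guess_password(stored: bytes, salt: bytes, iterations: int,
                   candidates: Iterable[str]) -> Optional[str]:
    """Return the first candidate whose derived key equals the stored one."""
    for candidate in candidates:
        if hmac.compare_digest(derive(candidate, salt, iterations), stored):
            return candidate
    return None


# Hypothetical stored values standing in for the key, iteration and salt fields.
salt, iterations = b"example-salt", 10_000
stored = derive("letmein", salt, iterations)
print(guess_password(stored, salt, iterations, ["123456", "password", "letmein"]))
```

Each guess costs a full PBKDF2 run, so the time grows with both the iteration count and the size of the candidate list.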
First, you should just reset it.
Second, you can recover it if and only if the password was weak (assuming correctly implemented PBKDF2), and you either know which HMAC it used (probably was PBKDF2-HMAC-SHA-1 - test with a known password), or you're willing to spend time trying several and hoping.
Try a tool like oclHashcat that's designed for password cracking, preferably with one or more good GPUs; note the generic PBKDF2 entry at the end of its list of examples.
Alternatively, if you're just testing your password API, you can run the test vectors at my GitHub repository through it and see if your results are correct or not.
What is the difference between a hash and a MAC (message authentication code)?
By their definitions they seem to serve the same function.
Can someone explain what the difference is?
The main difference is conceptual: while hashes are used to guarantee the integrity of data, a MAC guarantees integrity AND authentication.
This means that a hash code is generated blindly from the message without any kind of external input: what you obtain is something that can be used to check whether the message was altered in transit.
A MAC instead uses a secret key as the seed to the hash function it uses when generating the code: this should assure the receiver not only that the message hasn't been modified, but also that the sender is who we expected, since an attacker could not know the secret key used to generate the code.
According to Wikipedia:
While MAC functions are similar to cryptographic hash functions, they possess different security requirements. To be considered secure, a MAC function must resist existential forgery under chosen-plaintext attacks. This means that even if an attacker has access to an oracle which possesses the secret key and generates MACs for messages of the attacker's choosing, the attacker cannot guess the MAC for other messages without performing infeasible amounts of computation.
Of course, despite their similarities, they are implemented differently: usually a MAC generation algorithm is based on a hash generation algorithm, extended to take a secret key into account.
A hash is a function that produces a digest from a message. A cryptographically secure hash is one for which it is computationally infeasible to generate a message with a given digest. On its own, a hash of a message gives no information about the sender of that message. If you can securely communicate the hash of a message, then it can be used to verify that a large message has been correctly received over an unsecured transport.
A message authentication code is a way of combining a shared secret key with a message so that the recipient of the message can authenticate that the sender has the shared secret key and that no one who doesn't know the secret key could have sent or altered the message.
An HMAC is a hash-based message authentication code. Usually this involves applying a hash function one or more times to some sort of combination of the shared secret and the message. HMAC usually refers to the algorithm documented in RFC 2104 or FIPS-198.
A MAC does not encrypt the message, so the message is in plain text. It does not reveal the secret key, so a MAC can be sent across an open channel without compromising the key.
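For the curious, the RFC 2104 construction can be written out by hand in a few lines of Python and checked against the standard library's hmac module. SHA-256 as the underlying hash and the example key and message are my own choices for the sketch.

```python
import hashlib
import hmac

BLOCK_SIZE = 64  # SHA-256 block size in bytes


def hmac_sha256(key: bytes, message: bytes) -> bytes:
    """HMAC as defined in RFC 2104: H((K ^ opad) || H((K ^ ipad) || message))."""
    if len(key) > BLOCK_SIZE:
        key = hashlib.sha256(key).digest()  # long keys are hashed first
    key = key.ljust(BLOCK_SIZE, b"\x00")    # then zero-padded to the block size
    inner_pad = bytes(b ^ 0x36 for b in key)
    outer_pad = bytes(b ^ 0x5C for b in key)
    inner = hashlib.sha256(inner_pad + message).digest()
    return hashlib.sha256(outer_pad + inner).digest()


key, message = b"shared secret", b"transfer 10 coins to Bob"
tag = hmac_sha256(key, message)
# The hand-rolled version agrees with the standard library implementation.
print(hmac.compare_digest(tag, hmac.new(key, message, hashlib.sha256).digest()))  # True
```

In real code you would call the library function directly; the manual version is only there to show how the key and the message are combined.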
Found this to-the-point answer on another forum.
These types of cryptographic primitive can be distinguished by the security goals they fulfill (in the simple protocol of "appending to a message"):
Integrity: Can the recipient be confident that the message has not been accidentally modified?
Authentication: Can the recipient be confident that the message originates from the sender?
Non-repudiation: If the recipient passes the message and the proof to a third party, can the third party be confident that the message originated from the sender? (Please note that I am talking about non-repudiation in the cryptographic sense, not in the legal sense.)
Also important is this question:
Keys: Does the primitive require a shared secret key, or public-private keypairs?
I think the short answer is best explained with a table:
Cryptographic primitive | Hash | MAC       | Digital
Security goal           |      |           | signature
------------------------+------+-----------+-------------
Integrity               | Yes  | Yes       | Yes
Authentication          | No   | Yes       | Yes
Non-repudiation         | No   | No        | Yes
------------------------+------+-----------+-------------
Kind of keys            | none | symmetric | asymmetric
                        |      | keys      | keys
Please remember that authentication without confidence in the keys used is useless. For digital signatures, a recipient must be confident that the verification key actually belongs to the sender. For MACs, a recipient must be confident that the shared symmetric key has only been shared with the sender.
HASH FUNCTION: A function that maps a message of any length into a fixed length hash value, which serves as the authenticator.
MAC: A function of the message and a secret key that produces a fixed length value that serves as the authenticator.
A hash is a summary or fingerprint of a message and by itself provides neither integrity nor authentication, as it is susceptible to a man-in-the-middle attack. Suppose A wants to send a message M, combined with the hash H of M, to B. Instead, C captures the message, generates a message M2 and the hash H2 of M2, and sends them to B. Now B has no means of verifying whether this is the original message from A or not. However, a hash can be used in other ways to achieve integrity and authentication, such as in a MAC.
A MAC, which is also a summary of the message, provides integrity and authentication. A MAC can be computed in many ways. The simplest method is to use a hash function with two inputs, the message and a shared secret key. The use of the shared secret key adds the authentication ability to the MAC, and thus provides integrity and authentication. However, a MAC still does not provide non-repudiation, as any party holding the shared secret key can produce the message and the MAC.
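The A, B and C scenario above can be played through in a few lines of Python. This sketch uses HMAC-SHA256 rather than the "simplest method" of hashing message and key together, and the key and message values are invented for the example.

```python
import hashlib
import hmac

SHARED_KEY = b"known only to A and B"  # assumption: exchanged securely beforehand


def plain_hash(message: bytes) -> bytes:
    return hashlib.sha256(message).digest()


def mac(message: bytes, key: bytes = SHARED_KEY) -> bytes:
    return hmac.new(key, message, hashlib.sha256).digest()


# C intercepts A's message and substitutes M2 together with a fresh check value.
m2 = b"pay C 1000"
forged_hash = plain_hash(m2)                  # C needs no secret for this
forged_mac = mac(m2, key=b"C does not know")  # C has to guess the key

# B's checks: the hash check passes for the forgery, the MAC check does not.
print(hmac.compare_digest(plain_hash(m2), forged_hash))  # True
print(hmac.compare_digest(mac(m2), forged_mac))          # False
```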
This is where digital signatures and public-key cryptography come in.
Basically, the main difference is that a MAC uses a secret key and a hash does not use any key. Because of that, a MAC allows us to achieve authentication.
Hash functions use no key at all, whereas MACs use symmetric (shared-key) cryptography.
A cryptographic hash function is not by itself a MAC, but a MAC can be built from a cryptographic hash function (a keyed hash function).
Hash functions provide non-repudiation where MAC do no provide non-re