What to use (best/good practice) for the secret key in HMAC solution? - hash

I am implementing a HMAC-like solution based upon specifications provided to me by another company. The hashing parameters and use of the secret key is not an issue, and neither is the distribution of the key itself, since we are in close contact and close geographical location.
However - what is best practice for the actual secret key value?
Since both companies are working together, it would seem that
c9ac56dd392a3206fc80145406517d02 generated with a Rijndael algorithm and
Daisy Daisy give me your answer doare both pretty much equally secure (in this context) as a secret key used to add to the hash?

Citing Wikipedia page on HMAC:
The cryptographic strength of the HMAC depends upon the cryptographic strength of the underlying hash function, the size of its hash output, and on the size and quality of the key.
This means that completely random key, where every bit is randomly generated, is far better than set of characters.
The optimum size of the key is equal to block size. If the key is too short then it is padded usually with zeroes (which are not random). If the key is too long then its hash function is used. The length of hash output is anyway block size.
Use of visible characters as a key makes the key easier to guess as there are far less combinations of visible characters than if we allow for every possible combination of bits. For example:
There are 95 visible characters in ASCII (out of 256 combinations). If the block size is 16 bytes (HMAC_MD5) then there are 95^16 ~= 4.4*10^31 combinations. But for 16 bytes there are 3.4*10^38 possibilities. Attacker knowing that the key consists only of visible ASCII characters knows that he requires around 10 000 000 times less time than if he had to consider every possible combination of bits.
Summarizing I recommend use of cryptographic pseudo-random number generator to generate secret keys instead of coming up with your own keys.
Edit:
As martinstoeckli suggested if you have to you can use key-derivation-function to generate byte key of specified length from text password. This is much safer than converting plain text to bytes and using these bytes as a key directly. Nevertheless, there is nothing more secure than key consisting of random bytes.

Related

How to assess security of Argon2 parameters

I would like to hash user password with Argon2id, then use this hash as ECDSA private key. I made some tests with Python and passlib.hash and found following parameters:
memory: 32 MB
iterations: 100
hash length: 32 bytes (the same as ECDSA private key length)
parallelism: 1 (I want to make it slow on purpose and the algorithm will be run on variety of devices)
With these parameters it takes about 1 second to hash the password on my computer, which is acceptable.
The question is: how to assess if these parameters are secure enough? Please note that public keys are public, so we don't have a typical situation here like with passwords when attacker has limited capabilities to check if hash is correct, unless the database is compromised. Here, with known public key attacker can easily check (offline) if generated private key (thus Argon2 result) matches to known public key without a need to gain access to protected database.
My calculations:
Let's say passwords have 12 characters including lowercase, uppercase letters and numbers (26+26+10) so at best (with naïve assumption that passwords are generated randomly) there are about (62^12 or 2^71.45) 3,226,266,762,397,899,821,056 combinations. Attacker will need to try half of tries to crack the password, which gives us 1,613,133,381,198,949,910,528 tries. It is 51,152,123,959,885 years per single core. We can use thousands of cores at the same time but it will still need too much time to crack it. Obviously this time in reality will be much shorter because attackers will rather use dictionary attacks.
Using a private key (ECDSA or not) derived from a user-provided password completely defeats the whole security and purpose of private/public keys. User passwords typically have very little entropy (50 bits or less), way less than even your basic RSA 2048. Hashing the password with Argon2 (or any other method) makes absolutely no difference, as hashes provides no extra entropy, it's merely a "one-way" encoding.

Encrypt numeric data while preserving inequalities among ciphertexts

How to encrypt numeric data such that the cipher text produced by the encryption function is numeric, also Enc[m1] < Enc[m2] where m1 < m2.
I have gone through number of references all pointing to Format Preserving Encryption. However, no open source code implementation is available for it.
Is there a way (Encryption or Encoding) which can conceal the data with the aforementioned properties by using Java or C# ?
I want to encrypt numeric data within the range of [1 – 50] to cipher text within the range of [1000 - 5000]. I am trying to implement Secure Inverted Index mentioned in Enabling Search over Encrypted Multimedia Databases.
I think you've got a basic contradiction here. If you encrypt a number of values and somehow maintain the sort order among them, then someone, knowing that "abc" encrypts to 567 and "abe" encrypts to 569, will know that 568 => "abd". (Not that your encryption algorithm would be that naive, but you're seriously weakening anything you do manage to devise.)
Encrypting to a number is not difficult, if you allow the number to be longer than your cleartext. (After all, characters themselves are just numbers with special meaning.) A simple approach is to just decode the cyphertext into octal, but other techniques will produce slightly more compact representations of decimal digits.

What difference does key length make when signing a file?

I've never taken any classes on encryption or security and I'm trying to teach myself some basics, so forgive me if this is a silly question (don't worry, I'm not working on anything sensitive)
So, I'm playing around with Crypto++ so that I can make a signature of a file to see if the file has been edited by someone other than me. The test application that comes with the library looks like it has options (rs and rv) that do exactly what I want to do in my own program (verify the integrity of the signature of a file). Of course, before doing that I need to generate a public and private key. When doing so with the test application's g option it asks me to specify the key length in bits. What difference does the key length make?
The key length determines how hard it is for someone to break your cryptography. For digital signatures, that means how hard is it for someone to generate a fake signature.
For RSA a key length of 1024 bits is sufficient for non-sensitive information, but it should only be used for a few years and then replaced with a new key. 2048 bits is stronger and 4096 is stronger still.
For a naive brute-force attacker, adding a single bit to the key length doubles the amount of work they need to do to compromise your key. However, algorithms like RSA do not scale in this way: a 2048-bit RSA key is not 2^1024 times as hard to break as a 1024-bit key (unless the attacker is really stupid).
Generally public key algorithms (e.g. RSA) need much larger keys than symmetric key algorithms (e.g. AES) because they rely on different mathematical properties.
For a good primer on cryptography you should check out Peter Gutmann's godzilla crypto tutorial. It's pretty readable and gives you a good overview of how crypto works in its various forms.

What kind of encrypted data is this?

A friend of me ask this, and i was thinking of asking this here too..
"What kind of data are this, how are they encrypted, or decrypted?"
My friend told me he got this from facebook.
d9ca6435295fcd89e85bd56c2fd51ccc
It looks like it could be an md5 hash.
Basically a hash is a one-way function. The idea is that you take some input data and run it through the algorithm to create a value (such as the string above) that has a low probability of collisions (IE, two input values hashing to the same string).
You cannot decrypt a hash because there is not enough information in the resultant string to go back. However, it may be possible for someone to figure out your input values if you use a 'weak' hashing algorithm and do not do proper techniques such as salting a hash, etc.
I don't know how FaceBook uses hashes, but a common use for a hash might be to uniquely identify a page. For example, if you had a private image on a page, you might ask to generate a link to the image that you can email to friends. That link might use a hash as part of the URL since the value can be computed quickly, is reasonably unique, and has a low probability of a third party figuring it out.
This is actually a large topic that I am by no means doing justice to. I suggest googling around for hash, md5, etc to learn more, if you are so inclinded.
It is a sequence of 128 bits, encoded as a lower-case hex string.
If you are talking about a Facebook API key, there is no deeper meaning to decode from the bits. The keys are created at random by Facebook and assigned to a particular application to identify it. Each application gets a different set of random bits for its API key.
This appears the be the...
hexadecimal representation for...
- ... a 16 bytes encryption block or..
- ... some 128 bits hash code or even
- ... just for some plain random / identifying number.
(Hexadecimal? : note how there are only 0 thru 9 digits and a thru f letters.)
While the MD5 Hash guess suggested by others is quite plausible, it could be just about anything...
If it is a hash or a identifying / randomly assigned number, its meaning is external to the code itself.
For example it could be a key to be used to locate records in a database, or a value to be compared with the result of the hash function applied to the user supplied password etc.
If it is an encrypted value, its meaning (decrypted value) is directly found within the code, but it could be just about anything. Also, assuming it is produced with modern encryption algorithm, it could take a phenomenal amount of effort to crack the code (if at all possible).

How can I generate a unique, small, random, and user-friendly key?

A few months back I was tasked with implementing a unique and random code for our web application. The code would have to be user friendly and as small as possible, but still be essentially random (so users couldn't easily predict the next code in the sequence).
It ended up generating values that looked something like this:
Af3nT5Xf2
Unfortunately, I was never satisfied with the implementation. Guid's were out of the question, they were simply too big and difficult for users to type in. I was hoping for something more along the lines of 4 or 5 characters/digits, but our particular implementation would generate noticeably patterned sequences if we encoded to less than 9 characters.
Here's what we ended up doing:
We pulled a unique sequential 32bit id from the database. We then inserted it into the center bits of a 64bit RANDOM integer. We created a lookup table of easily typed and recognized characters (A-Z, a-z, 2-9 skipping easily confused characters such as L,l,1,O,0, etc.). Finally, we used that lookup table to base-54 encode the 64-bit integer. The high bits were random, the low bits were random, but the center bits were sequential.
The final result was a code that was much smaller than a guid and looked random, even though it absolutely wasn't.
I was never satisfied with this particular implementation. What would you guys have done?
Here's how I would do it.
I'd obtain a list of common English words with usage frequency and some grammatical information (like is it a noun or a verb?). I think you can look around the intertubes for some copy. Firefox is open-source and it has a spellchecker... so it must be obtainable somehow.
Then I'd run a filter on it so obscure words are removed and that words which are too long are excluded.
Then my generation algorithm would pick 2 words from the list and concatenate them and add a random 3 digits number.
I can also randomize word selection pattern between verb/nouns like
eatCake778
pickBasket524
rideFlyer113
etc..
the case needn't be camel casing, you can randomize that as well. You can also randomize the placement of the number and the verb/noun.
And since that's a lot of randomizing, Jeff's The Danger of Naïveté is a must-read. Also make sure to study dictionary attacks well in advance.
And after I'd implemented it, I'd run a test to make sure that my algorithms should never collide. If the collision rate was high, then I'd play with the parameters (amount of nouns used, amount of verbs used, length of random number, total number of words, different kinds of casings etc.)
In .NET you can use the RNGCryptoServiceProvider method GetBytes() which will "fill an array of bytes with a cryptographically strong sequence of random values" (from ms documentation).
byte[] randomBytes = new byte[4];
RNGCryptoServiceProvider rng = new RNGCryptoServiceProvider();
rng.GetBytes(randomBytes);
You can increase the lengh of the byte array and pluck out the character values you want to allow.
In C#, I have used the 'System.IO.Path.GetRandomFileName() : String' method... but I was generating salt for debug file names. This method returns stuff that looks like your first example, except with a random '.xyz' file extension too.
If you're in .NET and just want a simpler (but not 'nicer' looking) solution, I would say this is it... you could remove the random file extension if you like.
At the time of this writing, this question's title is:
How can I generate a unique, small, random, and user-friendly key?
To that, I should note that it's not possible in general to create a random value that's also unique, at least if each random value is generated independently of any other. In addition, there are many things you should ask yourself if you want to generate unique identifiers (which come from my section on unique random identifiers):
Can the application easily check identifiers for uniqueness within the desired scope and range (e.g., check whether a file or database record with that identifier already exists)?
Can the application tolerate the risk of generating the same identifier for different resources?
Do identifiers have to be hard to guess, be simply "random-looking", or be neither?
Do identifiers have to be typed in or otherwise relayed by end users?
Is the resource an identifier identifies available to anyone who knows that identifier (even without being logged in or authorized in some way)?
Do identifiers have to be memorable?
In your case, you have several conflicting goals: You want identifiers that are—
unique,
easy to type by end users (including small), and
hard to guess (including random).
Important points you don't mention in the question include:
How will the key be used?
Are other users allowed to access the resource identified by the key, whenever they know the key? If not, then additional access control or a longer key length will be necessary.
Can your application tolerate the risk of duplicate keys? If so, then the keys can be completely randomly generated (such as by a cryptographic RNG). If not, then your goal will be harder to achieve, especially for keys intended for security purposes.
Note that I don't go into the issue of formatting a unique value into a "user-friendly key". There are many ways to do so, and they all come down to mapping unique values one-to-one with "user-friendly keys" — if the input value was unique, the "user-friendly key" will likewise be unique.
If by user friendly, you mean that a user could type the answer in then I think you would want to look in a different direction. I've seen and done implementations for initial random passwords that pick random words and numbers as an easier and less error prone string.
If though you're looking for a way to encode a random code in the URL string which is an issue I've dealt with for awhile then I what I have done is use 64-bit encoded GUIDs.
You could load your list of words as chakrit suggested into a data table or xml file with a unique sequential key. When getting your random word, use a random number generator to determine what words to fetch by their key. If you concatenate 2 of them, I don't think you need to include the numbers in the string unless "true randomness" is part of the goal.