pbkdf2 key length - hash

What is the $key_length in PBKDF2?
It says that it will be derived from the input, but I see people using key_lengths of 256 and greater, but when I enter 256 as a key_length the output is 512 characters. Is this intentional? Can I safely use 64 as the key_length so the output is 128 characters long?

$key_length is the number of output bytes that you want from PBKDF2. (Note that if key_length is more than the number of output bytes of the hash algorithm, the whole iterated computation is repeated for each additional block of output, slowing the hashing down perhaps more than you desire. SHA256 gives 32 bytes of output, for example, so asking for 33 bytes will take roughly twice as long as asking for 32.)
The doubling of the length that you mention is because the code converts the output bytes to hexadecimal (i.e. 2 characters per 1 byte) unless you specify $raw_output = true. The test vectors included specify $raw_output = false, since hexadecimal is simply easier to work with and post online. Depending on how you are storing the data in your application, you can decide if you want to store the results as hex, base64, or just raw binary data.
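The relationship between the requested key length and the length of the hex string can be checked with a small sketch. It uses Python's hashlib.pbkdf2_hmac rather than the PHP function discussed above; the password, salt, and iteration count are placeholder values chosen only for illustration.

```python
import hashlib

password = b"correct horse battery staple"  # hypothetical example input
salt = b"0123456789abcdef"                  # fixed only for illustration; use a random salt in practice
iterations = 1000                           # kept low so the example runs quickly

# Ask for 64 bytes of derived key material.
raw = hashlib.pbkdf2_hmac("sha256", password, salt, iterations, dklen=64)
hex_text = raw.hex()

print(len(raw))       # number of derived bytes
print(len(hex_text))  # hex uses two characters per byte

# A 64-byte request with SHA-256 computes two 32-byte blocks;
# the first block is identical to what a 32-byte request returns.
short = hashlib.pbkdf2_hmac("sha256", password, salt, iterations, dklen=32)
print(raw[:32] == short)
```

So a key length of 64 does give 128 hex characters, at the cost of computing two SHA-256 blocks instead of one.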

In the IETF's Password-Based Cryptography Specification Version 2.0 (RFC 2898), the key length is defined as
"intended length in octets of the derived key, a positive integer, at most (2^32 - 1) * hLen"
Here hLen denotes the length in octets of the pseudorandom function output. For further details on PBKDF2 you can refer to How to store passwords securely with PBKDF2.

Related

Which hashing algorithm generates alphanumeric output?

I am looking for a hashing algorithm that generates alphanumeric output. I ran a few tests with MD5, SHA3, etc., and they produce hexadecimal output.
Example:
Input: HelloWorld
Output[sha3_256]: 92dad9443e4dd6d70a7f11872101ebff87e21798e4fbb26fa4bf590eb440e71b
The first character in the above output is 9. Since the output is in hex format, the possible values are [0-9][a-f].
I am trying to get the maximum range of possible values for the first character: [0-9][a-z][A-Z].
Any ideas would be appreciated. Thanks in advance.
MD5 computes a 128-bit hash and SHA256 a 256-bit hash; the output they provide is nothing more than a 128-bit or 256-bit binary number. In short, that is a lot of zeros and ones. To get a more human-friendly representation of binary-coded values, software developers and system designers use hexadecimal numbers, a representation in base(16). For example, an 8-bit byte can have values ranging from 00000000 to 11111111 in binary form, which can be conveniently represented as 00 to FF in hexadecimal.
You could convert this binary number into a base(32) if you want. This is represented using the characters "A-Z2-7". Or you could use base(64) which needs the characters "A-Za-z0-9+/". In the end, it is just a representation.
There is, however, some practical use to base(16) or hexadecimal. In computer lingo, a byte is 8 bits and a word consists of two bytes (16 bits). All of these can be comfortably represented hexadecimally, as 2^8 = 2^4 × 2^4 = 16 × 16, whereas 2^8 = 2^5 × 2^3 = 32 × 8. Hence, in base(32), a byte is not cleanly represented. You need 5 bytes (40 bits) to get a clean base(32) representation with 8 characters. That is not comfortable to deal with on a daily basis.
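To make the "it is just a representation" point concrete, here is a minimal Python sketch that renders one SHA3-256 digest in several bases. The to_base62 helper and its ALPHABET are not a standard library feature; they are an illustrative assumption showing one way to get purely [0-9][a-z][A-Z] output.

```python
import base64
import hashlib
import string

# 62 symbols: digits, lowercase, uppercase (an arbitrary ordering chosen for this sketch).
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase

def to_base62(data: bytes) -> str:
    """Interpret data as a big-endian integer and write it in base 62."""
    n = int.from_bytes(data, "big")
    if n == 0:
        return ALPHABET[0]
    chars = []
    while n:
        n, rem = divmod(n, 62)
        chars.append(ALPHABET[rem])
    return "".join(reversed(chars))

digest = hashlib.sha3_256(b"HelloWorld").digest()  # 32 raw bytes

as_hex = digest.hex()                               # 2 characters per byte
as_b32 = base64.b32encode(digest).decode("ascii")   # A-Z2-7 plus '=' padding
as_b64 = base64.b64encode(digest).decode("ascii")   # A-Za-z0-9+/ plus '=' padding
as_b62 = to_base62(digest)                          # alphanumeric only

for label, text in [("hex", as_hex), ("base32", as_b32), ("base64", as_b64), ("base62", as_b62)]:
    print(label, len(text), text)
```

Note that the integer-based base 62 conversion drops leading zero bytes, so the string length can vary slightly; pad to a fixed width if you need to decode it back to exactly 32 bytes.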

Maximum input and output length for Argon2

As you may know, the maximum input length for bcrypt is 72 characters and the output length is 60 characters. (I've tested it in PHP. Correct me if I'm wrong.)
I want to know the maximum input length and the exact output length for Argon2. Thanks.
According to https://en.wikipedia.org/wiki/Argon2#Algorithm max input length is 2^32-1 bytes or 4294967295 bytes.
As to the equivalent in character length, it depends on what character encoding you use.
According to this answer:
In ASCII or ISO 8859, each character is represented by one byte
In UTF-32, each character is represented by 4 bytes
In UTF-8, each character uses between 1 and 4 bytes
In ISO 2022, it's much more complicated
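A short Python sketch of the character-versus-byte distinction, using a made-up five-character string with one non-ASCII character:

```python
text = "héllo"  # 5 characters, one of them outside ASCII

n_chars = len(text)
n_latin1 = len(text.encode("latin-1"))    # ISO 8859-1: one byte per character
n_utf8 = len(text.encode("utf-8"))        # 1 to 4 bytes per character
n_utf32 = len(text.encode("utf-32-le"))   # 4 bytes per character (no byte-order mark)

print(n_chars, n_latin1, n_utf8, n_utf32)
```

A byte limit therefore translates into a different character limit depending on how the password is encoded before hashing.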
Still according to https://en.wikipedia.org/wiki/Argon2#Algorithm I cannot give you an 'exact' output length because it depends on the length you choose for various parameters such as the salt and the output hash itself.

What should be the data type for the hashed value of a password encrypted using PBKDF2?

I am trying to learn to use PBKDF2 hash functions for storing passwords in the database. I have a rough draft of the procedure that I'll be using to generate the hash. But when I create the table in PL/SQL Developer that will hold the generated password hash, what data type should I declare for the hashed password column?
It might be a lame question, but I'm trying to learn online. It would be a huge help if I could get links for further study as well. Thank you.
The first link, as always, is Thomas Pornin's canonical answer to How to securely hash passwords.
Storage in the database
The hash can be stored in BINARY format for the least transformations and smallest number of bytes; see below for sizes.
Alternately, store it in a CHAR after converting to hex, which costs a transformation and double the bytes of the BINARY size
Alternatively, store it in a CHAR after converting to Base64, which costs a transformation and 4/3rds the number of bytes of BINARY size plus padding
i.e. PBKDF2-HMAC-SHA-512 where all 64 bytes of output are used would be
BINARY(64) as binary
CHAR(128) as hex
CHAR(88) as Base64
The number of iterations should be stored in an INT, so it can be trivially increased later
The salt, which must be a per-user, cryptographically random value, can be stored in a BINARY format for the smallest number of bytes, and should be at least 12, and preferably 16-24 bytes long.
i.e. for a 16 byte binary salt
BINARY(16) as binary
CHAR(32) as hex
CHAR(24) as Base64
Optionally a password hash algorithm version as a small INT type
i.e. 1 for PBKDF2-HMAC-SHA-512, and then later if you change to BCrypt, 2 for BCrypt, etc.
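The column sizes listed above can be double-checked with a few lines of Python. The storage_sizes helper is a name invented for this sketch; it just encodes random bytes of the given length and measures the result.

```python
import base64
import os

def storage_sizes(n_bytes: int) -> dict:
    """Return the stored length of n_bytes of binary data in each encoding."""
    raw = os.urandom(n_bytes)
    return {
        "binary": len(raw),
        "hex": len(raw.hex()),                 # 2 characters per byte
        "base64": len(base64.b64encode(raw)),  # 4 characters per 3 bytes, padded
    }

hash_sizes = storage_sizes(64)  # full PBKDF2-HMAC-SHA-512 output
salt_sizes = storage_sizes(16)  # 16-byte salt

print(hash_sizes)
print(salt_sizes)
```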
Normal PBKDF2 considerations
Consider using PBKDF2-HMAC-SHA-512, as SHA-512 in particular has 64-bit operations that reduce the advantage most GPU based attackers have over you as of early 2016.
Use a high number of iterations (hundreds of thousands, or at least high tens of thousands).
Don't ask for a larger number of PBKDF2 output bytes than the native hash function supports
SHA-512 <= 64 bytes
SHA-384 <= 48 bytes
SHA-256 <= 32 bytes
SHA-224 <= 28 bytes
SHA-1 <= 20 bytes
MD5 <= 16 bytes
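Putting these considerations together, a minimal sketch in Python might look like the following. The function names, the iteration count, and the salt size are assumptions made for illustration, not a vetted implementation; the three returned values correspond to the salt, iteration, and hash columns described above.

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000  # assumed value; tune it to your own hardware
SALT_BYTES = 16
HASH_BYTES = 64       # native SHA-512 output size, so only one PBKDF2 block is computed

def hash_password(password: str, iterations: int = ITERATIONS):
    """Return (salt, iterations, derived_key) for storage."""
    salt = os.urandom(SALT_BYTES)  # per-user, cryptographically random
    derived = hashlib.pbkdf2_hmac(
        "sha512", password.encode("utf-8"), salt, iterations, dklen=HASH_BYTES
    )
    return salt, iterations, derived

def verify_password(password: str, salt: bytes, iterations: int, expected: bytes) -> bool:
    """Recompute the hash with the stored parameters and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac(
        "sha512", password.encode("utf-8"), salt, iterations, dklen=len(expected)
    )
    return hmac.compare_digest(candidate, expected)

# Usage: a low iteration count is passed here only to keep the demonstration fast.
salt, iterations, stored = hash_password("s3cret", iterations=1000)
print(verify_password("s3cret", salt, iterations, stored))
print(verify_password("wrong", salt, iterations, stored))
```

Because the iteration count is stored with each record, it can be raised later without invalidating existing hashes.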

Why do md5 and sha-* only use alphanumeric characters in their hash result?

I understand not wanting to use '\0', but all the rest in the extended ASCII range is usable right?
Wouldn't this provide a much better/secure/"less coliding" hash?
You're starting from a false premise -- they produce a result that can (and does) include all 8-bit values from 0 to 255. Just for example, one of the test vectors for SHA-256 is an input of "abc". The result from this (in hexadecimal) is:
ba7816bf 8f01cfea 414140de 5dae2223 b00361a3 96177a9c b410ff61 f20015ad
Just within that test, the result includes bytes with values from 0x00 to 0xff.
For display, that may be (often is) rendered in something like hexadecimal. For transmission in email they're often encoded with something like MIME or UUENCODE. The hash itself, however, is not limited in this way.
Transforming the result this way makes no difference to collision resistance -- you still have 160/256/whatever bits of actual data, but the representation is expanded.
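This is easy to confirm for the "abc" test vector with a small Python sketch: the digest is 32 raw bytes, and hexadecimal is only how it gets printed.

```python
import hashlib

digest = hashlib.sha256(b"abc").digest()  # raw bytes, not text

print(len(digest))   # number of raw bytes
print(digest.hex())  # the same bytes rendered as hexadecimal

# Smallest and largest byte values that occur in this digest.
lowest, highest = min(digest), max(digest)
print(hex(lowest), hex(highest))
```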
The result is just hexadecimal encoded to be better readable.
In fact, those hash algorithms are outputting numbers, not strings. They use only letters a-f in combination with numbers 0-9, which makes the output a hexadecimal number.
MD5 produces a 128-bit (16-byte) hash.
SHA, depending on whether it is SHA-1 or SHA-256, produces either a 160-bit (20-byte) or a 256-bit (32-byte) hash.
Note that I'm talking about binary length/strength. The longer the hash, the less likely a collision is.
The fact that most users stick it into a DB field or whatnot makes it convenient to convert it to ASCII using various binary-to-ASCII conversion algorithms. This should not influence the collision probability at all, since you just end up with a longer ASCII string.
FWIW I've been using SHA1 and SHA256 in crypto products in binary form for over 5 years, and I'd recommend choosing hashes in the following order, from the strongest to the weakest: SHA256, SHA1, MD5. There is a website that can "reverse" MD5, so I'd strongly suggest against it.

md5 hash or crc32 which one to use in this case

I need a hash that can be represented in fewer than 26 characters.
MD5 produces a 32-character string; if I convert it to base 36, how good will it be?
I need the hash not for cryptography but for uniqueness, basically to identify each input based on the time of input and the input data. Currently I am thinking of something like this:
$hash=md5( str_ireplace(".","",microtime()).md5($input_data) ) ;
$unique_id= base_convert($hash,16,36) ;
Should I go with this, or use CRC32, which gives a smaller hash but which I'm afraid won't be that unique?
I think a much simpler solution could take place.
According to your statement, you have 26 characters of space. However, to clarify what I understand to be character and what you understand to be character, let's do some digging.
The MD5 algorithm, according to Wikipedia, produces 16-byte hashes.
The CRC32 algorithm produces 4-byte hashes.
I understand "characters" (in the simplest sense) to be ASCII characters. Each ASCII character (e.g. A = 65) is stored in one 8-bit byte.
The MD5 algorithm produces 16 bytes * 8 bits per byte = 128 bits; CRC32 produces 32 bits.
You must understand that hashes are not mathematically unique, but "likely to be unique."
So my solution, given your description, would be to then represent the bits of the hash as ascii characters.
If you only have the choice between MD5 and CRC32, the answer would be MD5. But you could also fit a 160-bit SHA-1 hash into a string of fewer than 26 characters (as raw bytes it would be 20 8-bit characters long).
If you are concerned about the set of symbols that each hash uses: in their usual hexadecimal form, both hashes use only the characters [0-9a-f].
Finally, when you convert what are essentially numbers from one base to another, the number doesn't change, therefore the strength of the algorithm doesn't change; it just changes the way the number is represented.
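As a rough check of the base conversion idea, the sketch below (in Python rather than the PHP of the question) converts a full 128-bit MD5 digest to base 36 using arbitrary-precision integers. The to_base36 helper and the sample input are assumptions for illustration only.

```python
import hashlib
import string

DIGITS = string.digits + string.ascii_lowercase  # 36 symbols: 0-9 then a-z

def to_base36(n: int) -> str:
    """Write a non-negative integer in base 36."""
    if n == 0:
        return "0"
    out = []
    while n:
        n, rem = divmod(n, 36)
        out.append(DIGITS[rem])
    return "".join(reversed(out))

digest = hashlib.md5(b"example input").digest()  # 16 raw bytes
number = int.from_bytes(digest, "big")           # the same value as one 128-bit integer
unique_id = to_base36(number)

print(unique_id, len(unique_id))

# Even the largest possible 128-bit value needs only 25 base-36 digits.
longest = to_base36(2**128 - 1)
print(len(longest))
```

Since the conversion is reversible (int(unique_id, 36) gives back the same number), no information is lost, and the result always fits in fewer than 26 characters.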