SHA-256 hash length

I want to get the SHA-256 hash of a string. Since SHA-256 is 256 bits long and each byte is 8 bits, I think the length of the hash (as a string) should be 32 characters (256/8 = 32), but the hash length is 64 characters. Can anyone help me?
import hashlib
print(hashlib.sha256('lsd'.encode()).hexdigest())
output:
3fd7dcbd130286b3799aa74e7fcb1d2ecc80d4c95a158d91dfa1d6a72557f769
The hash length is 64; shouldn't it be 32?

The output characters are actually hexadecimal digits. There are 16 possible digits used to represent numbers (0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, F), and we need only 4 bits for each hexadecimal digit, because 2^4 = 16.
Now we have 256 bits and 4 bits per digit, so 256/4 = 64: we get 64 digits in length.
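You can check this difference directly in hashlib: hexdigest() returns the 64-character hex string, while digest() returns the raw 32 bytes. A quick demonstration, using the same 'lsd' input as above:

import hashlib

h = hashlib.sha256('lsd'.encode())
print(len(h.hexdigest()))  # 64 hex characters (4 bits each)
print(len(h.digest()))     # 32 raw bytes (8 bits each, 256 bits total)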

Related

Is this a bug in the passlib base64 encoding?

I am trying to decode and re-encode a bytestring using passlib's base64 encoding:
from passlib.utils import binary
engine = binary.Base64Engine(binary.HASH64_CHARS)
s2 = engine.encode_bytes(engine.decode_bytes(b"1111111111111111111111w"))
print(s2)
This prints b'1111111111111111111111A', which is of course not what I expected: the last character is different.
Where is my mistake? Is this a bug?
No, it's not a bug.
In all variants of Base64, every encoded character represents just 6 bits, and depending on the number of bytes encoded you can end up with 0, 2, or 4 insignificant bits at the end.
In this case the encoded string 1111111111111111111111w is 23 characters long; that means 23 * 6 = 138 bits, which decode to 17 bytes (136 bits) plus 2 insignificant bits.
The encoding you use here is not Base64 but Hash64:
Base64 character map used by a number of hash formats; the ordering is wildly different from the standard base64 character map.
In the character map
HASH64_CHARS = u("./0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz")
we find A on index 12 (001100) and w on index 60 (111100)
Now the 'trick' here is that
binary.Base64Engine(binary.HASH64_CHARS) has a default parameter big=False, which means the encoding is done in little-endian format by default.
In your example that means w is 001111 and A is 001100 (reading the bits in little-endian order). During decoding, the last two bits are cut off because they are not needed, as explained above. When you encode again, A is taken as the first character in the character map that can encode 0011 plus two insignificant bits.
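To make the bit arithmetic concrete, here is a rough pure-Python sketch of little-endian 6-bit decoding and re-encoding over the Hash64 character map. It is only an illustration of the mechanism described above, not passlib's actual implementation:

CHARS = "./0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"

def decode_le(s):
    # Collect each character's 6-bit value, least-significant bit first.
    bits = []
    for ch in s:
        v = CHARS.index(ch)
        bits.extend((v >> i) & 1 for i in range(6))
    # Keep only whole bytes; the trailing bits are insignificant.
    nbytes = len(bits) // 8
    return bytes(sum(bits[8 * b + i] << i for i in range(8)) for b in range(nbytes))

def encode_le(data):
    bits = []
    for byte in data:
        bits.extend((byte >> i) & 1 for i in range(8))
    # Pad with zero bits up to a multiple of 6; this is where 'w' becomes 'A'.
    bits.extend([0] * (-len(bits) % 6))
    return "".join(CHARS[sum(bits[6 * c + i] << i for i in range(6))]
                   for c in range(len(bits) // 6))

print(encode_le(decode_le("1111111111111111111111w")))  # 1111111111111111111111A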

Perl: Decimal to 32bits hex convert

I want to convert the decimal number 64 into its hex representation, 0x00000040. I am using
printf("0x%X", 64);
but it gives the output 0x40. Can anyone please tell me how to represent the decimal number in 0x00000000 format?
You can specify the length of the field between the % and the X (e.g. %8X). By default, the number will be padded with spaces, but using a leading zero for the length (e.g. %08X) will cause printf to pad with zeroes instead. Therefore, the following can be used:
printf("0x%08X", 64);

How to denote a 160-bit SHA-1 string as 20 characters in ANSI?

For the input "hello", SHA-1 returns "aaf4c61ddcc5e8a2dabede0f3b482cd9aea9434d", which is 40 hex digits. I know 1 byte can be denoted as 1 character, so the 160-bit output should be convertible to 20 characters. But when I look up "aa" in an ASCII table, there is no such hex value, and I'm confused about that. How can I map a 160-bit SHA-1 string to 20 characters in ANSI?
ASCII only has 128 characters (7 bits), while ANSI has 256 (8 bits). As for the hex value AA (decimal 170), the corresponding ANSI character would be ª.
Now, you have to keep in mind that a number of both ASCII and ANSI characters (0-31) are non-printable control characters (system bell, null character, etc.), so turning your hash into a readable 20-character string will not be possible in most cases. For instance, your example contains the hex value 0F, which would translate to a shift-in character.
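To see this in practice, here is a small Python sketch that takes the 20 raw digest bytes and decodes them as Latin-1 (used here as a stand-in for an 8-bit ANSI code page):

import hashlib

digest = hashlib.sha1(b"hello").digest()
print(digest.hex())   # aaf4c61ddcc5e8a2dabede0f3b482cd9aea9434d
print(len(digest))    # 20 bytes
# Decoding as Latin-1 always succeeds, but the result mixes accented
# letters with unprintable control characters such as '\x0f':
print(repr(digest.decode("latin-1")))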

0x00000000 hexadecimal?

I had always been taught that 0-9 represent values zero to nine, and A, B, C, D, E, F represent 10-15.
I see this format 0x00000000 and it doesn't fit the pattern of hexadecimal. Is there a guide or tutorial somewhere that can explain it?
I googled hexadecimal but I can't find any explanation of it.
So my second question is: is there a name for the 0x00000000 format?
0x simply tells you that the number after it is in hex,
so 0x00 is 0, 0x10 is 16, 0x11 is 17, etc.
The 0x is just a prefix (used in C and many other programming languages) to mean that the following number is in base 16.
Other notations that have been used for hex include:
$ABCD
ABCDh
X'ABCD'
"ABCD"X
Yes, it is hexadecimal.
Without the prefix you couldn't represent A, for example: a C or Java compiler would treat it as a variable identifier. The 0x prefix tells the compiler it's a hexadecimal number, so:
int ten_i = 10;
int ten_h = 0xA;
ten_i == ten_h; // this boolean expression is true
The leading zeroes indicate the size: 0x0080 hints the number will be stored in two bytes; and 0x00000080 represents four bytes. Such notation is often used for flags: if a certain bit is set, that feature is enabled.
P.S. As an off-topic note: if a number starts with 0, it's interpreted as an octal number, for example 010 == 8. Here 0 is also a prefix.
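For a quick illustration, here are the same ideas in Python (the 0x prefix works exactly as in C and Java; note that Python 3 spells the octal prefix 0o instead of a bare leading 0):

ten_i = 10
ten_h = 0xA
print(ten_i == ten_h)  # True
print(0x00000040)      # 64 -- leading zeroes don't change the value
print(0o10)            # 8  -- Python 3 spelling of C's octal 010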
Everything after the x is hex digits (the 0x is just a prefix to designate hex), representing 32 bits (0xFFFFFFFF in binary would be 1111 1111 1111 1111 1111 1111 1111 1111).
Hexadecimal digits are often prefaced with 0x to indicate they are hexadecimal.
In this case there are 8 digits, each representing 4 bits, so that is 32 bits, or a word. I'm guessing you saw this in an error message and that it is a memory address. This value means null, as the hex value is 0.

MD5 Hash and Base64 encoding

If I have a 32-character string (an MD5 hash) and I encode it using Base64, what's the maximum length of the encoded string?
An MD5 value is always 22 (useful) characters long in Base64 notation. Many Base64 algorithms will also append 2 characters of padding when encoding an MD5 hash, bringing the total to 24 characters. The padding adds no useful information and can be discarded. Only the first 22 characters matter.
Here's why:
An MD5 hash is a 128-bit value. Every character in a Base64 string carries 6 bits of information, because there are 64 possible values for each character and 2^6 = 64. At 6 bits per character, 21 characters hold 126 bits of information and 22 characters hold 132 bits. Since 128 bits cannot fit within 21 characters but do fit within 22 characters (with a little room to spare), a 128-bit value will always be represented as 22 characters in Base64.
A note on the padding:
I mentioned above that many Base64 encoding algorithms add a couple of characters of padding when encoding an MD5 value. This is because Base64 represents 3 bytes of information as 4 characters. Since MD5 has 16 bytes of information, many Base64 encoding algorithms append "==" to designate that the input of 16 bytes was 2 bytes short of the next multiple of 3, which would have been 18 bytes. These 2 equal signs add no information whatsoever to the string, and can be discarded when storing.
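A quick sanity check in Python, using an arbitrary input string:

import base64
import hashlib

digest = hashlib.md5(b"hello world").digest()  # 16 raw bytes (128 bits)
encoded = base64.b64encode(digest)
print(len(encoded))           # 24 characters in total
print(encoded[-2:])           # b'==' -- the 2 padding characters
print(encoded.rstrip(b"="))   # the 22 useful characters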
As per http://en.wikipedia.org/wiki/Base64
"Note that given an input of n bytes, the output will be (n + 2 - ((n + 2) % 3)) / 3 * 4 bytes long, which converges to n * 4 / 3 or 1.33333n for large n."
So, it will be (32 + 2 - ((32 + 2) % 3)) / 3 * 4 = (34 - (34 % 3)) / 3 * 4 = (34 - 1) / 3 * 4 = 33 / 3 * 4 = 44 characters.
You could always extract it in raw binary form (128 bits) and encode it directly into base 64, which means converting 16 bytes instead of 32, which becomes 24 bytes when base 64 encoded.
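Both answers can be verified directly in Python: encoding the 32-character hex string yields 44 characters, while encoding the 16 raw bytes yields only 24 (again with an arbitrary input):

import base64
import hashlib

hex_hash = hashlib.md5(b"hello world").hexdigest()  # 32 hex characters
raw_hash = hashlib.md5(b"hello world").digest()     # 16 raw bytes
print(len(base64.b64encode(hex_hash.encode())))     # 44
print(len(base64.b64encode(raw_hash)))              # 24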
An MD5 hash's 128 bits are represented as 22 characters in Base64, plus 2 padding characters '=' in this case.
How?
$ md5sum ./README.md
c6b5f48774aa0a87a82a276ff86be507 ./README.md
$ md5sum ./README.md | base64
YzZiNWY0ODc3NGFhMGE4N2E4MmEyNzZmZjg2YmU1MDcgIC4vUkVBRE1FLm1kCg==
In this case the Base64-encoded string is no shorter than the MD5 hash,
because what was encoded is the textual form of the md5sum output (including the filename and a trailing newline), not the MD5 hash value itself.
Note how many bits are used to store one digit of the MD5 hash: in the hex string, each digit occupies a whole 8-bit character even though it carries only 4 bits of information.
The right way is to convert the hash value itself:
1. convert the hexadecimal to binary
2. convert the binary to a Base64-encoded string
$ cat ./README.md | openssl dgst -md5
c6b5f48774aa0a87a82a276ff86be507
$ cat ./README.md | openssl dgst -md5 -binary | openssl enc -base64
xrX0h3SqCoeoKidv+GvlBw==
or
$ md5sum ./LICENSE
e3fc50a88d0a364313df4b21ef20c29e ./LICENSE
$ cat ./LICENSE | openssl dgst -md5 -binary | openssl enc -base64
4/xQqI0KNkMT30sh7yDCng==
$ (echo 0:; echo e3fc50a88d0a364313df4b21ef20c29e) | xxd -rp -l 16|base64
4/xQqI0KNkMT30sh7yDCng==
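The same hex-to-binary-to-Base64 conversion can be done in Python, using the LICENSE hash from above:

import base64

hex_md5 = "e3fc50a88d0a364313df4b21ef20c29e"
raw = bytes.fromhex(hex_md5)           # step 1: hexadecimal -> 16 raw bytes
print(base64.b64encode(raw).decode())  # step 2: binary -> 4/xQqI0KNkMT30sh7yDCng==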