Total numbers represented in IEEE single precision and double precision

How many total numbers can be represented in IEEE single precision format and double precision format?
single precision - 1 sign bit, 8 exponent bits, 23 mantissa bits
double precision - 1 sign bit, 11 exponent bits, 52 mantissa bits

Single precision:
23 bit mantissa: 2^23 possibilities
1 bit sign: 2 possibilities
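8 bit exponent: 2^8 - 1 possible exponents (the all-ones exponent is reserved for special cases: infinities and NaN)
So, total numbers that can be represented = 2^23 * 2 * (2^8 - 1) = 4,278,190,080. This counts +0 and -0 as two separate values.
By the same reasoning, double precision gives 2^52 * 2 * (2^11 - 1).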
I hope I'm not missing anything.
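The counting argument above can be checked with a few lines of Python. This is only a sketch: the function name finite_patterns is made up for illustration, and it counts bit patterns, so +0 and -0 are counted separately.

```python
def finite_patterns(mantissa_bits, exponent_bits):
    """Count bit patterns that encode a finite value (no infinities, no NaN)."""
    signs = 2
    mantissas = 2 ** mantissa_bits
    # The all-ones exponent is reserved for infinities and NaN.
    exponents = 2 ** exponent_bits - 1
    return signs * mantissas * exponents


single = finite_patterns(23, 8)
double = finite_patterns(52, 11)
print(single)  # 4278190080
print(double)  # 18437736874454810624
```

Subtract 1 from either result if +0 and -0 should count as a single number.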

Why does Int8.max &+ Int8.max equal "-2"?

According to the Swift Standard Library documentation, &+ discards any bits that overflow the fixed width of the integer type. I just do not understand why adding the two maximum values that an 8-bit signed integer can hold results in -2:
/// Two max Int8 values (127 each, 8-bit group)
let x6 = Int8.max
let x7 = Int8.max
/// Evaluates to "1111111" (the sign bit is not shown)
String(Int8.max, radix: 2)
/// Here we get `-2` in decimal system
let x8 = x6 &+ x7
/// Evaluates to "-10" (a minus sign followed by the magnitude, 2, in binary)
String(x8, radix: 2)
If we break down the binary calculation we will get this:
1 1 1 1 1 1 1
+ 1 1 1 1 1 1 1
-----------------------------
1 1 1 1 1 1 1 0
Which is -126, as the leftmost bit is a negative sign.
Why does Swift discard all bits except the rightmost two (1 and 0)? Did I miss some overflow rules? I've read a few things on the web, but did not get close to cracking this one.
Swift (and every other programming language I know) uses 2's complement to represent signed integers, rather than sign-and-magnitude as you seem to assume.
In the 2's complement representation, the leftmost 1 does not represent "a negative sign". You can think of it as representing -128, so the Int8 value of -2 would be represented as 1111 1110 (-128 + 64 + 32 + 16 + 8 + 4 + 2).
OTOH, -126 would be represented as 1000 0010 (-128 + 2).
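To make the wrap-around concrete, here is a small Python sketch of what an overflow addition on an 8-bit two's complement value amounts to. Python integers are unbounded, so the helper wrap_int8 (a name invented for this example) emulates the 8-bit width by hand; it is not how Swift implements &+ internally.

```python
def wrap_int8(value):
    """Keep the low 8 bits and reinterpret them as a two's complement Int8."""
    low8 = value & 0xFF
    # Bit 7 carries a weight of -128 instead of +128.
    return low8 - 256 if low8 >= 128 else low8


total = 127 + 127                    # 254, which does not fit in an Int8
print(format(total & 0xFF, "08b"))   # 11111110
print(wrap_int8(total))              # -2
print(format(-126 & 0xFF, "08b"))    # 10000010
```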

How to modify the last 3 bits of signed numbers

When I apply the function dwt2() on an image, I get the four subband coefficients. By choosing any of the four subbands, I work with a 2D matrix of signed numbers.
In each value of this matrix I want to embed 3 bits of information, i.e., the numbers 0 to 7 in decimal, in the last 3 least significant bits. However, I don't know how to do that when I deal with negative numbers. How can I modify the coefficients?
First of all, you want to use an Integer Wavelet Transform, so you only have to deal with integers. This will allow you a lossless transformation between the two spaces without having to round float numbers.
Embedding bits in integers is a straightforward problem for binary operations. Generally, you want to use the pattern
(number AND mask) OR bits
The bitwise AND operation clears out the desired bits of number, which are specified by mask. For example, if number is an 8-bit number and we want to zero out the last 3 bits, we'll use the mask 11111000. After the desired bits of our number have been cleared, we can replace them with the bits we want to embed using the bitwise OR operation.
Next, you need to know how signed numbers are represented in binary. Make sure you read the two's complement section. We can see that if we want to clear out the last 3 bits, we want to use the mask ...11111000, which is always -8. This is regardless of whether we're using 8, 16, 32 or 64 bits to represent our signed numbers. Generally, if you want to clear the last k bits of a signed number, your mask must be -2^k.
Let's put everything together with a simple example. First, we generate some numbers for our coefficient subband and embedding bitstream. Since the coefficient values can take any value in [-510, 510], we'll use 'int16' for the operations. The bitstream is an array of numbers in the range [0, 7], since that's the range of [000, 111] in decimal.
>> rng(4)
>> coeffs = randi(1021, [4 4]) - 511
coeffs =
477 202 -252 371
48 -290 -67 494
483 486 285 -343
219 -504 -309 99
>> bitstream = randi(8, [1 10]) - 1
bitstream =
0 3 0 7 3 7 6 6 1 0
We embed our bitstream by overwriting the necessary coefficients.
>> coeffs(1:numel(bitstream)) = bitor(bitand(coeffs(1:numel(bitstream)), -8, 'int16'), bitstream, 'int16')
coeffs =
472 203 -255 371
51 -289 -72 494
480 486 285 -343
223 -498 -309 99
We can then extract our bitstream by using the simple mask ...00000111 = 7.
>> bitand(coeffs(1:numel(bitstream)), 7, 'int16')
ans =
0 3 0 7 3 7 6 6 1 0
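For comparison, the same (number AND mask) OR bits pattern can be sketched in Python, whose integers behave like two's complement numbers of unlimited width. The function names embed and extract are invented for this example; the coefficients are the first ten values of the matrix above in column-major order, which is the order MATLAB's linear indexing uses.

```python
def embed(coefficient, bits, k=3):
    """Overwrite the k least significant bits of coefficient with bits."""
    mask = -(1 << k)  # -2^k, i.e. ...11111000 for k = 3
    return (coefficient & mask) | bits


def extract(coefficient, k=3):
    """Read back the k least significant bits."""
    return coefficient & ((1 << k) - 1)  # ...00000111 for k = 3


coeffs = [477, 48, 483, 219, 202, -290, 486, -504, -252, -67]
bitstream = [0, 3, 0, 7, 3, 7, 6, 6, 1, 0]

stego = [embed(c, b) for c, b in zip(coeffs, bitstream)]
print(stego)  # [472, 51, 480, 223, 203, -289, 486, -498, -255, -72]
print([extract(c) for c in stego])  # [0, 3, 0, 7, 3, 7, 6, 6, 1, 0]
```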

What is the maximum length of scrypt output?

I'd like to store an scrypt-hashed password in a database. What is the maximum length I can expect?
According to https://github.com/wg/scrypt the output format is $s0$params$salt$key where:
s0 denotes version 0 of the format, with 128-bit salt and 256-bit derived key.
params is a 32-bit hex integer containing log2(N) (16 bits), r (8 bits), and p (8 bits).
salt is the base64-encoded salt.
key is the base64-encoded derived key.
According to https://stackoverflow.com/a/13378842/14731 the length of a base64-encoded string is 4 * ceil(n / 3), where n denotes the number of bytes being encoded.
Let's break this down:
The dollar signs make up 4 characters.
The version identifier (s0) makes up 2 characters.
Each hex character represents 4 bits ( log2(16) = 4 ), so the params field makes up (32-bit / 4 bits) = 8 characters.
The 128-bit salt is equivalent to 16 bytes. The base64-encoded format makes up (4 * ceil(16 / 3)) = 24 characters.
The 256-bit derived key is equivalent to 32 bytes. The base64-encoded format makes up (4 * ceil(32 / 3)) = 44 characters.
Putting that all together, we get: 4 + 2 + 8 + 24 + 44 = 82 characters.
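The arithmetic can be double-checked against Python's standard base64 module. This is a sketch that only verifies the lengths; the all-zero byte strings are placeholders rather than real salts or keys, and base64_length is a name invented here.

```python
import base64
import math


def base64_length(n_bytes):
    """Length of padded base64 output for n_bytes of input."""
    return 4 * math.ceil(n_bytes / 3)


salt_chars = base64_length(16)  # 128-bit salt
key_chars = base64_length(32)   # 256-bit derived key

# Cross-check the formula against the real encoder using dummy bytes.
assert salt_chars == len(base64.b64encode(bytes(16)))
assert key_chars == len(base64.b64encode(bytes(32)))

dollar_signs, version, params = 4, 2, 8
total = dollar_signs + version + params + salt_chars + key_chars
print(total)  # 82
```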
In Colin Percival's own implementation, the tarsnap scrypt header is 96 bytes. This comprises:
6 bytes 'scrypt'
10 bytes N, r, p parameters
32 bytes salt
16 bytes SHA256 checksum of bytes 0-47
32 bytes HMAC hash of bytes 0-63 (using scrypt hash as key)
This is also the format used by node-scrypt. There is an explanation of the rationale behind the checksum and the HMAC hash on stackexchange.
As a base64-encoded string, this makes 128 characters.
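A quick way to confirm both figures is to add up the field sizes listed above and encode a buffer of that length. The dictionary keys are just labels for this sketch, and the zero-filled buffer stands in for a real header.

```python
import base64

# Field sizes in bytes, as listed above.
fields = {
    "magic 'scrypt'": 6,
    "N, r, p parameters": 10,
    "salt": 32,
    "SHA256 checksum": 16,
    "HMAC hash": 32,
}

header = bytes(sum(fields.values()))  # dummy header filled with zeros
encoded = base64.b64encode(header)
print(len(header), len(encoded))  # 96 128
```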

converting degrees minutes seconds to decimal degrees adds too much precision

I'm converting coordinates from degrees minutes seconds to decimal degrees. However, in the process of converting, the resulting coordinates are much more precise than they should be.
How can I correctly incorporate the lack of precision?
For example, some coordinates lacking seconds:
143 DEG 10 MIN W, 28 DEG 25 MIN N
To convert, I would do the following:
143.1667 <- 143 + 10/60
28.41667 <- 28 + 25/60
But really, the longitude could be anywhere from:
143.1667 <- 143 + 10/60 + 0/3600
to
143.1831 <- 143 + 10/60 + 59/3600
It seems like I should be rounding these coordinates so that they do not convey artificial precision...
You can round it just like this:
double roundedValue = Math.floor(unroundedValue * 10) / 10;
// 123.456 --> 123.4
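If you use "100" instead of "10", you will get a precision of two decimal digits. Note that Math.floor always rounds down (toward negative infinity), so use Math.round instead if you want rounding to the nearest value.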
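One way to tie the number of decimals to the precision of the input is sketched below in Python. The function name dms_to_decimal and the choice of 2 decimals for minute-level input and 4 decimals for second-level input are assumptions made for this example (1 arcminute is about 0.017 degrees, 1 arcsecond about 0.0003 degrees); hemisphere signs are left out.

```python
def dms_to_decimal(degrees, minutes=None, seconds=None):
    """Convert to decimal degrees, rounded to match the finest unit given."""
    value = degrees
    decimals = 0
    if minutes is not None:
        value += minutes / 60
        decimals = 2  # assumed: 1 arcminute is roughly 0.017 degrees
    if seconds is not None:
        value += seconds / 3600
        decimals = 4  # assumed: 1 arcsecond is roughly 0.0003 degrees
    return round(value, decimals)


print(dms_to_decimal(143, 10))      # 143.17
print(dms_to_decimal(28, 25))       # 28.42
print(dms_to_decimal(143, 10, 59))  # 143.1831
```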

Padding in MD5 Hash Algorithm

I need to understand the MD5 hash algorithm. I was reading a document and it states
"The message is "padded" (extended) so that its length (in bits) is
congruent to 448, modulo 512. That is, the message is extended so
that it is just 64 bits shy of being a multiple of 512 bits long.
Padding is always performed, even if the length of the message is
already congruent to 448, modulo 512."
I need to understand what this means in simple terms, especially the 448 modulo 512. The word "modulo" is the issue. I would appreciate some simple examples. Funny though, this is the first step of the MD5 hash! :)
Thanks
Modulo, or mod, is an operation that gives you the remainder when one number is divided by another.
For example:
5 modulo 3:
5/3 = 1, with 2 remainder. So 5 mod 3 is 2.
10 modulo 16 = 10, because 16 does not go into 10 even once, so all of the 10 is left over.
15 modulo 5 = 0, because 5 goes into 15 exactly 3 times. 15 is a multiple of 5.
Back in school you would have learnt this as "Remainder" or "Left Over", modulo is just a fancy way to say that.
What this is saying here is that when you use MD5, one of the first things that happens is that you pad your message so it has the right length. In MD5's case, the padded message must be n bits long, where n = (512*z) + 448 and z is any non-negative integer.
As an example, a file that is 1472 bits long already has a length of this form, because 1472 modulo 512 = 448 (although, as the quoted text says, MD5 still pads it, in this case by a full 512 bits). If the file was 1400 bits long, then you would need to pad in an extra 72 bits before you could run the rest of the MD5 algorithm.
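In Python the modulo operation is written %, so the examples above can be tried directly. This is just a sketch for experimenting; the variable names are arbitrary.

```python
examples = {
    "5 mod 3": 5 % 3,
    "10 mod 16": 10 % 16,
    "15 mod 5": 15 % 5,
    "1472 mod 512": 1472 % 512,
}
for label, result in examples.items():
    print(label, "=", result)

# A 1400-bit message needs this many padding bits to reach 448 mod 512.
padding_for_1400 = 448 - 1400 % 512
print(padding_for_1400)  # 72
```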
Modulus is the remainder of division. For example:
512 mod 448 = 64
448 mod 512 = 448
Another way to compute 512 mod 448 is to divide them: 512/448 = 1.142...
Then take the integer part of the result (1), multiply it by 448, and subtract that from 512:
512 - 448*1 == 64. That's your modulus result.
What you need to know is that 448 is 64 bits short of a multiple of 512.
But what if the length is between 448 and 512?
Normally the number of padding bits is 448 - x, where x is the message length mod 512.
447 mod 512 = 447; 448 - 447 = 1; (all good, 1 bit to pad)
449 mod 512 = 449; 448 - 449 = -1 ???
The solution is to take the next higher multiple of 512, again minus 64:
512*2 - 64 = 960
449 mod 512 = 449; 960 - 449 = 511;
This works because afterwards we append the 64-bit length of the original message, and the full length has to be a multiple of 512.
960 - 449 = 511;
511 + 449 + 64 = 1024;
1024 is multiple of 512;
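The case analysis above can be folded into one small Python function. This is a sketch of the length calculation only, not of MD5 itself; md5_padding_bits is a name invented for this example.

```python
def md5_padding_bits(message_bits):
    """Number of padding bits that bring the length to 448 mod 512."""
    pad = (448 - message_bits % 512) % 512
    # Padding is always performed, so a length already at 448 mod 512
    # gets a full 512 bits instead of 0.
    return pad if pad != 0 else 512


for n in (447, 448, 449, 1400):
    pad = md5_padding_bits(n)
    # After the 64-bit length field the total is a multiple of 512.
    print(n, pad, (n + pad + 64) % 512)
```

For 449 bits this gives 511 padding bits, and 449 + 511 + 64 = 1024, matching the calculation above.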