What methods exist for signed number representation?
How do you know which signed number representation is used for the application?
e.g.
IEEE 754 allows you to represent 1.3444E-15 and 1.3444E+15, i.e. a very small number and a very large number, simply based on one signed representation of the exponent. The IEEE 754 exponent field uses a biased exponent representation (see page 7). Similarly, which other methods exist?
For Verilog, integers use 2's complement and real numbers use IEEE 754. This applies to constants that you use to initialize a reg or assign to a wire, and to the built-in operations. Actual regs/wires are only a bunch of bits, and it is your design that determines what format numbers are stored in.
Forget about the types. Just use a bit vector and interpret the bits as a float:
wire unsigned [31:0] bits;
wire unsigned sign;
wire unsigned [7:0] exp;
wire unsigned [22:0] mantissa;
assign sign = bits[31];
assign exp = bits[30:23];
assign mantissa = bits[22:0];
The Kademlia paper mentions using the XOR of the NodeIDs interpreted as integers. Let's pretend my NodeID1 is aaf4c61ddcc5e8a2dabede0f3b482cd9aea9434d and my NodeID2 is ab4d8d2a5f480a137067da17100271cd176607a1. What's the appropriate way to interpret these as integers for comparing NodeID1 and NodeID2? Would I convert them into BigInts and XOR the two BigInts? I saw that in one implementation. Could I also just convert each NodeID into decimal and XOR those values?
I found this question but I'm trying to better understand exactly how this works.
Note: This isn't for implementation, I'm just trying to understand how the integer interpretation works.
For a basic Kademlia implementation you only need two bitwise operations on the IDs: xor and comparison. In both cases the ID is conceptually a 160-bit unsigned integer with overflow, i.e. modulo 2^160 arithmetic. It can be decomposed into a 20-byte array or a 5×u32 array, assuming correct endianness conversion in the latter case. The most common endianness for network protocols is big-endian, so byte 0 will contain the most significant 8 of the 160 bits.
The xor and the comparison can then be applied on a subunit-by-subunit basis, i.e. the xor is just an xor of all the bytes, and the comparison is a byte-by-byte (lexicographic) array comparison.
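As a rough sketch of that subunit-by-subunit approach in C (the function names here are just placeholders, not from any particular implementation):

#include <stdint.h>
#include <string.h>

#define ID_BYTES 20  /* 160 bits */

/* XOR distance: xor the IDs byte by byte. */
static void id_xor(const uint8_t a[ID_BYTES], const uint8_t b[ID_BYTES],
                   uint8_t out[ID_BYTES]) {
    for (size_t i = 0; i < ID_BYTES; i++)
        out[i] = a[i] ^ b[i];
}

/* Compare two IDs (or distances) as 160-bit big-endian unsigned integers.
 * Byte 0 is the most significant, so a plain byte-wise memcmp already
 * gives the correct numeric ordering. */
static int id_cmp(const uint8_t a[ID_BYTES], const uint8_t b[ID_BYTES]) {
    return memcmp(a, b, ID_BYTES);
}

Because the bytes are stored most-significant first, no bigint conversion is needed for either operation.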
Using bigint library functions is probably sufficient for an implementation but not optimal, because they have size and signedness overhead compared to implementing the necessary bit-twiddling on fixed-size arrays.
A more complete implementation may also need some additional arithmetic and utility functions.
Could I also just convert each NodeID into decimal and XOR those values?
Considering the size of the numbers, a decimal representation is not particularly useful. For the human reader hexadecimal or the individual bits are more useful, and computers operate on binary and practically never on decimal.
According to The Swift Programming Language:
For example, 0xFp2 represents 15 ⨉ 2^2, which evaluates to 60.
Similarly, 0xFp-2 represents 15 ⨉ 2^(-2), which evaluates to 3.75.
Why is 2 used as the base for the exponent instead of 16? I'd have expected 0xFp2 == 15 * (16**2) instead of 0xFp2 == 15 * (2**2).
Swift's hexadecimal notation for floating-point numbers is just a variation of the notation introduced for C in the C99 standard for both input and output (with the printf %a format).
The purpose of that notation is to be both easy to interpret by humans and to let the bits of the IEEE 754 representation be somewhat recognizable. The IEEE 754 representation uses base two. Consequently, for a normal floating-point number, when the number before p is between 1 and 2, the number after p is directly the (unbiased) exponent of the IEEE 754 representation. This is in line with the dual objectives of human readability and closeness to the bit representation:
$ cat t.c
#include <stdio.h>
int main(){
    printf("%a\n", 3.14);
}
$ gcc t.c && ./a.out
0x1.91eb851eb851fp+1
The number 0x1.91eb851eb851fp+1 can be seen to be slightly above 3 because the exponent is 1 and the significand is near 0x1.9, slightly above 0x1.8, which indicates the exact middle between two powers of two.
This format helps remember that numbers that have a compact representation in decimal are not necessarily simple in binary. In the example above, 3.14 uses all the digits of the significand to approximate (and even so, it isn't represented exactly).
Hexadecimal is used for the number before the p, which corresponds to the significand in the IEEE 754 format, because it is more compact than binary. The significand of an IEEE 754 binary64 number requires 13 hexadecimal digits after 0x1. to represent fully, which is a lot, but in binary 52 digits would be required, which is frankly impractical.
The choice of hexadecimal actually has its drawbacks: because of this choice, the several equivalent representations for the same number are not always easy to recognize as equivalent. For instance, 0x1.3p1 and 0x2.6p0 represent the same number, although their digits have nothing in common. In binary, the two notations would correspond to 0b1.0011p1 and 0b10.011p0, which would be easier to see as equivalent. To take another example, 3.14 can also be represented as 0xc.8f5c28f5c28f8p-2, which is extremely difficult to see as the same number as 0x1.91eb851eb851fp+1. This problem would not exist if the number after the p represented a power of 16, as you suggest in your question, but unicity of the representation was not an objective when C99 was standardized: closeness to the IEEE 754 representation was.
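To see that equivalence concretely, here is a small C program (C shares this literal syntax, as noted above); on common implementations both values print identically:

#include <stdio.h>

int main(void) {
    /* 0x1.3p1 and 0x2.6p0 are two spellings of the same number, 2.375 */
    double a = 0x1.3p1;
    double b = 0x2.6p0;
    /* Typically prints "0x1.3p+1 0x1.3p+1 1"; the exact %a form can vary
       between implementations, but the comparison always prints 1. */
    printf("%a %a %d\n", a, b, a == b);
    return 0;
}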
Is it possible to somehow calculate the minimum number of bits needed to represent the integers used in arithmetic coding? Let's say I have a 100-character string. How many bits do I need to encode and decode that sequence?
Floating point values (IEEE 32 and 64-bit) are encoded using fixed-length big-endian encoding (7 bits used to avoid use of reserved bytes like 0xFF):
That paragraph comes from the Smile Format spec (a JSON-like binary format).
What could that mean? Is there some standard way to encode IEEE floating point (single and double precision) so that the encoded bytes are in the 0-127 range?
More in general: I believe that, in the standard binary representation, there is no reserved or prohibited byte value; an IEEE floating-point number can include any of the 256 possible byte values. Granted that, is there any standard binary encoding (or trick) so that some byte value(s) will never appear (as, say, in the UTF-8 encoding of strings, where some byte values, such as 0xFF, are prohibited)?
(I guess that would imply either losing some precision, or using more bytes.)
I don't know the details of such a format, but it looks like a kind of serialization of a data structure. Of course, since the final result is a byte stream, you should be able to recognize a value from some other metadata. They probably use the high bit of each byte as a marker bit, so any byte value that could be misinterpreted has to be avoided. That's the reason to "spread" an IEEE floating-point number across several bytes (five, for a single) where only seven bits of each byte are actually used for the value.
I would have to read the format spec to be sure, so I am only extrapolating what they are doing. However, this kind of encoding is relatively common in low-level (e.g. embedded) programming.
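A minimal sketch of that 7-bits-per-byte idea in C, assuming the bits are packed most-significant first (this illustrates the general technique, not necessarily the exact bit layout the Smile spec uses):

#include <stdint.h>
#include <string.h>

/* Pack the 32 bits of a float into 5 bytes, 7 bits per byte.
 * Every output byte stays in 0x00-0x7F, so reserved byte values
 * like 0xFF can never appear in the encoded form. */
static void pack_float7(float f, uint8_t out[5]) {
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);           /* raw IEEE 754 bits */
    for (int i = 4; i >= 0; i--) {
        out[i] = bits & 0x7F;                 /* take 7 bits at a time */
        bits >>= 7;
    }
}

static float unpack_float7(const uint8_t in[5]) {
    uint32_t bits = 0;
    for (int i = 0; i < 5; i++)
        bits = (bits << 7) | (in[i] & 0x7F);  /* reassemble 4+7+7+7+7 = 32 bits */
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

A 64-bit double would be spread over 10 such bytes the same way, at the cost of roughly 25% size overhead compared to the raw 4 or 8 bytes.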
I would like to know what the difference is between Integer 16, Integer 32 and Integer 64, and the difference between a signed and an unsigned integer (NSInteger and NSUInteger).
I'm not sure exactly what types you mean by "Integer 16", "Integer 32", and "Integer 64", but normally, those numbers refer to the size in bits of the integer type.
The difference between a signed and an unsigned integer is the range of values it can represent. For example, a two's-complement signed 16-bit integer can represent numbers between -32,768 and 32,767. An unsigned 16-bit integer can represent values between 0 and 65,535.
For most computers in use today, a signed integer of width n can represent the values [-2^(n-1), 2^(n-1)) and an unsigned integer of width n can represent the values [0, 2^n).
NSInteger and NSUInteger are Apple's custom integer data types. The first is signed while the latter is unsigned. On 32-bit builds NSInteger is typedef'd as an int, while on 64-bit builds it's typedef'd as a long. NSUInteger is typedef'd as an unsigned int for 32-bit and an unsigned long for 64-bit. Signed types cover the range [-2^(n-1), 2^(n-1) - 1], where n is the bit width, and unsigned types cover the range [0, 2^n - 1].
When coding for a single, self-contained program, using NSInteger or NSUInteger is considered the best practice for future-proofing against platform bit changes. It is not the best practice when dealing with fixed-size data, such as binary file formats or networking, because the required field widths are defined in advance and are constant regardless of the platform's bit width. This is where the fixed-size types defined in stdint.h (i.e., uint8_t, uint16_t, uint32_t, etc.) come into use.
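For instance, the ranges described above can be checked directly in C with those fixed-size types (a small sketch, printing the limits from stdint.h):

#include <stdint.h>
#include <stdio.h>

int main(void) {
    /* 16-bit: signed covers -32768..32767, unsigned covers 0..65535 */
    printf("int16_t : %lld .. %lld\n", (long long)INT16_MIN, (long long)INT16_MAX);
    printf("uint16_t: 0 .. %llu\n", (unsigned long long)UINT16_MAX);

    /* 32- and 64-bit types follow the same [-2^(n-1), 2^(n-1) - 1] and [0, 2^n - 1] pattern */
    printf("int32_t : %lld .. %lld\n", (long long)INT32_MIN, (long long)INT32_MAX);
    printf("int64_t : %lld .. %lld\n", (long long)INT64_MIN, (long long)INT64_MAX);
    return 0;
}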
Unsigned vs signed integer -
Unsigned is usually used where a variable is never allowed to take negative values. For example, when looping through an array, it's often clearer if the array subscript variable is an unsigned int and the loop runs up to the length of the array.
On the other hand, if the variable can hold negative numbers too, then declare it as a signed int. Integer variables are signed by default.
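As a small illustration of that convention (in C; the function name is just for this example), the subscript below is unsigned because it can never be negative:

#include <stddef.h>

/* Sum an array: the index can never be negative, so an unsigned type
 * (size_t here) documents that and matches the type of the length. */
long sum(const int *a, size_t len) {
    long total = 0;
    for (size_t i = 0; i < len; i++)
        total += a[i];
    return total;
}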
Have a look at the Foundation data types. NSInteger and NSUInteger are typedefs for int and unsigned int.
From Wikipedia:
In computing, signed number representations are required to encode negative numbers in binary number systems
which means that you normally have to use one bit to encode the sign, thus reducing the range of magnitudes you can represent.
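To tie this back to the original question about which methods exist, here is an illustrative sketch (in C, with hypothetical helper names, assuming 8-bit values) of how -5 comes out under the common schemes: sign-magnitude, ones' complement, two's complement, and a biased/excess representation like the one IEEE 754 uses for exponents:

#include <stdint.h>
#include <stdio.h>

/* Encode a small signed value v in 8 bits under four representation schemes. */
static uint8_t sign_magnitude(int v)  { return v < 0 ? 0x80 | (uint8_t)(-v) : (uint8_t)v; }
static uint8_t ones_complement(int v) { return v < 0 ? (uint8_t)~(uint8_t)(-v) : (uint8_t)v; }
static uint8_t twos_complement(int v) { return (uint8_t)v; }          /* conversion wraps mod 256 */
static uint8_t biased_127(int v)      { return (uint8_t)(v + 127); }  /* excess-127, as in IEEE 754 single-precision exponents */

int main(void) {
    int v = -5;
    printf("sign-magnitude     : 0x%02X\n", sign_magnitude(v));   /* 0x85 */
    printf("ones' complement   : 0x%02X\n", ones_complement(v));  /* 0xFA */
    printf("two's complement   : 0x%02X\n", twos_complement(v));  /* 0xFB */
    printf("biased (excess-127): 0x%02X\n", biased_127(v));       /* 0x7A */
    return 0;
}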