On AMD64-compliant architectures, addresses need to be in canonical form before being dereferenced.
From the Intel manual, section 3.3.7.1:
In 64-bit mode, an address is considered to be in canonical form if
address bits 63 through to the most-significant implemented bit by the
microarchitecture are set to either all ones or all zeros.
Now, the most significant implemented bit on current architectures is bit 47. This leaves us with a 48-bit address space.
Especially when ASLR is enabled, user programs can expect to receive an address with the 47th bit set.
If optimizations such as pointer tagging are used and the upper bits store extra information, the program must make sure bits 48 through 63 are set back to whatever bit 47 was before dereferencing the address.
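As an aside, a minimal sketch of that sign-extension step in C++, assuming a 48-bit address space and a 16-bit tag in the upper bits (the helper name untag is made up):

#include <cstdint>

// Hypothetical helper: discard a 16-bit tag in the upper bits and
// sign-extend bit 47 into bits 48..63 so the pointer is canonical again.
// Relies on arithmetic right shift of signed values (guaranteed since
// C++20, universal in practice).
template <typename T>
T* untag(std::uintptr_t tagged)
{
    std::intptr_t canonical = static_cast<std::intptr_t>(tagged << 16) >> 16;
    return reinterpret_cast<T*>(canonical);
}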
But consider this code:
int main()
{
    int* intArray = new int[100];
    int* it = intArray;

    // Fill the array with any value.
    for (int i = 0; i < 100; i++)
    {
        *it = 20;
        it++;
    }

    delete[] intArray;
    return 0;
}
Now consider that intArray is, say:
0000 0000 0000 0000 0111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1100
After setting it to intArray and incrementing it once, and considering sizeof(int) == 4, it will become:
0000 0000 0000 0000 1000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
The 47th bit is the one that flipped. What happens here is that the second pointer obtained by pointer arithmetic is invalid because it is not in canonical form. The correct address should be:
1111 1111 1111 1111 1000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
How do programs deal with this? Is there a guarantee by the OS that you will never be allocated memory whose address range does not vary by the 47th bit?
The canonical address rules mean there is a giant hole in the 64-bit virtual address space. 2^47-1 is not contiguous with the next valid address above it, so a single mmap won't include any of the unusable range of 64-bit addresses.
+----------+
| 2^64-1 | 0xffffffffffffffff
| ... |
| 2^64-2^47| 0xffff800000000000
+----------+
| |
| unusable | not to scale: this part is 2^16 times as large
| |
+----------+
| 2^47-1 | 0x00007fffffffffff
| ... |
| 0 | 0x0000000000000000
+----------+
Also, most kernels reserve the high half of the canonical range for their own use; see e.g. x86-64 Linux's memory map. User space can only allocate in the contiguous low range anyway, so the existence of the gap is irrelevant.
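For illustration, a minimal sketch of a canonicality check, assuming the current 48-bit implementation (bit 47 is the most significant implemented bit):

#include <cstdint>

// An address is canonical when bits 48..63 all equal bit 47.
bool is_canonical(std::uint64_t addr)
{
    // Sign-extend from bit 47, then compare with the original value.
    std::int64_t extended = static_cast<std::int64_t>(addr << 16) >> 16;
    return static_cast<std::uint64_t>(extended) == addr;
}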
Is there a guarantee by the OS that you will never be allocated memory whose address range does not vary by the 47th bit?
Not exactly. The 48-bit address space supported by current hardware is an implementation detail. The canonical-address rules ensure that future systems can support more virtual address bits without breaking backwards compatibility to any significant degree.
At most, you'd just need a compat flag to have the OS not give the process any memory regions with high bits not all the same. (Like Linux's current MAP_32BIT flag for mmap, or a process-wide setting). That could support programs that used the high bits for tags and manually redid sign-extension.
Future hardware won't need to support any kind of flag to ignore high address bits or not, because junk in the high bits is currently an error. Intel 5-level paging adds another 9 virtual address bits, widening the canonical high and low halves (see Intel's 5-level paging white paper).
See also Why in 64bit the virtual address are 4 bits short (48bit long) compared with the physical address (52 bit long)?
Fun fact: Linux defaults to mapping the stack at the top of the lower range of valid addresses. (Related: Why does Linux favor 0x7f mappings?)
$ gdb /bin/ls
...
(gdb) b _start
Function "_start" not defined.
Make breakpoint pending on future shared library load? (y or [n]) y
Breakpoint 1 (_start) pending.
(gdb) r
Starting program: /bin/ls
Breakpoint 1, 0x00007ffff7dd9cd0 in _start () from /lib64/ld-linux-x86-64.so.2
(gdb) p $rsp
$1 = (void *) 0x7fffffffd850
(gdb) exit
$ calc
2^47-1
0x7fffffffffff
(Modern GDB can use starti to break before the first user-space instruction executes instead of messing around with breakpoint commands.)
Related
I am currently trying to hash a set of strings using MurmurHash3, since a 32-bit hash seems to be too large for me to handle. I want to reduce the number of bits used to generate hashes to around 24. I already found some questions explaining how to reduce it to 16, 8, 4, or 2 bits using XOR folding, but those are too few bits for my application.
Can somebody help me?
When you have a 32-bit hash, it's something like (with spaces for readability):
1101 0101 0101 0010 1010 0101 1110 1000
To get a 24-bit hash, you want to keep the 24 lower-order bits. The notation for that will vary by language, but many languages use "x & 0xFFFFFF" for a bitwise AND operation with hex 0xFFFFFF. That effectively does (with the AND logic applied to each vertical column of digits, so 1 AND 1 is 1, and 0 AND 1 is 0):
1101 0101 0101 0010 1010 0101 1110 1000 AND <-- hash value from above
0000 0000 1111 1111 1111 1111 1111 1111 <-- 0xFFFFFF in binary
==========================================
0000 0000 0101 0010 1010 0101 1110 1000
You do waste a little of the randomness from your hash value, though. That doesn't matter much with a pretty decent hash like murmur32, but you can expect slightly fewer collisions if you instead further randomise the low-order bits using the high-order bits you'd otherwise chop off. To do that, right-shift the high-order bits and XOR them with lower-order bits (it doesn't really matter which). Again, a common notation for that is:
((x & 0xFF000000) >> 8) ^ x
...which can be read as: do a bitwise AND to retain only the most significant byte of x, then shift that right by 8 bits, then bitwise exclusive-OR that with the original value of x. The result of the above expression then has bit 23 (counting from 0 as the least significant bit) set if and only if one or the other (but not both) of bits 23 and 31 were set in the value of x. Similarly, bit 22 is the XOR of bits 22 and 30, and so on down to bit 16, which is the XOR of bits 16 and 24. Bits 0..15 remain the same as in the original value of x. Mask the result with 0xFFFFFF, as above, to get the final 24-bit hash.
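Putting both steps together, a sketch in C++ (the function name is illustrative):

#include <cstdint>

// Reduce a 32-bit hash to 24 bits: fold the top byte into bits 16..23,
// then keep only the low 24 bits.
std::uint32_t fold_to_24(std::uint32_t h)
{
    std::uint32_t mixed = ((h & 0xFF000000u) >> 8) ^ h;
    return mixed & 0x00FFFFFFu;
}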
Yet another approach is to pick a prime number ever so slightly lower than 2^24 and mod (%) your 32-bit murmur hash value by that. This mixes in the high-order bits even more effectively than the XOR above, but you'll obviously only get values up to the prime minus 1, and not all the way up to 2^24-1.
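A sketch of that as well; 16777213 (2^24 - 3) is assumed here to be the largest prime below 2^24, which you should verify before relying on it:

#include <cstdint>

// Reduce a 32-bit hash into [0, p-1] with p a prime just below 2^24.
std::uint32_t mod_to_24(std::uint32_t h)
{
    const std::uint32_t p = 16777213u; // 2^24 - 3, assumed prime
    return h % p;
}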
Can someone help me understand this question:
A processor providing 64GB of addressable main memory such as the AMD FX8350
Which of the following is the correct maximum range of main memory locations for such a processor?
A. 0x000 to 0x3FF
B. 0x0 0000 to 0x3F FFFF
C. 0x000 0000 to 0x3FF FFFF
D. 0x0 0000 0000 to 0x3F FFFF FFFF
E. 0x0 000 000 000 to 0x3F FFFF FFFF FFFF
I am afraid there is no simple answer to this question. The microprocessor will have different addressing modes and will map real memory into the virtual address space in pages, usually 4 KiB in size, so the virtual address space may not even be contiguous.
First of all, there is no such thing as a "processor providing xx memory". A processor specifies how many address bits it is able to operate on; the most common cases are 32 and 64 bits. Processors with 32-bit addressing are able to access 2^32 locations = 4 GB. Processors with 64-bit addressing can, in theory, address 2^64 locations. However, most of them only support 48 bits of addressing, providing 256 TB of addressable space.
Now, in order to make use of these capabilities, you need the support of the operating system as well, i.e. if you have a 64-bit processor and a 32-bit OS, you can only access 32-bit addresses.
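To ground the arithmetic, a small sketch that counts the address bits a byte-addressable 64 GB memory needs:

#include <cstdint>
#include <cstdio>

int main()
{
    std::uint64_t size = 64ull * 1024 * 1024 * 1024; // 64 GB
    int bits = 0;
    while ((1ull << bits) < size)
        ++bits;
    // 64 GB = 2^36 bytes, so 36 address bits; top address is 2^36 - 1.
    std::printf("%d address bits, top address 0x%llX\n",
                bits, static_cast<unsigned long long>(size - 1));
    return 0;
}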
Let's say I've created a type in Ada:
type Coord_Type is range -32 .. 31;
What can I expect the bits to look like in memory, or specifically when transmitting this value to another system?
I can think of two options.
One is that the full (default integer?) space is used for all variables of "Coord_Type", but only the values within the range are possible. If I assume two's complement, then the values 25 and -25 would be possible, but not 50 or -50:
0000 0000 0001 1001 ( 25)
1111 1111 1110 0111 (-25)
0000 0010 0011 0010 ( 50) X Not allowed
1111 1111 1100 1110 (-50) X Not allowed
The other option is that the space is compressed to only what is needed. (I chose a byte, but maybe even only 6 bits?) So with the above values, the bits might be arranged like this:
0000 0000 0001 1001 ( 25)
0000 0000 1110 0111 (-25)
0000 0000 0011 0010 ( 50) X Not allowed
0000 0000 1100 1110 (-50) X Not allowed
Essentially, does Ada further influence the storage of values beyond limiting what range is allowed in a variable? Are such details (endianness, two's complement) even controlled by Ada?
When you declare the type like that, you leave it up to the compiler to choose the optimal layout for each architecture. You might even get binary-coded decimal (BCD) instead of two's complement on some architectures.
Please help me out, I'm studying operating systems. Under virtual memory I found this:
A user process generates a virtual address 11123456, and it is said the virtual address in binary form is 0001 0001 0001 0010 0011 0100 0101 0110. How was that converted? When I convert 11123456 to binary I get 1010 1001 1011 1011 0000 0000. It is said the virtual memory is implemented by paging, and the page size is 4096 bytes.
You assume that 11123456 is a decimal number, while according to the result it's hexadecimal. In general, decimal numbers are rarely used in CS; representation in powers of 2 is much more common and convenient. The most used today are base 16 (hexadecimal) and base 2 (binary).
Converting into binary may help you identify the page number and offset so that you can calculate the physical address corresponding to the logical address. It is worth understanding how to do this if you are a CS student.
For this particular problem, i.e. paging, you can convert from a logical to a physical address without converting into binary, using the modulo (%) and divide (/) operators. However, doing it in binary is the more fundamental way.
In your question, the value 11123456 should be a hexadecimal number, and it should be written as 0x11123456 to distinguish it from decimal numbers. From the binary format "0001 0001 0001 0010 0011 0100 0101 0110", we can infer that the offset of the logical address is "0100 0101 0110" (the 12 rightmost bits, or 1110 in decimal, or 0x456 in hexadecimal) and the page number is "0001 0001 0001 0010 0011" (the remaining bits, or 69923 in decimal, or 0x11123 in hexadecimal).
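The same split in code, as a sketch assuming 4096-byte pages:

#include <cstdint>
#include <cstdio>

int main()
{
    std::uint32_t vaddr  = 0x11123456;     // the address, read as hex
    std::uint32_t offset = vaddr & 0xFFFu; // low 12 bits (vaddr % 4096)
    std::uint32_t page   = vaddr >> 12;    // remaining bits (vaddr / 4096)
    std::printf("page = 0x%X, offset = 0x%X\n", page, offset);
    // prints: page = 0x11123, offset = 0x456
    return 0;
}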
I am trying to change the state (output/input) of more than one pin at the same time (with a bitmask).
The code for one pin is:
#define INP_GPIO(g) *(gpio+((g)/10)) &= ~(7<<(((g)%10)*3))
#define OUT_GPIO(g) *(gpio+((g)/10)) |= (1<<(((g)%10)*3))
I don't really understand what this code does.
Let's say gpio := 0x20200000, so for pin 1 it should be:
10 0000 0010 0000 0000 0000 0000 0000 + 0 = (10 0000 0010 0000 0000 0000 0000 0000 + 0) & ~11 1000 = 0
I think this can't be correct. What am I doing wrong?
From the Broadcom ARM peripherals manual, which you should already be referencing before asking others to read it for you: 0x20200000 is the function select register for GPIO pins 0 to 9. For each group of 10 pins there is one register, with 3 bits per GPIO pin to select one of 8 functions (the top 2 bits of the register are unused). So the division by 10 picks which function select register to use, and the modulo 10 times 3 gives the position of that pin's three function-select bits within the register.
The bit pattern 0b000 configures the pin as an input and the bit pattern 0b001 as an output. So the code you referenced either zeros the three bits or ORs the three bits with a 1, and the latter is of course buggy, since the other two bits are not guaranteed to be zero. To use that code properly you should either fix it, or set the GPIO as an input and then as an output, so that it first clears the three bits and then sets the one bit.
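A sketch of that fix, plus a helper that drives several pins of the same bank of 10 from one bitmask; the names are made up, and gpio is assumed to point at the memory-mapped function select registers:

// Clear all three function-select bits (input), then set the pin as output.
#define GPIO_FSEL_INP(g) (*(gpio + ((g) / 10)) &= ~(7u << (((g) % 10) * 3)))
#define GPIO_FSEL_OUT(g) do { GPIO_FSEL_INP(g); \
        *(gpio + ((g) / 10)) |= (1u << (((g) % 10) * 3)); } while (0)

// Set every pin selected by pin_mask (bits 0..9) in one function select
// register ("bank") to output, with a single read-modify-write.
static void gpio_outputs(volatile unsigned* gpio, unsigned bank,
                         unsigned pin_mask)
{
    unsigned clear = 0, set = 0;
    for (unsigned p = 0; p < 10; ++p) {
        if (pin_mask & (1u << p)) {
            clear |= 7u << (p * 3); // all three function-select bits
            set   |= 1u << (p * 3); // 0b001 = output
        }
    }
    *(gpio + bank) = (*(gpio + bank) & ~clear) | set;
}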