VirtualQuery gives illegal result. Is it a bug? - virtualquery

My code:
MEMORY_BASIC_INFORMATION meminf;
::VirtualQuery(box.pBits, &meminf, sizeof(meminf));
The results:
meminf:
BaseAddress 0x40001000 void *
AllocationBase 0x00000000 void *
AllocationProtect 0x00000000 unsigned long
RegionSize 0x0de0f000 unsigned long
State 0x00010000 unsigned long
Protect 0x00000001 unsigned long
Type 0x00000000 unsigned long
Notes:
(1) AllocationBase is NULL while BaseAddress is not NULL
(2) AllocationProtect is 0 (not a protection value)
Is it a bug of VirtualQuery?

This is not a bug. The documentation of VirtualQuery() states:
The return value is the actual number of bytes returned in the information buffer.
If the function fails, the return value is zero. To get extended error information, call GetLastError. Possible error values include ERROR_INVALID_PARAMETER.
Check the function result to be equal to sizeof(meminf) before using the data in the structure, or initialize the structure with values that will make the code that follows do the right thing. If the function returned 0 no data was copied to the structure, so it will still contain whatever data was previously in it. Without initialization this will be random bytes on the stack.

Passing a kernel-mode pointer to this function can result in no information being returned.
Check the return value.

Related

Systemverilog expansion rule

When I review some codes, I found something strange.
It seems that it comes from expansion and operation priority.
(I know that because "sig" is declared with 'signed', $signed is not necessary and '-sig' is correct one, anyway..)
reg signed [9:0] sig;
reg signed [11:0] out;
initial
begin
$monitor ("%0t] sig=%0d, out=%0h", $time, sig, out);
sig = 64;
out = $signed(-sig);
#1
out = -$signed(sig);
#1
sig = -512;
out = $signed(-sig);
#1
out = -$signed(sig);
#1
$finish;
end
Simulation result for above codes is,
0] sig=64, out=-64
2] sig=-512, out=-512
3] sig=-512, out=512
When sig=-512, I expected that 10 bits sig would be expanded to 12bits before negation, but it was expanded after negation.
Therefore negation of -512 was still -512, and after expansion, it had a -512.
I guess "$signed() blocks expansion..Any idea what happens??
First of all, -512 and 512 are identical numbers in 10-bit represenntation. 10 bits can actually only hold signed values from -512 to 511. In this scheme negation of -512 should work weirdly, not mentioned in lrm, at least i was not able to locate anything related. This is probably an undefined behavior.
However, it is logical to assume that in this scheme in order to represent a negated value of '-512' just removing signess is sufficient. It seems that all commercial compilers in eda playground do this. So, a result of the unaray - operator in this case will be unsigned value of 512.
So, in out = $signed(-(-512)) the negation operator returns an unsigned value of 512 and it gets converted to a signed by the system task. Therefore, it gets sign extended in out.
out = -$signed(-512) for the same reason the outermost negation operator returns an unsigned value of 512. No sign extension happens here.
You can again make it signed by enclosing in yet another $signed as out = $signed(-$signed(-512))

MATLAB's mxGetFieldByNumber and mxGetFieldNameByNumber return incongruent results

I have a C mex routine that is iterating over subfields of a structure. Sometimes calling mxGetFieldByNumber() returns NULL when mxGetFieldNameByNumber() returns a string for the same field idx. Here is a toy:
numFields = getNumberOfFields( currentField );
for( fieldIdx = 0; fieldIdx < numFields; fieldIdx ++){
subField = mxGetFieldByNumber( currentField, 0 , fieldIdx );
fieldName = mxGetFieldNameByNumber(currentField, fieldIdx );
}
I have read through the documentation of both functions. A NULL can be returned if (in this example) currentField were not a mxArray which I know is not the case because mxGetFieldNameByNumber() returns something sensible. Insufficient heap space could be the problem but I've checked that and it is on 400kb. NULL can also be returned when no value is assigned to the specified field but I've looked and it appears the value is zero.
Any thoughts?
When a struct is created at the MATLAB level or in a mex routine via mxCreateStruct, not all field elements are necessarily populated. In such case, MATLAB physically stores a NULL pointer (i.e., 0) in those data spots (a struct is essentially an array of mxArray pointers). E.g., take the following code snippet assuming X doesn't exist yet:
X.a = 5;
X(2).b = 7;
The X struct variable actually has four elements, namely X(1).a, X(1).b, X(2).a, and X(2).b. But you only set two of these elements. What does MATLAB do with the other elements? Answer: It simply stores NULL pointers for those spots. If you subsequently access those NULL spots in your MATLAB code, MATLAB will simply create an empty double matrix on the fly.
At the mex level, a similar thing happens. When you first create the struct with mxCreateStruct, MATLAB simply fills all of the element spots with NULL values. Then you can populate them in your code if you want, but note that leaving them as NULL is perfectly acceptable for returning back to MATLAB. The routine mxGetFieldByNumber actually gets the element mxArray pointer, and mxGetFieldNameByNumber gets the name of the field itself ... two very different things. If you get a NULL result from a valid mxGetFieldByNumber call (i.e. your index is not out of range), that simply means this element was never set to anything. You should never get a NULL result from a valid mxGetFieldNameByNumber call, since all field names are required to exist.
If you were to pass in the X created above to a mex routine and then examine prhs[0] you would find the following:
mxGetFieldByNumber(prhs[0],0,0)
returns a pointer to an mxArray that is the scalar double 5
mxGetFieldByNumber(prhs[0],0,1)
returns a NULL pointer
mxGetFieldByNumber(prhs[0],1,0)
returns a NULL pointer
mxGetFieldByNumber(prhs[0],1,1)
returns a pointer to an mxArray that is the scalar double 7
mxGetFieldNameByNumber(prhs[0],0)
returns a pointer to the string "a"
mxGetFieldNameByNumber(prhs[0],1)
returns a pointer to the string "b"

D language unsigned hash of string

I am a complete beginner with the D language.
How to get, as an uint unsigned 32 bits integer in the D language, some hash of a string...
I need a quick and dirty hash code (I don't care much about the "randomness" or the "lack of collision", I care slightly more about performance).
import std.digest.crc;
uint string_hash(string s) {
return crc320f(s);
}
is not good...
(using gdc-5 on Linux/x86-64 with phobos-2)
While Adams answer does exactly what you're looking for, you can also use a union to do the casting.
This is a pretty useful trick so may as well put it here:
/**
* Returns a crc32Of hash of a string
* Uses a union to store the ubyte[]
* And then simply reads that memory as a uint
*/
uint string_hash(string s){
import std.digest.crc;
union hashUnion{
ubyte[4] hashArray;
uint hashNumber;
}
hashUnion x;
x.hashArray = crc32Of(s); // stores the result of crc32Of into the array.
return x.hashNumber; // reads the exact same memory as the hashArray
// but reads it as a uint.
}
A really quick thing could just be this:
uint string_hash(string s) {
import std.digest.crc;
auto r = crc32Of(s);
return *(cast(uint*) r.ptr);
}
Since crc32Of returns a ubyte[4] instead of the uint you want, a conversion is necessary, but since ubyte[4] and uint are the same thing to the machine, we can just do a reinterpret cast with the pointer trick seen there to convert types for free at runtime.

Different offset in libc's backtrace_symbols() and libunwind's unw_get_proc_name()

I make a stack trace at some point in my program. Once with libc's backtrace_symbols() function and once with unw_get_proc_name() from libunwind.
backtrace_symbols() output:
/home/jj/test/mylib.so(+0x97004)[0x7f6b47ce9004]
unw_get_proc_name() output:
ip: 0x7f6b47ce9004, offset: 0x458e4
Here you see that the instruction pointer address (0x7f6b47ce9004) is the same and correct. The function offset 0x97004 from backtrace_symbols() is also correct but not the one I get from unw_get_proc_name() (0x458e4).
Does somebody have a clue what's going on here and what might cause this difference in offsets?
Both methods use a similar code like the following examples:
backtrace():
void *array[10];
size_t size;
size = backtrace(array, 10);
backtrace_symbols_fd(array, size, STDERR_FILENO);
libunwind:
unw_cursor_t cursor;
unw_context_t context;
unw_getcontext(&context);
unw_init_local(&cursor, &context);
while (unw_step(&cursor) > 0) {
unw_word_t offset, pc;
char fname[64];
unw_get_reg(&cursor, UNW_REG_IP, &pc);
fname[0] = '\0';
(void) unw_get_proc_name(&cursor, fname, sizeof(fname), &offset);
printf ("%p : (%s+0x%x) [%p]\n", pc, fname, offset, pc);
}
I think unw_get_proc_name compute offset from an unnamed internal frame.
For example:
void f() {
int i;
while (...) {
int j;
}
}
Notice there is a variable declaration inside loop block. In this case (and depending of level of optimization), compiler may create a frame (and related unwind information) for the loop. Consequently, unw_get_proc_name compute offset from this loop instead of begin of function.
This is explained in unw_get_proc_name man page:
Note that on some platforms there is no reliable way to distinguish
between procedure names and ordinary labels. Furthermore, if symbol
information has been stripped from a program, procedure names may be
completely unavailable or may be limited to those exported via a
dynamic symbol table. In such cases, unw_get_proc_name() may return
the name of a label or a preceeding (nearby) procedure.
You may try to test again but without stripping your binary (Since unw_get_proc_name is not able to find name of function, I think your binary is stripped).

Using memcpy/memset

When using memset or memcpy within an Obj-C program, will the compiler optimise the setting (memset) or copying (memcpy) of data into 32-bit writes or will it do it byte by byte?
You can see the libc implementations of these methods in the Darwin source. In 10.6.3, memset works at the word level. I didn't check memcpy, but probably it's the same.
You are correct that it's possible for the compiler to do the work inline instead of calling these functions. I suppose I'll let someone who knows better answer what it will do, though I would not expect a problem.
Memset will come as part of your standard C library so it depends on the implementation you are using. I would guess most implementations will copy in blocks of the native CPU size (32/64 bits) and then the remainder byte-by-byte.
Here is glibc's version of memcpy for an example implementation:
void *
memcpy (dstpp, srcpp, len)
void *dstpp;
const void *srcpp;
size_t len;
{
unsigned long int dstp = (long int) dstpp;
unsigned long int srcp = (long int) srcpp;
/* Copy from the beginning to the end. */
/* If there not too few bytes to copy, use word copy. */
if (len >= OP_T_THRES)
{
/* Copy just a few bytes to make DSTP aligned. */
len -= (-dstp) % OPSIZ;
BYTE_COPY_FWD (dstp, srcp, (-dstp) % OPSIZ);
/* Copy whole pages from SRCP to DSTP by virtual address manipulation,
as much as possible. */
PAGE_COPY_FWD_MAYBE (dstp, srcp, len, len);
/* Copy from SRCP to DSTP taking advantage of the known alignment of
DSTP. Number of bytes remaining is put in the third argument,
i.e. in LEN. This number may vary from machine to machine. */
WORD_COPY_FWD (dstp, srcp, len, len);
/* Fall out and copy the tail. */
}
/* There are just a few bytes to copy. Use byte memory operations. */
BYTE_COPY_FWD (dstp, srcp, len);
return dstpp;
}
So you can see it copies a few bytes first to get aligned, then copies in words, then finally in bytes again. It does some optimized page copying using some kernel operations.