Microchip dsPIC33 C30 function pointer size? - function-pointers

The C30 user manual manual states that pointers near and far are 16bits wide.
How then does this address the full code memory space which is 24bits wide?
I am so confused as I have an assembler function (called from C) returning the program counter (from the stack) where a trap error occurred. I am pretty sure it sets w1 and w0 before returning.
In C, the return value is defined as a function pointer:
void (*errLoc)(void);
and the call is:
errLoc = getErrLoc();
When I now look at errLoc, it is a 16 bit value and I just do not think that is right. Or is it? Can function pointers (or any pointers) not access the full code address space?
All this has to do with a TRAP Adress error I am trying to figure out for the past 48 hours.

I see you are trying to use the dsPIC33Fxxxx/PIC24Hxxxx fault interrupt trap example code.
The problem is that pointer size for dsPIC33 (via the MPLAB X C30 compiler) is 16bit wide, but the program counter is 24bits. Fortunately the getErrLoc() assembly function does return the correct size.
However the example C source code function signature provided is void (*getErrLoc(void))(void) which is incorrect as it will be treating the return values as if it was a 16bit pointer. You want to change the return type of the function signature to be large enough to store the 24bits program counter value below instead. Thus if you choose unsigned long integer as the return type of getErrLoc(), then it will be large enough to store the 24bit program counter into a 32bit unsigned long integer location.
unsigned long getErrLoc(void); // Get Address Error Loc
unsigned long errLoc __attribute__((persistent));
(FYI: Using __attribute__((persistent)) to record trap location on next reboot)

Related

How to understand this line of chisel code

I'm in the process of learning chisel and scala language and try to analyse some lines of rocket-chip code.Could anyone try to explain me this line? https://github.com/chipsalliance/rocket-chip/blob/54237b5602a273378e3b35307bb47eb1e58cb9fb/src/main/scala/rocket/RocketCore.scala#L957
I understand what log2Up function is doing but don't understand why that log2Up(n)-1 and 0,were passed like "arguments" to addr which is val of type UInt!?
I could not find where UInt was defined, but if I had to guess, UInt is a class that has an apply method. This is a special method that allows us to use a parenthesis operator on an instance of the class.
For example lets say we have a class called Multiply that defines an apply method.
class Multiply {
def apply(a: Int, b: Int): Int = a * b
}
This allows you to call operator () on any instance of that class. For example:
val mul = new Multiply()
println(mul(5, 6)) // prints 30
What i concluded is that we use addr(log2Up(n)-1, 0) to get addres bits starting from zero up to log2Up(n)-1 bit. Lets take an example.
If we made object of class RegFile in this way
val reg = RegFile(31, 10)
First, memory rf is created. Size of that memory is 31 data of type UInt width of 10, starting from 0 up to 30.
When we compute log2Up(n)-1 we get 4, and we have something like this: addr(4,0). This gives us last five bits of addr. Like #Jack Koenig said in one of the comment above: "Rocket's register file uses a little trick where it reverses the order of the registers physically compared to the RISC-V", that's why we use ~addr. And at least rf(~addr) gives us back what is in that memory location.
This is implemented in this way to provide adequate memory access.
Take a look what whould be if we try to get data from memory location that we don't have in our memory. If method access was called in this way
access(42)
We try to access memory location on 41th place, but we only have 31 memory location(30 is top). 42 binary is 101010. Using what i said above
~addr(log2Up(n)-1,0)
would return us 10101 or 21 in decimal. Because order of registers is reversed this is
10th memory location (we try to access 41th but only have 31, 41 minus 31 is 10).

c cast and deference a pointer strict aliasing

In http://blog.regehr.org/archives/1307, the author claims that the following snippet has undefined behavior:
unsigned long bogus_conversion(double d) {
unsigned long *lp = (unsigned long *)&d;
return *lp;
}
The argument is based on http://port70.net/~nsz/c/c11/n1570.html#6.5p7, which specified the allowed access circumstances. However, in the footnote(88) for this bullet point, it says this list is only for checking aliasing purpose, so I think this snippet is fine, assuming sizeof(long) == sizeof(double).
My question is whether the above snippet is allowed.
The snippet is erroneous but not because of aliasing. First there is a simple rule that says to deference a pointer to object with a different type than its effective type is wrong. Here the effective type is double, so there is an error.
This safeguard is there in the standard, because the bit representation of a double must not be a valid representation for unsigned long, although this would be quite exotic nowadays.
Second, from a more practical point of view, double and unsigned long may have different alignment properties, and accessing this in that way may produce a bus error or just have a run time penalty.
Generally casting pointers like that is almost always wrong, has no defined behavior, is bad style and in addition is mostly useless, anyhow. Focusing on aliasing in the argumentation about these problems is a bad habit that probably originates in incomprehensible and scary gcc warnings.
If you really want to know the bit representation of some type, there are some exceptions of the "effective type" rule. There are two portable solutions that are well defined by the C standard:
Use unsigned char* and inspect the bytes.
Use a union that comprises both types, store the value in there and read it with the other type. By that you are telling the compiler that you want an object that can be seen as both types. But here you should not use unsigned long as a target type but uint64_t, since you have to be sure that the size is exactly what you think it is, and that there are no trap representations.
To illustrate that, here is the same function as in the question but with defined behavior.
unsigned long valid_conversion(double d) {
union {
unsigned long ul;
double d;
} ub = { .d = d, };
return ub.ul;
}
My compiler (gcc on a Debian, nothing fancy) compiles this to exactly the same assembler as the code in the question. Only that you know that this code is portable.

IEC61131-3 directly represented variables: data width and datatype

Directly represented variables (DRV) in IEC61131-3 languages include in their "addresses" a data-width specifier: X for 1 bit, B for byte, W for word, D for dword, etc.
Furthermore, when a DRV is declared, a IEC data type is specified, as any variable (BYTE, WORD, INT, REAL...).
I'm not sure about how these things are related. Are they orthogonal or not? Can one define a REAL variable with a W (byte) address? What would be the expected result?
A book says:
Assigning a data type to a flag or I/O address enables the programming
system to check whether the variable is being accessed correctly. For
example, a variable declared by AT %QD3 : DINT; cannot be
inadvertently accessed with UINT or REAL.
which does not make things clearer for me. Take for example this fragment (recall that W means Word, i.e., 16 bits - and both DINT and REAL correspond to 32 bits)
X AT %MW3 : DINT;
Y AT %MD4.1 : DINT;
Z AT %MD4.1 : REAL;
The first line maps a 32-bits IEC var to a 16-bits location. Is this legal? would the write/read be equivalent to a "cast" or what?
The other lines declare two 32-bits IEC variables of different type that points to the same address (I guess this should be legal). What is the expected result when reading or writing?
Like everything in PLC world, its all vendor and model specific, unfortunately.
Siemens compiler would not let you declare Real address with bit component like MD4.1, it allowed only MD4 and data length had to be double word, MB4 was not allowed.
Reading would not be equivalent to cast. For example you declare MW2 as integer and copy some value there. PLC stores integer as, lets say in twos complement format. Later in program you read MD2 as real. The PLC does not try to convert integer to real, it just blindly reads the bytes and treats it as real regardless what was saved there or what was declared there. There was no automatic casting.
This is how things worked in Siemens S7 plc-s. But you have to be very careful since each vendor does things its own way.

how to get 18 bit code address from symbol defined in linker command file

In Code Composer, you can define new symbols in the linker command file simply:
_Addr_start = 0x5C00;
_AppLength = 0x4C000;
before the memory map and section assignment. This is done in the bootloader example from TI.
You can then refer to the address (as integers) in your c-code as this
extern uint32_t _Addr_start; // note that uint32_t is fake.
extern uint32_t _AppLength; // there is no uint32_t object allocated
printf("start = %X len= %X\r\n", (uint32_t)&_Addr_start, (uint32_t)&_AppLength);
The problem is that if you use the 'small' memory model, the latter symbol (at 0x45C00) gives linker warning, because it tries to cast it to a 16-bit pointer.
"C:/lakata/hardware-platform/CommonSW/otap.c", line 78: warning #17003-D:
relocation from function "OtapGetExternal_CRC_Calc" to symbol "_AppLength"
overflowed; the 18-bit relocated address 0x3f7fc is too large to encode in
the 16-bit field (type = 'R_MSP_REL16' (161), file = "./otap.obj", offset =
0x00000002, section = ".text:OtapGetExternal_CRC_Calc")
I tried using explicit far pointers, but code composer doesn't understand the keyword far. I tried to make the dummy symbol a function pointer, to trick the compiler into thinking that dereferencing it would.... The pointer points to code space, and the code space model is "large" while the data space model is "small".
I figured it out before I finished entering the question!
Instead of declaring the symbol as
extern uint32_t _AppLength; // pretend it is a dummy data
declare it as
void _AppLength(void); // pretend it is a dummy function
Then the pointer conversion works properly, because &_AppLength is assumed to be far now. (When it declared as an integer, &_AppLength is assumed to be near and the linker fails.)

What is the default hash code that Mathematica uses?

The online documentation says
Hash[expr]
gives an integer hash code for the expression expr.
Hash[expr,"type"]
gives an integer hash code of the specified type for expr.
It also gives "possible hash code types":
"Adler32" Adler 32-bit cyclic redundancy check
"CRC32" 32-bit cyclic redundancy check
"MD2" 128-bit MD2 code
"MD5" 128-bit MD5 code
"SHA" 160-bit SHA-1 code
"SHA256" 256-bit SHA code
"SHA384" 384-bit SHA code
"SHA512" 512-bit SHA code
Yet none of these correspond to the default returned by Hash[expr].
So my questions are:
What method does the default Hash use?
Are there any other hash codes built in?
The default hash algorithm is, more-or-less, a basic 32-bit hash function applied to the underlying expression representation, but the exact code is a proprietary component of the Mathematica kernel. It's subject to (and has) change between Mathematica versions, and lacks a number of desirable cryptographic properties, so I personally recommend you use MD5 or one of the SHA variants for any serious application where security matters. The built-in hash is intended for typical data structure use (e.g. in a hash table).
The named hash algorithms you list from the documentation are the only ones currently available. Are you looking for a different one in particular?
I've been doing some reverse engeneering on 32 and 64 bit Windows version of Mathematica 10.4 and that's what I found:
32 BIT
It uses a Fowler–Noll–Vo hash function (FNV-1, with multiplication before) with 16777619 as FNV prime and ‭84696351‬ as offset basis. This function is applied on Murmur3-32 hashed value of the address of expression's data (MMA uses a pointer in order to keep one instance of each data). The address is eventually resolved to the value - for simple machine integers the value is immediate, for others is a bit trickier. The Murmur3-32 implementing function contains in fact an additional parameter (defaulted with 4, special case making behaving as in Wikipedia) which selects how many bits to choose from the expression struct in input. Since a normal expression is internally represented as an array of pointers, one can take the first, the second etc.. by repeatedly adding 4 (bytes = 32 bit) to the base pointer of the expression. So, passing 8 to the function will give the second pointer, 12 the third and so on. Since internal structs (big integers, machine integers, machine reals, big reals and so on) have different member variables (e.g. a machine integer has only a pointer to int, a complex 2 pointers to numbers etc..), for each expression struct there is a "wrapper" that combine its internal members in one single 32-bit hash (basically with FNV-1 rounds). The simplest expression to hash is an integer.
The murmur3_32() function has 1131470165 as seed, n=0 and other params as in Wikipedia.
So we have:
hash_of_number = 16777619 * (84696351‬ ^ murmur3_32( &number ))
with " ^ " meaning XOR.
I really didn't try it - pointers are encoded using WINAPI EncodePointer(), so they can't be exploited at runtime. (May be worth running in Linux under Wine with a modified version of EncodePonter?)
64 BIT
It uses a FNV-1 64 bit hash function with 0xAF63BD4C8601B7DF as offset basis and 0x100000001B3 as FNV prime, along with a SIP64-24 hash (here's the reference code) with the first 64 bit of 0x0AE3F68FE7126BBF76F98EF7F39DE1521 as k0 and the last 64 bit as k1. The function is applied to the base pointer of the expression and resolved internally. As in 32-bit's murmur3, there is an additional parameter (defaulted to 8) to select how many pointers to choose from the input expression struct. For each expression type there is a wrapper to condensate struct members into a single hash by means of FNV-1 64 bit rounds.
For a machine integer, we have:
hash_number_64bit = 0x100000001B3 * (0xAF63BD4C8601B7DF ^ SIP64_24( &number ))
Again, I didn't really try it. Could anyone try?
Not for the faint-hearted
If you take a look at their notes on internal implementation, they say that "Each expression contains a special form of hash code that is used both in pattern matching and evaluation."
The hash code they're referring to is the one generated by these functions - at some point in the normal expression wrapper function there's an assignment that puts the computed hash inside the expression struct itself.
It would certainly be cool to understand HOW they can make use of these hashes for pattern matching purpose. So I had a try running through the bigInteger wrapper to see what happens - that's the simplest compound expression.
It begins checking something that returns 1 - dunno what.
So it executes
var1 = 16777619 * (67918732 ^ hashMachineInteger(1));
with hashMachineInteger() is what we said before - including values.
Then it reads the length in bytes of the bigInt from the struct (bignum_length) and runs
result = 16777619 * (v10 ^ murmur3_32(v6, 4 * v4));
Note that murmur3_32() is called if 4 * bignum_length is greater than 8 (may be related to the max value of machine integers $MaxMachineNumber 2^32^32 and by converse to what a bigInt is supposed to be).
So, the final code is
if (bignum_length > 8){
result = 16777619 * (16777619 * (67918732 ^ ( 16777619 * (84696351‬ ^ murmur3_32( 1, 4 )))) ^ murmur3_32( &bignum, 4 * bignum_length ));
}
I've made some hypoteses on the properties of this construction. The presence of many XORs and the fact that 16777619 + 67918732 = 84696351‬ may make one think that some sort of cyclic structure is exploited to check patterns - i.e. subtracting the offset and dividing by the prime, or something like that. The software Cassandra uses the Murmur hash algorithm for token generation - see these images for what I mean with "cyclic structure". Maybe various primes are used for each expression - must still check.
Hope it helps
It seems that Hash calls the internal Data`HashCode function, then divides it by 2, takes the first 20 digits of N[..] and then the IntegerPart, plus one, that is:
IntegerPart[N[Data`HashCode[expr]/2, 20]] + 1