Cortex-M3 Data type: signed int

I have studied the dsPIC33 for FFT operation and realised that int16 is its native fractional type Q15, and int32 is Q31.
I am now studying how to do math on an NXP Cortex-M3. I have looked through the data-type documentation (including CMSIS) and could not find a reference that defines what int32_t actually is.
My question: is int32_t a (native) fractional type (Q31)? If not, what is it?
Is there (easy) presentation material that details the data types defined by CMSIS or NXP for math in general? If so, can you provide me links?
BTW, I use NXPxpresso and CMSIS-3.

define what int32_t actually is
Look in <stdint.h>; it is part of C99. On Cortex-M3, (ARM) compilers usually define it as long, which is 32 bits on that platform.
is int32_t a (native) fractional type
No, it is just a normal integer type.
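The fractional interpretation is something you (or a library) impose on top of it: CMSIS-DSP declares q31_t simply as a typedef for int32_t, and its functions (e.g. arm_float_to_q31) apply the Q31 convention using ordinary integer arithmetic. Below is a minimal sketch of that convention in plain C; the helper names are made up for illustration and are not CMSIS APIs.
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helpers: treat an int32_t as a Q31 fraction purely by convention. */
static int32_t float_to_q31(float x)   { return (int32_t)(x * 2147483648.0f); }  /* no saturation; assumes -1.0 <= x < 1.0 */
static float   q31_to_float(int32_t q) { return (float)q / 2147483648.0f; }
static int32_t q31_mul(int32_t a, int32_t b) { return (int32_t)(((int64_t)a * b) >> 31); }  /* keep the top 31 fractional bits */

int main(void)
{
    int32_t a = float_to_q31(0.5f);   /* 0x40000000 */
    int32_t b = float_to_q31(0.25f);  /* 0x20000000 */
    printf("%f\n", q31_to_float(q31_mul(a, b)));  /* prints 0.125000 */
    return 0;
}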

Related

SystemVerilog: Data types and display of default size of data type

How can I display the size of a 'real' (or 'float') in SystemVerilog?
$bits can display the size of int, shortint, longint, time, integer, etc., but it cannot do the same for a real.
You cannot select individual bits of a real number, nor is there any other construct that requires you to know the number of bits in a real number, so SystemVerilog does not need to provide a way to tell you.
real is not a synthesizable Verilog type. It is intended for testbenches or analog calculations, not for design, so it has no bit size associated with it.
However, from the LRM:
The real data type is the same as a C double. The shortreal data type is the same as a C float. The realtime declarations shall be treated synonymously with real declarations and can be used interchangeably. Variables of these three types are collectively referred to as real variables.
And there is a function which converts real to bits:
$realtobits converts values from a real type to a 64-bit vector representation of the real number.
and the corresponding
$bitstoreal converts a bit pattern created by $realtobits to a value of the real type
So, you can assume that the size of real is 64 bits after conversion to bits.
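Since the LRM quoted above says real is the same as a C double, the 64-bit claim can also be checked in C. A minimal sketch, with memcpy playing the role of $realtobits:
#include <stdint.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
    double r = 1.5;                    /* SystemVerilog 'real' is defined to match a C double */
    uint64_t bits;
    memcpy(&bits, &r, sizeof bits);    /* same idea as $realtobits: grab the 64-bit pattern */
    printf("sizeof(double) = %zu bytes, bits = 0x%016llx\n", sizeof r, (unsigned long long)bits);
    /* prints: sizeof(double) = 8 bytes, bits = 0x3ff8000000000000 */
    return 0;
}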

Float and Double network byte order

The Swift library includes the bigEndian property on integer types (such as Int, UInt, UInt8, UInt64, Int64, etc.) to convert them from host byte order (which might presumably be anything, but realistically will be big or little endian) to network byte order (which is big endian). There are some good SO answers referring to this, and a particularly complete one is here.
However, I've not found a good resource that covers arranging a Float (32-bit) or Double (64-bit) value into network byte order. Given that these types don't have a bigEndian property, I'm wondering if there is some subtlety involved? (The linked question does discuss floating-point types, but I'm not sure it definitely covers all the details that might be relevant.)
Specifically, I want to handle the 64 bit Double floating point type. I'd like a solution that will work on any platform where Swift is available.
Thank you.
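For what it's worth, Swift's Double does expose its raw bits as a UInt64 via bitPattern, and that integer can then be byte-swapped like any other. The general technique is to copy the IEEE 754 bit pattern into a 64-bit integer and emit the bytes most-significant first; here is a sketch of that idea in C, with a hypothetical helper name:
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write a double into a buffer in network (big-endian) byte order. */
static void put_double_be(double d, unsigned char out[8])
{
    uint64_t bits;
    memcpy(&bits, &d, sizeof bits);                       /* raw IEEE 754 bit pattern */
    for (int i = 0; i < 8; i++)
        out[i] = (unsigned char)(bits >> (56 - 8 * i));   /* most significant byte first */
}

int main(void)
{
    unsigned char buf[8];
    put_double_be(1.0, buf);
    for (int i = 0; i < 8; i++)
        printf("%02x ", (unsigned)buf[i]);                /* 3f f0 00 00 00 00 00 00 */
    printf("\n");
    return 0;
}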

representing Double values in Kaitai

Some of the values I need to read in my ksy file are doubles, which I assume is the binary64 format. The native data types for a float won't stretch that far. Has anyone managed to represent this data type in Kaitai?
"binary64" is a normal IEEE 754 double-precision floats, occupying 64 bits = 8 bytes.
They're perfectly supported by vast majority of languages and, subsequently, Kaitai Struct offers built-in supports for them as type: f8 (float, 8 bytes long).
If you're rather interested in larger floating point values (binary128, binary256 — i.e. quad or octuple precision), there is no built-in support for them in KS due to lack of standard support for these types in most target languages. If you want something like that, the recommended way would be implementing one as opaque type in a target language of your choice. That will likely require you to bringing in some external library which implements this type using some kind of software emulation / complex arithmetics — as hardware support seems to be almost non-existent in commodity CPUs (like Intel or ARM) as of 2020.
For more details on these, see issue #101.
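For reference, here is a rough sketch in C of what a generated parser effectively does for a little-endian type: f8 field (assuming the host double is IEEE 754 binary64, which is the common case):
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Read 8 little-endian bytes and reinterpret them as an IEEE 754 double. */
static double read_f8le(const unsigned char *p)
{
    uint64_t bits = 0;
    for (int i = 7; i >= 0; i--)
        bits = (bits << 8) | p[i];
    double d;
    memcpy(&d, &bits, sizeof d);
    return d;
}

int main(void)
{
    /* 1.5 encoded as little-endian binary64 (bit pattern 0x3FF8000000000000) */
    const unsigned char raw[8] = { 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0xf8, 0x3f };
    printf("%f\n", read_f8le(raw));   /* prints 1.500000 */
    return 0;
}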

Trying to understand how casting/conversion is done by the compiler, e.g., when casting from float to int

When a float is cast to int, how is the cast implemented by the compiler?
Does the compiler mask off some part of the float variable's memory, i.e., which part of the memory does it drop so the rest can be handed to the int variable?
I guess the answer to this lies in how int and float are maintained in memory.
But isn't that machine dependent rather than compiler dependent? How does the compiler decide which part of memory to copy when casting to a narrower type (this is a static cast, right)?
I am kind of confused by some wrong information, I guess.
(I read some questions under the downcasting tag, where there was a debate on whether this is a cast or a conversion. I am not very interested in what it is called, since both are performed by the compiler, but in how it is performed.)
...
Thanks
When talking about basic types and not pointers, a conversion is done. Because floating-point and integer representations are very different (usually IEEE 754 and two's complement, respectively), it is more than just masking out some bits.
If you wanted to see the floating-point number's bit pattern presented as an int, without doing a conversion, you can do something like this (in C):
float f = 10.5f;
int i2 = *(int *)&f;   /* reinterpret the bits, no conversion (this type-punning technically violates strict aliasing; memcpy or a union is the safe way) */
printf("%f %d\n", f, i2);
Most CPU architectures provide a native instruction (or short multi-instruction sequence) to do float<->int conversions, and the compiler will generally just generate that instruction. There are often faster methods; this question has some good information: What is the fastest way to convert float to int on x86.
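To make the "conversion, not masking" point concrete, here is a small sketch (values chosen purely for illustration):
#include <stdio.h>

int main(void)
{
    float f = 10.5f;
    int i = (int)f;        /* a genuine conversion: truncates toward zero, so i == 10 */
    int j = (int)-10.5f;   /* j == -10, also truncated toward zero */
    printf("%d %d\n", i, j);
    /* On x86-64 with SSE this usually compiles to a single cvttss2si instruction;
       ARM has analogous convert instructions (the exact mnemonic depends on the target). */
    return 0;
}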

double precision in Ada?

I'm very new to Ada and was trying to see if it offers a double-precision type. I see that we have Float, and
Put( Integer'Image( Float'digits ) );
on my machine gives a value of 6, which is not enough for numerical computations.
Does Ada have double and long double types as in C?
Thanks a lot...
It is a wee bit more complicated than that.
The only predefined floating-point type that compilers have to support is Float. Compilers may optionally support Short_Float and Long_Float. You should be able to look in Appendix F of your compiler documentation to see what it supports.
In practice, your compiler almost certainly defines Float as a 32-bit IEEE float and Long_Float as a 64-bit one. Note that C pretty much works this way too with its float and double; C doesn't actually define the sizes of those.
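As a side note on that C comparison, <float.h> exposes a rough C analogue of Ada's 'Digits. A minimal sketch (typical IEEE platforms print 6 and 15, but the C standard only guarantees minimums of 6 and 10):
#include <float.h>
#include <stdio.h>

int main(void)
{
    printf("float:  %zu bytes, %d decimal digits\n", sizeof(float),  FLT_DIG);
    printf("double: %zu bytes, %d decimal digits\n", sizeof(double), DBL_DIG);
    return 0;
}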
If you absolutely must have a certain precision (e.g. you are sharing the data with something external that must use IEEE 64-bit), then you should probably define your own floating-point type with exactly that precision. That ensures your code is either portable to any platform or compiler you move it to, or that it will produce a compile error so you can fix the issue.
You can create a floating-point type of any precision you like. For a long one it would be:
type My_Long_Float is digits 11;
Wiki Books is a good reference for things like this.