Checking the value of a typedef alias - macros

I'm trying to write some DSP code that will need to run in both floating-point and fixed-point environments (the numeric type will be determined at compile time). I'd like to alias the particular numeric type using either a macro or a typedef. Multiplication, division and other math functions will vary considerably in implementation depending on the numeric type, so I'll need some sort of switch to determine whether to include certain headers and perhaps alter the implementation based on the numeric type.
I'll give a short code snippet as an example ...
typedef float samp_t;
// or #define samp_t float (bad naming practice?)
samp_t multiply_samp_t(samp_t a, samp_t b){
    return a*b;
}
/* An alternative in fixed point:
typedef int samp_t;
#define RADIX 24
samp_t multiply_samp_t(samp_t a, samp_t b){
    return (samp_t) ((((long) a)*((long) b)) >> RADIX);
}
*/
int main(void){
    samp_t a,b,c;
    a = 15;
    b = 27;
    c = multiply_samp_t(a,b);
    return 0;
}
So, how would one switch between the two different multiplication functions based on samp_t's type? Any recommendations or suggestions are welcome.
Thanks!
-Brant

Here is an example of choosing the functions corresponding to the data type at compile time:
#define SYSTEM_FLOAT 0
#define SYSTEM_INT 1
// main selection
#define TYPE_SYSTEM SYSTEM_FLOAT
#if TYPE_SYSTEM == SYSTEM_FLOAT
#define SAMP_T float
#define mult_samp_t mult_samp_t_float
#elif TYPE_SYSTEM == SYSTEM_INT
#define SAMP_T int
#define mult_samp_t mult_samp_t_int
#elif ...
#endif
int main(void){
SAMP_T a,b,c;
a = 15;
b = 27;
c = mult_samp_t(a,b);
}
Somewhere in the code you must define:
float mult_samp_t_float(float a, float b)
{
...
return a float;
}
int mult_samp_t_int(int a, int b)
{
...
return an int;
}
That is, the type system you select determines the entire set of functions you are going to use.
I still recommend using this scheme only if you can't make do with runtime selection, because the approach presented here makes the code harder to debug (although I have never had problems with it in the past).
(I have encountered much worse things in the code of professional operating systems :-) )
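The selection scheme can be spelled out end to end. What follows is only a minimal sketch, written in C++ and assuming a 32-bit sample with 24 fraction bits for the fixed-point case; the helpers samp_from_double and samp_to_double are hypothetical names added for illustration and are not part of the original posts.

```cpp
#include <cstdint>

#define SYSTEM_FLOAT 0
#define SYSTEM_INT   1

// Main selection. Assumption: it may also come from the build,
// e.g. -DTYPE_SYSTEM=SYSTEM_FLOAT on the compiler command line.
#ifndef TYPE_SYSTEM
#define TYPE_SYSTEM SYSTEM_INT
#endif

#if TYPE_SYSTEM == SYSTEM_FLOAT

typedef float samp_t;
// Hypothetical conversion helpers (not in the original posts).
inline samp_t samp_from_double(double x) { return static_cast<samp_t>(x); }
inline double samp_to_double(samp_t x)   { return static_cast<double>(x); }
inline samp_t multiply_samp_t(samp_t a, samp_t b) { return a * b; }

#elif TYPE_SYSTEM == SYSTEM_INT

#define RADIX 24  // fraction bits, as in the question (Q8.24 in 32 bits)
typedef std::int32_t samp_t;
// Hypothetical conversion helpers (not in the original posts).
inline samp_t samp_from_double(double x) {
    return static_cast<samp_t>(x * static_cast<double>(1LL << RADIX));
}
inline double samp_to_double(samp_t x) {
    return static_cast<double>(x) / static_cast<double>(1LL << RADIX);
}
inline samp_t multiply_samp_t(samp_t a, samp_t b) {
    // Widen to 64 bits so the intermediate product cannot overflow,
    // then shift back down to RADIX fraction bits.
    // Right-shifting a negative value is arithmetic on mainstream
    // compilers (and guaranteed from C++20 on).
    return static_cast<samp_t>((static_cast<std::int64_t>(a) * b) >> RADIX);
}

#else
#error "Unknown TYPE_SYSTEM"
#endif
```

With 24 fraction bits in a 32-bit integer only the range [-128, 128) is representable, so the 15 * 27 from the question would overflow in the fixed-point build; RADIX has to be chosen to fit the expected signal range.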

Related

Constant expression with casting uint to float in GLSL

I defined a global constant like this
const uint Y = 4;
const float X = float(Y);
It works just fine. Now I would like to make it a bit more flexible by using specialization constants
layout (constant_id = 1) const uint Y = 4;
const float X = float(Y);
Unfortunately now I get
error: '=' : global const initializers must be constant ' const highp float'
Why is it that GLSL cannot perform such a simple conversion in the presence of specialization constants? Shouldn't specialization constants be treated on an equal footing with any other constant expression? Is it really the case that the only possible solution to this problem is to provide both constants using
layout (constant_id = 1) const uint Y = 4;
layout (constant_id = 2) const float X = 4.;
According to the Vulkan GLSL specification, only certain operators can be used on a specialization constant in order for it to remain a specialization constant. Converting one to a float isn't on that list.
As for why you can't do this, it's because Vulkan compilers should be as simple as possible. When you have a specialization constant and perform some operation on it and try to use it as a constant, the compiler has to invisibly perform that operation at specialization time. That's a big pain for the implementation to do, so Vulkan GLSL restricts how much of this is needed.
The Vulkan GLSL specification contains something of a bug. Its non-normative text says that this is possible:
layout(constant_id = 18) const int scX = 1;
layout(constant_id = 19) const int scZ = 1;
const vec3 scVec = vec3(scX, 1, scZ); // partially specialized vector
But the section "Specialization Constant Operations" specifically says that it isn't, due to the implicit conversion from int to float.

How to use a fixed point sin function in Vivado HLS

I am calculating the intersection point of two lines given in the polar coordinate system:
typedef ap_fixed<16,3,AP_RND> t_lines_angle;
typedef ap_fixed<16,14,AP_RND> t_lines_rho;
bool get_intersection(
hls::Polar_< t_lines_angle, t_lines_rho>* lineOne,
hls::Polar_< t_lines_angle, t_lines_rho>* lineTwo,
Point* point)
{
float angleL1 = lineOne->angle.to_float();
float angleL2 = lineTwo->angle.to_float();
t_lines_angle rhoL1 = lineOne->rho.to_float();
t_lines_angle rhoL2 = lineTwo->rho.to_float();
t_lines_angle ct1=cosf(angleL1);
t_lines_angle st1=sinf(angleL1);
t_lines_angle ct2=cosf(angleL2);
t_lines_angle st2=sinf(angleL2);
t_lines_angle d=ct1*st2-st1*ct2;
// we make sure that the lines intersect
// which means that parallel lines are not possible
point->X = (int)((st2*rhoL1-st1*rhoL2)/d);
point->Y = (int)((-ct2*rhoL1+ct1*rhoL2)/d);
return true;
}
After synthesis for our FPGA I saw that the 4 implementations of the float sine (and cos) take 4800 LUTs per implementation, which sums up to roughly 19000 LUTs for these 4 functions. I want to reduce the LUT count by using a fixed-point sine. I already found an implementation of CORDIC but I am not sure how to use it. The input of the function is an integer but I have an ap_fixed datatype. How can I map this ap_fixed to an integer? And how can I map my 3.13 fixed point to the required 2.14 fixed point?
With the help of one of my colleagues I figured out a quite easy solution that does not require any hand-written implementations or manipulation of the fixed-point data:
Use #include "hls_math.h" and the hls::sinf() and hls::cosf() functions.
It is important to say that the input of the functions should be ap_fixed<32, I> where I <= 32. The output of the functions can be assigned to different types, e.g., ap_fixed<16, I>.
Example:
void CalculateSomeTrig(ap_fixed<16,5>* angle, ap_fixed<16,5>* output)
{
ap_fixed<32,5> functionInput = *angle;
*output = hls::sinf(functionInput);
}
LUT consumption:
In my case the LUT consumption was reduced to 400 LUTs for each implementation of the function.
You can use bit-slicing to get the fraction and the integer parts of the ap_fixed variable, and then manipulate them to get the new ap_fixed. Perhaps something like:
constexpr int max(int a, int b) { return a > b ? a : b; }
template <int W2, int I2, int W1, int I1>
ap_fixed<W2, I2> convert(ap_fixed<W1, I1> f)
{
// Read fraction part as integer:
ap_fixed<max(W2, W1) + 1, max(I2, I1) + 1> result = f(W1 - I1 - 1, 0);
// Shift by the original number of bits in the fraction part
result >>= W1 - I1;
// Add the integer part
result += f(W1 - 1, W1 - I1);
return result;
}
I haven't tested this code well, so take it with a grain of salt.
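For the plain-integer side of the question (feeding a 3.13 value into a CORDIC routine that expects 2.14), the format change is a single scaling step. Below is a minimal sketch without any HLS types, assuming the raw 16-bit two's-complement pattern of the ap_fixed<16,3> value is already available as an int16_t; the function name q3_13_to_q2_14 is hypothetical.

```cpp
#include <algorithm>
#include <cstdint>
#include <limits>

// Hypothetical helper: take the raw bits of a Q3.13 value and return the raw
// bits of the same real number in Q2.14, saturating when it does not fit.
std::int16_t q3_13_to_q2_14(std::int16_t raw_q3_13) {
    // Q2.14 has one more fraction bit, so the same real value is stored
    // as twice the raw integer.
    std::int32_t wide = static_cast<std::int32_t>(raw_q3_13) * 2;
    const std::int32_t lo = std::numeric_limits<std::int16_t>::min();
    const std::int32_t hi = std::numeric_limits<std::int16_t>::max();
    // Clamp to the 16-bit range instead of wrapping around.
    return static_cast<std::int16_t>(std::min(hi, std::max(lo, wide)));
}
```

Q2.14 only covers [-2, 2), so angles outside that range saturate in this sketch; whether the CORDIC routine expects radians in that range or some normalized angle is something to check against its documentation.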

returns long double instead of double

I am writing a program in C++ where I want to find the epsilon of my PC.
I want the result to be double precision (which is 2.2204460492503131 E-16) but instead the output is 1.0842 E-019 which is the epsilon in long double precision.
My program is this:
#include <iostream>
double e = 1.0;
double x;
int main ()
{
for (int i = 0; e + 1.0!=1.0 ; i++)
{
std::cout<<e<<'\n';
x = e;
e/=2.0;
}
std::cout << "The epsilon of this Computer is "<< x <<'\n';
return 0;
}
Output std::numeric_limits<double>::epsilon() instead. std::numeric_limits is declared in the standard header <limits>.
If you really must calculate it (rather than trusting your standard library to provide a correct value), a more usual technique is
double epsilon = 1.0;
while ((1.0 + 0.5 * epsilon) != 1.0)
    epsilon *= 0.5;
Note that (although you haven't shown how you did it) it may actually be your long double calculation that is incorrect, since floating point literals (like 1.0) are of type double, not long double. If you want the result to be of type long double, it would be advisable to give all of the literal values (1.0, 0.5) the L suffix, to force them to be of type long double.
Also remember to use appropriate formatting when streaming the resultant value to std::cout, to ensure output also has the accuracy/precision you need. The default settings (what you get if you don't control the formatting) may differ.
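Both suggestions can be combined in a short sketch. The function name computed_epsilon is hypothetical, and the volatile store is an assumption about the cause: on targets that evaluate double expressions in wider registers (x87, for example), it typically forces the sum to be rounded to double before the comparison.

```cpp
#include <iomanip>
#include <iostream>
#include <limits>

// Hypothetical helper: halve eps until 1 + eps/2 is no longer
// distinguishable from 1 in double precision.
double computed_epsilon() {
    double eps = 1.0;
    for (;;) {
        // Storing into a volatile double makes the compiler round the sum
        // to double instead of keeping it in a wider register.
        volatile double sum = 1.0 + 0.5 * eps;
        if (sum == 1.0)
            break;
        eps *= 0.5;
    }
    return eps;
}

// Prints the library value and the computed value with enough digits
// to round-trip a double.
void print_epsilons() {
    std::cout << std::setprecision(17)
              << std::numeric_limits<double>::epsilon() << '\n'
              << computed_epsilon() << '\n';
}
```

On a platform with IEEE 754 doubles the two printed values should agree.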

Do I need to use decimal places when using floats? Is the "f" suffix necessary?

I've seen several examples in books and around the web where they sometimes use decimal places when declaring float values even if they are whole numbers, and sometimes using an "f" suffix. Is this necessary?
For example:
[UIColor colorWithRed:0.8 green:0.914 blue:0.9 alpha:1.00];
How is this different from:
[UIColor colorWithRed:0.8f green:0.914f blue:0.9f alpha:1.00f];
Does the trailing "f" mean anything special?
Getting rid of the trailing zeros for the alpha value works too, so it becomes:
[UIColor colorWithRed:0.8 green:0.914 blue:0.9 alpha:1];
So are the decimal zeros just there to remind myself and others that the value is a float?
Just one of those things that has puzzled me so any clarification is welcome :)
Decimal literals are treated as double by default. Using 1.0f tells the compiler to use a float (which is smaller than double) instead. In most cases it doesn't really matter if a number is a double or a float, the compiler will make sure you get the right format for the job in the end. In high-performance code you may want to be explicit, but I'd suggest benchmarking it yourself.
As John said, numbers with a decimal place default to double. TomTom is wrong.
I was curious to know if the compiler would just optimize the double to a const float (which I assumed would happen). It turns out it doesn't, and the idea of a speed increase is actually legit, depending on how much you use it. In a math-heavy application, you probably do want to use this trick.
It must be the case that it is taking the stored float variable, casting it to a double, performing the math against the double (the number without the f), then casting it back to a float to store it again. That would explain the difference in calculation even though we're storing in floats each time.
The code & raw results:
https://gist.github.com/1880400
Pulled out relevant benchmark on an iPad 1 in Debug profile (Release resulted in even more of a performance increase by using the f notation):
------------ 10000000 total loops
timeWithDoubles: 1.33593 sec
timeWithFloats: 0.80924 sec
Float speed up: 1.65x
Difference in calculation: -0.000038
Code:
int main (int argc, const char * argv[]) {
for (unsigned int magnitude = 100; magnitude < INT_MAX; magnitude *= 10) {
runTest(magnitude);
}
return 0;
}
void runTest(int numIterations) {
NSTimeInterval startTime = CFAbsoluteTimeGetCurrent();
float d = 1.2f;
for (int i = 0; i < numIterations; i++) {
d += 1.8368383;
d *= 0.976;
}
NSTimeInterval timeWithDoubles = CFAbsoluteTimeGetCurrent() - startTime;
startTime = CFAbsoluteTimeGetCurrent();
float f = 1.2f;
for (int i = 0; i < numIterations; i++) {
f += 1.8368383f;
f *= 0.976f;
}
NSTimeInterval timeWithFloats = CFAbsoluteTimeGetCurrent() - startTime;
printf("\n------------ %d total loops\n", numIterations);
printf("timeWithDoubles: %2.5f sec\n", timeWithDoubles);
printf("timeWithFloats: %2.5f sec\n", timeWithFloats);
printf("Float speed up: %2.2fx\n", timeWithDoubles / timeWithFloats);
printf("Difference in calculation: %f\n", d - f);
}
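The promotion described in this answer can be checked at compile time. This is a minimal sketch in C++ rather than Objective-C (the literal rules come from C in both cases); the two step_ functions are hypothetical names that mirror one iteration of each benchmark loop.

```cpp
#include <type_traits>

// Literal types: an unsuffixed literal with a decimal point is a double,
// the f suffix makes it a float, and a bare 1 is an int.
static_assert(std::is_same<decltype(0.8), double>::value, "0.8 is a double");
static_assert(std::is_same<decltype(0.8f), float>::value, "0.8f is a float");
static_assert(std::is_same<decltype(1), int>::value, "1 is an int");

// Mixed arithmetic: float combined with a double literal is done in double,
// float combined with a float literal stays in float.
static_assert(std::is_same<decltype(1.2f + 1.8368383), double>::value,
              "float + double yields double");
static_assert(std::is_same<decltype(1.2f + 1.8368383f), float>::value,
              "float + float yields float");

// Hypothetical helpers: one iteration of each benchmark loop.
float step_with_double_literals(float d) {
    d += 1.8368383;   // computed in double, then converted back to float
    d *= 0.976;
    return d;
}

float step_with_float_literals(float f) {
    f += 1.8368383f;  // computed in float throughout
    f *= 0.976f;
    return f;
}
```

Note that a suffix on an integer literal, as in 8f, is not a valid literal in C or C++; the suffix needs a decimal point or an exponent, as in 8.0f or 8.f.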
Trailing f: this is a float.
Trailing f + "." - redundant.
That simple.
8f is 8 as a float.
8.0 is 8 as a float.
8 is 8 as integer.
8.0f is 8 as a float.
Mostly the "f" can be style - to make sure it is a float, not a double.

Find the smallest value among variables?

I have from 4 up to 20 variables that differ in size.
They are all of type float and number values.
Is there an easy way to find the smallest value among them and assign it to a variable?
Thanks
Not sure about objective-c but the procedure's something like:
float min = arrayofvalues[0];
foreach( float value in arrayofvalues)
{
if(value < min)
min=value;
}
I agree with Davy8 - you could try rewriting his code into Objective C.
But, I have found some min()-like code - in Objective C!
Look at this:
- (int) smallestOf: (int) a andOf: (int) b andOf: (int) c
{
int min = a;
if ( b < min )
min = b;
if( c < min )
min = c;
return min;
}
This code assumes it'll always compare only three variables, but I guess that's something you can deal with ;)
The best solution, without foreach.
- (float)minFromArray:(float *)array size:(int)arrSize
{
    float min;
    int i;
    min = array[0];
    for(i=1;i<arrSize;i++)
        if(array[i] < min)
            min = array[i];
    return min;
}
If you want to be sure, add a check that arrSize > 0.
Marco
Thanks for all your answers and comments.. I learn a lot from you guys :)
I ended up using something like Martin suggested.
if (segmentValueNumber == 11){
    float min = 100000000;
    if(game51 > 0 && game51 < min){
        min=game51;
    }
    if(game52 > 0 && game52 < min){
        min=game52;
    }
}
...............................................
I could not figure out how to implement it all into one array since each result depends on a segment control, and I think the program is more optimised this way since it only checks relevant variables.
But thanks again, you are most helpful..
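The "only relevant variables, only positive values" variant can also be written once and reused per segment. Here is a minimal C++ sketch (it could be called from Objective-C++); the names smallest and smallest_positive are hypothetical, and the fallback parameter plays the role of the 100000000 starting value above.

```cpp
#include <algorithm>
#include <cassert>
#include <initializer_list>
#include <vector>

// Hypothetical helper: smallest of a non-empty list of values.
float smallest(std::initializer_list<float> values) {
    assert(values.size() > 0);  // std::min requires a non-empty list
    return std::min(values);
}

// Hypothetical helper: smallest strictly positive value,
// or fallback when no value is positive.
float smallest_positive(const std::vector<float>& values, float fallback) {
    float best = fallback;
    bool found = false;
    for (float v : values) {
        if (v > 0.0f && (!found || v < best)) {
            best = v;
            found = true;
        }
    }
    return best;
}
```

Each segment can then pass just its own variables, for example smallest_positive({game51, game52}, 100000000.0f), without a shared array.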