Constant expression with casting uint to float in GLSL - constants

I defined a global constant like this
const uint Y = 4;
const float X = float(Y);
It works just fine. Now I would like to make it a bit more flexible by using specialization constants
layout (constant_id = 1) const uint Y = 4;
const float X = float(Y);
Unfortunately now I get
error: '=' : global const initializers must be constant ' const highp float'
Why is it that GLSL cannot perform such a simple conversion in the presence of specialization constants? Shouldn't specialization constants be treated on an equal footing with any other constant expression? Is it really the case that the only possible solution to this problem is to provide both constants using
layout (constant_id = 1) const uint Y = 4;
layout (constant_id = 2) const float X = 4.;

According to the Vulkan GLSL specification, only certain operators can be used on a specialization constant in order for it to remain a specialization constant. Converting one to a float isn't on that list.
As for why you can't do this, it's because Vulkan compilers should be as simple as possible. When you have a specialization constant and perform some operation on it and try to use it as a constant, the compiler has to invisibly perform that operation at specialization time. That's a big pain for the implementation to do, so Vulkan GLSL restricts how much of this is needed.
The Vulkan GLSL specification contains something of a bug. Its non-normative text says that this is possible:
layout(constant_id = 18) const int scX = 1;
layout(constant_id = 19) const int scZ = 1;
const vec3 scVec = vec3(scX, 1, scZ); // partially specialized vector
But the section "Specialization Constant Operations" specifically says that it isn't, due to the implicit conversion from int to float.

Related

How can I initialize a constant with derived values in HLSL?

I'm trying to migrate this glsl code into hlsl (unity shader). But the compiler complains about the following lines:
#define Length float
const Length m = 1.0;
const Length km = 1000.0 * m;
where km is derived from m, and the error msg said:
'km': initial value must be a literal expression
Is there any way to solve this without just replacing m with its literal value manually?
I tried to google this but found nothing related, or maybe this question is just a complaint about HLSL's weak compiler.
According to glsl-to-hlsl-reference, we should use the static const qualifiers in HLSL.

How to use a fixed point sin function in Vivado HLS

I am calculating the intersection point of two lines given in the polar coordinate system:
typedef ap_fixed<16,3,AP_RND> t_lines_angle;
typedef ap_fixed<16,14,AP_RND> t_lines_rho;
bool get_intersection(
hls::Polar_< t_lines_angle, t_lines_rho>* lineOne,
hls::Polar_< t_lines_angle, t_lines_rho>* lineTwo,
Point* point)
{
float angleL1 = lineOne->angle.to_float();
float angleL2 = lineTwo->angle.to_float();
t_lines_angle rhoL1 = lineOne->rho.to_float();
t_lines_angle rhoL2 = lineTwo->rho.to_float();
t_lines_angle ct1=cosf(angleL1);
t_lines_angle st1=sinf(angleL1);
t_lines_angle ct2=cosf(angleL2);
t_lines_angle st2=sinf(angleL2);
t_lines_angle d=ct1*st2-st1*ct2;
// we make sure that the lines intersect
// which means that parallel lines are not possible
point->X = (int)((st2*rhoL1-st1*rhoL2)/d);
point->Y = (int)((-ct2*rhoL1+ct1*rhoL2)/d);
return true;
}
After synthesis for our FPGA I saw that each of the 4 float sine and cosine calls takes about 4800 LUTs, which adds up to roughly 19000 LUTs for these 4 functions. I want to reduce the LUT count by using a fixed-point sine. I already found an implementation of CORDIC, but I am not sure how to use it. The input of the function is an integer, but I have an ap_fixed data type. How can I map this ap_fixed to an integer? And how can I map my 3.13 fixed point to the required 2.14 fixed point?
With the help of one of my colleagues I figured out a quite easy solution that does not require any hand written implementations or manipulation of the fixed point data:
use #include "hls_math.h" and the hls::sinf() and hls::cosf() functions.
It is important to say that the input of the functions should be ap_fixed<32, I> where I <= 32. The output of the functions can be assigned to different types e.g., ap_fixed<16, I>
Example:
void CalculateSomeTrig(ap_fixed<16,5>* angle, ap_fixed<16,5>* output)
{
ap_fixed<32,5> functionInput = *angle;
*output = hls::sinf(functionInput);
}
LUT consumption:
In my case the consumption of LUT was reduced to 400 LUTs for each implementation of the function.
You can use bit-slicing to get the fraction and the integer parts of the ap_fixed variable, and then manipulate them to get the new ap_fixed. Perhaps something like:
constexpr int max(int a, int b) { return a > b ? a : b; }
template <int W2, int I2, int W1, int I1>
ap_fixed<W2, I2> convert(ap_fixed<W1, I1> f)
{
// Read fraction part as integer:
ap_fixed<max(W2, W1) + 1, max(I2, I1) + 1> result = f(W1 - I1 - 1, 0);
// Shift by the original number of bits in the fraction part
result >>= W1 - I1;
// Add the integer part
result += f(W1 - 1, W1 - I1);
return result;
}
I haven't tested this code well, so take it with a grain of salt.
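To make the 3.13 to 2.14 mapping from the question concrete, here is a minimal sketch that uses plain 16-bit integers as stand-ins for the raw bits of the fixed-point values. It does not use ap_fixed at all, and the helper names (to_q3_13, q3_13_to_q2_14, from_q2_14) are made up for illustration. The idea is that moving from 13 to 14 fractional bits doubles the raw integer, and values outside [-2, 2) have to be saturated because 2.14 has one integer bit fewer.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical helper: encode a real value as the raw integer of a 3.13 number
// (3 integer bits including sign, 13 fractional bits, representable range [-4, 4)).
std::int16_t to_q3_13(double x) {
    assert(x >= -4.0 && x < 4.0);
    long raw = std::lround(x * 8192.0);   // 8192 = 2^13
    if (raw > INT16_MAX) raw = INT16_MAX; // rounding just below 4.0 could overshoot
    return static_cast<std::int16_t>(raw);
}

// Hypothetical helper: re-scale a raw 3.13 value to a raw 2.14 value.
// One extra fractional bit means the raw integer doubles; the result saturates
// when the value does not fit into the 2.14 range [-2, 2).
std::int16_t q3_13_to_q2_14(std::int16_t raw) {
    std::int32_t wide = static_cast<std::int32_t>(raw) * 2;
    if (wide > INT16_MAX) wide = INT16_MAX;
    if (wide < INT16_MIN) wide = INT16_MIN;
    return static_cast<std::int16_t>(wide);
}

// Hypothetical helper: decode a raw 2.14 integer back to a real value.
double from_q2_14(std::int16_t raw) {
    return raw / 16384.0;                 // 16384 = 2^14
}
```

The raw 2.14 integer is the kind of value an integer-based CORDIC routine would take as input. Whether saturating or wrapping is the right choice for out-of-range angles depends on that routine.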

returns long double instead of double

I am writing a program in c++ where I want to find the epsilon of my pc.
I want the result to be double precision (which is 2.2204460492503131 E-16) but instead the output is 1.0842 E-019 which is the epsilon in long double precision.
My program is this:
#include <iostream>
double e = 1.0;
double x;
int main ()
{
for (int i = 0; e + 1.0!=1.0 ; i++)
{
std::cout<<e<<'\n';
x = e;
e/=2.0;
}
std::cout << "The epsilon of this Computer is "<< x <<'\n';
return 0;
}
Output std::numeric_limits<double>::epsilon() instead. std::numeric_limits is declared in the standard header <limits>.
A more usual technique, if you really must calculate it (rather than trusting your standard library to provide a correct value), is
double epsilon = 1.0;
while ((1.0 + 0.5 * epsilon) != 1.0)
epsilon *= 0.5;
Note that (although you haven't shown how you did it) it may actually be your long double calculation that is incorrect, since floating-point literals (like 1.0) are of type double by default, not long double. If you want the result to be of type long double, it would be advisable to give all of the literal values (1.0, 0.5) the L suffix, to force them to be of type long double.
Also remember to use appropriate formatting when streaming the resultant value to std::cout, to ensure output also has the accuracy/precision you need. The default settings (what you get if you don't control the formatting) may differ.
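Putting the pieces above together, a minimal sketch might look like this. The function names computed_epsilon and format_scientific are made up for illustration. The volatile temporary is an assumption on my part about the cause: it forces the sum to be stored as a 64-bit double, which matters on targets that evaluate intermediate results in a wider register.

```cpp
#include <cassert>
#include <iomanip>
#include <limits>
#include <sstream>
#include <string>

// Halve epsilon until adding half of it to 1.0 no longer changes 1.0.
double computed_epsilon() {
    double epsilon = 1.0;
    for (;;) {
        // volatile forces the sum to be rounded to double precision,
        // even if the hardware computes it in extended precision.
        volatile double sum = 1.0 + 0.5 * epsilon;
        if (sum == 1.0) break;
        epsilon *= 0.5;
    }
    return epsilon;
}

// Format a value in scientific notation with the requested number of
// digits after the decimal point, instead of relying on stream defaults.
std::string format_scientific(double value, int digits) {
    assert(digits >= 0);
    std::ostringstream out;
    out << std::scientific << std::setprecision(digits) << value;
    return out.str();
}
```

Comparing computed_epsilon() against std::numeric_limits<double>::epsilon() is a quick sanity check, and format_scientific(computed_epsilon(), 16) shows the value with the same number of digits as in the question.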

Float not returning the right decimal value

This is the code I have:
int resultInt = [ja.resultCount intValue];
float pages = resultInt / 10;
NSLog(@"%d",resultInt);
NSLog(@"%.2f",pages);
The resultInt comes back from a PHP script with the value 3559, so pages should be 355.9, but I get 355.00, which isn't right.
Use
float pages = resultInt / 10.0f;
int/int is int
but int/float or float/int is float
Edited for more explanation
It is important to remember that the resultant value of a mathematical operation is subject to the rules of the receiving variable's data type. The result of a division operation may yield a floating point value. However, if assigned to an integer the fractional part will be lost. Equally important, and less obvious, is the effect of an operation performed on several integers and assigned to a non-integer. In this case, the result is calculated as an integer before being implicitly converted. This means that although the resultant value is assigned to a floating point variable, the fractional part is still truncated unless at least one of the values is explicitly converted first. The following examples illustrate this:
int a = 7;
int b = 3;
int integerResult;
float floatResult;
integerResult = a / b; // integerResult = 2 (truncated)
floatResult = a / b; // floatResult = 2.0 (truncated)
floatResult = (float)a / b; // floatResult = 2.33333325
This has to do with the fact that you're using integer and not float.
Tell the variables that you are using that they are floats and you are done.
int resultInt = [ja.resultCount intValue];
float pages = (float)resultInt / 10.f;
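The same rule applied to the numbers from the question, as a small sketch in C++ (the arithmetic conversion rules involved are the ones Objective-C inherits from C). The function names are made up for illustration.

```cpp
#include <cassert>

// Both operands are int, so the division is done in integer arithmetic
// and the fraction is discarded before the result is converted to float.
float pages_truncated(int resultCount, int perPage) {
    assert(perPage > 0);
    return resultCount / perPage;
}

// Converting one operand to float first makes the whole division
// a floating-point division, so the fraction survives.
float pages_fractional(int resultCount, int perPage) {
    assert(perPage > 0);
    return static_cast<float>(resultCount) / perPage;
}
```

With 3559 results and 10 per page, pages_truncated gives 355.0 while pages_fractional gives approximately 355.9.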

Checking the value of a typedef alias

I'm trying to write some DSP code that will need to run in both floating and fixed point environments (the numeric type will be determined at compile time). I'd like to alias the particular numeric type using either a Macro or a typedef. Multiplication, division and other math functions will vary considerably in implementation depending on the numeric type, so I'll need some sort of switch to determine whether to include certain headers and perhaps alter the implementation based on the numeric type.
I'll give a short code snippet as an example ...
typedef float samp_t;
// or #define samp_t float (bad naming practice?)
samp_t multiply_samp_t(samp_t a, samp_t b){
return a*b;
}
/* An alternative in fixed point
typedef int samp_t;
#define RADIX 24
samp_t multiply_samp_t(samp_t a, samp_t b){
return (samp_t) ((((long) a)*((long) b)) >> RADIX);
}
*/
void main(void){
samp_t a,b,c;
a = 15;
b = 27;
c = multiply_samp_t(a,b);
}
So, how would one switch between the two different multiplications functions based on samp_t's type? Any recommendations or suggestions are welcome.
Thanks!
-Brant
Here is an example of choosing the functions corresponding to the data type at compile time:
#define SYSTEM_FLOAT 0
#define SYSTEM_INT 1
// main selection
#define TYPE_SYSTEM SYSTEM_FLOAT
#if TYPE_SYSTEM == SYSTEM_FLOAT
#define SAMP_T float
#define mult_samp_t mult_samp_t_float
#elif TYPE_SYSTEM == SYSTEM_INT
#define SAMP_T int
#define mult_samp_t mult_samp_t_int
#elif ...
#endif
void main(void){
SAMP_T a,b,c;
a = 15;
b = 27;
c = mult_samp_t(a,b);
}
Somewhere in the code you must define:
float mult_samp_t_float(float a, float b)
{
...
return a float;
}
int mult_samp_t_int(int a, int b)
{
...
return an int;
}
That is, the type system you select determines the entire set of functions you are going to use.
I still recommend using this scheme only if you can't make do with runtime selection of the function, because the approach presented here makes the code harder to debug (although I never had problems with it in the past).
(I encountered much worse things in code of professional OS :-) )
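For reference, here is a compilable sketch of the macro scheme above with both multiply functions filled in. The fixed-point layout (32-bit samples with RADIX = 24 fractional bits, taken from the question), the to_q8_24 helper, and the choice of SYSTEM_INT as the default are my assumptions, not part of the original answer.

```cpp
#include <cassert>
#include <cstdint>

#define SYSTEM_FLOAT 0
#define SYSTEM_INT   1

// Main selection; can be overridden on the compiler command line.
#ifndef TYPE_SYSTEM
#define TYPE_SYSTEM SYSTEM_INT
#endif

#define RADIX 24  // number of fractional bits in the fixed-point format

static float mult_samp_t_float(float a, float b) {
    return a * b;
}

// Widen to 64 bits so the intermediate product cannot overflow, then
// shift the extra RADIX fractional bits away. Right-shifting a negative
// value is arithmetic on common compilers (guaranteed only since C++20).
static std::int32_t mult_samp_t_int(std::int32_t a, std::int32_t b) {
    return static_cast<std::int32_t>((static_cast<std::int64_t>(a) * b) >> RADIX);
}

// Hypothetical helper: encode a real value in the 8.24 format, range [-128, 128).
static std::int32_t to_q8_24(double x) {
    assert(x >= -128.0 && x < 128.0);
    return static_cast<std::int32_t>(x * (1 << RADIX));
}

#if TYPE_SYSTEM == SYSTEM_FLOAT
typedef float samp_t;
#define mult_samp_t mult_samp_t_float
#elif TYPE_SYSTEM == SYSTEM_INT
typedef std::int32_t samp_t;
#define mult_samp_t mult_samp_t_int
#else
#error "unknown TYPE_SYSTEM"
#endif
```

Note that in the fixed-point build, assigning a = 15 directly would mean 15 / 2^24, not 15.0, so values have to be scaled first (for example with a helper like to_q8_24). With 24 fractional bits in a 32-bit sample, the product 15 * 27 = 405 from the question would also not fit into the [-128, 128) range.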