Data types in VHDL - filtering

I am trying to implement a filter in VHDL. All input and output vectors are signed 16-bit values (1.15 format; the first bit is the sign bit). I plan to declare all signals and variables as STD_LOGIC / STD_LOGIC_VECTOR types. All the calculations will be based on 2's complement.
There are packages (in IEEE) such as std_logic_1164 (std_logic types & related functions), std_logic_arith (arithmetic functions), std_logic_signed (signed arithmetic functions), and std_logic_unsigned (unsigned arithmetic functions).
In order to perform all the 2's-complement operations in this filter implementation on STD_LOGIC / STD_LOGIC_VECTOR types, which library should I use? Should I use both std_logic_signed.ALL and std_logic_1164.ALL?

std_logic_(un)signed and std_logic_arith are NOT standard VHDL packages. They were written by Synopsys nearly 30 years ago and were never part of the VHDL standard.
The appropriate standard packages here are numeric_std from VHDL-93 (for the unsigned and signed types) or, from VHDL-2008, numeric_std_unsigned for arithmetic directly on std_logic_vector. Also available in VHDL-2008 is fixed_pkg, which lets the user define fixed-point types and do arithmetic with them, e.g.:
signal a,b : sfixed(0 downto -15); -- 1.15 signed fixed
signal c : sfixed(1 downto -15);
....
a <= to_sfixed(0.12345, a);
b <= to_sfixed(-0.54321, b);
c <= a + b;
This is fixed-point, 2's-complement integer arithmetic.
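If you'd rather keep everything as STD_LOGIC_VECTOR, here is a minimal numeric_std sketch of a 2's-complement add (the signal names are illustrative, and overflow handling is left to you):
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;
...
signal x, y, sum : std_logic_vector(15 downto 0); -- 1.15 data
...
-- reinterpret the vectors as signed, add in 2's complement, convert back
sum <= std_logic_vector(signed(x) + signed(y));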

Why doesn't 'd0 extend the full width of the signal (as '0 does)?

Using SystemVerilog and ModelSim SE 2020.1, I was surprised to see the following behavior. bus_address is a 64-bit signal: input logic [63:0] bus_address.
Using '0:
.bus_address ('0),
Using 'd0:
.bus_address ('d0),
I also tried Riviera-Pro 2020.04 (too buggy; we gave up using it and are in a dispute with Aldec), with 'd0 and with '0.
[waveform screenshots omitted: '0 drives the full 64-bit signal to zero; 'd0 does not]
Investigation/Answer:
11.3.3 Using integer literals in expressions: "An unsized, based integer (e.g., 'd12, 'sd12)"
5.7.1 Integer literal constants: "The number of bits that make up an unsized number (which is a simple decimal number or a number with a base specifier but no size specification) shall be at least 32. Unsized unsigned literal constants where the high-order bit is unknown (X or x) or three-state (Z or z) shall be extended to the size of the expression containing the literal constant."
That was tricky; I thought it would zero all the other bits, as '0 does.
I hope the spec's authors will think harder before defining such nonsensical behaviors.
This problem has more to do with port connections of mismatched widths than anything to do with numeric literals. It's just that the issue does not present itself when using the fill literals, because a fill literal automatically sizes itself, eliminating the port-width mismatch.
The problem you see exists whether you use literals or other signals, as in this example:
module top;
  wire [31:0] a = 0;
  dut d(a);
endmodule

module dut(input wire [63:0] p1);
  initial $strobeb(p1);
endmodule
According to section 23.3.3.7 Port connections with dissimilar net types (net and port collapsing), the nets a and p1 might get merged into a single 64-bit net, but only the lower 32 bits remain driven, giving 64'hzzzzzzzz00000000.
If you change the port connection to a sized literal, dut d(32'b0);, you see the same behavior 64'hzzzzzzzz00000000.
Now let's get back to the unsized numeric literal 'd0. Unsized is a misnomer: all numbers have a size. It's just that the size is implicit and never the size you want it to be. 😮 How many people write {'b1,'b0,'b1,'b0} thinking they've just written the same thing as 4'b1010? This is actually illegal per the LRM, but some tools silently interpret it as {32'b1,32'b0,32'b1,32'b0}.
Just never use an unsized literal.
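If the intent is an all-zero value on a port of any width, either of these forms avoids the partially driven net (a sketch reusing the dut module above; the instance names are mine):
dut d_fill ('0);     // the fill literal self-sizes to the 64-bit port: p1 = 64'h0
dut d_sized (64'd0); // or size the literal explicitly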

What is packed and unpacked and extended packed data

I have been going through the Intel intrinsics, and every function works on integers, floats, or doubles that are packed, unpacked, or extended packed.
It seems like this should be answered somewhere on the internet, but I can't find the answer at all.
What is that packing thing?
Well, I've just been searching for the answer to the same question, also without success, so I can only guess.
Intel introduced packed and scalar instructions as early as in their MMX technology. For example, they introduced the function
__m64 _mm_add_pi8 (__m64 a, __m64 b)
At that time there was no such thing as "extended packed". The only data type was __m64, and all operations worked on integers.
With SSE there came 128-bit registers and operations on floating point numbers. However, SSE2 included a superset of MMX operations on integers performed in 128-bit registers. For example,
__m128i _mm_add_epi8 (__m128i a, __m128i b)
Here, for the first time, we see the "ep" ("extended packed") part of the function name. Why was it introduced? I believe it was a solution to the problem of the name _mm_add_pi8 already being taken by the MMX intrinsic listed above. The interface of SSE/AVX is in the C language, where there is no overloading of function names.
With AVX, Intel chose a different strategy and started to add the register length just after the opening "_mm" letters, cf.:
__m256i _mm256_add_epi8 (__m256i a, __m256i b)
__m512i _mm512_add_epi8 (__m512i a, __m512i b)
Why they chose "ep" rather than "p" here is a mystery, and irrelevant to programmers. Actually, they seem to use "p" for operations on floats and doubles and "ep" for integers.
__m128d _mm_add_pd (__m128d a, __m128d b); // "d": function operates on doubles
__m256 _mm256_add_ps (__m256 a, __m256 b); // "s": function operates on floats
Perhaps this goes back to the transition from MMX to SSE, where "ep" was introduced for operations on integers (MMX handled no floats), combined with an attempt to keep AVX mnemonics as close to the SSE ones as possible.
Thus, basically, from the perspective of a programmer there is no difference between "ep" ("extended packed") and "p" ("packed"), for we are already aware of the register length that we target in our code.
As for the next part of the question, "unpacking" belongs to a completely different category of notions than "scalar" and "packed". This is rather a colloquial term for a particular data rearrangement or shuffle, like rotation or shift.
The reason for using "epi" in the name of intrinsics like _mm256_unpackhi_epi16 is that it is a truly vector (not scalar) function on a vector of 16-bit integer elements. Notice that here "unpack" belongs to the part of the function name that describes its action (like mul, add, or permute), whereas "s" / "p" / "ep" (scalar, packed, extended packed) belong to the part describing the operation mode (scalar for "s", vector for "p" or "ep").
(There are no scalar-integer instructions that operate between two XMM registers, but "si" does appear in the intrinsic name for movd eax, xmm0: _mm_cvtsi128_si32. There are a few similar intrinsics.)
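To make the scalar/packed distinction concrete, here is a small sketch of my own (not from the thread) using the SSE intrinsics; only the lowest lane differs between the two forms:
#include <immintrin.h>
#include <stdio.h>

int main(void) {
    __m128 a = _mm_set_ps(4.0f, 3.0f, 2.0f, 1.0f);   /* lanes, low to high: 1 2 3 4 */
    __m128 b = _mm_set_ps(40.0f, 30.0f, 20.0f, 10.0f);
    float p[4], s[4];
    _mm_storeu_ps(p, _mm_add_ps(a, b));  /* packed: all four lanes added */
    _mm_storeu_ps(s, _mm_add_ss(a, b));  /* scalar: lane 0 added, lanes 1-3 copied from a */
    printf("packed: %g %g %g %g\n", p[0], p[1], p[2], p[3]);  /* 11 22 33 44 */
    printf("scalar: %g %g %g %g\n", s[0], s[1], s[2], s[3]);  /* 11 2 3 4 */
    return 0;
}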

Any Software to convert float to any-precision FPU? [or Matlab solution]

I want to convert a lot of float numbers to a multiple-precision floating-point format like '0x4049000000.....', perform calculations, change the precision, then perform calculations again, and so on...
I know the theory, as here, but I want software, an online tool, or a Matlab solution to convert 6000+ numbers in the 0.0 ~ 51.0 range to a floating-point format (like IEEE single precision).
Any Suggestions?
Note: I need custom precision, where I can specify the number of digits of the mantissa and exponent.
EDIT: It is also called Radix-Independent Floating-Point as described here and here
2nd EDIT: IEEE single-precision convert and IEEE double-precision convert are examples. You enter any float number, e.g. 3.1454, and you get its IEEE (single- or double-precision) float value in binary/hex. @rick
A quick look at the VHDL-2008 floating point library float_pkg shows that it instantiates a generic package with generic parameters set to match IEEE single precision floats.
package float_pkg is new IEEE.float_generic_pkg (...)
You should find this library as part of your simulator installation, wherever you usually look for the standard libraries such as numeric_std. On my system it is at /opt/ghdl-0.32/lib/gcc/x86_64-unknown-linux-gnu/4.9.2/vhdl/src/ieee2008/float_pkg.vhdl - if you can't find it on your system, it's available online; searching for "VHDL 2008 float package" ought to get you there.
You can instantiate this generic package (using float_pkg as an example) for any precision you like (within reasonable limits).
A quick look at IEEE.float_generic_pkg shows that it declares functions to_float and to_real, I'd like to think they have the obvious behaviour.
So the answer is ... yes.
-- real to float
function to_float (
  arg                  : REAL;
  size_res             : UNRESOLVED_float;
  constant round_style : round_type := float_round_style;  -- rounding option
  constant denormalize : BOOLEAN    := float_denormalize)  -- use IEEE extended FP
  return UNRESOLVED_float;
and
-- float to real
function to_real (
  arg                  : UNRESOLVED_float;                 -- floating point input
  constant check_error : BOOLEAN := float_check_error;     -- check for errors
  constant denormalize : BOOLEAN := float_denormalize)     -- use IEEE extended FP
  return REAL;
float_pkg should be all you need. It gives you some predefined floating-point types (e.g., IEEE single and double precision), plus the ability to define custom floating-point values with an arbitrary number of bits for the fraction and the exponent.
Based on your numeric example, here is some code to convert from a real value to single, double, and quadruple precision floats.
library ieee;
use ieee.float_pkg.all;

entity floating_point_demo is
end;

architecture example of floating_point_demo is
begin
  process
    variable real_input_value: real := 49.0215463456;
    variable single_precision_float: float32;
    variable double_precision_float: float64;
    variable quadruple_precision_float: float128;
  begin
    single_precision_float := to_float(real_input_value, single_precision_float);
    report to_string(single_precision_float);
    report to_hstring(single_precision_float);

    double_precision_float := to_float(real_input_value, double_precision_float);
    report to_string(double_precision_float);
    report to_hstring(double_precision_float);

    quadruple_precision_float := to_float(real_input_value, quadruple_precision_float);
    report to_string(quadruple_precision_float);
    report to_hstring(quadruple_precision_float);

    wait;
  end process;
end;
The example above uses types float32, float64, and float128 from float_pkg. However, you can achieve the same effect using objects of type float, whose size can be defined at their declarations:
variable single_precision_float: float(8 downto -23);
variable double_precision_float: float(11 downto -52);
variable quadruple_precision_float: float(15 downto -112);
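Nonstandard sizes follow the same index convention (cf. float32 = float(8 downto -23) above); for example, a 16-bit half-precision value, with 5 exponent bits and 10 fraction bits, would be:
variable half_precision_float: float(5 downto -10); -- 1 sign + 5 exponent + 10 fraction bits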
To convert from a float to a real value, you can use the to_real() function:
-- To print the real value of a float object:
report to_string(to_real(quadruple_precision_float));
-- To convert from a float and assign to a real:
real_value := to_real(quadruple_precision_float);

Real numbers (constants) in genetic programming

I can't figure out how a genetically programmed AI can determine when there should be a constant in the final equation. If I take the formula F(m) = m*a, say F(m) = m*9.8, how can the AI know what the real number 9.8 actually is? I understand that instead of putting the final number in the binary tree, you can put a symbol that denotes a constant and then later calculate or guess its value in some way.
Thank you
Given a predefined set of constants (part of the terminal set), they'll be combined to form new constants (with a tree representation, any sub-tree that has only numeric constants as leaves can itself be thought of as a new numeric constant).
Even with a single constant (c) the system will create:
the 1.0 constant (constant divided by itself: c / c);
the 2.0 constant (1.0 + 1.0 i.e. c / c + c / c);
the 0.5 constant (1.0 / 2.0 i.e. c / c / (c / c + c / c));
many constants will be created this way (if you are lucky... 9.8).
Sometimes special terminals named "ephemeral random constant" (Koza) are used. For each ephemeral in the initial population, a random number in a specified range is generated. Then these random constants are moved around and combined.
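To make that concrete, here is a minimal sketch in C of how such an ephemeral terminal might be sampled (the helper name and the range handling are my own, not from any particular GP system):
#include <stdlib.h>

/* Sampled once, when the terminal is created; the value is then fixed
   and is only moved around and recombined by crossover and mutation. */
double make_ephemeral_constant(double lo, double hi) {
    return lo + (hi - lo) * ((double)rand() / RAND_MAX);
}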
Anyway, even with ephemeral random constants, GP can be hard-pressed to generate the right constants (Koza said "the finding of numeric constants is a skeleton in the GP closet").
So other techniques can be used during or after the evolution, e.g. numeric mutation or hill climbing.
Such hybrid systems often show significant improvements in success rates (at least for regression problems).

Fixed point arithmetic

I'm currently using Microchip's fixed-point library, but I think this applies to most fixed-point libraries. It supports Q15 and Q15.16 types, 16-bit and 32-bit data respectively.
One thing I noticed is that it does not include add, subtract, multiply or divide functions.
How am I supposed to do these? Is it as simple as adding/subtracting/multiplying/dividing them using integer math? I can see addition and subtraction working, but multiplication or division wouldn't take care of the fractional part...?
The Microchip library includes functions for adding and subtracting that deal with underflow/overflow (_Q15add and _Q15sub).
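A usage sketch with those library calls (the values are illustrative; I'm assuming the saturating overflow behavior the library documents):
#include <libq.h>

_Q15 a = 16384;           /* 0.5 in Q15 */
_Q15 b = 24576;           /* 0.75 in Q15 */
_Q15 sum = _Q15add(a, b); /* 0.5 + 0.75 overflows; saturates to 0x7FFF */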
Multiplication can be implemented as an assembly function (I think the code is good - this is from memory).
C calling prototype is:
extern _Q15 Q15mpy(_Q15 a, _Q15 b);
The routine (placed in a .s source file in your project) is:
.global _Q15mpy
_Q15mpy:
    mul.ss  w0, w1, w2    ; signed multiply of the parameters; 32-bit product in w3:w2
    sl      w2, w2        ; place the most significant bit of w2 in carry
    rlc     w3, w0        ; rotate w3 left through carry; result lands in w0
    return                ; return value in w0
.end
Remember to include libq.h.
This routine does a left-shift of one bit rather than a right-shift of 15 bits on the result. There are no overflow concerns, because Q15 numbers always have a magnitude <= 1.
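For reference, a portable C sketch equivalent to the routine above (truncating, and not Microchip's library code):
#include <stdint.h>

typedef int16_t q15_t;

/* Widen to 32 bits, multiply (Q30 intermediate), then shift right by 15
   to restore Q15 scaling -- the same bits the asm above extracts by
   shifting the 32-bit product left once and keeping the high word. */
static q15_t q15_mul(q15_t a, q15_t b) {
    return (q15_t)(((int32_t)a * (int32_t)b) >> 15);
}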
It turns out that all the basic arithmetic functions are performed with the native operators, due to how the numbers are represented: divide uses the / operator and multiply the * operator, and these compile to simple 32-bit divides and multiplies.