Function of $clog2(N) in Mojo IDE - system-verilog

I am a beginner in this, but I was wondering what exactly is the function of $clog2(N) in general? Some websites say that it is the number of address bits needed for a memory of size N and not the number of bits needed to express the value N. What does that mean?

IEEE Std 1800-2012 § 20.8.1 Integer math functions
The system function $clog2 shall return the ceiling of the log base 2 of the argument (the log rounded up to an integer value). The argument can be an integer or an arbitrary sized vector value. The argument shall be treated as an unsigned value, and an argument value of 0 shall produce a result of 0.
This system function can be used to compute the minimum address width necessary to address a memory of a given size or the minimum vector width necessary to represent a given number of states.
For example:
integer result;
result = $clog2(n);
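To make the distinction in the question concrete, here is a minimal sketch in Python (used only for illustration; the helper names `clog2` and `bits_for_value` are made up for this example and are not part of any tool). It follows the rule quoted above, including the special case that an argument of 0 yields 0, and compares it with the number of bits needed to write the value N itself.

```python
def clog2(n: int) -> int:
    """Ceiling of log2(n); returns 0 for n == 0, as the quoted standard text specifies."""
    if n <= 1:
        return 0
    # (n - 1).bit_length() equals ceil(log2(n)) for every n >= 2
    return (n - 1).bit_length()


def bits_for_value(n: int) -> int:
    """Number of bits needed to write the unsigned value n itself."""
    return max(1, n.bit_length())


for n in (1, 2, 8, 9, 1024):
    # columns: N, address bits for a memory with N entries, bits to store the value N
    print(n, clog2(n), bits_for_value(n))
```

A memory with 8 entries has addresses 0 to 7, so 3 address bits are enough, while the value 8 itself (binary 1000) needs 4 bits. That is the difference the websites are pointing at.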


Why doesn't 'd0 extend the full width of the signal (as '0 does)?

Using SystemVerilog and Modelsim SE 2020.1, I was surprised to see a behavior:
bus_address is a 64b signal input logic [63:0] bus_address
Using '0
.bus_address ('0),
Using 'd0
.bus_address ('d0),
Riviera-Pro 2020.04 (too buggy, we gave up using it and we are in a dispute with Aldec)
'd0:
'0:
Investigation/Answer:
11.3.3 Using integer literals in expressions: An unsized, based integer (e.g., 'd12 , 'sd12 )
5.7.1 Integer literal constants:
The number of bits that make up an unsized number (which is a simple
decimal number or a number with a base specifier but no size
specification) shall be at least 32. Unsized unsigned literal
constants where the high-order bit is unknown ( X or x ) or
three-state ( Z or z ) shall be extended to the size of the expression
containing the literal constant.
That was tricky and I thought it would set 0 all the other bits as '0 does.
I hope specs' authors will think more when defining such non-sense behaviors.
This problem has more to do with port connections of mismatched sizes than with numeric literals. The issue simply does not present itself when using the fill literals, because a fill literal automatically sizes itself and so eliminates the port width mismatch.
The problem you see exists whether you use literals or other signals like in this example:
module top;
wire [31:0] a = 0;
dut d(a);
endmodule
module dut(input wire [63:0] p1);
initial $strobeb(p1);
endmodule
According to section 23.3.3.7 Port connections with dissimilar net types (net and port collapsing), the nets a and p1 might get merged into a single 64-bit net, but only the lower 32-bits remain driven, or 64'hzzzzzzzz00000000.
If you change the port connection to a sized literal, dut d(32'b0);, you see the same behavior 64'hzzzzzzzz00000000.
Now let's get back to the unsized numeric literal 'd0. Unsized is a misnomer: all numbers have a size. It's just that the size is implicit and never the size you want it to be. 😮 How many people write {'b1,'b0,'b1,'b0} thinking they've just written the same thing as 4'b1010? This is actually illegal in the LRM, but some tools silently interpret it as {32'b1,32'b0,32'b1,32'b0}.
Just never use an unsized literal.
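To see how far apart the two readings of that concatenation are, here is a small Python model (illustration only; `concat` is a hypothetical helper that mimics a SystemVerilog concatenation of sized fields, and the 32-bit width is the interpretation the answer above attributes to some tools).

```python
def concat(fields):
    """Concatenate (value, width) pairs; the leftmost field is the most significant."""
    value, width = 0, 0
    for v, w in fields:
        # shift what we have so far and append the new field, masked to its width
        value = (value << w) | (v & ((1 << w) - 1))
        width += w
    return value, width


# What the author probably meant: four 1-bit fields, i.e. 4'b1010
intended = concat([(1, 1), (0, 1), (1, 1), (0, 1)])

# What a tool that treats each unsized literal as 32 bits would build
as_32bit = concat([(1, 32), (0, 32), (1, 32), (0, 32)])

print(intended)      # (value, width) of the 4-bit reading
print(as_32bit[1])   # total width of the 32-bit-per-literal reading
```

The second result is 128 bits wide with ones at bit positions 96 and 32, which is nothing like a 4-bit 1010.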

How to convert values to N bits of resolution in MATLAB?

My computer uses 32 bits of resolution as default. I'm writing a script that involves taking measurements with a multimeter that has N bits of resolution. How do I convert the values to that?
For example, if I have a RNG that gives 1000 values
nums = randn(1,1000);
and I use an N-bit multimeter to read those values, how would I get the values to reflect that?
I currently have
meas = round(nums,N-1);
but it's giving me N digits, not N bits. The original random numbers are unbounded, but the resolution of the multimeter is the limitation; how to implement the limitation is what I'm looking for.
Edit I: I'm talking about the resolution of measurement, not the bounds of the numbers. The original values are unbounded. The accuracy of the measured values should be limited by the resolution.
Edit II: I revised the question to try to be a bit clearer.
randn doesn’t produce bounded numbers. Let’s say you are producing 32-bit integers instead:
nums = randi([0,2^32-1],1,n);
To drop the bottom 32-N bits, simply divide by an appropriate value and round (or take the floor):
nums = round(nums/(2^(32-N)));
Do note that we only use floating-point arithmetic here: the numbers are integer-valued, but they are not stored as an integer type. You can do a similar operation using actual integers if you need that.
Also, obviously, N should be lower than 32. You cannot invent new bits. If N is larger, the code above will add zero bits at the bottom of the number.
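As a sketch of the "actual integers" variant mentioned above, here is the same idea in Python (chosen for illustration; `drop_low_bits` is a made-up name). It uses a right shift, which corresponds to taking the floor rather than rounding, and it assumes non-negative inputs that fit in 32 bits.

```python
import random


def drop_low_bits(value: int, n_bits: int, total_bits: int = 32) -> int:
    """Keep the top n_bits of a total_bits-wide unsigned integer (floor, not round)."""
    if not 0 < n_bits <= total_bits:
        raise ValueError("n_bits must be between 1 and total_bits")
    return value >> (total_bits - n_bits)


# 1000 random 32-bit readings reduced to 12 bits of resolution
readings = [random.randrange(2**32) for _ in range(1000)]
reduced = [drop_low_bits(r, 12) for r in readings]
print(max(reduced) < 2**12)
```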
With a multimeter, the range is likely something like -M V to M V with a constant resolution, and you configure M by selecting the range.
This is fixed-point math. My answer will not use the fixed-point toolbox because I don't have it available; if you have it, you could use it to get simpler code.
You can generate the integer values with the intended resolution, then rescale it to the intended range.
F=2^N-1 %Maximum integer value
X=randi([0,F],100,1)
X*2*M/F-M %Rescale, divide by the integer range, multiply by the intended range. Then offset by intended minimum.
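If the values already exist (as in the question) rather than being generated as integers, the same rescaling can be run in the other direction. The following Python sketch is one way to do it, under the assumption that the meter clips to the range -M to M and has 2^N evenly spaced levels; `quantize` is a hypothetical helper, and F plays the same role as in the code above.

```python
def quantize(v: float, n_bits: int, m: float) -> float:
    """Snap v to the nearest of 2**n_bits evenly spaced levels in [-m, m]."""
    levels = 2**n_bits - 1               # F in the answer above: maximum integer code
    clipped = max(-m, min(m, v))         # assumption: out-of-range inputs saturate
    code = round((clipped + m) / (2 * m) * levels)   # integer code 0..F
    return code * 2 * m / levels - m     # rescale back to volts


samples = [-1.0, -0.3, 0.0, 0.42, 1.0, 5.0]
print([quantize(s, 3, 1.0) for s in samples])
```

With N = 3 and M = 1 every output is one of only 8 possible values, which is the resolution limit the question asks about.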

Why is the product of two positive integers a negative integer?

This semester I took a systems programming course.
Why is 50000*50000 negative?
I am trying to understand the logic behind this.
Here is the screenshot of the slide
slide image
32-bit signed integers are stored by using bits 0-30 as the number and bit 31 indicating the sign of the number.
This means that the maximum value that can be represented is 2,147,483,647 (all bits from 0-30 are set, bit 31 is 0 indicating a positive number).
The product of 50,000 and 50,000 is 2,500,000,000, which is greater than this number, so you have what is called an overflow. This means that data has "overflowed" from its expected bounds (the bottom 31 bits) into the sign bit.
You now have bit 31 set, indicating that this is a negative number. To figure out a negative number from its binary representation, you take the ones' complement (flip all the bits), add one and then throw a negative sign in front of it.
Be careful when you take the ones' complement that you limit yourself to a 32-bit range... you shouldn't be including bits higher than bit 31.
Check out signed number representations for more information.
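The steps above can be sketched in Python, whose integers never overflow, so the 32-bit behaviour has to be simulated by hand. Both helper names are made up for this illustration, and two's complement wraparound is assumed.

```python
MASK32 = 0xFFFFFFFF


def wrap_int32(x: int) -> int:
    """Keep only the low 32 bits of x and read them as a signed two's complement value."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def decode_by_hand(bits: int) -> int:
    """Decode a 32-bit pattern the way the answer describes: flip, add one, negate."""
    if bits & 0x80000000:                      # bit 31 set: negative number
        return -(((~bits) & MASK32) + 1)       # stay inside the 32-bit range
    return bits


product = 50000 * 50000                        # exact value in Python
print(product, wrap_int32(product), decode_by_hand(product & MASK32))
```

Both routes arrive at the same negative number, and a product that fits, such as 40000*40000, comes through unchanged.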
Sample Program Pseudo Code
Print --> ("Size of int: " + (Integer.SIZE/8) + " bytes.");
int a=50000;
int b=50000;
Print --> (" Product of a and b " + a*b);
Output :
Size of int: 4 bytes.
Product of a and b:-1794967296
Analysis :
4 bytes= 4*8= 32bits.
Since signed int can hold negative values, one-bit is used for sign (- or +), so bits available for numeric range=31.
Number range = -(2^31) , 0 and (2^31-1)
[one positive number is sacrificed for 0]
-2147483648, 0 and 2147483647
Maximum possible positive int = 2147483647 (greater than 1600000000, so 40000*40000 is fine)
Actual Product 50000*50000=2500000000 (greater than 2147483647)
In practice many portable C programs assume that signed integer overflow wraps around reliably using two's complement arithmetic.
Yet the C standard says that program behavior is undefined on overflow, and in a few cases C programs do not work on some modern implementations because their overflows do not wrap around as their authors expected.
http://www.gnu.org/software/autoconf/manual/autoconf-2.62/html_node/Integer-Overflow.html
This is because in most programming languages, the integer data type has a fixed size.
That means that each integer type has a defined MIN and MAX value.
For example in C# MAX INT is 2147483647 and MIN is -2147483648
In PHP 32 bits it's 2147483647 and -2147483648
In PHP 64 bits it's 9223372036854775807 and -9223372036854775808
What happens when you try to go over that value? The computer performs what's called an integer overflow, and the value loops back around to the min value.
In other words, in C# 2147483647 + 1 = -2147483648 (assuming you use an integer datatype, not long or float). That is exactly what happens with 50000 * 50000: it goes over the max value and continues counting from the min value.
The exact min and max values are dependent on the language used, the platform the code is built, the platform the code is run on and the static type of the value.
Hope it clears everything up for you!

How is eps() calculated in MATLAB?

The eps routine in MATLAB essentially returns the positive distance between floating point numbers. It can take an optional argument, too.
My question: How does MATLAB calculate this value? (Does it use a lookup table, or does it use some algorithm to calculate it at runtime, or something else...?)
Related: how could it be calculated in any language providing bit access, given a floating point number?
Wikipedia has quite the page on it.
Specifically for MATLAB, eps is 2^(-52), as MATLAB uses double precision by default. The bit layout of a double is as follows:
It's one bit for the sign, 11 for the exponent and the rest for the fraction.
The MATLAB documentation on floating point numbers also show this.
d = eps(x), where x has data type single or double, returns the positive distance from abs(x) to the next larger floating-point number of the same precision as x.
As not all fractions are equally closely spaced on the number line, different fractions will show different distances to the next floating-point within the same precision. Their bit representations are:
1.0 = 0 01111111111 0000000000000000000000000000000000000000000000000000
0.9 = 0 01111111110 1100110011001100110011001100110011001100110011001101
The sign for both is positive (0), the exponents are not equal, and of course their fractions are vastly different. This means that the distances to the next floating-point number, written out as bit patterns, are:
dec2bin(typecast(eps(1.0), 'uint64'), 64) = 0 01111001011 0000000000000000000000000000000000000000000000000000
dec2bin(typecast(eps(0.9), 'uint64'), 64) = 0 01111001010 0000000000000000000000000000000000000000000000000000
which are not the same, hence eps(0.9)~=eps(1.0).
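For the "any language providing bit access" part of the question, one possible approach (a sketch, not a claim about how MATLAB does it internally) is to reinterpret the double as a 64-bit integer, add one to get the next larger double, and subtract. The Python version below uses the standard `struct` module; the function names are made up, and it assumes a finite input smaller than the largest double.

```python
import struct


def double_bits(x: float) -> int:
    """Reinterpret an IEEE 754 double as an unsigned 64-bit integer."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]


def bits_to_double(b: int) -> float:
    """Reinterpret an unsigned 64-bit integer as an IEEE 754 double."""
    return struct.unpack("<d", struct.pack("<Q", b))[0]


def eps_from_bits(x: float) -> float:
    """Distance from abs(x) to the next larger double (finite inputs only)."""
    x = abs(x)
    return bits_to_double(double_bits(x) + 1) - x


def fields(x: float):
    """Split the bit pattern into (sign, exponent, fraction) strings."""
    b = format(double_bits(x), "064b")
    return b[0], b[1:12], b[12:]


print(eps_from_bits(1.0), eps_from_bits(0.9))
print(fields(eps_from_bits(1.0)))
```

Adding one to the integer view works because, for non-negative doubles, consecutive bit patterns correspond to consecutive floating-point values.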
Here is some insight into eps which will help you to write an algorithm.
See that eps(1) = 2^(-52). Now, say you want to compute the eps of 17179869183.9. Note that I have chosen a number which is 0.1 less than 2^34 (in other words, something like 2^(33.9999...)). To compute eps of this, you can compute log2 of the number, which would be ~ 33.99999... as mentioned before. Take the floor() of this number and add it to -52, since eps(1) = 2^(-52) and the given number is about 2^(33.999...). Therefore, log2(eps(17179869183.9)) = -52 + 33 = -19, i.e. eps(17179869183.9) = 2^(-19).
If you take a number which is fractionally more than 2^34, e.g., 17179869184.1, then the log2(eps(17179869184.1)) = -18. This also shows that the eps value will change for the numbers that are integer powers of your base (or radix), in this case 2. Since eps value only changes at those numbers which are integer powers of 2, we take floor of the power. You will be able to get the perfect value of eps for any number using this. I hope it is clear.
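The rule described above (take the floor of log2 of the number and add -52) can be sketched in Python as follows. To avoid rounding trouble in log2 near powers of two, this version reads the exponent with `math.frexp` instead; `eps_from_exponent` is a hypothetical name, and the sketch only covers normal (not subnormal, zero, infinite or NaN) doubles.

```python
import math


def eps_from_exponent(x: float) -> float:
    """2**(floor(log2(|x|)) - 52) for a normal double x."""
    # frexp returns (m, e) with |x| = m * 2**e and 0.5 <= m < 1,
    # so floor(log2(|x|)) is e - 1
    _, e = math.frexp(abs(x))
    return math.ldexp(1.0, (e - 1) - 52)


for x in (1.0, 0.9, 17179869183.9, 17179869184.1):
    print(x, math.log2(eps_from_exponent(x)))
```

The two values on either side of 2^34 give exponents of -19 and -18, matching the worked example. Recent Python versions also offer `math.ulp`, which should agree with this for normal numbers.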
MATLAB uses (along with other languages) the IEEE754 standard for representing real floating point numbers.
In this format the bits allocated for approximating the actual [1] real number, usually 32 for single or 64 for double precision, are grouped into three groups:
1 bit for determining the sign, s.
8 (or 11) bits for exponent, e.
23 (or 52) bits for the fraction, f.
Then a real number, n, is approximated by the following three - term - relation:
n = (-1)^s * 2^(e - bias) * (1 + fraction)
where the bias shifts the stored exponent downward [2], so that the exponent term can describe numbers between 0 and 1 as well as numbers greater than 1.
Now, the gap reflects the fact that real numbers do not map perfectly to their finite, 32- or 64-bit representations. Moreover, a range of real numbers that differ by an absolute value less than eps maps to a single value in computer memory, i.e. if you assign values around val to variables var_1 ... var_n:
var_1 = val - offset
...
var_i = val;
...
var_n = val + offset
where
offset < eps(val) / 2
Then:
var_1 = var_2 = ... = var_i = ... = var_n.
The gap is determined from the second term containing the exponent (or characteristic):
2^(e - bias)
in the above relation [3], which determines the "scale" of the "line" on which the approximated numbers are located. The larger the numbers, the larger the distance between them and the less precise they are; conversely, the smaller the numbers, the more densely located their representations are and, consequently, the more accurate.
In practice, to determine the gap of a specific number, eps(number), you can start by adding / subtracting a gradually increasing small number until the initial value of the number of interest changes - this will give you the gap in that (positive or negative) direction, i.e. eps(number) / 2.
To check possible implementations of MATLAB's eps (or ULP, unit in the last place, as it is called in other languages), you could search for ULP implementations in C, C++ or Java, which are the languages MATLAB is written in.
1. Real numbers are infinitely precise, i.e. they could be written with arbitrary precision, with any number of digits after the decimal point.
2. Usually around half of the range: in single precision, 8 bits give decimal values from 0 to 2^8 - 1 = 255, and about half of that is 127, i.e. 2^(e - 127).
3. It can be thought that 2^(e - bias) represents the most significant digits of the number, i.e. the digits that describe how big the number is, as opposed to the least significant digits that describe its precise location. The larger the term containing the exponent, the smaller the significance of the 23 bits of the fraction.

How to use Bitxor for Double Numbers?

I want to use xor on my double numbers in MATLAB, but bitxor only works for integer numbers. Is there a function that could convert double to int in MATLAB?
The functions you are looking for might be: int8(number), int16(number), uint32(number). Any of them will convert a double to an integer, but you must pick the best one for the result you want to achieve. Remember that you cannot cast from double to integer without rounding the number.
If I understood you correctly, you could create a function that simply removes the "comma" (the decimal point) from the double by multiplying your starting value by 2^n, casting it to an integer using any of the functions mentioned earlier, performing whatever you want, and then returning the comma to its original position by dividing the number by 2^n.
Multiplying the starting value by 2^n is a hack that will decrease the rounding error.
The perfect value for n would be the number of digits after the comma, if this number is relatively small.
Please also specify why you are trying to do this. This doesn't seem to be the optimal solution.
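A minimal sketch of that scale, cast, xor, rescale idea, written in Python for illustration (`scaled_xor` is a made-up name; it assumes non-negative inputs and that n fractional binary digits are enough for your data):

```python
def scaled_xor(a: float, b: float, n: int) -> float:
    """XOR two non-negative numbers after scaling both by 2**n and rounding to integers."""
    scale = 2**n
    ia = round(a * scale)        # rounding happens here if a needs more than n fractional bits
    ib = round(b * scale)
    return (ia ^ ib) / scale     # move the binary point back


# 1.5 -> 6 and 2.25 -> 9 when scaled by 2**2; 6 xor 9 is 15, and 15 / 4 is 3.75
print(scaled_xor(1.5, 2.25, 2))
```

Values that are not exact multiples of 2^(-n) get rounded before the xor, which is the rounding error the answer warns about.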
You can just cast to an integer:
a = 1.003
int8(a)
ans =
1
That gives you an 8-bit signed integer. You can also get other sizes, e.g. int16, or unsigned types, e.g. uint8, depending on what you want to do.