Matlab fixed point quantizer - matlab

Matlab's fixed point quantizer seems tricky to use; there are probably some hidden parameters. Here is an example:
b = fi(pi, 1, 21, 16) % use total length = 21-bit, fraction = 16-bit
ntBP = numerictype(1, 10, 9) % quantize to total length = 10-bit, fraction = 9-bit
c = quantize(b, ntBP) % this returns -0.8594
It looks like the problem is that Matlab does not take care of the sign bit:
bin(b) % return '000110010010000111111'
bin(c) % return '1001001000'. This is why the quantizer output is negative
The question is: is this the expected way to use the quantizer?
Thanks
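For reference, the -0.8594 can be reproduced without the fixed-point toolbox. The following is a minimal sketch in plain doubles, assuming quantize applies floor rounding and wrap-on-overflow (which I understand to be its defaults); the variable names are made up for illustration. numerictype(1, 10, 9) leaves no integer bits besides the sign, so it only covers [-1, 1), and pi wraps around:

```matlab
wordLen = 10;
x = pi;

% Fraction length 9: only the sign bit is left for the integer part
fracLen = 9;
stored = floor(x * 2^fracLen);        % 1608, too large for a signed 10-bit word
wrapped = mod(stored, 2^wordLen);     % keep the low 10 bits: 584 = '1001001000'
if wrapped >= 2^(wordLen - 1)         % top bit set, so the word reads as negative
    wrapped = wrapped - 2^wordLen;    % -440
end
c = wrapped / 2^fracLen               % -0.859375, displayed as -0.8594

% Fraction length 7: the range becomes [-4, 4), so pi fits without wrapping
fracLen2 = 7;
c2 = floor(x * 2^fracLen2) / 2^fracLen2   % 3.140625
```

If this assumption holds, the negative output is an overflow effect rather than a mishandled sign bit, and a target type with at least two integer bits (or a saturating overflow mode) would avoid it.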

Related

fixed point binary operation with matlab

I have these decimal values:
x1=-43.00488
x4=11.5048
y1=-11.5048
y4=-43.004
I converted them to their equivalent binary values in Q7.10 format.
These are the binary values:
% All of the binary values are signed and in Q7.10 format.
x1=1010100_1111111011
x4=0001011_1000000101
y1=1110100_0111111011
y4=1010100_1111111011
I want to do this operation with binary values in matlab :
% This line is equal to multiplying "((x1-x4) / (y1-y4))" with 2^10;
x1x4_div_y1y4 = ((x1-x4) / (y1-y4)) << 10
While trying to do this operation I ran into some difficulties.
Firstly, I couldn't declare the negative binary values in Matlab.
Secondly, are we allowed to do math operations on binary values, or should I do the operations on decimal values and then convert the result to binary?
What I need is to do this with binary operations, so that I can implement it in Verilog HDL.
a= ((-43.00488-11.5048) / (-11.5048+43.00488))*(2^10)
a =
-1.7720e+03
I am not sure whether these statements give the correct answer. Should I multiply by 2^10 or not?
I want to do the same operation using binary values. Can I do that in Matlab, and if so, how?
Thank you in advance.
Your question is not very clear. I think you probably need to think about what you want the fixed-point format of x1x4_div_y1y4 to be. I'm not sure if you really want to multiply by 2^10, or you just did that because you thought you needed to.
However, since you stated that's the operation you want to do, I will assume you really wanted to multiply by 2^10.
The code below converts the binary numbers to fixed point, does the calculation you want, then converts the result back to binary.
Your decimal result (-1772) was correct. You just need to convert it back to signed binary. However, be careful because this number cannot be represented in Q7.10 format (because you multiplied by 2^10, so now it's too large).
In the code below, I just assumed you want the result in signed Q16.8 format (which I interpret as 1 sign bit + 16 integer bits + 8 fractional bits). If you want something different, you can just change those numbers.
close all; clear all; clc;
% All of the binary values are signed and in Q7.10 format.
x1 = '10101001111111011';
x4 = '00010111000000101';
y1 = '11101000111111011';
y4 = '10101001111111011';
% Convert to signed integers
x1 = -bin2dec(x1(1))*2^16 + bin2dec(x1(2:end));
x4 = -bin2dec(x4(1))*2^16 + bin2dec(x4(2:end));
y1 = -bin2dec(y1(1))*2^16 + bin2dec(y1(2:end));
y4 = -bin2dec(y4(1))*2^16 + bin2dec(y4(2:end));
% Convert from integer to fixed point values
x1 = x1 / 2^10;
x4 = x4 / 2^10;
y1 = y1 / 2^10;
y4 = y4 / 2^10;
% The operation I want to do
x1x4_div_y1y4 = ((x1-x4) / (y1-y4)) * 2^10; % << 10
% Convert back to binary...
% Let's assume we want signed Q16.8 output
INTEGER_BITS = 16;
FRACTIONAL_BITS = 8;
% Convert from fixed-point to integer
x1x4_div_y1y4 = round(x1x4_div_y1y4 * 2^FRACTIONAL_BITS);
% Handle the sign bit
if x1x4_div_y1y4 < 0
    x1x4_div_y1y4 = x1x4_div_y1y4 + 2*2^(INTEGER_BITS + FRACTIONAL_BITS);
end
% Convert to binary
x1x4_div_y1y4 = dec2bin(x1x4_div_y1y4, 1+INTEGER_BITS+FRACTIONAL_BITS)
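Since the goal is a Verilog implementation, the same number can also be reached with integer operations only. This is a rough sketch, assuming the Q7.10 words are treated as raw signed integers (value = raw / 2^10) and the shift is applied to the numerator before dividing; the names X1, X4, Y1, Y4, num, den and q are illustrative and not part of the code above:

```matlab
% Raw signed integers behind the Q7.10 words (value = raw / 2^10)
X1 = -44037;  % 1010100_1111111011
X4 =  11781;  % 0001011_1000000101
Y1 = -11781;  % 1110100_0111111011
Y4 = -44037;  % 1010100_1111111011

% The 2^10 scale factors cancel in the ratio, so shifting the numerator
% left by 10 bits first gives the ratio scaled by 2^10 directly.
num = (X1 - X4) * 2^10;
den = Y1 - Y4;
q = fix(num / den)   % truncating division, as an integer divider would do
```

For these inputs q comes out as -1772, matching the decimal calculation. Shifting before the division matters: dividing first and shifting afterwards would throw away the fractional bits of the quotient.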

Summing very large numbers without using toolboxes

I am trying to sum very large numbers in MATLAB, such as e^800 and e^1000 and obtain an answer.
I know that in Double-Precision, the largest number I can represent is 1.8 * 10^308, otherwise I get Inf, which I am getting when trying to sum these numbers.
My question is, how do I go about estimating an answer for sums of very, very large numbers like these without using vpa, or some other toolbox?
Should I use strings? Is it possible to do this using logs? Can I represent the floats as m x 2^E, and if so, how do I take a number such as e^700 and convert it to that form? If the number is larger than the threshold for Inf, should I divide it by two and store it in two different variables?
For example, how would I obtain an approximate answer for:
e^700 + e^800 + e^900 + e^1000 ?
A possible approximation is to use the rounded values of these numbers (I personally used Wolfram|Alpha), then perform "long addition" as they teach in elementary school:
function sumStr = q57847408()
% store rounded values as string:
e700r = "10142320547350045094553295952312676152046795722430733487805362812493517025075236830454816031618297136953899163768858065865979600395888785678282243008887402599998988678389656623693619501668117889366505232839133350791146179734135738674857067797623379884901489612849999201100199130430066930357357609994944589";
e800r = "272637457211256656736477954636726975796659226578982795071066647118106329569950664167039352195586786006860427256761029240367497446044798868927677691427770056726553709171916768600252121000026950958713667265709829230666049302755903290190813628112360876270335261689183230096592218807453604259932239625718007773351636778976141601237086887204646030033802";
e900r = "7328814222307421705188664731793809962200803372470257400807463551580529988383143818044446210332341895120636693403927733397752413275206079839254190792861282973356634441244426690921723184222561912289431824879574706220963893719030715472100992004193705579194389741613195142957118770070062108395593116134031340597082860041712861324644992840377291211724061562384383156190256314590053986874606962229";
e1000r = "197007111401704699388887935224332312531693798532384578995280299138506385078244119347497807656302688993096381798752022693598298173054461289923262783660152825232320535169584566756192271567602788071422466826314006855168508653497941660316045367817938092905299728580132869945856470286534375900456564355589156220422320260518826112288638358372248724725214506150418881937494100871264232248436315760560377439930623959705844189509050047074217568";
% pad to the same length with zeros on the left:
padded = pad([e700r; e800r; e900r; e1000r], 'left', '0');
% convert the padded value to an array of digits:
dig = uint8(char(padded) - '0');
% some helpful computations for later:
colSum = [0 uint8(sum(dig, 1))]; % extra 0 is to prevent overflow of MSB
remainder = mod(colSum, 10);
carry = idivide(colSum, 10, 'floor');
result = remainder; % keeps result defined even if no column produces a carry
while any(carry) % can also be a 'for' loop with nDigit iterations (at most)
    result = remainder + circshift(carry, -1);
    remainder = mod(result, 10);
    carry = idivide(result, 10, 'floor');
end
% remove leading zero (at most one):
if ~result(1)
    result = result(2:end);
end
% convert result back to string:
sumStr = string(char(result + '0'));
This gives the (rounded) result of:
197007111401704699388887935224332312531693805861198801302702004327171116872054081548301452764017301057216669857236647803717912876737392925607579016038517631441936559738211677036898431095605804172455718237264052427496060405708350697523284591075347592055157466708515626775854212347372496361426842057599220506613838622595904885345364347680768544809390466197511254544019946918140384750254735105245290662192955421993462796807599177706158188
Decimal Approximation:
function [m, new_exponent] = base10_mantissa_exponent(base, exponent)
    exact_exp = exponent*log10(abs(base));
    new_exponent = floor(exact_exp);
    m = power(10, exact_exp - new_exponent);
end
So the value e^600 would become 3.7731 * 10^260.
And the value 117^150 would become 1.6899 * 10^310.
To add these two values together, I took the difference between the two exponents and divided the mantissa of the smaller term by 10 raised to that difference. Then it is just a matter of adding the mantissas together.
mantissaA = 3.7731;
exponentA = 260;
mantissaB = 1.6899;
exponentB = 310;
expDiff = abs(exponentA - exponentB); % renamed to avoid shadowing the built-in diff
if exponentA < exponentB
    mantissaA = mantissaA / (10^expDiff);
    finalExponent = exponentB;
else % also covers equal exponents, where expDiff is 0
    mantissaB = mantissaB / (10^expDiff);
    finalExponent = exponentA;
end
finalMantissa = mantissaA + mantissaB;
This was important for me as I was performing sums such as:
(Σ e^x) / (Σ x e^x)
From x=1 to x=1000.
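The question also asks whether logs can be used. One common way is the log-sum-exp trick: factor out the largest term so every exponential that is actually evaluated is at most 1. A minimal sketch for e^700 + e^800 + e^900 + e^1000, with variable names chosen here for illustration:

```matlab
logTerms = [700 800 900 1000];   % natural logs of the terms e^700 ... e^1000
m = max(logTerms);

% ln(sum(e^t)) = m + ln(sum(e^(t - m))); no term overflows because t - m <= 0
logSum = m + log(sum(exp(logTerms - m)));

% Express the total as sumMantissa * 10^sumExponent
log10Sum = logSum / log(10);
sumExponent = floor(log10Sum);
sumMantissa = 10^(log10Sum - sumExponent);

fprintf('%.4f * 10^%d\n', sumMantissa, sumExponent);
```

This gives roughly 1.9701 * 10^434, which agrees with the leading digits of the long-addition result. In double precision the smaller terms vanish next to e^1000, so this only estimates the magnitude; the string approach is the one to use if every digit matters.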

Quantizing Double Type Input to Double Type Output in MATLAB

I'm trying to quantize a set of double-type samples with a 128-level uniform quantizer, and I want my output to be double type as well. When I try to use "quantize", Matlab gives an error: Inputs of class 'double' are not supported. I tried "uencode" as well, but its answer was nonsense. I'm quite new to Matlab and I've been working on this for hours. Any help appreciated. Thanks
uencode is supposed to give integer results; that's the point of it. The key point, though, is that it assumes a symmetric range going from -v to +v, where v is the peak value (1 unless you pass a different peak as the third argument). So if your data runs from 0 to 10 and you use a peak of 10, the result looks like nonsense because the values are quantized over the range -10 to 10.
In any event, you actually want both the encoded value and the quantized value. I wrote a simple function to do this. It even has a little help text (just type "help ValueQuantizer"). I also made it flexible, so it should work with data of any shape (assuming you have enough memory): a vector, 2-D array, 3-D, 4-D, and so on.
Here is an example to show how it works. The input is drawn from a uniform distribution on -0.5 to 3.5, which shows that, unlike uencode, my function works with nonsymmetric data and with negative values:
a = 4*rand(2,4,2) - .5
[encoded_vals, quant_values] = ValueQuantizer(a, 3)
produces
a(:,:,1) =
0.6041 2.1204 -0.0240 3.3390
2.2188 0.1504 1.4935 0.8615
a(:,:,2) =
1.8411 2.5051 1.5238 3.0636
0.3952 0.5204 2.2963 3.3372
encoded_vals(:,:,1) =
1 4 0 7
5 0 3 2
encoded_vals(:,:,2) =
4 5 3 6
1 1 5 7
quant_values(:,:,1) =
0.4564 1.8977 -0.0240 3.3390
2.3781 -0.0240 1.4173 0.9368
quant_values(:,:,2) =
1.8977 2.3781 1.4173 2.8585
0.4564 0.4564 2.3781 3.3390
So you can see it returns the encoded values as integers (just like uencode, but without the symmetric-range assumption). Unlike uencode, it returns everything as doubles rather than converting to uint8/16/32. The important part is that it also returns the quantized values, which is what you wanted.
Here is the function:
function [encoded_vals, quant_values] = ValueQuantizer(U, N)
% ValueQuantizer uniformly quantizes and encodes the input into N-bits
% it then returns the unsigned integer encoded values and the actual
% quantized values
%
% encoded_vals = ValueQuantizer(U,N) uniformly quantizes and encodes data
% in U. The output range is integer values in the range [0 2^N-1]
%
% [encoded_vals, quant_values] = ValueQuantizer(U, N) uniformly quantizes
% and encodes data in U. encoded_vals range is integer values [0 2^N-1]
% quant_values shows the original data U converted to the quantized level
% representing the number
if (N<2)
    % error (rather than disp + return) so the outputs are never left unassigned
    error('N is out of range. N must be >= 2')
end
quant_values = double(U(:));
max_val = max(quant_values);
min_val = min(quant_values);
%quantizes the data
quanta_size = (max_val-min_val) / (2^N -1);
quant_values = (quant_values-min_val) ./ quanta_size;
%reshapes the data
quant_values = reshape(quant_values, size(U));
encoded_vals = round(quant_values);
%returns the original numbers in their new quantized form
quant_values = (encoded_vals .* quanta_size) + min_val;
end
As far as I can tell this should always work, but I haven't done extensive testing. Good luck.
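For the 128-level case in the question, the same idea can be written inline. This is a small sketch on made-up sample data, assuming the quantizer should span exactly the minimum to the maximum of the input; x, nLevels, step, codes and xq are illustrative names:

```matlab
x = [0 0.3 2.5 7.1 10];          % hypothetical double-type samples
nLevels = 128;                    % 128 levels, i.e. 7 bits

lo = min(x);
hi = max(x);
step = (hi - lo) / (nLevels - 1); % spacing between adjacent levels

codes = round((x - lo) / step);   % integer level index, 0 .. 127
xq = codes * step + lo;           % quantized samples, still of class double
```

Each quantized sample lies within half a step of the original, and xq stays a double array, which is what the question asked for.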

Error:Maximum variable size allowed by the program is exceeded. while using sub2ind

Please suggest how to sort out this issue:
nNodes = 50400;
adj = sparse(nNodes,nNodes);
adj(sub2ind([nNodes nNodes], ind, ind + 1)) = 1; %ind is a vector of indices
??? Maximum variable size allowed by the program is exceeded.
I think the problem is 32/64-bit related. If you have a 32 bit processor, you can address at most
2^32 = 4.294967296e+09
elements. If you have a 64-bit processor, this number increases to
2^64 = 1.844674407370955e+19
Unfortunately, for reasons that are at best vague to me, Matlab does not use this full range. To find out the actual range used by Matlab, issue the following command:
[~,maxSize] = computer
On a 32-bit system, this gives
>> [~,maxSize] = computer
maxSize =
2.147483647000000e+09
>> log2(maxSize)
ans =
3.099999999932819e+01
and on a 64-bit system, it gives
>> [~,maxSize] = computer
maxSize =
2.814749767106550e+14
>> log2(maxSize)
ans =
47.999999999999993
So apparently, on a 32-bit system, Matlab only uses 31 bits to address elements, which gives you the upper limit.
If anyone can clarify why Matlab only uses 31 bits on a 32-bit system, and only 48 bits on a 64-bit system, that'd be awesome :)
Internally, Matlab always uses linear indices to access elements in an array (it probably just uses a C-style array or so), which implies for your adj matrix that its final element is
finEl = nNodes*nNodes = 2.54016e+09
This, unfortunately, is larger than the maximum addressable with 31 bits. Therefore, on the 32-bit system,
>> adj(end) = 1;
??? Maximum variable size allowed by the program is exceeded.
while this command poses no problem at all on the 64-bit system.
You'll have to use a workaround on a 32-bit system:
nNodes = 50400;
% split sparse array up into 4 pieces
adj{1,1} = sparse(nNodes/2,nNodes/2); adj{1,2} = sparse(nNodes/2,nNodes/2);
adj{2,1} = sparse(nNodes/2,nNodes/2); adj{2,2} = sparse(nNodes/2,nNodes/2);
% assign or index values to HUGE sparse arrays
function ret = indHuge(mat, inds, vals)
    % get size of cell
    sz = size(mat);
    % return current values when not given new values
    if nargin < 3
        % I have to leave this up to you...
    % otherwise, assign new values
    else
        % I have to leave this up to you...
    end
end
% now initialize desired elements to 1
adj = indHuge(adj, sub2ind([nNodes nNodes], ind, ind + 1), 1);
I just had the idea to cast all this into a proper class, so that you can use much more intuitive syntax...but that's a whole lot more than I have time for now :)
adj = sparse(ind, ind + 1, ones(size(ind)), nNodes, nNodes, length(ind));
This worked fine...
And if we need to access the last element of the sparse matrix, we can do so with adj(nNodes, nNodes), whereas adj(nNodes * nNodes) throws an error.
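The subscript form of sparse avoids the limit because nNodes*nNodes is never used as a linear index. A short sketch, assuming ind is simply the chain 1:nNodes-1 (a stand-in, since the original ind is not shown):

```matlab
nNodes = 50400;
ind = (1:nNodes-1)';    % hypothetical index vector, for illustration only

% Build the matrix directly from (row, column, value) triplets
adj = sparse(ind, ind + 1, 1, nNodes, nNodes);

% Subscript access works for any element, including the last one
lastEl = full(adj(nNodes, nNodes));
firstLink = full(adj(1, 2));
```

Building the matrix in one sparse call is also usually faster than assigning into an existing sparse matrix element by element.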

double to int16 (generation or conversion?)

fsamp = 2;
deltaf = fsamp/nfft; % FFT resolution
Nfreqtimestwo = 128; % Used below
Nsines = Nfreqtimestwo/2 - 1; % Number of sine waves
fmult = [1:Nsines]; % multiplicative factor
freq_fund = fsamp/Nfreqtimestwo;
freq_sines = freq_fund.*fmult;
omega = 2*pi*freq_sines;
r = int16(0);
for(ii=1:Nsines)
r = r + cos((omega(ii)/fsamp)*(0:messageLen-1));
end
This is the code I am currently using to create my input signal. However, the end result r is a 32,768-element array of doubles. I would like the best approximation of that using int16; note that the amplitude doesn't really matter. My best approach so far, I think, has been:
fsamp = 2;
deltaf = fsamp/nfft; % FFT resolution
Nfreqtimestwo = 128; % Used below
Nsines = Nfreqtimestwo/2 - 1; % Number of sine waves
fmult = [1:Nsines]; % multiplicative factor
freq_fund = fsamp/Nfreqtimestwo;
freq_sines = freq_fund.*fmult;
omega = 2*pi*freq_sines;
r = int16(0);
for(ii=1:Nsines)
r = r + int16(8192*cos((omega(ii)/fsamp)*(0:messageLen-1)));
end
Are there any better ways to approach this?
EDIT
The reason I want to convert the doubles to ints is that this list is used in an embedded system and eventually goes to a 16-bit DAC, so no doubles are allowed.
int16(vector) converts vector from double to int16, and this is the preferred way. The alternative is to define all your constants as int16s, in which case MATLAB will give you the result as an int16. However, this is cumbersome, so stick with what you have (unless you absolutely have to do it the other way).
Also, unrelated to your actual question, you can ditch the loop by building all the cosines as one matrix and summing along one dimension with sum. I'll leave that for you to try out :)
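One thing to watch with the per-term conversion: at sample 0 all 63 cosines equal 1, so 63 * 8192 is far beyond the int16 limit of 32767 and the sum saturates. Since the amplitude does not matter, one option is to sum in double and scale once at the end. A sketch under the assumption that messageLen is 32768 (inferred from the array size mentioned in the question); n, r16 and the other names are illustrative:

```matlab
fsamp = 2;
Nfreqtimestwo = 128;
Nsines = Nfreqtimestwo/2 - 1;          % 63 sine waves
messageLen = 32768;                    % assumed, not given in the question

omega = 2*pi*(fsamp/Nfreqtimestwo)*(1:Nsines);
n = 0:messageLen-1;

% Outer product gives one row per sine; sum down the rows, all in double
r = sum(cos((omega(:)/fsamp) * n), 1);

% Scale so the largest magnitude maps to the int16 limit, then convert once
r16 = int16(round(r / max(abs(r)) * 32767));
```

This uses the full 16-bit range without clipping, at the cost of holding a 63-by-32768 double matrix (about 16 MB) while the sum is formed.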