fixed point binary operation with matlab - matlab

I have these decimal values:
x1=-43.00488
x4=11.5048
y1=-11.5048
y4=-43.004
I converted them to their equal binary values, in the format Q7.10
So, these are the binary values:
% All of the binary values are signed and in Q7.10 format.
x1=1010100_1111111011
x4=0001011_1000000101
y1=1110100_0111111011
y4=1010100_1111111011
I want to do this operation with binary values in matlab :
% This line is equal to multiplying "((x1-x4) / (y1-y4))" with 2^10;
x1x4_div_y1y4 = ((x1-x4) / (y1-y4)) << 10
While trying to do this operation I had some difficulties,
firstly, I couldn't declare the negative binary values in Matlab.
secondly, are we allowed to do math operations with binary values or should I do the operations with decimal values then convert them to binary values?
But what I need is to do this operation with binary operations so I can implement it in verilog hdl.
a= ((-43.00488-11.5048) / (-11.5048+43.00488))*(2^10)
a =
-1.7720e+03
I am not sure if these statements are given the true answer. Should I multiply it with 2^10 or so...
I want to do the same operation using binary values. Can I do that in Matlab? And how to do that?
Thank you in advance.

Your question is not very clear. I think you probably need to think about what you want the fixed-point format of x1x4_div_y1y4 to be. I'm not sure if you really want to multiply by 2^10, or you just did that because you thought you needed to.
However, since you stated that's the operation you want to do, I will assume you really wanted to multiply by 2^10.
The code below converts the binary numbers to fixed point, does the calculation you want, then converts the result back to binary.
Your decimal result (-1772) was correct. You just need to convert it back to signed binary. However, be careful because this number cannot be represented in Q7.10 format (because you multiplied by 2^10, so now it's too large).
In the code below, I just assumed you want the result in signed Q16.8 format (which I interpret as 1 sign bit + 16 integer bits + 8 fractional bits). If you want something different, you can just change those numbers.
close all; clear all; clc;
% All of the binary values are signed and in Q7.10 format.
x1 = '10101001111111011';
x4 = '00010111000000101';
y1 = '11101000111111011';
y4 = '10101001111111011';
% Convert to signed integers
x1 = -bin2dec(x1(1))*2^16 + bin2dec(x1(2:end));
x4 = -bin2dec(x4(1))*2^16 + bin2dec(x4(2:end));
y1 = -bin2dec(y1(1))*2^16 + bin2dec(y1(2:end));
y4 = -bin2dec(y4(1))*2^16 + bin2dec(y4(2:end));
% Convert from integer to fixed point values
x1 = x1 / 2^10;
x4 = x4 / 2^10;
y1 = y1 / 2^10;
y4 = y4 / 2^10;
% The operation I want to do
x1x4_div_y1y4 = ((x1-x4) / (y1-y4)) * 2^10; % << 10
% Convert back to binary...
% Let's assume we want signed Q16.8 output
INTEGER_BITS = 16;
FRACTIONAL_BITS = 8;
% Convert from fixed-point to integer
x1x4_div_y1y4 = round(x1x4_div_y1y4 * 2^FRACTIONAL_BITS);
% Handle the sign bit
if x1x4_div_y1y4 < 0
x1x4_div_y1y4 = x1x4_div_y1y4 + 2*2^(INTEGER_BITS + FRACTIONAL_BITS);
end
% Convert to binary
x1x4_div_y1y4 = dec2bin(x1x4_div_y1y4, 1+INTEGER_BITS+FRACTIONAL_BITS)

Related

Matlab fixed point quantizer

It seems tricky to use Matlab's fixed point quantizer, probably some hidden parameters. Here is an example:
b = fi(pi, 1, 21, 16) % use total length = 21-bit, fraction = 16-bit
ntBP = numerictype(1, 10, 9) % quantize to total length = 10-bit, fraction = 9-bit
c = quantize(b, ntBP) % this returns -0.8594
Looks like the problem is matlab didn't take care of the sign bit:
bin(b) % return '000110010010000111111'
bin(c) % return '1001001000'. This is why the quantizer output is negative
The question is, is this the expected way to use the quantizer or not?
Thanks

Summing very large numbers without using toolboxes

I am trying to sum very large numbers in MATLAB, such as e^800 and e^1000 and obtain an answer.
I know that in Double-Precision, the largest number I can represent is 1.8 * 10^308, otherwise I get Inf, which I am getting when trying to sum these numbers.
My question is, how do I go about estimating an answer for sums of very, very large numbers like these without using vpa, or some other toolbox?
Should I use strings? It is possible to do this using logs? Can I represent the floats as m x 2^E and if so, how do I take a number such as e^700 and convert it to that? If the number is larger than the threshold for Inf, should I divide it by two, and store it in two different variables?
For example, how would I obtain an approximate answer for:
e^700 + e^800 + e^900 + e^1000 ?
A possible approximation is to use the rounded values of these numbers (I personally used Wolfram|Alpha), then perform "long addition" as they teach in elementary school:
function sumStr = q57847408()
% store rounded values as string:
e700r = "10142320547350045094553295952312676152046795722430733487805362812493517025075236830454816031618297136953899163768858065865979600395888785678282243008887402599998988678389656623693619501668117889366505232839133350791146179734135738674857067797623379884901489612849999201100199130430066930357357609994944589";
e800r = "272637457211256656736477954636726975796659226578982795071066647118106329569950664167039352195586786006860427256761029240367497446044798868927677691427770056726553709171916768600252121000026950958713667265709829230666049302755903290190813628112360876270335261689183230096592218807453604259932239625718007773351636778976141601237086887204646030033802";
e900r = "7328814222307421705188664731793809962200803372470257400807463551580529988383143818044446210332341895120636693403927733397752413275206079839254190792861282973356634441244426690921723184222561912289431824879574706220963893719030715472100992004193705579194389741613195142957118770070062108395593116134031340597082860041712861324644992840377291211724061562384383156190256314590053986874606962229";
e1000r = "197007111401704699388887935224332312531693798532384578995280299138506385078244119347497807656302688993096381798752022693598298173054461289923262783660152825232320535169584566756192271567602788071422466826314006855168508653497941660316045367817938092905299728580132869945856470286534375900456564355589156220422320260518826112288638358372248724725214506150418881937494100871264232248436315760560377439930623959705844189509050047074217568";
% pad to the same length with zeros on the left:
padded = pad([e700r; e800r; e900r; e1000r], 'left', '0');
% convert the padded value to an array of digits:
dig = uint8(char(padded) - '0');
% some helpful computations for later:
colSum = [0 uint8(sum(dig, 1))]; % extra 0 is to prevent overflow of MSB
remainder = mod(colSum, 10);
carry = idivide(colSum, 10, 'floor');
while any(carry) % can also be a 'for' loop with nDigit iterations (at most)
result = remainder + circshift(carry, -1);
remainder = mod(result, 10);
carry = idivide(result, 10, 'floor');
end
% remove leading zero (at most one):
if ~result(1)
result = result(2:end);
end
% convert result back to string:
sumStr = string(char(result + '0'));
This gives the (rounded) result of:
197007111401704699388887935224332312531693805861198801302702004327171116872054081548301452764017301057216669857236647803717912876737392925607579016038517631441936559738211677036898431095605804172455718237264052427496060405708350697523284591075347592055157466708515626775854212347372496361426842057599220506613838622595904885345364347680768544809390466197511254544019946918140384750254735105245290662192955421993462796807599177706158188
Typos fixed from before.
Decimal Approximation:
function [m, new_exponent] = base10_mantissa_exponent(base, exponent)
exact_exp = exponent*log10(abs(base));
new_exponent = floor(exact_exp);
m = power(10, exact_exp - new_exponent);
end
So the value e600 would become 3.7731 * 10260.
And the value 117150 would become 1.6899 * 10310.
To add these two values together, I took the difference between the two exponents and divided the mantissa of the smaller term by it. Then it's just as simple as adding the mantissas together.
mantissaA = 3.7731;
exponentA = 260;
mantissaB = 1.6899;
exponentB = 310;
diff = abs(exponentA - exponentB);
if exponentA < exponentB
mantissaA = mantissaA / (10^diff);
finalExponent = exponentB;
elseif exponentB < exponentA
mantissaB = mantissaB / (10^diff);
finalExponent = exponentA;
end
finalMantissa = mantissaA + mantissaB;
This was important for me as I was performing sums such as:
(Σ ex) / (Σ xex)
From x=1 to x=1000.

How to fix the difference in precision between double data type and uint64

I'm in the process of implementing Three Fish block cipher using MATLAB. At first, I implemented the algorithm on uint8 numbers to validate my code. Every thing was OK and the decryption was successful. But when I replaced the numbers to uint64 the plain text did not retrieved correctly.
I traced the rounds results again and over again to find the reason, but I couldn't find it so far. There is difference in the first four digits between encryption and decryption, that is, along the rounds x encrypted as 9824265115183455531, but it decrypts as 9824265115183455488.
I think the reason behind this difference is in the functions AddMod64 and SubMod64 to find arithmetic modulo 2 to the power 64. but really I could not fix it so far.
I know that
double(2^64) = 18446744073709552000
and
uint64(2^64) = 18446744073709551615 % z = ( x + y ) % 2^64
function z = AddMod64(x , y)
m = uint64(2^64);
z = double(mod(mod(double(x),m)+mod(double(y),m),m));
end
% z = (x - y ) % 2^64
function z = SubMod64(x , y)
m = uint64(2^64);
z = double(mod(mod(double(x),m) - mod(double(y),m),m));
end
double(2^64) is already the wrong result, the double type can hold only up to 2^52-1 as an integer without rounding.
Also, when you do uint64(2^64), the power is computed using double, giving the wrong result, which you then cast to uint64. And because the maximum value that a uint64 van hold is 2^64-1, that whole operation is wrong.
Use maxint instead:
m = maxint('uint64');
To do modulo addition in MATLAB is rather tricky, because MATLAB does saturated arithmetic with integers. You need to test for overflow before doing the computation.
if x > m - y
x = y - (m - x + 1);
else
x = x + y
end

Convolution Kernel using a user defined function. How to deal with negative pixel values?

I've declared a function that will be used to calculate the convolution of an image using an arbitrary 3x3 kernel. I also created a script that will prompt the user to select both an image as well as enter the convolution kernel of their choice. However, I do not know how to go about dealing with negative pixel values that will arise for various kernels. How would I implement a condition into my script that will deal with these negative values?
This is my function:
function y = convul(x,m,H,W)
y=zeros(H,W);
for i=2:(H-1)
for j=2:(W-1)
Z1=(x(i-1,j-1))*(m(1,1));
Z2=(x(i-1,j))*(m(1,2));
Z3=(x(i-1,j+1))*(m(1,3));
Z4=(x(i,j-1))*(m(2,1));
Z5=(x(i,j))*(m(2,2));
Z6=(x(i,j+1))*(m(2,3));
Z7=(x(i+1,j-1))*(m(3,1));
Z8=(x(i+1,j))*(m(3,2));
Z9=(x(i+1,j+1))*(m(3,3));
y(i,j)=Z1+Z2+Z3+Z4+Z5+Z6+Z7+Z8+Z9;
end
end
And this is the script that I've written that prompts the user to enter an image and select a kernel of their choice:
[file,path]=uigetfile('*.bmp');
x = imread(fullfile(path,file));
x_info=imfinfo(fullfile(path,file));
W=x_info.Width;
H=x_info.Height;
L=x_info.NumColormapEntries;
prompt='Enter a convulation kernel m: ';
m=input(prompt)/9;
y=convul(x,m,H,W);
imshow(y,[0,(L-1)]);
I've tried to use the absolute value of the convolution, as well as attempting to locate negatives in the output image, but nothing worked.
This is the original image:
This is the image I get when I use the kernel [-1 -1 -1;-1 9 -1; -1 -1 -1]:
I don't know what I'm doing wrong.
MATLAB is rather unique in how it handles operations between different data types. If x is uint8 (as it likely is in this case), and m is double (as it likely is in this case), then this operation:
Z1=(x(i-1,j-1))*(m(1,1));
returns a uint8 value, not a double. Arithmetic in MATLAB always takes the type of the non-double argument. (And you cannot do arithmetic between two different types unless one of them is double.)
MATLAB does integer arithmetic with saturation. That means that uint8(5) * -1 gives 0, not -5, because uint8 cannot represent a negative value.
So all your Z1..Z9 are uint8 values, negative results have been set to 0. Now you add all of these, again with saturation, leading to a value of at most 255. This value is assigned to the output (a double). So it looks like you are doing your computations correctly and outputting a double array, but you are still clamping your result in an odd way.
A Correct implementation would cast each of the values of x to double before multiplying by a potentially negative number. For example:
for i = 2:H-1
for j = 2:W-1
s = 0;
s = s + double(x(i-1,j-1))*m(1,1);
s = s + double(x(i-1,j))*m(1,2);
s = s + double(x(i-1,j+1))*m(1,3);
s = s + double(x(i,j-1))*m(2,1);
s = s + double(x(i,j))*m(2,2);
s = s + double(x(i,j+1))*m(2,3);
s = s + double(x(i+1,j-1))*m(3,1);
s = s + double(x(i+1,j))*m(3,2);
s = s + double(x(i+1,j+1))*m(3,3);
y(i,j) = s;
end
end
(Note that I removed your use of 9 different variables, I think this is cleaner, and I also removed a lot of your unnecessary brackets!)
A simpler implementation would be:
for i = 2:H-1
for j = 2:W-1
s = double(x(i-1:i+1,j-1:j+1)) .* m;
y(i,j) = sum(s(:));
end
end

Quantizing Double Type Input to Double Type Output in MATLAB

I'm trying to quantize a set of double type samples with 128 level uniform quantizer and I want my output to be double type aswell. When I try to use "quantize" matlab gives an error: Inputs of class 'double' are not supported. I tried "uencode" as well but its answer was nonsense. I'm quite new to matlab and I've been working on this for hours. Any help appriciated. Thanks
uencode is supposed to give integer results. Thats the point of it. but the key point is that it assumes a symmetric range. going from -x to +x where x is the largest or smallest value in your data set. So if your data is from 0-10 your result looks like nonsense because it quantizes the values on the range -10 to 10.
In any event, you actually want the encoded value and the quantized value. I wrote a simple function to do this. It even has little help instructions (really just type "help ValueQuantizer"). I also made it very flexible so it should work with any data size (assuming you have enough memory) it can be a vector, 2d array, 3d, 4d....etc
here is an example to see how it works. Our number is a Uniform distribution from -0.5 to 3.5 this shows that unlike uencode, my function works with nonsymmetric data, and that it works with negative values
a = 4*rand(2,4,2) - .5
[encoded_vals, quant_values] = ValueQuantizer(a, 3)
produces
a(:,:,1) =
0.6041 2.1204 -0.0240 3.3390
2.2188 0.1504 1.4935 0.8615
a(:,:,2) =
1.8411 2.5051 1.5238 3.0636
0.3952 0.5204 2.2963 3.3372
encoded_vals(:,:,1) =
1 4 0 7
5 0 3 2
encoded_vals(:,:,2) =
4 5 3 6
1 1 5 7
quant_values(:,:,1) =
0.4564 1.8977 -0.0240 3.3390
2.3781 -0.0240 1.4173 0.9368
quant_values(:,:,2) =
1.8977 2.3781 1.4173 2.8585
0.4564 0.4564 2.3781 3.3390
so you can see it returns the encoded values as integers (just like uencode but without the weird symmetric assumption). Unlike uencode, this just returns everything as doubles rather than converting to uint8/16/32. The important part is it also returns the quantized values, which is what you wanted
here is the function
function [encoded_vals, quant_values] = ValueQuantizer(U, N)
% ValueQuantizer uniformly quantizes and encodes the input into N-bits
% it then returns the unsigned integer encoded values and the actual
% quantized values
%
% encoded_vals = ValueQuantizer(U,N) uniformly quantizes and encodes data
% in U. The output range is integer values in the range [0 2^N-1]
%
% [encoded_vals, quant_values] = ValueQuantizer(U, N) uniformly quantizes
% and encodes data in U. encoded_vals range is integer values [0 2^N-1]
% quant_values shows the original data U converted to the quantized level
% representing the number
if (N<2)
disp('N is out of range. N must be > 2')
return;
end
quant_values = double(U(:));
max_val = max(quant_values);
min_val = min(quant_values);
%quantizes the data
quanta_size = (max_val-min_val) / (2^N -1);
quant_values = (quant_values-min_val) ./ quanta_size;
%reshapes the data
quant_values = reshape(quant_values, size(U));
encoded_vals = round(quant_values);
%returns the original numbers in their new quantized form
quant_values = (encoded_vals .* quanta_size) + min_val;
end
As far as I can tell this should always work, but I haven't done extensive testing, good luck