Convert 64-bit numbers from binary to decimal using uint64 - MATLAB

I want to convert 64-bit numbers from binary to decimal. Since bin2dec only supports up to 52 bits, I thought I could roll my own function and use uint64 to go beyond this limit:
function [dec] = my_bin2dec(bin)
v = uint64(length(bin)-1:-1:0);
base = uint64(2).^v;
dec = uint64(sum(uint64(base.*(uint64(bin-'0')))));
end
However, it does not work as expected:
my_bin2dec('111000000000000000000000000000000000001010110101011101000001110')
ans =
8070450532270651392
my_bin2dec('111000000000000000000000000000000000001010110101011101000001111')
ans =
8070450532270651392
Whereas this is the correct result:
(111000000000000000000000000000000000001010110101011101000001110)bin
= (8070450532270651918)dec
(111000000000000000000000000000000000001010110101011101000001111)bin
= (8070450532270651919)dec
What am I missing? It seems like there is some operation still performed using 52-bit double arithmetic, but I don't know which one.
I checked if the operations are available for uint64 and it seems that the ones I use (power, times, sum) are there:
>> methods uint64
Methods for class uint64:
abs bitxor diff isinf mod plus sum
accumarray bsxfun display isnan mpower power times
all ceil eq issorted mrdivide prod transpose
and colon find ldivide mtimes rdivide tril
any conj fix le ne real triu
bitand ctranspose floor linsolve nnz rem uminus
bitcmp cummax full lt nonzeros reshape uplus
bitget cummin ge max not round xor
bitor cumprod gt min nzmax sign
bitset cumsum imag minus or sort
bitshift diag isfinite mldivide permute sortrowsc

You were right in saying that
It seems like there is some operation still performed using 52-bit double arithmetic.
The problem is in the line
dec = uint64(sum(uint64(base.*(uint64(bin-'0')))));
The operation sum(uint64(base.*(uint64(bin-'0')))) gives a double result, which only has about 15 significant digits. That's why your lowest digits are wrong. The subsequent conversion to uint64 doesn't help, because precision has already been lost.
The solution is to sum natively in uint64. This gives a uint64 result with its full precision:
dec = sum(uint64(base.*(uint64(bin-'0'))), 'native');
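For reference, here is the corrected function in full, with the sum done natively (a minimal sketch of the fix described above):
function dec = my_bin2dec(bin)
% weights 2^(n-1) ... 2^0, all computed in uint64
v = uint64(length(bin)-1:-1:0);
base = uint64(2).^v;
% accumulate natively in uint64 so no precision is lost
dec = sum(base .* uint64(bin - '0'), 'native');
end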

Had the same thought as @beaker: break it into chunks:
%% dec2bin
x=intmax('uint64')
MSBs = dec2bin( bitshift(x,-32) ,32)
LSBs = dec2bin( bitand(x, hex2dec('FFFFFFFF')) ,32)
y = [MSBs LSBs]
%% bin2dec
MSBs = y(1:32)
LSBs = y(33:64)
z = bitor( bitshift( uint64(bin2dec(MSBs)) , 32 ) , uint64(bin2dec(LSBs)) )
% (now x = z)
Oddly enough, it seems that dec2bin doesn't give an error, but it does give incorrect answers for 64-bit numbers:
dec2bin( intmax('uint64') )
ans =
10000000000000000000000000000000000000000000000000000000000000000
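If you need this regularly, the same chunking can be wrapped into a pair of small helpers (a sketch; my_dec2bin64 and my_bin2dec64 are names I made up):
function b = my_dec2bin64(x)
% 64-bit-safe dec2bin: convert each 32-bit half separately
x = uint64(x);
msbs = dec2bin(bitshift(x, -32), 32);
lsbs = dec2bin(bitand(x, uint64(2^32 - 1)), 32);
b = [msbs lsbs];
end

function x = my_bin2dec64(b)
% 64-bit-safe bin2dec: convert each half, then recombine
hi = uint64(bin2dec(b(1:32)));
lo = uint64(bin2dec(b(33:64)));
x = bitor(bitshift(hi, 32), lo);
end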

Related

How to find the nearest match for an integer in a given matrix?

I have two matrices: matrix A(2048,64) and matrix B(10000,64). Each element of these matrices is a binary bit, so each row represents a 64-bit binary value, with bit weights running from 2^63 (most significant bit) down to 2^0 (least significant bit).
Problem:
For each row of A I want to find the value in B which is the closest to it in an absolute, numeric sense.
Consider A(i,1:64) to be the binary representation of a decimal value Xi, and B(j,1:64) the binary representation of a decimal value Yj. At the first step, I want to find the best j such that X1 (i.e. A(1,1:64)) is numerically closest to Yj, i.e. abs(X1-Yj) is minimized over all possible values of j.
The image below, taken from here, describes my problem rather well, except that each of my values is contained in a row of a matrix with 64 elements.
I tried to convert the 64-bit values to decimal, however bin2dec only supports values up to about 52 bits.
You can divide your 64-bit number into two 32-bit pieces, b1 and b2, convert them to decimal values d1 and d2, then combine them into a uint64 value that has enough precision to hold the result.
bin2uint64 = @(b) uint64(bin2dec(b(:,1:32)))*(2^32) + uint64(bin2dec(b(:,33:64)));
(This assumes that you have your data in the same format required by bin2dec, i.e. a vector of char. If you have a vector of numeric values, just add in a b = char(b+'0');)
Given an initial value
>> b = '1100110010111100101101111010100010101010010011010010000110011010'
>> d = bin2uint64(b)
d = 14752868414398472602
>> r = dec2bin(d, 64)
r = 1100110010111100101101111010100010101010010011010010000110011010
>> any(b-r)
ans = 0
Since b-r gives all zeros, the values are identical. You can pass the entire nx64 matrix as b and it will convert all of the values at once.
>> bin2uint64(char(randi([0 1], 20, 64) + '0'))
ans =
4169100589409210726
8883634060077187622
15399652840620725530
12845470998093501747
14561257795005665153
1133198980289431407
13360302497937328511
563773644115232568
8825360015701340662
2543400693478304607
11786523850513558107
8569436845019332309
2720129551425231323
5937260866696745014
4974981393428261150
16646060326132661642
5943867124784820058
2385960312431811974
13146819635569970159
6273342847731389380
You'll notice that I manually converted my random array to char. Assuming your input is numeric, you'll have to convert it first:
Achar = char(A + '0');
Yes, this is a pain, MATLAB should have included a destination type parameter in bin2dec, but they didn't. Now you can use your linked solution to find the matchings.
Converting your values:
Assuming your matrices A and B contain the numeric values 0 and 1, you can easily convert the rows to uint64 data types without precision loss using the bitset and sum functions (and bsxfun for a small efficiency boost):
result = sum(bsxfun(@(bit, V) bitset(uint64(0), bit, V), 64:-1:1, A), 2, 'native');
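To unpack that one-liner: column k of A (MSB first) is written to bit 65-k of a uint64 accumulator, and the per-bit results are summed natively. In loop form, for a single row, it would look roughly like this (illustration only):
row = [1 0 1 1];                  % MSB first, i.e. binary 1011
val = uint64(0);
nbits = numel(row);
for k = 1:nbits
    % column k corresponds to bit position nbits-k+1 (counted from the LSB)
    val = bitset(val, nbits - k + 1, row(k));
end
val                               % 11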
Compared to the solution from beaker, this one is over 4 times faster for a 10,000 row matrix:
% Sample data:
A = randi([0 1], 10000, 64);
% Test functions:
bin2uint64 = @(b) uint64(bin2dec(b(:,1:32)))*(2^32) + uint64(bin2dec(b(:,33:64)));
beaker_fcn = @(A) bin2uint64(char(A+'0'));
gnovice_fcn = @(A) sum(bsxfun(@(b, V) bitset(uint64(0), b, V), 64:-1:1, A), 2, 'native');
% Accuracy test:
isMatch = isequal(beaker_fcn(A), gnovice_fcn(A)); % Returns true
% Timing:
timeit(@() beaker_fcn(A))
ans =
0.022865378234183
timeit(@() gnovice_fcn(A))
ans =
0.005434031911843
Computing nearest matches:
You provide a link to some solutions for finding the nearest matches for A in B. However, the fact that you are using unsigned integer types requires some modification. Specifically, order matters when subtracting values due to integer overflow. For example uint64(8) - uint64(1) gives you 7, but uint64(1) - uint64(8) gives you 0.
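A quick illustration of the saturating subtraction and the max trick (just to make this concrete):
a = uint64(1);
b = uint64(8);
a - b               % 0: unsigned subtraction saturates at zero
b - a               % 7
max(a - b, b - a)   % 7: the true absolute difference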
Here's the modified solution for unsigned integers, applied to the sample data you provide:
A = uint64([1 5 7 3 2 8]);
B = uint64([4 12 11 10 9 23 1 15]);
delta = bsxfun(@(a, b) max(a-b, b-a), A(:), reshape(B, 1, []));
[~, index] = min(delta, [], 2);
result = B(index)
result =
1×6 uint64 row vector
1 4 9 4 1 9 % As expected!

Is there any way to increase 'realmax' in MATLAB?

realmax on my machine is:
1.7977e+308
I know I have to write my code in a way that avoids such long integer calculations, but is there any way to increase the limit?
I mean something like the GMP library in C.
You may find vpa (variable-precision arithmetic) helpful:
R = vpa(A) uses variable-precision arithmetic (VPA) to compute each element of A to at least d decimal digits of accuracy, where d is the current setting of digits.
R = vpa(A,d) uses at least d significant (nonzero) digits, instead of the current setting of digits.
Here's an example how to use it:
>> x = vpa('10^500/20')
x =
5.0e498
Note that:
The output x is of symbolic (sym) type. Of course, you shouldn't convert it to double, because it would exceed realmax:
>> double(x)
ans =
Inf
Use string input in order to avoid evaluating large input values as double. For example, this doesn't work
>> vpa(10^500/20)
ans =
Inf
because 10^500 is evaluated as a double, giving Inf, which is then used as the input to vpa.
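For instance, assuming the Symbolic Math Toolbox is available, string input can be combined with digits to control the working precision (a small sketch):
digits(40);             % work with 40 significant decimal digits
x = vpa('2^1200');      % well beyond realmax, evaluated symbolically
y = x / vpa('7')        % arithmetic stays in variable precision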

Matlab : How to represent a real number as binary

Problem: How do I use a continuous map, the Bernoulli Shift Map (Link1), to model a binary sequence?
Concept:
The Dyadic map, also called the Bernoulli Shift map, is expressed as x(k+1) = 2x(k) mod 1. Link2 (Symbolic Dynamics) explains that the Bernoulli Map is a continuous map and is used as the Shift Map. This is explained further below.
A numeric trajectory can be symbolized by partitioning into appropriate regions and assigning it with a symbol. A symbolic orbit is obtained by writing down the sequence of symbols corresponding to the successive partition elements visited by the point in its orbit. One can learn much about the dynamics of the system by studying its symbolic orbits. This link also says that the Bernoulli Shift Map is used to represent symbolic dynamics.
Question:
How is the Bernoulli Shift Map used to generate the binary sequence? I tried it like this, but this is not what the document in Link2 explains. So I took the numeric output of the map and converted it to symbols by thresholding in the following way:
x = rand();
y = mod(2* x,1) % generate the next value after one iteration
y =
0.3295
if y >= 0.5 then s = 1
else s = 0
where 0.5 is the threshold value, called the critical value of the Bernoulli Map.
I need to represent the real number as fractions as explained here on Page 2 of Link2.
Can somebody please show how I can apply the Bernoulli Shift Map to generate a symbolized trajectory (also called a time series)?
Please correct me if my understanding is wrong.
How do I convert a real-valued numeric time series into a symbolized one, i.e., how do I use the Bernoulli Map to model a binary orbit / time series?
You can certainly compute this in real number space, but you risk hitting precision problems (depending on starting point). If you're interested in studying orbits, you may prefer to work in a rational fraction representation. There are more efficient ways to do this, but the following code illustrates one way to compute a series derived from that map. You'll see the period-n definition on page 2 of your Link 2. You should be able to see from this code how you could easily work in real number space as an alternative (in that case, the matlab function rat will recover a rational approximation from your real number).
[EDIT] Now with binary sequence made explicit!
% start at some point on period-n orbit
period = 6;
num = 3;
den = 2^period-1;
% compute for this many steps of the sequence
num_steps = 20;
% for each step
for n = 1:num_steps
% * 2
num = num * 2;
% mod 1
if num >= den
num = num - den;
end
% simplify rational fraction
g = gcd(num, den);
if g > 1
num = num / g;
den = den / g;
end
% recover 8-bit binary representation
bits = 8;
q = 2^bits;
x = num / den * q;
b = dec2bin(x, bits);
% display
fprintf('%4i / %4i == 0.%s\n', num, den, b);
end
Ach... for completeness, here's the real-valued version. Pure mathematicians should look away now.
% start at some point on period-n orbit
period = 6;
num = 3;
den = 2^period-1;
% use floating point approximation
x = num / den;
% compute for this many steps of the sequence
num_steps = 20;
% for each step
for n = 1:num_steps
% apply map
x = mod(x*2, 1);
% display
[num, den] = rat(x);
fprintf('%i / %i\n', num, den);
end
And, for extra credit, why is this implementation fast but daft? (HINT: try setting num_steps to 50)...
% matlab vectorised version
period = 6;
num = 3;
den = 2^period-1;
num_steps = 20;
x = zeros(1, num_steps);
x(1) = num / den;
y = filter(1, [1 -2], x);
[a, b] = rat(mod(y, 1));
disp([a' b']);
OK, this is supposed to be an answer, not a question, so let's answer my own questions...
It's fast because it uses Matlab's built-in (and highly optimised) filter function to handle the iteration (that is, in practice, the iteration is done in C rather than in M-script). It's always worth remembering filter in Matlab, I'm constantly surprised by how it can be turned to good use for applications that don't look like filtering problems. filter cannot do conditional processing, however, and does not support modulo arithmetic, so how do we get away with it? Simply because this map has the property that whole periods at the input map to whole periods at the output (because the map operation is multiply by an integer).
It's daft because it very quickly hits the aforementioned precision problems. Set num_steps to 50 and watch it start to get wrong answers. What's happening is the number inside the filter operation is getting to be so large (order 10^14) that the bit we actually care about (the fractional part) is no longer representable in the same double-precision variable.
This last bit is something of a diversion, which has more to do with computation than maths - stick to the first implementation if your interest lies in symbol sequences.
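To tie this back to the binary sequence the question asks about: with the partition at 1/2, the symbol of each iterate is simply its leading binary digit, so the real-valued orbit can be symbolized like this (a sketch using the threshold from the question):
% symbolize a real-valued orbit of the Bernoulli shift (illustration)
x = 3 / (2^6 - 1);          % same period-6 starting point as above
num_steps = 20;
s = zeros(1, num_steps);
for n = 1:num_steps
    s(n) = double(x >= 0.5);    % symbol = leading binary digit of x(n)
    x = mod(2*x, 1);            % apply the map
end
disp(s)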
If you only want to deal with rational outputs, you'll first have to convert the starting term of your series into a rational number, if it is not one already. You can do that with:
[N,D] = rat(x0) ;
Once you have a numerator N and a denominator D, it is very easy to calculate the series x(k+1) = mod(2*x(k), 1), and you don't even need a loop.
For the 2*x(k) part, the numerator is simply multiplied by successive powers of 2, which can be done by element-wise array multiplication (or bsxfun for lovers of that function):
So 2*x(k) becomes, in MATLAB, N.*(2.^(0:n-1)) (N is a scalar, the numerator of x0, and n is the number of terms you want to calculate).
The mod 1 operation also translates easily to rational numbers: mod(x,1) = mod(Nx,Dx)/Dx (Nx and Dx being the numerator and denominator of x).
If you do not need to simplify the denominator, you could get all the numerators of the series in one single line:
xn = mod( N.*(2.^(0:n-1).'),D) ;
but for visual comfort, it is sometimes better to simplify, so consider the following function:
function y = dyadic_rat(x0,n)
[N,D] = rat(x0) ;                % get numerator and denominator of the first element
xn = mod( N.*(2.^(0:n-1).'),D) ; % calculate all numerators
G = gcd( xn , D ) ;              % greatest common divisor of each numerator with D
y = [xn./G D./G].' ;             % output simplified numerators and denominators
If I start with the example given in your wiki link (x0=11/24), I get:
>> y = dyadic_rat(11/24,8)
y =
11 11 5 2 1 2 1 2
24 12 6 3 3 3 3 3
If I start with the example given by Rattus Ex Machina (x0=3/(2^6-1)), I also get the same result:
>> y = dyadic_rat(3/63,8)
y =
1 2 4 8 16 11 1 2
21 21 21 21 21 21 21 21
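And if what you ultimately want is the binary symbol sequence, it follows directly from the rational iterates, since x(k) >= 1/2 exactly when 2*Nx >= Dx. A small sketch building on the same idea:
[N, D] = rat(11/24);                 % starting point from the wiki example
n = 8;
xn = mod(N .* (2.^(0:n-1).'), D);    % numerators of the iterates (denominator D)
s = double(2*xn >= D).'              % symbols: 0 1 1 1 0 1 0 1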

Stability (Numerical analysis)

I'm trying to find the max machine number x that satisfies the following equation: x+a=a, where a is a given integer. (I'm not allowed to use eps.)
Here's my code (which is not really working):
function [] = Largest_x()
a=2184;
x=0.0000000001
while (x+a)~=a
x=2*x;
end
fprintf('The biggest value of x in order that x+a=a \n (where a is equal to %g) is : %g \n',a,x);
end
Any help would be much appreciated.
The answer is eps(a)/2.
eps(a) is the spacing between a and the next larger floating-point number, so if you add half of that or less to a float, the result rounds back to the original value. For example:
100+eps(100)/2==100
ans =
1
%# divide by less than two
100+eps(100)/1.9==100
ans =
0
%# what is that number x?
eps(100)/2
ans =
7.1054e-15
If you don't want to rely on eps, you can calculate the number as
2^(-53+floor(log2(a)))
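A quick sanity check of that formula against eps, for the a in the question (illustration only):
a = 2184;
x1 = eps(a)/2                      % 2.2737e-13
x2 = 2^(-53 + floor(log2(a)))      % same value
isequal(x1, x2)                    % 1
(a + x1) == a                      % 1: adding x1 leaves a unchanged
(a + 2*x1) == a                    % 0: the next power of two does change a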
Your small algorithm is certainly not correct. The only condition under which A = X + A holds is when X is equal to 0. By default, MATLAB data types are 64-bit doubles.
Let's pretend that MATLAB were instead using 8-bit integers. The only way to satisfy the equation A = X + A is for X to have the binary representation [0 0 0 0 0 0 0 0]. So any number between 0 and 1 would work, since the fractional part is truncated for integers. So again, if you were using integers, A = A + X would resolve to true for any value of X in [0,1). However, this value is meaningless, because X would not actually take on that value; rather, it would take on the value 0.
It sounds like you are trying to find the resolution of MATLAB data types. See this: http://www.mathworks.com/help/matlab/matlab_prog/floating-point-numbers.html
The correct answer is the one provided by Jonas: 0.5 * eps(a).
Here is an alternative, empirical and approximate solution:
>> a = 2184;
>> e = 2 .^ (-100 : 100); % logarithmic scale
>> idx = find(a + e == a, 1, 'last')
idx =
59
>> e(idx)
ans =
2.2737e-013
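If eps is off limits, the same value can also be found empirically by halving instead of doubling, i.e. a corrected version of the search loop from the question (a sketch):
a = 2184;
x = 1;                    % start with a value that certainly changes a
while (a + x) ~= a
    x = x / 2;            % halve until adding x no longer changes a
end
x                         % 2.2737e-13, i.e. eps(a)/2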

Performing Bit modification on Floating point numbers in Matlab

I'm working in MATLAB, using non-negative matrix factorization to decompose a matrix A into two factors. From this I get two double-precision floating-point matrices, B and C.
sample results are
B(1,1) = 0.118
C(1,1) = 112.035
I am now trying to modify specific bits within these values, but using the bitset function on either value I get an error, because bitset requires unsigned integers.
I have also tried the dec2bin function, which I assumed would convert decimals to binary, but it returns '0' for B(1,1).
Does anyone know of any way to deal with floats at bit level without losing precision?
You should look into the typecast and bitset functions (docs here and here, respectively). They let you do stuff like:
xb = typecast( 1.0, 'uint64' );   % reinterpret the 8 bytes of the double as a uint64
xb = bitset( xb, 10, 1 );         % set bit 10 of the raw IEEE 754 bit pattern
x = typecast( xb, 'double' )      % back to double: 1 + 2^-43, slightly more than 1.0
The num2hex and hex2num functions are your friends. (Though not necessarily very good friends; hexadecimal strings aren't the best imaginable form for working on binary floating-point numbers. You could split them into, say, 8-nybble chunks and convert each to an integer.)
From the MATLAB docs:
num2hex([1 0 0.1 -pi Inf NaN])
returns
ans =
3ff0000000000000
0000000000000000
3fb999999999999a
c00921fb54442d18
7ff0000000000000
fff8000000000000
and
num2hex(single([1 0 0.1 -pi Inf NaN]))
returns
ans =
3f800000
00000000
3dcccccd
c0490fdb
7f800000
ffc00000
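For example, here is a minimal sketch of that 8-nybble chunk idea (the variable names are mine), going from a double to its 64-bit pattern and back via num2hex/hex2num:
h = num2hex(0.118);                         % 16 hex characters = 64 bits
hi = uint64(hex2dec(h(1:8)));               % upper 32 bits
lo = uint64(hex2dec(h(9:16)));              % lower 32 bits
raw = bitor(bitshift(hi, 32), lo);          % full bit pattern as uint64
raw = bitset(raw, 1, 1);                    % e.g. set the least significant mantissa bit
% reassemble the hex string and convert back to a double
h2 = lower([dec2hex(bitshift(raw, -32), 8) dec2hex(bitand(raw, uint64(2^32 - 1)), 8)]);
modified = hex2num(h2)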