MATLAB: Binary to decimal conversion using symbols from clustering algorithm

q = 2;
k= 2^q;
x1 = [0.0975000000000000, 0.980987500000000, -0.924672950312500, -0.710040130079246];
for i = 1 : length(x1)
[idx_centers,location] = kmeans(x1',q);
end
temp = idx_centers;
for i = 1 : length(x1)
if temp(i)== 2
idx_centers(i) = 0;
end
BinaryCode_KMeans(i) = idx_centers(i); % output is say [0,0,1,1];
end
strng = num2str(BinaryCode_KMeans);
DecX = bin2dec(strng);
In the code snippet above, I want to convert a binary string, obtained from kmeans clustering, to its decimal equivalent. The decimal equivalent should be one of 1, 2, 3, or 4, i.e., at most k = 2^q with q = 2.
However, the conversion sometimes yields 12, because a 4-bit binary code maps to decimal numbers in the range 0 to 15 (or 1 to 16). The number of elements in x1 can vary and can be less than or greater than k. What should I do so that the decimal equivalent of the binary code always lies within k, for any value of q?

First off, there is no need to run kmeans multiple times; a single run computes the cluster centers. Note that the code below tries to find a mapping between the clustering results and n, the number of samples. It shows three ways to encode this information.
clear
clc
q = 2;
k= 2^q;
n = 4;
x1 = rand(n,1);
fprintf('x1 = [ '); fprintf('%g ', x1); fprintf(']\n'); % %g, since x1 holds non-integer values
[idx_centers, location] = kmeans(x1, q);
fprintf('idx_centers = [ '); fprintf('%d ', idx_centers); fprintf(']\n');
for i = 1:q
idx_centers(idx_centers == i) = i-1;
end
fprintf('idx_centers = [ '); fprintf('%d ', idx_centers); fprintf(']\n');
string = num2str(idx_centers');
% Original decimal value
DecX = bin2dec(string);
fprintf('0 to (2^n) - 1: %d\n', DecX);
% Reduced space decimal value
% Ignoring the 0/1 order as [ 1 1 0 0 ]
% would be the same as [ 0 0 1 1 ]
if DecX >= (2^n)/2
complement = bitget(bitcmp(int64(DecX)),n:-1:1);
DecX = bin2dec(num2str(complement));
end
fprintf('0 to ((2^n)/2) - 1: %d\n', DecX);
% Minimal Decimal value based on the number of samples
% in the 0's cluster which is in the range of 0 to n-1
fprintf('0 to n - 1: %d\n', numel(find(idx_centers == 0)));
Hint: If you change q to more than 2, the code will not work, because bin2dec only accepts zeros and ones. With more than 2 clusters, you need to extend the code and use multidimensional arrays to store the pairwise clustering results.
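For more than two clusters, one possible alternative to the pairwise approach is to treat the zero-based cluster labels as digits of a base-q number. The sketch below is only an illustration under that assumption: the label vector is a hypothetical kmeans output, and the result lies in the range 0 to q^n - 1 rather than within k.

```matlab
% Hypothetical cluster labels, as kmeans might return them for q = 3 clusters.
q = 3;
labels = [3 1 2 2 1];

% Shift labels to 0..q-1 so that they are valid base-q digits.
digits = labels - 1;

% Weight each digit by a power of q, most significant digit first.
n = numel(digits);
weights = q .^ (n-1:-1:0);
code = sum(digits .* weights);

fprintf('base-%d code: %d\n', q, code);   % prints "base-3 code: 174"
```

Keep in mind that kmeans numbers its clusters arbitrarily, so the same partition can produce different codes from run to run unless the labels are first put into a canonical order.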

Related

How to programmatically compute this summation

I want to compute the above summation, for a given 'x'. The summation is to be carried out over a block of lengths specified by an array , for example block_length = [5 4 3]. The summation is carried as follows: from -5 to 5 across one dimension, -4 to 4 in the second dimension and -3 to 3 in the last dimension.
The pseudo code will be something like this:
sum = 0;
for i = -5:5
for j = -4:4
for k = -3:3
vec = [i j k];
tv = vec * vec';
sum = sum + 1/(1+tv)*cos(2*pi*x*vec');
end
end
end
The problem is that I want to find the sum when the number of dimensions is not known ahead of time, hopefully using some kind of variable nested loops. MATLAB has combvec, but it returns all possible combinations of vectors, which is not required as we only compute the sum. When there are many dimensions, having combvec return all combinations is not feasible memory-wise.
Appreciate any ideas towards solutions.
PS: I want to do this at high number of dimensions, for example 650, as in machine learning.
Based on https://www.mathworks.com/matlabcentral/answers/345551-function-with-varying-number-of-for-loops I came up with the following code (I haven't tested it for very large number of indices!):
function sum = fun(x, block_length)
sum = 0;
n = numel(block_length); % Number of loops
vec = -ones(1, n) .* block_length; % Index vector
ready = false;
while ~ready
tv = vec * vec';
sum = sum + 1/(1+tv)*cos(2*pi*x*vec');
% Update the index vector:
ready = true; % Assume that the WHILE loop is ready
for k = 1:n
vec(k) = vec(k) + 1;
if vec(k) <= block_length(k)
ready = false;
break; % vec(k) increased successfully, leave "for k" loop
end
vec(k) = -1 * block_length(k); % vec(k) reached the limit, reset it
end
end
end
where x and block_length should both be 1-by-n vectors.
The idea is that, instead of using explicitly nested loops, we use a vector of indices.
How good/efficient is this when tackling the suggested use case where block_length can have 650 elements? Not much! Here's a "quick" test using merely 16 dimensions and a [-1, 1] range for the indices:
N = 16; tic; in = 0.1 * ones(1, N); sum = fun(in, ones(size(in))), toc;
which yields an elapsed time of 12.7 seconds on my laptop.
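The run time follows directly from the number of terms, which is the product of 2*block_length + 1 over all dimensions. A quick count, as a sketch (the variable names numTerms and log10Terms are chosen here for illustration):

```matlab
% Number of terms the while loop visits for the 16-dimensional test above.
block_length = ones(1, 16);
numTerms = prod(2*block_length + 1);          % 3^16 = 43046721
fprintf('16 dimensions: %d terms\n', numTerms);

% For 650 dimensions the count overflows a double, so count decimal digits instead.
block_length = ones(1, 650);
log10Terms = sum(log10(2*block_length + 1));  % 650*log10(3), roughly 310
fprintf('650 dimensions: about 10^%.0f terms\n', log10Terms);
```

Even with the smallest possible range of [-1, 1] per dimension, enumerating every index vector is out of reach at 650 dimensions, regardless of how the loops are organized.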

How does the colon operator work in MATLAB?

As noted in this answer by Sam Roberts and this other answer by gnovice, MATLAB's colon operator (start:step:stop) creates a vector of values in a different way than linspace does. In particular, Sam Roberts states:
The colon operator adds increments to the starting point, and subtracts decrements from the end point to reach a middle point. In this way, it ensures that the output vector is as symmetric as possible.
However, the official documentation about this from The MathWorks has been deleted from their site.
If Sam's description is correct, wouldn't the errors in the step sizes be symmetric?
>> step = 1/3;
>> C = 0:step:5;
>> diff(C) - step
ans =
1.0e-15 *
Columns 1 through 10
0 0 0.0555 -0.0555 -0.0555 0.1665 -0.2776 0.6106 -0.2776 0.1665
Columns 11 through 15
0.1665 -0.2776 -0.2776 0.6106 -0.2776
Interesting things to note about the colon operator:
Its values depend on its length:
>> step = 1/3;
>> C = 0:step:5;
>> X = 0:step:3;
>> C(1:10) - X
ans =
1.0e-15 *
0 0 0 0 0 -0.2220 0 -0.4441 0.4441 0
It can generate repeated values if they are rounded:
>> E = 1-eps : eps/4 : 1+eps;
>> E-1
ans =
1.0e-15 *
-0.2220 -0.2220 -0.1110 0 0 0 0 0.2220 0.2220
There is a tolerance for the last value: if the step size creates a value just above the end, this end value is still used:
>> A = 0 : step : 5-2*eps(5)
A =
Columns 1 through 10
0 0.3333 0.6667 1.0000 1.3333 1.6667 2.0000 2.3333 2.6667 3.0000
Columns 11 through 16
3.3333 3.6667 4.0000 4.3333 4.6667 5.0000
>> A(end) == 5 - 2*eps(5)
ans =
logical
1
>> step*15 - 5
ans =
0
The deleted page referred to by Sam's answer is still archived by the Wayback Machine. Luckily, even the attached M-file colonop is there too. And it seems that this function still matches what MATLAB does (I'm on R2017a):
>> all(0:step:5 == colonop(0,step,5))
ans =
logical
1
>> all(-pi:pi/21:pi == colonop(-pi,pi/21,pi))
ans =
logical
1
I'll replicate here what the function does for the general case (there are some shortcuts for generating integer vectors and handling special cases). I'm replacing the function's variable names with more meaningful ones. The inputs are start, step and stop.
First it computes how many steps there are in between start and stop. If the last step exceeds stop by more than a tolerance, it is not taken:
n = round((stop-start)/step);
tol = 2.0*eps*max(abs(start),abs(stop));
sig = sign(step);
if sig*(start+n*step - stop) > tol
n = n - 1;
end
This explains the last observation mentioned in the question.
Next, it computes the value of the last element and makes sure that it does not exceed the stop value, even if it was allowed to go past it in the previous computation.
last = start + n*step;
if sig*(last-stop) > -tol
last = stop;
end
This is why the last element of the vector A in the question is exactly the stop value.
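To make these two steps concrete, the following sketch applies them to the array A from the question. It only re-runs the lines quoted above with the question's inputs; the variable overshoot is a name introduced here for readability.

```matlab
step  = 1/3;
start = 0;
stop  = 5 - 2*eps(5);

% Step count: 15 steps of 1/3 land on 5, which is slightly past stop.
n   = round((stop - start)/step);
tol = 2.0*eps*max(abs(start), abs(stop));
sig = sign(step);
overshoot = sig*(start + n*step - stop);
if overshoot > tol
    n = n - 1;   % not taken here: the overshoot is within the tolerance
end

% Last element: clamp to stop when within tolerance of it.
last = start + n*step;
if sig*(last - stop) > -tol
    last = stop;
end

fprintf('n = %d, overshoot = %g, tol = %g\n', n, overshoot, tol);
fprintf('last == stop: %d\n', last == stop);
```

The overshoot is 2*eps(5), which is smaller than the tolerance of roughly 2.5*eps(5), so all 15 steps are kept and the last element is then set to stop.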
Next, it computes the output array in two parts, as advertised: the left and right halves of the array are filled independently:
out = zeros(1,n+1);
k = 0:floor(n/2);
out(1+k) = start + k*step;
out(n+1-k) = last - k*step;
Note that they are not filled by incrementing, but by computing an integer array and multiplying it by the step size, just like linspace does. This explains the observation about array E in the question. The difference is that the right half of the array is filled by subtracting those values from the last value.
As a final step, for odd-sized arrays, the middle value is computed separately to ensure it lies exactly half-way between the two end points:
if mod(n,2) == 0
out(n/2+1) = (start+last)/2;
end
The full function colonop is copied at the bottom.
Note that filling the left and right side of the array separately does not mean that the errors in step sizes should be perfectly symmetric. These errors are given by roundoff errors. But it does make a difference where the stop point is not reached exactly by the step size, as in the case of array A in the question. In this case, the slightly shorter step size is taken in the middle of the array, rather than at the end:
>> step=1/3;
>> A = 0 : step : 5-2*eps(5);
>> A/step-(0:15)
ans =
1.0e-14 *
Columns 1 through 10
0 0 0 0 0 0 0 -0.0888 -0.4441 -0.5329
Columns 11 through 16
-0.3553 -0.3553 -0.5329 -0.5329 -0.3553 -0.5329
But even in the case where the stop point is reached exactly, some additional error accumulates in the middle. Take for example the array C in the question. This error accumulation does not happen with linspace:
C = 0:1/3:5;
lims = eps(C);
subplot(2,1,1)
plot(diff(C)-1/3,'o-')
hold on
plot(lims,'k:')
plot(-lims,'k:')
plot([1,15],[0,0],'k:')
ylabel('error')
title('0:1/3:5')
L = linspace(0,5,16);
subplot(2,1,2)
plot(diff(L)-1/3,'x-')
hold on
plot(lims,'k:')
plot(-lims,'k:')
plot([1,15],[0,0],'k:')
title('linspace(0,5,16)')
ylabel('error')
colonop:
function out = colonop(start,step,stop)
% COLONOP Demonstrate how the built-in a:d:b is constructed.
%
% v = colonop(a,b) constructs v = a:1:b.
% v = colonop(a,d,b) constructs v = a:d:b.
%
% v = a:d:b is not constructed using repeated addition. If the
% textual representation of d in the source code cannot be
% exactly represented in binary floating point, then repeated
% addition will appear to have accumulated roundoff error. In
% some cases, d may be so small that the floating point number
% nearest a+d is actually a. Here are two important examples.
%
% v = 1-eps : eps/4 : 1+eps is the nine floating point numbers
% closest to v = 1 + (-4:1:4)*eps/4. Since the spacing of the
% floating point numbers between 1-eps and 1 is eps/2 and the
% spacing between 1 and 1+eps is eps,
% v = [1-eps 1-eps 1-eps/2 1 1 1 1 1+eps 1+eps].
%
% Even though 0.01 is not exactly represented in binary,
% v = -1 : 0.01 : 1 consists of 201 floating points numbers
% centered symmetrically about zero.
%
% Ideally, in exact arithmetic, for b > a and d > 0,
% v = a:d:b should be the vector of length n+1 generated by
% v = a + (0:n)*d where n = floor((b-a)/d).
% In floating point arithmetic, the delicate computations
% are the value of n, the value of the right hand end point,
% c = a+n*d, and symmetry about the mid-point.
if nargin < 3
stop = step;
step = 1;
end
tol = 2.0*eps*max(abs(start),abs(stop));
sig = sign(step);
% Exceptional cases.
if ~isfinite(start) || ~isfinite(step) || ~isfinite(stop)
out = NaN;
return
elseif step == 0 || start < stop && step < 0 || stop < start && step > 0
% Result is empty.
out = zeros(1,0);
return
end
% n = number of intervals = length(v) - 1.
if start == floor(start) && step == 1
% Consecutive integers.
n = floor(stop) - start;
elseif start == floor(start) && step == floor(step)
% Integers with spacing > 1.
q = floor(start/step);
r = start - q*step;
n = floor((stop-r)/step) - q;
else
% General case.
n = round((stop-start)/step);
if sig*(start+n*step - stop) > tol
n = n - 1;
end
end
% last = right hand end point.
last = start + n*step;
if sig*(last-stop) > -tol
last = stop;
end
% out should be symmetric about the mid-point.
out = zeros(1,n+1);
k = 0:floor(n/2);
out(1+k) = start + k*step;
out(n+1-k) = last - k*step;
if mod(n,2) == 0
out(n/2+1) = (start+last)/2;
end

Generate cell with random pairs without repetitions

How to generate a sequence of random pairs without repeating pairs?
The following code already generates the pairs, but does not avoid repetitions:
for k=1:8
Comb=[randi([-15,15]) ; randi([-15,15])];
T{1,k}=Comb;
end
When running I got:
T= [-3;10] [5;2] [1;-5] [10;9] [-4;-9] [-5;-9] [3;1] [-3;10]
The pair [-3;10] is repeated, which must not happen.
PS: The entries can be positive or negative.
Is there any built-in function for this? Any suggestions on how to solve this?
If you have the Statistics Toolbox, you can use randsample to sample 8 numbers from 1 to 31^2 (where 31 is the population size), without replacement, and then "unpack" each obtained number into the two components of a pair:
s = -15:15; % population
M = 8; % desired number of samples
N = numel(s); % population size
y = randsample(N^2, M); % sample without replacement
result = s([ceil(y/N) mod(y-1, N)+1]); % unpack pair and index into population
Example run:
result =
14 1
-5 7
13 -8
15 4
-6 -7
-6 15
2 3
9 6
You can use ind2sub:
n = 15;
m = 8;
[x y]=ind2sub([n n],randperm(n*n,m));
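The ind2sub snippet above draws values from 1 to 15. A possible adaptation to the range -15 to 15 from the question, returning a 1-by-8 cell of column pairs, could look like the following sketch; it uses only base MATLAB functions, and the variable names are illustrative.

```matlab
s = -15:15;                      % population, as in the question
M = 8;                           % desired number of pairs
N = numel(s);                    % population size (31)

% M distinct linear indices into an N-by-N grid, i.e. M distinct pairs.
idx = randperm(N^2, M);
[r, c] = ind2sub([N N], idx);

% Map the grid coordinates to population values: one pair per column.
pairs = [s(r); s(c)];            % 2-by-M matrix

% Pack each column into its own cell, matching T in the question.
T = num2cell(pairs, 1);          % 1-by-M cell of 2-by-1 vectors
celldisp(T);
```

Because each linear index corresponds to exactly one (row, column) combination, sampling the indices without replacement guarantees that no pair repeats.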
Two possibilities:
1.
M = nchoosek(1:15, 2);
T = datasample(M, 8, 'replace', false);
2.
T = zeros(8,2);
k = 1;
while (k <= 8)
t = randi(15, [1,2]);
b1 = (T(:,1) == t(1));
b2 = (T(:,2) == t(2));
if ~any(b1 & b2)
T(k,:) = t;
k = k + 1;
end
end
The first method is probably faster but takes up more memory and may not be practicable for very large numbers (ex: if instead of 15, the max was 50000), in which case you have to go with 2.

Fast way to test whether n^2 + (n+1)^2 is perfect square

I am trying to write code to test whether n^2 + (n+1)^2 is a perfect square.
As I do not have much experience in programming, I only have MATLAB at my disposal.
So far this is what I have tried
function [ Liste ] = testSquare(N)
if exist('NumberTheory')
load NumberTheory.mat
else
MaxT = 0;
end
if MaxT > N
return
elseif MaxT > 0
L = 1 + MaxT;
else
L = 1;
end
n = (L:N)'; % Makes a list of numbers from L to N
m = n.^2 + (n+1).^2; % Makes a list of numbers on the form A^2+(A+1)^2
P = dec2hex(m); % Converts this list to hexadecimal
Length = length(dec2hex(P(N,:))); % Finds the maximum number of digits in the hexadecimal number
Modulo = ['0','1','4','9']'; % Only numbers ending on 0,1,4 or 9 can be perfect squares in hex
[d1,~] = ismember(P(:,Length),Modulo); % Finds all numbers that end on 0,1,4 or 9
m = m(d1); % Removes all numbers not ending on 0,1,4 or 9
n = n(d1); % -------------------||-----------------------
mm = sqrt(m); % Takes the square root of all the possible squares
A = (floor(mm + 0.5).^2 == m); % Tests whether these are actually squares
lA = length(A(A>0)); % Finds the number of such numbers
MaxT = N;
save NumberTheory.mat MaxT;
if lA>0
m = m(A); % makes a list of all the square numbers
n = n(A); % finds the corresponding n values
mm = mm(A); % Finds the squareroot values of m
fid = fopen('Tallteori.txt','wt'); % Writes everything to a simple text.file
for ii = 1:lA
fprintf(fid,'%20d %20d %20d\t',n(ii),m(ii),mm(ii));
fprintf(fid,'\n');
end
fclose(fid);
end
end
This writes the perfect squares and the corresponding n values to a file. I saw that using hexadecimal was a fast way to find perfect squares in C++, and tried to use this in MATLAB. However, I am a tad unsure whether this is the best approach.
The code above breaks down when m > 2^52 due to the hexadecimal conversion.
Is there an alternative or faster way to write all the perfect squares of the form n^2 + (n+1)^2, for n from 1 to N, to a text file?
There is a much faster way that doesn't even require testing. You need a bit of elementary number theory to find that way, but here goes:
If n² + (n+1)² is a perfect square, that means there is an m such that
m² = n² + (n+1)² = 2n² + 2n + 1
<=> 2m² = 4n² + 4n + 1 + 1
<=> 2m² = (2n+1)² + 1
<=> (2n+1)² - 2m² = -1
Equations of that type are easily solved, starting from the "smallest" (positive) solution
1² - 2*1² = -1
of
x² - 2y² = -1
corresponding to the number 1 + √2, you obtain all further solutions by multiplying that with a power of the primitive solution of
a² - 2b² = 1
which is (1 + √2)² = 3 + 2*√2.
Writing that in matrix form, you obtain all solutions of x² - 2y² = -1 as
|x_k|   |3 4|^k   |1|
|y_k| = |2 3|   * |1|
and all x_k are necessarily odd, thus can be written as 2*n + 1.
The first few solutions (x,y) are
(1,1), (7,5), (41,29), (239,169)
corresponding to (n,m)
(0,1), (3,5), (20,29), (119,169)
You can get the next (n,m) solution pair via
(n_(k+1), m_(k+1)) = (3*n_k + 2*m_k + 1, 4*n_k + 3*m_k + 2)
starting from (n_0, m_0) = (0,1).
Quick Haskell code, since I don't speak MATLAB:
Prelude> let next (n,m) = (3*n + 2*m + 1, 4*n + 3*m + 2) in take 20 $ iterate next (0,1)
[(0,1),(3,5),(20,29),(119,169),(696,985),(4059,5741),(23660,33461),(137903,195025)
,(803760,1136689),(4684659,6625109),(27304196,38613965),(159140519,225058681)
,(927538920,1311738121),(5406093003,7645370045),(31509019100,44560482149)
,(183648021599,259717522849),(1070379110496,1513744654945),(6238626641379,8822750406821)
,(36361380737780,51422757785981),(211929657785303,299713796309065)]
Prelude> map (\(n,m) -> (n^2 + (n+1)^2 - m^2)) it
[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]
Edit by EitanT:
Here's the MATLAB code to calculate the first N numbers:
res = zeros(1, N);
nm = [0, 1];
for k = 1:N
nm = nm * [3 4; 2 3] + [1, 2];
res(k) = nm(1);
end
The resulting array res should hold the values of n that satisfy the condition of the perfect square.
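As a sketch of how the recurrence could replace the search in the question, the following loop collects every pair (n, m) with n up to a bound and double-checks each one. The bound limit and the variable names are assumptions for illustration; the check is exact only while m^2 stays below 2^53.

```matlab
limit = 1e6;                 % hypothetical upper bound on n
pairs = zeros(0, 2);         % each row will hold one (n, m) pair
nm = [0, 1];                 % starting solution (n_0, m_0)

% Step through the recurrence until n exceeds the bound.
while nm(1) <= limit
    pairs(end+1, :) = nm;    %#ok<AGROW>
    nm = nm * [3 4; 2 3] + [1, 2];
end

% Sanity check: n^2 + (n+1)^2 must equal m^2 for every row.
n = pairs(:, 1);
m = pairs(:, 2);
isSquare = (n.^2 + (n + 1).^2 == m.^2);

fprintf('%20d %20d\n', pairs.');   % one "n m" pair per line
```

For limit = 1e6 this yields nine pairs, from (0, 1) up to (803760, 1136689), which agrees with the Haskell output above.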

decimal to binary conversion in matlab

The following code converts text read from a file to binary:
temp=textread('E:\one.txt', '%1s', 'whitespace', '');
text = char(temp);
y = zeros(length(text)*8,1);
for n = 1:1:length(text)
a=abs(text(n));
f = 8*(n-1)+1;
y(f:f+7,1)=(de2bi(a,8))';
end
disp('THE MAGNITUDE OF THE TEXT IS =');
disp(a);
disp(f);
x=y';
disp('THE BINARY BITS ARE');
disp(x);
Output of this program when the file contains the single character '1':
THE MAGNITUDE OF THE TEXT IS =
49
1
THE BINARY BITS ARE
1 0 0 0 1 1 0 0
If x has 8 bits, I want the first 3 bits stored in one variable and the remaining 5 bits in another.
I would like a MATLAB program for this.
eg x=00110011
a=001
b=10011
encoding program
clc;
clear all;
temp=textread('E:\one.txt', '%1s', 'whitespace', '');
text = char(temp);
y = zeros(length(text)*8,1);
for n = 1:1:length(text)
a=abs(text(n));
f = 8*(n-1)+1;
y(f:f+7,1)=(de2bi(a,8))';
end
disp('THE MAGNITUDE OF THE TEXT IS =');
disp(a);
disp(f);
x=y';
disp('THE BINARY BITS ARE');
disp(x);
z=length(x);
savefile='D:\mat\z.mat';
save (savefile,'z','-MAT');
disp('TOTAL NUMBER OF BITS =');
disp(z);
bk=input('ENTER THE NUMBER OF ROWS =');
savefile='D:\mat\bk.mat';
save (savefile,'bk','-MAT');
c=z/bk;
savefile='D:\mat\c.mat';
save (savefile,'c','-MAT');
k=1;
for i=1:bk
for j=1:c
m(i,j)=x(k);
k=k+1;
end
end
%disp(m(i,j));
disp('THE MESSAGE BITS ARE ');
disp(m);
savefile='D:\mat\m.mat';
save (savefile,'m','-MAT');
m_tot=(size(m,1)*size(m,2));
savefile='D:\mat\m_tot.mat';
save (savefile,'m_tot','-MAT');
savefile='D:\mat\r1.mat';
r1=[randperm(bk),randperm(bk)];
save (savefile,'r1','-MAT');
disp(r1);
savefile='D:\mat\r2.mat';
r2=[randperm(bk),randperm(bk)];
save (savefile,'r2','-MAT');
disp(r2);
savefile='D:\mat\f(1).mat';
f1= randint(1,1,[1,bk]);
save (savefile,'f1','-MAT');
savefile='D:\mat\en.mat';
en(1,:)=m(f1,:);
save (savefile,'en','-MAT');
disp('DIRECTLY ASSIGNED BLOCK IS');
disp(f1);
for w=1:(length(r1))
en(w+1,:)=xor((m(r1(w),:)),(m(r2(w),:)));
disp('THE EXORED BLOCKS ARE= ');
disp(r1(w));
disp(r2(w));
end
disp('THE ENCODED BITS ARE');
disp(en);
en_tot=(size(en,1)*size(en,2));
disp('tot no of encoded bits');
disp(en_tot);
savefile='D:\mat\en_tot.mat';
save (savefile,'en_tot','-MAT');
The variable en should be split based on hop count, the same way you did with the variable x.
Try this:
%After computing "x", the double array, as required...
d = input('Enter the length of the first sub-array: ');
a = x(1:d)
b = x(d+1:end)
For example:
>>
temp='1';
text = char(temp);
y = zeros(length(text)*8,1);
for n = 1:1:length(text)
a=abs(text(n));
f = 8*(n-1)+1;
y(f:f+7,1)=(de2bi(a,8))';
end
disp('THE MAGNITUDE OF THE TEXT IS =');
disp(a);
disp(f);
x=y';
disp('THE BINARY BITS ARE');
disp(x);
%After computing "x", the double array, as required...
d = input('Enter the length of the first sub-array: ');
a = x(1:d)
b = x(d+1:end)
The result would be:
THE MAGNITUDE OF THE TEXT IS =
49
1
THE BINARY BITS ARE
1 0 0 0 1 1 0 0
Enter the length of the first sub-array: 3
a =
1 0 0
b =
0 1 1 0 0
I'm still unsure as to what this program is supposed to achieve though.
EDIT
New code as per changed requirement:
>>
temp='1';
text = char(temp);
y = zeros(length(text)*8,1);
for n = 1:1:length(text)
a=abs(text(n));
f = 8*(n-1)+1;
y(f:f+7,1)=(de2bi(a,8))';
end
disp('THE MAGNITUDE OF THE TEXT IS =');
disp(a);
disp(f);
x=y';
disp('THE BINARY BITS ARE');
disp(x);
%After computing "x", the double array, as required...
d = input('Enter the hop-count vector: ');
for i=2:length(d)
d(i) = d(i) + d(i-1);
end
d = [0, ceil((d./d(end))*length(x))];
disp('The resultant split up is:')
for i=2:length(d)
disp(x((d(i-1)+1):d(i)));
end
The result will be:
THE MAGNITUDE OF THE TEXT IS =
49
1
THE BINARY BITS ARE
1 0 0 0 1 1 0 0
Enter the hop-count vector: [3 2 3]
The resultant split up is:
1 0 0
0 1
1 0 0
Just slice the array:
disp(x(1:3));
disp(x(4:end));
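If the segment lengths are already known, mat2cell can do the slicing in one call. A minimal sketch, assuming the lengths sum to the number of bits (the vector len is a name introduced here):

```matlab
x = [1 0 0 0 1 1 0 0];          % bits of '1' (ASCII 49), least significant bit first
len = [3 5];                    % desired segment lengths; must sum to numel(x)

% Split the row vector into consecutive segments of the given lengths.
parts = mat2cell(x, 1, len);    % 1-by-2 cell array of row vectors
a = parts{1};                   % [1 0 0]
b = parts{2};                   % [0 1 1 0 0]

disp(a);
disp(b);
```

The same call works for more than two segments, for example len = [3 2 3], which makes it a compact alternative to a loop over cumulative offsets.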