MATLAB - Plotting random points in a circle using a congruential RNG

I am aiming at plotting some random numbers in a circle using MATLAB. My code:
c = 3; p = 31; x = [7];
% generating random numbers (z) in the range [0,1) using
% congruential random number generator (multiplicative)
for i = 2:200
    x(i) = mod(c*x(i-1), p);
end
z = x/p;
% plot unit circle
hold on;
theta = 0:pi/50:2*pi;
plot(cos(theta),sin(theta),'.');
hold off;
% plotting random points in the unit circle using in-built rand function
phi = 2*pi*rand(1,200);
r = 1*sqrt(rand(1,200));
% plotting random points using the RNG above
% phi = 2*pi*z;
% r = 1*sqrt(z);
hold on;
x = 0 + r.*cos(phi);
y = 0 + r.*sin(phi);
plot(x,y,'r*');
hold off;
clear;
The problem I am facing is that both z and rand contain random numbers in the range [0,1). However, when I plot using rand I get the expected result (points spread over the whole disc), while z gives me a spiral sort of thing.
What could be the problem?

Besides Ander's good point about the RNG, there is also the problem of using the same z for both phi and r. Check it by using z = rand(200,1) and then creating your plot: it gives the same swirl you had before. If you use different values for phi and r, you get "true" randomness, to some extent, even with your RNG. I used this RNG:
c = 991;
p = 997;
x=zeros(400,1);
x(1,1) = 7;
for ii = 2:400
    x(ii,1) = mod(c*x(ii-1,1), p);
end
z = x/p;
phi2 = 2*pi*z(1:200,1);
r2 = 1*sqrt(z(201:400,1));
where I let your RNG run a bit longer and then used the first 200 values for phi and the last 200 for r.
As you can see, there is still some kind of swirl visible, but that is due to your RNG: the larger you pick c and p, the less pronounced it will be.
Just to show how pretty your RNG can be: set c = 3 and p = 31 again and use the full 400-value range of z as above. Isn't that a great swirl?

Easy! Your random number generator is only good to some extent.
Random number generators based on a prime modulus generally have a period: after a number of samples they repeat themselves.
In your case, try plot(z).
You will notice that the sequence is periodic, with a period of 30, that is p - 1 for p = 31. Coincidence?
I THINK NOT!
Thus, remember that when you want to generate pseudorandom numbers this way, you need a p bigger than the number of samples you want to generate.
For example, if we choose another pair of coprime numbers to generate z:
c = 991; p = 997;
The plot(z) will be:
And the final plot:
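If you want to verify the period empirically, a minimal sketch (reusing the original recurrence with c = 3, p = 31 and seed 7) could look like this:
c = 3; p = 31;
x = zeros(1,200);
x(1) = 7;
for i = 2:200
    x(i) = mod(c*x(i-1), p);
end
% the period is the first index at which the seed value reappears
period = find(x(2:end) == x(1), 1)  % gives 30 for c = 3, p = 31
plot(x/p)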

Related

Vectors must be the same length error in Curve Fitting in Matlab

I'm having problems curve fitting my randomized data to a normal probability density function.
Here is my code
N = 100;
mu = 5; stdev = 2;
x = mu+stdev*randn(N,1);
bin=mu-6*stdev:0.5:mu+6*stdev;
f=hist(x,bin);
plot(bin,f,'bo'); hold on;
x_ = x(1):0.1:x(end);
y_ = (1./sqrt(8.*pi)).*exp(-((x_-mu).^2)./8);
plot(x_,y_,'b-'); hold on;
It seems like I'm having vector size problems since it is giving me the error
Error using plot
Vectors must be the same length.
Note that I simplified y_ since mu and the standard deviation are known.
Plot:
Well, first of all, some adjustments to your question:
You are not trying to do curve fitting. What you are trying to do (in my opinion) is to overlay a probability density function on a histogram obtained by drawing random points from the same distribution (a normal distribution with parameters (mu, sigma)). The two curves should indeed overlay, as they represent the same thing; one is analytical and the other is obtained numerically.
As noted in the hist documentation, hist is not recommended and you should use histogram instead.
First step: Generating your random data
Knowing the distribution is the Normal distribution, we can use MATLAB's random function to do that :
N = 150;
rng('default') % For reproducibility
mu = 5;
sigma = 2;
r = random('Normal',mu,sigma,N,1);
Second step: Plot the histogram
Because we don't just want a count of the elements in each bin, but an estimate of the probability density function, we can use the 'Normalization','pdf' name-value pair:
Nbins = 25;
f=histogram(r,Nbins,'Normalization','pdf');
hold on
Here I'd rather specify a number of bins than specify the bins themselves, because you never know in advance how far from the mean your data is going to be.
Last step: overlay the probability density function over the histogram
Since the histogram is already normalized as a probability density, it is sufficient to simply overlay the analytical density function:
x_ = linspace(min(r),max(r),100);
y_ = (1./sqrt(2*sigma^2*pi)).*exp(-((x_-mu).^2)./(2*sigma^2));
plot(x_,y_,'b-');
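Equivalently, since the Statistics Toolbox is already used for random above, normpdf would give the same curve (a minimal alternative, not part of the original answer):
y_ = normpdf(x_, mu, sigma);
plot(x_, y_, 'b-');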
With N = 150
With N = 1500
With N = 150000 and Nbins = 50
If for some obscure reason you want to use the old hist() function
The old hist() function can't handle normalization, so you'll have to do it by hand, by scaling your density function to match the histogram counts:
N = 1500;
% rng('default') % For reproducibility
mu = 5;
sigma = 2;
r = random('Normal',mu,sigma,1,N);
Nbins = 50;
[~,centers]=hist(r,Nbins);
hist(r,Nbins); hold on
% Width of bins
Widths = diff(centers);
x_ = linspace(min(r),max(r),100);
y_ = N*mean(Widths)*(1./sqrt(2*sigma^2*pi)).*exp(-((x_-mu).^2)./(2*sigma^2));
plot(x_,y_,'r-');

Creating histograms of distance from the origin for 2D Random Walkers

Let's say you can show the distribution in space of the positions of a large number of random walkers at three different time points. This was provided as an answer to my previous question and, with some tweaks, works beautifully.
clc;
close all;
M = 1000; % The amount of random walks.
steps = [100,200,300]; % here we analyse steps 100, 200 and 300
cc = hsv(length(steps)); % manage the color of the plot
%generation of each random walk
x = sign(randn(max(steps),M));
y = sign(randn(max(steps),M));
xs = cumsum(x);
xval = xs(steps,:);
ys = cumsum(y);
yval = ys(steps,:);
hold on
for n=1:length(steps)
plot(xval(n,:),yval(n,:),'o','markersize',1,'color',cc(n,:),'MarkerFaceColor',cc(n,:));
end
legend('100','200','300')
axis square
grid on;
Now to the question: could I somehow use the hist() and subplot() functions to show the distance from the origin of the random walkers at three separate time points (or more, but three for simplicity)?
So far I'm not sure how to go about this beyond producing the distributions of walker positions at the three time points themselves.
I hope that I've understood your question; I think you want to use a bar plot with the 'stacked' option.
I've used @LuisMendo's answer to my own question to make the code more efficient.
steps = [10,200,1000]; % the steps
M = 5000; % Number of random walk
DV = [-1 1]; % Discrete value
p = .5; % probability of DV(2)
% Using @LuisMendo's binomial solution:
for ii = 1:length(steps)
    xval(ii,:) = (DV(2)-DV(1))*binornd(steps(ii), p, M, 1) + DV(1)*steps(ii);
    yval(ii,:) = (DV(2)-DV(1))*binornd(steps(ii), p, M, 1) + DV(1)*steps(ii);
end
[x, cen] = hist(sqrt(xval.^2+yval.^2).'); % sqrt(xval.^2+yval.^2) is the Euclidean distance from the origin
bar(cen,x,'stacked');
legend('10','200','1000')
axis square
grid on;
Increase the number of bins passed to hist to increase the resolution of the plot.
Results:
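If you want finer resolution, a minimal variation of the hist call above (the bin count of 30 is just an example) would be:
[x, cen] = hist(sqrt(xval.^2+yval.^2).', 30); % 30 bins instead of the default 10
bar(cen, x, 'stacked');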

Creating graphs that show the distribution in space of a large number of 2D Random Walks at three different time points

So essentially I have this code that generates a 2D random walk discretely over N steps for M walkers, and plots them all on the same graph.
clc;
clearvars;
N = 500; % Length of the x-axis, also known as the length of the random walks.
M = 3; % The amount of random walks.
x_t(1) = 0;
y_t(1) = 0;
for m = 1:M
    for n = 1:N % Looping all values of N into x_t(n).
        A = sign(randn); % Generates either +1/-1 depending on the sign of randn.
        x_t(n+1) = x_t(n) + A;
        A = sign(randn); % Generates either +1/-1 depending on the sign of randn.
        y_t(n+1) = y_t(n) + A;
    end
    plot(x_t, y_t);
    hold on
end
grid on;
% Enlarge figure to full screen.
set(gcf, 'Units', 'Normalized', 'Outerposition', [0, 0.05, 1, 0.95]);
axis square;
Now, I want to be able to create graphs that show the distribution in space of the positions of a large number
(e.g. n = 1000) of random walkers at three different time points (e.g. t = 100, 200 and 300, or any three time points really).
I'm not sure how to go about this. Do I need to turn this into a function, run it three times, and store the coordinates? I have a rough idea but I'm unsure about the implementation. I'd assume the safest and least messy way would be to use subplot() to put all three plots together in the same figure.
I'd appreciate any assistance!
You can use cumsum to vectorize the process: you only need to cumsum a random matrix composed of -1 and 1 values.
clc;
close all;
M = 50; % The amount of random walks.
steps = [10,200,1000]; % here we analyse steps 10, 200 and 1000
cc = hsv(length(steps)); % manage the color of the plot
%generation of each random walk
x = sign(randn(max(steps),M));
y = sign(randn(max(steps),M));
xs = cumsum(x);
xval = xs(steps,:);
ys = cumsum(y);
yval = ys(steps,:);
hold on
for n=1:length(steps)
plot(xval(n,:),yval(n,:),'o','markersize',1,'color',cc(n,:),'MarkerFaceColor',cc(n,:));
end
legend('10','200','1000')
axis square
grid on;
Results:
EDIT:
Thanks to @LuisMendo, who answered my question here, you can use a binomial distribution to get the same result:
steps = [10,200,10000];
cc = hsv(length(steps)); % manage the color of the plot
M = 50;
DV = [-1 1];
p = .5; % probability of DV(2)
% Using @LuisMendo's binomial solution:
for ii = 1:length(steps)
    SDUDx(ii,:) = (DV(2)-DV(1))*binornd(steps(ii), p, M, 1) + DV(1)*steps(ii);
    SDUDy(ii,:) = (DV(2)-DV(1))*binornd(steps(ii), p, M, 1) + DV(1)*steps(ii);
end
hold on
for n=1:length(steps)
plot(SDUDx(n,:),SDUDy(n,:),'o','markersize',1,'color',cc(n,:),'MarkerFaceColor',cc(n,:));
end
legend('10','200','10000')
axis square
grid on;
What is the advantage? Even with a very large number of steps, say 1000000, MATLAB can handle it. The first approach is a brute-force solution (every single step is simulated), while the second is a statistical one: after n steps of +/-1, the final coordinate equals 2*B - n with B ~ Binomial(n, 0.5), so one binornd draw per walker replaces the whole cumulative sum.
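As a quick sanity check (a sketch of my own, assuming the Statistics Toolbox for binornd), you can compare the spread of final positions produced by the two approaches for a single step count:
n = 1000;   % number of steps
M = 5000;   % number of walkers
% brute force: sum of n random +/-1 steps per walker (the last row of the cumsum)
posBrute = sum(sign(randn(n, M)), 1);
% statistical: one binomial draw per walker, mapped to a +/-1 walk endpoint
posBinom = 2*binornd(n, 0.5, 1, M) - n;
% both should have mean ~0 and standard deviation ~sqrt(n)
fprintf('brute force: mean %.2f, std %.2f\n', mean(posBrute), std(posBrute));
fprintf('binomial:    mean %.2f, std %.2f\n', mean(posBinom), std(posBinom));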
If you want to show the distribution of a large number of these walkers, say 1000, I would say the most suitable way of plotting is as a 'point cloud' using scatter. Create an array of N points for both the x and the y coordinate, and update the coordinates in a loop for tidx = 1:Nt, where Nt is 100, 200, or 300 as you describe. Something along the lines of the following:
N = 500;
x_t = zeros(N,1);
y_t = zeros(N,1);
Nt = 100;
for tidx = 1:Nt
    x_t = x_t + sign(randn(N,1));
    y_t = y_t + sign(randn(N,1));
end
scatter(x_t,y_t,'k*');
This will give you N x and y coordinates generated in the same way as in the sample you provided.
One thing to keep in mind is that sign(0)=0, so I suppose there is a chance (admittedly a small one) of not altering the coordinate. I am not sure if you intended this behaviour to be possible (a walker standing still)?
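If you want to rule out zero steps, one option (my own suggestion, not part of the original code) is to draw strict +/-1 steps from a uniform random number; reusing N, x_t and y_t from the loop above:
% rand(N,1) < 0.5 is a logical vector; 2*logical - 1 maps it to exactly -1 or +1
x_t = x_t + (2*(rand(N,1) < 0.5) - 1);
y_t = y_t + (2*(rand(N,1) < 0.5) - 1);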
I will demonstrate the 1-dimensional case for clarity; you only need to implement this for each dimension you add.
Model N steps for M walkers using an NxM matrix.
>> N = 5;
>> M = 4;
>> steps = sign(randn(N,M));
steps =
1 1 1 1
-1 1 -1 1
1 -1 -1 -1
1 1 -1 1
1 -1 -1 -1
For plotting, it is useful to make a second NxM matrix s containing the updated positions after each step, where s(n,m) gives the position of walker m after n steps.
Use cumsum to vectorize instead of looping.
>> s = cumsum(steps)
s =
1 1 1 1
0 2 0 2
1 1 -1 1
2 2 -2 2
3 1 -3 1
To prevent plot redraw after each new line, use hold on.
>> figure; hold on
>> plot(1:N, s(1:N, 1:M), 'marker', '.', 'markersize', 20, 'linewidth', 3)
>> xlabel('Number of steps'); ylabel('Position')
The output plot looks like this:
This method scales very well to 2- and 3-dimensional random walks.
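For instance, a 2-D version following the same N-by-M pattern (the sizes below are just for illustration) might look like:
N = 300;   % steps
M = 1000;  % walkers
sx = cumsum(sign(randn(N,M))); % x positions after each step
sy = cumsum(sign(randn(N,M))); % y positions after each step
plot(sx(end,:), sy(end,:), 'k.') % positions of all walkers after N steps
axis equal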

I need to do spectral clustering for a two-donut-shaped dataset (MATLAB)

I have tried for hours but I cannot find a solution.
I have a "two donuts" data sample (variable "X");
you can download the file from the link below:
donut dataset (rings.mat)
It spreads into a 2D shape like the image below.
The first 250 points are located on the inner donut and the last 750 points on the outer donut,
and I need to perform spectral clustering.
I made a similarity matrix "W" with a Gaussian similarity measure,
and I made a degree matrix from the sum of each row of "W",
and then I computed the eigenvalues (E) and eigenvectors (V),
but the shape of "V" is not good.
What is wrong with my attempt?
I cannot figure it out.
load rings.mat
[D, N] = size(X); % data stored in X
%initial plot data
figure; hold on;
for i=1:N,
plot(X(1,i), X(2,i),'o');
end
% perform spectral clustering
W = zeros(N,N);
D = zeros(N,N);
sigma = 1;
for i = 1:N
    for j = 1:N
        xixj2 = (X(1,i)-X(1,j))^2 + (X(2,i)-X(2,j))^2;
        W(i,j) = exp(-xixj2 / (2*sigma^2)); % compute weight here
        % if (i==j)
        %     W(i,j) = 0;
        % end
    end
    D(i,i) = sum(W(i,:));
end
L = D - W ;
normL = D^-0.5*L*D^-0.5;
[u,s,v] = svd(normL);
If you use the Laplacian as it is in your code (the "real" Laplacian), then to cluster your points into two sets you will want the eigenvector corresponding to the second smallest eigenvalue.
The intuitive idea is to connect all of your points to each other with springs, where the springs are stiffer if the points are near each other, and less stiff for points far away. The eigenvectors of the Laplacian are the modes of vibration if you hit your spring network with a hammer and watch it oscillate: smaller eigenvalues correspond to lower frequency "bulk" modes, and larger eigenvalues to higher frequency oscillations. You want the eigenvector corresponding to the second smallest eigenvalue, which will be like the second mode of a drum, with the positive part clustered together and the negative part clustered together.
Now there is some confusion in the comments about whether to use the largest or smallest eigenvalue, and that is because the Laplacian in the paper linked there by dave is slightly different: it is the identity minus your Laplacian. So there they want the largest eigenvalues, whereas you want the smallest. The clustering in the paper is also a bit more advanced, and better, but not as easy to implement.
Here is your code, modified to work:
load rings.mat
[D, N] = size(X); % data stored in X
%initial plot data
figure; hold on;
for i=1:N,
plot(X(1,i), X(2,i),'o');
end
% perform spectral clustering
W = zeros(N,N);
D = zeros(N,N);
sigma = 0.3; % <--- Changed to be smaller
for i = 1:N
    for j = 1:N
        xixj2 = (X(1,i)-X(1,j))^2 + (X(2,i)-X(2,j))^2;
        W(i,j) = exp(-xixj2 / (2*sigma^2)); % compute weight here
        % if (i==j)
        %     W(i,j) = 0;
        % end
    end
    D(i,i) = sum(W(i,:));
end
L = D - W ;
normL = D^-0.5*L*D^-0.5;
[u,s,v] = svd(normL);
% New code below this point
cluster1 = find(u(:,end-1) >= 0);
cluster2 = find(u(:,end-1) < 0);
figure
plot(X(1,cluster1),X(2,cluster1),'.b')
hold on
plot(X(1,cluster2),X(2,cluster2),'.r')
hold off
title(sprintf('sigma=%g',sigma))
Here is the result:
Now notice that I changed sigma to be smaller - from 1.0 to 0.3. When I left it at 1.0, I got the following result:
which I assume is because with sigma=1, the points in the inner cluster were able to "pull" on the outer cluster (which they are about distance 1 away from) enough so that it was more energetically favorable to split both circles in half like a solid vibrating drum, rather than have two different circles.

Equally spaced points in a contour

I have a set of 2D points (not ordered) forming a closed contour, and I would like to resample them to 14 equally spaced points. It is a contour of a kidney on an image. Any ideas?
One intuitive approach (IMO) is to create an independent variable for both x and y. Base it on arc length, and interpolate on it.
% close the contour, temporarily
xc = [x(:); x(1)];
yc = [y(:); y(1)];
% current spacing may not be equally spaced
dx = diff(xc);
dy = diff(yc);
% distances between consecutive coordinates
dS = sqrt(dx.^2+dy.^2);
dS = [0; dS]; % including start point
% arc length, going along (around) snake
d = cumsum(dS); % here is your independent variable
perim = d(end);
Now you have an independent variable and you can interpolate to create N segments:
N = 14;
ds = perim / N;
dSi = ds*(0:N).'; % your NEW independent variable, equally spaced
dSi(end) = dSi(end)-.005; % appease interp1
xi = interp1(d,xc,dSi);
yi = interp1(d,yc,dSi);
xi(end)=[]; yi(end)=[];
Try it using imfreehand:
figure, imshow('cameraman.tif');
h = imfreehand(gca);
xy = h.getPosition; x = xy(:,1); y = xy(:,2);
% run the above solution ...
Say your contour is defined by independent vector x and dependent vector y.
You can get your resampled x vector using linspace:
new_x = linspace(min(x),max(x),14); %14 to get 14 equally spaced points
Then use interp1 to get new_y values at each new_x point:
new_y = interp1(x,y,new_x);
There are a few interpolation methods to choose from - default is linear. See interp1 help for more info.
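For example, a quick sketch of specifying a different method (the method names are standard interp1 options):
new_y_spline = interp1(x, y, new_x, 'spline'); % cubic spline interpolation
new_y_pchip  = interp1(x, y, new_x, 'pchip');  % shape-preserving piecewise cubic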