I would like to populate random points on a 2D plot, in such a way that the points fall in proximity of a "C" shaped polyline.
I managed to accomplish this for a rather simple square shaped "C":
This is how I did it:
% Marker color
c = 'k'; % Black
% Red "C" polyline
xl = [8,2,2,8];
yl = [8,8,2,2];
plot(xl,yl,'r','LineWidth',2);
hold on;
% Axis settings
axis equal;
axis([0,10,0,10]);
set(gca,'xtick',[],'ytick',[]);
step = 0.05; % Affects point quantity
coeff = 0.9; % Affects point density
% Top Horizontal segment
x = 2:step:9.5;
y = 8 + coeff*randn(size(x));
scatter(x,y,'filled','MarkerFaceColor',c);
% Vertical segment
y = 1.5:step:8.5;
x = 2 + coeff*randn(size(y));
scatter(x,y,'filled','MarkerFaceColor',c);
% Bottom Horizontal segment
x = 2:step:9.5;
y = 2 + coeff*randn(size(x));
scatter(x,y,'filled','MarkerFaceColor',c);
hold off;
As you can see in the code, for each segment of the polyline I generate the scatter point coordinates artificially using randn.
For the previous example, splitting the polyline into segments and generating the points manually is fine. However, what if I wanted to experiment with a more sophisticated "C" shape like this one:
Note that with my current approach, when the geometric complexity of the polyline increases so does the coding effort.
Before going any further, is there a better approach for this problem?
A simpler approach, which generalizes to any polyline, is to run a loop over the segments. For each segment, r is its length and m is the number of points to be placed along it (the resulting spacing closely matches the prescribed step size, deviating slightly when the step does not evenly divide the length). Note that both x and y are subject to random perturbation.
for n = 1:numel(xl)-1
    r = norm([xl(n)-xl(n+1), yl(n)-yl(n+1)]);
    m = round(r/step) + 1;
    x = linspace(xl(n), xl(n+1), m) + coeff*randn(1,m);
    y = linspace(yl(n), yl(n+1), m) + coeff*randn(1,m);
    scatter(x,y,'filled','MarkerFaceColor',c);
end
Output:
A more complex example, using coeff = 0.4; and xl = [8,4,2,2,6,8];
yl = [8,6,8,2,4,2];
If you think this point cloud is too thin near the endpoints, you can artificially lengthen the first and last segments before running the loop. But I don't see the need: it makes sense that the fuzzed curve thins out at the extremities.
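If you do want to try that, here is a minimal sketch (assuming an open polyline with distinct first and last vertices; ext is a hypothetical extension length):
ext = 1; % hypothetical extension length, in the same units as xl/yl
d1 = [xl(1)-xl(2), yl(1)-yl(2)];             d1 = d1/norm(d1); % outward direction at the first vertex
d2 = [xl(end)-xl(end-1), yl(end)-yl(end-1)]; d2 = d2/norm(d2); % outward direction at the last vertex
xl(1)   = xl(1)   + ext*d1(1);  yl(1)   = yl(1)   + ext*d1(2);
xl(end) = xl(end) + ext*d2(1);  yl(end) = yl(end) + ext*d2(2);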
With your original approach, two places at the same distance from the line can be sampled with different probability, especially at the corners where two segments meet. I tried to fix this by rephrasing the random experiment. The random experiment my code performs is: "Pick a random point. Accept it if normpdf(d) > rand, where d is the distance to the nearest line", i.e. the acceptance probability is proportional to normpdf(d). This is a rejection sampling strategy.
xl = [8,4,2,2,6,8];
yl = [8,6,8,2,4,2];
resolution=50;
points_to_sample=200;
step=.5;
sigma=.4; %lower value to get points closer to the line.
xmax=(max(xl)+2);
ymax=(max(yl)+2);
dist=zeros(xmax*resolution+1,ymax*resolution+1);
x=[];
y=[];
for n = 1:numel(xl)-1
    r = norm([xl(n)-xl(n+1), yl(n)-yl(n+1)]);
    m = round(r/step) + 1;
    x = [x,round(linspace(xl(n)*resolution+1, xl(n+1)*resolution+1, m*resolution))];
    y = [y,round(linspace(yl(n)*resolution+1, yl(n+1)*resolution+1, m*resolution))];
end
%dist contains the lines:
dist(sub2ind(size(dist),x,y))=1;
%dist contains the normalized distance of each rastered pixel to the line.
dist=bwdist(dist)/resolution;
pseudo_pdf=normpdf(dist,0,sigma);
%scale up to have acceptance rate of 1 for most likely pixels.
pseudo_pdf=pseudo_pdf/max(pseudo_pdf(:));
sampled_points=zeros(0,2);
while size(sampled_points,1)<points_to_sample
    %sample a random point
    sx=rand*xmax;
    sy=rand*ymax;
    %accept it if the criterion based on the normal distribution matches.
    if pseudo_pdf(round(sx*resolution)+1,round(sy*resolution)+1)>rand
        sampled_points(end+1,:)=[sx,sy];
    end
end
plot(xl,yl,'r','LineWidth',2);
hold on
scatter(sampled_points(:,1),sampled_points(:,2),'filled');
I have a time-dependent system with a varying number of particles (~100k). Each particle represents an interaction in 3D space with a particular strength, so each particle has (X,Y,Z;w): its coordinates plus a weight factor between 0 and 1 indicating the strength of the interaction at that coordinate.
Here http://pho.to/9Ztti I have uploaded 10 real-time snapshots of the system, with the particles represented as reddish small dots; the redder the dot, the stronger the interaction.
The question is: how can one produce a 3D (spatial) density map of these particles, preferably in Matlab, Origin Pro 9, or ImageJ? Is there a way to, say, take the average of these images based on the red-color intensity in ImageJ?
Since I have the numerical data for the particles (X,Y,Z;w), I can analyze them in other software as well, so you are welcome to suggest any other analytical approach or software.
Any ideas/comments are welcome!
Assuming your data is in 3D continuous space and your dataset is just a list of the 3d positions of each particle interaction, it sounds like you want to make a 4D weighted histogram. You'll have to chop the 3d space into bins and sum the weighted points in each bin over time, then plot the results in a single 3d plot where color represents the summed weighted interactions over time.
Here's an example with randomly generated particle interactions:
%% Create dataSet of random particle interactions in 3d space
numInteractions = 5000;
dataSet = [rand(numInteractions,3)*100, rand(numInteractions,1), (1:numInteractions)'];
% dataSet = [x y z interactionStrength imageNumber]
xLimits = [min(dataSet(:,1)) max(dataSet(:,1))];
yLimits = [min(dataSet(:,2)) max(dataSet(:,2))];
zLimits = [min(dataSet(:,3)) max(dataSet(:,3))];
numBins = 10; % Number of bins to split each spatial dimension into
binXInterval = (xLimits(2)-xLimits(1))/numBins;
binYInterval = (yLimits(2)-yLimits(1))/numBins;
binZInterval = (zLimits(2)-zLimits(1))/numBins;
histo = [];
for i = xLimits(1)+(binXInterval/2) : binXInterval : xLimits(2)+(binXInterval/2)
    for j = yLimits(1)+(binYInterval/2) : binYInterval : yLimits(2)+(binYInterval/2)
        for k = zLimits(1)+(binZInterval/2) : binZInterval : zLimits(2)+(binZInterval/2)
            %% Filter particle interactions down to the current spatial bin
            idx = find((dataSet(:,1) > (i - binXInterval)) & (dataSet(:,1) < i));
            temp = dataSet(idx,:);
            idx = find((temp(:,2) > (j - binYInterval)) & (temp(:,2) < j));
            temp = temp(idx,:);
            idx = find((temp(:,3) > (k - binZInterval)) & (temp(:,3) < k));
            temp = temp(idx,:);
            %% Add up all interaction strengths found within this bin (store the bin center)
            histo = [histo; i-binXInterval/2, j-binYInterval/2, k-binZInterval/2, sum(temp(:,4))];
        end
    end
end
%% Remove bins with no particle interactions
idx = find(histo(:,4)>0);
histo = histo(idx,:);
numberOfImages = max(dataSet(:,5));
%% Plot result
PointSizeMultiplier = 100000;
% histo already holds bin-center coordinates in data units, so plot them directly
scatter3(histo(:,1), histo(:,2), histo(:,3), ...
    (histo(:,4)/numberOfImages)*PointSizeMultiplier, histo(:,4)/numberOfImages);
colormap hot;
%Size and color represent the average interaction intensity over time
4D histogram made from 10000 randomly generated particle interactions. Each axis divided into 10 bins. Size and color represent summed particle interactions in each bin over time:
If your system can handle the matrix in Matlab, it could be as easy as
A = mean(M, 4);
Assuming M holds the 4D compilation of your images, A would be your map.
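For completeness, a minimal sketch of building M from a stack of snapshots (the file names and count are hypothetical; assumes same-size RGB images):
numImages = 10;
first = im2double(imread('snapshot_01.png')); % hypothetical file names
M = zeros([size(first), numImages]);          % H x W x 3 x numImages for RGB snapshots
for n = 1:numImages
    M(:,:,:,n) = im2double(imread(sprintf('snapshot_%02d.png', n)));
end
A = mean(M, 4); % pixel-wise average over the snapshots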
One way would be to use a 3D scatter (bubble) plot, with variable circle/bubble sizes, proportional to the intensity of your particle.
Here is a simulated example:
N = 1e4; % number of particles
X = randn(N,1); % randomly generated coordinates
Y = 2*randn(N,1);
Z = 0.5*randn(N,1);
S = exp(-sqrt(X.^2 + Y.^2 + Z.^2)); % bubble size vector
scatter3(X,Y,Z,S*200)
Here I have randomly generated values for X, Y and Z, while S decays exponentially with the distance from the center of the cloud.
In your case, if we assume that the (X,Y,Z,w) values are stored in a 2D array called Particles, it would be:
X = Particles(:,1);
Y = Particles(:,2);
Z = Particles(:,3);
S = Particles(:,4);
Hope that helped.
Using normrnd, I would like to create normally distributed values with mean and sigma expressed as 1x45 vectors varying from 1 to 45, and to plot this simulated PDF against the ideal values.
Whenever I call normrnd as shown below,
Gaussian = normrnd([1 45],[1 45],[1 500],length(c_t));
I am obtaining the following error,
Size information is inconsistent.
The reason for creating this PDF is to compute the chemical kinetics of a tracer with a variable Gaussian noise model. Basically, I have the ideal characteristics of a tracer; now I would like to add Gaussian noise and understand how the chemical kinetics of the tracer vary with changing noise.
There are different computational models for understanding the chemical kinetics of a tracer, one of which is the three-compartmental model; others are shape analysis and constrained shape analysis.
I currently have the ideal curve for each respective model; now I would like to add noise to these models and understand how each particular model behaves with varying noise.
This is why I would like to create a variable noise model with normrnd, add this model to the ideal characteristics, and compute Noise (Sigma) vs. Error. This analysis will give me an approximate estimate of how the different models behave with varying noise, and which model is suitable for estimating the chemical kinetics of the tracer.
function [c_t,c_t_noise] =Noise_ConstrainedK2(t,a1,a2,a3,b1,b2,b3,td,tmax,k1,k2,k3)
K_1 = (k1*k2)/(k2+k3);
K_2 = (k1*k3)/(k2+k3);
%DV_free= k1/(k2+k3);
c_t = zeros(size(t));
ind = (t > td) & (t < tmax);
c_t(ind)= conv(((t(ind) - td) ./ (tmax - td) * (a1 + a2 + a3)),(K_1*exp(-(k2+k3)*t(ind)+K_2)),'same');
ind = (t >= tmax);
c_t(ind)=conv((a1 * exp(-b1 * (t(ind) - tmax))+ a2 * exp(-b2 * (t(ind) - tmax))) + a3 * exp(-b3 * (t(ind) - tmax)),(K_1*exp(-(k2+k3)*t(ind)+K_2)),'same');
meanAndVar = (rand(45,2)-0.5)*2;
numPoints = 500;
randSamples = zeros(1,numPoints);
for ii = 1:numPoints
    idx = mod(ii,size(meanAndVar,1))+1;
    randSamples(ii) = normrnd(meanAndVar(idx,1),meanAndVar(idx,2));
    c_t_noise = c_t + randSamples(ii);
end
scatter(1:numPoints,randSamples)
dg = [0 0.5 0];
plot(t,c_t,'r');
hold on;
plot(t,c_t_noise,'Color',dg);
hold off;
axis([0 50 0 1900]);
xlabel('Time[mins]');
ylabel('concentration [Mbq]');
title('My signal');
%plot(t,c_tnp);
end
The output characteristics of the above function are shown below; here I could not visualize any noise:
The only thing remotely close to what you want can be done as follows, but it will involve looping, because you cannot request 500 data points from only 45 different means and sigmas unless the sets are allowed to be revisited.
This is my interpretation of what you want, though I am still not entirely sure.
Random Gaussian Function Selection
meanAndVar = rand(45,2);
numPoints = 500;
randSamples = zeros(1,numPoints);
for ii = 1:numPoints
    randMeanVarIdx = randi([1,size(meanAndVar,1)]);
    randSamples(ii) = normrnd(meanAndVar(randMeanVarIdx,1),meanAndVar(randMeanVarIdx,2));
end
scatter(1:numPoints,randSamples)
The above code generates a random 2-D matrix of means and spreads (1st col = mean, 2nd col = sigma; note that normrnd expects a standard deviation, not a variance). We then preallocate some space.
Inside the loop we choose a random set of mean and sigma to use (uniformly), plug it into a random Gaussian value function, and store the result.
The matrix randSamples will then contain a list of random values generated by a set of Gaussian functions chosen uniformly at random.
Sequential Function Selection
If you do not want to randomly select which function to use, and just want to go sequentially, you can loop using the modulus to get the index of which set of values to use.
meanAndVar = rand(45,2);
meanAndVar(:,1) = (meanAndVar(:,1)-0.5)*2; % zero-shift the means to [-1,1]; sigma (2nd col) must stay positive
numPoints = 500;
randSamples = zeros(1,numPoints);
for ii = 1:numPoints
    idx = mod(ii,size(meanAndVar,1))+1;
    randSamples(ii) = normrnd(meanAndVar(idx,1),meanAndVar(idx,2));
end
scatter(1:numPoints,randSamples)
The problem with this statement
Gaussian = normrnd([1 45],[1 45],[1 500],length(c_t));
is that you supply two mu values and two sigma values, and ask for a matrix of size [1 500] x length(c_t). You need to pass the size in a uniform way, so either
Gaussian = normrnd(mu, sigma,[500 length(c_t)]);
or
Gaussian = normrnd(mu, sigma, 500, length(c_t));
Then you should make sure that the size of the mu/sigma vectors matches the size of the matrix you ask for. So if you want a 500 x length(c_t) matrix as output, you need to pass 500 x length(c_t) (mu,sigma) pairs. If you only want to vary one of mu or sigma, you can pass a single value for the other parameter.
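For instance, a minimal sketch that varies sigma across the columns while keeping mu fixed (assuming c_t is the row vector returned by your function):
mu = 0; % fixed mean
sigma = repmat(linspace(1, 45, numel(c_t)), 500, 1); % 500 x length(c_t); sigma varies per column
Gaussian = normrnd(mu, sigma, 500, numel(c_t));      % one (mu,sigma) pair per output element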
To get N values from a normal distribution with fixed mean and steadily increasing sigma you can do
noise = @(mu, s0, s1, n) normrnd(mu, s0:(s1-s0)/(n-1):s1, 1, n)
where s0 is the lowest sigma value and s1 is the largest sigma value. To get 10 values drawn from distributions with mu=0 and sigma increasing from 1 to 5 you can do
noise(0,1,5,10)
If you want to introduce some randomness in the increase of sigma you can do
noise_rand = @(mu, s0, s1, n) normrnd(mu, (s0:(s1-s0)/(n-1):s1) .* rand(1,n), 1, n)
I'm trying to generate a cloud of 2D points (uniformly) distributed within a triangle. So far, I've achieved the following:
The code I've used is this:
N = 1000;
X = -10:0.1:10;
for i=1:N
    j = ceil(rand() * length(X));
    x_i = X(j);
    y_i = (10 - abs(x_i)) * rand;
    E(:, i) = [x_i y_i];
end
However, the points are not uniformly distributed, as is clearly seen at the left and right corners. How can I improve that result? I have also been trying to find approaches for different shapes, with no luck.
You should first ask yourself what would make the points within a triangle distributed uniformly.
To make a long story short, given all three vertices of the triangle, you need to transform two uniformly distributed random values like so (the square root applied to t is what keeps the density uniform: it compensates for the triangle growing linearly wider as you move away from the first vertex):
N = 1000; % # Number of points
V = [-10, 0; 0, 10; 10, 0]; % # Triangle vertices, pairs of (x, y)
t = sqrt(rand(N, 1));
s = rand(N, 1);
P = (1 - t) * V(1, :) + bsxfun(@times, ((1 - s) * V(2, :) + s * V(3, :)), t);
This will produce a set of points which are uniformly distributed inside the specified triangle:
scatter(P(:, 1), P(:, 2), '.')
Note that this solution does not involve repeated conditional manipulation of random numbers, so it cannot fall into a potentially endless loop.
For further reading, have a look at this article.
That concentration of points is to be expected from the way you are building them. Your points are equally distributed along the X axis. Near the extremes of the triangle approximately the same number of points is present as at the center, but there they are squeezed into a much smaller region.
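A one-line check makes this visible (assuming the E matrix built by your code, whose first row holds the x coordinates):
hist(E(1,:), 20) % x is roughly uniform even though the triangle narrows towards x = +/-10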
The first and best approach I can think of: brute force. Distribute the points equally around a bigger region, and then delete the ones that are outside the region you are interested in.
N = 1000;
points = zeros(N,2);
n = 0;
while (n < N)
    n = n + 1;
    x_i = 20*rand-10; % generate a number between -10 and 10
    y_i = 10*rand;    % generate a number between 0 and 10
    if (y_i > 10 - abs(x_i)) % if the point falls outside the triangle
        n = n - 1; % decrease the counter to try to generate one more point
    else % if the point is inside the triangle
        points(n,:) = [x_i y_i]; % add it to the list of points
    end
end
% plot the points generated
plot(points(:,1), points(:,2), '.');
title ('1000 points randomly distributed inside a triangle');
The result of the code I've posted:
One important disclaimer: randomly distributed does not mean evenly spaced! Data generated from a uniform distribution is uniformly distributed in the statistical sense, but it will not look evenly spread over the triangle; you will in fact see some clusters of points.
Imagine that the triangle is split vertically into two halves, and that one half is moved so that together with the other it forms a rectangle. Now you sample uniformly in the rectangle, which is easy, and then move the half triangle back.
Also, it's easier to work with unit lengths (the rectangle becomes a square) and then stretch the triangle to the desired dimensions.
x = [-10 10]; % //triangle base
y = [0 10]; % //triangle height
N = 1000; %// number of points
points = rand(N,2); %// sample uniformly in unit square
ind = points(:,2)>points(:,1); %// points to be unfolded
points(ind,:) = [2-points(ind,2) points(ind,1)]; %// unfold them
points(:,1) = x(1) + (x(2)-x(1))/2*points(:,1); %// stretch x as needed
points(:,2) = y(1) + (y(2)-y(1))*points(:,2); %// stretch y as needed
plot(points(:,1),points(:,2),'.')
We can generalize this case. If you want to sample points UNIFORMLY from an (n - 1)-dimensional simplex in Euclidean space (not necessarily a triangle: any simplex, e.g. a tetrahedron in 3D), just sample a vector from a symmetric n-dimensional Dirichlet distribution with parameter 1; its components are the convex (barycentric) coordinates of a uniform point relative to the vertices of the simplex.
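A minimal sketch of that approach for the triangle above, generating Dirichlet(1,1,1) weights from exponential variates (the normalized-exponentials construction is standard; the variable names are mine):
V = [-10 0; 10 0; 0 10]; % simplex (triangle) vertices, one per row
N = 1000;
G = -log(rand(N, size(V,1)));       % i.i.d. exponential(1) variates
W = bsxfun(@rdivide, G, sum(G,2));  % Dirichlet(1,1,1) barycentric weights
P = W * V;                          % uniform points inside the triangle
plot(P(:,1), P(:,2), '.')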
I am working on a thumb recognition system and need to implement the KNN algorithm to classify my images. According to this, it has only 2 measurements, through which it calculates the distance to find the nearest neighbour, but in my case I have 400 images of 25 x 42 pixels, of which 200 are for training and 200 for testing. I have been searching for a few hours, but I cannot find a way to compute the distance between the points.
EDIT:
I have reshaped the first 200 images into 1 x 1050 vectors and stored them in a matrix trainingData of size 200 x 1050. Similarly I made testingData.
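For reference, a minimal sketch of that reshaping step (assuming the images are already loaded into a 25 x 42 x 200 array imgs; the variable name is hypothetical):
trainingData = reshape(imgs, [], 200)'; % 200 x 1050, one image per row (column-major pixel order)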
Here is an illustrative code example for k-nearest neighbor classification (some of the functions used require the Statistics Toolbox):
%# image size
sz = [25,42];
%# training images
numTrain = 200;
trainData = zeros(numTrain,prod(sz));
for i=1:numTrain
    img = imread( sprintf('train/image_%03d.jpg',i) );
    trainData(i,:) = img(:);
end
%# testing images
numTest = 200;
testData = zeros(numTest,prod(sz));
for i=1:numTest
    img = imread( sprintf('test/image_%03d.jpg',i) );
    testData(i,:) = img(:);
end
%# target class (I'm just using random values. Load your actual values instead)
trainClass = randi([1 5], [numTrain 1]);
testClass = randi([1 5], [numTest 1]);
%# compute pairwise distances between each test instance vs. all training data
D = pdist2(testData, trainData, 'euclidean');
[D,idx] = sort(D, 2, 'ascend');
%# K nearest neighbors
K = 5;
D = D(:,1:K);
idx = idx(:,1:K);
%# majority vote
prediction = mode(trainClass(idx),2);
%# performance (confusion matrix and classification error)
C = confusionmat(testClass, prediction);
err = sum(C(:)) - sum(diag(C))
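Note that err above is a misclassification count, not a rate; a small follow-up using the same variables gives the rate:
errRate = err / numTest; % fraction of misclassified test images
accuracy = 1 - errRate;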
If you want to compute the Euclidean distance between vectors a and b, just use Pythagoras. In Matlab:
dist = sqrt(sum((a-b).^2));
However, you might want to use pdist to compute it for all combinations of vectors in your matrix at once.
dist = squareform(pdist(myVectors, 'euclidean'));
I'm interpreting columns as instances to classify and rows as potential neighbors. This is arbitrary though and you could switch them around.
If you have a separate test set, you can compute the distance to the instances in the training set with pdist2:
dist = pdist2(trainingSet, testSet, 'euclidean')
You can use this distance matrix to knn-classify your vectors as follows. I'll generate some random data to serve as example, which will result in low (around chance level) accuracy. But of course you should plug in your actual data and results will probably be better.
nrOfVectors = 100; nrOfFeatures = 10; nrOfClasses = 5; % example sizes, so the snippet runs standalone
m = rand(nrOfVectors,nrOfFeatures); % random example data
classes = randi(nrOfClasses, 1, nrOfVectors); % random true classes
k = 3; % number of neighbors to consider, 3 is a common value
d = squareform(pdist(m, 'euclidean')); % distance matrix
[neighborvals, neighborindex] = sort(d,1); % get sorted distances
Take a look at the neighborvals and neighborindex matrices and see if they make sense to you. The first is a sorted version of the earlier d matrix, and the latter gives the corresponding instance numbers. Note that the self-distances (on the diagonal in d) have floated to the top. We're not interested in this (always zero), so we'll skip the top row in the next step.
assignedClasses = mode(classes(neighborindex(2:1+k,:)),1); % classes of the k nearest neighbors, skipping the self-match in row 1
So we assign the most common class among the k nearest neighbors!
You can compare the assigned classes with the actual classes to get an accuracy score:
accuracy = sum(classes == assignedClasses)/length(classes);
fprintf('KNN Classifier Accuracy: %.2f%%\n', 100*accuracy)
Or make a confusion matrix to see the distribution of classifications:
confusionmat(classes, assignedClasses)
Yes, there is a function for KNN: knnclassify.
Play around with the number of neighbors you want to keep in order to get the best result (use a confusion matrix). This function takes care of the distance, of course.
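A minimal usage sketch (knnclassify lives in the Bioinformatics Toolbox and was later deprecated in favor of fitcknn; the label variable names are hypothetical):
k = 3; % number of neighbors to consider
predictedClass = knnclassify(testingData, trainingData, trainingLabels, k);
confusionmat(testingLabels, predictedClass) % inspect the classification quality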