Random sample of points from a rectangular box with a spherical obstacle

Random sample of points from a rectangular box with a spherical obstacle - matlab

The workspace is given as:
limits=[-1 4; -1 4; -1 4];
And in this workspace, there is a spherical obstacle which is defined as:
obstacle.origin_x=1.6;
obstacle.origin_y=0.8;
obstacle.origin_z=0.2;
obstacle.radius_obs=0.2;
save('obstacle.mat', 'obstacle');
I would like to create random point in the area of lim. I created random points using the code below:
function a=rndmpnt(lim, numofpoints)
x=lim(1,1)+(lim(1,2)-lim(1,1))*rand(1,numofpoint);
y=lim(2,1)+(lim(2,2)-lim(2,1))*rand(1,numofpoint);
z=lim(3,1)+(lim(3,2)-lim(3,1))*rand(1,numofpoint);
a=[x y z];
Now I would like to eliminate the points in the area of limits-obstacle. how can I do that?

You want to reject the points within the obstacle. Naturally, after rejection you will probably end up with fewer points than numofpoint. So the process will need to be repeated until enough points are generated. A while loop is appropriate here.
Rejection is done by finding ix (indices of acceptable points) and appending only those points to matrix a. The loop repeats until there are enough of those, and returns exactly the number requested.
function a = rndmpnt(lim, numofpoints)
a = zeros(3,0); % begin with empty matrix
while size(a,2) < numofpoint % not enough points yet
x=lim(1,1)+(lim(1,2)-lim(1,1))*rand(1,numofpoint);
y=lim(2,1)+(lim(2,2)-lim(2,1))*rand(1,numofpoint);
z=lim(3,1)+(lim(3,2)-lim(3,1))*rand(1,numofpoint);
ix = (x - obstacle.origin_x).^2 + (y - obstacle.origin_y).^2 + (z - obstacle.origin_z).^2 > obstacle.radius_obs^2;
a = [a, [x(ix); y(ix); z(ix)]];
end
a = a(:, 1:numofpoint);
end
You may want to add a safeguard against infinite loop (some limit on the number of cycles) in case the user passes in the values such that there are no acceptable points.

Related

Which Bins are occupied in a 3D histogram in MatLab

I got 3D data, from which I need to calculate properties.
To reduce computung I wanted to discretize the space and calculate the properties from the Bin instead of the individual data points and then reasign the propertie caclulated from the bin back to the datapoint.
I further only want to calculate the Bins which have points within them.
Since there is no 3D-binning function in MatLab, what i do is using histcounts over each dimension and then searching for the unique Bins that have been asigned to the data points.
a5pre=compositions(:,1);
a7pre=compositions(:,2);
a8pre=compositions(:,3);
%% BINNING
a5pre_edges=[0,linspace(0.005,0.995,19),1];
a5pre_val=(a5pre_edges(1:end-1) + a5pre_edges(2:end))/2;
a5pre_val(1)=0;
a5pre_val(end)=1;
a7pre_edges=[0,linspace(0.005,0.995,49),1];
a7pre_val=(a7pre_edges(1:end-1) + a7pre_edges(2:end))/2;
a7pre_val(1)=0;
a7pre_val(end)=1;
a8pre_edges=a7pre_edges;
a8pre_val=a7pre_val;
[~,~,bin1]=histcounts(a5pre,a5pre_edges);
[~,~,bin2]=histcounts(a7pre,a7pre_edges);
[~,~,bin3]=histcounts(a8pre,a8pre_edges);
bins=[bin1,bin2,bin3];
[A,~,C]=unique(bins,'rows','stable');
a5pre=a5pre_val(A(:,1));
a7pre=a7pre_val(A(:,2));
a8pre=a8pre_val(A(:,3));
It seems like that the unique function is pretty time consuming, so I was wondering if there is a faster way to do it, knowing that the line only can contain integer or so... or a totaly different.
Best regards

function [comps,C]=compo_binner(x,y,z,e1,e2,e3,v1,v2,v3)
C=NaN(length(x),1);
comps=NaN(length(x),3);
id=1;
for i=1:numel(x)
B_temp(1,1)=v1(sum(x(i)>e1));
B_temp(1,2)=v2(sum(y(i)>e2));
B_temp(1,3)=v3(sum(z(i)>e3));
C_id=sum(ismember(comps,B_temp),2)==3;
if sum(C_id)>0
C(i)=find(C_id);
else
comps(id,:)=B_temp;
id=id+1;
C_id=sum(ismember(comps,B_temp),2)==3;
C(i)=find(C_id>0);
end
end
comps(any(isnan(comps), 2), :) = [];
end
But its way slower than the histcount, unique version. Cant avoid find-function, and thats a function you sure want to avoid in a loop when its about speed...

If I understand correctly you want to compute a 3D histogram. If there's no built-in tool to compute one, it is simple to write one:
function [H, lindices] = histogram3d(data, n)
% histogram3d 3D histogram
% H = histogram3d(data, n) computes a 3D histogram from (x,y,z) values
% in the Nx3 array `data`. `n` is the number of bins between 0 and 1.
% It is assumed all values in `data` are between 0 and 1.
assert(size(data,2) == 3, 'data must be Nx3');
H = zeros(n, n, n);
indices = floor(data * n) + 1;
indices(indices > n) = n;
lindices = sub2ind(size(H), indices(:,1), indices(:,2), indices(:,3));
for ii = 1:size(data,1)
H(lindices(ii)) = H(lindices(ii)) + 1;
end
end
Now, given your compositions array, and binning each dimension into 20 bins, we get:
[H, indices] = histogram3d(compositions, 20);
idx = find(H);
[x,y,z] = ind2sub(size(H), idx);
reduced_compositions = ([x,y,z] - 0.5) / 20;
The bin centers for H are at ((1:20)-0.5)/20.
On my machine this runs in a fraction of a second for 5 million inputs points.
Now, for each composition(ii,:), you have a number indices(ii), which matches with another number idx[jj], corresponding to reduced_compositions(jj,:). One easy way to make the assignment of results is as follows:
H(H > 0) = 1:numel(idx);
indices = H(indices);
Now for each composition(ii,:), your closest match in the reduced set is reduced_compositions(indices(ii),:).

Remove long row of data points in Matlab

I have a list of coordinates, coord, which looks like this when plotted:
I want to remove the long string of points that goes completely from 0 to 1 from the data set, shown on this plot starting at (0, 11) and ending at (1, 11) and the other one that begins at (0, 24) and ends at (1, 28).
So far, I have tried using kmeans to group the data by height using this code:
jet = colormap('jet');
amount = 20;
step = floor(numel(jet(:,1))/amount);
idxOIarr = cell(numel(terp));
scale = 100;
for ii = 1:numel(terp)
figure;
hold on;
expandDat = [stretched{ii}(:,1), scale.*log(terp{ii}(:,2))];
[idx, cent] = kmeans(expandDat(:,1:2), amount, 'Distance', 'cityblock');
idxOIarr{ii} = idx;
for jj = 1:amount
scatter(stretched{ii}(idx == jj,1), FREQ(terp{ii}(idx == jj,2)), 10, jet(step*jj,:), 'filled');
end
end
resulting in this image: Although it does separate the higher rows quite well, it breaks the line in the middle in two and groups the line that begins at (0,20) with some data points below it.
Is there any other way to group and remove these points?

The most efficient way to solve this involves building a graph where each point is a vertex. You join points that you consider "connected" or "closed" with an edge. Thus, the graph will connected components. Now you need to look for the connected components that span the whole range from 0 to 1.
Build the graph. Finding neighbors is most efficient using an R-tree. Here are some suggestions. You can also use a k-d tree, for example. However, this is not strictly necessary, it just can get really slow without a proper spatial indexing structure, because you'll have to compare distances between each pair of points.
Given a Nx2 matrix coord, you can find the square distances between each pair:
D = sum((reshape(coord,[],1,2) - reshape(coord,1,[],2)).^2,3);
(note again that this is expensive if N is large, and in that case using an R-tree will speed things up significantly). D(i,j) is the distance between points with indices i and j (i.e. coord(i,:) and coord(j,:).
Next, build the graph, G, nodes i and j are connected if G(i,j)==1. G is a symmetric matrix:
G = D <= max_distance;
Find connected components. A connected component is just a set of nodes that you can reach from each other by following edges. You don't really need to find all connected components, you just need to find the set of points that have x=0, and starting from each, recursively visit all elements in its connected component to see if you can reach a point that has x=1.
This next code is not tested, but helpfully it gives a starting point:
start_indices = find(coord(:,1)==0); % Is exact equality appropriate here?
end_indices = find(coord(:,1)==1);
to_remove = [];
visited = false(size(coord,1), 1);
for ii=start_indices.'
% For each point with x=0, see if we can reach any of the points at x=1
[res, visited] = can_reach(ii, end_indices, G, visited);
if res
% For this point we can, remove it!
to_remove(end+1) = ii;
end
end
% Iterative function to visit all nodes in a connected component
function [res, visited] = can_reach(start, end_indices, G, visited)
visited(start) = true;
if any(start==end_indices)
% We've reach an end point, stop iterating and return true.
res = true;
return;
end
next = find(G(start,:)); % find neighbors
next(visited(next)) = []; % remove visited neighbors
for ii=next
[res, visited] = can_reach(ii, end_indices, G, visited);
if res
% Yes, we can visit an end point, stop iterating now.
return
end
end
end

Plot distances between points matlab

I've made a plot of 10 points
10 10
248,628959661970 66,9462583977501
451,638770451973 939,398361884535
227,712826026548 18,1775336366957
804,449583613070 683,838613746355
986,104241895970 783,736480083219
29,9919502693899 534,137567882728
535,664190667238 885,359450931142
87,0772199008924 899,004898906140
990 990
With the first column as x-coordinates and the other column as y-coordinates
Leading to the following Plot:
Using the following code: scatter(Problem.Points(:,1),Problem.Points(:,2),'.b')
I then also calculated the euclidean distances using Problem.DistanceMatrix = pdist(Problem.Points);
Problem.DistanceMatrix = squareform(Problem.DistanceMatrix);
I replaced the distances by 1*10^6 when they are larger than a certain value.
This lead to the following table:
Then, I would like to plot the lines between the corresponding points, preferably with their distances, but only in case the distance < 1*10^6.
Specifically i want to plot the line [1,2] [1,4] [1,7] [2,4] etc.
My question is, can this be done and how?

Assuming one set of your data is in something called xdata and the other in ydata and then the distances in distances, the following code should accomplish what you want.
hold on
for k = 1:length(xdata)
for j = 1:length(ydata)
if(distances(k,j) < 1e6)
plot([xdata(k) xdata(j)], [ydata(k) ydata(j)]);
end
end
end
You just need to iterate through your matrix and then if the value is less than 1e6, then plot the line between the kth and jth index points. This will however double plot lines, so it will plot from k to j, and also from j to k, but it is quick to code and easy to understand. I got the following plot with this.

This should do the trick:
P = [
10.0000000000000 10.0000000000000;
248.6289596619700 66.9462583977501;
451.6387704519730 939.3983618845350;
227.7128260265480 18.1775336366957;
804.4495836130700 683.8386137463550;
986.1042418959700 783.7364800832190;
29.9919502693899 534.1375678827280;
535.6641906672380 885.3594509311420;
87.0772199008924 899.0048989061400;
990.0000000000000 990.0000000000000
];
P_len = size(P,1);
D = squareform(pdist(P));
D(D > 600) = 1e6;
scatter(P(:,1),P(:,2),'*b');
hold on;
for i = 1:P_len
pi = P(i,:);
for j = 1:P_len
pj = P(j,:);
d = D(i,j);
if ((d > 0) && (d < 1e6))
plot([pi(1) pj(1)],[pi(2) pj(2)],'-r');
end
end
end
hold off;
Final output:
On a side note, the part in which you replaces the distance values trespassing a certain treshold (it looks like it's 600 by looking at your distances matrix) with 1e6 can be avoided by just inserting that threshold into the loop for plotting the lines. I mean... it's not wrong, but I just think it's an unnecessary step.
D = squareform(pdist(P));
% ...
if ((d > 0) && (d < 600))
plot([pi(1) pj(1)],[pi(2) pj(2)],'-r');
end

A friend of mine suggested using gplot
gplot(Problem.AdjM, Problem.Points(:,:), '-o')
With problem.points as the coordinates and Problem.AdjM as the adjacency matrix. The Adjacency matrix was generated like this:
Problem.AdjM=Problem.DistanceMatrix;
Problem.AdjM(Problem.AdjM==1000000)=0;
Problem.AdjM(Problem.AdjM>0)=1;
Since the distances of 1*10^6 was the replacement of a distance that is too large, I put the adjacency there to 0 and all the other to 1.
This lead to the following plot, which was more or less what I wanted:
Since you people have been helping me in such a wonderful way, I just wanted to add this:
I added J. Mel's solution to my code, leading to two exactly the same figures:
Since the figures get the same outcome, both methods should be all right. Furthermore, since Tommasso's and J Mel's outcomes were equal earlier, Tommasso's code must also be correct.
Many thanks to both of you and all other people contributing!

Matlab Code to distribute points on plot

I have edited a code that i found online that helps me draw points somehow distributed on a graph based on the minimum distance between them
This is the code that i have so far
x(1)=rand(1)*1000; %Random coordinates of the first point
y(1)=rand(1)*1000;
minAllowableDistance = 30; %IF THIS IS TOO BIG, THE LOOP DOES NOT END
numberOfPoints = 300; % Number of points equivalent to the number of sites
keeperX = x(1); % Initialize first point
keeperY = y(1);
counter = 2;
for k = 2 : numberOfPoints %Dropping another point, and checking if it can be positioned
done=0;
trial_counter=1;
while (done~=1)
x(k)=rand(1)*1000;
y(k)=rand(1)*1000;
thisX = x(k); % Get a trial point.
thisY = y(k);
% See how far is is away from existing keeper points.
distances = sqrt((thisX-keeperX).^2 + (thisY - keeperY).^2);
minDistance = min(distances);
if minDistance >= minAllowableDistance
keeperX(k) = thisX;
keeperY(k) = thisY;
done=1;
trial_counter=trial_counter+1;
counter = counter + 1;
end
if (trial_counter>2)
done=1;
end
end
end
end
So this code is working fine, but sometimes matlab is freezing if the points are above 600. The problem is full , and no more points are added so matlab is doing the work over and over. So i need to find a way when the trial_counter is larger than 2, for the point to find a space that is empty and settle there.
The trial_counter is used to drop a point if it doesn't fit on the third time.
Thank you

Since trial_counter=trial_counter+1; is only called inside if minDistance >= minAllowableDistance, you will easily enter an infinite loop if minDistance < minAllowableDistance (e.g. if your existing points are quite closely packed).
How you do this depends on what your limitations are, but if you're looking at integer points in a set range, one possibility is to keep the points as a binary image, and use bwdist to work out the distance transform, then pick an acceptable point. So each iteration would be (where BW is your stored "image"/2D binary matrix where 1 is the selected points):
D = bwdist(BW);
maybe_points = find(D>minAllowableDistance); % list of possible locations
n = randi(length(maybe_points)); % pick one location
BW(maybe_points(n))=1; % add it to your matrix
(then add some checking such that if you can't find any allowable points the loop quits)

Matlab -- random walk with boundaries, vectorized

Suppose I have a vector J of jump sizes and an initial starting point X_0. Also I have boundaries 0, B (assume 0 < X_0 < B). I want to do a random walk where X_i = [min(X_{i-1} + J_i,B)]^+. (positive part). Basically if it goes over a boundary, it is made equal to the boundary. Anyone know a vectorized way to do this? The current way I am doing it consists of doing cumsums and then finding places where it violates a condition, and then starting from there and repeating the cumsum calculation, etc until I find that I stop violating the boundaries. It works when the boundaries are rarely hit, but if they are hit all the time, it basically becomes a for loop.
In the code below, I am doing this across many samples. To 'fix' the ones that go out of the boundary, I have to loop through the samples to check...(don't think there is a vectorized 'find')
% X_init is a row vector describing initial resource values to use for
% each sample
% J is matrix where each col is a sequence of Jumps (columns = sample #)
% In this code the jumps are subtracted, but same thing
X_intvl = repmat(X_init,NumJumps,1) - cumsum(J);
X = [X_init; X_intvl];
for sample = 1:NumSamples
k = find(or(X_intvl(:,sample) > B, X_intvl(:,sample) < 0),1);
while(~isempty(k))
change = X_intvl(k-1,sample) - X_intvl(k,sample);
X_intvl(k:end,sample) = X_intvl(k:end,sample)+change;
k = find(or(X_intvl(:,sample) > B, X_intvl(:,sample) < 0),1);
end
end

Interesting question (+1).
I faced a similar problem a while back, although slightly more complex as my lower and upper bound depended on t. I never did work out a fully-vectorized solution. In the end, the fastest solution I found was a single loop which incorporates the constraints at each step. Adapting the code to your situation yields the following:
%# Set the parameters
LB = 0; %# Lower bound
UB = 5; %# Upper bound
T = 100; %# Number of observations
N = 3; %# Number of samples
X0 = (1/2) * (LB + UB); %# Arbitrary start point halfway between LB and UB
%# Generate the jumps
Jump = randn(N, T-1);
%# Build the constrained random walk
X = X0 * ones(N, T);
for t = 2:T
X(:, t) = max(min(X(:, t-1) + Jump(:, t-1), UB), 0);
end
X = X';
I would be interested in hearing if this method proves faster than what you are currently doing. I suspect it will be for cases where the constraint is binding in more than one or two places. I can't test it myself as the code you provided is not a "working" example, ie I can't just copy and paste it into Matlab and run it, as it depends on several variables for which example (or simulated) values are not provided. I tried adapting it myself, but couldn't get it to work properly?
UPDATE: I just switched the code around so that observations are indexed on columns and samples are indexed on rows, and then I transpose X in the last step. This will make the routine more efficient, since Matlab allocates memory for numeric arrays column-wise - hence it is faster when performing operations down the columns of an array (as opposed to across the rows). Note, you will only notice the speed-up for large N.
FINAL THOUGHT: These days, the JIT accelerator is very good at making single loops in Matlab efficient (double loops are still pretty slow). Therefore personally I'm of the opinion that every time you try and obtain a fully-vectorized solution in Matlab, ie no loops, you should weigh up whether the effort involved in finding a clever solution is worth the slight gains in efficiency to be made over an easier-to-obtain method that utilizes a single loop. And it is important to remember that fully-vectorized solutions are sometimes slower than solutions involving single loops when T and N are small!

I'd like to propose another vectorized solution.
So, first we should set the parameters and generate random Jumpls. I used the same set of parameters as Colin T Bowers:
% Set the parameters
LB = 0; % Lower bound
UB = 20; % Upper bound
T = 1000; % Number of observations
N = 3; % Number of samples
X0 = (1/2) * (UB + LB); % Arbitrary start point halfway between LB and UB
% Generate the jumps
Jump = randn(N, T-1);
But I changed generation code:
% Generate initial data without bounds
X = cumsum(Jump, 2);
% Apply bounds
Amplitude = UB - LB;
nsteps = ceil( max(abs(X(:))) / Amplitude - 0.5 );
for ii = 1:nsteps
ind = abs(X) > (1/2) * Amplitude;
X(ind) = Amplitude * sign(X(ind)) - X(ind);
end
% Shifting X
X = X0 + X;
So, instead of for loop I'm using cumsum function with smart post-processing.
N.B. This solution works significantly slower than Colin T Bowers's one for tight bounds (Amplitude < 5), but for loose bounds (Amplitude > 20) it works much faster.