I am implementing Equation 8 from Kraskov et al (2004)'s Estimating Mutual Information paper, and have the following problem:
Given vectors X = [X_1,...,X_N] and r = [r_1,...,r_N], I need to compute A= [A_1,...A_N] where A_i is the number of points in X within a r_i radius of X_i.
If r were a fixed number, that is, if the radius around each point was fixed for all points, I could easily use rangesearch. But because it is a vector (different radius for each point), I am not sure how to do this fast. Exhaustive search (or making any N^2 sized distance arrays) is not good, because N is on the order of a million.
Related
I am trying to randomly generate uniformly distributed vectors, which are of Euclidian length of 1. By uniformly distributed I mean that each entry (coordinate) of the vectors is uniformly distributed.
More specifically, I would like to create a set of, say, 1000 vectors (lets call them V_i, with i=1,…,1000), where each of these random vectors has unit Euclidian length and the same dimension V_i=(v_1i,…,v_ni)' (let’s say n = 5, but the algorithm should work with any dimension). If we then look on the distribution of e.g. v_1i, the first element of each V_i, then I would like that this is uniformly distributed.
In the attached MATLAB example you see that you cannot simply draw random vectors from a uniform distribution and then normalize the vectors to Euclidian length of 1, as the distribution of the elements across the vectors is then no longer uniform.
Is there a way to generate this set of vectors such, that the distribution of the single elements across the vector-set is uniform?
Thank you for any ideas.
PS: MATLAB is our Language of choice, but solutions in any languages are, of course, welcome.
clear all
rng('default')
nvar=5;
sample = 1000;
x = zeros(nvar,sample);
for ii = 1:sample
y=rand(nvar,1);
x(:,ii) = y./norm(y);
end
hist(x(1,:))
figure
hist(x(2,:))
figure
hist(x(3,:))
figure
hist(x(4,:))
figure
hist(x(5,:))
What you want cannot be accomplished.
Vectors with a length of 1 sit on a circle (or sphere or hypersphere depending on the number of dimensions). Let's focus on the 2D case, if it cannot be done there, it will be clear that it cannot be done with more dimensions either.
Because the points are on a circle, their x and y coordinates are dependent, the one can be computed based on the other. Thus, the distributions of x and y coordinates cannot be defined independently. We can define the distribution of the one, generate random values for it, but the other coordinate must be computed from the first.
Let's make points on a half circle with a uniform x coordinate (can be extended to a full circle by adding a random sign to the y coordinate):
N = 1000;
x = 2 * rand(N,1) - 1;
y = sqrt(1 - x.^2);
plot(x,y,'.')
axis equal
histogram(y)
The plot generates shows a clearly non-uniform distribution, with many more samples generated near y=1 than near y=0. If we add a random sign to the y-coordinate we'd have more samples near y=1 and y=-1 than near y=0.
I have an image of a cytoskeleton. There are a lot of small objects inside and I want to calculate the length between all of them in every axis and to get a matrix with all this data. I am trying to do this in matlab.
My final aim is to figure out if there is any axis with a constant distance between the object.
I've tried bwdist and to use connected components without any luck.
Do you have any other ideas?
So, the end goal is that you want to globally stretch this image in a certain direction (linearly) so that the distances between nearest pairs end up the closest together, hopefully the same? Or may you do more complex stretching ? (note that with arbitrarily complex one you can always make it work :) )
If linear global one, distance in x' and y' is going to be a simple multiplication of the old distance in x and y, applied to every pair of points. So, the final euclidean distance will end up being sqrt((SX*x)^2 + (SY*y)^2), with SX being stretch in x and SY stretch in y; X and Y are distances in X and Y between pairs of points.
If you are interested in just "the same" part, solution is not so difficult:
Find all objects of interest and put their X and Y coordinates in a N*2 matrix.
Calculate distances between all pairs of objects in X and Y. You will end up with 2 matrices sized N*N (with 0 on the diagonal, symmetric and real, not sure what is the name for that type of matrix).
Find minimum distance (say this is between A an B).
You probably already have this. Now:
Take C. Make N-1 transformations, which all end up in C->nearestToC = A->B. It is a simple system of equations, you have X1^2*SX^2+Y1^2*SY^2 = X2^2*SX^2+Y2*SY^2.
So, first say A->B = C->A, then A->B = C->B, then A->B = C->D etc etc. Make sure transformation is normalized => SX^2 + SY^2 = 1. If it cannot be found, the only valid transformation is SX = SY = 0 which means you don't have solution here. Obviously, SX and SY need to be real.
Note that this solution is unique except in case where X1 = X2 and Y1 = Y2. In this case, grab some other point than C to find this transformation.
For each transformation check the remaining points and find all nearest neighbours of them. If distance is always the same as these 2 (to a given tolerance), great, you found your transformation. If not, this transformation does not work and you should continue with the next one.
If you want a transformation that minimizes variations between distances (but doesn't require them to be nearly equal), I would do some optimization method and search for a minimum - I don't know how to find an exact solution otherwise. I would pick this also in case you don't have linear or global stretch.
If i understand your question correctly, the first step is to obtain all of the objects center of mass points in the image as (x,y) coordinates. Then, you can easily compute all of the distances between all points. I suggest taking a look on a histogram of those distances which may provide some information as to the nature of distance distribution (for example if it is uniformly random, or are there any patterns that appear).
Obtaining the center of mass points is not an easy task, consider transforming the image into a binary one, or some sort of background subtraction with blob detection or/and edge detector.
For building a histogram you can use histogram.
I want to determine a point in space by geometry and I have math computations that gives me several theta values. After evaluating the theta values, I could get N 1 x 3 dimension matrix where N is the number of theta evaluated. Since I have my targeted point, I only need to decide which of the matrices is closest to the target with adequate focus on the three coordinates (x,y,z).
Take a view of the analysis in the figure below:
Fig 1: Determining Closest Point with all points having minimal error
It can easily be seen that the third matrix is closest using sum(abs(Matrix[x,y,z])). However, if the method is applied on another figure given below, obviously, the result is wrong.
Fig 2: One Point has closest values with 2-axes of the reference point
Looking at point B, it is closer to the reference point on y-,z- axes but just that it strayed greatly on x-axis.
So how can I evaluate the matrices and select the closest one to point of reference and adequate emphasis will be on error differences in all coordinates (x,y,z)?
If your results is in terms of (x,y,z), why don't evaluate the euclidean distance of each matrix you have obtained from the reference point?
Sort of matlab code:
Ref_point = [48.98, 20.56, -1.44];
Curr_point = [x,y,z];
Xd = (x-Ref_point(1))^2 ;
Yd = (y-Ref_point(2))^2 ;
Zd = (z-Ref_point(3))^2 ;
distance = sqrt(Xd + Yd + Zd);
%find the minimum distance
I need to optimize the location of 10 Transmitters and 10 Receivers (modeled as points on an aperture plane) so as to minimize a certain objective scalar using Genetic Algorithm toolbox in MATLAB. My question is: I have (10+10)*2 = 40 variables (optimizing x and y positions of each point). How do I model the constraints in the form Ax <= b, such that each point is separated by a minimum distance in both x and y directions from all other points?
I'd model the objective function as the euclidian distance between the points, which you are trying to minimize. Also the transmitters and receivers must have a minimum distance between them. So this distance should be minimum, but greater than the minimum distance of the equipment. I'd look into the dimensions of the plane to identify all the constraints.
I have a data-set that has four columns [X Y Z C]. I would like to find all the C values that are in a given sphere centered at [X, Y, Z] with a radius r. What is the best approach to address this problem? Should I use the clusterdata command?
Here is one solution that uses naively euclidean distance:
say V = [X Y Z C] is your dataset, Center = [x,y,z] is the center of the sphere, then
dist = bsxfun(#minus,V(:,1:3),Center); % // finds the distance vectors
% // between the points and the center
dist = sum(dist.^2,2); % // evaluate the squares of the euclidean distances (scalars)
idx = (dist < r^2); % // Find the indexes of the matching points
The good C values are
good = V(idx,4); % // here I kept just the C column
This is not "cluster analysis": You do not attempt to discover structure in your data.
Instead, what you are doing, is commonly called a "range query" or "radius query". In classic database terms, a SELECT, with a distance selector.
You probably want to define your sphere using euclidean distance. For computational purposes, it actually is beneficial to instead of squared Euclidean, by simply taking the square of your radius.
I don't use matlab, but there must be tons of examples of how to compute the distance of each instance in a data set from a query point, and then selecting those objects where the distance is small enough.
I don't know if there is any good index structures package for Matlab. But in general, at 3D, this can be well accelerated with index structures. Computing all distances is O(n), but with an index structure only O(log n).