Related
I have a matrix consisting of three rows: gene 1, gene 2, Distance.
I want to create a network where each gene is a node and the connecting line is scaled by the distance between the two genes.
How can I do this without using the bioinformatics or neural network toolboxes?
Thanks!
Some Information
It is almost impossible to draw a graph with edge length proportional to edge weight, atleast its very unlikely that the weights allow a graph to be drawn this way, most will be impossible...
See:
P. Eades and N. C. Wormald. Fixed edge-length graph drawing is
NP-hard. Discrete Applied Mathematics, 28(2):111–134, 1990]
or to quote:
"Drawing planar graphs with edge weights as standard node-link
diagram, where edge lengths are proportional to the edge weights, is
an NP-hard problem"
M. Nollenburg, R. Prutkin and I. Rutter, Edge-weighted contact
representations of planar
graphs.
Journal of Graph Algorithms and Applications, 17(4):441–473, 2013
Consider the simple example of the 3 vertices joined as such, in your data format:
[1,2,1;
1,3,1;
2,3,10;]
it's should be immediately obvious that such a graph is impossible to draw with edge length proportional to weight (with straight lines). As such alternatives in MATLAB include using color or line width to represent Weight.
Sorry for the length of this answer but to implement this is not trivial, the process used below for drawing the graph can also be found here (in debth), here (most simple) and in a similar problem here. However these do not address weighted graphs...
So on with implementing line width and color proportional to weight:
Code
Due to the length the code is available without description here
Firstly some test data, consisting of 30 edges randomly assigned between 20 vertices with random weights between 0 and 10.
clear
%% generate testing data
[X,Y] = ndgrid(1:20); testdata = [X(:) Y(:)]; %// all possible edges
data(data(:,1)==data(:,2),:)=[]; %// delete self loops
data=data(randperm(size(data,1),20),:); %// take random sample of edges
data(:,3)=rand(size(data,1),1)*10; %// assign random weights in range 0-10
First some processing of the data to get it into required format;
edges=data(:,1:2);
[Verticies,~,indEdges]=unique(edges); %// get labels & locations of vertices
indEdges=reshape(indEdges,[],2);
weights=data(:,3);
normalisedWeights=weights/max(weights); %// normalise weights (range 0-1)
numeEdge=numel(weights);
numVertex=numel(Verticies);
Now x and y coordinates for each vertex are created on a unit circle:
theta=linspace(0,2*pi,numVertex+1);
theta=theta(1:end-1);
[x,y]=pol2cart(theta,1); % create x,y coordinates for each vertex
As lines in MATLAB plots inherit their color from the axis color order we create a RGB array which corresponds to a colormap, with a entry for each line giving the RGB values for the color assigned to that weight.
The autumn colormap is simple to manually implement as R=1,B=0 for all values and G ranges from 0-1 linearly, so we can make the Cmap variable which is used as the axis color order as follows:
clear Cmap %// to avoid errors due to the way it is created
Cmap(:,2)=normalisedWeights;
Cmap(:,1)=1;
Cmap(:,3)=0;
Now we create a figure, set the colormap to autumn (for the color bar), put hold on so the plot command doesn't reset the color order, and apply the color order
figure
colormap('autumn')
hold on
set(gca,'colororder',Cmap) %// set axis colororder to Cmap
how we plot the edges using the edge indexes generated earlier at locations given by x & y. The handles of the lines (Hline) is stored for later use.
Hline=plot(x(indEdges).',y(indEdges).'); %// plot edges
Now we set the axis to square so the circle of points is displayed properly and turn the axis off to hide them (as they bear no relevance to the plotted graph). The axis color limits (Clim) is then set to match the range of weights and add a colorbar is added.
axis square off
set(gca,'Clim',[0 max(weights)])
colorbar
The final step for plotting the edges, setting the line width to be proportional to weight, The normalised weights are scaled to be in the range 0-5. The linewidths are then set to scalefactor*normalisedWeights..
scalefactor=5; %// scale factor (width of highest weight line)
set(hline, {'LineWidth'}, num2cell(normalisedWeights*scalefactor));
The vertices are now plotted at the x and y coordinates (as black squares here).
The axis limits are increased to allow vertex labels to fit. Finally the vertices are labelled with the values from the original matrix, the labels are placed on a slightly larger circle than the vertices
plot(x,y,'ks')
xlim([-1.1,1.1]) % expand axis to fix labels
ylim([-1.1,1.1])
text(x(:)*1.1, y(:)*1.1, num2str(Verticies),...
'FontSize',8,'HorizontalAlignment','center'); %// add Vertex labels
Results
G'day
I'm trying to program a smart way to find the closest grid points to the points along a contour.
The grid is a 2-dimensional grid, stored in x and y (which contain the x and y kilometre positions of the grid cells).
The contour is a line, made up of x and y locations, not necessarily regularly spaced.
This is shown below - the red dots are the grid, and the blue dots are the points on the contour. How do I find the indices of the red dot closest to each blue dot?
Edit - I should mention that the grid is a latitude/longitude grid, of an area fairly close to the south pole. So, the points (the red dots) are the position in metres from the south pole (using a polar stereographic representation). Since the grid is a geographic grid there is unequal grid spacing - with slightly different shaped cells (where the red dots define the vertices of the cells) due to the distortion at high latitudes.
The result is that I can't just find which row/column of the x and y matrix corresponds closest to the input point coordinates - unlike a regular grid from meshgrid, the values in the rows and columns vary...
Cheers
Dave
The usual method is to go:
for every blue point {
for every red point {
is this the closest so far
}
}
But a better way is to put the red data into a kd tree. This is a tree that splits the data along its mean, then splits the two data sets along their means etc until you have them separated into a tree structure.
This will change your searching effeciancy from O(n*m) to O(log(n)*m)
Here is a library:
http://www.mathworks.com.au/matlabcentral/fileexchange/4586-k-d-tree
This library will provide you the means to easily make a kd tree out of the data and to search for the closest point in it.
Alternatively you can use a quadtree, not as simple but the same idea. (you may have to write your own library for that)
Make sure the largest data set (in this case your red points) go into the tree as this will provide the greatest time reduction.
I think I've found a way to do it using the nearest flag of griddata.
I make a matrix that is the same size as the grid x and y matrices, but is filled with the linear indices of the corresponding matrix element. This is formed by reshaping a vector (which is 1:size(x,1)*size(x,2)) to the same dimensions as x.
I then use griddata and the nearest flag to find the linear index of the point closest to each point on my contour (blue dots). Then, simply converting back to subscript notation with ind2sub leaves me with a 2 row vectors describing the matrix subscripts for the points closest to each point on the blue-dotted contour.
This plot below shows the contour (blue dots), the grid (red dots) and the closest grid points (green dots).
This is the code snippet I used:
index_matrix1 = 1:size(x,1)*size(x,2);
index_matrix1 = reshape(index_matrix1,size(x));
lin_ind = griddata(x,y,index_matrix1,CX,CY,'nearest'); % where CX and CY are the coords of the contour
[sub_ind(1,:),sub_ind(2,:)] = ind2sub(size(x),lin_ind);
I suppose that in the stereographic representation, your points form a neat grid in r-theta coordinates. (I'm not too familiar with this, so correct me if I'm wrong. My suggestion may still apply).
For plotting you convert from the stereographic to latitude-longitude, which distorts the grid. However, for finding the nearest point, consider converting the latitude-longitude of the blue contour points into stereographic coordinates, where it is easy to determine the cell for each point using its r and theta values.
If you can index the cell in the stereographic representation, the index will be the same when you transform to another representation.
The main requirement is that under some transformation, the grid points are defined by two vectors, X and Y, so that for any x in X and y in Y, (x, y) is a grid point. Next transform both the grid and the contour points by that transformation. Then given an arbitrary point (x1, y1), we can find the appropriate grid cell by finding the closest x to x1 and the closest y to y1. Transform back to get the points in the desired coordinate system.
dsearchn: N-D nearest point search.
[k, d] = dsearchn(A,B) : returns the distances, d, to the closest points. d is a column vector of length p.
http://au.mathworks.com/help/matlab/ref/dsearchn.html?s_tid=gn_loc_drop
I have a 2d image, I have locations where local minimas occurs.
I want to measure the width of the valleys "leading" to those minimas.
I need either the radii of the circles or ellipses fitted to these valley.
An example attached here, dark red lines on the peaks contours is what I wish to find.
Thanks.
I am partially extending the answer of #Lucas.
Given a threshold t I would consider the points P_m that are below t and closer to a certain point m of minimum of your f (given a characteristic scale length r).
(You said your data are noisy; to distinguish minima and talk about wells, you need to estimate such r. In your example it can be for instance r=4, i.e. half the distance between the minima).
Then you have to consider a metric for each well region P_m, say for example
metric(P_m) = .5 * mean{ maximum vertical diameter of P_m ,
maximum horizontal diameter of P_m}.
In your picture metric(P_m) = 2 for both wells.
On the whole, in terms of pseudo-code you may consider
M := set of local minima of f
for_each(minimum m in M){
P_m += {p : d(p,m) < r and f(r)<t} % say that += is the push operation in a Stack
}
radius_of_region_around(m) = metric(P_m); %
I would suggest making a list of points that describe the values at the edge of your ellipse, perhaps by finding all the points where it crosses a threshold.
above = data > threshold
apply a simple edge detector
edges = EdgeDetector(above)
find coordinates of edges
[row,col] = find(edges)
Then apply this ellipse fitter http://www.mathworks.com/matlabcentral/fileexchange/3215-fitellipse
I'm assuming here you have access to the x, y and z data and are not processing a given JPG (or so) image. Then, you can use the function contourc to your advantage:
% plot some example function
figure(1), clf, hold on
[x,y,z] = peaks;
surf(x,y,z+10,'edgecolor', 'none')
grid on, view(44,24)
% generate contour matrix. The last entry is a 2-element vector, the last
% element of which is to ensure the right algorithm gets called (so leave
% it untouched), and the first element is your threshold.
C = contourc(x(1,:), y(:,1), z, [-4 max(z(:))+1]);
% plot the selected points
plot(C(1,2:end), C(2,2:end), 'r.')
Then use this superfast ellipse fitting tool to fit an ellipse through those points and find all the parameters of the ellipse you desire.
I suggest you read help contourc and doc contourc to find out why the above works, and what else you can use it for.
I have randomly generated 3 set of points through scatter algorithm and the algorithm is
M = randi(1000,200);
AP = randi(1000,12);
BS = randi(1000,7);
scatter(M(:,1),M(:,20),21,'b.'); hold on
scatter(AP(:,1),AP(:,9),80,'k*');hold on
scatter(BS(:,1),BS(:,4),'r');hold off
Now, I have to set the coverage area for AP and to analyze distance between AP and M, which is within the coverage area. Can anyone help me?
First extract coverage area of AP:
ExtractedM=[M(:,1) M(:,20)];
ExtractedAP=[AP(:,1) AP(:,9)];%Just creating two different matrices for plotted points
%calculating coverage area of AP i.e. bounding box for coordinates of AP
yMax=max(ExtractedAP(:,2));
yMin=min(ExtractedAP(:,2));
xMax=max(ExtractedAP(:,1));
xMin=min(ExtractedAP(:,1));
%Finding the points from M which lie in the coverage area of AP
[positionBoundedPointsMx1,dummy]=find(ExtractedM(:,1)>=xMin & ExtractedM(:,1)<=xMax);
[positionBoundedPointsMx2,dummy]=find(ExtractedM(:,2)>=yMin & ExtractedM(:,2)<=yMax);
positionBoundedPointsMx=intersect(positionBoundedPointsMx1,positionBoundedPointsMx2,'rows');
boundedPointsM=ExtractedM(positionBoundedPointsMx,:);
%Making sure that correct sets of points in M have been extracted
scatter(M(:,1),M(:,20),21,'b.'); hold on
scatter(AP(:,1),AP(:,9),80,'k*');hold on
scatter(BS(:,1),BS(:,4),'r');hold on;
scatter(boundedPointsM(:,1),boundedPointsM(:,2),'ko');
Extracted points in M are represented by blue dot inside a black circle. It looks correct as all points in M lie inside the coverage area of AP.
Now, I am not sure what do you want to calculate. Assuming you just want to calculate the Euclidean distance between two points, therefore, I converted ExtractedAP and boundedPointsM into a vector and used pdist2
ExtractedAP_Vec=ExtractedAP(:); %24x1 vector
boundedPointsM_Vec=boundedPointsM(:); %224x1 vector
%calculate distance between each point of 'ExtractedAP_Vec' and every point in 'boundedPointsM_Vec'.
dist_AP_M=pdist2(ExtractedAP_Vec,boundedPointsM_Vec);
P.S. Plot the scatter plots and verify by yourself. I cannot post images here, sorry.
I would like to draw a circular graph of nodes where certain nodes have a link between them. Here are a few examples from social network graphs:
(source: wrightresult.com)
(source: twit88.com)
How can this be done with MATLAB? Is it possible without installing a separate package?
Here is one way you can do what you want. First, generate points on the circle that you are interested in
clear;
theta=linspace(0,2*pi,31);theta=theta(1:end-1);
[x,y]=pol2cart(theta,1);
Next, if you know the pairs of nodes that are connected, you can skip this step. But in many cases, you get a connectivity matrix from other computations, and you find the indices of the connected nodes from that. Here, I've created a Boolean matrix of connections. So, if there are N nodes, the connectivity matrix is an NxN symmetric matrix, where if the i,jth element is 1, it means you have a connection from node i to node j and 0 otherwise. You can then extract the subscripts of the non-zero pairs to get node connections (only the upper triangle is needed).
links=triu(round(rand(length(theta))));%# this is a random list of connections
[ind1,ind2]=ind2sub(size(links),find(links(:)));
This is the connectivity matrix I generated with the code above.
Now we just need to plot the connections, one at a time
h=figure(1);clf(h);
plot(x,y,'.k','markersize',20);hold on
arrayfun(#(p,q)line([x(p),x(q)],[y(p),y(q)]),ind1,ind2);
axis equal off
which will give you a figure similar to your examples
Inspired by the latest blog post by Cleve Moler, you could also use the gplot function to draw a graph given an adjacency matrix and node coordinates.
Here is an example using bucky; a demo function part of MATLAB that generates the graph of a truncated icosahedron (looks like a soccer ball). We will only use its adjacency matrix for this example since we are laying out the vertices in a circular shape:
%# 60-by-60 sparse adjacency matrix
A = bucky();
N = length(A);
%# x/y coordinates of nodes in a circular layout
r = 1;
theta = linspace(0,2*pi,N+1)'; theta(end) = [];
xy = r .* [cos(theta) sin(theta)];
%# labels of nodes
txt = cellstr(num2str((1:N)','%02d'));
%# show nodes and edges
line(xy(:,1), xy(:,2), 'LineStyle','none', ...
'Marker','.', 'MarkerSize',15, 'Color','g')
hold on
gplot(A, xy, 'b-')
axis([-1 1 -1 1]); axis equal off
hold off
%# show node labels
h = text(xy(:,1).*1.05, xy(:,2).*1.05, txt, 'FontSize',8);
set(h, {'Rotation'},num2cell(theta*180/pi))
We can take this a step further and try to minimize edge crossings. That is we want to rearrange the nodes so that the edges are as close as possible to the circumference of the circle.
This can be done by finding a symmetric permutation of the matrix that minimizes its bandwidth (non-zeros are closer to the diagonal)
p = symrcm(A);
A = A(p,p);
txt = txt(p);
The result in this case:
Other improvements include replacing straight lines with curved splines to draw the edges, (that way you get a nicer graph similar to the second one you've shown), or using different colors to show clusters of vertices and their edges (obviously you'll need to do graph clustering). I will leave those steps to you :)