How do I implement dynamically sized data structures in MATLAB? - matlab

I am trying to implement a dynamically sized data structure in MATLAB.
I have a 2D plane with nodes. For each node I need to save the coordinates and the coordinates of the nodes around it, within a distance of e.g. 100.
Imagine a circle with a radius of 100 around each node. I want to store all nodes within this circle for every node.
For example:
[ASCII sketch, spacing lost in extraction: a rectangular plane containing six nodes, each marked x, at scattered positions]
I tried to implement this as shown below. I create a NodeList that contains a NodeStruct for every node. Every NodeStruct contains the coordinates of its corresponding node, as well as an array of the nodes around it. The problem with this implementation is that the variable NodeStruct.NextNode has a different size for every node.
I have an idea of how to find all the nodes; my problem is the data structure to store all the necessary information.
NodeList = [];
NodeStruct.Coords = [];
NodeStruct.NextNode = [];

You can create a struct array that you index as follows:
NodeStruct(3).Coords = [x,y];
NodeStruct(3).NextNode = [1,2,6,10];
However, it is likely that this is better solved with an adjacency matrix. That is an NxN matrix, with N the number of nodes, and where adj(i,j) is true if nodes i and j are within the given distance of each other. In this case, the adjacency matrix is symmetric, but it doesn't need to be if you list, for example, the 10 nearest nodes for each node. That case can also be handled with the adjacency matrix.
Given an Nx2 matrix with coordinates coord, where each row is the coordinates for one node, you can write:
dist = sqrt(sum((reshape(coord,[],1,2) - reshape(coord,1,[],2)).^2, 3));
adj = dist < 100; % or whatever your threshold is
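The two approaches can be combined. The following is a minimal sketch, assuming a small made-up coord matrix and MATLAB R2016b or later (for implicit expansion); the diagonal is cleared so that a node is not listed as its own neighbour, which the threshold test alone would not do.

```matlab
% Made-up coordinates, one node per row (illustrative values only)
coord = [0 0; 50 0; 0 80; 300 300; 350 300];
N = size(coord, 1);

% Pairwise Euclidean distances, N-by-N
dist = sqrt(sum((reshape(coord,[],1,2) - reshape(coord,1,[],2)).^2, 3));
adj = dist < 100;
adj(1:N+1:end) = false;   % clear the diagonal: dist(i,i) is 0, which is < 100

% Fill the struct array; NextNode may have a different length for each node
NodeStruct = struct('Coords', cell(N,1), 'NextNode', cell(N,1));
for k = 1:N
    NodeStruct(k).Coords = coord(k,:);
    NodeStruct(k).NextNode = find(adj(k,:));
end
```

With these values, NodeStruct(1).NextNode lists nodes 2 and 3, and NodeStruct(4).NextNode lists only node 5.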

Related

Finding unique closest point corresponding to latitude and longitude vectors for a given point with shortest distance

I have two large vectors for the pair of latitudes and longitudes (Lat, Lon). I want to find a unique single pair of latitude and longitude corresponding to the point ([lat_deg, lon_deg]) which has shortest distance.
I am using this:
P = ([lat_deg, lon_deg]);
PQ = [Lat, Lon];
[k,dist] = dsearchn(P,PQ);
But at the end of this I get the distances of all the points, and the vector k contains all ones. Is this the right function? If yes, how can I correct my call? If not, what is the right function?
Sample vectors are:
Lat Lon
39.2591200000000 -85.9394200000000
39.2591300000000 -85.9392000000000
39.2590800000000 -85.9406300000000
39.2593500000000 -85.9406200000000
39.1949800000000 -85.9633400000000
39.1954200000000 -85.9633500000000
39.1954200000000 -85.9633500000000
39.1963300000000 -85.9633600000000
39.1957400000000 -85.9678800000000
39.1959300000000 -85.9682400000000
P=39.2005981000000 -85.9045842000000
You just need to reverse the parameter order. So using your original variable names (which is where the problem was first introduced):
[k,dist] = dsearchn(PQ,P)
k =
8
dist =
0.0589
k = dsearchn(P,PQ) returns the indices of the closest points in P to the query points in PQ measured in Euclidean distance.
In that signature, PQ is the query: in your case a single point (although it can be a list of points), which you stored in the variable P. The signature's P is the list of points to search, which you stored in the variable PQ. Your variable names are therefore swapped relative to the documentation.
For completeness this is what the original code produces (which also conveniently gives the distance from your query point to the list of points for verification) again using the original variable names:
>> [k,dist] = dsearchn(P,PQ)
k =
1
1
1
1
1
1
1
1
1
1
dist =
0.0681
0.0680
0.0687
0.0689
0.0590
0.0590
0.0590
0.0589
0.0635
0.0638
To further explain how you got what you did: the query was defined as a list of points, and the set of points to search was a list of size 1. That single point was therefore the closest point to every query point, so the index for every query point is 1.
I'll assume you know how to interpret the dist values.
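As a self-contained check, here is a sketch that runs the corrected call on the sample data from the question and compares it with a brute-force search. The names pts, query, kCheck and closest are introduced here for illustration only, and the brute-force line assumes R2016b or later. Note that this treats degrees as planar coordinates, as the question does, which is only an approximation of geographic distance.

```matlab
% Sample data from the question
Lat = [39.25912; 39.25913; 39.25908; 39.25935; 39.19498; ...
       39.19542; 39.19542; 39.19633; 39.19574; 39.19593];
Lon = [-85.93942; -85.93920; -85.94063; -85.94062; -85.96334; ...
       -85.96335; -85.96335; -85.96336; -85.96788; -85.96824];
lat_deg = 39.2005981;
lon_deg = -85.9045842;

pts   = [Lat, Lon];          % points to search (first argument)
query = [lat_deg, lon_deg];  % query point (second argument)

[k, d] = dsearchn(pts, query);

% Brute-force cross-check: Euclidean distance to every row, then the minimum
[dCheck, kCheck] = min(sqrt(sum((pts - query).^2, 2)));

closest = pts(k, :);         % the single closest [lat, lon] pair
```

Here k is 8 and d is about 0.0589, matching the output shown above.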

How can I choose the cluster with the highest WCSS value via sumd and idx?

I am applying the bisecting k-means algorithm to cluster users for each antenna beam.
The problem arises after splitting the cluster containing all users in two. At this point I have to select the cluster with the highest WCSS, but I don't know how to do it.
I had thought about taking advantage of the sumd and idx values.
function [Clustering, SYSTEM] = CLUST_Bkmeans(kk, SYSTEM, USERS, ChannelMatrix)
    Clustering = cell(SYSTEM.Nbeams,1);
    UserPool = (1:SYSTEM.Nusers)';
    Channel_real = real(ChannelMatrix);
    Channel_imag = imag(ChannelMatrix);
    avg_clusterSize = kk;
    for ii=1:SYSTEM.Nbeams
        Users = UserPool(USERS.BeamIndex==ii);
        %Matrix of channel coefficient built as [real part | imaginary part]
        Users_real = Channel_real(Users,:);
        Users_imag = Channel_imag(Users,:);
        X = [Users_real Users_imag];
        SYSTEM.Nclusters(ii) = ceil(size(Users,1)/avg_clusterSize);
        Clustering{ii} = cell(SYSTEM.Nclusters(ii),1);
        %Bisecting k-means clustering of X
        [idx,C,sumd] = kmeans(X,2); %first division in two cluster
        for pp = 3:SYSTEM.Nclusters(ii)
            %kmeans applied to cluster with higher WCSS
        end
        % silhouette(X,idx)
        % xlabel('Silhouette Value')
        % ylabel('Cluster')
        for jj = 1:SYSTEM.Nclusters(ii)
            Clustering{ii}{jj,1} = Users(idx==jj)';
        end
    end
end
As you've noted, kmeans returns the cluster indices and the sum of squared distances within each cluster (along with the centroid of each cluster, but we don't need that in this instance).
Finding the cluster with the highest WCSS is easy. sumd is a k x 1 vector where k is the number of clusters. With just two clusters, you can easily select which one is larger, but if you have more clusters, you can use the I (index) return value from max:
[~, max_wcss_cluster] = max(sumd); % index is the second return value
At some point, you're probably going to need to know which observations in X are in a particular cluster. To list those rows of X, you would use the idx vector returned by kmeans and logical indexing:
cluster_number = 2; % find all observations in cluster 2
my_cluster = X(idx==cluster_number, :);
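Putting the two pieces together, one possible shape for the bisecting loop is sketched below. This is an illustration rather than the asker's final code: X and nClusters are made-up placeholders, kmeans requires the Statistics and Machine Learning Toolbox, and the sketch assumes the cluster being split always has at least two rows (kmeans with k = 2 fails otherwise). It relies on sumd being a sum of squared distances, which holds for the default 'sqeuclidean' distance.

```matlab
% Placeholder data: three loose blobs in 2-D (illustrative only)
rng(0);
X = [randn(30,2); randn(30,2) + 6; randn(30,2) + [12 0]];
nClusters = 3;

idx  = ones(size(X,1), 1);               % start with every row in cluster 1
wcss = sum(sum((X - mean(X,1)).^2));     % WCSS of that single cluster

for k = 2:nClusters
    [~, worst] = max(wcss);              % label of the cluster with the highest WCSS
    members = find(idx == worst);        % rows of X belonging to that cluster
    [subIdx, ~, subSumd] = kmeans(X(members,:), 2);
    idx(members(subIdx == 2)) = k;       % second half gets the new label k
    wcss(worst) = subSumd(1);            % first half keeps the old label
    wcss(k)     = subSumd(2);
end
```

After the loop, idx holds labels 1 to nClusters and wcss(j) is the within-cluster sum of squares for label j.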

How to define neighborhoods in hexagonal Self-Organizing Maps (SOM)

I'm trying to implement a two-dimensional SOM with shrinking neighborhoods. To avoid computing the neighborhood function for each neuron for every input, I want to define the neighbors of each neuron when the lattice is constructed. That is, when creating the SOM, I would add each neighbor to a list stored in the neuron, so that when a neuron is selected as BMU, I only have to apply the neighborhood function to the neurons in that BMU's list.
The problem is how to define the topology of a hexagonal lattice within a two-dimensional array, which is the structure I'm using for the SOM, because to achieve the hexagonal distribution I would have to do something like this:
n1 | null | n2 | null | n3
null | n4 | null | n5 | null
n6 | null | n7 | null | n8
Is it correct to create the array like that, or is there a way to create a normal array and adjust the indexes?

Storing a dynamic array of structures in Matlab

I'm new to Matlab, and I want to do the following.
I have 2500 data points that can be clustered into 10 groups. My aim is to find the 5 data points in each cluster that are closest to the centroid. To do that, I did the following.
1) Find the distance between each point to each centroid, and allocate the closest cluster to each data point.
2) Store the data point's index (1,...,2500) and the corresponding distance in a cluster{index} array (not sure what data type this should be), where index = 1,2,...,10.
3) Go through each cluster to find the 5 closest data points.
My problem is that I don't know how many data points will be stored in each cluster, so I don't know which data type I should use for my clusters or how to add to them in Step 2. I think a cell array may be what I need, but then I'll need one for the data point index and one for the distance. Or can I create a cell array of structures (each structure consisting of 2 members: index and distance)? Again, how could I then dynamically add to each cluster?
I would suggest you keep the data in a normal array; this is usually the fastest approach in Matlab.
You could do as follows: (assuming p is an n=2500 by dim matrix of data points, and c is an m=10 by dim matrix of centroids):
dists = zeros(n,m);
for i = 1:m
    dists(:,i) = sqrt(sum(bsxfun(@minus,p,c(i,:)).^2,2));
end
[mindists,groups] = min(dists,[],2);
orderOfClosenessInGroup = zeros(size(groups));
for i = 1:m
    [~,permutation] = sort(mindists(groups==i));
    [~,orderOfClosenessInGroup(groups==i)] = sort(permutation);
end
Then groups will be an n by 1 matrix of values 1 to m telling you which centroid the corresponding data point is closest to, and orderOfClosenessInGroup is an n by 1 matrix telling you the order of closeness inside each group (orderOfClosenessInGroup <= 5 will give you a logical vector of which data points are among the 5 closest to their centroid in their group). To illustrate it, try the following example:
n = 2500;
m = 10;
dim = 2;
c = rand(m,dim);
p = rand(n,dim);
Then run the above code, and finally plot the data as follows:
scatter(p(:,1),p(:,2),100./orderOfClosenessInGroup,[0,0,1],'x');hold on;scatter(c(:,1),c(:,2),50,[1,0,0],'o');
figure;scatter(p(orderOfClosenessInGroup<=5,1),p(orderOfClosenessInGroup<=5,2),50,[0,0,1],'x');hold on;scatter(c(:,1),c(:,2),50,[1,0,0],'o');
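The first plot shows all data points, with smaller markers for points ranked further from their centroid, together with the centroids; the second shows only the 5 closest points of each group and the centroids. (The original images are not preserved.)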
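If you do want the per-cluster lists the question asks about, a cell array handles the varying sizes. The sketch below is one way to do it, assuming R2016b or later (implicit expansion in place of bsxfun); the name top5 is introduced here for illustration. Each cell holds the row indices into p of up to 5 closest points for that centroid.

```matlab
% Random example data, as in the illustration above
rng(1);
n = 2500; m = 10; dim = 2;
c = rand(m, dim);
p = rand(n, dim);

% Distance from every point to every centroid
dists = zeros(n, m);
for i = 1:m
    dists(:,i) = sqrt(sum((p - c(i,:)).^2, 2));
end
[mindists, groups] = min(dists, [], 2);

% One cell per cluster; a cluster with fewer than 5 members keeps them all
top5 = cell(m, 1);
for i = 1:m
    members = find(groups == i);              % indices of points in cluster i
    [~, order] = sort(mindists(members));     % closest first
    top5{i} = members(order(1:min(5, numel(order))));
end
```

The matching distances are then simply mindists(top5{i}), so no second cell array or struct is needed.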

Matlab - submatrix for stiffness method

In order to use the stiffness method for trusses, I need to extract certain elements from a large global stiffness matrix.
Say I have a 9 x 9 matrix K representing a three-member truss. This means that the first 3 rows and columns correspond to the first node, the second set of three rows and columns to the second node, and the third set to the third node. The code contains a vector zDisp listing the nodes that have zero displacement. On paper, a zero displacement of a node means you would cross out the rows and columns corresponding to that displacement, leaving you with a smaller and easier to work with K matrix. So if the first and third nodes have zero displacement, you would be left with a 3 x 3 matrix corresponding to the intersection of the middle three rows and the middle three columns.
I thought I could accomplish this one node at a time with a function like so:
function [ B ] = deleteNode( B, node )
%deleteNode removes the corresponding rows and columns to a node that has
% zero deflection from the global stiffness matrix
    % --- Problem line - this gets the first location in the matrix corresponding to the node
    start = 3*node- 2;
    for i = 0 : 2
        B(start+i,:) = [];
        B(:,start+i) = [];
    end
end
So my main project would go something like
% Zero displacement nodes
zDisp = [1;
         3;
        ];
% --- Create 9 x 9 global matrix Kg ---
% Make a copy of the global matrix
S = Kg;
for(i = 1 : length(zDisp))
    S = deleteNode(S, zDisp(i));
end
This does not work because once the loop executes for node 1 and removes the first 3 rows and columns, the problem line in the function no longer works to find the correct location in the smaller matrix to find the node.
So I think this step needs to be executed all at once. I am thinking I may need to instead input which nodes are NOT zero displacement, and create a submatrix based off of that. Any tips on this? Been thinking on it awhile. Thanks all.
In your example, you want to remove rows/columns 1, 2, 3, 7, 8, and 9, so if zDisp=[1;3],
remCols=bsxfun(@plus,1:3,3*(zDisp-1))
All indices are computed from the original matrix and deleted in one step, so the shifting problem does not occur. If I understand correctly, you should just be able to first remove the rows given by remCols:
S(remCols(:),:)=[]
then remove the columns:
S(:,remCols(:))=[]
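For a quick check, here is a sketch using a placeholder 9 x 9 matrix in place of a real stiffness matrix. It also shows the "keep the non-zero nodes" variant the question mentions; the names dofPerNode, keepIdx and S2 are introduced here for illustration.

```matlab
Kg = reshape(1:81, 9, 9);    % placeholder values, not a real stiffness matrix
zDisp = [1; 3];              % nodes with zero displacement
dofPerNode = 3;

% Rows/columns belonging to the zero-displacement nodes: [1 2 3; 7 8 9]
remCols = bsxfun(@plus, 1:dofPerNode, dofPerNode*(zDisp-1));

% Variant 1: delete rows, then columns, all at once
S = Kg;
S(remCols(:), :) = [];
S(:, remCols(:)) = [];

% Variant 2: index the rows/columns to keep instead of deleting
keepIdx = setdiff(1:size(Kg,1), remCols(:));
S2 = Kg(keepIdx, keepIdx);
```

Both variants leave the middle 3 x 3 block, Kg(4:6,4:6).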