Visualizing N-dimensions on a graph - cluster-analysis

I have been conducting my thesis lately, and in the first stage I do some form of clustering in order to generate K user-specified clusters.
The size of the input matrix that I cluster is N*M. Where N is the population and M are the samples.
After the clustering is done, I would have loved to see how those clusters look on the field. But I can't picture anything more than 3 dimensions.
Is there a way to visualize M dimensions? Using R? Java? Python?

Yes, plenty. Do some research for your thesis.
PCA
MDS
Parallel coordinates
tSNE
ISOMAP
locally linear embeddings
Fisher LDA
LMNN

Related

Visualizing clusters using TSNE

I have a dataset which I need to cluster and display in a way wherein elements in the same cluster should appear closer together. The dataset is based out of a research study, and has around 16 rows(entries) and about 50 features. I do agree that its not an ideal dataset to begin with, but unfortunately thats is the situation on hand.
Following is the approach I took:
I first applied KMeans on the dataset after normalizing it.
In parallel I also tried to use TSNE to map the data into 2 dimensions and plotted them on a scatterplot. From my understanding of TSNE, that technique should already be placing items in same clusters closer to each other. When I look at the scatterplot, however, the clusters are really all over the place.
The result of the scatterplot can be found here: https://imgur.com/ZPhPjHB
Is this because TSNE and KMeans intrinsically work differently? Should I just do TSNE and try to label the clusters (and if so, how?) or should I be using TSNE output to feed into KMeans somehow?
I am really new in this space and advice would be greatly appreciated!
Thanks in advance once again
Edit: The same overlap happens if I first use TSNE to reduce dimensions to 2 and then use those reduced dimensions to cluster using KMeans
There is a difference between TSNE and KMeans. TSNE is used for visualization mostly and it tries to project points on the 2D/3D space (from bigger spaces) in order to keep distances (if in the bigger space 2 points were far away TSNE will try to show it).
So TSNE is not a real clustering. And that's why results you got that strange scatter plot.
For TSNE sometimes you need to apply PCA before but that is needed if your number of features is big. Just to speed-up calculations.
As already advised, try to use hierarchical clustering or simply generate more rows.
Apply tSNE and fit k-means is one of the basic things you can start from.
I would say consider using different f-divergence.
Stochastic Neighbor Embedding under f-divergences https://arxiv.org/pdf/1811.01247.pdf
This paper tries five different f- divergence functions : KL, RKL, JS, CH (Chi-Square), HL (Hellinger).
The paper goes over which divergence emphasize what in terms of precision and recall.

Hybrid SOM (with MLP)

Could someone please provide some information on how to properly combine a self organizing map with a multilayer perceptron?
I recently read some articles about this technique in comparison to regular MLPs and it performed way better in prediction tasks. So, I want to use the SOM as front-end for dimension reduction by clustering the input data and pass the results to an MLP back-end.
My current idea of implementing it is it to train the SOM with a couple of training sets and to determine the clusters. Afterwards, I initialize the MLP with as many input units as SOM clusters. Next step would be to train the MLP using the SOM's output (which value?...weights of BMU?) as in input for the network (SOM's Output for the Cluster matching Input Unit and zeros for any other Input Units?).
There is no single way of doing that. Let me list some possibilities:
The one you describe. But then, your MLP will need to have K*D inputs, where K is the number of clusters and D is the input dimension. There is no dimensionality reduction.
Similar to your idea, but instead of using the weights, just send 1 for the BMU and 0 for the remaining clusters. Then your MLP will need K inputs.
Same as above, but instead of 1 or 0, send the distance from the input vector to each cluster.
Same as above, but instead of the distance, compute a Gaussian activation for each cluster.
Since the SOM preserves topology, send only the 2D coordinates of the BMU (possibly normalized between 0 and 1). Then your MLP will need only 2 inputs and you achieve real extreme dimensionality reduction.
You can read about those ideas and some more here: Principal temporal extensions of SOM: Overview. It is not about feeding the output of a SOM to a MLP, but a SOM to itself. But you'll be able to understand the various possibilities when trying to produce some output from a SOM.

Square error clustering for Texture Segmentation

I am doing Texture Segmentation in Images Using Gabor Filters. I have generated Feature images and now i want to cluster them using square clustering or K-means.
Since i am working in MATLAB, I am using the matlab defined functions.
The official documentation for K-means clustering states
idx = kmeans(X,k)
where,X= n by p data matrix and k is the number of clusters.
I want to know how to use this function for my dataset since i have n feature images which means i have n features for every pixel of the original image. So my question is how do I use my dataset in the above function.
Also, if no function exists please explain how can I implement it on my own.
I am referring this research paper( page 6 )
Gabor filters for texture Segmentation

Principal Component Analysis in practice

I understand the concept of PCA, and what it's doing, but trying to apply the concept to my application is proving difficult.
I have a 1 by X matrix of a physiological signal (it's not EMG, but very similar, so think of it as EMG if it helps) which contains various noise and artefacts. What I've noticed of the noise is that some of it is very large and I would assume after PCA this would be the largest principal component, thus my idea of using PCA for some dimensional reduction.
My problem is that with a 1 by X matrix there is no covariance matrix, only the variance, and thus eigenvectors and all of PCA falls through.
I know I need to rearrange my data into a matrix more than 1D, but this is where I need some suggestions. Do I split my data into windows of equal length to create a large dimensional matrix which I can apply PCA to? Do I perform several trials of the same action so I have lots of data sets (this would be impractical for my application)?
Any suggestions or examples would be helpful. I'm using MATLAB to perform this task.

MATLAB: Self-Organizing Map (SOM) clustering

I'm trying to cluster some images depending on the angles between body parts.
The features extracted from each image are:
angle1 : torso - torso
angle2 : torso - upper left arm
..
angle10: torso - lower right foot
Therefore the input data is a matrix of size 1057x10, where 1057 stands for the number of images, and 10 stands for angles of body parts with torso.
Similarly a testSet is 821x10 matrix.
I want all the rows in input data to be clustered with 88 clusters.
Then I will use these clusters to find which clusters does TestData fall into?
In a previous work, I used K-Means clustering which is very straightforward. We just ask K-Means to cluster the data into 88 clusters. And implement another method that calculates the distance between each row in test data and the centers of each cluster, then pick the smallest values. This is the cluster of the corresponding input data row.
I have two questions:
Is it possible to do this using SOM in MATLAB?
AFAIK SOM's are for visual clustering. But I need to know the actual class of each cluster so that I can later label my test data by calculating which cluster it belongs to.
Do you have a better solution?
Self-Organizing Map (SOM) is a clustering method considered as an unsupervised variation of the Artificial Neural Network (ANN). It uses competitive learning techniques to train the network (nodes compete among themselves to display the strongest activation to a given data)
You can think of SOM as if it consists of a grid of interconnected nodes (square shape, hexagonal, ..), where each node is an N-dim vector of weights (same dimension size as the data points we want to cluster).
The idea is simple; given a vector as input to SOM, we find the node closet to it, then update its weights and the weights of the neighboring nodes so that they approach that of the input vector (hence the name self-organizing). This process is repeated for all input data.
The clusters formed are implicitly defined by how the nodes organize themselves and form a group of nodes with similar weights. They can be easily seen visually.
SOM are in a way similar to the K-Means algorithm but different in that we don't impose a fixed number of clusters, instead we specify the number and shape of nodes in the grid that we want it to adapt to our data.
Basically when you have a trained SOM, and you want to classify a new test input vector, you simply assign it to the nearest (distance as a similarity measure) node on the grid (Best Matching Unit BMU), and give as prediction the [majority] class of the vectors belonging to that BMU node.
For MATLAB, you can find a number of toolboxes that implement SOM:
The Neural Network Toolbox from MathWorks can be used for clustering using SOM (see the nctool clustering tool).
Also worth checking out is the SOM Toolbox