Clustering coefficient, EEG brain data, Graph theory analysis - cluster-analysis

I am a final-year master's student in biomedical engineering, and my research interest is brain research using EEG data. I am currently struggling to understand statistical analysis based on graph theory, specifically for the brain nodes and clusters defined by the 10–20 system EEG electrode locations. If you are an expert in, or have knowledge of, the local/global clustering coefficient, please share it.
Case: we stimulate the subject at theta, alpha, beta and gamma frequencies through acoustic binaural beat modulation. I have the following two questions:
Q1. When the clustering coefficient increases, what happens to cortical neuronal activity in the brain areas/regions? Does the brain functional network become well organized, hyperactive, or controlled/uncontrolled?
Q2. When the clustering coefficient decreases, what happens as a result? Is that good or bad for cortical activity?
Please share any recommended resources, such as related articles, websites, or ebook links.
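For reference, here is a minimal sketch of how the local and global clustering coefficients are computed on an unweighted, undirected graph. The electrode labels and edges form a made-up toy network (think of a thresholded connectivity matrix), not real EEG data, and the function names are illustrative only. "Global" is used in two senses in the literature, so both the average of the local coefficients and the transitivity are shown.

```python
from itertools import combinations

def build_adjacency(edges):
    """Turn a list of undirected edges into a dict: node -> set of neighbours."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def local_clustering(adj, node):
    """Fraction of pairs of neighbours of `node` that are connected to each other."""
    nbrs = adj[node]
    k = len(nbrs)
    if k < 2:
        return 0.0  # convention: undefined cases are reported as 0
    links = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
    return 2.0 * links / (k * (k - 1))

def average_clustering(adj):
    """Global coefficient as the mean of the local coefficients."""
    return sum(local_clustering(adj, n) for n in adj) / len(adj)

def transitivity(adj):
    """Global coefficient as closed triplets divided by all connected triplets."""
    closed = 0
    triplets = 0
    for node, nbrs in adj.items():
        k = len(nbrs)
        triplets += k * (k - 1) // 2
        closed += sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
    return closed / triplets if triplets else 0.0

# Hypothetical edges between 10-20 electrode sites (illustration only).
toy_edges = [("Fz", "Cz"), ("Cz", "Pz"), ("Fz", "Pz"), ("Pz", "O1"), ("O1", "O2")]
adj = build_adjacency(toy_edges)
local = {node: local_clustering(adj, node) for node in adj}

print(local)
print(average_clustering(adj), transitivity(adj))
```

A local coefficient near 1 means that a node's neighbours are mostly connected to each other; a value near 0 means they are not.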

Related

How to do Hierarchical Heteroskedastic Sparse GPs in GPflow?

Is it possible to model a general trend from a population using GPflow and also have individual predictions, as in Hensman et al?
Specifically, I am trying to fit spatial data from a number of individuals from a clinical assessment. For each individual, I am dealing with approximately 20000 data points (with a different number of recordings per individual), which restricts me to a sparse implementation. In addition, it seems that I need an input-dependent noise model, hence the heteroskedasticity.
I have fitted a hetero-sparse model as in this notebook example, but I am not sure how to scale it to perform the hierarchical learning. Any ideas would be welcome :)
https://github.com/mattramos/SparseHGP may be helpful. This repo gives GPflow 2 code for modelling a sparse hierarchical model. Note that the implementation still has some rough edges that require constructing an expensive for loop.

Fully connected neural network for panel data?

I am trying to build a neural network (NN) to forecast the probability of an event (e.g. that a thunderstorm will occur). As a base, I have a panel with weather data per state over 10 years. I saw some posts with similar questions (e.g. Keras Recurrent Neural Networks For Multivariate Time Series), and they all seem to use an RNN for this problem.
I would like to understand why an RNN seems to be the go-to solution rather than, e.g., a simple fully connected NN. Conventionally, I would use a logit model with fixed effects for this problem.
Maybe someone can point me towards a paper or two that discuss this?

Spatial features and pattern analysis of a plane?

I am working on instances from the TSPLIB, which are simply coordinates of nodes in a plane. I'm looking to analyze spatial characteristics and features of a set of instances (e.g. clustered, not clustered, dispersed, etc.), and I would like to implement some code in Matlab to compute specific features.
For example, so far I have used Nearest Neighbor analysis to identify clusters, as well as quadrant analysis. Can anyone suggest any other spatial features and patterns that could be computed with some relatively simple code? Perhaps someone here is an expert in the Traveling Salesman Problem. Thank you so much!
K-means is a very useful clustering tool that you can use.
https://www.mathworks.com/help/stats/kmeans.html
Nearest Neighbor is a classification method. If you want to do classification, you can use K Nearest Neighbors, SVM, or the Neural Networks Pattern Recognition toolbox. These are all already in Matlab.
Also, check out the Matlab Apps. There are some very cool clustering tools available there as well, with examples.
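For the Nearest Neighbor analysis mentioned in the question, one simple feature is the Clark-Evans nearest-neighbour ratio: the observed mean nearest-neighbour distance divided by the value expected for a random pattern of the same density. Values well below 1 suggest clustering, values near 1 a random pattern, and values above 1 dispersion. Below is a rough sketch in Python (it should port to Matlab easily). The coordinates and the 10 x 10 study area are made up for illustration, the function name is my own, and edge effects are ignored.

```python
import math

def nearest_neighbor_ratio(points, area):
    """Clark-Evans ratio: observed mean NN distance / expected under randomness."""
    n = len(points)
    # Distance from each point to its closest other point (brute force, O(n^2)).
    nn_dist = [
        min(math.dist(p, q) for j, q in enumerate(points) if j != i)
        for i, p in enumerate(points)
    ]
    observed = sum(nn_dist) / n
    expected = 0.5 / math.sqrt(n / area)  # no edge correction applied
    return observed / expected

# Hypothetical instances inside a 10 x 10 square (illustration only).
clustered = [(1.0, 1.0), (1.1, 1.0), (1.0, 1.1), (9.0, 9.0), (9.1, 9.0), (9.0, 9.1)]
grid = [(1.25 + 2.5 * i, 1.25 + 2.5 * j) for i in range(4) for j in range(4)]

r_clustered = nearest_neighbor_ratio(clustered, area=100.0)
r_grid = nearest_neighbor_ratio(grid, area=100.0)
print(r_clustered, r_grid)
```

The clustered toy instance gives a ratio far below 1 and the regular grid a ratio of about 2, so this single number already separates the two kinds of instance.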

Clustering Algorithm for average energy measurements

I have a data set which consists of data points having attributes like:
average daily consumption of energy
average daily generation of energy
type of energy source
average daily energy fed in to grid
daily energy tariff
I am new to clustering techniques.
So my question is: which clustering algorithm is best suited to this kind of data?
I think hierarchical clustering is a good choice. Have a look here: Clustering Algorithms
A simpler way to cluster is the k-means algorithm. If all of your attributes are numerical, this is the easiest approach. If they are not, you will have to find a distance measure for categorical or nominal attributes, but k-means is still a good choice. K-means is a partitional clustering algorithm; I wouldn't use hierarchical clustering in this case. That also depends on what you want to do, though: you need to decide whether you want to find clusters within clusters, or whether the clusters should all be separate from each other, with none nested inside another.
Take care.
1) First, try k-means. If that meets your needs, you are done. Experiment with different numbers of clusters (controlled by the parameter k). There are many implementations of k-means, and you can implement your own version if you have good programming skills.
K-means generally works well when the clusters are roughly circular/spherical in shape, which is the case, for example, when each cluster comes from a Gaussian distribution with a similar spread in every direction.
2) If k-means doesn't meet your expectations, it is time to read and think more. I then suggest reading a good survey paper. The most common techniques are implemented in several programming languages and data mining frameworks, many of which are free to download and use.
3) If applying state-of-the-art clustering techniques is not enough, it is time to design a new technique. You can then work on it yourself or team up with a machine learning expert.
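To make step 1 concrete, here is a bare-bones sketch of k-means (Lloyd's algorithm) in plain Python. The six (consumption, generation) pairs are invented for illustration, and the starting centres are picked by hand to keep the run deterministic. In practice you would normally scale the features first and use a library implementation with several random restarts.

```python
import math

def kmeans(points, init_centroids, n_iter=100):
    """Plain Lloyd's algorithm: alternate assignment and centroid update."""
    centroids = [tuple(c) for c in init_centroids]
    labels = []
    for _ in range(n_iter):
        # Assignment step: each point goes to its closest centroid.
        labels = [
            min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for j, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centroids.append(old)  # keep an empty cluster's centre in place
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return labels, centroids

# Hypothetical (average daily consumption, average daily generation) pairs.
data = [(1.0, 0.5), (1.2, 0.4), (0.9, 0.6), (5.0, 4.8), (5.2, 5.1), (4.9, 5.0)]
labels, centroids = kmeans(data, init_centroids=[data[0], data[3]])
print(labels)
print(centroids)
```

Trying different values of k then just means passing a different number of starting centres and comparing the resulting partitions.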
Since most of your data is continuous, and it is reasonable to assume that energy consumption and generation are normally distributed, I would use statistical methods for clustering.
Such as:
Gaussian Mixture Models
Bayesian Hierarchical Clustering
The advantage of these methods over metric-based clustering algorithms (e.g. k-means) is that we can exploit the fact that we are dealing with averages, and we can make assumptions about the distributions from which those averages were calculated.
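As a rough illustration of the first option, here is a tiny one-dimensional Gaussian mixture fitted with EM in plain Python. It is only a sketch: the ten values are invented, the starting means are chosen by hand, and the function names are my own. For real data, a library implementation such as scikit-learn's GaussianMixture handles multiple dimensions and numerical edge cases.

```python
import math

def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def fit_gmm_1d(xs, init_means, n_iter=50, min_var=1e-6):
    """EM for a 1-D Gaussian mixture; returns weights, means, variances, responsibilities."""
    k = len(init_means)
    means = list(init_means)
    variances = [1.0] * k
    weights = [1.0 / k] * k
    resp = []
    for _ in range(n_iter):
        # E-step: soft assignment of every point to every component.
        resp = []
        for x in xs:
            dens = [w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
            total = sum(dens)
            resp.append([d / total for d in dens])
        # M-step: re-estimate each component from its weighted points.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / len(xs)
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj
            variances[j] = max(var, min_var)  # floor to avoid a collapsing component
    return weights, means, variances, resp

# Hypothetical average daily consumption values (illustration only).
values = [1.0, 1.2, 0.8, 1.1, 0.9, 5.0, 5.3, 4.7, 5.1, 4.9]
weights, means, variances, resp = fit_gmm_1d(values, init_means=[0.0, 6.0])
print(weights, means, variances)
```

Unlike k-means, each point gets a probability of belonging to each cluster (the rows of `resp`), and each cluster gets its own variance.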

CurveRep in Hmisc for clustering longitudinal curves based on 3 time points

I am working on the following project and am exploring the CurveRep() clustering approach provided by Hmisc. (CurveRep clusters individual subjects' longitudinal growth curves according to similar patterns based on the CLARA clustering algorithm). As I haven't found any publication using CurveRep() and generally very little discussion about it on the internet, I would be grateful if you could let me know your experience with it or what you think about it!
- My project: I have about 200 metabolites measured in n=500 subjects at three time points (0, 30, 120 min). Individual time courses vary quite a bit, but in spaghetti plots there appear to be groups (e.g. straight and flat curves, peak-shaped curves, valley-shaped curves). I would like to cluster these curves into two or three representative time courses and then fit a curve-specific regression model for each cluster. CurveRep() seems to be exactly what I am looking for, and it produces acceptable cluster solutions (although the solutions are based more on different y-axis intercepts than on different growth patterns).
Is it any good? Are there alternative clustering algorithms that group according to similar longitudinal change (e.g., cluster 1 = "linear rising", cluster 2 = "valley-shaped")?
Thanks a lot!
Chris
Three time points are too few for time-series methods to work for you. Look at DTW, for example: it is designed for much higher resolution.
Clustering algorithms such as k-means, PAM and CLARA could work for you. Look at the cluster centers.
It may be necessary to preprocess your data more carefully.
If you are interested in change instead of absolute values, encode your data accordingly. For example,
x1, x2, x3 -> x2-x1, x3-x2
or
x1,x2,x3 -> x1-mu,x2-mu,x3-mu with mu=(x1+x2+x3)/3
This will make the clustering results more likely to match your motivation.
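The two encodings above are easy to sketch in Python (the function names and the two example curves are made up for illustration). Both remove the overall level of a curve, so two subjects with the same shape but different baselines end up with the same or nearly the same features before clustering.

```python
def successive_differences(curve):
    """x1, x2, x3 -> x2 - x1, x3 - x2"""
    return [b - a for a, b in zip(curve, curve[1:])]

def center_on_mean(curve):
    """x1, x2, x3 -> x1 - mu, x2 - mu, x3 - mu with mu the mean of the curve"""
    mu = sum(curve) / len(curve)
    return [x - mu for x in curve]

# Two hypothetical peak-shaped curves that differ only by a constant offset.
low = [2.0, 5.0, 3.0]
high = [12.0, 15.0, 13.0]

print(successive_differences(low), successive_differences(high))
print(center_on_mean(low), center_on_mean(high))
```

The encoded vectors can then be passed to k-means, PAM or CLARA in place of the raw values.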