How can I reorder my data after AgglomerativeClustering?

I want to plot a heatmap after clustering both the rows and the columns of my data into 2 clusters.
I used AgglomerativeClustering() on my data and on its transpose, so that the columns are clustered too.
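from sklearn.cluster import AgglomerativeClustering
# cluster the rows (n_clusters defaults to 2)
clustering1 = AgglomerativeClustering()
clustering1.fit(dataset)
label1 = clustering1.labels_
# cluster the columns by fitting on the transposed data
dataset_t = dataset.T
clustering2 = AgglomerativeClustering()
clustering2.fit(dataset_t)
label2 = clustering2.labels_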
I got 2 label vectors of 0s and 1s (2 clusters). How can I now reorder my data according to the clustering and plot a new heatmap?
I want a heatmap like the one from sns.clustermap(data), but without using it.
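One possible way to do the reordering, as a minimal sketch: sort the row and column indices by their cluster labels and index the array with both orderings at once. The helper name reorder_by_labels and the small dataset, label1 and label2 arrays below are made up for illustration; in the question they would be the real data and the labels_ vectors from the two fits.

```python
import numpy as np

def reorder_by_labels(data, row_labels, col_labels):
    """Group rows and columns of `data` so members of a cluster sit together.

    Hypothetical helper (not part of scikit-learn or seaborn).
    """
    data = np.asarray(data)
    # a stable sort keeps the original relative order inside each cluster
    row_order = np.argsort(row_labels, kind="stable")
    col_order = np.argsort(col_labels, kind="stable")
    # np.ix_ builds an open mesh so rows and columns are permuted together
    return data[np.ix_(row_order, col_order)], row_order, col_order

# toy stand-ins for `dataset`, `label1` and `label2` (assumed values)
dataset = np.array([[1, 9, 2, 8],
                    [9, 1, 8, 2],
                    [2, 8, 1, 9]])
label1 = np.array([0, 1, 0])      # row cluster labels
label2 = np.array([0, 1, 0, 1])   # column cluster labels

reordered, row_order, col_order = reorder_by_labels(dataset, label1, label2)
```

The reordered array can then be passed to an ordinary heatmap call such as sns.heatmap(reordered) or plt.imshow(reordered). Note that this only groups rows and columns by cluster label; it does not reproduce the dendrogram leaf ordering that sns.clustermap shows inside each cluster.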

Related

Changing Matlab Nested Structures to Matrices Iteratively for histogram plotting

Pretty new to Matlab, so please forgive the poor coding. I have some data for different categories (9 categories) with a different number of data points in each category. I created a structure that holds the data points for the different categories. I believe the categories are themselves structures within the larger structure.
I want to plot a histogram for each category. The first thing I tried was just creating a for-loop and plotting a histogram for each category in the structure, but this failed because the histogram doesn't take in structures. The next thing I tried to do was create another for loop which would change the structure holding each category into a cell array, but this also failed with the error:
if isnumeric(c{1}) || ischar(c{1}) || islogical(c{1}) || isstruct(c{1})
I am able to individually change each category to a cell array and then to a matrix, which allowed me to create one histogram. Is there a way to do this using a loop? My code is below. Thanks.
data = readtable('');
data = table2array(data);
Trial = data(:,1);
dist = data(:,2);
Time = data(:,3);
intstim = data(:,4);
color = data(:,5);
UniqueDist = unique(dist);
for ii = 1.0:length(UniqueDist)
idx = find(dist == UniqueDist(ii));
distTime(ii).data = Time(idx);
distTime(ii).data = distTime(ii).data(distTime(ii).data ~= 0);
end
for jj =1.0:length(distTime)
distTime(jj).data = struct2cell(distTime(jj));
distTime(jj).data = cell2mat(distTime(jj));
end

How can I color-label the cluster data after GMM is fitted?

I am trying to do some labelling on cluster data following GMMs but haven't found a way to do it.
Let me explain:
I have some x,y data pairs in an X = 30000x2 array. In reality the array contains data from different (known) sources, and each source has the same number of data points (so source 1 has 500 (x,y) pairs, source 2 has 500 (x,y) pairs, and so on, all appended into the X array above).
I have fitted a GMM on X. Cluster results are fine and as expected but now that the data are clustered I want to be able to color code them based on their initial origin.
So let's say I want to show in black the data points of source 1 that are in cluster 2.
Is that possible?
Example:
In the original array we have three sources for the data. Source 1 is data from 1-10000, source 2 10001-20000 and source 3 20001-30000.
After GMM fitting and clustering I have clustered my data as per figure 1 and I got two clusters. The red colour in all of them is irrelevant.
I want to modify the color of the data points in cluster 2 based on their index and the original array X.
E.g., if a data point belongs to cluster 2 (clusteridx=2), I want to check which source it belongs to and then color and label it accordingly, so that you can tell which source each data point in cluster 2 comes from, as shown in the second figure.
[Figure 1: Original clusters]
[Figure 2: Desired labelling]
You could add a "source_id" column and then plot through a loop on that. For example:
% setup fake data
source1 = rand(10,2);
source2 = rand(15,2);
source3 = rand(8,2);
% end setup
% append column with source_id (you could do this in a loop if you have many sources)
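% size(...,1) counts rows; length() would return the column count for sources with fewer than 2 rows
source1 = [source1, repmat(1, size(source1,1), 1)];
source2 = [source2, repmat(2, size(source2,1), 1)];
source3 = [source3, repmat(3, size(source3,1), 1)];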
mytable = array2table([source1; source2; source3]);
mytable.Properties.VariableNames = {'X' 'Y' 'source_id'};
figure
hold on;
for ii = 1:max(mytable.source_id)
rows = mytable.source_id==ii;
x = mytable.X(rows);
y = mytable.Y(rows);
label = char(strcat('Source ID =', {' '}, num2str(ii)));
mycolor = rand(1,3);
scatter(x,y, 'MarkerEdgeColor', mycolor, 'MarkerFaceColor', mycolor, 'DisplayName', label);
end
set(legend, 'Location', 'best')

Insert a plot image inside table in MATLAB

I'm very new to MATLAB and I'm trying to do the following: I have 5 images and I want to create a table with their names and some statistics as columns. In addition, I would like to add a column in which every cell holds the histogram of the corresponding image.
My code is this:
Type = {'volto';'volto_jpg'; 'volto_jpg_100';'volto_tiff';'volto_tiff_nocompression'}
Sum_of_diff = [NaN;2531547;280391;0;0]
Medie = cellfun(@mean2,y)
Dev_st = cellfun(@std2,y)
x = table(Type, Sum_of_diff, Medie, Dev_st)
I would like to add a final column where I can view the histogram produced by the imhist() function.
Is this possible? It is only for exporting and visualisation purposes, of course.

Matlab: Plotting bar groups

I want to plot a bar chart in which the bars are grouped like this:
I have tried this code, but I am not getting this type of plot. How can I generate a plot like the one above?
load Newy.txt;
load good.txt;
one = Newy(:,1);
orig = good(:,1);
hold on
bar(one,'m');
bar(orig,'g');
hold off
set(gca,'XTickLabel',{'0-19','20-39','40-79','80-159','160-319','320-639','640-1279','1280-1500'})
Each text file contains a list of numbers. Each list comprises 8 values.
You can use histc to count the values within certain edges.
To group bars you can collect them in a single matrix (with the values in each column).
edges = [0 20 40 80 160 320 640 1280 1501];
edLeg = cell(numel(edges)-1,1);
for i=1:length(edLeg)
edLeg{i} = sprintf('%d-%d',edges(i),edges(i+1)-1);
end
n = histc([one,orig],edges);
bar(n(1:end-1,:));
set(gca,'XTickLabels',edLeg)
legend('One','Orig')
I used these as test data
one = ceil(1500.*rand(200,1));
orig = ceil(1500.*rand(200,1));
I found a way to get grouped bars:
I had to plot the data so that there are 8 groups of bars, each consisting of 3 bars.
To do that, I arranged the data from my files like this:
Y = [30.9858 1.36816 38.6943
0.655176 6.44236 13.1563
1.42942 3.0947 0.621403
22.6364 2.80378 17.1299
0.621871 5.37145 1.87824
0.876739 5.97647 3.80334
40.6585 68.6757 23.0408
2.13606 6.26739 1.67559
];
bar(Y)

Find rows within multiple arrays that have the same header and remove all other rows, using Matlab

In Matlab, I have several txt files that I have loaded and converted to matrices. The matrices represent temperature data at different cities around the world. The first column in each matrix is a year. Each file spans a different range of years, but they all overlap for a few of those years. I would like to find where they overlap, and either extract the overlapping years or delete the non-overlapping ones, so that when I plot the data, each data set uses the same span of years. The code should be able to ingest an unknown number of these txt files. I have tried to use the "intersect" function, but that works on an element-by-element basis. I want all data for the overlapping years, so the elements (except for the header) will be different.
An example of current code:
clear all
files = dir('*.txt');
num_files = length(files);
mintersect(files);
for i=1:num_files
eval(['load ' files(i).name ' -ascii']);
vals{i} = load(files(i).name);
matrix = vals{i};
station = (files(i).name(1:end-4));
matrix(matrix == 999.9) = NaN;
matrix(matrix == -99.0) = NaN;
years = matrix(:,1);
months = matrix(:,2:13)';
figure, hold on
plot(years, months,'');
ylabel('Temp.');
xlabel('Years');
grid on;
title(sprintf('Mean Monthly Temperature for %s Station',station));
end