MATLAB Heatmap dendrogram not showing column names when there are many names - matlab

I have a list of proteins and a value for each protein under three different experimental conditions (alpha, beta and gamma). The array containing the values is called 'heatmap_data'. The names of the proteins are in the array called 'text'.
I generated a heatmap:
rows = ['ALPHA' ;'BETA '; 'GAMMA']
rowscell = cellstr(rows)
dm=DataMatrix(heatmap_data,rowscell,text);
cg = clustergram(dm,'Standardize','none');
cgAxes =plot(cg);
set(cgAxes, 'Clim', [-1,1])
When the list of proteins is short, I get the expected heatmap, with labels on the x axis.
However, when the list extends to a few hundred, the names disappear.
I can understand that the labels might not fit in the short space, but if they were drawn I could reduce the font size, expand the dendrogram, etc.
My question: is there a way to force MATLAB to show the column names even if they overlap, or a function that saves the names in the order the dendrogram arranged them, so I can identify which proteins are in each cluster?
Thanks

Ok, I found this:
https://www.mathworks.com/help/bioinfo/ref/clustergram.html
RowLabelsValue Vector of numbers or cell array of character vectors
to label the rows in the dendrogram and heat map. Default is a vector
of values 1 through M, where M is the number of rows in Data. Note:
If the number of row labels is 200 or more, the labels do not appear
in the clustergram plot unless you zoom in on the plot.
Now, if I zoom in, I can see the names.

Related

How to calculate Trailing Moving Sums going up vertically in a table?

I have the following table: Table
Columns 'L' and 'U' of the table consist of cells that contain object names corresponding to the headers of columns 4-281. Example
Goal: For every date verify what objects are in 'L' (respectively 'U') and sum the aggregate of those objects' 4-point trailing moving sum and its standard deviation (going up in the table!) and store it in a new variable, e.g. LSum and LStd for 'L' as well as USum and UStd for 'U'. For dates with insufficient values, e.g. 15-Jul-2016 with only 3 instead of 4 time steps ahead, return NaN's.
How I would start:
for row=1:size(ABC,1)
row_values = ABC{row,:};
row_values = row_values(4:end);
% How to make the loop for columns L and U where there are multiple objects in one cell?
% How can I use 'movsum' and 'movstd' here to calculate values vertically going up?
end;
Thanks a lot for your help!
Maybe you could use the functions cell2mat and cellfun to achieve your goal.
With these functions you can:
Convert your cell matrix to a double matrix in order to perform numeric operations on it (cell2mat)
Perform a certain operation on all cell elements (cellfun)
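A minimal illustration of the two functions, using made-up data rather than the asker's table. The movsum call at the end is only a sketch: whether the window should look forward ([0 3]) or backward ([3 0]) depends on how the dates are sorted, which the question does not fully specify.

```matlab
% Hypothetical data: each cell holds a numeric row vector of equal length.
C = {[1 2 3]; [4 5 6]; [7 8 9]};

% cell2mat concatenates the cell contents into an ordinary double matrix.
A = cell2mat(C);

% cellfun applies a function to every cell; here, the sum of each cell.
s = cellfun(@sum, C);

% On numeric data, movsum computes a windowed sum down a column.
% [0 3] means "this row plus the 3 rows after it"; 'Endpoints','fill'
% returns NaN wherever fewer than 4 rows are available.
v = (1:6)';
m = movsum(v, [0 3], 'Endpoints', 'fill');
```

movstd accepts the same window and 'Endpoints' arguments, so the standard deviation could be obtained in the same way.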

Looping 2 columns from an excel file into matlab

For my experiment, I am presenting participants with different images (which are numbered from 1 to 324) in a scrambled order. My goal is to overlay their gaze pattern with a saliency map.
So I have two variables in MATLAB that I want to FOR loop:
"z" is the scrambled presentation order, ex: [95,147,1...] (324 numbers in total)
"i" is the order of gaze patterns recorded (it goes in order from 1 to 324)
I have tried using two for loops:
for z=[95,147,1....]
for i=1:324
%open and create saliency maps for "z"
%open gaze pathways for "i"
%combine both
%save
What I was hoping was that z=95 would be paired with i=1, z=147 would be paired with i=2 and so on, however what happens is the for loop goes through i=1:324 for all of z=95 and then continues to z=147 and goes through i=1:324 again...
I have thought of putting z and i values into a table such that
ImageOrder ScatterOrder
95 1
147 2
1 3
However, I have been having difficulty with the specific steps.
I actually figured out a method.
I set
s1='A2' %first cell of the image order column
s2='B2' %first cell of the scatter order column
Then I create one for loop:
for i=2:___ %(the row number I want to end on)
s1=['A', num2str(i)]; %Loops through rows of A
s2=['B', num2str(i)]; %Loops through rows of B
x1=xlsread('filename.xls',[s1,':',s2]);
a=x1(:,1); %Taking image order of every row
b=x1(:,2); %Taking scatter order of every row
end
With 'a' and 'b' I can then access both variables from the same row on each iteration of the loop.
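The pairing the question asks for can also be had without a spreadsheet, by looping over a single index and using it to look up the image number. The sketch below assumes the presentation order is already in a vector; imageOrder and pairs are hypothetical names, and the processing steps are left as comments.

```matlab
% Hypothetical presentation order; the real vector would have 324 entries.
imageOrder = [95 147 1];

pairs = zeros(numel(imageOrder), 2);   % kept only to show the pairing
for i = 1:numel(imageOrder)
    z = imageOrder(i);   % image shown on the i-th trial
    % build the saliency map for image z, load gaze recording i,
    % combine them and save (processing steps omitted)
    pairs(i, :) = [z i];
end
```

Nested loops visit every combination of z and i, whereas a single loop with indexing visits each (z, i) pair exactly once. If the order does live in a spreadsheet, reading both columns once before the loop would probably be faster than calling xlsread on every iteration.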

setting YTickLabel matlab

What is wrong with my YTickLabel? I cannot figure it out:
h2=bar(myData);
ylabels=['1';'1.5';'2';'2.5';'3'];
set(gca,'XTickLabel',applicationNames),'XTick',applicationNames),'YTickLabel',ylabel));
p.s: I have tried this as well with no success:
ax=gca
ax.YTickLabel=['1';'1.5';'2';'2.5';'3'];
I am getting this error:
Error using vertcat
Dimensions of matrices being concatenated are not consistent.
You are trying to create a character array. A character array is a 2D matrix in which every row must contain the same number of characters, and the number of rows is the number of labels. For your strings, the maximum number of characters per row is three (digit, dot, digit). Because some of your labels are only one character long (just a digit), the rows have different lengths and vertical concatenation fails with the error you see.
What you actually need is a cell array, which can hold labels of different lengths. Therefore:
ax.YTickLabel={'1';'1.5';'2';'2.5';'3'};
Alternatively, because your labels are numbers, you can simply use a numeric array instead:
ax.YTickLabel = [1;1.5;2;2.5;3];
A cell array of characters is used if you want to label the x and/or y axis to be something other than just numbers. It's possible to label the y axis using text, such as:
ax.YTickLabel = {'John'; 'Paul'; 'George'; 'Ringo'; 'The Beatles'};
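As a rough sketch of the points above, the labels can be generated from the tick values, and the tick positions can be set together with the labels so that each label lands on the intended value. The bar data here is a placeholder, not the asker's myData.

```matlab
ticks = 1:0.5:3;                                             % tick positions
labels = arrayfun(@num2str, ticks, 'UniformOutput', false);  % cell array of labels

% char pads shorter strings with trailing blanks, so a character array
% built this way is rectangular and does not trigger the vertcat error.
padded = char('1', '1.5', '2', '2.5', '3');

bar([1 2 3 2.5 1.5]);     % placeholder data for illustration
ax = gca;
ax.YTick = ticks;         % set positions first so the labels line up
ax.YTickLabel = labels;
```

Setting only YTickLabel relabels whatever ticks the axes happen to have, which is why YTick is assigned as well.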

How to implement data I have to svmtrain() function in MATLAB?

I have to write a script using MATLAB which will classify my data.
My data consists of 1051 web pages (rows) and 11000+ words (columns). I am holding the word occurrences for each page in the matrix. The first 230 rows are about computer science courses (to be labeled with +1) and the remaining 821 are not (to be labeled with -1). I am going to label a small part of these rows (say 30 rows) myself. Then the SVM will label the remaining unlabeled rows.
I have found that I could solve my problem using MATLAB's svmtrain() and svmclassify() methods. First I need to create SVMStruct.
SVMStruct = svmtrain(Training,Group)
Then I need to use
Group = svmclassify(SVMStruct,Sample)
But the point is that I do not know what Training and Group are. For Group, MathWorks says:
Grouping variable, which can be a categorical, numeric, or logical
vector, a cell vector of strings, or a character matrix with each row
representing a class label. Each element of Group specifies the group
of the corresponding row of Training. Group should divide Training
into two groups. Group has the same number of elements as there are
rows in Training. svmtrain treats each NaN, empty string, or
'undefined' in Group as a missing value, and ignores the corresponding
row of Training.
And for Training it is said that:
Matrix of training data, where each row corresponds to an observation
or replicate, and each column corresponds to a feature or variable.
svmtrain treats NaNs or empty strings in Training as missing values
and ignores the corresponding rows of Group.
I want to know how I can adapt my data to Training and Group. I need (at least) a small code sample.
EDIT
What I did not understand is that in order to have SVMStruct I have to run
SVMStruct = svmtrain(Training, Group);
and in order to have Group I have to run
Group = svmclassify(SVMStruct,Sample);
Also, I still do not understand what Sample should look like.
I am confused.
Training would be a matrix with 1051 rows (the webpages/training instances) and 11000 columns (the features/words). I'm assuming you want to test for the existence of each word on a webpage? In this case you could make the entry of the matrix a 1 if the word exists for a given webpage and a 0 if not.
You could initialize the matrix with Training = zeros(1051,11000); but filling the entries would be up to you, presumably done with some other code you've written.
Group is a 1-D column vector with one entry for every training instance (webpage) that tells you which of the two classes the webpage belongs to. In your case you would make the first 230 entries a "+1" for computer science and the remaining 821 entries a "-1" for not.
Group = zeros(1051,1); % gives you a matrix of zeros with 1051 rows and 1 column
Group(1:230) = 1; % set first 230 entries to +1
Group(231:end) = -1; % set the rest to -1
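To connect this to the question's plan of hand-labelling about 30 rows, one possible split is sketched below. The data is synthetic, the choice of which rows to label is arbitrary, and X, labels, trainIdx and testIdx are hypothetical names. Note that the Group passed to svmtrain holds the known labels of the training rows, while the Group returned by svmclassify holds the predicted labels for Sample, so the two do not depend on each other.

```matlab
% Small synthetic stand-in for the real 1051-by-11000 word-count matrix.
nPages = 1051;
nWords = 20;                              % reduced for illustration
X = randi([0 5], nPages, nWords);         % hypothetical word counts
labels = [ones(230, 1); -ones(821, 1)];   % +1 = computer science, -1 = other

% Rows labelled by hand (an arbitrary choice of 15 pages from each class).
trainIdx = [1:15, 231:245];
testIdx = setdiff(1:nPages, trainIdx);

Training = X(trainIdx, :);     % 30-by-nWords
Group = labels(trainIdx);      % 30-by-1, one label per row of Training
Sample = X(testIdx, :);        % rows whose labels are to be predicted

% With the toolbox that provides these functions installed, the calls
% from the question would then be:
%   SVMStruct = svmtrain(Training, Group);
%   Predicted = svmclassify(SVMStruct, Sample);
```

In newer MATLAB releases svmtrain and svmclassify have been removed in favour of fitcsvm and predict, so the last two calls may need to be adapted.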

Preserving matrix columns using Matlab brush/select data tool

I'm working with matrices in Matlab which have five columns and several million rows. I'm interested in picking particular groups of this data. Currently I'm doing this using plot3() and the brush/select data tool.
I plot the first three columns of the matrix as X,Y, Z and highlight the matrix region I'm interested in. I then use the brush/select tool's "Create variable" tool to export that region as a new matrix.
The problem is that when I do that, the remaining two columns of the original, bigger matrix are dropped. I understand why- they weren't plotted and hence the figure tool doesn't know about them. I need all five columns of that subregion though in order to continue the processing pipeline.
I'm adding the appropriate 4th and 5th column values to the exported matrix using a horrible nested if loop approach- if columns 1, 2 and 3 match in both the original and exported matrix, attach columns 4/5 of the original matrix to the exported one. It's bad design and agonizingly slow. I know there has to be a Matlab function/trick for this- can anyone help?
Thanks!
This might help:
1. I start with matrix 1 with columns X,Y,Z,A,B
2. Using the brush/select tool, I create a new (subregion) matrix 2 with columns X,Y,Z
3. I then loop through all members of matrix 2 against all members of matrix 1. If X,Y,Z match for a pair of rows, I append A and B from that row in matrix 1 to the appropriate row in matrix 2.
4. I become very sad as this takes forever and shows my ignorance of Matlab.
If I understand your situation correctly here is a simple way to do it:
Assuming you have a matrix like so: M = [A B C D E] where each letter is a Nx1 vector.
You select a range. This part is not really clear to me, but suppose you can create the following:
idxA, idxB and idxC, logical Nx1 vectors that are 1 where the corresponding value is in the region and 0 otherwise.
Then you can simply use:
M(idxA&idxB&idxC,:)
and you will get the additional two columns as well.
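If only the exported X,Y,Z matrix is available, the row lookup can be done in one call with ismember and its 'rows' option instead of nested loops. This is a sketch on made-up data; M, brushed and selected are hypothetical names, and it assumes the exported values are bit-for-bit identical to the originals, which should hold if they were not modified after export.

```matlab
% Hypothetical full matrix with columns X, Y, Z, A, B.
M = [1 2 3 10 11;
     4 5 6 20 21;
     7 8 9 30 31;
     2 2 2 40 41];

% Hypothetical brushed export: only X, Y, Z of the selected rows.
brushed = [7 8 9;
           1 2 3];

% For each brushed row, find the matching row of M by its first three columns.
[found, loc] = ismember(brushed, M(:, 1:3), 'rows');

% Pull all five columns for the rows that were found.
selected = M(loc(found), :);
```

If several rows of M share the same X,Y,Z values, ismember reports only the first match, so duplicates would need separate handling.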