MATLAB Box Plot for large groups of Measurements

I have three large groups of measurements and I'm trying to generate a box plot with 4 boxes: one box for each group and a last one for all groups joined.
I've tried this code:
A = rand(1417725,1)
B = rand(2236508,1)
C = rand(3100641,1)
D = [A;B;C]
X= [A;B;C;D]
group = [repmat({'a'},1417725,1); repmat({'b'},2236508,1); repmat({'c'},3100641,1); repmat({'d'},6754874,1)];
boxplot(X,group)
but in the end I get "Out of memory" and can't get the plot.
Do you have any idea how to solve this problem?
Thank you!

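Instead of creating a huge (in terms of memory) cell array of strings, where every cell carries its own header on top of the character data, create a much smaller array of, for instance, int8 integers, which needs one byte per element: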
group = [ones(size(A),'int8');2*ones(size(B),'int8');3*ones(size(C),'int8');4*ones(size(D),'int8')];
Then, after plotting, change the labels in the plot to the desired names:
set(gca, 'XTick', 1:4, 'XTickLabel', {'a','b','c','d'});
With the smaller grouping variable you may have enough memory to produce the plot.
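A scaled-down sketch of the same idea, assuming the Statistics Toolbox boxplot is available. The sample sizes are deliberately small and illustrative, not the asker's, and the variable groupCell exists only to compare memory use:

```matlab
% Small stand-ins for the three measurement groups (illustrative sizes)
A = rand(1000,1);
B = rand(2000,1);
C = rand(3000,1);
D = [A;B;C];
X = [A;B;C;D];

% Compact numeric grouping variable: one int8 per observation
group = [ones(size(A),'int8'); 2*ones(size(B),'int8'); ...
         3*ones(size(C),'int8'); 4*ones(size(D),'int8')];

% Cell-array grouping variable, built only for the memory comparison
groupCell = [repmat({'a'},numel(A),1); repmat({'b'},numel(B),1); ...
             repmat({'c'},numel(C),1); repmat({'d'},numel(D),1)];

infoInt  = whos('group');
infoCell = whos('groupCell');
fprintf('int8 groups: %d bytes, cell groups: %d bytes\n', ...
        infoInt.bytes, infoCell.bytes);

% Plot with the numeric groups, then rename the tick labels
boxplot(X, group);
set(gca, 'XTick', 1:4, 'XTickLabel', {'a','b','c','d'});
```

The printed byte counts show how much smaller the numeric grouping variable is. The data vector X is stored in double precision either way, so this only removes the overhead of the labels.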

Related

How to assign symbols to the data in Matlab

I am trying to plot some data with categories. However, I couldn't manage to plot my data according to the categories and show a legend for them, so I am asking for a little help with this issue.
Example data:
Name   Brand     Mt
Enes   Renault   2.6
Avni   Tofaş     2.38
Asaf   Tofaş     3.06
So far I have managed to plot these data with two plot commands overlaying each other, grouped by Brand. However, one group then has two data points (Tofaş has two entries) and the other has only one (Renault), so the x-axis is confusing and the graph is not sound.
The other issue is that I can't label the x-axis according to Name when I overlay the two plots.
figure
plot(table.Mt, 'o');
xtickangle(90)
sizes = size(table.Name);
a1 = gca;
a1.XTick = [1:sizes];
a1.XTickLabel = table.Name;
the output of the code above
[Ism, Ind] =ismember(table.Brand, 'Tofaş');
plot(Ism, 'o')
the output of second code block above
As you can see, when I select only a specific Brand, the rest of the array is filled with zeros (0), which I don't want.
What I want is to plot all the data together, with a specific symbol for each Brand.
Thank you
Enes

MATLAB loading data from multiple .mat files

My data is x,y co-ordinates in multiple files
a=dir('*.mat')
b={a(:).name}
to load the filenames into a cell array.
How do I use a loop to sequentially load one column of data from each file into consecutive rows of a new, separate array?
I've been doing it individually using e.g.
load('example1.mat')
A(:,1)=AB(:,1)
load('example2.mat')
A(:,2)=AB(:,1)
load('example3.mat')
A(:,3)=AB(:,1)
Obviously very primitive and time consuming!!
My Matlab skills are weak so any advice gratefully received
Cheers
Many thanks again, I'm still figuring out how to read the code but I used it like this;
a=dir('*.mat');
b={a(:).name};
test1=zeros(numel(b),1765);
for k=1:numel(b)
    S=load(b{k});
    test1(k,:)=S.AB(:,2);
end
I then used the following code to create a PCA cluster plot
[wcoeff,score,latent,tsquared,explained] = pca(test1,...
    'VariableWeights','variance');
c3 = wcoeff(:,1:3)
coefforth = inv(diag(std(test1)))*wcoeff;
I = c3'*c3
cscores = zscore(test1)*coefforth;
figure()
plot(score(:,1),score(:,2),'+')
xlabel('1st Principal Component')
ylabel('2nd Principal Component')
I was using 'gname' to label the points on the cluster plot but found that the points were simply labelled from 1 to the number of rows in the array. I was going to ask you about this, but I found out through trial and error that 'gname(b)' labels the points with the names listed in b.
However, the cluster plot starts to look very busy once I have labelled quite a few points, so now I am wondering whether it is possible to extract the filenames into a list by dragging round or selecting a few points. I think it is possible, as I have read a few related topics, but any tips or advice on gname or on extracting labels from cluster plots would be greatly appreciated. Apologies again for my formatting, I'm still getting used to this website!
Here is a way to do it. Hopefully I got what you wanted correctly :)
The code is commented but please ask any questions if something is unclear.
a=dir('*.mat');
b={a(:).name};
%// Initialize the output array. Here SomeNumber depends on the size of your data in AB.
A = zeros(numel(b),SomeNumber);
%// Loop through each 'example.mat' file
for k = 1:numel(b)
%// ===========
%// Here you could do either of the following (keep only one of them):
%// 1) Create a name to load with sprintf. It does not require a or b.
%// NameToLoad = sprintf('example%i.mat',k);
%// S = load(NameToLoad);
%// 2) Load directly from b:
S = load(b{k});
%// ===========
%// Now S is a structure containing every variable from the exampleX.mat file.
%// You can access the data using dot notation.
%// Store the data into rows of A
A(k,:) = S.AB(:,1);
end
Hope that is what you meant!

Issue with reading data in text files and plotting a graph (Matlab)

I use the following code to collect data from two text files, join them together and then plot them. For some reason I appear to get two plots rather than one, and I am not sure why this is happening.
load MODES1.dat; % read data into the MODES1 matrix
x1 = MODES1(:,1); % copy first column of MODES1 into x1
y1 = MODES1(:,2); % and second column of MODES1 into y1
load MODES.dat; % read data into the MODES matrix
x = MODES(:,1); % copy first column of MODES into x
y = MODES(:,2); % and second column of MODES into y
% Joining the two sets of data
endx = [x1;x];
endy = [y1;y];
figure(1)
plot(endx,endy)
xlabel('Unique Threshold Strains','FontSize',12);
ylabel('Probabilities of occurrence','FontSize',12);
title('\it{Unique Values versus frequencies of occurrence}','FontSize',16);
Thanks
Your problem is quite a simple one. Matlab's plot command creates a point for each data point defined by the parameters and connects those points in the order they appeared in the first parameter. To get an idea of this behavior, try
x = [0;1;-1;2;-2;3;-3;4;-4;5];
plot(x,x.^2);
You won't get the quadratic function graph you might expect.
To fix this, you must sort your input arrays identically. Sorting one array is simple (sort(endx)), but you want to sort both in the same way. Matlab actually gives you a function to do this, sortrows, but it only works on matrices, so you need to do some concatenating/separating:
sorted = sortrows( [endx endy] ); % avoid naming the variable "input", which shadows a built-in
endx = sorted(:,1);
endy = sorted(:,2);
This will sort the rows of the matrix built by putting endy right of endx with respect to the first column (endx). Now your inputs are correctly sorted and the resulting plot should only show one line. (More accurately, one line which does not at some point go back where it came from.)
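As an alternative sketch, the second output of sort returns the permutation that was applied, which can then reorder the other array without building a matrix. The data below are made up for illustration, and the names endxSorted, endySorted and order are not from the original code:

```matlab
% Illustrative unsorted data (not the asker's files)
endx = [3; 1; 2; 5; 4];
endy = endx.^2;

% Sort x and keep the permutation, then apply it to y
[endxSorted, order] = sort(endx);
endySorted = endy(order);

% The line now runs from left to right without doubling back
plot(endxSorted, endySorted);
```

Unlike sortrows, this orders the points by x only and does not use y to break ties between equal x values.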
Another way to achieve this, depending on your actual use case and data origin, would be to build the mean value of both parts of x, so instead of endx = [x1;x];, you'd build endx = mean([x1 x],2);. This requires x1 and x to have the same length.
Yet another way is to drop the line altogether and go with
plot(endx,endy,'.');
or
plot(endx,endy,'LineStyle','none');
But this is only useful if your data points are very close to each other.

Matlab boxplot for multiple fields

I have this matlab file which has a field called "data". In "data" I have lots of fields for different bonds (x5Q12... etc).
I am trying to produce ONE box plot that contains ONE column from each of the fields (i.e. a box diagram with 36 boxes in it). I tried this code (e.g. to plot a box for column 2 in all of the bonds) but it doesn't work for me:
boxplot(gilts_withoutdates.data.:(:,2));figure(gcf);
I know my understanding of calling different levels in a structure is a problem here. Any suggestions, please? Many thanks.
You can use STRUCTFUN to extract the data from a particular column of all fields of a structure.
col2plot = 2; %# this is the column you want to plot
%# return, for each field in the structure, the specified
%# column in a cell array
data2plot = structfun(@(x){x(:,col2plot)},gilts_withoutdates.data);
%# convert the cell array into a vector plus group indices
groupIdx = arrayfun(@(x)x*ones(size(data2plot{x})),1:length(data2plot),'uni',0);
groupIdx = cat(1,groupIdx{:});
data2plot = cat(1,data2plot{:});
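%# labels is not defined above; using the field names is an assumption
labels = fieldnames(gilts_withoutdates.data);
%# create a compact boxplot
boxplot(data2plot,groupIdx,'plotStyle','compact','labels',labels)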
If you're interested in the distribution of the data, I can recommend my function distributionPlot.
B=gilts_withoutdates.data;
b=fieldnames(B);
for a=1:numel(b)
figure; boxplot(B.(b{a})); % one figure per field ("fig" is not a MATLAB command)
end
To plot a boxplot for each of the 5 columns of data for each field you could do this:
pos=1;
for i = 1:numel(b)
for ii=1:5
subplot(numel(b),5,pos);boxplot(B.(b{i})(:,ii));pos=pos+1;
end
end

MATLAB query about for loop, reading in data and plotting

I am a complete novice at using matlab and am trying to work out if there is a way of optimising my code. Essentially I have data from model outputs and I need to plot them using matlab. In addition I have reference data (with 95% confidence intervals) which I plot on the same graph to get a visual idea of how close the model outputs and reference data are.
In terms of the model outputs I have several thousand files (numbered sequentially) which I open in a loop and plot. The question I have is whether I can preprocess the data and then plot later, to save time. The issue I seem to be having when I try this is that I have a legend which either does not appear or is inaccurate.
My code (apologies if it is not elegant):
fn= xlsread(['tbobserved' '.xls']);
time= fn(:,1);
totalreference=fn(:,4);
totalreferencelowerci=fn(:,6);
totalreferenceupperci=fn(:,7);
figure
plot(time,totalreference,'-', time, totalreferencelowerci,'--', time, totalreferenceupperci,'--');
xlabel('Year');
ylabel('Reference incidence per 100,000 population');
title ('Total');
clickableLegend('Observed reference data', 'Totalreferencelowerci', 'Totalreferenceupperci','Location','BestOutside');
xlim([1910 1970]);
hold on
start_sim=10000;
end_sim=10005;
h = zeros (1,1000);
for i=start_sim:end_sim %is there any way of doing this earlier to save time?
a=int2str(i);
incidenceFile =strcat('result_', 'Sim', '_', a, 'I_byCal_total.xls');
est_tot=importdata(incidenceFile, '\t', 1);
cal_tot=est_tot.data;
magnitude=1;
t1=cal_tot(:,1)+1750;
totalmodel=cal_tot(:,3)+cal_tot(:,5);
h(a)=plot(t1,totalmodel);
xlim([1910 1970]);
ylim([0 500]);
hold all
clickableLegend(h(a),a,'Location','BestOutside')
end
Essentially I was hoping to have a way of reading in the data and then plot later - ie. optimise the code.
I hope you might be able to help.
Thanks.
mp
Regarding your issue concerning
I have a legend which either does not
appear or is inaccurate.
have a look at the following extracts from your code.
...
h = zeros (1,1000);
...
a=int2str(i);
...
h(a)=plot(t1,totalmodel);
...
You are using a character array as index. Instead of h(a) you should use h(i). When a character array is used as an index, MATLAB converts each character to its numeric character code, as shown in the following example for i = 10.
>> double(int2str(10))
ans = 49 48
Instead of h(10) the plot handle will be assigned to h([49 48]) which is not your intention.
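A minimal sketch of how the loop could collect handles with a numeric index and build the legend once at the end. It uses synthetic curves in place of the result files, the standard legend function instead of clickableLegend, and the helper names n, k and names, none of which come from the original code:

```matlab
start_sim = 10000;
end_sim   = 10005;
n = end_sim - start_sim + 1;

h     = gobjects(1, n);   % preallocated array of graphics handles
names = cell(1, n);       % one legend entry per simulation

figure
hold on
for i = start_sim:end_sim
    k = i - start_sim + 1;                    % numeric index 1..n
    t1 = (1910:1970)';                        % synthetic time axis
    totalmodel = 250 + 50*sin(t1/10 + k);     % synthetic model output
    h(k) = plot(t1, totalmodel);
    names{k} = int2str(i);                    % text is used for the label only
end
hold off
xlim([1910 1970]);
ylim([0 500]);
legend(h, names, 'Location', 'BestOutside');
```

Indexing with k rather than i keeps the handle array at n elements; h(i) with i = 10000 would grow the array far beyond what was preallocated. Creating the legend once after the loop also avoids rebuilding it on every iteration.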