From an N x 3 matrix whose third column contains only the values 1, 2, 3 and 4, I have to create a bar plot as well as a line plot showing the growth rate (second column) against temperature for each bacteria type. The first column is called 'Temperature', the second 'Growth rate' and the third 'Bacteria type'.
As of right now, my line plot does not work when all rows with one of the four values in the third column have been removed. The matrix could look something like this:
39.1220 0.8102 1.0000
13.5340 0.5742 1.0000
56.1370 0.2052 1.0000
50.0190 0.4754 1.0000
24.2970 0.8615 1.0000
37.1830 0.8513 1.0000
59.2390 0.0584 1.0000
45.7840 0.6254 1.0000
51.9480 0.3932 1.0000
42.3400 0.7371 1.0000
25.3870 0.8774 1.0000
57.1870 0.3880 2.0000
37.4580 0.7095 2.0000
46.4190 0.6431 2.0000
38.8380 0.7034 2.0000
11.2930 0.1214 2.0000
32.3270 0.6708 2.0000
42.3150 0.6908 2.0000
36.0600 0.7049 2.0000
28.6160 0.6248 2.0000
56.8570 0.3940 2.0000
51.4770 0.5410 2.0000
52.4540 0.5127 2.0000
28.6270 0.6248 2.0000
39.6590 0.7021 2.0000
53.6280 0.4829 2.0000
56.6750 0.4029 2.0000
43.4230 0.6805 2.0000
20.3390 0.4276 2.0000
42.6930 0.6826 2.0000
13.6030 0.2060 2.0000
30.3360 0.6497 2.0000
43.3470 0.6749 2.0000
56.6860 0.3977 2.0000
50.5480 0.5591 2.0000
34.2270 0.6929 2.0000
47.8370 0.6136 2.0000
30.8520 0.6593 2.0000
51.3290 0.5050 3.0000
29.5010 0.7789 3.0000
34.8950 0.8050 3.0000
44.7400 0.6884 3.0000
51.7180 0.4927 3.0000
40.4810 0.7621 3.0000
38.7370 0.7834 3.0000
26.3020 0.7379 3.0000
32.8210 0.8072 3.0000
45.6900 0.6684 3.0000
54.2200 0.4058 3.0000
46.0430 0.6611 3.0000
10.9310 0.2747 3.0000
43.7390 0.7043 3.0000
31.9250 0.7948 3.0000
31.8910 0.7954 3.0000
15.8520 0.4592 3.0000
50.7340 0.5237 3.0000
26.2430 0.7305 3.0000
22.3110 0.6536 3.0000
14.7690 0.1796 4.0000
17.3260 0.2304 4.0000
41.5570 0.3898 4.0000
52.9660 0.2604 4.0000
58.7110 0.1558 4.0000
And my code is as follows, with data being the matrix (double):
function dataPlot(data)
%1xN matrix with bacteria
A=data(:,3);
%Number of different bacterias is counted, and gathered in a vector
barData = [sum(A(:) == 1), sum(A(:) == 2), sum(A(:) == 3), sum(A(:) == 4)];
figure
bar(barData);
label = {'Salmonella enterica'; 'Bacillus cereus'; 'Listeria'; 'Brochothrix thermosphacta'};
set(gca,'xtick',[1:4],'xticklabel',label)
set(gca,'XTickLabelRotation',45)
ylabel('Observations')
title('Distribution of bacteria')
%The data is divided into four matrices based on the four different bacterias
%Salmonella matrix
S=data;
deleterow = false(size(S, 1), 1);
for n = 1:size(S, 1)
%For column condition
if S(n, 3)~= 1
%Mark line for deletion afterwards
deleterow(n) = true;
end
end
S(deleterow,:) = [];
S=S(:,1:2);
S=sortrows(S,1);
%Bacillus cereus
Ba=data;
deleterow = false(size(Ba, 1), 1);
for p = 1:size(Ba, 1)
%For column condition
if Ba(p, 3)~= 2
%Mark line for deletion afterwards
deleterow(p) = true;
end
end
Ba(deleterow,:) = [];
Ba=Ba(:,1:2);
Ba=sortrows(Ba,1);
%Listeria
L=data;
deleterow = false(size(L, 1), 1);
for v = 1:size(L, 1)
%For column condition
if L(v, 3)~= 3
%Mark line for deletion afterwards
deleterow(v) = true;
end
end
L(deleterow,:) = [];
L=L(:,1:2);
L=sortrows(L,1);
%Brochothrix thermosphacta
Br=data;
deleterow = false(size(Br, 1), 1);
for q = 1:size(Br, 1)
%For column condition
if Br(q, 3)~= 3
%Mark line for deletion afterwards
deleterow(q) = true;
end
end
Br(deleterow,:) = [];
Br=Br(:,1:2);
Br=sortrows(Br,1);
%The data is plotted (growth rate against temperature)
figure
plot(S(:,1), S(:, 2), Ba(:,1), Ba(:, 2), L(:,1), L(:, 2), Br(:,1), Br(:, 2))
xlim([10 60])
ylim([0; Inf])
xlabel('Temperature')
ylabel('Growth rate')
title('Growth rate as a function of temperature')
legend('Salmonella enterica','Bacillus cereus','Listeria','Brochothrix thermosphacta')
Can anyone help me fix it so that when I have a matrix without, e.g., 2 in the third column, it will still plot correctly?
I do know how to filter it correctly and apply that filtering to 'data', so the only problem is the messages that occur when plotting.
The errors are:
Warning: Ignoring extra legend entries.
> In legend>set_children_and_strings (line 643)
In legend>make_legend (line 328)
In legend (line 254)
In dataPlot (line 82)
In Hovedscript (line 153)
This happens when running the function from a main script with the matrix sorted by growth rate (second column) and filtered so that only the third-column values 1, 3 and 4 are analyzed. The filtering is done in advance by another function, and the new 'data' looks like this:
59.2390 0.0584 1.0000
58.7110 0.1558 4.0000
14.7690 0.1796 4.0000
56.1370 0.2052 1.0000
17.3260 0.2304 4.0000
52.9660 0.2604 4.0000
10.9310 0.2747 3.0000
41.5570 0.3898 4.0000
51.9480 0.3932 1.0000
54.2200 0.4058 3.0000
15.8520 0.4592 3.0000
50.0190 0.4754 1.0000
51.7180 0.4927 3.0000
51.3290 0.5050 3.0000
50.7340 0.5237 3.0000
13.5340 0.5742 1.0000
45.7840 0.6254 1.0000
22.3110 0.6536 3.0000
46.0430 0.6611 3.0000
45.6900 0.6684 3.0000
44.7400 0.6884 3.0000
43.7390 0.7043 3.0000
26.2430 0.7305 3.0000
42.3400 0.7371 1.0000
26.3020 0.7379 3.0000
40.4810 0.7621 3.0000
29.5010 0.7789 3.0000
38.7370 0.7834 3.0000
31.9250 0.7948 3.0000
31.8910 0.7954 3.0000
34.8950 0.8050 3.0000
32.8210 0.8072 3.0000
39.1220 0.8102 1.0000
37.1830 0.8513 1.0000
24.2970 0.8615 1.0000
25.3870 0.8774 1.0000
Again, the bar plot works just fine, although it still shows all 4 bacteria even when only 3 of them are used. The problem is in the line plot, where one line does not show.
Thank you for your time
A solution is to replace the following lines
plot(S(:,1), S(:, 2), Ba(:,1), Ba(:, 2), L(:,1), L(:, 2), Br(:,1), Br(:, 2))
legend('Salmonella enterica','Bacillus cereus','Listeria','Brochothrix thermosphacta')
by
hold on
plot(S(:,1), S(:, 2), 'DisplayName', 'Salmonella enterica');
plot(Ba(:,1), Ba(:, 2), 'DisplayName', 'Bacillus cereus');
plot(L(:,1), L(:, 2), 'DisplayName', 'Listeria');
plot(Br(:,1), Br(:, 2), 'DisplayName', 'Brochothrix thermosphacta');
legend SHOW;
In this way, each legend entry is explicitly assigned to its own plot, which works even if some plots are empty.
Copy and Paste mistake
In the provided code, L and Br end up identical because of a copy-and-paste mistake: both blocks filter on the value 3. You should change if Br(q, 3)~= 3 into if Br(q, 3)~= 4.
Result
If I use your second input data (without 2 in the third column), the plot is drawn without any error message.
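The four copy-pasted filter loops, which is where the ~= 3 slip crept in, could also be collapsed into a single loop over the bacteria types. The following is only a sketch, not the original poster's code: it assumes logical indexing on the third column is acceptable, it uses a small subset of the question's rows with type 2 left out, and the variable names names and plotted are made up for illustration.

```matlab
% Labels taken from the question; index k corresponds to the value in column 3.
names = {'Salmonella enterica','Bacillus cereus','Listeria','Brochothrix thermosphacta'};

% A few rows from the question's data, with bacteria type 2 absent.
data = [39.1220 0.8102 1;
        13.5340 0.5742 1;
        51.3290 0.5050 3;
        29.5010 0.7789 3;
        14.7690 0.1796 4;
        17.3260 0.2304 4];

figure
hold on
plotted = {};                              % names of the types actually drawn
for k = 1:4
    rows = data(data(:,3) == k, 1:2);      % logical indexing instead of a delete loop
    if isempty(rows)
        continue                           % skip types that are not in the data
    end
    rows = sortrows(rows, 1);              % sort by temperature
    plot(rows(:,1), rows(:,2), 'DisplayName', names{k});
    plotted{end+1} = names{k};
end
hold off
legend show
xlabel('Temperature')
ylabel('Growth rate')
```

Because every line carries its own DisplayName and missing types are skipped, the legend only lists the bacteria that are present.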
Hi I have data in MATLAB like this:
F =
1.0000 1.0000
2.0000 1.0000
3.0000 1.0000
3.1416 9.0000
4.0000 1.0000
5.0000 1.0000
6.0000 1.0000
6.2832 9.0000
7.0000 1.0000
8.0000 1.0000
9.0000 1.0000
9.4248 9.0000
10.0000 1.0000
I am looking for a way to sum the data in specific intervals. For example, if I want my sampling interval to be 1, then the end result should be:
F =
1.0000 1.0000
2.0000 1.0000
3.0000 10.0000
4.0000 1.0000
5.0000 1.0000
6.0000 10.0000
7.0000 1.0000
8.0000 1.0000
9.0000 10.0000
10.0000 1.0000
i.e. the data in the second column is accumulated based on sampling the first column. Is there a function in MATLAB to do this?
Yes, by combining histc() and accumarray():
F =[1.0000 1.0000;...
2.0000 1.0000;...
3.0000 1.0000;...
3.1416 9.0000;...
4.0000 1.0000;...
5.0000 1.0000;...
6.0000 1.0000;...
6.2832 9.0000;...
7.0000 1.0000;...
8.0000 1.0000;...
9.0000 1.0000;...
9.4248 9.0000;...
10.0000 1.0000];
range=1:0.5:10;
[~,bin]=histc(F(:,1),range);
%fix the output length so that empty bins at the end do not shorten the result
result= [range.' accumarray(bin,F(:,2),[numel(range) 1])]
If you run the code, keep in mind that I changed the sampling interval (range) to 0.5.
This code works for any sampling interval; just define the bin edges you want as range.
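As far as I know, newer MATLAB documentation lists histc as not recommended, and discretize (which I believe was introduced in R2015a) can supply the bin indices instead. The sketch below is an assumption-laden variant, not part of the answer above: the variable names step, edges and sums are made up, and an extra Inf edge is appended so that the value 10 gets a bin of its own, as it does with histc.

```matlab
F = [1.0000 1; 2.0000 1; 3.0000 1; 3.1416 9; 4.0000 1; 5.0000 1; 6.0000 1; ...
     6.2832 9; 7.0000 1; 8.0000 1; 9.0000 1; 9.4248 9; 10.0000 1];

step  = 1;                                  % sampling interval
edges = [1:step:10, Inf];                   % bin k covers [edges(k), edges(k+1))
bin   = discretize(F(:,1), edges);          % bin index for every row of F
sums  = accumarray(bin, F(:,2), [numel(edges)-1, 1]);   % sum column 2 per bin
result = [edges(1:end-1).', sums]
```

Values below the first edge would come back as NaN from discretize and make accumarray fail, so the first edge should not exceed the smallest value in the first column.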
Yes, and that's a job for accumarray:
Use the values in column 1 of F as group indices and sum (the default behavior of accumarray) the corresponding elements of the 2nd column.
For a given interval of size s (Thanks to Luis Mendo for that):
S = accumarray(round(F(:,1)/s),F(:,2),[]); %// or you can use "floor" instead of "round".
S =
1
1
10
1
1
10
1
1
10
1
So constructing the output by concatenation:
NewF = [unique(round(F(:,1)/s)) S]
NewF =
1 1
2 1
3 10
4 1
5 1
6 10
7 1
8 1
9 10
10 1
Yay!!
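One caveat with the concatenation above: it only lines up when every interval contains at least one sample. If there is a gap, accumarray pads the missing bins with zeros while unique skips them, so the two columns end up with different lengths. A minimal sketch of one way around this, using made-up data with a gap at 4 and 5, is to build the first column from the length of S instead:

```matlab
% Hypothetical data: nothing falls near 4 or 5.
F = [1 1; 2 1; 3 1; 3.1416 9; 6 1; 6.2832 9];
s = 1;                                   % interval size

idx  = round(F(:,1)/s);                  % bin index per row (must be >= 1)
S    = accumarray(idx, F(:,2));          % empty bins are filled with 0
NewF = [(1:numel(S)).'*s, S]             % one label per bin, gaps included
```

Note that round gives index 0 for values below s/2, which accumarray rejects, so such data would need an offset first.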
I have 3 vectors: npdf, tn(:,1) and tn(:,2) and am finding the values of npdf in tn(:,2) line by line:
[npdf(1:20,1), tn(1:20,:)]
ans =
8.0000 3.0000 1.0000
11.0000 2.9167 1.0000
1.0000 3.3000 1.0000
11.0000 1.2167 1.0000
5.0000 2.8167 1.0000
1.0000 2.4000 1.0000
2.0000 2.4500 1.0000
4.0000 0.2500 1.0000
15.0000 3.7500 1.0000
15.0000 4.9167 1.0000
1.0000 2.8167 2.0000
17.0000 0.2500 2.0000
15.0000 1.0000 3.0000
4.0000 3.0000 3.0000
8.0000 0.5833 3.0000
1.0000 0.5833 3.0000
3.0000 5.0000 5.0000
11.0000 3.7500 6.0000
8.0000 3.0000 7.0000
15.0000 2.8000 7.0000
for i=1:length(npdf)
LOCA=ismember(tn(:,2),npdf(i,1,1));
%rows of tn whose second column equals the current npdf value
dummy=find(LOCA);
tpdf(i,1)=tn(dummy(randi(length(dummy))),1);
end
Each time it finds the value of npdf in tn(:,2), it chooses a value from tn(:,1).
Here's the problem: if it can't locate the value from npdf in tn(:,2), then I need to choose the nearest value (in magnitude) in tn(:,2) and proceed. Either that, or some sort of interpolation between the nearest values. How would you do this most efficiently?
Feel free to change the code; it doesn't look very efficient to me.
It can be done easily by using knnsearch as follows:
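[idx,D]=knnsearch(tn(:,2),npdf,'K',size(tn,1));
for i=1:size(D,1)
%number of rows of tn tied at the minimum distance (each row of D is sorted ascending)
k=sum(D(i,:)==D(i,1));
%pick one of those rows at random through idx
tpdf(i,1)=tn(idx(i,randi(k)),1);
end
It computes the distance from each value in npdf to all the values in tn(:,2), keeps only the nearest ones (including ties), and then picks one of them at random and takes the corresponding value from tn(:,1).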
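knnsearch belongs to the Statistics and Machine Learning Toolbox. If that is not available, the same nearest-value selection can be sketched with plain abs and min. This is only an illustration under assumptions: the tiny tn and npdf below are made up (chosen so that each lookup has a single nearest row), and cand is a hypothetical variable name.

```matlab
% Made-up example data: column 1 holds the values to draw, column 2 the keys.
tn   = [3.0 1;
        2.9 2;
        5.0 5];
npdf = [2; 4; 5];

tpdf = zeros(numel(npdf), 1);             % preallocate the result
for i = 1:numel(npdf)
    d    = abs(tn(:,2) - npdf(i));        % distance to every key in tn(:,2)
    cand = find(d == min(d));             % all rows tied for the nearest key
    tpdf(i) = tn(cand(randi(numel(cand))), 1);   % random pick among them
end
```

An exact match has distance 0, so this covers both the "found" and the "nearest" case in one loop.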
I have the following data matrix. I want to iterate over it, look at the value in the last column of each row, and add that row, without its last element, to a new matrix chosen by that value.
5.1000 3.3000 1.7000 0.5000 1.0000
6.8000 3.2000 5.9000 2.3000 3.0000
5.0000 2.3000 3.3000 1.0000 2.0000
7.4000 2.8000 6.1000 1.9000 3.0000
6.5000 3.2000 5.1000 2.0000 3.0000
4.8000 3.4000 1.9000 0.2000 1.0000
4.9000 3.0000 1.4000 0.2000 1.0000
5.1000 3.8000 1.5000 0.3000 1.0000
5.1000 3.4000 1.5000 0.2000 1.0000
5.5000 2.6000 4.4000 1.2000 2.0000
This is the code that I have
M1 = [];
M2 = [];
M3 = [];
for i=1:length(currentCell)
if currentCell(1,5) == 1.00
m3Data = currentCell(1:1,1:4);
%how can I add m3Data to M1
end
end
Let your original matrix be M. Then
M1 = M(M(:,5)==1,1:4)
puts all the rows ending with a 1 into M1, excluding the final column. Is that what you want?
You could do it with a for loop if you want, but I don't see any need.
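For completeness, here is a sketch of the loop version, which also answers the "how can I add m3Data to M1" comment in the question: a row is appended by vertical concatenation. It assumes the matrix is called M and that the last column only contains 1, 2 or 3. It uses size(M,1) rather than length, because length returns the largest dimension rather than the number of rows.

```matlab
M = [5.1 3.3 1.7 0.5 1;
     6.8 3.2 5.9 2.3 3;
     5.0 2.3 3.3 1.0 2;
     7.4 2.8 6.1 1.9 3;
     6.5 3.2 5.1 2.0 3;
     4.8 3.4 1.9 0.2 1;
     4.9 3.0 1.4 0.2 1;
     5.1 3.8 1.5 0.3 1;
     5.1 3.4 1.5 0.2 1;
     5.5 2.6 4.4 1.2 2];

M1 = []; M2 = []; M3 = [];
for i = 1:size(M,1)
    row = M(i,1:4);                 % current row without its last element
    switch M(i,5)                   % last column decides the target matrix
        case 1
            M1 = [M1; row];         % append as a new row
        case 2
            M2 = [M2; row];
        case 3
            M3 = [M3; row];
    end
end
```

Growing a matrix inside a loop reallocates it on every pass, which is one reason the indexing one-liner is usually preferred.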
I have a matrix (table actually) which I imported from a file:
1.0000 1.9736
4.0000 0.2016
9.0000 0.0584
10.0000 0.0495
5.0000 0.1845
2.0000 0.6873
1.0000 1.4177
2.0000 0.4699
5.0000 0.1555
10.0000 0.0435
13.0000 0.0326
8.0000 0.0860
5.0000 0.1685
4.0000 0.1956
5.0000 0.1433
8.0000 0.0675
13.0000 0.0335
13.0000 0.0327
10.0000 0.0431
9.0000 0.0582
10.0000 0.0551
13.0000 0.0308
I want to get the average of the right-column values for each distinct value in the left column. That is:
avg = [
1.0000 1.69565
2.0000 0.5786
4.0000 0.1978]
and so on. I could do this with a while or for loop, but this is not the MATLAB way. So how can I do this?
a=[randi(5,10,1) rand(10,1)];
a =
4.0000 0.4387
1.0000 0.3816
2.0000 0.7655
1.0000 0.7952
1.0000 0.1869
5.0000 0.4898
4.0000 0.4456
2.0000 0.6463
5.0000 0.7094
1.0000 0.7547
[uniqueID,~,uniqueInd]=unique(a(:,1));
[uniqueID accumarray(uniqueInd,a(:,2))./accumarray(uniqueInd,1)]
ans =
1.0000 0.5296
2.0000 0.7059
4.0000 0.4422
5.0000 0.5996
If your matrix is called a, try
>> accumarray(grp2idx(a(:,1)),a(:,2),[],@mean)
ans =
1.6957
0.5786
0.1986
0.16295
0.07675
0.0583
0.0478
0.0324
Note that grp2idx is part of Statistics Toolbox. If you don't have that, you can use the unique command to get the same results.
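A minimal sketch of that toolbox-free variant, using the third output of unique as the group index, follows. The six rows are taken from the question's data; the names ids, g and avg are made up for illustration.

```matlab
% Rows for the keys 1, 2 and 4 from the question's table.
a = [1 1.9736;
     4 0.2016;
     2 0.6873;
     1 1.4177;
     2 0.4699;
     4 0.1956];

[ids, ~, g] = unique(a(:,1));                     % g maps each row to its group
avg = [ids, accumarray(g, a(:,2), [], @mean)]     % mean of column 2 per group
```

Here unique returns the keys in ascending order, so the first column of avg labels each mean.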