Matlab - Plot from double data - matlab

From an N x 3 matrix whose third column only takes the four values 1, 2, 3 and 4, I have to create a bar plot as well as a line plot showing the growth rate (second column values) for each bacteria type. The first column is called 'Temperature', the second 'Growth rate' and the third 'Bacteria type'.
As of right now my line plot does not work when the rows with one of the four values in the third column have been removed. The matrix could look something like this:
39.1220 0.8102 1.0000
13.5340 0.5742 1.0000
56.1370 0.2052 1.0000
50.0190 0.4754 1.0000
24.2970 0.8615 1.0000
37.1830 0.8513 1.0000
59.2390 0.0584 1.0000
45.7840 0.6254 1.0000
51.9480 0.3932 1.0000
42.3400 0.7371 1.0000
25.3870 0.8774 1.0000
57.1870 0.3880 2.0000
37.4580 0.7095 2.0000
46.4190 0.6431 2.0000
38.8380 0.7034 2.0000
11.2930 0.1214 2.0000
32.3270 0.6708 2.0000
42.3150 0.6908 2.0000
36.0600 0.7049 2.0000
28.6160 0.6248 2.0000
56.8570 0.3940 2.0000
51.4770 0.5410 2.0000
52.4540 0.5127 2.0000
28.6270 0.6248 2.0000
39.6590 0.7021 2.0000
53.6280 0.4829 2.0000
56.6750 0.4029 2.0000
43.4230 0.6805 2.0000
20.3390 0.4276 2.0000
42.6930 0.6826 2.0000
13.6030 0.2060 2.0000
30.3360 0.6497 2.0000
43.3470 0.6749 2.0000
56.6860 0.3977 2.0000
50.5480 0.5591 2.0000
34.2270 0.6929 2.0000
47.8370 0.6136 2.0000
30.8520 0.6593 2.0000
51.3290 0.5050 3.0000
29.5010 0.7789 3.0000
34.8950 0.8050 3.0000
44.7400 0.6884 3.0000
51.7180 0.4927 3.0000
40.4810 0.7621 3.0000
38.7370 0.7834 3.0000
26.3020 0.7379 3.0000
32.8210 0.8072 3.0000
45.6900 0.6684 3.0000
54.2200 0.4058 3.0000
46.0430 0.6611 3.0000
10.9310 0.2747 3.0000
43.7390 0.7043 3.0000
31.9250 0.7948 3.0000
31.8910 0.7954 3.0000
15.8520 0.4592 3.0000
50.7340 0.5237 3.0000
26.2430 0.7305 3.0000
22.3110 0.6536 3.0000
14.7690 0.1796 4.0000
17.3260 0.2304 4.0000
41.5570 0.3898 4.0000
52.9660 0.2604 4.0000
58.7110 0.1558 4.0000
And my code is as follows, with data being the matrix (double)
function dataPlot(data)
%1xN matrix with bacteria
A=data(:,3);
%Number of different bacterias is counted, and gathered in a vector
barData = [sum(A(:) == 1), sum(A(:) == 2), sum(A(:) == 3), sum(A(:) == 4)];
figure
bar(barData);
label = {'Salmonella enterica'; 'Bacillus cereus'; 'Listeria'; 'Brochothrix thermosphacta'};
set(gca,'xtick',[1:4],'xticklabel',label)
set(gca,'XTickLabelRotation',45)
ylabel('Observations')
title('Destribution of bacteria')
%The data is divided into four matrices based on the four different bacterias
%Salmonella matrix
S=data;
deleterow = false(size(S, 1), 1);
for n = 1:size(S, 1)
%For column condition
if S(n, 3)~= 1
%Mark line for deletion afterwards
deleterow(n) = true;
end
end
S(deleterow,:) = [];
S=S(:,1:2);
S=sortrows(S,1);
%Bacillus cereus
Ba=data;
deleterow = false(size(Ba, 1), 1);
for p = 1:size(Ba, 1)
%For column condition
if Ba(p, 3)~= 2
%Mark line for deletion afterwards
deleterow(p) = true;
end
end
Ba(deleterow,:) = [];
Ba=Ba(:,1:2);
Ba=sortrows(Ba,1);
%Listeria
L=data;
deleterow = false(size(L, 1), 1);
for v = 1:size(L, 1)
%For column condition
if L(v, 3)~= 3
%Mark line for deletion afterwards
deleterow(v) = true;
end
end
L(deleterow,:) = [];
L=L(:,1:2);
L=sortrows(L,1);
%Brochothrix thermosphacta
Br=data;
deleterow = false(size(Br, 1), 1);
for q = 1:size(Br, 1)
%For column condition
if Br(q, 3)~= 3
%Mark line for deletion afterwards
deleterow(q) = true;
end
end
Br(deleterow,:) = [];
Br=Br(:,1:2);
Br=sortrows(Br,1);
%The data is plotted (growth rate against temperature)
figure
plot(S(:,1), S(:, 2), Ba(:,1), Ba(:, 2), L(:,1), L(:, 2), Br(:,1), Br(:, 2))
xlim([10 60])
ylim([0; Inf])
xlabel('Temperature')
ylabel('Growth rate')
title('Growth rate as a function of temperature')
legend('Salmonella enterica','Bacillus cereus','Listeria','Brochothrix thermosphacta')
Can anyone help me fix it so that when I have a matrix without, e.g., 2 in the third column, it will still plot correctly?
I do know how to filter it correctly, and apply that filtering to 'data', so the only problem is the error codes occurring when plotting.
The errors are:
Warning: Ignoring extra legend entries.
> In legend>set_children_and_strings (line 643)
In legend>make_legend (line 328)
In legend (line 254)
In dataPlot (line 82)
In Hovedscript (line 153)
This happens when running the function from a main script with the matrix sorted by growth rate (second column) and filtered so that only the third-column values 1, 3 and 4 are analyzed. The filtering is done in advance by another function, and the new 'data' looks like this:
59.2390 0.0584 1.0000
58.7110 0.1558 4.0000
14.7690 0.1796 4.0000
56.1370 0.2052 1.0000
17.3260 0.2304 4.0000
52.9660 0.2604 4.0000
10.9310 0.2747 3.0000
41.5570 0.3898 4.0000
51.9480 0.3932 1.0000
54.2200 0.4058 3.0000
15.8520 0.4592 3.0000
50.0190 0.4754 1.0000
51.7180 0.4927 3.0000
51.3290 0.5050 3.0000
50.7340 0.5237 3.0000
13.5340 0.5742 1.0000
45.7840 0.6254 1.0000
22.3110 0.6536 3.0000
46.0430 0.6611 3.0000
45.6900 0.6684 3.0000
44.7400 0.6884 3.0000
43.7390 0.7043 3.0000
26.2430 0.7305 3.0000
42.3400 0.7371 1.0000
26.3020 0.7379 3.0000
40.4810 0.7621 3.0000
29.5010 0.7789 3.0000
38.7370 0.7834 3.0000
31.9250 0.7948 3.0000
31.8910 0.7954 3.0000
34.8950 0.8050 3.0000
32.8210 0.8072 3.0000
39.1220 0.8102 1.0000
37.1830 0.8513 1.0000
24.2970 0.8615 1.0000
25.3870 0.8774 1.0000
Again, the bar plot works just fine, although it shows all 4 bacteria even when only 3 of them are used. The problem is in the line plot, where one line does not show.
Thank you for your time

A solution is to replace the following lines
plot(S(:,1), S(:, 2), Ba(:,1), Ba(:, 2), L(:,1), L(:, 2), Br(:,1), Br(:, 2))
legend('Salmonella enterica','Bacillus cereus','Listeria','Brochothrix thermosphacta')
by
hold on
plot(S(:,1), S(:, 2), 'DisplayName', 'Salmonella enterica');
plot(Ba(:,1), Ba(:, 2), 'DisplayName', 'Bacillus cereus');
plot(L(:,1), L(:, 2), 'DisplayName', 'Listeria');
plot(Br(:,1), Br(:, 2), 'DisplayName', 'Brochothrix thermosphacta');
legend SHOW;
In this way, each legend entry is explicitly assigned to a specific plot, which works even if some plots are empty.
Copy and Paste mistake
In the provided code L and Br end up identical because of a copy-and-paste mistake. You should change if Br(q, 3)~= 3 into if Br(q, 3)~= 4.
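The four near-identical filter loops could also be replaced by logical indexing inside a single loop over the bacteria types, which avoids this kind of copy-and-paste slip and skips types that are absent from the data. This is only a sketch: the small data matrix is made up for illustration, and the names variable is an assumption, not part of the original code.

```matlab
% Made-up sample: [temperature, growth rate, bacteria type]; type 2 is absent
data = [39.1 0.81 1; 13.5 0.57 1; 51.3 0.51 3; 29.5 0.78 3; 14.8 0.18 4; 41.6 0.39 4];
names = {'Salmonella enterica', 'Bacillus cereus', 'Listeria', 'Brochothrix thermosphacta'};

figure
hold on
for k = 1:numel(names)
    rows = data(:, 3) == k;              % logical index of the rows for type k
    if ~any(rows)
        continue                         % nothing to plot for this type
    end
    grp = sortrows(data(rows, 1:2), 1);  % sort by temperature
    plot(grp(:, 1), grp(:, 2), 'DisplayName', names{k});
end
hold off
xlabel('Temperature')
ylabel('Growth rate')
legend show
```

Because only the types that are present create a line, the legend lists exactly those types.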
Result
If I use your second input data (without 2 in the third column), I get the following (without any error message):

Related

Isolines/contour in matlab

I want to make a contour plot of three variables: x coordinate, y coordinate and speed.
Then I wish to depict the velocity directions on the same plot with quiver.
Code:
k=49;
data_k=data(:,1)==k&sp>sp_ths;%filter data
x=xcor(data_k);y=ycor(data_k);sp_k=sp(data_k);vx_k=vx(data_k);vy_k=vy(data_k);
if contour_plot
[Xq, Yq] = meshgrid(x,y);
Zq =griddata(x,y,sp_k,Xq,Yq);
contour(Xq,Yq,Zq,5)%,'ShowText','on');
end
hold on
quiver(x, y, vx_k*5, vy_k*5, 0, 'k');
Output:
The contour seems incorrect, but I can't understand why.
Data:
>> [x,y,vx_k,vy_k,sp_k]
ans =
57.3030 61.6410 0.8965 0.4430 2.0000
84.9540 -0.0559 0.9534 0.3017 2.0000
80.3200 7.7009 0.9009 0.4339 2.0000
76.6780 -35.6720 0.9391 -0.3437 2.0000
61.4120 54.7280 0.3449 0.9386 2.0000
70.9940 32.3250 0.7934 0.6088 2.0000
-77.8030 4.8428 -0.9998 -0.0178 2.0000
-39.4330 66.0040 -0.6452 0.7640 2.0000
-41.1680 -70.1010 -0.9055 -0.4244 2.0000
57.7840 -58.3810 0.9264 -0.3765 2.0000
-70.2350 -8.8322 -0.9975 0.0712 2.0000
-77.6940 -26.3810 -0.9676 -0.2525 2.0000
-49.7200 -48.1560 -0.5801 -0.8145 2.0000
-34.4620 -76.7420 -0.6990 -0.7151 2.0000
68.3490 21.6690 0.9678 -0.2516 2.0000
71.7360 -16.5990 0.9287 -0.3709 2.0000
17.9180 -66.0220 0.5107 -0.8598 2.0000
-57.2370 -55.8160 -0.9522 -0.3055 2.0000
86.0120 5.7037 0.9336 0.3583 2.0000
75.7290 16.6260 0.9946 0.1035 1.9114
-78.2140 4.6192 -0.9969 0.0783 2.0000
42.9320 -63.1170 0.5138 -0.8579 2.0000
-56.5820 39.2650 -0.2098 0.9777 2.0000
-18.2490 75.0340 -0.0854 0.9963 2.0000
75.4960 -28.2940 0.8437 -0.5367 2.0000
-17.6210 74.9380 -0.0340 0.9994 2.0000
-10.9350 -79.1950 -0.3356 -0.9420 2.0000
-16.2720 69.7160 0.2938 0.9559 2.0000
-70.9780 -37.1290 -0.9887 0.1496 2.0000
71.9370 -38.4470 0.8501 -0.5266 2.0000
73.3310 -7.0563 0.9994 0.0341 2.0000
83.7780 19.1370 0.8500 0.5268 2.0000
-8.1897 79.2620 0.0479 0.9989 2.0000
56.7250 62.4670 0.9049 0.4256 2.0000
56.6710 62.1070 0.8763 0.4818 2.0000
77.0110 9.7810 0.9787 -0.2053 2.0000
56.3630 62.7070 0.9476 0.3195 2.0000
84.0260 0.2988 0.9618 0.2737 2.0000
-68.5600 -42.1320 -0.9822 -0.1880 2.0000
55.5620 63.5370 0.6724 0.7402 2.0000
19.3120 -67.2460 0.1840 -0.9829 2.0000
-71.6530 28.4280 -0.9346 0.3558 2.0000
-35.6610 -75.9520 -0.1767 -0.9843 2.0000
33.1410 -75.3810 -0.1116 -0.9938 2.0000
55.1580 56.8510 0.5865 0.8099 2.0000
34.6410 -75.9710 0.3960 -0.9183 2.0000
57.9810 -58.1830 0.8255 -0.5645 2.0000
62.0610 -56.7770 0.8238 -0.5669 2.0000
46.5930 -68.1200 0.9947 -0.1029 2.0000
38.4250 -74.4980 0.3652 -0.9309 2.0000
-46.0560 -67.3300 -0.3544 -0.9351 2.0000
75.8290 18.4470 0.9997 0.0244 1.9114
72.4200 31.8080 0.9841 0.1777 2.0000
61.8330 -53.1870 0.9163 -0.4005 2.0000
-62.1240 -25.1080 -0.6117 -0.7911 2.0000
57.0410 62.0730 0.8391 0.5440 2.0000
73.0400 -2.7887 0.9313 0.3643 2.0000
39.0000 -73.7550 0.3970 -0.9178 2.0000
81.8430 -20.8660 0.9697 -0.2443 2.0000
-77.8410 4.7747 -0.9584 0.2853 2.0000
I hope I have the answer for you here: Drawing 3D contour from 3D data
To start, contour and contour3 plots represent scalar fields, not vector fields. For a vector field you can use quiver and quiver3 plots.
Note that for contours you need a vector x of N x-coordinates, a vector y of M y-coordinates and one M-by-N matrix z of data, where z(jj,ii) corresponds to x(ii) and y(jj). For quiver you need vectors of coordinates and vectors of directions, all of the same length, where x(ii), y(ii), z(ii), u(ii), v(ii) and w(ii) represent one "arrow" in the plot.
It would be helpful to provide an image of how it should look.
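One likely contributor to the odd contour in the question is that meshgrid(x,y) is called on the scattered, unsorted sample coordinates, so the grid itself is irregular. A common approach is to interpolate onto a regular grid first. The sketch below uses made-up scattered data as a stand-in for x, y and sp_k; the names xg and yg and the 50-point resolution are arbitrary choices for illustration.

```matlab
% Made-up scattered samples of a scalar field (stand-in for x, y, sp_k)
rng(0);
x = 100 * rand(60, 1) - 50;
y = 100 * rand(60, 1) - 50;
sp = sqrt(x.^2 + y.^2);

% Regular, monotonic grid spanning the data (50 points is an arbitrary choice)
xg = linspace(min(x), max(x), 50);
yg = linspace(min(y), max(y), 50);
[Xq, Yq] = meshgrid(xg, yg);

% Interpolate the scattered values onto the grid;
% with the default linear method, points outside the convex hull become NaN
Zq = griddata(x, y, sp, Xq, Yq);

contour(Xq, Yq, Zq, 5)
hold on
plot(x, y, 'k.')   % mark the sample locations
hold off
```

Note that if the speed is nearly constant, as in the data posted above, there will be few contour levels to draw regardless of the grid.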

Adding data in intervals in Matlab

Hi I have data in MATLAB like this:
F =
1.0000 1.0000
2.0000 1.0000
3.0000 1.0000
3.1416 9.0000
4.0000 1.0000
5.0000 1.0000
6.0000 1.0000
6.2832 9.0000
7.0000 1.0000
8.0000 1.0000
9.0000 1.0000
9.4248 9.0000
10.0000 1.0000
I am looking for a way to sum the data in specific intervals. For example, if I want my sampling interval to be 1, then the end result should be:
F =
1.0000 1.0000
2.0000 1.0000
3.0000 10.0000
4.0000 1.0000
5.0000 1.0000
6.0000 10.0000
7.0000 1.0000
8.0000 1.0000
9.0000 10.0000
10.0000 1.0000
i.e. the data in the second column is accumulated based on sampling the first column. Is there a function in MATLAB to do this?
Yes, by combining histc() and accumarray():
F =[1.0000 1.0000;...
2.0000 1.0000;...
3.0000 1.0000;...
3.1416 9.0000;...
4.0000 1.0000;...
5.0000 1.0000;...
6.0000 1.0000;...
6.2832 9.0000;...
7.0000 1.0000;...
8.0000 1.0000;...
9.0000 1.0000;...
9.4248 9.0000;...
10.0000 1.0000];
range=1:0.5:10;
[~,bin]=histc(F(:,1),range);
% fix the output size so that trailing empty bins are kept
result= [range.' accumarray(bin,F(:,2),[numel(range) 1])]
If you run the code, keep in mind that I changed the sampling interval (range) to 0.5.
This code works for any sampling interval; just define your wanted interval as range.
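In newer MATLAB releases histc is documented as not recommended. A similar result can be sketched with discretize, assuming a release that provides it (R2015a or later, to the best of my knowledge). The names s, edges and sums are illustrative and not part of the answer above.

```matlab
F = [1 1; 2 1; 3 1; 3.1416 9; 4 1; 5 1; 6 1; 6.2832 9; 7 1; 8 1; 9 1; 9.4248 9; 10 1];

s = 1;                                        % sampling interval
edges = min(F(:,1)) : s : max(F(:,1)) + s;    % bin k covers [edges(k), edges(k+1));
                                              % the last bin also includes its right edge
bin = discretize(F(:,1), edges);              % bin index for every row of F

% Sum the second column per bin; the fixed size keeps empty bins as 0
sums = accumarray(bin, F(:,2), [numel(edges)-1, 1]);
result = [edges(1:end-1).' sums]
```

Here every value of F(:,1) lies inside the edges, so discretize returns no NaN; data outside the edges would need to be filtered before calling accumarray.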
Yes and that's a job for accumarray:
Use the values in column 1 of F to sum (default behavior of accumarray) the elements in the 2nd column.
For a given interval of size s (Thanks to Luis Mendo for that):
S = accumarray(round(F(:,1)/s),F(:,2),[]); %// or you can use "floor" instead of "round".
S =
1
1
10
1
1
10
1
1
10
1
So constructing the output by concatenation:
NewF = [unique(round(F(:,1)/s)) S]
NewF =
1 1
2 1
3 10
4 1
5 1
6 10
7 1
8 1
9 10
10 1
Yay!!

implementing "not equal to " loop in matlab

I have a simple problem. I am quite new to MATLAB, so I am having trouble implementing it. I have two 64x2 matrices, u and h. I have to check whether a single row in u is not equal to any of the rows in h; each such row should then be saved in a separate matrix. So far I have written the code below, but when it runs r(i,:) gets all the values of u(i,:). What I want is that only those rows of u that do not match any row in h are stored in r.
h=[];
for j=1:8
for i=1:8
h=[h; i j];
end
end
u=[5.3,1.4;6,8;2,3;3,5.5;2.6,8;3.7,2;4,2;5,3;1.9,8;5.4,4;3.2,3;2,2;2,4;2,3;8,2.2;8,4;7.3,1.5;6.2,5.1;2.4,1.5;3,5;2,7.1;1.8,2.7;3,4;6,5;6,1;5,4;4,6;3.5,2;5,7;7.2,8;7,7;5,5;6,3;6,6;1,2;5,8;3,5;1,5;2,2;2,1;6,3;4,7;6,8;3,6;1,6;5,2;3,5;8,7;8,4;4,8;1,1;6,3;7,5;8,1;1,6;4,5;5,5;6,7;6,7;6,7;6,3;3,4;5,7;1,1]
for i=1
for j=1:64
if u(i,:)==h(j,:)
c=1
else
c=0
if c==0
r(i,:)=u(i,:)
end
end
end
end
Can anyone help me, please?
You can do it in one line with ismember:
r = u(~ismember(u,h,'rows'),:);
With your example data, the result is
>> r
r =
5.3000 1.4000
3.0000 5.5000
2.6000 8.0000
3.7000 2.0000
1.9000 8.0000
5.4000 4.0000
3.2000 3.0000
8.0000 2.2000
7.3000 1.5000
6.2000 5.1000
2.4000 1.5000
2.0000 7.1000
1.8000 2.7000
3.5000 2.0000
7.2000 8.0000
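For comparison with the loop in the question, a corrected loop version might look like the following sketch. The key change is that a row of u is stored only after it has been compared with every row of h, instead of being stored on the first mismatch. The small h and u here are made up for illustration.

```matlab
% Small made-up inputs: h holds integer pairs, u mixes matching and non-matching rows
[hi, hj] = ndgrid(1:3, 1:3);
h = [hi(:) hj(:)];
u = [1 2; 2.5 1; 3 3; 1 1.5];

r = zeros(0, 2);                       % rows of u that are not found in h
for i = 1:size(u, 1)
    found = false;
    for j = 1:size(h, 1)
        if isequal(u(i, :), h(j, :))   % compare the whole row at once
            found = true;
            break
        end
    end
    if ~found
        r(end+1, :) = u(i, :);         % append only after checking every row of h
    end
end
disp(r)
```

This is mainly useful for understanding what went wrong; the ismember one-liner does the same job and is shorter.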
Use setdiff with the 'rows' option to compute r (note that setdiff returns the rows sorted and without repetitions). Please avoid unnecessary loops, and pre-allocate when possible.
% construct h without loop
[h{1} h{2}]=ndgrid(1:8,1:8);
h=[h{1}(:) h{2}(:)];
% get r using setdiff
r = setdiff( u, h, 'rows')
Results with
r =
1.8000 2.7000
1.9000 8.0000
2.0000 7.1000
2.4000 1.5000
2.6000 8.0000
3.0000 5.5000
3.2000 3.0000
3.5000 2.0000
3.7000 2.0000
5.3000 1.4000
5.4000 4.0000
6.2000 5.1000
7.2000 8.0000
7.3000 1.5000
8.0000 2.2000
Solution of your question in N log N complexity (N=64):
N=size(h,1);
[husorted,origin_husorted,destination_hu]=unique([h;u],'rows','first');
% rows of u that map to the same sorted entry as some row of h are duplicates
iduplicates=ismember(destination_hu(N+1:end),destination_hu(1:N));
r=u;
r(iduplicates,:)=0;
destination_hu is the only output of unique that is useful; it satisfies [h;u] = husorted(destination_hu,:). Identical rows map to the same entry of husorted, so if row i of u is equal to row j of h, then destination_hu(i+N) is equal to destination_hu(j).
Solution for your particular h, with complexity N:
r=u;
r(all(u==round(u)&u>=1&u<=8,2),:)=0;

Efficiently finding a value from one vector in another matlab: error checking if empty

I have 3 vectors: npdf, tn(:,1) and tn(:,2) and am finding the values of npdf in tn(:,2) line by line:
[npdf(1:20,1), tn(1:20,:)]
ans =
8.0000 3.0000 1.0000
11.0000 2.9167 1.0000
1.0000 3.3000 1.0000
11.0000 1.2167 1.0000
5.0000 2.8167 1.0000
1.0000 2.4000 1.0000
2.0000 2.4500 1.0000
4.0000 0.2500 1.0000
15.0000 3.7500 1.0000
15.0000 4.9167 1.0000
1.0000 2.8167 2.0000
17.0000 0.2500 2.0000
15.0000 1.0000 3.0000
4.0000 3.0000 3.0000
8.0000 0.5833 3.0000
1.0000 0.5833 3.0000
3.0000 5.0000 5.0000
11.0000 3.7500 6.0000
8.0000 3.0000 7.0000
15.0000 2.8000 7.0000
for i=1:length(npdf)
[LOCA,~]=ismember(tn(:,2),npdf(i,1,1));
dummy=find(LOCA~=0);
tpdf(i,1)=tn(randi(length(dummy),1,1),1);
end
Each time it finds the value of npdf in tn(:,2), it chooses a value from tn(:,1).
Here's the problem: if it can't locate the value from npdf in tn(:,2), then I need to choose the nearest value (in magnitude) in tn(:,2) and proceed. Either that or some sort of interpolation between the nearest values. How would you do this most efficiently?
Feel free to change the code; it doesn't look very efficient to me.
It can be done easily by using knnsearch as follows:
[idx,D]=knnsearch(tn(:,2),npdf,'K',size(tn,1));
for i=1:size(D,1)
% D(i,:) is sorted ascending, so the first k columns of idx are the nearest rows of tn
k=sum(D(i,:)==D(i,1));
tpdf(i,1)=tn(idx(i,randi(k)),1);
end
It finds the distance of each value in npdf to all the values in tn(:,2). Then it considers only the nearest values (including ties) and selects one of the corresponding entries of tn(:,1) at random.
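If the Statistics and Machine Learning Toolbox (which provides knnsearch) is not available, the same idea can be sketched with plain absolute differences. The inputs below are made up, and the names d, candidates and pick are illustrative. An exact match has distance 0, so it is handled by the same code path as the nearest-value fallback.

```matlab
% Made-up inputs: tn(:,1) holds the values to draw from, tn(:,2) the keys to match
tn = [3.0 1; 2.9 1; 3.3 2; 1.2 3; 2.8 5];
npdf = [1; 4; 5];

tpdf = zeros(numel(npdf), 1);
for i = 1:numel(npdf)
    d = abs(tn(:, 2) - npdf(i));                     % distance to every key
    candidates = find(d == min(d));                  % all rows tied for nearest
    pick = candidates(randi(numel(candidates)));     % choose one of them at random
    tpdf(i) = tn(pick, 1);
end
disp(tpdf)
```

For the key 4, which does not occur in tn(:,2), the keys 3 and 5 are equally near, so either of their rows may be chosen.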

Indexing during assignment

Say I have this sample data
A =
1.0000 6.0000 180.0000 12.0000
1.0000 5.9200 190.0000 11.0000
1.0000 5.5800 170.0000 12.0000
1.0000 5.9200 165.0000 10.0000
2.0000 5.0000 100.0000 6.0000
2.0000 5.5000 150.0000 8.0000
2.0000 5.4200 130.0000 7.0000
2.0000 5.7500 150.0000 9.0000
I wish to calculate the variance of each column, grouped by class (the first column).
I have this working with the following code, but it uses hard coded indices, requiring knowledge of the number of samples per class and they must be in specific order.
Is there a better way to do this?
variances = zeros(2,4);
variances = [1.0 var(A(1:4,2)), var(A(1:4,3)), var(A(1:4,4));
2.0 var(A(5:8,2)), var(A(5:8,3)), var(A(5:8,4))];
disp(variances);
1.0 3.5033e-02 1.2292e+02 9.1667e-01
2.0 9.7225e-02 5.5833e+02 1.6667e+00
Separate the class labels and the data into different variables.
cls = A(:, 1);
data = A(:, 2:end);
Get the list of class labels
labels = unique(cls);
Compute the variances
variances = zeros(length(labels), 3);
for i = 1:length(labels)
variances(i, :) = var(data(cls == labels(i), :)); % note the use of logical indexing
end
I've done a fair bit of this type of stuff over the years, but to be able to judge, better vs. best, it would help to know what you expect to change in the data set or structure.
Otherwise, if no change is anticipated and the hard code works, stick with it.
Easy, peasy. Use consolidator. It is on the file exchange.
A = [1.0000 6.0000 180.0000 12.0000
1.0000 5.9200 190.0000 11.0000
1.0000 5.5800 170.0000 12.0000
1.0000 5.9200 165.0000 10.0000
2.0000 5.0000 100.0000 6.0000
2.0000 5.5000 150.0000 8.0000
2.0000 5.4200 130.0000 7.0000
2.0000 5.7500 150.0000 9.0000];
[C1,var234] = consolidator(A(:,1),A(:,2:4),@var)
C1 =
1
2
var234 =
0.035033 122.92 0.91667
0.097225 558.33 1.6667
We can test the variances produced, since we know the grouping.
var(A(1:4,2:4))
ans =
0.035033 122.92 0.91667
var(A(5:8,2:4))
ans =
0.097225 558.33 1.6667
It is efficient too.
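Without a File Exchange download, a similar grouping can be sketched with unique and accumarray from base MATLAB. The variable names g, nCols and result are illustrative. The third output of unique gives a class index per row, so the rows do not need to be in any particular order.

```matlab
A = [1 6.00 180 12
     1 5.92 190 11
     1 5.58 170 12
     1 5.92 165 10
     2 5.00 100  6
     2 5.50 150  8
     2 5.42 130  7
     2 5.75 150  9];

[labels, ~, g] = unique(A(:, 1));       % g maps each row to its class index
nCols = size(A, 2) - 1;
variances = zeros(numel(labels), nCols);
for c = 1:nCols
    % apply var to the values of column c+1 within each class
    variances(:, c) = accumarray(g, A(:, c + 1), [], @var);
end
result = [labels variances]
```

The loop runs over columns rather than classes, so it stays short even when there are many classes.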