MATLAB code for Hamacher sum - matlab

The Hamacher sum is: $h(x, y) = \frac{x + y - 2xy}{1 - xy}$
I wrote the following MATLAB code for the above function:
function f=hamachersum(x,y)
    f = zeros(numel(x),1);
    for j=1:numel(x)
        if x(j)==1 && y(j)==1
            f(j,1)=1;
        else
            f(j,1)=(x(j)+y(j)-2*(x(j)*y(j)))/(1-(x(j)*y(j)));
        end
    end
end
Then I want to test t3=hamachersum(t1,t2)
My input values t1, t2 are
t1
t1 =
1.0000
0
1.0000
1.0000
1.0000
1.0000
NaN
0.8167
1.0000
1.0000
1.0000
0.4667
NaN
1.0000
1.0000
1.0000
NaN
NaN
1.0000
1.0000
1.0000
NaN
0.0250
1.0000
t2
t2 =
1.0000
0.5524
1.0000
1.0000
1.0000
1.0000
NaN
0
1.0000
1.0000
1.0000
1.0000
NaN
1.0000
1.0000
1.0000
NaN
NaN
0.6032
1.0000
1.0000
NaN
0.9973
0.7260
The result is
t3 =
1.2000
0.5524
1.0000
1.0000
1.0000
1.0000
NaN
0.8167
2.0000
2.0000
1.0000
1.0000
NaN
0.6667
1.0000
1.0769
NaN
NaN
1.0000
1.0000
1.0000
NaN
0.9973
1.0000
Why do I get values above 1? Since this is a fuzzy operator, it cannot return values above 1.
Is there something wrong in my code?

I am probably answering this against my better judgement.
Caveat: I am not familiar with the Hamacher sum so my approach to answering this is strictly based on the equation in your question.
Is there something wrong in my code?
Your code produces 1.0 when I run it on MATLAB R2014a for inputs of 1.0 and 1.0; I presume that is correct, since you have an explicit condition for it. I can't reproduce the results shown in your question.
However, I felt compelled to provide a more efficient implementation of the equation:
function h = hamachersum(mu_a, mu_b)
    h = (mu_a + mu_b - (2 .* mu_a .* mu_b)) ./ (1 - mu_a .* mu_b);
    % h(isnan(h)) = 1.0; % Included this line to show you how to remove NaN
end
Note: I've included % h(isnan(h)) = 1.0; to show you how to handle cases when mu_a and mu_b are both 1.0 as you have explicitly handled this in your question (rather poorly might I add).
Comparing floating-point numbers for exact equality is not reliable, even in MATLAB, and could be part of the reason for the results you are seeing. A better way to check whether a floating-point number equals a given value is to compare within a tolerance:
if abs(x - 1.0) < 1e-15
    fprintf(1, 'x == 1.0\n');
else
    fprintf(1, 'x ~= 1.0\n');
end
If x is equal to 1.0 to within the tolerance, this condition is true; otherwise it is false.
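To tie the two points together, here is a minimal sketch of a vectorized version that treats a denominator close to zero as the "both inputs are 1" case. The tolerance tol and the sample inputs are my own assumptions, not values from the question, and whether 1 is the right result for near-singular inputs depends on how you want the operator defined at that point.

```matlab
% Sketch only: tol and the sample inputs below are assumed values.
tol  = 1e-12;
mu_a = [1; 0;      0.8167; 1 - 1e-16; NaN];
mu_b = [1; 0.5524; 0;      1 - 1e-16; NaN];

% Element-wise Hamacher sum; rows where mu_a.*mu_b is 1 give 0/0 = NaN here.
h = (mu_a + mu_b - 2 .* mu_a .* mu_b) ./ (1 - mu_a .* mu_b);

% Rows whose denominator is within tol of zero are treated as the limit case.
% NaN inputs fail this comparison, so they stay NaN.
nearOne = abs(1 - mu_a .* mu_b) < tol;
h(nearOne) = 1;

disp(h);
```

The fourth row is an input that prints as 1.0000 but is not exactly 1, which is the kind of value an exact == 1 test would miss.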

Related

How to test a correlation?

How can I test whether there is a correlation between two parameters? I have n-dimensional parameters a and b in MATLAB and would like to find the correlation coefficient. Is there a simple function for it? Thank you.
I tried:
C = [a b];
cov(C)
I got
0.0001 0.0029
0.0029 1.0668
And
corrcoef(C)
with the result
1.0000 0.3682
0.3682 1.0000
Should I look at the diagonal elements? They are 1, so does that mean there is a correlation?
Is that right?
1
1.0000 -0.3143
-0.3143 1.0000
2
1.0000 0.4248
0.4248 1.0000
3
1.0000 -0.3290
-0.3290 1.0000
4
1.0000 -0.1397
-0.1397 1.0000
5
1.0000 -0.1506
-0.1506 1.0000
6
1.0000 0.3682
0.3682 1.0000
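A short sketch of how the corrcoef output can be read, using made-up vectors a and b (hypothetical data, not the data from the question). The diagonal is always 1 because each variable is perfectly correlated with itself; the correlation between a and b is the off-diagonal element. The second output of corrcoef holds p-values for the hypothesis of no correlation.

```matlab
% Hypothetical data: b is roughly 2*a with a little noise added by hand.
a = (1:10)';
b = [2.1; 3.9; 6.2; 8.1; 9.8; 12.2; 13.9; 16.1; 18.0; 20.1];

[R, P] = corrcoef(a, b);   % R: correlation matrix, P: matrix of p-values

r = R(1, 2);               % correlation coefficient between a and b
p = P(1, 2);               % p-value for the test of zero correlation

fprintf('r = %.4f, p = %.4g\n', r, p);
```

A small p-value (a common but arbitrary threshold is 0.05) suggests the correlation is unlikely to be due to chance alone.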

Matlab - Plot from double data

From an N x 3 matrix whose third column contains only the four values 1, 2, 3 and 4, I have to create a bar plot as well as a line plot showing the growth rate for each row (second-column values). The first column is called 'Temperature', the second 'Growth rate' and the third 'Bacteria type'.
As of right now, my line plot does not work when the rows with one of the four values in the third column are removed. The matrix could look something like this:
39.1220 0.8102 1.0000
13.5340 0.5742 1.0000
56.1370 0.2052 1.0000
50.0190 0.4754 1.0000
24.2970 0.8615 1.0000
37.1830 0.8513 1.0000
59.2390 0.0584 1.0000
45.7840 0.6254 1.0000
51.9480 0.3932 1.0000
42.3400 0.7371 1.0000
25.3870 0.8774 1.0000
57.1870 0.3880 2.0000
37.4580 0.7095 2.0000
46.4190 0.6431 2.0000
38.8380 0.7034 2.0000
11.2930 0.1214 2.0000
32.3270 0.6708 2.0000
42.3150 0.6908 2.0000
36.0600 0.7049 2.0000
28.6160 0.6248 2.0000
56.8570 0.3940 2.0000
51.4770 0.5410 2.0000
52.4540 0.5127 2.0000
28.6270 0.6248 2.0000
39.6590 0.7021 2.0000
53.6280 0.4829 2.0000
56.6750 0.4029 2.0000
43.4230 0.6805 2.0000
20.3390 0.4276 2.0000
42.6930 0.6826 2.0000
13.6030 0.2060 2.0000
30.3360 0.6497 2.0000
43.3470 0.6749 2.0000
56.6860 0.3977 2.0000
50.5480 0.5591 2.0000
34.2270 0.6929 2.0000
47.8370 0.6136 2.0000
30.8520 0.6593 2.0000
51.3290 0.5050 3.0000
29.5010 0.7789 3.0000
34.8950 0.8050 3.0000
44.7400 0.6884 3.0000
51.7180 0.4927 3.0000
40.4810 0.7621 3.0000
38.7370 0.7834 3.0000
26.3020 0.7379 3.0000
32.8210 0.8072 3.0000
45.6900 0.6684 3.0000
54.2200 0.4058 3.0000
46.0430 0.6611 3.0000
10.9310 0.2747 3.0000
43.7390 0.7043 3.0000
31.9250 0.7948 3.0000
31.8910 0.7954 3.0000
15.8520 0.4592 3.0000
50.7340 0.5237 3.0000
26.2430 0.7305 3.0000
22.3110 0.6536 3.0000
14.7690 0.1796 4.0000
17.3260 0.2304 4.0000
41.5570 0.3898 4.0000
52.9660 0.2604 4.0000
58.7110 0.1558 4.0000
And my code is as follows, with data being the matrix (double):
function dataPlot(data)
%Nx1 vector with the bacteria type of each row
A=data(:,3);
%The number of observations of each bacteria type is counted and gathered in a vector
barData = [sum(A(:) == 1), sum(A(:) == 2), sum(A(:) == 3), sum(A(:) == 4)];
figure
bar(barData);
label = {'Salmonella enterica'; 'Bacillus cereus'; 'Listeria'; 'Brochothrix thermosphacta'};
set(gca,'xtick',[1:4],'xticklabel',label)
set(gca,'XTickLabelRotation',45)
ylabel('Observations')
title('Distribution of bacteria')
%The data is divided into four matrices based on the four different bacteria
%Salmonella matrix
S=data;
deleterow = false(size(S, 1), 1);
for n = 1:size(S, 1)
    %For column condition
    if S(n, 3)~= 1
        %Mark line for deletion afterwards
        deleterow(n) = true;
    end
end
S(deleterow,:) = [];
S=S(:,1:2);
S=sortrows(S,1);
%Bacillus cereus
Ba=data;
deleterow = false(size(Ba, 1), 1);
for p = 1:size(Ba, 1)
    %For column condition
    if Ba(p, 3)~= 2
        %Mark line for deletion afterwards
        deleterow(p) = true;
    end
end
Ba(deleterow,:) = [];
Ba=Ba(:,1:2);
Ba=sortrows(Ba,1);
%Listeria
L=data;
deleterow = false(size(L, 1), 1);
for v = 1:size(L, 1)
    %For column condition
    if L(v, 3)~= 3
        %Mark line for deletion afterwards
        deleterow(v) = true;
    end
end
L(deleterow,:) = [];
L=L(:,1:2);
L=sortrows(L,1);
%Brochothrix thermosphacta
Br=data;
deleterow = false(size(Br, 1), 1);
for q = 1:size(Br, 1)
    %For column condition
    if Br(q, 3)~= 3
        %Mark line for deletion afterwards
        deleterow(q) = true;
    end
end
Br(deleterow,:) = [];
Br=Br(:,1:2);
Br=sortrows(Br,1);
%The data is plotted (growth rate against temperature)
figure
plot(S(:,1), S(:, 2), Ba(:,1), Ba(:, 2), L(:,1), L(:, 2), Br(:,1), Br(:, 2))
xlim([10 60])
ylim([0; Inf])
xlabel('Temperature')
ylabel('Growth rate')
title('Growth rate as a function of temperature')
legend('Salmonella enterica','Bacillus cereus','Listeria','Brochothrix thermosphacta')
Can anyone help me fix it so that when I have a matrix without, e.g., 2 in the third column, it still plots correctly?
I do know how to filter it correctly and how to apply that filtering to 'data', so the only problem is the warning that occurs when plotting.
The warning is:
Warning: Ignoring extra legend entries.
> In legend>set_children_and_strings (line 643)
In legend>make_legend (line 328)
In legend (line 254)
In dataPlot (line 82)
In Hovedscript (line 153)
This happens when running the function from a main script with the matrix sorted by growth rate (second column) and filtered so that only the third-column values 1, 3 and 4 are analyzed. The filtering is done in advance by another function, and the new 'data' looks like this:
59.2390 0.0584 1.0000
58.7110 0.1558 4.0000
14.7690 0.1796 4.0000
56.1370 0.2052 1.0000
17.3260 0.2304 4.0000
52.9660 0.2604 4.0000
10.9310 0.2747 3.0000
41.5570 0.3898 4.0000
51.9480 0.3932 1.0000
54.2200 0.4058 3.0000
15.8520 0.4592 3.0000
50.0190 0.4754 1.0000
51.7180 0.4927 3.0000
51.3290 0.5050 3.0000
50.7340 0.5237 3.0000
13.5340 0.5742 1.0000
45.7840 0.6254 1.0000
22.3110 0.6536 3.0000
46.0430 0.6611 3.0000
45.6900 0.6684 3.0000
44.7400 0.6884 3.0000
43.7390 0.7043 3.0000
26.2430 0.7305 3.0000
42.3400 0.7371 1.0000
26.3020 0.7379 3.0000
40.4810 0.7621 3.0000
29.5010 0.7789 3.0000
38.7370 0.7834 3.0000
31.9250 0.7948 3.0000
31.8910 0.7954 3.0000
34.8950 0.8050 3.0000
32.8210 0.8072 3.0000
39.1220 0.8102 1.0000
37.1830 0.8513 1.0000
24.2970 0.8615 1.0000
25.3870 0.8774 1.0000
Again, the bar plot works just fine, but does show all 4 bacteria even when only 3 of them are used, and the problem is in the line plot, with one line not showing in the plot.
Thank you for your time
A solution is to replace the following lines
plot(S(:,1), S(:, 2), Ba(:,1), Ba(:, 2), L(:,1), L(:, 2), Br(:,1), Br(:, 2))
legend('Salmonella enterica','Bacillus cereus','Listeria','Brochothrix thermosphacta')
by
hold on
plot(S(:,1), S(:, 2), 'DisplayName', 'Salmonella enterica');
plot(Ba(:,1), Ba(:, 2), 'DisplayName', 'Bacillus cereus');
plot(L(:,1), L(:, 2), 'DisplayName', 'Listeria');
plot(Br(:,1), Br(:, 2), 'DisplayName', 'Brochothrix thermosphacta');
legend show;
In this way, the legend entries are explicitly assigned to a specific plot, which works even if some plots are empty.
Copy-and-paste mistake
In the provided code, Br ends up identical to L because of a copy-and-paste mistake. You should change if Br(q, 3)~= 3 into if Br(q, 3)~= 4.
Result
If I use your second input data (without 2 in the third column), the plot is produced without any error message.
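As a side note, the four copy-pasted filter loops could be replaced by logical indexing inside a single loop, which also removes the chance of the ~= 3 slip above. This is only a sketch: the small data matrix is a hand-picked subset of the rows from the question with type 2 left out, and the variable names groups, counts and present are my own.

```matlab
% Sketch with a small subset of the question's rows (no type-2 rows).
data = [39.122 0.8102 1;
        13.534 0.5742 1;
        51.329 0.5050 3;
        29.501 0.7789 3;
        14.769 0.1796 4];
names = {'Salmonella enterica', 'Bacillus cereus', ...
         'Listeria', 'Brochothrix thermosphacta'};

sorted = sortrows(data, 1);            % sort once by temperature
groups = cell(1, 4);
for k = 1:4
    rows = sorted(:, 3) == k;          % logical mask for bacteria type k
    groups{k} = sorted(rows, 1:2);     % [temperature, growth rate], may be empty
end

counts  = cellfun(@(g) size(g, 1), groups);   % same numbers as barData
present = names(counts > 0);                  % types that actually occur
disp(counts);
disp(present);
```

Each non-empty groups{k} can then be passed to plot with 'DisplayName', names{k}, as in the answer above, so the legend only lists the types that were drawn.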

Adding data in intervals in Matlab

Hi I have data in MATLAB like this:
F =
1.0000 1.0000
2.0000 1.0000
3.0000 1.0000
3.1416 9.0000
4.0000 1.0000
5.0000 1.0000
6.0000 1.0000
6.2832 9.0000
7.0000 1.0000
8.0000 1.0000
9.0000 1.0000
9.4248 9.0000
10.0000 1.0000
I am looking for a way to sum the data in specific intervals. For example, if I want my sampling interval to be 1, then the end result should be:
F =
1.0000 1.0000
2.0000 1.0000
3.0000 10.0000
4.0000 1.0000
5.0000 1.0000
6.0000 10.0000
7.0000 1.0000
8.0000 1.0000
9.0000 10.0000
10.0000 1.0000
i.e. the data in the second column is accumulated based on sampling the first column. Is there a function in MATLAB to do this?
Yes by combining histc() and accumarray():
F =[1.0000 1.0000;...
2.0000 1.0000;...
3.0000 1.0000;...
3.1416 9.0000;...
4.0000 1.0000;...
5.0000 1.0000;...
6.0000 1.0000;...
6.2832 9.0000;...
7.0000 1.0000;...
8.0000 1.0000;...
9.0000 1.0000;...
9.4248 9.0000;...
10.0000 1.0000];
range=1:0.5:10;
[~,bin]=histc(F(:,1),range);
result= [range.' accumarray(bin,F(:,2),[numel(range) 1])]
If you run the code, keep in mind that I changed the sampling interval (range) to 0.5.
This code works for any sampling interval; just define the interval you want as range.
Yes and that's a job for accumarray:
Use the values in column 1 of F to sum (default behavior of accumarray) the elements in the 2nd column.
For a given interval of size s (Thanks to Luis Mendo for that):
S = accumarray(round(F(:,1)/s),F(:,2),[]); %// or you can use "floor" instead of "round".
S =
1
1
10
1
1
10
1
1
10
1
So constructing the output by concatenation:
NewF = [unique(round(F(:,1)/s)) S]
NewF =
1 1
2 1
3 10
4 1
5 1
6 10
7 1
8 1
9 10
10 1
Yay!!
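One caveat with building the first column from unique(...) is that it assumes every bin contains at least one sample. A minimal sketch that avoids this assumption is below; it uses floor rather than round, and the names idx, sums and starts are my own. Empty bins, if any, show up with a sum of 0.

```matlab
F = [1 1; 2 1; 3 1; 3.1416 9; 4 1; 5 1; 6 1; 6.2832 9; ...
     7 1; 8 1; 9 1; 9.4248 9; 10 1];
s = 1;                                   % sampling interval (assumed value)

idx  = floor(F(:, 1) / s);               % bin number of every row
idx  = idx - min(idx) + 1;               % shift so the first occupied bin is 1
sums = accumarray(idx, F(:, 2));         % sum of column 2 per bin, 0 if empty

% Left edge of each bin, including bins that received no samples.
starts = (floor(min(F(:, 1)) / s) + (0:numel(sums) - 1)') * s;
NewF = [starts sums]
```

For the data in the question this reproduces the 10-row result shown above.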

matlab: apply an operand on an array by a condition

I have an array like this:
>> a = [2,34,5,6,7,0,1,10]
Now I want to invert each element of this array, i.e. take its reciprocal.
By using 1 ./ a the result is:
ans =
0.5000 0.0294 0.2000 0.1667 0.1429 Inf 1.0000 0.1000
The Inf is not good for me, the answer should be
ans =
0.5000 0.0294 0.2000 0.1667 0.1429 0 1.0000 0.1000
I want to apply this on elements that are not zero!
How can I do that?
You could also reset the Inf value to zero afterwards:
>> b=1./a
b =
0.5000 0.0294 0.2000 0.1667 0.1429 Inf 1.0000 0.1000
>> b(isinf(b)) = 0
b =
0.5000 0.0294 0.2000 0.1667 0.1429 0 1.0000 0.1000
You can do it conditionally:
nz = a ~= 0; %// select using logical indexing
a(nz) = 1./a(nz);
A slightly more general approach than m.s.'s is to check for finite elements in the output using isfinite:
b = 1./a;
b( ~isfinite(b) ) = 0;
isfinite covers both Inf values and NaN values, so if the element-wise function you are applying might generate both types of non-finite values, isfinite handles them simultaneously for you.
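A small illustration of that last point, with made-up inputs chosen so that the division produces NaN, Inf and -Inf in one go:

```matlab
num = [1 0 4 -3];        % hypothetical numerators
den = [2 0 0  0];        % hypothetical denominators

q = num ./ den;          % 1/2 is finite, 0/0 is NaN, 4/0 is Inf, -3/0 is -Inf
q(~isfinite(q)) = 0;     % one mask replaces NaN, Inf and -Inf together
disp(q);
```

With isinf alone, the NaN from 0/0 would be left in place.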

linear regression with multiple variables in matlab, formula and code do not match

I have the following datasets:
X
X =
1.0000 0.1300 -0.2237
1.0000 -0.5042 -0.2237
1.0000 0.5025 -0.2237
1.0000 -0.7357 -1.5378
1.0000 1.2575 1.0904
1.0000 -0.0197 1.0904
1.0000 -0.5872 -0.2237
1.0000 -0.7219 -0.2237
1.0000 -0.7810 -0.2237
1.0000 -0.6376 -0.2237
1.0000 -0.0764 1.0904
1.0000 -0.0009 -0.2237
1.0000 -0.1393 -0.2237
1.0000 3.1173 2.4045
1.0000 -0.9220 -0.2237
1.0000 0.3766 1.0904
1.0000 -0.8565 -1.5378
1.0000 -0.9622 -0.2237
1.0000 0.7655 1.0904
1.0000 1.2965 1.0904
1.0000 -0.2940 -0.2237
1.0000 -0.1418 -1.5378
1.0000 -0.4992 -0.2237
1.0000 -0.0487 1.0904
1.0000 2.3774 -0.2237
1.0000 -1.1334 -0.2237
1.0000 -0.6829 -0.2237
1.0000 0.6610 -0.2237
1.0000 0.2508 -0.2237
1.0000 0.8007 -0.2237
1.0000 -0.2034 -1.5378
1.0000 -1.2592 -2.8519
1.0000 0.0495 1.0904
1.0000 1.4299 -0.2237
1.0000 -0.2387 1.0904
1.0000 -0.7093 -0.2237
1.0000 -0.9584 -0.2237
1.0000 0.1652 1.0904
1.0000 2.7864 1.0904
1.0000 0.2030 1.0904
1.0000 -0.4237 -1.5378
1.0000 0.2986 -0.2237
1.0000 0.7126 1.0904
1.0000 -1.0075 -0.2237
1.0000 -1.4454 -1.5378
1.0000 -0.1871 1.0904
1.0000 -1.0037 -0.2237
theta
0
0
0
y
y =
399900
329900
369000
232000
539900
299900
314900
198999
212000
242500
239999
347000
329999
699900
259900
449900
299900
199900
499998
599000
252900
255000
242900
259900
573900
249900
464500
469000
475000
299900
349900
169900
314900
579900
285900
249900
229900
345000
549000
287000
368500
329900
314000
299000
179900
299900
239500
The X set represents values for multiple-variable regression; the first column stands for X0, the second for X1, and so on.
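The implementation formula is something like:
$\theta_j := \theta_j - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)}$
I have implemented MATLAB code for it: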
for i=1:size(theta,1)
    h=X*theta;
    sumE=sum((h-y).*X(:,i));
    theta(i)=theta(i)-alpha*(1/m)*sumE;
end
This sits inside a for loop that runs from 1 to n iterations (the value of m is not relevant here; it can be set to 40, for example).
The problem is that even though the code works and the result is the one expected, when I submit it to an online checking program my results are reported as wrong. The reason is that I should update theta simultaneously.
I got the following MATLAB code from the Internet:
h = X*theta;
theta = theta - alpha / m * (X'*(h - y));
When I run the Internet solution, it gives me almost the same answer as mine, with only a subtle difference in the 6th decimal place. When I submit that answer to the online program it is fully accepted, but I was wondering where the summation goes. The formula explicitly contains a summation, which no longer appears in the Internet solution. Maybe both codes are fine, but I do not know whether the Internet author has used some linear algebra trick. Any help?
Thanks
I am not sure if I understood your question, but the formula you copied from the Internet is X'*(h-y). Note that there is a transposition sign after X! So this is a matrix product. Your sum (your loop) is replaced by this matrix product.
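To make that concrete, here is a small sketch with made-up numbers (X, y and theta below are hypothetical, not the data from the question). Row i of X' is column i of X, so element i of X'*(h-y) is exactly sum((h-y).*X(:,i)); the summation is hidden inside the matrix product.

```matlab
% Hypothetical small example, not the data from the question.
X     = [1 0.5 -1; 1 -0.2 0.3; 1 1.5 2; 1 -1 0];
y     = [3; 1; 4; 2];
theta = [0.1; 0.2; 0.3];

h = X * theta;                              % predictions for all rows at once

% Explicit summation, one component of theta at a time.
g_loop = zeros(size(theta));
for i = 1:numel(theta)
    g_loop(i) = sum((h - y) .* X(:, i));
end

% The same sums written as a single matrix product.
g_vec = X' * (h - y);

disp(max(abs(g_loop - g_vec)));             % difference is at rounding level
```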
Their code simultaneously updates theta. Your code iterates through the rows of theta using newer values of theta to regenerate the h which is used in updating the later rows of theta. I'd bet heavily that this is the difference.
For clarity, let's keep track of each iteration of theta in a matrix.
Their code for iteration j is:
h = X*theta(:,j);
theta(:,j+1) = theta(:,j) - alpha / m * (X'*(h - y));
On the other hand, your code would be:
for i=1:size(theta,1)
    my_mismatched_theta = [theta(1:i-1, j+1); theta(i:end, j)];
    h=X * my_mismatched_theta;
    sumE=sum((h-y).*X(:,i));
    theta(i,j+1)=theta(i,j)-alpha*(1/m)*sumE;
end
It doesn't simultaneously update theta in one step. You're using newer versions of theta (i.e. theta(:,j+1) ) to generate h when updating the later rows of theta.
something you should try
Change your code to what I have below and see if you then get the same answer:
h=X*theta; %placed outside of loop so it doesn't get updated by new theta values
for i=1:size(theta,1)
    sumE=sum((h-y).*X(:,i));
    theta(i)=theta(i)-alpha*(1/m)*sumE;
end
Your algorithm may converge to the same point as theirs in this case, but there's a chance that the sort of cascade updating you're doing creates weirdness in other cases. Who knows.
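If you want to see the difference in numbers, here is a sketch of a single iteration done both ways on a tiny made-up problem (X, y, alpha and the names theta_sim and theta_seq are assumptions for illustration only).

```matlab
% Hypothetical small example, not the data from the question.
X      = [1 0.5 -1; 1 -0.2 0.3; 1 1.5 2; 1 -1 0];
y      = [3; 1; 4; 2];
alpha  = 0.1;
m      = size(X, 1);
theta0 = zeros(3, 1);

% Simultaneous update: h is computed once from the old theta.
h = X * theta0;
theta_sim = theta0 - alpha / m * (X' * (h - y));

% Sequential update: h is recomputed after every component changes.
theta_seq = theta0;
for i = 1:numel(theta_seq)
    h = X * theta_seq;
    theta_seq(i) = theta_seq(i) - alpha / m * sum((h - y) .* X(:, i));
end

disp([theta_sim theta_seq]);   % first row matches, later rows drift apart
```

The first component agrees because nothing has been updated yet when it is computed; the later ones differ because they already see the new values.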