matlab : vectorization and fitlm - matlab

I have a vectorization problem with fitlm.
Let A be the n-by-p matrix of observations and t the 1-by-p explanatory variable.
For example:
t=[0 1 2 3 4 5 6 7]
and
A=[3.12E-04 7.73E-04 3.58E-04 5.05E-04 4.02E-04 5.20E-04 1.84E-04 3.70E-04
3.38E-04 3.34E-04 3.28E-04 4.98E-04 5.19E-04 5.05E-04 1.97E-04 2.88E-04
1.09E-04 3.64E-04 1.82E-04 2.91E-04 1.82E-04 3.62E-04 4.65E-04 3.89E-04
2.70E-04 3.37E-04 2.03E-04 1.70E-04 1.37E-04 2.08E-04 1.05E-04 2.45E-04
3.70E-04 3.34E-04 2.63E-04 3.21E-04 2.52E-04 2.81E-04 6.25E+09 2.51E-04
3.11E-04 3.68E-04 3.65E-04 2.71E-04 2.69E-04 1.49E-04 2.97E-04 4.70E-04
5.48E-04 4.12E-04 5.55E-04 5.94E-04 6.10E-04 5.44E-04 5.67E-04 4.53E-04
....
]
I want to estimate a linear model for each row of A without looping, i.e. avoid this loop:
for i=1:7
% fitlm(X,y) takes the predictor first and the response second, both as columns
ml{i}=fitlm(t',A(i,:)');
end
Thanks for your help !
Luc

I believe that your problem is about understanding how fitlm works with matrices.
Let's work with the hald example for matlab:
>> load hald
>> Description
Description =
== Portland Cement Data ==
Multiple regression data
ingredients (%):
column1: 3CaO.Al2O3 (tricalcium aluminate)
column2: 3CaO.SiO2 (tricalcium silicate)
column3: 4CaO.Al2O3.Fe2O3 (tetracalcium aluminoferrite)
column4: 2CaO.SiO2 (beta-dicalcium silicate)
heat (cal/gm):
heat of hardening after 180 days
Source:
Woods,H., H. Steinour, H. Starke,
"Effect of Composition of Portland Cement on Heat Evolved
during Hardening," Industrial and Engineering Chemistry,
v.24 no.11 (1932), pp.1207-1214.
Reference:
Hald,A., Statistical Theory with Engineering Applications,
Wiley, 1960.
>> ingredients
ingredients =
7 26 6 60
1 29 15 52
11 56 8 20
11 31 8 47
7 52 6 33
11 55 9 22
3 71 17 6
1 31 22 44
2 54 18 22
21 47 4 26
1 40 23 34
11 66 9 12
10 68 8 12
>> heat
heat =
78.5000
74.3000
104.3000
87.6000
95.9000
109.2000
102.7000
72.5000
93.1000
115.9000
83.8000
113.3000
109.4000
This means that each column of the matrix ingredients holds the percentage of one ingredient in a sample:
>> sum(ingredients(1,:))
ans =
99 % so it is near 100%
The rows are the 13 measurements of the product, and the heat vector holds the heat recorded for each observation.
>> mdl = fitlm(ingredients,heat)
mdl =
Linear regression model:
y ~ 1 + x1 + x2 + x3 + x4
Estimated Coefficients:
Estimate SE tStat pValue
________ _______ ________ ________
(Intercept) 62.405 70.071 0.8906 0.39913
x1 1.5511 0.74477 2.0827 0.070822
x2 0.51017 0.72379 0.70486 0.5009
x3 0.10191 0.75471 0.13503 0.89592
x4 -0.14406 0.70905 -0.20317 0.84407
Number of observations: 13, Error degrees of freedom: 8
Root Mean Squared Error: 2.45
R-squared: 0.982, Adjusted R-Squared 0.974
F-statistic vs. constant model: 111, p-value = 4.76e-07
So in your case, it does not make sense to fit each observation separately. You simply need t to have the same number of elements as there are observations (rows of A).
Take a look here:
mdl = fitlm(A,t)

Problem solved using splitapply and findgroups!
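If only the fitted coefficients are needed (not full LinearModel objects), one loop-free alternative is a single least-squares solve, since every row of A shares the same t. This is a minimal sketch, assuming an intercept-plus-slope model and using only the first two rows of A from the question; the names X and coeffs are illustrative.

```matlab
t = [0 1 2 3 4 5 6 7];
A = [3.12E-04 7.73E-04 3.58E-04 5.05E-04 4.02E-04 5.20E-04 1.84E-04 3.70E-04
     3.38E-04 3.34E-04 3.28E-04 4.98E-04 5.19E-04 5.05E-04 1.97E-04 2.88E-04];

% Design matrix shared by all rows: a column of ones (intercept) and t (slope)
X = [ones(numel(t),1), t(:)];

% Solve all least-squares problems at once: each column of A.' is one response
coeffs = X \ A.';          % 2-by-n: row 1 = intercepts, row 2 = slopes

% Cross-check the first row against polyfit, which returns [slope intercept]
p1 = polyfit(t, A(1,:), 1);
assert(abs(coeffs(2,1) - p1(1)) < 1e-12 && abs(coeffs(1,1) - p1(2)) < 1e-12);
```

This yields ordinary least-squares estimates only, with no standard errors or p-values; if those are needed, fitting one model per row is still required.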

Related

Correct formulation for Logistic regression using glmfit in Matlab

When using glmfit in matlab, there are different problem setups that can be used:
x = [2100 2300 2500 2700 2900 3100 ...
3300 3500 3700 3900 4100 4300]';
n = [48 42 31 34 31 21 23 23 21 16 17 21]';
y = [1 2 0 3 8 8 14 17 19 15 17 21]';
[b dev] = glmfit(x,[y n],'binomial','link','probit');
Here they fit numerical data where n is the number of items tested, and y is the number of successes.
X = meas(51:end,:);
y = strcmp('versicolor',species(51:end));
b = glmfit(X,y,'binomial','link','logit')
In this case the y variable is binary and no n value is required (is that correct?)
In my case I have data on greyhound races.
For each race I have a dummy variable (y) that takes value one when the dog wins and zero otherwise.
Q1.) For this setup I should use this formulation correct (with no n value supplied)?
[b dev] = glmfit(X,y,'binomial','link','logit')
Q2.) What is the precise definition of dev? The documentation says it is a generalization of the residual sum of squares, but does not define it precisely.
Thanks
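On Q2: for a binomial model with a 0/1 response, the deviance is commonly defined as minus twice the log-likelihood of the fitted model (the saturated model has log-likelihood zero in that case). The sketch below recomputes it by hand. It assumes the Statistics Toolbox is available, and the data X and y are synthetic and purely illustrative, not greyhound data.

```matlab
rng(0);                                   % reproducible synthetic data
X = randn(200,2);                         % two made-up predictors
pTrue = 1 ./ (1 + exp(-(X*[1; -1])));     % made-up true win probabilities
y = double(rand(200,1) < pTrue);          % 0/1 response, no n column supplied

[b, dev] = glmfit(X, y, 'binomial', 'link', 'logit');
pHat = glmval(b, X, 'logit');             % fitted probabilities

% Deviance for binary data: -2 * log-likelihood of the fitted model
devManual = -2 * sum(y.*log(pHat) + (1-y).*log(1-pHat));
assert(abs(dev - devManual) < 1e-6 * max(1, dev));
```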

How to find out the coefficient in matching pursuit algorithm [duplicate]

I'm trying to implement the Matching Pursuit algorithm in Matlab. I have found the maximum inner product value, but I'm stuck on how to find the coefficients.
Please help me out.
Here is the algorithm
D=[1 6 11 16 21 26 31 36 41 46
2 7 12 17 22 27 32 37 42 47
3 8 13 18 23 28 33 38 43 48
4 9 14 19 24 29 34 39 44 49
5 10 15 20 25 30 35 40 45 50];
b=[16;17;18;19;20];
n=size(D);
A1=zeros(n);
R=b;
x=[];
H=10;
if(H <= 0)
error('The number of iterations needs to be greater than 0')
end;
[c,d] = max(abs(D'*R));
Here I have used a predefined dictionary.
Thanks in advance
You can use this function based on "S. Mallat, Z. Zhang, 1993. Matching pursuit in a time frequency dictionary. IEEE Transactions Signal Processing, Vol. 41, No. 12, pp. 3397-3415."
x = MP(b,D,10);
function S = MP(y,Dictionary,iteration)
n = size(Dictionary,2);
S = zeros(n,1);
% Normalize the dictionary atoms (columns) to have unit norm.
% It is better to do this outside the function,
% so that the dictionary is normalized only once!
%**************************************
for j = 1:n;
Dictionary(:,j) = Dictionary(:,j)/norm(Dictionary(:,j));
end
% *************************************
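for i = 1:iteration
% Inner products of the current residual with every unit-norm atom
gn = Dictionary' * y;
% Pick the atom most correlated with the residual
[~,index] = max(abs(gn));
% The coefficient is the signed inner product itself
coef = gn(index);
y = y - coef * Dictionary(:,index);
S(index) = S(index) + coef;
end
end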
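To make the meaning of a coefficient concrete, here is a single matching pursuit step written out on the dictionary and signal from the question. It is a minimal sketch, assuming unit-norm atoms; the names Dn, g, coef and r are illustrative.

```matlab
D = reshape(1:50, 5, 10);          % same dictionary as in the question
b = [16;17;18;19;20];

% Normalize every atom (column) to unit norm
Dn = D ./ repmat(sqrt(sum(D.^2,1)), size(D,1), 1);

g = Dn' * b;                        % inner product of the signal with each atom
[~, index] = max(abs(g));           % best-matching atom
coef = g(index);                    % its coefficient (signed inner product)
r = b - coef * Dn(:,index);         % residual after one step

% The residual is orthogonal to the chosen atom
assert(abs(Dn(:,index)' * r) < 1e-10);
```

Because b is exactly the fourth column of D, the first step already selects that atom and leaves an essentially zero residual; with a general signal the step is repeated on r.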

matlab: interpolation vector with a lot of duplicate values

My question might seem a little strange.
I have 3 vectors: x1, y1, and x2.
x1 and x2 have a lot of duplicate values. I would like to get y2 by interpolation, with the same length as y1.
y1 = [350 770 800 920 970 990 1020 1054 1080 1100];
x1=[10 10 11 14 13 12 10 10 10 7];
x2 = [10 10 13 13 15 13 13 10 10 10];
(The actual vectors are longer, but they always all have the same length.)
Is this impossible, or a nonsensical question? (In any case, my problem would remain unsolved.)

Find closest matching distances for a set of points in a distance matrix in Matlab

I have a matrix of measured angles between M planes
0 52 77 79
52 0 10 14
77 10 0 3
79 14 3 0
I have a list of known angles between planes, which is an N-by-N matrix that I name rho. Here is a subset of it (it's too large to display):
0 51 68 75 78 81 82
51 0 17 24 28 30 32
68 17 0 7 11 13 15
75 24 7 0 4 6 8
78 28 11 4 0 2 4
81 30 13 6 2 0 2
82 32 15 8 4 2 0
My mission is to find the set of M planes whose angles in rho are nearest to the measured angles.
For example, the measured angles for the planes shown above are relatively close to the known angles between planes 1, 2, 4 and 6.
Put differently, I need to find a set of points in a distance matrix (which uses cosine-related distances) which matches a set of distances I measured. This can also be thought of as matching a pattern to a mold.
In my problem, I have M=5 and N=415.
I really tried to get my head around it but have run out of time. So currently I'm using the simplest method: iterating over every possible combination of 3 planes. This is slow, and it is currently written only for M=3. I then return a list of matching planes sorted by a matching score:
function [scores] = which_zones(rho, angles)
N = size(rho,1);
scores = zeros(N^3, 4);
index = 1;
for i=1:N-2
for j=(i+1):N-1
for k=(j+1):N
found_angles = [rho(i,j) rho(i,k) rho(j,k)];
score = sqrt(sum((found_angles-angles).^2));
scores(index,:)=[score i j k];
index = index + 1;
end
end;
end
scores=scores(1:(index-1),:); % was too lazy to pre-calculate #
scores=sortrows(scores, 1);
end
I have a feeling pdist2 might help but not sure how. I would appreciate any help in figuring this out.
There is http://www.mathworks.nl/help/matlab/ref/dsearchn.html for closest point search, but that requires same dimensionality. I think you have to bruteforce find it anyway because it's just a special problem.
Here's a way to bruteforce iterate over all unique combinations of the second matrix and calculate the score, after that you can find the one with the minimum score.
A=[ 0 52 77 79;
52 0 10 14;
77 10 0 3;
79 14 3 0];
B=[ 0 51 68 75 78 81 82;
51 0 17 24 28 30 32;
68 17 0 7 11 13 15;
75 24 7 0 4 6 8;
78 28 11 4 0 2 4;
81 30 13 6 2 0 2;
82 32 15 8 4 2 0];
M = size(A,1);
N = size(B,1);
% find all unique combinations of M indices taken from `1:N`
idx = nchoosek(1:N,M);
K = size(idx,1); % number of combinations = valid candidates for matching A
score = NaN(K,1);
idx_triu = triu(true(M,M),1);
Atriu = A(idx_triu);
for ii=1:K
partB = B(idx(ii,:),idx(ii,:));
partB_triu = partB(idx_triu);
score(ii) = norm(Atriu-partB_triu,2); % store one score per candidate
end
[~, best_match_idx] = min(score);
best_match = idx(best_match_idx,:);
The solution of your example actually is [1 2 3 4], so the upperleft part of B and not [1 2 4 6].
This would theoretically solve your problem, and I don't know how to make this algorithm any faster. But it will still be slow for large numbers. For example for your case of M=5 and N=415, there are 100 128 170 583 combinations of B which are a possible solution; just generating the selector indices is impossible in 32-bit because you can't address them all.
I think the real optimization here lies in cutting away some of the planes in the NxN matrix in a preceding filtering part.
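For small N, the scoring loop can also be written without a loop by looking up all candidate angles in one indexing operation. This is a minimal sketch on the example matrices from this thread; the names r, c, cand and ranked are illustrative, and it does nothing about the memory needed to enumerate the combinations for M=5 and N=415.

```matlab
A = [ 0 52 77 79; 52 0 10 14; 77 10 0 3; 79 14 3 0];
B = [ 0 51 68 75 78 81 82;
     51  0 17 24 28 30 32;
     68 17  0  7 11 13 15;
     75 24  7  0  4  6  8;
     78 28 11  4  0  2  4;
     81 30 13  6  2  0  2;
     82 32 15  8  4  2  0];
M = size(A,1);
N = size(B,1);

idx = nchoosek(1:N, M);               % K-by-M candidate index sets
mask = triu(true(M), 1);
[r, c] = find(mask);                  % plane pairs, in the same order as A(mask)
Atriu = A(mask);                      % measured angles, P-by-1

% K-by-P matrix: angle in B for every candidate set and every plane pair
cand = B(sub2ind([N N], idx(:,r), idx(:,c)));

% Euclidean distance of each candidate's angles to the measured ones
score = sqrt(sum(bsxfun(@minus, cand, Atriu.').^2, 2));
[~, order] = sort(score);
ranked = idx(order, :);               % candidate sets, best match first
```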

Error with ^2, says matrix should be square

I wanted to plot the following
y=linspace(0,D,100)
temp=y^2;
plot(y,temp);
I am getting an error with y^2; it says the matrix should be square.
Is there another way to plot this?
You are not getting that error because of plot. You are getting it because of
temp=y^2
Instead, you should be using
temp=y.^2
^ means matrix power. .^ is elementwise power. You can find more about MATLAB operators here.
Let's say you have a 3x3 matrix, magic(3).
A=magic(3)
A =
8 1 6
3 5 7
4 9 2
Here is square of matrix A (which is A*A, as Dan suggested):
A^2
ans =
91 67 67
67 91 67
67 67 91
Here is the matrix which contains squares of A's elements:
A.^2
ans =
64 1 36
9 25 49
16 81 4
Just as an alternative to the above answer, you can consider the following case:
A = magic(3);
temp = bsxfun(@times,A,A);
which retrieves the same results as
temp = A.^2;
The . prefix makes the power apply element-wise, and bsxfun with @times does exactly the same thing here.
I hope this helps.
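Applied to the original plotting problem, the fix looks like this. It is a minimal sketch; the value of D is hypothetical, since the question does not give it.

```matlab
D = 5;                         % hypothetical upper limit; D is not given in the question
y = linspace(0, D, 100);       % 1-by-100 row vector
temp = y.^2;                   % element-wise square, also 1-by-100
assert(isequal(size(temp), size(y)));
plot(y, temp);
```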