Why doesn't the regression work? (MATLAB)

I have an accelerometer signal as below:
and I have a walking signal as below:
I tried to regress these two signals against each other using regress. Here is the code:
regMat=zeros(size(walkVec,1),size(walkVec,2));
for i=1:size(data,1)
    b=regress(walkAcc', walkVec(i,:)');
    regMat(i,:)=walkVec(i,:)-repmat(b,1,length(walkAcc));
    regData(i,:)=[data(i,1:(ipt(1)-1)), regMat(i,:), data(i,(ipt(2)+1):size(data,2))];
end
figure(1), hold on, plot(data(1,:),'r'), plot(regData(1,:)), title('regressed (b) and raw (r)')
legend('raw','regressed')
xlabel('samples')
ylabel('Intensity')
and here are the results:
As you can see, the regression doesn't really work. Do you have any idea why, and how I can fix it?
Thanks a lot and kind regards

Check whether regData actually has data in its first row. This can also happen when one dataset is several orders of magnitude larger or smaller than the other: it becomes negligible when both are drawn on the same chart. I tried your code with dummy data and it works.
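As a quick numeric check along those lines (reusing the variable names from your question), you could compare the scales of the two rows before plotting them on one chart:

any(regData(1,:) ~= 0)                        % does the first row hold any data?
[min(data(1,:))    max(data(1,:));            % raw range
 min(regData(1,:)) max(regData(1,:))]         % regressed range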

If all you are looking for is a visual representation of the regression, and you don't need to store or process the data, I suggest you use the function plotregression (https://it.mathworks.com/help/nnet/ref/plotregression.html).
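A minimal sketch of how that might look (plotregression is part of the Neural Network Toolbox; walkAcc and walkVec1 below are hypothetical stand-ins for your two signals):

walkAcc  = randn(1,500);                       % hypothetical 'target' signal
walkVec1 = 0.8*walkAcc + 0.1*randn(1,500);     % hypothetical 'output' signal
plotregression(walkAcc, walkVec1, 'walking signal vs accelerometer')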

Related

Avoiding for loops with cells and matrices involved

I am trying to avoid for loops, and I have been reading through all the old posts about it, but I am not able to solve my problem. I am new to MATLAB, so apologies for my ignorance.
The thing is that I have a 300x2 cell array, and each element holds a 128x128x256 matrix: an image of 128x128 pixels with 256 channels per pixel. The first column of the cell array holds my parallel intensity values and the second column my perpendicular intensity values.
What I want to do is take every pixel of every image (for each component) and sum the intensity values channel by channel.
The code I have is the following:
Image_par_channels=zeros(128,128,256);
Image_per_channels=zeros(128,128,256);
Image_tot_channels=zeros(128,128,256);
for a=1:128
    for b=1:128
        for j=1:256
            for i=1:numfiles
                Image_par_channels(a,b,j)=Image_par_channels(a,b,j)+Image_cell_par_per{i,1}(a,b,j);
                Image_per_channels(a,b,j)=Image_per_channels(a,b,j)+Image_cell_par_per{i,2}(a,b,j);
            end
            Image_tot_channels(a,b,j)=Image_par_channels(a,b,j)+2*G*Image_per_channels(a,b,j);
        end
    end
end
I think I could speed it up by indexing with (:,:,j) instead of looping over a and b, but that would still leave a for loop. I have been trying to use cellfun, without any success due to my lack of expertise. Could you please give me a hand?
I would really appreciate it.
Many thanks and have a nice day!
Y
I believe you could do something like
Image_par_channels=zeros(128,128,256);
Image_per_channels=zeros(128,128,256);
Image_tot_channels=zeros(128,128,256);
for i=1:numfiles
    Image_par_channels = Image_par_channels + Image_cell_par_per{i,1};
    Image_per_channels = Image_per_channels + Image_cell_par_per{i,2};
end
Image_tot_channels = Image_par_channels + 2*G*Image_per_channels;
I haven't worked with MATLAB in a long time, but I seem to recall you can do something like this. G is a constant.
EDIT:
Removed the +=; incremental assignment is not an operator available in MATLAB. You should also note that Image_tot_channels can be built directly in the loop if you don't need the other two variables later.
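If you want to drop the explicit loop entirely, a cat/sum version per component should also work, assuming every cell really holds a full 128x128x256 array:

% Stack the numfiles arrays along a 4th dimension and sum over that dimension.
Image_par_channels = sum(cat(4, Image_cell_par_per{1:numfiles,1}), 4);
Image_per_channels = sum(cat(4, Image_cell_par_per{1:numfiles,2}), 4);
Image_tot_channels = Image_par_channels + 2*G*Image_per_channels;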

Resample factors are too large

I have a large vector of recorded data which I need to resample. The problem I encounter is that when using resample, I get the following error:
??? Error using ==> upfirdn at 82
The product of the downsample factor Q and the upsample factor P must be less than 2^31.
Now, I understand why this is happening: my two sampling rates are very close together, so the integer factors need to be quite large (something like 73999/74000). Unfortunately this means that the appropriate filter can't be created by MATLAB. I also tried resampling up first, with the intention of then resampling down, but there is not enough memory to do this for even 1 million samples of data (mine is 93M samples).
What other methods could I use to properly resample this data?
An interpolated polyphase FIR filter can be used to interpolate just the new set of sample points without using an upsampling+downsampling process.
But if performance is completely unimportant, here's a quick-and-dirty windowed-sinc interpolator in BASIC.
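Not the polyphase filter itself, but as a rough MATLAB sketch of the same idea, you can interpolate the signal directly at the new sample instants and skip the up/down stages entirely (fsOld, fsNew and x below are hypothetical stand-ins for your rates and data):

fsOld = 74000;  fsNew = 73999;             % hypothetical, nearly equal rates
x    = randn(1e6,1);                       % hypothetical recorded vector
tOld = (0:numel(x)-1).'/fsOld;             % original sample instants
tNew = (0:1/fsNew:tOld(end)).';            % new sample instants
y    = interp1(tOld, x, tNew, 'spline');   % evaluate only where needed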
Here's my code, I hope it helps:
function resig = resamplee(sig,upsample,downsample)
% Recursively split the signal until the resample factors are small enough.
if upsample*downsample < 2^31
    resig = resample(sig,upsample,downsample);
else
    sig1half = sig(1:floor(length(sig)/2));
    sig2half = sig(floor(length(sig)/2)+1:end);   % +1 so the two halves don't overlap
    resig1half = resamplee(sig1half,floor(upsample/2),length(sig1half));
    resig2half = resamplee(sig2half,upsample-floor(upsample/2),length(sig2half));
    resig = [resig1half; resig2half];
end
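A hypothetical call with the nearly equal rates from the question would then look like:

resig = resamplee(x, 73999, 74000);   % x being the 93M-sample recording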

Function parameters in MATLAB wander off after curve fitting

First a little background: I'm a psychology student, so my background in coding isn't on par with you guys :-)
My problem is as follows, and the most important observation is that curve fitting with two different programs gives completely different results for my parameters, although my graphs stay the same. The main program we have used to fit my longitudinal data is KaleidaGraph, which should be seen as kind of the 'gold standard'; the program I'm trying to move to is MATLAB.
I was trying to be smart and wrote some code (a lot, at least for me), and the goal of that code was the following:
1. Take an individual longitudinal data file
2. Curve fit this data to a non-parametric model using lsqcurvefit
3. Obtain figures and the points where f' and f'' are zero
This all worked well (woohoo :-)), but when I started comparing the function parameters the two programs generate, there is a huge difference. The KaleidaGraph parameters stay close to their original starting values; MATLAB's wander off and sometimes get larger by a factor of 1000. The graphs stay more or less the same in both cases, however, and both fit the data well. Still, it would be lovely if I knew how to make the MATLAB curve fit more 'conservative', i.e. keep it closer to its original starting values.
validFitPersons = true(nbValidPersons,1);
for i=1:nbValidPersons
    personalData = data{validPersons(i),3};
    personalData = personalData(personalData(:,1)>=minAge,:);
    % Fit a specific model for all valid persons
    try
        opts = optimoptions(@lsqcurvefit, 'Algorithm', 'levenberg-marquardt');
        [personalParams,personalRes,personalResidual] = lsqcurvefit(heightModel,initialValues,personalData(:,1),personalData(:,2),[],[],opts);
    catch
        x=1;
    end
Above is the part of the code I've written to fit the data files to a specific model.
Below is an example of a non-parametric model I use, with its function parameters.
elseif strcmpi(model,'jpa2')
    % y = a.*(1-1/(1+(b_1(t+e))^c_1+(b_2(t+e))^c_2+(b_3(t+e))^c_3))
    heightModel = @(params,ages) abs(params(1).*(1-1./(1+(params(2).*(ages+params(8))).^params(5)+(params(3).*(ages+params(8))).^params(6)+(params(4).*(ages+params(8))).^params(7))));
    modelStrings = {'a','b1','b2','b3','c1','c2','c3','e'};
    % Define initial values
    if strcmpi('male',gender)
        initialValues = [176.76 0.339 0.1199 0.0764 0.42287 2.818 18.52 0.4363];
    else
        initialValues = [161.92 0.4173 0.1354 0.090 0.540 2.87 14.281 0.3701];
    end
I've tried to mimic the curve-fitting process in KaleidaGraph as closely as possible. There I found they use the Levenberg-Marquardt algorithm, which is why I selected it in MATLAB as well. However, the results still differ and I don't have any more clues about how to change this.
Some extra background:
The idea behind this code was the following:
I'm trying to compare different fitting models (they are designed for this purpose). So I have 5 models with different parameters and different starting values (the second part of my code), and then I have the general curve-fitting file. Since there are different models, it would be useful if I could restrict how far my parameters can wander off from their starting values.
Does anyone have any idea how this could be done?
Anybody willing to help a psychology student?
Cheers
This is a common issue when dealing with non-linear models.
If I were you, I would check whether you can remove some parameters from the model in order to simplify it.
If you really want to keep your solution not too far from the initial point, you can use lower and upper bounds for each variable:
x = lsqcurvefit(fun,x0,xdata,ydata,lb,ub)
defines a set of lower and upper bounds on the design variables in x so that the solution is always in the range lb ≤ x ≤ ub.
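As a rough sketch using the variable names from the question (the ±50% margin is just an assumption, tune it to your data; note that lsqcurvefit cannot combine bounds with the Levenberg-Marquardt algorithm, so drop that option and let it use the default trust-region-reflective method):

lb = initialValues - 0.5*abs(initialValues);   % hypothetical lower bounds
ub = initialValues + 0.5*abs(initialValues);   % hypothetical upper bounds
personalParams = lsqcurvefit(heightModel, initialValues, ...
    personalData(:,1), personalData(:,2), lb, ub);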
Cheers
You state:
I'm trying to compare different fitting models (they are designed for this purpose). So what I do is I have 5 models with different parameters and different starting values (the second part of my code) and next I have the general curve fitting file.
You will presumably compare the statistics from fits with different models, to see whether reductions in the fitting error are unlikely to be due to chance. You may want to rely on that comparison to pick the model that not only fits your data suitably but is also the simplest (which is often referred to as the principle of parsimony).
The problem is really with the model you have shown, which results in correlated parameters and therefore overfitting, as mentioned by @David. Again, this should be resolved when you compare different models and find that some do just as well (statistically speaking) even though they involve fewer parameters.
Edit:
To drive the point home regarding the problem with the choice of model, here are (1) the results of a trial fit using simulated data and (2) the correlation matrix of the parameters in graphical form:
Note that absolute values of the correlation close to 1 indicate strongly correlated parameters, which is highly undesirable. Note also that the trend in the data is practically linear over a long portion of the dataset, which implies that 2 parameters might suffice over that stretch, so using 8 parameters to describe it seems like overkill.
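For reference, a parameter correlation matrix of this kind can be estimated from the Jacobian that lsqcurvefit returns; a rough sketch, reusing the names from the question:

[p, resnorm, resid, ~, ~, ~, J] = lsqcurvefit(heightModel, initialValues, ...
    personalData(:,1), personalData(:,2));
J     = full(J);                                           % the Jacobian comes back sparse
covP  = inv(J.'*J) * resnorm/(numel(resid) - numel(p));    % approximate parameter covariance
corrP = covP ./ (sqrt(diag(covP))*sqrt(diag(covP)).');     % normalise to correlations
imagesc(abs(corrP)), colorbar                              % values near 1 = strongly correlated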

Tanimoto coefficient using MATLAB

I need to calculate the Tanimoto coefficient. I don't know what's wrong with my code. I have two nearly similar images, but the value obtained using my code indicates that the two images are highly dissimilar. Kindly help me with my code.
%Tanimoto coeff
I=imread('sliver3.jpg');
J=imread('ref5.jpg');
figure,imshow(I),title('Original');
figure,imshow(J),title('Reference');
inter=intersect(I,J,'rows');
uni=union(I,J,'rows');
si=size(inter);
su=size(uni);
tc=si/su
I attach three images here. The first one is the segmented output. The second is a reference image. The third is also a reference, but one that is highly dissimilar. So the output should show that the first and second are nearly similar, and that the first and third are highly dissimilar. But I'm getting the opposite.
For the first two images, tc = 0.4895
For the first and third images, tc = 0.5692
Kindly help me out.
I think you should use the sum() function on the union and the intersection instead of size(), since the Tanimoto coefficient is the 'summation of the intersection' divided by the 'summation of the union'.
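A minimal sketch of that idea, assuming the images can be reduced to logical segmentation masks of the same size (the 0.5 threshold is just a placeholder, pick one that suits your data):

I  = imread('sliver3.jpg');
J  = imread('ref5.jpg');
Ib = im2bw(I, 0.5);                              % binarise the segmented output
Jb = im2bw(J, 0.5);                              % binarise the reference
tc = sum(Ib(:) & Jb(:)) / sum(Ib(:) | Jb(:))     % intersection count / union count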

Sample-by-sample cross-correlation (xcorr) in MATLAB

I am using the xcorr function to identify the similarity of two signals. The following is the code:
r1 = max(abs(xcorr(S1, shat1, 'coeff')));
r2 = max(abs(xcorr(S1, shat2, 'coeff')));
if r1 > r2
    dn = shat2;
else
    dn = shat1;
end
It works perfectly, but the signals have 40,000 samples each, so in practice I get a lot of delay. I would like to feed batches of samples (say 250 samples) into xcorr to get rid of the delay, but how do I do that? I know that I have to use a for loop, but I'm finding it difficult. Can someone suggest how to do it? I tried something like this:
for i=1:250:40000
    r1 = max(abs(xcorr(S1(:,i), shat1(:,i),'coeff')));
but I'm totally lost. Any suggestions would be appreciated.
If I understand you correctly, you want to cross-correlate blocks of 250 samples, one after the other. Adapting your attempt, try:
for i = 1:250:40000
    r1 = max(abs(xcorr(S1(i:i+249), shat1(i:i+249), 'coeff')));
end
As a side note, do you know anything about the maximum lag between your signals? If you can safely assume that the temporal shift between the signals is below 250 samples (which the idea of splitting them into intervals suggests), you could save computation time by modifying your original code to use maxlags, a parameter of xcorr:
maxlags = 250; % or some other reasonable value, maybe even 100? 50?
r1 = max(abs(xcorr(S1, shat1, maxlags, 'coeff')));
r2 = max(abs(xcorr(S1, shat2, maxlags, 'coeff')));
...
I haven't tested how fast that would be, but my guess is you might be able to avoid your loop altogether with this...