I have a couple of big 3-dimensional matrices (e.g. of size 16330 x 1300 x 16). For each pixel I need to fit a simple linear regression model and extract some information, such as the slope and intercept of the fitted model. I wrote a loop that processes the image pixel by pixel, but it takes forever. Is there any way to speed up the following code?
% read the multiband image (16330,1300,16)
[A,R] = geotiffread('16Bands_image.tif');
% 'external' is a 1x16 vector that is fitted against the third dimension
% (the 16 bands) of each pixel throughout the image
Load external.m
intercept = zeros(size(A,1),size(A,2));
slope = zeros(size(A,1),size(A,2));
for i=1:size(A,1)
for j=1:size(A,2)
REF=squeeze(A(i,j,:));
p=fitlm(REF,external);
intercept(i,j)=p.Coefficients.Estimate(1);
slope(i,j) = p.Coefficients.Estimate(2);
end
end
Thanks
If p = fitlm(external, REF) is what you need, there is a fast solution: reshape the image into a 16 by (16330*1300) matrix and fit all pixels at once, without a loop.
A = reshape(A, [], 16)'; % reshape and transpose to 16 by N
X = external(:);
X = X - mean(X);
b = [ones(16,1) X] \ A; % solve all once
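Rows 1 and 2 of b hold the intercept and slope, respectively. Because X is centered, row 1 is the fitted value at the mean of the original external vector (that is, the mean of each pixel). To get the intercept at X = 0, subtract slope*mean(external) from it, or skip the centering step.
I don't know your data, but this assumes A is the measured (response) data.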
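As a minimal sketch of the whole round trip, assuming A holds the response and has the bands along the third dimension: the sizes, the synthetic image, and the names trueSlope and trueIcpt below are made up for illustration, and the conversion to double is there in case the image was read as an integer class.

```matlab
% Small synthetic stand-in for the image (hypothetical sizes)
nr = 4; nc = 3; nb = 16;
external  = linspace(1, 2, nb);            % stand-in for the 1x16 vector
trueSlope = rand(nr, nc);
trueIcpt  = rand(nr, nc);
A = bsxfun(@plus, trueIcpt, ...
    bsxfun(@times, trueSlope, reshape(external, 1, 1, nb)));

Y = reshape(double(A), [], nb).';          % nb-by-N, one column per pixel
x = external(:);
xm = mean(x);
b = [ones(nb, 1), x - xm] \ Y;             % 2-by-N, all pixels solved at once

% Back to image layout (reshape is column-major, so this matches A)
slope     = reshape(b(2, :), nr, nc);
% Row 1 is the fit at x = mean(x); shift it back to x = 0
intercept = reshape(b(1, :) - b(2, :) * xm, nr, nc);
```

With noise-free synthetic data, slope and intercept should reproduce trueSlope and trueIcpt up to round-off.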
If you really do want it the other way around, you may still need to loop over the pixels:
external = external(:); % make sure it is column
b = zeros(2, size(A,2)); % A in 16 by N
for i = 1:size(A,2)
X = A(:,i);
X = X - mean(X);
b(:,i) = [ones(16,1) X] \ external;
end
But this is still slow, although it is faster than fitlm.
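For a single predictor, the loop can probably be avoided as well, because simple linear regression has a closed form: the slope is the covariance of predictor and response divided by the variance of the predictor. The following is a sketch under that assumption, not something taken from the answer above; the data is synthetic and A is assumed to be 16-by-N already.

```matlab
nb = 16; N = 1000;
A = rand(nb, N);                     % one column per pixel (synthetic)
external = rand(nb, 1);              % response, as a column

Xc = bsxfun(@minus, A, mean(A, 1));  % center each pixel's predictor
yc = external - mean(external);      % center the response

slope     = (yc.' * Xc) ./ sum(Xc.^2, 1);             % 1-by-N
intercept = mean(external) - slope .* mean(A, 1);     % intercept at X = 0
```

Note that intercept here belongs to the uncentered model, whereas the first row of b in the loop above is the intercept of the centered one (the mean of external). A pixel whose 16 values are all identical gives a zero denominator and needs separate handling.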
I have a synthetic image. I want to compute the eigenvalue decomposition of its local structure tensor (LST) for edge detection purposes. I used the eigenvalues l1, l2 and eigenvectors e1, e2 of the LST to generate an adaptive ellipse for each pixel of the image. Unfortunately, I get unequal eigenvalues l1, l2, and therefore unequal semi-axis lengths of the ellipse, in homogeneous regions of my figure:
However I get good response for a simple test image:
I don't know what is wrong in my code:
function [H,e1,e2,l1,l2] = LST_eig(I,sigma1,rw)
% LST_eig - compute the structure tensor and its eigen
% value decomposition
%
% H = LST_eig(I,sigma1,rw);
%
% sigma1 is pre smoothing width (in pixels).
% rw is filter bandwidth radius for tensor smoothing (in pixels).
%
n = size(I,1);
m = size(I,2);
if nargin<2
sigma1 = 0.5;
end
if nargin<3
rw = 0.001;
end
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% pre smoothing
J = imgaussfilt(I,sigma1);
% compute gradient using the Scharr operator (Sobel alternative commented out below)
Sch = [-3 0 3;-10 0 10;-3 0 3];
%h = fspecial('sobel');
gx = imfilter(J,Sch,'replicate');
gy = imfilter(J,Sch','replicate');
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% compute tensors
gx2 = gx.^2;
gy2 = gy.^2;
gxy = gx.*gy;
% smooth
gx2_sm = imgaussfilt(gx2,rw); %rw/sqrt(2*log(2))
gy2_sm = imgaussfilt(gy2,rw);
gxy_sm = imgaussfilt(gxy,rw);
H = zeros(n,m,2,2);
H(:,:,1,1) = gx2_sm;
H(:,:,2,2) = gy2_sm;
H(:,:,1,2) = gxy_sm;
H(:,:,2,1) = gxy_sm;
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% eigen decomposition
l1 = zeros(n,m);
l2 = zeros(n,m);
e1 = zeros(n,m,2);
e2 = zeros(n,m,2);
for i = 1:n
for j = 1:m
Hmat = zeros(2);
Hmat(:,:) = H(i,j,:,:);
[V,D] = eigs(Hmat);
D = abs(D);
l1(i,j) = D(1,1); % eigen values
l2(i,j) = D(2,2);
e1(i,j,:) = V(:,1); % eigen vectors
e2(i,j,:) = V(:,2);
end
end
Any help is appreciated.
This is my ellipse drawing code:
% determining ellipse parameteres from eigen value decomposition of LST
M = input('Enter the maximum allowed semi-major axes length: ');
I = input('Enter the input data: ');
row = size(I,1);
col = size(I,2);
a = zeros(row,col);
b = zeros(row,col);
cos_phi = zeros(row,col);
sin_phi = zeros(row,col);
for m = 1:row
for n = 1:col
a(m,n) = (l2(m,n)+eps)/(l1(m,n)+l2(m,n)+2*eps)*M;
b(m,n) = (l1(m,n)+eps)/(l1(m,n)+l2(m,n)+2*eps)*M;
cos_phi1 = e1(m,n,1);
sin_phi1 = e1(m,n,2);
len = hypot(cos_phi1,sin_phi1);
cos_phi(m,n) = cos_phi1/len;
sin_phi(m,n) = sin_phi1/len;
end
end
%% plot elliptic structuring elements using parametric equation and superimpose on the image
figure; imagesc(I); colorbar; hold on
t = linspace(0,2*pi,50);
for i = 10:10:row-10
for j = 10:10:col-10
x0 = j;
y0 = i;
x = a(i,j)/2*cos(t)*cos_phi(i,j)-b(i,j)/2*sin(t)*sin_phi(i,j)+x0;
y = a(i,j)/2*cos(t)*sin_phi(i,j)+b(i,j)/2*sin(t)*cos_phi(i,j)+y0;
plot(x,y,'r','linewidth',1);
hold on
end
end
This my new result with the Gaussian derivative kernel:
This is the new plot with axis equal:
I created a test image similar to yours (probably less complicated) as follows:
pos = yy([400,500]) + 100 * sin(xx(400)/400*2*pi);
img = gaussianlineclip(pos+50,7) + gaussianlineclip(pos-50,7);
I = double(stretch(img));
(This requires DIPimage to run)
Then ran your LST_eig on it (sigma1=1 and rw=3) and your code to draw ellipses (no change to either, except adding axis equal), and got this result:
I suspect some non-uniformity in some of the blue areas of your image, which causes very small gradients to appear. The problem with the definition of the ellipses as you use them is that, for sufficiently oriented patterns, you'll get a line even if that pattern is imperceptible. You can get around this by defining your ellipse axis lengths as follows:
a = repmat(M,size(l2)); % longest axis is always the same
b = M ./ (l2+1); % shortest axis is shorter the more important the largest eigenvalue is
The smallest eigenvalue l1 is high in regions with strong gradients but no clear direction. The above does not take this into account. One option could be to make a depend on both energy and anisotropy measures, and b depend only on energy:
T = 1000; % some threshold
r = M ./ max(l1+l2-T,1); % circle radius, smaller for higher energy
d = (l2-l1) ./ (l1+l2+eps); % anisotropy measure in range [0,1]
a = M*d + r.*(1-d); % use `M` length for high anisotropy, use `r` length for high isotropy (circle)
b = r; % use `r` width always
This way, the whole ellipse shrinks if there are strong gradients but no clear direction, whereas it stays large and circular when there are only weak or no gradients. The threshold T depends on image intensities, adjust as needed.
You should probably also consider taking the square root of the eigenvalues, as they correspond to the variance.
Some suggestions:
You can write
a = (l2+eps)./(l1+l2+2*eps) * M;
b = (l1+eps)./(l1+l2+2*eps) * M;
cos_phi = e1(:,:,1);
sin_phi = e1(:,:,2);
without a loop. Note that e1 is normalized by definition, there is no need to normalize it again.
Use Gaussian gradients instead of Gaussian smoothing followed by Sobel or Scharr filters. See here for some MATLAB implementation details.
Use eig, not eigs, when you need all eigenvalues. Especially for such a small matrix, there is no advantage to using eigs. eig seems to produce more consistent results. There is no need to take the absolute value of the eigenvalues (D = abs(D)), as they are non-negative by definition.
Your default value of rw = 0.001 is way too small, a sigma of that size has no effect on the image. The goal of this smoothing is to average gradients in a local neighborhood. I used rw=3 with good results.
Use DIPimage. There is a structuretensor function, Gaussian gradients, and a lot more useful stuff. The 3.0 version (still in development) is a major rewrite that improves significantly on dealing with vector- and matrix-valued images. I can write all of your LST_eig as follows:
I = dip_image(I);
g = gradient(I, sigma1);
H = gaussf(g*g.', rw);
[e,l] = eig(H);
% Equivalences with your outputs:
l1 = l{2};
l2 = l{1};
e1 = e{2,:};
e2 = e{1,:};
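If DIPimage is not an option, the per-pixel eig loop can likely be replaced by the closed-form solution for a symmetric 2x2 matrix. The sketch below is an assumption-laden illustration rather than the answer's method: a, c and b stand in for gx2_sm, gy2_sm and gxy_sm, and the test data is random.

```matlab
% Synthetic tensor components (stand-ins for gx2_sm, gy2_sm, gxy_sm)
gx = randn(5, 6); gy = randn(5, 6);
a = gx.^2 + 0.1;  c = gy.^2 + 0.1;  b = gx .* gy;

% Eigenvalues of [a b; b c] for every pixel at once
half = (a + c) / 2;
root = sqrt(((a - c) / 2).^2 + b.^2);
lmax = half + root;                 % larger eigenvalue (l2 in this thread)
lmin = max(half - root, 0);         % smaller eigenvalue (l1), clamped against round-off

% Orientation of the eigenvector that belongs to lmax
theta = 0.5 * atan2(2 * b, a - c);
emax = cat(3,  cos(theta), sin(theta));   % unit eigenvector for lmax
emin = cat(3, -sin(theta), cos(theta));   % unit eigenvector for lmin
```

Both eigenvectors come out unit length and mutually orthogonal, and the ordering of the eigenvalues is fixed by construction, which sidesteps the ordering inconsistency mentioned for eigs.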
I'm a beginner in image processing and I'm using MATLAB to extract HOG features from images to train an SVM classifier. The training images are 480*640 pixels and I get 167796 features with the default settings of the built-in MATLAB extractHOGFeatures function. However, when I test the model I get far fewer features (only 216!), even though the testing images are the same size as the training images. MATLAB reports the error "The number of columns in TEST and training data must be equal".
Do you have any idea how to solve this problem and get feature vectors of the same size for the training and testing sets?
Here is the code,
[fpos,fneg] = featuress(pathPos, pathNeg);
%train SVM
HOG_featV = loadingV(fpos,fneg); % loading and labeling each training example
%% Detection
tSize = [24 32];
testImPath = '.\face_detection\dataset\bikes_and_persons2\';
imlist = dir([testImPath '*.bmp']);
for j = 1:length(imlist)
disp ('inside for loop');
img = imread([testImPath imlist(j).name]);
axis equal; axis tight; axis off;
imshow(img); hold on;
detect(img,model,tSize);
end
%% training
function [fpos, fneg] = featuress(pathPos,pathNeg)
% extract features for positive examples
imlist = dir([pathPos '*.bmp']);
for i = 1:length(imlist)
im = imread([pathPos imlist(i).name]);
fpos{i} = extractHOGFeatures(double(im));
end
% extract features for negative examples
imlist = dir([pathNeg '*.bmp']);
for i = 1:length(imlist)
im = imread([pathNeg imlist(i).name]);
fneg{i} = extractHOGFeatures(double(im));
end
end
%% testing function
function detect(im,model,wSize)
topLeftRow = 1;
topLeftCol = 1;
[bottomRightCol bottomRightRow d] = size(im);
fcount = 1;
for y = topLeftCol:bottomRightCol-wSize(2)
for x = topLeftRow:bottomRightRow-wSize(1)
p1 = [x,y];
p2 = [x+(wSize(1)-1), y+(wSize(2)-1)];
po = [p1; p2];
img = imcut(po,im);
featureVector{fcount} = extractHOGFeatures(double(img));
boxPoint{fcount} = [x,y];
fcount = fcount+1;
x = x+1;
end
end
lebel = ones(length(featureVector),1);
P = cell2mat(featureVector');
% each row of P' correspond to a window
[ predictions] = svmclassify(model, P); % classifying each window
[a, indx]= max(predictions);
bBox = cell2mat(boxPoint(indx));
rectangle('Position',[bBox(1),bBox(2),24,32],'LineWidth',1, 'EdgeColor','r');
end
Thanks in advance.
What's the size of P? Is it 167796 x 216? If so, you should not transpose featureVector when you call cell2mat, or you should transpose P before you use it. You can also make featureVector a matrix rather than a cell array: since you know the length of the HOG vector (167796) and how many images you have, you can pre-allocate it up front and fill in the rows.
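It may also be worth checking the two feature counts against the length formula given in the extractHOGFeatures documentation. The sketch below assumes the documented defaults (CellSize [8 8], BlockSize [2 2], BlockOverlap [1 1], NumBins 9) and only does the arithmetic, so it needs no toolbox; hogLength is a helper name made up for this example.

```matlab
% Documented defaults of extractHOGFeatures (assumed unchanged in the question)
cellSize = [8 8]; blockSize = [2 2]; blockOverlap = [1 1]; numBins = 9;

% Feature length for an image of size sz = [rows cols]
hogLength = @(sz) prod([floor((sz ./ cellSize - blockSize) ./ ...
                        (blockSize - blockOverlap) + 1), blockSize, numBins]);

nTrain  = hogLength([480 640]);   % full-size training image
nWindow = hogLength([32 24]);     % one 24x32 detection window
fprintf('training image: %d features, window: %d features\n', nTrain, nWindow);
```

These evaluate to 167796 and 216, the two numbers in the question. That suggests the mismatch comes from training on whole 480x640 images while detect classifies 24x32 windows, so one possible fix is to train on patches of the window size (or resize them to it) before extracting features.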
I want to use a linear model to fit a set of sine data, but the loss grows with each iteration. Is there a problem with my code below? (I am using gradient descent.)
Here is my code in Matlab
m=20;
rate = 0.1;
x = linspace(0,2*pi,20);
x = [ones(1,length(x));x]
y = sin(x);
w = rand(1,2);
for i=1:500
h = w*x;
loss = sum((h-y).^2)/m/2
total_loss = [total_loss loss];
gradient = (h-y)*x'./m ;
w = w - rate.*gradient;
end
Here is the data I want to fit
There isn't a problem with your code. With your current framework, if you can define data in the form of y = m*x + b, then this code is more than adequate. I actually ran it through a few tests where I define an equation of the line and add some Gaussian random noise to it (amplitude = 0.1, mean = 0, std. dev = 1).
However, one problem I will mention is that your sinusoidal data is defined on the domain [0,2*pi]. As you can see, there are x values that map to y values of the same magnitude but opposite sign. For example, at x = pi/2 we get 1 but at x = 3*pi/2 we get -1. This high variability does not bode well for linear regression, so one suggestion I have is to restrict your domain to something like [0, pi]. Another reason why it probably doesn't converge is that the learning rate you chose is too high. I'd set it to something low like 0.01. As you mentioned in your comments, you already figured that out!
However, if you want to fit non-linear data using linear regression, you're going to have to include higher order terms to account for the variability. As such, try including second order and/or third order terms. This can simply be done by modifying your x matrix like so:
x = [ones(1,length(x)); x; x.^2; x.^3];
If you recall, the hypothesis function can be represented as a summation of linear terms:
h(x) = theta0 + theta1*x1 + theta2*x2 + ... + thetan*xn
In our case, each theta term would build a higher order term of our polynomial. x2 would be x^2 and x3 would be x^3. Therefore, we can still use the definition of gradient descent for linear regression here.
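One side effect of adding higher-order terms is that the rows of x end up on very different scales, which tightens the range of usable learning rates. As a rough check, and assuming plain batch gradient descent on the squared loss used here, the iteration is stable only when the rate is below 2 divided by the largest eigenvalue of x*x'/m. The names X, H and maxRate are introduced just for this sketch.

```matlab
m = 20;
x = linspace(0, pi, m);
X = [ones(1, m); x; x.^2; x.^3];   % design matrix with cubic terms

H = (X * X.') / m;                 % Hessian of sum((w*X - y).^2)/(2*m)
maxRate = 2 / max(eig(H));         % gradient descent diverges above this rate
fprintf('largest stable learning rate: about %.4f\n', maxRate);
```

For this design matrix the bound falls between 0.01 and 0.1, which is consistent with a rate of 0.01 converging while 0.1 would not.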
I'm also going to control the random generation seed (via rng) so that you can produce the same results I have gotten:
clear all;
close all;
rng(123123);
total_loss = [];
m = 20;
x = linspace(0,pi,m); %// Change
y = sin(x);
w = rand(1,4); %// Change
rate = 0.01; %// Change
x = [ones(1,length(x)); x; x.^2; x.^3]; %// Change - Second and third order terms
for i=1:500
h = w*x;
loss = sum((h-y).^2)/m/2;
total_loss = [total_loss loss];
% gradient is now in a different expression
gradient = (h-y)*x'./m ; % sum all in each iteration, it's a batch gradient
w = w - rate.*gradient;
end
If we try this, we get for w (your parameters):
>> format long g;
>> w
w =
Columns 1 through 3
0.128369521905694 0.819533906064327 -0.0944622478526915
Column 4
-0.0596638117151464
My final loss after this point is:
loss =
0.00154350916582836
This means that the equation of our fitted curve is:
y = 0.128 + 0.820x - 0.094x^2 - 0.060x^3
If we plot this curve together with your sinusoidal data, this is what we get:
xval = x(2,:);
plot(xval, y, xval, polyval(fliplr(w), xval))
legend('Original', 'Fitted');
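To judge how close 500 iterations get to the optimum, the same cubic model can be solved directly as a least-squares problem. This is a sketch for comparison only, not part of the gradient descent code; w_ls and loss_ls are names introduced here.

```matlab
m = 20;
x = linspace(0, pi, m);
y = sin(x);
X = [ones(1, m); x; x.^2; x.^3];

w_ls = y / X;                                  % least-squares solution of w*X = y
loss_ls = sum((w_ls * X - y).^2) / m / 2;      % same loss definition as in the loop
fprintf('least-squares loss: %g\n', loss_ls);
```

Because this is the minimizer of the same loss, loss_ls is a lower bound for whatever gradient descent reaches; the gap between the two shows how much further the iteration could still go.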
Please find the data in the link below, or if you can send me your private email, I can send you the data
https://dl.dropboxusercontent.com/u/5353938/test_matlab_lefou.xlsx
In the excel sheet, the first column is y, the second is x and the third is t, I hope this will make things much more clear, and many thanks for the help.
I need to use the following model because it is the one that fits my data best, but I don't know how to find the values of a and b that give the best fit (I can attach a file if you need the values). I already have the values of y, x and t:
y = a*sqrt(x).*exp(b*t)
Thanks
Without depending on the Curve Fitting Toolbox, this problem can also be solved using fminsearch. I first generate some data, which you already have but didn't share with us. An initial guess for the parameters a and b must be made (p0). Then I run the optimization by minimizing the squared error between data and fit, resulting in the vector p_fit, which contains the optimized values of a and b. Finally, the result is visualized.
% ----- Generating some data for x, y and t (which you already got)
N = 10; % num of data points
x = linspace(0,5,N);
t = linspace(0,10,N);
% random parameters
a = rand()*5; % a between 0 and 5
b = (rand()-1); % b between -1 and 0
y = a*sqrt(x).*exp(b*t) + rand(size(x))*0.1; % noisy data
% ----- YOU START HERE WITH YOUR PROBLEM -----
% put x and t into a 2 row matrix for simplicity
D(1,:) = x;
D(2,:) = t;
% create model function with parameters p(1) = a and p(2) = b
model = @(p, D) p(1)*sqrt(D(1,:)).*exp(p(2)*D(2,:));
e = @(p) sum((y - model(p,D)).^2); % minimize squared errors
p0 = [1,-1]; % an initial guess (positive a and probably negative b for a decay)
[p_fit, r1] = fminsearch(e, p0); % Optimize
% ----- VISUALIZATION ----
figure
plot(x,y,'ko')
hold on
X = linspace(min(x), max(x), 100);
T = linspace(min(t), max(t), 100);
plot(X, model(p_fit, [X; T]), 'r--')
legend('data', sprintf('fit: y(t,x) = %.2f*sqrt(x)*exp(%.2f*t)', p_fit))
The result can look like
UPDATE AFTER MANY MANY COMMENTS
Your data are column vectors, whereas my solution used row vectors. The error occurred when the error function tried to compute the difference between a column vector (y) and a row vector (the result of the model function). Easy hack: turn them all into row vectors and use my approach. The result is: a = 0.5296 and b = 0.0013.
However, the optimization depends on the initial guess p0, so you might want to play around with it a little bit.
clear variables
load matlab.mat
% put x and t into a 2 row matrix for simplicity
D(1,:) = x;
D(2,:) = t;
y = reshape(y, 1, length(y)); % <-- also y is a row vector, now
% create model function with parameters p(1) = a and p(2) = b
model = @(p, D) p(1)*sqrt(D(1,:)).*exp(p(2)*D(2,:));
e = @(p) sum((y - model(p,D)).^2); % minimize squared errors
p0 = [1,0]; % an initial guess (positive a and probably negative b for a decay)
[p_fit, r1] = fminsearch(e, p0); % Optimize
% p_fit = nlinfit(D, y, model, p0) % as a working alternative with dependency on the statistics toolbox
% ----- VISUALIZATION ----
figure
plot(x,y,'ko', 'markerfacecolor', 'black', 'markersize',5)
hold on
X = linspace(min(x), max(x), 100);
T = linspace(min(t), max(t), 100);
plot(X, model(p_fit, [X; T]), 'r-', 'linewidth', 2)
legend('data', sprintf('fit: y(t,x) = %.2f*sqrt(x)*exp(%.2f*t)', p_fit))
The result doesn't look too satisfying though. But that mainly is because of your data. Have a look here:
With the cftool command (Curve Fitting Toolbox) you can fit your own functions and get back the parameters you need (a, b). Make sure your x-data and y-data are in separate variables. You can also specify weights for your measurements.
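Since the fminsearch approach in the earlier answer depends on the initial guess p0, one way to pick a starting point is to linearize the model: taking logarithms gives log(y) - 0.5*log(x) = log(a) + b*t, which is an ordinary straight-line fit in t. This is only a sketch, and it assumes all x and y values are strictly positive; the data below is synthetic, and a_true and b_true are made-up values.

```matlab
% Synthetic positive data standing in for the spreadsheet columns
t = linspace(0, 10, 50).';
x = linspace(1, 5, 50).';
a_true = 2.5; b_true = -0.3;
y = a_true * sqrt(x) .* exp(b_true * t);

% log(y) - 0.5*log(x) = log(a) + b*t is linear in log(a) and b
z = log(y(:)) - 0.5 * log(x(:));
c = [ones(numel(t), 1), t(:)] \ z;     % c(1) = log(a), c(2) = b

p0 = [exp(c(1)), c(2)];                % starting point [a, b] for fminsearch
fprintf('initial guess: a = %.4f, b = %.4f\n', p0);
```

The log transform reweights the errors, so on noisy data p0 is a starting point for the nonlinear fit rather than a replacement for it.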