I have an image similar to this one:
and want to remove its underlying baseline so that it looks like:
The image is always different, usually has some peaks and has a total absolute offset and a base surface that is tilted/curved/nonlinear.
I was thinking of using the 1D baseline fitting and subtraction technique that is common for signal spectra: create a 2D baseline image and then numerically subtract it from the original. But I can't quite get my head around it in 2D.
This is an improved version of a question I asked before; this one should be clearer.
It seems to me that some kind of high-pass filter can solve your problem. One way to build one is to apply a blurring filter (a low-pass filter) and subtract the blurred image from the original (the idea behind "unsharp masking"). For the low-pass step you could use a convolution with a Gaussian or just a plain box filter. Alternatively, you could use a median filter, which is what I did here:
%% setup
v = 0:0.01:1;
[x,y] = meshgrid(v, v);
z0 = cos(pi*x).*cos(pi*y); % "baseline" surface
z = z0;
pks = [1,1; 3,3; 7,5; 2,8; 7,3]/10; % add 5 peaks
for p = pks'
    z = z + exp(-((x-p(1)).^2 + (y-p(2)).^2)/0.02.^2);
end
subplot(221);mesh(x,y,z);title('data');
%% recover "baseline"
z0_ = medfilt2(z, [1,1]*20, 'symmetric'); % low pass filter of your choice
subplot(222);mesh(x,y,z0_);title('recovered baseline');
subplot(223);mesh(x,y,z0_-z0);title('error');
%% subtract recovered baseline
subplot(224);mesh(x,y,z-z0_);title('recovered baseline removed');
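The same idea can be sketched outside MATLAB. The following is a minimal pure-Python illustration, not a port of the code above: the function names `median_filter_2d` and `remove_baseline` are made up for this example, the toy data (a tilted plane plus one narrow Gaussian peak) is an assumption, and the edges are clamped (replicated) rather than mirrored as with the 'symmetric' option.

```python
import math
from statistics import median


def median_filter_2d(img, radius):
    """Median over a (2*radius+1)^2 window; edge pixels are clamped (replicated)."""
    rows, cols = len(img), len(img[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = [
                img[min(max(r + dr, 0), rows - 1)][min(max(c + dc, 0), cols - 1)]
                for dr in range(-radius, radius + 1)
                for dc in range(-radius, radius + 1)
            ]
            out[r][c] = median(window)
    return out


def remove_baseline(img, radius):
    """Estimate the baseline with a median filter and subtract it."""
    baseline = median_filter_2d(img, radius)
    flat = [[v - b for v, b in zip(row, brow)] for row, brow in zip(img, baseline)]
    return flat, baseline


# Toy data (assumed): tilted plane plus one narrow Gaussian peak at (10, 10)
n = 21
img = [
    [0.1 * r + 0.05 * c + 5.0 * math.exp(-((r - 10) ** 2 + (c - 10) ** 2) / 2.0)
     for c in range(n)]
    for r in range(n)
]
flat, baseline = remove_baseline(img, radius=4)
```

The window has to be clearly wider than the peaks, otherwise the median follows the peak and removes part of it together with the baseline.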
Previous answers have suggested interesting mathematical methods for removing the baseline. But I guess this question is a continuation of your previous questions, and by "image" you mean that your data is really an image. If so, you can use image processing techniques to find the peaks and flatten the areas around them.
1. Preprocessing
Before applying different filters, it is better to map the pixel values to a fixed range. This way we have better control over the values of the filter parameters.
First we convert the image data type to double, for cases when the pixel values are integers.
I = double(I);
Then we reduce the noise in the image by applying an averaging (disk) filter.
SI = imfilter(I,fspecial('disk',40),'replicate');
Finally, we map the values of all the pixels to the range of zero to one.
NI = SI-min(SI(:));
NI = NI/max(NI(:));
2. Segmentation
After preparing the image, we can identify the parts where each of the peaks is located. To do this, we first calculate the image gradient.
G = imgradient(NI,'sobel');
Then we identify the parts of the image that have a higher slope. Since "high slope" may have different meanings in different images, we use the graythresh function to divide the image into two parts, low slope and high slope.
SA = im2bw(G, graythresh(G));
The segmented areas in the previous step can have several problems:
Small continuous components, which are categorized as part of high slope area, may be caused only by noise. Therefore, components with an area less than a threshold value should be removed.
Due to the fact that the slope reaches zero at the top of the peaks, there will likely be holes in the components found in the previous step.
The slope of a peak is not necessarily the same along its boundary, so the found areas can have irregular shapes. One solution is to expand them by replacing them with their convex hulls.
[L, nPeaks] = bwlabel(SA);
minArea = 0.03*numel(I);
P = false(size(I));
for i=1:nPeaks
P_i = bwconvhull(L==i);
area = sum(P_i(:));
if (area>minArea)
P = P|P_i;
end
end
3. Removing Baseline
The P matrix calculated in the previous step contains ones at the peaks and zeros everywhere else. At this point we could already remove the baseline by multiplying the original image element-wise by this matrix. But it is better to first soften the edges of the found areas so that the edges of the peaks do not drop abruptly to zero.
FC = imfilter(double(P),fspecial('disk',50),'replicate');
F = I.*FC;
You can also shift the peaks down by the lowest image value found along their edges.
E = bwmorph(P, 'remove');
o = min(I(E));
T = max(0, F-o);
All the above steps in one function
function [hlink, T] = removeBaseline(I, demoSteps)
% hlink is only populated when demoSteps is true
hlink = [];
% converting image to double
I = double(I);
% smoothing image to reduce noise
SI = imfilter(I,fspecial('disk',40),'replicate');
% normalizing image in [0..1] range
NI = SI-min(SI(:));
NI = NI/max(NI(:));
% computing image gradient
G = imgradient(NI,'sobel');
% finding steep areas of the image
SA = im2bw(G, graythresh(G));
% segmenting image to find regions covered by each peak
[L, nPeaks] = bwlabel(SA);
% defining a threshold for minimum area covered by each peak
minArea = 0.03*numel(I);
% filling each of the regions, and eliminating small ones
P = false(size(I));
for i=1:nPeaks
% finding convex hull of the region
P_i = bwconvhull(L==i);
% computing area of the filled region
area = sum(P_i(:));
if (area>minArea)
% adding the region to peaks mask
P = P|P_i;
end
end
% applying the average filter on peaks mask to compute coefficients
FC = imfilter(double(P),fspecial('disk',50),'replicate');
% Removing baseline by multiplying the coefficients
F = I.*FC;
% finding edge of peaks
E = bwmorph(P, 'remove');
% finding minimum value of edges in the image
o = min(I(E));
% shifting the flattened image
T = max(0, F-o);
if demoSteps
figure
subplot 231, imshow(I, []); title('Original Image');
subplot 232, imshow(SI, []); title('Smoothed Image');
subplot 233, imshow(NI); title('Normalized in [0..1]');
subplot 234, imshow(G, []); title('Gradient of Image');
subplot 235, imshow(SA); title('Steep Areas');
subplot 236, imshow(P); title('Peaks');
figure;
subplot 221, imshow(FC); title('Flattening Coefficients');
subplot 222, imshow(F, []); title('Base Line Removed');
subplot 223, imshow(E); title('Peak Edge');
subplot 224, imshow(T, []); title('Final Result');
figure
h1 = subplot(1, 3, 1);
surf(I, 'edgecolor', 'none'); hold on;
contour3(I, 'k', 'levellist', o, 'linewidth', 2)
h2 = subplot(1, 3, 2);
surf(F, 'edgecolor', 'none'); hold on;
contour3(F, 'k', 'levellist', o, 'linewidth', 2)
h3 = subplot(1, 3, 3);
surf(T, 'edgecolor', 'none');
hlink = linkprop([h1 h2 h3],{'CameraPosition','CameraUpVector', 'xlim', 'ylim', 'zlim', 'clim'});
set(h1, 'zlim', [0 max(I(:))])
set(h1, 'ylim', [0 size(I, 1)])
set(h1, 'xlim', [0 size(I, 2)])
set(h1, 'clim', [0 max(I(:))])
end
end
To execute the function with an image containing several peaks with noise:
close all; clc; clear variables;
I = abs(peaks(1200));
J1 = imnoise(ones(size(I))*0.5,'salt & pepper', 0.05);
J1 = imfilter(double(J1),fspecial('disk',20),'replicate');
[X, Y] = meshgrid(linspace(0, 1, size(I, 2)), linspace(0, 1, size(I, 1)));
J2 = X.^3+Y.^2;
I = max(I, 2*J2) + 5*J1;
lp3 = removeBaseline(I, true);
To call the function for an image read from file:
I = rgb2gray(imread('imagefile.jpg'));
[~, I2] = removeBaseline(I, true);
Results for images provided in previous questions:
I have a solution in Python, but I guess it would not be too complicated to port it to MATLAB.
It works with a bunch of peaks. I made a few assumptions, though, like that there are several peaks. It works with one, but is better if there are a few peaks. Peaks may overlap. The main assumption is of course the shape of the background, but this can be modified if other models exist.
The main idea is to subtract a function but forbidding negative values. This is done via an extra cost function, which I keep differentiable for the sake of minimization. As a consequence, there might be issues for values near zero. Such cases can be handled by iterating on how sharp the extra cost comes in. One would start with a moderate slope of about one and re-do the fit with a steeper slope and starting values from the previous fit. I've done that on similar problems and it works ok. Technically, it is not excluded that there are small negative values after subtracting the fit-solution, so for image data an extra step would be necessary to take care of that.
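To make the role of the slope parameter concrete, here is a minimal standard-library sketch of the smooth penalty term on its own. The function name `penalty` and the default values are chosen for illustration; the formula is the tanh expression used in the residual function of this answer.

```python
import math


def penalty(diff, extracost=1e4, slope=50.0):
    """Smooth step: about extracost for diff << 0, about 0 for diff >> 0."""
    return -extracost * (math.tanh(slope * diff) - 1.0) / 2.0


# A small positive difference is still penalized unless the slope is steep,
# which is why the fit may need to be repeated with increasing slope.
for slope in (1.0, 50.0, 500.0):
    print(slope, penalty(0.01, slope=slope))
```

With a moderate slope the step is wide, so values slightly above zero still pay a noticeable cost; a steeper slope narrows that transition region.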
Here is the code
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D  # registers the '3d' projection on older matplotlib
import numpy as np
from scipy.optimize import least_squares


def peak( x, y, a, x0, y0, s ):
    """
    Just a symmetric peak for testing
    """
    return a * np.exp( -( ( x - x0 )**2 + ( y - y0 )**2 ) / 2 / s**2 )


def second_order( xx, yy, aa, bb, cc, dd, ee, ff ):
    """
    Assuming that the base can be approximated by a second order equation
    generalization to higher orders should be straight forward
    """
    out = aa * xx**2 + 2 * bb * xx * yy + cc * yy**2 + dd * xx + ee * yy + ff
    return out


def residual_function( params, xa, ya, za, extracost, slope ):
    """
    cost function. Calculates difference from zero-plane
    with ultra high cost for negative values.
    previous solutions to similar problems have shown that sometimes
    the optimization process has to be iterated with increasing
    parameter slope ( and maybe extracost )
    """
    aa, bb, cc, dd, ee, ff = params
    ### subtract such that values become as small as possible
    diffarray = za - second_order( xa, ya, aa, bb, cc, dd, ee, ff )
    diffarray = diffarray.flatten( )
    ### BUT high costs for negative values (vectorized; np.float no longer exists in recent NumPy)
    cost = -extracost * ( np.tanh( slope * diffarray ) - 1 ) / 2.0
    return np.abs( cost ) + np.abs( diffarray )


### some test data
xl = np.linspace( -3, 5, 50 )
yl = np.linspace( -1, 7, 60 )
XX, YY = np.meshgrid( xl, yl )
VV = second_order( XX, YY, 0.1, 0.2, 0.08, 0.28, 1.9, 1.3 )
VV = VV + peak( XX, YY, 65, 1, 2, 0.3 )
# ~VV = VV + peak( XX, YY, 55, 3, 4, 0.5 )
# ~VV = VV + peak( XX, YY, 55, -1, 0, 0.4 )
# ~VV = VV + peak( XX, YY, 55, -3, 6, 0.7 )

### the optimization
result = least_squares( residual_function, x0=( 0.0, 0.0, 0.00, 0.0, 0.0, 0 ), args=( XX, YY, VV, 1e4, 50 ) )
print( result )
print( result.x )
subtractme = second_order( XX, YY, *( result.x ) )
nobase = VV - subtractme

### plotting
fig = plt.figure()
ax = fig.add_subplot( 1, 2, 1, projection='3d' )
ax.plot_surface( XX, YY, VV )
bx = fig.add_subplot( 1, 2, 2, projection='3d' )
bx.plot_surface( XX, YY, nobase )
plt.show()
provides
<< [0.092135 0.18974991 0.06144773 0.37054049 2.05096262 0.88314363]
and
Related
I am trying to model some measures (lux, ohm) that behave as a logarithmic function.
In order to do it, I've tried to model it with MATLAB by projecting the real values using natural logarithms, then use polyfit to get a linear expression. Then, I want to isolate the variable lux.
What I have so far is:
% Original values from spreadsheet
lux = [1, 5, 10, 50, 100]';
ohm = [100, 35, 25, 8, 6]';
% Logarithmic fitting
x_ = log(lux);
y_ = log(ohm);
p = polyfit(x_,y_,1)
x1 = linspace(x_(1), x_(end), 1000);
y1 = polyval(p,x1);
plot(x_,y_,'ob');
hold on;
plot(x1,y1, 'r');
hold on;
% Get expression
% y = -0.6212x + 4.5944
% log(ohm) = -0.6212 * log(lux) + 4.5944
lfun = @(ohm, a, b) ((ohm/exp(b)).^(1/a));
a = p(1);
b = p(2);
t = linspace(1,100,100);
plot(log(t), log(lfun(t, a, b)), '--g');
hold off;
legend({'points','poly','eq'});
Since I got p = -0.6212 4.5944, I assume that the equation is log(ohm) = -0.6212 * log(lux) + 4.5944. Isolating the variable lux results in:
lux = (ohm / exp(4.5944) ).^(-1/0.6212)
However, as can be seen in the green line, it is not working!
What am I doing wrong?
You're doing everything right except for the plotting. In the first plot you defined the x-axis to be log(lux) and the y-axis to be log(ohm); to stay consistent with that in the second plot, you need to swap the arguments:
plot(log(lfun(t, a, b)), log(t), '--g')
t refers to 'ohm' and must therefore be displayed on the y-axis to coincide with the first plot.
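As a cross-check of the algebra, here is a minimal standard-library Python sketch of the same log-log fit and its inversion, using the lux/ohm values from the question. The helper names `fit_line`, `ohm_from_lux` and `lux_from_ohm` are made up for this example.

```python
import math

lux = [1, 5, 10, 50, 100]
ohm = [100, 35, 25, 8, 6]


def fit_line(xs, ys):
    """Ordinary least-squares line y = a*x + b."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, my - a * mx


# Fit log(ohm) = a*log(lux) + b
a, b = fit_line([math.log(v) for v in lux], [math.log(v) for v in ohm])


def ohm_from_lux(lux_value):
    """Forward model: ohm = exp(b) * lux^a."""
    return math.exp(b) * lux_value ** a


def lux_from_ohm(ohm_value):
    """Inverse model: lux = (ohm / exp(b))^(1/a)."""
    return (ohm_value / math.exp(b)) ** (1.0 / a)


print(a, b)
print(lux_from_ohm(ohm_from_lux(42.0)))
```

The round trip through both functions returns the input, which confirms that the isolated expression for lux is correct and that only the axis order in the plot was off.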
My goal is to fit a sinusoid to data coming from a datalogger using Octave.
The datalogger logs a force that is produced by an eccentric, so in theory it should be a sine wave.
I could not find any hint on how to do this elsewhere.
Currently I'm using the function "splinefit" followed by "ppval" to fit my data, but I don't really get the results I hoped for.
Does anybody have an idea how I could fit a sinusoid to my data?
Here's the code I currently use to fit the data, and a screenshot of the result:
## splinefit force left
spfFL = splinefit(XAxis,forceL,50);
fitForceL=ppval(spfFL,XAxis);
##middle force left
meanForceL=mean(fitForceL);
middleedForceL=fitForceL-meanForceL;
Result of the spline fit:
The X-axis shows the 30'000 measurement points (log entries).
The Y-axis shows the measured force values.
The data comes from the datalogger as a .csv file like this:
You can do a simple regression using the sine and cosine of your (time) input as your regression features.
Here's an example
% Let's generate a dataset from a known sinusoid as an example
N = 1000;
Range = 100;
w = 0.25; % known frequency (e.g. from specs or from fourier analysis)
Inputs = randi(Range, [N, 1]);
Targets = 0.5 * sin( w * Inputs + pi/3 ) + 0.05 * randn( size( Inputs ) );
% Y = A + B sin(wx) + C cos(wx); <-- this is your model
Features = [ ones(N, 1), sin(w * Inputs), cos(w * Inputs) ];
Coefs = pinv(Features) * Targets;
A = Coefs(1); % your solutions
B = Coefs(2);
C = Coefs(3);
% plot the fitted solution against the input dataset
figure('position', [0, 0, 800, 400])
ax1 = axes()
plot(Inputs, Targets, 'o', 'markersize', 10, ...
'markeredgecolor', [0, 0.25, 0.5], ...
'markerfacecolor', [0, 0.5, 1], ...
'linewidth', 1.5)
set(ax1, 'color', [0.9, 0.9, 0.9])
ax2 = axes()
X = 1:0.1:Range;
plot( X, A + B*sin(w * X) + C*cos(w * X), 'k-', 'linewidth', 5 ); hold on
plot( X, A + B*sin(w * X) + C*cos(w * X), 'g-', 'linewidth', 2 ); hold off
set(ax2, 'xlim', get(ax1, 'xlim'), 'ylim', get(ax1, 'ylim'), 'color', 'none')
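If you need the amplitude and phase rather than B and C, the identity R*sin(w*x + phi) = R*cos(phi)*sin(w*x) + R*sin(phi)*cos(w*x) gives B = R*cos(phi) and C = R*sin(phi). A minimal Python sketch of that conversion follows; the function name `amplitude_phase` is made up for this example.

```python
import math


def amplitude_phase(B, C):
    """Convert B*sin(w*x) + C*cos(w*x) into R*sin(w*x + phi)."""
    R = math.hypot(B, C)
    phi = math.atan2(C, B)
    return R, phi


# Coefficients corresponding to the example's true signal 0.5*sin(w*x + pi/3)
B_true = 0.5 * math.cos(math.pi / 3)
C_true = 0.5 * math.sin(math.pi / 3)
R, phi = amplitude_phase(B_true, C_true)
print(R, phi)
```

Applied to the fitted Coefs(2) and Coefs(3), this should return values close to the 0.5 and pi/3 used to generate the data, up to the noise.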
You could do a least-squares optimization using fminsearch:
% sine to fit (in your case your data)
x = 0:0.01:50;
y = 2.6*sin(1.2*x+3.1) + 7.3 + 0.2*rand(size(x)); % create some noisy sine with known parameters
% function with parameters
fun = @(x,p) p(1)*sin(p(2)*x+p(3)) + p(4); % sine wave with 4 parameters to estimate
fcn = @(p) sum((fun(x,p)-y).^2); % cost function to minimize the sum of the squares
% initial guess for parameters (all zeros is a weak start; values near the expected ones usually converge more reliably)
p0 = [0 0 0 0];
% parameter optimization
par = fminsearch(fcn, p0);
% see if estimated parameters match measured data
yest = fun(x, par);
plot(x,y,x,yest)
Replace x and y with your data. The par variable contains the parameters of the sine, as defined in fun.
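Because this is a nonlinear fit, the starting point matters. One possible way to get a rough initial guess, sketched here in standard-library Python, is to take the offset from the mean, the amplitude from the value range, and the frequency from the spacing of upward zero crossings. This is an assumption-laden heuristic, not part of the answer above: it expects clean or pre-smoothed data covering several periods, and the function name `initial_sine_guess` is made up.

```python
import math


def initial_sine_guess(xs, ys):
    """Rough [amplitude, omega, phase, offset] for y = a*sin(omega*x + phase) + offset."""
    offset = sum(ys) / len(ys)
    amplitude = (max(ys) - min(ys)) / 2.0
    centered = [y - offset for y in ys]
    # x positions where the centered signal crosses zero going upward
    ups = [xs[i] for i in range(1, len(xs)) if centered[i - 1] < 0 <= centered[i]]
    if len(ups) < 2:
        raise ValueError("need at least two upward zero crossings")
    period = (ups[-1] - ups[0]) / (len(ups) - 1)
    omega = 2.0 * math.pi / period
    # sin(omega*x + phase) crosses zero upward where omega*x + phase = 2*pi*k
    phase = (-omega * ups[0]) % (2.0 * math.pi)
    return [amplitude, omega, phase, offset]


# Noise-free version of the sine used in the example above
xs = [0.01 * i for i in range(5001)]
ys = [2.6 * math.sin(1.2 * x + 3.1) + 7.3 for x in xs]
guess = initial_sine_guess(xs, ys)
print(guess)
```

The returned list is in the same order as the parameter vector p of fun, so it could be passed as p0. On noisy data the zero-crossing count becomes unreliable, so smoothing first (or taking the frequency from a Fourier analysis) would be needed.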
I have a set of points that I want to fit a line through. In most cases I end up getting Inf or -Inf, especially when the lines are either vertical or horizontal. I have seen MATLAB's description of centering and scaling, but I do not understand how to apply it to my data. Below is some example code; please note, however, that it isn't exactly the code with the issue. I have used it because the main code would be too long to follow.
x = [0, 1.81, 3.64, 5.45, 7.27];
y = [1, -0.82, -2.64, -4.45, -6.27];
fitline = polyfit([y(1), y(2), y(3), y(4)], [x(1), x(2), x(3), x(4)], 1);
%plot the data
k = linspace(0, 10, 5);
fk = (fitline(1)*k) + fitline(2);
figure, plot(k, fk, 'Color', 'r', 'linewidth', 1);
Looking forward to any help/suggestions/advice. Thanks!
MATLAB's polyfit and polyval functions will handle the centering (subtracting the mean) and scaling (dividing by the standard deviation) for you. Request the third output of polyfit to get the parameters:
x = [0, 1.81, 3.64, 5.45, 7.27];
y = [1, -0.82, -2.64, -4.45, -6.27];
[fitline,~,mu] = polyfit(y(1:4),x(1:4), 1);
And pass them to polyval:
k = linspace(0, 10, 5);
fk = polyval(fitline,k,[],mu);
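What centering and scaling does can be sketched in a few lines of standard-library Python: the line is fitted against z = (x - mean) / std instead of x, and the same transform is applied before evaluating. The helper names `fit_line_scaled` and `eval_line_scaled` are made up for this example; the data are the first four points from the question, with y as the independent variable as in the polyfit call.

```python
from statistics import mean, stdev


def fit_line_scaled(xs, ys):
    """Fit y = slope*z + intercept with z = (x - mu) / sigma."""
    mu, sigma = mean(xs), stdev(xs)  # sample standard deviation (N-1)
    zs = [(x - mu) / sigma for x in xs]
    zbar, ybar = mean(zs), mean(ys)
    slope = (sum((z - zbar) * (y - ybar) for z, y in zip(zs, ys))
             / sum((z - zbar) ** 2 for z in zs))
    intercept = ybar - slope * zbar
    return slope, intercept, mu, sigma


def eval_line_scaled(coeffs, x):
    """Evaluate the fitted line at x, applying the same centering and scaling."""
    slope, intercept, mu, sigma = coeffs
    return slope * (x - mu) / sigma + intercept


y_vals = [1, -0.82, -2.64, -4.45]  # independent variable in the polyfit call
x_vals = [0, 1.81, 3.64, 5.45]     # dependent variable
coeffs = fit_line_scaled(y_vals, x_vals)
print(eval_line_scaled(coeffs, 1.0), eval_line_scaled(coeffs, -4.45))
```

The fitted coefficients only make sense together with mu and sigma, which is why they have to be passed on to the evaluation step.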
I have this binary image bw:
It shows the edges of an object on which I want to carry out some measurements. First, though, I have to fit a curve to each of the two edges. The result should be two smooth curves representing the edges.
I have the indices of the pixels on each edge, but I cannot see how to turn them into x-y data for a fitting function. They are not pairs of x and f(x); every edge pixel has the same value (1) and differs only in position. Writing [x y]=find(BW) does not seem right, because y is then not the value at x, but surely there is a way to use the positions to put the binary image on some kind of scale. I am confused and stuck here.
Any recommendations?
Why not use polyfit?
[y x] = find( bw ); %// get x y coordinates of all curve points
There are two small "tricks" you need to use here:
You have two curves, so you need to split your data points into the left and right curves:
right = x<300;
xr = x(right);
yr = y(right);
xl = x(~right);
yl = y(~right);
Since your curves are close to vertical, it would be better to fit x=f(y) rather than the "classic" y=f(x):
pr = polyfit( yr, xr, 3 ); %// fit 3rd deg poly
pl = polyfit( yl, xl, 3 );
Now you can plot them
yy = linspace( 1, size(bw,1), 50 );
figure; imshow(bw, 'border', 'tight' );
hold all
plot( polyval( pr, yy ), yy, '.-', 'LineWidth', 1 );
plot( polyval( pl, yy ), yy, '.-', 'LineWidth', 1 );
And you get:
If you want to create a new refined mask from the estimated curves, you can do the following:
yy = 1:size(bw,1); %// single value per row
xxr=polyval(pr,yy); %// corresponding column values
xxl=polyval(pl,yy);
Set a new mask of the same size
nbw = false(size(bw));
nbw( sub2ind(size(bw),yy,round(xxr)) )=true;
nbw( sub2ind(size(bw), yy, round(xxl) )) = true;
And the result
figure;imshow(nbw,'border','tight');
I have a set of 3d data (300 points) that create a surface which looks like two cones or ellipsoids connected to each other. I want a way to find the equation of a best fit ellipsoid or cone to this dataset. The regression method is not important, the easier it is the better. I basically need a way, a code or a matlab function to calculate the constants of the elliptic equation for these data.
You can also try fminsearch, but to avoid getting stuck in local minima you will need a good starting point given the number of coefficients (try to eliminate some of them).
Here is an example with a 2D ellipse:
% implicit equation
fxyc = @(x, y, c_) c_(1)*x.^2 + c_(2).*y.^2 + c_(3)*x.*y + c_(4)*x + c_(5).*y - 1; % free term locked to -1
% solution (ellipse)
c_ = [1, 2, 1, 0, 0]; % x^2, y^2, x*y, x, y (free term is locked to -1)
[X,Y] = meshgrid(-2:0.01:2);
figure(1);
fxy = @(x, y) fxyc(x, y, c_);
c = contour(X, Y, fxy(X, Y), [0, 0], 'b');
axis equal;
grid on;
xlabel('x');
ylabel('y');
title('solution');
% we sample the solution to have some data to fit
N = 100; % samples
sample = unique(2 + floor((length(c) - 2)*rand(1, N)));
x = c(1, sample).';
y = c(2, sample).';
x = x + 5e-2*rand(size(x)); % add some noise
y = y + 5e-2*rand(size(y));
fc = @(c_) fxyc(x, y, c_); % function in terms of the coefficients
e = @(c) fc(c).' * fc(c); % squared error function
% we start with a circle
c0 = [1, 1, 0, 0, 0];
copt = fminsearch(e, c0)
figure(2);
plot(x, y, 'rx');
hold on
fxy = @(x, y) fxyc(x, y, copt);
contour(X, Y, fxy(X, Y), [0, 0], 'b');
hold off;
axis equal;
grid on;
legend('data', 'fit');
xlabel('x'); %# Add an x label
ylabel('y');
title('fitted solution');
The matlab function fit can take arbitrary fit expressions. It takes a bit of figuring out the parameters but it can be done.
You would first create a fittype object from a string representing the expected form. You'll need to work out for yourself the expression that best fits what you're expecting; as an example, I'm going to take a cone expression from the MathWorld site and rearrange it for z:
ft = fittype('sqrt((x^2 + y^2)/c^2) + z_0', ...
'independent', {'x', 'y'}, 'coeff', {'c', 'z_0'});
If it's a simple form matlab can work out which are the variables and which the coefficients but with something more complex like this you'd want to give it a hand.
The 'fitoptions' object holds the configuration for the methods: depending on your dataset you might have to spend some time specifying upper and lower bounds, starting values etc.
fo = fitoptions('Upper', [one, for, each, of, your, coeffs, in, the, order, they, appear, in, the, string], ...
'Lower', [...], 'StartPoint', [...]);
then get the output
[fitted, gof] = fit([xvals, yvals], zvals, ft, fo);
Caveat: I've done this plenty with 2D datasets and the docs state it works for three but I haven't done that myself so the above code might not work, check the docs to make sure you've got your syntax right.
It might be worth starting with a simple fit expression, something linear, so that you can get your code working. Then swap the expression out for the cone and play around until you get something that looks like what you're expecting.
After you've got your fit a good trick is that you can use the eval function on the string expression you used in your fit to evaluate the contents of the string as if it was a matlab expression. This means you need to have workspace variables with the same names as the variables and coefficients in your string expression.