Adaptive Thresholding - Implementation of the Minimum Error Thresholding Method - matlab

I'm trying to implement the Minimum Error Thresholding method (by J. Kittler and J. Illingworth) in MATLAB.
You may have a look at the PDF:
Scribd - Minimum Error Thresholding.
DocDroid - Minimum Error Thresholding.
My code is:
function [ Level ] = MET( IMG )
%MET Minimum Error Thresholding by Kittler and Illingworth.
%   Finds the minimum of a cost function J over all possible thresholds.
%   The function output is the optimal threshold.
for t = 0:255 % Assuming an 8-bit image
    I1 = IMG;
    I1 = I1(I1 <= t);
    q1 = sum(hist(I1, 256));
    I2 = IMG;
    I2 = I2(I2 > t);
    q2 = sum(hist(I2, 256));
    % J is proportional to the overlapping area of the 2 assumed Gaussians
    J(t + 1) = 1 + 2 * (q1 * log(std(I1, 1)) + q2 * log(std(I2, 1))) ...
        - 2 * (q1 * log(q1) + q2 * log(q2));
end
[~, Level] = min(J);
%Level = (IMG <= Level);
end
I've tried it on the following image:
Original size image.
The target is to extract a binary image of the letters (Hebrew Letters).
I applied the code on sub blocks of the image (40 x 40).
Yet I got results that are inferior to the K-Means clustering method.
Did I miss something?
Does anyone have a better idea?
Thanks.
P.S.
Would anyone add "Adaptive-Thresholding" to the subject tags? (I can't, as I'm new.)

Thresholding is a rather tricky business. In the many years I've been thresholding images, I have not found a single technique that always performs well, and I have come to distrust claims of universally excellent performance in CS journals.
The minimum error thresholding method only works on nicely bimodal histograms (but it works well on those). In your image, signal and background may not be separated clearly enough for this thresholding method to work.
If you want to make sure that the code works fine, you could create a test program like the sketch below and check both whether you get a good initial segmentation and at what level of 'bimodality' the code breaks down.
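A minimal sketch of such a test (the class means, sigma, and separations here are arbitrary choices of mine, not from the question): generate two Gaussian classes with means 64 and 64 + d, and watch where the estimated threshold starts to fail as d shrinks.
% Synthetic bimodality test for MET (all numbers are arbitrary choices)
for d = [128 64 32 16 8]
    img = [randn(100,100)*10 + 64; randn(100,100)*10 + 64 + d];
    img = min(max(round(img), 0), 255); % clamp to the 8-bit range
    fprintf('separation %3d -> MET level %d\n', d, MET(img));
end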

I think your code is not fully correct. You use the absolute histogram of the image instead of the relative histogram which is used in the paper. In addition, your code is rather inefficient, as it computes two histograms per possible threshold. I implemented the algorithm myself; maybe someone can make use of it:
function [ optimalThreshold, J ] = kittlerMinimimErrorThresholding( img )
%KITTLERMINIMIMERRORTHRESHOLDING Compute an optimal image threshold.
%   Computes the Minimum Error Threshold as described in
%
%   'J. Kittler and J. Illingworth, "Minimum Error Thresholding," Pattern
%   Recognition 19, 41-47 (1986)'.
%
%   The image 'img' is expected to have integer values from 0 to 255.
%   'optimalThreshold' holds the found threshold. 'J' holds the values of
%   the criterion function.

%Initialize the criterion function
J = Inf * ones(255, 1);

%Compute the relative histogram
histogram = double(histc(img(:), 0:255)) / size(img(:), 1);

%Walk through every possible threshold. However, T is interpreted
%differently than in the paper. It is interpreted as the lower boundary of
%the second class of pixels rather than the upper boundary of the first
%class. That is, an intensity of value T is treated as being in the same
%class as higher intensities rather than lower intensities.
for T = 1:255
    %Split the histogram at the threshold T.
    histogram1 = histogram(1:T);
    histogram2 = histogram((T+1):end);

    %Compute the number of pixels in the two classes.
    P1 = sum(histogram1);
    P2 = sum(histogram2);

    %Only continue if both classes contain at least one pixel.
    if (P1 > 0) && (P2 > 0)
        %Compute the standard deviations of the classes.
        mean1 = sum(histogram1 .* (1:T)') / P1;
        mean2 = sum(histogram2 .* (1:(256-T))') / P2;
        sigma1 = sqrt(sum(histogram1 .* (((1:T)' - mean1) .^2) ) / P1);
        sigma2 = sqrt(sum(histogram2 .* (((1:(256-T))' - mean2) .^2) ) / P2);

        %Only compute the criterion function if both classes contain at
        %least two intensity values.
        if (sigma1 > 0) && (sigma2 > 0)
            %Compute the criterion function.
            J(T) = 1 + 2 * (P1 * log(sigma1) + P2 * log(sigma2)) ...
                 - 2 * (P1 * log(P1) + P2 * log(P2));
        end
    end
end

%Find the minimum of J.
[~, optimalThreshold] = min(J);
optimalThreshold = optimalThreshold - 0.5;
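For reference, a minimal usage sketch (assuming the Image Processing Toolbox demo image coins.png is available; any 8-bit grayscale image works):
img = imread('coins.png');
[T, J] = kittlerMinimimErrorThresholding(img);
bw = img > T; % T is a half-integer between two intensity levels
figure; imshow(bw);
figure; plot(J); xlabel('threshold T'); ylabel('criterion J(T)');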

Related

Matlab: Using imhist() with equal-sized bins

I use both MATLAB and OpenCV to produce a grayscale histogram, divided into 10 bins.
In OpenCV, each bin covers an equal range (i.e. [0,25], [26,51], [52,77], ...).
However, in MATLAB the bin sizes are not equal (I guess it's related to some theory about different sensitivity to intensity changes between lower and higher values).
These different results cause big trouble for me.
Is there an option to get equal bin sizes in MATLAB? (Of course, except for the option of implementing it myself...)
Answering my own question with a self-implemented function:
function h = fixedSizeBinnedHist(grayImg, numBins)
binSize = 256 / numBins;
binnedImg = floor(double(grayImg) / binSize);
maxVal = max(binnedImg(:));
numLeadingZeros = min(binnedImg(:));
numTrailingZeros = numBins - maxVal - 1;
% First, compute the histogram over the occupied range of bins
h = hist(double(binnedImg(:)), maxVal - numLeadingZeros + 1);
leading = zeros(1, numLeadingZeros);
trailing = zeros(1, numTrailingZeros);
% Finally, attach the needed zeros on both sides, so the histogram has the requested size
h = [leading h trailing];
end
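An alternative sketch of the same idea using histc with explicit fixed-width edges (the helper name is mine; histc puts values equal to the last edge into an extra bin, which is folded back in below):
function h = fixedSizeBinnedHistAlt(grayImg, numBins)
% Equal-width bins over [0, 256) using explicit edges.
edges = linspace(0, 256, numBins + 1);
h = histc(double(grayImg(:)), edges)';
h(end-1) = h(end-1) + h(end); % fold the "== last edge" bin back in
h = h(1:end-1);
end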

Measure similar information using Kullback-Leibler (KL) distance matlab code

I am trying to implement the distance measurement between two distributions. The details are described in the linked paper and input image. Let me briefly summarize the idea of the paper:
The input image is divided into an inside region and an outside region using the Heaviside function H.
Calculate the distributions of the inside region and the outside region, where phi is the boundary between inside and outside.
Calculate the Kullback-Leibler distance.
I implemented that scheme, but I have three problems:
Is the log function in the paper log or log2 in MATLAB?
Log(0) is infinite, but we know the distributions will contain many 0 values. How do I handle that? In my case I add eps, while some people set h1(h1==0)=1; which is correct?
Could you look at my code? Is it correct? I am not sure about my implementation.
This is my code to implement that scheme:
function main()
Img = imread('1.bmp'); % please download at the link above
Img = double(Img(:,:,1));
%% Initial boundary
c0 = 2; % const value
phi = ones(size(Img(:,:,1))) .* c0;
phi(26:32, 28:34) = -c0;
%% Heaviside function
epsilon = 1;
Hu = 0.5 * (1 + (2/pi) * atan(phi ./ epsilon));
%% Inside and outside image
inImg = Img .* (1 - Hu);
outImg = Img .* Hu;
%% Calculate the KL distance
h1 = histogram(inImg, 256, 0, 255);  % histogram of inside
h2 = histogram(outImg, 256, 0, 255); % histogram of outside
lamda1 = KLdist(h1, h2) % distance from h1 to h2
lamda2 = KLdist(h2, h1) % distance from h2 to h1
end

%%%%%%%%%% function for KL distance %%%%%%%%%%%%%%%
function [d1, d2] = KLdist(h1, h2)
d1 = sum(h1 .* log2(h1 + eps) - h1 .* log2(h2 + eps))
d2 = sum(h2 .* log2(h2 + eps) - h2 .* log2(h1 + eps))
end

%%%%%%%%%% function for histogram calculation %%%%%%
function [h, bins] = histogram(I, n, min, max)
I = I(:);
range = max - min;
drdb = range / double(n); % dr/db - change in range per bin
h = zeros(n, 1);
bins = zeros(n, 1);
for i = 1:n
    % note: while the instructions say "within integer round off" I'm leaving
    % these as float bin edges, to handle potential float input,
    % i.e. say the input was a probability image.
    low = min + (i-1) * drdb;
    high = min + i * drdb;
    h(i) = sum( (I >= low) .* (I < high) );
    bins(i) = low;
end
h(n) = h(n) + sum( (I >= (n*drdb)) .* (I <= max) ); % include anything we may have missed in the last bin
h = h ./ sum(h); % "relative frequency"
end
Let me answer one by one.
Is the log function in the paper log or log2 in MATLAB?
Ans: It is the natural log. In MATLAB, you just call log().
Log(0) is infinite, but the distributions will contain many 0 values. How do I handle that?
Ans: To avoid taking the log of zero, add some small value, as in log(x+eps) or log(x+(x==0)*eps), where x holds your values.
Could you look at my code? Is it correct?
Ans: Your code looks fine. You can build on my suggestions to improve it. Good luck.
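Putting both suggestions together, a minimal sketch of a fixed KL function (natural log, eps added only where the histogram is zero; the helper name is mine):
function d = KLdistFixed(h1, h2)
% Kullback-Leibler divergence D(h1 || h2), natural log, zero-guarded
h1 = h1 + (h1 == 0) * eps;
h2 = h2 + (h2 == 0) * eps;
d = sum(h1 .* (log(h1) - log(h2)));
end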

Gaussian random function

By using normrnd, I would like to create a normal distribution function with mean and sigma values expressed as vectors of size 1x45 varying from 1:45 and plot this simulated PDF with ideal values.
Whenever I create a normrnd like the one expressed below,
Gaussian = normrnd([1 45],[1 45],[1 500],length(c_t));
I am obtaining the following error,
Size information is inconsistent.
The reason for creating this PDF is to compute the chemical kinetics of a tracer with a variable Gaussian noise model. Basically, I have the ideal characteristics of a tracer; now I would like to add Gaussian noise and understand how the chemical kinetics of the tracer vary with changing noise.
There are different computational models for understanding the chemical kinetics of a tracer, one of which is the three-compartment model; others are shape analysis and the constrained shape analysis model.
I currently have the ideal curve for each model; now I would like to add noise to these models and understand how each particular model behaves with varying noise.
This is why I would like to create a variable noise model with normrnd, add this model to the ideal characteristics, and compute Noise (Sigma) vs. Error. This analysis will give me an approximate estimate of how different models behave with varying noise and which model is suitable for estimating the chemical kinetics of the tracer.
function [c_t, c_t_noise] = Noise_ConstrainedK2(t, a1, a2, a3, b1, b2, b3, td, tmax, k1, k2, k3)
K_1 = (k1*k2)/(k2+k3);
K_2 = (k1*k3)/(k2+k3);
%DV_free = k1/(k2+k3);
c_t = zeros(size(t));
ind = (t > td) & (t < tmax);
c_t(ind) = conv(((t(ind) - td) ./ (tmax - td) * (a1 + a2 + a3)), (K_1*exp(-(k2+k3)*t(ind)+K_2)), 'same');
ind = (t >= tmax);
c_t(ind) = conv((a1 * exp(-b1 * (t(ind) - tmax)) + a2 * exp(-b2 * (t(ind) - tmax))) + a3 * exp(-b3 * (t(ind) - tmax)), (K_1*exp(-(k2+k3)*t(ind)+K_2)), 'same');
meanAndVar = (rand(45,2) - 0.5) * 2;
numPoints = 500;
randSamples = zeros(1, numPoints);
for ii = 1:numPoints
    idx = mod(ii, size(meanAndVar,1)) + 1;
    randSamples(ii) = normrnd(meanAndVar(idx,1), meanAndVar(idx,2));
    c_t_noise = c_t + randSamples(ii);
end
scatter(1:numPoints, randSamples)
dg = [0 0.5 0];
plot(t, c_t, 'r');
hold on;
plot(t, c_t_noise, 'Color', dg);
hold off;
axis([0 50 0 1900]);
xlabel('Time [mins]');
ylabel('Concentration [MBq]');
title('My signal');
%plot(t, c_tnp);
end
The output characteristics from the above function are as follows. Here I could not visualize any noise.
The only thing remotely close to what you want can be done as follows, but it will involve looping, because you cannot request 500 data points from only 45 different means and variances without allowing sets to be revisited.
This is my interpretation of what you want, though I am still not entirely sure.
Random Gaussian Function Selection
meanAndVar = rand(45,2);
numPoints = 500;
randSamples = zeros(1, numPoints);
for ii = 1:numPoints
    randMeanVarIdx = randi([1, size(meanAndVar,1)]);
    randSamples(ii) = normrnd(meanAndVar(randMeanVarIdx,1), meanAndVar(randMeanVarIdx,2));
end
scatter(1:numPoints, randSamples)
The above code generates a random 2-D matrix of means and variances (1st col = mean, 2nd col = variance). We then preallocate some space.
Inside the loop we choose a random set of mean and variance to use (uniformly), plug it into a random Gaussian value function, and store the result.
The matrix randSamples will then contain a list of random values generated by a random set of Gaussian functions chosen in a uniformly random manner.
Sequential Function Selection
If you do not want to randomly select which function to use and would rather go sequentially, loop using the modulus to get the index of the set of values to use.
meanAndVar = (rand(45,2) - 0.5) * 2; % zero shift and make bounds [-1,1]
numPoints = 500;
randSamples = zeros(1, numPoints);
for ii = 1:numPoints
    idx = mod(ii, size(meanAndVar,1)) + 1;
    randSamples(ii) = normrnd(meanAndVar(idx,1), meanAndVar(idx,2));
end
scatter(1:numPoints, randSamples)
The problem with this statement
Gaussian = normrnd([1 45],[1 45],[1 500],length(c_t));
is that you supply two mu values and two sigma values, and ask for a matrix of size [1 500] x length(c_t). You need to pass the size in a uniform way, so either
Gaussian = normrnd(mu, sigma,[500 length(c_t)]);
or
Gaussian = normrnd(mu, sigma, 500, length(c_t));
Then you should make sure that the sizes of the mu/sigma vectors match the size of the matrix you ask for. So if you want a 500 x length(c_t) matrix as output, you need to pass 500 x length(c_t) (mu, sigma) pairs. If you only want to vary one of mu or sigma, you can pass a single value for the other parameter.
To get N values from a normal distribution with fixed mean and steadily increasing sigma you can do
noise = @(mu, s0, s1, n) normrnd(mu, s0:(s1-s0)/(n-1):s1, 1, n)
where s0 is the lowest sigma value and s1 is the largest sigma value. To get 10 values drawn from distributions with mu = 0 and sigma increasing from 1 to 5 you can do
noise(0, 1, 5, 10)
If you want to introduce some randomness in the increase of sigma you can do
noise_rand = @(mu, s0, s1, n) normrnd(mu, (s0:(s1-s0)/(n-1):s1) .* rand(1,n), 1, n)
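A hedged usage sketch on top of these helpers (the curve below is a placeholder signal of my own, not the tracer model from the question; the definition of noise is repeated so the snippet is self-contained):
noise = @(mu, s0, s1, n) normrnd(mu, s0:(s1-s0)/(n-1):s1, 1, n);
t = linspace(0, 50, 500);     % placeholder time axis
c_t = 1000 * t .* exp(-t/10); % placeholder "ideal" curve
c_t_noise = c_t + noise(0, 1, 50, numel(c_t)); % sigma ramps from 1 to 50
plot(t, c_t, 'r'); hold on; plot(t, c_t_noise, 'g'); hold off;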

Custom Algorithm for Exp. maximization in Matlab

I am trying to write an algorithm which determines $\mu$, $\sigma$, and $\pi$ for each class of a mixture of multivariate normal distributions.
The algorithm partially works: it converges when I set the random initial guesses ($\mu$, $\sigma$, $\pi$) near the true values. But when I set the initial values far from the true ones, the algorithm does not converge; the sigmas go to 0 (2.30760684053766e-24 in each dimension).
I think the problem is my covariance calculation; I am not sure this is the right way. I found it on Wikipedia.
I would be grateful if you could check my algorithm, especially the covariance part.
Have a nice day,
Thanks,
Setup: a 2-component Gaussian mixture;
size(x) = [400, 2] (400 points of a 2-dimensional Gaussian);
mu is 2 x 2 (row 1 = first component's mean, row 2 = second component's mean).
for i = 1:k
    gaussEvaluation(i,:) = pInit(i) * mvnpdf(x, muInit(i,:), sigmaInit(i,:) * eye(d));
    gaussEvaluationSum = sum(gaussEvaluation(i,:));
    %mu calculation
    for j = 1:d
        mu(i,j) = sum(gaussEvaluation(i,:) * x(:,j)) / gaussEvaluationSum;
    end
    %sigma calculation, method 1
    %for j = 1:n
    %    v = (x(j,:) - muNew(i,:));
    %    sigmaNew(i) = sigmaNew(i) + gaussEvaluation(i,j) * (v * v');
    %end
    %sigmaNew(i) = sigmaNew(i) / gaussEvaluationSum;
    %sigma calculation, method 2
    sub = bsxfun(@minus, x, mu(i,:));
    sigma(i,:) = sum(gaussEvaluation(i,:) * (sub .* sub)) / gaussEvaluationSum;
    %p calculation
    p(i) = gaussEvaluationSum / n;
end
Two points: you can observe this even when you implement Gaussian mixture EM correctly, but in your case the code does seem to be incorrect.
First, this is just a problem that you have to deal with when fitting mixtures of Gaussians. Sometimes one component of the mixture can collapse onto a single point, resulting in the mean of the component becoming that point and the variance becoming 0; this is known as a 'singularity'. The likelihood then goes to infinity.
Check out slide 42 of this deck: http://www.cs.ubbcluj.ro/~csatol/gep_tan/Bishop-CUED-2006.pdf
The likelihood function that you are evaluating is not log-concave, so the EM algorithm will not converge to the same parameters from different initial values. The link above also gives some solutions to avoid this over-fitting problem, such as putting a prior or regularization term on the parameters. You can also run multiple times with different starting parameters and discard any results with variance-0 components as having over-fitted, or just reduce the number of components you are using.
In your case, your equation is right; the covariance update on Wikipedia is the same as the one on slide 45 of the above link. However, if you are in a 2-d space, for each component the mean should be a length-2 vector and the covariance a 2x2 matrix. Hence your code (for two components) is wrong, because you have a 2x2 matrix to store the means and a 2x2 matrix to store the covariances; the covariances should be stored in a 2x2x2 array.
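For concreteness, a sketch of the M-step covariance update with full d x d matrices stored in a d x d x k array (here gamma(i,:) stands for the normalized responsibilities of component i, i.e. the role gaussEvaluation plays after normalization; the small ridge term is my addition to guard against the singularity discussed above):
[n, d] = size(x);
sigma = zeros(d, d, k);
for i = 1:k
    Nk = sum(gamma(i,:));
    sub = bsxfun(@minus, x, mu(i,:)); % n-by-d, centered on component mean
    sigma(:,:,i) = (sub' * bsxfun(@times, gamma(i,:)', sub)) / Nk;
    sigma(:,:,i) = sigma(:,:,i) + 1e-6 * eye(d); % regularization (assumption)
end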

find out the orientation, length and radius of capped rectangular object

I have an image as shown in fig. 1. I am trying to fit this binary image with a capped rectangle (fig. 2) to figure out:
the orientation (the angle between the long axis and the horizontal axis)
the length (l) and radius (R) of the object. What is the best way to do it?
Thanks for the help.
My very naive idea is to use a least-squares fit to find this information; however, I found that there is no equation for a capped rectangle. In MATLAB there is a function called rectangle that can create the capped rectangle perfectly, but it seems to be for plotting purposes only.
I solved this 2 different ways and have notes on each approach below. Each method varies in complexity so you will need to decide the best trade for your application.
First Approach: Least-Squares-Optimization:
Here I used unconstrained optimization through Matlab's fminunc() function. Take a look at Matlab's help to see the options you can set prior to optimization. I made some fairly simple choices just to get this approach working for you.
In summary, I set up a model of your capped rectangle as a function of the parameters L, W, and theta. You can include R if you wish, but personally I don't think you need it; by inspecting the model geometry and requiring continuity with the semicircular caps at each end, I think it is sufficient to let R = W. This also reduces the number of optimization parameters by one.
I made a model of your capped rectangle using boolean layers, see the cappedRectangle() function below. As a result, I needed a function to calculate finite difference gradients of the model with respect to L, W, and theta. If you don't provide these gradients to fminunc(), it will attempt to estimate these but I found that Matlab's estimates didn't work well for this application, so I provided my own as part of the error function that gets called by fminunc() (see below).
I didn't initially have your data so I simply right-clicked on your image above and downloaded: 'aRIhm.png'
To read your data I did this (creates the variable cdata):
image = importdata('aRIhm.png');
vars = fieldnames(image);
for i = 1:length(vars)
    assignin('base', vars{i}, image.(vars{i}));
end
Then I converted to double type and "cleaned-up" the data by normalizing. Note: this pre-processing was important to get the optimization to work properly, and may have been needed since I didn't have your raw data (as mentioned I downloaded your image from the webpage for this question):
data = im2double(cdata);
data = data / max(data(:));
figure(1); imshow(data); % looks the same as your image above
Now get the image sizes:
nY = size(data,1);
nX = size(data,2);
Note #1: you might consider adding the center of the capped rectangle, (xc,yc), as optimization parameters. These extra degrees of freedom will make a difference in the overall fitting results (see comment on final error function values below). I didn't set that up here but you can follow the approach I used for L, W, and theta, to add that functionality with the finite difference gradients. You will also need to setup the capped rectangle model as a function of (xc,yc).
EDIT: Out of curiosity I added the optimization over the capped rectangle center, see the results at the bottom.
Note #2: for "continuity" at the ends of the capped rectangle, let R = W. If you like, you can later include R as an explicit optimization
parameter following the examples for L, W, theta. You might even want to have say R1 and R2 at each endpoint as variables?
Below are arbitrary starting values that I used to simply illustrate an example optimization. I don't know how much information you have in your application but in general, you should try to provide the best initial estimates that you can.
L = 25;
W = L;
theta = 90;
params0 = [L W theta];
Note that you will get different results based on your initial estimates.
Next display the starting estimate (the cappedRectangle() function is defined later):
capRect0 = reshape(cappedRectangle(params0,nX,nY),nX,nY);
figure(2); imshow(capRect0);
Define an anonymous function for the error metric (errorFunc() is listed below):
f = @(x)errorFunc(x,data);
% Define several optimization parameters for fminunc():
options = optimoptions(@fminunc,'GradObj','on','TolX',1e-3, 'Display','iter');
% Call the optimizer:
tic
[x,fval,exitflag,output] = fminunc(f,params0,options);
time = toc;
disp(['convergence time (sec) = ',num2str(time)]);
% Results:
disp(['L0 = ',num2str(L),'; ', 'L estimate = ', num2str(x(1))]);
disp(['W0 = ',num2str(W),'; ', 'W estimate = ', num2str(x(2))]);
disp(['theta0 = ',num2str(theta),'; ', 'theta estimate = ', num2str(x(3))]);
capRectEstimate = reshape(cappedRectangle(x,nX,nY),nX,nY);
figure(3); imshow(capRectEstimate);
Below is the output from fminunc (for more details on each column see Matlab's help):
Iteration       f(x)           step       optimality   CG-iterations
    0         0.911579                     0.00465
    1         0.860624        10           0.00457           1
    2         0.767783        20           0.00408           1
    3         0.614608        40           0.00185           1
   ...        and so on ...
   15         0.532118     0.00488281      0.000962          0
   16         0.532118     0.0012207       0.000962          0
   17         0.532118     0.000305176     0.000962          0
You can see that the final error metric values have not decreased that much relative to the starting value, this indicates to me that the model function probably doesn't have enough degrees of freedom to really "fit" the data that well, so consider adding extra optimization parameters, e.g., image center, as discussed earlier.
EDIT: Added optimization over the capped rectangle center, see results at the bottom.
Now print the results (using a 2011 Macbook Pro):
Convergence time (sec) = 16.1053
L0 = 25; L estimate = 58.5773
W0 = 25; W estimate = 104.0663
theta0 = 90; theta estimate = 36.9024
And display the results:
EDIT: The exaggerated "thickness" of the fitting results above are because the model is trying to fit the data while keeping its center fixed, resulting in larger values for W. See updated results at bottom.
You can see by comparing the data to the final estimate that even a relatively simple model starts to resemble the data fairly well.
You can go further and calculate error bars for the estimates by setting up your own Monte-Carlo simulations to check accuracy as a function of noise and other degrading factors (with known inputs that you can generate to produce simulated data).
Below is the model function I used for the capped rectangle (note: the way I did image rotation is kind of sketchy numerically and not very robust for finite-differences but its quick and dirty and gets you going):
function result = cappedRectangle(params, nX, nY)
[x,y] = meshgrid(-(nX-1)/2:(nX-1)/2,-(nY-1)/2:(nY-1)/2);
L = params(1);
W = params(2);
theta = params(3); % units are degrees
R = W;
% Define r1 and r2 for the displaced rounded edges:
x1 = x - L;
x2 = x + L;
r1 = sqrt(x1.^2+y.^2);
r2 = sqrt(x2.^2+y.^2);
% Capped Rectangle prior to rotation (theta = 0):
temp = double( (abs(x) <= L) & (abs(y) <= W) | (r1 <= R) | (r2 <= R) );
cappedRectangleRotated = im2double(imrotate(mat2gray(temp), theta, 'bilinear', 'crop'));
result = cappedRectangleRotated(:);
return
And then you will also need the error function called by fminunc:
function [error, df_dx] = errorFunc(params,data)
nY = size(data,1);
nX = size(data,2);
% Anonymous function for the model:
model = @(params)cappedRectangle(params,nX,nY);
% Least-squares error (analogous to chi^2 in the literature):
f = @(x)sum( (data(:) - model(x) ).^2 ) / sum(data(:).^2);
% Scalar error:
error = f(params);
[df_dx] = finiteDiffGrad(f,params);
return
As well as the function to calculate the finite difference gradients:
function [df_dx] = finiteDiffGrad(fun,x)
N = length(x);
x = reshape(x,N,1);
% Pick a small delta, dx should be experimented with:
dx = norm(x(:))/10;
% define an array of dx values;
h_array = dx*eye(N);
df_dx = zeros(size(x));
f = @(x) feval(fun,x);
% Finite difference approximation ("centered difference"; error is O(h^2))
for j = 1:N
    hj = h_array(j,:)';
    df_dx(j) = ( f(x+hj) - f(x-hj) )/(2*dx);
end
return
Second Approach: use regionprops()
As others have pointed out, you can also use Matlab's regionprops(). Overall I think this could work the best, with some tuning and checking to ensure that it's doing what you expect. So the approach would be to call it like this (it certainly is a lot simpler than the first approach!):
data = im2double(cdata);
data = round(data / max(data(:)));
s = regionprops(data, 'Orientation', 'MajorAxisLength', ...
'MinorAxisLength', 'Eccentricity', 'Centroid');
And then the struct result s:
>> s
s =
Centroid: [345.5309 389.6189]
MajorAxisLength: 365.1276
MinorAxisLength: 174.0136
Eccentricity: 0.8791
Orientation: 30.9354
This gives enough information to feed into a model of a capped rectangle; one possible mapping is sketched below. At first glance this seems like the way to go, but it seems like you have your mind set on another approach (maybe the first approach above).
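A hedged sketch of that mapping (my assumptions: width = 2R and total length = l + 2R; also note the axis lengths returned by regionprops() are those of the equivalent ellipse, so treat these as starting estimates):
R = s.MinorAxisLength / 2;   % assume width = 2R
l = s.MajorAxisLength - 2*R; % assume total length = l + 2R
theta = s.Orientation;       % degrees, CCW from the horizontal axis
xc = s.Centroid(1);
yc = s.Centroid(2);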
Anyway, below is an image of the results (in red) overlaid on top of your data which you can see looks quite good:
EDIT: I couldn't help myself, I suspected that by including the image center as an optimization parameter, much better results could be obtained, so I went ahead and did it just to check. Sure enough, with the same starting estimates used earlier in the Least-Squares Estimation, here are the results:
Iteration       f(x)           step       optimality   CG-iterations
    0         0.911579                     0.00465
    1         0.859323        10           0.00471           2
    2         0.742788        20           0.00502           2
    3         0.530433        40           0.00541           2
   ...        and so on ...
   28         0.0858947    0.0195312       0.000279          0
   29         0.0858947    0.0390625       0.000279          1
   30         0.0858947    0.00976562      0.000279          0
   31         0.0858947    0.00244141      0.000279          0
   32         0.0858947    0.000610352     0.000279          0
By comparison with the earlier values we can see that the new least-square error values are quite a bit smaller when including the image center, confirming what we suspected earlier (so no big surprise).
The updated estimates for the capped rectangle parameters are thus:
Convergence time (sec) = 96.0418
L0 = 25; L estimate = 89.0784
W0 = 25; W estimate = 80.4379
theta0 = 90; theta estimate = 31.614
And relative to the image array center we get:
xc = -22.9107
yc = 35.9257
The optimization takes longer but the results are improved as seen by visual inspection:
If performance is an issue you may want to consider writing your own optimizer or first try tuning Matlab's optimization parameters, perhaps using different algorithm options as well; see the optimization options above.
Here is the code for the updated model:
function result = cappedRectangle(params, nX, nY)
[X,Y] = meshgrid(-(nX-1)/2:(nX-1)/2,-(nY-1)/2:(nY-1)/2);
% Extract params to make code more readable:
L = params(1);
W = params(2);
theta = params(3); % units are degrees
xc = params(4); % new param: image center in x
yc = params(5); % new param: image center in y
% Shift coordinates to the image center:
x = X-xc;
y = Y-yc;
% Define R = W as a constraint:
R = W;
% Define r1 and r2 for the rounded edges:
x1 = x - L;
x2 = x + L;
r1 = sqrt(x1.^2+y.^2);
r2 = sqrt(x2.^2+y.^2);
temp = double( (abs(x) <= L) & (abs(y) <= W) | (r1 <= R) | (r2 <= R) );
cappedRectangleRotated = im2double(imrotate(mat2gray(temp), theta, 'bilinear', 'crop'));
result = cappedRectangleRotated(:);
and then prior to calling fminunc() I adjusted the parameter list:
L = 25;
W = L;
theta = 90;
% set image center to zero as initial guess:
xc = 0;
yc = 0;
params0 = [L W theta xc yc];
Enjoy.
First I have to say that I do not have the answer to all of your questions, but I can help you with the orientation.
I suggest using principal component analysis (PCA) on the binary image. A good tutorial on PCA is given by Jon Shlens. Figure 2 of his tutorial shows an example of what it can be used for, and Section 5 gives instructions on how to compute the principal components. With singular value decomposition it is much easier, as shown in Section 6.1.
To use PCA you need measurements from which to compute the principal components. In your case each white pixel is a measurement, represented by its pixel location (x, y)'. You will have N two-dimensional vectors as your measurements; thus your 2xN measurement matrix X is formed by concatenating these vectors.
Once you have built this matrix, proceed as in Section 6.1. The singular values represent the "strength" of the different components: the largest singular value corresponds to the long axis of your ellipse, and the second-largest (there should only be two) corresponds to the other (perpendicular) axis.
Remember: if the shape were a perfect circle the singular values would be equal, but with a discrete image representation you will not get a perfect circle.
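A minimal sketch of that recipe (bw is assumed to be the binary image; note the image y-axis points down, so the sign convention differs from regionprops' Orientation):
[y, x] = find(bw);                      % coordinates of the white pixels
X = [x'; y'];                           % 2-by-N measurement matrix
Xc = bsxfun(@minus, X, mean(X, 2));     % subtract the mean
[U, S, ~] = svd(Xc, 'econ');            % columns of U: principal directions
theta = atan2(U(2,1), U(1,1)) * 180/pi; % orientation of the long axis (deg)
strengths = diag(S);                    % largest singular value <-> long axis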