Linear geometric transformations with application to IFS fractals

I am one of the developers of Perceptron (http://perceptron.sourceforge.net), a unique generator of video-feedback fractals written in Java. I would like to draw your attention to this open-source project, so that you may also participate in the forum on the SourceForge website.
In particular, I am interested in improving the current linear geometric transformations.
What Perceptron generates is always an IFS fractal made of fragments of a Julia fractal. This combination is created in a two-step, cyclic (recursive, endless) process of image transformation:
morphing according to z_new = f(z_old) + constant_c
and a linear mapping.
In file DoubleBuffer.java, we read the pixel color from the "screen" at the coordinates given by
z_new = (x,y).
Naturally, the complex number z_new can be anywhere in the complex plane, but the "screen" has strict physical dimensions. The desired coordinates HAVE been scaled to the screen appropriately - that is not the issue.
However, we apply seemingly needless rules such as "take the absolute value of z_new" or "if z_new is large, wrap it!". We apply these rules to prevent reading non-existent off-screen pixels; instead, we re-read some of the on-screen pixels. This leads to the amazing IFS fractals.
I want to know where I can learn about more such "linear geometric transformations": transformations that wrap arrays (matrices) in interesting ways, create tiles and rotations, shape the data in matrices by giving them edges of various shapes, simulate mirrors, and so on.
To illustrate, see this code.
public int interface_getColor(int x, int y) {
    /**
     * Only positive x and y at the screen can be read to obtain
     * the color. */
    x ^= x >> 31; // absolute values only; no choice but to disregard negative z_new
    y ^= y >> 31;
    x >>= 8;      // divide by 256
    y >>= 8;
    /**
     * The reflection transformation to put the off-screen z_new
     * points back within the screen. Although x = x % W and y =
     * y % H would suffice, it is more interesting like this... */
    x = (x / W & 1) == 0 ? x % W : W_ONE - x % W; // if x/W is even then x = ... else x = ...
    y = (y / H & 1) == 0 ? y % H : H_ONE - y % H;
    /**
     * Since the screen is a one-dimensional array of length
     * W*H, the index of any element is i(x,y) = x + W * y. */
    return buffer.getElem(x + W * y);
}
As you can see, bitwise operators are required for speed, and classical array wrapping gives more wonder than anyone had ever hoped to see from an IFS fractal. It would be good to replace this hard-coded block with equations from a definition file; a template suggestion is welcome.
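To make the wrapping modes concrete, here is a minimal sketch (in MATLAB rather than Java, with my own names) of three standard ways to fold an unbounded coordinate back onto a screen of width W; the mirror variant is the same reflection rule as the ternary expressions above:
wrapTile   = @(x, W) mod(x, W);                               % torus / tiling
wrapMirror = @(x, W) min(mod(x, 2*W), 2*W - 1 - mod(x, 2*W)); % back-and-forth reflection
wrapClamp  = @(x, W) min(max(x, 0), W - 1);                   % smear the edge pixels outward
Each of these could be one entry in the proposed definition file; swapping one rule for another changes which IFS emerges.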

Related

How to Deal with Edge Cases: For Loops and Modulo

I'm trying to apply bare-bones image processing to images like this: My for-loop does exactly what I want it to: it allows me to find the pixels of highest intensity, and also remember the coordinates of that pixel. However, the code breaks whenever it encounters a multiple of rows – which in this case is equal to 18.
For example, the length of this image (rows * columns of image) is 414. So there are 414/18 = 23 cases where the program fails (i.e., the number of columns).
Perhaps there is a better way to accomplish my goal, but this is the only way I could think of sorting an image by pixel intensity while also knowing the coordinates of each pixel. Happy to take suggestions of alternative code, but it'd be great if someone had an idea of how to handle the cases where mod(x,18) = 0 (i.e., when the index of the vector is divisible by the total # of rows).
image = imread('test.tif');                           % feed program an image
image_vector = image(:);                              % vectorize image
[sortMax, sortIndex] = sort(image_vector, 'descend'); % sort vector so that
                                                      % highest intensity pixels are at top
max_sort = [];
[rows, cols] = size(image);
for i = 1:length(image_vector)
    x = mod(sortIndex(i,1), rows);       % retrieve original coordinates
                                         % of pixels from matrix "image"
    y = floor(sortIndex(i,1)/rows) + 1;
    if image(x,y) > 0.5 * max            % filter out background noise
        max_sort(i,:) = [x, y];
    else
        continue
    end
end
You know that MATLAB indexing starts at 1, because you do +1 when you compute y. But you forgot to subtract 1 from the index first. Here is the correct computation:
index = sortIndex(i,1) - 1;
x = mod(index,rows) + 1;
y = floor(index/rows) + 1;
This computation is performed by the function ind2sub, which I recommend you use.
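For example, a drop-in use inside your loop (standard MATLAB, though I have not run it against your image):
[x, y] = ind2sub(size(image), sortIndex(i,1)); % row and column of the i-th brightest pixel
It also accepts the whole index vector at once, [xAll, yAll] = ind2sub(size(image), sortIndex);, which removes the per-iteration computation entirely.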
Edit: Actually, ind2sub does the equivalent of:
x = rem(sortIndex(i,1) - 1, rows) + 1;
y = (sortIndex(i,1) - x) / rows + 1;
(You can see this by typing edit ind2sub.) rem and mod are the same for positive inputs, so x is computed identically. But for computing y they avoid the floor; I guess it is slightly more efficient.
Note also that
image(x,y)
is the same as
image(sortIndex(i,1))
That is, you can use the linear index directly to index into the two-dimensional array.

Not getting what 'spatial weights' for HOG are

I am using HOG for sunflower detection. I understand most of what HOG is doing now, but have some things that I do not understand in the final stages. (I am going through the MATLAB code from Mathworks).
Let us assume we are using the Dalal-Triggs implementation: 8x8 pixels make 1 cell, 2x2 cells make 1 block, blocks are taken at 50% overlap in both directions, and the histograms are quantized into 9 unsigned bins (meaning, from 0 to 180 degrees). Finally, our image here is 64x128 pixels.
Let us say that we are on the first block. This block has 4 cells. I understand that we are going to weight the orientation of each pixel by its gradient magnitude. I also understand that we are going to weight them further by a Gaussian centered on the block.
So far so good.
However, in the MATLAB implementation they have an additional step, whereby they create a 'spatial' weight. If we dive into that computation, it eventually calls the function 'computeLowerHistBin', which looks like this:
function [x1, b1] = computeLowerHistBin(x, binWidth)
    % Bin index
    width = single(binWidth);
    invWidth = 1./width;
    bin = floor(x.*invWidth - 0.5);
    % Bin center x1
    x1 = width * (bin + 0.5);
    % add 2 to get to 1-based indexing
    b1 = int32(bin + 2);
end
Now, I believe that those 'spatial' weights are being used during the tri-linear interpolation part later on... but what I do not get is just how exactly they are being computed, or the logic behind that code. I am completely lost on this issue.
Note: I understand the need for the tri-linear interpolation, and (I think) how it works. What I do not understand is why we need those 'spatial weights', and what the logic behind their computation here is.
Thanks.
The idea here is that each pixel contributes not only to its own histogram cell, but also to the neighboring cell to some degree. These contributions are weighted differently, depending on how close the pixel is to the edge of the cell. The closer you are to an edge of your cell, the more you contribute to the corresponding neighboring cell, and the less you contribute to your own cell.
This code is pre-computing the spatial weights for the trilinear interpolation. Take a look at the equation here for trilinear interpolation:
HOG Trilinear Interpolation of Histogram Bins
There you see things like (x-x1)/bx, (y-y1)/by, (1 - (x-x1)/bx), etc. In the code, wx1 and wy1 correspond to:
wx1 = (1 - (x-x1)/bx)
wy1 = (1 - (y-y1)/by)
Here, x1 and y1 are the centers of the histogram bins in the X and Y directions. It's easier to describe these things in 1D. So in 1D, a value x will fall between 2 bin centers x1 <= x < x2. It doesn't matter exactly which bin (1 or 2) it belongs to. The important thing is to figure out the fraction of x that belongs to x1; the rest belongs to x2. Using the distance from x to x1 and dividing by the width of the bin gives a percentage distance; 1 minus that is the fraction that belongs to bin 1. So if x == x1, wx1 is 1. And if x == x2, wx1 is zero, because x2 - x1 == bx (the width of a bin).
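As a quick numeric illustration (my own numbers, using the 8-pixel cell width from the Dalal-Triggs setup above):
bx = 8; x1 = 4;           % cell width and lower bin center
x = 6;                    % pixel position between the two centers
wx1 = 1 - (x - x1)/bx     % 0.75 of the vote stays with bin 1
wx2 = 1 - wx1             % the remaining 0.25 goes to the neighboring bin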
Going back to the code, creating the 4 matrices just pre-computes all the multiplications of the weights needed for the interpolation of all the pixels in a HOG block. That is why it is a matrix of weights: each element in the matrix is for one of the pixels in the HOG block.
For example, if you look at the equation for the weights for h(x1, y2, ~), you'll see these 2 weights for x and y (ignoring the z component):
(1 - (x-x1)/bx) * ((y-y1)/by)
Going back to the code, this multiplication is pre-computed for every pixel in the block using:
weights.x1y2 = (1-wy1)' * wx1;
where
(1-wy1) == (y - y1)/by
The same logic applies to the other weight matrices.
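Spelled out, the full set of outer products would be as follows (only weights.x1y2 appears in the excerpt above, so the other field names are my assumption; wx1 and wy1 are row vectors of per-column and per-row weights):
weights.x1y1 = wy1'     * wx1;      % low x bin,  low y bin
weights.x2y1 = wy1'     * (1-wx1);  % high x bin, low y bin
weights.x1y2 = (1-wy1)' * wx1;      % low x bin,  high y bin
weights.x2y2 = (1-wy1)' * (1-wx1);  % high x bin, high y bin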
As for the code in "computeLowerHistBin", it's just finding the x1 in the trilinear interpolation equation, where x1 <= x < x2 (same for y1). There are probably a bunch of ways to solve this problem given a pixel location x and the width of a bin bx as long as you satisfy x1 <= x < x2.
For example, "|" indicates a bin edge and "o" a bin center:
-20            0               20              40
|------o-------|-------o-------|-------o-------|
      -10             10              30
if x = [2 9 11], the lower bin center x1 is [-10 -10 10].
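You can reproduce that mapping directly from the body of computeLowerHistBin (a quick check of mine, with binWidth = 20 so the centers sit at ..., -10, 10, 30, ...):
x = single([2 9 11]);
binWidth = 20;
bin = floor(x ./ binWidth - 0.5);
x1 = binWidth * (bin + 0.5)   % lower bin centers: [-10 -10 10]
b1 = int32(bin + 2)           % 1-based bin indices: [1 1 2]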

Creating Filter's Laplacian Matrix and Solving the Linear Equation for Image Filtering

I have an optimization problem to solve in order to filter an image.
I formulated the problem as a linear equation that deals with sparse matrices.
To begin, I will show the problem.
First, the Laplacian (adjacency) matrix of the problem:
The matrix Dx / Dy is the forward difference operator, hence its transpose is the backward difference operator.
The matrix Ax / Ay is a diagonal matrix with weights which are a function of the gradient of the image (pointwise, namely the value depends only on the gradient at that pixel by itself).
The weights are:
where Ix(i) is the horizontal gradient of the input image at the i-th pixel (when you vectorize the input image).
Assuming input image G -> g = vec(G) = G(:).
I want to find an image U -> u = vec(U) = U(:) s.t.:
My questions are:
How can I build the matrices Dx / Dy / Ax / Ay efficiently (they are all sparse)?
By setting M = (I + \lambda * {L}_{g}), is there an optimized way to create M directly?
What would be the best way to solve this linear problem in MATLAB? Is there a way to bypass memory limitations (namely, dealing with large images and still being able to solve it)?
Is there an open-source library to solve it under limited memory resources? Any library with a MATLAB API?
Thank You.
Given your comments, let's answer each question in a synopsis and go from there:
1. I will answer that question below using sparse and other related functions.
2. Using (1), we can definitely build M in an optimized way.
3. Simply put, the \ operator is the best thing to use when solving a linear system. MathWorks have spent so much time optimizing it, and it pretty much uses LAPACK and BLAS under the hood, that you would be insane not to use it. The only time you wouldn't be able to use it is answered in (4).
4. There are some MATLAB scripts that can solve the system iteratively, like the Successive Over-Relaxation technique, but you should only use those if you run out of memory (i.e. if \ doesn't give you an answer). With the sparse representation of the matrices, this shouldn't (hopefully) happen, so let's avoid using those functions for now.
Going back to your question, we can produce a sparse representation of L_g very nicely. Given the definition of Dx and Dy, we can use the sparse version of the eye command, called speye. Therefore, Dx and Dy can be calculated with Dx = diff(speye(size(inputImage)));. As an example, this is what would be produced if you tried doing this on a 7 x 5 image:
>> diff(speye(7,5))

ans =

   (1,1)      -1
   (1,2)       1
   (2,2)      -1
   (2,3)       1
   (3,3)      -1
   (3,4)       1
   (4,4)      -1
   (4,5)       1
   (5,5)      -1
As you can see, we are referencing only non-zero entries. Row 1, column 1 has a coefficient of -1, row 1, column 2 has a coefficient of 1 and so on. As for your Ax and Ay, that's also very easy to do. We have a diagonal matrix and we can set each of the entries manually. All we would do is specify a set of row indices, column indices, and what the values are at each point. Therefore, we can do that by:
inputImage = im2double(inputImage); %// Important
rows = 1 : numel(inputImage);       %// Assuming a 2D matrix
cols = rows;                        %// Row and column indices are the same
valuesDx = exp(-gradX(rows).^2 / (2*sigma^2)); %// Gaussian weights from the horizontal gradient
valuesDy = exp(-gradY(rows).^2 / (2*sigma^2)); %// ... and from the vertical gradient
The reason for the first call is that we want to make sure the pixels are in double precision, as solving the system in MATLAB requires this. It also ensures we don't overflow the type, as we are normalizing the intensities between 0 and 1. You may have to adjust your standard deviation to reflect this. Now we just need to construct our Ax and Ay matrices, and let's put them together with Dx and Dy:
numberElements = numel(inputImage);
Ax = sparse(rows, cols, valuesDx, numberElements, numberElements);
Ay = sparse(rows, cols, valuesDy, numberElements, numberElements);
identity = speye(numberElements, numberElements);
Dx = diff(identity);
Dy = Dx.'; %// Transpose
The reason why I'm transposing Dx to get Dy is because the difference operator in the vertical direction should simply be the transpose (makes sense to me). These should all be sparse representations of each of the matrices you want. Matrix operations can also be performed on sparse matrices, including multiplication and the inverse. As such:
Lg = Dx.' * Ax * Dx + Dy.' * Ay * Dy;
You can now solve for u via:
u = (identity + lambda*Lg) \ g;
This assumes that g is structured with your pixels in your image in column-major format. The way I sampled the pixels to build Ax and Ay naturally follows this. As such, do g = inputImage(:);, assuming that we have converted to double and normalized between 0 and 1.
When you finally solve for u, you can reshape it back to an image by doing:
u = reshape(u, size(inputImage, 1), size(inputImage, 2));
u may also be sparse, so if you want the original image back, cast it using full():
u = full(u);
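And if memory does become the bottleneck (your question 4), one concrete option is an iterative solver; this is a sketch of mine, not part of the pipeline above. The system matrix is symmetric positive definite, so preconditioned conjugate gradients with an incomplete Cholesky preconditioner is a natural fit:
M = identity + lambda*Lg;        %// Same system matrix as above
P = ichol(M);                    %// Incomplete Cholesky preconditioner
u = pcg(M, g, 1e-6, 500, P, P'); %// Tolerance and iteration cap are guesses
pcg never forms a factorization of M, so its memory footprint stays close to that of M itself.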
Hope this helps!

What algorithm can I use to recognize the line in this scatterplot?

I'm creating a program to compare audio files which uses a similar algorithm to the one described here: http://www.ee.columbia.edu/~dpwe/papers/Wang03-shazam.pdf. I am plotting the times of matches between two songs being compared and finding the least-squares line for the plot. Here is an example plot of matching files: http://imgur.com/fGu7jhX&yOeMSK0. The plot is too messy, and the least-squares regression line does not produce a high correlation coefficient even though there is an obvious line in the graph. What other algorithm can I use to recognize this line?
This is an interesting question, but it's been pretty quiet. Maybe this answer will trigger some more activity.
For identifying lines with arbitrary slopes and intercepts within a collection of points, the Hough transform would be a good place to start. For your audio application, however, it looks like the slope should always be 1, so you don't need the full generality of the Hough transform.
Instead, you can think of the problem as one of clustering the differences x - y, where x and y are the vectors holding the x and y coordinates of the points.
One approach would be to compute a histogram of x - y. Points that are close to lying on the same line with slope 1 will have differences in the same bin of the histogram. The bin with the largest count corresponds to the largest collection of points that are approximately aligned. An issue to deal with in this approach is choosing the boundaries of the histogram bins: a bad choice could result in points that should be grouped together being split into neighboring bins.
A simple brute-force approach is to imagine a diagonal window with a given width, sliding left to right across the (x,y) plane. The best candidate for a line corresponds to the position of the window that contains the most points. This is similar to a histogram of x - y, but instead of having a collection of disjoint bins, there are overlapping bins, one for each point. All the bins have the same width, and each point determines the left edge of a bin.
The function count_diag_groups in the code below does that computation. For each point, it counts how many points are in the diagonal window when the left edge of the window is on that point. The best candidate for a line is the window with the most points. Here's the plot generated by the script: the top is the scatter plot of the data; the bottom is the same scatter plot, with the best candidate points highlighted.
A nice feature of this method is that there is only one parameter, the window width. A not-so-nice feature is that it has time complexity O(n**2), where n is the number of points. There are surely algorithms with better time complexity that could do something similar; the article that you link to discusses this. To judge the quality of an alternative, however, will require more concrete specifications of how "good" or robust the line identification must be.
import numpy as np
import matplotlib.pyplot as plt


def count_diag_groups(x, y, width):
    """
    Returns a list of arrays. The length of the list is the same
    as the length of x. The k-th array holds the indices into x
    (and y) of the set of points that are in a "diagonal" window with
    the given width whose left edge includes the point (x[k], y[k]).
    """
    d = x - y
    result = []
    for i in range(d.size):
        delta = d - d[i]
        neighbors = np.where((delta >= 0) & (delta <= width))[0]
        result.append(neighbors)
    return result


def generate_demo_data():
    # Generate some data.
    np.random.seed(123)
    xmin = 0
    xmax = 100
    ymin = 0
    ymax = 25
    nrnd = 175
    xrnd = xmin + (xmax - xmin)*np.random.rand(nrnd)
    yrnd = ymin + (ymax - ymin)*np.random.rand(nrnd)
    n = 25
    xx = xmin + 0.1*(xmax - xmin) + ymax*np.random.rand(n)
    yy = (xx - xx.min()) + 0.2*np.random.randn(n)
    x = np.concatenate((xrnd, xx))
    y = np.concatenate((yrnd, yy))
    return x, y


def plot_result(x, y, width, selection):
    xmin = x.min()
    xmax = x.max()
    ymin = y.min()
    ymax = y.max()
    xsel = x[selection]
    ysel = y[selection]
    # The left edge of the selected window is the smallest difference
    # x - y among the selected points.
    offset = (xsel - ysel).min()
    # Plot...
    plt.figure(1)
    plt.clf()
    ax = plt.subplot(2, 1, 1)
    plt.plot(x, y, 'o', mfc='b', mec='b', alpha=0.5)
    plt.xlim(xmin - 1, xmax + 1)
    plt.ylim(ymin - 1, ymax + 1)
    plt.subplot(2, 1, 2, sharex=ax, sharey=ax)
    plt.plot(x, y, 'o', mfc='b', mec='b', alpha=0.5)
    plt.plot(xsel, ysel, 'o', mfc='w', mec='w')
    plt.plot(xsel, ysel, 'o', mfc='r', mec='r', alpha=0.65)
    xi = np.array([xmin, xmax])
    yi1 = xi - offset
    yi2 = yi1 - width
    plt.plot(xi, yi1, 'r-', alpha=0.25)
    plt.plot(xi, yi2, 'r-', alpha=0.25)
    plt.xlim(xmin - 1, xmax + 1)
    plt.ylim(ymin - 1, ymax + 1)
    plt.show()


if __name__ == "__main__":
    x, y = generate_demo_data()
    # Find a selection of points that are close to being aligned
    # with a slope of 1.
    width = 0.75
    r = count_diag_groups(x, y, width)
    # Find the largest group.
    sz = np.array([len(f) for f in r])
    imax = sz.argmax()
    # selection holds the indices of the selected points.
    selection = r[imax]
    plot_result(x, y, width, selection)
This looks like an excellent example of a task for Random Sampling Consensus (RANSAC).
The Wikipedia article even uses your problem as an example!
The rough outline is something like this:
1. Select 2 random points in your data and fit a line to them.
2. For each other point, find its distance to that line. If the distance is below a threshold, it is part of the inlier set.
3. If the final inlier set for this particular line is larger than that of the previously best line, keep the new line as the best candidate.
4. If the decided number of iterations is reached, return the best line found; else go back to 1 and choose new random points.
Check the Wikipedia article for more information.
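Since the slope is effectively fixed at 1 in this application, the sampling step collapses to picking a single point and proposing an intercept. A minimal sketch of that RANSAC variant (MATLAB, my own naming, untested against your data):
function [bBest, inliers] = ransacSlopeOne(x, y, nIters, thresh)
    % RANSAC specialized to lines of the form y = x + b.
    bBest = 0;
    inliers = [];
    for k = 1:nIters
        i = randi(numel(x));
        b = y(i) - x(i);               % candidate intercept through one point
        d = abs(y - x - b) / sqrt(2);  % perpendicular distance to the line
        idx = find(d < thresh);
        if numel(idx) > numel(inliers)
            inliers = idx;             % keep the largest consensus set
            bBest = b;
        end
    end
end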

Find out the orientation, length and radius of a capped rectangular object

I have an image as shown in fig. 1. I am trying to fit this binary image with a capped rectangle (fig. 2) to figure out:
the orientation (the angle between the long axis and the horizontal axis)
the length (l) and radius (R) of the object. What is the best way to do it?
Thanks for the help.
My very naive idea is to use a least-squares fit to find out this information; however, I found out there is no equation for a capped rectangle. In MATLAB there is a function called rectangle that can create the capped rectangle perfectly, but it seems to be just for plotting purposes.
I solved this in 2 different ways and have notes on each approach below. Each method varies in complexity, so you will need to decide the best trade-off for your application.
First Approach: Least-Squares-Optimization:
Here I used unconstrained optimization through Matlab's fminunc() function. Take a look at Matlab's help to see the options you can set prior to optimization. I made some fairly simple choices just to get this approach working for you.
In summary, I set up a model of your capped rectangle as a function of the parameters L, W, and theta. You can include R if you wish, but personally I don't think you need it; by examining continuity with the semicircular caps at each end, I think it is sufficient to let R = W, by inspection of your model geometry. This also reduces the number of optimization parameters by one.
I made a model of your capped rectangle using boolean layers; see the cappedRectangle() function below. As a result, I needed a function to calculate finite-difference gradients of the model with respect to L, W, and theta. If you don't provide these gradients to fminunc(), it will attempt to estimate them, but I found that Matlab's estimates didn't work well for this application, so I provided my own as part of the error function that gets called by fminunc() (see below).
I didn't initially have your data so I simply right-clicked on your image above and downloaded: 'aRIhm.png'
To read your data I did this (creates the variable cdata):
image = importdata('aRIhm.png');
vars = fieldnames(image);
for i = 1:length(vars)
    assignin('base', vars{i}, image.(vars{i}));
end
Then I converted to double type and cleaned up the data by normalizing. Note: this pre-processing was important to get the optimization to work properly, and may only have been needed since I didn't have your raw data (as mentioned, I downloaded your image from the webpage for this question):
data = im2double(cdata);
data = data / max(data(:));
figure(1); imshow(data); % looks the same as your image above
Now get the image sizes:
nY = size(data,1);
nX = size(data,2);
Note #1: you might consider adding the center of the capped rectangle, (xc,yc), as optimization parameters. These extra degrees of freedom will make a difference in the overall fitting results (see the comment on final error function values below). I didn't set that up here, but you can follow the approach I used for L, W, and theta to add that functionality with the finite-difference gradients. You will also need to set up the capped rectangle model as a function of (xc,yc).
EDIT: Out of curiosity I added the optimization over the capped rectangle center, see the results at the bottom.
Note #2: for "continuity" at the ends of the capped rectangle, let R = W. If you like, you can later include R as an explicit optimization parameter following the examples for L, W, theta. You might even want to have, say, R1 and R2 at each endpoint as variables.
Below are arbitrary starting values that I used to simply illustrate an example optimization. I don't know how much information you have in your application but in general, you should try to provide the best initial estimates that you can.
L = 25;
W = L;
theta = 90;
params0 = [L W theta];
Note that you will get different results based on your initial estimates.
Next display the starting estimate (the cappedRectangle() function is defined later):
capRect0 = reshape(cappedRectangle(params0,nX,nY),nX,nY);
figure(2); imshow(capRect0);
Define an anonymous function for the error metric (errorFunc() is listed below):
f = @(x)errorFunc(x,data);
% Define several optimization parameters for fminunc():
options = optimoptions(@fminunc,'GradObj','on','TolX',1e-3,'Display','iter');
% Call the optimizer:
tic
[x,fval,exitflag,output] = fminunc(f,params0,options);
time = toc;
disp(['convergence time (sec) = ',num2str(time)]);
% Results:
disp(['L0 = ',num2str(L),'; ', 'L estimate = ', num2str(x(1))]);
disp(['W0 = ',num2str(W),'; ', 'W estimate = ', num2str(x(2))]);
disp(['theta0 = ',num2str(theta),'; ', 'theta estimate = ', num2str(x(3))]);
capRectEstimate = reshape(cappedRectangle(x,nX,nY),nX,nY);
figure(3); imshow(capRectEstimate);
Below is the output from fminunc (for more details on each column see Matlab's help):
Iteration f(x) step optimality CG-iterations
0 0.911579 0.00465
1 0.860624 10 0.00457 1
2 0.767783 20 0.00408 1
3 0.614608 40 0.00185 1
.... and so on ...
15 0.532118 0.00488281 0.000962 0
16 0.532118 0.0012207 0.000962 0
17 0.532118 0.000305176 0.000962 0
You can see that the final error metric values have not decreased that much relative to the starting value. This indicates to me that the model function probably doesn't have enough degrees of freedom to really "fit" the data that well, so consider adding extra optimization parameters, e.g., the image center, as discussed earlier.
EDIT: Added optimization over the capped rectangle center, see results at the bottom.
Now print the results (using a 2011 Macbook Pro):
Convergence time (sec) = 16.1053
L0 = 25; L estimate = 58.5773
W0 = 25; W estimate = 104.0663
theta0 = 90; theta estimate = 36.9024
And display the results:
EDIT: The exaggerated "thickness" of the fitting results above are because the model is trying to fit the data while keeping its center fixed, resulting in larger values for W. See updated results at bottom.
You can see by comparing the data to the final estimate that even a relatively simple model starts to resemble the data fairly well.
You can go further and calculate error bars for the estimates by setting up your own Monte-Carlo simulations to check accuracy as a function of noise and other degrading factors (with known inputs that you can generate to produce simulated data).
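A bare-bones version of such a simulation might look like this (my sketch, reusing the cappedRectangle() and errorFunc() functions listed below, with arbitrary test values):
trueParams = [40 15 25];                        %// Known L, W, theta for the test
clean = reshape(cappedRectangle(trueParams,nX,nY),nX,nY);
nTrials = 20;
estimates = zeros(nTrials, numel(trueParams));
for t = 1:nTrials
    noisy = clean + 0.1*randn(size(clean));     %// Additive Gaussian noise
    estimates(t,:) = fminunc(@(p) errorFunc(p,noisy), trueParams, options);
end
stdErrors = std(estimates)                      %// Rough error bars on L, W, theta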
Below is the model function I used for the capped rectangle (note: the way I did image rotation is kind of sketchy numerically and not very robust for finite differences, but it's quick and dirty and gets you going):
function result = cappedRectangle(params, nX, nY)
    [x,y] = meshgrid(-(nX-1)/2:(nX-1)/2, -(nY-1)/2:(nY-1)/2);
    L = params(1);
    W = params(2);
    theta = params(3); % units are degrees
    R = W;
    % Define r1 and r2 for the displaced rounded edges:
    x1 = x - L;
    x2 = x + L;
    r1 = sqrt(x1.^2 + y.^2);
    r2 = sqrt(x2.^2 + y.^2);
    % Capped rectangle prior to rotation (theta = 0):
    temp = double( (abs(x) <= L) & (abs(y) <= W) | (r1 <= R) | (r2 <= R) );
    cappedRectangleRotated = im2double(imrotate(mat2gray(temp), theta, 'bilinear', 'crop'));
    result = cappedRectangleRotated(:);
return
And then you will also need the error function called by fminunc:
function [error, df_dx] = errorFunc(params, data)
    nY = size(data,1);
    nX = size(data,2);
    % Anonymous function for the model:
    model = @(params)cappedRectangle(params,nX,nY);
    % Least-squares error (analogous to chi^2 in the literature):
    f = @(x)sum( (data(:) - model(x)).^2 ) / sum(data(:).^2);
    % Scalar error:
    error = f(params);
    [df_dx] = finiteDiffGrad(f, params);
return
As well as the function to calculate the finite difference gradients:
function [df_dx] = finiteDiffGrad(fun, x)
    N = length(x);
    x = reshape(x,N,1);
    % Pick a small delta; dx should be experimented with:
    dx = norm(x(:))/10;
    % Define an array of dx values:
    h_array = dx*eye(N);
    df_dx = zeros(size(x));
    f = @(x) feval(fun,x);
    % Finite difference approximation (centered difference; error is O(h^2)):
    for j = 1:N
        hj = h_array(j,:)';
        df_dx(j) = ( f(x+hj) - f(x-hj) )/(2*dx);
    end
return
Second Approach: use regionprops()
As others have pointed out, you can also use Matlab's regionprops(). Overall I think this could work the best with some tuning and checking to ensure that it's doing what you expect. So the approach would be to call it like this (it certainly is a lot simpler than the first approach!):
data = im2double(cdata);
data = round(data / max(data(:)));
s = regionprops(data, 'Orientation', 'MajorAxisLength', ...
    'MinorAxisLength', 'Eccentricity', 'Centroid');
And then the resulting struct s:
>> s

s =

           Centroid: [345.5309 389.6189]
    MajorAxisLength: 365.1276
    MinorAxisLength: 174.0136
       Eccentricity: 0.8791
        Orientation: 30.9354
This gives enough information to feed into a model of a capped rectangle. At first glance this seems like the way to go, but it seems like you have your mind set on another approach (maybe the first approach above).
Anyway, below is an image of the results (in red) overlaid on top of your data, which you can see looks quite good:
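If you want to feed those properties into the capped-rectangle model directly, one plausible mapping (my own guess, and the angle sign convention between regionprops and imrotate may need checking) is:
W = s.MinorAxisLength / 2;      %// Half-width doubles as the cap radius R
L = s.MajorAxisLength / 2 - W;  %// Half-length of the straight segment
theta = s.Orientation;
capFromProps = reshape(cappedRectangle([L W theta], nX, nY), nX, nY);
figure; imshow(capFromProps);   %// Compare visually against the data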
EDIT: I couldn't help myself; I suspected that by including the image center as an optimization parameter, much better results could be obtained, so I went ahead and did it just to check. Sure enough, with the same starting estimates used earlier in the least-squares estimation, here are the results:
Iteration        f(x)          step         optimality   CG-iterations
    0          0.911579                      0.00465
    1          0.859323      10              0.00471             2
    2          0.742788      20              0.00502             2
    3          0.530433      40              0.00541             2
    ... and so on ...
   28          0.0858947      0.0195312      0.000279            0
   29          0.0858947      0.0390625      0.000279            1
   30          0.0858947      0.00976562     0.000279            0
   31          0.0858947      0.00244141     0.000279            0
   32          0.0858947      0.000610352    0.000279            0
By comparison with the earlier values we can see that the new least-square error values are quite a bit smaller when including the image center, confirming what we suspected earlier (so no big surprise).
The updated estimates for the capped rectangle parameters are thus:
Convergence time (sec) = 96.0418
L0 = 25; L estimate = 89.0784
W0 = 25; W estimate = 80.4379
theta0 = 90; theta estimate = 31.614
And relative to the image array center we get:
xc = -22.9107
yc = 35.9257
The optimization takes longer but the results are improved as seen by visual inspection:
If performance is an issue you may want to consider writing your own optimizer or first try tuning Matlab's optimization parameters, perhaps using different algorithm options as well; see the optimization options above.
Here is the code for the updated model:
function result = cappedRectangle(params, nX, nY)
    [X,Y] = meshgrid(-(nX-1)/2:(nX-1)/2, -(nY-1)/2:(nY-1)/2);
    % Extract params to make code more readable:
    L = params(1);
    W = params(2);
    theta = params(3); % units are degrees
    xc = params(4);    % new param: image center in x
    yc = params(5);    % new param: image center in y
    % Shift coordinates to the image center:
    x = X - xc;
    y = Y - yc;
    % Define R = W as a constraint:
    R = W;
    % Define r1 and r2 for the rounded edges:
    x1 = x - L;
    x2 = x + L;
    r1 = sqrt(x1.^2 + y.^2);
    r2 = sqrt(x2.^2 + y.^2);
    temp = double( (abs(x) <= L) & (abs(y) <= W) | (r1 <= R) | (r2 <= R) );
    cappedRectangleRotated = im2double(imrotate(mat2gray(temp), theta, 'bilinear', 'crop'));
    result = cappedRectangleRotated(:);
and then prior to calling fminunc() I adjusted the parameter list:
L = 25;
W = L;
theta = 90;
% Set image center to zero as initial guess:
xc = 0;
yc = 0;
params0 = [L W theta xc yc];
Enjoy.
First I have to say that I do not have the answer to all of your questions, but I can help you with the orientation.
I suggest using principal component analysis (PCA) on the binary image. A good tutorial on PCA is given by Jon Shlens. In Figure 2 of his tutorial there is an example of what it can be used for. In Section 5 of the paper you can see some sort of instruction on how to compute the principal components; with singular value decomposition it is much easier, as shown in Section 6.1.
To use PCA you have to get measurements for which you want to compute the principal components. In your case each white pixel is a measurement, represented by its pixel location (x, y)'. You will have N two-dimensional vectors that give your measurements. Thus, your 2xN measurement matrix X is formed by concatenating these vectors.
When you have built this matrix, proceed as given in Section 6.1. The singular values represent the "strength" of the different components. The largest singular value represents the long axis of your ellipse; the second largest (and there should only be two) represents the other (perpendicular) axis.
Remember, if the ellipse is a circle your singular values should be equal, but with your discrete image representation you will not get a perfect circle.
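A minimal sketch of that computation (my code, assuming the binary image is stored in a logical matrix BW; note that image rows grow downward, so the sign of the angle may need flipping):
[r, c] = find(BW);                    % coordinates of every white pixel
X = [c'; r'];                         % 2xN measurement matrix (x = column, y = row)
X = X - mean(X, 2);                   % center the measurements
[U, S, ~] = svd(X, 'econ');           % U(:,1) points along the long axis
orientation = atan2d(U(2,1), U(1,1))  % angle w.r.t. the horizontal axis, in degrees
strengths = diag(S)                   % largest value corresponds to the long axis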