Simplex noise displaying incorrectly with BufferedImage

Simplex noise displaying incorrectly with BufferedImage - scala

I finally got a working tile-able version of Simplex noise working after much work, but I can't seem to get it to record and display correctly when using a BufferedImage. Whenever I try to create an image, it ends up with bands or rings of black and white, instead of a smooth change of shades, which is what I'm expecting. I'm guessing there's something simple I'm not doing, but for the life of me, I can't find it.
This is my code (quite a bit of which is from Stefan Gustavson's Simplex noise implementation):
import java.awt.image.BufferedImage
import javax.imageio.ImageIO
import java.io.File
import scala.util.Random
object ImageTest {
def main(args: Array[String]): Unit = {
val image = generate(1024, 1024, 1)
ImageIO.write(image, "png", new File("heightmap.png"))
}
def generate(width: Int, height: Int, octaves: Int) = {
val map = new BufferedImage(width, height, BufferedImage.TYPE_USHORT_GRAY)
val pi2 = Math.PI * 2
for ( x <- 0 until width;
y <- 0 until height) {
var total = 0.0
for (oct <- 1 to octaves) {
val scale = (1 - 1/Math.pow(2, oct))
val s = x / width.toDouble
val t = y / height.toDouble
val dx = 1-scale
val dy = 1-scale
val nx = scale + Math.cos(s*pi2) * dx
val ny = scale + Math.cos(t*pi2) * dy
val nz = scale + Math.sin(s*pi2) * dx
val nw = scale + Math.sin(t*pi2) * dy
total += (((noise(nx,ny,nz,nw)+1)/2)) * Math.pow(0.5, oct)
}
map.setRGB(x,y, (total * 0xffffff).toInt)
}
map
}
// Simplex 4D noise generator
// returns -1.0 <-> 1.0
def noise(x: Double, y: Double, z: Double, w: Double) = {
// Skew the (x,y,z,w) space to determine which cell of 24 simplices we're in
val s = (x + y + z + w) * F4; // Factor for 4D skewing
val i = Math.floor(x+s).toInt
val j = Math.floor(y+s).toInt
val k = Math.floor(z+s).toInt
val l = Math.floor(w+s).toInt
val t = (i+j+k+l) * G4 // Factor for 4D unskewing
val xBase = x - (i-t) // Unskew the cell space and set the x, y, z, w
val yBase = y - (j-t) //distances from the cell origin
val zBase = z - (k-t)
val wBase = w - (l-t)
// For the 4D case, the simplex is a 4D shape I won't even try to describe.
// To find out which of the 24 possible simplices we're in, we need to
// determine the magnitude ordering of x0, y0, z0 and w0.
// Six pair-wise comparisons are performed between each possible pair
// of the four coordinates, and the results are used to rank the numbers.
var rankx = 0
var ranky = 0
var rankz = 0
var rankw = 0
if(xBase > yBase) rankx+=1 else ranky+=1
if(xBase > zBase) rankx+=1 else rankz+=1
if(xBase > wBase) rankx+=1 else rankw+=1
if(yBase > zBase) ranky+=1 else rankz+=1
if(yBase > wBase) ranky+=1 else rankw+=1
if(zBase > wBase) rankz+=1 else rankw+=1
// simplex[c] is a 4-vector with the numbers 0, 1, 2 and 3 in some order.
// Many values of c will never occur, since e.g. x>y>z>w makes x<z, y<w and x<w
// impossible. Only the 24 indices which have non-zero entries make any sense.
// We use a thresholding to set the coordinates in turn from the largest magnitude.
// Rank 3 denotes the largest coordinate.
val i1 = if (rankx >= 3) 1 else 0
val j1 = if (ranky >= 3) 1 else 0
val k1 = if (rankz >= 3) 1 else 0
val l1 = if (rankw >= 3) 1 else 0
// Rank 2 denotes the second largest coordinate.
val i2 = if (rankx >= 2) 1 else 0
val j2 = if (ranky >= 2) 1 else 0
val k2 = if (rankz >= 2) 1 else 0
val l2 = if (rankw >= 2) 1 else 0
// Rank 1 denotes the second smallest coordinate.
val i3 = if (rankx >= 1) 1 else 0
val j3 = if (ranky >= 1) 1 else 0
val k3 = if (rankz >= 1) 1 else 0
val l3 = if (rankw >= 1) 1 else 0
// The fifth corner has all coordinate offsets = 1, so no need to compute that.
val xList = Array(xBase, xBase-i1+G4, xBase-i2+2*G4, xBase-i3+3*G4, xBase-1+4*G4)
val yList = Array(yBase, yBase-j1+G4, yBase-j2+2*G4, yBase-j3+3*G4, yBase-1+4*G4)
val zList = Array(zBase, zBase-k1+G4, zBase-k2+2*G4, zBase-k3+3*G4, zBase-1+4*G4)
val wList = Array(wBase, wBase-l1+G4, wBase-l2+2*G4, wBase-l3+3*G4, wBase-1+4*G4)
// Work out the hashed gradient indices of the five simplex corners
val ii = if (i < 0) 256 + (i % 255) else i % 255
val jj = if (j < 0) 256 + (j % 255) else j % 255
val kk = if (k < 0) 256 + (k % 255) else k % 255
val ll = if (l < 0) 256 + (l % 255) else l % 255
val gradIndices = Array(
perm(ii+perm(jj+perm(kk+perm(ll)))) % 32,
perm(ii+i1+perm(jj+j1+perm(kk+k1+perm(ll+l1)))) % 32,
perm(ii+i2+perm(jj+j2+perm(kk+k2+perm(ll+l2)))) % 32,
perm(ii+i3+perm(jj+j3+perm(kk+k3+perm(ll+l3)))) % 32,
perm(ii+1+perm(jj+1+perm(kk+1+perm(ll+1)))) % 32)
// Calculate the contribution from the five corners
var total = 0.0
for (dim <- 0 until 5) {
val (x,y,z,w) = (xList(dim), yList(dim), zList(dim), wList(dim))
var t = 0.5 - x*x - y*y - z*z - w*w
total += {
if (t < 0) 0.0
else {
t *= t
val g = grad4(gradIndices(dim))
t * t * ((g.x*x)+(g.y*y)+(g.z*z)+(g.w*w))
}
}
}
// Sum up and scale the result to cover the range [-1,1]
27.0 * total
}
case class Grad(x: Double, y: Double, z: Double, w: Double = 0.0)
private lazy val grad4 = Array(
Grad(0,1,1,1), Grad(0,1,1,-1), Grad(0,1,-1,1), Grad(0,1,-1,-1),
Grad(0,-1,1,1),Grad(0,-1,1,-1),Grad(0,-1,-1,1),Grad(0,-1,-1,-1),
Grad(1,0,1,1), Grad(1,0,1,-1), Grad(1,0,-1,1), Grad(1,0,-1,-1),
Grad(-1,0,1,1),Grad(-1,0,1,-1),Grad(-1,0,-1,1),Grad(-1,0,-1,-1),
Grad(1,1,0,1), Grad(1,1,0,-1), Grad(1,-1,0,1), Grad(1,-1,0,-1),
Grad(-1,1,0,1),Grad(-1,1,0,-1),Grad(-1,-1,0,1),Grad(-1,-1,0,-1),
Grad(1,1,1,0), Grad(1,1,-1,0), Grad(1,-1,1,0), Grad(1,-1,-1,0),
Grad(-1,1,1,0),Grad(-1,1,-1,0),Grad(-1,-1,1,0),Grad(-1,-1,-1,0))
private lazy val perm = new Array[Short](512)
for(i <- 0 until perm.length)
perm(i) = Random.nextInt(256).toShort
private lazy val F4 = (Math.sqrt(5.0) - 1.0) / 4.0
private lazy val G4 = (5.0 - Math.sqrt(5.0)) / 20.0
}
I've checked the output values of the noise function I'm using, which as of yet hasn't returned anything outside of (-1, 1) exclusive. And for a single octave, the value I'm supplying to the image (total) has not gone outside of (0,1) exclusive, either.
The only thing I can see that would be a problem is either the BufferedImage type is set incorrectly, or I'm multiplying total by the wrong hex value when setting the values in the image.
I've looked through the Javadocs on BufferedImage for information about the types and the values they accept, though nothing I've found seems to be out of place in my code (though, I am fairly new to using BufferedImage in general). And I've tried changing the hex value, but neither seems to change anything. The only thing I've found that has any affect is if I divide the (total * 0xffffff).toInt value by 256, which seems to darken the bands a bit and a slight gradient appears over the areas it should, but if I increase the division too much, the image just becomes black. So as of right now I'm stuck on what could be the issue.

(total * 0xffffff).toInt doesn't seem to make sense. You are creating an ARGB value from a grayscale float with a single multiplication?
I think you want something like this:
val i = (total * 0xFF).toInt
val rgb = 0xFF000000 | (i << 16) | (i << 8) | i
That gives me a smooth random texture, although with very low contrast—with 1 octave, your total seems to vary approx from 0.2 to 0.3, so you may need to adjust the scale a bit.
I'm not sure though how you can get 16-bit grayscale resolution. Perhaps you need to set the raster data directly instead of using setRGB (which forces you down to 8 bits). The comments below this question suggest that you use the raster directly.

Related

Fast CVX solvers in Matlab

I am wondering what is the fastest convex optimizer in Matlab or is there any way to speed up current solvers? I'm using CVX, but it's taking forever to solve the optimization problem I have.
The optimization I have is to solve
minimize norm(Ax-b, 2)
subject to
x >= 0
and x d <= delta
where the size of A and b are very large.
Is there any way that I can solve this by a least square solver and then transfer it to the constraint version to make it faster?

I'm not sure what x.d <= delta means, but I'll just assume it's supposed to be x <= delta.
You can solve this problem using the projected gradient method or an accelerated projected gradient method (which is just a slight modification of the projected gradient method, which "magically" converges much faster). Here is some python code that shows how to minimize .5|| Ax - b ||^2 subject to the constraint that 0 <= x <= delta using FISTA, which is an accelerated projected gradient method. More details about the projected gradient method and FISTA can be found for example in Boyd's manuscript on proximal algorithms.
import numpy as np
import matplotlib.pyplot as plt
def fista(gradf,proxg,evalf,evalg,x0,params):
# This code does FISTA with line search
maxIter = params['maxIter']
t = params['stepSize'] # Initial step size
showTrigger = params['showTrigger']
increaseFactor = 1.25
decreaseFactor = .5
costs = np.zeros((maxIter,1))
xkm1 = np.copy(x0)
vkm1 = np.copy(x0)
for k in np.arange(1,maxIter+1,dtype = np.double):
costs[k-1] = evalf(xkm1) + evalg(xkm1)
if k % showTrigger == 0:
print "Iteration: " + str(k) + " cost: " + str(costs[k-1])
t = increaseFactor*t
acceptFlag = False
while acceptFlag == False:
if k == 1:
theta = 1
else:
a = tkm1
b = t*(thetakm1**2)
c = -t*(thetakm1**2)
theta = (-b + np.sqrt(b**2 - 4*a*c))/(2*a)
y = (1 - theta)*xkm1 + theta*vkm1
(gradf_y,fy) = gradf(y)
x = proxg(y - t*gradf_y,t)
fx = evalf(x)
if fx <= fy + np.vdot(gradf_y,x - y) + (.5/t)*np.sum((x - y)**2):
acceptFlag = True
else:
t = decreaseFactor*t
tkm1 = t
thetakm1 = theta
vkm1 = xkm1 + (1/theta)*(x - xkm1)
xkm1 = x
return (xkm1,costs)
if __name__ == '__main__':
delta = 5.0
numRows = 300
numCols = 50
A = np.random.randn(numRows,numCols)
ATrans = np.transpose(A)
xTrue = delta*np.random.rand(numCols,1)
b = np.dot(A,xTrue)
noise = .1*np.random.randn(numRows,1)
b = b + noise
def evalf(x):
AxMinusb = np.dot(A, x) - b
val = .5 * np.sum(AxMinusb ** 2)
return val
def gradf(x):
AxMinusb = np.dot(A, x) - b
grad = np.dot(ATrans, AxMinusb)
val = .5 * np.sum(AxMinusb ** 2)
return (grad, val)
def evalg(x):
return 0.0
def proxg(x,t):
return np.maximum(np.minimum(x,delta),0.0)
x0 = np.zeros((numCols,1))
params = {'maxIter': 500, 'stepSize': 1.0, 'showTrigger': 5}
(x,costs) = fista(gradf,proxg,evalf,evalg,x0,params)
plt.figure()
plt.plot(x)
plt.plot(xTrue)
plt.figure()
plt.semilogy(costs)

Generating an image of the Mandelbrot set

I've been trying to generate an image of the Mandelbrot set but something is obviously very wrong. Here's the image this code produces: http://puu.sh/csUDd/bdfa6c1d98.png
I'm using Scala and processing. The coloring technique is very simple but i don't think it's the main problem by looking at the image's shape. Thank you.
for (i <- 0 to width){ // 0 to 700
for (j <- 0 to height){ // 0 to 400
val x = i/200.toDouble - 2.5 // scaled to -2.5, 1
val y = j/200.toDouble - 1 // scaled to -1, 1
val c = new Complex(x, y)
val iterMax = 1000
var z = c
var iterations = 0
var inSet = false
while (z.abs < 2 && iterations < iterMax) {
z = z.squared.plus(c)
iterations += 1
if (iterations == iterMax) {
inSet = true
}
}
// stroke() defines the current rgb color.
// If the point is in the set, it is coloured black.
// If not, the point is coloured as such: (iterations^5 mod 255, iterations^7 mod 255, iterations^11 mod 255)
// I use 5, 7 and 11 for no specific reason. Using iterations alone results in a black picture.
if (inSet) stroke(0, 0, 0)
else stroke(pow(iterations, 5).toInt % 255, pow(iterations, 7).toInt % 255, pow(iterations, 11).toInt % 255)
// Finally, draw the point.
point(i, j)
}
}
Here's the class for complex numbers
class Complex(val real: Double, val imag: Double) {
def squared = new Complex(real*real - imag*imag, 2*real*imag)
def abs = sqrt(real*real + imag*imag)
def plus(another: Complex) = new Complex(real + real, imag + imag)
}

Your plus method doesn't add with another, I think it should be
def plus(another: Complex) = new Complex(real + another.real, imag + another.imag)

Ok, i figured it out. There was a mistake in the complex number class.
def plus(another: Complex) = new Complex(real + real, imag + imag)
--->
def plus(another: Complex) = new Complex(this.real + another.real, this.imag + another.imag)
Here's the result for anyone interested
http://puu.sh/csYbb/b2a0d882e1.png

Matlab FFT and home brewed FFT

I'm trying to verify an FFT algorithm I should use for a project VS the same thing on Matlab.
The point is that with my own C FFT function I always get the right (the second one) part of the double sided FFT spectrum evaluated in Matlab and not the first one as "expected".
For instance if my third bin is in the form a+i*b the third bin of Matlab's FFT is a-i*b. A and b values are the same but i always get the complex conjugate of Matlab's.
I know that in terms of amplitudes and power there's no trouble (cause abs value) but I wonder if in terms of phases I'm going to read always wrong angles.
Im not so skilled in Matlab to know (and I have not found useful infos on the web) if Matlab FFT maybe returns the FFT spectre with negative frequencies first and then positive... or if I have to fix my FFT algorithm... or if it is all ok because phases are the unchanged regardless wich part of FFT we choose as single side spectrum (but i doubt about this last option).
Example:
If S is the sample array with N=512 samples, Y = fft(S) in Matlab return the FFT as (the sign of the imaginary part in the first half of the array are random, just to show the complex conjugate difference for the second part):
1 A1 + i*B1 (DC, B1 is always zero)
2 A2 + i*B2
3 A3 - i*B3
4 A4 + i*B4
5 A5 + i*B5
...
253 A253 - i*B253
254 A254 + i*B254
255 A255 + i*B255
256 A256 + i*B256
257 A257 - i*B257 (Nyquyst, B257 is always zero)
258 A256 - i*B256
259 A255 - i*B255
260 A254 - i*B254
261 A253 + i*B253
...
509 A5 - i*B5
510 A4 - i*B4
511 A3 + i*B3
512 A2 - i*B2
My FFT implementation returns only 256 values (and that's ok) in the the Y array as:
1 1 A1 + i*B1 (A1 is the DC, B1 is Nyquist, both are pure Real numbers)
2 512 A2 - i*B2
3 511 A3 + i*B3
4 510 A4 - i*B4
5 509 A5 + i*B5
...
253 261 A253 + i*B253
254 260 A254 - i*B254
255 259 A255 - i*B255
256 258 A256 - i*B256
Where the first column is the proper index of my Y array and the second is just the reference of the relative row in the Matlab FFT implementation.
As you can see my FFT implementation (DC apart) returns the FFT like the second half of the Matlab's FFT (in reverse order).
To summarize: even if I use fftshift as suggested, it seems that my implementation always return what in the Matlab FFT should be considered the negative part of the spectrum.
Where is the error???
This is the code I use:
Note 1: the FFT array is not declared here and it is changed inside the function. Initially it holds the N samples (real values) and at the end it contains the N/2 +1 bins of the single sided FFT spectrum.
Note 2: the N/2+1 bins are stored in N/2 elements only because the DC component is always real (and it is stored in FFT[0]) and also the Nyquyst (and it is stored in FFT[1]), this exception apart all the other even elements K holds a real number and the oven elements K+1 holds the imaginary part.
void Fft::FastFourierTransform( bool inverseFft ) {
double twr, twi, twpr, twpi, twtemp, ttheta;
int i, i1, i2, i3, i4, c1, c2;
double h1r, h1i, h2r, h2i, wrs, wis;
int nn, ii, jj, n, mmax, m, j, istep, isign;
double wtemp, wr, wpr, wpi, wi;
double theta, tempr, tempi;
// NS is the number of samples and it must be a power of two
if( NS == 1 )
return;
if( !inverseFft ) {
ttheta = 2.0 * PI / NS;
c1 = 0.5;
c2 = -0.5;
}
else {
ttheta = 2.0 * PI / NS;
c1 = 0.5;
c2 = 0.5;
ttheta = -ttheta;
twpr = -2.0 * Pow( Sin( 0.5 * ttheta ), 2 );
twpi = Sin(ttheta);
twr = 1.0+twpr;
twi = twpi;
for( i = 2; i <= NS/4+1; i++ ) {
i1 = i+i-2;
i2 = i1+1;
i3 = NS+1-i2;
i4 = i3+1;
wrs = twr;
wis = twi;
h1r = c1*(FFT[i1]+FFT[i3]);
h1i = c1*(FFT[i2]-FFT[i4]);
h2r = -c2*(FFT[i2]+FFT[i4]);
h2i = c2*(FFT[i1]-FFT[i3]);
FFT[i1] = h1r+wrs*h2r-wis*h2i;
FFT[i2] = h1i+wrs*h2i+wis*h2r;
FFT[i3] = h1r-wrs*h2r+wis*h2i;
FFT[i4] = -h1i+wrs*h2i+wis*h2r;
twtemp = twr;
twr = twr*twpr-twi*twpi+twr;
twi = twi*twpr+twtemp*twpi+twi;
}
h1r = FFT[0];
FFT[0] = c1*(h1r+FFT[1]);
FFT[1] = c1*(h1r-FFT[1]);
}
if( inverseFft )
isign = -1;
else
isign = 1;
n = NS;
nn = NS/2;
j = 1;
for(ii = 1; ii <= nn; ii++) {
i = 2*ii-1;
if( j>i ) {
tempr = FFT[j-1];
tempi = FFT[j];
FFT[j-1] = FFT[i-1];
FFT[j] = FFT[i];
FFT[i-1] = tempr;
FFT[i] = tempi;
}
m = n/2;
while( m>=2 && j>m ) {
j = j-m;
m = m/2;
}
j = j+m;
}
mmax = 2;
while(n>mmax) {
istep = 2*mmax;
theta = 2.0 * PI /(isign*mmax);
wpr = -2.0 * Pow( Sin( 0.5 * theta ), 2 );
wpi = Sin(theta);
wr = 1.0;
wi = 0.0;
for(ii = 1; ii <= mmax/2; ii++) {
m = 2*ii-1;
for(jj = 0; jj <= (n-m)/istep; jj++) {
i = m+jj*istep;
j = i+mmax;
tempr = wr*FFT[j-1]-wi*FFT[j];
tempi = wr*FFT[j]+wi*FFT[j-1];
FFT[j-1] = FFT[i-1]-tempr;
FFT[j] = FFT[i]-tempi;
FFT[i-1] = FFT[i-1]+tempr;
FFT[i] = FFT[i]+tempi;
}
wtemp = wr;
wr = wr*wpr-wi*wpi+wr;
wi = wi*wpr+wtemp*wpi+wi;
}
mmax = istep;
}
if( inverseFft )
for(i = 1; i <= 2*nn; i++)
FFT[i-1] = FFT[i-1]/nn;
if( !inverseFft ) {
twpr = -2.0 * Pow( Sin( 0.5 * ttheta ), 2 );
twpi = Sin(ttheta);
twr = 1.0+twpr;
twi = twpi;
for(i = 2; i <= NS/4+1; i++) {
i1 = i+i-2;
i2 = i1+1;
i3 = NS+1-i2;
i4 = i3+1;
wrs = twr;
wis = twi;
h1r = c1*(FFT[i1]+FFT[i3]);
h1i = c1*(FFT[i2]-FFT[i4]);
h2r = -c2*(FFT[i2]+FFT[i4]);
h2i = c2*(FFT[i1]-FFT[i3]);
FFT[i1] = h1r+wrs*h2r-wis*h2i;
FFT[i2] = h1i+wrs*h2i+wis*h2r;
FFT[i3] = h1r-wrs*h2r+wis*h2i;
FFT[i4] = -h1i+wrs*h2i+wis*h2r;
twtemp = twr;
twr = twr*twpr-twi*twpi+twr;
twi = twi*twpr+twtemp*twpi+twi;
}
h1r = FFT[0];
FFT[0] = h1r+FFT[1]; // DC
FFT[1] = h1r-FFT[1]; // FS/2 (NYQUIST)
}
return;
}

In matlab try using fftshift(fft(...)). Matlab doesn't automatically shift the spectrum after the FFT is called which is why they implemented the fftshift() function.

It is simply a matlab formatting thing. Basically, matlab arrange Fourier transform in following order
DC, (DC-1), .... (Nyquist-1), -Nyquist, -Nyquist+1, ..., DC-1
Let's say you have a 8 point sequence: [1 2 3 1 4 5 1 3]
In your signal processing class, your professor probably draws the Fourier spectrum based on a Cartesian system ( negative -> positive for x axis); So your DC should be located at 0 (the 4th position in your fft sequence, assuming position index here is 0-based) on your x axis.
In matlab, the DC is the very first element in the fft sequence, so you need to to fftshit() to swap the first half and second half of the fft sequence such that DC will be located at 4th position (position is 0-based indexed)
I am attaching a graph here so you may have a visual:
where a is the original 8-point sequence; FT(a) is the Fourier transform of a.
The matlab code is here:
a = [1 2 3 1 4 5 1 3];
A = fft(a);
N = length(a);
x = -N/2:N/2-1;
figure
subplot(3,1,1), stem(x, a,'o'); title('a'); xlabel('time')
subplot(3,1,2), stem(x, fftshift(abs(A),2),'o'); title('FT(a) in signal processing'); xlabel('frequency')
subplot(3,1,3), stem(x, abs(A),'o'); title('FT(a) in matlab'); xlabel('frequency')

How can I find a matching between two independent sets of features extracted from two images of the same scene captured by two different cameras?

I need to find matching between two independent sets of features extracted from two images of the same scene captured by two different cameras.
I'm using the HumanEvaI data set, so I have the calibration matrices of the cameras (in this particular case BW1 and BW2).
I can not use method like simple correlation, SIFT or SURF to solve the problem because the cameras are quite far away from each other and also rotated. So the differences between the images are big and there is occlusion as well.
I have been focused in finding an Homography between the captured images based on ground truth points matching that I have been able to build due to the calibration information I already have.
Once I have this homography I will use a perfect matching (Hungarian algorithm) to find the best correspondence. The importance of the homography here is that is the way I have to calculate the distance between the points.
So far everything seems fine, my problem is that I haven't been able to find a good homography . I have tried RANSAC method, gold standard method with sampson distance (this is a nonlinear optimization approach) and mainly everything from a book called 'Multiple View Geometry in Computer Vision' Second Edition by Richard Hartley.
I have implemented everything in matlab so far.
Is there another way to do this? I'm I in the right path? If so what could I have been doing wrong?

You can try these two methods:
A new point matching algorithm for non-rigid registration (uses Thin-plate Spline) - relatively slower.
Point Set Registration: Coherent Point Drift (faster and more accurate I feel).
As far as 2nd method is concerned, I feel that it gives very good registration result in presence of outliers, it is fast and is able to recover complex transformations. But the 1st method is also a well-known method in registration field and you may try that as well.
Try understanding the core of the algorithm and then move on to the codes available online.
Thin plate spline here - Download the TPS-RPM demo.
Point Set Registration: Coherent Point Drift here

You might find my solution interesting. It is a pure numpy implementation of the Coherent Point Drift algorithm.
Here is an example:
from functools import partial
from scipy.io import loadmat
import matplotlib.pyplot as plt
import numpy as np
import time
class RigidRegistration(object):
def __init__(self, X, Y, R=None, t=None, s=None, sigma2=None, maxIterations=100, tolerance=0.001, w=0):
if X.shape[1] != Y.shape[1]:
raise 'Both point clouds must have the same number of dimensions!'
self.X = X
self.Y = Y
(self.N, self.D) = self.X.shape
(self.M, _) = self.Y.shape
self.R = np.eye(self.D) if R is None else R
self.t = np.atleast_2d(np.zeros((1, self.D))) if t is None else t
self.s = 1 if s is None else s
self.sigma2 = sigma2
self.iteration = 0
self.maxIterations = maxIterations
self.tolerance = tolerance
self.w = w
self.q = 0
self.err = 0
def register(self, callback):
self.initialize()
while self.iteration < self.maxIterations and self.err > self.tolerance:
self.iterate()
callback(X=self.X, Y=self.Y)
return self.Y, self.s, self.R, self.t
def iterate(self):
self.EStep()
self.MStep()
self.iteration = self.iteration + 1
def MStep(self):
self.updateTransform()
self.transformPointCloud()
self.updateVariance()
def updateTransform(self):
muX = np.divide(np.sum(np.dot(self.P, self.X), axis=0), self.Np)
muY = np.divide(np.sum(np.dot(np.transpose(self.P), self.Y), axis=0), self.Np)
self.XX = self.X - np.tile(muX, (self.N, 1))
YY = self.Y - np.tile(muY, (self.M, 1))
self.A = np.dot(np.transpose(self.XX), np.transpose(self.P))
self.A = np.dot(self.A, YY)
U, _, V = np.linalg.svd(self.A, full_matrices=True)
C = np.ones((self.D, ))
C[self.D-1] = np.linalg.det(np.dot(U, V))
self.R = np.dot(np.dot(U, np.diag(C)), V)
self.YPY = np.dot(np.transpose(self.P1), np.sum(np.multiply(YY, YY), axis=1))
self.s = np.trace(np.dot(np.transpose(self.A), self.R)) / self.YPY
self.t = np.transpose(muX) - self.s * np.dot(self.R, np.transpose(muY))
def transformPointCloud(self, Y=None):
if not Y:
self.Y = self.s * np.dot(self.Y, np.transpose(self.R)) + np.tile(np.transpose(self.t), (self.M, 1))
return
else:
return self.s * np.dot(Y, np.transpose(self.R)) + np.tile(np.transpose(self.t), (self.M, 1))
def updateVariance(self):
qprev = self.q
trAR = np.trace(np.dot(self.A, np.transpose(self.R)))
xPx = np.dot(np.transpose(self.Pt1), np.sum(np.multiply(self.XX, self.XX), axis =1))
self.q = (xPx - 2 * self.s * trAR + self.s * self.s * self.YPY) / (2 * self.sigma2) + self.D * self.Np/2 * np.log(self.sigma2)
self.err = np.abs(self.q - qprev)
self.sigma2 = (xPx - self.s * trAR) / (self.Np * self.D)
if self.sigma2 <= 0:
self.sigma2 = self.tolerance / 10
def initialize(self):
self.Y = self.s * np.dot(self.Y, np.transpose(self.R)) + np.repeat(self.t, self.M, axis=0)
if not self.sigma2:
XX = np.reshape(self.X, (1, self.N, self.D))
YY = np.reshape(self.Y, (self.M, 1, self.D))
XX = np.tile(XX, (self.M, 1, 1))
YY = np.tile(YY, (1, self.N, 1))
diff = XX - YY
err = np.multiply(diff, diff)
self.sigma2 = np.sum(err) / (self.D * self.M * self.N)
self.err = self.tolerance + 1
self.q = -self.err - self.N * self.D/2 * np.log(self.sigma2)
def EStep(self):
P = np.zeros((self.M, self.N))
for i in range(0, self.M):
diff = self.X - np.tile(self.Y[i, :], (self.N, 1))
diff = np.multiply(diff, diff)
P[i, :] = P[i, :] + np.sum(diff, axis=1)
c = (2 * np.pi * self.sigma2) ** (self.D / 2)
c = c * self.w / (1 - self.w)
c = c * self.M / self.N
P = np.exp(-P / (2 * self.sigma2))
den = np.sum(P, axis=0)
den = np.tile(den, (self.M, 1))
den[den==0] = np.finfo(float).eps
self.P = np.divide(P, den)
self.Pt1 = np.sum(self.P, axis=0)
self.P1 = np.sum(self.P, axis=1)
self.Np = np.sum(self.P1)
def visualize(X, Y, ax):
plt.cla()
ax.scatter(X[:,0] , X[:,1], color='red')
ax.scatter(Y[:,0] , Y[:,1], color='blue')
plt.draw()
plt.pause(0.001)
def main():
fish = loadmat('./data/fish.mat')
X = fish['X'] # number-of-points x number-of-dimensions array of fixed points
Y = fish['Y'] # number-of-points x number-of-dimensions array of moving points
fig = plt.figure()
fig.add_axes([0, 0, 1, 1])
callback = partial(visualize, ax=fig.axes[0])
reg = RigidRegistration(X, Y)
reg.register(callback)
plt.show()
if __name__ == '__main__':
main()

Another method I think you might find useful is described here.
This approach tries to fit local models to group of points. Its global optimization method allows it to outperform RANSAC when several distinct local models exists.
I also believe they have code available.

How to solve a linear system of matrices in scala breeze?

How to solve a linear system of matrices in scala breeze? ie, I have Ax = b, where A is a matrix (usually positive definite), and x and b are vectors.
I can see that there is a cholesky decomposition available, but I couldn't seem to find a solver? (if it was matlab I could do x = b \ A. If it was scipy I could do x = A.solve(b) )

Apparently, it is quite simple in fact, and built into scala-breeze as an operator:
x = A \ b
It doesnt use Cholesky, it uses LU decomposition, which is I think about half as fast, but they are both O(n^3), so comparable.

Well, I wrote my own solver in the end. I'm not sure if this is the optimal way to do it, but it doesn't seem unreasonable? :
// Copyright Hugh Perkins 2012
// You can use this under the terms of the Apache Public License 2.0
// http://www.apache.org/licenses/LICENSE-2.0
package root
import breeze.linalg._
object Solver {
// solve Ax = b, for x, where A = choleskyMatrix * choleskyMatrix.t
// choleskyMatrix should be lower triangular
def solve( choleskyMatrix: DenseMatrix[Double], b: DenseVector[Double] ) : DenseVector[Double] = {
val C = choleskyMatrix
val size = C.rows
if( C.rows != C.cols ) {
// throw exception or something
}
if( b.length != size ) {
// throw exception or something
}
// first we solve C * y = b
// (then we will solve C.t * x = y)
val y = DenseVector.zeros[Double](size)
// now we just work our way down from the top of the lower triangular matrix
for( i <- 0 until size ) {
var sum = 0.
for( j <- 0 until i ) {
sum += C(i,j) * y(j)
}
y(i) = ( b(i) - sum ) / C(i,i)
}
// now calculate x
val x = DenseVector.zeros[Double](size)
val Ct = C.t
// work up from bottom this time
for( i <- size -1 to 0 by -1 ) {
var sum = 0.
for( j <- i + 1 until size ) {
sum += Ct(i,j) * x(j)
}
x(i) = ( y(i) - sum ) / Ct(i,i)
}
x
}
}