Teaching a Neural Net: Bipolar XOR - neural-network

I'm trying to to teach a neural net of 2 inputs, 4 hidden nodes (all in same layer) and 1 output node. The binary representation works fine, but I have problems with the Bipolar. I can't figure out why, but the total error will sometimes converge to the same number around 2.xx. My sigmoid is 2/(1+ exp(-x)) - 1. Perhaps I'm sigmoiding in the wrong place. For example to calculate the output error should I be comparing the sigmoided output with the expected value or with the sigmoided expected value?
I was following this website here: http://galaxy.agh.edu.pl/~vlsi/AI/backp_t_en/backprop.html , but they use different functions then I was instructed to use. Even when I did try to implement their functions I still ran into the same problem. Either way I get stuck about half the time at the same number (a different number for different implementations). Please tell me if I have made a mistake in my code somewhere or if this is normal (I don't see how it could be). Momentum is set to 0. Is this a common 0 momentum problem? The error functions we are supposed to be using are:
if ui is an output unit
Error(i) = (Ci - ui ) * f'(Si )
if ui is a hidden unit
Error(i) = Error(Output) * weight(i to output) * f'(Si)
public double sigmoid( double x ) {
double fBipolar, fBinary, temp;
temp = (1 + Math.exp(-x));
fBipolar = (2 / temp) - 1;
fBinary = 1 / temp;
if(bipolar){
return fBipolar;
}else{
return fBinary;
}
}
// Initialize the weights to random values.
private void initializeWeights(double neg, double pos) {
for(int i = 0; i < numInputs + 1; i++){
for(int j = 0; j < numHiddenNeurons; j++){
inputWeights[i][j] = Math.random() - pos;
if(inputWeights[i][j] < neg || inputWeights[i][j] > pos){
print("ERROR ");
print(inputWeights[i][j]);
}
}
}
for(int i = 0; i < numHiddenNeurons + 1; i++){
hiddenWeights[i] = Math.random() - pos;
if(hiddenWeights[i] < neg || hiddenWeights[i] > pos){
print("ERROR ");
print(hiddenWeights[i]);
}
}
}
// Computes output of the NN without training. I.e. a forward pass
public double outputFor ( double[] argInputVector ) {
for(int i = 0; i < numInputs; i++){
inputs[i] = argInputVector[i];
}
double weightedSum = 0;
for(int i = 0; i < numHiddenNeurons; i++){
weightedSum = 0;
for(int j = 0; j < numInputs + 1; j++){
weightedSum += inputWeights[j][i] * inputs[j];
}
hiddenActivation[i] = sigmoid(weightedSum);
}
weightedSum = 0;
for(int j = 0; j < numHiddenNeurons + 1; j++){
weightedSum += (hiddenActivation[j] * hiddenWeights[j]);
}
return sigmoid(weightedSum);
}
//Computes the derivative of f
public static double fPrime(double u){
double fBipolar, fBinary;
fBipolar = 0.5 * (1 - Math.pow(u,2));
fBinary = u * (1 - u);
if(bipolar){
return fBipolar;
}else{
return fBinary;
}
}
// This method is used to update the weights of the neural net.
public double train ( double [] argInputVector, double argTargetOutput ){
double output = outputFor(argInputVector);
double lastDelta;
double outputError = (argTargetOutput - output) * fPrime(output);
if(outputError != 0){
for(int i = 0; i < numHiddenNeurons + 1; i++){
hiddenError[i] = hiddenWeights[i] * outputError * fPrime(hiddenActivation[i]);
deltaHiddenWeights[i] = learningRate * outputError * hiddenActivation[i] + (momentum * lastDelta);
hiddenWeights[i] += deltaHiddenWeights[i];
}
for(int in = 0; in < numInputs + 1; in++){
for(int hid = 0; hid < numHiddenNeurons; hid++){
lastDelta = deltaInputWeights[in][hid];
deltaInputWeights[in][hid] = learningRate * hiddenError[hid] * inputs[in] + (momentum * lastDelta);
inputWeights[in][hid] += deltaInputWeights[in][hid];
}
}
}
return 0.5 * (argTargetOutput - output) * (argTargetOutput - output);
}

General coding comments:
initializeWeights(-1.0, 1.0);
may not actually get the initial values you were expecting.
initializeWeights should probably have:
inputWeights[i][j] = Math.random() * (pos - neg) + neg;
// ...
hiddenWeights[i] = (Math.random() * (pos - neg)) + neg;
instead of:
Math.random() - pos;
so that this works:
initializeWeights(0.0, 1.0);
and gives you initial values between 0.0 and 1.0 rather than between -1.0 and 0.0.
lastDelta is used before it is declared:
deltaHiddenWeights[i] = learningRate * outputError * hiddenActivation[i] + (momentum * lastDelta);
I'm not sure if the + 1 on numInputs + 1 and numHiddenNeurons + 1 are necessary.
Remember to watch out for rounding of ints: 5/2 = 2, not 2.5!
Use 5.0/2.0 instead. In general, add the .0 in your code when the output should be a double.
Most importantly, have you trained the NeuralNet long enough?
Try running it with numInputs = 2, numHiddenNeurons = 4, learningRate = 0.9, and train for 1,000 or 10,000 times.
Using numHiddenNeurons = 2 it sometimes get "stuck" when trying to solve the XOR problem.
See also XOR problem - simulation

Related

Boids in compute shader: How can I avoid a stable state when processing every boid in parallel?

I want to implement the simple, age-old boid algorithm in 2D but as compute shader in HLSL (host programm ist VVVV).
Now my problem: In all 2D sample-applications I`ve found, boids are updated one at a time. This results in some jittering that causes the typical "random" results. Direction changes, etc.
I am not sure if my implementation is correct, but if I test each rule individually, my results look like other references. If I apply the rules together however (in pretty much any combination), very soon my boids enter a stable state where they form a fixed formation and just fly in some particular direction. Changing the view-radius influences the size and number of formations but doesn`t introduce anything "chaotic" or flock like, just static bunches after a couple of seconds.
Is there an implementation problem or is this just a property of parallel compute you have to compensate (somehow?
Relevant code:
void CS(uint3 tid : SV_DispatchThreadID){
if (tid.x >= elementCount)
return;
if(reset){
OutputBuffer[tid.x].x = random(float2(rand.x + 12,tid.x-4)); // PosXY
OutputBuffer[tid.x].y = random(float2(tid.x + 16,rand.y*6));
OutputBuffer[tid.x].z = random(float2(tid.x,tid.x * 2)) / 100; //VelXY
OutputBuffer[tid.x].w = random(float2(tid.x * 16, tid.x / 4))/ 100;
}else{
float maxSpeed = 0.01;
float2 myPos = OutputBuffer[tid.x].xy;
float2 myVel = OutputBuffer[tid.x].zw;
float2 myAcc = 0;
float2 steerAlign = 0;
float2 steerCohesion = 0;
float2 steerAvoid = 0;
int alignCount = 0;
int cohesionCount = 0;
int avoidCount = 0;
for(uint i = 0; i < elementCount; i++){
if(i != tid.x){
float2 iPos = OutputBuffer[i].xy;
float2 iVel = OutputBuffer[i].wz;
float dist = distance(iPos,myPos);
if(dist < range / 2){
steerAlign += iVel;
alignCount++;
}
if(dist < range * 3){
steerCohesion += iPos - myPos;
cohesionCount++;
}
if(dist < range){
float2 diff = myPos - iPos;
diff /= dist * dist;
steerAvoid += diff;
avoidCount++;
}
}
}
if(alignCount > 0 && steerAlign.x != 0){
steerAlign /= alignCount;
steerAlign = normalize(steerAlign) * maxForce;
}
if(cohesionCount > 0){
steerCohesion /= cohesionCount;
steerCohesion = normalize(steerCohesion) * maxForce;
}
if(avoidCount > 0){
steerAvoid /= avoidCount;
steerAvoid = normalize(steerAvoid) * maxForce;
}
if(myPos.x < -1){
myPos.x = 1;
}
if(myPos.x > 1){
myPos.x = -1;
}
if(myPos.y < -1){
myPos.y = 1;
}
if(myPos.y > 1){
myPos.y = -1;
}
myAcc = (steerAlign * alignFx) + (steerCohesion * cohesionFx) + (steerAvoid * seperationFx);
myAcc = clamp(myAcc, -maxForce, maxForce);
myVel += myAcc;
myVel = clamp(myVel,-maxSpeed,maxSpeed);
myPos += myVel;
OutputBuffer[tid.x].xy = myPos;
OutputBuffer[tid.x].zw = myVel;
}
}

Same C code different results TIv5.2.5 and gcc 5.4.1 c99 compiler

I am using MSP432P401R to do FFT of SAR ADC samples, did FFT in MATLAB and got results same as C compiler online but Code Composer Studio IDE is giving different output than MATLAB results, I thought that can be a compiler issue so tried reading same did some changes and tried but not getting results Like MATLAB.
Online C compiler was gcc 5.4.1 c99.
and in CCS TI v5.2.5 compiler is used.
float m;
float ur, ui, sr, si,tr, ti;
long double Temp_A[256],ArrayA[256]={2676,2840,2838,2832,2826,2818,2814,2808,
2804,2798,2790,2784,2778,2770,2764,2758,2752,2746,2740,2734,
2726,2720,2714,2706,2700,2692,2686,2680,2674,2668,2660,2654,
2646,2642,2634,2624,2618,2612,2604,2598,2590,2584,2576,2570,
2562,2556,2550,2542,2536,2530,2522,2512,2508,2498,2490,2484,
2478,2470,2462,2454,2448,2442,2432,2426,2420,2414,2404,2398,
2390,2382,2374,2368,2360,2352,2346,2338,2330,2322,2314,2306,
2300,2294,2286,2278,2272,2262,2258,2250,2238,2234,2228,2220,
2208,2202,2192,2186,2178,2170,2164,2156,2150,2142,2134,2126,
2116,2110,2104,2096,2088,2078,2070,2062,2054,2046,2040,2034,
2026,2018,2010,2002,1994,1986,1978,1970,1962,1954,1946,1936,
1930,1922,1914,1908,1902,1894,1886,1876,1868,1860,1852,1846,
1838,1830,1822,1814,1804,1796,1790,1784,1776,1768,1760,1754,
1746,1738,1728,1720,1714,1708,1698,1692,1684,1674,1668,1656,
1656,1644,1640,1628,1624,1612,1610,1598,1596,1584,1580,1570,
1564,1554,1546,1540,1532,1526,1520,1512,1504,1496,1490,1482,
1474,1468,1462,1454,1446,1438,1432,1424,1420,1410,1404,1398,
1392,1384,1376,1370,1364,1356,1348,1342,1336,1328,1322,1316,
1308,1300,1294,1286,1280,1276,1270,1262,1254,1248,1242,1236,
1230,1222,1216,1210,1206,1198,1192,1188,1178,1172,1168,1162,
1154,1148,1144,1138,1132,1126,1120,1114,1108,1102,1096,1090,
1084,1080,1074,1068,1062,1058,1052,1048},ArrayA_IMX[256]={0};
unsigned int jm1,i;
unsigned int ip,l;
void main(void)
{
WDT_A->CTL = WDT_A_CTL_PW |WDT_A_CTL_HOLD;
VCORE();
CLK();
P1DIR |= BIT5; //CLK--AD7352 OUTPUT DIRECTION
P1DIR |= BIT7; //CHIP SELECT--AD7352 OUTPUT DIRECTION
P5DIR &= ~BIT0; //SDATAA--AD7352 INPUT DIRECTION P5.0
P5DIR &= ~BIT2; //SDATAB--AD7352 INPUT DIRECTION P5.2
while(1)
{
bit_reversal(ArrayA);
fft(ArrayA,ArrayA_IMX);
}
}
void bit_reversal(long double REX[])
{
int i,i2,n,m;
int tx,k,j;
n = 1;
m=8;
for (i=0;i<m;i++)
{
n *= 2;
}
i2 = n >> 1;
j = 0;
for (i=0;i<n-1;i++)
{
if (i < j)
{
tx = REX[i];
//ty = IMX[i];
REX[i] = REX[j];
//IMX[i] = IMX[j];
REX[j] = tx;
//IMX[j] = ty;
}
k = i2;
while (k <= j)
{
j -= k;
k >>= 1;
}
j += k;
}
}
void fft(long double REX[],long double IMX[])
{
N = 256;
nm1 = N - 1;
nd2 = N / 2;
m = log10l(N) / log10l(2);
j = nd2;
for (l = 1; l <= m; l++)
{
le = powl(2, l);
le2 = le / 2;
ur = 1;
ui = 0;
// Calculate sine and cosine values
sr = cosl(M_PI/le2);
si = -sinl(M_PI/le2);
// Loop for each sub DFT
for (j = 1; j <= le2; j++)
{
jm1 = j - 1;
// Loop for each butterfly
for (i = jm1; i <= nm1; i += le)
{
ip = i + le2;
tr = REX[ip]*ur - IMX[ip]*ui;
ti = REX[ip]*ui + IMX[ip]*ur;
REX[ip] = REX[i] - tr;
IMX[ip] = IMX[i] - ti;
REX[i] = REX[i] + tr;
IMX[i] = IMX[i] + ti;
}
tr = ur;
ur = tr*sr - ui*si;
ui = tr*si + ui*sr;
}
}
}

Building a non-layered LSTM network from scratch, how to do forward pass and backward pass?

I'm building a LSTM network from scratch, from my own understanding of how LSTM cells work.
There are no layers, so I'm trying to implement non-vectorized forms of the equations I see in the tutorials. I'm also using peepholes from the cell state.
So far, I understand that it looks like this: LSTM network
With that I've made these equations for each of the gates for forward pass:
i_t = sigmoid( i_w * (x_t + c_t) + i_b )
f_t = sigmoid( f_w * (x_t + c_t) + f_b )
cell_gate = tanh( c_w * x_t + c_b )
c_t = (f_t * c_t) + (i_t * cell_gate)
o_t = sigmoid( o_w * (x_t + c_t) + o_b )
h_t = o_t * tanh(c_t)
Where _w's mean weights for that respective gate and _b for biases. Also, I've named that first sigmoid on the far left the "cell_gate".
Back pass is where things get fuzzy for me, I'm not sure how to derive these equations correctly.
I know generally to calculate error, the equation is: error = f'(x_t) * (received_error). Where f'(x_t) is the first derivative of the activation function and received_error could be either (target - output) for output neurons or ∑(o_e * w_io) for hidden neurons.
Where o_e is the error of one of the cells the current cell outputs to and w_io is the weight connecting them.
I not sure if the LSTM cell as a whole is considered a neuron, so I treated each of the gates as neurons and tried to calculate error signals for each. Then used the error signal from the cell gate alone to pass back up the network...:
o_e = sigmoid'(o_w * (x_t + c_t) + o_b) * (received_error)
o_w += o_l * x_t * o_e
o_b += o_l * sigmoid(o_b) * o_e
...The rest of the gates follow the same format...
Then the error for the entire LSTM cell is equal to o_e.
Then for a LSTM cell above the current cell, the error it receives is equal to to:
tanh'(x_t) * ∑(o_e * w_io)
Is this all correct? Am I doing anything completely wrong?
I to am taking on this task, I believe your approach is correct:
https://github.com/evolvingstuff/LongShortTermMemory/blob/master/src/com/evolvingstuff/LSTM.java
Some nice work from: Thomas Lahore
//////////////////////////////////////////////////////////////
//////////////////////////////////////////////////////////////
//BACKPROP
//////////////////////////////////////////////////////////////
//////////////////////////////////////////////////////////////
//scale partials
for (int c = 0; c < cell_blocks; c++) {
for (int i = 0; i < full_input_dimension; i++) {
this.dSdwWeightsInputGate[c][i] *= ForgetGateAct[c];
this.dSdwWeightsForgetGate[c][i] *= ForgetGateAct[c];
this.dSdwWeightsNetInput[c][i] *= ForgetGateAct[c];
dSdwWeightsInputGate[c][i] += full_input[i] * neuronInputGate.Derivative(InputGateSum[c]) * NetInputAct[c];
dSdwWeightsForgetGate[c][i] += full_input[i] * neuronForgetGate.Derivative(ForgetGateSum[c]) * CEC1[c];
dSdwWeightsNetInput[c][i] += full_input[i] * neuronNetInput.Derivative(NetInputSum[c]) * InputGateAct[c];
}
}
if (target_output != null) {
double[] deltaGlobalOutputPre = new double[output_dimension];
for (int k = 0; k < output_dimension; k++) {
deltaGlobalOutputPre[k] = target_output[k] - output[k];
}
//output to hidden
double[] deltaNetOutput = new double[cell_blocks];
for (int k = 0; k < output_dimension; k++) {
//links
for (int c = 0; c < cell_blocks; c++) {
deltaNetOutput[c] += deltaGlobalOutputPre[k] * weightsGlobalOutput[k][c];
weightsGlobalOutput[k][c] += deltaGlobalOutputPre[k] * NetOutputAct[c] * learningRate;
}
//bias
weightsGlobalOutput[k][cell_blocks] += deltaGlobalOutputPre[k] * 1.0 * learningRate;
}
for (int c = 0; c < cell_blocks; c++) {
//update output gates
double deltaOutputGatePost = deltaNetOutput[c] * CECSquashAct[c];
double deltaOutputGatePre = neuronOutputGate.Derivative(OutputGateSum[c]) * deltaOutputGatePost;
for (int i = 0; i < full_input_dimension; i++) {
weightsOutputGate[c][i] += full_input[i] * deltaOutputGatePre * learningRate;
}
peepOutputGate[c] += CEC3[c] * deltaOutputGatePre * learningRate;
//before outgate
double deltaCEC3 = deltaNetOutput[c] * OutputGateAct[c] * neuronCECSquash.Derivative(CEC3[c]);
//update input gates
double deltaInputGatePost = deltaCEC3 * NetInputAct[c];
double deltaInputGatePre = neuronInputGate.Derivative(InputGateSum[c]) * deltaInputGatePost;
for (int i = 0; i < full_input_dimension; i++) {
weightsInputGate[c][i] += dSdwWeightsInputGate[c][i] * deltaCEC3 * learningRate;
}
peepInputGate[c] += CEC2[c] * deltaInputGatePre * learningRate;
//before ingate
double deltaCEC2 = deltaCEC3;
//update forget gates
double deltaForgetGatePost = deltaCEC2 * CEC1[c];
double deltaForgetGatePre = neuronForgetGate.Derivative(ForgetGateSum[c]) * deltaForgetGatePost;
for (int i = 0; i < full_input_dimension; i++) {
weightsForgetGate[c][i] += dSdwWeightsForgetGate[c][i] * deltaCEC2 * learningRate;
}
peepForgetGate[c] += CEC1[c] * deltaForgetGatePre * learningRate;
//update cell inputs
for (int i = 0; i < full_input_dimension; i++) {
weightsNetInput[c][i] += dSdwWeightsNetInput[c][i] * deltaCEC3 * learningRate;
}
//no peeps for cell inputs
}
}
//////////////////////////////////////////////////////////////
//roll-over context to next time step
for (int j = 0; j < cell_blocks; j++) {
context[j] = NetOutputAct[j];
CEC[j] = CEC3[j];
}
Also, and perhaps even more interesting is the Lecture and Lecture Notes from Andrej Karpathy:
https://youtu.be/cO0a0QYmFm8?t=45m36s
http://cs231n.stanford.edu/slides/2016/winter1516_lecture10.pdf

iRPROP+ Multilayer Perceptron

Hello everyone This is the code of iRPROP+ algo for my MLP. When I try to train my network, standart deviation decreases for 1500 epoches (so slow: from ~0.5 to 0.4732) but suddenly it starts to increase.
Can someone say what did I do wrong?
public void RPROP()
{
double a = 1.2, b = 0.5, nMax = 50, nMin = 0.000001;
for (int l = Network.Length - 1; l > 0; l--)
{
for (int i = 0; i < Network[l].getSize(); i++)
{
Neuron n = Network[l].Neurons[i];
double sum = 0;
if (l == Network.Length - 1) n.Delta = (n.Output - DesiredOutput[i]) * ActFunc.calcDeprivateFunction(n.Output);
else
{
for (int k = 0; k < Network[l + 1].getSize(); k++)
{
sum += Network[l + 1].Neurons[k].getWeight(i) * Network[l + 1].Neurons[k].Delta;
}
n.Delta = sum * ActFunc.calcDeprivateFunction(n.Output);
}
}
}
for (int l = 1; l < Network.Length; l++)
{
for (int i = 0; i < Network[l].getSize(); i++)
{
Neuron n = Network[l].Neurons[i];
if ((n.PrevDelta * n.Delta) > 0)
{
n.N = Math.Min(a * n.PrevN, nMax);
n.Bias -= n.N * Math.Sign(n.Delta);
for (int j = 0; j < Network[l - 1].getSize(); j++)
{
n.setWeight(j, n.getWeight(j) - n.N * Math.Sign(n.Delta));
}
n.PrevDelta = n.Delta;
}
else if ((n.PrevDelta * n.Delta) < 0)
{
n.N = Math.Max(b * n.PrevN, nMin);
if (this.CurrentError > this.LastError)
{
n.Bias += n.PrevN * Math.Sign(n.PrevDelta);
for (int j = 0; j < Network[l - 1].getSize(); j++)
{
n.setWeight(j, n.getWeight(j) + n.PrevN * Math.Sign(n.PrevDelta));
}
}
n.Delta = 0;
}
else if ((n.PrevDelta * n.Delta) == 0)
{
n.Bias -= n.N * Math.Sign(n.Delta);
for (int j = 0; j < Network[l - 1].getSize(); j++)
{
n.setWeight(j, n.getWeight(j) - n.N * Math.Sign(n.Delta));
}
n.PrevDelta = n.Delta;
}
n.PrevN = n.N;
}
}
}
For the first view, you calculate one train element error and you instantly teach it to the network. try to run over the full train set, without change the weights, and just summarize the Delta. After that, update the weights once, set the prev delta and start over.
Also, there is no update for neuron threshold.

imregionalmax matlab function's equivalent in opencv

I have an image of connected components(circles filled).If i want to segment them i can use watershed algorithm.I prefer writing my own function for watershed instead of using the inbuilt function in OPENCV.I have successfu How do i find the regionalmax of objects using opencv?
I wrote a function myself. My results were quite similar to MATLAB, although not exact. This function is implemented for CV_32F but it can easily be modified for other types.
I mark all the points that are not part of a minimum region by checking all the neighbors. The remaining regions are either minima, maxima or areas of inflection.
I use connected components to label each region.
I check each region for any point belonging to a maxima, if yes then I push that label into a vector.
Finally I sort the bad labels, erase all duplicates and then mark all the points in the output as not minima.
All that remains are the regions of minima.
Here is the code:
// output is a binary image
// 1: not a min region
// 0: part of a min region
// 2: not sure if min or not
// 3: uninitialized
void imregionalmin(cv::Mat& img, cv::Mat& out_img)
{
// pad the border of img with 1 and copy to img_pad
cv::Mat img_pad;
cv::copyMakeBorder(img, img_pad, 1, 1, 1, 1, IPL_BORDER_CONSTANT, 1);
// initialize binary output to 2, unknown if min
out_img = cv::Mat::ones(img.rows, img.cols, CV_8U)+2;
// initialize pointers to matrices
float* in = (float *)(img_pad.data);
uchar* out = (uchar *)(out_img.data);
// size of matrix
int in_size = img_pad.cols*img_pad.rows;
int out_size = img.cols*img.rows;
int x, y;
for (int i = 0; i < out_size; i++) {
// find x, y indexes
y = i % img.cols;
x = i / img.cols;
neighborCheck(in, out, i, x, y, img_pad.cols); // all regions are either min or max
}
cv::Mat label;
cv::connectedComponents(out_img, label);
int* lab = (int *)(label.data);
in = (float *)(img.data);
in_size = img.cols*img.rows;
std::vector<int> bad_labels;
for (int i = 0; i < out_size; i++) {
// find x, y indexes
y = i % img.cols;
x = i / img.cols;
if (lab[i] != 0) {
if (neighborCleanup(in, out, i, x, y, img.rows, img.cols) == 1) {
bad_labels.push_back(lab[i]);
}
}
}
std::sort(bad_labels.begin(), bad_labels.end());
bad_labels.erase(std::unique(bad_labels.begin(), bad_labels.end()), bad_labels.end());
for (int i = 0; i < out_size; ++i) {
if (lab[i] != 0) {
if (std::find(bad_labels.begin(), bad_labels.end(), lab[i]) != bad_labels.end()) {
out[i] = 0;
}
}
}
}
int inline neighborCleanup(float* in, uchar* out, int i, int x, int y, int x_lim, int y_lim)
{
int index;
for (int xx = x - 1; xx < x + 2; ++xx) {
for (int yy = y - 1; yy < y + 2; ++yy) {
if (((xx == x) && (yy==y)) || xx < 0 || yy < 0 || xx >= x_lim || yy >= y_lim)
continue;
index = xx*y_lim + yy;
if ((in[i] == in[index]) && (out[index] == 0))
return 1;
}
}
return 0;
}
void inline neighborCheck(float* in, uchar* out, int i, int x, int y, int x_lim)
{
int indexes[8], cur_index;
indexes[0] = x*x_lim + y;
indexes[1] = x*x_lim + y+1;
indexes[2] = x*x_lim + y+2;
indexes[3] = (x+1)*x_lim + y+2;
indexes[4] = (x + 2)*x_lim + y+2;
indexes[5] = (x + 2)*x_lim + y + 1;
indexes[6] = (x + 2)*x_lim + y;
indexes[7] = (x + 1)*x_lim + y;
cur_index = (x + 1)*x_lim + y+1;
for (int t = 0; t < 8; t++) {
if (in[indexes[t]] < in[cur_index]) {
out[i] = 0;
break;
}
}
if (out[i] == 3)
out[i] = 1;
}
The following listing is a function similar to Matlab's "imregionalmax". It looks for at most nLocMax local maxima above threshold, where the found local maxima are at least minDistBtwLocMax pixels apart. It returns the actual number of local maxima found. Notice that it uses OpenCV's minMaxLoc to find global maxima. It is "opencv-self-contained" except for the (easy to implement) function vdist, which computes the (euclidian) distance between points (r,c) and (row,col).
input is one-channel CV_32F matrix, and locations is nLocMax (rows) by 2 (columns) CV_32S matrix.
int imregionalmax(Mat input, int nLocMax, float threshold, float minDistBtwLocMax, Mat locations)
{
Mat scratch = input.clone();
int nFoundLocMax = 0;
for (int i = 0; i < nLocMax; i++) {
Point location;
double maxVal;
minMaxLoc(scratch, NULL, &maxVal, NULL, &location);
if (maxVal > threshold) {
nFoundLocMax += 1;
int row = location.y;
int col = location.x;
locations.at<int>(i,0) = row;
locations.at<int>(i,1) = col;
int r0 = (row-minDistBtwLocMax > -1 ? row-minDistBtwLocMax : 0);
int r1 = (row+minDistBtwLocMax < scratch.rows ? row+minDistBtwLocMax : scratch.rows-1);
int c0 = (col-minDistBtwLocMax > -1 ? col-minDistBtwLocMax : 0);
int c1 = (col+minDistBtwLocMax < scratch.cols ? col+minDistBtwLocMax : scratch.cols-1);
for (int r = r0; r <= r1; r++) {
for (int c = c0; c <= c1; c++) {
if (vdist(Point2DMake(r, c),Point2DMake(row, col)) <= minDistBtwLocMax) {
scratch.at<float>(r,c) = 0.0;
}
}
}
} else {
break;
}
}
return nFoundLocMax;
}
I do not know if it is what you want, but in my answer to this post, I gave some code to find local maxima (peaks) in a grayscale image (resulting from distance transform).
The approach relies on subtracting the original image from the dilated image and finding the zero pixels).
I hope it helps,
Good luck
I had the same problem some time ago, and the solution was to reimplement the imregionalmax algorithm in OpenCV/Cpp. It is not that complicated, because you can find the C++ source code of the function in the Matlab distribution. (somewhere in toolbox). All you have to do is to read carefully and understand the algorithm described there. Then rewrite it or remove the matlab-specific checks and you'll have it.