Best Way to Add 3 Numbers (or 4, or N) in Java - Kahan Sums? - double

I found a completely different answer to this question, so the original question no longer makes much sense. However, the answer may be useful, so I have modified the question a bit...
I want to sum up three double numbers, say a, b, and c, in the most numerically stable way possible.
I think using a Kahan Sum would be the way to go.
However, a strange thought occurred to me: Would it make sense to:
First sum up a, b, and c and remember the (absolute value of the) compensation.
Then sum up a, c, b.
If the (absolute value of the) compensation of the second sum is smaller, use this sum instead.
Proceed similarly with b, a, c and the other permutations of the numbers.
Return the sum with the smallest associated absolute compensation.
Would I get a more "stable" addition of three numbers this way? Or does the order of the numbers in the sum have no usable impact on the compensation left at the end of the summation? By "usable" I mean to ask whether the compensation value itself is stable enough to contain information that I can use.
(I am using the Java programming language, although I think this does not matter here.)
Many thanks,
Thomas.

I think I found a much more reliable way to solve the "Add 3" (or "Add 4" or "Add N") numbers problem.
First of all, I implemented my idea from the original post. It resulted in quite a lot of code which seemed, initially, to work. However, it failed in the following case: add Double.MAX_VALUE, 1, and -Double.MAX_VALUE. The result was 0 instead of 1.
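The same failure can be reproduced with a single plain Kahan pass. The following is only a minimal sketch of classic Kahan summation, not the permutation code described above; the class and method names (KahanSketch, kahan) are made up for illustration. It returns both the sum and the final compensation so that the two can be inspected.

```java
/** Sketch of classic Kahan summation; all names are illustrative only. */
final class KahanSketch {

    /** Returns {sum, compensation} after adding the values left to right. */
    static double[] kahan(double... values) {
        double sum = 0.0;
        double c = 0.0; // running compensation for lost low-order bits
        for (double v : values) {
            double y = v - c;   // apply the correction from the previous step
            double t = sum + y; // low-order bits of y may be lost here
            c = (t - sum) - y;  // recover (the negative of) what was lost
            sum = t;
        }
        return new double[] { sum, c };
    }

    public static void main(String[] args) {
        // A naive left-to-right sum drops both 1.0 terms; Kahan keeps them.
        System.out.println(1e16 + 1.0 + 1.0);
        System.out.println(kahan(1e16, 1.0, 1.0)[0]);
        // The problem case: the 1.0 is absorbed and later cancelled.
        double[] r = kahan(Double.MAX_VALUE, 1.0, -Double.MAX_VALUE);
        System.out.println(r[0] + " (compensation " + r[1] + ")");
    }
}
```

In the problem case both the sum and the final compensation come out as 0, so at least for this input the compensation gives no hint that the result is off by 1.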
@njuffa's comments inspired me to dig somewhat deeper, and at http://code.activestate.com/recipes/393090-binary-floating-point-summation-accurate-to-full-p/ I found that this problem has been solved quite nicely in Python. To see the full code, I downloaded the Python source (Python 3.5.1rc1 - 2015-11-23) from https://www.python.org/getit/source/, where we can find the following method (under PYTHON SOFTWARE FOUNDATION LICENSE VERSION 2):
static PyObject*
math_fsum(PyObject *self, PyObject *seq)
{
PyObject *item, *iter, *sum = NULL;
Py_ssize_t i, j, n = 0, m = NUM_PARTIALS;
double x, y, t, ps[NUM_PARTIALS], *p = ps;
double xsave, special_sum = 0.0, inf_sum = 0.0;
volatile double hi, yr, lo;
iter = PyObject_GetIter(seq);
if (iter == NULL)
return NULL;
PyFPE_START_PROTECT("fsum", Py_DECREF(iter); return NULL)
for(;;) { /* for x in iterable */
assert(0 <= n && n <= m);
assert((m == NUM_PARTIALS && p == ps) ||
(m > NUM_PARTIALS && p != NULL));
item = PyIter_Next(iter);
if (item == NULL) {
if (PyErr_Occurred())
goto _fsum_error;
break;
}
x = PyFloat_AsDouble(item);
Py_DECREF(item);
if (PyErr_Occurred())
goto _fsum_error;
xsave = x;
for (i = j = 0; j < n; j++) { /* for y in partials */
y = p[j];
if (fabs(x) < fabs(y)) {
t = x; x = y; y = t;
}
hi = x + y;
yr = hi - x;
lo = y - yr;
if (lo != 0.0)
p[i++] = lo;
x = hi;
}
n = i; /* ps[i:] = [x] */
if (x != 0.0) {
if (! Py_IS_FINITE(x)) {
/* a nonfinite x could arise either as
a result of intermediate overflow, or
as a result of a nan or inf in the
summands */
if (Py_IS_FINITE(xsave)) {
PyErr_SetString(PyExc_OverflowError,
"intermediate overflow in fsum");
goto _fsum_error;
}
if (Py_IS_INFINITY(xsave))
inf_sum += xsave;
special_sum += xsave;
/* reset partials */
n = 0;
}
else if (n >= m && _fsum_realloc(&p, n, ps, &m))
goto _fsum_error;
else
p[n++] = x;
}
}
if (special_sum != 0.0) {
if (Py_IS_NAN(inf_sum))
PyErr_SetString(PyExc_ValueError,
"-inf + inf in fsum");
else
sum = PyFloat_FromDouble(special_sum);
goto _fsum_error;
}
hi = 0.0;
if (n > 0) {
hi = p[--n];
/* sum_exact(ps, hi) from the top, stop when the sum becomes
inexact. */
while (n > 0) {
x = hi;
y = p[--n];
assert(fabs(y) < fabs(x));
hi = x + y;
yr = hi - x;
lo = y - yr;
if (lo != 0.0)
break;
}
/* Make half-even rounding work across multiple partials.
Needed so that sum([1e-16, 1, 1e16]) will round-up the last
digit to two instead of down to zero (the 1e-16 makes the 1
slightly closer to two). With a potential 1 ULP rounding
error fixed-up, math.fsum() can guarantee commutativity. */
if (n > 0 && ((lo < 0.0 && p[n-1] < 0.0) ||
(lo > 0.0 && p[n-1] > 0.0))) {
y = lo * 2.0;
x = hi + y;
yr = x - hi;
if (y == yr)
hi = x;
}
}
sum = PyFloat_FromDouble(hi);
_fsum_error:
PyFPE_END_PROTECT(hi)
Py_DECREF(iter);
if (p != ps)
PyMem_Free(p);
return sum;
}
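The heart of the inner loop in math_fsum is the three-line hi/yr/lo step, which splits one floating-point addition into the rounded sum and its exact rounding error (often called Fast2Sum). Below is a small Java sketch of just that step. The class and method names are illustrative, and the sketch assumes |x| >= |y| on entry and no overflow, which is what the swap in the C code arranges.

```java
/** Sketch of the error-free addition step used inside math_fsum; names are illustrative. */
final class TwoSumSketch {

    /**
     * Returns {hi, lo} where hi is the rounded value of x + y and lo is the
     * rounding error, so that hi + lo equals x + y exactly.
     * Assumes |x| >= |y| and that x + y does not overflow.
     */
    static double[] fastTwoSum(double x, double y) {
        double hi = x + y;  // rounded sum
        double yr = hi - x; // the part of y that actually made it into hi
        double lo = y - yr; // the part of y that was rounded away
        return new double[] { hi, lo };
    }

    public static void main(String[] args) {
        double[] r = fastTwoSum(1e16, 1.0);
        // The 1.0 does not fit into the sum, but it is preserved in lo.
        System.out.println("hi = " + r[0] + ", lo = " + r[1]);
    }
}
```

Each nonzero lo is what the C code stores in the array of partials, which is why no information is lost as long as no intermediate sum overflows.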
This summation method (math_fsum) differs from Kahan's method in that it uses a variable number of compensation variables. When adding the i-th number, at most i additional compensation variables (stored in the array p) are in use. This means that adding 3 numbers may need 3 additional variables, and adding 4 numbers may need 4. Since the number of variables in use can grow from n to n+1 only after the n-th summand has been loaded, I can translate the C code above to Java as follows:
/**
* Compute the exact sum of the values in the given array
* {@code summands} while destroying the contents of said array.
*
* @param summands
* the summand array – will be summed up and destroyed
* @return the accurate sum of the elements of {@code summands}
*/
private static final double __destructiveSum(final double[] summands) {
int i, j, n;
double x, y, t, xsave, hi, yr, lo;
boolean ninf, pinf;
n = 0;
lo = 0d;
ninf = pinf = false;
for (double summand : summands) {
xsave = summand;
for (i = j = 0; j < n; j++) {
y = summands[j];
if (Math.abs(summand) < Math.abs(y)) {
t = summand;
summand = y;
y = t;
}
hi = summand + y;
yr = hi - summand;
lo = y - yr;
if (lo != 0.0) {
summands[i++] = lo;
}
summand = hi;
}
n = i; /* ps[i:] = [summand] */
if (summand != 0d) {
if ((summand > Double.NEGATIVE_INFINITY)
&& (summand < Double.POSITIVE_INFINITY)) {
summands[n++] = summand;// all finite, good, continue
} else {
if (xsave <= Double.NEGATIVE_INFINITY) {
if (pinf) {
return Double.NaN;
}
ninf = true;
} else {
if (xsave >= Double.POSITIVE_INFINITY) {
if (ninf) {
return Double.NaN;
}
pinf = true;
} else {
return Double.NaN;
}
}
n = 0;
}
}
}
if (pinf) {
return Double.POSITIVE_INFINITY;
}
if (ninf) {
return Double.NEGATIVE_INFINITY;
}
hi = 0d;
if (n > 0) {
hi = summands[--n];
/*
* sum_exact(ps, hi) from the top, stop when the sum becomes inexact.
*/
while (n > 0) {
x = hi;
y = summands[--n];
hi = x + y;
yr = hi - x;
lo = y - yr;
if (lo != 0d) {
break;
}
}
/*
* Make half-even rounding work across multiple partials. Needed so
* that sum([1e-16, 1, 1e16]) will round-up the last digit to two
* instead of down to zero (the 1e-16 makes the 1 slightly closer to
* two). With a potential 1 ULP rounding error fixed-up, math.fsum()
* can guarantee commutativity.
*/
if ((n > 0) && (((lo < 0d) && (summands[n - 1] < 0d)) || //
((lo > 0d) && (summands[n - 1] > 0d)))) {
y = lo * 2d;
x = hi + y;
yr = x - hi;
if (y == yr) {
hi = x;
}
}
}
return hi;
}
This function will take the array summands and add up the elements while simultaneously using it to store the compensation variables. Since we load the summand at index i before the array element at said index may become used for compensation, this will work.
Since the array will be small if the number of variables to add is small and won't escape the scope of our method, I think there is a decent chance that it will be allocated directly on the stack by the JIT, which may make the code quite fast.
I admit that I did not fully understand why the authors of the original code handled infinities, overflows, and NaNs the way they did. Here my code deviates from the original. (I hope I did not mess it up.)
Either way, I can now sum up 3, 4, or n double numbers by doing:
public static final double add3(final double x0, final double x1,
final double x2) {
return __destructiveSum(new double[] { x0, x1, x2 });
}
public static final double add4(final double x0, final double x1,
final double x2, final double x3) {
return __destructiveSum(new double[] { x0, x1, x2, x3 });
}
If I want to sum up 3 or 4 long numbers and obtain the precise result as a double, I have to deal with the fact that a double can represent every long exactly only in the range -9007199254740992L..9007199254740992L. But this can easily be done by splitting each long into two parts:
public static final double add3(final long x0, final long x1,
final long x2) {
double lx;
// each long is split into its double approximation and the remainder
return __destructiveSum(new double[] { //
lx = x0, //
(x0 - ((long) lx)), //
lx = x1, //
(x1 - ((long) lx)), //
lx = x2, //
(x2 - ((long) lx)), //
});
}
public static final double add4(final long x0, final long x1,
final long x2, final long x3) {
double lx;
// each long is split into its double approximation and the remainder
return __destructiveSum(new double[] { //
lx = x0, //
(x0 - ((long) lx)), //
lx = x1, //
(x1 - ((long) lx)), //
lx = x2, //
(x2 - ((long) lx)), //
lx = x3, //
(x3 - ((long) lx)), //
});
}
I think this should be about right. At least I can now add Double.MAX_VALUE, 1, and -Double.MAX_VALUE and get 1 as result.
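One way to sanity-check results like this is to compare against an exact reference. The sketch below (the class name ExactSumCheck is hypothetical) adds the summands as BigDecimal values, which can represent every finite double exactly, and rounds only once at the end. It is slow and does not accept infinities or NaN, so it is meant for tests only, not as a replacement for the method above.

```java
import java.math.BigDecimal;

/** Test-only reference: exact summation via BigDecimal; names are illustrative. */
final class ExactSumCheck {

    /** Sums finite doubles exactly and rounds the result to a double once. */
    static double exactSum(double... values) {
        BigDecimal acc = BigDecimal.ZERO;
        for (double v : values) {
            // new BigDecimal(double) is an exact conversion; it throws
            // NumberFormatException for infinities and NaN.
            acc = acc.add(new BigDecimal(v));
        }
        return acc.doubleValue();
    }

    public static void main(String[] args) {
        System.out.println(exactSum(Double.MAX_VALUE, 1.0, -Double.MAX_VALUE));
        System.out.println(exactSum(1e-16, 1.0, 1e16));
    }
}
```

The second call is the example from the comment in the C code, where the tiny 1e-16 term decides the rounding direction.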

Related

Integer division in Scala [duplicate]

(note: not the same as this other question since the OP never explicitly specified rounding towards 0 or -Infinity)
JLS 15.17.2 says that integer division rounds towards zero. If I want floor()-like behavior for positive divisors (I don't care about the behavior for negative divisors), what's the simplest way to achieve this that is numerically correct for all inputs?
int ifloor(int n, int d)
{
/* returns q such that n = d*q + r where 0 <= r < d
* for all integer n, d where d > 0
*
* d = 0 should have the same behavior as `n/d`
*
* nice-to-have behaviors for d < 0:
* option (a). same as above:
* returns q such that n = d*q + r where 0 <= r < -d
* option (b). rounds towards +infinity:
* returns q such that n = d*q + r where d < r <= 0
*/
}
long lfloor(long n, long d)
{
/* same behavior as ifloor, except for long integers */
}
(update: I want to have a solution both for int and long arithmetic.)
If you can use third-party libraries, Guava has this: IntMath.divide(int, int, RoundingMode.FLOOR) and LongMath.divide(long, long, RoundingMode.FLOOR). (Disclosure: I contribute to Guava.)
If you don't want to use a third-party library for this, you can still look at the implementation.
(I'm doing everything for longs since the answer for ints is the same, just substitute int for every long and Integer for every Long.)
You could just apply Math.floor to a double division result (for longs this can lose precision once the values exceed 2^53); otherwise...
Original answer:
return n/d - ( ( n % d != 0 ) && ( (n<0) ^ (d<0) ) ? 1 : 0 );
Optimized answer:
public static long lfloordiv( long n, long d ) {
long q = n/d;
if( q*d == n ) return q;
return q - ((n^d) >>> (Long.SIZE-1));
}
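Since Java 8 the standard library also provides Math.floorDiv and Math.floorMod, which makes it easy to cross-check a hand-written version. The sketch below copies the optimized answer into a hypothetical class FloorDivCheck and compares it against Math.floorDiv over a small range of inputs.

```java
/** Cross-check of the hand-written floor division against Math.floorDiv (Java 8+). */
final class FloorDivCheck {

    static long lfloordiv(long n, long d) {
        long q = n / d;            // truncated quotient
        if (q * d == n) return q;  // exact division: nothing to correct
        // signs differ exactly when the top bit of n ^ d is set
        return q - ((n ^ d) >>> (Long.SIZE - 1));
    }

    public static void main(String[] args) {
        for (long n = -20; n <= 20; n++) {
            for (long d = -7; d <= 7; d++) {
                if (d == 0) continue; // division by zero throws in both versions
                if (lfloordiv(n, d) != Math.floorDiv(n, d)) {
                    throw new AssertionError(n + " / " + d);
                }
            }
        }
        System.out.println("lfloordiv agrees with Math.floorDiv on the tested range");
    }
}
```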
(For completeness, using a BigDecimal with a ROUND_FLOOR rounding mode is also an option.)
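New edit: Now I'm just trying to see how far it can be optimized for fun. Using Mark's answer, the best I have so far is:
public static long lfloordiv2( long n, long d ){
if( d >= 0 ){
n = -n;
d = -d;
}
// d is now negative: truncation already equals the floor for n <= 0,
// while for n > 0 the quotient has to be shifted down by one
long tweak = (n > 0) ? -1 : 0;
return (n + tweak) / d + tweak;
}
(Uses cheaper operations than the above, but slightly longer bytecode. Note that negating n overflows for Long.MIN_VALUE, so that input may not be handled correctly when d >= 0.)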
There's a rather neat formula for this that works when n < 0 and d > 0: take the bitwise complement of n, do the division, and then take the bitwise complement of the result.
int ifloordiv(int n, int d)
{
if (n >= 0)
return n / d;
else
return ~(~n / d);
}
For the remainder, a similar construction works (compatible with ifloordiv in the sense that the usual invariant ifloordiv(n, d) * d + ifloormod(n, d) == n is satisfied) giving a result that's always in the range [0, d).
int ifloormod(int n, int d)
{
if (n >= 0)
return n % d;
else
return d + ~(~n % d);
}
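The trick works because ~n equals -n - 1, so a negative n is mapped to a non-negative value before the truncating division. The following sketch (class name ComplementFloor is made up) restates the two functions for d > 0 and checks the invariant and the range of the remainder, using Math.floorMod from Java 8 as an independent reference.

```java
/** Checks the bitwise-complement floor division for positive divisors. */
final class ComplementFloor {

    static int ifloordiv(int n, int d) {
        return n >= 0 ? n / d : ~(~n / d);
    }

    static int ifloormod(int n, int d) {
        return n >= 0 ? n % d : d + ~(~n % d);
    }

    public static void main(String[] args) {
        for (int n = -50; n <= 50; n++) {
            for (int d = 1; d <= 9; d++) {
                int q = ifloordiv(n, d);
                int r = ifloormod(n, d);
                if (q * d + r != n) throw new AssertionError("invariant " + n + ", " + d);
                if (r < 0 || r >= d) throw new AssertionError("range " + n + ", " + d);
                if (r != Math.floorMod(n, d)) throw new AssertionError("floorMod " + n + ", " + d);
            }
        }
        System.out.println("invariant and range hold on the tested inputs");
    }
}
```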
For negative divisors, the formulas aren't quite so neat. Here are expanded versions of ifloordiv and ifloormod that follow your 'nice-to-have' behavior option (b) for negative divisors.
int ifloordiv(int n, int d)
{
if (d >= 0)
return n >= 0 ? n / d : ~(~n / d);
else
return n <= 0 ? n / d : (n - 1) / d - 1;
}
int ifloormod(int n, int d)
{
if (d >= 0)
return n >= 0 ? n % d : d + ~(~n % d);
else
return n <= 0 ? n % d : d + 1 + (n - 1) % d;
}
For d < 0, there's an unavoidable problem case when d == -1 and n is Integer.MIN_VALUE, since then the mathematical result overflows the type. In that case, the formula above returns the wrapped result, just as the usual Java division does. As far as I'm aware, this is the only corner case where we silently get 'wrong' results.
For reference, the BigDecimal variant with floor rounding can be written as:
return BigDecimal.valueOf(n).divide(BigDecimal.valueOf(d), RoundingMode.FLOOR).longValue();

Octave - how to operate with big numbers

I am working on the RSA algorithm in Octave, but it isn't working properly. The problem appears when I try to use the "^" operator. Check my example below:
>> mod((80^65), 133)
terminal gives me:
ans = 0
I cannot fix this. It's funny because even my system calculator returns the correct number (54).
To calculate this correctly, you can use the fast power-modulo algorithm.
In C++, the function below computes
a^b mod m:
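// Computes a^b mod m by repeated squaring; assumes a >= 0, b >= 0 and m > 0.
int power_modulo_fast(int a, int b, int m)
{
// 64-bit intermediates so that the products cannot overflow
long long result = 1 % m;
long long x = a % m;
for (; b > 0; b >>= 1)
{
// multiply in the current power of a when the lowest bit of b is set
if ((b & 1) != 0)
result = (result * x) % m;
x = (x * x) % m;
}
return (int) result;
}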
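The underlying issue is that 80^65 is roughly 5e123, far beyond the 2^53 limit up to which a double holds integers exactly, so the low digits that decide the remainder are lost before mod is applied. Reducing after every multiplication keeps all intermediates small. Here is the same idea as a Java sketch, with BigInteger.modPow as an independent check; the class name ModPowSketch is made up, and the hand-written version assumes a modulus below 2^31 so that the products fit in a long.

```java
import java.math.BigInteger;

/** Square-and-multiply modular exponentiation; names are illustrative. */
final class ModPowSketch {

    /** Computes base^exp mod mod; assumes base >= 0, exp >= 0 and 0 < mod < 2^31. */
    static long modPow(long base, long exp, long mod) {
        long result = 1 % mod;
        long x = base % mod;
        while (exp > 0) {
            if ((exp & 1) == 1) {
                result = (result * x) % mod; // multiply in the current power
            }
            x = (x * x) % mod;               // square for the next bit
            exp >>= 1;
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(modPow(80, 65, 133)); // prints 54
        System.out.println(BigInteger.valueOf(80)
                .modPow(BigInteger.valueOf(65), BigInteger.valueOf(133))); // prints 54
    }
}
```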

SPOJ - #26 BSHEEP - Build the Fence

#include <bits/stdc++.h>
using namespace std;
#define EPS 1e-9
typedef double coord_t;
typedef double coord2_t;
struct point {
double x, y;
point(double _x, double _y)
{
x = _x, y = _y;
}
bool operator < (point p) const{
if(fabs(x - p.x) > EPS)
return x < p.x;
return y < p.y;
}
bool operator == (point p) const{
return fabs(x - p.x) < EPS && fabs(y - p.y) < EPS;
}
};
coord2_t cross(const point &O, const point &A, const point &B)
{
return (long)(A.x - O.x) * (B.y - O.y) - (long)(A.y - O.y) * (B.x - O.x);
}
bool cmp(point a, point b)
{
if(fabs(a.y - b.y) > EPS)
return a.y < b.y;
return a.x < b.x;
}
vector<point> convex_hull(vector<point> P)
{
int n = P.size();
vector<point> H;
sort(P.begin(), P.end(), cmp);
for (int i = 0; i < n; ++i)
{
while(H.size() >= 2 && cross(H[H.size() - 2], H[H.size() - 1], P[i]) <= 0)
H.pop_back();
H.push_back(P[i]);
}
int l = H.size() + 1;
for (int i = n - 1; i >= 0; i--)
{
while(H.size() >= l && cross(H[H.size() - 2], H[H.size() - 1], P[i]) <= 0)
H.pop_back();
H.push_back(P[i]);
}
return H;
}
int main()
{
int tc, n, x, y;
double length;
vector<point> P;
scanf("%d", &tc);
while(tc--)
{
length = 0;
P.clear();
scanf("%d", &n);
for(int i = 0; i < n; i++)
{
scanf("%d %d", &x, &y);
P.push_back(point(x, y));
}
P = convex_hull(P);
for (int i = 0; i < (int) P.size() - 1; i++) {
length += sqrt(pow((P[i].x - P[i+1].x),2) + pow((P[i].y - P[i+1].y),2));
}
printf("%.2lf\n", length);
for (int i = 1; i < (int) P.size(); i++) {
printf("%lf ", P[i]); // Problem in this line , can't print the required output
}
printf("\n");
}
return 0;
}
It's a convex hull problem and I think I have done everything right, but I can't output p1 p2 ... pk as the problem requires. The problem statement is here:
At the beginning of spring all the sheep move to the higher pastures in the mountains. If there are thousands of them, it is well worthwhile gathering them together in one place. But sheep don't like to leave their grass-lands. Help the shepherd and build him a fence which would surround all the sheep. The fence should have the smallest possible length! Assume that sheep are negligibly small and that they are not moving. Sometimes a few sheep are standing in the same place. If there is only one sheep, it is probably dying, so no fence is needed at all...
Input
t [the number of tests <= 100]
[empty line]
n [the number of sheep <= 100000]
x1 y1 [coordinates of the first sheep]
...
xn yn
[integer coordinates from -10000 to 10000]
[empty line]
[other lists of sheep]
Text grouped in [ ] does not appear in the input file. Assume that sheep are numbered in the input order.
Output
o [length of circumference, 2 digits precision]
p1 p2 ... pk
[the sheep that are standing in the corners of the fence; the first one should be positioned bottommost and as far to the left as possible, the others ought to be written in anticlockwise order; ignore all sheep standing in the same place but the first to appear in the input file; the number of sheep should be the smallest possible]
[empty line]
[next solutions]
In your struct, add one more variable which will hold the position of the point in the input, and print that member instead of the whole point (for example printf("%d ", P[i].c);).
struct point {
double x, y;
int c; // 1-based position of the point in the input
point(double _x, double _y, int _c)
{
x = _x, y = _y, c = _c;
}
bool operator < (point p) const{
if(fabs(x - p.x) > EPS)
return x < p.x;
return y < p.y;
}
bool operator == (point p) const{
return fabs(x - p.x) < EPS && fabs(y - p.y) < EPS;
}
};
When you take input, push back all three of them:
for(int i = 0; i < n; i++)
{
scanf("%d %d", &x, &y);
P.push_back(point(x, y,i+1));
}
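The point of the extra member is that the input position travels with the coordinates through sorting and hull construction. A tiny Java sketch of that idea follows; it only sorts (bottommost first, then leftmost) and does not build the hull, and the names IndexedPointSketch, Pt and indicesBottomLeftFirst are hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Shows how an input index survives sorting when stored with the point. */
final class IndexedPointSketch {

    static final class Pt {
        final int x, y, index; // index is the 1-based position in the input
        Pt(int x, int y, int index) {
            this.x = x;
            this.y = y;
            this.index = index;
        }
    }

    /** Returns the input positions ordered by y, then by x. */
    static List<Integer> indicesBottomLeftFirst(int[][] coords) {
        List<Pt> pts = new ArrayList<>();
        for (int i = 0; i < coords.length; i++) {
            pts.add(new Pt(coords[i][0], coords[i][1], i + 1));
        }
        pts.sort(Comparator.comparingInt((Pt p) -> p.y).thenComparingInt(p -> p.x));
        List<Integer> order = new ArrayList<>();
        for (Pt p : pts) {
            order.add(p.index); // print or collect the stored position, not the point
        }
        return order;
    }

    public static void main(String[] args) {
        System.out.println(indicesBottomLeftFirst(new int[][] { {3, 1}, {0, 0}, {2, 0} }));
    }
}
```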

Porting signal windowing code from Matlab to Java

This is part of the code of a spectral subtraction algorithm; I'm trying to optimize it for Android. Please help me.
This is the MATLAB code:
function Seg=segment(signal,W,SP,Window)
% SEGMENT chops a signal to overlapping windowed segments
% A= SEGMENT(X,W,SP,WIN) returns a matrix which its columns are segmented
% and windowed frames of the input one dimentional signal, X. W is the
% number of samples per window, default value W=256. SP is the shift
% percentage, default value SP=0.4. WIN is the window that is multiplied by
% each segment and its length should be W. the default window is hamming
% window.
% 06-Sep-04
% Esfandiar Zavarehei
if nargin<3
SP=.4;
end
if nargin<2
W=256;
end
if nargin<4
Window=hamming(W);
end
Window=Window(:); %make it a column vector
L=length(signal);
SP=fix(W.*SP);
N=fix((L-W)/SP +1); %number of segments
Index=(repmat(1:W,N,1)+repmat((0:(N-1))'*SP,1,W))';
hw=repmat(Window,1,N);
Seg=signal(Index).*hw;
And this is our Java code for this function:
public class MatrixAndSegments
{
public int numberOfSegments;
public double[][] res;
public MatrixAndSegments(int numberOfSegments,double[][] res)
{
this.numberOfSegments = numberOfSegments;
this.res = res;
}
}
public MatrixAndSegments segment (double[] signal_in,int samplesPerWindow, double shiftPercentage, double[] window)
{
//default shiftPercentage = 0.4
//default samplesPerWindow = 256 //W
//default window = hamming
int L = signal_in.length;
shiftPercentage = fix(samplesPerWindow * shiftPercentage); //SP
int numberOfSegments = fix ( (L - samplesPerWindow)/ shiftPercentage + 1); //N
double[][] reprowMatrix = reprowtrans(samplesPerWindow,numberOfSegments);
double[][] repcolMatrix = repcoltrans(numberOfSegments, shiftPercentage,samplesPerWindow );
//Index=(repmat(1:W,N,1)+repmat((0:(N-1))'*SP,1,W))';
double[][] index = new double[samplesPerWindow+1][numberOfSegments+1];
for (int x = 1; x < samplesPerWindow+1; x++ )
{
for (int y = 1 ; y < numberOfSegments + 1; y++) //numberOfSegments was 3
{
index[x][y] = reprowMatrix[x][y] + repcolMatrix[x][y];
}
}
//hamming window
double[] hammingWindow = this.HammingWindow(samplesPerWindow);
double[][] HW = repvector(hammingWindow, numberOfSegments);
double[][] seg = new double[samplesPerWindow][numberOfSegments];
for (int y = 1 ; y < numberOfSegments + 1; y++)
{
for (int x = 1; x < samplesPerWindow+1; x++)
{
seg[x-1][y-1] = signal_in[ (int)index[x][y]-1 ] * HW[x-1][y-1];
}
}
MatrixAndSegments Matrixseg = new MatrixAndSegments(numberOfSegments,seg);
return Matrixseg;
}
public int fix(double val) {
if (val < 0) {
return (int) Math.ceil(val);
}
return (int) Math.floor(val);
}
public double[][] repvector(double[] vec, int replications)
{
double[][] result = new double[vec.length][replications];
for (int x = 0; x < vec.length; x++) {
for (int y = 0; y < replications; y++) {
result[x][y] = vec[x];
}
}
return result;
}
public double[][] reprowtrans(int end, int replications)
{
double[][] result = new double[end +1][replications+1];
for (int x = 1; x <= end; x++) {
for (int y = 1; y <= replications; y++) {
result[x][y] = x ;
}
}
return result;
}
public double[][] repcoltrans(int end, double multiplier, int replications)
{
double[][] result = new double[replications+1][end+1];
for (int x = 1; x <= replications; x++) {
for (int y = 1; y <= end ; y++) {
result[x][y] = (y-1)*multiplier;
}
}
return result;
}
public double[] HammingWindow(int size)
{
double[] window = new double[size];
for (int i = 0; i < size; i++)
{
window[i] = 0.54-0.46 * (Math.cos(2.0 * Math.PI * i / (size-1)));
}
return window;
}
"Porting" Matlab code statement by statement to Java is a bad approach.
Data is rarely manipulated in Matlab using loops and addressing individual elements (because the Matlab interpreter/VM is rather slow), but rather through calls to block processing functions (which have been carefully written and optimized). This leads to a very idiosyncratic programming style in which repmat, reshape, find, fancy indexing et al. are used to do operations which would be much more naturally expressed through Java loops.
For example, to multiply each column of a matrix A by a vector v, you will write in matlab:
A = diag(v) * A
or
A = repmat(v', 1, size(A, 2)) .* A
This solution:
for i = 1:size(A, 2),
A(:, i) = A(:, i) .* v';
end;
is inefficient.
But it would be terribly foolish to try to do the same thing in Java and invoke a matrix product or to build a matrix with repeated copies of v. Instead, just do:
for (int i = 0; i < rows; i++) {
for (int j = 0; j < columns; j++) {
a[i][j] *= v[i];
}
}
I suggest you try to understand what this MATLAB function is actually doing, instead of focusing on how it is doing it, and reimplement it from scratch in Java, forgetting all of the MATLAB implementation except the specifications given in the comments. Indeed, half of the code you have written is useless. Actually, it seems to me that this function isn't needed at all, and what it does could be efficiently integrated into the caller's code.
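As a rough illustration of such a from-scratch version, the sketch below follows only the specification in the MATLAB comments: frames of w samples, shifted by fix(w * sp) samples, each multiplied by a Hamming window. The class and method names are made up, the result keeps the MATLAB layout (one frame per column), and returning an empty matrix for signals shorter than one window is an assumption of this sketch.

```java
/** Loop-based sketch of the MATLAB segment() specification; names are illustrative. */
final class SegmentSketch {

    /** Symmetric Hamming window, same formula as HammingWindow in the question. */
    static double[] hamming(int size) {
        double[] window = new double[size];
        if (size == 1) {
            window[0] = 1.0; // avoid 0/0 for a single-sample window
            return window;
        }
        for (int i = 0; i < size; i++) {
            window[i] = 0.54 - 0.46 * Math.cos(2.0 * Math.PI * i / (size - 1));
        }
        return window;
    }

    /** Returns a w-by-N matrix whose columns are the windowed frames. */
    static double[][] segment(double[] signal, int w, double sp) {
        int shift = (int) (w * sp); // fix() for positive values
        if (shift < 1 || signal.length < w) {
            return new double[w][0]; // assumption: no frames in these cases
        }
        int n = (signal.length - w) / shift + 1; // number of segments
        double[] window = hamming(w);
        double[][] seg = new double[w][n];
        for (int col = 0; col < n; col++) {
            int start = col * shift;
            for (int row = 0; row < w; row++) {
                seg[row][col] = signal[start + row] * window[row];
            }
        }
        return seg;
    }

    public static void main(String[] args) {
        double[] signal = new double[1000];
        double[][] seg = segment(signal, 256, 0.4);
        System.out.println(seg.length + " x " + seg[0].length);
    }
}
```

No index matrix and no replicated window matrix are built; the start offset of each frame replaces both.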

imregionalmax matlab function's equivalent in opencv

I have an image of connected components (filled circles). If I want to segment them, I can use the watershed algorithm. I prefer writing my own function for watershed instead of using the built-in function in OpenCV. I have successfu How do I find the regional maxima of objects using OpenCV?
I wrote a function myself. My results were quite similar to MATLAB, although not exact. This function is implemented for CV_32F but it can easily be modified for other types.
I mark all the points that are not part of a minimum region by checking all the neighbors. The remaining regions are either minima, maxima or areas of inflection.
I use connected components to label each region.
I check each region for any point belonging to a maxima, if yes then I push that label into a vector.
Finally I sort the bad labels, erase all duplicates and then mark all the points in the output as not minima.
All that remains are the regions of minima.
Here is the code:
// output is a binary image
// 1: part of a min region
// 0: not a min region
// 2: not sure if min or not
// 3: uninitialized
void imregionalmin(cv::Mat& img, cv::Mat& out_img)
{
// pad the border of img with 1 and copy to img_pad
cv::Mat img_pad;
cv::copyMakeBorder(img, img_pad, 1, 1, 1, 1, IPL_BORDER_CONSTANT, 1);
// initialize binary output to 3 (uninitialized), unknown if min
out_img = cv::Mat::ones(img.rows, img.cols, CV_8U)+2;
// initialize pointers to matrices
float* in = (float *)(img_pad.data);
uchar* out = (uchar *)(out_img.data);
// size of matrix
int in_size = img_pad.cols*img_pad.rows;
int out_size = img.cols*img.rows;
int x, y;
for (int i = 0; i < out_size; i++) {
// find x, y indexes
y = i % img.cols;
x = i / img.cols;
neighborCheck(in, out, i, x, y, img_pad.cols); // all regions are either min or max
}
cv::Mat label;
cv::connectedComponents(out_img, label);
int* lab = (int *)(label.data);
in = (float *)(img.data);
in_size = img.cols*img.rows;
std::vector<int> bad_labels;
for (int i = 0; i < out_size; i++) {
// find x, y indexes
y = i % img.cols;
x = i / img.cols;
if (lab[i] != 0) {
if (neighborCleanup(in, out, i, x, y, img.rows, img.cols) == 1) {
bad_labels.push_back(lab[i]);
}
}
}
std::sort(bad_labels.begin(), bad_labels.end());
bad_labels.erase(std::unique(bad_labels.begin(), bad_labels.end()), bad_labels.end());
for (int i = 0; i < out_size; ++i) {
if (lab[i] != 0) {
if (std::find(bad_labels.begin(), bad_labels.end(), lab[i]) != bad_labels.end()) {
out[i] = 0;
}
}
}
}
int inline neighborCleanup(float* in, uchar* out, int i, int x, int y, int x_lim, int y_lim)
{
int index;
for (int xx = x - 1; xx < x + 2; ++xx) {
for (int yy = y - 1; yy < y + 2; ++yy) {
if (((xx == x) && (yy==y)) || xx < 0 || yy < 0 || xx >= x_lim || yy >= y_lim)
continue;
index = xx*y_lim + yy;
if ((in[i] == in[index]) && (out[index] == 0))
return 1;
}
}
return 0;
}
void inline neighborCheck(float* in, uchar* out, int i, int x, int y, int x_lim)
{
int indexes[8], cur_index;
indexes[0] = x*x_lim + y;
indexes[1] = x*x_lim + y+1;
indexes[2] = x*x_lim + y+2;
indexes[3] = (x+1)*x_lim + y+2;
indexes[4] = (x + 2)*x_lim + y+2;
indexes[5] = (x + 2)*x_lim + y + 1;
indexes[6] = (x + 2)*x_lim + y;
indexes[7] = (x + 1)*x_lim + y;
cur_index = (x + 1)*x_lim + y+1;
for (int t = 0; t < 8; t++) {
if (in[indexes[t]] < in[cur_index]) {
out[i] = 0;
break;
}
}
if (out[i] == 3)
out[i] = 1;
}
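For comparison, the plateau idea can also be written without OpenCV as a flood fill. The following plain-Java sketch is a different technique from the code above, not a port of it: it treats a connected region of equal values (8-connectivity) as a regional maximum when none of its neighbours is higher. The class name and the use of a double[][] image are assumptions made for illustration, and the image is assumed to be non-empty and rectangular.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;

/** Flood-fill sketch of regional maxima on a 2-D array; names are illustrative. */
final class RegionalMaxSketch {

    static boolean[][] regionalMax(double[][] img) {
        int rows = img.length, cols = img[0].length;
        boolean[][] seen = new boolean[rows][cols];
        boolean[][] out = new boolean[rows][cols];
        for (int r0 = 0; r0 < rows; r0++) {
            for (int c0 = 0; c0 < cols; c0++) {
                if (seen[r0][c0]) continue;
                double v = img[r0][c0];
                boolean isMax = true; // cleared as soon as a higher neighbour is found
                List<int[]> plateau = new ArrayList<>();
                ArrayDeque<int[]> queue = new ArrayDeque<>();
                seen[r0][c0] = true;
                queue.add(new int[] { r0, c0 });
                while (!queue.isEmpty()) {
                    int[] p = queue.poll();
                    plateau.add(p);
                    for (int dr = -1; dr <= 1; dr++) {
                        for (int dc = -1; dc <= 1; dc++) {
                            int r = p[0] + dr, c = p[1] + dc;
                            if ((dr == 0 && dc == 0) || r < 0 || c < 0 || r >= rows || c >= cols) continue;
                            if (img[r][c] > v) {
                                isMax = false;
                            } else if (img[r][c] == v && !seen[r][c]) {
                                seen[r][c] = true; // same plateau, visit later
                                queue.add(new int[] { r, c });
                            }
                        }
                    }
                }
                if (isMax) {
                    for (int[] p : plateau) out[p[0]][p[1]] = true;
                }
            }
        }
        return out;
    }
}
```

Swapping the comparison to `<` gives the corresponding regional-minimum test.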
The following listing is a function similar to Matlab's "imregionalmax". It looks for at most nLocMax local maxima above threshold, where the found local maxima are at least minDistBtwLocMax pixels apart. It returns the actual number of local maxima found. Notice that it uses OpenCV's minMaxLoc to find global maxima. It is "opencv-self-contained" except for the (easy to implement) function vdist, which computes the (Euclidean) distance between points (r,c) and (row,col).
The input is a one-channel CV_32F matrix, and locations is an nLocMax (rows) by 2 (columns) CV_32S matrix.
int imregionalmax(Mat input, int nLocMax, float threshold, float minDistBtwLocMax, Mat locations)
{
Mat scratch = input.clone();
int nFoundLocMax = 0;
for (int i = 0; i < nLocMax; i++) {
Point location;
double maxVal;
minMaxLoc(scratch, NULL, &maxVal, NULL, &location);
if (maxVal > threshold) {
nFoundLocMax += 1;
int row = location.y;
int col = location.x;
locations.at<int>(i,0) = row;
locations.at<int>(i,1) = col;
int r0 = (row-minDistBtwLocMax > -1 ? row-minDistBtwLocMax : 0);
int r1 = (row+minDistBtwLocMax < scratch.rows ? row+minDistBtwLocMax : scratch.rows-1);
int c0 = (col-minDistBtwLocMax > -1 ? col-minDistBtwLocMax : 0);
int c1 = (col+minDistBtwLocMax < scratch.cols ? col+minDistBtwLocMax : scratch.cols-1);
for (int r = r0; r <= r1; r++) {
for (int c = c0; c <= c1; c++) {
if (vdist(Point2DMake(r, c),Point2DMake(row, col)) <= minDistBtwLocMax) {
scratch.at<float>(r,c) = 0.0;
}
}
}
} else {
break;
}
}
return nFoundLocMax;
}
I do not know if it is what you want, but in my answer to this post, I gave some code to find local maxima (peaks) in a grayscale image (resulting from a distance transform).
The approach relies on subtracting the original image from the dilated image and finding the zero pixels.
I hope it helps,
Good luck
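A minimal plain-Java sketch of that dilation idea, for illustration only: a pixel is reported when the dilated image (here the maximum over a 3x3 neighbourhood, which is an assumption since the kernel size is not stated above) equals the original value, i.e. when their difference is zero. The class name is hypothetical. Unlike a true regional maximum, this also flags every pixel inside a flat area.

```java
/** Dilation-based peak detection on a 2-D array; names and kernel size are assumptions. */
final class DilatePeaksSketch {

    static boolean[][] peaks(double[][] img) {
        int rows = img.length, cols = img[0].length;
        boolean[][] out = new boolean[rows][cols];
        for (int r = 0; r < rows; r++) {
            for (int c = 0; c < cols; c++) {
                // grey-scale dilation: maximum over the 3x3 window, clipped at the border
                double dilated = img[r][c];
                for (int rr = Math.max(0, r - 1); rr <= Math.min(rows - 1, r + 1); rr++) {
                    for (int cc = Math.max(0, c - 1); cc <= Math.min(cols - 1, c + 1); cc++) {
                        dilated = Math.max(dilated, img[rr][cc]);
                    }
                }
                out[r][c] = (dilated - img[r][c] == 0.0); // zero difference marks a peak
            }
        }
        return out;
    }
}
```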
I had the same problem some time ago, and the solution was to reimplement the imregionalmax algorithm in OpenCV/Cpp. It is not that complicated, because you can find the C++ source code of the function in the Matlab distribution. (somewhere in toolbox). All you have to do is to read carefully and understand the algorithm described there. Then rewrite it or remove the matlab-specific checks and you'll have it.