What is wrong with my 3-way partitioning quicksort algorithm? - quicksort

I don't understand where it is overflowing. For an input of 5 elements, [2 3 9 2 2], the output is [7208668 3 2 2 9] instead of [2 2 2 3 9].
This partition function is called from another function, which chooses a random pivot and then calls partition.
That caller uses the value returned by partition to call itself recursively and perform the quicksort.
EDIT: swap() here is the standard library function (std::swap) from C++.
int partition(int arr[], int l, int h){
int i = l;
int j = h;
int p= l;
int q = h;
int pivot = arr[0];
while(i<j){
while(arr[i]<pivot){
i++;
}
while(arr[j]>pivot){
j--;
}
swap(arr[i],arr[j]);
if(arr[i]==pivot){
p++;
swap(arr[i], arr[p]);
}
if (arr[j]==pivot){
q--;
swap(arr[j],arr[q]);
}
}
swap(arr[j], arr[l]);
int t = i-1;
for(int s = l; s<p; s++,t--){
swap(arr[s],arr[t]);
}
i++;
for(int s = h-1; s>q; s--, i++){
swap(arr[s],arr[q]);
}
return j;
}
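Tracing the code above by hand on [2 3 9 2 2], assuming it is called with l = 0 and h = 4, points to one likely cause of the garbage value: after swap(arr[i], arr[j]) neither i nor j advances, so p grows (and q shrinks) on every pass until p reaches 5 and arr[5], which is outside the array, is swapped in. The pivot is also read from arr[0] rather than from inside [l, h]. For comparison, here is a minimal sketch in Python of the simpler Dijkstra-style ("Dutch national flag") 3-way partition. It is a different scheme from the one attempted above, not a repaired version of it, and the names partition3 and quicksort3 are only illustrative.

```python
import random

def partition3(arr, lo, hi, pivot):
    """Rearrange arr[lo..hi] into  < pivot | == pivot | > pivot.

    Returns (lt, gt) such that arr[lt..gt] holds every element equal to pivot.
    """
    lt, i, gt = lo, lo, hi
    while i <= gt:
        if arr[i] < pivot:
            arr[lt], arr[i] = arr[i], arr[lt]
            lt += 1
            i += 1
        elif arr[i] > pivot:
            arr[i], arr[gt] = arr[gt], arr[i]
            gt -= 1  # do not advance i: the swapped-in value is still unexamined
        else:
            i += 1
    return lt, gt

def quicksort3(arr, lo=0, hi=None):
    if hi is None:
        hi = len(arr) - 1
    if lo >= hi:
        return
    pivot = arr[random.randint(lo, hi)]  # pivot value taken from inside [lo, hi]
    lt, gt = partition3(arr, lo, hi, pivot)
    quicksort3(arr, lo, lt - 1)   # elements smaller than the pivot
    quicksort3(arr, gt + 1, hi)   # elements larger than the pivot

data = [2, 3, 9, 2, 2]
quicksort3(data)
print(data)  # [2, 2, 2, 3, 9]
```

Every index here stays inside [lo, hi] because the loop stops as soon as i passes gt, and each branch moves at least one of the two.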

Expected bounds on universal hashing

I am sure this is a simple problem to work out, but I don't see an obvious solution. If I have a hash table with m bins and hash n < m keys into it, what is the probability that no bin receives more than k keys? I'm trying to figure out how many rehash operations I should expect if I fill a table to load n / m and then rehash until I see no more than k collisions in any bin (obviously with k > n / m).
With a uniform distribution, this is the same as throwing balls into bins, which has been studied in "Balls into Bins - A Simple and Tight Analysis" by M. Raab and A. Steger.
This is somewhat related to cuckoo hashing, but here you use just one hash function.
As this is stackoverflow.com, here is a simulation program that can be used to verify your formula. According to it, the result also depends on the absolute number of balls and buckets, not just on the average number of balls per bucket.
public static void main(String... args) throws InterruptedException {
for (int k = 1; k < 4; k++) {
test(10, 30, k);
test(100, 300, k);
}
}
public static void test(int ballCount, int binCount, int k) {
int rehashCount = 0;
Random r = new Random(1);
int testCount = 100000000 / ballCount;
for(int test = 0; test < testCount; test++) {
long[] balls = new long[ballCount];
int[] bins = new int[binCount];
for (int i = 0; i < ballCount; i++) {
balls[i] = r.nextLong();
}
// it's very unlikely to get duplicates, but test
Arrays.sort(balls);
for (int i = 1; i < ballCount; i++) {
if (balls[i - 1] == balls[i]) {
throw new AssertionError();
}
}
int universalHashId = 0;
boolean rehashNeeded = false;
for (int i = 0; i < ballCount; i++) {
long x = balls[i];
// might as well do y = x
long y = supplementalHashWeyl(x, universalHashId);
int binId = reduce((int) y, binCount);
if (++bins[binId] > k) {
rehashNeeded = true;
break;
}
}
if (rehashNeeded) {
rehashCount++;
}
}
System.out.println("balls: " + ballCount + " bins: " + binCount +
" k: " + k + " rehash probability: " + (double) rehashCount / testCount);
}
public static int reduce(int hash, int n) {
// http://lemire.me/blog/2016/06/27/a-fast-alternative-to-the-modulo-reduction/
return (int) (((hash & 0xffffffffL) * n) >>> 32);
}
public static int supplementalHashWeyl(long hash, long index) {
long x = hash + (index * 0xbf58476d1ce4e5b9L);
x = (x ^ (x >>> 32)) * 0xbf58476d1ce4e5b9L;
x = ((x >>> 32) ^ x);
return (int) x;
}
Outputs:
balls: 10 bins: 30 k: 1 rehash probability: 0.8153816
balls: 100 bins: 300 k: 1 rehash probability: 1.0
balls: 10 bins: 30 k: 2 rehash probability: 0.1098305
balls: 100 bins: 300 k: 2 rehash probability: 0.777381
balls: 10 bins: 30 k: 3 rehash probability: 0.0066018
balls: 100 bins: 300 k: 3 rehash probability: 0.107309
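Under the uniform-hashing assumption, the probability asked for in the question can also be computed exactly for small n and m, which gives a check on the simulated numbers above. The sketch below is in Python, and its function names are illustrative rather than taken from the answer. It counts the placements with at most k keys per bin through the coefficient of x^n in (sum over c = 0..k of x^c / c!)^m, and turns the success probability p into an expected number of rehashes, 1/p - 1, assuming every rehash behaves like an independent uniform trial.

```python
from fractions import Fraction
from math import factorial

def prob_max_load_at_most(n, m, k):
    """P(no bin holds more than k keys) for n keys thrown uniformly into m bins."""
    base = [Fraction(1, factorial(c)) for c in range(k + 1)]
    # poly[j] = coefficient of x^j in base(x) ** (bins processed so far)
    poly = [Fraction(1)] + [Fraction(0)] * n
    for _ in range(m):
        nxt = [Fraction(0)] * (n + 1)
        for a, pa in enumerate(poly):
            if pa == 0:
                continue
            for c, bc in enumerate(base):
                if a + c > n:
                    break
                nxt[a + c] += pa * bc
        poly = nxt
    # n! * [x^n] counts valid assignments; m**n is the total number of assignments
    return float(poly[n] * factorial(n) / Fraction(m) ** n)

def expected_rehashes(n, m, k):
    """Expected number of failed attempts before one succeeds (geometric trials)."""
    p = prob_max_load_at_most(n, m, k)
    return 1.0 / p - 1.0

for k in (1, 2, 3):
    print(k, 1.0 - prob_max_load_at_most(10, 30, k), expected_rehashes(10, 30, k))
```

For 10 keys in 30 bins and k = 1 this reduces to the birthday-problem product and gives a rehash probability of about 0.815, in line with the simulation.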

Determine if matrix A is subset of matrix B

For a matrix such as
A = [...
12 34 67;
90 78 15;
10 71 24];
how could we efficiently determine whether it is a subset of a larger matrix?
B = [...
12 34 67; % found
89 67 45;
90 78 15; % found
10 71 24; % found, so A is subset of B.
54 34 11];
Here are the conditions:
all numbers are integers
the matrices are very large, i.e., the number of rows is > 100000, and the number of columns may vary from 1 to 10 (same for A and B).
Edit:
It seems that ismember works just fine for the case in this question, where it is called only a few times. My initial impression came from previous experience where ismember was invoked many times inside a nested loop, resulting in very poor performance.
clear all; clc
n = 200000;
k = 10;
B = randi(n,n,k);
f = randperm(n);
A = B(f(1:1000),:);
tic
assert(sum(ismember(A,B,'rows')) == size(A,1));
toc
tic
assert(all(any(all(bsxfun(@eq,B,permute(A,[3,2,1])),2),1))); %user2999345
toc
which results in:
Elapsed time is 1.088552 seconds.
Elapsed time is 12.154969 seconds.
Here are more benchmarks:
clear all; clc
n = 20000;
f = randperm(n);
k = 10;
t1 = 0;
t2 = 0;
t3 = 0;
for i=1:7
B = randi(n,n,k);
A = B(f(1:n/10),:);
%A(100,2) = 0; % to make A not submat of B
tic
b = sum(ismember(A,B,'rows')) == size(A,1);
t1 = t1+toc;
assert(b);
tic
b = ismember_mex(A,sortrows(B));
t2 = t2+toc;
assert(b);
tic
b = issubmat(A,B);
t3 = t3+toc;
assert(b);
end
Times in seconds (accumulated tic/toc):
                |          | George's     | skm's
                | ismember | ismember_mex | issubmat
n=20000,k=10    |   0.6326 |       0.1064 |  11.6899
n=1000,k=100    |   0.2652 |       0.0155 |   0.0577
n=1000,k=1000   |   1.1705 |       0.1582 |   0.2202
n=1000,k=10000  |  13.2470 |       2.0033 |   2.6367
* issubmat eats RAM when n or k is over 10000!
* issubmat(A,B) checks whether A is a submatrix of B.
It seems that ismember is hard to beat, at least using MATLAB code. I created a C implementation which can be compiled with the MEX compiler.
#include "mex.h"
#if MX_API_VER < 0x07030000
typedef int mwIndex;
typedef int mwSize;
#endif /* MX_API_VER */
#include <math.h>
#include <stdlib.h>
#include <string.h>
int ismember(const double *y, const double *x, int yrow, int xrow, int ncol);
void mexFunction(int nlhs, mxArray *plhs[],
int nrhs, const mxArray *prhs[])
{
mwSize xcol, ycol, xrow, yrow;
/* output data */
int* result;
/* arguments */
const mxArray* y;
const mxArray* x;
if (nrhs != 2)
{
mexErrMsgTxt("2 input required.");
}
y = prhs[0];
x = prhs[1];
ycol = mxGetN(y);
yrow = mxGetM(y);
xcol = mxGetN(x);
xrow = mxGetM(x);
/* Both inputs must be of type double. */
if (!mxIsDouble(y) || !mxIsDouble(x))
{
mexErrMsgTxt("Input must be of type 'double'.");
}
if (xcol != ycol)
{
mexErrMsgTxt("Inputs must have the same number of columns");
}
plhs[0] = mxCreateLogicalMatrix(1, 1);
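/* Write through the logical data pointer: mxGetPr returns double*, which does not match a logical array. */
*mxGetLogicals(plhs[0]) = (mxLogical) ismember(mxGetPr(y), mxGetPr(x), yrow, xrow, ycol);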
}
int ismemberinner(const double *y, int idx, const double *x, int yrow, int xrow, int ncol) {
int from, to, i;
from = 0;
to = xrow-1;
for(i = 0; i < ncol; ++i) {
// Perform binary search
double yi = *(y + i * yrow + idx);
double *curx = x + i * xrow;
int l = from;
int u = to;
while(l <= u) {
int mididx = l + (u-l)/2;
if(yi < curx[mididx]) {
u = mididx-1;
}
else if(yi > curx[mididx]) {
l = mididx+1;
}
else {
// This can be further optimized by performing additional binary searches
for(from = mididx; from > l && curx[from-1] == yi; --from);
for(to = mididx; to < u && curx[to+1] == yi; ++to);
break;
}
}
if(l > u) {
return 0;
}
}
return 1;
}
int ismember(const double *y, const double *x, int yrow, int xrow, int ncol) {
int i;
for(i = 0; i < yrow; ++i) {
if(!ismemberinner(y, i, x, yrow, xrow, ncol)) {
return 0;
}
}
return 1;
}
Compile it using:
mex -O ismember_mex.c
It can be called as follows:
ismember_mex(x, sortrows(x))
First of all, it assumes that both matrices have the same number of columns. It works by first sorting the rows of the larger matrix (x in this case, the second argument to the function). Then a form of binary search is used to determine whether the rows of the smaller matrix (y hereafter) are contained in x. This is done for each row of y separately (see the ismember C function).
For a given row of y, it starts from the first entry and uses binary search to find the range of row indices (tracked by the from and to variables) whose value in the first column of x matches. This is repeated for the remaining entries within that range, unless some value is not found, in which case it terminates and returns 0.
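The column-by-column narrowing described above can be sketched in a few lines of Python using the standard bisect module. This is only an illustration of the idea under the same assumptions (equal column counts, rows of the larger matrix sorted lexicographically); the name rows_subset is made up here, and this is not a translation of the C code.

```python
from bisect import bisect_left, bisect_right

def rows_subset(a_rows, b_rows):
    """Return True if every row of a_rows also occurs as a row of b_rows."""
    b_sorted = sorted(tuple(r) for r in b_rows)  # plays the role of sortrows(B)
    if not b_sorted:
        return len(a_rows) == 0
    ncol = len(b_sorted[0])
    # Column-major copies, so bisect can search one column at a time.
    cols = [[row[c] for row in b_sorted] for c in range(ncol)]
    for row in a_rows:
        lo, hi = 0, len(b_sorted)  # candidate rows are b_sorted[lo:hi]
        for c, value in enumerate(row):
            # Inside [lo, hi) all earlier columns match, so column c is sorted there.
            lo, hi = (bisect_left(cols[c], value, lo, hi),
                      bisect_right(cols[c], value, lo, hi))
            if lo == hi:  # no row matches up to column c
                return False
    return True

A = [[12, 34, 67], [90, 78, 15], [10, 71, 24]]
B = [[12, 34, 67], [89, 67, 45], [90, 78, 15], [10, 71, 24], [54, 34, 11]]
print(rows_subset(A, B))  # True
```

Each row of the smaller matrix costs at most two binary searches per column, and a mismatch stops the search at the first column that fails.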
I tried implementing this idea in MATLAB, but it didn't work that well. Regarding performance, I found that: (a) when there are mismatches, it is usually much faster than ismember; (b) when the range of values in x and y is large, it is again faster than ismember; and (c) when everything matches and the number of possible values in x and y is small (e.g. less than 1000), ismember may be faster in some situations.
Finally, I want to point out that some parts of the C implementation may be further optimized.
EDIT 1
I fixed the warnings and further improved the function.
#include "mex.h"
#include <math.h>
#include <stdlib.h>
#include <string.h>
int ismember(const double *y, const double *x, unsigned int nrowy, unsigned int nrowx, unsigned int ncol);
void mexFunction(int nlhs, mxArray *plhs[],
int nrhs, const mxArray *prhs[])
{
unsigned int xcol, ycol, nrowx, nrowy;
/* arguments */
const mxArray* y;
const mxArray* x;
if (nrhs != 2)
{
mexErrMsgTxt("2 inputs required.");
}
y = prhs[0];
x = prhs[1];
ycol = (unsigned int) mxGetN(y);
nrowy = (unsigned int) mxGetM(y);
xcol = (unsigned int) mxGetN(x);
nrowx = (unsigned int) mxGetM(x);
/* Both inputs must be of type double. */
if (!mxIsDouble(y) || !mxIsDouble(x))
{
mexErrMsgTxt("Input must be of type 'double'.");
}
if (xcol != ycol)
{
mexErrMsgTxt("Inputs must have the same number of columns");
}
plhs[0] = mxCreateLogicalScalar(ismember(mxGetPr(y), mxGetPr(x), nrowy, nrowx, ycol));
}
int ismemberinner(const double *y, const double *x, unsigned int nrowy, unsigned int nrowx, unsigned int ncol) {
unsigned int from = 0, to = nrowx-1, i;
for(i = 0; i < ncol; ++i) {
// Perform binary search
const double yi = *(y + i * nrowy);
const double *curx = x + i * nrowx;
unsigned int l = from;
unsigned int u = to;
while(l <= u) {
const unsigned int mididx = l + (u-l)/2;
const double midx = curx[mididx];
if(yi < midx) {
// mididx is unsigned: guard against wrap-around when mididx == 0
if(mididx == 0) {
return 0;
}
u = mididx-1;
}
else if(yi > midx) {
l = mididx+1;
}
else {
{
// Binary search to identify smallest index of x that equals yi
// Equivalent to for(from = mididx; from > l && curx[from-1] == yi; --from)
unsigned int limit = mididx;
while(curx[from] != yi) {
const unsigned int mididx = from + (limit-from)/2;
if(curx[mididx] < yi) {
from = mididx+1;
}
else {
limit = mididx-1;
}
}
}
{
// Binary search to identify largest index of x that equals yi
// Equivalent to for(to = mididx; to < u && curx[to+1] == yi; ++to);
unsigned int limit = mididx;
while(curx[to] != yi) {
const unsigned int mididx = limit + (to-limit)/2;
if(curx[mididx] > yi) {
to = mididx-1;
}
else {
limit = mididx+1;
}
}
}
break;
}
}
if(l > u) {
return 0;
}
}
return 1;
}
int ismember(const double *y, const double *x, unsigned int nrowy, unsigned int nrowx, unsigned int ncol) {
unsigned int i;
for(i = 0; i < nrowy; ++i) {
if(!ismemberinner(y + i, x, nrowy, nrowx, ncol)) {
return 0;
}
}
return 1;
}
Using this version I wasn't able to identify any case where ismember is faster. Also, I noticed that one reason ismember is hard to beat is that it uses all cores of the machine! Of course, the function I provided can be optimized to do this too, but this requires much more effort.
Finally, before using my implementation I would advise you to do extensive testing. I did some testing and it seems to work, but I suggest you also do some additional testing.
For small matrices ismember should be enough, probably.
Usage: ismember(B,A,'rows')
ans =
1
0
1
1
0
I put this answer here to emphasize the need for solutions with higher performance. I will accept this answer only if there is no better solution.
With ismember, if a row of A appears twice in B while another row is missing, the result might wrongly indicate that A is a member of B. The following solution is suitable if the rows of A and B don't need to be in the same order. However, I haven't tested its performance for large matrices.
A = [...
34 12 67;
90 78 15;
10 71 24];
B = [...
34 12 67; % found
89 67 45;
90 78 15; % found
10 71 24; % found, so A is subset of B.
54 34 11];
A = permute(A,[3 2 1]);
rowIdx = all(bsxfun(@eq,B,A),2);
colIdx = any(rowIdx,1);
isAMemberB = all(colIdx);
You have said the number of columns is <= 10. In addition, if the matrix elements are all integers representable as bytes, you could encode each row into two 64-bit integers. That would reduce the number of comparisons per row from up to 10 to 2.
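As a rough illustration of the byte-packing idea, here is a Python sketch. It assumes every entry is an integer in 0..255 and that all rows have the same length. Python integers are arbitrary precision, so a row is packed into a single integer here rather than two 64-bit words, and the helper names are hypothetical.

```python
def pack_row(row):
    """Pack a row of byte-sized integers (0..255) into one integer key."""
    return int.from_bytes(bytes(row), "big")

def rows_subset_packed(a_rows, b_rows):
    """True if every row of a_rows occurs in b_rows, comparing packed keys only."""
    packed_b = {pack_row(r) for r in b_rows}
    return all(pack_row(r) in packed_b for r in a_rows)

A = [[34, 12, 67], [90, 78, 15], [10, 71, 24]]
B = [[34, 12, 67], [89, 67, 45], [90, 78, 15], [10, 71, 24], [54, 34, 11]]
print(rows_subset_packed(A, B))  # True
```

Once rows are packed, each row comparison is a single integer comparison (or a hash lookup, as here) instead of one comparison per column.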
For the general case, the following may not be much better for thin matrices, but it scales very well as the matrices get fat, thanks to the level 3 (BLAS) matrix multiplication:
function yes = is_submat(A,B)
ma = size(A, 1);
mb = size(B, 1);
n = size(B, 2);
yes = false;
if ma >= mb
a = A(:,1);
b = B(:,1);
D = (0 == bsxfun(@minus, a, b'));
q = any(D, 2);
yes = all(any(D,1));
if yes && (n > 1)
A = A(q, :);
C = B*A';
za = sum(A.*A, 2);
zb = sum(B.*B, 2);
Z = sqrt(zb)*sqrt(za');
[~, ix] = max(C./Z, [], 2);
A = A(ix,:);
yes = all(A(:) == B(:));
end
end
end
In the above, I use the fact that the dot product is maximized when two unit vectors are equal.
For fat matrices (say 5000+ columns) with large numbers of unique elements the performance beats ismember quite handily, but otherwise, it is slower than ismember. For thin matrices ismember is faster by an order of magnitude.
Best case test for this function:
A = randi(50000, [10000, 10000]);
B = A(2:3:end, :);
B = B(randperm(size(B,1)),:);
fprintf('%s: %u\n', 'Number of columns', size(A,2));
fprintf('%s: %u\n', 'Element spread', 50000);
tic; is_submat(A,B); toc;
tic; all(ismember(B,A,'rows')); toc;
fprintf('________\n\n');
is_submat_test;
Number of columns: 10000
Element spread: 50000
Elapsed time is 10.713310 seconds (is_submat).
Elapsed time is 17.446682 seconds (ismember).
So I have to admit, all round ismember seems to be much better.
Edits: Edited to correct bug when there is only one column - fixing this also results in more efficient code. Also previous version did not distinguish between positive and negative numbers. Added timing tests.

Arrays in Merge Sort Algorithm

I'm trying to understand how the merge sort algorithm works. What I don't understand is where all the data members are stored while the sorting operations are performed on the two created arrays.
I now understand that merge sort doesn't create new arrays; it just creates logical arrays and performs all the operations on the original array. This article will clear up your concepts about merge sort in C++.
Please see the following link for your solution.
http://www.algolist.net/Algorithms/Merge/Sorted_arrays
// size of C array must be equal or greater than
// sum of A and B arrays' sizes
public void merge(int[] A, int[] B, int[] C) {
int i, j, k, m, n;
i = 0;
j = 0;
k = 0;
m = A.length;
n = B.length;
while (i < m && j < n) {
if (A[i] <= B[j]) {
C[k] = A[i];
i++;
} else {
C[k] = B[j];
j++;
}
k++;
}
if (i < m) {
for (int p = i; p < m; p++) {
C[k] = A[p];
k++;
}
} else {
for (int p = j; p < n; p++) {
C[k] = B[p];
k++;
}
}
}
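To make the point about "logical arrays" concrete, here is a minimal Python sketch of a full merge sort in which the two halves are just index ranges over the original array, and only the merge step uses a temporary buffer. It is one common way to organize the algorithm, not the code from the linked article.

```python
def merge_sort(arr, lo=0, hi=None):
    """Sort arr[lo:hi] in place; the halves are index ranges, not new arrays."""
    if hi is None:
        hi = len(arr)
    if hi - lo <= 1:
        return
    mid = (lo + hi) // 2
    merge_sort(arr, lo, mid)   # left half lives in arr[lo:mid]
    merge_sort(arr, mid, hi)   # right half lives in arr[mid:hi]

    merged = []                # temporary buffer used only while merging
    i, j = lo, mid
    while i < mid and j < hi:
        if arr[i] <= arr[j]:
            merged.append(arr[i])
            i += 1
        else:
            merged.append(arr[j])
            j += 1
    merged.extend(arr[i:mid])  # leftovers from the left half, if any
    merged.extend(arr[j:hi])   # leftovers from the right half, if any
    arr[lo:hi] = merged        # copy the merged run back into the original array

values = [5, 2, 9, 1, 5, 6]
merge_sort(values)
print(values)  # [1, 2, 5, 5, 6, 9]
```

The data therefore always ends up back in the original array; the buffer exists only for the duration of one merge.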

Porting signal windowing code from Matlab to Java

This is part of the code of a spectral subtraction algorithm. I'm trying to optimize it for Android. Please help me.
This is the MATLAB code:
function Seg=segment(signal,W,SP,Window)
% SEGMENT chops a signal to overlapping windowed segments
% A= SEGMENT(X,W,SP,WIN) returns a matrix which its columns are segmented
% and windowed frames of the input one dimentional signal, X. W is the
% number of samples per window, default value W=256. SP is the shift
% percentage, default value SP=0.4. WIN is the window that is multiplied by
% each segment and its length should be W. the default window is hamming
% window.
% 06-Sep-04
% Esfandiar Zavarehei
if nargin<3
SP=.4;
end
if nargin<2
W=256;
end
if nargin<4
Window=hamming(W);
end
Window=Window(:); %make it a column vector
L=length(signal);
SP=fix(W.*SP);
N=fix((L-W)/SP +1); %number of segments
Index=(repmat(1:W,N,1)+repmat((0:(N-1))'*SP,1,W))';
hw=repmat(Window,1,N);
Seg=signal(Index).*hw;
And this is our Java code for this function:
public class MatrixAndSegments
{
public int numberOfSegments;
public double[][] res;
public MatrixAndSegments(int numberOfSegments,double[][] res)
{
this.numberOfSegments = numberOfSegments;
this.res = res;
}
}
public MatrixAndSegments segment (double[] signal_in,int samplesPerWindow, double shiftPercentage, double[] window)
{
//default shiftPercentage = 0.4
//default samplesPerWindow = 256 //W
//default window = hamming
int L = signal_in.length;
shiftPercentage = fix(samplesPerWindow * shiftPercentage); //SP
int numberOfSegments = fix ( (L - samplesPerWindow)/ shiftPercentage + 1); //N
double[][] reprowMatrix = reprowtrans(samplesPerWindow,numberOfSegments);
double[][] repcolMatrix = repcoltrans(numberOfSegments, shiftPercentage,samplesPerWindow );
//Index=(repmat(1:W,N,1)+repmat((0:(N-1))'*SP,1,W))';
double[][] index = new double[samplesPerWindow+1][numberOfSegments+1];
for (int x = 1; x < samplesPerWindow+1; x++ )
{
for (int y = 1 ; y < numberOfSegments + 1; y++) //numberOfSegments was 3
{
index[x][y] = reprowMatrix[x][y] + repcolMatrix[x][y];
}
}
//hamming window
double[] hammingWindow = this.HammingWindow(samplesPerWindow);
double[][] HW = repvector(hammingWindow, numberOfSegments);
double[][] seg = new double[samplesPerWindow][numberOfSegments];
for (int y = 1 ; y < numberOfSegments + 1; y++)
{
for (int x = 1; x < samplesPerWindow+1; x++)
{
seg[x-1][y-1] = signal_in[ (int)index[x][y]-1 ] * HW[x-1][y-1];
}
}
MatrixAndSegments Matrixseg = new MatrixAndSegments(numberOfSegments,seg);
return Matrixseg;
}
public int fix(double val) {
if (val < 0) {
return (int) Math.ceil(val);
}
return (int) Math.floor(val);
}
public double[][] repvector(double[] vec, int replications)
{
double[][] result = new double[vec.length][replications];
for (int x = 0; x < vec.length; x++) {
for (int y = 0; y < replications; y++) {
result[x][y] = vec[x];
}
}
return result;
}
public double[][] reprowtrans(int end, int replications)
{
double[][] result = new double[end +1][replications+1];
for (int x = 1; x <= end; x++) {
for (int y = 1; y <= replications; y++) {
result[x][y] = x ;
}
}
return result;
}
public double[][] repcoltrans(int end, double multiplier, int replications)
{
double[][] result = new double[replications+1][end+1];
for (int x = 1; x <= replications; x++) {
for (int y = 1; y <= end ; y++) {
result[x][y] = (y-1)*multiplier;
}
}
return result;
}
public double[] HammingWindow(int size)
{
double[] window = new double[size];
for (int i = 0; i < size; i++)
{
window[i] = 0.54-0.46 * (Math.cos(2.0 * Math.PI * i / (size-1)));
}
return window;
}
"Porting" Matlab code statement by statement to Java is a bad approach.
Data is rarely manipulated in Matlab using loops and addressing individual elements (because the Matlab interpreter/VM is rather slow), but rather through calls to block processing functions (which have been carefully written and optimized). This leads to a very idiosyncratic programming style in which repmat, reshape, find, fancy indexing et al. are used to do operations which would be much more naturally expressed through Java loops.
For example, to multiply each column of a matrix A by a vector v, you will write in matlab:
A = diag(v) * A
or
A = repmat(v', 1, size(A, 2)) .* A
This solution:
for i = 1:size(A, 2),
A(:, i) = A(:, i) .* v';
end;
is inefficient.
But it would be terribly foolish to try to do the same thing in Java and invoke a matrix product or to build a matrix with repeated copies of v. Instead, just do:
for (int i = 0; i < rows; i++) {
for (int j = 0; j < columns; j++) {
a[i][j] *= v[i];
}
}
I suggest that you try to understand what this MATLAB function actually does, rather than how it does it, and reimplement it from scratch in Java, setting aside everything about the MATLAB implementation except the specification given in the comments. Indeed, half of the code you have written is unnecessary. Actually, it seems to me that this function isn't needed at all, and what it does could be efficiently integrated into the caller's code.
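Read from its comments, the MATLAB function only cuts the signal into overlapping frames of W samples, each starting fix(W*SP) samples after the previous one, and multiplies each frame by the window. A minimal loop-based sketch of that specification follows, written in Python for brevity (the same two nested loops carry over to Java directly). The names are illustrative, and frames are returned as rows here, whereas the MATLAB version stores them as columns.

```python
import math

def hamming(size):
    """Hamming window, using the same formula as HammingWindow in the question."""
    if size == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * i / (size - 1))
            for i in range(size)]

def segment(signal, w=256, shift_fraction=0.4, window=None):
    """Chop signal into overlapping windowed frames of length w."""
    if window is None:
        window = hamming(w)
    step = int(w * shift_fraction)  # fix(W .* SP) for positive values
    if step <= 0:
        raise ValueError("w * shift_fraction must be at least 1")
    if len(signal) < w:
        return []                   # not enough samples for a single frame
    n_frames = (len(signal) - w) // step + 1
    return [[signal[k * step + i] * window[i] for i in range(w)]
            for k in range(n_frames)]

frames = segment(list(range(10)), w=4, shift_fraction=0.5, window=[1, 1, 1, 1])
print(frames)  # [[0, 1, 2, 3], [2, 3, 4, 5], [4, 5, 6, 7], [6, 7, 8, 9]]
```

No index matrix and no replicated window matrix are needed: the start of frame k is simply k * step.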

What is the returned value?

In a language that passes parameters by reference, given the following function:
int function g(x, y) {
x = x + 1;
y = y + 2;
return x + y;
}
If i = 3 and g(i,i) is called, what value is returned? I thought it was 9; is this correct?
If it's pass-by-reference (your original question was C but C doesn't have pass-by-reference and the question has changed since then anyway, so I'll answer generically), it's probably the case that x and y will simply modify the variables that are passed in for them. That's what a reference is, after all.
In this case, they're both a reference to the same variable i, so your sequence is likely to be:
i = i + 1; // i becomes 4.
i = i + 2; // i becomes 6.
return i + i; // return i + i, or 12.
You can see this in operation with the following C (using pointers to emulate pass-by-reference):
pax$ cat qq.c
#include <stdio.h>
int g(int *x, int *y) {
*x = *x + 1;
*y = *y + 2;
return *x + *y;
}
int main (void) {
int i = 3;
int rv = g (&i, &i);
printf ("Returned: %d\n", rv);
return 0;
}
pax$ gcc -o qq qq.c ; ./qq
Returned: 12
Your result of 9 seems to be assuming that the references are distinct from one another, such as in the following code:
#include <stdio.h>
int g(int *x, int *y) {
*x = *x + 1;
*y = *y + 2;
return *x + *y;
}
int main (void) {
int i1 = 3, i2 = 3;
int rv = g (&i1, &i2);
printf ("Returned: %d\n", rv);
return 0;
}
(this does output 9) but that's not usually the case with reference types.
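The same aliasing effect can be reproduced in any language by passing a mutable cell. Here is a small Python sketch in which a one-element list stands in for a reference; this is an emulation chosen for illustration, since Python itself does not pass integers by reference.

```python
def g(x, y):
    # x and y are one-element lists acting as references to an int
    x[0] = x[0] + 1
    y[0] = y[0] + 2
    return x[0] + y[0]

i = [3]
print(g(i, i))  # 12: x and y alias the same cell, which ends up holding 6

a, b = [3], [3]
print(g(a, b))  # 9: distinct cells, 4 + 5
```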