realloc():invalid next size - realloc

So I have some code that works fine with small text files but crashes with larger ones. The point of the code is to take a file and a parameter n, parse through the file, and save everything in a 2D array in chunks of size n. So buffer[0][0] through buffer[0][n-1] should hold n characters, buffer[1][0] through buffer[1][n-1] should hold the next n characters, and so on. My code works when the file only has a few words, but with a larger file I get an error saying realloc(): invalid next size. Any ideas why? Here is my code.
void bsort(int n)
{
int numwords= 0;
int numlets=0;
char ** buffer=(char**)malloc(numwords*n);
while (!feof(stdin))
{
char l= getchar();
if (l!= EOF)
{
if (numlets%n==0)
{
numwords=numwords+1;
buffer=(char**)realloc(buffer,numwords*n);
if(!buffer)
{
printf("Allocation error!");
}
buffer[numwords-1]= (char*) malloc (n);
buffer[numwords-1][numlets%n]=l;
// printf("%c", buffer[numwords-1][numlets%n]);
numlets=numlets+1;
}
}
int i,j;
for (i=0; i < numwords; i++)
{
for(j=0; j< n; j++)
{
printf("%c",buffer[i][j]);
}
}

It looks as if each time you get a character, you are reallocating your buffer. That seems a little off to me. Have you thought of allocating some space, doing a memset to \0, and just managing the current size and buffer size separately?
It may be that realloc is having issues with a pointer to nothing at first. If it fails after the first character input, you might be having issues with your first malloc(). Pre-allocating some space would solve that.
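To make the "manage the current size and the buffer size separately" idea concrete, here is a minimal sketch. The struct and function names (bytebuf, bytebuf_push) and the starting capacity of 64 are my own assumptions, not anything from the question. It grows the allocation by doubling instead of on every character and zeroes the newly added tail.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable byte buffer; all names here are illustrative. */
struct bytebuf {
    char  *data;  /* allocated storage, or NULL while empty */
    size_t len;   /* bytes currently in use */
    size_t cap;   /* bytes currently allocated */
};

/* Appends one byte, doubling the capacity when the buffer is full.
   Returns 0 on success, -1 if the allocation fails. */
int bytebuf_push(struct bytebuf *b, char c)
{
    assert(b != NULL);
    if (b->len == b->cap) {
        size_t newcap = b->cap ? b->cap * 2 : 64;  /* 64 is an arbitrary start */
        char *tmp = realloc(b->data, newcap);      /* realloc(NULL, n) acts like malloc(n) */
        if (!tmp)
            return -1;                             /* b->data is still valid here */
        memset(tmp + b->cap, 0, newcap - b->cap);  /* zero only the new tail */
        b->data = tmp;
        b->cap = newcap;
    }
    b->data[b->len++] = c;
    return 0;
}
```

Start from a zero-initialised struct (struct bytebuf b = {0};), call bytebuf_push once per character read, and free(b.data) when done.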

AFAIK, malloc(0) is not guaranteed to return a useful pointer you can realloc().
The documentation only guarantees that malloc(0) returns either null or a pointer that can safely be used to call free().
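Something else in the question's code looks suspicious, independent of malloc(0). buffer is a char **, but both the malloc and the realloc size it as numwords*n bytes rather than numwords*sizeof(char *). If n is smaller than sizeof(char *), the write to buffer[numwords-1] runs past the end of the allocation, and that kind of heap corruption is what glibc tends to report as "realloc(): invalid next size". Below is a minimal sketch of the same loop with the sizing changed. read_chunks and its parameters are names I made up for illustration, and it reads from any FILE * rather than stdin. It also keeps the character in an int so EOF can be told apart from a real byte.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not from the original post): reads `in` until EOF and
   returns an array of n-byte chunks, the last one zero-padded. *count receives
   the number of chunks. Returns NULL on allocation failure or empty input. */
char **read_chunks(FILE *in, size_t n, size_t *count)
{
    assert(n > 0);
    char **chunks = NULL;
    size_t numwords = 0, numlets = 0;
    int c;  /* int, not char, so EOF is distinguishable from data */

    while ((c = fgetc(in)) != EOF) {
        if (numlets % n == 0) {
            /* Grow the pointer table: each element is a char *, not n bytes. */
            char **tmp = realloc(chunks, (numwords + 1) * sizeof *tmp);
            if (!tmp)
                goto fail;
            chunks = tmp;
            chunks[numwords] = calloc(n, 1);  /* one zeroed n-byte chunk */
            if (!chunks[numwords])
                goto fail;
            numwords++;
        }
        chunks[numwords - 1][numlets % n] = (char)c;
        numlets++;
    }
    *count = numwords;
    return chunks;

fail:
    for (size_t i = 0; i < numwords; i++)
        free(chunks[i]);
    free(chunks);
    *count = 0;
    return NULL;
}
```

The caller owns the result: free each chunks[i] and then chunks itself. Growing the pointer table one entry at a time mirrors the question's loop; doubling it would cut down on realloc calls.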


mex function memory leak

I am new to writing MEX-functions and I have a memory problem. The MEX gateway routine is as follows:
void mexFunction (int nlhs, mxArray *plhs[], int nrhs,const mxArray *prhs[]){
double *ecg; /*Pointer to double for input data*/
double *outArray; /*Pointer to double for output data*/
void *dyn; /*Pointer to void for the dynamic allocation of memory*/
int N=0;
int i=1;
int j=0;
int k=0;
/*CHECK FOR PROPER NUMBER OF ARGUMENTS*/
if (nrhs != 1 ) mexErrMsgIdAndTxt("EplimitedQRSDetector:NoInput", "This function takes one input argument: ECG.");
else if(nlhs!=1) mexErrMsgIdAndTxt("EplimitedQRSDetector:NoOutput", "This function requires one output argument.");
/*LOAD INPUT DATA AND ALLOCATE OUTPUT MEMORY*/
ecg=mxGetPr(prhs[0]); /*Input data loading*/
N=(int) mxGetM(prhs[0]);
plhs[0]=mxCreateDoubleMatrix(0,0,mxREAL);
dyn = mxCalloc(N,sizeof(double)); /*Dynamic memory allocation*/
outArray=(double*) dyn;
/*CALL THE SUBROUTINE*/
for (j=0;j<N;j++){
outArray[k]=QRSDet(ecg[j], i );
if (outArray[k]!=0){
outArray[k]=j-outArray[k];
k++;
}
i=0;
}
/*FILL THE OUTPUT ARRAY*/
mxSetData(plhs[0], outArray);
mxSetM(plhs[0], k-1);
mxSetN(plhs[0], 1);
mxFree(dyn);
mxFree(outArray);
return;
}
When I call the MEX-function from the MATLAB command window, I get the error message "maximum variable size allowed by the function is exceeded". Since the function worked well the first few times I used it, I think the problem is that I don't free memory in the right way in my code. Any suggestions would be greatly appreciated :) Thanks!
The code is now running thanks to the modifications suggested by Navan. In addition to the improper use of mxFree, these 3 lines were causing a segmentation violation:
mxSetData(plhs[0], outArray);
mxSetM(plhs[0], k-1);
mxSetN(plhs[0], 1);
outArray points to an Nx1 array allocated with mxCalloc, so setting the first dimension of plhs[0] to (k-1) != N causes the segmentation violation. Once I replaced that line with
mxSetM(plhs[0], N);
the algorithm started to work properly. Thank you for your help.
In your code you should not call mxFree on the memory you allocated. That memory has to go back to MATLAB, since it is the output. You are also calling mxFree twice on the same pointer (dyn and outArray are the same address). mxSetData does not copy your data; it only sets the pointer.
I think in the call to mxSetM you need to pass k instead of k-1, unless you are intentionally ignoring the last value.

kdb c++ interface: create byte list from std::string

The following is very slow for long strings:
std::string s = "long string";
K klist = DBVec::CreateList(KG , s.length());
for (int i=0; i<s.length(); i++)
{
kG(klist)[i]=s.c_str()[i];
}
It works acceptably fast (<100ms) for strings up to 100k, but slows to a crawl (tens of minutes, possibly hours) for strings of a few million characters. I don't see anything other than kG that can create nonlinearity. I don't see any reason for accessor function kG to be non-constant time, but there is just nothing else in this loop. Unfortunately I don't know how kG works due to lack of documentation.
Question: given a blob of binary data as std::string, what's the efficient way to construct a byte list?
kG is a macro defined in k.h which expands to ((x)->G0), i.e. follow the G0 pointer of the K object
http://kx.com/q/d/a/c.htm#Strings documents kp, which creates a K string object directly from a string, so presumably you could do K klist = kp(s.c_str()), which is probably faster
This works:
memcpy(kG(klist), s.c_str(), s.length());
Still wonder why that loop is not O(N).

How to reverse a string without allocating memory

I was asked this question on how to reverse a string without allocating memory. Any takers?
You cannot reverse an NSString, with or without allocating memory, because an NSString is immutable.
You cannot reverse an NSMutableString in place without allocating memory, because the only methods that NSMutableString provides to replace its contents require the new characters to be specified in an NSString, which you would have to allocate.
CFMutableString has the same “problem”.
void reverseStringBetter(char* str)
{
int i, j;
i=j=0;
j=(int)strlen(str)-1;
for (i=0; i<j; i++, j--)
{
str[i] ^= str[j] ;
str[j] ^= str[i] ;
str[i] ^= str[j] ;
}
}
It is not possible with NSString since they are immutable and the only way is to create a new string.
Though this might not be what you are looking for, you can convert the NSString to a normal c-string, and edit that in-place. You are still allocating memory, but you'll at least get half of what you want by being able to modify the string in place.
I'm not sure what your use case is for not wanting to allocate memory, or if this is simply a hypothetical.

Unable to understand the block's lexical scope

To understand the lexical scope of blocks, I have written the following code:
typedef int (^ MyBlock)(void);
MyBlock b[3];
for (int i=0; i<3; i++) {
b[i]=^{return i;};
}
for (int i=0; i<3; i++) {
NSLog(@"%d",b[i]());
}
NSLog(@"----------------------------");
int j=0;
b[0]=^{return j;};
j++;
b[1]=^{return j;};
j++;
b[2]=^{return j;};
for (int i=0; i<3; i++) {
NSLog(@"%d",b[i]());
}
The first loop outputs 2,2,2.
The second loop outputs 0,1,2.
I was expecting 2,2,2 from both sets of blocks.
Can anybody please explain why this is so?
I assume you’ve been reading bbum’s post on blocks and know that your code isn’t correct since you aren’t copying the blocks from the stack to the heap.
That said:
for (int i=0; i<3; i++) {
b[i]=^{return i;};
}
does the following in each iteration:
Allocates space in the stack for a block variable. Let's say its memory address is A;
Creates the block in the stack and assigns its address (A) to b[i];
At the end of the iteration, since the compound statement/scope ({}) has ended, pops whatever was in the stack and resets the stack pointer.
The stack grows at the beginning of each iteration, and shrinks at the end of each iteration. This means that all blocks are being created in the same memory address, namely A. This also means that all elements in the b array end up pointing to the same block, namely the last block that was created. You can test this by running the following code:
for (int i = 0; i < 3; i++) {
printf("%p\n", (void *)b[i]);
}
which should output something like:
0x7fff5fbff9e8
0x7fff5fbff9e8
0x7fff5fbff9e8
All elements point to the same block, the last one created in the memory address A = 0x7fff5fbff9e8.
On the other hand, when you do the following:
b[0]=^{return j;};
j++;
b[1]=^{return j;};
j++;
b[2]=^{return j;};
there’s no compound statement that defines the same scope for all blocks. This means that each time you create a block its address is further down the stack, effectively assigning a different address to each block. Since all blocks are different, they correctly capture the current runtime value of j.
If you print the address of those blocks as described earlier, you should get an output similar to:
0x7fff5fbff9b8
0x7fff5fbff990
0x7fff5fbff968
showing that each block is at a different memory address.
The i you use to iterate the array of blocks with b[i] is not the i that is used inside every block. The blocks you define reference the i that is used during definition. And that i is changed to 2. Then you iterate through those blocks with another i, but the block still refers to the original i (which by now holds the value 2) that you used during definition of the block although that i is already "dead" for the rest of the program.
In the 2nd case your blocks also use a common variable, but you change it each time before you create the next block.
The key is: the block is always associated with the variable that was used during definition. The i in the 2nd for-loop is not the i the blocks refer to.
Blocks can be invoked when the defining code is already "dead". For that, the referenced variables have an "extended" lifetime. They might also be moved to the heap by the runtime.
Take a look at the WWDC 2010 video "Session 206 - Introducing Blocks and Grand Central Dispatch on iPhone".

Using memcpy/memset

When using memset or memcpy within an Obj-C program, will the compiler optimise the setting (memset) or copying (memcpy) of data into 32-bit writes or will it do it byte by byte?
You can see the libc implementations of these methods in the Darwin source. In 10.6.3, memset works at the word level. I didn't check memcpy, but probably it's the same.
You are correct that it's possible for the compiler to do the work inline instead of calling these functions. I suppose I'll let someone who knows better answer what it will do, though I would not expect a problem.
Memset will come as part of your standard C library so it depends on the implementation you are using. I would guess most implementations will copy in blocks of the native CPU size (32/64 bits) and then the remainder byte-by-byte.
Here is glibc's version of memcpy for an example implementation:
void *
memcpy (dstpp, srcpp, len)
void *dstpp;
const void *srcpp;
size_t len;
{
unsigned long int dstp = (long int) dstpp;
unsigned long int srcp = (long int) srcpp;
/* Copy from the beginning to the end. */
/* If there not too few bytes to copy, use word copy. */
if (len >= OP_T_THRES)
{
/* Copy just a few bytes to make DSTP aligned. */
len -= (-dstp) % OPSIZ;
BYTE_COPY_FWD (dstp, srcp, (-dstp) % OPSIZ);
/* Copy whole pages from SRCP to DSTP by virtual address manipulation,
as much as possible. */
PAGE_COPY_FWD_MAYBE (dstp, srcp, len, len);
/* Copy from SRCP to DSTP taking advantage of the known alignment of
DSTP. Number of bytes remaining is put in the third argument,
i.e. in LEN. This number may vary from machine to machine. */
WORD_COPY_FWD (dstp, srcp, len, len);
/* Fall out and copy the tail. */
}
/* There are just a few bytes to copy. Use byte memory operations. */
BYTE_COPY_FWD (dstp, srcp, len);
return dstpp;
}
So you can see it copies a few bytes first to get aligned, then copies in words, then finally in bytes again. It does some optimized page copying using some kernel operations.
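To illustrate the align / word-copy / tail structure without the glibc macros, here is a simplified sketch. It is not the glibc code: sketch_memcpy is a made-up name, the page-copy step is left out, and unsigned long stands in for the machine word. The fixed-size memcpy calls inside the word loop are only there to do the word load and store portably (the source may be unaligned), and compilers typically turn them into single instructions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Illustrative only: byte-copy until dst is word-aligned, then copy whole
   words, then byte-copy whatever is left. Regions must not overlap. */
void *sketch_memcpy(void *dst, const void *src, size_t len)
{
    assert(len == 0 || (dst != NULL && src != NULL));
    unsigned char *d = dst;
    const unsigned char *s = src;

    /* 1. Copy a few bytes so that the destination becomes word-aligned. */
    while (len > 0 && (uintptr_t)d % sizeof(unsigned long) != 0) {
        *d++ = *s++;
        len--;
    }

    /* 2. Copy one word at a time while at least a full word remains. */
    while (len >= sizeof(unsigned long)) {
        unsigned long w;
        memcpy(&w, s, sizeof w);  /* portable word load from src */
        memcpy(d, &w, sizeof w);  /* word store to the aligned dst */
        d += sizeof w;
        s += sizeof w;
        len -= sizeof w;
    }

    /* 3. Copy the remaining tail byte by byte. */
    while (len > 0) {
        *d++ = *s++;
        len--;
    }
    return dst;
}
```

In real code you would simply call memcpy; the point of the sketch is only to show where the byte-by-byte and word-sized phases fall.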