Finding the row indices of a logical vector without using find() - MATLAB

My program handles a huge amount of data, and the find function is to blame for much of the execution time. At some point I get a logical vector, and I want to extract the row indices of the 1 (true) elements in that vector. How can I do that without using the find function?
Here's a demo:
temp = rand(10000000, 1);
temp1 = temp > 0.5;
temp2 = find(temp1);
But it is too slow when the data gets much larger. Any suggestions?
Thank you

find seems to be a highly optimized function. What I did was create a MEX version restricted to this particular problem. Running time was cut in half. :)
Here is the code:
#include <math.h>
#include <matrix.h>
#include <mex.h>
void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
    mxLogical *in;
    double *out;
    int i, nInput, nTrues;

    // Get the number of elements of the input.
    nInput = mxGetNumberOfElements(prhs[0]);
    // Get a pointer to the logical input array.
    in = mxGetLogicals(prhs[0]);
    // Allocate memory for the output. As we don't know the number of
    // matches, we allocate an array the same size as the input. We will
    // probably reallocate it later.
    out = mxMalloc(sizeof(double) * nInput);
    // Count the 'trues' and store their positions. Note that i is
    // post-incremented inside the test, so out[] receives the 1-based
    // index that MATLAB expects.
    for (nTrues = 0, i = 0; i < nInput; )
        if (in[i++])
            out[nTrues++] = i;
    // Shrink the array, if necessary.
    if (nTrues < nInput)
        out = mxRealloc(out, sizeof(double) * nTrues);
    // Assign the indices to the output array.
    plhs[0] = mxCreateDoubleMatrix(0, 0, mxREAL);
    mxSetPr(plhs[0], out);
    mxSetM(plhs[0], nTrues);
    mxSetN(plhs[0], 1);
}
Just save it to a file called, for example, find2.c and compile with mex find2.c.
Assuming:
temp = rand(10000000, 1);
temp1 = temp > 0.5;
Running times:
tic
temp2 = find(temp1);
toc
Elapsed time is 0.082875 seconds.
tic
temp2 = find2(temp1);
toc
Elapsed time is 0.044330 seconds.
IMPORTANT NOTE: this function has no error handling. It assumes the input is always a logical array and the output is a double array. Caution is required.
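Since the routine is unguarded, one might validate the input at the call site instead; a minimal usage sketch, assuming find2 was compiled as above:
mask = temp > 0.5;                              % logical vector
assert(islogical(mask), 'find2 expects a logical array');
idx = find2(mask);                              % 1-based indices as doubles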

You could try to split your calculation into small pieces. This will not reduce the amount of computation you have to do, but it might still be faster, because each piece fits into the fast cache memory instead of the slow main memory (in the worst case you might even be swapping to disk). Something like this:
temp = rand(10000000, 1);
n = 100000; % chunk size
for i = 1:floor(length(temp) / n)
    chunk = temp(((i-1) * n + 1):(i*n));
    temp1 = chunk > 0.5;
    temp2 = find(temp1);   % indices are relative to the chunk
    do_stuff(temp2)
end
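Note that find here returns indices relative to the current chunk. If do_stuff needs positions in the full vector, an offset correction is required; a sketch (how do_stuff consumes the indices is an assumption):
offset = (i-1) * n;
do_stuff(temp2 + offset);   % chunk-local indices shifted to positions in temp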

You can create an array of regular indices and then apply logical indexing. I didn't check whether it is faster than find, though.
Example:
Index = 1:numel(temp);
Found = Index(temp1);
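For reference, a quick benchmark sketch of the two approaches (timings will vary by machine and MATLAB version):
temp  = rand(10000000, 1);
temp1 = temp > 0.5;
tic; idx1 = find(temp1);                          toc
tic; Index = 1:numel(temp); idx2 = Index(temp1);  toc
isequal(idx1, idx2(:))   % same indices; (:) fixes the row/column orientation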

Related

MATLAB: fastest method of reading parts/sequences of a large binary file

I want to read parts of a large (ca. 11 GB) binary file. The currently working solution is to load the entire file (raw_data) with fread(), then crop out the pieces of interest (data).
Question: Is there a faster method of reading small (1-2% of the total file, partially sequential reads) parts of a file, given something like a binary mask (i.e. a logical index of the specific bytes of interest) in MATLAB? Specifics below.
Notes for my specific case:
data of interest (2.6e+7 bytes, or ca. 24 MB) is roughly 2% of raw_data (1.2e+10 bytes, or ca. 11 GB)
every 600,000 bytes contain ca. 6,500 byte reads, which can be broken down into roughly 1,200 read-skip cycles (such as 'read 10 bytes, skip 5,000 bytes')
the read instructions for the total file can be broken down into ca. 20,000 similar (but not exactly identical) read-skip cycles (i.e. ca. 20,000 x 1,200 read-skip cycles)
The file is read from a GPFS (parallel file system)
Ample RAM, the newest MATLAB version, and all toolboxes are available for the task
My initial idea of an fread-fseek cycle proved to be extraordinarily slow compared to reading the whole file (see pseudocode below). Profiling revealed fread() is the slowest part (it is called over a million times, which is probably obvious to the experts here).
Alternatives I considered: memmapfile() [ref] offers no feasible way to read multiple small parts, as far as I could find. The MappedTensor library might be the next thing I'd look into. Related articles that didn't help, linked for reference: 1, 2.
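For illustration, the memmapfile route would look roughly like this (byte_idx is a hypothetical precomputed vector of the byte positions of interest; I did not find this feasible at this scale):
% Sketch only: map the file as one uint8 vector and gather the bytes of
% interest with a precomputed index vector (byte_idx is hypothetical).
m = memmapfile('data.bin', 'Format', 'uint8');
data = m.Data(byte_idx);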
% open file
fi = fopen('data.bin');
% example read-skip data
f_reads = [20 10 6 20 40];        % read this number of bytes
f_skips = [900 6000 40 300 600];  % skip this many bytes after each read
nbr_read_skip_cycles = numel(f_reads);
data = [];                        % save the result here
fseek(fi, 90000, 'bof');          % skip initial bytes until first read
% read the file
for ind = 1:nbr_read_skip_cycles
    tmp_data = fread(fi, f_reads(ind));
    data = [data; tmp_data];         % add newly read bytes to data variable
    fseek(fi, f_skips(ind), 'cof');  % skip to next read position
end
fclose(fi);
FYI: To get an overview and for transparency, I've compiled some plots of the first ca. 6,500 read locations (of my actual data) which, after collapsing into fread-fseek pairs, can be summarized in 1,200 such pairs.
I would do two things to speed up your code:
1) preallocate the data array.
2) write a C MEX-file to call fread and fseek.
This is a quick test I did to compare using fread and fseek from MATLAB or C:
%% Create large binary file
data = 1:10000000; % 80 MB
fi = fopen('data.bin', 'wb');
fwrite(fi, data, 'double');
fclose(fi);
n_read = 1;
n_skip = 99;
%% Read using MATLAB
tic
fi = fopen('data.bin', 'rb');
fseek(fi, 0, 'eof');
sz = ftell(fi);
sz = floor(sz / (n_read + n_skip));
data = zeros(1, sz);
fseek(fi, 0, 'bof');
for ind = 1:sz
    data(ind) = fread(fi, n_read, 'int8');
    fseek(fi, n_skip, 'cof');
end
fclose(fi);
toc
%% Read using C MEX-file
mex fread_test_mex.c
tic
data = fread_test_mex('data.bin', n_read, n_skip);
toc
And this is fread_test_mex.c:
#include <stdio.h>
#include <mex.h>
void mexFunction(int nlhs, mxArray *plhs[],
                 int nrhs, const mxArray *prhs[])
{
    // No testing of inputs...
    // inputs = 'data.bin', 1, 99
    char* fname = mxArrayToString(prhs[0]);
    int n_read = mxGetScalar(prhs[1]);
    int n_skip = mxGetScalar(prhs[2]);

    FILE* fi = fopen(fname, "rb");
    fseek(fi, 0L, SEEK_END);
    long sz = ftell(fi);   // long matches ftell's return type
    sz /= n_read + n_skip;

    plhs[0] = mxCreateNumericMatrix(1, sz, mxDOUBLE_CLASS, mxREAL);
    double* data = mxGetPr(plhs[0]);

    fseek(fi, 0L, SEEK_SET);
    char buffer[1];
    for(int ind = 0; ind < sz; ++ind) {   // start at 0 so the first value is read too
        fread(buffer, 1, n_read, fi);
        data[ind] = buffer[0];
        fseek(fi, n_skip, SEEK_CUR);
    }
    fclose(fi);
}
I see this:
Elapsed time is 6.785304 seconds.
Building with 'Xcode with Clang'.
MEX completed successfully.
Elapsed time is 1.376540 seconds.
That is, reading the data is 5x as fast with a C MEX-file. And that time includes loading the MEX-file into memory. A second run is a bit faster (1.14 s) because the MEX-file is already loaded.
In the MATLAB code, if I instead initialize data = []; and extend the array on every read, like the OP does:
tmp = fread(fi, n_read, 'int8');
data = [data, tmp];
then the execution time for that loop rises to 159 s, with 92.0% of the time spent in the data = [data, tmp] line. Preallocating really is important!
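A minimal sketch of the effect in isolation (sizes are illustrative, not the OP's):
n = 1e5;
tic; x = [];          for k = 1:n, x = [x, k]; end; toc   % grows on every pass
tic; y = zeros(1, n); for k = 1:n, y(k) = k;   end; toc   % preallocated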

Efficient import of semi-structured text

I have multiple text files saved from a Tekscan pressure mapping system. I am trying to find the most efficient method for importing the multiple comma-delimited matrices into one 3-D matrix of type uint8. I have developed a solution which makes repeated calls to the MATLAB function dlmread. Unfortunately, it takes roughly 1.5 min to import the data. I have included the code below.
This code makes calls to helper functions I wrote (metaextract, linecount and framecount) which I have not included, as they aren't truly relevant to the question at hand.
Here are two links to samples of the files I am using.
The first is a shorter file with 90 samples
The second is a longer file with 3458 samples
Any help would be appreciated
function pressureData = tekscanimport
% Import TekScan data from .asf file to 3d matrix of type double.
[id,path] = uigetfile('*.asf'); % User input for .asf file
if path == 0 % uigetfile returns zero on cancel
    error('You must select a file to continue')
end
path = strcat(path,id); % Concatenate path and id to full path
% function calls
pressureData.metaData = metaextract(path);
nLines = linecount(path);          % Find number of lines in file
nFrames = framecount(path,nLines); % Find number of frames
rowStart = 25; % Default starting row to read from Tekscan .asf file
rowEnd = rowStart + 41; % Frames are 42 rows long
colStart = 0;  % Default starting col to read from Tekscan .asf file
colEnd = 47;   % Frames are 48 columns wide
pressureData.frames = zeros([42,48,nFrames],'uint8'); % Preallocate for speed
f = waitbar(0,'1','Name','loading Data...',...
    'CreateCancelBtn','setappdata(gcbf,''canceling'',1)');
setappdata(f,'canceling',0);
for i = 1:nFrames % Loop through file, skipping frame metadata
    if getappdata(f,'canceling')
        break
    end
    waitbar(i/nFrames,f,sprintf('Loaded %.2f%%', i/nFrames*100));
    % Make repeated calls to dlmread
    pressureData.frames(:,:,i) = dlmread(path,',',[rowStart,colStart,rowEnd,colEnd]);
    rowStart = rowStart + 44;
    rowEnd = rowStart + 41;
end
delete(f)
end
I gave it a try. This code parses your big file in 3.6 seconds on my PC. The trick is to use sscanf instead of the str2double and str2num functions.
clear all; tic
fid = fopen('tekscanlarge.txt','rt');
% read the header, stop at the first frame
header = '';
l = fgetl(fid);
while length(l)>5 && ~strcmp(l(1:5),'Frame')
    header = [header, l, sprintf('\n')];
    l = fgetl(fid);
    if length(l)<5, l(end+1:5)=' '; end
end
% read all remaining data at once
dat = fread(fid,inf,'*char');
fclose(fid);
% allocate space
res = zeros([48,42,3458],'uint8');
% get all line endings
LE = [0,regexp(dat','\n')];
i = 1;
for ct = 2:length(LE)-1 % go line by line
    L = dat(LE(ct-1)+1:LE(ct)-1);
    if isempty(L), continue; end
    if all(L(1:5)==['Frame']')
        fr = sscanf(L(7:end),'%u');
        i = 1;
        continue;
    end
    % sscanf can only handle a row-char with space separation.
    res(:,i,fr) = uint8(sscanf(strrep(L',',',' '),'%u'));
    i = i+1;
end
toc
Does anyone know of a faster way to convert than sscanf? The script spends the majority of its time in this function (2.17 seconds). For a 13.1 MB dataset I find that very slow compared to the speed of memory.
I found a way to do it in 0.2 seconds that might be useful for others as well.
This MEX-file scans through a list of char values for numbers and reports them back. Save it as mexscan.c and run mex mexscan.c.
#include "mex.h"
/* The computational routine */
void calc(unsigned char *in, unsigned char *out, long Sout, long Sin)
{
long ct = 0;
int newnumber=0;
for (int i=0;i<Sin;i+=2){
if (in[i]>=48 && in[i]<=57) { //it is a number
out[ct]=out[ct]*10+in[i]-48;
newnumber=1;
} else { //it is not a number
if (newnumber==1){
ct++;
if (ct>Sout){return;}
}
newnumber=0;
}
}
}
/* The gateway function */
void mexFunction( int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
unsigned char *in; /* input vector */
long Sout; /* input size of output vector */
long Sin; /* size of input vector */
unsigned char *out; /* output vector*/
/* check for proper number of arguments */
if(nrhs!=2) {
mexErrMsgIdAndTxt("MyToolbox:arrayProduct:nrhs","two input required.");
}
if(nlhs!=1) {
mexErrMsgIdAndTxt("MyToolbox:arrayProduct:nlhs","One output required.");
}
/* make sure the first input argument is type char */
if(!mxIsClass(prhs[0], "char")) {
mexErrMsgIdAndTxt("MyToolbox:arrayProduct:notDouble","Input matrix must be type char.");
}
/* make sure the second input argument is type uint32 */
if(!mxIsClass(prhs[0], "char")) {
mexErrMsgIdAndTxt("MyToolbox:arrayProduct:notDouble","Input matrix must be type char.");
}
/* get dimensions of the input matrix */
Sin = mxGetM(prhs[0])*2;
/* create a pointer to the real data in the input matrix */
in = (unsigned char *) mxGetPr(prhs[0]);
Sout = mxGetScalar(prhs[1]);
/* create the output matrix */
plhs[0] = mxCreateNumericMatrix(1,Sout,mxUINT8_CLASS,0);
/* get a pointer to the real data in the output matrix */
out = (unsigned char *) mxGetPr(plhs[0]);
/* call the computational routine */
calc(in,out,Sout,Sin);
}
Now this script runs in 0.2 seconds and returns the same result as the previous script.
clear all; tic
fid = fopen('tekscanlarge.txt','rt');
% read the header, stop at the first frame
header = '';
l = fgetl(fid);
while length(l)>5 && ~strcmp(l(1:5),'Frame')
    header = [header, l, sprintf('\n')];
    l = fgetl(fid);
    if length(l)<5, l(end+1:5)=' '; end
end
% read all remaining data at once
dat = fread(fid,inf,'*char');
fclose(fid);
S = [48,42,3458];
d = mexscan(dat, uint32(prod(S)+3458));
d(1:prod(S(1:2))+1:end) = []; % remove frame numbers
d = reshape(d, S);
toc
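To verify the claim that both scripts agree, one could save res from the sscanf-based script and compare; a sketch (note both scripts begin with clear all, so save the result to disk first):
% Sketch: res.mat holds the res array saved from the first script.
load('res.mat','res');
isequal(d, res)   % expected: true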

Counting frequency in a cell of char in MATLAB: fast code?

I have a 1x2 cell A in MATLAB. A{i} is a cell of dimension 30494866x1 for i=1,2, and A{i}{j} is a 1x21 char for i=1,2 and j=1,...,30494866.
For example, here is A{2}(1:3):
'116374117927631468606'
'112188647432305746617'
'116374117927631468606'
I want to count how many times each 1x21 char in A{2} is repeated. For example, just considering A{2}(1:3), I want to get
'116374117927631468606' 2
'112188647432305746617' 1
What I am doing at the moment is
a = unique(A{2},'stable');
b = cellfun(@(x) sum(ismember(A{2},x)), a);
However this is incredibly slow (it has been running since yesterday). Do you have any suggestions on how I can speed up the code?
Since you want to know how many times each 21-char string is used:
1) sort the cell
2) count how many times each string is used in a for loop.
Your code is O(n^2), so it's very slow. This should take less than a minute.
Based on your code:
B = sort(A{2});
U = unique(B); % unique output is already sorted
C = zeros(numel(U),1);
cnt = 1;
for j = 1:numel(B)
    if strcmp(U(cnt),B(j)) == 1
        C(cnt) = C(cnt)+1;
    else
        cnt = cnt + 1;
        if cnt <= numel(U)
            C(cnt) = C(cnt)+1;
        end
    end
end
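To present the counts next to the strings, as in the desired output, a small addition (U and C as computed above):
result = [U, num2cell(C)];  % one row per unique string: {string, count}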
You can do this with the standard unique- accumarray couple:
data = {'116374117927631468606'
        '112188647432305746617'
        '116374117927631468606'};
[uu, ~, ww] = unique(data, 'stable');
count = accumarray(ww, 1);
result = [uu, num2cell(count)];
Or, a little more memory-efficient:
data = {'116374117927631468606'
        '112188647432305746617'
        '116374117927631468606'};
[~, vv, ww] = unique(data, 'stable');
count = accumarray(ww, 1);
result = [data(vv), num2cell(count)];
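For the three-row example, either variant yields a 2x2 cell (display format varies across MATLAB versions):
result =
    '116374117927631468606'    [2]
    '112188647432305746617'    [1]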

MEX files and memory management

I have a MEX function where the output variable has the same name as the input variable, but it changes size as a result of the operations in the MEX code. For instance, I have something like:
A = Function(A), where the input A is a 100x1 vector (much, much larger in my simulation) and the output A is a 50x1 vector. I want to understand how memory is managed in this situation. After the operation is finished, does A now occupy only 50x1 worth of space, with the rest free to allocate to other variables?
Thanks!
Siddharth
That is correct: the data buffer for the original A is destroyed by MATLAB and a new buffer is created (the same mxArray structure address is reused, presumably by copying the new structure onto the original after deallocating the original array's data buffer). This assumes you are not writing to prhs[i] in your MEX file!
You can see this with format debug. You will observe that the output mxArray has the same address, but its data buffer has a different address, so MATLAB has clearly reallocated the output array. This suggests that the original buffer is deallocated, or queued to be deallocated.
Starting, for a change, with the output of a MEX-file testMEX.mexw64 that takes the first half of the input array's first row and copies it into a new array:
>> format debug
>> A = rand(1,8)
A =
Structure address = efbb890
m = 1
n = 8
pr = 77bb6c40
pi = 0
0.2581 0.4087 0.5949 0.2622 0.6028 0.7112 0.2217 0.1174
>> A = testMEX(A)
A =
Structure address = efbb890
m = 1
n = 4
pr = 77c80380
pi = 0
0.2581 0.4087 0.5949 0.2622
Note that pr is different, meaning MATLAB has created a new data buffer. However, the mxArray "Structure address" is the same. So, at minimum, the old data buffer will be deallocated. Whether the original mxArray structure is simply mutated, or a new mxArray is created and copied over it, is another question (see below).
Edit: The following is some evidence to suggest that an entirely new mxArray is created and it is copied onto the old mxArray
Add the following two lines to the MEX function:
mexPrintf("prhs[0] = %X, mxGetPr = %X, value = %lf\n",
prhs[0], mxGetPr(prhs[0]), *mxGetPr(prhs[0]));
mexPrintf("plhs[0] = %X, mxGetPr = %X, value = %lf\n",
plhs[0], mxGetPr(plhs[0]), *mxGetPr(plhs[0]));
The result is:
prhs[0] = EFBB890, mxGetPr = 6546D840, value = 0.258065
plhs[0] = EFA2DA0, mxGetPr = 77B65660, value = 0.258065
Clearly there is a temporary mxArray at EFA2DA0 containing the output (plhs[0]), and this mxArray header/structure is entirely copied onto the old mxArray structure (the one as A in the base MATLAB workspace). Before this copy happens, MATLAB surely deallocates the data buffer at 6546D840.
testMEX.cpp
#include "mex.h"
void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
mxAssert(nrhs == 1 && mxGetM(prhs[0]) == 1, "Input must be a row vector.");
double *A = mxGetPr(prhs[0]);
size_t cols = mxGetN(prhs[0]);
size_t newCols = cols / 2;
plhs[0] = mxCreateDoubleMatrix(1, newCols, mxREAL);
for (int i = 0; i < newCols; ++i)
mxGetPr(plhs[0])[i] = A[i];
mexPrintf("prhs[0] = %X, mxGetPr = %X, value = %lf\n",
prhs[0], mxGetPr(prhs[0]), *mxGetPr(prhs[0]));
mexPrintf("plhs[0] = %X, mxGetPr = %X, value = %lf\n",
plhs[0], mxGetPr(plhs[0]), *mxGetPr(plhs[0]));
}
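From the MATLAB side, one way to confirm the original question's concern is to compare whos before and after the call; a sketch, assuming testMEX was compiled as above:
A = rand(1, 8);
w1 = whos('A'); w1.bytes   % 64 bytes (8 doubles)
A = testMEX(A);
w2 = whos('A'); w2.bytes   % 32 bytes (4 doubles): only the smaller buffer remains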

Array of structures importing - memory preallocation problem - MATLAB

I have a few .mat files, each of which contains an array of structures (of unknown length) called DATA. I want to import all these structures into a single array, but I don't want to use this code:
FileNames = strcat('file',num2str((1:N)''),'.mat');
DATATemp = [];
for int = 1:length(FileNames)
    load(FileNames(int,:));
    DATATemp = [DATATemp DATA];
end
DATA = DATATemp;
because it does not preallocate the memory for the array.
Are there any clever ways of doing that?
If the total length is short enough that you can over-allocate memory, you can do something like this: pick an array size that is way larger than what you will ever see, and then trim it back down after you are done.
FileNames = strcat('file',num2str((1:N)''),'.mat');
DATATemp = zeros(1e6,1);
idx = 1;
for int = 1:length(FileNames)
    load(FileNames(int,:));
    idx_end = idx + length(DATA) - 1;
    DATATemp(idx:idx_end) = DATA;
    idx = idx_end + 1;
end
DATA = DATATemp(1:idx_end);
However, if you are talking about a LOT of data, or just want to cover all your bases, a more rigorous solution is to allocate in chunks:
FileNames = strcat('file',num2str((1:N)''),'.mat');
CHUNK_SIZE = 1e6;
DATATemp = zeros(CHUNK_SIZE,1);
idx = 1;
for int = 1:length(FileNames)
    load(FileNames(int,:));
    idx_end = idx + length(DATA) - 1;
    if idx_end > length(DATATemp)
        DATATemp = [DATATemp; zeros(CHUNK_SIZE,1)]; % grow by one chunk
    end
    DATATemp(idx:idx_end) = DATA;
    idx = idx_end + 1;
end
DATA = DATATemp(1:idx_end);
Just make sure that your CHUNK_SIZE is significantly larger than the size of a typical individual file. I picked 1e6 here; that's what I would pick if I were loading ~20 files with an average size of 1e5 each. That way, although I'm still occasionally allocating more space, it happens much less often. This may not be very clever, but I hope it helps.
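One caveat: zeros preallocates a numeric array, while the question's DATA is an array of structures. For that case a template struct can be replicated instead; a sketch (the field names are hypothetical, match them to the real DATA):
template = struct('field1', [], 'field2', []);  % hypothetical fields
DATATemp = repmat(template, 1, CHUNK_SIZE);     % preallocated struct array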
Also note that loading files from a network drive will slow things down immensely.