I have some pretty massive data files (256 channels, on the order of 75-100 million samples = ~40-50 GB or so per file) in int16 format. It is written in flat binary format, so the structure is something like: CH1S1,CH2S1,CH3S1 ... CH256S1,CH1S2,CH2S2,...
I need to read in each channel separately, filter and offset correct it, then save. My current bottleneck is loading each channel, which takes about 7-8 minutes... scale that up 256 times, and I'm looking at nearly 30 hours just to load the data! I am trying to intelligently use fread, to skip bytes as I read each channel; I have the following code in a loop over all 256 channels to do this:
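offset = i - 1; % zero-based index of the channel to read
fseek(fid,offset*2,'bof'); % jump to this channel's first sample (2 bytes per int16)
dat = fread(fid,[1,nSampsTotal],'*int16',(nChan-1)*2); % read one sample, then skip the other channels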
Reading around, this is typically the fastest way to load parts of a large binary file, but is the file simply too large to do this any faster?
I'm not loading that much data... the test file I'm working with is 37GB, for one of the 256 channels, I'm only loading 149MB for the entire trace... maybe the 'skip' functionality of fread is suboptimal?
System details: MATLAB 2017a, Windows 7, 64bit, 32GB RAM

@CrisLuengo's idea was much faster: read the data in chunks, then split each chunk out to separate channel files to save RAM.
Here is some code for just the loading part which is fast, less than 1 minute:
% fake raw data
disp('building... ');
nChan = 256;
nSampsTotal = 10e6;
tic; DATA = rand(nChan,nSampsTotal); toc;
fid = fopen('rawData.dat','w');
disp('writing flat binary file... ');
tic; fwrite(fid,DATA(:),'int16'); toc;
fclose(fid); % flush and close before reopening for reading
% compute the number of samples and chunks
chunkSize = 1e6;
nChunksTotal = ceil(nSampsTotal/chunkSize);
%% load by chunks
t1 = tic;
fid = fopen('rawData.dat','r');
dat = zeros(nChan,chunkSize,'int16');
chunkCnt = 1;
while 1
    if chunkCnt <= nChunksTotal
        % load the data
        fprintf('Chunk %02d/%02d: loading... ',chunkCnt,nChunksTotal);
        dat = fread(fid,[nChan,chunkSize],'*int16');
    else
        break;
    end
    chunkCnt = chunkCnt + 1;
end
fclose(fid);
t = toc(t1); fprintf('Total time: %4.2f secs.\n\n\n',t);
% Total time: 55.07 secs.
On the other hand, loading by channel by skipping through the file takes about 20x longer, roughly 19 minutes:
%% load by channels (slow)
t1 = tic;
fid = fopen('rawData.dat','r');
dat = zeros(1,nSampsTotal);
for i = 1:nChan
    fprintf('Channel %03d/%03d: loading... ',i,nChan);
    offset = i-1;
    fseek(fid,offset*2,'bof'); % move to this channel's first sample
    dat = fread(fid,[1,nSampsTotal],'*int16',(nChan-1)*2);
end
fclose(fid);
t = toc(t1); fprintf('Total time: %4.2f secs.\n\n\n',t);
% Total time: 1133.48 secs.
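The timing code above only loads the chunks. The splitting step (writing each chunk out to per-channel files) could be sketched roughly as follows. This is a minimal sketch with small demo sizes; the file names rawDemo.dat and chanDemoNNN.dat are made up for illustration, and it assumes the system allows one open file handle per channel at the same time.

```matlab
% Sketch: split an interleaved int16 file into one file per channel.
% All file names and sizes here are hypothetical demo values.
nChan = 4; nSamps = 1000; chunkSize = 300;
raw = int16(randi([-1000 1000], nChan, nSamps));
inFile = fullfile(tempdir, 'rawDemo.dat');
fid = fopen(inFile, 'w'); fwrite(fid, raw, 'int16'); fclose(fid);

% open one output file per channel
outFiles = arrayfun(@(c) fullfile(tempdir, sprintf('chanDemo%03d.dat', c)), ...
    1:nChan, 'UniformOutput', false);
fout = cellfun(@(f) fopen(f, 'w'), outFiles);

fid = fopen(inFile, 'r');
while true
    dat = fread(fid, [nChan, chunkSize], '*int16'); % last chunk may be shorter
    if isempty(dat), break; end
    for c = 1:nChan
        fwrite(fout(c), dat(c, :), 'int16');        % append this chunk's samples
    end
end
fclose(fid);
status = arrayfun(@fclose, fout);                   % close all channel files
```

Each per-channel file can then be read with a single plain fread, with no skip argument, when it is time to filter and offset-correct that channel.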
I'd also like to thank OCDER on the Matlab forums for their help.


Matlab: fastest method of reading parts/sequences of a large binary file

I want to read parts from a large (ca. 11 GB) binary file. The currently working solution is to load the entire file ( raw_data ) with fread(), then crop out pieces of interest ( data ).
Question: Is there a faster method of reading small parts of a file (1-2% of the total, partially sequential reads), given something like a binary mask (i.e. a logical index of the specific bytes of interest), in Matlab? Specifics below.
Notes for my specific case:
data of interest (26+e6 bytes, or ca. 24 MB) is roughly 2% of raw_data (1.2e+10 bytes or ca. 11 GB)
each 600.000 bytes contain ca 6.500 byte reads, which can be broken down to roughly 1.200 read-skip cycles (such as 'read 10 bytes, skip 5000 bytes').
the read instructions of the total file can be broken down in ca 20.000 similar but (not exactly identical) read-skip cycles (i.e. ca. 20.000x1.200 read-skip cycles)
The file is read from a GPFS (parallel file system)
Excessive RAM, newest Matlab ver and all toolboxes are available for the task
My initial idea of an fread-fseek cycle (see pseudocode below) proved to be extraordinarily much slower than reading the whole file. Profiling revealed that fread() is the slowest part (it is called over a million times, which is probably obvious to the experts here).
Alternatives I considered: memmapfile() offers no feasible way to read multiple small parts, as far as I could find. The MappedTensor library might be the next thing I'd look into. Two related articles I found did not help.
%open file
fi = fopen('raw_data.bin','r'); %hypothetical file name
%example read-skip data
f_reads = [20 10 6 20 40]; %read this number of bytes
f_skips = [900 6000 40 300 600]; %skip these bytes after each read instruction
nbr_read_skip_cycles = numel(f_reads); %number of read-skip cycles in this example
data = []; %save the result here
fseek(fi,90000,'bof'); %skip initial bytes until first read
%read the file
for ind=1:nbr_read_skip_cycles
    tmp_data = fread(fi,f_reads(ind));
    data = [data; tmp_data]; %add newly read bytes to data variable
    fseek(fi,f_skips(ind),'cof'); %skip to next read position
end
fclose(fi);
FYI: To get an overview and for transparency, I've compiled some plots (below) of the first ca 6.500 read locations (of my actual data), which, after collapsing into fread-fseek pairs, can be summarized in 1.200 fread-fseek pairs.
I would do two things to speed up your code:
preallocate the data array.
write a C MEX-file to call fread and fseek.
This is a quick test I did to compare using fread and fseek from MATLAB or C:
%% Create large binary file
data = 1:10000000; % 80 MB
fi = fopen('data.bin', 'wb');
fwrite(fi, data, 'double');
fclose(fi);
n_read = 1;
n_skip = 99;
%% Read using MATLAB
tic
fi = fopen('data.bin', 'rb');
fseek(fi, 0, 'eof');
sz = ftell(fi);
sz = floor(sz / (n_read + n_skip));
data = zeros(1, sz);
fseek(fi, 0, 'bof');
for ind = 1:sz
    data(ind) = fread(fi, n_read, 'int8');
    fseek(fi, n_skip, 'cof');
end
fclose(fi);
toc
%% Read using C MEX-file
mex fread_test_mex.c
tic
data = fread_test_mex('data.bin', n_read, n_skip);
toc
And this is fread_test_mex.c:
#include <stdio.h>
#include <mex.h>
void mexFunction(int nlhs, mxArray *plhs[],
                 int nrhs, const mxArray *prhs[])
{
    // No testing of inputs...
    // inputs = 'data.bin', 1, 99
    char* fname = mxArrayToString(prhs[0]);
    int n_read = mxGetScalar(prhs[1]);
    int n_skip = mxGetScalar(prhs[2]);
    FILE* fi = fopen(fname, "rb");
    fseek(fi, 0L, SEEK_END);
    int sz = ftell(fi);
    sz /= n_read + n_skip;
    plhs[0] = mxCreateNumericMatrix(1, sz, mxDOUBLE_CLASS, mxREAL);
    double* data = mxGetPr(plhs[0]);
    fseek(fi, 0L, SEEK_SET);
    char buffer[1]; // holds a single byte, so this test only supports n_read == 1
    for(int ind = 0; ind < sz; ++ind) { // C arrays are zero-based
        fread(buffer, 1, n_read, fi);
        data[ind] = buffer[0];
        fseek(fi, n_skip, SEEK_CUR);
    }
    fclose(fi);
    mxFree(fname);
}
I see this:
Elapsed time is 6.785304 seconds.
Building with 'Xcode with Clang'.
MEX completed successfully.
Elapsed time is 1.376540 seconds.
That is, reading the data is 5x as fast with a C MEX-file. And that time includes loading the MEX-file into memory. A second run is a bit faster (1.14 s) because the MEX-file is already loaded.
In the MATLAB code, if I initialize data = []; and then extend the matrix every time I read like OP does:
tmp = fread(fi, n_read, 'int8');
data = [data, tmp];
then the execution time for that loop was 159 s, with 92.0% of the time spent in the data = [data, tmp] line. Preallocating really is important!
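A third option, not benchmarked in this answer, is to avoid per-read calls altogether: read the file in large contiguous blocks and pick out the bytes of interest in memory with a logical mask, as the question suggests. The sketch below assumes a single mask that repeats for every block; the file name, block size, and mask pattern are invented for the demo. With read-skip cycles that differ from block to block, the mask would have to be rebuilt per block.

```matlab
% Sketch: one sequential fread per block, then logical indexing in memory.
% File name, block size and mask are hypothetical demo values.
fname = fullfile(tempdir, 'maskDemo.bin');
bytes = uint8(mod(0:9999, 251));
fi = fopen(fname, 'w'); fwrite(fi, bytes, 'uint8'); fclose(fi);

blockSize = 1000;                              % bytes per fread call
mask = false(1, blockSize);                    % true marks a byte of interest
mask([11:20, 501:506]) = true;                 % e.g. "read 10, skip 480, read 6"

nBlocks = floor(numel(bytes) / blockSize);
data = zeros(1, nBlocks * nnz(mask), 'uint8'); % preallocate the result
fi = fopen(fname, 'r');
pos = 0;
for b = 1:nBlocks
    block = fread(fi, [1, blockSize], '*uint8');
    sel = block(mask);                         % keep only the masked bytes
    data(pos + (1:numel(sel))) = sel;
    pos = pos + numel(sel);
end
fclose(fi);
```

This trades extra bytes read for far fewer function calls, so whether it beats the fread-fseek loop depends on how sparse the reads are and on the file system.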

Processing a large dataset

My MATLAB program generates N=100 trajectories with T=10^8 time steps in each, i.e.
x = randn(10^8,100);
Ultimately, I want to process this data set and obtain an average autocorrelation of all trajectories:
y = mean(fft(x),2); % output size (10^8, 1)
Now since x is too big to store, my only viable option is to save it on the hard drive in small chunks of 10^6
x1 = randn(10^6, 100);
x2 = randn(10^6, 100);
and then obtain y by processing each trajectory n=1:100 individually and accumulating the result:
for n=1:100
    y = y + fft([x1(:,n); x2(:,n); ...; x100(:,n)]);
end
Is there a more elegant way of doing this? I have 100GB of RAM and a pool of 12 workers.
An easier way would be to generate your data once and then chop it into small pieces you save on disk, or, if possible, create the data on the workers themselves.
x = randn(10^8,100);
chunkLen = 1e6; %rows (time steps) per file
for ii=1:100
    rows = (ii-1)*chunkLen+1 : ii*chunkLen; %rows belonging to chunk ii
    tmp = x(rows,:); %all 100 trajectories for these time steps
    filename = sprintf('Dataset%i',ii); %create filename
    save(filename,'tmp','-v7.3'); %save file to disk in -v7.3 format (variable name passed as a string)
end
y = cell(100,1); %initialise output
parfor ii = 1:100
    filename = sprintf('Dataset%i',ii); %get the filenames back
    S = load(filename); %load into a struct; a bare load is not allowed inside parfor
    y{ii} = mean(fft(S.tmp),2); % I called it tmp when saving, so it is the field S.tmp here
end
Now you can accumulate the results from the cell y in the desired manner. You can of course play around with the number of files you create, as fewer files will be processed faster due to the overhead of parfor.
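One observation that is not part of the answer above: because the FFT is linear, mean(fft(x),2) equals fft(mean(x,2)). So it may be enough to average across trajectories chunk by chunk, assemble the single averaged column, and run one FFT on it. The small demo below checks this on toy sizes; T, N, and chunkLen are illustrative values only, and in practice each tmp would be loaded from disk.

```matlab
% Sketch: average across trajectories per chunk, then one FFT at the end.
rng(0);
T = 1200; N = 5; chunkLen = 300;       % demo sizes, not the real 10^8 x 100
x = randn(T, N);
nChunks = T / chunkLen;
m = zeros(T, 1);                       % averaged trajectory, built chunk by chunk
for k = 1:nChunks
    rows = (k-1)*chunkLen + (1:chunkLen);
    tmp = x(rows, :);                  % stands in for a chunk loaded from disk
    m(rows) = mean(tmp, 2);
end
y = fft(m);                            % single FFT of the averaged column
yRef = mean(fft(x), 2);                % reference computed on the full matrix
```

For the sizes in the question the averaged column is 10^8 doubles, about 800 MB, which fits comfortably in the stated 100GB of RAM.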

Reading labview binary files in Matlab?

I have large .bin files (10GB-60GB) created by Labview software, the .bin files represent the output of two sensors used from experiments that I have done.
The problem I have is importing the data into Matlab. The only way I have achieved this so far is by converting the .bin files to .txt files in the Labview software and then importing the data into MATLAB using the following code:
Nlines = 1e6; % set number of lines to sample per cycle
sample_rate = (1); %sample rate
DECE= 1000;% decimation factor
TIME = (0:sample_rate:sample_rate*((Nlines)-1));%first instance of time vector
format = '%f\t%f';
fid = fopen('H:\PhD backup\Data/ONK_PP260_G_text.txt');
while(~feof(fid))
    C = textscan(fid, format, Nlines, 'CollectOutput', true);
    d = C{1}; % immediately clear C at this point you need the memory!
    clearvars C ;
    TIME = ((TIME(end)+sample_rate):sample_rate:(sample_rate*(size(d,1)))+(TIME(end)));%shift Time along
    plot((TIME(1:DECE:end)),(d(1:DECE:end,:)))%plot and decimate
    hold on;
    clearvars d;
end
fclose(fid);
The basic idea behind my code is to conserve RAM by reading Nlines of data from the .txt file on disk into the Matlab variable C in RAM, plotting the data, then clearing it. This process occurs in a loop, so the data is plotted in chunks until the end of the .txt file is reached.
I want to read the .bin file directly into MATLAB rather than converting it to .txt first because it takes hours for the conversion to complete and I have a lot of data. Here are some examples of my data but in manageable sizes:
Here is a description of the binary data:
Someone has already written a Matlab script to import Labview .bin files, but their script will only work with very small files:
% LABVIEWLOAD Load Labview binary data into Matlab
% DATA = LABVIEWLOAD(FNAME,DIM); % Loads the Labview data in binary
% format from the file specified by FNAME, given the NUMBER of
% dimensions (not the actual dimensions of the data in the binary
% file) of dimensions specified by DIM.
% LABVIEWLOAD will repeatedly grab data until the end of the file.
% Labview arrays of the same number of dimensions can be repeatedly
% appended to the same binary file. Labview arrays of any dimensions
% can be read.
% DATA = LABVIEWLOAD(FNAME,DIM,PREC); % Loads the data with the specified
% precision, PREC.
% Note: This script assumes the default parameters were used in the
% Labview VI "Write to Binary File". Labview uses the Big Endian
% binary number format.
% Examples:
% D = labviewload('Data.bin',2); % Loads in Data.bin assuming it
% contains double precision data and two dimensions.
% D = labviewload('OtherData.bin',3,'int8'); % Loads in
% OtherData.bin assuming it contains 8 bit integer values or
% boolean values.
% Jeremiah Smith
% 4/8/10
% Last Edit: 5/6/10
function data = labviewload(fname,dim,varargin)
siz = [2^32 2^16 2^8 1]'; % Array dimension conversion table
% Precision Input
if nargin == 2
    prec = 'double';
elseif nargin == 3
    prec = varargin{1};
else
    error('Too many inputs.')
end
%% Initialize Values
fid = fopen(fname,'r','ieee-be'); % Open for reading and set to big-endian binary format
fsize = dir(fname); % File information
fsize = fsize.bytes; % Files size in bytes
%% Preallocation
rows = [];
columns = [];
I = 0;
while fsize ~= ftell(fid)
    dims = [];
    for i=1:1:dim
        temp = fread(fid,4);
        temp = sum(siz.*temp);
        dims = [dims,temp];
    end
    I = I + 1;
    % fseek(fid,prod(dims)*8,'cof'); % Skip the actual data
    temp = fread(fid,prod(dims),prec,0,'ieee-be'); % Skip the actual data (much faster for some reason)
end
fseek(fid,0,'bof'); % Reset the cursor
data = repmat({NaN*ones(dims)},I,1); % Preallocate space, assumes each section is the same
%% Load and parse data
for j=1:1:I
    dims = []; % Actual array dimensions
    for i=1:1:dim
        temp = fread(fid,4);
        temp = sum(siz.*temp);
        dims = [dims,temp];
    end
    clear temp i
    temp = fread(fid,prod(dims),prec,0,'ieee-be'); % 11 is the values per row,
    % double is the data type, 0 is the bytes to skip, and
    % ieee-be specified big endian binary data
    %% Reshape the data into the correct array configuration
    if dim == 1
        temp = reshape(temp,1,dims);
    else
        evalfunc = 'temp = reshape(temp';
        for i=1:1:dim
            evalfunc = [evalfunc ',' num2str(dims(dim-i+1))];
        end
        if dim ~= 2
            eval([evalfunc ');'])
        else
            eval([evalfunc ')'';'])
        end
    end
    data{j} = temp; % Save the data
end
fclose(fid); % Close the file
The code has the following error message, when you try to process even relatively small .bin files:
Error using ones
Maximum variable size allowed by the program is exceeded.
Error in labviewload (line 65)
data = repmat({NaN*ones(dims)},I,1); % Preallocate space, assumes each section is the same
Can you help me modify the code so that I can open large .bin files? Any help will be much appreciated.
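One possible direction, offered only as a sketch: read the dimension header once and then pull the payload in row chunks, so that only one chunk is in memory at a time, as the text-based loop above does. The layout assumed here is inferred from the script (each dimension stored as a 4-byte big-endian integer, followed by row-major double data) and may not match the actual files; the demo file name and array are invented.

```matlab
% Sketch: chunked read of a 2-D big-endian array with a [rows, cols] header.
% ASSUMPTION: header = two uint32 values, payload = row-major doubles.
fname = fullfile(tempdir, 'labviewDemo.bin');
A = reshape(1:40, 4, 10)';                   % hypothetical 10 x 4 test array
fid = fopen(fname, 'w', 'ieee-be');
fwrite(fid, size(A), 'uint32');              % header: rows, cols
fwrite(fid, A', 'double');                   % transpose so rows are contiguous
fclose(fid);

fid = fopen(fname, 'r', 'ieee-be');
dims = fread(fid, [1, 2], 'uint32');         % [rows, cols]
chunkRows = 3;                               % rows held in memory at once
rowsDone = 0;
colSum = zeros(1, dims(2));
while rowsDone < dims(1)
    n = min(chunkRows, dims(1) - rowsDone);
    chunk = fread(fid, [dims(2), n], 'double')'; % n x cols block
    colSum = colSum + sum(chunk, 1);         % placeholder for plot/decimate
    rowsDone = rowsDone + n;
end
fclose(fid);
```

The colSum line only stands in for whatever per-chunk work is needed, such as the decimated plot in the question.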

Reading images in a loop gets slower

I have to process a batch of images (around 30000) in a for loop. For this I read one image with every execution of the loop.
When reaching a certain index (the exact value varies but usually it is around 2000) the progress slows dramatically (factor 2 or 3). How can this be?
Here is a minimal code example that shows this behavior:
imgFolder = [uigetdir, '/'];
files = dir(fullfile(imgFolder, '*.tiff'));
filenames = sort_nat({files.name});
imshow(imread([imgFolder, '/', filenames{1}]))
roi = roipoly;
meansRGB = zeros(3,length(files));
tic; % start timing the first batch of dt images
for i = 1:size(files,1)
    img = imread([imgFolder, '/', filenames{i}]);
    % for j=1:3
    % a = regionprops(roi,img(:,:,j),'PixelValues');
    % meansRGB(j,i) = mean(a.PixelValues);
    % end
    % show remaining time
    dt = 500; % display remaining time every dt'th step
    if(mod(i,dt)) == 0
        elapsed = toc;
        remainingSeconds = elapsed*(length(files)-i)/dt;
        [hours, minutes, seconds] = sec2hms(remainingSeconds);
        str = sprintf('remaining time: %d:%d:%d', hours, minutes, round(seconds));
        disp(str);
        tic; % restart the timer for the next batch
    end
end
Note that all the important parts that I thought would consume the most time are already commented out and are not the reason for the decelerating loop. Also there is plenty of RAM and processor cycles left, this shouldn't be the problem.
Could it be that the first few hundred images fill up your RAM?
Any other accesses to memory would trigger swapping and writing to the hard drive, which is easily orders of magnitude slower...
Is there any chance that imread() is leaving file handles open? (Though I'd be rather surprised if that were the case?)
You could maybe try sticking an fclose('all') in the loop to check?

Fastest Matlab file reading?

My MATLAB program is reading a file about 7m lines long and wasting far too much time on I/O. I know that each line is formatted as two integers, but I don't know exactly how many characters they take up. str2num is deathly slow, what matlab function should I be using instead?
Catch: I have to operate on each line one at a time without storing the whole file in memory, so none of the commands that read entire matrices are on the table.
fid = fopen('file.txt');
tline = fgetl(fid);
while ischar(tline)
    nums = str2num(tline);
    %do stuff with nums
    tline = fgetl(fid);
end
fclose(fid);
Problem statement
This is a common struggle, and there is nothing like a test to answer. Here are my assumptions:
A well formatted ASCII file, containing two columns of numbers. No headers, no inconsistent lines etc.
The method must scale to reading files that are too large to be contained in memory, (although my patience is limited, so my test file is only 500,000 lines).
The actual operation (what the OP calls "do stuff with nums") must be performed one row at a time, cannot be vectorized.
With that in mind, the answers and comments seem to be encouraging efficiency in three areas:
reading the file in larger batches
performing the string to number conversion more efficiently (either via batching, or using better functions)
making the actual processing more efficient (which I have ruled out via rule 3, above).
I put together a quick script to test out the ingestion speed (and consistency of result) of 6 variations on these themes. The results are:
Initial code. 68.23 sec. 582582 check
Using sscanf, once per line. 27.20 sec. 582582 check
Using fscanf in large batches. 8.93 sec. 582582 check
Using textscan in large batches. 8.79 sec. 582582 check
Reading large batches into memory, then sscanf. 8.15 sec. 582582 check
Using java single line file reader and sscanf on single lines. 63.56 sec. 582582 check
Using java single item token scanner. 81.19 sec. 582582 check
Fully batched operations (non-compliant). 1.02 sec. 508680 check (violates rule 3)
More than half of the original time (68 -> 27 sec) was consumed by inefficiencies in the str2num call, which can be removed by switching to sscanf.
About another 2/3 of the remaining time (27 -> 8 sec) can be saved by using larger batches for both file reading and string-to-number conversions.
If we are willing to violate rule number three in the original post, another 7/8 of the time can be saved by switching to fully numeric processing. However, some algorithms do not lend themselves to this, so we leave it alone. (Note that the "check" value does not match for the last entry.)
Finally, in direct contradiction to a previous edit of mine within this response, no savings are available by switching to the available cached Java single-line readers. In fact that solution is 2 to 3 times slower than the comparable single-line result using native readers (63 vs. 27 seconds).
Sample code for all of the solutions described above is included below.
Sample code
%% Create a test file
fName = 'demo_file.txt';
fid = fopen(fName,'w');
for ixLoop = 1:5
    d = randi(1e6, 1e5,2);
    fprintf(fid, '%d, %d \n',d);
end
fclose(fid);
%% Initial code
CHECK = 0;
tic;
fid = fopen('demo_file.txt');
tline = fgetl(fid);
while ischar(tline)
    nums = str2num(tline);
    CHECK = round((CHECK + mean(nums) ) /2);
    tline = fgetl(fid);
end
fclose(fid);
t = toc;
fprintf(1,'Initial code. %3.2f sec. %d check \n', t, CHECK);
%% Using sscanf, once per line
CHECK = 0;
tic;
fid = fopen('demo_file.txt');
tline = fgetl(fid);
while ischar(tline)
    nums = sscanf(tline,'%d, %d');
    CHECK = round((CHECK + mean(nums) ) /2);
    tline = fgetl(fid);
end
fclose(fid);
t = toc;
fprintf(1,'Using sscanf, once per line. %3.2f sec. %d check \n', t, CHECK);
%% Using fscanf in large batches
CHECK = 0;
tic;
bufferSize = 1e4;
fid = fopen('demo_file.txt');
scannedData = reshape(fscanf(fid, '%d, %d', bufferSize),2,[])' ;
while ~isempty(scannedData)
    for ix = 1:size(scannedData,1)
        nums = scannedData(ix,:);
        CHECK = round((CHECK + mean(nums) ) /2);
    end
    scannedData = reshape(fscanf(fid, '%d, %d', bufferSize),2,[])' ;
end
fclose(fid);
t = toc;
fprintf(1,'Using fscanf in large batches. %3.2f sec. %d check \n', t, CHECK);
%% Using textscan in large batches
CHECK = 0;
tic;
bufferSize = 1e4;
fid = fopen('demo_file.txt');
scannedData = textscan(fid, '%d, %d \n', bufferSize) ;
while ~isempty(scannedData{1})
    for ix = 1:size(scannedData{1},1)
        nums = [scannedData{1}(ix) scannedData{2}(ix)];
        CHECK = round((CHECK + mean(nums) ) /2);
    end
    scannedData = textscan(fid, '%d, %d \n', bufferSize) ;
end
fclose(fid);
t = toc;
fprintf(1,'Using textscan in large batches. %3.2f sec. %d check \n', t, CHECK);
%% Reading in large batches into memory, incrementing to end-of-line, sscanf
CHECK = 0;
tic;
fid = fopen('demo_file.txt');
bufferSize = 1e4;
eol = sprintf('\n');
dataBatch = fread(fid,bufferSize,'uint8=>char')';
dataIncrement = fread(fid,1,'uint8=>char');
while ~isempty(dataIncrement) && (dataIncrement(end) ~= eol) && ~feof(fid)
    dataIncrement(end+1) = fread(fid,1,'uint8=>char'); %This can be slightly optimized
end
data = [dataBatch dataIncrement];
while ~isempty(data)
    scannedData = reshape(sscanf(data,'%d, %d'),2,[])';
    for ix = 1:size(scannedData,1)
        nums = scannedData(ix,:);
        CHECK = round((CHECK + mean(nums) ) /2);
    end
    dataBatch = fread(fid,bufferSize,'uint8=>char')';
    dataIncrement = fread(fid,1,'uint8=>char');
    while ~isempty(dataIncrement) && (dataIncrement(end) ~= eol) && ~feof(fid)
        dataIncrement(end+1) = fread(fid,1,'uint8=>char');%This can be slightly optimized
    end
    data = [dataBatch dataIncrement];
end
fclose(fid);
t = toc;
fprintf(1,'Reading large batches into memory, then sscanf. %3.2f sec. %d check \n', t, CHECK);
%% Using Java single line readers + sscanf
CHECK = 0;
tic;
bufferSize = 1e4;
reader = java.io.LineNumberReader(java.io.FileReader('demo_file.txt'),bufferSize );
tline = char(reader.readLine());
while ~isempty(tline)
    nums = sscanf(tline,'%d, %d');
    CHECK = round((CHECK + mean(nums) ) /2);
    tline = char(reader.readLine());
end
reader.close();
t = toc;
fprintf(1,'Using java single line file reader and sscanf on single lines. %3.2f sec. %d check \n', t, CHECK);
%% Using Java scanner for file reading and string conversion
CHECK = 0;
tic;
jFile = java.io.File('demo_file.txt');
scanner = java.util.Scanner(jFile);
while scanner.hasNextInt()
    nums = [scanner.nextInt() scanner.nextInt()];
    CHECK = round((CHECK + mean(nums) ) /2);
end
scanner.close();
t = toc;
fprintf(1,'Using java single item token scanner. %3.2f sec. %d check \n', t, CHECK);
%% Reading in large batches into memory, vectorized operations (non-compliant solution)
CHECK = 0;
tic;
fid = fopen('demo_file.txt');
bufferSize = 1e4;
eol = sprintf('\n');
dataBatch = fread(fid,bufferSize,'uint8=>char')';
dataIncrement = fread(fid,1,'uint8=>char');
while ~isempty(dataIncrement) && (dataIncrement(end) ~= eol) && ~feof(fid)
    dataIncrement(end+1) = fread(fid,1,'uint8=>char'); %This can be slightly optimized
end
data = [dataBatch dataIncrement];
while ~isempty(data)
    scannedData = reshape(sscanf(data,'%d, %d'),2,[])';
    CHECK = round((CHECK + mean(scannedData(:)) ) /2);
    dataBatch = fread(fid,bufferSize,'uint8=>char')';
    dataIncrement = fread(fid,1,'uint8=>char');
    while ~isempty(dataIncrement) && (dataIncrement(end) ~= eol) && ~feof(fid)
        dataIncrement(end+1) = fread(fid,1,'uint8=>char');%This can be slightly optimized
    end
    data = [dataBatch dataIncrement];
end
fclose(fid);
t = toc;
fprintf(1,'Fully batched operations. %3.2f sec. %d check \n', t, CHECK);
(original answer)
To expand on the point made by Ben ... your bottleneck will always be file I/O if you are reading these files line by line.
I understand that sometimes you cannot fit a whole file into memory. I typically read in a large batch of characters (1e5, 1e6 or thereabouts, depending on the memory of your system). Then I either read additional single characters (or back off single characters) to get a round number of lines, and then run your string parsing (e.g. sscanf).
Then if you want you can process the resulting large matrix one row at a time, before repeating the process until you read the end of the file.
It's a little bit tedious, but not that hard. I typically see 90% plus improvement in speed over single line readers.
(terrible idea using Java batched line readers removed in shame)
I have had good results (speedwise) using memmapfile(). This minimises the amount of memory data copying, and makes use of the kernel's IO buffering. You need enough free address space (though not actual free memory) to map the entire file, and enough free memory to hold the output variable (obviously!)
The example code below reads a text file into a two-column matrix data of int32 type.
fname = 'file.txt';
fstats = dir(fname);
% Map the file as one long character string
m = memmapfile(fname, 'Format', {'uint8' [ 1 fstats.bytes] 'asUint8'});
textdata = char(m.Data(1).asUint8);
% Use textscan() to parse the string and convert to an int32 matrix
data = textscan(textdata, '%d %d', 'CollectOutput', 1);
data = data{:};
% Tidy up!
clear m textdata; % release the memory map and the temporary character copy
You may need to fiddle with the parameters to textscan() to get exactly what you want - see the online docs.
Even if you can't fit the whole file in memory, you should read a large batch using the matrix read functions.
Maybe you can even use vector operations for some of the data processing, which would speed things along further.
I have found that MATLAB reads csv files significantly faster than text files, so if it's possible to convert your text file to csv using some other software, it may significantly speed up Matlab's operations.