There are some good posts on here (such as this one) on how to make a circular buffer in MATLAB. However, from looking at them, I do not believe they fit my application: what I am seeking is a circular buffer solution in MATLAB that does NOT involve any copying of old data.
To use a simple example, let us say that I am processing 50 samples at a time, and I read in 10 samples each iteration. I would first run through 5 iterations, fill up my buffer, and in the end, process my 50 samples. So my buffer will be
[B1 B2 B3 B4 B5]
, where each 'B' is a block of 10 samples.
Now, I read in the next 10 samples, call them B6. I want my buffer to now look like:
[B2 B3 B4 B5 B6]
The catch is this: I do NOT want to copy the old data B2, B3, B4, B5 every time, because it becomes expensive in time (I have very large data sets).
I am wondering if there is a way to do this without any copying of 'old' data. Thank you.
One way to quickly implement a circular buffer is to use modulus to circle back around to the front. This slightly changes the order of the data from what you specified, but may be faster and is equivalent if you simply replace the oldest data with the newest. So instead of
[B2 B3 B4 B5 B6]
You get
[B6 B2 B3 B4 B5]
By using code like this:
bufferSize = 5;
data = nan(1, bufferSize);   % pre-allocate the buffer
for ind = 1:bufferSize+2
    data(mod(ind-1, bufferSize)+1) = ind   % no semicolon, so the buffer prints each pass
end
and this works for arbitrarily sized data.
If you're not familiar with modulo, the mod function effectively returns the remainder of a division operation. So mod(3,5) returns 3, mod(6,5) returns 1, mod(7,5) returns 2, and so on, until you reach mod(10,5), which equals 0 again. This allows us to 'wrap around' the vector by moving back to the start every time we reach the end. The +1 and -1 in the code are there because MATLAB starts its vector indices at 1 rather than 0, so to get the math to work out you have to subtract 1 before doing the mod and then add it back in to get the right index. The result is that when you try to write the 6th element to your vector, it goes and writes it to the 1st position instead.
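For instance, here is a quick illustration of the wrap-around index arithmetic with the five-element buffer from above:
bufferSize = 5;
mod(5-1, bufferSize) + 1   % ans = 5: the 5th write goes to slot 5
mod(6-1, bufferSize) + 1   % ans = 1: the 6th write wraps to slot 1
mod(7-1, bufferSize) + 1   % ans = 2: the 7th write goes to slot 2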
My idea would be to use a cell array with 5 entries and a variable to index the sub-array that should be overwritten in the next step, e.g. something like
a = {ones(10),2*ones(10),3*ones(10),4*ones(10),5*ones(10)};
index = 1;
In the next step, you could then write into the sub-array:
a{index} = 6*ones(10);
and increase the index like
index = index+1
Obviously, you need some sort of wrap-around check:
if (index > 5)
    index = 1;
end
Would that work for you?
EDIT: One more thing to look at is the ordering of the entries: the buffer contents will always be rotated by some offset relative to the logical order. Depending on how you go on to use the data, you can compensate for this rotation using the variable index, as shown below.
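For example, a minimal sketch (assuming the cell array a and the variable index from above, where index points at the slot that will be overwritten next, i.e. the oldest block) of how to view the buffer in oldest-to-newest order without moving any of the blocks:
logicalOrder = mod((index : index+4) - 1, 5) + 1;  % slot indices, oldest to newest
orderedView  = a(logicalOrder);                    % copies only cell references, not the blocks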
EDIT2: I've got another idea: what about using classes in MATLAB? You could use a handle class to hold your data, so that the buffer only stores references to the data. This might make it a bit faster, depending on the data you hold (how large the datasets are, etc.) and how many shifts you'll have in your code. See e.g. here: Matlab -- handle objects
You could use a simple handle class:
classdef Foo < handle
    properties (SetAccess = public, GetAccess = public)
        x
    end
    methods
        function obj = Foo(x)
            % constructor (must match the class name, hence Foo, not foo)
            obj.x = x;
        end
    end
end
Store the data in it:
data = [1 2 3 4];
foo = Foo(data); % handle object
and then only store the object reference in the circular buffer. In the posted link, the answer shows that the assignment bar = foo doesn't copy the object but really only holds a reference:
bar = foo;    % copies the handle, not the data
foo.x = [3 4];
disp(bar.x)   % displays [3 4]
but as stated, I don't know whether that will be faster, because of the OOP overhead. It might be, depending on your data... And here is some more information about it: http://www.matlabtips.com/how-to-point-at-in-matlab/
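To make the idea concrete, here is a minimal sketch (the data sizes are made up) of a circular buffer of handle objects, using the Foo class above; shifting the buffer moves only the references, never the underlying arrays:
buf = {Foo(rand(1e6,1)), Foo(rand(1e6,1)), Foo(rand(1e6,1))};  % buffer of references
buf = [buf(2:end), {Foo(rand(1e6,1))}];  % drop oldest, append newest: only references move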
I just uploaded my solution for a fast circular buffer which does not copy old data:
http://www.mathworks.com/matlabcentral/fileexchange/47025-circvbuf-m
The main idea of this circular buffer is constant, fast performance, and the avoidance of copy operations when using the buffer in a program:
% create a circular vector buffer
bufferSz = 1000;
vectorLen= 7;
cvbuf = circVBuf(int64(bufferSz),int64(vectorLen));
% fill buffer with 99 vectors
vecs = zeros(99,vectorLen,'double');
cvbuf.append(vecs);
% loop over the most recently appended vectors of the circVBuf:
new = cvbuf.new;
lst = cvbuf.lst;
vec = zeros(vectorLen, 1);  % pre-allocate the output vector
for ix = new:lst
    vec(:) = cvbuf.raw(:,ix);
end
% or operate directly on the most recently appended vectors in the buffer (no copy => fast)
new = cvbuf.new;
lst = cvbuf.lst;
avg = mean(cvbuf.raw(3:7,new:lst));  % named avg to avoid shadowing the builtin mean
Check the screenshot to see that this circular buffer has advantages when the buffer is large but the data appended each time is small: the performance of circVBuf does NOT depend on the buffer size, in contrast to a simple copy buffer.
The double buffering guarantees a predictable append time, depending only on the amount of data to append, in any situation. In the future this class shall give you a choice of double buffering, yes or no; things will speed up if you do not need the guaranteed time.
I have 200 time points. For each time point there is an image of size 40*40 double that corresponds to it. For example, image 1 corresponds to time point 1, and image k corresponds to time point k (k = 1,2,...,200).
The time points are T = 1:200, with the images named as Image_T, thus Image_1, Image_2 etc.
I want to put all these 200 images together, so the final size is 40*40*200 double. The final array is like an fMRI image (fmri_szX = 40, fmri_szY = 40 and fmri_szT = 200). How can I achieve that?
Thanks!
Dynamic variables
Note that whilst this is possible, it's considered bad programming (see for instance here, or this blog post by Loren; even MathWorks tell you not to do this in their documentation). It would be much better to load your images directly into either a 3D array or a cell array, avoiding dynamic variable names. I just posted this for completeness; if you ever happen to have to use this solution, you should switch to a (cell) array immediately.
The gist of the linked articles, as to why eval is such a bad idea, is that MATLAB can no longer predict what the outcome of the operation will be. For instance, A=3*(2:4) is recognised by MATLAB to output a double array. If you eval stuff, MATLAB can no longer do this. MATLAB is an interpreted language, i.e. each line of code is read and then run, without compiling the entire code beforehand. This means that each time MATLAB encounters eval, it has to stop, evaluate the expression, check the output, store that, and continue. Most of the speed engines employed by MATLAB (JIT/MAGMA etc.) can't work without predicting the outcome of statements, and will therefore shut down while the eval is evaluated, rendering your code very slow.
Also there's a security aspect to the usage of eval. Consider the following:
var1 = 1;
var2 = 2;
var3 = 3;
varnames = {'var1', 'var2; disp(''GOTCHA''); %', 'var3'};
accumvar = [];
for k = 1:numel(varnames)
    vname = varnames{k};
    disp(['Reading from variable named ' vname]);
    eval(['accumvar(end+1) = ' vname ';']);
end
Now accumvar will contain the desired variable values, but the injected disp('GOTCHA') ran as well. Instead of a harmless disp, the injected string could just as well have contained e.g. system('rm -rf ~/*'), which would wipe your entire home directory without even telling you it's doing so.
Loop approach
for ii = 200:-1:1
    str = sprintf('Image_%d', ii);
    A(:, :, ii) = eval(str);   % each Image_T is a 40*40 double
end
This creates your matrix. Note that I let the for loop run backwards, so that A is initialised at its full size on the first iteration.
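An equivalent, more explicit alternative is to pre-allocate A up front (a sketch, assuming the 40*40 double images from the question):
A = zeros(40, 40, 200);   % full size known in advance
for ii = 1:200
    A(:, :, ii) = eval(sprintf('Image_%d', ii));
end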
Semi-vectorised approach
str = strsplit(sprintf('Image_%d ', 1:200), ' '); % Create all your names
str(end) = []; % Delete the last entry (empty)
% Problem: eval cannot handle cells, so loop anyway:
for ii = 200:-1:1
    A(:, :, ii) = eval(str{ii});
end
eval does not support cell arrays, so you cannot directly plug the cell array str in.
Dynamic file names
Despite having a similar title to the above, this assumes your file names (rather than your variable names) are structured, i.e. in the file browser and not in MATLAB. I'm assuming .jpg files here, but you can use any supported image extension. Also, be sure to have all images in a single folder and no additional images with that extension, or you will have to modify the dir() call to include only the desired images.
filenames = dir('*.jpg');
for ii = length(filenames):-1:1
    A(:,:,:,ii) = imread(filenames(ii).name);  % dir returns a struct array, so index .name
end
Images are usually read as m*n*3 arrays, where m*n is your image size in pixels and the 3 stems from the fact that they're read as RGB by imread. Therefore A is now a 4D matrix, structured as m*n*3*T, where the last index corresponds to the time of the image and the first three dimensions hold the image in RGB format.
Since you do not specify how you obtain your 40*40 double images, I have left A as a 4D matrix. It could be that you read them and then convert each RGB triplet to a single number (e.g. a uint16 interpretation), which would result in an m*n*1*T variable; you can reduce that to a 3D variable by calling A = squeeze(A(:,:,1,:));
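One more caveat with the dir approach: dir returns names in alphabetical order, so Image_10.jpg sorts before Image_2.jpg. If the time order matters, building the names explicitly avoids the problem (a sketch, assuming the Image_T naming from the question and a .jpg extension; the file names are hypothetical):
for ii = 200:-1:1
    A(:,:,:,ii) = imread(sprintf('Image_%d.jpg', ii));  % hypothetical file names
end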
More of a blue-skies question here: if I have some code like
A = [1,2,3,4,5,6]; %input data
B = sort(A); %step one
C = B(1,1) + 10; %step two
Is there a line of code I can use to remove "B" to save memory before doing something else with C?
clear B
This will remove the variable B from memory.
See the documentation here for more info.
There is no need to assign each result to a new variable. For example, you could write:
A = [1,2,3,4,5,6]; %input data
A = sort(A); %step one
A = A(1,1) + 10; %step two
Especially if A is large, it is much more efficient to write A = sort(A) than B = sort(A), because then sort can work in-place, avoiding the need to create a secondary array. The same is true for many other functions. Working in-place means that the cache can be used more effectively, speeding up operations. The reduced memory usage is also a plus for very large arrays, and in-place operations tend to avoid memory fragmentation.
In contrast, things like clear B tend to slow down the interpreter, as they make things more complicated for the JIT. Furthermore, as can be seen in the documentation,
On UNIX® systems, clear does not affect the amount of memory allocated to the MATLAB process.
That is, the variable is cleared from memory, but the memory itself is not returned to the system.
As an aside, as @obchardon said in a comment, your code can be further simplified by realizing that min does the same thing as keeping only the first value of the result of sort (but much more efficiently).
As an example, I've put three operations in a row that can work in-place, and used timeit to measure the execution time of two options: using a different variable every time and clearing them when no longer needed, or assigning into the same variable.
N = 1000;
A = rand(1,N);
disp(timeit(@()method1(A)))
disp(timeit(@()method2(A)))
function D = method1(A)
B = sort(A);
clear A
C = cumsum(B);
clear B
D = cumprod(C);
end
function A = method2(A)
A = sort(A);
A = cumsum(A);
A = cumprod(A);
end
Using MATLAB Online I see these values:
different variables + clear: 5.8806e-05 s
re-using same variable: 4.4185e-05 s
MATLAB Online is not the best platform for timing tests, as so many other things happen on the server at the same time, but it gives a good indication. I've run the test multiple times and seen similar values most of the time.
I have large binary files (2+ GB) that are arranged with a sync pattern (0xDEADBEEF) followed by a data block of a fixed size.
Example:
0xDE AD BE EF ... 96 bytes of data
0xDE AD BE EF ... 96 bytes of data
... repeat ...
I need to locate the offsets to the start of each packet. Ideally this would just be [1:packetSize:fileSize]. However, there is other data that can be interspersed (headers etc.), so I need to search the file for the sync pattern.
I am using the following code, which is based on findPattern2 by Loren from MathWorks, but modified a little to use a memory map.
function pattLoc = findPattern(fileName, bytePattern)
% Memory-map the file
m = memmapfile(fileName);
% Find candidate locations that match the first element of the pattern
pattLoc = find(m.Data == bytePattern(1));
% Remove start values that are too close to the end to possibly match
len = numel(bytePattern);
endVals = pattLoc + len - 1;
pattLoc(endVals > length(m.Data)) = [];
% Loop over the remaining elements of the sync pattern to check candidate validity
for pattval = 2:len
    % keep only the locations that still match at this offset
    locs = bytePattern(pattval) == m.Data(pattLoc + pattval - 1);
    pattLoc(~locs) = []; % delete false candidates from the indices
end
This works pretty well. However, I think there might be room for improvement. First, I know my patterns can't be closer together than packetSize (100 in this example) but may be farther apart; it seems like I should be able to use this information to speed up the search. Second, the initial search on line 5 uses find for numerical indexing instead of logical indexing, and this line takes almost twice as long as leaving it as a logical. However, I tried to rework this function using logical indexing only and failed miserably: the problem is keeping track of the nested indexing with logicals inside the loop without using more finds, or without checking more data than necessary.
So any help speeding this up would be appreciated. Below is some code which will create a simple sample binary file to work with if necessary.
function genSampleFile(numPackets)
pattern = hex2dec({'DE','AD','BE','EF'});
fileName = 'testFile.bin';
fid = fopen(fileName,'w+');
for f = 1:numPackets
    fwrite(fid, [pattern; uint8(rand(96,1)*255)], 'uint8');
end
fclose(fid);
Searching a file with 10000000 packets took the following:
>> genSampleFile(10000000); %Warning Makes 950+MB file
>> tic;pattLoc = findPattern(fileName, pattern);toc
Elapsed time is 4.608321 seconds.
You can get an immediate boost by using findstr, or better yet strfind instead of find:
pattLoc = strfind(m.Data, bytePattern)
This removes the need for any further looping. You just need to clean up a couple of things in the returned array of indices.
You want to remove things that are closer than 100 bytes to the end, not closer than 4 bytes to the end, so set len = 100, instead of len = length(bytePattern).
To filter out elements that are closer than 100 bytes from each other, use diff on the list of indices:
pattLoc([false, diff(pattLoc) < 100]) = [];
This should speed up your code by relying more on builtins, which are generally much more efficient than loops.
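Putting those pieces together, here is a minimal sketch of the whole search (the function name findPatternFast and the packetSize argument are my additions, not part of the original code; depending on your MATLAB version you may need to match the orientation and type of bytePattern to m.Data, a uint8 column):
function pattLoc = findPatternFast(fileName, bytePattern, packetSize)
m = memmapfile(fileName);
pattLoc = strfind(m.Data, bytePattern);                  % single builtin scan, as above
pattLoc = pattLoc(:)';                                   % normalize to a row vector
pattLoc(pattLoc > numel(m.Data) - packetSize + 1) = [];  % no room left for a full packet
pattLoc([false, diff(pattLoc) < packetSize]) = [];       % enforce minimum packet spacing
end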
If you must incrementally append data to arrays, it seems that using individual vectors of basic data types is orders of magnitude faster than an array of structs (with one vector element per record). Even trying to collect the individual vectors into a struct seems to double the time. The tests are:
N=5e4;
fprintf('\nstruct array (array of structs):\n')
clear x y;
y=struct( 'a',[], 'b',[], 'c',[], 'd',[] );
tic
for iIns = 1 : N
x.a=rand; x.b=rand; x.c=rand; x.d=rand;
y(end+1)=x;
end % for iIns
toc
fprintf('\nSeparate arrays of scalars:\n')
clear a b c d;
a=[]; b=[]; c=[]; d=[];
tic
for iIns = 1 : N
a(end+1) = rand;
b(end+1) = rand;
c(end+1) = rand;
d(end+1) = rand;
end % for iIns
toc
fprintf('\nA struct with arrays of scalars for fields:\n')
clear a b c d x y
x.a=[]; x.b=[]; x.c=[]; x.d=[];
tic
for iIns = 1:N
x.a(end+1)=rand;
x.b(end+1)=rand;
x.c(end+1)=rand;
x.d(end+1)=rand;
end % for iIns
toc
The results:
struct array (array of structs):
Elapsed time is 24.127274 seconds.
Separate arrays of scalars:
Elapsed time is 0.048190 seconds.
A struct with arrays of scalars for fields:
Elapsed time is 0.084624 seconds.
Even though collecting individual vectors of basic data types into a struct (the 3rd scenario above) imposes such a penalty, it may be preferable to simply using individual vectors (the 2nd scenario above) because the variables are better organized: your variable namespace isn't cluttered with many separate variables that are really conceptually grouped.
That's quite a significant penalty, however, to pay for such organization. I don't suppose there is a way to avoid this?
There are two ways to avoid this performance penalty: (1) pre-allocate, and (2) rethink your stance on "organizing" variables. I suggest both. Oh, and if you can, don't use arrays of structs where each field only uses scalars - if your application suddenly has to handle a couple of orders of magnitude more data, the memory overhead will force you to rewrite everything.
Pre-allocation
You often know how many elements your array will end up having. Thus, initialize your arrays as s = struct('a',NaN(1,N),'b',NaN(1,N)); If you don't know ahead of time how many entries there will be, but you can estimate an upper limit, initialize with the upper limit, and either remove the excess elements afterwards, or use functions (e.g. nanmean) that do not care if the array has a few extra NaNs at the end. If you truly know nothing about the final size (except that N will be large enough to matter), pre-allocate with a nice number (e.g. N=1337), and extend the array in chunks. MathWorks have sped up dynamic growing of numeric arrays in a recent release, but as you demonstrate in your answer, the optimization has not been applied to structs yet. Don't count on MathWorks' optimization team to fix your code.
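A minimal sketch of the upper-limit strategy described above (maxN and the rand-based stopping condition are stand-ins for your real bound and data source):
maxN = 1e4;                                    % assumed upper limit
s = struct('a', NaN(1,maxN), 'b', NaN(1,maxN));
k = 0;
while k < maxN && rand > 0.001                 % stand-in for 'more data available'
    k = k + 1;
    s.a(k) = rand;  s.b(k) = rand;             % write into pre-allocated slots
end
s.a = s.a(1:k);                                % trim the unused tail...
s.b = s.b(1:k);                                % ...or keep the NaNs and use e.g. nanmean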
Nice variables
Why worry about your variable space? As long as you use explicitVariableNames, your code remains readable and you will have an easy time picking out the right variable. But OK, let's say you want to clean up. The first way to keep the number of active variables low is to use clear or keep at strategic points in your code to make sure you only keep around what's needed. The second (assuming you want to optimize for performance) is to put contextually linked vectors into the same array: objectDimensions = [lengthOfObject, widthOfObject, heightOfObject]. This keeps everything in numeric arrays (which are fastest), and allows easy vectorization such as objectVolume = prod(objectDimensions,2);.
/aside: I should disclose that I used to use structures frequently for assembling results (so that I could return a lot of information in a single variable and have the field names be part of the documentation). I have since switched to object-oriented programming (usually handle objects), which not only collects related variables, but also the associated functionality, and which facilitates code re-use. I do take a performance hit, but the time it saves me coding more than makes up for it. Note that I do pre-allocate if at all possible (and if it's not just growing an array three times).
Example
Assume you have a function getDimensions that reads dimensions (length, height, width) of objects. However, sometimes, the object is 2D, sometimes it is 3D. Thus, you want to fill the following variables: twoD.length, twoD.width, threeD.length, threeD.width, threeD.height, ideally as arrays of structs, so that each element of a struct corresponds to an object. You do not know ahead of time how many objects there are, all you can do is poll the function thereAreMoreObjects, which returns true or false, until there are no more objects.
Here's how you can do this with reasonable efficiency and growing arrays by chunks:
%// preassign the temporary variable, and some others
chunkSize = 1000;
numObjects = 0;
idAndDimensions = zeros(chunkSize,4);
while thereAreMoreObjects()
objectId = getCurrentObjectId();
%// hi==-1 if it's flat
[len,wid,hi] = getObjectDimensions(objectId);
%// allocate more, if needed
numObjects = numObjects + 1;
if numObjects > size(idAndDimensions,1)
%// grow array
idAndDimensions(end+chunkSize,1) = 0;
end
idAndDimensions(numObjects,:) = [objectId, len, wid, hi];
end
%// throw away excess
idAndDimensions = idAndDimensions(1:numObjects,:);
%// split into 2D and 3D objects
isTwoD = idAndDimensions(:,end) == -1;
%// assign twoD struct
twoD = struct('id',    num2cell(idAndDimensions(isTwoD,1)), ...
              'length',num2cell(idAndDimensions(isTwoD,2)), ...
              'width', num2cell(idAndDimensions(isTwoD,3)));
%// assign threeD struct analogously
threeD = struct('id',    num2cell(idAndDimensions(~isTwoD,1)), ...
                'length',num2cell(idAndDimensions(~isTwoD,2)), ...
                'width', num2cell(idAndDimensions(~isTwoD,3)), ...
                'height',num2cell(idAndDimensions(~isTwoD,4)));
%// clean up - we need only the two structs
%// I use keep from the File Exchange instead of clearvars
clearvars -except twoD threeD
Is there a way to rewrite my code to make it faster?
for i = 2:length(ECG)
u(i) = max([a*abs(ECG(i)) b*u(i-1)]);
end;
My problem is the length of ECG.
You should pre-allocate u like this
>> u = zeros(size(ECG));
or possibly like this
>> u = NaN(size(ECG));
or maybe even like this
>> u = -Inf(size(ECG));
depending on what behaviour you want.
When you pre-allocate a vector, MATLAB knows how big the vector is going to be and reserves an appropriately sized block of memory.
If you don't pre-allocate, then MATLAB has no way of knowing how large the final vector is going to be. Initially it will allocate a short block of memory. If you run out of space in that block, then it has to find a bigger block of memory somewhere, and copy all the old values into the new memory block. This happens every time you run out of space in the allocated block (which may not be every time you grow the array, because the MATLAB runtime is probably smart enough to ask for a bit more memory than it needs, but it is still more than necessary). All this unnecessary reallocating and copying is what takes a long time.
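A quick way to see the cost of growing versus pre-allocating (a sketch; absolute timings will vary by machine and MATLAB version):
N = 1e5;
tic; u1 = [];         for i = 1:N, u1(i) = i; end; toc  % grown element by element
tic; u2 = zeros(1,N); for i = 1:N, u2(i) = i; end; toc  % pre-allocated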
There are several ways to optimize this for loop but, surprisingly, memory pre-allocation is not the part that saves the most time, by far. You're using max to find the largest element of a 1-by-2 vector, and on each iteration you build this vector. However, all you're doing is comparing two scalars. Using the two-argument form of max and passing it two scalars is MUCH faster: 75+ times faster on my machine for large ECG vectors!
% Set the parameters and create a vector with million elements
a = 2;
b = 3;
n = 1e6;
ECG = randn(1,n);
ECG2 = a*abs(ECG); % This can be done outside the loop if you have the memory
u(1,n) = 0; % Fast zero allocation
for i = 2:length(ECG)
    u(i) = max(ECG2(i), b*u(i-1)); % Compare two scalars
end
For the single input form of max (not including creation of random ECG data):
Elapsed time is 1.314308 seconds.
For my code above:
Elapsed time is 0.017174 seconds.
FYI, the code above assumes u(1) = 0. If that's not true, then u(1) should be set to its value after the preallocation.