I'm writing a simulation in Matlab.
I will eventually run this simulation hundreds of times.
In each simulation run, there are millions of simulation cycles.
In each of these cycles, I calculate a very complex function, which takes ~0.5 sec to finish.
The function input is a long bit array (>1000 bits) - which is an array of 0 and 1.
I hold the bit arrays in a matrix of 0 and 1, and for each one of them I only run the function once - as I save the result in a different array (res) and check if the bit array is in the matrix before running the functions:
for i=1:1000000000
    %pick a bit array somehow
    [~,indx] = ismember(bit_array,bit_matrix,'rows');
    if indx == 0
        indx = length(res) + 1;
        bit_matrix(indx,:) = bit_array;
        res(indx) = complex_function(bit_array);
    end
    result = res(indx)
    %do something with result
end
I have two questions, really:
Is there a more efficient way to find the index of a row in a matrix than 'ismember'?
Since I run the simulation many times, and there is a big overlap in the bit-arrays I'm getting, I want to cache the matrix between runs so that I don't recalculate the function over the same bit-arrays over and over again. How do I do that?

The answer to both questions is to use a map. There are a few steps to do this.
First you will need a function to turn your bit_array into either a number or a string. For example, turn [0 1 1 0 1 0] into '011010'. (Matlab only supports scalar or string keys, which is why this step is required.)
Define a map object
cachedRunMap = containers.Map; %See edit below for more on this
To check if a particular case has been run, use isKey.
To add the results of a run use the appending syntax
cachedRunMap('011010') = [0 1 1 0 1]; %Or whatever your result is.
To retrieve cached results, use the getting syntax
tmpResult = cachedRunMap.values({'011010'});
This should efficiently store and retrieve values until you run out of system memory.
Putting this together, now your code would look like this:
%Hacky magic function to convert an array into a string of '0' and '1'
strFromBits = @(x) char((x(:)'~=0)+48); %'
%Initialize the map
cachedRunMap = containers.Map;
%Loop, computing and storing results as needed
for i=1:1000000000
    %pick a bit array somehow
    strKey = strFromBits(bit_array);
    if cachedRunMap.isKey(strKey)
        result = cachedRunMap(strKey);
    else
        result = complex_function(bit_array);
        cachedRunMap(strKey) = result;
    end
    %do something with result
end
If you want a key which is not a string, that needs to be declared at step 2. Some examples are:
cachedRunMap = containers.Map('KeyType', 'char', 'ValueType', 'any');
cachedRunMap = containers.Map('KeyType', 'double', 'ValueType', 'any');
cachedRunMap = containers.Map('KeyType', 'uint64', 'ValueType', 'any');
cachedRunMap = containers.Map('KeyType', 'uint64', 'ValueType', 'double');
Setting a KeyType of 'char' sets the map to use strings as keys. All other types must be scalars.
Regarding issues as you scale this up (per your recent comments)
Saving data between sessions: There should be no issues saving this map to a *.mat file, up to the limits of your system's memory.
Purging old data: I am not aware of a straightforward way to add LRU features to this map. If you can find a Java implementation you can use it within Matlab pretty easily. Otherwise it would take some thought to determine the most efficient method of keeping track of the last time a key was used.
Sharing data between concurrent sessions: As you indicated, this probably requires a database to perform efficiently. The DB table would be two columns (3 if you want to implement LRU features): the key, the value, (and the last used time if desired). If your "result" is not a type which easily fits into SQL (e.g. a non-uniform size array, or complex structure) then you will need to put additional thought into how to store it. You will also need a method to access the database (e.g. the database toolbox, or various tools on the Mathworks file exchange). Finally you will need to actually set up a database on a server (e.g. MySQL if you are cheap, like me, or whatever you have the most experience with, or can find the most help with.) This is not actually that hard, but it takes a bit of time and effort the first time through.
Another approach to consider (much less efficient, but not requiring a database) would be to break up the data store into a large (e.g. 1000's or millions) number of maps. Save each into a separate *.mat file, with a filename based on the keys contained in that map (e.g. the first N characters of your string key), and then load/save these files between sessions as needed. This will be pretty slow ... depending on your usage it may be faster to recalculate from the source function each time ... but it's the best way I can think of without setting up the DB (clearly a better answer).
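As a rough sketch of the "saving data between sessions" point above, the map could be loaded at the start of a run and saved at the end. The file name and its location under tempdir are assumptions made only for this illustration, and the stored value 42 is a placeholder for a real result of complex_function.

```matlab
% Hypothetical cache file; choose a permanent location in real use.
cacheFile = fullfile(tempdir, 'cachedRunMap_demo.mat');

% Load the cache left by a previous run if present, otherwise start empty.
if exist(cacheFile, 'file') == 2
    loaded = load(cacheFile, 'cachedRunMap');
    cachedRunMap = loaded.cachedRunMap;
else
    cachedRunMap = containers.Map('KeyType', 'char', 'ValueType', 'any');
end

% ... the simulation loop would run here and fill cachedRunMap ...
cachedRunMap('011010') = 42;   % placeholder result, for illustration only

% Persist the cache so the next run can reuse it.
save(cacheFile, 'cachedRunMap');
```

Saving once at the end of a run keeps file I/O out of the inner loop; a run that crashes midway would lose that run's new entries, so saving periodically may be worth considering.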

For a large list, a hand-coded binary search can beat ismember, provided that keeping the list in sorted order isn't too expensive and that the lookup really is your bottleneck. Use the profiler to see how much the ismember is really costing you. If there aren't too many distinct values, you could also store them in a containers.Map by packing each row of bit_matrix into a char array and using it as the key.
If it's small enough to fit in memory, you could store it in a MAT file using save and load. They can store any basic Matlab datatype. Have the simulation save the accumulated res and bit_matrix at the end of its run, and re-load them the next time it's called.

I think that you should use containers.Map() for the purpose of speedup.
The general idea is to hold a map that contains all hash values. If your bit arrays have uniform distribution under the hash function, most of the time you won't need the call to ismember.
Since the key type cannot be an array in Matlab, you can calculate some hash function on your array of bits.
For example:
function s = GetHash(bitArray)
s = mod( sum(bitArray), intmax('uint32'));
This is a lousy hash function, but enough to understand the principle.
Then the code would look like:
map = containers.Map('KeyType','uint32','ValueType','any');
for i=1:1000000000
    %pick a bit array somehow
    s = GetHash(bit_array);
    if isKey(map, s) %Do the slow check.
        [~,indx] = ismember(bit_array,bit_matrix,'rows');
    else
        %Hash not seen before, so the bit array cannot be in the matrix.
        indx = 0;
        map(s) = 1;
    end
    if indx == 0
        indx = length(res) + 1;
        bit_matrix(indx,:) = bit_array;
        res(indx) = complex_function(bit_array);
    end
    result = res(indx)
    %do something with result
end


Encoding a binary vector in a suitable way in Matlab

The context and the problem below are only examples that can help to visualize the question.
Context: Let's say that I'm continuously generating random binary vectors G with length 1x64 (whose values are either 0 or 1).
Problem: I don't want to check vectors that I've already checked, so I want to create a kind of table that can identify what vectors are already generated before.
So, how can I identify each vector in an optimized way?
My first idea was to convert the binary vectors into decimal numbers. Due to the maximum length of the vectors, I would need 2^64 = 1.8447e+19 numbers to encode them. That's huge, so I need an alternative.
I thought about using hexadecimal coding. In that case, if I'm not wrong, I would need nchoosek(16+16-1,16) = 300540195 elements, which is also huge.
So, are there better alternatives? For example, a kind of hash function that can identify those vectors without repeating values?
So you have 64 bit values (or vectors) and you need a data structure in order to efficiently check if a new value is already existing?
Hash sets or binary trees come to mind, depending on if ordering is important or not.
Matlab has a hash table in containers.Map.
Here is an example:
n = 1e5; % number of random elements
keys = uint64(rand(n, 1) * 2^64); % random uint64
% check and add key if not already existing (using a containers.Map)
map = containers.Map('KeyType', 'uint64', 'ValueType', 'logical');
for i = 1 : n
    key = keys(i);
    if ~isKey(map, key)
        map(key) = true;
    end
end
However, depending on why you really need that and when you really need to check, the Matlab function unique might also be something for you.
Just throwing out duplicates once at the end like:
unique_keys = unique(keys);
is in this example 300 times faster than checking every time.
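For the 1x64 binary vectors in the question, one possible way to get a uint64 key is to pack the bits with integer bit operations. This is only a sketch: the example vector G is made up, and the loop uses bitshift and bitor because a double cannot represent every 64-bit integer exactly.

```matlab
% Example 1x64 binary vector (made up): first and last bits set.
G = [1 zeros(1, 62) 1];

% Pack the bits into a uint64, most significant bit first.
key = uint64(0);
for k = 1:numel(G)
    key = bitor(bitshift(key, 1), uint64(G(k)));
end

% Going through double would round away the low bit of this key.
roundTrip = uint64(double(key));
```

The resulting key can then be used wherever a scalar uint64 identifier is needed; whether a given container keeps all 64 bits of such a key is worth verifying before relying on it.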

Matlab: Randomly select from "slowly varying" index set

I would like to find or implement a Matlab data structure that allows me to efficiently do the following three things:
Retrieve an element uniformly at random.
Add a new element.
Delete an element. (If it helps, this element was just "retrieved" out of the structure, so I can use both its location and its value to delete it).
Since I don't need duplicates, this structure is mathematically equivalent to a set. Also, my elements are always integers in the range 1 to 2500; it is not unusual for the set to be this entire range.
What is such a data structure? I've thought of using something like containers.Map or java.util.HashSet, but I don't know how to satisfy the first requirement in this case, because I don't know how to efficiently retrieve the nth key of such a structure. An ordinary array can achieve the first requirement of course, but it is a bad choice for the second and third requirements because of inefficient resizing.
For some context for why I'm looking to do this, in some current code I spent about 1/4 of the runtime doing:
and then randomly retrieving an element from this vector. Yet this vector changes very little, and in a very predictable manner, in each iteration of my program. So I would prefer to carry around a data structure and update it as I go rather than recomputing it every time.
If you're familiar with Haskell, one way to implement the operations I'm looking to support would be
randomSelect set = fmap (\n -> elemAt n set) $ randomRIO (0,size set-1)
along with insert and delete, from Data.Set. But I have other reasons not to use Haskell in this project, and I don't know how to implement the backend of Data.Set myself.
Frequently, the best way to decrease time complexity is to increase space complexity. Given that your sets are going to be rather small, we can probably afford to use a little extra space.
To contain the set itself, you can use a preallocated array:
maxSize = 2500;
theSet = zeros(1, maxSize); % set elements
setCount = 0; % number of set elements
You can then have an auxiliary array to check for set membership:
isMember = zeros(1, maxSize);
To insert a new element newval into the set, add it to the end of theSet and increment the count (assuming there's room):
if ~isMember(newval)
    assert(setCount < maxSize, 'Too many elements in set.');
    setCount = setCount + 1;
    theSet(setCount) = newval;
    isMember(newval) = 1;
else
    % tried to add duplicate element... do something here
end
To delete an element by index delidx, swap the element to be deleted and the last element and decrement the count:
assert(delidx <= setCount, 'Tried to remove element beyond end of set.');
isMember(theSet(delidx)) = 0;
theSet(delidx) = theSet(setCount);
setCount = setCount - 1;
Getting a random element of the set is then simple, just:
randidx = randi(setCount);
randelem = theSet(randidx);
All operations are O(1) and the only real disadvantage is that we have to carry along two arrays of size maxSize. Because of that you probably don't want to put these operations in functions as you'd end up creating new arrays on every function call. You'd be better off putting them inline or, better yet, wrapping them in a nice class.
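To see the three operations working together, here is a small inline walk-through under the same assumptions (integers in 1..2500). The inserted values and the deleted index are arbitrary examples.

```matlab
maxSize  = 2500;
theSet   = zeros(1, maxSize);   % set elements live in theSet(1:setCount)
isMember = false(1, maxSize);   % isMember(v) is true when v is in the set
setCount = 0;

% Insert three example values.
for newval = [7 42 1999]
    if ~isMember(newval)
        setCount = setCount + 1;
        theSet(setCount) = newval;
        isMember(newval) = true;
    end
end

% Delete the element stored at index 1 by moving the last element into its slot.
delidx = 1;
isMember(theSet(delidx)) = false;
theSet(delidx) = theSet(setCount);
setCount = setCount - 1;

% Pick a uniformly random remaining element.
randelem = theSet(randi(setCount));
```

Note that deletion changes the order of the stored elements, which is harmless for a set but means indices are not stable across deletions.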

Efficient byte pattern search in matlab memory map

I have large binary files (2+ GB) that are arranged with a sync pattern (0xDEADBEEF) followed by a data block of a fixed size.
0xDE AD BE EF ... 96 bytes of data
0xDE AD BE EF ... 96 bytes of data
... repeat ...
I need to locate the offsets to the start of each packet. Ideally this would just be [1:packetSize:fileSize] However, there is other data that can be interspersed, headers etc. so I need to search the file for the sync pattern.
I am using the following code which is based on Loren from Mathworks findPattern2 but modified a little to use a memory map.
function pattLoc = findPattern(fileName, bytePattern)
%Mem Map file
m = memmapfile(fileName);
% Find candidate locations match the first element in the pattern.
pattLoc = find(m.Data==bytePattern(1));
%Remove start values that are too close to the end to possibly match
len = numel(bytePattern);
endVals = pattLoc+len-1;
pattLoc(endVals>length(m.Data)) = [];
% loop over elements of Sync Pattern to check possible location validity.
for pattval = 2:len
    % check viable locations in array
    locs = bytePattern(pattval) == m.Data(pattLoc+pattval-1);
    pattLoc(~locs) = []; % delete false ones from indices
end
This works pretty well. However, I think there might be room for improvement. First I know my patterns can't be closer than packetSize (100 in this example) but may be farther apart. Seems like I should be able to use this information somehow to speed up the search. Second the initial search on line 5 uses a find for numerical indexing instead of logical. This line takes almost twice as long as leaving it as a logical. However, I tried to rework this function using logical indexing only and failed miserably. The problem arises inside the loop and keeping track of the nested indexing with logicals without using more finds ... or checking more data than needs to be.
So any help speeding this up would be appreciated. Below is some code which will create a simple sample binary file to work with if necessary.
function genSampleFile(numPackets)
pattern = hex2dec({'DE','AD','BE','EF'});
fileName = 'testFile.bin';
fid = fopen(fileName,'w+');
for f = 1:numPackets
    fwrite(fid,[pattern; uint8(rand(96,1)*255)],'uint8');
end
fclose(fid);
Searching a file with 10000000 packets took the following:
>> genSampleFile(10000000); %Warning Makes 950+MB file
>> tic;pattLoc = findPattern(fileName, pattern);toc
Elapsed time is 4.608321 seconds.
You can get an immediate boost by using findstr, or better yet strfind instead of find:
pattLoc = strfind(m.Data', bytePattern(:)') % both arguments as row vectors
This removes the need for any further looping. You just need to clean up a couple of things in the returned array of indices.
You want to remove things that are closer than 100 bytes to the end, not closer than 4 bytes to the end, so set len = 100, instead of len = length(bytePattern).
To filter out elements that are closer than 100 bytes from each other, use diff on the list of indices:
pattLoc(diff(pattLoc) < 100) = []
This should speed up your code by relying more on builtins, which are generally much more efficient than loops.
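A small in-memory sketch of the strfind approach follows. The data here is synthetic (two packets separated by three stray bytes), and both arrays are converted to char row vectors before the search, which is an assumption made to stay within the documented text-input form of strfind.

```matlab
pattern    = uint8([222 173 190 239]);   % 0xDE 0xAD 0xBE 0xEF
packetSize = 100;                        % 4 sync bytes + 96 data bytes
payload    = uint8(0:95);                % dummy data that never contains 0xDE
stray      = uint8([1 2 3]);             % interspersed non-packet bytes
data       = [pattern payload stray pattern payload];

% Locate every occurrence of the sync pattern.
pattLoc = strfind(char(data), char(pattern));

% Drop matches that start too close to the end to hold a full packet.
pattLoc(pattLoc + packetSize - 1 > numel(data)) = [];
```

With a memory-mapped file, data would be the mapped byte array reshaped to a row; converting a multi-gigabyte array to char has its own memory cost, so it may be necessary to process the file in chunks.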

Incremental appending: How to avoid performance penalty of struct arrays

If you must incrementally append data to arrays, it seems that using individual vectors of basic data types is orders of magnitude faster than an array of structs (with one vector element per record). Even trying to collect the individual vectors into a struct seems to double the time. The tests are:
fprintf('\nstruct array (array of structs):\n')
clear x y;
y=struct( 'a',[], 'b',[], 'c',[], 'd',[] );
tic;
for iIns = 1 : N
   x.a=rand; x.b=rand; x.c=rand; x.d=rand;
   y(end+1)=x;
end % for iIns
toc;
fprintf('\nSeparate arrays of scalars:\n')
clear a b c d;
a=[]; b=[]; c=[]; d=[];
tic;
for iIns = 1 : N
   a(end+1) = rand;
   b(end+1) = rand;
   c(end+1) = rand;
   d(end+1) = rand;
end % for iIns
toc;
fprintf('\nA struct with arrays of scalars for fields:\n')
clear a b c d x y
x.a=[]; x.b=[]; x.c=[]; x.d=[];
tic;
for iIns = 1:N
   x.a(end+1) = rand;
   x.b(end+1) = rand;
   x.c(end+1) = rand;
   x.d(end+1) = rand;
end % for iIns
toc;
The results:
struct array (array of structs):
Elapsed time is 24.127274 seconds.
Separate arrays of scalars:
Elapsed time is 0.048190 seconds.
A struct with arrays of scalars for fields:
Elapsed time is 0.084624 seconds.
Even though collecting individual vectors of basic data types into a struct (3rd scenario above) imposes such a penalty, it may be preferable to simply using individual vectors (second scenario above) because the variables are more organized. Your variable name space isn't filled up with so many variables which are in fact conceptually grouped.
That's quite a significant penalty, however, to pay for such organization. I don't suppose there is a way to avoid this?
There are two ways to avoid this performance penalty: (1) pre-allocate, and (2) rethink your stance on "organizing" variables. I suggest both. Oh, and if you can, don't use arrays of structs where each field only uses scalars - if your application suddenly has to handle a couple of orders of magnitude more data, the memory overhead will force you to rewrite everything.
You often know how many elements your array will end up having. Thus, initialize your arrays as s = struct('a',NaN(1,N),'b',NaN(1,N)); If you don't know ahead of time how many entries there will be, but you can estimate an upper limit, initialize with the upper limit, and either remove the elements, or use functions (e.g. nanmean) that do not care if the array has a few extra NaNs in the end. If you truly know nothing about the final size (except that N will be large enough to matter), pre-allocate with a nice number (e.g. N=1337), and extend the array in chunks. MathWorks have sped up dynamic growing of numeric arrays in a recent release, but as you demonstrate in your answer, the optimization has not been applied to structs yet. Don't count on MathWorks' optimization team to fix your code.
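A minimal sketch of the upper-limit variant described above: pre-allocate a struct of vectors, fill part of it, and trim the unused tail. The sizes and the values written are placeholders chosen for illustration.

```matlab
upperLimit = 1000;   % assumed upper bound on the number of records
s = struct('a', NaN(1, upperLimit), 'b', NaN(1, upperLimit));

count = 0;
for k = 1:750        % pretend only 750 records actually arrive
    count = count + 1;
    s.a(count) = k;      % placeholder values
    s.b(count) = 2 * k;
end

% Remove the unused, still-NaN tail once the final size is known.
s.a = s.a(1:count);
s.b = s.b(1:count);
```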
Nice variables
Why worry about your variable space? As long as you use explicitVariableNames, your code remains readable and you will have an easy time picking out the right variable. But ok, let's say you want to clean up: The first way to keeping the number of active variables low is to use clear or keep at strategic points in your code to make sure you only keep around what's needed. The second (assuming you want to optimize for performance), is to put contextually linked vectors into the same array: objectDimensions = [lengthOfObject, widthOfObject, heightOfObject]. This keeps everything as numeric arrays (which are fastest), and allows easy vectorization such as objectVolume = prod(objectDimensions,2);.
/aside: I should disclose that I used to use structures frequently for assembling results (so that I could return a lot of information in a single variable and have the field names be part of the documentation). I have since switched to using object-oriented programming (usually handle objects), which not only collects related variables, but also the associated functionality, and which facilitates code re-use. I do take a performance hit, but the time it saves me coding more than makes up for it. Note that I do pre-allocate if at all possible (and if it's not just growing an array three times).
Assume you have a function getDimensions that reads dimensions (length, height, width) of objects. However, sometimes, the object is 2D, sometimes it is 3D. Thus, you want to fill the following variables: twoD.length, twoD.width, threeD.length, threeD.width, threeD.height, ideally as arrays of structs, so that each element of a struct corresponds to an object. You do not know ahead of time how many objects there are, all you can do is poll the function thereAreMoreObjects, which returns true or false, until there are no more objects.
Here's how you can do this with reasonable efficiency and growing arrays by chunks:
%// preassign the temporary variable, and some others
chunkSize = 1000;
numObjects = 0;
idAndDimensions = zeros(chunkSize,4);
while thereAreMoreObjects()
    objectId = getCurrentObjectId();
    %// hi==-1 if it's flat
    [len,wid,hi] = getObjectDimensions(objectId);
    %// allocate more, if needed
    numObjects = numObjects + 1;
    if numObjects > size(idAndDimensions,1)
        %// grow array
        idAndDimensions(end+chunkSize,1) = 0;
    end
    idAndDimensions(numObjects,:) = [objectId, len, wid, hi];
end
%// throw away excess
idAndDimensions = idAndDimensions(1:numObjects,:);
%// split into 2D and 3D objects
isTwoD = idAndDimensions(:,end) == -1;
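%// assign twoD struct
twoD = struct('id',num2cell(idAndDimensions(isTwoD,1)),...
    'length',num2cell(idAndDimensions(isTwoD,2)),...
    'width',num2cell(idAndDimensions(isTwoD,3)));
%// assign threeD struct
threeD = struct('id',num2cell(idAndDimensions(~isTwoD,1)),...
    'length',num2cell(idAndDimensions(~isTwoD,2)),...
    'width',num2cell(idAndDimensions(~isTwoD,3)),...
    'height',num2cell(idAndDimensions(~isTwoD,4)));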
%// clean up - we need only the two structs
%// I use keep from the File Exchange instead of clearvars
clearvars -except twoD threeD

Vectorize matlab code to map nearest values in two arrays

I have two lists of timestamps and I'm trying to create a map between them that uses the imu_ts as the true time and tries to find the nearest vicon_ts value to it. The output is a 3xd matrix where the first row is the imu_ts index, the third row is the unix time at that index, and the second row is the index of the closest vicon_ts value above the timestamp in the same column.
Here's my code so far and it works, but it's really slow. I'm not sure how to vectorize it.
function tmap = sync_times(imu_ts, vicon_ts)
tstart = max(vicon_ts(1), imu_ts(1));
tstop = min(vicon_ts(end), imu_ts(end));
%trim imu data to
tmap(1,:) = find(imu_ts >= tstart & imu_ts <= tstop);
tmap(3,:) = imu_ts(tmap(1,:));%Use imu_ts as ground truth
%Find nearest indices in vicon data and map
vic_t = 1;
for i = 1:size(tmap,2)
    while(vicon_ts(vic_t) < tmap(3,i))
        vic_t = vic_t + 1;
    end
    tmap(2,i) = vic_t;
end
The timestamps are already sorted in ascending order, so this is essentially an O(n) operation but because it's looped it runs slowly. Any vectorized ways to do the same thing?
It appears to be running faster than I expected or first measured, so this is no longer a critical issue. But I would be interested to see if there are any good solutions to this problem.
Have a look at knnsearch in MATLAB. Use cityblock distance and also put an additional constraint that the data point in vicon_ts should be less than its neighbour in imu_ts. If it is not then take the next index. This is required because cityblock takes absolute distance. Another option (and preferred) is to write your custom distance function.
I believe that your current method is sound, and I would not try and vectorize any further. Vectorization can actually be harmful when you are trying to optimize some inner loops, especially when you know more about the context of your data (e.g. it is sorted) than the Mathworks engineers can know.
Things that I typically look for when I need to optimize some piece of code like this are:
All arrays are pre-allocated (this is the biggest driver of performance)
Fast inner loops use simple code (Matlab does pretty effective JIT on basic commands, but must interpret others.)
Take advantage of any special data features that you have, e.g. use sort appropriate algorithms and early exit conditions from some loops.
You're already doing all this. I recommend no change.
A good start might be to get rid of the while, try something like:
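for i = 1:size(tmap,2)
    C = vicon_ts - tmap(3,i);   % distance from this imu time to every vicon time
    C(C < 0) = Inf;             % ignore vicon times below the imu time
    [~, tmap(2,i)] = min(C);    % index of the closest vicon time at or above it
end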
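One loop-free possibility, offered only as a sketch, is interp1 with the 'next' method, which maps each query time to the value at the next sample point. It assumes vicon_ts is sorted with distinct values; the timestamps below are made up, and the behaviour for imu times that exactly equal a vicon time is worth checking against the loop version.

```matlab
% Made-up sorted timestamps with distinct values.
vicon_ts = [0 10 20 30];
imu_ts   = [5 11 12 29];

tstart = max(vicon_ts(1), imu_ts(1));
tstop  = min(vicon_ts(end), imu_ts(end));

% Keep only imu samples inside the overlapping time range.
imu_idx = find(imu_ts >= tstart & imu_ts <= tstop);

tmap = zeros(3, numel(imu_idx));
tmap(1,:) = imu_idx;             % imu index
tmap(3,:) = imu_ts(imu_idx);     % imu time, used as ground truth
% Interpolate over vicon indices so the result is the index of the next vicon sample.
tmap(2,:) = interp1(vicon_ts, 1:numel(vicon_ts), tmap(3,:), 'next');
```

Whether this is faster than the single-pass loop depends on the data size, so profiling both is advisable.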