Vectorizing code - How to reduce MATLAB computational time

I have this piece of code
N=10^4;
for i = 1:N
[E,X,T] = fffun(); % Stochastic simulation. Returns every time three different vectors (whose length is 10^3).
X_(i,:)=X;
T_(i,:)=T;
GRID=[GRID T];
end
GRID=unique(GRID);
% Second part
for i=1:N
for j=1:(kmax)
f=find(GRID==T_(i,j) | GRID==T_(i,j+1));
s=f(1);
e=f(2)-1;
counter(X_(i,j), s:e)=counter(X_(i,j), s:e)+1;
end
end
The code performs N different simulations of a stochastic process, each consisting of 10^3 events that occur at discrete moments (the T vector) which depend on the specific simulation.
Now (second part) I want to know, as a function of the time instant, how many simulations are in a particular state (X takes values between 1 and 10). The idea I had: create a grid vector with all the moments at which something happens in any simulation. Then, for each simulation, loop over the timesteps at which something happens and increment all the counter indices that correspond to that particular slice of time.
However, this second part is very heavy (I mean days of processing on a standard quad-core CPU), and it shouldn't be.
Are there any ideas (maybe about comparing vectors in a more efficient way) to cut the CPU time?
Here is a standalone version of the second part:
N=5000;
counter=zeros(11,length(GRID));
for i=1:N
disp(['Counting sim #' num2str(i)]);
for j=1:(kmax)
f=find(GRID==T_(i,j) | GRID==T_(i,j+1),2);
s=f(1);
e=f(2)-1;
counter(X_(i,j), s:e)=counter(X_(i,j), s:e)+1;
end
end
counter=counter/N;
stop=find(GRID==Tmin);
stop=stop-1;
plot(counter(:,(stop-500):stop)')
with associated dummy data (filedropper.com/data_38). In the real context the matrices have 2x the rows and 10x the columns.

Here is what I understand:
T_ is a matrix of time steps from N simulations.
X_ is a matrix of simulation state at T_ in those simulations.
so if you do:
[ut,~,ic]= unique(T_(:));
you get ic which is a vector of indices for all unique elements in T_. Then you can write:
counter = accumarray([ic X_(:)],1);
and you get counter with as many rows as unique timesteps and as many columns as unique states in X_ (which are all, and must be, integers). Now you can say that for each timestep ut(k), the number of times a simulation was in state m is counter(k,m).
In your data, the only combination of m and k that has a value greater than 1 is (1,1).
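For example, to read off the counts at one particular time (just a sketch, where t stands for whatever time value you want to query):
k = find(ut == t, 1); % row index of time t in the unique-time list
states_at_t = counter(k, :); % counts per state at that time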
Edit:
From the comments below, I understand that you record all state changes and the time steps when they occur. Then, every time any simulation changes state, you want to look at the current states of all simulations and count how many are in each state.
The main problem here is that your time is continuous, so basically each element in T_ is unique, and you have over a million time steps to loop over. Fully vectorizing such a process would need about 80 GB of memory, which will probably choke your computer.
So I looked for a combination of vectorizing and looping through the time steps. We start by finding all unique intervals, and preallocating counter:
ut = unique(T_(:));
stt = 11; % no. of states
counter = zeros(stt,numel(ut));
r = 1:size(T_,1); % we will need that also later
Then we loop over all elements in ut, and each time look for the relevant timestep in T_ in all simulations in a vectorized way. Finally, we use histcounts to count all the states:
for k = 1:numel(ut)
temp = T_<=ut(k); % mark all time steps before ut(k)
s = cumsum(temp,2); % count the columns
col_ind = s(:,end); % find the column index for each simulation
% convert the columns to linear indices:
linind = sub2ind(size(T_),r,col_ind.');
% count the states:
counter(:,k) = histcounts(X_(linind),1:stt+1);
end
This takes about 4 seconds on my computer for 1000 simulations, so it adds up to a little more than one hour for the whole process. Not very quick...
You can also try one or two of the tweaks below to squeeze the run time down a little more:
As you can read here, accumarray seems to work faster on small arrays than histcounts, so you may want to switch to it.
Also, computing linear indices directly is a quicker method than sub2ind, so you may want to try that.
Implementing these suggestions in the loop above, we get:
R = size(T_,1);
r = (1:R).';
K = numel(ut);
for k = 1:K
temp = T_<=ut(k); % mark all time steps before ut(k)
s = cumsum(temp,2); % count the columns
col_ind = s(:,end); % find the column index for each simulation
% convert the columns to linear indices:
linind = R*(col_ind-1)+r;
% count the states:
counter(:,k) = accumarray(X_(linind),1,[stt 1]);
end
On my computer, switching to accumarray and/or removing sub2ind gave a slight improvement, but it was not consistent (using timeit for testing on 100 or 1K elements in ut), so you had better test it yourself. However, this still remains very long.
One thing you may want to consider is discretizing your timesteps, so you will have far fewer unique elements to loop over. In your data about 8% of the time intervals are smaller than 1. If you can assume that this is short enough to be treated as one time step, then you could round your T_ and get only ~12.5K unique elements, which take about a minute to loop over. You can do the same for 0.1 intervals (which are less than 1% of the time intervals) and get 122K elements to loop over, which will take about 8 hours...
Of course, all the timings above are rough estimates using the same algorithm. If you do choose to round the times, there may be even better ways to solve this.
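For reference, the rounding idea in code (a minimal sketch; res is just an illustrative name for the resolution you choose):
res = 1; % treat intervals shorter than this as one time step
T_r = round(T_ / res) * res; % discretized copy of the event times
ut = unique(T_r(:)); % far fewer unique times to loop over
% ...then run the same loop as above with T_r in place of T_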

Related

Revolving Doors Riddle - Matlab Time-Efficient Sparse Matrix Use

I'm running a code with many iterations using large sparse matrices. There are three lines in my code that take about 75% of the running time and I think I can use the special structure of my sparse matrix to reduce that time, but so far I haven't managed to do it. I would love your help!!
Ok, here's the gist of my code:
I = 70;
J = 1000;
A = rand(I);
A = A./repmat(sum(A, 2), 1, I);
S = kron(A, speye(J));
indj = randi(J,I,1);
tic
for i = 1:I
S(:, (i-1)*J+indj(i)) = sum(S(:, (i-1)*J + (1:indj(i))), 2);
end
toc
You can skip the following 2 paragraphs
Here's a story to make the example a bit more lively. An old man is visiting sick people at different hospitals. There are 1000 (J) hospitals, and each hospital has 70 (I) rooms in it. The matrix A is the transition matrix that specifies the probability of the old man moving from one room at the hospital to another room within the same hospital. A(i1,i2) is the probability the old man moves from room i1 to room i2 (so rows sum to 1). The big S matrix is the transition probability matrix, where moving from room i1 at hospital j1 to room i2 at hospital j2 is given by the (J*(i1-1)+j1, J*(i2-1)+j2) element. There is no way the old man moves from one hospital to another, so the matrix is sparse.
Something magical happens and now all the doors to room number i in the first indj(i) hospitals all lead to the same hospital, hospital indj(i). So the old man can now magically move between hospitals. We need to change the S matrix accordingly. This amounts to two things, increasing the probability of moving to room i at hospital indj(i), for all i, and setting to zero the probability of getting into all rooms lower than indj(i) at hospital i, for all i. The latter I can do very efficiently, but the first part is taking me too long.
Why I think there's a chance to reduce running time
Loop. The part between the tic and toc can be written without a loop. I have done it, but it made the code run much slower, perhaps because the index vectors passed to sub2ind were very large.
Matrix structure. Notice that we don’t need the entire sum, only one element needs to be added. These loops achieve the same outcome (but here, obviously, much slower):
for i = 1:I
for ii = 1:I
for j = 1:indj(i)-1
S((ii-1)*J+j, (i-1)*J+indj(i)) = S((ii-1)*J+j, (i-1)*J+indj(i)) + S((ii-1)*J+j, (i-1)*J+j);
end
end
end
This makes me somewhat hopeful that there is a way to make the calculation faster…
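For illustration, here is one loop-free direction (not the attempt I mentioned above; an untested sketch, and the names srcCols, grp, P and target are just illustrative):
srcCols = arrayfun(@(i) (i-1)*J + (1:indj(i)), 1:I, 'UniformOutput', false);
grp = repelem(1:I, indj(:).'); % which target column each source column feeds
P = sparse([srcCols{:}], grp, 1, I*J, I); % selector that sums the right columns
target = (0:I-1)*J + indj(:).'; % linear indices of the target columns
S(:, target) = S * P; % one sparse product instead of the loop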
Your help is HIGHLY appreciated!

Efficient way of computing second min value

Given a matrix, it's easy to compute the value and index of the min value:
A = rand(10);
[value, index] = min(A(:));
However I would also like to recover the second min value (idem for max).
I can of course take either of these two approaches:
Converting A to a vector and sorting it.
PROS: I can then recover the second, third... n minimum value
CONS: If A is large, sorting is expensive
Once the location of the min of A is found, I can replace this value with a large one (e.g. Inf) and then run min again.
PROS: Cheaper than sort
CONS: I must modify my matrix (and save the modified value in an aux variable). Also re-running min is costly on a large matrix.
I'm wondering if there is a better solution:
When computing min, the algorithm has to keep track of the minimum value found so far, updating it whenever a smaller value appears.
If instead we keep track of the n smallest values found so far, we can recover the n minimum values.
I can implement this, but I'm wondering if it's the best approach or if it's already implemented.
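Side note: if your MATLAB is recent enough (mink and maxk were introduced around R2017b, if I recall correctly), this kind of partial selection is already built in. A minimal sketch:
A = rand(10);
[vals, idx] = mink(A(:), 2); % two smallest values and their linear indices
secondMin = vals(2);
secondMinIndex = idx(2);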
I don't know in which cases it would be less expensive than sorting, but an easy, though not so fast, way would be to use the following code. I may be wrong, but I don't think you can get faster with built-in functions if you just want the first and second min.
A = rand(10);
[firstMin, firstMinIndex] = min(A(:));
secondMin = min(A(A~=firstMin));
secondMinIndex = find(A==secondMin); % slow, but use only if you need the index
Here you go through the matrix two more times: once for the boolean operation and once for the second min.
After some testing on 2000x2000 and 4000x4000 random matrices, it seems that this code snippet is around 3.5 times faster than the sort function applied to the same matrix.
If you really need more efficiency, you'd have to write your own mex routine, with which you can theoretically get the two values in n + log2(n) - 2 comparisons, as explained in the link provided by @luismendotomas.
Hope this helps!
In a single pass:
a = [53 53 49 49 97 75 4 22 4 37];
first = Inf;
second = Inf;
for i = 1:1:numel(a)
if (a(i) < first)
second = first;
first = a(i);
elseif (a(i) < second && a(i) ~= first)
second = a(i);
end
end
fprintf('First smallest %d\n', first);
fprintf('Second smallest %d\n', second);
You can remove the a(i) ~= first condition if you would rather have 4, 4 as output instead of 4, 22.
Also, see this SO question
As already mentioned, I suppose the best (read: "most efficient") method is to implement the methods from @luismendotomas' link.
However, if you want to avoid doing too much programming yourself, then you could apply some k-nearest neighbours algorithm, given you have a lower bound on your data, e.g. if all your data points are positive, you can find the 2 nearest neighbours to 0. Though I am not sure whether this is faster than your initial suggestions or not.
For one k-nearest neighbour algorithm see e.g. this
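If you want to try that route, a minimal sketch (assuming nonnegative data and the Statistics Toolbox) could look like:
A = rand(10); % nonnegative data, so 0 is a lower bound
[idx, dist] = knnsearch(A(:), 0, 'K', 2); % the two points closest to 0
twoSmallest = A(idx); % equals dist here, since the query point is 0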
beesleep has already pointed out that method 2 (computing the minimum twice) is more efficient than method 1 (sorting). However, the implementation provided in that answer to compute the index of the second minimum via find is, as mentioned, very inefficient.
In fact, to get the index of the second minimum, it is ca. 10x faster to set the first minimum value to inf (as suggested in the question) and then get the index of the second minimum from the min function (as opposed to using find)
[firstMin, firstMinIndex] = min(A(:));
A(firstMinIndex) = inf;
[secondMin, secondMinIndex] = min(A(:));
Here is the code which I used to compare this implementation to the one suggested by beesleep:
for i = 1:10
A = rand(10000);
tic
[firstMin, firstMinIndex] = min(A(:));
secondMin = min(A(A~=firstMin));
secondMinIndex = find(A==secondMin); % slow, but use only if you need the index
t1(i) = toc;
tic
[firstMin, firstMinIndex] = min(A(:));
A(firstMinIndex) = inf;
[secondMin, secondMinIndex] = min(A(:));
t2(i) = toc;
end
disp(mean(t1) / mean(t2))

How to identify an optimal subsample from a data set with missing values in MATLAB

I would like to identify the largest possible contiguous subsample of a large data set. My data set consists of roughly 15,000 financial time series of up to 360 periods in length. I have imported the data into MATLAB as a 360 by 15,000 numerical matrix.
This matrix contains a lot of NaNs due to some of the financial data not being available for the entire period. In the illustration, NaN entries are shown in dark blue, and non-NaN entries appear in light blue. It is these light blue non-NaN entries which I would like to ideally combine into an optimal subsample.
I would like to find the largest possible contiguous block of data that is contained in my matrix, while ensuring that my matrix contains a sufficient number of periods.
In a first step I would like to sort my matrix from left to right in descending order by the number of non-NaN entries in each column, that is, I would like to sort by the vector obtained by entering sum(~isnan(data),1).
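For example, that first sorting step could be (a sketch, with data being my 360-by-15,000 matrix):
nonNaNCount = sum(~isnan(data), 1); % non-NaN entries per column
[~, order] = sort(nonNaNCount, 'descend');
dataSorted = data(:, order); % columns with the most data first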
In a second step I would like to find the sub-array of my data matrix that is at least 72 entries along the first dimension and is otherwise as large as possible, measured by the total number of entries.
What is the best way to implement this?
A big warning (may or may not apply depending on context)
As Oleg mentioned, when an observation is missing from a financial time series, it's often missing for a reason: e.g. the entity went bankrupt, the entity was delisted, or the instrument did not trade (i.e. it was illiquid). Constructing a sample without NaNs is likely equivalent to constructing a sample where none of these events occur!
For example, if this were hedge fund return data, selecting a sample without NaNs would exclude funds that blew up and ceased trading. Excluding imploded funds would bias estimates of expected returns upwards and estimates of variance or covariance downwards.
Picking a sample period with the fewest time series with NaNs would also exclude periods like the 2008 financial crisis, which may or may not make sense. Excluding 2008 could lead to an underestimate of how haywire things could get (though including it could lead to overestimating the probability of certain rare events).
Some things to do:
Pick a sample period as long as possible but be aware of the limitations.
Do your best to handle survivorship bias: eg. if NaNs represent delisting events, try to get some kind of delisting return.
You almost certainly will have an unbalanced panel with missing observations, and your algorithm will have to deal with that.
Another general finance / panel data point: selecting a sample at some time point t and then following it into the future is perfectly OK. But selecting a sample based upon what happens during or after the sample period can be incredibly misleading.
Code that does what you asked:
This should do what you asked and be quite fast. Be aware of the problems above, though, if missingness is not random and orthogonal to what you care about.
Inputs are a T by n sized matrix X:
T = 360; % number of time periods (i.e. rows) in X
n = 15000; % number of time series (i.e. columns) in X
T_subsample = 72; % desired length of sample (i.e. rows of newX)
% number of possible starting points for series of length T_subsample
nancount_periods = T - T_subsample + 1;
nancount = zeros(n, nancount_periods, 'int32'); % will hold a count of NaNs
X_isnan = int32(isnan(X));
nancount(:,1) = sum(X_isnan(1:T_subsample, :))'; % initialize
% We need to obtain a count of nans in T_subsample sized window for each
% possible time period
j = 1;
for i=T_subsample + 1:T
% One pass: add new period in the window and subtract period no longer in the window
nancount(:,j+1) = nancount(:,j) + X_isnan(i,:)' - X_isnan(j,:)';
j = j + 1;
end
indicator = nancount==0; % indicator of whether starting_period, series
% has no NaNs
% number of nonan series of length T_subsample by starting period
max_subsample_size_by_starting_period = sum(indicator);
max_subsample_size = max(max_subsample_size_by_starting_period);
% find the best starting period
starting_period = find(max_subsample_size_by_starting_period==max_subsample_size, 1);
ending_period = starting_period + T_subsample - 1;
columns_mask = indicator(:,starting_period);
columns = find(columns_mask); %holds the column ids we are using
newX = X(starting_period:ending_period, columns_mask);
Here's an idea:
Assuming you can rearrange the series, calculate the pairwise distance (you decide the metric, but if you are looking at isnan vs. not-isnan, Hamming is OK).
Now hierarchically cluster the series and rearrange them using either a dendrogram
or http://www.mathworks.com/help/bioinfo/examples/working-with-the-clustergram-function.html
You should probably prune any series that doesn't have a minimum number of non-NaN values before you start.
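A rough sketch of the whole idea (untested; assumes the Statistics Toolbox and that each column of data is one series):
mask = ~isnan(data); % availability pattern, one column per series
D = pdist(double(mask)', 'hamming'); % pairwise Hamming distance between series
tree = linkage(D, 'average'); % hierarchical clustering
perm = optimalleaforder(tree, D); % leaf order that groups similar series
imagesc(mask(:, perm)); % inspect for large contiguous non-NaN blocks
For 15,000 series the pairwise distance vector gets large, which is another reason to prune first.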
First, I have only little insight into financial mathematics. I understood that you want to find the longest continuous chain of non-NaN values in each time series. The time series should be sorted by the length of this chain, and every time series not containing a chain above a threshold discarded. This can be done using
data = rand(360,15e3);
data(abs(data) <= 0.02) = NaN;
%% sort and chop data based on amount of consecutive non-NaN values
binary_data = ~isnan(data);
% find edges, denote their type and calculate the biggest chunk in each
% column
edges = [2*binary_data(1,:)-1; diff(binary_data, 1)];
chunk_size = diff(find(edges));
chunk_size(end+1) = numel(edges)-sum(chunk_size);
[row, ~, id] = find(edges);
num_row_elements = diff(find(row == 1));
num_row_elements(end+1) = numel(chunk_size) - sum(num_row_elements);
%a chunk of NaN has a -1 in id, a chunk of non-NaN a 1
chunks_per_row = mat2cell(chunk_size .* id,num_row_elements,1);
% sort by largest consecutive block of non-NaNs
max_size = cellfun(@max, chunks_per_row);
[max_size_sorted, idx] = sort(max_size, 'descend');
data_sorted = data(:,idx);
% remove all elements that only have block sizes smaller then some number
some_number = 20;
data_sort_chop = data_sorted(:,max_size_sorted >= some_number);
Note that this can be done a lot more simply if the order of periods within a time series doesn't matter, i.e. if data([1 2 3],id) and data([3 1 2],id) are treated as identical.
What I do not know is whether you want to discard all periods within a time series that don't belong to the longest chain, extract all those chains as individual time series, ...
Feel free to drop a comment if it has to be more specific.

Random numbers that add to 1 with a minimum increment: Matlab

Having read carefully the previous question
Random numbers that add to 100: Matlab
I am struggling to solve a similar but slightly more complex problem.
I would like to create an array of n elements that sums to 1, however I want an added constraint that the minimum increment (or if you like number of significant figures) for each element is fixed.
For example if I want 10 numbers that sum to 1 without any constraint the following works perfectly:
num_stocks=10;
num_simulations=100000;
temp = [zeros(num_simulations,1),sort(rand(num_simulations,num_stocks-1),2),ones(num_simulations,1)];
weights = diff(temp,[],2);
I foolishly thought that by scaling this I could add the constraint as follows
num_stocks=10;
min_increment=0.001;
num_simulations=100000;
scaling=1/min_increment;
temp2 = [zeros(num_simulations,1),sort(round(rand(num_simulations,num_stocks-1)*scaling)/scaling,2),ones(num_simulations,1)];
weights2 = diff(temp2,[],2);
However, though this works for small values of n and small increments, if for example n = 1,000 and the increment is 0.1%, then over a large number of trials the first and last numbers have a mean which is consistently below 0.1%.
I am sure there is a logical explanation/solution to this, but I have been tearing my hair out trying to find it and wondered if anybody would be so kind as to point me in the right direction. To put the problem into context, I am creating random stock portfolios (hence the sum to 1).
Thanks in advance
Thank you for the responses so far. Just to clarify (as I think my initial question was perhaps badly phrased): it is the weights that have a fixed increment of 0.1%, so 0%, 0.1%, 0.2%, etc.
I did try using integers initially
num_stocks=1000;
min_increment=0.001;
num_simulations=100000;
scaling=1/min_increment;
temp = [zeros(num_simulations,1),sort(randi([0 scaling],num_simulations,num_stocks-1),2),ones(num_simulations,1)*scaling];
weights = (diff(temp,[],2)/scaling);
test=mean(weights);
but this was worse, the mean for the 1st & last weights is well below 0.1%.....
Edit to reflect excellent answer by Floris & clarify
The original code I was using to solve this problem (before finding this forum) was
function x = monkey_weights_original(simulations,stocks)
stockmatrix=1:stocks;
base_weight=1/stocks;
r=randi(stocks,stocks,simulations);
x=histc(r,stockmatrix)*base_weight;
end
This runs very fast, which was important considering I want to run a total of 10,000,000 simulations: 10,000 simulations on 1,000 stocks take just over 2 seconds on a single core, and I am running the whole code on an 8-core machine using the Parallel Computing Toolbox.
It also gives exactly the distribution I was looking for in terms of means, and I think that it is just as likely to get a portfolio that is 100% in 1 stock as it is to get a portfolio that is 0.1% in every stock (though I'm happy to be corrected).
My issue is that although it works for 1,000 stocks and an increment of 0.1%, and I guess it works for 100 stocks and an increment of 1%, as the number of stocks decreases each pick becomes a very large percentage (in the extreme, with 2 stocks you will always get a 50/50 portfolio).
In effect I think this solution is like the binomial solution Floris suggests (but more limited)
However, my question has arisen because I would like to make my approach more flexible and allow for, say, 3 stocks and an increment of 1%, which my current code will not handle correctly; hence how I stumbled across the original question on Stack Overflow.
Floris's recursive approach will get to the right answer, but the speed will be a major issue considering the scale of the problem.
An example of the original research is here
http://www.huffingtonpost.com/2013/04/05/monkeys-stocks-study_n_3021285.html
I am currently working on extending it with more flexibility on portfolio weights and numbers of stocks in the index, but it appears my programming and probability theory abilities are a limiting factor...
One problem I can see is that your formula allows for numbers to be zero - when the rounding operation results in two consecutive numbers being the same after sorting. Not sure if you consider that a problem - but I suggest you think about it (it would mean your model portfolio has fewer than N stocks in it, since the contribution of one of the stocks would be zero).
The other thing to note is that the probability of getting the extreme values in your distribution is half of what you want it to be: if you have uniformly distributed numbers from 0 to 1000 and you round them, the numbers that round to 0 were in the interval [0, 0.5); the ones that round to 1 came from [0.5, 1.5) - twice as big. The last number (rounding to 1000) is again from a smaller interval: [999.5, 1000]. Thus you will not get the first and last numbers as often as you think. If instead of round you use floor, I think you will get the answer you expect.
EDIT
I thought about this some more, and came up with a slow but (I think) accurate method for doing this. The basic idea is this:
Think in terms of integers; rather than dividing the interval 0 - 1 in steps of 0.001, divide the interval 0 - 1000 in integer steps
If we try to divide N into m intervals, the mean size of a step should be N / m; but being integer, we would expect the intervals to be binomially distributed
This suggests an algorithm in which we choose the first interval as a binomially distributed variate with mean (N/m) - call the first value v1; then divide the remaining interval N - v1 into m-1 steps; we can do so recursively.
The following code implements this:
% random integers adding up to a definite sum
function r = randomInt(n, limit)
% returns an array of n random integers
% whose sum is limit
% calls itself recursively; slow but accurate
if n>1
v = binomialRandom(limit, 1 / n);
r = [v randomInt(n-1, limit - v)];
else
r = limit;
end
function b = binomialRandom(N, p)
b = sum(rand(1,N)<p); % slow but direct
To get 10000 instances, you run this as follows:
tic
portfolio = zeros(10000, 10);
for ii = 1:10000
portfolio(ii,:) = randomInt(10, 1000);
end
toc
This ran in 3.8 seconds on a modest machine (single thread) - of course the method for obtaining a binomially distributed random variate is the thing slowing it down; there are statistical toolboxes with more efficient functions but I don't have one. If you increase the granularity (for example, by setting limit=10000) it will slow down more since you increase the number of random number samples that are generated; with limit = 10000 the above loop took 13.3 seconds to complete.
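If you do have the Statistics Toolbox, I believe binornd is a drop-in replacement for the slow binomialRandom helper above (a sketch):
function b = binomialRandom(N, p)
b = binornd(N, p); % draws one binomial variate without summing N uniforms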
As a test, I found mean(portfolio)' and std(portfolio)' as follows (with limit=1000):
100.20 9.446
99.90 9.547
100.09 9.456
100.00 9.548
100.01 9.356
100.00 9.484
99.69 9.639
100.06 9.493
99.94 9.599
100.11 9.453
This looks like a pretty convincing "flat" distribution to me. We would expect the numbers to be binomially distributed with a mean of 100, and standard deviation of sqrt(p*(1-p)*n). In this case, p=0.1 so we expect s = 9.4868. The values I actually got were again quite close.
I realize that this is inefficient for large values of limit, and I made no attempt at efficiency. I find that clarity trumps speed when you develop something new. But for instance you could pre-compute the cumulative binomial distributions for p=1./(1:10), then do a random lookup; but if you are just going to do this once, for 100,000 instances, it will run in under a minute; unless you intend to do it many times, I wouldn't bother. But if anyone wants to improve this code I'd be happy to hear from them.
Eventually I have solved this problem!
I found a paper by two academics at Johns Hopkins University, "Sampling Uniformly From The Unit Simplex":
http://www.cs.cmu.edu/~nasmith/papers/smith+tromble.tr04.pdf
In the paper they outline how naive algorithms don't work, in a way very similar to woodchips' answer to the "Random numbers that add to 100" question. They then go on to show that the method suggested by David Schwartz can also be slightly biased, and propose a modified algorithm which appears to work.
If you want x numbers that sum to y
Sample uniformly x-1 random numbers from the range 1 to x+y-1 without replacement
Sort them
Add a zero at the beginning & x+y at the end
Difference them and subtract 1 from each value
If you want to scale them as I do, then divide by y
It took me a while to realise why this works when the original approach didn't, and it comes down to the probability of getting a zero weight (as highlighted by Floris in his answer). To get a zero weight in the original version for all but the first or last weights, two of your random numbers had to be the same; but for the first and last weights, a random number of zero or of the maximum value would also produce a zero weight, which is more likely.
In the revised algorithm, zero and the maximum number are not in the set of random choices, and a zero weight occurs only if you select two consecutive numbers, which is equally likely for every position.
I coded it up in Matlab as follows
function weights = unbiased_monkey_weights(num_simulations,num_stocks,min_increment)
scaling=1/min_increment;
sample=NaN(num_simulations,num_stocks-1);
for i=1:num_simulations
allcomb=randperm(scaling+num_stocks-1);
sample(i,:)=allcomb(1:num_stocks-1);
end
temp = [zeros(num_simulations,1),sort(sample,2),ones(num_simulations,1)*(scaling+num_stocks)];
weights = (diff(temp,[],2)-1)/scaling;
end
Obviously the loop is a bit clunky, and as I'm using the 2009 version, the randperm function only allows you to generate permutations of the whole set; despite this I can run 10,000 simulations for 1,000 numbers in 5 seconds on my clunky laptop, which is fast enough.
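On newer releases (randperm(n,k) has been available since roughly R2011b) the loop can draw only the values it needs; a sketch:
for i=1:num_simulations
sample(i,:) = randperm(scaling+num_stocks-1, num_stocks-1); % k draws, no full permutation
end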
The mean weights are now correct, and as a quick test I replicated woodchips' example of generating 3 numbers that sum to 1 with a minimum increment of 0.01%, and it also looks right.
Thank you all for your help and I hope this solution is useful to somebody else in the future
The simple answer is to use the schemes that work well with NO minimum increment, then transform the problem. As always, be careful. Some methods do NOT yield uniform sets of numbers.
Thus, suppose I want 11 numbers that sum to 100, with a constraint of a minimum increment of 5. I would first find 11 numbers that sum to 45, with no lower bound on the samples (other than zero.) I could use a tool from the file exchange for this. Simplest is to simply sample 10 numbers in the interval [0,45]. Sort them, then find the differences.
X = diff([0,sort(rand(1,10)),1]*45);
The vector X is a sample of numbers that sums to 45. But the vector Y sums to 100, with a minimum value of 5.
Y = X + 5;
Of course, this is trivially vectorized if you wish to find multiple sets of numbers with the given constraint.
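For instance, a possible vectorized form for many sets at once (a sketch, keeping the 11-numbers-summing-to-100 example):
nSets = 100000;
X = diff([zeros(nSets,1), sort(rand(nSets,10),2), ones(nSets,1)], [], 2) * 45;
Y = X + 5; % each row: 11 numbers, each at least 5, summing to 100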

Matlab fast neighborhood operation

I have a problem. I have a matrix A with integer values between 0 and 5.
for example like:
x=randi(5,10,10)
Now I want to call a filter, size 3x3, which gives me the most common value.
I have tried 2 solutions:
fun = @(z) mode(z(:));
y1 = nlfilter(x,[3 3],fun);
which takes very long...
and
y2 = colfilt(x,[3 3],'sliding',@mode);
which also takes long.
I have some really big matrices and both solutions take a long time.
Is there any faster way?
+1 to @Floris for the excellent suggestion to use hist. It's very fast. You can do a bit better though. hist is based on histc, which can be used instead. histc is a compiled function, i.e., not written in Matlab, which is why the solution is much faster.
Here's a small function that attempts to generalize what @Floris did (also, that solution returns a vector rather than the desired matrix) and achieve what you're doing with nlfilter and colfilt. It doesn't require that the input have particular dimensions and uses im2col to efficiently rearrange the data. In fact, the first three lines and the call to im2col are virtually identical to what colfilt does in your case.
function a=intmodefilt(a,nhood)
[ma,na] = size(a);
aa(ma+nhood(1)-1,na+nhood(2)-1) = 0;
aa(floor((nhood(1)-1)/2)+(1:ma),floor((nhood(2)-1)/2)+(1:na)) = a;
[~,a(:)] = max(histc(im2col(aa,nhood,'sliding'),min(a(:))-1:max(a(:))));
a = a-1;
Usage:
x = randi(5,10,10);
y3 = intmodefilt(x,[3 3]);
For large arrays, this is over 75 times faster than colfilt on my machine. Replacing hist with histc is responsible for a factor of two speedup. There is of course no input checking so the function assumes that a is all integers, etc.
Lastly, note that randi(IMAX,N,N) returns values in the range 1:IMAX, not 0:IMAX as you seem to state.
One suggestion would be to reshape your array so each 3x3 block becomes a column vector. If your initial array dimensions are divisible by 3, this is simple. If they aren't, you need to work a little bit harder. And you need to repeat this nine times, starting at different offsets into the matrix - I will leave that as an exercise.
Here is some code that shows the basic idea (using only functions available in FreeMat - I don't have Matlab on my machine at home...):
N = 100;
A = randi(0,5*ones(3*N,3*N));
B = reshape(permute(reshape(A,[3 N 3 N]),[1 3 2 4]), [ 9 N*N]);
hh = hist(B, 0:5); % histogram of each 3x3 block: bin with largest value is the mode
[mm mi] = max(hh); % mi will contain bin with largest value
figure; hist(B(:),0:5); title 'histogram of B'; % flat, as expected
figure; hist(mi-1, 0:5); title 'histogram of mi' % not flat?...
Here are the plots:
The strange thing, when you run this code, is that the distribution of mi is not flat, but skewed towards smaller values. When you inspect the histograms, you will see that this is because you will frequently have more than one bin with the "max" value in it. In that case, you get the first bin with the max number. This is obviously going to skew your results badly; something to think about. A much better filter might be a median filter - the one that has equal numbers of neighboring pixels above and below. That has a unique solution (while mode can have up to four values, for nine pixels - namely, four bins with two values each).
Something to think about.
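If the Image Processing Toolbox is available, that median-filter alternative is a one-liner (sketch, on the same x as above):
y_med = medfilt2(x, [3 3]); % 3x3 median filter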
Can't show you a mex example today (wrong computer); but there are ample good examples on the Mathworks website (and all over the web) that are quite easy to follow. See for example http://www.shawnlankton.com/2008/03/getting-started-with-mex-a-short-tutorial/