I have a population p of indices and corresponding weights in a vector w. I want to draw k samples from this population without replacement, where each selection is made at random with probability proportional to the weights.
I know that randsample can be used for selection with replacement by saying
J = randsample(p,k,true,w)
but when I call it with parameter false instead of true, I get
??? Error using ==> randsample at 184
Weighted sampling without replacement is not supported.
I wrote my own function as discussed in here:
p = 1:n;
J = zeros(1,k);
for i = 1:k
J(i) = randsample(p,1,true,w);
w(p == J(i)) = 0;
end
But since the loop runs k iterations, I am looking for a shorter/faster way to do this. Do you have any suggestions?
EDIT: I want to randomly select k unique columns of a matrix proportional to some weighting criteria. That is why I use sampling without replacement.
I don't think it is possible to avoid some sort of loop, since sampling without replacement means that the samples are no longer independent. Besides, what does the weighting actually mean when sampling without replacement?
In any case, for relatively small sample sizes I don't think you will notice any problem with performance. All the solutions I can think of basically do what you have done, but possibly expand out what is going on in randsample.
I think you should keep using the for loop, but I suggest reducing the corresponding weight by one:
w(p == J(i)) = w(p == J(i)) -1;
This still shows up in search results, so I wanted to add the datasample function as an option. The following code draws a weighted sample of 5 elements from fromVector without replacement, using the corresponding weights in myWeights.
mySample = datasample(fromVector, 5, 'Replace', false, 'Weights', myWeights)
An alternative to the for loop approach of petrichor that performs well if the number of samples is much smaller than the number of elements is to compute a weighted random sample with replacement and then remove duplicates. Of course, this is a very bad idea if the number of samples k is near the number of elements n, as this will require many iterations, but by avoiding for loops, the wall clock performance is often better. Your mileage may vary.
function I=randsample_noreplace(n,k,w)
I = sort(randsample(n, k, true, w));
while 1
Idup = find( I(2:end)-I(1:end-1) ==0);
if length(Idup) == 0
break
else
I(Idup)=randsample(n, length(Idup), true, w);
I = sort(I);
end
end
If you want to select a large fraction of the columns (i.e., k is not very much smaller than n), or if the weights are very skewed, you can use this refinement of Jeff's solution, which ensures that each call to randsample produces samples distinct from the previous ones.
Moreover, it returns the samples in the order in which true sampling without replacement would return them, rather than sorted.
function I=randsample_noreplace(n,k,w)
I = randsample(n, k, true, w);
while 1
[II, idx] = sort(I);
Idup = [false, diff(II)==0];
if ~any(Idup)
break
else
w(I) = 0; %% Don't replace samples
Idup (idx) = Idup; %% find duplicates in original list
I = [I(~Idup), (randsample(n, sum(Idup), true, w))];
end
end
When selecting 29 out of 30 values with uniform weights (the case that gives least benefit), it takes 3 or 4 iterations, compared with 26 without the additional line. If the weights are chosen uniformly, it still takes 3 to 5 iterations compared with around 80 without the additional line.
Also, the number of iterations is bounded by k, however skewed the distribution is.
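For comparison, here is a minimal loop-free sketch using a different technique from the answers above: the random-key method of Efraimidis and Spirakis. Each item gets the key rand^(1/w), and the k items with the largest keys are kept, which is equivalent to drawing one item at a time with probability proportional to the remaining weights. The weights and k below are made-up values for illustration.

```matlab
% Weighted sampling without replacement via random keys (Efraimidis-Spirakis).
w = [0.5 0.2 0 0.2 0.1];          % hypothetical weights; a zero weight is never picked
k = 3;                            % number of distinct samples wanted

% log(u)./w is a monotone transform of u.^(1./w), so it ranks the items
% identically while avoiding underflow for very small weights.
keys = log(rand(size(w))) ./ w;   % zero weights give -Inf and sort last
[~, order] = sort(keys, 'descend');
J = order(1:k);                   % indices of the k selected items
```

This needs one call to rand and one sort, so the cost is O(n log n) regardless of how skewed the weights are. It assumes at least k weights are strictly positive.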
I am doing a Monte-Carlo simulation, where each repetition requires the sum or product of a random number of random variables. My problem is how to do this efficiently as the entire simulation should be as vectorized as possible.
For example, say we want to take the sum of 5, 10 and 3 random numbers, represented by the vector len = [5;10;3]. Then what I am currently doing is drawing a full matrix of random numbers:
A = randn(length(len),max(len));
Creating a mask of the non-needed numbers:
lenlen = repmat(len,1,max(len));
idx = repmat(1:max(len),length(len),1);
mask = idx>lenlen;
and then I can "pad" the matrix. Since I am interested in the sum, the padding has to be zero (for the product, the padding would have to be 1):
A(mask)=0;
To obtain:
A =
1.7708 -1.4609 -1.5637 -0.0340 0.9796 0 0 0 0 0
1.8034 -1.5467 0.3938 0.8777 0.6813 1.0594 -0.3469 1.7472 -0.4697 -0.3635
1.5937 -0.1170 1.5629 0 0 0 0 0 0 0
Whereafter I can sum them together
B = sum(A,2);
However, I find it rather wasteful that I have to draw too many random numbers and then throw them away. In the real case, I need hundreds of thousands of repetitions, and the entries of len can vary a lot, so I can easily end up drawing two or three times as many random numbers as are needed.
You can generate the exact amount of random numbers required, create a grouping variable with repelem, and compute the sum of each group using accumarray:
len = [5; 10; 3];
B = accumarray(repelem(1:numel(len), len).', randn(sum(len),1));
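The same grouping idea extends to the product mentioned in the question, since accumarray accepts a reduction function as its fourth argument. The sketch below keeps the random draws in a variable x (a name introduced here) so that the sum and the product are computed from the same numbers.

```matlab
len = [5; 10; 3];                        % how many random numbers go into each result
g   = repelem((1:numel(len)).', len);    % group index for every random number
x   = randn(sum(len), 1);                % exactly as many draws as needed

S = accumarray(g, x);                    % per-group sum (the default reduction)
P = accumarray(g, x, [], @prod);         % per-group product
```

No padding values are needed here, because every draw belongs to exactly one group.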
You could just use arrayfun or a loop. You say "efficient" and "vectorized" in the same breath, but they are not necessarily the same thing - since the new(ish) JIT compiler, loops are pretty fast in MATLAB. arrayfun is basically a loop in disguise, but means you could create B like so:
len = [5;10;3];
B = arrayfun( @(x) sum( randn(x,1) ), len );
For each element in len, this creates a vector of length len(i) and takes the sum. The output is an array with one value for each value in len.
This will certainly be a lot more memory friendly for large values and largely different values within len. It may therefore be quicker, your mileage may vary but it cuts out a lot of the operations you're doing.
You mention wanting to take the product sometimes, in which case use prod in place of sum.
Edit: rough and ready benchmark to compare arrayfun and a loop...
len = randi([1e3, 1e7], 100, 1);
tic;
B = arrayfun( @(x) sum( randn(x,1) ), len );
toc % ~8.77 seconds
tic;
out=zeros(size(len));
for ii = 1:numel(len)
out(ii) = sum(randn(len(ii),1));
end
toc % ~8.80 seconds
The "advantage" of the loop over arrayfun is that you can pre-generate all of the random numbers in one go, then index. This isn't necessarily quicker, because you're addressing much bigger chunks of memory, and the call to randn is the main bottleneck anyway!
tic;
out = zeros(size(len));
rnd = randn(sum(len),1);
idx = [0; cumsum(len)]; % note: cumsum is very quick (~0.001sec here) so negligible
for ii = 1:numel(len)
out(ii) = sum(rnd(idx(ii)+1:idx(ii+1)),1);
end
toc % ~10.2 sec! Slower because of massive call to randn and the indexing into large array.
As stated at the top, arrayfun and looping are basically the same under the hood, so no reason to expect a big time difference.
The sum of multiple random numbers drawn from a specific distribution is also a random number with a (different) specific distribution. Therefore you can just cut the middleman and draw directly from the latter distribution.
In your case you are summing 5, 10 and 3 numbers drawn from an N(0,1) distribution. As explained here, the resulting sums are therefore distributed as N(0,5), N(0,10) and N(0,3), where the second parameter is the variance. This page explains how you can draw from non-standard normal distributions in MATLAB. We can therefore generate those numbers directly with randn(3,1).*sqrt([5; 10; 3]).
In case you would want 1000 triples, you could then use
randn(3,1000).*sqrt([5; 10; 3])
or, before MATLAB R2016b,
bsxfun(@times, randn(3,1000), sqrt([5; 10; 3]))
which is of course very fast.
Different distributions have different summation rules, but as long as you are not summing up numbers drawn from different distributions the rules are usually quite simple and found quickly with google.
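As a rough sanity check of the shortcut for the normal case, the sketch below compares the sample variance of the directly drawn values with that of explicitly summed draws. Both should come out close to 5, 10 and 3. The repetition count nRep is an arbitrary choice, and bsxfun is used so the code also runs on releases before R2016b.

```matlab
len  = [5; 10; 3];     % number of N(0,1) terms in each sum
nRep = 1e5;            % arbitrary number of Monte-Carlo repetitions

% Shortcut: one draw per sum, scaled by the standard deviation sqrt(len).
direct = bsxfun(@times, randn(numel(len), nRep), sqrt(len));

% Reference: actually draw len(ii) numbers and add them up.
summed = zeros(numel(len), nRep);
for ii = 1:numel(len)
    summed(ii, :) = sum(randn(len(ii), nRep), 1);
end

varDirect = var(direct, 0, 2);   % should be close to len
varSummed = var(summed, 0, 2);   % should be close to len as well
```

The shortcut does not carry over to the product case, because a product of normal variables is not itself normally distributed.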
You can do this using a combination of cumsum and diff. The plan is:
Create all the random numbers in a single call to randn up front
Then, use cumsum to produce a vector of cumulative summations
Use cumsum on the list of number-of-samples-per-result to work out where to read out the results
We also need diff to correct for the prior summations.
Note that this method might lose accuracy if you weren't using randn for the random samples, as cumsum would then build up arithmetic rounding errors.
% We want 100 sums of random numbers
numSamples = 100;
% Here's where we define how many random samples contribute to each sum
numRandsPerSample = randi(5, 1, numSamples);
% Let's make all the random numbers in one call
allRands = randn(1, sum(numRandsPerSample));
% Use CUMSUM to build up a cumulative sum of the whole of allRands. We also
% need a leading 0 for the first sum.
allRandsCS = [0, cumsum(allRands)];
% Use CUMSUM again to pick out the places we need to pick from
% allRandsCS
endIdxs = 1 + [0, cumsum(numRandsPerSample)];
% Use DIFF to subtract the prior sums from the result.
result = diff(allRandsCS(endIdxs))
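To see the indexing at work, here is the same sequence of steps on a tiny deterministic input, with the integers 1 to 6 standing in for the random numbers. The values are chosen only so the result can be checked by hand.

```matlab
% Three sums built from 2, 1 and 3 consecutive values respectively.
numRandsPerSample = [2 1 3];
allRands = 1:6;                                      % stand-in for randn output

allRandsCS = [0, cumsum(allRands)];                  % [0 1 3 6 10 15 21]
endIdxs    = 1 + [0, cumsum(numRandsPerSample)];     % [1 3 4 7]
result     = diff(allRandsCS(endIdxs));              % [1+2, 3, 4+5+6] = [3 3 15]
```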
I have 2 nested loops which do the following:
Get two rows of a matrix
Check if indices meet a condition or not
If they do: calculate xcorr between the two rows and put it into new vector
Find the index of the maximum value of sub vector and replace element of LAG matrix with this value
I don't know how I can speed this code up, by vectorizing or otherwise.
b=size(data,1);
F=size(data,2);
LAG= zeros(b,b);
for i=1:b
for j=1:b
if j>i
x=data(i,:);
y=data(j,:);
d=xcorr(x,y);
d=d(:,F:(2*F)-1);
[M,I] = max(d);
LAG(i,j)=I-1;
d=xcorr(y,x);
d=d(:,F:(2*F)-1);
[M,I] = max(d);
LAG(j,i)=I-1;
end
end
end
First, a note on floating point precision...
You mention in a comment that your data contains the integers 0, 1, and 2. You would therefore expect a cross-correlation to give integer results. However, since the calculation is being done in double-precision, there appears to be some floating-point error introduced. This error can cause the results to be ever so slightly larger or smaller than integer values.
Since your calculations involve looking for the location of the maxima, then you could get slightly different results if there are repeated maximal integer values with added precision errors. For example, let's say you expect the value 10 to be the maximum and appear in indices 2 and 4 of a vector d. You might calculate d one way and get d(2) = 10 and d(4) = 10.00000000000001, with some added precision error. The maximum would therefore be located in index 4. If you use a different method to calculate d, you might get d(2) = 10 and d(4) = 9.99999999999999, with the error going in the opposite direction, causing the maximum to be located in index 2.
The solution? Round your cross-correlation data first:
d = round(xcorr(x, y));
This will eliminate the floating-point errors and give you the integer results you expect.
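The effect is easy to reproduce without xcorr. In the sketch below, a hand-made vector d stands in for a cross-correlation result whose true maximum of 10 occurs twice, with a tiny error added to the second occurrence. The values are invented for illustration.

```matlab
% Stand-in for a cross-correlation result with a tied maximum of 10.
d = [3, 10, 7, 10 + 1e-14];        % 1e-14 mimics accumulated floating-point error

[~, iRaw]     = max(d);            % the error wins: index 4
[~, iRounded] = max(round(d));     % the tie is restored, max returns the first: index 2
```

Because max returns the first index among equal values, rounding makes the chosen location independent of the direction of the error.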
Now, on to the actual solutions...
Solution 1: Non-loop option
You can pass a matrix to xcorr and it will perform the cross-correlation for every pairwise combination of columns. Using this, you can forego your loops altogether like so:
d = round(xcorr(data.'));
[~, I] = max(d(F:(2*F)-1,:), [], 1);
LAG = reshape(I-1, b, b).';
Solution 2: Improved loop option
There are limits to how large data can be for the above solution, since it will produce large intermediate and output variables that can exceed the maximum array size available. In such a case for loops may be unavoidable, but you can improve upon the for-loop solution above. Specifically, you can compute the cross-correlation once for a pair (x, y), then just flip the result for the pair (y, x):
% Loop over rows:
for row = 1:b
% Loop over upper matrix triangle:
for col = (row+1):b
% Cross-correlation for upper triangle:
d = round(xcorr(data(row, :), data(col, :)));
[~, I] = max(d(:, F:(2*F)-1));
LAG(row, col) = I-1;
% Cross-correlation for lower triangle:
d = fliplr(d);
[~, I] = max(d(:, F:(2*F)-1));
LAG(col, row) = I-1;
end
end
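The flip relationship used above can be checked without the Signal Processing Toolbox. The sketch below assumes real row vectors of equal length and uses conv(x, fliplr(y)) as a stand-in for xcorr(x, y), since for that case both produce the same sequence over lags -(F-1) to F-1. The vectors x and y are made-up examples.

```matlab
% Toolbox-free check that swapping the inputs flips the cross-correlation.
x = [0 1 2 1 0 2];
y = [1 0 2 2 1 0];
F = numel(x);

dxy = conv(x, fliplr(y));          % stand-in for xcorr(x, y)
dyx = conv(y, fliplr(x));          % stand-in for xcorr(y, x)

sameAfterFlip = isequal(dyx, fliplr(dxy));   % true for real-valued data

% Non-negative lags sit in positions F to 2*F-1, as in the loop above.
[~, lagXY] = max(dxy(F:end));
[~, lagYX] = max(dyx(F:end));
lagXY = lagXY - 1;
lagYX = lagYX - 1;
```

With integer inputs the products are exact, so isequal can be used here; for non-integer data a tolerance would be the safer comparison.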
I have a 151-by-151 matrix A. It's a correlation matrix, so there are 1s on the main diagonal and repeated values above and below the main diagonal. Each row/column represents a person.
For a given integer n I will seek to reduce the size of the matrix by kicking people out, such that I am left with a n-by-n correlation matrix that minimises the total sum of the elements. In addition to obtaining the abbreviated matrix, I also need to know the row number of the people who should be booted out of the original matrix (or their column number - they'll be the same number).
As a starting point I take A = tril(A), which will remove redundant off-diagonal elements from the correlation matrix.
So, if n = 4 and we have the hypothetical 5-by-5 matrix above, it's very clear that person 5 should be kicked out of the matrix, since that person is contributing a lot of very high correlations.
It's also clear that person 1 should not be kicked out, since that person contributes a lot of negative correlations, and thus brings down the sum of the matrix elements.
I understand that sum(A(:)) will sum everything in the matrix. However, I'm very unclear about how to search for the minimum possible answer.
I noticed a similar question Finding sub-matrix with minimum elementwise sum, which has a brute force solution as the accepted answer. While that answer works fine there it's impractical for a 151-by-151 matrix.
EDIT: I had thought of iterating, but I don't think that truly minimizes the sum of elements in the reduced matrix. Below I have a 4-by-4 correlation matrix in bold, with sums of rows and columns on the edges. It's apparent that with n = 2 the optimal matrix is the 2-by-2 identity matrix involving Persons 1 and 4, but according to the iterative scheme I would have kicked out Person 1 in the first phase of iteration, and so the algorithm makes a solution that is not optimal. I wrote a program that always generated optimal solutions, and it works well when n or k are small, but when trying to make an optimal 75-by-75 matrix from a 151-by-151 matrix I realised my program would take billions of years to terminate.
I vaguely recalled that sometimes these n choose k problems can be resolved with dynamic programming approaches that avoid recomputing things, but I can't work out how to solve this, and nor did googling enlighten me.
I'm willing to sacrifice precision for speed if there's no other option, or the best program will take more than a week to generate a precise solution. However, I'm happy to let a program run for up to a week if it will generate a precise solution.
If it's not possible for a program to optimise the matrix within a reasonable timeframe, then I would accept an answer that explains why n choose k tasks of this particular sort can't be resolved within reasonable timeframes.
This is an approximate solution using a genetic algorithm.
I started with your test case:
data_points = 10; % How many data points will be generated for each person, in order to create the correlation matrix.
num_people = 25; % Number of people initially.
to_keep = 13; % Number of people to be kept in the correlation matrix.
to_drop = num_people - to_keep; % Number of people to drop from the correlation matrix.
num_comparisons = 100; % Number of times to compare the iterative and optimization techniques.
for j = 1:data_points
rand_dat(j,:) = 1 + 2.*randn(num_people,1); % Generate random data.
end
A = corr(rand_dat);
then I defined the functions you need to evolve the genetic algorithm:
function individuals = user1205901individuals(nvars, FitnessFcn, gaoptions, num_people)
individuals = zeros(num_people,gaoptions.PopulationSize);
for cnt=1:gaoptions.PopulationSize
individuals(:,cnt)=randperm(num_people);
end
individuals = individuals(1:nvars,:)';
is the individual generation function.
function fitness = user1205901fitness(ind, A)
fitness = sum(sum(A(ind,ind)));
is the fitness evaluation function
function offspring = user1205901mutations(parents, options, nvars, FitnessFcn, state, thisScore, thisPopulation, num_people)
offspring=zeros(length(parents),nvars);
for cnt=1:length(parents)
original = thisPopulation(parents(cnt),:);
extraneus = setdiff(1:num_people, original);
original(fix(rand()*nvars)+1) = extraneus(fix(rand()*(num_people-nvars))+1);
offspring(cnt,:)=original;
end
is the function to mutate an individual
function children = user1205901crossover(parents, options, nvars, FitnessFcn, unused, thisPopulation)
children=zeros(length(parents)/2,nvars);
cnt = 1;
for cnt1=1:2:length(parents)
cnt2=cnt1+1;
male = thisPopulation(parents(cnt1),:);
female = thisPopulation(parents(cnt2),:);
child = union(male, female);
child = child(randperm(length(child)));
child = child(1:nvars);
children(cnt,:)=child;
cnt = cnt + 1;
end
is the function to generate a new individual coupling two parents.
At this point you can define your problem:
gaproblem2.fitnessfcn=@(idx)user1205901fitness(idx,A)
gaproblem2.nvars = to_keep
gaproblem2.options = gaoptions()
gaproblem2.options.PopulationSize=40
gaproblem2.options.EliteCount=10
gaproblem2.options.CrossoverFraction=0.1
gaproblem2.options.StallGenLimit=inf
gaproblem2.options.CreationFcn= @(nvars,FitnessFcn,gaoptions)user1205901individuals(nvars,FitnessFcn,gaoptions,num_people)
gaproblem2.options.CrossoverFcn= @(parents,options,nvars,FitnessFcn,unused,thisPopulation)user1205901crossover(parents,options,nvars,FitnessFcn,unused,thisPopulation)
gaproblem2.options.MutationFcn=@(parents, options, nvars, FitnessFcn, state, thisScore, thisPopulation) user1205901mutations(parents, options, nvars, FitnessFcn, state, thisScore, thisPopulation, num_people)
gaproblem2.options.Vectorized='off'
open the genetic algorithm tool
gatool
from the File menu select Import Problem... and choose gaproblem2 in the window that opens.
Now, run the tool and wait for the iterations to stop.
The gatool enables you to change hundreds of parameters, so you can trade speed for precision in the selected output.
The resulting vector is the list of indices that you have to keep in the original matrix so A(garesults.x,garesults.x) is the matrix with only the desired persons.
If I have understood your problem statement, you have an N x N matrix M (which happens to be a correlation matrix), and you wish to find, for an integer n where 2 <= n < N, an n x n submatrix m that minimises the sum over all elements of m, which I denote f(m)?
In Matlab it is fairly easy and fast to obtain a sub-matrix of a matrix (see for example Removing rows and columns from matrix in Matlab), and the function f is relatively inexpensive to evaluate for n = 151. So why can't you implement an algorithm that solves this backwards dynamically in a program as below where I have sketched out the pseudocode:
function reduceM(M, n){
m = M
for (ii = N to n+1) {
for (jj = 1 to ii) {
val(jj) = f(m) where m has column and row jj removed, f(X) being summation over all elements of X
}
JJ(ii) = jj s.t. val(jj) is smallest
m = m updated by removing column and row JJ(ii)
}
}
In the end you end up with an m of dimension n which is the solution to your problem and a vector JJ which contains the indices removed at each iteration (you should easily be able to convert these back to indices applicable to the full matrix M)
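One possible MATLAB rendering of this pseudocode is sketched below. The 4-by-4 matrix M is made up for illustration, and the names kept, removed and drop are introduced here. Instead of re-summing every candidate submatrix, it uses the fact that removing person j lowers the total by that person's row sum plus column sum minus the diagonal entry. As the question's edit points out, this greedy scheme is not guaranteed to find the optimal subset.

```matlab
% Greedy backward elimination on a small made-up symmetric matrix.
M = [1.0 0.9 0.1 0.8
     0.9 1.0 0.2 0.6
     0.1 0.2 1.0 0.0
     0.8 0.6 0.0 1.0];
n = 2;                                   % size of the matrix to end up with

kept    = 1:size(M, 1);                  % original indices still in the matrix
removed = zeros(1, numel(kept) - n);     % original indices, in order of removal
for step = 1:numel(removed)
    m = M(kept, kept);
    % Amount by which the total sum drops if each person is removed.
    drop = sum(m, 1).' + sum(m, 2) - diag(m);
    [~, j] = max(drop);                  % remove the largest contributor
    removed(step) = kept(j);
    kept(j) = [];
end
fGreedy = sum(sum(M(kept, kept)));       % total of the reduced matrix
```

Tracking kept as a list of original indices avoids having to translate positions back to the full matrix afterwards.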
There are several approaches to finding an approximate solution (eg. quadratic programming on relaxed problem or greedy search), but finding the exact solution is an NP-hard problem.
Disclaimer: I'm not an expert on binary quadratic programming, and you may want to consult the academic literature for more sophisticated algorithms.
Mathematically equivalent formulation:
Your problem is equivalent to:
For some symmetric, positive semi-definite matrix S
minimize (over vector x) x'*S*x
subject to 0 <= x(i) <= 1 for all i
sum(x)==n
x(i) is either 1 or 0 for all i
This is a quadratic programming problem where the vector x is restricted to taking only binary values. Quadratic programming where the domain is restricted to a set of discrete values is called mixed integer quadratic programming (MIQP). The binary version is sometimes called Binary Quadratic Programming (BQP). The last restriction, that x is binary, makes the problem substantially more difficult; it destroys the problem's convexity!
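To make the formulation concrete, the sketch below enumerates every binary x with sum(x) == n for a tiny made-up matrix and evaluates x'*S*x for each. The matrix S, the size n and the variable names are assumptions for illustration only. This kind of enumeration is feasible only for very small problems, since the number of candidates grows as N choose n.

```matlab
% Brute-force evaluation of the binary quadratic objective on a toy problem.
S = [1.0 0.9 0.1 0.8
     0.9 1.0 0.2 0.6
     0.1 0.2 1.0 0.0
     0.8 0.6 0.0 1.0];
n = 2;                                    % number of people to keep
N = size(S, 1);

combos = nchoosek(1:N, n);                % every way of choosing n people
f = zeros(size(combos, 1), 1);
for c = 1:size(combos, 1)
    x = zeros(N, 1);
    x(combos(c, :)) = 1;                  % binary indicator of who is kept
    f(c) = x' * S * x;                    % equals sum(sum(S(keep, keep)))
end
[fBest, cBest] = min(f);
keepBest = combos(cBest, :);              % optimal subset for this toy matrix
```

For this matrix the minimum is attained by keeping persons 3 and 4, whose 2-by-2 submatrix is the identity.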
Quick and dirty approach to finding an approximate answer:
If you don't need a precise solution, something to play around with might be a relaxed version of the problem: drop the binary constraint. If you drop the constraint that x(i) is either 1 or 0 for all i, then the problem becomes a trivial convex optimization problem and can be solved nearly instantaneously (eg. by Matlab's quadprog). You could try removing entries that, on the relaxed problem, quadprog assigns the lowest values in the x vector, but this does not truly solve the original problem!
Note also that the relaxed problem gives you a lower bound on the optimal value of the original problem. If your discretized version of the solution to the relaxed problem leads to a value for the objective function close to the lower bound, there may be a sense in which this ad-hoc solution can't be that far off from the true solution.
To solve the relaxed problem, you might try something like:
% k is number of observations to drop
n = size(S, 1);
Aeq = ones(1,n)
beq = n-k;
[x_relax, f_relax] = quadprog(S, zeros(n, 1), [], [], Aeq, beq, zeros(n, 1), ones(n, 1));
f_relax = f_relax * 2; % Quadprog solves .5 * x' * S * x... so mult by 2
temp = sort(x_relax);
cutoff = temp(k);
x_approx = ones(n, 1);
x_approx(x_relax <= cutoff) = 0;
f_approx = x_approx' * S * x_approx;
I'm curious how good x_approx is? This doesn't solve your problem, but it might not be horrible! Note that f_relax is a lower bound on the solution to the original problem.
Software to solve your exact problem
You should check out this link and go down to the section on Mixed Integer Quadratic Programming (MIQP). It looks to me that Gurobi can solve problems of your type. Another list of solvers is here.
Working on a suggestion from Matthew Gunn and also some advice at the Gurobi forums, I came up with the following function. It seems to work pretty well.
I will award it the answer, but if someone can come up with code that works better I'll remove the tick from this answer and place it on their answer instead.
function [ values ] = the_optimal_method( CM , num_to_keep)
%the_optimal_method Takes correlation matrix CM and the number of people to keep; returns a binary vector with 1 for each person kept and 0 for each person kicked out
N = size(CM,1);
clear model;
names = strseq('x',[1:N]);
model.varnames = names;
model.Q = sparse(CM); % Gurobi needs a sparse matrix as input
model.A = sparse(ones(1,N));
model.obj = zeros(1,N);
model.rhs = num_to_keep;
model.sense = '=';
model.vtype = 'B';
gurobi_write(model, 'qp.mps');
results = gurobi(model);
values = results.x;
end
Let us say I have the following:
M = randn(10,20);
T = randn(1,20);
I would like to threshold each column of M by the corresponding entry of T. For example, find the indices of all elements of M(:,1) that are greater than T(1), the indices of all elements of M(:,2) that are greater than T(2), and so on.
Of course, I would like to do this without a for-loop. Is this possible?
You can use bsxfun like this:
I = bsxfun(@gt, M, T);
Then I will be a logical matrix of size(M) with ones where M(:,i) > T(i).
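Since the question asks for indices rather than a mask, a short sketch of the extra step follows. It applies find to the logical matrix to get row and column subscripts. The names rowIdx and colIdx are introduced here for illustration.

```matlab
M = randn(10, 20);
T = randn(1, 20);

I = bsxfun(@gt, M, T);            % logical mask, true where M(:,i) > T(i)
[rowIdx, colIdx] = find(I);       % subscripts of every element above its threshold

% Rows exceeding the threshold in one particular column, e.g. column 2:
rowsInCol2 = rowIdx(colIdx == 2);
```

On R2016b and later, implicit expansion allows the mask to be written simply as M > T.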
You can use bsxfun to do things like this, but it may not be faster than a for loop (more below on this).
result = bsxfun(@gt,M,T)
This will do an element wise comparison and return you a logical matrix indicating the relationship governed by the first argument. I have posted code below to show the direct comparison, indicating that it does return what you are looking for.
%var declaration
M = randn(10,20);
T = randn(1,20);
% quick method
fastres = bsxfun(@gt,M,T);
% looping method
res = false(size(M));
for i = 1:length(T)
res(:,i) = M(:,i) > T(i);
end
% check to see if the two matrices are identical
isMatch = all(all(fastres == res))
This function is very powerful and can be used to help speed up processes, but keep in mind that it will only speed things up if there is a lot of data. There is a bit of background work that bsxfun must do, which can actually cause it to be slower.
I would only recommend using it if you have several thousand data points. Otherwise, the traditional for-loop will actually be faster. Try it out for yourself by changing the size of the M and T variables.
You can replicate the threshold vector and use matrix comparison:
s=size(M);
T2=repmat(T, s(1), 1);
M(M<T2)=0;
Indexes=find(M);
Context: I'm working on Project Euler Problem 23 using Matlab in order to practice my barely existing programming skills.
My Problem:
Now I have a vector with roughly 6500 numbers (ranging from 12 to 28122) as elements and want to calculate all the two element sums. That is I only need one instance of every sum, so having calculated a1 + an it's not necessary to calculate an + a1.
Edit for clarification: This includes the sums a1+a1, a2+a2,..., an+an.
The problem is that this is much too slow.
Problem specific constraints:
It's a given that sums 28123 or over aren't necessary to calculate, since those can't be used to solve the problem further.
My approach:
AbundentNumberSumsRaw=[];
for i=1:3490
AbundentNumberSumsRaw=[AbundentNumberSumsRaw AbundentNumbers(i)+AbundentNumbers(i:end)];
end
This works terribly :p
My Comments:
I'm pretty sure that incrementally growing the vector AbundentNumberSumsRaw is bad coding, since memory usage will spike unnecessarily. I haven't pre-allocated because a) I don't know what size of vector to pre-allocate and b) I couldn't come up with a way to insert the sums into AbundentNumberSumsRaw in an orderly manner without using some ugly-looking nested loops.
"for i=1:3490" stops short of the number of elements simply because I checked and saw that all the resulting sums for numbers whose index is above 3490 would be too large for me to use anyway.
I'm pretty sure my main issue is that the program needs to grow the vector AbundentNumberSumsRaw incrementally many times.
Any and all help and suggestions would be much appreciated :)
Cheers
Rasmus
Suppose
a = 28110*rand(6500,1)+12;
then
sums = [
a(1) + a(1:end)
a(2) + a(2:end)
...
];
is the calculation you're after.
You also state that sums whose value goes over 28123 should be discarded.
This can be generalized like so:
% Compute all 2-element sums without repetitions
C = arrayfun(#(x) a(x)+a(x:end), 1:numel(a), 'uniformoutput', false);
C = cat(1, C{:});
% discard sums exceeding threshold
C(C>28123) = [];
or using a loop
% Compute all 2-element sums without repetitions
E = cell(numel(a),1);
for ii = 1:numel(a)
E{ii} = a(ii)+a(ii:end);
end
E = cat(1, E{:});
% discard sums exceeding threshold
E(E>28123) = [];
Simple testing shows that arrayfun is somewhat faster than the loop, so I'd go for the arrayfun option.
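On the pre-allocation question raised by the asker: for n elements there are exactly n*(n+1)/2 sums when each pair is counted once and the a(i)+a(i) terms are included, so the output can be sized up front. The sketch below uses a small made-up vector a and a running write position pos, both of which are illustrative names.

```matlab
a = (12:6:60).';                  % small made-up input, 9 elements
n = numel(a);

sums = zeros(n*(n+1)/2, 1);       % exact number of two-element sums
pos  = 0;                         % last position written so far
for ii = 1:n
    cnt = n - ii + 1;             % a(ii) is paired with a(ii), ..., a(n)
    sums(pos+1 : pos+cnt) = a(ii) + a(ii:end);
    pos = pos + cnt;
end
```

For 6500 elements this is about 21 million doubles, roughly 170 MB, so discarding sums above the threshold inside the loop may be worth considering.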
As your primary problem is to find out, which integers in a given set can be written as the sum of two integers of a different set, I'd choose a different approach:
AbundantNumbers = 1:6500; % replace with the list you generated somewhere else
maxInteger = 28122;
AbundantNumberSum(1:maxInteger) = true; % logical array
for i = 1:length(AbundantNumbers)
sumIndices = AbundantNumbers(i) + AbundantNumbers;
AbundantNumberSum(sumIndices(sumIndices <= maxInteger)) = false;
end
Unfortunately, this is not an answer to your question but to your problem ;-) For the MATLAB way to solve your original question, see the elegant answer of Rody Oldenhuis.
My approach would be the following:
v = 1:3490; % your vector here
s = length(v);
result = zeros(s); % preallocate memory
for m = 1:s
result(m,m:end) = v(m)+v(m:end);
end
You will get a matrix of 3490 x 3490 elements and more than half of them 0.
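If only the sums themselves are needed, the zeros can be skipped by building the full table of pairwise sums and reading out its upper triangle with a logical mask. This is a sketch on a small made-up vector; for the full-size input the n-by-n intermediate may be too large to be practical.

```matlab
v = [12 18 20 24];                   % small made-up input
n = numel(v);

S     = bsxfun(@plus, v.', v);       % S(i,j) = v(i) + v(j)
upper = triu(true(n));               % mask for i <= j, diagonal included
sums  = S(upper);                    % n*(n+1)/2 sums, one per unordered pair
```

The values come out in column-major order, so sums starts with v(1)+v(1) and ends with v(n)+v(n).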