Can I do the following fast in Matlab? - matlab

I have three matrices in Matlab: A, which is n x m, B, which is p x n, and C, which is r x m.
Say we initialize some matrices using:
A = rand(3,4);
B = rand(2,3);
C = rand(5,4);
The following two are equivalent:
>> s=0;
>> for i=1:3
for j=1:4
s = s + A(i,j)*B(:,i)*C(:,j)';
end;
end
>> s
s =
2.6823 2.2440 3.5056 2.0856 2.1551
2.0656 1.7310 2.6550 1.5767 1.6457
>> B*A*C'
ans =
2.6823 2.2440 3.5056 2.0856 2.1551
2.0656 1.7310 2.6550 1.5767 1.6457
The latter is much more efficient.
I can't find any efficient version for the following variant of the loop:
s=0;
for i=1:3
for j=1:4
x = A(i,j)*B(:,i)*C(:,j)';
s = s + x/sum(sum(x));
end;
end
Here, the matrices being added are normalized by the sum of their values after each step.
Any ideas how to make this efficient like the matrix multiplication above? I thought maybe accumarray could help, but not sure how.

You can do it efficiently with bsxfun:
aux1 = bsxfun(@times, permute(B,[1 3 2]), permute(C,[3 1 4 2]));
aux2 = sum(sum(aux1,1),2);
s = sum(sum(bsxfun(@rdivide, aux1, aux2),3),4);
Note that, because of the normalization, the result is independent of A, assuming it doesn't contain any zero entries (if it does the result is undefined).
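As a quick sanity check, here is a minimal sketch that compares the bsxfun expression with the double loop from the question. The small deterministic matrices and the names s_loop and s_vec are only illustrative assumptions; any A without zero entries should behave the same way.

```matlab
% Sketch: check the bsxfun version against the explicit double loop.
A = reshape(1:12, 3, 4);     % n x m, no zero entries
B = reshape(1:6, 2, 3);      % p x n
C = reshape(1:20, 5, 4);     % r x m

% Reference: normalized outer products accumulated in a loop
s_loop = zeros(size(B,1), size(C,1));
for i = 1:size(A,1)
    for j = 1:size(A,2)
        x = A(i,j) * B(:,i) * C(:,j)';
        s_loop = s_loop + x / sum(x(:));
    end
end

% Vectorized: p x r x n x m array of outer products, normalized per (i,j)
aux1 = bsxfun(@times, permute(B,[1 3 2]), permute(C,[3 1 4 2]));
aux2 = sum(sum(aux1,1),2);
s_vec = sum(sum(bsxfun(@rdivide, aux1, aux2),3),4);

disp(max(abs(s_loop(:) - s_vec(:))))   % should be at round-off level
```

Since every normalized term sums to 1, the entries of the result add up to n*m, which is another easy check.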

Related

3D matrix multiplication by 2D matrix

I have a NxNx4 matrix(A) and a 4x4 matrix (B). I need to multiply the vector a composed by the four elements of the first matrix A, let's say
a = A(1,1,1)
A(1,1,2)
A(1,1,3)
A(1,1,4)
by the 4x4 matrix B but I'm not sure if there is a faster and clever solution than using a for loop to build the vector a. Does exist a way to do this computation with few lines of code?
I built A like
A(:,:,1) = rand(20);
A(:,:,2) = rand(20);
A(:,:,3) = rand(20);
A(:,:,4) = rand(20);
and the matrix B
B = rand(4);
now I want to multiply B with
B*[A(1,1,1);A(1,1,2);A(1,1,3);A(1,1,4)]
This, for each element of A
B*[A(1,2,1);A(1,2,2);A(1,2,3);A(1,2,4)]
B*[A(1,3,1);A(1,3,2);A(1,3,3);A(1,3,4)]
...
You can do this with a simple loop. Note that loops aren't necessarily slow in newer MATLAB versions, though mileage may vary.
Loops also have the advantage of readability; it's extremely clear what's happening here:
% For matrix A of size N*N*4
C = zeros( size( A ) );
for ii = 1:N
for jj = 1:N
C( ii, jj, : ) = B * reshape( A( ii, jj, : ), [], 1 );
end
end
A loop solution that performs well, especially when N is large:
s = size(A, 3);
C = A(:,:,1) .* reshape(B(:,1),1,1,[]);
for k = 2:s
C = C + A(:,:,k) .* reshape(B(:,k),1,1,[]);
end
I think this does what you want:
C = permute(sum(bsxfun(@times, permute(B, [3 4 2 1]), A), 3), [1 2 4 3]);
Check:
>> C(1,2,:)
ans(:,:,1) =
1.501739582138850
ans(:,:,2) =
1.399465238902816
ans(:,:,3) =
0.715531734553844
ans(:,:,4) =
1.617019921519029
>> B*[A(1,2,1);A(1,2,2);A(1,2,3);A(1,2,4)]
ans =
1.501739582138850
1.399465238902816
0.715531734553844
1.617019921519029
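Another option, not taken from the answers above, is to flatten the first two dimensions and use a single matrix product. This is only a sketch under the question's assumptions (A is N x N x 4, B is 4 x 4); the variable ref is just a spot check.

```matlab
% Sketch: apply B to every A(i,j,:) with one matrix product.
N = 20;
A = rand(N, N, 4);
B = rand(4);

% Rows of reshape(A,[],4) are the vectors [A(i,j,1) ... A(i,j,4)].
% Right-multiplying by B.' applies B to each of them.
C = reshape(reshape(A, [], 4) * B.', size(A));

% Spot check against the direct product for one (i,j) pair
ref = B * squeeze(A(1,2,:));
disp(max(abs(squeeze(C(1,2,:)) - ref)))   % round-off level
```

This hands the work to the built-in matrix multiplication, so no intermediate 4-D array is created.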

How to write this matrix in matlab,

I want to write this matrix in matlab,
s = [ 0          0    ...       0
      B          0    ...       0
      AB         B    ...       0
      .          .    .
      .          .      .
      .          .        .     0
      A^(n-1)*B  ...  AB        B ]
I tried the code below, but it gives an error:
N = 50;
A=[2 3;4 1];
B=[3 ;2];
[nx,ny] = size(A);
s(nx,ny,N) = 0;
for n=1:1:N
s(:,:,n)=A.^n;
end
s_x=cat(3, eye(size(A)) ,s);
for ii=1:1:N-1
su(:,:,ii)=(A.^ii).*B ;
end
z= zeros(1,60,1);
su1 = [z;su] ;
s_u=repmat(su1,N);
It seems the concatenation of the matrices is not being done.
I am a beginner and am having serious trouble, so please help.
Use cell arrays and the answer to your previous question
A = [2 3; 4 1];
B = [3 ;2 ];
N = 60;
[cs{1:(N+1),1:N}] = deal( zeros(size(B)) ); %// allocate space, setting top triangle to zero
%// work on diagonals
x = B;
for ii=2:(N+1)
[cs{ii:(N+2):((N+1)*(N+2-ii))}] = deal(x); %//deal to diagonal
x = A*x;
end
s = cell2mat(cs); %// convert cells to a single matrix
For more information you can read about deal and cell2mat.
Important note about the difference between matrix operations and element-wise operations
In your question (and in your previous one) you confuse between matrix power: A^2 and element-wise operation A.^2:
matrix power A^2 = [16 9;12 13] is the matrix product of A*A
element-wise power A.^2 takes each element separately and computes its square: A.^2 = [4 9; 16 1]
In your question you ask about the matrix product A*B, but the code you wrote uses A.*B, which is an element-by-element product. This gives you an error since the size of A and the size of B are not the same (at least in releases without implicit expansion, i.e. before R2016b).
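A small sketch using the matrices from the question makes the distinction concrete. The variable names P_matrix, P_elem and AB are only illustrative.

```matlab
A = [2 3; 4 1];
B = [3; 2];

P_matrix = A^2;      % matrix power, same as A*A  -> [16 9; 12 13]
P_elem   = A.^2;     % element-wise square        -> [4 9; 16 1]
AB       = A*B;      % matrix-vector product      -> [12; 14]
```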
I often find that Matlab gives itself to a coding approach of "write what it says in the equation". That also leads to code that is easy to read...
A = [2 3; 4 1];
B = [3; 2];
Q = 4;
%// don't need to...
s = [];
%// ...but better to pre-allocate s for performance
s = zeros((Q+1)*2, Q);
X = B;
for m = 2:Q+1
for n = m:Q+1
s(n*2+(-1:0), n-m+1) = X;
end
X = A * X;
end

Matlab calculate the product of an expression

I'm basically trying to find the product of an expression that goes like this:
(x-(N-1)/2).....(x+(N-1)/2) for even value of N
x is a value that I will set at the beginning that changes too but that is a different problem...
let's say for the sake of argument that for now x is a constant (ex x=1)
example for N=6
(x-5/2)(x-3/2)(x-1/2)(x+1/2)(x+3/2)(x+5/2)
the idea was to create a row vector every element of which is each individual term (P(1)=x-5/2) (P(2)=x-3/2)...etc and then calculate its product
N=6;
x=1;
P=ones(1,N);
for k=(-N-1)/2:(N-1)/2
for n=1:N
P(n)=(x-k);
end
end
y=prod(P);
Instead, this creates a vector that takes only the first value of the expression and then
repeats the same value in each cell.
there is obviously a fundamental problem with my loop but I just can't see it.
So if anyone can help with that OR suggest a better way to calculate the product I would be grateful.
Use vectorized commands
Why use a loop when you can use vectorized commands like prod?
y = prod((2 * x + (-N + 1 : 2 : N - 1)) / 2);
For convenience, you may want to define an anonymous function for it:
f = @(N,x) reshape(prod(bsxfun(@plus, 2 * x(:), -N + 1 : 2 : N - 1) / 2, 2), size(x));
Note that the function is compatible with a (row or column) vector input x.
Tests in MATLAB's Command Window
>> f(6, [-2:2]')
ans =
-14.7656
4.9219
-3.5156
4.9219
-14.7656
>> f(6, [-2:2])
ans =
-14.7656 4.9219 -3.5156 4.9219 -14.7656
Benchmark
Here is a comparison of rayreng's approach versus mine. The former emerges as the clear winner... :'( ...at least as N increases.
[Figure: varying N, fixed x]
[Figure: fixed N (= 10), vector x of varying length]
[Figure: fixed N (= 100), vector x of varying length]
Benchmark code
function benchmark
% varying N, fixed x
clear all
n = logspace(2,4,20)';
x = rand(1000,1);
tr = zeros(size(n));
tj = tr;
for k = 1 : numel(n)
% rayreng's approach (poly/polyval)
fr = @() rayreng(n(k), x);
tr(k) = timeit(fr);
% Jubobs's approach (prod/reshape/bsxfun)
fj = @() jubobs(n(k), x);
tj(k) = timeit(fj);
end
figure
hold on
plot(n, tr, 'bo')
plot(n, tj, 'ro')
hold off
xlabel('N')
ylabel('time (s)')
legend('rayreng', 'jubobs')
end
function y = jubobs(N,x)
y = reshape(prod(bsxfun(@plus,...
2 * x(:),...
-N + 1 : 2 : N - 1) / 2,...
2),...
size(x));
end
function y = rayreng(N, x)
p = poly(linspace(-(N-1)/2, (N-1)/2, N));
y = polyval(p, x);
end
function benchmark2
% fixed N, varying x
clear all
n = 100;
nx = round(logspace(2,4,20));
tr = zeros(size(nx)); % preallocate one timing per vector length
tj = tr;
for k = 1 : numel(nx)
disp(k)
x = rand(nx(k), 1);
% rayreng's approach (poly/polyval)
fr = @() rayreng(n, x);
tr(k) = timeit(fr);
% Jubobs's approach (prod/reshape/bsxfun)
fj = @() jubobs(n, x);
tj(k) = timeit(fj);
end
figure
hold on
plot(nx, tr, 'bo')
plot(nx, tj, 'ro')
hold off
xlabel('number of elements in vector x')
ylabel('time (s)')
legend('rayreng', 'jubobs')
title(['n = ' num2str(n)])
end
function y = jubobs(N,x)
y = reshape(prod(bsxfun(@plus,...
2 * x(:),...
-N + 1 : 2 : N - 1) / 2,...
2),...
size(x));
end
function y = rayreng(N, x)
p = poly(linspace(-(N-1)/2, (N-1)/2, N));
y = polyval(p, x);
end
An alternative
Alternatively, because the terms in your product form an arithmetic progression (each term is greater than the previous one by 1), you can use the formula for the product of an arithmetic progression.
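The answer does not spell the formula out. One common form, assumed here, writes the product of N terms a, a+1, ..., a+N-1 as gamma(a+N)/gamma(a), with a = x-(N-1)/2. The sketch below compares it with the direct product; it assumes a is not zero or a negative integer (true for integer x and even N) and that N is small enough for gamma not to overflow.

```matlab
% Sketch: product of the arithmetic progression via the gamma function.
N = 6;
x = -2:2;                       % integer x keeps a away from the poles of gamma
a = x - (N-1)/2;                % first term of each progression
y_gamma = gamma(a + N) ./ gamma(a);

% Direct product for comparison: rows of the 5 x N array are x - k
y_prod = prod(bsxfun(@minus, x(:), -(N-1)/2 : (N-1)/2), 2).';
disp([y_gamma; y_prod])
```

For larger N the direct product or poly/polyval is likely the safer choice, since gamma grows very quickly.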
I agree with @Jubobs in that you should avoid using for loops for this kind of computation. There are cases where for loops perform fast, but for something as simple as this, avoid using loops if possible.
An alternative approach to what Jubobs has suggested is that you can consider that polynomial equation to be in factored form where each factor denotes a root located at that particular location. You can use poly to convert these factors into a polynomial equation, then use polyval to evaluate the expression at the point you want. First, generate your roots by linspace where the points vary from -(N-1)/2 to (N-1)/2 and there are N of them, then plug this into poly. Finally, for any values of x, put this into polyval with the output of poly. The advantage of this approach is that you can evaluate multiple points of x in a single sweep.
Going with what you have, you would simply do this:
p = poly(linspace(-(N-1)/2, (N-1)/2, N));
out = polyval(p, x);
With your example, supposing that N = 6, this would be the output of the first line:
p =
1.0000 0 -8.7500 0 16.1875 0 -3.5156
As such, this is saying that when we expand out (x-5/2)(x-3/2)(x-1/2)(x+1/2)(x+3/2)(x+5/2), we get:
x^6 - 8.75x^4 + 16.1875x^2 - 3.5156
If we take a look at the roots of this equation, this is what we get:
r = roots(p)
r =
-2.5000
2.5000
-1.5000
1.5000
-0.5000
0.5000
As you can see, each term corresponds to one factor in your polynomial equation, so we do have the right mindset here. Now, all you have to do is use p with your values of x into polyval to obtain your results. For example, if I wanted to evaluate that polynomial from -2 <= x <= 2 where x is an integer, this is the result I get:
polyval(p, -2:2)
ans =
-14.7656 4.9219 -3.5156 4.9219 -14.7656
Therefore, when x = -2, the result is -14.7656 and so on.
Though I would recommend the solution by @Jubobs, it is also good to check what the issue is with your loop.
The first indication that something is wrong, is that you have a nested loop over 2 variables, and only index with one of them to store the result. Probably you just need a single loop.
Here is a loop that you may be interested in that should do roughly what you need:
N=6;
x=1;
k=-(N-1)/2:(N-1)/2; % note the sign: -(N-1)/2, not (-N-1)/2
P = ones(size(k));
for n=1:numel(k)
P(n)=(x-k(n));
end
y=prod(P);
I tried to keep the code close to the original, so hopefully it is easy to understand.

matlab/octave - Generalized matrix multiplication

I would like to do a function to generalize matrix multiplication. Basically, it should be able to do the standard matrix multiplication, but it should allow to change the two binary operators product/sum by any other function.
The goal is to be as efficient as possible, both in terms of CPU and memory. Of course, it will always be less efficient than A*B, but the operators flexibility is the point here.
Here are a few commands I could come up after reading various interesting threads:
A = randi(10, 2, 3);
B = randi(10, 3, 4);
% 1st method
C = sum(bsxfun(@mtimes, permute(A,[1 3 2]),permute(B,[3 2 1])), 3)
% Alternative: C = bsxfun(@(a,b) mtimes(a',b), A', permute(B, [1 3 2]))
% 2nd method
C = sum(bsxfun(@(a,b) a*b, permute(A,[1 3 2]),permute(B,[3 2 1])), 3)
% 3rd method (Octave-only)
C = sum(permute(A, [1 3 2]) .* permute(B, [3 2 1]), 3)
% 4th method (Octave-only): multiply nxm A with nx1xd B to create a nxmxd array
C = bsxfun(@(a, b) sum(times(a,b)), A', permute(B, [1 3 2]));
C = C2 = squeeze(C(1,:,:)); % sum and turn into mxd
The problem with methods 1-3 is that they generate n matrices before collapsing them using sum(). Method 4 is better because it does the sum() inside the bsxfun, but bsxfun still generates n matrices (except that they are mostly empty, containing only a vector of non-zero values holding the sums; the rest is filled with 0 to match the dimension requirements).
What I would like is something like the 4th method but without the useless 0 to spare memory.
Any idea?
Here is a slightly more polished version of the solution you posted, with some small improvements.
We check if we have more rows than columns or the other way around, and then do the multiplication accordingly by choosing either to multiply rows with matrices or matrices with columns (thus doing the least amount of loop iterations).
Note: This may not always be the best strategy (going by rows instead of by columns) even if there are less rows than columns; the fact that MATLAB arrays are stored in a column-major order in memory makes it more efficient to slice by columns, as the elements are stored consecutively. Whereas accessing rows involves traversing elements by strides (which is not cache-friendly -- think spatial locality).
Other than that, the code should handle double/single, real/complex, full/sparse (and errors where it is not a possible combination). It also respects empty matrices and zero-dimensions.
function C = my_mtimes(A, B, outFcn, inFcn)
% default arguments
if nargin < 4, inFcn = @times; end
if nargin < 3, outFcn = @sum; end
% check valid input
assert(ismatrix(A) && ismatrix(B), 'Inputs must be 2D matrices.');
assert(isequal(size(A,2),size(B,1)),'Inner matrix dimensions must agree.');
assert(isa(inFcn,'function_handle') && isa(outFcn,'function_handle'), ...
'Expecting function handles.')
% preallocate output matrix
M = size(A,1);
N = size(B,2);
if issparse(A)
args = {'like',A};
elseif issparse(B)
args = {'like',B};
else
args = {superiorfloat(A,B)};
end
C = zeros(M,N, args{:});
% compute matrix multiplication
% http://en.wikipedia.org/wiki/Matrix_multiplication#Inner_product
if M < N
% concatenation of products of row vectors with matrices
% A*B = [a_1*B ; a_2*B ; ... ; a_m*B]
for m=1:M
%C(m,:) = A(m,:) * B;
%C(m,:) = sum(bsxfun(@times, A(m,:)', B), 1);
C(m,:) = outFcn(bsxfun(inFcn, A(m,:)', B), 1);
end
else
% concatenation of products of matrices with column vectors
% A*B = [A*b_1 , A*b_2 , ... , A*b_n]
for n=1:N
%C(:,n) = A * B(:,n);
%C(:,n) = sum(bsxfun(@times, A, B(:,n)'), 2);
C(:,n) = outFcn(bsxfun(inFcn, A, B(:,n)'), 2);
end
end
end
Comparison
The function is no doubt slower throughout, but for larger sizes it is orders of magnitude worse than the built-in matrix-multiplication:
(tic/toc times in seconds)
(tested in R2014a on Windows 8)
size    mtimes       my_mtimes
____    _________    _________
 400    0.0026398    0.20282
 600    0.012039     0.68471
 800    0.014571     1.6922
1000    0.026645     3.5107
2000    0.20204      28.76
4000    1.5578       221.51
Here is the test code:
sz = [10:10:100 200:200:1000 2000 4000];
t = zeros(numel(sz),2);
for i=1:numel(sz)
n = sz(i); disp(n)
A = rand(n,n);
B = rand(n,n);
tic
C = A*B;
t(i,1) = toc;
tic
D = my_mtimes(A,B);
t(i,2) = toc;
assert(norm(C-D) < 1e-6)
clear A B C D
end
semilogy(sz, t*1000, '.-')
legend({'mtimes','my_mtimes'}, 'Interpreter','none', 'Location','NorthWest')
xlabel('Size N'), ylabel('Time [msec]'), title('Matrix Multiplication')
axis tight
Extra
For completeness, below are two more naive ways to implement the generalized matrix multiplication (if you want to compare the performance, replace the last part of the my_mtimes function with either of these). I'm not even gonna bother posting their elapsed times :)
C = zeros(M,N, args{:});
for m=1:M
for n=1:N
%C(m,n) = A(m,:) * B(:,n);
%C(m,n) = sum(bsxfun(@times, A(m,:)', B(:,n)));
C(m,n) = outFcn(bsxfun(inFcn, A(m,:)', B(:,n)));
end
end
And another way (with a triple-loop):
C = zeros(M,N, args{:});
P = size(A,2); % = size(B,1);
for m=1:M
for n=1:N
for p=1:P
%C(m,n) = C(m,n) + A(m,p)*B(p,n);
%C(m,n) = plus(C(m,n), times(A(m,p),B(p,n)));
C(m,n) = outFcn([C(m,n) inFcn(A(m,p),B(p,n))]);
end
end
end
What to try next?
If you want to squeeze out more performance, you're gonna have to move to a C/C++ MEX-file to cut down on the overhead of interpreted MATLAB code. You can still take advantage of optimized BLAS/LAPACK routines by calling them from MEX-files (see the second part of this post for an example). MATLAB ships with Intel MKL library which frankly you cannot beat when it comes to linear algebra computations on Intel processors.
Others have already mentioned a couple of submissions on the File Exchange that implement general-purpose matrix routines as MEX-files (see @natan's answer). Those are especially effective if you link them against an optimized BLAS library.
Why not just exploit bsxfun's ability to accept an arbitrary function?
C = shiftdim(feval(f, (bsxfun(g, A.', permute(B,[1 3 2])))), 1);
Here
f is the outer function (corresponding to sum in the matrix-multiplication case). It should accept a 3D array of arbitrary size mxnxp and operate along its columns to return a 1xnxp array.
g is the inner function (corresponding to product in the matrix-multiplication case). As per bsxfun, it should accept as input either two column vectors of the same size, or one column vector and one scalar, and return as output a column vector of the same size as the input(s).
This works in Matlab. I haven't tested in Octave.
Example 1: Matrix-multiplication:
>> f = @sum; %// outer function: sum
>> g = @times; %// inner function: product
>> A = [1 2 3; 4 5 6];
>> B = [10 11; -12 -13; 14 15];
>> C = shiftdim(feval(f, (bsxfun(g, A.', permute(B,[1 3 2])))), 1)
C =
28 30
64 69
Check:
>> A*B
ans =
28 30
64 69
Example 2: Consider the above two matrices with
>> f = @(x,y) sum(abs(x)); %// outer function: sum of absolute values
>> g = @(x,y) max(x./y, y./x); %// inner function: "symmetric" ratio
>> C = shiftdim(feval(f, (bsxfun(g, A.', permute(B,[1 3 2])))), 1)
C =
14.8333 16.1538
5.2500 5.6346
Check: manually compute C(1,2):
>> sum(abs( max( (A(1,:))./(B(:,2)).', (B(:,2)).'./(A(1,:)) ) ))
ans =
16.1538
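As one more illustration of the same one-liner, here is a sketch of a min-plus ("tropical") product, C(i,j) = min over k of A(i,k) + B(k,j). This example is not from the original answer. Note that the outer function has to be wrapped so that min works along the first dimension, since min(x, 1) would compare against the scalar 1 instead.

```matlab
% Sketch: min-plus product C(i,j) = min_k (A(i,k) + B(k,j)).
f = @(x) min(x, [], 1);    % outer function: minimum along the first dimension
g = @plus;                 % inner function: addition
A = [1 2 3; 4 5 6];
B = [10 11; -12 -13; 14 15];
C = shiftdim(feval(f, bsxfun(g, A.', permute(B,[1 3 2]))), 1);

% Reference computed with explicit loops
C_ref = zeros(size(A,1), size(B,2));
for i = 1:size(A,1)
    for j = 1:size(B,2)
        C_ref(i,j) = min(A(i,:) + B(:,j).');
    end
end
disp(isequal(C, C_ref))
```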
Without diving into the details, there are tools such as mtimesx and MMX that are fast general purpose matrix and scalar operations routines. You can look into their code and adapt them to your needs.
It would most likely be faster than matlab's bsxfun.
After examining several processing functions like bsxfun, it seems it won't be possible to do a direct matrix multiplication with them (by direct I mean that the temporary products are not stored in memory but summed as soon as possible, before other sum-products are processed), because they have a fixed output size (either the same as the input or, with bsxfun singleton expansion, the Cartesian product of the dimensions of the two inputs). It is however possible to trick Octave a bit (this does not work in MATLAB, which checks the output dimensions):
C = bsxfun(@(a,b) sum(bsxfun(@times, a, B))', A', sparse(1, size(A,1)))
C = bsxfun(@(a,b) sum(bsxfun(@times, a, B))', A', zeros(1, size(A,1), 2))(:,:,2)
However do not use them because the outputted values are not reliable (Octave can mangle or even delete them and return 0!).
So for now I am just implementing a semi-vectorized version. Here's my function:
function C = genmtimes(A, B, outop, inop)
% C = genmtimes(A, B, outop, inop)
% Generalized matrix multiplication between A and B. By default, standard sum-of-products matrix multiplication is operated, but you can change the two operators (inop being the element-wise product and outop the sum).
% Speed note: about 100-200x slower than A*A' and about 3x slower when A is sparse, so use this function only if you want to use a different set of inop/outop than the standard matrix multiplication.
if ~exist('inop', 'var')
inop = @times;
end
if ~exist('outop', 'var')
outop = @sum;
end
[n, m] = size(A);
[m2, o] = size(B);
if m2 ~= m
error('nonconformant arguments (op1 is %ix%i, op2 is %ix%i)\n', n, m, m2, o);
end
C = [];
if issparse(A) || issparse(B)
C = sparse(o,n);
else
C = zeros(o,n);
end
A = A';
for i=1:n
C(:,i) = outop(bsxfun(inop, A(:,i), B))';
end
C = C';
end
Tested with both sparse and normal matrices: the performance gap is a lot less with sparse matrices (3x slower) than with normal matrices (~100x slower).
I think this is slower than bsxfun implementations, but at least it doesn't overflow memory:
A = randi(10, 1000);
C = genmtimes(A, A');
If anyone has anything better to offer, I'm still looking for a better alternative!

special add in matlab [duplicate]

Possible Duplicate:
How to Add a row vector to a column vector like matrix multiplication
I have a nx1 vector and a 1xn vector. I want to add them in a special manner like matrix multiplication in an efficient manner (vectorized):
Example:
A=[1 2 3]'
B=[4 5 6]
A \odd_add B =
[1+4 1+5 1+6
2+4 2+5 2+6
3+4 3+5 3+6
]
I have used bsxfun in MATLAB, but I think it is slow. Please help me...
As mentioned by @b3, this would be an appropriate place to use repmat. However, in general, and especially if you are dealing with very large matrices, bsxfun normally makes a better substitute. In this case:
>> bsxfun(@plus, [1,2,3]', [4,5,6])
returns the same result, using about a third the memory in the large-matrix limit.
bsxfun basically applies the function in the first argument to every combination of items in the second and third arguments, placing the results in a matrix according to the shape of the input vectors.
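To make that description concrete, here is a small sketch using the vectors from the question. The second form relies on implicit expansion, which is an assumption about your environment: it needs MATLAB R2016b or later (or a recent Octave), whereas bsxfun also works in older releases.

```matlab
A = [1 2 3]';
B = [4 5 6];
S_bsx = bsxfun(@plus, A, B);   % every A(i) combined with every B(j)
S_imp = A + B;                 % implicit expansion, assumes R2016b+ or recent Octave
disp(S_bsx)
```

Both give the 3-by-3 matrix whose (i,j) entry is A(i)+B(j), without materializing the replicated copies that repmat would create.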
I present a comparison of the different methods mentioned here. I am using the TIMEIT function to get robust estimates (takes care of warming up the code, average timing on multiple runs, ..):
function testBSXFUN(N)
%# data
if nargin < 1
N = 500; %# N = 10, 100, 1000, 10000
end
A = (1:N)';
B = (1:N);
%# functions
f1 = @() funcRepmat(A,B);
f2 = @() funcTonyTrick(A,B);
f3 = @() funcBsxfun(A,B);
%# timeit
t(1) = timeit( f1 );
t(2) = timeit( f2 );
t(3) = timeit( f3 );
%# time results
fprintf('N = %d\n', N);
fprintf('REPMAT: %f, TONY_TRICK: %f, BSXFUN: %f\n', t);
%# validation
v{1} = f1();
v{2} = f2();
v{3} = f3();
assert( isequal(v{:}) )
end
where
function C = funcRepmat(A,B)
N = numel(A);
C = repmat(A,1,N) + repmat(B,N,1);
end
function C = funcTonyTrick(A,B)
N = numel(A);
C = A(:,ones(N,1)) + B(ones(N,1),:);
end
function C = funcBsxfun(A,B)
C = bsxfun(@plus, A, B);
end
The timings:
>> for N=[10 100 1000 5000], testBSXFUN(N); end
N = 10
REPMAT: 0.000065, TONY_TRICK: 0.000013, BSXFUN: 0.000031
N = 100
REPMAT: 0.000120, TONY_TRICK: 0.000065, BSXFUN: 0.000085
N = 1000
REPMAT: 0.032988, TONY_TRICK: 0.032947, BSXFUN: 0.010185
N = 5000
REPMAT: 0.810218, TONY_TRICK: 0.824297, BSXFUN: 0.258774
BSXFUN is a clear winner.
In matlab vectorization, there is no substitute for Tony's Trick in terms of speed in comparison to repmat or any other built in Matlab function for that matter. I am sure that the following code must be fastest for your purpose.
>> A = [1 2 3]';
>> B = [4 5 6];
>> AB_sum = A(:,ones(3,1)) + B(ones(3,1),:);
The speed differential will be much more apparent (at LEAST an order of magnitude) for larger sizes of A and B. See this test I conducted some time ago to ascertain the superiority of Tony's Trick over repmat in terms of time consumption.
REPMAT is your friend:
>> A = [1 2 3]';
>> B = [4 5 6];
>> AplusB = repmat(A, 1, 3) + repmat(B, 3, 1)
AplusB =
5 6 7
6 7 8
7 8 9