Cartesian product in MATLAB - matlab

Here is the simplified version of the problem I have. Suppose I have a vector
p=[1 5 10]
and another one
q=[.75 .85 .95]
And I want to come up with the following matrix:
res=[1, .75;
1, .85;
1, .95;
5, .75;
5, .85;
5, .95;
10, .75;
10, .85;
10, .95]
This is also known as the Cartesian Product.
How can I do that?

Here's one way:
[X,Y] = meshgrid(p,q);
result = [X(:) Y(:)];
The output is:
result =
1.0000 0.7500
1.0000 0.8500
1.0000 0.9500
5.0000 0.7500
5.0000 0.8500
5.0000 0.9500
10.0000 0.7500
10.0000 0.8500
10.0000 0.9500
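As a cross-check, the same matrix can be built without meshgrid. This is only a sketch; it assumes repelem is available (R2015a or later) and reuses the p and q from the question.

```matlab
p = [1 5 10];
q = [.75 .85 .95];
% Repeat each element of p once per element of q,
% and tile the whole of q once per element of p.
result = [repelem(p(:), numel(q)), repmat(q(:), numel(p), 1)];
```

The first column varies slowest, which matches the row order asked for in the question.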

An approach similar to the one described by @nibot can be found on the MATLAB Central File Exchange.
It generalizes the solution to any number of input sets. This would be a simplified version of the code:
function C = cartesian(varargin)
args = varargin;
n = nargin;
[F{1:n}] = ndgrid(args{:});
for i=n:-1:1
G(:,i) = F{i}(:);
end
C = unique(G , 'rows');
end
For instance:
cartesian(['c','d','e'],[1,2],[50,70])
ans =
99 1 50
99 1 70
99 2 50
99 2 70
100 1 50
100 1 70
100 2 50
100 2 70
101 1 50
101 1 70
101 2 50
101 2 70

Here's a function, cartesian_product, that can handle any type of input, including string arrays, and returns a table with column names that match the names of the input variables. Inputs that are not variables are given names like var1, var2, etc.
function tbl = cartesian_product(varargin)
names = arrayfun(@inputname, 1:nargin, 'UniformOutput', false);
for i = 1:nargin
if isempty(names{i})
names{i} = ['var' num2str(i)];
end
end
rev_args = flip(varargin);
[A{1:nargin}] = ndgrid(rev_args{:});
B = cellfun(@(x) x(:), A, 'UniformOutput', false);
C = flip(B);
tbl = table(C{:}, 'VariableNames', names);
end
>> x = ["a" "b"];
>> y = 1:3;
>> z = 4:5;
>> cartesian_product(x, y, z)
ans =
12×3 table
x y z
___ _ _
"a" 1 4
"a" 1 5
"a" 2 4
"a" 2 5
"a" 3 4
"a" 3 5
"b" 1 4
"b" 1 5
"b" 2 4
"b" 2 5
"b" 3 4
"b" 3 5
>> cartesian_product(1:2, 3:4)
ans =
4×2 table
var1 var2
____ ____
1 3
1 4
2 3
2 4

Related

Matlab: Odd linear indexing into array with singleton dimensions

I'm having trouble understanding the circumstances under which linear indexing applies.
For a 2D array, it seems intuitive:
>> clear x
>> x=[1 2;3 4]
x =
1 2
3 4
>> x([1 2])
ans =
1 3
>> x([1;2])
ans =
1
3
For a 3D matrix, it seems intuitive:
>> clear y
>> y(1:2,1:2,1)=x
y =
1 2
3 4
>> y(1:2,1:2,2)=x+10
y(:,:,1) =
1 2
3 4
y(:,:,2) =
11 12
13 14
>> y([1 2])
ans =
1 3
>> y([1;2])
ans =
1
3
For a 3D matrix in which the 1st two dimensions are singletons, it's not what I would expect:
>> clear y
>> y(1,1,1)=1
y =
1
>> y(1,1,2)=2
y(:,:,1) =
1
y(:,:,2) =
2
>> y([1 2])
ans(:,:,1) =
1
ans(:,:,2) =
2
>> y([1;2])
ans(:,:,1) =
1
ans(:,:,2) =
2
I would have expected exactly the same as the 3D matrix without singleton dimensions.
Is there a rule that can be relied on to predict the behaviour of linear indexing?
The rule for linear indexing of an array x with an array ind is as follows (taken from this great post by Loren Shure):
If at least one of x and ind is not a vector (that is, if x or ind have more than one non-singleton dimension) the output has the same shape (size) as ind:
>> x = rand(3,4);
>> ind = [9 3];
>> x(ind)
ans =
0.276922984960890 0.743132468124916
>> x(ind.')
ans =
0.276922984960890
0.743132468124916
>> x = rand(2,3,4);
>> ind = [1 2; 5 6];
>> x(ind)
ans =
0.814723686393179 0.905791937075619
0.632359246225410 0.097540404999410
If both x and ind are vectors, the output is a vector that has the same orientation as x (i.e. the output dimension which is non-singleton is that in x), and the same number of elements as ind:
>> x = 10:10:70;
>> ind = [1 3 5];
>> x(ind)
ans =
10 30 50
>> x(ind.')
ans =
10 30 50
>> x = reshape(10:10:70,1,1,[]); % 1×1×7
>> ind = reshape([2 3 4],1,1,1,1,[]); % 1×1×1×1×3
>> x(ind)
ans(:,:,1) =
20
ans(:,:,2) =
30
ans(:,:,3) =
40
For completeness, if two or more indexing arrays are applied (so this is not linear indexing anymore), the shape of the output is determined by which dimensions of the original array are being indexed and by the number of elements of each indexing array. It is independent of the shape of the indexing arrays, which are simply read in column-major order as usual:
>> x = [10 20 30 40; 50 60 70 80];
>> x(2, [1 2; 3 4])
ans =
50 70 60 80
>> x(2, [1 2 3 4])
ans =
50 60 70 80
>> x(2, reshape([1 2 3 4],1,1,1,[]))
ans =
50 60 70 80
If there are fewer indices than the number of dimensions of x, the trailing dimensions of x are implicitly collapsed into one before the indexing is applied:
>> x = [10 20 30 40; 50 60 70 80];
>> x(:,:,2) = x+100;
>> x([1 2], [1; 5; 7; 8])
ans =
10 110 130 140
50 150 170 180
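The two linear-indexing rules can be checked with a few size assertions. This is a minimal sketch with made-up test arrays (x, v, ind and col are names chosen for illustration only):

```matlab
% Rule 1: x is not a vector, so the output takes the shape of ind
x = reshape(1:24, 2, 3, 4);
ind = [1 2; 5 6];
assert(isequal(size(x(ind)), size(ind)));

% Rule 2: x and ind are both vectors, so the output keeps the orientation of x
v = reshape(10:10:70, 1, 1, []);     % 1x1x7, like the singleton case in the question
assert(isequal(size(v([1 2])), [1 1 2]));
assert(isequal(size(v([1; 2])), [1 1 2]));

% If a fixed shape is needed regardless of x, reshape the result explicitly
col = reshape(v([1 2]), [], 1);      % always a column
```

Reshaping after indexing is probably the simplest way to avoid depending on whether x happens to be a vector.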

What's wrong with my implementation of filter function on matlab

I am implementing a function named filter_fir in MATLAB, modelled on MATLAB's built-in filter function.
function y = filter_fir(b, a, x)
y = conv(b, x);
y = y(1:length(x));
as = [0 a(2:end)];
a1 = a(1);
if a1 == 0
error('a(1) cannot be zero');
end
ya = y
for n = 1:(length(x))
ya = conv(ya, as);
ya = ya(1:length(x));
y = y - ya;
end
y = y ./ a1;
end
Here are the results:
Yt = filter([1 1 1], 1, [1 2 3 4 5])
Ys = filter_fir([1 1 1], 1, [1 2 3 4 5])
Yt =
1 3 6 9 12
Ys =
1 3 6 9 12
Yt = filter([1 1 1], [2 2], [1 2 3 4 5])
Ys = filter_fir([1 1 1], [2 2], [1 2 3 4 5])
Yt =
0.5000 1.0000 2.0000 2.5000 3.5000
Ys =
0.5000 0.5000 -2.0000 -11.5000 -35.0000
When a = 1, the results from the built-in filter function and from filter_fir are the same.
But when a = [2 2], they are not.
Could anybody tell me what the problem is?
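One possible reading of the discrepancy: the loop subtracts every term of the series, whereas the expansion of 1/(1 + A) alternates in sign, and as is not divided by a(1) before it is applied. A sketch that sidesteps the series altogether evaluates the difference equation sample by sample. The name filter_sketch is hypothetical, and the function assumes vector inputs with zero initial conditions:

```matlab
function y = filter_sketch(b, a, x)
% Direct evaluation of
%   a(1)*y(n) = sum_k b(k)*x(n-k+1) - sum_{k>=2} a(k)*y(n-k+1)
% assuming zero initial conditions.
if a(1) == 0
    error('a(1) cannot be zero');
end
b = b / a(1);                 % normalise so that the leading coefficient is 1
a = a / a(1);
y = zeros(size(x));
for n = 1:numel(x)
    acc = 0;
    for k = 1:min(n, numel(b))
        acc = acc + b(k) * x(n-k+1);   % feed-forward part
    end
    for k = 2:min(n, numel(a))
        acc = acc - a(k) * y(n-k+1);   % feedback part
    end
    y(n) = acc;
end
end
```

For b = [1 1 1], a = [2 2] and x = [1 2 3 4 5], working through the recursion by hand gives 0.5 1 2 2.5 3.5, which agrees with the Yt shown in the question.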

How to separate combinatorics in a loop?

Let's say I have 4 locations, x = rand(4,1). For each location I would like to calculate the distance to each of the other 3 locations, d = pdist(x, 'euclidean'). This gives me 6 unique distances, e.g. 12, 13, 14, 23, 24, 34.
How do I separate these combinations so that I get all distances from locations 1, 2 and 3 to the others? The result should look like:
[1 2 3]
[4 5]
[6]
Maybe squareform is what you're after. It unpacks all distances into a square, symmetric matrix:
>> x = rand(4,1)
x =
0.5290
0.5673
0.4487
0.9872
>> d = pdist(x, 'euclidean')
d =
0.0383 0.0802 0.4582 0.1186 0.4199 0.5384
>> D = squareform(d)
D =
0 0.0383 0.0802 0.4582
0.0383 0 0.1186 0.4199
0.0802 0.1186 0 0.5384
0.4582 0.4199 0.5384 0
so for example D(2,3) (or D(3,2)) is the distance from point 2 to point 3.
Create an index of all 6 combinations, idx = [1 2 3 4 5 6]. On each pass of the loop, print and then remove the first 3, 2 and 1 elements of this index, respectively (idx(1:Num-n) = []).
Num = 4;
idx = 1:nchoosek(Num,2); % index of all combinations
for n = 1:Num-1 % loop through all except last
idx(1:Num-n) % print
idx(1:Num-n) = []; % remove the first elements in index
end
This will give the following result
ans = 1 2 3
ans = 4 5
ans = 6
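The same grouping can also be obtained without a loop. As a sketch, mat2cell can split the pdist output into blocks of length Num-1, Num-2, ..., 1; the vector d below is copied from the example above, and groups is a name made up for this illustration:

```matlab
d = [0.0383 0.0802 0.4582 0.1186 0.4199 0.5384];  % pdist output for 4 points
Num = 4;
% Split the 1x6 row into consecutive blocks of 3, 2 and 1 elements
groups = mat2cell(d, 1, Num-1:-1:1);
% groups{n} holds the distances from location n to locations n+1..Num
```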

Matlab Conditional probability from dataset

I have a Matrix M of 500x5 and I need to calculate conditional probability. I have discretised my data and then I have this code that currently only works with 3 variables rather than 5 but that's fine for now.
The code below already works out the number of times I get A=1, B=1 and C=1, the number of times we get A=2, B=1, C=1 etc.
data = M;
npatients=size(data,1)
asum=zeros(4,2,2)
prob=zeros(4,2,2)
for patient=1:npatients,
h=data(patient,1)
i=data(patient,2)
j=data(patient,3)
asum(h,i,j)=asum(h,i,j)+1
end
for h=1:4,
for i=1:2,
for j=1:2,
prob(h,i,j)=asum(h,i,j)/npatients
end
end
end
So I need code that sums over C to get the number of times we get A=1 and B=1 (adding over all C), in order to find:
Prob(C=1 given A=1 and B=1) = P(A=1,B=1, C=1)/P( A=1, B=1).
This is the rule strength of the first rule. I need to find out how to loop over A, B and C to get the rest, and how to actually get this to work in MATLAB. I don't know if it's of any use, but I have code to put each column into its own variable:
dest = M(:,1); gen = M(:,2); age = M(:,3); year = M(:,4); dur = M(:,5);
So say dest is the consequent and gen and age are the antecedents how would I do this.
Below is the data of the first 10 patients as an example:
destination gender age
2 2 2
2 2 2
2 2 2
2 2 2
2 2 2
2 1 1
3 2 2
2 2 2
3 2 1
3 2 1
Any help is appreciated and badly needed.
Since your code didn't run when copied and pasted, I changed it a little.
It's better to define a function that calculates the probabilities for the given data:
function p = prob(data)
n = size(data,1);
uniquedata = unique(data);
p = zeros(length(uniquedata),2);
p(:,2) = uniquedata;
for i = 1 : size(uniquedata,1)
p(i,1) = sum(data == uniquedata(i)) / n;
end
end
Now in another script,
data =[3 2 91;
3 2 86;
3 2 90;
3 2 85;
3 2 86;
3 1 77;
4 2 88;
3 2 90;
4 2 79;
4 2 77;
4 1 65;
3 1 60];
pdest = prob(data(:,1));
pgend = prob(data(:,2));
page = prob(data(:,3));
This will give,
page =
0.0833 60.0000
0.0833 65.0000
0.1667 77.0000
0.0833 79.0000
0.0833 85.0000
0.1667 86.0000
0.0833 88.0000
0.1667 90.0000
0.0833 91.0000
pgend =
0.2500 1.0000
0.7500 2.0000
pdest =
0.6667 3.0000
0.3333 4.0000
That will give the probabilities you've already calculated,
Note that the second column of the output of prob holds the values and the first column their probabilities.
To calculate probabilities for dest = 3 & gend = 2, create a new data set containing only the matching rows and call prob on it. To build the new data set, use:
mapd2g3 = data(:,1) == 3 & data(:,2) == 2;
datad2g3 = data(mapd2g3,:)
3 2 91
3 2 86
3 2 90
3 2 85
3 2 86
3 2 90
paged2g3 = prob(datad2g3(:,3))
0.1667 85.0000
0.3333 86.0000
0.3333 90.0000
0.1667 91.0000
This is the prob(age|dest = 3 & gend = 2) .
You could even write a function to create the data sets.
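For the count-array formulation in the question, the conditional probabilities can also be computed for all combinations at once. The sketch below assumes the data are already discretised to positive integers (destination in 1..4, gender and age in 1..2, as in the question's asum); counts, antecedent and condprob are names chosen for illustration:

```matlab
% First 10 patients from the question: destination, gender, age
data = [2 2 2; 2 2 2; 2 2 2; 2 2 2; 2 2 2; 2 1 1; 3 2 2; 2 2 2; 3 2 1; 3 2 1];

% counts(h,i,j) = number of patients with destination h, gender i, age j
counts = accumarray(data, 1, [4 2 2]);

% Sum over destination to count each (gender, age) pair: size 1x2x2
antecedent = sum(counts, 1);

% P(dest = h | gender = i, age = j); NaN where the pair never occurs
condprob = bsxfun(@rdivide, counts, antecedent);
```

With these 10 rows, condprob(2,2,2) is 6/7, because six of the seven patients with gender 2 and age 2 have destination 2.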

Combine matrices using loop and condition in matlab

I have the following two matrices
c=[1 0 1.05
1 3 2.05
1 6 2.52
1 9 0.88
2 0 2.58
2 3 0.53
2 6 3.69
2 9 0.18
3 0 3.22
3 3 1.88
3 6 3.98]
f=[1 6 3.9
1 9 9.1
1 12 9
2 0 0.3
2 3 0.9
2 6 1.2
2 9 2.5
3 0 2.7]
And the final matrix should be
n=[1 6 2.52 3.9
1 9 0.88 9.1
2 0 2.58 0.3
2 3 0.53 0.9
2 6 3.69 1.2
2 9 0.18 2.5
3 0 3.22 2.7]
The code I used gives as a result only the last row of the previous matrix [n].
for j=1
for i=1:rs1
for k=1
for l=1:rs2
if f(i,j)==c(l,k) && f(i,j+1)==c(l,k+1)
n=[f(i,j),f(i,j+1),f(i,j+2), c(l,k+2)];
end
end
end
end
end
Can anyone help me on this?
Is there something more simple?
Thanks in advance
You should learn to use set operations and avoid loops wherever possible. Here intersect could be extremely useful:
[u, idx_c, idx_f] = intersect(c(:, 1:2) , f(:, 1:2), 'rows');
n = [c(idx_c, :), f(idx_f, end)];
Explanation: by specifying the 'rows' flag, intersect finds the common rows in c and f, and their indices are given in idx_c and idx_f respectively. Use vector subscripting to extract matrix n.
Example
Let's use the example from your question:
c = [1 0 1.05;
1 3 2.05
1 6 2.52
1 9 0.88
2 0 2.58
2 3 0.53
2 6 3.69
2 9 0.18
3 0 3.22
3 3 1.88
3 6 3.98];
f = [1 6 3.9
1 9 9.1
1 12 9
2 0 0.3
2 3 0.9
2 6 1.2
2 9 2.5
3 0 2.7];
[u, idx_c, idx_f] = intersect(c(:, 1:2) , f(:, 1:2), 'rows');
n = [c(idx_c, :), f(idx_f, end)];
This should yield the desired result:
n =
1.0000 6.0000 2.5200 3.9000
1.0000 9.0000 0.8800 9.1000
2.0000 0 2.5800 0.3000
2.0000 3.0000 0.5300 0.9000
2.0000 6.0000 3.6900 1.2000
2.0000 9.0000 0.1800 2.5000
3.0000 0 3.2200 2.7000
According to this answer on MathWorks support, you can use join from the Statistics Toolbox; in your case specifically, an inner join.
Unfortunately I don't have access to my computer with matlab on it, but give it a try and let us know how/if it works.
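As an untested-by-the-author idea made concrete, here is a sketch using innerjoin on tables, which is a swapped-in alternative to the toolbox join mentioned above. It assumes a MATLAB release with table support, and the column names id, t, cval and fval are made up for this example:

```matlab
c = [1 0 1.05; 1 3 2.05; 1 6 2.52; 1 9 0.88; 2 0 2.58; 2 3 0.53; ...
     2 6 3.69; 2 9 0.18; 3 0 3.22; 3 3 1.88; 3 6 3.98];
f = [1 6 3.9; 1 9 9.1; 1 12 9; 2 0 0.3; 2 3 0.9; 2 6 1.2; 2 9 2.5; 3 0 2.7];

% Wrap both matrices in tables that share the two key columns
Tc = array2table(c, 'VariableNames', {'id', 't', 'cval'});
Tf = array2table(f, 'VariableNames', {'id', 't', 'fval'});

% Keep only the rows whose (id, t) pair appears in both tables
T = innerjoin(Tc, Tf, 'Keys', {'id', 't'});
n = table2array(T);   % back to a plain numeric matrix
```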
You can reduce the number of loops by comparing the first and second columns at once, then using the all function so that a row is only appended when both match. The following snippet replicates the n array you provided.
n = [];
for r1 = 1:size(c, 1)
for r2 = 1:size(f,1)
if all(c(r1, [1 2]) == f(r2, [1 2]))
n(end+1, 1:4) = [c(r1,:) f(r2,3)];
end
end
end
If you insist on doing this in a loop, you need to index n by a row counter, or concatenate a new row onto it on each iteration (which can be very slow for big matrices). For example, writing:
rs1 = size(f,1); rs2 = size(c,1);
m = 0; % number of matching rows found so far
n = zeros(0,4);
for j=1
for i=1:rs1
for k=1
for l=1:rs2
if f(i,j)==c(l,k) && f(i,j+1)==c(l,k+1)
m=m+1; % advance only on a match, so no empty rows are left in n
n(m,:)=[c(l,k),c(l,k+1),c(l,k+2),f(i,j+2)];
end
end
end
end
end
will save the four numbers of the m-th match into the m-th row of n.
However, just be aware that this can be done also without a nested loop and an if condition, in a vectorized way. For example, instead of the condition if f(i,j)==c(l,k)... you can use ismember etc...
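A minimal sketch of the ismember route mentioned above, assuming each key pair occurs at most once in c (tf and loc are just the conventional output names):

```matlab
c = [1 0 1.05; 1 3 2.05; 1 6 2.52; 1 9 0.88; 2 0 2.58; 2 3 0.53; ...
     2 6 3.69; 2 9 0.18; 3 0 3.22; 3 3 1.88; 3 6 3.98];
f = [1 6 3.9; 1 9 9.1; 1 12 9; 2 0 0.3; 2 3 0.9; 2 6 1.2; 2 9 2.5; 3 0 2.7];

% tf(i) is true when row i of f has its first two columns somewhere in c;
% loc(i) is the index of the matching row in c (0 when there is none)
[tf, loc] = ismember(f(:, 1:2), c(:, 1:2), 'rows');

% Pair each matching row of c with the third column of the matching row of f
n = [c(loc(tf), :), f(tf, 3)];
```

Rows come out in the order they appear in f, which for this data is the same order as the desired n.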
How about doing it without any for loops at all (besides those in native code)?
mf = size(f,1);
mc = size(c,1);
a = repmat(c(:,1:2),1,mf);
b = repmat(reshape((f(:,1:2))',1,[]),mc,1);
match = a == b;
match = match(:, 1 : 2 : 2*mf) & match(:, 2 : 2 : 2*mf);
crows = nonzeros(diag(1:mc) * match);
frows = nonzeros(match * diag(1:mf));
n = [c(crows,:),f(frows,3)]