Combine matrices using loop and condition in matlab - matlab

I have the following two matrices
c=[1 0 1.05
1 3 2.05
1 6 2.52
1 9 0.88
2 0 2.58
2 3 0.53
2 6 3.69
2 9 0.18
3 0 3.22
3 3 1.88
3 6 3.98]
f=[1 6 3.9
1 9 9.1
1 12 9
2 0 0.3
2 3 0.9
2 6 1.2
2 9 2.5
3 0 2.7]
And the final matrix should be
n=[1 6 2.52 3.9
1 9 0.88 9.1
2 0 2.58 0.3
2 3 0.53 0.9
2 6 3.69 1.2
2 9 0.18 2.5
3 0 3.22 2.7]
The code I used gives as a result only the last row of the previous matrix [n].
for j=1
for i=1:rs1
for k=1
for l=1:rs2
if f(i,j)==c(l,k) && f(i,j+1)==c(l,k+1)
n=[f(i,j),f(i,j+1),f(i,j+2), c(l,k+2)];
end
end
end
end
end
Can anyone help me on this?
Is there something more simple?
Thanks in advance

You should learn to use set operations and avoid loops wherever possible. Here intersect could be extremely useful:
[u, idx_c, idx_f] = intersect(c(:, 1:2) , f(:, 1:2), 'rows');
n = [c(idx_c, :), f(idx_f, end)];
Explanation: by specifying the 'rows' flag, intersect finds the common rows in c and f, and their indices are given in idx_c and idx_f respectively. Use vector subscripting to extract matrix n.
Example
Let's use the example from your question:
c = [1 0 1.05;
1 3 2.05
1 6 2.52
1 9 0.88
2 0 2.58
2 3 0.53
2 6 3.69
2 9 0.18
3 0 3.22
3 3 1.88
3 6 3.98];
f = [1 6 3.9
1 9 9.1
1 12 9
2 0 0.3
2 3 0.9
2 6 1.2
2 9 2.5
3 0 2.7];
[u, idx_c, idx_f] = intersect(c(:, 1:2) , f(:, 1:2), 'rows');
n = [c(idx_c, :), f(idx_f, end)];
This should yield the desired result:
n =
1.0000 6.0000 2.5200 3.9000
1.0000 9.0000 0.8800 9.1000
2.0000 0 2.5800 0.3000
2.0000 3.0000 0.5300 0.9000
2.0000 6.0000 3.6900 1.2000
2.0000 9.0000 0.1800 2.5000
3.0000 0 3.2200 2.7000

According to this answer on Mathworks support you can use join from the statistics toolbox, specifically in your case, an inner join.
Unfortunately I don't have access to my computer with matlab on it, but give it a try and let us know how/if it works.

You can reduce the number of loops by comparing both the first and second columns of at once, then using the "all" function to only collapse the values if they both match. The following snippet replicates the "n" array you had provided.
n = [];
for r1 = 1:size(c, 1)
for r2 = 1:size(f,1)
if all(c(r1, [1 2]) == f(r2, [1 2]))
n(end+1, 1:4) = [c(r1,:) f(r2,3)];
end
end
end

If you insist on doing this in a loop you need to give n the proper dimension according
to the loop counter you are using, or concatenate it to itself of each iteration (this can be very slow for big matrices). For example, writing:
for j=1
for i=1:rs1
for k=1
for l=1:rs2
m=m+1;
if f(i,j)==c(l,k) && f(i,j+1)==c(l,k+1)
n(m,:)=[f(i,j),f(i,j+1),f(i,j+2), c(l,k+2)];
end
end
end
end
end
will save into the m-th row the for numbers when the loop reaches a counter value of m.
However, just be aware that this can be done also without a nested loop and an if condition, in a vectorized way. For example, instead of the condition if f(i,j)==c(l,k)... you can use ismember etc...

How about without any for loops at all (besides in native code)
mf = size(f,1);
mc = size(c,1);
a = repmat(c(:,1:2),1,mf);
b = repmat(reshape((f(:,1:2))',1,[]),mc,1);
match = a == b;
match = match(:, 1 : 2 : 2*mf) & match(:, 2 : 2 : 2*mf);
crows = nonzeros(diag(1:mc) * match);
frows = nonzeros(match * diag(1:mf));
n = [c(crows,:),f(frows,3)]

Related

Normalize each slice of a 3D matrix

How do I normalize each slice of a 3D matrix? I tried like this:
a=rand(1,100,3481);
a= (a - min(a)) ./ (max(a)-min(a)); %
By right each slice of matrix should ranges from 0 to 1. But that is not the case, I don't find 1 in some of the slices. As I inspected, min(a) and max(a) returned the respective value in 3D. Thus it should be of no issue using the code above. Is there something I missed for 3D matrix? Thanks in advance!
We need to find the minimum and maximum values for each of those 2D slices and then we can use bsxfun to do those operations in a vectorized manner with help from permute to let the singleton dims align properly to let bsxfun do its broadcasting job (or use reshape there).
Hence, the implementation would be -
mins = min(reshape(a,[],size(a,3)));
maxs = max(reshape(a,[],size(a,3)));
a_offsetted = bsxfun(#minus, a, permute(mins,[1,3,2]));
a_normalized = bsxfun(#rdivide, a_offsetted, permute(maxs-mins,[1,3,2]))
Sample input, output -
>> a
a(:,:,1) =
2 8 2 2
8 3 8 2
a(:,:,2) =
8 1 1 5
4 9 8 6
a(:,:,3) =
7 9 3 5
6 2 6 5
a(:,:,4) =
9 3 4 9
7 1 9 9
>> a_normalized
a_normalized(:,:,1) =
0 1.0000 0 0
1.0000 0.1667 1.0000 0
a_normalized(:,:,2) =
0.8750 0 0 0.5000
0.3750 1.0000 0.8750 0.6250
a_normalized(:,:,3) =
0.7143 1.0000 0.1429 0.4286
0.5714 0 0.5714 0.4286
a_normalized(:,:,4) =
1.0000 0.2500 0.3750 1.0000
0.7500 0 1.0000 1.0000
My option would be without reshaping as it is sometimes bit difficult to understand. I use min max with the dimension you want want to use for normalization with repmat to clone...:
a=rand(1,100,3481);
a_min2 = min(a,[],2);
a_max2 = max(a,[],2);
a_norm2 = (a - repmat(a_min2,[1 size(a,2) 1]) ) ./ repmat( (a_max2-a_min2),[1 size(a,2) 1]);
or if normalization on 3rd dim...
a_min3 = min(a,[],3);
a_max3 = max(a,[],3);
a_norm3 = (a - repmat(a_min3,[1 1 size(a,3)]) ) ./ repmat( (a_max3-a_min3),[1 1 size(a,3)]);

How to accumulate (average) data based on multiple criteria

I have a set of data where I have recorded values in sets of 3 readings (so as to be able to obtain a general idea of the SEM). I have them recorded in a list that looks as follows, which I am trying to collapse into averages of each set of 3 points:
I want to collapse essentially each 3 rows into one row where the average data value is given for that set. In essence, it would look as follows:
This is something I know how to do basically in Excel (i.e. using a Pivot table) but I am not sure how to do the same in MATLAB. I have tried using accumarray but struggle with knowing how to incorporate multiple conditions essentially. I would need to create a subs array where its number corresponds to each unique set of 3 data points. By brute force, I could create an array such as:
subs = [1 1 1; 2 2 2; 3 3 3; 4 4 4; ...]'
using some looping and have that as my subs array, but since it isn't tied to the data itself, and there may be strange hiccups throughout (i.e. more than 3 data points per set, or missing data, etc.). I know there must be some way to have this sort of Pivot-table-esque grouping for something like this, but need some help to get it off the ground. Thanks.
Here is the input data in text form:
Subject Flow On/Off Values
1 10 1 2.20
1 10 1 2.50
1 10 1 2.60
1 20 1 5.50
1 20 1 6.10
1 20 1 5.90
1 30 1 10.10
1 30 1 10.50
1 30 1 10.50
1 10 0 1.90
1 10 0 2.20
1 10 0 2.30
1 20 0 5.20
1 20 0 5.80
1 20 0 5.60
1 30 0 9.80
1 30 0 10.20
1 30 0 10.20
2 10 1 5.70
2 10 1 6.00
2 10 1 6.10
2 20 1 9.00
2 20 1 9.60
2 20 1 9.40
2 30 1 13.60
2 30 1 14.00
2 30 1 14.00
2 10 0 5.40
2 10 0 5.70
2 10 0 5.80
2 20 0 8.70
2 20 0 9.30
2 20 0 9.10
2 30 0 13.30
2 30 0 13.70
2 30 0 13.70
You can use unique and accumarray like so to maintain the order of your rows of data:
[newData, ~, subs] = unique(data(:, 1:3), 'rows', 'stable');
newData(:, 4) = accumarray(subs, data(:, 4), [], #mean);
newData =
1.0000 10.0000 1.0000 2.4333
1.0000 20.0000 1.0000 5.8333
1.0000 30.0000 1.0000 10.3667
1.0000 10.0000 0 2.1333
1.0000 20.0000 0 5.5333
1.0000 30.0000 0 10.0667
2.0000 10.0000 1.0000 5.9333
2.0000 20.0000 1.0000 9.3333
2.0000 30.0000 1.0000 13.8667
2.0000 10.0000 0 5.6333
2.0000 20.0000 0 9.0333
2.0000 30.0000 0 13.5667
I assume that
You want to average based on unique values of the first three columns (not on groups of three rows, although the two criteria coincide in your example);
Order is determined by column 1, then 3, then 2.
Then, denoting your data as x,
[~, ~, subs] = unique(x(:, [1 3 2]), 'rows', 'sorted');
result = accumarray(subs, x(:,end), [], #mean);
gives
result =
2.1333
5.5333
10.0667
2.4333
5.8333
10.3667
5.6333
9.0333
13.5667
5.9333
9.3333
13.8667
As you see, I am using the third output of unique with the 'rows' and 'sorted' options. This creates the subs grouping vector based on first three columns of your data in the desired order. Then, passing that to accumarray computes the means.
accumarray is indeed the way to go. First, you'll need assign an index to each set of values with unique :
[unique_subjects, ~, ind_subjects] = unique(vect_subjects);
[unique_flows, ~, ind_flows] = unique(vect_flows);
[unique_on_off, ~, ind_on_off] = unique(vect_on_off);
So basically, you now got ind_subjects, ind_flows and ind_on_off that are values in [1..2], [1..3] and [1..2].
Now, you can compute the mean values in a [3x2x2] array (in you example) :
mean_values = accumarray([ind_flows, ind_on_off, ind_subjects], vect_values, [], #mean);
mean_values = mean_values(:);
Nota : order is set accordingly to your example.
Then you can construct the summary :
[ind1, ind2, ind3] = ndgrid(1:numel(unique_flows), 1:numel(unique_on_off), 1:numel(unique_subjects));
flows_summary = unique_flows(ind1(:));
on_off_summary = unique_on_off(ind2(:));
subjects_summary = unique_subjects(ind3(:));
Nota : Also works with non numeric values.
You should also try checking out the findgroups and splitapply reference pages. The easiest way to use them here is probably to place your data in a table:
>> T = array2table(data, 'VariableNames', { 'Subject', 'Flow', 'On_Off', 'Values'});
>> [gid,Tgrp] = findgroups(T(:,1:3));
>> Tgrp.MeanValue = splitapply(#mean, T(:,4), gid)
Tgrp =
12×4 table
Subject Flow On_Off MeanValue
_______ ____ ______ _________
1 10 0 2.1333
1 10 1 2.4333
1 20 0 5.5333
1 20 1 5.8333
1 30 0 10.067
1 30 1 10.367
2 10 0 5.6333
2 10 1 5.9333
2 20 0 9.0333
2 20 1 9.3333
2 30 0 13.567
2 30 1 13.867

Piecewise average over n elements in Matlab

I want to piecewise-average a vector in Matlab. Vector x looks like this:
x = 1:15;
Respectively:
x = [1 2 3 4 5 6 7 8 9 10 11 12 13 14 15]
I want to find the mean value over n = 5 elements; therefore, the result-vector y should look like:
y = [1 1.5 2.5 3 4 5 6 7 8 9 10 11 12 13]
The code for generating the vector y should somehow work like this:
y = [
mean ([1])
mean ([1,2])
mean ([1,2,3])
mean ([1,2,3,4])
mean ([1,2,3,4,5])
mean ([2,3,4,5,6])
mean ([3,4,5,6,7])
mean ([4,5,6,7,8])
mean ([5,6,7,8,9])
mean ([6,7,8,9,10])
mean ([7,8,9,10,11])
mean ([8,9,10,11,12])
mean ([9,10,11,12,13])
mean ([10,11,12,13,14])
mean ([11,12,13,14,15])
]
For n < 5 elements, the program should average over n elements. For example, if there are only 3 elements available, the code should average the first 3 elements. For n > 5 elements, the program should average over the last 5 elements.
Any help is appreciated!
For such sliding summing or averaging operations, a very efficient vectorized approach would be with 1D convolution conv, like so -
n = 5
sums = conv(x,ones(1,n))
out = sums(1:numel(x))./[1:n n*ones(1,numel(x)-n)]
Try this:
x = 1:15;
for n = 1:length(x)
if n <= 5
y(n) = mean(x(1:n))
else
y(n) = mean(x(n-4:n))
end
end
Below is a kind of a brute force method.
for j=1:length(x)
A=j-4;
if A<1
A=1;
end;
y(j)=mean(x(A:j))
end;
or in a more compact form:
for j=1:length(x)
y(j)=mean(x(max(j-4,1):j));
end;
Here is an alternative method
make a matrix of all the numbers which you have to take average of in each row, here in bsxfun() function a vector four new rows are creating for each element of 1:15 row by row of previous 4 digits of the current number and then all the non zero elements are ommitted and amde 0
n =
5
A = bsxfun(#plus ,[1:15].',-(n -1):0)
A(A<0) = 0
A =
0 0 0 0 1
0 0 0 1 2
0 0 1 2 3
0 1 2 3 4
1 2 3 4 5
2 3 4 5 6
3 4 5 6 7
4 5 6 7 8
5 6 7 8 9
6 7 8 9 10
7 8 9 10 11
8 9 10 11 12
9 10 11 12 13
10 11 12 13 14
11 12 13 14 15
and then divide the sum of each row by number of non-zero elements in each row
>> sum(A,2)./sum(~ismember(A,0),2)
ans =
1.0000
1.5000
2.0000
2.5000
3.0000
4.0000
5.0000
6.0000
7.0000
8.0000
9.0000
10.0000
11.0000
12.0000
13.0000

Matlab Conditional probability from dataset

I have a Matrix M of 500x5 and I need to calculate conditional probability. I have discretised my data and then I have this code that currently only works with 3 variables rather than 5 but that's fine for now.
The code below already works out the number of times I get A=1, B=1 and C=1, the number of times we get A=2, B=1, C=1 etc.
data = M;
npatients=size(data,1)
asum=zeros(4,2,2)
prob=zeros(4,2,2)
for patient=1:npatients,
h=data(patient,1)
i=data(patient,2)
j=data(patient,3)
asum(h,i,j)=asum(h,i,j)+1
end
for h=1:4,
for i=1:2,
for j=1:2,
prob(h,i,j)=asum(h,i,j)/npatients
end
end
end
So I need code to sum over to get the number of time we get A=1 and B=1 (adding over all C) to find:
Prob(C=1 given A=1 and B=1) = P(A=1,B=1, C=1)/P( A=1, B=1).
This is the rule strength of the first rule. I need to find out how to loop over A, B and C to get the rest and how to actually get this to work in Matlab. I don't know if its of any use but I have code to put each column into its own thing.:
dest = M(:,1); gen = M(:,2); age = M(:,3); year = M(:,4); dur = M(:,5);
So say dest is the consequent and gen and age are the antecedents how would I do this.
Below is the data of the first 10 patients as an example:
destination gender age
2 2 2
2 2 2
2 2 2
2 2 2
2 2 2
2 1 1
3 2 2
2 2 2
3 2 1
3 2 1
Any help is appreciated and badly needed.
Sine your code didn't work by copy & paste, I changed it a little bit,
It's better if you define a function that calculates the probability for given data,
function p = prob(data)
n = size(data,1);
uniquedata = unique(data);
p = zeros(length(uniquedata),2);
p(:,2) = uniquedata;
for i = 1 : size(uniquedata,1)
p(i,1) = sum(data == uniquedata(i)) / n;
end
end
Now in another script,
data =[3 2 91;
3 2 86;
3 2 90;
3 2 85;
3 2 86;
3 1 77;
4 2 88;
3 2 90;
4 2 79;
4 2 77;
4 1 65;
3 1 60];
pdest = prob(data(:,1));
pgend = prob(data(:,2));
page = prob(data(:,3));
This will give,
page =
0.0833 60.0000
0.0833 65.0000
0.1667 77.0000
0.0833 79.0000
0.0833 85.0000
0.1667 86.0000
0.0833 88.0000
0.1667 90.0000
0.0833 91.0000
pgend =
0.2500 1.0000
0.7500 2.0000
pdest =
0.6667 3.0000
0.3333 4.0000
That will give the probabilities you've already calculated,
Note that the second column of prob is the valuse and the first column the probability.
When you want to calculate probabilities for des = 3 & gend = 2 you should create a new data set and call prob, for new data set use,
mapd2g3 = data(:,1) == 3 & data(:,2) == 2;
datad2g3 = data(mapd2g3,:)
3 2 91
3 2 86
3 2 90
3 2 85
3 2 86
3 2 90
paged2g3 = prob(datad2g3(:,3))
0.1667 85.0000
0.3333 86.0000
0.3333 90.0000
0.1667 91.0000
This is the prob(age|dest = 3 & gend = 2) .
You could even write a function to create the data sets.

write result as matrix form

let us consider following matrix
a=[1 2 3;2 3 4;3 4 5;4 5 7]
a =
1 2 3
2 3 4
3 4 5
4 5 7
let us consider it's svd
[U E V]=svd(a)
U =
-0.2738 -0.8708 -0.0062 -0.4082
-0.3984 -0.2552 -0.3309 0.8165
-0.5230 0.3605 -0.6557 -0.4082
-0.7020 0.2159 0.6786 0.0000
E =
13.5093 0 0
0 0.6482 0
0 0 0.2797
0 0 0
V =
-0.4032 0.8699 0.2841
-0.5437 0.0220 -0.8390
-0.7361 -0.4928 0.4641
if consider kronecker product of columns of U and V matrix
kron(U(:,1),V(:,1))
ans =
0.1104
0.1489
0.2015
0.1606
0.2166
0.2932
0.2109
0.2843
0.3849
0.2831
0.3817
0.5167
but it is returning as vector form,but i need matrix form,so how can i transform it to matrix insider kron product?maybe i should you reshape command,but could you help me to do it?thanks in advance
i have consider following change as Divakar adviced and it works fine
X=kron(U(:,1),V(:,1)')
X =
0.1104 0.1489 0.2015
0.1606 0.2166 0.2932
0.2109 0.2843 0.3849
0.2831 0.3817 0.5167
thanks my friend for your help