Split matrix based on number in first column - matlab

I have a matrix which has the following form:
M =
[1 4 56 1;
1 3 5 1;
1 3 6 4;
2 3 5 0;
2 0 0 0;
3 1 2 3;
3 3 3 3]
I want to split this matrix based on the number given in the first column. So I want to split the matrix into this:
A =
[1 4 56 1;
1 3 5 1;
1 3 6 4]
B =
[2 3 5 0;
2 0 0 0]
C =
[3 1 2 3;
3 3 3 3]
I tried this by making the following loop, but this gave me the desired matrices with rows of zeros:
for i = 1:length(M)
if (M(i,1) == 1)
A(i,:) = M(i,:);
elseif (M(i,1) == 2)
B(i,:) = M(i,:);
elseif (M(i,1) == 3)
C(i,:) = M(i,:);
end
end
The result for matrix C is then for example:
C =
[0 0 0 0;
0 0 0 0;
0 0 0 0;
2 3 5 0;
2 0 0 0]
How should I solve this issue?
Additional information:
The actual data has a date in the first column in the form yyyymmdd. The data set spans several years and I want to split this dataset in matrices for each year and after that for each month.

You can use arrayfun to solve this task:
M = [
1 4 56 1;
1 3 5 1;
1 3 6 4;
2 3 5 0;
2 0 0 0;
3 1 2 3;
3 3 3 3]
A = arrayfun(#(x) M(M(:,1) == x, :), unique(M(:,1)), 'uniformoutput', false)
The result A is a cell array and its contents can be accessed as follows:
>> a{1}
ans =
1 4 56 1
1 3 5 1
1 3 6 4
>> a{2}
ans =
2 3 5 0
2 0 0 0
>> a{3}
ans =
3 1 2 3
3 3 3 3
To split the data based on an yyyymmdd format in the first column, you can use the following:
yearly = arrayfun(#(x) M(floor(M(:,1)/10000) == x, :), unique(floor(M(:,1)/10000)), 'uniformoutput', false)
monthly = arrayfun(#(x) M(floor(M(:,1)/100) == x, :), unique(floor(M(:,1)/100)), 'uniformoutput', false)

If you don't know how many outputs you'll have, it is most convenient to put the data into a cell array rather than into separate arrays. The command to do this is MAT2CELL. Note that this assumes your data is sorted. If it isn't use sortrows before running the code.
%# count the repetitions
counts = hist(M(:,1),unique(M(:,1));
%# split the array
yearly = mat2cell(M,counts,size(M,2))
%# if you'd like to split each cell further, but still keep
%# the data also grouped by year, you can do the following
%# assuming the month information is in column 2
yearByMonth = cellfun(#(x)...
mat2cell(x,hist(x(:,2),unique(x(:,2)),size(x,2)),...
yearly,'uniformOutput',false);
You'd then access the data for year 3, month 4 as yearByMonth{3}{4}
EDIT
If the first column of your data is yyyymmdd, I suggest splitting it into three columns yyyy,mm,dd, like below, to facilitate grouping afterward:
ymd = 20120918;
yymmdd = floor(ymd./[10000 100 1])
yymmdd(2:3) = yymmdd(2:3)-100*yymmdd(1:2)

Related

How can I calculate the relative frequency of a row in a data set using Matlab?

I am new to Matlab and I have a basic question.
I have this data set:
1 2 3
4 5 7
5 2 7
1 2 3
6 5 3
I am trying to calculate the relative frequencies from the dataset above
specifically calculating the relative frequency of x=1, y=2 and z=3
my code is:
data = load('datasetReduced.txt')
X = data(:, 1)
Y = data(:, 2)
Z = data(:, 3)
f = 0;
for i=1:5
if X == 1 & Y == 2 & Z == 3
s = 1;
else
s = 0;
end
f = f + s;
end
f
r = f/5
it is giving me a 0 result.
How can the code be corrected??
thanks,
Shosho
Your issue is likely that you are comparing floating point numbers using the == operator which is likely to fail due to floating point errors.
A faster way to do this would be to use ismember with the 'rows' option which will result in a logical array that you can then sum to get the total number of rows that matched and divide by the total number of rows.
tf = ismember(data, [1 2 3], 'rows');
relFreq = sum(tf) / numel(tf);
I think you want to count frequency of each instance, So try this
data = [1 2 3
4 5 7
5 2 7
1 2 3
6 5 3];
[counts,centers] = hist(data , unique(data))
Where centers is your unique instances and counts is count of each of them. The result should be as follow:
counts =
2 0 0
0 3 0
0 0 3
1 0 0
1 2 0
1 0 0
0 0 2
centers =
1 2 3 4 5 6 7
That it means you have 7 unique instances, from 1 to 7 and there is two 1s in first column and there is not any 1s in second and third and etc.

Position of integers in vector

I think the question is pretty basic, but still keeps me busy since some time.
Lets assume we have a vector containing 4 integers randomly repetetive, like:
v = [ 1 3 3 3 4 2 1 2 3 4 3 2 1 4 3 3 4 2 2]
I am searching for the vector of all positions of each integer, e.g. for 1 it should be a vector like:
position_one = [1 7 13]
Since I want to search every row of a 100x10000 matrix I was not able to deal with linear indeces.
Thanks in advance!
Rows and columns
Since your output for every integer changes, a cell array will fit the whole task. For the whole matrix, you can do something like:
A = randi(4,10,30); % some data
Row = repmat((1:size(A,1)).',1,size(A,2)); % index of the row
Col = repmat((1:size(A,2)),size(A,1),1); % index of the column
pos = #(n) [Row(A==n) Col(A==n)]; % Anonymous function to find the indices of 'n'
than for every n you can write:
>> pos(3)
ans =
1 1
2 1
5 1
6 1
9 1
8 2
3 3
. .
. .
. .
where the first column is the row, and the second is the column for every instance of n in A.
And for all ns you can use an arrayfun:
positions = arrayfun(pos,1:max(A(:)),'UniformOutput',false) % a loop that goes over all n's
or a simple for loop (faster):
positions = cell(1,max(A(:)));
for n = 1:max(A(:))
positions(n) = {pos(n)};
end
The output in both cases would be a cell array:
positions =
[70x2 double] [78x2 double] [76x2 double] [76x2 double]
and for every n you can write positions{n}, to get for example:
>> positions{1}
ans =
10 1
2 3
5 3
3 4
5 4
1 5
4 5
. .
. .
. .
Only rows
If all you want in the column index per a given row and n, you can write this:
A = randi(4,10,30);
row_pos = #(k,n) A(k,:)==n;
positions = false(size(A,1),max(A(:)),size(A,2));
for n = 1:max(A(:))
positions(:,n,:) = row_pos(1:size(A,1),n);
end
now, positions is a logical 3-D array, that every row corresponds to a row in A, every column corresponds to a value of n, and the third dimension is the presence vector for the combination of row and n. this way, we can define R to be the column index:
R = 1:size(A,2);
and then find the relevant positions for a given row and n. For instance, the column indices of n=3 in row 9 is:
>> R(positions(9,3,:))
ans =
2 6 18 19 23 24 26 27
this would be just like calling find(A(9,:)==3), but if you need to perform this many times, the finding all indices and store them in positions (which is logical so it is not so big) would be faster.
Find linear indexes in a matrix: I = find(A == 1).
Find two dimensional indexes in matrix A: [row, col] = find(A == 1).
%Create sample matrix, with elements equal one:
A = zeros(5, 4);
A([2 10 14]) = 1
A =
0 0 0 0
1 0 0 0
0 0 0 0
0 0 1 0
0 1 0 0
Find ones as linear indexes in A:
find(A == 1)
ans =
2
10
14
%This is the same as reshaping A to a vector and find ones in the vector:
B = A(:);
find(B == 1);
B' =
0 1 0 0 0 0 0 0 0 1 0 0 0 1 0 0 0 0 0 0
Find two dimensional indexes:
[row, col] = find(A == 1)
row =
2
5
4
col =
1
2
3
You can do that with accumarray using an anonymous function as follows:
positions = accumarray(v(:), 1:numel(v), [], #(x) {sort(x.')});
For
v = [ 1 3 3 3 4 2 1 2 3 4 3 2 1 4 3 3 4 2 2];
this gives
positions{1} =
1 7 13
positions{2} =
6 8 12 18 19
positions{3} =
2 3 4 9 11 15 16
positions{4} =
5 10 14 17

Find the count of elements in one matrix equal to another

I need to compare the elements of two matrices and return a count of how many rows are exactly same. The ismember function returns one column for each column present in the matrix. But I want just one column indicating whether the row was same or not. Any ideas will be greatly appreciated.
If you want to compare corresponding rows of the two matrices, just use
result = all(A==B, 2);
Example:
>> A = [1 2; 3 4; 5 6]
A =
1 2
3 4
5 6
>> B = [1 2; 3 0; 5 6]
B =
1 2
3 0
5 6
>> result = all(A==B, 2)
result =
1
0
1
If you want to compare all pairs of rows:
result = pdist2(A,B)==0;
Example:
>> A = [1 2; 3 4; 1 2]
A =
1 2
3 4
1 2
>> B = [1 2; 3 0]
B =
1 2
3 0
>> result = pdist2(A,B)==0
result =
1 0
0 0
1 0

How to find one value from a matrix

0I have on matrix-
A=[1 2 2 3 5 5;
1 5 5 8 8 7;
2 9 9 3 3 5];
From matrix i need to count now many nonzero elements ,how any 1,how many 2 and how many 3 in each row of given matrix"A".For these i have written one code like:
[Ar Ac]=size(A);
for j=1:Ar
for k=1:Ac
count(:,j)=nnz(A(j,:));
d(:,j)=sum(A(j,:)== 1);
e(:,j)=sum(A(j,:)==2);
f(:,j)=sum(A(j,:)==3);
end
end
but i need to write these using on loop i.e. here i manually use sum(A(j,:)== 1),sum(A(j,:)== 2) and sum(A(j,:)== 3) but is there any option where i can only write sum(A(j,:)== 1:3) and store all the values in the different row i.e, the result will be like-
b=[1 2 1;
1 0 0;
0 1 2];
Matlab experts need your valuable suggestions
Sounds like you're looking for a histogram count:
U = unique(A);
counts = histc(A', U)';
b = counts(:, ismember(U, [1 2 3]));
Example
%// Input matrix and vector of values to count
A = [1 2 2 3 5 5; 1 5 5 8 8 7; 2 9 9 3 3 5];
vals = [1 2 3];
%// Count values
U = unique(A);
counts = histc(A', U)';
b = counts(:, ismember(U, vals));
The result is:
b =
1 2 1
1 0 0
0 1 2
Generalizing the sought values, as required by asker:
values = [ 1 2 3 ]; % or whichever values are sought
B = squeeze(sum(bsxfun(#(x,y) sum(x==y,2), A, shiftdim(values,-1)),2));
Here is a simple and general way. Just change n to however high you want to count. n=max(A(:)) is probably a good general value.
result = [];
n = 3;
for col= 1:n
result = [result, sum(A==col, 2)];
end
result
e.g. for n = 10
result =
1 2 1 0 2 0 0 0 0 0
1 0 0 0 2 0 1 2 0 0
0 1 2 0 1 0 0 0 2 0
Why not use this?
B=[];
for x=1:size(A,1)
B=[B;sum(A(x,:)==1),sum(A(x,:)==2),sum(A(x,:)==3)];
end
I'd do this way:
B = [arrayfun(#(i) find(A(i,:) == 1) , 1:3 , 'UniformOutput', false)',arrayfun(#(i) find(A(i,:) == 2) , 1:3 , 'UniformOutput', false)',arrayfun(#(i) find(A(i,:) == 3) , 1:3 , 'UniformOutput', false)'];
res = cellfun(#numel, B);
Here is a compact one:
sum(bsxfun(#eq, permute(A, [1 3 2]), 1:3),3)
You can replace 1:3 with any array.
you can make an anonymous function for it
rowcnt = #(M, R) sum(bsxfun(#eq, permute(M, [1 3 2]), R),3);
then running it on your data returns
>> rowcnt(A,1:3)
ans =
1 2 1
1 0 0
0 1 2
and for more generalized case
>> rowcnt(A,[1 2 5 8])
ans =
1 2 2 0
1 0 2 2
0 1 1 0

Expand matrix based on first row value (MATLAB)

My input is the following:
X = [1 1; 1 2; 1 3; 1 4; 2 5; 1 6; 2 7; 1 8];
X =
1 1
1 2
1 3
1 4
2 5
1 6
2 7
1 8
I am looking to output a new matrix based on the value of the first column. If the value is equal to 1 -- the output will remain the same, when the value is equal to 2 then I would like to output two of the values contained in the second row. Like this:
Y =
1
2
3
4
5
5
6
7
7
8
Where 5 is output two times because the value in the first column is 2 and the same for 7
Here it is (vectorized):
C = cumsum(X(:,1))
A(C) = X(:,2)
D = hankel(A)
D(D==0) = inf
Y = min(D)
Edit:
Had a small bug, now it works.
% untested code:
Y = []; % would be better to pre-allocate
for ii = 1:size(X,1)
Y = [Y; X(ii,2)*ones(X(ii,1),1)];
end