how to programme in Matlab to obtain transition probability matrix? - matlab

I have a sequence x= [12,14,6,15,15,15,15,6,8,8,18,18,14,14] so I want to make transition probability matrix. Transition probability matrix calculated by equation i.e. probability=(number of pairs x(t) followed by x(t+1))/(number of pairs x(t) followed by any state). Matrix should be like below
6 8 12 14 15 18
6 0 1/2 0 0 1/2 0
8 0 1/2 0 0 0 1/2
12 0 0 0 1 0 0
14 1/2 0 0 1/2 0 0
15 1/4 0 0 0 3/4 0
18 0 0 0 0 1/2 1/2
by following code I can do
m = max(x);
n = numel(x);
y = zeros(m,1);
p = zeros(m,m);
for k=1:n-1
y(x(k)) = y(x(k)) + 1;
p(x(k),x(k+1)) = p(x(k),x(k+1)) + 1;
end
p = bsxfun(#rdivide,p,y); p(isnan(p)) = 0;
but with this code matrix forms of order maximum state present in sequence i.e. matrix becomes of 18*18, and much more places zero occurs. I want matrix like above posted by me how to do it.

Step 1 - organize data and generate empty transition table
x= [12,14,6,15,15,15,15,6,8,8,18,18,14,14]
xind = zeros(1,length(x));
u = unique(x) % find unique elements and sort
for ii = 1:length(u)
xmask = x==u(ii); % locate all elements of a single value
xind = xind+ii*xmask; % number them in the order listed in u
end
Output is labeled Markov chain (elements are labels instead of meaningful values)
>> u
u =
6 8 12 14 15 18
>> xind
xind =
3 4 1 5 5 5 5 1 2 2 6 6 4 4
Step 2 - build "from-to" table for each hop
>> T = [xind(1:end-1);xind(2:end)]
T =
3 4 1 5 5 5 5 1 2 2 6 6 4
4 1 5 5 5 5 1 2 2 6 6 4 4
Each column is a transition. First row is "from" label, second row is "to" label.
Step 3 - count frequencies and create transition table
p = zeros(length(u));
for ii = 1:size(T,2)
px = T(1,ii); % from label
py = T(2,ii); % to label
p(px,py) = p(px,py)+1;
end
Output is aggregated frequency table. Each element is counts of a hop. Row number is "from" and column number is "to".
>> p
p =
0 1 0 0 1 0
0 1 0 0 0 1
0 0 0 2 0 0
2 0 0 1 0 0
1 0 0 0 3 0
0 0 0 1 0 1
For example the 3 means 3 transitions from 5th label to 5th label (actual value is 15 to 15)
Step 4 - normalize row vectors to get probability table
>> p./repmat(sum(p,2),1,length(u))
ans =
0 0.5000 0 0 0.5000 0
0 0.5000 0 0 0 0.5000
0 0 0 1.0000 0 0
0.5000 0 0 0.5000 0 0
0.2500 0 0 0 0.7500 0
0 0 0 0.5000 0 0.5000
alternative loop version
for ii = 1:size(p,1)
count = sum(p(ii,:));
p(ii,:) = p(ii,:)/count;
end

x=[12,14,6,15,15,15,15,6,8,8,18,18,14,14]; %discretized driving cycle
n=numel(x);%total no of data points in driving cycle
j=0;
z=unique(x); % unique data points in the driving cycle and also in arranged form
m=numel(z); % total number of unique data points
N=zeros(m); % square matrix for counting all unique data points
for k=1:1:m; % using loop cycle for unique data points all combinations; for this k is used
for l=1:1:m;
for i=1:1:n-1;
j=i+1;
if x(i)== z(k) & x(j)==z(l);
N(k,l) = N(k,l)+1;
end
i=i+1;
end
l=l+1;
end
k=k+1;
end
N
s=sum(N,2);
Tpm= N./s %transition probability matrix

%%Sample matrix
p=magic(8)
%%Fill rows and cols 3,5 with 0's
p([3 5],:)=0
p(:,[3 5])=0
%%The code
lb=[]
for k = [length(p):-1:1]
if any(p(k,:)) | any(p(:,k))
lb=[ [k],lb ]
else
p(k,:)=[]
p(:,k)=[]
end
end
lb keeps your original index

Related

Faster way to transpose each row of a matrix and multiply the resulting vector by some other matrix?

I have an input matrix X with dimensions N_rows x N_cols. I also have a sparse, tridiagonal matrix M which is square of size N_rows x N_rows. These are created as follows:
N_rows = 3;
N_cols = 6;
X = rand(N_rows,N_cols);
mm = 10*ones(N_cols,1); % Subdiagonal elements
dd = 20*ones(N_cols,1); % Main diagonal elements
pp = 30*ones(N_cols,1); % Superdiagonal elements
M = spdiags([mm dd pp],-1:1,N_cols,N_cols);
and look like the following:
>> X
X =
0.4018 0.1233 0.4173 0.9448 0.3377 0.1112
0.0760 0.1839 0.0497 0.4909 0.9001 0.7803
0.2399 0.2400 0.9027 0.4893 0.3692 0.3897
full(M)
ans =
2 3 0 0 0 0
1 2 3 0 0 0
0 1 2 3 0 0
0 0 1 2 3 0
0 0 0 1 2 3
0 0 0 0 1 2
I would like to take each row of X, and do a matrix multiplication with M, and piece the obtained rows back together to obtain an output Y. At the moment, I achieve this successfully with the following:
Y = (M*X.').';
The example above is for a 3x6 matrix for X, but in reality I need to do this for a matrix with dimensions 500 x 500, about 10000 times, and the profiler says that this operation in the bottleneck in my larger code. Is there a faster way to do this row-by-row matrix multiplication multiplication?
On my system, the following takes around 20 seconds to do this 10000 times:
N_rows = 500;
N_cols = 500;
X = rand(N_rows,N_cols);
mm = 10*ones(N_cols,1); % Subdiagonal elements
dd = 20*ones(N_cols,1); % Main diagonal elements
pp = 30*ones(N_cols,1); % Superdiagonal elements
M = spdiags([mm dd pp],-1:1,N_cols,N_cols);
tic
for k = 1:10000
Y = (M*X.').';
end
toc
Elapsed time is 18.632922 seconds.
You can use X*M.' instead of (M*X.').';. This saves around 35% of time on my computer.
This can be explained because transposing (or permuting dimensions) implies rearranging the elements in the internal (linear-order) representation of the matrix, which takes time.
Another option is using conv2:
Y = conv2(X, [30 20 10], 'same');
Explanation:
There is a tridiagonal matrix that all elements on each diagonal are identical to each other:
M =
2 3 0 0 0 0
1 2 3 0 0 0
0 1 2 3 0 0
0 0 1 2 3 0
0 0 0 1 2 3
0 0 0 0 1 2
Suppose you want to multiply the matrix by a vector:
V = [11 ;12 ;13 ;14 ;15 ;16];
R = M * V;
Each element of the vector R is computed by sum of products of each row of M by V:
R(1):
2 3 0 0 0 0
11 12 13 14 15 16
R(2):
1 2 3 0 0 0
11 12 13 14 15 16
R(3):
0 1 2 3 0 0
11 12 13 14 15 16
R(4):
0 0 1 2 3 0
11 12 13 14 15 16
R(5):
0 0 0 1 2 3
11 12 13 14 15 16
R(6):
0 0 0 0 1 2
11 12 13 14 15 16
It is the same as multiplying a sliding window of [1 2 3] by each row of M. Basically convolution applies a sliding window but first it reverses the direction of window so we need to provide the sliding window in the reversed order to get the correct result. Because of that I used Y = conv2(X, [30 20 10], 'same'); instead of Y = conv2(X, [10 20 30], 'same');.

Loopless submatrix assignment in Matlab

I have a matrix F of size D-by-N and a vector A of length N of random integers in the range [1,a]. I want to create a matrix M of size D * a such that each colum M(:,i) has the vector F(:,i) starting from the index (A(i)-1)*D+1 to (A(i)-1)*D+D.
Example:
F = [1 2 3 10
4 5 6 22]
A = [3 2 1 2]
a = 4
M = [0 0 3 0
0 0 6 0
0 2 0 10
0 5 0 22
1 0 0 0
4 0 0 0
0 0 0 0
0 0 0 0]
I can do it with a simple loop
for i = 1 : N
M((A(i)-1)*D+1:(A(i)-1)*D+D,i) = F(:,i);
end
but for large N this might take a while. I am looking for a way to do it without loop.
You can use bsxfun for a linear-indexing based approach -
[D,N] = size(F); %// Get size of F
start_idx = (A-1)*D+1 + [0:N-1]*D*a; %// column start linear indices
all_idx = bsxfun(#plus,start_idx,[0:D-1]'); %//'# all linear indices
out = zeros(D*a,N); %// Initialize output array with zeros
out(all_idx) = F; %// Insert values from F into output array
Sample run -
F =
1 2 3 10
4 5 6 22
A =
3 2 1 2
a =
4
out =
0 0 3 0
0 0 6 0
0 2 0 10
0 5 0 22
1 0 0 0
4 0 0 0
0 0 0 0
0 0 0 0

Extended block diagonal matrix in Matlab

I know that to generate a block-diagonal matrix in Matlab the command blkdiag generates such a matrix:
Now I am faced with generating the same block-diagonal matrix, but with also matrix elements B_1, B_2,..., B_{n-1} on the upper diagonal, zeros elsewhere:
I guess this can be hardcoded with loops, but I would like to find a more elegant solution. Any ideas on how to implement such a thing?
P.S. I diag command, that using diag(A,k) returns the kth diagonal. I need something for writing in the matrix, for k>0, and for block matrices, not only elements.
There is a submission on the File Exchange that can do this:
(Block) tri-diagonal matrices.
You provide the function with three 3D-arrays, each layer of the 3D array represents a block of the main, sub- or superdiagonal. (Which means that the blocks will have to be of the same size.) The result will be a sparse matrix, so it should be rather efficient in terms of memory.
An example usage would be:
As = bsxfun(#times,ones(3),permute(1:3,[3,1,2]));
Bs = bsxfun(#times,ones(3),permute(10:11,[3,1,2]));
M = blktridiag(As, zeros(size(Bs)), Bs);
where full(M) gives you:
1 1 1 10 10 10 0 0 0
1 1 1 10 10 10 0 0 0
1 1 1 10 10 10 0 0 0
0 0 0 2 2 2 11 11 11
0 0 0 2 2 2 11 11 11
0 0 0 2 2 2 11 11 11
0 0 0 0 0 0 3 3 3
0 0 0 0 0 0 3 3 3
0 0 0 0 0 0 3 3 3
This could be one approach based on kron, tril & triu -
%// Take all A1, A2, A3, etc in a cell array for easy access and same for B
A = {A1,A2,A3,A4}
B = {B1,B2,B3}
%// Setup output array with the A blocks at main diagonal
out = blkdiag(A{:})
%// logical array with 1s at places where kth diagonal elements are to be put
idx = kron(triu(true(numel(A)),k) & tril(true(numel(A)),k),ones(size(A{1})))>0
%// Put kth diagonal blocks using the logical mask
out(idx) = [B{1:numel(A)-k}]
Sample run with k = 1 for 2 x 2 sizes matrices -
>> A{:}
ans =
0.3467 0.7966
0.6228 0.7459
ans =
0.1255 0.0252
0.8224 0.4144
ans =
0.7314 0.3673
0.7814 0.7449
ans =
0.8923 0.1296
0.2426 0.2251
>> B{:}
ans =
0.3500 0.9275
0.2871 0.0513
ans =
0.5927 0.8384
0.1629 0.1676
ans =
0.5022 0.3554
0.9993 0.0471
>> out
out =
0.3467 0.7966 0.3500 0.9275 0 0 0 0
0.6228 0.7459 0.2871 0.0513 0 0 0 0
0 0 0.1255 0.0252 0.5927 0.8384 0 0
0 0 0.8224 0.4144 0.1629 0.1676 0 0
0 0 0 0 0.7314 0.3673 0.5022 0.3554
0 0 0 0 0.7814 0.7449 0.9993 0.0471
0 0 0 0 0 0 0.8923 0.1296
0 0 0 0 0 0 0.2426 0.2251

Check neighbour pixels Matlab

I have a A which is 640x1 cell. where the value of each cell A(i,1) varies from row to row, for example A(1,1) =[], while A(2,1)=[1] and A(3,1)=[1,2,3].
There is another matrix B of size 480x640, where the row_index (i) of vector A corresponds to the col_index of matrix B. While the cell value of each row in vector A corresponds to the row_index in matrix B. For example, A(2,1)=[1] this means col_2 row_1 in matrix B, while A(3,1)=[1,2,3] means col_3 rows 1,2&3 in matrix B.
What I'm trying to do is to for each non-zero value in matrix B that are referenced from vector A, I want to check whether there are at least 4 other neighbors that are also referenced from vector A. The number neighbors of each value are determined by a value N.
For example, this is a part of matrix B where all the zeros"just to clarify, as in fact they may be non-zeros" are the neighbors of pixel X when N=3:
0 0 0 0 0 0 0
0 0 0 0 0 0 0
0 0 0 0 0 0 0
0 0 0 X 0 0 0
0 0 0 0 0 0 0
0 0 0 0 0 0 0
0 0 0 0 0 0 0
As shown, because N=3, all these zeros are pixel X's neighbors. So if more than 4 neighbor pixels are found in vector A then do something e.g G=1 if not then G=0;
So if anyone could please advise. And please let me know if any more clarification is needed.
The first thing I would do is to convert your cell of indices A to a logic matrix Amat. This makes it easier to check how many neighbours are included in A.
Here is a solution that uses this conversion. I hope the comments are enough to make it understandable.
clear all
clc
nCols = 7;
nRows = 6;
N = 3; %// Number of neighbours
M = 4; %// Minimum number of wanted connections
%// Create cell of indices A
A = cell(nCols,1);
A{1} = [];
A{2} = 1;
A{3} = [1 2 3];
A{4} = [2 5];
A{5} = 3;
A{6} = [3 5];
A{7} = [1 4 6];
%// Generate radom data B
%// (There is a 50% probability for each element of B to be zero)
Bmax = 17;
B = (randi(2,nRows,nCols)-1).*(randi(Bmax,nRows,nCols));
%// Convert the cell A to a logic matrix Amat
Amat = zeros(size(B));
for ii = 1:nCols
Amat(A{ii},ii) = 1;
end
A
B
Amat
for ii = 1:nCols
for jj = A{ii}
if B(jj,ii)>0
%// Calculate neighbour indices with a lower bound of 1
%// and an upper bound of nCols or nRows
col_lim_low = max(1,ii-N);
col_lim_high = min(nCols,ii+N);
row_lim_low = max(1,jj-N);
row_lim_high = min(nRows,jj+N);
%// Get the corresponding neighbouring-matrix from Amat
A_neighbours = ...
Amat(row_lim_low:row_lim_high,col_lim_low:col_lim_high);
%// Check the number of neighbours against the wanted number M
if sum(A_neighbours(:)) > 1 + M
%# do something
fprintf('We should do something here at (%d,%d)\n',jj,ii)
end
end
end
end
The following is a printout from one run of the code.
A =
[]
[ 1]
[1x3 double]
[1x2 double]
[ 3]
[1x2 double]
[1x3 double]
B =
1 5 0 0 11 0 16
0 13 13 0 0 0 9
0 0 0 5 0 0 0
3 8 16 16 0 2 12
0 0 5 0 9 9 0
12 13 0 6 0 15 0
Amat =
0 1 1 0 0 0 1
0 0 1 1 0 0 0
0 0 1 0 1 1 0
0 0 0 0 0 0 1
0 0 0 1 0 1 0
0 0 0 0 0 0 1
We should do something here at (1,2)
We should do something here at (2,3)
We should do something here at (5,6)
We should do something here at (4,7)
Since you have a one-to-one correspondence between A and B, there is no need to work on A. B is a logical matrix (0 if not referenced in A, 1 if referenced). You can therefore apply a simple filter2 function counting the number of active neighbors within the 8 closest elements.
Here is the code
B = rand(10,10); %generate binary matrix
h = [1 1 1;1 0 1;1 1 1]; %filter to be applied
filter2(h,B,'same')>=4 & B>0 %apply filter on B, count minimum of 4 neighbors, if only B>1
EDIT
To transform a cell array B into binary presence (0=empty, 1=not empty), use of cellfunis straightforward
B = ~cellfun(#isempty,B);
And see Armo's response to your previous question for how to create B based on A.

Computing linear indices and counting non-zero entries of large sparse matrices in row-first order

Say I have a sparse non-rectangular matrix A:
>> A = round(rand(4,5))
A =
0 1 0 1 1
0 1 0 0 1
0 0 0 0 1
0 1 1 0 0
I would like to obtain the matrix B where the non-zero entries of A are replaced by their linear index in row-first order:
B =
0 2 0 4 5
0 7 0 0 10
0 0 0 0 15
0 17 18 0 0
and the matrix C that where the non-zero entries of A are replaced by the order in which they are found in a row-first search:
C =
0 1 0 2 3
0 4 0 0 5
0 0 0 0 6
0 7 8 0 0
I am looking for vectorized solutions for this problem that scale to large sparse matrices.
If I understand what you are asking, a couple of tranpositions should do the trick. The key is that find(A.') will do "row-first" indexing on A, where .' is the short hand for the transpose of a 2D matrix. So:
>> A = round(rand(4,5))
A =
0 1 0 1 1
0 1 0 0 1
0 0 0 0 1
0 1 1 0 0
then
B=A.';
B(find(B)) = find(B);
B=B.';
gives
B =
0 2 0 4 5
0 7 0 0 10
0 0 0 0 15
0 17 18 0 0
Here's a solution that doesn't require any transposing back and forth:
>> B = A; %# Initialize B
>> C = A; %# Initialize C
>> mask = logical(A); %# Create a logical mask using A
>> [r,c] = find(A); %# Find the row and column indices of non-zero values
>> index = c + (r - 1).*size(A,2); %# Compute the row-first linear index
>> [~,order] = sort(index); %# Compute the row-first order with
>> [~,order] = sort(order); %# two sorts
>> B(mask) = index %# Fill non-zero elements of B
B =
0 2 0 4 5
0 7 0 0 10
0 0 0 0 15
0 17 18 0 0
>> C(mask) = order %# Fill non-zero elements of C
C =
0 1 0 2 3
0 4 0 0 5
0 0 0 0 6
0 7 8 0 0
An outline (Matlab isn't on this machine, so verification is delayed):
You can use find() to get the coordinate list. Let T = A'; [r,c] = find(T)
From the coordinate list, you can create both B and C. Let valB = sub2ind([r,c],T) and valC = 1:length(r)
Use the sparse command to create B and C, e.g. B = sparse(r,c,valB), and then transpose, e.g. B = B' (or could do sparse(c,r,valB)).
Or, as #IanHincks suggests, let B = A'; B(find(B)) = find(B). (I'm not sure why .' is recommended, but, again, I don't have Matlab in front of to check.) For C, simply use C(find(C)) = 1:nnz(A). And transpose back, as he suggests.
Personally, I work with coordinate lists all the time, having migrated away from the sparse matrix representation, just to cut out the costs of index lookups.