I have 3 sequences in a cell-array :
S= {'ABC','ACB','AB'}
S{1}='ABC' means A<B<C, with the weights A:3, B:2, C:1
S{2}='ACB' means A<C<B, with the weights A:3, C:2, B:1
S{3}='AB' means A<B, with the weights A:3, B:2
I want to convert each of the strings in S into a matrix M[i,j] which has to satisfy these conditions:
M[i,i] =1
M[i,j] = 1/M[j,i]
If S{1}(i)<S{1}(j) then M[i,j]=weight(S{1}(i))/weight(S{1}(j))
For example:
In S{1}= 'ABC' , and weights will be A:3, B:2, C:1
If A<B then M[A,B]=3/2 and M[B,A]=2/3
If B<C then M[B,C]=2/1 and M[C,B]=1/2 ....etc
So the expected matrix for S{1} would be:
A B C
A [1 3/2 3
B 2/3 1 2
C 1/3 1/2 1]
In S{2}='ACB', the weights of ACB will be A:3, C:2, B:1. So the expected matrix of S{2} would be:
A B C
A [1 3 3/2
B 1/3 1 1/2
C 2/3 2 1]
In S{3}='AB', the weights of AB will be A:3, B:2, C:unknown. If a value is unknown, we put 1 in the matrix, so M[A,C]=M[C,A]=M[B,C]=M[C,B]=1. The expected matrix of S{3} would therefore be:
A B C
A [1 3/2 1
B 2/3 1 1
C 1 1 1]
How can I input the cell array of strings and convert them into matrices as above ?
This task has 2 parts (for each element of the cell array):
1) Get numbers from the strings. For example, the sequence ACB is transformed to [3, 1, 2], because the weight in the first column (A) will be 3, in the second (B) 1, and in the third (C) 2.
2) Create the matrix from that array.
For the first part: check which letter comes first and put 3 in the corresponding position. Then check which is second and put 2, and for the third put 1. If a letter does not occur, leave nan there (i.e. initialize the array with nan). With arbitrarily many letters this works the same way; just figure out first how big an array you need. Take that size from the full set of letters rather than from each string, otherwise your 3rd case will stop working: it would assume you only have 2 elements and produce a 2x2 matrix.
The second part goes like this:
seq=[3,1,2];
M = seq' * (1./seq);
M(isnan(M)) = 1; % all the numbers with nan become 1.
Note that this gives doubles (1.5 and 0.3333), not fractions (3/2, 1/3). Representing the entries as fractions takes a bit more work in general: for each element of the matrix, find the gcd of the numerator and denominator and divide both by it. In this simple case of 3 elements you can simply store the numerators and denominators in two matrices, with 1 on the diagonal, since no two of the weights 1, 2 and 3 share a common divisor.
numerator = repmat(seq', 1, 3);   % numerator(i,j) = seq(i)
denominator = repmat(seq, 3, 1);  % denominator(i,j) = seq(j)
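Putting the two parts together, a minimal sketch could look like the following. It assumes the full set of letters is known in advance (the variable letters is an assumption for illustration, not something from the question) and that the first letter of each string always gets the weight numel(letters).

```matlab
S = {'ABC', 'ACB', 'AB'};
letters = 'ABC';                 % assumed full alphabet, fixes the matrix size
n = numel(letters);
M = cell(size(S));
for k = 1:numel(S)
    % Part 1: weight per letter, nan for letters missing from S{k}
    seq = nan(1, n);
    [found, pos] = ismember(letters, S{k});   % pos(i): position of letters(i) in S{k}
    seq(found) = n + 1 - pos(found);          % first letter -> n, second -> n-1, ...
    % Part 2: M(i,j) = seq(i)/seq(j), unknown ratios become 1
    Mk = seq' * (1 ./ seq);
    Mk(isnan(Mk)) = 1;
    M{k} = Mk;
end
```

M{1}, M{2} and M{3} should then match the three expected matrices from the question. For display only, format rat or rats(M{k}) prints a rational approximation of the doubles.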
Given an n-by-m matrix A, an m-by-2 matrix B, and an m-by-1 vector C containing 1s and 2s (C=[1 2 1 2 1...]) that says which column of B to use in each row, I want every row of A to be multiplied with the selected entries. How can this be done? Or equivalently, given D = A*B, how can I access only the values dictated by C? I tried D(:,C), but the result is not what I expected.
Example: a = [1 2; 3 4; 5 6], c = [1 2 1], a(?) = [1 4 5]
Any ideas?
%example data
n=10;m=20;
A=rand(n,m)
B=rand(m,2)
C=round(rand(m,1))+1;
%solution:
B2=B(:,1); %multiplication vector
B2(C==2)=B(C==2,2) %change the ones where C==2
A*B2
You can run the following command for the last example:
a(sub2ind([3,2],1:3,c))'
In the general case you can do the following:
% D is an n-by-2 matrix and C has n entries; sub2ind needs both
% subscript vectors to have the same size, hence the (:)
D(sub2ind([n,2],(1:n)',C(:)))
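As a small self-contained check of the sub2ind approach, here is a sketch on the example from the question. The variable names rows, idx and picked are only illustrative.

```matlab
a = [1 2; 3 4; 5 6];
c = [1 2 1];                          % column to pick in each row
rows = (1:size(a, 1))';               % row subscripts as a column vector
idx = sub2ind(size(a), rows, c(:));   % linear indices of a(1,1), a(2,2), a(3,1)
picked = a(idx);                      % column vector with one element per row
```

Converting both subscript vectors to columns avoids a size mismatch inside sub2ind when c happens to be a row.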
Good day,
I have a question about something I want to achieve without a loop, if possible. As the title says, I need to do a windowed subtraction of vectors that are not the same size and then find the mean of the results.
As an example, let's say that we have the vector a = [2 3 4 5 6] and the vector b = [1 2].
The program has to move a window the size of the smaller vector (here b) over the bigger one (a). It starts at the first two elements of a, subtracts b from them, sums the results and takes the mean.
In this example it computes the subtractions 2-1 = 1 and 3-2 = 1, sums the results, 1+1 = 2, and divides by 2 (the size of b). The result is 1.
Then we move the window to the second element of a (values 3 and 4, i.e. indices 2 and 3) and repeat the process up to the last elements of a.
As the final result we need the vector c, which consists of the elements [1 2 3 4] for this example.
Is it possible to do this without looping? I have data sets with over 10k elements. Thanks in advance
I can solve it with only one loop, iterating through "b" (two iterations in your example).
Declare the vectors (as columns! This is needed for MATLAB's computations to work)
a = [2 3 4 5 6]';
b = [1 2]';
Declare matrix for computed results. Each column represents subtractions of elements in "a" with one of the elements in "b".
c = zeros(length(a)-length(b)+1,length(b));
for k = 1:length(b)
c(:,k) = a(k:length(a)-length(b)+k)-b(k);
end
Now just sum the elements in "c" row wise and divide by length of "b" to get the mean
result = sum(c,2)/length(b);
You can simplify this for your exact example, but this is a generic solution for any vectors "a" and "b", where "b" is the smaller vector.
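The remaining loop can probably be avoided as well. The mean of a(k:k+m-1) - b equals the mean of that window of a minus the mean of b, so a moving average of a is enough. The following is a sketch of that idea using conv, assuming b is not longer than a; it is a different technique from the loop above, not a rewrite of it.

```matlab
a = [2 3 4 5 6];
b = [1 2];
m = numel(b);
% moving average of a over windows of length m, complete windows only
window_mean = conv(a, ones(1, m) / m, 'valid');
c = window_mean - mean(b);     % [1 2 3 4] for this example
```

Because the averaging kernel is uniform, the flip that conv applies to it does not matter here.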
I am confused by the output of
[m,n]=hist(y,x)
such as
M = [1, 2, 3;
4, 5, 6;
1, 2, 3];
[m,n] = hist(M,1:3)
Which results in
m = 2 0 0
0 2 0
1 1 3
Can someone please explain how m is calculated?
hist actually takes vectors as input arguments; you passed a matrix, so it handles each column as a separate input vector. The output is the number of elements in each bin (in your case the bin centers 1:3, the second argument).
[m,n] = hist([1,2,3;4,5,6;1,2,3],1:3)
treats each column as one input. You put in 3 inputs (# of columns) and you get 3 outputs.
[2 0 1]'
means that for the input [1;4;1] and the bin centers 1:3, two elements are in bin 1 and one element is in bin 3. The value 4 lies beyond the last center, so it is counted in the last bin.
Look at the last column of m: here all three values are in the third bin, which makes sense, since the corresponding vector is [3;6;3], and all of those numbers have to go into bin 3.
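The column-by-column reading can be checked directly. The sketch below calls hist once on the whole matrix and once per column and compares the counts; centers and m_cols are illustrative names. Note that newer MATLAB releases list hist as not recommended, although it is still available.

```matlab
M = [1 2 3; 4 5 6; 1 2 3];
centers = 1:3;
m = hist(M, centers);               % one histogram per column of M
% the same counts, computed one column at a time
m_cols = zeros(numel(centers), size(M, 2));
for k = 1:size(M, 2)
    m_cols(:, k) = hist(M(:, k), centers);
end
same = isequal(m, m_cols);          % true if both ways agree
```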
I've noticed various cases in Matlab and Octave where functions accept both matrices and vectors, but don't do the same thing with vectors as they do with matrices.
This can be frustrating: when you pass a matrix with a variable number of rows/columns, it may be interpreted as a vector whenever the height or width happens to be 1, and then do something you don't expect. That makes for difficult debugging and weird conditional edge cases.
I'll list a few I've found, but I'm curious what other people have run into.
(Note: I'm only looking for cases where code accepts matrices as valid input. Anything that raises an exception when a non-vector matrix is given as an argument doesn't count)
1) "diag" can be used to mean diagonal of a matrix or turn a vector into a diagonal matrix
Since the former is generally only used for square matrices this isn't so egregious in Matlab, but in Octave it can be particularly painful when Octave interprets a vector beginning with a nonzero element followed by all zeros as a "diagonal matrix", i.e.
t=eye(3);
size(diag(t(:,3))) == [3,3]
size(diag(t(:,2))) == [3,3]
size(diag(t(:,1))) == [1,1]
2) Indexing into a row-vector with logicals returns a row-vector
Indexing into anything else with logicals returns a column vector
a = 1:3;
b = true(1,3);
size(a(b)) == [1, 3]
a = [a; a];
b = [b; b];
size(a(b)) == [6, 1]
3) Indexing into a vector v with an index vector i returns a vector of the same (row/col) type as v. But if either v or i is a matrix, the return value has the same size as i.
a = 1:3;
b = a';
size(a(b)) == [1, 3]
b = [b,b];
size(a(b)) == [3, 2]
4) max, min, sum etc. operate on the columns of a matrix M individually unless M is 1xn, in which case they operate on M as a single row vector
a = 1:3
size(max(a)) == [1, 1]
a = [a;a]
size(max(a)) == [1, 3]
max is particularly bad since it can't even take a dimension as an argument (unlike sum)
What other such cases should I watch out for when writing octave/matlab code?
Each language has its own concepts. An important point of this language is to very often think of matrices as an array of vectors, each column an entry. Things will start to make sense then. If you don't want that behavior, use matrix(:) as the argument to those functions which will pass a single vector, rather than a matrix. For example:
octave> a = magic (5);
octave> max (a)
ans =
23 24 25 21 22
octave> max (a(:))
ans = 25
1) This is not true with at least Octave 3.6.4. I'm not 100% sure, but it may be related to this bug, which has already been fixed.
2) If you index with boolean values, the index is considered to be a mask and treated as such. If you index with non-boolean values, they are treated as the indices of the values. This makes perfect sense to me.
3) This is not true. The returned value always has the same size as the index, regardless of whether it is a matrix or a vector. The only exception is that if the index is a vector, the output takes the orientation of the indexed vector (a single row in the example below). The idea is that indexing with a single vector/matrix returns something of the same size:
octave> a = 4:7
a =
4 5 6 7
octave> a([1 1])
ans =
4 4
octave> a([1 3])
ans =
4 6
octave> a([1 3; 3 1])
ans =
4 6
6 4
4) max does take dimension as argument at least in Octave. From the 3.6.4 help text of max:
For a vector argument, return the maximum value. For a matrix
argument, return the maximum value from each column, as a row vector,
or over the dimension DIM if defined, in which case Y should be set to
the empty matrix (it's ignored otherwise).
The rest applies as I said in the intro. If you supply a matrix, it will think of each column as a dataset.
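One way to sidestep the 1xn special case is to always pass the dimension explicitly. A small sketch, with illustrative variable names:

```matlab
a = 1:3;                        % a matrix that happens to have one row
col_max = max(a, [], 1);        % maximum of each column: stays 1x3
row_max = max(a, [], 2);        % maximum of each row: 1x1
col_sum = sum(a, 1);            % sum of each column: stays 1x3
```

With the dimension given, the result has the same shape whether a has one row or many.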
1) As pointed out by the other user, this is not true with Octave >= 3.6.4.
In case 2) the rule is: for vectors, always return a vector of the same shape; for anything else, return a column vector. Consider:
>> a = reshape (1:3, 1,1,3)
a(:,:,1) =
1.0000e+000
a(:,:,2) =
2.0000e+000
a(:,:,3) =
3.0000e+000
>> b = true(1,3)
b =
1×3 logical array
1 1 1
>> a(b)
ans(:,:,1) =
1.0000e+000
ans(:,:,2) =
2.0000e+000
ans(:,:,3) =
3.0000e+000
>> a = [a;a]
a(:,:,1) =
1.0000e+000
1.0000e+000
a(:,:,2) =
2.0000e+000
2.0000e+000
a(:,:,3) =
3.0000e+000
3.0000e+000
>> b = [b;b]
b =
2×3 logical array
1 1 1
1 1 1
>> a(b)
ans =
1.0000e+000
1.0000e+000
2.0000e+000
2.0000e+000
3.0000e+000
3.0000e+000
You can see that this makes sense, since vectors have a clear 'direction' but matrices of other shapes do not when you remove elements. EDIT: actually I just checked and Octave doesn't seem to work exactly this way, but it probably should.
3) This is consistent with 2). Essentially, if you supply a list of indices, the direction of the indexed vector is preserved. If you supply indices with a shape like a matrix, the new information, namely the shape of the index matrix, is used. This is more flexible, since you can always do a(b(:)) to preserve the shape of a if you so wish. You may say it is not consistent, but remember that indexing with logicals may reduce the number of elements to be returned, so they cannot be reshaped in this way.
4) As pointed out in a comment, you can specify the dimension for max/min to operate on: min(rand(3),[],1) or max(rand(3),[],2). In this case there are 'legacy' issues with these functions which date back to when they were first created and which are now very difficult to change without upsetting people.
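The shape rules from points 2) and 3) can be seen side by side in a short sketch. The variable names are only for illustration, and the sizes in the comments are the ones stated by those rules for plain 2-D arrays.

```matlab
a = 1:3;                 % row vector
idx = [1 2 3; 3 2 1]';   % 3x2 matrix of indices
r1 = a(idx);             % takes the shape of idx: 3x2
r2 = a(idx(:));          % vector indexed by vector keeps the orientation of a: 1x6

A2 = [a; a];             % 2x3 matrix
mask = true(size(A2));
r3 = A2(mask);           % logical mask on a matrix: 6x1 column
r4 = a(true(1, 3));      % logical mask on a row vector: 1x3 row
```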
Suppose now I have two vectors of same length:
A = [1 2 2 1];
B = [2 1 2 2];
I would like to create a matrix C of size m-by-n, where m=max(A) and n=max(B).
C = zeros(m,n);
for i = 1:length(A)
u = A(i);
v = B(i);
C(u,v)=C(u,v)+1;
end
and get
C =[0 2;
1 1]
More precisely, we treat the corresponding entries of A and B as row and column indices into C, and C(u,v) is the number of elements in {i | A(i)=u and B(i)=v, i = 1,2,...,length(A)}
Is there a faster way to do that?
Yes. Use sparse. It assembles (i.e., sums up) the matrix values for repeating row-column pairs for you. You need an additional vector with the values that will be assembled into the matrix entries. If you use ones(size(A)), you will have exactly what you need - counting of repeated row-column pairs
spA=sparse(A, B, ones(size(A)));
full(spA)
ans =
0 2
1 1
The same can be obtained by simply passing the scalar 1 to sparse instead of a vector of values.
For matrices that have a large number of zero entries it is absolutely crucial that you use sparse storage. Another function you could use is accumarray. It can essentially do the same thing, but returns a full (dense) matrix by default:
AA=accumarray([A;B]', 1);
AA =
0 2
1 1
You can pass size argument to accumarray if you want to create a matrix of specific size
AA=accumarray([A;B]', 1, [2 3]);
AA =
0 2 0
1 1 0
Note that you can actually also make it produce sparse matrices, and use a different operator in assembly (i.e., not necessarily a sum)
AA=accumarray([A;B]', 1, [2 3], @sum, 0, true)
will produce a sparse matrix (last parameter set to true) using sum for assembly and 0 as a fill value, i.e. the value used for row-column pairs that do not occur in A/B.
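To convince yourself that the three approaches agree, a short sketch like the following can be run on the example data. The names C_loop, C_sparse and C_accum are only illustrative.

```matlab
A = [1 2 2 1];
B = [2 1 2 2];

% reference: the loop from the question
C_loop = zeros(max(A), max(B));
for i = 1:numel(A)
    C_loop(A(i), B(i)) = C_loop(A(i), B(i)) + 1;
end

% sparse sums the value 1 over repeated (row, column) pairs
C_sparse = full(sparse(A, B, 1));

% accumarray does the same on a full matrix
C_accum = accumarray([A(:) B(:)], 1);

all_equal = isequal(C_loop, C_sparse, C_accum);
```

Using A(:) and B(:) to build the subscript matrix makes the call independent of whether A and B are rows or columns.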