Matlab ./ sign; basic matlab - matlab

I am a newbie to Matlab and am trying to figure out code. One thing that keeps returning and confusing me is the ./ sign. I tried googling and stackoverflowing it, but I can not find any documentation on it. Just googling ./ in combination with Matlab does not get me there.
To my understanding the code in the bottom does: Take the first record of the contry.farm.potatoes and divide this by all country.farm.tractors values summed. The result however is always a 0.x numbers. So from 0 to 1. Does that mean that the ./ sign makes sure it is a percentage?
country.farm.potatoes(1,:)./sum(country.farm.tractors,1)

General explanation
the following code does the following:
(1) takes the first row of country.farm.potatoes
(2) generate a vector such that the i'th coordinate is the sum of the i'th column of country.farm.tractors
(3) divides each coordinate of (1) by the corresponding coordinate in (2).
(:,1) syntax
Given a matrix M of size m,n, the syntax '(:,1)' extracts the first row of the matrix.
It generates a row vector of size 1xn.
M(1,:)
sum(M,1) syntax
Given a matrix M of size m,n, the syntax sum(M,1) generates a row vecotr of size 1xn,
s.t each coordinate j is the sum of the j'th column of the matrix.
sum(M,1)
A ./ B syntax
Given two matrices or vectors A,B (of same size), the syntax 'C=A./B' yields a per coordinate division result.
This means that C(i,j)=A(i,j)/B(i,j)
C = A./B
for example:
[9,6,2] ./ [3,3,2]
ans = 3 2 1

Related

Vectorizing a parallel FOR loop across multiple dimensions MATLAB

Please correct me if there are somethings unclear in this question. I have two matrices pop, and ben of 3 dimensions. Call these dimensions as c,t,w . I want to repeat the exact same process I describe below for all of the c dimensions, without using a for loop as that is slow. For the discussion below, fix a value of the dimension c, to explain my thinking, later I will give a MWE. So when c is fixed I have a 2D matrix with dimension t,w.
Now I repeat the entire process (coming below!) for all of the w dimension.
If the value of u is zero, then I find the next non zero entry in this same t dimension. I save both this entry as well as the corresponding t index. If the value of u is non zero, I simply store this value and the corresponding t index. Call the index as i - note i would be of dimension (c,t,w). The last entry of every u(c,:,w) is guaranteed to be non zero.
Example if the u(c,:,w) vector is [ 3 0 4 2 0 1], then the corresponding i values are [1,3,3,4,6,6].
Now I take these entries and define a new 3d array of dimension (c,t,w) as follows. I take my B array and do the following what is not a correct syntax but to explain you: B(c,t,w)/u(c,i(c,t,w),w). Meaning I take the B values and divide it by the u values corresponding to the non zero indices of u from i that I computed.
For the above example, the denominator would be [3,4,4,2,1,1]. I hope that makes sense!!
QUESTION:
To do this, as this process simply repeats for all c, I can do a very fast vectorizable calculation for a single c. But for multiple c I do not know how to avoid the for loop. I don't knw how to do vectorizable calculations across dimensions.
Here is what I did, where c_size is the dimension of c.
for c=c_size:-1:1
uu=squeeze(pop(c,:,:)) ; % EXTRACT A 2D MATRIX FROM pop.
BB=squeeze(B(c,:,:)) ; % EXTRACT A 2D MATRIX FROM B
ii = nan(size(uu)); % Start with all nan values
[dum_row, ~] = find(uu); % Get row indices of non-zero values
ii(uu ~= 0) = dum_row; % Place row indices in locations of non-zero values
ii = cummin(ii, 1, 'reverse'); % Column-wise cumulative minimum, starting from bottomi
dum_i = ii+(time_size+1).*repmat(0:(scenario_size-1), time_size+1, 1); % Create linear index
ben(c,:,:) = BB(dum_i)./uu(dum_i);
i(c,:,:) = ii ;
clear dum_i dum_row uu BB ii
end
The central question is to avoid this for loop.
Related questions:
Vectorizable FIND function with if statement MATLAB
Efficiently finding non zero numbers from a large matrix
Vectorizable FIND function with if statement MATLAB

What is the colon operator doing here?

What are these lines of code doing?
x0 = rand(n,2)
x0(:,1)=W*x0(:,1)
x0(:,2)=H*x0(:,2)
x0=x0(:)
Is this just one big column vector?
I'd encourage you to take a MATLAB Tutorial as indexing arrays is a fundamental skill. Also see Basic Concepts in MATLAB. Line-by-line descriptions are below to get you started.
What are these lines of code doing?
Let's take this line by line.
1. This line uses rand() to generate an n x 2 matrix of uniform random numbers (~U(0,1)).
x0 = rand(n,2) % Generate nx2 matrix of U(0,1) random numbers
2. Multiply the first column by W
In this case, x0(:,1) means take all rows of x0 (the colon in the first argument) and the 1st column (the 1). Here, the * operator indicates W is a scalar or an appropriately sized array for feasible matrix multiplication (my guess is a scalar). The notation .* can be used for element-by-element multiplication; see here and here for more details.
x0(:,1)=W*x0(:,1) % Multiply (all rows) 1st column by W
3. Multiply the first column by H.
Using similar logic as #2.
x0(:,2)=H*x0(:,2) % Multiply (all rows) 2nd column by H
4. Force column
The x0(:) takes the array x0 and forces all elements into a single column.
From the documentation for colon:
A(:) reshapes all elements of A into a single column vector. This has
no effect if A is already a column vector.
A related operation is forcing a row vector by combining this with the transpose operator.
For example, try the following: x0(:).'
x0 = x0(:) % Force Column
x0 = x0(:).' % Force Row
Related Posts:
What is Matlab's colon operator called?
How does MATLAB's colon operator work?
Combination of colon-operations in MATLAB

Finding proportional columns in matrix

I have a big matrix (1,000 rows and 50,000 columns). I know some columns are correlated (the rank is only 100) and I suspect some columns are even proportional. How can I find such proportional columns? (one way would be looping corr(M(:,j),M(:,k))), but is there anything more efficient?
If I am understanding your problem correctly, you wish to determine those columns in your matrix that are linearly dependent, which means that one column is proportional or a scalar multiple of another. There's a very basic algorithm based on QR Decomposition. For QR decomposition, you can take any matrix and decompose it into a product of two matrices: Q and R. In other words:
A = Q*R
Q is an orthogonal matrix with each column as being a unit vector, such that multiplying Q by its transpose gives you the identity matrix (Q^{T}*Q = I). R is a right-triangular or upper-triangular matrix. One very useful theory by Golub and Van Loan in their 1996 book: Matrix Computations is that a matrix is considered full rank if all of the values of diagonal elements of R are non-zero. Because of the floating point precision on computers, we will have to threshold and check for any values in the diagonal of R that are greater than this tolerance. If it is, then this corresponding column is an independent column. We can simply find the absolute value of all of the diagonals, then check to see if they're greater than some tolerance.
We can slightly modify this so that we would search for values that are less than the tolerance which would mean that the column is not independent. The way you would call up the QR factorization is in this way:
[Q,R] = qr(A, 0);
Q and R are what I just talked about, and you specify the matrix A as input. The second parameter 0 stands for producing an economy-size version of Q and R, where if this matrix was rectangular (like in your case), this would return a square matrix where the dimensions are the largest of the two sizes. In other words, if I had a matrix like 5 x 8, producing an economy-size matrix will give you an output of 5 x 8, where as not specifying the 0 will give you an 8 x 8 matrix.
Now, what we actually need is this style of invocation:
[Q,R,E] = qr(A, 0);
In this case, E would be a permutation vector, such that:
A(:,E) = Q*R;
The reason why this is useful is because it orders the columns of Q and R in such a way that the first column of the re-ordered version is the most probable column that is independent, followed by those columns in decreasing order of "strength". As such, E would tell you how likely each column is linearly independent and those "strengths" are in decreasing order. This "strength" is exactly captured in the diagonals of R corresponding to this re-ordering. In fact, the strength is proportional to this first element. What you should do is check to see what diagonals of R in the re-arranged version are greater than this first coefficient scaled by the tolerance and you use these to determine which of the corresponding columns are linearly independent.
However, I'm going to flip this around and determine the point in the R diagonals where the last possible independent columns are located. Anything after this point would be considered linearly dependent. This is essentially the same as checking to see if any diagonals are less than the threshold, but we are using the re-ordering of the matrix to our advantage.
In any case, putting what I have mentioned in code, this is what you should do, assuming your matrix is stored in A:
%// Step #1 - Define tolerance
tol = 1e-10;
%// Step #2 - Do QR Factorization
[Q, R, E] = qr(A,0);
diag_R = abs(diag(R)); %// Extract diagonals of R
%// Step #3 -
%// Find the LAST column in the re-arranged result that
%// satisfies the linearly independent property
r = find(diag_R >= tol*diag_R(1), 1, 'last');
%// Step #4
%// Anything after r means that the columns are
%// linearly dependent, so let's output those columns to the
%// user
idx = sort(E(r+1:end));
Note that E will be a permutation vector, and I'm assuming you want these to be sorted so that's why we sort them after the point where the vectors fail to become linearly independent anymore. Let's test out this theory. Suppose I have this matrix:
A =
1 1 2 0
2 2 4 9
3 3 6 7
4 4 8 3
You can see that the first two columns are the same, and the third column is a multiple of the first or second. You would just have to multiply either one by 2 to get the result. If we run through the above code, this is what I get:
idx =
1 2
If you also take a look at E, this is what I get:
E =
4 3 2 1
This means that column 4 was the "best" linearly independent column, followed by column 3. Because we returned [1,2] as the linearly dependent columns, this means that columns 1 and 2 that both have [1,2,3,4] as their columns are a scalar multiple of some other column. In this case, this would be column 3 as columns 1 and 2 are half of column 3.
Hope this helps!
Alternative Method
If you don't want to do any QR factorization, then I can suggest reducing your matrix into its row-reduced Echelon form, and you can determine the basis vectors that make up the column space of your matrix A. Essentially, the column space gives you the minimum set of columns that can generate all possible linear combinations of output vectors if you were to apply this matrix using matrix-vector multiplication. You can determine which columns form the column space by using the rref command. You would provide a second output to rref that gives you a vector of elements that tell you which columns are linearly independent or form a basis of the column space for that matrix. As such:
[B,RB] = rref(A);
RB would give you the locations of which columns for the column space and B would be the row-reduced echelon form of the matrix A. Because you want to find those columns that are linearly dependent, you would want to return a set of elements that don't contain these locations. As such, define a linearly increasing vector from 1 to as many columns as you have, then use RB to remove these entries in this vector and the result would be those linearly dependent columns you are seeking. In other words:
[B,RB] = rref(A);
idx = 1 : size(A,2);
idx(RB) = [];
By using the above code, this is what we get:
idx =
2 3
Bear in mind that we identified columns 2 and 3 to be linearly dependent, which makes sense as both are multiples of column 1. The identification of which columns are linearly dependent are different in comparison to the QR factorization method, as QR orders the columns based on how likely that particular column is linearly independent. Because columns 1 to 3 are related to each other, it shouldn't matter which column you return. One of these forms the basis of the other two.
I haven't tested the efficiency of using rref in comparison to the QR method. I suspect that rref does Gaussian row eliminations, where the complexity is worse compared to doing QR factorization as that algorithm is highly optimized and efficient. Because your matrix is rather large, I would stick to the QR method, but use rref anyway and see what you get!
If you normalize each column by dividing by its maximum, proportionality becomes equality. This makes the problem easier.
Now, to test for equality you can use a single (outer) loop over columns; the inner loop is easily vectorized with bsxfun. For greater speed, compare each column only with the columns to its right.
Also to save some time, the result matrix is preallocated to an approximate size (you should set that). If the approximate size is wrong, the only penalty will be a little slower speed, but the code works.
As usual, tests for equality between floating-point values should include a tolerance.
The result is given as a 2-column matrix (S), where each row contains the indices of two rows that are proportional.
A = [1 5 2 6 3 1
2 5 4 7 6 1
3 5 6 8 9 1]; %// example data matrix
tol = 1e-6; %// relative tolerance
A = bsxfun(#rdivide, A, max(A,[],1)); %// normalize A
C = size(A,2);
S = NaN(round(C^1.5),2); %// preallocate result to *approximate* size
used = 0; %// number of rows of S already used
for c = 1:C
ind = c+find(all(abs(bsxfun(#rdivide, A(:,c), A(:,c+1:end))-1)<tol));
u = numel(ind); %// number of columns proportional to column c
S(used+1:used+u,1) = c; %// fill in result
S(used+1:used+u,2) = ind; %// fill in result
used = used + u; %// update number of results
end
S = S(1:used,:); %// remove unused rows of S
In this example, the result is
S =
1 3
1 5
2 6
3 5
meaning column 1 is proportional to column 3; column 1 is proportional to column 5 etc.
If the determinant of a matrix is zero, then the columns are proportional.
There are 50,000 Columns, or 2 times 25,000 Columns.
It is easiest to solve the determinant of a 2 by 2 matrix.
Hence: to Find the proportional matrix, the longest time solution is to
define the big-matrix on a spreadsheet.
Apply the determinant formula to a square beginning from 1st square on the left.
Copy it for every row & column to arrive at the answer in the next spreadsheet.
Find the Columns with Determinants Zero.
This is Quite basic,not very time consuming and should be result oriented.
Manual or Excel SpreadSheet(Efficient)

Inner matrix dimensions must agree?

this is my first matlab script, so this question may seem basic and blindingly obvious, but I am a little stuck at the moment.
I have a matlab script of two lines:
x = linspace(0,4*pi,100);
y = exp(-x) * sin(x);
I'm going off the Create 2-D Line Graph tutorial on Mathworks. I want to plot f(x) = e^(-x)sin(x) over the range 0 to 4pi, but I get an inner matrix dimensions must agree error on the second line. I'm not sure what's going on, because I don't think I'm creating any matrices at the moment. Any help would be appreciated! Is there something simple with syntax that I am missing? Thanks!
This is a very simple error to resolve, and I'll admit that it's a common error that most MATLAB programmers face when facing MATLAB for the first time. Specifically, when you do this line:
y = exp(-x) * sin(x);
This operation is assuming that you will perform a matrix multiplication. What you actually want to do is an element-by-element operation. You want the points in exp(-x) to multiply with the corresponding elements in sin(x). #ellieadam provided some nice links for you to review what these operations are, but if you want to do element-by-element operations, you need to add a dot (.) before the multiplication operator. As such, you need to do this instead:
y = exp(-x) .* sin(x); %// Note the dot!
This line should now work.
As a bonus for you, here's a simple example. Suppose I have these two matrices:
A = [1 2;
3 4];
B = [4 3;
2 1];
By doing A * B in MATLAB, you get:
>> A * B
ans =
8 5
20 13
Note that this will perform a matrix multiplication. By doing A .* B, this is what I get:
>> A .* B
ans =
4 6
6 4
What's different with this statement is that one element in A is multiplied by the corresponding element in B. The first row and first column of A gets multiplied by the first row, first column of B, and the same location in the output matrix is where this result is stored. You can follow along with the other elements in the output matrix and it'll give you the same behaviour. There are other element-by-element operations, such as division and exponentiation. Addition and subtraction are inherently element-by-element, as performing these operations on matrices is by definition in this fashion.
To add to #ellieadam's post, check this MathWorks post out that specifically shows you the various operations on matrices and vectors, including element-by-element operations:
http://www.mathworks.com/help/matlab/matlab_prog/array-vs-matrix-operations.html
Based on your code, your x variable is a vector. Therefore, when you are multiplying the term exp(-x) by sin(x) you are actually multiplying two vectors of the same size and is not mathematically correct. That is the reason you are getting the error.
In order to perform an acceptable operation (which is multiplication of values of two vectors but element by element) you need to change it to the following format:
x = linspace(0,4*pi,100);
y = exp(-x) .* sin(x);
.* does the element by element multiplication and, just for your own records, in the same fashion ./ does element by element division.
I hope it helped.

Remove duplicates in correlations in matlab

Please see the following issue:
P=rand(4,4);
for i=1:size(P,2)
for j=1:size(P,2)
[r,p]=corr(P(:,i),P(:,j))
end
end
Clearly, the loop will cause the number of correlations to be doubled (i.e., corr(P(:,1),P(:,4)) and corr(P(:,4),P(:,1)). Does anyone have a suggestion on how to avoid this? Perhaps not using a loop?
Thanks!
I have four suggestions for you, depending on what exactly you are doing to compute your matrices. I'm assuming the example you gave is a simplified version of what needs to be done.
First Method - Adjusting the inner loop index
One thing you can do is change your j loop index so that it only goes from 1 up to i. This way, you get a lower triangular matrix and just concentrate on the values within the lower triangular half of your matrix. The upper half would essentially be all set to zero. In other words:
for i = 1 : size(P,2)
for j = 1 : i
%// Your code here
end
end
Second Method - Leave it unchanged, but then use unique
You can go ahead and use the same matrix like you did before with the full two for loops, but you can then filter the duplicates by using unique. In other words, you can do this:
[Y,indices] = unique(P);
Y will give you a list of unique values within the matrix P and indices will give you the locations of where these occurred within P. Note that these are column major indices, and so if you wanted to find the row and column locations of where these locations occur, you can do:
[rows,cols] = ind2sub(size(P), indices);
Third Method - Use pdist and squareform
Since you're looking for a solution that requires no loops, take a look at the pdist function. Given a M x N matrix, pdist will find distances between each pair of rows in a matrix. squareform will then transform these distances into a matrix like what you have seen above. In other words, do this:
dists = pdist(P.', 'correlation');
distMatrix = squareform(dists);
Fourth Method - Use the corr method straight out of the box
You can just use corr in the following way:
[rho, pvals] = corr(P);
corr in this case will produce a m x m matrix that contains the correlation coefficient between each pair of columns an n x m matrix stored in P.
Hopefully one of these will work!
this works ?
for i=1:size(P,2)
for j=1:i
Since you are just correlating each column with the other, then why not just use (straight from the documentation)
[Rho,Pval] = corr(P);
I don't have the Statistics Toolbox, but according to http://www.mathworks.com/help/stats/corr.html,
corr(X) returns a p-by-p matrix containing the pairwise linear correlation coefficient between each pair of columns in the n-by-p matrix X.