I want to generate 100 different 5-by-5 random matrices in [0,1] using MATLAB with the following property: the required matrix A=[a_{ij}] must satisfy a_{ih}+a_{hj}-a_{ij}-0.5=0 for all i, j, h. (Matrix A is a so-called fuzzy preference matrix, i.e. a_{ji}=1-a_{ij} for all i, j, and it must also be consistent.) But I am stuck on writing the MATLAB code. Could anyone help me? Thanks!
Example of such a matrix, but not consistent:
A=[.5 .5 .5 .8155 .5 .3423;...
.5 .5 .6577 .8155 .5 .3423;...
.5 .3423 .5 .8662 .75 .3423;...
.1845 .1845 .1338 .5 .25 .25;...
.5 .5 .25 .75 .5 .25;...
.6577 .6577 .6577 .75 .75 .5]
Example of 3 by 3 consistent Fuzzy Preference matrix:
B=[.5 .2 .5;...
.8 .5 .8;...
.5 .2 .5]
To solve this question we need to find solutions to a system of linear equations. The unknowns are the matrix entries, so for an N-by-N matrix there are N^2 unknowns. There are N^3 equations, one for each combination of i, j, and h. This is an overdetermined system, but it is consistent; it's easy to see that the constant matrix with all entries 0.5 is a solution.
The system of equations can be written in matrix form, with matrix M defined as follows:
N = 5; % size of resulting matrix
M = zeros(N^3,N^2); % one row per (i,j,h) equation, one column per entry of A
t = 0;
for i = 1:N
    for j = 1:N
        for h = 1:N
            t = t+1;
            M(t,(i-1)*N+h) = M(t,(i-1)*N+h)+1;
            M(t,(h-1)*N+j) = M(t,(h-1)*N+j)+1;
            M(t,(i-1)*N+j) = M(t,(i-1)*N+j)-1;
        end
    end
end
The right hand side is a vector of N^3 0.5's. The rank of the matrix appears to be N^2-N+1 (I think this is because the main diagonal of the result must be filled with 0.5, so there are really only N^2-N free unknowns). The rank of the augmented system is the same, so we do have an infinite number of solutions, spanning an (N-1)-dimensional space.
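As a quick sanity check of those rank claims, here is a small sketch reusing N and M from the code above:
b = 0.5*ones(N^3,1);                  % right hand side
rank(M)                               % expected: N^2 - N + 1
rank([M b])                           % expected: same as rank(M)
max(abs(M*(0.5*ones(N^2,1)) - b))     % the all-0.5 matrix solves the system (0 up to round-off)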
So we can find the solutions as the sum of the vector of 0.5's plus any element of the null space of the matrix.
a = 0.5*ones(N^2,1) % solution to non-homogenous equation
nullspace = null(M) % columns form a basis for the null space
So we can now generate as many solutions as we like by adding multiples of the basis vectors of the null space to a:
s = a+sum(rand(1,N-1).*nullspace,2) % always a solution
The final problem is to sample uniformly from this null space while requiring that all the entries stay within [0,1].
I think it's hard even to say what uniform sampling over this space is, but at least I can generate random elements by the following:
% first check the size of the largest element in each basis vector;
% we can't add/subtract more than 0.5 to the particular solution (0.5 everywhere)
mn = zeros(1,size(nullspace,2));
for p = 1:size(nullspace,2)
    mn(p) = 0.5/max(abs(nullspace(:,p)));
end
c = 0;
mat = {};
while c<5 % generate 5 matrices
    % a candidate solution is the particular part 0.5*ones(N) plus
    % some element of the null space, sum((2*rand(1,size(nullspace,2))-1).*mn.*nullspace,2),
    % reshaped to be a square matrix
    newM = 0.5*ones(N)+reshape(sum((2*rand(1,size(nullspace,2))-1).*mn.*nullspace,2),N,N)
    % make sure the entries are all in [0,1]
    if all(newM(:)>=0) && all(newM(:)<=1)
        c = c+1;
        mat{c} = newM;
    end
end
% mat is a cell array of 5 matrices satisfying the condition
I don't have much idea about the distribution of the resulting matrices.
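If you want to double-check the output, here is a small sketch (reusing N and mat from above) that measures how far each generated matrix is from reciprocity and from the consistency condition:
% verify reciprocity and consistency of the generated matrices (round-off aside)
for c = 1:numel(mat)
    A = mat{c};
    recipErr = max(max(abs(A + A.' - 1)));   % a_ji = 1 - a_ij
    consErr = 0;
    for i = 1:N
        for j = 1:N
            for h = 1:N
                consErr = max(consErr, abs(A(i,h) + A(h,j) - A(i,j) - 0.5));
            end
        end
    end
    fprintf('matrix %d: reciprocity error %.2e, consistency error %.2e\n', c, recipErr, consErr);
end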
I am following a machine learning course on Coursera and I am doing the following exercise using Octave (MATLAB should be the same).
The exercise is related to the calculation of the cost function for a gradient descent algorithm.
In the course slides, this is the cost function that I have to implement using Octave:
J(theta) = (1/(2*m)) * sum over i=1..m of (h_theta(x^(i)) - y^(i))^2
This is the formula from the course slide, where the hypothesis (the second equation) is h_theta(x) = theta' * x.
So J is a function of some THETA variables represented by the THETA matrix (in the previous second equation).
This is the correct MATLAB/Octave implementation for the J(THETA) computation:
function J = computeCost(X, y, theta)
%COMPUTECOST Compute cost for linear regression
% J = COMPUTECOST(X, y, theta) computes the cost of using theta as the
% parameter for linear regression to fit the data points in X and y
% Initialize some useful values
m = length(y); % number of training examples
% You need to return the following variables correctly
J = 0;
% ====================== YOUR CODE HERE ======================
% Instructions: Compute the cost of a particular choice of theta
% You should set J to the cost.
J = (1/(2*m))*sum(((X*theta) - y).^2)
% =========================================================================
end
where:
X is a two-column matrix with m rows, where every element of the first column is set to 1:
X =
1.0000 6.1101
1.0000 5.5277
1.0000 8.5186
...... ......
...... ......
...... ......
y is a vector of m elements (the same m as X):
y =
17.59200
9.13020
13.66200
........
........
........
Finally, theta is a 2-by-1 vector initialized to zeros, like this:
theta = zeros(2, 1); % initialize fitting parameters
theta
theta =
0
0
Ok, coming back to my working solution:
J = (1/(2*m))*sum(((X*theta) - y).^2)
specifically to this matrix multiplication (between the matrix X and the vector theta): I know it is a valid matrix multiplication because the number of columns of X (2) equals the number of rows of theta (2).
The doubt that is driving me crazy (probably a trivial one) is related to the context of the course slide above:
As you can see, the second equation, used to calculate the current h_theta(x) value, uses the transposed theta vector, not the theta vector as done in the code.
Why ?!?!
I suspect that it depends only on how the theta vector was created. It was built in this way:
theta = zeros(2, 1); % initialize fitting parameters
which generates a 2-row, 1-column vector instead of a classic 1-row, 2-column vector. So maybe I don't have to transpose it. But I am absolutely not sure about this assertion.
Is my intuition correct or what am I missing?
Your intuition is correct. Effectively it does not matter whether you perform the multiplication as theta.' * X or as X.' * theta (thinking of X here as holding one observation per column), since this either generates a horizontal vector or a vertical vector of the hypothesis values for all observations; what you're expected to do next is subtract the y vector from the hypothesis vector at each observation and sum the results. So as long as y has the same orientation as your hypothesis and you subtract at each equivalent point, the scalar end-result of the summation will be the same.
Often enough, you'll see the X.' * theta version preferred over theta.' * X purely for convenience, to avoid transposing over and over again just to be consistent with the mathematical notation. But this is fine, since the underlying math doesn't really change, only the order of equivalent operations.
I agree it's confusing though, both because it makes it harder to follow the formula when the code effectively looks like it's doing something else, and also because it messes with the usual convention that a vertical vector represents 'coordinates' and a horizontal vector represents observations. In such cases, especially in languages like MATLAB/Octave where the orientation of a vector isn't explicitly defined in the variable's type, it is doubly important to document what you expect the inputs to represent, and preferably there should have been assert statements in the code confirming the input has been passed in the correct orientation. Clearly here they felt that wasn't necessary because this code is acting under controlled conditions in a predefined exercise environment, but it would have been good practice to do so from a software engineering point of view.
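As a quick illustration (a small sketch with made-up numbers), computing theta.' * x observation by observation gives exactly the same hypothesis values as the vectorized X * theta used in the code:
X = [1 6.1101; 1 5.5277; 1 8.5186];   % each row is one observation [1, x]
theta = [0.5; 0.1];                   % 2-by-1 column vector (made-up values)
h_vec = X*theta;                      % vectorized hypothesis, m-by-1
h_loop = zeros(size(X,1),1);
for i = 1:size(X,1)
    h_loop(i) = theta.' * X(i,:).';   % theta' * x for a single observation
end
disp(max(abs(h_vec - h_loop)))        % 0: both give the same values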
I make a large number of calls to the 'find' function of Matlab. For example, the following should give the essence:
x=rand(1,10^8);
indx=zeros(1,10^8);
for i=1:10^8
    indx(i) = find([0.2, 0.52, 0.76, 1] < x(i), 1, 'last');
end
Is there a way to vectorize this code to speed it up? Just including x as a vector creates an error. If vectorization is not possible, then any other suggestions for speed would be appreciated. The actual problem I wish to solve has a considerably longer vector in the place of [0.2, 0.52, 0.76,1], so any solution shouldn't depend on the specific vector I provided.
thanks.
For MATLAB versions R2015a and newer, the answer from crjones gives the best option using discretize:
edges = [0.2, 0.52, 0.76, 1];
indx = discretize(x, edges, 'IncludedEdge', 'right');
Any values in x outside the range of edges will have NaN for their indices.
For MATLAB versions R2014b and newer you can also use histcounts:
[~, ~, indx] = histcounts(x, edges);
The differences with discretize are that you can also get the count of values in each bin (the first output), and indices for values in x outside the range of edges will be 0.
For MATLAB versions prior to R2014b you can use histc (deprecated in newer versions):
[~, indx] = histc(x, edges);
Again, you can also get the count of values in each bin (the first output), and indices for values in x outside the range of edges will be 0.
Based on your example, you may want to consider using the discretize function for this:
x=rand(1,10^8);
edges = [0.2, 0.52, 0.76, 1];
indx = discretize(x, edges, 'IncludedEdge', 'right');
Note that cases outside of the range will result in NaN.
% small test case
% x = [0.5198, 0.0768, 0.6788, 0.9496]
% indx = discretize(x, edges, 'IncludedEdge', 'right')
% answer: 1 NaN 2 3
Of course, this will only be applicable if you're trying to find where x fits in a well-ordered set.
Compare your vector with x to get a logical matrix indicating which elements of vec are less than each element of x. Multiply that logical matrix by a column vector of indices into vec. Use max to find the maximum (i.e. last) index that satisfies the inequality. Where the inequality is never satisfied, you will get zero.
vec = [0.2, 0.52, 0.76, 1]; %Your vector
indx = bsxfun(@lt, vec(:), x); %Making 'vec' a column matrix and comparing with 'x'
indx = max(bsxfun(@times, indx, (1:numel(vec)).')); %The required result
With R2016b and later versions, you can use implicit expansion instead of bsxfun:
indx = vec(:) < x ;
indx = max(indx .* (1:numel(vec)).');
I have this RGB matrix of a set of different pixels. (N pixels => n rows, RGB => 3 columns). I have to calculate the minimum RGB distance between any two pixels from this matrix. I tried the loop approach, but because the set is too big (let's say N=24000), it looks like it will take forever for the program to finish. Is there another approach? I read about pdist, but the RGB Euclidean distance cannot be used with it.
k=1;
for i = 1:N
    for j = 1:N
        if (i~=j)
            dist_vect(k)=RGB_dist(U(i,1),U(j,1),U(i,2),U(j,2),U(i,3),U(j,3))
            k=k+1;
        end
    end
end
The (weighted) Euclidean distance between two pixels is distance = sqrt(3*(R1-R2)^2 + 4*(G1-G2)^2 + 2*(B1-B2)^2).
So, pdist syntax would be like this: D=pdist2(U,U,@calc_distance());, where U is obtained like this:
rgbImage = imread('peppers.png');
rgb_columns = reshape(rgbImage, [], 3)
[U, m, n] = unique(rgb_columns, 'rows','stable');
But if pdist2 does the loops itself, how should I enter the parameters for my function?
function[distance]=RGB_dist(R1, R2, G1, G2, B1, B2),
where R1,G1,B1,R2,G2,B2 are the components of each pixel.
I made a new function like this:
function[distance]=RGB_dist(x,y)
distance=sqrt(sum(((x-y)*[3;4;2]).^2,2));
end
and I called it D=pdist(U,U,@RGB_dist); and I got 'Error using pdist (line 132) The 'DISTANCE' argument must be a string or a function.'
Testing RGB_dist new function alone, with these input set
x=[62,29,64;
63,31,62;
65,29,60;
63,29,62;
63,31,62;];
d=RGB_dist(x,x);
disp(d);
outputs only values of 0.
Contrary to what your post says, you can use the Euclidean distance as part of pdist. You have to specify it as a flag when you call pdist.
The loop you have described above can simply be computed by:
dist_vect = pdist(U, 'euclidean');
This should compute the L2 norm between each unique pair of rows. Seeing that your matrix has a RGB pixel per row, and each column represents a single channel, pdist should totally be fine for your application.
If you want to display this as a distance matrix, where row i and column j corresponds to the distance between a pixel in row i and row j of your matrix U, you can use squareform.
dist_matrix = squareform(dist_vect);
As an additional bonus, if you want to find which two pixels in your matrix share the smallest distance, you can simply do a find search on the lower triangular half of dist_matrix. The diagonal of dist_matrix is all zeros, since the distance of any vector to itself is 0. In addition, this matrix is symmetric, so the upper triangular half equals the lower triangular half. Therefore, we can set the diagonal and the upper triangular half to Inf, then search for the minimum among the remaining elements. In other words:
indices_to_set = true(size(dist_matrix));
indices_to_set = triu(indices_to_set);
dist_matrix(indices_to_set) = Inf;
[v1,v2] = find(dist_matrix == min(dist_matrix(:)), 1);
v1 and v2 will thus contain the rows of U where those RGB pixels contained the smallest Euclidean distance. Note that we specify the second parameter as 1 as we want to find just one match, as what your post has stated as a requirement. If you wish to find all vectors who match the same distance, simply remove the second parameter 1.
Edit - June 25th, 2014
Seeing as how you want to weight each component of the Euclidean distance, you can define your own custom function to calculate distances between two RGB pixels. As such, instead of specifying euclidean, you can specify your own function which can calculate the distances between two vectors within your matrix by calling pdist like so:
pdist(x, @(XI,XJ) ...);
@(XI,XJ)... is an anonymous function that takes in a vector XI and a matrix XJ. For pdist you need to make sure that the custom distance function takes in XI as a 1 x N vector which is a single row of pixels. XJ is then a M x N matrix that contains multiple rows of pixels. As such, this function needs to return a M x 1 vector of distances. Therefore, we can achieve your weighted Euclidean distance as so:
weights = [3;4;2];
weuc = @(XI, XJ, W) sqrt(bsxfun(@minus, XI, XJ).^2 * W);
dist_matrix = pdist(double(U), @(XI, XJ) weuc(XI, XJ, weights));
bsxfun handles that nicely: it replicates XI over as many rows as needed and subtracts it from every row of XJ. We then square each of the differences and multiply by the weights, which also sums across the channels, before taking the square root. Note that I didn't use sum(X,2); I used vector algebra to compute the sum. If you recall, we are simply computing the dot product between the squared distance of each component and a weight. In other words, x^{T}y, where x is the squared distance of each component and y are the weights for each component. You could do sum(X,2) if you like, but I find this to be more elegant and easy to read... plus it's less code!
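To make that concrete, here is a tiny sketch (with made-up difference values) showing that the matrix-vector product gives the same result as an explicit weighted sum across channels:
D = [1 2 2; 0 2 0];                           % per-channel differences for two pixel pairs (made up)
W = [3; 4; 2];                                % channel weights
viaDot = D.^2 * W                             % dot product per row: [27; 16]
viaSum = sum(bsxfun(@times, D.^2, W.'), 2)    % explicit weighted sum: same result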
Now that I know how you're obtaining U, the type is uint8 so you need to cast the image to double before we do anything. This should achieve your weighted Euclidean distance as we talked about.
As a check, let's put in the matrix from your example, run it through pdist, and then squareform:
x=[62,29,64;
63,31,62;
65,29,60;
63,29,62;
63,31,62];
weights = [3;4;2];
weuc = @(XI, XJ, W) sqrt(bsxfun(@minus,XI,XJ).^2 * W);
%// Make sure you CAST TO DOUBLE, as your image is uint8
%// We don't have to do it here as x is already a double, but
%// I would like to remind you to do so!
dist_vector = pdist(double(x), @(XI, XJ) weuc(XI, XJ, weights));
dist_matrix = squareform(dist_vector)
dist_matrix =
0 5.1962 7.6811 3.3166 5.1962
5.1962 0 6.0000 4.0000 0
7.6811 6.0000 0 4.4721 6.0000
3.3166 4.0000 4.4721 0 4.0000
5.1962 0 6.0000 4.0000 0
As you can see, the distance between pixels 1 and 2 is 5.1962. To check, sqrt(3*(63-62)^2 + 4*(31-29)^2 + 2*(64-62)^2) = sqrt(3 + 16 + 8) = sqrt(27) = 5.1962. You can do similar checks among elements within this matrix. We can tell that the distance between pixels 5 and 2 is 0 as you have made these rows of pixels the same. Also, the distance between each of themselves is also 0 (along the diagonal). Cool!
I'd like to compute a low-rank approximation to a matrix which is optimal under the Frobenius norm. The trivial way to do this is to compute the SVD of the matrix, set the smallest singular values to zero, and compute the low-rank matrix by multiplying the factors. Is there a simple and more efficient way to do this in MATLAB?
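For reference, the trivial approach I mean looks something like this (just a sketch):
A = rand(500, 300);                        % example matrix
r = 10;                                    % target rank
[U,S,V] = svd(A);
Ar = U(:,1:r) * S(1:r,1:r) * V(:,1:r)';    % best rank-r approximation in the Frobenius norm
norm(A - Ar, 'fro')                        % approximation error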
If your matrix is sparse, use svds.
Assuming it is not sparse but it's large, you can use random projections for fast low-rank approximation.
From a tutorial:
An optimal low rank approximation can be easily computed using the SVD of A in O(mn^2). Using random projections we show how to achieve an "almost optimal" low rank approximation in O(mn log(n)).
MATLAB code from a blog:
clear
% preparing the problem
% trying to find a low-rank approximation to A, an m x n matrix
% where m >= n
m = 1000;
n = 900;
%// first let's produce example A
A = rand(m,n);
%
% beginning of the algorithm designed to find a low-rank approximation of A
% let us define that rank to be equal to k
k = 50;
% R is an m x l matrix drawn from N(0,1)
% where l is such that l > c*log(n)/epsilon^2
%
l = 100;
% timing the random algorithm
trand = cputime;
R = randn(m,l);
B = 1/sqrt(l) * R' * A;
[a,s,b] = svd(B);
Ak = A*b(:,1:k)*b(:,1:k)';
trandend = cputime - trand;
% now timing the normal SVD algorithm
tsvd = cputime;
% doing it the normal SVD way
[U,S,V] = svd(A,0);
Aksvd = U(1:m,1:k)*S(1:k,1:k)*V(1:n,1:k)';
tsvdend = cputime - tsvd;
Also, remember the 'econ' option of svd.
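For example (a minimal illustration):
Atall = rand(2000, 150);         % tall example matrix
[U,S,V] = svd(Atall, 'econ');    % U is 2000x150, S and V are 150x150 -- no full 2000x2000 U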
You can rapidly compute a low-rank approximation based on SVD, using the svds function.
[U,S,V] = svds(A,r); %# only first r singular values are computed
svds uses eigs to compute a subset of the singular values - it will be especially fast for large, sparse matrices. See the documentation; you can set tolerance and maximum number of iterations or choose to calculate small singular values instead of large.
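For instance, something along these lines (a sketch reusing A and r from above; the exact option syntax depends on your release, so check the svds documentation):
[U,S,V] = svds(A, r, 'largest', 'Tolerance', 1e-10, 'MaxIterations', 300);  % R2016a+ name-value syntax
[U2,S2,V2] = svds(A, r, 'smallest');   % compute the smallest singular values instead of the largest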
I thought svds and eigs could be faster than svd and eig for dense matrices, but then I did some benchmarking. They are only faster for large matrices when sufficiently few values are requested:
n k svds svd eigs eig comment
10 1 4.6941e-03 8.8188e-05 2.8311e-03 7.1699e-05 random matrices
100 1 8.9591e-03 7.5931e-03 4.7711e-03 1.5964e-02 (uniform dist)
1000 1 3.6464e-01 1.8024e+00 3.9019e-02 3.4057e+00
2 1.7184e+00 1.8302e+00 2.3294e+00 3.4592e+00
3 1.4665e+00 1.8429e+00 2.3943e+00 3.5064e+00
4 1.5920e+00 1.8208e+00 1.0100e+00 3.4189e+00
4000 1 7.5255e+00 8.5846e+01 5.1709e-01 1.2287e+02
2 3.8368e+01 8.6006e+01 1.0966e+02 1.2243e+02
3 4.1639e+01 8.4399e+01 6.0963e+01 1.2297e+02
4 4.2523e+01 8.4211e+01 8.3964e+01 1.2251e+02
10 1 4.4501e-03 1.2028e-04 2.8001e-03 8.0108e-05 random pos. def.
100 1 3.0927e-02 7.1261e-03 1.7364e-02 1.2342e-02 (uniform dist)
1000 1 3.3647e+00 1.8096e+00 4.5111e-01 3.2644e+00
2 4.2939e+00 1.8379e+00 2.6098e+00 3.4405e+00
3 4.3249e+00 1.8245e+00 6.9845e-01 3.7606e+00
4 3.1962e+00 1.9782e+00 7.8082e-01 3.3626e+00
4000 1 1.4272e+02 8.5545e+01 1.1795e+01 1.4214e+02
2 1.7096e+02 8.4905e+01 1.0411e+02 1.4322e+02
3 2.7061e+02 8.5045e+01 4.6654e+01 1.4283e+02
4 1.7161e+02 8.5358e+01 3.0066e+01 1.4262e+02
With size-n square matrices, k singular/eigenvalues, and runtimes in seconds. I used Steve Eddins' timeit File Exchange function for benchmarking, which tries to account for overhead and runtime variations.
svds and eigs are faster if you want a few values from a very large matrix. It also depends on the properties of the matrix in question (looking at the source with edit svds should give you some idea why).
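For reference, one cell of the table can be reproduced with something like this (a sketch, not the exact script I used):
n = 1000; k = 1;
A = rand(n);                          % random test matrix (uniform dist)
t_svds = timeit(@() svds(A, k));
t_svd  = timeit(@() svd(A));
t_eigs = timeit(@() eigs(A, k));
t_eig  = timeit(@() eig(A));
fprintf('svds %.4e  svd %.4e  eigs %.4e  eig %.4e\n', t_svds, t_svd, t_eigs, t_eig);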