I am using Singular Value Decomposition (SVD) as part of Singular Spectrum Analysis (SSA) of a time series.
% original time series
x1= rand(1,10000);
N = length(x1);
% windows for trajectory matrix
L = 600;
K=N-L+1;
% trajectory matrix/matrix of lagged vectors
X = buffer(x1, L, L-1, 'nodelay');
% Covariance matrix
A = X * X' / K;
% SVD
[U, S_temp, ~] = svd(A);
% The eigenvalues of A are the squared singular values of X (up to the 1/K factor)
S = sqrt(S_temp);
d = diag(S);
% Principal components
V = X' * U;
for i = 1 : L
    V(:, i) = V(:, i) / d(i);
end
I want to know whether there is a way to make the singular components (i.e. the columns of V) always positive.
In my case X is always > 0 (and so is the covariance matrix A).
You may be looking for an algorithm such as non-negative matrix factorization.
This is available in the Statistics Toolbox as the command nnmf, and there is a freely available third-party toolbox as well.
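For instance, a minimal sketch of what that could look like on the trajectory matrix X from the question (this assumes the Statistics Toolbox is available; the number of components k is an arbitrary choice):
% Minimal sketch, assuming the Statistics Toolbox.
% nnmf factors the nonnegative trajectory matrix X (L-by-K) into nonnegative
% factors W (L-by-k) and H (k-by-K); the rows of H then play a role analogous
% to the principal components, but stay nonnegative by construction.
k = 10;                 % number of components, arbitrary choice
[W, H] = nnmf(X, k);    % X is approximated by W*H with W >= 0 and H >= 0
components = H.';       % K-by-k matrix of nonnegative "components"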
I need to calculate the cumulative variance of a vector. I have written a script, but it takes too much time to calculate the cumulative variance of my vectors of size 1*100000. Do you know of a faster way to compute this cumulative variance?
This is the code I am using:
%% Creation of the random vectors and calculation of the variances
d = 100000; % dimension of the vectors
nv = 6; % number of vectors
for j = 1:nv
    VItimeseries(:,j) = rand(d,1); % final matrix with vectors
end
%% Script to calculate the cumulative variance in the columns of my matrix
VectorVarianza = 0;
VectorFinalVar = 0;
VectorFinalTotalVAriances = zeros(d,nv);
for k = 1:nv % number of columns
    for j = 1:numel(VItimeseries(:,k)) % number of rows
        Vector = VItimeseries(:,k);
        VectorVarianza(1:j) = Vector(1:j); % vector used to calculate the variance independently
        VectorFinalVar(j,k) = var(VectorVarianza); % calculation of the variances
    end
    VectorFinalTotalVAriances(:,k) = VectorFinalVar(:,k); % construction of the final matrix with the cumulative variances
end
Looping over the n elements of x and, within the loop, computing the variance of all elements up to i using var(x(1:i)) amounts to an O(n²) algorithm. This is inherently expensive.
Sample variance (what var computes) is defined as sum((x-mean(x)).^2) / (n-1), with n = length(x). This can be rewritten as (sum(x.^2) - sum(x).^2 / n) / (n-1). This formula allows us to accumulate sum(x) and sum(x.^2) within a single loop, then compute the variance later. It also allows us to compute the cumulative variance in O(n).
For a vector x, we'd have the following loop:
x = randn(100,1); % some data
v = zeros(size(x)); % cumulative variance
s = x(1); % running sum of x
s2 = x(1).^2; % running sum of square of x
for ii = 2:numel(x) % loop starts at 2, for ii=1 we cannot compute variance
    s = s + x(ii);
    s2 = s2 + x(ii).^2;
    v(ii) = (s2 - s.^2 / ii) / (ii-1);
end
We can avoid the explicit loop by using cumsum:
s = cumsum(x);
s2 = cumsum(x.^2);
n = (1:numel(x)).';
v = (s2 - s.^2 ./ n) ./ (n-1); % v(1) will be NaN, rather than 0 as in the first version
v(1) = 0; % so we set it to 0 explicitly here
The code in the OP computes the cumulative variance for each column of a matrix. The code above can be trivially adapted to do the same:
s = cumsum(VItimeseries,1); % cumulative sum explicitly along columns
s2 = cumsum(VItimeseries.^2,1);
n = (1:size(VItimeseries,1)).'; % use number of rows, rather than `numel`.
v = (s2 - s.^2 ./ n) ./ (n-1);
v(1,:) = 0; % fill first row with zeros, not just first element
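If you want to convince yourself that the vectorized version matches var, a quick check on a small vector (size chosen arbitrarily) could look like this:
% Sanity check: compare the O(n) cumulative variance against var(x(1:i))
% computed directly for every prefix of a small random vector.
x  = rand(50, 1);
s  = cumsum(x);
s2 = cumsum(x.^2);
n  = (1:numel(x)).';
v  = (s2 - s.^2 ./ n) ./ (n - 1);
v(1) = 0;
vref = zeros(size(x));
for ii = 2:numel(x)
    vref(ii) = var(x(1:ii));
end
max(abs(v - vref)) % should be on the order of floating-point round-off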
This is my code for finding the centered coefficients for Lagrange polynomial interpolation:
% INPUT
%   f          scalar-valued function
%   interval   interpolation interval [a, b]
%   n          interpolation order
%
% OUTPUT
%   coeff      centered coefficients of the Lagrange interpolant
function coeff = lagrangeInterp(f, interval, n)
a = interval(1);
b = interval(2);
x = linspace(a,b,n+1);
y = f(x);
coeff(1,:) = polyfit(x,y,n);
end
It is called from the following script:
%Plot lagrangeInterp and sin(x) together
hold on
x = 0:0.1*pi:2*pi;
for n = 1:1:4
    coeff = lagrangeInterp(@(x)sin(x),[0,2*pi],n);
    plot(x,polyval(coeff,x),'-');
end
y = sin(x);
plot(x,y);
legend('1st order','2nd order','3rd order','4th order','sin(x)');
To check for stability I would like to perturb the function (e.g. g(x) = f(x) + epsilon). How would I go about this?
Here is a little trick.
You know that randn([m,n]) in MATLAB generates an m-by-n random matrix. The idea is to generate a random vector and use interp1 to turn it into a function of x, like this:
x = linspace(a,b,n+1); % Your range of input
g = @(ep,xx) f(xx) + interp1(x, ep*randn([length(x),1]), xx);
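For illustration, a sketch of how this could be used to compare perturbed and unperturbed coefficients (the values of ep and n are arbitrary; the noise vector is drawn once so the same perturbation is used for every evaluation):
% Illustrative use of the trick above; ep and n are arbitrary choices.
f  = @(x) sin(x);
n  = 4;
a  = 0;  b = 2*pi;
x  = linspace(a, b, n+1);                    % interpolation nodes
ep = 1e-3;                                   % perturbation amplitude
noise = ep*randn(1, length(x));              % fixed random perturbation at the nodes
g  = @(xx) f(xx) + interp1(x, noise, xx);    % perturbed function
coeffOrig      = lagrangeInterp(f, [a, b], n);
coeffPerturbed = lagrangeInterp(g, [a, b], n);
norm(coeffOrig - coeffPerturbed)             % sensitivity of the coefficients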
I want to solve the system of equations shown in the image below,
[image: the matrix system]
where the components of the matrix A are complex numbers, the angle theta runs from 0 to 2*pi in m divisions, and n = 9. The known value is z = x + iy. Suppose the x (first column) and y (second column) of the vector z are
z =
0 1.0148
0.1736 0.9848
0.3420 0.9397
0.5047 0.8742
0.6748 0.8042
0.8419 0.7065
0.9919 0.5727
1.1049 0.4022
1.1757 0.2073
1.1999 0
1.1757 -0.2073
1.1049 -0.4022
0.9919 -0.5727
0.8419 -0.7065
0.6748 -0.8042
0.5047 -0.8742
0.3420 -0.9397
0.1736 -0.9848
0 -1.0148
How do I solve this iteratively? Note that the first component of the desired constants must equal 1. I am working in MATLAB.
You can apply simple multilinear regression for complex-valued data.
Step 1. Get the matrix ready for linear regression
Your linear system

    A*alpha = Z,    with alpha(1) = 1 known,

written without matrices becomes

    A(i,1)*alpha(1) + A(i,2)*alpha(2) + ... + A(i,n)*alpha(n) = Z(i),    i = 1, ..., m,

which, using alpha(1) = 1, rearranges to

    A(i,2)*alpha(2) + ... + A(i,n)*alpha(n) = Z(i) - A(i,1).

If you rewrite it with matrices you get

    R*beta = Y.

Step 2. Apply multiple linear regression

Let the system above be

    Y = R*beta,

where

    Y = Z - A(:,1),    R = A(:,2:n),    beta = [alpha(2); ...; alpha(n)].

Now you can apply linear regression, which returns the best fit for beta as

    beta = (R'*R)^(-1) * R'*Y,

where R' is the conjugate transpose of R.
In MATLAB
Y = Z - A(:,1); % Calculate Y subtracting the first col of A from Z
R = A(:,:); R(:,1) = []; % Calculate R as an exact copy of A, just without first column
Rs = ctranspose(R); % Calculate R-star (conjugate transpose of R)
alpha = (Rs*R)^(-1)*Rs*Y; % Finally apply multiple linear regression
alpha = cat(1, 1, alpha); % Add alpha1 back, whose value is 1
or, if you prefer built-ins, have a look at the regress function:
Y = Z - A(:,1); % Calculate Y subtracting the first col of A from Z
R = A(:,:); R(:,1) = []; % Calculate R as an exact copy of A, just without first column
alpha = regress(Y, R); % Finally apply multiple linear regression
alpha = cat(1, 1, alpha); % Add alpha1 back, whose value is 1
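As a side note, MATLAB's backslash operator also returns the least-squares solution for an overdetermined system and works with complex data, so the explicit normal-equation formula can be replaced by a single solve (same variables as in the snippets above):
% Equivalent least-squares solve via mldivide (backslash), which uses a QR
% factorization for rectangular systems and handles complex matrices.
Y = Z - A(:,1);            % right-hand side, as above
R = A(:, 2:end);           % A without its first column
alpha = R \ Y;             % least-squares estimate of alpha(2:n)
alpha = cat(1, 1, alpha);  % put alpha(1) = 1 back in front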
The original data is Y; the size of Y is L*n (n is the number of features, L is the number of observations). B is the covariance matrix of the original data Y. Suppose A contains the eigenvectors of the covariance matrix B; I write A = (e1, e2, ..., en), where each ei is an eigenvector. The matrix Aq consists of the first q eigenvectors, and let ai be the row vectors of Aq: Aq = (e1, e2, ..., eq) = (a1, a2, ..., an)'. I want to apply the k-means algorithm to Aq to cluster the row vectors ai into k clusters or more (note: I do not want to apply the k-means algorithm to the eigenvectors ei). For each cluster, only the vector closest to the cluster center is retained, and the feature corresponding to this vector is finally selected as one of the informative features.
My question is:
1) What is the difference between applying the k-means algorithm to Aq to cluster the row vectors ai into k clusters and applying it to Aq to cluster the eigenvectors ei into k clusters?
2) The closest_vectors I get come from this command: closest_vectors = Aq(min_idxs, :); the size of closest_vectors is k-by-q (double). How do I get the final informative features? The final informative features have to be obtained from the original data Y.
Thanks!
I found two functions for PCA and PFA:
function [e, m, lambda, sqsigma] = cvPca(X, M)
[D, N] = size(X);
if ~exist('M', 'var') || isempty(M) || M == 0
    M = D;
end
M = min(M, min(D, N-1));
%% mean subtraction
m = mean(X, 2); % calculate the mean of every row
X = X - repmat(m, 1, N);
%% singular value decomposition. X = U*S*V.' or X.' = V*S*U.'
[U, S, V] = svd(X, 'econ');
e = U(:, 1:M);
if nargout > 2
    s = diag(S);
    s = s(1:min(D, N-1));
    lambda = s.^2 / N; % biased (1/N) estimator of variance
end
% sqsigma. Used to model the distribution of errors by a univariate Gaussian
if nargout > 3
    d = cvPcaDist(X, e, m); % use of a validation set would be better
    N = size(d, 2);
    sqsigma = sum(d) / N; % or (N-1) for the unbiased estimate
end
end
%/////////////////////////////////////////////////////////////////////////////
function [IDX, Me] = cvPfa(X, p, q)
[D, N] = size(X);
if ~exist('p', 'var') || isempty(p) || p == 0
    p = D;
end
p = min(p, min(D, N-1));
if ~exist('q', 'var') || isempty(q)
    q = p - 1;
end
%% PCA step
[U, Me, Lambda] = cvPca(X, q);
%% cluster the row vectors of U (D rows of length q), not the columns
[Cl, Mu] = kmeans(U, p, 'emptyaction', 'singleton', 'distance', 'sqEuclidean');
%% find the axes that are nearest to the cluster mean vectors
IDX = false(D, 1);
for i = 1:p
    Cli = find(Cl == i);
    d = cvEucdist(Mu(i,:).', U(Cli,:).');
    [mini, argmin] = min(d);
    IDX(Cli(argmin)) = 1;
end
Summarizing Olologin's comments: it doesn't make sense to cluster the eigenvectors of the covariance matrix, or the columns of the U matrix of the SVD. The eigenvectors in this case are all orthogonal, so if you tried to cluster them you would get only one member per cluster, and each cluster's centroid would be defined by the eigenvector itself.
Now, what you're really after is selecting the features in your data matrix that describe your data best in terms of discriminatory analysis.
The functions you have provided compute the SVD, pluck out the k principal components of your data, and also determine which k features to select as the most prominent ones. By default, the number of features to select is equal to k, but you can override this if you want; let's just stick with the default.
The cvPfa function performs this feature selection for you, but be aware that the data matrix in the function is organized so that each row is a feature and each column is a sample. The output is a logical vector that tells you which features are the strongest ones to select from your data.
Simply put, you just do this:
k = 10; %// Example
IDX = cvPfa(Y.', k);
Ynew = Y(:,IDX);
This code will choose the 10 most prominent features in your data matrix, i.e. the 10 features that are the most representative or discriminative of your data. You can then use the output for whatever application you're targeting.
1) I don't think that clustering the eigenvectors (the columns of the PCA result) of the covariance matrix makes any sense. All eigenvectors are pairwise orthogonal and equally far from one another in the Euclidean sense: pick any two eigenvectors and the distance between them will be sqrt(2). Clustering the rows of the PCA result, however, can provide something useful.
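A tiny illustration of that point (random data, arbitrary sizes): the eigenvectors of a covariance matrix are orthonormal, so every pair of them is exactly sqrt(2) apart.
% The eigenvectors returned by eig for a symmetric matrix are orthonormal,
% so the Euclidean distance between any two of them is sqrt(2) ~ 1.4142.
X = randn(50, 5);          % 50 observations, 5 features (arbitrary)
[E, ~] = eig(cov(X));      % columns of E are orthonormal eigenvectors
norm(E(:,1) - E(:,2))      % ~1.4142
norm(E(:,3) - E(:,5))      % ~1.4142, the same for any pair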
In the MATLAB SVM tutorial, it says:
You can set your own kernel function, for example, kernel, by setting 'KernelFunction','kernel'. kernel must have the following form:
function G = kernel(U,V)
where:
U is an m-by-p matrix.
V is an n-by-p matrix.
G is an m-by-n Gram matrix of the rows of U and V.
When I followed the custom SVM kernel example, I set a breakpoint in the mysigmoid.m function. However, I found that U and V were in fact 1-by-p vectors and G was a scalar.
Why doesn't MATLAB process the kernel with matrices?
My custom kernel function is
function G = mysigmoid(U,V)
% Sigmoid kernel function with slope gamma and intercept c
gamma = 0.5;
c = -1;
G = tanh(gamma*U*V' + c);
end
My MATLAB script is
%% Train SVM Classifiers Using a Custom Kernel
rng(1); % For reproducibility
n = 100; % Number of points per quadrant
r1 = sqrt(rand(2*n,1)); % Random radius
t1 = [pi/2*rand(n,1); (pi/2*rand(n,1)+pi)]; % Random angles for Q1 and Q3
X1 = [r1.*cos(t1), r1.*sin(t1)]; % Polar-to-Cartesian conversion
r2 = sqrt(rand(2*n,1));
t2 = [pi/2*rand(n,1)+pi/2; (pi/2*rand(n,1)-pi/2)]; % Random angles for Q2 and Q4
X2 = [r2.*cos(t2), r2.*sin(t2)];
X = [X1; X2]; % Predictors
Y = ones(4*n,1);
Y(2*n + 1:end) = -1; % Labels
% Plot the data
figure(1);
gscatter(X(:,1),X(:,2),Y);
title('Scatter Diagram of Simulated Data');
SVMModel1 = fitcsvm(X,Y,'KernelFunction','mysigmoid','Standardize',true);
% Compute the scores over a grid
d = 0.02; % Step size of the grid
[x1Grid,x2Grid] = meshgrid(min(X(:,1)):d:max(X(:,1)),...
min(X(:,2)):d:max(X(:,2)));
xGrid = [x1Grid(:),x2Grid(:)]; % The grid
[~,scores1] = predict(SVMModel1,xGrid); % The scores
figure(2);
h(1:2) = gscatter(X(:,1),X(:,2),Y);
hold on;
h(3) = plot(X(SVMModel1.IsSupportVector,1),X(SVMModel1.IsSupportVector,2),...
'ko','MarkerSize',10);
% Support vectors
contour(x1Grid,x2Grid,reshape(scores1(:,2),size(x1Grid)),[0,0],'k');
% Decision boundary
title('Scatter Diagram with the Decision Boundary');
legend({'-1','1','Support Vectors'},'Location','Best');
hold off;
CVSVMModel1 = crossval(SVMModel1);
misclass1 = kfoldLoss(CVSVMModel1);
disp(misclass1);
Kernels implicitly add dimensions to a feature: if a sample has one feature, x = {a}, the kernel expands it into something like x = {a_1, ..., a_q}. Since you do this for all of your data at once, the first input is an m-by-p matrix (m is the number of examples in your training set and p is the number of features). The second matrix the function asks for is n-by-p, where n is the number of examples in the training/test set.
That said, your output G should be m-by-n. Since it is instead a scalar, U and V must each have been passed as a single 1-by-p row, i.e. m = n = 1 for that call. To get an m-by-n Gram matrix, the kernel has to be evaluated on the full m-by-p and n-by-p matrices at once, which your mysigmoid already does via U*V'.
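For instance, calling the same mysigmoid on full matrices (sizes chosen arbitrarily here) produces the m-by-n Gram matrix the documentation describes, because U*V' handles the matrix case directly:
% Quick dimension check with arbitrary sizes: U is m-by-p, V is n-by-p,
% so G = tanh(gamma*U*V' + c) comes out m-by-n, as the interface requires.
m = 5; n = 3; p = 2;
U = randn(m, p);
V = randn(n, p);
G = mysigmoid(U, V);
size(G)                    % returns [5 3]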