Table of correlation values - matlab

If you run the following code you will end up with a cell array composed of a correlation value in CovMatrix(:,3) and the name of the data used in calculating the correlation in CovMatrix(:,1) and CovMatrix(:,2):
clear all
FieldName = {'Name1','Name2','Name3','Name4','Name5'};
Data={rand(12,1),rand(12,1),rand(12,1),rand(12,1),rand(12,1)};
DataCell = [FieldName;Data];%place in a structure - this is the same
%structure that the data for the lakes will be placed in.
DataStructure = struct(DataCell{:});
FieldName = fieldnames(DataStructure);
Combinations = nchoosek (1:numel(FieldName),2);
d1 = cell2mat(struct2cell(DataStructure)');%this will be the surface temperatures
%use the combinations found in 'Combinations' to define which elements to
%use in calculating the coherence.
R = cell(1,size(Combinations,1));%pre-allocate the cell array
Names1 = cell(1,size(Combinations,1));
for j = 1:size(Combinations,1);
[R{j},P{j}] = corrcoef([d1(:,[Combinations(j,1)]),d1(:,[Combinations(j,2)])]);
Names1{j} = ([FieldName([Combinations(j,1)],1),FieldName([Combinations(j,2)],1)]);
end
%only obtain a single value for the correlation and p-value
for i = 1:size(Combinations,1);
R{1,i} = R{1,i}(1,2);
P{1,i} = P{1,i}(1,2);
end
R = R';P = P';
%COVARIANCE MATRIX
CovMatrix=cell(size(Combinations,1),3);%pre-allocate memory
for i=1:size(Combinations,1);
CovMatrix{i,3}=R{i,1};
CovMatrix{i,1}=Names1{1,i}{1,1};
CovMatrix{i,2}=Names1{1,i}{1,2};
end
From this I need to produce a table of the values, preferably in the form of a correlation matrix, similar to jeremytheadventurer.blogspot.com. Would this be possible in MATLAB?

You can compute the correlation matrix of your entire data set in one shot using the corrcoef command:
% d1 can be simply computed as
d1_new = cell2mat(Data);
% Make sure that d1_new is the same matrix as d1
max(abs(d1(:)-d1_new(:)))
% Compute correlation matrix of columns of data in d1_new in one shot
CovMat = corrcoef(d1_new)
% Make sure that entries in CovMat are equivalent to the third column of
% CovMatrix, e.g.
CovMat(1,2)-CovMatrix{1,3}
CovMat(1,4)-CovMatrix{3,3}
CovMat(3,4)-CovMatrix{8,3}
CovMat(4,5)-CovMatrix{10,3}
Because the correlation matrix CovMat is symmetric, this contains the required result if you ignore the upper triangular part.
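To present this as a labelled table, one option is to blank out the redundant upper triangle and wrap the matrix in a table. This is a minimal sketch, assuming a MATLAB release that provides array2table (R2013b or later). The names CorrLower and CorrTable are illustrative, and random data stands in for the real series.

```matlab
% Illustrative data: five series of 12 observations each (assumption)
FieldName = {'Name1','Name2','Name3','Name4','Name5'};
d1_new = rand(12, numel(FieldName));

% Full correlation matrix of the columns
CovMat = corrcoef(d1_new);

% Keep the lower triangle (including the diagonal); mark the rest as NaN
CorrLower = CovMat;
CorrLower(triu(true(size(CovMat)), 1)) = NaN;

% Label rows and columns with the series names
CorrTable = array2table(CorrLower, 'VariableNames', FieldName, 'RowNames', FieldName);
disp(CorrTable)
```

Individual entries can then be read by name, for example CorrTable{'Name3','Name1'}.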

Related

Create a submatrix using random columns and loop

I have a 102-by-102 matrix. I want to select square sub-matrices of orders from 2 up to 8 using random column numbers. Here is what I have done so far.
matt is the original matrix of size 102-by-102.
ittr = 30
cols = 3;
for i = 1:ittr
rr = randi([2,102], cols,1);
mattsub = matt([rr(1) rr(2) rr(3)], [rr(1) rr(2) rr(3)]);
end
I have to extract matrices of different orders from 2 to 8. Using the above code I would have to change the mattsub line every time I change cols. I believe it is possible to do with another loop inside but cannot figure out how. How can I do this?
There is no need to extract elements of a vector and concatenate them, just use the vector to index a matrix.
Instead of:
mattsub = matt([rr(1) rr(2) rr(3)], [rr(1) rr(2) rr(3)]);
Use this:
mattsub = matt(rr, rr);
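With vector indexing in place, covering the orders 2 to 8 only needs an outer loop over the order. The sketch below assumes a random 102-by-102 matrix in place of the real one and stores every submatrix in a cell array called subs (a hypothetical name). It draws indices with randperm so that no row or column is repeated, which is a deliberate change from the randi call in the question.

```matlab
matt = rand(102);      % stand-in for the original matrix (assumption)
orders = 2:8;          % submatrix orders to extract
ittr = 30;             % number of random submatrices per order

% subs{c,i} holds the i-th submatrix of order orders(c)
subs = cell(numel(orders), ittr);
for c = 1:numel(orders)
    for i = 1:ittr
        rr = randperm(size(matt,1), orders(c));  % distinct random indices
        subs{c,i} = matt(rr, rr);                % same indices for rows and columns
    end
end
```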
Defining a set of random sizes is pretty easy using the randi function. Once this is done, they can be projected along your iterations number N using arrayfun. Within the iterations, the randperm and sort functions can be used in order to build the random indexers to the original matrix M.
Here is the full code:
% Define the starting parameters...
M = rand(102);
N = 30;
% Retrieve the matrix rows and columns...
M_rows = size(M,1);
M_cols = size(M,2);
% Create a vector of random sizes between 2 and 8...
sizes = randi(7,N,1) + 1;
% Generate the random submatrices and insert them into a vector of cells...
subs = arrayfun(@(x)M(sort(randperm(M_rows,x)),sort(randperm(M_cols,x))),sizes,'UniformOutput',false);
This can work on any type of matrix, even non-square ones.
You don't need another loop, one is enough. If you use randi to get a random integer as the size of your submatrix, and then use that to get random column and row indices, you can easily get a random submatrix. Do note that the output is a cell, as the submatrices won't all be of the same size.
N=102; % Or substitute with some size function
matt = rand(N); % Initial matrix, use your own
itr = 30; % Number of iterations
mattsub = cell(itr,1); % Cell for non-uniform output
for ii = 1:itr
X = randi(7)+1; % Get random integer between 2 and 8
colr = randi(N-X+1); % Random starting column, so the block stays inside the matrix
rowr = randi(N-X+1); % Random starting row
mattsub{ii} = matt(rowr:(rowr+X-1),colr:(colr+X-1));
end

Generate a random sparse matrix with N non-zero-elements

I've written a function that generates a sparse matrix of size d-by-n
and puts 2 non-zero values in each column.
function [M] = generateSparse(n,d)
M = sparse(d,n);
sz = size(M);
nnzs = 2;
val = ceil(rand(nnzs,n));
inds = zeros(nnzs,n); % one column of row indices per matrix column
for i=1:n
ind = randperm(d,nnzs);
inds(:,i) = ind;
end
points = (1:n);
nnzInds = zeros(nnzs,n);
for i=1:nnzs
nnzInd = sub2ind(sz, inds(i,:), points);
nnzInds(i,:) = nnzInd;
end
M(nnzInds) = val;
end
However, I'd like to be able to give the function another parameter num-nnz which will make it choose randomly num-nnz cells and put there 1.
I can't use sprand because it requires a density, and I need the number of non-zero entries to be independent of the matrix size; a density inherently depends on the matrix size.
I am a bit confused on how to pick the indices and fill them... I did with a loop which is extremely costly and would appreciate help.
EDIT:
Everything has to be sparse. A big enough matrix will crash in memory if I don't do it in a sparse way.
You seem close!
You could pick num_nnz random (unique) integers between 1 and the number of elements in the matrix, then assign the value 1 to the elements at those indices.
To pick the random unique integers, use randperm. To get the number of elements in the matrix use numel.
M = sparse(d, n); % create dxn sparse matrix
num_nnz = 10; % number of non-zero elements
idx = randperm(numel(M), num_nnz); % get unique random indices
M(idx) = 1; % Assign 1 to those indices
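If M is very large, it may be cheaper to build the matrix in a single call from row and column subscripts than to assign into an existing sparse matrix. The following is a sketch under that assumption, with illustrative sizes. It relies on d*n being small enough for randperm to represent exactly.

```matlab
d = 1000;        % number of rows (illustrative)
n = 2000;        % number of columns (illustrative)
num_nnz = 10;    % number of non-zero elements wanted

% Draw num_nnz distinct linear indices, then convert them to subscripts
idx = randperm(d*n, num_nnz);
[rows, cols] = ind2sub([d n], idx);

% Build the sparse matrix directly; the scalar 1 is used for every entry
M = sparse(rows, cols, 1, d, n);
```

Because the linear indices are distinct, no two entries land on the same position, so nnz(M) equals num_nnz.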

Find the product of all entries of vector x

Here is what I am trying to do:
Let x be a vector with n entries x1, x2, ..., xn. Write a MATLAB program which computes the vector p with entries defined by
pk = x1*x2*...*x(k-1)*x(k+1)*...*xn
for each k = 1, 2, ..., n.
That is, pk is the product of all the entries of x except xk. (Use the prod command to compute the product of all the entries, then divide by xk.) Take the appropriate special action if one or more of the entries of x is zero. Use vectors throughout and no 'for' loop.
I have spent a lot of time on this problem and still cannot figure it out. Please help!
Brute force:
n = numel(x);
X = repmat(x(:),1,n); %// put vector in column form and repeat
X(1:n+1:end) = 1; %// make diagonal 1
result = prod(X); %// product of each column
Saving computations:
ind = find(x==0);
if numel(ind)>1 %// result is all zeros
result = zeros(size(x));
elseif numel(ind)==1 %// result is all zeros except at one entry
result = zeros(size(x));
result(ind) = prod(nonzeros(x));
else %// compute product of all elements and divide by each element
result = prod(x)./x;
end
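As a quick check of the two approaches, here is a small worked example on hand-picked vectors (the vectors and variable names are illustrative). The brute-force version handles zeros without any special case, so it can serve as a reference for the division-based shortcut.

```matlab
% Case 1: no zeros, so prod(x)./x is valid
x = [2 3 4];
n = numel(x);
X = repmat(x(:), 1, n);
X(1:n+1:end) = 1;          % replace x(k) by 1 in column k
p_brute = prod(X, 1);      % [12 8 6]
p_div = prod(x)./x;        % [12 8 6]

% Case 2: exactly one zero, so only that position is non-zero
y = [2 0 4];
Y = repmat(y(:), 1, numel(y));
Y(1:numel(y)+1:end) = 1;
q_brute = prod(Y, 1);      % [0 8 0]
```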

How to see resampled data after BOOTSTRAP

I was trying to resample (with replacement) my database using 'bootstrap' in Matlab as follows:
D = load('Data.txt');
lead = D(:,1);
depth = D(:,2);
X = D(:,3);
Y = D(:,4);
%Bootstrapping to resample 100 times
[resampling100,bootsam] = bootstrp(100,'corr',lead,depth);
%plotting the bootstrapping result as a histogram
hist(resampling100,10);
... ... ...
... ... ...
Though the script written above is correct, I wonder how I would be able to see/load the 100 resampled datasets created through bootstrap? 'bootsam(:)' displays the indices of the data/values selected for the bootstrap samples, but not the new sample values! Isn't it funny that I'm creating fake data from my original data and I can't even see what is created behind the scenes?
My second question: is it possible to resample the whole matrix (in this case, D) altogether without using any function? However, I know how to create random values from a vector data using 'unidrnd'.
Thanks in advance for your help.
The answer to question 1 is that bootsam provides the indices of the resampled data. Specifically, the nth column of bootsam provides the indices of the nth resampled dataset. In your case, to obtain the nth resampled dataset you would use:
lead_resample_n = lead(bootsam(:, n));
depth_resample_n = depth(bootsam(:, n));
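The same indexing also works with the whole bootsam matrix at once, which gives all 100 resampled datasets in one step. This is a minimal sketch that assumes the Statistics Toolbox is available (bootstrp and corr come from it) and uses synthetic vectors in place of the columns of Data.txt. The names lead_all and depth_all are illustrative.

```matlab
% Synthetic stand-in for the columns of Data.txt (assumption)
lead = randn(50, 1);
depth = 0.5*lead + randn(50, 1);

% bootsam is 50-by-100: column n holds the row indices of resample n
[resampling100, bootsam] = bootstrp(100, @corr, lead, depth);

% Indexing with the whole index matrix returns every resample at once
lead_all = lead(bootsam);    % 50-by-100, column n is the nth resampled lead
depth_all = depth(bootsam);  % 50-by-100, column n is the nth resampled depth

% Recompute the statistic for one resample to confirm the correspondence
n = 7;
r = corrcoef(lead_all(:, n), depth_all(:, n));
fprintf('bootstrp: %.6f, recomputed: %.6f\n', resampling100(n), r(1, 2));
```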
Regarding the second question, I'm guessing what you mean is, how would you just get a re-sampled dataset without worrying about applying a function to the resampled data. Personally, I would use randi, but in this situation, it is irrelevant whether you use randi or unidrnd. An example follows that assumes 4 columns of some data matrix D (as in your question):
%# Build an example dataset
T = 10;
D = randn(T, 4);
%# Obtain a set of random indices, ie indices of draws with replacement
Ind = randi(T, T, 1);
%# Obtain the resampled data
DResampled = D(Ind, :);
To create multiple re-sampled data, you can simply loop over the creation of random indices. Or you could do it in one step by creating a matrix of random indices and using that to index D. With careful use of reshape and permute you can turn this into a T*4*M array, where indexing m = 1, ..., M along the third dimension yields the mth resampled dataset. Example code follows:
%# Build an example dataset
T = 10;
M = 3;
D = randn(T, 4);
%# Obtain a set of random indices, ie indices of draws with replacement
Ind = randi(T, T, M);
%# Obtain the resampled data
DResampled = permute(reshape(D(Ind, :)', 4, T, []), [2 1 3]);
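Since the reshape and permute step is easy to get wrong, it can help to confirm that each page of the result matches direct indexing with the corresponding column of Ind. The check below is a small sketch that reuses the variable names from the example; pagesMatch is an illustrative name.

```matlab
T = 10;
M = 3;
D = randn(T, 4);
Ind = randi(T, T, M);
DResampled = permute(reshape(D(Ind, :)', 4, T, []), [2 1 3]);

% Page m of DResampled should equal the rows of D picked by column m of Ind
pagesMatch = true;
for m = 1:M
    pagesMatch = pagesMatch && isequal(DResampled(:, :, m), D(Ind(:, m), :));
end
disp(pagesMatch)
```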

Matlab Covariance Matrix Computation for Different Classes

I've got 2 different files: one of them is an input matrix (X) which has 3823*63 elements (3823 inputs and 63 features), and the other is a class vector (R) which has 3823*1 elements; those elements have values from 0 to 9 (there are 10 classes).
I have to compute a covariance matrix for every class. So far, I could only compute the mean vector for every class, using many nested loops, and it is getting unmanageable.
Is there any other easy way?
Here is the code that serves my purpose (thanks to Sam Roberts):
xTra = importdata('optdigits.tra');
xTra = xTra(:,2:64); % first column's inputs are all zero
rTra = importdata('optdigits.tra');
rTra = rTra(:,65); % classes of the data
c = numel(unique(rTra));
for i = 1:c
rTrai = (rTra==i-1); % Get indices of the elements from the ith class
meanvect{i} = mean(xTra(rTrai,:)); % Calculate their mean
covmat{i} = cov(xTra(rTrai,:)); % Calculate their covariance
end
Does this do what you need?
X = rand(3823,63);
R = randi(10,3823,1)-1; % class labels 0..9
numClasses = numel(unique(R));
meanvect = cell(numClasses,1);
covmat = cell(numClasses,1);
for i = 1:numClasses
Ri = (R==i-1); % Logical index of the elements from the ith class (label i-1)
meanvect{i} = mean(X(Ri,:)); % Calculate their mean
covmat{i} = cov(X(Ri,:)); % Calculate their covariance
end
This code loops through each of the classes, selects the rows of R that correspond to observations from that class, and then gets the same rows from X and calculates their mean and covariance. It stores them in cell arrays, so you can access the results like this (cell i holds class label i-1, because the labels start at 0):
% Display the mean vector of the first class (label 0)
meanvect{1}
% Display the covariance matrix of the second class (label 1)
covmat{2}
Hope that helps!
Don't use mean and sum as variable names, because they are the names of useful MATLAB built-in functions. (Type doc mean or doc sum for usage help.)
Also cov will calculate the covariance matrix for you.
You can use logical indexing to pull out the examples.
m = numel(unique(rTra)); % number of classes (labels 0..m-1)
covarianceMatrices = cell(m,1);
for k=0:m-1
covarianceMatrices{k+1} = cov(xTra(rTra==k,:)); % cell indices start at 1
end
One-liner
covarianceMatrices = arrayfun(@(k) cov(xTra(rTra==k,:)), 0:m-1, 'UniformOutput', false);
First construct the data matrix for each class.
Second compute the covariance for each data matrix.
The code below does this.
% assume allData contains all the data you've read in, each row is one data point
% assume classVector contains the class of each data point
numClasses = 10;
data = cell(10,1); %use cells to store each of the data matrices
covariance = cell(10,1); %store each of the class covariance matrices
[numData dummy] = size(allData);
%get the data out of allData and into each class' data matrix
%there is probably a nice matrix way to do this, but this is hopefully clearer
for i = 1:numData
currentClass = classVector(i) + 1; %Matlab indexes from 1
currentData = allData(i,:);
data{currentClass} = [data{currentClass}; currentData];
end
%calculate the covariance matrix for each class
for i = 1:numClasses
covariance{i} = cov(data{i});
end
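The approaches above all hard-code the labels 0 to 9. If the labels might be arbitrary, one option is to let unique map them to group numbers 1..K first. This is a sketch with random stand-in data; classes, g, covmat and meanvect are illustrative names.

```matlab
X = rand(3823, 63);            % stand-in for the feature matrix (assumption)
R = randi(10, 3823, 1) - 1;    % stand-in class labels 0..9 (assumption)

% classes lists the distinct labels; g(i) is the group number of row i
[classes, ~, g] = unique(R);
K = numel(classes);

% One covariance matrix and one mean vector per class, in the order of classes
covmat = arrayfun(@(k) cov(X(g == k, :)), (1:K)', 'UniformOutput', false);
meanvect = arrayfun(@(k) mean(X(g == k, :), 1), (1:K)', 'UniformOutput', false);
```

Here covmat{k} belongs to the label classes(k), so the shift between 0-based labels and 1-based cell indices no longer has to be handled by hand.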