Augmented matrix rounding issue [duplicate] - matlab

This question already has an answer here:
How to round double to something that a 'normal' human can read. (MATLAB)
(1 answer)
Closed 6 years ago.
I am trying to create an augmented matrix to solve a problem, but I can't get MATLAB to stop rounding the values. I want to augment the matrix Diff with the column vector d. The decimal values in Diff should remain decimals and the larger values in d should remain large, yet whenever I concatenate them, MATLAB appears to reduce all of the values. Why is it doing this, and how do I fix it?
d = [74000;56000;10500;25000;17500;196000;5000]
d =
74000
56000
10500
25000
17500
196000
5000
Diff = I - A
Diff =
0.8412 -0.0064 -0.0025 -0.3404 -0.0014 -0.0083 -0.1594
-0.0057 0.7355 -0.0436 -0.0099 -0.0083 -0.0201 -0.3413
-0.0264 -0.1506 0.6443 -0.0139 -0.0142 -0.0070 -0.0236
-0.3299 -0.0565 -0.0495 0.6364 -0.0204 -0.0483 -0.0649
-0.0089 -0.0081 -0.0333 -0.0295 0.6588 -0.0237 -0.0020
-0.1190 -0.0901 -0.0996 -0.1260 -0.1722 0.7632 -0.3369
-0.0063 -0.0126 -0.0196 -0.0098 -0.0064 -0.0132 0.9988
Aug = [Diff,d]
Aug =
1.0e+05 *
0.0000 -0.0000 -0.0000 -0.0000 -0.0000 -0.0000 -0.0000 0.7400
-0.0000 0.0000 -0.0000 -0.0000 -0.0000 -0.0000 -0.0000 0.5600
-0.0000 -0.0000 0.0000 -0.0000 -0.0000 -0.0000 -0.0000 0.1050
-0.0000 -0.0000 -0.0000 0.0000 -0.0000 -0.0000 -0.0000 0.2500
-0.0000 -0.0000 -0.0000 -0.0000 0.0000 -0.0000 -0.0000 0.1750
-0.0000 -0.0000 -0.0000 -0.0000 -0.0000 0.0000 -0.0000 1.9600
-0.0000 -0.0000 -0.0000 -0.0000 -0.0000 -0.0000 0.0000 0.0500

MATLAB is not rounding any values. If you look at the top-left corner of the displayed Aug, you will see 1.0e+05 *, which means that every value shown is the actual value divided by 1e5 (a common scale factor). Since you are concatenating very large values (d) with relatively small values (Diff), the significant digits of the small values don't appear because the display does not show enough decimal places. As a result, they look like 0. This is an artifact of the way the command window displays numbers.
You can change the display format to something else such as "shortg" which is typically used for large data ranges (the default is "short") and you will see that your data is not rounded.
format shortg
[Diff, d]
0.8412 -0.0064 -0.0025 -0.3404 -0.0014 -0.0083 -0.1594 74000
-0.0057 0.7355 -0.0436 -0.0099 -0.0083 -0.0201 -0.3413 56000
-0.0264 -0.1506 0.6443 -0.0139 -0.0142 -0.007 -0.0236 10500
-0.3299 -0.0565 -0.0495 0.6364 -0.0204 -0.0483 -0.0649 25000
-0.0089 -0.0081 -0.0333 -0.0295 0.6588 -0.0237 -0.002 17500
-0.119 -0.0901 -0.0996 -0.126 -0.1722 0.7632 -0.3369 1.96e+05
-0.0063 -0.0126 -0.0196 -0.0098 -0.0064 -0.0132 0.9988 5000
In general, you should rarely rely on the MATLAB command window output for much. If you think your data is being rounded, then you would actually want to test this explicitly.
data = [Diff, d];
isequal(Diff, data(:,1:end-1))
1
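As a minimal sketch of that kind of explicit check, the snippet below uses a small stand-in for Diff and d (hypothetical values, not the full matrices from the question). It compares both parts of the concatenated matrix exactly and prints one element with sprintf, which does not depend on the current format setting.

```matlab
% Small stand-in data (hypothetical values, not the asker's full matrices)
Diff = [0.8412 -0.0064; -0.0057 0.7355];
d = [74000; 56000];
Aug = [Diff, d];

% Exact comparisons: concatenation copies values without altering them
sameLeft = isequal(Aug(:, 1:end-1), Diff);
sameRight = isequal(Aug(:, end), d);

% Print one element with explicit precision, independent of "format"
txt = sprintf('%.4f', Aug(1, 1));
disp(txt)
```

If both comparisons return true, the stored values are intact and only the display is scaled.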

Related

Cluster Analysis: correcting observations with negative silhouette width

I am trying to find patterns in a dataset (~1000 series) containing time series data with yearly frequency. Some sample data:
V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 V11 V12 V13 V14 V15 V16 V17 V18
1 1.0000 0.6154 0.0000 0.0769 0.0000 0.0000 0.0000 0.2308 0.6923 0.6923 0.6923 0.6923 0.6923 0.3846 0.3846 0.0769 0.0769 0.0769
2 1.0000 0.8354 0.5274 0.4451 0.4604 0.4634 0.4543 0.2195 0.0976 0.1159 0.0793 0.0000 0.0152 0.0305 0.0305 0.0335 0.0915 0.0152
3 0.9524 0.8571 0.2381 0.1429 0.6667 1.0000 1.0000 0.1905 0.4286 0.3810 0.3810 0.5714 0.0952 0.1905 0.0000 0.0000 0.0952 0.8571
4 0.9200 1.0000 0.6000 0.4000 0.0000 0.4200 0.3600 0.4400 0.4200 0.3200 0.4800 0.6400 0.5200 0.5200 0.5200 0.5400 0.4800 0.7800
5 0.8372 1.0000 0.7209 0.7907 0.6279 0.6047 0.6047 0.6279 0.5349 0.4419 0.4419 0.2791 0.4419 0.2326 0.1860 0.1860 0.1860 0.0000
6 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.6154 0.6154 0.6154 0.6154 1.0000
Note that the data is normalized, because I want to cluster the time series based on similar shapes. I thought a cluster analysis would be appropriate, so I tried to cluster the time series with the following function:
a <- factoextra::eclust(Normalized_df, FUNcluster = "kmeans", nstart = 25, k.max = 5)
However, a couple of observations have a negative silhouette width. Is there a way to correct these assignments? For example, if sil_width is negative, place the observation in its neighbor cluster. An example can be found below.
cluster neighbor sil_width
1 1 3 -0.001258464
2 1 3 -0.004661913
3 1 4 -0.010083277
4 1 4 -0.012569472
5 1 3 -0.012793575
6 1 4 -0.013089868
7 1 5 -0.013346165
The motivation is to correct for these observations, in order to increase the average silhouette width for the clusters.
Any help would be much appreciated!
Moving points with a negative silhouette to another cluster would likely decrease the silhouette of other points in that cluster. It's not obvious how to further improve the results, and a) the best solution may contain negative silhouette values, and b) it might be impossible to find a solution with only positive values. Last but not least, c) it will not be a k-means clustering result anymore, because some points will no longer be assigned to the closest mean.
The core reason is that the scores within each cluster are coupled. Moving one point to another cluster changes the scores of all the points in both clusters.
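To see this coupling on a small scale, here is a MATLAB sketch (the question itself uses R) that computes silhouette widths by hand for a hypothetical 1-D data set with two clusters, before and after moving the single point that has a negative width. The data and labels are invented for illustration. In this toy case the other scores happen to change in a harmless direction, but in general the direction depends on the data.

```matlab
% Hypothetical 1-D data; point 4 sits between the two groups
x = [0; 1; 2; 4.4; 6; 7; 8];
% Column 1: original labels, column 2: labels after moving point 4
labelSets = [1 1 1 1 2 2 2; 1 1 1 2 2 2 2]';
n = numel(x);
D = abs(bsxfun(@minus, x, x'));        % pairwise distances

s = zeros(n, 2);                       % silhouette widths, one column per labeling
for c = 1:2
    lab = labelSets(:, c);
    for i = 1:n
        own = (lab == lab(i));
        own(i) = false;                % exclude the point itself
        a = mean(D(i, own));           % mean distance within own cluster
        b = mean(D(i, lab ~= lab(i))); % mean distance to the other cluster (only two here)
        s(i, c) = (b - a) / max(a, b);
    end
end
disp(s)
```

Comparing the two columns of s row by row shows that every point's width changes, not just the width of the point that was moved.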

How to vectorize this code involving matrix pages in MATLAB?

Is it possible to vectorize, and possibly run on a GPU, the following code
x = linspace(0,100,1000);
h = zeros(size(x));
for i = 1 : length(x)
    exprho = expm(-x(i)*rho);
    h(i) = trace(drho*exprho*drho*exprho);
end
out = 2 * trapz(x,h);
where rho and drho are two complex Hermitian square matrices of the same size. rho is in fact a quantum density matrix and drho is its derivative with respect to a parameter.
The size can range from 10 x 10 to 300 x 300 approximately but I would also like to reach bigger sizes.
Here are two sample matrices:
rho =
0.4046 0.3849 0.2589 0.1422 0.0676 0.0288 0.0112 0.0040 0.0014 0.0004 0.0001
0.3849 0.3661 0.2462 0.1352 0.0643 0.0274 0.0106 0.0038 0.0013 0.0004 0.0001
0.2589 0.2462 0.1656 0.0910 0.0433 0.0184 0.0071 0.0026 0.0009 0.0003 0.0001
0.1422 0.1352 0.0910 0.0500 0.0238 0.0101 0.0039 0.0014 0.0005 0.0002 0.0000
0.0676 0.0643 0.0433 0.0238 0.0113 0.0048 0.0019 0.0007 0.0002 0.0001 0.0000
0.0288 0.0274 0.0184 0.0101 0.0048 0.0020 0.0008 0.0003 0.0001 0.0000 0.0000
0.0112 0.0106 0.0071 0.0039 0.0019 0.0008 0.0003 0.0001 0.0000 0.0000 0.0000
0.0040 0.0038 0.0026 0.0014 0.0007 0.0003 0.0001 0.0000 0.0000 0.0000 0.0000
0.0014 0.0013 0.0009 0.0005 0.0002 0.0001 0.0000 0.0000 0.0000 0.0000 0.0000
0.0004 0.0004 0.0003 0.0002 0.0001 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
0.0001 0.0001 0.0001 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
drho =
0.0366 0.0156 -0.0025 -0.0085 -0.0074 -0.0046 -0.0023 -0.0010 -0.0004 -0.0002 -0.0001
0.0156 -0.0035 -0.0147 -0.0148 -0.0103 -0.0057 -0.0028 -0.0012 -0.0005 -0.0002 -0.0001
-0.0025 -0.0147 -0.0181 -0.0145 -0.0091 -0.0048 -0.0022 -0.0009 -0.0004 -0.0001 -0.0000
-0.0085 -0.0148 -0.0145 -0.0105 -0.0062 -0.0031 -0.0014 -0.0006 -0.0002 -0.0001 -0.0000
-0.0074 -0.0103 -0.0091 -0.0062 -0.0035 -0.0017 -0.0008 -0.0003 -0.0001 -0.0000 -0.0000
-0.0046 -0.0057 -0.0048 -0.0031 -0.0017 -0.0008 -0.0004 -0.0001 -0.0001 -0.0000 -0.0000
-0.0023 -0.0028 -0.0022 -0.0014 -0.0008 -0.0004 -0.0002 -0.0001 -0.0000 -0.0000 -0.0000
-0.0010 -0.0012 -0.0009 -0.0006 -0.0003 -0.0001 -0.0001 -0.0000 -0.0000 -0.0000 -0.0000
-0.0004 -0.0005 -0.0004 -0.0002 -0.0001 -0.0001 -0.0000 -0.0000 -0.0000 -0.0000 -0.0000
-0.0002 -0.0002 -0.0001 -0.0001 -0.0000 -0.0000 -0.0000 -0.0000 -0.0000 -0.0000 -0.0000
-0.0001 -0.0001 -0.0000 -0.0000 -0.0000 -0.0000 -0.0000 -0.0000 -0.0000 -0.0000 -0.0000
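No answer to this question appears in the thread. One possible approach, offered only as a sketch and assuming rho is exactly Hermitian, is to diagonalize rho once so that expm is no longer needed inside the loop. Writing rho = V*diag(lam)*V' and M = V'*drho*V, the trace becomes a sum over |M(j,k)|^2 * exp(-x*(lam(j)+lam(k))), which can be evaluated for all x at once with matrix products. The 2 x 2 matrices below are hypothetical stand-ins, and the original loop is kept only as a reference for comparison.

```matlab
% Hypothetical small Hermitian test matrices (stand-ins for rho and drho)
rho  = [0.6 0.2; 0.2 0.4];
drho = [0.1 -0.05; -0.05 -0.1];
x = linspace(0, 100, 1000);

% Diagonalize rho once: rho = V * diag(lam) * V'
[V, Lam] = eig((rho + rho') / 2);    % symmetrize to guard against round-off
lam = real(diag(Lam));               % eigenvalues of a Hermitian matrix are real

% In the eigenbasis:
% trace(drho*E*drho*E) = sum_jk |M(j,k)|^2 * exp(-x*(lam(j)+lam(k)))
M = V' * drho * V;
P = abs(M) .^ 2;
Ex = exp(-lam * x);                  % n-by-numel(x), one column per grid point
h = sum(Ex .* (P * Ex), 1);          % h(i) = Ex(:,i)' * P * Ex(:,i)
out = 2 * trapz(x, h);

% Reference: the loop from the question, for comparison
hLoop = zeros(size(x));
for i = 1:length(x)
    exprho = expm(-x(i) * rho);
    hLoop(i) = real(trace(drho * exprho * drho * exprho));
end
maxErr = max(abs(h - hLoop));
```

The eigendecomposition is done once, and the remaining work is a dense matrix product plus elementwise operations, which is the kind of computation that usually vectorizes well.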

Why am I getting a matrix of zeros and infs when dividing matrices of the same dimension by element?

I'm trying to get two matrices to divide, properly, element by element.
Essentially, firstd is a 6x499 matrix and secd is a 6x498 matrix. I first eliminate firstd's extra column by doing firstd(:,499)=[]; making it 6x498. The next step is to transform firstd into the numerator, nom=((firstd.^2)+1).^1.5; My denominator is just denom=secd;
Both nom and denom have come out as 6x498 matrices with real, non-zero data for each element. However, when doing Rlayer=nom./denom, Rlayer comes out as this ludicrous 6x498 zero-ridden matrix.
I also trimmed out the elements in denom that were =0 by changing them to 0.0001.
Segment of result for Rlayer (Columns 493 through 498)
-0.0000 0.0000 -0.0000 0.0000 -0.0000 0.0000
-0.0000 0.0000 -0.0000 0.0000 0.0000 -0.0000
-0.0000 0.0000 -0.0000 0.0000 0.0000 -0.0000
-0.0000 0.0000 -0.0000 0.0000 -0.0000 0.0000
-0.0000 0.0000 -0.0000 0.0000 -0.0000 0.0000
-0.0000 0.0000 -0.0000 0.0000 0.0000 -0.0000
Below are two segments of denom (Columns 487 through 492)
0.0250 0.0281 -0.0281 0.0125 -0.0500 0.0969
-0.0125 0.0750 -0.1219 0.1094 -0.0938 0.0937
0.0344 0.0406 -0.1094 0.1187 -0.1344 0.1531
0.0001 0.0250 0.0001 -0.0437 0.0500 0.0062
0.0781 -0.0219 0.0094 -0.0125 -0.0188 0.1062
0.0250 0.0438 -0.0812 0.0937 -0.1063 0.1562
(Columns 493 through 498)
-0.1187 0.1156 -0.0844 0.0688 -0.0406 0.0125
-0.0969 0.1094 -0.0906 0.0469 0.0062 -0.0156
-0.1375 0.1719 -0.1656 0.0781 0.0187 -0.0531
-0.0562 0.1188 -0.1500 0.1438 -0.1187 0.1187
-0.1781 0.2281 -0.2156 0.1750 -0.1250 0.0812
-0.1750 0.1938 -0.1469 0.0563 0.0031 -0.0156
and this is a segment of nom (Columns 493 through 498)
1.0904 1.0235 1.0881 1.0368 1.0769 1.0514
1.0685 1.0201 1.0769 1.0272 1.0497 1.0532
1.0928 1.0180 1.1210 1.0201 1.0568 1.0685
1.0568 1.0285 1.1001 1.0170 1.0952 1.0260
1.0952 1.0078 1.1380 1.0107 1.1026 1.0272
1.0928 1.0078 1.1077 1.0212 1.0463 1.0480
Why is this division leading to this result? I've tried dividing with rdivide, in a double for loop, and row by row in a for loop. All number types are double.

spectral clustering

First off I must say that I'm new to matlab (and to this site...) , so please excuse my ignorance.
I'm trying to write a function in matlab that will use Spectral Clustering to split a set of points into two clusters.
my code is as follows
function Groups = TrySpectralClustering(data)
dist_mat = squareform(pdist(data));
W= zeros(length(data),length(data));
for i=1:length(data),
    for j=(i+1):length(data),
        W(i,j)=10^(-dist_mat(i,j));
        W(j,i)=W(i,j);
    end
end
D = zeros(length(data),length(data));
for i=1:length(W),
    D(i,i)=sum(W(i,:));
end
L=D-W;
L=D^(-0.5)*L*D^(-0.5);
[ V E ] = eig(L);
disp ('V:');
disp (V);
If I understand correctly, then by using the eigenvector of the second smallest eigenvalue I should be able to partition the data into two clusters: if the ith entry of that eigenvector is positive, the ith data point belongs to one cluster, otherwise it belongs to the other.
However, when I try the following
f=[1,1;0,0;1,0;0,1;100,100;100,101;101,101;101,100]
TrySpectralClustering(f)
I would expect that the first four points would form one cluster, and the last four would form another.
However, I receive
V:
-0.0000 -0.5000 0.0000 -0.5777 0.0000 0.4078 -0.0000 0.5000
-0.0000 -0.5000 0.0000 0.5777 0.0000 -0.4078 -0.0000 0.5000
-0.0000 -0.5000 0.0000 0.4078 0.0000 0.5777 -0.0000 -0.5000
-0.0000 -0.5000 0.0000 -0.4078 0.0000 -0.5777 -0.0000 -0.5000
-0.5000 -0.0000 -0.0000 -0.0000 -0.7071 -0.0000 0.5000 -0.0000
-0.5000 -0.0000 0.7071 0.0000 -0.0000 -0.0000 -0.5000 -0.0000
-0.5000 0.0000 -0.0000 0.0000 0.7071 0.0000 0.5000 0.0000
-0.5000 0 -0.7071 0 0 0 -0.5000 0
Taking the 2nd eigenvector
-0.0000 -0.5000 0.0000 0.5777 0.0000 -0.4078 -0.0000 0.5000
I find the one cluster includes the points 1,0;0,1;100,100;101,100
and the other cluster is made from the points 1,1;0,0;100,101;101,101
I wonder what I am doing wrong.
Note: I am working on the above as a part of a homework project.
Thanks in advance!
What you are getting is correct. Let U be the matrix containing the eigenvectors as shown above, with the columns arranged so that the 1st column corresponds to the smallest eigenvalue and subsequent columns correspond to ascending eigenvalues. Then take a subset of the columns of U, retaining the eigenvectors that correspond to the smaller eigenvalues. Now read these columns row-wise into a new set of vectors, call it Y. Cluster Y to get the spectral clusters. For example, assume our subset is only the first column. If you were to cluster the first column, you would clearly get the first 4 points in one cluster and the next 4 in another, which is what you want.
Take a look at the implementation on Prof. J. Shi's webpage. Pay close attention to discretisation.m function.
Moreover, your code is very inefficient. You need to take more advantage of Matlab's vectorization:
n = size( dist_mat, 1 ); % number of data points
W = 10.^( - dist_mat ); % one-liner replacing the nested loop for computing W
W( 1:n+1:end ) = 0; % keep the diagonal at zero, as the original loop does
% computing the symmetric Laplacian
d = sum( W, 2 ); % sum each row
d( d == 0 ) = 1; % avoid division by zero
d_half = 1./sqrt( d );
L = eye( n ) - bsxfun( @times, bsxfun( @times, W, d_half' ), d_half );
Two observations:
L=D-W; L=D^(-0.5)*L*D^(-0.5);
Why make MATLAB compute the identity matrix? D^(-0.5) * D * D^(-0.5) is just the identity, so use eye(n) and subtract D^(-0.5) * W * D^(-0.5) from it to get the Laplacian L.
eig returns the eigenvectors as columns, so why do you take a row? Did you check the corresponding eigenvalues in E, to be sure you are looking at the eigenvector that belongs to the 2nd smallest eigenvalue?
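Putting these observations together, a minimal end-to-end sketch might look like the following. It is an illustration under stated assumptions, not the asker's or answerers' exact code: the toy data uses two groups that are closer together than in the question (so the graph stays numerically connected and the second eigenvector is well defined), distances are computed without toolbox functions instead of pdist, and the variable names groups, evals, and order are introduced here.

```matlab
% Toy data: two groups of four points (hypothetical, closer than in the question)
data = [1 1; 0 0; 1 0; 0 1; 3 3; 3 4; 4 4; 4 3];
n = size(data, 1);

% Pairwise Euclidean distances without toolbox functions
G = data * data';
sq = diag(G);
dist_mat = sqrt(max(bsxfun(@plus, sq, sq') - 2 * G, 0));

% Affinity matrix with zero diagonal, as in the question's loop
W = 10 .^ (-dist_mat);
W(1:n+1:end) = 0;

% Symmetric normalized Laplacian: I - D^(-1/2) * W * D^(-1/2)
d_half = 1 ./ sqrt(sum(W, 2));
L = eye(n) - W .* (d_half * d_half');
L = (L + L') / 2;                 % remove any round-off asymmetry

% Eigenvectors are the COLUMNS of V; sort by eigenvalue explicitly
[V, E] = eig(L);
[evals, order] = sort(diag(E));
V = V(:, order);

% Partition by the sign of the eigenvector of the second-smallest eigenvalue
groups = (V(:, 2) > 0) + 1;
disp(groups')
```

Which group receives label 1 and which receives label 2 depends on the arbitrary sign of the eigenvector, so only the grouping itself is meaningful. With the question's original data the two groups are so far apart that the smallest two eigenvalues are numerically equal, which is why clustering the rows of the leading eigenvectors is the more robust procedure there.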

Powers Table MATLAB

For this question, I'm supposed to create a NxN powers table in matlab using arrays.
The code I have so far is as follows:
C = [];
D = [];
N = input('Enter the value you would like to use for your NxN Powers Table: ');
for i = 1:N
    for j = 1:N
        C = [C;i^j];
    end
    C = transpose(C);
    D = [D;C];
    C = [];
end
D
This code works perfectly fine for any number from 1 to 9. As soon as I enter anything greater than that, it prints out weird values.
Here is the output I have using 5 as an input, and the second one is using 10 as an input.
Enter the value you would like to use for your NxN Powers Table: 5
D =
1 1 1 1 1
2 4 8 16 32
3 9 27 81 243
4 16 64 256 1024
5 25 125 625 3125
Enter the value you would like to use for your NxN Powers Table: 10
D =
1.0e+010 *
0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0001
0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0002 0.0010
0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0002 0.0010 0.0060
0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0001 0.0006 0.0040 0.0282
0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0002 0.0017 0.0134 0.1074
0.0000 0.0000 0.0000 0.0000 0.0000 0.0001 0.0005 0.0043 0.0387 0.3487
0.0000 0.0000 0.0000 0.0000 0.0000 0.0001 0.0010 0.0100 0.1000 1.0000
Any ideas what could be wrong with my code? It seems like a simple fix, but I just can't figure out what's wrong with it. Any help is greatly appreciated. Thanks
Notice the 1.0e+010 *. It means that the displayed numbers must be multiplied by 10000000000. Five digits are not enough to show the small entries at that scale. Use format long or format short g to see the whole numbers.
I think your code works fine. Note that 10^10 = 1e10; the very last element in your output D is indeed 1e10. Check individual elements D(i,j) to verify that they are correct. MATLAB applies a common scale factor because some elements are so much larger than others; 1e10 has 11 digits, for instance, while 1^1 = 1 has 1 digit. Without the scale factor, the column spacing would get messed up.
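As a small sketch of that check, the snippet below builds the same table for N = 10 (a fixed example value instead of the input prompt) and inspects individual entries directly. It builds the table with cumprod rather than the question's nested loop, purely as an illustrative alternative; the names ok and rowText are introduced here.

```matlab
N = 10;                                  % example size instead of input(...)
base = repmat((1:N)', 1, N);             % row i is filled with the value i
D = cumprod(base, 2);                    % running product along each row: D(i,j) = i^j

% Check individual entries instead of trusting the scaled display
ok = (D(2, 5) == 32) && (D(10, 10) == 1e10);

% Print part of one row as whole numbers, independent of "format"
rowText = sprintf('%d ', D(3, 1:5));
disp(rowText)
```

Because the entries are built by repeated integer multiplication, they stay exact for this table size, so the equality checks are safe here.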