What's wrong with this [reading input from a text file in MATLAB]? - matlab

I have a text file (c:\input.txt) which has:
2.0 4.0 8.0 16.0 32.0 64.0 128.0 256.0 512.0 1024.0 2048.0 4096.0 8192.0
In Matlab, I want to read it as:
data = [2.0 4.0 8.0 16.0 32.0 64.0 128.0 256.0 512.0 1024.0 2048.0 4096.0 8192.0]
I tried this code:
fid=fopen('c:\\input.txt','rb');
data = fread(fid, inf, 'float');
data
but I am getting some garbage values:
data =
1.0e-004 *
0.0000
0.0015
0.0000
0.0000
0.0000
0.0000
0.0000
0.0001
0.0239
0.0000
0.0000
0.0000
0.0000
0.0066
0.0000
0.0000
0.0000
0.0000
0.0000
0.0000
0.0000
0.0016
0.0000
0.0000
0.0276
0.0000
0.3819
0.0000
0.0000
Where is the mistake?

fread is for reading binary files only!
The equivalent for text files is fscanf, used as follows:
fid = fopen('c:\\input.txt','rt');
data = fscanf(fid, '%f', inf)';
fclose(fid);
Or in your case, simply use load:
data = load('c:\\input.txt', '-ascii');
There are many other ways in MATLAB to read text data from files:
dlmread
textscan
importdata
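To see the difference concretely, here is a minimal sketch that writes a small text file and reads it back both ways. The temporary file name from tempname and the sample values are assumptions for illustration, not part of the question.

```matlab
% Write a small text file to a temporary location (hypothetical path).
fname = [tempname '.txt'];
fid = fopen(fname, 'wt');
fprintf(fid, '%.1f ', [128 256 512]);   % file contains: 128.0 256.0 512.0
fclose(fid);

% Text read: parse the characters as decimal numbers.
fid = fopen(fname, 'rt');
txt = fscanf(fid, '%f', inf)';          % row vector of the parsed values
fclose(fid);

% Binary read: reinterpret each group of 4 raw bytes as one single-precision float.
fid = fopen(fname, 'rb');
raw = fread(fid, inf, 'float');
fclose(fid);

disp(txt)   % the numbers that were written
disp(raw)   % unrelated values built from the ASCII codes of the characters
delete(fname);
```

The fread result is built from the character codes of the text, regrouped four bytes at a time, which is why it looks like garbage.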

Your file is a text file, so you should open it for text read:
fid=fopen('c:\\input.txt','rt');
Then, for reading, I find TEXTSCAN to be more powerful than FREAD/FSCANF. Using the file identifier fid from above, the call
data = textscan(fid, '%f')
returns a cell array. You can get at the contents with
>> data{1}
ans =
2
4
8
16
32
64
128
256
512
1024
2048
4096
8192
TEXTREAD is easier to use than TEXTSCAN, but according to the documentation is now outdated.
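A minimal end-to-end sketch of the TEXTSCAN approach, assuming a temporary file created just for the demonstration (the tempname path and sample values are hypothetical). It also converts the cell contents to the row vector asked for in the question.

```matlab
% Create a sample text file (hypothetical temporary path, for illustration only).
fname = [tempname '.txt'];
fid = fopen(fname, 'wt');
fprintf(fid, '%.1f ', [2 4 8 16 32]);
fclose(fid);

% Read all whitespace-separated numbers; textscan returns a 1-by-1 cell array.
fid = fopen(fname, 'rt');
c = textscan(fid, '%f');
fclose(fid);

data = c{1}.';   % column vector inside the cell, transposed to a row
disp(data)
delete(fname);
```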

Related

How do I perform operations on matrix rows while keeping the matrix intact?

Question/problem summary:
Create a 10 by 10 matrix whose first column is the numbers 1,2,3,4,5,6,7,8,9,10
the next column contains the squares of first column: 1, 4, 9,...,100
the third column contains the 3rd power of first column: 1, 8, 27,..., 1000.
the 10th column contains the 10th power of the first column.
Background:
This is for a class assignment, intro to analytical programming. I have tried the following code, but I am not sure why it is not giving the correct output. Any advice or suggestions are appreciated.
row1 = [1:10]
tenXtenMatrix = repmat(row1,10,1)
[row col] = size(tenXtenMatrix)
for i=2:row
for j=1:col
tenXtenMatrix(i,:).^i
end
end
what is expected:
1 2 3 4 5 6 7 8 9 10
1 4 9 16 25 36 49 64 81 100
1 8 27 64 125 216 343 512 729 1000
1 16 81 256 625 1296 2401 4096 6561 10000
etc..
what i got:
0.0000 0.0000 0.0000 0.0001 0.0010 0.0060 0.0282 0.1074 0.3487 1.0000
0.0000 0.0000 0.0000 0.0001 0.0010 0.0060 0.0282 0.1074 0.3487 1.0000
0.0000 0.0000 0.0000 0.0001 0.0010 0.0060 0.0282 0.1074 0.3487 1.0000
0.0000 0.0000 0.0000 0.0001 0.0010 0.0060 0.0282 0.1074 0.3487 1.0000
etc...
Using implicit expansion:
x = 1:10
A = x.^(x.')
Where:
.^ is the element-wise power operator
.' is the transpose operator
Implicit expansion is available in MATLAB R2016b and later; on older releases, bsxfun(@power, x, x.') gives the same result.
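As a quick check of the one-liner, the sketch below builds the table with implicit expansion and with bsxfun and compares the two. The variable names A and B are illustrative, and the first form assumes R2016b or newer.

```matlab
x = 1:10;

% Implicit expansion: row vector raised to a column of exponents.
% A(i,j) = x(j)^x(i), so row i holds the i-th powers of 1..10.
A = x .^ (x.');

% Equivalent form for releases without implicit expansion.
B = bsxfun(@power, x, x.');

assert(isequal(A, B));
disp(A(1:3, :))   % first three rows: 1..10, the squares, the cubes
```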
Fixes:
You loop over j but never use it, so the inner loop can be removed.
You compute the power but never assign the result back to the matrix.
row1 = [1:10];
tenXtenMatrix = repmat(row1,10,1);
[row col] = size(tenXtenMatrix);
for i=2:row
tenXtenMatrix(i,:) = tenXtenMatrix(i,:).^i;
end

Cluster Analysis: correcting observations with negative silhouette width

I am trying to find patterns in a dataset (~1000 series) containing time series data with yearly frequency. Some sample data:
V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 V11 V12 V13 V14 V15 V16 V17 V18
1 1.0000 0.6154 0.0000 0.0769 0.0000 0.0000 0.0000 0.2308 0.6923 0.6923 0.6923 0.6923 0.6923 0.3846 0.3846 0.0769 0.0769 0.0769
2 1.0000 0.8354 0.5274 0.4451 0.4604 0.4634 0.4543 0.2195 0.0976 0.1159 0.0793 0.0000 0.0152 0.0305 0.0305 0.0335 0.0915 0.0152
3 0.9524 0.8571 0.2381 0.1429 0.6667 1.0000 1.0000 0.1905 0.4286 0.3810 0.3810 0.5714 0.0952 0.1905 0.0000 0.0000 0.0952 0.8571
4 0.9200 1.0000 0.6000 0.4000 0.0000 0.4200 0.3600 0.4400 0.4200 0.3200 0.4800 0.6400 0.5200 0.5200 0.5200 0.5400 0.4800 0.7800
5 0.8372 1.0000 0.7209 0.7907 0.6279 0.6047 0.6047 0.6279 0.5349 0.4419 0.4419 0.2791 0.4419 0.2326 0.1860 0.1860 0.1860 0.0000
6 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.6154 0.6154 0.6154 0.6154 1.0000
Note that the data is normalized, because I want to cluster the time series based on similar shapes. I imagined that a cluster analysis would be an appropriate analysis and I tried to cluster the time series with the following function:
a <- factoextra::eclust(Normalized_df, FUNcluster = "kmeans", nstart = 25, k.max = 5)
However, I have a couple of observations which have a negative silhouette width. Is there a way to correct these assignments? For example, if the value of sil_width is negative, then place the observation in the neighbouring cluster. An example can be found below.
cluster neighbor sil_width
1 1 3 -0.001258464
2 1 3 -0.004661913
3 1 4 -0.010083277
4 1 4 -0.012569472
5 1 3 -0.012793575
6 1 4 -0.013089868
7 1 5 -0.013346165
The motivation is to correct for these observations, in order to increase the average silhouette width for the clusters.
Any help would be much appreciated!
Moving points with a negative silhouette to another cluster would likely decrease the silhouette of other points in that cluster. It's not obvious how to further improve the results: a) the best solution may contain negative silhouette values, b) it might be impossible to find a solution with only positive values, and c) the result will no longer be a k-means clustering, because some points will no longer be assigned to the closest mean.
The core reason is that the scores within each cluster are tied. Moving one point to another cluster changes all their scores.

Powers Table MATLAB

For this question, I'm supposed to create an NxN powers table in MATLAB using arrays.
The code I have so far is as follows:
C = [];
D = [];
N = input('Enter the value you would like to use for your NxN Powers Table: ');
for i = 1:N
for j = 1:N
C = [C;i^j];
end
C = transpose(C);
D = [D;C];
C = [];
end
D
This code works perfectly fine for any number from 1 to 9, but as soon as I enter anything greater than that, it prints out weird values.
Here is the output I have using 5 as an input, and the second one is using 10 as an input.
Enter the value you would like to use for your NxN Powers Table: 5
D =
1 1 1 1 1
2 4 8 16 32
3 9 27 81 243
4 16 64 256 1024
5 25 125 625 3125
Enter the value you would like to use for your NxN Powers Table: 10
D =
1.0e+010 *
0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0001
0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0002 0.0010
0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0002 0.0010 0.0060
0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0001 0.0006 0.0040 0.0282
0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0002 0.0017 0.0134 0.1074
0.0000 0.0000 0.0000 0.0000 0.0000 0.0001 0.0005 0.0043 0.0387 0.3487
0.0000 0.0000 0.0000 0.0000 0.0000 0.0001 0.0010 0.0100 0.1000 1.0000
Any ideas what could be wrong with my code? It seems like a simple fix, I just can't figure out what's wrong with it. Any help is greatly appreciated. Thanks
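Notice the 1.0e+010 * above the matrix. It means that every displayed number should be multiplied by 10000000000. At that scale, the four decimal places of the default format are not enough to show the smaller entries, so they appear as 0.0000. Run format long or format short g before displaying D to see the whole numbers.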
I think your code works fine. Note that 10^10 = 1e10; the very last element in your output D is indeed 1e10. Check individual elements D(i,j) to verify that those are correct. MATLAB can't display all the elements because some elements are so much larger than other ones; 1e10 has 10 digits in it, for instance, while 1^1 = 1 has 1 digit. So spacing would get screwed up if this behavior didn't happen.
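A small sketch of what both answers describe, assuming N = 10 and a vectorized construction in place of the nested loops (the implicit expansion used here needs R2016b or newer). The stored values are the same either way; only the display format changes.

```matlab
N = 10;

% D(i,j) = i^j, the same table the nested loops build.
D = (1:N).' .^ (1:N);

% Individual elements are stored correctly regardless of how the matrix prints.
fprintf('%d\n', D(2,5));    % 2^5
fprintf('%d\n', D(N,N));    % 10^10

% Switch the display format so no common scale factor is pulled out.
format short g
disp(D)
format short                % restore the default short format
```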

Matrix creation MATLAB

I am building an nxn matrix in MATLAB with the following code:
x = linspace(a,b,n);
for i=1:n
for j=1:n
A(i,j) = x(j)^(i-1);
end
A
i
b(i) = (1/i)*x(n)^i - (1/i)*x(1)^i;
end
I am testing it with a=1 b=10 and n=10. I get the expected results up to i=8
i =
8
A =
Columns 1 through 7
1 1 1 1 1 1 1
1 2 3 4 5 6 7
1 4 9 16 25 36 49
1 8 27 64 125 216 343
1 16 81 256 625 1296 2401
1 32 243 1024 3125 7776 16807
1 64 729 4096 15625 46656 117649
1 128 2187 16384 78125 279936 823543
1 256 6561 65536 390625 1679616 5764801
Columns 8 through 10
1 1 1
8 9 10
64 81 100
512 729 1000
4096 6561 10000
32768 59049 100000
262144 531441 1000000
2097152 4782969 10000000
16777216 43046721 100000000
however from i=9 on it becomes this:
i =
9
A =
1.0e+09 *
Columns 1 through 9
0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0001
0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0001 0.0003 0.0005
0.0000 0.0000 0.0000 0.0000 0.0001 0.0003 0.0008 0.0021 0.0048
0.0000 0.0000 0.0000 0.0001 0.0004 0.0017 0.0058 0.0168 0.0430
0.0000 0.0000 0.0000 0.0003 0.0020 0.0101 0.0404 0.1342 0.3874
Column 10
0.0000
0.0000
0.0000
0.0000
0.0000
0.0001
0.0010
0.0100
0.1000
1.0000
Can someone please tell me what is happening? I am not very experienced in matlab (I mostly use c++ or python) and so far can't seem to figure it out myself.
It's just a formatting issue for larger numbers. Try
sprintf('%20.0f', A(end,end))
and you will see that the number is correct, at least as long as the values stay below 2^53 (about 9.0e15). Beyond that point, doubles can no longer represent every integer exactly.
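The sketch below illustrates both halves of that remark: printing a single element in full, and the point where doubles stop representing consecutive integers. The variable names are illustrative; flintmax is the built-in that returns that limit for doubles.

```matlab
% A value like the bottom-right entry of the question's matrix (10^9).
v = 10^9;
s = sprintf('%20.0f', v);     % full digits, padded to width 20
disp(s)

% Largest value up to which every integer is exactly representable as a double.
limit = flintmax;             % equals 2^53

% Above that limit, adding 1 may no longer change the stored value.
stuck = (limit + 1 == limit);
disp(stuck)
```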
This happens because MATLAB applies a common scale factor when displaying your matrix. See in your output:
A =
1.0e+09 *
A common factor of 10^9 was factored out of every entry in your matrix.
You may want to adjust your output display using:
format short g