Importing text file into matrix form with indexes as strings? - matlab

I'm new to Matlab so bear with me. I have a text file in this form :
b0002 b0003 999
b0002 b0004 999
b0002 b0261 800
I need to read this file and convert it into a matrix. The first and second columns in the text file are the row and column indices of the matrix. I have another text file with a list of all values of the 'indices', so it should be possible to create an empty matrix beforehand.
b0002
b0003
b0004
b0005
b0006
b0007
b0008
Is there any way to access matrix elements using custom string indices (I doubt it, but just wondering)? If not, I'm guessing the only way to do this is to assign the first row and first column the index string values and then assign the third column values based on the first text file. Can anyone help me with that?

You can easily convert those strings to numbers and then use those as indices. For a given string, b0002:
s = 'b0002'
str2num(s(2:end)) % output = 2
Furthermore, you can also do this with a char matrix:
t = ['b0002';'b0003';'b0004']
t =
b0002
b0003
b0004
str2num(t(:,2:end))
ans =
2
3
4
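If the labels are held in a cell array of strings rather than a char matrix, str2double is a possible alternative to str2num; it does not need the strings to be the same length. A minimal sketch, with made-up labels:

```matlab
% Hypothetical labels; in practice these would come from the text file.
labels = {'b0002'; 'b0003'; 'b0261'};
% Drop the leading letter from each label.
digitsOnly = cellfun(@(s) s(2:end), labels, 'UniformOutput', false);
% str2double accepts a cell array of strings and returns a numeric array
% of the same size.
idx = str2double(digitsOnly);
```

Here idx is a 3-by-1 numeric column that can be used directly as row or column subscripts.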
First, we use textscan to read the data in as two strings and a float (other numeric formats would work too). We have to open the file for reading first, and close it once we are done.
fid = fopen('myfile.txt');
A = textscan(fid,'%s%s%f');
fclose(fid);
textscan returns a cell array, so we have to extract your three variables. x and y are converted to single char arrays using cell2mat (works only if all the strings inside are the same length), n is a list of numbers.
x = cell2mat(A{1});
y = cell2mat(A{2});
n = A{3};
We can now convert x and y to numbers by taking every row (:) but only the second through last characters of each row (2:end), e.g. 0002, 0003, not b0002, b0003.
x = str2num(x(:,2:end));
y = str2num(y(:,2:end));
Slight problem with indexing - if I have a matrix A and I do this:
A = magic(8);
A([1,5],[3,8])
Then it returns four elements - [1,3],[5,3],[1,8],[5,8] - not two. But what you want is the location in your matrix equivalent to x(1),y(1) to be set to n(1) and so on. To do this, we need to 1) work out the final size of matrix. 2) use sub2ind to calculate the right locations.
% find the size
% if you have a specified size you want the output to be use that instead
xsize = max(x);
ysize = max(y);
% initialise the output matrix - not always necessary but good practice
out = zeros(xsize,ysize);
% turn our x,y into linear indices
ind = sub2ind([xsize,ysize],x,y);
% put our numbers in our output matrix
out(ind) = n;
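Since the question mentions a second file listing every label, another option is to use a label's position in that list as its index, which keeps the matrix compact even when the numbers in the labels are sparse (b0261 would otherwise force 261 columns). The sketch below assumes the labels have already been read into cell arrays; the variable names and sample data are hypothetical.

```matlab
% Hypothetical data standing in for the two text files.
labels = {'b0002'; 'b0003'; 'b0004'; 'b0261'};  % master list of index strings
rowLab = {'b0002'; 'b0002'; 'b0002'};           % first column of the data file
colLab = {'b0003'; 'b0004'; 'b0261'};           % second column of the data file
vals   = [999; 999; 800];                       % third column of the data file

% For each label, ismember returns its position in the master list.
[foundR, r] = ismember(rowLab, labels);
[foundC, c] = ismember(colLab, labels);
assert(all(foundR) && all(foundC), 'Unknown label in data file');

% Fill a square matrix with one row and one column per label.
M = zeros(numel(labels));
M(sub2ind(size(M), r, c)) = vals;

% Look up a single value by its string labels.
[~, i] = ismember('b0002', labels);
[~, j] = ismember('b0261', labels);
v = M(i, j);
```

This does not give true string indexing, but the two ismember calls at the end come close to it.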

Related

Save Complex vector from Matlab to text file

I generate a vector of complex numbers with Matlab and I want to save that vector into a text file (.txt) to use it as input in my C code. The complex vector is generated like this:
y = zeros(1,N);
for n = 1:N
y(n) = exp(-1i*(n-1)*k*d*sind(Qtgt));
end
picture
So I tried the function dlmwrite to save the vector into a text file:
dlmwrite('data.txt', y, 'delimiter','\n','newline', 'pc')
the vector stored like this :
picture 2
but I want it to be stored in this way : picture3
Every complex number should be stored on a new line, and the real part and the imaginary part should be separated by a comma. Any ideas, please?
lifay:
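This has a very simple fix: in your call to dlmwrite, replace the y input with [real(y.'),imag(y.')]. Use .' (plain transpose) rather than ', because ' also takes the complex conjugate and would flip the sign of the imaginary part.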
Here's my attempt:
N = 10;
y = zeros(1,N);
for n = 1:N
y(n) = exp(-1i*(n-1));
re = real(y(n));
im = imag(y(n));
fprintf('%6.10f,%6.10f\n',re,im)
end
filename = 'output';
dlmwrite(filename,[real(y).',imag(y).'],'precision',16)
Output to the above (granted I have a slightly different formula) on my end is:
The reason the output looks like your "Picture 2" is that y is a row vector (it was created with zeros(1,N)). dlmwrite's output follows the shape of the input array, in this case a row vector, so picture 2 is what you get. To get the output of picture 3, you must pass an N-by-2 matrix with the real parts in the first column and the imaginary parts in the second. I build it above by transposing two row vectors into column vectors and concatenating them. Note the .' operator: it is the plain transpose, whereas ' is the complex conjugate transpose. The difference matters here because imag(y') equals -imag(y).', so applying ' to the complex vector before taking the imaginary part flips the sign of every imaginary value. real and imag return arrays of the same shape as their input, so [real(y).',imag(y).'] and [real(y.'),imag(y.')] give the same matrix.
Also, don't forget to specify the 'precision' parameter. If it is left unspecified, dlmwrite writes only about five significant digits, and that rounding error will carry over into your further calculations in C.
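For comparison, the same "re,im" layout can also be produced with fopen and fprintf instead of dlmwrite. This is only a sketch: the output path comes from tempname and the test vector is made up, so substitute your own file name and data.

```matlab
N = 4;
y = exp(-1i*(0:N-1));          % made-up complex row vector
fname = [tempname '.txt'];     % hypothetical output path
fid = fopen(fname, 'w');
% fprintf walks through its argument column by column, so a 2-by-N matrix
% [real; imag] produces one "re,im" pair per output line.
fprintf(fid, '%.16g,%.16g\n', [real(y); imag(y)]);
fclose(fid);

% Read the file back to check the round trip.
back = dlmread(fname, ',');
delete(fname);
```

No transpose is needed here, which sidesteps the ' versus .' question entirely.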

Using Matlab to randomly split an Excel Sheet

I have an Excel sheet containing 1838 records and I need to RANDOMLY split these records into 3 Excel Sheets. I am trying to use Matlab but I am quite new to it and I have just managed the following code:
[xlsn, xlst, raw] = xlsread('data.xls');
numrows = 1838;
randindex = ceil(3*rand(numrows, 1));
raw1 = raw(:,randindex==1);
raw2 = raw(:,randindex==2);
raw3 = raw(:,randindex==3);
Your general procedure will be to read the spreadsheet into some matlab variables, operate on those matrices such that you end up with three thirds and then write each third back out.
So you've got the read covered with xlsread, which in your code returns the numeric data in xlsn and the text in xlst. Since only the third output is needed here, I would suggest using the syntax
[~, ~, raw] = xlsread('data.xls');
In the xlsread help file (you can access this by typing doc xlsread into the command window) it says that the three output arguments hold the numeric cells, the text cells and the whole lot. This is because a matlab matrix can only hold one type of value and a spreadsheet will usually be expected to have text or numbers. The raw value will hold all of the values but in a 'cell array' instead, a different kind of matlab data type.
So then you will have a cell array called raw. From here you want to do four things:
work out how many rows you have (I assume each record is a row) by using the size function and specifying the appropriate dimension (again check the help file to see how to do this)
create an index of random numbers between 1 and 3 inclusive, which you can use as a mask
randindex = ceil(3*rand(numrows, 1));
apply the mask to your cell array to extract the records matching each index
raw1 = raw(randindex==1,:); % records are rows, so the mask goes in the first dimension; do the same for the other two index values
write each cell back to a file
xlswrite('output1.xls', raw1);
You will probably have to fettle the arguments to get it to work the way you want but be sure to check the doc functionname page to get the syntax just right. Your main concern will be to get the indexing correct - matlab indexes row-first whereas spreadsheets tend to be column-first (e.g. cell A2 is column A and row 2, but matlab matrix element M(1,2) is the first row and the second column of matrix M, i.e. cell B1).
UPDATE: splitting the file evenly is surprisingly more trouble: because we're using random numbers for the index, it's not guaranteed to split evenly. So instead we can generate a vector of random floats and then pick out the lowest 33% of them to make index 1, the highest 33% to make index 3, and let the rest be 2.
randvec = rand(numrows, 1); % float between 0 and 1
pct33 = prctile(randvec,100/3); % value of 33rd percentile
pct67 = prctile(randvec,200/3); % value of 67th percentile
randindex = ones(numrows,1);
randindex(randvec>pct33) = 2;
randindex(randvec>pct67) = 3;
It probably still won't be absolutely even - 1838 isn't a multiple of 3. You can see how many members each group has this way
numel(find(randindex==1))
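If prctile is not available (it belongs to the Statistics Toolbox), a random permutation gives a split whose group sizes differ by at most one. This is a sketch of a swapped-in technique, not the method above; raw here is a dummy stand-in for the cell array returned by xlsread.

```matlab
numrows = 1838;                    % number of records, from the question
raw = num2cell((1:numrows)');      % dummy stand-in for the xlsread output
order = randperm(numrows);         % random ordering of the row numbers
group = zeros(numrows, 1);
% Deal the shuffled rows into groups 1, 2, 3, 1, 2, 3, ...
group(order) = mod(0:numrows-1, 3) + 1;
raw1 = raw(group == 1, :);
raw2 = raw(group == 2, :);
raw3 = raw(group == 3, :);
counts = [size(raw1, 1), size(raw2, 1), size(raw3, 1)];
```

Each of raw1, raw2 and raw3 can then be passed to xlswrite as before.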

read textfile in Matlab

I am quite stuck with my Matlab problem here.. I have a *.txt file that looks like this:
1
2
2
x50
2
2
2
x79
which means that at the coordinates (1,2,2) the f(x)-value is 50 and at coordinates (2,2,2) the f(x)-value is 79. I am trying to read this into Matlab so that I have a vector (or, using repmat, a meshgrid-like matrix) for x and one for y. I will not need z since it will not change over the process.
Also I want to read in the f(x)-value so I can plot the whole thing using surf().
If I use
[A] = textread('test.txt','%s')
it always gives me the whole thing... Can someone give me an idea please? I am thinking about putting the thing in a loop, something like this pseudocode
for i=1 to 50
xpos = read first line
ypos =read next line
zpos = read next line (or ignore.. however..)
functionvalue= read next line
end
Any hints? Thanks
Assuming the text file is laid out so that lines 1, 2 and 3 are the XYZ coordinates of a point and the fourth line is its function value, then lines 5, 6 and 7 are the next set of coordinates followed by its function value on line 8, and so on in that repeating format, see if this works for you -
% Read data from text file
text_data = textread(inputfile,'%s')
data = reshape(text_data,4,[])
% Get numeric data from it
data(end,:) = strrep(data(end,:),'x','')
% OR data(end,:) = arrayfun(@(n) data{end,n}(2:end),1:size(data,2),'Uni',0)
data_numeric = str2double(data)
% Separate XYZ and function values
xyz = data_numeric(1:3,:)' % Each row will hold a set of XYZ coordinates
f_value = data_numeric(end,:) % function values
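To see what the reshape step does, here is the same idea applied to the eight sample lines from the question, with the tokens typed in by hand instead of read from a file (the variable names are illustrative only):

```matlab
% Tokens as textread(..., '%s') would return them for the sample file.
tokens = {'1'; '2'; '2'; 'x50'; '2'; '2'; '2'; 'x79'};
data = reshape(tokens, 4, []);             % one column per record
data(4,:) = strrep(data(4,:), 'x', '');    % strip the leading 'x'
nums = str2double(data);                   % 4-by-2 numeric matrix
xyz = nums(1:3,:).';                       % one row per coordinate triple
f = nums(4,:).';                           % one function value per record
```

With this input, xyz holds the rows (1,2,2) and (2,2,2), and f holds 50 and 79.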
A bit more robust approach -
% Read data from text file
txtdata = textread(inputfile,'%s');
% ----------- Part I: Get XYZ ---------------------
% Find cell positions where the first character is a digit, indicating that
% these are the cells containing the coordinate points
digits_pos_ele = isstrprop(txtdata,'digit');
digits_pos_cell = arrayfun(@(x) digits_pos_ele{x}(1),1:numel(digits_pos_ele));
% Convert to numeric format and reshape to have each row holding each set
% of XYZ coordinates
xyz_vals = reshape(str2double(txtdata(digits_pos_cell)),3,[])';
% ----------- Part II: Get function values ---------------------
% Find the positions where a cell starts with `x`, indicating these are the
% function value cells
x_start_pos = arrayfun(@(n) strcmp(txtdata{n}(1),'x'),1:numel(txtdata));
% Collect all function value cells and find the function values
% themselves by excluding the first character from all those cells
f_cell = txtdata(x_start_pos);
f_vals = str2double(arrayfun(@(n) f_cell{n}(2:end), 1:numel(f_cell),'Uni',0))';
% Error checking
if size(xyz_vals,1)~=size(f_vals,1)
error('Woops, something is not right!')
end

Indexing must appear last in an index expression

I have a vector CD1 (120-by-1) and I separate CD1 into 6 parts. For example, the first part is extracted from row 1 to row 20 in CD1, and second part is extracted from row 21 to row 40 in CD1, etc. For each part, I need to compute the means of the absolute values of second differences of the data.
for PartNo = 1:6
% extract data
Y(PartNo) = CD1(1 + 20*(PartNo-1):20*(PartNo),:);
% find the second difference
Z(PartNo) = Y(PartNo)(3:end) - Y(PartNo)(1:end-2);
% mean of absolute value
MEAN_ABS_2ND_DIFF_RESULT(PartNo) = mean(abs(Z));
end
However, the commands above produce the error:
()-indexing must appear last in an index expression for Line:2
Any ideas to change the code to have it do what I want?
This error is often encountered when Y is a cell-array. For cell arrays,
Y{1}(1:3)
is legal. Curly braces ({}) mean data extraction, so this means you are extracting the array stored in location 1 in the cell array, and then referencing the elements 1 through 3 of that array.
The notation
Y(1)(1:3)
is different in that it does not extract data, but it references the cell's location 1. This means the first part (Y(1)) returns a cell-array which, in your case, contains a single array. So you won't have direct access to the regular array as before.
It is an infamous limitation in Matlab that you cannot do indirect or double-referencing, which is in effect what you are doing here.
Hence the error.
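A small illustration of the difference, using a made-up cell array:

```matlab
Y = {[10 20 30 40], [5 6 7]};   % made-up cell array holding two vectors
a = Y{1}(1:3);    % braces extract the vector, then () indexes into it
c = Y(1);         % parentheses return a 1-by-1 cell, not the vector
% When chained indexing is not allowed, go through a temporary variable.
tmp = Y{2};
b = tmp(2:end);
```

Here a is the numeric vector [10 20 30], while c is still a cell and would need c{1} to reach its contents.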
Now, to resolve: I suspect replacing a few parentheses with curly braces will do the trick:
Y{PartNo} = CD1(1+20*(PartNo-1):20*PartNo,:); % extract data
Z{PartNo} = Y{PartNo}(3:end)-Y{PartNo}(1:end-2); % find the second difference
MEAN_ABS_2ND_DIFF_RESULT{PartNo} = mean(abs(Z{PartNo})); % mean of absolute value
I might suggest a different approach
Y = reshape(CD1, 20, 6);
Z = Y(3:end,:) - Y(1:end-2,:);
MEAN_ABS_2ND_DIFF_RESULT = mean(abs(Z));
This is not a valid statement in matlab:
Y(PartNo)(3:end)
You should either make Y two-dimensional and use this indexing
Y(PartNo, 3:end)
or extract vector parts and use them directly, if you use a loop like you have shown
for PartNo = 1:6
% extract data
Y = CD1(1 + 20*(PartNo-1):20*(PartNo),:);
% find the second difference
Z = Y(3:end) - Y(1:end-2);
% mean of absolute value
MEAN_ABS_2ND_DIFF_RESULT(PartNo) = mean(abs(Z));
end
Also, since CD1 is a vector, you do not need to index the second dimension. Drop the :
Y = CD1(1 + 20*(PartNo-1):20*(PartNo));
Finally, you do not need a loop. You can reshape the CD1 vector to a two-dimensional array Y of size 20x6, in which the columns are your parts, and work directly on the resulting matrix:
Y = reshape(CD1, 20, 6);
Z = Y(3:end,:)-Y(1:end-2,:);
MEAN_ABS_2ND_DIFF_RESULT = mean(abs(Z));
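As a quick sanity check, the loop and the reshape version can be run side by side on made-up data and compared. The vector CD1 below is hypothetical, and both versions use the difference Y(3:end) - Y(1:end-2) as defined in the question.

```matlab
CD1 = (1:120)'.^2;                 % hypothetical 120-by-1 data vector

% Loop version: one 20-element part at a time.
loopResult = zeros(1, 6);
for PartNo = 1:6
    Y = CD1(1 + 20*(PartNo-1) : 20*PartNo);
    loopResult(PartNo) = mean(abs(Y(3:end) - Y(1:end-2)));
end

% Vectorised version: each column of Ymat is one part.
Ymat = reshape(CD1, 20, 6);
vecResult = mean(abs(Ymat(3:end,:) - Ymat(1:end-2,:)));
```

Both results are 1-by-6 row vectors with one mean per part.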

read text files containing binary data as a single matrix in matlab

I have a text file which contains binary data in the following manner:
00000000000000000000000000000000001011111111111111111111111111111111111111111111111111111111110000000000000000000000000000000
00000000000000000000000000000000000000011111111111111111111111111111111111111111111111000111100000000000000000000000000000000
00000000000000000000000000000000000011111111111111111111111111111111111111111111111111111111100000000000000000000000000000000
00000000000000000000000000000000000111111111111111111111111111111111111111111111111111111111100000000000000000000000000000000
00000000000000000000000000000000000011111111111111111111111111111111111111111111111111111111100000000000000000000000000000000
00000000000000000000000000000000000000011111111111111111111111111111111111111111111111111111100000000000000000000000000000000
00000000000000000000000000000000000000011111111111111111111111111111111111111111111111000111110000000000000000000000000000000
00000000000000000000000000000000000000111111111111111111111111111111111111111111111111111111110000000000000000000000000000000
00000000000000000000000000000000000000000000111111111111111111111111111111111111110000000011100000000000000000000000000000000
00000000000000000000000000000000000000011111111111111111111111111111111111111111111111100111110000000000000000000000000000000
00000000000000000000000000000000000111111111111111111111111111111111111111111111111111110111110000000000000000000000000000000
00000000000000000000000000000000001111111111111111111111111111111111111111111111111111111111100000000000000000000000000000000
00000000000000000000000000000000000000001111111111111111111111111111111111111111111111000011100000000000000000000000000000000
00000000000000000000000000000000000000001111111111111111111111111111111111111111111111000011100000000000000000000000000000000
00000000000000000000000000000000000001111111111111111111111111111111111111111111111111111111000000000000000000000000000000000
00000000000000000000000000000000000000011111111111111111111111111111111111111111111110000011100000000000000000000000000000000
00000000000000000000000000000000000000000000011111111111111111111111111111111111100000000011100000000000000000000000000000000
00000000000000000000000000000000000000111111111111111111111111111111111111111111111111110111100000000000000000000000000000000
Please note that each 1 or 0 is independent i.e the values are not decimal. I need to find the column wise sum of the file. There are 125 columns in all and there are 840946 rows.
I have tried textread, fscanf and a few other matlab commands, but the result is that they all read each row in decimal format and create a 840946x1 array. I want to create a 840946x125 matrix to compute a column wise sum.
You can use textread to do it. Just read strings and later process them with sscanf, one digit at a time
A = textread('data.txt', '%s');
ncols = size(A, 1);
nrows = size(A{1}, 2);
A = reshape(sscanf([A{:}], '%1d'), nrows, ncols);
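Note that A is now transposed relative to the file: it has 125 rows, one per column of the file, and one column per line of the file.
The column-wise sum of the file is therefore the sum along the second dimension of A:
colsum = sum(A, 2);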
Here's a slightly hack-ish approach:
A = textread('data.txt', '%s');
colsum = sum(cat(1,A{:})-'0')
Breakdown:
textread will read each line of 0's and 1's as a single string. A will therefore be a cell-string, with each element equal to a string of length 125.
cat(1,A{:}) will concatenate the cell string into a "normal" Matlab character array of size 840946-by-125.
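Subtracting the character '0' (ASCII value 48) from a character array consisting of 0's and 1's returns their numeric values, because MATLAB converts characters to their character codes in arithmetic. For example, '1'-'0' = 49-48 = 1, just as 'a'-0 = 97, the ASCII value of lower-case 'a'.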
sum will finally sum over the columns of this array.
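Both approaches can be tried on a tiny made-up input before running them on the full file. The three strings below stand in for what textread would return; everything else follows the answers above.

```matlab
% Made-up stand-in for the cell array returned by textread(..., '%s').
txt = {'00101'; '01111'; '00110'};

% Character arithmetic: stack the strings, then subtract '0'.
bits = cat(1, txt{:}) - '0';       % 3-by-5 numeric matrix of 0s and 1s
colsum = sum(bits, 1);             % one sum per column of the file

% sscanf variant: A is transposed (5-by-3), so sum along dimension 2.
A = reshape(sscanf([txt{:}], '%1d'), 5, 3);
colsum2 = sum(A, 2).';
```

The two results agree, which confirms that the sscanf version needs sum(A, 2) rather than sum(A).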