Exporting blank values into a .txt file - MATLAB - matlab

I'm currently trying to export multiple matrices of unequal lengths into a delimited .txt file thus I have been padding the shorter matrices with 0's such that dlmwrite can use horzcat without error:
dlmwrite(filename{1},[a,b],'delimiter','\t')
However ideally I do not want the zeroes to appear in the .txt file itself - but rather the entries are left blank.
Currently the .txt file looks like this:
55875 3.1043e+05
56807 3.3361e+05
57760 3.8235e+05
58823 4.2869e+05
59913 4.3349e+05
60887 0
61825 0
62785 0
63942 0
65159 0
66304 0
67509 0
68683 0
69736 0
70782 0
But I want it to look like this:
55875 3.1043e+05
56807 3.3361e+05
57760 3.8235e+05
58823 4.2869e+05
59913 4.3349e+05
60887
61825
62785
63942
65159
66304
67509
68683
69736
70782
Is there anyway I can do this? Is there an alternative to dlmwrite which will mean I do not need to have matrices of equal lengths?

If a is always longer than b you could split vector a into two vectors of same length as vector b and the rest:
a = [1 2 3 4 5 6 7 8]';
b = [9 8 7 ]';
len = numel(b);
dlmwrite( 'foobar.txt', [a(1:len), b ], 'delimiter', '\t' );
dlmwrite( 'foobar.txt', a(len+1:end), 'delimiter', '\t', '-append');

You can read in the numeric data and convert to string and then add proper whitespaces to have the final output as string based cell array, which you can easily write into the output text file.
Stage 1: Get the cell of strings corresponding to the numeric data from column vector inputs a, b, c and so on -
%// Concatenate all arrays into a cell array with numeric data
A = [{a} {b} {c}] %// Edit this to add more columns
%// Create a "regular" 2D shaped cell array to store the cells from A
lens = cellfun('length',A)
max_lens = max(lens)
A_reg = cell(max_lens,numel(lens))
A_reg(:) = {''}
A_reg(bsxfun(#le,[1:max_lens]',lens)) = cellstr(num2str(vertcat(A{:}))) %//'
%// Create a char array that has string data from input arrays as strings
wsp = repmat({' '},max_lens,1) %// Create whitespace cell array
out_char = [];
for iter = 1:numel(A)
out_char = [out_char char(A_reg(:,iter)) char(wsp)]
end
out_cell = cellstr(out_char)
Stage 2: Now, that you have out_cell as the cell array that has the strings to be written to the text file, you have two options next for the writing operation itself.
Option 1 -
dlmwrite('results.txt',out_cell(:),'delimiter','')
Option 2 -
outfile = 'results.txt';
fid = fopen(outfile,'w');
for row = 1:numel(out_cell)
fprintf(fid,'%s\n',out_cell{row});
end
fclose(fid);

Related

Matlab from text file to sparse matrix.

I have a huge text file in the following format:
1 2
1 3
1 10
1 11
1 20
1 376
1 665255
2 4
2 126
2 134
2 242
2 247
First column is the x coordinate while second column is the y coordinate.
It indicates that if I had to construct a Matrix
M = zeros(N, N);
M(1, 2) = 1;
M(1, 3) = 1;
.
.
M(2, 247) = 1;
This text file is huge and can't be brought to main memory at once. I must read it line by line. And save it in a sparse matrix.
So I need the following function:
function mat = generate( path )
fid = fopen(path);
tline = fgetl(fid);
% initialize an empty sparse matrix. (I know I assigned Mat(1, 1) = 1)
mat = sparse(1);
while ischar(tline)
tline = fgetl(fid);
if ischar(tline)
C = strsplit(tline);
end
mat(C{1}, C{2}) = 1;
end
fclose(fid);
end
But unfortunately besides the first row it just puts trash in my sparse mat.
Demo:
1 7
1 9
2 4
2 9
If I print the sparse mat I get:
(1,1) 1
(50,52) 1
(49,57) 1
(50,57) 1
Any suggestions ?
Fixing what you have...
Your problem is that C is a cell array of characters, not numbers. You need to convert the strings you read from the file into integer values. Instead of strsplit you can use functions like str2num and str2double. Since tline is a space-delimited character array of integers in this case, str2num is the easiest to use to compute C:
C = str2num(tline);
Then you just index C like an array instead of a cell array:
mat(C(1), C(2)) = 1;
Extra tidbit: If you were wondering how your demo code still worked even though C contained characters, it's because MATLAB has a tendency to automatically convert variables to the correct type for certain operations. In this case, the characters were converted to their double ASCII code equivalents: '1' became 49, '2' became 50, etc. Then it used these as indices into mat.
A simpler alternative...
You don't even have to bother with all that mess above, since you can replace your entire function with a much simpler approach using dlmread and sparse like so:
data = dlmread(filePath);
mat = sparse(data(:, 1), data(:, 2), 1);
clear data; % Save yourself some memory if you don't need it any more

Create a matrix of integers given a set of strings and a dictionary, in matlab

I have a set of strings (as a Nx1 matrix) and a dictionary(Mx1) that contains all the words that are there in the strings.
eg:
strings= ['i went to the mall'; 'i am hungry']
dictionary = ['i','went','to','the', 'mall','am','hungry']
I want to create a matrix (of size MxN) where the cell contains 1 if the corresponding word is present in the corresponding tweet. How could I do it in matlab?
I tried doing this:
for i=1:len_of_dict
for j=1:len_of_str
temp=strfind(string1(j),dict(i));
x=find(cellfun(#isempty,temp));
xx = isempty(x);
if(xx~=0)
vec(i,j)=1;
end
end
end
But the vector that i got is not correct. Pleas help!
String sould be stored as cell arrays. then you can use strsplit to break the tweets then use ismember for checking presence of each word in tweets.
strings= {'i went to the mall'; 'i am hungry'};
dictionary = {'i','went','to','the', 'mall','am','hungry'};
splitted_strings = cellfun(#(x) strsplit(x,' '), strings, 'UniformOutput', false);
presence = cellfun(#(s) ismember(dictionary, s), splitted_strings, 'UniformOutput', false);
You can use vertcat to convert result to a matrix:
out = vertcat(presence {:})
result:
1 1 1 1 1 0 0
1 0 0 0 0 1 1

Read multiple data from a MATLAB file

I am currently trying to read data from a text file written exactly like this:
Height = 10
Length = 10
NodeX = 11
NodeY = 11
K = 10
I've written a small code like this
fileID = fopen('input.dat','r');
[a, b] = fscanf(fileID, '%s %f')
And I get the following answer:
a =
72
101
105
103
104
116
b =
1
It seems quite obvious I am not mananging to specify the format specification.
I would like to know how to pick a string along with a float multiple times in the same file.
As the documentation for fscanf states:
If formatSpec contains a combination of numeric and character
specifiers, then fscanf converts each character to its numeric
equivalent. This conversion occurs even when the format explicitly
skips all numeric values (for example, formatSpec is '%*d %s').
MATLAB can be annoyingly bad at reading mixed data types. One possible alternative is to read each line and split up your data using a simple regular expression:
fileID = fopen('results.txt','r');
mydata = {};
ii = 1;
while ~feof(fileID) % While we're not at the end of the file
tline = fgetl(fileID); % Get next line
mydata(ii,:) = regexp(tline, '([a-zA-Z])* = (\d*)', 'tokens');
ii = ii + 1;
end
fclose(fileID);
This returns a 5 x 1 cell array where each cell contains 2 cells (slightly annoying, but you can pull them out) that match your data. In this case, mydata{1}{1} is Height and mydata{1}{2} is 10.
Edit:
And you can flatten your cell array with a reshape call:
mydata = reshape([mydata{:}], 2, [])';
Which turns mydata in this case into a 5x2 cell array.
The fscanf function is a low-level I/O function and is often not the best choice for such rather high-level file input. One alternative would be to use the textscan function, which allows quite advanced format specifications:
fileID = fopen('input.dat','r');
C = textscan(fileID,'%s = %d')
which creates a 1x2 cell array. The first cell C{1} contains another 5x1 cell, where each field contains the name of the field, e.g. 'Height'. The second cell C{2} contains a 5x1 vector containing all integer values from the file.

Importing text file into matrix form with indexes as strings?

I'm new to Matlab so bear with me. I have a text file in this form :
b0002 b0003 999
b0002 b0004 999
b0002 b0261 800
I need to read this file and convert it into a matrix. The first and second column in the text file are analogous to row and column of a matrix(the indices). I have another text file with a list of all values of 'indices'. So it should be possible to create an empty matrix beforehand.
b0002
b0003
b0004
b0005
b0006
b0007
b0008
Is there anyway to access matrix elements using custom string indices(I doubt it but just wondering)? If not, I'm guessing the only way to do this is to assign the first row and first column the index string values and then assign the third column values based on the first text file. Can anyone help me with that?
You can easily convert those strings to numbers and then use those as indices. For a given string, b0002:
s = 'b0002'
str2num(s(2:end); % output = 2
Furthermore, you can also do this with a char matrix:
t = ['b0002';'b0003';'b0004']
t =
b0002
b0003
b0004
str2num(t(:,2:end))
ans =
2
3
4
First, we use textscan to read the data in as two strings and a float (could use other numerical formats. We have to open the file for reading first.
fid = fopen('myfile.txt');
A = textscan(fid,'%s%s%f');
textscan returns a cell array, so we have to extract your three variables. x and y are converted to single char arrays using cell2mat (works only if all the strings inside are the same length), n is a list of numbers.
x = cell2mat(A{1});
y = cell2mat(A{2});
n = A{3};
We can now convert x and y to numbers by telling it to take every row : but only the second to final part of the row 2:end, e.g 002, 003 , not b002, b003.
x = str2num(x(:,2:end));
y = str2num(y(:,2:end));
Slight problem with indexing - if I have a matrix A and I do this:
A = magic(8);
A([1,5],[3,8])
Then it returns four elements - [1,3],[5,3],[1,8],[5,8] - not two. But what you want is the location in your matrix equivalent to x(1),y(1) to be set to n(1) and so on. To do this, we need to 1) work out the final size of matrix. 2) use sub2ind to calculate the right locations.
% find the size
% if you have a specified size you want the output to be use that instead
xsize = max(x);
ysize = max(y);
% initialise the output matrix - not always necessary but good practice
out = zeros(xsize,ysize);
% turn our x,y into linear indices
ind = sub2ind([xsize,ysize],x,y);
% put our numbers in our output matrix
out(ind) = n;

Combine Columns of Cell Array Matlab

I have a 2D cell array with dynamic row sizes and column sizes. One example:
cell3 = [{'abe' 'basdasd' 'ceee'}; {'d' 'eass' 'feeeeeeeeee'}]
Gives: (with dimension 2 by 3)
'abe' 'basdasd' 'ceee'
'd' 'eass' 'feeeeeeeeee'
I want to be able to combine the columns and reproduce a aggregate string cell array where each string is separated by a single white space. Any idea how to do that?
So the output i am looking for is:
'abe basdasd ceee'
'd eass feeeeeeeeee'
The final dimension is 2 by 1.
Is this possible?
Apply strjoin either in a loop or via cellfun. The latter:
>> cellRows = mat2cell(cell3,ones(size(cell3,1),1),size(cell3,2));
>> out = cellfun(#strjoin,cellRows,'uni',0)
out =
'abe basdasd ceee'
'd eass feeeeeeeeee'
Solution without loops or cellfun:
[m n] = size(cell3);
cellCols = mat2cell(cell3,m,ones(1,n)); %// group by columns
cellColsSpace(1:2:2*size(cell3,2)-1) = cellCols; %// make room (columns)...
cellColsSpace(2:2:2*size(cell3,2)-1) = {{' '}}; %// ...to introduce spaces
%// Note that a cell within cell is needed, because of how strcat works.
result = strcat(cellColsSpace{:});