How to convert Matlab variables to .dat (text) file with headers - matlab

EDITED QUESTION:
I have 2500 rows x 100 columns of data in a variable named avg_data_models. I also have a 2500 x 100 variable 'X' and a matrix 'Y' of the same size, both containing the coordinates. I want to save the values of these variables in a text (.dat) file, which must have 302 header lines in the following format:
avg_data_models
300
X_1
X_2
.
.
.
X_100
Y_1
Y_2
.
.
.
Y_100
avg_data_models_1
avg_data_models_2
avg_data_models_3
.
.
.
.
.
avg_data_models_100
In the above header style, the first line is the name of the file, the 2nd line tells the number of columns (each column has 2500 rows), and the rest of the 300 lines represent the model of each variable respectively - Like 100 models of X, 100 models of Y and 100 models of avg_data_models.

Consider this code:
%# here you have your data X/Y/..
%#X = rand(2500,100);
[r c] = size(X);
prefixX = 'X';
prefixY = 'Y';
prefixData = 'avg_data_models';
%# build a cell array that contains all the header lines
num = strtrim( cellstr(num2str((1:c)','_%d')) );
headers = [ prefixData ;
num2str(3*c) ;
strcat(prefixX,num) ;
strcat(prefixY,num) ;
strcat(prefixData,num) ];
%# write to file
fid = fopen('outputFile.dat', 'wt');
fprintf(fid, '%s\n',headers{:});
fclose(fid);
EDIT
It seems I misunderstood the question. Here's the code to write the actual data (not just the header titles):
%# here you have your data X/Y/..
avg_data_models = rand(2500,100);
X = rand(2500,100);
Y = rand(2500,100);
%# create file, and write the title and number of columns
fid = fopen('outputFile.dat', 'wt');
fprintf(fid, '%s\n%d\n', 'avg_data_models', 3*size(X,2));
fclose(fid);
%# append rest of data
dlmwrite('outputFile.dat', [X Y avg_data_models], '-append', 'delimiter',',')
Note: I used a comma as the delimiter; you can change it to a space or a tab (\t) if you like.

You can use fprintf to write the header, like so:
%# define the number of data
nModels = 100;
dataName = 'avg_data_models';
%# open the file
fid = fopen('output.dat','w');
%# start writing. First line: title
fprintf(fid,'%s\n',dataName); %# don't forget \n for newline. Use \r\n if you want to open this in Notepad
%# write number of models
fprintf(fid,'%i\n',nModels);
%# loop to write the rest of the header
for iModel = 1:nModels
fprintf(fid,'%s_%i\n',dataName,iModel);
end
%# use your favorite method to write the rest of the data.
%# for example, you could use fprintf again, using \t to add tabs
%# create format-string
%# check the help to fprintf to learn about formatting details
formatString = repmat('%f\t',1,100);
formatString = [formatString(1:end-1),'n']; %# replace last tab with newline
%# transpose the array, because fprintf reshapes the array to a vector and
%# 'fills' the format-strings sequentially until it runs out of data
fprintf(fid,formatString,avg_data_models.');
%# close the file
fclose(fid);
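Putting the two answers together, here is a minimal sketch that writes the full header (title, column count, then one name per column for X, Y and avg_data_models) followed by the data in a single pass. The small matrix sizes and the file name outputFile_demo.dat are assumptions for illustration only; substitute the real 2500x100 variables. Note that '%g' keeps only about six significant digits, so use a wider format such as '%.10g' if more precision is needed.

```matlab
% Stand-in data (assumed sizes); replace with the real 2500x100 matrices.
nRows = 4; nCols = 3;
X = rand(nRows, nCols);
Y = rand(nRows, nCols);
avg_data_models = rand(nRows, nCols);

% Build the header: title, number of columns, then one name per column.
idx = arrayfun(@(k) sprintf('_%d', k), (1:nCols)', 'UniformOutput', false);
headers = [{'avg_data_models'}; {num2str(3*nCols)}; ...
           strcat('X', idx); strcat('Y', idx); strcat('avg_data_models', idx)];

outFile = 'outputFile_demo.dat';   % hypothetical file name
fid = fopen(outFile, 'wt');
assert(fid ~= -1, 'Could not open %s for writing', outFile);
fprintf(fid, '%s\n', headers{:});

% fprintf consumes its input column by column, so transpose the matrix
% to get one line of text per row of data.
M = [X Y avg_data_models];
fmt = [repmat('%g\t', 1, size(M,2)-1) '%g\n'];
fprintf(fid, fmt, M.');
fclose(fid);
```

With the real data, the file would start with 302 header lines followed by 2500 tab-separated rows of 300 values each.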

Related

Matlab - string containing a number and equal sign

I have a data file that contains parameter names and values with an equal sign in between them. It's like this:
A = 1234
B = 1353.335
C =
D = 1
There is always one space before and after the equal sign. The problem is some variables don't have values assigned to them like "C" above and I need to weed them out.
I want to read the data file (text) into a cell and just remove the lines with those invalid statements or just create a new data file without them.
Whichever is easier, but I will eventually read the file into a cell with textscan command.
The values (numbers) will be treated as double precision.
Please, help.
Thank you,
Eric
Try this:
fid = fopen('file.txt'); %// open file
x = textscan(fid, '%s', 'delimiter', '\n'); %// or '\r'. Read each line into a cell
fclose(fid); %// close file
x = x{1}; %// each cell of x contains a line of the file
ind = ~cellfun(@isempty, regexp(x, '=\s[\d\.]+$')); %// desired lines: space, numbers, end
x = x(ind); %// keep only those lines
If you just want the values and want to reject lines that have nothing after the equal sign, this might work (data.txt is a text file generated from the example data you gave):
fid = fopen('data.txt');
tline = fgets(fid);
while ischar(tline)
tmp = cell2mat(regexp(tline,'\=(.*)','match'));
b=str2double(tmp(2:end));
if ~isnan(b)
disp(b)
end
tline = fgets(fid);
end
fclose(fid);
I am reading the txt file line by line, using a regular expression to strip the unneeded characters, and then converting the remaining value to double.
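If both the parameter names and the values are needed, one option is to capture them with regexp tokens and discard the lines that do not match. This is only a sketch: the cell array of lines is a stand-in for the file contents (for example, as read by the textscan call in the first answer), and the pattern assumes exactly one space on each side of the equal sign, as described in the question.

```matlab
% Stand-in for the lines read from the file.
lines = {'A = 1234'; 'B = 1353.335'; 'C ='; 'D = 1'};

% Capture "name = value"; lines with no value after the '=' do not match.
tok  = regexp(lines, '^(\w+) = (\S+)\s*$', 'tokens', 'once');
keep = ~cellfun(@isempty, tok);
tok  = tok(keep);

names  = cellfun(@(t) t{1}, tok, 'UniformOutput', false);   % cell array of names
values = cellfun(@(t) str2double(t{2}), tok);               % double-precision values
```

The logical vector keep can also be used to write the valid lines back out to a new file, e.g. with fprintf(fid, '%s\n', lines{keep}).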

Reading data from a Text File into Matlab array

I am having difficulty in reading data from a .txt file using Matlab.
I have to create a 200x128 dimension array in Matlab, using the data from the .txt file. This is a repetitive task, and needs automation.
Each row of the .txt file is a complex number of form a+ib, which is of form a[space]b. A sample of my text file :
(0)
1.2 2.32222
2.12 3.113
.
.
.
3.2 2.22
(1)
4.4 3.4444
2.33 2.11
2.3 33.3
.
.
.
(2)
.
.
(3)
.
.
(199)
.
.
The row numbers (X) appear in the .txt file in parentheses. My final matrix should be of size 200x128. After each (X), there are exactly 128 complex numbers.
Here is what I would do. First, delete the "(0)"-type lines from your text file (a simple shell script would do). I saved the result in a file called post2.txt.
% First, load the text file into Matlab:
A = load('post2.txt');
% Create the complex numbers from the two columns of data:
vals = A(:,1) + 1i*A(:,2);
% Then reshape the column of complex numbers into a matrix. reshape fills
% column by column, so build 128x200 and transpose (.' does not conjugate):
mat = reshape(vals, [128,200]).';
The mat will be a 200x128 matrix of complex data. At this point you can put a loop around this to process multiple files.
Hope that helps.
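A note on the reshape step: MATLAB's reshape fills the output column by column, so when the values of each output row are stored consecutively (as they are here, 128 values per block), the usual pattern is to reshape to columns-by-rows and then transpose. A tiny illustration with made-up numbers, two blocks of three values each:

```matlab
vals = (1:6).';                    % block 1 is 1,2,3; block 2 is 4,5,6

wrong = reshape(vals, [2,3]);      % fills columns first: [1 3 5; 2 4 6]
right = reshape(vals, [3,2]).';    % one block per row:   [1 2 3; 4 5 6]
```

The non-conjugating transpose .' matters for complex data, because a plain ' would also flip the sign of the imaginary parts.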
You can read the data in using the following function:
function data = readData(aFilename, m,n)
% if no parameters were passed, use these as defaults:
if ~exist('aFilename', 'var')
m = 128;
n = 200;
aFilename = 'post.txt';
end
% init some stuff:
data= nan(n, m);
formatStr = [repmat('%f', 1, 2*m)];
% Read in the Data:
fid = fopen(aFilename);
for ind = 1:n
lineID = fgetl(fid);
dataLine = fscanf(fid, formatStr);
dataLineComplex = dataLine(1:2:end) + dataLine(2:2:end)*1i;
data(ind, :) = dataLineComplex;
end
fclose(fid);
(edit) This function can be improved by including the (1) parts in the format string and throwing them out:
function data = readData(aFilename, m,n)
% if no parameters were passed, use these as defaults:
if ~exist('aFilename', 'var')
m = 128;
n = 200;
aFilename = 'post.txt';
end
% init format stuff:
formatStr = ['(%*d)\n' repmat('%f%f\n', 1, m)];
% Read in the Data:
fid = fopen(aFilename);
data = fscanf(fid, formatStr);
data = data(1:2:end) + data(2:2:end)*1i;
data = reshape(data, m, n).'; % reshape fills columns first, so transpose to get an n-by-m matrix
fclose(fid);
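Another way to skip the "(X)" marker lines, offered as a sketch rather than a tested drop-in, is to let textscan treat everything from an opening parenthesis to the end of the line as a comment via its 'CommentStyle' option. The sample text and the small sizes m and n below are assumptions for illustration; with a real file, pass the file identifier from fopen instead of the character vector.

```matlab
% Stand-in for the file contents: 2 blocks of 2 complex values each.
txt = sprintf('(0)\n1.2 2.5\n3 4\n(1)\n5 6\n7 8\n');
m = 2;   % values per block (128 in the question)
n = 2;   % number of blocks (200 in the question)

% Anything from '(' to the end of a line is ignored.
C = textscan(txt, '%f %f', 'CommentStyle', '(');
z = complex(C{1}, C{2});

% One block per row: reshape column-wise, then transpose without conjugating.
data = reshape(z, m, n).';
```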

read a txt file to matrix and cellarray Matlab

I have a txt file with those entries and I would like to know how to get the numerical values from the second column until the last column in a matrix and the first column in a cell array.
I've tried importdata and fscanf, and I don't understand what's going on.
CP6 7,2 -2,7 6,6
P5 -5,8 -5,9 5,8
P6 5,8 -5,9 5,8
AF7 -5,0 7,2 3,6
AF8 5,0 7,2 3,6
FT7 -7,6 2,8 3,6
This should give you what you want based on the text sample you supplied.
fileID = fopen('x.txt'); %open file x.txt
m=textscan(fileID,'%s %d ,%d %d ,%d %d ,%d');
fclose(fileID); %close file
col1 = m{1,1}; %get first column into cell array col1
colRest = cell2mat(m(1,2:end)); %convert all remaining (numeric) columns into matrix colRest
Look up textscan for more info on reading specially formatted data.
This function should do the trick. It reads your file and scans it according to your pattern. Then, put the first column in a cell array and the others in a matrix.
function [ C1,A ] = scan_your_txt_file( filename )
fid = fopen(filename,'rt');
C = textscan(fid, '%s %d,%d %d,%d %d,%d');
fclose(fid);
C1 = C{1};
A = cell2mat(C(2:size(C,2)));
end
Have you tried xlsread? It makes a numeric array and two non-numeric arrays.
[N,T,R]=xlsread('yourfilename.txt')
but your data is not comma delimited. It also looks like you are using a comma to represent a decimal point. Does this array have 7 columns or 4? Because I'm in the US, I'm going to assume you have paired coordinates and the comma is one kind of delimiter while the space is a second one.
So here is something kludgy. It is a gross, ugly hack, but it works.
%housekeeping
clc
%get name of raw file
d=dir('*22202740*.txt');
%translate from comma-as-decimal to period-as-decimal
fid = fopen(d(1).name,'r'); %source
fid2= fopen('myout.txt','w+'); %sink
while 1
tline = fgetl(fid); %read
if ~ischar(tline), break, end %end loop
fprintf(fid2,'%s\r\n',strrep(tline,',','.')); %write updated line to output
end
fclose(fid);
fclose(fid2);
%open, gulp, parse/store, close
fid3 = fopen('myout.txt','r');
C=textscan(fid3,'%s %f %f %f ');
fclose(fid3);
%measure waist size and height
[n,m]=size(C);
n=length(C{1});
%put in slightly more friendly form
temp=zeros(n,m);
for i=2:m
t0=C{i};
temp(:,i)=t0;
end
%write to excel
xlswrite('myout_22202740.xlsx',temp(:,2:end),['b1:' char(96+m) num2str(n)]);
xlswrite('myout_22202740.xlsx',C{1},['a1:a' num2str(n)])
%read from excel
[N,T,R]=xlsread('myout_22202740.xlsx')
If you want those commas to be decimal points, then that is a different question.
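If the commas are in fact decimal separators, the conversion can be done in memory without an intermediate file. The sketch below assumes that reading (three decimal-comma numbers per row); the sample text is a stand-in for the result of fileread on the real file.

```matlab
% Stand-in for: raw = fileread('x.txt');
raw = sprintf('CP6 7,2 -2,7 6,6\nP5 -5,8 -5,9 5,8\n');

raw = strrep(raw, ',', '.');            % decimal comma -> decimal point
C   = textscan(raw, '%s %f %f %f');     % textscan also accepts a character vector

labels = C{1};        % first column as a cell array of names
coords = [C{2:4}];    % remaining columns as a numeric matrix
```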

Text Scanning to read in unknown number of variables and unknown number of runs

I am trying to read in a csv file which will have the format
Var1 Val1A Val1B ... Val1Q
Var2 Val2A Val2B ... Val2Q
...
And I will not know ahead of time how many variables (rows) or how many runs (columns) will be in the file.
I have been trying to get textscan to work, but no matter what I try I cannot get either all the variable names isolated or a rows-by-columns cell array. This is what I've been trying:
fID = fopen(strcat(pwd,'/',inputFile),'rt');
if fID == -1
disp('Could not find file')
return
end
vars = textscan(fID, '%s,%*s','delimiter','\n');
fclose(fID);
Does anyone have a suggestion?
If the file has the same number of columns in each row (you just don't know how many to begin with), try the following.
First, figure out how many columns by parsing just the first row and find the number of columns, then parse the full file:
% Open the file, get the first line
fid = fopen('myfile.txt');
firstLine = fgetl(fid);
fclose(fid);
tmp = textscan(firstLine, '%s');
% textscan returns a 1x1 cell; the number of tokens inside it is the number of columns
n = numel(tmp{1});
% Now scan the file
fid = fopen('myfile.txt');
tmp = textscan(fid, repmat('%s ', [1, n]));
fclose(fid);
For any given file, are all the lines equal length? If they are, you could start by reading in the first line and use that to count the number of fields and then use textscan to read in the file.
fID = fopen(strcat(pwd,'/',inputFile),'rt');
firstLine = fgetl(fID);
numFields = length(strfind(firstLine,' ')) + 1;
fclose(fID);
formatString = repmat('%s',1,numFields);
fID = fopen(strcat(pwd,'/',inputFile),'rt');
vars = textscan(fID, formatString, 'Delimiter', ' ');
fclose(fID);
Now you will have a cell array whose first entry holds the variable names and whose other entries hold the observations.
In this case I assumed the delimiter was space even though you said it was a csv file. If it is really commas, you can change the code accordingly.
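The two-pass idea (count the fields on the first line, then build the format string) can be sketched as follows. It assumes whitespace-separated fields, numeric values, and the same number of fields on every line; the sample text stands in for the file contents.

```matlab
% Stand-in for: txt = fileread(inputFile);
txt = sprintf('Var1 1 2 3\nVar2 4 5 6\n');

% Pass 1: count the fields on the first line.
firstLine = strtok(txt, newline);
nCols = numel(strsplit(strtrim(firstLine)));   % splits on whitespace by default

% Pass 2: one %s for the name, then one %f per run.
fmt = ['%s' repmat(' %f', 1, nCols-1)];
C = textscan(txt, fmt);

names = C{1};         % variable names, one per row
vals  = [C{2:end}];   % rows-by-runs numeric matrix
```

For a genuinely comma-separated file, the same approach should work with strsplit(firstLine, ',') and 'Delimiter', ',' in the textscan call.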

MATLAB: Convert comma separated single cell to multiple cell array whilst maintaining UTF-8 encoding using textscan

From the beginning.
I have data in a csv file like:
La Loi des rues,/m/0gw3lmk,/m/0gw1pvm
L'Étudiante,/m/0j9vjq5,/m/0h6hft_
The Kid From Borneo,/m/04lrdnn,/m/04lrdnt,/m/04lrdn5,/m/04lrdnh,/m/04lrdnb
etc.
This is in UTF-8 format. I import this file as follows (taken from somewhere else):
feature('DefaultCharacterSet','UTF-8');
fid = fopen(filename,'rt'); %# Open the file
lineArray = cell(100,1); %# Preallocate a cell array (ideally slightly
%# larger than is needed)
lineIndex = 1; %# Index of cell to place the next line in
nextLine = fgetl(fid); %# Read the first line from the file
while ~isequal(nextLine,-1) %# Loop while not at the end of the file
lineArray{lineIndex} = nextLine; %# Add the line to the cell array
lineIndex = lineIndex+1; %# Increment the line index
nextLine = fgetl(fid); %# Read the next line from the file
end
fclose(fid); %# Close the file
This makes an array with the UTF-8 text within it. {3x1} array:
'La Loi des rues,/m/0gw3lmk,/m/0gw1pvm'
'L''Étudiante,/m/0j9vjq5,/m/0h6hft_'
'The Kid From Borneo,/m/04lrdnn,/m/04lrdnt,/m/04lrdn5,/m/04lrdnh,/m/04lrdnb'
Now the next part separates each value into an array:
lineArray = lineArray(1:lineIndex-1); %# Remove empty cells, if needed
for iLine = 1:lineIndex-1 %# Loop over lines
lineData = textscan(lineArray{iLine},'%s',... %# Read strings
'Delimiter',',');
lineData = lineData{1}; %# Remove cell encapsulation
if strcmp(lineArray{iLine}(end),',') %# Account for when the line
lineData{end+1} = ''; %# ends with a delimiter
end
lineArray(iLine,1:numel(lineData)) = lineData; %# Overwrite line data
end
This outputs:
'La Loi des rues' '/m/0gw3lmk' '/m/0gw1pvm' [] [] []
'L''�tudiante' '/m/0j9vjq5' '/m/0h6hft_' [] [] []
'The Kid From Borneo' '/m/04lrdnn' '/m/04lrdnt' '/m/04lrdn5' '/m/04lrdnh' '/m/04lrdnb'
The problem is that the UTF-8 encoding is lost on the textscan (note the question mark I now get whereas it was fine in the previous array).
Question: How do I maintain the UTF-8 encoding when translating the {3x1} array into a 3xN array?
I can't find anything on how to keep UTF-8 encoding in a textscan of an array already in the workspace. Everything is to do with importing a text file which I have no problems with - it is the second step.
Thanks!
Try the following code:
%# read whole file as a UTF-8 string
fid = fopen('utf8.csv', 'rb');
b = fread(fid, '*uint8')';
str = native2unicode(b, 'UTF-8');
fclose(fid);
%# split into lines
lines = textscan(str, '%s', 'Delimiter','', 'Whitespace','\n');
lines = lines{1};
%# split each line into values
C = cell(numel(lines),6);
for i=1:numel(lines)
vals = textscan(lines{i}, '%s', 'Delimiter',',');
vals = vals{1};
C(i,1:numel(vals)) = vals;
end
The result:
>> C
C =
'La Loi des rues' '/m/0gw3lmk' '/m/0gw1pvm' [] [] []
'L'Étudiante' '/m/0j9vjq5' '/m/0h6hft_' [] [] []
'The Kid From Borneo' '/m/04lrdnn' '/m/04lrdnt' '/m/04lrdn5' '/m/04lrdnh' '/m/04lrdnb'
Note that when I tested this, I encoded the input CSV file as "UTF-8 without BOM" (I was using Notepad++ as editor)
Try using the following fopen command instead of the one you currently are. It specifies UTF-8 encoding for the file.
f = fopen(filename,'rt','n','UTF-8'); % 3rd argument is the machine format ('n' = native), 4th is the encoding
You can probably shorten up some of the code using this as well:
text = fscanf(f,'%c');
Lines = textscan(text,'%s','Delimiter',',');
That might help alleviate some of the pre-allocation that you're doing there.
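Once the text has been decoded to a MATLAB character array, the splitting can also be done with strsplit instead of textscan. This is a sketch under assumptions: the sample bytes stand in for what fread(fid, '*uint8')' would return, lines are assumed to end in '\n' only, and char(201) is used for 'É' so that the example does not depend on how the script file itself is encoded.

```matlab
% Stand-in for the raw bytes read from the UTF-8 file.
sample = sprintf('L''%studiante,/m/0j9vjq5,/m/0h6hft_\nLa Loi des rues,/m/0gw3lmk\n', char(201));
bytes  = unicode2native(sample, 'UTF-8');

str   = native2unicode(bytes, 'UTF-8');      % decode once, up front
rows  = strsplit(strtrim(str), newline);     % one cell per line
parts = cellfun(@(s) strsplit(s, ',', 'CollapseDelimiters', false), ...
                rows, 'UniformOutput', false);

% Pad into a rectangular cell array; unused cells stay empty.
width = max(cellfun(@numel, parts));
C = cell(numel(parts), width);
for k = 1:numel(parts)
    C(k, 1:numel(parts{k})) = parts{k};
end
```

Setting 'CollapseDelimiters' to false keeps empty fields (two commas in a row) as empty strings instead of silently dropping them.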