Load large matrix from text file in matlab - matlab

I have a text file like :
[ 1, 2, 3;
2, 4, 5;
2, 2, 2;
8, 3, 3 ]
What is the quickest way to load this as a matrix in Octave/Matlab? I want to see this as a matrix with 4 rows and 3 cols.

Drag the text file with your mouse over your workspace in MATLAB (the area where all your current variables are shown) and drop it there. This opens the "import" window.
Give the file a name (mine is currently "NewTextDocument2") and select IMPORT on the top right. MATLAB will take care of semicolons and brackets. If you want to have a function that does this, select "generate function" instead of IMPORT.

I'm not sure if it is the simplest.
fid = fopen('filename.txt','r');
C = textscan(fid, '%f %f %f', ...
'Delimiter',' ','MultipleDelimsAsOne', 1);
fclose(fid);
DataMatrix = cat(2,C{:});

Quick and really dirty approach using the generally non-recommended function eval:
fid = fopen('data.txt');
s = fscanf(fid, '%s');
fclose(fid);
eval(['dataMatrix = ' s ';']);
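If you would rather not run file contents through eval, you can get the same result by stripping the brackets and separators and parsing the numbers directly. This is only a sketch: it writes the sample from the question to a temporary file (fname is a made-up name) and assumes every row has the same number of comma-separated values.

```matlab
% Write the sample from the question to a temporary file (hypothetical name).
fname = [tempname '.txt'];
fid = fopen(fname, 'w');
fprintf(fid, '[ 1, 2, 3;\n2, 4, 5;\n2, 2, 2;\n8, 3, 3 ]\n');
fclose(fid);

txt = fileread(fname);                         % whole file as one char vector
firstRow = strtok(txt, ';');                   % text up to the first semicolon
nCols = numel(strfind(firstRow, ',')) + 1;     % commas in the first row, plus one
txt = regexprep(txt, '[\[\],;]', ' ');         % brackets and separators -> spaces
vals = sscanf(txt, '%f');                      % all numbers as a column vector
dataMatrix = reshape(vals, nCols, []).';       % numbers were stored row by row
delete(fname);
```

For the sample file, dataMatrix is the expected 4-by-3 matrix, and nothing from the file is ever executed as code.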

In Octave you could do:
fid = fopen ("yourfile", "r");
x = str2num (char(fread(fid))');
fclose (fid)
(I don't know if this works in Matlab)

If your file contains only numbers, you can use MATLAB's load() function. It is most often used to load .mat files, but it can also read what MATLAB calls ASCII-format files.
Say your file is purely textual, contains only numbers and is structured as follows:
filename.txt
1 2 3
2 4 5
2 2 2
8 3 3
The load function will create a variable called filename containing your array:
>> load('filename.txt');
>> filename
filename =
     1     2     3
     2     4     5
     2     2     2
     8     3     3
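Because the variable name is derived from the file name, it can be more convenient to ask load for an output instead. A small sketch, assuming a whitespace-separated numeric file (the temporary file below only stands in for filename.txt):

```matlab
% Create a plain numeric text file to stand in for filename.txt.
fname = [tempname '.txt'];                     % hypothetical file name
fid = fopen(fname, 'w');
fprintf(fid, '%d %d %d\n', [1 2 3; 2 4 5; 2 2 2; 8 3 3].');
fclose(fid);

% With an output argument, load returns the matrix directly
% instead of creating a variable named after the file.
M = load(fname, '-ascii');
delete(fname);
```

Here M ends up as a 4-by-3 double array, whatever the file happens to be called.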

This works with your current format of textfile.
Use this function importfile.m:
function filename = importfile(filename, startRow, endRow)
delimiter = ',';
if nargin<=2
startRow = 1;
endRow = inf;
end
formatSpec = '%s%s%s%[^\n\r]';
fileID = fopen(filename,'r');
dataArray = textscan(fileID, formatSpec, endRow(1)-startRow(1)+1, 'Delimiter', delimiter, 'HeaderLines', startRow(1)-1, 'ReturnOnError', false);
for block=2:length(startRow)
frewind(fileID);
dataArrayBlock = textscan(fileID, formatSpec, endRow(block)-startRow(block)+1, 'Delimiter', delimiter, 'HeaderLines', startRow(block)-1, 'ReturnOnError', false);
for col=1:length(dataArray)
dataArray{col} = [dataArray{col};dataArrayBlock{col}];
end
end
fclose(fileID);
raw = repmat({''},length(dataArray{1}),length(dataArray)-1);
for col=1:length(dataArray)-1
raw(1:length(dataArray{col}),col) = dataArray{col};
end
numericData = NaN(size(dataArray{1},1),size(dataArray,2));
for col=[1,2,3]
rawData = dataArray{col};
for row=1:size(rawData, 1);
regexstr = '(?<prefix>.*?)(?<numbers>([-]*(\d+[\,]*)+[\.]{0,1}\d*[eEdD]{0,1}[-+]*\d*[i]{0,1})|([-]*(\d+[\,]*)*[\.]{1,1}\d+[eEdD]{0,1}[-+]*\d*[i]{0,1}))(?<suffix>.*)';
try
result = regexp(rawData{row}, regexstr, 'names');
numbers = result.numbers;
invalidThousandsSeparator = false;
if any(numbers==',');
thousandsRegExp = '^\d+?(\,\d{3})*\.{0,1}\d*$';
if isempty(regexp(numbers, thousandsRegExp, 'once'));
numbers = NaN;
invalidThousandsSeparator = true;
end
end
if ~invalidThousandsSeparator;
numbers = textscan(strrep(numbers, ',', ''), '%f');
numericData(row, col) = numbers{1};
raw{row, col} = numbers{1};
end
catch me
end
end
end
filename = cell2mat(raw);
How to use it:
>> importfile('file.txt',1,4)
ans =
1 2 3
2 4 5
2 2 2
8 3 3

Related

importing text file data by blocks?

I am trying to import every row that starts with '//'. I tried to extract them with the script below. Can anybody check my script, please?
formatSpec = '//NFE=%f //ElapsedTime=%f //SBX=%f //DE=%f //PCX=%f //SPX=%f //UNDX=%f //UM=%f //Improvements=%f //Restarts=%f //PopulationSize=%f //ArchiveSize=%f //MutationIndex=%f %*f';
N=1
k = 0;
while ~feof(fileID)
k = k+1;
C = textscan(fileID,formatSpec,N,'CommentStyle','#','Delimiter','\n');
end
It is not clear to me how you want the output to look, but here is one possibility:
fid = fopen(filename, 'rt');
dataset = textscan(fid, '%s', 'delimiter', '\n', 'headerlines', 0);
fclose(fid);
result = regexp(dataset{1}, '//([A-Za-z].*)=([0-9\.].*)', 'tokens');
result = result(cellfun(@(x) ~isempty(x), result));
result contains both the type, e.g. NFE or SBX, and the number (albeit in character format).
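If you need the numbers as doubles rather than text, the tokens can be converted in a second step. The sketch below works on a hand-made cell array of lines (the sample values are invented, since the actual file is not shown) and uses a slightly stricter pattern than the one above.

```matlab
% Hypothetical sample lines; in practice this would be dataset{1}.
lines = {'//NFE=100'; '//ElapsedTime=0.25'; '0.1 0.2 0.3'; '#'};

% One {name, value} token pair per matching line, empty for the others.
tok = regexp(lines, '^//([A-Za-z]\w*)=(\S+)', 'tokens', 'once');
tok = tok(~cellfun(@isempty, tok));            % drop lines without '//name=value'

names  = cellfun(@(t) t{1}, tok, 'UniformOutput', false);   % cell array of names
values = cellfun(@(t) str2double(t{2}), tok);                % numeric column vector
```

names and values line up element by element, so values(strcmp(names, 'NFE')) picks out a single quantity.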

Importing data block with Matlab

I have a set of data in the following format, and I would like to import each block in order to analyze them with Matlab.
Emax=0.5/real
----------------------------------------------------------------------
4.9750557 14535
4.9825821 14522
4.990109 14511
4.9976354 14491
5.0051618 14481
5.0126886 14468
5.020215 14437
5.0277414 14418
5.0352678 14400
5.0427947 14372
5.0503211 14355
5.0578475 14339
5.0653744 14321
Emax=1/real
----------------------------------------------------------------------
24.965595 597544
24.973122 597543
24.980648 597543
24.988174 597542
24.995703 597542
25.003229 597542
I have modified this piece of code from MathWorks, but I think I have problems dealing with the spaces between the columns.
Each block of data consists of 3874 rows, and the blocks are separated by a text line (Emax=XX/real) and a line of dashes. Unfortunately, this is the only way the software exports the data.
Here is one way to import the data:
% read file as a cell-array of lines
fid = fopen('file.dat', 'rt');
C = textscan(fid, '%s', 'Delimiter','');
C = C{1};
fclose(fid);
% remove separator lines
C(strncmp('---',C,3)) = [];
% location of section headers
headInd = [find(strncmp('Emax=', C, length('Emax='))) ; numel(C)+1];
% extract each section
num = numel(headInd)-1;
blocks = struct('header',cell(num,1), 'data',cell(num,1));
for i=1:num
% section header
blocks(i).header = C{headInd(i)};
% data
X = regexp(C(headInd(i)+1:headInd(i+1)-1), '\s+', 'split');
blocks(i).data = str2double(vertcat(X{:}));
end
The result is a structure array containing the data from each block:
>> blocks
blocks =
2x1 struct array with fields:
header
data
>> blocks(2)
ans =
header: 'Emax=1/real'
data: [6x2 double]
>> blocks(2).data(:,1)
ans =
24.9656
24.9731
24.9806
24.9882
24.9957
25.0032
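Once the blocks are in a structure array, the Emax value can be pulled out of each header and the blocks processed in one go. A minimal sketch, with the struct built by hand from the first two rows of each block in the question so that it runs on its own:

```matlab
% Hand-built stand-in for the blocks structure produced above.
blocks = struct( ...
    'header', {'Emax=0.5/real'; 'Emax=1/real'}, ...
    'data',   {[4.9750557 14535; 4.9825821 14522]; ...
               [24.965595 597544; 24.973122 597543]});

% Parse the number after 'Emax=' in each header; sscanf stops at the '/'.
emax = arrayfun(@(b) sscanf(b.header, 'Emax=%f'), blocks);

% Example per-block analysis: largest value in the second column.
peak = arrayfun(@(b) max(b.data(:, 2)), blocks);
```

emax and peak are column vectors with one entry per block.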
This should work. I don't think textscan() will work with a file like this because of the breaks between blocks.
Essentially, this code loops through the lines between blocks until it finds a line that matches the data format. The code is naive and assumes that the file has exactly the number of blocks and lines per block that you specify. If there were a fixed number of lines between blocks, it would be a lot easier: you could remove the first inner loop and replace it with a single fgets(fid); call for each line to skip.
function block_data = readfile(in_file_name)
fid = fopen(in_file_name, 'r');
delimiter = ' ';
line_format = '%f %f';
n_cols = 2; % Number of numbers per line
block_length = 3874; % Number of lines per block
n_blocks = 2; % Total number of blocks in file
tline = fgets(fid);
line_data = cell2mat(textscan(tline,line_format,'delimiter',delimiter,'MultipleDelimsAsOne',1));
block_n = 0;
block_data = zeros(n_blocks,block_length,n_cols);
while ischar(tline) && block_n < n_blocks
block_n = block_n+1;
tline = fgets(fid);
if ischar(tline)
line_data = cell2mat(textscan(tline,line_format,'delimiter',delimiter,'MultipleDelimsAsOne',1));
end
while ischar(tline) && isempty(line_data)
tline = fgets(fid);
line_data = cell2mat(textscan(tline,line_format,'delimiter',delimiter,'MultipleDelimsAsOne',1));
end
line_n = 1;
while line_n <= block_length
block_data(block_n,line_n,:) = cell2mat(textscan(tline,line_format,'delimiter',delimiter,'MultipleDelimsAsOne',1));
tline = fgets(fid);
line_n = line_n+1;
end
end
fclose(fid)

Matlab: Put each line of a text file in a separate array

I have a file like the following
10158 18227 2055 24478 25532
12936 14953 17522 17616 20898 24993 24996
26375 27950 32700 33099 33496 3663
...
I would like to put each line in an array in order to access elements of each line separately.
I used cell arrays but it seems to create a 1 by 1 array for each cell element:
fid=fopen(filename)
nlines = fskipl(fid, Inf)
frewind(fid);
cells = cell(nlines, 1);
for ii = 1:nlines
cells{ii} = fscanf(fid, '%s', 1);
end
fclose(fid);
When I access cells{ii}, I get all values in the same element and I can't access the individual values.
A shorter solution would be reading the file with textscan:
fid = fopen(filename, 'r');
C = textscan(fid, '%s', 'delimiter', '');
C = cellfun(@str2num, C{1}, 'UniformOutput', false);
fclose(fid);
The resulting cell array C is what you're looking for.
I think that fscanf(fid, '%s', 1); is telling MATLAB to read the line as a single string. You will still have to convert it to an array of numbers:
for ii = 1:nlines
cells{ii} = str2num(fscanf(fid, '%s', 1));
end
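Another option that does not need the line count up front is to read with fgetl and parse each line with sscanf. This is a sketch under the assumption that every line holds only whitespace-separated numbers; the temporary file (fname is a made-up name) holds the first two lines from the question.

```matlab
% Write two of the sample lines to a temporary file (hypothetical name).
fname = [tempname '.txt'];
fid = fopen(fname, 'w');
fprintf(fid, '10158 18227 2055 24478 25532\n');
fprintf(fid, '12936 14953 17522 17616 20898 24993 24996\n');
fclose(fid);

cells = {};
fid = fopen(fname, 'r');
tline = fgetl(fid);                            % fgetl returns -1 at end of file
while ischar(tline)
    cells{end+1, 1} = sscanf(tline, '%f').';   % numbers on this line as a row vector
    tline = fgetl(fid);
end
fclose(fid);
delete(fname);
```

Each cells{ii} is now a numeric row vector, so cells{1}(2) returns 18227 even though the lines have different lengths.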

Matlab too many outputs

The program myfile.m reads a txt file that contains a total of 25 names and numbers like
Example:
John doughlas 15986
Filip duch 357852
and so on.
The program converts them to
15986 Doughlas John
357852 duch Filip
This works without a function; once I wrap it in a function, I get "Too many output arguments".
Error message:
Error using disp
Too many output arguments.
Error in red4 (line 26)
array = disp(All);
Original code below:
function array = myfile(~)
if nargin == 0
dirr = '.';
end
answer = dir(dirr);
k=1;
while k <= length(answer)
if answer(k).isdir
answer(k)=[];
else
filename{k}=answer(k).name;
k=k+1;
end
end
chose=menu( 'choose file',filename);
namn = char(filename(chose));
fid = fopen(namn, 'r');
R = textscan(fid,'%s %s %s');
x=-1;
k=0;
while x <= 24
x = k + 1;
All = [R{3}{x},' ',R{1}{x},' ',R{2}{x}];
disp(All)
k = k + 1;
end
fclose(fid);
I have received many good suggestions from people and from sites about functions, but I can't get results like the above from a function.
I have tried combining them and got these results:
[a,z,b] = myfile gave y = 15986
myfile = x gave y = 25
myfile = fprintf(All) gave y = numbers name1,2,3,4 and so on
results().namn, results().id, results().lastname gave
y =
numbers name 1
y =
numbers name 2 and so on.
The result I want is:
y = myfile
y =
15986 Doughlas John
357852 duch Filip
Update: I changed it as Eitan T suggested, but didn't get the result shown above.
Got the result:
'John doughlas 15986'
'Filip duch 357852'
function C = myfile()
if nargin == 0
dirr = '.';
end
answer = dir(dirr);
k=1;
while k <= length(answer)
if answer(k).isdir
answer(k)=[];
else
filname{k}=answer(k).name;
k=k+1;
end
end
chose=menu( 'choose',filname);
name = char(filname(chose));
fid = fopen(name, 'r');
C = textscan(fid, '%s', 'delimiter', '');
C = regexprep(C{1}, '(\w+) (\w+) (\w+)', '$3 $2 $1');
fclose(fid);
Why use loops? Read the lines at once with textscan and use regexprep to manipulate the words:
fid = fopen(filename, 'r');
C = textscan(fid, '%s', 'delimiter', '');
C = regexprep(C{1}, '(\w+) (\w+) (\w+)', '$3 $2 $1')
fclose(fid);
The result is a cell array C, each cell storing a line. For your example, you'll get a 2×1 cell array:
C =
'15986 doughlas John'
'357852 duch Filip'
I'm not sure what you want to do with it, but if you provide more details I can improve my answer further.
Hope this helps!
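On the error itself: disp only prints and has no return value, so array = disp(All) fails with "Too many output arguments". To return the text from a function, build it into a variable and assign that to the output. A minimal sketch, with R filled in by hand as a stand-in for what textscan(fid,'%s %s %s') would give for the two example lines:

```matlab
% Stand-in for R = textscan(fid, '%s %s %s'): one cell column per field.
R = {{'John'; 'Filip'}, {'doughlas'; 'duch'}, {'15986'; '357852'}};

% Build one string per person in the order: number, last name, first name.
y = cellfun(@(id, last, first) sprintf('%s %s %s', id, last, first), ...
            R{3}, R{2}, R{1}, 'UniformOutput', false);

array = char(y);   % optional: padded char matrix, one row per person
disp(array)        % printing is separate from returning a value
```

Inside myfile you would assign y (or array) to the function's output variable instead of calling disp.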

Matlab Multiple delimiter removing during import

I am struggling with a problem and want an easy way around it. I have a big array of data where some vector values are in the format ( 1.02 1.23 3.32). I want them as 1.02 1.23 3.32 in tabular form. The problem is that there are two delimiters, '(' and ')'.
Can anyone help with writing code for this? I have something like this:
filename = 'U.dat';
delimiterIn = '(';
headerlinesIn = 0;
A = textscan(filename,delimiterIn,headerlinesIn);
But it only handles one delimiter, "(", and it does not work either.
Nishant
If your text file looks something like this:
(1 2 3) (4 5 6)
(7 8 9) (10 11 12)
You can read it in as strings and convert it to a cell array of vectors like this:
% read in file
clear all
filename = 'delim.txt';
fid = fopen(filename); %opens file for reading
tline = fgets(fid); %reads in a line
index = 1;
%reads in all lines until the end of the file is reached
while ischar(tline)
data{index} = tline;
tline = fgets(fid);
index = index + 1;
end
fclose(fid); %close file
% convert strings to a cell array of vectors
rowIndex = 1;
colIndex = 1;
outData = {};
innerStr = [];
for aCell = data % for each entry in data
sline = aCell{1,1};
for c = sline % for each character in the line
if strcmp(c, '(')
innerStr = [];
elseif strcmp(c, ')')
outData{rowIndex,colIndex} = str2num(innerStr); % convert the text between the parentheses to a numeric vector
colIndex = colIndex + 1;
else
innerStr = [innerStr, c];
end
end
rowIndex = rowIndex + 1;
colIndex = 1;
end
outData
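The character-by-character loop can also be replaced by a regular expression that captures whatever sits between each pair of parentheses. A sketch, assuming the same two-line layout as above and using an in-memory string in place of the file:

```matlab
% Sample text in the same layout as delim.txt above.
txt = sprintf('(1 2 3) (4 5 6)\n(7 8 9) (10 11 12)\n');

lines = regexp(strtrim(txt), '\n', 'split');   % one cell per line
outData = cell(numel(lines), 0);
for r = 1:numel(lines)
    % Each token is the text between one '(' and the next ')'.
    groups = regexp(lines{r}, '\(([^)]*)\)', 'tokens');
    for c = 1:numel(groups)
        outData{r, c} = sscanf(groups{c}{1}, '%f').';   % numeric row vector
    end
end
```

outData is a 2-by-2 cell array here, and outData{2,2} holds the vector [10 11 12].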