Problem (bug?) loading hexadecimal data into MATLAB

I'm trying to load the following ASCII file into MATLAB using load():
% some comment
1 0xc661
2 0xd661
3 0xe661
(This is actually a simplified file. The actual file I'm trying to load contains an undefined number of columns and an undefined number of comment lines at the beginning, which is why the load function was attractive)
For some strange reason, I obtain the following:
K>> data = load('testMixed.txt')
data =
1 50785
2 58977
3 58977
I've observed that the problem occurs anytime there's a "d" in the hexadecimal number.
Direct hex2dec conversion works properly:
K>> hex2dec('d661')
ans =
54881
importdata seems to have the same conversion issue, and so does the ImportWizard:
K>> importdata('testMixed.txt')
ans =
1 50785
2 58977
3 58977
Is that a bug, am I using the load function in some prohibited way, or is there something obvious I'm overlooking?
Are there workarounds around the problem, save from reimplementing the file parsing on my own?
Edit: I changed my input file to better reflect my actual file format. I had oversimplified it a bit in my original question.

"GOLF" ANSWER:
This starts with the answer from mtrw and shortens it further:
fid = fopen('testMixed.txt','rt');
data = textscan(fid,'%s','Delimiter','\n','MultipleDelimsAsOne','1',...
'CommentStyle','%');
fclose(fid);
data = strcat(data{1},{' '});
data = sscanf([data{:}],'%i',[sum(isspace(data{1})) inf]).';
PREVIOUS ANSWER:
My first thought was to use TEXTSCAN, since it has an option that allows you to ignore certain lines as comments when they start with a given character (like %). However, TEXTSCAN doesn't appear to handle numbers in hexadecimal format well. Here's another option:
fid = fopen('testMixed.txt','r'); % Open file
% First, read all the comment lines (lines that start with '%'):
comments = {};
position = 0;
nextLine = fgetl(fid); % Read the first line
while strcmp(nextLine(1),'%')
comments = [comments; {nextLine}]; % Collect the comments
position = ftell(fid); % Get the file pointer position
nextLine = fgetl(fid); % Read the next line
end
fseek(fid,position,-1); % Rewind to beginning of last line read
% Read numerical data:
nCol = sum(isspace(nextLine))+1; % Get the number of columns
data = fscanf(fid,'%i',[nCol inf]).'; % Note '%i' works for all integer formats
fclose(fid); % Close file
This will work for an arbitrary number of comments at the beginning of the file. The computation to get the number of columns was inspired by Jacob's answer.
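As a quick illustration of why '%i' helps here, the following minimal sketch parses one data line of the sample file directly from a string. It relies on the documented behavior of sscanf, where '%i' treats a 0x prefix as base 16; the variable names are illustrative only.

```matlab
% One data line from the sample file, held in a string for illustration
sampleLine = '2 0xd661';
% '%i' infers the base from the prefix, so 0xd661 is read as hexadecimal
vals = sscanf(sampleLine, '%i').';
disp(vals)
```

The second value should agree with hex2dec('d661') from the question.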

New:
This is the best I could come up with. It should work for any number of comment lines and columns. You'll have to do the rest yourself if there are strings, etc.
% Define the characters representing the start of the commented line
% and the delimiter
COMMENT_START = '%';
DELIMITER = ' ';
% Open the file
fid = fopen('testMixed.txt');
% Read each line till we reach the data
l = COMMENT_START;
while(l(1)==COMMENT_START)
l = fgetl(fid);
end
% Compute the number of columns
cols = sum(l==DELIMITER)+1;
% Split the first line
split_l = regexp(l,' ','split');
% Read all the data
A = textscan(fid,'%s');
% Compute the number of rows
rows = numel(A{:})/cols;
% Close the file
fclose(fid);
% Assemble all the data into a matrix of cell strings
DATA = [split_l ; reshape(A{:},[cols rows])'];
% Recognize each column and process accordingly
% by analyzing each element in the first row
numeric_data = zeros(size(DATA));
for i=1:cols
str = DATA(1,i);
% If there is no '0x' present
if isempty(strfind(str{1},'0x'))
% This is a number
numeric_data(:,i) = str2num(char(DATA(:,i)));
else
% This is a hexadecimal number
col = char(DATA(:,i));
numeric_data(:,i) = hex2dec(col(:,3:end));
end
end
% Display the data
format short g;
disp(numeric_data)
This works for data like this:
% Comment 1
% Comment 2
1.2 0xc661 10 0xa661
2 0xd661 20 0xb661
3 0xe661 30 0xc661
Output:
1.2 50785 10 42593
2 54881 20 46689
3 58977 30 50785
OLD:
Yeah, I don't think LOAD is the way to go. You could try:
a = char(importdata('testHexa.txt'));
a = hex2dec(a(:,3:end));

This is based on both gnovice's and Jacob's answers, and is a "best of breed"
For files like:
% this is my comment
% this is my other comment
1 0xc661 123
2 0xd661 456
% surprise comment
3 0xe661 789
4 0xb661 1234567
(where the number of columns within the file MUST be the same, but not known ahead of time, and all comments denoted by a '%' character), the following code is fast and easy to read:
f = fopen('hexdata.txt', 'rt');
A = textscan(f, '%s', 'Delimiter', '\n', 'MultipleDelimsAsOne', '1', 'CollectOutput', '1', 'CommentStyle', '%');
fclose(f);
A = A{1};
data = sscanf(A{1}, '%i')';
data = repmat(data, length(A), 1);
for ctr = 2:length(A)
data(ctr,:) = sscanf(A{ctr}, '%i')';
end

Related

Reading Irregular Text Files with MATLAB

In short, I'm having a headache in multiple languages to read a txt file (linked below). My most familiar language is MATLAB so for that reason I'm using that in this example. I've found a way to read this file in ~ 5 minutes, but given I'll have tons and tons of data from my instrument shortly as it measures all day every 30 seconds this just isn't feasible.
I'm looking for a way to quickly read these irregular text files so that going forward I can knock these out with less of a time burden.
You can find my exact data at this link:
http://lb3.pandonia.net/BostonMA/Pandora107s1/L0/Pandora107s1_BostonMA_20190814_L0.txt.bz2
I've been using the "readtable" function in matlab and I have achieved a final product I want but I'm looking to increase the speed
Below is my code!
clearvars -except pan day1; % Clearing all variables except for the day and instrument variables.
close all;
clc;
pan_mat = [107 139 155 153]; % Matrix of pandora numbers for file-choosing reasons.
pan = pan_mat(pan); % pandora number I'm choosing
pan = num2str(pan); % Turning Pandora number into a string.
%pan = '107'
pandora = strcat('C:\Users\tadams15\Desktop\Folders\Counts\Pandora_Dta\',pan)
% string that designates file location
%date = '90919'
month = '09'; % Month
day2 = strcat('0',num2str(day1)) % Creating a day name for the figure I ultimately produce
cd(pandora)
d2 = strcat('2019',num2str(month),num2str(day2)); % The final date variable for the figure I produce
%file_pan = 'Pandora107s1_BostonMA_20190909_L0';
file_pan = strcat('Pandora',pan,'s1_BostonMA_',d2,'_L0'); % File name string
%Try reading it in line by line?
% Load in as a string and then convert the lines you want as numbers into
% number.
delimiterIn = '\t';
headerlinesIn = 41;
A = readtable(file_pan,'HeaderLines', 41, 'Delimiter', '\t'); %Reading the file as a table
A = table2cell(A); % Converting file to a cell
A = regexp(A, ' ', 'split'); % converting cell to a structure matrix.
%%
A= array2table(A); % Converting Structure matrix back to table
row_num = 0;
pan_mat_2 = zeros(2359,4126);
datetime_mat = zeros(2359,2);
blank = 0;
%% Converting data to proper matrices
[length width] = size(A);
% The matrix below is going through "A" and writing from it to a new
% matrix, "pan_mat_2" which is my final product as well as singling out the
% rows that contain non-number variables I'd like to keep and adding them
% later.
tic
%flag1
for i = 1:length; % Make second number the length of the table, A
blank = 0;
b = table2array(A{i,1});
[rows, columns] = size(b);
if columns > 4120 && columns < 4140
row_num = row_num + 1;
blank = regexp(b(2), 'T', 'split');
blank2 = regexp(blank{1,1}(2), 'Z', 'split');
datetime_mat(row_num,1) = str2double(blank{1,1}(1));
datetime_mat(row_num,2) = str2double(blank2{1,1}(1));
for j = 1:4126;
pan_mat_2(row_num,j) = str2double(b(j));
end
end
end
toc
%flag2
In short, I'm already getting the result I want but the part of the code where I'm writing to a new array "flag 1" to "flag 2" is taking roughly 222 seconds while the entire code only takes about 248 seconds. I'd like to find a better way to create the data there than to write it to a new array and take a whole bunch of time.
Any suggestions?
Note:
There are quite a few improvements you can make for speed, but there are also corrections to make. You preallocate your final output variable with hard-coded values:
pan_mat_2 = zeros(2359,4126);
But later you populate it in a loop which runs for i = 1:length.
length is the full number of lines picked from the file. In your example file there are only 784 lines. So even if all your lines were valid (OK to be parsed), you would only ever fill the first 784 rows of the 2359 rows you allocated in pan_mat_2. In practice, this file has only 400 valid data lines, so your pan_mat_2 could definitely be smaller.
I know you couldn't know that only 400 lines would be parsed before you parsed them, but you knew from the beginning that you had only 784 lines to parse (you had that information in the variable length). So in cases like this, preallocate to 784 and discard the empty rows afterwards.
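The preallocate-then-trim pattern described above could look roughly like the sketch below. The field counts and the validity test are made-up stand-ins for the real file, chosen only so the snippet runs on its own.

```matlab
% Hypothetical number of fields found on each line of a small file
fieldsPerLine = [3 1 3 3 2 3];
nLines = numel(fieldsPerLine);

% Preallocate for the worst case: every line is valid
out = zeros(nLines, 3);
nValid = 0;
for k = 1:nLines
    if fieldsPerLine(k) == 3          % stand-in for the real validity test
        nValid = nValid + 1;
        out(nValid, :) = k * ones(1, 3);
    end
end

% Discard the rows that were never filled
out = out(1:nValid, :);
```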
Fortunately, the solution I propose does not need to preallocate a larger array and then discard part of it. The matrices will end up the right size from the start.
The code:
%%
file_pan = 'Pandora107s1_BostonMA_20190814_L0.txt' ;
delimiterIn = '\t';
headerlinesIn = 41;
A = readtable(file_pan,'HeaderLines', 41, 'Delimiter', '\t'); %Reading the file as a table
A = table2cell(A); % Converting file to a cell
A = regexp(A, ' ', 'split'); % converting cell to a structure matrix.
%% Remove lines which won't be parsed
% Count the number of elements in each line
nelem = cell2mat( cellfun( @size , A ,'UniformOutput',0) ) ;
nelem(:,1) = [] ;
% find which lines does not have enough elements to be parsed
idxLine2Remove = ~(nelem > 4120 & nelem < 4140) ;
% remove them from the data set
A(idxLine2Remove) = [] ;
%% Remove nesting in cell array
nLinesToParse = size(A,1) ;
A = reshape( [A{:}] , [], nLinesToParse ).' ;
% now you have a cell array of size [400x4126] cells
%% Now separate the columns with different data type
% Column 1 => [String] identifier
% Column 2 => Timestamp
% Column 3 to 4125 => Numeric values
% Column 4126 => empty cell created during the 'split' operation above
% because of a trailing space character.
LineIDs = A(:,1) ;
TimeStamps = A(:,2) ;
Data = A(:,3:end-1) ; % fetch to "end-1" to discard last empty column
%% now extract the values
% You could do that directly:
% pan_mat = str2double(Data) ;
% but this takes a long time. A computationally much faster way (even if it
% uses more complex code) would be:
dat = strjoin(Data) ; % create a single long string made of all the strings in all the cells
nums = textscan( dat , '%f' , Inf ) ; % call textscan on it (way faster than str2double() )
pan_mat = reshape( cell2mat( nums ) , nLinesToParse ,[] ) ; % reshape to original dimensions
%% timestamps
% convert to character array
strTimeStamps = char(TimeStamps) ;
% convert to matlab own datetime numbering. This will be a lot faster if
% you have operations to do on the time stamps later
ts = datenum(strTimeStamps,'yyyymmddTHHMMSSZ') ;
%% If you really want them the way you had it in your example
strTimeStamps(:,9) = ' ' ; % replace 'T' with ' '
strTimeStamps(:,end) = ' ' ; % replace 'Z' characters with ' '
%then same again, merge into a long string, parse then reshape accordingly
strdate = reshape(strTimeStamps.',1,[]) ;
tmp = textscan( strdate , '%d' , Inf ) ;
datetime_mat = reshape( double(cell2mat(tmp)),2,[]).' ;
The performance:
On my machine, your original code takes ~102 seconds to execute, with 80% of that (81s) spent on calling the function str2double() 3,302,400 times!
My solution, run on the same input file, takes ~5.5 seconds, with half of the time spent on calling strjoin() 3 times.
When you read the code above, try to understand how I limited repeated function calls in lengthy loops by keeping everything as vectorised as possible.
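To make the join-then-parse idea concrete, here is a minimal sketch on a tiny made-up cell array (the values are not from the instrument file). It joins all cells into one string, parses that string with a single textscan call, and reshapes the result back to the original layout.

```matlab
% Small made-up cell array of numeric strings
Data = {'1.5', '20', '0.25'; '3', '4.75', '600'};

% Join all cells (in column-major order) into one space-separated string
joined = strjoin(Data(:).', ' ');

% One textscan call parses every number at once
nums = textscan(joined, '%f');

% Column-major order is preserved, so reshape restores the layout
M = reshape(nums{1}, size(Data));
```

The result should match str2double(Data), which parses cell by cell.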
Using the profiler, you can see that you call str2double 3302400 times in a run, which takes about 80% of the total time on my PC. That's suboptimal, as each call translates only one value, and as far as your code goes you don't need the values as strings again. I added this under your original code:
row_num = 0;
pan_mat_2_b = cell(2359,4126);
datetime_mat_b = cell(2359,2);%not zeros
blank = 0;
tic
%flag1
for i = 1:length % Make second number the length of the table, A
blank = 0;
b = table2array(A{i,1});
[rows, columns] = size(b);
if columns > 4120 && columns < 4140
row_num = row_num + 1;
blank = regexp(b(2), 'T', 'split');
blank2 = regexp(blank{1,1}(2), 'Z', 'split');
%datetime_mat(row_num,1) = str2double(blank{1,1}(1));
%datetime_mat(row_num,2) = str2double(blank2{1,1}(1));
datetime_mat_b(row_num,1) = blank{1,1}(1);
datetime_mat_b(row_num,2) = blank2{1,1}(1);
pan_mat_2_b(row_num,:) = b;
% for j = 1:4126
% pan_mat_2(row_num,j) = str2double(b(j));
% end
end
end
datetime_mat_b = datetime_mat_b(~all(cellfun('isempty',datetime_mat_b),2),:);
pan_mat_2_b=pan_mat_2_b(~all(cellfun('isempty',pan_mat_2_b),2),:);
datetime_mat_b=str2double(string(datetime_mat_b));
pan_mat_2_b=str2double(pan_mat_2_b);
toc
Still not great, but better. If you want to speed this up further, I recommend taking a closer look at the readtable part, as you can save quite some time if you read the data in as doubles right from the beginning.

MATLAB - Read Textfile (lines with different formats) line by line

I have a text file (let's call it an input file) of this type:
%My kind of input file
% Comment 1
% Comment 2
4 %Parameter F
2.745 5.222 4.888 1.234 %Parameter X
273.15 373.15 1 %Temperature Initial/Final/Step
3.5 %Parameter Y
%Matrix A
1.1 1.3 1 1.05
2.0 1.5 3.1 2.1
1.3 1.2 1.5 1.6
1.3 2.2 1.7 1.4
I need to read this file and save the values as variables or even better as part of different arrays. For example by reading I should obtain Array1.F=4; then Array1.X should be a vector of 3 real numbers, Array2.Y=3.5 then Array2.A is a matrix FxF. There are tons of functions to read from text file but I don't know how to read these kind of different formats. I've used in the past fgetl/fgets to read lines but it reads as strings, I've used fscanf but it reads the whole text file as if it is formatted all equally. However I need something to read sequentially with predefined formats. I can easily do this with fortran reading line by line because read has a format statement. What is the equivalent in MATLAB?
This actually parses the file you posted in your example. I could've done better, but I'm tired today:
res = struct();
fid = fopen('test.txt','r');
read_mat = false;
while (~feof(fid))
% Read text line by line...
line = strtrim(fgets(fid));
if (isempty(line))
continue;
end
if (read_mat) % If I'm reading the final matrix...
% I use a regex to capture the values...
mat_line = regexp(line,'(-?(?:\d*\.)?\d+)+','tokens');
% If the regex succeeds I insert the values in the matrix...
if (~isempty(mat_line))
res.A = [res.A; str2double([mat_line{:}])];
continue;
end
else % If I'm not reading the final matrix...
% I use a regex to check if the line matches F and Y parameters...
param_single = regexp(line,'^(-?(?:\d*\.)?\d+) %Parameter (F|Y)$','tokens');
% If the regex succeeds I assign the values...
if (~isempty(param_single))
param_single = param_single{1};
res.(param_single{2}) = str2double(param_single{1});
continue;
end
% I use a regex to check if the line matches X parameters...
param_x = regexp(line,'^((?:-?(?:\d*\.)?\d+ ){4})%Parameter X$','tokens');
% If the regex succeeds I assign the values...
if (~isempty(param_x))
param_x = param_x{1};
res.X = str2double(strsplit(strtrim(param_x{1}),' '));
continue;
end
% If the line indicates that the matrix starts I set my loop so that it reads the final matrix...
if (strcmp(line,'%Matrix A'))
res.A = [];
read_mat = true;
continue;
end
end
end
fclose(fid);
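A lighter alternative to the regex approach, closer in spirit to a Fortran formatted read, is to let sscanf parse each line with '%f' and stop at the trailing '%' comment. This is only a sketch under the assumption that the values always appear in the order shown in the question; the lines are held in a cell array here (in practice they would come from fgetl), and the struct and field names are illustrative.

```matlab
% Lines of the example file, held in memory for illustration
lines = {'%My kind of input file', '% Comment 1', '% Comment 2', ...
         '4 %Parameter F', ...
         '2.745 5.222 4.888 1.234 %Parameter X', ...
         '273.15 373.15 1 %Temperature Initial/Final/Step', ...
         '3.5 %Parameter Y', '%Matrix A', ...
         '1.1 1.3 1 1.05', '2.0 1.5 3.1 2.1', ...
         '1.3 1.2 1.5 1.6', '1.3 2.2 1.7 1.4'};

% sscanf stops at the first character that is not part of a number,
% so comment-only lines give an empty result and are skipped
vals = {};
for k = 1:numel(lines)
    v = sscanf(lines{k}, '%f').';
    if ~isempty(v)
        vals{end+1} = v; %#ok<SAGROW>
    end
end

% Assign by position (assumed fixed order: F, X, temperature, Y, matrix rows)
params.F = vals{1};
params.X = vals{2};
params.T = vals{3};
params.Y = vals{4};
params.A = vertcat(vals{5:4+params.F});
```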

Open specific file with the specific words (16 bits) structure

I have a specific binary(?) file format containing data about the configuration used to take a picture with a custom camera. This file format is named DAI and contains, for example, values of offset/gain/etc.
I am using a black-box script in Java to turn this file into a .csv and I want to do the same thing in MATLAB. I've got a config file describing in ASCII format how this file is built (name of the field, type of the data, first_word, last_word, low_bit, high_bit). For example, I know that the first field in the DAI file will be:
spare1; PCHAR; first_word=0; low_bit=0; high_bit=7
But right now I have no clue how to use this information. My first thought was to fopen() the file and use fread() to read the binary data from the file and turn it into the format I want, but I don't know how to use the values of "last_word, high_bit, ..." to do so. I have a limited understanding of binary files.
To sum up:
file.dai contains the data
file.cfg contains the structure:
mband_1_start_line; PCHAR; first_word=12; low_bit=6; high_bit=15
mband_1_length; PCHAR; first_word=12; low_bit=0; high_bit=5
mband_1_gain; PCHAR; first_word=13; low_bit=0; high_bit=7
mband_1_offset; PCHAR; first_word=13; low_bit=8; last_word=14; high_bit=7
and I want to recover the data corresponding to fields such as mband_1_offset.
If someone can help me figure out a good way of doing that, I would be very thankful!
[EDIT: SOLVED] Thanks to your help, I've managed to get the values for each field, even when the header changes!
Here's the final code:
...code to retrieve the content of the .cfg file....
%% Open and read the DAI file
fid = fopen(dai_file,'r','l');
% First thing is to skip the header
% We read a first time the file
dat=fread(fid,inf,'*uint8');
% We search for the position of the end of the header : NUL NUL ETX
% In decimal it gives :
skip = findstr(dat',[000,000,003]);
% We define the wordsize : 2 bytes (one 16-bit word)
wordsize = 2;
% We rewind the file to start over to get the values for each field
frewind(fid);
% We initiate the structure camdat containing the datas of the camera
camdat=struct;
% We start the loop for each field of the layout config file
for ct = 1:length(layout)
% Defining the words/bits
first_word = layout{ct,3};
last_word = layout{ct,5};
low_bit = layout{ct,4};
high_bit = layout{ct,6};
% We position to the "skip value + the position of the first_word in bytes"
fseek(fid,skip+first_word*wordsize,-1);
% We compute the number of words (last - first +1)
datasize=last_word-first_word+1;
% We read the datas as uint16 (words are 16bits)
data=fread(fid,datasize,'*uint16');
% We convert it to bits
% Case of 1 word
bits=bitget(data(1),[1:16]);
% Case of 2 words
if length(data) > 1
bits=[bits,bitget(data(2),1:16)];
high_bit = high_bit+16;
end
% We take only the bits that define the field (between low_bit and
% high_bit)
bits_used = bits(low_bit+1:high_bit+1);
% We convert the bits to dec
data = sum(bits_used.*uint16(2).^uint16([0:length(bits_used)-1]));
% We store it in the camdat.field struct
camdat.(layout{ct,1})=data;
end
% We close the DAI file
fclose(fid);
% Displaying for test
camdat
My approach in this case is to find the part of the file that matches your data.
fid = fopen('dai_file.dai','r','l');
dat=fread(fid,inf,'*uint8');
findstr(dat',[74,210,129,93]);
>> 891 1159 1427 1695 ....
Strangely enough, this happens 100 times.
If byte 891 is right, then bios_1 is NOT in the 4th word from bit 0 to 7, but in the 445th word, bit 0 to 7.
Let's try
fid = fopen(dai_file,'r','l');
fseek(fid,445*2,-1)
data=fread(fid,1,'*uint16');
bits=bitget(data(1),[1:16]);
bits = bits(1:8);
data = sum(bits.*uint16(2).^uint16([0:7]))
>> data = 74
Yep, there it is. So I would suggest to add 441 to each word entry and see if it works.
OK, so you have information about the layout of the file.
I would first store this in a more accessible format:
layout{1,1} = 'mband_1_start_line';
layout{1,2} = 'PCHAR';
layout{1,3} = 12;
layout{1,4} = 6;
layout{1,5} = 12;
layout{1,6} = 15;
Then you loop over the layout
wordsize = 2; % bytes per word
fid = fopen(filename,'r','l'); % open the file (little-endian)
camdat = struct;
for ct = 1:size(layout,1)
fseek(fid,layout{ct,3}*wordsize,-1); % go to the byte position of first_word
datasize = layout{ct,5}-layout{ct,3}+1; % number of words
data = fread(fid,datasize,'*uint16'); % get words
bits = bitget(data(1),1:16); % convert to bits
for w = 2:datasize
bits = [bits,bitget(data(w),1:16)];
end
% get bits (low_bit and high_bit are zero-based, MATLAB indices are one-based)
bits = bits(layout{ct,4}+1:(datasize-1)*16+layout{ct,6}+1);
data = sum(bits.*uint16(2).^uint16(0:(length(bits)-1))); % convert back
camdat.(layout{ct,1}) = data; % store
end
fclose(fid);
There will be problems with values that are longer than 16 bits, of course.
If the wordsize is different, you can change it to 4 for 32 bit, or 8 for 64 bit, but then you have to also change that in the loop.
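As a compact alternative to expanding each word into individual bits, the field could also be extracted with a shift and a mask. The sketch below assumes zero-based low_bit/high_bit as in the .cfg excerpt and, for a field spanning two words, that the first word holds the low-order bits (the same convention as appending the second word's bits after the first). The word values and the getField helper are made up for illustration.

```matlab
% Hypothetical helper: extract bits lo..hi (zero-based) from an unsigned integer
getField = @(w, lo, hi) bitand(bitshift(w, -lo), cast(2^(hi - lo + 1) - 1, class(w)));

% Single-word fields, using a made-up 16-bit word
word = uint16(hex2dec('A5C3'));
startLine  = getField(word, 6, 15);   % upper ten bits
fieldLength = getField(word, 0, 5);   % lower six bits

% Field spanning two words: low_bit=8 in the first word, high_bit=7 in the second
w1 = uint16(hex2dec('AB00'));
w2 = uint16(hex2dec('00CD'));
combined = bitor(uint32(w1), bitshift(uint32(w2), 16));
offset = getField(combined, 8, 7 + 16);
```

Combining two 16-bit words into a uint32 first also avoids the overflow issue with values longer than 16 bits.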
So I've been using your help to figure a way to do what I wanted.
The idea is to go to the byte offset of first_word, take the bits between the first and last word (bounded by low_bit and high_bit), and turn them into a decimal value. With your code I've written the following, which gives results, but not the ones I was expecting (in the .csv) (attached file).
First I'm not sure I'm handling well the case where the last_word is not the same as the first_word.
Then I'm not sure that my fseek() sends me at the correct bytes of the file...
%% Name of the files
%% Open and read the .cfg file
%% Open and read the DAI file
...So here I've got my .cfg opened and store in layout{i,j}
wordsize = 2; %bytes / word
fid = fopen(dai_file,'r','l');
camdat=struct;
for ct = 1:length(layout)
first_word = layout{ct,3};
last_word = layout{ct,5};
low_bit = layout{ct,4};
high_bit = layout{ct,6};
fseek(fid,first_word*wordsize,-1); %go to bytes
datasize=last_word-first_word+1; %number of words
data=fread(fid,datasize,'*uint16'); %get words
bits=bitget(data(1),[1:16]); %convert to bits
if length(data) > 1 % case of 2 words
bits=[bits,bitget(data(2),1:16)];
high_bit = high_bit+16;
end
bits = bits(low_bit+1:high_bit+1);%get bits
data = sum(bits.*uint16(2).^uint16([0:length(bits)-1])); %convert back
camdat.(layout{ct,1})=data; %store
end
camdat
fclose(fid);
So if you have ideas of where I'm wrong, I'll be very grateful !!!!

How to sparsely read a large file in Matlab?

I ran a simulation which wrote a huge file to disk. The file is a big matrix v. I can't read it all, but I really only need a portion of the matrix, say, 1:100 of the columns and rows. I'd like to do something like
vtag = dlmread('v',1:100:end, 1:100:end);
Of course, that doesn't work. I know I should have only done the following when writing to the file
dlmwrite('vtag',v(1:100:end, 1:100:end));
But I did not, and running everything again would take two more days.
Thanks
Amir
Thankfully, the dlmread function supports specifying a range to read as the third input. So if you want to read all N columns for the first 100 rows, you can do so with the following commands:
startRow = 1;
startColumn = 1;
endRow = 100;
endColumn = N;
rng = [startRow, startColumn, endRow, endColumn] - 1;
vtag = dlmread(filename, ',', rng);
EDIT Based on your clarification
Since you don't want 1:100 rows but rather 1:100:end rows, the following approach should work better for you.
You can use textscan to read chunks of data at a time. You can read a "good" row and then read in the next "chunk" of data to ignore (discarding it in the process), and continue until you reach the end of the file.
The code below is a slight modification of that idea, except it utilizes the HeaderLines input to textscan, which instructs the function how many lines to ignore before reading in the data. The first time through the loop, no lines will be skipped; on all other passes, rows2skip lines will be skipped. This allows us to "jump" through the file very rapidly without calling any additional file operations.
startRow = 1;
rows2skip = 99;
columns = 3000;
fid = fopen(filename, 'rb');
% For now, we'll just assume you're reading in floating-point numbers
format = repmat('%f ', [1 columns]);
count = 1;
lines2discard = startRow - 1;
while ~feof(fid)
% Use "HeaderLines" to skip data before reading in data we care about
row = textscan(fid, format, 1, 'Delimiter', ',', 'HeaderLines', lines2discard);
data{count} = [row{:}];
% After the first time through, set the "HeaderLines" (i.e. lines to ignore)
% to be the # we want to skip between lines (much faster than alternatives!)
lines2discard = rows2skip;
count = count + 1;
end
fclose(fid);
data = cat(1, data{:});
You may need to adjust your format specifier for your own type of input.
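For comparison, here is a self-contained sketch of a plainer technique (not the HeaderLines approach above): read the file line by line with fgetl, parse only every rowStep-th line, and keep every colStep-th value. The temporary file, the small test matrix, and the step sizes are made up so the snippet can be run on its own; only one line is held in memory at a time.

```matlab
% Write a small made-up comma-separated matrix to a temporary file
v = reshape(1:50, 5, 10).';           % 10 rows, 5 columns
fname = [tempname '.csv'];
dlmwrite(fname, v);

rowStep = 3;                          % keep rows 1, 4, 7, ...
colStep = 2;                          % keep columns 1, 3, 5, ...

fid = fopen(fname, 'r');
out = [];
lineNo = 0;
tline = fgetl(fid);
while ischar(tline)                   % fgetl returns -1 at end of file
    lineNo = lineNo + 1;
    if mod(lineNo - 1, rowStep) == 0
        vals = sscanf(tline, '%f,').';        % parse one comma-separated row
        out(end+1, :) = vals(1:colStep:end);  %#ok<SAGROW>
    end
    tline = fgetl(fid);
end
fclose(fid);
delete(fname);
```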

Parsing a data file in matlab

I have a text file with two columns of data. I want to split this file and save it as two individual strings in MATLAB, but I also need to stop copying the data when I meet an identifier in the data, then start two new strings.
For example
H 3
7 F
B B
T Y
SPLIT
<>
Where SPLIT <> is where I want to end the current string.
I'm trying to use fopen and fscanf, but struggling to get it to do what I want it to.
I tried the following script on the example you provided and it works. I believe the comments are very self explanatory.
% Open text file.
fid = fopen('test.txt');
% Read the first line.
tline = fgetl(fid);
% Initialize counter.
ii = 1;
% Stop at the end string, or at end of file (fgetl then returns -1).
while ischar(tline) && ~strcmp(tline, 'SPLIT')
% Analyze line only if it is not an empty one.
if ~strcmp(tline, '')
% Read the current line and split it into column 1 and column 2.
[column1(ii), column2(ii)] = strread(tline, ['%c %c']);
% Advance counter.
ii = ii + 1;
end
% Read the next line.
tline = fgetl(fid);
end
% Display results in console.
column1
column2
% Close text file.
fclose(fid);
The key functions here are fgetl and strread. Take a look at their documentation, it has some very nice examples as well. Hope it helps.
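The script above stops at the first SPLIT marker. A minimal sketch of how the same idea could be extended to start a new pair of strings after each marker is shown below. The lines are held in a cell array for illustration (in practice they would come from fgetl), and the two data lines after the marker are made up, since the question does not show what follows.

```matlab
% Example lines; the last two are hypothetical data following the marker
lines = {'H 3', '7 F', 'B B', 'T Y', 'SPLIT', '<>', 'A 1', 'C 2'};

seg = 1;                 % index of the current segment
col1 = {''};             % first-column string of each segment
col2 = {''};             % second-column string of each segment
for k = 1:numel(lines)
    t = strtrim(lines{k});
    if isempty(t) || strcmp(t, '<>')
        continue;        % skip blank lines and the '<>' line
    end
    if strcmp(t, 'SPLIT')
        seg = seg + 1;   % start two new strings
        col1{seg} = '';
        col2{seg} = '';
        continue;
    end
    parts = strsplit(t, ' ');
    col1{seg}(end+1) = parts{1};
    col2{seg}(end+1) = parts{2};
end
```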