I have a similar question to MATLAB CSV File Read, but slightly more complicated.
In my case, as well as having header lines at the start of my data, the data is separated into steps, all in the same file, like below:
headers
1,2,3
more headers
4,5,6
more headers again
7,8,9
Is there a way to "scan" through the entire data file and output the data?
I tried with csvread:
%step 1 pre load
RNG = [23 0 1539 8];
M = csvread(dat,23,0,RNG);
%step 2 stabilisation
RNG = [1555 0 2566 8];
N = csvread(dat,1555,0,RNG);
%step 3 test
RNG = [2582 0 4693 8];
O = csvread(dat,2582,0,RNG);
%step 4 end of test
RNG = [4709 0 4920 8];
P = csvread(dat,4709,0,RNG);
but the "ranges" are slightly different between tests, so without manually checking each data file for where the data is, this approach won't work. I "experimented" with textscan but didn't get anything useful to share.
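One way to avoid hard-coded ranges is to read the file line by line and treat every run of purely numeric lines as one step. The following is only a minimal sketch under stated assumptions: the demo file content, the variable names (rowText, blocks, current) and the comma delimiter are illustrative, all rows within a step are assumed to have the same number of columns, and the data is assumed to contain no genuine NaN values.

```matlab
% Write a small demo file in the layout from the question (hypothetical content).
fname = [tempname '.csv'];
fid = fopen(fname, 'w');
fprintf(fid, 'headers\n1,2,3\nmore headers\n4,5,6\n4,5,7\nmore headers again\n7,8,9\n');
fclose(fid);

% Split the file into lines and collect consecutive numeric rows into blocks.
rowText = regexp(fileread(fname), '\r?\n', 'split');
blocks = {};      % one numeric matrix per step
current = [];     % rows of the block currently being collected
for k = 1:numel(rowText)
    nums = str2double(strsplit(rowText{k}, ','));
    if ~any(isnan(nums))
        current(end+1, :) = nums;        % purely numeric line: data row
    elseif ~isempty(current)
        blocks{end+1} = current;         % header or blank line closes the block
        current = [];
    end
end
if ~isempty(current)
    blocks{end+1} = current;             % file ended while inside a block
end
delete(fname);
```

With this layout, blocks{1} would hold the pre-load step, blocks{2} the stabilisation step, and so on, regardless of where each step starts in a given file.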
First of all thank you for reading my question.
I am trying to import data from a file with following format into Matlab:
#Text
#Text: Number
...
#Text: Number
Set1:
1 2
3 4
Set2:
5 6
7 8
...
I would like to get those numbers into two matrices of the form:
(1 5
3 7)
and
(2 6
4 8)
I started by only building the first of those two matrices.
Winkel = 15;
xp = 30;
M = readtable('Ebene_1.txt')
M([1:4],:) = [];
M(:,3) = [];
for i=0:Winkel-1
A = table2array(M((2+i*31:31+i*31),1))
end
But this solution only gave me cell arrays which I could not transform into normal vectors.
I also tried to use the importdata command, but could not find a way to get this to work either. I know there are many other questions similar to mine, but I could not find one where all the data were in a single column. Also, there are many Matlab-commands for importing data into Matlab and I am not sure which would be the best.
First time asking such a question online so feel free to ask me for more details.
You can import the data you provided in your sample using readtable; however, because of the format of your file, you will need to tweak the function a bit.
You can use detectImportOptions to tell the function how to import the data.
%Detect import options for your text file.
opts = detectImportOptions('Ebene_1.txt')
%Specify variable names for your table.
opts.VariableNames = {'Text','Number'};
%Ignore last column of your text file as it does not contain data you are interested in.
opts.ExtraColumnsRule = 'ignore';
%You can confirm that the function has successfully identified that the data is numeric by inspecting the VariableTypes property.
%opts.VariableTypes
%Read your text file using the detected import options.
M = readtable('Ebene_1.txt',opts)
Now that you have table M, simply apply basic Matlab operations to obtain the matrices as you specified.
%Find numerical values in Text and Number variables. Ignore NaN values.
A = M.Text(~isnan(M.Text));
B = M.Number(~isnan(M.Number));
%Build matrices.
A = [A(1:2:end)';A(2:2:end)']
B = [B(1:2:end)';B(2:2:end)']
Output:
A =
1 5
3 7
B =
2 6
4 8
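The 1:2:end indexing above relies on each set having exactly two rows. If the sets are longer (the question's own attempt suggests 30 rows per set), reshape does the same job for any set length. A small sketch, assuming every set has the same number of rows; rowsPerSet, A2 and B2 are illustrative names, and A and B stand in for the column vectors obtained after removing the NaN values:

```matlab
% Column vectors as they would come out of the table for the sample file
A = [1; 3; 5; 7];
B = [2; 4; 6; 8];

rowsPerSet = 2;                    % assumption: identical for every set
A2 = reshape(A, rowsPerSet, []);   % one column per set
B2 = reshape(B, rowsPerSet, []);
```

For the sample data this gives the same matrices as the indexing version, because reshape fills its output column by column.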
I think the solution will be quite simple for somebody with some MATLAB know-how, but I do not know how to do it.
I have a binary file that I am reading with fread and I am reading the first 4 bytes of this file followed by the next 2 bytes.
I basically want this process of reading 4 bytes followed by 2 bytes repeated till the end of the file is reached.
So the number of bytes read is 4,2,4,2,4,2......
I have the following to read the first pair of data and I want this to repeat.
fileID = fopen('MyBinaryFile');
bytes4 = fread(fileID, 4); % MATLAB variable names cannot start with a digit
fseek(fileID, 4, 0);
bytes2 = fread(fileID, 2);
Thanks in advance for any help and suggestions
I take it this is a variant of your former question MATLAB reading a mixed data type binary file.
Your goal is to read a binary file containing mixed data type. In your case it contains 2 columns:
1x single value (4 bytes) and 1x int16 value (2 bytes).
There are several ways to read this type of file. They differ in speed: some minimize disk access but require more temporary memory, while others use only the memory needed but require more disk access (= slower).
Ultimately, the 3 ways I'm going to show you produce exactly the same result.
The direct answer to this question is the version #3 below, but I encourage you to have a look at the 2 other options described here, they are both really worth understanding.
For the purpose of the example, I had to create a binary file as you described. This is done this way:
%% // write example file
A = single(linspace(-3,1,11)) ; %// a few "float" (=single) data
B = int16(-5:5) ; %// a few "int16" data
fileID = fopen('testmixeddata.bin','w');
for il=1:11
fwrite(fileID,A(il),'single');
fwrite(fileID,B(il),'int16');
end
fclose(fileID);
This creates a 2-column binary file, the columns being:
11 values of type float going from -3 to 1.
11 values of type int16 going from -5 to +5.
For future reference:
>> disp(A)
-3.0000 -2.6000 -2.2000 -1.8000 -1.4000 -1.0000 -0.6000 -0.2000 0.2000 0.6000 1.0000
>> disp(B)
-5 -4 -3 -2 -1 0 1 2 3 4 5
In each of the solutions below, the first column is read into a variable called varSingle, and the second column into a variable called varInt16.
1) Read all data in one go - convert to proper type after
%% // SOLUTION 1 (fastest) : Read all data in one go - convert to proper type after
fileID = fopen('testmixeddata.bin');
R = fread(fileID,'uint8=>uint8') ; %// read all values, most basic data type (unsigned 8 bit integer)
fclose(fileID);
colSize = [4 2] ; %// number of byte for each column [4 byte single, 2 byte int16]
R = reshape( R , sum(colSize) , [] ) ; %// reshape so that each column of R holds one 6-byte record (4 + 2 bytes)
temp = R(1:4,:) ; %// extract data for first column into temporary variable (OPTIONAL)
varSingle = typecast( temp(:) , 'single' ) ; %// convert into "single/float"
temp = R(5:end,:) ; %// extract data for second column
varInt16 = typecast( temp(:) , 'int16' ) ; %// convert into "int16"
This is my favourite method, especially for speed: it minimizes the read/seek operations on disk, and most of the post-processing is done in memory (much, much faster than disk operations).
Note that the temporary variable is only there for clarity; you can avoid it altogether if you get your indexing into the raw data right.
The key thing to understand is the use of the typecast function. And the good news is that it got even faster as of R2014b.
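To see what typecast does in isolation, here is a small round trip on a single hand-built record. It is only an illustrative sketch: the values 0.5 and -5 and the names rec, f and n are made up for the example. typecast reinterprets the bytes of a vector without changing them, so a 4-byte single and a 2-byte int16 together form a 6-byte record that can be taken apart again.

```matlab
% Build one 6-byte record: a single (4 bytes) followed by an int16 (2 bytes)
rec = [typecast(single(0.5), 'uint8'), typecast(int16(-5), 'uint8')];

% Reinterpret the raw bytes as their original types
f = typecast(rec(1:4), 'single');   % first 4 bytes back to single
n = typecast(rec(5:6), 'int16');    % last 2 bytes back to int16
```

Note that typecast only accepts scalars and vectors, which is why solution 1 flattens each sub-matrix with temp(:) before converting.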
2) Read column by column (using "skipvalue") - 2 pass approach
%% // SOLUTION 2 : Read column by column (using "skipvalue") - 2 pass approach
col1size = 4 ; %// size of data in column 1 (in [byte])
col2size = 2 ; %// size of data in column 2 (in [byte])
fileID = fopen('testmixeddata.bin');
varSingle = fread(fileID,'*single',col2size) ; %// read all "float" values, skipping all "int16"
fseek(fileID,col1size,'bof') ; %// rewind to beginning of column 2 at the top of the file
varInt16 = fread(fileID,'*int16',col1size) ; %// read all "int16" values, skipping all "float"
fclose(fileID);
That works too. It works fine, but it is going to be slower than method 1 above, because you have to scan the file twice. It may be a good option if the file is very large and method 1 fails with an out-of-memory error.
3) Read element by element
%% // SOLUTION 3 : Read element by element (slow - not recommended)
fileID = fopen('testmixeddata.bin');
varSingle = single([]); varInt16 = int16([]); %// typed empty arrays, so the values keep their class
while ~feof(fileID)
try
varSingle(end+1) = fread(fileID, 1, '*single' ) ;
varInt16(end+1) = fread(fileID, 1, '*int16' ) ;
catch
disp('reached End Of File')
end
end
fclose(fileID);
That does work too, and if you were writing C code it would be more than ok. But in Matlab this is not the recommended way to go (your choice ultimately)
As promised, the 3 methods above will give you exactly what we wrote in the file at the beginning:
>> disp(varSingle)
-3.0000 -2.6000 -2.2000 -1.8000 -1.4000 -1.0000 -0.6000 -0.2000 0.2000 0.6000 1.0000
>> disp(varInt16)
-5 -4 -3 -2 -1 0 1 2 3 4 5
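fileID = fopen('MyBinaryFile');
kk = 1;
while ~feof(fileID)
    b4 = fread(fileID, 4);      % 4 bytes; fread advances the file position itself, so no fseek is needed
    b2 = fread(fileID, 2);      % the next 2 bytes
    if numel(b4) < 4 || numel(b2) < 2
        break                   % incomplete record: the end of the file was reached
    end
    bytes4(:,kk) = b4;          % one column per record
    bytes2(:,kk) = b2;
    kk = kk + 1;
end
fclose(fileID);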
The while loop condition is ~feof(fileID), i.e. "not at end of file", so the loop keeps running as long as the end of the file has not been reached.
I added the counter kk so that every record is stored instead of being overwritten on each loop iteration.
If you want to get the data without loops, there are MATLABish ways to do that:
%'Sizes'
T = 4; %'Time record size'
D = 2; %'Date record size'
R = T+D; %'Record size'
%'Open file'
f = fopen('MyBinaryFile', 'rb');
if f < 0
error('Could not open file.');
end;
%'Read the entire file at once, and close file'
buf = fread(f, Inf, '*uint8');
fclose(f);
%'Ignore the last unpadded bytes, and reshape by the size of 1 record'
buf = reshape(buf(1:R*fix(numel(buf)/R)), R, []);
%'Pinpoint the data'
time_bytes = buf( 1: T, :);
date_bytes = buf(T+1:T+D, :);
Essentially I am writing a Matlab file to change the 2nd, 3rd and 4th numbers in the line below "STR" and above "CON" in the text file (which is given below and is called '0.dat'). Currently, my Matlab code makes no changes to the text file.
Text File
pri
3
len 0.03
vic 5 5
MAT
1 147E9 0.3 0 4.9E9 8.5E9
LAY
1 0.000125 1 45
2 0.000125 1 0
3 0.000125 1 -45
4 0.000125 1 90
5 0.000125 1 45
WAL
1 1 2 3 4 5
PLATE
1 0.005 1 1
STR
1 32217.442335442 3010.34241024889 2689.48842888812
CON
1 2 1 2 3 1 3 4 1 4 5 1 5 6 1 6 7 1
ATT
1 901 7 901
LON
34
POI
123456
1 7
X 0.015
123456
2 6
X 0.00381966011250105 0.026180339887499
123456
3 5
X 0.000857864376269049 0.0291421356237309
123456
4
X 0
PLO
2 3
CRO
0
RES
INMOD=1
END
Matlab code:
impafp = importdata('0.dat','\t');
afp = impafp.textdata;
fileID = fopen('0.dat','r+');
for i = 1:length(afp)
if (strncmpi(afp{i},'con',3))
newNx = 100;
newNxy = 50;
newNy = 500;
myformat = '%0.6f %0.9f %0.9f %0.9f\n';
newData = [1 newNx newNxy newNy];
afp{i-1} = fprintf(fileID, myformat, newData);
fclose(fileID);
end
end
From the help for importdata:
For ASCII files and spreadsheets, importdata expects to find numeric
data in a rectangular form (that is, like a matrix). Text headers can
appear above or to the left of numeric data. To import ASCII files
with numeric characters anywhere else, including columns of character
data or formatted dates or times, use TEXTSCAN instead of import data.
Indeed, if you print out the value of afp, you'll see that it contains only the first line. The if condition was therefore never met, so the fprintf call was never reached and nothing was written to the file. You were also not closing the file ID when the if statement wasn't triggered.
Here is one way to do this with textscan (which is probably faster too):
% Read in data as strings using textscan
fid = fopen('0.dat','r');
afp = textscan(fid,'%s','Delimiter','');
fclose(fid);
isSTR = strncmpi(afp{:},'str',3); % True for all lines starting with STR
isCON = strncmpi(afp{:},'con',3); % True for all lines starting with CON
% Find indices to replace - create logical indices
% True if line before is STR and line after is CON
% Offset isSTR and isCON by 2 elements in opposite directions to align
% Use & to perform vectorized AND
% Pad with FALSE on either side to make the output the same length as afp{1}
datIDX = [false;(isSTR(1:end-2)&isCON(3:end));false];
% Overwrite data using sprintf
myformat = '%0.6f %0.9f %0.9f %0.9f';
newNx = 100;
newNxy = 50;
newNy = 500;
newData = [1 newNx newNxy newNy];
afp{1}{datIDX} = sprintf(myformat, newData); % Set only elements that pass test
% Overwrite old file using fprintf (or change filename to new one)
fid = fopen('0.dat','w');
fprintf(fid,'%s\r\n',afp{1}{1:end-1});
fprintf(fid,'%s',afp{1}{end}); % Avoid blank line at end
fclose(fid);
If you're unfamiliar with logical indexing, you might read this blog post and this.
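The shifted logical index used in the textscan answer is easier to follow on a tiny cell array. This is a sketch only: the six strings in txt are a shortened, made-up excerpt of the file, and 'replaced' stands in for the sprintf output.

```matlab
% A shortened stand-in for the lines of the file
txt = {'PLATE'; '1 0.005 1 1'; 'STR'; '1 2 3 4'; 'CON'; '1 2 1'};

isSTR = strncmpi(txt, 'str', 3);   % true where a line starts with STR
isCON = strncmpi(txt, 'con', 3);   % true where a line starts with CON

% Line k is selected when line k-1 is STR and line k+1 is CON
datIDX = [false; isSTR(1:end-2) & isCON(3:end); false];

% Parentheses plus a one-element cell work for any number of matches
txt(datIDX) = {'replaced'};
```

Assigning with txt(datIDX) = {...} rather than txt{datIDX} = ... is a small design choice: the brace form only works when exactly one line matches, while the parenthesis form also handles zero or several matches.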
I would recommend just reading the entire file in, finding which lines contain your "keywords", modifying specific lines, and then writing it back out to a file, which can have the same name or a different one.
file = fileread('file.dat');
parts = regexp(file,'\n','split');
startIndex = find(~cellfun('isempty',regexp(parts,'STR')));
endIndex = find(~cellfun('isempty',regexp(parts,'CON')));
ind2Change = startIndex+1:endIndex-1;
tempCell{1} = sprintf('%0.6f %0.9f %0.9f %0.9f',[1,100,50,500]);
parts(ind2Change) = deal(tempCell);
out = sprintf('%s\n',parts{:});
out = out(1:end-1);
fh = fopen('file2.dat','w');
fwrite(fh,out);
fclose(fh);
I've come across numerous ways to write matlab data to a .txt file but I am unsure which way would be best suited for my needs - I have two sets of data labelled 'x' and 'y' within which data simply runs down 1 column (A1....An) and I need a tab delimited .txt file made with the format:
Name X X Y
Test 2 2 5.5
Test 3 3 6.5
Test 4 4 7.5
etc...
Whereby I can have 2 identical columns of the X data, followed by the Y data. I also need to be able to input something for the 'Name' column which will copy itself down until the data in X/Y stops. I don't need any column headers in it i.e. 'X' 'Y' or 'Name' just the data itself.
What would be the best way to go about this?
Run this code example and you can check that it does what you want:
% Example data:
x = [1:5];
y = rand(1,5);
fileID = fopen('yourfile.txt','w');
for i = 1:length(x)
fprintf(fileID,'%s\t%d\t%d\t%f\n', 'Test', x(i),x(i),y(i));
end
fclose(fileID);
Opening the text file you will see something like:
Test 1 1 0.655741
Test 2 2 0.035712
Test 3 3 0.849129
Test 4 4 0.933993
Test 5 5 0.678735
If you want the value for string 'Test' to change in each row, simply pass in an array with those string values, similar to how the x and y variables are passed to the fprintf() statement.
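As a sketch of that variation, the labels can be kept in a cell array and indexed with braces inside the loop. The names in the cell array, the sample values and the temporary file name are all made up for illustration.

```matlab
x = 1:3;
y = [5.5 6.5 7.5];
names = {'Alpha', 'Beta', 'Gamma'};   % hypothetical row labels, one per data point

fname = [tempname '.txt'];            % temporary file, so nothing is overwritten
fileID = fopen(fname, 'w');
for i = 1:numel(x)
    % names{i} (braces) extracts the character vector for the %s field
    fprintf(fileID, '%s\t%d\t%d\t%f\n', names{i}, x(i), x(i), y(i));
end
fclose(fileID);

written = fileread(fname);            % read the result back for inspection
delete(fname);
```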
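One way is to first put everything into a single cell array with one column per output line (X and Y are assumed to be vectors of equal length):
DataSize = numel(X);
Name = repmat({'Test'}, [1 DataSize]); % A 1-by-n cell containing n 'Test' strings
C = [Name; num2cell(X(:)'); num2cell(X(:)'); num2cell(Y(:)')]; % 4-by-n cell: each column is one output line
Then use fprintf to write the cell into a file. C{:} expands column by column, so each line receives its name, X, X and Y in turn:
fid = fopen('data.txt', 'wt');
fprintf(fid, '%s\t%d\t%d\t%g\n', C{:}); % %g because Y is not an integer
fclose(fid);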
Hope it helps.
I'd like to write cell arrays of strings to csv files and overwrite parts of them with numerical data. I guess, for illustrative purposes we could use two matrices:
a = ones(5,5);
b = zeros(3,3);
I'd like to write a to a csv file and then overwrite specific fields of this file with b, resulting in:
1 1 1 1 1
1 1 1 1 1
1 1 0 0 0
1 1 0 0 0
1 1 0 0 0
Is there a way to do this in matlab? I tried
csvwrite('foo.csv', a);
dlmwrite('foo.csv', b, 'roffset', 2, 'coffset', 2)
but this would overwrite the entire file. I would be thankful for any suggestions.
Here's a solution based on Marcin's suggestion:
datsize = size(a);
precision = 6;
output_cell = reshape(cellstr(num2str(a(:),precision)), size(a));
for i = 3:datsize(1,1),
for j = 3:datsize(1,2),
output_textdata(i,j) = output_cell(i-2,j-2);
end
end
cell2csv('foo.csv', output_textdata);
While this produces the desired outcome, three issues remain. First, the 'precision' varies from cell to cell. Second, which is not a problem for the intended limited application of this script, this code would produce an error if matrix b partially overlapped with matrix a and partially exceeded its dimensions, e.g., if b had a size of 4x4 and were superimposed on a starting from a(3,3). Third, this workaround doesn't answer the more general question of whether only specific fields of a csv can be overwritten in matlab.
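For the purely numeric illustration from the question, a common workaround is to read the whole file back, patch the matrix in memory, and write it out again. This does not overwrite individual fields on disk, so it sidesteps rather than answers the third point. A minimal sketch, assuming the file contains only numbers; the temporary file name and the offsets r0 and c0 are illustrative (csvread and csvwrite are still available, although newer releases recommend readmatrix and writematrix):

```matlab
a = ones(5,5);
b = zeros(3,3);
r0 = 3;  c0 = 3;                      % top-left cell where b should land (assumed)

fname = [tempname '.csv'];            % temporary file for the demonstration
csvwrite(fname, a);                   % original file

M = csvread(fname);                   % read everything back
M(r0:r0+size(b,1)-1, c0:c0+size(b,2)-1) = b;   % patch in memory
csvwrite(fname, M);                   % rewrite the whole file

result = csvread(fname);
delete(fname);
```

If b extends past the edge of a, the indexed assignment simply grows M and pads the new cells with zeros instead of raising an error.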