I have been trying in vain for days to do one seemingly simple thing--I want to read data from a .txt file that looks like this:
0.221351321
0.151351321
0.235165165
8.2254546 E-7
into Matlab. I've been able to load the data in the .txt file as a column vector using the fscanf command, like so:
U=fscanf(FileID, '%e')
provided that I go through the file first and remove the space before the 'E' wherever scientific notation occurs in the data set.
Since I have to generate a large number of such sets, it would be impractical to have to do a search-and-replace for every .txt file.
Is there a way for matlab to read the data as it appears, as in the above example (with the space preceding 'E'), and put it into a column vector?
For anyone who knows PARI-GP, an alternate fix would be to have the output devoid of spaces in the first place--but so far I haven't found a way to erase the space before 'E' in scientific notation, and I can't predict if a number in scientific notation will appear or not in the data set.
Thank you!
Thank you all for your help, I have found a solution. There is a way to eliminate the space from PARI-GP, so that the output .txt file has no spaces to begin with. I had the output set to "prettymatrix". One needs to enter the following:
? \o{0}
to change the output to "Raw," which eliminates the space before the "E" in scientific notation.
Thanks again for your help.
A simple way, may not be the best, is to read line by line, remove the space and convert back to floating point number.
For example,
x = []
tline = fgetl(FileID);
while ischar(tline)
x = [x str2num(tline(find(~isspace(tline))))]
tline = fgetl(FileID);
end
One liner:
data = str2double(strsplit(strrep(fileread('filename.txt'),' ',''), '\n'));
strrep removes all the spaces, strsplit takes each line as a separate string, and str2double coverts the strings to numbers.
Related
Ok, so I'm struggling with the most mundane of things I have a space delimited text file with a header in the first row and a row per observation and I'd like to open that file in matlab. If I do this in R I have no problem at all, it'll create the most basic matrix and voila!
But MATLAB seems to be annoying with this...
Example of the text file:
"picFile" "subjCode" "gender"
"train_1" 504 "m"
etc.
Can I get something like a matrix at all? I would then like to have MATLAB pull out some data by doing data(1,2) for example.
What would be the simplest way to do this?
It seems like having to write a loop using f-type functions is just a waste of time...
If you have a sufficiently new version of Matlab (R2013b+, I believe), you can use readtable, which is very much like how R does it:
T = readtable('data.txt','Delimiter',' ')
There are many functions for manipulating tables and converting back and forth between them and other data types such as cell arrays.
There are some other options in the data import and export section of the Statistics toolbox that should work in older versions of Matlab:
tblread: output in terms of separate variables for strings and numbers
caseread: output in terms of a char array
tdfread: output in terms of a struct
Alternatively, textscan should be able to accomplish what you need and probably will be the fastest:
fid = fopen('data.txt');
header = textscan(fid,'%s',3); % Optionally save header names
C = textscan(fid,'%s%d%s','HeaderLines',1); % Read data skipping header
fclose(fid); % Don't forget to close file
C{:}
Found a way to solve my problem.
Because I don't have the latest version of MATLAB and cannot use readable which would be the preferred option I ended up doing using textread and specifying the format of each column.
Tedious but maybe the "simplest" way I could find:
[picFile subCode gender]=textread('data.txt', '%s %f %s', 'headerlines',1);
T=[picFile(:) subCode(:) gender(:)]
The textscan solution by #horchler seems pretty similar. Thanks!
I have decided to use memmapfile because my data (typically 30Gb to 60Gb) is too big to fit in a computer's memory.
My data files consist two columns of data that correspond to the outputs of two sensors and I have them in both .bin and .txt formats.
m=memmapfile('G:\E-Stress Research\Data\2013-12-18\LD101_3\EPS/LD101_3.bin','format','int32')
m.data(1)
I used the above code to memory map my data to a variable "m" but I have no idea what data format to use (int8', 'int16', 'int32', 'int64','uint8', 'uint16', 'uint32', 'uint64', 'single', and 'double').
In fact I tried all of the data formats listed that MATLAB supports, but when I used the m.data(index number) I never get a pair of numbers (2 columns of data) which is what I expected, also the number will be different depending on the format I used.
If anyone has experience with memmapfile please help me.
Here are some smaller versions of my data files so people can understand how my data is structured:
cheers
James
memmapfile is designed for reading binary files, that's why you are having trouble with your text file. The data in there is characters, so you'll have to read them as characters and then parse them into numbers. More on that below.
The binary file appears to contain more than just a stream of floating point values written in binary format. I see identifiers (strings) and other things in the file as well. Your only hope of reading that is to contact the manufacturer of the device that created the binary file and ask them about how to read in such files. There'll probably be an SDK, or at least a description of the format. You might want to look into this as the floating point numbers in your text file might be truncated, i.e., you have lost precision compared to directly reading the binary representation of the floats.
Ok, so how to read your file with memmapfile? This post provides some hints.
So first we open your file as 'uint8' (note there is no 'char' option, so as a workaround we read the content of the file into a datatype of the same size):
m = memmapfile('RTL5_57.txt','Format','uint8'); % uint8 is default, you could leave that off
We can render the data read in as uint8 as characters by casting it to char:
c = char(m.Data(1:19)).' % read the first three lines. NB: transpose just for getting nice output, don't use it in your code
c =
0.398516 0.063440
0.399611 0.063284
0.398985 0.061253
As each line in your file has the same length (2*8 chars for the numbers, 1 tab and 2 chars for newline = 19 chars), we can read N lines from the file by reading N*19 values. So m.Data(1:19) gets you the first line, m.Data(20:38), the second line, and m.Data(20:57) the second and third lines. Read as much as you want at once.
Then we'll have to parse the read-in data into floating point numbers:
f = sscanf(c,'%f')
f =
0.3985
0.0634
0.3996
0.0633
0.3990
0.0613
All that's left now is to reshape them into your two column format
d = reshape(f,2,[]).'
d =
0.3985 0.0634
0.3996 0.0633
0.3990 0.0613
Easier ways than using memmapfile:
You don't need to use memmapfile to solve your problem, and I think it makes things more complicated. You can simply use fopen followed by fread:
fid = fopen('RTL5_57.txt');
c = fread(fid,Nlines*19,'*char');
% now sscanf and reshape as above
% NB: one can read the values the text file directly with f = fscanf(fid,'%f',Nlines*19).
% However, in testing, I have found calling fread followed by sscanf to be faster
% which will make a significant difference when reading such large files.
Using this you can read Nlines pairs of values at a time, process them and simply call fread again to read the next Nlines. fread remembers where it is in the file (as does fscanf), so simply use same call to get next lines. Its thus easy to write a loop to process the whole file, testing with feof(fid) if you are at the end of the file.
An even easier way is suggested here: use textscan. To slightly adapt their example code:
Nlines = 10000;
% describe the format of the data
% for more information, see the textscan reference page
format = '%f\t%f';
fid = fopen('RTL5_57.txt');
while ~feof(fid)
C = textscan(fid, format, Nlines, 'CollectOutput', true);
d = C{1}; % immediately clear C at this point if you need the memory!
% process d
end
fclose(fid);
Note again however that the fread followed by sscanf will be fastest. Note however that the fread method would die as soon as there is one line in the text file that doesn't exactly match your format. textscan is forgiving of whitespace changes on the other hand and thus more robust.
I want to have a look at a large matrix in MATLAB such that all columns are printed in one single line rather than spread out over several lines.
Is such thing possible? That would be great to know.
Try disp(matrixName(:)). The matrixName(:) command turns your matrix into a long vector in column-major order, so it basically just shows you the first column, followed by the second, the third, etc.
If that does not do the trick, you could look into the doprint command.
EDIT: You could also save the matrix to a text file and view the file. You do this like so:
fileID = fopen('C:/path/to/file/myMatrix.txt');
fprintf(fileID, formatString, myMat);
fclose(fileID);
fopen documentation
fprintf documentation
Additional information can be found here
The formatString variable in the above tells fprintf how the data should be displayed. If you have a really big matrix with tons of columns, where all of the values are floats, the easiest way to create this string is to use something like:
formatString = strcat(repmat('%f ', 1, size(myMat, 2)), '\n');
This will create a long string specifying that each element in your matrix is a float, and where it goes, and then cap it off with a line feed so that the next row of your matrix starts on the next line.
Suppress your original matrix with a semicolon and then use the "disp" command to show your matrix however you want.
for i = 1 : length(matrix(1,:))
disp(matrix(:,i))
end
Some "obvious" answers:
You can choose a smaller font - then more values will fit in a line
You can play with the format command to have less digits displayed
(my favourite) Use the variable viewer - via "open selection" or Ctrl-D when the name of a variable is highlighted. This will show your matrix in an excel-like table.
I am trying to import some data from a .csv file, I have search for solutions but no one seems to solve my problem. My .csv is just one column of numbers, but when I try to read it with csvread('myfile.csv') it says that it cannot convert from string. When I double click on the .csv file in matlab I can see that every number from the .csv has this aspect:
"996.47"
So every number is between double commas, and whatever I do I can not get just the number between them. I am trying also opening the file and with textscan but I find no way. Thank you very much in advance.
You can try this workaround:
V = dlmread('myfile.csv','"');
v = V(:,2)
According to your description you have one column of values formatted like "996.47". The first line creates a matrix where columns are delimited by '"' - you get three columns where the middle one is filled with your values. The second line extracts the middle column.
what about using
importdata('yourfile.csv')
It should work if you are only interested in data.
If you want a more generic solution that doesn't need to deal with indexing, you can use MATLAB's built-in function importdata.
x = importdata('yourfile.csv'); % reads in the file as text surrounded by double quotes
x = cellfun(#str2num,strrep(v,'"','')); % removes the double quotes and changes text to numbers
My raw CSV file looks like the 1st pic. And I wants to use Matlab read it into the format as the 2rd pic. I have over 1000 the same kind of CSV files, it will be painful if I do it by copy/paste. How can I do this? Any examples?
raw data:
output data:
First thing to realize is that a .csv file has a very simple format. Your above file is actually a plain text file with the following text on each line:
id,A001
height
a1,a2,a3
3,4,5
3,4,5
6,7,5
weight
a1,a2,a3
4,4,5
5,4,6
i6,7,5
So it is not all that hard for you to write your own parser in Matlab. You want to use commands like
fid = fopen('filename.csv','r');
L = fgetl(fid); % get a text line from the file
commas = find(L==','); % find where the commas are in the line
n1 = str2num(L(1:commas(1)-1); % convert the first comma-delimited number on line L
fidout - fopen('myfile.csv','w');
Lout = [ L(commas(2)+1:commas(3)-1) ', a1, a1'];
fwrite(fidout,Lout); % write a line out to the output file
fclose all; % close all open files.
It will seem slow at first reading the various values in to various variables, and then arranging them to write out the way you want them written out to your output file. But once you get rolling it will go pretty fast and you will find yourself with a pretty good understanding of what is in files, and you will know first hand what is involved in writing something like texscan.m or csvwrite.m and so on.
Good luck!