I have a measurement device PCE-VDL, which gives me measurements in following CSV format below, which I need to import to OCTAVE for further investigation.
Especially I need to import last 3 columns with xyz acceleration data.
The file is in CSV format with delimiter of semicolon ";".
I have tried:
A_1 = importdata ("file.csv", ";", 3);
but have recieved
error: missing_idx(10): out of bound 9
The CSV file looks like this:
#PCE-VDL X - TableView series
#2020.16.11
#Date;Time;Duration [s];t [°C];RH [%];p [mbar];aX [g];aY [g];aZ [g];
2020.28.10;16:16:32:0000;00:000;;;;0,0195;-0,0547;1,0039;
2020.28.10;16:16:32:0052;00:005;;;;0,0898;-0,0273;0,8789;
2020.28.10;16:16:32:0104;00:010;;;;0,0977;-0,0313;0,9336;
2020.28.10;16:16:32:0157;00:015;;;;0,1016;-0,0273;0,9297;
The numbers in last 3 columns have also decimal coma and not decimal point. So there probably should be done also some conversion.
Thank you very much for any help.
Regards
EDIT: 18.11.2020
Thanks for help. I have tried now following:
A_1_str = fileread ("file.csv");
A_1_str_m = strrep (A_1_str, ".", "-");
A_1_str_m = strrep (A_1_str_m, ",", ".");
save "A_1_str_m.csv" A_1_str_m;
A_1 = importdata ("A_1_str_m.csv", ";", 8);
and still receive error: file_content(140): out of bound 139
There is probably some problem with time format in first columns, which I do not want to read. I just need last three columns.
After my conversion, the file looks like this:
# Created by Octave 5.1.0, Wed Nov 18 21:40:52 2020 CET <zdenek#ASUS-F5V>
# name: A_1_str_m
# type: sq_string
# elements: 1
# length: 7849
#PCE-VDL X - TableView series
#2020-16-11
#Date;Time;Duration [s];t [°C];RH [%];p [mbar];aX [g];aY [g];aZ [g];
2020-28-10;16:16:32:0000;00:000;;;;0.0195;-0.0547;1.0039;
2020-28-10;16:16:32:0052;00:005;;;;0.0898;-0.0273;0.8789;
2020-28-10;16:16:32:0104;00:010;;;;0.0977;-0.0313;0.9336;
Thanks for support!
You can first read the data with fileread, which stores the data as a string. Then you can manipulate the string like this:
new_string = strrep(string, ",", ".");
strrep replaces all occurrences of a pattern within a string. Afterwards you save this data as a separate file or you overwrite the existing file with the manipulated data. When this is done you proceed as you have tried before.
EDIT: 19.11.2020
To avoid the additional heading lines in the new file, you can save it like this:
fid = fopen("A_1_str_m.csv", "w");
fputs(fid, A_1_str_m);
fclose(fid);
fputs will just write the string to the file.
The you can read the new file with dlmread.
A1_buf = dlmread("A_1_str_m.csv", ";");
A1_buf = real(A1); # get the real value of the complex number
A1_buf(1:3, :) = []; # remove the headlines
A1 = A1_buf(:, end-3:end-1); # get only the the 3 columns you're looking for
This will give you the three columns your looking for. But the date and time data will be ignored.
EDIT 20.11.2020
Replaced abs with real, so the sign of the value will be kept.
Use csv2cell from the io package.
I want to use fscanf for reading a text file containing 4 rows with an unknown number of columns. The newline is represented by two consecutive spaces.
It was suggested that I pass : as the sizeA parameter but it doesn't work.
How can I read in my data?
update: The file format is
String1 String2 String3
10 20 30
a b c
1 2 3
I have to fill 4 arrays, one for each row.
See if this will work for your application.
fid1=fopen('test.txt');
i=1;
check=0;
while check~=1
str=fscanf(fid1,'%s',1);
if strcmp(str,'')~=1;
string(i)={str};
end
i=i+1;
check=strcmp(str,'');
end
fclose(fid1);
X=reshape(string,[],4);
ar1=X(:,1)
ar2=X(:,2)
ar3=X(:,3)
ar4=X(:,4)
Once you have 'ar1','ar2','ar3','ar4' you can parse them however you want.
I have found a solution, i don't know if it is the only one but it works fine:
A=fscanf(fid,'%[^\n] *\n')
B=sscanf(A,'%c ')
Z=fscanf(fid,'%[^\n] *\n')
C=sscanf(Z,'%d')
....
You could use
rawText = getl(fid);
lines = regexp(thisLine,' ','split);
tokens = {};
for ix = 1:numel(lines)
tokens{end+1} = regexp(lines{ix},' ','split'};
end
This will give you a cell array of strings having the row and column shape or your original data.
To read an arbitrary line of text then break it up according the the formating information you have available. My example uses a single space character.
This uses regular expressions to define the separator. Regular expressions powerful but too complex to describe here. See the MATLAB help for regexp and regular expressions.
I have the following sample from a CSV file. Structure is:
Date ,Time(Hr:Min:S:mS), Value
2015:08:20,08:20:19:123 , 0.05234
2015:08:20,08:20:19:456 , 0.06234
I then would like to read this into a matrix in MATLAB.
Attempt :
Matrix = csvread('file_name.csv');
Also tried an attempt formatting the string.
fmt = %u:%u:%u %u:%u:%u:%u %f
Matrix = csvread('file_name.csv',fmt);
The problem is when the file is read the format is wrong and displays it differently.
Any help or advice given would be greatly appreciated!
EDIT
When using #Adriaan answer the result is
2015 -11 -9
8 -17 -1
So it seems that MATLAB thinks the '-' is the delimiter(separator)
Matrix = csvread('file_name.csv',1,0);
csread does not support a format specifier. Just enter the number of header rows (I took it to be one, as per example), and number of header columns, 0.
You file, however, contains non-numeric data. Thus import it with importdata:
data = importdata('file_name.csv')
This will get you a structure, data with two fields: data.data contains the numeric data, i.e. a vector containing your value. data.textdata is a cell containing the rest of the data, you need the first two column and extract the numerics from it, i.e.
for ii = 2:size(data.textdata,1)
tmp1 = data.textdata{ii,1};
Date(ii,1) = datenum(tmp1,'YYYY:MM:DD');
tmp2 = data.textdata{ii,2};
Date(ii,2) = datenum(tmp2,'HH:MM:SS:FFF');
end
Thanks to #Excaza it turns out milliseconds are supported.
I am importing a CSV file that is comma delimited into MATLAB. Each column has quotes around anything I want to consider as text and then a comma.
I am using read_mixed_csv function from the answer to this question to read in the data as a cell: Import CSV file with mixed data types
thisdata = read_mixed_csv(fname, ','); % Reads in the CSV file
thisdata = regexprep(thisdata, '^"|"$','');
However, since a few of my columns look like this:
"FAIRHOPE, Alabama"
"FAIRHOPE HIGH SCHOOL, FAIRHOPE, ALABAMA"
"Daphne-Fairhope-Foley, AL"
MATLAB places everything after a comma into a new column. So
"Daphne-Fairhope-Foley, AL"
Becomes two columns
"Daphne-Fairhope-Foley
AL"
How can I get MATLAB to read in a mixed csv file and not only consider a comma as a delimiter, but also consider the quotation marks? Is there a more automated way of doing it than textscan? If textscan is an option, what would that look like?
Here is a sample of the data I'm trying to read in with the header included:
"State Code","County Code","Site Num","Parameter Code","POC","Latitude","Longitude","Datum","Parameter Name","Sample Duration","Pollutant Standard","Date Local","Units of Measure","Event Type","Observation Count","Observation Percent","Arithmetic Mean","1st Max Value","1st Max Hour","AQI","Method Name","Local Site Name","Address","State Name","County Name","City Name","CBSA Name","Date of Last Change"
"01","003","0010","88101",1,30.498001,-87.881412,"NAD83","PM2.5 - Local Conditions","24 HOUR","PM25 24-hour 2006","2013-01-01","Micrograms/cubic meter (LC)","None",1,100.0,7.3,7.3,0,30,"R & P Model 2025 PM2.5 Sequential w/WINS - GRAVIMETRIC","FAIRHOPE, Alabama","FAIRHOPE HIGH SCHOOL, FAIRHOPE, ALABAMA","Alabama","Baldwin","Fairhope","Daphne-Fairhope-Foley, AL","2014-02-11"
"01","003","0010","88101",1,30.498001,-87.881412,"NAD83","PM2.5 - Local Conditions","24 HOUR","PM25 24-hour 2006","2013-01-04","Micrograms/cubic meter (LC)","None",1,100.0,7.6,7.6,0,32,"R & P Model 2025 PM2.5 Sequential w/WINS - GRAVIMETRIC","FAIRHOPE, Alabama","FAIRHOPE HIGH SCHOOL, FAIRHOPE, ALABAMA","Alabama","Baldwin","Fairhope","Daphne-Fairhope-Foley, AL","2014-02-11"
"01","003","0010","88101",1,30.498001,-87.881412,"NAD83","PM2.5 - Local Conditions","24 HOUR","PM25 24-hour 2006","2013-01-07","Micrograms/cubic meter (LC)","None",1,100.0,8.6,8.6,0,36,"R & P Model 2025 PM2.5 Sequential w/WINS - GRAVIMETRIC","FAIRHOPE, Alabama","FAIRHOPE HIGH SCHOOL, FAIRHOPE, ALABAMA","Alabama","Baldwin","Fairhope","Daphne-Fairhope-Foley, AL","2014-02-11"
"01","003","0010","88101",1,30.498001,-87.881412,"NAD83","PM2.5 - Local Conditions","24 HOUR","PM25 24-hour 2006","2013-01-10","Micrograms/cubic meter (LC)","None",1,100.0,7,7,0,29,"R & P Model 2025 PM2.5 Sequential w/WINS - GRAVIMETRIC","FAIRHOPE, Alabama","FAIRHOPE HIGH SCHOOL, FAIRHOPE, ALABAMA","Alabama","Baldwin","Fairhope","Daphne-Fairhope-Foley, AL","2014-02-11"
*Note: Converting the CSV file to a tab delimited file makes it easier for MATLAB to deal with and circumvents this problem.
Having a text qualifier (like ") is a little tricky, but the following might work if you ensure that each row of your table will have the same number of columns (and probably no empty ones).
Anything not within the text qualifier must be convertible to a number.
function C = csvmixed(eachLine,delim,textQualifier)
% Outputs cell containing mixed string and numeric data given a delimiter (',')
% and a text qualifier ('"'). Each line of the delimited file must be loaded into
% the cell array eachLine, and each line must have the same number of columns.
%
% Example:
% fid = fopen('testcsv.txt','r');
% eachLine = textscan(fid,'%s','Delimiter','\n'); fclose(fid);
% C = csvmixed(eachLine{1},',','"')
assert(ischar(delim) && numel(delim)==1);
assert(ischar(textQualifier) && numel(textQualifier)==1);
% find strings, as specified by the input qualifier
patternStr = sprintf('"([^"]*)"%c?',delim);
patternStr = strrep(patternStr,'"',textQualifier);
Cstr = regexp(eachLine,patternStr,'tokens');
% find numeric data
patternNum = sprintf('(?<=(,|^))[^%c,a-zA-Z]*(?=(,|$))',textQualifier);
patternNum = strrep(patternNum,',',delim);
Cnum = regexp(eachLine,patternNum,'match','emptymatch');
numCols = cellfun(#numel,Cstr) + cellfun(#numel,Cnum);
assert(nnz(diff(numCols))==0,'Number of columns not consistent.')
% get string extents (begin, start) indexes for each string
strExtents = regexp(eachLine,patternStr,'tokenExtents');
% deal out parsed data for each line
C = cell(numel(eachLine),numCols(1));
for ii = 1:numel(eachLine),
strBounds = vertcat(strExtents{ii}{:});
delimLocs = getDelimLocs(eachLine{ii},strBounds,delim);
strCellMap = getCellMap(strBounds,delimLocs);
C(ii,strCellMap) = [Cstr{ii}{:}]; % TODO: preallocate
C(ii,~strCellMap) = num2cell(str2double(Cnum{ii})); % all else must be numeric
end
end
function delimLocs = getDelimLocs(lineText,solidBounds,delim)
delimCharLocs = strfind(lineText,delim);
delimLocs = delimCharLocs(~any(bsxfun(#ge,delimCharLocs,solidBounds(:,1)) & ...
bsxfun(#le,delimCharLocs,solidBounds(:,2)),1));
end
function cellMap = getCellMap(typeBounds,delimLocs)
cellMap = any(bsxfun(#gt,typeBounds(:,1),[0 delimLocs]) & ...
bsxfun(#lt,typeBounds(:,1),[delimLocs Inf]), 1);
end
UPDATE: Fix small typos in getDelimLocs. Add preallocation of cell array.
Use the file exchange code replaceinfile to replace the strings that have commas in them with a period instead.
Use read_mixed_csv from Import CSV file with mixed data types to read in the file.
Remove the extra quotes from the strings that are still left.
replaceinfile(', ', '. ', fname); % Replace commas that was inside quotes and not meant to be separated as periods so they don't show up as a new column
thisdata = read_mixed_csv(fname, ','); % Reads in the CSV file (\t for tab)
thisdata = regexprep(thisdata, '^"|"$',''); % Remove quotes from file and only keep the first 28 columns (last two columns are empty)
For replaceinfile.m function:
For running the code on Linux, change the first line of the section on Perl to
perlCmd = sprintf('"%s"', '/usr/bin/perl');
I had a similar question. but what i am trying now is to read files in .txt format into MATLAB. My problem is with the headers. Many times due to errors the system rewrites the headers in the middle of file and then MATLAB cannot read the file. IS there a way to skip it? I know i can skip reading some characters if i know what the character is.
here is the code i am using.
[c,pathc]=uigetfile({'*.txt'},'Select the data','V:\data');
file=[pathc c];
data= dlmread(file, ',', 1,4);
this way i let the user pick the file. My files are huge typically [ 86400 125 ]
so naturally it has 125 header fields or more depends on files.
Thanks
Because the files are so big i cannot copy , but its in format like
day time col1 col2 col3 col4 ...............................
2/3/2010 0:10 3.4 4.5 5.6 4.4 ...............................
..................................................................
..................................................................
and so on
With DLMREAD you can read only numeric data. It will not read date and time, as your first two columns contain. If other data are all numeric you can tell DLMREAD to skip first row and 2 columns on the right:
data = dlmread(file, ' ', 1,2);
To import also day and time you can use IMPORTDATA instead of DLMREAD:
A = importdata(file, ' ', 1);
dt = datenum(A.textdata(2:end,1),'mm/dd/yyyy');
tm = datenum(A.textdata(2:end,2),'HH:MM');
data = A.data;
The date and time will be converted to serial numbers. You can convert them back with DATESTR function.
It turns out that you can still use textscan. Except that you read everything as string. Then, you attempt to convert to double. 'str2double' returns NaN for strings, and since headers are all strings, you can identify header rows as rows with all NaNs.
For example:
%# find and open file
[c,pathc]=uigetfile({'*.txt'},'Select the data','V:\data');
file=[pathc c];
fid = fopen(file);
%# read all text
strData = textscan(fid,'%s%s%s%s%s%s','Delimiter',',');
%# close the file again
fclose(fid);
%# catenate, b/c textscan returns a column of cells for each column in the data
strData = cat(2,strData{:});
%# convert cols 3:6 to double
doubleData = str2double(strData(:,3:end));
%# find header rows. headerRows is a logical array
headerRowsL = all(isnan(doubleData),2);
%# since I guess you know what the headers are, you can just remove the header rows
dateAndTimeCell = strData(~headerRowsL,1:2);
dataArray = doubleData(~headerRowsL,:);
%# and you're ready to start working with your data