Here is my code. I have 59 CSV files in my directory, I have 11 variables in each, with first one date in quarter format. I need to tell MATLAB that first column has date format, because the code below imports it as a string variable.
ext = '.csv';
countries = dir(['*', ext]);
countryFiles = {countries.name};
countriesNames = strrep(countryFiles, ext, '');
%%
a = cell(length(countriesNames), 2);
a(:,1) = countriesNames(:)';
a(:,2) = cellfun(@(file) readtable(file, 'TreatAsEmpty', '.','Format'?), countryFiles(:), 'uni', 0);
As far as I understand, this is the 'Format' option of readtable, combined with datenum... However, I can't find useful information on it in the help files, and whatever I try I get errors... Below is how my data looks for each country. The data covers 1980Q1-2014Q4.
So I have a 59x2 cell array whose first column holds the country names and whose second column contains 59 tables of size 140x11. First, I do not know how to access the variable names of a table stored inside the cell array. Second, if I assign a column of that table to x and then call datenum(x,'YYYYQQ'), I get "Input expected to be a cell array, was table instead."
So I need to extract the tables from the cell array (a{1,2}, a{2,2}, a{3,2}, ...) and then convert those tables to cell arrays using the table2cell function. How can I do this in a loop for each country and then write the result back to the cell array?
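A minimal sketch of one way to do the loop, not a definitive answer. It assumes the first column was imported as text such as '1980Q1' (the actual layout is not shown in the question), and it uses a small made-up table with hypothetical variable names Date and GDP in place of the real files. Instead of relying on a quarter format string, it splits each entry into year and quarter and builds a datetime for the first day of that quarter:

```matlab
% Hypothetical stand-in for the 59x2 cell array "a" from the question
T = table({'1980Q1'; '1980Q2'; '2014Q4'}, [1.0; 1.1; 2.5], ...
          'VariableNames', {'Date', 'GDP'});
a = {'CountryA', T; 'CountryB', T};

for k = 1:size(a, 1)
    tbl = a{k, 2};                               % braces return the table itself
    firstVar = tbl.Properties.VariableNames{1};  % name of the first column
    qtxt = tbl.(firstVar);                       % cell array of text such as '1980Q1'
    % Split each entry into a [year quarter] row
    yq = cellfun(@(s) sscanf(s, '%dQ%d').', qtxt, 'UniformOutput', false);
    yq = vertcat(yq{:});
    % First day of each quarter: Q1 -> January, Q2 -> April, ...
    tbl.(firstVar) = datetime(yq(:, 1), 3 * yq(:, 2) - 2, 1);
    a{k, 2} = tbl;                               % write the converted table back
end
```

The variable names of any stored table are available as a{k,2}.Properties.VariableNames, and a column is reached with dot indexing on the table, so table2cell is not needed for this conversion.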
I am trying to read in a csv-file that contains daily data on EUR/USD exchange rates including the dates specifying year, month and day. The problem is that using readtable(filename) puts single quotes around all table-entries and therefore hinders me using the data at all.
Detect import options:
opts = detectImportOptions('EUR_USD Historische Data.csv');
Read in the data:
EUR_USD = readtable('EUR_USD Historische Data.csv');
Extract the dates and convert them to a datetime variable:
dt = EUR_USD(:,1);
dates = datetime(dt,'InputFormat','yyyyMMdd');
% Does not work because of single quotes
I was able to extract the closing prices and make them workable, but I am not sure if this is an elegant way of doing so:
closing_prices = str2double(table2array(EUR_USD(:,5)));
Ultimately the goal is to make the data workable. I need to compare two columns with datetime-variables and if dates do not match between the two columns I need to remove that entry such that in the end both columns match.
This is the vector with dates:
[Image: dates vector, wrong]
I need it to look like this:
[Image: dates vector, correct]
I think all you need to do is remove the ' character in order to read the data into datetime correctly. Look at the following example:
%stringz plays the role of dt here: just the text data, wrapped in ' chars
T = table;
T.stringz = string(['''20200102'''; '''20200103'''; '''20200106''']);
stringz = T.stringz;
strmat = strings(size(stringz)); %preallocate the output
%Run the for loop to remove the ' chars
for i = 1:length(stringz)
    strval = char(stringz(i,1));
    strval = strval(2:end-1);
    strmat(i,1) = string(strval);
end
%Then load data into datetime after this for loop
dates = datetime(strmat,'InputFormat','yyyyMMdd');
strmat returns a 3x1 string array with no ' characters on the outside of each string.
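Another possibility worth checking, sketched here under assumptions: MATLAB displays text stored in cell arrays with single quotes around it, so the quotes may only be part of the display and not part of the data. In that case the datetime call in the question would fail because EUR_USD(:,1) with parentheses returns a table, whereas curly braces return the column contents. The table below is a made-up stand-in for the readtable output, with hypothetical variable names Date and Close; the second date vector is also invented to illustrate matching two date columns with intersect:

```matlab
% Hypothetical stand-in for the table returned by readtable
EUR_USD = table({'20200102'; '20200103'; '20200106'}, ...
                {'1.1172'; '1.1160'; '1.1195'}, ...
                'VariableNames', {'Date', 'Close'});

dt = EUR_USD{:, 1};                              % braces: cell array of text, not a table
dates = datetime(dt, 'InputFormat', 'yyyyMMdd'); % now the conversion works
closing_prices = str2double(EUR_USD.Close);      % dot indexing also returns the contents

% Keep only the dates that appear in both series (second series is made up)
otherDates = datetime({'20200103'; '20200106'; '20200107'}, 'InputFormat', 'yyyyMMdd');
[commonDates, idxA, idxB] = intersect(dates, otherDates);
matchedPrices = closing_prices(idxA);            % prices on the common dates only
```

If the quotes really are stored in the file, the loop in the answer above still applies.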
I have a deco.csv file and I only want to extract B1 to U1 (20 columns of the first row), i.e. Deco_0001 to Deco_0020.
I first make a pre-allocation:
names = strings(20,1);
and what I want is when calling S(1), it gives Deco_0001; when calling S(20), it gives Deco_0020.
I have read through textscan but I do not know how to specify the range is first row and running from column 2 to column 21 of the csv file.
Also, I want to save the names individually, but what I have tried just saves the first line in a single cell:
fid=fopen('deco.csv');
C=textscan(fid, '%s',1);
fclose(fid);
Thanks!
It's not very elegant, but this should work for you:
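fid = fopen('deco.csv');
% Read one row of 21 comma-separated text fields (columns A to U)
C = textscan(fid, repmat('%s',1,21), 1, 'Delimiter', ',');
fclose(fid);
S = cell(20,1);
for ii = 1:20
    S{ii} = C{ii+1}{1}; % skip column A and unwrap the 1x1 cell returned by textscan
end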
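As an alternative sketch, the header line can be read with fgetl and split on commas, which avoids writing out one %s per column. The file written below is a hypothetical stand-in for deco.csv (the real file's first column name and data are not shown in the question), created only so the example is self-contained:

```matlab
% Write a small hypothetical deco.csv-like file so the sketch runs on its own
fname = [tempname '.csv'];
decoNames = arrayfun(@(k) sprintf('Deco_%04d', k), 1:20, 'UniformOutput', false);
fid = fopen(fname, 'wt');
fprintf(fid, 'ID,%s\n', strjoin(decoNames, ','));           % header row
fprintf(fid, '1,%s\n', strjoin(repmat({'0'}, 1, 20), ','));  % one dummy data row
fclose(fid);

% Read only the first line and split it into fields
fid = fopen(fname, 'rt');
headerLine = fgetl(fid);
fclose(fid);
fields = strsplit(headerLine, ',', 'CollapseDelimiters', false);
S = string(fields(2:21));   % columns 2 to 21, i.e. B to U

delete(fname);              % remove the temporary file
```

With this, S(1) is Deco_0001 and S(20) is Deco_0020. CollapseDelimiters is set to false so that an empty header field would not shift the column positions.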
I have the following sample from a CSV file. Structure is:
Date ,Time(Hr:Min:S:mS), Value
2015:08:20,08:20:19:123 , 0.05234
2015:08:20,08:20:19:456 , 0.06234
I then would like to read this into a matrix in MATLAB.
Attempt :
Matrix = csvread('file_name.csv');
Also tried an attempt formatting the string.
fmt = '%u:%u:%u %u:%u:%u:%u %f';
Matrix = csvread('file_name.csv',fmt);
The problem is when the file is read the format is wrong and displays it differently.
Any help or advice given would be greatly appreciated!
EDIT
When using @Adriaan's answer, the result is
2015 -11 -9
8 -17 -1
So it seems that MATLAB thinks the '-' is the delimiter(separator)
Matrix = csvread('file_name.csv',1,0);
csvread does not support a format specifier. Just pass the number of header rows to skip (I took it to be one, as per the example) and the number of columns to skip, 0.
Your file, however, contains non-numeric data. Thus import it with importdata:
data = importdata('file_name.csv')
This returns a structure, data, with two fields: data.data contains the numeric data, i.e. a vector containing your values. data.textdata is a cell array containing the rest of the data; you need its first two columns, from which you extract the numerics, i.e.
for ii = 2:size(data.textdata,1) % row 1 holds the header text
    tmp1 = data.textdata{ii,1};
    Date(ii-1,1) = datenum(tmp1,'yyyy:mm:dd'); % lowercase mm = month
    tmp2 = data.textdata{ii,2};
    Date(ii-1,2) = datenum(tmp2,'HH:MM:SS:FFF'); % uppercase MM = minutes
end
Thanks to @Excaza it turns out milliseconds are supported.
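If a newer MATLAB release with datetime is available, the two text columns can also be joined and converted in one call. This is only a sketch: the two cell arrays below are typed in by hand from the sample rows in the question and stand in for columns 1 and 2 of data.textdata (without the header row). Note that datetime uses a different format syntax from datenum: MM is the month, mm the minutes and SSS the milliseconds.

```matlab
% Hand-typed stand-ins for the date and time text columns (trailing space as in the file)
dateTxt = {'2015:08:20'; '2015:08:20'};
timeTxt = {'08:20:19:123 '; '08:20:19:456 '};
value   = [0.05234; 0.06234];

% Join date and time into one text field per row, then convert
stampTxt = strcat(strtrim(dateTxt), {' '}, strtrim(timeTxt));
stamp = datetime(stampTxt, 'InputFormat', 'yyyy:MM:dd HH:mm:ss:SSS');
stamp.Format = 'yyyy-MM-dd HH:mm:ss.SSS';   % show the milliseconds when displayed

% Keep the timestamps and values together in a table
T = table(stamp, value);
```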
The following code generates a similar dataset to what I am currently working with:
clear all
a = rand(131400,12);
DateTime=datestr(datenum('2011-01-01 00:01','yyyy-mm-dd HH:MM'):4/(60*24):...
datenum('2011-12-31 23:57','yyyy-mm-dd HH:MM'),...
'yyyy-mm-dd HH:MM');
DateTime=cellstr(DateTime);
header={'DateTime','temp1','temp2','temp4','temp7','temp10',...
'temp13','temp16','temp19','temp22','temp25','temp30','temp35'};
I'm trying to convert the outputs into one variable (called 'Data'), i.e. have header as the first row (1,:), 'DateTime' starting from row 2 (2:end,1) and running through each row, and finally having 'a' as the data (2:end,2:end) if that makes sense. So, 'DateTime' and 'header' are used as the heading for the rows and column respectively. Following this I need to save this into a tab delimited text file.
I hope I've been clear in expressing what I'm attempting.
An easy way, though it might not be the fastest:
Data = [header; DateTime, num2cell(a)];
filename = 'test.txt';
dlmwrite(filename,1); %# create the text file first, so the output is text and not an Excel file
xlswrite(filename,Data);
UPDATE:
It appears that xlswrite actually changes the format of DateTime values even if it writes to a text file. If the format is important here is the better and actually faster way:
filename = 'test.txt';
out = [DateTime, num2cell(a)];
out = out'; %# our cell array will be printed by columns, so we have to transpose
fid = fopen(filename,'wt');
%# printing header
fprintf(fid,'%s\t',header{1:end-1});
fprintf(fid,'%s\n',header{end});
%# printing the data
fprintf(fid,['%s\t', repmat('%f\t',1,size(a,2)-1) '%f\n'], out{:});
fclose(fid);
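On releases that have tables, a shorter route is to put everything in a table and let writetable produce the tab-delimited file. This is a sketch on a reduced, made-up data set (4 rows and 2 of the temperature columns), not the full 131400x12 array from the question:

```matlab
% Reduced stand-in for the data in the question
a = rand(4, 2);
t0 = datenum('2011-01-01 00:01', 'yyyy-mm-dd HH:MM');
DateTime = cellstr(datestr(t0 + (0:3)' * 4/(60*24), 'yyyy-mm-dd HH:MM'));

% The table variable names become the header row of the file
T = [table(DateTime), array2table(a, 'VariableNames', {'temp1', 'temp2'})];

filename = [tempname '.txt'];
writetable(T, filename, 'Delimiter', '\t');   % tab-delimited text file
```

Because DateTime is stored as text in the table, its format is written unchanged.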
I had a similar question, but what I am trying to do now is read files in .txt format into MATLAB. My problem is with the headers. Often, due to errors, the system rewrites the headers in the middle of the file, and then MATLAB cannot read the file. Is there a way to skip them? I know I can skip reading some characters if I know what the character is.
Here is the code I am using.
[c,pathc]=uigetfile({'*.txt'},'Select the data','V:\data');
file=[pathc c];
data= dlmread(file, ',', 1,4);
This way I let the user pick the file. My files are huge, typically [ 86400 125 ],
so naturally they have 125 header fields or more, depending on the file.
Thanks
Because the files are so big I cannot copy them here, but the format is like
day time col1 col2 col3 col4 ...............................
2/3/2010 0:10 3.4 4.5 5.6 4.4 ...............................
..................................................................
..................................................................
and so on
With DLMREAD you can read only numeric data. It will not read the date and time that your first two columns contain. If all the other data are numeric, you can tell DLMREAD to skip the first row and the first 2 columns on the left:
data = dlmread(file, ' ', 1,2);
To import also day and time you can use IMPORTDATA instead of DLMREAD:
A = importdata(file, ' ', 1);
dt = datenum(A.textdata(2:end,1),'mm/dd/yyyy');
tm = datenum(A.textdata(2:end,2),'HH:MM');
data = A.data;
The date and time will be converted to serial numbers. You can convert them back with DATESTR function.
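One detail to watch when combining the two serial numbers: as far as I know, datenum fills in a missing date with January 1 of the current year when the text contains only a time, so tm is not just a fraction of a day. A sketch of combining them, using hand-typed text in place of A.textdata, keeps only the fractional part of tm:

```matlab
% Hand-typed stand-ins for the day and time columns of A.textdata
dayTxt  = {'2/3/2010'; '2/3/2010'};
timeTxt = {'0:10'; '13:45'};

dt = datenum(dayTxt, 'mm/dd/yyyy');   % whole days
tm = datenum(timeTxt, 'HH:MM');       % time of day plus a default date
stamp = dt + mod(tm, 1);              % keep only the time-of-day fraction

% Convert back to text to check the result
stampTxt = datestr(stamp, 'yyyy-mm-dd HH:MM');
```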
It turns out that you can still use textscan, except that you read everything as strings and then attempt to convert to double. str2double returns NaN for non-numeric strings, and since the headers are all non-numeric strings, you can identify header rows as rows whose values are all NaN.
For example:
%# find and open file
[c,pathc]=uigetfile({'*.txt'},'Select the data','V:\data');
file=[pathc c];
fid = fopen(file);
%# read all text
strData = textscan(fid,'%s%s%s%s%s%s','Delimiter',',');
%# close the file again
fclose(fid);
%# catenate, b/c textscan returns a column of cells for each column in the data
strData = cat(2,strData{:});
%# convert cols 3:6 to double
doubleData = str2double(strData(:,3:end));
%# find header rows. headerRowsL is a logical array
headerRowsL = all(isnan(doubleData),2);
%# since I guess you know what the headers are, you can just remove the header rows
dateAndTimeCell = strData(~headerRowsL,1:2);
dataArray = doubleData(~headerRowsL,:);
%# and you're ready to start working with your data