I have a matrix containing the date in a cell structure. I managed to convert the date (2nd column) using datenum(), but I am not sure how to add on the time (3rd column)
The data looks like this:
'IBM' 20090602 0 108.410000000000
'IBM' 20090602 500 108.560000000000
My code:
date = datenum(num2str(IBM(:,2)),'yyyymmdd')
Let's review your mistakes first:
You feed datenum with the string 'IBM(:, 2)' instead of the actual array. Discard the quotes.
datenum accepts strings, not numerical values.
A possible solution is converting the second column of your data into an array of strings, and feeding it into datenum, like so:
d = datenum(num2str(vertcat(IBM{:, 2})), 'yyyymmdd');
Note that this is, of course, possible only if the format of the date string is fixed in each row.
EDIT:
To add the values in the third column to the result of datenum, simply do the following:
d + vertcat(IBM{:, 3})
Where d is a column vector of date values obtained from datenum (I assume that you want to do basic addition, since you haven't specified the actual meaning of the time values in the third column).
In one line, the complete answer would look like this:
datenum(num2str(vertcat(IBM{:, 2})), 'yyyymmdd') + vertcat(IBM{:, 3})
You can straight-up add the time values in when you're converting to datenum. Just convert from what I assumed were minutes (if they're in seconds, add in another *60 to the divisor) to days, which is what MATLAB uses for its datenum calculations.
timestamps = cellfun(@(x,y) datenum(num2str(x),'yyyymmdd')+y/(24*60),...
IBM(:,2),...
IBM(:,3),...
'UniformOutput',false)
I have data (240 x 9 dimensions) in which a DATETIME column has entries as shown below. I want to separate the date and time parts. I also need to calculate the time difference between each entry and the previous one. Each entry is formatted as yyyymmdd followed by hhmmsssss, where the last three digits of the seconds field are the decimal part, i.e., 36200 means 36.2 seconds.
Using MATLAB I have separated the date and time entries as character arrays as the original data was in the double format as below:
datestr: 240 x8 char e.g., '20190101'
timestr: 240 x9 char e.g., '001634200'
I want to convert the date into the standard format. I used
formatOut = 'yyyy/mm/dd';
date=datestr(datenum('20190101','yyyymmdd',1900),formatOut);
But getting this error : Index in position 1 exceeds array bounds (must not exceed 240).
Additionally, I need to convert the time into hh:mm:ss.sss format and then calculate the difference between a time and its previous entry.
Any help will be appreciated.
I have two excel files with dates in each of them. The goal is to find the location of datetimes in file A in file B.
e.g.
Excel file A has dates and each hour in column A from 1Jan1970 1AM to 31Dec2015 1AM with a lot of random missing dates and hour.
Excel file B has date e.g. 1jan1978 5PM
I read file A in array called A and do the following:
ind = find( x2mdate(A) == x2mdate(28491.7083333333) ); %datestr(x2mdate(28491.7083333333)) ans = 01-Jan-1978 17:00:00
it returns empty even though I can see that 1/1/1978 all hours are available in file A.
This is clearly a rounding issue. So, how do I deal with this? I tried using datestr but it is very slow.
Instead of x2mdate(28491.7083333333), try using:
datenum('01-Jan-1978 17:00:00', 'dd-mmm-yyyy HH:MM:SS')
It's easy to see that because of the rounding, they are not considered equal:
>> datenum('01-Jan-1978 17:00:00', 'dd-mmm-yyyy HH:MM:SS') == x2mdate(28491.7083333333)
ans =
0
You are comparing to the wrong value. 28491.7083333333 is slightly off the value you are looking for. When you want to use a precise match with constant floats, you have to use 17 digits. Otherwise compare with a reasonable tolerance.
tol = datenum(0,0,0,0,0,60); % 60 seconds tolerance
ind = find( abs(x2mdate(A) - x2mdate(28491.7083333333)) < tol );
I have two vectors(?) of data, one being prices and the other the dates on which those prices occurred, and I am trying to make a scatter plot of the two.
My dates are of the format ddmmyyyy and I have tried using mat2str to convert the vector into strings, and then used
formatin='ddmmyyy';
datenum(MYDATA,formatin)
however it returns an error saying that datenum has failed.
EDIT
This is the example of my code
This is what I am trying to run, where AvivaDate is a 1200x1 vector of doubles. The problem seems to be that mat2str is not changing the vector into a list of number strings: I need it in {'12345','12345'} form, but mat2str is changing it into a single string '[12345 12345]', so not a list of separate strings, if that makes sense.
formatin = 'ddmmyyyy';
DateAviva = mat2str(AvivaDate);
datenum(DateAviva,formatin);
hist(ReturnAviva,datenum(AvivaDate,formatin));
datetick('x','keepticks','keeplimits');
mat2str is not the function you want to convert between numbers and strings. mat2str has a very specific function (you see how it puts the [] around the output?). Use num2str, then convert it into a cell array:
S = num2str(AvivaDate); % should be 1200 x 8 char
C = mat2cell(S,ones(size(S,1),1)); % should be 1200 x 1 cell
dates = datenum(C,'ddmmyyyy'); % should be 1200 x 1 datenums
Although, depending where you're getting the information from in the first place, there may be a better way of reading the dates in from file, so you don't end up with a matrix of numbers where you want dates in the first place.
I am trying to read time-coordinate data from a netCDF file using matlab. I have a netCDF file (which I created) that has a time variable in the format of a double corresponding to the number of hours from a specific time (see below).
Variable attributes:
double time(Time) ;
time:standard_name = "Time" ;
time:units = "hours since 2002-01-01 0:0:0" ;
time:calendar = "proleptic_gregorian" ;
When I read the time variable (using ncread) into matlab, it just prints out an integer, e.g., 1. However, if I use "ncdump" to explore the file, I see the time variable in its coordinate time, e.g., 2002-01-01 01.
Specifically: "ncdump -t -v time ncfile.nc"
I'm relatively new to matlab, and I was wondering if anyone knew if there was a similar, or an equally simple, way to read this time variable as its coordinate time into matlab, either as a string, or numerical date. Specifically, I would like to avoid having to parse the attribute string and code up a bunch of pointers and conditions to convert the integer data to an actual date.
Alternatively, should I just create a new time variable in these files that is just an array of dates as strings?
Any information is very much appreciated!
Thanks!
NetCDF stores time as an offset from an epoch. From your variable attribute, your epoch is 2002-01-01 0:0:0, and the time is hours since then. Matlab has a similar concept called date numbers, although these count days (with the time of day as a fraction) from a fixed epoch, January 0 of year 0000. There are two functions that you should look into: datenum and datestr. The first converts a string into a date number and the other converts a date number into a date string.
You can convert your time variable into a compatible Matlab date number by dividing by 24 and then use the datestr function to format it however you like. Here is a simple example:
>> time = [1;2;3;4];
>> datestr(time./24+datenum('2002-01-01 0:0:0'))
ans =
01-Jan-2002 01:00:00
01-Jan-2002 02:00:00
01-Jan-2002 03:00:00
01-Jan-2002 04:00:00
Look at the Matlab help files associated with the two functions and you can format the date output however you like.
I'm porting a Matlab script to Python. Below is an extract:
%// Create a list of unique trade dates
DateList = unique(AllData(:,1));
%// Loop through the dates
for DateIndex = 1:size(DateList,1)
CalibrationDate = DateList(DateIndex);
%// Extract the data for a single calibration date (but all expiries)
SubsetIndices = ismember(AllData(:,1) , DateList(DateIndex)) == 1;
SubsetAllExpiries = AllData(SubsetIndices, :);
AllData is an N-by-6 cell matrix, the first 2 columns are dates (strings) and the other 4 are numbers. In python I will be getting this data out of a csv so something like this:
import numpy as np
AllData = np.recfromcsv(open("MyCSV.csv", "rb"))
So now, if I'm not mistaken, AllData is a numpy array of ordinary tuples. Is this the best format to have this data in? The goal will be to extract a list of unique dates from column 1, and for each date extract the rows with that date in column 1 (column one is ordered). Then for each of those rows, do some maths on the numbers and date in the remaining 5 columns.
So in matlab I can get the list of dates by unique(AllData(:,1)) and then I can get the records (rows) corresponding to that date (i.e. with that date in column one) like this:
SubsetIndices = ismember(AllData(:,1) , MyDate) == 1;
SubsetAllExpiries = AllData(SubsetIndices, :);
How can I best achieve the same results in Python?
To put things in context, np.recfromcsv is just a modified version of np.genfromtxt which outputs record arrays instead of structured arrays.
A structured array lets you access the individual fields (here, your columns) by their names, like in my_array["field_one"] while a record array gives you the same plus the possibility to access the fields as attributes, like in my_array.field_one. I'm not fond of "access-as-attributes", so I usually stick to structured arrays.
For your information, structured/record arrays are not arrays of tuples, but arrays of a numpy object called np.void: it's a block of memory composed of as many sub-blocks as you have fields, the size of each sub-block depending on its datatype.
That said, yes, what you seem to have in mind is exactly the kind of usage for a structured array. The approach would then be:
to take your dates array and filter them to find the unique elements.
to find the indices of these unique elements, as an array of integers we'll call, say, matching;
to use matching to access the corresponding records (e.g., rows of your array) using fancy indexing, as my_array[matching].
to perform your computations on the records, as you want.
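The steps above can be sketched as follows. This is only a minimal illustration, not the asker's actual data: the field names (trade_date, expiry, strike, price), the values, and the use of four fields instead of six are all assumptions made to keep the example short, and the mean price stands in for whatever computation is really needed.

```python
import numpy as np

# Hypothetical stand-in for the CSV contents: field names and values are made up.
all_data = np.array(
    [("2012-01-03", "2012-03-16", 100.0, 1.5),
     ("2012-01-03", "2012-06-15", 105.0, 2.5),
     ("2012-01-04", "2012-03-16", 100.0, 1.0)],
    dtype=[("trade_date", "U10"), ("expiry", "U10"),
           ("strike", float), ("price", float)],
)

# Step 1: unique trade dates (np.unique also returns them sorted).
date_list = np.unique(all_data["trade_date"])

summary = {}
for calibration_date in date_list:
    # Step 2: integer indices of the records that carry this date.
    matching = np.flatnonzero(all_data["trade_date"] == calibration_date)
    # Step 3: fancy indexing returns those records with all their fields.
    subset_all_expiries = all_data[matching]
    # Step 4: placeholder computation (record count and mean price).
    summary[str(calibration_date)] = (
        len(subset_all_expiries),
        float(subset_all_expiries["price"].mean()),
    )
```

A boolean mask (all_data["trade_date"] == calibration_date) could be used for indexing directly; the integer index array is kept here only to mirror the matching step described above.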
Note that you can keep your dates as strings or transform them into datetime objects using a user-defined converter, as described in the documentation. For example, you could transform a YYYY-MM-DD string into a datetime object with lambda s: datetime.datetime.strptime(s, "%Y-%m-%d"). That way, instead of having, say, a length-N array where each row (a record) consists of two dates as strings and 4 floats, you would have a length-N array where each row consists of two datetime objects and 4 floats.
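As a rough sketch of what such a converter does, here it is applied directly to a small array of date strings rather than passed to the file reader. The sample dates and the names to_datetime, date_strings and gap_days are made up for illustration.

```python
from datetime import datetime

import numpy as np

# The converter: parse a YYYY-MM-DD string into a datetime object.
to_datetime = lambda s: datetime.strptime(s, "%Y-%m-%d")

# Hypothetical date strings, as they might appear in the first column.
date_strings = np.array(["2012-01-03", "2012-01-04"])

# Apply the converter element by element; dtype=object keeps datetime instances.
dates = np.array([to_datetime(s) for s in date_strings], dtype=object)

# datetime objects support arithmetic, e.g. the gap between the two dates.
gap_days = (dates[1] - dates[0]).days
```

Having real datetime objects means date differences and comparisons work directly, without parsing the strings again at each step.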
Note the shape of your array (via my_array.shape), it says (N,), meaning it's a 1D array, even if it looks like a 2D table with multiple columns. You can access individual fields (each "column") by using its name. For example, if we create an array consisting of one string field called first and one int field called second, like that:
x = np.array([('a',1),('b',2)], dtype=[('first',"|S10"),('second',int)])
you could access the first column with
>>> x['first']
array(['a', 'b'],
dtype='|S10')