I have a dataset which includes the seconds that have passed since 2000-01-01 00:00:00.0 and I would like them to be converted to decimal years (for example 2013.87).
An example from the dataset:
416554767.293262
416554768.037637
416554768.782013
416554769.526386
416554770.270761
416554771.015136
416554771.759509
416554772.503884
416554773.248258
416554773.992632
416554774.737007
416554775.481381
416554776.225757
416554776.970131
416554777.714504
416554778.458880
Can anyone help me out on this?
Thanks!
You should be able to perform these computations using methods of datetime and duration. A bit like this. I've tried to be careful regarding the number of seconds / year, since of course that varies depending on whether the year in question is a leap year.
% Original data
data = [416554767.293262
416554768.037637
416554768.782013
416554769.526386
416554770.270761
416554771.015136
416554771.759509
416554772.503884
416554773.248258
416554773.992632
416554774.737007
416554775.481381
416554776.225757
416554776.970131
416554777.714504
416554778.458880];
% Original data is seconds since 'base':
base = datetime(2000,1,1);
% Get datetimes corresponding to 'data'
dates = base + seconds(data);
% Extract the year portion from the dates
wholeYears = year(dates);
% Extract the remainder of the dates as seconds
remainderInSeconds = seconds(dates - datetime(wholeYears,1,1));
% Calculate the number of seconds in each of the years
secondsPerYear = seconds(datetime(wholeYears + 1, 1, 1) - datetime(wholeYears, 1, 1));
% Final result is whole years + remainder expressed as years
result = wholeYears + (remainderInSeconds ./ secondsPerYear);
fprintf('%.16f\n', result);
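For releases that predate datetime (it was introduced in R2014b), the same idea can be sketched with serial date numbers. This is only a sketch under that assumption: the variable names are illustrative, and because a serial date number stores the whole days and the day fraction in one double, it resolves times only to roughly 10 microseconds, which is coarser than the datetime version above.

```matlab
% Seconds since 2000-01-01 00:00:00 (first two samples from the question)
secs = [416554767.293262; 416554768.037637];
% Serial date numbers count days, so convert seconds to days
t = datenum(2000, 1, 1) + secs / 86400;
% Calendar year of each sample
[yr, ~] = datevec(t);
% Serial date numbers of the start of that year and of the next one
yearStart = datenum(yr, 1, 1);
yearEnd = datenum(yr + 1, 1, 1);
% Whole year plus the elapsed fraction of that (possibly leap) year
decimalYear = yr + (t - yearStart) ./ (yearEnd - yearStart);
fprintf('%.6f\n', decimalYear);
```

Dividing by the actual length of each year keeps the leap-year handling of the original answer.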
I'm trying to convert an array of milliseconds and its respective data. However I want to do so in hours and minutes.
Millis = [60000 120000 180000 240000....]
Power = [ 12 14 12 13 14 ...]
I've set it up so the data is recorded every minute, hence the 60000 millis (= 1 minute). I am trying to plot time on the x axis and power on the y axis. I would like the x axis displayed in hours and minutes, with each power value plotted against its corresponding time.
I've tried this
for i=2:length(Millis)
Conv2Min(i) = Millis(i) / 60000;
Time(i) = startTime + Conv2Min(i);
if (Time(i-1) > Time(i) + 60)
Time(i) + 100;
end
end
s = num2str(Time);
This is an attempt to turn the milliseconds into clock times starting at 08:00, rolling over to 09:00 once 60 minutes have passed. The problem is plotting this: I get a gap between 08:59 and 09:00. I also cannot maintain the 0=initial 0.
In this scenario it is preferable to work with datenum values and then use datetick to set the format of the tick labels of your plot to 'HH:MM'.
Let's suppose that you started taking measurements at t_1 = [HH_1, MM_1] and stopped taking measurements at t_2 = [HH_2, MM_2].
A cool trick to generate the array of datenum values is to use the following expression:
time_datenums = HH_1/24 + MM_1/1440 : 1/1440 : HH_2/24 + MM_2/1440;
Explanation:
We are creating a regularly-spaced vector time_datenums = A:B:C using the colon (:) operator, where A is the starting datenum value, B is the increment between datenum values and C is the ending datenum value.
Since your measurements have been taken every minute (60000 milliseconds), then the increment between datenum values should be of 1 minute too. As a day has 24 hours, that makes 1440 minutes a day, so use B = 1/1440 as the increment between vector elements, to get 1 minute increments.
For A and C we simply need to divide the hour digits by 24 and the minute digits by 1440 and sum them up like this:
A = HH_1/24 + MM_1/1440
C = HH_2/24 + MM_2/1440
So for example, if t_1 = [08, 00], then A = 08/24 + 00/1440. As simple as that.
Notice that this procedure doesn't use the datenum function at all, and still, it manages to generate a valid array of datenum values only taking into consideration the time of the datenum, without needing to bother about the date of the datenum. You can learn more about this here and here.
Going back to your original problem, let's have a look at the code:
time_millisec = 0:60000:9e6; % Time array in milliseconds.
power = 10*rand(size(time_millisec)); % Random power data.
% Elapsed time in milliseconds.
elapsed_millisec = time_millisec(end) - time_millisec(1);
% Integer part of elapsed hours.
elapsed_hours_int = fix(elapsed_millisec/(1000*60*60));
% Fractional part of elapsed hours.
elapsed_hours_frac = (elapsed_millisec/(1000*60*60)) - elapsed_hours_int;
t_1 = [08, 00]; % Start time 08:00
t_2 = [t_1(1) + elapsed_hours_int, t_1(2) + elapsed_hours_frac*60]; % Compute End time.
HH_1 = t_1(1); % Hour digits of t_1
MM_1 = t_1(2); % Minute digits of t_1
HH_2 = t_2(1); % Hour digits of t_2
MM_2 = t_2(2); % Minute digits of t_2
time_datenums = HH_1/24+MM_1/1440:1/1440:HH_2/24+MM_2/1440; % Array of datenums.
plot(time_datenums, power); % Plot data.
datetick('x', 'HH:MM'); % Set 'HH:MM' datetick format for the x axis.
This is the output:
I would use datenums:
Millis = [60000 120000 180000 240000 360000];
Power = [ 12 14 12 13 14 ];
d = [2017 05 01 08 00 00]; %starting point (y,m,d,h,m,s)
d = repmat(d,[length(Millis),1]);
d(:,6)=Millis/1000; %add them as seconds
D=datenum(d); %convert to datenums
plot(D,Power) %plot
datetick('x','HH:MM') %set the x-axis to datenums with HH:MM as format
an even shorter approach would be: (thanks to codeaviator for the idea)
Millis = [60000 120000 180000 240000 360000];
Power = [ 12 14 12 13 14 ];
D = 8/24+Millis/86400000; %24h / day, 86400000ms / day
plot(D,Power) %plot
datetick('x','HH:MM') %set the x-axis to datenums with HH:MM as format
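On releases that have datetime and duration (R2014b onward), a sketch of the same plot without serial date numbers might look like the following. The start date is the one assumed in the answer above, and xtickformat requires a newer release than datetime itself, so treat this as an illustration rather than a drop-in replacement.

```matlab
Millis = [60000 120000 180000 240000 360000];
Power = [12 14 12 13 14];
% Assumed start of the measurement: 2017-05-01 08:00:00
startTime = datetime(2017, 5, 1, 8, 0, 0);
% Adding a duration to a datetime gives the clock time of each sample
t = startTime + milliseconds(Millis);
plot(t, Power);
% Show the x axis ticks as hours and minutes
xtickformat('HH:mm');
```

Because t is computed element by element from Millis, it always has the same length as Power, even when samples are missing.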
I guess, there is an easier way using datetick and datenum, but I couldn't figure it out. This should solve your problem for now:
Millis=6e4:6e4:6e6;
power=randi([5 15],1,numel(Millis));
hours=floor(Millis/(6e4*60))+8; minutes=mod(Millis,(6e4*60))/6e4; % Calculate the hours and minutes of your Millisecond vector.
plot(Millis,power)
xlabels=arrayfun(@(x,y) sprintf('%02d:%02d',x,y),hours,minutes,'UniformOutput',0); % Create time-strings of the format HH:MM for your XTickLabels
tickDist=10; % define how often you want your XTicks (e.g. 1 if you want the ticks every minute)
set(gca,'XTick',Millis(tickDist:tickDist:end),'XTickLabel',xlabels(tickDist:tickDist:end))
I have a dataset of user trajectories: every location in a trajectory has these fields: [userId year month day hour minute second latitude longitude regionId]. Based on the field day, I want to split the trajectories on a daily scale into intervals of different lengths: 3 hours, 4 hours, 2 hours. I have written this code, which works for intervals of 4 hours:
% decomposedTraj is a struct that contains the trajectories based on daily scale
for i=1:size(decomposedTraj,2)
if ~isempty(decomposedTraj(i).dailyScaled)
% find the intervals
% interval [0-4]hours
Interval(i).interval_1=(decomposedTraj(i).dailyScaled(:,5)>=0&decomposedTraj(i).dailyScaled(:,5)<4);
% interval [4-8]hours
Interval(i).interval_2=(decomposedTraj(i).dailyScaled(:,5)>=4&decomposedTraj(i).dailyScaled(:,5)<8);
% interval [8-12]hours
Interval(i).interval_3=(decomposedTraj(i).dailyScaled(:,5)>=8&decomposedTraj(i).dailyScaled(:,5)<12);
% interval [12-16]hours
Interval(i).interval_4=(decomposedTraj(i).dailyScaled(:,5)>=12&decomposedTraj(i).dailyScaled(:,5)<16);
% interval [16-20]hours
Interval(i).interval_5=(decomposedTraj(i).dailyScaled(:,5)>=16&decomposedTraj(i).dailyScaled(:,5)<20);
% interval [20-0]hours
Interval(i).interval_6=(decomposedTraj(i).dailyScaled(:,5)>=20);
end
end
Or, to make the logic of the code easier to follow:
A=[22;19;15;15;0;20;22;19;15;15;0;20;20;0;22;21;17;23;22]';
A(A>=0&A<4)
A(A>=4&A<8)
A(A>=8&A<12)
A(A>=12&A<16)
A(A>=16&A<20)
A(A>=20)
It runs and gives the right answer, but it's not smart: if I want to change the interval, I have to change all the code. Can you help me find a smarter, more dynamic solution? Thanks.
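Interval k is defined as [(k-1)*N, k*N), where N=4 in your example. Therefore you can do the same using a for loop:
N = 4; % interval length in hours
Interval = cell(1, ceil(24/N)); % one cell per interval
for k=1:ceil(24/N)
Interval{k} = A(A>=(k-1)*N & A<k*N);
end
Note that in this example A(A>=(k-1)*N & A<k*N) is not necessarily the same size for each k, so Interval is a cell array (indexed with curly braces) rather than a numeric array. Using ceil instead of floor keeps the last, shorter interval when N does not divide 24.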
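As a variation on the same idea, the interval number of every hour value can be computed once with integer division and then reused for each group. This is a sketch with illustrative variable names (binIdx, nBins), using the sample vector A from the question:

```matlab
N = 4;                                  % interval length in hours
A = [22 19 15 15 0 20 22 19 15 15 0 20 20 0 22 21 17 23 22];
nBins = ceil(24 / N);                   % number of intervals in a day
% Hour h falls in interval floor(h/N)+1, e.g. hours 0..3 -> 1, 20..23 -> 6
binIdx = floor(A / N) + 1;
Interval = cell(1, nBins);
for k = 1:nBins
    % Select the entries that fall in interval k
    Interval{k} = A(binIdx == k);
end
```

For the trajectory matrix, the same binIdx computed from column 5 can be used to select whole rows, for example dailyScaled(binIdx == k, :).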
I have a table with dates (and other things), which I have extracted from a CSV file. In order to do some processing of my data (including plotting) I decided to convert all my date-strings to date-numbers (below for simplicity reasons I will exclude all the rest of the data and concentrate on the dates only so don't mind the step from dates to timetable and the fact that it can be omitted):
dates = [7.330249777777778e+05;7.330249291666667e+05;7.330246729166667;7.330245256944444;7.330246763888889;7.330245284722222;7.330245326388889;7.330246625000000];
timetable = table(dates);
timetable
_________
7.330249777777778e+05
7.330249291666667e+05
7.330246729166667
7.330245256944444
7.330246763888889
7.330245284722222
7.330245326388889
7.330246625000000
I'm facing the following issue - based on the time during the day I want to tell the user if a date is in the morning (24-hours scale: 5-12h), noon (12-13h), afternoon (13-18h), evening (18-21h), night (21-5h) based on the date I have stored in my table. In case I had a date-vector (with elements: year,month,day,hour,minute,second) it would be pretty straight forward:
for date = 1:size(timetable)
switch timetable(date).hour
case {5,12}
'morning'
case {12,13}
'noon'
case {13,18}
'afternoon'
case {18,21}
'evening'
otherwise
'night'
end
end
With 7.330246729166667 and the rest this is not that obvious at least to me. Any idea how to avoid converting to some other date-format just for this step and at the same time avoid some complex formula for extracting the required data (not necessarily hour only but I'm interested in the rest too)?
One unit in Matlab serial dates is equivalent to 1 day, i.e. 24 hours. Knowing this, you can bin the fractional part of the dates into the intraday buckets you defined (note that your switch will only work for values exactly equal to the entries in the case lists):
bins = {'morning', 'noon', 'afternoon', 'evening', 'night'};
edges = [5,12,13,18,21,25]./24; % As fraction of a day
% Take fractional part
time = mod(dates,1);
% Bin with lb <= x < ub, where e.g. lb = 5/24 and ub = 12/24
[counts,~,pos] = histcounts(time, edges);
% Make sure unbinned x in [0,5) are assigned 'night'
pos(pos==0) = 5;
bins(pos)'
ans =
'night'
'night'
'morning'
'morning'
'morning'
'morning'
'morning'
'morning'
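If the individual time fields are needed rather than just a label, datevec splits a serial date number into its components without any hand-written arithmetic. A minimal sketch, using the first value from the question:

```matlab
d = 7.330249777777778e+05;
% Split the serial date number into year, month, day, hour, minute, second
[y, mo, dd, h, mi, s] = datevec(d);
fprintf('%04d-%02d-%02d %02d:%02d:%02.0f\n', y, mo, dd, h, mi, s);
% The hour can also be read off the fractional part directly
hourOfDay = floor(mod(d, 1) * 24);
```

Both routes give the same hour; datevec is convenient when the minute and second are wanted as well.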
Hello everyone I have a new small problem:
The data I am using have a weird trade time that goes from 17.00 of one day to 16.15 of the day after.
That means that, e.g., for the day 09-27-2013 The source I am using registers the transactions occurred as follows:
DATE , TIME , PRICE
09/27/2013,17:19:42,3225.00,1 #%first obs of the vector
09/27/2013,18:37:59,3225.00,1 #%second obs of the vector
09/27/2013,08:31:32,3200.00,1
09/27/2013,08:36:17,3203.00,1
09/27/2013,09:21:34,3210.50,1 #%fifth obs of the vector
Now the first and second obs are incorrect for me: they belong to the 9/27 trading day but they were executed on 9/26. Since I am working on some functions in Matlab that rely on non-decreasing times, I need to solve this issue. The date format I am using is actually the Matlab datenum format, so I am trying to solve the problem by just subtracting one from the incorrect observations:
%#Call time the time vector, I can identify the 'incorrect' observations
idx=find(diff(time)<0);
time(idx)=time(idx)-1;
It is easy to tell that this will only fix the 'last' incorrect observation of a series. In the previous example this would only correct the second element, and I would have to run the code several times (I thought about a while loop) until idx is empty. This is not a big issue when working with small series, but I have up to 20 million observations and probably hundreds of thousands of consecutively incorrect ones.
Is there a way to fix this in a vectorized way?
idx=find(diff(time)<0);
while idx
However, given that the computation would not be so complex I thought that a for loop could efficiently solve the issue and my idea was the following:
[N]=size(time,1);
for i=N:-1:1
if diff(time(i,:)<0)
time(i,:)=time(i,:)-1;
end
end
Sadly, it does not seem to work.
Here is an example of data I am actually using.
735504.591157407
735507.708030093 %# I made this up to give you an example of two consecutively wrong observations
735507.708564815 %# This is an incorrect observation
735507.160138889
735507.185358796
735507.356562500
Thanks everyone in advance
Sensible version -
for count = 1:numel(time)
dtime = diff([0 ;time]);
ind1 = find(dtime<0,1,'last')-1;
time(ind1) = time(ind1)-1;
end
Faster-but-crazier version -
dtime = diff([0 ;time]);
for count = 1:numel(time)
ind1 = find(dtime<0,1,'last')-1;
time(ind1) = time(ind1)-1;
dtime(ind1+1) = 0;
dtime(ind1) = dtime(ind1)-1;
end
Even crazier version -
dtime = diff([0 ;time]);
ind1 = numel(dtime);
for count = 1:numel(time)
ind1 = find(dtime(1:ind1)<0,1,'last')-1;
time(ind1) = time(ind1)-1;
dtime(ind1) = dtime(ind1)-1;
end
Some average computation runtimes for these versions with various datasizes -
Datasize 1: 3432 elements
Version 1 - 0.069 sec
Version 2 - 0.042 sec
Version 3 - 0.034 sec
Datasize 2: 20 Million elements
Version 1 - 37029 sec
Version 2 - 23303 sec
Version 3 - 20040 sec
So apparently I had 3 other problems in the data source that I think could have stalled the routine Divakar proposed. Anyway, I thought it was too slow, so I started thinking about another solution and came up with a very quick vectorized one.
Given that the observations I wanted to modify fall within a known time interval, the function just looks for every observation falling in that interval and modifies it as I want (-1 in my case).
function [ datetime ] = correct_date( datetime,starttime, endtime)
%#datetime is my vector of dates and times in matlab numerical format
%#starttime is the starting hour of the interval expressed in datestr format. e.g. '17:00:00'
%#endtime is the ending hour of the interval expressed in datestr format. e.g. '23:59:59'
if (nargin < 1) || (nargin > 3),
error('Requires 1 to 3 input arguments.')
end
% default values
if nargin == 1,
starttime='17:00';
endtime='23:59:59';
elseif nargin == 2,
endtime='23:59:59';
end
tvec=[datenum(starttime) datenum(endtime)];
tvec=tvec-floor(tvec); %#As I am working on multiples days I need to isolate only HH:MM:SS for my interval limits
temp=datetime-floor(datetime); %#same motivation as in the previous line
idx=find(temp>=tvec(1)&temp<=tvec(2)); %#logical find the indices
datetime(idx)=datetime(idx)-1; %#modify them as I want
clear tvec temp idx
end
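The core of the function above is a mask on the time-of-day fraction of each serial date number. A minimal sketch of that idea on the five sample rows from the question, assuming (as the question states) that the session opens at 17:00 and that everything stamped from 17:00 to midnight belongs to the previous calendar day:

```matlab
% Sample rows from the question, all stamped 09/27/2013
t = datenum(2013, 9, 27, [17 18 8 8 9], [19 37 31 36 21], [42 59 32 17 34]);
% Fraction of the day at which the session opens (17:00)
sessionStart = 17 / 24;
% Logical mask of observations stamped at or after the session start
isEvening = mod(t, 1) >= sessionStart;
% Move those observations back by one calendar day
t(isEvening) = t(isEvening) - 1;
disp(datestr(t));
```

Logical indexing is used here instead of find; the selected elements are the same, and the corrected vector is increasing again.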
So I want to construct a moving time average with different weights for different months. E.g. see the filter function at http://www.mathworks.com/help/matlab/data_analysis/filtering-data.html, where b = # of days in each month and a = # of days in a year.
The issue is, though, that the time-series is a series of temperatures for every month (and I want to construct a yearly average temperature for each set of possible years, where a year could be from March to February, for example). Using this approach, the first month in each window would be weighted as 31/365, irrespective of whether the first month is January or June.
In that case, the standard filter algorithm wouldn't work. Is there an alternative?
A solution that incorporates leap years would also be nice, but is not necessary for an initial solution.
A weighted average is defined as sum(x .* weights) / sum(weights). If you want to calculate this in a moving average kind of way, I guess you could do (untested):
moving_sum = @(n, x) filter(ones(1,n), 1, x);
moving_weighted_avg = moving_sum(12, temperature .* days_per_month) ...
./ moving_sum(12, days_per_month);
If temperature is a vector of monthly temperatures and days_per_month contains the actual number of days of the corresponding months, this should even work in case of leap years.
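A quick way to check the formula is to compare the 12th output with a directly computed day-weighted mean of the first 12 months. The temperatures below are made up purely for illustration, and 2003 is just an arbitrary non-leap year:

```matlab
% Moving sum over the last n samples (zeros assumed before the first sample)
moving_sum = @(n, x) filter(ones(1, n), 1, x);
% Made-up monthly temperatures for one non-leap year, for illustration only
temperature = [2 3 7 11 15 19 21 20 17 12 7 3];
days_per_month = eomday(2003, 1:12);
moving_weighted_avg = moving_sum(12, temperature .* days_per_month) ...
    ./ moving_sum(12, days_per_month);
% Direct day-weighted mean of the full year, for comparison
direct = sum(temperature .* days_per_month) / sum(days_per_month);
fprintf('%.6f %.6f\n', moving_weighted_avg(12), direct);
```

The first 11 entries cover fewer than 12 months, because filter starts from zero initial conditions. Each is still a day-weighted mean of the months seen so far, so discard them if only full 12-month windows are wanted.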
Edit to answer comment
You can reconstruct days_per_month like so:
start_year = 2003;
start_month = 10;
nmonth = 130;
month_offset = 0:nmonth - 1;
month = mod(start_month + month_offset - 1, 12) + 1;
year = start_year + floor((start_month + month_offset - 1) / 12);
days_per_month = eomday(year, month);
disp([month_offset; year; month; days_per_month]') %print table to check