CDO - Resample netcdf files from monthly to daily timesteps - date

I have a netcdf file that has monthly global data from 1991 to 2000 (10 years).
Using CDO, how can I modify the netcdf from monthly to daily timesteps by repeating the monthly values each day of each month?
for eaxample,
convert from
Month 1, value = 0.25
to
Day 1, value = 0.25
Day 2, value = 0.25
Day 3, value = 0.25
....
Day 31, value = 0.25
convert from
Month 2, value = 0.87
to
Day 1, value = 0.87
Day 2, value = 0.87
Day 3, value = 0.87
....
Day 28, value = 0.87
Thanks
##############
Update
my monthly netcdf has the monthly values not on the first day of each month, but in sparse order. e.g. on the 15th, 7th, 9th, etc.. however one value for each month.

The question is perhaps ambiguously worded. Adrian Tompkins' answer is correct for interpolation. However, you are actually asking to set the value for each day of the month to that for the first day of the month. You could do this by adding a second CDO call as follows:
cdo -inttime,1991-01-01,00:00:00,1day in.nc temp.nc
cdo -monadd -gtc,100000000000000000 temp.nc in.nc out.nc
Just set the value after gtc to something much higher than anything in your data.

You can use inttime which interpolates in time at the interval required, but this is not exactly what you asked for as it doesn't repeat the monthly values and your series will be smoothed by the interpolation.
If we assume your dataset starts on the 1st January at time 00:00 (you don't state in the question) then the command would be
cdo inttime,1991-01-01,00:00:00,1day in.nc out.nc
This performs a simple linear interpolation between steps.
Note: This is fine for fields like temperature and seems to be want you ask for, but readers should note that one has to be more careful with flux fields such as rainfall, where one might want to scale and/or change the units appropriately.

I could not find a solution with CDO but I solved the issue with R, as follows:
library(dplyr)
library(ncdf4)
library(reshape2)
## Read ncfile
ncpath="~/my/path/"
ncname="my_monthly_ncfile"
ncfname=paste(ncpath, ncname, ".nc", sep="")
ncin=nc_open(ncfname)
var=ncvar_get(ncin, "nc_var")
## melt ncfile
var=melt(var)
var=var[complete.cases(var), ] ## remove any NA
## split ncfile by gridpoint (lat and lon) into a list
var=split(var, list(var$lat, var$lon))
var=var[lapply(var,nrow)>0] ## remove any empty list element
## create new list and replicate, for each gridpoint, each monthly value n=30 times
var_rep=list()
for (i in 1:length(var)) {
var_rep[[i]]=data.frame(value=rep(var[[i]]$value, each=30))
}

Related

Strange Date object in MongoDB [duplicate]

As per MDN
"Date objects are based on a time value that is the number of milliseconds since 1 January, 1970 UTC."
Then why does it accept negative values ?
Even if it did shouldn't negative value mean values before Jan 1, 1970 ?
new Date('0000', '00', '-1'); // "1899-12-30T05:00:00.000Z"
new Date('0000', '00', '00'); // "1899-12-31T05:00:00.000Z"
new Date('-9999', '99', '99'); // "-009991-07-08T04:00:00.000Z"
What is happening ?
Update
For some positive values , the year begins from 1900
new Date(100); // "1970-01-01T00:00:00.100Z" // it says 100Z
new Date(0100); // "1970-01-01T00:00:00.064Z" // it says 64Z
new Date("0006","06","06"); // "1906-07-06T04:00:00.000Z"
Also note that, in the last one, the date is shown as 4 which is wrong.
I suspect this is some sort of Y2K bug ?!!
This is hard and inconsistent, yes. The JavaScript Date object was based on the one in Java 1.0, which is so bad that Java redesigned a whole new package.
JavaScript is not so lucky.
Date is "based on" unix epoch because of how it is defined. It's internal details.
1st Jan 1970 is the actual time of this baseline.
since is the direction of the timestamp value: forward for +ve, backward for -ve.
Externally, the Date constructor has several different usages, based on parameters:
Zero parameters = current time
new Date() // Current datetime. Every tutorial should teach this.
The time is absolute, but 'displayed' timezone may be UTC or local.
For simplicity, this answer will use only UTC. Keep timezone in mind when you test.
One numeric parameter = timestamp # 1970
new Date(0) // 0ms from 1970-01-01T00:00:00Z.
new Date(100) // 100ms from 1970 baseline.
new Date(-10) // -10ms from 1970 baseline.
One string parameter = iso date string
new Date('000') // Short years are invalid, need at least four digits.
new Date('0000') // 0000-01-01. Valid because there are four digits.
new Date('1900') // 1900-01-01.
new Date('1900-01-01') // Same as above.
new Date('1900-01-01T00:00:00') // Same as above.
new Date('-000001') // 2 BC, see below. Yes you need all those zeros.
Two or more parameters = year, month, and so on # 1900 or 0000
new Date(0,0) // 1900-01-01T00:00:00Z.
new Date(0,0,1) // Same as above. Date is 1 based.
new Date(0,0,0) // 1 day before 1900 = 1899-12-31.
new Date(0,-1) // 1 month before 1900 = 1899-12-01.
new Date(0,-1,0) // 1 month and 1 day before 1900 = 1899-11-30.
new Date(0,-1,-1) // 1 month and *2* days before 1900 = 1899-11-29.
new Date('0','1') // 1900-02-01. Two+ params always cast to year and month.
new Date(100,0) // 0100-01-01. Year > 99 use year 0 not 1900.
new Date(1900,0) // 1900-01-01. Same as new Date(0,0). So intuitive!
Negative year = BC
new Date(-1,0) // 1 year before 0000-01-01 = 1 year before 1 BC = 2 BC.
new Date(-1,0,-1) // 2 days before 2 BC. Fun, yes? I'll leave this as an exercise.
There is no 0 AC. There is 1 AC and the year before it is 1 BC. Year 0 is 1 BC by convention.
2 BC is displayed as year "-000001".
The extra zeros are required because it is outside normal range (0000 to 9999).
If you new Date(12345,0) you will get "+012345-01-01", too.
Of course, the Gregorian calendar, adopted as late as 1923 in Europe, will cease to be meaningful long before we reach BC.
In fact, scholars accept that Jesus wasn't born in 1 BC.
But with the stars and the land moving at this scale, calendar is the least of your worries.
The remaining given code are just variations of these cases. For example:
new Date(0100) // One number = epoch. 0100 (octal) = 64ms since 1970
new Date('0100') // One string = iso = 0100-01-01.
new Date(-9999, 99, 99) // 9999 years before BC 1 and then add 99 months and 98 days
Hope you had some fun time. Please don't forget to vote up. :)
To stay sane, keep all dates in ISO 8601 and use the string constructor.
And if you need to handle timezone, keep all datetimes in UTC.
Well, firstly, you're passing in string instead of an integer, so that might have something to do with your issues here.
Check this out, it explains negative dates quite nicely, and there is an explanation for your exact example.
Then why does it accept negative values ?
You are confusing the description of how the data is stored internally with the arguments that the constructor function takes.
Even if it did shouldn't negative value mean values before Jan 1, 1970 ?
No, for the above reason. Nothing stops the year, month or day from being negative. You just end up adding a negative number to something.
Also note that, in the last one, the date is shown as 4 which is wrong.
Numbers which start with a 0 are expressed in octal, not decimal. 0100 === 64.
Please have a look at the documentation
Year: Values from 0 to 99 map to the years 1900 to 1999
1970 with appropriate timezone: new Date(0); // int MS since 1970
1900 (or 1899 with applied timezone): new Date(0,0) or new Date(0,0,1) - date is 1 based, month and year are 0 based
1899: new Date(0,0,-1)

subtracting one from another in matlab

My time comes back from a database query as following:
kdbstrbegtime =
09:15:00
kdbstrendtime =
15:00:00
or rather this is what it looks like in the command window.
I want to create a matrix with the number of rows equal to the number of seconds between the two timestamps. Are there time funcitons that make this easily possible?
Use datenum to convert both timestamps into serial numbers, and then subtract them to get the amount of seconds:
secs = fix((datenum(kdbstrendtime) - datenum(kdbstrbegtime)) * 86400)
Since the serial number is measured in days, the result should be multiplied by 86400 ( the number of seconds in one day). Then you can create a matrix with the number of rows equal to secs, e.g:
A = zeros(secs, 1)
I chose the number of columns to be 1, but this can be modified, of course.
First you have to convert kdbstrendtime and kdbstrbegtime to char by datestr command, then:
time = datenum(kdbstrendtime )-datenum(kdbstrbegtime )
t = datestr(time,'HH:MM:SS')

Counting values by day/hour with timeseries in MATLAB

So, I'm beginning to use timeseries in MATLAB and I'm kinda stuck.
I have a list of timestamps of events which I imported into MATLAB. It's now a 3000x25 array which looks like
2000-01-01T00:01:01+00:00
2000-01-01T00:01:02+00:00
2000-01-01T00:01:03+00:00
2000-01-01T00:01:04+00:00
As you can see, each event was recorded by date, hour, minute, second, etc.
Now, I would like to count the number of events by date, hour, etc. and then do various analyses (regression, etc.).
I considered creating a timeseries object for each day, but considering the size of the data, that's not practical.
Is there any way to manipulate this array such that we have "date: # of events"?
Perhaps there's just a simpler way to count events using timeseries?
As others have suggested, you should convert the string dates to serial date numbers. This makes it easy to work with the numeric data.
An efficient way to count number of events per interval (days, hours, minutes, etc...) is to use functions like HISTC and ACCUMARRAY. The process will involve manipulating the serial dates into units/format required by such functions (for example ACCUMARRAY requires integers, whereas HISTC needs to be given the bin edges to specify the ranges).
Here is a vectorized solution (no-loop) that uses ACCUMARRAY to count number of events. This is a very efficient function (even of large input). In the beginning I generate some sample data of 5000 timestamps unevenly spaced over a period of 4 days. You obviously want to replace it with your own:
%# lets generate some random timestamp between two points (unevenly spaced)
%# 1000 timestamps over a period of 4 days
dStart = datenum('2000-01-01'); % inclusive
dEnd = datenum('2000-01-5'); % exclusive
t = sort(dStart + (dEnd-dStart).*rand(5000,1));
%#disp( datestr(t) )
%# shift values, by using dStart as reference point
dRange = (dEnd-dStart);
tt = t - dStart;
%# number of events by day/hour/minute
numEventsDays = accumarray(fix(tt)+1, 1, [dRange*1 1]);
numEventsHours = accumarray(fix(tt*24)+1, 1, [dRange*24 1]);
numEventsMinutes = accumarray(fix(tt*24*60)+1, 1, [dRange*24*60 1]);
%# corresponding datetime range/interval label
days = cellstr(datestr(dStart:1:dEnd-1));
hours = cellstr(datestr(dStart:1/24:dEnd-1/24));
minutes = cellstr(datestr(dStart:1/24/60:dEnd-1/24/60));
%# display results
[days num2cell(numEventsDays)]
[hours num2cell(numEventsHours)]
[minutes num2cell(numEventsMinutes)]
Here is the output for the number of events per day:
'01-Jan-2000' [1271]
'02-Jan-2000' [1258]
'03-Jan-2000' [1243]
'04-Jan-2000' [1228]
And an extract of the number of events per hour:
'02-Jan-2000 09:00:00' [50]
'02-Jan-2000 10:00:00' [54]
'02-Jan-2000 11:00:00' [53]
'02-Jan-2000 12:00:00' [74]
'02-Jan-2000 13:00:00' [49]
'02-Jan-2000 14:00:00' [59]
similarly for minutes:
'03-Jan-2000 08:54:00' [1]
'03-Jan-2000 08:55:00' [1]
'03-Jan-2000 08:56:00' [1]
'03-Jan-2000 08:57:00' [0]
'03-Jan-2000 08:58:00' [0]
'03-Jan-2000 08:59:00' [0]
'03-Jan-2000 09:00:00' [1]
'03-Jan-2000 09:01:00' [2]
You can convert those timestamps to a number with datenum:
A serial date number represents the whole and fractional number of days from a specific date and time, where datenum('Jan-1-0000 00:00:00') returns the number 1. (The year 0000 is merely a reference point and is not intended to be interpreted as a real year in time.)
This way, it's easier to check where a period starts and end. Eg: the week your looking for starts at x and ends at x+7.999... ; all you have to do to find events in that period is checking if the datenum value is between x and x+8:
week_x_events = find(dn_timestamp>=x & dn_timestamp<x+8)
The difficulty is in converting your timestamp to datenum acceptable format, which is doable using regexp, good luck!
I don't know what +00:00 means (maybe time zone?), but you can simply convert your string timestamps into numerical format:
>> t = datenum('2000-01-01T00:01:04+00:00', 'yyyy-mm-ddTHH:MM:SS')
t =
7.3049e+005
>> datestr(t)
ans =
01-Jan-2000 00:01:04

UTC Time to String Conversion

I am looking for helping doing time conversions from UTC time to string using MATLAB.
I am trying to extract time from a data file collected at the end of October 2010.
The data file says it is reporting in UTC time and the field is an integer string value in milliseconds that is around 3.02e11. I would like to convert this to a string but am have some trouble.
I figured out that the units are most definitely in milliseconds so I convert this to fractions of days to be compatible with datenum format.
If the data was collected at the end of October (say, October 31, 2010) then I can guess what kind of number I might get. I thought that January 1, 2001 would be a good epoch and calculated what sort of number (in days) I might get:
suspectedDate = datenum('October 31, 2010')
suspectedEpoch = datenum('January 1, 2001')
suspectedTimeInDays = suspectedDate - suspectedEpoch
Which comes out as 3590.
However, my actual time, in days, comes out with the following code
actualTime = 3.02e11
actualTimeInDays = 3.02e11/1000/24/3600
as 3495.4.
This is troubling as the difference is only 94.6 -- not a full year. This would mean either the documentation for the file is wrong or the epoch is close to April 1-5, 2001:
calculatedEpoch = suspectedDate - actualTimeInDays
calculatedEpochStr = datestr(calculatedEpoch)
Alternately, if the epoch is January 1, 2001 then the actual date in the file is from the end of July.
ifEpochIsJanuaryDate = suspectedEpoch + actualTimeInDays
ifEpochIsJanuaryDateStr = datestr(ifEpochIsJanuaryDate)
Is this a known UTC format and can anyone give suggestions on how to get an October date from 3.02e11 magnitude number?
Unix time today is about 13e11, and is measured in ms since 1970.
If your time is ~3e11, then it's probably since year 2000.
>> time_unix = 1339116554872; % example time
>> time_reference = datenum('1970', 'yyyy');
>> time_matlab = time_reference + time_unix / 8.64e7;
>> time_matlab_string = datestr(time_matlab, 'yyyymmdd HH:MM:SS.FFF')
time_matlab_string =
20120608 00:49:14.872
Notes:
1) change 1970 into 2000 if your time is since 2000;
2) See the definition of matlab's time.
3) 8.64e7 is number of milliseconds in a day.
4) Matlab does not apply any time-zone shifts, so the result is the same UTC time.
5) Example for backward transformation:
>> matlab_time = now;
>> unix_time = round(8.64e7 * (matlab_time - datenum('1970', 'yyyy')))
unix_time =
1339118367664
You can't just make up your own epoch. Also datenum returns things in days. So the closeness you got with doing your math was just a coincidence.
Turns out that
>> datenum('Jan-1-0000')
ans =
1
and
>> datenum('Jan-1-0001')
ans =
367
So Matlab should be returning things in days since Jan. 1, 0000. (Not a typo)
However, I'd look carefully at this 3.02e11 number and find out exactly what it means. I'm pretty sure it's not standard Unix UTC, which should be seconds since January 1, 1970. It's way too big. It's close to GMT: Mon, 1 Jan 11540 08:53:20 UTC.

find mean or median date of event

I have a dataset for which I have extracted the date at which an event occurred. The date is in the format of MMDDYY although MatLab does not show leading zeros so often it's MDDYY.
Is there a method to find the mean or median (I could use either) date? median works fine when there is an odd number of days but for even numbers I believe it is averaging the two middle ones which doesn't produce sensible values. I've been trying to convert the dates to a MatLab format with regexp and put it back together but I haven't gotten it to work. Thanks
dates=[32381 41081 40581 32381 32981 41081 40981 40581];
You can use datenum to convert dates to a serial date number (1 at 01/01/0000, 2 at 02/01/0000, 367 at 01/01/0001, etc.):
strDate='27112011';
numDate = datenum(strDate,'ddmmyyyy')
Any arithmetic operation can then be performed on these date numbers, like taking a mean or median:
mean(numDates)
median(numDates)
The only problem here, is that you don't have your dates in a string type, but as numbers. Luckily datenum also accepts numeric input, but you'll have to give the day, month and year separated in a vector:
numDate = datenum([year month day])
or as rows in a matrix if you have multiple timestamps.
So for your specified example data:
dates=[32381 41081 40581 32381 32981 41081 40981 40581];
years = mod(dates,100);
dates = (dates-years)./100;
days = mod(dates,100);
months = (dates-days)./100;
years = years + 1900; % set the years to the 20th century
numDates = datenum([years(:) months(:) days(:)]);
fprintf('The mean date is %s\n', datestr(mean(numDates)));
fprintf('The median date is %s\n', datestr(median(numDates)));
In this example I converted the resulting mean and median back to a readable date format using datestr, which takes the serial date number as input.
Try this:
dates=[32381 41081 40581 32381 32981 41081 40981 40581];
d=zeros(1,length(dates));
for i=1:length(dates)
d(i)=datenum(num2str(dates(i)),'ddmmyy');
end
m=mean(d);
m_str=datestr(m,'dd.mm.yy')
I hope this info to be useful, regards
Store the dates as YYMMDD, rather than as MMDDYY. This has the useful side effect that the numeric order of the dates is also the chronological order.
Here is the pseudo-code for a function that you could write.
foreach date:
year = date % 100
date = (date - year) / 100
day = date % 100
date = (date - day) / 100
month = date
newdate = year * 100 * 100 + month * 100 + day
end for
Once you have the dates in YYMMDD format, then find the median (numerically), and this is also the median chronologically.
You see above how to present dates as numbers.
I will add no your issue of finding median of the list. The default matlab median function will average the two middle values when there are an even number of values.
But you can do it yourself! Try this:
dates; % is your array of dates in numeric form
sdates = sort(dates);
mediandate = sdates(round((length(sdates)+1)/2));