How many days in ISO8601 duration 'months' and 'years'?

I need to represent a duration of 100 days.
From Wikipedia:
Durations are represented by the format P[n]Y[n]M[n]DT[n]H[n]M[n]S
But for a duration, how many days are in a month? Surely it depends on which month we're in... Ditto, days in a year.

Following up on my comment, it looks like it's perfectly legal to specify a count larger than the number of units that make up the next "larger" unit, so you could just use P100D. Wikipedia says:
The standard does not prohibit date and time values in a duration representation from exceeding their "carry over points" except as noted below. Thus, "PT36H" could be used as well as "P1DT12H" for representing the same duration. But keep in mind that "PT36H" is not the same as "P1DT12H" when switching from or to Daylight saving time.
While the example is for hours, it seems like days should be fine too (and as your problem illustrates, even more useful, since months and years aren't fixed quantities). That said, the standard doesn't specify the maximum number of digits for each unit, only:
Leading zeros are not required, but the maximum number of digits for each element should be agreed to by the communicating parties.
So whatever is consuming your durations must accept at least 3 digits per element for P100D to work; that doesn't seem like an unusually high level of required support.
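For example, here is a minimal Python sketch that handles just the day-only form used above (the function name is illustrative; a full ISO 8601 duration parser, such as the one in the isodate package, handles the complete P[n]Y[n]M[n]DT[n]H[n]M[n]S grammar):

import re
from datetime import timedelta

def parse_day_only_duration(text):
    # Handles only the "P[n]D" subset of ISO 8601 durations, e.g. "P100D".
    match = re.fullmatch(r"P(\d+)D", text)
    if match is None:
        raise ValueError("not a day-only ISO 8601 duration: %r" % text)
    return timedelta(days=int(match.group(1)))

print(parse_day_only_duration("P100D"))  # 100 days, 0:00:00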

Related

How precise should I encode a Unix Time?

I came across this because I am working with time across multiple platforms, and it seems like they all differ a little from each other in how Unix time is implemented and/or handled in their system. Hence the question.
Quoting Wikipedia page on Unix Time:
Unix has no tradition of directly representing non-integer Unix time numbers as binary fractions. Instead, times with sub-second precision are represented using composite data types that consist of two integers, the first being a time_t (the integral part of the Unix time), and the second being the fractional part of the time number in millionths (in struct timeval) or billionths (in struct timespec). These structures provide a decimal-based fixed-point data format, which is useful for some applications, and trivial to convert for others.
This seems to be the implementation in Go (UnixNano). However, in practice, many languages/platforms use milliseconds (Java?), and some platforms use a float (to try to maintain some precision) while others mostly use an int.
So if I'm implementing a transport format and I only have exactly 64 bits available to store a time value and no more, my question is two-fold:
Should I encode it as an integer or a floating-point value? And
Should I use seconds, milliseconds or nanosecond precision?
The main goal being to try to be as accurate as possible across as many languages and platforms as possible (without resorting to custom code in every single platform, of course).
p.s. I know this is a little subjective but I believe it's still possible to make a good, objective answer. Feel free to close if that's not the case.
It depends on what the required precision of the time value is, and its maximal range.
When storing nanoseconds in an unsigned 64-bit integer, the range is about 584 years (2^64 ns), which is already precise and long enough for any practical application.
Using a floating-point format has the advantage that both very small and very large values can be stored, with higher absolute precision for smaller values. But with 64 bits this is probably not a problem anyway.
If the time value is an absolute point in time instead of a duration, the transport format would also need to define what date/time the value 0 stands for (i.e. the epoch).
Getting the current time on a UNIX-like system can be done using gettimeofday(), for example, which returns a struct with a seconds and a microseconds value. This can then be converted into a single 64-bit integer giving a value in microseconds. The epoch for Unix time is 1 January 1970 00:00:00 UTC. (The clock() function does not measure real time, but instead the duration of time that the processor was active.)
When a time value for the same transport format is generated on another platform (for example Windows with GetSystemTime()), it would need to be converted to the same unit and epoch.
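A minimal Python sketch of that conversion (using time.time_ns(), which reports nanoseconds since the Unix epoch; the variable name is just illustrative):

import time

# A single 64-bit-friendly integer: microseconds since 1970-01-01 00:00:00 UTC.
# (A signed 64-bit value in microseconds covers roughly +/- 292,000 years.)
micros_since_epoch = time.time_ns() // 1_000
print(micros_since_epoch)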
So the following things would need to be fixed for a transport protocol:
The unit of the time value (ms, us, ...), depending on required precision and range
If the time is a time point and not a duration, the Epoch (date and time of value 0)
Whether it is stored as an integer (unsigned, or signed if it is a duration that can be negative) or as a floating-point value
The endianness of the 64-bit value
If floating point is used, the format of the floating point value (normally IEEE 754)
Because different platforms have different APIs for getting the current time, some code will probably always be needed to convert the time value properly, but this is trivial.
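As an illustration, here is a hedged Python sketch that fixes all of those choices for one possible transport encoding; the unit, epoch, signedness, byte order and function names chosen here are assumptions for the example, not requirements:

import struct
import time

# Choices made for this example only: microseconds since the Unix epoch,
# signed 64-bit integer, big-endian ("network byte order").
def encode_timestamp(micros_since_epoch):
    return struct.pack(">q", micros_since_epoch)

def decode_timestamp(payload):
    return struct.unpack(">q", payload)[0]

now_us = time.time_ns() // 1_000
assert decode_timestamp(encode_timestamp(now_us)) == now_us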
For maximum portability and accuracy, you should probably go with a type specified by POSIX. That way, the code will be portable across all Unixes and other operating systems conforming to POSIX.
I suggest that you use clock_t and the clock() function for time. This has a variety of uses, including measuring the time elapsed between one point in a program and another. Just make sure to cast the result to a double and divide by CLOCKS_PER_SEC afterwards to convert that time into seconds (a rough Python analogue is sketched after the summary below).
So, to answer your question:
Use both an integer and a floating-point value
The precision is unspecified (it depends on the granularity of the clock ticks between calls), but it is accurate enough for all non-critical applications and some more important ones
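For what it's worth, a rough Python analogue of this idea (an assumption on my part, since clock() itself is a C function) is time.process_time(), which measures CPU time and already returns seconds as a float:

import time

start = time.process_time()
sum(i * i for i in range(1_000_000))     # some work to measure
elapsed = time.process_time() - start    # CPU seconds, already a float
print("CPU time used: %.6f s" % elapsed)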

Is the Rate in the source block a fixed rate?

I have a simple source to sink model and I am merely altering the "Rate" to 6 per hour. I would expect a fixed 6 agents to be generated each hour, but it seems like in the first hour from 0 to 60min, only 3 agents are generated. Similarly in the time 60-120min, only 5 agents were generated.
Is there a warm-up period in AnyLogic or something like this that explains what is happening?
Another alternative is to just use the interarrival time with a fixed time. This will give you the same results as Felipe's answer, but with one less object, as you will not need the event.
A few important items to note on this approach:
Instead of 6.0, using a parameter would be better. You could call this parameter dArrivalsPerHour. This would make your source block easier to read in the future, and give you some better flexibility. Your interarrival time would be 1.0 / dArrivalsPerHour.
Make sure you divide by at least one double. If you wrote 1/6, Java would actually return 0! This is because dividing two integers in Java returns an integer, so Java just truncates the decimal part. If you use a parameter, just set its type to double. To be extra careful against anyone accidentally changing the parameter type to integer in the future, I would still go ahead and use 1.0.
AnyLogic does not generate an arrival at time zero with this approach; the first arrival would be at 0.167 hours (10 minutes). If you want an arrival at time zero, followed by this pattern (it would still be 6 per hour, just shifting when it starts), then you have a couple of options. First, you can use Felipe's approach and set the first occurrence time to zero. An alternative would be to call inject() On Startup, or after you have finished any initialization code your model has.
Happy Modeling!
The source block doesn't produce exactly 6 agents per hour; it produces agents using a Poisson distribution with a mean of 6 per hour (lambda = 6). So the number of agents per hour you get will be random. But the reason you always get 3 in the first hour and 5 in the second hour is that you have a fixed seed:
You can find that option clicking on your simulation experiment under the randomness tab. If you change to random seed it will produce different agents per hour instead of always 3 and 5.
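To see the same effect outside AnyLogic, here is a small Python sketch (numpy assumed; the function name is illustrative, and the exact numbers will differ from AnyLogic's since it is a different random number generator, but the behaviour is the same). Hourly arrival counts drawn from a Poisson distribution with mean 6 are reproducible with a fixed seed and vary otherwise:

import numpy as np

def hourly_counts(seed, hours=5, rate_per_hour=6):
    # Number of arrivals in each hour of a Poisson process with the given rate.
    rng = np.random.default_rng(seed)
    return rng.poisson(lam=rate_per_hour, size=hours)

print(hourly_counts(seed=1))  # same seed  -> same counts on every run
print(hourly_counts(seed=2))  # other seed -> different counts, mean still 6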
To produce EXACTLY 6 per hour you need to use an event. But first create a source that generates agents through injection:
And the event running 6 times per hour, adding 1 agent to the source:

How to count the number of significant digits?

For example, 5.020 would return 4. Preferably, it should work with vector inputs too.
I Googled around and found some answers, but none of them counted the last zero in 5.020.
From the given information, it is not possible.
The problem is that when you enter a number, it is (per the floating-point standard) represented as a double, so it has a relative precision of eps and the precision you entered is lost. Moreover, since one is typically not interested in seeing all ~15 digits, MATLAB uses a couple of different display rules that are independent of the originally entered number; typically the integer part plus 4 digits are shown.
Additionally, the standard rule when converting a number to a string (num2str) is to cut off trailing zeros, which is why you do not get the last zero.
Your only option is to count the number of significant digits when you obtain the data, which leads back to the question @Beaker asks you in the comments.
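If you do capture the value as text at input time, counting is straightforward. Here is a minimal Python sketch of the idea (the function name is illustrative, and it ignores exponent notation and other corner cases):

def count_significant_digits(text):
    # Works on the original string, e.g. "5.020" -> 4 and "0.0520" -> 3.
    digits = text.strip().lstrip("+-").replace(".", "")
    return len(digits.lstrip("0"))

print(count_significant_digits("5.020"))   # 4
print(count_significant_digits("0.0520"))  # 3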

Calculating IV60, and IV90 on interactive brokers

I am trading options, but I need to calculate the historical implied volatility over the last year. I am using Interactive Brokers' TWS. Unfortunately they only calculate V30 (the implied volatility of the stock using options that will expire in 30 days). I need to calculate the implied volatility of the stock using options that will expire in 60 days, and in 90 days.
The problem: calculate the implied volatility of at least a whole year of an individual stock using options that will expire in 60 days and in 90 days, given that:
TWS does not provide V60 or V90.
TWS does not provide historical pricing data for individual options for more than 3 months.
The attempted solution:
Use the V30 that TWS provides to come up with V60 and V90, given that option prices usually behave like a skew (horizontal skew). However, the problem with this attempted solution is that the skew does not always have a positive slope, so I can't come up with a mathematical solution that always correctly estimates IV60 and IV90, since the skew can have a positive or a negative slope, as in the picture below.
Any ideas?
Your question is either confusing or isn't about programming. This is what IB says.
The IB 30-day volatility is the at-market volatility estimated for a maturity thirty calendar days forward of the current trading day, and is based on option prices from two consecutive expiration months.
It makes no sense to me and I can't even get those ticks to arrive (generic type 24). But even if you get them, they don't seem to be useful. My guess is it's an average to estimate what the IV would be for an option expiring exactly 30 days in the future. I can't imagine the purpose for this. The data would be impossible to trade with and doesn't represent reality. Imagine an earnings report at 29 or 31 days!
If you'd like the IV about 60 or 90 days in the future, call reqMktData with an option contract that expires around then and an empty generic tick list. You will get tick types 10, 11, 12, and 13, which all have an IV. That's how you build the IV surface. If you'd like to build it with a weighted average to estimate 60 days, it's possible (see the sketch after the example output below).
This is Python, but it should be self-explanatory:
# "tws" is assumed to be an already-connected TWS API client object.
tickerId = 1

# The option contract whose implied volatility we want (an AAPL call here).
optCont = Contract()
optCont.m_localSymbol = "AAPL 170120C00130000"
optCont.m_exchange = "SMART"
optCont.m_currency = "USD"
optCont.m_secType = "OPT"

# Empty generic tick list; tick types 10-13 arrive anyway, each carrying an IV.
tws.reqMktData(tickerId, optCont, "", False)
Then I get data like
<tickOptionComputation tickerId=1, field=10, impliedVol=0.20363398519176756, delta=0.0186015418248492, optPrice=0.03999999910593033, pvDividend=0.0, gamma=0.007611155331932943, vega=0.012855970569816431, theta=-0.005936076573849303, undPrice=116.735001>
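As a sketch of the weighted-average idea mentioned above (the function and variable names are my own, not part of the TWS API), one simple scheme is to interpolate linearly in time-to-expiry between the IVs of the two expirations that bracket the target tenor; interpolating in total variance is another common choice.

def interpolate_iv(days_near, iv_near, days_far, iv_far, target_days=60):
    # Linear interpolation in time-to-expiry between two bracketing expirations.
    w = (target_days - days_near) / float(days_far - days_near)
    return (1.0 - w) * iv_near + w * iv_far

# e.g. expiries 45 and 73 days out with IVs of 0.21 and 0.24:
print(interpolate_iv(45, 0.21, 73, 0.24))  # roughly 0.226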
If there's something I'm missing about options, you should ask this at https://quant.stackexchange.com/

How to convert month to other duration measurement types?

For some duration-related calculations I need to convert values measured in "months" to other formats, such as years, days, or hours.
For example, what is the proper way to measure a month in terms of days? Is it 30 days, or 30.4375 days (365.25 / 12)? And which format would be useful in which cases?
If you have any information on the casual/business use cases for such conversions it would be helpful too.
Unfortunately, there's really no single generally valid answer to your question.
If this is for business use, first check whether there are any existing relevant standards or business practices that define what a "month" means in your business context. If yes, you should follow that definition as closely as possible, however silly or awkward it may seem.
For casual use, the simplest solution is probably to pick any widely used date manipulation library and do whatever it does. The default behavior may not be perfect, but it's probably at least close to a fairly sensible compromise of the many contradictory expectations that users of such a library may have.
OK, but what if you insist on rolling your own solution? In that case, the first choice you should make is how you want to represent date / time values. There are at least two common choices:
The first option is to store dates / times using a simple linear count of fixed time units from a given epoch, such as Julian days or Unix timestamps. This provides a simple and compact date/time representation, makes comparing timestamps and simple date/time arithmetic (like adding n seconds to a time value) easy, and ensures that any time value corresponds to a (more or less) unique and well defined point in time.
The downside, as you've noticed, is that arithmetic using "fuzzy" time units like months or years gets difficult: you can define a year as 365.25 days (or as 365.2425 days, to take into account that only 97 out of every 400 years are leap years in the Gregorian calendar) and a month as 1/12 years, but this will cause adding a year to a date-time value to also shift the time of day by (about) 6 hours, which may be unexpected.
This approach also doesn't let you easily represent "floating" time values, like times of day without a specified date and time zone. (You can sort of deal with floating time zones by doing your time math in UTC and just pretending that it's in your local time zone, but this can cause weird stuff to happen around DST changeovers.) Conversely, it can also cause difficulties if you need to represent imprecise date/time values, such as dates without a time component.
In particular, if you choose the "natural" representation, where imprecise datetimes are represented by their starting point, so that e.g. an unspecified time of day defaults to 00:00:00.0, then anything that causes the time part to be reduced by even a fraction of a second (like, say, shifting to a later time zone, or subtracting a fuzzy time unit that is not an integral number of days) will flip the date part to the previous day. For example, with this representation, subtracting one year (= 365.2425 days) from January 1, 2014 will yield a date in 2012 (specifically, December 31, 2012, 18:10:48)!
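A quick Python check of that effect, using a fixed-length year of 365.2425 days:

from datetime import datetime, timedelta

one_year = timedelta(days=365.2425)   # fixed-length "fuzzy" year
start = datetime(2014, 1, 1)          # unspecified time of day -> 00:00:00

print(start - one_year)  # 2012-12-31 18:10:48 -- the date part flips into 2012
print(start + one_year)  # 2015-01-01 05:49:12 -- the time of day drifts by about 6 hours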
You can avoid some of these issues by representing imprecise date/time values by their midpoints instead, so that e.g. the date 2014 is treated as shorthand for June 2, 2014, 12:00:00. What you lose, with this representation, is the ability to build datetimes just by adding up components: with this representation, 2014 + 5 months + 3 days isn't anywhere near May 3, 2014.
Also, just when you think you've at least got simple non-fuzzy time arithmetic unambiguously sorted out, someone's going to tell you about leap seconds...
The alternative approach is to store datetime values in decomposed year / month / day / hour / minute / second / etc. format. With this representation, time intervals are also naturally stored in a decomposed format: "one month + 17 days" is, in itself, a valid time interval in such a representation, and need not (and should not) be simplified further.
This has a few obvious advantages:
Fuzzy unit arithmetic is (conceptually) simple: to add one year to a date, just increment the year component by one.
Imprecise date/time values can be naturally represented: for a pure date value, the time-of-day components can simply be left undefined (e.g. represented by negative values for the undefined components, or by having each datetime value store its precision).
You have precise control over when and if rollover occurs: adding a year to a date in 2014 will always yield a date in 2015.
You can also support floating time values, such as times of day without a specified date, or dates of year without a specified year. Floating time zones also become supportable.
That said, there are some disadvantages, too:
Implementing date arithmetic gets more complex, since you have to deal with non-trivial carry/borrow rules. (Quick! What's the date 10,000,000 seconds after May 3, 2014?)
You'll still have ambiguities with month arithmetic: what's the date one month after January 31? And does it depend on whether it's a leap year or not?
You can allow such a format to store "impossible" dates like "February 31", with an optional method to normalize them to, say, February 28 (or 29, for a leap year) later. This has the advantage of preserving (some) arithmetic consistency: it allows (January 31 + 1 month) + 1 month to equal March 31 as expected.
In some ways, though, this merely postpones the problem: presumably, January 31 + 24 hours should fall on February 1, but what day and month should January 31 + 1 month + 24 hours fall on? The "obvious" choice would be March 1, but whatever you choose, there will be some sequence of arithmetic operations that will yield inconsistent results.
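For comparison, here is how one widely used Python library, dateutil's relativedelta (which clamps to the last day of the month rather than storing impossible dates like February 31), resolves these cases; note that it gives up the arithmetic consistency described above:

from datetime import date
from dateutil.relativedelta import relativedelta

jan31 = date(2014, 1, 31)

print(jan31 + relativedelta(months=1))                            # 2014-02-28 (clamped)
print(jan31 + relativedelta(months=1) + relativedelta(months=1))  # 2014-03-28, not March 31
print(jan31 + relativedelta(months=2))                            # 2014-03-31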