What is the largest Instant that can be represented as millis since epoch? - java-time

Instant.MAX.toEpochMilli() raises java.lang.ArithmeticException: long overflow.
What is the largest timestamp representable without hitting the limit of long?
And is there a constant somewhere for it?

The largest timestamp that won't raise an exception in toEpochMilli() is +292278994-08-17T07:12:55.807Z.
Instant.MAX is much larger: +1000000000-12-31T23:59:59.999999999Z.
Instant.ofEpochMilli(Long.MAX_VALUE); // +292278994-08-17T07:12:55.807Z
I'm not aware of any constant for this specific date, but it's easy enough to compute with Instant.ofEpochMilli(Long.MAX_VALUE).
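A minimal sketch of how one might wrap this up, assuming you want your own constant plus a non-throwing conversion. The names MaxEpochMilli, MAX_EPOCH_MILLI_INSTANT and tryToEpochMilli are hypothetical and not part of the JDK:

```java
import java.time.Instant;
import java.util.OptionalLong;

final class MaxEpochMilli {
    // Hypothetical constant: the JDK does not define one for this value.
    static final Instant MAX_EPOCH_MILLI_INSTANT = Instant.ofEpochMilli(Long.MAX_VALUE);

    // Returns the epoch-millisecond value, or empty if it does not fit in a long.
    static OptionalLong tryToEpochMilli(Instant instant) {
        try {
            return OptionalLong.of(instant.toEpochMilli());
        } catch (ArithmeticException overflow) {
            return OptionalLong.empty();
        }
    }

    public static void main(String[] args) {
        System.out.println(MAX_EPOCH_MILLI_INSTANT); // +292278994-08-17T07:12:55.807Z
        System.out.println(tryToEpochMilli(MAX_EPOCH_MILLI_INSTANT).isPresent());
        System.out.println(tryToEpochMilli(Instant.MAX).isPresent());
    }
}
```

The try/catch version avoids hand-written boundary comparisons, since toEpochMilli() already performs the overflow check.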

Related

How precise should I encode a Unix Time?

I came across this because I am working with time across multiple platforms, and it seems they all differ a little in how Unix time is implemented and/or handled. Thus the question.
Quoting Wikipedia page on Unix Time:
Unix has no tradition of directly representing non-integer Unix time numbers as binary fractions. Instead, times with sub-second precision are represented using composite data types that consist of two integers, the first being a time_t (the integral part of the Unix time), and the second being the fractional part of the time number in millionths (in struct timeval) or billionths (in struct timespec). These structures provide a decimal-based fixed-point data format, which is useful for some applications, and trivial to convert for others.
This seems to be the implementation in Go (UnixNano). In practice, however, many languages and platforms use milliseconds (Java?), some use a float (to try to maintain some precision), and others mostly use an int.
So if I'm implementing a transport format and I only have exactly 64 bits available to store a time value and no more, my question is two-fold:
Should I encode it as an integer or a floating-point value? And
Should I use seconds, milliseconds or nanosecond precision?
The main goal being to try to be as accurate as possible across as many languages and platforms as possible (without resorting to custom code in every single platform, of course).
p.s. I know this is a little subjective but I believe it's still possible to make a good, objective answer. Feel free to close if that's not the case.
It depends on what the required precision of the time value is, and its maximal range.
When storing nanoseconds in an unsigned 64bit integer, the range is about 584 years (2^64 ns), so precise and long enough for any practical application already.
Using a floating-point format has the advantage that both very small and very large values can be stored, with higher absolute precision for smaller values. But with 64 bits this is probably not a problem anyway.
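To put rough numbers on the two options, here is a small Java sketch. The class and method names are made up for illustration, and the year length assumes the average Gregorian year of 365.2425 days:

```java
final class TimeRange {
    // Average Gregorian year in seconds (365.2425 days); an assumption for the estimate.
    static final double SECONDS_PER_YEAR = 365.2425 * 86_400;

    // Years covered by an unsigned 64-bit counter ticking unitsPerSecond times per second.
    static double unsigned64RangeYears(double unitsPerSecond) {
        return Math.pow(2, 64) / unitsPerSecond / SECONDS_PER_YEAR;
    }

    // Gap between adjacent doubles near the given number of seconds.
    static double doubleResolutionSeconds(double seconds) {
        return Math.ulp(seconds);
    }

    public static void main(String[] args) {
        System.out.printf("ns in 64 bits: %.1f years%n", unsigned64RangeYears(1e9));
        System.out.printf("us in 64 bits: %.1f years%n", unsigned64RangeYears(1e6));
        // Resolution of a double holding seconds since 1970, for a present-day value.
        System.out.println(doubleResolutionSeconds(1.7e9));
    }
}
```

For a double holding about 1.7e9 seconds, the spacing between representable values is 2^-22 s, a bit under a quarter of a microsecond, and it grows as the value grows.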
If the time value is an absolute point in time instead of a duration, the transport format would also need to define what date/time the value 0 stands for (i.e. the Epoch).
Getting the current time on a UNIX-like system can be done using gettimeofday(), for example, which returns a struct with a seconds and microseconds value. This can then be converted into a single 64bit integer giving a value in microseconds. The Epoch for UNIX time is 1 January 1970 00:00:00 UT. (The clock() function does not measure real time, but instead the duration of time that the processor was active.)
When a time value for the same transport format is generated on another platform (for example Windows with GetSystemTime()), it would need to be converted to the same unit and epoch.
So the following things would need to be fixed for a transport protocol:
The unit of the time value (ms, us, ...), depending on required precision and range
If the time is a time point and not a duration, the Epoch (date and time of value 0)
Whether it is stored in an integer (unsigned or signed, if it is a duration that can be negative), or as a floating point
The endianness of the 64bit value
If floating point is used, the format of the floating point value (normally IEEE 754)
Because different platforms have different APIs to get the current time, probably it would always need some code to properly convert the time value, but this is trivial.
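As one possible way to pin down the choices listed above, here is a sketch in Java. It assumes a signed 64-bit integer, microseconds, the Unix epoch and big-endian byte order; those are illustrative choices rather than a recommendation from the answer, and TimestampCodec is a hypothetical name:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.time.Instant;

final class TimestampCodec {
    private static final long MICROS_PER_SECOND = 1_000_000L;

    // Encodes an Instant as signed 64-bit microseconds since 1970-01-01T00:00:00Z, big-endian.
    // Sub-microsecond digits are dropped; multiplyExact/addExact throw on overflow.
    static byte[] encode(Instant t) {
        long micros = Math.addExact(
                Math.multiplyExact(t.getEpochSecond(), MICROS_PER_SECOND),
                t.getNano() / 1_000L);
        return ByteBuffer.allocate(Long.BYTES)
                .order(ByteOrder.BIG_ENDIAN)
                .putLong(micros)
                .array();
    }

    // Decodes the 8-byte wire value back into an Instant.
    static Instant decode(byte[] wire) {
        long micros = ByteBuffer.wrap(wire).order(ByteOrder.BIG_ENDIAN).getLong();
        // floorDiv/floorMod keep the nanosecond part non-negative for pre-1970 values.
        return Instant.ofEpochSecond(
                Math.floorDiv(micros, MICROS_PER_SECOND),
                Math.floorMod(micros, MICROS_PER_SECOND) * 1_000L);
    }

    public static void main(String[] args) {
        Instant now = Instant.now();
        System.out.println(now + " -> " + decode(encode(now)));
    }
}
```

Any other platform then only needs to convert its native clock reading to the same unit and epoch before writing the eight bytes.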
For maximum portability and accuracy, you should probably go with a type specified by POSIX. That way, the code will be portable across all Unixes and other operating systems conforming to POSIX.
I suggest that you use clock_t and the clock() function for time. This has a variety of uses, including measuring time and distance between one point in a program and another. Just make sure to cast the result to a double and divide by CLOCKS_PER_SEC afterwards to convert that time into a human-readable format.
So, to answer your question:
Use both an integer and a floating-point value
Unsure precision (the number of clock cycles between calls) but accurate enough for all non-critical applications and some more important ones

How to handle integer overflow in LogParser?

I'm selecting Sum(time-taken) over logs that span a long period of time, and the result comes out negative. How do I handle it?
You could try doing a sum of TO_REAL(time-taken).
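The general mechanism behind a negative sum can be illustrated outside LogParser. The Java sketch below is only an analogy under the assumption that the sum is accumulated in a fixed-width signed integer; it says nothing about LogParser's actual internals, and the names are hypothetical:

```java
final class SumOverflow {
    // Accumulates in a 32-bit signed int, which silently wraps around on overflow.
    static int sumAsInt(int[] values) {
        int total = 0;
        for (int v : values) {
            total += v;
        }
        return total;
    }

    // Accumulates in a double, trading exactness for a much larger range.
    static double sumAsReal(int[] values) {
        double total = 0.0;
        for (int v : values) {
            total += v;
        }
        return total;
    }

    public static void main(String[] args) {
        int[] timeTaken = {Integer.MAX_VALUE, 1};
        System.out.println(sumAsInt(timeTaken));  // wraps to a negative number
        System.out.println(sumAsReal(timeTaken)); // stays positive
    }
}
```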

snowflake: "left shift" made that result exceeds long.max value

((timestamp - 1288834974657) << 32)
I need to store 32 more bits of information after the timestamp, so the timestamp has to be shifted left by 32 bits. The shifted result then exceeds the maximum long value and shows up as a negative number, something like -7187691577906700288, which is wrong.
Hope I described my question correctly. Please help...
I don't know Snowflake well (I assume it's a language?), and I don't know what format that timestamp is in. If 1288834974657 is a Unix timestamp in seconds, it's in the year 42811.
The issue is that this particular timestamp is larger than 32 bits. Since you move it up another 32 bits, your number overflows. It looks like the long in your language is signed, which means that the maximum number is probably 2^63-1. If the long were unsigned, the maximum number would probably be 2^64-1.
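A short Java sketch of the overflow, assuming the timestamps are in milliseconds and the arithmetic uses a signed 64-bit long. With a 32-bit shift only 31 bits remain for the time delta, which is about 24.8 days of milliseconds. The 22-bit shift shown for comparison matches the commonly described Snowflake ID layout (41-bit timestamp, 10-bit worker ID, 12-bit sequence); the class and method names are hypothetical:

```java
final class SnowflakeShift {
    // Custom epoch from the question, assumed to be in milliseconds.
    static final long CUSTOM_EPOCH_MS = 1288834974657L;

    static long shiftBy32(long timestampMs) {
        return (timestampMs - CUSTOM_EPOCH_MS) << 32;
    }

    static long shiftBy22(long timestampMs) {
        return (timestampMs - CUSTOM_EPOCH_MS) << 22;
    }

    // True if a non-negative delta can be shifted left without reaching the sign bit.
    static boolean fitsAfterShift(long delta, int shift) {
        return delta >= 0 && delta <= (Long.MAX_VALUE >> shift);
    }

    public static void main(String[] args) {
        long ts = CUSTOM_EPOCH_MS + (1L << 31); // roughly 24.8 days after the custom epoch
        System.out.println(shiftBy32(ts)); // negative: the top bit landed in the sign bit
        System.out.println(shiftBy22(ts)); // still positive
        System.out.println(fitsAfterShift(ts - CUSTOM_EPOCH_MS, 32));
    }
}
```

If 32 low bits are really needed, the delta has to be kept below 2^31, for example by using a coarser time unit or a more recent custom epoch.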

What's Int.MaxValue between friends?

The max values of int, float and long in Scala are:
Int.MaxValue = 2147483647
Float.MaxValue = 3.4028235E38
Long.MaxValue = 9223372036854775807L
From the authors of Scala compiler, Keynote, PNW Scala 2013, slide 16 What's Int.MaxValue between friends?:
val x1: Float = Long.MaxValue
val x2: Float = Long.MaxValue - Int.MaxValue
println (x1 == x2)
// NO WONDER NOTHING WORKS
Why does this expression return true?
A Float is a 4-byte floating point value. Meanwhile a Long is an 8-byte value and an Int is also a 4-byte value. However, the way numbers are stored in 4-byte floating point values means that they have only around 7 decimal digits of precision (a 24-bit significand). Consequently, they cannot preserve even the 4 most significant bytes (around 9-10 digits) of a Long, regardless of the value of the 4 least significant bytes (another 9-10 digits).
Consequently, the Float representation of the two expressions is the same, because the bits that differ are below the resolution of a Float. Hence the two values compare equal.
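The same comparison can be reproduced in Java, where the long-to-float conversion is written as an explicit cast. This is a sketch for illustration; FloatWidening and sameAsFloat are hypothetical names:

```java
final class FloatWidening {
    // True if both longs convert to the same float value.
    static boolean sameAsFloat(long a, long b) {
        return (float) a == (float) b;
    }

    public static void main(String[] args) {
        long x1 = Long.MAX_VALUE;
        long x2 = Long.MAX_VALUE - Integer.MAX_VALUE;
        System.out.println(sameAsFloat(x1, x2)); // true
        // Spacing between adjacent floats near 2^63 is 2^40, far more than Int.MaxValue.
        System.out.println(Math.ulp((float) x1));
    }
}
```

Both values round to 2^63, because neighbouring floats at that magnitude are 2^40 apart while the two longs differ by less than 2^31.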
Echoing Mike Allen's answer, but hoping to provide some additional context (would've left this as a comment rather than a separate answer, but SO's reputation feature wouldn't let me).
Integers have a range of values defined as either 0 to 2^n - 1 (for an unsigned integer) or -2^(n-1) to 2^(n-1) - 1 (for signed integers), where n is the number of bits in the underlying implementation (n=32 in this case). If you wish to represent a number larger than 2^31 - 1 with a signed value, you can't use an int. A signed long will work up to 2^63 - 1. For anything larger than this, a float can go up to roughly 2^128.
One other thing to note is that these resolution issues only come into play when the value stored in the floating point number is large relative to the difference you care about. In this case, the subtraction changes the true value by an amount many orders of magnitude smaller than the value itself. A float would not round off the difference between 100 and 101, but it will round off the difference between 10000000000000000000000000000 and 10000000000000000000000000001.
Same goes for small values. If you cast 0.1 to an integer, you get exactly 0. This is not generally considered a failing of the integer data type.
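The two observations above can be checked with a few lines of Java. This is a sketch with hypothetical names; the 2^24 limit follows from the 24-bit significand of a 4-byte float:

```java
final class FloatResolution {
    // Above 2^24 (16,777,216) a float can no longer represent every integer.
    static final float FLOAT_EXACT_INT_LIMIT = 1 << 24;

    // True if adding 1 to x yields a different float.
    static boolean resolvesPlusOne(float x) {
        return x + 1f != x;
    }

    // Casting to int truncates toward zero, discarding the fraction.
    static int truncate(double x) {
        return (int) x;
    }

    public static void main(String[] args) {
        System.out.println(resolvesPlusOne(100f));                  // true
        System.out.println(resolvesPlusOne(1e28f));                 // false
        System.out.println(resolvesPlusOne(FLOAT_EXACT_INT_LIMIT)); // false
        System.out.println(truncate(0.1));                          // 0
    }
}
```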
If you are operating on numbers that are many orders of magnitude different in size, and also not able to tolerate rounding errors, you will need data structures and algorithms that account for the inherent limitations of binary data representation. One possible solution would be to use a floating point encoding with fewer exponent bits, thereby limiting the max value but providing greater resolution in the less significant bits. For greater detail, check out:
look up the IEEE Standard 754 (which defines the floating point encoding)
http://steve.hollasch.net/cgindex/coding/ieeefloat.html
https://randomascii.wordpress.com/2012/02/25/comparing-floating-point-numbers-2012-edition/

how to get a Matlab timestamp in integer seconds

I need to get a timestamp in integer seconds, that won't roll over.
Can be elapsed CPU seconds, or elapsed clock or epoch seconds.
'clock' gives a date/time vector in years ... seconds.
But I can't figure out how to convert this to integer seconds.
cputime returns elapsed integer seconds but "This number can overflow the internal representation and wrap around.".
What about round(3600 * 24 * now)?
According to the manual, now returns the number of days since the year 0, as a floating point number. Multiplying by 86400 should thus give seconds.
Usually it is better to use a fixed-point format for keeping track of time, but since you are only interested in integer seconds, it should not be too much of a problem. The time resolution of now due to floating point resolution can be found like this:
>> eps(now*86400)
ans =
7.6294e-06
Or almost 8 microseconds. This should be good enough for your use case. Since these are 64-bit floating point numbers, you should not have to worry about wrapping around within your lifetime.
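The eps value above can be derived by hand. As a worked sketch, assuming a present-day datenum of roughly 7.4e5 days and the 52 fraction bits of an IEEE 754 double:

```latex
x = 86400 \cdot \texttt{now} \approx 6.4 \times 10^{10}, \qquad 2^{35} \le x < 2^{36}
\]
\[
\operatorname{eps}(x) = 2^{35 - 52} = 2^{-17} \approx 7.6294 \times 10^{-6}\ \text{s}
```

The resolution only doubles once x crosses 2^36 seconds, which corresponds to a datenum beyond the year 2100.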
One practical issue is that the number of seconds since the year 0 is too large to be printed as an integer on the Matlab prompt with standard settings. If that bothers you, you can do fprintf('%i\n', round(3600 * 24 * now)), or simply subtract some arbitrary number, e.g. to get the number of seconds since the year 2000 you could do
epoch = datenum(2000, 1, 1);
round(86400 * (now - epoch))
which currently prints 488406681.