Subtract 1 millisecond from time Hiveql - hiveql

I am trying to subtract 1 millisecond from the update time (an existing timestamp in dd-MMM-yy hh.mm.ss.MS format), but I get null when I write -1. Please help. Below is my query, in which I need to subtract 1 millisecond from the result I get from the lead of the update time.
I tried this
nvl( lead(updatetime) over (partition by id order by updatetime asc)-1, now()) DW_END_DATE_TIME
Does not work. I am new to hive, I have no idea why this is not working.
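To make the intended result concrete, here is a minimal sketch of the same logic outside Hive. It is plain Python, not HiveQL, and the function name `end_times` and the row layout are hypothetical. It shows the end time of each row as the next update time for the same id minus an explicit 1 millisecond, falling back to the current time for the latest row, which is what the nvl/lead expression is trying to express.

```python
from datetime import datetime, timedelta

def end_times(rows, now):
    """rows: list of (id, updatetime) pairs.
    Returns a list of (id, updatetime, end_time) tuples."""
    by_id = {}
    for row_id, ts in rows:
        by_id.setdefault(row_id, []).append(ts)

    result = []
    for row_id, stamps in by_id.items():
        stamps.sort()  # order by updatetime asc within each id
        for i, ts in enumerate(stamps):
            if i + 1 < len(stamps):
                # lead(updatetime) minus an explicit unit of 1 millisecond
                end = stamps[i + 1] - timedelta(milliseconds=1)
            else:
                # no following row: fall back to the current time
                end = now
            result.append((row_id, ts, end))
    return result
```

The point of the sketch is that the subtraction names its unit (milliseconds); a bare -1 carries no unit at all.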

Related

How to convert timestamp in Redshift so that I get seconds/milliseconds?

I have a dataframe like:
id state city time
123.04 ny 1 01-10-2021 12:30
123.05 ny 2 01-10-2021 12:30
I want the id that is associated with the most recent time by state. So I do:
select a.id, a.state
from data a
join (select state, max(time) as most_recent
from data group by 1) b on a.state = b.state and a.time = b.most_recent
However, I am running into issues where the timestamp is the same. I know that I can do another query to then get the max ID but I would ideally like to just go by timestamp. I know that the ID is assigned in sequential order so if I am able to get the seconds or milliseconds then I will be able to actually get the most recent ID.
Is there a way to get seconds/milliseconds or do I have to do another query?
Redshift timestamps have 1-microsecond resolution; this is part of the data type definition. If you are not seeing the seconds and fractional seconds, it is because of how your workbench (or its connection to Redshift) is presenting the data to you. The easiest way to see the fractional seconds is to format the timestamp as a string when viewing it, so that the client does no reformatting of its own. For example:
select to_char(sysdate, 'HH24:MI:SS.US');
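As an analogy only, the same effect can be reproduced in Python: the value holds microseconds all along, and only the display format hides them. Here `%f` plays the role that `US` plays in the Redshift format string; the sample timestamp is made up for illustration.

```python
from datetime import datetime

# Hypothetical sample value with a fractional-second part.
ts = datetime(2021, 10, 1, 12, 30, 15, 123456)

# A minute-level display format hides the seconds and the fraction.
short = ts.strftime("%d-%m-%Y %H:%M")

# An explicit format string shows seconds and microseconds.
full = ts.strftime("%H:%M:%S.%f")
```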

Count Until A Specific Value?

Say you've got a table ordered by the date that captures the speed of vehicles with a device in them. And imagine you get 30 updates per day for the speed. It's not always 30 per vehicle. The data will have the vehicle, the timestamp, and the speed.
What I want to do is be able to count how many days have passed since the vehicle last went over 10 mph in order to find inactive vehicles. Is something like that possible in postgresql?
Or is there a way to get the row number where the speed goes past 10 when the table is sorted, and then subtract the date in that row from the current date?
SELECT DISTINCT ON (vessel) vessel, now() - date
FROM your_table
WHERE speed > 10
ORDER BY vessel, date DESC
This will tell you, for every vehicle, how long ago its speed field was last over 10.
SELECT vessel, now() - max(date)
FROM your_table
WHERE speed > 10
GROUP BY vessel;
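The logic of these queries can be sketched in Python for readers who want to check it on sample data. This is a minimal illustration, not part of the original answers; the function name `time_since_last_fast` and the tuple layout are assumptions.

```python
from datetime import datetime

def time_since_last_fast(readings, now, threshold=10):
    """readings: iterable of (vehicle, timestamp, speed) tuples.
    Returns {vehicle: now - latest timestamp at which speed > threshold}."""
    latest = {}
    for vehicle, ts, speed in readings:
        # Keep only rows over the threshold, and the newest one per vehicle.
        if speed > threshold and (vehicle not in latest or ts > latest[vehicle]):
            latest[vehicle] = ts
    return {vehicle: now - ts for vehicle, ts in latest.items()}
```

As with the WHERE clause in the queries, a vehicle that has never exceeded the threshold does not appear in the result at all.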

Get the last timestamps in a group by time query in Influxdb

I have a measurement in InfluxDB holding prices with timestamps in nanoseconds. When I do a select grouped by time like this one:
select first(price),last(price) from priceseries where time>=1496815212834974866 and time<=1496865599580302882 group by time(1s)
I receive a time column in which each timestamp is aligned to the second at which the group begins. For example, one timestamp will be 08:00:00 and the next will be 08:00:01.
How can I:
apply an aggregation function to the record timestamps themselves, like last(time) or first(time), so as to get the real first and last timestamps of the group (I can have many prices within a group)?
make the time column in the response show the closing second rather than the opening second? That is, if the group goes from 08:00:00 to 08:00:01, I want to see 08:00:01 in my time column instead of the 08:00:00 I see now.
Not when using an aggregation function, which implies use of group by.
select first(price), last(price) from priceseries where time >= <..> and time <= <..> will give you the first and last price within that time window.
When the query has a group by, the aggregation applies only to values within the intervals. The values themselves are the real values that fall in the 08:00:00 - 08:00:01 interval, it's just that the timestamp shown is for the interval itself, not the actual values.
This means that a query for 08:00:00 to 08:00:01 without a group by and a query with group by time(1s) over the same period will give the same result. The only difference is that the query without group by will carry the value's actual timestamp, while the group by query will carry the interval's timestamp instead.
The timestamp when using group by indicates the starting time of the interval. From that, you can calculate the end time as start time + interval. Which timestamp to show is not configurable in the query language.
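Since the query language does not expose this, one option is to do the grouping client-side on the raw points. The sketch below is a minimal illustration in Python under that assumption; the function name `group_by_second` and the row layout are hypothetical, and it works on plain (time_ns, price) pairs rather than on any particular InfluxDB client library.

```python
NS_PER_S = 1_000_000_000

def group_by_second(points, interval_ns=NS_PER_S):
    """points: list of (time_ns, price) pairs in any order.
    Returns rows of
    (interval_end_ns, first_time_ns, first_price, last_time_ns, last_price)."""
    buckets = {}
    for t, price in sorted(points):
        start = t - t % interval_ns  # opening timestamp of the interval
        buckets.setdefault(start, []).append((t, price))

    rows = []
    for start in sorted(buckets):
        first_t, first_p = buckets[start][0]
        last_t, last_p = buckets[start][-1]
        # Label the row with the closing timestamp: start time + interval.
        rows.append((start + interval_ns, first_t, first_p, last_t, last_p))
    return rows
```

Each row keeps the real timestamps of the first and last points and is labelled with the end of the interval instead of its start.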

Subtract between timestamp in Redshift

I found something weird. If one timestamp value is subtracted from another, Redshift returns a strange result. For example,
select table_1.c_timestamp - table_1.c_timestamp from table_1
The expected result should be zero or something similar, because the two timestamp values are the same.
However, what I received is "5012369 years 4 mons", and I have no idea how Redshift calculated that.
Can anyone give me some clues?
Thanks
Contrary to the other answer,
Datediff doesn't exactly subtract, but rather counts the number of times the datepart chosen starts between the two timestamps.
select datediff(second, '2018-04-10 00:00:00.001','2018-04-10 00:00:00.999')
>> 0
select datediff(second, '2018-04-10 00:00:00.999','2018-04-10 00:00:01.001')
>> 1
See: Datediff documentation
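The boundary-counting behaviour described above can be mimicked in a few lines of Python. This is only a sketch of the semantics for the second datepart, not Redshift's implementation, and the function name `datediff_seconds` is hypothetical.

```python
from datetime import datetime

def datediff_seconds(start, end):
    """Count the second boundaries crossed between start and end.
    Fractional seconds are dropped before subtracting, so two values
    inside the same second always give 0."""
    truncated_start = start.replace(microsecond=0)
    truncated_end = end.replace(microsecond=0)
    return int((truncated_end - truncated_start).total_seconds())
```

With the two pairs of timestamps from the examples above, this returns 0 and 1 respectively, even though the second pair is only 2 milliseconds apart.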
Edit: this is the way I found to perform the OP's task:
SELECT
round(((EXTRACT('epoch' FROM TIMESTAMP '2018-05-27 09:59:59.999') - EXTRACT('epoch' FROM TIMESTAMP '2018-05-27 09:59:59.001'))*1000 + EXTRACT(millisecond FROM TIMESTAMP '2018-05-27 09:59:59.999') - EXTRACT(millisecond FROM TIMESTAMP '2018-05-27 09:59:59.001'))::real/1000)
The right way to subtract between datetimes is:
select datediff(seconds, table_1.c_timestamp, table_1.c_timestamp) from table_1
Of course, it doesn't make much sense to subtract a timestamp from itself, because that obviously returns 0, but I assume you just ran that as a test.

Group by day even if data is missing

I want to group results by day: the max value for each of the last 30 days. I came up with:
select max(value), DATE(time) from table where time>DATE('now', '-30 days') group by DATE(time);
But this only gives results for dates with data. I want null or 0 for the dates without data. Is that possible?
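One possible approach, sketched here under the assumption that the database is SQLite (the DATE('now', '-30 days') syntax suggests so): generate the calendar days with a recursive CTE and LEFT JOIN the data onto them, so days without rows come back with a null max. The table name `readings`, the helper `daily_max`, and the `today` parameter are hypothetical; "last 30 days" is taken to mean today plus the 29 days before it.

```python
import sqlite3

QUERY = """
WITH RECURSIVE days(day) AS (
    SELECT DATE(:today, '-29 days')
    UNION ALL
    SELECT DATE(day, '+1 day') FROM days WHERE day < DATE(:today)
)
SELECT days.day, MAX(readings.value)
FROM days
LEFT JOIN readings ON DATE(readings.time) = days.day
GROUP BY days.day
ORDER BY days.day
"""

def daily_max(conn, today="now"):
    """Return one (day, max_value) row per calendar day.
    max_value is None for days that have no data."""
    return conn.execute(QUERY, {"today": today}).fetchall()
```

Wrapping the aggregate as COALESCE(MAX(readings.value), 0) would give 0 instead of null for the empty days.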