Redshift human-readable durations / equivalent to Postgres `justify_interval`

I have duration values in microseconds in Redshift that I want to show in a reduced form, e.g. 49275177354 should appear as 13:41:15.177354 or something similarly human-readable.
The values can be anywhere from a few seconds to several hours, so the appropriate base unit will vary from row to row.
In Postgres, I would handle this using justify_interval like this:
select justify_interval(interval '1 usec' * my_col_usec) from my_table;
How can this be accomplished in Redshift? I've looked at the docs on Redshift intervals but haven't found any relevant functions. The next best workaround I've come up with is to use CASE statements, but I'd like something a bit more concise and, ideally, built in.

You should use something like
select (49275177354 * INTERVAL '0.000001 second');
or
select '00:00:00.000'::time + (49275177354 * INTERVAL '0.000001 second');
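Applied to the question's table, that would look like the sketch below (my_table and my_col_usec are the names from the question; note that time wraps at midnight, so this only holds while durations stay under 24 hours):
-- Sketch: render microsecond durations as hh:mm:ss.ffffff
select '00:00:00.000'::time + (my_col_usec * INTERVAL '0.000001 second') as duration
from my_table;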

Related

Get truncated data from a table - PostgreSQL

I want to get truncated data over the last month. My times are Unix timestamps, and I need to get the data from the last 30 days for each specific day.
The data is in the following form:
{
  "id": "648637",
  "exchange_name": "BYBIT",
  "exchange_icon_url": "https://cryptowisdom.com.au/wp-content/uploads/2022/01/Bybit-colored-logo.png",
  "trade_time": "1675262081986",
  "price_in_quote_asset": 23057.5,
  "price_in_usd": 1,
  "trade_value": 60180.075,
  "base_asset_icon": "https://assets.coingecko.com/coins/images/1/large/bitcoin.png?1547033579",
  "qty": 2.61,
  "quoteqty": 60180.075,
  "is_buyer_maker": true,
  "pair": "BTCUSDT",
  "base_asset_trade": "BTC",
  "quote_asset_trade": "USDT"
}
I need to truncate data based on trade_time
How do I write the query?
The secret sauce is the date_trunc function, which takes a timestamp with time zone and truncates it to a specific precision (hour, day, week, etc.). You can then group based on this value.
In your case we first need to convert these JavaScript-style Unix timestamps (milliseconds since the epoch) to timestamp with time zone, which we can do with to_timestamp, but it's still a fairly simple query.
SELECT
    date_trunc('day', to_timestamp(trade_time / 1000.0)),
    COUNT(1)
FROM pings_raw
GROUP BY date_trunc('day', to_timestamp(trade_time / 1000.0))
Another approach would be to leave everything as numbers, which might be marginally faster, though I find it less readable:
SELECT
    (trade_time / (1000*60*60*24))::int * (1000*60*60*24),
    COUNT(1)
FROM pings_raw
GROUP BY (trade_time / (1000*60*60*24))::int
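If you later need that numeric bucket as a readable date, one option is to convert it back; a sketch building on the same expression (the bucket value is epoch milliseconds at the start of the day):
SELECT
    -- divide back to seconds so to_timestamp can turn it into a timestamptz
    to_timestamp((trade_time / (1000*60*60*24))::int * (1000*60*60*24) / 1000.0) AS day_start,
    COUNT(1)
FROM pings_raw
GROUP BY 1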

Subtract Additional Time from $__timeFilter

I want to subtract additional time in $__timeFilter in Grafana. For example, if I have selected Last 7 days, I want to run two queries that do a comparison: one gives me the average CPU utilization for the last 7 days, and the other gives me the average CPU utilization for now() - 14d to now() - 7d. And this is dynamic: it should work for 6 hrs, 2 days, or anything else selected.
My database is TimescaleDB and my Grafana version is 8.3.5.
Edit
Query is
select avg(cpu) from cpu_utilization where $__timeFilter(timestamp)
Whatever is selected in the time filter in Grafana, the query is manipulated accordingly.
For example, if I select last 24 hrs, Grafana translates this query into the following:
select avg(cpu) from cpu_utilization where timestamp BETWEEN '2022-09-07 05:32:10' and '2022-09-08 05:32:10'
This is normal behaviour. Now I want that, if I select last 24 hrs, this query behaves as it does now, but an additional query becomes:
select avg(cpu) from cpu_utilization where timestamp BETWEEN '2022-09-06 05:32:10' and '2022-09-07 05:32:10'
(I don't want this just for last 24 hrs, but for any relative time period selected in the filter.)
Answer: https://stackoverflow.com/a/73658919/14817486
You can use the global variables $__to and $__from.
For example, ${__from:date:seconds} will give you a timestamp in seconds. You can then subtract 7 days (= 604800 seconds) from it and use it in your query's WHERE clause. Depending on your SQL dialect, that might be by using TIMESTAMP(), TO_TIMESTAMP() or something similar. So it would look similar to this:
[...] WHERE timestamp BETWEEN TO_TIMESTAMP(${__from:date:seconds}-604800) AND TO_TIMESTAMP(${__to:date:seconds}-604800) [...]
Interesting question! If I understood correctly, you could use the timestamp column as the reference, since Grafana is already filtering on it, to build the comparison query. You can get min(timestamp) and max(timestamp) to know the limits of your period and then build something from it.
For example, min(timestamp) - INTERVAL '7 days' would give you the start of the previous range, and max(timestamp) - INTERVAL '7 days' would give you its end.
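A minimal sketch of that idea against the question's cpu_utilization table; as an assumption, it shifts by the window width (w.win_end - w.win_start) instead of a hardcoded 7 days, so it stays dynamic for any selected range:
-- Previous window of the same length as the Grafana-selected one
WITH w AS (
    SELECT min(timestamp) AS win_start, max(timestamp) AS win_end
    FROM cpu_utilization
    WHERE $__timeFilter(timestamp)
)
SELECT avg(c.cpu)
FROM cpu_utilization c, w
WHERE c.timestamp >= w.win_start - (w.win_end - w.win_start)
  AND c.timestamp <  w.win_start;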

Difference between two timestamps as timestamp across multiple days

I have two timestamps and I would like to have a result with the difference between them. I found a similar question asked here but I have noticed that:
select
    to_char(column1::timestamp - column2::timestamp, 'HH:MI:SS')
from table
Gives me an incorrect return if these timestamps cross multiple days. I know that I can use EPOCH to work out the number of hours/days/minutes/seconds etc but my use case requires the result as a timestamp (or a string...anything not an interval!).
In the case of multiple days I would like to continue counting the hours, even if it should go past 24. This would allow results like:
36:55:01
I'd use the built-in date_part function (as previously described in an older thread: How to convert an interval like "1 day 01:30:00" into "25:30:00"?) but finally cast the result to the type you desire:
SELECT
    from_date,
    to_date,
    to_date - from_date AS date_diff_interval,
    (date_part('epoch', to_date - from_date) * INTERVAL '1 second')::text AS date_diff_text
FROM (
    SELECT
        '2018-01-01 04:03:06'::timestamp AS from_date,
        '2018-01-02 16:58:07'::timestamp AS to_date
) AS dates;
This results in the following:
from_date           | to_date             | date_diff_interval | date_diff_text
2018-01-01 04:03:06 | 2018-01-02 16:58:07 | 1 day 12:55:01     | 36:55:01
I'm currently unaware of any way to convert this interval into a timestamp and also not sure whether there is a use for it. You're still dealing with an interval and you'd need a point of reference in time to transform that interval into an actual timestamp.
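If you want the string directly without the ::text cast, to_char should also work here; a sketch assuming Postgres's interval formatting, where HH24 does not wrap at 24 because the epoch-based interval stores hours rather than days:
-- Sketch: format the epoch-based interval as hours:minutes:seconds
SELECT to_char(
    date_part('epoch', '2018-01-02 16:58:07'::timestamp - '2018-01-01 04:03:06'::timestamp)
        * INTERVAL '1 second',
    'HH24:MI:SS'
);  -- 36:55:01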

Select a datetime in Tsql, trim the milliseconds and return as a string

Problem: I want to select a date (stored as datetime) and return it as a string with the milliseconds trimmed off,
e.g. 2017-01-04 08:47:30.0000000 => "2017-01-04 08:47:30"
My current solutions:
I have got 3 statements which do the above:
Substring option
select
    SUBSTRING(CONVERT(nvarchar, EventDate), 0, 20)
from EventsTable
Double conversions
select
    CONVERT(nvarchar, CONVERT(datetime2(0), EventDate))
from EventsTable
Short nvarchar
select
    CONVERT(nvarchar(19), EventDate)
from EventsTable
All of the above solutions work and achieve my goal.
Question:
What is the best practice / most efficient way to achieve my goal?
Use a date style in your convert:
select CONVERT(varchar(19), EventDate, 120)
from EventsTable
I would use this: it uses only one function, sets the correct length for the varchar, and the style tells SQL Server the exact format you need.
Since style 120 is a standard, you can be sure it is the same no matter what localisation settings are on your DB or session.
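A quick sanity check of the style-120 conversion (a sketch; the datetime2 literal stands in for the EventDate column):
-- Style 120 is "ODBC canonical": yyyy-mm-dd hh:mi:ss, no fractional seconds
select CONVERT(varchar(19), CAST('2017-01-04 08:47:30.0000000' AS datetime2), 120);
-- => 2017-01-04 08:47:30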
I ran all three queries with the Execution Plan enabled on SQL Server 2008 R2.
The cost was equal in all of them (33% each).
All three queries take the same time and resources.
However, the third one has the least code to write.

Postgres: using timestamps for pagination

I have a table with a created (timestamptz) column. Now I need to create pagination based on the timestamp, because while a user is viewing the first page, new items could be submitted into this table, which would make the data inconsistent if I used OFFSET for pagination.
So, the question is: should I keep created as timestamptz, or is it better to convert it to an integer (Unix time, e.g. 1472031802812)? If so, are there any disadvantages? Also, at the moment I have now() as the default value for created - is there an alternative function to produce a Unix timestamp?
Let me rewrite things from the comments into my answer. You want to use the timestamp type instead of integer simply because that's exactly what it was designed for. Doing manual conversions between timestamp integers and timestamp objects is just a pain and you gain nothing. And you will need it eventually for more complex datetime-based queries.
To answer the question about pagination: you simply do a query
SELECT *
FROM table_name
WHERE created < lastTimestamp
ORDER BY created DESC
LIMIT 30
For the first query you set, say, lastTimestamp = '3000-01-01'. Otherwise you set lastTimestamp = last_query.last_row.created.
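For this to stay fast on a big table, the usual companion is an index on created (a sketch; table_name and the index name are placeholders):
-- Lets the WHERE + ORDER BY created DESC + LIMIT touch only ~30 rows
CREATE INDEX table_name_created_idx ON table_name (created DESC);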
Optimization
Note that if the table is big then ORDER BY created DESC might not be efficient (especially if called in parallel with different ranges). In this case you can use moving "time windows", for example:
SELECT *
FROM table_name
WHERE created < lastTimestamp
  AND created >= lastTimestamp - interval '1 day'
The 1 day interval is picked arbitrarily (tune it to your needs). You can also sort the results in the app.
If the result is not empty, then you update (in your app)
lastTimestamp = last_query.last_row.created
(assuming you've done the sorting; otherwise you take min(last_query.row.created)).
If the result is empty, then you repeat the query with lastTimestamp = lastTimestamp - interval '1 day' until you fetch something. You also have to stop if lastTimestamp becomes too low, i.e. when it is lower than any other timestamp in the table (which has to be prefetched).
All of that is under some assumptions about inserts:
1. new_row.created >= any_row.created
2. new_row.created ~ current_time
3. The distribution of new_row.created is more or less uniform
Assumption 1 ensures that pagination results in consistent data, while assumption 2 is only needed for the default 3000-01-01 date. Assumption 3 makes sure that you don't end up with big empty gaps that force you to issue many empty queries.
You mean something like this?
select extract(epoch from now())::integer as unix_time
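And to use it as the column default, something like the sketch below (the items table is hypothetical; ::bigint is used because 32-bit epoch seconds overflow in 2038):
CREATE TABLE items (
    id      bigserial PRIMARY KEY,
    -- epoch seconds; bigint avoids the 2038 integer overflow
    created bigint NOT NULL DEFAULT extract(epoch from now())::bigint
);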