How to calculate the difference between two timestamptz in Postgres - postgresql

I have a table and I want to calculate the difference (in time) between two columns of my table.
My columns are: scheduled_arrival_time(timestamptz), scheduled_departure_time(timestamptz) and I want to get the difference of them as "scheduled_duration"
(scheduled_duration = scheduled_arrival_time - scheduled_departure_time)
I tried this:
scheduled_departure_time TIMESTAMPTZ NOT NULL,
scheduled_arrival_time TIMESTAMPTZ NOT NULL,
scheduled_duration numeric(4,2) NOT NULL
    generated always as
        (extract(epoch from (scheduled_arrival_time - scheduled_departure_time)) / 3600)
    stored
but I got the error when I tried to insert data:
ERROR: cannot insert a non-DEFAULT value into column "scheduled_duration"
DETAIL: Column "scheduled_duration" is a generated column.
SQL state: 428C9

You can do this, for example, with:
SELECT
EXTRACT(EPOCH FROM '2022-06-16 15:00:00.00000'::TIMESTAMP - '2022-06-15 15:00:00.00000'::TIMESTAMP)
That gives you the difference in seconds; divide by 3600 to get hours, as in the generated column above.
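The error itself, SQL state 428C9, means the INSERT statement listed the generated column explicitly; a generated column has to be omitted from the insert (or given the keyword DEFAULT). A minimal sketch, assuming a hypothetical table named flights:
CREATE TABLE flights (
    scheduled_departure_time TIMESTAMPTZ NOT NULL,
    scheduled_arrival_time   TIMESTAMPTZ NOT NULL,
    scheduled_duration NUMERIC(4,2) NOT NULL
        GENERATED ALWAYS AS
            (EXTRACT(EPOCH FROM (scheduled_arrival_time - scheduled_departure_time)) / 3600)
        STORED
);

-- Works: the generated column is omitted, so PostgreSQL computes it.
INSERT INTO flights (scheduled_departure_time, scheduled_arrival_time)
VALUES ('2022-06-15 15:00:00+00', '2022-06-16 15:00:00+00');

-- Fails with SQL state 428C9, because scheduled_duration is supplied explicitly:
-- INSERT INTO flights VALUES ('2022-06-15 15:00:00+00', '2022-06-16 15:00:00+00', 24.00);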

Related

postgresql group by datetime in join query

I have 2 tables in my PostgreSQL TimescaleDB database (version 12.06) that I am trying to query with an inner join.
Tables' structure:
CREATE TABLE currency(
    id serial PRIMARY KEY,
    symbol TEXT NOT NULL,
    name TEXT NOT NULL,
    quote_asset TEXT
);
CREATE TABLE currency_price (
    currency_id integer NOT NULL,
    dt timestamp without time zone NOT NULL,
    open NUMERIC NOT NULL,
    high NUMERIC NOT NULL,
    low NUMERIC NOT NULL,
    close NUMERIC,
    volume NUMERIC NOT NULL,
    PRIMARY KEY (currency_id, dt),
    CONSTRAINT fk_currency FOREIGN KEY (currency_id) REFERENCES currency(id)
);
The query I'm trying to make is:
SELECT currency_id AS id, symbol, MAX(close) AS close, DATE(dt) AS date
FROM currency_price
JOIN currency ON currency.id = currency_price.currency_id
GROUP BY currency_id, symbol, date
LIMIT 100;
Basically, it returns all the rows that exist in the currency_price table. I know that Postgres doesn't allow selecting columns without an aggregate function unless they are included in the GROUP BY clause. So if I don't include the dt column in my select query, I receive the expected results; but if I include it, the output shows a row for every single day of each currency, while I only want the max value of every currency so I can filter them by various dates afterwards.
I'm very inexperienced with SQL in general.
Any suggestions to solve this would be very appreciated.
There are several ways to do it; the easiest that comes to mind is using window functions.
select *
from (
    SELECT currency_id, symbol, close, dt,
           row_number() over (partition by currency_id, symbol
                              order by close desc, dt desc) as rr
    FROM currency_price
    JOIN currency ON currency.id = currency_price.currency_id
    where dt::date = '2021-06-07'
) q1
where rr = 1
General documentation on window functions:
https://www.postgresql.org/docs/9.5/functions-window.html
They also work with standard aggregate functions like SUM, AVG, MAX, MIN, and others.
Some examples: https://www.postgresqltutorial.com/postgresql-window-function/
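Another PostgreSQL idiom for this kind of greatest-row-per-group query is DISTINCT ON, which keeps the first row of each group according to the ORDER BY. A sketch against the same tables (not from the original answer):
-- One row per currency: the highest close on that date, latest dt breaking ties.
SELECT DISTINCT ON (currency_id)
       currency_id, symbol, close, dt
FROM currency_price
JOIN currency ON currency.id = currency_price.currency_id
WHERE dt::date = '2021-06-07'
ORDER BY currency_id, close DESC, dt DESC;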

Redshift: milliseconds to timestamp

Let us say we have two tables:
CREATE TABLE IF NOT EXISTS tech_time(
    ms_since_epoch BIGINT
);
CREATE TABLE IF NOT EXISTS readable_time(
    ts timestamp without time zone
);
Let us say tech_time has data and we would like to populate readable_time.
So in Postgres you could use to_timestamp(double precision) and do something like
INSERT INTO readable_time(ts)
SELECT DISTINCT to_timestamp(ms_since_epoch::float / 1000) AS ts
FROM tech_time;
No such function seems to exist in Amazon Redshift:
function to_timestamp(double precision) does not exist
My question is: how do I properly populate readable_time, while losing the least amount of precision?
We can try using DATEADD to add ms_since_epoch milliseconds to 'epoch' (January 1, 1970):
INSERT INTO readable_time (ts)
SELECT DATEADD(ms, ms_since_epoch, 'epoch')
FROM tech_time;
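If DATEADD's count argument turns out to be integer-limited on your cluster, interval arithmetic against the epoch literal is a commonly used alternative. A sketch; verify that your Redshift version accepts multiplying an interval by a non-integer:
-- Assumes interval * numeric is supported; keeps sub-second precision.
INSERT INTO readable_time (ts)
SELECT TIMESTAMP 'epoch' + (ms_since_epoch / 1000.0) * INTERVAL '1 second'
FROM tech_time;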

Select largest absolute value column pairs with headers per row

I am using: Microsoft SQL Server 2014 - 12.0.4213.0
Here is my sample table (numbers fuzzed):
CREATE TABLE most_recent_counts(
State VARCHAR(2) NOT NULL PRIMARY KEY
,BuildDate DATE NOT NULL
,Count_1725_Change INTEGER NOT NULL
,Count_1725_Percent_Change NUMERIC(20,2) NOT NULL
,Count_2635_Change INTEGER NOT NULL
,Count_2635_Percent_Change NUMERIC(20,2) NOT NULL
,Count_3645_Change INTEGER NOT NULL
,Count_3645_Percent_Change NUMERIC(20,2) NOT NULL
);
INSERT INTO most_recent_counts
    (State, BuildDate, Count_1725_Change, Count_1725_Percent_Change,
     Count_2635_Change, Count_2635_Percent_Change,
     Count_3645_Change, Count_3645_Percent_Change)
VALUES
    ('AK', '2018-06-05', 1025, 5.00, 1700, 2.50, 2050, 3.00),
    ('AL', '2018-06-02', 15000, 4.00, 10400, 2.00, 6800, 1.25),
    ('AR', '2018-06-07', 2300, 1.00, 2700, 1.00, 1800, 0.50),
    ('AZ', '2018-04-26', 107000, 5.50, 45400, 3.00, 180000, 16.00),
    ('CA', '2018-06-07', 140000, 6.00, 550000, 14.00, 600000, 18.00);
It should look something like this:
IMG: https://i.imgur.com/KGkfm66.png
In the real table, I have about 600 such count columns.
I would like to produce a table where, for each state, I keep the top ten pairs of columns by magnitude (the absolute change together with its percent change). For example, if Alabama's row showed a minus 10 million change in sales to people in the 46-55 range, that pair should definitely be part of the result set, even if the rest of the columns are positive changes in the thousands.
What's the best way to do this?
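One possible approach (a sketch, not from the original thread): unpivot each change/percent pair with CROSS APPLY (VALUES ...), then rank the pairs per state by absolute change and keep the top ten. The AgeBand labels and aliases below are illustrative:
SELECT State, AgeBand, Change, Percent_Change
FROM (
    SELECT m.State, v.AgeBand, v.Change, v.Percent_Change,
           ROW_NUMBER() OVER (PARTITION BY m.State
                              ORDER BY ABS(v.Change) DESC) AS rn
    FROM most_recent_counts m
    CROSS APPLY (VALUES
        ('17-25', m.Count_1725_Change, m.Count_1725_Percent_Change),
        ('26-35', m.Count_2635_Change, m.Count_2635_Percent_Change),
        ('36-45', m.Count_3645_Change, m.Count_3645_Percent_Change)
    ) AS v(AgeBand, Change, Percent_Change)
) AS ranked
WHERE rn <= 10
ORDER BY State, rn;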

unique date field postgresql default value

I have a date column which I want to be unique once populated, but want the date field to be ignored if it is not populated.
In MySQL the way this is accomplished is to set the date column to "not null" and give it a default value of '0000-00-00' - this allows all other fields in the unique index to be "checked" even if the date column is not populated yet.
This does not work in PostgreSQL because '0000-00-00' is not a valid date, so you cannot store it in a date field (this makes sense to me).
At first glance, leaving the field nullable seemed like an option, but this creates a problem:
=> create table uniq_test(NUMBER bigint not null, date DATE, UNIQUE(number, date));
CREATE TABLE
=> insert into uniq_test(number) values(1);
INSERT 0 1
=> insert into uniq_test(number) values(1);
INSERT 0 1
=> insert into uniq_test(number) values(1);
INSERT 0 1
=> insert into uniq_test(number) values(1);
INSERT 0 1
=> select * from uniq_test;
number | date
--------+------
1 |
1 |
1 |
1 |
(4 rows)
NULL apparently "isn't equal to itself" and so it does not count towards constraints.
If I add an additional unique constraint only on the number field, it checks only number and not date, and so I cannot have the same number with two different dates.
I could pick a default date that is a 'valid date' but outside the working scope to get around this, and could in fact get away with that for the current project. But there are cases I might encounter in the next few years where it will not be evident that the date is a non-real date just because it is "a long time ago" or "in the future."
The advantage the '0000-00-00' mechanic had for me was precisely that this date isn't real and therefore indicated a non-populated entry (where 'non-populated' was a valid uniqueness attribute). When I look around for solutions to this on the internet, most of what I find is "just use NULL" and "storing zeros is stupid."
TL;DR
Is there a PostgreSQL best practice for needing to include "not populated" as a possible value in a unique constraint including a date field?
Not clear what you want. This is my guess:
create table uniq_test (number bigint not null, date date);
create unique index i1 on uniq_test (number, date)
where date is not null;
create unique index i2 on uniq_test (number)
where date is null;
There will be one unique constraint for rows with non-null dates and another for rows with null dates, effectively forcing the (number, date) tuples to be distinct.
See the PostgreSQL documentation on partial indexes.
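With those two indexes in place, a quick check of the behavior (a sketch; output shown as comments):
insert into uniq_test(number) values (1);
-- INSERT 0 1
insert into uniq_test(number) values (1);
-- ERROR: duplicate key value violates unique constraint "i2"
insert into uniq_test(number, date) values (1, '2020-01-01');
-- INSERT 0 1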
It's not a best practice, but you can do it this way:
t=# create table so35(i int, d date);
CREATE TABLE
t=# create unique index i35 on so35(i, coalesce(d,'-infinity'));
CREATE INDEX
t=# insert into so35 (i) select 1;
INSERT 0 1
t=# insert into so35 (i) select 2;
INSERT 0 1
t=# insert into so35 (i) select 2;
ERROR: duplicate key value violates unique constraint "i35"
DETAIL: Key (i, (COALESCE(d, '-infinity'::date)))=(2, -infinity) already exists.
STATEMENT: insert into so35 (i) select 2;
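Note that with this approach the stored column itself stays NULL; the coalesce lives only in the index expression, so you can still find "unpopulated" rows the usual way (a sketch):
select * from so35 where d is null;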

an empty row with null-like values in not-null field

I'm using postgresql 9.0 beta 4.
After inserting a lot of data into a partitioned table, I found a weird thing. When I query the table, I can see an empty row with null-like values in 'not-null' fields.
That weird query result looks like the screenshot below.
The 689th row is empty. The first 3 fields (stid, d, ticker) make up the primary key, so they should not be null. The query I used is this:
select * from st_daily2 where stid=267408 order by d
I can even do a group by on this data.
select stid, date_trunc('month', d) ym, count(*) from st_daily2
where stid=267408 group by stid, date_trunc('month', d)
The 'group by' result still has the empty row; the 1st row is empty.
But if I query for rows where 'stid' or 'd' is null, it returns nothing.
Is this a bug in PostgreSQL 9.0 beta 4, or some data corruption?
EDIT :
I added my table definition.
CREATE TABLE st_daily
(
    stid integer NOT NULL,
    d date NOT NULL,
    ticker character varying(15) NOT NULL,
    mp integer NOT NULL,
    settlep double precision NOT NULL,
    prft integer NOT NULL,
    atr20 double precision NOT NULL,
    upd timestamp with time zone,
    ntrds double precision
)
WITH (
    OIDS=FALSE
);
CREATE TABLE st_daily2
(
    CONSTRAINT st_daily2_pk PRIMARY KEY (stid, d, ticker),
    CONSTRAINT st_daily2_strgs_fk FOREIGN KEY (stid)
        REFERENCES strgs (stid) MATCH SIMPLE
        ON UPDATE CASCADE ON DELETE CASCADE,
    CONSTRAINT st_daily2_ck CHECK (stid >= 200000 AND stid < 300000)
)
INHERITS (st_daily)
WITH (
    OIDS=FALSE
);
The data in this table is simulation results. Multiple multithreaded simulation engines written in C# insert data into the database using Npgsql.
psql also shows the empty row.
You should file a bug report at http://www.postgresql.org/support/submitbug
Some questions:
Could you show us the table definitions and constraints for the partitions?
How did you load your data?
Do you get the same result when using another tool, like psql?
The answer to your problem may very well lie in your first sentence:
I'm using postgresql 9.0 beta 4.
Why would you do that? Upgrade to a stable release, preferably the latest point release of the current version, which is 9.1.4 as of today.
I got to the same point: "what in the heck is that blank value?"
No, it's not a NULL, it's a -infinity.
To filter for such a row use:
WHERE CASE WHEN mytestcolumn = '-infinity'::timestamp
             OR mytestcolumn = 'infinity'::timestamp
           THEN NULL ELSE mytestcolumn
      END IS NULL
instead of:
WHERE mytestcolumn IS NULL
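A shorter equivalent (a sketch using PostgreSQL's isfinite() function, which returns false for infinity and -infinity timestamps):
-- Matches rows that are NULL or hold an infinite timestamp.
WHERE mytestcolumn IS NULL OR NOT isfinite(mytestcolumn)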