How to compute a derived age attribute in postgresql? - postgresql

I have the following person table with the dbirth column (date of birth). I need to compute the column age. I have been trying, but I get the following error: ERROR: generation expression is not immutable. I would greatly appreciate any help.
CREATE TABLE person
(
person_id SERIAL NOT NULL,
fname VARCHAR(50) NOT NULL,
lname VARCHAR(50) NOT NULL,
ssn CHAR(10) NOT NULL,
pnumber CHAR(12) NOT NULL,
dbirth DATE NOT NULL,
age integer GENERATED ALWAYS AS ( extract( year FROM CURRENT_DATE ) - extract( year FROM dbirth)) STORED,
address_id INTEGER NOT NULL,
sex_id INTEGER NOT NULL,
PRIMARY KEY ( person_id ),
FOREIGN KEY ( sex_id ) REFERENCES sex ( sex_id ),
FOREIGN KEY ( address_id ) REFERENCES address ( address_id )
);

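The reason is that the generation expression is not IMMUTABLE. EXTRACT on a date is not the problem; the culprit is CURRENT_DATE, which is only STABLE because its result changes from day to day.
Create your own custom function declared IMMUTABLE and use it in the generated column.
Example: a function get_age that returns the age as an interval containing years, months and days: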
CREATE OR REPLACE FUNCTION get_age( birthday date )
RETURNS interval
AS $CODE$
BEGIN
RETURN age(birthday);
END
$CODE$
LANGUAGE plpgsql IMMUTABLE;
Then use the IMMUTABLE function in the generated column:
CREATE TABLE person
(
person_id SERIAL NOT NULL,
fname VARCHAR(50) NOT NULL,
lname VARCHAR(50) NOT NULL,
ssn CHAR(10) NOT NULL,
pnumber CHAR(12) NOT NULL,
dbirth DATE NOT NULL,
age interval GENERATED ALWAYS AS (get_age(dbirth)) STORED,
address_id INTEGER NOT NULL,
sex_id INTEGER NOT NULL,
PRIMARY KEY ( person_id ),
FOREIGN KEY ( sex_id ) REFERENCES sex ( sex_id ),
FOREIGN KEY ( address_id ) REFERENCES address ( address_id )
);
Note: age is of type interval here. If you want to use it as an int, modify the get_age function accordingly.
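If an integer number of years is what you need, a variant along these lines might work. This is only a sketch: get_age_years is a hypothetical name, and declaring the function IMMUTABLE is a promise it does not really keep, because age() depends on the current date. The stored value is computed when the row is written and does not advance by itself.

```sql
-- Hypothetical integer variant of get_age (the name is an assumption).
-- age(birthday) returns an interval; date_part('year', ...) extracts
-- the whole years from it.
CREATE OR REPLACE FUNCTION get_age_years( birthday date )
RETURNS integer
AS $CODE$
BEGIN
RETURN date_part('year', age(birthday))::integer;
END
$CODE$
LANGUAGE plpgsql IMMUTABLE; -- not truly immutable: the result depends on today's date
```

The generated column would then be declared as age integer GENERATED ALWAYS AS (get_age_years(dbirth)) STORED.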

You do not want a generated column for age. At the time of writing, Postgres only supports STORED generated columns, and the problem is that the age column is NOT updated unless dbirth is updated.
-- setup
create or replace function get_age( from_date timestamptz )
returns interval
language sql immutable
as $$
select age(from_date);
$$;
create table test_person
( person_id serial
, fname varchar(10) not null
, pnumber varchar(12) not null
, dbirth timestamptz(6) not null
, age interval generated always as (get_age(dbirth)) stored
);
insert into test_person(fname,pnumber,dbirth)
values ('MSam', '555-937-9292', '1995-06-23 08:00:01.344612'::timestamptz )
, ('WSam', '555-937-9292', '2019-03-21 15:32:18.863452'::timestamptz );
commit ;
select p.fname, p.pnumber,dbirth, p.age, current_timestamp
from test_person p
order by fname;
|fname |pnumber |dbirth |age |current_timestamp |
|----------|------------|-------------------|----------------------------|-------------------|
|MSam |555-937-9292|1995-06-23 03:00:01|25 years 24 days 15:59:58.65|2020-07-18 17:53:14|
|WSam |555-937-9292|2019-03-21 10:32:18|1 year 3 mons 27 days 08:27:|2020-07-18 17:53:14|
Now wait some time; it does not have to be long, just noticeable.
Then run the exact same query.
select p.fname, p.pnumber,dbirth, p.age, current_timestamp
from test_person p
order by fname;
|fname |pnumber |dbirth |age |current_timestamp |
|----------|------------|-------------------|----------------------------|-------------------|
|MSam |555-937-9292|1995-06-23 03:00:01|25 years 24 days 15:59:58.65|2020-07-18 17:55:29|
|WSam |555-937-9292|2019-03-21 10:32:18|1 year 3 mons 27 days 08:27:|2020-07-18 17:55:29|
Notice in the above that although the current timestamp has changed (time has passed), the age has not. Have we just discovered "the Fountain of Youth"? So try a likely update, such as the phone number.
update test_person
set pnumber = '555-949-0070'
where fname = 'MSam';
select p.fname, p.pnumber,dbirth, p.age, current_timestamp
from test_person p
order by fname;
|fname |pnumber |dbirth |age |current_timestamp |
|----------|------------|-------------------|----------------------------|-------------------|
|MSam |555-949-0070|1995-06-23 03:00:01|25 years 24 days 15:59:58.65|2020-07-18 17:57:16|
|WSam |555-937-9292|2019-03-21 10:32:18|1 year 3 mons 27 days 08:27:|2020-07-18 17:57:16|
Perhaps a better solution would be:
Drop the age column from the table completely.
Create a view and have it generate the age.
Write your queries against the view.
alter table test_person drop column age;
create view test_person_v as
( select t.*, age( current_timestamp, dbirth) age
from test_person t
);
select p.fname, p.pnumber,dbirth, p.age, current_timestamp
from test_person_v p
order by fname;
|fname |pnumber |dbirth |age |current_timestamp |
|----------|------------|-------------------|----------------------------|-------------------|
|MSam |555-949-0070|1998-06-23 03:00:01|22 years 25 days 15:37:45.54|2020-07-18 18:37:46|
|WSam |555-937-9292|2019-03-21 10:32:18|1 year 3 mons 28 days 08:05:|2020-07-18 18:37:46|
select p.fname, p.pnumber,dbirth, p.age, current_timestamp
from test_person_v p
order by fname;
|fname |pnumber |dbirth |age |current_timestamp |
|----------|------------|-------------------|----------------------------|-------------------|
|MSam |555-949-0070|1998-06-23 03:00:01|22 years 25 days 15:42:50.78|2020-07-18 18:42:52|
|WSam |555-937-9292|2019-03-21 10:32:18|1 year 3 mons 28 days 08:10:|2020-07-18 18:42:52|
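Since the original question asked for an integer age, the same view idea can be sketched with whole years instead of an interval. The view name test_person_years_v and the column name age_years are assumptions made for illustration; the sketch only relies on test_person having the dbirth column shown above.

```sql
-- Hypothetical view: the age in whole years is computed at query time,
-- so it is always current and nothing is stored in the table.
create view test_person_years_v as
select t.*
     , date_part('year', age(current_timestamp, t.dbirth))::integer as age_years
from test_person t;

select p.fname, p.dbirth, p.age_years
from test_person_years_v p
order by p.fname;
```

Because the value is derived on every read, no IMMUTABLE wrapper function is needed.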

Related

I get error as "duplicate key value violates unique constraint "

I'm working on a data warehouse. I have 4 tables in the public schema: customer, product, addressee and orders.
Then I created these tables in my olap schema:
CREATE TABLE olap.time
(
idtime SERIAL NOT NULL PRIMARY KEY,
year integer,
month integer,
week integer,
day integer
);
CREATE TABLE olap.addressees
(
idaddressee integer PRIMARY KEY NOT NULL,
name varchar(40) NOT NULL,
zip char(6) NOT NULL,
address varchar(60) NOT NULL
);
CREATE TABLE olap.customers
(
idcustomer varchar(10) PRIMARY KEY ,
name varchar(40) NOT NULL,
city varchar(40) NOT NULL,
zip char(6) NOT NULL,
address varchar(40) NOT NULL,
email varchar(40),
phone varchar(16) NOT NULL,
regon char(9)
);
CREATE TABLE olap.fact
(
idtime integer NOT NULL,
idaddressee integer NOT NULL,
idcustomer varchar(10) NOT NULL,
idfact integer NOT NULL,
price numeric(7,2),
PRIMARY KEY (idtime, idaddressee, idcustomer),
FOREIGN KEY (idaddressee) REFERENCES olap.addressees(idaddressee),
FOREIGN KEY (idcustomer) REFERENCES olap.customers(idcustomer),
FOREIGN KEY (idtime) REFERENCES olap.time(idtime)
);
After creating the tables I ran these queries:
INSERT INTO olap.time (year, month, week, day)
SELECT date_part('year', date), date_part('month', date), date_part('week', date), date_part('day', date)
FROM public.orders
GROUP BY public.orders.date
ORDER BY public.orders.date;
INSERT INTO olap.addressees(idaddressee, name, zip, address)
SELECT idaddressee, name, zip, address
FROM public.addressee;
INSERT INTO olap.customers (idcustomer, name, city, zip, address, email, phone, regon)
SELECT idcustomer, name, city, zip, address, email, phone, regon
FROM public.customer;
And then I tried to run this query:
INSERT INTO olap.fact (idtime, idaddressee, idcustomer, idfact, price)
SELECT olap.time.idtime, olap.addressees.idaddressee, olap.customers.idcustomer, COUNT(*), public.orders.price
FROM (((public.orders
INNER JOIN olap.time ON (date_part('year', public.orders.date) = olap.time.year AND date_part('month', public.orders.date) = olap.time.month AND date_part('week', public.orders.date) = olap.time.week) AND date_part('day', public.orders.date) = olap.time.day)
INNER JOIN olap.addressees ON public.orders.idaddressee = olap.addressees.idaddressee)
INNER JOIN olap.customers ON public.orders.idcustomer = olap.customers.idcustomer)
GROUP BY olap.time.idtime, olap.addressees.idaddressee, olap.customers.idcustomer, public.orders.price;
After running the last query I got this error:
ERROR: syntax error at or near "duplicate"
LINE 1: duplicate key value violates unique constraint"
What can the problem be? Thanks in advance

POSTGRES INSERT/UPDATE ON CONFLICT using WITH CTE

I have a table like the one below. I am trying to merge into this table based on the values in a CTE, but when I try to update the table on a conflict, it cannot get the value from the CTE.
CREATE TABLE IF NOT EXISTS master_config_details
(
master_config_id INT NOT NULL,
account_id INT NOT NULL,
date_value TIMESTAMP(3) NULL,
number_value BIGINT NULL,
string_value VARCHAR(50) NULL,
row_status SMALLINT NOT NULL,
created_date TIMESTAMP(3) NOT NULL,
modified_date TIMESTAMP(3) NULL,
CONSTRAINT pk_master_config_details PRIMARY KEY (master_config_id, account_id, row_status)
);
INSERT INTO master_config_details VALUES (
1, 11, NULL,100,NULL, 0, '2020-11-18 12:01:18', '2020-11-18 12:02:31');
select * from master_config_details;
Now, using a CTE, I want to insert/update records in this table. Below is the code I am using. When the record already exists in the table, I want to update it based on the data_type_id value in the CTE (cte_input_data.data_type_id), but it fails with this error:
SQL Error [42703]: ERROR: column excluded.data_type_id does not exist
What it should achieve is:
if cte_input_data.data_type_id = 1 update master_config_details set date_value = cte.value
if cte_input_data.data_type_id = 2 update master_config_details set number_value = cte.value
if cte_input_data.data_type_id = 3 update master_config_details set string_value = cte.value
The code below should set master_config_details.number_value = 22, as there is already a record with that combination of (master_config_id, account_id, row_status), which is (1,11,1) (run select * from master_config_details; to see the record), but it throws an error instead:
SQL Error [42703]: ERROR: column excluded.data_type_id does not exist
WITH cte_input_data AS (
select
1 AS master_config_id
,11 AS account_id
,2 AS data_type_id
,'22' AS value
,1 AS row_status)
INSERT INTO master_config_details
SELECT
cte.master_config_id
,cte.account_id
,CASE WHEN cte.data_type_id = 1 THEN cte.value::timestamp(3) ELSE NULL END AS date_time_value
,CASE WHEN cte.data_type_id = 2 THEN cte.value::integer ELSE NULL END AS number_value
,CASE WHEN cte.data_type_id = 3 THEN cte.value ELSE NULL END AS string_value
,1
,NOW() AT TIME ZONE 'utc'
,NOW() AT TIME ZONE 'utc'
FROM cte_input_data cte
ON CONFLICT (master_config_id,account_id,row_status)
DO UPDATE SET
date_value = CASE WHEN excluded.data_type_id = 1 THEN excluded.date_time_value::timestamp(3) ELSE NULL END
,number_value = CASE WHEN excluded.data_type_id = 2 THEN excluded.number_value::integer ELSE NULL END
,string_value = CASE WHEN excluded.data_type_id = 3 THEN excluded.string_value ELSE NULL END
,modified_date = NOW() AT TIME ZONE 'utc';
The special excluded table is used to reference the values originally proposed for insertion.
You are getting this error because this column does not exist in your target table, and therefore not in the special excluded table either. It exists only in your CTE.
As a workaround, you can select it from the CTE using a nested select in the ON CONFLICT clause.
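An alternative to the nested select, sketched here under the assumption that the CASE logic in the SELECT list is all that is needed: the excluded row carries the target table's column names (date_value, number_value, string_value), and those values were already derived from data_type_id when the row was proposed for insertion. The update can therefore read them directly, with no reference to data_type_id at all.

```sql
WITH cte_input_data AS (
    SELECT 1    AS master_config_id
         , 11   AS account_id
         , 2    AS data_type_id
         , '22' AS value
         , 1    AS row_status
)
INSERT INTO master_config_details
SELECT cte.master_config_id
     , cte.account_id
     -- exactly one of the three value columns is non-NULL, chosen by data_type_id
     , CASE WHEN cte.data_type_id = 1 THEN cte.value::timestamp(3) END
     , CASE WHEN cte.data_type_id = 2 THEN cte.value::bigint END
     , CASE WHEN cte.data_type_id = 3 THEN cte.value END
     , cte.row_status
     , NOW() AT TIME ZONE 'utc'
     , NOW() AT TIME ZONE 'utc'
FROM cte_input_data cte
ON CONFLICT (master_config_id, account_id, row_status)
DO UPDATE SET
    -- excluded.* uses the target table's column names, not the CTE's
    date_value    = excluded.date_value
  , number_value  = excluded.number_value
  , string_value  = excluded.string_value
  , modified_date = NOW() AT TIME ZONE 'utc';
```

Note that the update branch only fires when a row with the same (master_config_id, account_id, row_status) already exists; otherwise the statement simply inserts a new row.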

Join on multiple tables using distinct on

create table emp
(
emp_id serial primary key,
emp_no integer,
emp_ref_no character varying(15),
emp_class character varying(15)
);
create table emp_detail
(
emp_detail_id serial primary key,
emp_id integer,
class_no integer,
created_at timestamp without time zone,
constraint con_fk foreign key(emp_id) references emp(emp_id)
);
create table class_detail
(
class_id serial primary key,
emp_id integer,
class_no integer,
col1 JSONB,
created_at timestamp without time zone default now(),
constraint cd_fk foreign key(emp_id) references emp(emp_id)
);
INSERT INTO emp(
emp_no, emp_ref_no, emp_class)
VALUES ('548251', '2QcW', 'abc' );
INSERT INTO emp(
emp_no, emp_ref_no, emp_class)
VALUES ('548251', '2FQx', 'abc');
INSERT INTO emp(
emp_no, emp_ref_no, emp_class)
VALUES ('548251', '2yz', 'abc');
INSERT INTO emp_detail(
emp_id, class_no, created_at
)
VALUES ( 1, 2, '2018-05-04 11:00:00'
);
INSERT INTO emp_detail(
emp_id, class_no, created_at
)
VALUES ( 1, 1, '2018-04-04 11:00:00'
);
INSERT INTO emp_detail(
emp_id, class_no, created_at
)
VALUES ( 2, 1, '2018-05-10 11:00:00'
);
INSERT INTO emp_detail(
emp_id, class_no, created_at
)
VALUES ( 2, 2, '2018-02-01 11:00:00'
);
INSERT INTO emp_detail(
emp_id, class_no, created_at
)
VALUES ( 3, 2, '2018-02-01 11:00:00'
);
insert into class_detail(emp_id, class_no, col1, created_at) values(1,1,'{"Name":"Nik"}', '2018-02-01 10:00:00');
insert into class_detail(emp_id, class_no, col1, created_at) values(1,1,'{"Name":"Nik Anderson"}', '2018-03-01 10:00:00');
insert into class_detail(emp_id, class_no, col1, created_at) values(1,2,'{"Name":"James Anderson TST"}', '2018-03-15 10:00:00');
insert into class_detail(emp_id, class_no, col1, created_at) values(1,2,'{"Name":"Tim Paine ST"}', '2018-04-01 10:00:00');
I want to display the corresponding emp_id, emp_no, emp_ref_no and class_no (the latest one from the emp_detail table, based on created_at) along with all the columns of the class_detail table. The class_detail part should show the latest record for that class_no.
The expected output would be something like this:
emp id | emp_no | emp_ref_no | class_no | class_id | class.col1 | class.created_at | class.created_by
1 | 548251 | 2QcW | 2 | 4 |{"Name":"Tim Paine ST"}|2018-04-01 10:00:00| NULL
2 | 548251 | 2FQx | 1 | 2 |{"Name":"Nik Anderson"}|2018-03-01 10:00:00| NULL
3 | 548251 | 2yz | 2 | 4 |{"Name":"Tim Paine ST"}|2018-04-01 10:00:00| NULL
As I stated in the comments: it is exactly the same thing as in "Inner join using distinct on". You simply have to add another join and another ORDER BY criterion (cd.created_at DESC).
demo:db<>fiddle
SELECT DISTINCT ON (ed.emp_id)
e.emp_id, e.emp_no, e.emp_ref_no, ed.class_no, cd.*
FROM
emp_detail ed
JOIN emp e ON e.emp_id = ed.emp_id
JOIN class_detail cd ON ed.class_no = cd.class_no
ORDER BY ed.emp_id, ed.created_at DESC, cd.created_at DESC
Note: I am not sure what the emp_id column in class_detail is for. It does not seem well designed (also because it is always 1 in your example). You should check whether you really need it.
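To see how DISTINCT ON interacts with ORDER BY in isolation, here is a minimal self-contained sketch. The inline rows and the column names grp, val and ts are made up for illustration. DISTINCT ON keeps the first row of each group in ORDER BY order, so sorting by the timestamp descending keeps the latest row per group.

```sql
SELECT DISTINCT ON (t.grp)
       t.grp, t.val, t.ts
FROM (VALUES
        (1, 'old',  timestamp '2018-01-01 10:00:00')
      , (1, 'new',  timestamp '2018-02-01 10:00:00')
      , (2, 'only', timestamp '2018-01-15 10:00:00')
     ) AS t(grp, val, ts)
-- the leading ORDER BY expression must match the DISTINCT ON expression
ORDER BY t.grp, t.ts DESC;
-- keeps (1, 'new', ...) and (2, 'only', ...)
```

In the query above, the extra criterion cd.created_at DESC plays the same role as a tie-breaker among the joined class_detail rows.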

aggregate multiple columns over dynamic pivot in sql

I'm creating a stored procedure that would allow the user to retrieve data from 2 tables by providing the PersonID number as a parameter.
I thought of using the PIVOT function to pivot the data table dynamically, without aggregating, over multiple columns and retrieving data from ONE column in a different table. The 2 tables below are just sample data; the real data table has over 100 columns, hence the dynamic part. The 2 tables don't have a common ID column, only a common Column_Name.
Here are the 2 tables:
Mapping Table:
CREATE table #table (
ID varchar(10) NOT NULL,
Column_Name varchar (255) NOT NULL,
Page_Num varchar(10) NOT NULL,
Line_Num varchar(10) NOT NULL,
Element_Num varchar(10) NOT NULL
)
INSERT INTO #table (ID,Column_Name,Page_Num,Line_Num,Element_Num) VALUES ('1','Name', 'DT-01', '200','20')
INSERT INTO #table (ID,Column_Name,Page_Num,Line_Num,Element_Num) VALUES ('2','SSN', 'DT-02', '220','10')
INSERT INTO #table (ID,Column_Name,Page_Num,Line_Num,Element_Num) VALUES ('3','City', 'DT-03', '300','11')
INSERT INTO #table (ID,Column_Name,Page_Num,Line_Num,Element_Num) VALUES ('4','StreetName', 'DT-04', '350','33')
INSERT INTO #table (ID,Column_Name,Page_Num,Line_Num,Element_Num) VALUES ('5','Sex', 'DT-05', '310','51')
Creates:
ID Column_Name Page_Num Line_Num Element_Num
_________________________________________________________________
1 Name DT-01 200 20
2 SSN DT-02 220 10
3 City DT-03 300 11
4 StreetName DT-04 350 33
5 Sex DT-05 310 51
Data table:
CREATE table #temp (
PersonID varchar (100) NOT NULL,
Name varchar(100) NOT NULL,
SSN varchar (255) NOT NULL,
City varchar(100) NOT NULL,
StreetName varchar(100) NOT NULL,
Sex varchar(100) NOT NULL
)
INSERT INTO #temp (PersonID,Name,SSN,City,StreetName,Sex) VALUES ('112','Joe','945890189', 'Lookesville', 'Broad st','Male')
INSERT INTO #temp (PersonID,Name,SSN,City,StreetName,Sex) VALUES ('140','Santana','514819926', 'Falls Church', 'Gane Rd', 'Female')
INSERT INTO #temp (PersonID,Name,SSN,City,StreetName,Sex) VALUES ('481','Wyatt','014523548','Gainesville', 'Westfield blvd', 'Male')
INSERT INTO #temp (PersonID,Name,SSN,City,StreetName,Sex) VALUES ('724','Brittany','551489230','Aldi', 'Ostrich rd', 'Female')
INSERT INTO #temp (PersonID,Name,SSN,City,StreetName,Sex) VALUES ('100','Giovanni','774451362','Paige', 'Company ln', 'Male')
Creates:
PersonID Name SSN City StreetName Sex
_______________________________________________________________________
112 Joe 945890189 Lookesville Broad st Male
140 Santana 514819926 Falls Church Gane Rd Female
481 Wyatt 014523548 Gainesville Westfield blvd Male
724 Brittany 551489230 Aldi Ostrich rd Female
100 Giovanni 774451362 Paige Company ln Male
The end result should be:
Example: User enters parameter PersonID = 140
Column_name Page_Num Line_Num Element_Num Data
_____________________________________________________________________________
Name DT-01 200 20 Santana
SSN DT-02 220 10 514819926
City DT-03 300 11 Falls Church
StreetName DT-04 350 33 Gane Rd
Sex DT-05 310 51 Female
... ... ... ... ...
and so on..
The following will dynamically unpivot a data row and then join on the field name with the definition (mapping) table.
If you want to run this query without a filter, I would suggest adding A.PersonID to the top SELECT and removing the WHERE clause.
I should add that UNPIVOT would be more performant, but with this approach there is no need to define and/or recast values. That said, the performance is still very respectable.
Example
Select D.*
,Data=C.Value
From #Temp A
Cross Apply (Select XMLData = cast((Select A.* For XML Raw) as xml)) B
Cross Apply (
Select Item = attr.value('local-name(.)','varchar(100)')
,Value = attr.value('.','varchar(max)')
From B.XMLData.nodes('/row') as X(r)
Cross Apply X.r.nodes('./@*') AS N(attr)
) C
Join #Table D on (C.Item=D.Column_Name)
Where PersonID=140
Returns
If it Helps with the Visualization, the CROSS APPLY C generates the following:
EDIT - As a Stored Procedure
CREATE PROCEDURE [dbo].[YourProcedureName](@PersonID int)
As
Begin
Set NoCount On;
Select D.*
,Data=C.Value
From YourPersonTableName A
Cross Apply (Select XMLData = cast((Select A.* For XML Raw) as xml)) B
Cross Apply (
Select Item = attr.value('local-name(.)','varchar(100)')
,Value = attr.value('.','varchar(max)')
From B.XMLData.nodes('/row') as X(r)
Cross Apply X.r.nodes('./@*') AS N(attr)
) C
Join YourObjectTableName D on (C.Item=D.Column_Name)
Where PersonID=@PersonID
End

Using JSONB_ARRAY_ELEMENTS with WHERE ... IN condition

Online poker players can optionally purchase access to playroom 1 or playroom 2.
And they can be temporarily banned for cheating.
CREATE TABLE users (
uid SERIAL PRIMARY KEY,
paid1_until timestamptz NULL, -- may play in room 1
paid2_until timestamptz NULL, -- may play in room 2
banned_until timestamptz NULL, -- punished for cheating etc.
banned_reason varchar(255) NULL
);
Here the above table is filled with 4 test records:
INSERT INTO users (paid1_until, paid2_until, banned_until, banned_reason)
VALUES (NULL, NULL, NULL, NULL),
(current_timestamp + interval '1 month', NULL, NULL, NULL),
(current_timestamp + interval '2 month', current_timestamp + interval '4 month', NULL, NULL),
(NULL, current_timestamp + interval '8 month', NULL, NULL);
All 4 records belong to the same person - who has authenticated herself via different social networks (for example through Facebook, Twitter, Apple Game Center, etc.)
I am trying to create a stored function, which would take a list of numeric user ids (as a JSON array) and merge records belonging to same person into a single record - without losing her payments or punishments:
CREATE OR REPLACE FUNCTION merge_users(
IN in_users jsonb,
OUT out_uid integer)
RETURNS integer AS
$func$
DECLARE
new_paid1 timestamptz;
new_paid2 timestamptz;
new_banned timestamptz;
new_reason varchar(255);
BEGIN
SELECT min(uid),
current_timestamp + sum(paid1_until - current_timestamp),
current_timestamp + sum(paid2_until - current_timestamp),
max(banned_until)
INTO
out_uid, new_paid1, new_paid2, new_banned
FROM users
WHERE uid IN (SELECT JSONB_ARRAY_ELEMENTS(in_users));
IF out_uid IS NOT NULL THEN
SELECT banned_reason
INTO new_reason
FROM users
WHERE new_banned IS NOT NULL
AND banned_until = new_banned
LIMIT 1;
DELETE FROM users
WHERE uid IN (SELECT JSONB_ARRAY_ELEMENTS(in_users))
AND uid <> out_uid;
UPDATE users
SET paid1_until = new_paid1,
paid2_until = new_paid2,
banned_until = new_banned,
banned_reason = new_reason
WHERE uid = out_uid;
END IF;
END
$func$ LANGUAGE plpgsql;
Unfortunately, its usage results in the following error:
# TABLE users;
uid | paid1_until | paid2_until | banned_until | banned_reason
-----+-------------------------------+-------------------------------+--------------+---------------
1 | | | |
2 | 2016-03-27 19:47:55.876272+02 | | |
3 | 2016-04-27 19:47:55.876272+02 | 2016-06-27 19:47:55.876272+02 | |
4 | | 2016-10-27 19:47:55.876272+02 | |
(4 rows)
# select merge_users('[1,2,3,4]'::jsonb);
ERROR: operator does not exist: integer = jsonb
LINE 6: WHERE uid IN (SELECT JSONB_ARRAY_ELEMENTS(in_users))
^
HINT: No operator matches the given name and argument type(s). You might need to add explicit type casts.
QUERY: SELECT min(uid),
current_timestamp + sum(paid1_until - current_timestamp),
current_timestamp + sum(paid2_until - current_timestamp),
max(banned_until)
FROM users
WHERE uid IN (SELECT JSONB_ARRAY_ELEMENTS(in_users))
CONTEXT: PL/pgSQL function merge_users(jsonb) line 8 at SQL statement
Please help me to solve the problem.
Here is a gist with SQL code for your convenience.
The result of jsonb_array_elements() is a set of jsonb elements, so you need to add an explicit cast of uid to jsonb with the to_jsonb() function; IN is then replaced with the <@ operator:
WITH t(val) AS ( VALUES
('[1,2,3,4]'::JSONB)
)
SELECT TRUE
FROM t,jsonb_array_elements(t.val) element
WHERE to_jsonb(1) <@ element;
For your case, the snippet should be adjusted to something like:
...SELECT ...,JSONB_ARRAY_ELEMENTS(in_users) user_ids
WHERE to_jsonb(uid) <@ user_ids...
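Another option, shown here only as a sketch and not as part of the answer above, is to go the other way: turn the array elements into integers and keep the IN condition. It uses jsonb_array_elements_text(), which returns each element as text that can be cast. The inline uid values stand in for the users table.

```sql
SELECT u.uid
FROM (VALUES (1), (2), (5)) AS u(uid)   -- stand-in for the users table
WHERE u.uid IN (
    -- each array element comes back as text and is cast to integer
    SELECT e.value::integer
    FROM jsonb_array_elements_text('[1,2,3,4]'::jsonb) AS e(value)
);
-- returns uid 1 and 2; 5 is not in the array
```

Inside merge_users the literal array would be replaced by the in_users parameter. The cast fails if the array contains non-numeric elements, so this assumes the caller always passes a list of integers.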