Does PostgreSQL support computed / calculated columns, like MS SQL Server? I can't find anything in the docs, but as this feature is included in many other DBMSs I thought I might be missing something.
Eg: http://msdn.microsoft.com/en-us/library/ms191250.aspx
Postgres 12 or newer
STORED generated columns are introduced with Postgres 12 - as defined in the SQL standard and implemented by some RDBMS including DB2, MySQL, and Oracle. Or the similar "computed columns" of SQL Server.
Trivial example:
CREATE TABLE tbl (
int1 int
, int2 int
, product bigint GENERATED ALWAYS AS (int1 * int2) STORED
);
fiddle
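A quick way to see the behaviour, sketched against the tbl table defined above (the inserted values are arbitrary and only for illustration):

```sql
INSERT INTO tbl (int1, int2) VALUES (3, 4);

SELECT int1, int2, product FROM tbl;   -- product comes back as 12

-- The stored value is recomputed whenever a source column changes.
UPDATE tbl SET int2 = 5;
SELECT product FROM tbl;               -- now 15

-- Writing to the generated column directly is rejected with an error:
-- INSERT INTO tbl (int1, int2, product) VALUES (1, 2, 99);
```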
VIRTUAL generated columns may come with one of the next iterations. (Not in Postgres 15, yet).
Related:
Attribute notation for function call gives error
Postgres 11 or older
Up to Postgres 11 "generated columns" are not supported.
You can emulate VIRTUAL generated columns with a function using attribute notation (tbl.col) that looks and works much like a virtual generated column. That's a bit of a syntax oddity which exists in Postgres for historic reasons and happens to fit the case. This related answer has code examples:
Store common query as column?
The expression (looking like a column) is not included in a SELECT * FROM tbl, though. You always have to list it explicitly.
Can also be supported with a matching expression index - provided the function is IMMUTABLE. Like:
CREATE FUNCTION col(tbl) ... AS ... -- your computed expression here
CREATE INDEX ON tbl(col(tbl));
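Filled in, that could look roughly like the following. This is only a sketch: the table layout and the function name int_product are made up for illustration, and the function has to be declared IMMUTABLE for the index to be allowed.

```sql
-- Hypothetical table; no generated column here (Postgres 11 or older).
CREATE TABLE tbl (int1 int, int2 int);

-- The function takes the table's row type as its only argument.
CREATE FUNCTION int_product(tbl) RETURNS bigint
  LANGUAGE sql IMMUTABLE AS
'SELECT $1.int1::bigint * $1.int2';

-- Optional: matching expression index.
CREATE INDEX tbl_int_product_idx ON tbl (int_product(tbl));

-- Attribute notation: looks like a column, but must be listed explicitly
-- and must be table-qualified.
SELECT t.int1, t.int2, t.int_product
FROM   tbl t;
```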
Alternatives
Alternatively, you can implement similar functionality with a VIEW, optionally coupled with expression indexes. Then SELECT * can include the generated column.
"Persisted" (STORED) computed columns can be implemented with triggers in a functionally equivalent way.
Materialized views are a related concept, implemented since Postgres 9.3.
In earlier versions one can manage MVs manually.
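For completeness, a minimal sketch of the materialized-view variant, assuming a table tbl with columns int1 and int2 (the view name tbl_mv is hypothetical). Unlike a plain view, the result is stored and only changes when it is refreshed:

```sql
CREATE MATERIALIZED VIEW tbl_mv AS
SELECT int1, int2, int1::bigint * int2 AS product
FROM   tbl;

-- Must be re-run after the underlying table changes.
REFRESH MATERIALIZED VIEW tbl_mv;
```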
YES you can!! The solution should be easy, safe, and performant...
I'm new to PostgreSQL, but it seems you can create computed columns by using an expression index, paired with a view (the view is optional, but makes life a bit easier).
Suppose my computation is md5(some_string_field), then I create the index as:
CREATE INDEX some_string_field_md5_index ON some_table(MD5(some_string_field));
Now, any queries that act on MD5(some_string_field) can use the index rather than computing it from scratch. For example:
SELECT MAX(some_field) FROM some_table GROUP BY MD5(some_string_field);
You can check this with explain.
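Something along these lines, as a sketch (the literal 'some value' is just a placeholder). Whether the plan actually shows an index scan on some_string_field_md5_index depends on the table size and statistics, so no particular output is implied here:

```sql
EXPLAIN
SELECT *
FROM   some_table
WHERE  MD5(some_string_field) = MD5('some value');
```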
However at this point you are relying on users of the table knowing exactly how to construct the column. To make life easier, you can create a VIEW onto an augmented version of the original table, adding in the computed value as a new column:
CREATE VIEW some_table_augmented AS
SELECT *, MD5(some_string_field) as some_string_field_md5 from some_table;
Now any queries using some_table_augmented will be able to use some_string_field_md5 without worrying about how it works; they just get good performance. The view doesn't copy any data from the original table, so it is good memory-wise as well as performance-wise. Note however that in older Postgres versions you can't update/insert into a view, only into the source table (since Postgres 9.4 a simple view like this one is automatically updatable, although the computed column itself stays read-only). If you really want, I believe you can also redirect inserts and updates to the source table using rules (I could be wrong on that last point as I've never tried it myself).
Edit: it seems that if the query involves competing indexes, the planner may sometimes not use the expression index at all. The choice seems to be data dependent.
One way to do this is with a trigger!
CREATE TABLE computed(
one SERIAL,
two INT NOT NULL
);
CREATE OR REPLACE FUNCTION computed_two_trg()
RETURNS trigger
LANGUAGE plpgsql
SECURITY DEFINER
AS $BODY$
BEGIN
NEW.two = NEW.one * 2;
RETURN NEW;
END
$BODY$;
CREATE TRIGGER computed_500
BEFORE INSERT OR UPDATE
ON computed
FOR EACH ROW
EXECUTE PROCEDURE computed_two_trg();
The trigger fires before each row is inserted or updated. It sets the computed field on the NEW record and then returns that record, so the modified row is what gets written.
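A short usage sketch, assuming a freshly created computed table (so the serial starts at 1). Any value supplied for two is simply overwritten by the trigger:

```sql
INSERT INTO computed DEFAULT VALUES;       -- one = 1, trigger sets two = 2
INSERT INTO computed (two) VALUES (999);   -- one = 2, 999 is replaced by 4

SELECT one, two FROM computed ORDER BY one;
--  one | two
--    1 |   2
--    2 |   4
```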
PostgreSQL 12 supports generated columns:
PostgreSQL 12 Beta 1 Released!
Generated Columns
PostgreSQL 12 allows the creation of generated columns that compute their values with an expression using the contents of other columns. This feature provides stored generated columns, which are computed on inserts and updates and are saved on disk. Virtual generated columns, which are computed only when a column is read as part of a query, are not implemented yet.
Generated Columns
A generated column is a special column that is always computed from other columns. Thus, it is for columns what a view is for tables.
CREATE TABLE people (
...,
height_cm numeric,
height_in numeric GENERATED ALWAYS AS (height_cm / 2.54) STORED
);
db<>fiddle demo
Well, not sure if this is what you mean, but Postgres supports plain "dummy" ETL-style syntax.
I created one empty column in the table and then needed to fill it with values calculated from the other columns in each row.
UPDATE table01
SET column03 = column01*column02; /*e.g. for multiplication of 2 values*/
It is so simple I suspect it is not what you are looking for.
Obviously it is not dynamic: you run it once. But nothing stops you from putting it into a trigger.
Example of creating an empty virtual column
,(SELECT *
From (values (''))
A("virtual_col"))
Example of creating two virtual columns with values
SELECT *
From (values (45,'Completed')
, (1,'In Progress')
, (1,'Waiting')
, (1,'Loading')
) A("Count","Status")
order by "Count" desc
I have code that works and uses the keyword calculated. I'm not on pure PostgreSQL though; we run on PADB.
Here is how it's used:
create table some_table as
select category,
txn_type,
indiv_id,
accum_trip_flag,
max(first_true_origin) as true_origin,
max(first_true_dest ) as true_destination,
max(id) as id,
count(id) as tkts_cnt,
(case when calculated tkts_cnt=1 then 1 else 0 end) as one_way
from some_rando_table
group by 1,2,3,4 ;
A lightweight solution with Check constraint:
CREATE TABLE example (
discriminator INTEGER DEFAULT 0 NOT NULL CHECK (discriminator = 0)
);
I needed basic help on how to combine columns into one new column in the same table. I have done the below as a SELECT command and it works fine. I just don't know how to add it to the table permanently so that it becomes part of the table.
SELECT *, concat(z41, z42, z43, z44) AS option_3,
concat(z411, z412, z413, z421, z422, z423, z431, z432, z433, z434, z444,z443, z442, z441) AS option_4,
concat(z4211, z4212, z4213, z4214, z4215, z4311, z4312, z4313, z4314, z4431, z4432, z4433, z4434, z4421, z4422, z4423, z4424, z4425, z4426) AS option_5
FROM combined_full
Like others have mentioned, you are probably better off using a view. But if you really need this computed data in a column, then you can do this:
ALTER TABLE combined_full ADD COLUMN option_3 varchar,
ADD COLUMN option_4 varchar,
ADD COLUMN option_5 varchar;
UPDATE combined_full
SET option_3 = concat(z41, z42, z43, z44),
option_4 = concat(z411, z412, z413, z421, z422, z423, z431, z432, z433, z434, z444,z443, z442, z441),
option_5 = concat(z4211, z4212, z4213, z4214, z4215, z4311, z4312, z4313, z4314, z4431, z4432, z4433, z4434, z4421, z4422, z4423, z4424, z4425, z4426);
When adding new rows to the table, you should either also enter values for these three new columns, or create an insert trigger so that the values are automatically calculated as you do above.
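Such a trigger might be sketched as follows, assuming the three option columns have already been added with the ALTER TABLE above. The function and trigger names are made up, and only option_3 is spelled out; option_4 and option_5 would be assigned the same way from their own column lists.

```sql
CREATE FUNCTION combined_full_options_trg() RETURNS trigger
  LANGUAGE plpgsql AS
$$
BEGIN
   NEW.option_3 := concat(NEW.z41, NEW.z42, NEW.z43, NEW.z44);
   -- assign NEW.option_4 and NEW.option_5 here in the same manner
   RETURN NEW;
END
$$;

CREATE TRIGGER combined_full_options
BEFORE INSERT OR UPDATE ON combined_full
FOR EACH ROW
EXECUTE PROCEDURE combined_full_options_trg();
```

Firing on UPDATE as well keeps the stored values in sync when one of the source columns changes later.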
"so that it becomes part of the table" - you can't. Unfortunately Postgres (as of 9.6) has no (persisted) computed columns.
If the expression is not very expensive to calculate and you don't need an index on it, I would suggest to create a view that contains the expression.
Given the example in your question, this should be good enough in your case as concatenating values isn't really that expensive.
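A sketch of such a view, assuming the option columns have not been added to combined_full itself (otherwise the names would clash with SELECT *). The view name is hypothetical, and option_4 and option_5 are left out for brevity:

```sql
CREATE VIEW combined_full_with_options AS
SELECT *,
       concat(z41, z42, z43, z44) AS option_3
FROM   combined_full;

-- Queried like a table; the expression is evaluated at read time.
SELECT option_3 FROM combined_full_with_options;
```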
If you really think you need to persist the calculation of the expression because e.g. you want to create an index on that or you constantly use that expression in a where clause, you will need to add a regular column to the table and a trigger that updates the expression when a row is inserted or updated.
I need to create a surrogate identity key for some intermediate tables used in a stored procedure in Oracle. I found that ROWID inserted into a UROWID column works well but this is not the correct way in older versions of Oracle (before 10g) -- using SEQUENCE.NEXTVAL is. SEQUENCE.NEXTVAL is a 2 step process and uses up memory/storage (full table scan) whereas with the ROWID way you just save the address and you're done. (like IDENTITY in SQL)
I want to use ROWID as the identity key. Is it OK for me to do this?
Just to be on the safe side, this is how pros use sequences:
insert into master_table(id, x, y, z) values (seq_master.nextval, :x, :y, :z);
insert into detail_table(master_id, a, b) values (seq_master.currval, :a, :b);
insert into detail_table(master_id, a, b) values (seq_master.currval, :c, :d);
...
I would prefer sequences any day over ROWIDs.
So I'm coming from MySQL where I could do INSERT on DUPLICATE UPDATE:
INSERT INTO table (a,b,c)
VALUES (1,2,3)
ON DUPLICATE KEY UPDATE c=c+1;
But now I'm using PostgreSQL, where there are efforts to add UPSERT functionality. It looks like MERGE might work for what I want, but I wanted to check whether this is the best syntax. (Example Syntax 1; I've also seen this but don't understand how to implement it.) I haven't tried it yet because I thought MERGE was for merging data from table1 into table2. Or would something like this work?
MERGE
INTO table
USING table
ON c = 1
WHEN MATCHED THEN
UPDATE
SET c=c+1
WHEN NOT MATCHED THEN
INSERT (a,b,c)
VALUES (1,2,3)
Any other suggestions?
Until MERGE is available, use this robust approach: Insert, on duplicate update in PostgreSQL?
Until merge is supported the simplest way IMO is to just break it up into two queries:
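BEGIN;
-- Update first, so that a freshly inserted row is not incremented as well.
UPDATE t SET c = c + 1 WHERE id = 1;
-- INSERT ... VALUES cannot take a WHERE clause; use INSERT ... SELECT instead.
INSERT INTO t (a, b, c)
SELECT 1, 2, 3
WHERE NOT EXISTS (SELECT 1 FROM t WHERE id = 1);
END;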
where id would be changed to the appropriate condition.
MERGE INTO table
USING (VALUES (1, 2, 3)) AS newvalues (a, b, c)
ON table.c = newvalues.c -- or whatever the PK is
WHEN MATCHED THEN UPDATE SET c = table.c + 1
WHEN NOT MATCHED THEN INSERT (a, b, c)
VALUES (newvalues.a, newvalues.b, newvalues.c)
The key here is that instead of merging in another table you create a constant table source using the VALUES construct in the USING clause. The exact merging rules you can obviously tailor to taste.
See also http://petereisentraut.blogspot.com/2010/05/merge-syntax.html.
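On Postgres 9.5 or later, the MySQL statement from the question also maps fairly directly onto INSERT ... ON CONFLICT, without MERGE. A minimal sketch, assuming a hypothetical table t whose column a carries the unique or primary key constraint that defines a "duplicate":

```sql
CREATE TABLE t (a int PRIMARY KEY, b int, c int);

INSERT INTO t (a, b, c)
VALUES (1, 2, 3)
ON CONFLICT (a) DO UPDATE
SET c = t.c + 1;   -- t.c is the existing row; EXCLUDED.c would be the proposed value
```

Running the INSERT twice leaves a single row with c = 4: the first run inserts (1, 2, 3), the second hits the conflict and increments c.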
I think "MERGE" is not yet in Postgres (it was hoped for in 9.1, but only arrived in Postgres 15).
I like to use RULEs instead
CREATE OR REPLACE RULE "insert_ignore"
AS ON INSERT TO "table" WHERE
EXISTS (SELECT 1 FROM "table" WHERE id = NEW.id) --whatever your conditions are (OLD is not available in ON INSERT rules)
DO INSTEAD NOTHING;
What you have linked to ("Insert, on duplicate update (postgresql)") is basically a PL/pgSQL function that you feed the data to. I think the RULE is more elegant, since rules work transparently in the background and you don't need to call a procedure explicitly in place of your actual INSERT.