How to update a value using string_agg without using a function? - postgresql

I solved this problem with a function and I like my solution, but I want to know if there is a way to solve this problem without using functions. Here is the thing:
There are four tables that are relevant to this:
entities: entities of the system (tenants)
members: members of an entity
member_sets: sets of members
members_and_sets: table to join members and sets (many to many)
The member_sets table has a column named bits which is a binary representation of the set. For example, if an entity has 5 members and one specific set contains only the third member, the value of the bits column is 00100. An entity has three special kinds of sets: universe, empty and unit. Their binary representations are 11111, 00000 and 10000 respectively, assuming the unit set has the first member.
The challenge is to keep this binary representation of the set up to date: whenever a member is added to the entity, all binary representations must be updated. This is easy to do with a trigger and a function. My solution is this:
CREATE FUNCTION setbits(INTEGER) RETURNS VARBIT AS
$$SELECT STRING_AGG(belongs, '')::VARBIT AS setbits FROM (
SELECT LEAST(COALESCE(members_and_sets.set_id, 0), 1)::text AS belongs
FROM members LEFT JOIN members_and_sets
ON members.id=members_and_sets.member_id
AND members_and_sets.set_id=$1
GROUP BY members.id,members_and_sets.set_id
ORDER BY members.id)
AS bitsring;$$
LANGUAGE SQL
RETURNS NULL ON NULL INPUT;
-- calling this function in a trigger after inserting a new member:
UPDATE member_sets ms
SET bits=setbits(ms.id)
WHERE ms.entity_id=NEW.entity_id;
Now my question is: can I do this without using a function? I tried with a CTE, but apparently I'm too much of a noob to accomplish this; I couldn't pass the set_id to the innermost query, so my solution was to wrap the query in a function and pass the set_id as an argument. Again, this solution works perfectly; I just want to know whether there is a way to do this without a function call.

As your function body is simply a SELECT, you should be able to replace the function call with a subquery:
UPDATE member_sets ms
SET bits= (
SELECT STRING_AGG(belongs, '')::VARBIT AS setbits FROM (
SELECT LEAST(COALESCE(members_and_sets.set_id, 0), 1)::text AS belongs
FROM members LEFT JOIN members_and_sets
ON members.id=members_and_sets.member_id
AND members_and_sets.set_id=ms.id
GROUP BY members.id,members_and_sets.set_id
ORDER BY members.id)
AS bitsring
)
WHERE ms.entity_id=NEW.entity_id;
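A slightly more defensive variant, as a sketch only: the ORDER BY moves inside the aggregate call, because row order coming out of a subquery is not guaranteed to reach string_agg. The entity_id column on members and the uniqueness of (member_id, set_id) in members_and_sets are assumptions that the question does not confirm.

```sql
-- Sketch: correlated subquery with an ordered aggregate.
-- Assumptions: members.entity_id exists, and each (member_id, set_id)
-- pair appears at most once in members_and_sets.
UPDATE member_sets ms
SET    bits = (
         SELECT string_agg(
                  CASE WHEN mas.member_id IS NULL THEN '0' ELSE '1' END,
                  '' ORDER BY m.id            -- order fixed inside the aggregate
                )::varbit
         FROM   members m
         LEFT   JOIN members_and_sets mas
                ON  mas.member_id = m.id
                AND mas.set_id    = ms.id     -- correlation to the outer row
         WHERE  m.entity_id = ms.entity_id    -- only members of the same entity
       )
WHERE  ms.entity_id = 1;  -- inside the trigger function, use NEW.entity_id here
```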

Related

Postgres Functions: Getting the Return Table Column Details

I feel the need to get the column names and data types of the table returned by any function that has a 'record' return data type, because...
A key process in an existing SQL Server-based system makes use of a stored procedure that takes a user-defined function as a parameter. An initial step gets the column names and types of the table returned by the function that was passed as a parameter.
In Postgres 13 I can use pg_proc.prorettype and the corresponding pg_type to find functions that return record types...that's a start. I can also use pg_get_function_result() to get the string containing the information I need. But, it's a string, and while I ultimately will have to assemble a very similar string, this is just one application of the info. Is there a tabular equivalent containing (column_name, data_type, ordinal_position), or do I need to do that myself?
Is there access to a composite data type the system may have created when such a function is created?
One option that I think will work for me, but I think it's a little weird, is to:
> create temp table t as select * from function() limit 0;
then look that table up in information_schema.columns, assemble what I need and drop the temp table...putting all of this into a function.
You can query the catalog table pg_proc, which contains all the required information:
SELECT coalesce(p.na, 'column' || p.i),
p.ty::regtype,
p.i
FROM pg_proc AS f
CROSS JOIN LATERAL unnest(
coalesce(f.proallargtypes, ARRAY[f.prorettype]),
f.proargmodes,
f.proargnames
)
WITH ORDINALITY AS p(ty,mo,na,i)
WHERE f.proname = 'interval_ok'
AND coalesce(p.mo, 'o') IN ('o', 't')
ORDER BY p.i;
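For comparison, the temp-table idea from the question could look roughly like this. It is a sketch: my_func is a hypothetical function name, and it only works for functions whose result columns are already known to Postgres (OUT parameters or RETURNS TABLE).

```sql
-- my_func is a hypothetical function with OUT parameters or RETURNS TABLE;
-- a function declared as plain RETURNS record would still need a column
-- definition list in the FROM clause.
CREATE TEMP TABLE t AS
SELECT * FROM my_func() LIMIT 0;            -- copies the column layout, no rows

SELECT a.attname                            AS column_name,
       format_type(a.atttypid, a.atttypmod) AS data_type,
       a.attnum                             AS ordinal_position
FROM   pg_attribute a
WHERE  a.attrelid = 't'::regclass           -- temp schema is searched first
AND    a.attnum > 0                         -- skip system columns
AND    NOT a.attisdropped
ORDER  BY a.attnum;

DROP TABLE t;
```

Reading pg_attribute directly avoids having to work out the session's temporary schema name, which would be needed to filter information_schema.columns reliably.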

Assigning TABLE typed variables in T-SQL

I've defined a user defined Table type - call it TrackRefsTable
Having declared two variables
DECLARE @FOO1 AS TrackRefsTable
DECLARE @FOO2 AS TrackRefsTable
Is there any way to set one to t'other? The obvious
SET @FOO2 = @FOO1
doesn't work as this assignment method only appears to work for Scalar variables and therefore you get the error
Must declare the scalar variable "@FOO1"
I would hope to be able to avoid having to do INSERT statements to move data from one to the other as this can be an expensive operation.
DECLARE @FOO1 AS TrackRefsTable
DECLARE @FOO2 AS TrackRefsTable
-- INSERT INTO @FOO1 here
SET @FOO2 = @FOO1
So my issue was that the SP in which I implemented this would retrieve relatively unstructured data and then try to apply Sorting and Filtering on it. In order to squeeze maximum performance out of this we had to do things like sometimes populating a Table Variable @FOO1, but then sometimes apply Sorting or Filtering on it with results going into @FOO2 before joining it to an actual Data table to retrieve further column data. If performance wasn't such a big deal, I would have taken the simpler option of creating a variable @FOOFinal into which all the data would be placed before implementing a single JOIN to get the remaining data. But INSERT INTO @FOOFinal SELECT * FROM @FOO1 (for example) costs precious milliseconds so that wasn't acceptable.
Ultimately, the solution was to simply create a separate SP in which we do the JOIN from such a Table Variable to the other data. Because the Table variable was defined as a Table Variable Type we could (thanks to the fact that we no longer support anything older than SQL Server 2008) use a Table Type as a parameter in the SP. So the solution then is to simply call that SP with either @FOO1 or @FOO2 as the parameter being passed in, and that obviates the need to assign one to the other.
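A minimal T-SQL sketch of that approach follows. Only the type name TrackRefsTable comes from the post; the column TrackRef, the procedure dbo.JoinTrackRefs and the table dbo.TrackData are hypothetical names used for illustration.

```sql
-- Hypothetical definition of the table type (the real columns are not shown in the post)
CREATE TYPE TrackRefsTable AS TABLE (TrackRef INT NOT NULL PRIMARY KEY);
GO

-- Hypothetical procedure: table-valued parameters must be declared READONLY
CREATE PROCEDURE dbo.JoinTrackRefs
    @Refs TrackRefsTable READONLY
AS
BEGIN
    SELECT t.TrackRef, d.Title
    FROM   @Refs AS t
    JOIN   dbo.TrackData AS d ON d.TrackRef = t.TrackRef;  -- dbo.TrackData is assumed
END;
GO

-- Either table variable can be passed in, so no copy between them is needed
DECLARE @FOO1 AS TrackRefsTable;
INSERT INTO @FOO1 (TrackRef) VALUES (1), (2);
EXEC dbo.JoinTrackRefs @Refs = @FOO1;
```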

How to add a date column which is 7 days later than an existing column in a Postgres table? [duplicate]

Does PostgreSQL support computed / calculated columns, like MS SQL Server? I can't find anything in the docs, but as this feature is included in many other DBMSs I thought I might be missing something.
Eg: http://msdn.microsoft.com/en-us/library/ms191250.aspx
Postgres 12 or newer
STORED generated columns are introduced with Postgres 12 - as defined in the SQL standard and implemented by some RDBMS including DB2, MySQL, and Oracle. Or the similar "computed columns" of SQL Server.
Trivial example:
CREATE TABLE tbl (
int1 int
, int2 int
, product bigint GENERATED ALWAYS AS (int1 * int2) STORED
);
fiddle
VIRTUAL generated columns may come with one of the next iterations. (Not in Postgres 15, yet).
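Applied to the case in the title (a date column that is always 7 days after another one), a sketch could look like this. The table and column names are made up for illustration.

```sql
-- Hypothetical table: due_date is always start_date + 7 days (Postgres 12+)
CREATE TABLE events (
    start_date date NOT NULL,
    due_date   date GENERATED ALWAYS AS (start_date + 7) STORED
);

INSERT INTO events (start_date) VALUES ('2024-01-01');

-- due_date is maintained by Postgres and cannot be written directly
SELECT start_date, due_date FROM events;
```

Adding an integer to a date yields a date and is immutable, which is what a generation expression requires.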
Related:
Attribute notation for function call gives error
Postgres 11 or older
Up to Postgres 11 "generated columns" are not supported.
You can emulate VIRTUAL generated columns with a function using attribute notation (tbl.col) that looks and works much like a virtual generated column. That's a bit of a syntax oddity which exists in Postgres for historic reasons and happens to fit the case. This related answer has code examples:
Store common query as column?
The expression (looking like a column) is not included in a SELECT * FROM tbl, though. You always have to list it explicitly.
Can also be supported with a matching expression index - provided the function is IMMUTABLE. Like:
CREATE FUNCTION col(tbl) ... AS ... -- your computed expression here
CREATE INDEX ON tbl(col(tbl));
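A filled-in version of that outline, as a sketch with hypothetical table and function names:

```sql
-- Hypothetical table and function names, for illustration only
CREATE TABLE items (price int, qty int);

-- The function takes the table's row type as its only argument
CREATE FUNCTION total(items) RETURNS bigint
  LANGUAGE sql IMMUTABLE AS
$$ SELECT $1.price::bigint * $1.qty $$;

-- Attribute notation: i.total looks like a column but is really total(i)
SELECT i.price, i.qty, i.total FROM items i;

-- Matching expression index (requires the function to be IMMUTABLE)
CREATE INDEX ON items (total(items));
```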
Alternatives
Alternatively, you can implement similar functionality with a VIEW, optionally coupled with expression indexes. Then SELECT * can include the generated column.
"Persisted" (STORED) computed columns can be implemented with triggers in a functionally equivalent way.
Materialized views are a related concept, implemented since Postgres 9.3.
In earlier versions one can manage MVs manually.
YES you can!! The solution should be easy, safe, and performant...
I'm new to postgresql, but it seems you can create computed columns by using an expression index, paired with a view (the view is optional, but makes life a bit easier).
Suppose my computation is md5(some_string_field), then I create the index as:
CREATE INDEX some_string_field_md5_index ON some_table(MD5(some_string_field));
Now, any queries that act on MD5(some_string_field) will use the index rather than computing it from scratch. For example:
SELECT MAX(some_field) FROM some_table GROUP BY MD5(some_string_field);
You can check this with explain.
However at this point you are relying on users of the table knowing exactly how to construct the column. To make life easier, you can create a VIEW onto an augmented version of the original table, adding in the computed value as a new column:
CREATE VIEW some_table_augmented AS
SELECT *, MD5(some_string_field) as some_string_field_md5 from some_table;
Now any queries using some_table_augmented will be able to use some_string_field_md5 without worrying about how it works; they just get good performance. The view doesn't copy any data from the original table, so it is good memory-wise as well as performance-wise. Note however that you can't update/insert into a view, only into the source table, but if you really want, I believe you can redirect inserts and updates to the source table using rules (I could be wrong on that last point as I've never tried it myself).
Edit: it seems that if the query involves competing indexes, the planner may sometimes not use the expression index at all. The choice seems to be data-dependent.
One way to do this is with a trigger!
CREATE TABLE computed(
one SERIAL,
two INT NOT NULL
);
CREATE OR REPLACE FUNCTION computed_two_trg()
RETURNS trigger
LANGUAGE plpgsql
SECURITY DEFINER
AS $BODY$
BEGIN
NEW.two = NEW.one * 2;
RETURN NEW;
END
$BODY$;
CREATE TRIGGER computed_500
BEFORE INSERT OR UPDATE
ON computed
FOR EACH ROW
EXECUTE PROCEDURE computed_two_trg();
The trigger is fired before the row is inserted or updated. It sets the computed field on the NEW record and then returns that record.
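A short usage sketch, assuming the table, function and trigger above have been created:

```sql
-- "two" is NOT NULL and has no default, but the BEFORE trigger fills it in
-- before the constraint is checked
INSERT INTO computed DEFAULT VALUES;

-- Any value written to "two" is overwritten by the trigger with one * 2
UPDATE computed SET two = 999;

SELECT one, two FROM computed;
```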
PostgreSQL 12 supports generated columns:
PostgreSQL 12 Beta 1 Released!
Generated Columns
PostgreSQL 12 allows the creation of generated columns that compute their values with an expression using the contents of other columns. This feature provides stored generated columns, which are computed on inserts and updates and are saved on disk. Virtual generated columns, which are computed only when a column is read as part of a query, are not implemented yet.
Generated Columns
A generated column is a special column that is always computed from other columns. Thus, it is for columns what a view is for tables.
CREATE TABLE people (
...,
height_cm numeric,
height_in numeric GENERATED ALWAYS AS (height_cm / 2.54) STORED
);
db<>fiddle demo
Well, not sure if this is what you mean, but Postgres supports this kind of "dummy" ETL syntax.
I created one empty column in the table and then needed to fill it with values calculated from the other values in each row.
UPDATE table01
SET column03 = column01*column02; /*e.g. for multiplication of 2 values*/
It is so simple that I suspect it is not what you are looking for.
Obviously it is not dynamic; you run it once. But nothing stops you from putting it into a trigger.
Example on creating an empty virtual column
,(SELECT *
From (values (''))
A("virtual_col"))
Example on creating two virtual columns with values
SELECT *
From (values (45,'Completed')
, (1,'In Progress')
, (1,'Waiting')
, (1,'Loading')
) A("Count","Status")
order by "Count" desc
I have code that works and uses the term calculated. I'm not on pure PostgreSQL, though; we run on PADB.
Here is how it's used:
create table some_table as
select category,
txn_type,
indiv_id,
accum_trip_flag,
max(first_true_origin) as true_origin,
max(first_true_dest ) as true_destination,
max(id) as id,
count(id) as tkts_cnt,
(case when calculated tkts_cnt=1 then 1 else 0 end) as one_way
from some_rando_table
group by 1,2,3,4 ;
A lightweight solution with Check constraint:
CREATE TABLE example (
discriminator INTEGER DEFAULT 0 NOT NULL CHECK (discriminator = 0)
);

SELECT FROM a function returning a record with arbitrary number of columns

I'm using PostgreSQL database.
I have a plpgsql function that returns a single record with an arbitrary number of columns.
Due to this arbitrariness I would need to use something like:
SELECT * FROM my_function(97)
But this doesn't work as Postgres gives me the following error:
a column definition list is required for functions returning "record"
But if I do:
SELECT my_function(97)
I can see the expected result but encapsulated in a single column.
Is there a way to fetch the expected result as a set of columns as intended by the function and not a single column encapsulating all of them?
When a function just RETURNS record or SETOF record (and no OUT parameters to go with it), PostgreSQL does not know the names and types of its elements and you are required to provide a column definition list with every call.
Avoid that if at all possible and return a well-known (row) type instead. There are several ways to declare the return type. See:
PostgreSQL: ERROR: 42601: a column definition list is required for functions returning "record"
Refactor a PL/pgSQL function to return the output of various SELECT queries
There are quite a few related questions on SO. Try a search!
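Both variants can be sketched as follows. The column names and types (id, name) and the function my_function_typed are hypothetical; the column definition list only works if it matches what the function actually returns.

```sql
-- Variant 1: keep RETURNS record and supply a column definition list per call
-- (assumes my_function really returns an int and a text column)
SELECT * FROM my_function(97) AS t(id int, name text);

-- Variant 2: declare the row type once, in the function itself
CREATE FUNCTION my_function_typed(_id int)
  RETURNS TABLE (id int, name text)
  LANGUAGE sql AS
$$ SELECT _id, 'row ' || _id $$;

-- No column definition list needed any more
SELECT * FROM my_function_typed(97);
```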
When using a set-returning function (SETOF) in the select list, before the FROM clause, the function returns a composite type. Using a function in the select list can be hard to avoid when using a table as input to a function.
A way to SELECT items from a single column of composite type follows:
SELECT
(t.my_function).field1,
(t.my_function).field2,
(t.my_function).field3
FROM
(SELECT my_function(s.some_column) AS my_function
FROM sometable s) t
You have a few options here:
Return a REFCURSOR and fetch from that cursor in the application. Note you can actually return multiple REFCURSORS if you need to return multiple result sets.
Return an XML document and parse it in the application.
Use a bunch of OUT variables, return RECORD, and determine which of these to select from
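The first option might look like the sketch below. The function name, cursor name and selected columns are hypothetical; the point is only that the cursor's column list does not have to be known when the function is declared.

```sql
-- Hypothetical function returning a named cursor
CREATE FUNCTION my_cursor_function(_id int) RETURNS refcursor
  LANGUAGE plpgsql AS
$$
DECLARE
  cur refcursor := 'my_cur';   -- fixed cursor name so the caller can FETCH from it
BEGIN
  OPEN cur FOR SELECT _id AS id, 'x'::text AS label;
  RETURN cur;
END
$$;

-- The cursor only lives until the end of the transaction
BEGIN;
SELECT my_cursor_function(97);
FETCH ALL IN my_cur;
COMMIT;
```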
The basic problem is that the actual return results need to be known at planning time so you can't just return an arbitrary number of columns. The planner needs to know what is going to be returned.
In order to return a "set of columns" you will have to define a return type as TABLE or SETOF, in which case you actually return a SET of records which you should be able to SELECT from.
For more information about functions returning SETOF take a look at this link to documentation
I'm not certain that I follow what you're after, but does this work?
SELECT (my_function(97)).my_column