PostgreSQL: A function to add parent & child records

I am writing a function to add a parent record and children records.
As far as I am aware, I should create appropriate data types, which I have simplified.
create type parenttype as (
data varchar -- data
);
create type childtype as (
parent integer, -- foreign key
details varchar -- data
);
This is a simplified version, omitting a number of fields which add nothing to the question. Both types also omit the primary key, which will be generated.
I think the function would take the following form:
function adddata(parentdata parenttype, childdata childtype[])
-- etc
-- LANGUAGE plpgsql
I think I know what to do with the data inside once it gets there.
The question is, how do I set up the data beforehand, when I call the function? That is, how do I set values for the parenttype and the array of childtype?
I have asked a related question for MS SQL Server, but I know that this requires a different approach.

It depends on where you call the function from and where the data comes from.
Example 1
SELECT * FROM adddata(
row('somedata')::parenttype,
array[row(1::integer,'somedata'), row(2::integer,'somedata22')]::childtype[]
);
Example 2
SELECT adddata(
row(parent.column)::parenttype,
array_agg(row(child.parent_id,child.column)::childtype)
)
FROM parent
JOIN child ON child.parent_id = parent.id
GROUP BY parent.column;
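For context, here is a minimal sketch of how a plpgsql body might consume these two arguments. The tables parent and child, their columns, and the integer return value are assumptions for illustration, since the question does not show them.

```sql
-- Assumed tables; the question does not show the real definitions.
CREATE TABLE parent (id serial PRIMARY KEY, data varchar);
CREATE TABLE child (
    id serial PRIMARY KEY,
    parent integer REFERENCES parent (id),
    details varchar
);

CREATE FUNCTION adddata(parentdata parenttype, childdata childtype[])
RETURNS integer
LANGUAGE plpgsql AS
$$
DECLARE
    new_id integer;
BEGIN
    -- Insert the parent row and capture its generated key.
    INSERT INTO parent (data)
    VALUES (parentdata.data)
    RETURNING id INTO new_id;

    -- Expand the array into rows; the generated key is used
    -- instead of whatever was passed in c.parent.
    INSERT INTO child (parent, details)
    SELECT new_id, c.details
    FROM unnest(childdata) AS c;

    RETURN new_id;
END;
$$;
```

With a body like this, the parent field of each childtype value passed by the caller is ignored, so any placeholder value will do in the call.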

Related

When does a (stored) GENERATED COLUMN get regenerated?

Is it regenerated on any update to the row (which would be rather dumb, and I would then expect a performance warning on the documentation page), or is it smart enough to analyze the generation expression and regenerate the computed column only when the input column(s) have changed?

The documentation is rather clear:
A stored generated column is computed when it is written (inserted or updated) and occupies storage as if it were a normal column. A virtual generated column occupies no storage and is computed when it is read. Thus, a virtual generated column is similar to a view and a stored generated column is similar to a materialized view (except that it is always updated automatically).
So it seems that the generated always column is generated always.
Below is a small test case to verify this.
We define an immutable function, used in the formula, with pg_sleep inside so we can see whether the function was called
create or replace function wait10s(x numeric)
returns int
as $$
SELECT pg_sleep(10);
select x as result;
$$ language sql IMMUTABLE;
Table DDL
create table t
(col1 numeric,
col2 numeric,
gen_col numeric generated always as ( wait10s(col2) ) STORED
);
Insert
As expected, we wait 10 seconds.
insert into t (col1, col2) values (1,1);
Update of column used in formula
update t set col2 = 2;
Again, the expected wait.
Update of column NOT used in formula
update t set col1 = 2;
No wait, so it seems that there is an optimization step that calls the formula only when necessary.
This makes perfect sense, but of course you should take it with care as this behavior is not documented and may change...
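A variant of the same test that does not rely on timing is sketched below. The function noisy and the table t2 are hypothetical names; the function raises a NOTICE each time it is actually evaluated, so you can watch which statements trigger a recomputation on your server version.

```sql
-- Hypothetical helper: announces every call instead of sleeping.
-- Marked IMMUTABLE only because generated columns require it.
CREATE OR REPLACE FUNCTION noisy(x numeric)
RETURNS numeric
LANGUAGE plpgsql IMMUTABLE AS
$$
BEGIN
    RAISE NOTICE 'noisy(%) called', x;
    RETURN x;
END;
$$;

CREATE TABLE t2 (
    col1 numeric,
    col2 numeric,
    gen_col numeric GENERATED ALWAYS AS (noisy(col2)) STORED
);

INSERT INTO t2 (col1, col2) VALUES (1, 1);
UPDATE t2 SET col2 = 2;  -- touches the input column
UPDATE t2 SET col1 = 2;  -- does not touch the input column
```

Any statement that recomputes the column should print the notice; one that skips the recomputation should stay silent.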

Postgres Functions: Getting the Return Table Column Details

I feel the need to get the column names and data types of the table returned by any function that has a 'record' return data type, because...
A key process in an existing SQL Server-based system makes use of a stored procedure that takes a user-defined function as a parameter. An initial step gets the column names and types of the table returned by the function that was passed as a parameter.
In Postgres 13 I can use pg_proc.prorettype and the corresponding pg_type to find functions that return record types...that's a start. I can also use pg_get_function_result() to get the string containing the information I need. But, it's a string, and while I ultimately will have to assemble a very similar string, this is just one application of the info. Is there a tabular equivalent containing (column_name, data_type, ordinal_position), or do I need to do that myself?
Is there access to a composite data type the system may have created when such a function is created?
One option that I think will work for me, but I think it's a little weird, is to:
> create temp table t as select * from function() limit 0;
then look that table up in information_schema.columns, assemble what I need and drop the temp table...putting all of this into a function.

You can query the catalog table pg_proc, which contains all the required information:
SELECT coalesce(p.na, 'column' || p.i),
p.ty::regtype,
p.i
FROM pg_proc AS f
CROSS JOIN LATERAL unnest(
coalesce(f.proallargtypes, ARRAY[f.prorettype]),
f.proargmodes,
f.proargnames
)
WITH ORDINALITY AS p(ty,mo,na,i)
WHERE f.proname = 'interval_ok'
AND coalesce(p.mo, 'o') IN ('o', 't')
ORDER BY p.i;
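For comparison, the temporary-table idea from the question could be sketched as follows. Here my_function is a placeholder for the function being inspected, and fn_shape is an arbitrary table name. The sketch assumes the function declares its output columns (RETURNS TABLE or OUT parameters); a function declared as returning plain record would need a column definition list in the FROM clause.

```sql
-- my_function() is a placeholder for the function being inspected.
-- LIMIT 0 copies the column layout without materializing any rows.
CREATE TEMP TABLE fn_shape AS
SELECT * FROM my_function() LIMIT 0;

-- Temporary tables live in a schema named pg_temp_<n>.
SELECT column_name, data_type, ordinal_position
FROM information_schema.columns
WHERE table_name = 'fn_shape'
  AND table_schema LIKE 'pg_temp%'
ORDER BY ordinal_position;

DROP TABLE fn_shape;
```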

PostgreSQL: Creating a function which accepts multiple values

I want to write a function which will insert a record and then insert one or more records in a related table. I think I know what to do inside the function, but I don’t know what the function signature should look like.
Here is a mockup sample:
CREATE TABLE sales(id SERIAL, customer int, sold date);
CREATE TABLE saleitems(id SERIAL, sale int, details varchar, price numeric(6,2));
SELECT addSale(42, '2016-01-01',
(values ('stuff',13),('more stuff',42),('things',3.14),('etc',0)) items(details,price));
CREATE OR REPLACE FUNCTION addSale(customer,sold,items) RETURNS int AS
$$
-- I think I can handle the rest
$$
LANGUAGE sql;
The salient points:
I would like to be able to use the VALUES (…) name(…) construct as an argument — is this possible?
The real problem, I think, is the last parameter items. What is the appropriate type of this?
I would like the language to be SQL, since my next step is to translate this into other dialects (MySQL & SQL Server). However, I’ll do whatever is needed.
Eventually I will wrap the code body inside a transaction, and return the new sales.id value.
The question is: what is the correct parameter to accept a table expression in the VALUES form?

Your best bet here is to create a new type that holds the details and price of a product:
CREATE TYPE product_details AS (
details varchar,
price numeric(6,2)
);
Then you can define a function parameter of type product_details[], i.e. an array of product details. Since you want to have a SQL function and need to retrieve the value of the serial column of one insert for use in another insert, you need a CTE:
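CREATE FUNCTION addSale(_customer int, _sold date, _items product_details[]) RETURNS int AS
$$
-- Insert the sale first and expose its generated id to the second insert.
WITH s AS (
INSERT INTO sales (customer, sold) VALUES (_customer, _sold) RETURNING id
)
INSERT INTO saleitems (sale, details, price)
SELECT s.id, i.d, i.p
FROM s, unnest(_items) i(d, p)
-- The function result is taken from the first returned row: the new sales.id.
RETURNING sale;
$$ LANGUAGE sql;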
And then you call the function like so:
SELECT addSale(42, '2016-01-01'::date,
ARRAY[('stuff',13),('more stuff',42),('things',3.14),('etc',0)]::product_details[]);

How to update a value using string_agg without using a function?

I solved this problem with a function and I like my solution, but I want to know if there is a way to solve this problem without using functions. Here is the thing:
There are four tables that are relevant to this:
entities: entities of the system (tenants)
members: members of an entity
member_sets: sets of members
members_and_sets: table to join members and sets (many to many)
The member_sets table has a column named bits, which is a binary representation of the set. For example, if an entity has 5 members and one specific set contains only the third member, the value of the bits column is 00100. Each entity has three special kinds of sets: universe, empty and unit. Their binary representations are 11111, 00000 and 10000 respectively, assuming the unit set contains the first member.
The challenge is to keep this binary representation up to date: whenever a member is added to the entity, all binary representations must be updated. This is easy to do with a trigger and a function. My solution is this:
CREATE FUNCTION setbits(INTEGER) RETURNS VARBIT AS
$$SELECT STRING_AGG(belongs, '')::VARBIT AS setbits FROM (
SELECT LEAST(COALESCE(members_and_sets.set_id, 0), 1)::text AS belongs
FROM members LEFT JOIN members_and_sets
ON members.id=members_and_sets.member_id
AND members_and_sets.set_id=$1
GROUP BY members.id,members_and_sets.set_id
ORDER BY members.id)
AS bitsring;$$
LANGUAGE SQL
RETURNS NULL ON NULL INPUT;
-- calling this function in a trigger after inserting a new member:
UPDATE member_sets ms
SET bits=setbits(ms.id)
WHERE ms.entity_id=NEW.entity_id;
Now my question is: can I do this without using a function? I tried with a CTE, but apparently I'm too much of a noob to accomplish this. I couldn't pass the set_id to the innermost query, so my solution was to wrap the query in a function and pass the set_id as an argument. Again, this solution works perfectly; I just want to know whether there is a way to do this without a function call.

As your function body is simply a SELECT, you should be able to replace the function call with a subquery:
UPDATE member_sets ms
SET bits= (
SELECT STRING_AGG(belongs, '')::VARBIT AS setbits FROM (
SELECT LEAST(COALESCE(members_and_sets.set_id, 0), 1)::text AS belongs
FROM members LEFT JOIN members_and_sets
ON members.id=members_and_sets.member_id
AND members_and_sets.set_id=ms.id
GROUP BY members.id,members_and_sets.set_id
ORDER BY members.id)
AS bitsring
)
WHERE ms.entity_id=NEW.entity_id;

Sequence Generators in T-SQL

We have an Oracle application that uses a standard pattern to populate surrogate keys. We have a series of extrinsic rows (that have specific values for the surrogate keys) and other rows that have intrinsic values.
We use the following Oracle trigger snippet to determine what to do with the Surrogate key on insert:
IF :NEW.SurrogateKey IS NULL THEN
SELECT SurrogateKey_SEQ.NEXTVAL INTO :NEW.SurrogateKey FROM DUAL;
END IF;
If the supplied surrogate key is null then get a value from the nominated sequence, else pass the supplied surrogate key through to the row.
I can't seem to find an easy way to do this in T-SQL. There are all sorts of approaches, but none of them uses the notion of a sequence generator the way Oracle and other SQL-92 compliant DBs do.
Anybody know of a really efficient way to do this in SQL Server T-SQL? By the way, we're using SQL Server 2008 if that's any help.

You may want to look at IDENTITY. This gives you a column for which the value will be determined when you insert the row.
This may mean that you have to insert the row, and determine the value afterwards, using SCOPE_IDENTITY().
There is also an article on simulating Oracle Sequences in SQL Server here: http://www.sqlmag.com/Articles/ArticleID/46900/46900.html?Ad=1
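As a rough sketch of how IDENTITY could cover both cases from the question (generated keys and explicitly supplied keys), consider the following. The table dbo.Widget and its columns are made up for illustration.

```sql
-- Hypothetical table for illustration.
CREATE TABLE dbo.Widget (
    SurrogateKey int IDENTITY(1000, 1) PRIMARY KEY,
    Name varchar(50) NOT NULL
);

-- Intrinsic row: the key is generated by the IDENTITY property.
INSERT INTO dbo.Widget (Name) VALUES ('generated');
SELECT SCOPE_IDENTITY() AS NewKey;

-- Extrinsic row: a specific key is supplied by the caller.
-- IDENTITY_INSERT must be switched on for the duration of the insert.
SET IDENTITY_INSERT dbo.Widget ON;
INSERT INTO dbo.Widget (SurrogateKey, Name) VALUES (1, 'supplied');
SET IDENTITY_INSERT dbo.Widget OFF;
```

Unlike the Oracle trigger, the caller has to decide up front which form of insert to use, and only one table per session can have IDENTITY_INSERT switched on at a time.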
Identity is one approach, although it will generate unique identifiers at a per table level.
Another approach is to use unique identifiers, in particular NEWSEQUENTIALID(), which ensures that each generated id is greater than the last. The problem with this approach is that you are no longer dealing with integers.
The closest way to emulate the oracle method is to have a separate table with a counter field, and then write a user defined function that queries this field, increments it, and returns the value.
Here is a way to do it using a table to store your last sequence number. The stored proc is very simple, most of the stuff in there is because I'm lazy and don't like surprises should I forget something so...here it is:
----- Create the sequence value table.
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[SequenceTbl]
(
[CurrentValue] [bigint]
) ON [PRIMARY]
GO
-----------------Create the stored procedure
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE procedure [dbo].[sp_NextInSequence](@SkipCount BigInt = 1)
AS
BEGIN
BEGIN TRANSACTION
DECLARE @NextInSequence BigInt;
IF NOT EXISTS
(
SELECT
CurrentValue
FROM
SequenceTbl
)
INSERT INTO SequenceTbl (CurrentValue) VALUES (0);
SELECT TOP 1
@NextInSequence = ISNULL(CurrentValue, 0) + 1
FROM
SequenceTbl WITH (HoldLock);
UPDATE SequenceTbl WITH (UPDLOCK)
SET CurrentValue = @NextInSequence + (@SkipCount - 1);
COMMIT TRANSACTION
RETURN @NextInSequence
END;
GO
--------Use the stored procedure in Sql Manager to retrieve a test value.
declare @NextInSequence BigInt
exec @NextInSequence = sp_NextInSequence;
--exec @NextInSequence = sp_NextInSequence <skipcount>;
select NextInSequence = @NextInSequence;
-----Show the current table value.
select * from SequenceTbl;
The astute will notice that there is an (optional) parameter for the stored proc. This allows the caller to reserve a block of IDs when it has more than one record that needs a unique id: using the SkipCount, the caller need make only a single call for however many IDs are needed.
The entire "IF NOT EXISTS...INSERT INTO..." block can be removed if you remember to insert a record when the table is created. If you also remember to insert that record with a value (your seed value, a number which will never be used as an ID), you can also remove the ISNULL(...) portion of the select and just use CurrentValue + 1.
Now, before anyone makes a comment, please note that I am a software engineer, not a dba! So, any constructive criticism concerning the use of "Top 1", "With (HoldLock)" and "With (UPDLock)" is welcome. I don't know how well this will scale but this works OK for me so far...
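On the locking question, one commonly used alternative is to do the read and the increment in a single UPDATE statement, so no separate lock hints are needed. The sketch below assumes SequenceTbl already holds exactly one seeded row. The procedure name dbo.NextInSequence and the @FirstValue output parameter are made up for illustration. An OUTPUT parameter is used because a stored procedure's RETURN value is an int, which cannot carry the full bigint range.

```sql
-- Assumes SequenceTbl already holds exactly one seeded row.
CREATE PROCEDURE dbo.NextInSequence
    @SkipCount bigint = 1,
    @FirstValue bigint OUTPUT
AS
BEGIN
    SET NOCOUNT ON;
    DECLARE @LastValue bigint;

    -- A single statement both increments the counter and captures
    -- the new value, so the read and the write cannot be interleaved.
    UPDATE dbo.SequenceTbl
    SET @LastValue = CurrentValue = CurrentValue + @SkipCount;

    -- First id of the reserved block.
    SET @FirstValue = @LastValue - @SkipCount + 1;
END;
GO

-- Usage: reserve one id and display it.
DECLARE @Id bigint;
EXEC dbo.NextInSequence @FirstValue = @Id OUTPUT;
SELECT @Id AS NextInSequence;
```

If the call runs inside a larger transaction, the row stays locked until that transaction ends, so other callers will queue behind it.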