Is it a function or a table in this lateral join case? - postgresql

It is often particularly handy to LEFT JOIN to a LATERAL subquery, so that source rows will appear in the result even if the LATERAL subquery produces no rows for them. For example, if get_product_names() returns the names of products made by a manufacturer, but some manufacturers in our table currently produce no products, we could find out which ones those are like this:
SELECT m.name
FROM manufacturers m LEFT JOIN LATERAL get_product_names(m.id) pname ON true
WHERE pname IS NULL;
The above is extracted from the PostgreSQL manual.
I think I finally understand what LATERAL means. In this case, however, I am not sure whether get_product_names is a table or a function. These are my two readings:
A: get_product_names(m.id) is a function that takes m.id as an input parameter and returns a table. The returned table is aliased as pname. Overall, table m is joined to that table, and the WHERE condition keeps only the rows where it is null.
B: get_product_names is a table, and table m is left joined to it on m.id, with pname as the alias for get_product_names. Overall, table m is joined to that table, and the WHERE condition keeps only the rows where it is null.

get_product_names is a table function (also known as set returning function or SRF in PostgreSQL slang). Such a function does not necessarily return a single result row, but arbitrarily many rows.
Since the result of such a function is a table, you typically use it in SQL statements where you would use a table, that is in the FROM clause.
A simple example is
SELECT * FROM generate_series(1, 5);
generate_series
-----------------
1
2
3
4
5
(5 rows)
You can also use normal functions in this way; they are then treated as table functions that return exactly one row.
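To connect this back to the manual's example, here is a minimal sketch that runs without any tables. It assumes an inline VALUES list named m as a stand-in for the manufacturers table and uses the built-in generate_series() in place of get_product_names(); both substitutions are illustrative only.

```sql
-- m is a hypothetical stand-in for the manufacturers table.
-- generate_series(1, m.id) plays the role of get_product_names(m.id):
-- it is called once per row of m and returns zero or more rows.
SELECT m.id, g.n
FROM (VALUES (0), (1), (2)) AS m(id)
LEFT JOIN LATERAL generate_series(1, m.id) AS g(n) ON true
ORDER BY m.id, g.n;
--  id | n
-- ----+---
--   0 |      <- generate_series(1, 0) is empty; LEFT JOIN keeps the row with n NULL
--   1 | 1
--   2 | 1
--   2 | 2
```

Adding WHERE g.n IS NULL to this query would leave only the row with id 0, which is the same trick the manual uses to find manufacturers without products.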

Related

Given a postgres function that takes an argument and returns a table, how to merge many such results into a single table (SelectMany)?

I want to do something like LINQ SelectMany in Postgres SQL. I have a function with a signature like this:
CREATE or REPLACE FUNCTION get_objects(id_in int) RETURNS TABLE (name varchar, id_out int)
Each call on it returns 200 rows.
I have a select that returns a table of in_IDs. I want to call my function on each row of such a table and merge them into a single one. In other words, join many lists of rows into a single big table. How to do such a thing in Postgres SQL (not inside function loop)?
Use a lateral join:
select l.*
from the_table_of_in_ids as t -- or put the query in brackets here
cross join lateral get_objects(t.id) as l;
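To try this end to end, here is a small sketch. The function body below is invented purely for illustration (the question only gives the signature), and it returns two rows per call rather than 200; the VALUES list stands in for the table of input IDs.

```sql
-- Hypothetical body for get_objects: two made-up rows per input id.
CREATE OR REPLACE FUNCTION get_objects(id_in int)
RETURNS TABLE (name varchar, id_out int)
LANGUAGE sql AS $$
    SELECT ('obj_' || id_in || '_' || g)::varchar, g
    FROM generate_series(1, 2) AS g;
$$;

-- One call per input row; all result sets are concatenated into one table.
SELECT t.id, l.name, l.id_out
FROM (VALUES (10), (20)) AS t(id)
CROSS JOIN LATERAL get_objects(t.id) AS l
ORDER BY t.id, l.id_out;
--  id |   name   | id_out
-- ----+----------+--------
--  10 | obj_10_1 |      1
--  10 | obj_10_2 |      2
--  20 | obj_20_1 |      1
--  20 | obj_20_2 |      2
```

CROSS JOIN LATERAL drops input rows for which the function returns nothing; LEFT JOIN LATERAL ... ON true would keep them with NULLs instead.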

PostgreSQL reusing computed result as input to other select computations

Is there any way we can take a computed result inside the select clause and insert it into another computation inside the select clause?
For example this is what I want to have but can't so far:
select trim(leading 'https://www.amazon.com' from url) as trimmedURL,
substring(trimmedURL, from position('/' in trimmedURL) for position ('html' in trimmedURL))....
As you can see, I have used trimmedURL three times inside the substring function. I know how to do that naively by copy/pasting trim(leading 'https://www.amazon.com' from url) into the substring function.
Is there any way to avoid that and not create really large function calls, given that the first computed value might appear many times inside other functions? This would improve code readability and usability.
You could use a lateral join and place the computed fields in the lateral query. The lateral fields are then accessible from the main query.
See the Postgres documentation for lateral joins.
For example:
SELECT
trimmedUrl
, SUBSTRING(trimmedURL,10,20) url_part
FROM mytable
LEFT JOIN LATERAL (SELECT trim(leading 'https://www.amazon.com' from url) as trimmedURL) trmd
ON TRUE
Also, note that PostgreSQL ignores case in the names of columns, tables, etc. unless they are quoted.
Here's a self-contained example:
WITH x(col) AS (Values ('abc://cdf/def'), ('abc://xyz/pqr'))
SELECT x.col, SUBSTRING(y.col2 from position('/' in y.col2)) reusing_computation
FROM x
LEFT JOIN LATERAL (SELECT trim(leading 'abc://' from col) col2) y ON TRUE

Using EXCEPT and flagging column differences

What I'm looking to do is select data from a postgres table that does not appear in another. Both tables have identical columns, apart from the use of boolean instead of Varchar(1), but the issue is that the data in those columns does not match up.
I know I can do this with a SELECT EXCEPT SELECT statement, which I have implemented and is working.
What I would like to do is find a method to flag the columns that do not match up. As an idea, I have thought to append a character to the end of the data in the fields that do not match.
For example if the updateflag is different in one table to the other, I would be returned '* f' instead of 'f'
SELECT id, number, "updateflag" from dbc.person
EXCEPT
SELECT id, number, "updateflag"::bool from dbg.person;
Should I join the two tables together after executing this statement, to identify the differences in what is returned?
I have tried to research methods to implement this but have not found anything on the topic.
I prefer a full outer join for this
select *
from dbc.person p1
full join dbg.person p2 on p1.id = p2.id
where p1 is distinct from p2;
The id column is assumed to be the primary key column that "links" the two tables together.
This will only return rows where at least one column is different.
If you want to see the differences, you could use the subtraction operator from the hstore extension:
select hstore(p1) - hstore(p2) as columns_diff_p1,
hstore(p2) - hstore(p1) as columns_diff_p2
from dbc.person p1
full join dbg.person p2 on p1.id = p2.id
where p1 is distinct from p2;
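A self-contained sketch of the same idea follows. It assumes the hstore extension can be installed in the current database, and it replaces the two person tables with small hypothetical CTEs (p1, p2) whose columns share the same types, so the row comparison is well defined.

```sql
-- hstore ships with PostgreSQL's contrib modules but must be enabled per database.
CREATE EXTENSION IF NOT EXISTS hstore;

-- p1 and p2 are made-up stand-ins for dbc.person and dbg.person.
WITH p1(id, number, updateflag) AS (
    VALUES (1, 100, 'f'), (2, 200, 't')
),
p2(id, number, updateflag) AS (
    VALUES (1, 100, 't'), (2, 200, 't')
)
SELECT coalesce(p1.id, p2.id) AS id,
       hstore(p1) - hstore(p2) AS columns_diff_p1,  -- pairs in p1 that p2 lacks
       hstore(p2) - hstore(p1) AS columns_diff_p2   -- pairs in p2 that p1 lacks
FROM p1
FULL JOIN p2 ON p1.id = p2.id
WHERE p1 IS DISTINCT FROM p2;
```

With this sample data only id 1 should come back, with updateflag as the single differing key on each side; id 2 is identical in both sets and is filtered out.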

PostgreSQL CTE records as parameters to function

I have a function that accepts two integers as parameters my_function(input_a, input_b). Is there an easy way to pass the results of a CTE (that returns records of input_a, input_b) into the function?
Should I be looking into writing a custom function with a for loop or is there a better approach?
If the function returns a single record then:
WITH cte AS (SELECT 1 a, 2 b)
SELECT my_function(a, b) FROM cte;
will work. However, if the function is an SRF (set-returning function), then you need a lateral join, so that the function can take its arguments from the tables that precede it in the FROM clause. For a function call in FROM the LATERAL keyword itself is optional, but writing it makes the intent explicit:
WITH cte AS (SELECT 1 a, 2 b)
SELECT * FROM cte, LATERAL my_function(a, b);
The LATERAL will cause PostgreSQL to take each row from the CTE and run "my_function" with the values from that row, returning the results of that function to the overall SELECT statement.
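As a runnable illustration, the sketch below substitutes the built-in generate_series() for my_function, which is an assumption made only so that the query works without defining anything; the CTE values are made up as well.

```sql
-- generate_series(a, b) stands in for the hypothetical my_function(a, b).
WITH cte(a, b) AS (
    VALUES (1, 3), (10, 11)
)
SELECT cte.a, cte.b, s.n
FROM cte, LATERAL generate_series(cte.a, cte.b) AS s(n)
ORDER BY cte.a, s.n;
--  a  | b  | n
-- ----+----+----
--   1 |  3 |  1
--   1 |  3 |  2
--   1 |  3 |  3
--  10 | 11 | 10
--  10 | 11 | 11
```

Each CTE row produces its own set of function rows, and the result is the concatenation of those sets.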

Retrieval of columns from functions that returns table (or setof record)

I keep running into variations of this problem and never remember the workaround, only that "it was so simple, but how did it go?"... Perhaps there are some patterns and a best way to work with each pattern. Let's look at the main one, exemplified by unnest() and ts_stat().
First, the good examples, which cause no problems because unnest() returns only one column:
SELECT * FROM unnest(array[1,2,3]) t(id); -- is ok, the int columns there!
SELECT unnest(array[1,2,3]) AS id; -- is ok, the int column
WITH t AS (SELECT unnest(array[1,2,3]) as id)
SELECT id, unnest(array[4,id]) as x
FROM t; -- more complex, but ok!
Now a function that returns a defined SETOF RECORD,
SELECT * FROM ts_stat('SELECT kx FROM terms where id=2') -- GOOD
-- show all word|ndoc|nentry columns
SELECT ts_stat('SELECT kx FROM terms where id=2') as x -- BAD
-- because lost columns, show only "x" column... but works
-- NOTE: you can imagine any other function, as json_each(), etc.
See the GOOD/BAD comments above. So this is the problem: a SETOF RECORD with more than one column. In the simplest case (unnest above), the solution is to use the function on the "FROM side", as a table; but when the RECORD has multiple fields, the problem arises.
--MAIN EXAMPLE FOR THE DISCUSSION:
WITH t AS (SELECT unnest(array[1,2,3]) as id)
SELECT id, ts_stat('SELECT kx FROM terms where id='||id) as x
FROM t; -- BAD, but works...
In this main example it is not possible to use ts_stat() on the "FROM side". This characterizes the pattern: a function that returns a TABLE or a SETOF RECORD, in a query where we need its columns, but where the function cannot go on the "FROM side".
QUESTION: What is the generic (and most elegant) solution to this pattern? What syntax pattern shows the columns?
NOTE: another problem is that, if you do not remember the exact syntax of the solution, you try things that do not work. In this case you get an error:
WITH t AS (SELECT unnest(array[1,2,3]) as id)
SELECT id, x.word, x.ndoc, x.nentry
FROM (
SELECT t.nsid,
ts_stat('SELECT kx FROM terms where id='||id) as x
FROM t
) s;
SQL PARSER ERROR (PostgreSQL 9.5): no table "x" in the FROM clause.
You should never use a set-returning-function (SRF) in a SELECT list. The main example should be written with an implicit LATERAL JOIN:
SELECT v.id, x.*
FROM (VALUES (1),(2),(3)) v(id)
JOIN ts_stat('SELECT kx FROM terms where id=' || v.id) x ON true;
The lateral join is implicit here because an SRF can refer to columns from relations specified before it in the FROM clause without using the keyword LATERAL. In the example above the SRF ts_stat() makes a lateral reference to the column id of relation v. You can also do this with e.g. sub-queries, but then you have to use the keyword LATERAL explicitly.
Note that while you can use an SRF in a select list, its use is discouraged. You provide the example of unnest(anyarray), which is interesting because there is also the overloaded variant unnest(anyarray, ...) (i.e. unnest multiple arrays in one call), which will throw an error when used in a select list; it can only be used as a row source. The reason why you should not use SRFs in a select list is that there is no obvious solution when using multiple SRFs, each producing a different number of rows.
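The same pattern can be tried without the terms table. The sketch below uses json_each_text(), which the question itself names as a comparable function, over a made-up VALUES list, followed by the multi-array form of unnest() used as a row source; the sample data and aliases are assumptions for illustration.

```sql
-- A multi-column SRF on the FROM side: its columns are directly addressable.
SELECT v.id, e.key, e.value
FROM (VALUES (1, '{"a": 1, "b": 2}'::json),
             (2, '{"c": 3}'::json)) AS v(id, doc)
CROSS JOIN LATERAL json_each_text(v.doc) AS e
ORDER BY v.id, e.key;
--  id | key | value
-- ----+-----+-------
--   1 | a   | 1
--   1 | b   | 2
--   2 | c   | 3

-- unnest() with several arrays is only allowed in FROM;
-- the shorter array is padded with NULLs.
SELECT * FROM unnest(ARRAY[1, 2, 3], ARRAY['a', 'b']) AS u(n, s);
--  n | s
-- ---+---
--  1 | a
--  2 | b
--  3 |
```

The second query also shows one well-defined answer to the "different number of rows" problem: in FROM, the arrays are zipped position by position and padded with NULLs.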