Elegant way to check item in an array of ranges - postgresql

This is about arrays of ranges (e.g. int4range[]) and range functions.
Suppose a table t defined as
CREATE TABLE t (id serial, r int4range[]);
INSERT INTO t (r) VALUES
('{"[2,5]","[100,200]"}'::int4range[]),
('{"[6,9]","[201,300]"}'::int4range[]);
So, to check whether 7 or 70 falls inside any of the ranges, I need a query like
SELECT * FROM (SELECT *, unnest(r) as ur FROM t) t2 WHERE 7 <@ ur; -- row2
SELECT * FROM (SELECT *, unnest(r) as ur FROM t) t2 WHERE 70 <@ ur; -- empty
Is there a less ugly way to write this kind of query?
NOTE: I need a generic "template" to build functions that return the table's rows with a simple * query,
CREATE FUNCTION t_where(int4) RETURNS t AS $f$
SELECT id,r -- but need *
FROM ( SELECT *, unnest(r) as ur FROM t) t2
WHERE $1 <@ ur
$f$ LANGUAGE SQL IMMUTABLE;
So the same query becomes SELECT * FROM t_where(7), but I need an elegant way to build it using *, without listing the fields (id, r).

You don't need a subquery; you can use an implicit LATERAL JOIN, which gives you access to t.* in the main SELECT:
SELECT t.*
FROM t, unnest(r) AS ur
WHERE 7 <@ ur;
┌────┬────────────────────────┐
│ id │ r │
├────┼────────────────────────┤
│ 2 │ {"[6,10)","[201,301)"} │
└────┴────────────────────────┘
(1 row)
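One thing to watch with the join form: if a row holds two ranges that both contain the value, that row comes back twice. A hedged sketch of a variant that avoids this uses EXISTS, and it also fits the t_where(int4) function from the question while keeping t.*. It assumes the table t defined in the question; RETURNS SETOF t and STABLE are my choices here, since the function can return several rows and reads a table.

```sql
-- Assumes table t (id serial, r int4range[]) from the question.
-- If the question's t_where(int4) already exists, drop it first:
-- DROP FUNCTION t_where(int4);
CREATE FUNCTION t_where(int4) RETURNS SETOF t AS $f$
    SELECT t.*                      -- no need to list id, r
    FROM t
    WHERE EXISTS (                  -- true if any range in r contains $1
        SELECT 1
        FROM unnest(t.r) AS ur
        WHERE $1 <@ ur
    );
$f$ LANGUAGE sql STABLE;

SELECT * FROM t_where(7);
```

Each row of t is returned at most once, however many of its ranges contain the value.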

Related

PgSQL function returning table and extra data computed in process

In PostgreSQL I run a huge select, and then I want to count its size and apply some extra filters.
Executing it twice sounds dumb,
so I wrapped it in a function,
and then I "cache" its result and return the union of the filtered table and an extra row at the end whose "id" column stores the size:
with q as (select * from myFunc())
select * from q
where q.distance < 400
union all
select count(*) as id, null,null,null
from q
but this doesn't look like a proper solution either...
So the question is: does PostgreSQL have something like a "generator function", or anything else that can solve this properly?
PostgreSQL 13
myFunc, a.k.a. "selectItemsByRootTag":
CREATE OR REPLACE FUNCTION selectItemsByRootTag(
in tag_name VARCHAR(50)
)
RETURNS table(
id BIGINT,
name VARCHAR(50),
description TEXT,
/*info JSON,*/
distance INTEGER
)
AS $$
BEGIN
RETURN QUERY(
WITH RECURSIVE prod AS (
SELECT
tags.name, tags.id, tags.parent_tags
FROM
tags
WHERE tags.name = tag_name
UNION
SELECT c.name, c.id , c.parent_tags
FROM
tags as c
INNER JOIN prod as p
ON c.parent_tags = p.id
)
SELECT
points.id,
points.name,
points.description,
/*points.info,*/
points.distance
from points
left join tags on points.tag_id = tags.id
where tags.name in (select prod.name from prod)
);
END;
$$ LANGUAGE plpgsql;
As a result I would like to see maybe a set of two tables, or a generator function that yields some intermediate result; I am not sure exactly how it should look.

demo
CREATE OR REPLACE FUNCTION pg_temp.selectitemsbyroottag(tag_name text, _distance numeric)
RETURNS TABLE(id bigint, name text, description text, distance numeric, count bigint)
LANGUAGE plpgsql
AS $function$
DECLARE _sql text;
BEGIN
_sql := $p1$WITH RECURSIVE prod AS (
SELECT
tags.name, tags.id, tags.parent_tags
FROM
tags
WHERE tags.name ilike '%$p1$ || tag_name || $p2$%'
UNION
SELECT c.name, c.id , c.parent_tags
FROM
tags as c
INNER JOIN prod as p
ON c.parent_tags = p.id
)
SELECT
points.id,
points.name,
points.description,
points.distance,
count(*) over ()
from points
left join tags on points.tag_id = tags.id
where tags.name in (select prod.name from prod)
and points.distance > $p2$ || _distance
;
raise notice '_sql: %', _sql;
return query execute _sql;
END;
$function$
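Note that the function above splices tag_name and _distance straight into the SQL text, so a quote character in tag_name breaks the statement or opens it to SQL injection. A minimal sketch of the usual remedy, passing values with EXECUTE ... USING, is shown here on inline sample data; the VALUES rows and the variable names are made up for illustration.

```sql
DO $$
DECLARE
    _pattern text := 'te''st';   -- contains a quote on purpose
    _n bigint;
BEGIN
    -- $1 is a bound parameter, not text pasted into the statement
    EXECUTE $q$
        SELECT count(*)
        FROM (VALUES ('te''st_1'), ('other')) AS v(name)
        WHERE v.name ILIKE '%' || $1 || '%'
    $q$
    INTO _n
    USING _pattern;
    RAISE NOTICE 'matching rows: %', _n;
END $$;
```

The same pattern would apply to the function: write $1 and $2 inside the query text and finish with return query execute _sql using tag_name, _distance.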
You can call it in either of the following ways:
select * from pg_temp.selectItemsByRootTag('test',20);
select * from pg_temp.selectItemsByRootTag('test_8',20) with ORDINALITY;
The first call returns the matching rows, each carrying the total number of rows in its count column. The second call returns the same columns plus an incrementing ordinal number for each row.
I also turned the where q.distance < 400 filter into a function input argument.
selectItemsByRootTag('test',20) means points.distance > 20 and tags.name ilike '%test%'.
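Because a window function is evaluated after WHERE, count(*) over () in the function above counts only the rows that pass the distance filter. If the goal is the size of the whole set plus a filtered view of it, as in the question, one possible sketch computes the count in a CTE and filters afterwards. The inline VALUES rows stand in for myFunc() and are made up for illustration.

```sql
WITH q AS (
    SELECT id, distance,
           count(*) OVER () AS total_rows   -- counted over all 3 rows
    FROM (VALUES (1, 100), (2, 250), (3, 500)) AS v(id, distance)
)
SELECT id, distance, total_rows
FROM q
WHERE distance < 400      -- applied after the count was computed
ORDER BY id;
-- returns ids 1 and 2, each with total_rows = 3
```

One caveat of this design: if no row passes the outer filter, nothing is returned, so the total is lost as well.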

Make multiple JSONArray rows into one single row by grouping with some other column in postgresql

I have a view built from a query select * from table, which returns the data below.
I want to group the rows that share the same value in the name column and merge their JSONArray column, as shown below.

One way to do this is to unnest the arrays and then aggregate them back:
select t.id, t.name, jsonb_agg(a.e)
from the_table t
cross join lateral jsonb_array_elements(t.json_array) as a(e)
group by t.id, t.name;
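To try this query without creating a table, here is a small self-contained sketch. The rows in the VALUES list are invented sample data, since the question's data is not shown; the column names follow the query above.

```sql
WITH the_table(id, name, json_array) AS (
    VALUES (1, 'a', '[1, 2]'::jsonb),
           (1, 'a', '[3]'::jsonb),
           (2, 'b', '["x"]'::jsonb)
)
SELECT t.id, t.name, jsonb_agg(a.e) AS merged
FROM the_table t
CROSS JOIN LATERAL jsonb_array_elements(t.json_array) AS a(e)  -- one row per array element
GROUP BY t.id, t.name
ORDER BY t.id;
```

The two rows for (1, 'a') collapse into a single row whose array holds all three elements. The order of elements inside the merged array is not guaranteed unless jsonb_agg is given its own ORDER BY.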
If you do that a lot, a custom aggregate makes this a bit easier to use (but probably not faster):
create function jsonb_array_combine(p_one jsonb, p_two jsonb)
returns jsonb
as
$$
select jsonb_agg(e)
from (
select e
from jsonb_array_elements(p_one) as o(e)
union all
select e
from jsonb_array_elements(p_two) as t(e)
) t
$$
language sql
immutable;
create aggregate jsonb_array_agg(jsonb)
(
SFUNC = jsonb_array_combine(jsonb, jsonb),
STYPE = jsonb
);
Then you can use it like this:
select t.id, t.name, jsonb_array_agg(t.json_array)
from the_table t
group by t.id, t.name;

Subquery select values and save into enum in postgresql

I'm trying to remove a value from an existing postgresql enum datatype. My research tells me that I need to remake the enum data type excluding the value I do not want. My idea for accomplishing this is to get all values from the existing data type via
select
e.enumlabel as enum_value
from pg_type t
join pg_enum e on t.oid = e.enumtypid
join pg_catalog.pg_namespace n ON n.oid = t.typnamespace
where t.typname = 'alert_level' and e.enumlabel <> 'value i want to exclude'
create type alert_type2 as enum ('ABOVE QUERY HERE') ::alert_type
and then take the values I want and somehow insert them into a new data type.
So my question: Is it possible to get values from a subquery and create an enum data type from the query results? Is there a better way to do this?
I've also tried something like this, but I can't figure out how to exclude values from the select enum_range.
create type alert_type2 as enum (select enum_range(NULL::alert_type)::text)

You have to execute a dynamic query inside a function or DO block. Example:
create type fruits as enum ('apple', 'pear', 'banana');
do $$
begin
execute format(
'create type fruits_without_pear as enum (%s)',
(
select array_to_string(array_agg(quote_literal(e)), ',')
from unnest(enum_range(null::fruits)) e
where e <> 'pear'
)
);
end $$;
select enum_range(null::fruits_without_pear);
enum_range
----------------
{apple,banana}
(1 row)
Read about:
Executing Dynamic Commands
DO command with example
format() and quote_literal() functions
array_agg()
unnest() and array_to_string().
Notes about the query used.
select enum_range(null::fruits)
enum_range
---------------------
{apple,pear,banana}
(1 row)
We want to transform the result to the string 'apple','banana' to use it as a parameter in format().
First, remove 'pear' from the list:
select e
from unnest(enum_range(null::fruits)) e
where e <> 'pear';
e
--------
apple
banana
(2 rows)
Next, quote the elements and aggregate them into an array:
select array_agg(quote_literal(e))
from unnest(enum_range(null::fruits)) e
where e <> 'pear'
array_agg
--------------------
{'apple','banana'}
(1 row)
Finally, transform the array to a string:
select array_to_string(array_agg(quote_literal(e)), ',')
from unnest(enum_range(null::fruits)) e
where e <> 'pear'
array_to_string
------------------
'apple','banana'
(1 row)
However, we could have done it a little more simply ;)
select string_agg(quote_literal(e), ',')
from unnest(enum_range(null::fruits)) e
where e <> 'pear'
string_agg
------------------
'apple','banana'
(1 row)
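Putting that simplification to use, the DO block from the start of this answer could be written with string_agg() directly. This is a sketch that assumes the fruits type created above; fruits_no_pear is a made-up name chosen so it does not clash with fruits_without_pear, and the ORDER BY e is my addition to make the label order explicit (enum values sort in their declared order).

```sql
DO $$
BEGIN
    EXECUTE format(
        'CREATE TYPE fruits_no_pear AS ENUM (%s)',
        (
            -- builds the text 'apple','banana' in one aggregate call
            SELECT string_agg(quote_literal(e), ',' ORDER BY e)
            FROM unnest(enum_range(NULL::fruits)) AS e
            WHERE e <> 'pear'
        )
    );
END $$;

SELECT enum_range(NULL::fruits_no_pear);
```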

Join 2 sets based on default order

How do I join 2 sets of records solely based on the default order?
So if I have a table x(col(1,2,3,4,5,6,7)) and another table z(col(a,b,c,d,e,f,g))
it will return
c1 c2
-- --
1 a
2 b
3 c
4 d
5 e
6 f
7 g
Actually, I wanted to join a pair of one dimensional arrays from parameters and treat them like columns from a table.
Sample code:
CREATE OR REPLACE FUNCTION "Test"(timestamp without time zone[],
timestamp without time zone[])
RETURNS refcursor AS
$BODY$
DECLARE
curr refcursor;
BEGIN
OPEN curr FOR
SELECT DISTINCT "Start" AS x, "End" AS y, COUNT("A"."id")
FROM UNNEST($1) "Start"
INNER JOIN
(
SELECT "End", ROW_NUMBER() OVER(ORDER BY ("End")) rn
FROM UNNEST($2) "End" ORDER BY ("End")
) "End" ON ROW_NUMBER() OVER(ORDER BY ("Start")) = "End".rn
LEFT JOIN "A" ON ("A"."date" BETWEEN x AND y)
GROUP BY 1,2
ORDER BY "Start";
return curr;
END
$BODY$

Now, to answer the real question that was revealed in comments, which appears to be something like:
Given two arrays 'a' and 'b', how do I pair up their elements so I can get the element pairs as column aliases in a query?
There are a couple of ways to tackle this:
If and only if the arrays are of equal length, use multiple unnest functions in the SELECT clause (a deprecated approach that should only be used for backward compatibility);
Use generate_subscripts to loop over the arrays;
Use generate_series over subqueries against array_lower and array_upper to emulate generate_subscripts if you need to support versions too old to have generate_subscripts;
Rely on the order that unnest returns tuples in and hope, as in my other answer and as shown below. It'll work, but it's not guaranteed to work in future versions.
Use the WITH ORDINALITY functionality added in PostgreSQL 9.4 (see also its first posting) to get a row number for unnest.
Use multiple-array UNNEST, which is SQL-standard; PostgreSQL accepts it in the FROM clause from 9.4 onward, but older versions do not support it.
So, say we have function arraypair with array parameters a and b:
CREATE OR REPLACE FUNCTION arraypair (a integer[], b text[])
RETURNS TABLE (col_a integer, col_b text) AS $$
-- blah code here blah
$$ LANGUAGE whatever IMMUTABLE;
and it's invoked as:
SELECT * FROM arraypair( ARRAY[1,2,3,4,5,6,7], ARRAY['a','b','c','d','e','f','g'] );
possible function definitions would be:
SRF-in-SELECT (deprecated)
CREATE OR REPLACE FUNCTION arraypair (a integer[], b text[])
RETURNS TABLE (col_a integer, col_b text) AS $$
SELECT unnest(a), unnest(b);
$$ LANGUAGE sql IMMUTABLE;
This will produce bizarre and unexpected results if the arrays aren't equal in length; see the documentation on set returning functions and their non-standard use in the SELECT list to learn why, and what exactly happens.
generate_subscripts
This is likely the safest option:
CREATE OR REPLACE FUNCTION arraypair (a integer[], b text[])
RETURNS TABLE (col_a integer, col_b text) AS $$
SELECT
a[i], b[i]
FROM generate_subscripts(CASE WHEN array_length(a,1) >= array_length(b,1) THEN a::text[] ELSE b::text[] END, 1) i;
$$ LANGUAGE sql IMMUTABLE;
If the arrays are of unequal length, as written it'll return null elements for the shorter, so it works like a full outer join. Reverse the sense of the case to get an inner-join like effect. The function assumes the arrays are one-dimensional and that they start at index 1. If an entire array argument is NULL then the function returns NULL.
A more generalized version would be written in PL/PgSQL and would check array_ndims(a) = 1, check array_lower(a, 1) = 1, test for null arrays, etc. I'll leave that to you.
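To see the full-outer-join-like behaviour of the generate_subscripts version without creating the function, here is a small sketch. The literal arrays are made-up stand-ins for the parameters a and b, deliberately of unequal length.

```sql
WITH args(a, b) AS (
    SELECT ARRAY[1, 2, 3], ARRAY['a', 'b']
)
SELECT i, a[i] AS col_a, b[i] AS col_b
FROM args,
     generate_subscripts(
         -- iterate over the subscripts of the longer array
         CASE WHEN array_length(a, 1) >= array_length(b, 1)
              THEN a::text[] ELSE b END, 1) AS i
ORDER BY i;
-- the row for i = 3 has col_a = 3 and col_b = NULL
```

Subscripting past the end of an array yields NULL rather than an error, which is what pads the shorter side.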
Hoping for pair-wise returns:
This isn't guaranteed to work, but does with PostgreSQL's current query executor:
CREATE OR REPLACE FUNCTION arraypair (a integer[], b text[])
RETURNS TABLE (col_a integer, col_b text) AS $$
WITH
rn_c1(rn, col) AS (
SELECT row_number() OVER (), c1.col
FROM unnest(a) c1(col)
),
rn_c2(rn, col) AS (
SELECT row_number() OVER (), c2.col
FROM unnest(b) c2(col)
)
SELECT
rn_c1.col AS c1,
rn_c2.col AS c2
FROM rn_c1
INNER JOIN rn_c2 ON (rn_c1.rn = rn_c2.rn);
$$ LANGUAGE sql IMMUTABLE;
I would consider using generate_subscripts much safer.
Multi-argument unnest:
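This is the SQL-standard form. It did not work when this answer was first written, because PostgreSQL's unnest did not accept multiple input arrays; PostgreSQL 9.4 and later accept it in the FROM clause: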
SELECT * FROM unnest(a,b);
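On PostgreSQL 9.4 or later, two of the options listed earlier can be written directly. This is a sketch with literal arrays standing in for the parameters a and b: the first query uses multi-argument unnest in the FROM clause, the second pairs elements by the position that WITH ORDINALITY reports.

```sql
-- Multi-argument unnest: only allowed in FROM; shorter arrays are padded with NULLs.
SELECT col_a, col_b
FROM unnest(ARRAY[1, 2, 3], ARRAY['a', 'b', 'c']) AS u(col_a, col_b);

-- WITH ORDINALITY: n is the 1-based position of each element.
SELECT x.elem AS col_a, y.elem AS col_b
FROM unnest(ARRAY[1, 2, 3]) WITH ORDINALITY AS x(elem, n)
JOIN unnest(ARRAY['a', 'b', 'c']) WITH ORDINALITY AS y(elem, n) USING (n)
ORDER BY n;
```

The second form is an inner join, so with arrays of unequal length the unmatched tail is dropped; a FULL JOIN would keep it.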

select x.c1, z.c2
from
x
inner join
(
select
c2,
row_number() over(order by c2) rn
from z
order by c2
) z on x.c1 = z.rn
order by x.c1
If x.c1 is not 1,2,3..., you can number x the same way it was done for z and join on the two row numbers.
The middle order by is not necessary, as pointed out by Erwin. I tested it like this:
create table t (i integer);
insert into t
select ceil(random() * 100000)
from generate_series(1, 100000);
select
i,
row_number() over(order by i) rn
from t
;
And i comes out ordered. Before running this simple test, I thought the rows could be numbered in any order.
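For the case where x.c1 is not already 1,2,3..., a sketch of numbering both sides and joining on the row number could look like this. The inline x and z rows are made-up sample data standing in for the real tables.

```sql
WITH x(c1) AS (VALUES (10), (20), (30)),
     z(c2) AS (VALUES ('a'), ('b'), ('c'))
SELECT xn.c1, zn.c2
FROM (SELECT c1, row_number() OVER (ORDER BY c1) AS rn FROM x) AS xn
JOIN (SELECT c2, row_number() OVER (ORDER BY c2) AS rn FROM z) AS zn
     USING (rn)            -- pair the n-th row of x with the n-th row of z
ORDER BY rn;
-- pairs 10 with a, 20 with b, 30 with c
```

Note that this pairs rows by their sorted position, which is only the "default order" if the data happens to be stored sorted.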

By "default order" it sounds like you probably mean the order in which the rows are returned by select * from tablename without an ORDER BY.
If so, this ordering is undefined. The database can return rows in any order that it feels like. You'll find that if you UPDATE a row, it probably moves to a different position in the table.
If you're stuck in a situation where you assumed tables had an order and they don't, you can as a recovery option add a row number based on the on-disk ordering of the tuples within the table:
select row_number() OVER (ORDER BY ctid), *
from the_table
order by ctid
If the output looks right, I recommend that you CREATE TABLE a new table with an extra field, then do an INSERT INTO ... SELECT to insert the data ordered by ctid, then ALTER TABLE ... RENAME the tables and finally fix any foreign key references so they point to the new table.
ctid can be changed by autovacuum, UPDATE, CLUSTER, etc, so it is not something you should ever be using in applications. I'm using it here only because it sounds like you don't have any real ordering or identifier key.
If you need to pair up rows based on their on-disk ordering (an unreliable and unsafe thing to do as noted above), you could per this SQLFiddle try:
WITH
rn_c1(rn, col) AS (
SELECT row_number() OVER (ORDER BY ctid), c1.col
FROM c1
),
rn_c2(rn, col) AS (
SELECT row_number() OVER (ORDER BY ctid), c2.col
FROM c2
)
SELECT
rn_c1.col AS c1,
rn_c2.col AS c2
FROM rn_c1
INNER JOIN rn_c2 ON (rn_c1.rn = rn_c2.rn);
but never rely on this in a production app. If you're really stuck you can use this with CREATE TABLE AS to construct a new table that you can start with when you're working on recovering data from a DB that lacks a required key, but that's about it.
The same approach given above might work with an empty window clause () instead of (ORDER BY ctid) when using sets that lack a ctid, like interim results from functions. It's even less safe then though, and should be a matter of last resort only.
(See also this newer related answer: https://stackoverflow.com/a/17762282/398670)

How to work around the "Recursive CTE member can refer itself only in FROM clause" requirement?

I'm trying to run a graph search to find all nodes accessible from a starting point, like so:
with recursive
nodes_traversed as (
select START_NODE ID
from START_POSITION
union all
select ed.DST_NODE
from EDGES ed
join nodes_traversed NT
on (NT.ID = ed.START_NODE)
and (ed.DST_NODE not in (select ID from nodes_traversed))
)
select distinct * from nodes_traversed
Unfortunately, when I try to run that, I get an error:
Recursive CTE member (nodes_traversed) can refer itself only in FROM clause.
That "not in select" clause is important to the recursive expression, though, as it provides the ending point. (Without it, you get infinite recursion.) Using generation counting, like in the accepted answer to this question, would not help, since this is a highly cyclic graph.
Is there any way to work around this without having to create a stored proc that does it iteratively?

Here is my solution, which uses a global temporary table. I have limited the recursion by level and by the nodes already stored in the temporary table.
I am not sure how it will perform on a large data set.
create procedure get_nodes (
START_NODE integer)
returns (
NODE_ID integer)
as
declare variable C1 integer;
declare variable C2 integer;
begin
/**
create global temporary table id_list(
id integer
);
create index id_list_idx1 ON id_list (id);
*/
delete from id_list;
while ( 1 = 1 ) do
begin
select count(distinct id) from id_list into :c1;
insert into id_list
select id from
(
with recursive nodes_traversed as (
select :START_NODE AS ID , 0 as Lv
from RDB$DATABASE
union all
select ed.DST_NODE , Lv+1
from edges ed
join nodes_traversed NT
on
(NT.ID = ed.START_NODE)
and nt.Lv < 5 -- Max recursion level
and nt.id not in (select id from id_list)
)
select distinct id from nodes_traversed);
select count(distinct id) from id_list into :c2;
if (c1 = c2) then break;
end
for select distinct id from id_list into :node_id do
begin
suspend ;
end
end
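The question's syntax and the procedure above are Firebird. For comparison only, and on the assumption that PostgreSQL is an option: PostgreSQL's WITH RECURSIVE accepts UNION instead of UNION ALL, which discards rows that were already produced, so the traversal stops by itself on a cyclic graph. The edge list below is invented sample data, with node 1 as the starting point.

```sql
WITH RECURSIVE edges(start_node, dst_node) AS (
    VALUES (1, 2), (2, 3), (3, 1), (3, 4)    -- 1 -> 2 -> 3 -> 1 is a cycle
),
nodes_traversed(id) AS (
    SELECT 1                                 -- starting node
    UNION                                    -- not UNION ALL: duplicates are dropped
    SELECT ed.dst_node
    FROM edges ed
    JOIN nodes_traversed nt ON nt.id = ed.start_node
)
SELECT id FROM nodes_traversed ORDER BY id;
-- returns 1, 2, 3, 4
```

This does not carry over to Firebird as written, so it is an alternative to consider rather than a fix for the procedure.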