In PostgreSQL, how can I execute a recursive query with "dynamic" clauses?
I'm using a recursive query because my data is hierarchical: an object can have children, which can themselves have children, and so on.
My goal is to find the last children of a searched object. My recursive query uses a specific value in the WHERE clause ('A' in this example, the searched object). These values are stored in another table (called my_table here). Yet another table, called my_table_filiation here, stores every relation between objects (A to B, A to C, A to D, D to E, D to F, ...). I need to repeat the whole recursive query for each distinct value of my_table.cadastral_reference (A, B, C, D, E, F).
In other words, how can I dynamically change a clause in a recursive query and run it for every distinct value of another table?
The tables look like this:
CREATE TABLE IF NOT EXISTS my_table
(
id integer NOT NULL DEFAULT nextval('my_table_id_seq'::regclass),
cadastral_reference character varying(14) COLLATE pg_catalog."default",
filiation character varying COLLATE pg_catalog."default",
CONSTRAINT my_table_pkey PRIMARY KEY (id)
)
CREATE TABLE IF NOT EXISTS my_table_filiation
(
id integer NOT NULL DEFAULT nextval('my_table_filiation_id_seq'::regclass),
mother character varying(14) COLLATE pg_catalog."default",
daughter character varying(14) COLLATE pg_catalog."default",
CONSTRAINT my_table_filiation_pkey PRIMARY KEY (id)
)
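For reference, here are hypothetical sample rows matching the relations described above (ids come from the sequences):
-- Illustrative data only: the parcels A..F and the filiation relations A->B, A->C, A->D, D->E, D->F
INSERT INTO my_table (cadastral_reference) VALUES ('A'), ('B'), ('C'), ('D'), ('E'), ('F');
INSERT INTO my_table_filiation (mother, daughter) VALUES
('A', 'B'), ('A', 'C'), ('A', 'D'), ('D', 'E'), ('D', 'F');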
The recursive query looks like this:
WITH RECURSIVE q_filiation (mother, daughter) AS (
SELECT mother, daughter
FROM my_table_filiation
WHERE mother = 'A'
UNION ALL
SELECT p.mother, p.daughter
FROM q_filiation f, my_table_filiation p
WHERE p.mother = f.daughter
)
SELECT
array_agg(DISTINCT mother) AS mother,
array_agg(daughter) AS last_daughter
FROM q_filiation
Actual result:
mother | last_daughter
--------------------
{A} | {E,F}
Desired results (the WHERE clause repeated for every distinct value of my_table.cadastral_reference):
mother | last_daughter
--------------------
{A} | {E,F}
{B} | {}
{C} | {}
{D} | {E,F}
Finally, I created a function from my previous query:
CREATE OR REPLACE FUNCTION my_function_filiation(
research_filiation character varying,
OUT mother character varying,
OUT daughter character varying)
RETURNS SETOF record
LANGUAGE 'plpgsql'
COST 100
VOLATILE PARALLEL UNSAFE
ROWS 1000
AS $BODY$
DECLARE
rec record;
sql text;
BEGIN
sql := '
WITH RECURSIVE q_filiation (mother, daughter) AS (
SELECT mother, daughter
FROM my_table_filiation
WHERE mother = ' || quote_literal(research_filiation) || '
UNION ALL
SELECT p.mother, p.daughter
FROM q_filiation f, my_table_filiation p
WHERE p.mother = f.daughter
)
SELECT
array_agg(DISTINCT mother) AS mother,
array_agg(daughter) AS last_daughter
FROM q_filiation
';
FOR rec IN EXECUTE sql
LOOP
mother := rec.mother;
daughter := rec.daughter;
RETURN NEXT;
END LOOP;
END;
$BODY$;
Now I can use it in queries:
SELECT
my_function_filiation(p.cadastral_reference)
FROM my_table p;
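Since the function returns SETOF record with OUT parameters, calling it in the FROM clause gives the two output columns separately (a sketch, requiring PostgreSQL 9.3+ for the lateral reference):
SELECT p.cadastral_reference, f.mother, f.daughter
FROM my_table p,
     LATERAL my_function_filiation(p.cadastral_reference) AS f;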
If anyone knows a better solution, in method or syntax, feel free to contribute.
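One set-based alternative worth considering (a sketch, assuming the tables above and PostgreSQL 9.4+ for the FILTER clause): seed the recursive CTE with every distinct cadastral_reference at once and carry it along as a root column, so a single query replaces the per-value dynamic SQL. Any extra filtering that defines the "last" daughters can then be applied per root in the same way as in the original query.
WITH RECURSIVE q_filiation (root, daughter) AS (
    -- seed: every distinct cadastral reference with its direct daughters (NULL when it has none)
    SELECT t.cadastral_reference, f.daughter
    FROM (SELECT DISTINCT cadastral_reference FROM my_table) t
    LEFT JOIN my_table_filiation f ON f.mother = t.cadastral_reference
    UNION ALL
    -- walk down the hierarchy, keeping the starting value as root
    SELECT q.root, p.daughter
    FROM q_filiation q
    JOIN my_table_filiation p ON p.mother = q.daughter
)
SELECT root AS mother,
       coalesce(array_agg(daughter) FILTER (WHERE daughter IS NOT NULL), '{}') AS descendants
FROM q_filiation
GROUP BY root
ORDER BY root;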
INSERT INTO table_name (col_name) VALUES ('😂');
SELECT * FROM table_name WHERE col_name = '🍖';
In my opinion, no row should be returned by the second query, but the 😂 row is returned.
The table uses utf8mb4 with collation utf8mb4_unicode_ci.
Is this related to the ci (case-insensitive) part of the collation? I would like to keep it case-insensitive.
SELECT '😂' = '🍖' COLLATE utf8mb4_unicode_ci,
'😂' = '🍖' COLLATE utf8mb4_unicode_520_ci;
Yields 1 and 0.
That is, utf8mb4_unicode_ci treats Emoji as equal, but utf8mb4_unicode_520_ci treats them as different.
So, change the collation of col_name to utf8mb4_unicode_520_ci.
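A minimal sketch of that change, assuming col_name is a VARCHAR(255) (adjust to the real column definition):
-- Change just this column's collation (the full column definition must be restated)
ALTER TABLE table_name
  MODIFY col_name VARCHAR(255)
  CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_520_ci;

-- Or convert every string column of the table at once
ALTER TABLE table_name
  CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_520_ci;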
I have the following table:
CREATE TABLE public_bodies
("id" int, "name" varchar(46))
;
INSERT INTO public_bodies
("id", "name")
VALUES
(1, 'Ytre Helgeland District Psychiatric Centre'),
(2, 'Åfjord Municipality'),
(3, 'Østfold Hospital')
;
I'd like to run this query:
SELECT public_bodies.id, public_bodies.name AS display_name
FROM public_bodies
ORDER BY display_name COLLATE "en_US";
But I get this error:
ERROR: column "display_name" does not exist
LINE 3: ORDER BY display_name COLLATE "en_US";
^
Ordering by the table name works fine:
SELECT public_bodies.id, public_bodies.name AS display_name
FROM public_bodies
ORDER BY public_bodies.name COLLATE "en_US";
-- id | display_name
-- ----+--------------------------------------------
-- 2 | Åfjord Municipality
-- 3 | Østfold Hospital
-- 1 | Ytre Helgeland District Psychiatric Centre
Ordering on the alias works okay too:
SELECT public_bodies.id, public_bodies.name AS display_name
FROM public_bodies
ORDER BY display_name;
-- id | display_name
-- ----+--------------------------------------------
-- 2 | Åfjord Municipality
-- 3 | Østfold Hospital
-- 1 | Ytre Helgeland District Psychiatric Centre
Applying the COLLATE before assigning the alias works, but I don't understand why this is different from collating after the ORDER BY.
SELECT public_bodies.id, public_bodies.name COLLATE "en_US" AS display_name
FROM public_bodies
ORDER BY display_name;
-- id | display_name
-- ----+--------------------------------------------
-- 2 | Åfjord Municipality
-- 3 | Østfold Hospital
-- 1 | Ytre Helgeland District Psychiatric Centre
Postgres version:
SELECT version();
version
-------------------------------------------------------------------------------------------------------------
PostgreSQL 9.1.12 on x86_64-unknown-linux-gnu, compiled by gcc (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3, 64-bit
I get the same results on SQL Fiddle (Postgres 9.3).
Why can't Postgres collate on the aliased field?
It's the way the language is defined. COLLATE clauses apply to expressions, and this case doesn't qualify.
By "expression", I mean some collection of operators, functions, variable identifiers, literals, etc., combined to produce an output value. In other words, the general class of value-producing "things" which are allowed to appear as a function argument, as a SELECT field definition, in a VALUES list, and so on.
A COLLATE clause may be attached to an expression, and an expression may appear in an ORDER BY list, but it's not the only thing allowed in an ORDER BY list; you can also include the names or positions of output columns, but these are treated as a distinct case by the parser.
The reason they need to be treated differently is that the query's output field identifiers are not in scope while evaluating expressions; this is why something like ORDER BY display_name || 'x' comes back with column "display_name" does not exist. To work around this, bare field names in the ORDER BY list are compared against the output list before expression evaluation is even attempted, but as a consequence, nothing more complex than a bare field name is accepted in this context (and that includes an attached COLLATE clause).
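A practical consequence: if you want to keep collating on the alias, you can turn it into a real column by wrapping the query in a subselect; in the outer query display_name is an ordinary column reference, hence a valid expression (a sketch):
SELECT id, display_name
FROM (
  SELECT public_bodies.id, public_bodies.name AS display_name
  FROM public_bodies
) AS s
ORDER BY display_name COLLATE "en_US";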
How can I add a COLLATE clause to the following query?
The collation I need is COLLATE Latin1_General_CS_AS.
I only want to count St_Text values when they match exactly (case- and accent-sensitive).
SELECT St_Text, count(*) as Counter
FROM tblSTBackup
GROUP BY St_Text
HAVING count(*) > 1
I have tried to add the collation to the GROUP BY.
I think this is how it should look:
SELECT St_Text COLLATE SQL_Latin1_General_CP1_CS_AS, count(*) as Counter
FROM tblSTBackup
GROUP BY St_Text COLLATE SQL_Latin1_General_CP1_CS_AS
HAVING count(*) > 1
SQL Fiddle Demo
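As a quick sanity check (a sketch with a hypothetical throwaway table and made-up rows), a case-sensitive collation keeps 'ABC' and 'abc' in separate groups, so only the exact duplicates survive the HAVING:
-- #demo is an illustrative temp table, not part of the original schema
CREATE TABLE #demo (St_Text varchar(50));
INSERT INTO #demo (St_Text) VALUES ('ABC'), ('ABC'), ('abc');

SELECT St_Text COLLATE SQL_Latin1_General_CP1_CS_AS AS St_Text, COUNT(*) AS Counter
FROM #demo
GROUP BY St_Text COLLATE SQL_Latin1_General_CP1_CS_AS
HAVING COUNT(*) > 1;
-- Returns only ('ABC', 2); 'abc' falls into its own group and is filtered out by HAVING.
-- Without the COLLATE override, a case-insensitive default collation would lump all three rows together.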
We have a small data warehouse in a PostgreSQL database and I have to document all the tables.
My idea was to add a comment to every column and table, using a pipe "|" separator to pack several attributes into one comment. Then I can use the information schema and the string_to_array function to pull the documentation out and feed it to any reporting software to create the desired output.
select
ordinal_position,
column_name,
data_type,
character_maximum_length,
numeric_precision,
numeric_scale,
is_nullable,
column_default,
(string_to_array(descr.description,'|'))[1] as cs_name,
(string_to_array(descr.description,'|'))[2] as cs_description,
(string_to_array(descr.description,'|'))[3] as en_name,
(string_to_array(descr.description,'|'))[4] as en_description,
(string_to_array(descr.description,'|'))[5] as other
from
information_schema.columns columns
join pg_catalog.pg_class klass on (columns.table_name = klass.relname and klass.relkind = 'r')
left join pg_catalog.pg_description descr on (descr.objoid = klass.oid and descr.objsubid = columns.ordinal_position)
where
columns.table_schema = 'data_warehouse'
order by
columns.ordinal_position;
Is this a good idea, or is there a better approach?
Unless you must include descriptions of the system tables, I wouldn't try to shoehorn your descriptions into pg_catalog.pg_description. Make your own table. That way you get to keep the columns as columns, and not have to use clunky string functions.
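A minimal sketch of such a standalone table (the names here are illustrative, not taken from the question):
-- One row per documented column; each attribute stays a real column
CREATE TABLE dw_column_doc (
    table_schema   text NOT NULL,
    table_name     text NOT NULL,
    column_name    text NOT NULL,
    cs_name        text,
    cs_description text,
    en_name        text,
    en_description text,
    other          text,
    PRIMARY KEY (table_schema, table_name, column_name)
);

-- The report is then a plain join against information_schema.columns
SELECT c.ordinal_position, c.column_name, c.data_type,
       d.cs_name, d.cs_description, d.en_name, d.en_description
FROM information_schema.columns c
LEFT JOIN dw_column_doc d
       ON d.table_schema = c.table_schema
      AND d.table_name   = c.table_name
      AND d.column_name  = c.column_name
WHERE c.table_schema = 'data_warehouse'
ORDER BY c.table_name, c.ordinal_position;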
Alternatively, consider adding specially formatted comments to your master schema file, along the lines of javadoc. Then write a tool to extract those comments and create a document. That way the comments stay close to the thing they're commenting, and you don't have to mess with the database at all to produce the report. For example:
--* Used for authentication.
create table users
(
--* standard Rails-friendly primary key. Also an example of
--* a long comment placed before the item, rather than on the
--* same line.
id serial primary key,
name text not null, --* Real name (hopefully)
login text not null, --* Name used for authentication
...
);
Your documentation tool reads the file, looks for the --* comments, figures out what comments go with what things, and produces some kind of report, e.g.:
table users: Used for authentication
id: standard Rails-friendly primary key. Also an example of a
long comment placed before the item, rather than on the same
line.
name: Real name
login: Name used for authentication
You might note that with appropriate comments, the master schema file itself is a pretty good report in its own right, and that perhaps nothing else is needed.
If anyone is interested, here is what I used for the initial load of my small documentation project. The documentation lives in two tables: one describing tables and one describing columns and constraints. I'd appreciate any feedback.
/* -- Initial Load - Tables */
drop table dw_description_table cascade;
create table dw_description_table (
table_description_key serial primary key,
physical_full_name character varying,
physical_schema_name character varying,
physical_table_name character varying,
Table_Type character varying, -- Fact Dimension ETL Transformation
Logical_Name_CS character varying,
Description_CS character varying,
Logical_Name_EN character varying,
Description_EN character varying,
ToDo character varying,
Table_Load_Type character varying, --Manually TruncateLoad AddNewRows
Known_Exclusions character varying,
Table_Clover_Script character varying
);
insert into dw_description_table (physical_full_name, physical_schema_name, physical_table_name) (
select
table_schema || '.' || table_name as physical_full_name,
table_schema,
table_name
from
information_schema.tables
where
table_name like 'dw%' or table_name like 'etl%'
);
/* -- Initial Load - Columns */
CREATE TABLE dw_description_column (
column_description_key serial,
table_description_key bigint,
physical_full_name text,
physical_schema_name character varying,
physical_table_name character varying,
physical_column_name character varying,
ordinal_position character varying,
column_default character varying,
is_nullable character varying,
data_type character varying,
logical_name_cs character varying,
description_cs character varying,
logical_name_en character varying,
description_en character varying,
derived_rule character varying,
todo character varying,
pk_name character varying,
fk_name character varying,
foreign_table_name character varying,
foreign_column_name character varying,
is_primary_key boolean,
is_foreign_key boolean,
CONSTRAINT dw_description_column_pkey PRIMARY KEY (column_description_key ),
CONSTRAINT fk_dw_description_table_key FOREIGN KEY (table_description_key)
REFERENCES dw_description_table (table_description_key) MATCH SIMPLE
ON UPDATE NO ACTION ON DELETE NO ACTION
);
insert into dw_description_column (
table_description_key ,
physical_full_name ,
physical_schema_name ,
physical_table_name ,
physical_column_name ,
ordinal_position ,
column_default ,
is_nullable ,
data_type ,
logical_name_cs ,
description_cs ,
logical_name_en ,
description_en ,
derived_rule ,
todo ,
pk_name ,
fk_name ,
foreign_table_name ,
foreign_column_name ,
is_primary_key ,
is_foreign_key )
(
with
dw_constraints as (
SELECT
tc.constraint_name,
tc.constraint_schema || '.' || tc.table_name || '.' || kcu.column_name as physical_full_name,
tc.constraint_schema,
tc.table_name,
kcu.column_name,
ccu.table_name AS foreign_table_name,
ccu.column_name AS foreign_column_name,
TC.constraint_type
FROM
information_schema.table_constraints AS tc
JOIN information_schema.key_column_usage AS kcu ON (tc.constraint_name = kcu.constraint_name and tc.table_name = kcu.table_name)
JOIN information_schema.constraint_column_usage AS ccu ON ccu.constraint_name = tc.constraint_name
WHERE
constraint_type in ('PRIMARY KEY','FOREIGN KEY')
AND tc.constraint_schema = 'bizdata'
and (tc.table_name like 'dw%' or tc.table_name like 'etl%')
group by
tc.constraint_name,
tc.constraint_schema,
tc.table_name,
kcu.column_name,
ccu.table_name ,
ccu.column_name,
TC.constraint_type
)
select
dwdt.table_description_key,
col.table_schema || '.' || col.table_name || '.' || col.column_name as physical_full_name,
col.table_schema as physical_schema_name,
col.table_name as physical_table_name,
col.column_name as physical_column_name,
col.ordinal_position,
col.column_default,
col.is_nullable,
col.data_type,
null as Logical_Name_CS ,
null as Description_CS ,
null as Logical_Name_EN,
null as Description_EN ,
null as Derived_Rule ,
null as ToDo,
dwc1.constraint_name pk_name,
dwc2.constraint_name as fk_name,
dwc2.foreign_table_name,
dwc2.foreign_column_name,
case when dwc1.constraint_name is not null then true else false end as is_primary_key,
case when dwc2.constraint_name is not null then true else false end as is_foreign_key
from
information_schema.columns col
join dw_description_table dwdt on (col.table_schema || '.' || col.table_name = dwdt.physical_full_name )
left join dw_constraints dwc1 on ((col.table_schema || '.' || col.table_name || '.' || col.column_name) = dwc1.physical_full_name and dwc1.constraint_type = 'PRIMARY KEY')
left join dw_constraints dwc2 on ((col.table_schema || '.' || col.table_name || '.' || col.column_name) = dwc2.physical_full_name and dwc2.constraint_type = 'FOREIGN KEY')
where
col.table_name like 'dw%' or col.table_name like 'etl%'
);
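For the reporting side, a simple join of the two description tables already gives a per-table column listing (a sketch, assuming the definitions above):
SELECT t.physical_table_name,
       c.physical_column_name,
       c.data_type,
       c.is_nullable,
       c.description_en,
       c.is_primary_key,
       c.is_foreign_key
FROM dw_description_table t
JOIN dw_description_column c USING (table_description_key)
ORDER BY t.physical_table_name, c.ordinal_position::int;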