tsearch2 add resultset to index - postgresql

How can I add a result set (more than one entry) to a tsvector? I use Postgres 8.3.
I have an m-n relationship and I'd like to have all values from one column of the n-side in the tsvector of the m-side.
The statement below works if I limit the subselect to one row, but not without the limit.
UPDATE mytable
SET mytsvector=to_tsvector('english',
coalesce(column_a, '') ||' '||
coalesce((SELECT item FROM other_table WHERE id = other_id LIMIT 1), '')
)
ERROR: more than one row returned by a subquery used as an expression

Under Postgres 8.3 you first have to create an aggregate function that builds an array from a select.
CREATE AGGREGATE array_accum (
sfunc = array_append,
basetype = anyelement,
stype = anyarray,
initcond = '{}'
);
Since 8.4 there is the built-in function array_agg().
The whole statement looks like this:
UPDATE mytable
SET mytsvector = to_tsvector('english',
    coalesce(column_a, '') || ' ' ||
    coalesce(
        -- correlate on the row being updated; all matching items collapse into one string
        (SELECT array_to_string(array_accum(o.item), ' ')
         FROM other_table o
         WHERE o.id = mytable.other_id), '')
);
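On Postgres 8.4 or later the custom aggregate is not needed. The following is a minimal sketch of the same update using the built-in array_agg(), assuming the table and column names from the question (mytable with id, column_a, other_id and mytsvector; other_table with id and item). The alias m is only there to make the correlation explicit.

```sql
-- Sketch for Postgres 8.4+: array_agg() replaces the custom array_accum aggregate.
-- Table and column names are taken from the question and are assumptions here.
UPDATE mytable AS m
SET mytsvector = to_tsvector('english',
    coalesce(m.column_a, '') || ' ' ||
    coalesce(
        -- one string per updated row, built from all matching items
        (SELECT array_to_string(array_agg(o.item), ' ')
         FROM other_table AS o
         WHERE o.id = m.other_id), '')
);
```

From Postgres 9.0 on, string_agg(o.item, ' ') can stand in for the array_to_string(array_agg(...)) pair.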

Related

Postgresql dynamic dataset aggregation

I'm trying to aggregate multiple datasets, and I need to be able to define dynamic rules for how the data is aggregated on a per-id basis. Below is a quick approach I wrote up that seems to do what I intended. Is there a more performant (and preferably safer) way of doing this? tbl1...n will be larger than agg_rule. I'm running Postgres 13.0; if there are new features coming in 14 that would help, that could also be of interest. I am only interested in coalescing and summing as shown below, if that simplifies the problem.
CREATE TABLE tbl1 AS
SELECT 1 id
, 1 seq
, 1 val;
CREATE TABLE tbl2 AS
SELECT 1 id
, 1 seq
, 1 val;
CREATE TABLE tbl3 AS
SELECT 1 id
, 1 seq
, 1 val;
CREATE TABLE agg_rule AS
SELECT 1 id
, 'coalesce(tbl1.val, tbl2.val + tbl3.val)' expr;
CREATE OR REPLACE FUNCTION eval(_id INTEGER, _seq INTEGER)
RETURNS INTEGER
LANGUAGE plpgsql
AS $$
DECLARE _res INTEGER;
BEGIN
EXECUTE 'SELECT ' || (
SELECT expr
FROM agg_rule
WHERE id = _id
) || '
FROM tbl1
FULL JOIN tbl2 USING (id, seq)
FULL JOIN tbl3 USING (id, seq)
WHERE id = ' || _id || '
AND seq = ' || _seq || ';' INTO _res;
RETURN _res;
END
$$;

Redshift - Cannot load query results into table - leader node issue

My goal is to load query results into a table to store an ordered list of column names for a given table. (Then I will stuff all these column names into a single column using the listagg function, which I will pass to dynamic SQL.) The reason I cannot load this into a table is that system table queries run on the leader node, yet this query does not execute on the leader node, and there is no way to force it to. Any ideas how to get this to execute successfully?
create table temp_columns
as
select
cast(t1.columnname as varchar) as columname
--,
--cast(t2.ordinal_position as integer) as ordinal_position
FROM
(
SELECT
cast(schemaname as varchar) as schemaname,
cast(tablename as varchar) as tablename,
cast("column" as varchar) as columnname
FROM PG_TABLE_DEF
WHERE
schemaname = 'schema1'
and tablename = 'table1'
)
t1
join information_schema.columns t2
on t1.schemaname = t2.table_schema
and t1.tablename = t2.table_name
and t1.columnname = t2.column_name
WHERE
t1.schemaname = 'schema1'
and t1.tablename = 'table1'
order by t2.ordinal_position;
We got around this by creating the table first and then doing an INSERT INTO, something like:
create table public.temp_columns
(
columname varchar(255),
ordinal_position int
);
insert into public.temp_columns
select
cast(t1.columnname as varchar) as columname,
cast(t2.ordinal_position as integer) as ordinal_position
FROM
(
SELECT
cast(schemaname as varchar) as schemaname,
cast(tablename as varchar) as tablename,
cast("column" as varchar) as columnname
FROM PG_TABLE_DEF
WHERE
schemaname = 'schema1'
and tablename = 'table1'
)
t1
join information_schema.columns t2
on t1.schemaname = t2.table_schema
and t1.tablename = t2.table_name
and t1.columnname = t2.column_name
WHERE
t1.schemaname = 'schema1'
and t1.tablename = 'table1'
order by t2.ordinal_position;

How to list MAX(id) of all tables given the db schema name?

I am looking for a pgsql query to pull the last PK for all tables given the db schema.
Need this for my db migration work.
You can do this with a variation of a dynamic row count query:
with pk_list as (
select tbl_ns.nspname as table_schema,
tbl.relname as table_name,
cons.conname as pk_name,
col.attname as pk_column
from pg_class tbl
join pg_constraint cons on tbl.oid = cons.conrelid and cons.contype = 'p'
join pg_namespace tbl_ns on tbl_ns.oid = tbl.relnamespace
join pg_attribute col on col.attrelid = tbl.oid and col.attnum = cons.conkey[1]
join pg_type typ on typ.oid = col.atttypid
where tbl.relkind = 'r'
and cardinality(cons.conkey) = 1 -- only single column primary keys
and tbl_ns.nspname not in ('pg_catalog', 'information_schema')
and typ.typname in ('int2','int4','int8','varchar','numeric','float4','float8','date','timestamp','timestamptz')
and has_table_privilege(format('%I.%I', tbl_ns.nspname, tbl.relname), 'select')
), maxvals as (
select table_schema, table_name, pk_column,
(xpath('/row/max/text()',
query_to_xml(format('select max(%I) from %I.%I', pk_column, table_schema, table_name), true, true, ''))
)[1]::text as max_val
from pk_list
)
select table_schema,
table_name,
pk_column,
max_val
from maxvals;
The first CTE (pk_list) retrieves the name of the primary key column for each "user" table (that is, each table that is not a system table).
The second CTE (maxvals) then builds a select statement that retrieves the max value for each PK column from the first CTE and runs that query using query_to_xml(). The xpath() function is then used to parse the XML and return the max value as text (so it's possible to mix numbers and varchars).
The final select then simply displays the result.
The above has the following restrictions:
Only single-column primary keys are considered
It only deals with data types that support using max() on them (e.g. UUID columns are not included)
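To see the query_to_xml() / xpath() step from the second CTE in isolation, here is a small standalone sketch. It assumes a Postgres build with XML (libxml) support; generate_series() stands in for a real table, and the alias t(x) is made up for the illustration.

```sql
-- Standalone illustration of the maxvals step.
-- generate_series(1, 5) is a stand-in for a real table; t(x) is a hypothetical alias.
select (xpath('/row/max/text()',
              query_to_xml('select max(x) from generate_series(1, 5) as t(x)',
                           true, true, '')
       ))[1]::text as max_val;
```

The inner query returns a single row whose only column is named max, so the XML contains one <row> element with a <max> child, and the expression should yield the text value 5.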

Postgres find all rows in database tables matching criteria on a given column

I am trying to write subqueries that search all tables for a column named id. Since multiple tables have an id column, I also want to add the condition id = 3119093.
My attempt was:
Select *
from information_schema.tables
where id = '3119093' and id IN (
Select table_name
from information_schema.columns
where column_name = 'id' );
This didn't work so I tried:
Select *
from information_schema.tables
where table_name IN (
Select table_name
from information_schema.columns
where column_name = 'id' and 'id' IN (
Select * from table_name where 'id' = 3119093));
This isn't the right way either. Any help would be appreciated. Thanks!
A harder attempt is:
CREATE OR REPLACE FUNCTION search_columns(
needle text,
haystack_tables name[] default '{}',
haystack_schema name[] default '{public}'
)
RETURNS table(schemaname text, tablename text, columnname text, rowctid text)
AS $$
begin
FOR schemaname,tablename,columnname IN
SELECT c.table_schema,c.table_name,c.column_name
FROM information_schema.columns c
JOIN information_schema.tables t ON
(t.table_name=c.table_name AND t.table_schema=c.table_schema)
WHERE (c.table_name=ANY(haystack_tables) OR haystack_tables='{}')
AND c.table_schema=ANY(haystack_schema)
AND t.table_type='BASE TABLE'
--AND c.column_name = "id"
LOOP
EXECUTE format('SELECT ctid FROM %I.%I WHERE cast(%I as text) like %L',
schemaname,
tablename,
columnname,
needle
) INTO rowctid;
IF rowctid is not null THEN
RETURN NEXT;
END IF;
END LOOP;
END;
$$ language plpgsql;
select * from search_columns('%3119093%'::varchar,'{}'::name[]) ;
The only problem is that this code displays just the table name and column name. I then have to manually run
Select * from table_name where id = 3119093
using the table name I got from the code above.
I want to return the matching rows automatically, but I don't know how to get the table name automatically.
I took the time to make it work for you.
For starters, some information on what is going on inside the code.
Explanation
1. The function takes two input arguments: column name and column value.
2. It requires a previously created type, which it returns a set of.
3. The first loop identifies tables that have a column with the name given as the input argument.
4. It then forms a query that aggregates all rows matching the input condition in every table found in step 3, with the comparison based on ILIKE, as per your example.
5. The function enters the second loop only if at least one row in the currently visited table matches the condition (then the array is not null).
6. The second loop unnests the array of matching rows and puts every element in the function output with RETURN NEXT rec.
Notes
Searching with LIKE is inefficient - I suggest adding another input argument, "column type", and restricting the lookup by it with a join to the pg_catalog.pg_type table.
The second loop is there so that if more than 1 row is found for a particular table, then every row gets returned.
If you are looking for something else, like you need key-value pairs, not just the values, then you need to extend the function. You could for example build json format from rows.
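As a rough illustration of the last note, row_to_json() (available from Postgres 9.2) returns each row as key-value pairs instead of a bare composite value. The sketch below is standalone; the temporary table demo_rows is hypothetical and only mirrors the shape of the test table used further down.

```sql
-- Hypothetical demo table, used only for this illustration.
CREATE TEMP TABLE demo_rows (col1 int, id int);
INSERT INTO demo_rows (col1, id)
VALUES (1, 5), (1, 33), (1, 25);

-- row_to_json() keeps the column names alongside the values.
SELECT row_to_json(d)::text AS entirerow
FROM demo_rows AS d
WHERE d.id::text ILIKE '%5%';
```

This returns the rows with id 5 and id 25 as {"col1":1,"id":5} and {"col1":1,"id":25}. Inside the function, the same idea would mean aggregating row_to_json(tablename)::text instead of row(tablename.*)::text.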
Now, to the code.
Test case
CREATE TABLE tbl1 (col1 int, id int); -- does contain values
CREATE TABLE tbl2 (col1 int, col2 int); -- doesn't contain column "id"
CREATE TABLE tbl3 (id int, col5 int); -- doesn't contain values
INSERT INTO tbl1 (col1, id)
VALUES (1, 5), (1, 33), (1, 25);
Table stores data:
postgres=# select * From tbl1;
 col1 | id
------+----
    1 |  5
    1 | 33
    1 | 25
(3 rows)
Creating type
CREATE TYPE sometype AS ( schemaname text, tablename text, colname text, entirerow text );
Function code
CREATE OR REPLACE FUNCTION search_tables_for_column (
v_column_name text
, v_column_value text
)
RETURNS SETOF sometype
LANGUAGE plpgsql
STABLE
AS
$$
DECLARE
rec sometype%rowtype;
v_row_array text[];
rec2 record;
arr_el text;
BEGIN
FOR rec IN
SELECT
nam.nspname AS schemaname
, cls.relname AS tablename
, att.attname AS colname
, null::text AS entirerow
FROM
pg_attribute att
JOIN pg_class cls ON att.attrelid = cls.oid
JOIN pg_namespace nam ON cls.relnamespace = nam.oid
WHERE
cls.relkind = 'r'
AND att.attname = v_column_name
LOOP
EXECUTE format('SELECT ARRAY_AGG(row(tablename.*)::text) FROM %I.%I AS tablename WHERE %I::text ILIKE %s',
rec.schemaname, rec.tablename, rec.colname, quote_literal(concat('%',v_column_value,'%'))) INTO v_row_array;
IF v_row_array is not null THEN
FOR rec2 IN
SELECT unnest(v_row_array) AS one_row
LOOP
rec.entirerow := rec2.one_row;
RETURN NEXT rec;
END LOOP;
END IF;
END LOOP;
END
$$;
Example call & output
postgres=# select * from search_tables_for_column('id','5');
 schemaname | tablename | colname | entirerow
------------+-----------+---------+-----------
 public     | tbl1      | id      | (1,5)
 public     | tbl1      | id      | (1,25)
(2 rows)

Using query to set the column type in PostgreSQL

After the excellent answer by Alexandre GUIDET, I attempted to run the following query:
create table egg (id (SELECT
pg_catalog.format_type(a.atttypid, a.atttypmod) as Datatype
FROM
pg_catalog.pg_attribute a
WHERE
a.attnum > 0
AND NOT a.attisdropped
AND a.attrelid = (
SELECT c.oid
FROM pg_catalog.pg_class c
LEFT JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace
WHERE c.relname ~ '^(TABLENAME)$'
AND pg_catalog.pg_table_is_visible(c.oid)
)
and a.attname = 'COLUMNNAME'));
PostgreSQL, however, complains about incorrect syntax. Specifically it says that I cannot write: create table egg (id (SELECT.
Are there any workarounds? Can't I convert the result of a query to text and reuse it as a query?
There is a much simpler way to do that.
SELECT pg_typeof(col)::text FROM tbl LIMIT 1
The only precondition is that the template table holds at least one row. See the manual on pg_typeof().
As Milen wrote, you need to EXECUTE dynamic DDL statements like this.
A much simpler DO statement:
DO $$BEGIN
   EXECUTE 'CREATE TABLE egg (id '
        || (SELECT pg_typeof(col)::text FROM tbl LIMIT 1) || ')';
END$$;
Or, if you are not sure the template table has any rows:
DO $$BEGIN
   EXECUTE (
      SELECT format('CREATE TABLE egg (id %s)'
                  , format_type(atttypid, atttypmod))
      FROM   pg_catalog.pg_attribute
      WHERE  attrelid = 'tbl'::regclass  -- name of template table
      AND    attname = 'col'             -- name of template column
      AND    attnum > 0 AND NOT attisdropped
   );
END$$;
The conditions attnum > 0 AND NOT attisdropped seem redundant, since you look for a specific column anyway.
format() requires Postgres 9.1+.
Related:
How to check if a table exists in a given schema
You can either convert that query to a function or (if you have Postgres 9.0 or later) to an anonymous code block:
DO $$DECLARE the_type text;
BEGIN
SELECT ... AS datatype INTO the_type FROM <the rest of your query>;
EXECUTE 'create table egg ( id ' || the_type || <the rest of your create table statement>;
END$$;
You can have either a table definition or a query, but not both. Maybe you're thinking of the SELECT INTO command.
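For reference, a minimal sketch of that route: both CREATE TABLE AS and SELECT INTO derive the new column's type from the query, so no type name has to be spliced into the DDL. The names tbl and col are the template table and column from the answers above; the numeric(10,2) type and the table name egg2 are made up so the sketch is self-contained.

```sql
-- Hypothetical template table; in practice tbl and col already exist.
CREATE TEMP TABLE tbl (col numeric(10,2));

-- Create egg with a single column "id" whose type is copied from tbl.col,
-- without copying any rows.
CREATE TABLE egg AS
SELECT col AS id
FROM tbl
WITH NO DATA;

-- The SELECT INTO form of the same idea (egg2 is a hypothetical second name).
SELECT col AS id
INTO egg2
FROM tbl
WHERE false;
```

Unlike the dynamic DDL variants, this only works when the whole table definition can be expressed as a query result. Constraints, defaults and further hand-written column definitions would still have to be added afterwards.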