Postgres: Parsing query for database objects

I've got a procedure that runs for far too long.
It parses queries into arrays and then searches for intersections with objects in the database.
In the first temp table I split every statement into an array.
The second combines all possible database objects into arrays.
In the third I look for intersections between the arrays.
Right now the procedure analyzes a 3-month time period.
I don't want to reduce that period, though I may have to if nobody can suggest something better.
I've read that a GIN index on the array may help. What do you think?
Or maybe you have done this another way?
Database: PostgreSQL 11
CREATE TEMP TABLE temp_array_data
AS
(
SELECT id,
pid,
regexp_split_to_array(query, '\s+') as query
FROM t_stat_session
WHERE query_start::DATE BETWEEN pdtQueryDateFrom AND pdtQueryDateTo
AND duration IS NOT NULL
);
CREATE TEMP TABLE temp_sys_objects_data
AS
(
SELECT string_to_array(schemaname || '.' || tablename, '.') object_arr1,
string_to_array(schemaname || '.' || tablename, ',') object_arr2,
schemaname,
tablename object_name,
'T' AS object_type
FROM pg_catalog.pg_tables
UNION ALL
SELECT string_to_array(schemaname || '.' || viewname, '.') object_arr1,
string_to_array(schemaname || '.' || viewname, ',') object_arr2,
schemaname,
viewname object_name,
'VW' AS object_type
FROM pg_catalog.pg_views
UNION ALL
SELECT string_to_array(schemaname || '.' || matviewname, '.') object_arr1,
string_to_array(schemaname || '.' || matviewname, ',') object_arr2,
schemaname,
matviewname object_name,
'MVW' AS object_type
FROM pg_catalog.pg_matviews
);
CREATE TEMP TABLE temp_data_for_final
AS
(
SELECT id,
pid,
schemaname,
object_name,
object_type,
1 cnt
FROM temp_array_data adta,
temp_sys_objects_data
WHERE (ARRAY [object_arr1] && ARRAY [query] OR ARRAY [object_arr2] <@ ARRAY [query])
);
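For reference, a GIN index on the array column could look like the sketch below. This is an illustration rather than a tested fix, and the && / <@ predicates would need to be applied directly to the indexed column (e.g. object_arr1 && query) for the planner to consider the index.
-- Minimal sketch, assuming the temp tables above: a GIN index on the text[]
-- column so that && / <@ predicates can use an index scan instead of
-- comparing every row pair.
CREATE INDEX temp_array_data_query_gin
    ON temp_array_data USING GIN (query);
ANALYZE temp_array_data;   -- refresh statistics on the temp table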

Related

Compare two tables and find the missing column using left join

I want to compare the two tables employees and employees_a and find the missing columns in the table employees_a.
select a.Column_name
From User_tab_columns a
LEFT JOIN User_tab_columns b
ON upper(a.table_name) = upper(b.table_name)||'_A'
AND a.column_name = b.column_name
Where upper(a.Table_name) = 'EMPLOYEES'
AND upper(b.table_name) = 'EMPLOYEES_A'
AND b.column_name is NULL
;
But this doesn't seem to be working. No rows are returned.
My employees table has the below columns
emp_name
emp_id
base_location
department
current_location
salary
manager
employees_a table has below columns
emp_name
emp_id
base_location
department
current_location
I want to find the remaining two columns and add them to the employees_a table.
I have more than 50 tables like this, for which I need to compare, find the missing columns, and add those columns to their respective "_a" tables.
Missing columns? Why not use the MINUS set operator? It seems way simpler, e.g.
select column_name from user_tab_columns where table_name = 'EMP_1'
minus
select column_name from user_tab_columns where table_name = 'EMP_2'
Firstly, check whether the user_tab_columns view contains the columns of your tables (in my case user_tab_columns was empty and I had to use all_tab_columns):
select a.Column_name
From User_tab_columns a
Where upper(a.Table_name) = 'EMPLOYEES'
Secondly, remove the line AND upper(b.table_name) = 'EMPLOYEES_A', because upper(b.table_name) is NULL when no matching column is found. You already have b.table_name in the JOIN condition of the SELECT.
select a.Column_name
From User_tab_columns a
LEFT JOIN User_tab_columns b
ON upper(a.table_name) = upper(b.table_name)||'_A'
AND a.column_name = b.column_name
Where upper(a.Table_name) = 'EMPLOYEES'
AND b.column_name is NULL
You do not need any joins and can use:
select 'ALTER TABLE EMPLOYEES_A ADD "'
|| Column_name || '" '
|| CASE MAX(data_type)
WHEN 'NUMBER'
THEN 'NUMBER(' || MAX(data_precision) || ',' || MAX(data_scale) || ')'
WHEN 'VARCHAR2'
THEN 'VARCHAR2(' || MAX(data_length) || ')'
END
AS sql
From User_tab_columns
Where Table_name IN ('EMPLOYEES', 'EMPLOYEES_A')
GROUP BY COLUMN_NAME
HAVING COUNT(CASE table_name WHEN 'EMPLOYEES' THEN 1 END) = 1
AND COUNT(CASE table_name WHEN 'EMPLOYEES_A' THEN 1 END) = 0;
Or, for multiple tables:
select 'ALTER TABLE ' || MAX(table_name) || '_A ADD "'
|| Column_name || '" '
|| CASE MAX(data_type)
WHEN 'NUMBER'
THEN 'NUMBER(' || MAX(data_precision) || ',' || MAX(data_scale) || ')'
WHEN 'VARCHAR2'
THEN 'VARCHAR2(' || MAX(data_length) || ')'
END
AS sql
From User_tab_columns
Where Table_name IN ('EMPLOYEES', 'EMPLOYEES_A', 'SOMETHING', 'SOMETHING_A')
GROUP BY
CASE
WHEN table_name LIKE '%_A'
THEN SUBSTR(table_name, 1, LENGTH(table_name) - 2)
ELSE table_name
END,
COLUMN_NAME
HAVING COUNT(CASE WHEN table_name NOT LIKE '%_A' THEN 1 END) = 1
AND COUNT(CASE WHEN table_name LIKE '%_A' THEN 1 END) = 0;

How to perform a database-wide full text search in PostgreSQL?

I have a PostgreSQL database with about 500 tables. Each table has a unique ID column named id and a user ID column named user_id. I would like to perform a full-text search of all varchar columns across all of these tables for a particular user. I do this today with ElasticSearch but I'd like to simplify my architecture. I understand that I can add full text search columns to all of the tables with things like stored generated columns and then add indices for fast full text search:
ALTER TABLE pgweb
ADD COLUMN textsearchable_index_col tsvector
GENERATED ALWAYS AS (to_tsvector('english', coalesce(title, '') || ' ' || coalesce(body, ''))) STORED;
CREATE INDEX textsearch_idx ON pgweb USING GIN (textsearchable_index_col);
However, I'm not familiar with how to do cross-table searches efficiently. Maybe a view across all textsearchable_index_col columns? I'd like the result to be something like the table name and id of the matching row. For example:
 table_name | id
------------+-----
 table1     | 492
 table42    |  20
If it matters, I'm using Ruby on Rails as the client with ActiveRecord. I'm using a managed PostgreSQL 13 database at Digital Ocean so I won't be able to install custom psql plugins.
Maybe this is not the answer you are looking for, because I am not sure there is a better approach, but I would first try to automate the process.
I would write two dynamic queries: the first one to create the textsearchable_index_col columns (in each table with at least one varchar column), and the other to create an index on those columns (one index per table).
You could add a textsearchable_index_col column for each "character varying" column instead of a single one concatenating all "character varying" columns, but here I create one textsearchable_index_col column per table, as you propose.
I assume the table schema is "public", but you can use the real one.
-- Create columns textsearchable_index_col:
SELECT 'ALTER TABLE ' || table_schema || '.' || table_name || E' ADD COLUMN textsearchable_index_col tsvector GENERATED ALWAYS AS (to_tsvector(\'english\', coalesce(' ||
string_agg(column_name, E', \'\') || \' \' || coalesce(') || E', \'\'))) STORED;'
FROM information_schema.columns
WHERE table_schema = 'public' AND data_type IN ('character varying')
GROUP BY table_schema, table_name;
-- Create indexes on textsearchable_index_col columns:
SELECT 'CREATE INDEX ' || table_name || '_textsearch_idx ON ' || table_schema || '.' || table_name || ' USING GIN (textsearchable_index_col);'
FROM information_schema.columns
WHERE table_schema = 'public' AND data_type IN ('character varying')
GROUP BY table_schema, table_name;
Then I would use a dynamic query to build a query (using UNION) that searches all of those textsearchable_index_col columns.
You need to replace the question marks with your parameters (user_id and the search text) and remove the trailing "UNION ALL":
SELECT E'SELECT \'' || table_name || E'\' AS table_name, id FROM ' || table_schema || '.' || table_name || E' WHERE user_id = ? AND textsearchable_index_col' || ' @@ to_tsquery(?) UNION ALL'
FROM information_schema.columns
WHERE table_schema = 'public' AND data_type IN ('character varying')
GROUP BY table_schema, table_name;
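For illustration, the generated search query might look roughly like this after substituting the parameters and dropping the trailing UNION ALL (table names and parameter values here are hypothetical):
-- Hypothetical generated output for two tables; 42 and 'foo & bar' stand in
-- for the user_id and search-term parameters.
SELECT 'table1' AS table_name, id FROM public.table1
WHERE user_id = 42 AND textsearchable_index_col @@ to_tsquery('foo & bar')
UNION ALL
SELECT 'table42' AS table_name, id FROM public.table42
WHERE user_id = 42 AND textsearchable_index_col @@ to_tsquery('foo & bar');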

PostgreSQL: Find array length of output from ARRAY_AGG()

How do I count the number of distinct elements in an array created by ARRAY_AGG() in PostgreSQL? Here's a toy example for discussion purposes:
SELECT ARRAY_AGG (first_name || ' ' || last_name) actors
FROM film
I have tried ARRAY_LENGTH(), LENGTH(), etc., like so:
SELECT ARRAY_LENGTH(a.actors)
FROM (SELECT ARRAY_AGG (first_name || ' ' || last_name) actors
FROM film) a;
But I get an error:
function array_length(integer[]) does not exist
Hint: No function matches the given name and argument types. You might need to add explicit type casts.
Position: 208
So I tried (2):
SELECT ARRAY_LENGTH( CAST(COALESCE(a.actors, '0') AS integer) )
FROM (SELECT ARRAY_AGG (first_name || ' ' || last_name) actors
FROM film) a;
but I get the error:
malformed array literal: "0"
Detail: Array value must start with "{" or dimension information.
Position: 119
The function array_length(anyarray, int) requires two arguments: the array and the dimension. For example:
Select array_length(array[1,2,3], 1);
Result:
3
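Applied to the original query, a sketch using the same film subquery and passing 1 as the dimension would be:
-- Count the aggregated elements by asking for the length of dimension 1.
SELECT array_length(a.actors, 1)
FROM (SELECT ARRAY_AGG(first_name || ' ' || last_name) AS actors
      FROM film) a;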
If you are only dealing with a single dimension array, cardinality() is easier to use:
SELECT cardinality(a.actors)
FROM (
SELECT ARRAY_AGG (first_name || ' ' || last_name) actors
FROM film
) a;

Postgres generate sql output using data dictionary tables

All I need is to get SQL query output like this:
ALTER TABLE TABLE_NAME
ADD CONSTRAINT
FOREIGN KEY (COLUMN_NAME)
REFERENCES (PARENT_TABLE_NAME);
I'm running the dynamic query below using data dictionary tables:
SELECT DISTINCT
'ALTER TABLE ' || cs.TABLE_NAME ||
'ADD CONSTRAINT' || rc.CONSTRAINT_NAME ||
'FOREIGN KEY' || c.COLUMN_NAME ||
'REFERENCES' || cs.TABLE_NAME ||
' (' || cs.CONSTRAINT_NAME || ') ' ||
' ON UPDATE ' || rc.UPDATE_RULE ||
' ON DELETE ' || rc.DELETE_RULE
FROM INFORMATION_SCHEMA.REFERENTIAL_CONSTRAINTS RC,
INFORMATION_SCHEMA.TABLE_CONSTRAINTS CS,
INFORMATION_SCHEMA.COLUMNS C
WHERE cs.CONSTRAINT_NAME = rc.CONSTRAINT_NAME
AND cs.TABLE_NAME = c.TABLE_NAME
AND UPPER(cs.TABLE_SCHEMA) = 'SSP2_PCAT';
But even though I'm able to generate the desired output, the concern is that it's not giving the PARENT_TABLE_NAME here; rather, it's repeating the same table_name that follows the ALTER TABLE keywords.
I hope this is clear, as we are using dynamic SQL here, and any help is absolutely appreciated!
Your query is missing a couple of join tables and join conditions. Also, don't forget that a foreign key can be defined on more than one column. Finally, your query is vulnerable to SQL injection via object names.
But it would be much simpler if you used pg_catalog.pg_constraint rather than the information_schema:
SELECT format('ALTER TABLE %s ADD CONSTRAINT %I %s',
conrelid::regclass,
conname,
pg_get_constraintdef(oid))
FROM pg_catalog.pg_constraint
WHERE contype = 'f'
AND upper(connamespace::regnamespace::text) = 'SSP2_PCAT';
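If the goal is to apply the generated statements immediately, one option (a usage sketch, assuming psql is the client) is to end the generator query with \gexec, so that psql executes each returned ALTER TABLE statement:
-- In psql, \gexec runs every row returned by the preceding query as a statement.
SELECT format('ALTER TABLE %s ADD CONSTRAINT %I %s',
              conrelid::regclass, conname, pg_get_constraintdef(oid))
FROM pg_catalog.pg_constraint
WHERE contype = 'f'
  AND upper(connamespace::regnamespace::text) = 'SSP2_PCAT'
\gexec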

How do I find all the NUMERIC columns in a table and do a SUM() on them?

I have a few tables in Netezza, DB2, and PostgreSQL databases that I need to reconcile, and the best approach we have come up with is to do a SUM() across all the NUMERIC table columns in all 3 databases.
Does anyone have a quick and simple way to find all the columns that are NUMERIC, INTEGER, or BIGINT and then run a SUM() on all of them?
I can compare the results manually, but it would help if someone has a way to capture these results in a common table and automatically check the differences in the SUMs.
For DB2 you can use this metadata query, which will help you find the data type of each column:
SELECT
COLUMN_NAME || ' ' || REPLACE(REPLACE(DATA_TYPE,'DECIMAL','NUMERIC'),'CHARACTER','VARCHAR') ||
CASE
WHEN DATA_TYPE = 'TIMESTAMP' THEN ''
ELSE
' (' ||
CASE
WHEN CHARACTER_MAXIMUM_LENGTH IS NOT NULL THEN CAST(CHARACTER_MAXIMUM_LENGTH AS VARCHAR(30))
WHEN NUMERIC_PRECISION IS NOT NULL THEN CAST(NUMERIC_PRECISION AS VARCHAR(30)) ||
CASE
WHEN NUMERIC_SCALE = 0 THEN ''
ELSE ',' || CAST(NUMERIC_SCALE AS VARCHAR(3))
END
ELSE ''
END || ')'
END || ',' "SQLCOL",
COLUMN_NAME,
DATA_TYPE, CHARACTER_MAXIMUM_LENGTH, NUMERIC_PRECISION, NUMERIC_SCALE, ORDINAL_POSITION
FROM SYSIBM.COLUMNS
WHERE TABLE_NAME = 'insert your table name'
AND TABLE_SCHEMA = 'insert your table schema'
ORDER BY ORDINAL_POSITION
For Netezza, I got the following query:
SELECT 0 AS ATTNUM, 'SELECT' AS SQL
UNION
SELECT ATTNUM, 'SUM(' || ATTNAME || ') AS S_' || ATTNAME || ',' AS COLMN
FROM _V_RELATION_COLUMN RC
WHERE NAME = '<table-name>'
AND FORMAT_TYPE= 'NUMERIC'
UNION
SELECT 10000 AS ATTNUM, ' 0 AS FLAG FROM ' || '<table-name>'
ORDER BY ATTNUM
Still looking at how to do this across DB2 and PostgreSQL.
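For the PostgreSQL side, a sketch in the same spirit (the schema and table names below are placeholders) could generate the SUM() query from information_schema.columns:
-- Sketch: build one SELECT that sums every numeric/integer/bigint column
-- of a given table; run the generated statement afterwards.
SELECT 'SELECT '
       || string_agg('SUM(' || quote_ident(column_name) || ') AS s_' || column_name,
                     ', ' ORDER BY ordinal_position)
       || ' FROM ' || quote_ident(table_schema) || '.' || quote_ident(table_name) || ';'
FROM information_schema.columns
WHERE table_schema = 'public'          -- placeholder schema
  AND table_name   = 'your_table'      -- placeholder table
  AND data_type IN ('numeric', 'integer', 'bigint')
GROUP BY table_schema, table_name;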