PostgreSQL: Find array length of output from ARRAY_AGG()

How do I count the number of distinct elements in an array created by ARRAY_AGG() in PostgreSQL? Here's a toy example for discussion purposes:
SELECT ARRAY_AGG (first_name || ' ' || last_name) actors
FROM film
I have tried ARRAY_LENGTH(), LENGTH(), etc., like so:
SELECT ARRAY_LENGTH(a.actors)
FROM (SELECT ARRAY_AGG (first_name || ' ' || last_name) actors
FROM film) a;
But I get an error:
function array_length(integer[]) does not exist
Hint: No function matches the given name and argument types. You might need to add explicit type casts.
Position: 208
So I tried (2):
SELECT ARRAY_LENGTH( CAST(COALESCE(a.actors, '0') AS integer) )
FROM (SELECT ARRAY_AGG (first_name || ' ' || last_name) actors
FROM film) a;
but I get the error:
malformed array literal: "0"
Detail: Array value must start with "{" or dimension information.
Position: 119

The function array_length(anyarray, int) requires two arguments: the array and the dimension to measure. For example:
Select array_length(array[1,2,3], 1);
Result:
3
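Applied to the query from the question, this might look like the sketch below. It assumes the same film table and columns as the question; the actor_count alias is only illustrative.

```sql
-- The second argument (1) selects the first dimension of the array.
SELECT array_length(a.actors, 1) AS actor_count
FROM (
    SELECT ARRAY_AGG(first_name || ' ' || last_name) AS actors
    FROM film
) a;
```

Note that array_length() returns NULL rather than 0 when the array is empty or NULL, for example when film has no rows.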

If you are only dealing with a single-dimension array, cardinality() is easier to use, because it takes the array as its only argument:
SELECT cardinality(a.actors)
FROM (
SELECT ARRAY_AGG (first_name || ' ' || last_name) actors
FROM film
) a;
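Neither array_length() nor cardinality() removes duplicates, so if the goal is the number of distinct names, as the question asks, one option is to deduplicate inside the aggregate. This is a sketch that assumes the same film table and columns as the question; the distinct_actors alias is illustrative.

```sql
-- Option 1: deduplicate while aggregating, then count the array elements.
SELECT cardinality(a.actors) AS distinct_actors
FROM (
    SELECT ARRAY_AGG(DISTINCT first_name || ' ' || last_name) AS actors
    FROM film
) a;

-- Option 2: skip the array and count the distinct values directly.
SELECT COUNT(DISTINCT first_name || ' ' || last_name) AS distinct_actors
FROM film;
```

The two options can differ on edge cases: COUNT(DISTINCT ...) ignores NULL values and returns 0 for an empty table, while ARRAY_AGG keeps a NULL element and returns a NULL array when there are no rows.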

Related

Postgres: Parsing query for database objects

I have a procedure that takes far too long to run.
It parses each query into an array and then looks for intersections with the objects in the database.
In the first temp table, I split every statement into an array.
The second combines all possible database objects into arrays.
In the third, I look for intersections between the arrays.
Right now the procedure analyzes a three-month period.
I don't want to shorten that period, though I may have to if nobody can suggest something better.
I've read that a GIN index on the array column may help. What do you think?
Or have you solved this a different way?
Database: PostgreSQL 11
CREATE TEMP TABLE temp_array_data
AS
(
SELECT id,
pid,
regexp_split_to_array(query, '\s+') as query
FROM t_stat_session
WHERE query_start::DATE BETWEEN pdtQueryDateFrom AND pdtQueryDateTo
AND duration IS NOT NULL
);
CREATE TEMP TABLE temp_sys_objects_data
AS
(
SELECT string_to_array(schemaname || '.' || tablename, '.') object_arr1,
string_to_array(schemaname || '.' || tablename, ',') object_arr2,
schemaname,
tablename object_name,
'T' AS object_type
FROM pg_catalog.pg_tables
UNION ALL
SELECT string_to_array(schemaname || '.' || viewname, '.') object_arr1,
string_to_array(schemaname || '.' || viewname, ',') object_arr2,
schemaname,
viewname object_name,
'VW' AS object_type
FROM pg_catalog.pg_views
UNION ALL
SELECT string_to_array(schemaname || '.' || matviewname, '.') object_arr1,
string_to_array(schemaname || '.' || matviewname, ',') object_arr2,
schemaname,
matviewname object_name,
'MVW' AS object_type
FROM pg_catalog.pg_matviews
);
CREATE TEMP TABLE temp_data_for_final
AS
(
SELECT id,
pid,
schemaname,
object_name,
object_type,
1 cnt
FROM temp_array_data adta,
temp_sys_objects_data
WHERE (ARRAY [object_arr1] && ARRAY [query] OR ARRAY [object_arr2] <@ ARRAY [query])
);
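On the GIN question, here is a rough sketch of what trying it could look like; it is not a measured solution. It assumes the second condition is meant to be a containment test, and the index name is hypothetical. A GIN index on a text[] column supports the &&, @> and <@ operators, but only when the indexed column appears bare in the condition, so the ARRAY [...] wrappers are dropped here.

```sql
-- Hypothetical index name; temp tables can be indexed like regular tables.
CREATE INDEX temp_array_data_query_gin ON temp_array_data USING gin (query);
ANALYZE temp_array_data;

SELECT a.id, a.pid, o.schemaname, o.object_name, o.object_type, 1 AS cnt
FROM temp_sys_objects_data o
JOIN temp_array_data a
  ON a.query && o.object_arr1     -- at least one element in common
  OR a.query @> o.object_arr2;    -- every element of object_arr2 appears in query
```

Whether the planner actually uses the index for this join is worth checking with EXPLAIN (ANALYZE) on realistic data before relying on it.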

Postgres: how to use LIKE on every word of user input, AND-ing the results

In PostgreSQL (9.6), given a variable-length user input of the type 'alice chaplin' or 'alice' or 'alice chaplin meyer' but also 'lic chapl', I would like to search for records that contain 'alice' in column firstname OR column lastname (AND contain 'chaplin' in firstname OR lastname (AND contain 'meyer' in firstname OR lastname)), etc.
I had decided to use ILIKE %searchterm% for the matching, so the query would presumably be along the lines of:
... where
((lastname ILIKE '%' || SEARCHTERM1 || '%') OR (firstname ILIKE '%' || SEARCHTERM1 || '%'))
AND ((lastname ILIKE '%' || SEARCHTERM2 || '%') OR (firstname ILIKE '%' || SEARCHTERM2 || '%'))
AND etc.
After lots of attempts and searching, I haven't found anything that resolves this. As a last resort I'll write a very procedural PL/pgSQL function that loops over a split search string, intersecting the ILIKE results, but there has to be a more idiomatic SQL way of solving such a run-of-the-mill problem.
You can use string_to_array to convert an input string into an array of words. You can then use unnest to convert the array into a (virtual) table, and operate on the words to add '%' before and after. And finally, you can use the ALL comparison using ILIKE ALL (SELECT ...). This ALL will actually be AND-ing the results, as desired.
WITH q AS
(
SELECT 'Alice Chaplin Meyer'::text AS q
)
, words AS
(
SELECT
'%' || word || '%' AS wordish
FROM
q
JOIN LATERAL unnest(string_to_array(q, ' ')) AS a(word) ON true
)
SELECT
*
FROM
t
WHERE
concat_ws(' ', first_name, last_name) ILIKE ALL(SELECT wordish FROM words)
You can check it all at http://rextester.com/LNB38296
References:
string_to_array and unnest
Using ALL
NOTE: This can probably be simplified, but I've preferred a step-by-step approach.
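As one possible simplification, the two CTEs can be folded into the subquery. This sketch assumes the same table t with first_name and last_name columns as the query above; the alias w(word) is illustrative, and the literal stands in for the user input.

```sql
SELECT *
FROM t
WHERE concat_ws(' ', first_name, last_name)
      ILIKE ALL (
          -- one '%word%' pattern per whitespace-separated search term
          SELECT '%' || word || '%'
          FROM unnest(string_to_array('Alice Chaplin Meyer', ' ')) AS w(word)
      );
```

Any % or _ characters in the user input are still treated as LIKE wildcards here, so they may need escaping if the terms should match literally.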

How to add order by in string agg, when two columns are concatenated

SELECT string_agg( distinct a || '-' || b , ',' ORDER BY a,b)
FROM table;
The above SQL gives this error:
ERROR: in an aggregate with DISTINCT, ORDER BY expressions must appear in argument list
From the documentation:
If DISTINCT is specified in addition to an order_by_clause, then all the ORDER BY expressions must match regular arguments of the aggregate; that is, you cannot sort on an expression that is not included in the DISTINCT list.
So try
select string_agg(distinct a || '-' || b, ',' order by a || '-' || b)
from a_table;
or use distinct in a derived table:
select string_agg(a || '-' || b , ',' order by a, b)
from (
select distinct a, b
from a_table
) s;

How do I find all the NUMERIC columns in a table and do a SUM() on them?

I have a few tables in Netezza, DB2 and PostgreSQL databases that I need to reconcile, and the best approach we have come up with is to run SUM() across all the NUMERIC columns of each table in all three databases.
Does anyone have a quick and simple way to find all the columns that are NUMERIC, INTEGER or BIGINT and then run a SUM() on each of them?
I can compare the results manually, but does anyone have a way to capture them in a common table and check the differences in the SUMs automatically?
For DB2, you can query this metadata to find the data type of each column:
SELECT
COLUMN_NAME || ' ' || REPLACE(REPLACE(DATA_TYPE,'DECIMAL','NUMERIC'),'CHARACTER','VARCHAR') ||
CASE
WHEN DATA_TYPE = 'TIMESTAMP' THEN ''
ELSE
' (' ||
CASE
WHEN CHARACTER_MAXIMUM_LENGTH IS NOT NULL THEN CAST(CHARACTER_MAXIMUM_LENGTH AS VARCHAR(30))
WHEN NUMERIC_PRECISION IS NOT NULL THEN CAST(NUMERIC_PRECISION AS VARCHAR(30)) ||
CASE
WHEN NUMERIC_SCALE = 0 THEN ''
ELSE ',' || CAST(NUMERIC_SCALE AS VARCHAR(3))
END
ELSE ''
END || ')'
END || ',' "SQLCOL",
COLUMN_NAME,
DATA_TYPE, CHARACTER_MAXIMUM_LENGTH, NUMERIC_PRECISION, NUMERIC_SCALE, ORDINAL_POSITION
FROM SYSIBM.COLUMNS
WHERE TABLE_NAME = 'insert your table name'
AND TABLE_SCHEMA = 'insert your table schema'
ORDER BY ORDINAL_POSITION
For Netezza, I got the following query:
SELECT 0 AS ATTNUM, 'SELECT' AS SQL
UNION
SELECT ATTNUM, 'SUM(' || ATTNAME || ') AS S_' || ATTNAME || ',' AS COLMN
FROM _V_RELATION_COLUMN RC
WHERE NAME = '<table-name>'
AND FORMAT_TYPE= 'NUMERIC'
UNION
SELECT 10000 AS ATTNUM, ' 0 AS FLAG FROM ' || '<table-name>'
ORDER BY ATTNUM
I'm still looking for a way to do this across DB2 and PostgreSQL.
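For the PostgreSQL side, a similar generator can be sketched against information_schema.columns. The schema name 'public' and table name 'my_table' are placeholders, and the s_ prefix simply mirrors the S_ prefix used in the Netezza query above.

```sql
-- Builds one SELECT statement that sums every numeric-like column of a table.
SELECT 'SELECT '
       || string_agg(
              format('SUM(%I) AS %I', column_name::text, 's_' || column_name::text),
              ', ' ORDER BY ordinal_position)
       || format(' FROM %I.%I', table_schema::text, table_name::text) AS sum_query
FROM information_schema.columns
WHERE table_schema = 'public'      -- placeholder schema
  AND table_name   = 'my_table'    -- placeholder table
  AND data_type IN ('numeric', 'integer', 'bigint')
GROUP BY table_schema, table_name;
```

The query returns the generated statement as text in sum_query; it still has to be run separately, for example by pasting it into a client or with EXECUTE inside a PL/pgSQL block.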

Unable to INSERT between tables using ST_GeomFromText

I'm trying to insert point geometry values and other data from one table to another table.
-- create tables
create table bh_tmp (bh_id integer, bh_name varchar
, easting decimal, northing decimal, ground_mod decimal);
create table bh (name varchar);
SELECT AddGeometryColumn('bh', 'bh_geom', 27700, 'POINT',3);
-- populate bh_tmp
insert into bh_tmp values
(1,'C5',542945.0,180846.0,3.947),
(3,'B24',542850.0,180850.0,4.020),
(4,'B26',543020.0,180850.0,4.020);
-- populate bh from bh_tmp
insert into bh(name, bh_geom) SELECT
bh_name,
CONCAT($$ST_GeomFromText('POINT($$, Easting, ' ', Northing, ' '
, Ground_mOD, $$)', 27700)$$);
FROM bh_tmp;
Gives this error:
ERROR: parse error - invalid geometry
SQL state: XX000
Hint: "ST" <-- parse error at position 2 within geometry
I can't see anything wrong with the ST_GeomFromText string that I've specified. But I can populate table bh if I insert rows 'manually', e.g.:
INSERT INTO bh (name, bh_geom)
VALUES ('C5', ST_GeomFromText('POINT(542945.0 180846.0 3.947)', 27700));
What am I doing wrong?
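First of all, there is a misplaced semicolon after CONCAT(...).
More importantly, you can't concatenate the function name itself into the string: the result is plain text, not a function call. Build only the WKT text and pass it to the function: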
INSERT INTO bh(name, bh_geom)
SELECT bh_name
, ST_GeomFromText('POINT(' || concat_ws(' ', easting, northing, ground_mod) || ')'
, 27700)
FROM bh_tmp;
Or, since you have values already (not text), you could use ST_MakePoint() and ST_SetSRID():
ST_SetSRID(ST_MakePoint(easting, northing, ground_mod), 27700)
Should be faster.
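Put together as a complete statement, that variant might look like the following sketch, which assumes the bh and bh_tmp tables from the question:

```sql
INSERT INTO bh (name, bh_geom)
SELECT bh_name
     , ST_SetSRID(ST_MakePoint(easting, northing, ground_mod), 27700)  -- x, y, z, then SRID
FROM   bh_tmp;
```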
See also: Npgsql parameterized query output incompatible with PostGIS
You're getting that error because the output of the CONCAT function is text, and your bh_geom column is geometry, so you're trying to insert text into geometry. This will work:
INSERT INTO bh(name, bh_geom) SELECT
bh_name,
ST_GeomFromText('POINT('
|| easting|| ' '
|| Northing
|| ' '
|| Ground_mOD
|| ')', 27700)
FROM bh_tmp;