Related
I am looking for a way to concatenate the strings of a field within a group by query. So for example, I have a table:
ID COMPANY_ID EMPLOYEE
1 1 Anna
2 1 Bill
3 2 Carol
4 2 Dave
and I wanted to group by company_id to get something like:
COMPANY_ID EMPLOYEE
1 Anna, Bill
2 Carol, Dave
There is a built-in function in mySQL to do this group_concat
PostgreSQL 9.0 or later:
Modern Postgres (since 2010) has the string_agg(expression, delimiter) function which will do exactly what the asker was looking for:
SELECT company_id, string_agg(employee, ', ')
FROM mytable
GROUP BY company_id;
Postgres 9 also added the ability to specify an ORDER BY clause in any aggregate expression; otherwise you have to order all your results or deal with an undefined order. So you can now write:
SELECT company_id, string_agg(employee, ', ' ORDER BY employee)
FROM mytable
GROUP BY company_id;
PostgreSQL 8.4.x:
PostgreSQL 8.4 (in 2009) introduced the aggregate function array_agg(expression) which collects the values in an array. Then array_to_string() can be used to give the desired result:
SELECT company_id, array_to_string(array_agg(employee), ', ')
FROM mytable
GROUP BY company_id;
PostgreSQL 8.3.x and older:
When this question was originally posed, there was no built-in aggregate function to concatenate strings. The simplest custom implementation (suggested by Vajda Gabo in this mailing list post, among many others) is to use the built-in textcat function (which lies behind the || operator):
CREATE AGGREGATE textcat_all(
basetype = text,
sfunc = textcat,
stype = text,
initcond = ''
);
Here is the CREATE AGGREGATE documentation.
This simply glues all the strings together, with no separator. In order to get a ", " inserted in between them without having it at the end, you might want to make your own concatenation function and substitute it for the "textcat" above. Here is one I put together and tested on 8.3.12:
CREATE FUNCTION commacat(acc text, instr text) RETURNS text AS $$
BEGIN
IF acc IS NULL OR acc = '' THEN
RETURN instr;
ELSE
RETURN acc || ', ' || instr;
END IF;
END;
$$ LANGUAGE plpgsql;
This version will output a comma even if the value in the row is null or empty, so you get output like this:
a, b, c, , e, , g
If you would prefer to remove extra commas to output this:
a, b, c, e, g
Then add an ELSIF check to the function like this:
CREATE FUNCTION commacat_ignore_nulls(acc text, instr text) RETURNS text AS $$
BEGIN
IF acc IS NULL OR acc = '' THEN
RETURN instr;
ELSIF instr IS NULL OR instr = '' THEN
RETURN acc;
ELSE
RETURN acc || ', ' || instr;
END IF;
END;
$$ LANGUAGE plpgsql;
How about using Postgres built-in array functions? At least on 8.4 this works out of the box:
SELECT company_id, array_to_string(array_agg(employee), ',')
FROM mytable
GROUP BY company_id;
As from PostgreSQL 9.0 you can use the aggregate function called string_agg. Your new SQL should look something like this: SELECT company_id, string_agg(employee, ', ')
FROM mytable
GROUP BY company_id;
I claim no credit for the answer because I found it after some searching:
What I didn't know is that PostgreSQL allows you to define your own aggregate functions with CREATE AGGREGATE
This post on the PostgreSQL list shows how trivial it is to create a function to do what's required:
CREATE AGGREGATE textcat_all(
basetype = text,
sfunc = textcat,
stype = text,
initcond = ''
);
SELECT company_id, textcat_all(employee || ', ')
FROM mytable
GROUP BY company_id;
As already mentioned, creating your own aggregate function is the right thing to do. Here is my concatenation aggregate function (you can find details in French):
CREATE OR REPLACE FUNCTION concat2(text, text) RETURNS text AS '
SELECT CASE WHEN $1 IS NULL OR $1 = \'\' THEN $2
WHEN $2 IS NULL OR $2 = \'\' THEN $1
ELSE $1 || \' / \' || $2
END;
'
LANGUAGE SQL;
CREATE AGGREGATE concatenate (
sfunc = concat2,
basetype = text,
stype = text,
initcond = ''
);
And then use it as:
SELECT company_id, concatenate(employee) AS employees FROM ...
This latest announcement list snippet might be of interest if you'll be upgrading to 8.4:
Until 8.4 comes out with a
super-effient native one, you can add
the array_accum() function in the
PostgreSQL documentation for rolling
up any column into an array, which can
then be used by application code, or
combined with array_to_string() to
format it as a list:
http://www.postgresql.org/docs/current/static/xaggr.html
I'd link to the 8.4 development docs but they don't seem to list this feature yet.
Following up on Kev's answer, using the Postgres docs:
First, create an array of the elements, then use the built-in array_to_string function.
CREATE AGGREGATE array_accum (anyelement)
(
sfunc = array_append,
stype = anyarray,
initcond = '{}'
);
select array_to_string(array_accum(name),'|') from table group by id;
Following yet again on the use of a custom aggregate function of string concatenation: you need to remember that the select statement will place rows in any order, so you will need to do a sub select in the from statement with an order by clause, and then an outer select with a group by clause to aggregate the strings, thus:
SELECT custom_aggregate(MY.special_strings)
FROM (SELECT special_strings, grouping_column
FROM a_table
ORDER BY ordering_column) MY
GROUP BY MY.grouping_column
Use STRING_AGG function for PostgreSQL and Google BigQuery SQL:
SELECT company_id, STRING_AGG(employee, ', ')
FROM employees
GROUP BY company_id;
I found this PostgreSQL documentation helpful: http://www.postgresql.org/docs/8.0/interactive/functions-conditional.html.
In my case, I sought plain SQL to concatenate a field with brackets around it, if the field is not empty.
select itemid,
CASE
itemdescription WHEN '' THEN itemname
ELSE itemname || ' (' || itemdescription || ')'
END
from items;
If you are on Amazon Redshift, where string_agg is not supported, try using listagg.
SELECT company_id, listagg(EMPLOYEE, ', ') as employees
FROM EMPLOYEE_table
GROUP BY company_id;
According to version PostgreSQL 9.0 and above you can use the aggregate function called string_agg. Your new SQL should look something like this:
SELECT company_id, string_agg(employee, ', ')
FROM mytable GROUP BY company_id;
You can also use format function. Which can also implicitly take care of type conversion of text, int, etc by itself.
create or replace function concat_return_row_count(tbl_name text, column_name text, value int)
returns integer as $row_count$
declare
total integer;
begin
EXECUTE format('select count(*) from %s WHERE %s = %s', tbl_name, column_name, value) INTO total;
return total;
end;
$row_count$ language plpgsql;
postgres=# select concat_return_row_count('tbl_name','column_name',2); --2 is the value
I'm using Jetbrains Rider and it was a hassle copying the results from above examples to re-execute because it seemed to wrap it all in JSON. This joins them into a single statement that was easier to run
select string_agg('drop table if exists "' || tablename || '" cascade', ';')
from pg_tables where schemaname != $$pg_catalog$$ and tableName like $$rm_%$$
my query in postgresql 9.3 is:
SELECT *
FROM route
WHERE 'my/uri/address' LIKE CONCAT(c_uri, '%')
But in postgresql 8.4 this code not working.
How can i resolve, please?
Thanks in advance!
In PostgreSQL version 8.4 to String concatenation use || operation.
Example:'Post' || 'greSQL'
Result is: PostgreSQL
You could also use WS_CONCAT function but it is not defined in 8.4. You can declare it as the following:
CREATE OR REPLACE FUNCTION concat_ws(
separator text,
VARIADIC names_array text[]
)
RETURNS text
LANGUAGE plpgsql
AS $$
BEGIN
RETURN array_to_string(array_remove(names_array, NULL), ' ');
END;
$$;
This definition works exactly as in 9.x and above: pass the separator text first and then an arbitrary amount of texts to concat. NULL values get filtered out.
Example:
SELECT WS_CONCAT(' ', 'Ford', 'Focus', NULL /* Missing body-style */, '1.8 L')
Result: Ford Focus 1.8 L
Can functions be stored as anonymous functions directly in column as its value?
Let's say I want this function be stored in column.
Example (pseudocode):
Table my_table: pk (int), my_function (func)
func ( x ) { return x * 100 }
And later use it as:
select
t.my_function(some_input) AS output
from
my_table as t
where t.pk = 1999
Function may vary for each pk.
Your title asks something else than your example.
A function has to be created before you can call it. (title)
An expression has to be evaluated. You would need a meta-function for that. (example)
Here are solutions for both:
1. Evaluate expressions dynamically
You have to take into account that the resulting type can vary. I use polymorphic types for that.
CREATE OR REPLACE FUNCTION f1(int)
RETURNS int
LANGUAGE sql IMMUTABLE AS
'SELECT $1 * 100;';
CREATE OR REPLACE FUNCTION f2(text)
RETURNS text
LANGUAGE sql IMMUTABLE AS
$$SELECT $1 || '_foo';$$;
CREATE TABLE my_expr (
expr text PRIMARY KEY
, def text
, rettype regtype
);
INSERT INTO my_expr VALUES
('x', 'f1(3)' , 'int')
, ('y', $$f2('bar')$$, 'text')
, ('z', 'now()' , 'timestamptz')
;
CREATE OR REPLACE FUNCTION f_eval(text, _type anyelement = 'NULL'::text, OUT _result anyelement)
LANGUAGE plpgsql AS
$func$
BEGIN
EXECUTE
'SELECT ' || (SELECT def FROM my_expr WHERE expr = $1)
INTO _result;
END
$func$;
Related:
Refactor a PL/pgSQL function to return the output of various SELECT queries
Call:
SQL is strictly typed, the same result column can only have one data type. For multiple rows with possibly heterogeneous data types, you might settle for type text, as every data type can be cast to and from text:
SELECT *, f_eval(expr) AS result -- default to type text
FROM my_expr;
Or return multplce columns like:
SELECT *
, CASE WHEN rettype = 'text'::regtype THEN f_eval(expr) END AS text_result -- default to type text
, CASE WHEN rettype = 'int'::regtype THEN f_eval(expr, NULL::int) END AS int_result
, CASE WHEN rettype = 'timestamptz'::regtype THEN f_eval(expr, NULL::timestamptz) END AS tstz_result
-- , more?
FROM my_expr;
db<>fiddle here
2. Create and use functions dynamically
It is possible to create functions dynamically and then use them. You cannot do that with plain SQL, however. You will have to use another function to do that or at least an anonymous code block (DO statement), introduced in PostgreSQL 9.0.
It can work like this:
CREATE TABLE my_func (func text PRIMARY KEY, def text);
INSERT INTO my_func VALUES
('f'
, $$CREATE OR REPLACE FUNCTION f(int)
RETURNS int
LANGUAGE sql IMMUTABLE AS
'SELECT $1 * 100;'$$);
CREATE OR REPLACE FUNCTION f_create_func(text)
RETURNS void
LANGUAGE plpgsql AS
$func$
BEGIN
EXECUTE (SELECT def FROM my_func WHERE func = $1);
END
$func$;
Call:
SELECT f_create_func('f');
SELECT f(3);
db<>fiddle here
You may want to drop the function afterwards.
In most cases you should just create the functions instead and be done with it. Use separate schemas if you have problems with multiple versions or privileges.
For more information on the features I used here, see my related answer on dba.stackexchange.com.
I have an array of type bigint, how can I remove the duplicate values in that array?
Ex: array[1234, 5343, 6353, 1234, 1234]
I should get array[1234, 5343, 6353, ...]
I tested out the example SELECT uniq(sort('{1,2,3,2,1}'::int[])) in the postgres manual but it is not working.
I faced the same. But an array in my case is created via array_agg function. And fortunately it allows to aggregate DISTINCT values, like:
array_agg(DISTINCT value)
This works for me.
The sort(int[]) and uniq(int[]) functions are provided by the intarray contrib module.
To enable its use, you must install the module.
If you don't want to use the intarray contrib module, or if you have to remove duplicates from arrays of different type, you have two other ways.
If you have at least PostgreSQL 8.4 you could take advantage of unnest(anyarray) function
SELECT ARRAY(SELECT DISTINCT UNNEST('{1,2,3,2,1}'::int[]) ORDER BY 1);
?column?
----------
{1,2,3}
(1 row)
Alternatively you could create your own function to do this
CREATE OR REPLACE FUNCTION array_sort_unique (ANYARRAY) RETURNS ANYARRAY
LANGUAGE SQL
AS $body$
SELECT ARRAY(
SELECT DISTINCT $1[s.i]
FROM generate_series(array_lower($1,1), array_upper($1,1)) AS s(i)
ORDER BY 1
);
$body$;
Here is a sample invocation:
SELECT array_sort_unique('{1,2,3,2,1}'::int[]);
array_sort_unique
-------------------
{1,2,3}
(1 row)
... Where the statandard libraries (?) for this kind of array_X utility??
Try to search... See some but no standard:
postgres.cz/wiki/Array_based_functions: good reference!
JDBurnZ/postgresql-anyarray, good initiative but needs some collaboration to enhance.
wiki.postgresql.org/Snippets, frustrated initiative, but "offcial wiki", needs some collaboration to enhance.
MADlib: good! .... but it is an elephant, not an "pure SQL snippets lib".
Simplest and faster array_distinct() snippet-lib function
Here the simplest and perhaps faster implementation for array_unique() or array_distinct():
CREATE FUNCTION array_distinct(anyarray) RETURNS anyarray AS $f$
SELECT array_agg(DISTINCT x) FROM unnest($1) t(x);
$f$ LANGUAGE SQL IMMUTABLE;
NOTE: it works as expected with any datatype, except with array of arrays,
SELECT array_distinct( array[3,3,8,2,6,6,2,3,4,1,1,6,2,2,3,99] ),
array_distinct( array['3','3','hello','hello','bye'] ),
array_distinct( array[array[3,3],array[3,3],array[3,3],array[5,6]] );
-- "{1,2,3,4,6,8,99}", "{3,bye,hello}", "{3,5,6}"
the "side effect" is to explode all arrays in a set of elements.
PS: with JSONB arrays works fine,
SELECT array_distinct( array['[3,3]'::JSONB, '[3,3]'::JSONB, '[5,6]'::JSONB] );
-- "{"[3, 3]","[5, 6]"}"
Edit: more complex but useful, a "drop nulls" parameter
CREATE FUNCTION array_distinct(
anyarray, -- input array
boolean DEFAULT false -- flag to ignore nulls
) RETURNS anyarray AS $f$
SELECT array_agg(DISTINCT x)
FROM unnest($1) t(x)
WHERE CASE WHEN $2 THEN x IS NOT NULL ELSE true END;
$f$ LANGUAGE SQL IMMUTABLE;
Using DISTINCT implicitly sorts the array. If the relative order of the array elements needs to be preserved while removing duplicates, the function can be designed like the following: (should work from 9.4 onwards)
CREATE OR REPLACE FUNCTION array_uniq_stable(anyarray) RETURNS anyarray AS
$body$
SELECT
array_agg(distinct_value ORDER BY first_index)
FROM
(SELECT
value AS distinct_value,
min(index) AS first_index
FROM
unnest($1) WITH ORDINALITY AS input(value, index)
GROUP BY
value
) AS unique_input
;
$body$
LANGUAGE 'sql' IMMUTABLE STRICT;
I have assembled a set of stored procedures (functions) to combat PostgreSQL's lack of array handling coined anyarray. These functions are designed to work across any array data-type, not just integers as intarray does: https://www.github.com/JDBurnZ/anyarray
In your case, all you'd really need is anyarray_uniq.sql. Copy & paste the contents of that file into a PostgreSQL query and execute it to add the function. If you need array sorting as well, also add anyarray_sort.sql.
From there, you can peform a simple query as follows:
SELECT ANYARRAY_UNIQ(ARRAY[1234,5343,6353,1234,1234])
Returns something similar to: ARRAY[1234, 6353, 5343]
Or if you require sorting:
SELECT ANYARRAY_SORT(ANYARRAY_UNIQ(ARRAY[1234,5343,6353,1234,1234]))
Return exactly: ARRAY[1234, 5343, 6353]
Here's the "inline" way:
SELECT 1 AS anycolumn, (
SELECT array_agg(c1)
FROM (
SELECT DISTINCT c1
FROM (
SELECT unnest(ARRAY[1234,5343,6353,1234,1234]) AS c1
) AS t1
) AS t2
) AS the_array;
First we create a set from array, then we select only distinct entries, and then aggregate it back into array.
In a single query i did this:
SELECT (select array_agg(distinct val) from ( select unnest(:array_column) as val ) as u ) FROM :your_table;
For people like me who still have to deal with postgres 8.2, this recursive function can eliminate duplicates without altering the sorting of the array
CREATE OR REPLACE FUNCTION my_array_uniq(bigint[])
RETURNS bigint[] AS
$BODY$
DECLARE
n integer;
BEGIN
-- number of elements in the array
n = replace(split_part(array_dims($1),':',2),']','')::int;
IF n > 1 THEN
-- test if the last item belongs to the rest of the array
IF ($1)[1:n-1] #> ($1)[n:n] THEN
-- returns the result of the same function on the rest of the array
return my_array_uniq($1[1:n-1]);
ELSE
-- returns the result of the same function on the rest of the array plus the last element
return my_array_uniq($1[1:n-1]) || $1[n:n];
END IF;
ELSE
-- if array has only one item, returns the array
return $1;
END IF;
END;
$BODY$
LANGUAGE 'plpgsql' VOLATILE;
for exemple :
select my_array_uniq(array[3,3,8,2,6,6,2,3,4,1,1,6,2,2,3,99]);
will give
{3,8,2,6,4,1,99}
I want to write a stored procedure that gets an array as input parameter and sort that array and return the sorted array.
The best way to sort an array of integers is without a doubt to use the intarray extension, which will do it much, much, much faster than any SQL formulation:
CREATE EXTENSION intarray;
SELECT sort( ARRAY[4,3,2,1] );
A function that works for any array type is:
CREATE OR REPLACE FUNCTION array_sort (ANYARRAY)
RETURNS ANYARRAY LANGUAGE SQL
AS $$
SELECT ARRAY(SELECT unnest($1) ORDER BY 1)
$$;
(I've replaced my version with Pavel's slightly faster one after discussion elsewhere).
In PostrgreSQL 8.4 and up you can use:
select array_agg(x) from (select unnest(ARRAY[1,5,3,7,2]) AS x order by x) as _;
But it will not be very fast.
In older Postgres you can implement unnest like this
CREATE OR REPLACE FUNCTION unnest(anyarray)
RETURNS SETOF anyelement AS
$BODY$
SELECT $1[i] FROM
generate_series(array_lower($1,1),
array_upper($1,1)) i;
$BODY$
LANGUAGE 'sql' IMMUTABLE
And array_agg like this:
CREATE AGGREGATE array_agg (
sfunc = array_append,
basetype = anyelement,
stype = anyarray,
initcond = '{}'
);
But it will be even slower.
You can also implement any sorting algorithm in pl/pgsql or any other language you can plug in to postgres.
Just use the function unnest():
SELECT
unnest(ARRAY[1,2]) AS x
ORDER BY
x DESC;
See array functions in the Pg docs.
This worked for me from http://www.pgsql.cz/index.php/PostgreSQL_SQL_Tricks_I#General_array_sort
CREATE OR REPLACE FUNCTION array_sort (ANYARRAY)
RETURNS ANYARRAY LANGUAGE SQL
AS $$
SELECT ARRAY(
SELECT $1[s.i] AS "foo"
FROM
generate_series(array_lower($1,1), array_upper($1,1)) AS s(i)
ORDER BY foo
);
$$;
Please see Craig's answer since he is far more more knowledgable on Postgres and has a better answer. Also if possible vote to delete my answer.
Very nice exhibition of PostgreSQL's features is general procedure for sorting by David Fetter.
CREATE OR REPLACE FUNCTION array_sort (ANYARRAY)
RETURNS ANYARRAY LANGUAGE SQL
AS $$
SELECT ARRAY(
SELECT $1[s.i] AS "foo"
FROM
generate_series(array_lower($1,1), array_upper($1,1)) AS s(i)
ORDER BY foo
);
$$;
If you're looking for a solution which will work across any data-type, I'd recommend taking the approach laid out at YouLikeProgramming.com.
Essentially, you can create a stored procedure (code below) which performs the sorting for you, and all you need to do is pass your array to that procedure for it to be sorted appropriately.
I have also included an implementation which does not require the use of a stored procedure, if you're looking for your query to be a little more transportable.
Creating the stored procedure
DROP FUNCTION IF EXISTS array_sort(anyarray);
CREATE FUNCTION
array_sort(
array_vals_to_sort anyarray
)
RETURNS TABLE (
sorted_array anyarray
)
AS $BODY$
BEGIN
RETURN QUERY SELECT
ARRAY_AGG(val) AS sorted_array
FROM
(
SELECT
UNNEST(array_vals_to_sort) AS val
ORDER BY
val
) AS sorted_vals
;
END;
$BODY$
LANGUAGE plpgsql;
Sorting array values (works with any array data-type)
-- The following will return: {1,2,3,4}
SELECT ARRAY_SORT(ARRAY[4,3,2,1]);
-- The following will return: {in,is,it,on,up}
SELECT ARRAY_SORT(ARRAY['up','on','it','is','in']);
Sorting array values without a stored procedure
In the following query, simply replace ARRAY[4,3,2,1] with your array or query which returns an array:
WITH
sorted_vals AS (
SELECT
UNNEST(ARRAY[4,3,2,1]) AS val
ORDER BY
val
)
SELECT
ARRAY_AGG(val) AS sorted_array
FROM
sorted_vals
... or ...
SELECT
ARRAY_AGG(vals.val) AS sorted_arr
FROM (
SELECT
UNNEST(ARRAY[4,3,2,1]) AS val
ORDER BY
val
) AS vals
I'm surprised no-one has mentioned the containment operators:
select array[1,2,3] <# array[2,1,3] and array[1,2,3] #> array[2,1,3];
?column?
══════════
t
(1 row)
Notice that this requires that all elements of the arrays must be unique.
(If a contains b and b contains a, they must be the same if all elements are unique)