I am trying to create a function that takes a table and a variable number of columns as arguments, and then returns a table without rows that have duplicates on all of those columns. I am trying to figure out how to have a variable number of columns as arguments, and I have gathered that I will probably need a VARIADIC argument, but I am not sure how to implement it. What I have so far:
CREATE FUNCTION remove_duplicates(orig_table, VARIADIC sel_columns column)
RETURNS table AS $$
SELECT * FROM
(SELECT *,
count(*) over (partition by sel_columns) AS count
FROM orig_table)
WHERE count = 1;
$$ LANGUAGE SQL;
As an example, if I had a table like this:
cola | colb | colc
-------------------
a | b | 1
a | b | 2
a | c | 3
a | d | 4
I would like to run SELECT * FROM remove_duplicates(mytable, cola, colb) and get this result:
cola | colb | colc
-------------------
a | c | 3
a | d | 4
Thank you for the help. I'm using postgresql 9.4.9
You cannot get what you want with a simple SQL function; you need the power of a procedural language. A possible solution is:
CREATE OR REPLACE FUNCTION remove_duplicates(orig_table anyelement, VARIADIC sel_columns text[])
RETURNS SETOF anyelement AS $$
DECLARE
orig_table_columns TEXT;
BEGIN
SELECT array_to_string(array_agg(quote_ident(column_name)),',') INTO orig_table_columns FROM information_schema.columns WHERE table_name = CAST(pg_typeof(orig_table) AS TEXT);
RETURN QUERY EXECUTE 'SELECT ' || orig_table_columns || ' FROM '
|| '(SELECT *, '
|| ' count(*) over (partition by ' || array_to_string(sel_columns, ',') || ') AS count '
|| 'FROM ' || pg_typeof(orig_table) || ') AS tmp '
|| ' WHERE count = 1 ';
END
$$ LANGUAGE PLPGSQL;
SELECT * FROM remove_duplicates(NULL::tests, 'cola', 'colb');
Don't forget to do your changes to avoid SQL Injection.
EDIT: For a very good explanation about functions with dynamic return types see Erwin's answer here.
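One way to act on the SQL injection warning is to quote every column name with quote_ident and assemble the statement with format(). The following is only a sketch under assumptions: the function name remove_duplicates_safe and the aliases src, dedup and cnt are made up for illustration, and it assumes the table has no column that is itself named src. Note that quote_ident preserves case, so the column names must be passed exactly as they are stored.

```sql
-- Hypothetical hardened variant; names are illustrative, not from the original answer.
CREATE OR REPLACE FUNCTION remove_duplicates_safe(orig_table anyelement, VARIADIC sel_columns text[])
RETURNS SETOF anyelement AS $$
DECLARE
    partition_cols text;
BEGIN
    -- Quote each requested column as an identifier so it cannot carry SQL of its own.
    SELECT string_agg(quote_ident(c), ', ')
    INTO partition_cols
    FROM unnest(sel_columns) AS c;

    -- src is the whole-row value of the table; (src).* expands it back
    -- into the table's columns, which matches the polymorphic return type.
    RETURN QUERY EXECUTE format(
        'SELECT (src).* FROM ('
        || ' SELECT src, count(*) OVER (PARTITION BY %s) AS cnt'
        || ' FROM %s AS src'
        || ') AS dedup WHERE cnt = 1',
        partition_cols,
        pg_typeof(orig_table));
END
$$ LANGUAGE plpgsql;
```

It is called the same way as the original: SELECT * FROM remove_duplicates_safe(NULL::tests, 'cola', 'colb');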
I am trying to count, for each column of a table, the number of rows in which that column is not null.
There is a simple table, actor_new.
The first two columns (actor_id, first_name) are non-null in 203 rows.
The other two columns (last_name, last_update) are non-null in 200 rows.
This is a simple test that outputs the same value for every column, although running the select for each column separately gives the correct counts. Please help me understand what is wrong in the LOOP block.
create or replace function new_cnt_test_ho(in_table text, out out_table text, out cnt_rows int) returns setof record AS $$
DECLARE i text;
BEGIN
FOR i IN
select column_name
from information_schema."columns"
where table_schema = 'public'
and table_name = in_table
LOOP
execute '
select $1, count($1)
from '|| quote_ident(in_table) ||'
where $1 is not null '
INTO out_table, cnt_rows
using i, quote_literal(i), quote_ident(in_table), quote_literal(in_table) ;
return next;
END LOOP;
END;
$$LANGUAGE plpgsql
Result:
select * from new_cnt_test_ho('actor_new')
out_table |cnt_rows|
-----------+--------+
actor_id | 203|
first_name | 203|
last_name | 203|
last_update| 203|
There are four parameters in the USING clause because I assumed the error was in the quoting, so I tried each of the arguments $1 to $4 in turn.
The correct result should look like this:
out_table |cnt_rows|
-----------+--------+
actor_id | 203|
first_name | 203|
last_name | 200|
last_update| 200|
Based on your title: the input is a table name, and the output is a table in which one column holds the column name and the other holds the result of count(column).
First, check whether the table exists.
Then loop over the column names and run one query per column.
A sample query is select 'cola',count(cola) from count_nulls. The first occurrence of cola is a string literal, so it needs quote_literal(cols.column_name);
the second is a column name, so it needs quote_ident(cols.column_name). In your version both occurrences are the parameter $1, which is always passed as a value and never as an identifier, so count($1) counts a non-null constant and returns the total number of rows for every column.
select 'cola',count(cola) from count_nulls counts the non-null values in column cola; if every value in the column is null, it returns 0.
The following function returns the expected result. It can be simplified, since I left raise notice output in for debugging.
CREATE OR REPLACE FUNCTION get_all_nulls (_table text)
RETURNS TABLE (
column_name_ text,
numberofnull bigint
)
AS $body$
DECLARE
cols RECORD;
_sql text;
_table_exists boolean;
_table_reg regclass;
BEGIN
_table_reg := _table::regclass;
_table_exists := (
SELECT
EXISTS (
SELECT
FROM
pg_tables
WHERE
schemaname = 'public'
AND tablename = _table));
FOR cols IN
SELECT
column_name
FROM
information_schema.columns
WHERE
table_name = _table
AND table_schema = 'public' LOOP
_sql := 'select ' || quote_literal(cols.column_name) || ',count(' || quote_ident(cols.column_name) || ') from ' || quote_ident(_table::text);
RAISE NOTICE '_sql:%', _sql;
RETURN query EXECUTE _sql;
END LOOP;
END;
$body$ STRICT
LANGUAGE plpgsql;
Setup:
begin;
create table count_nulls(cola int, colb int, colc int);
INSERT into count_nulls values(null,null,null);
INSERT into count_nulls values(1,null,null);
INSERT into count_nulls values(2,3,null);
commit;
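The same idea can be written more compactly with format(), where %L quotes a literal, %I quotes an identifier, and a regclass argument lets PostgreSQL itself verify that the table exists. This is a sketch rather than a drop-in replacement: the name count_not_null and its output column names are assumptions, and it reads column names from pg_attribute instead of information_schema.

```sql
-- Hypothetical simplified variant; function and column names are illustrative.
CREATE OR REPLACE FUNCTION count_not_null(_table regclass)
RETURNS TABLE (column_name_ text, not_null_count bigint)
LANGUAGE plpgsql AS
$body$
DECLARE
    col text;
BEGIN
    FOR col IN
        SELECT a.attname
        FROM pg_attribute AS a
        WHERE a.attrelid = _table
          AND a.attnum > 0          -- skip system columns
          AND NOT a.attisdropped    -- skip dropped columns
        ORDER BY a.attnum
    LOOP
        -- %L inserts the name as a string literal, %I as a quoted identifier,
        -- and %s inserts the regclass, which is already safely quoted.
        RETURN QUERY EXECUTE format(
            'SELECT %L::text, count(%I) FROM %s', col, col, _table);
    END LOOP;
END;
$body$;
```

With the count_nulls setup above, SELECT * FROM count_not_null('count_nulls'); returns 2 for cola, 1 for colb and 0 for colc.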
I need your help with this. I'm trying to query a jsonb column in a table. The jsonb value is an array of objects, and every object has two key/value pairs. I need to exclude one of the pairs (location) and return only the key of the other pair, without its value. I figured out how to do it like this:
jsonb : '[{"track":"value","location":"value"},{"extra":"value","location":"value"},...{"another":"value","location":"value"}]'
SELECT id, jsonb_object_keys((item::jsonb - 'location')::jsonb)
FROM mytable, jsonb_array_elements(theJsonB) with ordinality arr(item,position)
WHERE offer = '0001'
This query gives me a result like
id | jsonb_object_keys
-----------------------
1 | track
1 | extra
... |
1 | another
But I need the result in a single row per id, like
id | column1 | column2 | ... | column+
------------------------
1 | track | extra | ... | another
2 | track | extra | ... | another
3 | track | extra | ... | another
4 | track | extra | ... | another
How could I solve this? Thanks in advance. I'm pretty new to SQL, but I'm working hard ;-)
If the list of resulting columns is known only at run time, you need some dynamic SQL.
Here is a full dynamic solution which relies on the creation of a user-defined composite type and on the standard functions jsonb_populate_record and jsonb_object_agg :
First you dynamically create the list of keys as a new composite type :
CREATE OR REPLACE PROCEDURE key_list (NewJsonB jsonb) LANGUAGE plpgsql AS
$$
DECLARE key_list text ;
BEGIN
IF NewJsonB IS NULL
THEN
SELECT string_agg(DISTINCT k.object->>'key' || ' text', ',')
INTO key_list
FROM mytable
CROSS JOIN LATERAL jsonb_path_query(theJsonB, '$[*].keyvalue()[*] ? (@.key != "location")') AS k(object) ;
ELSE SELECT string_agg(DISTINCT k.key :: text || ' text', ',')
FROM (SELECT jsonb_object_keys(to_jsonb(a.*)) AS key FROM (SELECT(null :: key_list).*) AS a
UNION ALL
SELECT jsonb_path_query(NewJsonB, '$[*].keyvalue()[*] ? (@.key != "location")')->>'key'
) AS k
INTO key_list ;
END IF ;
EXECUTE 'DROP TYPE IF EXISTS key_list ' ;
EXECUTE 'CREATE TYPE key_list AS (' || COALESCE(key_list, '') || ')' ;
END ;
$$ ;
CALL key_list(NULL) ;
Then you call the procedure key_list() by trigger when the list of keys is supposed to be modified :
CREATE OR REPLACE FUNCTION mytable_insert_update()
RETURNS trigger LANGUAGE plpgsql VOLATILE AS
$$
BEGIN
IF NOT EXISTS (SELECT jsonb_object_keys(to_jsonb(a.*)) FROM (SELECT(null :: key_list).*) AS a)
THEN CALL key_list(NULL) ;
ELSIF EXISTS ( SELECT jsonb_path_query(NEW.theJsonB, '$[*].keyvalue()[*] ? (@.key != "location")')->>'key'
EXCEPT ALL
SELECT jsonb_object_keys(to_jsonb(a.*)) FROM (SELECT(null :: key_list).*) AS a
)
THEN CALL key_list(NEW.theJsonB) ;
END IF ;
RETURN NEW ;
END ;
$$ ;
CREATE OR REPLACE TRIGGER mytable_insert_update AFTER INSERT OR UPDATE OF theJsonB ON mytable
FOR EACH ROW EXECUTE FUNCTION mytable_insert_update() ;
CREATE OR REPLACE FUNCTION mytable_delete()
RETURNS trigger LANGUAGE plpgsql VOLATILE AS
$$
BEGIN
CALL key_list (NULL) ;
RETURN OLD ;
END ;
$$ ;
CREATE OR REPLACE TRIGGER mytable_delete AFTER DELETE ON mytable
FOR EACH ROW EXECUTE FUNCTION mytable_delete() ;
Finally, you should get the expected result with the following query :
SELECT (jsonb_populate_record(NULL :: key_list, jsonb_object_agg(lower(c.object->>'key'), c.object->'key'))).*
FROM mytable AS t
CROSS JOIN LATERAL jsonb_path_query(t.theJsonB, '$[*].keyvalue()[*] ? (@.key != "location")') AS c(object)
GROUP BY t
full test result in dbfiddle.
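To see what the jsonpath expression in this answer yields on its own, it can be run against a literal value. The sketch below assumes PostgreSQL 12 or later (where jsonb_path_query is available) and uses a shortened version of the sample array from the question; in a jsonpath filter, @ stands for the item currently being tested.

```sql
-- .keyvalue() turns every object into {"key": ..., "value": ..., "id": ...} items;
-- the filter then drops the items whose key is "location".
SELECT jsonb_path_query(
           '[{"track":"value","location":"value"},{"extra":"value","location":"value"}]'::jsonb,
           '$[*].keyvalue()[*] ? (@.key != "location")'
       ) ->> 'key' AS key;
```

For this input the query returns two rows, track and extra, which are exactly the names the procedure turns into the attributes of the key_list type.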
I have the following function, pro(), from which I want to return the strings produced by a UNION ALL of two select statements.
Function: pro()
My try:
create or replace function pro()
returns varchar as
$$
declare
sql varchar;
q varchar;
begin
sql := 'SELECT DISTINCT CAST(COUNT(ProductNumber) as varchar) ||'' - Count of the product Number '' as Descp
FROM product
UNION ALL
SELECT DISTINCT CAST(COUNT(ProductName) AS varchar) || '' - Count of the product Name '' as Descp
FROM product';
raise info '%',sql;
execute sql into q;
return q;
end;
$$
language plpgsql;
Calling Function:
select pro();
This returns only the first part of the select statement:
______________________________________
|pro |
|character varying |
|______________________________________|
|6 - Count of the product Number |
|______________________________________|
But the expected result should be:
______________________________________
|pro |
|character varying |
|______________________________________|
|6 - Count of the product Number |
|______________________________________|
|6 - Count of the product Name |
|______________________________________|
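EXECUTE ... INTO with a scalar variable keeps only the first row of the result, so the function has to return a set instead. Try one of these functions:
Using plpgsql: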
create or replace function pro1()returns
table (
descp text
)
as
$$
begin
return QUERY execute (
'SELECT DISTINCT CAST(COUNT(product) as varchar) ||'' - Count of the product Number '' as Descp
FROM product
UNION ALL
SELECT DISTINCT CAST(COUNT(productid) AS varchar) || '' - Count of the product Name '' as Descp
FROM product');
end;
$$
language plpgsql;
Or, using sql:
create or replace function pro2() returns table ( descp text)
as
$$
SELECT DISTINCT CAST(COUNT(product) as varchar) ||' - Count of the product Number ' as Descp
FROM product
UNION ALL
SELECT DISTINCT CAST(COUNT(productid) AS varchar) || ' - Count of the product Name 'as Descp
FROM product;
$$
language sql;
I need to build a condition inside a function using string_agg(), assign the result to a variable, and then execute the query held in that variable.
Example:
create or replace function funct1(a int,b varchar)
returns void as
$$
declare
wrclause varchar := '';
sqlq varchar ;
t varchar;
begin
IF (b IS NOT NULL ) THEN
wrclause := 'AND b IN ('|| b || ')';
END IF;
sqlq := string_agg('select *, abcd as "D" from ' ||table_namess,' Union all ') as namess
from tablescollection2 ud
inner join INFORMATION_SCHEMA.Tables so on ud.table_namess = so.Table_name
WHERE cola NOT IN (SELECT cola FROM tablet WHERE colb = || a ||) || wrclause; /* Error occurred here at = || a */
raise info '%',sqlq;
execute sqlq into t;
raise info '%',t;
end;
$$
language plpgsql;
Calling Function:
select funct1(1,'1,2,3');
Error:
ERROR: operator does not exist: || integer
|| is an operator for concatenating two pieces of text; it requires text (or something convertible to text) both before and after the operator, like so:
select 'a' || 'b'
select 'a' || 3
So while these seem to be valid:
wrclause := 'AND b IN ('|| b || ')';
sqlq := string_agg('select *, abcd as "D" from ' ||table_namess,' Union all ') as namess
This is definitely not:
WHERE cola NOT IN (SELECT cola FROM tablet WHERE colb = || a ||) || wrclause;
What were you trying to achieve here?
It looks like you may be trying to construct a query dynamically. You need to remember that you cannot mix free text with SQL and expect Postgres to sort it out, no programming or query language does that.
If that's your intention, you should construct the query string first in its entirety (in a variable), and then call EXECUTE with it to have it interpreted.
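As a minimal sketch of that approach: build the whole statement as text first, then pass the values through EXECUTE ... USING instead of concatenating them into the string. Everything here is an assumption made for illustration: the temporary table demo_tablet stands in for the asker's tablet, both of its columns are taken to be integers, and the comma-separated list is converted with string_to_array.

```sql
-- Hypothetical stand-in for the asker's table.
CREATE TEMP TABLE demo_tablet (cola int, colb int);
INSERT INTO demo_tablet VALUES (1, 1), (2, 1), (5, 1), (3, 2);

DO $$
DECLARE
    a    int  := 1;          -- plays the role of the function argument a
    b    text := '1,2,3';    -- plays the role of the function argument b
    sqlq text;
    n    bigint;
BEGIN
    -- The query is ordinary text; $1 and $2 are placeholders filled in by USING.
    sqlq := 'SELECT count(*) FROM demo_tablet WHERE colb = $1';
    IF b IS NOT NULL THEN
        sqlq := sqlq || ' AND cola = ANY (string_to_array($2, '','')::int[])';
    END IF;
    RAISE INFO '%', sqlq;
    EXECUTE sqlq INTO n USING a, b;
    RAISE INFO 'matching rows: %', n;
END
$$;
```

Because a and b travel as parameters, no || ever has to sit next to a bare value inside the SQL text, which is what triggered the operator error.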
Have a look at these:
Postgres Dynamic Query Function
PostgreSQL - dynamic value as table name
This piece contains the syntax error:
... IN (SELECT cola FROM tablet WHERE colb = || a ||) || ...
PostgreSQL can parse this, but it then looks for a unary prefix (and a postfix) || operator, which do not exist by default. (They can be created, but the error message shows that this is not the case here.)
Edit:
For example, these are valid (predefined) unary operators on numbers (the postfix ! operator was removed in PostgreSQL 14):
SELECT |/ 25.0, -- prefix, square root, result: 5.0
5 !, -- postfix, factorial, result: 120
@ -5, -- prefix, absolute value, result: 5
@ -5 !; -- mixed, result: 120
I'm trying to retrieve player statistics for the last 20 weeks:
# select yw, money
from pref_money where id='OK122471020773'
order by yw desc limit 20;
yw | money
---------+-------
2010-52 | 1130
2010-51 | 3848
2010-50 | 4238
2010-49 | 2494
2010-48 | 936
2010-47 | 3453
2010-46 | 3923
2010-45 | 1110
2010-44 | 185
(9 rows)
But I would like to have the result as a string, where all values are concatenated by colons and semicolons like this:
"2010-44:185;2010-45:1110; .... ;2010-52:1130"
So I'm trying to create the following PL/pgSQL procedure:
create or replace function pref_money_stats(_id varchar)
returns varchar as $BODY$
begin
declare stats varchar;
for row in select yw, money from pref_money
where id=_id order by yw desc limit 20 loop
stats := row.id || ':' || row.money || ';' stats;
end loop;
return stats;
end;
$BODY$ language plpgsql;
But I get the syntax error:
ERROR: syntax error at or near "for"
LINE 7: for row in select yw, money from pref_money where id...
Using PostgreSQL 8.4.6 with CentOS 5.5 Linux.
UPDATE:
I'm trying to perform all this string concatenation with PL/pgSQL and not in PHP script, because I already have a main SQL select statement, which returns user information and that information is printed row by row as XML for my mobile app:
select u.id,
u.first_name,
u.female,
u.city,
u.avatar,
m.money,
u.login > u.logout as online
from pref_users u, pref_money m where
m.yw=to_char(current_timestamp, 'YYYY-IW')
and u.id=m.id
order by m.money desc
limit 20 offset ?
Here is an XML excerpt:
<?xml version="1.0"?>
<pref>
<user id="OK510352632290" name="ирина" money="2067" pos="1" medals="1" female="1" avatar="http://i221.odnoklassniki.ru/getImage?photoId=259607761026&photoType=0" city="староконстантинов" />
<user id="OK19895063121" name="Александр" money="1912" pos="2" online="1" avatar="http://i69.odnoklassniki.ru/getImage?photoId=244173589553&photoType=0" city="Сызрань" />
<user id="OK501875102516" name="Исмаил" money="1608" pos="3" online="1" avatar="http://i102.odnoklassniki.ru/res/stub_128x96.gif" city="Москва" />
.....
</pref>
But my problem is that I have 3 other tables from which I need the same kind of statistics for the last 20 weeks. So I'm hoping to create 3 procedures that return varchars, as in my original post, and integrate them into this SQL select statement, so that I can add further attributes to the XML data:
<user id="OK12345" .... money_stats="2010-44:185;2010-45:1110; .... ;2010-52:1130" ..... />
Thank you!
Alex
Aggregate functions are good for concatenating values:
create or replace function test
(text, text, text)
returns text as
$$
select $1 || ':' || $2 || ';' || $3
$$
language sql;
drop function test(text, text);
drop aggregate test(text, text) cascade;
create aggregate test(text, text)
(
sfunc = test,
stype = text,
initcond = ''
);
test=# select test(a::text, b::text) from (select generate_series(1,3) as a, generate_series(4,5) as b) t;
:1;4:2;5:3;4:1;5:2;4:3;5
(I'll leave it to you to deal with the leading colon :-)
You probably have already found the answer to your problem. Even so, the problem was indeed syntax.
The problem was that the declare statement was misplaced: it should appear before the begin (docs):
create or replace function pref_money_stats(_id varchar)
returns varchar as $BODY$
declare stats varchar;
begin
...
Another detail to take notice of is that you need to declare row as a record:
declare
stats varchar;
row record;
Then this statement will run properly:
for row in select yw, money from pref_money where id=_id order by yw desc limit 20 loop
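Putting these fixes together, a complete version might look like the sketch below. Beyond the two points above it makes changes that are my own reading of the question rather than something the answer states: the loop body uses yw (the query does not select id), the missing || before stats is added, and the separator is only inserted between entries. The record variable is called r here simply to avoid confusion with the ROW(...) constructor.

```sql
create or replace function pref_money_stats(_id varchar)
returns varchar as $BODY$
declare
    stats varchar := '';
    r     record;
begin
    for r in
        select yw, money from pref_money
        where id = _id order by yw desc limit 20
    loop
        -- Rows arrive newest first; prepending each one leaves the
        -- finished string ordered from the oldest week to the newest.
        if stats = '' then
            stats := r.yw || ':' || r.money;
        else
            stats := r.yw || ':' || r.money || ';' || stats;
        end if;
    end loop;
    return stats;
end;
$BODY$ language plpgsql;
```

It can then be used inside the main select, for example as pref_money_stats(u.id) as money_stats.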
This is not exactly JSON but pretty close:
SELECT ARRAY
(
SELECT ROW(yw, money)
FROM pref_money
WHERE id = 'OK122471020773'
ORDER BY
yw DESC
LIMIT 20
)::TEXT
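This will output a string like this (composite values are written with a comma between the fields, in the order the subquery returns them):
{"(2010-52,1130)","(2010-51,3848)",…,"(2010-44,185)"}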
which can later be cast back into the appropriate types.