for instance,
I want to make a view for a particular search,
create view search_in_structure as
select a,b,c,d,e
from
t1 natural inner join t2 natural inner join t3 ...
where
a ilike '%search_string%'
or b ilike '%search_string%'
or c ilike '%search_string%'
or f ilike '%search_string%';
it doesn't make sense because I can't modify search_string. Is there a mechanism to provide a value for search_string so it will execute the view statement with proper modification, something like :
select a,b from search_in_structure where search_string='postgresql 4ever';
if it's not possible, what solution would you recommend me to use and achieve the same result?
The only solution I can think of, would be to make a function (for example, search_in_structure (IN search text, OUT a text, OUT b text ...) returns record) and call it like :
select a,b from (select search_in_structure('postgresql 4ever'));
But as I am still a postgresql noob, I want to have expert suggestions.
A function is the way to go:
create function search_in_structure(p_search_value text)
returns table (a text, b text, c text, d text)
as
$$
select a,b,c,d,e
from t1
natural join t2
natural join t3 ...
where
a ilike '%'||p_search_value||'%'
or b ilike '%'||p_search_value||'%'
or c ilike '%'||p_search_value||'%'
or f ilike '%'||p_search_value||'%'
$$
language sql;
Then you can do:
select *
from search_in_structure('foobar');
Related
In Postgres, I have a table with many columns (e.g. t1(a, b, c, ..., z)). I need to obtain its subset through a select-from-where statement into a new table (e.g. t2), but this new table must have a serial attribute. So, t2 would like t2(id, a, b, c, ..., z), where id the serial attribute. In Postgres, this works:
INSERT INTO t2(a, b, c, d, ..., z)
SELECT *
FROM t1
WHERE <condition>
However, is it possible to achieve the same without writing all the attributes of t1?
You can define a view that is a simple SELECT of all but the serial column.
Such views are updateable in PostgreSQL, so you can use it as the target for your INSERT.
In addition to Laurenz's answer, it's worth noting that you can call the next value for each record in your serial sequence within your insert.
One way you could do it requires that you know the name of your sequence beforehand. By default the naming convention for a serial sequence will be tablename_id_seq where tablename in this case would be t2.
INSERT INTO t2
SELECT
nextval('t2_id_seq')
, t1.*
FROM t1
For more details on dealing with sequences:
Auto-generated sequences will adhere to the pattern ${table}_${column}_seq.
You can find all sequences by running the following queries:
/* Version 10+ */
SELECT
*
FROM pg_sequences -- Not to be confused with `pg_sequence`
WHERE sequencename LIKE '%t2%'
;
/* Version 9.5+ */
-- Returns the sequences associated with a table
SELECT
pg_get_serial_sequence('schema.tablename', 'columnname')
;
-- Returns sequences accessible to the user, not those owned by the user
SELECT
*
FROM information_schema.sequences
WHERE sequence_name LIKE '%t2%'
;
-- Return sequences owned by the current user
SELECT
n.nspname AS sequence_schema,
c.relname AS sequence_name,
u.usename AS owner
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
JOIN pg_user u ON u.usesysid = c.relowner
WHERE c.relkind = 'S'
AND u.usename = current_user;
/* Version 8.1+ */
-- Returns sequences accessible to the user, not those owned by the user
SELECT
relname
FROM pg_class
WHERE relkind = 'S' -- 'S' for sequence
;
I have three tables:
create table id_table (
id integer
);
insert into id_table values (1),(2),(3);
create table alphabet_table (
id integer,
letter text
);
insert into alphabet_table values (1,'a'),(2,'b'),(3,'c');
create table greek_table (
id integer,
letter text
);
insert into greek_table values (1,'alpha'),(2,'beta');
I like to create a function that join id_table with either alphabet_table or greek_table on id. The choice of the table depends on an input value specified in the function. I wrote:
CREATE OR REPLACE FUNCTION choose_letters(letter_type text)
RETURNS table (id integer,letter text) AS $$
BEGIN
RETURN QUERY
select t1.id,
case when letter_type = 'alphabet' then t2.letter else t3.letter end as letter
from id_table t1,
alphabet_table t2 ,
greek_table t3
where t1.id = t2.id and t1.id = t3.id;
END;
$$LANGUAGE plpgsql;
I ran select choose_letter('alphabet'). The problem with this code is that when id_table joins with alphabet_table, it does not pick up id, No 3. It seems that inner joins are done for both alphabet_table and greek_table (so it only picks up the common ids, 1 and 2). To avoid this problem, I wrote:
CREATE OR REPLACE FUNCTION choose_letters(letter_type text)
RETURNS table (id integer, letter text) AS $$
BEGIN
RETURN QUERY
select t1.id,
case when letter_type = 'alphabet' then t2.letter else t3.letter end as letter
from id_table t1
left join alphabet_table t2 on t1.id=t2.id
left join greek_table t3 on t1.id=t3.id
where t2.letter is not null or t3.letter is not null;
END;
$$LANGUAGE plpgsql;
Now it pick up all the 3 ids when id_table and alphabet_table join. However, When I ran select choose_letter('greek'). The id no. 3 appears with null value in letter column despite the fact that I specified t3.letter is not null in where clause.
What I'm looking for is that when I ran select choose_letters('alphabet'), the output needs to be (1,'a'), (2,'b'),(3,'c'). select choose_letters('greek') should produce (1,'alpha'),(2,'beta). No missing values nor null. How can I accomplish this?
Learn to use proper, explicit JOIN syntax. Simple rule: Never use commas in the FROM clause.
You can do what you want with LEFT JOIN and some other logic:
select i.id,
coalesce(a.letter, g.letter) as letter
from id_table i left join
alphabet_table a
on i.id = a.id and letter_type = 'alphabet' left join
greek_table g
on i.id = g.id and letter_type <> 'alphabet'
where a.id is not null or g.id is not null;
The condition using letter_type needs to be in the on clauses. Otherwise, alphabet_table will always have a match.
Gordon Linoff's answer above is certainly correct, but here is an alternative way to write the code.
It may or may not be better from a performance perspective, but it is logically equivalent. If performance is a concern you'd need to run EXPLAIN ANALYZE on the query an inspect the plan, and do other profiling.
Some good parts about this are the inner join makes the join clause and where clause simpler and easier to reason about. It's also more straight forward for the execution engine to parallelize the query.
On the downside it looks like code duplication, however, the DRY principle is often misapplied to SQL. Repeating code is less important than repeating data reads. What you're aiming to do is not scan the same data multiple times.
If there is no index on the fields you are joining or the letter_type then this could end up doing a full table scan twice, and be worse. If you do have the indexes then it can do it with index range scans nicely.
SELECT
i.id,
a.letter
FROM id_table i
INNER JOIN alphabet_table a
ON i.id = a.id
WHERE letter_type = 'alphabet'
UNION ALL
SELECT
i.id,
g.letter
FROM id_table i
INNER JOIN greek_table g
ON i.id = g.id
WHERE letter_type <> 'alphabet'
The first problem is your tables or not structured properly, You would have created single table like char_table (id int, letter text, type text) type will specify whether it is alphabet or Greek.
Another solution is you can write two SQL queries one in if condition other one is in else part
I am using a LEFT JOIN there will be cases where there is no right-table match therefore empty (null) values are substituted for the right-table columns. As a result I am getting [null] as one of the JSON aggregates.
SELECT C.id, C.name, json_agg(E) AS emails FROM contacts C
LEFT JOIN emails E ON C.id = E.user_id
GROUP BY C.id;
Postgres 9.3 creates output for example
id | name | emails
-----------------------------------------------------------
1 | Ryan | [{"id":3,"user_id":1,"email":"hello#world.com"},{"id":4,"user_id":1,"email":"again#awesome.com"}]
2 | Nick | [null]
How can I ignore/remove null so I have an empty JSON array [] when the right-table column is null?
In 9.4 you can use coalesce and an aggregate filter expression.
SELECT C.id, C.name,
COALESCE(json_agg(E) FILTER (WHERE E.user_id IS NOT NULL), '[]') AS emails
FROM contacts C
LEFT JOIN emails E ON C.id = E.user_id
GROUP BY C.id, C.name
ORDER BY C.id;
The filter expression prevents the aggregate from processing the rows that are null because the left join condition is not met, so you end up with a database null instead of the json [null]. Once you have a database null, then you can use coalesce as usual.
http://www.postgresql.org/docs/9.4/static/sql-expressions.html#SYNTAX-AGGREGATES
something like this, may be?
select
c.id, c.name,
case when count(e) = 0 then '[]' else json_agg(e) end as emails
from contacts as c
left outer join emails as e on c.id = e.user_id
group by c.id
sql fiddle demo
you also can group before join (I'd prefer this version, it's a bit more clear):
select
c.id, c.name,
coalesce(e.emails, '[]') as emails
from contacts as c
left outer join (
select e.user_id, json_agg(e) as emails from emails as e group by e.user_id
) as e on e.user_id = c.id
sql fiddle demo
If this is actually a PostgreSQL bug, I hope it's been fixed in 9.4. Very annoying.
SELECT C.id, C.name,
COALESCE(NULLIF(json_agg(E)::TEXT, '[null]'), '[]')::JSON AS emails
FROM contacts C
LEFT JOIN emails E ON C.id = E.user_id
GROUP BY C.id;
I personally don't do the COALESCE bit, just return the NULL. Your call.
I used this answer (sorry, I can't seem to link to your username) but I believe I improved it a bit.
For the array version we can
get rid of the redundant double select
use json_agg instead of the array_to_json(array_agg()) calls
and get this:
CREATE OR REPLACE FUNCTION public.json_clean_array(p_data JSON)
RETURNS JSON
LANGUAGE SQL IMMUTABLE
AS $$
-- removes elements that are json null (not sql-null) or empty
SELECT json_agg(value)
FROM json_array_elements(p_data)
WHERE value::text <> 'null' AND value::text <> '""';
$$;
For 9.3, for the object version, we can:
get rid of the non-used WITH clause
get rid of the redundant double select
and get this:
CREATE OR REPLACE FUNCTION public.json_clean(p_data JSON)
RETURNS JSON
LANGUAGE SQL IMMUTABLE
AS $$
-- removes elements that are json null (not sql-null) or empty
SELECT ('{' || string_agg(to_json(key) || ':' || value, ',') || '}') :: JSON
FROM json_each(p_data)
WHERE value::TEXT <> 'null' AND value::TEXT <> '""';
$$;
For 9.4, we don't have to use the string assembly stuff to build the object, as we can use the newly added json_object_agg
CREATE OR REPLACE FUNCTION public.json_clean(p_data JSON)
RETURNS JSON
LANGUAGE SQL IMMUTABLE
AS $$
-- removes elements that are json null (not sql-null) or empty
SELECT json_object_agg(key, value)
FROM json_each(p_data)
WHERE value::TEXT <> 'null' AND value::TEXT <> '""';
$$;
Probably less performant than Roman Pekar's solution, but a bit neater:
select
c.id, c.name,
array_to_json(array(select email from emails e where e.user_id=c.id))
from contacts c
A bit different but might be helpful for others:
If all objects in the array are of same structure (e.g. because you use jsonb_build_object to create them) you can define a "NULL object with the same structure" to use in array_remove:
...
array_remove(
array_agg(jsonb_build_object('att1', column1, 'att2', column2)),
to_jsonb('{"att1":null, "att2":null}'::json)
)
...
I made my own function for filtering json arrays:
CREATE OR REPLACE FUNCTION public.json_clean_array(data JSON)
RETURNS JSON
LANGUAGE SQL
AS $$
SELECT
array_to_json(array_agg(value)) :: JSON
FROM (
SELECT
value
FROM json_array_elements(data)
WHERE cast(value AS TEXT) != 'null' AND cast(value AS TEXT) != ''
) t;
$$;
I use it as
select
friend_id as friend,
json_clean_array(array_to_json(array_agg(comment))) as comments
from some_entity_that_might_have_comments
group by friend_id;
of course only works in postgresql 9.3. I also have a similar one for object fields:
CREATE OR REPLACE FUNCTION public.json_clean(data JSON)
RETURNS JSON
LANGUAGE SQL
AS $$
SELECT
('{' || string_agg(to_json(key) || ':' || value, ',') || '}') :: JSON
FROM (
WITH to_clean AS (
SELECT
*
FROM json_each(data)
)
SELECT
*
FROM json_each(data)
WHERE cast(value AS TEXT) != 'null' AND cast(value AS TEXT) != ''
) t;
$$;
EDIT: You can see a few utils (a few are not originally mine but they were take from other stackoverflow solutions) here at my gist:
https://gist.github.com/le-doude/8b0e89d71a32efd21283
This way works, but there's gotta be a better way :(
SELECT C.id, C.name,
case when exists (select true from emails where user_id=C.id) then json_agg(E) else '[]' end
FROM contacts C
LEFT JOIN emails E ON C.id = E.user_id
GROUP BY C.id, C.name;
demo: http://sqlfiddle.com/#!15/ddefb/16
At the time this question was asked, the following example might not have been as efficient of a choice, due to the nature of how the email_list would basically not limit itself based on the outer query, but newer versions of postgres handle this much better (also, I'd recommend jsonb over json)
WITH email_list (user_id, emails) as (
SELECT user_id, json_agg(emails) FROM emails GROUP BY user_id
)
SELECT C.id, C.name, COALESCE(E.emails, '[]'::json) as emails
FROM contacts C LEFT JOIN email_list E ON C.id = E.user_id;
The COALESCE is only needed if you actually do want to have an empty array, otherwise the entire value would be null, which can be preferable output in some languages.
How do I join 2 sets of records solely based on the default order?
So if I have a table x(col(1,2,3,4,5,6,7)) and another table z(col(a,b,c,d,e,f,g))
it will return
c1 c2
-- --
1 a
2 b
3 c
4 d
5 e
6 f
7 g
Actually, I wanted to join a pair of one dimensional arrays from parameters and treat them like columns from a table.
Sample code:
CREATE OR REPLACE FUNCTION "Test"(timestamp without time zone[],
timestamp without time zone[])
RETURNS refcursor AS
$BODY$
DECLARE
curr refcursor;
BEGIN
OPEN curr FOR
SELECT DISTINCT "Start" AS x, "End" AS y, COUNT("A"."id")
FROM UNNEST($1) "Start"
INNER JOIN
(
SELECT "End", ROW_NUMBER() OVER(ORDER BY ("End")) rn
FROM UNNEST($2) "End" ORDER BY ("End")
) "End" ON ROW_NUMBER() OVER(ORDER BY ("Start")) = "End".rn
LEFT JOIN "A" ON ("A"."date" BETWEEN x AND y)
GROUP BY 1,2
ORDER BY "Start";
return curr;
END
$BODY$
Now, to answer the real question that was revealed in comments, which appears to be something like:
Given two arrays 'a' and 'b', how do I pair up their elements so I can get the element pairs as column aliases in a query?
There are a couple of ways to tackle this:
If and only if the arrays are of equal length, use multiple unnest functions in the SELECT clause (a deprecated approach that should only be used for backward compatibility);
Use generate_subscripts to loop over the arrays;
Use generate_series over subqueries against array_lower and array_upper to emulate generate_subscripts if you need to support versions too old to have generate_subscripts;
Relying on the order that unnest returns tuples in and hoping - like in my other answer and as shown below. It'll work, but it's not guaranteed to work in future versions.
Use the WITH ORDINALITY functionality added in PostgreSQL 9.4 (see also its first posting) to get a row number for unnest when 9.4 comes out.
Use multiple-array UNNEST, which is SQL-standard but which PostgreSQL doesn't support yet.
So, say we have function arraypair with array parameters a and b:
CREATE OR REPLACE FUNCTION arraypair (a integer[], b text[])
RETURNS TABLE (col_a integer, col_b text) AS $$
-- blah code here blah
$$ LANGUAGE whatever IMMUTABLE;
and it's invoked as:
SELECT * FROM arraypair( ARRAY[1,2,3,4,5,6,7], ARRAY['a','b','c','d','e','f','g'] );
possible function definitions would be:
SRF-in-SELECT (deprecated)
CREATE OR REPLACE FUNCTION arraypair (a integer[], b text[])
RETURNS TABLE (col_a integer, col_b text) AS $$
SELECT unnest(a), unnest(b);
$$ LANGUAGE sql IMMUTABLE;
Will produce bizarre and unexpected results if the arrays aren't equal in length; see the documentation on set returning functions and their non-standard use in the SELECT list to learn why, and what exactly happens.
generate_subscripts
This is likely the safest option:
CREATE OR REPLACE FUNCTION arraypair (a integer[], b text[])
RETURNS TABLE (col_a integer, col_b text) AS $$
SELECT
a[i], b[i]
FROM generate_subscripts(CASE WHEN array_length(a,1) >= array_length(b,1) THEN a::text[] ELSE b::text[] END, 1) i;
$$ LANGUAGE sql IMMUTABLE;
If the arrays are of unequal length, as written it'll return null elements for the shorter, so it works like a full outer join. Reverse the sense of the case to get an inner-join like effect. The function assumes the arrays are one-dimensional and that they start at index 1. If an entire array argument is NULL then the function returns NULL.
A more generalized version would be written in PL/PgSQL and would check array_ndims(a) = 1, check array_lower(a, 1) = 1, test for null arrays, etc. I'll leave that to you.
Hoping for pair-wise returns:
This isn't guaranteed to work, but does with PostgreSQL's current query executor:
CREATE OR REPLACE FUNCTION arraypair (a integer[], b text[])
RETURNS TABLE (col_a integer, col_b text) AS $$
WITH
rn_c1(rn, col) AS (
SELECT row_number() OVER (), c1.col
FROM unnest(a) c1(col)
),
rn_c2(rn, col) AS (
SELECT row_number() OVER (), c2.col
FROM unnest(b) c2(col)
)
SELECT
rn_c1.col AS c1,
rn_c2.col AS c2
FROM rn_c1
INNER JOIN rn_c2 ON (rn_c1.rn = rn_c2.rn);
$$ LANGUAGE sql IMMUTABLE;
I would consider using generate_subscripts much safer.
Multi-argument unnest:
This should work, but doesn't because PostgreSQL's unnest doesn't accept multiple input arrays (yet):
SELECT * FROM unnest(a,b);
select x.c1, z.c2
from
x
inner join
(
select
c2,
row_number() over(order by c2) rn
from z
order by c2
) z on x.c1 = z.rn
order by x.c1
If x.c1 is not 1,2,3... you can do the same that was done with z
The middle order by is not necessary as pointed by Erwin. I tested it like this:
create table t (i integer);
insert into t
select ceil(random() * 100000)
from generate_series(1, 100000);
select
i,
row_number() over(order by i) rn
from t
;
And i comes out ordered. Before this simple test which I never executed I though it would be possible that the rows would be numbered in any order.
By "default order" it sounds like you probably mean the order in which the rows are returned by select * from tablename without an ORDER BY.
If so, this ordering is undefined. The database can return rows in any order that it feels like. You'll find that if you UPDATE a row, it probably moves to a different position in the table.
If you're stuck in a situation where you assumed tables had an order and they don't, you can as a recovery option add a row number based on the on-disk ordering of the tuples within the table:
select row_number() OVER (), *
from the_table
order by ctid
If the output looks right, I recommend that you CREATE TABLE a new table with an extra field, then do an INSERT INTO ... SELECT to insert the data ordered by ctid, then ALTER TABLE ... RENAME the tables and finally fix any foreign key references so they point to the new table.
ctid can be changed by autovacuum, UPDATE, CLUSTER, etc, so it is not something you should ever be using in applications. I'm using it here only because it sounds like you don't have any real ordering or identifier key.
If you need to pair up rows based on their on-disk ordering (an unreliable and unsafe thing to do as noted above), you could per this SQLFiddle try:
WITH
rn_c1(rn, col) AS (
SELECT row_number() OVER (ORDER BY ctid), c1.col
FROM c1
),
rn_c2(rn, col) AS (
SELECT row_number() OVER (ORDER BY ctid), c2.col
FROM c2
)
SELECT
rn_c1.col AS c1,
rn_c2.col AS c2
FROM rn_c1
INNER JOIN rn_c2 ON (rn_c1.rn = rn_c2.rn);
but never rely on this in a production app. If you're really stuck you can use this with CREATE TABLE AS to construct a new table that you can start with when you're working on recovering data from a DB that lacks a required key, but that's about it.
The same approach given above might work with an empty window clause () instead of (ORDER BY ctid) when using sets that lack a ctid, like interim results from functions. It's even less safe then though, and should be a matter of last resort only.
(See also this newer related answer: https://stackoverflow.com/a/17762282/398670)
I have 3 tables Person, Names, and Notes. Each person has multiple name and has optional notes. I have full text search on some columns on names and notes (see below), they are working perfectly if the word I search with is in the result set or is in the db, this is for custom function, php, and psql. The problem now is that when the word I search is not present in the db the query gets super slow in php and custom function but still fast on psql. On psql it's less than 1s, others are more than 10s.
Tables:
Person | id, birthday
Name | person_id, name, fs_name
Notes | person_id, note, fs_note
Beside PK and FK index, Gin index on fs_name and fs_note.
Function/Query
create or replace function queryNameFunc (TEXT)
returns TABLE(id int, name TEXT) as $$
select id, name
from person_name pnr
inner join person pr on (pnr.person_id=pr.id)
left join personal_notes psr on (psr.person_id = pr.id)
where pr.id in
(select distinct(id)
from person_name pn
inner join person p on (p.id = pn.person_id)
left join personal_notes ps on (ps.person_id = p.id)
where tname ## to_tsquery($1)
limit 20);
$$ language SQL;
The where condition is trimmed down in here, so for example if I do 'john & james' on $1 and the data is on the db then results is fast but if 'john and james' are not in db then its slow. This got slower as I have 1M records on person and 3M+ on names (all dummy records). Any idea on how to fix this? I tried restarting the server, restarting postgresql.
The database has to preprare the inner query before it has any knowledge about the parameter. This might result in a bad queryplan. To avoid this problem in a function, use the plpgsql-language and use EXECUTE inside the function:
CREATE OR REPLACE FUNCTION queryNameFunc (TEXT) RETURNS TABLE(id INT, name TEXT) AS $$
BEGIN
RETURN QUERY EXECUTE '
SELECT
id,
name
FROM
person_name pnr
INNER JOIN person pr ON (pnr.person_id=pr.id)
LEFT JOIN personal_notes psr ON (psr.person_id = pr.id)
WHERE
pr.id IN(
SELECT
DISTINCT(id)
FROM
person_name pn
INNER JOIN person p ON (p.id = pn.person_id)
LEFT JOIN personal_notes ps ON (ps.person_id = p.id)
WHERE tname ## to_tsquery($1)
LIMIT 20)' USING $1;
END;
$$ LANGUAGE plpgsql;
This works in version 8.4 and you do have to install plpgsql:
CREATE LANGUAGE plpgsql;