How to produce grouped column from many rows on PostgreSQL - postgresql

If i could produce this result from many to many relationship from this kind of query:
SELECT x1.id AS id1, x3.id AS id3
FROM humans x1
LEFT JOIN memberships x2
ON x1.id = x2.human_id
LEFT JOIN groups x3
ON x2.group_id = x3.id
WHERE x1.id IN ( 1,2,3,4 )
ORDER BY 1,2
id1 | id3
----+----
1 | A
1 | B
1 | C
2 | D
2 | E
3 | F
4 | (null)
5 | G
5 | Z
how to convert it into this kind of table?
id1 | id3s
----+--------
1 | A, B, C
2 | D, E
3 | F
4 | (null)
5 | G, Z

Use string_agg and a group by:
SELECT x1.id AS id1, string_agg(x3.id,',' order by x3.id asc) AS id3s
FROM humans x1
LEFT JOIN memberships x2
ON x1.id = x2.human_id
LEFT JOIN groups x3
ON x2.group_id = x3.id
WHERE x1.id IN ( 1,2,3,4 )
GROUP BY x1.id
ORDER BY 1

Related

How can I use PostGIS to select the average price of the closest X locations?

I would like to find the average price of gas for any given home. Here are my current tables.
home_id | geocoordinates
1 | 0101000020E61000005BB6D617097544
2 | 0101000020E61000005BB6D617097545
3 | 0101000020E61000005BB6D617097546
4 | 0101000020E61000005BB6D617097547
5 | 0101000020E61000005BB6D617097548
gas_price | geocoordinates
1 | 0101000020E61000005BB6D617097544
1 | 0101000020E61000005BB6D617097545
1 | 0101000020E61000005BB6D617097546
2 | 0101000020E61000005BB6D617097547
2 | 0101000020E61000005BB6D617097548
2 | 0101000020E61000005BB6D617097544
2 | 0101000020E61000005BB6D617097545
3 | 0101000020E61000005BB6D617097546
3 | 0101000020E61000005BB6D617097547
3 | 0101000020E61000005BB6D617097548
3 | 0101000020E61000005BB6D617097544
4 | 0101000020E61000005BB6D617097545
4 | 0101000020E61000005BB6D617097546
4 | 0101000020E61000005BB6D617097547
For each home, I would like to find the average gas price of the X closest gas_prices. Example if X=5:
home_id | average_of_closest_five_gas_prices
1 | 1.5
2 | 2.5
3 | 2.1
4 | 1.5
5 | 1.5
I figured it out for using one individual home_id but I'm struggling to figure out how to do it for all.
select avg(gas_price) from (
SELECT *
FROM gas_price
ORDER BY gas_price.geocoordinates <-> '0101000020E61000005BB6D617097544'
LIMIT 5
) as table_a
You can use lateral join to limit size of group in group by.
select home_id, avg(gas_price)
from home,
lateral (
select gas_price
from gas_price
order by gas_price.geocoordinates <-> home.geocoordinates
limit 5
) x
group by home_id;
Another option is to use window function: partition by home_id, order by distance and select only rows with row_number() <= 5.
select home_id, avg(gas_price)
from (
select row_number() over w as r, *
from home h, gas_price g
window w as (partition by home_id order by g.geocoordinates <-> h.geocoordinates)
) x
where r <= 5
group by home_id;

SQL Server recursive query·

I have a table in SQL Server 2008 R2 which contains product orders. For the most part, it is one entry per product
ID | Prod | Qty
------------
1 | A | 1
4 | B | 1
7 | A | 1
8 | A | 1
9 | A | 1
12 | C | 1
15 | A | 1
16 | A | 1
21 | B | 1
I want to create a view based on the table which looks like this
ID | Prod | Qty
------------------
1 | A | 1
4 | B | 1
9 | A | 3
12 | C | 1
16 | A | 2
21 | B | 1
I've written a query using a table expression, but I am stumped on how to make it work. The sql below does not actually work, but is a sample of what I am trying to do. I've written this query multiple different ways, but cannot figure out how to get the right results. I am using row_number to generate a sequential id. From that, I can order and compare consecutive rows to see if the next row has the same product as the previous row since ReleaseId is sequential, but not necessarily contiguous.
;with myData AS
(
SELECT
row_number() over (order by a.ReleaseId) as 'Item',
a.ReleaseId,
a.ProductId,
a.Qty
FROM OrdersReleased a
UNION ALL
SELECT
row_number() over (order by b.ReleaseId) as 'Item',
b.ReleaseId,
b.ProductId,
b.Qty
FROM OrdersReleased b
INNER JOIN myData c ON b.Item = c.Item + 1 and b.ProductId = c.ProductId
)
SELECT * from myData
Usually you drop the ID out of something like this, since it is a summary.
SELECT a.ProductId,
SUM(a.Qty) AS Qty
FROM OrdersReleased a
GROUP BY a.ProductId
ORDER BY a.ProductId
-- if you want to do sub query you can do it as a column (if you don't have a very large dataset).
SELECT a.ProductId,
SUM(a.Qty) AS Qty,
(SELECT COUNT(1)
FROM OrdersReleased b
WHERE b.ReleasedID - 1 = a.ReleasedID
AND b.ProductId = b.ProductId) as NumberBackToBack
FROM OrdersReleased a
GROUP BY a.ProductId
ORDER BY a.ProductId

Query in postgresql

I have two tables t1 and t2 as following
t1
A B C D E
1 2 c d e
3 1 d e f
4 2 f g h
t2
A B
1 2
8 6
4 2
Here A,B,C,D,E are the columns of t1 and A,B are the columns of t2 where A and B are common columns.
What I have done so far
I have written the following query
WITH temp as (
select *
from t2
)
select tab1.*
from t1 tab1, temp tab2
where (tab1.A!=tab2.A OR tab1.B!=tab2.B)
I wanted this output
A B C D E
3 1 d e f
But I am getting this output
A B C D E
1 2 c d e
1 2 c d e
3 1 d e f
3 1 d e f
3 1 d e f
4 2 f g h
4 2 f g h
What query should I use?
If I understand you correctly, you'd like those rows from T1 that don't have corresponding rows in T2. The easiest way in my opinion is a LEFT OUTER JOIN:
psql=> select * from t1;
a | b | c | d | e
---+---+---+---+---
1 | 2 | c | d | e
3 | 1 | d | e | f
4 | 2 | f | g | h
(3 rows)
psql=> select * from t2;
a | b
---+---
1 | 2
8 | 6
4 | 2
(3 rows)
psql=> select t1.a, t1.b, t1.c, t1.d, t1.e from t1 left outer join t2 on (t1.a = t2.a and t1.b = t2.b) where t2.a is null;
a | b | c | d | e
---+---+---+---+---
3 | 1 | d | e | f
(1 row)
Edit: Here's the select without the where clause, with the rows from t2 added (otherwise it'd be just like a select * from t1). As you can see, the first row contains NULLs for t2_a and t2_b:
psql=> select t1.a, t1.b, t1.c, t1.d, t1.e, t2.a as t2_a, t2.b as t2_b from t1 left outer join t2 on (t1.a = t2.a and t1.b = t2.b);
a | b | c | d | e | t2_a | t2_b
---+---+---+---+---+------+------
3 | 1 | d | e | f | |
1 | 2 | c | d | e | 1 | 2
4 | 2 | f | g | h | 4 | 2
(3 rows)
How about:
SELECT * FROM t1 WHERE (a,b) NOT IN (SELECT a,b FROM t2);

Find Parent Recursively using Query

I am using postgresql. I have the table as like below
parent_id child_id
----------------------
101 102
103 104
104 105
105 106
I want to write a sql query which will give the final parent of input.
i.e suppose i pass 106 as input then , its output will be 103.
(106 --> 105 --> 104 --> 103)
Here's a complete example. First the DDL:
test=> CREATE TABLE node (
test(> id SERIAL,
test(> label TEXT NOT NULL, -- name of the node
test(> parent_id INT,
test(> PRIMARY KEY(id)
test(> );
NOTICE: CREATE TABLE will create implicit sequence "node_id_seq" for serial column "node.id"
NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "node_pkey" for table "node"
CREATE TABLE
...and some data...
test=> INSERT INTO node (label, parent_id) VALUES ('n1',NULL),('n2',1),('n3',2),('n4',3);
INSERT 0 4
test=> INSERT INTO node (label) VALUES ('garbage1'),('garbage2'), ('garbage3');
INSERT 0 3
test=> INSERT INTO node (label,parent_id) VALUES ('garbage4',6);
INSERT 0 1
test=> SELECT * FROM node;
id | label | parent_id
----+----------+-----------
1 | n1 |
2 | n2 | 1
3 | n3 | 2
4 | n4 | 3
5 | garbage1 |
6 | garbage2 |
7 | garbage3 |
8 | garbage4 | 6
(8 rows)
This performs a recursive query on every id in node:
test=> WITH RECURSIVE nodes_cte(id, label, parent_id, depth, path) AS (
SELECT tn.id, tn.label, tn.parent_id, 1::INT AS depth, tn.id::TEXT AS path
FROM node AS tn
WHERE tn.parent_id IS NULL
UNION ALL
SELECT c.id, c.label, c.parent_id, p.depth + 1 AS depth,
(p.path || '->' || c.id::TEXT)
FROM nodes_cte AS p, node AS c
WHERE c.parent_id = p.id
)
SELECT * FROM nodes_cte AS n ORDER BY n.id ASC;
id | label | parent_id | depth | path
----+----------+-----------+-------+------------
1 | n1 | | 1 | 1
2 | n2 | 1 | 2 | 1->2
3 | n3 | 2 | 3 | 1->2->3
4 | n4 | 3 | 4 | 1->2->3->4
5 | garbage1 | | 1 | 5
6 | garbage2 | | 1 | 6
7 | garbage3 | | 1 | 7
8 | garbage4 | 6 | 2 | 6->8
(8 rows)
This gets all of the descendents WHERE node.id = 1:
test=> WITH RECURSIVE nodes_cte(id, label, parent_id, depth, path) AS (
SELECT tn.id, tn.label, tn.parent_id, 1::INT AS depth, tn.id::TEXT AS path FROM node AS tn WHERE tn.id = 1
UNION ALL
SELECT c.id, c.label, c.parent_id, p.depth + 1 AS depth, (p.path || '->' || c.id::TEXT) FROM nodes_cte AS p, node AS c WHERE c.parent_id = p.id
)
SELECT * FROM nodes_cte AS n;
id | label | parent_id | depth | path
----+-------+-----------+-------+------------
1 | n1 | | 1 | 1
2 | n2 | 1 | 2 | 1->2
3 | n3 | 2 | 3 | 1->2->3
4 | n4 | 3 | 4 | 1->2->3->4
(4 rows)
The following will get the path of the node with id 4:
test=> WITH RECURSIVE nodes_cte(id, label, parent_id, depth, path) AS (
SELECT tn.id, tn.label, tn.parent_id, 1::INT AS depth, tn.id::TEXT AS path
FROM node AS tn
WHERE tn.parent_id IS NULL
UNION ALL
SELECT c.id, c.label, c.parent_id, p.depth + 1 AS depth,
(p.path || '->' || c.id::TEXT)
FROM nodes_cte AS p, node AS c
WHERE c.parent_id = p.id
)
SELECT * FROM nodes_cte AS n WHERE n.id = 4;
id | label | parent_id | depth | path
----+-------+-----------+-------+------------
4 | n4 | 3 | 4 | 1->2->3->4
(1 row)
And let's assume you want to limit your search to descendants with a depth less than three (note that depth hasn't been incremented yet):
test=> WITH RECURSIVE nodes_cte(id, label, parent_id, depth, path) AS (
SELECT tn.id, tn.label, tn.parent_id, 1::INT AS depth, tn.id::TEXT AS path
FROM node AS tn WHERE tn.id = 1
UNION ALL
SELECT c.id, c.label, c.parent_id, p.depth + 1 AS depth,
(p.path || '->' || c.id::TEXT)
FROM nodes_cte AS p, node AS c
WHERE c.parent_id = p.id AND p.depth < 2
)
SELECT * FROM nodes_cte AS n;
id | label | parent_id | depth | path
----+-------+-----------+-------+------
1 | n1 | | 1 | 1
2 | n2 | 1 | 2 | 1->2
(2 rows)
I'd recommend using an ARRAY data type instead of a string for demonstrating the "path", but the arrow is more illustrative of the parent<=>child relationship.
Use WITH RECURSIVE to create a Common Table Expression (CTE). For the non-recursive term, get the rows in which the child is immediately below the parent:
SELECT
c.child_id,
c.parent_id
FROM
mytable c
LEFT JOIN
mytable p ON c.parent_id = p.child_id
WHERE
p.child_id IS NULL
child_id | parent_id
----------+-----------
102 | 101
104 | 103
For the recursive term, you want the children of these children.
WITH RECURSIVE tree(child, root) AS (
SELECT
c.child_id,
c.parent_id
FROM
mytable c
LEFT JOIN
mytable p ON c.parent_id = p.child_id
WHERE
p.child_id IS NULL
UNION
SELECT
child_id,
root
FROM
tree
INNER JOIN
mytable on tree.child = mytable.parent_id
)
SELECT * FROM tree;
child | root
-------+------
102 | 101
104 | 103
105 | 103
106 | 103
You can filter the children when querying the CTE:
WITH RECURSIVE tree(child, root) AS (...) SELECT root FROM tree WHERE child = 106;
root
------
103

postgres counting one record twice if it meets certain criteria

I thought that the query below would naturally do what I explain, but apparently not...
My table looks like this:
id | name | g | partner | g2
1 | John | M | Sam | M
2 | Devon | M | Mike | M
3 | Kurt | M | Susan | F
4 | Stacy | F | Bob | M
5 | Rosa | F | Rita | F
I'm trying to get the id where either the g or g2 value equals 'M'... But, a record where both the g and g2 values are 'M' should return two lines, not 1.
So, in the above sample data, I'm trying to return:
$q = pg_query("SELECT id FROM mytable WHERE ( g = 'M' OR g2 = 'M' )");
1
1
2
2
3
4
But, it always returns:
1
2
3
4
Your query doesn't work because each row is returned only once whether it matches one or both of the conditions. To get what you want use two queries and use UNION ALL to combine the results:
SELECT id FROM mytable WHERE g = 'M'
UNION ALL
SELECT id FROM mytable WHERE g2 = 'M'
ORDER BY id
Result:
1
1
2
2
3
4
you might try a UNION along these lines:
"SELECT id FROM mytable WHERE ( g = 'M') UNION SELECT id FROM mytable WHERE ( g2 = 'M')"
Hope this helps, Martin
SELECT id FROM mytable WHERE g = 'M'
UNION
SELECT id FROM mytable WHERE g2 = 'M'