How to find in a many to many relation all the identical values in a column and join the table with other three tables? - postgresql

I have a many to many relation with three columns, (owner_id,property_id,ownership_perc) and for this table applies (many owners have many properties).
So I would like to find all the owner_id who has many properties (property_id) and connect them with other three tables (Table 1,3,4) in order to get further information for the requested result.
All the tables that I'm using are
Table 1: owner (id_owner,name)
Table 2: owner_property (owner_id,property_id,ownership_perc)
Table 3: property(id_property,building_id)
Table 4: building(id_building,address,region)
So, when I'm trying it like this, the query runs but it returns empty.
SELECT address,region,name
FROM owner_property
JOIN property ON owner_property.property_id = property.id_property
JOIN owner ON owner.id_owner = owner_property.owner_id
JOIN building ON property.building_id=building.id_building
GROUP BY owner_id,address,region,name
HAVING count(owner_id) > 1
ORDER BY owner_id;
Only when I'm trying the code below, it returns the owner_id who has many properties (see image below) but without joining it with the other three tables:
SELECT a.*
FROM owner_property a
JOIN (SELECT owner_id, COUNT(owner_id)
FROM owner_property
GROUP BY owner_id
HAVING COUNT(owner_id)>1) b
ON a.owner_id = b.owner_id
ORDER BY a.owner_id,property_id ASC;
So, is there any suggestion on what I'm doing wrong when I'm joining the tables? Thank you!

This query:
SELECT owner_id
FROM owner_property
GROUP BY owner_id
HAVING COUNT(property_id) > 1
returns all the owner_ids with more than 1 property_ids.
If there is a case of duplicates in the combination of owner_id and property_id then instead of COUNT(property_id) use COUNT(DISTINCT property_id) in the HAVING clause.
So join it to the other tables:
SELECT b.address, b.region, o.name
FROM (
SELECT owner_id
FROM owner_property
GROUP BY owner_id
HAVING COUNT(property_id) > 1
) t
INNER JOIN owner_property op ON op.owner_id = t.owner_id
INNER JOIN property p ON op.property_id = p.id_property
INNER JOIN owner o ON o.id_owner = op.owner_id
INNER JOIN building b ON p.building_id = b.id_building
ORDER BY op.owner_id, op.property_id ASC;
Always qualify the column names with the table name/alias.

You can try to use a correlated subquery that counts the ownerships with EXISTS in the WHERE clause.
SELECT b1.address,
b1.region,
o1.name
FROM owner_property op1
INNER JOIN owner o1
ON o1.id_owner = op1.owner_id
INNER JOIN property p1
ON p1.id_property = op1.property_id
INNER JOIN building b1
ON b1.id_building = p1.building_id
WHERE EXISTS (SELECT ''
FROM owner_property op2
WHERE op2.owner_id = op1.owner_id
HAVING count(*) > 1);

Related

select count from both tables using join in postgesql

how to find number of records in both table using join.
i have two tables table1 and table2 with same structure.
table1
id
item
1
A
1
B
1
C
2
A
2
B
table2
id
item
1
A
1
B
2
A
2
B
2
C
2
D
Output should be like this.
id
table1.itemcount
table2.itemcount
1
3
2
2
2
4
SELECT DISTINCT id, (
SELECT COUNT(*) FROM table1 AS table1_2 WHERE table1_2.id=table1.id
) AS "table1.itemcount", (
SELECT COUNT(*) FROM table2 AS table2_2 WHERE table2_2.id=table1.id
) AS "table2.itemcount"
FROM table1;
Assuming that each id is guaranteed to exist in both tables, the following would work
select
t1.id,
count(distinct t1.item) t1count,
count(distinct t2.item) t2count
from t1
join t2 on t1.id = t2.id
group by 1;
But if that is not guaranteed then we'll have to use full outer join to get unique ids from both tables
select
coalesce(t1.id, t2.id) id,
count(distinct t1.item) t1count,
count(distinct t2.item) t2count
from t1
full outer join t2 on t1.id = t2.id
group by 1;
We're using coalesce here as well for id because if it only exists in t2, t1.id would result in null.
#DeeStark's answer also works if ids are guaranteed to be in both tables but it's quite inefficient because count is essentially run twice for every distinct id in the table. Here's the fiddle where you can test out different approaches. I've prefixed each query with explain which shows the cost
Hope this helps

How to get unique rows by one column but sort by the second

There is an example request in which there are several joins.
SELECT DISTINCT ON(a.id_1) 1, a.name, b.task, c.created_at
FROM a
INNER JOIN b ON a.id_2 = b.id
INNER JOIN c ON a.ID_2 = c.id
WHERE a.deleted_at IS NULL
ORDER BY a.id_1 desc
In this case, the query will work, sorting by unique values ​​of id_1 will take place. But I need to sort by the column a.name. In this case, postresql will swear with the words ERROR: SELECT DISTINCT ON expressions must match initial ORDER BY expressions.
The following query can serve as a solution to the problem:
SELECT *
FROM(
SELECT DISTINCT ON(a.id_1) a.name, b.task, c.created_at
FROM a
INNER JOIN b ON a.id_2 = b.id
INNER JOIN c ON a.ID_2 = c.id
WHERE a.deleted_at IS NULL
)
ORDER_BY a.name desc
But in reality the database is very large and such a query is not optimal. Are there other ways to sort by the selected column while keeping one uniqueness?

Can't solve this SQL query

I have a difficulty dealing with a SQL query. I use PostgreSQL.
The query says: Show the customers that have done at least an order that contains products from 3 different categories. The result will be 2 columns, CustomerID, and the amount of orders. I have written this code but I don't think it's correct.
select SalesOrderHeader.CustomerID,
count(SalesOrderHeader.SalesOrderID) AS amount_of_orders
from SalesOrderHeader
inner join SalesOrderDetail on
(SalesOrderHeader.SalesOrderID=SalesOrderDetail.SalesOrderID)
inner join Product on
(SalesOrderDetail.ProductID=Product.ProductID)
where SalesOrderDetail.SalesOrderDetailID in
(select DISTINCT count(ProductCategoryID)
from Product
group by ProductCategoryID
having count(DISTINCT ProductCategoryID)>=3)
group by SalesOrderHeader.CustomerID;
Here are the database tables needed for the query:
where SalesOrderDetail.SalesOrderDetailID in
(select DISTINCT count(ProductCategoryID)
Is never going to give you a result as an ID (SalesOrderDetailID) will never logically match a COUNT (count(ProductCategoryID)).
This should get you the output I think you want.
SELECT soh.CustomerID, COUNT(soh.SalesOrderID) AS amount_of_orders
FROM SalesOrderHeader soh
INNER JOIN SalesOrderDetail sod ON soh.SalesOrderID = sod.SalesOrderID
INNER JOIN Product p ON sod.ProductID = p.ProductID
HAVING COUNT(DISTINCT p.ProductCategoryID) >= 3
GROUP BY soh.CustomerID
Try this :
select CustomerID,count(*) as amount_of_order from
SalesOrder join
(
select SalesOrderID,count(distinct ProductCategoryID) CategoryCount
from SalesOrderDetail JOIN Product using (ProductId)
group by 1
) CatCount using (SalesOrderId)
group by 1
having bool_or(CategoryCount>=3) -- At least on CategoryCount>=3

Has-Many-Through: How to select records with no relation OR by some condition in relation?

There are three tables: businesses, categories, categorizations,
CREATE TABLE businesses (
id SERIAL PRIMARY KEY,
name varchar(40)
);
CREATE TABLE categories (
id SERIAL PRIMARY KEY,
name varchar(40)
);
CREATE TABLE categorizations (
business_id integer,
category_id integer
);
So business has many categories through categorizations.
If I want to select businesses without categories, I would do something
like this:
SELECT businesses.* FROM businesses
LEFT OUTER JOIN categorizations
ON categorizations.business_id = businesses.id
LEFT OUTER JOIN categories
ON categories.id = categorizations.category_id
GROUP BY businesses.id
HAVING count(categories.id) = 0;
The question is: How do I select businesses without categories AND
businesses with category named "Media" in one query?
You can use a union:
SELECT businesses.*
FROM businesses
LEFT OUTER JOIN categorizations
ON categorizations.business_id = businesses.id
GROUP BY businesses.id
HAVING count(categorizations.business_id) = 0
UNION
SELECT businesses.*
FROM businesses
INNER JOIN categorizations
ON categorizations.business_id = businesses.id
INNER JOIN categories
ON categories.id = categorizations.category_id
WHERE categories.name = 'Media';
Note that in the first instance (businesses with no categories at all) that you won't need to join as far as categories - you can detect the lack of category in the junction table. If it is possible for the same business to have the same category more than once, you'll need to introduce the second query with DISTINCT.
I would try:
SELECT b.* FROM businesses b
LEFT JOIN categorizations cz ON b.business_id = cz.business_id
LEFT JOIN categories cs ON cz.category_id = cs.category_id
WHERE COALESCE(cs.name, 'Media') = 'Media';
... in the hope that businesses with no categorizations would get NULL entries on their joins.
The double-negation trick works for this kind of selections:
SELECT * FROM businesses b
WHERE NOT EXISTS (
SELECT *
FROM categorizations bc
JOIN categories c ON bc.category_id = c.category_id
WHERE bc.business_id = b.business_id
AND c.name <> 'Media'
);

How to get the top most parent in PostgreSQL

I have a tree structure table with columns:
id,parent,name.
Given a tree A->B->C,
how could i get the most top parent A's ID according to C's ID?
Especially how to write SQL with "with recursive"?
Thanks!
WITH RECURSIVE q AS
(
SELECT m
FROM mytable m
WHERE id = 'C'
UNION ALL
SELECT m
FROM q
JOIN mytable m
ON m.id = q.parent
)
SELECT (m).*
FROM q
WHERE (m).parent IS NULL
To implement recursive queries, you need a Common Table Expression (CTE).
This query computes ancestors of all parent nodes. Since we want just the top level, we select where level=0.
WITH RECURSIVE Ancestors AS
(
SELECT id, parent, 0 AS level FROM YourTable WHERE parent IS NULL
UNION ALL
SELECT child.id, child.parent, level+1 FROM YourTable child INNER JOIN
Ancestors p ON p.id=child.parent
)
SELECT * FROM Ancestors WHERE a.level=0 AND a.id=C
If you want to fetch all your data, then use an inner join on the id, e.g.
SELECT YourTable.* FROM Ancestors a WHERE a.level=0 AND a.id=C
INNER JOIN YourTable ON YourTable.id = a.id
Assuming a table named "organization" with properties id, name, and parent_organization_id, here is what worked for me to get a list that included top level and parent level org ID's for each level.
WITH RECURSIVE orgs AS (
SELECT
o.id as top_org_id
,null::bigint as parent_org_id
,o.id as org_id
,o.name
,0 AS relative_depth
FROM organization o
UNION
SELECT
allorgs.top_org_id
,childorg.parent_organization_id
,childorg.id
,childorg.name
,allorgs.relative_depth + 1
FROM organization childorg
INNER JOIN orgs allorgs ON allorgs.org_id = childorg.parent_organization_id
) SELECT
*
FROM
orgs order by 1,5;