Has-Many-Through: How to select records with no relation OR by some condition in relation? - postgresql

There are three tables: businesses, categories, and categorizations:
CREATE TABLE businesses (
id SERIAL PRIMARY KEY,
name varchar(40)
);
CREATE TABLE categories (
id SERIAL PRIMARY KEY,
name varchar(40)
);
CREATE TABLE categorizations (
business_id integer,
category_id integer
);
So a business has many categories through categorizations.
If I want to select businesses without categories, I would do something
like this:
SELECT businesses.* FROM businesses
LEFT OUTER JOIN categorizations
ON categorizations.business_id = businesses.id
LEFT OUTER JOIN categories
ON categories.id = categorizations.category_id
GROUP BY businesses.id
HAVING count(categories.id) = 0;
The question is: How do I select businesses without categories AND
businesses with category named "Media" in one query?

You can use a union:
SELECT businesses.*
FROM businesses
LEFT OUTER JOIN categorizations
ON categorizations.business_id = businesses.id
GROUP BY businesses.id
HAVING count(categorizations.business_id) = 0
UNION
SELECT businesses.*
FROM businesses
INNER JOIN categorizations
ON categorizations.business_id = businesses.id
INNER JOIN categories
ON categories.id = categorizations.category_id
WHERE categories.name = 'Media';
Note that in the first query (businesses with no categories at all) you don't need to join as far as categories: you can detect the lack of a category in the junction table. Because UNION (unlike UNION ALL) removes duplicate rows, a business that has the same category more than once still appears only once, so no extra DISTINCT is needed.
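The union approach can be checked on a small data set. The following is a minimal sketch, assuming an in-memory SQLite database as a stand-in for PostgreSQL; the sample businesses and categories are invented for illustration. One business is deliberately categorized as 'Media' twice to show how the combined result treats repeated rows.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for PostgreSQL; all rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE businesses (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE categories (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE categorizations (business_id INTEGER, category_id INTEGER);
INSERT INTO businesses VALUES (1, 'Acme'), (2, 'Daily News'), (3, 'Bakery');
INSERT INTO categories VALUES (1, 'Media'), (2, 'Food');
-- Acme: no categories. Daily News: 'Media' twice. Bakery: 'Food'.
INSERT INTO categorizations VALUES (2, 1), (2, 1), (3, 2);
""")

union_rows = conn.execute("""
SELECT b.id, b.name
FROM businesses b
LEFT JOIN categorizations cz ON cz.business_id = b.id
GROUP BY b.id
HAVING count(cz.business_id) = 0
UNION
SELECT b.id, b.name
FROM businesses b
JOIN categorizations cz ON cz.business_id = b.id
JOIN categories c ON c.id = cz.category_id
WHERE c.name = 'Media'
ORDER BY 1
""").fetchall()

print(union_rows)  # [(1, 'Acme'), (2, 'Daily News')]
```

'Daily News' is listed once even though it matches the second branch twice, because UNION deduplicates the combined rows.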

I would try:
SELECT b.* FROM businesses b
LEFT JOIN categorizations cz ON b.id = cz.business_id
LEFT JOIN categories cs ON cz.category_id = cs.id
WHERE COALESCE(cs.name, 'Media') = 'Media';
... in the hope that businesses with no categorizations would get NULL entries on their joins.

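The double-negation trick works for this kind of selection (it returns businesses that have no category other than 'Media', so a business with 'Media' plus another category is excluded):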
SELECT * FROM businesses b
WHERE NOT EXISTS (
SELECT *
FROM categorizations bc
JOIN categories c ON bc.category_id = c.id
WHERE bc.business_id = b.id
AND c.name <> 'Media'
);
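The NOT EXISTS form answers a slightly different question than "no categories OR has 'Media'", which matters for businesses with several categories. Here is a rough sketch comparing the two readings, again assuming in-memory SQLite in place of PostgreSQL and invented sample rows ('Mixed Co' is hypothetical and has both 'Media' and 'Food').

```python
import sqlite3

# Assumption: in-memory SQLite stands in for PostgreSQL; all rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE businesses (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE categories (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE categorizations (business_id INTEGER, category_id INTEGER);
INSERT INTO businesses VALUES
  (1, 'Acme'), (2, 'Daily News'), (3, 'Bakery'), (4, 'Mixed Co');
INSERT INTO categories VALUES (1, 'Media'), (2, 'Food');
INSERT INTO categorizations VALUES (2, 1), (3, 2), (4, 1), (4, 2);
""")

# Double negation: no category that is NOT 'Media'.
only_media_or_none = conn.execute("""
SELECT b.id, b.name FROM businesses b
WHERE NOT EXISTS (
  SELECT 1 FROM categorizations bc
  JOIN categories c ON c.id = bc.category_id
  WHERE bc.business_id = b.id AND c.name <> 'Media')
ORDER BY b.id
""").fetchall()

# Literal reading: no categories at all, OR at least one 'Media' category.
media_or_none = conn.execute("""
SELECT b.id, b.name FROM businesses b
WHERE NOT EXISTS (
        SELECT 1 FROM categorizations bc WHERE bc.business_id = b.id)
   OR EXISTS (
        SELECT 1 FROM categorizations bc
        JOIN categories c ON c.id = bc.category_id
        WHERE bc.business_id = b.id AND c.name = 'Media')
ORDER BY b.id
""").fetchall()

print(only_media_or_none)  # [(1, 'Acme'), (2, 'Daily News')]
print(media_or_none)       # [(1, 'Acme'), (2, 'Daily News'), (4, 'Mixed Co')]
```

Which variant is right depends on whether a business with 'Media' plus other categories should be returned.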

Related

How to find in a many to many relation all the identical values in a column and join the table with other three tables?

I have a many-to-many relation with three columns (owner_id, property_id, ownership_perc): many owners have many properties.
I would like to find every owner_id that has more than one property (property_id) and join those rows with three other tables (Tables 1, 3 and 4) to get further information for the result.
The tables I'm using are:
Table 1: owner (id_owner,name)
Table 2: owner_property (owner_id,property_id,ownership_perc)
Table 3: property(id_property,building_id)
Table 4: building(id_building,address,region)
When I try it like this, the query runs but returns no rows.
SELECT address,region,name
FROM owner_property
JOIN property ON owner_property.property_id = property.id_property
JOIN owner ON owner.id_owner = owner_property.owner_id
JOIN building ON property.building_id=building.id_building
GROUP BY owner_id,address,region,name
HAVING count(owner_id) > 1
ORDER BY owner_id;
Only when I try the code below does it return the owner_ids that have many properties, but without joining the other three tables:
SELECT a.*
FROM owner_property a
JOIN (SELECT owner_id, COUNT(owner_id)
FROM owner_property
GROUP BY owner_id
HAVING COUNT(owner_id)>1) b
ON a.owner_id = b.owner_id
ORDER BY a.owner_id,property_id ASC;
So, is there any suggestion on what I'm doing wrong when I'm joining the tables? Thank you!
This query:
SELECT owner_id
FROM owner_property
GROUP BY owner_id
HAVING COUNT(property_id) > 1
returns every owner_id with more than one property_id.
If there is a case of duplicates in the combination of owner_id and property_id then instead of COUNT(property_id) use COUNT(DISTINCT property_id) in the HAVING clause.
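The difference between the two HAVING conditions only shows up when the same (owner_id, property_id) pair is stored more than once. A small sketch, assuming in-memory SQLite as a stand-in and made-up rows in which owner 1 holds the same property twice:

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the real database; rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE owner_property (owner_id INTEGER, property_id INTEGER, ownership_perc INTEGER);
-- Owner 1: property 10 listed twice. Owner 2: two different properties. Owner 3: one.
INSERT INTO owner_property VALUES
  (1, 10, 50), (1, 10, 50), (2, 20, 100), (2, 21, 100), (3, 30, 100);
""")

plain_count = conn.execute("""
SELECT owner_id FROM owner_property
GROUP BY owner_id
HAVING COUNT(property_id) > 1
ORDER BY owner_id
""").fetchall()

distinct_count = conn.execute("""
SELECT owner_id FROM owner_property
GROUP BY owner_id
HAVING COUNT(DISTINCT property_id) > 1
ORDER BY owner_id
""").fetchall()

print(plain_count)     # [(1,), (2,)]
print(distinct_count)  # [(2,)]
```

Owner 1 passes the plain COUNT only because of the duplicated row; COUNT(DISTINCT property_id) counts that property once.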
So join it to the other tables:
SELECT b.address, b.region, o.name
FROM (
SELECT owner_id
FROM owner_property
GROUP BY owner_id
HAVING COUNT(property_id) > 1
) t
INNER JOIN owner_property op ON op.owner_id = t.owner_id
INNER JOIN property p ON op.property_id = p.id_property
INNER JOIN owner o ON o.id_owner = op.owner_id
INNER JOIN building b ON p.building_id = b.id_building
ORDER BY op.owner_id, op.property_id ASC;
Always qualify the column names with the table name/alias.
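To see why the original attempt came back empty, it helps to run both queries on a tiny data set. In the sketch below (in-memory SQLite as a stand-in, invented owners and buildings), grouping by owner_id together with address, region and name puts each of an owner's properties in its own group whenever they are in different buildings, so no group ever reaches a count above 1. Filtering the owners first and joining afterwards avoids that.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the real database; rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE owner (id_owner INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE building (id_building INTEGER PRIMARY KEY, address TEXT, region TEXT);
CREATE TABLE property (id_property INTEGER PRIMARY KEY, building_id INTEGER);
CREATE TABLE owner_property (owner_id INTEGER, property_id INTEGER, ownership_perc INTEGER);
INSERT INTO owner VALUES (1, 'Ann'), (2, 'Bob');
INSERT INTO building VALUES (1, '1 Main St', 'North'), (2, '2 Oak Ave', 'South');
INSERT INTO property VALUES (100, 1), (101, 2), (102, 1);
-- Ann owns two properties in different buildings; Bob owns one.
INSERT INTO owner_property VALUES (1, 100, 100), (1, 101, 100), (2, 102, 100);
""")

# Original attempt: every group is a single (owner, address) row.
grouped = conn.execute("""
SELECT address, region, name
FROM owner_property
JOIN property ON owner_property.property_id = property.id_property
JOIN owner ON owner.id_owner = owner_property.owner_id
JOIN building ON property.building_id = building.id_building
GROUP BY owner_id, address, region, name
HAVING count(owner_id) > 1
""").fetchall()

# Filter owners first, then join for the details.
filtered_first = conn.execute("""
SELECT b.address, b.region, o.name
FROM (SELECT owner_id FROM owner_property
      GROUP BY owner_id HAVING COUNT(property_id) > 1) t
JOIN owner_property op ON op.owner_id = t.owner_id
JOIN property p ON op.property_id = p.id_property
JOIN owner o ON o.id_owner = op.owner_id
JOIN building b ON p.building_id = b.id_building
ORDER BY op.owner_id, op.property_id
""").fetchall()

print(grouped)         # []
print(filtered_first)  # [('1 Main St', 'North', 'Ann'), ('2 Oak Ave', 'South', 'Ann')]
```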
You can try to use a correlated subquery that counts the ownerships with EXISTS in the WHERE clause.
SELECT b1.address,
b1.region,
o1.name
FROM owner_property op1
INNER JOIN owner o1
ON o1.id_owner = op1.owner_id
INNER JOIN property p1
ON p1.id_property = op1.property_id
INNER JOIN building b1
ON b1.id_building = p1.building_id
WHERE EXISTS (SELECT ''
FROM owner_property op2
WHERE op2.owner_id = op1.owner_id
HAVING count(*) > 1);

SQL query involving specific count with distinct

I have these tables:
person (id primary key, name)
money (acct primary key, loaner)
loan (id primary key, acct)
How would I create a SQL query that shows, for each loaner, the names of the persons who took more than four loans from that specific loaner? I also want the persons he loaned to to be different from each other.
SELECT
p.id, p.name, m.loaner, COUNT(*)
FROM
person p
INNER JOIN
loan l ON p.id = l.id
INNER JOIN
money m ON l.acct = m.acct
GROUP BY
p.id, p.name, m.loaner
HAVING
COUNT(*) = 4
With this query you can find the first part of the question - what should I add?
I would try this out and see what happens :D
SELECT DISTINCT *
FROM person AS p_loaner_detailed
WHERE p_loaner_detailed.id IN (
SELECT loanerId
FROM (
SELECT p.id, p.name, m.loaner AS loanerId,
COUNT(*) AS loan_count
FROM person p
INNER JOIN loan l
ON p.id = l.id
INNER JOIN money m
ON l.acct = m.acct
GROUP BY p.id, p.name, m.loaner
HAVING COUNT(*) > 4
) AS heavy_borrowers
);

Joined aggregate function update?

I'm trying to denormalize an aggregate bit of data for performance, but can't figure out how to get the aggregation working...
CREATE TABLE brands (
id SERIAL,
name TEXT,
total INTEGER,
unitcount INTEGER
);
CREATE TABLE items (
brandid INTEGER,
id SERIAL,
unitvalue INTEGER
);
UPDATE brands SET b.total = i.sumScore,
b.unitcount = i.unitcount
FROM brands b
INNER JOIN
(
SELECT brandid,
SUM(unitvalue) sumScore,
COUNT(unitvalue) unitcount
FROM items
group by brandid
) as i
ON i.brandid = b.id
This updates EVERY record in brand with the same values, despite the inner join query showing a correct table set of distinct values for each brand. How can I get that correlated?
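Please try PostgreSQL's UPDATE ... FROM form. The target table appears only once, and the columns in SET are not qualified with an alias:
UPDATE brands b
SET total = abc.value_to_update,
unitcount = abc.cnt
FROM (
SELECT brandid, SUM(unitvalue) AS value_to_update, COUNT(unitvalue) AS cnt
FROM items
GROUP BY brandid) abc
WHERE b.id = abc.brandid;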
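A portable alternative to a joined update is a pair of correlated subqueries, each tied to the row being updated through brands.id. The sketch below assumes in-memory SQLite as a stand-in for PostgreSQL and uses invented rows; brand 'C' has no items on purpose. Note one behavioural difference: this form also writes 0 to brands without items, whereas a joined update would leave those rows untouched.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for PostgreSQL; all rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE brands (id INTEGER PRIMARY KEY, name TEXT, total INTEGER, unitcount INTEGER);
CREATE TABLE items (brandid INTEGER, id INTEGER PRIMARY KEY, unitvalue INTEGER);
INSERT INTO brands (id, name) VALUES (1, 'A'), (2, 'B'), (3, 'C');
INSERT INTO items (brandid, id, unitvalue) VALUES (1, 1, 10), (1, 2, 20), (2, 3, 5);
""")

# Each subquery is correlated on brands.id, so every brand gets its own values.
conn.execute("""
UPDATE brands
SET total = (SELECT COALESCE(SUM(i.unitvalue), 0)
             FROM items i WHERE i.brandid = brands.id),
    unitcount = (SELECT COUNT(i.unitvalue)
                 FROM items i WHERE i.brandid = brands.id)
""")

brand_rows = conn.execute(
    "SELECT id, total, unitcount FROM brands ORDER BY id"
).fetchall()
print(brand_rows)  # [(1, 30, 2), (2, 5, 1), (3, 0, 0)]
```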

Removing duplicate rows from relation

I have the following code which produces a relation:
SELECT book_id, shipments.customer_id
FROM shipments
LEFT JOIN editions ON (shipments.isbn = editions.isbn)
LEFT JOIN customers ON (shipments.customer_id = customers.customer_id)
In this relation, there are customer_ids as well as the book_ids of the books they have bought. My goal is to create a relation listing each book along with how many unique customers bought it. I assume one way to achieve this is to eliminate all duplicate rows in the relation and then count the instances of each book_id.
So my question is: How can I delete all duplicate rows from this relation?
Thanks!
EDIT: So what I mean is that I want all the rows in the relation to be unique. If there are three identical rows for example, two of them should be removed.
This will give you all the {customer,edition} pairs for which an order exists:
SELECT *
FROM customers c
JOIN editions e ON EXISTS (
SELECT * FROM shipments s
WHERE s.isbn = e.isbn
AND s.customer_id = c.customer_id
);
The duplicates are in table shipments. You can remove these with a DISTINCT clause and then count them in an outer query GROUP BY isbn:
SELECT isbn, count(customer_id) AS unique_buyers
FROM (
SELECT DISTINCT isbn, customer_id FROM shipments) book_buyer
GROUP BY isbn;
If you want a list of all books, even where no purchases were made, you should LEFT JOIN the above to the list of all books:
SELECT isbn, coalesce(unique_buyers, 0) AS books_sold_to_unique_buyers
FROM editions
LEFT JOIN (
SELECT isbn, count(customer_id) AS unique_buyers
FROM (
SELECT DISTINCT isbn, customer_id FROM shipments) book_buyer
GROUP BY isbn) books_bought USING (isbn)
ORDER BY isbn;
You can write this more succinctly by joining before counting:
SELECT isbn, count(customer_id) AS books_sold_to_unique_buyers
FROM editions
LEFT JOIN (
SELECT DISTINCT isbn, customer_id FROM shipments) book_buyer USING (isbn)
GROUP BY isbn
ORDER BY isbn;
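Another way to get the same counts, offered here as an alternative rather than as part of the answer above, is count(DISTINCT customer_id) directly on the joined rows. The sketch below runs both forms against in-memory SQLite (a stand-in for PostgreSQL) with made-up shipments, where customer 1 bought book 'A' twice and book 'C' was never shipped.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for PostgreSQL; all rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE editions (isbn TEXT PRIMARY KEY);
CREATE TABLE shipments (isbn TEXT, customer_id INTEGER);
INSERT INTO editions VALUES ('A'), ('B'), ('C');
INSERT INTO shipments VALUES ('A', 1), ('A', 1), ('A', 2), ('B', 3);
""")

# De-duplicate in a subquery, then count (the form used in the answer).
via_subquery = conn.execute("""
SELECT isbn, count(customer_id) AS books_sold_to_unique_buyers
FROM editions
LEFT JOIN (
  SELECT DISTINCT isbn, customer_id FROM shipments) book_buyer USING (isbn)
GROUP BY isbn
ORDER BY isbn
""").fetchall()

# Count distinct customers directly on the joined rows.
via_count_distinct = conn.execute("""
SELECT e.isbn, count(DISTINCT s.customer_id) AS books_sold_to_unique_buyers
FROM editions e
LEFT JOIN shipments s ON s.isbn = e.isbn
GROUP BY e.isbn
ORDER BY e.isbn
""").fetchall()

print(via_count_distinct)  # [('A', 2), ('B', 1), ('C', 0)]
```

Both queries return the same rows on this data; the unshipped book 'C' appears with a count of 0 because count ignores the NULL produced by the LEFT JOIN.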

Postgres get rows which hasnt match in other table

I need your help with an advanced query against my database. Here is the relevant part of the schema:
Place (id, name, address)
Local (id, place_id, name)
PlaceReservation(id, local_id, date)
Media_Place (id, place_id, type)
Now I need a query that gets all places with a logo that have AT LEAST ONE local which hasn't been reserved on a specific day, e.g. 2015-07-01.
Please help, because I have no idea how to do it. I thought about an outer join but I don't know how to use it.
I tried:
$query = 'SELECT DISTINCT *,
(SELECT sum(po.rating)/count(po.id)
FROM "Place_Opinion" po
WHERE po.place_id = p.id AND po.deleted = false) AS rating,
mp.path as logo_path
FROM "Place" p
INNER JOIN "Media_Place" mp ON mp.place_id = p.id
JOIN Local ON Local.place_id = Place.id
LEFT JOIN (
SELECT id AS rr, local_id
FROM PlaceReservation
WHERE date_start = \'2015-07-01\') Reserved ON Reserved.local_id = Local.id
WHERE mp.type = ' . Model_Row_MediaPlace::LOGO_TYPE . '
AND mp.deleted = false
AND p.deleted = false
AND rr IS NULL';
Looking for things that do not exist in a database can be inefficient. But you can turn the logic around: find the locals that do have a booking for the specified date, LEFT JOIN that to all places with a logo, and filter out the records with a reservation:
SELECT DISTINCT p.*, po.rating, mp.path as logo_path
FROM "Place" p
JOIN "Media_Place" mp ON mp.place_id = p.id AND mp.deleted = false AND mp.type = ?
JOIN Local ON Local.place_id = p.id
LEFT JOIN (
SELECT id AS rr, local_id
FROM PlaceReservation
WHERE date_start = '2015-07-01') reserved ON reserved.local_id = Local.id
LEFT JOIN (
SELECT place_id, avg(rating) AS rating
FROM "Place_Opinion"
WHERE deleted = false
GROUP BY place_id) po ON po.place_id = p.id
WHERE p.deleted = false
AND reserved.rr IS NULL;
The average rating per place is calculated in a separate sub-query. The error you had was because you referenced the "Place" table (p.id) before it was defined. For simple columns you can do that, but for sub-queries you can't.
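The core of the query, the LEFT JOIN against that day's reservations followed by an IS NULL filter, can be tried in isolation. This is a simplified sketch: it assumes in-memory SQLite instead of PostgreSQL, leaves out the logo and rating parts, and uses hypothetical table names (place, locals, reservations) and invented rows.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for PostgreSQL; the simplified
# tables and all rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE place (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE locals (id INTEGER PRIMARY KEY, place_id INTEGER, name TEXT);
CREATE TABLE reservations (id INTEGER PRIMARY KEY, local_id INTEGER, date_start TEXT);
INSERT INTO place VALUES (1, 'Cafe'), (2, 'Club'), (3, 'Hall');
INSERT INTO locals VALUES
  (1, 1, 'Room A'), (2, 1, 'Room B'), (3, 2, 'Stage'), (4, 3, 'Main');
-- Cafe: Room A booked, Room B free. Club: only local booked.
-- Hall: booked, but on a different day.
INSERT INTO reservations VALUES
  (1, 1, '2015-07-01'), (2, 3, '2015-07-01'), (3, 4, '2015-07-02');
""")

# Keep a local only if no reservation row matched it for the given day;
# DISTINCT collapses places that have several free locals.
free_places = conn.execute("""
SELECT DISTINCT p.id, p.name
FROM place p
JOIN locals l ON l.place_id = p.id
LEFT JOIN reservations r
       ON r.local_id = l.id AND r.date_start = ?
WHERE r.id IS NULL
ORDER BY p.id
""", ("2015-07-01",)).fetchall()

print(free_places)  # [(1, 'Cafe'), (3, 'Hall')]
```

The date condition sits in the ON clause rather than in WHERE, so locals without a matching reservation still survive the join with NULLs on the reservation side.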