Find a difference between 2 tables - postgresql

I want to check that the poi_equipement table (a relationship table) matches the data in the data table, checking in both directions.
https://dbfiddle.uk/gFMjbIpX
The check should detect that wc (in poi_equipement) is extra because it is not present in the data table, and that hotel is missing from poi_equipement even though it is in the data table.
I don't understand why the EXCEPT query below returns only hotel.
I want it to return both hotel and wc.
select object from data where subject = 'url1'
except
select subject from poi_equipement inner join equipement on poi_equipement.equipement_id = equipement.id;
Ideally, I want to know whether a difference is in poi_equipement, in data, or in both tables.

A full outer join will do
with params as (
select 'url1' as subject),
data_object as (
select d.object
from data d
join params prm
on d.subject = prm.subject),
equipment_subject as (
select e.subject
from poi_equipement pe
join poi p
on pe.poi_id = p.id
join equipement e
on pe.equipement_id = e.id
join params prm
on p.id_url = prm.subject)
select d.object as data,
e.subject as poi_equipment
from data_object d
full outer
join equipment_subject e
on d.object = e.subject
where d.object is null
or e.subject is null;
Result:
data |poi_equipment|
-----+-------------+
hotel|             |
     |wc           |
Remove the WHERE clause if you also want to see the items that are present in both places.
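As for why the original query returns only hotel: EXCEPT is one-directional, so it only reports rows of the first query that are missing from the second. A minimal sketch of running it in both directions follows, using Python's sqlite3 module so it can be run without a server. The table layout and sample rows are assumptions reconstructed from the queries above, not the contents of the fiddle.

```python
import sqlite3

# Assumed schema and sample rows, modelled on the queries in this thread.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE data (subject TEXT, object TEXT);
CREATE TABLE poi (id INTEGER PRIMARY KEY, id_url TEXT);
CREATE TABLE equipement (id INTEGER PRIMARY KEY, subject TEXT);
CREATE TABLE poi_equipement (poi_id INTEGER, equipement_id INTEGER);

INSERT INTO data VALUES ('url1', 'hotel'), ('url1', 'parking');
INSERT INTO poi VALUES (1, 'url1');
INSERT INTO equipement VALUES (1, 'wc'), (2, 'parking'), (3, 'hotel');
INSERT INTO poi_equipement VALUES (1, 1), (1, 2);
""")

# Each inner query is a one-way EXCEPT; UNION ALL stacks the two directions
# and the "side" column records where the unmatched item was found.
two_way_diff = """
SELECT 'data only' AS side, item
FROM (
    SELECT object AS item FROM data WHERE subject = :url
    EXCEPT
    SELECT e.subject
    FROM poi_equipement pe
    JOIN poi p ON p.id = pe.poi_id
    JOIN equipement e ON e.id = pe.equipement_id
    WHERE p.id_url = :url
)
UNION ALL
SELECT 'poi_equipement only' AS side, item
FROM (
    SELECT e.subject AS item
    FROM poi_equipement pe
    JOIN poi p ON p.id = pe.poi_id
    JOIN equipement e ON e.id = pe.equipement_id
    WHERE p.id_url = :url
    EXCEPT
    SELECT object FROM data WHERE subject = :url
)
"""

diff_rows = sorted(conn.execute(two_way_diff, {"url": "url1"}).fetchall())
for side, item in diff_rows:
    print(side, item)
```

With these sample rows, hotel shows up as "data only" and wc as "poi_equipement only", which is the same information the full outer join gives.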

Related

Left [Outer] Join without Intersection to get count

I have the following schema
I want to get all cars, along with the number of models for each car and the number of remaining colors for each car.
I was able to get the number of models, but I am not able to get the number of remaining colors. I know I have to do a Left [Outer] Join without Intersection, but it's not working.
A model may also have no colors at all, in which case there won't be any entry for it in the ModelColors table.
select
c.CarID,
c.CarName,
T1.[Num Of Models],
T2.[Remaining Colors]
from Cars c
LEFT JOIN
(
SELECT m.CarID, COUNT(1) AS 'Num Of Models'
FROM Models m
GROUP BY m.CarID
) AS T1 ON T1.CarID = c.CarID
LEFT JOIN
(
SELECT m1.CarID, COUNT(1) AS 'Remaining Colors'
FROM Colors col
LEFT JOIN ModelColors mc on mc.ColorID = col.ColorID
LEFT JOIN Models m1 on m1.ModelID = mc.ModelID
WHERE mc.ColorID IS NULL
GROUP BY m1.CarID
) AS T2 ON T2.CarID = c.CarID
The FROM/JOIN clause in your second derived table (T2) is wrong.
You should use Models and ModelColors only:
SELECT m.CarID, COUNT(1) AS 'Remaining Colors'
FROM Models m
LEFT JOIN ModelColors mc
ON m.ModelID = mc.ModelID
GROUP BY m.CarID
The entire query should look like this:
SELECT
c.CarID,
c.CarName,
T1.[Num Of Models],
T2.[Remaining Colors]
FROM Cars c
LEFT JOIN
(
SELECT m.CarID, COUNT(1) AS 'Num Of Models'
FROM Models m
GROUP BY m.CarID
) AS T1 ON T1.CarID = c.CarID
LEFT JOIN
(
SELECT m.CarID, COUNT(1) AS 'Remaining Colors'
FROM Models m
LEFT JOIN ModelColors mc
ON m.ModelID = mc.ModelID
GROUP BY m.CarID
) AS T2 ON T2.CarID = c.CarID
Since you only want to count the colors, you don't need the Colors table at all for this query.
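If "remaining colors" instead means the colors that no model of the car uses yet, the Colors table is needed after all. The schema image is not available here, so the sketch below rests on an assumed layout (Cars, Models, Colors, ModelColors) and on that assumed reading of the question. It pairs every car with every color, left joins the colors actually used, and counts the pairs that found no match. It runs on SQLite through Python's sqlite3 module, so the syntax differs slightly from T-SQL.

```python
import sqlite3

# Assumed schema and sample rows; the original schema image is not available.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Cars (CarID INTEGER PRIMARY KEY, CarName TEXT);
CREATE TABLE Models (ModelID INTEGER PRIMARY KEY, CarID INTEGER);
CREATE TABLE Colors (ColorID INTEGER PRIMARY KEY, ColorName TEXT);
CREATE TABLE ModelColors (ModelID INTEGER, ColorID INTEGER);

INSERT INTO Cars VALUES (1, 'Alpha'), (2, 'Beta');
INSERT INTO Models VALUES (10, 1), (11, 1), (20, 2);
INSERT INTO Colors VALUES (1, 'red'), (2, 'blue'), (3, 'green');
-- Model 20 has no colors, so it has no ModelColors rows at all.
INSERT INTO ModelColors VALUES (10, 1), (11, 2);
""")

remaining_sql = """
SELECT c.CarID,
       c.CarName,
       (SELECT COUNT(*) FROM Models m WHERE m.CarID = c.CarID) AS NumModels,
       -- A NULL on the right side of the LEFT JOIN means the color is unused.
       SUM(CASE WHEN used.ColorID IS NULL THEN 1 ELSE 0 END) AS RemainingColors
FROM Cars c
CROSS JOIN Colors col
LEFT JOIN (
    SELECT DISTINCT m.CarID, mc.ColorID
    FROM Models m
    JOIN ModelColors mc ON mc.ModelID = m.ModelID
) used ON used.CarID = c.CarID AND used.ColorID = col.ColorID
GROUP BY c.CarID, c.CarName
ORDER BY c.CarID
"""

car_rows = conn.execute(remaining_sql).fetchall()
for row in car_rows:
    print(row)
```

Counting with a CASE expression rather than filtering with WHERE used.ColorID IS NULL keeps cars that already use every color in the result, with a count of 0.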

COALESCE TSQL with a join tsql

I need to pick up data that may live in more than one place, and I can get part of the way there with the COALESCE function. Basically I want to coalesce the join itself, but from what I can find online it seems I can only do this on fields.
We have Products and Suppliers tables, and each also has a temp counterpart, so four tables in total (products, tempproducts, suppliers, tempsuppliers). The suppliers and products tables store our existing products and suppliers; the temp tables store any new suppliers or products. We also have a tempsupplierproduct table that links new suppliers to new products. However, a new supplier can have an existing product: the new supplier is in tempsuppliers, its product is in products (NOT tempproducts, because it is not new), and there is a new tempsupplierproduct row linking the two.
So I want a query that reads the tempsupplierproduct table and gets basic information about the supplier and product. To do this I am using COALESCE.
SELECT DISTINCT SP.*, COALESCE(P.Product, PD.Product) 'Product', COALESCE(S.Supplier, SU.Supplier) 'Supplier'
FROM tempsupplierproduct SP
LEFT JOIN tempProduct P ON SP.ProductCode = P.Code
LEFT JOIN Products PD ON SP.ProductCode = PD.Code
LEFT JOIN tempSupplier S ON SP.SupplierCode = S.Code
LEFT JOIN Suppliers SU ON SP.SupplierCode = SU.Code
While this works, something at the back of my head tells me it is not entirely right. Ideally, if the data is not in table A, the query should join to table B. I have seen suggestions to coalesce inside the join itself, but I am unsure how to do this.
LEFT JOIN Suppliers Su ON SP.SupplierCode = COALESCE(S.Code, SU.Code)
may be a way, but I am confused by it. All it says is: use the code in the temp table and, if it is not there, use the supplier code. So if we do have a code in the temp table, will it try to join Suppliers on that code? If so, that is incorrect too.
Any help is appreciated.
You can union the two suppliers tables together and then join them in one go like this. I'm assuming that there are no duplicates between the two tables in this case but with a bit of extra work that could be resolved as well.
WITH AllSuppliers AS
(
SELECT Code, Supplier FROM Suppliers
UNION ALL
SELECT Code, Supplier FROM tempSupplier
)
SELECT DISTINCT SP.*, COALESCE(P.Product, PD.Product) 'Product', S.Supplier
FROM tempsupplierproduct SP
LEFT JOIN tempProduct P ON SP.ProductCode = P.Code
LEFT JOIN Products PD ON SP.ProductCode = PD.Code
LEFT JOIN AllSuppliers S ON SP.SupplierCode = S.Code
If you need to handle duplicates between the two suppliers tables, an approach like this should work: rank the duplicates, then pick the highest-ranked row for each code. For two tables you could use a full outer join instead, but this approach scales to any number of tables.
WITH AllSuppliers AS
(
SELECT Code, Supplier, 1 AS TablePriority FROM Suppliers
UNION ALL
SELECT Code, Supplier, 2 AS TablePriority FROM tempSupplier
),
SuppliersRanked AS
(
SELECT Code, Supplier,
ROW_NUMBER() OVER (PARTITION BY Code ORDER BY TablePriority) AS RowPriority
FROM AllSuppliers
)
SELECT DISTINCT SP.*, COALESCE(P.Product, PD.Product) 'Product', S.Supplier
FROM tempsupplierproduct SP
LEFT JOIN tempProduct P ON SP.ProductCode = P.Code
LEFT JOIN Products PD ON SP.ProductCode = PD.Code
LEFT JOIN SuppliersRanked S ON SP.SupplierCode = S.Code
AND RowPriority = 1
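To see how the ranking behaves when a code exists in both suppliers tables, here is a small runnable sketch using Python's sqlite3 module (SQLite supports ROW_NUMBER() from version 3.25). The sample rows are made up for illustration, and the product joins are left out to keep it short.

```python
import sqlite3

# Made-up sample rows: code S1 exists in both tables, S2 only in the temp table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Suppliers (Code TEXT, Supplier TEXT);
CREATE TABLE tempSupplier (Code TEXT, Supplier TEXT);
CREATE TABLE tempsupplierproduct (SupplierCode TEXT, ProductCode TEXT);

INSERT INTO Suppliers VALUES ('S1', 'Acme');
INSERT INTO tempSupplier VALUES ('S1', 'Acme (pending)'), ('S2', 'Globex');
INSERT INTO tempsupplierproduct VALUES ('S1', 'P1'), ('S2', 'P1');
""")

ranked_sql = """
WITH AllSuppliers AS (
    SELECT Code, Supplier, 1 AS TablePriority FROM Suppliers
    UNION ALL
    SELECT Code, Supplier, 2 AS TablePriority FROM tempSupplier
),
SuppliersRanked AS (
    SELECT Code, Supplier,
           ROW_NUMBER() OVER (PARTITION BY Code ORDER BY TablePriority) AS RowPriority
    FROM AllSuppliers
)
SELECT SP.SupplierCode, SP.ProductCode, S.Supplier
FROM tempsupplierproduct SP
LEFT JOIN SuppliersRanked S
       ON SP.SupplierCode = S.Code AND S.RowPriority = 1
ORDER BY SP.SupplierCode
"""

supplier_rows = conn.execute(ranked_sql).fetchall()
for row in supplier_rows:
    print(row)
```

S1 resolves to the row from Suppliers because it has TablePriority 1, while S2 falls back to tempSupplier. Swapping the two priority values would make the temp table win instead.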
You can absolutely join on a coalesced field. Here is a snippet from one of my production views:
LEFT JOIN [Portal].tblHelpdeskresource supplier ON PO.fld_str_SupplierID = supplier.fld_str_SupplierID
-- Job type a
LEFT JOIN [Portal].tblHelpDeskFault HDF ON PO.fld_int_HelpdeskFaultID = HDF.fld_int_ID
-- Job Type b
LEFT JOIN [Portal].tblProjectHeader PH ON PO.fld_int_ProjectHeaderID = PH.fld_int_ID
LEFT JOIN [Portal].tblPPMScheduleLine PSL ON PH.fld_int_PPMScheduleRef = PSL.fld_int_ID
-- Managers (used to be separate for a & b type, now converged)
LEFT JOIN [Portal].uvw_HelpDeskSiteManagers PSM ON COALESCE(PSL.fld_int_StoreID,HDF.fld_int_StoreID) = PSM.PortalSiteId
LEFT JOIN [Portal].tblHelpdeskResource PHDR ON PSM.PortalResourceId = PHDR.fld_int_ID

Postgres: get rows which have no match in another table

I need help with an advanced query. Here is the relevant part of my database:
Place (id, name, address)
Local (id, place_id, name)
PlaceReservation(id, local_id, date)
Media_Place (id, place_id, type)
Now I need a query that returns all places with a logo that have AT LEAST ONE local which hasn't been reserved on a specific day, e.g. 2015-07-01.
Please help, because I have no idea how to do it. I thought about an outer join, but I don't know how to use it.
I tried:
$query = 'SELECT DISTINC *,
(SELECT sum(po.rating)/count(po.id)
FROM "Place_Opinion" po
WHERE po.place_id = p.id AND po.deleted = false) AS rating,
mp.path as logo_path
FROM "Place" p
INNER JOIN "Media_Place" mp ON mp.place_id = p.id
JOIN Local ON Local.place_id = Place.id
LEFT JOIN (
SELECT id AS rr, local_id
FROM PlaceReservation
WHERE date_start = \'2015-07-01\') Reserved ON Reserved.local_id = Local.id
WHERE mp.type = ' . Model_Row_MediaPlace::LOGO_TYPE . '
AND mp.deleted = false
AND p.deleted = false
AND rr IS NULL';
Looking for things that do not exist in a database is usually very inefficient. But you can change the logic around by finding places that do have a booking for the specified date, then LEFT JOIN that to all places with a logo and filter out the records with a reservation:
SELECT DISTINCT p.*, po.rating, mp.path as logo_path
FROM "Place" p
JOIN "Media_Place" mp ON mp.place_id = p.id AND mp.deleted = false AND mp.type = ?
JOIN Local ON Local.place_id = p.id
LEFT JOIN (
SELECT id AS rr, local_id
FROM PlaceReservation
WHERE date_start = '2015-07-01') reserved ON reserved.local_id = Local.id
LEFT JOIN (
SELECT place_id, avg(rating) AS rating
FROM "Place_Opinion"
WHERE deleted = false
GROUP BY place_id) po ON po.place_id = p.id
WHERE p.deleted = false
AND reserved.rr IS NULL;
The average rating per place is calculated in a separate sub-query. The error you had was because you referenced the "Place" table (p.id) before it was defined. For simple columns you can do that, but for sub-queries you can't.
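The same "at least one local without a reservation on that day" condition can also be written with EXISTS and NOT EXISTS, which avoids the DISTINCT. The sketch below runs on SQLite through Python's sqlite3 module and uses the four tables listed in the question. The sample rows and the 'logo' type value are assumptions, and the rating and deleted columns are left out for brevity.

```python
import sqlite3

# Tables as listed in the question; the rows and the 'logo' value are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Place (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE Local (id INTEGER PRIMARY KEY, place_id INTEGER, name TEXT);
CREATE TABLE PlaceReservation (id INTEGER PRIMARY KEY, local_id INTEGER, date TEXT);
CREATE TABLE Media_Place (id INTEGER PRIMARY KEY, place_id INTEGER, type TEXT);

INSERT INTO Place VALUES (1, 'Cafe'), (2, 'Hall'), (3, 'Loft');
INSERT INTO Local VALUES (1, 1, 'A'), (2, 1, 'B'), (3, 2, 'C'), (4, 3, 'D');
INSERT INTO Media_Place VALUES (1, 1, 'logo'), (2, 2, 'logo'), (3, 3, 'photo');
INSERT INTO PlaceReservation VALUES (1, 1, '2015-07-01'), (2, 3, '2015-07-01');
""")

free_places_sql = """
SELECT p.id, p.name
FROM Place p
WHERE EXISTS (                       -- the place has a logo
        SELECT 1 FROM Media_Place mp
        WHERE mp.place_id = p.id AND mp.type = 'logo')
  AND EXISTS (                       -- at least one local of the place ...
        SELECT 1 FROM Local l
        WHERE l.place_id = p.id
          AND NOT EXISTS (           -- ... has no reservation on that day
                SELECT 1 FROM PlaceReservation r
                WHERE r.local_id = l.id AND r.date = :day))
ORDER BY p.id
"""

free_places = conn.execute(free_places_sql, {"day": "2015-07-01"}).fetchall()
print(free_places)
```

Here Cafe qualifies because local B is free, Hall is excluded because its only local is reserved, and Loft is excluded because it has no logo.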

TSQL efficiency - INNER JOIN replaced by EXISTS

Can the following be rewritten to be more efficient?
I would use EXISTS if I didn't need fields from country but I do need those fields, and am not sure how to write this to make it more efficient.
SELECT distinct
p.ProvinceID,
p.Abbv as RegionCode,
p.name as RegionName,
cn.Code as CountryCode,
cn.Name as CountryName
FROM dbo.provinces AS p
INNER JOIN dbo.Countries AS cn ON p.CountryID = cn.CountryID
INNER JOIN dbo.Cities c on c.ProvinceID = p.ProvinceID
INNER JOIN dbo.Listings AS l ON l.CityID = c.CityID
WHERE l.IsActive = 1 AND l.IsApproved = 1
There are two things to note:
1. You're joining to dbo.Listings, which produces many duplicate rows, so you need DISTINCT (usually an expensive operator).
2. Any tables whose columns are not in the SELECT list can be moved into an EXISTS (but the query planner may effectively do this for you anyway).
So try this:
SELECT
p.ProvinceID,
p.Abbv as RegionCode,
p.name as RegionName,
cn.Code as CountryCode,
cn.Name as CountryName
FROM dbo.provinces AS p
INNER JOIN
dbo.Countries AS cn
ON p.CountryID = cn.CountryID
WHERE EXISTS (SELECT 1 FROM
dbo.Listings l
INNER JOIN dbo.Cities c
on l.CityID = c.CityID
WHERE c.ProvinceID = p.ProvinceID
AND l.IsActive = 1 AND l.IsApproved = 1
)
Check the query plans before and after. The query planner might be smart enough to do this anyway, but at least you have removed the DISTINCT.
The following will often perform even better by providing the optimizer more useful information:
SELECT
p.ProvinceID,
p.Abbv as RegionCode,
p.name as RegionName,
cn.Code as CountryCode,
cn.Name as CountryName
FROM dbo.provinces AS p
INNER JOIN
dbo.Countries AS cn
ON p.CountryID = cn.CountryID
INNER JOIN (
SELECT
c.ProvinceID -- the derived table cannot see the outer alias p, so take the key from Cities
FROM
dbo.Listings l
INNER JOIN dbo.Cities c
on l.CityID = c.CityID
WHERE l.IsActive = 1 AND l.IsApproved = 1
GROUP BY
c.ProvinceID
) list
on list.ProvinceID = p.ProvinceID
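To check that the rewrite returns the same rows as the original, here is a small sketch comparing the join-plus-DISTINCT form with the EXISTS form. It runs on SQLite through Python's sqlite3 module rather than SQL Server, so it says nothing about the actual query plans. The sample rows are made up, and the select list is shortened to three columns.

```python
import sqlite3

# Made-up sample rows: Ontario has three approved listings, Quebec has none.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Countries (CountryID INTEGER PRIMARY KEY, Code TEXT, Name TEXT);
CREATE TABLE Provinces (ProvinceID INTEGER PRIMARY KEY, CountryID INTEGER, Abbv TEXT, Name TEXT);
CREATE TABLE Cities (CityID INTEGER PRIMARY KEY, ProvinceID INTEGER);
CREATE TABLE Listings (ListingID INTEGER PRIMARY KEY, CityID INTEGER, IsActive INTEGER, IsApproved INTEGER);

INSERT INTO Countries VALUES (1, 'CA', 'Canada');
INSERT INTO Provinces VALUES (1, 1, 'ON', 'Ontario'), (2, 1, 'QC', 'Quebec');
INSERT INTO Cities VALUES (1, 1), (2, 1), (3, 2);
INSERT INTO Listings VALUES (1, 1, 1, 1), (2, 1, 1, 1), (3, 2, 1, 1), (4, 3, 1, 0);
""")

# Original shape: joining down to Listings repeats each province per listing.
join_sql = """
SELECT p.ProvinceID, p.Abbv, cn.Code
FROM Provinces p
JOIN Countries cn ON p.CountryID = cn.CountryID
JOIN Cities c ON c.ProvinceID = p.ProvinceID
JOIN Listings l ON l.CityID = c.CityID
WHERE l.IsActive = 1 AND l.IsApproved = 1
"""

# Rewritten shape: EXISTS only tests for a matching listing, so no duplicates.
exists_sql = """
SELECT p.ProvinceID, p.Abbv, cn.Code
FROM Provinces p
JOIN Countries cn ON p.CountryID = cn.CountryID
WHERE EXISTS (
    SELECT 1
    FROM Listings l
    JOIN Cities c ON l.CityID = c.CityID
    WHERE c.ProvinceID = p.ProvinceID
      AND l.IsActive = 1 AND l.IsApproved = 1)
"""

joined_rows = conn.execute(join_sql).fetchall()
distinct_rows = sorted(set(joined_rows))
exists_rows = sorted(conn.execute(exists_sql).fetchall())
print(len(joined_rows), distinct_rows, exists_rows)
```

The plain join returns one row per matching listing, which is why the original needed DISTINCT, while the EXISTS form returns each province once.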

Get Greatest date across multiple columns with entity framework

I have three entities: Group, Activity, and Comment. Each entity is represented in the db as a table. A Group has many Activities, and an Activity has many Comments. Each entity has a CreatedDate field. I need to query for all groups plus the CreatedDate of the most recent entity created on that Group's object graph.
I've constructed a SQL query that gives me what I need, but I'm not sure how to do this in Entity Framework. Specifically this expression:
(SELECT MAX(X) FROM (VALUES (g.CreatedDate), (a.CreatedDate), (c.CreatedDate)) AS AllDates(X))
Thanks in advance for your help. Here's the full query:
WITH GroupWithLastActivityDate AS (
SELECT DISTINCT
g.Id
,g.GroupName
,g.GroupDescription
,g.CreatedDate
,g.ApartmentComplexId
,(SELECT MAX(X)
FROM (VALUES (g.CreatedDate), (a.CreatedDate), (c.CreatedDate)) AS AllDates(X)) AS LastActivityDate
FROM Groups g
LEFT OUTER JOIN Activities a
on g.Id = a.GroupId
LEFT OUTER JOIN Comments c
on a.Id = c.ActivityId
WHERE g.IsActive = 1
)
SELECT
GroupId = g.Id
,g.GroupName
,g.GroupDescription
,g.ApartmentComplexId
,NumberOfActivities = COUNT(DISTINCT a.Id)
,g.CreatedDate
,LastActivityDate = Max(g.LastActivityDate)
FROM GroupWithLastActivityDate g
INNER JOIN Activities a
on g.Id = a.GroupId
WHERE a.IsActive = 1
GROUP BY g.Id
,g.GroupName
,g.GroupDescription
,g.CreatedDate
,g.ApartmentComplexId
I should add that for now I've constructed a view with this query (plus some other stuff) which I'm querying with a SqlQuery.
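For readers unfamiliar with the MAX over VALUES expression in the query above: it picks the latest of the three dates on each joined row, and because the MAX aggregate skips NULLs, a missing activity or comment from the LEFT JOINs does not blank out the result. A minimal Python sketch of that row-level logic follows. The helper name greatest and the sample dates are hypothetical, and this is not Entity Framework code.

```python
from datetime import date
from typing import Optional


def greatest(*values: Optional[date]) -> Optional[date]:
    """Return the latest non-None date, or None if every value is None."""
    present = [v for v in values if v is not None]
    return max(present) if present else None


# Hypothetical joined row: a group with one activity and no comments yet.
group_created = date(2015, 1, 10)
activity_created = date(2015, 3, 2)
comment_created = None  # the LEFT JOIN found no comment

last_activity = greatest(group_created, activity_created, comment_created)
print(last_activity)
```

The outer query then takes the maximum of these per-row values across all rows of the same group.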