PostgreSQL: Select one row from table where value in "many" table matches? - postgresql

I have two tables, call them a and b, where a is related to b in a one-to-many relationship. I would like to select the rows from table a for which any of the related records in table b matches a criterion. A basic join doesn't work, because it returns one result for each matching row in table b. I just want one result for each row in table a that has one or more matching related records.
As a simplified example, say I have a table Departments and a related table Employees, where each employee has one department, but each department can obviously have multiple employees. I want a query that gives me one row per department that has one or more employees matching a given criterion, say the departments that have one or more employees who have earned "employee of the month". How would I do this? Thanks.

SELECT * FROM department d
WHERE EXISTS (
SELECT * FROM employee e
JOIN badges b ON b.person_id = e.person_id AND b.badge = 'EotM'
WHERE e.dep_id = d.dep_id
AND e.gender = 'F'
);

select distinct on (d.id)
d.name
from
department d
inner join
employee e on d.id = e.department_id
where e.age between 60 and 65
How to order it by any column:
select *
from (
select distinct on (d.id)
d.*
from
department d
inner join
employee e on d.id = e.department_id
where e.age between 60 and 65
) s
order by name

Sounds like a job for a subquery. Something like: Select * from dept where id in (select deptID from Emp where wasEOTM = true); ought to do the job.
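To make the difference concrete, here is a minimal sketch that runs a plain join, the EXISTS form and the IN form side by side. It uses Python's built-in sqlite3 module as a stand-in for PostgreSQL, and the table layout, the was_eotm flag and the sample rows are assumptions made up for the illustration, not taken from the question.

```python
import sqlite3

# In-memory SQLite database as a stand-in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE department (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE employee (
        id INTEGER PRIMARY KEY,
        department_id INTEGER REFERENCES department(id),
        was_eotm INTEGER  -- hypothetical flag: 1 if ever employee of the month
    );
    INSERT INTO department VALUES (1, 'Sales'), (2, 'IT'), (3, 'HR');
    INSERT INTO employee VALUES
        (1, 1, 1), (2, 1, 1),  -- Sales: two matching employees
        (3, 2, 0),             -- IT: no matching employee
        (4, 3, 1), (5, 3, 0);  -- HR: one matching employee
""")

def names(sql):
    """Run a query and return the first column of every row."""
    return [row[0] for row in conn.execute(sql)]

# A plain join repeats a department once per matching employee.
plain_join = names("""
    SELECT d.name FROM department d
    JOIN employee e ON e.department_id = d.id
    WHERE e.was_eotm = 1
    ORDER BY d.name
""")

# EXISTS only asks whether at least one matching employee exists.
with_exists = names("""
    SELECT d.name FROM department d
    WHERE EXISTS (
        SELECT 1 FROM employee e
        WHERE e.department_id = d.id AND e.was_eotm = 1
    )
    ORDER BY d.name
""")

# IN with an uncorrelated subquery gives the same departments.
with_in = names("""
    SELECT d.name FROM department d
    WHERE d.id IN (SELECT department_id FROM employee WHERE was_eotm = 1)
    ORDER BY d.name
""")

print(plain_join)   # ['HR', 'Sales', 'Sales']
print(with_exists)  # ['HR', 'Sales']
print(with_in)      # ['HR', 'Sales']
```

With this sample data the join lists Sales twice, while both subquery forms return each qualifying department once.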

Related

Show records that have only one matching row in another table

I need to write some SQL that is probably very simple, but I am very new to it.
I need to find all the records from one table that have exactly one matching id in another table. For example, one table contains the employee records and a second one contains the employees' telephone numbers. I need to find all employees with only one telephone number.
Sample data would be nice. In its absence, something like:
SELECT
employees.employee_id
FROM
employees
JOIN
(SELECT employee_id FROM emp_phone GROUP BY employee_id HAVING count(*) = 1) AS phone -- employees with exactly one number
ON
employees.employee_id = phone.employee_id;
You need a join of the 2 tables, group by employee and the condition in the having clause:
SELECT e.employee_id, e.name
FROM employees e INNER JOIN numbers n
ON e.employee_id = n.employee_id
GROUP BY e.employee_id, e.name
HAVING COUNT(*) = 1;
If there can be more than a few numbers per employee in the table with the employees' telephone numbers (calling it tel), then it's cheaper to avoid GROUP BY and HAVING which has to process all rows. Find employees with "unique" numbers using a self-anti-join with NOT EXISTS.
While you don't need more than the employee_id and their unique phone number, you don't even have to involve the employee table at all:
SELECT *
FROM tel t
WHERE NOT EXISTS (
SELECT FROM tel
WHERE employee_id = t.employee_id
AND tel_number <> t.tel_number -- or use PK column
);
If you need additional columns from the employee table:
SELECT * -- or any columns you need
FROM (
SELECT employee_id AS id, tel_number -- or any columns you need
FROM tel t
WHERE NOT EXISTS (
SELECT FROM tel
WHERE employee_id = t.employee_id
AND tel_number <> t.tel_number -- or use PK column
)
) t
JOIN employee e USING (id);
The column alias in the subquery (employee_id AS id) is just for convenience. Then the outer join condition can be USING (id), and the ID column is only included once in the result, even with SELECT * ...
Simpler with a smart naming convention that uses employee_id for the employee ID everywhere. But it's a widespread anti-pattern to use employee.id instead.
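As a rough check that the GROUP BY / HAVING form and the NOT EXISTS anti-join agree, the sketch below runs both against a tiny data set. It uses Python's sqlite3 module as a stand-in for PostgreSQL; the sample rows are invented, and SELECT 1 replaces the empty SELECT list, which SQLite does not accept.

```python
import sqlite3

# In-memory SQLite database as a stand-in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (employee_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE tel (
        employee_id INTEGER REFERENCES employees(employee_id),
        tel_number TEXT,
        PRIMARY KEY (employee_id, tel_number)
    );
    INSERT INTO employees VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO tel VALUES
        (1, '555-0100'),                  -- Ann: exactly one number
        (2, '555-0200'), (2, '555-0201'); -- Bob: two numbers, Cy: none
""")

# Variant 1: count the numbers per employee and keep groups of size one.
by_having = [row[0] for row in conn.execute("""
    SELECT e.employee_id
    FROM employees e
    JOIN tel t ON t.employee_id = e.employee_id
    GROUP BY e.employee_id
    HAVING COUNT(*) = 1
    ORDER BY e.employee_id
""")]

# Variant 2: keep a number only if the same employee has no other number.
by_anti_join = [row[0] for row in conn.execute("""
    SELECT t.employee_id
    FROM tel t
    WHERE NOT EXISTS (
        SELECT 1 FROM tel t2
        WHERE t2.employee_id = t.employee_id
          AND t2.tel_number <> t.tel_number
    )
    ORDER BY t.employee_id
""")]

print(by_having)     # [1]
print(by_anti_join)  # [1]
```

The anti-join relies on (employee_id, tel_number) being unique, which the primary key guarantees here. If the same number could be stored twice for one employee, that employee would show up twice.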
Related:
JOIN table if condition is satisfied, else perform no join

How to find in a many to many relation all the identical values in a column and join the table with other three tables?

I have a many-to-many relation table with three columns (owner_id, property_id, ownership_perc): many owners can have many properties.
I would like to find every owner_id that has more than one property (property_id) and join the result with three other tables (Tables 1, 3 and 4) to get further information.
All the tables that I'm using are
Table 1: owner (id_owner,name)
Table 2: owner_property (owner_id,property_id,ownership_perc)
Table 3: property(id_property,building_id)
Table 4: building(id_building,address,region)
So, when I'm trying it like this, the query runs but it returns empty.
SELECT address,region,name
FROM owner_property
JOIN property ON owner_property.property_id = property.id_property
JOIN owner ON owner.id_owner = owner_property.owner_id
JOIN building ON property.building_id=building.id_building
GROUP BY owner_id,address,region,name
HAVING count(owner_id) > 1
ORDER BY owner_id;
Only the code below returns the owner_ids that have many properties, but it does so without joining the other three tables:
SELECT a.*
FROM owner_property a
JOIN (SELECT owner_id, COUNT(owner_id)
FROM owner_property
GROUP BY owner_id
HAVING COUNT(owner_id)>1) b
ON a.owner_id = b.owner_id
ORDER BY a.owner_id,property_id ASC;
So, is there any suggestion on what I'm doing wrong when I'm joining the tables? Thank you!
This query:
SELECT owner_id
FROM owner_property
GROUP BY owner_id
HAVING COUNT(property_id) > 1
returns all the owner_ids with more than 1 property_ids.
If there is a case of duplicates in the combination of owner_id and property_id then instead of COUNT(property_id) use COUNT(DISTINCT property_id) in the HAVING clause.
So join it to the other tables:
SELECT b.address, b.region, o.name
FROM (
SELECT owner_id
FROM owner_property
GROUP BY owner_id
HAVING COUNT(property_id) > 1
) t
INNER JOIN owner_property op ON op.owner_id = t.owner_id
INNER JOIN property p ON op.property_id = p.id_property
INNER JOIN owner o ON o.id_owner = op.owner_id
INNER JOIN building b ON p.building_id = b.id_building
ORDER BY op.owner_id, op.property_id ASC;
Always qualify the column names with the table name/alias.
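One likely reason the question's first query comes back empty is that grouping by owner_id, address, region and name puts every owned property in its own group whenever the properties sit in different buildings, so no group ever has a count above 1. The sketch below reproduces this with invented sample rows, using Python's sqlite3 module as a stand-in for PostgreSQL, and then runs the derived-table version.

```python
import sqlite3

# In-memory SQLite database as a stand-in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE owner (id_owner INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE building (id_building INTEGER PRIMARY KEY, address TEXT, region TEXT);
    CREATE TABLE property (id_property INTEGER PRIMARY KEY, building_id INTEGER);
    CREATE TABLE owner_property (owner_id INTEGER, property_id INTEGER, ownership_perc REAL);

    -- Hypothetical data: Ann owns two properties in different buildings, Bob owns one.
    INSERT INTO owner VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO building VALUES (10, '1 Main St', 'North'), (20, '2 Oak Ave', 'South');
    INSERT INTO property VALUES (100, 10), (200, 20), (300, 20);
    INSERT INTO owner_property VALUES (1, 100, 100), (1, 200, 100), (2, 300, 100);
""")

# The question's query: every (owner, address, region, name) group holds one row.
original = conn.execute("""
    SELECT address, region, name
    FROM owner_property
    JOIN property ON owner_property.property_id = property.id_property
    JOIN owner ON owner.id_owner = owner_property.owner_id
    JOIN building ON property.building_id = building.id_building
    GROUP BY owner_id, address, region, name
    HAVING count(owner_id) > 1
    ORDER BY owner_id
""").fetchall()

# Count per owner first, then join the qualifying owners to the detail tables.
fixed = conn.execute("""
    SELECT b.address, b.region, o.name
    FROM (
        SELECT owner_id
        FROM owner_property
        GROUP BY owner_id
        HAVING COUNT(property_id) > 1
    ) t
    INNER JOIN owner_property op ON op.owner_id = t.owner_id
    INNER JOIN property p ON op.property_id = p.id_property
    INNER JOIN owner o ON o.id_owner = op.owner_id
    INNER JOIN building b ON p.building_id = b.id_building
    ORDER BY op.owner_id, op.property_id
""").fetchall()

print(original)  # []
print(fixed)     # [('1 Main St', 'North', 'Ann'), ('2 Oak Ave', 'South', 'Ann')]
```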
You can try to use a correlated subquery that counts the ownerships with EXISTS in the WHERE clause.
SELECT b1.address,
b1.region,
o1.name
FROM owner_property op1
INNER JOIN owner o1
ON o1.id_owner = op1.owner_id
INNER JOIN property p1
ON p1.id_property = op1.property_id
INNER JOIN building b1
ON b1.id_building = p1.building_id
WHERE EXISTS (SELECT ''
FROM owner_property op2
WHERE op2.owner_id = op1.owner_id
HAVING count(*) > 1);

Using Count Function and Percentage of Count Total in One Select Statement

I have three data tables Employees, Departments, and Locations.
I want to show the total number of employees in each state and what percentage of all employees is located in each state. The Employees table and the Departments table share a column called Department_ID, and the Departments table and the Locations table share a column called Location_ID. Here's the code I wrote:
select l.state_province e.count(*) as "Employees in State",
e.count(*)*100/sum(e.count(*)) over ()
from employees e
full outer join departments d on e.department_id = d.department_id
full outer join locations l on l.location_id = d.location_id
order by l.state_province;
However, the error "from keyword not found where expected" shows up when I run the code. How do I fix it?
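You need GROUP BY. The select list is also missing a comma after l.state_province, and count(*) should not be prefixed with the table alias, which is most likely what triggers the "from keyword not found" error. Regular joins should be fine: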
select l.state_province, count(*) as "Employees in State",
count(*) * 100/sum(count(*)) over ()
from employees e join
departments d
on e.department_id = d.department_id join
locations l
on l.location_id = d.location_id
group by l.state_province
order by l.state_province;
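The grouped query with a window function over the aggregate can be tried out on a few rows, as sketched below. The sample data is invented, and Python's sqlite3 module (with a bundled SQLite of 3.25 or later, needed for window functions) stands in for the asker's database. The factor is written as 100.0 so that databases with integer division, such as SQLite, do not truncate the percentage.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the asker's database (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE locations (location_id INTEGER PRIMARY KEY, state_province TEXT);
    CREATE TABLE departments (department_id INTEGER PRIMARY KEY, location_id INTEGER);
    CREATE TABLE employees (employee_id INTEGER PRIMARY KEY, department_id INTEGER);

    -- Hypothetical data: three employees in Texas, one in California.
    INSERT INTO locations VALUES (1, 'Texas'), (2, 'California');
    INSERT INTO departments VALUES (10, 1), (20, 1), (30, 2);
    INSERT INTO employees VALUES (1, 10), (2, 10), (3, 20), (4, 30);
""")

# count(*) is evaluated per state; sum(count(*)) OVER () then adds up
# those per-state counts across all result rows to get the grand total.
rows = conn.execute("""
    SELECT l.state_province,
           count(*) AS employees_in_state,
           count(*) * 100.0 / sum(count(*)) OVER () AS pct_of_total
    FROM employees e
    JOIN departments d ON e.department_id = d.department_id
    JOIN locations l ON l.location_id = d.location_id
    GROUP BY l.state_province
    ORDER BY l.state_province
""").fetchall()

for state, employees_in_state, pct_of_total in rows:
    print(state, employees_in_state, pct_of_total)
```

Because inner joins are used, employees without a department and departments without a location are left out of both the counts and the total.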

MySQL database confusion

So on this question, I'm having trouble.
EMPLOYEE(fname,minit,lname,ssn,birthdate,address,sex,salary,superssn,dno) key:ssn
DEPARTMENT(dname,dnumber,mgrssn,mgrstartdate) key:dnumber
PROJECT(pname,pnumber,plocation,dnum) key:pnumber
Here is what I wrote:
Select e.ssn, e.lname,e.fname,
From employee e,
where e.ssn in
(select s.ssn, s.lname,sfname
from employee s,
where s.superssn = e.ssn, AND s.lnamme='Wallace' s.fname ='Jennifer'
)
But I only got 10 out of 15 points. My professor said the "select s.ssn, s.lname, s.fname" part of my subquery is wrong, and that it must "match my e.ssn". How should I fix this?
A self-join (same table). Alias e is for the worker, alias s is for the supervisor.
select s.ssn, s.lname, s.fname
From employee s
join employee e
on s.ssn=e.superssn
where e.lname='Wallace' and e.fname ='Jennifer'
Your IN statement is going to make this query slow. You can refactor it into a self-join like so:
select e.ssn, e.lname, e.fname
from employee e
join employee s on s.superssn = e.ssn
where s.lname='Wallace' AND s.fname ='Jennifer';
The problem with your IN statement is that it is a dependent subquery, which checks every row in the employee table against every row in the same table.
To break down the query itself:
select e.ssn, e.lname, e.fname -- e is the supervisor
from employee e
join employee s on s.superssn = e.ssn -- self-join: Jennifer's superssn equals her supervisor's ssn
where s.lname='Wallace' AND s.fname ='Jennifer'; -- s is Jennifer
Using IN is fine, but with a correlated subquery, EXISTS is the way to go:
Select s.ssn, s.lname, s.fname
From employee s
where exists (select 1
from employee e
where e.superssn = s.ssn AND
e.lname = 'Wallace' AND
e.fname = 'Jennifer'
);
Note:
The s and e are swapped. You want the supervisor information, so it goes in the outer query.
The extraneous commas have been removed.
AND has been added.
Using IN, it looks like:
Select s.ssn, s.lname, s.fname
From employee s
where s.ssn IN (select e.superssn
from employee e
where e.lname = 'Wallace' AND
e.fname = 'Jennifer'
);
Note that the correlation clause is not needed.
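The self-join, EXISTS and IN variants can be compared on a small table, as in the sketch below. It assumes, as the answers above do, that the goal is to find Jennifer Wallace's supervisor. The three sample rows are invented, only the columns needed here are created, and Python's sqlite3 module stands in for MySQL.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (ssn TEXT PRIMARY KEY, fname TEXT, lname TEXT, superssn TEXT);

    -- Hypothetical rows: James supervises Jennifer, Jennifer supervises Ahmad.
    INSERT INTO employee VALUES
        ('999', 'James', 'Borg', NULL),
        ('111', 'Jennifer', 'Wallace', '999'),
        ('222', 'Ahmad', 'Jabbar', '111');
""")

# Self-join: s is the supervisor, e is the supervised employee.
via_join = conn.execute("""
    SELECT s.ssn, s.lname, s.fname
    FROM employee s
    JOIN employee e ON s.ssn = e.superssn
    WHERE e.lname = 'Wallace' AND e.fname = 'Jennifer'
""").fetchall()

# Correlated subquery: keep s if some Jennifer Wallace reports to s.
via_exists = conn.execute("""
    SELECT s.ssn, s.lname, s.fname
    FROM employee s
    WHERE EXISTS (
        SELECT 1 FROM employee e
        WHERE e.superssn = s.ssn
          AND e.lname = 'Wallace' AND e.fname = 'Jennifer'
    )
""").fetchall()

# Uncorrelated subquery: the subquery returns one column that matches s.ssn.
via_in = conn.execute("""
    SELECT s.ssn, s.lname, s.fname
    FROM employee s
    WHERE s.ssn IN (
        SELECT e.superssn FROM employee e
        WHERE e.lname = 'Wallace' AND e.fname = 'Jennifer'
    )
""").fetchall()

print(via_join)    # [('999', 'Borg', 'James')]
print(via_exists)  # [('999', 'Borg', 'James')]
print(via_in)      # [('999', 'Borg', 'James')]
```

The IN variant also shows the professor's point: the subquery returns a single column (superssn) that is compared with the outer ssn. If several employees named Jennifer Wallace shared a supervisor, the self-join would list that supervisor once per match, while EXISTS and IN would list them once.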

Get Greatest date across multiple columns with entity framework

I have three entities: Group, Activity, and Comment. Each entity is represented in the db as a table. A Group has many Activities, and an Activity has many comments. Each entity has a CreatedDate field. I need to query for all groups + the CreatedDate of the most recent entity created on that Group's object graph.
I've constructed a SQL query that gives me what I need, but I'm not sure how to do this in Entity Framework. Specifically this line: (SELECT MAX(X) FROM (VALUES (g.CreatedDate), (a.CreatedDate), (c.CreatedDate)) AS AllDates(X)). Thanks in advance for your help. Here's the full query:
WITH GroupWithLastActivityDate AS (
SELECT DISTINCT
g.Id
,g.GroupName
,g.GroupDescription
,g.CreatedDate
,g.ApartmentComplexId
,(SELECT MAX(X)
FROM (VALUES (g.CreatedDate), (a.CreatedDate), (c.CreatedDate)) AS AllDates(X)) AS LastActivityDate
FROM Groups g
LEFT OUTER JOIN Activities a
on g.Id = a.GroupId
LEFT OUTER JOIN Comments c
on a.Id = c.ActivityId
WHERE g.IsActive = 1
)
SELECT
GroupId = g.Id
,g.GroupName
,g.GroupDescription
,g.ApartmentComplexId
,NumberOfActivities = COUNT(DISTINCT a.Id)
,g.CreatedDate
,LastActivityDate = Max(g.LastActivityDate)
FROM GroupWithLastActivityDate g
INNER JOIN Activities a
on g.Id = a.GroupId
WHERE a.IsActive = 1
GROUP BY g.Id
,g.GroupName
,g.GroupDescription
,g.CreatedDate
,g.ApartmentComplexId
I should add that for now I've constructed a view with this query (plus some other stuff) which I'm querying with a SqlQuery.