Postgres Select one record per matching condition - postgresql

I am having trouble getting only one record per matching condition.
Let's suppose I have Certifications table with the following columns:
Id, EmployeeId, DepartmentId, CertificationTitle, PassedDate
An employee can have more than one record in this table, but I need to get only one record per employee (based on the latest PassedDate).
SELECT Id, EmployeeId, CertificationTitle
FROM certifications c
ORDER BY EmployeeId, PassedDate DESC
From this select I need somehow to get only the first record for each EmployeeId.
Does anyone have any idea how I can achieve this? Is it possible?
The Id is the Primary Key on the table, so it is different on each record.
I need to keep all the columns specified in the SELECT query.
GROUP BY didn't work for me, or maybe I did it wrong...

Use DISTINCT ON. It returns exactly the first row of each group, in the order given by ORDER BY. You already order by PassedDate DESC, which puts the most recent record first. The group for DISTINCT ON is, naturally, EmployeeId:
SELECT DISTINCT ON (EmployeeId)
Id,
EmployeeId,
CertificationTitle
FROM certifications c
ORDER BY EmployeeId, PassedDate DESC
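If two certifications of the same employee share the same PassedDate, DISTINCT ON picks one of them arbitrarily. A minimal sketch that adds Id as a tie-breaker, using the same table and columns (which row should win on a tie is an assumption here, not something the question specifies):

```sql
-- Assumption: on equal PassedDate, prefer the row with the highest Id.
SELECT DISTINCT ON (EmployeeId)
       Id,
       EmployeeId,
       CertificationTitle
FROM certifications c
-- The leading ORDER BY expression must match the DISTINCT ON expression.
ORDER BY EmployeeId, PassedDate DESC, Id DESC;
```

If PassedDate can be NULL, keep in mind that DESC sorts NULLs first by default in Postgres; writing PassedDate DESC NULLS LAST keeps undated rows from being picked.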

Related

How to get latest data for a column when using grouping in postgres

I am using Postgres alongside Sequelize. I have run into a case where I need to write a custom query that groups the records by a particular field. I know that for the remaining columns, which are not used for grouping, I need to use an aggregate function such as SUM. The problem is that for some columns I need the value from the latest row (sorted by created_at DESC). I see no function in SQL to do so. Is my only option to write subqueries, or is there a better way? Thanks.
For better understanding: if you look at the picture below, I want to group the records by address. So after the query there should only be two records, one with sydney and the other with new york. But when it comes to the distance, I want the result of the query to contain the distance from the row that was most recently created, i.e. the one with the latest created_at.
so the final two query results should be:
sydney 100 2022-09-05 18:14:53.492131+05:45
new york 40 2022-09-05 18:14:46.23328+05:45
select address, distance, created_at
from (
    select address, distance, created_at,
           row_number() over (partition by address order by created_at desc) as rn
    from your_table  -- placeholder: substitute the real table name ("table" itself is a reserved word)
) x
where rn = 1
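Since the question also asks about combining the latest value with ordinary aggregates in a single GROUP BY, here is a sketch of one way to do that in Postgres: an ordered array_agg whose first element is taken. The table name trips and the row_count column are placeholders, not names from the question:

```sql
-- Hypothetical table: trips(address, distance, created_at).
SELECT address,
       -- Latest distance per address: order the aggregated values by
       -- created_at descending and take the first array element.
       (array_agg(distance ORDER BY created_at DESC))[1] AS distance,
       max(created_at) AS created_at,
       -- Ordinary aggregates can sit alongside it.
       count(*) AS row_count
FROM trips
GROUP BY address;
```

This avoids the subquery, at the cost of building an array per group; for large groups the row_number() or DISTINCT ON forms may be the better fit.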

Find max in group by in postgresql

This is my students table. I want to display the hostel, rollno and parent_inc of the student who has the max(parent_inc) in each hostel. When I try this command:
select hostel, rollno, max(parent_inc) from students group by hostel;
I get this error:
column "students.rollno" must appear in the GROUP BY clause or be used in an aggregate function
select hostel, rollno, max(parent_inc) from students group b...
How can I get this the correct way?
Without selecting rollno field it works fine.
Try the windowed version of the MAX function:
select rollno
, hostel
, max(parent_inc) over(partition by hostel) max_parent_inc
from students;
NOTE: Not tested
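Note that the windowed query still returns one row per student, each carrying its hostel's maximum. A sketch of two ways to keep only the top student(s) per hostel, using the column names from the question (how ties should be handled is an assumption):

```sql
-- Variant 1: exactly one row per hostel; ties are broken arbitrarily.
SELECT DISTINCT ON (hostel)
       hostel, rollno, parent_inc
FROM students
ORDER BY hostel, parent_inc DESC;

-- Variant 2: every student tied for the highest parent_inc in a hostel.
SELECT hostel, rollno, parent_inc
FROM (
    SELECT hostel, rollno, parent_inc,
           rank() OVER (PARTITION BY hostel ORDER BY parent_inc DESC) AS rnk
    FROM students
) s
WHERE rnk = 1;
```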

Column must appear in the GROUP BY clause

I have this query:
SELECT
"EventReadingListItem"."id"
, "EventReadingListItem"."UserId"
FROM "EventReadingListItems" AS "EventReadingListItem"
group by "EventReadingListItem"."EventId";
When I run it I get the error
Column "EventReadingListItem"."id" must appear in the GROUP BY clause or be used in an aggregate function.
Why? I have read similar questions but I don't really get why this simple group by is not working. Is it because the field in group by is not known as "EventReadingListItem" yet?
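So, according to your comment, this should work for you.
It returns one row per EventId, namely the row with the smallest id. The identifiers are double-quoted because the table and columns were created with mixed-case names; unquoted names are folded to lowercase in Postgres:
select DISTINCT ON ("EventId") "EventId", "id", "UserId"
from "EventReadingListItems"
order by "EventId", "id"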
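As for the "why": with GROUP BY, each output row stands for a whole group of input rows, so every selected column must be grouped, wrapped in an aggregate, or functionally dependent on a grouped primary key. Otherwise Postgres cannot know which id or UserId to show for a given EventId. A sketch of a query that satisfies this rule, using the identifiers from the question (the output column names first_id and item_count are made up for illustration):

```sql
SELECT "EventId",
       min("id") AS first_id,    -- one value per group via an aggregate
       count(*)  AS item_count   -- number of rows in each group
FROM "EventReadingListItems"
GROUP BY "EventId";
```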

Complex Joins in Postgresql

It's possible I'm stupid, but I've been querying and checking for hours and I can't seem to find the answer to this, so I apologize in advance if the post is redundant... but I can't seem to find its doppelganger.
OK: I have a PostGreSQL db with the following tables:
Key(containing two fields in which I'm interested, ID and Name)
and a second table, Data.
Data contains, well... data, sorted by ID. ID is unique, but each Name has multiple IDs. E.g. if Bill enters the building, this is ID 1 for Bill. Mary enters the building, ID 2 for Mary. Bill re-enters the building, ID 3 for Bill.
The ID field is in both the Key table, and the DATA table.
What I want to do is... find
The MAX (e.g. last) ID, unique to EACH NAME, and the Data associated with it.
E.g. Bill - Last Login: ID 10. Time: 123UTC Door: West and so on.
So... I'm trying the following query:
SELECT
*
FROM
Data, Key
WHERE
Key.ID = (
SELECT
MAX (ID)
FROM
Key
GROUP BY ID
)
Here's the kicker, there's about... something like 800M items in these tables, so errors are... time consuming. Can anyone help to see if this query is gonna do what I expect?
Thanks so much.
To get the maximum key for each name . . .
select Name, max(ID) as max_id
from data
group by Name;
Join that to your other table.
select *
from key t1
inner join (select Name, max(ID) as max_id
from data
group by Name) t2
on t1.id = t2.max_id
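The question describes Name as a column of Key, with the details living in Data. Under that reading (an assumption, since the schema is only described loosely), the same idea can be sketched with the roles of the two tables swapped:

```sql
-- Assumes Key(ID, Name) and Data(ID, ...) as described in the question.
SELECT k.Name, d.*
FROM (
    -- Latest (highest) ID for each name.
    SELECT Name, MAX(ID) AS max_id
    FROM Key
    GROUP BY Name
) k
JOIN Data d ON d.ID = k.max_id;
```

With tables of this size it is probably worth running the inner query on its own first, or prefixing the statement with EXPLAIN, before executing the full join.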

Applying distinct on more than one field?

I have a SQL query, like so:
SELECT DISTINCT ID, Name FROM Table
This brings up all the distinct IDs (1...13), but in the 13 IDs, it repeats the name (as it comes up twice). The order of the query (ID, Name) has to be kept the same as the app using this query is coded with this assumption.
Is there a way to ensure there are no duplicates?
Thanks
You can try:
select id, name from table group by id,name
But it seems like distinct should work. Perhaps there are trailing spaces at the end of your name fields?
Instead of using DISTINCT, use GROUP BY
SELECT ID, Name FROM Table GROUP BY ID, Name
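Both queries above are equivalent to SELECT DISTINCT ID, Name: they de-duplicate (ID, Name) pairs, so a name that belongs to two different IDs still appears twice. If the goal is one row per name, here is a sketch that keeps the smallest ID for each name. Which ID should survive is an assumption, and my_table stands in for the real table name:

```sql
-- Column order (ID, Name) is kept as the application expects.
SELECT min(ID) AS ID, Name
FROM my_table
GROUP BY Name
ORDER BY ID;
```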