SQL to return all students with minimum GPA - postgresql

I'm trying to do a query which, as the title says, returns the name of all the students that share the minimum GPA.
Since the minimum GPA from all the students I have in the table is 0, right now I have:
Select s.name from Student s
where s.gpa = 0
However, I don't want to hard-code the value 0, since that may not always be the minimum. Instead, I'm trying to find the minimum GPA with MIN(gpa).
I've been trying to use that aggregate function in combination with GROUP BY, but I cannot make it work. So far I have this:
Select s.name from Student s
having s.gpa = min(s.gpa)
group by s.gpa
How should I change it to make it work? Thanks in advance

You first need to find the minimum GPA, then get the students who have it.
To get the minimum GPA in the entire table:
SELECT MIN(gpa) FROM Student;
This query can be nested inside your original query:
SELECT name FROM Student
WHERE gpa = (SELECT MIN(gpa) FROM Student);
I am not sure if there is a simpler way to do it. I hope this helps!
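To try the nested query without a PostgreSQL server, here is a minimal sketch using Python's built-in sqlite3 module. The sample rows and the helper name students_with_min_gpa are made up for illustration; the SQL itself is the same statement as above, plus an ORDER BY so the output is stable.

```python
import sqlite3


def students_with_min_gpa(conn):
    """Return the names of all students whose GPA equals the table-wide minimum."""
    rows = conn.execute(
        "SELECT name FROM Student "
        "WHERE gpa = (SELECT MIN(gpa) FROM Student) "
        "ORDER BY name"
    ).fetchall()
    return [name for (name,) in rows]


# Hypothetical sample data: two students share the minimum GPA of 0.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Student (name TEXT, gpa REAL)")
conn.executemany(
    "INSERT INTO Student VALUES (?, ?)",
    [("Ana", 0.0), ("Ben", 3.2), ("Cy", 0.0), ("Dee", 2.5)],
)

min_gpa_names = students_with_min_gpa(conn)
print(min_gpa_names)
```

Because the subquery is evaluated against the whole table, every student tied for the minimum is returned, not just one.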

Related

How to speed up this Postgres query by reducing the number of rows it has to search through before it searches?

Context:
I have three Postgres tables:
authors - stores the id, author's full name, credentials, and awards
books - stores the id, title, book-length, summary, and an image of the front cover
authorBookRelations - connects Authors and Books by storing the author_id and book_id
An author can be connected to any book, but books are not connected. Books can have the same name, but each has its own id that is unique. Multiple authors can author a single book.
My question:
If I want to get all titles that match a given list of titles and are by a specific author what would be the best way to do that?
What I have so far:
Currently, I do two SELECT queries and a filtering function to "join" the two queries.
SELECT query #1 - get all of the book_ids associated with a particular author:
SELECT book_id FROM authorBookRelations WHERE author_id = 5
SELECT query #2 - get all of the titles that are in a given list of titles:
SELECT * FROM books WHERE title IN ('arbitraryTitle_1', 'arbitraryTitle_2', etc.)
Filter function (python) - filter titles for any that are not written by that specific author:
filtered_list = [x for x in query_2_results if x.id in query_1_results]
I get the correct books with this method, but I can't help feeling that this is not a good way to do it and won't scale well. What would you suggest to speed up this query? Instead of two separate db calls and a filtering function, could I do it all in one call by searching the list of titles against only those rows of "books" that belong to the author, as determined by the query against authorBookRelations? Something like this:
SELECT *
FROM (
SELECT book_id
FROM authorBookRelations
WHERE author_id = 5) AS foobar
WHERE title IN ('arbitraryTitle_1', 'arbitraryTitle_2', etc.)
UPDATE:
Trying out this seems to have cut my total query/processing time by half:
select *
from (select *
from books
where id in (
select book_id
from authorBookRelations
where author_id = 5
)) as foo
where foo.title in ('arbitraryTitle_1', 'arbitraryTitle_2', etc.)
The performance problem will be the IN operator if the list has a large number of items.
For two or three items, PostgreSQL can sometimes use an index to seek the data.
But when there are many more items, a scan will be the only option.
If you want to speed up this query, INSERT your list of titles into a temporary table, then add an index to it and rewrite the query as a join between this temp table and your original query.
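The temporary-table approach can be sketched as follows, using Python's built-in sqlite3 module as a stand-in for PostgreSQL. The table and column names follow the question; the temp table wanted_titles, its index name, the helper function and the sample rows are assumptions made for illustration. Whether this is actually faster than IN depends on the data and the planner, so it is worth measuring on your own tables.

```python
import sqlite3


def books_by_author_and_titles(conn, author_id, titles):
    """Return (id, title) for books by author_id whose title is in titles."""
    # Load the wanted titles into an indexed temp table (names are hypothetical).
    conn.execute("DROP TABLE IF EXISTS wanted_titles")
    conn.execute("CREATE TEMP TABLE wanted_titles (title TEXT)")
    conn.executemany(
        "INSERT INTO wanted_titles VALUES (?)", [(t,) for t in titles]
    )
    conn.execute("CREATE INDEX wanted_titles_idx ON wanted_titles (title)")
    # One query: join books to the author's relations and to the title list.
    return conn.execute(
        "SELECT DISTINCT b.id, b.title "
        "FROM books AS b "
        "JOIN authorBookRelations AS r ON r.book_id = b.id "
        "JOIN wanted_titles AS w ON w.title = b.title "
        "WHERE r.author_id = ? "
        "ORDER BY b.id",
        (author_id,),
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE books (id INTEGER PRIMARY KEY, title TEXT)")
conn.execute("CREATE TABLE authorBookRelations (author_id INTEGER, book_id INTEGER)")
conn.executemany(
    "INSERT INTO books VALUES (?, ?)",
    [(1, "A"), (2, "B"), (3, "A"), (4, "C"), (5, "C")],
)
conn.executemany(
    "INSERT INTO authorBookRelations VALUES (?, ?)",
    [(5, 1), (5, 2), (5, 3), (5, 4), (6, 5), (6, 2)],
)

found = books_by_author_and_titles(conn, 5, ["A", "C"])
print(found)
```

This also replaces the two separate calls plus the Python filter from the question with a single round trip to the database.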

Get entire record with max field for each group

There are a lot of answers about this problem, but none of them retrieves the entire record, but only the ID... and I need the whole record.
So, I have a table status_changes with the following columns:
issue_id: the issue the change refers to
id: the id of the change, just a SERIAL
status_from and status_to: the status the issue had before the change and the status it got afterwards
when: a timestamp of when this happened
Nothing too crazy, but now, I would like to have the "most recent status_change" for each issue.
I tried something like:
select id
from change
group by issue_id
having when = max(when)
But this obviously has 2 big problems:
1. select contains fields that are not in the group by
2. having can't contain an aggregate function in this way
I thought of ordering every group by when and using something like top(1), but I can't figure out how to do it...
Use PostgreSQL's DISTINCT ON:
-- "when" is a reserved word in PostgreSQL, so it has to be double-quoted
SELECT DISTINCT ON (issue_id)
       id, issue_id, status_from, status_to, "when"
FROM change
ORDER BY issue_id, "when" DESC;
This will return the first result (the one with the greatest when) for each issue.
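DISTINCT ON is specific to PostgreSQL. As a rough, portable sketch of the same idea, the following uses a correlated subquery instead, run through Python's built-in sqlite3 module. This is a swapped-in technique, not the DISTINCT ON query itself: if two changes of one issue share the same timestamp, it returns both rows, whereas DISTINCT ON keeps exactly one. The sample rows and the helper name are made up.

```python
import sqlite3


def latest_change_per_issue(conn):
    """Return the full row of the most recent change for every issue."""
    return conn.execute(
        'SELECT c.id, c.issue_id, c.status_from, c.status_to, c."when" '
        "FROM change AS c "
        'WHERE c."when" = (SELECT MAX(c2."when") FROM change AS c2 '
        "                  WHERE c2.issue_id = c.issue_id) "
        "ORDER BY c.issue_id"
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE change (id INTEGER PRIMARY KEY, issue_id INTEGER, "
    'status_from TEXT, status_to TEXT, "when" TEXT)'
)
# Hypothetical data; ISO-formatted timestamps compare correctly as text.
conn.executemany(
    "INSERT INTO change VALUES (?, ?, ?, ?, ?)",
    [
        (1, 10, "open", "in_progress", "2024-01-01 09:00"),
        (2, 10, "in_progress", "done", "2024-01-03 12:00"),
        (3, 11, "open", "done", "2024-01-02 08:00"),
    ],
)

latest = latest_change_per_issue(conn)
print(latest)
```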

Sorting for particular user in mongodb

I am trying to get the last record for a particular user id. For example, if user_id 2 has 10 records, I need the 10th record.
I have tried this with a static user_id.
How can I make it dynamic?
db.product_logs.find({"user_id" :"862"}).sort({"_id":-1}).limit(1)
You can simply assign a variable to the given user_id and use that in the find query.
For example, in javascript, you can do something like this:
let userId = "862"; // assign this variable as per your requirement/choice
db.product_logs.find({"user_id" :userId}).sort({"_id":-1}).limit(1)
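The same idea, wrapped in a function so the user id becomes a parameter, might look like this in Python. It is only a sketch: latest_log_for is a made-up helper name, and it assumes the collection object behaves like a pymongo collection, where find() returns a cursor that supports sort(key, direction) and limit(n).

```python
def latest_log_for(collection, user_id):
    """Return the newest document for user_id, or None if there is none.

    "Newest" means the highest _id, as in the query above.
    """
    cursor = collection.find({"user_id": user_id}).sort("_id", -1).limit(1)
    return next(iter(cursor), None)


# Usage with a real database (assumes a pymongo collection named product_logs):
#   doc = latest_log_for(db.product_logs, "862")
```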

Filemaker - Summing a field based on another field

In Filemaker Pro 12, I am trying to write a formula for a calculation field that will sum a field in a related table based on another field in that same related table. The normal Filemaker sum equation would look like this:
Sum (Assets::Asset Quantity)
However, I need to sum only the Asset Quantity values from records whose Asset Type field has the value "Building".
There are a couple of ways that you could do this:
A new Calculated field
First, you could add a new Calculation field to your Assets table called, say, Building Quantity, with a Calculated Value of:
If (Asset Type = "Building" ; Asset Quantity ; 0)
And then you can use the sum of this new Building Quantity just like you were using Sum(Assets::Asset Quantity) before.
A new relationship
Second, you could add a new Calculated field to your main table with the value always equal to "Building" and then add a new table occurrence of the Assets table. We'll call it "BuildingAssets" and set the relationship so that your IDs match and also your new "Building" field matches the Asset Type
Summary ID = BuildingAssets::Summary ID
BuildingText = BuildingAssets::Asset Type
Then you will use
Sum (BuildingAssets::Asset Quantity)
instead of Sum (Assets::Asset Quantity) so that you only pull the Building types through.
ExecuteSQL
Finally, FileMaker 12 introduced the ExecuteSQL function. This may be the most elegant way to do the above because it doesn't involve changing any schema. The statement would be something like:
SELECT
SUM ("Asset Quantity")
FROM
Assets
WHERE
"Summary ID" = ID AND
"Asset Type" = 'Building'
For more information check out FileMaker's page: http://www.filemaker.com/12help/html/func_ref3.33.6.html
Also check out the FileMaker SQL Sugar ("#") Module for help building queries: http://www.modularfilemaker.org/2013/03/filemaker-sql-sugar/
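To see the filtered sum from the ExecuteSQL option in action outside FileMaker, here is a small sketch using Python's built-in sqlite3 module. The field names come from the answer above; the sample rows and the helper name building_quantity are made up, and FileMaker's own SQL dialect may differ in details such as quoting and how parameters are passed.

```python
import sqlite3


def building_quantity(conn, summary_id):
    """Sum Asset Quantity over the 'Building' assets of one summary record."""
    row = conn.execute(
        'SELECT COALESCE(SUM("Asset Quantity"), 0) FROM Assets '
        'WHERE "Summary ID" = ? AND "Asset Type" = ?',
        (summary_id, "Building"),
    ).fetchone()
    return row[0]


conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE Assets ("Summary ID" INTEGER, "Asset Type" TEXT, '
    '"Asset Quantity" INTEGER)'
)
# Hypothetical rows: summary 1 has two buildings and one vehicle.
conn.executemany(
    "INSERT INTO Assets VALUES (?, ?, ?)",
    [(1, "Building", 2), (1, "Vehicle", 5), (1, "Building", 3), (2, "Building", 7)],
)

total_for_summary_1 = building_quantity(conn, 1)
print(total_for_summary_1)
```

COALESCE turns the NULL that SUM yields for "no matching rows" into 0, which matches the behaviour of the calculated-field option where non-building rows contribute 0.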

Postgresql different where based on column value?

Is it possible to dynamically change the where clause as a query executes based on the value of one of the columns? That is, let's say (and this is a completely made up example) I have a table of students, classes attended, and whether they were tardy. I want to see for each student a list of all classes they have attended since the last time they were tardy. So the query would look something like this:
SELECT student, class, classdate
FROM attendance
WHERE classdate>(<<SOME QUERY that selects the most recent tardy for each student>>)
ORDER BY student,classdate;
or, to put it into more programming terminology, to perhaps make it clearer:
for studentName in (SELECT distinct(student) FROM attendance):
SELECT student, class, classdate
FROM attendance
WHERE classdate>(SELECT classdate
FROM attendance
WHERE tardy=true AND student=studentName
ORDER BY classdate DESC
LIMIT 1)
ORDER BY classdate;
Is there any way to do this with a single query, or do I need to do a separate query for each student (essentially as per the loop above)? The actual use case is more complicated to explain (it has to do with certifying records, and which ones need to be looked at) but conceptually it is the same.
Just use multiple aliases (e.g. a1 and a2) for the attendance table, so you can refer to the "outer" table alias in the subquery:
SELECT a1.student, a1.class, a1.classdate
FROM attendance a1
WHERE a1.classdate > (SELECT a2.classdate
                      FROM attendance a2
                      WHERE a2.tardy = true AND a2.student = a1.student
                      ORDER BY a2.classdate DESC
                      LIMIT 1)
ORDER BY a1.student, a1.classdate;
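A quick way to check how the correlated subquery behaves is to run it on a few rows. The sketch below uses Python's built-in sqlite3 module with made-up attendance data; tardy is stored as 0/1 because SQLite has no separate boolean type. Note one consequence of the comparison: a student who was never tardy gets NULL from the subquery, so none of that student's rows are returned.

```python
import sqlite3


def classes_since_last_tardy(conn):
    """Return each student's classes attended after their most recent tardy."""
    return conn.execute(
        "SELECT a1.student, a1.class, a1.classdate "
        "FROM attendance a1 "
        "WHERE a1.classdate > (SELECT a2.classdate "
        "                      FROM attendance a2 "
        "                      WHERE a2.tardy = 1 AND a2.student = a1.student "
        "                      ORDER BY a2.classdate DESC "
        "                      LIMIT 1) "
        "ORDER BY a1.student, a1.classdate"
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE attendance (student TEXT, class TEXT, classdate TEXT, tardy INTEGER)"
)
# Hypothetical rows: ann was last tardy on Jan 2, bob on Jan 3, cat never.
conn.executemany(
    "INSERT INTO attendance VALUES (?, ?, ?, ?)",
    [
        ("ann", "math", "2024-01-01", 0),
        ("ann", "art", "2024-01-02", 1),
        ("ann", "math", "2024-01-03", 0),
        ("ann", "art", "2024-01-04", 0),
        ("bob", "math", "2024-01-01", 1),
        ("bob", "art", "2024-01-02", 0),
        ("bob", "math", "2024-01-03", 1),
        ("bob", "art", "2024-01-04", 0),
        ("cat", "math", "2024-01-01", 0),
    ],
)

since_tardy = classes_since_last_tardy(conn)
print(since_tardy)
```

If never-tardy students should appear with all of their classes, the condition would need an extra branch for the NULL case.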