Greatest n per group with multiple criteria for greatest - postgresql

I need to select the largest, most recent or currently active term across a number of schools, with the assumption that is possible for a school to have multiple concurrent terms (ie, one term that honors students are registered in, and another for non honors). Also need to take into account the end date, as the honors term may have the same start date but may be year long instead of just a semester, and I want the semester.
Code looks something like this:
SELECT t.school_id, t.term_id, COUNT(s.id) AS size, t.start_date, t.end_date
FROM term t
INNER JOIN students s ON t.term_id = s.term_id
WHERE t.school_id = (some school id)
GROUP BY t.school_id, t.term_id
ORDER BY t.start_date DESC, t.end_date ASC, size DESC LIMIT 1;
This works perfectly to find the largest currently or most recently active term, but I want to be able to eliminate the WHERE t.school_id = (some school id) part.
A standard greatest n per group can easily choose the largest OR most recent term, but I need to select the most recent term that ends soonest with the largest number of students.

Not sure I am interpreting your question correctly. Would be easier if you had supplied table definitions including primary and foreign keys.
If you want the the most recent term that ends soonest with the largest number of students per school, this might do it:
SELECT DISTINCT ON (t.school_id)
t.school_id, t.term_id, s.size, t.start_date, t.end_date
FROM term t
JOIN (
SELECT term_id, COUNT(s.id) AS size
FROM students
GROUP BY term_id
) s USING (term_id)
ORDER BY t.school_id, t.start_date DESC, t.end_date, size DESC;
More explanation for DISTINCT ON in this related answer:
Select first row in each GROUP BY group?

Related

count distinct idi_counterparty PER campaignId tableau

it's my first time here, I have a question I can't solve.
I need a calculation that allows me to see the amount of id there is per campaignid
count distinct idi_counterparty PER campaignId.
Should take into account all activities (sent/open/click)
both columns contain numbers (because they are id)

How to select max date value while selecting max value

I have the following sample from a table with students results with date for a school entry exam
First student passed exam - This is the most common record found for most students
Second student failed 1st time entry and passed second time based on the date
3rd student had a failed input entry and was corrected based on the Version
I need the results to like like the picture above, so we take into regard using the latest date and highest version!
My basic query thus far is
select studentid
,examdate --(Date)
,result -- (charvar)
from StudentEntryExam
How should I approach this issue?
demo:db<>fiddle
SELECT DISTINCT ON (studentid)
*
FROM mytable
ORDER BY studentid, examdate DESC, version DESC
DISTINCT ON returns the first record of an ordered group. In that case the groups are the studentids. You must find the correct order to set the required record first. So, you need to order by studentid, of course. Then you need the most recent examdate first, which can be achieved with DESC order. If there are two records on the same date, you need to order the highest version first as well using the DESC modifier, too.

MySQL Workbench - script storing return in array and performing calculations?

Firstly, this is part of my college homework.
Now that's out of the way: I need to write a query that will get the number of free apps in a DB as a percentage of the total number of apps, sorted by what category the app is in.
I can get the number of free apps and also the number of total apps by category. Now I need to find the percentage, this is where it goes a bit pear-shaped.
Here is what I have so far:
-- find total number of apps per category
select #totalAppsPerCategory := count(*), category_primary
from apps
group by category_primary;
-- find number of free apps per category
select #freeAppsPerCategory:= count(*), category_primary
from apps
where (price = 0.0)
group by category_primary;
-- find percentage of free apps per category
set #totals = #freeAppsPerCategory / #totalAppsPercategory * 100;
select #totals, category_primary
from apps
group by category_primary;
It then lists the categories but the percentage listed in each category is the exact same value.
I had initially thought to use an array, but from what I have read mySql does not seem to support arrays.
I'm a bit lost of how to proceed from here.
Finally figured it out. Since I had been saving the previous results in variables it seems that it was not able to calculate on a row by row basis, which is why all the percentages were identical, it was an average. So the calculation needed to be part of the query.
Here's what I came up with:
SELECT DISTINCT
category_primary,
CONCAT(FORMAT(COUNT(CASE
WHEN price = 0 THEN 1
END) / COUNT(*) * 100,
1),
'%') AS FreeAppSharePercent
FROM
apps
GROUP BY category_primary
ORDER BY FreeAppSharePercent DESC;
Then the query result is:

Compare 2 Tables When 1 Is Null in PostgreSQL

List item
I am kinda new in PostgreSQL and I have difficulty to get the result that I want.
In order to get the appropriate result, I need to make multiple joins and I have difficulty when counting grouping them in one query as well.
The table names as following: pers_person, pers_position, and acc_transaction.
What I want to accomplish is;
To see who was absent on which date comparing pers_person with acc_transaction for any record, if there any record its fine... but if record is null the person was definitely absent.
I want to count the absence by pers_person, how many times in month this person is absent.
Also the person hired_date should be considered, the person might be hired in November in October report this person should be filtered out.
pers_postition table is for giving position information of that person.
'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
SELECT tr.create_time::date AS Date, pers.pin, tr.dept_name, tr.name, tr.last_name, pos.name, Count(*)
FROM acc_transaction AS tr
RIGHT JOIN pers_person as pers
ON tr.pin = pers.pin
LEFT JOIN pers_position as pos
ON pers.position_id=pos.id
WHERE tr.event_no = 0 AND DATE_PART('month', DATE)=10 AND DATE_PART('month', pr.hire_date::date)<=10 AND pr.pin IS DISTINCT FROM tr.pin
GROUP BY DATE
ORDER BY DATE
'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
*This is report for octeber,
*Pin is ID number
I'd start by
changing the RIGHT JOIN for a LEFT JOIN as they works the same in reverse but it's confusing to figure them both in mind :
removing for now the pers_position table as it is used for added information purpose rather than changing any returned result
there is an unknown alias pr and I'd assume it is meant for pers (?), changing it accordingly
that leads to strange WHERE conditions, removing them
"pers.pin IS DISTINCT FROM pers.pin" (a field is never distinct from itself)
"AND DATE_PART('month', DATE)=10 " (always true when run in october, always false otherwise)
Giving the resulting query :
SELECT tr.create_time::date AS Date, pers.pin, tr.dept_name, tr.name, tr.last_name, Count(*)
FROM pers_person as pers
LEFT JOIN acc_transaction AS tr ON tr.pin = pers.pin
WHERE tr.event_no = 0
AND DATE_PART('month', pers.hire_date::date)<=10
GROUP BY DATE
ORDER BY DATE
At the end, I don't know if that answers the question, since the title says "Compare 2 Tables When 1 Is Null in PostgreSQL" and the content of the question says nothing about it.

How to sort by a calculated column in PostgreSQL?

We have a number of fields in our offers table that determine an offer's price, however one crucial component for the price is an exchange rate fetched from an external API. We still need to sort offers by actual current price.
For example, let's say we have two columns in the offers table: exchange and premium_percentage. The exchange is the name of the source for the exchange rate to which an external request will be made. premium_percentage is set by the user. In this situation, it is impossible to sort offers by current price without knowing the exchange rate, and that maybe different depending on what's in the exchange column.
How would one go about this? Is there a way to make Postgres calculate current price and then sort offers by it?
SELECT
product_id,
get_current_price(exchange) * (premium_percentage::float/100 + 1) AS price
FROM offers
ORDER BY 2;
Note the ORDER BY 2 to sort by the second ordinal column.
You can instead repeat the expression you want to sort by in the ORDER BY clause. But that can result in multiple evaluation.
Or you can wrap it all in a subquery so you can name the output columns and refer to them in other clauses.
SELECT product_id, price
FROM
(
SELECT
product_id,
get_current_price(exchange) * (premium_percentage::float/100 + 1)
FROM offers
) product_prices(product_id, price)
ORDER BY price;