T-SQL query about grouping - tsql

The first image is my query output. Now I want to group by subject so that it becomes like the second image. Is it possible? Thanks for the help.
select Subject, Grade,
case when Grade >= 50
Then '1'
else '0'
end as Pass,
case when Grade < 50
Then '1'
else '0'
end as Fail
from Grade_report
OUTPUT:
what I want is:

Your specification is not very exact: what number do you expect in the grade column of the merged record, and should the pass and fail columns just show 1 or 0, or the aggregated sum?
The following will generate your output; it takes the MAX of the grade and the SUM of the pass/fail information:
WITH GradePassFail AS (
SELECT
Subject,
Grade,
CASE WHEN Grade >= 50 THEN 1 ELSE 0 END AS Pass,
CASE WHEN Grade < 50 THEN 1 ELSE 0 END AS Fail
FROM Grade_report
)
SELECT Subject, MAX(Grade) AS Grade, SUM(Pass) AS Pass, SUM(Fail) AS Fail
FROM GradePassFail
GROUP BY Subject
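To see what the grouped query returns, here is a minimal sketch that runs it against an in-memory SQLite database from Python. The sample rows are made up for illustration (the question's real data was only shown as images); only the table and column names come from the question, and an ORDER BY is added so the output order is predictable.

```python
import sqlite3

# Hypothetical sample data: (Subject, Grade)
rows = [("Math", 80), ("Math", 45), ("Science", 30), ("Science", 65), ("Science", 90)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Grade_report (Subject TEXT, Grade INTEGER)")
conn.executemany("INSERT INTO Grade_report VALUES (?, ?)", rows)

query = """
WITH GradePassFail AS (
    SELECT Subject,
           Grade,
           CASE WHEN Grade >= 50 THEN 1 ELSE 0 END AS Pass,
           CASE WHEN Grade < 50 THEN 1 ELSE 0 END AS Fail
    FROM Grade_report
)
SELECT Subject, MAX(Grade) AS Grade, SUM(Pass) AS Pass, SUM(Fail) AS Fail
FROM GradePassFail
GROUP BY Subject
ORDER BY Subject
"""
# One row per subject: highest grade, number of passes, number of fails
result = conn.execute(query).fetchall()
print(result)
```

Each subject collapses to a single row because Pass and Fail are numeric (1/0) rather than the strings '1'/'0' used in the question, which is what makes SUM meaningful here.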

Related

How to subtract a separate count from one grouping

I have a postgres query like this
select application.status as status, count(*) as "current_month" from application
where to_char(application.created, 'mon') = to_char('now'::timestamp - '1 month'::interval, 'mon')
and date_part('year',application.created) = date_part('year', CURRENT_DATE)
and application.job_status != 'expired'
group by application.status
It returns the table below, which has the number of applications grouped by status for the current month. However, I want to subtract the total count of a separate but related query from the internal_review number only. I want to count the number of rows with type = abc within the same table and for the same date range, and then subtract that amount from the internal_review number (type is a separate field). current_month_desired is how it should look.
status | current_month | current_month_desired
fail | 22 | 22
internal_review | 95 | 22
pass | 146 | 146
UNTESTED, but maybe...
The intent here is to use a CASE expression inside the aggregate to sum conditionally. This way the subtraction is not needed in the first place, because you only "count" the rows you need.
SELECT application.status as status
, sum(case when type = 'abc'
and application.status = 'internal_review' then 0
else 1 end) as
"current_month"
FROM application
WHERE to_char(application.created, 'mon') = to_char('now'::timestamp - '1 month'::interval, 'mon')
and date_part('year',application.created) = date_part('year', CURRENT_DATE)
and application.job_status != 'expired'
GROUP BY application.status
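As a rough check of the conditional-sum idea, the sketch below runs it on an in-memory SQLite table from Python. The rows are invented, and the date filter from the question is left out to keep the example small; only the status, type and job_status columns are modelled.

```python
import sqlite3

# Hypothetical rows: (status, type, job_status)
rows = [
    ("fail", "xyz", "open"),
    ("fail", "abc", "open"),
    ("internal_review", "abc", "open"),
    ("internal_review", "abc", "open"),
    ("internal_review", "xyz", "open"),
    ("pass", "abc", "open"),
    ("pass", "xyz", "open"),
    ("pass", "xyz", "expired"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE application (status TEXT, type TEXT, job_status TEXT)")
conn.executemany("INSERT INTO application VALUES (?, ?, ?)", rows)

query = """
SELECT status,
       SUM(CASE WHEN type = 'abc' AND status = 'internal_review' THEN 0
                ELSE 1 END) AS current_month
FROM application
WHERE job_status != 'expired'
GROUP BY status
ORDER BY status
"""
# Rows of type 'abc' contribute 0 only within internal_review
counts = conn.execute(query).fetchall()
print(counts)
```

Only the internal_review group loses its 'abc' rows; the 'abc' rows under fail and pass are still counted.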

How to create buckets and groups within those buckets using PostgreSQL

How do I find the distribution of credit cards by year and by number of completed transactions? Group these credit cards into three buckets: less than 10 transactions, between 10 and 30 transactions, more than 30 transactions.
The first method I tried was the width_bucket function in PostgreSQL, but the documentation says that it only creates equidistant buckets, which is not what I want in this case. Because of that, I turned to CASE expressions. However, I'm not sure how to use a CASE expression with a GROUP BY.
This is the data I am working with:
table 1 - credit_cards table
credit_card_id
year_opened
table 2 - transactions table
transaction_id
credit_card_id - matches credit_cards.credit_card_id
transaction_status ("complete" or "incomplete")
This is what I have gotten so far:
SELECT
CASE WHEN transaction_count < 10 THEN “Less than 10”
WHEN transaction_count >= 10 and transaction_count < 30 THEN “10 <= transaction count < 30”
ELSE transaction_count>=30 THEN “Greater than or equal to 30”
END as buckets
count(*) as ct.transaction_count
FROM credit_cards c
INNER JOIN transactions t
ON c.credit_card_id = t.credit_card_id
WHERE t.status = “completed”
GROUP BY v.year_opened
GROUP BY buckets
ORDER BY buckets
Expected output
credit card count | year opened | transaction count bucket
23421 | 2002 | Less than 10
etc
You can specify the bin sizes in width_bucket by passing a sorted array of the lower bounds of each bin.
In your case, it would be array[10,30]: anything less than 10 gets bin 0, between 10 and 29 gets bin 1, and 30 or more gets bin 2.
WITH a AS (select generate_series(5,35) cnt)
SELECT cnt, width_bucket(cnt, array[10,30])
FROM a;
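To make the bin assignment concrete, here is a small Python stand-in for the threshold lookup described above. It is not PostgreSQL's width_bucket itself, just the same rule (count how many lower bounds are less than or equal to the value) written with the standard bisect module; the helper name bucket is made up for this sketch.

```python
from bisect import bisect_right

# Sorted lower bounds of bins 1 and 2, as in array[10,30]
thresholds = [10, 30]

def bucket(cnt):
    """Return the bin number: how many lower bounds are <= cnt."""
    return bisect_right(thresholds, cnt)

# Values around each boundary
buckets = {cnt: bucket(cnt) for cnt in (5, 9, 10, 29, 30, 35)}
print(buckets)
```

Values below 10 land in bin 0, 10 through 29 in bin 1, and 30 or more in bin 2.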
To figure this out you need to count transactions per credit card in order to figure out the right bucket, then you need to count the credit cards per bucket per year. There are a couple of different ways to get the final result. One way is to first join up all your data and compute the first level of aggregate values. Then compute the final level of aggregate values:
with t1 as (
select year_opened
, c.credit_card_id
, case when count(*) < 10 then 'Less than 10'
when count(*) < 30 then 'Between [10 and 30)'
else 'Greater than or equal to 30'
end buckets
from credit_cards c
join transactions t
on t.credit_card_id = c.credit_card_id
where t.transaction_status = 'complete'
group by year_opened
, c.credit_card_id
)
select count(*) credit_card_count
, year_opened
, buckets
from t1
group by year_opened
, buckets;
However, it may be more performant to first calculate the first level of aggregate data on the transactions table before joining it to the credit cards table:
select count(*) credit_card_count
, year_opened
, buckets
from credit_cards c
join (select credit_card_id
, case when count(*) < 10 then 'Less than 10'
when count(*) < 30 then 'Between [10 and 30)'
else 'Greater than or equal to 30'
end buckets
from transactions
where transaction_status = 'complete'
group by credit_card_id) t
on t.credit_card_id = c.credit_card_id
group by year_opened
, buckets;
If you prefer to unroll the above query and use Common Table Expressions, you can do that too (I find this easier to read/follow along):
with bkt as (
select credit_card_id
, case when count(*) < 10 then 'Less than 10'
when count(*) < 30 then 'Between [10 and 30)'
else 'Greater than or equal to 30'
end buckets
from transactions
where transaction_status = 'complete'
group by credit_card_id
)
select count(*) credit_card_count
, year_opened
, buckets
from credit_cards c
join bkt t
on t.credit_card_id = c.credit_card_id
group by year_opened
, buckets;
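The two-level aggregation can be tried out with a minimal sketch like the one below, which runs the CTE form on an in-memory SQLite database from Python. The cards and transaction counts are invented for illustration, and the sketch restricts the count to complete transactions, as the first query does.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE credit_cards (credit_card_id INTEGER, year_opened INTEGER)")
conn.execute(
    "CREATE TABLE transactions (transaction_id INTEGER PRIMARY KEY, "
    "credit_card_id INTEGER, transaction_status TEXT)"
)

# Hypothetical cards: (credit_card_id, year_opened)
conn.executemany("INSERT INTO credit_cards VALUES (?, ?)",
                 [(1, 2002), (2, 2002), (3, 2003), (4, 2002)])

# Hypothetical transactions: (card, status, how many rows to create)
spec = [(1, "complete", 3), (2, "complete", 12), (3, "complete", 31),
        (4, "complete", 2), (4, "incomplete", 1)]
for card, status, n in spec:
    conn.executemany(
        "INSERT INTO transactions (credit_card_id, transaction_status) VALUES (?, ?)",
        [(card, status)] * n,
    )

query = """
WITH bkt AS (
    -- first level: one bucket label per credit card
    SELECT credit_card_id,
           CASE WHEN count(*) < 10 THEN 'Less than 10'
                WHEN count(*) < 30 THEN 'Between [10 and 30)'
                ELSE 'Greater than or equal to 30'
           END AS buckets
    FROM transactions
    WHERE transaction_status = 'complete'
    GROUP BY credit_card_id
)
-- second level: count cards per year and bucket
SELECT count(*) AS credit_card_count, year_opened, buckets
FROM credit_cards c
JOIN bkt t ON t.credit_card_id = c.credit_card_id
GROUP BY year_opened, buckets
ORDER BY year_opened, buckets
"""
distribution = conn.execute(query).fetchall()
print(distribution)
```

Note that a card with no complete transactions at all does not appear in any bucket with an inner join; a LEFT JOIN with a zero count would be needed to include such cards.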
Not sure if this is what you are looking for.
WITH cte
AS (
SELECT c.year_opened
,c.credit_card_id
,count(*) AS transaction_count
FROM credit_cards c
INNER JOIN transactions t ON c.credit_card_id = t.credit_card_id
WHERE t.transaction_status = 'complete'
GROUP BY c.year_opened
,c.credit_card_id
)
SELECT cte.year_opened AS "year opened"
,SUM(CASE
WHEN transaction_count < 10
THEN 1
ELSE 0
END) AS "Less than 10"
,SUM(CASE
WHEN transaction_count >= 10
AND transaction_count < 30
THEN 1
ELSE 0
END) AS "10 <= transaction count < 30"
,SUM(CASE
WHEN transaction_count >= 30
THEN 1
ELSE 0
END) AS "Greater than or equal to 30"
FROM CTE
GROUP BY cte.year_opened
and the output would be as below.
year opened | Less than 10 | 10 <= transaction count < 30 | Greater than or equal to 30
2002 | 23421 | |

PostgreSQL How to check range of integer in case statement

I am having a problem writing a query in which I have to check which range a user's score falls in to display the grade: if the score is between 75 and 100, it's A; if it is between 60 and 75, it's B; and so on.
This is what I am using:
CASE users.points_earned
WHEN 75-100 THEN
'A+'
WHEN 60-75 THEN
'A'
WHEN 40-60 THEN
'B+'
WHEN 1--40 THEN
'B'
ELSE
'Absent'
end as rank
But it's not working. How do I check a range in a CASE expression in PostgreSQL?
You can use BETWEEN to check ranges; it is inclusive on both ends.
WITH users(points_earned) as(
select 75
union all
select 90
union all
select 200
)
SELECT CASE
WHEN users.points_earned BETWEEN 40 AND 75 THEN 'A+'
WHEN users.points_earned BETWEEN 76 AND 100 THEN 'A'
ELSE 'Absent'
END as rank
FROM users
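For the full set of bands from the question, a searched CASE could look like the sketch below, run here on an in-memory SQLite table from Python. The question's ranges overlap at 75, 60 and 40, so this sketch assumes each boundary value belongs to the higher grade and that scores are integers; the sample scores are made up, and the output column is called grade rather than rank.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (points_earned INTEGER)")
# Hypothetical scores, including boundary values and one out-of-range value
conn.executemany("INSERT INTO users VALUES (?)",
                 [(p,) for p in (90, 75, 60, 40, 10, 200)])

query = """
SELECT points_earned,
       CASE
           WHEN points_earned BETWEEN 75 AND 100 THEN 'A+'
           WHEN points_earned BETWEEN 60 AND 74 THEN 'A'
           WHEN points_earned BETWEEN 40 AND 59 THEN 'B+'
           WHEN points_earned BETWEEN 1 AND 39 THEN 'B'
           ELSE 'Absent'
       END AS grade
FROM users
ORDER BY rowid
"""
graded = conn.execute(query).fetchall()
print(graded)
```

The original attempt failed because `CASE expr WHEN 75-100` compares the score with the arithmetic result -25; the searched form `CASE WHEN condition` is what allows range tests.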

In Redshift/Postgres, how to count rows that meet a condition?

I'm trying to write a query that count only the rows that meet a condition.
For example, in MySQL I would write it like this:
SELECT
COUNT(IF(grade < 70, 1, NULL))
FROM
grades
ORDER BY
id DESC;
However, when I attempt to do that on Redshift, it returns the following error:
ERROR: function if(boolean, integer, "unknown") does not exist
Hint: No function matches the given name and argument types. You may need to add explicit type casts.
I checked the documentation for conditional statements, and I found
NULLIF(value1, value2)
but it only compares value1 and value2 and if such values are equal, it returns null.
I couldn't find a simple IF statement, and at first glance I couldn't find a way to do what I want to do.
I tried to use the CASE expression, but I'm not getting the results I want:
SELECT
CASE
WHEN grade < 70 THEN COUNT(rank)
ELSE COUNT(rank)
END
FROM
grades
This is the way I want to count things:
failed (grade < 70)
average (70 <= grade < 80)
good (80 <= grade < 90)
excellent (90 <= grade <= 100)
and this is how I expect to see the results:
+========+=========+======+===========+
| failed | average | good | excellent |
+========+=========+======+===========+
| 4 | 2 | 1 | 4 |
+========+=========+======+===========+
but I'm getting this:
+========+=========+======+===========+
| failed | average | good | excellent |
+========+=========+======+===========+
| 11 | 11 | 11 | 11 |
+========+=========+======+===========+
I hope someone could point me to the right direction!
If this helps here's some sample info
CREATE TABLE grades(
grade integer DEFAULT 0
);
INSERT INTO grades(grade) VALUES (69), (50), (55), (60), (75), (70), (87), (100), (100), (98), (94);
First, the issue you're having here is that what you're saying is "If the grade is less than 70, the value of this case expression is count(rank). Otherwise, the value of this expression is count(rank)." So, in either case, you're always getting the same value.
SELECT
CASE
WHEN grade < 70 THEN COUNT(rank)
ELSE COUNT(rank)
END
FROM
grades
count() only counts non-null values, so typically the pattern you'll see to accomplish what you're trying is this:
SELECT
count(CASE WHEN grade < 70 THEN 1 END) as grade_less_than_70,
count(CASE WHEN grade >= 70 and grade < 80 THEN 1 END) as grade_between_70_and_80
FROM
grades
That way the case expression will only evaluate to 1 when the test expression is true and will be null otherwise. Then the count() will only count the non-null instances, i.e. when the test expression is true, which should give you what you need.
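As a quick check, the sketch below applies the count(CASE WHEN ... THEN 1 END) pattern to the eleven sample grades from the question, using an in-memory SQLite database from Python. The four column names follow the categories the question lists; the use of SQLite instead of Redshift is just a convenience for this example.

```python
import sqlite3

# The sample grades given in the question
grades = [69, 50, 55, 60, 75, 70, 87, 100, 100, 98, 94]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE grades (grade INTEGER DEFAULT 0)")
conn.executemany("INSERT INTO grades (grade) VALUES (?)", [(g,) for g in grades])

query = """
SELECT
    count(CASE WHEN grade < 70 THEN 1 END) AS failed,
    count(CASE WHEN grade >= 70 AND grade < 80 THEN 1 END) AS average,
    count(CASE WHEN grade >= 80 AND grade < 90 THEN 1 END) AS good,
    count(CASE WHEN grade >= 90 AND grade <= 100 THEN 1 END) AS excellent
FROM grades
"""
# CASE without ELSE yields NULL for non-matching rows, which count() skips
counts = conn.execute(query).fetchone()
print(counts)
```

This gives the 4, 2, 1, 4 split shown in the question's expected output.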
Edit: As a side note, notice that this is exactly the same as how you had originally written this using count(if(test, true-value, false-value)), only re-written as count(case when test then true-value end) (and null is the stand in false-value since an else wasn't supplied to the case).
Edit: PostgreSQL 9.4 was released a few months after this original exchange. That version introduced aggregate filters, which can make scenarios like this look a little nicer and clearer. This answer still gets occasional upvotes, so if you've stumbled upon it and are using a newer PostgreSQL (i.e. 9.4+), you might want to consider this equivalent version:
SELECT
count(*) filter (where grade < 70) as grade_less_than_70,
count(*) filter (where grade >= 70 and grade < 80) as grade_between_70_and_80
FROM
grades
Another method:
SELECT
sum(CASE WHEN grade < 70 THEN 1 else 0 END) as grade_less_than_70,
sum(CASE WHEN grade >= 70 and grade < 80 THEN 1 else 0 END) as grade_between_70_and_80
FROM
grades
This also works fine if you want to group the counts by a categorical column.
The solution given by @yieldsfalsehood works perfectly:
SELECT
count(*) filter (where grade < 70) as grade_less_than_70,
count(*) filter (where grade >= 70 and grade < 80) as grade_between_70_and_80
FROM
grades
But since you mentioned NULLIF(value1, value2), there's a way to get the same result with NULLIF: it turns false into NULL, so count() only counts the rows where the condition is true:
select count(nullif(grade < 70, false)) as failed from grades;
Redshift only
For lazy typers, here's a "COUNTIF" version that sums the conditions cast to integers, built on top of @user1509107's answer:
SELECT
SUM((grade < 70)::INT) AS grade_less_than_70,
SUM((grade >= 70 AND grade < 80)::INT) AS grade_between_70_and_80
FROM
grades

How to check if the sum of some records equals the difference between two other records in t-sql?

I have a view that contains bank account activity.
ACCOUNT BALANCE_ROW AMOUNT SORT_ORDER
111 1 0.00 1
111 0 10.00 2
111 0 -2.50 3
111 1 7.50 4
222 1 100.00 5
222 0 25.00 6
222 1 125.00 7
ACCOUNT = account number
BALANCE_ROW = 1 for a starting or ending balance row, otherwise 0
AMOUNT = the amount
SORT_ORDER = simple order to return the records in the order of start balance, activity, and end balance
I need to figure out a way to see if the sum of the non-balance rows (BALANCE_ROW = 0) equals the difference between the ending balance and the starting balance. The result for each account (1 for yes, 0 for no) would simply be added to the result set.
Example:
Account 111 had a starting balance of 0.00. There were two account activity records of 10.00 and -2.5. That resulted in the ending balance of 7.50.
I've been playing around with temp tables, but I was not sure if there is a more efficient way of accomplishing this.
Thanks for any input you may have!
I would use ranking, then group rows by ACCOUNT calculating totals along the way:
;
WITH ranked AS (
SELECT
*,
rnk = ROW_NUMBER() OVER (PARTITION BY ACCOUNT ORDER BY SORT_ORDER)
FROM data
),
grouped AS (
SELECT
ACCOUNT,
BALANCE_DIFF = SUM(CASE BALANCE_ROW WHEN 1 THEN AMOUNT END
* CASE rnk WHEN 1 THEN -1 ELSE 1 END),
ACTIVITY_SUM = SUM(CASE BALANCE_ROW WHEN 0 THEN AMOUNT ELSE 0 END)
FROM ranked
GROUP BY
ACCOUNT
)
SELECT *
FROM grouped
WHERE BALANCE_DIFF <> ACTIVITY_SUM
Ranking is only used here to make it easier to calculate the starting/ending balance difference. If starting and ending balance rows had, for instance, different BALANCE_ROW codes (like 1 for the starting balance, 2 for the ending one), it would be possible to avoid ranking.
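The check can also be sketched end to end on the sample rows from the question. The example below runs on an in-memory SQLite database from Python and, to stay portable, swaps ROW_NUMBER for a join on each account's smallest SORT_ORDER to find the starting balance row; that substitution, the balanced column name, and the extra account 333 (added so that one account fails the check) are assumptions of this sketch, not part of the answer above.

```python
import sqlite3

# (ACCOUNT, BALANCE_ROW, AMOUNT, SORT_ORDER); accounts 111 and 222 are from
# the question, account 333 is a hypothetical mismatch
rows = [
    (111, 1, 0.00, 1), (111, 0, 10.00, 2), (111, 0, -2.50, 3), (111, 1, 7.50, 4),
    (222, 1, 100.00, 5), (222, 0, 25.00, 6), (222, 1, 125.00, 7),
    (333, 1, 50.00, 8), (333, 0, 5.00, 9), (333, 1, 60.00, 10),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE data (ACCOUNT INTEGER, BALANCE_ROW INTEGER, AMOUNT REAL, SORT_ORDER INTEGER)"
)
conn.executemany("INSERT INTO data VALUES (?, ?, ?, ?)", rows)

query = """
SELECT d.ACCOUNT,
       CASE WHEN
           -- ending balance minus starting balance
           SUM(CASE WHEN d.BALANCE_ROW = 1 AND d.SORT_ORDER = f.first_sort THEN -d.AMOUNT
                    WHEN d.BALANCE_ROW = 1 THEN d.AMOUNT
                    ELSE 0 END)
           -- compared with the sum of the activity rows
           = SUM(CASE WHEN d.BALANCE_ROW = 0 THEN d.AMOUNT ELSE 0 END)
       THEN 1 ELSE 0 END AS balanced
FROM data d
JOIN (SELECT ACCOUNT, MIN(SORT_ORDER) AS first_sort
      FROM data
      GROUP BY ACCOUNT) f
  ON f.ACCOUNT = d.ACCOUNT
GROUP BY d.ACCOUNT
ORDER BY d.ACCOUNT
"""
flags = conn.execute(query).fetchall()
print(flags)
```

The amounts here are exactly representable as floating-point values, so the equality test is safe in this sketch; with real money columns a DECIMAL type (as T-SQL offers) avoids rounding surprises.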
Untested code, but should be really close for comparing the summed balance with the balance_row as you've defined in your question.
SELECT
Account, /* Account Number */
(select sum(B.amount) from yourview B
where B.balance_row = 0 and
B.account = A.account and
B.sort_order BETWEEN
(select max(C.sort_order) /* previous balance row on this account */
from yourview C where
C.balance_row = 1 and
C.account = A.account and
C.sort_order < A.sort_order)
and A.sort_order
) AS Test_Balance, /* Test_Balance = sum of amounts since last balance row */
Balance_Row /* Value of balance row */
FROM yourview A
WHERE A.Balance_Row = 1