T-SQL to group results by a range of possible values

Not sure how to word the question:
I have a query such as
SELECT s.*
FROM SUMMARY s
WHERE s.TYP = 'A'
AND s.NUM > 0
AND s.NUM <= 999999
and a group by like
SELECT s.TYP, COUNT(*)
FROM SUMMARY s
GROUP BY s.TYP
which gives:
A 38720
B 39500
C 170
D 850
E 8891
What I'd like to do is get a "split" of my results using a "range" like:
TYP RANGE(NUM) COUNT
A 0000>1000 240
A 1000>2000 800
A 2000>3000 120
etc...
Is there a simple way of doing this?

Use a CASE expression in the SELECT list to assign each row to a range, GROUP BY that same expression, and then use the PIVOT operator if you want to flip the result into columns.
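For equal-width ranges there is a shortcut worth knowing: instead of spelling out every range by hand, integer division can compute the bucket. The sketch below is only an illustration, not the answer's exact method. It runs the SQL through Python's built-in sqlite3 so it can be tried anywhere; the sample rows, the RANGE_START and CNT aliases, and the count_by_range helper are made up for the demo. The expression (NUM / 1000) * 1000 should behave the same way in T-SQL as long as NUM is an integer column.

```python
import sqlite3

# Hypothetical sample rows (TYP, NUM); these are not from the question.
SAMPLE_ROWS = [
    ("A", 10), ("A", 999), ("A", 1000), ("A", 1500),
    ("A", 2500), ("B", 42), ("B", 1999),
]

# Integer division by the bucket width gives the bucket index;
# multiplying back gives the lower bound of the range.
QUERY = """
SELECT s.TYP,
       (s.NUM / 1000) * 1000 AS RANGE_START,
       COUNT(*)              AS CNT
FROM SUMMARY s
WHERE s.NUM > 0
  AND s.NUM <= 999999
GROUP BY s.TYP, (s.NUM / 1000) * 1000
ORDER BY s.TYP, RANGE_START
"""


def count_by_range(rows):
    """Load rows into an in-memory SUMMARY table and count per 1000-wide range."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE SUMMARY (TYP TEXT, NUM INTEGER)")
    conn.executemany("INSERT INTO SUMMARY VALUES (?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


for typ, range_start, cnt in count_by_range(SAMPLE_ROWS):
    print(typ, range_start, cnt)
```

With this arithmetic a NUM of exactly 1000 lands in the range starting at 1000. If the ranges are not all the same width, a CASE expression with one WHEN per range is the more flexible choice.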

Get the ID of a table and its modulo with respect to the total rows in the same table in Postgres

While trying to map some data to a table, I wanted to obtain each ID in a table along with its modulo with respect to the total number of rows in that same table. For example, given this table:
id
--
1
3
10
12
I would like this result:
id | mod
---+----
1 | 1 <- 1 mod 4
3 | 3 <- 3 mod 4
10 | 2 <- 10 mod 4
12 | 0 <- 12 mod 4
Is there an easy way to achieve this dynamically (as in, not counting the rows beforehand, or doing it in an atomic way)?
So far I've tried something like this:
SELECT t1.id, t1.id % COUNT(t1.id) mod FROM tbl t1, tbl t2 GROUP BY t1.id;
This works, but you must have both the GROUP BY and tbl t2; otherwise it returns 0 for the mod column. That makes sense, because I think it works by multiplying the table by itself (a cross join), so each ID gets a full set of the table's rows to count. I guess for small enough tables this is OK, but I can see how it becomes problematic for larger tables.
Edit: Found another hack-ish way:
WITH total AS (
SELECT COUNT(*) cnt FROM tbl
)
SELECT t1.id, t1.id % t2.cnt mod FROM tbl t1, total t2
It is similar to the previous query, but it "collapses" the multiplication to a single row holding the precomputed count.
You can use COUNT() window function:
SELECT id,
id % COUNT(*) OVER () mod
FROM tbl;
I'm sure that the optimizer is smart enough to calculate the result of the window function only once.
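To check the window-function query against the sample IDs from the question, here is a small sketch that runs it in Python's built-in sqlite3 rather than Postgres. That is an assumption made for portability: window functions need SQLite 3.25 or later, and the % operator and COUNT(*) OVER () behave the same for this case. The id_modulo_row_count helper and the id_mod alias are names invented for the demo.

```python
import sqlite3


def id_modulo_row_count(ids):
    """Return (id, id % total_rows) for every id, using COUNT(*) OVER ()."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tbl (id INTEGER PRIMARY KEY)")
    conn.executemany("INSERT INTO tbl VALUES (?)", [(i,) for i in ids])
    # An empty OVER () makes the whole result set one window, so every row
    # sees the same total row count without a self-join or GROUP BY.
    rows = conn.execute(
        "SELECT id, id % COUNT(*) OVER () AS id_mod FROM tbl ORDER BY id"
    ).fetchall()
    conn.close()
    return rows


print(id_modulo_row_count([1, 3, 10, 12]))
```

For the four IDs in the question this gives 1, 3, 2 and 0, matching the desired output.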

Need a true/false answer for multiple conditions when joining two tables

I have two tables: one holds information about a sampleid (sampleid is the primary key), and the other holds the conditions a sampleid has (sampleid is not the primary key in this table, as a sample may have multiple conditions). I would like to know whether my sampleid has a specific condition (Y/N), but I am not sure how to join the tables without getting a query that returns multiple rows per sampleid.
eg
sampleid colour
-----------------------
1 blue
2 red
3 green
sampleid condition
-----------------------
1 23
1 81
1 94
2 81
2 94
3 23
I want to ask if the sampleid has condition 23 and return:
sampleid colour condition23
----------------------------------------------
1 blue Y
2 red N
3 green Y
Hope this is clear. Every time I join them I end up with multiple rows per sampleid. I am a newbie and trying to find my way!
Thanks in advance
F
This can be done using a LEFT JOIN and a CASE expression, something like this:
SELECT
s.sampleId,
s.colour,
case when c.condition is null
then 'N'
else 'Y'
end condition23
FROM
samples s
LEFT JOIN conditions c
ON s.sampleId = c.sampleId
AND c.condition = 23
Try this query:
select s.*, case when c.condition is null then 'N' else 'Y' end condition23
from samples s
left join
(select * from conditions where condition = 23) c on s.sampleid = c.sampleid
With EXISTS:
select
s.*,
case
when exists (
select 1 from conditions where sampleid = s.sampleid and condition = 23
) then 'Y'
else 'N'
end condition23
from samples s
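The three answers agree on the sample data, but they differ in one case worth knowing about: if the conditions table ever holds the same (sampleid, condition) pair twice, the LEFT JOIN versions return that sample twice, while the EXISTS version still returns one row per sample. Below is a sketch that shows this with Python's built-in sqlite3, assuming the tables are called samples and conditions as in the answers. The run helper and the duplicated row are made up for the demo.

```python
import sqlite3

# Data taken from the question.
SAMPLES = [(1, "blue"), (2, "red"), (3, "green")]
CONDITIONS = [(1, 23), (1, 81), (1, 94), (2, 81), (2, 94), (3, 23)]

LEFT_JOIN_SQL = """
SELECT s.sampleid, s.colour,
       CASE WHEN c.condition IS NULL THEN 'N' ELSE 'Y' END AS condition23
FROM samples s
LEFT JOIN conditions c
       ON s.sampleid = c.sampleid
      AND c.condition = 23
ORDER BY s.sampleid
"""

EXISTS_SQL = """
SELECT s.sampleid, s.colour,
       CASE WHEN EXISTS (SELECT 1 FROM conditions c
                         WHERE c.sampleid = s.sampleid
                           AND c.condition = 23)
            THEN 'Y' ELSE 'N' END AS condition23
FROM samples s
ORDER BY s.sampleid
"""


def run(sql, conditions):
    """Build both tables in memory and return the rows produced by sql."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE samples (sampleid INTEGER PRIMARY KEY, colour TEXT)")
    conn.execute("CREATE TABLE conditions (sampleid INTEGER, condition INTEGER)")
    conn.executemany("INSERT INTO samples VALUES (?, ?)", SAMPLES)
    conn.executemany("INSERT INTO conditions VALUES (?, ?)", conditions)
    rows = conn.execute(sql).fetchall()
    conn.close()
    return rows


print(run(LEFT_JOIN_SQL, CONDITIONS))
print(run(EXISTS_SQL, CONDITIONS))

# Hypothetical duplicate: sample 1 recorded with condition 23 a second time.
WITH_DUPLICATE = CONDITIONS + [(1, 23)]
print(len(run(LEFT_JOIN_SQL, WITH_DUPLICATE)), len(run(EXISTS_SQL, WITH_DUPLICATE)))
```

If (sampleid, condition) is guaranteed unique, for example by a unique constraint, the join and EXISTS forms are interchangeable.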

SUM of COUNTs in the same table

I'm running several counts that I want to show in one table, and I want the same table to show the sum of all the counts.
Here's what I have (simplified; I actually have 6 counts):
SELECT * FROM (SELECT COUNT() AS NB_book
item as a1, metadatavalue as m1, metadatavalue as m12,
WHERE m1.field_id = 64 (because I need that field to exist)
AND m2.field_id = 66
And m2. = book
AND a1.in_archive = TRUE )
(SELECT COUNT() AS NB_toys
metadatavalue as m1, metadatavalue as m12,
WHERE m1.field_id = 64 (because I need that field to exist)
AND m2.field_id = 66
And m2. = toys
AND a1.in_archive = TRUE)
)
Now, I want the display to be like
-------------table ----------
|NB_book | NB_Toys | total_object |
-----------------------------
| 12 | 10 | 22 |
You want something along the lines of:
SELECT
sum(CASE WHEN condition_1 THEN 1 END) AS firstcount,
sum(CASE WHEN condition_2 THEN 1 END) AS secondcount,
sum(thecolumn) AS total
FROM ...
Your example query is too vague to construct something usable from, but this'll give you the idea. The conditions above can be any boolean expression.
If you prefer you can use NULLIF instead of CASE WHEN ... THEN ... END. I prefer to stick to the standard CASE.
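As a concrete sketch of the conditional-aggregation idea, here is a version run through Python's built-in sqlite3. Everything about the data is assumed, since the question's schema is unclear: the items table, its kind column, and the count_kinds helper are hypothetical. The total here is computed as a third conditional sum over both kinds, which is one reading of what "total" should mean, and ELSE 0 is added so that a kind with no matches shows 0 instead of NULL.

```python
import sqlite3


def count_kinds(kinds):
    """Return (nb_book, nb_toys, total_object) from a single pass over items."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (kind TEXT)")
    conn.executemany("INSERT INTO items VALUES (?)", [(k,) for k in kinds])
    row = conn.execute("""
        SELECT SUM(CASE WHEN kind = 'book' THEN 1 ELSE 0 END) AS nb_book,
               SUM(CASE WHEN kind = 'toys' THEN 1 ELSE 0 END) AS nb_toys,
               SUM(CASE WHEN kind IN ('book', 'toys') THEN 1 ELSE 0 END)
                   AS total_object
        FROM items
    """).fetchone()
    conn.close()
    return row


# 12 books, 10 toys and 3 rows of another kind that should not be counted.
print(count_kinds(["book"] * 12 + ["toys"] * 10 + ["game"] * 3))
```

With 12 books and 10 toys this yields 12, 10 and 22, the layout asked for in the question.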
It is difficult to figure out what you actually want. You can run completely different queries that each return a one-row result and combine the results like this:
select
(select count(*) from pgbench_accounts) as count1,
(select count(*) from pgbench_tellers) as count2 ;
But perhaps you shouldn't do that. Instead just run each query by itself and use the client, rather than the database engine, to format the results.

Counting Number of Users Whose Average is Greater than X in Postgres

I am trying to find out the number of users who have scored an average of 80 or higher. I am using HAVING in my query, but it returns the matching users rather than the number of rows.
The Schema looks like:
Results
user
test_no
question_no
score
My Query:
SELECT "user" FROM results WHERE (score >0) GROUP BY "user"
HAVING (sum(score) / count(distinct(test_no))) >= 80;
I get:
user
2
4
8
(3 rows)
Instead I would like to get 3 (number of rows) as the output. If I do count("user"), I get the count of number of tests for each user.
I understand this is related to the use of GROUP BY, but I need it for my HAVING clause. Any suggestions on how I can do this are appreciated.
Update: Here is some sample data: http://pastebin.com/k1nH5Wzh (-1 means unanswered)
Thanks!
The query you found is good. Some minor simplifications:
SELECT count(*) AS ct
FROM (
SELECT 1
FROM results
WHERE score > 0
GROUP BY user_id
HAVING (sum(score) / count(DISTINCT test_no)) >= 80
) sub
DISTINCT does not require parentheses.
You can SELECT a constant value in the subquery. The value is irrelevant, since you are only going to count the rows. Slightly shorter and cheaper.
Don't use the reserved word user as column name. That's asking for trouble. I am using user_id instead.
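Here is a small runnable sketch of the count-over-a-grouped-subquery pattern, using Python's built-in sqlite3 instead of Postgres so it can be tried without a server. The sample rows, the count_high_scorers helper and the threshold parameter are invented for the demo; the table is assumed to be results with a user_id column, following the renaming suggested above.

```python
import sqlite3

# Hypothetical rows: (user_id, test_no, question_no, score); -1 means unanswered.
SAMPLE_RESULTS = [
    (1, 1, 1, 50), (1, 1, 2, 40),   # user 1, test 1: 90
    (1, 2, 1, 45), (1, 2, 2, 45),   # user 1, test 2: 90 -> average 90
    (2, 1, 1, 30), (2, 1, 2, 30),   # user 2, test 1: 60 -> average 60
    (3, 1, 1, 80), (3, 1, 2, -1),   # user 3, test 1: 80 -> average 80
]


def count_high_scorers(rows, threshold=80):
    """Count users whose summed score per distinct test reaches threshold."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE results "
        "(user_id INTEGER, test_no INTEGER, question_no INTEGER, score INTEGER)"
    )
    conn.executemany("INSERT INTO results VALUES (?, ?, ?, ?)", rows)
    # The inner query yields one row per qualifying user; the outer query
    # only counts those rows, so the selected constant does not matter.
    (ct,) = conn.execute("""
        SELECT count(*) AS ct
        FROM (
            SELECT 1
            FROM results
            WHERE score > 0
            GROUP BY user_id
            HAVING (sum(score) / count(DISTINCT test_no)) >= ?
        ) sub
    """, (threshold,)).fetchone()
    conn.close()
    return ct


print(count_high_scorers(SAMPLE_RESULTS))
```

With these rows, users 1 and 3 qualify, so the count is 2.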
I am not sure if this is an efficient way to do it but this seems to be working.
SELECT COUNT(*) FROM
(SELECT "user" FROM results WHERE (score >0) GROUP BY "user"
HAVING (sum(score) / count(distinct(test_no))) >= 80) q1;

Summing From Consecutive Rows

Assume we have a table and we want to do a sum of the Expend column so that the summation only adds up values of the same Week_Name.
SN Week_Name Exp Sum
-- --------- --- ---
1 Week 1 10 0
2 Week 1 20 0
3 Week 1 30 60
4 Week 2 40 0
5 Week 2 50 90
6 Week 3 10 0
I assume we will need to ORDER BY Week_Name, then compare the Week_Name of the previous row with the Week_Name of the current row.
If both are the same, put zero in the Sum column.
If they are not the same, add up all expenditure where Week_Name = Week_Name (previous row) and place the result in the Sum column. The final output should look like the table above.
Any help on how to achieve this in T-SQL is highly appreciated.
Okay, I was eventually able to resolve this issue, praise Jesus! If you want the exact table I gave above, you can use GilM's response below; it is perfect. If you want your table to have running cumulatives, i.e. row 3 should have 60, row 5 should have 150, row 6 should have 160, etc., then you can use my code below:
USE CAPdb
IF OBJECT_ID ('dbo.[tablebp]') IS NOT NULL
DROP TABLE [tablebp]
GO
CREATE TABLE [tablebp] (
tablebpcCol1 int PRIMARY KEY
,tabledatekey datetime
,tableweekname varchar(50)
,expenditure1 numeric
,expenditure_Cummulative numeric
)
INSERT INTO [tablebp](tablebpcCol1,tabledatekey,tableweekname,expenditure1,expenditure_Cummulative)
SELECT b.s_tablekey,d.PK_Date,d.Week_Name,
SUM(b.s_expenditure1) AS s_expenditure1,
SUM(b.s_expenditure1) + COALESCE((SELECT SUM(s_expenditure1)
FROM source_table bs JOIN dbo.Time dd ON bs.[DATE Key] = dd.[PK_Date]
WHERE dd.PK_Date < d.PK_Date),0)
FROM source_table b
INNER JOIN dbo.Time d ON b.[Date key] = d.PK_Date
GROUP BY d.[PK_Date],d.Week_Name,b.s_tablekey,b.s_expenditure1
ORDER BY d.[PK_Date]
;WITH CTE AS (
SELECT tableweekname
,Max(expenditure_Cummulative) AS Week_expenditure_Cummulative
,MAX(tablebpcCol1) AS MaxSN
FROM [tablebp]
GROUP BY tableweekname
)
SELECT [tablebp].*
,CASE WHEN [tablebp].tablebpcCol1 = CTE.MaxSN THEN Week_expenditure_Cummulative
ELSE 0 END AS [RunWeeklySum]
FROM [tablebp]
JOIN CTE on CTE.tableweekname = [tablebp].tableweekname
I'm not sure why your SN=6 line is 0 rather than 10. Do you really not want the sum for the last Week? If having the last week total is okay, then you might want something like:
;WITH CTE AS (
SELECT Week_Name,SUM([Expend.]) as SumExpend
,MAX(SN) AS MaxSN
FROM T
GROUP BY Week_Name
)
SELECT T.*,CASE WHEN T.SN = CTE.MaxSN THEN SumExpend
ELSE 0 END AS [Sum]
FROM T
JOIN CTE on CTE.Week_Name = T.Week_Name
Based on the request in the comment for a running total in SUM, you could try this:
;WITH CTE AS (
SELECT Week_Name, MAX(SN) AS MaxSN
FROM T
GROUP BY Week_Name
)
SELECT T.SN, T.Week_Name,T.Exp,
CASE WHEN T.SN = CTE.MaxSN THEN
(SELECT SUM(EXP) FROM T T2
WHERE T2.SN <= T.SN) ELSE 0 END AS [SUM]
FROM T
JOIN CTE ON CTE.Week_Name = T.Week_Name
ORDER BY SN
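To see how the per-week total and the running total differ on the question's six rows, here is a sketch that runs both CTE queries through Python's built-in sqlite3 rather than SQL Server. Some names are assumptions: the expenditure column is called Expend here, the WeekSum and RunningSum aliases replace [Sum], and run_query is a helper made up for the demo.

```python
import sqlite3

# Rows from the question: (SN, Week_Name, Expend).
ROWS = [
    (1, "Week 1", 10), (2, "Week 1", 20), (3, "Week 1", 30),
    (4, "Week 2", 40), (5, "Week 2", 50), (6, "Week 3", 10),
]

# Weekly total, shown only on the last row (highest SN) of each week.
WEEKLY_SQL = """
WITH CTE AS (
    SELECT Week_Name, SUM(Expend) AS SumExpend, MAX(SN) AS MaxSN
    FROM T
    GROUP BY Week_Name
)
SELECT T.SN,
       CASE WHEN T.SN = CTE.MaxSN THEN CTE.SumExpend ELSE 0 END AS WeekSum
FROM T
JOIN CTE ON CTE.Week_Name = T.Week_Name
ORDER BY T.SN
"""

# Running total of all rows so far, shown only on the last row of each week.
RUNNING_SQL = """
WITH CTE AS (
    SELECT Week_Name, MAX(SN) AS MaxSN
    FROM T
    GROUP BY Week_Name
)
SELECT T.SN,
       CASE WHEN T.SN = CTE.MaxSN
            THEN (SELECT SUM(T2.Expend) FROM T T2 WHERE T2.SN <= T.SN)
            ELSE 0 END AS RunningSum
FROM T
JOIN CTE ON CTE.Week_Name = T.Week_Name
ORDER BY T.SN
"""


def run_query(sql, rows=ROWS):
    """Load rows into an in-memory table T and return the result of sql."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE T (SN INTEGER PRIMARY KEY, Week_Name TEXT, Expend INTEGER)")
    conn.executemany("INSERT INTO T VALUES (?, ?, ?)", rows)
    result = conn.execute(sql).fetchall()
    conn.close()
    return result


print(run_query(WEEKLY_SQL))
print(run_query(RUNNING_SQL))
```

The weekly version puts 60, 90 and 10 on rows 3, 5 and 6; the running version puts 60, 150 and 160 there. Every other row gets 0 in both.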