SQL to select users into groups based on group percentage - tsql

To keep this simple, let's say I have a table with 100 records that include:
userId
pointsEarned
I would like to group these 100 records (or whatever the total is based on other criteria) into several groups as follows:
Group 1, 15% of total records
Group 2, 25% of total records
Group 3, 10% of total records
Group 4, 10% of total records
Group 5, 40% (remaining of total records, percentage doesn't really matter)
In addition to the above, there will be a minimum of 3 groups and a maximum of 5 groups with varying percentages that always total 100%. If it makes it easier, the last group will always be the remainder not picked for the other groups.
I'd like the results to be as follows:
groupNbr
userId
pointsEarned

To do this sort of split, you need a way to rank the records so that you can decide which group each one belongs in. If you do not want to randomise the group allocation, and userId is a contiguous number, then using userId directly would be sufficient. However, you probably can't guarantee that, so you need to create some sort of ranking and then use it to split your data into groups. Here is a simple example.
Declare @Total int
Set @Total = (Select COUNT(*) from dataTable)
-- Thresholds are cumulative: 15%, 15+25=40%, 40+10=50%, 50+10=60%, rest in group 5
Select case
    when ranking <= 0.15 * @Total then 1
    when ranking <= 0.4 * @Total then 2
    when ranking <= 0.5 * @Total then 3
    when ranking <= 0.6 * @Total then 4
    else 5 end as groupNbr,
    userId,
    pointsEarned
FROM (Select userId, pointsEarned, ROW_NUMBER() OVER (ORDER BY userId) as ranking From dataTable) A
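The ranking-and-threshold idea above can be tried out locally. The following is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for SQL Server, so the syntax differs slightly from T-SQL: the total comes from COUNT(*) OVER () instead of a variable, and the thresholds are written as integer comparisons to sidestep floating-point rounding. The sample data (100 rows with non-contiguous ids) is made up for illustration.

```python
import sqlite3

# Hypothetical stand-in for dataTable: 100 users with non-contiguous ids.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dataTable (userId INTEGER PRIMARY KEY, pointsEarned INTEGER)")
conn.executemany(
    "INSERT INTO dataTable (userId, pointsEarned) VALUES (?, ?)",
    [(i * 3 + 7, (i * 37) % 500) for i in range(100)],
)

GROUP_QUERY = """
WITH ranked AS (
    SELECT userId,
           pointsEarned,
           ROW_NUMBER() OVER (ORDER BY userId) AS ranking,
           COUNT(*) OVER () AS total
    FROM dataTable
)
SELECT CASE
           WHEN ranking * 100 <= 15 * total THEN 1  -- first 15%
           WHEN ranking * 100 <= 40 * total THEN 2  -- next 25%
           WHEN ranking * 100 <= 50 * total THEN 3  -- next 10%
           WHEN ranking * 100 <= 60 * total THEN 4  -- next 10%
           ELSE 5                                   -- remainder
       END AS groupNbr,
       userId,
       pointsEarned
FROM ranked
ORDER BY ranking
"""

rows = conn.execute(GROUP_QUERY).fetchall()

# Count how many users landed in each group.
group_sizes = {}
for group_nbr, _user_id, _points in rows:
    group_sizes[group_nbr] = group_sizes.get(group_nbr, 0) + 1
print(group_sizes)
```

With 100 rows the five groups come out with 15, 25, 10, 10 and 40 members. When the total is not a multiple of the percentages, the rounding falls on the group boundaries, and the last group absorbs whatever is left.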
If you need to randomise which group data end up in, then you need to allocate a random number to each row first, and then rank them by that random number and then split as above.
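One way to sketch the randomised variant is to rank directly by a random sort key. The example below again uses Python's sqlite3 as a stand-in, where the function is random(); in T-SQL the usual counterpart is ORDER BY NEWID(). The table contents are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dataTable (userId INTEGER PRIMARY KEY, pointsEarned INTEGER)")
conn.executemany(
    "INSERT INTO dataTable (userId, pointsEarned) VALUES (?, ?)",
    [(i, i * 10) for i in range(1, 101)],  # hypothetical sample rows
)

RANDOM_GROUP_QUERY = """
WITH ranked AS (
    SELECT userId,
           pointsEarned,
           ROW_NUMBER() OVER (ORDER BY random()) AS ranking,  -- random order
           COUNT(*) OVER () AS total
    FROM dataTable
)
SELECT CASE
           WHEN ranking * 100 <= 15 * total THEN 1
           WHEN ranking * 100 <= 40 * total THEN 2
           WHEN ranking * 100 <= 50 * total THEN 3
           WHEN ranking * 100 <= 60 * total THEN 4
           ELSE 5
       END AS groupNbr,
       userId,
       pointsEarned
FROM ranked
"""

random_rows = conn.execute(RANDOM_GROUP_QUERY).fetchall()

# Group membership changes from run to run, but the group sizes do not.
random_sizes = {}
for group_nbr, _user_id, _points in random_rows:
    random_sizes[group_nbr] = random_sizes.get(group_nbr, 0) + 1
print(random_sizes)
```

Because the ranking is still a permutation of 1..N, the group sizes stay fixed; only which user lands in which group varies.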
If you need to make the splits more flexible, you could design a split table with columns such as minPercentage, maxPercentage and groupNbr, fill it with the splits, and do something like this:
Declare @Total int
Set @Total = (Select COUNT(*) from dataTable)
Select S.groupNbr,
    B.userId,
    B.pointsEarned
FROM (Select ranking * 100.0 / @Total as rankPercent, userId, pointsEarned  -- 100.0 avoids integer division
    FROM (Select userId, pointsEarned, ROW_NUMBER() OVER (ORDER BY userId) as ranking From dataTable) A
    ) B
-- exclusive lower bound so a row on a boundary matches only one group
inner join splitTable S on S.minPercentage < B.rankPercent and S.maxPercentage >= B.rankPercent
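As a rough illustration of the split-table approach, here is a small sqlite3 sketch. The three splits (15%, 25%, 60%) and the 40 sample rows are assumptions made for the example, and the lower bound of each range is treated as exclusive so that a row sitting exactly on a boundary joins to only one group.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dataTable (userId INTEGER PRIMARY KEY, pointsEarned INTEGER);
CREATE TABLE splitTable (groupNbr INTEGER, minPercentage REAL, maxPercentage REAL);
-- Hypothetical splits: 15%, 25%, and the remaining 60%.
INSERT INTO splitTable VALUES (1, 0, 15), (2, 15, 40), (3, 40, 100);
""")
conn.executemany(
    "INSERT INTO dataTable (userId, pointsEarned) VALUES (?, ?)",
    [(i, i % 7) for i in range(1, 41)],
)

SPLIT_QUERY = """
WITH ranked AS (
    SELECT userId,
           pointsEarned,
           -- 100.0 forces real division so the percentage is not truncated
           ROW_NUMBER() OVER (ORDER BY userId) * 100.0 / COUNT(*) OVER () AS rankPercent
    FROM dataTable
)
SELECT S.groupNbr, R.userId, R.pointsEarned
FROM ranked R
JOIN splitTable S
  ON R.rankPercent > S.minPercentage
 AND R.rankPercent <= S.maxPercentage
ORDER BY R.userId
"""

split_rows = conn.execute(SPLIT_QUERY).fetchall()

split_sizes = {}
for group_nbr, _user_id, _points in split_rows:
    split_sizes[group_nbr] = split_sizes.get(group_nbr, 0) + 1
print(split_sizes)
```

Changing the number of groups or their percentages then only means editing the rows of splitTable, not the query.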

Related

How to retrieve the N first rows AND the N last rows in only one request?

Let's say we have a huge query like this:
SELECT id, quality FROM products ORDER BY quality
Is it possible to retrieve the N first rows AND the N last rows of the results without performing two requests?
What I want to avoid (two requests):
SELECT id, quality FROM products ORDER BY quality LIMIT 5;
SELECT id, quality FROM products ORDER BY quality DESC LIMIT 5;
Context: the actual request is very CPU/time consuming, that's why I want to limit to one request if possible.
Using a WITH clause to avoid writing the same code twice:
WITH my_complex_query AS (
SELECT * FROM table_name
)
(SELECT * FROM my_complex_query ORDER BY id ASC LIMIT 5)
UNION ALL
(SELECT * FROM my_complex_query ORDER BY id DESC LIMIT 5)
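Or, without a CTE (note that the first branch also needs an ORDER BY, otherwise its five rows are arbitrary):
(SELECT * FROM table_name ORDER BY id ASC LIMIT 5) UNION (SELECT * FROM table_name ORDER BY id DESC LIMIT 5);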
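A different technique from the UNION answers above is to number the rows from both ends with two ROW_NUMBER() windows and filter on either number. This is only a sketch, run here against SQLite through Python's sqlite3; the products table contents are invented, and id is added as a tie-breaker so the ordering is deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, quality INTEGER)")
conn.executemany(
    "INSERT INTO products (id, quality) VALUES (?, ?)",
    [(i, i * 5) for i in range(1, 21)],  # hypothetical data: 20 products
)

FIRST_AND_LAST_QUERY = """
WITH ranked AS (
    SELECT id,
           quality,
           ROW_NUMBER() OVER (ORDER BY quality ASC,  id ASC)  AS rn_asc,
           ROW_NUMBER() OVER (ORDER BY quality DESC, id DESC) AS rn_desc
    FROM products
)
SELECT id, quality
FROM ranked
WHERE rn_asc <= ? OR rn_desc <= ?
ORDER BY quality, id
"""


def first_and_last(n):
    """Return the n lowest-quality and n highest-quality products in one query."""
    return conn.execute(FIRST_AND_LAST_QUERY, (n, n)).fetchall()


print(first_and_last(5))
```

Unlike UNION ALL, this form returns each row once even when N is larger than half the table. Whether it is actually cheaper than two queries depends on the database and the plan it chooses, so it is worth measuring on the real query.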

I want to select 2 data from database which durations less than 150

I have a problem with my SQL command. I want to select 2 movies whose durations sum to less than 150. I wrote this SQL command:
Select
movie_title,Sum(movie_time) as sum_movie
From
movie_movie
Group By
movie_title
Having
Sum(movie_time)<100
Order By
sum_movie DESC
You can get the two movies with the smallest movie_time values using ORDER BY movie_time ASC LIMIT 2 in a CTE, and then use that CTE in the condition:
with two_min_movie as (
select *
from movie_movie
order by movie_time ASC limit 2
)
select *
from two_min_movie
where (select sum(movie_time) from two_min_movie) < 150
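To see how the CTE answer behaves, here is a small sqlite3 sketch with made-up movies. The threshold is passed as a parameter, and a tie-breaker on movie_title is added (an assumption, not part of the answer) so the result is deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE movie_movie (movie_title TEXT, movie_time INTEGER)")
conn.executemany(
    "INSERT INTO movie_movie (movie_title, movie_time) VALUES (?, ?)",
    [("A", 90), ("B", 50), ("C", 70), ("D", 120)],  # hypothetical rows
)

PAIR_QUERY = """
WITH two_min_movie AS (
    SELECT movie_title, movie_time
    FROM movie_movie
    ORDER BY movie_time ASC, movie_title ASC
    LIMIT 2
)
SELECT movie_title, movie_time
FROM two_min_movie
WHERE (SELECT SUM(movie_time) FROM two_min_movie) < ?
ORDER BY movie_time
"""


def shortest_pair_under(limit):
    """Return the two shortest movies if their combined time is below limit, else nothing."""
    return conn.execute(PAIR_QUERY, (limit,)).fetchall()


print(shortest_pair_under(150))
print(shortest_pair_under(100))
```

The query only ever considers the two shortest movies: if even that pair is too long, no pair can qualify, so an empty result means no two movies fit under the limit.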

GROUP BY - How to create 3 group for the column?

Say I have a table of products, fields are id, number_of_product, price
Let's say price has min = 100 and max = 1000*
How to create 3 groups for this column (PostgreSQL) - 100-400, 400-600, 600-1000*
*PS - it would be nice to know how to split into 3 equal parts.
SELECT COUNT(id),
COUNT(number_of_product),
!!!! price - ?!
FROM Scheme.Table
GROUP BY PRICE
You can try the following query:
with p as (
    select
        *,
        min(price) over() min_price,
        -- width of one of three equal price ranges; 3.0 avoids integer division
        (max(price) over() - min(price) over()) / 3.0 step
    from products
) select
    id, product, price,
    case
        when price < min_price + step then 'low_price'
        when price < min_price + 2 * step then 'mid_price'
        else 'high_price'
    end as category
from p
order by price;
To do this quickly, you can use a case statement to set the groups.
-- BETWEEN is inclusive; at a shared boundary (400, 600) the first matching WHEN wins
CASE
    WHEN price BETWEEN 100 AND 400 THEN 1
    WHEN price BETWEEN 400 AND 600 THEN 2
    WHEN price BETWEEN 600 AND 1000 THEN 3
    ELSE 0
END
You would group on this.
For splitting into equal parts, you would use the NTILE window function to group.
NTILE(3) OVER (
ORDER BY price
)
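The NTILE suggestion can be sketched end to end as follows, using Python's sqlite3 (SQLite also provides NTILE, as PostgreSQL does). The nine sample prices are invented. Note that NTILE(3) splits the rows into three groups of equal row count, which is not the same thing as three price ranges of equal width.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE products (id INTEGER PRIMARY KEY, number_of_product INTEGER, price INTEGER)"
)
sample_prices = [100, 150, 300, 400, 550, 600, 700, 900, 1000]  # hypothetical
conn.executemany(
    "INSERT INTO products (number_of_product, price) VALUES (?, ?)",
    [(1, p) for p in sample_prices],
)

NTILE_GROUP_QUERY = """
WITH bucketed AS (
    SELECT id,
           price,
           NTILE(3) OVER (ORDER BY price) AS bucket  -- 1 = cheapest third
    FROM products
)
SELECT bucket,
       COUNT(id)  AS products_in_bucket,
       MIN(price) AS min_price,
       MAX(price) AS max_price
FROM bucketed
GROUP BY bucket
ORDER BY bucket
"""

bucket_summary = conn.execute(NTILE_GROUP_QUERY).fetchall()
for row in bucket_summary:
    print(row)
```

If equal-width price ranges are wanted instead, the min/max/step calculation from the first answer is the closer fit.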

how to values transfer to another column with two query

I have a query that calculates a percentage for every product. I created a computed column in this query named 'yüzde'. I then want to copy the yüzde values into a column in another table with an UPDATE query, where the product IDs match.
I think I need to write a stored procedure. How can I do that?
SELECT [ProductVariantId] ,
count([ProductVariantId]) as bedensayısı,
count([ProductVariantId]) * 100.0 / (SELECT Top 1 Count(*) as Total
FROM [Live_ADL].[dbo].[_INV_ProductCombinationAttributes]
Where Size LIKE '%[^0-9]%' and [StockQuantity]>0
Group by [ProductVariantId]
order by Total Desc) as yüzde
FROM [Live_ADL].[dbo].[_INV_ProductCombinationAttributes]
Where Size LIKE '%[^0-9]%' and [StockQuantity]>0
group by [ProductVariantId]
order by yüzde desc
You don't really need a stored procedure; you can do it inline, using a CTE for instance, something along these lines:
; with tabyuzde as
(
SELECT [ProductVariantId] ,
count([ProductVariantId]) as bedensayısı,
count([ProductVariantId]) * 100.0 / (SELECT Top 1 Count(*) as Total
FROM [Live_ADL].[dbo].[_INV_ProductCombinationAttributes]
Where Size LIKE '%[^0-9]%' and [StockQuantity]>0
Group by [ProductVariantId]
order by Total Desc) as yüzde
FROM [Live_ADL].[dbo].[_INV_ProductCombinationAttributes]
Where Size LIKE '%[^0-9]%' and [StockQuantity]>0
group by [ProductVariantId]
)
update x
set othertablevalue=yüzde
from
othertable x
join tabyuzde t on x.ProductVariantId=t.ProductVariantId
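A simplified, runnable sketch of the same "aggregate in a CTE, then update" pattern is shown below using Python's sqlite3. Several things here are assumptions for illustration: the variants table stands in for the real source table, the Size filter is left out, TOP 1 ... ORDER BY Total DESC is rewritten as MAX(n), the column is spelled yuzde, and a correlated subquery replaces T-SQL's UPDATE ... FROM ... JOIN so that the example runs on older SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE variants (ProductVariantId INTEGER, StockQuantity INTEGER);
CREATE TABLE othertable (ProductVariantId INTEGER PRIMARY KEY, othertablevalue REAL);
INSERT INTO variants VALUES (1, 5), (1, 2), (1, 9), (1, 1), (2, 3), (2, 4), (3, 0);
INSERT INTO othertable VALUES (1, NULL), (2, NULL), (3, NULL);
""")

UPDATE_SQL = """
WITH counts AS (
    SELECT ProductVariantId, COUNT(*) AS n
    FROM variants
    WHERE StockQuantity > 0
    GROUP BY ProductVariantId
),
tabyuzde AS (
    -- percentage relative to the variant with the most in-stock rows
    SELECT ProductVariantId,
           n * 100.0 / (SELECT MAX(n) FROM counts) AS yuzde
    FROM counts
)
UPDATE othertable
SET othertablevalue = (
    SELECT t.yuzde
    FROM tabyuzde t
    WHERE t.ProductVariantId = othertable.ProductVariantId
)
WHERE ProductVariantId IN (SELECT ProductVariantId FROM tabyuzde)
"""

conn.execute(UPDATE_SQL)
updated = conn.execute(
    "SELECT ProductVariantId, othertablevalue FROM othertable ORDER BY ProductVariantId"
).fetchall()
print(updated)
```

Rows in othertable with no matching product (variant 3 here, which has no stock) are left untouched rather than overwritten with NULL.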

Get total count per ID change

How can I get a total count of sheets per change of sheet?
example:
select sheetID,
..
from SomeTable
results look something like this:
sheetID
-----------
1000
1000
1000
1000
3000
3000
3000
so I want something like this:
select sheetID,
count(sheetID) as TotalsheetCount
from SomeTable
I just don't know how to break the count up per change of sheetID.
So I'd end up with this essentially:
sheetID TotalsheetCount
-------- -----------
1000 4
1000 4
1000 4
1000 4
3000 3
3000 3
3000 3
so 4 is because there are 4 1000s, 3 because there are 3 3000s. I am wanting to repeat the total count for that sheetID for each row, even though it's repeating, I want to provide that.
UPDATE: here's what I did per the replies, but I'm getting way too many results now compared to the count before I added the partition count:
select MainTable.sheetID,
COUNT(SomeTable.sheetID)OVER(PARTITION BY SomeTable.sheetID) AS TotalSheetCount
table2.SomeField1,
table2.SomeField1
from MainTable
join (select distinct Sales.SalesKey from SomeLongTableName_Sales) sales on sales.SheetKey = MainTable.sheetKey
left outer join Site on MainTable.SiteKey = Site.SiteKey
join Calendar on sales.Date >= Calendar.StartDate
and sales.Date < Calendar.EndDate
group by SomeTable.sheetID
The joins and so on are closer to my real query, but reformatted for this post to hide the real field and table names.
You probably want to use a GROUP BY:
SELECT sheetID, COUNT(sheetID) AS TotalsheetCount
FROM dbo.SomeTable
GROUP BY sheetID
"I am wanting to repeat the total count for that sheetID for each row, even though it's repeating, I want to provide that"
If you're using at least SQL Server 2005, you can use a CTE with COUNT and the OVER clause; otherwise use a subquery:
WITH CTE AS
(
SELECT sheetID,
COUNT(sheetID)OVER(PARTITION BY sheetID) AS TotalsheetCount
FROM SomeTable
)
SELECT sheetID, TotalsheetCount FROM CTE
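The COUNT ... OVER (PARTITION BY ...) form can be checked against the sample data from the question. This sketch uses Python's sqlite3 as a stand-in for SQL Server; the window syntax is the same in both.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE SomeTable (sheetID INTEGER)")
conn.executemany(
    "INSERT INTO SomeTable (sheetID) VALUES (?)",
    [(1000,)] * 4 + [(3000,)] * 3,  # the rows shown in the question
)

PARTITION_COUNT_QUERY = """
SELECT sheetID,
       COUNT(sheetID) OVER (PARTITION BY sheetID) AS TotalsheetCount
FROM SomeTable
ORDER BY sheetID
"""

sheet_rows = conn.execute(PARTITION_COUNT_QUERY).fetchall()
for sheet_id, total in sheet_rows:
    print(sheet_id, total)
```

Every input row is kept and carries the count for its sheetID, whereas a plain GROUP BY collapses the output to one row per sheetID.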
Use the GROUP BY clause in a subquery to select the counts:
SELECT sheetID,
count(sheetID) as TotalsheetCount
FROM SomeTable
GROUP BY sheetID
This would make your whole query look like this:
SELECT t.sheetID,
    counts.TotalsheetCount
FROM SomeTable t
JOIN (SELECT sheetID, count(sheetID) as TotalsheetCount FROM SomeTable GROUP BY sheetID) counts
    ON t.sheetID = counts.sheetID
It looks like you need a group-by expression:
select sheetID,
count(*) as TotalsheetCount
from SomeTable
group by sheetID
Is that it?
DC