I'm trying to write a query that calculates the number of days between the first and last score per id.
The data sample:
id date score
11 1/1/2017 25.34
4 1/2/2017 34.34
25 1/2/2017 15.78
4 3/2/2017 47.2
25 7/3/2017 65.21
11 9/3/2017 96.09
25 10/3/2017 11.3
4 10/3/2017 27.12
Which is far from what I need, but I'm really lost. Clueless to be honest. Any idea?
Thanks
Try this:
SELECT
    customer_id,
    last_score - first_score AS days_between_last_and_first_score,
    -- NULLIF avoids a division by zero when all of an id's scores fall on one day
    total_score::float / NULLIF(last_score - first_score, 0) AS score_per_day
FROM (
    SELECT
        customer_id,
        MAX(date(purchase_date)) AS last_score,
        MIN(date(purchase_date)) AS first_score,
        SUM(score) AS total_score
    FROM candidate_test_q1
    GROUP BY customer_id
) AS sub_query;
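As a rough, self-contained check of the same idea, here is a small Python sketch that runs the aggregation on the sample rows with SQLite instead of PostgreSQL. The table name `scores`, the column name `score_date`, and the reading of the sample dates as day/month/year are assumptions made for illustration; SQLite has no date subtraction, so `julianday()` stands in for `date - date`.

```python
import sqlite3

# Sample rows from the question, with dates rewritten as ISO strings.
# Assumption: the original dates are day/month/year.
ROWS = [
    (11, "2017-01-01", 25.34),
    (4, "2017-02-01", 34.34),
    (25, "2017-02-01", 15.78),
    (4, "2017-02-03", 47.2),
    (25, "2017-03-07", 65.21),
    (11, "2017-03-09", 96.09),
    (25, "2017-03-10", 11.3),
    (4, "2017-03-10", 27.12),
]


def days_between_first_and_last(rows):
    """Return {id: days between the first and last score date}."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical table and column names, chosen for this sketch only.
    conn.execute("CREATE TABLE scores (id INTEGER, score_date TEXT, score REAL)")
    conn.executemany("INSERT INTO scores VALUES (?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT id,
               CAST(julianday(MAX(score_date)) - julianday(MIN(score_date)) AS INTEGER)
        FROM scores
        GROUP BY id
        """
    ).fetchall()
    conn.close()
    return dict(result)


if __name__ == "__main__":
    print(days_between_first_and_last(ROWS))
```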
I am using PostgreSQL and want to calculate the percentage change between two values in the same column, grouped by the name column, but I am having trouble.
Suppose I have the following table:
name day score
Allen 1 87
Allen 2 89
Allen 3 95
Bob 1 64
Bob 2 68
Bob 3 75
Carl 1 71
Carl 2 77
Carl 3 80
I want the result to be the name and the percentage change for each person between day 3 and day 1. So Allen would be 9.2 because from 87 to 95 is a 9.2 percent increase.
I want the result to be:
name percent_change
Allen 9.2
Bob 17.2
Carl 12.7
Thanks for your help.
Try this...
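with dummy_table as (
    select
        name,
        day,
        score as first_day_score,
        -- rows are ordered by day descending, so on the day 1 row lag(score, 2) is the day 3 score
        lag(score, 2) over (partition by name order by day desc) as last_day_score
    from YOUR_TABLE_NAME
)
select
    name,
    -- multiply by 100 and round to get a percentage such as 9.2 rather than a fraction such as 0.092
    round((100.0 * (last_day_score - first_day_score) / first_day_score)::decimal, 1) as percent_change
from dummy_table
where last_day_score is not null
Just replace YOUR_TABLE_NAME. This relies on each name having exactly three rows (days 1 to 3). There are likely more performant and fancier solutions, but this works.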
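You can try the lag function, something like this (YOUR_TABLE_NAME is a placeholder). Note that it gives the change from the previous day, not the change between day 1 and day 3:
select name, day, score,
    100.0 * (score - lag(score, 1) over w) / lag(score, 1) over w as growth_percentage
from YOUR_TABLE_NAME
window w as (partition by name order by day)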
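The first-to-last comparison can also be written without hard-coding the lag offset. The sketch below is one possible variant, run with SQLite from Python for convenience: it uses `FIRST_VALUE` over ascending and descending day order, which is a swapped-in technique rather than what either answer uses. The table name `scores` is an assumption, and SQLite 3.25 or newer is needed for window functions.

```python
import sqlite3

# Sample rows from the question: (name, day, score).
SCORES = [
    ("Allen", 1, 87), ("Allen", 2, 89), ("Allen", 3, 95),
    ("Bob", 1, 64), ("Bob", 2, 68), ("Bob", 3, 75),
    ("Carl", 1, 71), ("Carl", 2, 77), ("Carl", 3, 80),
]


def percent_change_first_to_last(rows):
    """Return [(name, percent change from the earliest to the latest day)]."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical table name, chosen for this sketch only.
    conn.execute("CREATE TABLE scores (name TEXT, day INTEGER, score INTEGER)")
    conn.executemany("INSERT INTO scores VALUES (?, ?, ?)", rows)
    result = conn.execute(
        """
        WITH ends AS (
            SELECT name,
                   FIRST_VALUE(score) OVER (PARTITION BY name ORDER BY day) AS first_score,
                   FIRST_VALUE(score) OVER (PARTITION BY name ORDER BY day DESC) AS last_score
            FROM scores
        )
        -- every row of a name carries the same pair, so DISTINCT leaves one row per name
        SELECT DISTINCT name,
               ROUND(100.0 * (last_score - first_score) / first_score, 1)
        FROM ends
        ORDER BY name
        """
    ).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for name, change in percent_change_first_to_last(SCORES):
        print(name, change)
```

Because it looks at the earliest and latest day per name, this version does not depend on each name having exactly three rows.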
I have a complex situation in PostgreSQL 11 where I need to generate a numbering based on a single figure that I get from a CTE.
Below is the CTE
WITH pending_orders_to_be_processed_details AS (
    SELECT
        ROW_NUMBER() OVER (ORDER BY so.create_date) AS queue_no,
        name,
        so.create_date::TIMESTAMP
    FROM picking sp
    LEFT JOIN "order" so ON so.name = sp.origin -- order is a reserved word, so it must be quoted
    WHERE sp.state IN ('assigned', 'confirmed')
),
orders_which_can_be_processed_today AS (
    -- This CTE will give me a count of orders and its hourly average.
    -- Let's say the count is 400 and the hourly average is 3.
)
Now I need to number the details according to the hourly average. That is, the first 3 orders need to be ranked as 1, the next 3 as 2, and so on, so that I can identify which orders can be processed based on this ranking.
Input will be
name queue_number create_date
so1 1 2021-03-11 12:00:00
so2 2 2021-03-11 13:00:00
so3 3 2021-03-11 14:00:00
so4 4 2021-03-11 15:00:00
so5 5 2021-03-11 16:00:00
so6 6 2021-03-11 17:00:00
so7 7 2021-03-11 18:00:00
so8 8 2021-03-11 19:00:00
so9 9 2021-03-11 20:00:00
The expected output will be
name rank
so1 1
so2 1
so3 1
so4 2
so5 2
so6 2
so7 3
so8 3
so9 3
Any help or suggestions?
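Edit: I recently learned about a function that fits well here. You can use the ntile() window function for that. Note that ntile(n) splits the rows into n buckets of near-equal size, so its argument is the number of groups, not the group size; ntile(3) matches the expected output only because the sample has exactly 9 rows: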
SELECT
*,
ntile(3) OVER (ORDER BY create_date)
FROM mytable
Since you already created a cumulative row count, you can use this to create your expected rank:
SELECT
*,
floor((queue_no - 1) / 3) + 1 as rank
FROM my_cte
queue_no - 1 (so 1 to 3 is shifted to 0 to 2)
Divide by 3: 0 to 2 becomes 0.x, 3 to 5 becomes 1.x, ...
Round these results down to 0, 1, 2, ... (queue_no is an integer, so PostgreSQL's / already truncates and floor() only makes the intent explicit)
If you want to start with 1 instead of 0, add 1
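To see how the two approaches differ once the row count is not a multiple of the group size, here is a small sketch that runs both on 10 queued rows. It uses SQLite from Python rather than PostgreSQL, and the table name `queue` is an assumption; the group size of 3 is taken from the question.

```python
import sqlite3


def bucket_numbers(row_count):
    """Return [(queue_no, rank by integer division, rank by ntile(3))]."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical table holding only the running queue number.
    conn.execute("CREATE TABLE queue (queue_no INTEGER)")
    conn.executemany(
        "INSERT INTO queue VALUES (?)",
        [(n,) for n in range(1, row_count + 1)],
    )
    rows = conn.execute(
        """
        SELECT queue_no,
               (queue_no - 1) / 3 + 1 AS by_division,          -- fixed groups of 3 rows
               ntile(3) OVER (ORDER BY queue_no) AS by_ntile   -- exactly 3 buckets in total
        FROM queue
        ORDER BY queue_no
        """
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for queue_no, by_division, by_ntile in bucket_numbers(10):
        print(queue_no, by_division, by_ntile)
```

With 10 rows the division approach opens a fourth group for the last row, while ntile(3) keeps three buckets and puts four rows into the first one. For "3 orders per rank" over roughly 400 orders, the division form is the one that matches the requirement.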
I have this sample table:
id date score
11 1/1/2017 14:32 25.34
4 1/2/2017 12:14 34.34
25 1/2/2017 18:08 37.15
4 3/2/2017 23:42 47.24
4 4/2/2017 23:42 54.12
25 7/3/2017 22:07 65.21
11 9/3/2017 21:02 74.6
25 10/3/2017 5:15 11.3
4 10/3/2017 7:11 22.45
My aim is to calculate the first(!) date (YYYY-MM-DD) on which an id's cumulative score reaches 100 (>=). For that, I've written the following code:
SELECT date(date),id, score,
sum(score) over (partition by id order by date(date) rows unbounded preceding) as cumulative_score
FROM test_q1
GROUP BY id, date, score
Order by id, date
It returns:
date id score cumulative_score
1/1/2017 11 25.34 25.34
9/3/2017 11 74.6 99.94
1/2/2017 4 34.34 34.34
3/2/2017 4 47.24 81.58
4/2/2017 4 54.12 135.7
10/3/2017 4 22.45 158.15
1/2/2017 25 37.15 37.15
7/3/2017 25 65.21 102.36
10/3/2017 25 11.3 113.66
I tried to add either WHERE cumulative_score >= 100 or HAVING cumulative_score >= 100, but it returns:
ERROR: column "cumulative_score" does not exist
LINE 4: WHERE cumulative_score >= 100
^
SQL state: 42703
Character: 206
Does anyone know how to solve this?
Thanks
What I expect is:
date id score cumulative_score
4/2/2017 4 54.12 135.7
7/3/2017 25 65.21 102.36
The output should contain just id and date.
Try this:
WITH cumulative_sum AS (
    SELECT
        id,
        date,
        SUM(score) OVER (PARTITION BY id ORDER BY date) AS cumulative_score
    FROM test_q1
),
reached_100 AS (
    -- A window result cannot be filtered in the same SELECT's WHERE or HAVING, so filter it here
    SELECT *, RANK() OVER (PARTITION BY id ORDER BY date) AS date_rank
    FROM cumulative_sum
    WHERE cumulative_score >= 100
)
SELECT id, date(date) AS date
FROM reached_100
WHERE date_rank = 1;
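The error in the question arises because window functions are evaluated after WHERE and HAVING, so the alias does not exist yet at that point; wrapping the window query in a CTE or subquery makes it filterable. A minimal sketch of that pattern on the sample data follows, using SQLite from Python instead of PostgreSQL. The column name `scored_at`, the day/month/year reading of the dates, and the use of `MIN` in place of a rank filter are assumptions of this sketch.

```python
import sqlite3

# Sample rows from the question as ISO timestamps.
# Assumption: the original dates are day/month/year.
ROWS = [
    (11, "2017-01-01 14:32", 25.34),
    (4, "2017-02-01 12:14", 34.34),
    (25, "2017-02-01 18:08", 37.15),
    (4, "2017-02-03 23:42", 47.24),
    (4, "2017-02-04 23:42", 54.12),
    (25, "2017-03-07 22:07", 65.21),
    (11, "2017-03-09 21:02", 74.6),
    (25, "2017-03-10 05:15", 11.3),
    (4, "2017-03-10 07:11", 22.45),
]


def first_date_reaching(rows, threshold):
    """Return [(id, first date on which the cumulative score >= threshold)]."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE test_q1 (id INTEGER, scored_at TEXT, score REAL)")
    conn.executemany("INSERT INTO test_q1 VALUES (?, ?, ?)", rows)
    result = conn.execute(
        """
        WITH cumulative AS (
            SELECT id,
                   date(scored_at) AS day,
                   SUM(score) OVER (PARTITION BY id ORDER BY scored_at) AS cumulative_score
            FROM test_q1
        )
        -- the window result is an ordinary column here, so WHERE can use it
        SELECT id, MIN(day)
        FROM cumulative
        WHERE cumulative_score >= ?
        GROUP BY id
        ORDER BY id
        """,
        (threshold,),
    ).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    print(first_date_reaching(ROWS, 100))
```

Id 11 is absent from the result because its cumulative score stops at 99.94.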
I'm fairly close to this solution, but I just need a little help getting over the end.
I'm trying to get a running count of the occurrences of client_ids regardless of the date, however I need the dates and ids to still appear in my results to verify everything.
I found part of the solution here but have not been able to modify it enough for my needs.
Here is what the answer should be, counting the occurrences of the client_ids sequentially:
id client_id deliver_on running_total
1 138 2017-10-01 1
2 29 2017-10-01 1
3 138 2017-10-01 2
4 29 2013-10-02 2
5 29 2013-10-02 3
6 29 2013-10-03 4
7 138 2013-10-03 3
However, here is what I'm getting:
id client_id deliver_on running_total
1 138 2017-10-01 1
2 29 2017-10-01 1
3 138 2017-10-01 1
4 29 2013-10-02 3
5 29 2013-10-02 3
6 29 2013-10-03 1
7 138 2013-10-03 2
Rather than counting the times the client_id appears sequentially, the code counts the time the id appears in the previous date range.
Here is my code and any help would be greatly appreciated.
Thank you,
SELECT n.id, n.client_id, n.deliver_on, COUNT(n.client_id) AS "running_total"
FROM orders n
LEFT JOIN orders o
ON (o.client_id = n.client_id
AND n.deliver_on > o.deliver_on)
GROUP BY n.id, n.deliver_on, n.client_id
ORDER BY n.deliver_on ASC
* EDIT WITH ANSWER *
I ended up solving my own question. Here is the solution with comments:
-- Tag every order with a count of 1 to be summed later
WITH data AS (
    SELECT
        orders.id,
        orders.client_id,
        orders.deliver_on,
        COUNT(1) AS occurrences -- always 1 per order; grouping by id alone assumes id is the primary key
    FROM orders
    GROUP BY orders.id
)
SELECT
    id,
    client_id,
    deliver_on,
    -- Running count of each client_id's orders; id breaks ties between orders on the same date
    SUM(occurrences) OVER (
        PARTITION BY client_id
        ORDER BY deliver_on, id
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ) AS running_total
FROM data
ORDER BY id;
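A running count per client can also be expressed directly with `ROW_NUMBER()`, which avoids the helper column of ones. The sketch below is an alternative to the query above, not the asker's method; it runs on the question's sample rows with SQLite from Python and numbers each client's orders in id order, which is an assumption about what "sequentially" means here.

```python
import sqlite3

# Sample rows from the question: (id, client_id, deliver_on), dates kept as given.
ORDERS = [
    (1, 138, "2017-10-01"),
    (2, 29, "2017-10-01"),
    (3, 138, "2017-10-01"),
    (4, 29, "2013-10-02"),
    (5, 29, "2013-10-02"),
    (6, 29, "2013-10-03"),
    (7, 138, "2013-10-03"),
]


def running_totals(orders):
    """Return [(id, client_id, deliver_on, running_total)] ordered by id."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, client_id INTEGER, deliver_on TEXT)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", orders)
    result = conn.execute(
        """
        SELECT id,
               client_id,
               deliver_on,
               -- nth order of this client, counted in id order
               ROW_NUMBER() OVER (PARTITION BY client_id ORDER BY id) AS running_total
        FROM orders
        ORDER BY id
        """
    ).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in running_totals(ORDERS):
        print(row)
```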
Assuming data such as the following:
ID EffDate Rate
1 12/12/2011 100
1 01/01/2012 110
1 02/01/2012 120
2 01/01/2012 40
2 02/01/2012 50
3 01/01/2012 25
3 03/01/2012 30
3 05/01/2012 35
How would I find the rate for ID 2 as of 1/15/2012?
Or, the rate for ID 1 for 1/15/2012?
In other words, how do I do a query that finds the correct rate when the date falls between the EffDate for two records? (Rate should be for the date prior to the selected date).
Thanks,
John
How about this:
SELECT Rate
FROM Table1
WHERE ID = 1 AND EffDate = (
SELECT MAX(EffDate)
FROM Table1
WHERE ID = 1 AND EffDate <= '2012-01-15');
I assume here that each ID/EffDate pair is unique across the table (otherwise the question wouldn't make sense).
SELECT TOP 1 Rate FROM the_table
WHERE ID=whatever AND EffDate <='whatever'
ORDER BY EffDate DESC
if I read you right.
(Edited to match my idea of MS SQL syntax, which I don't really know.)
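Both answers boil down to "take the latest EffDate that is not after the requested date". A minimal sketch of that lookup on the sample data is shown below, using SQLite from Python; `LIMIT 1` plays the role of `TOP 1`. The table name `rates`, the lower-case column names, and the reading of the sample dates as month/day/year are assumptions for illustration.

```python
import sqlite3

# Sample rows from the question as ISO dates: (id, eff_date, rate).
# Assumption: the original dates are month/day/year.
RATES = [
    (1, "2011-12-12", 100),
    (1, "2012-01-01", 110),
    (1, "2012-02-01", 120),
    (2, "2012-01-01", 40),
    (2, "2012-02-01", 50),
    (3, "2012-01-01", 25),
    (3, "2012-03-01", 30),
    (3, "2012-05-01", 35),
]


def make_connection(rows):
    """Create an in-memory table holding the sample rates."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE rates (id INTEGER, eff_date TEXT, rate INTEGER)")
    conn.executemany("INSERT INTO rates VALUES (?, ?, ?)", rows)
    return conn


def rate_as_of(conn, rate_id, as_of):
    """Return the rate in effect on as_of, or None if none applies yet."""
    row = conn.execute(
        """
        SELECT rate
        FROM rates
        WHERE id = ? AND eff_date <= ?
        ORDER BY eff_date DESC   -- latest effective date first
        LIMIT 1
        """,
        (rate_id, as_of),
    ).fetchone()
    return row[0] if row else None


if __name__ == "__main__":
    conn = make_connection(RATES)
    print(rate_as_of(conn, 2, "2012-01-15"))
    print(rate_as_of(conn, 1, "2012-01-15"))
```

ISO-formatted date strings compare correctly as text, which is why the `<=` filter works on a TEXT column here; in a real schema a proper date type would be the safer choice.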