PostgreSQL - filter function for dates

I am trying to use the built-in filter function in PostgreSQL to filter for a date range in order to sum only entries falling within this time-frame.
I cannot understand why the filter isn't being applied.
I am trying to filter for all product transactions that have a created_at date of the previous month (so in this case that were created in June 2017).
SELECT pt.created_at::date, pt.customer_id,
sum(pt.amount/100::double precision) filter (where (date_part('month', pt.created_at) =date_part('month', NOW() - interval '1 month') and
date_part('year', pt.created_at) = date_part('year', NOW()) ))
from
product_transactions pt
LEFT JOIN customers c
ON c.id= pt.customer_id
GROUP BY pt.created_at::date,pt.customer_id
Below are my expected results (the sum of the amount for each day in the previous month, per customer_id, where an entry for that day exists) and the actual results I get from the query (using date_trunc).
Expected results:
created_at| customer_id | amount
2017-06-30 1 220.5
2017-06-28 15 34.8
2017-06-28 12 157
2017-06-28 48 105.6
2017-06-27 332 425.8
2017-06-25 1 58.0
2017-06-25 23 22.5
2017-06-21 14 88.9
2017-06-17 2 34.8
2017-06-12 87 250
2017-06-05 48 135.2
2017-06-05 12 95.7
2017-06-01 44 120
Results:
created_at| customer_id | amount
2017-06-30 1 220.5
2017-06-28 15 34.8
2017-06-28 12 157
2017-06-28 48 105.6
2017-06-27 332 425.8
2017-06-25 1 58.0
2017-06-25 23 22.5
2017-06-21 14 88.9
2017-06-17 2 34.8
2017-06-12 87 250
2017-06-05 48 135.2
2017-06-05 12 95.7
2017-06-01 44 120
2017-05-30 XX YYY
2017-05-25 XX YYY
2017-05-15 XX YYY
2017-04-30 XX YYY
2017-03-02 XX YYY
2016-11-02 XX YYY
The actual results give me the sum for all dates in the database, so no date time-frame is being applied in the query for a reason I cannot understand. I'm seeing dates that are both not for June 2017 and also from previous years.

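Use the date_trunc(..) function. It compares year and month in one step, whereas the date_part version takes the year from NOW(), so in January it looks for December of the current year instead of the previous one. Every non-aggregated column in the SELECT list must also appear in GROUP BY:
SELECT pt.created_at::date, pt.customer_id, c.name,
sum(pt.amount/100::double precision) filter (where date_trunc('month', pt.created_at) = date_trunc('month', NOW() - interval '1 month'))
from
product_transactions pt
LEFT JOIN customers c
ON c.id= pt.customer_id
GROUP BY pt.created_at::date, pt.customer_id, c.name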
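One caveat with FILTER: it only limits which rows feed the aggregate, so groups from other months are still returned (with a NULL sum). Moving the condition into WHERE removes those rows entirely. The following is a minimal sketch, not the asker's real schema: the inline sample rows, the assumption that amount is stored in cents, and the fixed timestamp standing in for NOW() are all hypothetical, chosen so the result is reproducible.

```sql
-- Hypothetical sample rows; amount is assumed to be stored in cents.
WITH product_transactions(customer_id, created_at, amount) AS (
  VALUES
    (1,  timestamp '2017-06-30 10:00', 22050),
    (1,  timestamp '2017-06-25 09:30', 5800),
    (15, timestamp '2017-06-28 12:00', 3480),
    (1,  timestamp '2017-05-30 08:00', 1000),
    (2,  timestamp '2016-11-02 08:00', 2000)
),
params AS (
  -- Fixed stand-in for NOW(); first instant of the current month.
  SELECT date_trunc('month', timestamp '2017-07-15 00:00') AS this_month
)
SELECT pt.created_at::date AS created_on,
       pt.customer_id,
       sum(pt.amount / 100.0) AS amount
FROM product_transactions pt
CROSS JOIN params p
-- Half-open range: from the start of last month up to the start of this month.
WHERE pt.created_at >= p.this_month - interval '1 month'
  AND pt.created_at <  p.this_month
GROUP BY pt.created_at::date, pt.customer_id
ORDER BY 1 DESC, 2;
```

With this data only the three June 2017 groups come back. In a real query, replace the fixed timestamp with NOW(). Comparing the raw column against a range also lets an index on created_at be used, which wrapping the column in date_trunc generally prevents.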

Related

How to group rows without using GROUP BY clause

Let's say I have simple table:
Date Price
-----------------------
2012-01-05 23
2015-04-08 145
2016-03-09 12
2015-09-09 87
2000-01-15 23
2016-01-15 89
2016-07-12 23
2012-04-08 65
I want to group these rows by year, but without using a GROUP BY clause. It would be good if I could add another column that contains the year or a character that indicates the group, like this:
Date Price Group
-------------------------------
2012-01-05 23 1
2015-04-08 145 2
2016-03-09 12 3
2015-09-09 87 2
2000-01-15 23 4
2016-01-15 89 3
2016-07-12 23 3
2012-04-08 65 1
I tried to use the over() clause, but to be honest I don't know which function to use with over().
A combination of extracting the year from the date and dense_rank will do the trick:
select *,
dense_rank () OVER(order by extract(year from Date))
from YOURTABLE
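Applied to the sample data from the question, that could look like the sketch below. The CTE name prices and the alias grp are assumptions for illustration (GROUP is a reserved word, so it cannot be used as a bare column alias). Note that dense_rank numbers the years in ascending order, so 2000 becomes group 1, 2012 group 2, 2015 group 3 and 2016 group 4. That differs from the numbering shown in the question, but the same rows end up grouped together.

```sql
-- Sample rows copied from the question; "prices" is a hypothetical name.
WITH prices(date, price) AS (
  VALUES
    (date '2012-01-05', 23),
    (date '2015-04-08', 145),
    (date '2016-03-09', 12),
    (date '2015-09-09', 87),
    (date '2000-01-15', 23),
    (date '2016-01-15', 89),
    (date '2016-07-12', 23),
    (date '2012-04-08', 65)
)
SELECT p.date,
       p.price,
       -- Rows sharing a year share a rank; ranks have no gaps.
       dense_rank() OVER (ORDER BY extract(year FROM p.date)) AS grp
FROM prices p;
```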
Try a CASE expression if you only want to add another column:
SELECT DATE,
PRICE,
CASE DATE_PART('YEAR', DATE) WHEN 2015 THEN 1
WHEN 2016 THEN 2 ... END
FROM MYTABLE
But if you want an aggregate of something, use OVER() or GROUP BY.

PostgreSQL: first date cumulative score

I have this sample table:
id date score
11 1/1/2017 14:32 25.34
4 1/2/2017 12:14 34.34
25 1/2/2017 18:08 37.15
4 3/2/2017 23:42 47.24
4 4/2/2017 23:42 54.12
25 7/3/2017 22:07 65.21
11 9/3/2017 21:02 74.6
25 10/3/2017 5:15 11.3
4 10/3/2017 7:11 22.45
My aim is to calculate the first(!) date (YYYY-MM-DD) on which an id's cumulative score reaches 100 (>=). For that, I've written the following code:
SELECT date(date),id, score,
sum(score) over (partition by id order by date(date) rows unbounded preceding) as cumulative_score
FROM test_q1
GROUP BY id, date, score
Order by id, date
It returns:
date id score cumulative_score
1/1/2017 11 25.34 25.34
9/3/2017 11 74.6 99.94
1/2/2017 4 34.34 34.34
3/2/2017 4 47.24 81.58
4/2/2017 4 54.12 135.7
10/3/2017 4 22.45 158.15
1/2/2017 25 37.15 37.15
7/3/2017 25 65.21 102.36
10/3/2017 25 11.3 113.66
I tried to add either WHERE cumulative_score >= 100 or HAVING cumulative_score >= 100, but it returns:
ERROR: column "cumulative_score" does not exist
LINE 4: WHERE cumulative_score >= 100
^
SQL state: 42703
Character: 206
Does anyone know how to solve this?
Thanks
What I expect is:
date id score cumulative_score
4/2/2017 4 54.12 135.7
7/3/2017 25 65.21 102.36
The output should contain just id and date.
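Try this. A window function's result cannot be referenced in WHERE or HAVING at the same query level, so compute it in a CTE first and filter afterwards:
with cumulative_sum AS (
SELECT id,date,sum(score) over( partition by id order by date) as sum from test_q1
),
above_100_score_rank AS (
-- rank by date so that rank 1 is the first date at or above 100
SELECT *, rank() over (partition by id order by date) AS rank
FROM cumulative_sum where sum >= 100
)
SELECT * FROM above_100_score_rank WHERE rank= 1;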
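An alternative sketch that returns just id and the first date: once only rows at or above 100 are kept, the earliest remaining date per id is the answer, so min() with GROUP BY suffices. The inline rows reproduce the question's sample, reading its dates as day/month/year (an assumption based on the ordering of the asker's own output).

```sql
-- Sample rows from the question, dates read as day/month/year (assumption).
WITH test_q1(id, date, score) AS (
  VALUES
    (11, timestamp '2017-01-01 14:32', 25.34),
    (4,  timestamp '2017-02-01 12:14', 34.34),
    (25, timestamp '2017-02-01 18:08', 37.15),
    (4,  timestamp '2017-02-03 23:42', 47.24),
    (4,  timestamp '2017-02-04 23:42', 54.12),
    (25, timestamp '2017-03-07 22:07', 65.21),
    (11, timestamp '2017-03-09 21:02', 74.60),
    (25, timestamp '2017-03-10 05:15', 11.30),
    (4,  timestamp '2017-03-10 07:11', 22.45)
),
running AS (
  -- Running total per id, in chronological order.
  SELECT t.id,
         t.date,
         sum(t.score) OVER (PARTITION BY t.id ORDER BY t.date) AS cumulative_score
  FROM test_q1 AS t
)
SELECT r.id,
       min(r.date)::date AS first_date
FROM running r
WHERE r.cumulative_score >= 100   -- allowed here: the alias comes from the CTE
GROUP BY r.id
ORDER BY r.id;
-- id 4  -> 2017-02-04
-- id 25 -> 2017-03-07
-- id 11 never reaches 100 (its total is 99.94), so it is absent.
```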

difference on date condition in postgresql

name total date
a 100 1/2/2015
b 30 1/2/2015
c 40 1/2/2015
d 45 1/2/2015
a 20 2/2/2015
b 13 2/2/2015
a 30 3/2/2015
b 23 3/2/2015
c 20 3/2/2015
and the table goes on with different dates.
I want to find the difference (a - b) for each date on which they occur, i.e.
diff total date
a-b 70 1/2/2015
a-b 7 2/2/2015....
How do I do this in PostgreSQL?
Use nth_value() window function for that:
WITH t(name,total,date) AS ( VALUES
('a',100,'2016-01-01'::DATE),
('b',30,'2016-01-01'::DATE),
('c',40,'2016-01-01'::DATE),
('d',45,'2016-01-01'::DATE),
('a',20,'2016-01-02'::DATE),
('b',13,'2016-01-02'::DATE)
)
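SELECT
DISTINCT ON (date)
'a-b' AS diff,
(nth_value(total,1) OVER w -
nth_value(total,2) OVER w) total_diff,
date
FROM t
WHERE name IN ('a','b')
-- order by name so that row 1 is 'a' and row 2 is 'b'; the full frame makes both rows visible from every row
WINDOW w AS (PARTITION BY date ORDER BY name
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)
ORDER BY date;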
Result:
diff | total_diff | date
------+------------+------------
a-b | 70 | 2016-01-01
a-b | 7 | 2016-01-02
(2 rows)
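An alternative that does not depend on row order within a window is conditional aggregation: sum each name separately with FILTER and subtract. This is a sketch reusing the sample rows from the answer above; it assumes PostgreSQL 9.4 or later, where FILTER is available.

```sql
WITH t(name, total, date) AS ( VALUES
  ('a', 100, '2016-01-01'::date),
  ('b', 30,  '2016-01-01'::date),
  ('c', 40,  '2016-01-01'::date),
  ('d', 45,  '2016-01-01'::date),
  ('a', 20,  '2016-01-02'::date),
  ('b', 13,  '2016-01-02'::date)
)
SELECT 'a-b' AS diff,
       -- Each sum only sees the rows of one name within the date group.
       sum(t.total) FILTER (WHERE t.name = 'a')
     - sum(t.total) FILTER (WHERE t.name = 'b') AS total_diff,
       t.date
FROM t
WHERE t.name IN ('a', 'b')
GROUP BY t.date
ORDER BY t.date;
-- 2016-01-01 -> 70
-- 2016-01-02 -> 7
```

If a date has no 'a' row or no 'b' row, the corresponding sum is NULL and so is the difference; wrap each sum in coalesce(..., 0) if a missing name should count as zero.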

Postgresql Query for display of records every 45 days

I have a table that has data of user_id and the timestamp they joined.
If I need to display the data month-wise I could just use:
select
count(user_id),
date_trunc('month',(to_timestamp(users.timestamp))::timestamp)::date
from
users
group by 2
The date_trunc function accepts 'second', 'day', 'week', etc., so I can get data grouped by such periods.
How do I get data grouped by an n-day period, say 45 days?
Basically, I need to display the number of users per 45-day period.
Any suggestion or guidance appreciated!
Currently I get:
Date Users
2015-03-01 47
2015-04-01 72
2015-05-01 123
2015-06-01 132
2015-07-01 136
2015-08-01 166
2015-09-01 129
2015-10-01 189
I would like the data to come in 45-day intervals, something like:
Date Users
2015-03-01 85
2015-04-15 157
2015-05-30 192
2015-07-14 229
2015-08-28 210
2015-10-12 294
UPDATE:
I used the following to get the output, but one problem remains. I'm getting values that are offset.
with
new_window as (
select
generate_series as cohort
, lag(generate_series, 1) over () as cohort_lag
from
(
select
*
from
generate_series('2015-03-01'::date, '2016-01-01', '45 day')
)
t
)
select
--cohort
cohort_lag -- This worked. !!!
, count(*)
from
new_window
join users on
user_timestamp <= cohort
and user_timestamp > cohort_lag
group by 1
order by 1
But the output I am getting is:
Date Users
2015-04-15 85
2015-05-30 157
2015-07-14 193
2015-08-28 225
2015-10-12 210
Basically, the users displayed at 2015-03-01 should be the users between 2015-03-01 and 2015-04-15, and so on.
But I seem to be getting the number of users up to a date, i.e. 85 users up to 2015-04-15, which is not the result I want.
Any help here?
Try this query :
SELECT to_char(i::date,'YYYY-MM-DD') as date, 0 as users
FROM generate_series('2015-03-01', '2015-11-30','45 day'::interval) as i;
OUTPUT :
date users
2015-03-01 0
2015-04-15 0
2015-05-30 0
2015-07-14 0
2015-08-28 0
2015-10-12 0
2015-11-26 0
This looks like a hot mess, and it might be better wrapped in a function where you could use some variables, but would something like this work?
with number_of_intervals as (
select
min (timestamp)::date as first_date,
ceiling (extract (days from max (timestamp) - min (timestamp)) / 45)::int as num
from users
),
intervals as (
select
generate_series(0, num - 1, 1) int_start,
generate_series(1, num, 1) int_end
from number_of_intervals
),
date_spans as (
select
n.first_date + 45 * i.int_start as interval_start,
n.first_date + 45 * i.int_end as interval_end
from
number_of_intervals n
cross join intervals i
)
select
d.interval_start, count (*) as user_count
from
users u
join date_spans d on
u.timestamp >= d.interval_start and
u.timestamp < d.interval_end
group by
d.interval_start
order by
d.interval_start
With this sample data:
User Id timestamp derived range count
1 3/1/2015 3/1-4/15
2 3/26/2015 "
3 4/4/2015 "
4 4/6/2015 " (4)
5 5/6/2015 4/16-5/30
6 5/19/2015 " (2)
7 6/16/2015 5/31-7/14
8 6/27/2015 "
9 7/9/2015 " (3)
10 7/15/2015 7/15-8/28
11 8/8/2015 "
12 8/9/2015 "
13 8/22/2015 "
14 8/27/2015 " (5)
Here is the output:
2015-03-01 4
2015-04-15 2
2015-05-30 3
2015-07-14 5
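A shorter variant, assuming the column is (or has been cast to) a date: subtracting two dates in PostgreSQL yields an integer number of days, and integer division by 45 gives the bucket index directly. The CTE names and the joined_on column are hypothetical; the rows are the sample data above.

```sql
-- Sample rows from the answer above; names are hypothetical.
WITH users(user_id, joined_on) AS (
  VALUES
    (1,  date '2015-03-01'), (2,  date '2015-03-26'),
    (3,  date '2015-04-04'), (4,  date '2015-04-06'),
    (5,  date '2015-05-06'), (6,  date '2015-05-19'),
    (7,  date '2015-06-16'), (8,  date '2015-06-27'),
    (9,  date '2015-07-09'), (10, date '2015-07-15'),
    (11, date '2015-08-08'), (12, date '2015-08-09'),
    (13, date '2015-08-22'), (14, date '2015-08-27')
),
anchor AS (
  -- The first 45-day period starts at the earliest join date.
  SELECT min(joined_on) AS first_date FROM users
)
SELECT a.first_date + 45 * ((u.joined_on - a.first_date) / 45) AS interval_start,
       count(*) AS user_count
FROM users u
CROSS JOIN anchor a
GROUP BY 1
ORDER BY 1;
-- 2015-03-01 -> 4
-- 2015-04-15 -> 2
-- 2015-05-30 -> 3
-- 2015-07-14 -> 5
```

Periods with no users are simply absent from this result; if they should appear with a count of zero, left-join the counts onto a generate_series of period start dates instead.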

how to get day difference in postgres

I want to find out how many days are left until "End_date" is reached in postgres. What will be equivalent for following in postgres?
Days_Left = Column in table - Today's date
GREATEST(INT4(CEIL(("NUMERIC"(DATE_PART('EPOCH'::"VARCHAR", (T1.End_date - "TIMESTAMP"(DATE('now'::"VARCHAR"))))) / '86400'::"NUMERIC"))), 0) AS DAYS_LEFT
Thanks, I tried your suggestion but did not get the expected result.
Expected result, using GREATEST(INT4(CEIL(("NUMERIC"(DATE_PART('EPOCH'::"VARCHAR", (CA.END_DATE - "TIMESTAMP"(DATE('now'::"VARCHAR"))))) / '86400'::"NUMERIC"))), 0):
End_date Days_left
2014-11-01 03:59:00 47
2016-01-01 04:59:59 473
2017-01-01 06:59:59 839
2014-12-31 22:59:00 107
Result using date(end_date) - date(current_date):
End_date Days_Left
2014-11-01 03:59:00 46
2016-01-01 04:59:59 472
2017-01-01 06:59:59 838
2014-12-31 22:59:00 106
Result using (end_date - current_date):
End_date Days_Left
2014-11-01 03:59:00 46 days 03:59
2016-01-01 04:59:59 472 days 04:59:59
2017-01-01 06:59:59 838 days 06:59:59
2014-12-31 22:59:00 106 days 22:59
Thanks
Sandy
If column_in_table is defined as a DATE you can use this:
select column_in_table - current_date as days_left
from the_table
Edit
As end_date is a timestamp, the above expression will return an interval, not an integer.
If you don't care about the hours and minutes left, casting the timestamp to a date should work:
select end_date::date - current_date as days_left
from the_table
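If a partial day should count as a full day, as in the CEIL-based expression from the question, the interval can be converted to seconds and rounded up. This is a sketch, assuming end_date is a timestamp; the fixed timestamp 2014-09-16 stands in for current_date and is only inferred from the day counts in the question, so that the numbers can be reproduced.

```sql
-- Rows copied from the question; the "today" value is an inferred stand-in.
WITH the_table(end_date) AS (
  VALUES
    (timestamp '2014-11-01 03:59:00'),
    (timestamp '2016-01-01 04:59:59'),
    (timestamp '2017-01-01 06:59:59'),
    (timestamp '2014-12-31 22:59:00')
)
SELECT end_date,
       greatest(
         -- seconds remaining / seconds per day, rounded up to whole days
         ceil(extract(epoch FROM end_date - timestamp '2014-09-16 00:00:00') / 86400)::int,
         0                                   -- never report negative days
       ) AS days_left
FROM the_table;
-- 2014-11-01 03:59:00 -> 47
-- 2016-01-01 04:59:59 -> 473
-- 2017-01-01 06:59:59 -> 839
-- 2014-12-31 22:59:00 -> 107
```

In a real query, replace the fixed timestamp with current_date.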