Customize query of postgresql - postgresql

I am using a PostgreSQL database. I have two queries, but I don't want to run multiple queries. Is it possible to get the same result with a single query?
Query 1 :
select coalesce(sum("dummy"),0) as sum from generate_series ('2014-09-09 00:00:00'::timestamp,'2014-09-09 23:59:59','1 minute')
minutes(minute) LEFT JOIN report ON
minutes.minute=date_trunc('minute', report.fetchdate)
AND fetchdate >= '2014-09-09 00:00:00' AND fetchdate <= '2014-09-09 23:59:00'
AND entity_id ='0' group by minute order by minute
OUTPUT:
The sum of the dummy field for each minute of the day, i.e. 24*60 = 1440 rows per day.
Note: this query is used for a single day.
Query2 :
select date(day)as day,coalesce(sum("dummy"),0) as sum from generate_series ('2014-09-06 00:00:01'::date,'2014-09-12 23:59:59'::date,'1 day'::interval) days(day) LEFT JOIN report ON days.day=date_trunc('day', report.fetchdate) AND entity_id ='0' group by day order by day
OUTPUT:
The sum of the dummy field for each day from 2014-09-06 to 2014-09-12, i.e. 7 rows in total (dates 6, 7, 8, 9, 10, 11, 12).
Note: this query is used for more than one day.
Required Output:
1) The total of the dummy field for each day in the specified date range (the output of the 2nd query)
2) The maximum call count of each day
Ex:
Suppose I search across any two days. The range then needs to be broken into single dates, with the data computed for each minute of each date, and the highest per-minute total of the dummy field on a given day shown as that day's maximum call count.

select
date_trunc('day', minute) as day,
sum(minute_sum) as day_sum,
max(minute_sum) as max_minute_sum
from (
select
minute,
coalesce(sum("dummy"),0) as minute_sum
from
generate_series(
'2014-09-06'::timestamp,
'2014-09-13'::timestamp - interval '1 minute',
'1 minute'
) minutes(minute)
left join
report on
minutes.minute = date_trunc('minute', report.fetchdate)
and entity_id ='0'
group by minute
) s
group by 1
order by 1
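The two-stage aggregation in the query above (sum per minute first, then sum and max per day) can be sketched in plain Python. This is only an illustration of the logic, not a replacement for the SQL: the function name `daily_sum_and_peak` and the sample rows are made up, and minutes without data are not zero-filled the way generate_series does it. The peak starts at 0, which matches the zero-filled SQL as long as the values are non-negative.

```python
from collections import defaultdict
from datetime import datetime


def daily_sum_and_peak(rows):
    """rows: iterable of (fetchdate, dummy); returns {day: (day_sum, max_minute_sum)}."""
    # Stage 1: sum the values per minute (the inner query).
    per_minute = defaultdict(int)
    for fetchdate, dummy in rows:
        minute = fetchdate.replace(second=0, microsecond=0)
        per_minute[minute] += dummy

    # Stage 2: per day, add up the minute sums and keep the largest one (the outer query).
    per_day = {}
    for minute, minute_sum in per_minute.items():
        day = minute.date()
        day_sum, peak = per_day.get(day, (0, 0))
        per_day[day] = (day_sum + minute_sum, max(peak, minute_sum))
    return per_day


# Hypothetical sample rows, not taken from the question.
sample = [
    (datetime(2014, 9, 9, 10, 0, 5), 2),
    (datetime(2014, 9, 9, 10, 0, 40), 3),
    (datetime(2014, 9, 9, 11, 30, 0), 4),
    (datetime(2014, 9, 10, 8, 15, 0), 1),
]
result = daily_sum_and_peak(sample)
for day in sorted(result):
    print(day, result[day])
```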

Related

Would like to get sum of payment in interval during 36 month from date of file creation - Postgres

I have two tables.
In the first table, "customers", I have 3 columns: "id" (unique for every customer), "date_created" (the date when the customer file was created) and "client_name" (the name of the client to which the customer belongs). Dates are in the format 'yyyy-mm-dd' and range from 2007 to the present.
The second table, "payments", also has 3 columns: "id" (unique for every customer), "amount" (the amount of the payment) and "date_payment".
What I would like to achieve is the following. From the first table I would like to choose files created within a date range (for example from 2018-06-01 to 2018-12-31) and get the sum of payments from the second table made within one month of creation, within two months, and so on up to month 36. The interval is relative to each particular claim's creation date, so it is different for every claim. The purpose is to get success rates at one month from creation, two months from creation, and so on up to month 36.
The result would be:
interval client_name sum
1 Boden 100
2 Boden 220
etc... till 36
I get close results with the following query, but it is very time-consuming, so I need a quicker solution that returns results for all 36 monthly intervals for every claim.
select customers.client_name, sum(payments.amount) as sum_amount
from customers
left join payments on customers.id=payments.id
where customers.date_created >= '2018-06-01' and customers.date_created <= '2018-12-31'
and date_payment <= date(date_created + interval '1 month')
and client_name ilike '%Boden%'
group by customers.client_name
Can anyone help, please?
UPDATE
Nobody answered, but I kind of figured it out myself. Solution below:
select customers.client_name, sum(payments.amount) as sum_amount, 1 as sortorder
from customers
left join payments on customers.id=payments.id
where customers.date_created >= '2018-06-01' and customers.date_created <= '2018-12-31'
and date_payment <= date(date_created + interval '1 month')
and client_name ilike '%Boden%'
group by customers.client_name
UNION
select customers.client_name, sum(payments.amount) as sum_amount, 2 as sortorder
from customers
left join payments on customers.id=payments.id
where customers.date_created >= '2018-06-01' and customers.date_created <= '2018-12-31'
and date_payment <= date(date_created + interval '2 month')
and client_name ilike '%Boden%'
group by customers.client_name
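The UNION approach repeats the same query once per month offset. As a rough sketch of the underlying logic only, the loop below computes the same kind of cumulative totals in plain Python for a single client. The helper names `add_months` and `sums_by_month_offset` and the sample data are hypothetical, and the day clamping at month end is meant to mirror how adding an interval of whole months to a date behaves in Postgres.

```python
import calendar
from datetime import date


def add_months(d, months):
    """Shift a date by whole months, clamping the day to the end of the target month."""
    index = d.year * 12 + (d.month - 1) + months
    year, month = divmod(index, 12)
    month += 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))


def sums_by_month_offset(customers, payments, max_months=36):
    """customers: {id: date_created}; payments: list of (id, amount, date_payment).

    Returns [(n, total)], where total covers payments made no later than
    n months after the paying customer's own creation date.
    """
    totals = []
    for n in range(1, max_months + 1):
        total = 0
        for cust_id, amount, paid_on in payments:
            created = customers.get(cust_id)
            if created is not None and paid_on <= add_months(created, n):
                total += amount
        totals.append((n, total))
    return totals


# Hypothetical data for a single client, not taken from the question.
customers = {1: date(2018, 6, 15), 2: date(2018, 8, 31)}
payments = [
    (1, 100, date(2018, 7, 1)),
    (1, 50, date(2018, 8, 20)),
    (2, 70, date(2018, 9, 30)),
]
totals = sums_by_month_offset(customers, payments, max_months=3)
print(totals)
```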

Mixing DISTINCT with GROUP_BY Postgres

I am trying to get a list of all months in a specified year that have at least 2 unique rows based on their date, ignoring specific column values.
This is where I got to:
SELECT DATE_PART('month', "orderDate") AS month, count(*)
FROM public."Orders"
WHERE "companyId" = 00001 AND "orderNumber" != 1 and DATE_PART('year', ("orderDate")) = '2020' AND "orderNumber" IS NOT NULL
GROUP BY month
HAVING COUNT ("orderDate") > 2
The HAVING COUNT condition sort of works in place of DISTINCT insofar as I can be reasonably sure that it filters the data as required.
However, being able to use DISTINCT based on a given date within a month would return a more reliable result. Is this possible with Postgres?
A sample line of data from the table:
Sample Input
"2018-12-17 20:32:00+00"
"2019-02-26 14:38:00+00"
"2020-07-26 10:19:00+00"
"2020-10-13 19:15:00+00"
"2020-10-26 16:42:00+00"
"2020-10-26 19:41:00+00"
"2020-11-19 20:21:00+00"
"2020-11-19 21:22:00+00"
"2020-11-23 21:10:00+00"
"2021-01-02 12:51:00+00"
Without the HAVING COUNT condition this produces:
month count
7 1
10 2
11 3
Month 7 can be discarded easily as only 1 record.
Month 10 is the issue: we have two records. But from the data above, those records are from the same day. Similarly, month 11 only has 2 distinct records by day.
Ideally, the output should therefore be:
month count
11 2
We have only two distinct dates from the 2020 data, and they are from month 11 (November)
I think you just want to take the distinct count of dates for each month:
SELECT
DATE_PART('month', "orderDate") AS month,
COUNT(DISTINCT "orderDate"::date) AS count
FROM public."Orders"
WHERE
"companyId" = 1 AND
"orderNumber" != 1 AND
DATE_PART('year', "orderDate") = '2020'
GROUP BY
DATE_PART('month', "orderDate")
HAVING
COUNT(DISTINCT "orderDate"::date) >= 2;
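To see what counting distinct days changes compared with counting rows, here is a small Python sketch over the sample timestamps from the question. It is an illustration only: the function name `day_counts_per_month` is made up, and because the sample has no orderNumber column, that filter is not modelled, so the raw row counts here differ from the question's table.

```python
from collections import defaultdict
from datetime import datetime

# The sample timestamps from the question (all with a +00 offset).
sample = [
    "2018-12-17 20:32:00+00",
    "2019-02-26 14:38:00+00",
    "2020-07-26 10:19:00+00",
    "2020-10-13 19:15:00+00",
    "2020-10-26 16:42:00+00",
    "2020-10-26 19:41:00+00",
    "2020-11-19 20:21:00+00",
    "2020-11-19 21:22:00+00",
    "2020-11-23 21:10:00+00",
    "2021-01-02 12:51:00+00",
]


def day_counts_per_month(timestamps, year):
    """Return {month: (row_count, distinct_day_count)} for the given year."""
    days_by_month = defaultdict(list)
    for text in timestamps:
        # Drop the "+00" suffix; every sample value uses the same offset.
        ts = datetime.strptime(text[:19], "%Y-%m-%d %H:%M:%S")
        if ts.year == year:
            days_by_month[ts.month].append(ts.date())
    return {
        month: (len(days), len(set(days)))
        for month, days in sorted(days_by_month.items())
    }


counts = day_counts_per_month(sample, 2020)
for month, (rows, distinct_days) in counts.items():
    print(month, rows, distinct_days)
```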

Using 'over' function results in column "table.id" must appear in the GROUP BY clause or be used in an aggregate function

I'm currently writing an application which shows the growth of the total number of events in my table over time. I currently have the following query to do this:
query = session.query(
count(Event.id).label('count'),
extract('year', Event.date).label('year'),
extract('month', Event.date).label('month')
).filter(
Event.date.isnot(None)
).group_by('year', 'month').all()
This results in the following output:
Count Year Month
100 2021 1
50 2021 2
75 2021 3
While this is okay on its own, I want it to display the running total over time, not just the number of events in that month, so the desired output should be:
Count Year Month
100 2021 1
150 2021 2
225 2021 3
I read on various places I should use a window function using SqlAlchemy's over function, however I can't seem to wrap my head around it and every time I try using it I get the following error:
sqlalchemy.exc.ProgrammingError: (psycopg2.errors.GroupingError) column "event.id" must appear in the GROUP BY clause or be used in an aggregate function
LINE 1: SELECT count(event.id) OVER (PARTITION BY event.date ORDER...
^
[SQL: SELECT count(event.id) OVER (PARTITION BY event.date ORDER BY EXTRACT(year FROM event.date), EXTRACT(month FROM event.date)) AS count, EXTRACT(year FROM event.date) AS year, EXTRACT(month FROM event.date) AS month
FROM event
WHERE event.date IS NOT NULL GROUP BY year, month]
This is the query I used:
session.query(
count(Event.id).over(
order_by=(
extract('year', Event.date),
extract('month', Event.date)
),
partition_by=Event.date
).label('count'),
extract('year', Event.date).label('year'),
extract('month', Event.date).label('month')
).filter(
Event.date.isnot(None)
).group_by('year', 'month').all()
Could someone show me what I'm doing wrong? I've been searching for hours but can't figure out how to get the desired output as adding event.id in the group by would stop my rows from getting grouped by month and year
The final query I ended up using:
query = session.query(
extract('year', Event.date).label('year'),
extract('month', Event.date).label('month'),
func.sum(func.count(Event.id)).over(order_by=(
extract('year', Event.date),
extract('month', Event.date)
)).label('count'),
).filter(
Event.date.isnot(None)
).group_by('year', 'month')
I'm not 100% sure what you want, but I'm assuming you want, for each month, the number of events up to and including that month. You first need to calculate the number of events per month and then sum those counts cumulatively with a PostgreSQL window function.
You can do that within a single select statement:
SELECT extract(year FROM events.date) AS year
, extract(month FROM events.date) AS month
, SUM(COUNT(events.id)) OVER(ORDER BY extract(year FROM events.date), extract(month FROM events.date)) AS total_so_far
FROM events
GROUP BY 1,2
but it might be easier to think about if you split it into two:
SELECT year, month, SUM(events_count) OVER(ORDER BY year, month)
FROM (
SELECT extract(year FROM events.date) AS year
, extract(month FROM events.date) AS month
, COUNT(events.id) AS events_count
FROM events
GROUP BY 1,2
) AS monthly
I'm not sure how to do that in SQLAlchemy, though.
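The two-step version (count per month, then a cumulative sum ordered by year and month) can also be sketched outside the database. The snippet below is only a plain-Python illustration of what the window function computes, using the monthly counts from the question; the function name `running_totals` is made up.

```python
from itertools import accumulate

# Monthly event counts from the question: (year, month, count).
monthly = [(2021, 1, 100), (2021, 2, 50), (2021, 3, 75)]


def running_totals(monthly_counts):
    """Return (year, month, cumulative_count), ordered by year and month."""
    ordered = sorted(monthly_counts)
    totals = accumulate(count for _, _, count in ordered)
    return [(year, month, total) for (year, month, _), total in zip(ordered, totals)]


cumulative = running_totals(monthly)
for row in cumulative:
    print(row)
```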

Postgres find where dates are NOT overlapping between two tables

I have two tables and I am trying to find data gaps in them where the dates do not overlap.
Item Table:
id unique start_date end_date data
1 a 2019-01-01 2019-01-31 X
2 a 2019-02-01 2019-02-28 Y
3 b 2019-01-01 2019-06-30 Y
Plan Table:
id item_unique start_date end_date
1 a 2019-01-01 2019-01-10
2 a 2019-01-15 'infinity'
I am trying to find a way to produce the following
Missing:
item_unique from to
a 2019-01-11 2019-01-14
b 2019-01-01 2019-06-30
step-by-step demo:db<>fiddle
WITH excepts AS (
SELECT
item,
generate_series(start_date, end_date, interval '1 day') gs
FROM items
EXCEPT
SELECT
item,
generate_series(start_date, CASE WHEN end_date = 'infinity' THEN ( SELECT MAX(end_date) as max_date FROM items) ELSE end_date END, interval '1 day')
FROM plan
)
SELECT
item,
MIN(gs::date) AS start_date,
MAX(gs::date) AS end_date
FROM (
SELECT
*,
SUM(same_day) OVER (PARTITION BY item ORDER BY gs)
FROM (
SELECT
item,
gs,
COALESCE((gs - LAG(gs) OVER (PARTITION BY item ORDER BY gs) >= interval '2 days')::int, 0) as same_day
FROM excepts
) s
) s
GROUP BY item, sum
ORDER BY 1,2
Finding the missing days is quite simple. This is done within the WITH clause:
Generate all days of each item's date range and subtract from them the expanded list of days of the second table. All dates that do not occur in the second table are kept. The infinity end is a little bit tricky, so I replaced the infinity occurrence with the max date of the first table. This avoids expanding an infinite list of dates.
The more interesting part is to reaggregate this list again, which is the part outside the WITH clause:
The lag() window function takes the previous date. If the current date is at least two days after the previous date in the list, the row is flagged with 1, otherwise with 0 (a time-change issue occurred here, which is why I am not asking for a one-day difference but for a 2-day difference: between 2019-03-31 and 2019-04-01 there are only 23 hours because of daylight saving time).
These 0 and 1 values are aggregated cumulatively. Every gap of more than one day starts a new interval (the days in between are covered by the plan).
This results in a groupable column which can be used to aggregate and find the min and max date of each interval.
I tried something with date ranges, which seems to be a better way, especially for avoiding expanding long date lists, but didn't come up with a proper solution. Maybe someone else?
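The same idea (expand both sides to single days, subtract, then regroup consecutive days into intervals) can be sketched in plain Python for item "a" from the question. This is an illustration under assumptions: the helper names `expand` and `missing_ranges` are made up, and the 'infinity' end is capped at the item's last day rather than at the max date over all items.

```python
from datetime import date, timedelta


def expand(start, end):
    """Return the set of all days from start to end, both inclusive."""
    return {start + timedelta(days=i) for i in range((end - start).days + 1)}


def missing_ranges(days):
    """Collapse a collection of dates into sorted (start, end) runs of consecutive days."""
    ranges = []
    for day in sorted(set(days)):
        if ranges and day - ranges[-1][1] == timedelta(days=1):
            # Directly follows the previous day: extend the current run.
            ranges[-1] = (ranges[-1][0], day)
        else:
            # First day, or a gap of more than one day: start a new run.
            ranges.append((day, day))
    return ranges


# Item "a" from the question: two adjacent item rows.
item_days = expand(date(2019, 1, 1), date(2019, 1, 31)) | expand(date(2019, 2, 1), date(2019, 2, 28))

# Plan rows for "a"; the 'infinity' end is capped at the last item day (an assumption).
last_item_day = max(item_days)
plan_days = expand(date(2019, 1, 1), date(2019, 1, 10)) | expand(date(2019, 1, 15), last_item_day)

gaps = missing_ranges(item_days - plan_days)
print(gaps)
```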

Retrieve Records for month,datewise group by day in postgres

I have a table time_slot with columns like date, start_time, end_time and user.
Given a month and year along with a user, I want to retrieve the slots available for that user, day by day, for the month. A user can have, say, 3 slots on one day and 0 on another.
I am using Postgres; my date column is a date and the time columns are time. I am doing this in a Java web application, and the date is picked using a jQuery datepicker, from which I send the month, year and user.
Sample Data of table.
Date start-time end-time user
2019-09-01 12:21:34 13:21:34 user1
2019-09-01 14:21:34 15:21:34 user1
2019-09-01 17:21:34 17:21:34 user1
2019-09-03 12:21:34 13:21:34 user1
2019-09-03 12:21:34 13:21:34 user1
I would like to create a query that gives the time slots of a user by concatenating the start-time and end-time columns, and groups the results by date for a month as follows:
Date count_of_slots
2019-09-01 3
2019-09-02 0
2019-09-03 2
I have tried the below Query.
select distinct kt.start_time,kt.end_time,DATE(kt.slot_date),count(kt.slot_date)
from time_slot as kt
WHERE date_trunc('month',to_timestamp(kt.start_time, 'yy-mm-dd HH24:MI:SS.MS') + interval '1 day')
= date_trunc('month',to_timestamp(:startdate, 'yy-mm-dd HH24:MI:SS.MS') + interval '1 day' )
group by DATE(kt.slot_date) order by cb.start_time.
After getting the result in the format above, I need to loop through the dates to get the time slots for each day and store them in JSON as below.
{
"Date" : "2019-09-01",
"count" : "3",
"time-slot" : [
"12:21:34 - 13:21:34","14:21:34 - 15:21:34","17:21:34 - 17:21:34"]
}
Any suggestions and leads are welcome.
Disclaimer: You should really upgrade your Postgres version!
demo:db<>fiddle
You need to join a date series against your data set. This can be done using the generate_series() function.
SELECT
gs::date,
COUNT(the_date)
FROM
time_slot ts
RIGHT JOIN
generate_series('2019-09-01', '2019-09-05', interval '1 day') gs ON ts.the_date = gs
GROUP BY gs
If you want to get the time_slots as well, simply add:
ARRAY_AGG(start_time || ' - ' || end_time) AS time_slot
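For the JSON step from the question, the joined result can be mimicked in plain Python: walk a series of days, attach the slots found for each day, and emit one object per day, including days with zero slots. This is only a sketch using the sample rows from the question. The function name `slots_per_day` is made up, and "count" is emitted as a number rather than the string shown in the question's example.

```python
import json
from collections import defaultdict
from datetime import date, timedelta

# Sample rows from the question for user1: (date, start_time, end_time).
slots = [
    (date(2019, 9, 1), "12:21:34", "13:21:34"),
    (date(2019, 9, 1), "14:21:34", "15:21:34"),
    (date(2019, 9, 1), "17:21:34", "17:21:34"),
    (date(2019, 9, 3), "12:21:34", "13:21:34"),
    (date(2019, 9, 3), "12:21:34", "13:21:34"),
]


def slots_per_day(rows, first_day, last_day):
    """Return one dict per day in [first_day, last_day], including days without slots."""
    by_day = defaultdict(list)
    for day, start, end in rows:
        by_day[day].append(f"{start} - {end}")

    result = []
    day = first_day
    while day <= last_day:
        labels = by_day.get(day, [])
        result.append({"Date": day.isoformat(), "count": len(labels), "time-slot": labels})
        day += timedelta(days=1)
    return result


report = slots_per_day(slots, date(2019, 9, 1), date(2019, 9, 3))
print(json.dumps(report, indent=2))
```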