SQL statement that detects calendar appointment collisions for the iPhone

I'm trying to create an application for the iPhone where you can set appointments. Everything is saved in a MySQL database, and I currently get the data into my app through JSON. This is the workflow:
User1 defines when he is working. E.g. 8am - 4pm.
User2 wants to have an appointment with user1, e.g. 8am-9am.
The script should be able to do this:
the appointment is within the user's work hours; and
it does not clash with an existing appointment, which can happen in three possible ways:
the clashing appointment starts during the new appointment; and/or
the clashing appointment ends during the new appointment; or
the clashing appointment starts before and ends after the new appointment.
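The work-hours check plus the three clash cases reduce to two simple interval tests; in particular, all three clash scenarios are covered by one overlap predicate. A minimal sketch in Python (illustrative only; the function names are mine, not part of the app):

```python
from datetime import time

def clashes(new_start, new_end, appt_start, appt_end):
    """Two half-open intervals [start, end) overlap exactly when each
    starts before the other ends -- this single test covers all three
    cases: overlap at either end, or total containment."""
    return new_start < appt_end and appt_start < new_end

def within_working_hours(new_start, new_end, work_start, work_end):
    """The new appointment must fit entirely inside the working hours."""
    return work_start <= new_start and new_end <= work_end

# Example: work hours 8am-4pm, existing appointment 8-9am
print(clashes(time(8, 30), time(9, 30), time(8), time(9)))        # True
print(within_working_hours(time(8), time(9), time(8), time(16)))  # True
```

Note the half-open convention: an appointment ending at 9:00 does not clash with one starting at 9:00.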
These are the important tables:
// a new row should be added here when the conditions above are met
create table ios_appointment (
appointmentid int not null auto_increment,
start timestamp,
end timestamp,
user_id_fk int
)
// a working hour has an n:1 relationship to ios_worker
create table ios_workinghours (
workinghoursid int not null auto_increment,
start timestamp,
end timestamp,
worker_id_fk int
)
// employee, has a 1:n relationship to ios_workinghours
create table ios_worker (
workerid int not null auto_increment,
prename varchar(255),
lastname varchar(255),
...
)
The inputs for the SELECT statement are two timestamps, start and end, defined by the user. So the script should check whether user 2 is working at that specific time and whether there are already appointments.
I currently have something like this, but that uses the user_id to link the tables:
SELECT EXISTS (
SELECT *
FROM ios_appointment a JOIN ios_workinghours h ON a.user_id_fk = h.worker_id_fk
WHERE a.user_id_fk = 1
AND h.start <= '08:00:00' AND h.end >= '09:00:00'
AND (
a.start BETWEEN '08:00:00' AND '09:00:00'
OR a.end BETWEEN '08:00:00' AND '09:00:00'
OR (a.start < '08:00:00' AND a.end > '09:00:00')
)
LIMIT 1
)
Any help is appreciated. Thanks.

You either need to have your app read in the data and determine whether the time is available, OR you need to create a view that has the available "time slots" (e.g. every 30 minutes).
Here's how I would do it:
CREATE TABLE #timeslot
(
timeslot_id INT PRIMARY KEY IDENTITY(1,1),
timeslot_time DATETIME NOT NULL
)
DECLARE @starttime DATETIME, @endtime DATETIME
SELECT @starttime = '12/25/2012 08:00:00.000', @endtime = '12/25/2012 15:00:00.000'
WHILE @starttime < @endtime BEGIN
INSERT INTO #timeslot (timeslot_time)
VALUES (@starttime)
SET @starttime = DATEADD(minute, 30, @starttime) -- mm would add months, not minutes
END
SELECT
w.workerid,
ts.timeslot_time
INTO
ios_workertimeslot
FROM
#timeslot ts
FULL OUTER JOIN
ios_worker w
ON (1 = 1)
SELECT
wts.workerid,
wts.timeslot_time,
ap.appointmentid,
CASE WHEN ap.appointmentid IS NOT NULL THEN 0 ELSE 1 END AS AvailableSlot
FROM
ios_workertimeslot wts
JOIN
ios_workinghours wh
ON (wts.workerid = wh.worker_id_fk)
AND (wts.timeslot_time >= wh.start)
AND (wts.timeslot_time < wh.end)
LEFT JOIN
ios_appointment ap
ON (wts.workerid = ap.user_id_fk)
AND (wts.timeslot_time >= ap.start)
AND (wts.timeslot_time < ap.end)
This will leave you with a data set that indicates the available and non-available timeslots.
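The WHILE loop above just enumerates slot start times at a fixed interval; the same enumeration sketched in Python (assuming half-hour granularity, as in the example):

```python
from datetime import datetime, timedelta

def make_timeslots(start, end, minutes=30):
    """Generate slot start times every `minutes` from start up to
    (but not including) end -- the analogue of the WHILE loop above."""
    slots = []
    t = start
    while t < end:
        slots.append(t)
        t += timedelta(minutes=minutes)
    return slots

slots = make_timeslots(datetime(2012, 12, 25, 8), datetime(2012, 12, 25, 15))
print(len(slots))  # 14 slots: 08:00, 08:30, ..., 14:30
```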
Hope this helps!

Related

I am trying to make a list of where all users can go using the BETWEEN command, but it doesn't work properly

My code to identify where one user can go works properly, but I want to make a list of where all users can go. For that I tried using BETWEEN ... AND, but it did not work as expected.
Code: where ONE USER can go:
SELECT place_name, user_id, user_name
FROM schema.place, schema.person
WHERE schema.place_id NOT IN(
SELECT place_id
FROM went_to
WHERE went_to.user_id = 1
AND age(date) <= interval '4 months'
)
AND user_id=1
(Image of the query working properly: a total of 40 rows, the places the user with id 1 can go.)
Code: where ALL USERS can go:
SELECT place_name, user_id, user_name
FROM schema.place, schema.person
WHERE schema.place_id NOT IN(
SELECT place_id
FROM went_to
WHERE went_to.user_id BETWEEN 1 AND 15
AND age(date) <= interval '4 months'
)
AND user_id BETWEEN 1 AND 15
ORDER BY user_id
(Image of the query not working properly: it should have a total of 40 rows of places the user with id 1 can go.)
When I reduce the range in the BETWEEN, the result gets closer to the right answer, but it still isn't right.
What am I doing wrong with the BETWEEN?
The tables:
CREATE TABLE schema.place (
place_id VARCHAR(8),
place_name VARCHAR (50),
CONSTRAINT pk_place_id PRIMARY KEY (place_id)
);
CREATE TABLE schema.user (
user_id VARCHAR(3),
user_name VARCHAR (50),
CONSTRAINT pk_user_id PRIMARY KEY (user_id)
);
CREATE TABLE schema.visit (
user_id VARCHAR(3),
place_id VARCHAR(8),
data DATE,
CONSTRAINT pk_user_id FOREIGN KEY (user_id) REFERENCES SCHEMA.user,
CONSTRAINT pk_place_id FOREIGN KEY (place_id) REFERENCES code.place,
EXCLUDE USING gist (pk_user_id WITH =, daterange(data, (data + interval '6 months')::date) WITH &&)
);
Seeing the schema would be helpful, but I believe the issue is in how you constructed your query.
SELECT place_id
FROM went_to
WHERE went_to.user_id BETWEEN 1 AND 15
AND age(date) <= interval '4 months'
If we look at just the subquery, we are returning all the place ids where users 1-15 went to in the last 4 months. You're then trying to return all places/users that don't match those place ids. The issue is that you're combining all the places that all of those users went to and then using that as an exclusion when you really want to be excluding only places a particular user went to.
I think you want something like this:
SELECT schema.place.place_name, schema.user.user_id, schema.user.user_name
FROM schema.place, schema.user
WHERE (schema.place.place_id, schema.user.user_id) NOT IN(
SELECT place_id, schema.visit.user_id
FROM schema.visit
WHERE schema.visit.user_id::int BETWEEN 1 AND 15
AND age(data) <= interval '4 months'
)
AND user_id::int BETWEEN 1 AND 15
ORDER BY schema.user.user_id
Your schema has the ids as varchars rather than ints, and the date field is called data, so I had to make some tweaks.
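The difference between the two exclusions can be seen with plain Python sets (hypothetical data, not your schema): a single global exclusion set bans a place for everyone once any user has visited it, while the tuple-based NOT IN only bans a (user, place) pair.

```python
# Hypothetical visits: user -> set of places that user visited recently
visits = {1: {"park"}, 2: {"museum"}}
places = {"park", "museum", "cinema"}

# Wrong: one global exclusion set built from ALL users' visits
banned_everywhere = set().union(*visits.values())
wrong = {(u, p) for u in visits for p in places if p not in banned_everywhere}

# Right: exclude only the places each particular user visited
right = {(u, p) for u in visits for p in places if p not in visits[u]}

print(sorted(wrong))  # only the 'cinema' pairs survive the global filter
print(sorted(right))  # user 1 may still go to the museum, user 2 to the park
```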

postgreSQL select interval and fill blanks

I'm working on a system to manage the problems in different projects.
I have the following tables:
Projects
id | Description   | Country
1  | 3D experience | Brazil
2  | Lorem Epsum   | Chile
Problems
id | idProject | Description
1  | 1         | Not loading
2  | 1         | Breaking down
Problems_status
id | idProblem | Status | Start_date | End_date
1  | 1         | Red    | 2020-10-17 | 2020-10-25
2  | 1         | Yellow | 2020-10-25 | 2020-11-20
3  | 1         | Red    | 2020-11-20 |
4  | 2         | Red    | 2020-11-01 | 2020-11-25
5  | 2         | Yellow | 2020-11-25 | 2020-12-22
6  | 2         | Red    | 2020-12-22 | 2020-12-23
7  | 2         | Green  | 2020-12-23 |
In the above examples, problem 1 is still red and problem 2 is green (no end date).
I need to create a chart when the user selects a specific project, showing the status of the problems along the weeks (starting with the week of the first registered problem). The chart for project 1 should look like this:
I'm trying to write a query in PostgreSQL that returns a table like this, so that I can populate the chart:
Week  | Green | Yellow | Red
42/20 | 0     | 0      | 1
43/20 | 0     | 0      | 1
44/20 | 0     | 1      | 0
...   | ...   | ...    | ...
04/21 | 1     | 0      | 1
I've been trying multiple ways but just can't figure out how to do it; could someone help me, please?
Below is a db-fiddle to help:
CREATE TABLE projects (
id serial NOT NULL,
description character varying(50) NOT NULL,
country character varying(50) NOT NULL,
CONSTRAINT projects_pkey PRIMARY KEY (id)
);
CREATE TABLE problems (
id serial NOT NULL,
id_project integer NOT NULL,
description character varying(50) NOT NULL,
CONSTRAINT problems_pkey PRIMARY KEY (id),
CONSTRAINT problems_id_project_fkey FOREIGN KEY (id_project)
REFERENCES projects (id) MATCH SIMPLE
);
CREATE TABLE problems_status (
id serial NOT NULL,
id_problem integer NOT NULL,
status character varying(50) NOT NULL,
start_date date NOT NULL,
end_date date,
CONSTRAINT problems_status_pkey PRIMARY KEY (id),
CONSTRAINT problems_status_id_problem_fkey FOREIGN KEY (id_problem)
REFERENCES problems (id) MATCH SIMPLE
);
INSERT INTO projects (description, country) VALUES ('3D experience','Brazil');
INSERT INTO projects (description, country) VALUES ('Lorem Epsum','Chile');
INSERT INTO problems (id_project ,description) VALUES (1,'Not loading');
INSERT INTO problems (id_project ,description) VALUES (1,'Breaking down');
INSERT INTO problems_status (id_problem, status, start_date, end_date) VALUES
(1, 'Red', '2020-10-17', '2020-10-25'),(1, 'Yellow', '2020-10-25', '2020-11-20'),
(1, 'Red', '2020-11-20', NULL),(2, 'Red', '2020-11-01', '2020-11-25'),
(2, 'Yellow', '2020-11-25', '2020-12-22'),(2, 'Red', '2020-12-22', '2020-12-23'),
(2, 'Green', '2020-12-23', NULL);
If I understood correctly, your goal is to produce a weekly tally by problem status for a particular project over a specific time period (from the earliest date in the database to the current date). Further, if a problem status spans weeks, it should be included in each week's tally. That involves two time periods: the report period and the status start/end dates, and checking for overlap between them. There are four overlap scenarios that need checking; let A be any week in the report period and B the start/end range of a status. Allowing that A must end within the reporting period, but B need not, we have the following:
A starts, B starts, A ends, B ends. B overlaps end of A.
A starts, B starts, B ends, A ends. B totally contained within A.
B starts, A starts, B ends, A ends. B overlaps start of A.
B starts, A starts, A ends, B ends. A totally enclosed within B.
Fortunately, Postgres provides functionality to handle all of the above, meaning the query does not have to perform the individual validations: DATERANGEs and the overlap operator (&&). The difficult work then becomes defining each week within A. Then employ the overlap operator on the daterange for each week in A against the daterange for B (start_date, end_date), and do conditional aggregation for each overlap detected. See the full example here.
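The overlap test described above can be sketched outside SQL; a minimal Python illustration, with half-open (start, end) pairs standing in for dateranges and the && operator (dates taken from the sample data):

```python
from datetime import date

def overlaps(a_start, a_end, b_start, b_end):
    """Half-open ranges [start, end) overlap when each starts before the
    other ends, mirroring && on dateranges. An open-ended status
    (end_date NULL) is treated as never ending."""
    if b_end is None:
        b_end = date.max
    return a_start < b_end and b_start < a_end

# ISO week 43/2020 runs Mon 2020-10-19 .. Sun 2020-10-25,
# i.e. the half-open range [2020-10-19, 2020-10-26)
wk = (date(2020, 10, 19), date(2020, 10, 26))
print(overlaps(*wk, date(2020, 10, 17), date(2020, 10, 25)))  # True: first Red
print(overlaps(*wk, date(2020, 11, 20), None))                # False: open Red starts later
```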
with problem_list( problem_id ) as
-- identify the specific problem_ids desirded
(select ps.id
from projects p
join problems ps on(ps.id_project = p.id)
where p.id = &selected_project
) --select * from problem_list;
, report_period(srange, erange) as
-- generate the first day of week (Mon) for the
-- oldest start date through day of week of Current_Date
(select min(first_of_week(ps.start_date))
, first_of_week(current_date)
from problem_status ps
join problem_list pl
on (pl.problem_id = ps.id_problem)
) --select * from report_period;
, weekly_calendar(wk,yr, week_dates) as
-- expand the start, end date ranges to week dates (Mon-Sun)
-- and identify the week number with year
(select extract( week from mon)::integer wk
, extract( isoyear from mon)::integer yr
, daterange(mon, mon+6, '[]'::text) wk_dates
from (select generate_series(srange,erange, interval '7 days')::date mon
from report_period
) d
) -- select * from weekly_calendar;
, status_by_week(yr,wk,status) as
-- determine where problem start_date, end_date overlaps each calendar week
-- then where multiple statuses exist for any week keep only the last
( select yr,wk,status
from (select wc.yr,wc.wk,ps.status
-- , ps.start_date, wc.week_dates,id_problem
, row_number() over (partition by ps.id_problem,yr,wk order by yr, wk, start_date desc) rn
from problem_status ps
join problem_list pl on (pl.problem_id = ps.id_problem)
join weekly_calendar wc on (wc.week_dates && daterange(ps.start_date,ps.end_date)) -- actual overlap test
) ac
where rn=1
) -- select * from status_by_week order by wk;
select 'Project ' || p.id || ': ' || p.description Project
, to_char(wk,'fm09') || '/' || substr(to_char(yr,'fm0000'),3) "WK"
, "Red", "Yellow", "Green"
from projects p
cross join (select sbw.yr,sbw.wk
, count(*) filter (where sbw.status = 'Red') "Red"
, count(*) filter (where sbw.status = 'Yellow') "Yellow"
, count(*) filter (where sbw.status = 'Green') "Green"
from status_by_week sbw
group by sbw.yr, sbw.wk
) sr
where p.id = &selected_project
order by yr,wk;
The CTEs and the main query operate as follows:
problem_list: identifies the problems (id_problem) related to the specified project.
report_period: identifies the full reporting period, start to end.
weekly_calendar: generates the beginning date (Mon) and ending date (Sun) for each week within the reporting period (A above). Along the way it also gathers the week of the year and the ISO year.
status_by_week: this is the real workhorse, performing two tasks. First it passes each problem by each week in the calendar, building a row for each overlap detected. Then it enforces the "one status" rule.
Finally, the main SELECT aggregates the statuses into the appropriate buckets and adds the syntactic sugar of getting the project name.
Note the function first_of_week(). This is a user-defined function, available in the example and below. I created it some time ago and have found it useful. You are free to use it, but you do so without any claim of suitability or guarantee.
create or replace
function first_of_week(date_in date)
returns date
language sql
immutable strict
/*
* Given a date return the first day of the week according to ISO-8601
*
* ISO-8601 Standard (in short)
* 1 All weeks begin on Monday.
* 2 All Weeks have exactly 7 days.
* 3 First week of any year is the Monday on or before 4-Jan.
* This implies that the last few days of Dec may be in the
* first week of the following year and that the first few
* days of Jan may be in week 52 (or 53) of the prior year.
* (Not at the same time, obviously.)
*
*/
as $$
with wk_adj(l_days) as (values (array[0,1,2,3,4,5,6]))
select date_in - l_days[ extract (isodow from date_in)::integer ]
from wk_adj;
$$;
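The same ISO-Monday computation can be sketched in Python (an illustration, not the SQL function itself):

```python
from datetime import date, timedelta

def first_of_week(date_in):
    """Monday on or before date_in: ISO weekday runs Mon=1..Sun=7,
    so subtracting (isoweekday - 1) days lands on that week's Monday."""
    return date_in - timedelta(days=date_in.isoweekday() - 1)

print(first_of_week(date(2020, 10, 22)))  # 2020-10-19 (a Monday)
print(first_of_week(date(2020, 10, 19)))  # 2020-10-19 (already a Monday)
print(first_of_week(date(2021, 1, 3)))    # 2020-12-28 (crosses the year boundary)
```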
In the example I have implemented the query as a SQL function, as it seems db<>fiddle has issues with bound variables and substitution variables. Besides, it gave me the ability to parameterize it (I hate hard-coded values). For the example I added additional data for extra testing, mostly data that will not be selected, and an additional status to show what happens if the query encounters something other than those 3 status values (in this case Pink). This is easy to remove; just get rid of OTHER.
Your observation that "the daterange is covering mon-mon, instead of mon-sun" is incorrect, although it can appear that way to someone not used to looking at ranges. Let's take week 43. If you queried the date range it would show [2020-10-19,2020-10-26), and yes, both of those dates are Mondays. However, the bracketing characters have meaning. The leading character [ says the date is included, and the trailing character ) says the date is not included. A standard condition:
somedate <@ '[2020-10-19,2020-10-26)'::daterange
is the same as
somedate >= 2020-10-19 and somedate < 2020-10-26
This is why, when you changed the increment from "mon+6" to "mon+5", you fixed week 43 but introduced errors into other weeks.
You can fill in blanks using COALESCE to select the first non-null value in the list.
SELECT COALESCE(<some_value_that_could_be_null>, <some_value_that_will_not_be_null>);
If you want to force the bounds of your time range into a result set you can UNION your result set with a specific date.
SELECT ... -- your data query here
UNION ALL
SELECT end_ts -- WHERE end_ts is a timestamptz type
In order to UNION you will need the same arity and the same field types returned in the unioned query. You can fill in everything other than the timestamp with NULL cast to whichever the matching type is.
More concrete example:
WITH data AS -- get raw data
(
SELECT p.id
, ps.status
, ps.start_date
, COALESCE(ps.end_date, CURRENT_DATE, '01-01-2025'::DATE) -- you can fill in NULL values with COALESCE
, pj.country
, pj.description
, MAX(start_date) OVER (PARTITION BY p.id) AS latest_update
FROM problems p
JOIN projects pj ON (pj.id = p.id_project)
JOIN problem_status ps ON (p.id = ps.id_problem)
UNION ALL -- force bounds in the following
SELECT NULL::INTEGER -- could be null or a defaulted value
, NULL::TEXT -- could be null or a defaulted value
, start_date -- either as an input param to a function or a hard-coded date
, end_date -- either as an input param to a function or a hard-coded date
, NULL::TEXT
, NULL::TEXT
, NULL::DATE
) -- aggregate in the following
SELECT <week> -- you'll have to figure out how you're getting weeks out of the DATE data
, COUNT(*) FILTER (WHERE status = 'Red')
, COUNT(*) FILTER (WHERE status = 'Yellow')
, COUNT(*) FILTER (WHERE status = 'Green')
FROM data
WHERE start_date = latest_update
GROUP BY <week>
;
Some of the features used in this query are very powerful and you should look them up if they're new to you and you are going to be doing a bunch of reporting queries. Mainly coalesce, common table expressions (CTE), window functions, and aggregate expressions.
Aggregate Expressions
WITH Queries (CTEs)
COALESCE
Window Functions
I wrote a dbfiddle for you to take a look at here after you updated your requirements.

How to create a PostgreSQL function that I can call from DBeaver?

Here is the sample data:
CREATE TABLE #logins (
username text not null,
logged_at timestamp not null);
insert into #logins (username, logged_at) values
('a','2019-01-01'),('b','2019-01-01'),('c','2019-01-01'),('d','2019-01-01'),('e','2019-01-01'),
('a','2019-02-01'),('b','2019-02-01'),('c','2019-02-01'),('f','2019-02-01'),('g','2019-02-01'),
('h','2019-02-01'),('i','2019-02-01'),('j','2019-02-01'),('a','2019-03-01'),('b','2019-03-01'),
('f','2019-03-01'),('h','2019-03-01'),('g','2019-03-01'),('k','2019-03-01'),('l','2019-03-01'),
('m','2019-03-01'),('n','2019-03-01'),('o','2019-03-01'),('a','2019-04-01'),('f','2019-04-01'),
('g','2019-04-01'),('k','2019-04-01'),('l','2019-04-01');
What I normally do
drop table if exists #a;
create table #a as
select username, min(logged_at) as date from #logins --Please note that there is **MIN()** here
group by 1;
alter table #a
add m_1 varchar;
update #a
set m_1 = (select username from #logins
where add_months(#a.date,1) = #logins.logged_at and #logins.username = #a.username);
alter table #a
add m_2 varchar;
update #a
set m_2 = (select username from #logins
where add_months(#a.date,2) = #logins.logged_at and #logins.username = #a.username);
alter table #a
add m_3 varchar;
update #a
set m_3 = (select username from #logins
where add_months(#a.date,1) = #logins.logged_at and #logins.username = #a.username);
select to_date(date,'yyyy-mm') as date, count(username) as num_acc,
count(m_1) as m_1,
count(m_2) as m_2,
count(m_3) as m_3
from #a
group by 1
order by 1
The expected result:
date       | num_acc | m_1 | m_2 | m_3
2019-01-01 | 5       | 3   | 2   | 3
2019-02-01 | 5       | 3   | 2   | 3
2019-03-01 | 5       | 2   | 0   | 2
From this point I will download the data and visualize it as a cohort.
The point is that I want to create a function for convenience. I am working in DBeaver using PostgreSQL, for your information.
In this function, we only need to input a table with an ID and a Date; it would then automate the process.
This is my try so far:
CREATE OR REPLACE FUNCTION test(timestamp,varchar(255))
RETURNS int
declare
counter integer :=1
stable
AS $$
LOOP
EXIT WHEN counter = 6 ;
counter := counter + 1 ;
alter table #a
add counter varchar;
update #a
counter = select user_name from #logins
where add_month(#logins.logged_at,counter) = #a.first_login
#a.first_login and #logins.username = #a.username
END LOOP
$$ LANGUAGE sql;
This is embarrassing, as writing functions in SQL is quite difficult. This is the best I could do.
(P.S.: please sympathize; LANGUAGE plpythonu cannot be used. Our only option is sql.)
Revised: Incorporating additional requirement
Well, with the new information a small adjustment can be made. Since "no matter how many times you login within a month, we only count 1, based on the username", rather than looking for equal dates we'll use the Postgres date_trunc function to look at the 1st of the month, whatever the actual login date happens to be. Also, continuing to use WHERE EXISTS ensures that no matter how many logins a user has, we only count 1. So the REVISED function:
create or replace function collect_user_login_counts(login_start_in date)
returns table( "Date" text
, num_acc bigint
, m_1 bigint
, m_2 bigint
, m_3 bigint
)
language sql strict
as $$
-- work table exists for single execution so clear any existing data
truncate user_login_wrk;
with su_dater as
-- get each user and the earliest date of login such that the login date is not less than the parameter date
(select l0.username, min(date_trunc('month', l0.logged_at))::date logged_at
from logins l0
where date_trunc('month', l0.logged_at)::date >= date_trunc('month', login_start_in)::date
group by l0.username
)
, inserter as
-- insert into the counter table the username for the earliest login date and the following 3 months,
-- returning each row for subsequent summarization
( insert into user_login_wrk(username, logged_at, m_1,m_2,m_3)
select su.username
, su.logged_at
, (select su.username where exists
(select null
from logins l1
where l1.username = su.username
and date_trunc('month',l1.logged_at)::date = (su.logged_at + interval '1 month')::date))
, (select su.username where exists
(select null
from logins l2
where l2.username = su.username
and date_trunc('month',l2.logged_at)::date = (su.logged_at + interval '2 month')::date))
, (select su.username where exists
(select null
from logins l3
where l3.username = su.username
and date_trunc('month',l3.logged_at)::date = (su.logged_at + interval '3 month')::date))
from su_dater su
returning *
)
-- summarize counts of user logins over the current period and next 3 months; result returned to caller
select to_char(ulc.logged_at,'yyyy-mm')
, count(ulc.username)
, count(ulc.m_1)
, count(ulc.m_2)
, count(ulc.m_3)
from inserter ulc
where ulc.logged_at >= date_trunc('month',login_start_in)::date
group by to_char(logged_at,'yyyy-mm')
order by to_char(logged_at,'yyyy-mm');
$$;
Testing:
For testing I changed your original dates so that no rows actually fall on the 1st of the month and none share the same day number. Further, the parameter date for the function does not occur in the data.
truncate logins;
insert into logins (username, logged_at) values
('a','2019-01-03'),('b','2019-01-04'),('c','2019-01-11'),('d','2019-01-15'),('e','2019-01-21'),
('a','2019-02-06'),('b','2019-02-02'),('c','2019-02-04'),('f','2019-02-08'),('g','2019-02-09'),
('h','2019-02-12'),('i','2019-02-24'),('j','2019-02-26'),('a','2019-03-02'),('b','2019-03-03'),
('f','2019-03-05'),('h','2019-03-11'),('g','2019-03-17'),('k','2019-03-31'),('l','2019-03-09'),
('m','2019-03-29'),('n','2019-03-27'),('o','2019-03-24'),('a','2019-04-06'),('f','2019-04-03'),
('g','2019-04-14'),('k','2019-04-30'),('l','2019-04-11');
select collect_user_login_counts(date '2019-01-18'); -- select as row
select * from collect_user_login_counts(date '2019-01-18'); -- select as individual columns
RESULTS
Date | num_acc| m_1| m_2| m_3
________________________________
2019-01 | 5 | 3 | 2 | 1
2019-02 | 5 | 3 | 2 | 0
2019-03 | 5 | 2 | 0 | 0
Despite the data changes the same results are produced.
BTW, I did test the original with your data, and those results matched your expectations exactly except for m_3, which is explained in the original reply. I just did not post it; my error.

**Original reply**
Well, there are a couple of issues with your code as posted. As @a_horse_with_no_name pointed out, the # character is not valid in a Postgres object name unless the name is enclosed in double quotes (i.e. "#logins"), regardless of the schema. Additionally, Postgres does not have the function add_months (you could have it as a user-written function, but I cannot know that).
I noticed a couple of inconsistencies in your expected results. First, the final query in what you "normally do" cannot produce those results: the query returns year-month for the date, but the expected output has year-month-day. I'll assume year-month. Secondly, the m_3 expected output is, I believe, incorrect. This is due to the SET m_3 statement, where you use add_months(#a.date,1). I believe, from the naming structure and the prior statements, that this is a copy/paste typo that should read add_months(#a.date,3). I will assume the latter. That does, however, change the results for column m_3.
There is an item in your posted function I haven't fully understood: I'm not sure what the magic number 6 is doing. Were you attempting to create columns m_1 through m_6? That would seem to be the intent. However, the code would actually try to create a column named counter 6 times, which would fail on the 2nd attempt. In the function below I'll stay with m_1 through m_3. If m_6 is your goal, just replicate the m_1 logic, editing as needed (and also update the table definition).
Some changes made:
I do not name columns date. It's a reserved word, and while you can get away with it now, that could change at any time. So I'll use logged_at in the work table.
I dislike single-character names for DB objects, so #a becomes user_login_wrk.
I avoid DDL (create, alter) in functions, so the table is created externally. Besides, for a SQL function the table must exist initially unless the entire function is dynamic SQL in a single string.
Taking all that into consideration, we get:
-- create 'months' work table
create table user_login_wrk( username text
, logged_at date
, m_1 text
, m_2 text
, m_3 text
);
Now for the main event.
create or replace function collect_user_login_counts(login_start_in date)
returns table( "Date" text
, num_acc bigint
, m_1 bigint
, m_2 bigint
, m_3 bigint
)
language sql strict
as $$
-- work table exists for single execution so clear any existing data
truncate user_login_wrk;
with su_dater as
-- get each user and the earliest date of login such that the login date is not less than the parameter date
(select l0.username, min(l0.logged_at)::date logged_at
from logins l0
where l0.logged_at::date >= login_start_in
group by l0.username
)
, inserter as
-- insert into the counter table the username for the earliest login date and the following 3 months,
-- returning each row for subsequent summarization
( insert into user_login_wrk(username, logged_at, m_1,m_2,m_3)
select su.username
, su.logged_at
, (select su.username where exists (select null from logins l1 where l1.username = su.username and l1.logged_at = su.logged_at + interval '1 month'))
, (select su.username where exists (select null from logins l2 where l2.username = su.username and l2.logged_at = su.logged_at + interval '2 month'))
, (select su.username where exists (select null from logins l3 where l3.username = su.username and l3.logged_at = su.logged_at + interval '3 month'))
from su_dater su
returning *
)
-- summarize counts of user logins over the current period and next 3 months; result returned to caller
select to_char(ulc.logged_at,'yyyy-mm')
, count(ulc.username)
, count(ulc.m_1)
, count(ulc.m_2)
, count(ulc.m_3)
from inserter ulc
where ulc.logged_at >= login_start_in
group by to_char(logged_at,'yyyy-mm')
order by to_char(logged_at,'yyyy-mm');
$$;
-- test
select collect_user_login_counts(date '2019-01-01'); -- select as row
select * from collect_user_login_counts(date '2019-01-01'); -- select as individual columns
select * from collect_user_login_counts(date '2019-02-01'); -- Next month
The above completely refreshes the work table and rebuilds it. However, there are times when viewing the results of the last run is desirable/needed. The following provides that capability. (Note: the actual query can be extracted and run stand-alone if desired.)
create or replace function show_user_login_counts()
returns table( "Date" text
, num_acc bigint
, m_1 bigint
, m_2 bigint
, m_3 bigint
)
language sql strict
as $$
select to_char(ulc.logged_at,'yyyy-mm')
, count(ulc.username)
, count(ulc.m_1)
, count(ulc.m_2)
, count(ulc.m_3)
from user_login_wrk ulc
group by to_char(logged_at,'yyyy-mm')
order by to_char(logged_at,'yyyy-mm') ;
$$;
-- test
select show_user_login_counts(); -- select as row
select * from show_user_login_counts(); -- select as individual columns
There are a couple of issues not addressed. Currently each subsequent column (m_1, m_2, m_3) requires a login exactly N months from the start date. What happens if a user's login is not on the exact date but the next day? Also, there is no allowance for a user logging in multiple times in a month. Well, those are questions for another day.
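The month-bucketed cohort counting that the revised function performs can be sketched outside SQL. A Python illustration with made-up logins (not the function itself): truncate each login to its month, take each user's first month, then check for activity 1-3 months later.

```python
from datetime import date

# Hypothetical logins: (username, login date); repeated logins in a
# month must count only once
logins = [("a", date(2019, 1, 3)), ("a", date(2019, 1, 20)),
          ("a", date(2019, 2, 6)), ("b", date(2019, 1, 4))]

def month_of(d):
    """Truncate to the first of the month, like date_trunc('month', ...)."""
    return date(d.year, d.month, 1)

def months_ahead(m, n):
    """First of the month n months after m."""
    total = m.year * 12 + (m.month - 1) + n
    return date(total // 12, total % 12 + 1, 1)

# A set deduplicates multiple logins within the same month
active = {(u, month_of(d)) for u, d in logins}

# Each user's first login month (the cohort key)
first = {}
for u, d in logins:
    m = month_of(d)
    if u not in first or m < first[u]:
        first[u] = m

# Per cohort month: [num_acc, m_1, m_2, m_3]
cohorts = {}
for u, m0 in first.items():
    row = cohorts.setdefault(m0, [0, 0, 0, 0])
    row[0] += 1
    for n in (1, 2, 3):
        if (u, months_ahead(m0, n)) in active:
            row[n] += 1

print(cohorts)  # {datetime.date(2019, 1, 1): [2, 1, 0, 0]}
```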

PostgreSQL: Delete all but most recent date

I have a table defined like so:
CREATE TABLE contracts (
ContractID TEXT DEFAULT NULL,
ContractName TEXT DEFAULT NULL,
ContractEndDate TIMESTAMP WITHOUT TIME ZONE,
ContractPOC TEXT DEFAULT NULL
);
In this table, a ContractID may have more than one record. For each ContractID I want to delete all records but the one with the latest ContractEndDate. I know how to do this in MySQL using:
DELETE contracts
FROM contracts
INNER JOIN (
SELECT
ContractID,
ContractName,
max(ContractEndDate) as lastDate,
ContractPOC
FROM contracts
GROUP BY ContractID
HAVING COUNT(*) > 0) Duplicate on Duplicate.ContractID = contracts.ContractID
WHERE contracts.ContractEndDate < Duplicate.lastDate;
But I need help to get this working in PostgreSQL.
You could use this
delete
from
contracts c
using (SELECT
ContractID,
max(ContractEndDate) as lastDate
FROM
contracts
GROUP BY
ContractID) d
where
d.ContractID = c.ContractID
and c.ContractEndDate < d.lastDate;
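To see the keep-the-latest logic in action, here is the same idea run against SQLite from Python. SQLite lacks DELETE ... USING, so a correlated subquery expresses the same condition; the data is made up for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contracts (ContractID TEXT, ContractEndDate TEXT)")
conn.executemany("INSERT INTO contracts VALUES (?, ?)",
                 [("C1", "2020-01-01"), ("C1", "2021-06-30"),
                  ("C2", "2019-12-31")])

# Delete every row whose end date is earlier than the max end date
# for the same ContractID -- the same condition as the USING form
conn.execute("""
    DELETE FROM contracts
    WHERE ContractEndDate < (SELECT max(c2.ContractEndDate)
                             FROM contracts c2
                             WHERE c2.ContractID = contracts.ContractID)
""")
rows = conn.execute(
    "SELECT ContractID, ContractEndDate FROM contracts ORDER BY ContractID"
).fetchall()
print(rows)  # [('C1', '2021-06-30'), ('C2', '2019-12-31')]
```

Note that a tie (two rows sharing the same latest date) keeps both rows, in both this form and the Postgres USING form above.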

Looping SQL query - PostgreSQL

I'm trying to get a query to loop through a set of pre-defined integers.
I've made the query very simple for this question. This is pseudo-code as well, obviously!
my_id = 0
WHILE my_id < 10
SELECT * from table where id = :my_id
my_id += 1
END
I know that for this query I could just do something like where id < 10. But the actual query I'm performing is about 60 lines long, with quite a few window statements all referring to the variable in question.
It works and gets me the results I want when the variable is set to a single figure. I just need to be able to re-run the query 10 times with different variables, hopefully ending up with one single set of results.
So far I have this:
CREATE OR REPLACE FUNCTION stay_prices ( a_product_id int ) RETURNS TABLE (
pid int,
pp_price int
) AS $$
DECLARE
nights int;
nights_arr INT[] := ARRAY[1,2,3,4];
j int;
BEGIN
j := 1;
FOREACH nights IN ARRAY nights_arr LOOP
-- query here..
END LOOP;
RETURN;
END;
$$ LANGUAGE plpgsql;
But I'm getting this back:
ERROR: query has no destination for result data
HINT: If you want to discard the results of a SELECT, use PERFORM instead.
So do I need to get my query to SELECT ... INTO the returning table somehow? Or is there something else I can do?
EDIT: this is an example of the actual query I'm running:
\x auto
\set nights 7
WITH x AS (
SELECT
product_id, night,
LAG(night, (:nights - 1)) OVER (
PARTITION BY product_id
ORDER BY night
) AS night_start,
SUM(price_pp_gbp) OVER (
PARTITION BY product_id
ORDER BY night
ROWS BETWEEN (:nights - 1) PRECEDING
AND CURRENT ROW
) AS pp_price,
MIN(spaces_available) OVER (
PARTITION BY product_id
ORDER BY night
ROWS BETWEEN (:nights - 1) PRECEDING
AND CURRENT ROW
) AS min_spaces_available,
MIN(period_date_from) OVER (
PARTITION BY product_id
ORDER BY night
ROWS BETWEEN (:nights - 1) PRECEDING
AND CURRENT ROW
) AS min_period_date_from,
MAX(period_date_to) OVER (
PARTITION BY product_id
ORDER BY night
ROWS BETWEEN (:nights - 1) PRECEDING
AND CURRENT ROW
) AS max_period_date_to
FROM products_nightlypriceperiod pnpp
WHERE
spaces_available >= 1
AND min_group_size <= 1
AND night >= '2016-01-01'::date
AND night <= '2017-01-01'::date
)
SELECT
product_id as pid,
CASE WHEN x.pp_price > 0 THEN x.pp_price::int ELSE null END as pp_price,
night_start as from_date,
night as to_date,
(night-night_start)+1 as duration,
min_spaces_available as spaces
FROM x
WHERE
night_start = night - (:nights - 1)
AND min_period_date_from = night_start
AND max_period_date_to = night;
That will get me all the 7-night periods available for all my products in 2016, along with the price for each period and the maximum number of spaces I could fill in that period.
I'd like to be able to run this query to get all the periods of between 2 and 30 nights for all my products.
This is likely to produce a table with millions of rows. The plan is to re-create this table periodically to enable a very quick lookup of what's available for a particular date. products_nightlypriceperiod represents a night of availability of a product, e.g. product X has 3 spaces left for Jan 1st 2016 and costs £100 for the night.
Why use a loop? You can do something like this (using your first query):
with params as (
select generate_series(1, 10) as id
)
select t.*
from params cross join
your_table t
where t.id = params.id;
You can modify params to have the values you really want. Then just use cross join and let the database "do the looping."
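The same "let the database do the looping" idea, demonstrated with SQLite from Python (a recursive CTE stands in for Postgres' generate_series; the table and values here are made up):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, val TEXT)")
conn.executemany("INSERT INTO t VALUES (?, ?)",
                 [(i, f"row{i}") for i in range(1, 21)])

# The recursive CTE produces the integers 1..10; joining it to the
# table replaces the WHILE loop with a single set-based query
rows = conn.execute("""
    WITH RECURSIVE params(id) AS (
        SELECT 1
        UNION ALL
        SELECT id + 1 FROM params WHERE id < 10
    )
    SELECT t.id, t.val
    FROM params JOIN t ON t.id = params.id
    ORDER BY t.id
""").fetchall()
print(len(rows))  # 10
```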