PostgreSQL - GROUP subsequent rows

I have a table that contains records ordered by date.
I want to get the start and end dates for each group of consecutive rows (grouped by some criterion, e.g. position).
Example:
create table tbl (id int, date timestamp without time zone,
position int);
insert into tbl values
( 1 , '2013-12-01', 1),
( 2 , '2013-12-02', 2),
( 3 , '2013-12-03', 2),
( 4 , '2013-12-04', 2),
( 5 , '2013-12-05', 3),
( 6 , '2013-12-06', 3),
( 7 , '2013-12-07', 2),
( 8 , '2013-12-08', 2)
Of course, if I simply group by position I get the wrong result, because the same position can occur in several separate groups:
SELECT POSITION, min(date) MIN, max(date) MAX
FROM tbl GROUP BY POSITION
I will get:
POSITION MIN MAX
1 December, 01 2013 00:00:00+0000 December, 01 2013 00:00:00+0000
3 December, 05 2013 00:00:00+0000 December, 06 2013 00:00:00+0000
2 December, 02 2013 00:00:00+0000 December, 08 2013 00:00:00+0000
But I want:
POSITION MIN MAX
1 December, 01 2013 00:00:00+0000 December, 01 2013 00:00:00+0000
2 December, 02 2013 00:00:00+0000 December, 04 2013 00:00:00+0000
3 December, 05 2013 00:00:00+0000 December, 06 2013 00:00:00+0000
2 December, 07 2013 00:00:00+0000 December, 08 2013 00:00:00+0000
I found a solution for MySQL that uses variables, and I could port it, but I believe PostgreSQL can do this in a smarter way using its advanced features such as window functions.
I'm using PostgreSQL 9.2

There is probably a more elegant solution, but try this:
WITH tmp_tbl AS (
-- Flag every row that starts a new run: 1 when position differs
-- from the previous row (or there is no previous row), else 0
SELECT *,
CASE WHEN lag(position,1) OVER(ORDER BY id)=position
THEN 0
ELSE 1
END AS is_new_group
FROM tbl
)
, tmp_tbl2 AS(
-- A running sum of the flags stays constant within a run,
-- whatever its length, so it can serve as the group number
SELECT position,date,
sum(is_new_group) OVER(ORDER BY id) AS grouping_col
FROM tmp_tbl
)
SELECT position, min(date) AS min, max(date) AS max
FROM tmp_tbl2 GROUP BY grouping_col,position
ORDER BY min(date)
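
The rule behind this query (start a new group whenever position differs from the previous row) can be checked outside the database. The following is only a sketch in Python, not part of the original answer; the rows are the sample data from the question and the function name consecutive_ranges is made up for illustration.

```python
from itertools import groupby

# Sample rows from the question: (id, date, position), already ordered by id.
rows = [
    (1, "2013-12-01", 1),
    (2, "2013-12-02", 2),
    (3, "2013-12-03", 2),
    (4, "2013-12-04", 2),
    (5, "2013-12-05", 3),
    (6, "2013-12-06", 3),
    (7, "2013-12-07", 2),
    (8, "2013-12-08", 2),
]


def consecutive_ranges(rows):
    """Return (position, first_date, last_date) for each run of equal positions."""
    ranges = []
    # groupby only merges *adjacent* items with the same key, so a position
    # that reappears later forms a separate group.
    for position, run in groupby(rows, key=lambda r: r[2]):
        dates = [r[1] for r in run]
        # ISO-formatted date strings sort chronologically.
        ranges.append((position, min(dates), max(dates)))
    return ranges


result = consecutive_ranges(rows)
for line in result:
    print(line)
```

This yields four groups, with position 2 appearing twice, which is the output the question asks for.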

There are some complete answers on Stack Overflow for this, so I won't repeat them in detail, but the principle is to group the records by the difference between two values:
The row number when ordered by the date (via a window function)
The number of days between each date and a static reference date.
So you have a series such as:
rownum datediff diff
1 1 0 ^
2 2 0 | first group
3 3 0 v
4 5 1 ^
5 6 1 | second group
6 7 1 v
7 9 2 ^
8 10 2 v third group
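
To make the series above concrete, here is a small Python sketch of the same idea. The dates and the reference date are assumptions chosen so that the day offsets match the datediff column above; islands is a hypothetical helper name.

```python
from datetime import date

# Hypothetical static reference date; any fixed date works.
REFERENCE = date(2013, 11, 30)

# Assumed input: dates with two gaps (Dec 4 and Dec 8 are missing).
dates = [
    date(2013, 12, 1), date(2013, 12, 2), date(2013, 12, 3),
    date(2013, 12, 5), date(2013, 12, 6), date(2013, 12, 7),
    date(2013, 12, 9), date(2013, 12, 10),
]


def islands(dates, reference=REFERENCE):
    """Group consecutive dates; return (first, last) per group."""
    groups = {}
    for rownum, d in enumerate(sorted(dates), start=1):
        datediff = (d - reference).days
        # Within a run of consecutive days both numbers grow by 1 per row,
        # so their difference is constant; every gap increases it.
        key = datediff - rownum
        groups.setdefault(key, []).append(d)
    return [(run[0], run[-1]) for _, run in sorted(groups.items())]


result = islands(dates)
for first, last in result:
    print(first, last)
```

In SQL the same key would typically come from a date offset minus ROW_NUMBER() OVER (ORDER BY date), followed by a GROUP BY on that key.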

Related

Groupby year, calculate sum and percentage per year

I have a table with the columns
datefield area
I want to calculate sum of area per year and a percentage column
year sum percentage
2022 5 12
2023 10 24
2024 6 15
[null] 20 49
(I have many more years in the table which I want to include)
WITH total as(
select extract(YEAR from "datefield") theyear, sum(area) as totalarea
from thetable
group by extract(YEAR from "datefield")
)
select total.theyear, total.totalarea,
totalarea/(SUM(totalarea) OVER (PARTITION BY theyear))*100
from total
I get the correct sums, but all the percentages are 100.
What am I doing wrong?
Some sample data:
2019 7.05
2020 4.77
2020 3.56
2021 1.64
2021 8.37
2021 3.51
2021 1.43
2021 9.94
2022 1.91
2022 5.3
I would like the result
2019 7.05 15
2020 8.33 18
2021 24.89 52
2022 7.21 15
WITH
total as
(
select extract(YEAR from "datefield") theyear, sum(area) as totalarea,
SUM(sum(area)) OVER() as SUM_totalarea
from thetable
group by extract(YEAR from "datefield")
)
SELECT theyear, totalarea, 100.0 * totalarea / SUM_totalarea AS PERCENTAGE
FROM total
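
Why the original attempt returns 100 everywhere: PARTITION BY theyear puts each yearly total into its own one-row partition, so every total is divided by itself, whereas an empty OVER() spans all rows. The following Python sketch (an illustration using the sample data from the question, not part of the original answer) reproduces both calculations.

```python
from collections import defaultdict

# Sample data from the question: (year, area).
sample = [
    (2019, 7.05), (2020, 4.77), (2020, 3.56), (2021, 1.64), (2021, 8.37),
    (2021, 3.51), (2021, 1.43), (2021, 9.94), (2022, 1.91), (2022, 5.3),
]

# Equivalent of: SELECT year, SUM(area) ... GROUP BY year
totals = defaultdict(float)
for year, area in sample:
    totals[year] += area

# Equivalent of SUM(SUM(area)) OVER (): one grand total across all years.
grand_total = sum(totals.values())
percent_of_all = {y: round(100.0 * t / grand_total) for y, t in totals.items()}

# Equivalent of SUM(totalarea) OVER (PARTITION BY theyear): each partition
# holds a single row, so every total is divided by itself.
percent_per_partition = {y: round(100.0 * t / t) for y, t in totals.items()}

print(percent_of_all)
print(percent_per_partition)
```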

How to get last value with condition in postgreSQL?

I have a table in PostgreSQL with three columns: a group, a date and a value.
grp mydate value
---------------------
A 2021-01-27 5
A 2021-01-23 10
A 2021-01-15 15
B 2021-01-26 7
B 2021-01-24 12
B 2021-01-15 17
I would like to create a view with a continuous sequence of dates and, for each date, the most recent value in the table for each group.
grp mydate value
---------------------
A 2021-01-27 5
A 2021-01-26 10
A 2021-01-25 10
A 2021-01-24 10
A 2021-01-23 10
A 2021-01-22 15
A 2021-01-21 15
A 2021-01-20 15
A 2021-01-19 15
A 2021-01-18 15
A 2021-01-17 15
A 2021-01-16 15
A 2021-01-15 15
B 2021-01-27 7
B 2021-01-26 7
B 2021-01-25 12
B 2021-01-24 12
B 2021-01-23 17
B 2021-01-22 17
B 2021-01-21 17
B 2021-01-20 17
B 2021-01-19 17
B 2021-01-18 17
B 2021-01-17 17
B 2021-01-16 17
B 2021-01-15 17
SQL code to generate the table:
CREATE TABLE foo (
grp char(1),
mydate date,
value integer);
INSERT INTO foo VALUES
('A', '2021-01-27', 5),
('A', '2021-01-23', 10),
('A', '2021-01-15', 15),
('B', '2021-01-26', 7),
('B', '2021-01-24', 12),
('B', '2021-01-15', 17)
I have so far managed to generate a visualization with the sequence of dates joined with the distinct groups, but I am failing to get the most recent value.
SELECT DISTINCT(foo.grp), (date_trunc('day'::text, dd.dd))::date AS mydate
FROM foo, generate_series((( SELECT min(foo.mydate) AS min
FROM foo))::timestamp without time zone, (now())::timestamp without time zone, '1 day'::interval) dd(dd)
SELECT
grp,
gs::date as mydate,
value
FROM (
SELECT
*,
COALESCE( -- 2
lead(mydate) OVER (PARTITION BY grp ORDER BY mydate) - 1, -- 1
mydate
) as prev_date
FROM foo
) s,
generate_series(mydate, prev_date, interval '1 day') as gs -- 3
ORDER BY grp, mydate DESC -- 4
1. The lead() window function shifts the value from the next row of an ordered group (= partition) into the current row. Here the partition is grp and the order is the date, which gives the end of each date range. Since you don't want a date twice (as the end of one range and the start of the next), the end date is reduced by 1, i.e. it is one day before the next record starts.
2. The very last record of each group has no following record, so lead() yields NULL. COALESCE() replaces that NULL with the record's own date.
3. generate_series() then creates every date in the range from the current date to that end date.
4. Finally, ORDER BY produces the required order.
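
The four steps can also be traced in plain Python. This is only an illustrative sketch using the sample rows from the question; expand is a made-up name. Like the query, it stops at each group's last record and does not extend a group up to today's date.

```python
from datetime import date, timedelta

# Sample rows from the question: (grp, mydate, value).
foo = [
    ("A", date(2021, 1, 27), 5),
    ("A", date(2021, 1, 23), 10),
    ("A", date(2021, 1, 15), 15),
    ("B", date(2021, 1, 26), 7),
    ("B", date(2021, 1, 24), 12),
    ("B", date(2021, 1, 15), 17),
]


def expand(rows):
    """Repeat each value for every day until the group's next record."""
    by_grp = {}
    for grp, d, value in rows:
        by_grp.setdefault(grp, []).append((d, value))

    out = []
    for grp, recs in by_grp.items():
        recs.sort()  # order by date within the group (the window's ORDER BY)
        for i, (start, value) in enumerate(recs):
            if i + 1 < len(recs):
                # Steps 1: the day before the next record (lead(mydate) - 1).
                end = recs[i + 1][0] - timedelta(days=1)
            else:
                # Step 2: no next record, fall back to the own date (COALESCE).
                end = start
            # Step 3: one output row per day in the range (generate_series).
            day = start
            while day <= end:
                out.append((grp, day, value))
                day += timedelta(days=1)

    # Step 4: grp ascending, date descending (two stable sorts).
    out.sort(key=lambda r: r[1], reverse=True)
    out.sort(key=lambda r: r[0])
    return out


result = expand(foo)
for row in result:
    print(row)
```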

How to sort attendance date along with the month?

The attendance list is sorted by date, which is fine, but I also want it sorted by month, so that January comes at the bottom and December at the top.
Table
Attendance Date
---------------
26 Feb 2018
19 Dec 2018
18 Dec 2018
14 Dec 2018
12 June 2018
7 Dec 2018
5 Feb 2018
Query
select distinct
(select ARRAY_TO_STRING(ARRAY_AGG(ARRAY[to_char(t1.l_time,'HH12:mi AM')]::text), ',')
from
(select (al1.create_time AT TIME ZONE 'UTC+5:30')::time as l_time
from users.access_log as al1
where al1.user_id = al.user_id
and al1.login_status = 1
and al1.create_time::date = al.create_time::date
order by al1.create_time::time ASC
) as t1
) as login_time,
(select ARRAY_TO_STRING(ARRAY_AGG(ARRAY[to_char(t2.o_time,'HH:mi AM')]::text), ',')
from
(select (al2.create_time AT TIME ZONE 'UTC+5:30')::time as o_time
from users.access_log as al2
where al2.user_id = al.user_id
and al2.login_status = 0
and al2.create_time::date = al.create_time::date
order by al2.create_time::time ASC
) as t2
) as logout_time,
al.create_time::date
from users.access_log as al
where al.user_id = ?;

Find max value in a group in FileMaker

How can I select only the rows with the max year in each group from the following set?
id productid price year
---------------------------
1 11 0,10 2015
2 11 0,12 2016
3 11 0,11 2017
4 22 0,08 2016
5 33 0,02 2016
6 33 0,01 2017
Expected result for each productid and max year would be
id productid price year
---------------------------
3 11 0,11 2017
4 22 0,08 2016
6 33 0,01 2017
This works for me.
ExecuteSQL (
"SELECT t.id, t.productid, t.price, t.\"year\"
FROM test t
WHERE \"year\" =
(SELECT MAX(\"year\") FROM test tt WHERE t.productid = tt.productid)"
; " " ; "")
Adapted from this answer:
https://stackoverflow.com/a/21310671/832407
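
The correlated-subquery pattern itself can be tried outside FileMaker. The sketch below uses Python's built-in sqlite3 module purely as a stand-in SQL engine (an assumption for illustration; it says nothing about what FileMaker's ExecuteSQL supports) and loads the sample rows from the question, keeping the prices as text exactly as shown there.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE test (id INTEGER, productid INTEGER, price TEXT, "year" INTEGER)'
)
conn.executemany(
    "INSERT INTO test VALUES (?, ?, ?, ?)",
    [
        (1, 11, "0,10", 2015),
        (2, 11, "0,12", 2016),
        (3, 11, "0,11", 2017),
        (4, 22, "0,08", 2016),
        (5, 33, "0,02", 2016),
        (6, 33, "0,01", 2017),
    ],
)

# For every row, the subquery looks up the latest year of the same product;
# only rows whose year equals that maximum are kept.
rows = conn.execute(
    '''
    SELECT t.id, t.productid, t.price, t."year"
    FROM test t
    WHERE t."year" = (SELECT MAX(tt."year")
                      FROM test tt
                      WHERE tt.productid = t.productid)
    ORDER BY t.id
    '''
).fetchall()

for row in rows:
    print(row)
conn.close()
```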
A simple SQL query will give you the last year for every product:
ExecuteSQL (
"SELECT productid, MAX ( \"year\")
FROM myTable
GROUP By productid";
"";"" )
To get to the price for that year is going to be trickier, as FileMaker SQL does not fully support subqueries or temp tables.

Partitioned by Year

I have a year table like this. Every year has 12 values (Fixed)
declare @t table (FiscalYear int,[Month] varchar(25))
insert into @t values
(2011,'Jan'),(2011,'Feb'),(2011,'Mar'),(2011,'Apr'),
(2011,'May'),(2011,'Jun'),(2011,'Jul'),(2011,'Aug'),
(2011,'Sep'),(2011,'Oct'),(2011,'Nov'),(2011,'Dec'),
(2012,'Jan'),(2012,'Feb'),(2012,'Mar'),(2012,'Apr'),
(2012,'May'),(2012,'Jun'),(2012,'Jul'),(2012,'Aug'),
(2012,'Sep'),(2012,'Oct'),(2012,'Nov'),(2012,'Dec'),
(2013,'Jan'),(2013,'Feb'),(2013,'Mar'),(2013,'Apr'),
(2013,'May'),(2013,'Jun'),(2013,'Jul'),(2013,'Aug'),
(2013,'Sep'),(2013,'Oct'),(2013,'Nov'),(2013,'Dec')
I want to output as
FYear Month Qt Qtp
2011 Jan 1 1
2011 Feb 1 2
2011 Mar 1 3
2011 Apr 2 1
2011 May 2 2
2011 Jun 2 3
2011 Jul 3 1
2011 Aug 3 2
2011 Sep 3 3
2011 Oct 4 1
2011 Nov 4 2
2011 Dec 4 3
2012 Jan 1 1
2012 Feb 1 2
2012 Mar 1 3
2012 Apr 2 1
2012 May 2 2
2012 Jun 2 3
2012 Jul 3 1
2012 Aug 3 2
2012 Sep 3 3
2012 Oct 4 1
2012 Nov 4 2
2012 Dec 4 3
2013 Jan 1 1
2013 Feb 1 2
2013 Mar 1 3
2013 Apr 2 1
2013 May 2 2
2013 Jun 2 3
2013 Jul 3 1
2013 Aug 3 2
2013 Sep 3 3
2013 Oct 4 1
2013 Nov 4 2
2013 Dec 4 3
How can I do that in SQL Server 2008 R2? I have tried DENSE_RANK, ROW_NUMBER and PARTITION BY, but all in vain.
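Try using NTILE:
--select * from @t
-- MonthPos orders the month names chronologically (Jan = 1, Feb = 4, ...);
-- ordering by the year alone would leave the order within a year undefined.
SELECT FiscalYear AS FYear, [Month], Qt,
ROW_NUMBER() OVER ( PARTITION BY FiscalYear, Qt ORDER BY MonthPos ) AS Qtp
from
(SELECT FiscalYear, [Month], MonthPos,
NTILE(4) OVER ( PARTITION BY FiscalYear ORDER BY MonthPos ) AS Qt
FROM (SELECT FiscalYear, [Month],
CHARINDEX([Month], 'JanFebMarAprMayJunJulAugSepOctNovDec') AS MonthPos
FROM @t) m) PERIOD
ORDER BY FiscalYear, MonthPos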
I propose dynamically populating a table with date values from Dec 2013 going back as many years as you like (you can alter the @COUNT_Y_MAX variable to add more years).
SQL Server has some useful datetime functions such as DATEPART, which can tell you which quarter a month is in.
** Answer changed due to question change **
DECLARE @DATES TABLE
(
xDATE DATETIME
)
DECLARE @STARTDATE DATETIME = '12-31-2013'
DECLARE @COUNT_X INT = 0
DECLARE @COUNT_X_MAX INT = 11
DECLARE @COUNT_Y INT = 0
DECLARE @COUNT_Y_MAX INT = 2
WHILE (@COUNT_Y <= @COUNT_Y_MAX)
BEGIN
SET @COUNT_X = 0
WHILE (@COUNT_X <= @COUNT_X_MAX)
BEGIN
INSERT INTO @DATES
SELECT DATEADD(MONTH, -@COUNT_X, DATEADD(YEAR,-@COUNT_Y, @STARTDATE))
SET @COUNT_X = @COUNT_X + 1
END
SET @COUNT_Y = @COUNT_Y + 1
END
SELECT * FROM
(SELECT
DATEPART(YEAR, D.xDATE) AS [YEAR],
DATEPART(MONTH, D.xDATE) AS [MONTH],
DATENAME(MONTH, D.xDATE) AS [MONTH_NAME],
DATEPART(QUARTER, D.xDATE) AS [QUARTER],
DATEPART(MONTH, D.xDATE) - (3 * (DATEPART(QUARTER, D.xDATE) - 1)) AS [QTP]
FROM @DATES D) t
ORDER BY t.[YEAR], t.[MONTH]
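
The arithmetic behind the QUARTER and QTP columns is easy to verify on its own. Here is a short Python sketch of it, assuming calendar quarters as in the question; quarter_and_position is a hypothetical helper name.

```python
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def quarter_and_position(month_number):
    """Map a month number (1-12) to (quarter, position within the quarter)."""
    quarter = (month_number - 1) // 3 + 1
    # Same formula as the QTP column: month - 3 * (quarter - 1).
    position = month_number - 3 * (quarter - 1)
    return quarter, position


rows = [
    (year, MONTHS[m - 1]) + quarter_and_position(m)
    for year in (2011, 2012, 2013)
    for m in range(1, 13)
]

for row in rows[:12]:
    print(row)
```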