multiple named windows in a postgres query

multiple named windows in a postgres query - postgresql

The postgres docs specify a window definition clause thus:
[ WINDOW window_name AS ( window_definition ) [, ...] ]
The [,...] specifies that multiple windows are possible. I find nothing else in the docs to confirm or deny it's possible. How do I make this work?
In this query, I can use either window clause on its own but I can't use both even though the syntax follows the spec:
select q.*
, min(value) over w_id as min_id_val
--, min(value) over w_kind as min_kind_val
from (
select 1 as id, 1 as kind, 3.0 as value
union select 1, 2, 1.0
union select 2, 1, 2.0
union select 2, 2, 0.5
) as q
window w_id as (partition by id)
-- ,
-- window w_kind as (partition by kind)
I can get the technical effect by not using window definitions, but that gets tiresome for a complex query where windows are re-used:
select q.*
, min(value) over (partition by id) as min_id_val
, min(value) over (partition by kind) as min_kind_val
from (
select 1 as id, 1 as kind, 3.0 as value
union select 1, 2, 1.0
union select 2, 1, 2.0
union select 2, 2, 0.5
) as q

Don't repeat the window keyword:
select q.*,
min(value) over w_id as min_id_val,
min(value) over w_kind as min_kind_val
from (
values
(1,1,3.0),
(1, 2, 1.0),
(2, 1, 2.0),
(2, 2, 0.5)
) as q(id,kind,value)
window w_id as (partition by id),
w_kind as (partition by kind)

Related

postgres aggregate subset from group by rows

I'm trying to evaluate user loyalty bonuses balance when bonuses burns after half-year inactivity. I want my sum consist of ord's 4, 5 and 6 for user 1.
create table transactions (
user int,
ord int, -- transaction date replacement
amount int,
lag interval -- after previous transaction
);
insert into transactions values
(1, 1, 10, '1h'::interval),
(1, 2, 10, '.5y'::interval),
(1, 3, 10, '1h'::interval),
(1, 4, 10, '.5y'::interval),
(1, 5, 10, '.1h'::interval),
(1, 6, 10, '.1h'::interval),
(2, 1, 10, '1h'::interval),
(2, 2, 10, '.5y'::interval),
(2, 3, 10, '.1h'::interval),
(2, 4, 10, '.1h'::interval),
(3, 1, 10, '1h'::interval),
;
select user, sum(
amount -- but starting from last '.5y'::interval if any otherwise everything counts
) from transactions group by user
user | sum(amount)
--------------------
1 | 30 -- (4+5+6), not 50, not 60
2 | 30 -- (2+3+4), not 40
3 | 10

try this:
with cte as(
select *,
case when (lead(lag) over (partition by user_ order by ord)) >= interval '.5 year'
then 1 else 0 end "flag" from test
),
cte1 as (
select *,
case when flag=(lag(flag,1) over (partition by user_ order by ord)) then 0 else 1 end "flag1" from cte
)
select distinct on (user_) user_, sum(amount) over (partition by user_,grp order by ord) from (
select *, sum(flag1) over (partition by user_ order by ord) "grp" from cte1) t1
order by user_ , ord desc
DEMO
Though it is very complicated and slow but resolve your problem

Is this what you're looking for ?
with last_5y as(
select "user", max(ord) as ord
from transactions
where lag = '.5y'::interval group by "user"
) select t.user, sum(amount)
from transactions t, last_5y t2
where t.user = t2.user and t.ord >= t2.ord
group by t.user

Checking Slowly Changing Dimension 2

I have a table that looks like this:
A slowly changing dimension type 2, according to Kimball.
Key is just a surrogate key, a key to make rows unique.
As you can see there are three rows for product A.
Timelines for this product are ok. During time the description of the product changes.
From 1-1-2020 up until 4-1-2020 the description of this product was ProdA1.
From 5-1-2020 up until 12-2-2020 the description of this product was ProdA2 etc.
If you look at product B, you see there are gaps in the timeline.
We use DB2 V12 z/Os. How can I check if there are gaps in the timelines for each and every product?
Tried this, but doesn't work
with selectie (key, tel) as
(select product, count(*)
from PROD_TAB
group by product
having count(*) > 1)
Select * from
PROD_TAB A
inner join selectie B
on A.product = B.product
Where not exists
(SELECT 1 from PROD_TAB C
WHERE A.product = C.product
AND A.END_DATE + 1 DAY = C.START_DATE
)
Does anyone know the answer?

The following query returns all gaps for all products.
The idea is to enumerate (RN column) all periods inside each product by START_DATE and join each record with its next period record.
WITH
/*
MYTAB (PRODUCT, DESCRIPTION, START_DATE, END_DATE) AS
(
SELECT 'A', 'ProdA1', DATE('2020-01-01'), DATE('2020-01-04') FROM SYSIBM.SYSDUMMY1
UNION ALL SELECT 'A', 'ProdA2', DATE('2020-01-05'), DATE('2020-02-12') FROM SYSIBM.SYSDUMMY1
UNION ALL SELECT 'A', 'ProdA3', DATE('2020-02-13'), DATE('2020-12-31') FROM SYSIBM.SYSDUMMY1
UNION ALL SELECT 'B', 'ProdB1', DATE('2020-01-05'), DATE('2020-01-09') FROM SYSIBM.SYSDUMMY1
UNION ALL SELECT 'B', 'ProdB2', DATE('2020-01-12'), DATE('2020-03-14') FROM SYSIBM.SYSDUMMY1
UNION ALL SELECT 'B', 'ProdB3', DATE('2020-03-15'), DATE('2020-04-18') FROM SYSIBM.SYSDUMMY1
UNION ALL SELECT 'B', 'ProdB4', DATE('2020-04-16'), DATE('2020-05-03') FROM SYSIBM.SYSDUMMY1
)
,
*/
MYTAB_ENUM AS
(
SELECT
T.*
, ROWNUMBER() OVER (PARTITION BY PRODUCT ORDER BY START_DATE) RN
FROM MYTAB T
)
SELECT A.PRODUCT, A.END_DATE + 1 START_DT, B.START_DATE - 1 END_DT
FROM MYTAB_ENUM A
JOIN MYTAB_ENUM B ON B.PRODUCT = A.PRODUCT AND B.RN = A.RN + 1
WHERE A.END_DATE + 1 <> B.START_DATE
AND A.END_DATE < B.START_DATE;
The result is:
|PRODUCT|START_DT |END_DT |
|-------|----------|----------|
|B |2020-01-10|2020-01-11|
May be more efficient way:
WITH MYTAB2 AS
(
SELECT
T.*
, LAG(END_DATE) OVER (PARTITION BY PRODUCT ORDER BY START_DATE) END_DATE_PREV
FROM MYTAB T
)
SELECT PRODUCT, END_DATE_PREV + 1 START_DATE, START_DATE - 1 END_DATE
FROM MYTAB2
WHERE END_DATE_PREV + 1 <> START_DATE
AND END_DATE_PREV < START_DATE;

Thnx Mark, will try this one of these days.
Never heard of LAG in DB2 V12 for z/Os
Will read about it
Thnx

Limit query by count distinct column values

I have a table with people, something like this:
ID PersonId SomeAttribute
1 1 yellow
2 1 red
3 2 yellow
4 3 green
5 3 black
6 3 purple
7 4 white
Previously I was returning all of Persons to API as seperate objects. So if user set limit to 3, I was just setting query maxResults in hibernate to 3 and returning:
{"PersonID": 1, "attr":"yellow"}
{"PersonID": 1, "attr":"red"}
{"PersonID": 2, "attr":"yellow"}
and if someone specify limit to 3 and page 2(setMaxResult(3), setFirstResult(6) it would be:
{"PersonID": 3, "attr":"green"}
{"PersonID": 3, "attr":"black"}
{"PersonID": 3, "attr":"purple"}
But now I want to select people and combine then into one json object to look like this:
{
"PersonID":3,
"attrs": [
{"attr":"green"},
{"attr":"black"},
{"attr":"purple"}
]
}
And here is the problem. Is there any possibility in postgresql or hibernate to set limit not by number of rows but to number of distinct people ids, because if user specifies limit to 4 I should return person1, 2, 3 and 4, but in my current limiting mechanism I will return person1 with 2 attributes, person2 and person3 with only one attribute. Same problem with pagination, now I can return half of a person3 array attrs on one page and another half on next page.

You can use row_number to simulate LIMIT:
-- Test data
CREATE TABLE person AS
WITH tmp ("ID", "PersonId", "SomeAttribute") AS (
VALUES
(1, 1, 'yellow'::TEXT),
(2, 1, 'red'),
(3, 2, 'yellow'),
(4, 3, 'green'),
(5, 3, 'black'),
(6, 3, 'purple'),
(7, 4, 'white')
)
SELECT * FROM tmp;
-- Returning as a normal column (limit by someAttribute size)
SELECT * FROM (
select
"PersonId",
"SomeAttribute",
row_number() OVER(PARTITION BY "PersonId" ORDER BY "PersonId") AS rownum
from
person) as tmp
WHERE rownum <= 3;
-- Returning as a normal column (overall limit)
SELECT * FROM (
select
"PersonId",
"SomeAttribute",
row_number() OVER(ORDER BY "PersonId") AS rownum
from
person) as tmp
WHERE rownum <= 4;
-- Returning as a JSON column (limit by someAttribute size)
SELECT "PersonId", json_object_agg('color', "SomeAttribute") AS attributes FROM (
select
"PersonId",
"SomeAttribute",
row_number() OVER(PARTITION BY "PersonId" ORDER BY "PersonId") AS rownum
from
person) as tmp
WHERE rownum <= 3 GROUP BY "PersonId";
-- Returning as a JSON column (limit by person)
SELECT "PersonId", json_object_agg('color', "SomeAttribute") AS attributes FROM (
select
"PersonId",
"SomeAttribute"
from
person) as tmp
GROUP BY "PersonId"
LIMIT 4;
In this case, of course, you must use a native query, but this is a small trade-off IMHO.
More info here and here.

I'm assuming you have another Person table. With JPA, you should do the query on Person table(one side), not on the PersonColor(many side).Then the limit will be applied on number of rows of Person then
If you don't have the Person table and can't modify the DB, what you can do is use SQL and Group By PersonId, and concatenate colors
select PersonId, array_agg(Color) FROM my_table group by PersonId limit 2
SQL Fiddle

Thank you guys. After I realize that it could not be done with one query I just do sth like
temp_query = select distinct x.person_id from (my_original_query) x
with user specific page/per_page
and then:
my_original_query += " AND person_id in (temp_query_results)

PostgreSQL-filter where category is most common value

I'm attempting to use the mode() or most_common_vals() functions as a subquery criteria.
SELECT user_id, COUNT(request_id) AS total
FROM requests
WHERE category = (SELECT mode(category) AS modal_category FROM requests)
GROUP BY user_id
ORDER BY total DESC
LIMIT 5;
However, I continue to receive an error regarding the non-existence of both functions.

If I correctly understand, you need something like this:
with requests (user_id, request_id, category) as (
select 1, 111, 'A' union all
select 1, 111, 'A' union all
select 2, 111, 'A' union all
select 2, 111, 'B' union all
select 1, 111, 'B' union all
select 3, 111, 'B' union all
select 1, 111, 'B' union all
select 1, 111, 'C'
)
-- Below is actual query:
select user_id, COUNT(request_id) AS total
from (
select t.*, rank() over(order by cnt desc) as rnk from (
select requests.*, count(*) over(partition by category) as cnt from requests
) t
) tt
where rnk = 1
group by user_id
order by total desc
limit 5
Here user_id and COUNT(request_id) are calculated only for 'B' category, because it is most common in this example.
Also please note that in case, if there is multiple most common categories, this query produces result from all of those categories.

Select all but sort by count in postgresql

I have a table myTable with a lot of columns, keep in mind this table is too big, and one of that columns is a geometry point, we'll call it mySortColumn. I need to sort my select by count mySortColumn when there are the same.
One example could be this
myTable
id, mySortColumn
----------------
1, ASD12321F
2, ASD12321G
3, ASD12321F
4, ASD12321G
5, ASD12321H
6, ASD12321F
I have a query which can do what I want, the problem is the time. Actually it take like 30 seconds, and it seems like this:
SELECT
id,
mySortColumn
FROM
myTable
JOIN (
SELECT
mySortColumn,
ST_Y(mySortColumn) AS lat,
ST_X(mySortColumn) AS lng,
COUNT(*)
FROM myTable
GROUP BY mySortColumn
HAVING COUNT(*) > 1
) AS myPosition ON (
ST_X(myTable.mySortColumn) = myPosition.lng
AND ST_Y(myTable.mySortColumn) = myPosition.lat
)
WHERE
<some filters>
ORDER BY COUNT DESC
The result must be this:
id, mySortColumn
----------------
1, ASD12321F
3, ASD12321F
6, ASD12321F
2, ASD12321G
4, ASD12321G
5, ASD12321H
I hope you can help me.

Here you are:
select * from myTable order by count(1) over (partition by mySortColumn) desc;
For more info about aggregate over () construction have a look at:
http://www.postgresql.org/docs/9.4/static/tutorial-window.html

We Keep Coding

iphone swift flutter scala powershell matlab mongodb postgresql perl eclipse

multiple named windows in a postgres query - postgresql

Don't repeat the window keyword: select q.*, min(value) over w_id as min_id_val, min(value) over w_kind as min_kind_val from ( values (1,1,3.0), (1, 2, 1.0), (2, 1, 2.0), (2, 2, 0.5) ) as q(id,kind,value) window w_id as (partition by id), w_kind as (partition by kind)

Related

postgres aggregate subset from group by rows

Checking Slowly Changing Dimension 2

Limit query by count distinct column values

PostgreSQL-filter where category is most common value

Select all but sort by count in postgresql

Categories

Resources