We use PostgreSQL 14.1.
I have sample data containing over 50 million records.
Base table:
+------+----------+--------+--------+--------+
| id | item_id | battles| wins | damage |
+------+----------+--------+--------+--------+
| 1 | 255 | 35 | 52.08 | 1245.2 |
| 2 | 255 | 35 | 52.08 | 1245.2 |
| 3 | 255 | 35 | 52.08 | 1245.3 |
| 4 | 255 | 35 | 52.08 | 1245.3 |
| 5 | 255 | 35 | 52.09 | 1245.4 |
| 6 | 255 | 35 | 52.08 | 1245.3 |
| 7 | 255 | 35 | 52.08 | 1245.3 |
| 8 | 255 | 35 | 52.08 | 1245.7 |
| 1 | 460 | 18 | 47.35 | 1010.1 |
| 2 | 460 | 27 | 49.18 | 1518.9 |
| 3 | 460 | 16 | 50.78 | 1171.2 |
+------+----------+--------+--------+--------+
We need to get the target row's number, plus the 2 following and 2 preceding rows, as quickly as possible.
Indexed columns:
id
item_id
Sorting:
damage (DESC)
wins (DESC)
battles (ASC)
id (ASC)
In this example, we need to find the row number and the ±2 neighbouring rows for id = 4 and item_id = 255. The result table should be:
+------+----------+--------+--------+--------+------+
| id | item_id | battles| wins | damage | rank |
+------+----------+--------+--------+--------+------+
| 5 | 255 | 35 | 52.09 | 1245.4 | 2 |
| 3 | 255 | 35 | 52.08 | 1245.3 | 3 |
| 4 | 255 | 35 | 52.08 | 1245.3 | 4 |
| 6 | 255 | 35 | 52.08 | 1245.3 | 5 |
| 7 | 255 | 35 | 52.08 | 1245.3 | 6 |
+------+----------+--------+--------+--------+------+
How can I do this with the ROW_NUMBER window function?
Is there any way to optimize the query to make it faster, given that the other columns have no indexes?
CREATE OR REPLACE FUNCTION find_top(in_id integer, in_item_id integer) RETURNS TABLE (
r_id int,
r_item_id int,
r_battles int,
r_wins real,
r_damage real,
r_rank bigint,
r_eff real,
r_frags int
) AS $$
DECLARE
center_place bigint;
BEGIN
SELECT place INTO center_place FROM
(SELECT
id, item_id,
ROW_NUMBER() OVER (ORDER BY damage DESC, wins DESC, battles, id) AS place
FROM
public.my_table
WHERE
item_id = in_item_id
AND battles >= 20
) AS s
WHERE s.id = in_id;
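RETURN QUERY SELECT
-- Column order must match RETURNS TABLE; pt.eff and pt.frags are assumed column names behind r_eff and r_frags
pt.id, pt.item_id, pt.battles, pt.wins, pt.damage, s.place, pt.eff, pt.frags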
FROM
(
SELECT * FROM
(SELECT
ROW_NUMBER () OVER (ORDER BY damage DESC, wins DESC, battles, id) AS place,
id, item_id
FROM
public.my_table
WHERE
item_id = in_item_id
AND battles >= 20) x
WHERE x.place BETWEEN (center_place - 2) AND (center_place + 2)
) s
JOIN
public.my_table pt
ON pt.id = s.id AND pt.item_id = s.item_id
ORDER BY s.place;
END;
$$ LANGUAGE plpgsql;
CREATE OR REPLACE FUNCTION find_top(in_id integer, in_item_id integer) RETURNS TABLE (
r_id int,
r_item_id int,
r_battles int,
r_wins real,
r_damage real,
r_rank bigint,
r_eff real,
r_frags int
) AS $$
BEGIN
RETURN QUERY
SELECT c.*, B.ord -3 AS row_number
FROM
( SELECT array_agg(id) OVER w AS id
, array_agg(item_id) OVER w AS item_id
FROM public.my_table
WINDOW w AS (ORDER BY damage DESC, wins DESC, battles, id ROWS BETWEEN 2 PRECEDING AND 2 FOLLOWING)
) AS a
CROSS JOIN LATERAL unnest(a.id, a.item_id) WITH ORDINALITY AS b(id, item_id, ord)
INNER JOIN public.my_table AS c
ON c.id = b.id
AND c.item_id = b.item_id
WHERE a.item_id[3] = in_item_id
AND a.id[3] = in_id
ORDER BY b.ord ;
END ; $$ LANGUAGE plpgsql;
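The "rank, then take ±2 rows" idea can be checked on the sample data with a small script. This is only a sketch, not the function above: it uses Python's built-in sqlite3 module (assuming the bundled SQLite is 3.25 or newer, which is needed for window functions) instead of PostgreSQL, and the table holds just the item_id = 255 rows from the question. The query ranks the filtered rows once in a CTE and then joins the CTE to itself to pick the window around the target id.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE my_table (id INT, item_id INT, battles INT, wins REAL, damage REAL)"
)
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?, ?, ?)",
    [
        (1, 255, 35, 52.08, 1245.2), (2, 255, 35, 52.08, 1245.2),
        (3, 255, 35, 52.08, 1245.3), (4, 255, 35, 52.08, 1245.3),
        (5, 255, 35, 52.09, 1245.4), (6, 255, 35, 52.08, 1245.3),
        (7, 255, 35, 52.08, 1245.3), (8, 255, 35, 52.08, 1245.7),
    ],
)

# Rank once, then select the rows whose rank is within 2 of the target row's rank.
QUERY = """
WITH ranked AS (
    SELECT id, item_id, battles, wins, damage,
           ROW_NUMBER() OVER (ORDER BY damage DESC, wins DESC, battles, id) AS place
    FROM my_table
    WHERE item_id = :item_id AND battles >= 20
)
SELECT r.id, r.item_id, r.battles, r.wins, r.damage, r.place
FROM ranked r
JOIN ranked c ON c.id = :id
WHERE r.place BETWEEN c.place - 2 AND c.place + 2
ORDER BY r.place
"""
result = conn.execute(QUERY, {"id": 4, "item_id": 255}).fetchall()
for row in result:
    print(row)
```

For id = 4 this returns ids 5, 3, 4, 6, 7 with ranks 2 to 6, which matches the expected table in the question.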
Pay attention to cases 1-2 and 3-4. Notice that all rows from a config_id group stay together with all rows of xgroup.
Detailed task description:
I have order details for some resources. For each ordered resource there should be an allocated resource. Resources may belong to one another and to a group. For example, a virtual machine (VM) has RAM, CPU, and HDD.
Currently the relation between orders and allocated resources is broken. I am trying to write a query to analyze what is ordered without an allocated resource and what is allocated without an order.
CREATE TABLE order_detail (
id SERIAL,
order_id INTEGER,
resource_type_id INTEGER,
allocated_resource_id INTEGER,
amount INTEGER
);
CREATE TABLE allocated_resource (
id SERIAL,
group_id INTEGER,
resource_type_id INTEGER,
resource_uuid UUID
);
This is easily done:
SELECT * FROM order_detail od
WHERE od.allocated_resource_id IS NULL;
SELECT * FROM allocated_resource ar
WHERE NOT EXISTS ( SELECT 1 FROM order_detail od WHERE od.allocated_resource_id = ar.id);
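As a quick check, the two queries can be run against the example data from the question. This is a sketch using Python's built-in sqlite3 module rather than PostgreSQL; the amount and resource_uuid columns are left out because the example data does not include them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE order_detail (
    id INTEGER PRIMARY KEY, order_id INTEGER,
    resource_type_id INTEGER, allocated_resource_id INTEGER);
CREATE TABLE allocated_resource (
    id INTEGER PRIMARY KEY, group_id INTEGER, resource_type_id INTEGER);
INSERT INTO order_detail VALUES
    (41, 1, 70, 1), (42, 1, 71, NULL), (43, 1, 73, NULL),
    (44, 2, 70, NULL), (45, 2, 71, 5);
INSERT INTO allocated_resource VALUES
    (1, 1, 70), (2, 1, 71), (3, 1, 72), (4, 2, 70), (5, 2, 71), (6, 2, 73);
""")

# Order details that have no allocated resource.
unallocated = [r[0] for r in conn.execute(
    "SELECT id FROM order_detail WHERE allocated_resource_id IS NULL ORDER BY id")]

# Allocated resources that no order detail points at (anti-join).
unordered = [r[0] for r in conn.execute("""
    SELECT ar.id FROM allocated_resource ar
    WHERE NOT EXISTS (SELECT 1 FROM order_detail od
                      WHERE od.allocated_resource_id = ar.id)
    ORDER BY ar.id""")]

print(unallocated, unordered)
```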
But I need to find which order an allocated resource should be assigned to, or which resource should be allocated for a given order.
Data example:
id | order_id | allocated_resource_id | resource_type_id
--------------------------------------------------------
41 | 1 | 1 | 70
42 | 1 | | 71
43 | 1 | | 73
44 | 2 | | 70
45 | 2 | 5 | 71
id | group_id | resource_type_id
--------------------------------
1 | 1 | 70
2 | 1 | 71
3 | 1 | 72
4 | 2 | 70
5 | 2 | 71
6 | 2 | 73
Here I want to get:
id | order_id | allocated_resource_id | resource_type_id | ar.id | ar.group_id | ar.resource_type_id
----------------------------------------------------------------------------------------------------
41 | 1        | 1                     | 70               | 1     | 1           | 70
42 | 1        |                       | 71               |       |             |
43 | 1        |                       | 73               |       |             |
   |          |                       |                  | 2     | 1           | 71
   |          |                       |                  | 3     | 1           | 72
44 | 2        |                       | 70               |       |             |
45 | 2        | 5                     | 71               | 5     | 2           | 71
   |          |                       |                  | 4     | 2           | 70
   |          |                       |                  | 6     | 2           | 73
Unfortunately, ORDER BY ar.id or ORDER BY od.id moves the unbound order details/allocated resources to the bottom. I want to keep order details and allocated resources together, as in the example above.
To solve this, I first select all available related groups:
od_ar_group AS (
SELECT
od.order_id, ar.group_id,
cast( od.order_id AS TEXT ) || '-' || cast( ar.group_id AS TEXT ) AS odar
FROM order_detail od
FULL JOIN allocated_resource ar ON ar.id = od.allocated_resource_id
WHERE od.order_id IS NOT NULL AND ar.group_id IS NOT NULL
GROUP BY od.order_id, ar.group_id
)
Then I attach group info to both tables:
SELECT od.*, odar.odar
FROM order_detail od
LEFT JOIN od_ar_group odar ON odar.order_id = od.order_id
SELECT ar.*, odar.odar
FROM allocated_resource ar
LEFT JOIN od_ar_group odar ON odar.group_id = ar.group_id
Finally I can join those groups and sort inside them, so that an unbound order detail/allocated resource stays inside its group and not at the bottom of the table:
SELECT
CASE WHEN od.odar IS NOT NULL THEN od.odar ELSE ar.odar END AS odar,
od.*, ar.*
FROM od_grouped od
FULL JOIN ar_grouped ar ON ar.odar = od.odar
AND ar.id = od.allocated_resource_id
ORDER BY odar, CASE WHEN od.id IS NULL THEN 1 WHEN ar.id IS NULL THEN 2 ELSE 0 END, od.id NULLS LAST
Is there an easier way to keep related rows inside their group?
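One possible simplification, sketched here with Python's built-in sqlite3 module on the example data, is to attach an order_id to every row (including unbound allocated resources, via the order/group pairs already known from bound rows) and sort on that single key. This is an assumption-laden sketch rather than a tested PostgreSQL solution: it assumes each group maps to exactly one order, and an unbound resource whose group has no bound row at all would be dropped by the inner join. The CTE name `combined` and the column `sort_order` are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE order_detail (
    id INTEGER PRIMARY KEY, order_id INTEGER,
    resource_type_id INTEGER, allocated_resource_id INTEGER);
CREATE TABLE allocated_resource (
    id INTEGER PRIMARY KEY, group_id INTEGER, resource_type_id INTEGER);
INSERT INTO order_detail VALUES
    (41, 1, 70, 1), (42, 1, 71, NULL), (43, 1, 73, NULL),
    (44, 2, 70, NULL), (45, 2, 71, 5);
INSERT INTO allocated_resource VALUES
    (1, 1, 70), (2, 1, 71), (3, 1, 72), (4, 2, 70), (5, 2, 71), (6, 2, 73);
""")

QUERY = """
WITH od_ar_group AS (          -- order/group pairs known from bound rows
    SELECT DISTINCT od.order_id, ar.group_id
    FROM order_detail od
    JOIN allocated_resource ar ON ar.id = od.allocated_resource_id
),
combined AS (
    -- every order detail, with its allocated resource when bound
    SELECT od.order_id AS sort_order, od.id AS od_id, ar.id AS ar_id
    FROM order_detail od
    LEFT JOIN allocated_resource ar ON ar.id = od.allocated_resource_id
    UNION ALL
    -- unbound allocated resources, placed under the order of their group
    SELECT g.order_id, NULL, ar.id
    FROM allocated_resource ar
    JOIN od_ar_group g ON g.group_id = ar.group_id
    WHERE NOT EXISTS (SELECT 1 FROM order_detail od
                      WHERE od.allocated_resource_id = ar.id)
)
SELECT od_id, ar_id
FROM combined
ORDER BY sort_order, od_id IS NULL, od_id, ar_id
"""
grouped = conn.execute(QUERY).fetchall()
for row in grouped:
    print(row)
```

The rows come out per order: first the order details by id (with the bound resource beside them), then the unbound resources of the related group.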
I have 2 tables (actually there are 4, but for now let's say there are 2) with data like this:
Table PersonA
ClientID ID From Till
1 10 1.1.2017 30.4.2017
1 12 1.8.2017 2.1.2018
Table PersonB
ClientID ID From Till
1 6 1.3.2017 30.6.2017
I need to generate a view that shows something like this:
ClientID From Till PersonA PersonB
1 1.1.2017 28.2.2017 10 NULL
1 1.3.2017 30.4.2017 10 6
1 1.5.2017 30.6.2017 NULL 6
1 1.8.2017 02.1.2018 12 NULL
So basically I need to create a view that shows which "persons" each client had in a given period.
When there is an overlap, the client has both PersonA and PersonB (the same should apply to PersonC and PersonD).
In the final view, a client must not have any overlapping date ranges.
I don't know how to approach this.
In an adaptation of this algorithm, we can already handle the overlaps:
-- Temp tables are created with CREATE TABLE; DECLARE only works for @table variables
create table #PersonA (ClientID int, ID int, [From] date, Till date);
insert into #PersonA values (1,10,'20170101','20170430'),(1,12,'20170801','20180112');
create table #PersonB (ClientID int, ID int, [From] date, Till date);
insert into #PersonB values (1,6,'20170301','20170630');
create table #PersonC (ClientID int, ID int, [From] date, Till date);
insert into #PersonC values (1,12,'20170401','20170625');
create table #PersonD (ClientID int, ID int, [From] date, Till date);
insert into #PersonD values (1,14,'20170501','20170525'),(1,14,'20170510','20171122');
with X(ClientID,EdgeDate)
as (select ClientID
,case
when toggle = 1
then Till
else [From]
end as EdgeDate
from
(
select ClientID,[From],Till from #PersonA
union all
select ClientID,[From],Till from #PersonB
union all
select ClientID,[From],Till from #PersonC
union all
select ClientID,[From],Till from #PersonD
) as concated
cross join
(
select -1 as toggle
union all
select 1 as toggle
) as toggler
),merged
as (select distinct
S.ClientID
,S.EdgeDate as [From]
,min(E.EdgeDate) as Till
from
X as S
inner join X as E
on S.ClientID = E.ClientID
and S.EdgeDate < E.EdgeDate
group by S.ClientID
,S.EdgeDate
),prds
as (select distinct
merged.ClientID
,merged.[From]
,merged.Till
,A.ID as PersonA
,B.ID as PersonB
,C.ID as PersonC
,D.ID as PersonD
from
merged
left join #PersonA as A
on merged.ClientID = A.ClientID
and A.[From] <= merged.[From]
and merged.Till <= A.Till
left join #PersonB as B
on merged.ClientID = B.ClientID
and B.[From] <= merged.[From]
and merged.Till <= B.Till
left join #PersonC as C
on merged.ClientID = C.ClientID
and C.[From] <= merged.[From]
and merged.Till <= C.Till
left join #PersonD as D
on merged.ClientID = D.ClientID
and D.[From] <= merged.[From]
and merged.Till <= D.Till
where not(A.ID is null
and B.ID is null
and C.ID is null
and D.ID is null
)
)
select ClientID
,[From]
,case
when Till = lead([From]) over(partition by ClientID order by Till)
then dateadd(d,-1,Till)
else Till
end as Till
,PersonA
,PersonB
,PersonC
,PersonD
from
prds
order by ClientID
,[From]
,Till;
Output with just the two Person tables given in the question:
+----------+------------+------------+---------+---------+
| ClientID | From | Till | PersonA | PersonB |
+----------+------------+------------+---------+---------+
| 1 | 2017-01-01 | 2017-02-28 | 10 | NULL |
| 1 | 2017-03-01 | 2017-04-29 | 10 | 6 |
| 1 | 2017-04-30 | 2017-06-30 | NULL | 6 |
| 1 | 2017-08-01 | 2018-01-12 | 12 | NULL |
+----------+------------+------------+---------+---------+
Output of script as it is above, with four Person tables:
+----------+------------+------------+---------+---------+---------+---------+
| ClientID | From | Till | PersonA | PersonB | PersonC | PersonD |
+----------+------------+------------+---------+---------+---------+---------+
| 1 | 2017-01-01 | 2017-02-28 | 10 | NULL | NULL | NULL |
| 1 | 2017-03-01 | 2017-03-31 | 10 | 6 | NULL | NULL |
| 1 | 2017-04-01 | 2017-04-29 | 10 | 6 | 12 | NULL |
| 1 | 2017-04-30 | 2017-04-30 | NULL | 6 | 12 | NULL |
| 1 | 2017-05-01 | 2017-05-09 | NULL | 6 | 12 | 14 |
| 1 | 2017-05-10 | 2017-05-24 | NULL | 6 | 12 | 14 |
| 1 | 2017-05-25 | 2017-06-24 | NULL | 6 | 12 | 14 |
| 1 | 2017-06-25 | 2017-06-29 | NULL | 6 | NULL | 14 |
| 1 | 2017-06-30 | 2017-07-31 | NULL | NULL | NULL | 14 |
| 1 | 2017-08-01 | 2017-11-21 | 12 | NULL | NULL | 14 |
| 1 | 2017-11-22 | 2018-01-12 | 12 | NULL | NULL | NULL |
+----------+------------+------------+---------+---------+---------+---------+
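For comparison, the same "cut at every edge date" idea can be sketched procedurally in Python. This is not a translation of the script above: it is a variant that treats each inclusive range as the half-open interval [From, Till + 1 day), which avoids the end-date adjustment step. The function name `split_periods` and the dict-of-lists input shape are assumptions made for the example, and the input uses the dates from the question (PersonA's second range ends on 2 January 2018).

```python
from datetime import date, timedelta

ONE_DAY = timedelta(days=1)


def split_periods(tables):
    """tables maps a table name to rows of (client_id, person_id, from, till),
    with inclusive dates. Returns non-overlapping periods per client as
    (client_id, from, till, id_from_table_1, id_from_table_2, ...)."""
    names = list(tables)
    clients = sorted({row[0] for rows in tables.values() for row in rows})
    result = []
    for client in clients:
        # Every start date and every (end date + 1 day) is a cut point.
        edges = set()
        for rows in tables.values():
            for cid, _pid, start, end in rows:
                if cid == client:
                    edges.add(start)
                    edges.add(end + ONE_DAY)
        ordered = sorted(edges)
        for seg_start, seg_end in zip(ordered, ordered[1:]):
            ids = []
            for name in names:
                match = None
                for cid, pid, start, end in tables[name]:
                    # A row applies if it covers the whole segment.
                    if cid == client and start <= seg_start and seg_end - ONE_DAY <= end:
                        match = pid
                        break
                ids.append(match)
            if any(i is not None for i in ids):  # skip gaps nobody covers
                result.append((client, seg_start, seg_end - ONE_DAY, *ids))
    return result


periods = split_periods({
    "PersonA": [(1, 10, date(2017, 1, 1), date(2017, 4, 30)),
                (1, 12, date(2017, 8, 1), date(2018, 1, 2))],
    "PersonB": [(1, 6, date(2017, 3, 1), date(2017, 6, 30))],
})
for p in periods:
    print(p)
```

With the two tables from the question this yields the four periods shown in the desired view, including 1.3.2017 to 30.4.2017 for the overlap.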
I have 2 tables that I want to match by ID.
EDIT: Table1:
| id | other columns A |
| 23 | ... |
| 27 | ... |
| 9 | ... |
| 50 | ... |
Table2:
| id_new | id_old | other columns B
| 23 | 7 | ...
| 27 | 8 | ...
| 33 | 9 | ...
The problem is that the second table contains 2 ID columns, the first with the new ID and the second with the old one, and either can match the ID from the first table.
EDIT: some rows in Table1 have an ID that matches neither id_new nor id_old, but I want to keep them in the new table.
This is my desired result:
| id | id_new | id_old | other columns A + B
| 23 | 23 | 7 | A + B
| 27 | 27 | 8 | A + B
| 9 | 33 | 9 | A + B
| 50 | -- | -- | A
I tried this one but it's a huge dataset and my query takes a long time to execute.
create table spoj2
as
select *
from table1
left join table2 on table1.id = table2.id_new
or table1.id = table2.id_old
Is this what you need?
select table1.id, IFNULL(T2A.id_new, T2B.id_new) as id_new
, IFNULL(T2A.id_old, T2B.id_old) as id_old
FROM table1
LEFT JOIN table2 as T2A ON table1.id = T2A.id_new
LEFT JOIN table2 as T2B ON table1.id = T2B.id_old
WITH t1(id,o_t1) AS ( VALUES
(23,'...'),
(27,'...'),
(9,'...')
), t2(id_new,id_old,o_t2) AS ( VALUES
(23,7,'...'),
(27,8,'...'),
(33,9,'...')
)
SELECT t1.id,t2.id_new,t2.id_old,t1.o_t1,t2.o_t2 FROM t1
INNER JOIN t2 ON t2.id_new = t1.id
UNION ALL
SELECT t1.id,t2.id_new,t2.id_old,t1.o_t1,t2.o_t2 FROM t1
INNER JOIN t2 ON t2.id_old = t1.id;
Result:
id | id_new | id_old | o_t1 | o_t2
----+--------+--------+------+------
23 | 23 | 7 | ... | ...
27 | 27 | 8 | ... | ...
9 | 33 | 9 | ... | ...
(3 rows)
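To also keep rows such as id 50 that match neither column, the two-LEFT-JOIN variant can be tried on the question's data. The sketch below uses Python's built-in sqlite3 module and COALESCE (the portable counterpart of IFNULL); the aliases `n` and `o` are arbitrary. It assumes an id never matches id_new in one row and id_old in a different row; if that can happen, the id_new match wins here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (id INTEGER);
CREATE TABLE table2 (id_new INTEGER, id_old INTEGER);
INSERT INTO table1 VALUES (23), (27), (9), (50);
INSERT INTO table2 VALUES (23, 7), (27, 8), (33, 9);
""")

# Each join uses a plain equality, so each one can use its own index,
# unlike a single join with OR in the condition.
matched = conn.execute("""
SELECT t1.id,
       COALESCE(n.id_new, o.id_new) AS id_new,
       COALESCE(n.id_old, o.id_old) AS id_old
FROM table1 t1
LEFT JOIN table2 n ON n.id_new = t1.id
LEFT JOIN table2 o ON o.id_old = t1.id
ORDER BY t1.rowid
""").fetchall()
for row in matched:
    print(row)
```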
I have three foreign identifiers in my PSQL view. How could I replace the NULL second_id values with the third_id values based on their common first_id?
Currently:
first_id | second_id | third_id
----------+-----------+----------
1 | | 11
1 | | 11
1 | | 11
1 | 22 | 22
2 | 33 | 33
3 | 44 | 44
4 | 55 | 55
5 | 66 | 66
6 | | 77
6 | | 77
6 | | 77
6 | | 77
6 | 88 | 88
Should be:
first_id | second_id | third_id
----------+-----------+----------
1 | 22 | 11
1 | 22 | 11
1 | 22 | 11
1 | 22 | 22
2 | 33 | 33
3 | 44 | 44
4 | 55 | 55
5 | 66 | 66
6 | 88 | 77
6 | 88 | 77
6 | 88 | 77
6 | 88 | 77
6 | 88 | 88
How can I make this change?
The NULL values in the second_id column should be filled i.e. there shouldn't be blank cells.
If the second_id column shares a value with the third_id column, this value should fill the blank cells in the second_id column.
They should both be based on their common first_id.
Thanks so much. I really appreciate it.
The second_id is really a CASE WHEN modification of the third_id. This modification is made in the view.
VIEW:
View "public.my_view"
Column | Type | Modifiers | Storage | Description
-----------------------------+-----------------------------+-----------+----------+-------------
row_number | bigint | | plain |
first_id | integer | | plain |
second_id | integer | | plain |
third_id | integer | | plain |
first_type | character varying(255) | | extended |
date_1 | timestamp without time zone | | plain |
date_2 | timestamp without time zone | | plain |
date_3 | timestamp without time zone | | plain |
View definition:
SELECT row_number() OVER (PARTITION BY t.first_id) AS row_number,
t.first_id,
CASE
WHEN t.localization_key::text = 'rq.bkd'::text THEN t.third_id
ELSE NULL::integer
END AS second_id,
t.third_id,
t.first_type,
CASE
WHEN t.localization_key::text = 'rq.bkd'::text THEN t.created_at
ELSE NULL::timestamp without time zone
END AS date_1,
CASE
WHEN t.localization_key::text = 'st.appt'::text THEN t.created_at
ELSE NULL::timestamp without time zone
END AS date_2,
CASE
WHEN t.localization_key::text = 'st.eta'::text THEN t.created_at
ELSE NULL::timestamp without time zone
END AS date_3
FROM my_table t
WHERE (t.localization_key::text = 'rq.bkd'::text OR t.localization_key::text = 'st.appt'::text OR t.localization_key::text = 'st.eta'::text) AND t.first_type::text = 'thing'::text
ORDER BY t.created_at DESC;
Here is a link to the table definition that the view is using (my_table).
https://gist.github.com/dankreiger/376f6545a0acff19536d
Thanks again for your help.
You can get it by:
select a.first_id, coalesce(a.second_id,b.second_id), a.third_id
from my_table a
left outer join
(
select first_id, second_id from my_table
where second_id is not null
) b
using (first_id)
So the update should be:
update my_table a set second_id = b.second_id
from
(
select first_id, second_id from my_table
where second_id is not null
) b
where b.first_id = a.first_id and a.second_id is null
You cannot UPDATE the underlying table my_table because it does not have the second_id column, so you should make the view display the data the way you want it. That is fairly straightforward with a CTE:
CREATE VIEW my_view AS
WITH second (first, id) AS (
SELECT first_id, third_id
FROM my_table
WHERE localization_key = 'rq.bkd')
SELECT
row_number() OVER (PARTITION BY t.first_id) AS row_number,
t.first_id,
s.id AS second_id,
t.third_id,
t.first_type,
CASE
WHEN t.localization_key = 'rq.bkd' THEN t.created_at
END AS date_1,
CASE
WHEN t.localization_key = 'st.appt' THEN t.created_at
END AS date_2,
CASE
WHEN t.localization_key = 'st.eta' THEN t.created_at
END AS date_3
FROM my_table t
JOIN second s ON s.first = t.first_id
WHERE (t.localization_key = 'rq.bkd'
OR t.localization_key = 'st.appt'
OR t.localization_key = 'st.eta')
AND t.first_type = 'thing'
ORDER BY t.created_at DESC;
This assumes that where my_table.localization_key = 'rq.bkd' you do have exactly 1 third_id value; if not you should add the appropriate qualifiers such as ORDER BY first_id ASC NULLS LAST LIMIT 1 or some other suitable filter. Also note that the CTE is JOINed, not LEFT JOINed, assuming there is always a valid pair (first_id, third_id) without NULLs.
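Another way to express the same fill, shown here only as a sketch, is a window aggregate over first_id: MAX ignores NULLs, so every row in a group receives the group's single non-NULL second_id. The example uses Python's built-in sqlite3 module (window functions need SQLite 3.25 or newer) and a cut-down stand-in table `v` holding a few rows of the view's current output; like the CTE above, it assumes at most one distinct second_id per first_id.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE v (first_id INTEGER, second_id INTEGER, third_id INTEGER)")
conn.executemany(
    "INSERT INTO v VALUES (?, ?, ?)",
    [(1, None, 11), (1, None, 11), (1, 22, 22),
     (2, 33, 33),
     (6, None, 77), (6, 88, 88)],
)

# Spread the non-NULL second_id across all rows sharing the same first_id.
filled = conn.execute("""
SELECT first_id,
       MAX(second_id) OVER (PARTITION BY first_id) AS second_id,
       third_id
FROM v
ORDER BY first_id, third_id
""").fetchall()
for row in filled:
    print(row)
```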
I have this table named Samples. The Date column values are just symbolic date values.
+----+------------+-------+------+
| Id | Product_Id | Price | Date |
+----+------------+-------+------+
| 1 | 1 | 100 | 1 |
| 2 | 2 | 100 | 2 |
| 3 | 3 | 100 | 3 |
| 4 | 1 | 100 | 4 |
| 5 | 2 | 100 | 5 |
| 6 | 3 | 100 | 6 |
...
+----+------------+-------+------+
I want to group by Product_Id so that I get the 1st sample in descending date order, with a new column holding the Price of the 7th sample row in each product group. If the 7th row does not exist, the value should be NULL.
Example:
+----+------------+-------+------+----------+
| Id | Product_Id | Price | Date | 7thPrice |
+----+------------+-------+------+----------+
| 4 | 1 | 100 | 4 | 120 |
| 5 | 2 | 100 | 5 | 100 |
| 6 | 3 | 100 | 6 | NULL |
+----+------------+-------+------+----------+
I believe I can get the table without the '7thPrice' column with the following:
SELECT * FROM (
SELECT ROW_NUMBER() OVER (PARTITION BY Product_Id ORDER BY date DESC) r, * FROM Samples
) T WHERE T.r = 1
Any suggestions?
You can try something like this. I used your query to create a CTE. Then joined rank1 to rank7.
;with sampleCTE
as
(SELECT ROW_NUMBER() OVER (PARTITION BY Product_Id ORDER BY date DESC) r, * FROM Samples)
select a.Id, a.Product_Id, a.Price, a.Date, b.Price as [7thPrice]
from
(select * from samplecte where r = 1) a
left join
(select * from samplecte where r=7) b
on a.product_id = b.product_id
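The rank-1-joined-to-rank-7 idea can be tried on made-up data. The sketch below uses Python's built-in sqlite3 module (SQLite 3.25 or newer for window functions); the sample prices are invented so that the 1st and 7th rows are distinguishable, and the output column is named `seventh_price` because an identifier starting with a digit would need quoting.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Samples (Id INTEGER, Product_Id INTEGER, Price INTEGER, Date INTEGER)"
)
# Product 1 has 7 samples (dates 1..7); product 2 has only 3 (dates 1..3).
samples = [(d, 1, 100 + 10 * d, d) for d in range(1, 8)]
samples += [(7 + d, 2, 200 + 10 * d, d) for d in range(1, 4)]
conn.executemany("INSERT INTO Samples VALUES (?, ?, ?, ?)", samples)

latest = conn.execute("""
WITH ranked AS (
    SELECT Id, Product_Id, Price, Date,
           ROW_NUMBER() OVER (PARTITION BY Product_Id ORDER BY Date DESC) AS r
    FROM Samples
)
SELECT a.Id, a.Product_Id, a.Price, a.Date, b.Price AS seventh_price
FROM ranked a
LEFT JOIN ranked b ON b.Product_Id = a.Product_Id AND b.r = 7
WHERE a.r = 1
ORDER BY a.Product_Id
""").fetchall()
for row in latest:
    print(row)
```

Product 1 gets the price of its 7th newest sample, while product 2 has fewer than 7 samples and therefore gets NULL (None in Python).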