Reverse column values after grouping - PostgreSQL

I have a table as follows
 id | group | value
----+-------+-------
  1 |     1 |     2
  2 |     1 |     4
  3 |     1 |     3
  4 |     2 |     2
  5 |     2 |     9
  6 |     2 |     5
I want to partition the rows by 'group' (ordered by 'id') and create a new column that holds the 'value' column in reverse order within each group, as follows:
 id | group | value | reversedvalue
----+-------+-------+---------------
  1 |     1 |     2 |             3
  2 |     1 |     4 |             4
  3 |     1 |     3 |             2
  4 |     2 |     2 |             5
  5 |     2 |     9 |             9
  6 |     2 |     5 |             2

Try the following:
SELECT q1.id, q1.group_id, q1.value, q2.value
FROM
(
    SELECT *, ROW_NUMBER() OVER (PARTITION BY group_id ORDER BY id) n
    FROM your_table
) q1
JOIN
(
    SELECT *, ROW_NUMBER() OVER (PARTITION BY group_id ORDER BY id DESC) n
    FROM your_table
) q2
ON q1.group_id = q2.group_id AND q1.n = q2.n
You can also use a CTE:
WITH cte AS (
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY group_id ORDER BY id) n1,
           ROW_NUMBER() OVER (PARTITION BY group_id ORDER BY id DESC) n2
    FROM your_table
)
SELECT q1.id, q1.group_id, q1.value, q2.value
FROM cte q1
JOIN cte q2
  ON q1.group_id = q2.group_id AND q1.n1 = q2.n2;
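Since group is a reserved word in PostgreSQL, the queries above assume the column has been renamed to group_id. If the column really is called group as in the question, it has to be double-quoted; here is a sketch of the same CTE approach with the question's column names and an explicit alias for the reversed column (your_table stands for your actual table):
WITH cte AS (
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY "group" ORDER BY id)      AS n1,
           ROW_NUMBER() OVER (PARTITION BY "group" ORDER BY id DESC) AS n2
    FROM your_table
)
SELECT q1.id, q1."group", q1.value, q2.value AS reversedvalue
FROM cte q1
JOIN cte q2
  ON q1."group" = q2."group" AND q1.n1 = q2.n2
ORDER BY q1.id;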

Recursive CTE in PostgreSQL for knapsack problem

I have a dataset with 3 columns:
 Item_id | Sourced_from  | Cost
---------+---------------+------
       1 | Local         |   15
       2 | Local         |   10
       3 | Local         |   20
       4 | International |   60
I am trying to write a query in PostgreSQL to fetch the total number of local and international items a customer can buy within a cash limit. For a cash limit of 50, this is the output I am expecting:
 Local | International
-------+---------------
     3 |             0
I have a pretty basic knowledge of PostgreSQL, and after googling it seems like this could be solved with a recursive CTE, but I am unable to figure out how I should select my seed/anchor point in this scenario.
Any ideas on how I should approach this?
Not with a recursive CTE, but still works:
DDL/DML:
create table T
(
    id integer primary key generated by default AS IDENTITY,
    kind text not null,
    cost integer not null
);

insert into T(kind, cost)
values ('local', 15),
       ('local', 10),
       ('local', 20),
       ('international', 60);

-- 4. this outer CTE and the following self-join are only necessary in order
--    to display the rows that have a count() of 0
with sub as
(
    -- 3. find the total cost of buying this row + all previous rows, grouped by its kind
    select X.kind, sum(X.cost) as cost, X.rn
    from (
        with cte as (
            -- 1. assign an increasing row number to each row of the table, ordered by its cost
            select *, row_number() over (order by T.cost asc, T.kind) as rn
            from T
        )
        -- 2. self-join the CTE on each row with the same kind, but join it only with the
        --    rows that have a row number less than or equal to the current row number
        select A.id, A.kind, A.cost, B.rn
        from cte as A
        join cte as B on A.kind = B.kind and A.rn <= B.rn
    ) as X
    group by X.kind, X.rn
)
select M.kind, count(N.*)
from sub as M
-- 5. count only the goods that fit in our budget (i.e. 50)
left outer join sub as N on M.rn = N.rn and N.cost <= 50
group by M.kind;
Output (db-fiddle):
+-------------+-----+
|kind |count|
+-------------+-----+
|local |3 |
|international|0 |
+-------------+-----+
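The same greedy idea (take the cheapest items first and count how many stay within the budget) can also be written with a running-total window function instead of the self-join. A sketch, assuming the T table created above and a budget of 50 (count(*) FILTER requires PostgreSQL 9.4 or later):
select kind,
       count(*) filter (where running_cost <= 50) as count
from (
    -- cumulative cost per kind, cheapest items first
    select kind,
           sum(cost) over (partition by kind order by cost, id) as running_cost
    from T
) s
group by kind;
With the sample data this also returns local = 3 and international = 0, since every kind keeps at least one row in the subquery.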
I made a CTE example to solve the problem:
I recreated your case with:
create table kp (item_id int, sourced_from varchar, cost int);
insert into kp values (1,'local',15);
insert into kp values (2,'local',10);
insert into kp values (3,'local',20);
insert into kp values (4,'international',60);
The non-recursive term of the following query:
selects from kp only the items with cost less than 50
adds the item_id to list_of_items
The recursive term:
joins with kp, checking that sourced_from is the same and that kp.item_id is not already contained in list_of_items (avoiding adding the same item multiple times)
computes the total cost (total_cost)
appends the new item's item_id to list_of_items
WITH RECURSIVE items (item_id, next_item_id, sourced_from, total_cost, nr_items, list_of_items) AS (
    SELECT
        item_id,
        item_id as next_item_id,
        sourced_from,
        cost as total_cost,
        1 as nr_items,
        ARRAY[item_id] as list_of_items
    FROM kp
    WHERE cost < 50
    UNION ALL
    SELECT
        kp.item_id,
        items.item_id as next_item_id,
        items.sourced_from,
        items.total_cost + kp.cost as total_cost,
        items.nr_items + 1 as nr_items,
        items.list_of_items || kp.item_id as list_of_items
    FROM kp
    JOIN items
      ON items.sourced_from = kp.sourced_from
     AND NOT (items.list_of_items @> ARRAY[kp.item_id])
    WHERE kp.cost + items.total_cost < 50
)
SELECT * FROM items;
If you run it against the above dataset you'll end up with the following detailed result:
item_id | next_item_id | sourced_from | total_cost | nr_items | list_of_items
---------+--------------+--------------+------------+----------+---------------
1 | 1 | local | 15 | 1 | {1}
2 | 2 | local | 10 | 1 | {2}
3 | 3 | local | 20 | 1 | {3}
1 | 2 | local | 25 | 2 | {2,1}
1 | 3 | local | 35 | 2 | {3,1}
2 | 1 | local | 25 | 2 | {1,2}
2 | 3 | local | 30 | 2 | {3,2}
3 | 1 | local | 35 | 2 | {1,3}
3 | 2 | local | 30 | 2 | {2,3}
1 | 2 | local | 45 | 3 | {3,2,1}
1 | 3 | local | 45 | 3 | {2,3,1}
2 | 1 | local | 45 | 3 | {3,1,2}
2 | 3 | local | 45 | 3 | {1,3,2}
3 | 1 | local | 45 | 3 | {2,1,3}
3 | 2 | local | 45 | 3 | {1,2,3}
(15 rows)
which shows all the permutations of the 3 local items.
Now if you substitute the last SELECT section with
SELECT * FROM items order by nr_items desc, total_cost desc, list_of_items asc limit 1;
you'll also be able to pick the combination having the maximum number of items with the cost closest to the budget (I also added an ascending ordering on list_of_items to always receive the same result in case of multiple combinations), which in the case above would result in
item_id | next_item_id | sourced_from | total_cost | nr_items | list_of_items
---------+--------------+--------------+------------+----------+---------------
3 | 2 | local | 45 | 3 | {1,2,3}
(1 row)
If you are just interested in the maximum by sourced_from then the last SELECT becomes
select sourced_from, max(nr_items) nr_items from items group by sourced_from;
with the expected result being
sourced_from | nr_items
--------------+----------
local | 3
(1 row)
Edit: to speed up the query and avoid generating multiple permutations of the same set of items (e.g. {2,1,3} and {1,2,3}), we can force the next item_id to be greater than the current one. Full query:
WITH RECURSIVE items (item_id, next_item_id, sourced_from, total_cost, nr_items, list_of_items) AS (
    SELECT
        item_id,
        item_id as next_item_id,
        sourced_from,
        cost as total_cost,
        1 as nr_items,
        ARRAY[item_id] as list_of_items
    FROM kp
    WHERE cost < 50
    UNION ALL
    SELECT
        kp.item_id,
        items.item_id as next_item_id,
        items.sourced_from,
        items.total_cost + kp.cost as total_cost,
        items.nr_items + 1 as nr_items,
        items.list_of_items || kp.item_id as list_of_items
    FROM kp
    JOIN items
      ON items.sourced_from = kp.sourced_from
     AND NOT (items.list_of_items @> ARRAY[kp.item_id])
     AND items.item_id < kp.item_id
    WHERE kp.cost + items.total_cost < 50
)
SELECT * FROM items;
Result:
item_id | next_item_id | sourced_from | total_cost | nr_items | list_of_items
---------+--------------+--------------+------------+----------+---------------
1 | 1 | local | 15 | 1 | {1}
2 | 2 | local | 10 | 1 | {2}
3 | 3 | local | 20 | 1 | {3}
2 | 1 | local | 25 | 2 | {1,2}
3 | 1 | local | 35 | 2 | {1,3}
3 | 2 | local | 30 | 2 | {2,3}
3 | 2 | local | 45 | 3 | {1,2,3}
(7 rows)
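The question's expected output also lists International with 0. Since the final aggregate above only returns sources that yield at least one affordable item, a sketch of one way to include the missing source, assuming the kp table from above, is to replace the last SELECT of the recursive query with a left join from the distinct sources:
SELECT src.sourced_from, coalesce(max(items.nr_items), 0) AS nr_items
FROM (SELECT DISTINCT sourced_from FROM kp) src
LEFT JOIN items ON items.sourced_from = src.sourced_from
GROUP BY src.sourced_from;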

Select row by id and its nearest rows sorted by some value. PostgreSQL

I have chapters table like this:
id | title | sort_number | book_id
1 | 'Chap 1' | 3 | 1
5 | 'Chap 2' | 6 | 1
8 | 'About ' | 1 | 1
9 | 'Chap 3' | 9 | 1
10 | 'Attack' | 1 | 2
Id is unique; sort_number is unique within the same book (book_id).
1) How can I load all data (3 rows) for 3 chapters (current, next and prev), sorted by sort_number, if I have only the current chapter id?
2) How can I load the current chapter data (1 row) and only the ids of the next and prev chapters, if they exist?
This can be done using window functions
select id, title, sort_number, book_id,
lag(id) over w as prev_chapter,
lead(id) over w as next_chapter
from chapters
window w as (partition by book_id order by sort_number);
With your sample data that returns:
id | title | sort_number | book_id | prev_chapter | next_chapter
---+--------+-------------+---------+--------------+-------------
8 | About | 1 | 1 | | 1
1 | Chap 1 | 3 | 1 | 8 | 5
5 | Chap 2 | 6 | 1 | 1 | 9
9 | Chap 3 | 9 | 1 | 5 |
10 | Attack | 1 | 2 | |
The above query can now be used to answer both your questions:
1)
select id, title, sort_number, book_id
from (
select id, title, sort_number, book_id,
--first_value(id) over w as first_chapter,
lag(id) over w as prev_chapter_id,
lead(id) over w as next_chapter_id
from chapters
window w as (partition by book_id order by sort_number)
) t
where 1 in (id, prev_chapter_id, next_chapter_id)
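With the sample data and current chapter id 1, this returns the rows with id 8, 1 and 5 (the previous, current and next chapters of book 1). The row order is not guaranteed without an explicit sort, so a sketch with the sort added, if you need them in reading order:
select id, title, sort_number, book_id
from (
    select id, title, sort_number, book_id,
           lag(id) over w as prev_chapter_id,
           lead(id) over w as next_chapter_id
    from chapters
    window w as (partition by book_id order by sort_number)
) t
where 1 in (id, prev_chapter_id, next_chapter_id)
order by sort_number;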
2)
select *
from (
select id, title, sort_number, book_id,
lag(id) over w as prev_chapter_id,
lead(id) over w as next_chapter_id
from chapters
window w as (partition by book_id order by sort_number)
) t
where id = 1

How to use join with an aggregate function in PostgreSQL?

I have 4 tables
Table1
id | name
1 | A
2 | B
Table2
id | name1
1 | C
2 | D
Table3
id | name2
1 | E
2 | F
Table4
id | name1_id | name2_id | name3_id
1 | 1 | 2 | 1
2 | 2 | 2 | 2
3 | 1 | 2 | 1
4 | 2 | 1 | 1
5 | 1 | 1 | 2
6 | 2 | 2 | 1
7 | 1 | 1 | 2
8 | 2 | 1 | 1
9 | 1 | 2 | 1
10 | 2 | 2 | 1
Now I want to join all the tables with Table4 and get this type of output:
name | count
{A,B} | {5, 5}
{C,D} | {5, 6}
{E,F} | {7, 3}
I tried this
select array_agg(distinct(t1.name)), array_agg(distinct(temp.test))
from
(select t4.name1_id, (count(t4.name1_id)) "test"
from table4 t4 group by t4.name1_id
) temp
join table1 t1
on temp.name1_id = t1.id
I am trying to achieve this. Can anybody help me?
Calculate the counts for every table separately and union the results:
select
array_agg(name order by name) as name,
array_agg(count order by name) as count
from (
select 1 as t, name, count(*)
from table4
join table1 t1 on t1.id = name1_id
group by name
union all
select 2 as t, name, count(*)
from table4
join table2 t2 on t2.id = name2_id
group by name
union all
select 3 as t, name, count(*)
from table4
join table3 t3 on t3.id = name3_id
group by name
) s
group by t;
name | count
-------+-------
{A,B} | {5,5}
{C,D} | {4,6}
{E,F} | {7,3}
(3 rows)
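For reference, with the sample data the inner union subquery s produces these rows before the outer aggregation (the t column marks which of the three joins each row came from, and grouping by it keeps each pair of arrays on its own output row):
 t | name | count
---+------+-------
 1 | A    |     5
 1 | B    |     5
 2 | C    |     4
 2 | D    |     6
 3 | E    |     7
 3 | F    |     3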

How to join a bigger table T1 to a smaller T2 with random repeating rows from T2

Sorry if this has been asked already in some way; I've searched and couldn't find this specific solution, so I would appreciate an answer or a pointer to the right place...
I have two tables of different (varying) length. Toy example:
T1:
SELECT * FROM T1;
gid | call_s1
-----+---------
1 | 1
3 | 1
4 | 1
7 | 1
8 | 1
(5 rows)
and
SELECT * FROM T2;
gid | dt_ping
-----+---------------------
1 | 2009-06-06 19:00:00
2 | 2009-06-06 19:00:15
3 | 2009-06-06 19:00:30
4 | 2009-06-06 19:00:45
(4 rows)
I would like to get a result T3 that assigns a random row from T2.dt_ping to each row in T1, repeating if necessary. For instance, a possible result would be:
gid | call_s1 | dt_ping
-----+----------+---------
1 | 1 | 2009-06-06 19:00:45
3 | 1 | 2009-06-06 19:00:30
4 | 1 | 2009-06-06 19:00:15
7 | 1 | 2009-06-06 19:00:00
8 | 1 | 2009-06-06 19:00:45
I have tried offset, order by random(), etc. Either I get a Cartesian product or NULLs. For instance, this is my last attempt:
SELECT
T1.gid
, T1.call_s1
, T2.dt_ping
FROM
(
SELECT
gid
, call_s1
, ceiling( random() * (SELECT count(*)::int as n FROM fake_called_small ) ) tgid
FROM fake_called_small
) T1
LEFT OUTER JOIN
fake_times_small T2
ON
T2.gid = T1.tgid;
and one of the results I get:
gid | call_s1 | dt_ping
-----+---------+---------------------
1 | 1 |
3 | 1 | 2009-06-06 19:00:30
4 | 1 | 2009-06-06 19:00:45
7 | 1 | 2009-06-06 19:00:15
8 | 1 |
(5 rows)
I know I'm missing something simple, but what?
By the way, I tried this:
SELECT
T1.gid
, T1.call_s1
, (SELECT dt_ping FROM fake_times_small T2 ORDER BY random() LIMIT 1)
FROM fake_called_small T1;
and got:
gid | call_s1 | dt_ping
-----+---------+---------------------
1 | 1 | 2009-06-06 19:00:30
3 | 1 | 2009-06-06 19:00:30
4 | 1 | 2009-06-06 19:00:30
7 | 1 | 2009-06-06 19:00:30
8 | 1 | 2009-06-06 19:00:30
(5 rows)
The same dt_ping row is repeated for every row because the subselect is evaluated only once.
If you are not concerned with performance, the following query should do what you want:
SELECT gid, call_s1, dt_ping
FROM (
SELECT t1.gid, call_s1, dt_ping,
ROW_NUMBER() OVER (PARTITION BY t1.gid, call_s1 ORDER BY RANDOM()) AS rn
FROM t1
CROSS JOIN t2
) x
WHERE rn = 1;
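For completeness: the NULLs in the first attempt above come from scaling random() by the row count of fake_called_small (5 rows) instead of fake_times_small (4 rows), so a generated tgid of 5 finds no match. A minimal sketch of that fix, assuming (as in the toy data) that fake_times_small.gid is the contiguous range 1..count(*); floor(...) + 1 is used rather than ceiling(...) so that random() = 0 cannot yield a non-matching 0:
SELECT t1.gid, t1.call_s1, t2.dt_ping
FROM (
    SELECT gid, call_s1,
           -- pick a random gid between 1 and the number of rows in fake_times_small
           (floor(random() * (SELECT count(*) FROM fake_times_small)) + 1)::int AS tgid
    FROM fake_called_small
) t1
JOIN fake_times_small t2 ON t2.gid = t1.tgid;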

1st and 7th row in grouping

I have this table named Samples. The Date column values are just symbolic date values.
+----+------------+-------+------+
| Id | Product_Id | Price | Date |
+----+------------+-------+------+
| 1 | 1 | 100 | 1 |
| 2 | 2 | 100 | 2 |
| 3 | 3 | 100 | 3 |
| 4 | 1 | 100 | 4 |
| 5 | 2 | 100 | 5 |
| 6 | 3 | 100 | 6 |
...
+----+------------+-------+------+
I want to group by Product_Id such that I have the 1st sample in descending date order and a new column added with the Price of the 7th sample row in each product group. If the 7th row does not exist, the value should be NULL.
Example:
+----+------------+-------+------+----------+
| Id | Product_Id | Price | Date | 7thPrice |
+----+------------+-------+------+----------+
| 4 | 1 | 100 | 4 | 120 |
| 5 | 2 | 100 | 5 | 100 |
| 6 | 3 | 100 | 6 | NULL |
+----+------------+-------+------+----------+
I believe I can achieve the table without the '7thPrice' column with the following:
SELECT * FROM (
SELECT ROW_NUMBER() OVER (PARTITION BY Product_Id ORDER BY date DESC) r, * FROM Samples
) T WHERE T.r = 1
Any suggestions?
You can try something like this. I used your query to create a CTE, then joined rank 1 to rank 7.
with sampleCTE as
(
    select row_number() over (partition by Product_Id order by Date desc) r, *
    from Samples
)
select *
from
    (select * from sampleCTE where r = 1) a
    left join
    (select * from sampleCTE where r = 7) b
    on a.Product_Id = b.Product_Id;
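Alternatively, a sketch without the self-join, assuming the Samples table from the question: lead(Price, 6) in descending Date order looks 6 rows ahead within the same product, i.e. at the 7th-newest sample, and yields NULL when that row does not exist.
select Id, Product_Id, Price, Date, "7thPrice"
from (
    select Id, Product_Id, Price, Date,
           row_number() over (partition by Product_Id order by Date desc) as r,
           lead(Price, 6) over (partition by Product_Id order by Date desc) as "7thPrice"
    from Samples
) t
where r = 1;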