Reverse column values after grouping - PostgreSQL

I have a table as follows
 id | group | value
----+-------+-------
  1 |     1 |     2
  2 |     1 |     4
  3 |     1 |     3
  4 |     2 |     2
  5 |     2 |     9
  6 |     2 |     5
I want to partition the rows by 'group' (ordered by 'id') and create a new column that holds the 'value' column in reverse order within each group, as follows:
 id | group | value | reversedvalue
----+-------+-------+---------------
  1 |     1 |     2 |             3
  2 |     1 |     4 |             4
  3 |     1 |     3 |             2
  4 |     2 |     2 |             5
  5 |     2 |     9 |             9
  6 |     2 |     5 |             2

Try the following:
SELECT q1.id, q1.group_id, q1.value, q2.value
FROM
(
    SELECT *, ROW_NUMBER() OVER (PARTITION BY group_id ORDER BY id) n
    FROM your_table
) q1
JOIN
(
    SELECT *, ROW_NUMBER() OVER (PARTITION BY group_id ORDER BY id DESC) n
    FROM your_table
) q2
ON q1.group_id = q2.group_id AND q1.n = q2.n
You can also use a CTE:
WITH cte AS (
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY group_id ORDER BY id) n1,
           ROW_NUMBER() OVER (PARTITION BY group_id ORDER BY id DESC) n2
    FROM your_table
)
SELECT q1.id, q1.group_id, q1.value, q2.value
FROM cte q1
JOIN cte q2
  ON q1.group_id = q2.group_id AND q1.n1 = q2.n2;
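Since group is a reserved word in PostgreSQL, the queries above assume the column has been renamed to group_id. If the column really is called group as in the question, it has to be double-quoted; here is a sketch of the same CTE approach with the question's column names and an explicit alias for the reversed column (your_table stands for your actual table):
WITH cte AS (
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY "group" ORDER BY id)      AS n1,
           ROW_NUMBER() OVER (PARTITION BY "group" ORDER BY id DESC) AS n2
    FROM your_table
)
SELECT q1.id, q1."group", q1.value, q2.value AS reversedvalue
FROM cte q1
JOIN cte q2
  ON q1."group" = q2."group" AND q1.n1 = q2.n2
ORDER BY q1.id;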

Recursive CTE in PostgreSQL for knapsack problem

I have a dataset with 3 columns:
 Item_id | Sourced_from  | Cost
---------+---------------+------
       1 | Local         |   15
       2 | Local         |   10
       3 | Local         |   20
       4 | International |   60
I am trying to write a query in PostgreSQL to fetch the total number of local and international items a customer can buy within a cash limit. For a cash limit of 50, this is the output I am expecting:
 Local | International
-------+---------------
     3 |             0
I have a pretty basic knowledge of PostgreSQL, and after googling it seems like this could be solved with a recursive CTE, but I am unable to figure out how I should select my seed/anchor point in this scenario.
Any ideas on how I should approach this?
Not with a recursive CTE, but still works:
DDL/DML:
create table T
(
    id integer primary key generated by default AS IDENTITY,
    kind text not null,
    cost integer not null
);

insert into T(kind, cost)
values ('local', 15),
       ('local', 10),
       ('local', 20),
       ('international', 60);

-- 4. this outer CTE and the following self-join are only necessary in order
--    to display the rows that have a count() of 0
with sub as
(
    -- 3. find the total cost of buying this row + all previous rows, grouped by its kind
    select X.kind, sum(X.cost) as cost, X.rn
    from (
        with cte as (
            -- 1. assign an increasing row number to each row of the table, ordered by its cost
            select *, row_number() over (order by T.cost asc, T.kind) as rn
            from T
        )
        -- 2. self-join the CTE on each row with the same kind, but join it only with the
        --    rows that have a row number less than or equal to the current row number
        select A.id, A.kind, A.cost, B.rn
        from cte as A
        join cte as B on A.kind = B.kind and A.rn <= B.rn
    ) as X
    group by X.kind, X.rn
)
select M.kind, count(N.*)
from sub as M
-- 5. count only the goods that fit in our budget (i.e. 50)
left outer join sub as N on M.rn = N.rn and N.cost <= 50
group by M.kind;
Output (db-fiddle):
+-------------+-----+
|kind |count|
+-------------+-----+
|local |3 |
|international|0 |
+-------------+-----+
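The same greedy idea (take the cheapest items first and count how many stay within the budget) can also be written with a running-total window function instead of the self-join. A sketch, assuming the T table created above and a budget of 50 (count(*) FILTER requires PostgreSQL 9.4 or later):
select kind,
       count(*) filter (where running_cost <= 50) as count
from (
    -- cumulative cost per kind, cheapest items first
    select kind,
           sum(cost) over (partition by kind order by cost, id) as running_cost
    from T
) s
group by kind;
With the sample data this also returns local = 3 and international = 0, since every kind keeps at least one row in the subquery.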
I made a CTE example to solve the problem:
I recreated your case with:
create table kp (item_id int, sourced_from varchar, cost int);
insert into kp values (1,'local',15);
insert into kp values (2,'local',10);
insert into kp values (3,'local',20);
insert into kp values (4,'international',60);
The non-recursive term of the following query:
selects from kp only the items with cost less than 50
adds the item_id to list_of_items
The recursive term:
joins with kp, checking that sourced_from is the same and that kp.item_id is not already contained in list_of_items (avoiding adding the same item multiple times)
computes the total cost (total_cost)
appends the new item's item_id to list_of_items
WITH RECURSIVE items (item_id, next_item_id, sourced_from, total_cost, nr_items, list_of_items) AS (
    SELECT
        item_id,
        item_id as next_item_id,
        sourced_from,
        cost as total_cost,
        1 as nr_items,
        ARRAY[item_id] as list_of_items
    FROM kp
    WHERE cost < 50
    UNION ALL
    SELECT
        kp.item_id,
        items.item_id as next_item_id,
        items.sourced_from,
        items.total_cost + kp.cost as total_cost,
        items.nr_items + 1 as nr_items,
        items.list_of_items || kp.item_id as list_of_items
    FROM kp
    JOIN items
      ON items.sourced_from = kp.sourced_from
     AND NOT (items.list_of_items @> ARRAY[kp.item_id])
    WHERE kp.cost + items.total_cost < 50
)
SELECT * FROM items;
If you run it against the above dataset you'll end up with the following detailed result:
item_id | next_item_id | sourced_from | total_cost | nr_items | list_of_items
---------+--------------+--------------+------------+----------+---------------
1 | 1 | local | 15 | 1 | {1}
2 | 2 | local | 10 | 1 | {2}
3 | 3 | local | 20 | 1 | {3}
1 | 2 | local | 25 | 2 | {2,1}
1 | 3 | local | 35 | 2 | {3,1}
2 | 1 | local | 25 | 2 | {1,2}
2 | 3 | local | 30 | 2 | {3,2}
3 | 1 | local | 35 | 2 | {1,3}
3 | 2 | local | 30 | 2 | {2,3}
1 | 2 | local | 45 | 3 | {3,2,1}
1 | 3 | local | 45 | 3 | {2,3,1}
2 | 1 | local | 45 | 3 | {3,1,2}
2 | 3 | local | 45 | 3 | {1,3,2}
3 | 1 | local | 45 | 3 | {2,1,3}
3 | 2 | local | 45 | 3 | {1,2,3}
(15 rows)
which shows all the permutations of the 3 local items.
Now if you substitute the last SELECT section with
SELECT * FROM items order by nr_items desc, total_cost desc, list_of_items asc limit 1;
you'll also be able to pick the combination having the maximum number of items with the cost closest to the budget (I also added an ascending ordering on list_of_items to always receive the same result in case of multiple combinations), which in the case above would result in
item_id | next_item_id | sourced_from | total_cost | nr_items | list_of_items
---------+--------------+--------------+------------+----------+---------------
3 | 2 | local | 45 | 3 | {1,2,3}
(1 row)
If you are just interested in the maximum by sourced_from then the last SELECT becomes
select sourced_from, max(nr_items) nr_items from items group by sourced_from;
with the expected result being
sourced_from | nr_items
--------------+----------
local | 3
(1 row)
Edit: to speed up the query and avoid generating multiple permutations of the same set of items (e.g. {2,1,3} and {1,2,3}), we can force the next item_id to be greater than the current one. Full query:
WITH RECURSIVE items (item_id, next_item_id, sourced_from, total_cost, nr_items, list_of_items) AS (
    SELECT
        item_id,
        item_id as next_item_id,
        sourced_from,
        cost as total_cost,
        1 as nr_items,
        ARRAY[item_id] as list_of_items
    FROM kp
    WHERE cost < 50
    UNION ALL
    SELECT
        kp.item_id,
        items.item_id as next_item_id,
        items.sourced_from,
        items.total_cost + kp.cost as total_cost,
        items.nr_items + 1 as nr_items,
        items.list_of_items || kp.item_id as list_of_items
    FROM kp
    JOIN items
      ON items.sourced_from = kp.sourced_from
     AND NOT (items.list_of_items @> ARRAY[kp.item_id])
     AND items.item_id < kp.item_id
    WHERE kp.cost + items.total_cost < 50
)
SELECT * FROM items;
Result:
item_id | next_item_id | sourced_from | total_cost | nr_items | list_of_items
---------+--------------+--------------+------------+----------+---------------
1 | 1 | local | 15 | 1 | {1}
2 | 2 | local | 10 | 1 | {2}
3 | 3 | local | 20 | 1 | {3}
2 | 1 | local | 25 | 2 | {1,2}
3 | 1 | local | 35 | 2 | {1,3}
3 | 2 | local | 30 | 2 | {2,3}
3 | 2 | local | 45 | 3 | {1,2,3}
(7 rows)
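The question's expected output also lists International with 0. Since the final aggregate above only returns sources that yield at least one affordable item, a sketch of one way to include the missing source, assuming the kp table from above, is to replace the last SELECT of the recursive query with a left join from the distinct sources:
SELECT src.sourced_from, coalesce(max(items.nr_items), 0) AS nr_items
FROM (SELECT DISTINCT sourced_from FROM kp) src
LEFT JOIN items ON items.sourced_from = src.sourced_from
GROUP BY src.sourced_from;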

Select row by id and its nearest rows sorted by some value. PostgreSQL

I have chapters table like this:
id | title | sort_number | book_id
1 | 'Chap 1' | 3 | 1
5 | 'Chap 2' | 6 | 1
8 | 'About ' | 1 | 1
9 | 'Chap 3' | 9 | 1
10 | 'Attack' | 1 | 2
Id is unique; sort_number is unique within the same book (book_id).
1) How can I load all data (3 rows) for 3 chapters (current, next and prev), sorted by sort_number, if I have only the current chapter id?
2) How can I load the current chapter data (1 row) and only the ids of the next and prev chapters, if they exist?
This can be done using window functions
select id, title, sort_number, book_id,
lag(id) over w as prev_chapter,
lead(id) over w as next_chapter
from chapters
window w as (partition by book_id order by sort_number);
With your sample data that returns:
id | title | sort_number | book_id | prev_chapter | next_chapter
---+--------+-------------+---------+--------------+-------------
8 | About | 1 | 1 | | 1
1 | Chap 1 | 3 | 1 | 8 | 5
5 | Chap 2 | 6 | 1 | 1 | 9
9 | Chap 3 | 9 | 1 | 5 |
10 | Attack | 1 | 2 | |
The above query can now be used to answer both your questions:
1)
select id, title, sort_number, book_id
from (
select id, title, sort_number, book_id,
--first_value(id) over w as first_chapter,
lag(id) over w as prev_chapter_id,
lead(id) over w as next_chapter_id
from chapters
window w as (partition by book_id order by sort_number)
) t
where 1 in (id, prev_chapter_id, next_chapter_id)
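With the sample data and current chapter id 1, this returns the rows with id 8, 1 and 5 (the previous, current and next chapters of book 1). The row order is not guaranteed without an explicit sort, so a sketch with the sort added, if you need them in reading order:
select id, title, sort_number, book_id
from (
    select id, title, sort_number, book_id,
           lag(id) over w as prev_chapter_id,
           lead(id) over w as next_chapter_id
    from chapters
    window w as (partition by book_id order by sort_number)
) t
where 1 in (id, prev_chapter_id, next_chapter_id)
order by sort_number;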
2)
select *
from (
select id, title, sort_number, book_id,
lag(id) over w as prev_chapter_id,
lead(id) over w as next_chapter_id
from chapters
window w as (partition by book_id order by sort_number)
) t
where id = 1

How to use join with an aggregate function in PostgreSQL?

I have 4 tables
Table1
id | name
1 | A
2 | B
Table2
id | name1
1 | C
2 | D
Table3
id | name2
1 | E
2 | F
Table4
id | name1_id | name2_id | name3_id
1 | 1 | 2 | 1
2 | 2 | 2 | 2
3 | 1 | 2 | 1
4 | 2 | 1 | 1
5 | 1 | 1 | 2
6 | 2 | 2 | 1
7 | 1 | 1 | 2
8 | 2 | 1 | 1
9 | 1 | 2 | 1
10 | 2 | 2 | 1
Now I want to join all the tables with Table4 and get this type of output:
name | count
{A,B} | {5, 5}
{C,D} | {5, 6}
{E,F} | {7, 3}
I tried this
select array_agg(distinct(t1.name)), array_agg(distinct(temp.test))
from
(select t4.name1_id, (count(t4.name1_id)) "test"
from table4 t4 group by t4.name1_id
) temp
join table1 t1
on temp.name1_id = t1.id
I am trying to achieve this. Can anybody help me?
Calculate the counts for every table separately and union the results:
select
array_agg(name order by name) as name,
array_agg(count order by name) as count
from (
select 1 as t, name, count(*)
from table4
join table1 t1 on t1.id = name1_id
group by name
union all
select 2 as t, name, count(*)
from table4
join table2 t2 on t2.id = name2_id
group by name
union all
select 3 as t, name, count(*)
from table4
join table3 t3 on t3.id = name3_id
group by name
) s
group by t;
name | count
-------+-------
{A,B} | {5,5}
{C,D} | {4,6}
{E,F} | {7,3}
(3 rows)
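For reference, with the sample data the inner union subquery s produces these rows before the outer aggregation (the t column marks which of the three joins each row came from, and grouping by it keeps each pair of arrays on its own output row):
 t | name | count
---+------+-------
 1 | A    |     5
 1 | B    |     5
 2 | C    |     4
 2 | D    |     6
 3 | E    |     7
 3 | F    |     3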

How to join a bigger table T1 to a smaller T2 with random repeating rows from T2

Sorry if this has been asked already in some way; I've searched and couldn't find this specific solution, so I would appreciate an answer or a pointer to the right place...
I have two tables of different (varying) length. Toy example:
T1:
SELECT * FROM T1;
gid | call_s1
-----+---------
1 | 1
3 | 1
4 | 1
7 | 1
8 | 1
(5 rows)
and
SELECT * FROM T2;
gid | dt_ping
-----+---------------------
1 | 2009-06-06 19:00:00
2 | 2009-06-06 19:00:15
3 | 2009-06-06 19:00:30
4 | 2009-06-06 19:00:45
(4 rows)
I would like to get a result T3 that assigns a random row from T2.dt_ping to each row in T1, repeating if necessary. For instance, a possible result would be:
gid | call_s1 | dt_ping
-----+----------+---------
1 | 1 | 2009-06-06 19:00:45
3 | 1 | 2009-06-06 19:00:30
4 | 1 | 2009-06-06 19:00:15
7 | 1 | 2009-06-06 19:00:00
8 | 1 | 2009-06-06 19:00:45
I have tried offset, order by random(), etc. Either I get a Cartesian product or NULLs. For instance, this is my last attempt:
SELECT
T1.gid
, T1.call_s1
, T2.dt_ping
FROM
(
SELECT
gid
, call_s1
, ceiling( random() * (SELECT count(*)::int as n FROM fake_called_small ) ) tgid
FROM fake_called_small
) T1
LEFT OUTER JOIN
fake_times_small T2
ON
T2.gid = T1.tgid;
and one of the results I get:
gid | call_s1 | dt_ping
-----+---------+---------------------
1 | 1 |
3 | 1 | 2009-06-06 19:00:30
4 | 1 | 2009-06-06 19:00:45
7 | 1 | 2009-06-06 19:00:15
8 | 1 |
(5 rows)
I know I'm missing something simple, but what?
By the way, I tried this:
SELECT
T1.gid
, T1.call_s1
, (SELECT dt_ping FROM fake_times_small T2 ORDER BY random() LIMIT 1)
FROM fake_called_small T1;
and got:
gid | call_s1 | dt_ping
-----+---------+---------------------
1 | 1 | 2009-06-06 19:00:30
3 | 1 | 2009-06-06 19:00:30
4 | 1 | 2009-06-06 19:00:30
7 | 1 | 2009-06-06 19:00:30
8 | 1 | 2009-06-06 19:00:30
(5 rows)
The same dt_ping row is repeated for every row because the subselect is evaluated only once.
If you are not concerned with performance, the following query should do what you want:
SELECT gid, call_s1, dt_ping
FROM (
SELECT t1.gid, call_s1, dt_ping,
ROW_NUMBER() OVER (PARTITION BY t1.gid, call_s1 ORDER BY RANDOM()) AS rn
FROM t1
CROSS JOIN t2
) x
WHERE rn = 1;
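For completeness: the NULLs in the first attempt above come from scaling random() by the row count of fake_called_small (5 rows) instead of fake_times_small (4 rows), so a generated tgid of 5 finds no match. A minimal sketch of that fix, assuming (as in the toy data) that fake_times_small.gid is the contiguous range 1..count(*); floor(...) + 1 is used rather than ceiling(...) so that random() = 0 cannot yield a non-matching 0:
SELECT t1.gid, t1.call_s1, t2.dt_ping
FROM (
    SELECT gid, call_s1,
           -- pick a random gid between 1 and the number of rows in fake_times_small
           (floor(random() * (SELECT count(*) FROM fake_times_small)) + 1)::int AS tgid
    FROM fake_called_small
) t1
JOIN fake_times_small t2 ON t2.gid = t1.tgid;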

1st and 7th row in grouping

I have this table named Samples. The Date column values are just symbolic date values.
+----+------------+-------+------+
| Id | Product_Id | Price | Date |
+----+------------+-------+------+
| 1 | 1 | 100 | 1 |
| 2 | 2 | 100 | 2 |
| 3 | 3 | 100 | 3 |
| 4 | 1 | 100 | 4 |
| 5 | 2 | 100 | 5 |
| 6 | 3 | 100 | 6 |
...
+----+------------+-------+------+
I want to group by Product_Id such that I have the 1st sample in descending date order and a new column added with the Price of the 7th sample row in each product group. If the 7th row does not exist, the value should be NULL.
Example:
+----+------------+-------+------+----------+
| Id | Product_Id | Price | Date | 7thPrice |
+----+------------+-------+------+----------+
| 4 | 1 | 100 | 4 | 120 |
| 5 | 2 | 100 | 5 | 100 |
| 6 | 3 | 100 | 6 | NULL |
+----+------------+-------+------+----------+
I believe I can achieve the table without the '7thPrice' column with the following:
SELECT * FROM (
SELECT ROW_NUMBER() OVER (PARTITION BY Product_Id ORDER BY date DESC) r, * FROM Samples
) T WHERE T.r = 1
Any suggestions?
You can try something like this. I used your query to create a CTE, then joined rank 1 to rank 7.
with sampleCTE as
(
    select row_number() over (partition by Product_Id order by Date desc) r, *
    from Samples
)
select *
from
    (select * from sampleCTE where r = 1) a
    left join
    (select * from sampleCTE where r = 7) b
    on a.Product_Id = b.Product_Id;
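Alternatively, a sketch without the self-join, assuming the Samples table from the question: lead(Price, 6) in descending Date order looks 6 rows ahead within the same product, i.e. at the 7th-newest sample, and yields NULL when that row does not exist.
select Id, Product_Id, Price, Date, "7thPrice"
from (
    select Id, Product_Id, Price, Date,
           row_number() over (partition by Product_Id order by Date desc) as r,
           lead(Price, 6) over (partition by Product_Id order by Date desc) as "7thPrice"
    from Samples
) t
where r = 1;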