TypeORM ORDER BY after DISTINCT ON with PostgreSQL - postgresql

I have a table below:
id | product_id | price
----+------------+-------
1 | 1 | 100
2 | 1 | 150
3 | 2 | 120
4 | 2 | 190
5 | 3 | 100
6 | 3 | 80
I want to select the cheapest row for each product and sort the results by price.
Expected output:
id | product_id | price
----+------------+-------
6 | 3 | 80
1 | 1 | 100
3 | 2 | 120
What I have tried so far:
`
repository.createQueryBuilder('products')
.orderBy('products.id')
.distinctOn(['products.id'])
.addOrderBy('price')
`
This query returns the cheapest products but does not sort them, so addOrderBy has no effect on the result. Is there a way to sort the products after distinctOn?

SELECT id,
       product_id,
       price
FROM   (SELECT id,
               product_id,
               price,
               DENSE_RANK()
                 OVER (
                   PARTITION BY product_id
                   ORDER BY price ASC) dr
        FROM   product) inline_view
WHERE  dr = 1
ORDER  BY price ASC;
Setup:
postgres=# create table product(id int, product_id int, price int);
CREATE TABLE
postgres=# insert into product values (1,1,100),(2,1,150),(3,2,120),(4,2,190),(5,3,100),(6,3,80);
INSERT 0 6
Output
id | product_id | price
----+------------+-------
6 | 3 | 80
1 | 1 | 100
3 | 2 | 120
(3 rows)
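The reason the ordering has to happen in an outer step is that PostgreSQL requires the leading ORDER BY expressions of a DISTINCT ON query to match the DISTINCT ON expressions, so the "pick one row per product" step and the "sort by price" step cannot share one ORDER BY. A minimal Python sketch of those two steps, using the sample rows from the question (the function name cheapest_per_product is made up for illustration):

```python
from itertools import groupby

# (id, product_id, price), the sample rows from the question
rows = [(1, 1, 100), (2, 1, 150), (3, 2, 120),
        (4, 2, 190), (5, 3, 100), (6, 3, 80)]

def cheapest_per_product(rows):
    # Step 1: like DISTINCT ON (product_id) ... ORDER BY product_id, price:
    # sort by product, then price, and keep the first row of each product.
    ordered = sorted(rows, key=lambda r: (r[1], r[2]))
    firsts = [next(group) for _, group in groupby(ordered, key=lambda r: r[1])]
    # Step 2: the outer ORDER BY price, applied to the reduced set.
    return sorted(firsts, key=lambda r: r[2])

print(cheapest_per_product(rows))
# [(6, 3, 80), (1, 1, 100), (3, 2, 120)]
```

In SQL, the same shape is an inner query that reduces to one row per product and an outer query that only sorts.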

Related

How to drop rows if a variable is less than x, in SQL

I have the following query code
query = """
with double_entry_book as (
SELECT to_address as address, value as value
FROM `bigquery-public-data.crypto_ethereum.traces`
WHERE to_address is not null
AND block_timestamp < '2022-01-01 00:00:00'
AND status = 1
AND (call_type not in ('delegatecall', 'callcode', 'staticcall') or call_type is null)
union all
-- credits
SELECT from_address as address, -value as value
FROM `bigquery-public-data.crypto_ethereum.traces`
WHERE from_address is not null
AND block_timestamp < '2022-01-01 00:00:00'
AND status = 1
AND (call_type not in ('delegatecall', 'callcode', 'staticcall') or call_type is null)
)
SELECT address,
sum(value) / 1000000000000000000 as balance
from double_entry_book
group by address
order by balance desc
LIMIT 15000000
"""
In the last part, I want to drop rows where "balance" is less than, say, 0.02, and then group, order, and so on. I imagine this should be simple. Any help will be appreciated!
We can delete in a CTE and use RETURNING to get the ids of the rows being deleted. The main query of the same statement still sees those rows, because every part of the statement runs against the same snapshot; they are gone once the statement has completed.
CREATE TABLE t (
id serial,
variale int);
insert into t (variale) values
(1),(2),(3),(4),(5);
5 rows affected
with del as
(delete from t
where variale < 3
returning id)
select
t.id,
t.variale,
del.id ids_being_deleted
from t
left join del
on t.id = del.id;
id | variale | ids_being_deleted
-: | ------: | ----------------:
1 | 1 | 1
2 | 2 | 2
3 | 3 | null
4 | 4 | null
5 | 5 | null
select * from t;
id | variale
-: | ------:
3 | 3
4 | 4
5 | 5
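If the goal is only to leave low balances out of the result, rather than to delete stored rows, a HAVING clause on the aggregate is usually enough. The sketch below runs that idea in Python's sqlite3 module as a stand-in for the real engine; the table contents, the integer units, and the MIN_BALANCE threshold are assumptions made up for illustration.

```python
import sqlite3

# Hypothetical stand-in for the double_entry_book CTE; values are integer units.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE double_entry_book (address TEXT, value INTEGER)")
conn.executemany(
    "INSERT INTO double_entry_book VALUES (?, ?)",
    [("0xaaa", 150), ("0xaaa", -50), ("0xbbb", 10), ("0xccc", 30), ("0xccc", -5)],
)

MIN_BALANCE = 20  # assumed threshold, in the same units as value

# HAVING filters on the aggregated balance, after GROUP BY and before ORDER BY.
balances = conn.execute(
    """
    SELECT address, SUM(value) AS balance
    FROM double_entry_book
    GROUP BY address
    HAVING SUM(value) >= ?
    ORDER BY balance DESC
    """,
    (MIN_BALANCE,),
).fetchall()

print(balances)  # [('0xaaa', 100), ('0xccc', 25)]
```

In the query from the question, the corresponding change would presumably be a HAVING condition on sum(value) / 1000000000000000000 placed between GROUP BY and ORDER BY.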

Difference of top two values while GROUP BY

Suppose I have the following SQL Table:
id | score
------------
1 | 4433
1 | 678
1 | 1230
1 | 414
5 | 8899
5 | 123
6 | 2345
6 | 567
6 | 2323
I want to group by id and replace the score column with the absolute difference between the two highest scores for each id.
For example, the response for the above query should be:
id | score
------------
1 | 3203
5 | 8776
6 | 22
How can I perform this query in PostgreSQL?
Using ROW_NUMBER along with pivoting logic we can try:
WITH cte AS (
    SELECT *, ROW_NUMBER() OVER (PARTITION BY id ORDER BY score DESC) rn
    FROM yourTable
)
SELECT id,
       ABS(MAX(score) FILTER (WHERE rn = 1) -
           MAX(score) FILTER (WHERE rn = 2)) AS score
FROM cte
GROUP BY id;
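To check what the query computes, here is a small Python sketch of the same logic on the sample data: number the scores within each id from highest to lowest, then subtract the second from the first. The function name top_two_gap is made up for illustration; an id with only one score yields None, mirroring the NULL the SQL would return.

```python
from collections import defaultdict

# (id, score), the sample rows from the question
scores = [(1, 4433), (1, 678), (1, 1230), (1, 414),
          (5, 8899), (5, 123),
          (6, 2345), (6, 567), (6, 2323)]

def top_two_gap(rows):
    by_id = defaultdict(list)
    for id_, score in rows:
        by_id[id_].append(score)
    result = {}
    for id_, values in by_id.items():
        # rn = 1 and rn = 2 in the SQL are the first two after a descending sort
        top = sorted(values, reverse=True)[:2]
        result[id_] = abs(top[0] - top[1]) if len(top) == 2 else None
    return result

print(top_two_gap(scores))  # {1: 3203, 5: 8776, 6: 22}
```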

PostgreSQL materialized view with global and partitioned ranks

I have a table with 10 million user ratings.
I need to create a materialized view that holds the global rank and the rank by country and is refreshed once a day.
I came up with the following select query:
SELECT row_number() OVER(ORDER BY value DESC, id) AS rank_global,
row_number() OVER(PARTITION BY country ORDER BY value DESC, id) AS rank_country,
*
FROM rate
ORDER BY value DESC, id
LIMIT 100000
Is there a way to speed up this query or maybe there is another way to do the same? I created btree (value desc, id) and (country, value desc, id) indexes but it still takes a lot of time to complete.
Example:
Creating a table and populating it with users with random value column and random country:
CREATE TABLE rate
(
id serial NOT NULL,
name text,
value integer NOT NULL DEFAULT 0,
country character varying,
CONSTRAINT rate_pkey PRIMARY KEY (id)
);
INSERT INTO rate
(SELECT n, ('user_'||n), (random()*30)::int, ('country_'||(random()*3)::int)
FROM generate_series(0,10) AS n);
CREATE INDEX rate_country_value_id_index
ON rate
USING btree(country, value DESC, id);
CREATE INDEX rate_value_id_index
ON rate
USING btree(value DESC, id);
Table contents:
id name value country
0 user_0 28 country_2
1 user_1 24 country_2
2 user_2 29 country_1
3 user_3 11 country_1
4 user_4 16 country_1
5 user_5 28 country_0
6 user_6 3 country_1
7 user_7 7 country_1
8 user_8 28 country_1
9 user_9 4 country_0
10 user_10 29 country_1
Then I create materialized view:
CREATE MATERIALIZED VIEW rate_view AS
SELECT row_number() OVER (ORDER BY value DESC, id) AS rgl,
row_number() OVER (PARTITION BY country ORDER BY value DESC, id) AS rc,
*
FROM rate
ORDER BY value DESC, id;
View contents (rgl - global rank, rc - rank by country):
rgl rc id name value country
1 1 2 user_2 29 country_1
2 2 10 user_10 29 country_1
3 1 0 user_0 28 country_2
4 1 5 user_5 28 country_0
5 3 8 user_8 28 country_1
6 2 1 user_1 24 country_2
7 4 4 user_4 16 country_1
8 5 3 user_3 11 country_1
9 6 7 user_7 7 country_1
10 2 9 user_9 4 country_0
11 7 6 user_6 3 country_1
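As a cross-check of what the two row_number() calls produce, the following Python sketch computes both ranks in a single pass over rows sorted by value DESC, id. It uses the eleven sample rows from the table contents; the function name rank_rows is made up for illustration, and this says nothing about how PostgreSQL actually executes the window functions.

```python
from collections import defaultdict

# (id, name, value, country), the sample rows from the rate table
rate = [
    (0, "user_0", 28, "country_2"), (1, "user_1", 24, "country_2"),
    (2, "user_2", 29, "country_1"), (3, "user_3", 11, "country_1"),
    (4, "user_4", 16, "country_1"), (5, "user_5", 28, "country_0"),
    (6, "user_6", 3, "country_1"), (7, "user_7", 7, "country_1"),
    (8, "user_8", 28, "country_1"), (9, "user_9", 4, "country_0"),
    (10, "user_10", 29, "country_1"),
]

def rank_rows(rows):
    # ORDER BY value DESC, id
    ordered = sorted(rows, key=lambda r: (-r[2], r[0]))
    seen_in_country = defaultdict(int)
    ranked = []
    for rgl, (id_, name, value, country) in enumerate(ordered, start=1):
        # Because rows arrive in global order, a per-country counter
        # gives the rank within PARTITION BY country.
        seen_in_country[country] += 1
        ranked.append((rgl, seen_in_country[country], id_, name, value, country))
    return ranked

for row in rank_rows(rate)[:3]:
    print(row)
# (1, 1, 2, 'user_2', 29, 'country_1')
# (2, 2, 10, 'user_10', 29, 'country_1')
# (3, 1, 0, 'user_0', 28, 'country_2')
```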
Now I can write more complex queries that select the user closest to a given value together with its neighbours by rank, both globally and by country.
For example, (after creating (value,id) and (rgl) indexes on view) here is global top 50 and 5 users closest by rank to value 9999942:
(
WITH closest_rank AS
(
SELECT rgl FROM rate_view
WHERE value <= 9999942
ORDER BY value DESC, id ASC
LIMIT 1
)
SELECT rgl, name, value
FROM rate_view
WHERE rgl > (SELECT rgl-3 FROM closest_rank )
ORDER BY rgl ASC
LIMIT 5
)
UNION
SELECT rgl, name, value
FROM rate_view
WHERE rgl <=50
ORDER BY rgl;
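The query above combines two selections: a window of five rows starting two ranks before the closest match, and the global top 50. A rough Python sketch of that logic follows; the ranked list is a small hypothetical sample of (rgl, name, value) rows and top_and_neighbours is a made-up name.

```python
# Hypothetical sample: (rgl, name, value), already ordered by global rank.
ranked = [
    (1, "user_2", 29), (2, "user_10", 29), (3, "user_0", 28),
    (4, "user_5", 28), (5, "user_8", 28), (6, "user_1", 24),
    (7, "user_4", 16), (8, "user_3", 11), (9, "user_7", 7),
    (10, "user_9", 4), (11, "user_6", 3),
]

def top_and_neighbours(ranked, target_value, top_n=50, window=5):
    # closest_rank: first row in rank order whose value is <= target_value
    closest = next((rgl for rgl, _, value in ranked if value <= target_value), None)
    near = []
    if closest is not None:
        # WHERE rgl > closest - 3 ORDER BY rgl LIMIT window
        near = [row for row in ranked if row[0] > closest - 3][:window]
    top = [row for row in ranked if row[0] <= top_n]
    # UNION removes duplicates; the final ORDER BY rgl sorts the combined set
    return sorted(set(near) | set(top))

print([row[0] for row in top_and_neighbours(ranked, 20, top_n=2)])
# [1, 2, 5, 6, 7, 8, 9]
```

With a target of 20 the closest row is rank 7, so the window covers ranks 5 to 9, and the top 2 are added on top of that.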

Renumbering a column in postgresql based on sorted values in that column

Edit: I am using postgresql v8.3
I have a table that contains a column we can call column A.
Column A is populated, for our purposes, with arbitrary positive integers.
I want to renumber column A from 1 to N based on ordering the records of the table by column A ascending. (SELECT * FROM table ORDER BY A ASC;)
Is there a simple way to accomplish this without the need of building a postgresql function?
Example:
(Before:
A: 3,10,20,100,487,1,6)
(After:
A: 2,4,5,6,7,1,3)
Use the rank() (or dense_rank()) window functions (available since PostgreSQL 8.4):
create table aaa
( id serial not null primary key
, num integer not null
, rnk integer not null default 0
);
insert into aaa(num) values( 3) , (10) , (20) , (100) , (487) , (1) , (6)
;
UPDATE aaa
SET rnk = w.rnk
FROM (
SELECT id
, rank() OVER (order by num ASC) AS rnk
FROM aaa
) w
WHERE w.id = aaa.id;
SELECT * FROM aaa
ORDER BY id
;
Results:
CREATE TABLE
INSERT 0 7
UPDATE 7
id | num | rnk
----+-----+-----
1 | 3 | 2
2 | 10 | 4
3 | 20 | 5
4 | 100 | 6
5 | 487 | 7
6 | 1 | 1
7 | 6 | 3
(7 rows)
If window functions are not available, you can still count, for each row, the rows whose value is less than or equal to its own:
UPDATE aaa
SET rnk = w.rnk
FROM ( SELECT a0.id AS id
, COUNT(*) AS rnk
FROM aaa a0
JOIN aaa a1 ON a1.num <= a0.num
GROUP BY a0.id
) w
WHERE w.id = aaa.id;
SELECT * FROM aaa
ORDER BY id
;
Or the same with a scalar subquery:
UPDATE aaa a0
SET rnk =
( SELECT COUNT(*)
FROM aaa a1
WHERE a1.num <= a0.num
)
;
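The counting approach does not depend on window functions, so it can be tried on almost any SQL engine. The sketch below runs the scalar-subquery variant in Python's sqlite3 module as a stand-in for an old PostgreSQL; the table and values follow the example above, and the SQLite column types are an assumption of this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE aaa ("
    " id INTEGER PRIMARY KEY,"
    " num INTEGER NOT NULL,"
    " rnk INTEGER NOT NULL DEFAULT 0)"
)
conn.executemany(
    "INSERT INTO aaa (num) VALUES (?)",
    [(3,), (10,), (20,), (100,), (487,), (1,), (6,)],
)

# For each row, count the rows whose num is <= this row's num.
conn.execute(
    """
    UPDATE aaa
    SET rnk = (SELECT COUNT(*) FROM aaa AS a1 WHERE a1.num <= aaa.num)
    """
)

renumbered = conn.execute("SELECT num, rnk FROM aaa ORDER BY id").fetchall()
print(renumbered)
# [(3, 2), (10, 4), (20, 5), (100, 6), (487, 7), (1, 1), (6, 3)]
```

One difference worth knowing: if num contains duplicates, the counting version gives tied rows the highest position of the tie (1, 1, 2 becomes 2, 2, 3), while rank() gives them the lowest (1, 1, 3).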

Filter rows based on two fields, where one of them contains a selection criterion

Given the following table
group | weight | category_id | category_name_plus
1 10 100 Ab
1 20 101 Bcd
1 30 100 Efghij
2 10 101 Bcd
2 20 101 Cdef
2 30 100 Defgh
2 40 100 Ab
3 10 102 Fghijkl
3 20 101 Ab
The "weight" is unique for each group and is also an indicator for the order of records inside the group.
What I want is to retrieve one record per group filtered by category_id, but only the record having the highest "weight" inside its "group".
Example for filtering by category_id = 100:
group | weight | category_id | category_name_plus
1 30 100 Efghij
2 40 100 Ab
Example for filtering by category_id = 101:
group | weight | category_id | category_name_plus
1 20 101 Bcd
2 20 101 Cdef
3 20 101 Ab
How can I select just these rows?
I tried fiddling with UNIQUE, MAX(category_id) etc. but I'm still unable to get the correct results. The main problem for me is to get the category_name_plus value here.
I am working with PostgreSQL 9.4(beta 3), because I also need various other niceties like "WITH ORDINALITY" etc.
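The rank window function should do the trick, as long as the category filter is applied before ranking (filtering afterwards would drop every group whose heaviest row belongs to another category):
SELECT "group", weight, category_id, category_name_plus
FROM (SELECT "group", weight, category_id, category_name_plus,
             RANK() OVER (PARTITION BY "group"
                          ORDER BY weight DESC) AS rk
      FROM my_table
      WHERE category_id = 101) t
WHERE rk = 1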
Note:
"group" is a reserved word in SQL, so it has to be surrounded by quotes in order to be used as a column name. It would probably be better, though, to replace it with a non-reserved word, such as "group_id".
Try something like:
SELECT DISTINCT ON ("group") *
FROM your_table
WHERE category_id = 101
ORDER BY "group", weight DESC
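The expected output in the question implies an order of operations: keep only the rows of the requested category, then pick the highest weight inside each group. A small Python sketch of that order, using the sample rows from the question (top_per_group is a made-up name):

```python
# (group, weight, category_id, category_name_plus), the sample rows
rows = [
    (1, 10, 100, "Ab"), (1, 20, 101, "Bcd"), (1, 30, 100, "Efghij"),
    (2, 10, 101, "Bcd"), (2, 20, 101, "Cdef"), (2, 30, 100, "Defgh"),
    (2, 40, 100, "Ab"),
    (3, 10, 102, "Fghijkl"), (3, 20, 101, "Ab"),
]

def top_per_group(rows, category_id):
    best = {}
    for row in rows:
        group, weight, category, _ = row
        if category != category_id:
            continue  # filter by category first
        # then keep the row with the highest weight seen so far in this group
        if group not in best or weight > best[group][1]:
            best[group] = row
    return [best[group] for group in sorted(best)]

print(top_per_group(rows, 100))
# [(1, 30, 100, 'Efghij'), (2, 40, 100, 'Ab')]
print(top_per_group(rows, 101))
# [(1, 20, 101, 'Bcd'), (2, 20, 101, 'Cdef'), (3, 20, 101, 'Ab')]
```

Since weight is unique within a group, there are no ties to break.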