Delete rows where data has less sum - postgresql

I have the following table:
data_id sum_value
xxx 30
xxx 40
ccc 50
aaa 60
ccc 70
aaa 80
ddd 100
eee 200
How would I delete the rows that share a data_id with another row but have the smaller sum_value? If a data_id appears only once, the row should be kept as it is.
Expected Output
data_id sum_value
xxx 40
ccc 70
aaa 80
ddd 100
eee 200
thank you

delete from foo
using(select min(sum_value) sum
,data_id
from foo
group by data_id
having count(data_id)>1
)t
where foo.sum_value=t.sum
and foo.data_id=t.data_id

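Try this (the_table stands for your table name):
delete from the_table t
where t.sum_value = (select min(m.sum_value) from the_table m where m.data_id = t.data_id)
and (select count(*) from the_table c where c.data_id = t.data_id) > 1;
select * from the_table;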

Assuming you only want to keep the records with the largest sum (for every id), and delete the rest:
-- the data
CREATE TABLE ztable
( data_id CHAR(3) NOT NULL
, sum_value INTEGER NOT NULL
);
INSERT INTO ztable(data_id, sum_value) VALUES
('xxx',30)
,('xxx',40)
,('ccc',50)
,('aaa',60)
,('ccc',70)
,('aaa',80)
;
-- Delete the non-largest per id:
DELETE FROM ztable del -- del cannot be the record with the largest sum
WHERE EXISTS ( -- if a record exists
SELECT * FROM ztable x
WHERE x.data_id = del.data_id -- with the same ID
AND x.sum_value > del.sum_value -- ... but a larger sum
);
Result:
CREATE TABLE
INSERT 0 6
DELETE 3
data_id | sum_value
---------+-----------
xxx | 40
ccc | 70
aaa | 80
(3 rows)
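As a quick check that the same EXISTS delete also leaves single-occurrence ids such as ddd and eee alone, here is a minimal sketch that runs it on all eight rows from the question. It uses Python's built-in sqlite3 module as a stand-in for PostgreSQL (an assumption made only to keep the example self-contained), and the names QUESTION_ROWS and delete_non_largest are hypothetical. The outer table is referenced by its name instead of the del alias to keep the statement portable.

```python
import sqlite3

# The eight rows from the question (the answer above only loads six of them).
QUESTION_ROWS = [
    ("xxx", 30), ("xxx", 40), ("ccc", 50), ("aaa", 60),
    ("ccc", 70), ("aaa", 80), ("ddd", 100), ("eee", 200),
]

def delete_non_largest(rows):
    """Load rows into an in-memory table, delete every row that has a
    same-id row with a larger sum_value, and return what is left."""
    conn = sqlite3.connect(":memory:")  # SQLite stand-in, not PostgreSQL
    conn.execute(
        "CREATE TABLE ztable (data_id TEXT NOT NULL, sum_value INTEGER NOT NULL)"
    )
    conn.executemany("INSERT INTO ztable VALUES (?, ?)", rows)
    conn.execute(
        """
        DELETE FROM ztable
        WHERE EXISTS (
            SELECT 1 FROM ztable x
            WHERE x.data_id = ztable.data_id      -- same id
              AND x.sum_value > ztable.sum_value  -- but a larger sum
        )
        """
    )
    remaining = conn.execute(
        "SELECT data_id, sum_value FROM ztable ORDER BY sum_value"
    ).fetchall()
    conn.close()
    return remaining

if __name__ == "__main__":
    for row in delete_non_largest(QUESTION_ROWS):
        print(row)
```

Rows with a unique data_id never satisfy the EXISTS condition, so they survive without any special handling.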

You can do this:
with src as (
SELECT DISTINCT on (data_id)
data_id as di, sum_value as sv
FROM Table1
ORDER BY data_id, sum_value DESC
)
DELETE FROM table1
WHERE ( data_id, sum_value) NOT IN (SELECT di, sv FROM src);
to leave only the row with the highest sum_value, or this:
with src as (
SELECT DISTINCT on (data_id)
data_id as di, sum_value as sv
FROM Table1
ORDER BY data_id, sum_value ASC
)
DELETE FROM table1
WHERE ( data_id, sum_value) IN (SELECT di, sv FROM src);
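to remove only the row with the smallest sum_value. Note that this second variant also deletes a row whose data_id occurs only once (ddd and eee in your data), because a single row is also the smallest row of its group.
For the six duplicated rows both queries give the same result. You didn't mention what you expect when more than two rows exist with the same data_id.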
see the fiddle: http://sqlfiddle.com/#!15/92a04/1
Hmm, SQLFiddle behaves strangely when you modify data in the right pane. When I run both statements together, it reports 0 rows affected by the first one, but the second one then shows only 3 rows left. If I run them separately, it shows that no rows were deleted and all rows still exist. I guess it returns the table in the state defined in the left pane after each run, so it probably runs the right pane in a transaction. When I added a COMMIT; statement it threw an error:
Explicit commits are not allowed within the query panel.
Try these queries on your local DB. This is how it looks on my local Postgres 9.4:
testdb=# select * from Table1;
data_id | sum_value
---------+-----------
xxx | 30
xxx | 40
ccc | 50
aaa | 60
ccc | 70
aaa | 80
(6 rows)
testdb=# with src as (
testdb(# SELECT DISTINCT on (data_id)
testdb(# data_id as di, sum_value as sv
testdb(# FROM Table1
testdb(# ORDER BY data_id, sum_value ASC
testdb(# )
testdb-# DELETE FROM table1
testdb-# WHERE ( data_id, sum_value) IN (SELECT di, sv FROM src);
DELETE 3
testdb=# select * from Table1;
data_id | sum_value
---------+-----------
xxx | 40
ccc | 70
aaa | 80
(3 rows)
After re-initializing the table with all eight rows:
testdb=# select * from Table1;
data_id | sum_value
---------+-----------
xxx | 30
xxx | 40
ccc | 50
aaa | 60
ccc | 70
aaa | 80
ddd | 100
eee | 200
(8 rows)
testdb=# with src as (
testdb(# SELECT DISTINCT on (data_id)
testdb(# data_id as di, sum_value as sv
testdb(# FROM Table1
testdb(# ORDER BY data_id, sum_value DESC
testdb(# )
testdb-# DELETE FROM table1
testdb-# WHERE ( data_id, sum_value) NOT IN (SELECT di, sv FROM src);
DELETE 3
testdb=# SELECT * from table1;
data_id | sum_value
---------+-----------
xxx | 40
ccc | 70
aaa | 80
ddd | 100
eee | 200
(5 rows)
works for me...
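The answer above leaves open what happens with more than two rows per data_id, and with ids that occur only once. The following minimal sketch shows how the two variants diverge in those cases. It runs on Python's built-in sqlite3 module as a stand-in for PostgreSQL, and because SQLite has no DISTINCT ON, the per-id highest and smallest rows are picked with GROUP BY and max()/min() instead (a swapped-in technique, not the answer's exact query). The sample rows and the helper name run are made up for illustration.

```python
import sqlite3

# Hypothetical data: xxx has three rows, ddd has only one.
ROWS = [("xxx", 30), ("xxx", 40), ("xxx", 45),
        ("ccc", 50), ("ccc", 70), ("ddd", 100)]

# Variant 1: keep only the highest row per data_id.
KEEP_HIGHEST = """
    DELETE FROM table1
    WHERE (data_id, sum_value) NOT IN (
        SELECT data_id, max(sum_value) FROM table1 GROUP BY data_id)
"""

# Variant 2: remove only the smallest row per data_id.
REMOVE_SMALLEST = """
    DELETE FROM table1
    WHERE (data_id, sum_value) IN (
        SELECT data_id, min(sum_value) FROM table1 GROUP BY data_id)
"""

def run(delete_sql):
    """Apply one DELETE variant to a fresh copy of ROWS and return the rest."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE table1 (data_id TEXT NOT NULL, sum_value INTEGER NOT NULL)"
    )
    conn.executemany("INSERT INTO table1 VALUES (?, ?)", ROWS)
    conn.execute(delete_sql)
    remaining = conn.execute(
        "SELECT data_id, sum_value FROM table1 ORDER BY data_id, sum_value"
    ).fetchall()
    conn.close()
    return remaining

if __name__ == "__main__":
    print("keep highest:   ", run(KEEP_HIGHEST))
    print("remove smallest:", run(REMOVE_SMALLEST))
```

With this data the first variant leaves one row per id, while the second keeps two xxx rows and drops the lone ddd row, so the choice between them depends on which behavior is wanted.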

delete from the_table
USING the_table as vtable
WHERE the_table.data_id = vtable.data_id
AND the_table.sum_value < vtable.sum_value
try this

Related

Typeorm order by after distinct on with postgresql

I have a table below:
id product_id price
1 1 100
2 1 150
3 2 120
4 2 190
5 3 100
6 3 80
I want to select the cheapest price for each product and sort the results by price.
Expected output:
id product_id price
6 3 80
1 1 100
3 2 120
What I have tried so far:
`
repository.createQueryBuilder('products')
.orderBy('products.id')
.distinctOn(['products.id'])
.addOrderBy('price')
`
This query returns the cheapest products but does not sort them, so addOrderBy has no effect on the result. Is there a way to sort the products after distinctOn?
SELECT id,
product_id,
price
FROM (SELECT id,
product_id,
price,
Dense_rank()
OVER (
partition BY product_id
ORDER BY price ASC) dr
FROM product) inline_view
WHERE dr = 1
ORDER BY price ASC;
Setup:
postgres=# create table product(id int, product_id int, price int);
CREATE TABLE
postgres=# insert into product values (1,1,100),(2,1,150),(3,2,120),(4,2,190),(5,3,100),(6,3,80);
INSERT 0 6
Output
id | product_id | price
----+------------+-------
6 | 3 | 80
1 | 1 | 100
3 | 2 | 120
(3 rows)

results mismatched when retrieved dates from column of type character varying

I have two tables. I want to get the min and max date stored in table1's cfrange column, which is of type character varying.
table1 and table2 are mapped by sid. I want the max and min dates only for the rows whose sid also appears in table2.
table1:
sid cfrange
100 3390
101 8000
102 5/11/2010
103 11/12/2016
104 01/03/2016
105 4000
106 4000
107 03/12/2017
108 03/11/2016
109 4/04/2018
110 10/12/2016
table2:
sid description
102 success
103 success
104 Proceeding
107 success
108 success
I tried the query below, but it's not giving the correct min and max values. Please advise.
select max(t1.cfrange),min(t1.cfrange) from table1 t1,table2 t2 where t1.sid=t2.sid;
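Because cfrange is character varying, max and min compare the values as strings. You should join the two tables and cast cfrange to a date, and cross your fingers. (You may have to parse it with an explicit format first, for example to_date(cfrange, 'DD/MM/YYYY'), because a plain cast depends on the session's DateStyle setting.)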
create table table1 (sid int, cfrange varchar(30));
insert into table1 values
(100, '3390'),
(101, '8000'),
(102, '5/11/2010'),
(103, '11/12/2016'),
(104, '01/03/2016'),
(105, '4000'),
(106, '4000'),
(107, '03/12/2017'),
(108, '03/11/2016'),
(109, '4/04/2018'),
(110, '10/12/2016');
create table table2 (sid int, description varchar(30));
insert into table2 values
(102, 'success'),
(103, 'success'),
(104, 'Proceeding'),
(107, 'success'),
(108, 'success');
select 'Min' as caption, min(cfrange) as value
from (select table1.sid, table1.cfrange::date
from table1
inner join table2
on table1.sid = table2.sid) tt
UNION ALL
select 'Max' as caption, max(cfrange) as value
from (select table1.sid, table1.cfrange::date
from table1
inner join table2
on table1.sid = table2.sid) tt;
caption | value
:------ | :---------
Min | 2010-11-05
Max | 2017-12-03
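To see why the original query went wrong, here is a small sketch in Python that compares the five joined cfrange values first as strings and then as parsed dates. It assumes the values are in day/month/year order, which is what the result above implies (5/11/2010 is read as 2010-11-05); the variable names are made up for illustration.

```python
from datetime import datetime

# cfrange values of the rows that also exist in table2 (sid 102, 103, 104, 107, 108).
cfranges = ["5/11/2010", "11/12/2016", "01/03/2016", "03/12/2017", "03/11/2016"]

# Comparing as text orders the values character by character,
# which is what max()/min() on a character varying column does.
text_min, text_max = min(cfranges), max(cfranges)

# Parsing with an explicit format (assumed here: day/month/year)
# lets the values be compared as real dates.
dates = [datetime.strptime(value, "%d/%m/%Y").date() for value in cfranges]
date_min, date_max = min(dates), max(dates)

if __name__ == "__main__":
    print("as text: ", text_min, text_max)
    print("as dates:", date_min, date_max)
```

The text comparison picks values by their leading characters, so it has little to do with chronological order, while the parsed dates reproduce the Min and Max shown in the result table.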

Combining three very similar queries? (Postgres)

So I have three queries. I'm trying to combine them all into one query. Here they are with their outputs:
Query 1:
SELECT distinct on (name) name, count(distinct board_id)
FROM tablea
INNER JOIN table_b on tablea.id = table_b.id
GROUP BY name
ORDER BY name ASC
Output:
A | 15
B | 26
C | 24
D | 11
E | 31
F | 32
G | 16
Query 2:
SELECT distinct on (name) name, count(board_id) as total
FROM tablea
INNER JOIN table_b on tablea.id = table_b.id
GROUP BY 1, board_id
ORDER BY name, total DESC
Output:
A | 435
B | 246
C | 611
D | 121
E | 436
F | 723
G | 293
Finally, the last query:
SELECT distinct on (name) name, count(board_id) as total
FROM tablea
INNER JOIN table_b on tablea.id = table_b.id
GROUP BY 1
ORDER BY name, total DESC
Output:
A | 14667
B | 65123
C | 87426
D | 55198
E | 80612
F | 31485
G | 43392
Is it possible to format it to be like this:
A | 15 | 435 | 14667
B | 26 | 246 | 65123
C | 24 | 611 | 87426
D | 11 | 121 | 55198
E | 31 | 436 | 80612
F | 32 | 723 | 31485
G | 16 | 293 | 43392
EDIT:
With @Clodoaldo Neto's help, I combined the first and the third queries into this:
SELECT name, count(distinct board_id), count(board_id) as total
FROM tablea
INNER JOIN table_b on tablea.id = table_b.id
GROUP BY 1
ORDER BY name ASC
The only thing preventing me from combining the second query with this new one is the GROUP BY clause needing board_id to be in it. Any thoughts from here?
This is hard to get right without test data. But here is my try:
with s as (
select name, grouping(name, board_id) as grp,
count(distinct board_id) as dist_total,
count(*) as name_total,
count(*) as name_board_total
from
tablea
inner join
table_b on tablea.id = table_b.id
group by grouping sets ((name), (name, board_id))
)
select name, dist_total, name_board_total, name_total
from
(
select name, dist_total, name_total
from s
where grp = 1
) r
inner join
(
select name, max(name_board_total) as name_board_total
from s
where grp = 0
group by name
) q using (name)
order by name
https://www.postgresql.org/docs/current/static/queries-table-expressions.html#QUERIES-GROUPING-SETS
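Since there is no test data, it may help to spell out what the three target columns mean per name: the number of distinct boards, the largest row count of any single board, and the total row count. Here is a minimal sketch in Python that computes them from joined (name, board_id) pairs. The sample rows and the function name summarize are hypothetical and only serve to check a query's output against a small, hand-countable case.

```python
from collections import Counter, defaultdict

def summarize(rows):
    """For each name return (name, distinct boards, largest per-board count, total rows)."""
    per_board = defaultdict(Counter)
    for name, board_id in rows:
        per_board[name][board_id] += 1  # rows per (name, board_id)
    return [
        (name, len(counts), max(counts.values()), sum(counts.values()))
        for name, counts in sorted(per_board.items())
    ]

if __name__ == "__main__":
    # Made-up joined rows: (name, board_id)
    sample = [("A", 1), ("A", 1), ("A", 2),
              ("B", 3), ("B", 3), ("B", 3), ("B", 4), ("B", 5)]
    for line in summarize(sample):
        print(line)
```

The per-(name, board_id) counts correspond to the (name, board_id) grouping set in the query, and the totals correspond to the (name) grouping set.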

T-SQL query to remove duplicates from large tables using join

I am new to T-SQL queries and I have been trying different solutions to remove duplicate rows from a fairly large table (with over 270,000 rows).
The table looks something like:
TableA
-----------
RowID int not null identity(1,1) primary key,
Col1 varchar(50) not null,
Col2 int not null,
Col3 varchar(50) not null
The rows for this table are not perfect duplicates because of the existence of the RowID identity field.
The second table that I need to join with:
TableB
-----------
RowID int not null identity(1,1) primary key,
Col1 int not null,
Col2 varchar(50) not null
In TableA I have something like:
1 | gray | 4 | Angela
2 | red | 6 | Diana
3 | black| 6 | Alina
4 | black| 11 | Dana
5 | gray | 4 | Angela
6 | red | 12 | Dana
7 | red | 6 | Diana
8 | black| 11 | Dana
And in TableB:
1 | 6 | klm
2 | 11 | lmi
The second column of TableB (Col1) is referenced as a foreign key by TableA (Col2).
I need to remove ONLY the duplicates from TableA that have Col2 = 6, ignoring the other duplicates. The result should be:
1 | gray | 4 | Angela
2 | red | 6 | Diana
4 | black| 6 | Alina
5 | black| 11 | Dana
6 | gray | 4 | Angela
7 | red | 12 | Dana
8 | black| 11 | Dana
I tried using
DELETE FROM TableA a inner join TableB b on a.Col2=b.Col1
WHERE a.RowId NOT IN (SELECT MIN(RowId) FROM TableA GROUP BY RowId, Col1, Col2, Col3) and b.Col2="klm"
but I still get some of the duplicates that I need to remove.
What is the best way to remove not perfect duplicate rows using join?
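MIN returns only one row per group, but because you also group by RowID (the primary key), every row forms its own group and nothing is filtered out.
The RowIDs in your expected output are also wrong. Your query, for reference: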
DELETE FROM TableA a
inner join TableB b
on a.Col2=b.Col1
WHERE a.RowId NOT IN (SELECT MIN(RowId)
FROM TableA GROUP BY RowId, Col1, Col2, Col3)
and b.Col2="klm"
This selects the rows to delete:
select *
from
( select *
, row_number() over (partition by Col1, Col3 order by RowID) as rn
from TableA a
where a.Col2 = 6
) tt
where tt.rn > 1
Another solution is:
WITH CTE AS(
SELECT t.[Col1], t.[Col2], t.[Col3],
RN = ROW_NUMBER() OVER (PARTITION BY t.[Col1], t.[Col2], t.[Col3] ORDER BY t.[RowID])
FROM [TableA] t
WHERE t.[Col2] = 6
)
DELETE FROM CTE WHERE RN > 1
regards.
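For comparison, the NOT IN approach from the question also works once RowID is dropped from the GROUP BY and the filter on Col2 = 6 is applied directly. The sketch below checks this on the sample rows from the question. It uses Python's built-in sqlite3 module as a stand-in for SQL Server (an assumption made to keep it self-contained), it does not join TableB, and the helper name dedupe_col2 is hypothetical.

```python
import sqlite3

# Sample rows from the question: (RowID, Col1, Col2, Col3)
TABLE_A = [
    (1, "gray", 4, "Angela"), (2, "red", 6, "Diana"),
    (3, "black", 6, "Alina"), (4, "black", 11, "Dana"),
    (5, "gray", 4, "Angela"), (6, "red", 12, "Dana"),
    (7, "red", 6, "Diana"),   (8, "black", 11, "Dana"),
]

def dedupe_col2(rows, col2_value):
    """Delete duplicates (same Col1, Col2, Col3) only where Col2 = col2_value,
    keeping the row with the smallest RowID; return the remaining RowIDs."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE TableA (RowID INTEGER PRIMARY KEY, "
        "Col1 TEXT NOT NULL, Col2 INTEGER NOT NULL, Col3 TEXT NOT NULL)"
    )
    conn.executemany("INSERT INTO TableA VALUES (?, ?, ?, ?)", rows)
    conn.execute(
        """
        DELETE FROM TableA
        WHERE Col2 = ?
          AND RowID NOT IN (
              SELECT MIN(RowID) FROM TableA
              WHERE Col2 = ?
              GROUP BY Col1, Col2, Col3   -- no RowID here
          )
        """,
        (col2_value, col2_value),
    )
    remaining = [r[0] for r in conn.execute("SELECT RowID FROM TableA ORDER BY RowID")]
    conn.close()
    return remaining

if __name__ == "__main__":
    print(dedupe_col2(TABLE_A, 6))
```

Only the second (red, 6, Diana) row is removed; the duplicates with Col2 = 4 and Col2 = 11 are left untouched.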

Renumbering a column in postgresql based on sorted values in that column

Edit: I am using postgresql v8.3
I have a table that contains a column we can call column A.
Column A is populated, for our purposes, with arbitrary positive integers.
I want to renumber column A from 1 to N based on ordering the records of the table by column A ascending. (SELECT * FROM table ORDER BY A ASC;)
Is there a simple way to accomplish this without the need of building a postgresql function?
Example:
(Before:
A: 3,10,20,100,487,1,6)
(After:
A: 2,4,5,6,7,1,3)
Use the rank() (or dense_rank()) window functions (available since PostgreSQL 8.4):
create table aaa
( id serial not null primary key
, num integer not null
, rnk integer not null default 0
);
insert into aaa(num) values( 3) , (10) , (20) , (100) , (487) , (1) , (6)
;
UPDATE aaa
SET rnk = w.rnk
FROM (
SELECT id
, rank() OVER (order by num ASC) AS rnk
FROM aaa
) w
WHERE w.id = aaa.id;
SELECT * FROM aaa
ORDER BY id
;
Results:
CREATE TABLE
INSERT 0 7
UPDATE 7
id | num | rnk
----+-----+-----
1 | 3 | 2
2 | 10 | 4
3 | 20 | 5
4 | 100 | 6
5 | 487 | 7
6 | 1 | 1
7 | 6 | 3
(7 rows)
If window functions are not available (as on 8.3), you could still count, for every row, the number of rows whose value is less than or equal to its own:
UPDATE aaa
SET rnk = w.rnk
FROM ( SELECT a0.id AS id
, COUNT(*) AS rnk
FROM aaa a0
JOIN aaa a1 ON a1.num <= a0.num
GROUP BY a0.id
) w
WHERE w.id = aaa.id;
SELECT * FROM aaa
ORDER BY id
;
Or the same with a scalar subquery:
UPDATE aaa a0
SET rnk =
( SELECT COUNT(*)
FROM aaa a1
WHERE a1.num <= a0.num
)
;
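The scalar-subquery variant can be tried out without a PostgreSQL server. Below is a minimal sketch that runs it on the example values through Python's built-in sqlite3 module (a stand-in chosen only for convenience; the helper name renumber is hypothetical). The outer table is referenced by name instead of the a0 alias to keep the statement portable.

```python
import sqlite3

def renumber(values):
    """Insert values in order, set rnk to the number of rows with num <= own num,
    and return rnk for each row in insertion order."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE aaa (id INTEGER PRIMARY KEY, "
        "num INTEGER NOT NULL, rnk INTEGER NOT NULL DEFAULT 0)"
    )
    conn.executemany("INSERT INTO aaa(num) VALUES (?)", [(v,) for v in values])
    conn.execute(
        """
        UPDATE aaa
        SET rnk = (SELECT COUNT(*) FROM aaa a1 WHERE a1.num <= aaa.num)
        """
    )
    ranks = [r[0] for r in conn.execute("SELECT rnk FROM aaa ORDER BY id")]
    conn.close()
    return ranks

if __name__ == "__main__":
    print(renumber([3, 10, 20, 100, 487, 1, 6]))
```

For distinct values this matches the rank() result shown above. With duplicate values the counting approach assigns tied rows the highest position of the tie, whereas rank() assigns the lowest, so the two differ when column A is not unique.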