Postgresql left join without duplicates


DB structure:
CREATE TABLE page
(
id serial primary key,
title VARCHAR(40) not null
);
CREATE TABLE page_rating
(
id serial primary key,
page_id INTEGER,
rating_type INTEGER,
rating INTEGER
);
CREATE TABLE user_history
(
id serial primary key,
page_id INTEGER
);
Data:
INSERT INTO page (id,title) VALUES(1,'Page #1');
INSERT INTO page (id,title) VALUES(2,'Page #2');
INSERT INTO page (id,title) VALUES(3,'Page #3');
INSERT INTO page (id,title) VALUES(4,'Page #4');
INSERT INTO page (id,title) VALUES(5,'Page #5');
INSERT INTO page_rating VALUES (1,1,60,100);
INSERT INTO page_rating VALUES (2,1,99,140);
INSERT INTO page_rating VALUES (3,1,58,120);
INSERT INTO page_rating VALUES (4,1,70,110);
INSERT INTO page_rating VALUES (5,2,60,50);
INSERT INTO page_rating VALUES (6,2,99,60);
INSERT INTO page_rating VALUES (7,2,58,90);
INSERT INTO page_rating VALUES (8,2,70,140);
Purpose: select unique values of rating_type for rows of table "page", sorted by "page_rating.rating", and exclude pages that appear in user_history from the result.
My query:
SELECT DISTINCT ON(pr.rating_type) p.*,pr.rating,pr.rating_type FROM page as p
LEFT JOIN page_rating as pr ON p.id = pr.page_id
LEFT JOIN user_history uh ON uh.page_id = p.id
WHERE
pr.rating_type IN (60, 99, 58, 45, 73, 97, 55, 59, 70, 43, 74, 97, 64, 71, 46)
AND uh.page_id IS NULL
ORDER BY pr.rating_type,pr.rating DESC
Result:
ID TITLE RATING RATING_TYPE
1 "Page #1" 120 58
1 "Page #1" 100 60
2 "Page #2" 140 70
1 "Page #1" 140 99
The result contains duplicate pages. Ideal result:
ID TITLE RATING RATING_TYPE
1 "Page #1" 120 58
2 "Page #2" 50 60
Thx for help!

You almost certainly need a UNIQUE constraint on {page_id, rating_type} in the table "page_rating". You're also missing every necessary foreign key constraint. The primary key on "user_history" is suspicious, too.
Purpose - select unique values for rating_type in a table "page"
sorted by "page_rating.rating".
You can select distinct values for rating_type without referring to any other tables. And you should, at first. Let's look at the data.
select page_id, rating_type, rating
from page_rating
order by page_id, rating_type;
page_id rating_type rating
--
1 58 120 *
1 60 100
1 70 110
1 99 140
2 58 90
2 60 50 *
2 70 140
2 99 60
You seem to want one row per page_id. Those rows are marked with an asterisk in the table above. How can we get those two rows?
Those rows have different values for rating_type, so we can't just use rating_type in the WHERE clause. The values in rating are not consistently the max or the min for their rating_type (120 is the max for rating_type 58, but 50 is the min for rating_type 60), so we can't use GROUP BY with max() or min(). And we can't use GROUP BY with any other aggregate function, because you want the unaggregated value of "rating" for an arbitrary value of "rating_type".
So, based on what you've told us, the only way to get the result set you want is to specify rating_type and page_id in the WHERE clause.
select page_id, rating_type, rating
from page_rating
where (page_id = 1 and rating_type = 58)
or (page_id = 2 and rating_type = 60)
order by page_id, rating_type;
page_id rating_type rating
--
1 58 120
2 60 50
I'm not going to follow through with the joins, because I'm 100% confident that you don't really want to do this.
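For reference, the query in the question returns the same page more than once because DISTINCT ON (rating_type) with ORDER BY rating_type, rating DESC keeps the highest-rated row for each rating_type, regardless of page. The following is only a rough Python emulation of that behaviour on the sample data (the variable names are made up for illustration, and the user_history filter is skipped because that table is empty):

```python
# Sample rows from page_rating: (page_id, rating_type, rating)
page_rating = [
    (1, 60, 100), (1, 99, 140), (1, 58, 120), (1, 70, 110),
    (2, 60, 50), (2, 99, 60), (2, 58, 90), (2, 70, 140),
]

# ORDER BY rating_type, rating DESC
ordered = sorted(page_rating, key=lambda r: (r[1], -r[2]))

# DISTINCT ON (rating_type): keep only the first row of each rating_type
first_per_type = {}
for page_id, rating_type, rating in ordered:
    first_per_type.setdefault(rating_type, (page_id, rating))

for rating_type, (page_id, rating) in first_per_type.items():
    print(page_id, rating, rating_type)
```

Page 1 wins three of the four rating types, which matches the "duplicates" shown in the question.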

Related

Unique serial value for two distinct columns in postgres

I have two tables:
table_A and table_B
In table_A, I have unique constraint on following columns:
col_1
col_2
A unique id, col_id (the foreign key used by table_B), is generated explicitly from a sequence as follows:
FEATURESET_ID_SEQ = Sequence('featureset_id_seq')
col_id = Column(Integer, FEATURESET_ID_SEQ, primary_key=True, nullable=False, server_default=FEATURESET_ID_SEQ.next_value())
Now I want to generate a sequential id in table_B for the following scenario:
for each col_id generated by the sequence above (one per unique col_1 and col_2 pair in table_A), say 100,
and each unique value of col_3 in table_B (for example abcd_123),
I want a sequence number in column id of table_B that restarts for every col_id.
Basically, I want the following:
col_id col_3 id
100 + abcd_123 --> 1
100 + abcd_124 --> 2
100 + abcd_125 --> 3
100 + abcd_126 --> 4
101 + abcd_127 --> 1
101 + abcd_128 --> 2
102 + abcd_129 --> 1
102 + abcd_130 --> 2
102 + abcd_131 --> 3
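The numbering asked for here is a counter that restarts for every col_id. As a minimal sketch of that logic only (plain Python, not SQLAlchemy; the list and variable names are hypothetical), it could look like this:

```python
from collections import defaultdict

# (col_id, col_3) pairs in insertion order, taken from the example above
pairs = [
    (100, "abcd_123"), (100, "abcd_124"), (100, "abcd_125"), (100, "abcd_126"),
    (101, "abcd_127"), (101, "abcd_128"),
    (102, "abcd_129"), (102, "abcd_130"), (102, "abcd_131"),
]

# One independent counter per col_id
counters = defaultdict(int)
numbered = []
for col_id, col_3 in pairs:
    counters[col_id] += 1
    numbered.append((col_id, col_3, counters[col_id]))

for row in numbered:
    print(row)
```

In the database itself, the same result is usually computed with a window function such as row_number() partitioned by col_id, rather than with a single shared sequence.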

Enforcing a unique relationship over multiple columns where one column is nullable

Given the table
ID PERSON_ID PLAN EMPLOYER_ID TERMINATION_DATE
1 123 ABC 321 2020-01-01
2 123 DEF 321 (null)
3 123 ABC 321 (null)
4 123 ABC 321 (null)
I want to exclude the 4th entry. (The 3rd entry shows the person was re-hired and therefore is a new relationship. I'm only showing relevant fields)
My first attempt was to simply create a unique index over PERSON_ID / PLAN / EMPLOYER_ID / TERMINATION_DATE, thinking that DB2 for IBMi considered nulls equal in a unique index. I was evidently wrong...
Is there a way to enforce uniqueness over these columns, or,
is there a better way to approach the value of termination date? (null is not technically correct; I'm thinking of it as more true/false, but the business logic needs a date)
Edit
According to the docs for 7.3:
UNIQUE
Prevents the table from containing two or more rows with the same value of the index key. When UNIQUE is used, all null values for a column are considered equal. For example, if the key is a single column that can contain null values, that column can contain only one null value. The constraint is enforced when rows of the table are updated or new rows are inserted.
The constraint is also checked during the execution of the CREATE INDEX statement. If the table already contains rows with duplicate key values, the index is not created.
UNIQUE WHERE NOT NULL
Prevents the table from containing two or more rows with the same value of the index key, where all null values for a column are not considered equal. Multiple null values in a column are allowed. Otherwise, this is identical to UNIQUE.
So, the behavior I'm seeing looks more like UNIQUE WHERE NOT NULL. When I generate SQL for this table, I see
ADD CONSTRAINT TERMEMPPLANSSN
UNIQUE( TERMINATION_DATE , EMPLOYERID , PLAN_CODE , SSN ) ;
(note this is showing the real field names, not the ones I used in my example)
Edit 2
Bottom line, Constraint !== Index. When I went back and created an actual index, I got the desired behavior.

CREATE TABLE PERSON
(
ID INT NOT NULL
, PERSON_ID INT NOT NULL
, PLAN CHAR(3) NOT NULL
, EMPLOYER_ID INT
, TERMINATION_DATE DATE
);
INSERT INTO PERSON (ID, PERSON_ID, PLAN, EMPLOYER_ID, TERMINATION_DATE)
VALUES
(1, 123, 'ABC', 321, DATE('2020-01-01'))
, (2, 123, 'DEF', 321, CAST(NULL AS DATE))
, (3, 123, 'ABC', 321, CAST(NULL AS DATE))
WITH NC;
--- To not allow: ---
INSERT INTO PERSON (ID, PERSON_ID, PLAN, EMPLOYER_ID, TERMINATION_DATE) VALUES
(4, 123, 'ABC', 321, CAST(NULL AS DATE))
or
(4, 123, 'ABC', 321, DATE('2020-01-01'))
You may:
CREATE UNIQUE INDEX PERSON_U1 ON PERSON
(PERSON_ID, PLAN, EMPLOYER_ID, TERMINATION_DATE);
--- To not allow: ---
INSERT INTO PERSON (ID, PERSON_ID, PLAN, EMPLOYER_ID, TERMINATION_DATE) VALUES
(4, 123, 'ABC', 321, DATE('2020-01-01'))
but allow multiple:
(X, 123, 'ABC', 321, CAST(NULL AS DATE))
(Y, 123, 'ABC', 321, CAST(NULL AS DATE))
...
You may:
CREATE UNIQUE WHERE NOT NULL INDEX PERSON_U2 ON PERSON
(PERSON_ID, PLAN, EMPLOYER_ID, TERMINATION_DATE);
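To make the difference between the two index types concrete, here is a small Python model of the two rules as the quoted documentation describes them. It is only an illustration of the semantics, not Db2 code; the function name and the nulls_equal flag are made up for this sketch, and None stands in for NULL.

```python
def conflicts(existing_keys, new_key, nulls_equal):
    """Return True if new_key would violate the unique index.

    nulls_equal=True  models UNIQUE (all nulls considered equal).
    nulls_equal=False models UNIQUE WHERE NOT NULL (a key containing
    a null never matches another key).
    """
    if not nulls_equal and None in new_key:
        return False
    return new_key in existing_keys


# Keys are (PERSON_ID, PLAN, EMPLOYER_ID, TERMINATION_DATE) for rows 1 to 3
existing = [
    (123, "ABC", 321, "2020-01-01"),
    (123, "DEF", 321, None),
    (123, "ABC", 321, None),
]

rehired_again = (123, "ABC", 321, None)
same_termination = (123, "ABC", 321, "2020-01-01")

print(conflicts(existing, rehired_again, nulls_equal=True))      # rejected
print(conflicts(existing, rehired_again, nulls_equal=False))     # allowed
print(conflicts(existing, same_termination, nulls_equal=False))  # rejected
```

Under this model only the plain UNIQUE index rejects the fourth row from the question.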

Update Multiple Rows with Specific Values and Others With Zero

How do I efficiently update multiple rows with particular values for a_id 84, and then all other rows set to 0 for that same a_id?
products
p_id a_id best
111 81 99
222 81 99
666 82 99
222 83 99
111 84 99
222 84 99
333 84 99
111 85 99
222 85 99
Right now I'm doing this:
update products as u set
best = u2.best
from (values
(111, 84, 1),
(222, 84, 2)
) as u2(p_id, a_id, best)
where u2.p_id = u.p_id AND u2.a_id = u.a_id
RETURNING u2.p_id, u2.a_id, u2.best
But this only updates the rows within values as expected. How do I also update rows not in values to be 0 with a_id = 84?
Meaning the p_id of 333 should have best = 0. I could explicitly include every single p_id but the table is huge.
The values set into best will always be in order from 1 to n, defined by the order of values.
The products table has 1 million rows

Assuming (p_id, a_id) is the PK - or at least UNIQUE and NOT NULL, this is one way:
UPDATE products AS u
SET best = COALESCE(u2.best, 0)
FROM products AS p
LEFT JOIN ( VALUES
(111, 84, 1),
(222, 84, 2)
) AS u2(p_id, a_id, best) USING (p_id, a_id)
WHERE u.a_id = 84
AND u.a_id = p.a_id
AND u.p_id = p.p_id
RETURNING u2.p_id, u2.a_id, u2.best;
The difficulty is that the FROM list of an UPDATE is the equivalent of an INNER JOIN, while you need an OUTER JOIN. This workaround adds the table products to the FROM list (which is normally redundant), to act as left table for the LEFT OUTER JOIN. Then the INNER JOIN from products to products works.
To restrict the update to a_id = 84, add another WHERE condition saying so. That makes a_id = 84 redundant in the VALUES expression, but keep it there so that rows are not joined only to be filtered out later, which is cheaper.
If you don't have a PK or any other (combination of) UNIQUE NOT NULL columns, you can fall back to the system column ctid for joining products rows. Example:
Numbering rows consecutively for a number of tables
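The intended effect (listed rows get their new value, every other row with a_id = 84 gets 0, all other rows stay untouched) can be checked on the sample data with a small script. This sketch uses Python's sqlite3 module instead of PostgreSQL, and it swaps the self-join for a correlated subquery wrapped in COALESCE, so it shows the same outcome rather than the exact statement above. The staging table new_best is an assumption made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (p_id INTEGER, a_id INTEGER, best INTEGER,
                           PRIMARY KEY (p_id, a_id));
    INSERT INTO products VALUES
        (111, 81, 99), (222, 81, 99), (666, 82, 99), (222, 83, 99),
        (111, 84, 99), (222, 84, 99), (333, 84, 99),
        (111, 85, 99), (222, 85, 99);
    -- hypothetical staging table playing the role of the VALUES list
    CREATE TABLE new_best (p_id INTEGER, a_id INTEGER, best INTEGER);
    INSERT INTO new_best VALUES (111, 84, 1), (222, 84, 2);
""")

# Rows without a match in new_best get NULL from the subquery,
# which COALESCE turns into 0.
conn.execute("""
    UPDATE products
    SET best = COALESCE(
        (SELECT n.best FROM new_best AS n
         WHERE n.p_id = products.p_id AND n.a_id = products.a_id), 0)
    WHERE a_id = 84
""")

rows_84 = conn.execute(
    "SELECT p_id, best FROM products WHERE a_id = 84 ORDER BY p_id"
).fetchall()
untouched = conn.execute(
    "SELECT COUNT(*) FROM products WHERE a_id <> 84 AND best = 99"
).fetchone()[0]
print(rows_84, untouched)
```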

Remove the condition u2.a_id = u.a_id from the WHERE clause and move it into the assignment as a CASE expression:
update products as u set
best = case when u2.a_id = u.a_id then u2.best else 0 end
from (values
(111, 84, 1),
(222, 84, 2)
) as u2(p_id, a_id, best)
where u2.p_id = u.p_id

Running Totals with debit credit and previous row SQL Server 2012

I am having problems in recalculating the running totals.
I have a situation where duplicate transactions must be deleted, and the initial and closing balances must then be recalculated from the amount, taking IsDebit into account.
My attempt used nested cursors (parent-child): the parent selects all the distinct BookingNo values and the child does the calculation. It looks very messy and it didn't work, so I didn't post it because I didn't want to confuse things.
I know in SQL Server 2012 you can use (sum over partition by) but I cannot figure how to do it to handle the deleted row etc..
Below is what I did so far
--Create Table for testing
IF object_id(N'TestTransaction', 'U') IS NOT NULL DROP TABLE TestTransaction
GO
CREATE TABLE [TestTransaction]
(
[Id] [bigint] IDENTITY(1,1) NOT NULL,
[BookingNo] [bigint] NOT NULL,
[IsDebit] [bit] NOT NULL,
[Amount] [decimal](18, 2) NOT NULL,
[InitialBalance] [decimal](18, 2) NOT NULL,
[ClosingBalance] [decimal](18, 2) NOT NULL
) ON [PRIMARY]
GO
INSERT [TestTransaction] ([BookingNo], [IsDebit], [Amount], [InitialBalance], [ClosingBalance])
SELECT 200, 0, 100, 2000,2100 UNION ALL
SELECT 200, 0, 100, 2100,2200 UNION ALL
SELECT 200, 1, 150, 2150,2000 UNION ALL
SELECT 200, 0, 300, 2000,2300 UNION ALL
SELECT 200, 0, 400, 2300,2700 UNION ALL
SELECT 200, 0, 250, 2700,2950 UNION ALL
SELECT 200, 0, 250, 2950,3200
--- end of setup
IF OBJECT_ID('tempdb..#tmpTransToDelete') IS NOT NULL DROP TABLE #tmpTransToDelete
GO
CREATE TABLE #tmpTransToDelete
( BoookingNo bigint,
Isdebit bit,
amount decimal(18,2),
InitialBalance decimal(18,2),
ClosingBalance decimal(18,2)
)
DECLARE @RunningInitialBalance decimal(18,2), @RunningClosingBalance decimal(18,2)
INSERT #tmpTransToDelete(BoookingNo,Isdebit,amount,InitialBalance,ClosingBalance)
SELECT BookingNo,Isdebit,amount,InitialBalance,ClosingBalance
FROM TestTransaction
WHERE ID IN (1,6)
--Delete all duplicate transaction (just to prove the point)
DELETE TestTransaction WHERE ID IN (1,6)
-- now taking into account the deleted rows recalculate the lot and update the table.
Any help? Suggestions?
edited
Results should be
Id BookingNo IsDebit Amount InitialBalance ClosingBalance
2 200 0 100.00 2000.00 2000.00
3 200 1 150.00 2000.00 2150.00
4 200 0 300.00 2150.00 2450.00
5 200 0 400.00 2450.00 2850.00
7 200 0 250.00 2600.00 2850.00

The RunningTotal approach in my previous response would work if there were transactional data that accounted for the initial balance. But, since that evidently isn't the case, I would say you can't delete any rows without also applying the relative difference to all subsequent rows as part of the same transaction. Moreover, I'm convinced your initial sample data is wrong, which only exacerbates the confusion. It seems to me it should be as follows:
SELECT 200, 0, 100, 2000,2100 UNION ALL
SELECT 200, 0, 100, 2100,2200 UNION ALL
SELECT 200, 1, 150, 2200,2050 UNION ALL
SELECT 200, 0, 300, 2050,2350 UNION ALL
SELECT 200, 0, 400, 2350,2750 UNION ALL
SELECT 200, 0, 250, 2750,3000 UNION ALL
SELECT 200, 0, 250, 3000,3250
With that rectified, here's how I'd write the delete-and-update transaction:
BEGIN TRAN
DECLARE @tbd TABLE (
Id bigint
,BookingNo bigint
,Amount decimal(18,2)
);
DELETE FROM TestTransaction
OUTPUT deleted.Id
, deleted.BookingNo
, deleted.Amount * IIF(deleted.IsDebit = 0, 1, -1) AS Amount
INTO @tbd
WHERE ID IN (1,6);
WITH adj
AS (
SELECT tt.BookingNo, tt.Id, SUM(tbd.amount) AS Amount
FROM TestTransaction tt
JOIN @tbd tbd ON tt.BookingNo = tbd.BookingNo AND tbd.id <= tt.id
GROUP BY tt.BookingNo, tt.Id
)
UPDATE tt
SET InitialBalance -= adj.Amount
,ClosingBalance -= adj.Amount
FROM TestTransaction tt
JOIN adj ON tt.BookingNo = adj.BookingNo AND tt.Id = adj.Id;
COMMIT TRAN
Which yields a final result of:
Id BookingNo IsDebit Amount InitialBalance ClosingBalance
2 200 0 100.00 2000.00 2100.00
3 200 1 150.00 2100.00 1950.00
4 200 0 300.00 1950.00 2250.00
5 200 0 400.00 2250.00 2650.00
7 200 0 250.00 2650.00 2900.00

Here's an example of a running total using your data:
SELECT BookingNo
, Amount
, IsDebit
, SUM(Amount * IIF(IsDebit = 0, 1, -1)) OVER (PARTITION BY BookingNo ORDER BY Id ROWS UNBOUNDED PRECEDING) AS RunningTotal
FROM TestTransaction
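The windowed SUM is a cumulative sum of signed amounts: credits (IsDebit = 0) add to the balance and debits subtract from it. As a rough illustration in Python, applied to the rows that remain after ids 1 and 6 are deleted, and assuming an opening balance of 2000 (the InitialBalance of the first sample row, which is not stored as a transaction):

```python
from itertools import accumulate

# Remaining rows: (Id, IsDebit, Amount)
transactions = [(2, 0, 100), (3, 1, 150), (4, 0, 300), (5, 0, 400), (7, 0, 250)]
opening_balance = 2000  # assumption, see the note above

# Amount * IIF(IsDebit = 0, 1, -1)
signed = [amount if is_debit == 0 else -amount
          for _id, is_debit, amount in transactions]

# Running total, shifted by the opening balance
closing = [opening_balance + total for total in accumulate(signed)]
# Each row opens where the previous row closed
initial = [opening_balance] + closing[:-1]

for (row_id, _is_debit, _amount), start, end in zip(transactions, initial, closing):
    print(row_id, start, end)
```

With these inputs the balances run from 2000/2100 for id 2 to 2650/2900 for id 7.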

Postgresql. select SUM value from arrays

Condition:
There are two tables with arrays.
Note that food.id and price.food_id are declared as arrays.
CREATE TABLE food (
id integer[] NOT NULL,
name character varying(255)
);
INSERT INTO food VALUES ('{1}', 'Apple');
INSERT INTO food VALUES ('{1,1}', 'Orange');
INSERT INTO food VALUES ('{1,2}', 'banana');
and
CREATE TABLE price (
id bigint NOT NULL,
food_id integer[],
value double precision DEFAULT 0
);
INSERT INTO price VALUES (44, '{1}', 500);
INSERT INTO price VALUES (55, '{1,1}', 100);
INSERT INTO price VALUES (66, '{1,2}', 200);
Need to get the sum value of all the products from table food.
Please help make a sql query.
ANSWER:
{1} - Apple - 800 (500+100+200)

What about this:
select
name,
sum(value)
from
(select unnest(id) as food_id, name from food) food_cte
join (select distinct id, unnest(food_id) as food_id, value from price) price_cte using (food_id)
group by
name
It is difficult to understand your question, but this query at least returns 800 for Apple.
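To see where the 800 comes from, the query can be traced by hand: both arrays are unnested into one row per element, the price rows are de-duplicated, and the two sets are joined on the element value. The following Python sketch mimics those steps on the sample data (the variable names are illustrative only):

```python
from collections import defaultdict

food = [([1], "Apple"), ([1, 1], "Orange"), ([1, 2], "banana")]
price = [(44, [1], 500.0), (55, [1, 1], 100.0), (66, [1, 2], 200.0)]

# select unnest(id) as food_id, name from food
food_rows = [(food_id, name) for ids, name in food for food_id in ids]

# select distinct id, unnest(food_id) as food_id, value from price
price_rows = {(price_id, food_id, value)
              for price_id, food_ids, value in price
              for food_id in food_ids}

# join ... using (food_id), then sum(value) grouped by name
totals = defaultdict(float)
for food_id, name in food_rows:
    for _price_id, price_food_id, value in price_rows:
        if food_id == price_food_id:
            totals[name] += value

print(dict(totals))
```

Apple matches all three prices through element 1, which gives 500 + 100 + 200 = 800. Orange and banana get larger totals because their arrays contribute more than one joined row each.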

Try the following command:
SELECT F.ID, F.NAME, SUM(P.VALUE) FROM FOOD F, PRICE P WHERE F.ID = P.FOOD_ID GROUP BY F.ID, F.NAME;
