How can I restrict a result to only include rows where one specific field is unique with UNION Select statement in BigQuery?

I have the following code. I am trying to stitch the two tables together, but each duplicated Opportunity_ID should appear only once, and in that case the row should come from the second table (OpportunityUpdates).
SELECT
Opportunity.Account_Name,
Opportunity.Opportunity_Name,
Opportunity.Opportunity_Owner,
Opportunity.Opportunity_ID
FROM
Opportunity
UNION DISTINCT
SELECT
OpportunityUpdates.Account_Name,
OpportunityUpdates.Opportunity_Name,
OpportunityUpdates.Opportunity_Owner,
OpportunityUpdates.Opportunity_ID
FROM
OpportunityUpdates
WHERE OpportunityUpdates.Opportunity_ID <> Opportunity.Opportunity_ID

This code consolidates all records from both tables (by Opportunity_ID) and gives priority to the OpportunityUpdates table based on Opportunity_ID.
It assumes that the same Opportunity_ID could be in either table ("duplicates"), but that within each table an Opportunity_ID is unique. It also assumes that Opportunity_ID is not nullable (never null).
SELECT DISTINCT
IF(ou.Opportunity_ID IS NOT NULL, ou.Account_Name, o.Account_Name) Account_Name,
IF(ou.Opportunity_ID IS NOT NULL, ou.Opportunity_Name, o.Opportunity_Name) Opportunity_Name,
IF(ou.Opportunity_ID IS NOT NULL, ou.Opportunity_Owner, o.Opportunity_Owner) Opportunity_Owner,
COALESCE(ou.Opportunity_ID, o.Opportunity_ID) Opportunity_ID
FROM OpportunityUpdates ou
FULL OUTER JOIN
Opportunity o
ON o.Opportunity_ID = ou.Opportunity_ID
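The WHERE clause in the question seems to be reaching for an anti-join, which can also be written directly: take every row from OpportunityUpdates, then add only those Opportunity rows whose ID has no update. The sketch below is a minimal illustration under the same assumptions as above (IDs unique within each table, never null). It runs against an in-memory SQLite database from Python purely so it can be executed; the sample rows are invented, and the SQL has not been checked against BigQuery itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Opportunity (
    Account_Name TEXT, Opportunity_Name TEXT,
    Opportunity_Owner TEXT, Opportunity_ID TEXT);
CREATE TABLE OpportunityUpdates (
    Account_Name TEXT, Opportunity_Name TEXT,
    Opportunity_Owner TEXT, Opportunity_ID TEXT);
-- Hypothetical sample data: O1 exists in both tables.
INSERT INTO Opportunity VALUES
    ('Acme', 'Deal A', 'Ann', 'O1'),
    ('Bolt', 'Deal B', 'Bob', 'O2');
INSERT INTO OpportunityUpdates VALUES
    ('Acme', 'Deal A v2', 'Cid', 'O1'),
    ('Coil', 'Deal C', 'Dee', 'O3');
""")

# All updates, plus only those originals that have no matching update.
merged = conn.execute("""
SELECT Account_Name, Opportunity_Name, Opportunity_Owner, Opportunity_ID
FROM OpportunityUpdates
UNION ALL
SELECT o.Account_Name, o.Opportunity_Name, o.Opportunity_Owner, o.Opportunity_ID
FROM Opportunity o
WHERE NOT EXISTS (
    SELECT 1 FROM OpportunityUpdates ou
    WHERE ou.Opportunity_ID = o.Opportunity_ID)
ORDER BY Opportunity_ID
""").fetchall()

for row in merged:
    print(row)
```

UNION ALL is enough here because the NOT EXISTS filter already removes the overlapping IDs, so no deduplication pass is needed.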

How to return different format of records from a single PL/pgSQL function?

I am a frontend developer but I started to write backend stuff. I have spent quite some amount of time trying to figure out how to solve this. I really need some help.
Here are the simplified definitions and relations of two tables:
CREATE TABLE IF NOT EXISTS items (
item_id uuid NOT NULL DEFAULT gen_random_uuid() ,
parent_id uuid DEFAULT NULL ,
parent_table parent_tables NOT NULL
);
CREATE TABLE IF NOT EXISTS collections (
collection_id uuid NOT NULL DEFAULT gen_random_uuid() ,
parent_id uuid DEFAULT NULL
);
Our product is an online document collaboration tool in which pages can contain nested pages.
I have a piece of PostgreSQL code that gets all ancestor records for the given item_ids.
WITH RECURSIVE ancestors AS (
SELECT *
FROM items
WHERE item_id in ( ${itemIds} )
UNION
SELECT i.*
FROM items i
INNER JOIN ancestors a ON a.parent_id = i.item_id
)
SELECT * FROM ancestors
It works fine for nesting regular pages. But if I am going to support nesting collection pages, which means some items' parent_id might refer to the "collections" table's collection_id, this code will not work anymore. With my limited experience, I don't think pure SQL can solve it. I think writing a PL/pgSQL function might be a solution, but I need to get all ancestor records for the given itemIds, which means returning a mix of items and collections records.
So how can I return records of different formats from a single PL/pgSQL function? I did some research but haven't found any example.
You can make it work in plain SQL by returning a superset row composed of an item and a collection. One of the two will be NULL in each result row.
WITH RECURSIVE ancestors AS (
SELECT 0 AS lvl, i.parent_id, i.parent_table, i AS _item, NULL::collections AS _coll
FROM items i
WHERE item_id IN ( ${itemIds} )
UNION ALL -- !
SELECT lvl + 1, COALESCE(i.parent_id, c.parent_id), COALESCE(i.parent_table, 'i'), i, c
FROM ancestors a
LEFT JOIN items i ON a.parent_table = 'i' AND i.item_id = a.parent_id
LEFT JOIN collections c ON a.parent_table = 'c' AND c.collection_id = a.parent_id
WHERE a.parent_id IS NOT NULL
)
SELECT lvl, _item, _coll
FROM ancestors
-- ORDER BY ?
UNION ALL, not UNION.
This assumes a collection's parent is always an item, while an item's parent can be either an item or a collection.
We need LEFT JOIN on both potential parent tables to stay in the race.
I added an optional lvl to keep track of the level of hierarchy.
About decomposing row types:
Combine postgres function with query
Record returned from function has columns concatenated
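For engines without composite row types, a flattened variant of the same idea is possible: return only (lvl, kind, id) and look the full rows up afterwards. The sketch below is not the query from the answer. It swaps the two LEFT JOINs for an inner join against a unified "parents" CTE, keeps the assumption that a collection's parent is always an item, and runs on in-memory SQLite from Python so it can be executed. The text IDs and sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE items (
    item_id TEXT PRIMARY KEY, parent_id TEXT, parent_table TEXT NOT NULL);
CREATE TABLE collections (
    collection_id TEXT PRIMARY KEY, parent_id TEXT);
-- Hypothetical hierarchy: item p1 -> collection c1 -> item p0 (root).
INSERT INTO items VALUES ('p0', NULL, 'i'), ('p1', 'c1', 'c');
INSERT INTO collections VALUES ('c1', 'p0');
""")

rows = conn.execute("""
WITH RECURSIVE
parents(kind, id, parent_id, parent_table) AS (
    -- Both tables in one shape; a collection's parent is assumed to be an item.
    SELECT 'i', item_id, parent_id, parent_table FROM items
    UNION ALL
    SELECT 'c', collection_id, parent_id, 'i' FROM collections
),
ancestors(lvl, kind, id, parent_id, parent_table) AS (
    SELECT 0, 'i', item_id, parent_id, parent_table
    FROM items
    WHERE item_id = ?
    UNION ALL
    SELECT a.lvl + 1, p.kind, p.id, p.parent_id, p.parent_table
    FROM ancestors a
    JOIN parents p ON p.kind = a.parent_table AND p.id = a.parent_id
)
SELECT lvl, kind, id FROM ancestors ORDER BY lvl
""", ("p1",)).fetchall()

for row in rows:
    print(row)
```

The recursion stops on its own at the root, because a NULL parent_id never satisfies the join condition.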

Computed table column with MAX value between rows containing a shared value

I have the following table
CREATE TABLE T2
( ID_T2 integer NOT NULL PRIMARY KEY,
FK_T1 integer, <--- foreign key to T1(Table1)
FK_DATE date, <--- foreign key to T1(Table1)
T2_DATE date, <--- user input field
T2_MAX_DIFF COMPUTED BY ( (SELECT DATEDIFF (day, MAX(T2_DATE), CURRENT_DATE) FROM T2 GROUP BY FK_T1) )
);
I want T2_MAX_DIFF to display the number of days since last input across all similar entries with a common FK_T1.
It does work, but as soon as a second FK_T1 value is added to the table, I get an error about "multiple rows in singleton select".
I'm assuming that I need some sort of WHERE FK_T1 = FK_T1 of the corresponding row. Is it possible to add this? I'm using Firebird 3.0.7 with FlameRobin.
The error "multiple rows in singleton select" means that a query that should provide a single scalar value produced multiple rows. And that is not unexpected for a query with GROUP BY FK_T1, as it will produce a row per FK_T1 value.
To fix this, you need to use a correlated sub-query by doing the following:
Alias the table in the subquery to disambiguate it from the table itself
Add a where clause, making sure to use the aliased table (e.g. src, and src.FK_T1), and explicitly reference the table itself for the other side of the comparison (e.g. T2.FK_T1)
(optional) remove the GROUP BY clause because it is not necessary given the WHERE clause. However, leaving the GROUP BY in place may uncover certain types of errors.
The resulting subquery then becomes:
(SELECT DATEDIFF (day, MAX(src.T2_DATE), CURRENT_DATE)
FROM T2 src
WHERE src.FK_T1 = T2.FK_T1
GROUP BY src.FK_T1)
Notice the alias src for the table referenced in the subquery, the use of src.FK_T1 in the condition, and the explicit use of the table name in T2.FK_T1 to reference the column of the current row of the table itself. If you used src.FK_T1 = FK_T1 instead, the unqualified FK_T1 would resolve to the column of src (as if you had written src.FK_T1 = src.FK_T1), so the condition would be true for every row with a non-null FK_T1.
CREATE TABLE T2
( ID_T2 integer NOT NULL PRIMARY KEY,
FK_T1 integer,
FK_DATE date,
T2_DATE date,
T2_MAX_DIFF COMPUTED BY ( (
SELECT DATEDIFF (day, MAX(src.T2_DATE), CURRENT_DATE)
FROM T2 src
WHERE src.FK_T1 = T2.FK_T1
GROUP BY src.FK_T1) )
);
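The effect of the correlation can be checked outside Firebird with an ordinary SELECT. The sketch below is only an illustration on in-memory SQLite, run from Python: SQLite has no DATEDIFF or COMPUTED BY, so julianday() arithmetic and a plain query stand in for them, both tables are aliased (cur and src, names chosen here), and a fixed "today" replaces CURRENT_DATE so the result is repeatable. The sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE T2 (ID_T2 INTEGER PRIMARY KEY, FK_T1 INTEGER, T2_DATE TEXT);
-- Hypothetical data: two rows share FK_T1 = 10, one row has FK_T1 = 20.
INSERT INTO T2 VALUES
    (1, 10, '2024-01-01'),
    (2, 10, '2024-01-03'),
    (3, 20, '2024-01-08');
""")

TODAY = "2024-01-10"  # fixed stand-in for CURRENT_DATE

# The subquery is correlated on cur.FK_T1, so it yields exactly one value per row.
rows = conn.execute("""
SELECT cur.ID_T2,
       (SELECT CAST(julianday(?) - julianday(MAX(src.T2_DATE)) AS INTEGER)
        FROM T2 src
        WHERE src.FK_T1 = cur.FK_T1) AS T2_MAX_DIFF
FROM T2 cur
ORDER BY cur.ID_T2
""", (TODAY,)).fetchall()

for row in rows:
    print(row)
```

Rows 1 and 2 share the same difference because the latest T2_DATE within their FK_T1 group is the same for both.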

Inserting records into table1 depending on row value in table2

For each row in table exam where exam.examRegulation is null, I want to insert one corresponding row into table examRegulation and copy the column values from exam to examRegulation. Apparently the following query is too naive and must be improved:
insert into examRegulation (graduation, course, examnumber, examversion)
values (exam.graduation, exam.course, exam.examnumber, exam.examversion)
where ?? (select graduation, course, examnumber, examversion
from exam
where exam.examRegulation isnull)
Is there a way to do this in postgresql?
You may rephrase this as an INSERT INTO ... SELECT statement:
INSERT INTO examRegulation (graduation, course, examnumber, examversion)
SELECT graduation, course, examnumber, examversion
FROM exam
WHERE examRegulation IS NULL;
The VALUES clause supplies row values as standalone expressions. It has no FROM clause, so it cannot pull column values from the rows of another table. To populate an insert from query results, use a SELECT instead.
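A quick way to see the INSERT INTO ... SELECT form at work is to run it on a few rows. This sketch uses in-memory SQLite from Python rather than PostgreSQL, and the column types, the id column, and the sample rows are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE exam (
    graduation TEXT, course TEXT, examnumber INTEGER,
    examversion INTEGER, examRegulation INTEGER);
CREATE TABLE examRegulation (
    id INTEGER PRIMARY KEY, graduation TEXT, course TEXT,
    examnumber INTEGER, examversion INTEGER);
-- Hypothetical data: two exams have no regulation yet, one already has.
INSERT INTO exam VALUES
    ('BSc', 'Math',    1, 1, NULL),
    ('BSc', 'Physics', 2, 1, 7),
    ('MSc', 'Math',    3, 2, NULL);
""")

# One new examRegulation row per exam row whose examRegulation is NULL.
conn.execute("""
INSERT INTO examRegulation (graduation, course, examnumber, examversion)
SELECT graduation, course, examnumber, examversion
FROM exam
WHERE examRegulation IS NULL
""")

copied = conn.execute("""
SELECT graduation, course, examnumber, examversion
FROM examRegulation
ORDER BY examnumber
""").fetchall()

for row in copied:
    print(row)
```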

I'm trying to insert tuples into a table A (from table B) if the primary key of the table B tuple doesn't exist in table A

Here is what I have so far:
INSERT INTO Tenants (LeaseStartDate, LeaseExpirationDate, Rent, LeaseTenantSSN, RentOverdue)
SELECT CURRENT_DATE, NULL, NewRentPayments.Rent, NewRentPayments.LeaseTenantSSN, FALSE from NewRentPayments
WHERE NOT EXISTS (SELECT * FROM Tenants, NewRentPayments WHERE NewRentPayments.HouseID = Tenants.HouseID AND
NewRentPayments.ApartmentNumber = Tenants.ApartmentNumber)
So, HouseID and ApartmentNumber together make up the primary key. If there is a tuple in table B (NewRentPayments) that doesn't exist in table A (Tenants) based on the primary key, then it needs to be inserted into Tenants.
The problem is, when I run my query, it doesn't insert anything (I know for a fact there should be 1 tuple inserted). I'm at a loss, because it looks like it should work.
Thanks.
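Your subquery is not correlated: it joins Tenants to its own, separate reference to NewRentPayments instead of to the row from the outer query. As soon as any payment matches any tenant, the subquery returns rows for every outer row, so NOT EXISTS is always false and nothing is inserted.
Given your description, you don't need that join.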
Try this:
insert into Tenants (LeaseStartDate, LeaseExpirationDate, Rent, LeaseTenantSSN, RentOverdue)
select current_date, null, p.Rent, p.LeaseTenantSSN, FALSE
from NewRentPayments p
where not exists (
select *
from Tenants t
where p.HouseID = t.HouseID
and p.ApartmentNumber = t.ApartmentNumber
)
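The difference between the two subqueries shows up clearly when both are run as plain SELECTs over the same data. The sketch below uses in-memory SQLite from Python with a cut-down, assumed schema (only the key columns and Rent) and invented rows: one payment matches an existing tenant, one does not.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Tenants (
    HouseID INTEGER, ApartmentNumber INTEGER, Rent INTEGER,
    PRIMARY KEY (HouseID, ApartmentNumber));
CREATE TABLE NewRentPayments (
    HouseID INTEGER, ApartmentNumber INTEGER, Rent INTEGER);
-- Hypothetical data: (1, 1) already has a tenant, (2, 5) does not.
INSERT INTO Tenants VALUES (1, 1, 900);
INSERT INTO NewRentPayments VALUES (1, 1, 900), (2, 5, 1200);
""")

# Question's form: the inner NewRentPayments shadows the outer one,
# so the subquery is the same for every outer row.
uncorrelated = conn.execute("""
SELECT HouseID, ApartmentNumber
FROM NewRentPayments
WHERE NOT EXISTS (
    SELECT * FROM Tenants, NewRentPayments
    WHERE NewRentPayments.HouseID = Tenants.HouseID
      AND NewRentPayments.ApartmentNumber = Tenants.ApartmentNumber)
""").fetchall()

# Answer's form: the subquery refers to the outer row through alias p.
correlated = conn.execute("""
SELECT p.HouseID, p.ApartmentNumber
FROM NewRentPayments p
WHERE NOT EXISTS (
    SELECT * FROM Tenants t
    WHERE p.HouseID = t.HouseID
      AND p.ApartmentNumber = t.ApartmentNumber)
""").fetchall()

print("uncorrelated:", uncorrelated)
print("correlated:", correlated)
```

The first query selects nothing because one matching pair anywhere is enough to make NOT EXISTS false for all rows, which is the behaviour described in the question.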

Another way of returning rows if any of the columns has different value for the same id

Is there another way to join two tables on the same id and return a row only if any of the column values for that id differ?
Select Table1.No,Table2.No,Table1.Name,Table2.Name,Table1.ID,Table2.ID,Table1.ID_N,Table2.ID_N
From MyFirstTable Table1
JOIN MySecondTable Table2
ON Table1.No=Table2.No where Table1.ID!=Table2.ID or Table1.ID_N != Table2.ID_N
In the example above, I have only two columns to check, but in my real case there are at least 20.
Is there a statement I can use instead of enumerating each column in the WHERE condition?
...WHERE BINARY_CHECKSUM(Table1.*) <> BINARY_CHECKSUM(Table2.*)
or
...WHERE BINARY_CHECKSUM(Table1.Field1, Table1.Field2, ...) <> BINARY_CHECKSUM(Table2.Field1, Table2.Field2, ...)
Note: this assumes you have no blob fields in your tables.
http://technet.microsoft.com/en-us/library/ms173784.aspx
If No is the primary key, EXCEPT returns the rows of the first table that have no identical row in the second:
Select Table1.No,Table1.Name,Table1.ID,Table1.ID_N
From MyFirstTable Table1
except
Select Table2.No,Table2.Name,Table2.ID,Table2.ID_N
From MySecondTable Table2
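The EXCEPT approach can be tried on a small data set to see which rows come back. The sketch below runs on in-memory SQLite from Python instead of SQL Server, with invented rows and the column named "No" quoted as a precaution. It assumes every No value exists in both tables; a row whose No is missing from the second table would be returned as well.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE MyFirstTable (
    "No" INTEGER PRIMARY KEY, Name TEXT, ID INTEGER, ID_N INTEGER);
CREATE TABLE MySecondTable (
    "No" INTEGER PRIMARY KEY, Name TEXT, ID INTEGER, ID_N INTEGER);
-- Hypothetical data: row 1 is identical, row 2 differs in ID, row 3 in ID_N.
INSERT INTO MyFirstTable VALUES
    (1, 'a', 10, 100), (2, 'b', 20, 200), (3, 'c', 30, 300);
INSERT INTO MySecondTable VALUES
    (1, 'a', 10, 100), (2, 'b', 21, 200), (3, 'c', 30, 301);
""")

# Rows of the first table with no fully identical counterpart in the second.
changed = conn.execute("""
SELECT "No", Name, ID, ID_N FROM MyFirstTable
EXCEPT
SELECT "No", Name, ID, ID_N FROM MySecondTable
ORDER BY 1
""").fetchall()

for row in changed:
    print(row)
```

Because EXCEPT compares whole rows, adding more columns to both SELECT lists extends the comparison without any extra conditions.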