Why is Eclipselink changing the order of joins incorrectly? - jpa

I have a query like this in JPQL:
SELECT ...
FROM Receipt AS receipt
JOIN Invoice AS invoice
ON receipt.invoiceID = invoice.id
LEFT JOIN Payment AS payment
ON receipt.paymentID = payment.id
LEFT JOIN CreditNote AS creditNote
ON receipt.crmID = creditNote.id
LEFT JOIN Profile AS profile
ON invoice.accountID = profile.accountID
WHERE ...
However when I run it with EclipseLink I get this native query which is invalid
SELECT ...
FROM receipts t0
LEFT OUTER JOIN payments t2
ON ( t0.payment_id = t2.id )
LEFT OUTER JOIN credit_notes t3
ON ( t0.crm_id = t3.id )
LEFT OUTER JOIN profile t4
ON ( t1.account_id = t4.account_id ),
bills t1
WHERE ...
How can I resolve this issue?

In my case I was able to change the JOIN to a LEFT JOIN for all of the joins and then my ordering was respected.
I suspect the underlying issue is a bug in EclipseLink.

Related

Postgresql query many to many relationship with left join and use where clouse not show data

I have three table, table_1 (id), table_2(id), table_pivot(tbl1_id, tbl2_id). When I use query like this:
select *
from public.table_1 t1
left join public.table_pivot tp on tp.tbl1_id = t1.id and tp.user_id = 1
left join public.table_2 t2 on t2.id = tp.tbl2_id;
Data from table_1 is show. But when I use where condition. Like this:
select *
from public.table_1 t1
left join public.table_pivot tp on tp.tbl1_id = t1.id
left join public.table_2 t2 on t2.id = tp.tbl2_id
where tp.user_id = 1;
Data from table_1 not show.
I hope advance can help explain why?
Thanks.
Including a where clause on an outer joined table, effectively converts the join into an inner join. If there are no matching values in table_pivot, then
you will get no results at all. This is standard sql.

COALESCE TSQL with a join tsql

I have a requirement to pick up data that is in more than one place and I have some form of recognition if using the coalesce function. Basically I am looking to coalesce the join itself but looking online its seems as if i can only do this on the fields.
So we have a Products and Suppliers table, we also have these as a temp table so in total 4 tables (products, tempproducts, suppliers, tempsuppliers). In the suppliers and products table is where we store our products and suppliers and their temptables we store any new suppliers/products. We also have a tempsupplierproduct which joins new suppliers to new products. However we can end in a situation where a new supplier has an existing product so the new supplier will be in the tempsuppliers table and its product is in the products table NOT the tempproducts as it is not new, we will also have a new tempsupplierproduct to join the two up.
So i want a query which looks in the tempsupplierproducts table and then gets basic information about the supplier and products. To do this i am using a coalesce.
SELECT DISTINCT SP.*, COALESCE(P.Product, PD.Product) 'Product', COALESCE(S.Supplier, SU.Supplier) 'Supplier'
FROM tempsupplierproduct SP
LEFT JOIN tempProduct P ON SP.ProductCode = P.Code
LEFT JOIN Products PD ON SP.ProductCode = PD.Code
LEFT JOIN tempSupplier S ON SP.SupplierCode = S.Code
LEFT JOIN Suppliers SU ON SP.SupplierCode = SU.Code
Now while this works, something at the back of my head tells me it is not entirely right, ideally i want if data is not in table A then join to table B. I have seen maybe coalescing inside the join itself but I am unsure how to do this
LEFT JOIN Suppliers Su ON SP.SupplierCode = COALESCE(S.Code, SU.Code)
maybe away, but I am confused by this, all it is saying is use code in temptable if not there then use supplier code. So what would this mean if we have a code in the temptable, will this try to join on it, if so then this is incorrect also.
Any help is appreciated
You can union the two suppliers tables together and then join them in one go like this. I'm assuming that there are no duplicates between the two tables in this case but with a bit of extra work that could be resolved as well.
WITH AllSuppliers AS
(
SELECT Code, Supplier FROM Suppliers
UNION ALL
SELECT Code, Supplier FROM tempSupplier
)
SELECT DISTINCT SP.*, COALESCE(P.Product, PD.Product) 'Product', S.Supplier
FROM tempsupplierproduct SP
LEFT JOIN tempProduct P ON SP.ProductCode = P.Code
LEFT JOIN Products PD ON SP.ProductCode = PD.Code
LEFT JOIN AllSuppliers S ON SP.SupplierCode = S.Code
If you need to handle duplicates in the two suppliers tables then an approach like this should work, essentially we rank the duplicates and then pick the highest ranked result. For two tables you could use a full outer join between the two but this approach will scale to any number of tables.
WITH AllSuppliers AS
(
SELECT Code, Supplier, 1 AS TablePriority FROM Suppliers
UNION ALL
SELECT Code, Supplier, 2 AS TablePriority FROM tempSupplier
),
SuppliersRanked AS
(
SELECT Code, Supplier,
ROW_NUMBER() OVER (PARTITION BY Code ORDER BY TablePriority) AS RowPriority
FROM AllSuppliers
)
SELECT DISTINCT SP.*, COALESCE(P.Product, PD.Product) 'Product', S.Supplier
FROM tempsupplierproduct SP
LEFT JOIN tempProduct P ON SP.ProductCode = P.Code
LEFT JOIN Products PD ON SP.ProductCode = PD.Code
LEFT JOIN SuppliersRanked S ON SP.SupplierCode = S.Code
AND RowPriority = 1
You can absolutely join on a coalesced field. Here is a snippet from one of my production views:
LEFT JOIN [Portal].tblHelpdeskresource supplier ON PO.fld_str_SupplierID = supplier.fld_str_SupplierID
-- Job type a
LEFT JOIN [Portal].tblHelpDeskFault HDF ON PO.fld_int_HelpdeskFaultID = HDF.fld_int_ID
-- Job Type b
LEFT JOIN [Portal].tblProjectHeader PH ON PO.fld_int_ProjectHeaderID = PH.fld_int_ID
LEFT JOIN [Portal].tblPPMScheduleLine PSL ON PH.fld_int_PPMScheduleRef = PSL.fld_int_ID
-- Managers (used to be separate for a & b type, now converged)
LEFT JOIN [Portal].uvw_HelpDeskSiteManagers PSM ON COALESCE(PSL.fld_int_StoreID,HDF.fld_int_StoreID) = PSM.PortalSiteId
LEFT JOIN [Portal].tblHelpdeskResource PHDR ON PSM.PortalResourceId = PHDR.fld_int_ID

What's the difference between these joins?

What's the difference between
SELECT COUNT(*)
FROM TOOL T
LEFT OUTER JOIN PREVENT_USE P ON T.ID = P.TOOL_ID
WHERE
P.ID IS NULL
and
SELECT COUNT(*)
FROM TOOL T
LEFT OUTER JOIN PREVENT_USE P ON T.ID = P.TOOL_ID AND P.ID IS NULL
?
The bottom query is equivalent to
SELECT COUNT(*)
FROM TOOL T
since it is not limiting the result set but rather producing a joined table with a lot of null fields for the right part of the join.
The first query is a left anti join.

Why does not adding distinct in this query produce duplicate rows?

This query was taken from a Rails application log...I'm trying to edit a massive postgresql statement I didn't write....If I don't add a distinct keyword after the SELECT, 2 duplicate rows appear for each braintree account. Why is this and is there another way to avoid having to use the distinct to avoid duplicates?
EDIT: I understand what distinct is supposed to do, the reason I'm asking is that it doesn't generate duplicates for other toy lines. By other toy lines, this query is building a "table" for a particular toy id (this specific example toys.id = 12). How do I figure out where the duplicate rows are being generated?
SELECT accounts.braintree_account_id as braintree_account_id,
accounts.braintree_account_id as braintree_account_id, format('%s %s', addresses.first_name,
addresses.last_name) as shipping_address_full_name,
users.email as email, addresses.line_1 as shipping_address_line_1,
addresses.line_2 as shipping_address_line_2, addresses.city as
shipping_address_city, addresses.state as shipping_address_state,
addresses.zip as shipping_address_zip_code, addresses.country
as shipping_address_country, CASE WHEN xy_shirt IS NULL THEN '' ELSE xy_shirt END, plans.name as plan_name, toys.sku as sku, to_char(accounts.created_at, 'MM/DD/YYYY HH24:MM:SS') as
account_created_at,
to_char(accounts.next_assessment_at, 'MM/DD/YYYY HH24:MM:SS') as account_next_assessment_at,
accounts.account_status as account_status FROM \"accounts\" INNER JOIN \"addresses\" ON
\"addresses\".\"id\" = \"accounts\".\"shipping_address_id\" AND \"addresses\".\"type\" IN
('ShippingAddress') LEFT OUTER JOIN shipping_methods ON
shipping_methods.account_id = accounts.id LEFT OUTER JOIN plans ON
accounts.plan_id = plans.id
LEFT OUTER JOIN users ON
accounts.user_id = users.id LEFT OUTER JOIN toys ON plans.toy_id = toys.id
LEFT OUTER JOIN account_variations ON accounts.id =
account_variations.account_id LEFT OUTER JOIN variations ON
account_variations.variation_id = variations.id
LEFT OUTER JOIN
choice_value_variations ON variations.id =
choice_value_variations.variation_id
LEFT OUTER JOIN choice_values ON
choice_value_variations.choice_value_id = choice_values.id LEFT OUTER
JOIN choice_types ON choice_values.choice_type_id = choice_types.id
LEFT
OUTER JOIN choice_type_toys ON choice_type_toys.toy_id = toys.id
AND choice_type_toys.choice_type_id = choice_types.id
LEFT OUTER JOIN
(SELECT * FROM crosstab('SELECT accounts.id, choice_types.id,
choice_values.presentation FROM accounts\n
LEFT JOIN account_variations ON
accounts.id=account_variations.account_id\n
LEFT JOIN variations ON account_variations.variation_id=variations.id\n
LEFT JOIN choice_value_variations ON
variations.id=choice_value_variations.variation_id\n
LEFT JOIN choice_values ON
choice_value_variations.choice_value_id=choice_values.id\n
LEFT JOIN choice_types ON choice_values.choice_type_id=choice_types.id
ORDER BY 1,2',\n 'select distinct choice_types.id
from choice_types JOIN choice_values ON choice_values.choice_type_id =
choice_types.id JOIN choice_value_variations ON
choice_value_variations.choice_value_id = choice_values.id JOIN
variations ON choice_value_variations.variation_id = variations.id JOIN choice_type_toys ON choice_type_toys.choice_type_id = choice_types.id JOIN toys ON toys.id = choice_type_toys.toy_id
where toys.id=12 ORDER
BY choice_types.id ASC')\n
AS (account_id int, xy_shirt
VARCHAR)) account_variation_view\n ON
accounts.id=account_variation_view.account_id WHERE
\"accounts\".\"account_status\" = 'active' AND
\"addresses\".\"flagged_invalid_at\" IS NULL AND \"toys\".\"id\" = 12
AND (NOT EXISTS (SELECT \"account_skipped_months\".* FROM
\"account_skipped_months\" WHERE
\"account_skipped_months\".\"month_year\" = 'JUL2016' AND
(account_skipped_months.account_id = accounts.id)))"
The purpose of using DISTINCT in a SELECT statement is to eliminate duplicate rows.

Left join filtering only works on where clause

Can someone explain this to me?
I have two huge tables on which I wish to LEFT JOIN, but I think it would be more efficient to filter at the 'on' rather than the 'where'.
Select * from Table1 t1 left join Table2 t2 on t1.Id = t2.Id and t1.Enabled = 1
Rather than
Select * from Table1 t1 left join Table2 t2 on t1.Id = t2.Id where t1.Enabled = 1
The above results are not the same. No matter what I put in as filtering, it stays unaffected:
Select * from Table1 t1 left join Table2 t2 on t1.Id = t2.Id and t1.Enabled = 1
yields exactly the same results as
Select * from Table1 t1 left join Table2 t2 on t1.Id = t2.Id and t1.Enabled = 0
Also, Table1 might only have a few records where Enabled = 1, so it would be inefficient to join millions of records first and then filter to find only 10.
I can do a sub select first, but I somehow I feel it is not the right way:
Select * from (Select * from Table1 where Enabled = 1) a left join Table2 t2 on a.Id = t2.Id
Types of Joins
Left outer join All rows from the first-named table (the "left"
table, which appears leftmost in the JOIN clause) are included.
Unmatched rows in the right table do not appear.
When you move the condition to the where then t1.Enabled = 1 is applied
Do you have actual performance problems with the first query?