User-defined join in Informatica with DB2

We are trying to replace a Source Qualifier SQL override with a user-defined join and a source filter.
For the user-defined join below, entered in the Informatica Source Qualifier:
{A INNER JOIN B ON a.dept_id= b.dept_id
b.load_date between 20170712174712000000 and 20170904152656000000
LEFT OUTER JOIN C ON a.emp_id = c.emp_id}
Informatica generates this SQL query:
FROM A,B,C WHERE {A INNER JOIN B ON A.dept_id = B.dept_id
AND b.load_date between 20170712174712000000 and 20170904152656000000
LEFT OUTER JOIN C
ON C ON a.emp_id = c.emp_id}
I have tried replacing INNER JOIN in the override query with NORMAL JOIN, because I read somewhere that Informatica translates NORMAL JOIN to an inner join.
The source database is DB2.

I don't know Informatica, but the resulting SQL that you listed in the question is not valid for DB2. The biggest problem is that the JOIN ends up in the WHERE clause rather than the FROM clause. I'm not sure how to fix that in Informatica, but the appropriate DB2 syntax would be something like this:
FROM a
INNER JOIN b ON a.dept_id = b.dept_id
LEFT OUTER JOIN c ON a.emp_id = c.emp_id
WHERE
b.load_date BETWEEN 20170712174712000000 and 20170904152656000000
This assumes that b.load_date is not a timestamp field. If it is a timestamp field, the bounds should be written as timestamp values in the format '2017-07-12 17:47:12.000000'.
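If load_date does turn out to be a timestamp column, the numeric bounds from the question would need converting. Here is a minimal Python sketch. It assumes the 20-digit values encode YYYYMMDDHHMMSS followed by six fractional-second digits, which is inferred from the example format above and not confirmed by the question. The helper name to_timestamp_literal is hypothetical.

```python
from datetime import datetime


def to_timestamp_literal(value: int) -> str:
    """Convert a 20-digit YYYYMMDDHHMMSSffffff number to a timestamp string.

    Assumption: the digits are year, month, day, hour, minute, second,
    then six digits of fractional seconds.
    """
    digits = str(value)
    if len(digits) != 20:
        raise ValueError(f"expected 20 digits, got {len(digits)}")
    # Building a datetime validates each field (for example, month <= 12).
    parsed = datetime(
        int(digits[0:4]), int(digits[4:6]), int(digits[6:8]),
        int(digits[8:10]), int(digits[10:12]), int(digits[12:14]),
        int(digits[14:20]),
    )
    return parsed.strftime("%Y-%m-%d %H:%M:%S.%f")


lower_bound = to_timestamp_literal(20170712174712000000)
upper_bound = to_timestamp_literal(20170904152656000000)
print(lower_bound)  # 2017-07-12 17:47:12.000000
print(upper_bound)  # 2017-09-04 15:26:56.000000
```

The resulting strings could then be used as quoted bounds in the BETWEEN predicate.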

The join belongs in the FROM clause, not in the WHERE clause. Hence use:
FROM A INNER JOIN B ON a.dept_id = b.dept_id
LEFT OUTER JOIN C ON a.emp_id = c.emp_id

Related

Which join type is used when using just JOIN in PostgreSql?

I prefer to state the join type explicitly, but the project I have just switched to uses a bare JOIN everywhere. I generally use LEFT JOIN or INNER JOIN according to my needs, but I have not found which join type PostgreSQL applies when a bare JOIN is used.
select p.uuid from Product p
join Category c on p.uuid = c.siteUuid
join Brand b on b.uuid = c.brandUuid
INNER JOIN is the default join type when you write plain JOIN.
For better readability of the queries, it is preferable to write INNER JOIN explicitly.
Reference:
https://www.postgresql.org/docs/current/queries-table-expressions.html#id-1.5.6.6.5.6.4.3.1.2
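The equivalence is easy to check on a small data set. The sketch below uses Python's built-in sqlite3 module as a stand-in for PostgreSQL (an assumption made so the example runs without a server; SQLite also treats a bare JOIN as an inner join). The table and column names are simplified, hypothetical versions of those in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (uuid TEXT);
    CREATE TABLE category (site_uuid TEXT);
    INSERT INTO product VALUES ('p1'), ('p2');
    INSERT INTO category VALUES ('p1');
""")

# Same query, differing only in the join keyword.
query = (
    "SELECT p.uuid FROM product p "
    "{join} category c ON p.uuid = c.site_uuid ORDER BY p.uuid"
)
plain_rows = conn.execute(query.format(join="JOIN")).fetchall()
inner_rows = conn.execute(query.format(join="INNER JOIN")).fetchall()

# Product 'p2' has no category, so both forms drop it.
print(plain_rows == inner_rows, plain_rows)
```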

Convert LEFT JOIN query to Ecto

I have some queries I need to migrate to Ecto and, for maintainability reasons, I'd rather not just wrap them in a fragment and call it a day.
They have a lot of LEFT JOINs in them and, as I understand from this answer, a left_join in Ecto does a LEFT OUTER JOIN by default. I can't figure out how to tell Ecto that I want a LEFT INNER JOIN, which is the default behavior for a LEFT JOIN in PostgreSQL.
To look at a toy example, let's say posts in our database can be either anonymous or they can have a creator. I have a query to get just enough info to make a post preview, but I only want non-anonymous posts to be included:
SELECT
p.id,
p.title,
p.body,
u.name AS creator_name,
u.avatar AS creator_avatar
FROM posts p
LEFT JOIN users u ON p.creator_id = u.id;
I would translate that into Ecto as:
nonanonymous_posts =
from p in Post,
left_join: u in User, on: p.creator_id == u.id,
select: [p.id, p.title, p.body, u.name, u.avatar]
and Ecto spits out
SELECT
t0."id",
t0."title",
t0."body",
t1."name" AS creator_name,
t1."avatar" AS creator_avatar
FROM "posts" AS t0
LEFT OUTER JOIN "users" as t1 ON t0."creator_id" = t1."id";
which will give back anonymous posts as well.
There is no such thing as LEFT INNER JOIN. There are only INNER JOIN and LEFT [OUTER] JOIN (the OUTER keyword is optional, because a LEFT JOIN is always an outer join). So what you want is just :join or :inner_join in your Ecto query.
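To see the difference on the toy example, here is a small sketch in Python with the built-in sqlite3 module (an assumption chosen so it runs without PostgreSQL or Ecto; the sample rows are made up). It runs the same preview query with LEFT JOIN and with INNER JOIN, the latter being the SQL an Ecto join: or inner_join: should correspond to.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT, creator_id INTEGER);
    INSERT INTO users VALUES (1, 'alice');
    INSERT INTO posts VALUES (1, 'signed post', 1), (2, 'anonymous post', NULL);
""")


def preview(join_keyword):
    """Run the post preview query with the given join keyword."""
    sql = f"""
        SELECT p.id, p.title, u.name AS creator_name
        FROM posts p
        {join_keyword} users u ON p.creator_id = u.id
        ORDER BY p.id
    """
    return conn.execute(sql).fetchall()


left_rows = preview("LEFT JOIN")    # keeps the anonymous post, creator is NULL
inner_rows = preview("INNER JOIN")  # only posts that have a creator
print(left_rows)
print(inner_rows)
```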

Left join results in cross join in spark

I am trying to join two tables in pyspark using a SQLContext:
create table joined_table stored
as orc
as
SELECT A.*,
B.*
FROM TABLEA AS A
LEFT JOIN TABLEB AS B ON 1=1
where lower(A.varA) LIKE concat('%',lower(B.varB),'%')
AND (B.varC = 0 OR (lower(A.varA) = lower(B.varB)));
But I get the following error:
AnalysisException: u'Detected cartesian product for LEFT OUTER join between logical plans
parquet\nJoin condition is missing or trivial.\nUse the CROSS JOIN syntax to allow cartesian products between these relations.;
Edit:
I solved the problem using the following in Spark:
conf.set('spark.sql.crossJoin.enabled', 'true')
This enables the cross join in Pyspark!
I cannot see a real ON condition in your LEFT JOIN (ON 1=1 is always true). A left join without a meaningful join condition always results in a cross join, which repeats each row of the left-hand table for every row of the right-hand table. Can you edit your query so that the ON clause contains the actual join condition on your key columns?
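One possible rewrite moves the WHERE predicates into the ON clause. Because those predicates reference columns of B, rows of A without a match are filtered out anyway, so the original query behaves like an inner join. The sketch below checks that equivalence on made-up data using Python's built-in sqlite3 module. This is an assumption made for portability: it does not run Spark, concat() is replaced by SQLite's || operator, and how Spark plans the rewritten query is not verified here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tablea (varA TEXT);
    CREATE TABLE tableb (varB TEXT, varC INTEGER);
    INSERT INTO tablea VALUES ('Hello World'), ('foo');
    INSERT INTO tableb VALUES ('world', 0), ('FOO', 1), ('zzz', 0);
""")

# The filter from the question, written with SQLite's || concatenation.
condition = """
    lower(A.varA) LIKE '%' || lower(B.varB) || '%'
    AND (B.varC = 0 OR lower(A.varA) = lower(B.varB))
"""

# Original shape: trivial join condition, real filter in WHERE.
original = conn.execute(f"""
    SELECT A.varA, B.varB FROM tablea AS A
    LEFT JOIN tableb AS B ON 1=1
    WHERE {condition}
    ORDER BY A.varA
""").fetchall()

# Rewritten shape: the filter is the join condition.
rewritten = conn.execute(f"""
    SELECT A.varA, B.varB FROM tablea AS A
    INNER JOIN tableb AS B ON {condition}
    ORDER BY A.varA
""").fetchall()

print(original == rewritten, rewritten)
```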

How to take an intermediate data for the complex sql query. Postgresql

I have some complex PostgreSQL queries that take data from several tables joined to each other with LEFT OUTER JOIN operators.
I need to test these queries, so I need fixtures that contain only the data I need, not the whole tables.
How can I see the intermediate results of these joins so that I can use them as fixtures?
For example, I have tables A, B and C and the query
SELECT A.column
FROM A
LEFT JOIN B ON A.b_id = B.id
LEFT JOIN C ON A.c_id = C.a_id
How could I get a result like "From table A: {the part of table A that takes part in the query}, From table B: {the part of table B that takes part in the query}" and so on, where each part shows only the data the query needs? Is there an existing tool or method for this?
Unfortunately, EXPLAIN and ANALYZE show only statistics and benchmarks, not data.
Maybe you mean
SELECT A.*
FROM A
LEFT JOIN B ON A.b_id = B.id
LEFT JOIN C ON A.c_id = C.a_id
limit 10
to see what's happening in A from the join?
Or perhaps
select concat('from table a', a.col1, a.col2...) ,
concat('from table b', b.col1, b.col2...)
from ...
String functions such as concat: http://www.postgresql.org/docs/9.1/static/functions-string.html
It is also worth looking at array_append() in http://www.postgresql.org/docs/9.1/static/functions-array.html
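Another way to get per-table fixtures is to run one query per table that selects only the rows the join actually touches. This is a minimal sketch and not an existing tool. It uses Python's built-in sqlite3 module in place of PostgreSQL, and the label columns and sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER, b_id INTEGER, c_id INTEGER);
    CREATE TABLE b (id INTEGER, label TEXT);
    CREATE TABLE c (a_id INTEGER, label TEXT);
    INSERT INTO a VALUES (1, 10, 100), (2, 11, NULL);
    INSERT INTO b VALUES (10, 'b10'), (11, 'b11'), (12, 'unused');
    INSERT INTO c VALUES (100, 'c100'), (200, 'unused');
""")

# One query per table: keep only the rows reachable through the join keys.
fixture_queries = {
    "a": "SELECT * FROM a ORDER BY id",
    "b": "SELECT DISTINCT b.* FROM a JOIN b ON a.b_id = b.id ORDER BY b.id",
    "c": "SELECT DISTINCT c.* FROM a JOIN c ON a.c_id = c.a_id ORDER BY c.a_id",
}
fixtures = {
    name: conn.execute(sql).fetchall() for name, sql in fixture_queries.items()
}

for name, rows in fixtures.items():
    print(f"From table {name}: {rows}")
```

The rows marked 'unused' never match a row of a, so they are left out of the fixtures.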

JOIN that doesn't exclude all records if one side is null

I have a fairly conventional set of order entry tables divided into:
Orders
OrdersRows
OrdersRowsOptions
The record in OrdersRowsOptions is not created unless needed. When I create a set of joins like
select * from orders o
inner join OrdersRows r on r.idOrder = o.idOrder
inner join ordersrowsoptions ro on ro.idOrderRow = r.idOrderRow
where r.idProduct = [foo]
My full result set is empty if no ordersrowsoptions records exist for the given product.
What's the correct syntax to return records even if no records exist on one side of a join?
thx
select * from orders o
inner join OrdersRows r on r.idOrder = o.idOrder
left join ordersrowsoptions ro on ro.idOrderRow = r.idOrderRow
where r.idProduct = [foo]
Of course you should not use select * in any query, and especially not when doing a join: the repeated fields just waste server and network resources.
Since you seem unfamiliar with left joins, you probably also need to understand the concepts in this:
http://wiki.lessthandot.com/index.php/WHERE_conditions_on_a_LEFT_JOIN
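One pitfall in this area, going by the title of that link, is filtering on the left-joined table in the WHERE clause: rows without a match have NULL in those columns, so the filter removes them and the query behaves like an inner join again. The sketch below shows this with Python's built-in sqlite3 module. The optionName column and the product id 42 (standing in for [foo]) are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE OrdersRows (idOrderRow INTEGER, idProduct INTEGER);
    CREATE TABLE ordersrowsoptions (idOrderRow INTEGER, optionName TEXT);
    INSERT INTO OrdersRows VALUES (1, 42), (2, 42);
    INSERT INTO ordersrowsoptions VALUES (1, 'gift wrap');
""")

# Filter on the optional table in WHERE: row 2 (no options) disappears.
filter_in_where = conn.execute("""
    SELECT r.idOrderRow, ro.optionName
    FROM OrdersRows r
    LEFT JOIN ordersrowsoptions ro ON ro.idOrderRow = r.idOrderRow
    WHERE r.idProduct = 42 AND ro.optionName = 'gift wrap'
    ORDER BY r.idOrderRow
""").fetchall()

# Same filter moved into ON: row 2 is kept with a NULL option.
filter_in_on = conn.execute("""
    SELECT r.idOrderRow, ro.optionName
    FROM OrdersRows r
    LEFT JOIN ordersrowsoptions ro
        ON ro.idOrderRow = r.idOrderRow AND ro.optionName = 'gift wrap'
    WHERE r.idProduct = 42
    ORDER BY r.idOrderRow
""").fetchall()

print(filter_in_where)
print(filter_in_on)
```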
LEFT JOIN / RIGHT JOIN.
Edit: yes, the following answer, given earlier, is correct:
select * from orders o
inner join OrdersRows r on r.idOrder = o.idOrder
left join ordersrowsoptions ro on ro.idOrderRow = r.idOrderRow
where r.idProduct = [foo]
LEFT JOIN (or RIGHT JOIN) is probably what you are looking for, depending on which side of the join may have no rows.
Interesting. Do you want to get all orders that contain that product? The other post is correct that you have to use a LEFT or RIGHT OUTER JOIN. But if you want to get entire orders that contain that product, then you'd need a more complex WHERE clause.