Pentaho DI - How to use "all" results from prior step in the next step as an "IN" query - pentaho-spoon

I have input from a tableA in database A that I would like to join to another tableB in database B.
These were my two options:
1. Use Database Join: for each input row from the table in database A, run the join query in database B.
2. Use two input tables (tableA + tableB) and do a merge join on the key.
I went with option #1 because I want to avoid reading tableA and tableB in their entirety.
My question is:
How can I use all results from a prior step as one "IN" query?
For instance
select *
from tableB b
where b.id IN (all_rows_from_prior_step)
versus (where it runs for each input row)
select *
from tableB b
where b.id = ?

Use a 'Group by' step to flatten the rows into a single row with a field 'all_rows_from_prior_step' that holds the comma-separated ids (Group field: empty, Name: all_rows_from_prior_step, Subject: id, Type: 'Concatenate strings separated by ,'). Next, use a 'User Defined Java Expression' step to build the SQL query:
"select * from tableB b where b.id IN (" + all_rows_from_prior_step + ")"
Finally, use a 'Dynamic SQL row' step to run the query. The template SQL, which only has to return the same columns as the real query, could be:
select * from tableB b where 1=0
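The string building done by the 'Group by' and 'User Defined Java Expression' steps can be sketched outside of Pentaho as follows. This is only an illustration in Python, not PDI code: the helper name build_in_query is hypothetical, and it assumes the ids are integers (string ids would need quoting and escaping). Falling back to the 1=0 query for an empty input is a choice made for this sketch, since "IN ()" is not valid SQL.

```python
def build_in_query(ids):
    """Build one IN query from all ids of the prior step (hypothetical helper)."""
    if not ids:
        # "IN ()" is invalid SQL, so return a query that matches no rows.
        return "select * from tableB b where 1=0"
    # int() rejects anything that is not a number before it reaches the SQL text.
    id_list = ",".join(str(int(i)) for i in ids)
    return "select * from tableB b where b.id IN (" + id_list + ")"


query = build_in_query([1, 2, 3])
print(query)
```

Note that many databases cap the number of elements in an IN list, so very large inputs may need to be split into batches.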

Related

How to add complex inner query to @Query annotation

This plain SQL query works correctly in the database (run from SQL Developer) when I pass values for bID and period:
SELECT * FROM A WHERE abcID IN (SELECT abcID FROM B WHERE bID=1) AND period=3
In the project's Repository class I wrote it like this:
@Query("select a from A where a.abcID IN:(select b.abcId from B where bID=:RevID) and period=:period")
The error is:
Space is not allowed after parameter prefix ':' [select a from A
where a.abcID IN:(select b.abcId from B where bID=:RevID) and
period=:period]
How should I write the above query correctly in the @Query annotation?
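First of all, note the points below.
The reported error comes from the stray colon in IN:( ... ). A ':' starts a named parameter, so it must be followed directly by a parameter name, not by a parenthesis. Write in (...) instead.
You can't write select a from A where a.abcID, because the alias a is never declared, so a.abcID cannot be resolved. Declare the alias in the from clause: select column from tableA a where a.abcID
The same applies to the query used in the IN clause. It needs to be like select b.abcId from tableB b where b.bID=:RevID
What you use as :RevID and :period needs to be passed to the query method as @Param("RevID") and @Param("period").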
This is the query template:
@Query("select a.nameOfcolumnYouWantToRetrieve from tableA a where a.someColumn in (select b.someColumn from tableB b where b.columnValueOfTableBYouwantToMatch=:RevID) and a.period=:period")
Applying these points, try the query below:
@Query("select a.id from A a where a.abcID in (select b.abcId from B b where b.bID=:RevID) and a.period=:period")
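The shape of the corrected query (declared aliases, a subquery inside in (...), and two named parameters) can be checked against a throwaway database. The sketch below uses Python's sqlite3 module as a stand-in; it is not JPA or Spring Data, and the table layout and sample rows are assumptions made up for the illustration. SQLite happens to accept the same :name placeholder style.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE A (id INTEGER PRIMARY KEY, abcID INTEGER, period INTEGER);
CREATE TABLE B (abcId INTEGER, bID INTEGER);
INSERT INTO A VALUES (10, 100, 3), (11, 100, 4), (12, 200, 3);
INSERT INTO B VALUES (100, 1), (200, 2);
""")

# Same structure as the corrected query: aliases declared, no colon after "in".
sql = (
    "select a.id from A a where a.abcID in "
    "(select b.abcId from B b where b.bID = :RevID) and a.period = :period"
)

# Named parameters are bound by name, like @Param("RevID") / @Param("period").
matching_ids = [row[0] for row in conn.execute(sql, {"RevID": 1, "period": 3})]
print(matching_ids)
```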

Dynamically choose fields from different tables based on existence

I have two tables A and B.
Both the tables have same number of columns.
Table A always contains all ids of Table B.
I need to fetch a row from Table B first; if it does not exist there, I have to fetch it from Table A.
I was trying to dynamically do this
select
CASE
WHEN b.id is null THEN
a.*
ELSE
b.*
END
from A a
left join B b on b.id = a.id
I think this syntax is not correct.
Can someone suggest how to proceed?
It looks like you want to select all columns from table A except when a matching ID exists in table B. In that case you want to select all columns from table B.
That can be done with this query as long as the number and types of columns in both tables are compatible:
select * from a where not exists (select 1 from b where b.id = a.id)
union all
select * from b
If the number, types, or order of the columns differ, you will need to explicitly specify the columns to return in each subquery.
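A small runnable check of this query, using Python's sqlite3 module with made-up sample data (the val column and the rows are assumptions for the illustration). An order by is added only to make the printed result deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE a (id INTEGER PRIMARY KEY, val TEXT);
CREATE TABLE b (id INTEGER PRIMARY KEY, val TEXT);
INSERT INTO a VALUES (1, 'a1'), (2, 'a2'), (3, 'a3');
INSERT INTO b VALUES (2, 'b2');
""")

# Rows of a that have no match in b, plus every row of b.
rows = conn.execute("""
select * from a where not exists (select 1 from b where b.id = a.id)
union all
select * from b
order by id
""").fetchall()
print(rows)
```

Id 2 exists in both tables, so its row comes from b; ids 1 and 3 exist only in a, so they come from a.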

JPQL order entities by passed IDs to IN clause

Is it possible to order entities by the IDs that I pass as a parameter to the IN clause with a Spring Data repository?
For instance:
SELECT e FROM Employee e WHERE e.id IN (:employeeIds);
and employeeIds = {1,2,3,4,5}
and the List returned by the JpaRepository should contain the entities in the same order:
Employee={id:1, ...}, Employee={id:2, ...}, Employee={id:3, ...}
Depending on the database you are using, you can create a table from the passed array and join the entity to it, something like this:
select e.*
from (values (1), (2), (3), ...) as t(id)
inner join employee e on t.id = e.id;
This can be run as a native query:
entityManager.createNativeQuery(
"select e.* " +
"from (values (1), (2), (3), ...) as t(id) " +
"inner join employee e on t.id = e.id", Employee.class)
.getResultList();
But as you can see, you would have to compose the query yourself or pass quite a lot of parameters (maybe in a loop).
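Composing such a query in a loop might look like the sketch below. It is written in Python against SQLite rather than JPA, so the helper name fetch_in_given_order, the employee table layout, and the sample rows are all assumptions. It also goes one step beyond the query above: a join alone does not guarantee row order, so each id is paired with its list position and the result is sorted on that position.

```python
import sqlite3


def fetch_in_given_order(conn, ids):
    """Return employee rows in the same order as `ids` (hypothetical helper)."""
    if not ids:
        return []
    # One "(?, ?)" pair per id: the id itself and its position in the list.
    values_sql = ", ".join("(?, ?)" for _ in ids)
    params = [p for pos, emp_id in enumerate(ids) for p in (emp_id, pos)]
    sql = (
        f"WITH t(id, pos) AS (VALUES {values_sql}) "
        "SELECT e.id, e.name FROM t JOIN employee e ON e.id = t.id "
        "ORDER BY t.pos"
    )
    return conn.execute(sql, params).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?)", [(1, "Ann"), (2, "Bob"), (3, "Cy")]
)
ordered = fetch_in_given_order(conn, [3, 1, 2])
print(ordered)
```

Ids that do not exist in the table are simply absent from the result, as with a plain IN clause.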

Sql to LINQ query LINQ Query convert

I have a SQL query I want to write in LINQ
Here is my Query
SELECT DISTINCT *
FROM [IHQDB].[dbo].[Table1] as t1
inner join Table2 as t2 on t2.Table2 =t1.ChangedItemID
inner join Table3 as t3 on t3.Table3 = t1.FromUserID
where (t1.FromUserID=1 And t2.ContentItemID= t1.ChangedItemID)
OR (t2.LastModifiedBy=1 or t2.CreatedBy=1 )
This is now working fine, but my real query is a little different: in place of the literal 1 I need the UserID that corresponds to a First Name and Last Name in the M_User table.
How can I get the UserID based on First Name + Last Name?
Here is my LINQ CODE For Retrieving User Name
linq4 = from q in context.T_ContentItems
join p in context.M_Users on q.CreatedBy equals p.UserID
where (advanceKeyword.Contains(p.EmployeeFirstName + " " + p.EmployeeLastName)) select q;
advancechk12 = linq4.ToList();
========================================================================
What I need is that wherever I have written the value "1" (e.g. t2.CreatedBy=1), the UserID is looked up instead. For simplicity, I am able to get the names of all the filtered users in advancechk12. How do I retrieve the UserIDs for the list of usernames returned in advancechk12?
Replace the names in the LINQ query below with your own model names. I just used the same names as in the T-SQL.
var t1List = (from t1 in db.Table1
join t2 in db.Table2 on t1.ChangedItemID equals t2.Id
join t3 in db.Table3 on t1.FromUserID equals t3.Id
where ((t1.FromUserID == 1 && t2.ContentItemID == t1.ChangedItemID) || (t2.LastModifiedBy == 1 || t2.CreatedBy == 1))
select t1).Distinct().ToList();

Postgres is value in column

I have two tables A and B. B includes a column binder which contains integers. Now I want to find those rows of table A whose A.binder value appears in B.binder. The following statement does what I want:
SELECT * FROM A WHERE A.binder=ANY(SELECT binder FROM B)
But I expected something like
SELECT * FROM A WHERE A.binder=ANY(B.binder)
or
SELECT * FROM A WHERE A.binder IN array_agg(B.binder)
would work. Note that B.binder could contain duplicates. Therefore I can't simplify the statement by using an inner join.
An INNER JOIN is still possible.
SELECT A.* FROM A INNER JOIN (SELECT DISTINCT binder FROM B) AS C ON
A.binder = C.binder
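Why doesn't ANY(B.binder) work? Because ANY expects either a subquery or an array expression, and B.binder is neither: table B is not in the FROM clause, so its column cannot be referenced there.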
Use a subquery to get your integers from table B.
SELECT * FROM A
WHERE A.binder IN (
SELECT binder FROM B
);
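The effect of duplicates in B.binder on each approach can be checked with a small experiment. The sketch below uses Python's sqlite3 module as a stand-in for Postgres (the three statements used here are plain SQL that both accept); the id column and the sample rows are assumptions for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE A (id INTEGER PRIMARY KEY, binder INTEGER);
CREATE TABLE B (binder INTEGER);
INSERT INTO A VALUES (1, 10), (2, 20), (3, 30);
INSERT INTO B VALUES (10), (10), (30);  -- binder 10 appears twice
""")


def ids(sql):
    """Run a query and return the matching A.id values in ascending order."""
    return [row[0] for row in conn.execute(sql + " ORDER BY A.id")]


# IN with a subquery: each row of A is returned at most once.
in_subquery = ids("SELECT A.id FROM A WHERE A.binder IN (SELECT binder FROM B)")
# Plain inner join: the duplicate in B multiplies the matching row of A.
plain_join = ids("SELECT A.id FROM A INNER JOIN B ON A.binder = B.binder")
# Inner join on the de-duplicated subquery: one row per match again.
distinct_join = ids(
    "SELECT A.id FROM A INNER JOIN (SELECT DISTINCT binder FROM B) AS C "
    "ON A.binder = C.binder"
)
print(in_subquery, plain_join, distinct_join)
```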