Select/Insert from identical table on two different DB connections - select

I have to do a select from a table and insert into another identical(same structure) table on two different DB connections.
This is my code:
from("direct:" + getId)
.toD("sql:classpath:" +getSql1 + "?datasource= DataSourse1&usePlaceHolder=true"))
.setHeaders("Results", simple(${body})
toD("sql:classpath:" +getSql2 + "?datasource= DataSourse2&usePlaceHolder=true"))
where
getSQL1 : Select * FROM Product1
and
getSQL2 :Insert Into Product2 Values(${headers.results})
It does not work because of the data format I'm trying to insert, I suppose. What I get from the Select is something like this:
[{ID=130, DESCRIPTION=Product130}]
So, I need to clean my data and get only 130, 'Product130'
Any help? Thanks.

Assuming your actual code works and you're just transferring one record, change the getSql1 select option part to this:
"?datasource= DataSourse1&usePlaceHolder=true&outputType=SelectOne"
That puts a map into your message body as a result, instead of List of Map. No need to copy the result into a header. With Camel's SQL component, it will look for named query parameters in your message body if the body type is a Java Map.
Next change your insert to use the parameters within the Map:
insert into Product2 (ID, DESCRIPTION) values (:#ID, :#DESCRIPTION)
Notice I included the column names. This is for safety and good practice. If by chance the column order isn't the same as the origin table, this will still work.

Related

ADF Copy function comparing watermark against isnull(date1,date2)

Forum Newbie...
I want to utilise the ADF Copy function, to carry out incremental table extracts from one Azure DB to another. Every table in the database that I need all have the same 2 relevant fields i.e. date1, date2. For Watermark comparison purposes, I need to use isnull(date1,date2), but unsure how to do this, i.e. I am not sure how I can add this consistent derived value to the Source as an additional field that can perhaps be added via the Query or Stored Procedure Option on the source, to utilise the #item().source.schema and #item().source.table values that have already been generated as parameters..?
You can use the query option in the Copy data activity source and add a new column in the query itself to get the results of isnull(date1,date2) and include the parameter values to get the table name instead of hardcoding them as shown below.
In source, select Query option under Use query and add dynamic content to concat() select statement with parameter values.
#concat('select *, isnull(date1,date2) as final_dt from ',pipeline().parameters.schema,'.',pipeline().parameters.table)
Sink table data output:

how to get talend (TMAP) to load lookup data and incoming data at the same time

I have a talend job that i require a lookup at the target table.
Naturally the target table is large (a fact table) so I don't want to have to wait to load the whole thing before going to running lookups like this picture below:
Is there a way to have the lookup work DURING the pull from the main source?
The attempt is to speed up the inital loads so things move fast, and attempt to save on memory. as you can see, the lookup is already passed 3 Million rows.
the tLogRow represents the same table as the lookup.
You can achieve what you're looking for by configuring the lookup in your tMap to use "Reload at each row" lookup model, instead of "Load Once". This lookup model allows you to reexecute your lookup query for each incoming row, instead of loading all your lookup table at once, useful for lookups on large tables.
When you select the reload at each row model, you will have to specify a lookup key in the global map sections that will appear under the settings. Create a key with a name like "ORDER_ID", and map it with FromExt.ORDER_ID column. Then modify your lookup query so that it returns a single match for the ORDER_ID like so:
"SELECT col1, col1.. FROM lookup_table WHERE id = '" + (String)globalMap.get("ORDER_ID") + "'".
This is supposing your id column is a string.
What this does is create a global variable called "ORDER_ID" containing the order id for every incoming row from your main connection, then executes the lookup query filtering for that id.

Updating the table through tOracleOutput in Talend using an additional SQL query

I have a job where I am getting a flow into tOracleOutput where I am updating the table. Now, I have to update that table using an SQL statement, which I guess we have option in Advanced settings of tOracleOuptut, but I don't know how to use it or you can say that I am not getting the settings properly. I referred to official documentation but could not understand. Can any one explain the fields like Name, SQL expression, Position, Reference Column in a better way?
the SQL query which I am using is:
update set COL1=SOMETHING1
where COL2=SOMETHING2
Now, value for COL1 is coming from the flow but COL2 is some column in the table which is not coming from the flow.
Have a look to tOracleRow for such a case.
Hope this helps.
TRF
Using tOracleOutput is helpful when a ready data source (table or file (...) with same columns as destination) the more elaborate your query is, the more you should do as TRF said (and use tOracleRow), but here's an example to your question:
file contain 3 column,
DB table of destination contains 4 column, where the 4th is the date of update, (the first 3 are identical to the input)
so you add the destination's column's name in Name and put the SQL function for the date (eg: SYSDATE) and where to put it (Position) in reference to a column of your choice (Reference Column)
In my view it helps avoid using tMap for a miserable additional column when you want to Insert, but you want to Update, in which case the component doesn't offer the additional column section, plus I don't think you can add the WHERE clause here
Hope it helps

Querying on multiple LINKMAP items with OrientDB SQL

I have a class that contains a LINKMAP field called links. This class is used recursively to create arbitrary hierarchical groupings (something like the time-series example, but not with the fixed year/month/day structure).
A query like this:
select expand(links['2017'].links['07'].links['15'].links['10'].links) from data where key='AAA'
Returns the actual records contained in the last layer of "links". This works exactly as expected.
But a query like this (note the 10,11 in the second to last layer of "links"):
select expand(links['2017'].links['07'].links['15'].links['10','11'].links) from data where key='AAA'
Returns two rows of the last layer of "links" instead:
{"1000":"#23:0","1001":"#24:0","1002":"#23:1"}
{"1003":"#24:1","1004":"#23:2"}
Using unionAll or intersect (with or without UNWIND) results in this single record:
[{"1000":"#23:0","1001":"#24:0","1002":"#23:1"},{"1003":"#24:1","1004":"#23:2"}]
But nothing I've tried (including various attempts at "compound" SELECTs) will get the expand to work as it does with the original example (i.e. return the actual records represented in the last LINKMAP).
Is there a SQL syntax that will achieve this?
Note: Even this (slightly modified) example from the ODB docs does not result in a list of linked records:
select expand(records) from
(select unionAll(years['2017'].links['07'].links['15'].links['10'].links, years['2017'].links['07'].links['15'].links['11'].links) as records from data where key='AAA')
Ref: https://orientdb.com/docs/2.2/Time-series-use-case.html
I'm not sure of what you want to achieve, but I think it's worth trying with values():
select expand(links['2017'].links['07'].links['15'].links['10','11'].links.values()) from data where key='AAA'

Filling additional columns of an internal table with additional data by SELECT statement? Can this be done?

SELECT matnr ersda ernam laeda
FROM mara
INTO CORRESPONDING FIELDS OF TABLE gt_mara
UP TO 100 ROWS.
At this point I have 100 entries in the itab gt_mara.
SELECT aenam vpsta pstat lvorm mtart
FROM mara
INTO CORRESPONDING FIELDS OF TABLE gt_mara
FOR ALL ENTRIES IN gt_mara
WHERE matnr = gt_mara-matnr AND
ersda = gt_mara-ersda AND
ernam = gt_mara-ernam AND
laeda = gt_mara-laeda.
At this point I have 59 entries. Which makes sense. This code is buggy, because it might be modifying the selection criteria at run time.
Anyway what i intended was this: select the first 4 fields of the table at one point, and then select the other 5 at some other.
Of course, this is just an example. Perhaps the second select would be done on a different table with the same key or with a different number of fields.
So can this even be done?
Are there more efficient methods to achieve this than what comes to my mind by default (redoing the complete select) ?
Ok I think the essence of your question is more about whether you can update certain unfilled fields in an internal table directly through a second select statement.
The answer is no. Your second select statement would replace the contents in table gt_mara, so you would be left with an internal table where the first 4 fields are blank, and the last 5 are filled.
The best you could do is something like this:
SELECT matnr ersda ernam laeda
FROM mara
INTO CORRESPONDING FIELDS OF TABLE gt_mara
UP TO 100 ROWS.
SELECT matnr aenam vpsta pstat lvorm mtart
FROM mara
INTO CORRESPONDING FIELDS OF TABLE gt_mara2
FOR ALL ENTRIES IN gt_mara
WHERE matnr = gt_mara-matnr AND
ersda = gt_mara-ersda AND
ernam = gt_mara-ernam AND
laeda = gt_mara-laeda.
loop at gt_mara2 into ls_mara.
modify gt_mara from ls_mara transporting aenam vpsta pstat lvorm mtart
where matnr = ls_mara-matnr.
endloop.
This is obviously quite inefficient, which is why you would always try to make the database do as much of the work for you before you bring the data back to the application server. Obviously if the data is coming from the same table selecting it all in one go is going to be your best option. In most cases even if the data is in different tables you would be better off creating a view or using a join.
In rare cases it is necessary to loop at your internal table to fill in some fields that were not available to you when you did the original select.
Either SELECT everything you need right away (which is the preferred solution if the data comes from the same table) or SELECT the additional stuff later (which is a good idea if the stuff comes from a different table that is not used for the first selection). For assembling the result set, the database usually needs to access the entire dataset anyway, so it doesn't really hurt to select some additional fields - in contrast to hitting the database again with a massive SELECT statement (if the FOR ALL ENTRIES table gets large). Also bear in mind that - depending on the kind of processing you're doing - the contents of the table might have changed in the meantime. If the database transaction (LUW) ends (which is always the case between dialog steps), you loose the database-level transaction isolation.