How to know in Talend if tMySQLInput will overwrite data? - talend

I have an existing Talend Open Studio job in which a tMySQLInput component runs some SQL to retrieve a few joined columns. It is linked to a tMySQLOutput component, which points to an existing MySQL table that already holds a few records.
QUESTION:
Will the "tMySQLInput" component overwrite the existing data in the table that the tMySQLOutput component points to? In other words, is there an option on the tMySQLInput or tMySQLOutput component that says "overwrite the data each time this job is executed"?
Thank you all.

Yes, there is such an option, but it is on tMySQLOutput, where you specify which action to perform on the table. (tMySQLInput only runs its query and reads rows; it does not write to the table.) Follow these steps:
Go to the Component tab of tMySQLOutput; it opens the Basic settings of the component.
There you will find Action on table. This is the action performed on the table that tMySQLOutput points to. Its options include Default, Drop and create table, etc.
Below it is Action on data. These are the operations performed on the data, such as Insert, Update, etc.
In your case I suppose you can choose Default for Action on table and Insert for Action on data. Default does nothing to the table itself, and Insert adds the incoming records to the table. Note that with Insert, the job stops as soon as it hits a duplicate row, that is, a row that violates the table's primary or unique key.
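The behaviour of a plain insert against an existing key can be sketched outside Talend. The following is a minimal illustration only, assuming Python's built-in sqlite3 as a stand-in for MySQL; the customers table, its columns, and the run_insert_job helper are hypothetical names, not part of Talend.

```python
import sqlite3

def run_insert_job(conn, rows):
    """Mimic a plain insert-only load: stop at the first duplicate key."""
    inserted = 0
    try:
        for row in rows:
            conn.execute("INSERT INTO customers (id, name) VALUES (?, ?)", row)
            inserted += 1
    except sqlite3.IntegrityError as exc:
        # The duplicate key aborts the load; rows after it are never processed.
        return inserted, str(exc)
    return inserted, None

# Hypothetical target table that already contains one record.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO customers VALUES (1, 'existing')")

incoming = [(2, "new"), (1, "duplicate"), (3, "never reached")]
inserted, error = run_insert_job(conn, incoming)

# The existing record is left untouched: an insert never overwrites it.
existing_name = conn.execute("SELECT name FROM customers WHERE id = 1").fetchone()[0]
row_count = conn.execute("SELECT COUNT(*) FROM customers").fetchone()[0]
```

The point of the sketch is that an insert-only load cannot overwrite existing rows; it can only add rows or fail on a key conflict.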

Related

How can I edit table properties in tabular model in visual studio?

I want to add new columns in two tables in a tabular model. But I faced three questions in the process.
When I opened the table properties, I found a query that filters rows. I tried deleting the filter directly, but when I clicked Validate, it reported that the credentials for this operation could not be validated. How can I change the SQL statement?
When I open Design and click Import, this error appears: "Cannot import the partition query because the set of columns in the partition definition does not match those in the table definition. The following required columns are missing."
The partition only sets the datetime, so I do not understand what this error refers to.
When I open Design in the table properties and click Update, I get the error: "Cannot save changes because the partitions' schema has been changed. Please correct the schema and try again." But the table does not have any partitions. How can I fix it?

Copy Data - How to skip Identity columns

I'm designing a Copy Data task where the Sink SQL Server table contains an Identity column. The Copy Data task always wants me to map that column when, in my opinion, it should just not include the column in the list of columns to map. Does anyone know how I can get the ADF Copy Data task to ignore Sink Identity columns?
If you are using the Copy Data tool and the ID column in your SQL Server table is set to auto-increment, it should not show up at the mapping step. Please tell us if that is not the case.
If you are creating the pipeline and dataset yourself, go to the Schema tab of the sink dataset and remove the ID column. Then go to the Mapping tab of the copy activity and click Import schemas again. The ID column should now have disappeared.
Alternatively, you could run a SET IDENTITY_INSERT <table> ON statement for the given table before executing the copy step, and set it back to OFF after the copy completes.
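The idea behind the first two suggestions, leaving the auto-generated key out of the column mapping so the database assigns it, can be sketched as follows. This is only an analogy, assuming Python's built-in sqlite3 with an AUTOINCREMENT column in place of a SQL Server identity column; the sink table and its columns are hypothetical and this is not ADF configuration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sink (id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT, city TEXT)"
)

# Hypothetical source rows: they carry no value for the generated key.
source_rows = [{"name": "Ann", "city": "Oslo"}, {"name": "Bob", "city": "Rome"}]

# Build the column mapping from the sink schema, skipping the generated key.
sink_columns = [row[1] for row in conn.execute("PRAGMA table_info(sink)")]
mapped = [col for col in sink_columns if col != "id"]

placeholders = ", ".join("?" for _ in mapped)
sql = f"INSERT INTO sink ({', '.join(mapped)}) VALUES ({placeholders})"
conn.executemany(sql, [tuple(row[col] for col in mapped) for row in source_rows])

# The database filled in the key values on its own.
ids = [row[0] for row in conn.execute("SELECT id FROM sink ORDER BY id")]
```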

iSQLOutput - Update only Selected columns

My flow is simple and I am just reading a raw file into a SQL table.
At times the raw file contains data corresponding to existing records. I do not want to insert a new record in that case and would only want to update the existing record in the SQL table. The challenge is, there is a 'record creation date' column which I initialize at the time of record creation. The update operation overwrites that column too. I just want to avoid overwriting that column, while updating the other columns from the information coming from the raw file.
So far I have no idea how to do that. Could someone make a recommendation?
I defaulted the creation column to auto-populate in the SQL database itself, and I changed my flow to update only the remaining columns. The Talend job no longer touches that column. Problem solved.
Yet another reminder of 'Simplification is underrated'. :)
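A minimal sketch of that approach, assuming Python's built-in sqlite3 instead of the actual SQL database and Talend flow; the records table, its columns, and the load helper are hypothetical. The creation date has a database default, and the load only ever writes the other columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE records (
        id INTEGER PRIMARY KEY,
        status TEXT,
        created_at TEXT DEFAULT CURRENT_TIMESTAMP  -- filled in by the database
    )
    """
)

def load(conn, rows):
    """Update existing records, insert new ones; never write created_at."""
    for rec_id, status in rows:
        cur = conn.execute(
            "UPDATE records SET status = ? WHERE id = ?", (status, rec_id)
        )
        if cur.rowcount == 0:
            # No existing record: insert and let the default set created_at.
            conn.execute(
                "INSERT INTO records (id, status) VALUES (?, ?)", (rec_id, status)
            )

load(conn, [(1, "new")])
# Pin a recognizable creation date so the effect of the second load is visible.
conn.execute("UPDATE records SET created_at = '2000-01-01 00:00:00' WHERE id = 1")
load(conn, [(1, "changed"), (2, "new")])

rows = conn.execute(
    "SELECT id, status, created_at FROM records ORDER BY id"
).fetchall()
```

After the second load, record 1 has the new status but keeps its original creation date, while record 2 gets its creation date from the column default.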

Avoid duplicate inserts without unique constraint in target table?

Source & target tables are similar.
Target table has a UUID field that is computed in tMap, however the flow should not insert duplicate persons in target i.e unique (firstname,lastname,dob,gender). I tried marking those columns as key in tMap as in below screenshot, but that does not prevent duplicate inserts. How can I avoid duplicate inserts without adding unique constraint on target?
I also tried "using field" in target.
Edit: Solution as suggested below:
The CDC components in the Paid version of Talend Studio for Data Integration undoubtedly address this.
In Open Studio, you can roll your own change data capture based on the composite unique key (firstname,lastname,dob,gender).
Use tUniqueRow on data coming from stage_geno_patients, unique on the following columns: firstname,lastname,dob,gender
Feed that into a tMap
Add another query as a second input to the tMap, to perform look-ups against the table behind "patients_test" and find a match on firstname,lastname,dob,gender. That lookup should use "Reload at each row", looking up against the values from the current staging row
In the case of no-match, detect it and then do an insert of the staging row of data into the table behind "patients_test"
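The steps above can be sketched outside Talend as follows. This is a minimal illustration, assuming Python's built-in sqlite3 as a stand-in database; the insert_new_people helper and the sample rows are hypothetical, and only the table name patients_test and the key columns come from the question. Note that the equality comparison does not match NULL values, so rows with missing key fields would need extra handling.

```python
import sqlite3
import uuid

KEY_COLUMNS = ("firstname", "lastname", "dob", "gender")

def insert_new_people(conn, staging_rows):
    """Insert staging rows whose composite key is not yet in patients_test."""
    seen = set()
    inserted = 0
    for row in staging_rows:
        key = tuple(row[col] for col in KEY_COLUMNS)
        if key in seen:
            continue  # duplicate within the staging data (the tUniqueRow step)
        seen.add(key)
        # Per-row lookup against the target (the tMap lookup step).
        match = conn.execute(
            "SELECT 1 FROM patients_test "
            "WHERE firstname = ? AND lastname = ? AND dob = ? AND gender = ?",
            key,
        ).fetchone()
        if match is None:
            # No match: insert the staging row with a freshly computed UUID.
            conn.execute(
                "INSERT INTO patients_test (uuid, firstname, lastname, dob, gender) "
                "VALUES (?, ?, ?, ?, ?)",
                (str(uuid.uuid4()), *key),
            )
            inserted += 1
    return inserted

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE patients_test "
    "(uuid TEXT, firstname TEXT, lastname TEXT, dob TEXT, gender TEXT)"
)
conn.execute(
    "INSERT INTO patients_test VALUES ('existing-uuid', 'Ann', 'Lee', '1980-01-01', 'F')"
)

staging = [
    {"firstname": "Ann", "lastname": "Lee", "dob": "1980-01-01", "gender": "F"},
    {"firstname": "Bob", "lastname": "Ray", "dob": "1975-05-05", "gender": "M"},
    {"firstname": "Bob", "lastname": "Ray", "dob": "1975-05-05", "gender": "M"},
]

first_run = insert_new_people(conn, staging)
second_run = insert_new_people(conn, staging)  # re-running adds nothing
total = conn.execute("SELECT COUNT(*) FROM patients_test").fetchone()[0]
```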
Q: Are you going to update information, also? Or, is the goal only to perform unique inserts where the data is not already present?

APEX - Creating a page with multiple forms linked to multiple related tables... that all submit with one button?

I have two tables in APEX that are linked by their primary key. One table (APEX_MAIN) holds the basic metadata of a document in our system and the other (APEX_DATES) holds important dates related to that document's processing.
For my team I have created a control panel where they can interact with all of this data. The issue is that right now they alter the information in APEX_MAIN on one page and then alter APEX_DATES on another. I would really like to have these forms on the same page and submit updates to their respective tables and rows with a single submit button. I have currently set this up using two different regions on the same page, but I am getting errors both with the initial fetching of the rows (whichever row is fetched second seems to work, but then the page items in the form that was fetched first are empty) and with submitting (it gives an error about information in the DB having been altered since the update request was sent). Can anyone help me?
It is a limitation of the built-in Apex forms that you can only have one automated row fetch process per page, unfortunately. You can have more than one form region per page, but you have to code all the fetch and submit processing yourself if you do (not that difficult really, but you need to take care of optimistic locking etc. yourself too).
Splitting one table's form over several regions is perfectly possible, even using the built-in form functionality, because the region itself is just a layout object, it has no functionality associated with it.
Building forms manually is quite straightforward but a bit more work.
Items
These should have the source set to "Static Text" rather than database column.
Buttons
You will need buttons such as Create, Apply Changes and Delete that submit the page. These need unique request values so that you know which table is being processed, e.g. CREATE_EMP. You can make the buttons display conditionally, e.g. Create only when the PK item is null.
Row Fetch Process
This will be a simple PL/SQL process like:
select ename, job, sal
into :p1_ename, :p1_job, :p1_sal
from emp
where empno = :p1_empno;
It will need to be conditional so that it only fires on entry to the form and not after every page load - otherwise if there are validation errors any edits will be lost. This can be controlled by a hidden item that is initially null but set to a non-null value on page load. Only fetch the row if the hidden item is null.
Submit Process(es)
You could have 3 separate processes for insert, update, delete associated with the buttons, or a single process that looks at the :request value to see what needs doing. Either way the processes will contain simple DML like:
insert into emp (empno, ename, job, sal)
values (:p1_empno, :p1_ename, :p1_job, :p1_sal);
Optimistic Locking
I omitted this above for simplicity, but one thing the built-in forms do for you is handle "optimistic locking" to prevent 2 users updating the same record simultaneously, with one's update overwriting the other's. There are various methods you can use to do this. A common one is to use OWA_OPT_LOCK.CHECKSUM to compare the record as it was when selected with the record as it is at the point of committing the update.
In fetch process:
select ename, job, sal, owa_opt_lock.checksum('SCOTT','EMP',ROWID)
into :p1_ename, :p1_job, :p1_sal, :p1_checksum
from emp
where empno = :p1_empno;
In submit process for update:
update emp
set job = :p1_job, sal = :p1_sal
where empno = :p1_empno
and owa_opt_lock.checksum('SCOTT','EMP',ROWID) = :p1_checksum;
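if sql%rowcount = 0 then
-- the row was changed (or deleted) since it was fetched, so nothing was updated
raise_application_error(-20001, 'Record was changed by another user. Please re-query and try again.');
end if;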
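The fetch-and-compare pattern above can also be tried out in isolation. The following is a minimal sketch only, assuming Python's built-in sqlite3 and a SHA-256 hash of the row in place of Oracle and OWA_OPT_LOCK.CHECKSUM; the helper names fetch_emp, update_emp and row_checksum are hypothetical. In a real multi-user database the comparison belongs inside the UPDATE's WHERE clause, as in the PL/SQL above, so that the check and the write are atomic.

```python
import hashlib
import sqlite3

def row_checksum(row):
    """Checksum of the row as fetched (stand-in for owa_opt_lock.checksum)."""
    return hashlib.sha256(repr(row).encode("utf-8")).hexdigest()

def fetch_emp(conn, empno):
    row = conn.execute(
        "SELECT ename, job, sal FROM emp WHERE empno = ?", (empno,)
    ).fetchone()
    return row, row_checksum(row)

def update_emp(conn, empno, job, sal, expected_checksum):
    """Apply the update only if the row is unchanged since it was fetched."""
    current = conn.execute(
        "SELECT ename, job, sal FROM emp WHERE empno = ?", (empno,)
    ).fetchone()
    if current is None or row_checksum(current) != expected_checksum:
        return False  # someone else changed or deleted the row in the meantime
    conn.execute(
        "UPDATE emp SET job = ?, sal = ? WHERE empno = ?", (job, sal, empno)
    )
    return True

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE emp (empno INTEGER PRIMARY KEY, ename TEXT, job TEXT, sal REAL)"
)
conn.execute("INSERT INTO emp VALUES (7369, 'SMITH', 'CLERK', 800)")

# Two users open the same record at the same time.
_, checksum_a = fetch_emp(conn, 7369)
_, checksum_b = fetch_emp(conn, 7369)

a_ok = update_emp(conn, 7369, "ANALYST", 900, checksum_a)   # first save wins
b_ok = update_emp(conn, 7369, "MANAGER", 1000, checksum_b)  # stale checksum
final_job = conn.execute("SELECT job FROM emp WHERE empno = 7369").fetchone()[0]
```

The second save is rejected instead of silently overwriting the first, which is the situation the built-in forms report as the data having been altered since it was fetched.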
Another, easier solution for the fetching part is to create a view with all the fields that you need.
The weak point is that you then need to alter the "submit" code so that it writes to the tables that are the source of the view's data.