Unable to export data to PostgreSQL from Oracle

I have to extract data from Oracle tables and copy them to PostgreSQL. I am able to map both the input and output files. When I run the connector component, the graphical view shows rows being fetched properly, but when I check the target table there is no data.
This one is for PostgreSQL to PostgreSQL:
TRACE_DEBUG result
This is what I get after running with trace debug.

Are you trying to read from and write to the same table in the input and output? (This could be a problem.)
What kind of action are you using in the output: insert, update, or insert or update?
Did you check whether there is a lock on your output table?

Depending on the settings of your database connection, you may need to turn on auto commit or add an explicit commit component at the end of the flow.
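As a hedged illustration of that commit point only (plain JDBC, not code generated by the tool; the connection details, table and columns are placeholders):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

public class CommitSketch {
    public static void main(String[] args) throws Exception {
        // Placeholder connection details
        try (Connection con = DriverManager.getConnection("jdbc:postgresql://localhost:5432/target", "user", "pass")) {
            con.setAutoCommit(false); // mirrors a connection with auto commit turned off
            try (PreparedStatement ps = con.prepareStatement("INSERT INTO target_table (a, b) VALUES (?, ?)")) {
                ps.setString(1, "value");
                ps.setInt(2, 42);
                ps.executeUpdate(); // row is written, but not yet visible to other sessions
            }
            con.commit(); // without this (or auto commit), the inserted row may never show up in the table
        }
    }
}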
How is the output component configured?
Operation type: insert?
Is it doing a lookup?
Is the table name correct?
Did you check the component's error code global variable after it finishes?


How to populate the table via Pentaho Data Integration's table_output step?

I am performing an ETL job via Pentaho 7.1.
The job is to populate a table 'PRO_T_TICKETS' in PostgreSQL 9.2 via Pentaho jobs and transformations.
I have mapped the table fields with respect to the stream fields.
Mapped Fields
My table PRO_T_TICKETS has its schema (column names) in UPPERCASE.
Is this the reason I can't populate the table PRO_T_TICKETS with my ETL Job?
I duplicated the TABLE_OUTPUT step to PRO_T_TICKETS and changed the Target table field to 'PRO_T_TICKETS2'. Pentaho created a new table with a lowercase schema and populated the data into it.
But I want this data loaded into the table PRO_T_TICKETS only, and with the UPPERCASE schema if possible.
I am attaching the whole job here along with the error thrown by Pentaho (Pentaho Error). I have also tried adding double quotes to the column names in my query, as you can see in the error, but it didn't help.
What do you think I should do?
When you create (or modify) the connection, select Advanced on the left panel and tick Force to upper case or Force to lower case or, even better, Preserve case of reserved words.
To know which option to choose, copy the 4th line of your error log, the line starting with INSERT INTO "public"."PRO_T_TICKETS("OID"..., into your SQL development tool and change the connection's advanced parameters until it works.
Also, at debug time, don't use batch updates, don't use lazy conversion on previous steps, and try with one (1) field rather than all (25).
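For reference, a short sketch of why the case matters in PostgreSQL (plain JDBC; the table and column names are taken from the error log as placeholders and the connection details are made up): unquoted identifiers are folded to lower case, while quoted identifiers keep their exact case, so the two statements below target differently named objects.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class IdentifierCaseSketch {
    public static void main(String[] args) throws Exception {
        try (Connection con = DriverManager.getConnection("jdbc:postgresql://localhost:5432/demo", "user", "pass");
             Statement st = con.createStatement()) {
            // Unquoted: PostgreSQL folds this to public.pro_t_tickets (all lower case)
            st.executeUpdate("INSERT INTO public.PRO_T_TICKETS (OID) VALUES (1)");
            // Quoted: targets the upper-case table "PRO_T_TICKETS" exactly as it was created
            st.executeUpdate("INSERT INTO \"public\".\"PRO_T_TICKETS\" (\"OID\") VALUES (1)");
        }
    }
}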
Just as a complement: it worked for me following the tips from AlainD and using specific configurations that I'd like to share with you. I have a transformation streaming data from MySQL to PostgreSQL using a Table Input and a Table Output. In both DBs I have uppercase objects.
I did the following to make it work:
In the table input (MySQL) the objects are uppercase too, but I typed them in lowercase and it worked; I didn't set any special option on the DB connection.
In the table output (PostgreSQL) I typed everything in uppercase (schema, table name and columns) and I also set "specify the database fields" (clicking on "Get fields").
In the target DB connection (PostgreSQL) I enabled the options "Quote all in database" and "Preserve case of reserved words" (in the "Advanced" section).
PS: the last option is because I found out there was one more problem with my fields: there was a column called "Admin" (yes guys, they created a camel-case column using a reserved word!), and for that reason I had to set "Preserve case of reserved words" and type it as Admin (without quotes and in camel case) in the Table Output.

How to assign csv field value to SQL query written inside table input step in Pentaho Spoon

I am pretty new to Pentaho, so my question might sound very novice.
I have written a transformation in which I am using a CSV file input step and a table input step.
Steps I followed:
Initially, I created a parameter in the transformation properties. The parameter birthdate doesn't have any default value set.
I have used this parameter in the PostgreSQL query in the table input step in the following manner:
select * from person where EXTRACT(YEAR FROM birthdate) > ${birthdate};
I am reading the CSV file using the CSV file input step. How do I assign the birthdate value present in my CSV file to the parameter I created in the transformation?
(OR)
Could you guide me through the process of assigning the CSV field value directly to the SQL query used in the table input step, without using a parameter?
TLDR;
I recommend using a "database join" step like in my third suggestion below.
See the last image for reference
First idea - Using Table Input as originally asked
Well, you don't need any parameter for that, unless you are going to provide the value for that parameter when you launch the transformation. If you need to read the data from a CSV, you can do it with this approach.
First, read your CSV and make sure your rows are ok.
After that, use a Select Values step to keep only the columns to be used as parameters.
In the table input, use a placeholder (?) to determine where to place the data, and set it to run for each row it receives from the source step.
Just keep in mind that the order of the columns received by the table input (the columns coming out of the Select Values step) is the order in which they will be used for the placeholders (?). This should not be a problem in your case, which uses only one placeholder, but keep it in mind as you ramp up using Pentaho.
Second idea, using a Database Lookup
This is another approach where you can't customize the query sent to the database, but you may get better performance because you can set the "Enable cache" flag. If you don't need to use a function in your WHERE clause, this is really recommended.
Third idea, using a Database Join
That is my recommended approach if you need a function in your WHERE clause. It looks a lot like the Table Input approach, but you can skip the Select Values step and choose which columns to use, repeat the same column several times, and enable an "outer join" flag that also returns the rows for which the query produced no result.
Pro tip: if the transformation feels too slow, try using multiple copies of the step (documentation here) and, obviously, make sure the table has the appropriate indexes in place.
Yes, there is a way of assigning the value directly without using a parameter. Do as follows.
Use a "Block this step until steps finish" step to hold the table input step until the CSV input step completes.
Following is how you configure each step.
Note:
The Postgres query should be: select * from person where EXTRACT(YEAR FROM birthdate) > ?::integer
Check Execute for each row and Replace variables in the Table input step.
Select only the birthday column in the CSV input step.

Talend tMap Set Default Value for Rejected Inner Joins and connect them with the main data flow

I've got the following problem.
I have several tMaps, each with a lookup, and at the end all the data is written to a DB. The following mockup illustrates it:
There can be values in the main data stream which are not found in the lookup tables. For these values there is a reject path which catches them from the specific tMap.
Requirements:
In case of a rejected inner join, the looked-up value shall be set to a default value (for example 0, which could be done in the schema of the tMap), and after that these "corrected" records should be added back to the "normal" main data flow and go through the next lookup.
The tUnite component is not able to handle these cases because it cannot exist in a data flow loop.
Does anybody have an idea how to solve this problem?
Cheers.
The answer was so easy that I didn't see it at first. I just have to change the join model from inner join to left join, so all the formerly rejected values will contain a null value. Afterwards I can check the columns in the tMap and set them to a default value if they are null.
row1.id == null ? 0 : row1.id
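The same guard works for other column types; as an illustration (the column and default below are placeholders, not taken from the original job), a String lookup column could be defaulted like this:
row2.name == null ? "UNKNOWN" : row2.name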
Cheers.
If I understand correctly what you are trying to accomplish, you will have to have staging files or staging tables on the database. Once you get the rejected rows, write them to a file or table. The accepted rows will also go to a staging table (different from the rejected one). Then you can union both tables or files by reading them. The key point is having a staging structure. I attach a picture of how it would look. In the picture the staging structure is a MySQL table.
Let me know if it helps!

Spring store data in JdbcTemplate (H2 db) permanently

I am starting to learn Spring and have faced some issues regarding spring-jdbc.
First, I tried running the example from this guide: https://spring.io/guides/gs/relational-data-access/ and it worked. Then I commented out the lines that drop and create the tables (http://pastebin.com/zcJHsL1P), in order not to overwrite the data but just get it from the DB and show it. However, Spring showed me an error:
Table "CUSTOMERS" not found; SQL statement: ...
So, my question is: what should I do to store my database permanently? I don't want to recreate a new database every time; I want to create it once and update it.
P.S. I used the H2 database. Maybe the problem lies in this DB?
That piece of code looks like you are "prototyping" something, so it's easier to automatically create a new database (schema, tables, data) on the fly, execute and/or test whatever you want to... and finish the execution.
If you want to persist your data and only modify/update it, either use H2 with the "file layout" or use MySQL, PostgreSQL, etcetera.
By the way, the reason you are getting Table "CUSTOMERS" not found; SQL statement: ... is that you are using H2 as an in-memory database, so every time you start your application you need to re-create the tables and populate them with data.
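As a minimal sketch of the difference (the file path is only an example), an H2 URL of the form jdbc:h2:mem:... lives purely in memory, while jdbc:h2:file:... writes the database to disk so the tables survive a restart:

import java.sql.Connection;
import java.sql.DriverManager;

public class H2UrlSketch {
    public static void main(String[] args) throws Exception {
        // In-memory: everything is gone once the JVM stops, so tables must be re-created on every start
        try (Connection mem = DriverManager.getConnection("jdbc:h2:mem:testdb", "sa", "")) {
            // ...
        }
        // File-based: stored on disk (here ./data/customers.mv.db), so CREATE TABLE only has to run once
        try (Connection file = DriverManager.getConnection("jdbc:h2:file:./data/customers", "sa", "")) {
            // existing tables and data are still there after a restart
        }
    }
}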

How to log (or see) all inserts performed in a Talend job

I have a job in Talend that inserts data into a table.
Can I get these SQL statements (i.e. "insert into tabla(a,b)values(....)")?
You can see the inserted data by adding a tLogRow, but if you want to see the generated insert in real time you can use the debugger.
For example, for the following job:
Above you can see the data inserted from an Excel file into a MySQL table. This was generated using tLogRow. But if you want the generated SQL statement, you can see it here by using the debugger:
Hope this helps.
You could simply place a tLogRow component either before or after your database output component to log things to the console if you are interested in seeing what data is being sent to the database.
I think it's impossible to see (it could be a nice improvement for new releases). My problem was that when I changed the source of my database output (Oracle SID to Oracle RAC), the inserts were still made in the old database.
I fixed it by changing the XML code in the "item" file: even after the change in the job, the old params attached to the Oracle SID were still there.
Thanks a lot!! Have a nice weekend, Goon10 and ydaetskcoR!
You can check the generated Java code. You'll see an:
INSERT INTO (columns) VALUES (?,?,?)
That's the insert PreparedStatement. Talend uses PreparedStatements to do the inserts, so only one INSERT statement is generated and sent. In the main part of the component it will call
setString(position, value)
Please refer to: http://docs.oracle.com/javase/tutorial/jdbc/basics/prepared.html
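As a hedged sketch of that pattern in plain JDBC (the table, columns and connection details are placeholders; this is not the code Talend actually generates):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

public class PreparedInsertSketch {
    public static void main(String[] args) throws Exception {
        try (Connection con = DriverManager.getConnection("jdbc:mysql://localhost:3306/demo", "user", "pass");
             PreparedStatement ps = con.prepareStatement("INSERT INTO tabla (a, b) VALUES (?, ?)")) {
            // The statement is prepared once; parameters are bound and executed once per row
            ps.setString(1, "someValue"); // column a
            ps.setInt(2, 42);             // column b
            ps.executeUpdate();
        }
    }
}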