Adding logs of jobs into a database with Talend - PostgreSQL

I am trying to load the logs of all running jobs into a table in Postgres. I am using the tLogCatcher and tStatCatcher components and joining their outputs to build a table with all the available data.
The job looks like this:
Inside the tMap, I am joining the two sources from the tLogCatcher and the tStatCatcher on the pid and the job name, trying to merge the results so they are combined in one table:
However, whenever the job fails I get nulls in the tLogCatcher output, even when there are error messages:
[statistics] connecting to socket on port 3696
[statistics] connected
2017-02-03 13:51:07|PR7710|PR7710|PR7710|6981|NASIA|Master_ETL_Job|_52dYEJUvEeaqS8phzVFskQ|0.1|Default||begin||
Exception in component tFileInputDelimited_1
java.io.FileNotFoundException: /Users/nasiantalla/Documents/keychain.csv (No such file or directory)
at java.io.FileInputStream.open0(Native Method)
at java.io.FileInputStream.open(FileInputStream.java:195)
at java.io.FileInputStream.<init>(FileInputStream.java:138)
at java.io.FileInputStream.<init>(FileInputStream.java:93)
at org.talend.fileprocess.TOSDelimitedReader.<init>(TOSDelimitedReader.java:88)
at org.talend.fileprocess.FileInputDelimited.<init>(FileInputDelimited.java:164)
at nasia.master_etl_job_0_1.Master_ETL_Job.tFileInputDelimited_1Process(Master_ETL_Job.java:796)
at nasia.master_etl_job_0_1.Master_ETL_Job.runJobInTOS(Master_ETL_Job.java:6073)
at nasia.master_etl_job_0_1.Master_ETL_Job.main(Master_ETL_Job.java:5879)
2017-02-03 13:51:08|PR7710|PR7710|PR7710|NASIA|Master_ETL_Job|Default|6|Java Exception|tFileInputDelimited_1|java.io.FileNotFoundException:/Users/nasiantalla/Documents/keychain.csv (No such file or directory)|1
2017-02-03 13:51:08|PR7710|PR7710|PR7710|6981|NASIA|Master_ETL_Job|_52dYEJUvEeaqS8phzVFskQ|0.1|Default||end|failure|890
[statistics] disconnected
Job Master_ETL_Job ended at 13:51 03/02/2017. [exit code=1]
And the data I get in my table looks like this:
Do you see something I might have missed? I have tried all the different join types in the tMap, but it doesn't seem to work and I don't understand why.
Thanks in advance!

The tStatCatcher and tLogCatcher do not work when joined with a tMap. I cannot give a definitive answer as to why, but I think it's related to the special functionality involved in 'catching' the errors and stats, and is likely a timing issue. tLogCatcher, for instance, will only catch an error when one occurs, while tStatCatcher catches stats on every component.
I recommend writing to separate tables and joining those tables to produce reports. In fact, Talend has this functionality built in, so you do not even need to add your own tStatCatcher and tLogCatcher components to each job.
You must first create the AMC database structure, then go to File --> Edit Project Settings --> Job Settings --> Stats & Logs and choose the 'On Database' option. Talend will then automatically log stats, errors and flows to the AMC database, and you can report off that database.
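Once stats and logs land in the AMC database, the reporting join happens in SQL rather than in a tMap. Here is a minimal sketch of such a report query via JDBC; the table names talendstats and talendlogs and the column names moment, pid, job and message are assumptions on my part, so substitute whatever names your Stats & Logs settings actually configure:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.SQLException;
    import java.sql.Statement;

    public class AmcReport {
        public static void main(String[] args) throws SQLException {
            // Hypothetical connection details; point these at your AMC database.
            String url = "jdbc:postgresql://localhost:5432/amc";
            try (Connection con = DriverManager.getConnection(url, "user", "password");
                 Statement st = con.createStatement();
                 // Join stats to logs on pid after the fact, instead of
                 // joining the two catcher flows inside the job itself.
                 ResultSet rs = st.executeQuery(
                         "SELECT s.job, s.moment, l.message "
                       + "FROM talendstats s "
                       + "LEFT JOIN talendlogs l ON l.pid = s.pid "
                       + "WHERE s.job = 'Master_ETL_Job'")) {
                while (rs.next()) {
                    System.out.println(rs.getString("job") + " | "
                            + rs.getTimestamp("moment") + " | "
                            + rs.getString("message"));
                }
            }
        }
    }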

There are three reasons for this:
tLogCatcher does not provide logs if there is no tDie or tWarn, and I think this is your case.
tLogCatcher and tStatCatcher do not necessarily provide their data at the same time, because they are triggered by different events, so the join will not match.
From a functional perspective, joining the two flows does not make sense; they are fully independent.
I recommend you dump these flows into different tables; this can be achieved implicitly, without using any component and without development, see here.

IBM Datastage reports failure code 262148

I realize this is a bad question, but I don't know where else to turn.
Can someone point me to where I can find the list of 'reports failure' codes for IBM? I've tried searching the IBM documentation and Google generally, but this particular error is unique and I've never seen it before.
I'm trying to find out what code 262148 means.
Background:
I built a DataStage job that has:
ORACLE CONNECTOR --> TRANSFORMER --> HIERARCHICAL DATA
The intent is to pull data from an Oracle table and output the response of the select statement into a JSON file. I'm using the Hierarchical Data stage to set this up. When tested within the stage, there are no problems; I see the JSON output.
However, when I run the job, it squawks:
reports failure code 262148
then the job aborts. There are no warnings, no signs, no errors prior to this line.
Until I know what it is, I can't troubleshoot.
If someone can point me to where the list of failure codes is, I can proceed.
Thanks!
Can someone point me to where I can find the list of 'reports failure' codes for IBM?
Here you go:
https://www.ibm.com/support/knowledgecenter/en/ssw_ibm_i_73/rzahb/rzahbsrclist.htm
While this list does not include your specific error code, it does categorize many other codes and explains how the code breakdown works. And while the list is not specifically for DataStage, in my experience IBM standards are generally consistent across products. In this list, every code that starts with a 2 is a disk failure, so maybe run a disk checker. That's the best I've got as far as error codes.
Without knowledge of the inner workings of the product, there is not much more you can do beyond checking general system health (especially disk, network and permissions in this case). Personally, I prefer to go after internal knowledge whenever external knowledge proves insufficient. I would start with a network capture, as I'm sure there's a socket involved in the connection between the layers. Compare a capture taken when the select statement is run from within the Hierarchical Data stage with one taken when it is run from the job. There may be clues in there, like reset or refused connections.
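One more thing you can do with the number itself is look at it in hex, since IBM status values are often packed bit fields. That is pure speculation on my part rather than documented DataStage behaviour, but it only takes a few lines of Java to check:

    public class CodeBreakdown {
        public static void main(String[] args) {
            int code = 262148;
            // 262148 == 0x00040004: two identical 16-bit fields, each 4.
            System.out.printf("0x%08X%n", code);
            System.out.printf("high=%d low=%d%n", code >> 16, code & 0xFFFF);
        }
    }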

OrientDB: IllegalStateException: Cannot begin a transaction while a hook is executing

I'm getting this error when I try to insert 17,000 vertices into the DB. The vertices are grouped as multiple trees, and the commit occurs when a tree has been fully read/stored. The first tree has 2,300 vertices; the second has 5,500 vertices, and it is at this point that it fails.
java.lang.IllegalStateException: Cannot begin a transaction while a hook is executing
at com.orientechnologies.orient.core.db.document.ODatabaseDocumentTx.begin(ODatabaseDocumentTx.java:2210)
at com.orientechnologies.orient.core.db.document.ODatabaseDocumentTx.begin(ODatabaseDocumentTx.java:2192)
at com.tinkerpop.blueprints.impls.orient.OrientTransactionalGraph.ensureTransaction(OrientTransactionalGraph.java:229)
at com.tinkerpop.blueprints.impls.orient.OrientTransactionalGraph.commit(OrientTransactionalGraph.java:177)
at net.odbogm.SessionManager.commit(SessionManager.java:351)
at com.quiencotiza.utilities.SetupInicial.loadRubros(SetupInicial.java:180)
at com.quiencotiza.utilities.SetupInicial.initDatabase(SetupInicial.java:48)
at com.quiencotiza.utilities.SetupInicial.main(SetupInicial.java:41)
It's a single-threaded app. It loads the database with the initial records.
I have upgraded to 2.2.4, but I get the same error.
Thanks
Marcelo
Well, I solved the problem. It seems to be something related to activateOnCurrentThread(), but I don't know why it happened. What does that exception mean? Why is it thrown?
I know it's an old topic, but maybe it will help someone.
I had the same problem: a lot of threads with many queries and updates.
So I moved all the work onto one thread (a SingleThreadExecutor in Java), and that solved it; see the sketch below.
I guess there is a bug in the locking of hooks.
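For anyone who wants to try the same fix, here is a minimal sketch of the single-thread approach, assuming the OrientDB 2.2 Blueprints API; the connection URL, credentials and property names are placeholders:

    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;

    import com.tinkerpop.blueprints.Vertex;
    import com.tinkerpop.blueprints.impls.orient.OrientGraph;
    import com.tinkerpop.blueprints.impls.orient.OrientGraphFactory;

    public class SingleThreadLoader {
        public static void main(String[] args) throws Exception {
            OrientGraphFactory factory =
                    new OrientGraphFactory("remote:localhost/mydb", "admin", "admin");

            // Funnel every write through one thread so that no two
            // transactions (and no hooks) ever run concurrently.
            ExecutorService writer = Executors.newSingleThreadExecutor();

            writer.submit(() -> {
                OrientGraph graph = factory.getTx();
                // Bind the underlying database to this worker thread,
                // as hinted at by activateOnCurrentThread() above.
                graph.getRawGraph().activateOnCurrentThread();
                try {
                    Vertex v = graph.addVertex(null);
                    v.setProperty("name", "example");
                    graph.commit();
                } finally {
                    graph.shutdown();
                }
            });

            writer.shutdown();
        }
    }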

Talend Force run order of joblets

My company has a couple of joblets that we put into new jobs to do things like initialize variables, get system information from the database, and send out error/warning emails. The issue we are running into is that if we start creating the components of a job and then realize that we forgot to include these three joblets, we basically have to re-create the job to ensure that the joblets are added first so they run first.
Is there any way to force these joblets to run first and possibly also in a certain order before moving on to the contents of the job being created? Please let me know if there is any information you may need that I'm missing as I have only been using Talend for a few days. The rest of the team has not been using it too much longer than I have, so they do not have the answer I'm looking for either. Thanks in advance!
In joblets you can use the Trigger_Input and Trigger_Output components as connection points for OnSubjobOk triggers. That way you can connect joblets and other components in a job with triggers, thus enforcing execution order.
But you cannot get an OnSubjobOk trigger from a tPreJob. I am thinking of triggering from a tPreJob to a tWarn (OnComponentOk) and then from the tWarn to the joblet (OnSubjobOk).

Talend Subjobs and Sundry

Trying to troubleshoot an existing Talend job, with many iterations and subjobs, created by a developer who is no longer with the company. Ran into an issue with subjobs, and I'm hoping someone here can answer it.
I know from reading the documentation that OnSubjobOk10 indicates that the job will execute after subjob #10 is complete. But in a workflow with no names, how do I know which one is subjob #10? Can I assume it is the one from which the job-to-job connection is made?
Thanks in advance,
Bee
OnSubjobOK will make the next subjob run if the previous subjob finished without error. From the Talend help:
OnSubjobOK (previously Then Run): This link is used to trigger the next subjob on the condition that the main subjob completed without error. This connection is to be used only from the start component of the Job.
These connections are used to orchestrate the subjobs forming the Job or to easily troubleshoot and handle unexpected errors.

How to increment a number from a csv and write over it

I'm wondering how to increment a number "extracted" from a field in a CSV, and then rewrite the file with the incremented number.
I need this counter in a tMap.
Is the design below a good way to do it?
EDIT: I'm trying a new method. See the design of my subjob below, but I get an error when I link the tJavaRow to my main tMap in the main job:
Exception in component tMap_1
java.lang.NullPointerException
at mod_file_02.file_02_0_1.FILE_02.tFileList_1Process(FILE_02.java:9157)
at mod_file_02.file_02_0_1.FILE_02.tRowGenerator_5Process(FILE_02.java:8226)
at mod_file_02.file_02_0_1.FILE_02.tFileInputDelimited_2Process(FILE_02.java:7340)
at mod_file_02.file_02_0_1.FILE_02.runJobInTOS(FILE_02.java:12170)
at mod_file_02.file_02_0_1.FILE_02.main(FILE_02.java:11954)
2014-08-07 12:43:35|bm9aSI|bm9aSI|bm9aSI|MOD_FILE_02|FILE_02|Default|6|Java Exception|tMap_1|java.lang.NullPointerException:null|1
[statistics] disconnected
You should be able to do this mid-flow in a tMap or a tJavaRow.
Simply read the number in as an integer (or other numeric data type) and then add your increment to it.
A really simple example might look like this:
Here we have a tFixedFlowInput that has some hard coded values for the job:
And we run it through a tMap where we add 1 to the age column:
And finally, we output it to the console in a table:
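In code terms, the tMap expression for the age column is simply row1.age + 1 (with row1 being whatever your input flow is called). The equivalent tJavaRow body, assuming Integer columns and Talend's generated input_row/output_row variables, would be along these lines:

    // tJavaRow: pass the row through and increment the numeric column.
    output_row.name = input_row.name;
    output_row.age = input_row.age + 1;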
EDIT:
As Gabriele B has pointed out, this doesn't exactly work when reading and writing to the same flat file as Talend claims an exclusive read-write lock on the file when reading and keeps it open throughout the job.
Instead you would have to write the incremented data to some other place such as a temporary file, a database or even just to the buffer and then read that data in to a separate job which would then output the file you want and clean up anything temporary.
The problem with that is that you can't do the output in the same process. I've just tried reading the file in one child job, passing the data back to a parent job using a tBufferOutput, then passing that data to another child job as a context variable and trying to output to the file. Unfortunately the file lock remains, so you can't do this all in one self-contained job (even using a parent job and several child jobs).
If this sounds horrible to you (it is), and you absolutely need this to happen (I'd suggest a database table sounds like a better match for this functionality than a flat file), then you could raise a feature request on the Talend Jira for tFileInputDelimited to not hold the file open, or to not insist on an exclusive read-write lock on the file.
Once again, I strongly recommend that you move to using a database table for this because even without the file lock issue, this is definitely not the right use of a flat file and this use case perfectly fits a database, even something as lightweight as an embedded H2 database.
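For completeness, the temp-file workaround described above looks roughly like this in plain Java (say, inside a tJava); the file name is hypothetical and the file is assumed to contain nothing but the counter:

    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.Paths;
    import java.nio.file.StandardCopyOption;
    import java.util.Arrays;
    import java.util.List;

    public class CounterFile {
        public static void main(String[] args) throws IOException {
            Path csv = Paths.get("counter.csv");      // hypothetical file name
            Path tmp = Paths.get("counter.csv.tmp");

            // Read the current value and increment it.
            List<String> lines = Files.readAllLines(csv);
            int counter = Integer.parseInt(lines.get(0).trim()) + 1;

            // Write to a temp file first, then replace the original, so the
            // source file is never open for reading and writing at once.
            Files.write(tmp, Arrays.asList(Integer.toString(counter)));
            Files.move(tmp, csv, StandardCopyOption.REPLACE_EXISTING);
        }
    }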