tJavaFlex behaviour when changing loop position - talend

I am having some problems in a job, and I suspect they are due to my lack of understanding of tJavaFlex. I am generating 10 rows in this test job, and I am running a loop inside a tJavaFlex:
So there are 10 rows coming in, and a loop in the Start and End section. I was expecting that for each row coming in, it would generate 10 identical rows coming out. And that I would see iterations 0,1,2,3....9 for each row.
What I got was this. This looks to me like the entire job is running 10 times, and so I have 100 random values coming through the flow from the tRowGenerator.
If I move the for loop into the Main Code section, I get close to the behaviour I was expecting. I am expecting each row when it comes in to be repeated 10 times, and for 1 row coming in to produce 10 output rows. What I get is this.
But even then my tLogRow only seems to output one row for every 10 iterations (look at the tLogRow output after iteration 9 above: why not 10 items?). I had thought I would get 10 rows for each single row coming in, and that I would see them in the tLogRow.
What I need to do is take a value from a field coming in, do some reg exp parsing and split into an array, and then for each item in the array create lines in the output flow. i.e. 1 row coming in can be turned into x number of rows coming out using a string.split() method.
Can someone explain the behaviour above, and also advise on the best approach to get one value coming in, do some java manipulation and then generate multiple rows coming out?
Any advice appreciated.

Yes, you are not using it correctly.
The Start code is for initialising variables (executed once, before the first row).
The Main code holds the body of your loop (executed once for each row).
In the End code you can, for example, store a result in a global variable (executed once, after the last row).
The Main code is executed for each row in a tJavaFlex, so don't put a for loop inside it; you can do it like the example in the screenshot.
Your tJavaFlex behaviour is normal: you have ten rows, so for each row the for loop is executed 10 times (i<10).
You can use it like this:

You don't need to create your own loop.
By putting the for loop in the Start code, your Main code will be triggered both by the loop and by the incoming rows, so it will be executed n*r times (n loop iterations times r incoming rows).
The behaviour of a subjob that contains a tJavaFlex reveals that the component before the tJavaFlex is included in its Start code, and the component after it is included in its End code, although that may depend on many conditions such as data propagation and trigger type.
Start code:
System.out.print("tJavaFlex is starting...");
int i = 0;
Main code:
i++;
System.out.print("tJavaFlex inside Main Code...iteration:"+i);
row8.ITEM_NAME = row7.ITEM_NAME;
row8.ITEM_COUNT = row7.ITEM_COUNT;
End code:
System.out.print("tJavaFlex is ending...");
System.out.print(row7.ITEM_NAME);
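To make the n*r effect easier to see, here is a minimal plain-Java sketch of how the three sections might wrap the row flow. It is only a simplified model, not the code Talend actually generates, and the class and method names (FlexSimulation, loopInStartCode, noExplicitLoop) are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class FlexSimulation {

    // Models a for loop opened in the Start code and closed in the End code:
    // the loop ends up wrapping the whole row flow.
    public static List<String> loopInStartCode(int loopCount, int rowCount) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < loopCount; i++) {               // Start code opens the loop
            for (int r = 0; r < rowCount; r++) {            // rows arriving from upstream
                out.add("iteration " + i + ", row " + r);   // Main code
            }
        }                                                   // End code closes the loop
        return out;
    }

    // Models the suggested usage: no explicit loop, only a counter.
    public static List<String> noExplicitLoop(int rowCount) {
        List<String> out = new ArrayList<>();
        int i = 0;                                          // Start code
        for (int r = 0; r < rowCount; r++) {                // rows arriving from upstream
            i++;                                            // Main code
            out.add("iteration " + i + ", row " + r);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(loopInStartCode(10, 10).size()); // 100
        System.out.println(noExplicitLoop(10).size());      // 10
    }
}
```

With 10 loop iterations and 10 incoming rows the first variant runs the Main code 100 times, which matches the 100 values seen in the question.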

Instead of the main flow in row5, try using an iterate flow to connect the tJavaFlex.

Related

Talend - How to get tFlowToIterate size and tFileInputRegex size?

Good day,
I have a tFileInputRegex component and a tFlowToIterate to read data from a text file. I can see the number of rows being processed, as follows:
That is 3412 rows. How can I get this value in tJava_2?
Currently I am using _NB_LINE but I am getting null:
System.out.println("total is " + (Integer)globalMap.get("tFileInputRegex_1_NB_LINE"));
System.out.println("total2 is " + (Integer)globalMap.get("tFlowToIterate_2_NB_LINE"));
In your example, tJava_2 executes within the iteration, i.e. once for each row. In that component, you can use globalMap.get("tFlowToIterate_2_CURRENT_ITERATION").toString() to get the number of rows processed so far. Please note that instead of casting it to Integer you need to convert it to a string as shown above in order to output it the way you do.
If you need the total number of rows, you can use globalMap.get("tFileInputRegex_1_NB_LINE").toString() - but it is only available after the end of the loop, which means the component where you access it needs to be connected to tFileInputRegex_1 via OnSubjobOk trigger.
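As a rough illustration of why the lookup returns null inside the iteration, the sketch below uses an ordinary HashMap as a stand-in for Talend's globalMap. The class and the helper describeCount are hypothetical; in a real job the map is supplied by the generated code and the NB_LINE entry is written by the component itself.

```java
import java.util.HashMap;
import java.util.Map;

public class NbLineLookup {

    // Returns the stored count as text, or a placeholder when the key has not been set yet.
    public static String describeCount(Map<String, Object> globalMap, String key) {
        Object value = globalMap.get(key);
        return value == null ? "not available yet" : value.toString();
    }

    public static void main(String[] args) {
        // Stand-in for the job's globalMap (assumption: a plain String-to-Object map).
        Map<String, Object> globalMap = new HashMap<>();
        String key = "tFileInputRegex_1_NB_LINE";

        // While rows are still being read, the total has not been stored.
        System.out.println("total is " + describeCount(globalMap, key));

        // After the input component has finished, the total is present.
        globalMap.put(key, 3412);
        System.out.println("total is " + describeCount(globalMap, key));
    }
}
```

Checking for null before calling toString() avoids a NullPointerException if the component is accidentally read too early.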

Save output of each iteration and combine in one variable

I am getting a 2-column double array output from each iteration of a loop. Every time the result is 30×2 , 40×2 , 99×2, ... and so on.
I want to save the result of each iteration in the same variable, let's say in data.
Currently, every time the loop runs, only the last output is saved.
What I want is to have all of the outputs stored in data: the first 30 rows hold the first output, which is 30×2, rows 31 to 70 hold the output from the second iteration, which is 40×2, and so on.
Can anyone help?
Perhaps try the following:
For the first iteration of the loop,
data=result;
For subsequent iterations,
data=[data; result];

How to check if the stream of rows has ended

Is there a way for me to know if the stream of rows has ended? That is, if the job is on the last row?
What I'm trying to do is run some processing for every 10 rows. My problem is the last rows: for example, with 115 rows, the last 5 won't be processed, but I need them to be.
There is no built-in functionality in Talend which tells you if you're on the last row. You can work around this using one of the following:
Get the row count beforehand. For instance, if you have a file, you can use tFileRowCount to count the number of rows; then, when you process your file, you keep a variable for your current row number, so you can tell when you've reached the last row. If your data come from a database, you could either issue a query that returns the total number of rows beforehand, or modify your main query to return the total number of rows in an additional column (using ranking functions) and use that.
Do some processing after the subjob has ended. There may be situations where you need special processing for the last row. You can achieve this by getting the last row processed by the previous subjob, which you have already saved, for instance by putting a tSetGlobalVar after your target: when your subjob is done, your variable contains the last written value.
Edit
For your use case, what you could do is first store the result of the API call in memory using tHashOutput, then read it with a tHashInput in order to process it. You'll then know how many rows you have retrieved from tHashOutput's global variable tHashOutput_X_NB_LINE.
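For the "every 10 rows" case specifically, another option that does not need the row count is to buffer rows and flush whatever is left once the flow has ended. This is not one of the approaches listed above, just a plain-Java sketch of the idea; the class BatchBuffer and its methods are hypothetical, and the comments only indicate where each part could sit in a tJavaFlex.

```java
import java.util.ArrayList;
import java.util.List;

public class BatchBuffer {
    private final int batchSize;
    private final List<String> buffer = new ArrayList<>();
    private final List<List<String>> flushed = new ArrayList<>();

    public BatchBuffer(int batchSize) {
        this.batchSize = batchSize;
    }

    // Would sit in the Main code: called once per incoming row.
    public void onRow(String row) {
        buffer.add(row);
        if (buffer.size() == batchSize) {
            flush();
        }
    }

    // Would sit in the End code: called once after the last row.
    public void onEnd() {
        if (!buffer.isEmpty()) {
            flush(); // handles the incomplete final batch
        }
    }

    private void flush() {
        flushed.add(new ArrayList<>(buffer)); // "do something" with the batch
        buffer.clear();
    }

    public List<List<String>> batches() {
        return flushed;
    }

    public static void main(String[] args) {
        BatchBuffer b = new BatchBuffer(10);
        for (int i = 1; i <= 115; i++) {
            b.onRow("row " + i);
        }
        b.onEnd();
        System.out.println(b.batches().size());          // 12
        System.out.println(b.batches().get(11).size());  // 5
    }
}
```

With 115 rows this produces 11 full batches and a final batch of 5, so the trailing rows are not lost.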

How do I force tJavaFlex to generate multiple rows for a single row

How do I make tJavaFlex generate multiple output rows for a single input row? I don't want to use tSplitRow as I have to do other processing.
But for example, if I add a for loop inside my main code, and split my string into words the below happens, and I just get the last word in the sentence in my output flow:
tRowGenerator generating one sentence (1 row, one column):
tJavaFlex with loop in the Main section splitting the sentence into word tokens:
And this is what I get:
I had thought my loop would generate 10 rows in the output. Is there a way to make the tJavaFlex do this kind of multiplication of input rows?
To achieve this, you need to use the tNormalize component.
Below is a sample job using tNormalize, with the same string that you used.
I set the item separator to a space.
A simple println statement gave me the result below.
Hope this may help you out.
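For readers who want to see the idea in code, here is a small plain-Java sketch of the one-row-in, many-rows-out split. It only imitates the effect with String.split; it is not tNormalize's implementation, and the class name, method name and sample sentence are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class NormalizeSketch {

    // Turns one delimited value into one output "row" per item.
    public static List<String> normalize(String value, String separatorRegex) {
        List<String> rows = new ArrayList<>();
        for (String item : value.trim().split(separatorRegex)) {
            if (!item.isEmpty()) {   // skip empty tokens, e.g. from an empty input
                rows.add(item);
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        // Hypothetical input sentence; the separator is one or more whitespace characters.
        for (String word : normalize("the quick brown fox", "\\s+")) {
            System.out.println(word);
        }
    }
}
```

A loop inside the Main code overwrites the same output row on every pass, which is why only the last word survives there; producing a list and emitting one row per element is the behaviour the question is after.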

Getting certain rows from list of rows (C# 3.0)

I have a datatable having 44 rows.
I have converted that to list and want to take the rows from 4th row till the last(i.e. 44th).
I have the below program
IEnumerable<DataRow> lstDr = dt.AsEnumerable().Skip(4).Take(dt.Rows.Count);
But the output is "Enumeration yielded no results".
I am using C# 3.0.
Please help.
If you want to take everything from the 4th row onwards, you don't need a Take call at all, just:
IEnumerable<DataRow> lstDr = dt.AsEnumerable().Skip(4);
When you talk about "the output" what's that coming from? What do you get if you call:
Console.WriteLine(lstDr.Count());
?
How many rows are in your data table to start with?