tExtractRegexField unable to act as lookup to tMap in Talend DI - talend

I have a tExtractRegexField that extracts a date from a string of text coming from an ExcelFileInput. It outputs the dates to a tLogRow without problems, but I can't connect the same output as a lookup to a tMap that has a 2nd ExcelFileInput as its main input.
If I connect the tExtractRegexField to the tMap first, I can't then connect the 2nd ExcelFileInput, and vice versa.
I'm using Talend 6.3.1. For testing, I am able to connect 2 x ExcelFileInput to a tMap, so I don't think it's a problem with my system setup.
I have also tried tJoin instead of tMap but I encounter the same issue (I can't connect both inputs together, but I can connect either "A" or "B" first).
[Screenshot: Overview of Process]
[Screenshot: Problem Area]
The tExcelFileInput uses globalMap to get the path to the Excel file from the preceding tFlowToIterate.

Based on discussions on the Talend forum, the issue may be down to Talend DI trying to avoid circular references.
An alternative solution is to extract the regex field from the header row and store it in a global variable, using a tJavaRow with globalMap.put("MyVal", row.Data);. Then, on an OnComponentOk trigger, read the remaining data from the body rows and, in the tMap, recall the global variable MyVal and include it as needed in the tMap output.
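The store-then-recall pattern described above can be sketched in plain Java. This is only an illustration under assumptions: in a real job Talend provides globalMap for you, so the HashMap below is a stand-in, and the method names and the sample date are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

public class GlobalMapSketch {
    // Stand-in for the job-wide globalMap that Talend generates (assumption:
    // it behaves like a Map<String, Object>).
    static final Map<String, Object> globalMap = new HashMap<>();

    // What the tJavaRow on the header row would do: store the extracted value.
    static void storeHeaderValue(String extractedDate) {
        globalMap.put("MyVal", extractedDate);
    }

    // What a tMap output expression would do for each body row: read it back.
    static String bodyRowWithDate(String bodyValue) {
        return bodyValue + "," + (String) globalMap.get("MyVal");
    }

    public static void main(String[] args) {
        storeHeaderValue("2017-03-01"); // hypothetical extracted date
        System.out.println(bodyRowWithDate("row1"));
    }
}
```

Inside the tMap itself, only the expression ((String)globalMap.get("MyVal")) would be needed in the output column.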

Related

Talend - Output data to Snowflake table with spaces in Field Names

I have a very specific requirement to output data to a Snowflake table but the field names must have spaces in them. Snowflake appears to handle this okay, but I'm unsure how Talend will as I understand Java doesn't allow it. Can anyone help?
Also are there other tools that won't handle spaces in field names (i.e. R or Python) so we would be restricting use of the warehouse if we did that?

Throw error on invalid lookup in Talend job that populates an output table

I have a tMap component in a Talend job. The objective is to get a row from an input table, perform a column lookup in another input table, and write an output table populating one of the columns with the retrieved value (see screenshot below).
If the lookup is unsuccessful, I generate a row in an "invalid rows" table. This works fine, but it is not the solution I'm looking for.
Instead, I want to stop the entire process and throw an error on the first unsuccessful lookup. Is this possible in Talend? The error that is thrown should contain the value that failed the lookup.
UPDATE
A tFileOutputDelimited component would do the trick.
The flow would be: tMap -> invalid_row -> tFileOutputDelimited -> tDie
Note that you have to go to the advanced settings of the tFileOutputDelimited component, tick the split output into multiple files option, and enter 1 rather than 1000.
For more flexibility, simply use two tMaps rather than one.

Pivot data in Talend

I have some data which I need to pivot in Talend. This is a sample:
brandname,metric,value
A,xyz,2
B,xyz,2
A,abc,3
C,def,1
C,ghi,6
A,ghi,1
Now I need this data to be pivoted on the metric column like this:
brandname,abc,def,ghi,xyz
A,3,null,1,2
B,null,null,null,2
C,null,1,6,null
Currently I am using tPivotToColumnsDelimited to pivot the data to a file and reading back from that file. However having to store data on an external file and reading back is messy and unnecessary overhead.
Is there a way to do this with Talend without writing to an external file? I tried to use tDenormalize but as far as I understand, it will return the rows as 1 column which is not what I need. I also looked for some 3rd party component in TalendExchange but couldn't find anything useful.
Thank you for your help.
Assuming that your metrics are fixed, you can use their names as columns of the output. The pivot has two parts: first, a tMap that copies the value of each input row "in" into the corresponding column of the output row "out"; second, a tAggregateRow that groups the tMap's output rows by brandname.
In the tMap you fill each column conditionally, for example for the output column named "abc":
out.abc = "abc".equals(in.metric) ? in.value : null
In the tAggregateRow you group by out.brandname and aggregate each metric column as a sum, ignoring nulls.
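The two steps can be mimicked in plain Java to check the logic against the sample data from the question. This is a sketch of the idea, not the code Talend generates; the class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class PivotSketch {
    // Assumption: the set of metrics is fixed and known in advance.
    static final String[] METRICS = {"abc", "def", "ghi", "xyz"};

    // Step 1 (tMap): fill only the column whose name matches the row's metric.
    static Integer[] spread(String metric, int value) {
        Integer[] cols = new Integer[METRICS.length];
        for (int i = 0; i < METRICS.length; i++) {
            cols[i] = METRICS[i].equals(metric) ? Integer.valueOf(value) : null;
        }
        return cols;
    }

    // Step 2 (tAggregateRow): group by brandname, summing while ignoring nulls.
    static Map<String, Integer[]> pivot(List<String[]> rows) {
        Map<String, Integer[]> out = new TreeMap<>();
        for (String[] r : rows) {
            Integer[] spreadRow = spread(r[1], Integer.parseInt(r[2]));
            Integer[] acc = out.computeIfAbsent(r[0], k -> new Integer[METRICS.length]);
            for (int i = 0; i < METRICS.length; i++) {
                if (spreadRow[i] != null) {
                    acc[i] = (acc[i] == null ? 0 : acc[i]) + spreadRow[i];
                }
            }
        }
        return out;
    }

    // Render one pivoted row as CSV; empty cells print as "null".
    static String format(String brand, Integer[] cols) {
        StringBuilder sb = new StringBuilder(brand);
        for (Integer c : cols) {
            sb.append(',').append(c);
        }
        return sb.toString();
    }

    // The sample rows from the question: brandname, metric, value.
    static List<String[]> sample() {
        List<String[]> rows = new ArrayList<>();
        rows.add(new String[] {"A", "xyz", "2"});
        rows.add(new String[] {"B", "xyz", "2"});
        rows.add(new String[] {"A", "abc", "3"});
        rows.add(new String[] {"C", "def", "1"});
        rows.add(new String[] {"C", "ghi", "6"});
        rows.add(new String[] {"A", "ghi", "1"});
        return rows;
    }

    public static void main(String[] args) {
        System.out.println("brandname," + String.join(",", METRICS));
        pivot(sample()).forEach((brand, cols) -> System.out.println(format(brand, cols)));
    }
}
```

A column stays null for a brand only when no input row carried that metric, which is why the sum has to skip nulls rather than treat them as 0.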

tAggregateRow Talend - How can i count rows from table in Talend

The format is
tJDBCInput --main--> tAggregateRow --main--> tJavaRow --main--> tLogRow
as shown in the image:
Under tAggregateRow basic setting I have this:
What should I write in tJava to get the value of rowcount?
If you want the number of rows read by the tJDBCInput, Talend provides it natively, with no need for an aggregation. The row count is stored in the global map and you can get it with this expression: ((Integer)globalMap.get("tJDBCInput_1_NB_LINE"))
You can use it in a tJava component and write it to your console using
System.out.println(((Integer)globalMap.get("tJDBCInput_1_NB_LINE")));
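A small null-safe variant of that lookup is sketched below. It assumes the counter is only present once the input component has finished (for example when the tJava is triggered after the subjob), so a missing key is treated as 0. The HashMap stands in for Talend's globalMap, and the helper name and the value 42 are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

public class RowCountSketch {
    // Stand-in for Talend's globalMap; in a real job it is provided for you.
    static final Map<String, Object> globalMap = new HashMap<>();

    // Reads <component>_NB_LINE and falls back to 0 when the key is absent.
    static int rowCount(String componentName) {
        Integer n = (Integer) globalMap.get(componentName + "_NB_LINE");
        return n == null ? 0 : n;
    }

    public static void main(String[] args) {
        // Simulate what Talend would store once tJDBCInput_1 has read its rows.
        globalMap.put("tJDBCInput_1_NB_LINE", 42);
        System.out.println("Rows read: " + rowCount("tJDBCInput_1"));
    }
}
```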

Autoincrement using Sequences is not working as expected

I am currently working on a job something like this
The design is to extract some data from customers (say first name, last name) to one Excel file, while other data (say address) goes to another Excel file. I added an identity column in the tMap using Numeric.sequence("s1",1,1), but in one file it produces 1,3,5,7,9,11,13,... and in the other Excel file it produces 2,4,6,8,10,12,...
But I need both Excel files to have the same identity 1,2,3,4,5,6,...,N
so that I can map the records.
Can somebody guide me on this?
edit:
The autoincrement returns 1,2,3,4,5,6,... which is fine when there's only one tMap component in the job, but not when 2 tMaps are used?
This is because the numeric sequence is static, so it is shared by every component in the job. Since you have only one sequence called "s1", it is incremented twice for every row (once for each tMap that invokes it).
Just use unique labels (i.e. "s1" and "s2") to force the use of two independent sequences, which solves your problem.
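The effect can be reproduced outside Talend with a small model of a named, static sequence. The sequence method below is a simplified stand-in written for illustration, not Talend's actual Numeric routine, and the run helper is hypothetical.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

public class SequenceSketch {
    // One static map holds every named counter, so callers that use the
    // same name share the same counter.
    static final Map<String, Integer> sequences = new HashMap<>();

    // Simplified model of a named sequence: the first call returns start,
    // later calls return the previous value plus step.
    static int sequence(String name, int start, int step) {
        Integer current = sequences.get(name);
        int next = (current == null) ? start : current + step;
        sequences.put(name, next);
        return next;
    }

    // Simulates two tMaps that each ask for an id once per row.
    static int[][] run(String nameA, String nameB, int rows) {
        sequences.clear();
        int[][] ids = new int[2][rows];
        for (int i = 0; i < rows; i++) {
            ids[0][i] = sequence(nameA, 1, 1); // first tMap
            ids[1][i] = sequence(nameB, 1, 1); // second tMap
        }
        return ids;
    }

    public static void main(String[] args) {
        int[][] shared = run("s1", "s1", 3);
        System.out.println(Arrays.toString(shared[0]) + " " + Arrays.toString(shared[1]));
        int[][] separate = run("s1", "s2", 3);
        System.out.println(Arrays.toString(separate[0]) + " " + Arrays.toString(separate[1]));
    }
}
```

With the shared name the two callers interleave (1,3,5 and 2,4,6); with distinct names each gets 1,2,3.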