Talend - missing output - talend

I'm new to Talend, I just did some test.
I created a simple job that query SQL Server and writes on a xlsx file.
Query returns one record (tested on SSMS). And job seems to run fine:
It creates output files, names it correctly and, if specified, it also writes header in it.
But it does not write the record.
Mapping is one to one with the guessed schema. No filter or other operations on data.
What can I check?
== Edit - xlsx configuration ==

As we discover together, you need to replace your tMSSqlRow by a tMSSqlInput component in your job.
Regards,
TRF

Related

Append datasets from xlsx and databse - Talend

I have three Excel files and one database connection which I need to append as a part of my flow. All four datasets in the pre-append stage have just one column.
When I try to use tUnite, I get the error for tFileInputExcel - see the screenshot. Moreover, I cannot join the database connection with tUnite.
What am I doing wrong?
I think the problem is with the tFileExist components (I think that's what these are on the left with the "if" links coming out) because each of them is trying to start a new flow. Once you're joining them with the unite, there can be only one start to the flow - and this goes to the start of the first branch of the merge order.
You can move the if logic elsewhere. Another idea is to put the output from each of the Excel into a tHashOutput (linked together), then use a tHashInput to write to your DB.

AZURE DATA FACTORY - Can I set a variable from within a CopyData task or by using the output?

I have simple pipeline that has a Copy activity to populate a table. That task is based on a query and will only ever return 1 row.
The problem I am having is that I want to reuse the value from one of the columns (batch number) to set a variable so that at the end of the pipeline I can use a Stored Procedure to log that the batch was processed. I would rather avoid running the query a second time in a lookup task so can I make use of the data already being returned?
I have tried duplicating the column in the Copy activity and then mapping that to something like #BatchNo but that fails and have even tried to add a Set Variable task but can't figure out how to take a single column #{activity('Populate Aleprstw').output} does not error but not sure what that will actually do in this case.
Thanks and sorry if its a silly question.
Cheers
Mark
I always do it like this:
Generate a batch number (usually with a proc)
Use a lookup to grab it into a variable
Use the batch number in all activities (might be multiple copes, procs etc.)
Write the batch completion
From your description it seems you have the batch embedded in the data copy from the start which is not typical.
If you must do it this way, is there really an issue with running a lookup again?
Copy activity doesn't return data like that, so you won't be able to capture the results that way. With this design, running the query again in a Lookup is the best option.
Is the query in the Source running on the same Server as the Sink? If so, you could collapse the entire operation into a Stored Procedure that returns the data point you are trying to capture.

Two different mappings to ONE XML output file

I'm working on a talend job where I have a excel file and a couple of database fields that gets mapped to an XML file.
The working job looks like this:
Problem: I want to, with the same input of the excel file and the database fields, make another mapping that outputs to the same working XML file mentioned ealier. So I will have ONE XML file with TWO different mappings. How can I achieve this?
Update
I have done this mapping:
which in the end gets exported like this:
but I'm unsure on how to use this mapping in the tAdvancedFileOutputXML
If I understood correctly, you want to have a single XML file containing two different XMLs (the second one appended to the first one). In the shown Job add a OnSubJobOk link to point to a duplicate of your document flow which has a different mapping. In the second flow rather than using tFileOutputXML component to write the XML file, you can use the tAdvancedFileOutputXML with Append Source XML File marked to add to the file generated from the first flow. Also make sure to configure the XML tree. Check the following link for further information https://help.talend.com/reader/~hSvVkqNtFWjDbBHy0iO_w/h3wZegFH1_1XfusiUGtsPg
Hope this helps.

Using Talend Open Studio DI to extract extract value from unique 1st row before continuing to process columns

I have a number of excel files where there is a line of text (and blank row) above the header row for the table.
What would be the best way to process the file so I can extract the text from that row AND include it as a column when appending multiple files? Is it possible without having to process each file twice?
Example
This file was created on machine A on 01/02/2013
Task|Quantity|ErrorRate
0102|4550|6 per minute
0103|4004|5 per minute
And end up with the data from multiple similar files
Task|Quantity|ErrorRate|Machine|Date
0102|4550|6 per minute|machine A|01/02/2013
0103|4004|5 per minute|machine A|01/02/2013
0467|1264|2 per minute|machine D|02/02/2013
I put together a small, crude sample of how it can be done. I call it crude because a. it is not dynamic, you can add more files to process but you need to know how many files in advance of building your job, and b. it shows the basic concept, but would require more work to suite your needs. For example, in my test files I simply have "MachineA" or "MachineB" in the first line. You will need to parse that data out to obtain the machine name and the date.
But here is how may sample works. Each Excel is setup as two inputs. For the header the tFileInput_Excel is configured to read only the first line while the body tFileInput_Excel is configured to start reading at line 4.
In the tMap they are combined (not joined) into the output schema. This is done for the Machine A Excel and Machine B excels, then those tMaps are combined with a tUnite for the final output.
As you can see in the log row the data is combined and includes the header info.

How to increment a number from a csv and write over it

I'm wondering how to increment a number "extracted" from a field in a csv, and then rewrite the file with the number incremented.
I need this counter in a tMap.
Is the design below a good way to do it ?
EDIT: im trying a new method. see the design of my subjob below, but i have an error when i link the tjavarow to my main tmap in the main job
Exception in component tMap_1
java.lang.NullPointerException
at mod_file_02.file_02_0_1.FILE_02.tFileList_1Process(FILE_02.java:9157)
at mod_file_02.file_02_0_1.FILE_02.tRowGenerator_5Process(FILE_02.java:8226)
at mod_file_02.file_02_0_1.FILE_02.tFileInputDelimited_2Process(FILE_02.java:7340)
at mod_file_02.file_02_0_1.FILE_02.runJobInTOS(FILE_02.java:12170)
at mod_file_02.file_02_0_1.FILE_02.main(FILE_02.java:11954)
2014-08-07 12:43:35|bm9aSI|bm9aSI|bm9aSI|MOD_FILE_02|FILE_02|Default|6|Java
Exception|tMap_1|java.lang.NullPointerException:null|1
[statistics] disconnected
enter image description here
You should be able to do this mid flow in a tMap or a tJavaRow.
Simply read the number in as an integer (or other numeric data type) and then add your increment to it.
A really simple example might look like this:
Here we have a tFixedFlowInput that has some hard coded values for the job:
And we run it through a tMap where we add 1 to the age column:
And finally, we output it to the console in a table:
EDIT:
As Gabriele B has pointed out, this doesn't exactly work when reading and writing to the same flat file as Talend claims an exclusive read-write lock on the file when reading and keeps it open throughout the job.
Instead you would have to write the incremented data to some other place such as a temporary file, a database or even just to the buffer and then read that data in to a separate job which would then output the file you want and clean up anything temporary.
The problem with that is you can't do the output in the same process. I've just tried testing reading in the file in one child job, passing the data back to a parent job using a tBufferOutput and then passing that data to another child job as a context variable and then trying to output to the file. Unfortunately the file lock remains on it so you can't do this all in one self contain job (even using a parent job and several child jobs).
If this sounds horrible to you (it is) and you absolutely need this to happen (I'd suggest a database table sounds like a better match for this functionality than a flat file) then you could raise a feature request on the Talend Jira for the tFileInputDelimited to not hold the file open or to not insist on an exclusive read-write lock on the file.
Once again, I strongly recommend that you move to using a database table for this because even without the file lock issue, this is definitely not the right use of a flat file and this use case perfectly fits a database, even something as lightweight as an embedded H2 database.