Talend How To Pass Last Modified File Into TFileInputDelimited? - talend

I have searched all over, and read this post.
But it doesn't seem complete and doesn't work.
The situation: I need to get the last modified file from a directory on the local machine. I then need to pass that file into the fileinputdelimited component.
I currently have:
tfilelist --> iterate --> titeratetoflow --> tsamplerow
-->tflowtoiterate -> tinpufiledelimited ---> tlogrow (just to make sure its pulling the right file)
But it doesn't work. I have configured it. so that titeratetoflow has a column called
"FileName" with "((String)globalMap.get("CURRENT_FILE"))" as the value,
"FileDirectory" with ((String)globalMap.get("CURRENT_FILEDIRECTORY")) as value, and
"FileAndDirectory" with ((String)globalMap.get("CURRENT_FILEPATH")) as value.
The tsamplerow is limited to "1".
The tiflowtoiterate is set so that
"FileNameOnly" is value of "FileName"
"FileDirectoryOnly" is "FileDirectory" and
"FilePathComplete" is "FileAndDirectory"
In the File location field of the tinputfiledelimited, I have "((String)globalMap.get("FilePathComplete"))"
When it runs I get an error saying cannot find file or path. If I cut out the fileinput component and have it send straight to the tlogrow, it shows a single line of blank entry.
Any ideas?

I'm not sure if you've just slightly misconfigured the job here but it seems to work fine for me.
Here's a few screenshots showing my job design:
The only thing I can think of just by looking at your post is that you might have slightly messed up the key value pair combinations in the tFlowToIterate. I tend to find that the default settings there work fine pretty much all of the time and it makes it a little more obvious what it's doing as well.
EDIT: Actually, it looks like you might be using the wrong values in your tIterateToFlow. The tFileList will throw the values for the file paths etc in to the global map but it will preface it with the unique component name. If you hit ctrl+space in the value window it should prompt you with a list of available values (these are also specified in the "Outline" tab of the studio). It typically makes an implicit conversion to String but for this you will need to explicitly convert it so use .toString() instead of (String).

Another way to get last modified file is as below
tFileList(sorted DESC by file modified date) ------> tFixedFlowInput (schema - filename, filenumber) ----->tHashOutput
here in tFixedFlowInput
filename = file(String)globalMap.get("tFileList_1_CURRENT_FILEPATH")+"/"+(String)globalMap.get("tFileList_1_CURRENT_FILE")
filenumber = (Integer)globalMap.get("tFileList_1_NB_FILE")
What above will accomplish is get list of all files in the directory with their number/rank - where the file last modified will have file number =1 and next to that will have 2...and so on.
Now on SubJobOK of above tFileList you can have tHashInput which will read from above tHashOutput and filter only row where filenumber==1 - which means the last modified file.
tHashInput (link to tHashoutput) ---->tFilterRow(filenumber==1)------>tLogRow
One reason why you are getting null is probably you have used globalMap.get("CURRENT_FILEPATH) instead of globalMap.get("tFileList_1_CURRENT_FILEPATH")

The Simple Solution for above problem could be as below:
tFileList(sorted ASC by file modified date)--> tIterateToFlow --> tJava( just to end the subjob).
Then on
subjob ok --> tfileinput ( use (String)globalMap.get("tFileList_1_CURRENT_FILE") or (String)globalMap.get("tFileList_1_CURRENT_FILEPATH") as a file name/file path)
Explanation:
Since tFileList iterates all the files in ASC order, it will always have Latest file name stored in globalMap for the last iteration. The list is only iterated till tIterateToFlow hence after this component (String)globalMap.get("tFileList_1_CURRENT_FILE") will always give the last file name from the iterated list, which is the latest file in out case.
Main Flow :
Component View:

Related

Is there a difference in performance between set/save when saving columns to tables?

I have a small utility that checks for new columns for an intraday hdb and adds new columns.
At the moment I am using :
.[set;(pth;?[data;();();cls]);{[p;e] .log.error[.z.h;"Failed to save to path [",string[p],"] with error :",e]}[pth;]]
where path is :
`:path_to_hdb/2022.03.31/table01/newDummyThree
and
?[data;();();cls] // just an exec statement
Would it make any difference to use save instead:
.[save;(pth;?[data;();();cls]);{[p;e] .log.error[.z.h;"Failed to save to path [",string[p],"] with error :",e]}[pth;]]
Yes. If you are adding entire columns to a table then you might want to store it splayed, i.e. as a directory of column files rather than as a single table file. This means using set rather than save.
https://code.kx.com/q/kb/splayed-tables/
But test actual example updates.
As mentioned in the documentation for save:
Use set instead to save
a variable to a file of a different name
local data
So set has the advantage of not requiring a global and you can name the file a different name to the name of your in-memory global variable.
There is no difference in how they serialise/write the data. In fact, save uses set under the covers anyway:
q)save
k){$[1=#p:`\:*|`\:x:-1!x;set[x;. *p]; x 0:.h.tx[p 1]#.*p]}'
By the way - you can't use save in the way that you've suggested in your post. save takes a symbol as input and this symbol is the symbol name of your global variable containing the data you want to write.

Dataset Empty parameter value

I have an xml dataset, I want to parametrize the compression type to treat .xml and .xml.gz files with the same pipeline :
When I put 'gzip' value in compression type it reads xml.gzip file. I want to know what value I should put to read uncompressed .xml file because it does not accept empty value. It is able to read xml file just when I delete the compression_type parameter
You should pass "None" and it should work out .
I feel "None" is more of a workaround in this particular case. "None" is still a string value, not empty.
In my scenario right now, I have an Excel dataset. I want to make every parameter as generic as possible, including the file path/name, sheet name, and the range. The value of "Range" under Connection tab allows empty value. However if I specify it as #dataset().DataRange and leave my parameter DataRange empty, I cannot preview the data or submit the pipeline because it complains that the value cannot be empty.

Trouble with VB6 form location (top,left) and value type (i.e., integer)

VB6 will REPORT a LEFT location greater than max integer when the form is in an auxiliary screen. However, I cannot assign that value as a LEFT specification.
I manually drag frmSecondary to the secondary monitor (located to the right of my primary monitor).
I click a button within frmSecondary in order to write the value of frmSeconday.Left to a file. It writes 49950.
I close the program, then re-open.
I read the value from the file (and the value IS 49950).
I try to programmatically assign that value to frmSecondary.Left
Failure (works OK if the value is less than "largest integer").
Expected result: I should be able to assign the value that was written to file. Better yet, any ideas of how to programmatically place frmSecondary at it's "proper" location on the secondary monitor?
I simply read the value from file:
Input #fnum, frmSecondary.left
I've tried, without success:
Dim lft as variant
Input #fnum, lft
frmSecondary.left = lft
I have confirmed that the value being read from file is, in fact, 49950. I do this confirmation two ways...one is by looking at the file; the second is by putting a "stop" in the program and showing the value of lft.
TOTAL NONSENSE
Ignore this entire post. I just discovered my error -- it's so embarrassing, I don't even want to describe it. Nothing quite like spending an entire afternoon (to say nothing of YOUR time) chasing garbage.

How do I prevent users to use thousands separator in FileMaker Pro?

In FileMaker Pro, when using number field, the user can choose to use a thousand separator or not. For example, if I have a database with a field for the price of an item, the user can either enter 1,000 or 1000.
I am using my database to generate an XML file that needs to be uploaded. The thing is, that my XML scheme dictates that only a value of 1000 is allowed and not 1,000. Therefore, I want to either automatically remove the comma, or (my preference in this case) alert the user when trying to enter a value with a thousand separator.
What I tried is the following.
For the field, I am setting Validation options. For example:
Require Strict data type: Numeric Only
Validated by calculation: Position ( Self ; ","; 1 ; 1 ) = 0
Validated by calculation: Self = Substitue ( Self, ",", "")
Auto-enter calculation: Filter( Self ; "0123456789." )
Unfortunately, none of these work. As the field is defined as a number (and I want to keep it like this, as I am also performing calculations based on this number), the Position function and the Substitute function apparently ignore the thousand separator!
EDIT:
Note that I am generating my XML by concatenating a string, for example:
"<Products><Product><Name>" & Name & "</Name><Price>" & Price & "</Price></Product></Product>"
The reason is that what I am exporting is dependent on the values in my database. Therefore, I am not using the [File][Export records...] function.
Auto-enter calculation will work, but you need to uncheck the box "Do not replace existing value of field" (which is checked by default).
I'd suggest using the calculation GetAsNumber(self) as the auto-enter calc. If it should only contain integers, wrap that in a call to Int()
I am using my database to generate an XML file that needs to be uploaded. The thing is, that my XML scheme dictates that only a value of 1000 is allowed and not 1,000.
If this is only a problem when you export, why not handle it when exporting?
If you are exporting as XML using XSLT, you can add an instruction to
your stylesheet to remove the comma from all number fields;
Alternatively, you can export from a layout where the field is
formatted to display without the comma and select the Apply current's layout data formatting to exported data option when
exporting.
Added:
Perhaps I should have clarified. I am not using the export function to generate the XML as there is some logic involved in how the XML should be formatted (dependent on the data that I want to export). What I do instead is that I make a string where I combine XML-tags and actual values from the database.
IMHO, you're making a mistake by not taking advantage of the built-in XML/XSLT export option. Any imaginable logic can be implemented this way, without burdening your solution with the fragile task of creating a valid XML.
In any case, if you're using the field in a calculation, you can replace all references to it with:
GetAsNumber (YourField )
to get an unformatted, numeric-only, value.
Your question puzzles me. As far as I know, FileMaker does not store the thousands separator, but rather offers it only as a display option.
That's also why those functions can't find it.
Are you sure you are exporting the raw data and not a "formatted as layout" variant?

Iteration among components in talend 5.6

I have a filename and I have to check if it is there in physical location. If it is there then I have to increment like filename_1,filename_2 etc and check If filename_1 exists in physical location and in database if it is there in any one of these then increment again and check in both physical location,database until I get a file name that is not there in both places.
But any link is not going back for iterating though components in talend.
when i found a filename not in both places then i have to create a file with that name in physical location and update in database.
Use these components :
tLoop : for loop, from 1 to 9999 (as infinity)
tFileExists : check for "filename_"+((Integer)globalMap.get("tLoop_1_CURRENT_VALUE"))
if not exists : write new file with this name "filename_"+((Integer)globalMap.get("tLoop_1_CURRENT_VALUE")) and update the tLoop_1_CURRENT_VALUE beyond the max of the loop to get off the loop, use a tJava and this line of code globalMap.put("tLoop_1_CURRENT_VALUE",9999))
Dont forget to check the first file filename separatly before the loop.