How to move a file after reading it in IBM DataStage

I have one folder containing four files: sales_jan, sales_feb, debt_jan, and debt_feb. I created a separate job for sales and for debt. If I have already run the job for sales_jan and sales_feb arrives afterwards, I don't want to read sales_jan again; I only want to read the newest file that hasn't been processed yet. To read the files, I pass a pattern (e.g. sales_*), but that makes the stage reprocess sales_jan even though it has already been processed. I want to move each file that has been read into another folder. How exactly do I do this in IBM DataStage? If there's no way to do it, what would you suggest instead? Any ideas would be appreciated.

The easiest solution is to use an after-job subroutine (ExecSH on Linux/UNIX, ExecDOS on Windows) to move the file to a different location.
Since you're using wildcards for the Sequential File stage, you're going to have to be a bit more clever in handling a situation where your job processes only some of the files. I would prefer to write this using a loop in a sequence, processing one file at a time, so that the move can be handled per-file.
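The per-file loop described above can be sketched outside DataStage as follows. This is only an illustration in Python, not DataStage code: the function name process_new_files, the folder layout, and the handle callback (which stands in for one run of the job on one file) are all assumptions made for the example.

```python
import shutil
from pathlib import Path


def process_new_files(inbox: Path, processed: Path, pattern: str, handle) -> list[str]:
    """Process each file matching `pattern` in `inbox`, then move it to `processed`.

    `handle` is a hypothetical callback standing in for one job run on one file.
    """
    processed.mkdir(parents=True, exist_ok=True)
    done = []
    for path in sorted(inbox.glob(pattern)):
        if not path.is_file():
            continue
        handle(path)  # process exactly one file
        # Move only after the file was handled, so a failure leaves it in the inbox.
        shutil.move(str(path), str(processed / path.name))
        done.append(path.name)
    return done
```

Because each file is moved right after it is handled, a later run with the same pattern (for example sales_*) only sees files that arrived since the previous run.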

You could keep a flag for every file your job has already read. For example, record a max date for each processed file; when a new file's date is later than the recorded max date, read only that newer file. This can be done with a simple Linux command in a sequence, or in a Transformer, just like Ray mentioned before.

Related

How to replace value in txt file with powershell from GitHub

I want to build a simple script that may be useful for others as well, but I have only very basic programming knowledge and can't do it myself without learning how to write powershell scripts from scratch.
What this script is supposed to do is, open an INI file (really just a txt), look for a variable with an assigned value and replace that value from a txt hosted on GitHub, save and then run a program.
This is for the tracker list of qBittorrent, since that feature still hasn't been implemented and the only other script that I could find that does this is for linux and mac, there seem to be none for windows.
The basic idea is this:
get-content "c:\users\[user]\appdata\roaming\qbittorrent\qbittorrent.ini"
# This is where pseudo code starts
get file from "[github-link.txt]"
save file to cache # keeping it is useless as it gets updated daily
find variable "Session\AdditionalTrackers=" in qbittorrent.ini
replace value of variable with content of cached file # this is what I struggle with most when looking for example code. Everything I could find specified the exact string that needed replacing, which in this case is quite long and may change with every update of the file.
overwrite original file
launch program qbittorrent.exe
end script
Conveniently or most likely deliberately all (most) of the tracker lists on GitHub are already formatted in a way that they can be directly pasted into the file without having to worry about formatting. Example.
I can totally understand if nobody wants to do the work, but I would greatly appreciate it and possibly others that are looking for a stopgap for the lacking feature.
If this already exists, go ahead and call me an idiot and while you're at it drop a link ;)
I just found a little tool called Power Automate and it pretty much does what I was looking for. It's not quite as elegant as a single click script but it does the job. Sadly I can't share the "flow" I built because, well, there is no option for it - thanks Microsoft. So, I'll try my best to write it out.
Not quite a "solution", but pretty close to it.
Here is the "flow":
get file from web // from github for example
read text from file // read downloaded .txt file
read text from file // read qBittorrent.ini
crop text // crop between flags in qBittorrent.ini use "Session\AdditionalTrackers=" as start and "Session\GlobalMaxRatio=" as end and save to cropVar2
crop text // crop before flag use "Session\AdditionalTrackers=" as flag and save to cropVar1
crop text // crop after flag use cropVar2 as flag and save to cropVar3
replace text // replace cropVar2 with content of downloaded file and save to cropVar2
write text to file // write cropVar1,cropVar2,cropVar3
end flow
Keep in mind that any change to qBittorrent.ini may change the order of the entries, which means you have to check that the flow is still correct after every update and after every change you make in the options. This is a massive kludge after all...
You can add fail-safes so that you won't break anything if the order has changed.
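The replacement step that the question struggles with (replacing the value without knowing the old value) can be done by matching only the key at the start of the line, which also avoids depending on the order of entries. Below is a minimal sketch in Python rather than PowerShell; the function name replace_ini_value is made up for this example, and it assumes the key and its value sit on a single line of the INI file.

```python
def replace_ini_value(ini_text: str, key: str, new_value: str) -> str:
    """Return `ini_text` with the value of `key` replaced by `new_value`.

    Only the line that starts with "key=" is touched; the old value is ignored.
    """
    prefix = key + "="
    out = []
    replaced = False
    for line in ini_text.splitlines():
        if line.startswith(prefix):
            line = prefix + new_value  # keep the key, swap the value
            replaced = True
        out.append(line)
    if not replaced:
        # Fail loudly instead of silently writing an unchanged file.
        raise KeyError(key)
    return "\n".join(out) + "\n"
```

To use it, read qBittorrent.ini into a string, call the function with "Session\AdditionalTrackers" as the key and the downloaded list as the new value, and write the result back before launching the program.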

Talend - Extract FileName from tLogRow/tSort

I am new to Talend and just trying to work my way through it.
Problem Statement
I need to process a positional file, from a list of files. Need to identify the latest file first and then process only that file. I was able to identify the most updated file. And then I was able to create another flow which processes the positional file. The problem is combining these two flows so that I am able to identify the most recent file and have just that one processed.
Tried so far
I have been trying to extract the most recent file from a list within a directory. I iterated through all the files and retained their properties in a buffer. After this sub-task completed, I read through the buffer, sorted it by descending mtime, extracted the top record, and was able to print it using tLogRow.
All seems to be fine except I don't know how to use the filename now for next task.
I am certain this is very rudimentary, but I'll be honest: I've been scouring the internet and help pages for quite some time now, with no success.
Any pointers would help.
The job flow is attached for your reference.
First of all, you can simplify your job by using tFileList's capabilities. It can sort files by their modified date:
Next, use tIterateToFlow to convert each iteration to a row:
(String)globalMap.get("tFileList_1_CURRENT_FILEPATH")
and tSampleRow with a range of "1", to get the most recent file.
Then store the result in a global variable. In the next subjob, just use that global variable as your filename in tFileInputPositional.
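The logic of the steps above (list the files, order them by modification time, keep the first one, hand its path to the next step) can be sketched in plain Python. This is not Talend code; newest_file is a hypothetical helper used only to illustrate the selection.

```python
from pathlib import Path
from typing import Optional


def newest_file(directory: Path, pattern: str = "*") -> Optional[Path]:
    """Return the most recently modified file matching `pattern`, or None if none match."""
    files = [p for p in directory.glob(pattern) if p.is_file()]
    # max() over modification time plays the role of "sort descending, take row 1".
    return max(files, key=lambda p: p.stat().st_mtime, default=None)
```

The returned path corresponds to the value that would be stored in the global variable and used as the filename in the next subjob.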

Talend tWaitForFile insufficiency

We have a producer process that runs continuously and writes files into a specific folder. We have to read the files one by one using Talend, and there are two issues:
First: tWaitForFile only sees files that existed before it started, so files created after the component starts are not visible to it.
Second: there is no way to know whether the producer process has released a file, so it may be read before it is completely written. The _wait_release_ parameter of tWaitForFile does not work on Linux!
So how can we make Talend read only completely written files from a directory whose number of files keeps growing?
I'm not sure what you mean by your first issue. tWaitForFile has options to trigger when files are created, modified or deleted in a folder.
As for the second issue, your best bet here is for the file producer to be creating an OK or control file which is a 0 byte touch when it has finished writing the file you want.
In this case you simply look for the appearance of the OK file and then pick up the relevant completed file. If you give the two files the same name but a different file extension (the OK file is typically called ".OK"), then this should be easy enough to look for. So you would set your tWaitForFile to look for "*.OK" files, connect it via an iterate link to a tFileInputDelimited (in case you want to pick up a delimited text file), and then declare the file name as ((String)globalMap.get("tWaitForFile_1_CREATED_FILE")).substring(0,((String)globalMap.get("tWaitForFile_1_CREATED_FILE")).length()-3) + ".txt"
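As a rough illustration of the control-file idea outside Talend, the Python sketch below derives the data file name from each ".OK" marker and returns only data files whose marker exists. The helper names data_file_for and completed_files, and the ".txt" data extension, are assumptions for this example.

```python
from pathlib import Path

OK_SUFFIX = ".OK"


def data_file_for(ok_path: str, data_ext: str = ".txt") -> str:
    """Map a marker path such as 'orders.OK' to its data file 'orders.txt'."""
    if not ok_path.endswith(OK_SUFFIX):
        raise ValueError(f"not a marker file: {ok_path}")
    # Same idea as the substring(0, length - 3) expression: drop ".OK", add the extension.
    return ok_path[: -len(OK_SUFFIX)] + data_ext


def completed_files(directory: Path) -> list[Path]:
    """Return data files that the producer has flagged as finished with a marker."""
    ready = []
    for marker in sorted(directory.glob("*" + OK_SUFFIX)):
        data = Path(data_file_for(str(marker)))
        if data.exists():
            ready.append(data)
    return ready
```

A data file without a marker is never returned, so a file that is still being written is not picked up.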
I've included some screenshots to help you below:

How to replace a file inside a zip on iOS?

I need to replace a file inside a zip on iOS. I tried many libraries with no results. The only one that kind of did the trick was zipzap (https://github.com/pixelglow/zipzap), but it is no good for me: what it really does is re-zip the file with the change, and besides this process being too slow for me, it also loads the whole file into memory and makes my application crash.
PS: If this is not possible or way too complicated, I can settle for renaming or deleting a specific file.
You need to find a framework where you can modify how data is read and written. You would then use some form of mmap to essentially read and write small chunks. Searching on NSData and mmap resulted in this Post, however you can use mmap from the POSIX level too. PS: it will be slower than using pure memory; there is no way around that.
Got it WORKING!! JXZip (https://github.com/JanX2/JXZip) does exactly what I need. It links to libzip (http://www.nih.at/libzip/), which is a fully equipped library for working with ZIP files, and JXZip has all the necessary Objective-C wrapper code. Thanks for all the replies.
For archive purposes, as the author of zipzap:
Actually zipzap does exactly what you want. If you replace an entry within a zip file, zipzap will do the minimum necessary to update it: it will skip writing all entries before the replaced entry, then write out the entry, then write out all entries after the replaced entry without recompressing. At the moment, it does require sufficient memory for the entries after the replaced entry though.

Pipe multiple files into a zip file

I have several files in a GridFS Document Store and what I'd like to do is to pipe this data into a zip file via stdin in NodeJS. So that I will end up with a zip file containing all these files.
Now my question is how can I give the files a valid filename inside of the zip file. I think I need to emulate/fake a file header containing the filename?
Any help is appreciated!
Thanks
I had problems when writing zip files with Node.js not long ago. I ended up doing something similar to what is described in Zip archives in node.js
I can't help you directly with your problem, but at least I hope I can point out some things:
Don't try to use node-archive. Even though the description says it can create zip files, the moment I read the source code (since documentation is nonexistent) I realized that's just a lie. It only exposes methods for reading.
Using zip by spawning a process, like recommended on the provided link, seems to be the best way. Something that would work is copying the files to a local folder with whatever name you desire and then calling the zip command, just to delete the files afterwards.
The other option, which seems ok, is to use zipper (https://github.com/rubenv/zipper, although better just use npm). The reason I'm not really wishing to use it is because there's not that much flexibility, it seems to have been done in a day and it hasn't been modified since the first commit, so I'm not sure it will receive maintenance (sure, you could just fork it...).
I swear the day I have an entire free weekend with no work I will write a freaking module that does this as complete as possible. It's silly that there isn't and it shouldn't be that much struggle. blablablarant.
Edit:
Not sure if it was there before, but now I've been using the node-compress module (also using gzippo). It works fine.
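On the original question of giving each entry a valid filename: zip libraries generally take the entry name as an argument when data is added, so there is no need to fake a file header by hand. The sketch below shows the idea with Python's standard zipfile module rather than Node.js; zip_from_memory is a hypothetical helper, and the dictionary of names to bytes stands in for data read from GridFS.

```python
import io
import zipfile


def zip_from_memory(files: dict[str, bytes]) -> bytes:
    """Build a zip archive in memory from a mapping of entry name to content."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for name, data in files.items():
            # The first argument becomes the filename stored inside the archive.
            archive.writestr(name, data)
    return buffer.getvalue()
```

The returned bytes can be written to disk or sent over a response stream; nothing has to be copied to a local folder first.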