I just started using Spring Batch and I don't know how I can implement my business need.
The behavior is quite simple: I have a directory where files are saved. My batch should detect those files, import them into my database, and move each file to a backup directory (or to an error directory if the data can't be saved).
So I create chunks of one file. The reader retrieves the files and the processor imports the data.
I read that Spring Batch creates a global transaction for the whole chunk, and that only the ChunkListener is called outside the transaction. That seems fine, but its input parameter is a ChunkContext. How can I retrieve the file handled in the chunk? I don't see where it's stored in the ChunkContext.
I need to be sure the DB accepts the insertions before choosing where the file must be moved. That's why I need to do that after the commit.
Here is how you can proceed:
Create a service (based on a file system watching API or something like Spring Integration directory polling) that launches a batch job for each new file.
The batch job can use a chunk-oriented step to read data and write it to the database. In that job, you can use a job/step execution listener or a separate step to move files to the backup/error directory according to the success or failure of the previous step.
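The flow described above (import inside one transaction, then move the file once the outcome of the commit is known) can be sketched outside Spring Batch. This is only a minimal, language-neutral illustration in Python, not Spring Batch code: the `records` table, the one-line-per-row file layout, and the names `import_file` and `process_file` are all assumptions made for the example.

```python
import shutil
import sqlite3
from pathlib import Path


def import_file(path: Path, db: sqlite3.Connection) -> None:
    """Insert every non-empty line of the file in a single transaction."""
    rows = [(line,) for line in path.read_text().splitlines() if line.strip()]
    with db:  # commits on success, rolls back if an exception is raised
        db.executemany("INSERT INTO records(line) VALUES (?)", rows)


def process_file(path: Path, db: sqlite3.Connection,
                 backup_dir: Path, error_dir: Path) -> Path:
    """Import one file, then move it according to the outcome of the commit."""
    try:
        import_file(path, db)
    except (sqlite3.Error, OSError, UnicodeDecodeError):
        target_dir = error_dir   # nothing was committed for this file
    else:
        target_dir = backup_dir  # the commit succeeded
    target_dir.mkdir(parents=True, exist_ok=True)
    return Path(shutil.move(str(path), str(target_dir / path.name)))
```

The move happens only after the transaction has either committed or rolled back, which is the same ordering a job/step execution listener or a follow-up step would give you in the batch job.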
I am currently using the Community edition on a Linux server and have configured the db2audit process,
which generates audit files at the configured location. A user then has to manually run the db2audit archive command to archive the log files, run the db2audit extract command to extract the archived files into flat ASCII files, and finally load those files into the respective tables.
Only then can we analyze the logs by querying the tables. The whole process requires a lot of manual intervention.
Question: is there any configuration setting or utility
that can generate log files, including the SQL statement, event, host, session ID, timestamp and so on, instantly and automatically?
I need to set up an instant logging mechanism that generates flat log files for any SQL executed or any event triggered at the database level in DB2 on a Linux server.
A cronjob runs every 3 hours to download a file using SFTP. The scheduled program is written in Perl and the module used is Net::SFTP::Foreign.
Can Net::SFTP::Foreign download files that have been only partially uploaded over SFTP?
If so, do we need to check the file's modification date on the SFTP server to confirm that the copy has completed?
Suppose someone is uploading a new file over SFTP and the upload/copy is still in progress. If a download is attempted at the same time, do I need to code for the possibility of fetching only part of the file?
It's not a question of which SFTP client you use; that's irrelevant. It's how the SFTP server handles the situation.
Some SFTP servers may lock the file being uploaded, preventing you from accessing it, while it is still being uploaded. But most SFTP servers, particularly the common OpenSSH SFTP server, won't lock the file.
There's no generic solution to this problem. Checking for timestamp or size changes may work for you, but it's hardly reliable.
There are some common workarounds to the problem:
Have the uploader upload a "done" file once the upload finishes. Make your program wait for the "done" file to appear.
You can have a dedicated "upload" folder and have the uploader (atomically) move the uploaded file to a "done" folder. Make your program look in the "done" folder only.
Have a file naming convention for files being uploaded (".filepart") and have the uploader (atomically) rename the file after upload to its final name. Make your program ignore the ".filepart" files.
See (my) article Locking files while uploading / Upload to temporary file name for an example of implementing this approach.
Also, some FTP servers have this functionality built-in. For example ProFTPD with its HiddenStores directive.
A gross hack is to periodically check for file attributes (size and time) and consider the upload finished, if the attributes have not changed for some time interval.
You can also make use of the fact that some file formats have a clear end-of-file marker (like XML or ZIP), so you can tell when you have downloaded an incomplete file.
For details, see my answer to SFTP file lock mechanism.
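On the download side, the naming-convention and "done"-file workarounds above both come down to filtering a directory listing. A minimal sketch, assuming the listing is already available as a list of names; the `.done` marker suffix and both function names are hypothetical, and only `.filepart` comes from the convention mentioned above:

```python
from typing import Iterable, List

PARTIAL_SUFFIX = ".filepart"  # naming convention for uploads still in progress
DONE_SUFFIX = ".done"         # hypothetical name for the "done" marker file


def completed_by_rename(names: Iterable[str]) -> List[str]:
    """Rename convention: skip anything that still carries the partial suffix."""
    return sorted(n for n in names if not n.endswith(PARTIAL_SUFFIX))


def completed_by_marker(names: Iterable[str]) -> List[str]:
    """Marker convention: a file is ready only if '<name>.done' is also listed."""
    listing = set(names)
    return sorted(
        n for n in listing
        if not n.endswith(DONE_SUFFIX) and n + DONE_SUFFIX in listing
    )
```

Either filter only works if the uploader follows the convention, which is why these are workarounds rather than a generic solution.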
The easiest way to do that, when the upload process is also under your control, is to upload files using temporary names (for instance, foo-20170809.tgz.temp) and rename them once the upload finishes (the Net::SFTP::Foreign put method supports the atomic option, which does just that). Then, on the download side, filter out the files whose names correspond to temporary files.
Anyway, the Net::SFTP::Foreign get and rget methods can be instructed to resume a transfer by passing the option resume => 1.
Also, if you have full SSH access to the SFTP server, you could check if some other process is still writing to the file to be downloaded using fuser or some similar tool (though, note that even then, the file may be incomplete if for instance there is some network issue and the uploader needs to reconnect before resuming the transfer).
You can check the size of the file:
1. Connect to SFTP.
2. Check the file size.
3. Sleep for 5 to 10 seconds.
4. Check the file size again.
5. If the size did not change, download the file; if it changed, go back to step 3.
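The polling loop above can be sketched as follows. This is only an illustration: `get_size` is a hypothetical callable standing in for whatever your SFTP client uses to stat the remote file, and `max_checks` is an added assumption so the loop cannot run forever. An unchanged size only suggests that the upload has finished; it does not prove it.

```python
import time
from typing import Callable, Optional


def wait_until_stable(get_size: Callable[[], int],
                      interval: float = 5.0,
                      max_checks: int = 10,
                      sleep: Callable[[float], None] = time.sleep) -> Optional[int]:
    """Return the size once two consecutive checks agree, or None on give-up."""
    previous = get_size()
    for _ in range(max_checks):
        sleep(interval)           # wait before checking again
        current = get_size()
        if current == previous:   # size unchanged: treat the upload as finished
            return current
        previous = current        # still growing: keep polling
    return None
```

The `sleep` parameter is injectable only so that the function can be exercised without real delays.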
I have a function which should take an executable file as an argument, execute it, and return the result. This function should run asynchronously, so I'm using Celery. I want to use multiple computers as workers, so each worker should be able to access the executable file. However, since the executable files are uploaded by the moderators, it's not an option to put a copy of each file on each worker by hand. So what would be the best way to handle this?
The only option I could think of was storing the files in the database. The function would retrieve the file from the DB, store it temporarily, execute it, remove the file, and return the result.
Is this a good approach? Are there any better ways to handle this?
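For reference, the store-temporarily, execute, and remove part of that idea could look roughly like this. It is a minimal sketch that assumes a POSIX worker and that the bytes have already been fetched from the database; `run_stored_executable` is a hypothetical name, and in practice the body would sit inside the Celery task.

```python
import os
import stat
import subprocess
import tempfile
from typing import Sequence


def run_stored_executable(blob: bytes, args: Sequence[str] = (),
                          timeout: float = 30.0) -> subprocess.CompletedProcess:
    """Write the bytes to a private temp file, execute it, and always clean up."""
    fd, path = tempfile.mkstemp(prefix="job-")
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(blob)
        os.chmod(path, stat.S_IRWXU)  # owner-only read/write/execute
        return subprocess.run([path, *args], capture_output=True,
                              timeout=timeout, check=False)
    finally:
        os.remove(path)  # runs even if execution fails or times out
```

The returned object exposes `returncode`, `stdout` and `stderr`, which is what the task would send back as its result.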
I've written code that creates full backups of my ESENT database using the JetBeginExternalBackup API.
Following the MSDN guidelines, I backed up every file returned by JetGetAttachInfo and JetGetLogInfo.
I made the backup, erased the old database, and copied the backup data to the database folder.
The DB engine was unable to start; the JetInit error code is "JET_errMissingLogFile".
I've checked the backup: it contains only the database file and the "<inst>XXXXX.log" log files. It lacks the current log file (I'm using circular logging, BTW).
Is there any way to restore such backup?
I don't want to use the JetExternalRestore API because it's too complex: I don't need to restore to another location, I don't understand why there are 3 input folders rather than 2, and I don't know which values to supply for the genLow and genHigh arguments.
I do need external backups: the ESENT database is used by ASP.NET on a remote server, and I'm backing it up over the Internet.
Or, maybe there's a way to retrieve the name of the current log file, and I should just add it to the backup?
Thanks in advance!
P.S. I've got no permission to spawn processes on my web server, so using eseutil.exe is not an option.
Unpack all backed up files to a single folder.
Take the name of your main database file and replace its extension with .pat. Create a zero-length file with that name, e.g. database.pat.
After this simple step, call the JetRestoreInstance API; it will restore the backup from that folder.
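The file-preparation step can be scripted. A minimal sketch, assuming the backup has already been unpacked into a single folder; `add_pat_placeholder` is a hypothetical helper name, and the JetRestoreInstance call itself is not shown:

```python
from pathlib import Path


def add_pat_placeholder(backup_dir: Path, database_file: str) -> Path:
    """Create the empty '<database>.pat' file next to the unpacked backup files."""
    pat_path = (backup_dir / database_file).with_suffix(".pat")
    pat_path.write_bytes(b"")  # guarantees a zero-length file
    return pat_path
```

For a database file named database.edb this creates database.pat in the same folder.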
I have lost my "Trak.db", but its log file is still available. Is it possible to recover the database from the log file?
The best you can do is to run DBTran against the log file; it will generate a SQL file containing the statements that were executed while that log was in use. Of course, whether or not the log contains all your data depends on how you were logging, truncating, and backing up. Technically it is possible if you have all your logs and the schema.
For reference: http://www.ianywhere.com/developer/product_manuals/sqlanywhere/0901/en/html/dbdaen9/00000590.htm