Matlab - load results from parallel job that has not finished yet - matlab

Is there a way to retrieve the variables of a batch job that has not finished yet?
If not, how do I perform some kind of checkpointing, so I could retrieve intermediate results from a parallel job?

Well, there are a few things you can do. First, something like this to see if your job is done:
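while ~strcmp(jobHandle.State, 'finished')   % State values are lowercase
    tasks = jobHandle.Tasks;                 % all task objects of this job
    disp(tasks(1).State)                     % state of the first task
    disp(tasks(1).OutputArguments)           % empty until that task has finished
    pause(5)                                 % poll every few seconds instead of busy-waiting
end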
Inside that loop you have access to the job object and all of its task objects. The (impractical) example above shows some of the data you can read. Note that a task's OutputArguments are only filled in once that task has finished, so results from inside a still-running task have to be saved by the task itself, for example to a file. You can use that access to build whatever checkpointing scheme you want. See the documentation for more info. Good luck!

Related

A file prepared by one spring batch job is not accessible to other for deletion

I have a requirement to prepare a file in one job; another job, which runs once a day, sends the file to an external system and then deletes or moves it. When the second job tries to delete or move the file, it can't access it.
I tried setting writable to true when the file is created, running the jobs at separate times (one job at a time), and adding "delete" as a step in the same job. Nothing worked.
I am using file.delete(). I also tried Files.deleteIfExists().
I suspect the first job is not assigning proper permissions, but I don't know a way around it or how to set permissions in Spring Batch.
Are these jobs run by the same user? i.e. Same user and permissions?
Also, what is the actual error message? Does it say permission denied? If so, it is likely an OS restriction rather than a Spring Batch/Java limitation.
An easier solution would be to add a step to the first job that sends the files as part of that job, and drop the job that only transfers the files.
Answering my own question 😀. Hope it helps someone.
The issue was that the last ItemWriter was still holding the file because I was using a composite writer. With CompositeWriter, the beforeStep and afterStep methods of the delegate writers are "hidden", so you have to call them explicitly. I chose to write a custom writer that explicitly calls writer.close().
Adding an afterStep method and calling super.close() should also work, though I have not tried that out.

How to use Counters in apache crunch

In Apache Crunch, there is a method named increment("any enum").
I used increment(TOTAL_IDS);, but where can I see the counter values? The counters do not show up in the logs after the job completes.
What am I missing there?
You should be able to see your counters in the tracking URL of the MapReduce job (if you are running MapReduce), or you can extract the counters from the pipeline. It would be useful if you could share the code that increments the counters. Is it in your DoFn, or in your cleanup method?
Regards

Talend job batch processing

I am exploring Talend at work, and I was asked whether Talend supports batch processing, as in running a job in multiple threads. After going through the user guide I understood that threading is possible with subjobs. I would like to know if it is possible to run a job with a single action in parallel.
Talend has excellent multithreading support. There are two basic methods for this. One method gives you more control and is implemented using components. The other method is implemented as a job setting.
For the first method see my screenshot. I use tParallelize to load three files into three tables at the same time. Then when all three files are successfully loaded I use the same tParallelize to set the values of a control table. tParallelize can also be connected to tRunJob as easily as a subjob.
The other method is described very well in Talend Help: Run Jobs in Parallel
Generally I recommend the first method because of the control it gives you, but if your job follows the simple pattern described in the help link, that method works as well.

How to make a Sequential Http get calls from locust

In a Locust load test environment, tasks are defined and then called randomly.
But what if I want a task to be performed right after a specific task? How do I do that?
For example: after every 'X' URL call I want the 'Y' URL to be called, based on the response of 'X'.
In my experience, I found that it's better to model Locust tasks as completely independent of each other, and each of them covering a user scenario or behavior (eg. customer logs in, searches for a book and adds it to the cart). This is mostly because that's a closer simulation of the user's behavior.
Have you tried just having the multiple requests on the same task, and just if / else based on your responses? This slide from Carl Byström's talk follows said approach.
You just have to make the GETs or POSTs sequentially. When you define your task, do something like this:
from locust import task

@task(10)
def my_task(l):
    # Requests inside one task run in the order written: X first, then Y.
    l.client.get('/X')
    l.client.get('/Y')
There's an option to create a custom task set inherited from TaskSequence class.
Then you should add seq_task decorators to all task set methods to run its tasks sequentially.
https://docs.locust.io/en/latest/writing-a-locustfile.html#tasksequence-class
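The "call Y based on the response of X" idea from the answers above can be sketched roughly like this. It is only a sketch: x_then_y and FakeClient are hypothetical names, and the only assumption about the client is that get(path) returns an object with a status_code attribute, as the requests-based Locust client does. FakeClient exists purely so the example runs without a server.

```python
from types import SimpleNamespace


def x_then_y(client):
    """Call /X, then call /Y only if /X succeeded; return the paths requested."""
    called = ["/X"]
    response = client.get("/X")
    # Branch on the response of X, as asked in the question.
    if response.status_code == 200:
        client.get("/Y")
        called.append("/Y")
    return called


class FakeClient:
    """Hypothetical stand-in for the HTTP client, used only for illustration."""

    def __init__(self, status_for_x):
        self.status_for_x = status_for_x
        self.requests = []

    def get(self, path):
        # Record the request and answer /X with the configured status code.
        self.requests.append(path)
        status = self.status_for_x if path == "/X" else 200
        return SimpleNamespace(status_code=status)
```

Inside a Locust task you would presumably call x_then_y(l.client) instead of passing a fake client, so both requests stay in one task and always run in order.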

Run a single job in parallel

How can we run a single job in parallel with different parameters in Talend?
The answer is straightforward, but rather depends on what you want, and whether you are using free Talend or commercial.
As far as parameters go, make sure that your jobs are using context variables - this is the preferred way of passing parameters in.
As for running in parallel, there are a few options.
Talend's studio is a java code generator, so you can export your job (it's just java code) and run it wherever you want. How you invoke it is up to you - schedule it, invoke it N times manually, your call. Obviously, if your job touches shared resources then making it safe to run in parallel is up to you - the usual concurrency issues apply.
If you have the commercial product, then you can use the Talend admin centre (TAC). The TAC allows you to schedule a job more than once with different contexts. Or, if you want to keep the parallelization logic inside your job, then consider using the tParallelize component in one job to run another job N times.
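For the "export it and invoke it N times" route, a small launcher along these lines might do. This is a sketch under assumptions: it assumes the exported job's launcher script accepts --context_param key=value arguments for context variables, the launcher path you pass in is yours to supply, and build_command and run_in_parallel are hypothetical helper names. Making the job itself safe to run concurrently is still up to you.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def build_command(launcher, params):
    """Build one command line; each context variable becomes --context_param key=value."""
    command = list(launcher)
    for key, value in params.items():
        command += ["--context_param", f"{key}={value}"]
    return command


def run_in_parallel(launcher, param_sets, max_workers=4):
    """Run the launcher once per parameter set; return exit codes in input order."""
    commands = [build_command(launcher, params) for params in param_sets]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Each worker thread blocks on one child process, so the jobs overlap.
        exit_codes = pool.map(lambda cmd: subprocess.run(cmd).returncode, commands)
        return list(exit_codes)
```

For example, run_in_parallel(["./my_job_run.sh"], [{"region": "eu"}, {"region": "us"}]) would start two copies of a (hypothetical) exported job, one per context value, and report their exit codes.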