Spring Batch pause and then continue - spring-batch

I'm writing a job that reads x rows from an Excel file and should then pause for an hour before it continues with the next x rows.
How do I do this?
I have a job.xml file which contains the following. The subscriptionDiscoverer fetches the file and passes it to the processor. The subscriptionWriter should write another file when the processor is done.
<job id="subscriptionJob" xmlns="http://www.springframework.org/schema/batch" incrementer="jobParamsIncrementer">
    <validator ref="jobParamsValidator"/>
    <step id="readFile">
        <tasklet>
            <chunk reader="subscriptionDiscoverer" processor="subscriptionProcessor" writer="subscriptionWriter" commit-interval="1" />
        </tasklet>
    </step>
</job>
Is there some kind of timer I could use or is it some kind of flow structure? It's a large file of about 160000 rows that should be processed.
I hope someone has a solution they would like to share.
Thank you!

I'm thinking of two possible approaches for you to start with:
Stop the job and restart it (after an hour) at the last position. You can start by looking at how to change the BatchStatus to signal your intent to stop the job. See http://docs.spring.io/spring-batch/2.0.x/cases/pause.html or look at how Spring Batch Admin communicates the PAUSE flag (http://docs.spring.io/spring-batch-admin/reference/reference.xhtml). You may need to implement some persistence to store the position (row number) so the job knows where to resume processing. You can also use a scheduler to restart the job.
-or-
Add a ChunkListener and implement the following in afterChunk(ChunkContext context): check whether x rows have been read so far, and if so, run your pause mechanism (e.g., a simple Thread.sleep, or a more robust way of pausing the step). To get the number of rows read, you can use StepExecution.getReadCount(), reaching the StepExecution via ChunkContext.getStepContext().getStepExecution().
Do note that afterChunk is called outside the transaction as indicated in the javadoc:
Callback after the chunk is executed, outside the transaction.
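The pause check from the second approach can be kept separate from Spring Batch, which makes it easy to test. Below is a minimal sketch in plain Java. The class PauseGate and its methods are hypothetical names, not part of Spring Batch; the assumption is that afterChunk would pass the step's read count to pauseIfNeeded.

```java
import java.time.Duration;

/** Hypothetical helper (not a Spring Batch class): decides when a step should pause. */
final class PauseGate {
    private final long rowsPerBatch; // the "x" from the question
    private final Duration pause;    // e.g. Duration.ofHours(1)
    private long nextThreshold;      // read count at which the next pause is due

    PauseGate(long rowsPerBatch, Duration pause) {
        if (rowsPerBatch <= 0) {
            throw new IllegalArgumentException("rowsPerBatch must be positive");
        }
        this.rowsPerBatch = rowsPerBatch;
        this.pause = pause;
        this.nextThreshold = rowsPerBatch;
    }

    /** Returns true once each time the read count reaches or passes the next threshold. */
    boolean shouldPause(long readCount) {
        if (readCount < nextThreshold) {
            return false;
        }
        // Move the threshold past readCount so a large chunk triggers only one pause.
        nextThreshold = (readCount / rowsPerBatch + 1) * rowsPerBatch;
        return true;
    }

    /** Blocks the calling thread for the configured pause when a pause is due. */
    void pauseIfNeeded(long readCount) throws InterruptedException {
        if (shouldPause(readCount)) {
            Thread.sleep(pause.toMillis());
        }
    }
}
```

Sleeping inside the listener keeps the step's thread occupied for the whole pause, so this only suits jobs where that is acceptable.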

Veins: Get Tripinfo and emissions in output

I'm using Veins 4.6 with OMNeT++ 5.1.1 and trying to output the tripinfo of vehicles using the following configuration in the .sumocfg file:
<input>
    <net-file value="erlangen.net.xml"/>
    <route-files value="erlangen.rou.xml"/>
    <additional-files value="erlangen.poly.xml"/>
</input>
<time>
    <begin value="0"/>
    <end value="300"/>
    <step-length value="0.1"/>
</time>
<report>
    <no-step-log value="true"/>
</report>
<gui_only>
    <start value="true"/>
</gui_only>
<emissions>
    <device.emissions.probability value="1"/>
</emissions>
<output>
    <tripinfo-output value="erlangen.trip_info.xml"/>
    <fcd-output value="erlangen.fcd.xml"/>
</output>
I have generated 30 random trips for the example network and set the emissionClass="HBEFA3/LDV_G_EU4" attribute of the vType element. When I run the simulation directly in SUMO, it generates the required trip info file on successful completion:
<tripinfo id="0" depart="0.00" departLane="4006674#0_0" departPos="5.10" departSpeed="0.00" departDelay="0.00" arrival="202.40" arrivalLane="-4006726#0_0" arrivalPos="281.67" arrivalSpeed="13.76" duration="202.40" routeLength="2214.00" waitSteps="0" timeLoss="28.90" rerouteNo="0" devices="tripinfo_0 emissions_0" vType="passenger" speedFactor="1.00" vaporized="">
<emissions CO_abs="16453.885943" CO2_abs="591255.824603" HC_abs="76.174970" PMx_abs="24.476562" NOx_abs="123.285735" fuel_abs="254.203634" electricity_abs="0"/>
</tripinfo>
...
<tripinfo id="29" depart="29.00" departLane="29900564#4_0" departPos="5.10" departSpeed="0.00" departDelay="0.00" arrival="226.10" arrivalLane="-31241838#0_0" arrivalPos="18.39" arrivalSpeed="22.13" duration="197.10" routeLength="2353.60" waitSteps="0" timeLoss="23.99" rerouteNo="0" devices="tripinfo_29 emissions_29" vType="passenger" speedFactor="1.00" vaporized="">
<emissions CO_abs="16826.605518" CO2_abs="612826.831847" HC_abs="78.478455" PMx_abs="25.328690" NOx_abs="126.946877" fuel_abs="263.477812" electricity_abs="0"/>
</tripinfo>
But when I debug the same scenario as an OMNeT++ simulation, it finishes with the following notification and the trip info file is not generated.
I set the simulation time to 300 s in both .sumocfg and omnetpp.ini (sim-time-limit = 300s). The screenshots show that all departed vehicles had arrived by 285.900 s, and the simulation stopped at that moment with the notification. I have reproduced this several times with different numbers of random trips and different simulation times, all in vain.
Here it is clearly stated that:
The information is generated for each vehicle as soon as the vehicle arrived at its destination and is removed from the network.
But that is not the case for me. Please advise what I'm doing wrong. Thanks
You most likely ran SUMO via sumo-launchd.py, which creates a temporary copy of your scenario (in /tmp). After the scenario ran, the copy is deleted. This means, if you are logging to the directory that the SUMO simulation is executing in, your logged data will be cleaned along with the temporary copy.
There are three ways of preventing that:
Run sumo-launchd.py with a command line switch that disables deletion of the temporary directory, or
Configure SUMO to store the statistics somewhere else, or
Use a different way of launching SUMO (manually or using the TraCI ScenarioManagerForker)

Create scheduled task using Task Scheduler Managed Wrapper with "Synchronize across time zones" option disabled

Does anybody know how to create a scheduled task using Task Scheduler Managed Wrapper or Schtasks.exe with "Synchronize across time zones" unchecked?
You can do this with schtasks.exe, but it's tricky. Essentially, you have to use the /xml switch and pass an XML file that has the trigger formatted properly.
The basics of the XML file can be determined by getting as much of the required config done in the Task Scheduler GUI on your dev machine; then using Export... from the context menu, saving the file and chopping out the irrelevant bits.
Given a basic XML structure:
<?xml version="1.0" encoding="UTF-16"?>
<Task version="1.2" xmlns="http://schemas.microsoft.com/windows/2004/02/mit/task">
    <Triggers>
        <CalendarTrigger>
            <StartBoundary>2018-03-28T18:00:00Z</StartBoundary>
            <Enabled>true</Enabled>
            <ScheduleByDay>
                <DaysInterval>1</DaysInterval>
            </ScheduleByDay>
        </CalendarTrigger>
    </Triggers>
    <Actions Context="Author">
        <Exec>
            <Command>C:\Windows\System32\cmd.exe</Command>
            <Arguments>/c dir</Arguments>
        </Exec>
    </Actions>
</Task>
The important element here is <StartBoundary/> – it defines both the date/time from which tasks should start running and (in the case of time-based triggers) the time at which it should run each day, week, etc.
If you want "Synchronize across time zones" to be unchecked:
The start boundary must be the local time at which you want the task to run, and it must not end with the GMT+0/UTC+0/zulu-time indicator Z:
<StartBoundary>2018-03-28T18:00:00</StartBoundary>
The above should run every day, at 18:00 local time.
If you want "Synchronize across time zones" to be checked:
You must calculate the GMT+0/UTC+0/zulu time of the desired start time yourself, based on your local time zone and taking into account any daylight-saving rules it uses, then use this time value and include the Z indicator at the end:
<StartBoundary>2018-03-28T18:00:00Z</StartBoundary>
The above should run every day, at 18:00 UTC, regardless of local time.
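Both StartBoundary forms can also be produced programmatically rather than by hand. The sketch below uses Java's java.time purely to illustrate the formatting and the UTC conversion; the class name StartBoundary and its method names are made up for this example, and any time zone passed in is the caller's assumption.

```java
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

/** Hypothetical helper that renders the two StartBoundary variants discussed above. */
final class StartBoundary {
    private static final DateTimeFormatter LOCAL =
            DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mm:ss");

    /** Local wall-clock form without a suffix ("Synchronize across time zones" unchecked). */
    static String unsynchronized(LocalDateTime localStart) {
        return localStart.format(LOCAL);
    }

    /** UTC form with a trailing Z ("Synchronize across time zones" checked). */
    static String synchronizedUtc(LocalDateTime localStart, ZoneId zone) {
        // Interpret the wall-clock time in the given zone, then shift it to UTC.
        LocalDateTime utc = localStart.atZone(zone)
                .withZoneSameInstant(ZoneOffset.UTC)
                .toLocalDateTime();
        return utc.format(LOCAL) + "Z";
    }
}
```

For example, 18:00 on 2018-03-28 in Europe/Berlin falls in daylight-saving time (UTC+2), so the synchronized form is 2018-03-28T16:00:00Z, while the unsynchronized form stays 2018-03-28T18:00:00.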
To register the above task:
From the command prompt, you would issue:
schtasks.exe /create /tn "My Task Name" /xml x:\pathto\taskdefinition.xml
(You do not need to keep the task definition file after you have registered it; the settings are copied to the created task.)
The difficulty here is probably in creating the XML file — it can be a little finicky around the encoding of the file (you may need to experiment with byte-order markers), and there are some combinations of settings that I have never been able to get to run properly (they register OK, but the running task instantly fails with a strange return code). Your mileage may vary.
I've never tried with the Managed Wrapper, but the source suggests it also generates the XML in the background. However, it appears to use XmlDateTimeSerializationMode.RoundtripKind as its serialization method, which (rightly, for round-tripping) includes the timezone as part of the serialization.
This leads me to think that it will never create a task that has Synchronize across timezones unchecked. In fact, it may mean that, if you can determine the correct timezone suffix for your start time, you might not need to do that Z-based calculation above, yourself.
You might be able to raise a feature request, to have this changed based on a boolean property, e.g.:
writer.WriteElementString("StartBoundary",
    System.Xml.XmlConvert.ToString(t.StartBoundary,
        System.Xml.XmlDateTimeSerializationMode.RoundtripKind));
becoming:
writer.WriteElementString("StartBoundary",
    System.Xml.XmlConvert.ToString(t.StartBoundary,
        SynchronizeAcrossTimezones
            ? System.Xml.XmlDateTimeSerializationMode.RoundtripKind
            : System.Xml.XmlDateTimeSerializationMode.Unspecified));
...but that's not up to me!
Fixed it with the Task Scheduler Managed Wrapper library by specifying DateTimeKind.Unspecified for the StartBoundary value.

Spring Batch Jsr 352, manage processor skip outside/before skip listener

I am trying to find a way to manage a skip scenario in the process listener (or it could be the read or write listener as well). What I have found is that the skip listener seems to be executed after the process listener's onError method. This means that I might handle the error in some way without knowing that the exception is one that will be skipped.
Is there some way to know, outside the skip listener, that a particular exception is being skipped? Something that could be pulled into the process listener, or possibly elsewhere.
The best approach I found was to add a property to the step and then inject the step context wherever I needed it.
<step id="firstStep">
    <properties>
        <property name="skippableExceptions" value="java.lang.IllegalArgumentException"/>
    </properties>
</step>
This was not a perfect solution but the skip exceptions only seem to be set in StepFactoryBean and Tasklet and are not directly accessible.
In my listeners:
@Inject
StepContext stepContext;
.
.
.
Properties p = stepContext.getProperties();
String exceptions = p.getProperty("skippableExceptions");
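Because the property arrives as a plain string, the listener still has to match it against the exception it received. A minimal sketch of that check in plain Java follows. The class name SkippableExceptions and the comma-separated list format are assumptions for illustration and are not part of JSR-352.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: matches exceptions against class names taken from a step property. */
final class SkippableExceptions {
    private final List<Class<?>> types = new ArrayList<>();

    /** @param property comma-separated fully qualified class names; may be null */
    SkippableExceptions(String property) {
        if (property == null) {
            return;
        }
        for (String name : property.split(",")) {
            String trimmed = name.trim();
            if (trimmed.isEmpty()) {
                continue;
            }
            try {
                types.add(Class.forName(trimmed));
            } catch (ClassNotFoundException e) {
                throw new IllegalArgumentException("Unknown exception class: " + trimmed, e);
            }
        }
    }

    /** True if the throwable is an instance of any configured class (subclasses included). */
    boolean isSkippable(Throwable t) {
        for (Class<?> type : types) {
            if (type.isInstance(t)) {
                return true;
            }
        }
        return false;
    }
}
```

A listener could build this once from the "skippableExceptions" property and consult isSkippable in its error callback before deciding how to handle the exception.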

Quartz.Net - delay a simple trigger to start

I have a few jobs setup in Quartz to run at set intervals. The problem is though that when the service starts it tries to start all the jobs at once... is there a way to add a delay to each job using the .xml config?
Here are 2 job trigger examples:
<simple>
    <name>ProductSaleInTrigger</name>
    <group>Jobs</group>
    <description>Triggers the ProductSaleIn job</description>
    <misfire-instruction>SmartPolicy</misfire-instruction>
    <volatile>false</volatile>
    <job-name>ProductSaleIn</job-name>
    <job-group>Jobs</job-group>
    <repeat-count>RepeatIndefinitely</repeat-count>
    <repeat-interval>86400000</repeat-interval>
</simple>
<simple>
    <name>CustomersOutTrigger</name>
    <group>Jobs</group>
    <description>Triggers the CustomersOut job</description>
    <misfire-instruction>SmartPolicy</misfire-instruction>
    <volatile>false</volatile>
    <job-name>CustomersOut</job-name>
    <job-group>Jobs</job-group>
    <repeat-count>RepeatIndefinitely</repeat-count>
    <repeat-interval>43200000</repeat-interval>
</simple>
As you see there are 2 triggers, the first repeats every day, the next repeats twice a day.
My issue is that I want either the first or second job to start a few minutes after the other... (because they are both in the end, accessing the same API and I don't want to overload the request)
Is there a repeat-delay or priority property? I can't find any documentation saying so..
I know you are doing this via XML but in code you can set the StartTimeUtc to delay say 30 seconds like this...
trigger.StartTimeUtc = DateTime.UtcNow.AddSeconds(30);
This isn't exactly a perfect answer for your XML file - but via code you can use the StartAt extension method when building your trigger.
/* calculate the next time you want your job to run - in this case top of the next hour */
var hourFromNow = DateTime.UtcNow.AddHours(1);
/* mark the value as UTC; otherwise DateTimeOffset would treat it as local time */
var topOfNextHour = new DateTime(hourFromNow.Year, hourFromNow.Month, hourFromNow.Day, hourFromNow.Hour, 0, 0, DateTimeKind.Utc);
/* build your trigger and call 'StartAt' */
var trigger = TriggerBuilder.Create()
    .WithIdentity("Delayed Job")
    .WithSimpleSchedule(x => x.WithIntervalInSeconds(60).RepeatForever())
    .StartAt(new DateTimeOffset(topOfNextHour))
    .Build();
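The start-time arithmetic is independent of the scheduler, so it can be sketched on its own. The Java example below uses only java.time and no Quartz API; the class StartTimes and its methods are hypothetical names. It shows the "top of the next hour" calculation and one possible way to space several jobs a few minutes apart, as the question asks.

```java
import java.time.Duration;
import java.time.Instant;
import java.time.temporal.ChronoUnit;

/** Hypothetical helper for computing delayed and staggered start times. */
final class StartTimes {
    /** The next full hour after the given instant, on the UTC timeline. */
    static Instant topOfNextHour(Instant now) {
        return now.truncatedTo(ChronoUnit.HOURS).plus(1, ChronoUnit.HOURS);
    }

    /** Start time of the job at the given 0-based index when jobs are spaced by a fixed gap. */
    static Instant staggered(Instant base, int index, Duration gap) {
        return base.plus(gap.multipliedBy(index));
    }
}
```

With a five-minute gap, the first job would start at the base time and the second five minutes later, which keeps both from hitting the shared API at the same moment on startup.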
You've probably already seen this by now, but it's possible to chain jobs, though it's not supported out of the box.
http://quartznet.sourceforge.net/faq.html#howtochainjobs

schedule a trigger every minute, if job still running then standby and wait for the next trigger

I need to schedule a trigger to fire every minute. If the job is still running when the next minute comes, the trigger should not fire; it should wait another minute and check again. Once the job has finished, the trigger should fire.
Thanks
In Quartz 2, you'll want to use the DisallowConcurrentExecution attribute on your job class. The restriction applies per job definition (the JobDetail, identified by its job key), so give the job a stable identity, for example with JobBuilder.Create<MyJob>().WithIdentity("SomeJobKey"), so that the scheduler can tell whether that job is already running.
[DisallowConcurrentExecution]
public class MyJob : IJob
{
...
}
I didn't find anything about Monitor.Enter or something like that, thanks anyway.
The other answer is that the job should implement the 'StatefulJob' interface. As a StatefulJob, another instance will not run as long as one is already running.
Thanks again.
IStatefulJob is the key here. Creating your own locking mechanism may cause problems with the scheduler, as you are then interfering with its threading.
If you're using Quartz.NET, you can do something like this in your Execute method:
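// static: Quartz creates a new job instance per execution, so the lock must be shared
static readonly object execution_lock = new object();
public void Execute(JobExecutionContext context) {
    if (!Monitor.TryEnter(execution_lock, 1)) {
        return; // a previous execution is still running
    }
    try {
        // do work
    } finally {
        Monitor.Exit(execution_lock);
    }
}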
I wrote this off the top of my head, so some names may be wrong, but that's the idea: lock on some object while you're executing, and if the lock is already held when a new execution starts, a previous job is still running and you simply return.
EDIT: the Monitor class is in the System.Threading namespace
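For comparison, here is the same skip-if-busy idea sketched in Java. No Quartz API is used, and the class and method names are hypothetical. A shared atomic flag plays the role of the lock, and the guard object would have to be shared between executions (for example through a static field) if the scheduler creates a fresh job object each time.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical guard: runs the work only if no earlier run is still in progress. */
final class SkipIfRunning {
    private final AtomicBoolean running = new AtomicBoolean(false);

    /** Returns true if the work ran, false if it was skipped because a run is active. */
    boolean tryRun(Runnable work) {
        // Atomically claim the flag; fails when another execution already holds it.
        if (!running.compareAndSet(false, true)) {
            return false;
        }
        try {
            work.run();
            return true;
        } finally {
            running.set(false); // always release, even if the work throws
        }
    }
}
```

Unlike a monitor, the flag is not reentrant, so a nested call from inside the work is skipped as well.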
If you are using the Spring Quartz integration, you can set the 'concurrent' property to 'false' on MethodInvokingJobDetailFactoryBean:
<bean id="positionFeedFileProcessorJobDetail" class="org.springframework.scheduling.quartz.MethodInvokingJobDetailFactoryBean">
    <property name="targetObject" ref="xxxx" />
    <property name="targetMethod" value="xxxx" />
    <property name="concurrent" value="false" /> <!-- This will not run the job if the previous method is not yet finished -->
</bean>