Scheduling jobs using code is easy, but I would like to schedule jobs based on contents of a folder.
For example:
I want the folder in "\MyApp\Jobs\" to contain some XML files that will have the information about the IJob to be scheduled.
I want this folder to be watched for changes (to XML files), and when a new file is found, a new IJob should be scheduled using the information contained in the XML.
What should I do to implement a mechanism like this?
Thanks
The class java.io.File has a few listFiles() methods which will list the contents of your directory. Use a FileFilter or FilenameFilter if you want to limit the filenames returned in some way. Do this in a loop with a "sleep", something like 60 seconds, to avoid chewing up all your CPU.
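The polling approach described above could look roughly like the following. This is a minimal sketch, not a finished implementation: the class name JobFolderPoller, its methods, and the "seen file names" bookkeeping are assumptions made for illustration, and parsing the XML and scheduling the job are left to the callback.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Consumer;

/** Hypothetical helper: remembers which XML files it has already reported. */
final class JobFolderPoller {
    private final File folder;
    private final Set<String> seen = new HashSet<>();

    JobFolderPoller(File folder) {
        this.folder = folder;
    }

    /** Lists the folder once and returns the XML files not seen on earlier calls. */
    List<File> pollOnce() {
        // The two-argument lambda is a FilenameFilter that keeps only *.xml names.
        File[] xmlFiles = folder.listFiles((dir, name) -> name.toLowerCase().endsWith(".xml"));
        List<File> fresh = new ArrayList<>();
        if (xmlFiles == null) {
            return fresh; // listFiles() returns null if the folder is missing or unreadable
        }
        for (File file : xmlFiles) {
            if (seen.add(file.getName())) { // add() returns true only the first time
                fresh.add(file);
            }
        }
        return fresh;
    }

    /** Polls until the thread is interrupted, sleeping between scans. */
    void watch(long sleepMillis, Consumer<File> onNewFile) {
        try {
            while (true) {
                pollOnce().forEach(onNewFile);
                Thread.sleep(sleepMillis);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // restore the flag and stop polling
        }
    }
}
```

Calling watch(60000, callback) on a background thread would give the 60-second loop mentioned above. Note that this sketch only notices new file names, not edits to files it has already seen.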
Hope this helps.
You don't need a file watcher to detect changes to the file.
The Quartz properties file has this option:
org.quartz.plugin.jobInitializer.scanInterval = 5
which makes the plugin rescan the XML file for changes; in the case above, it scans every 5 seconds.
My complete quartz.properties file is as follows:
org.quartz.scheduler.instanceName = MyScheduler
org.quartz.threadPool.threadCount = 3
org.quartz.jobStore.class = org.quartz.simpl.RAMJobStore
org.quartz.plugin.jobInitializer.class =org.quartz.plugins.xml.XMLSchedulingDataProcessorPlugin
org.quartz.plugin.jobInitializer.fileNames =C:/Users/Admin/Documents/NetBeansProjects/QXmlTest/src/java/quartz-config.xml
org.quartz.plugin.jobInitializer.failOnFileNotFound =true
org.quartz.plugin.jobInitializer.scanInterval = 5
org.quartz.plugin.jobInitializer.wrapInUserTransaction= true
I define the jobs and triggers in this quartz-config.xml file as follows:
<?xml version="1.0" encoding="UTF-8"?>
<job-scheduling-data
xmlns="http://www.quartz-scheduler.org/xml/JobSchedulingData"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.quartz-scheduler.org/xml/JobSchedulingData
http://www.quartz-scheduler.org/xml/job_scheduling_data_1_8.xsd"
version="1.8">
<schedule>
<job>
<name>AJob</name>
<group>AGroup</group>
<description>Print a welcome message</description>
<job-class>mypackage.SchedulerJob</job-class>
</job>
<trigger>
<cron>
<name>dummyTriggerName</name>
<job-name>AJob</job-name>
<job-group>AGroup</job-group>
<!-- It will run every 5 seconds -->
<cron-expression>0/5 * * * * ?</cron-expression>
</cron>
</trigger>
</schedule>
</job-scheduling-data>
Please note that I am using the Quartz Java API.
I have FilePulse correctly configured, so that when I create a file inside the reading folder, it reads it and ingests it in the topic.
Now I need to do continuous reading of each of the files in that folder, since they are continually being updated.
Do I need to change any property in the properties file?
My filePulseTxtFile.properties:
name=connect-file-pulse-txt
connector.class=io.streamthoughts.kafka.connect.filepulse.source.FilePulseSourceConnector
topic=lineas-fichero
tasks.max=1
# File types
fs.scan.filters=io.streamthoughts.kafka.connect.filepulse.scanner.local.filter.RegexFileListFilter
file.filter.regex.pattern=.*\\.log$
task.reader.class=io.streamthoughts.kafka.connect.filepulse.reader.RowFileInputReader
# File scanning
fs.cleanup.policy.class=io.streamthoughts.kafka.connect.filepulse.clean.LogCleanupPolicy
fs.scanner.class=io.streamthoughts.kafka.connect.filepulse.scanner.local.LocalFSDirectoryWalker
fs.scan.directory.path=/home/ec2-user/parser/scanDirKafka
fs.scan.interval.ms=10000
# Internal Reporting
internal.kafka.reporter.bootstrap.servers=localhost:9092
internal.kafka.reporter.id=connect-file-pulse-txt
internal.kafka.reporter.topic=connect-file-pulse-status
# Track file by name
offset.strategy=name
Thanks a lot!
Continuous reading is only supported by the RowFileInputReader, which you can configure with the read.max.wait.ms property: the maximum time to wait, in milliseconds, for more bytes after hitting the end of the file.
For example, if you configure that property to 10000 then the reader will wait 10 seconds for new lines to be added to the file before considering it completed.
Also note that as long as there are tasks processing files, new files added to the source directory will not be selected. However, you can configure allow.tasks.reconfiguration.after.timeout.ms to force all tasks to be restarted after a given period so that new files will be scheduled.
Finally, take care to set the tasks.max property correctly so that all files can be processed in parallel (a task can only process one file at a time).
I'm using Veins 4.6 with OMNeT++ 5.1.1 and trying to output the tripinfo of vehicles using the following configuration in the .sumocfg file:
<input>
<net-file value="erlangen.net.xml"/>
<route-files value="erlangen.rou.xml"/>
<additional-files value="erlangen.poly.xml"/>
</input>
<time>
<begin value="0"/>
<end value="300"/>
<step-length value="0.1"/>
</time>
<report>
<no-step-log value="true"/>
</report>
<gui_only>
<start value="true"/>
</gui_only>
<emissions>
<device.emissions.probability value="1"/>
</emissions>
<output>
<tripinfo-output value="erlangen.trip_info.xml"/>
<fcd-output value="erlangen.fcd.xml"/>
</output>
I have generated 30 random trips for the example network and set the emissionClass="HBEFA3/LDV_G_EU4" attribute of the vType element. When I run the simulation directly in SUMO, it generates the required trip info file on successful completion:
<tripinfo id="0" depart="0.00" departLane="4006674#0_0" departPos="5.10" departSpeed="0.00" departDelay="0.00" arrival="202.40" arrivalLane="-4006726#0_0" arrivalPos="281.67" arrivalSpeed="13.76" duration="202.40" routeLength="2214.00" waitSteps="0" timeLoss="28.90" rerouteNo="0" devices="tripinfo_0 emissions_0" vType="passenger" speedFactor="1.00" vaporized="">
<emissions CO_abs="16453.885943" CO2_abs="591255.824603" HC_abs="76.174970" PMx_abs="24.476562" NOx_abs="123.285735" fuel_abs="254.203634" electricity_abs="0"/>
</tripinfo>
...
<tripinfo id="29" depart="29.00" departLane="29900564#4_0" departPos="5.10" departSpeed="0.00" departDelay="0.00" arrival="226.10" arrivalLane="-31241838#0_0" arrivalPos="18.39" arrivalSpeed="22.13" duration="197.10" routeLength="2353.60" waitSteps="0" timeLoss="23.99" rerouteNo="0" devices="tripinfo_29 emissions_29" vType="passenger" speedFactor="1.00" vaporized="">
<emissions CO_abs="16826.605518" CO2_abs="612826.831847" HC_abs="78.478455" PMx_abs="25.328690" NOx_abs="126.946877" fuel_abs="263.477812" electricity_abs="0"/>
</tripinfo>
But when I debug the same scenario as an OMNeT++ simulation, it finishes with the following notification and the trip info file is not generated.
I set the simulation time to 300 s in both the .sumocfg and omnetpp.ini (sim-time-limit = 300s). The screenshots show that all departed vehicles had arrived by 285.900 s, and the simulation stopped at that time with the notification. I have observed this issue multiple times while changing the number of random trips and the simulation time, all in vain.
Here it is clearly stated that:
The information is generated for each vehicle as soon as the vehicle arrived at its destination and is removed from the network.
But that is not the case for me. Please advise what I'm doing wrong. Thanks
You most likely ran SUMO via sumo-launchd.py, which creates a temporary copy of your scenario (in /tmp). After the scenario ran, the copy is deleted. This means, if you are logging to the directory that the SUMO simulation is executing in, your logged data will be cleaned along with the temporary copy.
There are three ways of preventing that:
Run sumo-launchd.py with a command line switch that disables deletion of the temporary directory, or
Configure SUMO to store the statistics somewhere else, or
Use a different way of launching SUMO (manually or using the TraCI ScenarioManagerForker)
Does anybody know how to create a scheduled task using the Task Scheduler Managed Wrapper or Schtasks.exe with "Synchronize across time zones" unchecked?
You can do this with schtasks.exe, but it's tricky. Essentially, you have to use the /xml switch and pass an XML file that has the trigger formatted properly.
The basics of the XML file can be determined by getting as much of the required config done in the Task Scheduler GUI on your dev machine; then using Export... from the context menu, saving the file and chopping out the irrelevant bits.
Given a basic XML structure:
<?xml version="1.0" encoding="UTF-16"?>
<Task version="1.2" xmlns="http://schemas.microsoft.com/windows/2004/02/mit/task">
<Triggers>
<CalendarTrigger>
<StartBoundary>2018-03-28T18:00:00Z</StartBoundary>
<Enabled>true</Enabled>
<ScheduleByDay>
<DaysInterval>1</DaysInterval>
</ScheduleByDay>
</CalendarTrigger>
</Triggers>
<Actions Context="Author">
<Exec>
<Command>C:\Windows\System32\cmd.exe</Command>
<Arguments>/c dir</Arguments>
</Exec>
</Actions>
</Task>
The important element here is <StartBoundary/> – it defines both the date/time from which tasks should start running and (in the case of time-based triggers) the time at which it should run each day, week, etc.
If you want Synchronize across timezones to be unchecked:
You must use a time value for start boundary that is as per the local time that you want the task to run and does not end with the GMT+0/UTC+0/zulu-time indicator Z:
<StartBoundary>2018-03-28T18:00:00</StartBoundary>
The above should run every day, at 18:00 local time.
If you want Synchronize across timezones to be checked:
You must calculate the GMT+0/UTC+0/zulu-time of the desired start time yourself, based on your local timezone and respective of any daylight-savings system your timezone uses, then use this time value and include the Z indicator at the end:
<StartBoundary>2018-03-28T18:00:00Z</StartBoundary>
The above should run every day, at 18:00 UTC, regardless of local time.
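The UTC calculation mentioned above does not have to be done by hand. As a rough sketch (in Java, since any language with a time zone database will do), java.time can produce both forms of the StartBoundary value. The class name StartBoundaryFormat and the Europe/Berlin zone in the usage note are assumptions chosen purely for illustration.

```java
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

/** Hypothetical helper that formats values for the <StartBoundary> element. */
final class StartBoundaryFormat {
    private static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mm:ss");

    /** Local wall-clock time with no suffix ("Synchronize across time zones" unchecked). */
    static String local(LocalDateTime start) {
        return FORMAT.format(start);
    }

    /** Same instant converted to UTC with a trailing Z (the option checked). */
    static String utc(LocalDateTime start, ZoneId zone) {
        // atZone() applies the zone's rules, including daylight saving, for that date.
        return FORMAT.format(start.atZone(zone).withZoneSameInstant(ZoneOffset.UTC)) + "Z";
    }
}
```

For example, for a start of 18:00 on 2018-03-28 in Europe/Berlin (which was on daylight saving time, UTC+2, on that date), local(...) gives 2018-03-28T18:00:00 and utc(...) gives 2018-03-28T16:00:00Z.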
To register the above task:
From the command prompt, you would issue:
schtasks.exe /create /tn "My Task Name" /xml x:\pathto\taskdefinition.xml
(You do not need to keep the task definition file after you have registered it; the settings are copied to the created task.)
The difficulty here is probably in creating the XML file — it can be a little finicky around the encoding of the file (you may need to experiment with byte-order markers), and there are some combinations of settings that I have never been able to get to run properly (they register OK, but the running task instantly fails with a strange return code). Your mileage may vary.
I've never tried with the Managed Wrapper, but the source suggests it also generates the XML in the background. However, it appears to use XmlDateTimeSerializationMode.RoundtripKind as its serialization method, which (rightly, for round-tripping) includes the timezone as part of the serialization.
This leads me to think that it will never create a task that has Synchronize across timezones unchecked. In fact, it may mean that, if you can determine the correct timezone suffix for your start time, you might not need to do that Z-based calculation above, yourself.
You might be able to raise a feature request, to have this changed based on a boolean property, e.g.:
writer.WriteElementString("StartBoundary",
System.Xml.XmlConvert.ToString(t.StartBoundary,
System.Xml.XmlDateTimeSerializationMode.RoundtripKind));
becoming:
writer.WriteElementString("StartBoundary",
System.Xml.XmlConvert.ToString(t.StartBoundary,
SynchronizeAcrossTimezones
? System.Xml.XmlDateTimeSerializationMode.RoundtripKind
: System.Xml.XmlDateTimeSerializationMode.Unspecified));
...but that's not up to me!
Fixed it using the Task Scheduler Managed Wrapper library by specifying DateTimeKind.Unspecified for the StartBoundary.
I'm writing a job that will read from an excel file, x number of rows and then I'd like it to pause for an hour before it continues with the next x number of rows.
How do I do this?
I have a job.xml file which contains the following. The subscriptionDiscoverer fetches the file and passes it to the processor. The subscriptionWriter should write another file when the processor is done.
<job id="subscriptionJob" xmlns="http://www.springframework.org/schema/batch" incrementer="jobParamsIncrementer">
<validator ref="jobParamsValidator"/>
<step id="readFile">
<tasklet>
<chunk reader="subscriptionDiscoverer" processor="subscriptionProcessor" writer="subscriptionWriter" commit-interval="1" />
</tasklet>
</step>
</job>
Is there some kind of timer I could use or is it some kind of flow structure? It's a large file of about 160000 rows that should be processed.
I hope someone has a solution they would like to share.
Thank you!
I'm thinking of two possible approaches for you to start with:
Stop the job, and restart it (after an hour) at the last position. You can start by looking at how to change the BatchStatus to signal your intent to stop the job. See http://docs.spring.io/spring-batch/2.0.x/cases/pause.html or look at how Spring Batch Admin implements its way of communicating the PAUSE flag (http://docs.spring.io/spring-batch-admin/reference/reference.xhtml). You may need to implement some persistence to store the position (row number) so the job knows where to resume processing. You can use a scheduler to restart the job as well.
-or-
Add a ChunkListener and implement the following in afterChunk(ChunkContext context): check whether x rows have been read so far, and if so, apply your pause mechanism (e.g., a simple Thread.sleep, or look for a more robust way of pausing the step). To check the number of rows read, you can use StepExecution.getReadCount(), obtaining the StepExecution via ChunkContext.getStepContext().getStepExecution().
Do note that afterChunk is called outside the transaction as indicated in the javadoc:
Callback after the chunk is executed, outside the transaction.
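The pause check for the second approach can be sketched without any Spring classes. This is only an illustration of the counting logic under stated assumptions: the class RowPauser and its methods are hypothetical, and in a real job you would call afterChunk from your ChunkListener, passing the step's read count.

```java
import java.util.function.LongConsumer;

/** Hypothetical helper: pauses once each time another batch of rows has been read. */
final class RowPauser {
    private final long rowsPerBatch;
    private final long pauseMillis;
    private final LongConsumer sleeper; // injected so the pause can be replaced in tests
    private long nextPauseAt;

    RowPauser(long rowsPerBatch, long pauseMillis, LongConsumer sleeper) {
        if (rowsPerBatch <= 0) {
            throw new IllegalArgumentException("rowsPerBatch must be positive");
        }
        this.rowsPerBatch = rowsPerBatch;
        this.pauseMillis = pauseMillis;
        this.sleeper = sleeper;
        this.nextPauseAt = rowsPerBatch;
    }

    /** Variant that really sleeps; an interrupt ends the pause early. */
    static RowPauser withThreadSleep(long rowsPerBatch, long pauseMillis) {
        return new RowPauser(rowsPerBatch, pauseMillis, millis -> {
            try {
                Thread.sleep(millis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
    }

    /** Call with the cumulative read count; returns true if a pause was taken. */
    boolean afterChunk(long readCount) {
        if (readCount < nextPauseAt) {
            return false;
        }
        sleeper.accept(pauseMillis);
        // Move the threshold past the current count so one large chunk pauses only once.
        nextPauseAt = (readCount / rowsPerBatch + 1) * rowsPerBatch;
        return true;
    }
}
```

With the commit-interval of 1 from the question, afterChunk would run after every row, so RowPauser.withThreadSleep(x, 3600000) would pause for an hour after each x rows. Keep in mind that a sleeping step still occupies its thread for that hour.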
I have a few jobs setup in Quartz to run at set intervals. The problem is though that when the service starts it tries to start all the jobs at once... is there a way to add a delay to each job using the .xml config?
Here are 2 job trigger examples:
<simple>
<name>ProductSaleInTrigger</name>
<group>Jobs</group>
<description>Triggers the ProductSaleIn job</description>
<misfire-instruction>SmartPolicy</misfire-instruction>
<volatile>false</volatile>
<job-name>ProductSaleIn</job-name>
<job-group>Jobs</job-group>
<repeat-count>RepeatIndefinitely</repeat-count>
<repeat-interval>86400000</repeat-interval>
</simple>
<simple>
<name>CustomersOutTrigger</name>
<group>Jobs</group>
<description>Triggers the CustomersOut job</description>
<misfire-instruction>SmartPolicy</misfire-instruction>
<volatile>false</volatile>
<job-name>CustomersOut</job-name>
<job-group>Jobs</job-group>
<repeat-count>RepeatIndefinitely</repeat-count>
<repeat-interval>43200000</repeat-interval>
</simple>
As you see there are 2 triggers, the first repeats every day, the next repeats twice a day.
My issue is that I want either the first or second job to start a few minutes after the other... (because they are both in the end, accessing the same API and I don't want to overload the request)
Is there a repeat-delay or priority property? I can't find any documentation saying so..
I know you are doing this via XML but in code you can set the StartTimeUtc to delay say 30 seconds like this...
trigger.StartTimeUtc = DateTime.UtcNow.AddSeconds(30);
This isn't exactly a perfect answer for your XML file - but via code you can use the StartAt extension method when building your trigger.
/* calculate the next time you want your job to run - in this case top of the next hour */
var hourFromNow = DateTime.UtcNow.AddHours(1);
/* mark the value as UTC; otherwise DateTimeOffset would interpret it as local time */
var topOfNextHour = new DateTime(hourFromNow.Year, hourFromNow.Month, hourFromNow.Day, hourFromNow.Hour, 0, 0, DateTimeKind.Utc);
/* build your trigger and call 'StartAt' */
var trigger = TriggerBuilder.Create().WithIdentity("Delayed Job").WithSimpleSchedule(x => x.WithIntervalInSeconds(60).RepeatForever()).StartAt(new DateTimeOffset(topOfNextHour)).Build();
You've probably already seen this by now, but it's possible to chain jobs, though it's not supported out of the box.
http://quartznet.sourceforge.net/faq.html#howtochainjobs