Does the Overpas API installation's Area Creation process resume if stopped/started? - openstreetmap

The Area Creation process can take up to 24 hours. If something happens during that time which causes the process to stop, will it resume when I run it again or does it start back over from the beginning?
We can assume for this question that the files in $DB_DIR remain in place throughout the running/stopping/starting process.

It will start over from the beginning, assuming you're using areas.osm3s to define the area creation rules. This file contains a number of queries which are being executed to generate the areas. If you restart the process, it will execute those very same queries again from the beginning.
For performance reasons, we use areas_delta.osm3s and the accompanying rules_delta_loop.sh script on the production servers. This way, we can limit the workload to those areas, which have been changed since the last area creation run.

Related

How to configure druid properly to fire a periodic kill task

I have been trying to get druid to fire a kill task periodically to clean up unused segments.
These are the configuration variables responsible for it
druid.coordinator.kill.on=true
druid.coordinator.kill.period=PT45M
druid.coordinator.kill.durationToRetain=PT45M
druid.coordinator.kill.maxSegments=10
From the above configuration my mental model is, once ingested data is marked unused, kill task will fire and delete the segments that are older that 45 mins while retaining 45 mins worth of data. period and durationToRetain are the config vars that are confusing me, not quite sure how to leverage them. Any help would be appreciated.
The caveat for druid.coordinator.kill.on=true is that segments are deleted from whitelisted datasources. The whitelist is empty by default.
To populate the whitelist with all datasources, set killAllDataSources to true. Once I did that, the kill task fired as expected and deleted the segments from s3 (COS). This was tested for Druid version 0.18.1.
Now, while the above configuration properties can be set when you build your image, the killAllDataSources needs to be set through an API. This can be set via the druid UI too.
When you click the option, a modal appears that has Kill All Data Sources. Click on True and you should see a kill task (Ingestion ---> Tasks below) firing in the interval specified. It would be really nice to have this as a part of runtime.properties or some sort of common configuration file that we can set the value in when build the druid image.
Use crontab it works quite well for us.
If you want to have a control outside the druid over the segments removal, then you must use an scheduled task which runs based on your desire interval and register kill-tasks in druid. It can increase your control over your segments, since when they go away, you cannot recover them. You can use this script to accompany you:
https://github.com/mostafatalebi/druid-kill-task

Stop recurring job during specific times

We have a job on our SQL database that runs periodically forever.
During predefined maintenance periods, we would like to have this job stop for a set time (say 12 hours) and then restart the regular periodic schedule.
We've tried using a separate job that disables it a the predefined time and a second one that enables it. This works but is not very neat.
Is there a better way to do this that only involves the job itself?
Make a "maintenance schedule" table in some service database or MSDB (StartDate, EndDate, Description, etc.). Let the first step of your job check if current datetime within maintenance period. If so, just do nothing.
If a session or transaction is associated with the maintenance process then you could use an application lock to have the regular job wait, or terminate, if it attempts to run while the maintenance is in process.
Using a locking mechanism allows finer control over the processes, e.g. the regular job can release and reacquire the lock between steps and wait (or terminate) if the maintenance process has started. Alternatively, the maintenance process could wait for the regular job to terminate (or reach a suitable checkpoint) before proceeding.
See sp_getapplock for additional information.

Shared memory and waiting sessions in Matlab

Doing a regression on a large dataset, I have a huge read-only matrix that I'd like to share among several threads. I've looked at various way of doing this and found sharedmatrix toolkit to be exactly what I need. Reading through the tutorial, I came up with the following setup:
Session 0 - just loads up the matrix and makes it available
Session 1..n - worker sessions
The problem is that Session 0 has to finish only after all the other n sessions have finished. Do you know how to make a session wait? The best solution would be to make it wait until I kill it as I'm running the scripts on a remote linux system and am not connected to it all the time.
UPDATE:
In the end I've changed my approach to the problem, after reading this part of the tutorial:
The “free” directive marks the shared memory segment for deletion.
Note: it is not actually deleted until every attached session
explicitly detaches or is terminated. As soon as the last session
detaches, the system will reclaim the allocated segment.
This means that I've created one "master" session that loads up the matrix, makes it available and then starts its own computation, and several "slave" sessions that use the shared matrix. Even if the master session finishes earlier, it causes no problem to the slave sessions as the shared matrix remains in the memory until the last process that uses it is terminated.
If you really want to have it wait indefinitely, use Eitan's solution based on pause. More generally, something like labBarrier is what you should use to perform this kind of synchronization.

Is there a way to make the Start Time closer than Schedule Time in an SCOM Task?

I realize that when I execute a SCOM Task on demand from a Powershell script, there are 2 columns in Task Status view called Schedule Time and Start Time. It seems that there is an interval these two fields of around 15 seconds. I'm wondering if there is a way to minimize this time so I could have a response time shorter when I execute an SCOM task on demand.
This is not generally something that users can control. The "ScheduledTime" correlates to the time when the SDK received the request to execute the task. The "StartTime" represents the time that the agent healthservice actually began executing the task workflow locally.
In between those times, things are moving as fast as they can. The request needs to propagate to the database, and a server healthservice needs to be notified that a task is being triggered. The servers then need to determine the correct route for the task message to take, then the healthservices need to actually send and receive the message. Finally, it gets to the actual agent where the task will execute. All of these messages go through the same queues as other monitoring data.
That sequence can be very quick (when running a task against the local server), or fairly slow (in a big Management Group, or when there is lots of load, or if machines/network are slow). Besides upgrading your hardware, you can't really do anything to make the process run quicker.

Work around celerybeat being a single point of failure

I'm looking for recommended solution to work around celerybeat being a single point of failure for celery/rabbitmq deployment. I didn't find anything that made sense so far, by searching the web.
In my case, once a day timed scheduler kicks off a series of jobs that could run for half a day or longer. Since there can only be one celerybeat instance, if something happens to it or the server that it's running on, critical jobs will not be run.
I'm hoping there is already a working solution for this, as I can't be the only one who needs reliable (clustered or the like) scheduler. I don't want to resort to some sort of database-backed scheduler, if I don't have to.
There is an open issue in celery github repo about this. Don't know if they are working on it though.
As a workaround you could add a lock for tasks so that only 1 instance of specific PeriodicTask will run at a time.
Something like:
if not cache.add('My-unique-lock-name', True, timeout=lock_timeout):
return
Figuring out lock timeout is well, tricky. We're using 0.9 * task run_every seconds if different celerybeats will try to run them at different times.
0.9 just to leave some margin (e.g. when celery is a little behind schedule once, then it is on schedule which would cause lock to still be active).
Then you can use celerybeat instance on all machines. Each task will be queued for every celerybeat instance but only one task of them will finish the run.
Tasks will still respect run_every this way - worst case scenario: tasks will run at 0.9*run_every speed.
One issue with this case: if tasks were queued but not processed at scheduled time (for example because queue processors was unavailable) - then lock may be placed at wrong time causing possibly 1 next task to simply not run. To go around this you would need some kind of detection mechanism whether task is more or less on time.
Still, this shouldn't be a common situation when using in production.
Another solution is to subclass celerybeat Scheduler and override its tick method. Then for every tick add a lock before processing tasks. This makes sure that only celerybeats with same periodic tasks won't queue same tasks multiple times. Only one celerybeat for each tick (one who wins the race condition) will queue tasks. In one celerybeat goes down, with next tick another one will win the race.
This of course can be used in combination with the first solution.
Of course for this to work cache backend needs to be replicated and/or shared for all of servers.
It's an old question but I hope it helps anyone.