Future partition creation in Postgres 11 on OLAP DB never finishes - postgresql

My cron job is scheduled to run daily and creates 15 days of future partitions in Postgres 11.
Because of huge SELECT queries hitting the table 24/7, the cron job never completes in time (I have set a 10-second lock timeout, otherwise it blocks everything).
I know this is solved in Postgres 12, but in Postgres 11, what is the smart way to handle this?
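
One common workaround on version 11 is to build each future partition as a plain table first, so the only step that needs the ACCESS EXCLUSIVE lock on the parent is the final ATTACH, which is brief once the lock is acquired. A minimal sketch, assuming a range-partitioned table "events" on a "created_at" date column (all names and bounds below are placeholders):

-- Build the new partition without touching the parent at all.
CREATE TABLE IF NOT EXISTS events_p20210116
    (LIKE events INCLUDING ALL);

-- A CHECK constraint matching the future bounds lets ATTACH skip its validation scan.
ALTER TABLE events_p20210116
    ADD CONSTRAINT events_p20210116_chk
    CHECK (created_at >= DATE '2021-01-16' AND created_at < DATE '2021-01-17');

-- Only this statement needs the ACCESS EXCLUSIVE lock on the parent in version 11;
-- keep lock_timeout short and simply let the next daily run retry if it times out.
SET lock_timeout = '10s';
ALTER TABLE events ATTACH PARTITION events_p20210116
    FOR VALUES FROM ('2021-01-16') TO ('2021-01-17');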

Related

Is using "pg_sleep" in plpgsql Procedures/Functions while using multiple worker background processes concurrently bad practice?

I am running multiple background worker processes in my Postgres database. I am doing this using the pg_cron extension. Unfortunately I cannot use pg_timetable, as suggested by another user here.
Thus, I have 5 dependent "jobs" that need 1 other independent procedure/function to execute and complete before they can start. I originally had my cron jobs simply check, every 30 minutes or so, a "job_log" table I created to see whether the independent job had completed (i.e. if yes, execute the procedure; if not, return out of the procedure and check again at the next cron interval).
However, I believe I could greatly simplify the way I am triggering/orchestrating all these jobs/procedures if I use pg_sleep and start all the jobs at one time (so no more checking every 30 minutes). I would be running these jobs at night, concurrently, so I believe it shouldn't affect my actual traffic that much.
i.e. (wrapped in an anonymous block so the loop and pg_sleep are valid PL/pgSQL; the completion check here is just an example against the "job_log" table):
DO $$
BEGIN
    WHILE NOT EXISTS (SELECT 1 FROM job_log WHERE completed) LOOP
        PERFORM pg_sleep(1);   -- wait a second, then re-check
    END LOOP;
END
$$;
My question is:
Would starting all these jobs at one time (i.e. setting a concrete time in the cron expression, e.g. 15 18 * * *) and using pg_sleep be bad practice/inefficient, given that I would be idling 5 background workers while the 1 job completes? The job they all depend on could take any amount of time to finish, e.g. 15 min, 30 min, 1 hr (it should be under 1 hr, though).
Or is it better to simply use a cron expression to check every 5 minutes or so whether the main/independent job is done, so that my dependent jobs can then run?
Running two schedulers, one of them home-built, seems more complex than just running one scheduler that does 2 (or 6, 1+5, however you count it) different things. If your goal is to make things simpler, does your proposal really achieve that?
I wouldn't worry about 5 backends sleeping on pg_sleep at the same time. You might worry about them holding back the xid horizon while they do so, which would make vacuuming and HOT pruning less effective. But if you already have one long-running task holding a snapshot (the thing they are waiting for), then a few more aren't going to make matters worse.
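
For comparison, the one-scheduler shape can be as small as a single pg_cron entry that calls a wrapper procedure running the independent job and then the 5 dependent ones in order (the wrapper name here is hypothetical):

SELECT cron.schedule(
    '15 18 * * *',
    'CALL run_nightly_chain()'   -- one procedure that runs all 6 steps sequentially
);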

Airflow limit daily trigger

there is a "natural" ( I mean thought parameter) way to limit the number of triggering a dag (let say every 24 hours).
I don't want to schedule it, but some user can trigger the same dag multiple time, and for resources and others reason, I want it only once .
As I see "depends_on_past" depend only against the previous run, but it could be many time a day.
Thx
Not directly, but you could likely implement task_instance_mutation_hook for the first task of the DAG; it could then immediately fail that task if it detects the DAG has already been run that day.
https://airflow.apache.org/docs/apache-airflow/stable/concepts/cluster-policies.html#task-instance-mutation

SQL Agent Job runtime alert

I was hoping I could get some help on how to set up an e-mail alert for a specific Agent job, so that it sends an alert when the run duration exceeds 30 minutes.
Would it be easier to add this as a step in the job itself? Are there any built-in options in the SQL Agent GUI, or do I have to create a new job? I figured creating a new job is the less likely route, as I would have to query sysjobhistory in msdb; that value is only updated once the job finishes, so it doesn't help... I need to check the real-time duration of one specific Agent job while it's running...
Specifically because it has happened that the job runs into a deadlock (that's no longer an issue now), so the job just stays stuck on the table it's blocked on, and I only get a notification from the end user that the report doesn't return results :S
The best method outside of 3rd-party monitoring software is to create a high-frequency SQL Agent job that queries active sessions (returned by something like sp_who) for the duration of the spids. This way you can have the monitoring job email you whenever a spid goes over a threshold. Alternatively, you could have it compare the current runtime against a calculated average runtime gleaned from the sysjobhistory table.
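
As a rough sketch of that monitoring job (the job name and recipient are placeholders, and this assumes Database Mail is configured), the Agent activity tables can be used instead of sp_who to see how long the watched job has been running:

DECLARE @subject nvarchar(200);

SELECT @subject = N'Job "' + j.name + N'" has been running for '
       + CAST(DATEDIFF(MINUTE, a.start_execution_date, GETDATE()) AS nvarchar(10))
       + N' minutes'
FROM msdb.dbo.sysjobactivity AS a
JOIN msdb.dbo.sysjobs AS j ON j.job_id = a.job_id
WHERE j.name = N'My Report Job'              -- the job being watched
  AND a.start_execution_date IS NOT NULL
  AND a.stop_execution_date IS NULL          -- still running
  AND DATEDIFF(MINUTE, a.start_execution_date, GETDATE()) > 30;

IF @subject IS NOT NULL
    EXEC msdb.dbo.sp_send_dbmail
         @recipients = 'dba@example.com',
         @subject    = @subject;

Note that sysjobactivity keeps rows from older Agent sessions, so a production version would also filter on the most recent session in msdb.dbo.syssessions.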

Procedure executing 10 times faster using job agent

I have got a procedure that does some inserts with selects. When I try to run it from Access or even in Management Studio it takes 4 minutes. With a job that checks every 40 seconds whether that procedure needs to be executed, and then executes it, it took around 15 seconds.
What can cause this? Why is the same procedure 10 times faster when executed from the job than when executed from Access or a query in Management Studio?
I doubt there is any real time difference here. The problem is that after the job runs a few times, the data is likely cached in memory, and thus it runs very fast.
Do a fresh reboot and then check the time for the run; it should be the SAME from all sources.
It is not clear how you are calling this routine, but is it possible that the job scheduler is NOT waiting for the routine to finish?
And does the routine return any data to the client application? Perhaps the difference is that when run from Access (or SSMS) lots of data is returned by the stored procedure, but when run via the job agent, no provisions exist to consume the returned data, so the returning of data is skipped and the result(s) run faster.
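
One way to test the caching explanation on a non-production server is to flush the caches between runs and then time the procedure from each client against a cold cache (the procedure name below is a placeholder):

CHECKPOINT;              -- write out dirty pages so the next command can evict everything
DBCC DROPCLEANBUFFERS;   -- empty the buffer pool (cached data pages)
DBCC FREEPROCCACHE;      -- drop cached execution plans
EXEC dbo.MyProcedure;    -- placeholder; time this from Access, SSMS, and the Agent job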

Is there a way to make the Start Time closer than Schedule Time in an SCOM Task?

I've noticed that when I execute a SCOM task on demand from a PowerShell script, there are 2 columns in the Task Status view called Schedule Time and Start Time. There seems to be an interval of around 15 seconds between these two fields. I'm wondering if there is a way to minimize this time so that the response is quicker when I execute a SCOM task on demand.
This is not generally something that users can control. The "ScheduledTime" correlates to the time when the SDK received the request to execute the task. The "StartTime" represents the time that the agent healthservice actually began executing the task workflow locally.
In between those times, things are moving as fast as they can. The request needs to propagate to the database, and a server healthservice needs to be notified that a task is being triggered. The servers then need to determine the correct route for the task message to take, then the healthservices need to actually send and receive the message. Finally, it gets to the actual agent where the task will execute. All of these messages go through the same queues as other monitoring data.
That sequence can be very quick (when running a task against the local server), or fairly slow (in a big Management Group, or when there is lots of load, or if machines/network are slow). Besides upgrading your hardware, you can't really do anything to make the process run quicker.