How does Shortest Remaining Time First (SRTF) work? - scheduler

If a job is being processed, wouldn't it have the smallest time till completion, since it will just end up as the head of the Ready-to-Run Queue, when it is preempted?
So is this just a repeating cycle till a job has reached completion, with overheads?
Wouldn't longer processes be neglected (just like SJF)?
Thanks

No, a job being processed doesn't necessarily have the shortest remaining time.
SRTF checks if there's a process in the ready queue which has less burst time to complete to do the preemption. Let's say you have p1,p2 and p3. p1 has a total burst of 15 and arrives at time 0, p2 has a burst of 10 and arrives at time 3, p3 has a burst of 1 and arrives at time 4.
The execution with SRTF would be:
p1 -> from 0 to 3, remaining burst -> p1 = 12
at 3, arrives p2, p2 burst < p1 remaining burst, so p2 gets the cpu
p2 -> from 3 to 4, remaining burst -> p1=12,p2 = 9
at 4 arrives p3, p3 burst < p2 remaining burst < p1 remaining burst, so p3 gets the cpu
p3-> from 4 to 5, remaining burst -> p2=9,p1=12
p2-> from 5 to 14, remaining burst -> p1=12
p1-> from 14 to 26, end

Related

Kafka Stream: How to trigger event based on hopping windows and how to trigger event based on combined set of windows that are part of hopping window

Here is our topology:
test-events--->groupByKey-->Window Aggregation-->Suppress-->Filter-->rekey
Here is the flow
We have 1 min hopping window. With total window size of 5 min. The hopping window is evaluated every five minutes, HOW to get count of every 1 min hops??
Total 5 min window, hop duration 1 min.
1. How to TRIGGER some event based on 1 min HOP
2. How to TRIGGER some event based on TOTAL 5 min WINDOW
||-----||----||-------||------||-----|
||-----||----||-------||------||-----|
We want to do 2 things.
We want to count total number of events in a hopping window of 1 min. If total number of events are > some threshold then trigger one combined event for that 1 min hopping window.
Then we want to trigger an alert if count of combined event is > some number.
From Kafka window API
Code looks like this:
KStream<String, DataPointUri> KS0 = builder.stream(inputTopicsList,);
KGroupedStream<String, DataPointUri> KS1 = KS0.groupByKey();
Duration windowDuration =getDuration(...));
Duration hopDuration =getDuration(env.getProperty(...)));
TimeWindowedKStream<String, DataPointUri> KS2 = KS1.windowedBy(TimeWindows.of(windowDuration).advanceBy(hopDuration).grace(Duration.ofSeconds(10)));
KTable<Windowed<String>, Long> uriCount = KS2.count(Materialized.<String, Long, WindowStore<Bytes, byte[]>>with(Serdes.String(), Serdes.Long()));
Here we are getting count for 5 min window. NOT for hop 1 min window. How to get details of 1 min window?

How to change quantum in xv6? [duplicate]

Right now it seems that on every click tick, the running process is preempted and forced to yield the processor, I have thoroughly investigated the code-base and the only relevant part of the code to process preemption is below (in trap.c):
// Force process to give up CPU on clock tick.
// If interrupts were on while locks held, would need to check nlock.
if(myproc() && myproc() -> state == RUNNING && tf -> trapno == T_IRQ0 + IRQ_TIMER)
yield();
I guess that timing is specified in T_IRQ0 + IRQ_TIMER, but I can't figure out how these two can be modified, these two are specified in trap.h:
#define T_IRQ0 32 // IRQ 0 corresponds to int T_IRQ
#define IRQ_TIMER 0
I wonder how I can change the default RR scheduling time-slice (which is right now 1 clock tick, fir example make it 10 clock-tick)?
If you want a process to be executed more time than the others, you can allow it more timeslices, *without` changing the timeslice duration.
To do so, you can add some extra_slice and current_slice in struct proc and modify the TIMER trap handler this way:
if(myproc() && myproc()->state == RUNNING &&
tf->trapno == T_IRQ0+IRQ_TIMER)
{
int current = myproc()->current_slice;
if ( current )
myproc()->current_slice = current - 1;
else
yield();
}
Then you just have to create a syscall to set extra_slice and modify the scheduler function to reset current_slice to extra_slice at process wakeup:
// Switch to chosen process. It is the process's job
// to release ptable.lock and then reacquire it
// before jumping back to us.
c->proc = p;
switchuvm(p);
p->state = RUNNING;
p->current_slice = p->extra_slice
You can read lapic.c file:
lapicinit(void)
{
....
// The timer repeatedly counts down at bus frequency
// from lapic[TICR] and then issues an interrupt.
// If xv6 cared more about precise timekeeping,
// TICR would be calibrated using an external time source.
lapicw(TDCR, X1);
lapicw(TIMER, PERIODIC | (T_IRQ0 + IRQ_TIMER));
lapicw(TICR, 10000000);
So, if you want the timer interrupt to be more spaced, change the TICR value:
lapicw(TICR, 10000000); //10 000 000
can become
lapicw(TICR, 100000000); //100 000 000
Warning, TICR references a 32bits unsigned counter, do not go over 4 294 967 295 (0xFFFFFFFF)

How to modify process preemption policies (like RR time-slices) in XV6?

Right now it seems that on every click tick, the running process is preempted and forced to yield the processor, I have thoroughly investigated the code-base and the only relevant part of the code to process preemption is below (in trap.c):
// Force process to give up CPU on clock tick.
// If interrupts were on while locks held, would need to check nlock.
if(myproc() && myproc() -> state == RUNNING && tf -> trapno == T_IRQ0 + IRQ_TIMER)
yield();
I guess that timing is specified in T_IRQ0 + IRQ_TIMER, but I can't figure out how these two can be modified, these two are specified in trap.h:
#define T_IRQ0 32 // IRQ 0 corresponds to int T_IRQ
#define IRQ_TIMER 0
I wonder how I can change the default RR scheduling time-slice (which is right now 1 clock tick, fir example make it 10 clock-tick)?
If you want a process to be executed more time than the others, you can allow it more timeslices, *without` changing the timeslice duration.
To do so, you can add some extra_slice and current_slice in struct proc and modify the TIMER trap handler this way:
if(myproc() && myproc()->state == RUNNING &&
tf->trapno == T_IRQ0+IRQ_TIMER)
{
int current = myproc()->current_slice;
if ( current )
myproc()->current_slice = current - 1;
else
yield();
}
Then you just have to create a syscall to set extra_slice and modify the scheduler function to reset current_slice to extra_slice at process wakeup:
// Switch to chosen process. It is the process's job
// to release ptable.lock and then reacquire it
// before jumping back to us.
c->proc = p;
switchuvm(p);
p->state = RUNNING;
p->current_slice = p->extra_slice
You can read lapic.c file:
lapicinit(void)
{
....
// The timer repeatedly counts down at bus frequency
// from lapic[TICR] and then issues an interrupt.
// If xv6 cared more about precise timekeeping,
// TICR would be calibrated using an external time source.
lapicw(TDCR, X1);
lapicw(TIMER, PERIODIC | (T_IRQ0 + IRQ_TIMER));
lapicw(TICR, 10000000);
So, if you want the timer interrupt to be more spaced, change the TICR value:
lapicw(TICR, 10000000); //10 000 000
can become
lapicw(TICR, 100000000); //100 000 000
Warning, TICR references a 32bits unsigned counter, do not go over 4 294 967 295 (0xFFFFFFFF)

Marathon backoff - is it really exponential?

I'm trying to figure out Marathon's exponential backoff configuration. Here's the documentation:
The backoffSeconds and backoffFactor values are multiplied until they reach the maxLaunchDelaySeconds value. After they reach that value, Marathon waits maxLaunchDelaySeconds before repeating this cycle exponentially. For example, if backoffSeconds: 3, backoffFactor: 2, and maxLaunchDelaySeconds: 3600, there will be ten attempts to launch a failed task, each three seconds apart. After these ten attempts, Marathon will wait 3600 seconds before repeating this cycle.
The way I think of exponential backoff is that the wait periods should be:
3*2^0 = 3
3*2^1 = 6
3*2^2 = 12
3*2^3 = 24 and so on
so every time the app crashes, Marathon will wait a longer period of time before retrying. However, given the description above, Marathon's logic for waiting looks something like this:
int retryCount = 0;
while(backoffSeconds * (backoffFactor ^ retryCount) < maxLaunchDelaySeconds)
{
wait(backoffSeconds);
retryCount++;
}
wait(maxLaunchDelaySeconds);
This matches the explanation in the documentation, since 3*2^x < 3600 for values of x fewer than or equal to 10. However, I really don't see how it can be called an exponential backoff, since the wait time is constant.
Is there a way to make Marathon wait progressively longer times with every restart of the app? Am I misunderstand the doc? Any help would be appreciated!
as far as I understand the code in the RateLimiter.scala, it is like you described, but then capped to the maxLaunchDelay waiting period. Let`s say maxLaunchDelay is one hour (3600s)
3*2^0 = 3
3*2^1 = 6
3*2^2 = 12
3*2^3 = 24
3*2^4 = 48
3*2^5 = 96
3*2^6 = 192
3*2^7 = 384
3*2^8 = 768
3*2^9 = 1536
3*2^10 = 3072
3*2^11 = 3600 (6144)
3*2^12 = 3600 (12288)
3*2^13 = 3600 (24576)
Which brings us a typically 2^n graph, see
You would get a bigger increase, if you would other backoffFactors,
for example backoff factor 10:
or backoff factor 20:
Additionally I saw a re-work of this topic, code review currently open here: https://phabricator.mesosphere.com/D1007
What do you think?
Thanks
Johannes

What is the purpose of Flux::sampleTimeout method in the project-reactor API?

The Java docs say the following:
Emit the last value from this Flux only if there were no new values emitted during the time window provided by a publisher for that particular last value.
However I found the above description confusing. I read in gitter chat that its similar to debounce in RxJava. Can someone please illustrate it with an example? I could not find this anywhere after doing a thorough search.
sampleTimeout lets you associate a companion Flux X' to each incoming value x in the source. If X' completes before the next value is emitted in the source, then value x is emitted. If not, x is dropped.
The same processing is applied to subsequent values.
Think of it as splitting the original sequence into windows delimited by the start and completion of each companion flux. If two windows overlap, the value that triggered the first one is dropped.
On the other side, you have sample(Duration) which only deals with a single companion Flux. It splits the sequence into windows that are contiguous, at a regular time period, and drops all but the last element emitted during a particular window.
(edit): about your use case
If I understand correctly, it looks like you have a processing of varying length that you want to schedule periodically, but you also don't want to consider values for which processing takes more than one period?
If so, it sounds like you want to 1) isolate your processing in its own thread using publishOn and 2) simply need sample(Duration) for the second part of the requirement (the delay allocated to a task is not changing).
Something like this:
List<Long> passed =
//regular scheduling:
Flux.interval(Duration.ofMillis(200))
//this is only to show that processing is indeed started regularly
.elapsed()
//this is to isolate the blocking processing
.publishOn(Schedulers.elastic())
//blocking processing itself
.map(tuple -> {
long l = tuple.getT2();
int sleep = l % 2 == 0 || l % 5 == 0 ? 100 : 210;
System.out.println(tuple.getT1() + "ms later - " + tuple.getT2() + ": sleeping for " + sleep + "ms");
try {
Thread.sleep(sleep);
} catch (InterruptedException e) {
e.printStackTrace();
}
return l;
})
//this is where we say "drop if too long"
.sample(Duration.ofMillis(200))
//the rest is to make it finite and print the processed values that passed
.take(10)
.collectList()
.block();
System.out.println(passed);
Which outputs:
205ms later - 0: sleeping for 100ms
201ms later - 1: sleeping for 210ms
200ms later - 2: sleeping for 100ms
199ms later - 3: sleeping for 210ms
201ms later - 4: sleeping for 100ms
200ms later - 5: sleeping for 100ms
201ms later - 6: sleeping for 100ms
196ms later - 7: sleeping for 210ms
204ms later - 8: sleeping for 100ms
198ms later - 9: sleeping for 210ms
201ms later - 10: sleeping for 100ms
196ms later - 11: sleeping for 210ms
200ms later - 12: sleeping for 100ms
202ms later - 13: sleeping for 210ms
202ms later - 14: sleeping for 100ms
200ms later - 15: sleeping for 100ms
[0, 2, 4, 5, 6, 8, 10, 12, 14, 15]
So the blocking processing is triggered approximately every 200ms, and only values that where processed within 200ms are kept.