Grafana negative spikes in latency query - apache-kafka

I have a Grafana dashboard that is measuring latency of a Kafka topic per partition in minutes using this query here:
avg by (topic, consumergroup, environment, partition)(kafka_consumer_lag_millis{environment="production",topic="topic.name",consumergroup="consumer.group.name"}) / 1000 / 60
The graph is working fine, but we're seeing negative spikes in the graph that don't make a lot of sense to us. Does anyone know what could potentially be causing these spikes?

This is more of a guess than an authoritative answer. Let's suppose, in a very simple manner, that we have 2 metrics being measured, and their subtraction is the number sent to Prometheus:
lag = producer_offset - consumer_offset
While the producer offset is measured with a polling mechanism, the consumer offset is measured with direct synchronous requests (to whatever internal place has these values). This way, we could have outdated values for the producer. Example:
instant | producer | consumer
t1 | 10 | 0
t2 | 30 | 15
t3 | 200 | 70
If we always had up-to-date values, we would have:
instant | lag
t1 | 10 - 0 = 10
t2 | 30 - 15 = 15
t3 | 200 - 70 = 130
Let's suppose our producer offset was one measurement behind at t2 due to the long polling period:
l(t1) = p(t1) - c(t1)
l(t2) = p(t1) - c(t2)
l(t3) = p(t2) - c(t3)
This would produce:
instant | lag
t1 | 10 - 0 = 10
t2 | 10 - 15 = -5
t3 | 30 - 70 = -40
And there's your negative value: when the real lag is growing and the polling interval for the producer offset is longer than Prometheus' scrape interval, the stale producer offset can end up smaller than the fresh consumer offset, so the computed difference goes negative.
Now, to really answer your question, we would need to check the Kafka exporter's code to see whether the polling interval is configurable, and shrink it until the negative values vanish (or simply set it smaller than Prometheus' scrape interval).
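Here is a toy simulation of that stale-producer effect (plain Python, only illustrating the tables above, not the exporter's actual code): a producer offset polled one cycle late makes the reported lag dip negative even though the real lag keeps growing.

# Offsets at t1, t2, t3, taken from the example above.
producer = [10, 30, 200]
consumer = [0, 15, 70]

# Suppose the producer poll at t2 still returns the value from t1
# (one polling cycle late), while the consumer offset is always fresh.
stale_producer = [producer[0], producer[0], producer[1]]

for t, (p, c) in enumerate(zip(stale_producer, consumer), start=1):
    print(f"t{t}: lag = {p} - {c} = {p - c}")

# t1: lag = 10 - 0 = 10
# t2: lag = 10 - 15 = -5
# t3: lag = 30 - 70 = -40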

Related

Dataframe level computation in pySpark

I am using PySpark and want to use the benefit of multiple nodes to improve on performance time.
For example:
Suppose I have 3 columns and have 1 million records:
Emp ID | Salary | % Increase | New Salary
1 | 200 | 0.05 |
2 | 500 | 0.15 |
3 | 300 | 0.25 |
4 | 700 | 0.1 |
I want to compute the New Salary column and want to use the power of multiple nodes in pyspark to reduce overall processing time.
I don't want to do an iterative row wise computation of New Salary.
Does df.withColumn do the computation at a dataframe level? Would it be able to give better performance as more nodes are used?
Spark's dataframes are basically a distributed collection of data. Spark manages this distribution and the operations (such as .withColumn) on them.
A quick Google search on how to increase Spark's performance will turn up plenty of tuning guides.
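As a minimal sketch (the column names are assumed from the question's example, adjust them to your real schema), .withColumn expresses New Salary as a column-level expression that Spark evaluates in parallel across partitions, with no row-wise loop in the driver:

from pyspark.sql import SparkSession
import pyspark.sql.functions as F

spark = SparkSession.builder.appName("new-salary").getOrCreate()

# Toy version of the table from the question.
df = spark.createDataFrame(
    [(1, 200, 0.05), (2, 500, 0.15), (3, 300, 0.25), (4, 700, 0.10)],
    ["emp_id", "salary", "pct_increase"],
)

# withColumn builds a distributed column expression; Spark executes it on
# every partition in parallel rather than iterating rows one by one.
result = df.withColumn("new_salary", F.col("salary") * (1 + F.col("pct_increase")))
result.show()

How much extra nodes actually help depends on the number of partitions and the data size; for only a few million rows, the overhead of distributing the work can outweigh the gain.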

Pure Data - Get adc value in a particular duration

I'm trying to get the adc values over, say, 50 seconds. I ended up with the patch in the picture below.
I set the metro to 50, which is 0.05 sec, and the tabwrite size to 1000. I got a list of values as below.
But I feel it isn't right: when I speak louder for a few seconds, the entire graph changes. Can anyone point out what I did wrong? Thank you.
the [metro 50] will retrigger every 50 milliseconds (20 times per second).
So the table will get updated quite often, which explains why it reacts immediately to your voice input.
To record 50 seconds worth of audio, you need:
a table that can hold 2205000 (50*44100) samples (as opposed to the default 64)
a [metro] that triggers every 50 seconds:
[tgl]
|
[metro 50000]
|
| [adc~]
|/
[tabwrite~ mytable]
[table mytable 2205000]

How to calculate the turnaround time (preemptive scheduling)

Hey folks, I have this exercise for my exams:
Three processes arrive at the exact same time (run time in brackets):
P1 (10) P2 (7) P3 (4)
a) Calculate the turnaround time of each process and the average turnaround time of the 3 processes.
b) In which sequence should the processes be executed to reduce the average turnaround time?
Edit: I found a solution.
a) 10 + 17 + 21 = 48, 48 / 3 = 16 sec average
b) Shortest Job First:
4 + 11 + 21 = 36, 36 / 3 = 12 sec average
It depends on what scheduling algorithm you use.
Let T(x) = "P(x)'s turnaround time"
FCFS:
T(a)=10-0=10
T(b)=10+7-0=17
T(c)=10+7+4-0=21
Average turnaround time=48/3=16
SJF:
T(a)=4+7+10-0=21
T(b)=4+7-0=11
T(c)=4-0=4
Average turnaround time=36/3=12
You can also practice SRTF/RR/priority/multilevel queue/MFQ scheduling,
draw Gantt charts, and calculate average waiting times.
You can also practice the RM and EDF algorithms, which are used in real-time systems.
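A small sketch (a hypothetical helper, not part of the original answer) that reproduces these numbers for non-preemptive scheduling when all processes arrive at t = 0:

def turnaround_times(burst_times):
    # All processes arrive at t = 0, so turnaround time equals completion time.
    finished, times = 0, []
    for burst in burst_times:
        finished += burst
        times.append(finished)
    return times

fcfs = turnaround_times([10, 7, 4])   # arrival order: P1, P2, P3
sjf = turnaround_times([4, 7, 10])    # shortest job first: P3, P2, P1
print(fcfs, sum(fcfs) / 3)            # [10, 17, 21] 16.0
print(sjf, sum(sjf) / 3)              # [4, 11, 21] 12.0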

How to sample from KDB table to reduce data before querying?

I have a table of tick data representing prices of various financial instruments with up to millisecond precision. The problem is, there are over 5 billion entries, and even the most basic queries take several minutes.
I only need data with a precision of up to 1 second - is there an efficient way to sample the table so that the precision is reduced to roughly 1 second prior to querying? This should dramatically cut the amount of data and hence execution time.
So far, as a quick hack I've added the condition where i mod 2 = 0 to my query, but is there a better way?
The best way to bucket time data is with xbar:
q)select last price, sum size by 10 xbar time.minute from trade where sym=`IBM
minute| price size
------| -----------
09:30 | 55.32 90094
09:40 | 54.99 48726
09:50 | 54.93 36511
10:00 | 55.23 35768
...
More info: http://code.kx.com/q/ref/arith-integer/#xbar

Interrupt time in DMA operation

I'm facing difficulty with the following question:
Consider a disk drive with the following specifications.
16 surfaces, 512 tracks/surface, 512 sectors/track, 1 KB/sector, rotation speed 3000 rpm. The disk is operated in cycle stealing mode whereby whenever a 1 byte word is ready it is sent to memory; similarly, for writing, the disk interface reads a 4 byte word from the memory in each DMA cycle. Memory cycle time is 40 ns. The maximum percentage of time that the CPU gets blocked during DMA operation is?
The solution to this question provided online is:
Revolutions Per Min = 3000 RPM
or 3000/60 = 50 RPS
In 1 Round it can read = 512 KB
No. of tracks read per second = (2^19/2^2)*50
= 6553600 ............. (1)
Interrupt = 6553600 takes 0.2621 sec
Percentage Gain = (0.2621/1)*100
= 26 %
I have understood till (1).
Can anybody explain to me how the 0.2621 was obtained? How is the interrupt time calculated? Please help.
Reversing from the numbers you've given, it's 6553600 * 40 ns that gives 0.2621 sec.
One quite obvious problem is that the comments in the calculations are somewhat wrong. It's not
Revolutions Per Min = 3000 RPM ~ or 3000/60 = 50 RPS
In 1 Round it can read = 512 KB
No. of tracks read per second = (2^19/2^2)*50 <- WRONG
The numbers are 512K / 4 * 50, so it's in bytes. How could that be called "number of tracks"? Reading the full track takes 1 full rotation, so the number of tracks readable in 1 second is 50, as there are 50 RPS.
However, the total number of bytes readable in 1 s is then just 512K * 50, since 512K is the amount of data on the track.
But then it is further divided by 4...
So, I guess, the actual comments should be:
Revolutions Per Min = 3000 RPM ~ or 3000/60 = 50 RPS
In 1 Round it can read = 512 KB
Interrupts per second = (2^19/2^2) * 50 = 6553600 (*)
Interrupt triggers one memory op, so then:
total wasted: 6553600 * 40ns = 0.2621 sec.
However, I don't really like how the "number of interrupts per second" is calculated. I currently don't see why it should be just bytes / 4.
The only VAGUE explanation of that "divide it by 4" I can think of is:
At each byte written to the controller's memory, an event is triggered. However, the DMA controller can read only PACKETS of 4 bytes. So the hardware DMA controller must WAIT until there are at least 4 bytes ready to be read. Only then does the DMA kick in and halt the bus (or part of it) for the duration of the one memory cycle needed to copy the data. As the bus is frozen, the processor MAY have to wait. It doesn't NEED to: it can keep doing its own ops and work in cache, but if it tries touching the memory, it will need to wait until the DMA finishes.
However, I don't like a few things in this "explanation" and I cannot guarantee that it is valid. It really depends on what architecture you are analyzing and how the DMA/CPU/bus are organized.
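As a sanity check of the corrected comments above, here is the arithmetic spelled out in plain Python (assumed variable names, nothing beyond the numbers already quoted):

ROTATIONS_PER_SEC = 3000 / 60      # 50 RPS
BYTES_PER_TRACK = 512 * 1024       # 512 sectors/track * 1 KB/sector = 2^19 bytes
MEM_CYCLE_NS = 40

# One DMA cycle per 4-byte word, counted over one second of transfers.
interrupts_per_sec = BYTES_PER_TRACK / 4 * ROTATIONS_PER_SEC    # 6,553,600
blocked_seconds = interrupts_per_sec * MEM_CYCLE_NS * 1e-9      # ~0.2621 s

print(int(interrupts_per_sec))          # 6553600
print(round(blocked_seconds, 4))        # 0.2621
print(f"{blocked_seconds * 100:.0f}%")  # 26%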
The only mistake is that it's not
no. of tracks read
It's actually the no. of interrupts that occurred (the no. of times the DMA came up with its data; that's how many times the CPU will be blocked).
But again, I don't know why it has been multiplied by 50 (probably because of the 1 second), but I wish to solve this without multiplying by 50.
My solution:
Here, in 1 rotation the interface can read 512 KB of data. 1 rotation time = 0.02 sec. So the preparation time for one byte of data = 39.1 ns, and for 4 B it takes 156.4 ns. Memory cycle time = 40 ns. So the % of time the CPU gets blocked = 40 / (40 + 156.4) = 0.2036 ≈ 20 %. But in the answer booklet the options given are A) 10 B) 25 C) 40 D) 50. Tell me if I'm doing it wrong?
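And here is the same kind of back-of-the-envelope check (assumed arithmetic only) for this alternative reading, which compares one memory cycle with the time the disk needs to prepare a 4-byte word:

BYTES_PER_TRACK = 512 * 1024   # 2^19 bytes per rotation
ROTATION_SEC = 0.02            # 3000 rpm -> 20 ms per rotation
MEM_CYCLE_NS = 40

ns_per_byte = ROTATION_SEC / BYTES_PER_TRACK * 1e9   # ~38.1 ns
ns_per_word = 4 * ns_per_byte                        # ~152.6 ns
blocked = MEM_CYCLE_NS / (MEM_CYCLE_NS + ns_per_word)

print(f"{blocked * 100:.1f}%")   # ~20.8%

The post's 39.1 ns / 156.4 ns / 20 % figures roughly come out if 512 KB is taken as 512,000 bytes instead of 2^19; either way this reading lands near 20 %, which is not one of the booklet's options.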