How does MongoDB compute network latency when using nearest?

According to the MongoDB docs, when using the nearest mode for read preference,
the driver reads from a member whose network latency falls within the acceptable latency window. However, the docs do not say what exactly this "network latency" is.
Does anyone know how it is evaluated? Is it something like the ping round-trip time, the average query time, or some specific query?

The following applies to MongoDB v3.2, v3.4, and the current version, v3.6. Based on the MongoDB Specifications: Server Selection:
For every available server, clients (i.e. the driver) must track the average Round Trip Times (RTT) of server monitoring isMaster commands. When there is no average RTT for a server, the average RTT must be set equal to the first RTT measurement (i.e. the first isMaster command after the server becomes available).
After the first measurement, average RTT must be computed using an exponentially-weighted moving average formula, with a weighting factor (alpha) of 0.2. If the prior average is denoted (old_rtt), then the new average (new_rtt) is computed from a new RTT measurement (x) using the following formula:
alpha = 0.2
new_rtt = alpha * x + (1 - alpha) * old_rtt
A weighting factor of 0.2 was chosen to put about 85% of the weight of the average RTT on the 9 most recent observations.
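As a minimal sketch of that formula (not the driver's actual implementation; the sample values are made up), the average could be maintained like this in Python:
ALPHA = 0.2  # weighting factor from the server selection spec

def update_avg_rtt(old_rtt, measurement):
    # First isMaster measurement: the average is the measurement itself.
    if old_rtt is None:
        return measurement
    # Exponentially-weighted moving average.
    return ALPHA * measurement + (1 - ALPHA) * old_rtt

avg_rtt = None
for rtt_ms in [20.0, 25.0, 18.0, 22.0]:  # RTTs of successive monitoring checks, in ms
    avg_rtt = update_avg_rtt(avg_rtt, rtt_ms)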
See also Blog: Server Selection in Next Generation MongoDB Drivers.
You may also be interested in selecting servers within the latency window:
When choosing between several suitable servers, the latency window is the range of acceptable RTTs from the shortest RTT to the shortest RTT plus the local threshold. E.g. if the shortest RTT is 15ms and the local threshold is 200ms, then the latency window ranges from 15ms to 215ms.
For example, in the MongoDB Python driver (PyMongo), the default value of localThresholdMS is 15ms. Continuing the example of a 15ms shortest RTT, that means only members whose ping times fall within 15ms-30ms are used for queries.
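A hedged illustration with PyMongo (the hosts, replica set name, database and collection are placeholders):
from pymongo import MongoClient

# nearest read preference; localThresholdMS controls the width of the latency window
client = MongoClient(
    "mongodb://host1:27017,host2:27017,host3:27017/"
    "?replicaSet=rs0&readPreference=nearest&localThresholdMS=15"
)
# The read may be served by any member whose average RTT falls within the window.
doc = client.mydb.mycollection.find_one()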

Related

What is the default Grafana setting for $__rate_interval?

I understand that rate(xyz[5m]) * 60 is the rate of xyz per minute, averaged over 5 mins.
How then would $__rate_interval and $__interval be defined,
possibly in the same syntax?
In what unit is rate being measured here, in my panel? Per minute, or per second?
What is the interval = 30s in my panel here? My scrape interval is set to 5s.
How do I change the rate's unit?
See New in Grafana 7.2: $__rate_interval for Prometheus rate queries that just work.
Rate is always per second. See Grafana documentation for the rate function.
Click on Query options, then click on the Info-Symbol. An explanation will be displayed.
To get the rate per minute, just multiply the rate by 60.
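For example (http_requests_total is just a placeholder metric):
rate(http_requests_total[$__rate_interval]) * 60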
Edit: ($__rate_interval and $__interval)
Prometheus periodically fetches data from your application. Grafana periodically fetches data from Prometheus. Grafana does not know how often Prometheus polls your application for data. Grafana will estimate this time by looking at the data. The $__interval variable expands to the duration between two data points in the graph. (Note that this is only true for small time ranges and high resolution, as the intended use case for $__interval is reducing the number of data points when the time range is wide. See Approximate Calculation of $__interval.)
If the time-distance between every two data points in each series is 15 seconds, it does not make sense to use anything less than [15s] as the interval in the rate function. The rate function works best with at least 4 data points. Therefore [1m] would be much better than anything between [15s] and [1m]. This is what $__rate_interval tries to achieve: guessing a minimal sensible interval for the rate function.
Personally, I think this does not always work if your application delivers sparse data. I prefer using fixed intervals like 10m or even 1h or 1d in these situations. The interval needs to be large enough to give the rate function enough data points to work with.
A different approach would be to use either $__rate_interval or $__interval but also set the Min step parameter for the query in the Grafana UI to be large enough, as sketched below.
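A sketch of both variants (again, http_requests_total is a placeholder metric):
rate(http_requests_total[$__rate_interval])  # with Min step set to e.g. 1m in the query options
rate(http_requests_total[10m])               # fixed lookbehind window for sparse data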

Prometheus query quantile of pod memory usage performance

I'd like to get the 0.95 quantile (95th percentile) memory usage of my pods over the last x time. However, this query starts to take too long if I use a 'big' (7d/10d) range.
The query that I'm using right now is:
quantile_over_time(0.95, container_memory_usage_bytes[10d])
It takes around 100s to complete.
I removed extra namespace filters for brevity.
What steps could I take to make this query more performant (other than making the machine bigger)?
I thought about calculating the 0.95 percentile every x time (let's say 30min), labelling it p95_memory_usage, and using p95_memory_usage instead of container_memory_usage_bytes in the query, so that I can reduce the amount of points the query has to go through.
However, would this not distort the values ?
As you already observed, aggregating quantiles (over time or otherwise) doesn't really work.
You could try to build a histogram of memory usage over time using recording rules, looking like a "real" Prometheus histogram (consisting of _bucket, _count and _sum metrics) although doing it may be tedious. Something like:
- record: container_memory_usage_bytes_bucket
  labels:
    le: "100000.0"
  expr: |
    (container_memory_usage_bytes <= bool 100000.0)
      + ignoring(le)
    (
      container_memory_usage_bytes_bucket{le="100000.0"}
        or ignoring(le)
      container_memory_usage_bytes * 0
    )
Repeat for all bucket sizes you're interested in, add _count and _sum metrics.
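As a sketch in the same spirit (untested), the _count and _sum rules could reference their own previous values the same way:
- record: container_memory_usage_bytes_count
  expr: |
    1
    +
    (
      container_memory_usage_bytes_count
        or
      container_memory_usage_bytes * 0
    )
- record: container_memory_usage_bytes_sum
  expr: |
    container_memory_usage_bytes
    +
    (
      container_memory_usage_bytes_sum
        or
      container_memory_usage_bytes * 0
    )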
Histograms can be aggregated (over time or otherwise) without problems, so you can use a second set of recording rules that computes an increase of the histogram metrics, at much lower resolution (e.g. hourly or daily increase, at hourly or daily resolution). And finally, you can use histogram_quantile over your low resolution histogram (which has a lot fewer samples than the original time series) to compute your quantile.
It's a lot of work, though, and there will be a couple of downsides: you'll only get hourly/daily updates to your quantile and the accuracy may be lower, depending on how many histogram buckets you define.
Else (and this only came to me after writing all of the above) you could define a recording rule that runs at lower resolution (e.g. once an hour) and records the current value of container_memory_usage_bytes metrics. Then you could continue to use quantile_over_time over this lower resolution metric. You'll obviously lose precision (as you're throwing away a lot of samples) and your quantile will only update once an hour, but it's much simpler. And you only need to wait for 10 days to see if the result is close enough. (o:
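A sketch of that simpler option, assuming an hourly rule group (the group and record names are placeholders):
groups:
  - name: memory_low_res
    interval: 1h
    rules:
      - record: container_memory_usage_bytes:hourly
        expr: container_memory_usage_bytes
The quantile then becomes quantile_over_time(0.95, container_memory_usage_bytes:hourly[10d]).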
The quantile_over_time(0.95, container_memory_usage_bytes[10d]) query can be slow because it needs to take into account all the raw samples for all the container_memory_usage_bytes time series on the last 10 days. The number of samples to process can be quite big. It can be estimated with the following query:
sum(count_over_time(container_memory_usage_bytes[10d]))
Note that if the quantile_over_time(...) query is used for building a graph in Grafana (a range query instead of an instant query), then the number of raw samples returned from sum(count_over_time(...)) must be multiplied by the number of points on the Grafana graph, since Prometheus executes the quantile_over_time(...) individually for each point on the displayed graph. Usually Grafana requests around 1000 points for building a smooth graph, so the number returned from sum(count_over_time(...)) must be multiplied by 1000 in order to estimate the number of raw samples Prometheus needs to process for building the quantile_over_time(...) graph. See more details in this article.
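As a purely hypothetical illustration: if sum(count_over_time(container_memory_usage_bytes[10d])) returns 50 million raw samples and Grafana requests 1000 points, the range query has to process on the order of 50,000,000 * 1000 = 50 billion sample reads.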
There are the following solutions for reducing query duration:
To add more specific label filters in order to reduce the number of selected time series and, consequently, the number of raw samples to process (see the example after this list).
To reduce the lookbehind window in square brackets. For example, changing [10d] to [1d] reduces the number of raw samples to process by 10x.
To use recording rules for calculating coarser-grained results.
To try other Prometheus-compatible systems, which may process heavy queries at a faster speed. Try, for example, VictoriaMetrics.
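A hedged example combining the first two options (the label names and values are assumptions and depend on your setup):
quantile_over_time(0.95, container_memory_usage_bytes{namespace="my-namespace", container!=""}[1d])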

Calculating Mbps in Prometheus from cumulative total

I have a metric in Prometheus called unifi_devices_wireless_received_bytes_total, it represents the cumulative total amount of bytes a wireless device has received. I'd like to convert this to the download speed in Mbps (or even MBps to start).
I've tried:
rate(unifi_devices_wireless_received_bytes_total[5m])
Which I think is saying: "please give me the rate of bytes received per second", over the last 5 minutes, based on the documentation of rate, here.
But I don't understand what "over the last 5 minutes" means in this context.
In short, how can I determine the Mbps based on this cumulative amount of bytes metric? This is ultimately to display in a Grafana graph.
You want rate(unifi_devices_wireless_received_bytes_total[5m]) / 1000 / 1000 for MBps (megabytes per second). For Mbps (megabits per second), multiply by 8 as well:
rate(unifi_devices_wireless_received_bytes_total[5m]) * 8 / 1000 / 1000
But I don't understand what "over the last 5 minutes" means in this context.
It's the average over the last 5 minutes.
The rate() function returns the average per-second increase rate for the counter passed to it. The average rate is calculated over the lookbehind window passed in square brackets to rate().
For example, rate(unifi_devices_wireless_received_bytes_total[5m]) calculates the average per-second increase rate over the last 5 minutes. It returns a lower-than-expected rate when 100MB of data is transferred in 10 seconds, because it divides those 100MB by 5 minutes and returns the average data transfer speed as 100MB / 5 minutes = 333KB/s instead of 10MB/s.
Unfortunately, using 10s as a lookbehind window doesn't work as expected - it is likely that rate(unifi_devices_wireless_received_bytes_total[10s]) would return nothing. This is because rate() in Prometheus expects at least two raw samples in the lookbehind window, which means that new samples must be written into Prometheus at least every 5 seconds for a [10s] lookbehind window. The solution is to use the irate() function instead of rate():
irate(unifi_devices_wireless_received_bytes_total[5m])
It is likely this query would return a data transfer rate that is closer to the expected 10MB/s if the interval between raw samples (aka scrape_interval) is lower than 10 seconds.
Unfortunately, it isn't recommended to use the irate() function in the general case, since it tends to return jumpy results when refreshing graphs over big time ranges. Read this article for details.
So the ultimate solution is to use the rollup_rate function from VictoriaMetrics - the project I work on. It reliably detects spikes in counter rates by returning the minimum, maximum and average per-second increase rate across all the raw samples on the selected time range.
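A sketch of how that looks in MetricsQL, the query language of VictoriaMetrics:
rollup_rate(unifi_devices_wireless_received_bytes_total[5m])
# returns min, max and avg per-second rates as separate series, distinguished by the "rollup" label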

If I change the _pdpstep and heartbeat, the RRD is not updated properly

I have an RRD file "abcd" with _pdpstep = 300 and heartbeat = 700. With this configuration it works fine and accepts values. But if I create the file anew with _pdpstep = 1200 and heartbeat = 1500, it records every value as NaN. How can I check what is wrong? If you require, I can send rrdtool info for both files.
There's not enough information to answer your question.
However, you should probably look at the documentation
Specifically the bit about heartbeat, step and the 'xff' in your RRA definitions.
xff: The xfiles factor defines what part of a consolidation interval may be made up from UNKNOWN data while the consolidated value is still regarded as known. It is given as the ratio of allowed UNKNOWN PDPs to the number of PDPs in the interval. Thus, it ranges from 0 to 1 (exclusive).
It's quite likely that, if you're using a different heartbeat and step, your sampling rate is now too low.
The "heartbeat" defines the maximum acceptable interval between samples/updates. If the interval between samples is less than "heartbeat", then an average rate is calculated and applied for that interval. If the interval between samples is longer than "heartbeat", then that entire interval is considered "unknown". Note that there are other things that can make a sample interval "unknown", such as the rate exceeding limits, or a sample that was explicitly marked as unknown.
So, the short answer is - if you change your RRA definition to have a lower xff, then you should stop getting NaNs in your data.
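As a purely hypothetical illustration (the data-source name, bounds and RRA sizes are made up), a 1200s step with a 1500s heartbeat and a permissive xff could be created like this:
rrdtool create abcd.rrd --step 1200 \
  DS:value:GAUGE:1500:U:U \
  RRA:AVERAGE:0.5:1:600
Here 0.5 is the xff: up to half of the PDPs in a consolidation interval may be UNKNOWN before the consolidated value itself becomes UNKNOWN.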

Equation for determining average data transfer speed when day/night throttling limit is different

This may be better posted in Mathematics, but I figured someone on Stack Overflow may have seen this before. I am trying to devise an equation for determining the average data transfer speed for backup appliances that offsite their data to a data center.
On weekdays during the 8:00a-5:00p hours (1/3 of the day), the connection is throttled to 20% of the measured bandwidth. The remaining 2/3 of the weekday (5:00p-8:00a), the connection is throttled to 80% of the measured bandwidth. On the weekend from Friday 5:00p until Monday 8:00a, the connection is a constant 80% of the measured bandwidth.
The reason behind this is deciding whether to seed the data onto a hard drive versus letting the data transfer over the internet. Making this decision depends on getting a somewhat accurate bandwidth average so that I can calculate the transfer time.
I had issues coming up with an equation, so I reverse engineered a few real-world occurrences using just the weekday 80%/20% average. I came up with 57.5% of the measured bandwidth, but could not extrapolate an equation from it. Now I want to write a program to determine this. I am thinking that factoring in the weekend being 80% the whole time would use a similar equation.
This would be a similar scenario to a car travelling at 20% of the speed limit for 1/3 of the day and then at 80% of the speed limit for the rest of that day, and then determining the average speed for the day. I searched online and could not find any reference to an equation for this. Any ideas?
Using the idea you provided, the equation follows directly:
Average = (1/3) * bandwidth_1 + (2/3) * bandwidth_2
If bandwidth_1 = 20 and bandwidth_2 = 80, the equation gives a value of 60% (a floating-point evaluation may display this as 59.99999...%).
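A minimal Python sketch extending this to a full week, assuming 8:00a-5:00p (9 hours) on each of the 5 weekdays is throttled to 20% and every other hour of the week runs at 80%:
LOW, HIGH = 0.20, 0.80         # fractions of the measured bandwidth
throttled_hours = 5 * 9        # weekday business hours (8:00a-5:00p)
total_hours = 7 * 24           # hours in a week
other_hours = total_hours - throttled_hours

weekly_average = (throttled_hours * LOW + other_hours * HIGH) / total_hours
print(f"{weekly_average:.1%} of the measured bandwidth")  # roughly 64%
Multiplying the measured bandwidth by this factor gives the effective average speed, from which the transfer time for a given backup size can be estimated.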