I met the problem that when I try to use filter logs by logQL query I wait response from server about 13 seconds? Is there any way to decrease this value?
For example, request without logQL query "https://someurl/6/loki/api/v1/query_range?limit=5000&query={container=%22test%22}&start=1673902800000000000&end=1673989199999000000" I wait about 3 seconds but to 'https://someurl/6/loki/api/v1/query_range?limit=5000&query={container=%22test%22}|="message"&start=1673902800000000000&end=1673989199999000000' I got 11-13 seconds
I tried to discover google for something useful but didn't find anything
Related
EDITED:
I have a requirement to skip records that are created before 10s and 20s after if a gap in incoming data occurs.
(A gap is said to occur when the event-time1 - event-time2 > 3 seconds)
the resulting data is used to calculate average or median in a timewindow,
Is this possible to be done with Kinesis analytics, Dataflow, flink API, or some solution that works?
If I understand correctly, you want to find the median and average of records that are created between 10 and 20 seconds after a gap of at least 3 seconds.
Using Flink (or Kinesis Analytics, which is a managed Flink service), you could do that with session windows, or with a ProcessFunction. Process functions are more flexible, and are capable of handling pretty much anything you might need. However, in this case, session windows are probably simpler, especially if you are willing to wait until a session ends (i.e., until the next gap) to get the results. You could avoid this delay by implementing a custom window Trigger.
window tutorial
process function tutorial
I would like Prometheus to scrape metrics every hour and display these hourly scrape events in a table in a Grafana dashboard. I have the global scrape interval set to 1h in the prometheus.yml file. From the prometheus visualizer, it seems like Prometheus scrapes around the 43 minute mark of every hour. However, it also seems like this data is only valid for about 3 minutes: Prometheus graph
My situation, then, is this: In a Grafana table, I set the min step of a query on this metric to 1h, but this causes the table to say that there are no data points. However, if I set the min step to 5 minutes, it displays the hourly scrape events with a timestamp on the 45 minute mark. My guess as to why this happens is that Prometheus starts on the dot of some hour and steps either forward or backward by the min step.
This does achieve what I would like to do, but it also has potential for incorrect behavior if Prometheus ever does something like can been seen at the beginning of the earlier graph. I also know that I can add a time shift, but it seems like it is always relative to the current time rather than an absolute time.
Is it possible to increase the amount of time that the scrape data is valid in Prometheus without having to scrape again every 3 minutes? Or maybe tell Prometheus to scrape at the 00 minute mark of every hour? Or if not, then can I add a relative time shift to the table so that it goes from the 45 minute mark instead of the 00 minute mark?
On a side note, in the above Prometheus graph, the irregular data was scraped after Prometheus was started. I had started Prometheus around 18:30 on the 22nd, but Prometheus didn't scrape until 23:30, and then it scraped at different intervals until it stabilized around 2:43 on the 23rd. Does anybody know why?
Your data disappear because of the staleness strategy implemented in Prometheus. Once a sample has been ingested, the metric is considered stale after 5 minutes. I didn't find any configuration to change that value.
Scraping every hour is not really the philosophy of Prometheus. If your really need to scrape with such a low frequency, it could be a better idea to schedule a job sending the data to a push gateway or using a prom file fed to a node exporter (if it makes sense). You can then scrape this endpoint every 1-2 minutes.
You could also roll your own exporter that memorize the last scrape and scrape anew only if the data age exceeds one hour. (That's the solution I would prefer)
Now, as a quick solution you can request the data over the last hour and average on it. That way, you'll get the last (old) scrape taken into account:
avg_over_time(old_metric[1h])
It should work or have some transient incorrect values if there is some jitters in the scheduling of the scrape.
Regarding the issues you had about late scraping, I suspect the scraping failed at those dates. Prometheus retries only at the next schedule (1h in your case).
If the metric is scraped with intervals exceeding 5 minutes, then Prometheus would return gaps to Grafana because of staleness mechanism. These gaps can be filled with the last raw sample value by wrapping the queried time series into last_over_time function. Just specify the lookbehind window in square brackets, which equals or exceeds the interval between samples. For example, the following query would fill gaps for my_gauge time series with one hour interval between samples:
last_over_time(my_gauge[1h])
See these docs for time durations format, which can be used in square brackets.
I implemented exports.get = function(request,response) of a custom api on a mobile service of azure. I download 5 thousands records from the rest service and then i prepare the json for the output. The problem is that the time of downloading of all records is too long, for that script exceeds the default timeout of 30 secs. I was thinking if there is a way to increase the timeout of the response.
I don't believe you can have a timeout greater than 30 seconds, as I have encountered this problem myself with azure custom APIs. According to this link https://msdn.microsoft.com/en-us/library/azure/dd894042.aspx, Table operations are limited to 30 seconds, but it's not clear if that applies to custom apis, but it certainly appears to be.
What I would recommend is to implement pagination and return a limited number of records at a time. Your parameters should include the the start index and amount of records to return, and your response should include how many records in total so you can determine how many records to fetch with each request.
For example- I need to check that for 1000 users it responds in 3 seconds.
Is the number of users and response times configurable?
This answer targets Gatling 2.
You can set the target number of users by configuring the "injection profile" of your simulation:
setUp(scn.inject(atOnceUsers(1000)) // To start all users at the same time
setUp(scn.inject(rampUsers(1000) over (30 seconds) // To start gradually, over 30 seconds
For more information, please refer to the injection DSL documentation
If you want to check that all your users responds in less than 3 seconds, one way to ensure this is Gatling's assertions:
setUp(...).assertions(global.responseTime.max.lessThan(3000))
If this assertion fails, meaning at least one user responded in more than 3 seconds, Gatling will clearly indicate the failure after your simulation is completed.
We have an script that calls a procedure that sometimes takes more than 60 seconds to execute. When the procedure goes above 60 secs, we get a DeadlineExceededError. The error message looks like this:
DeadlineExceededError: The API call rdbms.Exec() took too long to respond and was cancelled.
This script is scheduled as a cron and using B8 class back end. Does using a back end have any effect on the timeout limit of a query? What can I do in case of a query taking more than 60 secs to execute?
Yes, using a cron or backend would give you a longer deadline. See this FAQ entry:
https://developers.google.com/cloud-sql/faq#sizeqps
There's obviously a bug, because I run query on backend instance and it also raises DeadlineExceededError after 60 seconds.
Here's the ticket in googleappengine https://code.google.com/p/googleappengine/issues/detail?id=9479 and on googlecloudsql https://code.google.com/p/googlecloudsql/issues/detail?id=71
No response yet.