Getting Wrong Percentile response time for Gatling Graphite Integration - grafana

I read the official Gatling blog and understood that we cannot get reliable 95th or 99th percentiles: "All aggregations will result in computing averages on percentiles and will inherently be broken". What I don't understand is why Gatling does not send a plain response-time series that we could query to get any percentile we want. Without correct percentiles, such an integration is worthless.
Is there any way to get close to the timings in the Gatling reports in InfluxDB, if not an exact match?

Gatling does not send the response time of each individual request to InfluxDB. Sending complete information about every request would significantly reduce the performance of both Gatling and InfluxDB; you would not have enough RAM for a test with a load of even 100 transactions per second. Instead, Gatling sends response times aggregated over the time interval you specify in gatling.conf.
Because of this, Grafana cannot show the actual value of the 90th percentile. There are utilities for properly aggregating Gatling logs, such as this one: https://github.com/codefiler/gatling-analyser
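To see why averaging per-interval percentiles goes wrong, here is a small Python sketch with made-up numbers (the two intervals and their response times are purely illustrative, not Gatling output). It compares the average of the per-interval 95th percentiles with the 95th percentile of the raw data:

```python
import math
import statistics


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    ordered = sorted(values)
    rank = max(1, math.ceil(len(ordered) * pct / 100))
    return ordered[rank - 1]


# Hypothetical data: a quiet interval with few fast requests,
# and a busy interval with many slow requests (times in ms).
quiet_interval = [100] * 10
busy_interval = [1000] * 990

# What a dashboard effectively does when it averages the reported percentiles.
averaged_p95 = statistics.mean(
    [percentile(quiet_interval, 95), percentile(busy_interval, 95)]
)

# What the report computes from the raw response times.
true_p95 = percentile(quiet_interval + busy_interval, 95)

print(averaged_p95, true_p95)
```

The averaged value weights both intervals equally even though one of them holds 99% of the requests, so it lands far below the percentile of the raw data.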

Related

Google Indexing API rateLimitExceeded

I have a Spring Batch process that submits around 5M URLs to the Google Indexing API. Previously, the process was segmented and parallelized into two threads by an attribute, one thread for the small segments and one for the bigger ones. A few days ago it was refactored to submit requests as they come from a query response (sorted by priority, ignoring the previous segmenting attribute, and using a single thread). After that refactoring, I started getting a "rateLimitExceeded" error from the Google API. I have (by contract) 5M requests a day, and I'm submitting batches of 500 URLs at a time. The average sending time is around 1.2 seconds per 500-URL batch.
Does anybody know what may be causing this error?
I did not do the math, but if you are getting this exception, it means you are exceeding the limit. Depending on where you make the API call (i.e., in the item writer or in an item processor), you can do the math and delay the call as needed with a listener so that you do not exceed the limit.
You can find a similar question/answer here: Spring batch writer throttling
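For what it's worth, the math could look like the Python sketch below. It assumes that every URL counts as one request against the 5M/day quota and that the quota should be spread evenly over the day; the question does not say whether a shorter-window quota also applies. The function names and the `send` callable are hypothetical, and in Spring Batch the same pause would live in a listener or in the writer.

```python
import time

DAILY_QUOTA = 5_000_000          # requests per day, from the question
BATCH_SIZE = 500                 # URLs per batch, from the question
SECONDS_PER_DAY = 24 * 60 * 60


def min_seconds_per_batch(daily_quota=DAILY_QUOTA, batch_size=BATCH_SIZE):
    """Shortest batch interval that keeps an even pace within the daily quota."""
    return batch_size * SECONDS_PER_DAY / daily_quota


def pause_needed(elapsed_seconds, min_interval):
    """Extra wait after a batch that took elapsed_seconds to send."""
    return max(0.0, min_interval - elapsed_seconds)


def send_throttled(batches, send, min_interval,
                   sleep=time.sleep, clock=time.monotonic):
    """Send each batch, then pause so batches are at least min_interval apart."""
    for batch in batches:
        started = clock()
        send(batch)  # hypothetical callable that performs the API call
        sleep(pause_needed(clock() - started, min_interval))
```

Under those assumptions, an even pace is one 500-URL batch every 8.64 seconds, so a batch that takes 1.2 seconds to send would be followed by a pause of about 7.4 seconds.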

Stream kinesis Analytics ETL Flink - skip records before and after a delay

EDITED:
I have a requirement to skip records created in the 10 s before and the 20 s after a gap in the incoming data.
(A gap is said to occur when event-time1 - event-time2 > 3 seconds.)
The resulting data is used to calculate an average or median in a time window.
Can this be done with Kinesis Analytics, Dataflow, the Flink API, or some other solution that works?
If I understand correctly, you want to find the median and average of records that are created between 10 and 20 seconds after a gap of at least 3 seconds.
Using Flink (or Kinesis Analytics, which is a managed Flink service), you could do that with session windows, or with a ProcessFunction. Process functions are more flexible, and are capable of handling pretty much anything you might need. However, in this case, session windows are probably simpler, especially if you are willing to wait until a session ends (i.e., until the next gap) to get the results. You could avoid this delay by implementing a custom window Trigger.
window tutorial
process function tutorial
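To pin down the rule before choosing between session windows and a process function, here is a plain Python sketch of one possible reading of the requirement: drop every record whose event time falls within 10 s before the start of a gap or within 20 s after its end. This is not Flink code, the function name and parameters are made up for illustration, and it works on a finished list of event times rather than a stream.

```python
def drop_around_gaps(event_times, gap=3.0, before=10.0, after=20.0):
    """Return the event times that are not near a gap.

    A gap is two consecutive events more than `gap` seconds apart.
    Events from `before` seconds ahead of the gap start up to `after`
    seconds past the gap end are dropped.
    """
    ordered = sorted(event_times)
    excluded = []
    for prev, cur in zip(ordered, ordered[1:]):
        if cur - prev > gap:
            excluded.append((prev - before, cur + after))
    return [
        t for t in ordered
        if not any(low <= t <= high for low, high in excluded)
    ]


# One event per second, with nothing between t=15 and t=20.
times = list(range(0, 16)) + list(range(20, 46))
kept = drop_around_gaps(times)
print(kept)
```

In a streaming job the "10 s before" part is the awkward bit, because records have to be buffered until it is known that no gap follows them; that is where keyed state and timers in a process function would come in.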

Spark Dataframe.write() completed percentage

I am trying to write a DataFrame to a file. Because the DataFrame is quite large and the write runs for a long time, I want to know the status of the write operation as a progress percentage.
myDataFrame
    .filter(myFilter)
    .write
    .json(ExportPath)
Is there any way to know the percentage of data written to file?
Or at least get the number of partitions that have completed individually?
For a quick manual check, you can check the processed amount of data in the Spark UI. For a more automated way of accessing the data, either the REST API or the Metrics library is helpful.
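As a rough sketch of the REST API route in Python: the stages endpoint of the monitoring API returns one JSON object per stage, and task counts can stand in for progress, since a write runs one task per partition. The field names `status`, `numTasks` and `numCompleteTasks` are what I recall from the monitoring documentation, so check them against your Spark version; the base URL and application id in the usage comment are placeholders.

```python
import json
from urllib.request import urlopen


def fetch_stages(base_url, app_id):
    """Fetch the stage list from the Spark monitoring REST API."""
    with urlopen(f"{base_url}/api/v1/applications/{app_id}/stages") as resp:
        return json.load(resp)


def active_progress(stages):
    """Percentage of completed tasks across active stages, or None if idle."""
    active = [s for s in stages if s.get("status") == "ACTIVE"]
    total = sum(s["numTasks"] for s in active)
    done = sum(s["numCompleteTasks"] for s in active)
    return 100.0 * done / total if total else None


# Usage against a running driver (placeholder URL and id):
# stages = fetch_stages("http://localhost:4040", "<app-id>")
# print(active_progress(stages))
```

The number is only an approximation of "data written", because tasks can differ a lot in size, but the completed-task count also covers the "how many partitions are done" part of the question.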

rate limit policy on queries to Azure Insights REST API for Events (Audit Logs)

I have some questions regarding the Azure Insights REST API for Events.
When I make an HTTP request to the Insights API for events, I receive the header "x-ms-ratelimit-remaining-subscription-reads" with the value "14999".
But the next query, 1 s later, returns the same value of remaining reads.
I see there is some throttling policy there, but I would like to understand how it works and what the correct way to deal with it is.
In particular,
1) how many reads am I able to do per second?
2) if I exhaust the remaining reads, how long should I wait before the counter is back at its maximum?
3) is it decreased on every query attempt, regardless of the $top parameter and of how many results are returned?
Thank you!
This article seems to have the answers you need.
To answer the questions based on it:
1) There is no limit on the number of requests per second, but you have 15k requests per hour, per subscription, per region, per instance of the ARM region. In the worst case you will be throttled after 15k requests, but you would have to be extremely unlucky for that.
2) If you exceed the limit, you are told how long you have to wait, and you can build that logic around the Retry-After header. Happily, it's a matter of seconds.
3) I believe the $top parameter doesn't affect the count, since no matter how many results are brought back, a paging request is still just one request.
As for getting 14999 remaining requests multiple times: as the documentation says, this is expected, since an ARM region has multiple instances and each instance has a limit of 15k requests per subscription per hour. If you send requests simultaneously and get the same remaining number, it just means that you were lucky enough to hit different instances within the same ARM region.
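The Retry-After logic could be sketched in Python like this. It assumes the throttled response comes back with HTTP status 429 and that Retry-After holds a number of seconds; if the header is missing or not numeric, the sketch falls back to a default wait. The function name and the default are made up, and `headers` is treated as a plain dict, so the lookup is case-sensitive here.

```python
def seconds_to_wait(status_code, headers, default=1.0):
    """How long to pause before retrying, based on the response."""
    if status_code != 429:
        return 0.0  # not throttled, no wait needed
    value = headers.get("Retry-After")
    try:
        return max(0.0, float(value))
    except (TypeError, ValueError):
        # Header missing or not a plain number of seconds.
        return default


print(seconds_to_wait(429, {"Retry-After": "17"}))
```

The caller would sleep for the returned number of seconds and then repeat the same request.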
1) how many reads I am able to do per second?
Based on the rate limits published here - https://azure.microsoft.com/en-in/documentation/articles/azure-subscription-service-limits/#subscription-limits, you can perform 15000 reads per hour (that averages to about 4 reads per second, but I'm not sure the limit is actually applied per second).
2) if I exceed the whole remaining reads parameter, how much time should I wait before it will again be maximum?
Given that the rates are defined per hour, my guess would be to wait until the next hour if you exhaust the 15000-read limit.
3) is it decreased on every query attempt, despite of the $top parameter set and how many results have been returned?
This is based on the number of API calls and not the amount of data returned, so I would say the $top parameter should not have any impact on this.
When I make HTTP request to Insights API for events, I receive the header "x-ms-ratelimit-remaining-subscription-reads", with value "14999". But next query in 1s returns me the same value of remaining reads.
I would assume there's some caching in play here. Is it the same request you're repeating or a different request altogether?

What does server throughput mean

If the throughput increases, how will the response and request times change, given that I have the data (requests/min)?
JMeter's definition of throughput can be seen here: https://jmeter.apache.org/usermanual/glossary.html
Basically, it's a measure of how many requests JMeter was able to send to your test site/application in one second or, in other words, the number of requests that your test site/application was able to receive from JMeter in one second. An increase in throughput means your site/application was able to receive more requests per second, while a decrease means a reduction in the number of requests it handled per second.
The relationship between throughput and response/request time depends on the situation, as ysth stated. I typically use this number to gauge the load on the server, but I run the test several times (at least 30) and take the average.
There's not necessarily a relationship. Can you tell us anything more about why you want to know this, what you plan to do with the information, etc.? It may help get you an answer better suited to your needs.
After completing development of a project, we as developers are responsible for testing the performance of the application.
As part of performance testing, we have to check:
1) Response time of the application
2) Bottlenecks of the application
3) Throughput of the application
Throughput of the application:
In general, it is the 'request capacity of the application in a given time.'
As per the Apache JMeter doc:
Throughput is calculated as requests/unit of time. The time is calculated from the start of the first sample to the end of the last sample. This includes any intervals between samples, as it is supposed to represent the load on the server.
The formula is: Throughput = (number of requests) / (total time).
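To make the formula concrete, here is a small Python sketch that applies it to a list of samples. The `(start_time, elapsed)` tuple layout and the sample values are made up for illustration; JMeter's own result files use different field names.

```python
def throughput(samples):
    """Requests per second, from (start_time_s, elapsed_s) samples.

    Total time runs from the start of the first sample to the end of
    the last one, so idle intervals between samples are included.
    """
    first_start = min(start for start, _ in samples)
    last_end = max(start + elapsed for start, elapsed in samples)
    total_time = last_end - first_start
    if total_time <= 0:
        raise ValueError("samples must span a positive amount of time")
    return len(samples) / total_time


# Four requests of 0.5 s each, the last one starting after a long pause.
samples = [(0.0, 0.5), (1.0, 0.5), (2.0, 0.5), (9.5, 0.5)]
per_second = throughput(samples)
print(per_second, per_second * 60)  # requests/s and requests/min
```

Here the four requests span 10 seconds in total, so the throughput is 0.4 requests per second (24 per minute), even though each request on its own took only half a second. That is also why throughput and response time do not have to move together.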