VSTS Load Test: single request by many users over time - azure-devops

I have an end-point, let's call it https://www.ajax.org/api/v1/offers.
The scenario is that 80,000 users will access this end-point once each, and they will all make this one request within 60 minutes.
How exactly do you model this in a VSTS Load Test?
Thanks in advance!

Create a ".webtest" that does the request.
The load of 80000 requests in one hour is about 1333 per minute, which is about 22 per second. (Check: 22 * 60 * 60 = 79200 and 23 * 60 * 60 = 82800, so 22 or 23 is about right.) If each request takes on average one second then you will need about 23 virtual users (VUs) to create the total load. If each request takes on average two seconds then you would need about 46 VUs. (Check: (46 / 2) * 60 * 60 = 82800 and (45 / 2) * 60 * 60 = 81000, so still about right.) Even though there is only one test you must still specify a test mix, so use "Test mix based on number of tests started".
Once the average request time is known when under load then its value can be used in the style above to set the required number of VUs.
Another approach starts with the above sums to find the minimum number of VUs but uses a "Test mix based on user pace". Suppose we specify 100 VUs (normally considered a modest load). Then we need each VU to run 80000/100 = 800 webtests per hour, and we just specify that 800 in the test mix window. On reflection this may be the better approach, but I think the analysis above is still useful.
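The sums above can be checked with a few lines of arithmetic (a quick sketch; the 80000-requests-in-one-hour figure is from the question, and the 1 s and 2 s response times are the same assumptions used above):

```python
# Sizing arithmetic for 80000 requests spread evenly over one hour.
total_requests = 80000
duration_s = 60 * 60

rps = total_requests / duration_s            # ~22.2 requests per second
print("required rate: %.1f req/s" % rps)

# Concurrency = arrival rate * average response time (Little's law),
# which is the "number of VUs" estimate used above.
for avg_response_s in (1, 2):
    print("avg %ds/request -> ~%.0f VUs" % (avg_response_s, rps * avg_response_s))

# Pacing approach: fix the VU count and derive the per-VU pace instead.
vus = 100
tests_per_vu_per_hour = total_requests / vus  # 800, the value entered in the test mix
print("%d VUs -> %.0f webtests/hour each" % (vus, tests_per_vu_per_hour))
```

Either way the same arrival rate is produced; the pacing variant just moves the tuning knob from "number of VUs" to "tests per VU per hour".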
To simulate 80000 different users ensure that the "Percentage of new users" is 100 in the scenario properties.
If you want exactly 80000 requests in the run then specify that as the "Number of iterations" in the "Run settings" along with "Use test iterations" set to "true". If you want about 80000 then I recommend setting "Use test iterations" to "false" and giving a "Run duration" of one hour.

Related

How to implement decreasing Gatling users during some time without spikes in users arrival?

I have a ramp-down test that decreases the number of users over time in Gatling: for example, 5 fewer users every minute, going from 15 users down to 5. For the user injection I use constantConcurrentUsers():
setUp(
  Test.inject(
    (15 to 5 by -5)
      .map(i => constantConcurrentUsers(i) during 60)
  )
)
Screenshot with constantConcurrentUsers()
But users arrive with some spikes. I want it to be more constant, like with atOnceUsers() where there are no spikes. Is there another way to do this in Gatling?
But users arrive with some spikes.
This is just a consequence of the chart displaying the "active users" metric, not the "concurrent users" one. See the doc.

How to interpret LocustIO's output / simulate short user visits

I like Locust, but I'm having a problem interpreting the results.
e.g. my use case is that I have a petition site. I expect 10,000 people to sign the petition over a 12 hour period.
I've written a locust file that simulates user behaviour:
Some users load but don't sign petition
Some users load and submit invalid data
Some users (hopefully) successfully submit.
In real life the user now goes away (because the petition is an API not a main website).
Locust shows me things like:
with 50 concurrent users the median time is 11s
with 100 concurrent users the median time is 20s
But as one "Locust" just repeats the tasks over and over, it's not really like one user. If I set it up with a swarm of 1 user, then that still represents many real world users, over a period of time; e.g. in 1 minute it might do the task 5 times: that would be 5 users.
Is there a way I can interpret the data ("this means we can handle N people/hour"), or some way I can see how many "tasks" get run per second or minute? (i.e. Locust gives me requests per second but not tasks.)
Tasks don't really exist at the logging level in Locust.
If you want, you could log your own fake samples, and use that as your task counter. This has an unfortunate side effect of inflating your request rate, but it should not impact things like average response times.
Like this:
from locust.events import request_success
...
@task(1)
def mytask(self):
    # do your normal requests, then fire one fake sample per completed task
    request_success.fire(request_type="task", name="completed", response_time=None, response_length=0)
Here's the hacky way that I've got somewhere. I'm not happy with it and would love to hear some other answers.
Create class variables on my HttpLocust (WebsiteUser) class:
WebsiteUser.successfulTasks = 0
Then on the UserBehaviour taskset:
@task(1)
def theTaskThatIsConsideredSuccessful(self):
    WebsiteUser.successfulTasks += 1
    # ...do the work...

# This runs once regardless of how many 'locusts'/users hatch
def setup(self):
    WebsiteUser.start_time = time.time()
    WebsiteUser.successfulTasks = 0

# This runs for every user when the test is stopped.
# I could not find another method that did this (tried various combos);
# it doesn't matter much, you just get N copies of the result!
def on_stop(self):
    took = time.time() - WebsiteUser.start_time
    total = WebsiteUser.successfulTasks
    avg = took / total
    hr = 60 * 60 / avg
    print("{} successful\nAverage: {}s/success\n{} successful signatures per hour".format(total, avg, hr))
And then set a zero wait_time and run till it settles (or failures emerge) and then stop the test with the stop button in the web UI.
Output is like
188 successful
0.2738157498075607s/success
13147.527132862522 successful signatures per hour
I think this therefore gives me the maximum conceivable throughput that the server can cope with (determined by increasing the number of users hatched until failures emerge, or until the average response time becomes unbearable).
Obviously real users would have pauses, but that makes it harder to test the maximums.
Drawbacks
Can't use distributed Locust instances
Messy; also can't 'reset' - have to quit the process and restart for another test.

Jmeter Throughput shaping timer Questions

So, I am using JMeter's Throughput Shaping Timer to test the performance of our REST server. I noticed a few things I did not expect.
First of all, my setup details:
1) JMeter version: 3.0 r1743807
2) JMX file: DropBox link
Now, my questions:
1) The Throughput Shaping Timer is configured to run for 60 seconds (100 rps for the first 30 seconds, 200 rps for the next 30 seconds). But the actual test runs for only 3 seconds. Why?
2) As per the plan, the number of requests per second should go from 100 to 200. But here it seems to decrease.
3) As per the plugin's documentation, the number of threads = desired requests per second * server response time / 1000. Is this because of how the plugin works internally, or is it simple logic I am missing?
The issue is with the Thread Group settings.
You have only 1 iteration and ramp up 300 users in 1 second. So if JMeter can send all 300 requests and get the responses, it will finish the test immediately. The timer settings apply only while the test is running.
If you need the test to run for some duration (say 60 seconds), then set the loop count to "forever" and the duration to 60.
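The thread-count rule of thumb from the plugin's documentation (question 3 above) is just Little's law applied to the Thread Group; a quick sketch, with example numbers of my own choosing:

```python
# Rule of thumb: threads needed = target RPS * average server response time (ms) / 1000.
# Each thread can only issue a new request once the previous response arrives,
# so concurrency must cover (rate * time-in-flight).
def threads_needed(target_rps, response_time_ms):
    return target_rps * response_time_ms / 1000.0

# e.g. to sustain 200 req/s against a server answering in ~1500 ms,
# the Thread Group must provide at least this many threads:
print(threads_needed(200, 1500))  # 300.0
```

If the Thread Group has fewer threads than this, the shaping timer cannot reach the requested rate no matter how it delays requests.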

What do we mean by "top percentile" or TP based latency?

When we discuss the performance of a distributed system we use terms like tp50, tp90, tp99.99, and TPS.
Could someone explain what we mean by those?
tp90 is the maximum time under which 90% of requests have been served.
Imagine you have times:
10s
1000s
100s
2s
Calculating a TP value is very simple:
sort all times in ascending order: [2s, 10s, 100s, 1000s]
find the last item in the portion you need. For TP50 it is ceil(4 * 0.5) = 2, so you need the 2nd request. For TP90 it is ceil(4 * 0.9) = 4, so you need the 4th request.
take the time of the item found above: TP50 = 10s, TP90 = 1000s.
Say we are referring to the performance of an API: TP90 is the maximum time under which 90% of requests have been served.
TPx: Max response time taken by xth percentile of requests.
time taken by 10 requests in ms [2,1,3,4,5,6,7,8,9,10] - there are 10 response times
TP100 = 10
TP90 = 9
TP50 = 5
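The three steps above translate directly into code; a minimal sketch (the function name is mine), checked against both examples:

```python
import math

def tp(times, pct):
    # Max time under which pct% of the requests have been served.
    ordered = sorted(times)                     # step 1: sort ascending
    rank = math.ceil(len(ordered) * pct / 100)  # step 2: 1-based rank of the last item in the percentile
    return ordered[rank - 1]                    # step 3: take that item's time

print(tp([10, 1000, 100, 2], 50))                 # 10
print(tp([10, 1000, 100, 2], 90))                 # 1000
print(tp([2, 1, 3, 4, 5, 6, 7, 8, 9, 10], 90))    # 9
```

Note that with only 4 samples, TP90 degenerates to the worst observed time; percentiles only become meaningful with a reasonably large sample.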

Quartz triggers are firing ahead of time

We are using Quartz 2.1.5 on 64-bit machines (clustered, 2 instances, 16 GB RAM). We have around 8000 triggers in the system.
Around 50 triggers fire every second.
org.quartz.threadPool.threadCount = 50
org.quartz.scheduler.batchTriggerAcquisitionMaxCount=100
org.quartz.scheduler.idleWaitTime=15000
#org.quartz.scheduler.batchTriggerAcquisitionFireAheadTimeWindow=0 (this is not set)
Quartz is able to handle the load, but why do triggers get fired ahead of time?
batchTriggerAcquisitionMaxCount - can we increase it to 500 and keep batchTriggerAcquisitionFireAheadTimeWindow at 1000 (1 sec)? Is there any disadvantage to this configuration?
Any other way?
With the following configuration, it seems to work fine:
org.quartz.threadPool.threadCount = 100
org.quartz.scheduler.batchTriggerAcquisitionMaxCount=500
org.quartz.scheduler.batchTriggerAcquisitionFireAheadTimeWindow=1000
org.quartz.scheduler.idleWaitTime=25000
When Quartz wants to run your triggers it calls this method:
triggers = qsRsrcs.getJobStore().acquireNextTriggers(now + idleWaitTime, Math.min(availThreadCount, qsRsrcs.getMaxBatchSize()), qsRsrcs.getBatchTimeWindow());
Where :
idleWaitTime is org.quartz.scheduler.idleWaitTime
availThreadCount is the number of free threads (will be less or equal to org.quartz.threadPool.threadCount)
qsRsrcs.getMaxBatchSize() is org.quartz.scheduler.batchTriggerAcquisitionMaxCount
qsRsrcs.getBatchTimeWindow() is org.quartz.scheduler.batchTriggerAcquisitionFireAheadTimeWindow
This leads to an SQL request like:
SELECT * FROM TRIGGERS WHERE NEXT_FIRE_TIME <= now + idleWaitTime + qsRsrcs.getBatchTimeWindow() LIMIT Math.min(availThreadCount, qsRsrcs.getMaxBatchSize())
So yes, Quartz always acquires triggers ahead of time, within idleWaitTime + qsRsrcs.getBatchTimeWindow(). The smallest look-ahead you can get is with getBatchTimeWindow set to zero and idleWaitTime set to 1000 (its minimal value); even then Quartz will acquire triggers due up to 1 second in the future, in addition to those that are already due.
If you want to stop acquiring triggers ahead of time completely, you can set batchTriggerAcquisitionMaxCount to 1. The downside is that you may end up issuing too many SQL requests. You can experiment with the batchTriggerAcquisitionMaxCount parameter and find the value that suits you best.
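The acquisition rule in the pseudo-SQL above can be sketched like this (a toy model, not Quartz code; times are in milliseconds relative to "now", and the trigger times are made up for illustration):

```python
def acquirable(next_fire_times, now_ms, idle_wait_ms, batch_time_window_ms,
               avail_threads, max_batch_size):
    # Mirrors: NEXT_FIRE_TIME <= now + idleWaitTime + batchTimeWindow
    #          LIMIT min(availThreadCount, maxBatchSize)
    cutoff = now_ms + idle_wait_ms + batch_time_window_ms
    due = sorted(t for t in next_fire_times if t <= cutoff)
    return due[:min(avail_threads, max_batch_size)]

triggers = [500, 16000, 30000]  # next fire times, ms from now

# Minimal look-ahead: window = 0 and idleWaitTime = 1000 still acquires
# anything due within the next second.
print(acquirable(triggers, 0, 1000, 0, 50, 100))      # [500]

# idleWaitTime = 15000 plus a 1000 ms window reaches 16 s ahead.
print(acquirable(triggers, 0, 15000, 1000, 50, 100))  # [500, 16000]
```

Setting max_batch_size (batchTriggerAcquisitionMaxCount) to 1 caps each acquisition at a single trigger, which is why it eliminates firing ahead at the cost of more database round trips.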
BTW, looking at the Quartz code you can see that setting batchTriggerAcquisitionMaxCount bigger than threadCount makes no sense.