How to retry a DynamoDB write when throttled? - scala

I am trying to write large amounts of data to DynamoDB using AmazonDynamoDBAsyncClient, and I am trying to understand what the best practice for handling throttling is.
For example, I have a provisioned capacity of 3000 writes, and at a given moment I have, let's say, 100,000 records I'd like to write. I don't need them all in immediately, but I am trying to figure out the best way to get them in.
This application runs in a distributed environment, so there may be 5 executors all trying to do this at the same time. Is the best way to handle this to sleep the write process whenever we hit the throttle, as below? Or should I be doing something to avoid the throttling completely? And is my code even doing what I think it is, namely retrying the data after waiting a second?
try {
  amazonDynamoAsyncDb.updateItemAsync(updateRequest)
} catch {
  case e: ThrottlingException =>
    Thread.sleep(1000)
    // retry here, but how?
}

The AWS SDK for Java will retry throttled requests 10 times by default before throwing a ProvisionedThroughputExceededException, so in most cases you should not need to catch and retry by hand.
If your items are small (1 KB or less) and you are performing the writes from EC2 in the same region as your table, you can assume each write will take around 10 ms. That means each thread of processing can do about 100 writes per second, so to scale your writes to 3000 per second you would need 30 threads and 30 HTTP connections. 3000 small (1 KB) writes per second translates to a data throughput of roughly 2.9 MB per second, so for this write load it does not appear that EC2 hardware could become a bottleneck. I recommend you do some measurements to figure out how long it takes to write each of your items on average, and scale your threads and HTTP connections appropriately.
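If the default 10 retries are not enough for your bursts, you can raise the SDK's retry count when constructing the client rather than sleeping by hand. A minimal sketch, assuming the AWS SDK for Java v1 and the default credentials chain (the retry count of 20 is an arbitrary value to tune, not a recommendation):

import com.amazonaws.ClientConfiguration
import com.amazonaws.auth.DefaultAWSCredentialsProviderChain
import com.amazonaws.services.dynamodbv2.AmazonDynamoDBAsyncClient

// Let the SDK retry throttled requests itself; 20 is an assumed value.
val config = new ClientConfiguration().withMaxErrorRetry(20)
val amazonDynamoAsyncDb =
  new AmazonDynamoDBAsyncClient(new DefaultAWSCredentialsProviderChain(), config)

The SDK already applies exponential backoff between these retries, which is generally preferable to a fixed Thread.sleep(1000).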

Related

Locust eats CPU after 2-3 hours running

I have a simple HTTP server that I was load testing. This server interacts with other HTTP servers and a Cassandra DB.
I was using 100 users at 1 request/s each, so the server saw about 100 tps in total. What I noticed in the Docker stats was that the CPU usage climbed higher and higher, and after roughly 2-3 hours it reached the 90% mark and beyond. At that point I got a notice from Locust stating that the measurements may be inconsistent. But the latencies did not increase, so I do not know why this has been happening.
Can you please suggest possible cause(s) of the problem? I would think one vCPU should handle 100 tps.
Thanks,
AM
There's no way for us to know exactly what's wrong without at the very least seeing some code, and even then other factors like the environment, the data, or the server you're running against could play a role we wouldn't know about.
It's possible there is a problem with the code for your Locust users, such as a memory leak, or they're simply doing too much for a single worker to handle that many users. For users only making simple HTTP calls, a single CPU can typically handle upwards of thousands of requests per second; do anything much heavier than that and what a single worker can handle will drop. It's also possible you simply need a more powerful CPU (or more RAM or bandwidth) to run at the scale you want.
Do some profiling to see if you can find any inefficiencies in your code. Run smaller tests to see whether the same behavior shows up at lower loads. Run the same load but with additional Locust workers on other CPUs.
It's also just as possible your DB can't handle the load. The increasing CPU usage could come from how your code waits on responses from the DB: perhaps the DB can sustain, say, 80 users at an acceptable rate, but any additional users make it fall further and further behind, so your Locust users end up waiting longer and longer for the requested data.
For more suggestions, check out the Locust FAQ https://github.com/locustio/locust/wiki/FAQ#increase-my-request-raterps

Server Throughput definition ambiguity

Is throughput the maximum number of requests a server instance can handle, or is it the number of requests that the server instance is currently handling?
Edit: By "currently handling" I mean the number of requests the server is receiving over a given recent time interval. For example: the server is currently handling 400 requests every minute.
For example, I might have a server instance with powerful hardware and therefore a high potential throughput, but it might be receiving only a small amount of traffic. What does throughput measure in that situation? And what about the inverse case, i.e. my instance can only handle x requests per minute but is receiving y >> x requests per minute?
If throughput is the maximum number of requests a server can handle, how is it measured? Do we run a load/stress test where we keep increasing the requests per minute on the server until it cannot handle them anymore?
No, throughput is an aggregate that depends on execution time: your server might fail to handle 1000 requests sent within the same second, yet handle 1000 requests spread over an hour without any trouble.
Throughput is calculated as requests/unit of time. The time is calculated from the start of the first sample to the end of the last sample. This includes any intervals between samples, as it is supposed to represent the load on the server.
The formula is: Throughput = (number of requests) / (total time).
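As a minimal sketch of that calculation (the millisecond timestamps and the Sample shape are assumptions for illustration):

case class Sample(startMs: Long, endMs: Long)

// Total time runs from the start of the first sample to the end of the
// last one, including any gaps between samples.
def throughputPerSec(samples: Seq[Sample]): Double = {
  val totalSec = (samples.map(_.endMs).max - samples.map(_.startMs).min) / 1000.0
  samples.size / totalSec
}

// e.g. 400 requests completed over 60 seconds -> ~6.67 requests/second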
You want to find the number of concurrent users that your server can handle by increasing JMeter threads until the server reaches its maximum.
Throughput is the number of Samplers that JMeter executes within the duration of your test. If you want to see the actual number of requests being sent, consider using e.g. the Server Hits Per Second listener (it can be installed using the JMeter Plugins Manager).
If you see that your server's resource consumption doesn't increase as you increase the load in JMeter, the likely reasons are:
Your application middleware configuration is not suitable for high load (e.g. there is a limit on concurrent sessions and requests are queuing up instead of being processed); check out Web Performance Optimization: Top 3 Server and Client-Side Performance Tips for overall ideas on what to look at.
Your application code doesn't utilize the underlying OS resources efficiently; consider using a profiler tool to see what's going on under the hood.
JMeter may fail to send requests fast enough; make sure to follow JMeter Best Practices, and if JMeter's machine is overloaded, consider going for Distributed Testing.

Large number of connections to kdb

I have a grid with over 10,000 workers, and I'm using qpython to append data to kdb. Currently, with 1000 workers, ~40 workers fail to connect and send data on the first try, and top shows q at 100% CPU when that happens. As I scale to 10k workers, the problem will only get worse. The volume of data is only about 100 MB. I've tried running extra slaves, but kdb tells me I can't use them with the -P option, which I'm guessing I need in order to use qpython. Any ideas how to scale to support 10k workers? My current idea is to write a server in between that buffers write requests and passes them on to kdb; is there a better solution?
It amazes me that you're willing to dedicate 10,000 CPUs to Python but only a single one to kdb.
Simply run more kdb cores (on other ports), and then enable another process to receive the updates from the ingestion cores. The tickerplant (u.q) is a good model for this.
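One straightforward way to spread the connection load is to assign each worker to one of the ingestion ports deterministically. A minimal sketch, assuming each worker knows its own numeric ID and the list of ports the extra q processes listen on (both are assumptions for illustration):

// Hypothetical port list; one q ingestion process per port.
val ingestionPorts = Vector(5001, 5002, 5003, 5004)

// Each worker derives its port from its ID, so ~10,000 workers spread
// evenly across the q processes instead of piling onto one.
def portFor(workerId: Int): Int =
  ingestionPorts(workerId % ingestionPorts.size)

The receiving processes then forward their updates to a single downstream consumer, along the lines of the tickerplant model mentioned above.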

Which mongo stats to use to throttle writes

I am writing logging information asynchronously to MongoDB. Since this is a non-essential function, I am looking for a way to throttle these writes so they do not impact reads/writes from other parts of the application. Essentially: only write when a certain stat is below an acceptable level.
One stat I thought of using is "globalLock.ratio" from serverStatus. However, this does not seem to be a moving average, so it is not a good way to measure current load on the database.
What would be a good stat for what I am trying to do? Write lock % would be ideal, but how would I get a moving average from serverStatus?
There are a number of things to note about your question:
1) If you want moving averages, then you'll need to keep track of them yourself in your client program. If you're running a multi-threaded program, you could dedicate one thread to polling MongoDB at regular (1 second? 5 second?) intervals and calculating the moving average yourself; see the sketch after this list. This is the way that MMS does it.
2) When you calculate this average, you need to figure out what a 'loaded database' means to you. There could be many things to check: do you care about write lock percentage? Read percentage? I/O usage? Replication delay? Unfortunately, there is no single metric that works for all use cases at all times: you'll have to figure out what you care about and measure that.
3) Another strategy to achieve this goal is to do the writes to the logging collection with a write concern of w: 'majority' and a reasonable timeout (say 10 seconds). With that, you won't be able to write to your database faster than your replication can keep up. If you start getting timeouts, you know that you need to scale back; if you can't write fast enough to drain the queue, you start dropping log entries at that point.
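A minimal sketch of the polling idea from point 1, using the MongoDB Java driver from Scala. The connection string, the sampled metric (opcounters.insert), the 1-second interval, and the smoothing factor are all assumptions to adapt per point 2:

import com.mongodb.client.MongoClients
import org.bson.Document
import java.util.concurrent.{Executors, TimeUnit}

object LoadMonitor {
  private val admin = MongoClients.create("mongodb://localhost:27017").getDatabase("admin")
  private val alpha = 0.2                // EMA smoothing factor (assumed)
  private var lastCount = -1.0           // previous cumulative counter value
  @volatile var avgInsertsPerSec = 0.0   // the moving average your writer checks

  def start(): Unit =
    Executors.newSingleThreadScheduledExecutor().scheduleAtFixedRate(new Runnable {
      def run(): Unit = {
        val status = admin.runCommand(new Document("serverStatus", 1))
        // opcounters.insert is cumulative, so diff successive samples to get a rate.
        val count = status.get("opcounters", classOf[Document]).get("insert")
          .asInstanceOf[Number].doubleValue()
        if (lastCount >= 0)
          avgInsertsPerSec = alpha * (count - lastCount) + (1 - alpha) * avgInsertsPerSec
        lastCount = count
      }
    }, 0, 1, TimeUnit.SECONDS)
}

For point 3, the same driver lets you set the write concern on the logging collection, e.g. collection.withWriteConcern(WriteConcern.MAJORITY.withWTimeout(10, TimeUnit.SECONDS)).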

Use of MSMQ to control SQL write operations

Can I use MSMQ to reduce the number of synchronous write operations to a database and instead have the records written to the database every X number of minutes?
You can't reduce the number of write operations by queuing them, but you can use a message queue to cluster the writes together.
That might be a bit more efficient (by dint of sharing a single connection), and could also let you schedule the writes for a convenient time if you wanted to ('every X minutes' wouldn't do that, but you could perform the writes during low-usage periods).
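As a minimal sketch of that batching idea (the Record type and writeBatch stand in for your actual table and data-access code, and the 5-minute interval is an assumption):

import java.util.concurrent.{ConcurrentLinkedQueue, Executors, TimeUnit}

case class Record(payload: String)                   // hypothetical record shape
def writeBatch(batch: Seq[Record]): Unit =           // placeholder for your
  println(s"writing ${batch.size} records")          // single-connection DB write

val pending = new ConcurrentLinkedQueue[Record]()    // producers enqueue instead of writing

// Drain whatever has queued up every 5 minutes and write it in one batch.
Executors.newSingleThreadScheduledExecutor().scheduleAtFixedRate(new Runnable {
  def run(): Unit = {
    val batch = Iterator.continually(pending.poll()).takeWhile(_ != null).toSeq
    if (batch.nonEmpty) writeBatch(batch)
  }
}, 5, 5, TimeUnit.MINUTES)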
That said, the increased complexity of such an arrangement will normally outweigh the benefits; what do you really want to achieve?