How to implement a distributed rate limiter? - guava

Let's say, I have P processes running some business logic on N physical machines. These processes call some web service S, say. I want to ensure that not more than X calls are made to the service S per second by all the P processes combined.
How can such a solution be implemented?
Google Guava's Rate Limiter works well for processes running on single box, but not in distributed setup.
Are there any standard, ready-to-use solutions available for Java (perhaps based on ZooKeeper)?

Bucket4j is a Java implementation of the "token-bucket" rate-limiting algorithm. It works both locally and distributed (on top of JCache). For the distributed use case you are free to choose any JCache implementation, such as Hazelcast or Apache Ignite. See this example of using Bucket4j in a cluster.
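To make the token-bucket idea concrete, here is a minimal single-process sketch in plain Java. It is not Bucket4j's API; the class name `TokenBucket`, its constructor parameters, and the injectable clock are all made up for illustration. A distributed version would keep the same state (token count and last refill time) in a shared store instead of in fields.

```java
import java.util.function.LongSupplier;

/** Minimal single-process token bucket; the clock is injectable so the refill logic can be tested. */
class TokenBucket {
    private final long capacity;            // maximum number of tokens the bucket can hold
    private final double tokensPerSecond;   // refill rate
    private final LongSupplier nanoClock;   // returns the current time in nanoseconds
    private double tokens;
    private long lastRefillNanos;

    TokenBucket(long capacity, double tokensPerSecond, LongSupplier nanoClock) {
        this.capacity = capacity;
        this.tokensPerSecond = tokensPerSecond;
        this.nanoClock = nanoClock;
        this.tokens = capacity;              // start full so an initial burst is allowed
        this.lastRefillNanos = nanoClock.getAsLong();
    }

    /** Takes one token if available; returns false when the caller should back off. */
    synchronized boolean tryConsume() {
        long now = nanoClock.getAsLong();
        double refill = (now - lastRefillNanos) * tokensPerSecond / 1_000_000_000.0;
        tokens = Math.min(capacity, tokens + refill);
        lastRefillNanos = now;
        if (tokens >= 1.0) {
            tokens -= 1.0;
            return true;
        }
        return false;
    }
}
```

In real use you would pass `System::nanoTime` as the clock, for example `new TokenBucket(10, 5.0, System::nanoTime)` for bursts of up to 10 calls and 5 calls per second sustained.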

I have been working on an open-source solution for this kind of problem.
Limitd is a "server" for limits. The limits are implemented using the Token Bucket Algorithm.
Basically you define limits in the service configuration like this:
"request to service a":
  per_minute: 10
"request to service b":
  per_minute: 5
The service is run as a daemon listening on a TCP/IP port.
Then your application does something along these lines:
var limitd = new Limitd('limitd://my-limitd-address');
limitd.take('request to service a', 'app1', 1, function (err, result) {
  if (result.conformant) {
    console.log('everything is okay - this should be allowed');
  } else {
    console.error('too many calls to this thing');
  }
});
We are currently using this for rate-limiting and debouncing some application events.
The server is on:
We are planning to work on several SDKs, but for now we only have Node.js and a partially implemented Go one:

[...] provides distributed rate limits that should do just this. You can configure your limit as S per second, per minute, etc., and choose the burst size and refill rate of the leaky bucket that is under the covers.

Here is simple rate limiting in Java, where you want to allow at most 3 transactions every 3 seconds. If you want to centralize this, store the tokens array in ElastiCache or any database. In place of the synchronized block you will then also have to implement a lock flag.
import java.util.Date;

public class RateLimiter implements Runnable {
    // Time (ms) at which each of the 3 tokens was last handed out; 0 means never used
    private final long[] tokens = new long[3];

    public static void main(String[] args) {
        RateLimiter rateLimiter = new RateLimiter();
        for (int i = 0; i < 20; i++) {
            Thread thread = new Thread(rateLimiter, "Thread-" + i);
            thread.start();
        }
    }

    @Override
    public void run() {
        long currentStartTime = System.currentTimeMillis();
        while (true) {
            if (System.currentTimeMillis() - currentStartTime > 100000) {
                throw new RuntimeException("timed out");
            } else {
                if (getToken()) {
                    System.out.println(Thread.currentThread().getName() +
                            " at " +
                            new Date(System.currentTimeMillis()) + " says hello");
                    break; // work done, stop polling for a token
                }
            }
        }
    }

    // A token is free if it was never used or was handed out more than 3 seconds ago
    synchronized private boolean getToken() {
        for (int i = 0; i < 3; i++) {
            if (tokens[i] == 0 || System.currentTimeMillis() - tokens[i] > 3000) {
                tokens[i] = System.currentTimeMillis();
                return true;
            }
        }
        return false;
    }
}

With any distributed rate-limiting architecture, you'll need a single backend store that acts as the single source of truth for tracking the number of requests. You can always use ZooKeeper as an in-memory datastore for this out of convenience, although there are better choices, such as Redis.
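As a rough illustration of the "single source of truth" idea, here is a fixed-window counter sketch in Java. The `CounterStore` interface is hypothetical and stands in for whatever atomic increment your shared store offers; `InMemoryCounterStore` is only an in-process stand-in so the example runs without any external service. Every process would create its own `SharedWindowLimiter` pointing at the same store.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical abstraction over the shared store (Redis, ZooKeeper, a database, ...). */
interface CounterStore {
    /** Atomically increments the counter for the key and returns the new value. */
    long incrementAndGet(String key);
}

/** In-memory stand-in so the sketch runs without external services. */
class InMemoryCounterStore implements CounterStore {
    private final ConcurrentMap<String, Long> counters = new ConcurrentHashMap<>();

    @Override
    public long incrementAndGet(String key) {
        return counters.merge(key, 1L, Long::sum);
    }
}

/** Fixed-window limiter: at most maxPerWindow calls per window across all processes sharing the store. */
class SharedWindowLimiter {
    private final CounterStore store;
    private final long maxPerWindow;
    private final long windowMillis;

    SharedWindowLimiter(CounterStore store, long maxPerWindow, long windowMillis) {
        this.store = store;
        this.maxPerWindow = maxPerWindow;
        this.windowMillis = windowMillis;
    }

    /** Returns true if this call fits in the current window's budget. */
    boolean tryAcquire(String service, long nowMillis) {
        long window = nowMillis / windowMillis;      // index of the current window
        String key = service + ":" + window;         // one counter per service per window
        return store.incrementAndGet(key) <= maxPerWindow;
    }
}
```

Note that this sketch never expires old window counters, and it relies on the processes having roughly synchronized clocks. A real store would need key expiry, and a fixed window allows bursts at window boundaries that a token bucket would smooth out.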


Vertx event bus slow consuming issue

We have a non clustered vertx application, and we use the event bus to internally communicate between verticles.
Verticle A consumes from the bus, performs a HTTP request, and sends the response back through the bus.
Verticle B just requests that the HTTP request be performed.
The problem appears when a "high" request volume is performed by Verticle B. Then, the consumer starts receiving the events slower and slower (presumably because they are getting queued in the event bus). For 8 requests/second the bus takes up to 3-4 seconds to consume the event. When the requests/second are elevated, it can take more than 30 seconds to consume it, so the bus timeout is triggered.
The thing is, Verticle A is really fast performing the HTTP operation (~200ms) so I don't really understand why the requests get stuck in the bus.
We've tried several solutions, but none of them worked:
Deploy multiple instances of Verticle A as workers
Use vertx.executeBlocking() to perform the HTTP request
The only thing that worked was commenting out the HTTP request and returning a mock object through the bus. But again, the HTTP request doesn't take more than 200ms, so it shouldn't be blocking the bus.
Additional information: We use an autogenerated rest client that uses Retrofit + OkHttpClient. Due to company policy, we cannot use Vertx WebClient, so I didn't try this solution.
This is a really simplified version of our code so you can check if I'm missing something.
// Instantiated in Verticle A
public class EmailSender {
private final Vertx vertx;
private final EmailApiClient emailApiClient;
public EmailSender(Vertx vertx) {
this.vertx = vertx;
emailApiClient = ClientFactory.createEmailApiClient();
public void start() {
vertx.eventBus().consumer("sendEmail", this::sendEmail);
public void sendEmail(Message<EmailRequest> message) {
EmailRequest emailRequest = message.body();
response -> {
if (response.code() == 200) {
EmailResponse emailResponse = response.body();
} else {, "Error sending email");
// Instantiated in Verticle B
public class EmailCommunications {
private final Vertx vertx;
public EmailCommunications(Vertx vertx) {
this.vertx = vertx;
public Single<EmailResponse> sendEmail(EmailRequest emailRequest) {
SingleSubject<EmailResponse> emailSent = SingleSubject.create();
busResult -> {
if (busResult.succeeded()) {
} else {
return emailSent;
We fixed the issue by changing our OkHttpClient configuration so HTTP requests won't get stuck:
default void configureOkHttpClient(OkHttpClient.Builder okHttpClientBuilder) {
ConnectionPool connectionPool = new ConnectionPool(40, 5, TimeUnit.MINUTES);
Dispatcher dispatcher = new Dispatcher();
.readTimeout(60, TimeUnit.SECONDS)

Vertx delayed batch process

How can I process a list of delayed jobs in Vertx (actually hundreds of HTTP GET requests to a limited API that bans hosts that send requests too fast)? Right now I am using this code, and it gets blocked because Vertx starts all requests at once. I would like to process the requests with a 5-second delay between each one.
public void getInstrumnetDailyInfo(Instrument instrument,
Handler<AsyncResult<OptionInstrument>> handler) {
.addQueryParam("i", instrument.getId())
ar -> {
if (ar.succeeded()) {
String html = ar.result().bodyAsString();
Integer thatData = processHTML(html);
} else {
// error
handler.handle(Future.failedFuture("error " +ar.cause()));
public void start(){
List<Instrument> instruments = loadInstrumentsList();
instrument -> {
async -> {
instrumentMap.put(instrument.getId(), instrument);
}else {
log.warn("getInstrumnetDailyInfo: ", async.cause());
You can consider using a timer to fire events (rather than all at startup).
There are two variants in Vertx:
1. .setTimer() fires a specific event once after a delay
vertx.setTimer(interval, new Handler<Long>() {});
2. .setPeriodic() that fires every time a specified period of time has passed.
vertx.setPeriodic(interval, new Handler<Long>() {});
setPeriodic seems to be what you are looking for.
You can get more info from the documentation
For more sophisticated Vertx scheduling use-cases, you can have a look at Chime or other schedulers or this module
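The timer approach can be sketched with plain Java as well. The example below uses a `ScheduledExecutorService` as a stand-in for the Vert.x timer, and the class and method names (`PacedRunner`, `runPaced`) are made up for illustration. The idea is the same as calling `vertx.setTimer(i * delay, ...)` for the i-th item: each job gets its own start time instead of all of them being fired at once.

```java
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

class PacedRunner {
    /** Starts the job for item i roughly i * delayMillis after the call and blocks until all jobs have run. */
    static <T> void runPaced(List<T> items, long delayMillis, Consumer<T> job) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        CountDownLatch done = new CountDownLatch(items.size());
        for (int i = 0; i < items.size(); i++) {
            T item = items.get(i);
            // Each item gets its own one-shot timer, spaced delayMillis apart
            scheduler.schedule(() -> {
                try {
                    job.accept(item);
                } finally {
                    done.countDown();
                }
            }, i * delayMillis, TimeUnit.MILLISECONDS);
        }
        try {
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for jobs", e);
        } finally {
            scheduler.shutdown();
        }
    }
}
```

For the question's case you would call something like `PacedRunner.runPaced(instruments, 5000, instrument -> ...)` and issue the HTTP request inside the job. In a verticle you would not block on the latch; that part is only there to keep the sketch self-contained.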
You could use any out of the box rate limiter function and adapt it for async use.
An example with the RateLimiter from Guava:
// Make permits available at a rate of one every 5 seconds
private RateLimiter limiter = RateLimiter.create(1 / 5.0);
// A vert.x future that completes when it obtains a throttle permit
public Future<Double> throttle() {
    return vertx.executeBlocking(p -> p.complete(limiter.acquire()), true);
}
// Usage: wait for a permit, then run the job
throttle()
    .compose(d -> {
        System.out.printf("Waited %.2f before running job\n", d);
        return runJob(); // runJob returns a Future result
    });

Esper EPL window select not working for a basic example

Everything I read says this should work: I need my listener to trigger every 10 seconds with events. What I am getting now is a listener trigger for every event that comes in. What am I missing? The basic requirement is to create summarized statistics every 10s. Ideally I just want to pump data into the runtime. So, in this example, I would expect a dump of 10 records once every 10 seconds.
class StreamTest {
private final Configuration configuration = new Configuration();
private final EPRuntime runtime;
private final CompilerArguments args = new CompilerArguments();
private final EPCompiler compiler;
public StreamTest() {
runtime = EPRuntimeProvider.getRuntime(this.getClass().getSimpleName(), configuration);
compiler = EPCompilerProvider.getCompiler();
void testDisplayStatsEvery10S() throws Exception{
// Display stats every 10s about the traffic during those 10s:
EPCompiled compiled = compiler.compile("select * from", args);
(old, newEvents, epStatement, epRuntime) -> -> System.out.format("%s: received %n",
new BufferedReader(new InputStreamReader(this.getClass().getResourceAsStream("/access.log"))).lines().map(CommonLogEntry::new).forEachOrdered(e -> {
runtime.getEventService().sendEventBean(e, e.getClass().getSimpleName());
try {
} catch (InterruptedException ex) {
Which currently outputs every second, corresponding to the sleep in my stream:
11:00:54.676: received
11:00:55.684: received
11:00:56.689: received
11:00:57.694: received
11:00:58.698: received
11:00:59.700: received
A time window is a sliding window. The chapter on basic concepts in the documentation explains how windows work.
It is not clear what the requirements are but I think what you want to achieve is collecting events for a while and then releasing them. You can draw inspiration from the solution patterns.
This will collect events for 10 seconds.
create schema StockTick(symbol string, price double);
create context CtxBatch start #now end after 10 seconds;
context CtxBatch select * from StockTick#keepall output snapshot when terminated;
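To illustrate the difference between "listener fires per event" and "listener fires once per batch", here is a plain-Java sketch of the collect-then-release behavior. It does not use the Esper API; `BatchCollector` and its methods are hypothetical names. One simplification to be aware of: this sketch only releases a batch when the next event arrives after the interval has elapsed, whereas the EPL context above terminates on the engine's own clock.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Collects events and hands them to the listener once per interval instead of once per event. */
class BatchCollector<E> {
    private final long intervalMillis;
    private final Consumer<List<E>> listener;
    private List<E> buffer = new ArrayList<>();
    private long batchStart = -1;   // -1 until the first event arrives

    BatchCollector(long intervalMillis, Consumer<List<E>> listener) {
        this.intervalMillis = intervalMillis;
        this.listener = listener;
    }

    /** Adds an event; if the current batch's interval has elapsed, releases that batch first. */
    void onEvent(E event, long nowMillis) {
        if (batchStart < 0) {
            batchStart = nowMillis;
        } else if (nowMillis - batchStart >= intervalMillis) {
            listener.accept(buffer);        // release everything collected in the finished interval
            buffer = new ArrayList<>();     // start a fresh batch
            batchStart = nowMillis;
        }
        buffer.add(event);
    }
}
```

With one event per second and a 10-second interval, the listener receives lists of 10 events rather than being called for each event.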

Why does IoScheduler use a ScheduledExecutorService with a core pool size of 1?

I found that IoScheduler.createWorker() creates a NewThreadWorker immediately if there is no cached NewThreadWorker. This may result in an OutOfMemoryError.
If I submit 1000 pieces of work to IoScheduler at once, it creates 1000 NewThreadWorkers, each with its own ScheduledExecutorService.
private void submitWorkers(int workerCount) {
for (int i = 0; i < workerCount; i++) {
Single.fromCallable(new Callable<String>() {
public String call() throws Exception {
return "String-call(): " + Thread.currentThread().hashCode();
.subscribe(new Consumer<String>() {
public void accept(String s) throws Exception {
If I set workerCount to 1000, I get an OutOfMemoryError. I want to know why IoScheduler uses a NewThreadWorker with a ScheduledExecutorService just to execute a single piece of work.
Every time new work arrives, it creates a NewThreadWorker and a ScheduledExecutorService if there is no cached NewThreadWorker. Why is it designed this way?
The standard workers of RxJava each use a dedicated thread to avoid excessive thread hopping and work migration in flows.
The standard IO scheduler uses an unbounded number of worker threads because its main use is to allow blocking operations to block a worker thread while other operations can commence on other worker threads. The difference from newThread is that thread reuse is allowed once a worker is returned to an internal pool.
If there was a limit on the number of threads, it would drastically increase the likelihood of deadlocks due to resource exhaustion. Also, unlike the computation scheduler, there is no good default number for this limit: 1, 10, 100, 1000?
There are several ways to work around this problem, such as:
use Schedulers.from() with an arbitrary ExecutorService which you can limit and configure as you wish,
use ParallelScheduler from the Extensions project and define an arbitrarily large but fixed pool of workers.
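For the first workaround, the bounded executor itself is plain java.util.concurrent. The sketch below (the class and method names are made up for illustration) shows that 1000 tasks submitted to a fixed pool never use more than the configured number of threads. With RxJava you would hand such an executor to Schedulers.from() and use the result in subscribeOn().

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class BoundedPoolDemo {
    /** Runs taskCount short tasks on a pool of poolSize threads and returns how many distinct threads were used. */
    static int distinctThreadsUsed(int poolSize, int taskCount) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        Set<String> threadNames = ConcurrentHashMap.newKeySet();
        for (int i = 0; i < taskCount; i++) {
            // Tasks beyond poolSize wait in the executor's queue instead of getting a new thread
            pool.execute(() -> threadNames.add(Thread.currentThread().getName()));
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for tasks", e);
        }
        return threadNames.size();
    }

    public static void main(String[] args) {
        System.out.println("threads used: " + distinctThreadsUsed(8, 1000));
    }
}
```

The trade-off is the one described above: with a bounded pool, blocking tasks that wait on each other can deadlock once every thread is occupied, so pick the limit with your workload in mind.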

Spymemcached - Memcache/Membase Failover

Platform: 64 Bit windows OS, spymemcached-2.7.3.jar, J2EE
We want to use two memcache/membase servers for our caching solution. We want to allocate 1 GB of memory to each memcache/membase server, so in total we can cache 2 GB of data.
We are using the spymemcached Java client for setting and getting data from memcache. We are not using any replication between the two membase servers.
We load the memcacheClient object when our J2EE application starts up.
URI server1 = new URI("");
URI server2 = new URI("");
ArrayList<URI> serverList = new ArrayList<URI>();
client = new MemcachedClient(serverList, "default", "");
After that we are using memcacheClient to get and set value in memcache/membase server.
Object obj = client.get("spoon");
client.set("spoon", 50, "Hello World!");
It looks like memcacheClient is setting and getting values only from server1.
If we stop server1, it fails to get/set values. Shouldn't it use server2 when server1 is down? Please let me know if we are doing anything wrong here...
The spymemcached Java client does not handle Membase failover for a particular node.
Ref :
We need to handle it manually (in our code).
We can do this by using a ConnectionObserver.
Here is my code:
public static void main(String a[]) throws InterruptedException{
try {
URI server1 = new URI("");
URI server2 = new URI("");
final ArrayList<URI> serverList = new ArrayList<URI>();
final MemcachedClient client = new MemcachedClient(serverList, "bucketName", "");
client.addObserver(new ConnectionObserver() {
public void connectionLost(SocketAddress arg0) {
//method call when connection lost
for(MemcachedNode node : client.getNodeLocator().getAll()){
//re-init your client here; after re-init it will connect to your secondary node
public void connectionEstablished(SocketAddress arg0, int arg1) {
//method call when connection established
Object obj = client.get("spoon");
client.set("spoon", 50, "Hello World!");
} catch (Exception e) {
client.get() would use the first available node, and therefore your value would be stored/updated on one node only.
Your requirements seem a bit contradictory: first you say that 'we want to allocate 1 GB memory to each memcache/membase server so total we can cache 2 GB data', which implies a distributed cache model (a particular key is stored on one node in the cache farm), and then you expect to fetch it when that node is down, which obviously won't happen.
If you need your cache farm to survive a node failure without losing the data cached on that node, you should use replication, which is available in Membase. But you would pay the price of storing the same values multiple times, so your goal of 2 GB of total cache from two 1 GB servers won't be possible.
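A toy model makes the trade-off easier to see. The sketch below is not how spymemcached locates nodes; it assumes a simple hash-modulo placement, and `ShardedCache` with its methods is a made-up class. Each key lives on exactly one node, so taking that node down makes the key unreachable even though the other node is healthy.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy distributed cache without replication: every key is owned by exactly one node. */
class ShardedCache {
    private final List<Map<String, String>> nodes = new ArrayList<>();
    private final boolean[] up;

    ShardedCache(int nodeCount) {
        up = new boolean[nodeCount];
        for (int i = 0; i < nodeCount; i++) {
            nodes.add(new HashMap<>());
            up[i] = true;
        }
    }

    /** Index of the node that owns the key (simple hash-modulo placement, assumed for illustration). */
    int nodeFor(String key) {
        return Math.floorMod(key.hashCode(), nodes.size());
    }

    /** Stores the value on the owning node; silently does nothing if that node is down. */
    void set(String key, String value) {
        int n = nodeFor(key);
        if (up[n]) {
            nodes.get(n).put(key, value);
        }
    }

    /** Returns the cached value, or null on a miss or when the owning node is down. */
    String get(String key) {
        int n = nodeFor(key);
        return up[n] ? nodes.get(n).get(key) : null;
    }

    void setNodeUp(int node, boolean isUp) {
        up[node] = isUp;
    }
}
```

Without replication the total capacity is the sum of the nodes, but a key's availability depends on its single owner. With replication every value is stored more than once, so the usable capacity shrinks accordingly.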