RSocket Database (Cassandra or Postgresql) Performance study - reactive-programming

I am trying to use RSocket for microservices within my org. Our services communicate a lot and fetch from databases such as Cassandra and PostgreSQL, and I observed a dip in performance.
When I run a sample RSocket client and RSocket server that returns a mocked response, I get a throughput of 12k TPS. Once I integrate the same codebase with Cassandra, it drops to 300 TPS, and with PostgreSQL to 400 TPS. If I add HikariCP to PostgreSQL, it is 700 TPS.
For the same use case, blocking HTTP gives 800 TPS and non-blocking HTTP gives 1900 TPS.
This study was done on a Mac laptop with 8 cores and 16 GB of RAM, using rsocket-java (no Spring or RPC).
I am confused about whether RSocket is suited to the microservices use case.
Please share your experience and any benchmarks available for further study.

Related

grpc unary-stream with redis pubsub - degradation with too many clients

We have a Python gRPC (grpcio with asyncio) server that performs server-side streaming of data consumed from Redis Pub/Sub (using aioredis 2.x), combining up to 25 channels per stream. With low traffic everything works fine, but as soon as we reach 2000+ concurrent streams, message delivery starts falling behind.
Some setup details and what we tried so far:
The client connections to gRPC are load-balanced over a Kubernetes cluster with the Ingress-NGINX controller, and scaling (we tried 9 pods with 10 process instances each) doesn't seem to help at all, even though the load is distributed evenly.
We are running a five-node Redis 7.x cluster with 96 threads per replica.
When we connect to Redis with the CLI client while gRPC falls behind, individual channels are on time even though the gRPC streams lag.
Messages are small (40 B), with a variable rate of anywhere between 20 and 200 per second on each stream.
aioredis seems to open a new connection for each Pub/Sub subscriber, even though we use a capped connection pool for each gRPC instance.
Memory and CPU utilisation are not dramatic, and neither is network I/O, so we are not bottlenecked there.
We tried an identical setup with a very similar gRPC server written in Rust, with similar results.
@mike_t, as you mentioned in the comment, switching from Redis Pub/Sub to zmq helped resolve the issue.
ZeroMQ (also known as ØMQ, 0MQ, or zmq) is an open-source universal messaging library that looks like an embeddable networking library but acts like a concurrency framework. It gives you sockets that carry atomic messages across various transports such as in-process, inter-process, TCP, and multicast.
You can connect sockets N-to-N with patterns like fan-out, pub-sub, task distribution, and request-reply. It's fast enough to be the fabric for clustered products. Its asynchronous I/O model gives you scalable multicore applications, built as asynchronous message-processing tasks.
It has a score of language APIs and runs on most operating systems.

1 million concurrent database connections

In https://cloud.google.com/sql/docs/quotas, it is mentioned that "Cloud Run services are limited to 100 connections to a Cloud SQL database." Assuming I deploy my service on Cloud Run, what's the right way to handle 1 million concurrent connections? Can Cloud Spanner enable this? I can't find documentation discussing the maximum number of concurrent connections to Cloud Spanner from Cloud Run.
Do you want Cloud Run to handle a million concurrent connections, or do you want Cloud SQL to handle a million concurrent connections?
If you want Cloud SQL to handle a million concurrent connections, you are probably wrong. Check out this article about Pool sizing (it's on a Java repo, but is general enough to apply to all connection pooling). If you are at the point where you need a million concurrent connections, you would need to invest in more advanced architectures (such as sharding).
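To make the pool-sizing point concrete, here is a minimal sketch. It assumes the article referred to is HikariCP's "About Pool Sizing" page, which quotes the starting-point formula `connections = (core_count * 2) + effective_spindle_count`. The function names are hypothetical, and the semaphore is only a stand-in for a real connection pool: many concurrent requests share a small, fixed number of connections.

```python
import asyncio


def suggested_pool_size(core_count: int, effective_spindle_count: int = 1) -> int:
    # Starting-point formula quoted in the pool-sizing article (an assumption
    # about which article the answer links to); tune it by measurement.
    return core_count * 2 + effective_spindle_count


async def run_queries(request_count: int, pool_size: int) -> int:
    """Run many requests through a capped pool; return peak connections in use."""
    pool = asyncio.Semaphore(pool_size)  # stand-in for a real connection pool
    in_use = 0
    peak = 0

    async def handle_request() -> None:
        nonlocal in_use, peak
        async with pool:  # wait here until a connection is free
            in_use += 1
            peak = max(peak, in_use)
            await asyncio.sleep(0)  # stand-in for a real query
            in_use -= 1

    await asyncio.gather(*(handle_request() for _ in range(request_count)))
    return peak


if __name__ == "__main__":
    size = suggested_pool_size(core_count=4)
    print(size, asyncio.run(run_queries(10_000, size)))
```

However many requests arrive, the database never sees more than `pool_size` connections from this process; the remaining requests wait their turn.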

Cassandra + Redis Project Server setup

Currently I have a dedicated VPS with 4 GB of RAM and a 50 GB hard disk. I have a SaaS solution running on the server with more than 1500 customers. Now I'm going to upgrade the project's business plan: there will be 25000 customers, with about 500 to 1000 customers using the project in real time. For now it takes 5 seconds to fetch Cassandra database records from the server to the application. Then I came across Redis, which is said to fetch data much faster and lower server overhead if a copy of the data is saved to it.
Am I right about this?
If I need to improve the overall performance, can anybody tell me what I need to upgrade?
Can a server with the configuration described above handle Cassandra and Redis together?
Thanks in advance .
A machine with 4GB of RAM will probably only be single-core, so it's too small for any production workload and only suitable for dev usage where you run 1 or 2 transactions per second, mostly for functional testing.
We generally recommend deploying Cassandra on machines with at least 2 cores + 8GB allocated to the heap (so you need at least 16GB of RAM) for low production loads. For moderate loads, 4 cores + 32GB RAM is ideal so you can allocate 16GB to the heap.
If you're just at the proof-of-concept stage, there's a tier on DataStax Astra that's free forever and doesn't require a credit card to create an account. I recommend it to most people because you can launch a cluster in a few clicks and you can quickly focus on developing your app. Cheers!
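The idea raised in the question, keeping a copy of the data in Redis, is usually called the cache-aside pattern. Below is a minimal sketch of it under stated assumptions: a plain dict stands in for Redis, a callable stands in for the Cassandra query, and `CacheAside` and its methods are hypothetical names. It shows the read path only; it does not address the undersized hardware discussed above.

```python
import time
from typing import Callable


class CacheAside:
    """Read-through helper: check the cache first, fall back to the database."""

    def __init__(self, load_from_db: Callable[[str], str], ttl_seconds: float = 60.0) -> None:
        self._load_from_db = load_from_db  # stand-in for a Cassandra query
        self._ttl = ttl_seconds
        self._cache: dict[str, tuple[float, str]] = {}  # stand-in for Redis
        self.db_reads = 0

    def get(self, key: str) -> str:
        now = time.monotonic()
        entry = self._cache.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # cache hit: no database round trip
        # Cache miss or expired entry: read from the database and store a copy.
        self.db_reads += 1
        value = self._load_from_db(key)
        self._cache[key] = (now + self._ttl, value)
        return value

    def invalidate(self, key: str) -> None:
        # Call this after a write so the next read fetches fresh data.
        self._cache.pop(key, None)


if __name__ == "__main__":
    store = CacheAside(lambda key: "row-" + key)
    store.get("42")
    store.get("42")
    print(store.db_reads)
```

Only repeated reads of the same keys benefit; the first read of each key still pays the full database cost.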

Redis Simple Production Server Specification

Currently I use Redis Labs to host my Redis server, but the Redis Labs cloud is not available at my web server's hosting provider (SoftLayer), so my web server's performance has decreased because of network latency (~20 ms per round trip).
For that reason, I want to create a VPS to host Redis in SoftLayer so that my web server can connect to the Redis server over the LAN.
From Redis Labs I know that it consumes ~400 MB of memory and handles ~250 ops/sec on a normal day, but it can go up to ~1500 ops/sec during an event such as a flash sale.
The question is: which server specification can handle that kind of traffic?
Is a VPS with 1 CPU and 4 GB of memory enough?
Thank you
In the SoftLayer control portal, there are many VPS options with the characteristics you want when you place an order. We cannot give you specific characteristics for your requirements because we do not know whether they will meet your expectations.
I suggest ordering an hourly VPS with the characteristics you want and trying it; if it does not work, you can cancel it immediately so that you do not incur large costs, as you would with a monthly server.
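For a rough sense of why the ~20 ms round trip in the question matters, the figures can be put through simple arithmetic. This is a back-of-the-envelope sketch only, using Little's Law (concurrency = throughput × latency) and the numbers quoted in the question; the function names are hypothetical, and it ignores server processing time.

```python
import math


def max_sequential_ops_per_sec(round_trip_ms: float) -> float:
    # One connection issuing one command at a time waits a full round trip each.
    return 1000.0 / round_trip_ms


def in_flight_needed(target_ops_per_sec: float, round_trip_ms: float) -> int:
    # Little's Law: requests in flight = throughput x latency.
    return math.ceil(target_ops_per_sec * round_trip_ms / 1000.0)


if __name__ == "__main__":
    print(max_sequential_ops_per_sec(20))  # 50.0 ops/sec per connection
    print(in_flight_needed(250, 20))       # 5 commands in flight on a normal day
    print(in_flight_needed(1500, 20))      # 30 commands in flight at peak
```

At a 20 ms round trip, the load has to be spread over many concurrent connections or pipelined commands; moving Redis onto the LAN shrinks that number roughly in proportion to the latency.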

Put memcached on db or web server instance?

For my Drupal-based site, I have an architecture with 3 instances running nginx, postgresql, & solr, respectively. I'd like to install Memcached. Should I put it on the nginx or postgresql server? What are the performance implications?
Memcached is very light on CPU usage, so it is a great candidate to gobble up spare web server RAM. Also, you will scale out your web tier much more than your other tiers, and Memcached clustering can pool that RAM together into one logical cache.
If you have any spare RAM on the DB, it is almost always best for performance to let the DB gobble it up.
TL;DR Let DB have all of the RAM, colocate memcached on web tier.
Source: http://code.google.com/p/memcached/wiki/NewHardware
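The "one logical cache" above comes from the client side: memcached servers do not talk to each other, and the client library picks a server for each key by hashing it. Here is a minimal sketch of that idea, assuming simple modulo hashing; `pick_node` and the host names are hypothetical, and real clients often use consistent hashing instead, so that adding a node remaps fewer keys.

```python
import hashlib


def pick_node(key: str, nodes: list[str]) -> str:
    """Map a cache key to one node so every web server agrees on its location."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(nodes)
    return nodes[index]


if __name__ == "__main__":
    # Hypothetical memcached instances colocated on three web servers.
    web_tier = ["web1:11211", "web2:11211", "web3:11211"]
    for key in ("node:17", "user:42", "views:front"):
        print(key, "->", pick_node(key, web_tier))
```

Because every client computes the same mapping, each key lives on exactly one node and the nodes' RAM adds up instead of holding duplicates.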
The best is to have a separate server (if you can do that).
Otherwise, it depends on your servers' CPU and memory utilization and on your availability requirements. In general, I would avoid running anything extra on a DB server machine, since the DB is the foundation of the system and has to be available and performing well.
If your Solr server does not have high traffic and doesn't use much memory, I'd put it there. Memcached servers are known to be light on CPU. You should also estimate how much memory the memcached instance will need, to make sure there is enough on the server.