Is there a way to run Redis server on multiple nodes or multiple cores of a CPU?

I am currently working on running a deep learning model that uses a Redis server, and to optimise it I was wondering whether there is a way to run Redis on multiple nodes or multiple cores, since Redis is single-threaded.

You can definitely run multiple instances of Redis on one node; just use:
redis-server --port 6380
You can then test with redis-cli -p 6380. Avoid running more instances than cores.
You can split your workload by application scope, or you can use Redis Cluster to shard the keyspace across instances. A common pattern is for each node to run both a master instance and a replica of another node's master.
Be sure to look at hash tags for cases where you need multi-key operations.
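If you go the Redis Cluster route, hash tags let related keys land in the same hash slot so multi-key commands still work. A minimal sketch with redis-py 4+, assuming a hypothetical cluster node listening on 127.0.0.1:7000:

from redis.cluster import RedisCluster

# Hypothetical entry point; any reachable cluster node will do.
rc = RedisCluster(host="127.0.0.1", port=7000)

# The {model:42} hash tag forces both keys into the same hash slot,
# so multi-key operations such as MGET are allowed on them.
rc.set("{model:42}:weights", b"binary blob")
rc.set("{model:42}:meta", "v1")
print(rc.mget("{model:42}:weights", "{model:42}:meta"))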

Related

ProxySQL vs MaxScale on Kubernetes

I'm looking to set up a write proxy for our MariaDB database on Kubernetes. The problem we are currently having is that we only have one write master in our 3-master Galera cluster setup. So even though our pods are replicating properly, if our first node goes down the other two masters end up failing because they cannot be written to.
I saw that using either ProxySQL or MaxScale for write proxying was a possible option, but I'm not sure I'm reading their use cases properly. Do I have the right idea in looking to deploy either of these two applications/services on Kubernetes to fix my problem? Would I be able to write to any of the masters in the cluster?
MaxScale will handle selecting which server to write to as long as you use the readwritesplit router and the galeramon monitor.
Here's an example configuration for MaxScale that does load balancing of reads but sends writes to one node:
[maxscale]
threads=auto

[node1]
type=server
address=node1-address
port=3306

[node2]
type=server
address=node2-address
port=3306

[node3]
type=server
address=node3-address
port=3306

[Galera-Cluster]
type=monitor
module=galeramon
servers=node1,node2,node3
user=my-user
password=my-password

[RW-Split-Router]
type=service
router=readwritesplit
cluster=Galera-Cluster
user=my-user
password=my-password

[RW-Split-Listener]
type=listener
service=RW-Split-Router
protocol=mariadbclient
port=4006
The reason writes are only sent to one node at a time is that writing to multiple Galera nodes won't improve write performance, and it results in conflicts when transactions are committed (which applications rarely handle well).
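From the application's point of view, nothing changes except the connection target: it only ever talks to the RW-Split-Listener port, and MaxScale decides which Galera node receives each statement. A minimal sketch with PyMySQL, assuming a hypothetical MaxScale hostname, database and table:

import pymysql

# Hypothetical host/database/table; only the listener port (4006) comes from
# the configuration above.
conn = pymysql.connect(host="maxscale.example.com", port=4006,
                       user="my-user", password="my-password", database="appdb")
try:
    with conn.cursor() as cur:
        cur.execute("INSERT INTO t (msg) VALUES (%s)", ("goes to the write master",))
        cur.execute("SELECT msg FROM t")  # reads may be balanced across the other nodes
        print(cur.fetchall())
    conn.commit()
finally:
    conn.close()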

Postgres Patroni and etcd on the Same Machine

Assuming I have 2 Postgres servers (1 master and 1 slave) and I'm using Patroni for high availability:
1) I intend to have a three-machine etcd cluster. Is it OK to use the 2 Postgres machines for etcd as well, plus one additional server, or is it preferable to use machines that are not used by Postgres?
2) What are my options for directing read requests to the slave and write requests to the master without using pgpool?
Thanks!
1) Yes, it is best practice to run etcd on the two PostgreSQL machines plus a third machine.
2) The only safe way to do that is in your application. The application has to be taught to read from one database connection and write to another.
There is no safe way to distinguish a writing query from a non-writing one; consider
SELECT delete_some_rows();
The application also has to be aware that changes will not be visible immediately on the replica.
Streaming replication is of limited use when it comes to scaling...
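To make that split concrete, here is a minimal application-level sketch with psycopg2, assuming hypothetical hostnames for the primary and the replica:

import psycopg2

# Hypothetical hosts; in practice these come from your own configuration.
primary = psycopg2.connect("host=pg-primary.example.com dbname=app user=app")
replica = psycopg2.connect("host=pg-replica.example.com dbname=app user=app")

# Anything that might write, including SELECT delete_some_rows(), goes to
# the primary connection.
with primary, primary.cursor() as cur:
    cur.execute("INSERT INTO events (payload) VALUES (%s)", ("hello",))

# Pure reads can use the replica, keeping in mind that recent changes may not
# be visible there yet.
with replica, replica.cursor() as cur:
    cur.execute("SELECT count(*) FROM events")
    print(cur.fetchone())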

MongoDB data replication in Kubernetes

I've been configuring pods in Kubernetes to each hold a MongoDB and a Golang image, with a service to load-balance. The major issue I am facing is data replication between databases. Replication controllers/replica sets do not seem to do what the name implies; rather, they spin up blank-slate copies instead of replicas of existing/currently running pods. I cannot seem to find any examples or clear answers on how Kubernetes addresses this, or whether it even does.
For example, data insertions sent by the Go program will be automatically load-balanced by the service to one of X replicated MongoDB instances. This poses a problem, since the instances will all maintain separate documents with no relation to one another once Kubernetes starts balancing connections among the pods. Is there a way to address this in Kubernetes, or does it require a complete rewrite of the Go code to expect data replication among numerous available databases?
Sorry, I'm relatively new to Kubernetes and couldn't seem to find much information regarding this.
You're right: a replica set is not a replica of another container; it's just a container with the same configuration spun up within the same logical unit.
A replica set (or deployment, which is the resource you should be using now) will have multiple pods, and it's up to you, the operator, to configure the MongoDB part.
I would recommend reading this example of how to set up a replica set with multiple mongodb containers:
https://medium.com/google-cloud/mongodb-replica-sets-with-kubernetes-d96606bd9474#.e8y706grr
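Once the MongoDB replica set itself has been initiated across the pods, the application should connect with the replica set name so that the driver, not the Kubernetes service, decides where writes go. A minimal sketch with PyMongo, assuming a hypothetical headless service named mongo and a replica set named rs0:

from pymongo import MongoClient

# Hypothetical pod/service names and replica set name; adjust to your setup.
client = MongoClient(
    "mongodb://mongo-0.mongo:27017,mongo-1.mongo:27017,mongo-2.mongo:27017"
    "/?replicaSet=rs0"
)
db = client.appdb
db.items.insert_one({"hello": "world"})  # the driver routes writes to the primary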

Can I tie Celery workers to a particular instance given a shared database?

I have a number of machines each with a Django instance, sharing a single Postgres database.
I want to run Celery, preferably using the Django broker and the Postgres database for simplicity. I do not have a high volume of tasks to run, so there is no need to use a different broker for that reason.
I want to run celery tasks which operate on local file storage. This means that I want the celery worker only to run tasks which are on the same machine that triggered the event.
Is this possible with the current setup? If not, how do I do it? A local Redis instance for each machine?
I worked out how to make this work. No need for fancy routing or brokers.
I run each celeryd instance with a special queue named after the host. This can be done automatically, like:
./manage.py celeryd -Q celery,`hostname`
I then add a setting in settings.py that stores the hostname:
import socket
CELERY_HOSTNAME = socket.gethostname()
In each Django instance this will have a different value.
I can then specify this queue when I asynchronously call my task:
my_task.apply_async(args=[one, two], queue=settings.CELERY_HOSTNAME)
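For completeness, a minimal sketch of the task side under the same scheme, with a hypothetical process_local_file task that only makes sense on the machine that owns the file:

# tasks.py -- hypothetical task that works on local file storage
from celery import shared_task
from django.conf import settings

@shared_task
def process_local_file(path):
    with open(path) as f:
        return len(f.read())

# Dispatch from code running on the machine that owns the file, routing to
# this host's queue so only the local worker picks it up.
process_local_file.apply_async(args=["/tmp/upload.dat"],
                               queue=settings.CELERY_HOSTNAME)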

Can mongos be configured to talk to more than one mongo cluster?

The rule of thumb is to have the 'mongos' process running on each of your application servers. This keeps your application talking to localhost, which is fast, and your mongos processes scale with your app.
Say we have 2 distinct, sharded Mongo clusters: is it possible to configure one mongos process to talk to two different clusters? It would be awesome to abstract away the fact that the databases live in different places.
Or would you have to launch two different mongos processes on different ports? If this IS possible, I still worry that it might be dangerous to have two different mongos processes fighting for resources.
Or something completely different? Ideas?
Each mongos belongs to one and only one cluster (defined by its config servers). The mongos processes don't use many resources; you can run multiple of them on a single machine.
You can have more than one sharded db/collection per cluster.
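So for two clusters you would run two mongos processes (one per cluster, on different ports) and open a separate client connection to each. A minimal sketch with PyMongo, assuming hypothetical ports 27017 and 27018 on localhost:

from pymongo import MongoClient

# Hypothetical ports: one mongos per sharded cluster, both local to the app server.
cluster_a = MongoClient("mongodb://localhost:27017")  # mongos for cluster A
cluster_b = MongoClient("mongodb://localhost:27018")  # mongos for cluster B

cluster_a.appdb.events.insert_one({"source": "cluster-a"})
cluster_b.analytics.events.insert_one({"source": "cluster-b"})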