Why is Cassandra used for Kong API Gateway - postgresql

Kong uses Cassandra or Postgres. Cassandra is known for write-heavy applications. I don't see Kong API Gateway as being that write-heavy, and none of its tables seem to use one of Cassandra's important features, the partition key. My question is: why is Cassandra used for Kong? Is there any specific reason? Can't we achieve this using an RDBMS?
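(For reference, the partition key is the first component of a CQL table's PRIMARY KEY and determines which node stores each row. A minimal sketch with the Python cassandra-driver; the demo keyspace and routes table below are hypothetical, not Kong's actual schema:)

```python
# A minimal sketch, assuming a local Cassandra instance; the keyspace and
# table are hypothetical and only illustrate what a partition key looks like.
from cassandra.cluster import Cluster

cluster = Cluster(["127.0.0.1"])
session = cluster.connect()

session.execute("""
    CREATE KEYSPACE IF NOT EXISTS demo
    WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1}
""")

# 'service_id' is the partition key: it decides which node owns each row.
session.execute("""
    CREATE TABLE IF NOT EXISTS demo.routes (
        service_id uuid,
        route_id   uuid,
        path       text,
        PRIMARY KEY (service_id, route_id)
    )
""")
```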

As per the Kong FAQ at https://getkong.org/about/faq/#which-datastores-are-supported
PostgreSQL
It is a good candidate if the setup you are aiming at is not distributed.
Cassandra
Kong can use Cassandra as its primary datastore if you are aiming at a distributed, high-availability Kong setup. The two main reasons why one would choose Cassandra as Kong's datastore are:
- Ease of creating a distributed setup (ideal for multi-region).
- Ease of scaling.

Related

ThingsBoard Hybrid mode

I can read in the ThingsBoard configuration documentation (https://thingsboard.io/docs/user-guide/install/config/), in the "Common database parameters" section, that database.ts.type can be sql or cassandra, and that cassandra should be used for hybrid mode. What is that hybrid mode?
Do you mean that database.entities.type can be sql (postgres) and database.ts.type can be cassandra, and vice versa?
What is the recommended install? All on Cassandra?
Many thanks,
Best regards
Found the answer here:
https://thingsboard.io/docs/reference/ , section "SQL vs NoSQL vs Hybrid database approach"
ThingsBoard uses a database to store entities (devices, assets, customers, dashboards, etc.) and telemetry data (attributes, timeseries sensor readings, statistics, events). The platform supports three database options at the moment:
SQL - Stores all entities and telemetry in a SQL database. The ThingsBoard authors recommend using PostgreSQL, and this is the main SQL database that ThingsBoard supports. It is possible to use HSQLDB for local development purposes. We do not recommend using HSQLDB for anything except running tests and launching a dev instance with the minimum possible load.
NoSQL - Stores all entities and telemetry in a NoSQL database. The ThingsBoard authors recommend using Cassandra, and this is the only NoSQL database that ThingsBoard supports at the moment. However, due to a lot of interest in deployments with managed databases, we plan to introduce support for AWS DynamoDB in v2.3.
Hybrid - Stores all entities in a SQL database and all telemetry in a NoSQL database.

How databases synchronize data between persistent volumes in Kubernetes

I've just read the Deploying Cassandra with Stateful Sets topic in the Kubernetes documentation.
The deployment process:
1. Creation of a StorageClass
2. Creation of PersistentVolumes (in my case 4 PersistentVolumes), using the storageClassName created in 1)
3. Creation of the Cassandra headless Service
4. Using a StatefulSet to create a Cassandra ring, setting the storageClassName created in 1) in the StatefulSet yml definition
As a result, there are 4 pods: cassandra-0, cassandra-1, cassandra-2, cassandra-3, which are mounted to the volumes created in 2) (pv-0, pv-1, pv-2, pv-3).
I wonder how / whether these persistent volumes synchronize data with each other.
E.g. if I add some record that is written by pod cassandra-0 to persistent volume pv-0, will someone who retrieves data from the database a moment later - using the cassandra-1 pod/pv - see the data that was added to pv-0? Can anyone tell me how it works exactly?
This is not related to Kubernetes.
The replication is done by the database and is configurable.
See the CAP theorem and Eventual Consistency for Cassandra.
You can control the level of consistency in Cassandra; whether a record is updated across replicas immediately or later depends on the configuration you choose in Cassandra.
See also: Synchronous Replication, Asynchronous Replication
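For example, with the Python cassandra-driver the consistency level can be chosen per statement. A minimal sketch, where the demo keyspace and users table are hypothetical:

```python
# A minimal sketch of setting read/write consistency with the Python
# cassandra-driver; the keyspace and table names are hypothetical.
from cassandra import ConsistencyLevel
from cassandra.cluster import Cluster
from cassandra.query import SimpleStatement

cluster = Cluster(["127.0.0.1"])
session = cluster.connect("demo")

# The write must be acknowledged by a quorum of replicas before it succeeds.
write = SimpleStatement(
    "INSERT INTO users (id, name) VALUES (%s, %s)",
    consistency_level=ConsistencyLevel.QUORUM,
)
session.execute(write, (1, "alice"))

# The read only needs one replica to answer; faster, but it may not yet
# reflect the most recent write on the other replicas.
read = SimpleStatement(
    "SELECT name FROM users WHERE id = %s",
    consistency_level=ConsistencyLevel.ONE,
)
print(session.execute(read, (1,)).one())
```

With QUORUM used for both reads and writes (so the read and write replica sets overlap), a read is guaranteed to see the most recent acknowledged write; weaker levels trade that guarantee for latency and availability.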
Cassandra Consistency:
how to set cassandra read and write consistency
How is the consistency level configured?
The mechanism to spread data across the cluster is independent of whether it was deployed on Kubernetes or on bare-metal instances. Cassandra will try to spread the data randomly across the nodes depending on a hash value (known as the token), and will use the same algorithm to retrieve the information.
There are other factors to take into consideration: the replication factor (number of copies) and the consistency level used.
You would want to take a look at DS201: DataStax Enterprise Foundations of Apache Cassandra™ in DataStax Academy, where they cover the basics of Cassandra.
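To make the token idea concrete, here is a deliberately simplified Python sketch of hash-based placement. Cassandra actually uses the Murmur3 partitioner and vnodes, so this only illustrates the concept, not the real algorithm:

```python
# A simplified, conceptual sketch of token-based placement; Cassandra's real
# partitioner (Murmur3) and replica selection are more involved.
import hashlib
from bisect import bisect_right

NODES = ["cassandra-0", "cassandra-1", "cassandra-2"]
REPLICATION_FACTOR = 2

def token(partition_key):
    """Hash the partition key to a position on the ring (simplified)."""
    return int(hashlib.md5(partition_key.encode()).hexdigest(), 16)

# Give each node one position on the ring (real clusters use many vnodes).
ring = sorted((token(node), node) for node in NODES)
positions = [pos for pos, _ in ring]

def replicas_for(partition_key):
    """Walk the ring clockwise from the key's token and pick RF nodes."""
    start = bisect_right(positions, token(partition_key)) % len(ring)
    return [ring[(start + i) % len(ring)][1] for i in range(REPLICATION_FACTOR)]

print(replicas_for("sensor-42"))   # e.g. ['cassandra-1', 'cassandra-2']
```

Which pod answers a client's query is therefore unrelated to which volume the data lives on; the coordinator node forwards the request to the replicas that own the token.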
Just to slightly extend Carlos' answer: Kubernetes is not involved, and the volumes are completely isolated. The replication and distribution work is entirely up to the database software to handle. As far as K8s is concerned, they are just separate processes and separate volumes.
Thanks for the comments, guys!
So, when I have my DB with 3 PVs:
cassandra-pod0   cassandra-pod1   cassandra-pod2
      |                |                |
cassandra-pv0    cassandra-pv1    cassandra-pv2
the data is divided across the 3 PVs. When I kill cassandra-pod1, is it possible that I will (temporarily) lose part of the data? Am I right?

Can I do multi-master replication for MongoDB? A reference architecture with Kubernetes is what is mainly expected in this question

I have a use case where we have a write- and read-intensive application using MongoDB in the backend. We are planning to implement a federated K8s deployment for MongoDB with a multi-master architecture (how to do this?). I am looking for suggestions on architecture references/solutions, if any, that REALLY worked with federation and active DB replication.
This doesn't directly answer your question, but I know KubeDB does provide extensive database deployments within K8s. https://kubedb.com/docs/0.9.0/guides/mongodb/
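For reference, connecting an application to a MongoDB replica set deployed inside K8s (whether via KubeDB or otherwise) might look like the sketch below; the hostnames, replica set name, database and collection are all assumptions:

```python
# A minimal sketch, assuming a replica set named "rs0" reachable at the
# hypothetical service hostnames below (e.g. a KubeDB-managed deployment).
from pymongo import MongoClient

client = MongoClient(
    "mongodb://mongo-0.mongo,mongo-1.mongo,mongo-2.mongo:27017/"
    "?replicaSet=rs0&readPreference=secondaryPreferred"
)

db = client["appdb"]
db.events.insert_one({"type": "ping"})               # writes go to the primary
print(db.events.count_documents({"type": "ping"}))   # reads may hit a secondary
```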

Citus: is a 2-node PGSQL cluster doable, and if yes, how?

I am thinking of using Citus open source for a dual-node cluster - my questions are basically two:
- If this kind of clustering is available - in the case of a failover, is the slave node promoted to master? If yes, how - does it use WAL?
- If such a way of clustering is not possible, what is an alternative for that other than pgpool?
Thank you.
Citus isn't a high-availability solution for single-node PostgreSQL. Citus shards/partitions your data across multiple servers, and can thus use multiple CPU cores in parallel for your queries or transactions.
Citus is suitable for a variety of use-cases, and you can find more information on those here.
For high-availability, Citus can replicate data across multiple nodes, or you can set up streaming replication for each worker node. Citus Cloud uses streaming replication for each node, and you can find more information on how Citus Cloud manages HA on our documentation.
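For what it's worth, sharding a table across the worker nodes is done with Citus' create_distributed_table function. A minimal sketch from Python with psycopg2; the connection details and the events table / tenant_id distribution column are hypothetical:

```python
# A minimal sketch of distributing a table in Citus; the DSN and the
# "events" table / "tenant_id" distribution column are hypothetical.
import psycopg2

conn = psycopg2.connect("host=coordinator dbname=app user=app password=secret")
conn.autocommit = True
cur = conn.cursor()

cur.execute("""
    CREATE TABLE IF NOT EXISTS events (
        tenant_id bigint NOT NULL,
        event_id  bigserial,
        payload   jsonb,
        PRIMARY KEY (tenant_id, event_id)
    )
""")

# Shard the table across the Citus worker nodes by tenant_id.
cur.execute("SELECT create_distributed_table('events', 'tenant_id')")
```

Queries that filter on tenant_id can then be routed to a single shard, while other queries fan out across the workers in parallel.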

MongoDB on Azure worker role

I'm developing an application using SignalR to manage websockets and allow my clients to talk to each other.
I'm planning to host this back office on an Azure worker role. As my SignalR requests carry data that is most of the time saved in the database, I'm wondering whether NoSQL's MongoDB, instead of the classic SQL Server/Entity Framework couple, would be a good approach.
Assuming that my application's data types will mostly be strings, I think MongoDB will be a reliable and performant solution, and it will allow me to get rid of Azure SQL database costs.
For information, the Azure worker role will be running on a machine with the following hardware: 1-core CPU, 3.5 GB RAM and 50 GB SSD storage.
Do you think I'm off to a good start with this architecture?
Thanks
Do you think I'm off to a good start with this architecture?
In a word, no.
A user asked a similar question regarding running Redis on Worker Roles - Setting up Redis on Azure cloud service worker role - all of the content on that Q/A is relevant in the MongoDb context.
I'd suggest that you read my answer as it goes into more detail, but as an overview of why this is a bad architectural approach:
- You cannot guarantee when a Worker Role will be restarted by the Azure Service Fabric.
- In a real-world implementation of Mongo, you would run multiple nodes within a cluster; with a single Worker Role (as you have suggested in your question) this won't be possible.
- You will need to manage your MongoDb installation within the Worker Role, and Worker Roles simply aren't designed for this.
If you are really fixed on using Mongo, I would suggest that you use a hosted solution such as MongoLabs (as suggested in earlier answers), or consider hosting it on Azure IaaS VMs.
If you are not fixed on using Mongo, I would sincerely suggest that you look at Azure DocumentDb (also suggested above), Microsoft's Azure NoSQL offering - I have used it in several production systems already and it is certainly a capable NoSQL solution; granted, it may not have all of the features available with MongoDb.
If you are looking at a NoSQL solution for caching of data (i.e. not long term storage), I would suggest you take a look at Azure Redis Cache, which is a very capable Redis offering.
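If you do go the cache route, connecting to Azure Redis Cache typically uses SSL on port 6380; a minimal sketch with redis-py, where the cache hostname and access key are placeholders:

```python
# A minimal sketch of connecting to Azure Redis Cache with redis-py;
# the hostname and access key below are placeholders.
import redis

cache = redis.StrictRedis(
    host="mycache.redis.cache.windows.net",  # placeholder cache name
    port=6380,                               # Azure Redis Cache SSL port
    password="<access-key>",
    ssl=True,
)

cache.set("session:42", "connected")
print(cache.get("session:42"))   # b'connected'
```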
Azure has its own native NoSQL document database called DocumentDB; have you had a look at it? If I were you I would use DocumentDB unless you have some special requirements that you have not mentioned, but from what little requirement info you have posted, DocumentDB would do just fine. I don't think that it is quite similar to MongoDB in terms of the basic functionality; see this article for a comparison between Azure DocumentDB and MongoDB.