All possible system keyspaces in Cassandra

I am trying to find a list of all the possible "system" keyspaces that may exist in a DSC (DataStax Community) Cassandra database (system keyspaces being those not created by a user).
So far, running
[cqlsh 3.1.8 | Cassandra 1.2.15 | CQL spec 3.0.0 | Thrift protocol 19.36.2]
I have found: system, system_traces, OpsCenter
Are these the only system keyspaces, or are there others? Does it depend on the version (1.2/2.0) and distribution (Apache/DataStax)?
I tried searching the documentation but had no luck. Could anyone help me out here?

Only system, system_auth and system_traces are strictly "system" keyspaces, the first one especially so.
The OpsCenter keyspace is created for/by DataStax OpsCenter, so you will only see it on clusters that OpsCenter manages.
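If you want to check what actually exists rather than rely on a fixed list, you can query the schema tables yourself. A minimal sketch using the DataStax Java driver (2.x-era API, matching Cassandra 1.2/2.0; the contact point is a placeholder):

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Row;
import com.datastax.driver.core.Session;

public class ListKeyspaces {
    public static void main(String[] args) {
        // placeholder contact point; use one of your own nodes
        Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
        Session session = cluster.connect();
        // on Cassandra 1.2/2.x the schema lives in system.schema_keyspaces;
        // on 3.0+ it moved to system_schema.keyspaces
        for (Row row : session.execute("SELECT keyspace_name FROM system.schema_keyspaces")) {
            System.out.println(row.getString("keyspace_name"));
        }
        cluster.close();
    }
}

Anything in that output beyond system, system_auth, system_traces, and the OpsCenter keyspace was created by a user or another tool.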

Related

I want to deploy JanusGraph. Which storage backend should I use for Cassandra: cql or cassandrathrift?

Problem: I want to deploy JanusGraph as a separate service on Kubernetes. Which storage backend should I use for Cassandra: CQL or cassandrathrift? Cassandra is running as a stateful service on Kubernetes.
Detailed description: As per the JanusGraph docs, in Remote Server Mode the storage backend should be cql:
import org.janusgraph.core.JanusGraph;
import org.janusgraph.core.JanusGraphFactory;

JanusGraph graph = JanusGraphFactory.build()
        .set("storage.backend", "cql")
        .set("storage.hostname", "77.77.77.77")
        .open();
They also mention that Thrift is deprecated as of Cassandra 2.1, and I am using Cassandra 3.
But one blog post mentioned that REST API calls from JanusGraph to Cassandra are possible only through Thrift.
Is Thrift really required? Can't we use CQL as the storage backend for REST API calls as well?
Yes, you absolutely should use the cql storage backend.
Thrift is deprecated, disabled by default in the current version of Cassandra (version 3), and removed entirely in Cassandra version 4.
I would also be interested in reading the blog post you referenced. Are you talking about IBM's Rest API mentioned in their JanusGraph-utils Git repo? That confuses me as well, because I see both Thrift and CQL config happening there. In any case, I would go with the cql settings and give it a shot.
tl;dr:
Avoid Thrift at all costs!

How can I access Couchbase from pyspark?

I'm new to working with NoSQL databases. I have Spark 1.6.0 on my cluster, and I need to get a document from a Couchbase bucket, perform some operations on it, and load it back.
I know the IP, port, bucket name, and bucket password. Unfortunately, I'm out of ideas on how to access this database using pyspark. If that's impossible, how can I do it using Scala?
I also need to perform a similar operation with HBase.
Many thanks for any suggestions and useful URLs.
Best regards,
Vladimir.
To access Couchbase from the Python ecosystem, you need to use the Couchbase Python SDK.
Start here: https://docs.couchbase.com/python-sdk/2.5/start-using-sdk.html
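Since you also asked about the Scala/JVM route: here is a minimal sketch of the same read-modify-write flow with the Couchbase Java SDK 2.x (callable from Scala as well). The node IP, bucket name, password, and document ID are placeholders:

import com.couchbase.client.java.Bucket;
import com.couchbase.client.java.Cluster;
import com.couchbase.client.java.CouchbaseCluster;
import com.couchbase.client.java.document.JsonDocument;

public class CouchbaseRoundTrip {
    public static void main(String[] args) {
        // placeholders: your node IP, bucket name, and bucket password
        Cluster cluster = CouchbaseCluster.create("10.0.0.1");
        Bucket bucket = cluster.openBucket("myBucket", "myPassword");

        // fetch a document, mutate it, and write it back
        JsonDocument doc = bucket.get("some-document-id");
        doc.content().put("processed", true);
        bucket.upsert(doc);

        cluster.disconnect();
    }
}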

Report (Prometheus) metrics via Kafka

I am looking for a way to decouple Prometheus from applications by putting Kafka in between, to achieve something like this:
+-------------+            +-------+         +------------+
| Application +--metrics-->+ Kafka +-------->+ Prometheus |
+-------------+            +-------+         +------------+
In order to solve this problem I have two questions:
Are there any Java libraries that abstract metrics representation so my app will not depend on Prometheus in any ways?
Are there any reliable Kafka reporters?
Any comments or suggestions are welcome.
The Prometheus Java client library is designed so that you can use it with other monitoring systems, and indeed many open-source and commercial monitoring systems do, as the Prometheus text format has become a de facto standard.
Prometheus is a pull-based system, and it is not recommended to try to convert it to push; you would be making your life harder for no real reason. It is recommended to have Prometheus scrape the application directly.
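To make that concrete, a minimal sketch with the Prometheus Java client (the simpleclient and simpleclient_httpserver artifacts; the port and metric name are arbitrary illustration choices):

import io.prometheus.client.Counter;
import io.prometheus.client.exporter.HTTPServer;

public class InstrumentedApp {
    // registered with the default collector registry
    static final Counter requests = Counter.build()
            .name("requests_total")
            .help("Total number of requests handled.")
            .register();

    public static void main(String[] args) throws Exception {
        // expose /metrics on port 8080 for Prometheus to scrape
        HTTPServer server = new HTTPServer(8080);
        while (true) {
            requests.inc(); // stand-in for real work
            Thread.sleep(1000);
        }
    }
}

Prometheus then scrapes http://host:8080/metrics on its own schedule; the application never pushes anything.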
Are there any Java libraries that abstract metrics representation so my app will not depend on Prometheus in any ways?
Yes - StatsD (I've used Datadog's dogstatsd), which is push-based.
Related - Which StatsD client should I use for a java/grails project?
Then you can use StatsD as a relay to Prometheus - https://github.com/prometheus/statsd_exporter
Or use something else that supports remote reads.
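If you do go the StatsD route, here is a minimal sketch with the java-dogstatsd-client (package com.timgroup.statsd); port 9125 assumes a statsd_exporter listening on its default StatsD port, and the prefix and metric names are placeholders:

import com.timgroup.statsd.NonBlockingStatsDClient;
import com.timgroup.statsd.StatsDClient;

public class PushToStatsd {
    public static void main(String[] args) {
        // "myapp" prefix and the statsd_exporter address are placeholders
        StatsDClient statsd = new NonBlockingStatsDClient("myapp", "localhost", 9125);
        statsd.incrementCounter("requests");
        statsd.recordExecutionTime("request_latency_ms", 42);
        statsd.stop();
    }
}

statsd_exporter then re-exposes what it receives as Prometheus metrics for scraping.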

Why is Cassandra used for the Kong API Gateway?

Kong uses Cassandra or Postgres. Cassandra is known for write-heavy applications, but I don't see the Kong API gateway as being that write-heavy; also, none of the tables use the partition key, one of Cassandra's most important features. My question is: why is Cassandra used for Kong? Is there a specific reason? Can't we achieve this with an RDBMS?
As per the Kong FAQ at https://getkong.org/about/faq/#which-datastores-are-supported
Postgresql: it is a good candidate if the setup you are aiming at is not distributed.
Cassandra: Kong can use Cassandra as its primary datastore if you are aiming at a distributed, high-availability Kong setup. The two main reasons why one would choose Cassandra as Kong's datastore are the ease of creating a distributed setup (ideal for multi-region), and the ease of scaling.
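In practice the choice comes down to a few lines of kong.conf. A sketch of the Cassandra case (key names follow the classic Kong configuration; the contact points and replication values are placeholders):

# placeholder node addresses and replication settings
database = cassandra
cassandra_contact_points = 10.0.0.1,10.0.0.2
cassandra_keyspace = kong
cassandra_repl_strategy = SimpleStrategy
cassandra_repl_factor = 2

For a non-distributed setup, database = postgres with the pg_* settings is the simpler path.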

WSO2 - High availability with Postgres as the database

In the documentation regarding clustering/high availability of WSO2's databases, the example uses MySQL.
Is there any information or anyone using Postgres?
How is High Availability enforced?
Using pgpool-II, for example?
You need an additional tool to enforce high availability (failover, failback, switchover) in Postgres, as you know. Here is a very good, illustrated article by Google on how they do it using Patroni, pg_auto_failover, or disk replication and Linux tools: https://cloud.google.com/architecture/architectures-high-availability-postgresql-clusters-compute-engine
Here is a list of available HA tools; repmgr in particular is worth your attention: https://www.percona.com/blog/2018/09/28/high-availability-for-enterprise-grade-postgresql-environments/
And here is an explanation of how they do HA in Azure: https://www.citusdata.com/blog/2019/05/30/introducing-pg-auto-failover/