How to create a scalability model for a Kafka cluster [closed] - apache-kafka

How do I check the following:
How many streams can a Kafka cluster support before degradation becomes noticeable, and how do I scale up the cluster?

It will hugely depend on what your application is doing, the throughput, and so on. Some general resources to help you:
Elastic Scaling in the Streams API in Kafka
Kafka Streams Capacity planning and sizing
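
To make the elastic-scaling idea from the first link concrete: a Kafka Streams application scales out simply by starting more instances with the same application.id, and the partitions of the input topics are rebalanced across them. A minimal sketch — the broker address and topic names are assumptions, not from the question:

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import java.util.Properties;

public class ElasticScalingSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        // Every instance sharing this application.id joins the same group,
        // so launching more copies of this process spreads partitions across them.
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "my-streams-app");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();
        builder.stream("input-topic").to("output-topic"); // trivial pass-through topology

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}

To scale out you run additional copies of this same program; to scale in you stop some, and Kafka reassigns the partitions automatically.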

Related

How to recover to a different environment using the Cassy backup tool? [closed]

I want to know how to restore Scalar DB to another instance using a Cassy backup,
because I need a new instance for tests, created from the production environment.
There is no direct support in Cassy for loading backups taken in one cluster into another cluster.
Since Cassy only manages snapshots of Cassandra, you can follow the doc to do it.
For testing, I would recommend dumping some of the data from the current (possibly production) cluster and loading it into a new testing cluster.
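
If the dump-and-load route works for you, here is a rough sketch using the DataStax Java driver; the hosts, datacenter, keyspace, table, and column names are all hypothetical and would need to match your schema:

import com.datastax.oss.driver.api.core.CqlSession;
import com.datastax.oss.driver.api.core.cql.Row;
import com.datastax.oss.driver.api.core.cql.SimpleStatement;
import java.net.InetSocketAddress;

public class CopySampleData {
    public static void main(String[] args) {
        try (CqlSession source = CqlSession.builder()
                 .addContactPoint(new InetSocketAddress("prod-host", 9042))
                 .withLocalDatacenter("dc1")
                 .build();
             CqlSession target = CqlSession.builder()
                 .addContactPoint(new InetSocketAddress("test-host", 9042))
                 .withLocalDatacenter("dc1")
                 .build()) {
            // Copy a bounded sample of rows from production into the test cluster.
            for (Row row : source.execute("SELECT id, payload FROM myks.mytable LIMIT 1000")) {
                target.execute(SimpleStatement.newInstance(
                    "INSERT INTO myks.mytable (id, payload) VALUES (?, ?)",
                    row.getString("id"), row.getString("payload")));
            }
        }
    }
}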

Kafka - 8 billion messages posted per hour [closed]

I need to migrate a service from the mainframe to Kafka. The server is hosted on Amazon (AWS).
Is there anything I need to worry about? Could the server fail to support the load?
The messages will be credit card transactions.
Here's some back-of-the-napkin maths:
8 billion messages an hour is c. 2.2 million a second (8,000,000,000 / 60 / 60)
If you assume a 1 kB message size, that's c. 2.2 GB per second
This demo shows Kafka scaling to 2.7 GB/s of ingress, so yes, Kafka can support it. You just need to scale and configure your brokers accordingly.
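
The same arithmetic as a runnable check (the 1 kB average message size is an assumption):

public class KafkaThroughputEstimate {
    public static void main(String[] args) {
        long messagesPerHour = 8_000_000_000L;
        long messagesPerSecond = messagesPerHour / 3600;  // ~2.2 million msg/s
        long bytesPerMessage = 1_000;                     // assumed 1 kB average
        double gbPerSecond = messagesPerSecond * (double) bytesPerMessage / 1e9;
        System.out.printf("%,d msg/s = %.1f GB/s ingress%n", messagesPerSecond, gbPerSecond);
        // ~2.2 GB/s, within the 2.7 GB/s the linked demo shows Kafka sustaining.
    }
}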

What is the best Scala library for AWS DynamoDB [closed]

I am writing a Scala app that needs to interact with DynamoDB. I see many open-source libraries out there. Some examples are:
https://github.com/piotrga/async-dynamo
https://github.com/bizreach/aws-dynamodb-scala
https://github.com/seratch/AWScala
https://bitbucket.org/atlassian/aws-scala
https://dwhjames.github.io/aws-wrap/index.html
Or perhaps it's better to use the official AWS SDK in Java?
Does anyone have experience with any of the above libraries?
Check out the Alpakka project, which provides a DynamoDB connector. Alpakka connectors are built on Akka Streams and provide a way to interact with various technologies and protocols in a reactive way.
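
Since the question also asks about the official AWS SDK in Java, here is a minimal sketch with AWS SDK for Java v2 as a point of comparison; the table name, key attribute, and region are hypothetical:

import software.amazon.awssdk.regions.Region;
import software.amazon.awssdk.services.dynamodb.DynamoDbClient;
import software.amazon.awssdk.services.dynamodb.model.AttributeValue;
import software.amazon.awssdk.services.dynamodb.model.GetItemRequest;
import java.util.Map;

public class DynamoDbSdkExample {
    public static void main(String[] args) {
        try (DynamoDbClient dynamo = DynamoDbClient.builder()
                 .region(Region.US_EAST_1)
                 .build()) {
            // Fetch one item by its partition key.
            GetItemRequest request = GetItemRequest.builder()
                .tableName("my-table")
                .key(Map.of("id", AttributeValue.builder().s("item-123").build()))
                .build();
            Map<String, AttributeValue> item = dynamo.getItem(request).item();
            System.out.println(item);
        }
    }
}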

Where can I get all the messages from one topic using the Kafka Java API [closed]

On the command line I can use "--from-beginning" to get all the messages in a topic, but how can I get the same effect in a Java program? I'm using the High Level Consumer API.
While creating the consumer properties, you can add
props.put("auto.offset.reset", "smallest");
to start reading from the beginning.

Scaling a database in the cloud and on local servers [closed]

I am considering using MongoDB (it could be PostgreSQL or any other database) as a data warehouse. My concern is that twenty or more users could be running queries at a time, and this could have serious implications in terms of performance.
My question is: what is the best approach to handle this in a cloud-based and a non-cloud-based environment? Do cloud-based databases handle this automatically? If so, would the data be consistent across all instances after a refresh of the data? In a non-cloud-based environment, would the best approach be to load-balance all instances? Again, how would you ensure data integrity across all instances?
Thanks in advance
I think auto-sharding is what I am looking for:
http://docs.mongodb.org/v2.6/MongoDB-sharding-guide.pdf
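
As a rough illustration of what enabling auto-sharding looks like, here is a sketch using the MongoDB Java driver against a mongos router of an already-sharded cluster; the router address, database, collection, and shard key are all hypothetical:

import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;
import org.bson.Document;

public class EnableSharding {
    public static void main(String[] args) {
        // Connect to a mongos router of an existing sharded cluster (address is assumed).
        try (MongoClient client = MongoClients.create("mongodb://mongos-host:27017")) {
            // Enable sharding for the database, then shard a collection on a hashed key
            // so reads and writes are spread evenly across the shards.
            client.getDatabase("admin")
                  .runCommand(new Document("enableSharding", "warehouse"));
            client.getDatabase("admin")
                  .runCommand(new Document("shardCollection", "warehouse.events")
                      .append("key", new Document("userId", "hashed")));
        }
    }
}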