I am exploring schema registries. I have a Kafka setup and now want to introduce schema support for my producers and consumers. I found that both the AWS Glue Schema Registry and the Confluent Schema Registry support the Avro format and offer multiple compatibility options. I am new to both. Can anybody suggest which one is better, or compare the two?
Thanks in advance!
While Glue works with Kafka, from what I've seen it is geared more toward use with Athena and similar AWS data-analysis tools. It is serverless, so there is nothing to install or manage, and it integrates with IAM, so you can manage permissions entirely within AWS.
Confluent's Schema Registry is only for Kafka and cannot be (easily) integrated with those other AWS tools.
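For a concrete picture, here is a minimal producer sketch wired to Confluent's registry, assuming Confluent's Avro serializer is on the classpath; the broker/registry addresses, the schema, and the "orders" topic are placeholders. With Glue you would instead swap in the drop-in Kafka serializer from AWS's aws-glue-schema-registry library and configure the region and registry name.

```scala
import java.util.Properties
import org.apache.avro.Schema
import org.apache.avro.generic.{GenericData, GenericRecord}
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}

object AvroProducerSketch extends App {
  val props = new Properties()
  props.put("bootstrap.servers", "localhost:9092")
  props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")
  // Confluent's serializer registers the schema (subject to the configured
  // compatibility rules) and embeds its id in each message.
  props.put("value.serializer", "io.confluent.kafka.serializers.KafkaAvroSerializer")
  props.put("schema.registry.url", "http://localhost:8081")

  // Hypothetical record type, just to have something to send.
  val schema = new Schema.Parser().parse(
    """{"type":"record","name":"Order","fields":[{"name":"id","type":"string"}]}"""
  )
  val order = new GenericData.Record(schema)
  order.put("id", "o-1")

  val producer = new KafkaProducer[String, GenericRecord](props)
  producer.send(new ProducerRecord[String, GenericRecord]("orders", order))
  producer.close()
}
```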
I want to learn about the differences between the two approaches. I developed a project that aggregates some data using the Apache Kafka Streams API, and afterwards I came across some solutions written in KSQL.
I have no experience with KSQL, so I would like to learn when to choose which approach for aggregations. Could I use KSQL instead of Kafka Streams?
There's a blog post somewhere that talks about the "Kafka abstraction funnel".
KSQL doesn't provide as much flexibility as Kafka Streams, which, in turn, abstracts away many details of the core consumer/producer API.
If your team is more familiar with SQL and less comfortable with the client libraries, use KSQL. If you run into a feature KSQL doesn't support (think custom, complex data types), or you need to embed streaming logic into a larger application without remotely querying the ksqlDB REST API, use Kafka Streams.
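To make the trade-off concrete, here is a per-key count in the Kafka Streams Scala DSL; a sketch, with hypothetical topic names, and the Serdes import assumes Kafka 2.6+. The comment shows roughly the same aggregation as a ksqlDB statement.

```scala
import java.util.Properties
import org.apache.kafka.streams.scala.ImplicitConversions._
import org.apache.kafka.streams.scala.serialization.Serdes._
import org.apache.kafka.streams.scala.StreamsBuilder
import org.apache.kafka.streams.{KafkaStreams, StreamsConfig}

object CountSketch extends App {
  // Roughly: CREATE TABLE counts AS SELECT key, COUNT(*) FROM input GROUP BY key;
  val builder = new StreamsBuilder()
  builder
    .stream[String, String]("input") // hypothetical input topic
    .groupByKey
    .count()
    .toStream
    .to("counts")                    // hypothetical output topic

  val props = new Properties()
  props.put(StreamsConfig.APPLICATION_ID_CONFIG, "count-sketch")
  props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092")
  new KafkaStreams(builder.build(), props).start()
}
```

The Streams version is plain code you can embed, test, and extend with custom types; the ksqlDB version needs only a SQL console.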
I want to know how to restore Scalar DB to another instance using a Cassy backup, because I need a new instance for testing based on the production environment.
There is no direct support in Cassy for loading backups taken in one cluster into another cluster.
Since Cassy only manages snapshots of Cassandra, you can follow the doc to do it manually.
For testing, I would recommend dumping some of the data from the current (possibly production) cluster and loading it into a new testing cluster.
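As a sketch of that dump-and-load approach, assuming the DataStax Java driver 4.x, with hosts, datacenter, keyspace, table, and columns all hypothetical, you could copy a bounded slice row by row:

```scala
import java.net.InetSocketAddress
import com.datastax.oss.driver.api.core.CqlSession
import scala.jdk.CollectionConverters._

object CopySliceSketch extends App {
  def connect(host: String): CqlSession =
    CqlSession.builder()
      .addContactPoint(new InetSocketAddress(host, 9042))
      .withLocalDatacenter("dc1")
      .build()

  val source = connect("prod-cassandra") // current (production) cluster
  val target = connect("test-cassandra") // new testing cluster

  // Copy a bounded slice; widen the SELECT for a larger test data set.
  val insert = target.prepare("INSERT INTO ks.users (id, name) VALUES (?, ?)")
  source.execute("SELECT id, name FROM ks.users LIMIT 1000").asScala.foreach { row =>
    target.execute(insert.bind(row.getUuid("id"), row.getString("name")))
  }
  source.close()
  target.close()
}
```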
From what I have read, both Azure Event Hubs and Apache Kafka can be used for event streaming; however, my questions are:
1) What sort of data source can be considered an 'event' source to use Event Hubs or Apache Kafka for?
2) In which use cases should we use Event Hubs rather than Apache Kafka, and vice versa?
Thank you.
Azure Event Hubs is a fully managed event streaming service, whereas with Apache Kafka you have to manage the servers yourself. Both products have commonalities and differences; I don't really want to list them all here. Here is a good quick read on when to choose one over the other - https://learn.microsoft.com/en-us/archive/blogs/opensourcemsft/choosing-between-azure-event-hub-and-kafka-what-you-need-to-know
The following are some of the scenarios where you can use Event Hubs:
Anomaly detection (fraud/outliers)
Application logging
Analytics pipelines, such as clickstreams
Live dashboarding
Transaction processing
User telemetry processing
Device telemetry streaming
I recommend starting with this doc - https://learn.microsoft.com/en-us/azure/event-hubs/event-hubs-about
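One commonality worth knowing: Event Hubs exposes a Kafka-compatible endpoint, so an existing Kafka client can usually produce to it with only configuration changes. A minimal sketch, where the namespace, hub name, and connection string are placeholders:

```scala
import java.util.Properties
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}

object EventHubsKafkaSketch extends App {
  val props = new Properties()
  // Event Hubs speaks the Kafka protocol on port 9093 of the namespace host.
  props.put("bootstrap.servers", "mynamespace.servicebus.windows.net:9093")
  props.put("security.protocol", "SASL_SSL")
  props.put("sasl.mechanism", "PLAIN")
  // Authenticate with the namespace connection string.
  props.put("sasl.jaas.config",
    "org.apache.kafka.common.security.plain.PlainLoginModule required " +
      "username=\"$ConnectionString\" password=\"<your-connection-string>\";")
  props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")
  props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer")

  val producer = new KafkaProducer[String, String](props)
  producer.send(new ProducerRecord("my-event-hub", "key", "value")) // the hub acts as the topic
  producer.close()
}
```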
How do I check how many streams a Kafka cluster can support before noticing degradation, and how do I scale up the cluster?
It will hugely depend on what your application is doing, the throughput, and so on. Some general resources to help you:
Elastic Scaling in the Streams API in Kafka
Kafka Streams Capacity planning and sizing
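On the scaling side, the elastic-scaling model from the first resource boils down to this: every Kafka Streams instance started with the same application.id joins one consumer group, and the input partitions are rebalanced across instances, so scaling out means starting more copies (up to the partition count). A minimal sketch, with the application id and broker address as placeholders:

```scala
import java.util.Properties
import org.apache.kafka.streams.{KafkaStreams, StreamsConfig, Topology}

object ScalingSketch {
  // Start one instance; running this same app on more machines scales out.
  def start(topology: Topology): KafkaStreams = {
    val props = new Properties()
    props.put(StreamsConfig.APPLICATION_ID_CONFIG, "my-streams-app") // shared by all instances
    props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092")
    props.put(StreamsConfig.NUM_STREAM_THREADS_CONFIG, "4") // vertical scaling per instance
    val streams = new KafkaStreams(topology, props)
    streams.start()
    streams
  }
}
```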
I am writing a Scala app that needs to interact with DynamoDB. I see many open-source libraries out there. Some examples are:
https://github.com/piotrga/async-dynamo
https://github.com/bizreach/aws-dynamodb-scala
https://github.com/seratch/AWScala
https://bitbucket.org/atlassian/aws-scala
https://dwhjames.github.io/aws-wrap/index.html
Or perhaps it's better to use the official AWS SDK in Java?
Anyone have experience with any of the libraries above?
Check out the Alpakka project, which provides a DynamoDB connector. Alpakka connectors are built on Akka Streams and provide a way to interact with various technologies and protocols in a reactive way.
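If a full reactive wrapper is more than you need, the official AWS SDK mentioned in the question also works directly from Scala through Java interop. A minimal sketch with AWS SDK for Java v2, where the region, table name, and key are hypothetical:

```scala
import scala.jdk.CollectionConverters._
import software.amazon.awssdk.regions.Region
import software.amazon.awssdk.services.dynamodb.DynamoDbClient
import software.amazon.awssdk.services.dynamodb.model.{AttributeValue, GetItemRequest}

object DynamoSketch extends App {
  val client = DynamoDbClient.builder().region(Region.US_EAST_1).build()

  // Fetch a single item by primary key.
  val request = GetItemRequest.builder()
    .tableName("users")
    .key(Map("id" -> AttributeValue.builder().s("u-1").build()).asJava)
    .build()

  println(client.getItem(request).item().asScala)
  client.close()
}
```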