How to use kafka and storm on cloudfoundry? - apache-kafka

I want to know if it is possible to run kafka as a cloud-native application, and can I create a kafka cluster as a service on Pivotal Web Services. I don't want only client integration, I want to run the kafka cluster/service itself?
Thanks,
Anil

I can point you at a few starting points, there would be some work involved to go from those starting points to something fully functional.
One option is to deploy the kafka cluster on Cloud Foundry (e.g. Pivotal Web Services) using docker images. Spotify has Dockerized kafka and kafka-proxy (including Zookeeper). One thing to keep in mind is that PWS currently doesn't support apps with persistence (although this work is starting) so if you were to go this route right now, you would lose the data in kafka when the application is rolled. Looking at that Spotify repo, it looks like the docker images are generally run without any mounted volumes, so this persistence-less kafka seems like it may be a valid use case (I don't know enough about kafka to say).
The other option is to deploy kafka directly on some IaaS (e.g. AWS) using BOSH. BOSH can be hard if you're seeing it for the first time, but it is the ideal way to deploy any distributed software that you want running on VMs. You will also be able to have persistent volumes attached to your kafka VMs if necessary. Here is a kafka BOSH release which may work.
Once you have your cluster running, you have two ways to integrate your Cloud Foundry applications with it. The simplest is just to provide it to your applications as a "user-provided service", which lets you flow kafka cluster access info to your apps. The alternative would to put a service broker in front of your cluster, which would be especially useful if you have many different people who will be pushing apps that need to talk to the kafka cluster. Rather than you having to manually tell people the access info each time, they can do something simple like cf bind-service SOME_APP YOUR_KAFKA_SERVICE. Here is a kafka service broker along with more info about service brokers in general.

According to the 12-factor app description (https://12factor.net/processes), Kafka should not run as an application on top of Cloud Foundry:
Twelve-factor processes are stateless and share-nothing. Any data that needs to persist must be stored in a stateful backing service, typically a database.
Kafka is often considered a "distributed commit log" and as such carries a large amount of state. Many companies use it to keep all events flowing through their distributed system of micro services for a long (sometimes unlimited) amount of time.
Therefore I would strongly recommend to go for the second option in the accepted answer: Kafka topics should be bound to your applications in the form of stateful services.

Related

Is it possible to use dolphinscheduler without zookeeper?

Zookeeper plays several roles in the open-source workflow framework dolphinscheduler, such as heartbeat detection among masters and workers, task queue,event listener and distributed lock.
dolphin-sche framework
Is it possible to replace it by using database (mysql)? The main reason is to simplify the project structure .
zookeeper in DS is mainly used as:
Task queue, for master sending tasks to worker
Lock, for the communication between host(masters and workers)
Event watcher. Master listens the event that worker added or removed
it costs to replace zk as mysql.
zk mainly assumes the responsibility of the registry and monitors the application status. zk is very mature in this area and is a recognized solution in the industry. If MySQL wants to do this, the technical implementation cost will be larger, and may not achieve the desired effect.
BTW, their team is currently working on the SPI development for the registry, and in later versions, perhaps you can use other components, such as etcd, to achieve similar functionality.
for now, MasterServer and the WorkerServer nodes in the system all use the Zookeeper for cluster management and fault tolerance. In addition, the system also performs event monitoring and distributed locking based on ZooKeeper. We have also implemented queues based on Redis, but we hope that DolphinScheduler relies on as few components as possible, so we finally removed the Redis implementation.
so now DolphinScheduler can't work fine without Zookeeper, maybe in the future.
DolphinScheduler System Architecture:
For more documents please refer: Official Document.

Application Performance monitoring on Swisscom Application Cloud

I am investigating options for monitoring our installation in Swisscom's cloud-foundry. My objectives are the following:
monitor performance indicators for deployed application (such as cpu, disk, memory)
monitor performance indicators for services (slow queries, number of queries, ideally also some metrics on hitting quotas)
So far, I understand the options are the following (including some BUTs):
I used a very nice TOP cf-plugin (github)
This works very well. It seems that it registers itself to get the required firehose nozzles and consume data.
That is very useful for tracing / ad-hoc monitoring, but not very good for a serious infrastructure monitoring.
Another way I found is to use firehose-syslog solution.
This can be deployed as an app to (as far as I understand) do the job in similar way, as the TOP cf plugin.
The problem is, that it requires registered client, so it can authenticate with the doppler endpoint. For some reason, the top-cf-plugin does that automatically / in another way.
Last option i am considering is to build the monitoring itself to the App (using a special buildpack)
That can be for example done with Datadog. But it seems to also require a dedicated uaa client to register the Nozzle.
I would like to check, if somebody is (was) on the similar road, has some findings.
Eventually I would like to raise the following questions towards the swisscom community support:
is it possible to register uaac client to be able to ingest events through the firehose nozzle from external service? (this requires admin credentials if I was reading correctly)
is there an alternative way to authenticate with the nozzle (for example using a special user and his authentication token?)
is there any alternative to monitor the CF deployments in Swisscom? Eventually, is there a paper, blogpost or other form of documentation, that would be helpful in this respect (also for other users of AppCloud)?
Since it requires admin permissions, we can not give out UAA clients for the firehose.
However, there are different ways to get metrics in context of a user.
CF API
You can obtain basic metrics of a specific app by polling the CF API:
https://apidocs.cloudfoundry.org/5.0.0/apps/get_detailed_stats_for_a_started_app.html
However, since you have to poll (and for each app), it's not the recommended way.
Metrics in syslog drain
CF allows devs to forward their logs to syslog drains; in more recent versions, CF also sends metrics to this syslog drain (see https://docs.cloudfoundry.org/devguide/deploy-apps/streaming-logs.html#container-metrics).
For example, you could use Swisscom's Elasticsearch service to store these metrics and then analyze it using Kibana.
Metrics using loggregator (firehose)
The firehose allows streaming logs to clients for two types of roles:
Streaming all logs to admins (which requires a UAA client with admin permissions) and streaming app logs and metrics to devs with permissions in the app's space. This is also what the cf logs command uses. cf top also works this way (it enumerates all apps and streams the logs of each app).
However, you will find out that most open source tools that leverage the firehose only work in admin mode, since they're written for the platform operator.
Of course you also have the possibility to monitor your app by instrumenting it (white box approach), for example by configuring Spring actuator in a Spring boot app or by including an agent of your favourite APM vendor (Dynatrace, AppDynamics, ...)
I guess this is the most common approach; we've seen a lot of teams having success by instrumenting their applications. Especially since advanced monitoring anyway requires you to create your own metrics as the firehose provided cpu/memory metrics are not that powerful in a microservice world.
However, option 2. would be worth a try as well, especially since the ELK's stack metric support is getting better and better.

difference between dcos-kafka-service and mesos-kafka

I’m doing a POC to deploy Kafka as an application on Mesos Cluster. I came across these 2 codebases on github. One developed by apache-mesos (github page) & other developed by mesosphere and can run only on DCOS (github page).
Question: Would like to know if there are any differences between DCOS-Kafka & mesos-Kafka in terms of features and extended functionality.
Regarding Mesos-Kafka:
I don’t see active participation on github (and some open issues) for mesos-kafka in the past months. Can I assume that the service is robust enough that I can use in production environment? Any Inputs on this would be helpful.
kakfa-mesos is a package that includes a release of Kafka and a custom mesos scheduler that was meant to work around issues with running Kafka as a stateful service on Marathon. I think post but confluent is useful. It also includes a RESTful api for doing ops tasks and aims to include these features in the future (this is pulled from the article I linked)
Integrating Kafka commands (e.g. kafka-topics, etc) into the Scheduler so it can be used through the CLI and REST API.
Auto-scaling clusters (including auto reassignment of partitions) so that the resources (CPU, RAM, etc.) that brokers are using can be used elsewhere in known valleys of traffic.
Rack-aware partition assignment for fault tolerance.
Hooks so that producers and consumers can also be launched from the Scheduler and managed with the cluster.
Automated partition reassignment based on load and traffic
I haven't used it in a production environment myself but it has the support of Confluent which is a good sign.
DC/OS Kafka on the other hand is a DC/OS service which will probably only be useful if you are already running or plan on running services through Mesosphere's DC/OS. It also includes an API and a CLI management tool but is less ambitious with additional features. It's current feature set includes
Single-command installation for rapid provisioning
Multiple clusters for multiple tenancy with DC/OS
High availability runtime configuration and software updates
Storage volumes for enhanced data durability, known as Mesos Dynamic * * Reservations and Persistent Volumes
Integration with syslog-compatible logging services for diagnostics and troubleshooting
Integration with statsd-compatible metrics services for capacity and performance monitoring

How to make the Eureka server strong?

I am new to Spring Cloud. Currently, I want to build a new micro service based on Spring Cloud. It is very easy to build a new Eureka server. But my question is that how to make it high availability ? For example I create two Eureka server and a load balancer. When one of the Eureka server is down, the system still works well. But I don't know to to consist registered information in the two Eureka server.
I have already asked something similar in the spring cloud gitter channel.
Because of the CAP theorem, something as a distributes Service discovery has to decide, either to provide availability, or more consistency, with a trade off to the other one.
in short, by quoting Spencer Gibb:
Eureka favors availability over consistency
so it is very available, while registred services may be not acutal anymore.
As Spencer suggested, if consistency is something you need more then availability, try Consul together with spring cloud consul intead

How do config tools like Consul "push" config updates to clients?

There is an emerging trend of ripping global state out of traditional "static" config management tools like Chef/Puppet/Ansible, and instead storing configurations in some centralized/distributed tool, of which the main players appear to be:
ZooKeeper (Apache)
Consul (Hashicorp)
Eureka (Netflix)
Each of these tools works differently, but the principle is the same:
Store your env vars and other dynamic configurations (that is, stuff that is subject to change) in these tools as key/value pairs
Connect to these tools/services via clients at startup and pull down your config KV pairs. This typically requires the client to supply a service name ("MY_APP"), and an environment ("DEV", "PROD", etc.).
There is an excellent Consul Java client which explains all of this beautifully and provides ample code examples.
My understanding of these tools is that they are built on top of consensus algorithms such as Zab, Paxos and Gossip that allow config updates to spread almost virally, with eventual consistency, throughout your nodes. So the idea there is that if you have a myapp app that has 20 nodes, say myapp01 through myapp20, if you make a config change to one of them, that change will naturally "spread" throughout the 20 nodes over a period of seconds/minutes.
My problem is: how do these updates actually deploy to each node? In none of the client APIs (the one I linked to above, the ZooKeeper API, or the Eureka API) do I see some kind of callback functionality that can be set up and used to notify the client when the centralized service (e.g. the Consul cluster) wants to push and reload config updates.
So I ask: how is this supposed to work (dynamic config deployment and reload on clients)? I'm interested in any viable answer for any of those 3 tools, though Consul's API seems to be the most advanced IMHO.
You could use cfg4j for that. It's a Java configuration library for distributed services. It supports Consul as one of the configuration sources.
That's a nice question. I can tell how Consul HTTP client works.
I also think initially that it works in the push mechanism but while I was recently exploring Consul, I found that all Consul clients poll server for changes they want to watch. Although it is a bit different polling mechanism, Consul supports blocking queries. These are HTTP requests with a max timeout of 10 mins. This query waits until there is some change on the watched key/folder and return with the latest index. If the index is changed, the client reloads the configuration. For more info : Consul Blocking Query