Workflow engine for Apache Kafka?

For our new SOA, we are looking at Kafka as a message broker.
I wish to ask: what are our options for organizing workflows in messaging?
It looks like there should be a coordinator/supervisor service which concentrates knowledge about the order in which messages should be published and processed. And writing it from scratch is probably not a good idea.
I've seen a few workflow engines, but none designed specifically for use with Kafka or recommended for use with it. I'm afraid that in choosing Kafka we've picked the wrong tool, because building complex and reliable workflows is a must from our perspective.
Any advice, please: a workflow engine? Another messaging platform? Going to work at McDonald's? :)
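To make the idea concrete, here is a bare-bones sketch of the kind of coordinator I have in mind, written against the plain Kafka consumer/producer API. The topic names, the fixed step list, and the absence of persistence and error handling are all illustrative simplifications, not a proposed design:

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class WorkflowCoordinator {
    // The step order is owned by the coordinator; a real engine would persist
    // per-workflow state instead of hard-coding a single linear sequence.
    private static final List<String> STEPS = List.of("reserve-stock", "charge-payment", "ship-order");

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "workflow-coordinator");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
             KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // Workers report finished steps on "step-completed"; the coordinator
            // replies with the next command on "step-commands".
            consumer.subscribe(List.of("step-completed"));
            while (true) {
                for (ConsumerRecord<String, String> done : consumer.poll(Duration.ofMillis(500))) {
                    int next = STEPS.indexOf(done.value()) + 1; // record value = finished step name
                    if (next > 0 && next < STEPS.size()) {
                        // record key = workflow instance id, value = next command
                        producer.send(new ProducerRecord<>("step-commands", done.key(), STEPS.get(next)));
                    }
                }
            }
        }
    }
}
```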

Related

Appropriate event architecture for producing an event graph of a company's microservices automation?

Hello, we are trying to determine the best or most appropriate architecture for tracking events as they occur between microservices. We know loose coupling is the starting point for good microservice design. The use case is literally to query how a company's automation is taking place, in real time or historically. There are many products in this space, such as Kafka, Solace, and MassTransit (mediator, riders, message queues, sagas, routing slips).
Just looking for basic advice at this point. We have to implement the saga and routing slip patterns to satisfy our business model.
I would recommend starting by taking a look at OpenTelemetry (OTel). It's a CNCF project, so it's not tied to a specific product, and its goal is to provide a level of observability across your architecture, including the ability to trace across distributed apps (whether they are sync or async).
I will warn that there is currently a SIG focusing on defining the messaging semantics, so this isn't a fully baked solution at this point. You can find more info on that SIG here. They are working to replace the existing experimental messaging semantic conventions with a stable document as things go GA.
That said, you'd probably want to start by instrumenting your apps/microservices, and OTel has a number of auto-instrumentation libraries for different APIs & languages in various OTel repos. For example, the repo for the Java agent with a number of auto-instrumentation implementations (including JMS) can be found here: https://github.com/open-telemetry/opentelemetry-java-instrumentation. The idea of the auto-instrumentation is that it doesn't require app code changes, so as things evolve it should be easy for you to evolve with them, which is obviously ideal since the messaging semantics are still in progress.
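If you do end up adding manual spans alongside the agent, the OTel Java API is small. Here is a minimal sketch; the tracer name "order-service" and the span/attribute names are arbitrary examples, not OTel conventions:

```java
import io.opentelemetry.api.GlobalOpenTelemetry;
import io.opentelemetry.api.trace.Span;
import io.opentelemetry.api.trace.Tracer;
import io.opentelemetry.context.Scope;

public class OrderHandler {
    // The instrumentation scope name is arbitrary; pick one per service/library.
    private static final Tracer TRACER = GlobalOpenTelemetry.getTracer("order-service");

    public void handle(String orderId) {
        // Each handled message becomes a span; the agent/SDK takes care of export.
        Span span = TRACER.spanBuilder("process-order").startSpan();
        try (Scope ignored = span.makeCurrent()) {
            span.setAttribute("order.id", orderId);
            // ... business logic; downstream calls made here are linked to this span ...
        } finally {
            span.end();
        }
    }
}
```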
Agree with the OpenTelemetry comment above. As far as what is currently most widely used in microservices (by any tech or framework involved), it's OTel's predecessor, OpenTracing (OpenTelemetry is backward compatible with OpenTracing, and OpenTracing is also a CNCF project).
If interested, we have created a microservices workshop that uses tracing across both REST and messaging (specifically JMS), so you can see working examples. You can view these traces in the Jaeger UI and such. In the same workshop we show "unified observability", where metrics, logging, and tracing all appear within the same Grafana console. This is really handy as you can, e.g., see metrics and/or get a metrics alarm/alert, drill down to its corresponding log entries, and then click/drill down directly from a log entry to its trace (and vice versa).
Full disclosure, I am from Oracle, and so we also show the ability to trace from the OpenTracing/Kubernetes layer down into the database, thus showing the "3 pillars of observability" across both the app and data tiers. But you don't need the database to do any of the other tracing shown. The workshop is here: https://apexapps.oracle.com/pls/apex/dbpm/r/livelabs/view-workshop?p180_id=637 and the source is here: https://github.com/oracle/microservices-datadriven. I'm definitely interested if you'd like to see anything added or have any questions. We'll be adding more OpenTelemetry to it as its adoption grows.

Is it possible to scale Axon Framework without Axon Server Enterprise

Is it possible to scale Axon Framework without Axon Server Enterprise? I'm interested in creating a prototype CQRS app with Axon, but the final, deployable system has to be free from licensing fees. If Axon Framework can't be scaled to half a dozen nodes using free software, then I should probably look elsewhere.
If Axon Framework turns out not to be a good choice for the system, what would you recommend? Would building something around Apache Pulsar be a sensible alternative?
I think I have good news for you then.
You can utilize Axon Framework perfectly fine without Axon Server Enterprise.
Firstly, you can use the Axon Server Standard edition, which is completely free, and you can check out the code too if you want.
If you prefer to get infrastructure back in your own hands, you can also select different approaches to distributing the CommandBus and the EventBus/EventStore.
For the CommandBus the framework provides the DistributedCommandBus, for which two implementations are in place:
JGroups
Spring Cloud
I'd argue option 2 (Spring Cloud) is the better fit for distributing your commands, as it gives you the freedom to choose whichever Spring Cloud Discovery Service implementation you desire. That should give you the handles to work "free of licenses" in that area.
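For a feel of what that wiring looks like, here is a minimal sketch using the builder APIs from the Axon Spring Cloud extension (Axon 4.x; the bean layout and the choice of a plain RestTemplate are my own assumptions, so check the Reference Guide for the exact, current signatures):

```java
import org.axonframework.commandhandling.SimpleCommandBus;
import org.axonframework.commandhandling.distributed.AnnotationRoutingStrategy;
import org.axonframework.commandhandling.distributed.CommandBusConnector;
import org.axonframework.commandhandling.distributed.CommandRouter;
import org.axonframework.commandhandling.distributed.DistributedCommandBus;
import org.axonframework.extensions.springcloud.commandhandling.SpringCloudCommandRouter;
import org.axonframework.extensions.springcloud.commandhandling.SpringHttpCommandBusConnector;
import org.axonframework.serialization.Serializer;
import org.springframework.cloud.client.discovery.DiscoveryClient;
import org.springframework.cloud.client.serviceregistry.Registration;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.context.annotation.Primary;
import org.springframework.web.client.RestTemplate;

@Configuration
public class DistributedCommandBusConfig {

    @Bean
    public CommandRouter commandRouter(DiscoveryClient discoveryClient, Registration localInstance) {
        // Uses whichever Spring Cloud Discovery Service you picked (Eureka,
        // Consul, Kubernetes, ...) to find the other command-handling nodes.
        return SpringCloudCommandRouter.builder()
                .discoveryClient(discoveryClient)
                .localServiceInstance(localInstance)
                .routingStrategy(new AnnotationRoutingStrategy())
                .build();
    }

    @Bean
    public CommandBusConnector commandBusConnector(RestTemplate restTemplate, Serializer serializer) {
        // Forwards a command over HTTP to the node that should handle it.
        return SpringHttpCommandBusConnector.builder()
                .localCommandBus(SimpleCommandBus.builder().build())
                .restOperations(restTemplate)
                .serializer(serializer)
                .build();
    }

    @Bean
    @Primary
    public DistributedCommandBus distributedCommandBus(CommandRouter router, CommandBusConnector connector) {
        return DistributedCommandBus.builder()
                .commandRouter(router)
                .connector(connector)
                .build();
    }
}
```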
For distributing your Events, you can broadly look at two approaches:
Share the database, aka your EventStore, among all instances
Use an event message bus to distribute your event messages
If you want instances of your nodes to be able to event-source the Command Model, you'll need option 1, as Axon Framework requires a dedicated EventStore to be able to source the Command Models from history.
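As a rough illustration of option 1, a shared JPA-backed event store can be configured like this (again Axon 4.x builder APIs; the surrounding JPA and transaction beans are assumed to exist already):

```java
import org.axonframework.common.jpa.EntityManagerProvider;
import org.axonframework.common.transaction.TransactionManager;
import org.axonframework.eventsourcing.eventstore.EmbeddedEventStore;
import org.axonframework.eventsourcing.eventstore.EventStore;
import org.axonframework.eventsourcing.eventstore.jpa.JpaEventStorageEngine;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class SharedEventStoreConfig {

    @Bean
    public EventStore eventStore(EntityManagerProvider entityManagerProvider,
                                 TransactionManager transactionManager) {
        // Every node points this engine at the same database, so any node can
        // rehydrate any aggregate from the full event history.
        JpaEventStorageEngine storageEngine = JpaEventStorageEngine.builder()
                .entityManagerProvider(entityManagerProvider)
                .transactionManager(transactionManager)
                .build();
        return EmbeddedEventStore.builder().storageEngine(storageEngine).build();
    }
}
```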
When you just want to connect other applications to your Event Stream, option 2 would be a good fit. Again, the framework has two options in this area:
AMQP
Kafka
One additional thing I'd point out here is that the Kafka extension is still in a release candidate state, though it is being actively worked on at the moment.
All these extensions and their options should be clearly stated in the Reference Guide, so I'd definitely check this documentation out if you are gonna start an application.
There is one thing you cannot distribute though: the QueryBus.
There is an outstanding issue to resolve this, for which someone put the effort in to provide a PR.
After seeing how Axon Server Standard edition does it though, he intentionally closed the PR (with the following comment) as it didn't seem feasible to him to maintain such a tool at this stage.
So, can you use Axon Framework without Axon Server Enterprise?
Well, I think you can. :-)
Mind you though: although you'd win by not having a license fee if you don't use Axon Server Enterprise, it doesn't mean getting to production is gonna be free.
You'd be taking back quite a bit of infrastructure setup and operational work to get this going.
Hope this gives you plenty of feedback #ahoffer!

Enterprise Wide Cluster Streaming System

I'm interested in deploying an enterprise service bus on a fault tolerant system with a variety of use cases that include tracking network traffic and analyzing social media data. I'd prefer to use a streaming application, but open to the idea of micro-batching. The solution will need to be able to take imports and exports from a variety of sources (bus).
I have been researching various types of stream-processing software platforms here:
https://thenewstack.io/apache-streaming-projects-exploratory-guide/
But I've been struggling with the fact that many (all) of these projects are open source and I don't like the large implementation risk.
I have found Apache Kafka attractive because of the Confluent Platform built on top of it, but I can't seem to find anything similar to Confluent out there, and want to know if there are any direct competitors built on top of another Apache project. Or an entirely proprietary solution.
Any help would be appreciated! Thanks in advance!

Best approach to construct a real-time rule engine for our streaming events

We are at the beginning of building an IoT cloud platform project. There are certain well-known components needed for a complete IoT platform solution. One of them is a real-time rule processing/engine system, which is needed to determine whether streaming events match any rules defined dynamically by end users in a readable format (SQL, Drools if/when/then, etc.).
I am quite confused because there are lots of products and projects out there (Storm, Spark, Flink, Drools, EsperTech, etc.), so, considering we have a 3-person development team (a junior, a mid-senior, a senior), what would be the best choice?
Choosing one of the streaming projects such as Apache Flink and learning it well?
Choosing one of the complete solutions (AWS, Azure, etc.)?
A BRMS (Business Rule Management System) like Drools is mainly built for quickly adapting to changes in business logic, and such systems are more mature and stable compared to stream-processing engines like Apache Storm, Spark Streaming, and Flink. Stream-processing engines are built for high throughput and low latency. A BRMS may not be suitable for serving hundreds of millions of events in IoT scenarios and may have difficulty dealing with event-time-based window calculations.
All these solutions can be run on IaaS providers. On AWS you may also want to take a look at AWS EMR and Kinesis/Kinesis Analytics.
Some use cases I've seen:
Stream data directly to FlinkCEP (see the sketch after this list).
Use rule engines to do fast response with low latency, at the same time stream data to Spark for analysis and machine learning.
You can also run Drools in Spark and Flink to hot-deploy user-defined rules.
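For the FlinkCEP case, here is a small self-contained sketch. The TemperatureEvent type, the inline source, and the threshold rule are all hypothetical; a real job would read from Kafka or another connector and would load user-defined rules dynamically:

```java
import java.util.List;
import java.util.Map;
import org.apache.flink.cep.CEP;
import org.apache.flink.cep.PatternSelectFunction;
import org.apache.flink.cep.PatternStream;
import org.apache.flink.cep.pattern.Pattern;
import org.apache.flink.cep.pattern.conditions.SimpleCondition;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.windowing.time.Time;

public class OverheatAlertJob {

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Stand-in for a real source such as a Kafka consumer.
        DataStream<TemperatureEvent> events = env.fromElements(
                new TemperatureEvent("sensor-1", 99.0),
                new TemperatureEvent("sensor-1", 101.5),
                new TemperatureEvent("sensor-1", 103.2));

        // Rule: two consecutive readings above 100 within 10 seconds.
        Pattern<TemperatureEvent, ?> overheat = Pattern.<TemperatureEvent>begin("first")
                .where(new SimpleCondition<TemperatureEvent>() {
                    @Override
                    public boolean filter(TemperatureEvent e) {
                        return e.temperature > 100.0;
                    }
                })
                .next("second")
                .where(new SimpleCondition<TemperatureEvent>() {
                    @Override
                    public boolean filter(TemperatureEvent e) {
                        return e.temperature > 100.0;
                    }
                })
                .within(Time.seconds(10));

        PatternStream<TemperatureEvent> matches = CEP.pattern(events, overheat);
        matches.select(new PatternSelectFunction<TemperatureEvent, String>() {
            @Override
            public String select(Map<String, List<TemperatureEvent>> match) {
                return "ALERT: overheating on " + match.get("first").get(0).sensorId;
            }
        }).print();

        env.execute("overheat-alert");
    }

    public static class TemperatureEvent {
        public String sensorId;
        public double temperature;

        public TemperatureEvent() {}

        public TemperatureEvent(String sensorId, double temperature) {
            this.sensorId = sensorId;
            this.temperature = temperature;
        }
    }
}
```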
Disclaimer: I work for Losant, but you should check it out. It's developer-friendly and super easy to get started with. We also have a workflow engine where you can build custom logic/rules for your application.
Check out the Waylay rules engine, built specifically for real-time IoT data streams.
In the beginning phase, go for a cloud-based IoT platform like Predix, AWS, SAP, or Watson for rapid product development and initial learning.

Does Apache NiFi support version control

I am trying to explore Apache NiFi. So far I haven't seen any way to version-control flows.
Is there a way to version control flows when multiple users are trying to develop in the same instance?
What about code merge from multiple users?
Any help in this regard will help me continue my exploration.
In addition to James's great answer, I'll also point out that this approach to flow management has leveraged external version control systems and put the task on the user to perform. What I mean is that users (or automated processes) could initiate the production of a template and then store that template in a VCS. This has worked well, but it is also insufficient.
The other direction is also important: given a versioned flow, one would like it to be automatically reflected on another cluster/system/environment. Think of the software development lifecycle one might go through when building flows in a development environment and proving/vetting them into and through production. Or think of a production case where behavior is not as expected. While NiFi offers a really powerful interactive command-and-control model, sometimes people want to be able to test new approaches and theories in another environment. As a result, we're working now on a really awesome capability.
Come join the conversation. We'd like to hear your thoughts.
Thanks
NiFi Templates are a great format for configuration management of a NiFi flow. You can define templates for everything from small example snippets up to large nested process group structures, essentially your entire flow. Templates will include processors, queues, and controller services, but will not contain sensitive values like passwords. Templates are stored as XML files friendly to source control (since NiFi v1.0).
Templates provide a way for individual developers to separately build parts of a flow, then merge the parts together in a single NiFi. If you match templates with process groups, swapping out the old one with the new one can be fairly easy and intuitive.
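As a concrete example of the "export a template, then version it" workflow described above, the sketch below downloads a template's XML over NiFi's REST API and writes it to a file that can then be committed like any other source file. The base URL, the lack of authentication, and the template UUID are all assumptions for illustration:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.file.Path;

public class ExportTemplate {
    public static void main(String[] args) throws Exception {
        // Hypothetical template UUID; templates can be listed via GET /nifi-api/flow/templates.
        String templateId = "11111111-2222-3333-4444-555555555555";

        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:8080/nifi-api/templates/" + templateId + "/download"))
                .GET()
                .build();

        // Save the template XML; committing it afterwards is a normal
        // `git add flow-template.xml && git commit`.
        HttpClient.newHttpClient().send(request,
                HttpResponse.BodyHandlers.ofFile(Path.of("flow-template.xml")));
    }
}
```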
The answer to this question is YES: you can use NiFi Registry for version control.
The project page is:
https://nifi.apache.org/registry.html