Expected problems and limitations of implementing Kafka Producer in Azure Function

I have a rather high-level, architectural question which might not have a 100% clear answer. We're currently thinking about implementing a Kafka Producer within Azure Functions, as opposed to having a dedicated Producer client running in some container. The Azure Function would be invoked by a REST API call which includes the payload. The alternative solution would be similar in shape: the Producer application would expose a custom API endpoint via some Java-based framework to take data in, which would then be passed to Kafka via the Producer API - a constantly running Java application in some container (and, if necessary, replicated for parallelism).
My gut feeling tells me this approach with Azure Functions might not be good practice, because as far as I'm aware the Producer concept in Kafka is more something "continuous" rather than something instantiated "per record", and not as short-lived as an Azure Function, which may be instantiated thousands of times in a short period. This approach seems unintuitive to me, as we would go through a whole Producer lifecycle for each incoming record, generating a lot of additional network traffic to our Kafka cluster and potentially resulting in arbitrary message ordering (negligible for some use cases), disregarding the fact that it's probably quite an expensive solution.
But I could also be completely mistaken; maybe it is good/best practice and there are no significant downsides regarding the concerns I mentioned. Technically, the Azure Functions approach should be much easier to scale, and depending on the load it could actually be cheaper to invoke X Azure Functions instead of running a producer 24/7, but that is highly dependent on the use case. Operations in the "custom Producer" case are also something that needs to be taken into account; serverless does not require the same considerations regarding operations/deployment/maintenance.
Any thoughts or experiences on this?

No, producers aren't necessarily continuous. If you've used kafka-console-producer, then you've already seen short-lived producers. Lambda/Function methods are no different.
Plus, Java is not necessary. Save yourself some cost/startup time and don't trigger a JVM startup within a serverless function. Or, if you do, compile a native binary using GraalVM (Quarkus or Spring Native can help with this).
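Whichever runtime you pick, the main thing is to avoid creating a producer per record: create it once per function host instance and reuse it across warm invocations. A rough Java sketch of what that could look like (the topic name, environment variable, and producer settings are illustrative assumptions, not a prescribed setup):

    import com.microsoft.azure.functions.ExecutionContext;
    import com.microsoft.azure.functions.HttpMethod;
    import com.microsoft.azure.functions.HttpRequestMessage;
    import com.microsoft.azure.functions.HttpResponseMessage;
    import com.microsoft.azure.functions.HttpStatus;
    import com.microsoft.azure.functions.annotation.AuthorizationLevel;
    import com.microsoft.azure.functions.annotation.FunctionName;
    import com.microsoft.azure.functions.annotation.HttpTrigger;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;

    import java.util.Optional;
    import java.util.Properties;

    public class IngestFunction {

        // Created once per function host instance; reused across warm invocations.
        private static final KafkaProducer<String, String> PRODUCER = createProducer();

        private static KafkaProducer<String, String> createProducer() {
            Properties props = new Properties();
            props.put("bootstrap.servers", System.getenv("KAFKA_BOOTSTRAP")); // assumed app setting
            props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
            props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
            props.put("acks", "all");
            return new KafkaProducer<>(props);
        }

        @FunctionName("ingest")
        public HttpResponseMessage run(
                @HttpTrigger(name = "req", methods = {HttpMethod.POST},
                             authLevel = AuthorizationLevel.FUNCTION)
                HttpRequestMessage<Optional<String>> request,
                ExecutionContext context) {

            String payload = request.getBody().orElse("");
            // Async send; ordering is only what a single producer instance provides.
            PRODUCER.send(new ProducerRecord<>("ingest-topic", payload)); // assumed topic name
            return request.createResponseBuilder(HttpStatus.ACCEPTED).build();
        }
    }

Note that each scaled-out function instance still gets its own producer, so ordering across instances is not guaranteed either way.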

Related

Microservices communication model

Consider a microservices architecture where you need to expose functionality to manage simple configuration shared across different microservices. The configuration does not change often, but I would still like to see changes whenever I ask for any value.
Using a REST microservice seems easy, but it adds latency.
An alternative could be RPC over messaging (e.g. RabbitMQ), but the interface becomes more complicated.
What communication style are you using for internal, simple services, and what are the pros and cons?
Any examples?
I tried a REST API, but it means a lot of "slow" requests, which add latency to the overall request.
I've found that using RESTful APIs with some judicious implementation of cache-control headers actually works fairly well for this use case. The biggest challenge is ensuring that the HTTP client underneath your REST client actually respects those headers.
It's fairly easy to implement, fits nicely into HTTP, and generally scales really well. It gives the client control over whether to respect the caching suggestions, and allows the server to answer cheaply when it "knows" the config hasn't changed (304 Not Modified) if the client asks for a new version.
You don't have to get into anything too complicated from a cache-invalidation standpoint, and you can leverage things like edge caching to further accelerate things in interesting ways.
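As a rough illustration of the ETag/304 flow (a sketch only; the endpoint, version handling, and max-age are assumptions, and any Java web framework with conditional-request support would do):

    import org.springframework.http.CacheControl;
    import org.springframework.http.ResponseEntity;
    import org.springframework.web.bind.annotation.GetMapping;
    import org.springframework.web.bind.annotation.RestController;
    import org.springframework.web.context.request.WebRequest;

    import java.util.concurrent.TimeUnit;

    @RestController
    public class ConfigController {

        private volatile String config = "{\"featureX\": true}";
        private volatile String version = "v1"; // bumped whenever the configuration changes

        @GetMapping("/config")
        public ResponseEntity<String> getConfig(WebRequest request) {
            // If the client's If-None-Match matches the current version, Spring writes a 304 for us.
            if (request.checkNotModified(version)) {
                return null;
            }
            return ResponseEntity.ok()
                    .eTag(version)
                    .cacheControl(CacheControl.maxAge(60, TimeUnit.SECONDS))
                    .body(config);
        }
    }

Clients that revalidate get a cheap 304 most of the time and only pull the body when the version actually changes.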
The question to ask is ultimately the extent to which it is a requirement that a change to the configuration immediately affects everything.
If that's actually a requirement, then we're talking about strong consistency which implies some combination of:
all other processing must be effectively executed one-at-a-time against the component against which the change is made (there can ultimately only be one: if there are multiple, they will be affected at different times)
all other processing must stop for the duration of time that it takes to propagate the change to all components
(these can be combined: you can have multiple instances depend on the configuration and stop for as long as it takes to update those and then you can execute things in parallel... an example of this is making it static configuration in the dependent services and taking them all down to update the configuration: if these updates are sufficiently rare, you can fit them into your error/downtime budget)
Needless to say, there's a (likely surprisingly small) consistency budget you're dealing with.
If you don't actually need strong absolute consistency like I've described (and the set of problems which actually need it is perhaps surprisingly small: anything to do with money for instance doesn't actually need strong consistency because it's only money), then it's a question of how much inconsistency is acceptable (typically you'll quantify this with some sort of bounded staleness and a liveness guarantee that you don't go back in time (unless there's a really good reason to go back in time...)). At this point, we've established that you want eventual consistency, we're just haggling over "how eventual?".
For this, propagating the configuration changes via durable publish-subscribe log (Kafka being the exemplar of this approach) is probably the place to start. Components subscribe to this log and update local state as it changes (and probably store the log position and the last value in some local store to prevent inadvertently going backward in time when they initially read the log). Then you can distribute the configuration so that it's in local memory of the subscribers, though during an update, there will be a window where different subscribers will have different views of that configuration.
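A minimal sketch of such a subscriber, assuming a compacted "config-changes" topic keyed by config name (the topic name, broker address, and the simplification of replaying from the beginning instead of persisting the log position are all assumptions):

    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;

    import java.time.Duration;
    import java.util.Collections;
    import java.util.Map;
    import java.util.Properties;
    import java.util.concurrent.ConcurrentHashMap;

    // Keeps an in-memory copy of the shared configuration, updated from the log.
    public class ConfigSubscriber {

        private final Map<String, String> localConfig = new ConcurrentHashMap<>();

        public void run() {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");   // assumed broker address
            props.put("group.id", "config-subscriber-" + java.util.UUID.randomUUID());
            props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
            props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
            props.put("auto.offset.reset", "earliest");          // replay the whole log on first start

            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                consumer.subscribe(Collections.singletonList("config-changes")); // assumed topic name
                while (true) {
                    ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                    for (ConsumerRecord<String, String> record : records) {
                        // Last write wins: later offsets overwrite earlier values for the same key.
                        localConfig.put(record.key(), record.value());
                    }
                }
            }
        }
    }

Persisting the offset and last values locally, as suggested above, is what keeps a restart from going backwards in time.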
A lot of solutions exist to externalize microservice configuration to a central location, depending on which frameworks/programming languages you used to build your services. If you happen to be using Spring, take a look at Spring Cloud Config. Of course, it is not the only solution tailored for this purpose.
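For example, with Spring Cloud Config the consuming side can stay very small; a rough sketch (the property name and endpoint are made up for illustration):

    import org.springframework.beans.factory.annotation.Value;
    import org.springframework.cloud.context.config.annotation.RefreshScope;
    import org.springframework.web.bind.annotation.GetMapping;
    import org.springframework.web.bind.annotation.RestController;

    // With Spring Cloud Config, the property value comes from the central config server.
    @RefreshScope   // re-created with fresh values when a refresh event arrives
    @RestController
    public class FeatureFlagController {

        @Value("${feature.x.enabled:false}")   // illustrative property name with a default
        private boolean featureXEnabled;

        @GetMapping("/feature-x")
        public boolean isFeatureXEnabled() {
            return featureXEnabled;
        }
    }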

Does it make sense to abstract common client concerns in a separate API

Recently I worked on different client-side APIs, such as an HTTP REST client, a messaging client, and a database client.
In each case, the same concerns sprang up, which are the following:
Connection pooling
Async and non-blocking I/O with clean error handling
Request retrying with a backoff policy implementation (this is more the case for REST and messaging)
Request batching (this is more the case for databases)
The way I see it, the above concerns can be abstracted from the underlying request in a separate API. Furthermore, due to the complexity of coding the above concerns, it makes sense not to pay the cost multiple times.
Therefore, I would have expected to have a generic client helper API which would permit me to retry and batch any sort of request, all while performing all requests asynchronously.
It would be kind of a task executor API, but without the other complexities (such as scheduling, since there is only one task that needs to be executed).
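A minimal sketch of the retry part of such a helper, built only on the JDK (the class and method names are made up; backoff here is a simple doubling delay):

    import java.util.concurrent.CompletableFuture;
    import java.util.concurrent.Executors;
    import java.util.concurrent.ScheduledExecutorService;
    import java.util.concurrent.TimeUnit;
    import java.util.function.Supplier;

    // Generic helper: retries any async request with exponential backoff.
    public final class RetryingClient {

        private static final ScheduledExecutorService SCHEDULER =
                Executors.newSingleThreadScheduledExecutor();

        public static <T> CompletableFuture<T> withRetry(Supplier<CompletableFuture<T>> request,
                                                         int maxAttempts,
                                                         long initialDelayMillis) {
            CompletableFuture<T> result = new CompletableFuture<>();
            attempt(request, 1, maxAttempts, initialDelayMillis, result);
            return result;
        }

        private static <T> void attempt(Supplier<CompletableFuture<T>> request,
                                        int attempt,
                                        int maxAttempts,
                                        long delayMillis,
                                        CompletableFuture<T> result) {
            request.get().whenComplete((value, error) -> {
                if (error == null) {
                    result.complete(value);
                } else if (attempt >= maxAttempts) {
                    result.completeExceptionally(error);   // give up, propagate the last failure
                } else {
                    // Schedule the next attempt with a doubled delay (exponential backoff).
                    SCHEDULER.schedule(
                            () -> attempt(request, attempt + 1, maxAttempts, delayMillis * 2, result),
                            delayMillis, TimeUnit.MILLISECONDS);
                }
            });
        }
    }

Any request that already returns a CompletableFuture (an async HTTP call, a messaging send wrapped in a future, a batched database write) could then be passed through the same helper.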
Hence my question: does it make sense to abstract these concerns into a separate, shared API, or am I missing something?
I would say to keep them separate. My guess is that you'll find 3rd-party solutions for each of these, but I don't know of any libraries that would do all of them.
I'm not sure if you're programming in Java, but I think the Apache project has done a good job of segmenting utilities in their commons-* libraries. You may want to draw some inspiration from there.
https://commons.apache.org/

Spring Cloud Stream flow as one application

As far as I know, there is an option to run a couple of Spring Cloud Stream components as one application by using AggregateApplication or AggregateApplicationBuilder.
From what I understood, Spring will not use a broker (Rabbit or Kafka) for communication between steps in this situation; it will just pass the result from the previous step as an argument to the next, almost directly. Am I right?
If I am, is there another way to run more components in one instance of an application while still using a broker? I'm aware that this is not a great architecture for Cloud Stream, but right now I don't have infrastructure in which I can run Data Flow, and I would also like to use the durability of a broker.
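For reference, the aggregation I'm referring to looks roughly like this in the 1.x-era documentation (I may be misremembering details of the builder API, so treat the exact signatures as an assumption):

    import org.springframework.boot.autoconfigure.SpringBootApplication;
    import org.springframework.cloud.stream.aggregate.AggregateApplicationBuilder;

    @SpringBootApplication
    public class SampleAggregateApplication {
        public static void main(String[] args) {
            // SourceApplication, ProcessorApplication and SinkApplication are your own binding classes.
            // They are wired together in-process, without the broker between the steps.
            new AggregateApplicationBuilder()
                    .from(SourceApplication.class)
                    .via(ProcessorApplication.class)
                    .to(SinkApplication.class)
                    .run(args);
        }
    }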
In general, aggregation has been designed as a replacement for communication over a message broker - to reduce latency by avoiding an extra hop. That being said, it may make sense to add an option to have the channels bound for use cases like yours. Can you open a feature request on GitHub, please?

Akka.Net work queues

I have an existing distributed computing framework built on top of MassTransit and RabbitMQ. There is essentially a manager which responds with work based on requests. Each worker will take a certain number of items based on the physical machine specs. The worker then sends completion messages when done. It works rather well and seems to be highly scalable, since the only link is the service bus.
I recently evaluated Akka.NET in order to see if that would be a simpler system to implement the same pattern. After looking at it, I was somewhat confused about what exactly it is used for. It seems that if I wanted to do something similar, the manager would have to know about each worker ahead of time and directly send it work.
I believe I am missing something because that model doesn't seem to scale well.
Service buses like MassTransit are built as reliable messaging services. Ensuring message delivery is the primary concern there.
Actor frameworks also use messages, but this is the only similarity. Messaging is only a means to an end, and it's not as reliable as in the case of service buses. They are more oriented toward building high-performance, easily distributed system topologies, centered around actors as the primary unit of work. Conceptually, an actor is close to the Active Record pattern (however, this is a great simplification). They are also very lightweight. You can have millions of them living in the memory of the executing machine.
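To make the contrast concrete, here is a rough JVM-Akka sketch of how a manager can fan work out to a pool of workers without addressing each worker individually (Akka.NET has equivalent router pools; all names here are illustrative):

    import akka.actor.AbstractActor;
    import akka.actor.ActorRef;
    import akka.actor.ActorSystem;
    import akka.actor.Props;
    import akka.routing.RoundRobinPool;

    // Illustrative worker: handles one work item per message, one message at a time.
    class Worker extends AbstractActor {
        @Override
        public Receive createReceive() {
            return receiveBuilder()
                    .match(String.class, workItem -> {
                        // ... do the actual work here ...
                        getSender().tell("done: " + workItem, getSelf());
                    })
                    .build();
        }
    }

    public class WorkDistribution {
        public static void main(String[] args) {
            ActorSystem system = ActorSystem.create("work");
            // The "manager" only talks to the router; the pool fans messages out to its workers.
            ActorRef workers = system.actorOf(
                    new RoundRobinPool(8).props(Props.create(Worker.class)), "workers");
            for (int i = 0; i < 100; i++) {
                workers.tell("item-" + i, ActorRef.noSender());
            }
        }
    }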
When it comes to performance, Akka.NET is able to send over 30 million messages/sec on a single VM (tested on 8 cores) - a lot more than any service bus, but the characteristics also differ significantly.
On the JVM we know that Akka clusters can grow to 2400 machines. Unfortunately, we were not able to test what the .NET implementation's limits are.
You have to decide what you really need: a messaging library, an actor framework, or a combination of both.
I agree with Horusiath's answer. In addition, I'd say that in most cases you can replace a service bus with the messaging system of an actor model like Akka, but they are not in the same class.
Messaging is just one thing that Akka provides, and while it's a great feature, I wouldn't say it's the main one. When analyzing it as an alternative, you must first look at the benefits of the model itself and then look at whether the messaging capabilities are good enough for your use case. You can still use a dedicated external service bus to distribute messages across different clusters and keep Akka.NET exchanging messages inside clusters, for example.
But the point is that if you decide to use Akka.NET, you won't be using it only for messaging.

Is it good to put jdbc operations in actors?

I am building a traditional webapp that does database CRUD operations through JDBC, and I am wondering if it is good to put JDBC operations into actors, off the current request-processing thread. I did some searching but found no tutorials or sample applications that demonstrate this.
So what are the pros and cons? Will this asynchronization improve the capacity of the app server (i.e. the number of concurrent requests processed), like NIO does?
Whether putting JDBC access in actors is 'good' or not greatly depends upon the rest of your application.
Most web applications today are synchronous, thanks to the Servlet API that underlies most Java (and Scala) web frameworks. While we're now seeing support for asynchronous servlets, that support hasn't worked its way up through all frameworks. Unless you start with a framework that supports asynchronous processing, your request processing will be synchronous.
As for JDBC, JDBC is synchronous. Realistically there's never going to be anything done about that, given the burden that would place on modifying the gazillion JDBC driver implementations that are out in the world. We can hope, but don't hold your breath.
And the JDBC implementations themselves don't have to be thread safe, so invoking an operation on a JDBC connection prior to the completion of some other operation on that same connection will result in undefined behavior. And undefined behavior != good.
So my guess is that you won't see quite the same capacity improvements that you see with NIO.
Edit: Just discovered adbcj, an asynchronous database driver API. It's a very early, experimental project written for a master's thesis. It's a worthy experiment, and I hope it succeeds. Check it out!
But, if you are building an asynchronous, actor-based system, I really like the idea of having data access or repository actors, much in the same way you would have data access or repository objects in a layered OO architecture.
Actors guarantee that messages are processed one at a time, which is ideal for accessing a single JDBC connection. (One word of caution: most connection pools default to handing out connection-per-thread, which does not play well with actors. Instead you'll need to make sure that you are using a connection-per-actor. The same is true for transaction management.)
This allows you to treat the database like the asynchronous remote system we ought to have been treating it as all along. This also means that results from your data access/repository actors are futures, which are composable. This makes it easier to coordinate data access with other asynchronous activities.
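A minimal sketch of such a repository actor, assuming classic Akka on the JVM (the table, message type, and the idea of handing the actor a dedicated connection are illustrative assumptions):

    import akka.actor.AbstractActor;
    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;

    // Owns one dedicated JDBC connection; messages are processed one at a time,
    // so the connection is never used concurrently.
    class UserRepositoryActor extends AbstractActor {

        private final Connection connection; // connection-per-actor, not connection-per-thread

        UserRepositoryActor(Connection connection) {
            this.connection = connection;
        }

        @Override
        public Receive createReceive() {
            return receiveBuilder()
                    .match(Long.class, userId -> {
                        try (PreparedStatement ps =
                                 connection.prepareStatement("SELECT name FROM users WHERE id = ?")) {
                            ps.setLong(1, userId);
                            try (ResultSet rs = ps.executeQuery()) {
                                getSender().tell(rs.next() ? rs.getString("name") : "<not found>", getSelf());
                            }
                        }
                    })
                    .build();
        }
    }

Callers would use the ask pattern (akka.pattern.Patterns.ask) to get a future back, which they can then compose with other asynchronous work as described above.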
So, is it good? Probably, if it fits within the architecture of the rest of your system. Will it improve capacity? That will depend on your overall system, but it sounds like a very worthy experiment.