How To call a rest API inside an Apache Flink program - rest

I m wondering if there is a solution to call a
REST API (or multiple REST APIs) inside a flink program directly or not ? if such solution is exist.
Do you think it is better to push my processed data from flink to a message broker like kafka or something at first and then from kafka call REST APIs?
or I can Call REST APIs directly from my flink program as well?

The code in your user functions (e.g. a RichFlatMapFunction or a KeyedProcessFunction) can do anything you want, including making REST calls to external services. However, you should avoid doing blocking i/o in your user functions, because checkpoint barriers can't progress through an operator while it is blocked in the user function.
A good way to approach this then is to use Flink's Async I/O API in combination with an HTTP library that offers an asynchronous client interface.

Related

Should one service take care of both processing Kafka messages and API calls simultaneously?

We want to subscribe to a Kafka topic with a microservice. So that the service does not have to accept API calls from the surrounding systems and Kafka messages at the same time, we would like to interpose an 'importer service'. However, in the end we would have the same problem again because the importer service now has to process both the Kafka messages and the API calls from the aforementioned microservice. As a solution to this problem, we considered giving both services access to the same database. The importer service could then receive the Kafka message, process it and write it to the database. The original microservice would then not go to the importer service, but would get the data directly from the DB. However, the approach seems a bit dirty, since you shouldn't share databases between services. Do you have any ideas how to solve this more elegantly? And if there isn't a better approach, should one service really take care of processing Kafka messages and API calls simultaneously?
One service should not process both kafka and api messages.
You can make a service that wraps the database and both services will communicate with it.

Event-based-microservices: MQTT with Broker OR HTTP with API GATEWAY?

I'm trying to develop a project with microservices.
I have some questions on this topic (something is not clear):
1) How to implement microservices communication?
A) HTTP : Every microservice expose HTTP API , an API GATEWAY broadcast requests.
B) MQTT : every microservice pub/sub to a broker
C) BOTH : but how to understand when one is better than the other ?
Have I to use pub/sub protocol as a standard even for classic operations usually performed over HTTP ? For example I have two microservices:
web-management and product-service. web-management is a panel that lets the administrator to add, modify, ... products in its ecommerce digital shop. Let's say we want to implement createProduct operation. It's a command (according to event /command distinction), a one-to-one communication.
I can open an API in product-service, let's say (POST, "/product") that add the new product. I also can implement this transforming the command in a productCreationRequest event. In this case: web-managemnet publish this event. product-service listen to productCreationRequest events (and also productUpdateRequest, productGetEvents, ...) once it is notified it performs the operation and emits productCreated event.
I find this case borderline. For example a last-occasion-service may listen to productCreated and immediately send a message (email or push notification) to customers. What do you think about this use case?
2) Which may be a valid broker (I will use docker-compose or kubernetes to orchestrate containerized microservices: language adopted probably java, javascript, python)?
Both is definitely a possibility! Choose a broker that allows you to easily mix-and-match between HTTP (synchronous) communication, and more async event-driven pub/sub. It should allow you to migrate your microservices between the two options as required.
HTTP APIs are great at the edge of your distributed application, where a customer wants to submit an order or something, and block waiting for a response (200 OK).
But internally within your application between microservices, a lot of them don't need a response... async, eventually consistent. And using pub/sub (like MQTT) allows for multiple downstream consumers easily. Another great use for MQTT is streaming updates to downstream consumers... like a data-feed from a bus or airline company or something, rather than having to poll a REST API for updates.
For your use-case and similar ones, I would almost always recommend using pub/sub communication, even if today it's a simple request-reply interaction with a single backend process. REST over HTTP is point-to-point, and perhaps in the future you want another process to be able to see/consume/monitor that event or interaction. If you're already using publish-subscribe, adding that 2nd (or more) consumer of that data flow is trivial. Harder with REST/HTTP.
In terms of performance, I would highly doubt a blocking protocol like HTTP is going to outperform something that is asynchronous and bidirectional, like MQTT which uses WebSockets for web communication.
As for a broker to glue all this together, check out the standard edition Solace PubSub+ event broker... can do both (and translate between) MQTT and HTTP. I even wrote a CodeLab for this (almost) exact use case haha!
(BTW, I work for Solace! FYI.)
Consider using SMF framework for Javascript/Node.js, it helps prototype pub/sub communications via a message broker (RabbitMQ) between microservices out of the box:
https://medium.com/#krawa76/bootstrap-node-js-microservice-stack-4a348db38e51
As for the message broker routes, use an event-driven naming convention, e.g. post a "web.new-product", where "web" is the sub-system name, "new-product" - event name.

Microservices: API Call Vs Messaging. When to Use?

I know that messaging system is non blocking and scalable and should be used in microservices environment.
The use case that i am questioning is:
Imagine that there's an admin dashboard client responsible for sending API request to create an Item object. There is a microservice that provides API endpoint which uses a MySQL database where the Item should be stored. There is another microservice which uses elastic search for text searching purposes.
Should this admin dashboard client :
A. Send 2 API Calls; 1 Call to MySQL service and another elasticsearch service
or
B. Send message to topic to be consumed by both MySQL service and elasticsearch service?
What are the pros and cons when considering A or B?
I'm thinking that it's a little overkill when only 2 microservices are consuming this topic. Also, the frequency of which the admin is creating Item object is very small.
Like many things in software architecture, it depends. Your requirements, SLAs and business needs should make it clearer.
As you noted, messaging system is not blocking and much more scalable, but, API communication got it pluses as well.
In general, REST APIs are best suited to request/response interactions where the client application sends a request to the API backend over HTTP.
Message streaming is best suited for notifications when new data or events occur that you may want to take action upon.
In you specific case, I would go with a messaging system with is much more scalable and non-blocking.
Your A approach is coupling the "routing" logic into your application. Pretend you need to perform an API call to audit your requests, then you will need to change the code and add another call to your application logic. As you said, the approach is synchronous and unless you're not providing threading logic, your calls will be lined up and won't scale, ie, call mysql --> wait response, then call elastic search --> wait response, ...
In any case you can prefer this approach if you need immediate consistency, ie, the result call of one action feeding the second action.
The B approach is decoupling that routing logic, so, any other service interested in the event can subscribe to the topic and perform the action expected. Totally asynchronous and scalable. Here you will have eventual consistency and you have to recover any possible failure.

Throttle API calls to external service using Scala

I have a service exposing a REST endpoint that, after a couple of transformations, calls a third-party service also via its REST endpoint.
I would like to implement some sort of throttling on my service to avoid being throttled by this third-party service. Note that my service's endpoint accepts only one request and not a list of them. I'm using Play and we also have Akka Streams as dependency.
My first though was to have my service saving the requests into a database table and then have an Akka Streams Source, leveraging the throttle function, picking tasks, applying the transformations and then calling the external service.
Is this a reasanoble approach or does it have any severe drawbacks?
Thanks!
Why save the requests to the database? Does the queue need to survive restarts and/or do you run a load-balanced setup that needs to somehow synchronize the requests?
If you don't need the above I'd think using only Source.queue to store the task data would work just as well?
And maybe you already thought of this: If you want to make your endpoint more resilient you should allow your API to send a 'sorry, busy' response and drop the request instead of queuing it if your queue grows beyond a certain size.

Queued messages + API endpoint

We have developed modular web app with very powerful API and now we need queuing tool for delayed|time consuming jobs. We are looking at RabbitMQ or AWS SQS. But these two just store messages, and you have to manually get messages from them or I misunderstood it?
We would like to channel all messages through our API, so when message is published to Queue in should be POST-ed (after some delay) to to our Interface.
So my question:
Is there any tool for queuing that support http post (with oauth2)?
If not, is this approach somehow valid:
Create worker that poll messages from queue
and POST them to API with some client?
(we have to maintain cli tool, and we want to avoid that).
Are there any alternatives?
When using SQS polling is the only way out.
To make things easier you can write this polling logic in AWS Lambda because lambda functions do not have the overhead of maintaining infrastructure and servers