When it comes to creating a REST web service with 60+ APIs on akka-http, how can I choose whether I should go with akka streams or akka actors?
In his post, Jos shows two ways to create an API on akka-http, but he doesn't show when I should select one over the other.
This is a difficult question. Obviously, both approaches work, so to a certain degree it is a matter of taste and familiarity. Everything that follows is just my personal opinion.
When possible, I prefer using akka-stream due to its more high-level nature and type safety. But whether this is a viable approach depends very much on the task of the REST API.
Akka-stream
If your REST API is a service that e.g. answers questions based on external data (e.g. a currency exchange rate API), it is preferable to implement it using akka-stream.
Another example where akka-stream would be preferable would be some kind of database frontend where the task of the REST API is to parse query parameters, translate them into a DB query, execute the query and translate the result according to the content-type requested by the user. In both cases, the data flow maps easily to akka-stream primitives.
Actors
An example where using actors would be preferable might be if your API allows querying and updating a number of persistent actors on a cluster. In that case either a pure actor-based solution or a mixed solution (parsing query parameters and translating results using akka-stream, do the rest using actors) might be preferable.
Another example where an actor-based solution might be preferable would be if you have a REST API for long-running requests (e.g. websockets), and want to deploy the processing pipeline of the REST API itself on a cluster. I don't think something like this is currently possible at all using akka-stream.
Summary
So to summarize: look at the data flow of each API and see if it maps cleanly to the primitives offered by akka-stream. If this is the case, implement it using akka-stream. Otherwise, implement using actors or a mixed solution.
Don't Forget Futures!
One addendum I would make to Rudiger Klaehn's fine answer is to also consider the use case of a Future. The composability of Futures and resource management of ExecutionContext make Futures ideal for many, if not most, situations.
There is an excellent blog post describing when Futures are a better choice than Actors. Further, the back-pressure provided by Streams comes with some pretty hefty overhead.
Just because you're down the rabbit hole using akka-http does not mean all concurrency within your request handler has to be confined to Actors or Streams.
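To illustrate that composability, here is a minimal, self-contained sketch using only the Scala standard library; fetchUser and fetchOrders are hypothetical stand-ins for real asynchronous I/O calls, not anything from the answers above:

import scala.concurrent.{Await, Future}
import scala.concurrent.duration._
import scala.concurrent.ExecutionContext.Implicits.global

// Hypothetical async lookups; stand-ins for real non-blocking I/O
def fetchUser(id: Long): Future[String] = Future(s"user-$id")
def fetchOrders(user: String): Future[List[String]] =
  Future(List(s"$user:order-1", s"$user:order-2"))

// Futures compose with a for-comprehension; no actor or stream machinery needed
val summary: Future[String] =
  for {
    user   <- fetchUser(42L)
    orders <- fetchOrders(user)
  } yield s"$user has ${orders.size} orders"

// Blocking here only to demonstrate the result
println(Await.result(summary, 2.seconds))

The ExecutionContext handles thread pooling for you, which is the resource-management point made above.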
Route
Route inherently accommodates Futures in the type definition:
type Route = (RequestContext) ⇒ Future[RouteResult]
Therefore you can bake a Future directly into your Route using only functions and Futures, no Directives:
val requestHandler : RequestContext => HttpResponse = ???
val route : Route =
  (requestContext) => Future(requestHandler(requestContext)) map RouteResult.Complete
onComplete Directive
The onComplete Directive allows you to "unwrap" a Future within your Route:
val route =
  get {
    val future : Future[HttpResponse] = ???

    onComplete(future) {
      case Success(httpResponse) => complete(httpResponse)
      case Failure(exception)    => complete(InternalServerError -> exception.toString)
    }
  }
Related
Is it possible to use Wiremock to mock a reactive backend? What I want to do is make Wiremock return chunked responses, where each chunk is a valid JSON string (something that mimics a Reactor Flux-type response).
The scenario is something like this: I have a backend sending a stream of JSON objects that I can consume. Each JSON string can be marshalled into a POJO without the need to keep track of state (the chunk that came before). Each chunk that comes over the wire can have a different length.
Any ideas on how I can mock such a backend?
Most, if not all, API mocking, stubbing, faking, and replacing libraries (there is a wide variety of names, but it is fair to refer to all of these as API stubbing), such as Wiremock, do not support response payload chunking.
You are then left with two options:
A custom hand-made implementation, where you provide chunking based on the used library and its semantics
A simple test-scoped Controller stereotype that returns a reactive type (Flux) for your endpoint. You then let the underlying framework (Spring WebFlux, for example) handle the response streaming for you (the cleanest option, in my opinion)
That being said, you should be good to go with a mock API that returns an iterable type, which will get mapped automatically by the client to its reactive counterpart, Flux, when called. The mapping and request/response handling are low-level details: it is the responsibility of the underlying framework to map your input and output accordingly, and you should not have to care how the endpoint is implemented, since your client should work the same way in all cases. Ensuring interoperability is the framework's job, after all, not the application developer's.
I'm new to Azure Databricks and Scala. I'm trying to consume an HTTP REST API that returns JSON. I went through the Databricks docs, but I don't see any data source that would work with a REST API. Is there any library or tutorial on how to work with REST APIs in Databricks? If I make multiple API calls (because of pagination), it would be nice to get it done in a parallel way (the Spark way).
I would be glad if you guys could point me to a Databricks or Spark way to consume a REST API, as I was shocked that there's no information in the docs about an API data source.
Here is a simple implementation.
The basic idea is that spark.read.json can read an RDD.
So, just create an RDD from the GET call and then read it as a regular DataFrame.
%spark
// simple blocking GET helper
def get(url: String) = scala.io.Source.fromURL(url).mkString

val myUrl = "https://<abc>/api/v1/<xyz>"
val result = get(myUrl)

// strip the trailing line end, wrap the single JSON string in an RDD,
// and read it as a regular DataFrame
val jsonResponseStrip = result.stripLineEnd
val jsonRdd = sc.parallelize(jsonResponseStrip :: Nil)
val jsonDf = spark.read.json(jsonRdd)
That's it.
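For the pagination part of the question, one possible approach (a sketch, not a Databricks API) is to fire the page requests concurrently with Future.traverse and then hand the collected JSON strings to spark.read.json. The base URL, the ?page= parameter, and the stubbed get helper below are all assumptions for illustration:

import scala.concurrent.{Await, Future}
import scala.concurrent.duration._
import scala.concurrent.ExecutionContext.Implicits.global

// Stand-in for the `get` helper above; in a notebook this would be
// scala.io.Source.fromURL(url).mkString
def get(url: String): String = s"""{"page": "$url"}"""

// Hypothetical paged endpoint; `?page=` is an assumed parameter name
val baseUrl = "https://example.com/api/v1/items"
val pages   = (1 to 4).toList

// Fire all page requests concurrently; traverse preserves page order
val responses: List[String] = Await.result(
  Future.traverse(pages)(p => Future(get(s"$baseUrl?page=$p"))),
  30.seconds)

// In Databricks you could then read the collected strings as a DataFrame:
// val df = spark.read.json(sc.parallelize(responses))
println(responses.size)

This keeps the HTTP calls parallel without involving Spark executors; alternatively, you could parallelize the page URLs themselves into an RDD and perform the GETs on the executors.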
It sounds to me like what you want is to import into Scala a library for making HTTP requests. I suggest HTTP instead of a higher-level REST interface, because pagination may be handled inside the REST library, which may or may not support parallelism.
Managing with the lower-level HTTP lets you decouple pagination. Then you can use the parallelism mechanism of your choice.
There are a number of libraries out there, but recommending a specific one is out of scope.
If you do not want to import a library, you could have your scala notebook call upon another notebook running a language which has HTTP included in the standard library. This notebook would then return the data to your scala notebook.
Since REST is an architectural style, not a protocol, it can be applied to almost any protocol form - such as websockets. That's exactly what I'd like to do, but I'd like help deciding on an approach.
As I do my research, I'm finding that there are three paradigms I could follow:
Simulated Request-Response. This is what is implemented in the SwaggerSocket library. Each client request has an ID. Each server-push response has the same ID, to allow correlation of request-response.
Notification-Only. The server pushes a resource address, implying that the client should perform a GET on that resource to discover what has changed.
Event Driven. The server-push is designed to look like an HTTP POST, perhaps similar to a webhook request.
I'd like to hear from those who have experience walking these paths, about which they found to be the most effective, and which tools they applied (such as SwaggerSocket mentioned above).
A major concern I have is simplified demuxing and de-serialization. For example, suppose I have a client written in Typescript. I might like to deserialize server-push payload into a declared, typed object. I think the consequences for each paradigm are as follows:
Simulated Request-Response (SRR). The SwaggerSocket message must be de-serialized twice. First to discover the response ID and/or "path", and second to retrieve the actual payload into a "typed" object. A little clumsy, but doable.
Notification-Only. The server-push message can be deserialized into a single pre-defined type, since it contains little more than a REST resource path.
Event Driven. If the "event" has no payload, then this is basically the same thing as the Notification-Only approach. But if there is a payload, then once again a two-step deserialization would likely be necessary.
Other thoughts I have: The SRR might be the most limiting of the three, because every server-push theoretically is instigated by a client request. The other two paradigms don't have that implicit model. The Event Driven approach has the conceptual advantage of being similar-ish to a webhook callback.
To illustrate the Event Driven / webhook idea, I'll give a SignalR example.
//client side.
hubProxy.On<HttpRequest>("EventNotice", request => {
    //Take apart the HttpRequest and dispatch through
    //my own routing mechanism...to handlers that
    //further deserialize the inner payload.
});
//The corresponding server-push would of course be:
_context.Clients.All.EventNotice(myHttpRequest);
The above example is very achievable. In fact, I would not be surprised somebody has example code (please share!) or even a supporting library for this purpose.
Again, of these different paradigms, which would you advise? What supporting tools would you suggest?
Newbie in Scala here.
I'm doing a simple project in Scala with a simple web service, so I don't want to use a full blown db.
All my application is immutable. But I don't know how to do the "data" part.
My web service has GET and some POST and PUT. So I'd like my data to be:
- stored in memory
- thread safe (so many concurrent PUTs won't mess it up)
- immutable? (I'm new to the concept, but I guess that's not possible)
I thought of an object like:
object UserContainer {
  var users: List[User] = initializeUsers()

  def editUser(...) = ...
  def addUser(...) = ...
}
But it's not thread safe, nor immutable. I found that I can use Actors or synchronized for this.
How should I approach this?
The primary concurrency abstraction provided by actors is threadsafe access to mutable state in an asynchronous and non-blocking fashion. That looks like your use case.
For building web services you should look at Spray.io or akka-http, both of which are based on Akka actors. However, creating actors (especially using Akka) needs an ActorSystem, which some argue is heavyweight. Additionally, actors (as of today) are not typed, so you don't get the type safety of Scala.
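If an ActorSystem feels like too much for a small service, one lighter sketch (my own suggestion, not from the libraries mentioned) is to keep an immutable List behind a java.util.concurrent.atomic.AtomicReference, which gives lock-free, thread-safe updates:

import java.util.concurrent.atomic.AtomicReference

case class User(id: Long, name: String)

// An immutable List swapped atomically: readers always see a consistent
// snapshot, and concurrent writers retry via compare-and-set internally
object UserContainer {
  private val users = new AtomicReference[List[User]](Nil)

  def all: List[User] = users.get()

  def addUser(u: User): Unit =
    users.updateAndGet(current => u :: current)

  def editUser(id: Long, name: String): Unit =
    users.updateAndGet(_.map(u => if (u.id == id) u.copy(name = name) else u))
}

UserContainer.addUser(User(1, "alice"))
UserContainer.editUser(1, "alicia")
println(UserContainer.all)

The data itself stays immutable; only the reference to the current snapshot is mutable, which is as close to "immutable but updatable" as in-memory state gets.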
I'm working on a web application written in Scala, using the Play! framework and Akka. The code is organized basically like this: Play controllers send messages to Akka actors. The actors, in turn, talk to a persistence layer, that abstracts database access. A typical example of usage of these components in the application:
class OrderController(orderActor: ActorRef) extends Controller {
  def showOrders(customerId: Long) = {
    implicit request => Async {
      val futureOrders = orderActor ? FindOrdersByCustomerId(customerId)
      // Handle the result, showing the orders list to the user or showing an error message.
    }
  }
}

object OrderActor extends Actor {
  def receive = {
    case FindOrdersByCustomerId(id) =>
      sender ! OrderRepository.findByCustomerId(id)
    case InsertOrder(order) =>
      sender ! OrderRepository.insert(order)
      // Trigger some notification, like sending an email. Maybe calling another actor.
  }
}

object OrderRepository {
  def findByCustomerId(id: Long): Try[List[Order]] = ???
  def insert(order: Order): Try[Long] = ???
}
As you can see, this is the basic CRUD pattern, much like what you would see in other languages and frameworks. A query gets passed down to the layers below and, when the application gets a result from the database, that result comes back up until it reaches the UI. The only relevant difference is the use of actors and asynchronous calls.
Now, I'm very new to the concept of actors, so I don't quite get it yet. But, from what I've read, this is not how actors are supposed to be used. Observe, though, that in some cases (e.g. sending an email when an order is inserted) we do need true asynchronous message passing.
So, my question is: is it a good idea to use actors in this way? What are the alternatives for writing CRUD applications in Scala, taking advantage of Futures and the other concurrency capabilities of Akka?
Actor-based concurrency doesn't fit transactional operations out of the box, but that doesn't stop you from using actors this way if you play nicely with the persistence layer. If you can guarantee that the insert (write) is atomic, then you can safely have a pool of actors doing it for you. Databases normally have thread-safe reads, so find should also work as expected. Beyond that, if the insert is not threadsafe, you can have one single WriteActor dedicated to write operations, and the sequential processing of messages will ensure atomicity for you.
One thing to be aware of is that an actor processes one message at a time, which would be rather limiting in this case. You can use a pool of actors by using routers.
Your example defines a blocking API for the repository, which might be the only thing you can do, depending on your database driver. If possible you should go for an async API there as well, i.e. returning Futures. In the actor you would then instead pipe the result of the Future to the sender.
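A minimal sketch of what an async repository could look like; AsyncOrderRepository and its stubbed driver call are hypothetical, and the pipeTo usage is shown only in a comment since it needs an Akka dependency:

import scala.concurrent.{Await, Future}
import scala.concurrent.duration._
import scala.concurrent.ExecutionContext.Implicits.global

case class Order(id: Long, customerId: Long)

// Hypothetical async repository: each method returns a Future instead of
// a blocking Try, so the calling actor never blocks a thread on the DB
object AsyncOrderRepository {
  def findByCustomerId(id: Long): Future[List[Order]] =
    Future(List(Order(1L, id)))  // stand-in for a non-blocking driver call
}

// Inside the actor you would then write, e.g.:
//   import akka.pattern.pipe
//   AsyncOrderRepository.findByCustomerId(id).pipeTo(sender())
val orders = Await.result(AsyncOrderRepository.findByCustomerId(42L), 2.seconds)
println(orders)

Piping the Future avoids the trap of calling sender() inside a Future callback, where it may no longer refer to the original sender.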