Newbie in Scala here.
I'm doing a simple project in Scala with a simple web service, so I don't want to use a full blown db.
My whole application is immutable. But I don't know how to do the "data" part.
My web service has GET and some POST and PUT. So I'd like my data to be:
- stored in memory
- thread safe (so many concurrent PUTs won't mess it up)
- immutable? (I'm new to the concept, but I guess that's not possible)
I thought of an object like:
object UserContainer {
  var users: List[User] = initializeUsers()

  def editUser(...) = ...
  def addUser(...) = ...
}
But it's neither thread safe nor immutable. I found that I can use "Actors" or "synchronized" for this.
How should I approach this?
The primary concurrency abstraction provided by Actors is thread-safe access to mutable state in an asynchronous, non-blocking fashion. That looks like exactly your use case.
For building web services you should look at Spray.io or akka-http, both of which are based on Akka actors. However, creating actors (esp. using Akka) needs an ActorSystem, which some argue is heavyweight. Additionally, actors (as of today) are not typed, so you don't get the type safety of Scala.
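For example, a minimal sketch of how the UserContainer from your question might look as an actor (the User type, the message names, and the system name are just illustrative, not a definitive design):

import akka.actor.{Actor, ActorSystem, Props}

case class User(name: String)
case class AddUser(user: User)
case class EditUser(user: User)
case object GetUsers

class UserContainer extends Actor {
  // Mutable state is confined to the actor; messages are processed one at a
  // time, so concurrent PUTs can't corrupt the list.
  private var users: List[User] = Nil

  def receive: Receive = {
    case AddUser(user)  => users = user :: users
    case EditUser(user) => users = users.map(u => if (u.name == user.name) user else u)
    case GetUsers       => sender() ! users
  }
}

object Main extends App {
  val system = ActorSystem("users")
  val container = system.actorOf(Props[UserContainer], "user-container")
  container ! AddUser(User("Jane"))
}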
I am designing a set of unit and integration tests with a friend of mine, and we ran into a question. We think we know the answer, or at least what is most likely true, but we would like to hear your thoughts.
We are designing a test for MongoDB, and we expect to receive a promise after asking it to save a document. So far so good.
What if we change the database? Can we assume that all databases, when queried, will return a promise?
I guess the _id depends on the database; we are using MongoDB's _id for testing purposes.
We are using the following mock in Jest:
// this method does exist on the service; however, at the moment of testing it is empty, just a placeholder
create: jest.fn().mockImplementation((cat: CreateCatDto) =>
  Promise.resolve({ _id: 'a uuid', ...cat })
)
The idea is to design backends that do not depend on the database, but for testing and development purposes we are using MongoDB and PostgreSQL.
Keep in mind that the term promise doesn't exist as a concept for all databases, so a conclusive answer to your question for all databases is not possible.
That being said, if by promise you mean the primary key or identity (in general database terms) returned after inserting new data, then the answer is no, there is no guarantee, not even on PostgreSQL, that you'll be able to do that. It's possible for tables, even in PostgreSQL, to exist without those constraints.
Otherwise, if by promise, you mean specifically the concept in a procedural or functional language like JavaScript (as your example code in your update indicates), then yes, you should always receive a promise object if your application code is utilizing asynchronous calls appropriately.
But that would be the case regardless of what the asynchronous call was to, whether it's a database directly (and regardless of which database system), an API endpoint, or another piece of application code. Also, in that case, your question (or any follow-up questions) would be better suited for StackOverflow.com.
Can I assume that all databases return a promise?
No. Most if not all database wire protocols are synchronous, meaning the client blocks until it gets a response. Even the databases that expose some sort of RESTful API are synchronous, because HTTP itself is a synchronous request/response protocol.
Some client-side drivers may wrap this synchronous logic and exhibit asynchronous behaviour by returning something like JavaScript Promises or Java Futures, but it is entirely up to the driver implementation you choose to use.
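To illustrate what such a driver-side wrapper does, here is a minimal sketch in Scala (the blockingQuery function is hypothetical, standing in for any synchronous wire-protocol call):

import scala.concurrent.{ExecutionContext, Future}

// A hypothetical blocking driver call: the thread waits for the DB response.
def blockingQuery(sql: String): List[String] = ???

// An "async" driver typically just runs the blocking call on a thread pool
// and hands back a Future (the promise analogue) immediately.
def asyncQuery(sql: String)(implicit ec: ExecutionContext): Future[List[String]] =
  Future(blockingQuery(sql))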
I'm working on a Java/vert.x project where the backend is MongoDB (I worked with Elixir/Erlang for some time, and I'm quite new to vert.x, but I believe it's the best fit). Basically, I have an HTTP API handled by some HttpServerVerticles which need to store data to (or retrieve data from) the Mongo DB and send the appropriate reply to the API caller. I'm looking for the right pattern to implement the queries and the handling of the replies.
From the official guide and some tutorials, I see that for a relational JDBC database, it is necessary to define a dedicated verticle that will handle queries asynchronously. This was my first try with the mongo client but it introduces a lot of boilerplate.
On the other hand, from the Mongo client documentation I read that it's "completely non-blocking" and that it has its own connection pool. Does that mean that we can safely (from the vert.x event loop's point of view) define and use the Mongo client directly in the HTTP verticle?
Is there any alternative pattern?
Versions: vert.x 3.5.4 / MongoDB 4.0.3
It's like this: the Mongo connection pool, exactly like an SQL DB pool, is synchronous and blocking in its nature, but it is wrapped with a non-blocking vert.x API.
So, instead of the normal blocking way of
JsonObject obj = mongo.get(someQuery);
you get a non-blocking call out of the box:
mongo.findOne("collectionName", someQuery, null, res -> { // null = return all fields
    JsonObject obj = res.result();
    doStuff(obj);
});
That means you can safely use it directly on the event loop in any type of verticle without reinventing the asynchronous wheel over and over again.
At our client we use mongodb-driver-rx. Vert.x has support for Rx (vertx-rx-java), and it fits pretty well with mongodb-driver-rx.
For more information see:
https://mongodb.github.io/mongo-java-driver-rx/
https://vertx.io/docs/vertx-rx/java/
https://github.com/vert-x3/vertx-examples/blob/master/rxjava-2-examples/src/main/java/io/vertx/example/reactivex/database/mongo/Client.java
I'm fairly new to both Scala and Akka, and I'm trying to figure out how you would create a proper domain model which is also an Actor.
Let's imagine we have a simple business case where you can open a new Bank Account. Let's say that one of the rules is that you can only create one bank account per last name (not realistic, but just for the sake of simplicity). My first approach, without applying any business rules, would look something like this:
object Main {
  def main(args: Array[String]): Unit = {
    implicit val system = ActorSystem("account")
    implicit val materializer = ActorMaterializer()
    implicit val executionContext = system.dispatcher

    val account = system.actorOf(Props[Account])
    account ! CreateAccount("Doe")
  }
}
case class CreateAccount(lastName: String)
class Account extends Actor {
  var lastName: String = null

  override def receive: Receive = {
    case createAccount: CreateAccount =>
      // must read from the message; `this.lastName = lastName` would assign the field to itself
      this.lastName = createAccount.lastName
  }
}
Eventually you would persist this data somewhere. However, when adding the rule that there can only be one Bank Account per last name, a query to some data storage needs to be done. Let's say we put that logic inside a repository and the repository eventually returns an Account; we then hit the problem that Account isn't an Actor anymore, since the repository won't be able to create Actors.
This is definitely a wrong implementation and not how Actors should be used. My question is: what are the ways to solve these kinds of problems? I am aware that my knowledge of Akka is not at a decent level yet, so this might be a weird or stupidly formulated question.
This might be a long answer and I am sorry there isn't a TLDR version. :)
Ok, so you want to "Actorize" your domain model? Bad idea. Domain models are not necessarily actors; sometimes they are, but often they are not. It would be an anti-pattern to deploy one actor per domain model, because you would simply be trading method calls for message passing while losing the single-threaded guarantees of method calls. You cannot guarantee the timing of the messages hitting your actor, and programming based upon ask patterns is a good way to build a system that is not scalable: eventually you have too many threads and too many futures and can't proceed further, and the system bogs down and chokes. So what does that mean for your particular problem?
First you have to stop thinking of the domain model as a single thing, and definitely stop using POJO entities. I entirely agree with Martin Fowler when he discusses the anemic domain model. In a well-built actor system there will often be three domain models. One is the persisted model, which has entities that model your database. The second is the immutable model. This is the model that the actors use to communicate with each other. All the entities are immutable from the bottom up, all collections unmodifiable, all objects only have getters, and all constructors copy the collections to new immutable collections. The immutable model means your actors never have to copy anything; they just pass around references to data. Lastly you will have the API model; this is usually the set of entities that model the JSON for the clients to consume. The API model is there to insulate the back end from client code changes and vice versa; it's the contract between the systems.
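As a rough illustration of that three-way split (all names here are hypothetical, not taken from your code):

// Persisted model: mirrors the database table; mutable for the ORM's sake.
class AccountRow {
  var id: Long = 0L
  var lastName: String = ""
}

// Immutable model: what actors pass around; safe to share by reference.
final case class Account(id: Long, lastName: String)

// API model: the JSON contract with clients, insulated from the other two.
final case class AccountDto(id: String, lastName: String)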
To create your actors, stop thinking about your persistent model and what you will do with it; instead, start thinking of the use cases. What does your system have to do? Model your actors based on the use cases, and that will change the implementation of the actors and their deployment strategies.
For example, consider a server that delivers inventory information to users, including current stock levels, reviews by users, and so on, for products by a single vendor. The users hammer this information and it changes quickly as stock levels change. This information is likely stored in half a dozen different tables. We don't model an actor for each table but rather a single actor to serve this use case. In this case the information is accessed by a large group of people in a heavy-load environment. So we are best off creating an actor to aggregate all of this data, replicating that actor to each node, and informing all replicants on all nodes whenever the data changes. This means the user getting the overview doesn't even touch the database. They hit the actors, get the immutable model, convert that to the API model and then return the data.
On the other hand, if a user wants to change the stock levels, we need to make sure that two users don't do it concurrently, yet large DB transactions slow down the system massively. So instead we pick one node that will hold the stock management actor for that vendor and we cluster-shard the actor. Any requests are routed to that actor and handled serially. The company user logs in and notes the receipt of a delivery of 20 new items. The message goes from whatever node they hit to the node holding the actor for that vendor; the actor then makes the appropriate database changes and broadcasts the change, which is picked up by all the replicated inventory view actors to update their data.
Now this is simplistic, because you have to deal with lost messages (read the articles on why reliable messaging is not necessary). However, once you start down that road you soon realize that simply making your domain model an actor system is an anti-pattern and that there are better ways to do things.
Anyway that is my 2 cents :)
General Design
Actors should generally be simple dispatchers to business logic and contain as little functionality as possible. Think of Actors as similar to a Future: when you want concurrency in Scala you don't extend the Future class, you just use Future functionality around your existing logic.
Limiting your Actors to bare-bones responsibility has several advantages:
Testing the code can be done without having to construct ActorSystems, probes, ActorRefs, etc...
The business logic can easily be transplanted to other asynchronous libraries, e.g. Futures and akka streams.
It's easier to create a "proper domain model" with plain old classes and functions than it is with Actors.
Placing business logic in Actors naturally emphasizes a more object-oriented code/system design rather than a functional approach (we picked Scala for a reason).
Business Logic (No Akka)
Here we will set up all of the domain-specific logic without using any akka-related "stuff".
import scala.collection.immutable.HashMap

object BusinessLogicDomain {

  type FirstName = String
  type LastName  = String
  type Balance   = Double

  val defaultBalance: Balance = 0.0

  case class Account(firstName: FirstName,
                     lastName: LastName,
                     balance: Balance = defaultBalance)
Let's model your account directory as a HashMap:
  type AccountDirectory = HashMap[LastName, Account]

  val emptyDirectory: AccountDirectory = HashMap.empty[LastName, Account]
We can now create a function that matches your requirement of a distinct account per last name:
  val addAccount: (AccountDirectory, Account) => AccountDirectory =
    (accountDirectory, account) =>
      if (accountDirectory contains account.lastName)
        accountDirectory
      else
        accountDirectory + (account.lastName -> account)

} // end object BusinessLogicDomain
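A quick illustration of the one-account-per-last-name rule in action (the names are hypothetical):

import BusinessLogicDomain._

val dir1 = addAccount(emptyDirectory, Account("John", "Doe"))
// the second "Doe" is rejected; the directory is returned unchanged
val dir2 = addAccount(dir1, Account("Jane", "Doe"))
assert(dir2 == dir1)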
Repository (Akka)
Now that the unpolluted business code is complete and isolated, we can add the concurrency layer on top of the foundational logic.
We can use the become functionality of Actors to store the state and respond to requests:
import akka.actor.Actor
import BusinessLogicDomain.{Account, AccountDirectory, addAccount, emptyDirectory}

case object QueryAccountDirectory

class RepoActor(accountDirectory: AccountDirectory = emptyDirectory) extends Actor {

  val statefulReceive: AccountDirectory => Receive =
    currentDirectory => {
      case account: Account =>
        // swap in a new Receive that closes over the updated immutable directory
        context become statefulReceive(addAccount(currentDirectory, account))
      case QueryAccountDirectory =>
        sender() ! currentDirectory
    }

  override def receive: Receive = statefulReceive(accountDirectory)
}
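And a quick usage sketch (the actor system name and the ask timeout are illustrative, not prescriptive):

import akka.actor.{ActorSystem, Props}
import akka.pattern.ask
import akka.util.Timeout
import scala.concurrent.duration._
import BusinessLogicDomain.{Account, AccountDirectory}

implicit val system: ActorSystem = ActorSystem("bank")
implicit val timeout: Timeout = Timeout(3.seconds)

val repo = system.actorOf(Props(new RepoActor()), "account-repo")

repo ! Account("John", "Doe")
repo ! Account("Jane", "Doe") // ignored: "Doe" already has an account

val directory = (repo ? QueryAccountDirectory).mapTo[AccountDirectory]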
When it comes to creating a REST web service with 60+ APIs on akka-http, how can I choose whether I should go with akka-streams or akka actors?
In his post, Jos shows two ways to create an API on akka-http, but he doesn't show when to select one over the other.
This is a difficult question. Obviously, both approaches work. So to a certain degree it is a matter of taste/familiarity. So everything following now is just my personal opinion.
When possible, I prefer using akka-stream due to its higher-level nature and type safety. But whether this is a viable approach depends very much on the task of the REST API.
Akka-stream
If your REST API is a service that e.g. answers questions based on external data (e.g. a currency exchange rate API), it is preferable to implement it using akka-stream.
Another example where akka-stream would be preferable would be some kind of database frontend where the task of the REST API is to parse query parameters, translate them into a DB query, execute the query and translate the result according to the content-type requested by the user. In both cases, the data flow maps easily to akka-stream primitives.
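As a rough sketch of how such a DB-frontend flow maps onto stream primitives (all types and helper functions here are hypothetical):

import akka.NotUsed
import akka.stream.scaladsl.Flow
import scala.concurrent.Future

case class QueryParams(raw: Map[String, String])
case class DbQuery(statement: String)
case class DbRow(columns: Map[String, String])

def toDbQuery(p: QueryParams): DbQuery = ???        // parse and translate params
def runQuery(q: DbQuery): Future[List[DbRow]] = ??? // assumed async DB driver
def render(rows: List[DbRow]): String = ???         // format per requested content-type

// the whole request pipeline, expressed as a reusable stream stage
val pipeline: Flow[QueryParams, String, NotUsed] =
  Flow[QueryParams]
    .map(toDbQuery)
    .mapAsync(parallelism = 4)(runQuery)
    .map(render)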
Actors
An example where using actors would be preferable might be if your API allows querying and updating a number of persistent actors on a cluster. In that case either a pure actor-based solution or a mixed solution (parsing query parameters and translating results using akka-stream, do the rest using actors) might be preferable.
Another example where an actor-based solution might be preferable would be if you have a REST API for long-running requests (e.g. websockets), and want to deploy the processing pipeline of the REST API itself on a cluster. I don't think something like this is currently possible at all using akka-stream.
Summary
So to summarize: look at the data flow of each API and see if it maps cleanly to the primitives offered by akka-stream. If this is the case, implement it using akka-stream. Otherwise, implement using actors or a mixed solution.
Don't Forget Futures!
One addendum I would make to Rudiger Klaehn's fine answer is to also consider the use case of a Future. The composability of Futures and resource management of ExecutionContext make Futures ideal for many, if not most, situations.
There is an excellent blog post describing when Futures are a better choice than Actors. Further, the back-pressure provided by Streams comes with some pretty hefty overhead.
Just because you're down the rabbit hole using akka-http does not mean all concurrency within your request handler has to be confined to Actors or Streams.
Route
Route inherently accommodates Futures in the type definition:
type Route = (RequestContext) ⇒ Future[RouteResult]
Therefore you can bake a Future directly into your Route using only functions and Futures, no Directives:
val requestHandler: RequestContext => HttpResponse = ???

// an implicit ExecutionContext must be in scope for Future and map
val route: Route =
  requestContext => Future(requestHandler(requestContext)) map RouteResult.Complete
onComplete Directive
The onComplete Directive allows you to "unwrap" a Future within your Route:
val route =
  get {
    val future: Future[HttpResponse] = ???

    onComplete(future) {
      case Success(httpResponse) => complete(httpResponse)
      case Failure(exception)    => complete(InternalServerError -> exception.toString)
    }
  }
I'm working on a web application written in Scala, using the Play! framework and Akka. The code is organized basically like this: Play controllers send messages to Akka actors. The actors, in turn, talk to a persistence layer that abstracts database access. A typical example of usage of these components in the application:
class OrderController(orderActor: ActorRef) extends Controller {
  def showOrders(customerId: Long) = Action {
    implicit request => Async {
      val futureOrders = orderActor ? FindOrdersByCustomerId(customerId)
      // Handle the result, showing the orders list to the user or showing an error message.
    }
  }
}
class OrderActor extends Actor { // a class rather than an object, so Props can instantiate it
  def receive = {
    case FindOrdersByCustomerId(id) =>
      sender ! OrderRepository.findByCustomerId(id)
    case InsertOrder(order) =>
      sender ! OrderRepository.insert(order)
      // Trigger some notification, like sending an email. Maybe calling another actor.
  }
}
object OrderRepository {
  def findByCustomerId(id: Long): Try[List[Order]] = ???
  def insert(order: Order): Try[Long] = ???
}
As you can see, this is the basic CRUD pattern, much like what you would see in other languages and frameworks. A query gets passed down to the layers below and, when the application gets a result from the database, that result comes back up until it reaches the UI. The only relevant difference is the use of actors and asynchronous calls.
Now, I'm very new to the concept of actors, so I don't quite get it yet. But, from what I've read, this is not how actors are supposed to be used. Observe, though, that in some cases (e.g. sending an email when an order is inserted) we do need true asynchronous message passing.
So, my question is: is it a good idea to use actors in this way? What are the alternatives for writing CRUD applications in Scala, taking advantage of Futures and the other concurrency capabilities of Akka?
Actor-based concurrency doesn't fit transactional operations out of the box, but that doesn't stop you from using actors that way, as long as you play nicely with the persistence layer. If you can guarantee that the insert (write) is atomic, then you can safely have a pool of actors doing it for you. Databases normally have thread-safe reads, so find should also work as expected. Apart from that, if the insert is not thread-safe, you can have one single WriteActor dedicated simply to write operations, and the sequential processing of messages will ensure atomicity for you.
One thing to be aware of is that an actor processes one message at a time, which would be rather limiting in this case. You can use a pool of actors by using routers.
Your example defines a blocking API for the repository, which might be the only thing you can do, depending on your database driver. If possible you should go for an async API there as well, i.e. returning Futures. In the actor you would then pipe the result of the Future to the sender instead.
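For example, a minimal sketch of that piping, assuming a hypothetical async variant of the repository (the case classes mirror the ones used in the question):

import akka.actor.Actor
import akka.pattern.pipe
import scala.concurrent.Future

case class Order(id: Long)
case class FindOrdersByCustomerId(id: Long)

// hypothetical async repository, returning Futures instead of blocking Try values
object AsyncOrderRepository {
  def findByCustomerId(id: Long): Future[List[Order]] = ???
}

class AsyncOrderActor extends Actor {
  import context.dispatcher // ExecutionContext used by pipeTo

  def receive = {
    case FindOrdersByCustomerId(id) =>
      // no blocking in the actor: the eventual result is sent to the asker
      AsyncOrderRepository.findByCustomerId(id).pipeTo(sender())
  }
}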