Akka Streams Websocket Wiring - scala

I'm trying to figure out the best way to implement a real websocket app using akka-http and akka-streams. What I'm mostly looking for is simplicity, which I'm just not getting now.
Assume you have a fairly complex pipeline which needs to discriminate between multiple requests and sometimes send the request to an actor for processing, sometimes issue a mongo query and return the response, sometimes perform a PUT on a REST API, etc.
Unlike the simple chat application examples out there, there are at least 3 problems that arise which seem to not have a standard solution:
Conditionally skipping the response, e.g., because it is not expected by the client that this request will receive a response. If I use the typical Flow from Message to Message, once the request has hit its target, I need to stop it from propagating further back to the websocket. It can be done with a special filter (involves some pain) or using various other ways (e.g., Conditionally skip flow using akka streams), but this adds a lot of boilerplate and complexity. Ideally, I'd like to be able to insert 'Skip' messages that just skip everything else.
Routing incoming messages to the appropriate place (e.g., actor, mongo). Once again, I can find solutions to that which involve a lot of boilerplate (e.g., broadcast and filter out at branches which do not handle this kind of request). Ideally, I should be able to define something like: if the message is X, send it there, if the message is Y, send it there, etc.
Propagating errors back to the client. Very similar to the routing problem described above. For example, if the JSON parse fails, I need to add a separate path (broadcast + merge) along which I send an error message, but I cannot even easily reuse the same path if an error occurs at the next stage and I want to propagate that error to the user. Ideally, I should have one single separate path for error handling that can be used at any arbitrary point in the flow, bypasses the rest of the flow entirely and goes back to the client.
At the moment, I have this insanely complex graph spanning 15 lines with paths going through >20 different stages and I'm really worried about keeping the complexity of this solution in check. The DSL is mostly unreadable at this size. I could of course modularize a bit better, but this feels like an insane amount of trouble for something that should be a lot simpler.
Am I missing something? Am I insane for considering akka-streams for such a task? Any ideas or code examples that could allow me to rein in all that complexity?
Thanks in advance!

This is a very wide-ranging question and may not be answerable in its current form.
Akka HTTP addresses many of these concerns in its HTTP handling layers (e.g. empty responses, routing, returning errors). Could you use some of the lessons learnt there and apply them to your system? Or, perhaps better, could you convert your system from using websocket communication into using HTTP communication and use that code directly?
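One way to read the suggestion above is to treat every incoming frame the way an HTTP route treats a request: a single function that returns zero or more replies. The sketch below is language-neutral and written in Python only to stay dependency-free; it is not Akka code. The `handle` function, the `type` field and the handler table are all hypothetical names. In Akka Streams this shape would presumably map onto a single `mapConcat` stage, where an empty result means "skip" and an error reply takes the same path as a normal one.

```python
import json


def handle(raw, handlers):
    """Map one incoming frame to a list of zero or more outgoing frames.

    [] means "no response expected"; errors are ordinary replies, so they
    need no separate path back to the client.
    """
    try:
        msg = json.loads(raw)
        kind = msg["type"]  # hypothetical discriminator field
    except (ValueError, KeyError, TypeError):
        return [json.dumps({"type": "error", "reason": "bad request"})]

    handler = handlers.get(kind)
    if handler is None:
        return [json.dumps({"type": "error", "reason": "unknown type"})]

    try:
        reply = handler(msg)  # e.g. ask an actor, query a DB, call REST
    except Exception as exc:
        return [json.dumps({"type": "error", "reason": str(exc)})]

    # A handler returning None signals that the client expects no reply.
    return [] if reply is None else [json.dumps(reply)]


# Routing table: "if the message is X, send it there".
handlers = {
    "log": lambda m: None,
    "echo": lambda m: {"type": "echo", "text": m["text"]},
}
```

With this shape, adding a new request kind means adding one entry to the table rather than a new branch in the graph.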

Related

Share a complex object between processes

I want to share a Mojo::Transaction::WebSocket object between processes.
The reason for this is that I am building a websocket chat and I don't want to limit Mojolicious to run only with one worker.
Storable did not work for me; it just gives me weird errors.
Any ideas would be appreciated.
There are various ways you can achieve this. Sharing the websocket itself would be hard: it requires a solid understanding of process forking/threading, sharing file descriptors, and the Mojolicious foundation code, which would very likely need to be changed.
If your aim is to load balance, or to perform some long-running task, you're better off having your Mojo application take the request and add it to a queue system such as Redis. You can have multiple processes listening for specific requests, reading the payload, and sending a response back through the queue.
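The queue pattern described above can be sketched without any external service. This is an illustration only: `queue.Queue` stands in for the queue system, threads stand in for the worker processes, and `run_jobs` and the upper-casing "work" are made-up placeholders.

```python
import queue
import threading


def run_jobs(payloads, n_workers=2):
    """Push payloads onto a request queue and collect worker responses."""
    requests = queue.Queue()   # stand-in for the request queue
    responses = queue.Queue()  # stand-in for the response queue

    def worker():
        while True:
            job = requests.get()
            if job is None:  # sentinel: no more work for this worker
                return
            job_id, payload = job
            responses.put((job_id, payload.upper()))  # placeholder work

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for t in threads:
        t.start()

    for job_id, payload in enumerate(payloads):
        requests.put((job_id, payload))
    for _ in threads:
        requests.put(None)  # one sentinel per worker
    for t in threads:
        t.join()

    # All workers have finished, so the response queue is complete.
    results = {}
    while not responses.empty():
        job_id, result = responses.get()
        results[job_id] = result
    return results
```

The job id travels with the payload so that the front end can match each response to the websocket that asked for it, regardless of which worker handled it.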
If you just want to be able to access the internals of your Mojo application for other purposes, consider providing a RESTful endpoint with the data you wish to publish.
Alternatively, you can look at remote procedure calls (RPC) which will allow your Mojo process to call functions, and send data to other processes. Look at RPC::Simple as an example.

WebSocket/REST: Client connections?

I understand the main principles behind both. I have however a thought which I can't answer.
Benchmarks show that WebSockets can serve more messages as this website shows: http://blog.arungupta.me/rest-vs-websocket-comparison-benchmarks/
This makes sense, as it states that the connections do not have to be closed and reopened, and the HTTP headers etc. do not have to be sent again each time.
My question is, what if the connections are always from different clients all the time (and perhaps maybe some from the same client). The benchmark suggests it's the same clients connecting from what I understand, which would make sense keeping a constant connection.
If a user only makes a request every minute or so, would it not be beneficial for the communication to run over REST instead of WebSockets, as the server frees up sockets and can handle a larger crowd, so to speak?
To fix the issue of REST you would go by vertical scaling, and WebSockets would be horizontal?
Does this make sense or am I out of it?
This is my experience so far, I am happy to discuss my conclusions about using WebSockets in big applications approached with CQRS:
Real Time Apps
Are you creating a financial application, game, chat or whatever kind of application that needs low latency, frequent, bidirectional communication? Go with WebSockets:
Well supported.
Standard.
You can use either publisher/subscriber model or request/response model (by creating a correlationId with each request and subscribing once to it).
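The request/response model mentioned in the last point can be sketched as follows. This is a minimal illustration, not tied to any WebSocket library: `Correlator`, the `send` callback and the message layout are assumptions made for the example.

```python
import itertools
from concurrent.futures import Future


class Correlator:
    """Request/response on top of a one-way message channel."""

    def __init__(self, send):
        self._send = send                # callable that writes one frame
        self._ids = itertools.count(1)   # source of correlation ids
        self._pending = {}               # correlationId -> Future

    def request(self, payload):
        """Send a request and return a Future for its response."""
        cid = next(self._ids)
        future = Future()
        self._pending[cid] = future
        self._send({"correlationId": cid, "payload": payload})
        return future

    def on_message(self, message):
        """Resolve the matching request; return False for other traffic."""
        future = self._pending.pop(message.get("correlationId"), None)
        if future is None:
            return False  # e.g. a pub/sub notification, handled elsewhere
        future.set_result(message["payload"])
        return True
```

Each pending entry is removed as soon as its response arrives, which is the "subscribing once" part. A real client would also need a timeout to drop entries whose response never comes.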
Small size apps
Do you need push communication and/or pub/sub in your client and your application is not too big? Go with WebSockets. Probably there is no point in complicating things further.
Regular Apps with some degree of high load expected
If you do not need to send commands very fast, and you expect to do far more reads than writes, you should expose a REST API to perform CRUD (create, read, update, delete), especially C_UD.
Not all devices prefer WebSockets. For example, mobile devices may prefer to use REST, since maintaining a WebSocket connection may prevent the device from saving battery.
You expect an outcome, even if it is a time out. Even when you can do request/response in WebSockets using a correlationId, still the response is not guaranteed. When you send a command to the system, you need to know if the system has accepted it. Yes you can implement your own logic and achieve the same effect, but what I mean, is that an HTTP request has the semantics you need to send a command.
Does your application send commands very often? You should strive for chunky communication rather than chatty, so you should probably batch those change requests.
You should then expose a WebSocket endpoint to subscribe to specific topics, and to perform low latency query-response, like filling autocomplete boxes, checking for unique items (eg: usernames) or any kind of search in your read model. Also to get notification on when a change request (write) was actually processed and completed.
What I am doing in a pet project is to place the WebSocket endpoint in the read model; on connection, the server gives a connectionID to the client via WebSocket. When the client performs an operation via REST, it includes an optional parameter that indicates "when done, notify me through this connectionID". The REST server returns saying whether the command was sent correctly to a service bus. A queue consumer processes the command, and when done (successfully or not), if the command had a notification request, another message is placed in a "web notification queue" indicating the outcome of the command and the connectionID to be notified. The read model is subscribed to this queue, gets messages and forwards them to the appropriate WebSocket connection.
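The flow in the previous paragraph can be sketched in memory as below. All names here (`System`, `rest_post`, `run_consumer`, `run_read_model`) are hypothetical, the deques stand in for the service bus and the web notification queue, and a plain list stands in for each WebSocket connection.

```python
from collections import deque


class System:
    def __init__(self):
        self.command_queue = deque()       # stands in for the service bus
        self.notification_queue = deque()  # the "web notification queue"
        self.sockets = {}                  # connectionID -> frames pushed

    def connect(self, connection_id):
        """Read model: a client opened a WebSocket connection."""
        self.sockets[connection_id] = []

    def rest_post(self, command, notify=None):
        """REST endpoint: only reports that the command was queued."""
        self.command_queue.append((command, notify))
        return {"accepted": True}

    def run_consumer(self, execute):
        """Queue consumer: run commands, publish outcomes if requested."""
        while self.command_queue:
            command, notify = self.command_queue.popleft()
            try:
                execute(command)
                outcome = "completed"
            except Exception:
                outcome = "failed"
            if notify is not None:
                self.notification_queue.append((notify, command, outcome))

    def run_read_model(self):
        """Read model: forward outcomes to the right connection."""
        while self.notification_queue:
            connection_id, command, outcome = self.notification_queue.popleft()
            frames = self.sockets.get(connection_id)
            if frames is not None:  # the client may have disconnected
                frames.append({"command": command, "outcome": outcome})
```

Note that the REST response only confirms acceptance; the outcome arrives later over the WebSocket, and only for commands that asked for a notification.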
However, if your REST API is going to be consumed by non-browser clients, you may want to offer a way to check for the completion of a command using the async REST approach: https://www.adayinthelifeof.nl/2011/06/02/asynchronous-operations-in-rest/
I know that it is quite appealing to have a low-latency UP channel available to send commands, but if you do, your overall architecture gets messed up. For example, if you are using a CQRS architecture, where is your WebSocket endpoint? In the read model or in the write model?
If you place it on the read model, then you have easy access to your read DB to answer fast search queries, but then you have to couple in the logic to process commands somehow, making the read model responsible for sending the commands to the write model and for notifying the client if it is unable to do so.
If you place it on the write model, then you have it easy to place commands, but then you need access to your read model and read DB if you want to answer search queries through the WebSocket.
By considering WebSockets part of your read model and leaving command processing to the REST interface, you keep your loose coupling between your read model and your write model.

Limit concurrent requests with Spray

I'm using spray-routing to build a simple HTTP server. This server calls out to a number of services that take a while to respond (seconds). We would like to reject requests when the number of concurrent requests becomes too large. Otherwise a large number of concurrent requests bogs down the system to nobody's advantage.
There are a number of layers where this might be solved. I'm not sure how to do any of them precisely, or which is the best one.
I could supply an execution context for spray-routing that has a bounded queue and a rejection policy.
I could limit the mailbox size of my spray http server since it is also an actor.
I could configure a setting in application.conf that addresses this directly for spray.
What is a simple and effective way of implementing such a policy?
I don't know what solution would be the best for your case (I would go for creating my own execution context) but I believe that maybe you should rethink how you want to process your requests.
What do you do with your request? Do you try to handle them in Spray directly? With some help from Futures?
I would suggest creating additional actors, passing the request context to them, and letting them decide what to do: process the request or drop it immediately. This will give you much more flexibility in the future. With the clustering support now in Akka, you can attach additional servers and add processing power easily without changing the Spray part.
I know this doesn't answer your question but I think akka was designed to handle this kind of problems differently and cutting on mailboxes or anything else is not the right choice.
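Whichever layer ends up making the decision, the policy itself (process if there is capacity, otherwise reject immediately) is small. The sketch below is generic and uses no Spray or Akka API: `ConcurrencyLimit`, the status codes and the `process` callback are assumptions for illustration.

```python
import threading


class ConcurrencyLimit:
    """Reject work immediately once max_in_flight requests are running."""

    def __init__(self, max_in_flight):
        self._slots = threading.BoundedSemaphore(max_in_flight)

    def handle(self, request, process):
        # Non-blocking acquire: never queue the caller, just say no.
        if not self._slots.acquire(blocking=False):
            return 503, "too many concurrent requests"
        try:
            return 200, process(request)
        finally:
            self._slots.release()  # free the slot even if process raises
```

Rejecting instead of queueing is the point of the exercise: a bounded queue would still let slow downstream services build up a backlog before anything is turned away.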

Intercept and filter incoming packets at run time in Tigase (XMPP)

I am using the Tigase (XMPP) server. I want to block every incoming message from a particular JID. At the moment I am blocking a particular JID by dropping its packets in Message.java inside the /tigase/xmpp/impl package. Is this the right way to do it? If not, please guide me.
Thanks
An advantage of blocking messages in the Message plugin is that the performance penalty for this filtering is reduced to a minimum. However, there are quite a few disadvantages of doing it this way:
You modify Tigase's code, which makes your version updates painful and time-consuming
It does not allow you to filter out any other packets (such as presence or iq)
Even if you block messages in Message plugin this message may still be processed by other plugins which intercept messages (such as offline message, message archiver, etc...)
Now, the best way to implement such filtering depends on what you really want to do and why you want to do it. Have you heard of privacy lists? Please take a look at them. Tigase fully implements privacy lists; why do you not want to use them, or why can you not use them?
Usually the kind of filtering you describe is done in a Tigase filter called a preprocessor. Please take a look at the privacy lists plugin or the domain filter for a code example.

How to maintain a persistent network-connection between two applications over a network?

I was recently approached by my management with an interesting problem - where I am pretty sure I am telling my bosses the correct information but I really want to make sure I am telling them the correct stuff.
I am being asked to develop some software that has this function:
An application at one location is constantly processing real-time data every second and only generates data if the underlying data has changed in any way.
On the event that the data has changed send the results to another box over a network
Maintains a persistent connection between both machines, alerting the remote box if for some reason the network connection went down
From what I understand, I imagine that I need to do some reading on doing some sort of TCP/IP socket-level stuff. That way if the connection is dropped the remote location will be aware that the data it has received may be stale.
However management seems to be very convinced that this can be accomplished using SOAP. I was under the impression that SOAP is more or less a way for a client to initiate a procedure from a server and get some results via the HTTP protocol. Am I wrong in assuming this? I haven't been able to find much information on how SOAP might be able to solve a problem like this.
I feel like a lot of people around my office are using SOAP as a buzzword and that has generated a bit of confusion over what SOAP actually is - and is capable of.
Any thoughts on how to accomplish this task would be appreciated!
I think SOAP is the wrong tool. SOAP is a spec for exchanging structured data. For your problem, the simplest thing would be to write a program to just transfer data and figure out if the other end is alive. Sockets are a good way to go. There are lots of socket programming tutorials on the net. Pick your language, and ask Mr. Google. Write a couple of demo programs to teach yourself how it works. Ask if you have more specific questions.
For the problem, you'll need a sender and a receiver. The sender sends data when it gets it; the receiver waits for data and hands it off when it arrives. Get that working first. Next, add in heartbeats: a message that says "I'm alive", sent periodically. Get that working next. You'll need to determine the exact behavior you want: whether both sides should send heartbeats to the other end, the maximum time you are willing to wait for a heartbeat, and what action you take should heartbeats stop arriving. The network connection can drop, the other end can crash, the other end can hang, and perhaps there are other conditions you should think about (e.g., what if the real-time data is nonsense?). Figure out how to handle each condition, and code up the error handling. Test it out, and serve with a side of documentation.
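The receiver-side bookkeeping for heartbeats can be sketched independently of the socket code. This is a minimal illustration under stated assumptions: the `HeartbeatMonitor` name, the literal `"HEARTBEAT"` message and the injectable `clock` are all invented for the example, and the actual socket reads are left out.

```python
import time


class HeartbeatMonitor:
    """Tracks when the peer was last heard from."""

    def __init__(self, timeout, clock=time.monotonic):
        self._timeout = timeout      # longest silence we will tolerate
        self._clock = clock          # injectable so it can be tested
        self._last_seen = clock()

    def on_message(self, message):
        """Record activity; return the payload, or None for a heartbeat."""
        self._last_seen = self._clock()
        return None if message == "HEARTBEAT" else message

    def is_stale(self):
        """True once the peer has been silent longer than the timeout."""
        return self._clock() - self._last_seen > self._timeout
```

The receive loop would call `on_message` for every frame read from the socket and check `is_stale` on a timer; when it returns True, the remote side knows its data may be out of date and can raise the alert. Any traffic counts as a sign of life here, so heartbeats only matter while the data is not changing.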
SOAP certainly won't tell you when the data source goes down, though you could use "heartbeats" to add that.
Probably you are right and they are just repeating a buzz word, and don't actually know much about what SOAP is or does or have any real argument for why it ought to be used here.