I'm currently using Heroku autoscaling for my servers, and I need to set up a scalable app using socket.io to allow instant updates of data (bear in mind it's only for updating the frontend displays; no data is processed).
The way I was going to set it up was as follows:
In the image, all the servers have a socket connection to the "main" socket.io server and to the user.
A user would do an action through an API, the server would do its thing (save to MongoDB, compute...) and pass it to a "main" socket.io server through its socket.io connection, but would not send anything back to the user. The "main" server would receive the request through the socket.io connection and emit it back to the servers, which would then emit it to their users.
So the flow would be: User > Server > Main socket.io server > Server > User
My questions are:
Would this work?
Why do all the docs refer to a database like Redis?
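The docs point at Redis because a pub/sub layer is exactly what replaces the "main" socket.io server in this design: every app server publishes events to a shared channel and every app server subscribes to it, so an event raised on one instance reaches clients connected to any other instance. Here's a minimal in-process sketch of that fan-out in plain Node.js, with the hub standing in for Redis (all names are illustrative):

```javascript
// Stand-in for Redis pub/sub: channels map to sets of subscriber callbacks.
class PubSubHub {
  constructor() { this.channels = new Map(); }
  subscribe(channel, handler) {
    if (!this.channels.has(channel)) this.channels.set(channel, new Set());
    this.channels.get(channel).add(handler);
  }
  publish(channel, message) {
    for (const handler of this.channels.get(channel) || []) handler(message);
  }
}

// Each "server" instance holds its own connected clients, but publishes
// events to the shared hub instead of emitting only to its local clients.
class SocketServer {
  constructor(hub) {
    this.hub = hub;
    this.clients = []; // each client is modeled as an inbox (array) here
    hub.subscribe("updates", (msg) => {
      for (const inbox of this.clients) inbox.push(msg);
    });
  }
  emitUpdate(msg) {
    this.hub.publish("updates", msg); // every instance sees it, not just this one
  }
}

const hub = new PubSubHub();
const serverA = new SocketServer(hub);
const serverB = new SocketServer(hub);

const clientOfB = [];
serverB.clients.push(clientOfB);

// The event originates on server A, but reaches the client attached to B.
serverA.emitUpdate("price: 42");
console.log(clientOfB);
```

With Redis in place of the hub, this is broadly the pattern socket.io's Redis adapter implements for you, which is why it shows up all over the scaling docs.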
I'm using MongoDB change streams in order to have indirect communication between servers.
One API server is in the DMZ and the other is in the Intranet; the DB server is also in the DMZ, and both API servers are allowed to communicate with the DB via port 27017.
The DMZ API server inserts documents into the DB and listens for the "update" event in order to return a response to the user, while the Intranet API server listens for the insert event and updates those documents. Once the Intranet API updates a document, the response is returned to the user from the DMZ API. Hope this makes sense so far.
With that setup, I have an issue with the DB server. It constantly complains that swap memory is full, and since it's a replica set, there are 3 servers in it, each with 4 GB of RAM.
Do I need to add more RAM, and how much if anyone knows?
The pricing for Azure SignalR Service is based on Concurrent Connections.
However, I can't find the definition of a Concurrent connection.
I have an ASP.Net Core MVC Web Application. I understand that the server application connection to the Azure SignalR Service is one connection. Each client (browser) that connects to my web app is another connection. But are these considered concurrent connections? Or just open connections sitting there waiting for a message to be sent?
I'm hoping that the count of concurrent connections is a count of connections that are actively sending a message. Is that the case?
Ok so I figured out that a Concurrent connection is any connection to the SignalR Service!
I ran through the quickstart tutorial here
And then used the Azure Metric as I connected various clients to the chat room, and this is what I found:
After starting the app in debug mode, it seems the server immediately uses 5 connections. Then, as I open the url in various browser tabs, a new Client connection is established. As expected, the 16th browser tab does not establish a SignalR connection (Because I am on the Free SignalR Service tier, which has a limit of 20 connections.)
Current configuration
I have a client application connected to a Cloud server through a WebSocket (node.js) connection. I need WebSocket to get real time notifications of incoming messages.
Let's use abc.example.com as the domain name for the Cloud server in this example.
Cloud server configuration
The Cloud is powered by Amazon Elastic Load Balancer.
This Cloud server has the following underlying architecture:
Cloud architecture
On Cloud update, the load balancer switches to another one so all new data posted to the Cloud is handled by a new load balancer and server.
So abc.example.com is always accessible even if the load balancer/server changes. (e.g. Doing an HTTP call)
WebSocket configuration
The WebSocket client connects to abc.example.com, which routes it to a certain server, and it stays connected to that one server until something closes the connection.
Problems
When connected, the WebSocket connection stays open to a server on the Cloud and doesn't detect when the load balancer switches to another one (e.g. on Cloud updates)
So if I send new data to the server for my client (e.g. new message), the connected client doesn't receive it through the WebSocket connection.
An HTTP GET query, however, does work, because it resolves to the right server.
From my understanding, this is normal behavior, since the server the client is connected to over WebSocket is still up and didn't close the connection; from its point of view, nothing wrong happened.
In fact, I tried switching the load balancer, and the initial server the client is connected to still sends a pong response when pinged periodically.
So is there any way to detect, from the client side, when the load balancer has switched? I'm not allowed to modify the Cloud configuration, but I can suggest changes if there is a fairly easy solution.
Bottom line is: I don't want to miss any notifications when the Cloud updates.
Other observations:
At t0:
Client app is connected to server1 through WebSocket thanks to ELB1
Reception succeeds through WebSocket when sending new messages to the Cloud
At t1:
Cloud update: Switch from ELB1 to ELB2
Failure to receive new messages through WebSocket
At t2:
Cloud update: Switch from ELB2 to ELB1
Reception succeeds through WebSocket when sending new messages to the Cloud
Any suggestions/help is appreciated,
*This answer helped me understand the network structure but I'm still running out of ideas.
*Apologies if the terminology is not entirely appropriate.
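Since the stale server keeps answering transport-level pings, one client-side option is an end-to-end heartbeat: periodically POST a random token over HTTP (which always resolves the current load balancer/server) and expect it echoed back over the WebSocket; if the echo never arrives, the socket is pinned to a dead path and should be reopened. A sketch with the transport functions as stand-ins (this assumes the Cloud can be asked to echo such a token, which may require a small server-side addition):

```javascript
// End-to-end liveness check: POST a token over HTTP (always reaches the
// *current* server), and expect it echoed back over the WebSocket.
// If the echo never arrives within timeoutMs, reconnect the socket.
class EndToEndHeartbeat {
  constructor({ postToken, reconnect, timeoutMs }) {
    this.postToken = postToken; // would be an HTTP POST to abc.example.com
    this.reconnect = reconnect; // would close and reopen the WebSocket
    this.timeoutMs = timeoutMs;
    this.pending = null;
    this.timer = null;
  }
  check() {
    this.pending = Math.random().toString(36).slice(2);
    this.timer = setTimeout(() => this.reconnect(), this.timeoutMs);
    this.postToken(this.pending);
  }
  onSocketMessage(msg) {
    if (msg === this.pending) {
      clearTimeout(this.timer); // echo made it: socket is on the live server
      this.pending = null;
    }
  }
}

// Usage sketch with stand-in transports:
let lastSent = null;
let reconnected = false;
const hb = new EndToEndHeartbeat({
  postToken: (t) => { lastSent = t; },
  reconnect: () => { reconnected = true; },
  timeoutMs: 5000,
});
hb.check();
hb.onSocketMessage(lastSent); // echo arrived in time, so no reconnect
```

The key difference from a plain ping/pong is that the probe travels the full HTTP path through the current load balancer, so it fails exactly when the symptom described above occurs.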
Did you consider using a Pub/Sub server/database such as Redis?
This will augment the architecture in a way that allows Websocket connections to be totally independent from HTTP connections, so events on one server can be pushed to a websocket connection on a different server.
This is a very common network design for horizontal scaling and should be easy enough to implement using Redis or MongoDB.
Another approach (which I find less effective, but which could offer scaling advantages for specific databases and designs) would be for each server to "poll" the database (or "subscribe" to database changes), allowing the server to emulate a pub/sub subscription and push data to connected clients.
A third approach, which is by far the most complicated, is to implement a "gossip" protocol and an internal pub/sub service.
As you can see, all three approaches have one thing in common - they never assume that HTTP connections and Websocket connections are routed to the same server.
EDIT (a quick example using redis):
Using Ruby and the iodine HTTP/Websocket server, here's a quick example for an application that uses the first approach to push events to clients (a common Redis Database for Pub/Sub).
Notice that it doesn't matter which server originates an event; the event is pushed to the waiting client.
The application is quite simple and uses a single event "family" (a pub/sub channel called "chat"), but it's easy to filter events using multiple channels, such as a channel per user or a channel per resource (i.e. a blog post, etc.).
It's also possible for clients to listen to multiple event types (subscribe to multiple channels) or use glob matching to subscribe to all the (existing and future) matching channels.
save the following to config.ru:
require 'uri'
require 'iodine'
# initialize the Redis engine for each Iodine process.
if ENV["REDIS_URL"]
  uri = URI(ENV["REDIS_URL"])
  Iodine.default_pubsub = Iodine::PubSub::RedisEngine.new(uri.host, uri.port, 0, uri.password)
else
  puts "* No Redis, it's okay, pub/sub will support the process cluster."
end
# A simple router - checks for a Websocket upgrade and answers HTTP.
module MyHTTPRouter
  # This is the HTTP response object according to the Rack specification.
  HTTP_RESPONSE = [200, { 'Content-Type' => 'text/html',
                          'Content-Length' => '32' },
                   ['Please connect using websockets.']]
  WS_RESPONSE = [0, {}, []]
  # this function will be called by the Rack server (iodine) for every request.
  def self.call env
    # check if this is an upgrade request.
    if env['upgrade.websocket?'.freeze]
      env['upgrade.websocket'.freeze] = WS_RedisPubSub.new(env['PATH_INFO'] && env['PATH_INFO'].length > 1 ? env['PATH_INFO'][1..-1] : "guest")
      return WS_RESPONSE
    end
    # simply return the RESPONSE object, no matter what request was received.
    HTTP_RESPONSE
  end
end
# A simple Websocket Callback Object.
class WS_RedisPubSub
  def initialize name
    @name = name
  end
  # subscribe new clients to the "chat" channel.
  def on_open
    subscribe channel: "chat"
    # let everyone know we arrived
    # publish channel: "chat", message: "#{@name} entered the chat."
  end
  # send a message, letting the client know the server is shutting down.
  def on_shutdown
    write "Server shutting down. Goodbye."
  end
  # perform the echo
  def on_message data
    publish channel: "chat", message: "#{@name}: #{data}"
  end
  def on_close
    # let everyone know we left
    publish channel: "chat", message: "#{@name} left the chat."
    # we don't need to unsubscribe; subscriptions are cleared automatically once the connection is closed.
  end
end
# this call links our application with Rack.
run MyHTTPRouter
Make sure you have the iodine gem installed (gem install iodine).
Make sure you have a Redis database server running (mine is running on localhost in this example).
From the terminal, run two instances of the iodine server on two different ports (use two terminal windows or append & to daemonize the process):
$ REDIS_URL=redis://localhost:6379/ iodine -t 1 -p 3000 config.ru
$ REDIS_URL=redis://localhost:6379/ iodine -t 1 -p 3030 config.ru
In this example, I'm running two separate server processes, using ports 3000 and 3030.
Connect to the two ports from two browser windows. For example (a quick javascript client):
// run 1st client app on port 3000.
ws = new WebSocket("ws://localhost:3000/Mitchel");
ws.onmessage = function(e) { console.log(e.data); };
ws.onclose = function(e) { console.log("closed"); };
ws.onopen = function(e) { e.target.send("Yo!"); };
// run 2nd client app on port 3030 and a different browser tab.
ws = new WebSocket("ws://localhost:3030/Jane");
ws.onmessage = function(e) { console.log(e.data); };
ws.onclose = function(e) { console.log("closed"); };
ws.onopen = function(e) { e.target.send("Yo!"); };
Notice that events are pushed to both websockets, without any concern as to the event's origin.
If we don't define the REDIS_URL environment variable, the application won't use the Redis database (it will use iodine's internal engine instead) and the scope for any events will be limited to a single server (a single port).
You can also shut down the Redis database and notice how events are suspended / delayed until the Redis server restarts (some events might be lost in these instances while the different servers reconnect, but I guess network failure handling is something we have to decide on one way or another)...
Please note, I'm iodine's author, but this architectural approach isn't Ruby or iodine specific - it's quite a common approach to solve the issue of horizontal scaling.
When deploying a web application running on a traditional web server, you usually restart the web server after the code updates. Due to the nature of HTTP, this is not a problem for the users. On the next request they will get the latest updates.
But what about a WebSocket server? If I restart or kill the old process all connected users will get disconnected. So my question is, what kind of strategy have you used to deploy a WebSocket server smoothly?
You're right, every connected user will be disconnected if the server restarts.
I think the least bad solution is to tell the client to reconnect in its onClose handler.
WebSocket is just a transport mechanism. Libraries like socket.io exist to build on that transport and provide heartbeats, browser fallbacks, graceful reconnects, and handling for other edge cases found in real-time applications.
In our WebSocket-enabled application, socket.io is central to ensuring our continuous deployment setup doesn't break users' active socket connections.
If clients are connected directly to the server that does all the socket networking and application logic, then yes, they will be disconnected, due to the TCP layer that holds the connection.
If you have a gateway that clients connect to, and that gateway runs on another server but forwards messages to the logical server, then the logical server sends responses back to the gateway, which relays them to the clients. With such an infrastructure, you have to queue packets on the gateway until it re-establishes its connection with the logical server. The logical server might notify the gateway before restarting. That way the client keeps its connection; it just won't receive any responses for a while.
Or you can implement reconnection on the client side.
With HTTP, every time you navigate away, the browser actually creates a socket connection to the server, transmits all the data, and closes it (in most cases). All the website data is then local until you navigate away again.
With WebSocket it is a continuous connection, and there is no reconnection per request. That's why you have to implement a simple mechanism: when the WebSocket receives a close event, try to reconnect periodically on the client side.
It really depends on your specific needs.
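A common shape for that client-side reconnect is exponential backoff: wait a bit longer after each failed attempt, capped at some maximum, and reset on a successful open. The schedule itself is just (names and numbers illustrative):

```javascript
// Delay (ms) before the i-th reconnect attempt: baseMs * 2^i, capped at maxMs.
function backoffDelays(baseMs, maxMs, attempts) {
  const delays = [];
  for (let i = 0; i < attempts; i++) {
    delays.push(Math.min(baseMs * 2 ** i, maxMs));
  }
  return delays;
}

console.log(backoffDelays(500, 8000, 6)); // [ 500, 1000, 2000, 4000, 8000, 8000 ]
```

In the browser client, each onclose handler would schedule the next `new WebSocket(...)` after the next delay in this list, and reset the attempt counter in onopen. Adding random jitter to each delay avoids a thundering herd when many clients reconnect at once after a deploy.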
I have my server hosted on Heroku. The data source for my app is external to my app. The following is the way I fetch the data:
Initialize a process that connects to a socket on the external party's server.
Save the data that comes through this socket connection.
Now my question is: is it possible on Heroku to launch such processes, which need to run constantly, forever, listening to a socket on an external server?
A process in Heroku can only listen to HTTP traffic on port 80. Like andy mentioned, Node.js is your best bet for running a service like this on Heroku.
I think this might be a job for Node.js which you can run on heroku. The logic flow will be to connect to the party server with a node.js app and then when data is received it will trigger a "callback" method. This method can then make a web request back to a Rails server with the data.
For examples of something like this, check out the pubnub node.js sample app:
https://github.com/pubnub/pubnub-api/tree/master/nodejs
If I understand you correctly, you need to launch a background process on Heroku that connects to an external server; this process then saves data from the API locally?
Accessing an external service:
As far as I'm aware, Heroku does not restrict access to external hosts or ports. Indeed, I have an app that connects to my MongoDB database on MongoHQ.
Long running process: This is certainly possible using the new Celadon Cedar stack. The new Cedar stack uses a concept called a Procfile, which enables running any script (e.g. Ruby, bash, Node.js) as a process.
Saving the data: Heroku has a read-only filesystem (excepting /tmp), so you'll need to save the data coming from the API in a database (or somewhere similar).
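For instance, the long-running listener from the question could be declared as a worker process in the Procfile (the script name here is hypothetical):

```
worker: node listener.js
```

After deploying, `heroku ps:scale worker=1` starts it; a worker dyno is not tied to the HTTP request cycle, so it can hold the outbound socket connection open indefinitely.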