I am currently working on a project in which the client requests a big job and sends it to the server. The server then divides up the job and responds with an array of URLs for the client to make GET calls on and stream back the data. I am the greenhorn on the project and I am currently using Spring websockets to improve efficiency. Instead of the clients constantly pinging the server to see if it has results ready to stream back, the websocket will now just directly contact the client. Hooray!
Would it be a bad idea to have websockets manage the whole process from end to end? I am using STOMP with Spring websockets. Will there still be major issues with ditching REST?
With RESTful HTTP you have a stateless request/response system where the client sends a request and the server returns a response.
With webSockets you have a stateful (or potentially stateful) message passing system where messages can be sent either way and sending a message has a lower overhead than with a RESTful HTTP request/response.
The two are fairly different structures with different strengths.
The primary advantages of a connected webSocket are:
Two way communication. So, the server can notify the client of anything at any time. So, instead of polling a server on some regular interval to see if there is something new, a client can establish a webSocket and just listen for any messages coming from the server. From the server's point of view, when an event of interest for a client occurs, the server simply sends a message to the client. The server cannot do this with plain HTTP.
Lower overhead per message. If you anticipate a lot of traffic flowing between client and server, then there's a lower overhead per message with a webSocket. This is because the TCP connection is already established and you just have to send a message on an already open socket. With an HTTP REST request, unless an existing keep-alive connection can be reused, you first have to establish a TCP connection, which takes several back and forths between client and server. Then, you send the HTTP request, receive the response and close the TCP connection. The HTTP request will necessarily include some overhead such as all cookies that are associated with that server even if those are not relevant to the particular request. HTTP/2 allows for some additional efficiency in this regard if it is being used by both client and server because a single TCP connection can be used for more than just a single request/response. If you charted all the requests/responses going on at the TCP level just to make an https REST request/response, you'd be surprised how much is going on compared to just sending a message over an already established webSocket.
Higher Scale in some circumstances. With lower overhead per message and no client polling to find out if something is new, this can lead to added scalability (higher number of clients a given server can serve). There are downsides to the webSocket scalability too (see below).
Stateful connections. Without resorting to cookies and session IDs, you can directly store state in your program for a given connection. While a lot of development has been done with stateless connections to solve most problems, sometimes it's just simpler with stateful connections.
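To put rough numbers on the "lower overhead per message" point above, here is a minimal sketch in Python. The frame header sizes follow the webSocket framing rules in RFC 6455; the function names and the sample headers are made up for illustration, and real HTTP requests usually carry many more headers (cookies, user agent, and so on) than this one.

```python
def websocket_frame_overhead(payload_len: int, masked: bool) -> int:
    """Header bytes added to one webSocket frame (RFC 6455 framing)."""
    header = 2                      # FIN/opcode byte + mask bit/length byte
    if payload_len > 65535:
        header += 8                 # 64-bit extended payload length
    elif payload_len > 125:
        header += 2                 # 16-bit extended payload length
    if masked:
        header += 4                 # masking key, required client -> server
    return header


def http_request_overhead(headers: dict, path: str = "/") -> int:
    """Bytes of request line and headers for a body-less HTTP/1.1 GET."""
    request_line = f"GET {path} HTTP/1.1\r\n"
    header_lines = "".join(f"{name}: {value}\r\n" for name, value in headers.items())
    return len((request_line + header_lines + "\r\n").encode("ascii"))


if __name__ == "__main__":
    # Hypothetical minimal request: only a Host header.
    print(websocket_frame_overhead(20, masked=True))
    print(http_request_overhead({"Host": "example.com"}))
```

Even the smallest possible HTTP request is several times larger than a webSocket frame header, and that is before counting the response headers or any TCP/TLS setup.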
The primary advantages of a RESTful HTTP request/response are:
Universal support. It's hard to get more universally supported than HTTP. While webSockets enjoy relatively good support now, there are still some circumstances where webSocket support isn't regularly available.
Compatible with more server environments. There are server environments that don't allow long running server processes (some shared hosting situations). These environments can support HTTP requests, but can't support long running webSocket connections.
Higher Scale in some circumstances. The webSocket requirement for a continuously connected TCP socket adds some new scale requirements to the server infrastructure that HTTP requests don't demand. So, this ends up being a tradeoff space. If the advantages of webSockets aren't really needed or being used in a significant way, then HTTP requests might actually scale better. It definitely depends upon the specific usage profile.
For a one-off request/response, a single HTTP request is more efficient than establishing a webSocket, using it and then closing it. This is because opening a webSocket starts with an HTTP request/response and then after both sides have agreed to upgrade to a webSocket connection, the actual webSocket message can be sent.
Stateless. If your job is not made more complicated by having a stateless infrastructure, then a stateless world can make scaling or fail-over much easier (just add or remove server processes behind a load balancer).
Automatically Cacheable. With the right server settings, http responses can be cached by browser or by proxies. There is no such built-in mechanism for requests sent via webSockets.
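The one-off request point in the list above rests on the fact that a webSocket starts life as an ordinary HTTP request that both sides agree to upgrade. A rough sketch of that handshake in Python, following RFC 6455: the function names, host and path are placeholders, and a real client would generate a random key rather than reuse a fixed one.

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for computing Sec-WebSocket-Accept.
WEBSOCKET_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def upgrade_request(host: str, path: str, client_key: str) -> str:
    """The HTTP request a client sends before any webSocket message."""
    return (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Upgrade: websocket\r\n"
        "Connection: Upgrade\r\n"
        f"Sec-WebSocket-Key: {client_key}\r\n"
        "Sec-WebSocket-Version: 13\r\n"
        "\r\n"
    )


def accept_key(client_key: str) -> str:
    """Value the server returns in Sec-WebSocket-Accept with its 101 response."""
    digest = hashlib.sha1((client_key + WEBSOCKET_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


if __name__ == "__main__":
    key = "dGhlIHNhbXBsZSBub25jZQ=="   # sample key from RFC 6455
    print(upgrade_request("example.com", "/chat", key))
    print(accept_key(key))
```

Only after this full HTTP round trip can the first webSocket message be sent, which is why a single plain HTTP request wins for a one-off exchange.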
So, to address the way you asked the question:
What are the pitfalls of using websockets in place of RESTful HTTP?
At large scale (hundreds of thousands of clients), you may have to do some special server work in order to support large numbers of simultaneously connected webSockets.
Not all clients or toolsets support webSockets, or requests made over them, to the same level they support HTTP requests.
Some of the less expensive server environments don't support the long running server processes required to support webSockets.
If it's important to your application to get progress notifications back to the client, you could either use a long running http connection with continuing progress being sent down or you can use a webSocket. The webSocket is likely easier. If you really only need the webSocket for the relatively short duration of this particular activity, then you may find the best overall set of tradeoffs comes by using a webSocket only for the duration of time when you need the ability to push data to the client and then using http requests for the normal request/response activities.
It really depends on your requirements. REST services can be much more transparent and easier for developers to pick up than Websockets.
Using Websockets, you remove most of the advantages that RESTful webservices offer, such as the ability to reference a resource via a URI. Really what you should be doing is to figure out what the advantages are of REST and hypermedia, and based on that decide whether those advantages are important to you.
It's of course entirely possible to create a RESTful webservice, and augment it with a websocket-based API for real-time responses.
But if you are creating a service that only you are going to consume in a controlled environment, the only disadvantage might be that not every client supports websockets, while pretty much any type of environment can do a simple http call.
Background
We are writing a Messenger-like app. We have setup Websockets to Inbox and Chat.
Question
My question is simple. What are the advantages and disadvantages when sending data from Client to Server using REST instead of Websockets? (I am not interested in updates now.)
We know that REST has higher overhead in terms of message sizes and that WS is duplex (thus open all time). What about the other things we didn't keep in mind?
Here's a summary of the tradeoffs I'm aware of.
Reasons to use webSocket:
You need/want server-push of data.
You are sending lots of small pieces of data from client to server and doing it very regularly. Using webSocket has significantly less overhead per transmission.
Reasons to use REST:
You want to use server-side frameworks or modules that are built for REST, not for webSocket (such as auth, rate limiting, security, streaming, etc...).
You aren't sending data very often from client to server and thus the server-side burden of keeping a webSocket connection open all the time may lessen your server scalability.
You want your client to run in places where a long-connected webSocket during inactive periods of time may not be practical (perhaps mobile).
You want your client to run in old browsers that don't support webSocket.
You want the browser to enforce same-origin restrictions (those are enforced for REST Ajax calls, but not for webSocket connections).
You don't want to have to write code that detects when the webSocket connection has died, auto-reconnects, handles back-offs and deals with mobile issues such as battery usage, etc...
You need to run in situations where there are proxies or other network infrastructure that may not support long running webSocket connections.
If you want request/response built in. REST is request/response. WebSocket is not - it's message based. Responses from a webSocket are done by sending a message back. That message back is not, by itself, a response to any specific request, it's just data being sent back. If you want request/response with webSocket, then you have to build some infrastructure yourself where you tag an id into a request and the response for that particular request contains that specific id. Otherwise, if there are ever multiple requests in flight at the same time, then you don't know which response belongs with which request because all the data is being sent over the same connection and you would have no way of matching response with request.
If you want other clients to be able to carry out this operation via an Ajax call.
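The id-tagging infrastructure described in the request/response item above can be sketched as follows. This is a minimal illustration in Python with asyncio, not a real webSocket client: the RequestChannel class, the JSON message shape and the send callback are all assumptions made for the example, and the "peer" in the demo is simulated in-process.

```python
import asyncio
import itertools
import json


class RequestChannel:
    """Adds request/response matching on top of a message-only channel."""

    def __init__(self, send):
        self._send = send                 # coroutine that sends one text message
        self._ids = itertools.count(1)
        self._pending = {}                # request id -> Future awaiting the reply

    async def request(self, payload):
        request_id = next(self._ids)
        future = asyncio.get_running_loop().create_future()
        self._pending[request_id] = future
        await self._send(json.dumps({"id": request_id, "payload": payload}))
        return await future               # resolved by on_message()

    def on_message(self, raw):
        """Call this for every incoming message; unknown ids are ignored."""
        message = json.loads(raw)
        future = self._pending.pop(message.get("id"), None)
        if future is not None and not future.done():
            future.set_result(message["payload"])


async def demo():
    sent = []

    async def send(raw):                  # stand-in for socket.send(...)
        sent.append(json.loads(raw))

    channel = RequestChannel(send)
    first = asyncio.create_task(channel.request("a"))
    second = asyncio.create_task(channel.request("b"))
    await asyncio.sleep(0)                # let both requests go out
    for msg in reversed(sent):            # simulated peer replies out of order
        channel.on_message(json.dumps({"id": msg["id"], "payload": msg["payload"].upper()}))
    return await first, await second


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Even though the replies arrive in reverse order, each caller gets the reply that carries its own id. A production version would also need timeouts and cleanup of pending requests when the connection drops.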
So, if you already have a webSocket implementation, don't have any problems with it that would be lessened with REST and aren't interested in any of the reasons that REST might be better, then stick with your webSocket implementation.
Related references:
websocket vs rest API for real time data?
Ajax vs Socket.io
Adding comments per your request:
It sounds like you're expecting someone to tell you the "right" way to do it. There are reasons to pick one way over the other. If none of those reasons compel you one way vs. the other, then it's just an architectural choice and you must take in the whole context of what you are doing and decide which architectural choice makes more sense to you. If you already have the reliably established webSocket connection and none of the advantages of REST apply to your situation then you can optimize for "efficiency" and send your data to the server over the webSocket connection.
On the other hand, if you wanted there to be a simple API on your server that could be reached with an Ajax call from other clients, then you'd want your server to support this operation via REST so it would be simplest for these other clients to carry out this one operation. So, it all depends upon which direction your requirements drive you and, if there is no particular driving reason to go one way or the other, you just make an architectural choice yourself.
Nowadays I'm designing a REST interface for a distributed system. It is a client/server architecture but with two message exchange patterns:
req/resp: the most RESTful approach, it would be a CRUD interface to access/create/modify/delete objects in the server.
pub/subs: this is my main doubt. I need the server to send asynchronous notifications to the client as soon as possible.
Searching the web I found that one solution could be to implement REST servers in both the server and the client: Publish/subscribe REST-HTTP Simple Protocol web services architecture?
Another alternative would be to implement blocking-REST and so the client doesn't need to listen in a specific port: Using blocking REST requests to implement publish/subscribe
I would like to know which options you would consider to implement an interface like this one. Thanks!
Web Sockets can provide a channel for the service to update web clients live. There are other techniques like http long polling where the client makes a "blocking" request (as you referred to it) where the service holds the request for a period of less than a timeout (say 50 sec) and writes a response when it has data. The web client immediately issues another request. This loop creates a continuous channel where messages can be "sent" from the server to the client, but every connection is initiated from the client, which matters when firewalls, proxies, etc. sit in between.
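The long polling loop described above might look roughly like this. It is an in-process sketch in Python with asyncio rather than real HTTP: the queue stands in for the service's pending notifications, long_poll plays the role of the held request, and all names and timings are invented for the example.

```python
import asyncio


async def long_poll(queue: asyncio.Queue, timeout: float):
    """Service side: hold the request until data arrives or the timeout expires."""
    try:
        return await asyncio.wait_for(queue.get(), timeout)
    except asyncio.TimeoutError:
        return None                      # empty response; the client just asks again


async def client_loop(queue: asyncio.Queue, timeout: float, want: int):
    """Client side: issue the next request immediately after every response."""
    received = []
    while len(received) < want:
        message = await long_poll(queue, timeout)
        if message is not None:
            received.append(message)
    return received


async def demo():
    queue = asyncio.Queue()

    async def producer():                # events happening on the service
        for text in ("first", "second"):
            await asyncio.sleep(0.05)
            await queue.put(text)

    task = asyncio.create_task(producer())
    received = await client_loop(queue, timeout=0.02, want=2)
    await task
    return received


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Several polls time out empty before each message arrives, which is the cost of this technique compared with a webSocket, but the client never has to accept an inbound connection.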
There are libraries such as socket.io, signalR and many others that wrap this logic and even fallback from websockets to long polling gracefully for you and abstract away the details.
I would recommend writing some sample web socket and long polling examples just to understand them, but then relying on libraries like those mentioned above to get it right.
I'm using zmq to develop a distributed application having the following network topology: a client node that initiates a request and a server node that replies to requests. Since the client is a node.js application I can't block after a send call to wait for the response, so the scenario is that the client could emit multiple send calls to the same endpoint. On the other side the server is a mobile application that processes one request at a time in one thread, blocking if there are not any requests.
If this configuration sounds odd, I'm trying to build a sort of RPC initiated by the server to mobile.
I thought to use a DEALER socket client side and a REP socket server side. From zmq guide about DEALER/REP combination:
This gives us an asynchronous client that can talk to multiple REP servers. If we rewrote the "Hello World" client using DEALER, we'd be able to send off any number of "Hello" requests without waiting for replies.
Can it be applied to asynchronous client that can talk to one single server? And could it be a good choice? If not which pattern should I use?
Can it be applied to asynchronous client that can talk to one single server? And could it be a good choice?
REQ/REP is not recommended for traffic going over the Internet. The socket can potentially get stuck in a bad state.
The DEALER/REP combination is for a dealer client talking to multiple REP servers. So this does not apply to your use case.
If not which pattern should I use?
In your case it seems to me that using the traditional DEALER/ROUTER is the way to go. What I usually do is prepend a "tag frame" to my messages, i.e. a frame that contains a UUID of some sort that allows me to identify my requests (and their replies) at the application level.
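A minimal sketch of the tag frame idea, in Python for brevity. It deliberately does not use a zmq binding: a multipart message is modelled as a plain list of byte frames, and the PendingRequests class and its method names are hypothetical. The server is assumed to copy the tag frame unchanged into its reply.

```python
import uuid


class PendingRequests:
    """Tracks in-flight requests by the UUID carried in their tag frame."""

    def __init__(self):
        self._pending = {}               # tag bytes -> original request payload

    def wrap(self, payload: bytes) -> list:
        """Build the frames to send: [tag frame, payload frame]."""
        tag = uuid.uuid4().bytes
        self._pending[tag] = payload
        return [tag, payload]

    def match(self, reply_frames: list):
        """Pair a reply with its request; returns (request payload, reply payload)."""
        tag, reply = reply_frames
        request = self._pending.pop(tag)  # KeyError means an unknown or duplicate tag
        return request, reply


if __name__ == "__main__":
    pending = PendingRequests()
    first = pending.wrap(b"one")
    second = pending.wrap(b"two")
    # Replies may come back in any order; the tag frame identifies them.
    print(pending.match([second[0], b"TWO"]))
    print(pending.match([first[0], b"ONE"]))
```

Because the client never blocks on a send, it can have many such requests outstanding and still route every reply to the right caller.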
I've been researching memcached, and I'm planning on using that with spymemcached on the client. I'm just curious how client/server communication works between the two. When creating a memcached client object, you can pass in a list of servers, but after the client is created is there any communication between the servers and the client saying that they are still alive and that the client can send information to that particular server? I've tried looking through the memcached and spymemcached documentation sites, but haven't found anything yet.
Spymemcached does not send any special messages to make sure that the connection is still alive, but you can do this in your application code if necessary by sending no-op messages to each server. You should also note that the TCP layer employs mechanisms such as keep-alive and timeout in order to try to detect dead connections. These parameters however may be different depending on the operating system you are using.
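Both mechanisms mentioned above can be sketched at the socket level. This is Python rather than spymemcached's Java API, so treat it as an illustration of the idea only: the function names are made up, and using the memcached text protocol's "version" command as the no-op message is an assumption about what a reasonable probe looks like.

```python
import socket


def enable_tcp_keepalive(sock: socket.socket) -> bool:
    """Ask the OS to probe an idle connection; probe timing is OS-dependent."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0


def memcached_is_alive(sock: socket.socket, timeout: float = 1.0) -> bool:
    """Application-level no-op: send 'version' and expect a 'VERSION ...' line."""
    sock.settimeout(timeout)
    try:
        sock.sendall(b"version\r\n")
        reply = b""
        while not reply.endswith(b"\r\n"):
            chunk = sock.recv(64)
            if not chunk:                # peer closed the connection
                return False
            reply += chunk
        return reply.startswith(b"VERSION")
    except OSError:                      # includes timeouts and resets
        return False
```

An application could call memcached_is_alive on a timer for each configured server and stop routing keys to servers that fail the check.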
For several years, I've been facing problems with HTTP 1.1 pipelining & continued to ask the server to send the HTTP Header:
Connection: close
I want to revisit this decision. Do your native mobile apps use HTTP pipelining?
Some problems with HTTP pipelining I've faced:
Server not releasing TCP connections
My client is receiving multiple replies from one HTTP connection
That's exactly what persistent connections and pipelining are for: keeping the TCP connection open until the timeout expires (or the browser closes), and sending multiple requests down the same pipe.
You might want to consider removing persistent connections if your server serves a high number of clients (you might run out of workers, RAM, or even free ports, raising response time for new requests)
If you want to read further, a pointer about persistent connection behaviour
One of the requirements for clients/servers to be compatible with HTTP/1.1 is the support of pipelining. So I don't see how using it would be a problem... I would rather think it would be encouraged. Using pipelining you cut down on creating new resources, network bandwidth, etc.
All modern web servers support pipelining and any reasonably complete client library should, so I'm not sure what the problem could be... perhaps if you ask about specific errors we could help you with them.
HTTP "pipelining" does not only mean to keep the TCP connection open between consecutive requests/responses. It describes a user agent behaviour where it sends the next HTTP request even without waiting for the pending response to the last request.
In my experience almost any HTTP server supports persistent connections. Using pipelining additionally is less stable. Firefox implements this feature but disables it by default.
You're confusing HTTP pipelining and HTTP persistent connections.
Persistent connection is where you keep the TCP connection around for future requests, but still send them serially: http://www.w3.org/Protocols/rfc2616/rfc2616-sec8.html
Pipelining is a rarely used feature of HTTP 1.1 where you just fire multiple requests on the same connection without waiting for the responses. It's actually required by the HTTP specification, but rarely used by clients (Android's HTTP library doesn't, for example). Most servers seem to support it, though. It's described in section 8.1.2.2 of the same RFC.
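To make the distinction concrete, here is a small self-contained sketch in Python that pipelines two requests over one connection: both are written to the socket before any response is read. The local test server, the EchoPathHandler class and the helper names are all invented for the demonstration; a persistent but non-pipelined client would instead send the second request only after reading the first response.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class EchoPathHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"        # keep the connection open between requests

    def do_GET(self):
        body = self.path.encode()
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):        # silence per-request logging
        pass


def read_response(reader):
    """Read one response that has a Content-Length body; returns (status, body)."""
    status_line = reader.readline()
    length = 0
    while True:
        line = reader.readline()
        if line in (b"\r\n", b""):
            break
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"content-length":
            length = int(value.strip())
    return status_line.split()[1], reader.read(length)


def pipelined_get(host, port, paths):
    """Send every request first, then read the responses in the same order."""
    with socket.create_connection((host, port)) as sock:
        requests = b"".join(
            f"GET {path} HTTP/1.1\r\nHost: {host}\r\n\r\n".encode() for path in paths
        )
        sock.sendall(requests)           # pipelining: no waiting between requests
        with sock.makefile("rb") as reader:
            return [read_response(reader) for _ in paths]


def demo():
    server = HTTPServer(("127.0.0.1", 0), EchoPathHandler)   # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return pipelined_get("127.0.0.1", server.server_address[1], ["/a", "/b"])
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(demo())
```

The responses come back strictly in request order on the one connection, which is why a client that is not expecting pipelining sees "multiple replies from one HTTP connection".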