Why use external libraries (like libcurl) vs. sockets for sending HTTP requests?

I'm new to network programming and have recently been playing around with using sockets in C++.
I have a pretty decent handle on it at this point, and I understand HTTP/TCP/IP packets pretty well.
However, upon doing some research online, it seems like the bulk of network programmers suggest using external libraries such as libcurl (or curl++ for C++) for sending HTTP requests.
Considering that HTTP is a text-based protocol, why is this more beneficial/easier than simply sending HTTP requests as text messages using socket programming?
I found a few sites that show that you can do this without too much difficulty: HTTP Requests in C++ without external libraries?,
Simple C example of doing an HTTP POST and consuming the response
It seems like sending HTTP requests is simply a matter of getting the formatting correct and then sending it via a TCP socket. How is this made easier with external libraries?
Please bear with me as I'm new to network programming and eager to learn.

The links you've provided in your question are, in a way, a pretty good explanation of why you should not code HTTP yourself: the first link only points to the socket API and says nothing about HTTP, while the second one provides examples and code which are far too simplified for real-world use and will not even work with the typical setup of multiple domains on the same host, since the requests are missing the Host field. In other words: these are resources which might look useful to the inexperienced developer, but they will actually lead you into trouble.
HTTP is not as simple as it might look. HTTP/0.9 was simple but is no longer supported by many clients. HTTP/1.0 is still fairly simple if you restrict yourself to the basic aspects. But there are already enough pitfalls, like using the wrong delimiter between lines and between the request header and body, or not sending a Host field when accessing multi-domain hosts.
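To make the delimiter and Host pitfalls concrete, here is a minimal sketch of what a correct HTTP/1.1 request over a raw TCP socket has to contain. It is written in TypeScript on Node only to keep the sketch short; the hostname is a placeholder, and the same bytes apply if you write them from a C++ socket:

```typescript
import * as net from "net";

// Every line must end with CRLF ("\r\n"), not just "\n", and the header
// block must be terminated by an empty line. HTTP/1.1 also requires a
// Host header so that servers hosting several domains on one IP address
// can route the request.
const request =
  "GET /index.html HTTP/1.1\r\n" +
  "Host: www.example.com\r\n" +     // mandatory in HTTP/1.1
  "Connection: close\r\n" +
  "\r\n";                           // blank line ends the header block

const socket = net.connect(80, "www.example.com", () => socket.write(request));
socket.on("data", (chunk) => process.stdout.write(chunk));
```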
And once you want to get efficient, you want multiple requests per TCP connection and compression, and then it gets more complex. With HTTP/1.1 it gets even more complex due to chunked transfer encoding, and with HTTP/2 it gets more efficient but far more complex, with a binary protocol and interleaved requests and responses.
And this was only HTTP. With HTTPS you have the additional and non-trivial TLS layer, which has its own pitfalls.
Thus, if you just want to use HTTP and HTTPS, it is much better to use established libraries. Of course, if you want to learn the innards of HTTP, it might be useful to read all the relevant standards, look at packet traces and try to implement your own.
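For contrast, this is roughly what the same request looks like through an established client. The sketch uses Node's built-in fetch purely as an illustration; with libcurl in C++ it is likewise a single call, and the library takes care of the Host header, redirects, chunked bodies, compression and TLS for you. The URL is a placeholder:

```typescript
async function main(): Promise<void> {
  // One call; the library handles Host, TLS, redirects, chunked bodies, etc.
  const response = await fetch("https://www.example.com/");
  console.log(response.status, (await response.text()).length);
}
main();
```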

Related

REST Calls with Http2.0 over Http1.1

I was reading about HTTP/2.0 as the new HTTP protocol and its advantages, such as binary headers and multiplexing. But I would like to know whether, for REST calls, migrating from HTTP/1.1 to HTTP/2.0 provides any reasonable advantage. I am not able to find any specific gain for RESTful calls with HTTP/2.0.
Thanks in advance.
It does, in several ways:
Better support for streaming transfers. The conventional alternative is a combination of HTTP/1.1's chunked transfer encoding, Connection header voodoo, and the willingness (or not) of each of the parties to implement HTTP pipelining (which, e.g., curl enables by default). In my experience it takes a lot more work to get those three to play well together than to just use HTTP/2. There is no need for chunked transfer encoding with HTTP/2, and the protocol does not support it.
With HTTP/2, you can have many requests in flight, REST or not, with no extra connection-establishment time. This is a blessing both for the browser and for the server, which has to allocate fewer file descriptors per client.
Header compression also applies to HTTP/2 REST requests, together with the associated bandwidth reduction.
So, if in doubt, always go for HTTP/2. There are also excellent tools out there for developing HTTP/2 applications. Some, like ShimmerCat, even remove the drudgery of setting up certificates and DNS aliases, so that starting with HTTP/2 from day one becomes a no-brainer.
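If it helps to see the multiplexing point in code, here is a minimal sketch using Node's built-in http2 module; the host and API paths are placeholders. All three REST calls share one TCP+TLS connection and are interleaved on it:

```typescript
import * as http2 from "http2";

const client = http2.connect("https://example.com"); // one connection for everything

// Fire several REST calls concurrently; HTTP/2 interleaves them on the same socket.
for (const path of ["/api/users/1", "/api/users/2", "/api/orders?limit=10"]) {
  const req = client.request({ ":method": "GET", ":path": path });
  let body = "";
  req.setEncoding("utf8");
  req.on("data", (chunk) => (body += chunk));
  req.on("end", () => console.log(path, body.length, "bytes"));
  req.end();
}

// Close the session once the streams have had time to finish (simplified for the sketch).
setTimeout(() => client.close(), 2000);
```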

RESTful interface for ECG/EEG sensor data in haskell

I'm working on a project in which I want to display biosensor EEG/ECG data measured by a portable device (e.g., a microcontroller with wireless data transmission via Wi-Fi or Bluetooth). For this purpose, I need to interface with the portable device/microcontroller; many of these devices seem to offer RESTful interfaces, though they probably also offer plain sockets.
One example of a microcontroller with Wi-Fi is the "spark.io", which is based on a Cortex-M3 and a CC3000 wireless controller for on-board Wi-Fi access. The data to be transferred are around 500 to 1000 float values per second, which should arrive at the REST client with as little delay as possible. Probably a non-REST approach like sockets would fit better, but I would still like to test an approach based on a RESTful interface (a small argument for this is that transferring data via a RESTful interface seems very common and has good library support).
Q: The question is, what is the best approach for a performant (in the sense of near-real-time) implementation that interfaces with such a device via a REST interface?
I am sure this problem has been solved before, but I could not quickly find a paper via Google Scholar or a technical/scientific blog post that explains this. The only link I found is on "rest hooks", but I am not sure if this is a good approach. Searching on SE didn't reveal a past question on this.
Side note: My approach would be to implement the interface in Haskell first to test the design and performance of the RESTful interface. Later, the working approach should be ported to or implemented with Java/Android/spark.io/some other microcontroller.
(Please note this question is entirely about the architecture and not at all about Haskell libraries or anything. If using REST is the stupidest thing, I will accept that as an answer if it is argued. The question is then also whether microcontroller web interfaces in general, and specifically their APIs like that of "spark.io", are a bad idea if they are implemented via REST. Is this the case? If not, what definition of "near real time" justifies that a REST interface is a bad idea and thus other means of communication are better? Like: one sensor read per minute? Or one per second, per 1/10 second, per 1/100 second, per 1/1000 second?)
Okay, let's go through this.
REST is not necessarily a bad idea, but it has a lot of features which you may not need. For example, there are REST verbs not just for retrieval, but also for updating, deleting, and creating resources. If those functions are important (e.g. you need to send certain control data to the EEG controller) then REST will be nice. If you just want fast access to the stream of data, consider raw TCP instead.
Similarly, REST will package messages into "requests" and their "responses" which come with a bunch of "headers" indicating things like whether the request could be fulfilled, whether it's compressed, etc. These can be great features but may be bloat. You'll probably want to emit enough data on each request so that the ~1kB of headers are a small fraction of it. But given 8-byte floats (doubles), that requires transmitting 500-1000 data points, which you've said will take about one second. Is that our fate -- to always have 1s of latency?
REST will allow you to avoid some of that bloat by declaring a Transfer-Encoding: chunked so that the client can operate on individual chunks as they become available. So that's an architectural decision that I think will need to be made.
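The question is about Haskell, but as a language-neutral illustration of that architectural choice, here is a rough sketch (TypeScript on Node; the endpoint, port and sampling rate are invented) of a server that streams readings as chunks instead of buffering a full response:

```typescript
import * as http from "http";

http.createServer((req, res) => {
  if (req.url !== "/ecg/stream") {
    res.statusCode = 404;
    res.end();
    return;
  }
  // No Content-Length is set, so Node falls back to Transfer-Encoding: chunked
  // and each write() below goes out as its own chunk.
  res.writeHead(200, { "Content-Type": "application/x-ndjson" });
  const timer = setInterval(() => {
    const sample = { t: Date.now(), value: Math.random() }; // stand-in for a real reading
    res.write(JSON.stringify(sample) + "\n");
  }, 100); // ~10 batches per second, just for the sketch
  req.on("close", () => clearInterval(timer));
}).listen(8080);
```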
I would definitely get Keep-Alive working as soon as possible, and it would be my chief feature when looking for what library to use on the server. Keep-Alive is a standard extension to HTTP which avoids tearing down and rebuilding the TCP stack for each HTTP request. If you don't do this then you have some heavy protocol negotiations each time you send a request.
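On the client side, keep-alive mostly means reusing one TCP connection for many requests. A minimal sketch of that (again in Node/TypeScript for illustration; the host and path are invented) uses a keep-alive agent:

```typescript
import * as http from "http";

// A keep-alive agent keeps the TCP connection open between requests,
// so repeated polls avoid the connect/teardown cost described above.
const agent = new http.Agent({ keepAlive: true, maxSockets: 1 });

function poll(): void {
  http.get({ host: "sensor.local", path: "/ecg/latest", agent }, (res) => {
    res.resume();             // drain the body so the socket can be reused
    setTimeout(poll, 100);    // next poll reuses the same connection
  });
}
poll();
```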
A crucial decision you'll have to make involves whether you want to do HTTP pipelining or not. You can combine HTTP pipelining with longer-lived requests (ones where you don't expect an immediate response) to essentially "send the data when it becomes available" (i.e. send the headers first and let the server push out the data when it's good and ready). This is an alternative to chunked transfers.
If you can work those out, then HTTP is regularly used to send megabytes per second, so your use case fits well within what REST is capable of. In terms of REST/HTTP libraries for Haskell, if you have to somehow program the controller yourself, the big options are wai, yesod, snap, and rest. If you just need an HTTP client there are a few of those too.

Is ReST over websockets possible?

I am planning to develop a web-based chat application which takes in RESTful requests, translates them to XMPP and delivers them to an XMPP server.
Using websockets for this kind of chat-based application looked promising, as the events (or responses) can be delivered asynchronously. But if I use websockets as the underlying protocol for transferring the requests from the browser, can this still be considered a RESTful design? If yes, how are the URIs, verbs (GET, POST...), and parameters represented in the websocket message? Wrap them in XML/JSON and send that?
Also, the RESTful architecture states that no session state should be stored on the server. But here, when an XMPP client session is created, the state of this session will be stored on the server (violating the stateless constraint).
REST is an architectural style that does not impose a protocol. So yes, you can do REST with Web Sockets, REST with HTTP and REST with FTP if you like.
The main reason to use HTTP is that it is easy and fairly simple to communicate with any component or programming language via HTTP, and also because HTTP supports distributed environments with multiple intermediaries: proxies, firewalls and so on. So you can deploy your service on any topology and anyone will be able to access it.
My rant:
If you are a RESTliban and Roy Fielding's dissertation is the source of truth, verbs are never acknowledged as part of the semantics; URIs are the semantics. The usage of different verbs for different actions has been an elegant evolution of REST over HTTP, but it is not part of the "truth". You can check the scenario of REST over HTTP evaluated by Roy in chapter six of his dissertation. There is no mention of verbs. And notice it is an evaluation scenario, not the specification.
TLDR;
If you need real-time two-way communication over the internet and the client is a web browser, the best choice is WebSockets. You could then implement an application-level protocol on top of WebSockets to implement a RESTful web service.
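One common way to carry the REST vocabulary inside WebSocket messages is to wrap the method, URI and body in JSON. A rough sketch (the endpoint and the message shape are invented for illustration, not any standard):

```typescript
// Browser (or compatible) WebSocket; the endpoint is a placeholder.
const socket = new WebSocket("wss://chat.example.com/api");

socket.onopen = () => {
  // The "request" carries the same information an HTTP request line and body would.
  socket.send(JSON.stringify({
    method: "POST",
    uri: "/rooms/42/messages",
    body: { from: "alice", text: "hello" },
  }));
};

socket.onmessage = (event) => {
  const response = JSON.parse(event.data); // e.g. { status: 201, body: {...} }
  console.log(response.status);
};
```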
Yes. You can use REST over WebSocket with a library like SwaggerSocket.
Why would you want to build a REST API on top of sockets? IMHO the benefit of a REST API is to leverage standard HTTP protocol features like stateless requests and semantic verbs like GET and DELETE to build an API that can be easily understood by (client) developers. Since sockets do not offer HTTP verbs and so on, you would have to build some kind of HTTP layer on top of the sockets, which is IMHO not reasonable.
If you really did build such a thing, I'd recommend using the HTTP protocol as a blueprint and implementing the socket protocol like HTTP.
The REST architectural style mostly presumes two entities, viz. a client and a server.
As we move more towards the real-time web and the development of reactive systems, WebSockets will increasingly start replacing the usage of REST APIs.
WS allows both data push and pull, which blurs the distinction between server and client.
STOMP, AMQP and XMPP can be used as messaging protocols.
The data itself may be JSON, Google Protocol Buffers or perhaps Apache Avro.
WebSockets are not tied to web servers and can also be used in standalone apps like mobile or desktop apps.
I don't understand why you would convert XMPP into REST and then run REST over WS. The point of WebSocket is to take the XMPP protocol directly to the browser, thereby avoiding all of the translation issues.
There are JavaScript libraries that can talk XMPP from the browser to the server. All you need is to proxy the XMPP traffic from WS over into TCP and then straight into your XMPP server. Kaazing has a gateway that allows you to do this.
If you want to use open source, you will need to write a JavaScript XMPP library. There are examples that show how to write a JS library for simple protocols. You just have to find one and extend the concept to the XMPP protocol.
So to recap, here is how the architecture would look:
Your XMPP Client code <-> XMPP JavaScript Library <-> WebSocket over HTTP <-> WebSocket-to-TCP Proxy <-> XMPP Server
where the XMPP Client code and the XMPP JavaScript Library run in the browser, and the WS-to-TCP proxy along with the XMPP server are all server-side.
I understand this post is really old, but I wanted to push back a bit further on the notion that choosing a REST architecture means forfeiting the ability to do real-time communication.
In a word, no. A number of REST style implementations I have had experience with leverage REST for compatibility, discoverability, and as a means to scale across different devices in the shadow of IoT.
However, beyond using WS alongside REST to facilitate near-real-time transmission, there are also a number of abstractions which really help with this and allow you to focus on building your API and deciding how the RT components of the consuming applications should operate.
I would suggest taking a look at things like Tibco Smart-Sockets, or SignalR if you're looking to build a REST API and would like to avoid re-creating the wheel for your RT needs.
I created a project that adds callbacks to the web socket send function: https://github.com/ModernEdgeSoftware/WebSocketR2
Message IDs are established so the client can implement callbacks. It handles message retries after timeouts, as well as reconnecting to the server if the connection gets dropped. You can then structure your payload to be as RESTful as you want by adding verbs and paths (see the sketch below).
This is similar to when a video game studio uses UDP to achieve the speeds they need, but their net code implements a lot of TCP like features for reliability.
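The message-ID idea is easy to picture: each request carries an id, the server echoes it back, and the client resolves the matching callback. A rough sketch of that pattern (a hypothetical message shape for illustration, not WebSocketR2's actual wire format):

```typescript
// Pending callbacks keyed by message id (hypothetical protocol, for illustration only).
const pending = new Map<number, (response: any) => void>();
let nextId = 0;

function request(socket: WebSocket, verb: string, path: string, body: unknown,
                 callback: (response: any) => void): void {
  const id = ++nextId;
  pending.set(id, callback);
  socket.send(JSON.stringify({ id, verb, path, body }));
}

function onMessage(event: MessageEvent): void {
  const msg = JSON.parse(event.data);   // expected shape: { id, status, body }
  pending.get(msg.id)?.(msg);           // resolve the matching callback
  pending.delete(msg.id);
}
```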
The OP's original question is: "Is ReST over websockets possible?"
What this question implies is the following: is a REST API possible over WebSockets as a transport?
Of course, the OP did not mean: is the REST architectural style possible over WebSockets? His question was more an operational one, i.e. can REST API requests such as GET, PUT, POST, DELETE, etc. be exchanged over a WebSockets pipe?
To answer this question, we have to understand that both sockets and Websockets are the same type of interface (full duplex, 3-way handshake protocol), but the difference is that the sockets interface originated in the ARPANET reference model. In that network model, sockets were an interface between the Session layer and the Transport layer. The word "interface" means that it resides "in between" network layers, i.e. at their boundary. In other words, sockets are not part of any specific network layer.
Websockets are the same type of socket interface, but in the OSI 7-layer network model they no longer reside between the Session and Transport layers. Instead, they reside between the Session layer and the Presentation layer. Why there? Why this "move"? A motivation for this was to be able to leverage the HTTP protocol as a transport for sockets. And what is so special about the HTTP protocol? In enterprise establishments there are a lot of network zones and segments, and these security domains are protected by firewalls. And firewalls, as we know, have associated rules for inbound/outbound traffic. If we want two components in two different network zones to talk to each other, we have to ensure that ports on the corresponding firewalls are open. That would involve collaboration of infrastructure and operations teams, business approvals, etc., and would introduce significant delays in achieving a simple thing: two components communicating with each other.
Which brings us to our use case: the Websockets interface placed between the Session OSI layer (where HTTP resides) and the Presentation OSI layer (where things like TLS reside). By default, port 80 is open on all firewalls, so no involvement of operations and infrastructure is needed. And our two components can now converse over a Websockets communication pipe.
Back to the OP's question. Any type of string list can be transferred over sockets. Sockets/Websockets are an ideal mechanism for transporting all sorts of custom protocols, whether STOMP, HL7, FHIR, or many others. GET, PUT, POST and DELETE requests are different operations on a REST API endpoint. These operations take the form of a specific string list, and as we saw, sockets/Websockets are very convenient for passing string lists back and forth. In the case of REST over HTTP, though, you are leveraging the whole HTTP "infrastructure" available in all modern browsers, such as Chrome, Firefox, Edge, etc., as well as web servers such as Apache, nginx, IIS, OHS, IHS, etc. In other words, a REST API piggybacks on an established, string-list-based protocol called HTTP that is built into both the client and server sides. This cannot be said about Websockets: you would have to ensure that every type of client and server complies with your (custom) transport solution based on Websockets!
I just spotted a new post on the blog of a company providing a cloud solution and "Server-end/Service as a Platform" (SaaS) for games.
I'm not advertising this company, nor have I used them, so I don't even know how good or bad they are.
However, they explain very clearly the reasons for and the benefits of using WebSockets with REST.
Have a read on their blog.
REST requires stateless communication according to the statelessness constraint; WebSockets is a stateful protocol, so it is not possible.

sending files from node.js to node.js server

What's the best way to send large files from one node.js server to another? We tried to encode it with base64 and send it over an already existing TLS socket connection, but the base64 string is too long, so the socket splits it into several parts. We also thought about sending it via HTTP methods, but that doesn't seem like the best way for us. Any ideas?
Unless there are special requirements, I'd use HTTP. HTTP clients and servers are both available and rather mature in node.js, and HTTP gives you additional features (e.g. caching, optimistic transactional behaviour, content negotiation, partial requests, etc.).
Don't roll your own protocol based on plain sockets, you are reinventing the wheel. But you might consider other protocols such as FTP as well.
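A minimal sketch of that HTTP approach between two Node processes (the port and file paths are placeholders): the sender streams the file as the request body, the receiver streams it to disk, and there is no base64 inflation or manual chunk handling:

```typescript
import * as http from "http";
import * as fs from "fs";

// Receiving server: stream the incoming request body straight to disk.
http.createServer((req, res) => {
  req.pipe(fs.createWriteStream("/tmp/upload.bin"))
     .on("finish", () => { res.statusCode = 200; res.end("ok"); });
}).listen(3000);

// Sending side: stream the file as the request body (chunked transfer encoding).
const upload = http.request(
  { host: "localhost", port: 3000, method: "POST", path: "/upload" },
  (res) => console.log("upload status:", res.statusCode)
);
fs.createReadStream("/path/to/large-file.bin").pipe(upload);
```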

Should I connect directly to CouchDB's socket and pass HTTP requests or use node.js as a proxy?

First, here's my original question that spawned all of this.
I'm using Appcelerator Titanium to develop an iPhone app (eventually Android too). I'm connecting to CouchDB's port directly by using Titanium's Titanium.Network.TCPSocket object. I believe it utilizes the Apple SDK's CFSocket/NSStream class.
Once connected, I simply write:
'GET /mydb/_changes?filter=app/myfilter&feed=continuous&gameid=4&heartbeat=30000 HTTP/1.1\r\n\r\n'
directly to the socket. It keeps it open "forever" and returns JSON data whenever the db is updated and matches the filter and change request. Cool.
I'm wondering, is it ok to connect directly to CouchDB's socket like this, or would I be better off opening the socket to node.js instead, and maybe using this CouchDB node.js module to handle the CouchDB proxy through node.js?
My main concern is performance. I just don't have enough experience with CouchDB to know if hitting its socket and passing faux HTTP requests directly is good practice or not. Looking for experience and opinions on any ramifications or alternate suggestions.
It's me again. :-)
CouchDB inherits super concurrency handling from Erlang, the language it was written in. Erlang uses lightweight processes and message passing between those processes to achieve excellent performance under high concurrent load. It will take advantage of all cpu cores, too.
Nodejs runs a single process and basically only does one thing at a time within that process. Its event-based, non-blocking IO approach does allow it to multitask while it waits for chunks of IO but it still only does one thing at a time.
Both should easily handle tens of thousands of connections, but I would expect CouchDB to handle concurrency better (and with less effort on your part) than Node. And keep in mind that Node adds some latency if you put it in front of CouchDB. That may only be noticeable if you have them on different machines, though.
Writing directly to Couch via TCPSocket is a-OK as long as you write a well-formed HTTP request that follows the spec. (You're not passing a faux request; that's a real HTTP request you're sending, just like any other.)
Note: HTTP 1.1 does require you to include a Host header in the request, so you'll need to correct your code to reflect that, OR just use HTTP 1.0, which doesn't require it, to keep things simple. (I'm curious why you're not using Titanium.Network.HTTPClient. Does it only give you the request body after the request finishes or something?)
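Concretely, the corrected request would look something like this (the host value is a placeholder for whatever your CouchDB server expects; the rest is taken from the question):

```typescript
// HTTP/1.1 needs a Host header and CRLF line endings; the blank line ends the headers.
const request =
  "GET /mydb/_changes?filter=app/myfilter&feed=continuous&gameid=4&heartbeat=30000 HTTP/1.1\r\n" +
  "Host: couch.example.com:5984\r\n" +   // placeholder host/port
  "\r\n";
// socket.write(request);  // send it however your TCPSocket wrapper sends data
```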
Anyway, CouchDB can totally handle direct connections and--unless you put a lot of effort into your Node proxy--it's probably going to give users a better experience when you have 100k of them playing the game at once.
EDIT: If you use Node, write an actual HTTP proxy. That will run a lot faster than using the module you provided and will be simpler to implement. (Rather than defining your own API that then makes requests to Couch, you can just pass certain requests on to CouchDB and block others, say, for security reasons; see the sketch below.)
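A bare-bones version of such a proxy (TypeScript on Node; the ports and the filtering rule are placeholders) just forwards allowed requests to CouchDB and streams the response back:

```typescript
import * as http from "http";

// Forward allowed requests to CouchDB (assumed on localhost:5984), block the rest.
http.createServer((clientReq, clientRes) => {
  if (!clientReq.url?.startsWith("/mydb/")) {      // example security rule
    clientRes.statusCode = 403;
    clientRes.end("forbidden");
    return;
  }
  const upstream = http.request(
    { host: "127.0.0.1", port: 5984, path: clientReq.url,
      method: clientReq.method, headers: clientReq.headers },
    (couchRes) => {
      clientRes.writeHead(couchRes.statusCode ?? 502, couchRes.headers);
      couchRes.pipe(clientRes);               // stream CouchDB's response back
    }
  );
  clientReq.pipe(upstream);                   // stream the client's body to CouchDB
}).listen(8080);
```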
Also take a look at how "multinode" works:
http://www.sitepen.com/blog/2010/07/14/multi-node-concurrent-nodejs-http-server/