HTTP/3 and its impact (QUIC)

Recently Chrome, Firefox, cURL, etc. announced their support for HTTP/3 (earlier termed HTTP-over-QUIC).
How do you see its adoption impacting things, from the perspective of changes in:
Applications (web-based, mobile, pure socket-based, etc.)
Hosting infrastructure (web/app servers, firewalls, load balancers, CDNs, routers, switches, etc.) and ISPs
Security (new threats, vulnerabilities, the landscape of VAPT tools, etc.)
Congestion control

A very subjective question, so I'm not sure it's a great fit here, but here are my two cents:
It shouldn't make much further difference beyond what HTTP/2 did. It closes one edge case of HTTP/2 (lost packets can make HTTP/2 slower than HTTP/1.1, because of TCP head-of-line blocking) and also potentially brings some performance improvements to the initial connection setup. If you've not moved to, or optimised for, HTTP/2, you may wish to consider that in preparation. Prioritisation is also due for a rethink in HTTP/3, but how hasn't been decided yet. At the end of the day it's a transport-layer change and the basic semantics of HTTP/2 don't change, so to higher-level apps it should be fairly seamless - much as HTTP/2 was to most HTTP/1.1 users.
It's UDP-based (with fallback to TCP-based HTTP/2 and/or HTTP/1.1), which will be fun to enable and set up! TLS libraries also need to change to support it, so it could be a while before we see it in some servers. CDNs are already leading the way and will be the easiest way of getting this. Like HTTP/2, it's probably most important to have it on your edge server and then downgrade to HTTP/2 or even HTTP/1.1 for internal traffic beyond that. It's also more fully encrypted, which will make it more difficult to sniff and reroute traffic, as less information is readable to middleboxes than with TCP.
See answer 2 above: the fuller encryption will make it more difficult to sniff traffic. Also, it's very new (not even fully finished and approved at the time of writing), so there may be implementation bugs, as there were for HTTP/2, and some products will not support it initially. On the plus side, it's only available over HTTPS, which is good for security.
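To make the discovery-and-fallback mechanism from the second point above concrete, here is a minimal sketch (not from the original answers): the response still arrives over TCP, but an Alt-Svc header tells capable clients they may retry the same origin over QUIC on UDP/443, while everyone else simply stays on HTTP/2 or HTTP/1.1. The handler, port, and paths below are hypothetical; real deployments set this header in the web server or CDN configuration.

```python
# Minimal sketch: advertising HTTP/3 via the Alt-Svc response header.
# This plain HTTP/1.1 handler (hypothetical port and paths) only illustrates
# the discovery mechanism: clients that understand Alt-Svc may retry the
# origin over QUIC on UDP/443; everyone else stays on TCP.
from http.server import BaseHTTPRequestHandler, HTTPServer

class AltSvcHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = b"served over TCP; HTTP/3 advertised via Alt-Svc\n"
        self.send_response(200)
        # "h3" = HTTP/3 on UDP port 443; cache the advertisement for a day.
        self.send_header("Alt-Svc", 'h3=":443"; ma=86400')
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    HTTPServer(("0.0.0.0", 8080), AltSvcHandler).serve_forever()
```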

Related

REST Calls with HTTP/2.0 over HTTP/1.1

I was reading about HTTP/2.0 as the new HTTP protocol and its advantages, such as binary framing and multiplexing. But I would like to know whether, for REST calls, migrating from HTTP/1.1 to HTTP/2.0 provides any reasonable advantage. I am not able to find any specific gain for RESTful calls with HTTP/2.0.
Thanks in advance.
It does, in several ways:
Better support for streaming transfers. The conventional alternative is a combination of HTTP/1.1's chunked transfer encoding, Connection-header voodoo, and the willingness (or not) of each of the parties to implement HTTP pipelining (which, e.g., curl does not enable by default). In my experience it takes a lot more work to get those three to play well together than to just use HTTP/2. There is no need for chunked transfer encoding with HTTP/2, and the protocol does not support it.
With HTTP/2, you can have many requests in flight, REST or not, with zero time spent establishing new connections. This is a blessing both for the browser and for the server, which has to allocate fewer file descriptors per client.
Header compression also applies to HTTP/2 REST requests, together with the associated bandwidth reduction.
So, if in doubt, always go for HTTP/2. There are also excellent tools out there for developing HTTP/2 applications. Some, like ShimmerCat, even remove the drudgery of setting up certificates and DNS aliases, so that starting with HTTP/2 from day one becomes a no-brainer.
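As a small illustration of the multiplexing and header-compression points above (not part of the original answer), here is a sketch using the Python httpx client, which supports HTTP/2; the endpoint URL is a placeholder.

```python
# Sketch: many REST calls multiplexed over a single HTTP/2 connection.
# Requires the third-party httpx client: pip install "httpx[http2]"
# The URL below is a placeholder for any HTTP/2-capable REST endpoint.
import asyncio
import httpx

async def main():
    async with httpx.AsyncClient(http2=True) as client:
        # All requests share one TCP+TLS connection; HTTP/2 multiplexing keeps
        # them in flight concurrently, and HPACK compresses the repeated headers.
        responses = await asyncio.gather(
            *(client.get(f"https://api.example.com/items/{i}") for i in range(20))
        )
        for resp in responses:
            print(resp.http_version, resp.status_code, resp.url)

asyncio.run(main())
```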

What can go wrong if we do NOT follow RESTful best practices?

TL;DR: scroll down to the last paragraph.
There is a lot of talk about best practices when defining RESTful APIs: what HTTP methods to support, which HTTP method to use in each case, which HTTP status code to return, when to pass parameters in the query string vs. in the path vs. in the content body vs. in the headers, how to do versioning, result set limiting, pagination, etc.
If you are already determined to make use of best practices, there are lots of questions and answers out there about what is the best practice for doing any given thing. Unfortunately, there appears to be no question (nor answer) as to why use best practices in the first place.
Most of the best practice guidelines direct developers to follow the principle of least surprise, which, under normal circumstances, would be a good enough reason to follow them. Unfortunately, REST-over-HTTP is a capricious standard, the best practices of which are impossible to implement without becoming intimately involved with it, and the drawback of intimate involvement is that you tend to end up with your application being very tightly bound to a particular transport mechanism. So, some people (like me) are debating whether the benefit of "least surprise" justifies the drawback of littering the application with REST-over-HTTP concerns.
A different approach examined as an alternative to best practices suggests that our involvement with HTTP should be limited to the bare minimum necessary in order to get an application-defined payload from point A to point B. According to this approach, you only use a single REST entry point URL in your entire application, you never use any HTTP method other than HTTP POST, never return any HTTP status code other than HTTP 200 OK, and never pass any parameter in any way other than within the application-specific payload of the request. The request will either fail to be delivered, in which case it is the responsibility of the web server to return an "HTTP 404 Not Found" to the client, or it will be successfully delivered, in which case the delivery of the request was "HTTP 200 OK" as far as the transport protocol is concerned, and anything else that might go wrong from that point on is exclusively an application concern, and none of the transport protocol's business. Obviously, this approach is kind of like saying "let me show you where to stick your best practices".
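To make the approach just described concrete, here is a rough sketch (illustrative only, not the question author's code; Flask and the /api path are assumptions) of the single-entry-point, POST-only, always-200 style:

```python
# Illustrative sketch of the "single entry point, POST only, always 200 OK"
# style described above. Flask and the /api path are assumptions.
from flask import Flask, jsonify, request

app = Flask(__name__)

@app.route("/api", methods=["POST"])
def api():
    payload = request.get_json(force=True)
    # The "verb" lives inside the payload, not in the HTTP method or path.
    if payload.get("action") == "getOrder":
        return jsonify({"status": "ok", "order": {"id": payload.get("orderId")}})
    # An application-level failure is still HTTP 200 as far as the transport
    # is concerned; the error travels inside the body.
    return jsonify({"status": "error", "error": "UnknownAction"})

if __name__ == "__main__":
    app.run()
```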
Now, there are other voices that say that things are not that simple, and that if you do not follow the RESTful best practices, things will break.
The story goes that, for example, in the event of unauthorized access you should return an actual "HTTP 401 Unauthorized" (instead of a successful response containing a JSON-serialized UnauthorizedException), because upon receiving the 401 the browser will prompt the user for credentials. Of course this does not really hold any water, because REST requests are not issued by browsers being used by human users.
Another, more sophisticated way the story goes is that usually, between the client and the server exist proxies, and these proxies inspect HTTP requests and responses, and try to make sense out of them, so as to handle different requests differently. For example, they say, somewhere between the client and the server there may be a caching proxy, which may treat all requests to the exact same URL as identical and therefore cacheable. So, path parameters are necessary to differentiate between different resources, otherwise the caching proxy might only ever forward a request to the server once, and return cached responses to all clients thereafter. Furthermore, this caching proxy may need to know that a certain request-response exchange resulted in a failure due to a particular error such as "Permission Denied", so as to again not cache the response, otherwise a request resulting in a temporary error may be answered with a cached error response forever.
So, my questions are:
Besides "familiarity" and "least surprise", what other good reasons are there for following REST best practices? Are these concerns about proxies real? Are caching proxies really so dumb as to cache REST responses? Is it hard to configure the proxies to behave in less dumb ways? Are there drawbacks in configuring the proxies to behave in less dumb ways?
It's worth considering that what you're suggesting is the way that HTTP APIs used to be designed for a good 15 years or so. API designers are tending to move away from that approach these days. They really do have their reasons.
Some points to consider if you want to avoid using ReST over HTTP:
ReST over HTTP is an efficient use of the HTTP/S transport mechanism. Avoiding the ReST paradigm runs the risk of every request / response being wrapped in verbose envelopes. SOAP is an example of this.
ReST encourages client and server decoupling by putting application semantics into standard mechanisms - HTTP and XML/JSON (or other data formats). These protocols and standards are well supported by standard libraries and have been built up over years of experience. Sure, you can create your own 'unauthorized' response body with a 200 status code, but ReST frameworks just make it unnecessary, so why bother? (See the sketch after this list for the contrast.)
ReST is a design approach which encourages a view of your distributed system focused on data rather than functionality, and this has proven to be a useful mechanism for building distributed systems. Avoiding ReST runs the risk of focusing on very RPC-like mechanisms, which have some risks of their own:
they can become very fine-grained and 'chatty'
which can be an inefficient use of network bandwidth
which can tightly couple client and server, by introducing statefulness and temporal coupling between requests.
and can be difficult to scale horizontally
Note: there are times when an RPC approach is actually a better way of breaking down a distributed system than a resource-oriented approach, but they tend to be the exceptions rather than the rule.
existing tools for developers make debugging and investigation of ReSTful APIs easier. It's easy to use a browser to do a simple GET, for example, and tools such as Postman or RestClient already exist for more complex ReST-style queries. In extreme situations tcpdump is very useful, as are browser debugging tools such as Firebug. If every API call has application-layer semantics built on top of HTTP (e.g. special response types for particular error situations), then you immediately lose some of the value of this tooling. Building SOAP envelopes in Postman is a pain. As is reading SOAP response envelopes.
network infrastructure around caching really can be as dumb as you're asking. It's possible to get around this, but you really do have to think about it, and it will inevitably involve increased network traffic in some situations where it's unnecessary. And caching responses for repeated queries is one way in which APIs scale out, so you'll likely need to 'solve' the problem of caching repeated queries yourself (i.e. reinvent the wheel).
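For contrast with the POST-only sketch earlier, here is the conventional ReSTful counterpart (again only a sketch; Flask and the /orders resource are assumptions). The method, path, status code, and caching headers carry the semantics, so commodity proxies, caches, and tooling can act on them without understanding the application:

```python
# Sketch of the conventional ReSTful style: Flask and the /orders resource
# are assumptions; the point is that standard HTTP semantics do the work.
from flask import Flask, jsonify, make_response, request

app = Flask(__name__)
ORDERS = {"42": {"id": "42", "total": 19.95}}

@app.route("/orders/<order_id>", methods=["GET"])
def get_order(order_id):
    order = ORDERS.get(order_id)
    if order is None:
        # A real 404: clients, proxies, and tools understand it without
        # parsing an application-specific envelope.
        return jsonify({"error": "not found"}), 404
    resp = make_response(jsonify(order), 200)
    # Explicitly cacheable: a shared cache may answer repeated GETs itself.
    resp.headers["Cache-Control"] = "public, max-age=60"
    return resp

@app.route("/orders/<order_id>", methods=["PUT"])
def put_order(order_id):
    # PUT is idempotent by contract: retrying an unacknowledged update is safe.
    ORDERS[order_id] = request.get_json(force=True)
    return "", 204

if __name__ == "__main__":
    app.run()
```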
Having said all that, if you want to look into a pure message-passing design for your distributed system rather than a ReSTful one, why consider HTTP at all? Why not simply use some message-oriented middleware (e.g. RabbitMQ) to build your application, possibly with some sort of HTTP bridge somewhere for Internet-based clients? Using HTTP as a pure transport mechanism with simple 'message accepted / not accepted' semantics seems like overkill.
REST is intended for long-lived network-based applications that span multiple organizations. If you don’t see a need for the constraints, then don’t use them. -- Roy T Fielding
Unfortunately, there appears to be no question (nor answer) as to why use best practices in the first place.
When in doubt, go back to the source
Fielding's dissertation really does quite a good job at explaining how the REST architectural constraints ensure that you don't destroy the properties those constraints are designed to protect.
Keep in mind - before the web (which is the reference application for REST), "web scale" wasn't a thing; the notion of a generic client (the browser) that could discover and consume thousands of customized applications (provided by web servers) had not previously been realized.
According to this approach, you only use a single REST entry point URL in your entire application, you never use any HTTP method other than HTTP POST, never return any HTTP status code other than HTTP 200 OK, and never pass any parameter in any way other than within the application-specific payload of the request.
Yup - that's a thing; it's called RPC. You are effectively taking the web and stripping it down to a bare message-transport application that just happens to tunnel through port 80.
In doing so, you have stripped away the uniform interface -- you've lost the ability to use commodity parts in your deployment, because nobody can participate in the conversation unless they share the same interpretation of the message data.
Note: that doesn't at all imply that RPC is "broken"; architecture is about tradeoffs. The RPC approach gives up some of the value derived from the properties guarded by REST, but that doesn't mean it doesn't pick up value somewhere else. Horses for courses.
Besides "familiarity" and "least surprise", what other good reasons are there for following REST best practices?
Cheap scaling of reads - as your offering becomes more popular, you can serve more clients by installing a farm of commodity reverse proxies that will serve cached representations where available, and only put load on the origin server when no fresh representation is available (the conditional-GET sketch after these points illustrates the mechanism).
Prefetching - if you are adhering to the safety provisions of the interface, agents (and intermediaries) know that they can download representations at their own discretion without concern that the operators will be liable for loss of capital. AKA - your resources can be crawled (and cached)
Similarly, use of idempotent methods (where appropriate) communicates to agents (and intermediaries) that retrying the send of an unacknowledged message causes no harm (for instance, in the event of a network outage).
Independent innovation of clients and servers, especially across organizations. Mosaic is a museum piece, Netscape vanished long ago, but the web is still going strong.
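A small sketch of the caching and safe-method points above (illustrative only; httpx and the URL are assumptions): because GET is safe, agents and intermediaries can fetch freely, and validators such as ETag let them revalidate a cached copy instead of re-downloading it.

```python
# Sketch: safe methods plus validators let caches, crawlers, and clients
# reuse work. httpx and the URL are assumptions; any HTTP client works.
import httpx

url = "https://api.example.com/orders/42"

with httpx.Client() as client:
    first = client.get(url)            # safe: no side effects, freely repeatable
    etag = first.headers.get("ETag")

    # Revalidate instead of re-downloading: a 304 means the cached copy is
    # still fresh - exactly what intermediary caches do on clients' behalf.
    headers = {"If-None-Match": etag} if etag else {}
    second = client.get(url, headers=headers)
    print(second.status_code)          # 200, or 304 Not Modified
```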
Of course this does not really hold any water, because REST requests are not issued by browsers being used by human users.
Of course they are -- where do you think you are reading this answer?
So far, REST works really well at exposing capabilities to human agents; which is to say that the server side is so ubiquitous at this point that we hardly think about it any more. The notion that you -- the human operator -- can use the same application to order pizza, run diagnostics on your house, and remote start your car is as normal as air.
But you are absolutely right that replacing the human still seems a long ways off; there are various standards and media types for communicating semantic content of data -- the automated client can look at markup, identify a phone number element, and provide a customized array of menu options from it -- but building into agents the sorts of fuzzy intelligence needed to align offered capabilities with goals, or to recover from error conditions, seems to be a ways off.

How to run socket.io on my static site on AWS

If I have a static site on AWS S3 (maybe with CloudFront), that's pretty cool, because it scales easily, has zero-downtime deployments (you're just updating static assets), and gets distributed to edge locations, woohoo!
But if I wanted to have a little live-chat support feature using Socket.io on the contact page, how would I tell Amazon to deal with WebSockets? Could I use Route 53 to do something different with WebSocket requests to a particular domain, like redirect them to Lambda? (Lambda can't run socket.io, can it?)
Similar to your other question, the answer here involves the fact that DNS is not involved in path resolution, so Route 53 is not a factor in this question.
Socket.io is almost certainly going to require a server, and connecting through CloudFront seems unlikely to work.
Although I am not versed in socket.io's underlying transport protocol(s), I don't see a way around this. CloudFront is a reverse proxy that only supports proper, standard HTTP request/response behavior, which is not well suited to real-time, event-oriented operations. CloudFront does not support WebSockets. Socket.io may not strictly need them and may have the flexibility to fall back to a compatible behavior, but that will be suboptimal at best, if it works at all: even with long polling (which is inefficient), you're limited to under 30 seconds for a single response, because CloudFront has a fixed 30-second timeout that cannot be modified.
Similarly, Lambda functions accessed through API Gateway are suited only to handling a single HTTP request/response cycle, not anything persistent, and have no intrinsic mechanism for maintaining state across requests.
My assumption going in would be that you'd need one or more servers behind an ELB Classic Load Balancer with SSL, operating in TCP mode, on a subdomain of your site's domain, with browsers connecting to that back-end for the persistent connections.
Even if this answer is helpful, I'm honestly not certain that it is sufficiently helpful... so you may wish to hold off on accepting it, since someone may come along and offer an answer that delves more deeply into the internals of socket.io and how that is going to interoperate with CloudFront, if such interoperation is possible.
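For orientation, here is a rough sketch of the kind of small Socket.io backend that would run on EC2 behind the TCP-mode load balancer described above. It uses the python-socketio and eventlet packages purely as an illustration (a Node.js socket.io server is the more common choice); the event names and port are assumptions.

```python
# Sketch of a tiny chat backend for EC2 behind a TCP-mode load balancer.
# python-socketio + eventlet are illustrative choices; event names and the
# port are assumptions. Install with: pip install python-socketio eventlet
import eventlet
import socketio

# The static site is served from another origin (S3/CloudFront), so allow CORS.
sio = socketio.Server(cors_allowed_origins="*")
app = socketio.WSGIApp(sio)

@sio.event
def connect(sid, environ):
    print("client connected:", sid)

@sio.on("chat message")
def chat_message(sid, data):
    # Relay the message to all connected clients, e.g. visitors and support staff.
    sio.emit("chat message", data)

@sio.event
def disconnect(sid):
    print("client disconnected:", sid)

if __name__ == "__main__":
    # The load balancer forwards TCP/TLS traffic to this port.
    eventlet.wsgi.server(eventlet.listen(("0.0.0.0", 3000)), app)
```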

What's the difference between SIP/XMPP for web conferencing and file-sharing?

I want to set up a personal videoconferencing service for my family, friends and myself. The main problem I have with current options is that they are either closed-source and centralized (Google Hangouts, Skype) or open-source but not usable in corporate environments or hotels (due to strict firewall rules and the "Skype gets through; if you want VoIP, use that" kind of netadmin reaction).
I have two solutions then: either set up a STUN/TURN relay server and use XMPP and SIP as I used to (but that would require my friends to set that up too), or set up a whole VoIP server. Two options come to mind for the latter: SIP and XMPP. Though to my knowledge, each of them ultimately uses the (S)RTP/RTCP protocol for media.
And that's the problem. Beyond the specific signaling part each of them uses, I really can't figure out the difference between them or their typical use cases.
I think you're right in that, as far as setting up a video conferencing system goes, XMPP and SIP are equivalent. They are both signaling-only protocols, and the media sessions they set up typically use RTP (they can both be used to set up any kind of session you want, but RTP is the norm).
The biggest problem is also going to be the one you mention: getting video streams out of a corporate firewall. Skype overcomes this obstacle by sending its media over an SSL connection and is thus able to get through firewalls. Theoretically you could do the same with RTP, and in the past I once used OpenVPN connections with a SIP client to test some audio calls. My experience wasn't great, as the audio was very choppy, presumably a result of all the extra packaging required to get the high volume of small audio packets from one end to the other. That was nearly a decade ago, though, so perhaps with the better CPU and bandwidth resources available now it would work better.
Personally I think I'd stick with Skype, as setting up your own system is going to be a big hassle. If you were to go ahead with your own, the first option I would try would be Asterisk combined with OpenVPN, so that clients behind a firewall or with NAT issues could connect over the VPN.

Is Fiddler a Good Building Block for an Internet Filter?

So I want to make my own Internet filter. Don't worry, I'm not asking for a tutorial. I'm just wondering if Fiddler would make a good backbone for it. I'm a little worried because it seems there are a few things Fiddler can't always pick up - or that there are workarounds. So, my questions:
Would Fiddler grab all web data? I.e., chats, emails, websites, etc.
Are there any known workarounds?
Any other reasons not to use it?
Thanks!
I think you mean FiddlerCore rather than Fiddler. Fiddler(Core) is a web proxy, meaning it captures HTTP/HTTPS traffic; it won't capture traffic that uses other protocols (e.g. IRC). To capture traffic from other protocols, you'll need a lower-level interception point (e.g. a Windows Firewall filter), which will capture everything, but it will not be able to decrypt HTTPS traffic, and parsing/modifying the traffic will prove MUCH harder.
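FiddlerCore itself is a .NET library; purely to illustrate the HTTP-proxy interception model it relies on (and not as a substitute for FiddlerCore), here is what a filtering hook looks like in mitmproxy, a comparable Python-based intercepting proxy. The blocklist and script name are assumptions, and traffic that is not proxied HTTP/HTTPS never reaches this hook, which is exactly the limitation described above.

```python
# Illustration only: a mitmproxy addon (not FiddlerCore, which is .NET),
# showing the HTTP-proxy interception model. Run with: mitmdump -s filter_addon.py
# The blocklist below is hypothetical; non-HTTP(S) traffic never reaches this hook.
from mitmproxy import http

BLOCKED_HOSTS = {"blocked.example.com"}

def request(flow: http.HTTPFlow) -> None:
    if flow.request.pretty_host in BLOCKED_HOSTS:
        # Short-circuit with a synthetic response instead of forwarding upstream.
        flow.response = http.Response.make(
            403, b"Blocked by filter", {"Content-Type": "text/plain"}
        )
```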