Okay guys, I have followed these answers:
413 Request Entity Too Large
I added client_max_body_size 20M in nginx.conf and also in the file under the conf.d folder that holds my proxy configuration. I added it in the http, server and location blocks. I use Play! Framework as my gateway.
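Roughly, the relevant parts of my nginx config now look like this (the upstream address is just an example from my setup):

    http {
        client_max_body_size 20M;

        server {
            listen 80;
            client_max_body_size 20M;

            location / {
                client_max_body_size 20M;
                proxy_pass http://127.0.0.1:9000;   # Play app
            }
        }
    }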
But I still get the Request Entity Too Large error. Do you have any ideas or suggestions, or a link I can follow?
Thanks
In addition to the web servers in front of Play, which it sounds like you have already configured, Play itself has maximum request content-length limits, documented here: https://www.playframework.com/documentation/2.5.x/JavaBodyParsers#Content-length-limits
Most of the built in body parsers buffer the body in memory, and some
buffer it on disk. If the buffering was unbounded, this would open up
a potential vulnerability to malicious or careless use of the
application. For this reason, Play has two configured buffer limits,
one for in memory buffering, and one for disk buffering.
The memory buffer limit is configured using
play.http.parser.maxMemoryBuffer, and defaults to 100KB, while the
disk buffer limit is configured using play.http.parser.maxDiskBuffer,
and defaults to 10MB. These can both be configured in
application.conf, for example, to increase the memory buffer limit to
256KB:
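    # application.conf
    play.http.parser.maxMemoryBuffer = 256K

In your case the limit would presumably need to be at least as large as the 20M you allow in nginx, e.g. play.http.parser.maxMemoryBuffer = 20MB (and play.http.parser.maxDiskBuffer for parsers that buffer to disk).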
Depending on the situation, you may want to be careful with increasing this limit too much -- if you have untrusted clients they may be able to overload your server by sending lots of very large requests in a short space of time. This may cause your server to crash with an OutOfMemoryError, leading to a denial of service attack.
Question
When sending a large http request (e.g. 100MB) to Spring Cloud Gateway, does it read the complete request into memory before forwarding it to the downstream service?
Assumption/Guess
From the memory consumption and timing it seems to work this way, but I cannot find any information about this in the documentation. Can anyone confirm whether the assumption is correct?
Are there workarounds/solutions?
If the assumption is correct: is it possible to make Spring Cloud Gateway "stream" the request immediately after the route has been determined (e.g. after reading the headers)? Reading the complete request into memory can quickly become a bottleneck when multiple "big" requests come in at the same time. Or are there other recommended workarounds for this issue?
Thanks for the fast response and hint, spencergibb!
Actually, in our case it was the RetryFilter that was causing the effect we were observing. After deactivating it, streaming works fine, and the memory constraint is even mentioned in the documentation of the RetryFilter.
According to the documentation I expected the filter to be applied only to GET requests by default, but maybe that was a misunderstanding. Removing the filter definitely solved the memory issues for us.
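For anyone running into the same thing, our route definition looked roughly like this (the route id, path and downstream URI are just placeholders); deleting the Retry entry from the filters list is what restored streaming for us:

    spring:
      cloud:
        gateway:
          routes:
          - id: upload-route                 # placeholder
            uri: http://downstream:8080      # placeholder downstream service
            predicates:
            - Path=/upload/**
            filters:
            - name: Retry                    # buffers the request body so it can be replayed
              args:
                retries: 3
                methods: GET,POST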
While designing REST APIs I from time to time face the challenge of dealing with batch operations (e.g. deleting or updating many entities at once) to reduce the overhead of many TCP client connections. In each particular situation the problem is usually solved by adding a custom API method for the specific operation (e.g. POST /files/batchDelete which accepts the ids in the request body), which doesn't look pretty from the point of view of REST API design principles but does the job.
But a general solution to the problem is still desirable to me. Recently I found Google Cloud Storage's JSON API batching documentation, which looks like a pretty general solution to me; a similar format could be used for any HTTP API, not just Google Cloud Storage. So my question is: does anybody know of some kind of general standard (a standard or draft, guideline, community effort or so) for combining multiple API calls into one HTTP request?
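To illustrate, the format described there is roughly a multipart/mixed request where every part wraps an ordinary HTTP request (host, boundary and paths here are just made up):

    POST /batch HTTP/1.1
    Host: api.example.com
    Content-Type: multipart/mixed; boundary=batch_boundary

    --batch_boundary
    Content-Type: application/http

    DELETE /files/1 HTTP/1.1

    --batch_boundary
    Content-Type: application/http

    DELETE /files/2 HTTP/1.1

    --batch_boundary--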
I'm aware of the capabilities of HTTP/2, which include using a single TCP connection for many HTTP requests, but my question is aimed at the application level. In my opinion that still makes sense, because despite the availability of HTTP/2, handling this at the application level seems like the only way to guarantee it for any client, including HTTP/1 clients, and HTTP/1 is currently the most used version of HTTP.
TL;DR
Neither REST nor HTTP is ideal for batch operations.
Usually caching, which is one of REST's constraints and is not optional but mandatory, prevents batch processing in some form.
It might be beneficial not to expose the data you want to update or remove in batch as resources of their own, but as data elements within a single resource, like a data table in an HTML page. There, updating or removing all or some of the entries should be straightforward.
If the system in general is write-intensive it is probably better to think of other solutions such as exposing the DB directly to those clients to spare a further level of indirection and complexity.
Utilizing caching may take a lot of workload off the server and even spare unnecessary connections.
To start with, neither REST nor HTTP is ideal for batch operations. As Jim Webber pointed out, the application domain of HTTP is the transfer of documents over the Web. This is what HTTP does and what it is good at. Any business rules we conclude are just a side effect of that document management, and we have to come up with solutions to turn these document-management side effects into something useful.
As REST is just a generalization of the concepts used in the browsable Web, it is no miracle that the same concepts that apply to Web development also apply to REST development in some form. A question like "how should something be done in REST" therefore usually comes down to answering how it would be done on the Web.
As mentioned before, HTTP isn't ideal in terms of batch processing actions. Sure, a GET request may retrieve multiple results, though in reality you obtain one response containing links to further resources. The creation of resources has, according to the HTTP specification, to be indicated with a Location header that points to the newly created resource. POST is defined as an all-purpose method that allows tasks to be performed according to server-specific semantics, so you could basically use it to create multiple resources at once. However, the HTTP spec clearly lacks support for indicating the creation of multiple resources at once, as the Location header may only appear once per response and may only define a single URI. So how can a server indicate the creation of multiple resources to the client?
A further indication that HTTP isn't ideal for batch processing is that a URI must reference a single resource. That resource may change over time, though the URI can't ever point to multiple resources at once. The URI itself is, more or less, used as key by caches which store a cacheable response representation for that URI. As a URI may only ever reference one single resource, a cache will also only ever store the representation of one resource for that URI. A cache will invalidate a stored representation for a URI if an unsafe operation is performed on that URI. In case of a DELETE operation, which is by nature unsafe, the representation for the URI the DELETE is performed on will be removed. If you now "redirect" the DELETE operation to remove multiple backing resources at once, how should a cache take notice of that? It only operates on the URI invoked. Hence even when you delete multiple resources in one go via DELETE a cache might still serve clients with outdated information as it simply didn't take notice of the removal yet and its freshness value would still indicate a fresh-enough state. Unless you disable caching by default, which somehow violates one of REST's constraints, or reduce the time period a representation is considered fresh enough to a very low value, clients will probably get served with outdated information. You could of course perform an unsafe operation on each of these URIs then to "clear" the cache, though in that case you could have invoked the DELETE operation on each resource you wanted to batch delete itself to start with.
It gets a bit easier, though, if the batch of data you want to remove is not explicitly captured via resources of its own but as data of a single resource. Think of a data table on a Web page where you have certain form elements, such as a checkbox you can tick to mark an entry as a delete candidate, and then, after clicking the submit button, the selected elements are sent to the server, which performs the removal of these items. Here only the state of one resource is updated, and thus a simple POST, PUT or even PATCH operation can be performed on that resource's URI. This also goes well with caching, as outlined before: only one resource has to be altered, and the use of unsafe operations on its URI will automatically invalidate any stored representation for it.
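As a rough sketch (field names and the target URI are made up), such a page could expose batch removal as an ordinary form posting back to the one collection resource:

    <form action="/files" method="POST">
      <!-- each checkbox marks one entry as a delete candidate -->
      <input type="checkbox" name="delete" value="1"> report.pdf<br>
      <input type="checkbox" name="delete" value="2"> logo.png<br>
      <input type="checkbox" name="delete" value="3"> notes.txt<br>
      <!-- the server updates the collection's state by removing the selected entries -->
      <button type="submit" name="action" value="delete">Remove selected</button>
    </form>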
The above-mentioned usage of form elements to mark certain entries for removal depends, however, on the media type in use. In the case of HTML, its forms section specifies the available components and their affordances. An affordance is the knowledge of what you can and should do with certain objects, i.e. a button or link wants to be pushed, a text field may expect numeric or alphanumeric input which may further be length-limited, and so on. Other media types, such as hal-forms, halform or ion, attempt to provide form representations and components for a JSON-based notation; however, support for such media types is still quite limited.
As one of your concerns is the number of client connections to your service, I assume you have a write-intensive scenario, as in read-intensive cases caching would probably take away a good chunk of load from your server. E.g. the BBC once reported that they could reduce the load on their servers drastically just by introducing a one-minute caching interval for recently requested resources. This mainly affected their start page and the linked articles, as people clicked on the latest news more often than on old news. Receiving a couple of thousand, if not hundreds of thousands of, requests per minute, they could, as mentioned before, significantly reduce the number of requests actually reaching the server and therefore take a huge load off their servers.
Write-intensive use cases, however, can't benefit from caching as much as read-intensive ones, as the cache would get invalidated quite often and the actual requests would be forwarded to the server for processing. If the API is more or less used to perform CRUD operations, as so many "REST" APIs do in reality, it is questionable whether it wouldn't be preferable to expose the database directly to the clients. Almost all modern database vendors ship with sophisticated user-rights management options and allow views to be created that can be exposed to certain users. The "REST API" on top of it basically just adds a further level of indirection and complexity in such a case. By exposing the DB directly, performing batch updates or deletions shouldn't be an issue at all, as support for such operations should already be built into the DB layer through the respective query languages.
In regards to the number of connections clients create: HTTP/1.0 already allows the reuse of connections via the Connection: keep-alive header directive. In HTTP/1.1 persistent connections are used by default unless explicitly requested to close via the respective Connection: close header directive. HTTP/2 introduced full-duplex connections that allow many streams, and therefore many requests, to reuse the same connection at the same time. This is more or less a fix for the connection limitation suggested in RFC 2616, which plenty of Web developers worked around by using CDNs and similar techniques. Currently most implementations use a maximum of 100 concurrent streams, and therefore simultaneous downloads, per connection, AFAIK.
Usually opening and closing a connection takes a bit of time and server resources, and the more open connections a server has to deal with, the more the system may suffer, though open connections with hardly any traffic aren't a big issue for most servers. While connection creation was usually considered the costly part, with persistent connections that factor has now moved towards the number of requests issued, hence the desire to send batch requests, which HTTP is not really made for. Again, as mentioned throughout this post, through smart utilization of caching plenty of requests may never reach the server at all, which is probably one of the best optimization strategies for reducing the number of simultaneous requests. Probably the best advice in such a case is to have a look at which kinds of resources are requested frequently, which requests take up a lot of processing capacity, and which ones can easily be answered from a cache.
reduce overhead of many tcp client connections
If this is the crux of the issue, the easiest way to solve it is to switch to HTTP/2.
In a way, HTTP/2 does exactly what you want. You open one connection, and over that connection you can send many HTTP requests in parallel. Unlike batching in a single HTTP request, it's mostly transparent to clients, and requests and responses can be processed out of order.
Ultimately batching multiple operations in a single HTTP request is always a network hack.
HTTP/2 is widely available. If HTTP/1.1 is still the most used version (this might be true, but the gap is closing), that has more to do with servers not yet being set up for it than with clients.
Should we use HTTP caching only for static stuff?
Or could API responses also use caching headers even if the data from the API is not static and can be changed by the application's users?
Caching is needed to gain performance but at the same time it increases the likelihood of the data being outdated. It's true for static resources as well. So if your app is under high load and you want to increase the speed - you may sacrifice up-to-date data for gain in performance.
Note, though, that the client side needs to respect caching headers. We often work with browsers, which have it all figured out, but if your client is another service, then you need to ensure that it doesn't ignore the headers. This won't come for free: code will need to be written for it to happen.
Your cache may also be public or private. If it's public (any client is allowed to see the content), you may configure a reverse proxy (like nginx) between your server and the clients. Nginx can be set up to cache results (it also understands cache headers). So it may take off some load from your application by not letting requests through and instead returning cached copies.
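As a rough sketch (zone name, sizes and the upstream address are placeholders): the API sends normal caching headers, e.g. Cache-Control: public, max-age=60, and nginx honours them by default once proxy caching is enabled:

    # in the http {} block
    proxy_cache_path /var/cache/nginx keys_zone=api_cache:10m max_size=1g;

    server {
        listen 80;

        location /api/ {
            proxy_cache api_cache;                 # serve cached copies while they are fresh
            proxy_pass  http://127.0.0.1:8080;     # placeholder backend
        }
    }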
I have a setup where I would like to have extremely aggressive HTTP caching on my internal proxy. Basically, what I want to achieve is a simplistic caching strategy like this:
any GET request that does not return a 5xx or 4xx gets cached indefinitely
any PUT or POST or DELETE or PATCH that does not return a 5xx or 4xx invalidates the resource and its subpaths (since I only use nested resources and I use them a lot).
I don't plan to have a ridiculous number of subpaths either (around 1000 per root-level resource, and less and less drilling down obviously).
So basically I want to avoid the absolute most of the requests even touching my core app.
I plan to run the caching backend on a separate machine with lots of RAM and evil storage, and there is going to be one such machine (so I don't have to expire across a cluster or anything like that).
Which proxy cache would be better for this task? Varnish or HAProxy? What are the settings that I should look for to achieve this kind of expiry? Is this a common pattern to make REST servers caching-friendly?
HAproxy is only a load balancer. It will not do any caching for you.
Varnish is a good choice for the case you describe. As for the configuration, you are best off sending the caching details (TTL/expiry time and cacheability) from your backend; these will instruct Varnish on how to handle caching for the document.
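A minimal sketch of that idea in VCL (4.0 syntax, backend address is a placeholder): GET responses are cached for as long as the backend's Cache-Control headers say, while unsafe methods ban the resource and everything below it, matching the nested-resource layout you describe:

    vcl 4.0;

    backend default {
        .host = "127.0.0.1";    # the REST app
        .port = "8080";
    }

    sub vcl_recv {
        if (req.method == "PUT" || req.method == "POST" ||
            req.method == "DELETE" || req.method == "PATCH") {
            # invalidate the resource and its subpaths (req.url is used as a regex prefix,
            # so special characters in URLs would need escaping in a real setup)
            ban("req.url ~ ^" + req.url);
            return (pass);
        }
    }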
This might be something more suited for Server Fault, but many web developers who only come here will probably benefit from possible answers to this question.
The question is: How do you effectively protect yourself against Denial Of Service attacks against your webserver?
I asked myself this after reading this article
For those not familiar, here's what I remember about it: a DoS attack will attempt to occupy all your connections by repeatedly sending bogus headers to your server.
By doing so, your server will reach the limit of possible simultaneous connections, and as a result normal users can't access your site anymore.
Wikipedia provides some more info: http://en.wikipedia.org/wiki/Denial_of_service
There's no panacea, but you can make DoS attacks more difficult by doing some of the following:
Don't (or limit your willingness to) do expensive operations on behalf of unauthenticated clients
Throttle authentication attempts
Throttle operations performed on behalf of each authenticated client, and place their account on a temporary lockout if they do too many things in too short a time
Have a similar global throttle for all unauthenticated clients, and be prepared to lower this setting if you detect an attack in progress
Have a flag you can use during an attack to disable all unauthenticated access
Don't store things on behalf of unauthenticated clients, and use a quota to limit the storage for each authenticated client
In general, reject all malformed, unreasonably complicated, or unreasonably huge requests as quickly as possible (and log them to aid in detection of an attack)
Don't use a pure LRU cache if requests from unauthenticated clients can result in evicting things from that cache, because you will be subject to cache poisoning attacks (where a malicious client asks for lots of different infrequently used things, causing you to evict all the useful things from your cache and need to do much more work to serve your legitimate clients)
Remember, it's important to outright reject throttled requests (for example, with an HTTP 503: Service Unavailable response or a similar response appropriate to whatever protocol you are using) rather than queueing throttled requests. If you queue them, the queue will just eat up all your memory and the DoS attack will be at least as effective as it would have been without the throttling.
Some more specific advice for the HTTP servers:
Make sure your web server is configured to reject POST messages without an accompanying Content-Length header, to reject requests (and throttle the offending client) that exceed the stated Content-Length, and to reject requests whose Content-Length is unreasonably large for the service that the POST (or PUT) is aimed at.
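As an illustration of those points (zone name, limits and the upstream address are arbitrary), nginx can enforce body-size limits, slow-client timeouts and per-client throttling before anything reaches the application, rejecting the excess with a 503 instead of queueing it:

    # in the http {} block: one shared zone keyed by client IP, 10 requests/second
    limit_req_zone $binary_remote_addr zone=perip:10m rate=10r/s;

    server {
        client_max_body_size  1m;      # reject unreasonably large bodies (413)
        client_body_timeout   10s;     # drop clients that send the body too slowly
        client_header_timeout 10s;     # drop clients that trickle bogus headers

        location / {
            limit_req zone=perip burst=20 nodelay;   # allow short bursts, reject the rest outright
            limit_req_status 503;                    # respond with 503 Service Unavailable
            proxy_pass http://127.0.0.1:8080;        # placeholder backend
        }
    }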
For this specific attack (as long as the request is a GET), a load balancer or a WAF which only passes complete requests on to the webserver would work.
The problems start when POST is used instead of GET (which is easy), because you can't know whether it is a malicious POST or just a really slow upload from a user.
From DoS per se you can't really protect your webapp, for a simple reason: your resources are limited, while the attacker potentially has unlimited time and resources to perform the DoS. And most of the time it's cheap for the attacker to perform the required steps, e.g. the attack mentioned above needs only a few hundred slow-running connections - no problem.
Asynchronous servers, for one, are more or less immune to this particular form of attack. I for instance serve my Django apps using an Nginx reverse proxy, and the attack didn't seem to affect its operation whatsoever. Another popular asynchronous server is lighttpd.
Mind you, this attack is dangerous because it can be performed even by a single machine with a slow connection. However, common DDoS attacks pit your server against an army of machines, and there's little you can do to protect yourself from them.
Short answer:
You cannot protect yourself against a DoS.
And I don't agree that it belongs on Server Fault, since DoS is categorized as a security issue and is definitely related to programming.