GWT-RPC: hacked attempt on request payload - gwt

My test team try to hack on the system, they found out that GWT-RPC call returned a sensitive information (file name as emphasis as below) in response format "//EX" message. I'm amazed that I can't find any postings on this issue.
HTTP Request (Request payload):
7|0|5|http:/localhost:8080/Test_Web/|14B8AB60CF9C73722670313BAE18D294|abc|abc|abc|1|2|3|4|1|5|0|
HTTP Response:
//EX[2,1,["com.google.gwt.user.client.rpc.IncompatibleRemoteServiceException/3936916533","This application is out of date, please click the refresh button on your browser. ( Blocked attempt to access
interface 'abc', which is not implemented by 'com.testProject.client.customerClassService'; this is either misconfiguration or a hack attempt)"],0,7]
Specially the part that says "either misconfiguration or a hack attempt". In my case is hack attempt as HTTP Response, because the exception states that 'abc' is not implemented by 'com.testProject.client.customerClassService'.
Any ideas to hide the sensitive information (class name as emphasis) in the error message as above? I try with all browsers available it is not from the browser.

Name of your gwt-rpc service interface should not be considered sensitive information. That and all method names and parameters are sent using ajax on every gwt-rpc call... Its similar to rest service API resource names and crud operations you can perform on them.
The exception you got was caused by invalid name of gwt-rpc service interface/method/signature - the call was blocked. In this case its important to remember about server side validation of input parameters. You never know if the call was made by your app or was forged...

The 'sensitive' information gathered here is pretty minimal, so we have to assume that you have all other obfuscation turned on (removing class metadata, obfuscating rpc type names, and have otherwise gone over your own generated JS to ensure that no toString() ever returns its own classname).
With that being the case, it turns out that this is the standard RPC error that is sent out from com.google.gwt.user.server.rpc.RPC#decodeRequest(String, Class<?>, SerializationPolicyProvider) if the interface requested (apparently abc) is not implemented by this class. I'd be very surprised if there even was such an interface! It could be argued that this check shouldn't even happen if such an interface doesn't exist, but even doing that check would betray information about what classes don't exist on your server, which could also be considered 'sensitive'.
If this is a concern, my suggestion would be to prevent any IncompatibleRemoteServiceException from reaching the client. This would effectively prevent any debugging of the client, but just feed it a blank "Something went wrong from your bad request". There are a few legitimate cases where a client possibly should get info from exceptions like this, but from your perspective, that might still be sensitive. Without more detail in your question about exactly what sensitive means, this is hard to say.
With that said, here is how I would go about overriding this behavior: First, look at RemoveServiceServlet.processCall, which is where that failure would normally be handled by logging it to the user:
/**
* Process a call originating from the given request. This method calls
* {#link RemoteServiceServlet#checkPermutationStrongName()} to prevent
* possible XSRF attacks and then decodes the <code>payload</code> using
* {#link RPC#decodeRequest(String, Class, SerializationPolicyProvider)}
* to do the actual work.
* Once the request is decoded {#link RemoteServiceServlet#processCall(RPCRequest)}
* will be called.
* <p>
* Subclasses may optionally override this method to handle the payload in any
* way they desire (by routing the request to a framework component, for
* instance). The {#link HttpServletRequest} and {#link HttpServletResponse}
* can be accessed via the {#link #getThreadLocalRequest()} and
* {#link #getThreadLocalResponse()} methods.
* </p>
* This is public so that it can be unit tested easily without HTTP.
*
* #param payload the UTF-8 request payload
* #return a string which encodes either the method's return, a checked
* exception thrown by the method, or an
* {#link IncompatibleRemoteServiceException}
* #throws SerializationException if we cannot serialize the response
* #throws UnexpectedException if the invocation throws a checked exception
* that is not declared in the service method's signature
* #throws RuntimeException if the service method throws an unchecked
* exception (the exception will be the one thrown by the service)
*/
public String processCall(String payload) throws SerializationException {
// First, check for possible XSRF situation
checkPermutationStrongName();
RPCRequest rpcRequest;
try {
rpcRequest = RPC.decodeRequest(payload, delegate.getClass(), this);
} catch (IncompatibleRemoteServiceException ex) {
log(
"An IncompatibleRemoteServiceException was thrown while processing this call.",
ex);
return RPC.encodeResponseForFailedRequest(null, ex);
}
return processCall(rpcRequest);
}
Instead of catching the IncompatibleRemoteServiceException and just logging it out as-is, we want to write a nondescript "Something was wrong with your request, please submit to tech support" sort of answer. Do use the RPC.encodeResponseForFailedRequest or RPC.encodeResponseForFailure methods to make sure it is written as an exception that you client code and read and understand to be deliberately ambiguous.

Related

How call PUT through Jersy REST client with null entity

I want to call some "upgrade" REST API through Jersy client, which is declared as PUT and does not require any body content.
But when I request this API as below:
webTarget.request().put(Entity.json(null));
It gives error as below:
Entity must not be null for http method PUT.
So need to know, is there any way in jersy client to call PUT method with null Entity.
You can configure the client property
SUPPRESS_HTTP_COMPLIANCE_VALIDATION
By default, Jersey client runtime performs certain HTTP compliance checks (such as which HTTP methods can facilitate non-empty request entities etc.) in order to fail fast with an exception when user tries to establish a communication non-compliant with HTTP specification. Users who need to override these compliance checks and avoid the exceptions being thrown by Jersey client runtime for some reason, can set this property to true. As a result, the compliance issues will be merely reported in a log and no exceptions will be thrown.
[...]
Client client = ClientBuilder.newClient();
client.property(ClientProperties.SUPPRESS_HTTP_COMPLIANCE_VALIDATION, true);
WebTarget target = client.target("...");
Response response = target.request().put(Entity.json(null));
I ran into the same error, and solved it by using an empty text entity:
Entity<?> empty = Entity.text("");
webTarget.request().put(empty);
You can use webTarget.request().put(Entity.json(""));
It works for me.

Is it correct to return 404 when a REST resource is not found?

Let's say I have a simple (Jersey) REST resource as follows:
#Path("/foos")
public class MyRestlet extends BaseRestlet
{
#GET
#Path("/{fooId}")
#Produces(MediaType.APPLICATION_XML)
public Response getFoo(#PathParam("fooId") final String fooId)
throws IOException, ParseException
{
final Foo foo = fooService.getFoo(fooId);
if (foo != null)
{
return response.status(Response.Status.OK).entity(foo).build();
}
else
{
return Response.status(Response.Status.NOT_FOUND).build();
}
}
}
Based on the code above, is it correct to return a NOT_FOUND status (404), or should I be returning 204, or some other more appropriate code?
A 404 response in this case is pretty typical and easy for API users to consume.
One problem is that it is difficult for a client to tell if they got a 404 due to the particular entity not being found, or due to a structural problem in the URI. In your example, /foos/5 might return 404 because the foo with id=5 does not exist. However, /food/1 would return 404 even if foo with id=1 exists (because foos is misspelled). In other words, 404 means either a badly constructed URI or a reference to a non-existent resource.
Another problem arises when you have a URI that references multiple resources. With a simple 404 response, the client has no idea which of the referenced resources was not found.
Both of these problems can be partially mitigated by returning additional information in the response body to let the caller know exactly what was not found.
Yes, it is pretty common to return 404 for a resource not being found. Just like a web page, when it's not found, you get a 404. It's not just REST, but an HTTP standard.
Every resource should have a URL location. URLs don't need to be static, they can be templated. So it's possible for the actual requested URL to not have a resource. It is the server's duty to break down the URL from the template to look for the resource. If they resource doesn't exist, then it's "Not Found"
Here's from the HTTP 1.1 spec
404 Not Found
The server has not found anything matching the Request-URI. No indication is given of whether the condition is temporary or permanent. The 410 (Gone) status code SHOULD be used if the server knows, through some internally configurable mechanism, that an old resource is permanently unavailable and has no forwarding address. This status code is commonly used when the server does not wish to reveal exactly why the request has been refused, or when no other response is applicable.
Here's for 204
204 No Content
The server has fulfilled the request but does not need to return an entity-body, and might want to return updated metainformation. The response MAY include new or updated metainformation in the form of entity-headers, which if present SHOULD be associated with the requested variant.
If the client is a user agent, it SHOULD NOT change its document view from that which caused the request to be sent. This response is primarily intended to allow input for actions to take place without causing a change to the user agent's active document view, although any new or updated metainformation SHOULD be applied to the document currently in the user agent's active view.
The 204 response MUST NOT include a message-body, and thus is always terminated by the first empty line after the header fields.
Normally 204 would be used when a representation has been updated or created and there's no need to send an response body back. In the case of a POST, you could send back just the Location of the newly created resource. Something like
#POST
#Path("/something")
#Consumes(...)
public Response createBuzz(Domain domain, #Context UriInfo uriInfo) {
int domainId = // create domain and get created id
UriBuilder builder = uriInfo.getAbsolutePathBuilder();
builder.path(Integer.toString(domainId)); // concatenate the id.
return Response.created(builder.build()).build();
}
The created(URI) will send back the response with the newly created URI in the Location header.
Adding to the first part. You just need to keep in mind that every request from a client is a request to access a resource, whether it's just to GET it, or update with PUT. And a resource can be anything on the server. If the resource doesn't exist, then a general response would be to tell the client we can't find that resource.
To expand on your example. Let's say FooService accsses the DB. Each row in the database can be considered a resource. And each of those rows (resources) has a unique URL, like foo/db/1 might locate a row with a primary key 1. If the id can't be found, then that resource is "Not Found"
Though this question already have an accepted answer, I believe it's really an opinionated thing. Adding my two cents to help you make a more informed decision about the response code.
404 - Not Found. (Reference)
The origin server did not find a current representation for the target resource or is not willing to disclose that one exists.
The resource may exist and you may not have permission to see the resource, will also be equivalent of Not Found. So 404 for a call where data doesn't exist is a very apt thing to do.
Now as for a non-existing URL; though 404 is a widely adapted response code 400 is a more appropriate code.
400 - Bad Request (Reference)
The server cannot or will not process the request due to something that is perceived to be a client error (e.g., malformed request syntax, invalid request message framing, or deceptive request routing).
If you put an invalid parameter in the request, what would be the response code?
If query param has a typo, what should be response code?
Answer to both is 400.
Most of the file-servers, return 404 for invalid URL because for an invalid URL they try to look for a file, which they can't find on the storage ~= Resource Not Found
Apart from the HTTP Status Code, the response will have some info about the error details, where one can be more descriptive about the error and can clear the ambiguity.
If client is calling with an invalid URL, it's an integration issue and should be caught at least during the sanity. No-way they will push the code to production without testing and catching this. Even if they do, God bless them!
tl;dr - 404 for not-found resource; 400 for not-found URL.
A 4XX error code means error from the client side.
As you request a static resource as an image or a html page, returning a 404 response makes sense as :
The HTTP 404 Not Found client error response code indicates that the
server can't find the requested resource. Links which lead to a 404
page are often called broken or dead links, and can be subject to link
rot.
As you provide to clients some REST methods, you rely on the HTTP methods but you should not consider REST services as simple resources.
For clients, an error response in the REST method is often handled close to errors of other processings.
For example, to catch errors during REST invocations or somewhere else, clients could use catchError() of RxJS.
We could write a code (in TypeScript/Angular 2 for the sample code) in this way to delegate the error processing to a function :
return this.http
.get<Foo>("/api/foos")
.pipe(
catchError(this.handleError)
)
.map(foo => {...})
The problem is that any HTTP error (5XX or 4XXX) will terminate in the catchError() callback.
It may really make the REST API responses misleading for clients.
If we do a parallel with programming language, we could consider 5XX/4XX as exception flow.
Generally, we don't throw an exception only because a data is not found, we throw it as a data is not found and that that data would have been found.
For the REST API, we should follow the same logic.
If the entity may not be found, returning OK in the two cases is perfectly fine :
#GET
#Path("/{fooId}")
#Produces(MediaType.APPLICATION_XML)
public Response getFoo(#PathParam("fooId") final String fooId)
throws IOException, ParseException {
final Foo foo = fooService.getFoo(fooId);
if (foo != null){
return Response.status(Response.Status.OK).entity(foo).build();
}
return Response.status(Response.Status.OK).build();
}
The client could so handle the result according to the result is present or missing.
I don't think that returning 204 brings any useful value.
The HTTP 204 documentation states that :
The client doesn't need to go away from its current page.
But requesting a REST resource and more particularly by a GET method doesn't mean that the client is about terminating a workflow (that makes more sense with POST/PUT methods).
The document adds also :
The common use case is to return 204 as a result of a PUT request,
updating a resource, without changing the current content of the page
displayed to the user.
We are really not in this case.
Some specific HTTP codes for classical browsing matche finely with return codes of REST API (201, 202, 401, and so for...) but this is not always the case.
So for these cases, rather than twisting original codes, I would favor to keep them simple by using more general codes : 200, 400.

CSRF Filtering using GWT RequestFactoryServlet

I am implementing a token based system to prevent CSRF attacks in my Request Factory based GWT App.
To implement my filter on the server side I have overridden the doPost method on RequestFactoryServlet, thus:
#Override
protected void doPost(HttpServletRequest request, HttpServletResponse response) throws IOException, ServletException {
String sessionToken = CsrfTokenManager.getToken(request.getSession());
String requestToken = request.getHeader(CsrfTokenManager.CSRF_TOKEN_NAME);
if (sessionToken.equals(requestToken)) {
super.doPost(request, response);
} else {
logger.error(String.format("Received unsafe http request [%s]", getFullRequest(request)));
response.sendError(401,"Unsafe HTTP Request");
}
}
This works in that it does not allow requests without a valid token to be processed, and my logs contain a suitable message, but the error I get back is a 500-Internal Server Error rather than a 401.
Can anyone shed light on why this is and what I should be doing differently?
There is very little information provided by you on the reason for 500 internal server error. Please share the exception stack trace ( 500 internal server error would have thrown one).
Also avoid implementing a custom one if it is not based on GWT recommendation. Read this stackoverflow query on CSRF with RequestFactory.

Proper way to convey error messages during calls to a REST service?

I'm writing a REST based web service, and I'm trying to figure out the best way to handle error conditions.
Currently the service is returning HTTP Errors, such as Bad Request, but how can I return extra information to give developers using the web service an idea what they're doing wrong?
For example: creating a user with a null username returns an error of Bad Request. How can I add that the error was caused by a null username parameter?
According to the HTTP spec, the text that comes after the three digit response code, the "Reason-Phrase", can only be replaced with a logical equivalent. So you can't respond with 400 null user and expect anything useful to happen. Indeed, The client is not required to examine or display the Reason- Phrase.
In general, the HTTP response entity (typically the page that accompanies the response) should contain information useful to the client to guide them forward, even when the response is an error. On the web, most such errors are HTML, and are devoid of machine readable information, but most browsers do show the error to the user (and SO's error page is pretty good!).
So for a primarily machine readable resource you have two options:
Pass a human readable message anyway. Return 400 Bad Request with a HTML response, which the client may opt to show to the user. It's dead easy but it's a bit like throwing an unchecked exception, it passes all the hard work to the client, or indeed the end user.
Allow clients to recover. Return 400 Bad Request with a machine readable response which is part of your API, so clients can recover from known error conditions. This is harder, like throwing a checked exception, it becomes part of the API, and it allows clients to recover gracefully if they want to.
You could even make the server support both scenarios by defining a media type for the machie readable error recovery document, and allow clients to "accept" them: Accept: application/atom+xml, application/my.proprietary.errors+json
Clients that forget the mandatory field can opt in to getting machine readable errors or human readable errors by choosing to Accepting the error media type.
It's stated in the HTTP spec that most error codes should return some basic text that gives a clarification of why the error is being returned. The basic Java Servlet Spec defines the HttpServletResponse.sendError(int Code, String message) for this purpose.
String desc = "my Description";
throw new WebApplicationException(Response.status(Status.BAD_REQUEST).entity(desc).type("text/plain").build());

HTTP GET with request body

I'm developing a new RESTful webservice for our application.
When doing a GET on certain entities, clients can request the contents of the entity.
If they want to add some parameters (for example sorting a list) they can add these parameters in the query string.
Alternatively I want people to be able to specify these parameters in the request body.
HTTP/1.1 does not seem to explicitly forbid this. This will allow them to specify more information, might make it easier to specify complex XML requests.
My questions:
Is this a good idea altogether?
Will HTTP clients have issues with using request bodies within a GET request?
https://www.rfc-editor.org/rfc/rfc2616
Roy Fielding's comment about including a body with a GET request.
Yes. In other words, any HTTP request message is allowed to contain a message body, and thus must parse messages with that in mind. Server semantics for GET, however, are restricted such that a body, if any, has no semantic meaning to the request. The requirements on parsing are separate from the requirements on method semantics.
So, yes, you can send a body with GET, and no, it is never useful to do so.
This is part of the layered design of HTTP/1.1 that will become clear again once the spec is partitioned (work in progress).
....Roy
Yes, you can send a request body with GET but it should not have any meaning. If you give it meaning by parsing it on the server and changing your response based on its contents, then you are ignoring this recommendation in the HTTP/1.1 spec, section 4.3:
...if the request method does not include defined semantics for an entity-body, then the message-body SHOULD be ignored when handling the request.
And the description of the GET method in the HTTP/1.1 spec, section 9.3:
The GET method means retrieve whatever information ([...]) is identified by the Request-URI.
which states that the request-body is not part of the identification of the resource in a GET request, only the request URI.
Update
The RFC2616 referenced as "HTTP/1.1 spec" is now obsolete. In 2014 it was replaced by RFCs 7230-7237. Quote "the message-body SHOULD be ignored when handling the request" has been deleted. It's now just "Request message framing is independent of method semantics, even if the method doesn't define any use for a message body" The 2nd quote "The GET method means retrieve whatever information ... is identified by the Request-URI" was deleted. - From a comment
From the HTTP 1.1 2014 Spec:
A payload within a GET request message has no defined semantics; sending a payload body on a GET request might cause some existing implementations to reject the request.
While you can do that, insofar as it isn't explicitly precluded by the HTTP specification, I would suggest avoiding it simply because people don't expect things to work that way. There are many phases in an HTTP request chain and while they "mostly" conform to the HTTP spec, the only thing you're assured is that they will behave as traditionally used by web browsers. (I'm thinking of things like transparent proxies, accelerators, A/V toolkits, etc.)
This is the spirit behind the Robustness Principle roughly "be liberal in what you accept, and conservative in what you send", you don't want to push the boundaries of a specification without good reason.
However, if you have a good reason, go for it.
You will likely encounter problems if you ever try to take advantage of caching. Proxies are not going to look in the GET body to see if the parameters have an impact on the response.
Elasticsearch accepts GET requests with a body. It even seems that this is the preferred way: Elasticsearch guide
Some client libraries (like the Ruby driver) can log the cry command to stdout in development mode and it is using this syntax extensively.
Neither restclient nor REST console support this but curl does.
The HTTP specification says in section 4.3
A message-body MUST NOT be included in a request if the specification of the request method (section 5.1.1) does not allow sending an entity-body in requests.
Section 5.1.1 redirects us to section 9.x for the various methods. None of them explicitly prohibit the inclusion of a message body. However...
Section 5.2 says
The exact resource identified by an Internet request is determined by examining both the Request-URI and the Host header field.
and Section 9.3 says
The GET method means retrieve whatever information (in the form of an entity) is identified by the Request-URI.
Which together suggest that when processing a GET request, a server is not required to examine anything other that the Request-URI and Host header field.
In summary, the HTTP spec doesn't prevent you from sending a message-body with GET but there is sufficient ambiguity that it wouldn't surprise me if it was not supported by all servers.
GET, with a body!?
Specification-wise you could, but, it's not a good idea to do so injudiciously, as we shall see.
RFC 7231 §4.3.1 states that a body "has no defined semantics", but that's not to say it is forbidden. If you attach a body to the request and what your server/app makes out of it is up to you. The RFC goes on to state that GET can be "a programmatic view on various database records". Obviously such view is many times tailored by a large number of input parameters, which are not always convenient or even safe to put in the query component of the request-target.
The good: I like the verbiage. It's clear that one read/get a resource without any observable side-effects on the server (the method is "safe"), and, the request can be repeated with the same intended effect regardless of the outcome of the first request (the method is "idempotent").
The bad: An early draft of HTTP/1.1 forbade GET to have a body, and - allegedly - some implementations will even up until today drop the body, ignore the body or reject the message. For example, a dumb HTTP cache may construct a cache key out of the request-target only, being oblivious to the presence or content of a body. An even dumber server could be so ignorant that it treats the body as a new request, which effectively is called "request smuggling" (which is the act of sending "a request to one device without the other device being aware of it" - source).
Due to what I believe is primarily a concern with inoperability amongst implementations, work in progress suggests to categorize a GET body as a "SHOULD NOT", "unless [the request] is made directly to an origin server that has previously indicated, in or out of band, that such a request has a purpose and will be adequately supported" (emphasis mine).
The fix: There's a few hacks that can be employed for some of the problems with this approach. For example, body-unaware caches can indirectly become body-aware simply by appending a hash derived from the body to the query component, or disable caching altogether by responding a cache-control: no-cache header from the server.
Alas when it comes to the request chain, one is often not in control of- or even aware, of all present and future HTTP intermediaries and how they will deal with a GET body. That's why this approach must be considered generally unreliable.
But POST, is not idempotent!
POST is an alternative. The POST request usually includes a message body (just for the record, body is not a requirement, see RFC 7230 §3.3.2). The very first use case example from RFC 7231 (§4.3.3) is "providing a block of data [...] to a data-handling process". So just like GET with a body, what happens with the body on the back-end side is up to you.
The good: Perhaps a more common method to apply when one wish to send a request body, for whatever purpose, and so, will likely yield the least amount of noise from your team members (some may still falsely believe that POST must create a resource).
Also, what we often pass parameters to is a search function operating upon constantly evolving data, and a POST response is only cacheable if explicit freshness information is provided in the response.
The bad: POST requests are not defined as idempotent, leading to request retry hesitancy. For example, on page reload, browsers are unwilling to resubmit an HTML form without prompting the user with a nonreadable cryptic message.
The fix: Well, just because POST is not defined to be idempotent doesn't mean it mustn't be. Indeed, RFC 7230 §6.3.1 writes: "a user agent that knows (through design or configuration) that a POST request to a given resource is safe can repeat that request automatically". So, unless your client is an HTML form, this is probably not a real problem.
QUERY is the holy grail
There's a proposal for a new method QUERY which does define semantics for a message body and defines the method as idempotent. See this.
Edit: As a side-note, I stumbled into this StackOverflow question after having discovered a codebase where they solely used PUT requests for server-side search functions. This were their idea to include a body with parameters and also be idempotent. Alas the problem with PUT is that the request body has very precise semantics. Specifically, the PUT "requests that the state of the target resource be created or replaced with the state [in the body]" (RFC 7231 §4.3.4). Clearly, this excludes PUT as a viable option.
You can either send a GET with a body or send a POST and give up RESTish religiosity (it's not so bad, 5 years ago there was only one member of that faith -- his comments linked above).
Neither are great decisions, but sending a GET body may prevent problems for some clients -- and some servers.
Doing a POST might have obstacles with some RESTish frameworks.
Julian Reschke suggested above using a non-standard HTTP header like "SEARCH" which could be an elegant solution, except that it's even less likely to be supported.
It might be most productive to list clients that can and cannot do each of the above.
Clients that cannot send a GET with body (that I know of):
XmlHTTPRequest Fiddler
Clients that can send a GET with body:
most browsers
Servers & libraries that can retrieve a body from GET:
Apache
PHP
Servers (and proxies) that strip a body from GET:
?
What you're trying to achieve has been done for a long time with a much more common method, and one that doesn't rely on using a payload with GET.
You can simply build your specific search mediatype, or if you want to be more RESTful, use something like OpenSearch, and POST the request to the URI the server instructed, say /search. The server can then generate the search result or build the final URI and redirect using a 303.
This has the advantage of following the traditional PRG method, helps cache intermediaries cache the results, etc.
That said, URIs are encoded anyway for anything that is not ASCII, and so are application/x-www-form-urlencoded and multipart/form-data. I'd recommend using this rather than creating yet another custom json format if your intention is to support ReSTful scenarios.
I put this question to the IETF HTTP WG. The comment from Roy Fielding (author of http/1.1 document in 1998) was that
"... an implementation would be broken to do anything other than to parse and discard that body if received"
RFC 7213 (HTTPbis) states:
"A payload within a GET request message has no defined semantics;"
It seems clear now that the intention was that semantic meaning on GET request bodies is prohibited, which means that the request body can't be used to affect the result.
There are proxies out there that will definitely break your request in various ways if you include a body on GET.
So in summary, don't do it.
From RFC 2616, section 4.3, "Message Body":
A server SHOULD read and forward a message-body on any request; if the
request method does not include defined semantics for an entity-body,
then the message-body SHOULD be ignored when handling the request.
That is, servers should always read any provided request body from the network (check Content-Length or read a chunked body, etc). Also, proxies should forward any such request body they receive. Then, if the RFC defines semantics for the body for the given method, the server can actually use the request body in generating a response. However, if the RFC does not define semantics for the body, then the server should ignore it.
This is in line with the quote from Fielding above.
Section 9.3, "GET", describes the semantics of the GET method, and doesn't mention request bodies. Therefore, a server should ignore any request body it receives on a GET request.
Which server will ignore it? – fijiaaron Aug 30 '12 at 21:27
Google for instance is doing worse than ignoring it, it will consider it an error!
Try it yourself with a simple netcat:
$ netcat www.google.com 80
GET / HTTP/1.1
Host: www.google.com
Content-length: 6
1234
(the 1234 content is followed by CR-LF, so that is a total of 6 bytes)
and you will get:
HTTP/1.1 400 Bad Request
Server: GFE/2.0
(....)
Error 400 (Bad Request)
400. That’s an error.
Your client has issued a malformed or illegal request. That’s all we know.
You do also get 400 Bad Request from Bing, Apple, etc... which are served by AkamaiGhost.
So I wouldn't advise using GET requests with a body entity.
According to XMLHttpRequest, it's not valid. From the standard:
4.5.6 The send() method
client . send([body = null])
Initiates the request. The optional argument provides the request
body. The argument is ignored if request method is GET or HEAD.
Throws an InvalidStateError exception if either state is not
opened or the send() flag is set.
The send(body) method must run these steps:
If state is not opened, throw an InvalidStateError exception.
If the send() flag is set, throw an InvalidStateError exception.
If the request method is GET or HEAD, set body to null.
If body is null, go to the next step.
Although, I don't think it should because GET request might need big body content.
So, if you rely on XMLHttpRequest of a browser, it's likely it won't work.
If you really want to send cachable JSON/XML body to web application the only reasonable place to put your data is query string encoded with RFC4648: Base 64 Encoding with URL and Filename Safe Alphabet. Of course you could just urlencode JSON and put is in URL param's value, but Base64 gives smaller result. Keep in mind that there are URL size restrictions, see What is the maximum length of a URL in different browsers? .
You may think that Base64's padding = character may be bad for URL's param value, however it seems not - see this discussion: http://mail.python.org/pipermail/python-bugs-list/2007-February/037195.html . However you shouldn't put encoded data without param name because encoded string with padding will be interpreted as param key with empty value.
I would use something like ?_b64=<encodeddata>.
I wouldn't advise this, it goes against standard practices, and doesn't offer that much in return. You want to keep the body for content, not options.
You have a list of options which are far better than using a request body with GET.
Let' assume you have categories and items for each category. Both to be identified by an id ("catid" / "itemid" for the sake of this example). You want to sort according to another parameter "sortby" in a specific "order". You want to pass parameters for "sortby" and "order":
You can:
Use query strings, e.g.
example.com/category/{catid}/item/{itemid}?sortby=itemname&order=asc
Use mod_rewrite (or similar) for paths:
example.com/category/{catid}/item/{itemid}/{sortby}/{order}
Use individual HTTP headers you pass with the request
Use a different method, e.g. POST, to retrieve a resource.
All have their downsides, but are far better than using a GET with a body.
What about nonconforming base64 encoded headers? "SOMETHINGAPP-PARAMS:sdfSD45fdg45/aS"
Length restrictions hm. Can't you make your POST handling distinguish between the meanings? If you want simple parameters like sorting, I don't see why this would be a problem. I guess it's certainty you're worried about.
I'm upset that REST as protocol doesn't support OOP and Get method is proof. As a solution, you can serialize your a DTO to JSON and then create a query string. On server side you'll able to deserialize the query string to the DTO.
Take a look on:
Message-based design in ServiceStack
Building RESTful Message Based Web Services with WCF
Message based approach can help you to solve Get method restriction. You'll able to send any DTO as with request body
Nelibur web service framework provides functionality which you can use
var client = new JsonServiceClient(Settings.Default.ServiceAddress);
var request = new GetClientRequest
{
Id = new Guid("2217239b0e-b35b-4d32-95c7-5db43e2bd573")
};
var response = client.Get<GetClientRequest, ClientResponse>(request);
as you can see, the GetClientRequest was encoded to the following query string
http://localhost/clients/GetWithResponse?type=GetClientRequest&data=%7B%22Id%22:%2217239b0e-b35b-4d32-95c7-5db43e2bd573%22%7D
IMHO you could just send the JSON encoded (ie. encodeURIComponent) in the URL, this way you do not violate the HTTP specs and get your JSON to the server.
For example, it works with Curl, Apache and PHP.
PHP file:
<?php
echo $_SERVER['REQUEST_METHOD'] . PHP_EOL;
echo file_get_contents('php://input') . PHP_EOL;
Console command:
$ curl -X GET -H "Content-Type: application/json" -d '{"the": "body"}' 'http://localhost/test/get.php'
Output:
GET
{"the": "body"}
Even if a popular tool use this, as cited frequently on this page, I think it is still quite a bad idea, being too exotic, despite not forbidden by the spec.
Many intermediate infrastructures may just reject such requests.
By example, forget about using some of the available CDN in front of your web site, like this one:
If a viewer GET request includes a body, CloudFront returns an HTTP status code 403 (Forbidden) to the viewer.
And yes, your client libraries may also not support emitting such requests, as reported in this comment.
If you want to allow a GET request with a body, a way is to support POST request with header "X-HTTP-Method-Override: GET". It is described here : https://en.wikipedia.org/wiki/List_of_HTTP_header_fields. This header means that while the method is POST, the request should be treated as if it is a GET. Body is allowed for POST, so you're sure nobody willl drop the payload of your GET requests.
This header is oftenly used to make PATCH or HEAD requests through some proxies that do not recognize those methods and replace them by GET (always fun to debug!).
An idea on an old question:
Add the full content on the body, and a short hash of the body on the querystring, so caching won't be a problem (the hash will change if body content is changed) and you'll be able to send tons of data when needed :)
Create a Requestfactory class
import java.net.URI;
import javax.annotation.PostConstruct;
import org.apache.http.client.methods.HttpEntityEnclosingRequestBase;
import org.apache.http.client.methods.HttpUriRequest;
import org.springframework.http.HttpMethod;
import org.springframework.http.client.HttpComponentsClientHttpRequestFactory;
import org.springframework.stereotype.Component;
import org.springframework.web.client.RestTemplate;
#Component
public class RequestFactory {
private RestTemplate restTemplate = new RestTemplate();
#PostConstruct
public void init() {
this.restTemplate.setRequestFactory(new HttpComponentsClientHttpRequestWithBodyFactory());
}
private static final class HttpComponentsClientHttpRequestWithBodyFactory extends HttpComponentsClientHttpRequestFactory {
#Override
protected HttpUriRequest createHttpUriRequest(HttpMethod httpMethod, URI uri) {
if (httpMethod == HttpMethod.GET) {
return new HttpGetRequestWithEntity(uri);
}
return super.createHttpUriRequest(httpMethod, uri);
}
}
private static final class HttpGetRequestWithEntity extends HttpEntityEnclosingRequestBase {
public HttpGetRequestWithEntity(final URI uri) {
super.setURI(uri);
}
#Override
public String getMethod() {
return HttpMethod.GET.name();
}
}
public RestTemplate getRestTemplate() {
return restTemplate;
}
}
and #Autowired where ever you require and use, Here is one sample code GET request with RequestBody
#RestController
#RequestMapping("/v1/API")
public class APIServiceController {
#Autowired
private RequestFactory requestFactory;
#RequestMapping(method = RequestMethod.GET, path = "/getData")
public ResponseEntity<APIResponse> getLicenses(#RequestBody APIRequest2 APIRequest){
APIResponse response = new APIResponse();
HttpHeaders headers = new HttpHeaders();
headers.setContentType(MediaType.APPLICATION_JSON);
Gson gson = new Gson();
try {
StringBuilder createPartUrl = new StringBuilder(PART_URL).append(PART_URL2);
HttpEntity<String> entity = new HttpEntity<String>(gson.toJson(APIRequest),headers);
ResponseEntity<APIResponse> storeViewResponse = requestFactory.getRestTemplate().exchange(createPartUrl.toString(), HttpMethod.GET, entity, APIResponse.class); //.getForObject(createLicenseUrl.toString(), APIResponse.class, entity);
if(storeViewResponse.hasBody()) {
response = storeViewResponse.getBody();
}
return new ResponseEntity<APIResponse>(response, HttpStatus.OK);
}catch (Exception e) {
e.printStackTrace();
return new ResponseEntity<APIResponse>(response, HttpStatus.INTERNAL_SERVER_ERROR);
}
}
}