HTTP Redirect Status Code - redirect

I have an ASP.NET website. A user can access the URL /partners/{partner-id} in my app. When that url is invoked, I do two things:
1) I want to log the partner ID and user that requested it and
2) Redirect the url to the partner's website.
My question is, which HTTP status code should I use? I was using 301. However, that introduced a problem where my logging code was getting skipped. I suspect it's because a 301 represents a permanent redirect. However, I basically want to remain the middleman so that I can properly log the details.
What HTTP status code should I use?
Thanks!

Taking a look here:
https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html
you should use the 302 status code. Two useful points about the 302 redirect:
Since the redirection might be altered on occasion, the client SHOULD
continue to use the Request-URI for future requests
In other words, because the redirect may be temporary, clients should keep requesting the original URI instead of going straight to the redirect target by default. That means they will pass through your logging system each time rather than going directly to the redirected URI on subsequent requests. The 302 definition also states:
This response is only cacheable if indicated by a Cache-Control or
Expires header field.
By default, a 301 redirect is cacheable unless you explicitly say otherwise, whereas a 302 is not cacheable unless you explicitly make it so.
However, it's probably a good idea to explicitly add 'do not cache' headers to the redirect, so the client knows it should not be cached, just in case you have a client that doesn't follow the default spec behavior. There are a number of other answers on Stack Overflow regarding this; here's a decent one:
How to control web page caching, across all browsers?
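A minimal sketch of the idea above, written in Python rather than ASP.NET to keep it short. The partner table, the function name, and the log format are all made up for illustration; the cache headers are the commonly recommended "do not cache" set, not something the spec requires for a 302.

```python
from typing import Callable, Dict, Tuple

# Hypothetical partner table; a real app would look this up in a database.
PARTNER_URLS: Dict[str, str] = {"acme": "https://partner.example/"}


def partner_redirect(
    partner_id: str, user: str, log: Callable[[str], None]
) -> Tuple[int, Dict[str, str]]:
    """Log the visit, then return (status, headers) for a temporary redirect."""
    target = PARTNER_URLS.get(partner_id)
    if target is None:
        return 404, {}
    # Logging runs before the redirect is issued, on every request.
    log(f"partner={partner_id} user={user}")
    headers = {
        "Location": target,
        # Explicit "do not cache" headers, in case a client ignores the
        # default that a 302 is not cacheable.
        "Cache-Control": "no-store, no-cache, must-revalidate",
        "Pragma": "no-cache",
        "Expires": "0",
    }
    return 302, headers
```

Because the response is a 302 with caching disabled, a well-behaved client should come back to /partners/{partner-id} every time, so the log call is not skipped.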

Related

can I pass headers to redirected location after redirection?

here's response header of my redirection endpoint with status code 302.
"Location": "http://<target-domain>",
"Set-Cookie": "username=user1;"
I can see it redirects correctly with a 302, but the cookie does not get set on the <target-domain>.
Looks like the header "Set-Cookie": "username=user1;" does not get passed to the <target-domain> on redirection.
I see 2 network activities in my developer tools:
1) The redirection endpoint responds with status code 302. I see Location and Set-Cookie in the response headers.
2) The target domain responds with status code 200. I don't see Location and Set-Cookie anymore.
Is there a way to set the cookies on the <target-domain>?
You can't set cookies on a domain other than the one you're on, so basically no. The only exception to this is you can set cookies on example.com if your current domain is something like subdomain.example.com, where you can attach the cookies to a shorter form of your domain, but it must be the same base domain.
If you need the other site to set a cookie with a value it does not know, you'll have to pass that value through somehow. Using a redirect with a query string leaves it open to tampering by the user unless you cryptographically sign it (annoying) or ship over a token that can be used to retrieve the raw value. You may need a short-term store for this, like Redis, Memcached, or even a database row you can purge later.
If it were possible to set cookies on any domain at all there'd be utter chaos. These things are heavily restricted for a reason.
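For the "cryptographically sign it" option, here is a rough Python sketch using an HMAC over the value. It assumes both sites share a secret out of band; the secret, the parameter names (`username`, `sig`), and the function names are placeholders, not anything from the question.

```python
import hashlib
import hmac
from typing import Optional
from urllib.parse import parse_qs, urlencode, urlsplit

# Assumption: both sites know this secret and keep it private.
SHARED_SECRET = b"replace-with-a-real-secret"


def _signature(value: str) -> str:
    """Hex HMAC-SHA256 of the value under the shared secret."""
    return hmac.new(SHARED_SECRET, value.encode("utf-8"), hashlib.sha256).hexdigest()


def build_redirect_url(target: str, username: str) -> str:
    """Redirecting site: append the value and its signature to the target URL."""
    query = urlencode({"username": username, "sig": _signature(username)})
    return f"{target}?{query}"


def read_signed_username(url: str) -> Optional[str]:
    """Target site: return the username if the signature matches, else None."""
    params = parse_qs(urlsplit(url).query)
    username = params.get("username", [""])[0]
    sig = params.get("sig", [""])[0]
    expected = _signature(username)
    # Constant-time comparison to avoid leaking how much of the signature matched.
    if hmac.compare_digest(sig.encode("utf-8"), expected.encode("utf-8")):
        return username
    return None
```

The target site would call `read_signed_username` on the incoming URL and set its own cookie only when a username comes back. Note this sketch has no expiry or replay protection; a timestamp inside the signed value would be the usual next step.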

Response.Redirect() vs Response.RedirectPermanent()

I am new to ASP.Net 4.0, and have seen a new feature called Response.RedirectPermanent(). I have checked a few articles, but I'm unable to understand clearly the actual meaning and difference of Response.RedirectPermanent() over Response.Redirect().
According to Gunnar Peipman,
Response.Redirect() returns 302 to browser meaning that asked resource is temporarily moved to other location. Permanent redirect means that browser gets 301 as response from server. In this case browser doesn’t ask the same resource from old URL anymore – it uses URL given by Location header.
Why do I need to care about server responses such as 301 and 302? And how does the page actually get permanently redirected?
301 response (RedirectPermanent) is very useful for SEO purposes. For example, you had a site implemented in ASP.NET WebForms and redesigned using ASP.NET MVC. You'd like to inform search engines that page /Catalog/ProductName.aspx becomes /products/product-name. Then you set 301 redirect from /Catalog/ProductName.aspx to /products/product-name and links in search engines' indices will be replaced. 302 (Redirect) is mostly for internal purposes. For example, the redirect after login (if returnUrl was set in URL).

url shortener 301 redirection understanding

We're working on a URL shortener project in PHP. We're using 301 HTTP redirection and, naturally, tracking visits to our links. But there is something strange:
After we shorten a URL and visit it in a browser, only the first visit is tracked. It seems that no further requests are sent to our server and the browser goes directly to the destination URL. (I think this is the browser cache kicking in after the first try.) But:
When trying a similar service like bitly, the behavior is different. Some of the repeated requests from the same browser are tracked in bitly's visit tracking (more than one of them, and I don't see any logic to which ones), even though bitly also uses 301 redirection. (The bottom left of the browser window sometimes shows "waiting for bit.ly..." and sometimes not, seemingly at random.)
Are any tricks involved here? Why does this difference in behavior occur?
Read the HTTP specification. A 301 response tells the browser that the requested resource has permanently moved to the new URL that is being redirected to, and that it should not use the original URL anymore:
10.3.2 301 Moved Permanently
The requested resource has been assigned a new permanent URI and
any future references to this resource SHOULD use one of the
returned URIs. Clients with link editing capabilities ought to
automatically re-link references to the Request-URI to one or more
of the new references returned by the server, where possible. This
response is cacheable unless indicated otherwise.
The new permanent URI SHOULD be given by the Location field in the
response. Unless the request method was HEAD, the entity of the
response SHOULD contain a short hypertext note with a hyperlink to
the new URI(s).
If the 301 status code is received in response to a request other
than GET or HEAD, the user agent MUST NOT automatically redirect the
request unless it can be confirmed by the user, since this might
change the conditions under which the request was issued.
Note: When automatically redirecting a POST request after
receiving a 301 status code, some existing HTTP/1.0 user agents
will erroneously change it into a GET request.
For what you are attempting, try using 302, 303, or 307 instead.
10.3.3 302 Found
The requested resource resides temporarily under a different URI.
Since the redirection might be altered on occasion, the client SHOULD
continue to use the Request-URI for future requests. This response
is only cacheable if indicated by a Cache-Control or Expires header
field.
The temporary URI SHOULD be given by the Location field in the
response. Unless the request method was HEAD, the entity of the
response SHOULD contain a short hypertext note with a hyperlink to
the new URI(s).
If the 302 status code is received in response to a request other
than GET or HEAD, the user agent MUST NOT automatically redirect the
request unless it can be confirmed by the user, since this might
change the conditions under which the request was issued.
Note: RFC 1945 and RFC 2068 specify that the client is not allowed
to change the method on the redirected request. However, most
existing user agent implementations treat 302 as if it were a 303
response, performing a GET on the Location field-value regardless
of the original request method. The status codes 303 and 307 have
been added for servers that wish to make unambiguously clear which
kind of reaction is expected of the client.
10.3.4 303 See Other
The response to the request can be found under a different URI and
SHOULD be retrieved using a GET method on that resource. This method
exists primarily to allow the output of a POST-activated script to
redirect the user agent to a selected resource. The new URI is not a
substitute reference for the originally requested resource. The 303
response MUST NOT be cached, but the response to the second
(redirected) request might be cacheable.
The different URI SHOULD be given by the Location field in the
response. Unless the request method was HEAD, the entity of the
response SHOULD contain a short hypertext note with a hyperlink to
the new URI(s).
Note: Many pre-HTTP/1.1 user agents do not understand the 303
status. When interoperability with such clients is a concern, the
302 status code may be used instead, since most user agents react
to a 302 response as described here for 303.
10.3.8 307 Temporary Redirect
The requested resource resides temporarily under a different URI.
Since the redirection MAY be altered on occasion, the client SHOULD
continue to use the Request-URI for future requests. This response
is only cacheable if indicated by a Cache-Control or Expires header
field.
The temporary URI SHOULD be given by the Location field in the
response. Unless the request method was HEAD, the entity of the
response SHOULD contain a short hypertext note with a hyperlink to
the new URI(s) , since many pre-HTTP/1.1 user agents do not
understand the 307 status. Therefore, the note SHOULD contain the
information necessary for a user to repeat the original request on
the new URI.
If the 307 status code is received in response to a request other
than GET or HEAD, the user agent MUST NOT automatically redirect the
request unless it can be confirmed by the user, since this might
change the conditions under which the request was issued.
Just to add my comments:
Cache-Control headers also play a role in this. If you check with curl or Firebug's persistent tracking, you can see the Cache-Control headers before the Location header. bitly is configured to be contacted again if the user clicks the link after 90 seconds.
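To illustrate the comment above, here is a small Python sketch of a 301 whose cached copy goes stale after 90 seconds. The 90-second figure comes from the comment; the exact headers bitly sends are not shown here, so the `max-age` directive and both function names are assumptions, and the freshness check is a simplified model of a browser cache.

```python
from typing import Dict, Tuple


def short_link_redirect(target: str, max_age_seconds: int = 90) -> Tuple[int, Dict[str, str]]:
    """Build a 301 whose cached copy expires after max_age_seconds."""
    headers = {
        "Location": target,
        # Limits how long the browser may reuse the cached redirect.
        "Cache-Control": f"max-age={max_age_seconds}",
    }
    return 301, headers


def browser_contacts_server(seconds_since_cached: float, max_age_seconds: int = 90) -> bool:
    """Simplified model: the cached redirect is reused only while it is fresh."""
    return seconds_since_cached >= max_age_seconds
```

Under this model, a click 10 seconds after the first visit is served from cache and never reaches the tracker, while a click two minutes later hits the server again. That would explain why only some repeat visits get counted.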

Correct http status code for resource which requires authorization

There seems to be a lot of confusion about the correct http status code to return if the user tries to access a page which requires the user to login.
So basically, what status code should be sent when I show the login page?
I'm pretty sure we need to use a status code in the 4xx range.
I'm not talking about HTTP authentication here, so that's at least 1 status code we aren't going to use (401 Unauthorized).
Now what should we use? The answers (also here on SO) seem to vary:
According to the answer here we should use 403 Forbidden.
But in the description of the status code is:
Authorization will not help and the request SHOULD NOT be repeated.
Well that doesn't look like the right one. Since authorization WOULD help.
So let's check out some other answer. The answer here doesn't even use the 4xx range at all but rather uses 302 Found.
The description of the 302 Found status code:
The requested resource resides temporarily under a different URI. Since the redirection might be altered on occasion, the client SHOULD continue to use the Request-URI for future requests. This response is only cacheable if indicated by a Cache-Control or Expires header field.
I think that also isn't what I want. Since it is not the requested resource which resides under a different URI. But rather a completely different resource (login page vs authenticated content page).
So I moved along and picked another answer surprisingly with yet another solution.
This answer suggests we choose 400 Bad Request.
The description of this status code is:
The request could not be understood by the server due to malformed syntax. The client SHOULD NOT repeat the request without modifications.
I think the server understood the request just fine, but just refuses to give access before the user is authenticated.
Another answer also says a 403 response is correct, however it ends with:
If this is a public facing website where you are trying to deny access based on a session cookie [that's what I do], 200 with an appropriate body to indicate that log in is needed or a 302 temporary redirect to a log in page is often best.
So 403 is correct, but 200 or 302 is THE BEST.
Hey! That's what I am looking for: THE BEST solution. But shouldn't the best be the same as the correct one? And why would it be the best?
Thanks to all who have made it this far into this question :)
I know I shouldn't worry too much about it. And I think this question is more hypothetical (not really, but used it because of lack of a better word).
But this question is haunting me for some time now.
And if I would have been a manager (who just picked up some cool sounding words as they always do) I would have said: but, but, but, but restfulness is important. :-)
So: what is the right way™ of using a status code in the above situation (if any)?
tl;dr
What is the correct http status code response when a user tries to access a page which requires login?
If the user has not provided any credentials and your API requires them, return a 401 - Unauthorized. That will challenge the client to do so. There's usually little debate about this particular scenario.
If the user has provided valid credentials but they are insufficient to access the requested resource (perhaps the credentials were for a freemium account but the requested resource is only for your paid users), you have a couple of options given the looseness of some of the HTTP code definitions:
1) Return 403 - Forbidden. This is more descriptive and is typically understood as, "the supplied credentials were valid but still were not enough to grant access".
2) Return 401 - Unauthorized. If you're paranoid about security, you might not want to give the extra information back to the client as was returned in (1) above.
3) Return either 401 or 403 but with helpful information in the response body describing the reasons why access is being denied. Again, that information might be more than you would want to provide in case it helps attackers somewhat.
Personally, I've always used #1 for the scenario where valid credentials have been passed but the account they're associated with doesn't have access to the requested resource.
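The decision described above can be sketched as a small Python function. The function and parameter names are illustrative. Treating invalid credentials the same as missing ones (401) is an assumption on my part; the answer above only covers the missing and valid-but-insufficient cases.

```python
def auth_status(has_credentials: bool, credentials_valid: bool, has_access: bool) -> int:
    """Pick a status code following option #1 from the answer above."""
    if not has_credentials:
        # No credentials supplied: challenge the client.
        return 401
    if not credentials_valid:
        # Assumption: bad credentials are also answered with a challenge.
        return 401
    if not has_access:
        # Valid credentials, but the account may not use this resource.
        return 403
    return 200
```

For example, a freemium user with a valid login who requests a paid-only resource would get 403, while an anonymous request for the same resource would get 401.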
You ask for "the best", "the right way", and "the correct", in turn, which makes answering this question difficult because those criteria are not necessarily interchangeable and may, in fact, conflict -- especially where RESTfulness is concerned.
The "best" answer depends on your application. Are you building a Plain Old Browser-Based (POBB) web-application? Are you building a native client (ex. iOS or Android) and hitting a service over the Web? Are you making heavy use of AJAX to drive web-page updates? Is curl the intended client?
Let's assume you are building a traditional web application. Let's look at how Google does it (output chopped for brevity):
$ curl -v http://gmail.com/
< HTTP/1.1 301 Moved Permanently
< Location: http://mail.google.com/mail/
< Content-Type: text/html; charset=UTF-8
< Content-Length: 225
< ...
Google first redirects us to the "true" URL for GMail (using a 301 redirect).
$ curl -v http://mail.google.com/mail/
< HTTP/1.1 302 Moved Temporarily
< Location: https://accounts.google.com/ServiceLogin?service=mail&passive=true&rm=false&continue=http://mail.google.com/mail/&scc=1&ltmpl=default&ltmplcache=2
< Content-Type: text/html; charset=UTF-8
< Content-Length: 352
< ...
And then it redirects us to the login page (using a 302 redirect).
$ curl -v 'https://accounts.google.com/ServiceLogin?service=mail&passive=true&rm=false&continue=http://mail.google.com/mail/&scc=1&ltmpl=default&ltmplcache=2'
< HTTP/1.1 200 OK
< Content-Type: text/html; charset=UTF-8
< Transfer-Encoding: chunked
< ...
The login page itself is delivered with the 200 status code!
Why this way?
From a user-experience perspective, if a user goes to a page they can't view because they are not authenticated, you want to take the user to a page that allows them to correct this (via logging in). In this example, the login page stands alone and is just another page (which is why 200 is appropriate).
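A rough Python sketch of that flow: an unauthenticated request is answered with a 302 to a stand-alone login page, carrying the originally requested URL so the user can be sent back afterwards. The login URL and the function name are hypothetical; the `continue` parameter name is borrowed from the Google output above.

```python
from typing import Dict, Optional, Tuple
from urllib.parse import urlencode

# Hypothetical login page; it is an ordinary page served with 200.
LOGIN_URL = "https://accounts.example/login"


def require_login(requested_url: str, session_user: Optional[str]) -> Tuple[int, Dict[str, str]]:
    """Return (status, headers): 200 if logged in, else a 302 to the login page."""
    if session_user is not None:
        return 200, {}
    # Percent-encode the original URL so it survives as a query parameter.
    query = urlencode({"continue": requested_url})
    return 302, {"Location": f"{LOGIN_URL}?{query}"}
```

After a successful login, the login page would redirect back to the `continue` value. In a real application that value should be validated first, so it cannot be used to bounce users to an arbitrary site.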
You could throw up a 4XX page with an explanation and a link to the login page. That might, in fact, seem more RESTful. But it's a worse user experience.
Ok, but is there a case where something like 403 makes sense? Absolutely.
First, though, note that 403 isn't well-defined in the specification. In order to understand how it should be used, you need to look at how it's implemented in the field.
403 is commonly used by web servers like Apache and IIS as the status code for pages returned when the browser requests a directory listing (a URI ending in "/") but the server has directory listings disabled. In this case, 403 is really a specialized 404, and there isn't much you can do for the user except let him/her know what went wrong.
However, here's an example of a site that uses the 403 to both signal to the user that he/she doesn't have sufficient privilege and what action to take to correct the situation (check out the full response for details):
$ curl -v http://www.w3.org/Protocols/rfc2616/
< HTTP/1.1 403 Forbidden
< Content-Type: text/html; charset=iso-8859-1
< Content-Length: 1564
< ...
(As an aside, 403 is also seen in web-based APIs, like Twitter's API; here, 403 means "The request is understood, but it has been refused. An accompanying error message will explain why. This code is used when requests are being denied due to update limits.")
As an improvement, let's assume, however, that you don't want to redirect the user to a login page, or force the user to follow a link to the login page. Instead, you want to display the login form on the page that the user is prevented from seeing. If they successfully authenticate, they see the content when the page reloads; if they fail, they get the login form again. They never navigate to another URL.
In this case, a status code of 403 makes a lot of sense, and is homologous to the 401 case, with the caveat that the browser won't pop up a dialog asking the user to authenticate -- the form is in the page itself.
This approach to authentication is not common, but it could make sense, and is IMHO preferable to the pop-up-a-javascript-modal-to-log-in solutions that developers try to implement.
It comes down to the question, do you want to redirect or not?
Additional: thoughts about the 401 status code...
The 401 status code -- and associated basic/digest authentication -- has many things going for it. It's embraced by the HTTP specification, it's supported by every major browser, it's not inherently un-RESTful... The problem is, from a user experience perspective, it's very very unattractive. There's the un-stylable, cryptic pop-up dialog, lack of an elegant solution for logging out, etc. If you (or your stakeholders/clients) can live with those issues (a big if) then it might qualify as the "correct" solution.
Agreed. REST is just a style, not a strict protocol. Many public web services deviate from this style. You can build your service to return whatever you want. Just make sure your clients know what return codes to expect.
Personally, I have always used 401 (unauthorized) to indicate an unauthenticated user has requested a resource that requires a login. I then require the client application to guide the user to the login.
I use 400 (bad request) in response to a logon attempt with invalid credentials.
HTTP 302 (moved) seems more appropriate for web applications where the client is a browser. Browsers typically follow the re-direct address in the response. This can be useful for guiding the user to a logon page.
I'm not talking about HTTP authentication here, so that's at least 1 status code we aren't going to use (401 Unauthorized).
Wrong. 401 is part of Hypertext Transfer Protocol (RFC 2616 Fielding, et al.), but not limited to HTTP authentication. Furthermore, it's the only status code indicating that the request requires user authentication.
The 302 and 200 codes could be used and are easier to implement in some scenarios, but not all. And if you want to obey the specs, 401 is the only correct answer there is.
And 403 is indeed the most wrong code to return. As you correctly stated...
Authorization will not help and the request SHOULD NOT be repeated.
So this is clearly not suitable to indicate that authorization is an option.
I would stick to the standard: 401 Unauthorized
UPDATE
To add a little more info, lifting the confusion related to...
The response MUST include a WWW-Authenticate header field (section 14.47) containing a challenge applicable to the requested resource.
If you think that's going to stop you from using a 401, you have to remember there's more:
"The field value consists of at least one challenge that indicates the authentication scheme(s) and parameters applicable to the Request-URI."
This "indicating the authentication scheme(s)" means you can opt-in for other auth-schemes!
The HTTP protocol (RFC 2616) defines a simple framework for access authentication schemes, but you don't HAVE to use THAT framework.
In other words: you're not bound to the usual WWW-Auth. You only MUST indicate HOW your webapp does its authorization and offer the corresponding data in the header, that's all. According to the specs, using a 401, you can choose your own poison of authorization! And that's where your "webapp" can do what YOU want it to do when it comes to the 401 header and your authorization implementation.
Don't let the specs confuse you, thinking you HAVE to use the usual HTTP authentication scheme. You don't! The only thing the specs really enforce: you just HAVE/MUST identify your webapp's authentication scheme and pass on related parameters to enable the requesting party to start potential authorization attempts.
And if you're still unsure, I can put all this into a simple but understandable perspective: let's say you're going to invent a new authorization scheme tomorrow, then the specs allow you to use that too. If the specs would have restricted implementation of such newer authorization technology implementations, those specs would've been modified ages ago. The specs define standards, but they do not really limit the multitude of potential implementations.
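As a sketch of what that could look like, here is a Python helper that builds a 401 with a WWW-Authenticate challenge naming a custom scheme. The scheme name "FormLogin" and the realm value are invented for illustration; they are not registered or standard, and how any particular client reacts to an unknown scheme is not something this sketch covers.

```python
from typing import Dict, Tuple


def unauthorized_response(scheme: str, realm: str) -> Tuple[int, Dict[str, str]]:
    """Return (status, headers) for a 401 with a single challenge."""
    # The challenge names the scheme and passes its parameters, as the
    # quoted spec text requires.
    challenge = f'{scheme} realm="{realm}"'
    return 401, {"WWW-Authenticate": challenge}
```

Calling `unauthorized_response("FormLogin", "members")` yields the header value `FormLogin realm="members"`. The response body can still carry the login form for human visitors.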
Your "TL;DR" doesn't match the "TL" version.
The proper response for requesting a resource that you need authorization to request, is 401.
302 is not the proper response, because, in fact, the resource is not available some place else. The original URL was correct, the client simply didn't have the rights. If you follow the redirect, you do not actually get what you're looking for. You get dropped in to some ad hoc workflow that has nothing to do with the resource.
403 is incorrect. 403 is the "can't get there from here" error. You simply can't see this, I don't care who you are. Some would argue 403 and 404 are similar. The difference is simply with 403, the server is saying "yea, I have it, but you can't", whereas 404 says "I know nothing about what you're talking about." Security wonks would argue that 404 is "safer". Why tell them something they don't need to know.
The problem you are encountering has nothing to do with REST or HTTP. Your problem is trying to set up some stateful relationship between the client and server, manifested in the end via some cookie. The whole resource -> 302 -> Login page is all about user experience using the hack that's known as the Web Browser, which happens to be both, in stock form, a lousy HTTP client and a lousy REST participant.
HTTP has an authorization mechanism. The Authorization header. The user experience around it, in a generic browser, is awful. So no one uses it.
So there is no proper HTTP response (well, there is, 401, but you don't/can't use that). There is no proper REST response, as REST typically relies on the underlying protocol (HTTP in this case, but we've tackled that already).
So. 302 -> 200 for the login page is all she wrote. That's what you get. If you weren't using the browser, or did everything via XHR or some other custom client, this wouldn't be an issue. You'd just use Authorization header, follow the HTTP protocol, and leverage a scheme like either DIGEST or what AWS uses, and be done. Then you can use the appropriate standards to answer questions like these.
As you point out, 403 Forbidden is explicitly defined with the phrase "Authorization will not help", but it is worth noting that the authors were almost certainly referring here to HTTP authorization (which will indeed not help as your site uses a different authorization scheme). Indeed, given that the status code is a signal to the user agent rather than the user, such a code would be correct insofar as any authorization the agent attempts to provide will not assist any further with the required authorization process (c.f. 401 Unauthorized).
However, if you take that definition of 403 Forbidden literally and feel it is still inappropriate, perhaps 409 Conflict might apply? As defined in RFC 2616 §10.4.10:
The request could not be completed due to a conflict with the current
state of the resource. This code is only allowed in situations where
it is expected that the user might be able to resolve the conflict
and resubmit the request. The response body SHOULD include enough
information for the user to recognize the source of the conflict.
Ideally, the response entity would include enough information for the
user or user agent to fix the problem; however, that might not be
possible and is not required.
There is indeed a conflict with the current state of the resource: the resource is in a "locked" state and such conflict can only be "resolved" through the user providing their credentials and resubmitting the request. The body will include "enough information for the user to recognize the source of the conflict" (it will state that they are not logged-in) and indeed will also include "enough information for the user or user agent to fix the problem" (i.e. a login form).
Your Answer:
401 Unauthorized especially if you do not care or will not be redirecting people to a login page
-or-
302 Found to imply there was the resource but they need to provide credentials to be returned to it. Do this only if you will be using a redirect and make sure to provide appropriate information in the body of the response.
Other Suggestions:
401 Unauthorized is generally used for resources the user does not have access to after handling authentication.
403 Forbidden is a little obscure to me in honesty. I use it when I lock down resources from the file system level, and like your post said, "authorization does not help".
400 Bad Request is inappropriate as needing to login does not represent malformed syntax.
I believe 401 is the correct status code to return from failed authorization. Reference RFC 2616 section-14.8
It reads "A user agent that wishes to authenticate itself with a server-- usually, but not necessarily, after receiving a 401 response"

Non-SEO anti-spoofing external link redirect: Status code?

I've read several documents on the merits of the different HTTP redirect status codes, but those've all been very SEO-centric. I've now got a problem where search-engines don't factor in, because the section of the site in question is not publicly viewable.
However, we do want our website to be as accurate / helpful with meta-data as possible, especially for accessibility reasons.
Now, our application takes external links provided by third parties and routes them across an anti-spoofing page with a disclaimer. Since this redirector page can effectively also be embedded via an Ajax call in certain constellations, we also want to strip any query parameters from the referer (for privacy purposes; the target site has no business finding out what internal page the user was on before).
To do this, the confirmation button triggers a server-side script which in turn redirects (rather than just opening the page for the user).
So much as to why our anti-spoofing disclaimer page ends up triggering a redirect.
The question is:
Does it effectively make any difference which status code I use? Do non-typical browsers (e.g. screen-readers) care? If so, what's the best practise for such redirects? The most semantically sound, if you so will? They all seem various degrees of insincere to me.
I'm thinking of a 302, but as it makes no sense to bookmark the page (it's protected with a CSRF token), there's probably no harm in a 301 either, is there? So I'm wondering if there's a reason to prefer one over the other.
Hmm. Here's the list. 301 sounds okay (emphasis mine):
The requested resource has been assigned a new permanent URI and any future references to this resource SHOULD use one of the returned URIs. Clients with link editing capabilities ought to automatically re-link references to the Request-URI to one or more of the new references returned by the server, where possible.
302 doesn't fit in my opinion:
The requested resource resides temporarily under a different URI
However, my favourite is 303 see other:
The response to the request can be found under a different URI and SHOULD be retrieved using a GET method on that resource. This method exists primarily to allow the output of a POST-activated script to redirect the user agent to a selected resource. The new URI is not a substitute reference for the originally requested resource.
But that might be so rare (I've never seen it used in the wild) that some clients may not understand it - which would render your desire for maximum compatibility moot. 301 is probably the closest choice.
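Since the confirmation button triggers a server-side script, the request method on the follow-up request matters too. Here is a Python sketch of how browsers commonly pick that method. It is a simplified model of widely implemented behavior, not of what RFC 2616 strictly mandates, and the function name is made up.

```python
def method_after_redirect(status: int, method: str) -> str:
    """Return the method a typical browser uses for the redirected request."""
    method = method.upper()
    # In practice, a POST answered with 301 or 302 is usually retried as GET.
    if status in (301, 302) and method == "POST":
        return "GET"
    # 303 explicitly asks for the new URI to be retrieved with GET.
    if status == 303 and method not in ("GET", "HEAD"):
        return "GET"
    # 307 (and everything else here) keeps the original method.
    return method
```

So if the confirmation is a POST, a 303 states the intended "now GET the external page" behavior outright, while 301 and 302 only get there because of how clients happen to behave.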