URL Fragment and 302 redirects - redirect

It's well known that the URL fragment (the part after the #) is not sent to the server.
I do wonder though how fragments work when a server redirect (via HTTP status 302 and Location: header) is involved.
My question is really two-fold:
If the original URL had a fragment (/original.php#foo), and a redirect is made to /new.php, does the fragment part of the original URL simply get lost? Or does it sometimes get applied to the new URL? Will the new URL ever be /new.php#foo in this case?
Regardless of the original URL, if the server redirects to a new URL with a fragment (/new.php#foo), will the fragment get "honored"? Or does the server really have no business interfering with the fragment at all -- and will the browser therefore ignore it by simply going to /new.php??

Update 2022-09-22:
RFC 9110/STD 97 HTTP Semantics (which obsoletes 7231 (and others)), has been published as an INTERNET STANDARD in June 2022. The wording in the newly numbered Section 10.2.2 Location stays the same as before/below.
Update 2014-Jun-27:
RFC 7231, Hypertext Transfer Protocol (HTTP/1.1): Semantics and Content, has been published as a PROPOSED STANDARD. From the Changelog:
The syntax of the Location header field has been changed to allow all
URI references, including relative references and fragments, along
with some clarifications as to when use of fragments would not be
appropriate. (Section 7.1.2)
The important points from Section 7.1.2. Location:
If the Location value provided in a 3xx (Redirection) response does
not have a fragment component, a user agent MUST process the
redirection as if the value inherits the fragment component of the URI
reference used to generate the request target (i.e., the redirection
inherits the original reference's fragment, if any).
For example, a
GET request generated for the URI reference
"http://www.example.org/~tim" might result in a 303 (See Other)
response containing the header field:
Location: /People.html#tim
which suggests that the user agent redirect to
"http://www.example.org/People.html#tim"
Likewise, a GET request generated for the URI reference
"http://www.example.org/index.html#larry" might result in a 301 (Moved
Permanently) response containing the header field:
Location: http://www.example.net/index.html
which suggests that the user agent redirect to
"http://www.example.net/index.html#larry", preserving the original
fragment identifier.
This should clearly answer your questions.
Update END
this is an open (not specified) issue with the current HTTP specification. it is addressed in 2 issues of the IETF httpbis working group:
#6: Fragments allowed in Location
#43: Fragment combination / precedence during redirects
#6 allows fragments in the Location header. #43 says this:
I just tested this with various browsers.
Firefox and Safari use the fragment in the location header.
Opera uses the fragment from the source URI, when present, otherwise the fragment from the redirect location
IE (8) ignores the fragment in the location URI, thus will use the fragment from the source URI, when present
Proposal:
"Note: the behavior when fragment identifiers from the original URI and the redirect need to be combined is undefined; current User Agents indeed differ on what fragment takes precedence."
[...]
It appears that IE8 does use the fragment idenfitier from Location (the behavior I saw might be limited to localhost).
Thus we seem to have consistent behavior for Safari/IE/Firefox/Chrome (just tested), in that the fragment from the Location header gets used, no matter what the original URI was.
I therefore change my proposal to document that as expected behavior.
this leads to the most browser compatible and future proof (because this issue will eventually get standardized) answer to your question:
A: fragments from original URLs get discarded.
B: fragments from the Location header are honored.

Safari 5 and IE9 and below drop the original URI's fragment if a HTTP/3xx redirect occurs. If the Location header on the response specifies a fragment, it is used.
IE10+, Chrome 11+, Firefox 4+, and Opera will all "reattach" the original URI's fragment after following a 3xx redirection.
Test page: http://www.webdbg.com/test/redir/fragment/.
See further discussion of this issue at http://blogs.msdn.com/b/ieinternals/archive/2011/05/17/url-fragments-and-redirects-anchor-hash-missing.aspx

Just to let you know, here you can find proper spec. by w3c defining how all should behave: http://www.w3.org/TR/cuap#uri - clause 4.1 - see below:
When a resource (URI1) has moved, an HTTP redirect can indicate its
new location (URI2).
If URI1 has a fragment identifier #frag, then the new target that the
user agent should be trying to reach would be URI2#frag. If URI2
already has a fragment identifier, then #frag must not be appended and
the new target is URI2.
Wrong: Most current user agents do implement HTTP redirects but do not
append the fragment identifier to the new URI, which generally
confuses the user because they end up with the wrong resource.
References:
HTTP redirects are described in section 10.3 of the HTTP/1.1
specification [RFC2616]. The required behavior is described in detail
in "Handling of fragment identifiers in redirected URLs" [RURL]. The
term "Persistent Uniform Resource Locator (PURL)" designates a URL (a
special case of a URI) that points to another one through an HTTP
redirect. For more information, refer to "Persistent Uniform Resource
Locators" [PURL]. Example:
Suppose that a user requests the resource at
http://www.w3.org/TR/WD-ruby/#changes and the server redirects the
user agent to http://www.w3.org/TR/ruby/. Before fetching that latter
URI, the browser should append the fragment identifier #changes to it:
http://www.w3.org/TR/ruby/#changes.

Posting similar issue with the solution faced by me.
Hope it helps to someone with the similar requirement of preserving hash in IE for 302 redirects.
Adding essential parts of the answer instead of links alone
We use SiteMinder authentication in our application.
I figured out that after successful authentication, SiteMinder is doing 302 redirection to user requested application page by using login form hidden variable value (where it stores user requested URL /myapp/ - without hash fragment since it won't be sent to the server) with name similar to redirect. Sample form below
Since redirect hidden variable value contains only /myapp/ without hash fragment and it's a 302 redirect, the hash fragment is automatically removed by IE even before coming to our application and whatever the solutions we are trying in our application code are not working out.
IE is redirecting to /myapp/ only and it is landing on default home page of our app https://ourapp.com/myapp/#/home.
Have wasted almost a day to figure out this behavior.
The solution is:
Have changed the login form hidden variable (redirect) value to hold the hash fragment by
appending window.location.hash along with existing value. Similar to below code
$(function () {
var $redirect = $('input[name="redirect"]');
$redirect.val($redirect.val() + window.location.hash);
});
After this change, the redirect hidden variable is storing user requested URL value as /myapp/#/pending/requests and SiteMinder is redirecting it to /myapp/#/pending/requests in IE.
The above solution is working fine in all the three browsers Chrome, Firefox and IE.
Thanks to #AlexFord for the detailed explanation and providing solution to this issue.

Related

HTTP Redirect Status Code

I have an ASP.NET website. A user can access the URL /partners/{partner-id} in my app. When that url is invoked, I do two things:
1) I want to log the partner ID and user that requested it and
2) Redirect the url to the partner's website.
My question is, which HTTP Status Code should I use? I was using 301. However, that introduced a problem where my logging code was getting skipped. I suspect its because a 301 represents a permanent redirect. However, I basically want to remain the middle man so that I properly log the details.
What HTTP status code should I use?
Thanks!
Taking a look here:
https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html
you should use the 302 status code. Two useful points about the 302 redirect:
Since the redirection might be altered on occasion, the client SHOULD
continue to use the Request-URI for future requests
This says by inferring that the redirect may be temporary, clients should always check the initial URI instead of going to the redirect URI as a default behavior, meaning they will pass through your logging system each time rather than going directly to the redirected URI on subsequent requests. The 302 response also states:
This response is only cacheable if indicated by a Cache-Control or
Expires header field.
By default, the 301 redirect is cacheable unless you explicitly specify, but the 302 is not cacheable unless explicitly specified.
However, it's probably a good idea to explicitly add in 'do not cache' headers to the redirect to let the client know that it should not be cached just in case you have a client that doesn't follow the default spec behavior. There are a number of other answers in stackoverflow regarding this, here's a decent one:
How to control web page caching, across all browsers?

Response.Redirect() vs Response.RedirectPermanent()

I am new to ASP.Net 4.0, and have seen a new feature called Response.RedirectPermanent(). I have checked a few articles, but I'm unable to understand clearly the actual meaning and difference of Response.RedirectPermanent() over Response.Redirect().
According to Gunnar Peipman,
Response.Redirect() returns 302 to browser meaning that asked resource is temporarily moved to other location. Permanent redirect means that browser gets 301 as response from server. In this case browser doesn’t ask the same resource from old URL anymore – it uses URL given by Location header.
Why do I need to check the server response such as 301, 302? And how does it get permanently redirected the page to the server?
301 response (RedirectPermanent) is very useful for SEO purposes. For example, you had a site implemented in ASP.NET WebForms and redesigned using ASP.NET MVC. You'd like to inform search engines that page /Catalog/ProductName.aspx becomes /products/product-name. Then you set 301 redirect from /Catalog/ProductName.aspx to /products/product-name and links in search engines' indices will be replaced. 302 (Redirect) is mostly for internal purposes. For example, the redirect after login (if returnUrl was set in URL).

url shortener 301 redirection understanding

We're working on a URL shortener project in PHP. We're using 301 HTTP redirection and naturally track our links visits. but there is something strange :
After we shorten a URL and go through it by a browser, only the first visit is tracked, and it seems that no other request is sent to our server and it directly goes to the destination URL.(I think this is a browser cache after one try). But :
When trying with a similar service like bitly , it has different treat. some of the same requests on the same browsers are tracked in bitly visit tracking (In fact more than one of them, and I don't understand why, I don't see any logic) while they also use 301 redirection.(at left bottom of browser window sometimes writes "waiting for bit.ly..." and sometimes not , in fact randomly).
Are any tricks included here? What this different treat happens?
Read the HTTP specification. A 301 response tells the browser that the requested resource has permanantly moved to the new URL that is being redirected to, and should not use the original URL anymore:
10.3.2 301 Moved Permanently
The requested resource has been assigned a new permanent URI and
any future references to this resource SHOULD use one of the
returned URIs. Clients with link editing capabilities ought to
automatically re-link references to the Request-URI to one or more
of the new references returned by the server, where possible. This
response is cacheable unless indicated otherwise.
The new permanent URI SHOULD be given by the Location field in the
response. Unless the request method was HEAD, the entity of the
response SHOULD contain a short hypertext note with a hyperlink to
the new URI(s).
If the 301 status code is received in response to a request other
than GET or HEAD, the user agent MUST NOT automatically redirect the
request unless it can be confirmed by the user, since this might
change the conditions under which the request was issued.
Note: When automatically redirecting a POST request after
receiving a 301 status code, some existing HTTP/1.0 user agents
will erroneously change it into a GET request.
For what you are attempting, try using 302, 303, or 307 instead.
10.3.3 302 Found
The requested resource resides temporarily under a different URI.
Since the redirection might be altered on occasion, the client SHOULD
continue to use the Request-URI for future requests. This response
is only cacheable if indicated by a Cache-Control or Expires header
field.
The temporary URI SHOULD be given by the Location field in the
response. Unless the request method was HEAD, the entity of the
response SHOULD contain a short hypertext note with a hyperlink to
the new URI(s).
If the 302 status code is received in response to a request other
than GET or HEAD, the user agent MUST NOT automatically redirect the
request unless it can be confirmed by the user, since this might
change the conditions under which the request was issued.
Note: RFC 1945 and RFC 2068 specify that the client is not allowed
to change the method on the redirected request. However, most
existing user agent implementations treat 302 as if it were a 303
response, performing a GET on the Location field-value regardless
of the original request method. The status codes 303 and 307 have
been added for servers that wish to make unambiguously clear which
kind of reaction is expected of the client.
.
10.3.4 303 See Other
The response to the request can be found under a different URI and
SHOULD be retrieved using a GET method on that resource. This method
exists primarily to allow the output of a POST-activated script to
redirect the user agent to a selected resource. The new URI is not a
substitute reference for the originally requested resource. The 303
response MUST NOT be cached, but the response to the second
(redirected) request might be cacheable.
The different URI SHOULD be given by the Location field in the
response. Unless the request method was HEAD, the entity of the
response SHOULD contain a short hypertext note with a hyperlink to
the new URI(s).
Note: Many pre-HTTP/1.1 user agents do not understand the 303
status. When interoperability with such clients is a concern, the
302 status code may be used instead, since most user agents react
to a 302 response as described here for 303.
.
10.3.8 307 Temporary Redirect
The requested resource resides temporarily under a different URI.
Since the redirection MAY be altered on occasion, the client SHOULD
continue to use the Request-URI for future requests. This response
is only cacheable if indicated by a Cache-Control or Expires header
field.
The temporary URI SHOULD be given by the Location field in the
response. Unless the request method was HEAD, the entity of the
response SHOULD contain a short hypertext note with a hyperlink to
the new URI(s) , since many pre-HTTP/1.1 user agents do not
understand the 307 status. Therefore, the note SHOULD contain the
information necessary for a user to repeat the original request on
the new URI.
If the 307 status code is received in response to a request other
than GET or HEAD, the user agent MUST NOT automatically redirect the
request unless it can be confirmed by the user, since this might
change the conditions under which the request was issued.
Just to note down my comments..
Cache control headers also plays a role on this. If you check with curl or firebug persistant tracking, you can see the cache control headers before the location. bitly is configured to be contacted back if user clicks on the links after 90 seconds.

Non-SEO anti-spoofing external link redirect: Status code?

I've read several documents on the merits of the different HTTP redirect status codes, but those've all been very SEO-centric. I've now got a problem where search-engines don't factor in, because the section of the site in question is not publicly viewable.
However, we do want our website to be as accurate / helpful with meta-data as possible, especially for accessibility reasons.
Now, our application takes external links provided by third parties and routes them across an anti-spoofing page with a disclaimer. Since this redirector page can effectively also be embedded via an Ajax call in certain constellations, we also want to strip any query parameters from the referer (for privacy purposes; the target site has no business finding out what internal page the user was on before).
To do this, the confirmation button triggers a server-side script which in turn redirects (rather than just opening the page for the user).
So much as to why our anti-spoofing disclaimer page ends up triggering a redirect.
The question is:
Does it effectively make any difference which status code I use? Do non-typical browsers (e.g. screen-readers) care? If so, what's the best practise for such redirects? The most semantically sound, if you so will? They all seem various degrees of insincere to me.
I'm thinking of a 302 - but as it makes no sense trying to bookmark the page (it's protected with a crsf token), so there's probably no harm in a 301, either, is there? So I'm wondering if there's a reason for me to prefer the one over the other.
Hmm. Here's the list. 301 sounds okay (emphasis mine):
The requested resource has been assigned a new permanent URI and any future references to this resource SHOULD use one of the returned URIs. Clients with link editing capabilities ought to automatically re-link references to the Request-URI to one or more of the new references returned by the server, where possible.
302 doesn't fit n my opinion:
The requested resource resides temporarily under a different URI
However, my favourite is 303 see other:
The response to the request can be found under a different URI and SHOULD be retrieved using a GET method on that resource. This method exists primarily to allow the output of a POST-activated script to redirect the user agent to a selected resource. The new URI is not a substitute reference for the originally requested resource.
But that might be so rare (I've never seen it used in the wild) that some clients may not understand it - which would render your desire for maximum compatibility moot. 301 is probably the closest choice.

in what situation does the HTTP_REFERER not work?

I have used referrer before in foo.php to decide whether the page iframing foo.php is of a particular URL. (using $_SERVER['HTTP_REFERER'])
It turned out that most of the time, it worked (about 98% of the time), but it also seemed like some users arrived the page and $_SERVER['HTTP_REFERER'] was not set in foo.php and therefore broke the code. [update: These user claimed that they followed the usual page flow and didn't use the URL of foo.php all by itself on the browser (that they let it be an iframe) and the users never altered their browser settings.]
I wonder what the reasons are that it could happen?
The HTTP/1.1 RFC does not make it mandatory to send an HTTP referer header. You can't make any assumptions about its presence when writing robust code; perfectly conforment browsers may not include it.
Moreoever, the RFC advises that "The Referer field MUST NOT be sent if the Request-URI was obtained from a source that does not have its own URI, such as input from the user keyboard", and "We suggest, though do not require, that a convenient toggle interface be provided for the user to enable or disable the sending of From and Referer information".
The later is not very common (though some browsers have a "Private" mode that fulfils the requirements). More likely for your 2% is that people Bookmarked the URL, which fulfils the first criteria (URI obtained from a source without a URI), and so the browser sends no referer.
Not by default AFAIK, but it's easy to turn it off (for privacy) e.g. in Firefox via about:config, and surely some users could be using browsers distributed to them (e.g. by their IT department) with such kinds of setting. So you should try to avoid relying on REFERER for any important functionality (also because it's mis-spelled, of course;-).