cflocation vs cfheader for 301 redirects

I am "renaming" an existing file for a project I am working on. To maintain backwards compatibility, I am leaving a cfm file in place to redirect the users to the new one.
buy.cfm: old
shop.cfm: new
To keep everything as clean as possible, I want to send a 301 status code in the response if a user tries to go to buy.cfm.
I know that I can use either cflocation with the statuscode attribute
<cflocation url="shop.cfm" statuscode="301" addtoken="false">
or I can use the cfheader tags.
<cfheader statuscode="301" statustext="Moved permanently">
<cfheader name="Location" value="http://www.mysite.com/shop.cfm">
Are there any reasons to use one method over the other?

I think they do the same thing, with <cflocation> being more readable.

I tested this on ColdFusion 9.
There is one major difference: cflocation stops execution of the page and then redirects to the specified resource.
From the Adobe ColdFusion documentation:
Stops execution of the current page and opens a ColdFusion page or HTML file.
So you would need to do this:
<cfheader statuscode="301" statustext="Moved permanently">
<cfheader name="Location" value="http://www.example.com/shop.cfm">
<cfabort>
to get the equivalent of this:
<cflocation url="shop.cfm" statuscode="301" addtoken="false">
Otherwise, you risk running into issues if other code runs after the cfheader tag. I came across this when fixing some code where redirects were inserted into an application.cfm file -- using cfheader -- without aborting the rest of the page processing.
I also noticed in the response headers that cflocation sets the following headers:
Cache-Control: no-cache
Pragma: no-cache
If you use the cfheader tag with Location, you may want to add these headers yourself:
<cfheader name="Cache-Control" value="no-cache">
<cfheader name="Pragma" value="no-cache">
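For readers who think in terms of the raw HTTP response rather than CFML tags, here is a minimal sketch of the same idea as a tiny Python WSGI app. It is not ColdFusion code, and the `app` and `call` names are made up for illustration. The early `return` plays the role of `<cfabort>`, and the two no-cache headers mirror what cflocation was observed to send above.

```python
def app(environ, start_response):
    """Tiny WSGI app: permanently redirect the old page, serve the new one."""
    if environ.get("PATH_INFO") == "/buy.cfm":
        headers = [
            ("Location", "/shop.cfm"),
            # The two headers cflocation was observed to add in the answer above.
            ("Cache-Control", "no-cache"),
            ("Pragma", "no-cache"),
        ]
        start_response("301 Moved Permanently", headers)
        # Returning here is the equivalent of <cfabort>: nothing below runs.
        return [b""]
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"shop page"]


def call(path):
    """Invoke the app directly and capture status, headers, and body."""
    captured = {}

    def start_response(status, headers):
        captured["status"] = status
        captured["headers"] = dict(headers)

    body = b"".join(app({"PATH_INFO": path}, start_response))
    return captured["status"], captured["headers"], body
```

Calling `call("/buy.cfm")` gives the 301 with an empty body, while `call("/shop.cfm")` gives the normal page.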

To elaborate on the answer by Andy Tyrone: while they may do the same thing in certain circumstances, the cfheader method gives you more control over the headers sent in the response. This is useful, for example, if you want to send cache-control headers to a browser or content delivery network so that they do not keep hitting your server with the same old redirect request. There is no way (to my knowledge) to tell cflocation to make the redirect cacheable.
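To make the "more control" point concrete, here is a small language-neutral sketch in Python of the header set you might build by hand. The function name and the `max_age` parameter are assumptions for illustration, and the one-day lifetime in the usage note is an arbitrary example value.

```python
def permanent_redirect_headers(location, max_age=None):
    """Build the status line and headers for a 301 response.

    With max_age=None the redirect is marked no-cache (the behaviour
    observed for cflocation); with a number of seconds it is marked
    cacheable so browsers and CDNs can reuse it.
    """
    headers = [("Location", location)]
    if max_age is None:
        headers.append(("Cache-Control", "no-cache"))
    else:
        headers.append(("Cache-Control", f"public, max-age={max_age}"))
    return "301 Moved Permanently", headers
```

For example, `permanent_redirect_headers("http://www.example.com/shop.cfm", max_age=86400)` asks caches to remember the redirect for a day. In CFML that would be one extra cfheader tag for Cache-Control.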

Related

Where do I place a 301 redirect when using ColdFusion?

I found this code for 301 redirects in ColdFusion:
<cfheader statuscode="301" statustext="Moved Permanently">
<cfheader name="Location" value="[the URL to be redirected to]">
<cfabort>
What file do I place this code in? Is it the "missing page" that is now supposed to be giving a 301 error when someone lands on it? Or is there a file that's similar to .htaccess that I should put it in?

First of all: 3xx status codes are not errors but redirects.
Your code snippet isn't wrong, but ColdFusion has a more convenient way to do these 3 lines with a single statement:
<cflocation url="[the URL to be redirected to]" statusCode="301">
You can put this tag anywhere in your .cfm template. ColdFusion executes everything up to this point and then stops execution, sets the response header accordingly, discards the output buffer (because 3xx are not supposed to contain a body) and transmits the response (header with location reference).
Note: Your code snippet would include content in the response body (e.g. everything you put in <cfoutput> tags), which is usually not desired. So I strongly recommend using the cflocation tag for common redirects. It also protects you from forgetting to place <cfabort> after it.
For a common scenario like "redirect visitor from a no longer existing page to a new page", you can simply do this:
no_longer_existing_page.cfm
<cflocation url="the_new_page.cfm" statusCode="301" addToken="false">
the_new_page.cfm
<cfoutput>Hello World !!</cfoutput>
Requests to both pages now end up at the_new_page.cfm and return Hello World !!. (This is a redirect, not a rewrite, so the address in the browser will change to the_new_page.cfm in both cases.)
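The browser's side of that scenario can be sketched in a few lines of Python. This is only a simulation: the `REDIRECTS` and `PAGES` tables and the `fetch` function are hypothetical stand-ins for the server and the browser, not anything ColdFusion provides.

```python
# Hypothetical stand-ins for the two templates above.
REDIRECTS = {"/no_longer_existing_page.cfm": "/the_new_page.cfm"}
PAGES = {"/the_new_page.cfm": "Hello World !!"}


def fetch(path, max_hops=5):
    """Follow redirects the way a browser would; return (final_path, body)."""
    for _ in range(max_hops):
        if path in REDIRECTS:
            # A 3xx response: the browser issues a new request to the
            # Location target, so the address it shows changes as well.
            path = REDIRECTS[path]
            continue
        return path, PAGES[path]
    raise RuntimeError("too many redirects")
```

Both `fetch("/no_longer_existing_page.cfm")` and `fetch("/the_new_page.cfm")` return the new path together with the page body, which is the redirect (not rewrite) behaviour described above.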

Redirect or forward

Looking through some legacy code I have in front of me that uses Struts 1, I see:
<global-forwards>
...
<forward name="accessDenied" path="/www/jsp/AccessDeniedForm.do" redirect="true" />
</global-forwards>
So it's just a global forward that sends the user to an access denied page.
I am curious about the decision to redirect as opposed to forward. What are the advantages and disadvantages of using it?

What are the pros and cons of using it?
Before discussing the pros and cons of using that forward element with redirect set to true, let's understand what is actually going on with that configuration. When redirect is set to true in the forward element, a redirect instruction is issued to the user agent so that a new request is made for this forward's resource. This link will probably provide the detailed information that you need.
The default value for redirect is false: when the forward element is invoked, the request is forwarded on the server to the specified path and that's it. If you set redirect to true, the browser makes another request. With that said, you probably have an idea of the pros and cons if you really want to use it.

With a redirect, control can be directed to a different server or even another domain name. The redirect takes a round trip: when a redirect is issued, it is sent back to the client, and the redirected URL in the header instructs the browser to move to the next URL. This acts as a new request, and all the request and response data is lost.
With a forward, the forwarding is done on the server side and the URL in the client browser does not change. The data is not lost either. Because the URL stays the same, a browser page refresh repeats the original request, so whatever data was posted in the first submit is resubmitted. Use it with caution.
Forward and redirect are used in different scenarios. The global forward should be a redirect because it is an error situation.
A redirect is slower because it needs a round trip. Forwards are faster.
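The data-loss difference can be shown with a toy model. This is a plain Python sketch, not the Struts or Servlet API: the request is just a dict, and `/secret.do` and the `reason` attribute are invented for the example.

```python
def access_denied_page(request):
    """Target page: reports the URL the browser shows and the data it can see."""
    return {"url": request["url"], "reason": request["attributes"].get("reason")}


def forward(request, path):
    """Server-side forward: the same request object is handed to the target.

    The browser never learns about `path`, so its URL stays unchanged and
    request attributes set earlier are still available.
    """
    return access_denied_page(request)


def redirect(request, path):
    """Client-side redirect: the browser makes a brand-new request to `path`.

    Nothing stored on the old request survives the round trip.
    """
    new_request = {"url": path, "attributes": {}}
    return access_denied_page(new_request)
```

Starting from `{"url": "/secret.do", "attributes": {"reason": "not logged in"}}`, the forward still sees the reason while the browser keeps showing /secret.do. The redirect shows the access denied URL but the reason is gone, so anything the target page needs would have to travel in the URL or the session.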

If you specify redirect="true", Struts uses a client-side redirect [response.sendRedirect()]. The JSP will be invoked by a new browser request, and any data stored in the old request will be lost.

URLRewriteFile and "#" char in URL string

I'm using the Google means of making my GWT app searchable (https://developers.google.com/webmasters/ajax-crawling/docs/getting-started), which works fine. Unfortunately, it seems Bing does not follow the same pattern/rule.
I thought I'd add a URL filter, based on user-agent, to map all URLs of the form
http://www.example.com/#!blah=something
to
http://www.example.com/?_escaped_fragment_=blah=something
only for BingBot so that my CrawlerServet returned the same as the GoogleBot requests. I have a URLRewrite rule like:
<rule>
<condition name="user-agent">Firefox/8.0</condition>
<from use-query-string="true">^(.*)#!(.*)$</from>
<to type="redirect">?_escaped_fragment_=$2</to>
</rule>
(I'm using a user-agent of Firefox to test)
This never matches. If I change the rule to ^(.*)!(.*)$ and try to match on
http://www.example.com/!blah=something
it will work, but using the same rule
http://www.example.com/#!blah=something
will not work, because it seems the URL string the filter is using is truncated at the "#".
Can anyone tell me if it's possible to make this work?

The browser doesn't send the hash to the server, as you've discovered. Watching a given request, you'll see that it only sends along the url before the # symbol.
GET / HTTP/1.1
Host: example.com
...
From the link you mentioned:
Hash fragments are never (by specification) sent to the server as part of an HTTP request. In other words, the crawler needs some way to let your server know that it wants the content for the URL www.example.com/ajax.html#!key=value (as opposed to simply www.example.com/ajax.html).
From the descriptions in the text, it is the server's job to translate from the 'ugly' url to a pretty one (with a hash), and to send back a snapshot of what that page might look like if loaded with a hash on the client. That page may have other links using hashes to load other documents - the crawler will automatically translate those back to ugly urls, and request more data from the server.
So in short, this is not a change you should need to make; GoogleBot will make it automatically, provided you have opted into using hash fragments. As for other bots, apparently Bing now supports this idea as well, but that appears to be outside the scope of your question.
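A quick way to see why the filter never sees the "#" is to split the URL the way a client does before building the request. The sketch below uses Python's standard urllib.parse. The two helper names are made up, and the escaping in `to_escaped_fragment_url` is a simplification of the crawling scheme's rules (it only keeps "=" literal), so treat it as an illustration of the mapping rather than a complete implementation.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def request_target(url):
    """Return the path and query a client puts on the HTTP request line."""
    parts = urlsplit(url)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    return target  # parts.fragment is never sent to the server


def to_escaped_fragment_url(url):
    """Map a #! URL to the _escaped_fragment_ form a crawler would request."""
    parts = urlsplit(url)
    if not parts.fragment.startswith("!"):
        return url  # not a hash-bang URL, nothing to translate
    value = quote(parts.fragment[1:], safe="=")
    extra = "_escaped_fragment_=" + value
    query = parts.query + "&" + extra if parts.query else extra
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, ""))
```

For `http://www.example.com/#!blah=something`, `request_target` returns just `/`, which matches the GET line shown above. The translation to the `?_escaped_fragment_=` form therefore has to happen on the client or crawler side, where the fragment is still known.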

How should I update a REST resource?

I'm not sure how I should go about updating individual properties of a REST resource. Consider the following example:
# HTTP GET to /users/1.xml
<?xml version="1.0" encoding="UTF-8" ?>
<response>
<user>
<id>1</id>
<name>John Doe</name>
<email>john@doe.com</email>
</user>
</response>
How should I support updating John's email? HTTP PUT comes to mind, but I'd be making it hard on my clients by requiring a complete XML document (matching the HTTP GET response) to modify the resource.
The PUT method requests that the enclosed entity be stored under the supplied Request-URI. If the Request-URI refers to an already existing resource, the enclosed entity SHOULD be considered as a modified version of the one residing on the origin server.
Is there any other way?

If your server framework is flexible enough to handle it, you can do:
Request:
PUT /users/1/email
Content-Type: text/plain
john@newemail.com
Response:
200 OK
Content-Location: /users/1
By using a URL to refer to the email as its own resource, you can PUT directly to it using a simple format like text/plain. In the response, the Content-Location url gives the client an indication that the change has had an impact on the user resource.
The PATCH method is another way to do partial updates. This is a newly introduced method and as yet there are no standard formats for sending XML diff documents, so if you take this approach you will not find much guidance.
The other thing to consider is that REST works best with large grained updates. If you find yourself needing to make these kinds of small changes, then maybe you need to rethink your distributed architecture.
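As a rough sketch of the sub-resource approach, here is what a server-side handler for `PUT /users/1/email` might look like. It is written in Python against an in-memory dict. The `users` store and the `put_user_field` function are hypothetical, and a real framework would supply the routing and response objects.

```python
# Hypothetical in-memory store standing in for the real user table.
users = {1: {"id": 1, "name": "John Doe", "email": "john@doe.com"}}


def put_user_field(store, user_id, field, body):
    """Handle PUT /users/{user_id}/{field} with a text/plain body.

    Returns (status_code, response_headers).
    """
    user = store.get(user_id)
    if user is None or field not in user or field == "id":
        return 404, {}  # unknown user, unknown field, or read-only id
    user[field] = body.strip()
    # Content-Location tells the client which resource was affected.
    return 200, {"Content-Location": f"/users/{user_id}"}
```

Calling `put_user_field(users, 1, "email", "john@newemail.com")` updates only the email and points the client back at /users/1, so it never has to send the full XML document.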

In what situation does the HTTP_REFERER not work?

I have used referrer before in foo.php to decide whether the page iframing foo.php is of a particular URL. (using $_SERVER['HTTP_REFERER'])
It turned out that most of the time it worked (about 98% of the time), but it also seemed like some users arrived at the page and $_SERVER['HTTP_REFERER'] was not set in foo.php, which broke the code. [update: These users claimed that they followed the usual page flow and didn't use the URL of foo.php all by itself in the browser (that they let it be an iframe), and that they never altered their browser settings.]
I wonder what the reasons are that it could happen?

The HTTP/1.1 RFC does not make it mandatory to send an HTTP referer header. You can't make any assumptions about its presence when writing robust code; perfectly conformant browsers may not include it.
Moreover, the RFC advises that "The Referer field MUST NOT be sent if the Request-URI was obtained from a source that does not have its own URI, such as input from the user keyboard", and "We suggest, though do not require, that a convenient toggle interface be provided for the user to enable or disable the sending of From and Referer information".
The latter is not very common (though some browsers have a "Private" mode that fulfils the requirements). More likely for your 2% is that people bookmarked the URL, which fulfils the first criterion (URI obtained from a source without a URI), and so the browser sends no referer.

Not by default AFAIK, but it's easy to turn it off (for privacy), e.g. in Firefox via about:config, and surely some users could be using browsers distributed to them (e.g. by their IT department) with such settings. So you should try to avoid relying on REFERER for any important functionality (also because it's misspelled, of course ;-).
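In practice that means treating a missing referer as "unknown" rather than as an error. Here is a minimal sketch in Python, where a WSGI-style `environ` dict plays the role of PHP's `$_SERVER`. The `framed_by` name and the three-way return value are just one possible design, not an established API.

```python
from urllib.parse import urlsplit


def framed_by(environ, expected_host):
    """Check the Referer without assuming it exists.

    Returns True or False when the header is present, and None when the
    client sent no Referer at all, so the caller can fall back gracefully.
    """
    referer = environ.get("HTTP_REFERER")
    if not referer:
        return None  # header absent: we simply do not know
    return urlsplit(referer).hostname == expected_host
```

The PHP equivalent of the important line would be checking `isset($_SERVER['HTTP_REFERER'])` before reading it. The caller then decides what the `None` case should do, for example showing the page without the iframe-specific behaviour.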