URL rewriting with a hash - hash

So, I am new to url rewriting, and was wondering if it is possible to rewrite the url when a hash comes into play. For instance, I have the following URL:
http://testdomain.com/#/test.php
Is there a way to rewrite this to:
http://testdomain.com/index.php?url=test.php
I am not sure whether the hash effectively makes this 'invisible' on the php side. I tried to capture the request URL this way, and it does not read anything from the hash on.

After reading more up on this, I have found it not to be possible, as the hash is not sent to the server, and is client side only.

Related

Make anchor in link url available in template

If I set a link in Backend to a specific content element on a page (http://www.test.url/example#999), is there any chance to get this anchor-id in the target template?
As already mentioned, the hash-value is not part of the http request.
Here is an example on how to access the hash on the client side via JavaScript: How to get Url Hash (#) from server side
It is of course possible to convert the hash to a get or post request on the client side.

Express : Server-Side Routing - redirect() and updating header "Location"

I'm asking this one for the record:
So I have a client making an Ajax call and I'm trying to have the server handle it and redirect the client server-side.
The express docs make it seem res.redirect(path) is going to actually send a response from the server for the client to redirect(re-route).
e.g.
var path = 'http://localhost:8080/newRoute';
res.redirect(path);
//the client will now go to http://localhost:8080/newRoute
But it appears that this only tells the client to make another request to
the url given.(Which seems useless, but that is what my network requests are showing currently).
Many suggest to do the following to do an actual redirect server-side
var path = 'http://localhost:8080/newRoute';
response.writeHead(302, {'Location': path});
response.end();
So does this mean that that we need to change the header in order for the redirect work?
i.e.
res.location('http://localhost:8080/newRoute');
res.redirect('http://localhost:8080/newRoute');
But the above looks horribly redundant and makes res.redirect look like it wasn't intended for server-side redirects to a new page.
Yet the Express docs show an example like this:
res.redirect('http://google.com');
which I don't know how that could be interpreted any other way than "send the client to the page 'http://google.com' ".
Big Question:
So is res.redirect(path) suppose to handle server-side redirects? If not, what do we do?

Redirect or forward

Looking through some legacy code I have in front of me using struts one, I see:
<global-forwards>
...
<forward name="accessDenied" path="/www/jsp/AccessDeniedForm.do" redirect="true" />
</global-forwards>
So it's just a global forward to send to a access denied page.
I am curious about the decision to redirect as opposed to forward. What are the advantages and disadvantages of using it?
What are the pro's and con's of using it?
Before discussing pro's and con's of using that forward element with redirect set to true, let's understand what is actually going on with that configuration. When redirect is set to true in the forward element, a redirect instruction should be issued to the user-agent so that a new request is issued for this forward's resource. This link will probably provide detail information that you need.
The default value for redirect is to false, essentially when the forward element is called, it forward to that path specified and that's it. If you are setting redirect to true, take for example, the browser will make another request. So I think with these said, you probably know or have an idea the pro and con if you really want to use it.
In redirect, the control can be directed to different servers or even another domain name.The redirect takes a round trip.When a redirect is issued , it is sent back to the client , and redirected URL information is in the header instructing the browser to move to the next URL. This will act as a new request and all the request and response data is lost.
In forward , the forwarding is done from server side , the client browser URL do not change.the data is also not lost.It is just like a browser page refresh. Whatever data posted in the first submit is resubmitted again.So use it with caution.
Both forward and redirect are used in different scenarios ,the global forward should be redirect because it is an error situation.
Redirect is slower as it needs a roundtrip.Forwards are faster.
If you specify
redirect="true", Struts uses a client-side redirect
[response.sendRedirect()]
. The JSP will be invoked by a new browser request, and any data stored in the old request will be lost.

URLRewriteFile and "#" char in URL string

I'm using the Google means of making my GWT app searchable (https://developers.google.com/webmasters/ajax-crawling/docs/getting-started), which works fine. Unfortunately, it seems Bing does not follow the same pattern/rule.
I thought I'd add a URL filter, based on user-agent to map all URL's of the form
http://www.example.com/#!blah=something
to
http://www.example.com/?_escaped_fragment_=blah=something
only for BingBot so that my CrawlerServet returned the same as the GoogleBot requests. I have a URLRewrite rule like:
<rule>
<condition name="user-agent">Firefox/8.0</condition>
<from use-query-string="true">^(.*)#!(.*)$</from>
<to type="redirect">?_escaped_fragment_=$2</to>
</rule>
(I'm using a user-agent of Firefox to test)
This never matches. If I change the rule to ^(.)!(.)$ and try and match on
http://www.example.com/!blah=something
it will work, but using the same rule
http://www.example.com/#!blah=something
will not work, because it seems the URL string the filter is using is truncated at the "#".
Can anyone tell me if it's possible to make this work.
The browser doesn't send the hash to the server, as you've discovered. Watching a given request, you'll see that it only sends along the url before the # symbol.
GET / HTTP/1.1
Host: example.com
...
From the link you mentioned:
Hash fragments are never (by specification) sent to the server as part of an HTTP request. In other words, the crawler needs some way to let your server know that it wants the content for the URL www.example.com/ajax.html#!key=value (as opposed to simply www.example.com/ajax.html).
From the descriptions in the text, it is the server's job to translate from the 'ugly' url to a pretty one (with a hash), and to send back a snapshot of what that page might look like if loaded with a hash on the client. That page may have other links using hashes to load other documents - the crawler will automatically translate those back to ugly urls, and request more data from the server.
So in short, this is not an change you should need to make, the GoogleBot will make it automatically, provided you have opted into using hash fragments. As for other bots, apparently Bing now supports this idea as well, but that appears to be outside the scope of your question.

HttpClient - getting incorrect page source

I used HttpClient and GetMethod to get the page source of the URL :
http://www.google.com/finance?chdnp=1&chdd=1&chds=1&chdv=1&chvs=Logarithmic&chdeh=0&chdet=1264263288788&chddm=391&chddi=120&chls=Ohlc&q=NSE:.NSEI&
But somehow I always end up getting page source of :
http://www.google.com/finance?q=NSE:.NSEI
Can anyone tell me why and how to get page source of the former URL?
I'm going to go out on a limb here and assume that what's going on is that your HttpClient implementation handles HTTP redirects internally and so when you call GetMethod on the first URL, the server (google.com) is probably sending back an HTTP redirect (302, or 301) response for the second URL which is what you end up getting back.
The reason for that is probably that the first URL requires some sort of cookie which you're not providing when you make your request. The best way to determine exactly what happens when you make the request that way is to use a tool such as WireShark or Fiddler to analyse the HTTP request/response sequence from your HttpClient and that of a normal request made using FireFox or IE and see what exactly is different.