Amazon S3 Redirect Rule - Preserve Query Params

I noticed the question "Amazon S3 Redirect rule - GET data is missing", but after following the accepted answer my query params are still not being preserved.
I have a site that uses React and React Router, meaning I have several URLs that load identical HTML and JS and then the JS figures out which part of the app to load based on the URL.
For example:
/foo, /bar, /baz all should load index.html, which loads bundle.js. Then bundle.js observes the URL and routes to some React component (also in bundle.js).
However, no foo, bar, or baz file exists in S3, only index.html. What I want to do is, when I get a 404, redirect to /#!/{URL} (e.g. /foo redirects to /#!/foo). This works fine with my redirect rule (below). However, I also want to bring the query params with me (e.g. /foo?ping=pong redirects to /#!/foo?ping=pong), but instead /foo?ping=pong just redirects to /#!/foo.
Here are my redirect rules:
<RoutingRules>
  <RoutingRule>
    <Condition>
      <HttpErrorCodeReturnedEquals>404</HttpErrorCodeReturnedEquals>
    </Condition>
    <Redirect>
      <Protocol>http</Protocol>
      <HostName>www.mydomain.com</HostName>
      <ReplaceKeyPrefixWith>#!/</ReplaceKeyPrefixWith>
    </Redirect>
  </RoutingRule>
</RoutingRules>
Any ideas on some way I can achieve this? Ideally without having to go change something in S3/CloudFront every time I add a new page?

The problem was that I had the origin set up in CloudFront not to forward query strings, so S3 never received the query params and its redirect came back without them. You can find this setting in CloudFront > Behaviors > Forward Query Strings.

If you want to have clean URLs, though, you can also check out this trick. You need to set up a CloudFront distribution and then alter the 404 behaviour in the "Error Pages" section of your distribution. That way you can again use domain.com/foo/bar links :)

The menus and options in CloudFront/S3 change a lot over time.
Here is a December 2021 solution.
Step 1) Create a "Request" Policy in CloudFront that allows QueryStrings
Note: you might want to also add some Headers like Origin or Access-Control-... headers for CORS.
Step 2) Go to your Distribution > Update the Origin request policy
Step 3) Kick off a new Invalidation on /*
Additional Notes for Debugging/Testing
I would recommend testing with curl in a terminal rather than a browser, to avoid caching and to see the details. I do curl -v https://example.com/cb?foo=bar1.
Keep changing the value of the query string (bar1 in the above example, to bar2, bar3) with every test to make sure nothing is served from cache.
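For the curl testing above, here is a minimal Python sketch of the check I would run on each response: does the Location header still carry the query pairs from the request? The function names (query_pairs, redirect_keeps_query, probe_url) are made up for illustration. The code only inspects strings, so you would feed it the URL you requested and the Location value that curl -v prints.

```python
from __future__ import annotations

from urllib.parse import parse_qsl, urlsplit


def query_pairs(url: str) -> list[tuple[str, str]]:
    """Return the query pairs of a URL, also looking inside a '#!/path?query' fragment."""
    parts = urlsplit(url)
    query = parts.query
    if not query and "?" in parts.fragment:
        # In a hashbang URL everything after '#' is the fragment, so the
        # 'ping=pong' in '/#!/foo?ping=pong' is not a real query string.
        query = parts.fragment.split("?", 1)[1]
    return parse_qsl(query, keep_blank_values=True)


def redirect_keeps_query(request_url: str, location: str) -> bool:
    """True if every query pair of the request reappears in the Location header."""
    got = query_pairs(location)
    return all(pair in got for pair in query_pairs(request_url))


def probe_url(base: str, attempt: int) -> str:
    """Build a test URL whose query value changes on every attempt (bar1, bar2, ...)."""
    return f"{base}?foo=bar{attempt}"
```

With the redirect from the question, redirect_keeps_query("https://example.com/foo?ping=pong", "http://www.mydomain.com/#!/foo") is False, which is the symptom of the query string not being forwarded.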

Related

AEM query param being removed and CSRF token added

My application has a search functionality which uses a query param fullText for the search term. But on my QA server, any query parameters are being removed and a CSRF token is being added.
For example, on the homepage, if I search for 'tax', the URL should be:
https://www.qaserver.com/en/search.html?fullText=tax
Instead, it changes to the URL below and stays on the same page it was on.
https://www.qaserver.com/en/home.html?%3Acq_csrf_token=eyJleHAiOjE1MDAyNDk5NzgsImlhdCI6MTUwMDI0OTM3OH0.EXoQy8xeVh3j9kdFdnenLGLl2sFEh_boi_jFareO1is
Is there any AEM/dispatcher config missing or incorrect?
The dispatcher and AEM logs don't show who is appending this param or why.
The same thing happens with the direct IP of the publish server as well.
Include <cq:includeClientLib categories="granite.csrf.standalone"/> on the page from which you are making the POST Ajax call or form submit. This should resolve the issue.
The other option is to exclude the particular servlet path in the CSRF Filter configuration (which is not recommended).

How are Facebook-like websites able to load a profile, instead of a directory, when a request like facebook.com/profile/username is received?

When facebook.com/profile/{username} is requested, how is the server able to load a page with the data for that user, instead of navigating to a directory named after that {username} and possibly showing a 404 error?
It's achieved typically using a pattern called "front controller", where all requests are handled by the same file (let's say index.php, talking specifically about PHP now). So all URLs are like this:
facebook.com/index.php/profile/abc
facebook.com/index.php/account
That file serves as the bootstrap for the application, reading extra parameters (anything after index.php) and dispatching requests to the appropriate handlers/controllers.
Then there are multiple ways you can get rid of that ugly index.php, depending on how you configure your web server (there are loads of questions here about that subject: "htaccess remove index.php from url", for example).
Read more about it here: https://en.m.wikipedia.org/wiki/Front_controller
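To make the idea concrete, here is a small sketch of a front controller in Python (the answer talks about PHP, but the pattern is the same). Everything in it is hypothetical: the route table, the handler names, and the "/index.php" script name are only there for illustration, not how Facebook actually does it.

```python
from typing import Callable, Dict, List

Handler = Callable[[List[str]], str]


def show_profile(args: List[str]) -> str:
    # The username is just a URL segment; it would be looked up in a
    # database, not on disk as a directory.
    username = args[0] if args else "(nobody)"
    return f"profile page for {username}"


def show_account(args: List[str]) -> str:
    return "account page"


# Hypothetical route table: first path segment -> handler.
ROUTES: Dict[str, Handler] = {
    "profile": show_profile,
    "account": show_account,
}


def dispatch(path: str, script: str = "/index.php") -> str:
    """Single entry point: every request path is routed through here."""
    if path.startswith(script):
        # Drop the front controller's own name, keep the extra path info.
        path = path[len(script):]
    segments = [s for s in path.split("/") if s]
    if not segments or segments[0] not in ROUTES:
        return "404 not found"
    return ROUTES[segments[0]](segments[1:])
```

Both dispatch("/index.php/profile/abc") and, once the web server hides index.php, dispatch("/profile/abc") end up in show_profile with "abc" as the argument.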

Checking user ip address after redirect

As part of a session security feature I am checking $ENV{REMOTE_ADDR} to make sure the user's IP stays the same during the whole stay on a website.
Some parts of the website show a waiting screen, if for example the rendering of a file takes some seconds, and I redirect the user to a result screen by the use of a meta tag <meta http-equiv="refresh" content="$time; URL=…">.
Unfortunately, after this redirect the $ENV{REMOTE_ADDR} variable does not return the user's IP but the server's.
Is there something I am missing to get this to work properly and/or are there alternatives I could use to redirect the user?
For various reasons htaccess or http-header redirects are not an option and I don't want to use JavaScript for this.
I am already using a 'click me' button to allow the user to manually skip the wait.
You could try switching between the temporary and permanent type of redirect. Check the server logs: is the HTTP code 301 or 302?
I misread the accesslogs … it was actually a different script executed on the server, therefore having the servers IP, which caused all this.

tracker.js is hitting the dispatcher URL directly. How to resolve that?

I have multiple publish instances and multiple dispatchers in the Production environment of my website. When I look at the Net tab in Firefox, I see a failed request for tracker.js going directly to the dispatcher URL.
GET http://web.dispatcher.com/libs/wcm/stats/tracker.js?blah-blah
where web.dispatcher.com is the dispatcher host.
I feel the dispatcher URL should not get exposed like this. And why is it even hitting the dispatcher URL? Any ideas?
I think I should either turn off the impressions tracker (but I'm not sure how to do that) or rewrite the request to hide the dispatcher. Any suggestions? And how do I do it?
If you go to [host]:[port]/system/console/configMgr you will see a configuration for "Day CQ WCM Page Statistics".
Modify that to configure the full URL where you want the tracker to send its requests.
Additionally, this XHR request is generated by stats.jsp, which is generally included in your head.jsp. If you simply remove that script include from your template (or change it to a conditional include, i.e. not included when WCMMode==DISABLED) you can stop that request from being generated.

url rewrite & redirect question

Say I currently have a URL like:
http://mydomain.com/showpost.php?p=123
Now I want to make it prettier:
http://mydomain.com/123/post-title
I'm using Apache rewrite, which grabs the segment '123' and turns the URL back into
http://mydomain.com/showpost.php?p=123
OK. Here is the problem. I want to redirect the original non-pretty URLs, which were indexed by Google, to the pretty versions. I want this because I heard that Google may punish me if it sees multiple URLs pointing to identical content. So I need to
redirect /showpost.php?p=123 to /123/post-title
This I have to do in my PHP code because there's no way for Apache to figure out the 'post-title', but if I put the redirect code in PHP, there will be an infinite loop, such as:
Request : /showpost.php?p=123
redirected to : /123/post-title
rewritten to: /showpost.php?p=123
redirected again to : /123/post-title
...
So on and so forth.
Sorry I should Google the solution first but I really don't know how to describe my situation in English to make Google return reasonable results.
Please help me. Thanks.
You can set a request variable in your rewrite rule, by adding something like [E=rewritten:true] at the end of the RewriteRule line, to record the fact that the rewrite was done. In your PHP, you should be able to access this as $_SERVER['rewritten'], and do the redirect only if the flag is absent.
Alternately, you can use a rewrite rule to redirect the non-pretty URL to the pretty one without the post title, then use application code to issue another redirect with the post title added. Using two redirects is less desirable, of course, but it's only a temporary measure until search engines update their indexes, then you can remove it.
Make sure you use 301 Moved Permanently for your redirect, btw.
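The flag check from the first suggestion could be sketched like this. It is written in Python only to show the decision logic; in the real script it would be a couple of lines of PHP around $_SERVER. The function name canonical_redirect and the slug argument are made up for illustration. One assumption worth labeling: after an internal redirect Apache can expose a variable set with [E=...] under a REDIRECT_ prefix, so the sketch checks both names.

```python
from typing import Mapping, Optional


def canonical_redirect(server_vars: Mapping[str, str],
                       post_id: int,
                       slug: str) -> Optional[str]:
    """Return the pretty URL to send a 301 to, or None if no redirect is needed.

    server_vars stands in for PHP's $_SERVER. The 'rewritten' flag is the one
    set by [E=rewritten:true] on the RewriteRule.
    """
    # Assumption: the flag may show up as 'rewritten' or 'REDIRECT_rewritten'
    # depending on how Apache processed the rewrite.
    if server_vars.get("rewritten") or server_vars.get("REDIRECT_rewritten"):
        # The request arrived via the pretty URL and was rewritten internally;
        # redirecting again would start the loop.
        return None
    # Direct hit on /showpost.php?p=...: send the client to the pretty URL.
    return f"/{post_id}/{slug}"
```

A direct request (no flag) gets "/123/post-title" back and should be answered with a 301; a request that came through the rewrite gets None and just renders the post.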