Stop Cloudflare from caching redirects

I have a bit of a unique situation here.
We have a domain (usermedia.com) that hosts user-uploaded media with image operations applied to it. The media is stored in an Amazon S3 bucket, and Cloudflare sits in front of this bucket.
If a user requests an image, they might browse to https://usermedia.com/my-image-resize-200-200.jpg.
If the image exists, it is served; otherwise Amazon S3 does a 302 redirect (via Routing Rules) to https://app.com/generate/my-image-resize-200-200.jpg, which generates the resized image, uploads it to S3, and then redirects back to https://usermedia.com/my-image-resize-200-200.jpg. This time the file exists in S3 and is served.
The problem arises when we have the Cloudflare proxy enabled: it caches redirects, so if the media doesn't exist, Cloudflare gets stuck in a continuous redirect cycle. I've tried using a 307 redirect, but the problem persists.
Any ideas how to get around this issue?

I have the exact same setup and the same problem, and I found out that the cause is indeed Cloudflare. When the image doesn't exist, S3 returns a 307 pointing to the resizer endpoint. The problem is that Cloudflare adds a cache header (e.g. cache-control: public, max-age=691200), most likely because you have the Browser Cache Expiration option set on the Caching tab. When the browser gets that response it caches it (because a cache-control header is present), so the next request is served from the browser cache.
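You can confirm this yourself by inspecting the redirect's response headers; a quick check with curl (shown purely as an illustration, using the example URL from the question):

# Fetch only the headers of the missing image; a cache-control header on the
# 302/307 response means the browser will cache the redirect
curl -sI https://usermedia.com/my-image-resize-200-200.jpg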
UPDATE:
A quick solution may be simple: on Cloudflare, set Browser Cache Expiration to "Respect Existing Headers"; this way Cloudflare will not add the headers that make the browser cache the response. If you still want images to be cached locally, simply set the Cache-Control header directly on the S3 image when creating it from the resize script:
// Upload the resized image with a Cache-Control header so browsers (and Cloudflare,
// with "Respect Existing Headers") cache it locally. Assumes an aws-sdk v2 S3 client
// and that buffer/BUCKET/finalKey come from the resize script.
S3.putObject({
  Body: buffer,
  Bucket: BUCKET,
  ContentType: 'image/jpeg',
  CacheControl: 'max-age=604800',
  Key: finalKey,
}).promise();
If you have other assets (e.g. CSS or JavaScript), you can set the Cache-Control header for them as well from the AWS console, or, if they live in some folder, let's say "foo", you can go to Cloudflare, create a Page Rule, and set the Browser Cache TTL for all files under the "foo" folder.
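If you would rather do that from code than from the console, a copy-in-place with replaced metadata should work; a rough sketch with the same aws-sdk v2 client as above ("foo/styles.css" is a hypothetical asset key):

// Copy the object onto itself, replacing metadata so the new Cache-Control value sticks
S3.copyObject({
  Bucket: BUCKET,
  CopySource: `${BUCKET}/foo/styles.css`,
  Key: 'foo/styles.css',
  MetadataDirective: 'REPLACE',
  ContentType: 'text/css',
  CacheControl: 'max-age=604800',
}).promise().catch(console.error);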
I know this is a late answer, but I hope it helps people who run into the same problem in the future.

At the moment I see that Cloudflare doesn't cache 307 redirects.
[Screenshots: first query, second query]
You can set the response code in the S3 redirection rules like this:
<RoutingRules>
  <RoutingRule>
    <Condition>
      <HttpErrorCodeReturnedEquals>403</HttpErrorCodeReturnedEquals>
    </Condition>
    <Redirect>
      <HostName>app.com</HostName>
      <HttpRedirectCode>307</HttpRedirectCode>
    </Redirect>
  </RoutingRule>
</RoutingRules>
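If you manage the website configuration from code instead of the console, the same rule can be applied with the aws-sdk for JavaScript; a sketch assuming an instantiated aws-sdk v2 S3 client named S3 (bucket and host names are placeholders, and note that this call replaces the bucket's whole website configuration):

// Programmatic equivalent of the XML routing rule above
S3.putBucketWebsite({
  Bucket: 'usermedia-bucket',
  WebsiteConfiguration: {
    IndexDocument: { Suffix: 'index.html' },
    RoutingRules: [
      {
        Condition: { HttpErrorCodeReturnedEquals: '403' },
        Redirect: { HostName: 'app.com', HttpRedirectCode: '307' },
      },
    ],
  },
}).promise().catch(console.error);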


How to fix "load unsafe scripts"?

So I'll start from the very beginning.
Basically, I purchased a template off ThemeForest and manually edited the code in a markup editor to match my preferences.
When I was finished, I decided to host my website on GitHub Pages, so I uploaded my code directory to a repository as you do.
Here's a link to my repository:
https://github.com/KristofferHari/kristofferhari.github.io
Here's a link to my current website URL:
https://kristofferhari.github.io/ (As you can see, everything's kinda buggy)
So I managed to contact the seller and this is what I was provided with:
The reason for that is because the resources are using a http connection and they can’t be loaded on https connection website. So
you have to upload all the resources (scripts/stylesheets) to github
in order to use them on github.
So I suppose that through my browser I am trying to connect to my website over an HTTPS connection rather than HTTP. (Is this what is actually causing the problem, and what's the difference between an HTTP connection and an HTTPS one?)
Secondly, how would I upload all my resources (scripts/stylesheets) to github?
Thanks in advance!
There is a relatively simple solution: use a protocol-relative URL format.
For example, your error is:
Mixed Content: The page at 'https://kristofferhari.github.io/' was loaded over HTTPS, but requested an insecure stylesheet 'http://fonts.googleapis.com/css?family=Open+Sans:400,700,300,900'. This request has been blocked; the content must be served over HTTPS.
The problem is you are loading
http://fonts.googleapis.com/css?family=Open+Sans:400,700,300,900
from
https://kristofferhari.github.io/
The page is secure (HTTPS), but it's loading insecure content (HTTP).
To fix it, you basically need to change the stylesheet URL to:
https://fonts.googleapis.com/css?family=Open+Sans:400,700,300,900
But a more flexible solution is to use the protocol-relative format:
//fonts.googleapis.com/css?family=Open+Sans:400,700,300,900
which will then work over either HTTP or HTTPS.
Apply this change to all included resources.
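For instance, assuming the font is pulled in with an ordinary link tag (the tag itself is illustrative; only the href matters), the protocol-relative version would be:

<link rel="stylesheet" href="//fonts.googleapis.com/css?family=Open+Sans:400,700,300,900">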

Google Cloud LB: Change "server error" default html page

By default, if the load balancer can't find a backend to send traffic to, for example if all available backends are down, it shows this HTML page:
Transcript:
Error: Server Error
The server encountered a temporary error and could not complete your request.
Please try again in 30 seconds.
I would like to use my own static html page instead.
I saw this on the LB + Cloud storage page here:
You can also configure a custom index page and a custom error page that will be served if the requested object doesn’t exist. This can be done by adding a Website Configuration to your Cloud Storage bucket. With a Website Configuration, you could serve a static webpage directly out of a Cloud Storage bucket from your own domain.
How would that work?
I know how to host a static page on Cloud Storage, but how would I use it with the LB?
Simply put, you can't, at least for now.
The HTTP Load Balancer with Cloud Storage setup you found is in alpha; you will need to request whitelisting to try it. But it won't solve your problem.
As of now, there is no way to manually control a load balancer's redirection based on the backends' responses, and I don't think that will ever be possible. It's not the purpose of a load balancer in GCP.
You can also configure a custom index page and a custom error page that will be served if the requested object doesn’t exist.
The above statement only means that you can have a custom 404 page for objects that aren't found in the bucket. It's not meant to let you redirect traffic when your backend services are down (502). There is a big difference between "I can't find a page" and "nothing is working because I don't have a server."
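For what it's worth, that custom error page is a bucket-level setting, not a load balancer one; a minimal sketch with the Node.js Cloud Storage client (bucket and file names are placeholders):

// Point the bucket's website configuration at an index page and a custom 404 page
const { Storage } = require('@google-cloud/storage');
const storage = new Storage();

storage
  .bucket('my-static-site-bucket')
  .setMetadata({
    website: {
      mainPageSuffix: 'index.html',
      notFoundPage: '404.html',
    },
  })
  .catch(console.error);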
You can only redirect traffic coming from outside toward the inside of your network. You can't do the opposite. You can't ask the load balancer to redirect based on a response.
Instead of trying to make the 502 error page beautiful, ask yourself why you have it in the first place, and try to fix that.

How to redirect a subdirectory to an external url s3

I have an old site which I am moving to S3, except for some pages that I am moving to another subdomain (not on S3). They have URLs like:
http://www.example.com/2015/09/07/some-url
which I would like to redirect to a URL like:
http://subdomain.example.com/2015/09/07/some-url
I can't seem to get it to work; here are my redirect rules:
<RoutingRules>
  <RoutingRule>
    <Condition>
      <KeyPrefixEquals>2015/09/07/some-url/</KeyPrefixEquals>
    </Condition>
    <Redirect>
      <Protocol>http</Protocol>
      <HostName>subdomain.example.com</HostName>
      <ReplaceKeyPrefixWith>2015/09/07/some-url/</ReplaceKeyPrefixWith>
    </Redirect>
  </RoutingRule>
</RoutingRules>
Also, do I have to actually create empty directories in the S3 bucket for the rules to work?
You don't need empty directories for this to work. You probably do, however, want to remove the trailing slash.
<KeyPrefixEquals>2015/09/07/some-url/</KeyPrefixEquals>
...becomes...
<KeyPrefixEquals>2015/09/07/some-url</KeyPrefixEquals>
The trailing slash, if needed, is typically going to be added by the destination server if not supplied by the incoming request.
Also, you shouldn't need to set <ReplaceKeyPrefixWith> unless the value is changing.
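With both changes applied, the rule would look something like this (protocol and hostname kept from the question):

<RoutingRules>
  <RoutingRule>
    <Condition>
      <KeyPrefixEquals>2015/09/07/some-url</KeyPrefixEquals>
    </Condition>
    <Redirect>
      <Protocol>http</Protocol>
      <HostName>subdomain.example.com</HostName>
    </Redirect>
  </RoutingRule>
</RoutingRules>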
https://docs.aws.amazon.com/AmazonS3/latest/dev/how-to-page-redirect.html
Setting a Page Redirect from the REST API
The following Amazon S3 API actions support the x-amz-website-redirect-location header in the request. Amazon S3 stores the header value in the object metadata as x-amz-website-redirect-location.
PUT Object
Initiate Multipart Upload
POST Object
PUT Object - Copy
When setting a page redirect, you can either keep or delete the object content. For example, suppose you have a page1.html object in your bucket.
To keep the content of page1.html and only redirect page requests, you can submit a PUT Object - Copy request to create a new page1.html object that uses the existing page1.html object as the source. In your request, you set the x-amz-website-redirect-location header. When the request is complete, you have the original page with its content unchanged, but Amazon S3 redirects any requests for the page to the redirect location that you specify.
To delete the content of the page1.html object and redirect requests for the page, you can send a PUT Object request to upload a zero-byte object that has the same object key, page1.html. In the PUT request, you set x-amz-website-redirect-location for page1.html to the new object. When the request is complete, page1.html has no content, and requests are redirected to the location that is specified by x-amz-website-redirect-location.
When you retrieve the object using the GET Object action, along with other object metadata, Amazon S3 returns the x-amz-website-redirect-location header in the response.
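In the AWS SDK for JavaScript, that header maps to the WebsiteRedirectLocation parameter; a sketch of the zero-byte redirect object described above (bucket name and target URL are placeholders):

// Zero-byte object whose only purpose is to redirect website requests for page1.html
const AWS = require('aws-sdk');
const s3 = new AWS.S3();

s3.putObject({
  Bucket: 'my-website-bucket',
  Key: 'page1.html',
  Body: '',
  WebsiteRedirectLocation: 'http://subdomain.example.com/page1.html',
}).promise().catch(console.error);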

FaceBook loads HTTPS hosted iframe apps via HTTP POST (S3 & CloudFront errors)

I have been trying to write a bucket policy that will allow X-HTTP-Method-Override, because my research shows that Facebook loads HTTPS-hosted iframe apps via HTTP POST instead of HTTP GET, which causes S3 and CloudFront errors.
Can anyone please help me with this problem?
This is what's returned from S3 if I served my Facebook app directly from S3:
<?xml version="1.0" encoding="UTF-8"?>
<Error>
  <Code>MethodNotAllowed</Code>
  <Message>The specified method is not allowed against this resource.</Message>
  <ResourceType>OBJECT</ResourceType>
  <Method>POST</Method>
  <RequestId>B21565687724CCFE</RequestId>
  <HostId>HjDgfjr4ktVxqlIBeIlvXT3UzBNuPg8b+WbhtNHOvNg3cDNpfLH5GIlyUUpJKZzA</HostId>
</Error>
This is what's returned from CloudFront if I served my Facebook app from CloudFront with S3 as the origin:
ERROR
The request could not be satisfied.
Generated by cloudfront (CloudFront)
I think the solution should be to write a bucket policy that makes use of X-HTTP-Method-Override... Probably I am wrong though. A solution to this problem would be highly appreciated.
After trying many different ways to get this to work, it turns out that it simply is not possible to make a POST to static content work on S3 as things stand. Even if you allow POST through CloudFront, enable CORS, and change the bucket policy so that the CloudFront origin access identity can GET/PUT, etc., it will still throw an error.
As an aside, S3 is not the only thing that balks at responding to such a POST request to static content. If you configure nginx as an origin for a Facebook iframe you will get the same 405 error, though you can work around that problem in a couple of ways (essentially rewriting it to a GET under the covers). You can also change the page (though still static) to be a dynamic extension (.aspx or .php) to work around the issue with nginx.
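For reference (this is not from the original answer), one commonly used nginx workaround is to turn the 405 that a POST to static content produces back into a 200 for the same URI:

# Inside the server/location block serving the static canvas page:
# convert the 405 returned for POST-to-static into a 200 serving the same file
error_page 405 =200 $uri;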
You can host all your other content on S3 of course, and just move the page that you POST to onto a different origin. With a decent cache time you should see minimal traffic, but it will mean keeping your content in two places. What I ended up doing was:
Creating EC2 instances in an autoscaling group (just in case) to serve the content
They used a cron job to sync the content from S3 every 5 minutes (an example cron entry is sketched after this list)
No change in workflow was required (still just upload content to S3)
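An entry along these lines in the instances' crontab would do that sync (a sketch; the bucket path and web root are placeholders, and it assumes the AWS CLI is installed):

# Every 5 minutes, mirror the canvas content from S3 into the local web root
*/5 * * * * aws s3 sync s3://my-app-bucket/canvas /var/www/canvas --delete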
It's not ideal, nor is it particularly efficient, but hopefully it will save others a lot of fruitless testing trying to get this to work on S3 alone.
You can set your Cloudfront distribution to allow POST methods.
If you go into your dashboard and edit the Behavior for the distribution, then select Allowed HTTP Methods: GET, HEAD, PUT, POST, PATCH, DELETE, OPTIONS.
This allows the POST from Facebook to go through to your origin.
I was fighting with S3 and CloudFront for the last couple of days, and I can confirm that no bucket policy will let you route POST calls from Facebook to S3 static (JS-enriched) content.
The only solution seems to be the one Adam Comerford mentioned in this thread:
Have a light application which receives the Facebook calls and then fetches the content from S3 or CloudFront.
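As an illustration only (not from the original answer), such a light application can be very small; a Node.js/Express sketch, with the file name and port as placeholders:

// Minimal app: answer Facebook's POST (and plain GETs) with the static canvas page
const express = require('express');
const path = require('path');

const app = express();
const page = path.join(__dirname, 'index.html');

app.post('/', (req, res) => res.sendFile(page));
app.get('/', (req, res) => res.sendFile(page));

app.listen(3000);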
If anyone has any other solution or idea it will be appreciated.
You can't change POST to GET; that's the way Facebook loads the app page, because it also sends data about the current user in the POST body (see signed_request for more details). I would suggest you look into fixing your app to make sure it properly responds to POST requests.

Using IIRF to redirect to a PDF

I'm using IIRF to redirect certain URLs to specific PDF files. For instance, for the URL /newsletter/2010/02 I'd like it to redirect to /pdf/newsletters/Feb2010.pdf. I'm not too hot at regular expressions, but I created the following rule:
RedirectRule ^/newsletter/2010/01 /pdf/newsletters/Newsletter012010.pdf [I,R=301]
and it does redirect, but the address bar doesn't change, and when trying to save the file it wants to save as 01 instead of Feb2010.pdf. I don't presume my users will be savvy enough to enter a PDF extension before saving, and they shouldn't have to. Is there anything I can do about this?
Two suggestions:
Clear your browser cache.
Redirect to a full URL: instead of /pdf/newsletters/Foo.pdf, redirect to http://server/pdf/foo.pdf (see the example rule below).
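Applied to the rule from the question, the second suggestion would look something like this (the hostname is a placeholder):

RedirectRule ^/newsletter/2010/01 http://server/pdf/newsletters/Newsletter012010.pdf [I,R=301]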
It's strange that it wants to use 01 as the file. Surprising. Are you sure the browser is sending a new request? Use Fiddler to verify. A redirect should result in the browser address bar getting updated, ALWAYS. If you get a 301 you will see it very clearly in the Fiddler trace.
If you don't see the expected 301, is it possible that you previously used a RewriteRule in the ini file, the browser cached the result, and now when you ask for /newsletter/2010/01 you are getting the cached result rather than the redirected URL from IIRF? Clear your browser cache and request it again to test this.
I guess it would be easy to just clear the browser cache and re-try it, without even checking Fiddler.