Fallback for GCP Storage - google-cloud-storage

I am trying to build a website where I would like to use Google Cloud Storage to serve static assets, but I want any resource that is not found in the bucket to fall back to my web server. Is there a way to specify a fallback for the bucket?

Yes, this can be done by changing the 404.html on Cloud Storage to contain a redirect to another webpage.
You do this by adding the following to the page's <head>:
<meta http-equiv="Refresh" content="5; url=https://example.com">
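
A minimal sketch of generating such a 404.html, assuming the fallback web server lives at https://example.com and that the bucket's website configuration already names 404.html as its not-found page. The function name build_fallback_404 is made up for illustration. Because the redirect target is a fixed URL, the originally requested path is not carried over to the web server.

```python
from html import escape


def build_fallback_404(fallback_url: str, delay_seconds: int = 0) -> str:
    """Return the HTML for a not-found page that redirects to fallback_url."""
    if delay_seconds < 0:
        raise ValueError("delay_seconds must be non-negative")
    # Escape the URL so characters such as & or " cannot break the attribute.
    safe_url = escape(fallback_url, quote=True)
    return (
        "<!DOCTYPE html>\n"
        "<html>\n"
        "<head>\n"
        f'<meta http-equiv="Refresh" content="{delay_seconds}; url={safe_url}">\n'
        "<title>Not found</title>\n"
        "</head>\n"
        "<body>\n"
        # Plain link for clients that ignore the Refresh directive.
        f'<p>Not found here. Continue to <a href="{safe_url}">{safe_url}</a>.</p>\n'
        "</body>\n"
        "</html>\n"
    )


if __name__ == "__main__":
    # Print the page; saving it as 404.html and uploading it is a separate step.
    print(build_fallback_404("https://example.com", delay_seconds=5))
```

The output can be saved as 404.html and uploaded to the bucket with whatever tool you already use.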

There are a few client-side JavaScript/CSS ways to do this. For example, https://medium.com/@webcore1/react-fallback-for-broken-images-strategy-a8dfa9c1be1e or "Fallback background-image if default doesn't exist".

Related

CloudFront, Lambda@Edge, S3 redirect objects

I am building an S3 URL redirect, nothing special, just a bunch of zero-length objects with the WebsiteRedirectLocation metadata filled out. The S3 bucket is set to serve static websites, the bucket policy is set to public, etc. It works just fine.
HOWEVER - I also want to lock down certain files in the bucket, specifically some HTML files that serve to manage the redirects (like adding new redirects). With the traditional setup, I can both use the redirects and serve the HTML page just fine. But in order to lock it down, I need to use CloudFront and Lambda@Edge like in these posts:
https://douglasduhaime.com/posts/s3-lambda-auth.html
http://kynatro.com/blog/2018/01/03/a-step-by-step-guide-to-creating-a-password-protected-s3-bucket/
I have modified the Lambda@Edge script to only prompt for a password IF the admin page (or one of its assets like CSS/JS) is requested. If the requested path is something else (presumably a redirect file), the user is not prompted for a password. And yes, I could also set a behavior rule in CloudFront to decide when to use the Lambda function to prompt for a password.
And it works, kind of. When I follow these instructions and visit my site via the CloudFront URL, I do indeed get prompted for a password when I go to the root of my site, the admin page. However, the redirects will not work. If I try to load a redirect, the browser just downloads it instead.
Now, in another post someone suggested that I change my CloudFront distribution endpoint to the S3 bucket WEBSITE endpoint, which I think also means changing the bucket policy back to website mode and public, which sucks because now it's accessible outside of the CloudFront policy, which I do not want. Additionally, CloudFront no longer automatically serves the specified index file, which isn't the worst thing.
SO - is it possible to lock down my bucket, then serve it entirely through CloudFront with Lambda@Edge, BUT also have CloudFront respect those redirects instead of just prompting a download? Is there a setting in CloudFront to respect the headers? Should I set up different behavior rules for the different files (HTML vs. redirects)?
Instead of using the WebsiteRedirectLocation metadata, which is specific to S3 static website hosting and has no effect when CloudFront is the server, replace your empty objects with HTML objects that contain a meta HTML tag with the desired redirect target:
<meta http-equiv="Refresh" content="0; url=https://www.example.com" />
The number before the semicolon is the delay before the redirect, in seconds, where 0 is immediate.
Don't forget to also change the Content-Type metadata of the objects to text/html.
And if you want to support old browsers that might not handle the Refresh directive correctly, add an anchor link as explained here.
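
The steps above could be scripted roughly as follows. This is only a sketch: redirect_page and upload_redirects are made-up names, and s3_client is assumed to be a boto3 S3 client (any object with a compatible put_object method works). The anchor link in the body is the fallback for browsers that do not follow the Refresh directive.

```python
from html import escape


def redirect_page(target_url: str) -> bytes:
    """Build an HTML body that redirects immediately, with an anchor link as a fallback."""
    # Escape the URL so it is safe inside the HTML attributes.
    safe_url = escape(target_url, quote=True)
    html = (
        "<!DOCTYPE html>\n"
        "<html><head>\n"
        f'<meta http-equiv="Refresh" content="0; url={safe_url}" />\n'
        "</head><body>\n"
        f'<a href="{safe_url}">Continue to {safe_url}</a>\n'
        "</body></html>\n"
    )
    return html.encode("utf-8")


def upload_redirects(s3_client, bucket: str, redirects: dict) -> None:
    """Overwrite each redirect object with an HTML page stored as text/html."""
    for key, target_url in redirects.items():
        s3_client.put_object(
            Bucket=bucket,
            Key=key,
            Body=redirect_page(target_url),
            # Without this, the browser may download the object instead of rendering it.
            ContentType="text/html",
        )
```

With boto3 installed, a call might look like upload_redirects(boto3.client("s3"), "my-redirect-bucket", {"promo": "https://www.example.com"}), where the bucket name and key are placeholders.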

Content disposition when downloading from google cloud storage

I would like to use a direct link to https://storage.cloud.google.com/mybucket/myfile.pdf?response-content-disposition=attachment;%20filename=myfile.pdf (this is the link that gets created when browsing GCS in the Cloud Console itself).
The work-around is going to https://www.googleapis.com/storage/v1beta2/b/mybucket/o/myfile.pdf?alt=media but that's obviously not a very pretty URL.
Has the API changed such that response-content-disposition is actually something else now?

Different index files for different directories on Google Cloud Storage, possible?

Problem: Planning to have my Jekyll-generated static site served from Google Cloud Storage, but need to serve feeds from example.com/feed/ for backwards compatibility with WordPress.
Possible solution: Say the static feed file (index.xml) is located at example.com/feed/index.xml. Then if it's possible to set a different index file for a directory itself (apart from what's set for the bucket e.g. index.html), then people would be able to access my feed from example.com/feed/.
But is this possible? If not, is there an alternative I'd be missing?
You could potentially create an object in the example.com bucket with the name /feed/. That's a bit awkward to think about, and because of the way gsutil works you'd have to do it via the API manually, but it would allow you to serve a feed from example.com/feed/.
Alternately, you could simply name your xml content /feed/index.html. If all of your users are indeed visiting example.com/feed/, then being able to name the file index.xml is not entirely relevant. The only special thing you'd need to do is make sure you set the right content type for the /feed/index.html object.
Another thing to keep in mind here is that the feed itself should be linked from your main index page with a link like this:
<link rel="alternate" type="application/rss+xml" title="My Awesome Feed" href="http://example.com/path/to/feed.xml" />
That gives you the ability to name your feed sanely, and your users can point their feed readers at http://example.com/ directly -- the reader should be able to follow the link to the feed itself. That won't help you if you have an established readership that expects a WordPress-style feed, but you could steer new people in the right direction and deprecate the weird WordPress style after a while.
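
For the second suggestion above (storing the XML under the index.html name), a rough sketch with the google-cloud-storage Python client could look like this. The helper name upload_feed is made up, the object name feed/index.html (without a leading slash) is an assumption about how the URL path maps to the object name, and application/rss+xml is taken from the link tag above.

```python
def upload_feed(bucket, feed_xml: str, object_name: str = "feed/index.html") -> None:
    """Store the feed XML under an index.html name, but with a feed content type.

    `bucket` is assumed to be a google.cloud.storage Bucket (or any object
    exposing a compatible blob() method).
    """
    blob = bucket.blob(object_name)
    # The explicit content type makes the object be served as a feed,
    # even though its name ends in .html.
    blob.upload_from_string(feed_xml, content_type="application/rss+xml")
```

With the library installed, usage might be upload_feed(storage.Client().bucket("example.com"), open("index.xml").read()) after from google.cloud import storage.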

An object in Google Cloud Storage which acts as a "redirect" or "symlink"

I'm looking to move an existing website to Google Cloud Storage. However, that existing website has changed its URL structure a few times in the past. These changes are currently handled by Apache: for example, the URL /days/000233.html redirects to /days/new-post-name and /days/new-post-name redirects to /days/2002/01/01/new-post-name. Similarly, /index.rss redirects to /feed.xml, and so on.
Is there a way of marking an object in GCS so that it acts as a "symlink" to another GCS object in the same bucket? That is, when I add website configuration to a bucket, requesting an object (ideally) generates a 301 redirect header to a different object, or (less ideally) serves the content of the other object as its own?
I don't want to simply duplicate the object at each URL, because that would triple my storage space. I also can't use meta refresh headers inside the object content, because some of the redirected objects are not HTML documents (they are images, or RSS feeds). For similar reasons, I can't handle this inside the NotFound 404.html with JavaScript.
Unfortunately, symlink functionality is currently not supported by Google Cloud Storage. It's a good idea though and worth considering as a future feature.

Google analytics with meta http-equiv="REFRESH" redirection

Is it possible to apply google analytics to pages that use the <meta http-equiv="REFRESH" content="0;url={url}"> redirect method?
I'm assuming that the redirection on this type of page would happen before the js has a chance to run, and so analytics would not work on it.
Yes, if you want to track these pages, you need to fire the Google Analytics tracking first and only then redirect.
I would recommend:
Using the traditional ga.js syntax rather than the async syntax.
Waiting something like 150 ms after the tracking call before redirecting (with setTimeout() in JavaScript, instead of using the meta tag).
Using Google Chrome's Google Analytics Debugger extension, for example, to check how it goes.