Restrict access to Google Cloud Storage hosted website - google-cloud-storage

We have a production website running on a host, e.g. domain.com.
We are going to use HAProxy to proxy requests from domain.com to static.domain.com, which is a domain-named bucket in Cloud Storage. We also have a development version served from Cloud Storage at static.dev.domain.com.
So the same page ends up available on three different domains, which is very bad from an SEO perspective.
My initial idea was to restrict access to the domain-named buckets by IP, but I see no way to do it. There is no way to use basic HTTP authorization either. Any ideas how to protect static web sites from being indexed?
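For reference, a rough HAProxy sketch of the proxying described above (a minimal sketch, assuming the bucket is reached through c.storage.googleapis.com with the bucket name sent as the Host header; the certificate path is a placeholder):

frontend www
    bind :443 ssl crt /etc/haproxy/certs/domain.com.pem
    default_backend gcs_static

backend gcs_static
    # A domain-named bucket is selected by the Host header matching the bucket name
    http-request set-header Host static.domain.com
    server gcs c.storage.googleapis.com:443 ssl verify none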

Actually, it is not possible to restrict access to a single IP address from the Cloud Storage console; however, I found a Public Issue Tracker entry that requests restricting bucket access by IP address. As a recommendation, I suggest you comment on and star the issue so you get notifications about it.
On the other hand, I think this Stack Overflow post could help you, since it mentions a possible workaround using VPC Service Controls :)

Since my main concern was SEO, to solve the duplicate content issue I added a canonical URL:
<link rel="canonical" href="https://example.com">

Related

How to deploy a Firebase app with Firestore and an Express app to be globally accessible?

We have successfully built a Firebase application with Firestore, Functions, Hosting, and Auth. Now we are working on an Atlassian Confluence integration and a global rollout. The Confluence plugin's REST endpoints are served by an Express app.
What is the proper way to achieve a single URL in all countries around the globe, e.g. https://myapp.com/confluence/api, with no (or at least acceptable) latency, and to serve health checks as well? Is a Hosting rewrite to the function serving the Express app enough? Do we need to manage replication to regions around the globe ourselves?
Thanks a lot for any advice.
You can use Firebase Hosting to connect a custom domain:
Use a custom domain (like example.com or app.example.com) instead of a Firebase-generated domain for your Firebase-hosted site. Firebase Hosting provisions an SSL certificate for each of your domains and serves your content over a global CDN.
Note the following about connecting custom domains:
Each custom domain can only be connected to one Hosting site.
Each custom domain is limited to having 20 subdomains per apex domain, due to SSL certificate minting limits.
When Firebase verifies domain ownership, an SSL certificate is provisioned for your domain and deployed across Firebase's global CDN (content delivery network). This delivery network caches your content on the SSDs of Firebase edge servers to ensure quick content delivery and low latency globally.
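As for the Hosting rewrite the question asks about, here is a minimal firebase.json sketch (the function name confluenceApi and the path pattern are assumptions); the rewrite sends matching requests to the Cloud Function that wraps the Express app:

{
  "hosting": {
    "public": "public",
    "rewrites": [
      { "source": "/confluence/api/**", "function": "confluenceApi" }
    ]
  }
}

Keep in mind that Hosting is served from the global CDN, but the function itself runs in its deployment region, so latency to the API still depends on where the function is deployed.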

How to limit access in Cloud Foundry

I am new to Cloud Foundry.
Is there any way that only specific users can view and update an app deployed in Cloud Foundry?
1. I deployed an app in Cloud Foundry using the "cf push" command.
2. After entering the "cf push" command I got the message below.
Using manifest file /home/stevemar/node-hello-world/manifest.yml
Creating app node-hello-world-example...
name: node-hello-world-example
requested state: started
routes: {route-information}
last uploaded: Mon 14 Sep 13:46:54 UTC 2020
stack: cflinuxfs3
buildpacks: sdk-for-nodejs
type: web
instances: 1/1
memory usage: 256M
3. Using the {route-information} above, I can see the deployed app in a browser by entering the URL below.
https://{route-information}
This way, anyone can see the app from a browser, but I don't want it to be visible to everyone; I want to limit access to specific users.
I heard that a global IP is allocated to {route-information} by default.
Is there any way to limit access to only specific users?
(For example, is there a feature in Cloud Foundry, like a "private registry" in Kubernetes, that is not open to the public?)
Since I am using Cloud Foundry on IBM Cloud, a solution using IBM Cloud would be preferable.
I've already granted a Cloud Foundry role to the other user.
Thank you.
The CloudFoundry platform itself does not provide any access controls for applications. If you assign a public route to your application, where the DNS is publicly resolvable and the foundation is on the public Internet, like IBM Bluemix, then anyone can access your app.
There's a number of things you can do to limit access, but they do require some work on your part.
Use a private DNS. You can add any domain you want to Cloud Foundry, even ones that don't resolve publicly. That means you could add my-cool-domain.local, which does not resolve anywhere. You could then add a record to /etc/hosts for this domain, or perhaps run DNS on your local network to resolve this domain and direct traffic to Cloud Foundry.
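For example (the IP address and host name below are placeholders, not values your foundation actually assigns), the /etc/hosts record would look like:

203.0.113.10   my-app.my-cool-domain.local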
With this setup, most people cannot access your application because the DNS domain for the route to your application does not resolve anywhere. It's important to understand that this isn't really security, but obscurity. It would stop most traffic from making it to your app, but if someone knew the domain, they could add their own /etc/hosts entry or send fake Host headers to access your application.
This type of setup can work well if you have light security requirements, like just wanting to hide something while you work on it, and it also pairs well with the other options below.
You can set up access controls in your application. Many application servers & frameworks can do things like restrict access by IP address or require user authentication (Basic auth is easy and acceptable if you only allow HTTPS traffic to your app, which you should always do anyway).
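As a minimal sketch of that in an Express app (TypeScript; the credentials below are placeholder assumptions, and in practice you would read them from the environment):

import express from "express";

const app = express();

// Placeholder credentials; set BASIC_AUTH_USER / BASIC_AUTH_PASS in a real app.
const USER = process.env.BASIC_AUTH_USER || "admin";
const PASS = process.env.BASIC_AUTH_PASS || "change-me";

// Reject any request that does not carry the expected Basic auth header.
app.use((req, res, next) => {
  const [scheme, encoded] = (req.headers.authorization || "").split(" ");
  const decoded = Buffer.from(encoded || "", "base64").toString();
  if (scheme === "Basic" && decoded === `${USER}:${PASS}`) {
    return next();
  }
  res.set("WWW-Authenticate", 'Basic realm="restricted"');
  res.status(401).send("Authentication required");
});

app.get("/", (_req, res) => {
  res.send("Hello from a protected app");
});

app.listen(Number(process.env.PORT) || 8080);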
You can use OAuth2 to secure apps too. Again, many app servers & frameworks have support for this and make it relatively simple to secure your apps. If you don't have a corporate OAuth2 solution, there are public providers you can use. Exactly how you do OAuth2 in your app is beyond the scope of this question, but there's plenty of material out there on how to do this. Google information for your application language/framework of choice.
You could set up an access Gateway. This would be an application whose job is to proxy traffic to other applications on the foundation. The Gateway could be something like Nginx, Apache HTTPD, or Spring Cloud Gateway. The idea is that the Gateway would be publicly accessible and would almost certainly apply access controls/restrictions (see #2; many of these proxies have access control options that only take a few lines of config). Your actual applications would not be deployed publicly, though. When you deploy your actual applications, they would only be mapped to the internal Cloud Foundry domain.
Cloud Foundry has internal domains, often apps.internal (run cf domains to see if it shows up), which you can use to easily route traffic across the internal container-to-container network. Using this domain and the C2C network, you can have apps deployed to CF that are not accessible from the public Internet, except through your Gateway.
Again, how you configure this exactly is outside the scope of this question, but check out the docs I linked to for info on using the C2C network & internal routes. Then check out your proxy server of choice's documentation.
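That said, a minimal Nginx sketch of the gateway idea (the internal route my-backend.apps.internal is an assumption; the gateway is the only app mapped to a public route):

server {
    listen 8080;

    location / {
        # Apply access controls at the edge, as in option 2.
        auth_basic "restricted";
        auth_basic_user_file /etc/nginx/.htpasswd;

        # Proxy to an app that only has an internal (apps.internal) route.
        proxy_pass http://my-backend.apps.internal:8080;
        proxy_set_header Host $host;
    }
}

For the container-to-container traffic to be allowed, you would also need a network policy (cf add-network-policy) from the gateway app to the backend app.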

Are static sites hosted on Google Cloud Storage accessible through HTTPS?

According to this post from 2014, HTTPS is not available for static sites on Google Cloud Storage: https://stackoverflow.com/a/22767544/46799
Is this still the case? If so, are there any plans to add this functionality?
My site is hosted on GCS and I have a CNAME entry which maps my URL to a bucket on GCS. I need to start providing access to the site over HTTPS now; am I out of luck?
This is still the case, sorry. You can access GCS via HTTPS, but not via CNAME redirects.

Use CloudFlare to CDN a Google Cloud Storage Bucket

I've heard many good things about Cloudflare, and they have an excellent CDN product with functionality not found in competitors (HTTP/2, IPv6, etc.).
I have files in a Google Cloud Storage bucket.
How to set these files as the origin for a Cloudflare CDN?
(The Cloudflare control panel seems to just want a website on a root domain...?)
Maybe a bit late, but I'm posting my answer in case it is useful for someone else looking to do the same thing.
I have a bucket in Google Cloud Storage behind CloudFlare. You just need to follow the instructions here:
https://cloud.google.com/storage/docs/website-configuration
In CloudFlare you will need to manage your root domain, but then you can create a subdomain just for your bucket in Google Cloud Storage (don't forget to enable CloudFlare features on that subdomain). I think that's how CloudFlare works: it manages your root domain, and I don't think you can avoid it.
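As a sketch, assuming the bucket subdomain is static.example.com, the DNS record in CloudFlare points that subdomain at Cloud Storage, and the bucket itself must be named after the full subdomain:

static.example.com.    CNAME    c.storage.googleapis.com.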
If you need specific settings for the subdomain used for your bucket, you can use page rules in CloudFlare. For example, I had to use them because Google Cloud Storage does not support SSL but my pages using those static files were on SSL, so I had specific settings for that subdomain to use flexible SSL.

How to use S3 as a static web page and EC2 as a REST API for it together? (AWS)

With AWS services, we have a web application served from an S3 bucket that accesses its data through a REST API behind a Load Balancer (a set of Node.js applications running on EC2 instances).
Currently the URLs are specified as follows:
API Load Balancer: api.somedomain.com
Static Web App on S3: somedomain.com
But this setup brought us a set of problems, since requests are cross-origin (CORS). We could work around CORS with special headers, but that doesn't work in all browsers.
What we want to achieve is running API on the same domain but with different path:
API Load Balancer: somedomain.com/api
Static Web App on S3: somedomain.com
One of the ideas was to attach the API Load Balancer to the CDN and forward all requests to the Load Balancer when they come in on the "/api/*" path. But that doesn't work, since our API uses not only HEAD and GET requests, but also POST, PUT, and DELETE.
Another idea is to use a second EC2 instance instead of the S3 bucket to host the website (using a web server like nginx or Apache). But that adds too much overhead when everything is already in place (S3 static content hosting). Also, with this scenario we wouldn't get the performance benefits of Amazon CloudFront.
So, could you recommend how to combine the Load Balancer and S3 so they run on the same domain, but with different paths? (API on somedomain.com/api and Web App on somedomain.com)
Thank you!
You can't have an EC2 instance and an S3 bucket with the same host name. Consider what happens when a web browser makes a request to that host name. DNS resolves it to an IP address (or addresses) and the packets of the request are delivered to that address. The address either terminates at the EC2 instance or the S3 bucket, not both.
As I understand your situation, you have static web pages hosted on S3 that include JavaScript code that makes various HTTP requests to the EC2 instance. If the S3 web pages are on a different host than the EC2 instance then the same origin policy will prevent the browser from even attempting some of the requests.
The only solutions I can see are:
Make all requests to the EC2 instance, and have it fetch the S3 contents and deliver them to the browser whenever a web page is asked for.
Have your JavaScript use iframes and change document.domain in the web pages to a common parent origin. For example, if your web pages are at www.example.com and your EC2 instance is at api.example.com, the JavaScript would change document.domain to just example.com, and the browser would permit pages from www.example.com to communicate with api.example.com.
Bite the bullet and use CORS. It's really not hard, and it's supported in all remotely recent browsers (IE 8 and 9 do it, but not in a standard way).
The first method is no good, because in that case you might as well not use S3 at all.
The second case should be okay for you. It should work in any browser, because it's not really CORS. So no CORS headers are needed. But it's tricky.
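A small sketch of that second approach (TypeScript; the iframe id and the callApi() helper exposed by the iframe page are hypothetical, and document.domain is a legacy mechanism):

// Run this in both the parent page (www.example.com) and the iframe page
// (api.example.com) so they end up with the same effective origin.
document.domain = "example.com";

// The parent page can then reach into the iframe and call whatever helper
// the iframe page defines, e.g. a hypothetical callApi() function.
const frame = document.getElementById("api-frame") as HTMLIFrameElement;
(frame.contentWindow as any).callApi("/users");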
The third, CORS, approach should be just fine. Your EC2 instance just has to return the proper headers telling web pages from the S3 bucket that it's safe for them to talk to the EC2 instance.
Just wanted to add that if you go with the CORS approach and preflight requests add overhead to the server and network bandwidth, you may also consider adding the "Access-Control-Max-Age" header to the CORS response.
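A minimal Express sketch of such a CORS response on the API side (TypeScript; the allowed origin is the example domain from the question):

import express from "express";

const app = express();

// Allow the S3-hosted pages on somedomain.com to call this API.
app.use((req, res, next) => {
  res.set("Access-Control-Allow-Origin", "https://somedomain.com");
  res.set("Access-Control-Allow-Methods", "GET, POST, PUT, DELETE");
  res.set("Access-Control-Allow-Headers", "Content-Type, Authorization");
  // Cache the preflight response for an hour to cut down on OPTIONS requests.
  res.set("Access-Control-Max-Age", "3600");
  if (req.method === "OPTIONS") {
    return res.sendStatus(204);
  }
  next();
});

app.get("/api/health", (_req, res) => {
  res.json({ ok: true });
});

app.listen(Number(process.env.PORT) || 3000);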