Use only a domain and disable https://storage.googleapis.com url access - google-cloud-storage

I am new to cloud servers and have set up a Google Cloud Storage bucket to host image files. I've verified my domain and configured it so that images can be viewed via my domain. The problem is that the same file is accessible both via my domain, example.com/images/tiny.png, and via storage.googleapis.com/example.com/images/tiny.png. Is there any way to disable access via storage.googleapis.com and use only my domain?

Google Cloud Platform Support Version:
NOTE: This is the reply from Google Cloud Platform Support when contacted via email...
I understand that you have set up a domain name for one of your Cloud Storage buckets and you want to make sure only URLs starting with your domain name have access to this bucket.
I am afraid that this is not possible because of how Cloud Storage permission works.
Making a Cloud Storage bucket publicly readable also gives each of its files a public link. And currently this public link can’t be disabled.
A workaround would be to implement a proxy program and run it on a Compute Engine virtual machine. This VM will need a static external IP so that you can map your domain to it. The proxy program will be in charge of returning the requested file from a predefined Cloud Storage bucket, while the bucket itself remains inaccessible to the public.
You may find these documents helpful if you are interested in this workaround:
1. Quick start to set up a Linux VM (1).
2. Python API for accessing Cloud Storage files (2).
3. How to download service account keys to grant a program access to a set of services (3).
4. Pricing calculator for getting a picture on how much a VM may cost (4).
(1) https://cloud.google.com/compute/docs/quickstart-linux
(2) https://pypi.org/project/google-cloud-storage/
(3) https://cloud.google.com/iam/docs/creating-managing-service-account-keys
(4) https://cloud.google.com/products/calculator/
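As a rough illustration of the proxy workaround described above, here is a minimal sketch in Python using the google-cloud-storage package from link (2). The bucket name, the port, and the helper names (object_name_from_path, make_handler, gcs_fetcher) are assumptions made for this example, not part of the support reply. It assumes the VM's credentials can read the private bucket, and a real proxy would also want caching, range requests, and better error handling.

```python
import mimetypes
import posixpath
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
from urllib.parse import unquote, urlsplit

BUCKET_NAME = "example.com"  # assumption: the private bucket from the question


def object_name_from_path(url_path):
    """Map a request path such as /images/tiny.png to an object name.

    Returns None for empty paths and for paths containing "..".
    """
    path = unquote(urlsplit(url_path).path)
    if ".." in path.split("/"):
        return None
    name = posixpath.normpath(path).lstrip("/")
    return name if name not in ("", ".") else None


def make_handler(fetch):
    """Build a handler class; fetch(name) returns bytes, or None if missing."""

    class ProxyHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            name = object_name_from_path(self.path)
            data = fetch(name) if name else None
            if data is None:
                self.send_error(404)
                return
            content_type = mimetypes.guess_type(name)[0] or "application/octet-stream"
            self.send_response(200)
            self.send_header("Content-Type", content_type)
            self.send_header("Content-Length", str(len(data)))
            self.end_headers()
            self.wfile.write(data)

    return ProxyHandler


def gcs_fetcher(bucket_name):
    """Return a fetch function backed by a (non-public) Cloud Storage bucket."""
    # Imported lazily so the helpers above work without the package installed.
    from google.cloud import storage  # pip install google-cloud-storage
    from google.cloud.exceptions import NotFound

    bucket = storage.Client().bucket(bucket_name)

    def fetch(name):
        try:
            return bucket.blob(name).download_as_bytes()
        except NotFound:
            return None

    return fetch


def main():
    # Port 8080 is an arbitrary choice for this sketch.
    handler = make_handler(gcs_fetcher(BUCKET_NAME))
    ThreadingHTTPServer(("", 8080), handler).serve_forever()
```

Calling main() on the VM starts the proxy. Because the fetch function is passed in, the HTTP part can be tried locally with a stub such as make_handler(lambda name: b"test") before any bucket is involved.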
My Version:
It seems the solution to this question is really simple: mount the Google Cloud Storage bucket on a VM instance with FUSE.
Once the bucket is mounted, private files from GCS can be accessed through the VM's IP address, because the bucket acts like a directory on the VM.
The detailed documentation about how to set up FUSE in Google Cloud is here.

There is, but it requires more work on your part.
Your current solution works because you've made access to the GCS bucket (example.com) public and then you're DNS aliasing from your domain.
An alternative approach would be for you to limit access to the GCS bucket to one (possibly several) accounts and then run a web-server that uses one of the accounts to access your image files. You could then also either permit access to your web-server to anyone or also limit access to it.
More work for you (and possibly cost) but more control.

Related

Google Cloud Storage 500 Internal Server Error 'Google::Cloud::Storage::SignedUrlUnavailable'

Trying to get Google Cloud Storage working on my app. I successfully saved an image to a bucket, but when trying to retrieve the image, I receive this error:
GCS Storage (615.3ms) Generated URL for file at key: 9A95rZATRKNpGbMNDbu7RqJx ()
Completed 500 Internal Server Error in 618ms (ActiveRecord: 0.2ms)
Google::Cloud::Storage::SignedUrlUnavailable (Google::Cloud::Storage::SignedUrlUnavailable):
Any idea of what's going on? I can't find an explanation for this error in their documentation.
To provide some explanation here...
Google App Engine (as well as Google Compute Engine, Kubernetes Engine, and Cloud Run) provides "ambient" credentials associated with the VM or instance being run, but only in the form of OAuth tokens. For most API calls, this is sufficient and convenient.
However, there are a small number of exceptions, and Google Cloud Storage is one of them. Recent Storage clients (including the google-cloud-storage gem) may require a full service account key to support certain calls that involve signed URLs. This full key is not provided automatically by App Engine (or other hosting environments). You need to provide one yourself. So as a previous answer indicated, if you're using Cloud Storage, you may not be able to depend on the "ambient" credentials. Instead, you should create a service account, download a service account key, and make it available to your app (for example, via the ActiveStorage configs, or by setting the GOOGLE_APPLICATION_CREDENTIALS environment variable).
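The question is about the Ruby gem, but the same idea can be sketched with the Python client, where signing is done by Blob.generate_signed_url. This is only an analogous illustration under stated assumptions: the key file path, bucket name, and object name are placeholders, and signed_download_url is a helper name invented for this example. The client is built from a service account key file so that a private key is available for signing.

```python
from datetime import timedelta


def signed_download_url(bucket, object_name, minutes=15):
    """Return a time-limited V4 signed GET URL for one object in bucket."""
    blob = bucket.blob(object_name)
    return blob.generate_signed_url(
        version="v4",
        expiration=timedelta(minutes=minutes),
        method="GET",
    )


def main():
    # Imported lazily so the helper above can be exercised without the package.
    from google.cloud import storage  # pip install google-cloud-storage

    # Assumption: placeholder key file path, bucket name, and object name.
    client = storage.Client.from_service_account_json("path/to/keyfile.json")
    print(signed_download_url(client.bucket("my-bucket"), "images/tiny.png"))
```

The helper takes a bucket object instead of creating its own client, so the choice of credentials stays in one place (main here) and is easy to swap.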
I was able to figure this out. I had been following the Rails guide on Active Storage with Google Cloud Storage, and was unclear on how to generate my credentials file.
google:
  service: GCS
  credentials: <%= Rails.root.join("path/to/keyfile.json") %>
  project: ""
  bucket: ""
Initially, I thought I didn't need a keyfile due to this sentence in Google's Cloud Storage authentication documentation:
If you're running your application on Google App Engine or Google
Compute Engine, the environment already provides a service account's
authentication information, so no further setup is required.
(I am using Google App Engine)
So I commented out the credentials line and started testing. Strangely, I was able to write to Google Cloud Storage without issue. However, when retrieving the image I would receive the 500 server error Google::Cloud::Storage::SignedUrlUnavailable.
I fixed this by generating my private key and adding it to my rails app.
Another possible solution as of google-cloud-storage gem version 1.27 in August 2020 is documented here. My Google::Auth.get_application_default as in the documentation returned an empty object, but using Google::Cloud::Storage::Credentials.default.client instead worked.
If you get a Google::Apis::ClientError: badRequest: Request contains an invalid argument response when signing, check two things. First, the signing URL should use a dash in place of the project name (i.e. projects/-/serviceAccounts; an explicit project name in the path is deprecated and no longer valid). Second, the "issuer" string must be the full email address of the service account, not just the service account name.
If you get Google::Apis::ClientError: forbidden: The caller does not have permission, verify the roles your service account has:
gcloud projects get-iam-policy <project-name> \
  --filter="bindings.members:<sa_name>" \
  --flatten="bindings[].members" --format='table(bindings.role)'
=> ROLE
roles/iam.serviceAccountTokenCreator
roles/storage.admin
serviceAccountTokenCreator is required to call the signBlob service, and you need storage.admin to have ownership of the thing you need to sign. I think these have to be project-wide grants; unfortunately, I couldn't get it to work with more fine-grained permissions (e.g. an app that is admin only for a certain Storage bucket).
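The same check can be scripted against the JSON form of the policy. The sketch below assumes a dictionary shaped like the output of gcloud projects get-iam-policy with --format=json (a "bindings" list of role/members pairs). The service account address and the example policy are hypothetical, and only direct project-level bindings are considered, not roles inherited through groups or granted on individual buckets.

```python
# The two roles named in the answer above.
REQUIRED_ROLES = {
    "roles/iam.serviceAccountTokenCreator",
    "roles/storage.admin",
}

# Hypothetical member and policy, for illustration only.
SERVICE_ACCOUNT = "serviceAccount:my-app@my-project.iam.gserviceaccount.com"
EXAMPLE_POLICY = {
    "bindings": [
        {"role": "roles/storage.admin", "members": [SERVICE_ACCOUNT]},
        {"role": "roles/viewer", "members": ["user:someone@example.com"]},
    ]
}


def roles_for_member(policy, member):
    """Return the sorted roles that the policy binds directly to member."""
    return sorted(
        binding["role"]
        for binding in policy.get("bindings", [])
        if member in binding.get("members", [])
    )


def missing_roles(policy, member, required=REQUIRED_ROLES):
    """Return the required roles that member does not hold, sorted."""
    return sorted(set(required) - set(roles_for_member(policy, member)))


def main():
    print("granted:", roles_for_member(EXAMPLE_POLICY, SERVICE_ACCOUNT))
    print("missing:", missing_roles(EXAMPLE_POLICY, SERVICE_ACCOUNT))
```

With the example policy, the service account holds storage.admin but lacks serviceAccountTokenCreator, which is the situation that would produce the forbidden error when signing.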

working with Google Cloud Storage without gsutil

I have developed software that saves files to configured directories. It runs on Linux, and the directories are specified in a config file.
I would like to use Compute Engine nodes because I need to increase its performance. Therefore, I would like to use Google Cloud Storage to store these files in a safe repository.
The documentation at [1] shows how to mount a bucket as a file system. I tried it without success: I get an authentication error.
Can anyone help me access my bucket from Compute Engine nodes?
[1] https://cloud.google.com/compute/docs/disks/gcs-buckets
Best regards,
It sounds like you did not start your GCE instance with a service account.
According to the docs you linked, you need to configure a service account or run gcloud auth login to configure your credentials for accessing cloud storage.
If you are trying to set up gcsfuse without running on GCE you will need to use the gcloud auth login approach.

Right way of using Google Storage on a GCE VM

I want to know the right/best way of having one machine copying data to Google Storage.
I need one machine to be able to write to a bucket, but not be able to create or delete other buckets.
While researching, I found out that you should create a service account so that this account can log in to Google Cloud and then use the storage.
But the problem is that when the machine is a GCE instance, there are scopes. With the "Default" scope, it can read from Google Storage but cannot write to it, even after authenticating with a service account.
With the scope devstorage.read_write, the machine can create and remove buckets in that storage without logging in. I find that too risky.
Does anyone have any recommendations?
Thanks
The core problem here is that the "write" scope covers both write and delete, and that the GCE service account is likely a member of project-editors, which can create and delete buckets. It sounds like what you want to do is restrict a service account to only being able to affect a single bucket. You should be able to do this with these steps:
1. Create a service account in your project (and save the private key file).
2. In the permissions page for the project, make sure that service account is not a project editor for your project.
3. Using an account that does have full permissions to your project, create the bucket, then grant the service account write access to the bucket. Example gsutil commands to do this:
   gsutil mb gs://yourbucket
   gsutil acl ch -u your-service-account-name@gserviceaccount.com:W gs://yourbucket
4. Create a VM that does not have a GCE service account enabled.
5. Push the service account's private key file to that VM.
6. On the VM, run: gcloud auth activate-service-account --key-file=your-key-file.json
Now gsutil commands run on the VM should be able to write to (and delete) objects in that bucket, but not any other buckets in your project.
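If the copying is done from a program instead of gsutil, the same key file can be used with the google-cloud-storage Python package. This is a sketch under assumptions: the key file and bucket names reuse the placeholders from the steps above, the local and object paths are made up, and upload_file is a helper name invented for this example.

```python
def upload_file(bucket, local_path, object_name):
    """Upload one local file to bucket and return its gs:// URL."""
    blob = bucket.blob(object_name)
    blob.upload_from_filename(local_path)
    return f"gs://{bucket.name}/{object_name}"


def main():
    # Imported lazily so the helper above can be exercised without the package.
    from google.cloud import storage  # pip install google-cloud-storage

    # Assumption: the key file pushed to the VM in the steps above.
    client = storage.Client.from_service_account_json("your-key-file.json")
    # client.bucket() only builds a local reference; it makes no API request.
    bucket = client.bucket("yourbucket")
    print(upload_file(bucket, "backup.tar.gz", "backups/backup.tar.gz"))
```

Because the client is created from the restricted service account's key, an upload to any other bucket in the project should be rejected, which matches the goal of the steps above.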

Authorizing GCE to Access GCS

I have a Django app running on my Google Compute Engine instance, and it needs to upload video files to my bucket in Google Cloud Storage. When searching for authentication methods, I found this doc. Under the "Setting the scope of service account access for instances" section, it says I need to enable Cloud Platform access in the settings when creating the VM. I wonder whether that is a must, or whether there is any other way to access my Cloud Storage bucket from my apps on Compute Engine, because creating a new VM and setting up the environment is very time-consuming. Any input would be greatly appreciated. Thanks in advance.
As documented on the page you linked to, to authenticate from Google Compute Engine to Google Cloud Storage, you have several options:
Use VM scopes: this must be set before creating the VM, because scopes are immutable once the VM is created. If you want read-only access, you need to add the scope devstorage.read_only (short form) or https://www.googleapis.com/auth/devstorage.read_only (full path). If you want read-write access, you should use the scope devstorage.read_write (short form) or https://www.googleapis.com/auth/devstorage.read_write (full path).
Note: there's also a feature gcloud beta compute instances set-scopes to update GCE VM scopes at runtime.
An alternative to using scopes is to use JSON authentication tokens, such as via Service accounts which can be used by Google API client libraries to connect to Google Cloud Storage.
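To see which scopes an existing VM actually has before deciding between these options, the instance can query the Compute Engine metadata server. The sketch below uses only the Python standard library; the function names are invented for this example, and the set of scopes treated as allowing writes (devstorage.read_write, devstorage.full_control, cloud-platform) is an assumption of this sketch. fetch_scopes only works when run on a GCE instance.

```python
import urllib.request

METADATA_URL = (
    "http://metadata.google.internal/computeMetadata/v1/"
    "instance/service-accounts/default/scopes"
)

# Scopes assumed here to be sufficient for writing objects.
WRITE_SCOPES = {
    "https://www.googleapis.com/auth/devstorage.read_write",
    "https://www.googleapis.com/auth/devstorage.full_control",
    "https://www.googleapis.com/auth/cloud-platform",
}


def parse_scopes(text):
    """Split the metadata server's newline-separated scope list."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def can_write_storage(scopes):
    """Return True if any scope in the list allows writing to Cloud Storage."""
    return any(scope in WRITE_SCOPES for scope in scopes)


def fetch_scopes(timeout=2):
    """Ask the metadata server for the default service account's scopes."""
    request = urllib.request.Request(
        METADATA_URL, headers={"Metadata-Flavor": "Google"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_scopes(response.read().decode("utf-8"))


def main():
    scopes = fetch_scopes()
    print("\n".join(scopes))
    print("can write to Cloud Storage:", can_write_storage(scopes))
```

If the result shows only devstorage.read_only, the app would need either different scopes on the VM or a service account key, as described in the options above.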

Google Storage access based on IP Address

Is there a way to give access to a Google Cloud Storage bucket based on the IP address the request is coming from?
On Amazon s3, you can just set this in the access policy like this:
"Condition" : {
    "IpAddress" : {
        "aws:SourceIp" : ["192.168.176.0/24","192.168.143.0/24"]
    }
}
I do not want to use a signed url.
The updated answers on this page are only partially correct and should not be recommended for the use case of access control to Cloud Storage Objects.
Access Context Manager (ACM) defines rules to allow access (e.g. an IP address).
VPC Service Controls create an "island" around a project and ACM rules can be attached. These rules are "ingress" rules and not "egress" rules meaning "anyone at that IP can get into all resources in the project with the correct IAM permissions".
The ACM rule specifying an IP address will allow that IP address to access all Cloud Storage Objects and all other protected resources owned by that project. This is usually not the intended result. You cannot apply an IP address rule to an object, only to all objects in a project. VPC Service Controls are designed to prevent data from getting out of a project and are NOT designed to allow untrusted anonymous users access to a project's resources.
UPDATE: This is now possible using VPC Service Controls
No, this is not currently possible.
There's currently a Feature request to restrict google cloud storage bucket by IP Address.
The VPC Service Controls [1] allow users to define a security perimeter around Google Cloud Platform resources such as Cloud Storage buckets (and some others) to constrain data within a VPC and help mitigate data exfiltration risks.
[1] https://cloud.google.com/vpc-service-controls/
I used VPC Service Controls on behalf of a client recently to attempt to accomplish this. You cannot use VPC Service Controls to whitelist an ip address on a single bucket. Jterrace is right. There is no such solution for that today. However, using VPC Service Controls you can draw a service perimeter around the Google Cloud Storage (GCS) service as a whole within a given project, then apply an access level to your service perimeter in the project to allow an ip address/ip address range access to the service (and resources within). The implications are that any new buckets created within the project will be created within the service perimeter and thus be regulated by the access levels applied to the perimeter. So you'll likely want this to be the sole bucket in this project.
Note that the service perimeter only affects services you specify. It does not protect the project as a whole.
Developer Permissions:
Access Context Manager
VPC Service Controls
Steps to accomplish this:
1. Use VPC Service Controls to create a service perimeter around the entire Google Cloud Storage service in the project of your choosing.
2. Use Access Context Manager to create access levels for the IP addresses you want to whitelist and the users/groups who will have access to the service.
3. Apply these access levels to the service perimeter created in the previous step (it will take 30 minutes for this change to take effect).
Note: Best practice would be to provide access to the bucket using a service account or users/groups ACL, if that is possible. I know it isn't always so.
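For readers unfamiliar with the CIDR ranges used in such IP-based rules, the following sketch shows what matching a caller's address against a list of ranges amounts to. It uses only Python's standard ipaddress module and the two ranges from the S3 example in the question. It does not call Access Context Manager or any other Google API, and the function name is invented for this illustration.

```python
import ipaddress

# The two ranges from the S3 policy example in the question.
ALLOWED_RANGES = ["192.168.176.0/24", "192.168.143.0/24"]


def ip_allowed(ip, cidr_ranges):
    """Return True if ip falls inside at least one of the CIDR ranges."""
    address = ipaddress.ip_address(ip)
    return any(address in ipaddress.ip_network(cidr) for cidr in cidr_ranges)


def main():
    for candidate in ("192.168.176.10", "192.168.177.1"):
        print(candidate, ip_allowed(candidate, ALLOWED_RANGES))
```

The first address is inside 192.168.176.0/24 and is accepted; the second is outside both ranges and is rejected.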