Google Cloud Storage cache control issue with Google CDN - google-cloud-storage

I use Google CDN with a load balancer and changed the metadata on an existing object to reset its cache control:
$ gsutil setmeta -h "Cache-Control:private, max-age=0, no-transform" gs://bucket/*.jpg
I then uploaded a new version of the object, but I still get the old one.
Any idea why the old object is still being served?

Changing the metadata on a Cloud Storage object does not remove any already cached copies from Cloud CDN caches. To instruct Cloud CDN to stop serving from cache, you can request a cache invalidation. There's more information at https://cloud.google.com/cdn/docs/cache-invalidation-overview.
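For illustration, a single invalidation request might look like the following; the URL map name my-url-map is hypothetical, and the path pattern may end in /* to cover everything under a prefix:
$ gcloud compute url-maps invalidate-cdn-cache my-url-map --path "/images/*"
Once the invalidation completes, Cloud CDN fetches the object again from the backend, now carrying the updated Cache-Control metadata.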

Related

Use only a domain and disable https://storage.googleapis.com URL access

I am a newbie at cloud servers and I've created a Google Cloud Storage bucket to host image files. I've verified my domain and configured the bucket so the images can be viewed via my domain. The problem is that the same file is accessible both via my domain at example.com/images/tiny.png and via storage.googleapis.com/example.com/images/tiny.png. Is there any way to disable access via storage.googleapis.com and use only my domain?
Google Cloud Platform Support Version:
NOTE: This is the reply from Google Cloud Platform Support when contacted via email...
I understand that you have set up a domain name for one of your Cloud Storage buckets and you want to make sure only URLs starting with your domain name have access to this bucket.
I am afraid that this is not possible because of how Cloud Storage permissions work.
Making a Cloud Storage bucket publicly readable also gives each of its files a public link, and currently this public link can’t be disabled.
A workaround would be to implement a proxy program and run it on a Compute Engine virtual machine. The VM will need a static external IP so that you can map your domain to it. The proxy program will be in charge of returning the requested file from a predefined Cloud Storage bucket while the bucket itself remains inaccessible to the public.
You may find these documents helpful if you are interested in this workaround:
1. Quick start to set up a Linux VM (1).
2. Python API for accessing Cloud Storage files (2).
3. How to download service account keys to grant a program access to a set of services (3).
4. Pricing calculator for getting a picture of how much a VM may cost (4).
(1) https://cloud.google.com/compute/docs/quickstart-linux
(2) https://pypi.org/project/google-cloud-storage/
(3) https://cloud.google.com/iam/docs/creating-managing-service-account-keys
(4) https://cloud.google.com/products/calculator/
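For illustration, a minimal sketch of such a proxy using the google-cloud-storage Python package from (2) might look like the code below. The bucket name and port are hypothetical, it assumes the VM's service account (or a key file as in (3)) can read the bucket, and a real deployment would add caching, logging and better error handling.

# proxy.py - sketch: serve objects from a private Cloud Storage bucket over HTTP.
from http.server import BaseHTTPRequestHandler, HTTPServer
from google.cloud import storage

BUCKET_NAME = "example.com"  # hypothetical bucket name

client = storage.Client()        # uses the VM's service account credentials
bucket = client.bucket(BUCKET_NAME)

class ProxyHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Map a request path like /images/tiny.png to the object images/tiny.png.
        blob = bucket.get_blob(self.path.lstrip("/"))
        if blob is None:
            self.send_error(404, "Not found")
            return
        data = blob.download_as_bytes()
        self.send_response(200)
        self.send_header("Content-Type", blob.content_type or "application/octet-stream")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)

if __name__ == "__main__":
    HTTPServer(("0.0.0.0", 8080), ProxyHandler).serve_forever()

You would then point your domain's DNS record at the VM's static IP while the bucket itself stays private.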
My Version:
It seems the solution to this question is really simple: just mount the Google Cloud Storage bucket on the VM instance with FUSE.
Once mounted, private files from GCS can be accessed through the VM's IP address, and the Cloud Storage bucket behaves like a local directory.
The detailed documentation on how to set up FUSE in Google Cloud is here.
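Once gcsfuse is installed on the VM, the mount itself is a single command (bucket name and mount point below are hypothetical):
$ mkdir -p /mnt/gcs
$ gcsfuse example-bucket /mnt/gcs
After that, software on the VM can read and write /mnt/gcs like any other directory.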
There is, but it requires you to do more work.
Your current solution works because you've made access to the GCS bucket (example.com) public and you're then DNS-aliasing it from your domain.
An alternative approach would be to limit access to the GCS bucket to one (or possibly several) accounts and then run a web server that uses one of those accounts to access your image files. You could then either open the web server to everyone or restrict access to it as well.
More work for you (and possibly more cost), but more control.
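For example, locking the bucket down to a single service account (the account address below is hypothetical) could look like this: the first command removes the public read grant, the second gives read access to the web server's account.
$ gsutil iam ch -d allUsers:objectViewer gs://example.com
$ gsutil iam ch serviceAccount:web-proxy@my-project.iam.gserviceaccount.com:objectViewer gs://example.com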

Is gsutil cp secure for uploading sensitive data?

I'm reading the docs about how to use Google Cloud, particularly how to store data in a bucket.
I can see the gcloud scp command for uploading a file to a VM in a secure way (highlighted in the docs).
To upload to a bucket, the docs say to use gsutil cp.
Is this command secure? If I want to upload sensitive data, do I have to take more precautions (and if so, how)?
As per the documentation:
By default, gsutil accesses Cloud Storage through JSON API request endpoints. You can change this default to the XML API.
The JSON API request endpoint is HTTPS - so assuming the security provided by HTTPS is sufficient for your needs, it should be fine. That won't guard against attacks if your local machine has been compromised with a bogus version of gsutil, but at that point all bets are probably off.
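If you would rather be explicit than rely on the default, recent gsutil versions let you pin the API in the boto configuration file (~/.boto); the snippet below assumes the prefer_api option described in gsutil's configuration docs:
[GSUtil]
prefer_api = json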

Working with Google Cloud Storage without gsutil

I have developed a piece of software in which the directories used to save files are configurable. I run it on Linux, and these directories are specified in a config file.
I would like to use Compute Engine nodes because I need to increase its performance. Therefore, I would like to use Google Cloud Storage to save these files into a safe repository.
[1] shows how to mount a bucket as a file system. I tried it, but with no success: I get an authentication error.
Can anyone help me access my bucket from my Compute Engine nodes?
[1] https://cloud.google.com/compute/docs/disks/gcs-buckets
Best regards,
It sounds like you did not start your GCE instance with a service account.
According to the docs you linked, you need to configure a service account or run gcloud auth login to configure your credentials for accessing Cloud Storage.
If you are trying to set up gcsfuse without running on GCE you will need to use the gcloud auth login approach.
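As a sketch of the two options (instance name, zone and scope alias below are illustrative): on GCE, create the instance with a Cloud Storage scope so its service account can reach your bucket:
$ gcloud compute instances create my-instance --zone us-central1-a --scopes storage-full
Outside GCE (or to use your own user credentials instead), authenticate interactively:
$ gcloud auth login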

Set all files in Google Cloud Storage bucket to gzip by default

I am trying to set up a Google Cloud Storage bucket so that any files I upload are automatically gzip'd and "Content-Encoding: gzip" is set.
I tried "gsutil defacl set public-read gs://bucket", based upon "Set all files in Google Cloud Storage Bucket to public by default", but was unsuccessful.
Any ideas?
There's no way to configure a bucket to automatically gzip files being uploaded there by default. One possibility would be to configure object change notifications on the bucket and implement code that responds to the notifications by reading the new input object and writing a compressed equivalent and then deleting the original.
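A rough sketch of the compress-and-replace step in Python (using the google-cloud-storage package) is below. The notification wiring itself, for example via gsutil notification watchbucket or a Pub/Sub-based trigger, is omitted; the function just receives the bucket and object names a notification would deliver. Here the compressed copy simply overwrites the original under the same name, so no separate delete is needed.

# gzip_object.py - sketch: replace a newly uploaded object with a gzip'd copy
# that has Content-Encoding: gzip set.
import gzip
from google.cloud import storage

def gzip_in_place(bucket_name, object_name):
    client = storage.Client()
    bucket = client.bucket(bucket_name)
    blob = bucket.get_blob(object_name)
    # Skip missing objects and objects that are already compressed, so the
    # notification triggered by our own rewrite is ignored.
    if blob is None or blob.content_encoding == "gzip":
        return

    original_type = blob.content_type or "application/octet-stream"
    compressed = gzip.compress(blob.download_as_bytes())

    replacement = bucket.blob(object_name)
    replacement.content_encoding = "gzip"
    replacement.upload_from_string(compressed, content_type=original_type)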

Upload files to Google Cloud Storage without downloading them locally?

I want to upload 120 files, each around 1.2GB so about 150GB in total, from an HTTPS website onto my Google Cloud Storage.
I really, really don't want to have to download them all locally, and then upload them individually.
Is there any way around this? Surely I can just give Google Cloud Storage a URL to pull from? I don't control the HTTPS server.
It seems to be possible to upload from S3 to Google Cloud Storage, but S3 seems to suffer from the same problem.
If your website allows public access, you can use the GCS Transfer Service to do it: https://cloud.google.com/storage/transfer/
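For an HTTPS source, the transfer's source is a "URL list": a publicly readable, tab-separated file that enumerates the objects to copy. A sketch is below; the header line is the TsvHttpData-1.0 marker, and the URLs, byte sizes and Base64-encoded MD5 values are placeholders (check the Transfer Service docs for the exact columns required):
TsvHttpData-1.0
https://example.com/files/part-001.dat	1288490189	<base64-md5-of-part-001>
https://example.com/files/part-002.dat	1288490189	<base64-md5-of-part-002>
You host this list somewhere reachable, point a transfer job at it, and the Transfer Service pulls the objects straight into your bucket without anything passing through your machine.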