Google Cloud Storage: change bucket region - google-cloud-storage

I'm using Google Cloud Storage.
I have created a bucket in the US, and now I need to move it to the EU due to GDPR.
Is there any way to change the bucket location?
If not, in case of removing the bucket and creating a new one instead - can I give it the same name (as it is globally unique)?

Yes, as per Cloud Storage documentation, you can move or rename your bucket:
If there is no data in your old bucket, delete the bucket and create another bucket with a new name, in a new location, or in a new project.
If you have data in your old bucket, create a new bucket with the desired name, location, and/or project, copy data from the old bucket to the new bucket, and delete the old bucket and its contents. (The link provided above describes this process.)
Keep in mind that if you would like your new bucket to have the same name as your old bucket, you must move your data twice: an intermediary bucket temporarily holds your data so that you can delete the original bucket and free up the bucket name for the final bucket.
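As a rough illustration, here is what that double move could look like with the Python client library; the bucket names and location below are placeholders, and force=True on delete only works for buckets holding a small number of objects:

from google.cloud import storage

client = storage.Client()

OLD_NAME = "my-bucket"        # existing US bucket (placeholder)
TEMP_NAME = "my-bucket-temp"  # intermediary bucket (placeholder)
NEW_LOCATION = "EU"

def copy_all(src_bucket, dst_bucket):
    """Copy every object from src_bucket to dst_bucket."""
    for blob in client.list_blobs(src_bucket):
        src_bucket.copy_blob(blob, dst_bucket, blob.name)

old_bucket = client.get_bucket(OLD_NAME)
temp_bucket = client.create_bucket(TEMP_NAME, location=NEW_LOCATION)

# 1. Stage the data in the intermediary bucket.
copy_all(old_bucket, temp_bucket)

# 2. Delete the original bucket to free up its name (force deletes its objects first).
old_bucket.delete(force=True)

# 3. Recreate the bucket under the same name in the new location.
new_bucket = client.create_bucket(OLD_NAME, location=NEW_LOCATION)

# 4. Move the data back and clean up the intermediary bucket.
copy_all(temp_bucket, new_bucket)
temp_bucket.delete(force=True)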

Related

How to store AWS S3 object data in a Postgres DB

I'm working on a Golang application where users will be able to upload files: images & PDFs.
The files will be stored in an AWS S3 bucket, which I've implemented. However, I don't know how to go about retrieving identifiers for the stored items so I can save them in Postgres.
I was thinking of using an item.ID, but the AWS SDK for Go method does not provide an object ID:
for _, item := range response.Contents {
    log.Printf("Name : %s\n", *item.Key)
    // there is no ID field to dereference here:
    // log.Printf("ID : %s\n", *item.)
}
What other options are available to retrieve stored object references from AWS S3?
A common approach is to trigger a Lambda function from an S3 bucket event. This way, you can get more details about the object created within your bucket, and have the Lambda function persist the object metadata into Postgres.
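As a rough sketch of that approach in Python (the files table, its columns, and the DATABASE_URL environment variable are hypothetical, and psycopg2 is assumed to be bundled with the Lambda deployment package):

import os
import urllib.parse

import psycopg2  # assumed to be packaged with the function


def handler(event, context):
    """Triggered by an S3 ObjectCreated event; stores object metadata in Postgres."""
    conn = psycopg2.connect(os.environ["DATABASE_URL"])
    with conn, conn.cursor() as cur:
        for record in event["Records"]:
            bucket = record["s3"]["bucket"]["name"]
            key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])
            size = record["s3"]["object"]["size"]
            etag = record["s3"]["object"]["eTag"]
            # "files" table and its columns are hypothetical.
            cur.execute(
                "INSERT INTO files (bucket, key, size, etag) VALUES (%s, %s, %s, %s)",
                (bucket, key, size, etag),
            )
    conn.close()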
Another option would be simply to append the object key you are using in your SDK to the name of the bucket you're targeting; the result is a full URI that points to the stored object. Something like this:
s3://{{BUCKET_NAME}}/{{OBJECT_KEY}}
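For instance, a minimal sketch of building such a reference (the bucket and key are placeholders):

bucket_name = "my-uploads-bucket"     # bucket used in the PutObject call (placeholder)
object_key = "users/42/report.pdf"    # key used in the PutObject call (placeholder)
object_uri = f"s3://{bucket_name}/{object_key}"
# store object_uri in Postgres, e.g. INSERT INTO files (uri) VALUES (%s)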

Azure Data Factory event trigger on new container with files added

Can Azure Data Factory trigger an event when a new container with files is added to a storage account? If not, how can this be implemented?
You need to read this for full details.
The "Blob path begins with" and "Blob path ends with" properties allow you to specify the containers, folders, and blob names for which you want to receive events.

Can we preserve the storage class when copying an object to a new bucket?

We have two different buckets: short-term, which has lifecycle policies applied, and retain, where we put data that we intend to keep indefinitely. The way we get data into the retain bucket is usually by copying the original object from the short-term bucket using the JSON API.
The short-term bucket moves data to Nearline after 30 days, to Coldline after 60 days, and deletes it after 90 days. The storage class for our retain bucket is Standard. When we're copying data from the short-term bucket to the retain bucket, we'd like to preserve the storage class of the object we're duplicating - is it possible for us to specify the storage class on the destination object using the JSON API?
If you want to preserve the storage class, it is recommended to perform a rewrite instead:
Use the copy method to copy between objects in the same location and storage class.
In the rewrite request you should set the desired storage class. The copy method would apply if you had already separated the objects according to storage class but, as per my understanding, this is not your case.
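As an illustration with the Python client, which calls the JSON API's rewrite method under the hood, you can set the storage class on the destination object before rewriting; the bucket and object names below are placeholders:

from google.cloud import storage

client = storage.Client()
src_bucket = client.bucket("short-term")
dst_bucket = client.bucket("retain")

src_blob = src_bucket.get_blob("path/to/object")  # get_blob also fetches the storage class
dst_blob = dst_bucket.blob(src_blob.name)
dst_blob.storage_class = src_blob.storage_class   # preserve the source object's class

# Large objects may need several rewrite calls; keep passing the token back.
token = None
while True:
    token, bytes_rewritten, total_bytes = dst_blob.rewrite(src_blob, token=token)
    if token is None:
        break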

Google Cloud Storage: How to Delete a folder (recursively) in Python

I am trying to delete a folder in GCS and all of its content (including sub-directories) with its Python library. I also understand GCS doesn't really have folders (just prefixes), but I am wondering how I can do that.
I tested this code:
from google.cloud import storage

def delete_blob(bucket_name, blob_name):
    """Deletes a blob from the bucket."""
    storage_client = storage.Client()
    bucket = storage_client.get_bucket(bucket_name)
    blob = bucket.blob(blob_name)
    blob.delete()

delete_blob('mybucket', 'top_folder/sub_folder/test.txt')
delete_blob('mybucket', 'top_folder/sub_folder/')
The first call to delete_blob worked, but not the second one. How can I delete a folder recursively?
To delete everything starting with a certain prefix (for example, a directory name), you can iterate over a list:
storage_client = storage.Client()
bucket = storage_client.get_bucket(bucket_name)
blobs = bucket.list_blobs(prefix='some/directory')
for blob in blobs:
    blob.delete()
Note that for very large buckets with millions or billions of objects, this may not be a very fast process. For that, you'll want to do something more complex, such as deleting in multiple threads or using lifecycle configuration rules to arrange for the objects to be deleted.
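For example, a rough sketch of threaded deletion (the bucket, prefix, and worker count are illustrative):

from concurrent.futures import ThreadPoolExecutor

from google.cloud import storage

storage_client = storage.Client()
bucket = storage_client.get_bucket('mybucket')
blobs = bucket.list_blobs(prefix='top_folder/sub_folder/')

with ThreadPoolExecutor(max_workers=32) as pool:
    # list() forces iteration so any per-object errors surface here
    list(pool.map(lambda blob: blob.delete(), blobs))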
Now it can be done by:
def delete_folder(cls, bucket_name, folder_name):
    """Delete every object under folder_name."""
    bucket = cls.storage_client.get_bucket(bucket_name)
    blobs = list(bucket.list_blobs(prefix=folder_name))
    bucket.delete_blobs(blobs)
    print(f"Folder {folder_name} deleted.")
deleteFiles might be what you are looking for, or, in Python, delete_blobs. Assuming they are the same, the Node docs do a better job of describing the behavior, namely:
This is not an atomic request. A delete attempt will be made for each file individually. Any one can fail, in which case only a portion of the files you intended to be deleted would have been.
Operations are performed in parallel, up to 10 at once.
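For example, in the Python client you can pass an on_error callback so that objects that are already gone (404s) don't abort the whole batch; the bucket name and prefix are placeholders:

from google.cloud import storage

client = storage.Client()
bucket = client.get_bucket('mybucket')
blobs = list(bucket.list_blobs(prefix='top_folder/sub_folder/'))

# on_error is invoked for blobs that return Not Found; other errors still propagate.
bucket.delete_blobs(blobs, on_error=lambda blob: print(f'not found: {blob.name}'))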

Google Storage API: list storage bucket with "/" in the name

I am trying to list all objects in a bucket (Google Storage) with the Google Storage API. The bucket is nested like a folder, such as "my-bucket/sub-folder". I got the following error:
com.google.api.client.googleapis.json.GoogleJsonResponseException: 404 Not Found
If I use a bucket name without "/", it works fine. How can I list a bucket like a folder structure?
Google Cloud Storage buckets do not have slashes in their name. In the example above, the bucket is named "my-bucket" and the object is named something like "sub-folder/object.txt" or just "object.txt".
It's useful to remember that GCS does not have any real notion of folders. There are only buckets and objects in buckets. If you have a subdirectory named "dir" in bucket named "mybucket", and that subdirectory has 5 objects in it, what you really have is 5 objects named "dir/obj1", "dir/obj2", etc, all still within bucket "mybucket."
A number of tools (like gsutil and the GCS web-based storage browser) make it appear that there are folders, through use of markers and prefixes in the API -- even though as noted, there really are just objects that have slashes in the name.
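For example, with the Python client, listing with a prefix and the "/" delimiter produces that folder-like view (the bucket and prefix below are placeholders):

from google.cloud import storage

client = storage.Client()

# List what looks like the contents of "sub-folder/" inside the bucket "my-bucket".
iterator = client.list_blobs("my-bucket", prefix="sub-folder/", delimiter="/")

for blob in iterator:
    print("object:", blob.name)

# "Sub-folders" are reported as prefixes, populated once the iterator has been consumed.
for p in iterator.prefixes:
    print("prefix:", p)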