Move a bucket to another storage class - google-cloud-storage

I want to transfer a complete bucket to Coldline easily. My problem is that when I try to run gsutil, it disconnects and charges me each time.
This is the command I'm trying to use:
gsutil rewrite -s coldline gs://bucket/**

You can use a lifecycle policy on the bucket to downgrade all of its objects to Coldline.
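For example, here is a minimal sketch of adding such a rule with the google-cloud-storage Python client; the bucket name is a placeholder, and the matches_storage_class condition (which classes to downgrade from) is an assumption you should adjust to your bucket.
from google.cloud import storage

client = storage.Client()
bucket = client.get_bucket("your-bucket-name")  # placeholder bucket name

# Assumed condition: downgrade objects currently stored as STANDARD or NEARLINE.
# Lifecycle management applies the change asynchronously (typically within a day).
bucket.add_lifecycle_set_storage_class_rule(
    "COLDLINE", matches_storage_class=["STANDARD", "NEARLINE"]
)
bucket.patch()  # persist the updated lifecycle configuration on the bucket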

Related

gsutil - what are the storage class options for cp?

I'm using the gsutil CLI to copy files to Google Cloud Storage buckets, but I couldn't find the options for specifying a storage class. I read the documentation and the options aren't listed there. It just says to use -s <class>, but what are the valid values for <class>?
These are the storage classes that you can specify:
STANDARD
NEARLINE
COLDLINE
ARCHIVE
Additional note: you should only use cp to copy between buckets with the same location and storage class.
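For example (the bucket name and file are placeholders), uploading a file directly into the Nearline class would look something like:
gsutil cp -s nearline ./myfile.txt gs://my-bucket/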

Google cloud storage: Cannot reuse bucket name after deleting bucket

I deleted an existing bucket on google cloud storage using:
gsutil rm -r gs://www.<mydomain>.com
I then verify the bucket was deleted using:
gcloud storage ls gs://www.<mydomain>.com
And I get the expected response:
ERROR: (gcloud.storage.ls) gs://www.<mydomain>.com not found: 404.
I then verify the bucket was deleted using:
gsutil ls
And I get the expected empty response.
I then tried to recreate a new bucket with same name using:
gsutil mb -p <projectid> -c STANDARD -l US-EAST1 -b on gs://www.<mydomain>.com
I get the unexpected error below, indicating the bucket still exists:
www.<mydomain>.com
Creating gs://www.<mydomain>.com/...
ServiceException: 409 A Cloud Storage bucket named 'www.<mydomain>.com' already exists. Try another name. Bucket names must be globally unique across all Google Cloud projects, including those outside of your organization.
How can I reuse the bucket name for the bucket that I deleted?
I found the answer to my question here:
https://stackoverflow.com/a/44763841
Basically, I had deleted the project the bucket was in either before or after (I'm not sure which) deleting the bucket. For some reason this causes the bucket to still appear to exist even though it does not. The behavior does not seem quite right to me, but I believe that waiting for the billing period to complete and the project to be fully deleted will remove the phantom bucket. Unfortunately, this means I have to wait two weeks. I will confirm this in two weeks.

How to download multiple objects from IBM Cloud Object Storage?

I am trying to use IBM Cloud Object Storage to store images uploaded to my site by users. I have this functionality working just fine.
However, based on the documentation here (link) it appears as though only one object can be downloaded from a bucket at a time.
Is there any way a list of objects could all be downloaded from the bucket? Is there a different approach to requesting multiple objects from a COS bucket?
Via the REST API, no, you can only download a single object at a time. But most tools (like the AWS CLI, or Minio Client) allow downloading all objects that share a prefix (e.g. foo/bar and foo/bas). The IBM forks of the S3 libraries are also now integrated with Aspera and can transfer large directories all at once. What are you trying to do?
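If you want to do this programmatically rather than with a CLI, here is a minimal sketch using the plain boto3 S3 client against the COS S3-compatible endpoint; the endpoint URL, HMAC credentials, bucket name, prefix, and destination directory are placeholders you would replace with your own.
import os
import boto3

# Placeholders: supply your own COS endpoint, HMAC credentials, bucket, and prefix.
s3 = boto3.client(
    "s3",
    endpoint_url="https://s3.us-east.cloud-object-storage.appdomain.cloud",
    aws_access_key_id="YOUR_ACCESS_KEY_ID",
    aws_secret_access_key="YOUR_SECRET_ACCESS_KEY",
)

bucket = "yourBucketName"
prefix = "images/"      # download everything stored under this prefix
dest_dir = "downloads"  # local destination directory

os.makedirs(dest_dir, exist_ok=True)
paginator = s3.get_paginator("list_objects_v2")
for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
    for obj in page.get("Contents", []):
        key = obj["Key"]
        local_path = os.path.join(dest_dir, key.replace("/", "_"))
        s3.download_file(bucket, key, local_path)  # still one GET per object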
According to S3 spec (https://docs.aws.amazon.com/AmazonS3/latest/API/RESTObjectGET.html), you can only download one object at a time.
There are various tools that can help download multiple objects at a time from COS. I used the AWS CLI tool to download and upload objects from/to COS.
So install the aws-cli tool and configure it by supplying your access_key_id and secret_access_key.
Recursively copying S3 objects to a local directory: When passed with the parameter --recursive, the following cp command recursively copies all objects under a specified prefix and bucket to a specified directory.
C:\Users\Shashank>aws s3 cp s3://yourBucketName . --recursive
for example:
C:\Users\Shashank>aws --endpoint-url http://s3.us-east.cloud-object-storage.appdomain.cloud s3 cp s3://yourBucketName D:\s3\ --recursive
In my case the endpoint is in the us-east region, and I am copying the objects into the D:\s3 directory.
Recursively copying local files to S3: When passed with the parameter --recursive, the following cp command recursively copies all files under a specified directory to a specified bucket.
C:\Users\Shashank>aws s3 cp myDir s3://yourBucketName/ --recursive
for example:
C:\Users\Shashank>aws --endpoint-url http://s3.us-east.cloud-object-storage.appdomain.cloud s3 cp D:\s3 s3://yourBucketName/ --recursive
Here I am copying objects from the D:\s3 directory to COS.
For more reference, you can see the link here.
I hope it works for you.

Change storage class of (existing) objects in Google Cloud Storage

I recently learnt of the new storage tiers and reduced prices announced on the Google Cloud Storage platform/service.
So I wanted to change the default storage class for one of my buckets from Durable Reduced Availability to Coldline, as that is what is appropriate for the files that I'm archiving in that bucket.
I got this note though:
Changing the default storage class only affects objects you add to this bucket going forward. It does not change the storage class of objects that are already in your bucket.
Any advice/tips on how I can change class of all existing objects in the bucket (using Google Cloud Console or gsutil)?
The easiest way to synchronously move the objects to a different storage class in the same bucket is to use rewrite. For example, to do this with gsutil, you can run:
gsutil -m rewrite -s coldline gs://your-bucket/**
Note: make sure gsutil is up to date (version 4.22 and above support the -s flag with rewrite).
Alternatively, you can use the new SetStorageClass action of the Lifecycle Management feature to asynchronously (usually takes about 1 day) modify storage classes of objects in place (e.g. by using a CreatedBefore condition set to some time after you change the bucket's default storage class).
To change the storage class from NEARLINE to COLDLINE, create a JSON file with the following content:
{
  "lifecycle": {
    "rule": [
      {
        "action": {
          "type": "SetStorageClass",
          "storageClass": "COLDLINE"
        },
        "condition": {
          "matchesStorageClass": [
            "NEARLINE"
          ]
        }
      }
    ]
  }
}
Name it lifecycle.json or something, then run this in your shell:
$ gsutil lifecycle set lifecycle.json gs://my-cool-bucket
The changes may take up to 24 hours to go through. As far as I know, this change will not cost anything extra.
I did this:
gsutil -m rewrite -r -s <storage-class> gs://my-bucket-name/
(-r for recursive, because I want all objects in my bucket to be affected).
You can now use "Data Transfer" to change a storage class by moving your bucket's objects to a new bucket.
Access it from the left panel of Storage.
If you can't use gsutil, for example in a Google Cloud Functions environment (Cloud Functions server instances don't have gsutil installed; it works on your local machine because you have it installed and configured there), I suggest evaluating the update_storage_class() blob method in Python. This method is callable once you have retrieved a single blob (in other words, it refers to a specific object inside your bucket). Here is an example:
from google.cloud import storage

storage_client = storage.Client()
bucket_name = "my-bucket-name"  # replace with your bucket name

# Valid values accepted by update_storage_class()
all_classes = ["NEARLINE", "COLDLINE", "ARCHIVE", "STANDARD", "MULTI_REGIONAL", "REGIONAL"]
new_class = all_classes[1]  # e.g. COLDLINE

blobs = storage_client.list_blobs(bucket_name)
for blob in blobs:
    print(blob.name, blob.storage_class)
    blob.update_storage_class(new_class)  # rewrites the object with the new storage class
References:
Blobs / Objects documentation: https://googleapis.dev/python/storage/latest/blobs.html#google.cloud.storage.blob.Blob.update_storage_class
Storage classes: https://cloud.google.com/storage/docs/storage-classes

Deleting Cloud Storage Objects in Batch

What's the best way to delete many Cloud Storage objects? I have a bucket that contains ~500K objects and I'd like to delete them all.
Do I have to make one API request for each object I want to delete, or is there some sort of batch method? I'm currently using gsutil to delete them one at a time.
You need to make one API request for each object. The simplest way to accomplish this would be with gsutil:
$ gsutil -m rm gs://bucket_with_many_objects/**
The -m option enables multithreading, which will delete many objects in parallel.
Note that with gsutil the "*" wildcard will only match the top-level objects (up to the next "/" in the path name). If you want to delete all the objects you can either use:
$ gsutil -m rm -R gs://bucket_with_many_objects
or
$ gsutil -m rm gs://bucket_with_many_objects/**
Mike Schwartz, Google Cloud Storage Team
I had a similar problem with a bucket containing over 800,000 objects. The gsutil -m rm gs://bucket-name method does work, but it takes a long time, as it is essentially still deleting each object one at a time.
After I contacted the Cloud Storage team at Google, they pointed me in the direction of bucket lifecycle policies; although not instant, they allow you to mass-delete objects in a more efficient way.
I have written a blog post on Deleting Full Buckets using this method.
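As a rough sketch of that approach with the google-cloud-storage Python client (the bucket name is a placeholder, and the exact rule is an assumption modelled on the lifecycle examples above): a Delete action with an age of 0 days marks every object for removal, and lifecycle management carries the deletions out asynchronously.
from google.cloud import storage

client = storage.Client()
bucket = client.get_bucket("bucket_with_many_objects")  # placeholder bucket name

# Delete every object older than 0 days, i.e. everything currently in the bucket.
bucket.add_lifecycle_delete_rule(age=0)
bucket.patch()  # the rule takes effect asynchronously, typically within about a day
Remember to remove the rule afterwards (for example with bucket.clear_lifecycle_rules() followed by another patch()) if you intend to keep using the bucket.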