I want to ensure deleted files have a window of recovery. I'd like to use the primitives offered by google cloud storage such that I don't have to maintain the logic necessary to prevent files deleted in error from being irrecoverable.
I do not see a better way to achieve this than:
create a normal bucket for the files that are displayed to users
create a trash bucket for files pending permanent deletion, with a lifecycle rule that deletes objects N days after creation
upon a file deletion request, first copy the file to the trash bucket, then delete it from the normal bucket
What is the "idiomatic" way of implementing delayed permanent deletion in GCP Cloud Storage?
NOTE: I am trying to avoid cron jobs or additional database interaction
NOTE: this is not a soft delete as the file is expected to eventually be permanently deleted without any trace/storage associated with it
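For reference, this is roughly the flow I have in mind with the Python client; the bucket names and the 30-day window are placeholders, and the lifecycle rule would be set up once rather than per request:

    from google.cloud import storage

    client = storage.Client()
    live = client.bucket("user-files")              # bucket shown to users (placeholder)
    trash = client.get_bucket("user-files-trash")   # trash bucket (placeholder)

    # One-time setup: purge trash objects N days after they are created.
    trash.add_lifecycle_delete_rule(age=30)  # N = 30 here
    trash.patch()

    def move_to_trash(name: str) -> None:
        # Copy to the trash bucket first, then delete from the live bucket.
        blob = live.blob(name)
        live.copy_blob(blob, trash, name)
        blob.delete()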
You can keep all the files in the same bucket like this, assuming two things:
Each file is also referenced in a database that you're querying to build the UI.
You're able to write backend code to manage the bucket - people are not dealing with the files directly with a Cloud Storage SDK.
It uses Cloud Tasks to schedule the deletion:
User asks for file deletion.
File is marked as "deleted" in the database, not actually deleted from bucket.
Use Cloud Tasks to schedule the actual deletion 5 days from now.
On schedule, the task triggers a function, which deletes the file and its database record.
Your UI will have to query the database in order to tell live files apart from those marked as deleted (trashed, pending permanent deletion).
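A rough sketch of step 3 (scheduling the deletion) with the Cloud Tasks Python client; the project, location, queue, function URL, and payload below are all placeholders:

    import json
    from datetime import datetime, timedelta, timezone

    from google.cloud import tasks_v2
    from google.protobuf import timestamp_pb2

    client = tasks_v2.CloudTasksClient()
    parent = client.queue_path("my-project", "us-central1", "file-deletions")  # placeholders

    # Fire the task 5 days from now.
    schedule_time = timestamp_pb2.Timestamp()
    schedule_time.FromDatetime(datetime.now(timezone.utc) + timedelta(days=5))

    task = tasks_v2.Task(
        http_request=tasks_v2.HttpRequest(
            http_method=tasks_v2.HttpMethod.POST,
            url="https://example-region-my-project.cloudfunctions.net/purge-file",  # hypothetical function
            headers={"Content-Type": "application/json"},
            body=json.dumps({"bucket": "user-files", "name": "docs/123.pdf"}).encode(),
        ),
        schedule_time=schedule_time,
    )
    client.create_task(parent=parent, task=task)

The function behind that URL would then perform step 4: delete the object and its database record.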
Related
I'm storing backups in Cloud Storage. A desirable property of such a backup is to ensure the device being backed up cannot erase the backups, to protect against ransomware or similar threats. At the same time, it is desirable to allow the backup client to delete files so that old backups can be pruned. (Because the backups are encrypted, it isn't possible to use lifecycle management to do this.)
The solution that immediately comes to mind is to enable object versioning and use lifecycle rules to retain object versions (deleted files) for a certain amount of time. However, I cannot see a way to allow the backup client to delete the current version, but not historical versions. I thought it might be possible to do this with an IAM condition, but the conditional logic doesn't seem flexible enough to parse out the object version. Is there another way I've missed?
The only other solution that comes to mind is to create a second bucket, inaccessible to the backup client, and use a Cloud Function to replicate the first bucket. The downside of that approach is the duplicate storage cost.
To answer this:
However, I cannot see a way to allow the backup client to delete the current version, but not historical versions
When you delete a live object, object versioning will retain a noncurrent version of it. When deleting the noncurrent object version, you will have to specify the object name along with its generation number.
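For example, with the Python storage client (the bucket name, object name, and generation value are placeholders):

    from google.cloud import storage

    client = storage.Client()
    bucket = client.bucket("my-backups")  # placeholder

    # Deleting the live object only makes it noncurrent
    # (versioning must be enabled on the bucket).
    bucket.blob("db-dump.tar.gz").delete()

    # Listing with versions=True shows every generation that still exists.
    for version in client.list_blobs(bucket, prefix="db-dump.tar.gz", versions=True):
        print(version.name, version.generation)

    # Removing a noncurrent version requires its generation number.
    bucket.blob("db-dump.tar.gz", generation=1712345678901234).delete()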
Just to add, you may want to consider using a transfer job to replicate your data on a separate bucket.
Either way, both approaches (object versioning or replicating buckets) will incur additional storage costs.
I am writing an online text editor. I want to allow users to add inline images and video to the document. I am struggling to implement this in a reliable way.
Current infrastructure:
Database (postgres) of documents (text, title, author, list of media objects referencing S3)
Object store (S3) where the images/video/files are stored
The current flow:
User creates a new document
User makes changes, but doesn't save it. These changes are stored in localStorage so they are not lost on refresh.
The user attaches an image
The image displays a loading indicator as it is uploaded to S3 (or equivalent)
The user saves the document, and the data is saved to a database. The objects are not saved, only S3 URLs to them.
Problem
If the user deletes the document before saving, or if saving fails, there will be orphan files in S3 that are not referenced by any documents.
A "delete document" action must now delete something from Postgres and S3. Since you cannot do a transaction across two completely different services, one can imagine a situation where the postgres delete succeeds, but the S3 delete fails, creating more orphan objects.
Attempts at solutions
I tried storing the media in localStorage and committing them all when the document is saved. This would solve the issue, but localStorage is limited to 5-10mb, which is too small.
A reaper daemon that queries the S3 references in the database and cross-references them with the objects stored in S3 to find orphan objects, which it would automatically delete.
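For illustration, roughly what I imagine that daemon looking like, assuming boto3 and psycopg2 (the bucket, table, and column names are made up):

    import boto3
    import psycopg2

    s3 = boto3.client("s3")
    conn = psycopg2.connect("dbname=editor")
    BUCKET = "editor-media"  # placeholder

    def find_orphans():
        # Keys that some document still references.
        with conn.cursor() as cur:
            cur.execute("SELECT s3_key FROM document_media")
            referenced = {row[0] for row in cur.fetchall()}

        # Everything in the bucket that no document references.
        orphans = []
        paginator = s3.get_paginator("list_objects_v2")
        for page in paginator.paginate(Bucket=BUCKET):
            for obj in page.get("Contents", []):
                if obj["Key"] not in referenced:
                    orphans.append(obj["Key"])
        return orphans

    # In practice it would skip recently uploaded objects, so in-flight
    # uploads for documents that haven't been saved yet aren't reaped.
    for key in find_orphans():
        s3.delete_object(Bucket=BUCKET, Key=key)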
The reaper daemon would work, but it feels like a hack. I really don't want to manage an entirely new service just to store some files. Is there a better way to do this? What is the industry standard?
If it matters, I'm using React+Typescript and the text editor is built upon DraftJS.
I have followed the directions provided by Google to delete some old buckets in my account ... it is a very straightforward process as listed, however after confirming the deletion, "Preparing to Delete" pops up at the bottom left, but the system never deletes the files or the bucket?
I have posted this several times but no one has suggested a solution or a reason why the process does not work.
If you have a lot of files in your bucket, it might simply take a long time to perform the operation.
As a workaround to the UI being unclear, you can use gsutil to remove all files in a bucket, followed by the bucket itself, using gsutil rm -r gs://bucket.
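If you prefer scripting it with the Python client instead, something like this should be equivalent (the bucket name is a placeholder); note it deletes objects one at a time, so it can still take a while on a large bucket:

    from google.cloud import storage

    client = storage.Client()
    bucket = client.bucket("my-old-bucket")  # placeholder

    # Delete every object, then the now-empty bucket itself.
    for blob in client.list_blobs(bucket):
        blob.delete()
    bucket.delete()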
In my experience, when the bucket has lots of objects, using the web interface or gsutil alone is not the best way.
What I did was add a lifecycle rule to have Google delete all the objects in the bucket.
Then, coming back after a day, the bucket can easily be deleted.
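Setting up that rule with the Python client might look like this (the bucket name is a placeholder):

    from google.cloud import storage

    client = storage.Client()
    bucket = client.get_bucket("my-old-bucket")  # placeholder

    # "Delete anything older than 0 days": Cloud Storage purges the
    # objects on its own, typically within a day or so.
    bucket.add_lifecycle_delete_rule(age=0)
    bucket.patch()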
In Google Cloud Storage, I'm trying to create a bucket with the name "www.coladmmo.com" but it says "This bucket name is already in use. Bucket names must be globally unique. Try another name." I previously created a bucket with this name successfully, and then I deleted the entire project. But now it says the name is already in use.
The bucket probably still exists in the project you deleted. See the Create, shut down, and restore projects documentation page, specifically:
After a 30-day waiting period, the project and associated data are permanently deleted from the console.
Note that after the 30-day waiting period ends, the time it takes to completely delete a project may vary. For example, if a project has billing set up, it might not be completely deleted until the current billing cycle ends, you receive the next bill, and your account is successfully charged. Additionally, the number and types of services in use may also affect when the system permanently deletes a project.
I created a bucket in a project. I subsequently deleted that project, so its bucket should have been deleted along with it.
Now I'm attempting to make a bucket with the same name in another project, but I get the error:
"This bucket name is already in use. Bucket names must be globally unique. Try another name."
It's been over 12 hours. Documentation suggests that bucket IDs are supposed to get released if they are no longer in use. Will that bucket ID ever become available again?
From the support documentation:
Shutting down a project stops all billing and traffic serving, shuts down any Google Cloud Platform App Engine applications, and terminates all Compute Engine instances. All project data associated with Google Cloud and Google APIs services becomes inaccessible.
After a 7-day waiting period, the project and associated data are permanently deleted from the console.
Note that after the 7-day waiting period ends, the time it takes to completely delete a project may vary. For example, if a project has billing set up, it might not be completely deleted until the current billing cycle ends, you receive the next bill, and your account is successfully charged. Additionally, the number and types of services in use may also affect when the system permanently deletes a project.