Unable to delete or update an event for a folder in an S3 bucket

I am unable to delete or update an event for a folder in an S3 bucket. The bucket has many events configured; this particular bucket has six events and I want to delete just one of them. Can anyone help me solve this issue?

Related

Delay permanent deletion from google cloud storage bucket

I want to ensure deleted files have a window of recovery. I'd like to use the primitives offered by Google Cloud Storage so that I don't have to maintain the logic necessary to prevent files deleted in error from being irrecoverable.
I do not see a better way to achieve this than the following:
create a normal bucket for files that are displayed to the users
create a trash bucket for files pending permanent deletion, with a lifecycle rule that deletes objects N days after creation
upon a file deletion request, first copy the file from the normal bucket to the trash bucket, then delete it from the normal bucket (roughly as sketched below)
What is the "idiomatic" way of implementing delayed permanent deletion in GCP Cloud Storage?
NOTE: I am trying to avoid cron jobs or additional database interaction.
NOTE: this is not a soft delete, as the file is expected to eventually be permanently deleted without any trace/storage associated with it.
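Here is a minimal sketch of that copy-then-delete flow, assuming the google-cloud-storage Python client and hypothetical bucket names (the N-day lifecycle rule on the trash bucket would be configured separately):

from google.cloud import storage

def move_to_trash(client: storage.Client, file_name: str) -> None:
    # Copy the object to the trash bucket, then remove it from the normal bucket.
    # Bucket names are assumptions; the trash bucket is expected to have a
    # lifecycle rule that deletes objects N days after creation.
    normal_bucket = client.bucket("my-normal-bucket")
    trash_bucket = client.bucket("my-trash-bucket")

    blob = normal_bucket.blob(file_name)
    # Copy into the trash bucket first so the object is never lost mid-operation.
    normal_bucket.copy_blob(blob, trash_bucket, file_name)
    # Only then remove it from the bucket that users see.
    blob.delete()

client = storage.Client()
move_to_trash(client, "reports/summary.pdf")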
You can keep all the files in the same bucket, assuming two things:
Each file is also referenced in a database that you're querying to build the UI.
You're able to write backend code to manage the bucket - people are not dealing with files directly with a Cloud Storage SDK.
The scheme uses Cloud Tasks to schedule the deletion:
The user asks for file deletion.
The file is marked as "deleted" in the database, but not actually deleted from the bucket.
Cloud Tasks is used to schedule the actual deletion 5 days from now.
On schedule, the task triggers a function, which deletes the file and its database record.
Your UI will have to query the database in order to differentiate between live and trashed files.
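A rough sketch of the scheduling step, assuming the google-cloud-tasks Python client and a hypothetical HTTP-triggered function that performs the actual deletion (project, region, queue and URL are made-up names):

import datetime
import json

from google.cloud import tasks_v2
from google.protobuf import timestamp_pb2

def schedule_deletion(file_name: str) -> None:
    # Enqueue a Cloud Task that calls the purge function 5 days from now.
    client = tasks_v2.CloudTasksClient()
    parent = client.queue_path("my-project", "us-central1", "delete-queue")  # assumed names

    run_at = datetime.datetime.now(datetime.timezone.utc) + datetime.timedelta(days=5)
    schedule_time = timestamp_pb2.Timestamp()
    schedule_time.FromDatetime(run_at)

    task = {
        "http_request": {
            "http_method": tasks_v2.HttpMethod.POST,
            "url": "https://us-central1-my-project.cloudfunctions.net/purge-file",  # hypothetical function URL
            "headers": {"Content-Type": "application/json"},
            "body": json.dumps({"file": file_name}).encode(),
        },
        "schedule_time": schedule_time,
    }
    client.create_task(request={"parent": parent, "task": task})

The function behind that URL would then delete the object from the bucket and remove its database record.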

Dataprep jobs running for over 72 hours since 6/20 update. Job status reads complete but not published

I have been running daily Dataprep jobs and since the update last week, approximately half of my jobs are now hanging and not being published. They appear as jobs in progress although when I go to the actual job page, the job appears to be complete. There is no publishing action and the publishing target does not appear updated. Some jobs have now been going on for over 72 hours since Friday.
I've seen traces of other users having the same issue online but have not seen any sort of response or recognition from either Google or Trifacta.
I have tried restarting the jobs with no success, and it appears that there is no way to cancel the hanging jobs because, from Google's perspective, the jobs themselves were successful, just not published. This problem appears both on my jobs that publish to BigQuery and on jobs that publish to Google Cloud Storage, for both manual and scheduled jobs.
This may only impact jobs that were pushed during the upgrade and should be rather cosmetic in nature. Please note that you won't be charged.
Did the exact same job work before with no changes? If so, please contact support and give them the IDs of the previously successful and the now-failing jobs as references so this can be investigated further.
Cheers,
Sebastian
I have come across the same problem! The output of the jobs is placed in a temp folder in Cloud Storage, mostly consisting of multiple files without headers.
It is also creating huge issues here. Instead of the normal output file, it places multiple parts of it in a temp folder without headers. This makes new scheduled jobs that rely on these outputs useless, because they do not load the new output.
If you manually merge the files in the temp folder, add headers (in the case of CSV), and place them in the correct folder, the output can be recreated manually (a sketch follows below).
Also no response from Google yet.
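For what it's worth, a minimal sketch of that manual merge for CSV output, assuming the google-cloud-storage Python client and hypothetical bucket, folder and header values:

from google.cloud import storage

# Hypothetical locations and header; adjust to your job's temp folder and schema.
BUCKET = "my-dataprep-bucket"
TEMP_PREFIX = "temp/job-output/"
FINAL_NAME = "output/daily_export.csv"
HEADER = "id,name,value\n"

client = storage.Client()
bucket = client.bucket(BUCKET)

# Download every headerless part file from the temp folder and concatenate them.
parts = sorted(bucket.list_blobs(prefix=TEMP_PREFIX), key=lambda b: b.name)
merged = HEADER + "".join(blob.download_as_text() for blob in parts)

# Upload the merged CSV, with header, to the folder the scheduled jobs expect.
bucket.blob(FINAL_NAME).upload_from_string(merged, content_type="text/csv")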
We're seeing the exact same thing, down to the destinations and job types... it's almost like Dataprep is losing track of the underlying Dataflow job and not finishing on its completion (that's why you see the temp files: that is the raw output, and Dataprep normally handles the formatting of the final output file separately).
Someone was kind enough to already post this on the issue tracker, so please go star it and add any additional details that may be helpful to the Dataprep team:
https://issuetracker.google.com/issues/135865374

Disappearing bucket, how to investigate

I am working on a project for a client and a couple of weeks ago most of the content "disappeared".
Images and videos are routed through FileStack (a file processing service) but actually stored on Google Cloud Storage in one bucket.
On the day in question everything was working, and then everything stopped working. When we investigated it turned out that the bucket FileStack was pointing to was non-existent, so we created a new bucket with the same name and everything magically worked itself out.
Now my question is, where did all the files from the disappeared bucket go? Is it possible to get them back? Is it possible to figure out what happened?
I have extensively reviewed the audit log in the Activity tab and it shows zero activity for the bucket in question. Is there anywhere else we can investigate?
Can you please send an email to gs-team@google.com, noting the bucket name and an example object name from that bucket, along with the last time you were successfully able to access that bucket/object? Doing it that way will avoid exposing these names on the public forum. Please mention my name in the message so I will get it and can investigate.
Thanks,
Mike Schwartz
GCS Team
When an object is deleted, it's deleted from the system and there isn't any option to recover it [1]. You can prevent this behavior by using Object Versioning [2]. To get a better overview of the activity in Cloud Storage, you can also enable Data Access audit logs [3].
As for the reason why the objects disappeared, a first thing to check is whether an Object Lifecycle rule is enabled on the bucket [4].
[1] https://cloud.google.com/storage/docs/deleting-objects
[2] https://cloud.google.com/storage/docs/object-versioning
[3] https://cloud.google.com/storage/docs/audit-logs
[4] https://cloud.google.com/storage/docs/lifecycle
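As a sketch, enabling Object Versioning with the google-cloud-storage Python client could look like this (the bucket name is hypothetical):

from google.cloud import storage

client = storage.Client()
bucket = client.get_bucket("my-project-assets")  # hypothetical bucket name

# Turn on Object Versioning so deleted or overwritten objects are kept as
# noncurrent versions instead of being removed immediately.
bucket.versioning_enabled = True
bucket.patch()

print("Versioning enabled:", bucket.versioning_enabled)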

Run a Google Cloud Function for each file in a bucket

I have a Google Cloud Function triggered by a Google Cloud Storage object.finalize event. When I deploy a new version of this function, I would like to run it for all the existing files in the bucket, which have already been processed by the previous version of the function. Processing all the existing files is a long-running task, so I don't think a single Cloud Function invocation that processes them all in a row is an option.
The best option I can see for now is to make a Google Cloud Function I can trigger via HTTP that lists all the files in the bucket and publishes one event per file via Google Pub/Sub, and then to process each of these events with a slightly modified version of my initial Cloud Function that accepts a Pub/Sub event in place of the object.finalize storage event.
I think this can work, but I was wondering if there is an easier way to perform this operation.
If the operation you're trying to perform may take longer than the maximum time a Cloud Function is allowed to run, you will need to split it into multiple steps. Your approach of using a Pub/Sub trigger for each individual file sounds like a valid way to do that to me.
One option might be to write a small program that lists all of the objects in a bucket and, for each object, posts a message to Cloud Pub/Sub that triggers your function in the same way a GCS change would.
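A minimal sketch of that fan-out, assuming the google-cloud-storage and google-cloud-pubsub Python clients and hypothetical project, bucket and topic names (the topic is the one the modified function subscribes to):

from google.cloud import pubsub_v1, storage

PROJECT = "my-project"        # assumed
BUCKET = "my-upload-bucket"   # assumed
TOPIC = "reprocess-files"     # assumed

storage_client = storage.Client()
publisher = pubsub_v1.PublisherClient()
topic_path = publisher.topic_path(PROJECT, TOPIC)

# Publish one message per existing object; each message carries the bucket and
# object name as attributes, mimicking what the function got from object.finalize.
for blob in storage_client.list_blobs(BUCKET):
    future = publisher.publish(topic_path, b"", bucket=BUCKET, name=blob.name)
    future.result()  # block until the message is accepted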

Deleting Buckets in Google Cloud Storage

I have followed the directions provided by Google to delete some old buckets in my account. It is a very straightforward process as listed; however, after confirming the deletion, "Preparing to delete" pops up on the bottom left, but the system never deletes the files or the bucket.
I have posted this several times, but no one has suggested a solution or a reason why the process does not work.
If you have a lot of files in your bucket, it might simply take a long time to perform the operation.
As a workaround for the UI being unclear, you can use gsutil to remove all files in a bucket, followed by the bucket itself, with gsutil rm -r gs://bucket.
In my experience, when the bucket has lots of objects, using the web interface or gsutil alone is not the best way.
What I did was add a lifecycle rule to have Google delete all the objects in the bucket.
Then, coming back after a day, the bucket can easily be deleted.
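A minimal sketch of that lifecycle workaround with the google-cloud-storage Python client, using a hypothetical bucket name (GCS applies lifecycle rules asynchronously, roughly once a day, which is why the bucket becomes deletable the next day):

from google.cloud import storage

client = storage.Client()
bucket = client.get_bucket("my-old-bucket")  # hypothetical bucket name

# Add a lifecycle rule that deletes every object older than 0 days, i.e. everything.
bucket.add_lifecycle_delete_rule(age=0)
bucket.patch()

# A day or so later, once the rule has emptied the bucket:
# bucket.delete()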