Google Cloud Storage and being charged for files not found - google-cloud-storage

Does anyone know if you are charged for a file request in Google Cloud Storage if the file doesn't exist? In other words, does someone accessing a non-existent file in your bucket count against your requests? Or is that only for files that exist?

Customers are not charged for requests that result in a 400-level or 500-level HTTP response.
The only exception is for 404 responses returned for buckets that have Website Configuration enabled with a custom NotFoundPage object.

Related

Is Firebase Cloud Storage security rule checked on upload start or upload complete?

Suppose a file takes a long time to upload, and the relevant canUpload value in Firestore changes to false during this period. Would the upload succeed?
If I have rules like this:
allow create: if firestore.get(/databases/(default)/documents/user/$(request.auth.uid)).data.canUpload;
Firebase security rules for Cloud Storage uploads are triggered after the payload has been received on the server (as it has access to metadata about the payload), but before the data is actually committed to storage itself.
It is indeed possible for the rules to change between starting the upload and completing it, in which case only the new, updated rules will be evaluated.
I believe the upload would continue, because the rules are only checked when the operation is initiated, not continuously throughout the upload.

Best approach to upload the file via REST api from API gateway

Use case: A customer can upload a file through our public REST API to our S3 bucket, and we then process the file using downstream services.
After doing some research I found 3 ways to do it:
Uploading using OCTET-STREAM file type
Upload the file using form-data request
Upload the file using the pre-signed URL
In the first 2 cases the user sends the binary file to our API and we upload it to S3 after validating it.
In the 3rd method the user has to call 3 APIs. The first call returns an S3 pre-signed URL, which grants the user access to upload the file to S3. In the second call the user uploads the file to that pre-signed URL. After the upload completes, the user sends a request to process the file.
Are there any security issues with method 3? The user could misuse the pre-signed URL to upload a malicious file.
Which of these methods is best according to industry practice?
Details of each approach:
1. Uploading using OCTET-STREAM file type
Pros:
This method is good for uploading file types that can be opened in an application, such as xlsx.
1 API hit. Direct file upload
Cons:
This option is not suitable for uploading multiple files. If we need to support multiple file uploads in the future, this would have to change to multipart/form-data (option 2).
No metadata can be sent as a body parameter. Metadata can only be sent in headers.
2. Upload the file using form-data request
The user uploads the file with the API request by attaching it as a multipart form.
Pros
We can send multiple files at the same time.
We can send extra parameters in the body.
3. Upload the file using the pre-signed URL
Cons
The customer has to call 3 APIs to upload the file (2 calls to upload, then 1 more call to process the file).
If you want them to load data into a bucket, the best way will almost always be the pre-signed URL. This gives you complete control over how you hand out access to the bucket, but also allows them to directly upload into the bucket when they have the access.
In the first two approaches the user can send malicious data to your API, potentially DoSing the server or incurring costs for you to handle the payloads, since you have no control over access (it is public).
In the third case they can request a URL from you, but that is all. Other than spamming you with requests for URLs, they can't access the bucket or do anything else unless you grant them a URL. This seems much better than having them spam your upload endpoint with large junk files that you have to process before deciding you didn't want them anyway.
Finally, the pre-signed URL is the pattern AWS expects you to use, so AWS has a lot of support for managing the access, roles, logging, monitoring, etc. that you would want to put around this service. When you stand up the upload API yourself, all of this is up to you to manage.
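To make the three-call flow concrete, here is a minimal server-side sketch. It is not an AWS API: `UploadService`, `request_upload`, `request_processing`, the `uploads/` key prefix, and the injected `presign` and `object_exists` callables are all hypothetical names chosen for illustration. In a real service `presign` would likely wrap boto3's `generate_presigned_url("put_object", ...)`. The point it tries to show is that the server chooses the object key and only processes keys it issued itself.

```python
import secrets
from typing import Callable, Set


class UploadService:
    """Hypothetical helper for the 'get URL, upload, process' flow."""

    def __init__(self, presign: Callable[[str, int], str], ttl_seconds: int = 300):
        # presign(key, ttl) is an injected signer (assumption). With boto3 it
        # could wrap: s3.generate_presigned_url("put_object",
        #     Params={"Bucket": bucket, "Key": key}, ExpiresIn=ttl)
        self._presign = presign
        self._ttl = ttl_seconds
        self._issued: Set[str] = set()

    def request_upload(self, customer_id: str) -> dict:
        """Call 1: hand out a short-lived URL for a server-chosen key."""
        key = f"uploads/{customer_id}/{secrets.token_hex(16)}"
        self._issued.add(key)
        return {"key": key, "url": self._presign(key, self._ttl)}

    def request_processing(self, key: str, object_exists: Callable[[str], bool]) -> bool:
        """Call 3: accept only keys we issued and that were actually uploaded."""
        if key not in self._issued:
            return False  # never issued by this service
        if not object_exists(key):
            return False  # the client has not finished call 2 yet
        self._issued.discard(key)  # each ticket can be processed once
        return True  # validation and downstream processing would start here
```

File validation (size, type, malware scanning) would still happen in the processing step, before any downstream service touches the object; the pre-signed URL only controls who may write and where.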

Google pubsub to Google cloud storage

Is it possible for a bucket in Cloud Storage to receive data/messages from Pub/Sub? If yes, how?
Currently I am publishing messages to Pub/Sub,
and I want to use the pull delivery type (for that I have to provide an endpoint URL for the bucket, which I couldn't find anywhere).
I found this somewhere in their docs,
but it didn't work.
No, sorry. GCS only accepts uploads of complete files via HTTP. You could build a small app that takes incoming Pub/Sub messages and uploads them as separate GCS objects, or batches them into groups of messages and uploads those to GCS, but there's no such built-in functionality.
Can I ask you more about your use case? What are you trying to do?
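The "small app" idea could be sketched roughly as below. This is only an illustration of the batching logic: `MessageBatcher`, the `pubsub-batches/` object prefix, and the injected `upload` callable are hypothetical. In a real app `upload` might wrap `bucket.blob(name).upload_from_string(data)` from the google-cloud-storage client, and `add` would be called from the Pub/Sub subscriber callback. The sketch assumes each message payload is a single line (for example, compact JSON).

```python
from typing import Callable, List


class MessageBatcher:
    """Groups message payloads and uploads each group as one object."""

    def __init__(self, upload: Callable[[str, bytes], None], max_messages: int = 100):
        # upload(object_name, data) is an injected uploader (assumption).
        self._upload = upload
        self._max = max_messages
        self._buffer: List[bytes] = []
        self._batch_no = 0

    def add(self, payload: bytes) -> None:
        """Buffer one message; flush automatically when the batch is full."""
        self._buffer.append(payload)
        if len(self._buffer) >= self._max:
            self.flush()

    def flush(self) -> None:
        """Upload buffered messages as one newline-delimited object."""
        if not self._buffer:
            return
        body = b"\n".join(self._buffer) + b"\n"
        name = f"pubsub-batches/batch-{self._batch_no:06d}.ndjson"
        self._upload(name, body)
        self._batch_no += 1
        self._buffer = []
```

In practice you would probably also flush on a timer and acknowledge the Pub/Sub messages only after the upload succeeds, so that a crash does not lose buffered data.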

How to upload Files to Cloud Storage?

I have a Google Cloud Endpoints API which uses Cloud SQL to store data. I want to provide a file upload for clients. The files should be stored in Cloud Storage, but I also want to store the file metadata and the file storage URL in Cloud SQL.
What's the best way to do this?
Can I upload files through cloud endpoints or do I need an extra upload Servlet?
How can I update my database entities which need a reference to the uploaded files?
Any examples on how to combine those 3 technologies?
Assuming your clients are not added to your Google Cloud project (which is typically the case), your users don't have write access to your GCS bucket. You can either submit files to your application and move them to GCS from there (not recommended, as it consumes more network and CPU), or, better, submit them to GCS directly.
To let the client write to your GCS bucket directly, you will need to either:
1. Put your access key on the client for write access (not recommended; only viable if the client is used by a small group of trusted people).
2. Generate a time-bound token and give it to the client as a signed URL for direct upload.
Endpoints APIs themselves cannot do this, but you can generate the signed GCS URL on the server and fetch it from the client through Endpoints. Then set it as the form action (on a web client; other clients have similar mechanisms for signed uploads) and submit the form to upload the file.
<form action="SIGNED_URL_FROM_ENDPOINTS" method="post" enctype="multipart/form-data">
I don't see any open-source code out there doing exactly this, but the closest is this project, which generates the signed URL with a time-out (the only unintuitive part).
The best way to update the metadata in your database is to watch the GCS bucket using 'Object Change Notifications'. Another way is to send the metadata to your server from the client itself, which can be an Endpoints call. You can also mix both: the metadata goes to the server through Endpoints even before the file is uploaded, and the notification updates the record with confirmation that the file is available to serve.
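The mixed approach (metadata first, confirmation on notification) could look roughly like this. It is a sketch only: `sqlite3` stands in for Cloud SQL so the example is self-contained, and the `files` table, its columns, and the function names are hypothetical. The real notification handler would receive the object name from the bucket's change notification.

```python
import sqlite3


def open_db() -> sqlite3.Connection:
    """In-memory stand-in for the Cloud SQL database (assumption)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        """CREATE TABLE files (
               object_name  TEXT PRIMARY KEY,
               content_type TEXT NOT NULL,
               storage_url  TEXT NOT NULL,
               status       TEXT NOT NULL DEFAULT 'pending'
           )"""
    )
    return db


def register_upload(db: sqlite3.Connection, bucket: str, object_name: str,
                    content_type: str) -> str:
    """Endpoints call made before the upload: store metadata as 'pending'."""
    url = f"gs://{bucket}/{object_name}"
    db.execute(
        "INSERT INTO files (object_name, content_type, storage_url) VALUES (?, ?, ?)",
        (object_name, content_type, url),
    )
    db.commit()
    return url


def on_object_notification(db: sqlite3.Connection, object_name: str) -> bool:
    """Notification handler: mark a pending record as available to serve."""
    cur = db.execute(
        "UPDATE files SET status = 'available' "
        "WHERE object_name = ? AND status = 'pending'",
        (object_name,),
    )
    db.commit()
    return cur.rowcount == 1  # False for unknown or already confirmed objects
```

Records that stay 'pending' for a long time correspond to uploads that were announced but never completed, and can be cleaned up periodically.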

Best practices to redirect a HTTP POST to my REST API towards my S3 bucket?

Say we want a REST API to support file uploads, and we want uploads to be done directly on S3.
According to this solution, "Amazon S3 direct file upload from client browser - private key disclosure", we have to create a POLICY and SIGNATURE for the user to be allowed to upload to S3.
However, we want a single entry point for the API, including uploads.
Can we:
1. in our API, catch POST https://www.example.org/users/1234/objects
2. calculate POLICY and SIGNATURE to allow direct upload to S3
3. return a 307 "Temporary Redirect" to https://s3-bucket.s3.amazonaws.com
How to pass POLICY and SIGNATURE in the redirect?
What is best practice here?
You don't redirect. Instead, your API should return the policy and signature in the response (say, in JSON).
The browser can then use these values to upload directly to S3, as described in the linked document. This is a two-step process.
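As a rough illustration of what the API could return, the sketch below builds a POST policy and signs it following my understanding of AWS Signature Version 4 for browser-based uploads. The function name `build_post_fields`, its parameters, the size limit, and the virtual-hosted URL format are assumptions for the example. In production you would more likely call boto3's `generate_presigned_post`, which produces an equivalent set of fields.

```python
import base64
import datetime
import hashlib
import hmac
import json


def _hmac(key: bytes, msg: str) -> bytes:
    return hmac.new(key, msg.encode("utf-8"), hashlib.sha256).digest()


def build_post_fields(access_key: str, secret_key: str, region: str, bucket: str,
                      key: str, now: datetime.datetime, ttl_seconds: int = 300,
                      max_bytes: int = 10 * 1024 * 1024) -> dict:
    """Return the URL and form fields a browser needs to POST to S3.

    `now` is assumed to be a UTC datetime.
    """
    date = now.strftime("%Y%m%d")
    amz_date = now.strftime("%Y%m%dT%H%M%SZ")
    credential = f"{access_key}/{date}/{region}/s3/aws4_request"
    expires = now + datetime.timedelta(seconds=ttl_seconds)
    policy = {
        "expiration": expires.strftime("%Y-%m-%dT%H:%M:%SZ"),
        "conditions": [
            {"bucket": bucket},
            {"key": key},  # the client may only write this exact key
            ["content-length-range", 0, max_bytes],
            {"x-amz-algorithm": "AWS4-HMAC-SHA256"},
            {"x-amz-credential": credential},
            {"x-amz-date": amz_date},
        ],
    }
    policy_b64 = base64.b64encode(json.dumps(policy).encode("utf-8")).decode("ascii")

    # Derive the SigV4 signing key, then sign the base64-encoded policy.
    k = _hmac(("AWS4" + secret_key).encode("utf-8"), date)
    for part in (region, "s3", "aws4_request"):
        k = _hmac(k, part)
    signature = hmac.new(k, policy_b64.encode("ascii"), hashlib.sha256).hexdigest()

    return {
        "url": f"https://{bucket}.s3.{region}.amazonaws.com/",  # assumed URL style
        "fields": {
            "key": key,
            "policy": policy_b64,
            "x-amz-algorithm": "AWS4-HMAC-SHA256",
            "x-amz-credential": credential,
            "x-amz-date": amz_date,
            "x-amz-signature": signature,
        },
    }
```

The API endpoint would serialize this dictionary as its JSON response; the browser then submits a multipart/form-data POST to `url` with every entry of `fields` followed by the file itself. The secret key never leaves the server.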