I have a Google Cloud Endpoints API which uses Cloud SQL to store data. I want to offer file upload to clients; the files should be stored in Cloud Storage, but I also want to store the file metadata and the file's storage URL in Cloud SQL.
What's the best way to do this?
Can I upload files through cloud endpoints or do I need an extra upload Servlet?
How can I update my database entities, which need a reference to the uploaded files?
Any examples on how to combine those 3 technologies?
Assuming your clients are not members of your Google Cloud project (which is typically the case), your users don't have write access to your GCS bucket. You can either submit files to your application and move them to GCS from there (not recommended, as it consumes more network and CPU), or, better, submit them to GCS directly.
To let the client write to your GCS bucket directly, you will need to either:
1. put your access key on the client for write access (not recommended; only viable if the client is used by a small group of trusted people).
2. generate a time-bound token and give it to the client as a signed URL to upload to directly.
Endpoints APIs themselves cannot receive the upload, but you can generate the signed GCS URL on the server and fetch it from the client through an Endpoints call. Then set it as the form action (on a web client; other clients have similar ways to do a signed upload) and submit the form to upload the file.
<form action="SIGNED_URL_FROM_ENDPOINTS" method="post" enctype="multipart/form-data">
I don't know of any open-source code that does exactly this, but the closest is this project, which generates the signed URL with a time-out (the only unintuitive part).
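To make the "time-bound token" idea concrete, here is a minimal sketch of how an expiring signed URL works in principle. It is not Google's signing scheme: real GCS signed URLs are signed with a service-account key and are best produced with a GCS client library. The secret, the host name, and the function names are all assumptions made for illustration.

```python
import hashlib
import hmac
import time
from urllib.parse import parse_qs, urlencode, urlparse

# Hypothetical server-side secret. Real GCS signed URLs use a
# service-account key and Google's own signing process instead.
SECRET = b"demo-secret"


def _signature(object_name, expires):
    """HMAC over the method, the object name and the expiry timestamp."""
    message = f"PUT\n{object_name}\n{expires}".encode()
    return hmac.new(SECRET, message, hashlib.sha256).hexdigest()


def sign_upload_url(base_url, object_name, ttl_seconds, now=None):
    """Return a URL carrying an expiry time and a signature over it."""
    issued_at = int(now if now is not None else time.time())
    expires = issued_at + ttl_seconds
    query = urlencode({"expires": expires,
                       "signature": _signature(object_name, expires)})
    return f"{base_url}/{object_name}?{query}"


def verify_url(url, now=None):
    """Accept the URL only if it is unexpired and the signature matches."""
    parsed = urlparse(url)
    params = parse_qs(parsed.query)
    object_name = parsed.path.lstrip("/")
    expires = int(params.get("expires", ["0"])[0])
    signature = params.get("signature", [""])[0]
    current = int(now if now is not None else time.time())
    if current > expires:
        return False  # the time window has closed
    return hmac.compare_digest(_signature(object_name, expires), signature)
```

Because the expiry is part of the signed message, a client cannot extend the lifetime or point the URL at a different object without invalidating the signature.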
The best way to update the metadata in your database is to watch the GCS bucket using 'Object Change Notifications'. Another way is to send the metadata to your server from the client itself, which can be an Endpoints call. You can also mix the two: the metadata goes to the server through Endpoints before the file is uploaded, and the notification then updates the record to confirm that the file is available to serve.
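The mixed approach can be sketched as a two-step record: a row is created as pending when the client announces the upload, and is flipped to available when the bucket notification arrives. The sketch below uses an in-memory SQLite database as a stand-in for Cloud SQL; the table layout, column names, and function names are assumptions, not part of any Google API.

```python
import sqlite3


def open_db():
    """Create an in-memory database standing in for Cloud SQL."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE files ("
        " object_name TEXT PRIMARY KEY,"
        " owner TEXT NOT NULL,"
        " content_type TEXT NOT NULL,"
        " status TEXT NOT NULL DEFAULT 'pending')"
    )
    return db


def register_upload(db, object_name, owner, content_type):
    """Step 1: called through the API before the client uploads the file."""
    db.execute(
        "INSERT INTO files (object_name, owner, content_type) VALUES (?, ?, ?)",
        (object_name, owner, content_type),
    )
    db.commit()


def confirm_upload(db, object_name):
    """Step 2: called when the bucket notification reports the object.

    Returns True if a pending row was found and marked available.
    """
    cursor = db.execute(
        "UPDATE files SET status = 'available'"
        " WHERE object_name = ? AND status = 'pending'",
        (object_name,),
    )
    db.commit()
    return cursor.rowcount == 1
```

A notification for an object that was never registered updates nothing and returns False, which gives the server a hook for flagging unexpected uploads.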
I want to build a tiny story system where users can upload videos.
I'm using Firebase and the frontend will be in flutter.
I'm struggling a bit to design the flow from the frontend to my Go backend. What's the simplest way to achieve this?
From what I understand I could use different flows:
Frontend asks the Go backend for a signed upload URL
Backend generates a GCP Storage signed URL
Frontend uploads the video
Frontend sends the link to the backend
Backend transcodes the video
Backend stores the link in Firestore
Or
Frontend uses Firebase Storage directly
Frontend sends the link to the backend?
What are the benefits of using a signed upload URL vs. using Firebase Storage directly?
Thanks in advance
What are the benefits of using a signed upload URL vs. using Firebase Storage directly?
Firebase Storage offers the simplicity of security rules to restrict access, while using GCS directly requires a backend to generate signed URLs. I would prefer signed URLs when the system does not use Firebase Authentication, or when you want some validation before the file is uploaded in the first place. However, most of that can be done with security rules as well.
With Firebase Storage the upload is simpler: you just call the uploadBytes() function, whereas signed URLs require some additional code. An example can be found in this
I am not sure what you mean by 'transcode video', but you can use Cloud Storage triggers for Cloud Functions to run any action, such as adding the URL to Firestore or processing the video, once a file is uploaded.
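As a rough sketch of what such a trigger handler could do, the function below takes the event payload of a finished upload and builds the record to store. The `bucket`, `name`, and `contentType` keys are fields of the Cloud Storage object resource; `save_record` is a hypothetical callback standing in for the Firestore write, and the record layout is an assumption.

```python
def handle_finalized_object(event, save_record):
    """Handle the payload of a Cloud Storage 'object finalized' event.

    `save_record` is a hypothetical callback standing in for a
    Firestore write; it receives the record as a plain dict.
    """
    content_type = event.get("contentType", "")
    if not content_type.startswith("video/"):
        return None  # ignore anything that is not a video

    record = {
        "uri": f"gs://{event['bucket']}/{event['name']}",
        "contentType": content_type,
        "status": "uploaded",  # a transcoding step could update this later
    }
    save_record(record)
    return record
```

Passing the storage call in as a parameter keeps the handler testable without any cloud credentials.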
I am wondering what the best approach is to implement a simple form with file upload on a static website without any backend.
Scenario:
I have a static website (NuxtJS) where a form can be filled in and files can be uploaded.
To protect this form I wanted to use reCAPTCHA by Google, but as I read further in their documentation it seems that I need a backend, which is overkill for a static website.
Furthermore I wanted to support file upload... quite complicated without a backend.
What I thought of:
Maybe there is an existing product which does exactly what I am looking for? Or should I build an AWS Lambda pipeline (with an S3 bucket, of course) to act as my "backend" for reCAPTCHA and file upload?
Is there any approach which makes this scenario simpler, or am I thinking in too complicated a way at the moment?
Use Case / Flow Chart:
User enters the website.
Fills out the form.
(Optional) uploads files.
Checks the reCAPTCHA.
Clicks Send, which sends a "message" to our company's Slack channel or by email.
However I solved this "common" task with a custom "backend" hosted on AWS Lambda which makes the whole stuff "serverless".
For those who are interested in how to set up a serverless backend, here's the current flow chart which I made use of.
As you can see, after the reCAPTCHA is completed on the client side and a token is generated, the token is sent to the AWS API Gateway, which triggers a Lambda function (a NodeJS implementation of a backend) where the token is validated and, for file uploads, pre-signed URLs are generated.
Notice: The API Gateway and the S3 Bucket need a valid CORS Configuration to communicate with each other and the world.
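The token check inside the function can be sketched as below. The original backend is NodeJS; this is a Python sketch using only the standard library, and the function names are assumptions. It posts the site secret and the client's token to reCAPTCHA's `siteverify` endpoint and reads the `success` field of the JSON reply.

```python
import json
import urllib.parse
import urllib.request

VERIFY_URL = "https://www.google.com/recaptcha/api/siteverify"


def build_verify_request(secret, token):
    """Build the form-encoded POST request for the siteverify endpoint."""
    data = urllib.parse.urlencode({"secret": secret, "response": token}).encode()
    return urllib.request.Request(VERIFY_URL, data=data, method="POST")


def token_is_valid(response_body):
    """Return True only if the JSON reply reports success."""
    return json.loads(response_body).get("success") is True


def verify(secret, token):
    """Send the request and interpret the reply (needs network access)."""
    request = build_verify_request(secret, token)
    with urllib.request.urlopen(request, timeout=10) as response:
        return token_is_valid(response.read())
```

Only after `verify` returns True would the function go on to generate a pre-signed URL for the upload.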
Use case: a customer uploads a file through our public REST API to our S3 bucket, and we then process the file using downstream services.
After doing some research I found three ways to do it:
Uploading using OCTET-STREAM file type
Upload the file using form-data request
Upload the file using the pre-signed URL
In the first two cases the user sends the binary file to us, and we upload it to S3 after validating it.
In the third method the user has to call three APIs: the first returns an S3 pre-signed URL that grants access to upload the file, the second uploads the file to that pre-signed URL, and the third, sent once the upload is complete, asks us to process the file.
Are there any security issues with method 3, given that a user could misuse the pre-signed URL to upload a malicious file?
Which of these methods is best according to industry practice?
Details of each approach:
1. Uploading using OCTET-STREAM file type
Pros:
This method suits file types that are opened in a specific application, such as xlsx.
1 API hit. Direct file upload
Cons:
This option is not suitable for uploading multiple files. If we need to support multiple-file upload in future, it would have to change to multipart/form-data (approach 2).
No metadata can be sent as a body parameter. Metadata can only be sent in headers.
2. Upload the file using form-data request
User will upload the file with the API request by attaching it as multipart form.
Pros
We can send multiple files at the same time.
We can send extra parameters in the body.
3. Upload the file using the pre-signed URL
Cons
The customer has to call three APIs to upload the file (two calls to upload, then one more to process the file).
If you want them to load data into a bucket, the best way will almost always be the pre-signed URL. It gives you complete control over how you hand out access to the bucket, while still letting them upload directly into the bucket once they have that access.
With the first two approaches, users can send malicious data straight to your API, potentially DoSing the server or running up costs as you handle the payloads, because the endpoint is public and you have no control over access.
In the third case they can request a URL from you, but that is all they can do: short of spamming you with URL requests, they cannot access the bucket or do anything else unless you grant them a URL. That is much better than having your upload endpoint flooded with large junk files that you must process before deciding you did not want them anyway.
Finally, the pre-signed URL is the pattern AWS expects you to use, so it comes with a lot of support for the access management, roles, logging, and monitoring that you would want around this service. If you stand up the upload API yourself, all of that is yours to manage.
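The control point in this pattern is the API that hands out URLs. A minimal sketch of that gate is shown below: it checks the declared type and size and has the server, not the customer, choose the object key. The allowed types, the size limit, the key layout, and the function name are all assumptions; the returned key would then be passed to the AWS SDK's pre-sign call.

```python
import uuid

# Hypothetical policy values; adjust to what downstream services accept.
ALLOWED_TYPES = {"text/csv", "application/pdf"}
MAX_BYTES = 50 * 1024 * 1024


def plan_upload(customer_id, content_type, size_bytes):
    """Validate an upload request and return the object key to pre-sign.

    Raises ValueError when the request should not receive a URL.
    """
    if content_type not in ALLOWED_TYPES:
        raise ValueError(f"content type not allowed: {content_type}")
    if not 0 < size_bytes <= MAX_BYTES:
        raise ValueError(f"size out of range: {size_bytes}")
    # The server picks the key, so one customer cannot overwrite
    # another customer's objects or guess existing names.
    return f"incoming/{customer_id}/{uuid.uuid4().hex}"
```

The declared values are only claims made by the client, so the processing step should still validate the object that actually lands in the bucket.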
I need to store my service's data in Google Storage and let my users download files depending on their access rights.
I've already built a service that connects to Google Storage using the server-centric mechanism and transfers the files to the client, but I need the client to download files from Storage directly, without going through the server.
I've tried temporary links to files, but I can't tell whether a user has downloaded the file, so I can't properly delete the temporary link.
I've looked for OAuth2 support, but it seems Google doesn't support OAuth in this way (where my service decides whether to allow access).
The ideal solution would be to generate tokens for users and have Google Storage call my service before every file download.
How can I achieve that?
I'm storing objects in buckets on google cloud storage. I would like to provide a http url to the object for download. Is there a standard convention or way to expose files stored in cloud storage as http urls?
Yes. Assuming that the objects are publicly accessible:
http://BUCKET_NAME.storage.googleapis.com/OBJECT_NAME
You can also use:
http://storage.googleapis.com/BUCKET_NAME/OBJECT_NAME
Both HTTP and HTTPS work fine. Note that the object must be readable by anonymous users, or else the download will fail. More documentation is available at https://developers.google.com/storage/docs/reference-uris
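A small helper can build either URL form. This is a sketch with assumed function names; it percent-encodes the object name so that spaces and similar characters survive, while leaving the `/` separators intact.

```python
from urllib.parse import quote


def public_url(bucket, object_name):
    """Path-style URL: storage.googleapis.com/BUCKET_NAME/OBJECT_NAME."""
    return f"https://storage.googleapis.com/{bucket}/{quote(object_name)}"


def public_url_virtual_hosted(bucket, object_name):
    """Virtual-hosted-style URL: BUCKET_NAME.storage.googleapis.com/OBJECT_NAME."""
    return f"https://{bucket}.storage.googleapis.com/{quote(object_name)}"
```

For example, `public_url("example-bucket", "folder/cat photo.jpeg")` yields `https://storage.googleapis.com/example-bucket/folder/cat%20photo.jpeg`.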
If it is the case that the objects are NOT publicly accessible and you only want the one user to be able to access them, you can generate a signed URL that will allow only the holder of the URL to download the object, and even then only for a limited period of time. I recommend using one of the GCS client libraries for this, as it's easy to get the signing code slightly wrong: https://developers.google.com/storage/docs/accesscontrol#Signed-URLs
One way is to use https://storage.cloud.google.com// see more documentation at
https://developers.google.com/storage/docs/collaboration#browser
If the file is not public, you can use this link to the file, and it will authenticate with your signed-in Google account:
https://storage.cloud.google.com/{bucket-name}/{folder/filename}
Otherwise generate a signed URL:
gsutil signurl -d 10m Desktop/private-key.json gs://example-bucket/cat.jpeg