I have a constant stream of messages (JSON). I would like to store it in GCS and have one file per hour. Is there a way to do it?
You can use GCS Resumable Uploads to stream data, either via the JSON API or XML API.
By using a resumable upload, you can continue to append more data to the object until you want to close it, so you could continuously stream data to it in chunks and then finalize it after an hour.
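For example, here is a minimal sketch of hourly rotation using the google-cloud-storage Python client, whose `blob.open("w")` writer performs a chunked resumable upload under the hood; the bucket name and the message iterable are placeholders for your own setup:

```python
# Minimal sketch of hourly rotation with the google-cloud-storage
# package; "my-bucket" is a placeholder. blob.open("w") performs a
# chunked resumable upload under the hood.
import json
from datetime import datetime, timezone

from google.cloud import storage


def write_stream(messages):
    """Write an iterable of JSON-serializable messages, one GCS object per hour."""
    bucket = storage.Client().bucket("my-bucket")
    writer, current_hour = None, None
    for message in messages:
        hour = datetime.now(timezone.utc).strftime("%Y-%m-%d-%H")
        if hour != current_hour:
            if writer is not None:
                writer.close()  # finalizes the previous hour's object
            writer = bucket.blob(f"messages/{hour}.jsonl").open("w")
            current_hour = hour
        writer.write(json.dumps(message) + "\n")
    if writer is not None:
        writer.close()
```

Closing the writer is what finalizes the object; until then the upload stays open and you can keep streaming chunks into it.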
Related
Need some guidance or suggestions
Scenario:
We are trying to fetch a file using a REST API and save it to Azure Data Lake.
Journey:
Login, GetToken, invoke the GetFile API, and save to Azure Data Lake.
We are trying two options:
Using a Logic App -- this works fine, but apparently it is not an approved PaaS service.
Using Data Factory -- here we are facing an issue. We are able to invoke the REST endpoint, but we are unable to save the response as a file (PDF, image, etc.). How can I achieve this?
Best,
Ashwin
Please change to use the HTTP connector:
The HTTP connector is generic and retrieves data from any HTTP endpoint, for example to download a file.
The REST connector only copies JSON responses from REST endpoints.
You can refer to this documentation.
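If you want to validate the journey outside Data Factory, the equivalent logic is small. Here is a rough Python sketch, assuming the requests and azure-storage-file-datalake packages; every URL, name, and credential below is a placeholder:

```python
# Rough sketch of the same journey in Python: log in, get a token,
# fetch the file as raw bytes, and save it to Azure Data Lake Gen2.
# All URLs, names, and credentials below are placeholders; assumes
# the requests and azure-storage-file-datalake packages.
import requests
from azure.storage.filedatalake import DataLakeServiceClient

# 1. Login / GetToken (endpoint and payload are hypothetical)
token = requests.post(
    "https://example.com/api/login",
    json={"user": "svc-account", "password": "..."},
).json()["access_token"]

# 2. Invoke the GetFile API; response.content keeps the body as raw
#    bytes, which is what a PDF or image needs (no JSON parsing).
response = requests.get(
    "https://example.com/api/getfile?id=123",
    headers={"Authorization": f"Bearer {token}"},
)
response.raise_for_status()

# 3. Save the bytes to Azure Data Lake
service = DataLakeServiceClient(
    account_url="https://<account>.dfs.core.windows.net",
    credential="<account-key>",
)
file_client = service.get_file_system_client("my-container").get_file_client(
    "incoming/report.pdf"
)
file_client.upload_data(response.content, overwrite=True)
```

The key point, in either tool, is that the file body has to be carried as raw bytes rather than parsed as JSON, which is what the HTTP connector does and the REST connector does not.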
Use Case: A customer can upload a file through our public REST API to our S3 bucket, and then we can process the file using downstream services.
After doing some research, I found three ways to do it:
Uploading using the OCTET-STREAM content type
Uploading the file using a form-data request
Uploading the file using a pre-signed URL
In the first two cases, the user sends the binary file and we upload it to S3 after validating it.
In the third method, the user has to hit three APIs: the first API call returns an S3 pre-signed URL, which grants the user access to upload the file to S3; in the second call, the user uploads the file to that pre-signed URL; and after the upload completes, the user sends a request to process the file.
Do we have any security issues with option 3, since the user could misuse the pre-signed URL to upload a malicious file?
Which of these methods is best according to industry practice?
Details of each approach:
1. Uploading using the OCTET-STREAM content type
Pros:
This method is good for uploading file types that can be opened by an application, such as xlsx.
One API hit; direct file upload.
Cons:
This option is not suitable for uploading multiple files. If we need to support multiple file uploads in the future, this would have to change to multipart/form-data (approach 2).
No metadata can be sent as a body parameter; metadata can only be sent in headers.
2. Uploading the file using a form-data request
The user uploads the file with the API request by attaching it as a multipart form.
Pros:
We can send multiple files at the same time.
We can send extra parameters in the body.
3. Uploading the file using a pre-signed URL
Cons:
The customer has to hit three APIs to upload the file (two API hits to upload, then one more to trigger processing of the file).
If you want them to load data into a bucket, the best way will almost always be the pre-signed URL. It gives you complete control over how you hand out access to the bucket, while still letting them upload directly into the bucket once they have access.
In the first two examples, the user can send malicious data to your API, potentially DoSing the server or incurring costs for you to manage the payloads, as you have no control over access (it is public).
In the third case they can request a URL from you, but that is it; other than spamming you with requests for URLs, they can't access the bucket or do anything else unless you grant them a URL. That is much better than them spamming your upload endpoint with large junk files and having you process them before you decide you didn't want them anyway.
Finally, the pre-signed URL is the pattern AWS expects you to use, so AWS has a lot of support for managing the access, roles, logging, monitoring, etc. that you would want to put around this service. When you stand up the API yourself, all of that is up to you to manage.
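As a rough illustration of the three calls, here is a minimal boto3 sketch; the bucket and key names are placeholders, and the expiry is up to you:

```python
# Minimal boto3 sketch of the pre-signed URL flow; bucket and key
# names are placeholders, and the expiry is up to you.
import boto3
import requests

s3 = boto3.client("s3")

# Step 1 (your API): hand the customer a short-lived upload URL for
# a key that you choose, not the customer.
url = s3.generate_presigned_url(
    "put_object",
    Params={"Bucket": "my-upload-bucket", "Key": "uploads/customer-123/file.pdf"},
    ExpiresIn=300,  # the URL stops working after 5 minutes
)

# Step 2 (the customer): upload directly to S3, bypassing your servers.
with open("file.pdf", "rb") as f:
    requests.put(url, data=f)

# Step 3 (your API): when the customer asks you to process the file,
# validate the object (size, type, virus scan) before doing anything
# with it downstream.
```

Because you pick the key and the expiry, a leaked URL is only good for one object for a few minutes, and your validation in step 3 is where a malicious file gets caught.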
I am trying to write a cloud function that receives a CSV log from a Python server, then posts it to either a Firestore collection or a Cloud Storage bucket. I am a little lost on how to continue. Any suggestions?
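If the bucket route is enough, an HTTP-triggered function can just write the request body to Cloud Storage. A minimal sketch, assuming the functions-framework and google-cloud-storage packages; the bucket name and object path are placeholders:

```python
# Minimal sketch of an HTTP-triggered Cloud Function that stores an
# incoming CSV log in a Cloud Storage bucket; the bucket name and
# object path are placeholders. Assumes the functions-framework and
# google-cloud-storage packages.
import functions_framework
from google.cloud import storage

client = storage.Client()


@functions_framework.http
def store_csv(request):
    # The Python server POSTs the CSV as the raw request body.
    csv_bytes = request.get_data()
    blob = client.bucket("my-log-bucket").blob("logs/latest.csv")
    blob.upload_from_string(csv_bytes, content_type="text/csv")
    return "stored", 200
```

Firestore works too, but it is built for structured documents and caps each document at 1 MiB, so a bucket is usually the better fit for raw CSV files.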
Is it possible for a Cloud Storage bucket to receive data/messages from Pub/Sub? If yes, then how?
Currently I am publishing messages to Pub/Sub, and I want to use the push delivery type (for that I have to provide an endpoint URL for the bucket, which I couldn't find anywhere).
I found this somewhere in their docs, but it didn't work.
No, sorry. GCS only accepts uploads of complete files via HTTP. You could build a small app that takes incoming Pub/Sub messages and uploads them as separate GCS objects, or batches them into groups of messages and uploads those to GCS, but there's no such built-in functionality.
Can I ask you more about your use case? What are you trying to do?
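For what it's worth, a minimal version of that relay app could look like this in Python, assuming the google-cloud-pubsub and google-cloud-storage packages; the project, subscription, and bucket names are placeholders:

```python
# Sketch of the small relay app described above: pull Pub/Sub
# messages in batches and write each batch as one GCS object.
# Project, subscription, and bucket names are placeholders.
import time

from google.cloud import pubsub_v1, storage

subscriber = pubsub_v1.SubscriberClient()
sub_path = subscriber.subscription_path("my-project", "my-subscription")
bucket = storage.Client().bucket("my-bucket")

while True:
    response = subscriber.pull(
        request={"subscription": sub_path, "max_messages": 100}
    )
    if not response.received_messages:
        time.sleep(5)
        continue
    body = b"\n".join(m.message.data for m in response.received_messages)
    bucket.blob(f"pubsub/batch-{int(time.time())}.ndjson").upload_from_string(body)
    # Ack only after the batch is safely in GCS, so nothing is lost
    # if the upload fails.
    subscriber.acknowledge(
        request={
            "subscription": sub_path,
            "ack_ids": [m.ack_id for m in response.received_messages],
        }
    )
```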
I searched many sites about uploading multiple images in a single request with WCF. The web services I found show how uploaded images can be converted into bytes and saved to a directory. My requirement is to upload multiple images in a single request, with some of the parameters also coming in the HTTP POST method. How do I store those images on the server using WCF?
I am not sure why you want this.
If you make multiple requests, it would be more user friendly.
In a single request you can't upload multiple images, and it's also not recommended.
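If you do go with one request per image, the client side is trivial in any HTTP client. A hypothetical Python sketch; the endpoint, query parameters, and file list are all made up for illustration:

```python
# Hypothetical client-side sketch of the one-image-per-request
# approach; the endpoint, query parameters, and file list are all
# made up for illustration.
import requests

for path in ["a.jpg", "b.jpg", "c.jpg"]:
    with open(path, "rb") as f:
        r = requests.post(
            "https://example.com/ImageService.svc/UploadImage",
            params={"filename": path, "album": "holiday"},  # extra POST parameters
            data=f,  # raw image bytes in the body
            headers={"Content-Type": "application/octet-stream"},
        )
    r.raise_for_status()
```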