Library to upload to Google Cloud Storage with a signed URL? - google-cloud-storage

I'm building a web app that needs to let users upload large files. I would like to use Google Cloud Storage. I've spent several days reading the docs, and I'm still not sure how to actually initiate the upload from the browser.
I understand that a signed URL is needed for this purpose (allowing an anonymous user to upload to a bucket), which I can generate on the server and send to the browser.
However, the JavaScript client library seems intended to run only on an application server, authenticated as a user or service account. For example, the @google-cloud/storage package has a method for generating signed URLs, but not one for uploading a file using that signed URL.
What am I missing?
PS - These will be big files, so it would be nice to perform resumable uploads.
UPDATE: From what I can tell (hints here), you just need to PUT or POST to the signed URL. So I guess I should go snag a generic file upload utility that wraps XHR. (Maybe jQuery?)

I don't believe such a library exists. You'll probably want to just use standard JavaScript or your favorite library for making remote HTTP calls.
You'll make an initial POST to the signed URL. This will result in a 201, like so:
HTTP/1.1 201 Created
Location: https://example.storage.googleapis.com/music.mp3?upload_id=tvA0ExBntDa...gAAEnB2Uowrot
Date: Fri, 01 Oct 2010 21:56:18 GMT
Content-Length: 0
Content-Type: audio/mpeg
The URL in the "Location" header, including the upload_id parameter, is the URL to which you'll follow up with a PUT with your data. That operation can be interrupted and resumed.
There's more documentation on exactly how to use resumable uploads with XML here: https://cloud.google.com/storage/docs/xml-api/resumable-upload
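To make that concrete, here is a minimal browser-side sketch in TypeScript (the function name is mine; it assumes the signed URL was generated server-side for a resumable upload, e.g. with @google-cloud/storage's getSignedUrl and action: 'resumable', and that the bucket's CORS configuration exposes the Location header to the browser):

// Minimal sketch: initiate a resumable session against the signed URL, then
// PUT the file bytes to the session URL returned in the Location header.
async function uploadResumable(signedUrl: string, file: File): Promise<void> {
  // Step 1: POST with x-goog-resumable: start; GCS answers 201 Created
  // with a Location header carrying the upload_id session URL.
  const init = await fetch(signedUrl, {
    method: "POST",
    headers: { "x-goog-resumable": "start", "Content-Type": file.type },
  });
  const sessionUrl = init.headers.get("Location");
  if (!sessionUrl) throw new Error("GCS did not return a resumable session URL");
  // Step 2: upload the data; if interrupted, the PUT can be retried against
  // the same session URL with a Content-Range header, per the XML API docs.
  await fetch(sessionUrl, { method: "PUT", body: file });
}

An XHR-based wrapper (or jQuery) works the same way; the only GCS-specific parts are the x-goog-resumable: start header and reading the Location header off the 201 response.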

Related

Data Factory can't download CSV file from web API with Basic Auth

I'm trying to download a CSV file from a website in Data Factory using the HTTP connector as my source linked service in a copy activity. It's basically a web call to a url that looks like https://www.mywebsite.org/api/entityname.csv?fields=:all&paging=false.
The website uses basic authentication. I have manually tested by using the url in a browser and entering the credentials, and everything works fine. I have used the REST connector in a copy activity to download the data as a JSON file (same url, just without the ".csv" in there), and that works fine. But there is something about the authentication in the HTTP connector that is different and causing issues. When I try to execute my copy activity, it downloads a csv file that contains the HTML for the login page on the source website.
While searching, I did come across this Github issue on the docs that suggests that the basic auth header is not initially sent and that may be causing an issue.
As I have it now, the authentication is defined in the linked service. I'm hoping that maybe I can add something to the Additional Headers or Request Body properties of the source in my copy activity to make this work, but I haven't found the right thing yet.
Suggestions of things to try or code samples of a working copy activity using the HTTP connector and basic auth would be much appreciated.
The HTTP connector expects the API to return a 401 Unauthorized response to the initial request; it then retries with the basic auth credentials. If the API doesn't behave this way, the connector never sends the credentials defined in the HTTP linked service.
If that is the case, go to the copy activity source and, in the Additional Headers property, add Authorization: Basic followed by the base64-encoded form of username:password. It should look something like this (where the string at the end is the encoded username:password):
Authorization: Basic ZxN0b2njFasdfkVEH1fU2GM=
It's best if that value isn't hard-coded into the copy activity but is retrieved from Key Vault and passed as secure input to the copy activity.
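If it helps, a quick way to produce that encoded value is a Node one-liner (the credentials here are placeholders):

// Encode username:password for a basic auth header.
const encoded = Buffer.from("myUser:myPassword").toString("base64");
console.log(`Authorization: Basic ${encoded}`);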
I suggest you try the REST connector instead of the HTTP one. It supports Basic as the authentication type, and I have verified it using a test endpoint on httpbin.org.
Configure the REST linked service with Basic authentication and the base URL of the API (a sketch of the configuration follows below). Once you have created a dataset connected to this linked service, you can include it in your copy activity.
Once the pipeline executes, the content of the REST response will be saved in the specified file.
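For reference, a hedged sketch of what that REST linked service could look like in JSON (the httpbin URL and credentials are placeholders; in practice the password should come from Key Vault):

{
    "name": "RestServiceBasicAuth",
    "properties": {
        "type": "RestService",
        "typeProperties": {
            "url": "https://httpbin.org",
            "enableServerCertificateValidation": true,
            "authenticationType": "Basic",
            "userName": "myUser",
            "password": { "type": "SecureString", "value": "myPassword" }
        }
    }
}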

Authorization for a POST request to Google Storage bucket

I am trying to figure out how to use Google Cloud Storage for my app. The app should allow any user to POST (or PUT) objects into a Bucket and also let them read any files from said bucket. I am confused as to how I am supposed to form my POST requests in order to make this work.
I have just been playing around, sending requests to see if I can upload a file into a bucket. I have tried forming a request according to this example:
https://cloud.google.com/storage/docs/json_api/v1/how-tos/upload
POST /upload/storage/v1/b/<bucketname>/o?uploadType=media&name=<objectname> HTTP/1.1
Host: www.googleapis.com
Content-Type: image/jpeg
Content-Length: 100
Authorization: Bearer <your_auth_token>
I am confused as to what exactly the 'auth_token' is. I have tried going into the developers console and generating a 'Public API access' key and attempted to use this, but I got a response saying it was unauthorized.
Am I generating the right type of key? I understand OAuth keys are used when you need access to a Google user's data. I don't need this: I simply need to allow users of my app to add files to a bucket and read from that bucket. Any help to point me in the right direction is appreciated.
The auth token is like an ID that proves you have permission to perform your actions. Since you haven't said which language you are using, I will provide a Go example:
https://github.com/johnbalvin/google-cloud-go/blob/master/storage/resumableUpload.go
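If you end up on Node rather than Go, here is a rough equivalent sketch using google-auth-library to mint the Bearer token from service-account credentials (the bucket, object name, and file path are placeholders):

import { GoogleAuth } from "google-auth-library";
import { readFile } from "node:fs/promises";

// Sketch: obtain an OAuth2 access token (e.g. via GOOGLE_APPLICATION_CREDENTIALS)
// and pass it as the Bearer token on a simple media upload.
async function upload(): Promise<void> {
  const auth = new GoogleAuth({
    scopes: ["https://www.googleapis.com/auth/devstorage.read_write"],
  });
  const client = await auth.getClient();
  const { token } = await client.getAccessToken();
  const body = await readFile("photo.jpg");
  const res = await fetch(
    "https://storage.googleapis.com/upload/storage/v1/b/my-bucket/o?uploadType=media&name=photo.jpg",
    {
      method: "POST",
      headers: { Authorization: `Bearer ${token}`, "Content-Type": "image/jpeg" },
      body,
    }
  );
  console.log(res.status, await res.text());
}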

Google Cloud Storage: Setting incorrect MIME-type

I have a Node.js server running on a Google Compute Engine virtual instance. The server streams incoming files to Google Cloud Storage (GCS). My code is here: Node.js stream upload directly to Google Cloud Storage
I'm passing Content-Type in the XML headers and it's working just fine for image/jpeg MIME-types, but for video/mp4 GCS is writing files as application/octet-stream.
There's not much to this, so I'm totally at a loss for what could be wrong ... any ideas are welcome!
Update/Solution
The problem was that the multiparty module was adding a content-type: octet-stream header to the 'part' object I was piping to GCS. GCS therefore received two content-types, of which the octet-stream one came last, so GCS used it for the inbound file.
OK, looking at your HTTP request and response, it seems the content-type is specified in the URL returned by the initial HTTP request. The initial request should return the endpoint to which the file can be uploaded. I'm not sure why the content-type is carried there, but looking at the documentation (https://developers.google.com/storage/docs/json_api/v1/how-tos/upload, "start a resumable session"), it says that X-Upload-Content-Type needs to be specified, along with some other headers. That header doesn't appear in the HTTP requests mentioned above. There might be an issue with the library you're using, because the returned endpoint does not look like what the documentation specifies.
Have a look at "Example: Resumable session initiation request" at https://developers.google.com/storage/docs/json_api/v1/how-tos/upload and see if you still have the same issue when you specify the headers suggested there.
Google Cloud Storage is content-type agnostic, i.e., it treats any kind of content the same way (videos, music, zip files, documents, you name it).
But just to give some idea:
First, I believe the video you are uploading is more or less the same size after being uploaded, so it falls under application/<subtype> (similar to section 3.3 of RFC 4337).
To make this correct, I believe you need to work out how the mp4 metadata is stored before and after the file is uploaded.
Please let us know what your solution was.
A solution that worked for me in a similar situation is below. TL;DR: save video from a web app to GCS with content type video/mp4 instead of application/octet-stream.
Here is the situation. You want to record video in the browser and save it to Google Cloud Storage with a content type set to video/mp4 instead of application/octet-stream. User records video and clicks button to send video file to your server for saving. After sending the video file from the client to your server, the server sends the video file to Google Cloud Storage for saving.
You successfully save the video to Google Cloud Storage and by default GCS assigns a content type of application/octet-stream to the video.
To assign a content type video/mp4 instead of application/octet-stream, here is some server-side Python code that works.
from google.cloud import storage

# Upload the file, then patch the blob's metadata to the correct content type.
storage_client = storage.Client()
bucket = storage_client.bucket(bucket_name)
blob = bucket.blob(destination_blob_name)
blob.upload_from_file(file_obj, rewind=True)  # rewind=True seeks file_obj back to the start first
blob.content_type = 'video/mp4'  # override the default application/octet-stream
blob.patch()  # push the metadata change to GCS (upload_from_file also accepts a content_type argument)
Here are some links that might help.
https://cloud.google.com/storage/docs/uploading-objects#storage-upload-object-python
https://stackoverflow.com/a/33320634/19829260
https://stackoverflow.com/a/64274097/19829260
NOTE: at the time of this writing, the Google Docs about editing metadata don't work for me because they say to set metadata but metadata seems to be read-only (see SO post https://stackoverflow.com/a/33320634/19829260)
https://cloud.google.com/storage/docs/viewing-editing-metadata#edit

webclient.download downloading incomplete tiff image from a url

I am using below code to download a tiff file from a lotus domino server.
string url = "http://10.1.1.23\\Domino\\ImageDb.nsf\\500-99-9o9\\$File\\abc.tif";

// Create an instance of WebClient
WebClient client = new WebClient();

string filename = "c:\\test.tif";
client.DownloadFile(url, filename);
But the downloaded file is 4kb instead of 22kb, and when I try to open it, it says it's in an improper/invalid format. Any guesses as to what is going wrong?
Using Fiddler, you will see that the 4kb file is the authentication HTML page that Domino automatically serves when an unauthenticated HTTP request is made for a resource that is not accessible anonymously.
In this case, it sounds like you will need to authenticate when you request this file from Domino.
You can do this by providing a valid LTPA token in the request header, which the Domino server issues once you have authenticated (a hedged sketch follows below). Alternatively, if authentication is not possible, you can make the database ACL and the document accessible to "anonymous" users. Although not specifically C# code, these links will help you understand LTPA on Domino: here, here and here
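Not C# either, but a hedged TypeScript/Node sketch of that flow (the names.nsf?Login form action, the Username/Password field names, and the cookie names are the usual Domino session-authentication conventions, so verify them against your server):

// Sketch: log in to Domino's session-based authentication form, then reuse
// the returned session cookie (LtpaToken / DomAuthSessId) to fetch the file.
const login = await fetch("http://10.1.1.23/names.nsf?Login", {
  method: "POST",
  headers: { "Content-Type": "application/x-www-form-urlencoded" },
  body: new URLSearchParams({ Username: "jdoe", Password: "secret" }),
  redirect: "manual", // Domino answers a successful login POST with a redirect
});
const cookie = login.headers.get("set-cookie") ?? "";
const tiff = await fetch(
  "http://10.1.1.23/Domino/ImageDb.nsf/500-99-9o9/$File/abc.tif",
  { headers: { Cookie: cookie } }
);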

How Do I Upload Multiple Files Using the iPhone

I am posting (HTTP POST) various values to the Posterous API. I am successfully able to upload the title, body, and ONE media file, but when I try to add in a second media file I get a 500 from the server.
They do allow media and media[] as parameters.
How do I upload multiple files with the iPhone SDK?
The 500 you're getting is probably down to one of two things:
An incorrect request
An error on the server
Now, if it were an incorrect request, a more helpful HTTP server would respond with something like a 415 (Unsupported Media Type). A 500 suggests that something went wrong on the server and that your request was valid.
You'll have to dig into the server API or code (if you wrote it), or read the docs, and figure out what's wrong with your second request ... seems like maybe you're not setting the appropriate media type?
EDIT: OK, so I looked at the API. It appears you're posting XML, so your request content-type should be
Content-Type: application/xml
The API doc didn't specifically say, but that would be the correct type.
EDIT: Actually, on second glance, are you just POSTing with URI params? Their API doc isn't clear (I'm also looking rather quickly)
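For what it's worth, since the question is really about how a multiple-file upload is shaped on the wire, here is a small TypeScript/fetch sketch of a multipart/form-data POST that repeats the media[] key once per file (the endpoint URL and field names follow the question's description of the Posterous API, so treat them as assumptions):

// Sketch: two files posted under the repeated media[] key.
const fileOne = new Blob([/* first file's bytes */], { type: "image/jpeg" });
const fileTwo = new Blob([/* second file's bytes */], { type: "image/jpeg" });

const form = new FormData();
form.append("title", "My post");
form.append("body", "Posted from the app");
form.append("media[]", fileOne, "one.jpg"); // one append per file,
form.append("media[]", fileTwo, "two.jpg"); // same field name each time
// fetch sets the multipart/form-data boundary header automatically.
await fetch("https://posterous.com/api/newpost", { method: "POST", body: form });

On the iPhone side the body has the same shape; whatever you use to build the request just needs to emit one part per file under the same field name.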