Microsoft Graph API for Teams: How to download large data from hosted content in chunks?

In Microsoft Teams, embedded images from posts (chat) are stored as hosted content. These images can be very large. Here is the API provided by Microsoft.
GET https://graph.microsoft.com/v1.0/teams/{teams-id}/channels/{channel-id}/messages/1675676227292/hostedContents/aWQ9eF8wLXNpbi1kNC1mNWNkZmI2NWQ0ZDBmYmVjMDY3ZWQzYjBkZjBlYjYwOCx0eXBlPTEsdXJsPWh0dHBzOi8vaW4tYXBpLmFzbS5za3lwZS5jb20vdjEvb2JqZWN0cy8wLXNpbi1kNC1mNWNkZmI2NWQ0ZDBmYmVjMDY3ZWQzYjBkZjBlYjYwOC92aWV3cy9pbWdv/$value
This API works perfectly fine and I am able to download any image. However, when executing this API from code, we have a restriction that at most 8 MB can be downloaded in a single API call, so I am not able to download a 17 MB file. (If I execute this API directly in Postman, it works fine.) I want to download the 17 MB file in chunks of 8 MB.
A call to this API returns "contentBytes": null in the response, so I do not know the size of the file before downloading it. I also tried adding "Range: bytes=0-4096" to the request headers, but it still fetches the entire file. How can I download this content in chunks?
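For illustration, a minimal sketch in Python with the requests library that streams the response and writes it to disk in 8 MB chunks instead of buffering the whole body (the token, IDs, and output filename are placeholders; this only avoids holding the file in memory and does not depend on the server honoring Range):

import requests

# Placeholder URL and token; substitute real team, channel, message and
# hosted-content IDs.
url = ("https://graph.microsoft.com/v1.0/teams/{team-id}/channels/{channel-id}"
       "/messages/{message-id}/hostedContents/{hosted-content-id}/$value")
headers = {"Authorization": "Bearer {access-token}"}

# Stream the response instead of loading it into memory in one piece.
with requests.get(url, headers=headers, stream=True) as resp:
    resp.raise_for_status()
    with open("image.bin", "wb") as out:
        # Write the body 8 MB at a time.
        for chunk in resp.iter_content(chunk_size=8 * 1024 * 1024):
            out.write(chunk)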

Related

Creating best API: Upload N files and json metadata

I am creating an API and am unsure what it should look like.
Several BLOB files (PDF, JPG, ZIP, ...) should be uploaded, along with some JSON containing metadata.
What is the most up-to-date way to design such an API?
There are two cases:
the upload was successful. Then I think 201 (Created) would be feasible
the upload was not successful (for example, invalid metadata). Then 422 (Unprocessable Entity) should be returned.
Example:
Three PDF files should get uploaded (at once), associated with some JSON metadata.
What you often see is that you have a resource for handling the BLOBs and one for the metadata - Facebook and Twitter do this for images and videos.
For example /files would take your BLOB data and return an ID for the uploaded BLOB data.
The metadata would be sent to another resource, called /posts, which could consume application/json.
In the application I currently work on, we had the same issue and decided to use one endpoint consuming multipart/form-data - here you can send the BLOBs and the metadata within different boundaries and have everything in one resource.
Another way would be to base64 encode the BLOBs, which results in roughly 33% overhead, so I do not recommend it. But with base64 you could do all your work in one application/json resource.
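For illustration, a minimal client-side sketch in Python with the requests library, sending three PDFs plus a JSON metadata part in a single multipart/form-data request (the /posts URL and the field names are hypothetical):

import json
import requests

# Hypothetical endpoint and field names, for illustration only.
url = "https://api.example.com/posts"

files = [
    # Each BLOB travels in its own multipart boundary.
    ("files", ("a.pdf", open("a.pdf", "rb"), "application/pdf")),
    ("files", ("b.pdf", open("b.pdf", "rb"), "application/pdf")),
    ("files", ("c.pdf", open("c.pdf", "rb"), "application/pdf")),
    # The metadata goes along as a JSON part in the same request.
    ("metadata", ("metadata.json", json.dumps({"title": "Example"}), "application/json")),
]

resp = requests.post(url, files=files)
print(resp.status_code)  # e.g. 201 on success, 422 for invalid metadata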

ASP.NET Web API Best Practice for Creating, Polling, and Delivering Long-Running Jobs

We are working on a new RESTful API using ASP.NET Web API. Many of our customers receive a nightly data feed from us. For each feed we run a scheduled SQL Agent job that fires off a stored procedure, which executes an SSIS package and delivers files via Email/FTP. Several customers would benefit from being able to run this job on demand and then receive either their binary file (xml, xls, csv, txt, etc.) or a direct transfer of the data in JSON or XML.
The main issue is that the feeds generally take a while to run. Most run within a few minutes, but there are a couple that can take 20 minutes (part of the project is optimizing these jobs). I need some help finding a best practice for setting up this API.
Here are our actions and proposed REST calls
Create Feed Request
POST ./api/feedRequest
Status 201 Created
Returns feedID in the body (JSON or XML)
We thought POST would be the correct request type because we're creating a new request.
Poll Feed Status
GET ./api/feedRequest/{feedID}
Status 102 Processing (feed is processing)
Status 200 OK (feed is completed)
Cancel Feed Request
DELETE ./api/feedRequest/{feedID}
Status 204 No Content
Cancels feed request.
Get Feed
GET ./api/feed/{feedID}
Status 200 OK
This will return the feed data. We'll probably pass parameters in the headers to specify how they want their data. Setting feedType to "direct" would require JSON or XML to be set in Content-Type; setting feedType to "xml", "xls", "csv", etc., will transfer a binary data file back to the user. For some feeds this is a custom template that is already specified in the feed definition stored in our tables.
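For illustration, a rough client-side sketch of the create/poll/fetch flow above, in Python with the requests library (the base URL, the feedID field name in the JSON body, and the polling interval are assumptions, not part of the actual design):

import time
import requests

base = "https://api.example.com/api"  # placeholder base URL

# Create Feed Request: expect 201 Created with the feedID in the body.
feed_id = requests.post(f"{base}/feedRequest").json()["feedID"]

# Poll Feed Status: anything other than 200 OK is treated as still processing.
while True:
    status = requests.get(f"{base}/feedRequest/{feed_id}")
    if status.status_code == 200:
        break
    time.sleep(30)  # arbitrary polling interval

# Get Feed: headers/parameters would select the delivery format.
feed = requests.get(f"{base}/feed/{feed_id}", headers={"Accept": "application/json"})

A common variant of this pattern is to return 202 Accepted from the POST with a Location header pointing at the status resource, which avoids relying on the interim 102 status code.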
Questions
Does it appear that we're on the right track? Any immediate suggestions or concerns?
We are trying to decide whether to have a /feed resource and a /feedRequest resource or whether to keep it all under /feed. The scenario above is the two-resource approach. The single-resource approach would POST /feed to start the request, PUT /feed to check the status, and GET /feed when it's done. The PUT doesn't feel right, and right now we're leaning towards the solution stated above. Does this seem right?
We're concerned about very large dataset returns. Should we be breaking these into pieces, or will the REST service handle these large returns? Some feeds can be in excess of 100MB.
We also have images that may be generated to accompany the feed; they're zipped up in a separate file when the feed stored procedure and package are called. We can keep this all in the same request and call GET /feed/{feedID}/images on the return.
Does anyone know of a best practice or a good GitHub example we could look at that does something similar to this with MS technologies? (We have considered moving to ASP.NET Core as well.)

WWW::Mechanize::Chrome capture XHR response

I am using Perl WWW::Mechanize::Chrome to automate a JS heavy website.
In response to a user click, the page (among many other requests) requests and loads a JSON file using XHR.
Is there some way to save this particular JSON data to a file?
To intercept requests like that, you generally need to use the webRequest API to filter and retrieve specific responses. I do not think you can do that via WWW::Mechanize::Chrome.
WWW::Mechanize::Chrome tries to give you the content of all requests, but Chrome itself does not make the content of XHR requests available (https://bugs.chromium.org/p/chromium/issues/detail?id=457484). So the approach I take in (for example) Net::Google::Keep is to replay the XHR requests using plain Perl LWP requests, copying the cookies and parameters from the Chrome requests.
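To illustrate the general idea of replaying such a request outside the browser (sketched here in Python with the requests library rather than Perl LWP; the URL, cookie, and header values are placeholders you would copy from the browser's network inspector):

import requests

# Placeholders: copy the real XHR URL, cookies and headers from DevTools.
xhr_url = "https://example.com/api/data.json"
cookies = {"session": "value-copied-from-chrome"}
headers = {"X-Requested-With": "XMLHttpRequest"}

resp = requests.get(xhr_url, cookies=cookies, headers=headers)

# Save the JSON payload to a file.
with open("data.json", "w", encoding="utf-8") as fh:
    fh.write(resp.text)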
Please note that the official support forum for WWW::Mechanize::Chrome is https://perlmonks.org, not Stack Overflow.

Multipart upload binary content with OneDrive Rest APIs

As per the API documentation here, I formed my request with Postman as follows. This is working fine.
But when it comes to binary content (encoded in base64 format), the file gets uploaded successfully but is not previewable when I try to open it on OneDrive.
What am I missing here? Any suggestions?
OneDrive doesn't support Content-Transfer-Encoding when using the multi-part upload method. In this case, we're ignoring the header (that seems like a bug) and just storing the base64 encoded data in the file stream (without decoding it).
You'll have to upload the raw bytes as the second part of the request, without any content-transfer-encoding, to have this work.
Since it seems like you are just uploading a file and not trying to set any custom metadata while doing it, you're better off using one of the other upload methods, like PUT or createUploadSession.
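For illustration, a minimal sketch of a simple PUT upload of the raw bytes in Python with the requests library, using the Microsoft Graph /content form of the endpoint (the drive path, filename, and token are placeholders):

import requests

# Placeholders: target path on the drive and an OAuth access token.
url = "https://graph.microsoft.com/v1.0/me/drive/root:/Pictures/photo.jpg:/content"
headers = {
    "Authorization": "Bearer {access-token}",
    "Content-Type": "image/jpeg",
}

# Send the raw bytes of the file, not a base64-encoded string.
with open("photo.jpg", "rb") as fh:
    resp = requests.put(url, headers=headers, data=fh)

print(resp.status_code)  # 200 or 201 when the upload succeeds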
OneDrive does not store the image in base64 format; it stores it as binary. You can directly select the image in Postman and upload it as binary with the multipart request.
Here is a link about adding a blob in Postman:
How to upload images using postman to azure blob storage

Google Cloud Storage: Setting incorrect MIME-type

I have a Node.js server running on a Google Compute Engine virtual instance. The server streams incoming files to Google Cloud Storage (GCS). My code is here: Node.js stream upload directly to Google Cloud Storage
I'm passing Content-Type in the XML headers and it's working just fine for image/jpeg MIME-types, but for video/mp4 GCS is writing files as application/octet-stream.
There's not much to this, so I'm totally at a loss for what could be wrong ... any ideas are welcome!
Update/Solution
The problem was due to the fact that the multiparty module was creating a content-type: octet-stream header on the 'part' object that I was passing into the pipe to GCS. This caused GCS to receive two content-types, of which the octet-stream one was last, so GCS used it for the inbound file.
Ok, looking at your HTTP request and response, it seems like the content-type is specified in the URL returned as part of the initial HTTP request. The initial HTTP request should return the endpoint which can be used to upload the file. I'm not sure why it is specified there, but looking at the documentation (https://developers.google.com/storage/docs/json_api/v1/how-tos/upload - start a resumable session), it says that X-Upload-Content-Type needs to be specified, along with some other headers. This doesn't seem to be specified in the HTTP requests mentioned above. There might be an issue with the library used, but the returned endpoint does not look like what is specified in the documentation.
Have a look at https://developers.google.com/storage/docs/json_api/v1/how-tos/upload, "Example: Resumable session initiation request" and see if you still have the same issue if you specify the same headers as suggested there.
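For illustration, a sketch of starting a resumable upload session with that header set, in Python with the requests library (bucket name, object name, and token are placeholders):

import requests

# Placeholders: bucket, object name, and OAuth token.
bucket = "my-bucket"
object_name = "video.mp4"
init_url = (f"https://storage.googleapis.com/upload/storage/v1/b/{bucket}/o"
            f"?uploadType=resumable&name={object_name}")

headers = {
    "Authorization": "Bearer {access-token}",
    "Content-Type": "application/json; charset=UTF-8",
    # Declares the content type of the object that will be uploaded.
    "X-Upload-Content-Type": "video/mp4",
}

resp = requests.post(init_url, headers=headers, json={"name": object_name})
session_uri = resp.headers["Location"]  # the file bytes are then PUT to this URI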
Google Cloud Storage is content-type agnostic, i.e., it treats any kind of content in the same way (videos, music, zip files, documents, you name it).
But just to give some idea:
First, I believe the video you are uploading stays more or less the same size after being uploaded, so it falls under application/<sub type> (similar to section 3.3 of RFC 4337).
To make this correct, I believe you need to deal with storing the mp4 metadata before and after the file is uploaded.
Please let us know of your solution.
A solution that worked for me in a similar situation is below. TL;DR: save video from a web app to GCS with content type video/mp4 instead of application/octet-stream.
Here is the situation. You want to record video in the browser and save it to Google Cloud Storage with the content type set to video/mp4 instead of application/octet-stream. The user records a video and clicks a button to send the video file to your server, and the server then sends the video file to Google Cloud Storage for saving.
You successfully save the video to Google Cloud Storage and by default GCS assigns a content type of application/octet-stream to the video.
To assign a content type video/mp4 instead of application/octet-stream, here is some server-side Python code that works.
from google.cloud import storage

# bucket_name, destination_blob_name, and file_obj are assumed to be defined.
storage_client = storage.Client()
bucket = storage_client.bucket(bucket_name)
blob = bucket.blob(destination_blob_name)
# Rewind the file object to the beginning before streaming it to GCS.
blob.upload_from_file(file_obj, rewind=True)
# Set the content type on the object's metadata and persist it with patch().
blob.content_type = 'video/mp4'
blob.patch()
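An alternative that may also work with the same client library is to declare the content type at upload time, so the separate patch() call is not needed:

# Equivalent in effect: pass the content type while uploading.
blob.upload_from_file(file_obj, rewind=True, content_type='video/mp4')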
Here are some links that might help.
https://cloud.google.com/storage/docs/uploading-objects#storage-upload-object-python
https://stackoverflow.com/a/33320634/19829260
https://stackoverflow.com/a/64274097/19829260
NOTE: at the time of this writing, the Google Docs about editing metadata don't work for me because they say to set metadata but metadata seems to be read-only (see SO post https://stackoverflow.com/a/33320634/19829260)
https://cloud.google.com/storage/docs/viewing-editing-metadata#edit