Copy activity not able to copy any response from REST API in Azure Data Factory

I am using a REST API URL as the input, and I am trying to save the response to a SQL table. When I run the pipeline, it runs successfully, but it shows zero rows copied.
I tested the API in Postman and I am able to see the response data (9 MB).
Has anybody else hit this same issue? Please help.

I tried to reproduce this and faced the same problem: no records were inserted.
The problem occurs because the API returns its response as JSON, and the pipeline doesn't know which object value should be stored in which column.
To resolve this, use Mapping: import the schema and map the particular columns as below.
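As a rough illustration, the imported mapping might look something like this in the Copy activity JSON; the JSON paths and SQL column names (value, id, name, Id, Name) are hypothetical placeholders for your actual schema:

{
    "type": "Copy",
    "typeProperties": {
        "source": { "type": "RestSource" },
        "sink": { "type": "AzureSqlSink" },
        "translator": {
            "type": "TabularTranslator",
            "collectionReference": "$['value']",
            "mappings": [
                { "source": { "path": "$['id']" },   "sink": { "name": "Id" } },
                { "source": { "path": "$['name']" }, "sink": { "name": "Name" } }
            ]
        }
    }
}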
Output:

I think the intent here is to copy the response JSON into SQL, and if that's the case then we cannot do it with the Copy activity alone.
One way is to use a Web activity to call the API, and after that call a Stored Procedure activity and pass the response as an input parameter to the SP. The SP will insert the record into the table. But a 9 MB response is quite big; I doubt the Web activity can handle that.
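A minimal sketch of that pattern, assuming a hypothetical Web activity named Web1, a stored procedure dbo.usp_InsertApiResponse with a single string parameter, and a linked service AzureSqlLS:

{
    "name": "InsertApiResponse",
    "type": "SqlServerStoredProcedure",
    "dependsOn": [ { "activity": "Web1", "dependencyConditions": [ "Succeeded" ] } ],
    "linkedServiceName": { "referenceName": "AzureSqlLS", "type": "LinkedServiceReference" },
    "typeProperties": {
        "storedProcedureName": "dbo.usp_InsertApiResponse",
        "storedProcedureParameters": {
            "ResponseJson": {
                "value": { "value": "@string(activity('Web1').output)", "type": "Expression" },
                "type": "String"
            }
        }
    }
}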

Related

Azure Data Factory - REST Pagination rules

I'm trying to pull data from HubSpot into my SQL Server database through an Azure Data Factory pipeline using a REST dataset. I'm having problems setting up the right pagination rules. I've already spent a day on Google and MS guides, but I find it hard to get it working properly.
This is the source API. I am able to connect and pull the first set of 20 rows. The body returns an offset that can be used with vidOffset= in the next request.
I need to feed the vid-offset value from the body back into the next HTTP request. Also, the process needs to stop when has-more in the response is 'false'.
I tried to reproduce the same in my environment, and I got the below results.
First, I created a linked service with this URL: https://api.hubapi.com/contacts/v1/lists/all/contacts/all?hapikey=demo&vidOffset
Then I created the pagination end condition rule with $.has-more and the absolute URL, as sketched below.
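In JSON terms, the pagination rules on the REST source might look roughly like this; this is a sketch assuming the offset is fed back as the vidOffset query parameter, with $.vid-offset and $.has-more being the fields HubSpot returns in the body:

"source": {
    "type": "RestSource",
    "paginationRules": {
        "QueryParameters.vidOffset": "$.vid-offset",
        "EndCondition:$.has-more": "Const:false"
    }
}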
For demo purposes, I took a storage account as the sink.
The pipeline ran successfully; see the below image for reference.
For more information, refer to this MS document.

REST API Pagination in Azure Data Factory

I have a project scenario to get all values from an endpoint URL. I am using an ADF pipeline, but I'm having some issues with pagination.
To get the next set of values, I need to make each request with the PaginationCursor value from the current response body placed in the next request's header.
I have read that ADF supports the following case, which would be mine:
Next request's header = property value in current response body (ADF - Pagination support)
I don't know how to set the attributes so that the paginationCursor value from the current response body is used in the header of the next request.
Attributes for pagination in ADF
I tried to reproduce the above but was not successful. Instead, if you want to do it without pagination, you can try this approach (a sketch follows the steps below).
First, create a Web activity with any page URL of your API to get the total page count.
In a ForEach, create the array of page numbers using the count from the Web activity:
@range(1, activity('Web1').output.total_pages)
Inside the ForEach, use a Copy activity and pass the page number to the source REST dataset parameter, like ?page=@{item()}.
In the sink, also create a dataset parameter per page, with a value like APIdataset@{item()}.csv. This generates sink dataset names like APIdataset1.csv, APIdataset2.csv, ...
Now you can copy from your REST API without pagination.
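Put together, the ForEach might look roughly like this; the activity, dataset, and parameter names (Web1, RestApiDataset, BlobPageDataset, page, fileName) and the total_pages field are placeholders for whatever your API and datasets actually use:

{
    "name": "ForEachPage",
    "type": "ForEach",
    "dependsOn": [ { "activity": "Web1", "dependencyConditions": [ "Succeeded" ] } ],
    "typeProperties": {
        "items": {
            "value": "@range(1, activity('Web1').output.total_pages)",
            "type": "Expression"
        },
        "activities": [
            {
                "name": "CopyOnePage",
                "type": "Copy",
                "inputs": [
                    {
                        "referenceName": "RestApiDataset",
                        "type": "DatasetReference",
                        "parameters": { "page": { "value": "@item()", "type": "Expression" } }
                    }
                ],
                "outputs": [
                    {
                        "referenceName": "BlobPageDataset",
                        "type": "DatasetReference",
                        "parameters": { "fileName": "APIdataset@{item()}.csv" }
                    }
                ],
                "typeProperties": {
                    "source": { "type": "RestSource" },
                    "sink": { "type": "DelimitedTextSink" }
                }
            }
        ]
    }
}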
My repro for your reference:
Copy activity:
I was able to solve this problem with the following attributes.
Solution
In the Headers pagination rule I had to put the name of the header for the next call. In my case the name is PaginationCursor, and I got the value for this header from the current response body field called paginationCursor.
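In JSON terms, that pagination rule comes down to a single mapping in the REST source, with PaginationCursor and paginationCursor being the header and body field names from this particular API:

"paginationRules": {
    "Headers.PaginationCursor": "$.paginationCursor"
}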

REST API call from Copy activity

Hi, I am processing a set of ~50K records from a pipe-delimited flat file in Azure Data Factory and need to invoke a REST API call for each input record. So I am using a ForEach loop to access each record, and inside the loop I am using a Copy activity to invoke the REST API call.
My question is: can I invoke the REST API call in bulk for all the records at once, since the ForEach loop is slowing down pipeline execution? I want to remove the ForEach loop, and I also need to process the API JSON response and store it in an Azure SQL database.
Thanks
You will have to check the pagination properties so that you can decide how much payload you need to return from the source API:
https://learn.microsoft.com/en-us/azure/data-factory/connector-rest?tabs=data-factory#pagination-support
Also, if you need to store the API JSON response in Azure SQL, you can do so with built-in functions like OPENJSON and JSON_VALUE.
More details can be found in this link:
https://learn.microsoft.com/en-us/azure/azure-sql/database/json-features

429 error trying to hit an API from within an Azure Data Factory ForEach activity

We're trying to put together a proof of concept where we read data from an API and store it in a blob. We have a ForEach activity that loops through a file containing parameters that are used in an API call. We are trying to do this in parallel. The first API call works fine, but the second call returns a 429 error. Are we asking the impossible?
Error code 429 usually means too many requests. Inside the ForEach activity, use the Sequential execution option and see if that helps; a sketch is below.
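For reference, a ForEach configured for sequential execution looks roughly like this; the item source (LookupParameters) and the inner Web activity's URL are hypothetical:

{
    "name": "ForEachParameter",
    "type": "ForEach",
    "typeProperties": {
        "isSequential": true,
        "items": { "value": "@activity('LookupParameters').output.value", "type": "Expression" },
        "activities": [
            {
                "name": "CallApi",
                "type": "WebActivity",
                "typeProperties": {
                    "url": "https://api.example.com/items/@{item().id}",
                    "method": "GET"
                }
            }
        ]
    }
}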

Azure Data Factory: using the output of a REST Copy data activity in the next activity

I have a Copy data activity which copies data from REST to SQL Server. The REST endpoint returns a JSON response. I need another Web activity to run after the Copy data activity succeeds. This activity needs data from the previous REST API response (which is part of the Copy data). Any idea how we can achieve this?
I have tried using
@{activity('ACTIVITY_NAME').output.<json_field_from_response>}
I get the following error.
{
    "errorCode": "InvalidTemplate",
    "message": "The expression 'activity('ACTIVITY_NAME').output.batch_id' cannot be evaluated because property 'batch_id' doesn't exist, available properties are 'dataRead, dataWritten, rowsRead, rowsCopied, copyDuration, throughput, errors, effectiveIntegrationRuntime, usedDataIntegrationUnits, usedParallelCopies, executionDetails'.",
    "failureType": "UserError",
    "target": "Web1"
}
I am hoping there is some way in the dataset or pipeline to set a variable to be used later, but I am not able to find it. Thanks.
You could use a Lookup activity after the Copy activity and get the data from the Lookup output.
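A minimal sketch of that idea, assuming the Copy activity (here CopyRestToSql) landed the response in a hypothetical table dbo.ApiResponse with a batch_id column:

{
    "name": "LookupBatchId",
    "type": "Lookup",
    "dependsOn": [ { "activity": "CopyRestToSql", "dependencyConditions": [ "Succeeded" ] } ],
    "typeProperties": {
        "source": {
            "type": "AzureSqlSource",
            "sqlReaderQuery": "SELECT TOP 1 batch_id FROM dbo.ApiResponse ORDER BY load_time DESC"
        },
        "dataset": { "referenceName": "AzureSqlDataset", "type": "DatasetReference" },
        "firstRowOnly": true
    }
}

The follow-up Web activity can then reference @{activity('LookupBatchId').output.firstRow.batch_id} in its URL or body.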