Rest API call from copy activity - azure-data-factory

Hi, I am processing a set of ~50K records from a pipe-delimited flat file in Azure Data Factory and need to invoke a REST API call for each input record. So I am using a ForEach loop to access each record, and inside the loop I use a Copy activity to invoke the REST API call.
My question is: can I invoke the REST API call in bulk for all the records at once? The ForEach loop is slowing down the pipeline execution. I want to remove the ForEach loop, and I also need to process the API's JSON response and store it in an Azure SQL database.
Thanks

You will have to check the pagination properties so that you can decide how much payload the source API should return:
https://learn.microsoft.com/en-us/azure/data-factory/connector-rest?tabs=data-factory#pagination-support
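For example, a hedged sketch of a copy activity source using pagination rules (assuming the API returns the next page's URL in a nextLink field; adjust the JSONPath to your API's response shape):

    "source": {
        "type": "RestSource",
        "paginationRules": {
            "AbsoluteUrl": "$.nextLink"
        }
    }

The connector then follows each next-page URL automatically instead of you looping per record; the doc above also covers rules like QueryParameters.<name> and Headers.<name> for APIs that page differently.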
Also, if you need to store the API's JSON response in Azure SQL, you can do so with built-in functions such as OPENJSON, JSON_VALUE, and JSON_QUERY.
More details can be found in this link:
https://learn.microsoft.com/en-us/azure/azure-sql/database/json-features
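For instance, a minimal sketch, assuming the raw response is stored in an NVARCHAR(MAX) column named ResponseBody in a hypothetical dbo.ApiResponses table:

    -- Extract scalar values and JSON fragments from stored JSON
    SELECT
        JSON_VALUE(ResponseBody, '$.status') AS ApiStatus,
        JSON_QUERY(ResponseBody, '$.items')  AS ItemsArray
    FROM dbo.ApiResponses
    WHERE ISJSON(ResponseBody) = 1;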

Related

Copy Activity Not able to copy any response from Rest api in Azure Data factory

I am using a REST API URL as input and trying to save the response to a SQL table. When I run the pipeline, it completes successfully, but it shows zero rows copied.
I tested the API in Postman and I am able to see the response data (9 MB).
Has anybody else run into this same issue? Please help.
I tried to reproduce this and faced the same problem: no records were inserted.
The problem is caused by the API returning its response as JSON, so the pipeline doesn't know which object value should be stored in which column.
To resolve this, use Mapping: import the schema and map the particular columns, as in the sketch below.
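A hedged sketch of what such a mapping can look like in the copy activity's translator (the $['items'], ['id'], and ['name'] paths and the Id/Name sink columns are assumptions about the response and table shape):

    "translator": {
        "type": "TabularTranslator",
        "collectionReference": "$['items']",
        "mappings": [
            { "source": { "path": "['id']" },   "sink": { "name": "Id" } },
            { "source": { "path": "['name']" }, "sink": { "name": "Name" } }
        ]
    }

collectionReference tells the copy activity which JSON array to iterate, so each array element becomes one row in the sink.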
I think the intent here is to copy the response JSON to SQL, and if that's the case, we cannot do that with a Copy activity alone.
One way is to use a Web activity to call the API, then call a Stored Procedure activity and pass the response as an input parameter to the stored procedure, which inserts the record into the table. But a 9 MB response is quite big; I doubt the Web activity can handle that.
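A minimal sketch of such a stored procedure, assuming a hypothetical dbo.ApiRecords target table and an API response that is a JSON array of objects with id and name fields:

    -- Shred the JSON passed in by the Stored Procedure activity into rows
    CREATE PROCEDURE dbo.usp_InsertApiResponse
        @ResponseJson NVARCHAR(MAX)
    AS
    BEGIN
        INSERT INTO dbo.ApiRecords (Id, Name)
        SELECT Id, Name
        FROM OPENJSON(@ResponseJson)
        WITH (
            Id   INT           '$.id',
            Name NVARCHAR(200) '$.name'
        );
    END;

In the Stored Procedure activity, the parameter value could then be set with an expression like @string(activity('CallApi').output), where CallApi is the hypothetical Web activity name.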

How to retrieve the continuationToken from Rest API while using Azure Data Factory?

I'm trying to use Azure Data Factory's Copy activity to copy data from an API to Azure Blob Storage. I have set up the source and sink correctly, so when I trigger the pipeline it pulls and loads the first batch of data, but I am struggling with pagination.
When I trigger the pipeline, it loads the first page correctly, but afterward it doesn't return the next continuation token for fetching more data from the API. If I use an Until or ForEach activity, the pipeline copies the data for the same continuation token endlessly until it times out.
When I run the REST API call in Postman, it returns the data and also the next continuation token.
The continuation token will be like 0000xxxx-00000-xxx00-000000xx000000 and the next continuation token is like 0000xxyy-00000-xxx00-000000yy000000.
My goal is to retrieve the data from the REST API using a continuation token, retrieve the next continuation token so that I can fetch the next page of data until the continuation token is null, and store it all in Azure Blob Storage with an Azure Data Factory pipeline.
I am able to retrieve the access token from the REST API, but only the first page of data.
Is there any way to solve this issue? Please let me know.
Assuming that your web activity gives you an array of elements to be processed, you can do the following:
Chain a ForEach activity to iterate through the array.
Inside the ForEach activity, set its items property to something like (assuming the array is named outputArray): @activity('webActivity').output.outputArray
Inside the ForEach activity, you can have a parameterized web activity that runs once per item in the array. Each item can be accessed using the expression:
@item()
Please check this link: https://imgur.com/a/LE1CHvp
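A hedged sketch of that setup in pipeline JSON (the webActivity name and outputArray property come from the example above; the URL and the token field are hypothetical):

    {
        "name": "ProcessEachItem",
        "type": "ForEach",
        "typeProperties": {
            "items": {
                "value": "@activity('webActivity').output.outputArray",
                "type": "Expression"
            },
            "activities": [
                {
                    "name": "CallApiForItem",
                    "type": "WebActivity",
                    "typeProperties": {
                        "method": "GET",
                        "url": "@concat('https://api.example.com/data?continuationToken=', item().token)"
                    }
                }
            ]
        }
    }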

How to use REST API as source for Lookup activity in Azure Data Factory

I am trying to incrementally load data from a ServiceNow data source into an Azure SQL table, as per the Microsoft guide https://learn.microsoft.com/en-us/azure/data-factory/tutorial-incremental-copy-portal. This uses two Lookup activities to find two dates; I then filter the source data in the Copy activity to only return and insert rows where the sys_updated_on date is between the two lookup values.
I would now like to look up a value from a REST API dataset. However, I do not get the option to choose my REST dataset in the Lookup activity; it just does not appear as an option. The REST URL is set up to return one date value, which I need to pass into the WHERE clause of my source query in the Copy activity. If I cannot retrieve this value with a Lookup, how else can I pass it to my WHERE clause?
Currently I use activity('LookupOldWaterMarkActivity').output.firstRow.watermarkvalue and
convertTimeZone(activity('LookupNewWaterMarkActivity').output.firstRow.watermarkvalue
Thanks
As per the official Microsoft documentation, the REST dataset is supported in the Lookup activity.
You can post feedback from within Azure Data Factory or raise a support request to get the issue fixed.
As a workaround, you can create an HTTP dataset with JSON format and use the output value in later activities.
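A minimal sketch of such a dataset (the linked service name and relative URL are hypothetical):

    {
        "name": "WatermarkHttpDataset",
        "properties": {
            "type": "Json",
            "linkedServiceName": {
                "referenceName": "HttpLinkedService",
                "type": "LinkedServiceReference"
            },
            "typeProperties": {
                "location": {
                    "type": "HttpServerLocation",
                    "relativeUrl": "api/watermark"
                }
            }
        }
    }

A Lookup activity over this dataset can then feed the value into your WHERE clause the same way as the watermark lookups, e.g. @activity('LookupWatermark').output.firstRow.watermarkvalue.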

429 error trying to hit an API from within an Azure Data Factory For Each activity

We're trying to put together a proof of concept where we read data from an API and store it in a blob. We have a ForEach activity that loops through a file containing parameters used in an API call, and we are trying to run the calls in parallel. The first API call works fine, but the second call returns a 429 error. Are we asking the impossible?
Error code 429 usually means too many requests. In the ForEach activity, use the Sequential execution option and see if that helps.
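A hedged sketch of the relevant ForEach settings (activity names are hypothetical; the inner API-call activity is omitted):

    {
        "name": "CallApiPerParameter",
        "type": "ForEach",
        "typeProperties": {
            "isSequential": true,
            "items": {
                "value": "@activity('LookupParameters').output.value",
                "type": "Expression"
            },
            "activities": []
        }
    }

If fully sequential execution is too slow, an alternative is to leave isSequential off and set a small batchCount instead, so the number of parallel calls stays under the API's rate limit.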

Azure data factory: From Salesforce to REST API

I have created a Data Factory source for Salesforce, from which I query leads, and I then want to pass each lead's email ID as an argument (in a POST request) to a REST endpoint.
I am not sure what sink I should use in the pipeline, and if it is an HTTP file dataset, how do I pass the email ID from source to sink in the argument?
The only way I know to surface data in ADF itself is through a Lookup activity. You can then iterate through the results of the lookup using a ForEach activity. Reference the result of the lookup in the ForEach items parameter:
@activity('nameoflookupactivity').output.value
If you need to send multiple IDs at once, I think the only way would be to concatenate your IDs in a sink like a SQL database. You would first have to copy your IDs to a table in SQL. Then, in Azure SQL DB or SQL Server 2017, you could use the following query:
SELECT STRING_AGG([salesforce_Id], ',') AS Ids FROM salesforceLeads;
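A Lookup activity running that query would then expose the concatenated list, which you could embed in the Web activity's POST body with an expression along these lines (LookupIds and the leadIds JSON property are hypothetical; Ids matches the column alias above):

    @concat('{"leadIds": "', activity('LookupIds').output.firstRow.Ids, '"}')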
The ADF tutorial for incrementally loading multiple tables discusses the ForEach activity extensively; you can find it here:
ADF Incrementally Loading multiple Tables Tutorial
For more information about the STRING_AGG function, check out: STRING_AGG Documentation
Thanks,
Jan