I have a project scenario to get all values from an endpoint URL. I am using ADF Pipeline but I'm having some issues with pagination.
To get the following values, I need to make requests with the PaginationCursor value in the current body response in the following request header.
I have read that ADF supports the following case, which would be mine.
Next request’s header = property value in current response body ADF - Pagination support
I don't know how to use the following attributes in order to use the paginationCursor value from the current response body in the header of the next request.
Attributes for pagination in ADF
I tried to reproduce above but not successful. Instead, if you want to do it without pagination, you can try this approach.
First create a web activity with any page URL of your API to get the total number of pages count.
In ForEach create an array for page numbers using the count from web activity as
#range(1,activity('Web1').output.total_pages)
Inside ForEach use the copy activity and give the source REST dataset parameter for the page number like ?page=#{item()}.
In the sink dataset also, create a dataset for each page with the dataset parameter value like APIdataset#{item()}.csv. This generates the sink dataset names like APIdataset1.csv, APIdataset2.csv,...
Now, you can copy from your REST API without pagination.
My repro for your reference:
Copy activity:
I could solve this problem with the following attributes.
Solution
In the Headers I had to put the name of the header of the next call. In my case the name is PaginationCursor and I got the value of this header from the actual body response called paginationCursor.
Related
I am using input as rest api url .And I am trying to save the response to a sql table.When I run the pipeline the pipeline run successfully,But it is showing zero rows copied.
I tested the api in postman.I am able to see the reponse data (9 mb)
Anybody else got this same issue,Please help me
I tried to reproduce and faced similar problem Its not inserting any records.
The problem is causing due to API returns response in Json and pipeline doesn't know which object value should store in which column.
To resolve this use Mapping. import the scma and amp the paricular columns as below:
Output:
I think the intend here is copy the response json to SQL and if thats the case then we cannot do that with copy activity .
One way is you can use a web activity to call the API and after that you can call a Stored proc activity and pass the response as a input paramter to the SP . The SP will insert the record in the table . But 9MB of response is too big , i doubt if the web activity can handle that .
I have a API source in an ADF DataFlow task. The API source gives me the current page and the toatl number of pages in the body of the response. I want to use that information to paginate through my API source. I'm able to paginate through it just fine outside of a DataFlow activity using the range function. The issue is that the Rest transformation in a DataFlow activity does not support the range function. I've been trying to use the AbsoluteUrl function plus an expression to do add one to the current page returned by the body but either pagination does not accept expressions or I cannot figure out the syntax
I have a url like this:
BaseURL/fabricationcodes?facets=relatedArticles:Not%20Empty&page={PageNumber}.
In this example my rest linked service URL has everything I need minus the &page=pageNumber. So I'm trying to add that part with the key/value pair function of AbsoluteUrl. The Key being &page= and the value should be currentPage +1. My desire is for it to get the first page, page 0 and then add +1 to that to formulate the next pages url. the end condition being when body.totalPages == body.currentPage
I've tried a bunch of different expression formaulations but none seem to work and debugging in a Data flow is tough b/c the logging and error messaging is poor
What I have right now.
As data flow don't support Range option or you cannot use dynamic expression to get page from API response.
To work around the issue, you can use Data Flow activity within ForEach loop using range function in dynamic expression.
First take a web activity and pass the URL of the Rest API as below Ito get the total no of Pages from API response
then take a for each activity to iterate on API like pagination give the Dynamic expression as #range(1,activity('Web1').output.total_pages)
I will iterate the API till the respective range in sequential manner.
create parameter with type string in source DataSource.
give that parameter as dynamic value in relative URL.
after this gave parameter value as ?page=#{item()} to give the no coming from range to the page.
OUTPUT:
I am trying to incrementally load data from a ServiceNow data source into an Azure SQL table as per the guide from Microsoft https://learn.microsoft.com/en-us/azure/data-factory/tutorial-incremental-copy-portal. This uses 2 lookup activities to find 2 dates, I then filter the source data in the copy activity to only return and insert rows where the sys_updated_on date is between the 2 lookup values.
I would now like to lookup a value from a REST API dataset. However, I do not get the option to choose my REST dataset in the lookup activity. It just does not appear as an option. The REST URL is setup to return me one date value which I need to pass into the WHERE clause of my source in the copy data. If I cannot retrieve this value in the lookup, how else can I pass it to my WHERE clause?
Currently I use activity('LookupOldWaterMarkActivity').output.firstRow.watermarkvalue and
convertTimeZone(activity('LookupNewWaterMarkActivity').output.firstRow.watermarkvalue
Thanks
As per the Microsoft official document, the Rest dataset is supported in lookup activity.
You can post feedback from an Azure Data factory or raise a support request for fixing the issue.
As a workaround, you can create an HTTP dataset with JSON format and use the output value in later activities.
I'm trying to list all the pipelines stored in the Azure Data Factory instance. I want to use Azure Data Factory REST API v2, Pipelines - List By Factory method.
I noticed the "nextLink" field in the PipelineListResponse, which contains the link to the next page of results, if any remaining results exist.
My question is, how many PipelineResources are sent in a single page of the response?
I didn't find any documentation regarding this question.
How many PipelineResources are sent in a single page of the response?
In normal, the list operation response includes the nextLink property when the list operation returns more than 1,000 items. For more details, please refer to here.
I have a resource which basically represents a number. I have two possible updates for this number: Set the number to a specific value or add a value to it. Now I'm confused if I can use the query string part of the URL to specify the desired behavior.
Something like this:
/resource/{id}/?mode=add
/resource/{id}/?mode=set
Or is there an alternative way two represent to update strategies for a rest resource?
An Alternative would be to extend the request body with this information but this since strange, since the request data should contain the data and not "meta information" for the request itself - as far as I understand REST apis.
The project is an ordinary angularjs (client) and java (server) project.