Azure Data Factory: Pagination in Data Flow with Rest API Source - azure-data-factory

I have an API source in an ADF Data Flow task. The API returns the current page and the total number of pages in the body of the response, and I want to use that information to paginate through the source. I can paginate through it just fine outside of a Data Flow activity using the range function. The issue is that the REST transformation in a Data Flow activity does not support the range function. I've been trying to use the AbsoluteUrl function plus an expression to add one to the current page returned in the body, but either pagination does not accept expressions or I cannot figure out the syntax.
I have a url like this:
BaseURL/fabricationcodes?facets=relatedArticles:Not%20Empty&page={PageNumber}.
In this example my REST linked service URL has everything I need except the &page=pageNumber part, so I'm trying to add that part with the key/value pair option of AbsoluteUrl. The key is &page= and the value should be currentPage + 1. I want it to get the first page (page 0) and then add 1 to that to build the next page's URL, with the end condition being body.totalPages == body.currentPage.
I've tried a bunch of different expression formulations but none seem to work, and debugging in a Data Flow is tough because the logging and error messaging is poor.
What I have right now.

Data Flow does not support the Range option, and you cannot use a dynamic expression to read the page from the API response.
To work around the issue, you can run the Data Flow activity inside a ForEach loop, using the range function in a dynamic expression.
First, add a Web activity and pass it the URL of the REST API to get the total number of pages from the API response.
Then add a ForEach activity to iterate over the API pages, and give it the dynamic expression @range(1,activity('Web1').output.total_pages)
It will iterate over the API through that range sequentially.
Create a parameter of type string in the source dataset.
Use that parameter as the dynamic value in the relative URL.
After this, set the parameter value to ?page=@{item()} so that each number coming from the range is passed as the page.
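The loop above can be sketched outside ADF to check which page numbers it will request. This is only an illustration in Python, not ADF code: `adf_range` and `page_queries` are hypothetical helper names, and `total_pages` stands in for `activity('Web1').output.total_pages`. The one detail it relies on is that ADF's `range(startIndex, count)` takes a count as its second argument, not an end value.

```python
from typing import List


def adf_range(start: int, count: int) -> List[int]:
    """Mimic ADF's range(startIndex, count): `count` integers beginning at `start`."""
    return list(range(start, start + count))


def page_queries(total_pages: int, first_page: int = 1) -> List[str]:
    """Build the value handed to the dataset parameter for each ForEach item."""
    return [f"?page={page}" for page in adf_range(first_page, total_pages)]


if __name__ == "__main__":
    # total_pages would come from the Web activity output in the pipeline.
    for query in page_queries(total_pages=3):
        print(query)
```

If the API numbers its pages from 0, as in the question, the same idea would start the range at 0 (`first_page=0` here), which still yields `total_pages` items.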
OUTPUT:

Related

REST API Pagination in Azure Data Factory

I have a project scenario to get all values from an endpoint URL. I am using ADF Pipeline but I'm having some issues with pagination.
To get the next set of values, I need to send the paginationCursor value from the current response body in the header of the next request.
I have read that ADF supports the following case, which would be mine.
Next request’s header = property value in current response body ADF - Pagination support
I don't know how to use the following attributes in order to use the paginationCursor value from the current response body in the header of the next request.
Attributes for pagination in ADF
I tried to reproduce the above but was not successful. If you want to do it without pagination instead, you can try this approach.
First, create a Web activity with any page URL of your API to get the total number of pages.
In the ForEach, create an array of page numbers using the count from the Web activity:
@range(1,activity('Web1').output.total_pages)
Inside the ForEach, use a Copy activity and set the source REST dataset parameter for the page number, like ?page=@{item()}.
In the sink, also parameterize the dataset so that each page gets its own file, with a dataset parameter value like APIdataset@{item()}.csv. This generates sink file names like APIdataset1.csv, APIdataset2.csv, ...
Now you can copy from your REST API without pagination.
My repro for your reference:
Copy activity:
I could solve this problem with the following attributes.
Solution
In Headers, I had to put the name of the header used in the next call. In my case the header name is PaginationCursor, and its value comes from the paginationCursor property in the current response body.
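The rule configured above (next request's header = property value in the current response body) can be sketched as a plain loop. This is an illustration in Python rather than ADF configuration. `iterate_pages`, `fake_fetch`, and the `items` field are hypothetical; only the `PaginationCursor` header and the `paginationCursor` body property come from the post. The sketch assumes the last page simply omits the cursor.

```python
from typing import Callable, Dict, Iterator

# A fetcher takes the request headers and returns the parsed JSON body.
Fetcher = Callable[[Dict[str, str]], dict]


def iterate_pages(fetch: Fetcher, max_pages: int = 1000) -> Iterator[dict]:
    """Yield response bodies, feeding each body's cursor into the next request header."""
    headers: Dict[str, str] = {}
    for _ in range(max_pages):  # hard cap so a repeating cursor cannot loop forever
        body = fetch(headers)
        yield body
        cursor = body.get("paginationCursor")
        if not cursor:
            return  # assumption: no cursor in the body means this was the last page
        headers = {"PaginationCursor": cursor}


# Stand-in for the real endpoint: three pages keyed by the cursor that requests them.
_FAKE_PAGES = {
    None: {"items": [1, 2], "paginationCursor": "abc"},
    "abc": {"items": [3], "paginationCursor": "def"},
    "def": {"items": [4]},
}


def fake_fetch(headers: Dict[str, str]) -> dict:
    return _FAKE_PAGES[headers.get("PaginationCursor")]


if __name__ == "__main__":
    items = [item for body in iterate_pages(fake_fetch) for item in body["items"]]
    print(items)
```

Passing the fetcher in as a function keeps the loop testable without network access; a real fetcher would issue the HTTP request with the given headers.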

Simple way to call a REST API from an excel cell, hopefully without VBA

I have a column of hundreds of parcel tracking numbers, and I want to use the parcel carrier's REST API to pull some of the tracking information (promised delivery date, and actual delivery date) and then use conventional spreadsheet techniques on it.
Is there a simple way to put the REST call in a formula in an Excel cell? In pseudocode it would be: call this REST URL, using the tracking number in the cell to the left as an argument, here's the authentication user/password, and return the value in the response field 'deliveryDate'.
I am intrigued by Power Query and I figured out how to use Power Query to do it for a static REST url with the tracking number manually filled in, but I don't know how to make PQ do it for hundreds of items.
Or maybe there is an online tool for building this function with Lego blocks for a caveman like myself?

Specify Global Time Range on Azure DevOps API Calls

I am using the Azure DevOps API to put together a report of build metrics. I am looking for a way to add a global time range query parameter to my calls, so that I am only querying items that occur within the range of time x and time y.
It appears some specific calls contain a query parameter such as a "creationDate" or "startTime"; however I am unable to find a way to globally limit the date range of API calls I make. I am using a large selection of REST endpoints and many either do not contain a time range query parameter or use a different implementation method to accomplish the same thing.
In short, I am looking for a way to globally limit the Azure DevOps REST API with either a query parameter or an API key setting to return results within a specific range. Is this possible?
I am looking for a way to specify a time range such as "get all builds that ran within the last week" or "get all pipelines created within the last month" without modifying my code to parse these elements from each response.
If so, I agree with Krzysztof Madej: if the REST API does not provide parameters to specify the date range, there is no out-of-the-box way to do this.
The Builds - List REST API provides parameters such as maxTime and minTime to specify the date range, while the Pipelines - List REST API does not provide such parameters.
So, if you do not want to modify your code to parse these elements from the response, I think this is currently impossible.
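For the endpoints that do accept a range, such as Builds - List with minTime and maxTime, a "last week" query might be built as in the sketch below. The organization and project names, the `api-version` value, and the helper names are assumptions for illustration. Which build timestamp the range is matched against depends on the endpoint's query ordering, so check that against the API reference.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlencode


def _utc_stamp(moment: datetime) -> str:
    """Format a timezone-aware datetime as an ISO 8601 UTC timestamp."""
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def builds_url(organization: str, project: str,
               min_time: datetime, max_time: datetime,
               api_version: str = "7.1") -> str:
    """Build a Builds - List URL restricted to the window [min_time, max_time]."""
    query = urlencode({
        "minTime": _utc_stamp(min_time),
        "maxTime": _utc_stamp(max_time),
        "api-version": api_version,  # assumed version; use the one you target
    })
    return f"https://dev.azure.com/{organization}/{project}/_apis/build/builds?{query}"


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    # "my-org" and "my-project" are placeholders.
    print(builds_url("my-org", "my-project", now - timedelta(days=7), now))
```

This only covers endpoints that expose such parameters; for the others, filtering still has to happen on the client after the response is received.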

Include multiple query parameters in HCM cloud rest Get call

I have an HCM Cloud instance. I'm working with the REST APIs provided by the cloud.
I want to get an employee by matching both PersonNumber and DateOfBirth.
But whatever I try, the output is based only on the first parameter. The second is not even checked.
Can anyone help?
This is the rest url I'm using
https://host:port/hcmCoreApi/resources/11.12.1.0/emps/?q=DateOfBirth=1991-09-19&PersonNumber=240
To pass multiple search items in a query parameter in a REST call, the structure should be as follows:
https://host:port/hcmCoreApi/resources/11.12.1.0/emps/?q=DateOfBirth='1991-09-19'&PersonNumber=240
Basically, single quotes ('') are required around string inputs, while integers can be passed directly.
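The quoting rule can be expressed as a small helper. This is a sketch that only reproduces the URL shape shown in the answer above: `format_value` and `emps_url` are hypothetical names, and the way the conditions are joined is copied from that URL rather than checked against the service.

```python
from typing import Mapping, Union


def format_value(value: Union[str, int]) -> str:
    """Wrap string inputs in single quotes; pass integers through unchanged."""
    return f"'{value}'" if isinstance(value, str) else str(value)


def emps_url(base: str, conditions: Mapping[str, Union[str, int]]) -> str:
    """Build the emps query URL in the same shape as the answer's example."""
    pairs = [f"{name}={format_value(value)}" for name, value in conditions.items()]
    return f"{base}/emps/?q=" + "&".join(pairs)


if __name__ == "__main__":
    base = "https://host:port/hcmCoreApi/resources/11.12.1.0"
    print(emps_url(base, {"DateOfBirth": "1991-09-19", "PersonNumber": 240}))
```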

Possible to specify date_preset with insights edge in Facebook Ads API?

For the Marketing API, I know that I'm able to make one call to retrieve all of the adsets from a certain account along with their insights, but am I able to specify the date_preset for the insights edge in that same call?
For example, the following gives me lifetime insights stats:
/v2.4/{accountID}/adcampaigns?fields=insights
To be clear - I know this is possible to retrieve by making separate calls for each adset id (where I know I can specify the date_preset); instead, I'd like to do this via the call where I get a long list of the ad sets plus their insights details in one go.
Yes, this is possible using query expansion; however, you probably should not do it this way.
Using query expansion results in multiple requests being executed in one HTTP call, in this case one to get all the adcampaigns, and then N requests where N is the number of adcampaigns returned. This will in turn affect your rate limiting.
The most efficient way to request all insights for all adcampaigns (ad sets) is instead to request them at the account level, specifying aggregation level:
/v2.4/act_{ADACCOUNT_ID}/insights?date_preset=last_7_days&level=campaign
This requires just one request, or as many requests as there are pages of results.
If you really want to achieve this with query expansion, you can do the following for example:
/v2.4/act_{ADACCOUNT_ID}/adcampaigns?fields=insights.date_preset(last_30_days).time_increment(all_days)
You can see the parameters to insights that would normally be query parameters of the form param_name=param_value are now in the form of param_name(param_value).
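The rewrite from `param_name=param_value` to `param_name(param_value)` is mechanical, so it can be sketched as a small helper. `expand_field` is a hypothetical name, and the sketch only builds the `fields` string; it does not call the API.

```python
from typing import Mapping


def expand_field(edge: str, params: Mapping[str, str]) -> str:
    """Turn query-style parameters into the field expansion form edge.name(value)..."""
    return edge + "".join(f".{name}({value})" for name, value in params.items())


if __name__ == "__main__":
    fields = expand_field(
        "insights",
        {"date_preset": "last_30_days", "time_increment": "all_days"},
    )
    print(f"/v2.4/act_{{ADACCOUNT_ID}}/adcampaigns?fields={fields}")
```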
To specify the date_preset, here is the correct format. It's important to use insights as the edge to get the date_preset filtering.
/v2.10/act_{ADACCOUNT_ID}/insights?fields=impressions,clicks,ctr,unique_clicks,unique_ctr,spend,cpc&date_preset=last_3d
The above was tested with the latest Graph API version (2.10) as of now. For more info on the date_preset values, refer to their API docs.
https://developers.facebook.com/docs/marketing-api/insights/parameters