Data Factory can't download CSV file from web API with Basic Auth - azure-data-factory

I'm trying to download a CSV file from a website in Data Factory, using the HTTP connector as my source linked service in a copy activity. It's basically a web call to a URL that looks like https://www.mywebsite.org/api/entityname.csv?fields=:all&paging=false.
The website uses basic authentication. I have tested manually by entering the URL in a browser and supplying the credentials, and everything works fine. I have also used the REST connector in a copy activity to download the data as a JSON file (same URL, just without the ".csv"), and that works fine. But something about the authentication in the HTTP connector is different and causing issues. When I execute my copy activity, it downloads a CSV file that contains the HTML for the login page of the source website.
While searching, I did come across a GitHub issue on the docs suggesting that the basic auth header is not sent initially, and that this may be causing the issue.
As I have it now, the authentication is defined in the linked service. I'm hoping that maybe I can add something to the Additional Headers or Request Body properties of the source in my copy activity to make this work, but I haven't found the right thing yet.
Suggestions of things to try or code samples of a working copy activity using the HTTP connector and basic auth would be much appreciated.

The HTTP connector first sends the request without credentials and expects the API to return a 401 Unauthorized challenge; only then does it retry the request with the basic auth credentials. If the API doesn't issue that challenge (and instead, say, redirects to a login page), the connector never uses the credentials provided in the HTTP linked service.
If that is the case, go to the copy activity source, and in the additional headers property add Authorization: Basic followed by the base64 encoded string of username:password. It should look something like this (where the string at the end is the encoded username:password):
Authorization: Basic ZxN0b2njFasdfkVEH1fU2GM=
It's best if that isn't hard coded into the copy activity but is retrieved from Key Vault and passed as secure input to the copy activity.
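If you need to produce that encoded string yourself, here is a minimal Python sketch (the username and password are placeholders for your real credentials):

import base64

# Build the value for the copy activity's Additional Headers property.
# "username" and "password" stand in for the real API credentials.
credentials = "username:password"
token = base64.b64encode(credentials.encode("utf-8")).decode("ascii")
print("Authorization: Basic " + token)
# -> Authorization: Basic dXNlcm5hbWU6cGFzc3dvcmQ=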

I suggest you try the REST connector instead of the HTTP one. It supports Basic as the authentication type, and I have verified it against a test endpoint on httpbin.org.
Above is the configuration for the REST linked service. Once you have created a dataset connected to this linked service, you can include it in your copy activity.
Once the pipeline executes, the content of the REST response will be saved in the specified file.
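If you want to reproduce that httpbin.org verification outside of ADF, a quick Python sketch (the user/passwd pair is arbitrary, since httpbin simply checks the credentials against the ones in the URL):

import requests

# requests sends the Authorization: Basic header preemptively, which is
# effectively what the REST connector's Basic authentication type does.
resp = requests.get("https://httpbin.org/basic-auth/user/passwd",
                    auth=("user", "passwd"))
print(resp.status_code)  # 200 when the credentials match the URL
print(resp.json())       # {'authenticated': True, 'user': 'user'}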

Related

REST Client extension - API response as HTML

I wanted to use VS Code's REST Client extension for testing purposes, so I used a curl command for an existing API running on my local machine. But instead of JSON, I got HTML as the response. The curl command works as expected in the terminal, but not with the extension.
The same behaviour occurs with another extension, named Thunder Client.
Postman gets JSON responses for the same API, so I believe the issue lies within VS Code itself; I just don't know how to resolve it.
According to this article:
https://softwareengineering.stackexchange.com/questions/207835/is-it-ok-to-return-html-from-a-json-api
If you have declared that you only accept one format in the header, then the service should only send back that format or throw an error. If you have not included an Accept header, the service may send back whatever it likes.
Check what is in the Accept header:
But also check how Thunder Client translates the call in PowerShell:
I see that in my call the response is translated to JSON. I would guess that most REST clients assume users want to work in JSON. Maybe that's what your two REST clients are doing?
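As an illustration, here is a small Python sketch of making the Accept header explicit (the URL is a placeholder for the local API in question):

import requests

# Ask explicitly for JSON; a well-behaved API should then return JSON
# or an error rather than silently falling back to HTML.
resp = requests.get("http://localhost:3000/api/items",
                    headers={"Accept": "application/json"})
print(resp.headers.get("Content-Type"))  # what the server actually sent
print(resp.text[:200])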

How can I use REST API authentication in Mendix?

I have designed a REST API service (with Bonita) to which I can connect perfectly well with Postman, using the following parameters:
By the way, the x-www-form-urlencoded option that is selected comes from the Content-Type: application/x-www-form-urlencoded header, which is not displayed in my screenshot. The official Bonita specification states that this header is needed, and I always get a 200 OK status code as the answer.
How can I specify an equivalent request with the body part in a Mendix Call REST service in a microflow? Here is what I have so far:
I guess the body part should be specified in the Request tab, but I just don't know how to do it properly. I always get the following error message for my connector, which means that, whatever I specify, the username is not taken into account:
An error has occurred while handling the request. [User 'Anonymous_69a378ed-bb56-4183-ae71-c9ead783db1f' with session id '5fefb6ad-XXXX-XXXX-XXXX-XXXXXXXXb34f' and roles 'Administrator']
I finally found that the proxy setting was the actual problem. It was set at the project scope, and simply clicking No proxy in the General tab did the trick! (Both services are hosted on my local machine so far.)
I then just had to fill in the dedicated Authentication field in the HTTP Headers tab with the correct credentials to finally log in to my Bonita service.
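For reference, a rough Python equivalent of the Postman call, assuming Bonita's documented login service; the host, port, and credentials below are placeholders:

import requests

# Form-encoded POST to Bonita's login service; the field names follow
# Bonita's documented login API, everything else is a placeholder.
resp = requests.post(
    "http://localhost:8080/bonita/loginservice",
    data={"username": "walter.bates", "password": "bpm", "redirect": "false"},
    headers={"Content-Type": "application/x-www-form-urlencoded"},
)
print(resp.status_code)         # expect a 2xx status on a successful login
print(resp.cookies.get_dict())  # Bonita hands back session cookies here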

Data Factory v2 - connecting using REST

The aim is to connect to a public REST API using ADF. It's my first stab at sending requests to a REST API in ADF. The API is the Companies House ('CH') government website's API in England.
I have created an account and obtained a key. Apparently it uses basic authentication, where the user name is the API key and the password is ignored (CH note on authentication).
I want to explore the contents of the 'Search all' API (CH note on Search All) and copy the results to Blob Storage.
I therefore set the linked service to use REST as below. The obfuscated user name is the key I obtained from CH; the password is just the key repeated, as their documentation states they ignore the password:
I then added a REST dataset referencing this linked service:
And the testing of the connection works fine.
Problems then arise in the copy data task: I get an 'Invalid Authorization Header' error both when previewing and when I attempt a copy to blob:
I'd be grateful for pointers on where I'm going wrong.
I can't reproduce your auth error, but I notice that you are trying to pass some parameters for your GET request in the Request Body.
I think you need to add the parameters in the relativeUrl property:
A relative URL to the resource that contains the data. When this property isn't specified, only the URL that's specified in the linked service definition is used. The HTTP connector copies data from the combined URL: [URL specified in linked service]/[relative URL specified in dataset].
Also, I suggest checking that you are using the correct REST API format for the Search API. There are no other special features in the ADF REST connector: just make sure the GET request works locally, then duplicate it.
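One way to do that local check in Python before wiring it into ADF; the host below is Companies House's current public API host, and the key is a placeholder (in ADF, the host would go in the linked service URL and the query path into relativeUrl):

import requests

# Companies House basic auth: the API key is the user name and the
# password is ignored, so an empty string is fine.
API_KEY = "your-api-key"  # placeholder
resp = requests.get(
    "https://api.company-information.service.gov.uk/search",
    params={"q": "tesco"},
    auth=(API_KEY, ""),
)
print(resp.status_code)
print(resp.json().get("total_results"))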

How do I know Splunk REST API Base URL?

We have Splunk deployed at https://splunkit.corp.company.com (URL modified), and we are able to access the Splunk Web home page at https://splunkit.corp.company.com/en-US/app/launcher/home (URL modified).
I am building a dashboard application which uses the JSON data provided by Splunk REST services.
I have gone through the documentation and the REST endpoints described there. From those links I learned:
1. Make a POST request to services/auth/login with the username and password. This returns a session key, which is used in further API calls.
2. Make a POST request to services/search/jobs to create a search. This returns a search id.
3. Poll services/search/jobs/<search id> until the search is complete.
4. Once the search is complete, retrieve the results from services/search/jobs/<search id>/results.
The problem I am facing here is that I don't know what the base URL is. I tried constructing https://splunkit.corp.company.com/en-US/services/auth/login and so on, but it's not working.
Any help appreciated. Thanks.
I had the same question earlier. Here is a workaround to find out the REST API base URL; I found this solution by accident, in fact.
1. In Firefox, open the Web Developer / Network tool to inspect the URLs exchanged between your local computer and the Splunk server.
2. Log on to Splunk via the web interface.
3. Assuming you have finished a search beforehand, there should be a job stored on the server already. Click the Activity / Jobs link at the top right of the window.
4. A list of jobs will appear. Choose any job and click the Job / Delete Job button; the job's search result will be deleted.
5. In the Web Developer tool, inspect the URL of that delete request.
For me, I got a URL that looks like:
https://the-company-splunk-server.com/en-US/splunkd/__raw/services/search/jobs/scheduler_search_RMD554b7a649e94cdf69_at_1526886000_58534?output_mode=json
The top secret is: the URL up to and including /services/ is the REST API base URL. In this case, the base URL is:
https://the-company-splunk-server.com/en-US/splunkd/__raw/services/
Test the base URL
We can try this base URL for a login with curl:
curl --insecure https://the-company-splunk-server.com/en-US/splunkd/__raw/services/auth/login -d username=your-user -d password=your-password
We got the following result:
<response>
<sessionKey>kq6gkXO_dFcJzJG2XpwZs1IwfhH8MkkYDaBsZrPxZh8</sessionKey>
</response>
So the test succeeded: we have proven that the base URL works.
Good luck.
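Putting the pieces together, here is a sketch of the full login / search / poll / results flow in Python against that base URL. The hostname, credentials, and search string are placeholders, and verify=False mirrors curl's --insecure flag:

import time
import requests

BASE = "https://the-company-splunk-server.com/en-US/splunkd/__raw/services"

# 1. Log in and grab a session key (output_mode=json avoids XML parsing).
r = requests.post(BASE + "/auth/login",
                  data={"username": "your-user", "password": "your-password",
                        "output_mode": "json"},
                  verify=False)
headers = {"Authorization": "Splunk " + r.json()["sessionKey"]}

# 2. Create a search job; the response contains the search id (sid).
r = requests.post(BASE + "/search/jobs",
                  data={"search": "search index=_internal | head 5",
                        "output_mode": "json"},
                  headers=headers, verify=False)
sid = r.json()["sid"]

# 3. Poll the job until it is done.
while True:
    r = requests.get(BASE + "/search/jobs/" + sid,
                     params={"output_mode": "json"},
                     headers=headers, verify=False)
    if r.json()["entry"][0]["content"]["isDone"]:
        break
    time.sleep(1)

# 4. Fetch the results as JSON.
r = requests.get(BASE + "/search/jobs/" + sid + "/results",
                 params={"output_mode": "json"},
                 headers=headers, verify=False)
print(r.json()["results"])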

Get non file body from multipart/form-data using AWS API Gateway and Lambda

I am trying to get the form data from a multipart/form-data POST to my AWS Lambda web service via API Gateway.
The HTTP POST has Content-Type multipart/form-data and a URL-encoded body. File data is also sent in this POST (hence the multipart, I guess).
The web service needs to integrate with a third-party service, so changing the format of the POST isn't really an option.
I have seen this thread about converting URL-encoded data to a JSON object for use in Lambda, but that doesn't do the trick.
I have also tried setting the Integration Request -> Mapping Templates entry for content type multipart/form-data to Input passthrough. That didn't help either.
I did come across another question about uploading a file using multipart/form-data, but since I'm not interested in the file, just the body, that answer didn't help.
Below is a screenshot (sorry) of the captured POST, via Runscope.
If the goal is to use Lambda, you'll need to pass valid JSON to the function. Currently there isn't a way inside API Gateway to JSON-ify data that comes in as non-JSON.
Our short-term fix (on our backlog) is to provide a variable in the mapping templates to grab the raw input of the request. That way you could do a simple JSON conversion using a template like:
{
"body" : "$input.body"
}
or something like that.
Check out the mapping template reference for more info: http://docs.aws.amazon.com/apigateway/latest/developerguide/api-gateway-mapping-template-reference.html
Edit 4/7: this feature has been released as $input.body.
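With $input.body available, the non-file fields can be parsed out on the Lambda side. Here is a sketch in Python using the standard library's email parser; the event keys simply mirror whatever your mapping template defines (the contentType key is an assumption, mapped in with something like "contentType": "$input.params('Content-Type')"), and it only suits text payloads, since the template delivers the body as a JSON string:

from email.parser import Parser

def handler(event, context):
    # "body" comes from the {"body": "$input.body"} template above;
    # "contentType" is assumed to be mapped in as well, because the
    # multipart boundary lives in that header and is needed for parsing.
    raw = "Content-Type: " + event["contentType"] + "\r\n\r\n" + event["body"]
    msg = Parser().parsestr(raw)

    fields = {}
    for part in msg.get_payload():
        # A file part carries a filename in its Content-Disposition
        # header; a plain form field does not.
        if part.get_filename() is None:
            name = part.get_param("name", header="content-disposition")
            fields[name] = part.get_payload(decode=True).decode("utf-8", "replace")
    return fields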