connect to REST endpoint using OAuth2 - azure-data-factory

I am trying to explore different options to connect to a REST endpoint using Azure Data Factory. I have the Python code below, which does what I am looking for, but I am not sure whether Azure Data Factory offers something out of the box to connect to the API, or a way to call custom code.
Code:
import sys
import requests
from requests_oauthlib import OAuth2Session
from oauthlib.oauth2 import BackendApplicationClient
import json
import logging
import time
logging.captureWarnings(True)
api_url = "https://webapi.com/api/v1/data"
client_id = 'client'
client_secret = 'secret'
client = BackendApplicationClient(client_id=client_id)
oauth = OAuth2Session(client=client)
token = oauth.fetch_token(token_url='https://webapi.com/connect/accesstoken', client_id=client_id, client_secret=client_secret)
client = OAuth2Session(client_id, token=token)
response = client.get(api_url)
data = response.json()
When I look at the REST linked service, I don't see many authentication options.
Could you please point me to the activities I should use to make OAuth2 work in Azure Data Factory?

You would have to use a Web Activity to call the token endpoint with the POST method and get the authentication token before getting data from the API.
Here is an example.
First create a Web Activity.
Set the URL to the endpoint that performs the authentication and returns the token.
Set Method to POST.
Create header > Name: Content-Type Value: application/x-www-form-urlencoded
Configure request body for HTTP request.
Format: grant_type=refresh_token&client_id={client_id}&client_secret={client_secret}&refresh_token={refresh_token}
Example: grant_type=refresh_token&client_id=HsdO3t5xxxxxxxxx0VBsbGYb&client_secret=t0_0CqU8oA5snIOKyT8gWxxxxxxxxxYhsQ-S1XfAIYaEYrpB&refresh_token={refresh_token}
The values above are only examples; replace them with your own id and secret when you try this.
The output of this Web Activity is a JSON object. You can extract the access_token from it and use it in the request header of later activities in the pipeline (for example, with a REST linked service), depending on your need.
You can get the access_token as shown below. I have assigned it to a variable for simplicity.
@activity('GetOauth2 token').output.access_token
Here is an example from the official Microsoft documentation of an OAuth authentication implementation for copying data.
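For comparison with the Python in the question, the request that the Web Activity sends can be sketched as below. This is only an illustration under assumptions: the function names build_token_body and bearer_header are hypothetical, the credential values are placeholders, and the token endpoint is assumed to reply with JSON that contains an access_token field, as described above.

```python
import json
from urllib.parse import urlencode


def build_token_body(client_id, client_secret, refresh_token):
    """Form-encode the refresh-token grant, mirroring the Web Activity body."""
    return urlencode({
        "grant_type": "refresh_token",
        "client_id": client_id,
        "client_secret": client_secret,
        "refresh_token": refresh_token,
    })


def bearer_header(token_response_text):
    """Extract access_token from the JSON reply and build the header
    that later requests would carry."""
    token = json.loads(token_response_text)["access_token"]
    return {"Authorization": f"Bearer {token}"}


# Placeholder values; a real pipeline would take these from secure storage.
body = build_token_body("my-client-id", "my-secret", "my-refresh-token")
headers = bearer_header('{"access_token": "abc123", "token_type": "Bearer"}')
print(body)
print(headers)
```

The body is what goes with the Content-Type: application/x-www-form-urlencoded header, and the second function corresponds to reading output.access_token from the activity.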

Related

How can I access a Google API RESTful endpoint using a service key?

I'm using Postman to mimic a RESTful API call and trying to access a Google Sheets API endpoint. When I try to access my endpoint it returns:
{
  "error": {
    "code": 403,
    "message": "The request is missing a valid API key.",
    "status": "PERMISSION_DENIED"
  }
}
which is fair enough, as I did not use my API key. I created a service account and got a JSON file, but I plan to access the API through a REST endpoint, so I need to pass a token in the header and I'm not sure how.
I looked at the json file and wasn't sure what to extract in order to pass it for my rest call.
Has anyone been able to do this successfully?
Before calling Google services from Postman, you need to re-create the flow for getting an access token from service account credentials:
build and encode the JWT payload from the data from credentials files (to populate aud, iss, sub, iat and exp)
request an access token using that JWT
make the request to the API using this access token
You can find a complete guide for this flow here: https://developers.google.com/identity/protocols/oauth2/service-account#authorizingrequests
Here is an example in Python. You will need to install pyjwt plus a crypto backend for RS256 (pycrypto with older PyJWT versions, cryptography with current ones) to run this script:
import requests
import json
import jwt
import time
#for RS256 you may need this
#from jwt.contrib.algorithms.pycrypto import RSAAlgorithm
#jwt.register_algorithm('RS256', RSAAlgorithm(RSAAlgorithm.SHA256))
token_url = "https://oauth2.googleapis.com/token"
credentials_file_path = "./google.json"
#build and sign JWT
def build_jwt(config):
    iat = int(time.time())
    exp = iat + 3600
    payload = {
        'iss': config["client_email"],
        'sub': config["client_email"],
        'aud': token_url,
        'iat': iat,
        'exp': exp,
        'scope': 'https://www.googleapis.com/auth/spreadsheets'
    }
    jwt_headers = {
        'kid': config["private_key_id"],
        "alg": 'RS256',
        "typ": 'JWT'
    }
    signed_jwt = jwt.encode(
        payload,
        config["private_key"],
        headers = jwt_headers,
        algorithm = 'RS256'
    )
    return signed_jwt
with open(credentials_file_path) as conf_file:
    config = json.load(conf_file)
# 1) build and sign JWT
signed_jwt = build_jwt(config)
# jwt.encode returns bytes in PyJWT 1.x and str in PyJWT 2.x
if isinstance(signed_jwt, bytes):
    signed_jwt = signed_jwt.decode("utf-8")
# 2) get access token
r = requests.post(token_url, data={
    "grant_type": "urn:ietf:params:oauth:grant-type:jwt-bearer",
    "assertion": signed_jwt
})
token = r.json()
print(f'token will expire in {token["expires_in"]} seconds')
at = token["access_token"]
print(at)
Note the value of the scope: https://www.googleapis.com/auth/spreadsheets
You can probably do the whole flow above with a Google API client library for the programming language you prefer.
The script above will print the access token :
ya29.AHES67zeEn-RDg9CA5gGKMLKuG4uVB7W4O4WjNr-NBfY6Dtad4vbIZ
Then you can use it in Postman in Authorization header as Bearer {TOKEN}.
Or using curl :
curl "https://sheets.googleapis.com/v4/spreadsheets/$SPREADSHEET_ID" \
-H "Authorization: Bearer $ACCESS_TOKEN"
Note: you can find an example of using service account keys to call Google translate API here
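The curl call above could also be made from Python. The following is a minimal sketch using only the standard library; the function name build_sheets_request and the placeholder id and token are assumptions for illustration, and the request is only built here, not sent, since sending it needs a real token.

```python
import urllib.request


def build_sheets_request(spreadsheet_id, access_token):
    """Build a GET request for a spreadsheet, authorized with a Bearer token."""
    url = f"https://sheets.googleapis.com/v4/spreadsheets/{spreadsheet_id}"
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {access_token}"},
    )


# Placeholder values; substitute the token printed by the script above.
req = build_sheets_request("SPREADSHEET_ID", "ACCESS_TOKEN")
print(req.get_method(), req.full_url)
print(req.get_header("Authorization"))
# urllib.request.urlopen(req) would send the request.
```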

Gsuite Directory API 403 Error on API call or Grant Type Error on JWT Generation

Using Python and creating my own JWT using HTTP/Rest methodology, I simply can't get delegation to work.
On one hand, google JWT troubleshoot documentation says that ISS needs to be the same as the SUB (the service account).
However, on the server to server oauth2 documentation, it says that to impersonate an account, the sub needs to be the account I am attempting to impersonate in the claim.
Needless to say, despite enabling domain-wide delegation, adding the correct scopes, etc, I get nothing back but 403 when attempting to access the user domain utilizing the requests library in python with the following example:
requests.get("https://www.googleapis.com/admin/directory/v1/users/useremail@google.org", headers={'Authorization':f' Bearer {oauth2tokenhere}'})
Here is an example of my claim:
claim = {
    "iss": 'serviceaccountemail',
    'sub': 'impersonatedaccountemail',
    'scope': 'https://www.googleapis.com/auth/admin.directory.user.readonly',
    'exp': ((datetime.datetime.today() + datetime.timedelta(minutes=60)).timestamp()),
    'iat': ((datetime.datetime.today()).timestamp()),
    'aud': "https://oauth2.googleapis.com/token"
}
The above claim will generate a generalized grant error (cute, but not helpful).
If I change the claim and ensure that the sub and the iss are the same, the oauth2token generates, but I get a 403 error when attempting to hit the API.
Here is the server to server oauth2 documentation stating the sub should be the account the service account is attempting to impersonate.
https://developers.google.com/identity/protocols/OAuth2ServiceAccount
Here is the troubleshooting article outlining the ISS/Sub being the same (although cloud article is the closest relevant topic I could find)
https://cloud.google.com/endpoints/docs/openapi/troubleshoot-jwt
EDIT:
I am using the service account information from the .json file that is downloaded when creating the service account.
import json as j
import datetime
import jwt
import requests
#creates the claim, 'secret' (from the private key), and the kid, from the service account file, and returns these values in a tuple.
#the tuple can then be used to make dependable positional argument entries to the parameters of the createJWT function.
def create_claim_from_json(self,objpath,scope=["https://www.googleapis.com/auth/admin.directory.user.readonly" "https://www.googleapis.com/auth/admin.directory.user"]):
    with open(f'{objpath}','r') as jobj:
        data = j.load(jobj)
    claim = {
        "iss": str(data['client_id']),
        "sub": str(data['client_id']),
        "scope": str(scope),
        "exp": ((datetime.datetime.today() + datetime.timedelta(minutes=59)).timestamp()),
        "iat": ((datetime.datetime.today()).timestamp()),
        "aud": "https://oauth2.googleapis.com/token"
    }
    private_key = data['private_key']
    kid = {"kid": f"{data['private_key_id']}"}
    return claim, private_key, kid
#assembles the JWT using the claim, secret (Private key from the Service account file), the kid value, and the documented RS256 alg.
#returns the completed JWT object back to be used to send to the oauth2 endpoint
#the JWT will be used in the function call retrieve_oauth2token.
def createJWT(self, claim, secret, kid, alg='RS256'):
    encoded_jwt = (jwt.encode(claim, secret, alg, kid)).decode('UTF-8')
    return encoded_jwt
#Using the JWT created in memory, sends the JWT to the googleapi oauth2 uri and returns a token
def retrieve_oauth2token(self, jwt):
    oauth2 = requests.post(f'https://oauth2.googleapis.com/token?grant_type=urn%3Aietf%3Aparams%3Aoauth%3Agrant-type%3Ajwt-bearer&assertion={jwt}')
    oauth2=oauth2.json()
    return oauth2 #['access_token'], oauth2['token_type']
The documentation has a clear overview; did you follow the steps as described in the addendum? Some parts of your code are missing, and you did not mention using a service account (JSON) key. The documentation also shows that you have to use the (delegated) service account as both iss and sub. Furthermore, you need to use a kid. This is how it is done:
import time
import jwt
import requests
# issue time, reused for exp so both claims are consistent
iat = int(time.time())
payload = {
    'iss': '123456-compute@developer.gserviceaccount.com',
    'sub': '123456-compute@developer.gserviceaccount.com',
    'aud': 'https://firestore.googleapis.com/',
    'iat': iat,
    'exp': iat + 3600
}
additional_headers = {'kid': PRIVATE_KEY_ID_FROM_JSON}
signed_jwt = jwt.encode(payload, PRIVATE_KEY_FROM_JSON, headers=additional_headers, algorithm='RS256')
url = "URL OF THE API TO CALL"
header = {'Authorization': f'Bearer {signed_jwt}'}
resp = requests.get(url, headers=header)
Note: you can find PRIVATE_KEY_ID_FROM_JSON in the private_key_id field, and PRIVATE_KEY_FROM_JSON in the private_key field, of your service account JSON credentials file.
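For the impersonation case the question describes, the claim set for the token request could look like the sketch below. It follows the server to server documentation linked in the question (iss is the service account email, sub is the user being impersonated) and assumes that scopes are sent as one space-delimited string and that iat and exp are integer seconds. The function name, the email addresses, and the fixed now value are hypothetical.

```python
import time

TOKEN_URL = "https://oauth2.googleapis.com/token"


def build_delegation_claims(service_account_email, user_email, scopes,
                            now=None, lifetime=3600):
    """Build the claim set for a token request that impersonates user_email."""
    iat = int(time.time()) if now is None else int(now)
    return {
        "iss": service_account_email,   # client_email from the JSON key file
        "sub": user_email,              # the account being impersonated
        "scope": " ".join(scopes),      # one space-delimited string
        "aud": TOKEN_URL,
        "iat": iat,                     # integer seconds, not a float
        "exp": iat + lifetime,
    }


claims = build_delegation_claims(
    "my-sa@my-project.iam.gserviceaccount.com",   # hypothetical
    "user@example.com",                           # hypothetical
    ["https://www.googleapis.com/auth/admin.directory.user.readonly"],
    now=1700000000,
)
print(claims)
```

The result would then be signed with RS256 and posted to the token endpoint. Whether the request succeeds still depends on domain-wide delegation being authorized for those scopes.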

What API Gateway methods support Authorization?

When I create a resource/method in AWS API Gateway API I can create one of the following methods: DELETE, GET, HEAD, OPTIONS, PATCH or POST.
If I choose GET, then API Gateway doesn't pass authentication details, but for POST it does.
For GET, should I be adding the Cognito credentials to the URL of my GET? Or should I just never use GET and use POST for all authenticated calls?
My set-up in API Gateway/Lambda:
I created a Resource and two methods: GET and POST
Under Authorization Settings I set Authorization to AWS_IAM
For this example there is no Request Model
Under Method Execution I set Integration type to Lambda Function and I check Invoke with caller credentials (I also set Lambda Region and Lambda Function)
I leave Credentials cache unchecked.
For Body Mapping Templates, I set Content-Type to `application/json` and the Mapping Template to
{ "identity" : "$input.params('identity')"}
In my Python Lambda function:
def lambda_handler(event, context):
    print(context.identity)
    print(context.identity.cognito_identity_id)
    return True
Running the Python function:
For the GET, context.identity is None.
For the POST, context.identity has a value and context.identity.cognito_identity_id has the correct value.
As mentioned in comments: all HTTP methods support authentication. If the method is configured to require authentication, the authentication results should be included in the context, which you can access via mapping templates and pass downstream as contextual information.
If this is not working for you, please update your question to reflect:
How your API methods are configured.
What your mapping template is.
What results you see in testing.
UPDATE
The code in your lambda function is checking the context of the Lambda function, not the value from API Gateway. To access the value passed in from API Gateway, you would need to use event.identity not context.identity.
This would only half solve your problem, as you are not using the correct value to access the identity in API Gateway. That would be $context.identity.cognitoIdentityId (assuming you are using Amazon Cognito auth). Please see the mapping template reference for a full guide of supported variables.
Finally, you may want to consider using the template referenced in this question.
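Put together, the handler side of this could look like the sketch below. It assumes the mapping template has been changed to { "identity" : "$context.identity.cognitoIdentityId" }, so the identity arrives in the event under the key identity; the return shape and the sample identity value are hypothetical.

```python
def lambda_handler(event, context):
    """Read the caller identity that API Gateway mapped into the event."""
    # The value comes from the mapping template, not from the Lambda context.
    identity_id = event.get("identity")
    if not identity_id:
        # An empty value means the template produced no identity.
        return {"authenticated": False, "identity": None}
    return {"authenticated": True, "identity": identity_id}


# Simulated events, shaped like the mapping template output.
with_identity = lambda_handler({"identity": "us-east-1:example-id"}, None)
without_identity = lambda_handler({"identity": ""}, None)
print(with_identity)
print(without_identity)
```

Because the handler only looks at the event, it behaves the same for GET and POST as long as both methods use the same mapping template.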

Calling Cloud Stack With Parameters

I am trying to make an API call using the code below, and it works fine:
import urllib2
import urllib
import hashlib
import hmac
import base64
baseurl='http://www.xxxx.com:8080/client/api?'
request={}
request['command']='listUsers'
request['response']='xml'
request['apikey']='xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'
secretkey='xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'
request_str='&'.join(['='.join([k,urllib.quote_plus(request[k])]) for k in request.keys()])
sig_str='&'.join(['='.join([k.lower(),urllib.quote_plus(request[k].lower().replace('+','%20'))])for k in sorted(request.iterkeys())])
sig=hmac.new(secretkey,sig_str,hashlib.sha1)
sig=hmac.new(secretkey,sig_str,hashlib.sha1).digest()
sig=base64.encodestring(hmac.new(secretkey,sig_str,hashlib.sha1).digest())
sig=base64.encodestring(hmac.new(secretkey,sig_str,hashlib.sha1).digest()).strip()
sig=urllib.quote_plus(base64.encodestring(hmac.new(secretkey,sig_str,hashlib.sha1).digest()).strip())
req=baseurl+request_str+'&signature='+sig
res=urllib2.urlopen(req)
result = res.read()
print result
What I want to know is: how can I send additional parameters with the API call?
And how do I send parameters when I am sending data to CloudStack instead of getting data from it,
e.g. createUser?
Add additional parameters to the request dictionary.
E.g. listUsers allows details of a specific username to be listed (listUsers API Reference). To do so, you'd update request creation as follows:
request={}
request['command']='listUsers'
request['username']='admin'
request['response']='xml'
request['apikey']='xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'
Also, the Rules for Signing say to "Lower case the entire Command String and sort it alphabetically via the field for each field-value pair". This section of the docs also covers adding an expiry to the URL.
Finally, you need to ensure the HTTP GET is not cached by network infrastructure by making each HTTP GET unique. The CloudStack API uses a cache buster. Alternatively, you can add an expiry to each query, or use an HTTP POST.
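As a rough Python 3 sketch of the above, the signing steps can be wrapped in one function so that any extra parameter added to the dictionary is signed automatically. The function name build_signed_query and the key values are hypothetical, and the encoding choices (percent-encoding with %20 for spaces, sorting by lower-cased field name) are an interpretation of the signing rule quoted above rather than a verified implementation.

```python
import base64
import hashlib
import hmac
from urllib.parse import quote


def build_signed_query(params, secret_key):
    """Return a query string for params with an HMAC-SHA1 signature appended."""
    # Query string as sent: every value percent-encoded.
    query = "&".join(f"{k}={quote(str(v), safe='')}" for k, v in params.items())
    # String to sign: same pairs, sorted by field name, lower-cased as a whole.
    to_sign = "&".join(
        f"{k}={quote(str(params[k]), safe='')}"
        for k in sorted(params, key=str.lower)
    ).lower()
    digest = hmac.new(secret_key.encode(), to_sign.encode(), hashlib.sha1).digest()
    signature = quote(base64.b64encode(digest).decode(), safe="")
    return f"{query}&signature={signature}"


# Placeholder keys; 'username' is the additional parameter.
request = {
    "command": "listUsers",
    "username": "admin",
    "response": "xml",
    "apikey": "PLACEHOLDER_API_KEY",
}
query = build_signed_query(request, "PLACEHOLDER_SECRET")
print(query)
```

Commands that send data, such as createUser, would go through the same function with their own fields added to the dictionary.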

List of Spreadsheets Gdata OAuth2

This is how I get a list of spreadsheets with the Spreadsheets API in gdata, the OAuth1 way:
spreadSheetService = gdata.spreadsheet.service.SpreadsheetsService()
spreadSheetService.SetOAuthInputParameters(gdata.auth.OAuthSignatureMethod.HMAC_SHA1,self.CONSUMER_KEY,self.CONSUMER_SECRET,two_legged_oauth=True, requestor_id=self.requestor_id)
spreadSheetService.GetSpreadsheetsFeed(query = q)
However, spreadSheetService is not available for OAuth2 because of the won't-fix issue #594.
How do I query for a list of spreadsheets with gdata.spreadsheets.client.SpreadsheetsClient?
(assuming Python)
I was able to use gd_client.auth_token = gdata.gauth.OAuth2TokenFromCredentials(credentials) to take a credentials object created by an OAuth2 flow (using the oauth2client) and use this with the gdata library.
Full example here (for a command-line app):
# Do OAuth2 stuff to create credentials object
from oauth2client.file import Storage
from oauth2client.client import flow_from_clientsecrets
from oauth2client.tools import run
storage = Storage("creds.dat")
credentials = storage.get()
if credentials is None or credentials.invalid:
    credentials = run(flow_from_clientsecrets("client_secrets.json", scope=["https://spreadsheets.google.com/feeds"]), storage)
# Use it within gdata
import gdata.spreadsheets.client
import gdata.gauth
gd_client = gdata.spreadsheets.client.SpreadsheetsClient()
gd_client.auth_token = gdata.gauth.OAuth2TokenFromCredentials(credentials)
print gd_client.get_spreadsheets()
If you're specifically looking for 2-legged, the same technique works, but you will need to create a different type of credentials object. See the following recent answer regarding how to create this: Using Spreadsheet API OAuth2 with Certificate Authentication
Here's a variation that writes an OAuth 2.0 Bearer auth header directly to the request and allows you to continue to use the older gdata.spreadsheet.service.SpreadsheetsService style client code:
import httplib2
# Do OAuth2 stuff to create credentials object
from oauth2client.file import Storage
from oauth2client.client import flow_from_clientsecrets
from oauth2client import tools
storage = Storage("creds.dat")
credentials = storage.get()
if credentials is None or credentials.invalid:
    flags = tools.argparser.parse_args(args=[])
    flow = flow_from_clientsecrets("client_secrets.json", scope=["https://spreadsheets.google.com/feeds"])
    credentials = tools.run_flow(flow, storage, flags)
if credentials.access_token_expired:
    credentials.refresh(httplib2.Http())
# Use it within old gdata
import gdata.spreadsheet.service
import gdata.service
client = gdata.spreadsheet.service.SpreadsheetsService(
additional_headers={'Authorization' : 'Bearer %s' % credentials.access_token})
#public example
entry = client.GetSpreadsheetsFeed('0AoFkkLP2MB8kdFd4bEJ5VzR2RVdBQkVuSW91WE1zZkE')
print entry.title