ADF: linkedService template function not defined

I am currently trying to add some parameterised linked services. I have two services at the moment: a key vault and a data lake. The configurations are:
// Key vault
{
    "name": "Logical Key Vault",
    "properties": {
        "parameters": {
            "environment": {
                "type": "String"
            }
        },
        "annotations": [],
        "type": "AzureKeyVault",
        "typeProperties": {
            "baseUrl": "https://kv-@{linkedService().environment}.vault.azure.net"
        }
    }
}
// Data lake
{
    "name": "Logical Data Lake",
    "properties": {
        "type": "AzureBlobFS",
        "parameters": {
            "environment": {
                "type": "String"
            }
        },
        "annotations": [],
        "typeProperties": {
            "url": "https://sa@{replace(linkedService().environment, '-', '')}.dfs.core.windows.net",
            "accountKey": {
                "type": "AzureKeyVaultSecret",
                "secretName": "storageAccountKey",
                "store": {
                    "referenceName": "Logical Key Vault",
                    "type": "LinkedServiceReference",
                    "parameters": {
                        "environment": {
                            "value": "@linkedService().environment",
                            "type": "Expression"
                        }
                    }
                }
            }
        }
    }
}
Both linked services are parameterised by an environment parameter, and I have confirmed that the Key Vault works fine and is able to correctly retrieve secrets. The problem happens when I attempt to retrieve the storage key from the key vault. I get the following error:
Error code
FailToResolveParametersInExploratoryController
Details
The parameters and expression cannot be resolved for schema operations.
Error Message: {
    "message": "ErrorCode=InvalidTemplate, ErrorMessage=The template function 'linkedService' is not defined or not valid."
}
My attempts at debugging this have identified the use of @linkedService in the store parameters (the "value": "@linkedService().environment" entry in the Data Lake config above) as the issue: this is where the Data Lake passes its own environment parameter to the Key Vault so that it may obtain the storage key. If I replace this use of @linkedService().environment with a hard-coded value, the linked service successfully connects to the data lake.
The expression is trivially simple, and the web interface itself offers the option to use an expression there. As a result, I am unsure why the use of @linkedService fails here: the web interface and the ability to use expressions suggest it should work, and yet @linkedService is undefined for some reason.
While debugging this, I did find that the expression
@string(linkedService().environment)
does indeed work, but this seems rather odd, as the environment is itself a string, so converting it to a string should be a no-op. I have also looked into removing the @ entirely and trying
linkedService().environment
and while this does correctly resolve to the environment, it still results in an error: the resulting parameter contains the surrounding quotation marks, so the linked service fails to connect to the key vault at https://kv-'foobar'.vault.azure.net, which is clearly invalid (assuming my environment was foobar).
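For reference, here is the store block that does connect, using the @string workaround described above; the names match my configuration exactly, and only the value expression differs from the failing version:
"store": {
    "referenceName": "Logical Key Vault",
    "type": "LinkedServiceReference",
    "parameters": {
        "environment": {
            "value": "@string(linkedService().environment)",
            "type": "Expression"
        }
    }
}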

Related

Azure ARM Template parameters for parametrized linked service

Please forgive the title if it is confusing, but it does describe the problem I am having.
I have a linked service in my Azure Data Factory, used to connect to an Azure SQL Database.
The database name and user name are taken from parameters set in the linked service itself. Here is a snippet of the JSON config:
"typeProperties": {
"connectionString": "Integrated Security=False;Encrypt=True;Connection Timeout=30;Data Source=myserver.database.windows.net;Initial Catalog=#{linkedService().dbName};User ID=#{linkedService().dbUserName}",
"password": {
"type": "AzureKeyVaultSecret",
"store": {
"referenceName": "KeyVaultLink",
"type": "LinkedServiceReference"
},
"secretName": "DBPassword"
},
"alwaysEncryptedSettings": {
"alwaysEncryptedAkvAuthType": "ManagedIdentity"
}
}
This works fine when debugging in the Azure portal. However, when I export the ARM template for the whole thing, the deployment asks for an input connection string for the linked service. If I go to the linked service definition and look up its connection string, it comes out this way:
"connectionString": "Integrated Security=False;Encrypt=True;Connection Timeout=30;Data Source=dmsql.database.windows.net;Initial Catalog=@{linkedService().dbName};User ID=@{linkedService().dbUserName}"
When I input it during the ARM template deployment, should I be replacing "@{linkedService().dbName}" and "@{linkedService().dbUserName}" with actual values at the spot where I am entering it? I am confused because during the ARM template deployment there are no separate fields for these parameters, and these (parameters specific to the linked service itself) are not present as separate parameters in the ARM template definition.
I created a database in the Azure portal and enabled system-assigned managed identity for the SQL database.
I created an Azure Key Vault and created a secret.
I created a new access policy for the Azure Data Factory.
I created an Azure Data Factory and enabled system-assigned managed identity.
I created a new parameterized linked service to connect to the database, with the parameters dbName and userName; the database name and user name are supplied dynamically through these parameters.
The linked service was created successfully. The JSON definition of my linked service:
{
    "name": "SqlServer1",
    "properties": {
        "parameters": {
            "dbName": {
                "type": "String"
            },
            "userName": {
                "type": "String"
            }
        },
        "annotations": [],
        "type": "SqlServer",
        "typeProperties": {
            "connectionString": "Integrated Security=False;Data Source=dbservere;Initial Catalog=@{linkedService().dbName};User ID=@{linkedService().userName}",
            "password": {
                "type": "AzureKeyVaultSecret",
                "store": {
                    "referenceName": "AzureKeyVault1",
                    "type": "LinkedServiceReference"
                },
                "secretName": "DBPASSWORD"
            },
            "alwaysEncryptedSettings": {
                "alwaysEncryptedAkvAuthType": "ManagedIdentity"
            }
        }
    }
}
I exported the ARM template of the data factory.
This is my linked service in my ARM template:
"SqlServer1_connectionString": {
"type": "secureString",
"metadata": "Secure string for 'connectionString' of 'SqlServer1'",
"defaultValue": "Integrated Security=False;Data Source=dbservere;Initial Catalog=#{linkedService().dbName};User ID=#{linkedService().userName}"
},
"AzureKeyVault1_properties_typeProperties_baseUrl": {
"type": "string",
"defaultValue": "https://keysqlad.vault.azure.net/"
}
I got the parameters dbName and userName in my ARM template definition:
{
    "name": "[concat(parameters('factoryName'), '/SqlServer1')]",
    "type": "Microsoft.DataFactory/factories/linkedServices",
    "apiVersion": "2018-06-01",
    "properties": {
        "parameters": {
            "dbName": {
                "type": "String"
            },
            "userName": {
                "type": "String"
            }
        },
        "annotations": [],
        "type": "SqlServer",
        "typeProperties": {
            "connectionString": "[parameters('SqlServer1_connectionString')]",
            "password": {
                "type": "AzureKeyVaultSecret",
                "store": {
                    "referenceName": "AzureKeyVault1",
                    "type": "LinkedServiceReference"
                },
                "secretName": "DBPASSWORD"
            },
            "alwaysEncryptedSettings": {
                "alwaysEncryptedAkvAuthType": "ManagedIdentity"
            }
        }
    },
    "dependsOn": [
        "[concat(variables('factoryId'), '/linkedServices/AzureKeyVault1')]"
    ]
}
If you don't get the parameters in the ARM template definition, copy the value of "connectionString", modify what you need while leaving the parameter expressions in place, and add it to the "connectionString" override parameter in your Azure release pipeline; it will work.
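For illustration, the override value keeps the @{linkedService()...} tokens verbatim; only the literal parts (such as the data source) are edited. Based on the connection string above, the value entered in the pipeline would look like:
Integrated Security=False;Data Source=dbservere;Initial Catalog=@{linkedService().dbName};User ID=@{linkedService().userName}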

Kafka Connect JSON Schema does not appear to support "$ref" tags

I am using Kafka Connect with JSON Schema and am in a situation where I need to convert the JSON schema manually (to "Schema") within a Kafka Connect plugin. I can successfully retrieve the JSON schema from the Schema Registry and can successfully convert simple JSON schemas, but I am having difficulties with complex ones that have valid "$ref" tags referencing components within a single JSON schema definition.
I have several questions:
The JsonConverter.java does not appear to handle "$ref". Am I correct, or does it handle it in another way elsewhere?
Does the Schema Registry handle the referencing of sub-definitions? If yes, is there code that shows how the dereferencing is handled?
Should the JSON schema be resolved to a string without references (i.e., inline the references) before submitting it to the Schema Registry, thereby removing the "$ref" issue?
I am looking at the Kafka Source code module JsonConverter.java below:
https://github.com/apache/kafka/blob/trunk/connect/json/src/main/java/org/apache/kafka/connect/json/JsonConverter.java#L428
An example of the complex schema (taken from the JSON Schema site) is shown below (notice the "$ref": "#/$defs/veggie" tag that references a later sub-definition):
{
    "$id": "https://example.com/arrays.schema.json",
    "$schema": "https://json-schema.org/draft/2020-12/schema",
    "description": "A representation of a person, company, organization, or place",
    "title": "complex-schema",
    "type": "object",
    "properties": {
        "fruits": {
            "type": "array",
            "items": {
                "type": "string"
            }
        },
        "vegetables": {
            "type": "array",
            "items": { "$ref": "#/$defs/veggie" }
        }
    },
    "$defs": {
        "veggie": {
            "type": "object",
            "required": [ "veggieName", "veggieLike" ],
            "properties": {
                "veggieName": {
                    "type": "string",
                    "description": "The name of the vegetable."
                },
                "veggieLike": {
                    "type": "boolean",
                    "description": "Do I like this vegetable?"
                }
            }
        }
    }
}
Below is the actual schema returned from the Schema Registry after the schema was successfully registered:
[
    {
        "subject": "complex-schema",
        "version": 1,
        "id": 1,
        "schemaType": "JSON",
        "schema": "{\"$id\":\"https://example.com/arrays.schema.json\",\"$schema\":\"https://json-schema.org/draft/2020-12/schema\",\"description\":\"A representation of a person, company, organization, or place\",\"title\":\"complex-schema\",\"type\":\"object\",\"properties\":{\"fruits\":{\"type\":\"array\",\"items\":{\"type\":\"string\"}},\"vegetables\":{\"type\":\"array\",\"items\":{\"$ref\":\"#/$defs/veggie\"}}},\"$defs\":{\"veggie\":{\"type\":\"object\",\"required\":[\"veggieName\",\"veggieLike\"],\"properties\":{\"veggieName\":{\"type\":\"string\",\"description\":\"The name of the vegetable.\"},\"veggieLike\":{\"type\":\"boolean\",\"description\":\"Do I like this vegetable?\"}}}}}"
    }
]
The actual schema is embedded in the above returned string (the contents of the "schema" field) and contains the $ref references:
{\"$id\":\"https://example.com/arrays.schema.json\",\"$schema\":\"https://json-schema.org/draft/2020-12/schema\",\"description\":\"A representation of a person, company, organization, or place\",\"title\":\"complex-schema\",\"type\":\"object\",\"properties\":{\"fruits\":{\"type\":\"array\",\"items\":{\"type\":\"string\"}},\"vegetables\":{\"type\":\"array\",\"items\":{\"$ref\":\"#/$defs/veggie\"}}},\"$defs\":{\"veggie\":{\"type\":\"object\",\"required\":[\"veggieName\",\"veggieLike\"],\"properties\":{\"veggieName\":{\"type\":\"string\",\"description\":\"The name of the vegetable.\"},\"veggieLike\":{\"type\":\"boolean\",\"description\":\"Do I like this vegetable?\"}}}}}
The JsonConverter in the Apache Kafka source code has no notion of JSON Schema; therefore, no, "$ref" does not work, and it also does not integrate with the Schema Registry.
You seem to be looking for the io.confluent.connect.json.JsonSchemaConverter class and its logic.
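As a minimal sketch, the Confluent converter is enabled through the connector's converter settings rather than in code; the connector class, topic, and registry URL below are placeholders for your own setup:
{
    "name": "my-sink-connector",
    "config": {
        "connector.class": "org.example.MySinkConnector",
        "topics": "complex-schema-topic",
        "value.converter": "io.confluent.connect.json.JsonSchemaConverter",
        "value.converter.schema.registry.url": "http://localhost:8081"
    }
}
With this in place, it is the converter (not the Apache Kafka JsonConverter) that talks to the Schema Registry and interprets JSON Schema payloads.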

ARM template deployment fails with 409 error for one specific storage account

I use an ARM template to deploy a storage account. However, I got an error saying: StorageAccountAlreadyExists: The storage account named xxx already exists.
My release pipeline is set to incremental, so it shouldn't really show this error.
I changed the storage account name to a new one; not only did it work the first time, but I can keep deploying the same pipeline and no error is ever thrown.
It looks like something specific to this account, but I can't see anything special about it. The ARM template we use is also quite normal (something we got from official examples before).
{
    "$schema": "http://schema.management.azure.com/schemas/2019-06-01/deploymentTemplate.json#",
    "contentVersion": "1.0.0.0",
    "parameters": {
        "StorageDescriptor": {
            "type": "string",
            "defaultValue": "StorageAccount",
            "metadata": {}
        },
        "StorageAccountName": {
            "type": "string",
            "defaultValue": "[toLower(concat(parameters('StorageDescriptor'), resourceGroup().name))]",
            "metadata": { "Description": "Override name for the storage account" }
        },
        "StorageType": {
            "type": "string",
            "defaultValue": "Standard_LRS",
            "allowedValues": [
                "Standard_LRS",
                "Standard_ZRS",
                "Standard_GRS",
                "Standard_RAGRS",
                "Premium_LRS"
            ]
        },
        "Environment": {
            "type": "string",
            "defaultValue": "PreProd",
            "metadata": { "description": "PreProd or Prod" }
        }
    },
    "variables": {
    },
    "resources": [
        {
            "name": "[parameters('StorageAccountName')]",
            "type": "Microsoft.Storage/storageAccounts",
            "location": "[resourceGroup().location]",
            "apiVersion": "2019-06-01",
            "dependsOn": [],
            "tags": {
                "displayName": "Web Job Storage Account"
            },
            "properties": {
                "accountType": "[parameters('StorageType')]"
            }
        }
    ],
    "outputs": {
    }
}
Even though your release pipeline is set to incremental, the storage account name must be globally unique every time you deploy; refer to the storage account naming documentation.
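One common way to sidestep name collisions (a sketch, not part of the original template) is to derive the default name from uniqueString(), which produces a deterministic 13-character hash of, for example, the resource group ID:
"StorageAccountName": {
    "type": "string",
    "defaultValue": "[toLower(concat(parameters('StorageDescriptor'), uniqueString(resourceGroup().id)))]",
    "metadata": { "Description": "Override name for the storage account" }
}
Keep in mind that storage account names are limited to 24 lowercase alphanumeric characters, so a long descriptor (or a resource group name containing hyphens, as in the original default) can still make the generated name invalid.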
ARM template deployment fails with 409 error for one specific storage account
You need to check whether the storage account attributes have been changed through the Azure portal or PowerShell by somebody else and now differ from the ones specified in the ARM template.
To resolve this issue, try exporting the template and updating it in the Azure DevOps repo. Then we can update this newly exported template file as needed and deploy with it.
As a test, I could keep deploying the same pipeline with no error ever thrown.

Why is the password type of AzureKeyVaultSecret dropped when creating a LinkedService via PowerShell?

I'm attempting to create a LinkedService via the PowerShell command
New-AzureRmDataFactoryV2LinkedService -ResourceGroupName rg -DataFactoryName df -Name n -DefinitionFile n.json
The result is that the LinkedService is created; however, the reference to the password of type AzureKeyVaultSecret is removed, rendering it non-operational.
The config file n.json was extracted from the Data Factory code tab and has the syntax below:
{
    "name": "<name>",
    "type": "Microsoft.DataFactory/factories/linkedservices",
    "properties": {
        "type": "Oracle",
        "typeProperties": {
            "connectionString": "host=<host>;port=<port>;serviceName=<serviceName>;user id=<user_id>",
            "password": {
                "type": "AzureKeyVaultSecret",
                "store": {
                    "referenceName": "Prod_KeyVault",
                    "type": "LinkedServiceReference"
                },
                "secretName": "<secretname>"
            }
        },
        "connectVia": {
            "referenceName": "<runtimename>",
            "type": "IntegrationRuntimeReference"
        }
    }
}
When the new LinkedService is created, the code looks exactly the same except that properties -> typeProperties -> password is removed and requires manual configuration, which I'm trying to avoid if possible.
Any thoughts?
If you have tried using "Update-Module -Name AzureRm.DataFactoryV2" to update your PowerShell module to the latest version and the behavior is still the same, then the likely root cause is that a password stored as an Azure Key Vault secret is not yet supported in PowerShell. As far as I know, it is a feature added recently, so it may take some time to roll it out to PowerShell.
In that case, the workaround is to use the UI to create the linked service for now.

Azure Automation Registration Endpoint is corrupted when used to pull DSC configuration

For some reason, I keep getting these weird issues...
In this case, I have a key and an endpoint URL for the Automation Account stored as secrets in a Key Vault (I don't know of a way to extract them natively from the Automation Account using ARM).
I can extract these values perfectly, and they are published to the template that runs a PowerShell extension to pull a DSC configuration.
For example, as seen as an input when deploying the template:
"RegistrationUrl":"https://ase-agentservice-prod-1.azure-automation.net/accounts/e0799801-a8da-8934-b0f3-9a43191dd7e6"
However, I receive the following error (note the URL in the error, with three forward slashes):
"code": "VMExtensionProvisioningError",
"message": "VM has reported a failure when processing extension 'dscLcm'.
Error message: "DSC Configuration 'ConfigureLCMforAAPull' completed with error(s). Following are the first few: The attempt to 'get an action' for AgentId 11A5A267-6D00-11E7-B07F-000D3AE0FB1B from server URL https://ase-agentservice-prod-1.azure-automation.net///accounts/e0799801-a8da-8934-b0f3-9a43191dd7e6/Nodes(AgentId='11A5A267-6D00-11E7-B07F-000D3AE0FB1B')/GetDscAction failed with server error 'ResourceNotFound(404)'.
For further details see the server error message below or the DSC debug event log with ID 4339.
ServerErrorMessage:- 'No NodeConfiguration was found for the agent.'\"."
The endpoint URL is passed as a secure string. I tried passing it as a normal string; same problem.
The key and endpoint are fed into the template as parameters:
"dscKeySecret": {
"type": "securestring",
"metadata": {
"description": "Key for PowerShell DSC Configuration."
}
},
"dscUrlSecret": {
"type": "securestring",
"metadata": {
"description": "Url for PowerShell DSC Configuration."
}
},
These values are used to create a parameter that is passed to the next template, which runs the VM extension:
"extn-settings": {
"value": {
"configuration": {
"url": "[concat(variables('urls').dscScripts, '/', 'lcm-aa-pull', '/', 'lcm-aa-pull', '.zip')]",
"script": "[concat('lcm-aa-pull', '.ps1')]",
"function": "ConfigureLCMforAAPull"
},
"configurationArguments": {
"registrationKey": {
"username": "dsckeySecret",
"password": "[parameters('dscKeySecret')]"
},
"registrationUrl": "[parameters('dscUrlSecret')]",
"configurationMode": "ApplyAndMonitor",
"configurationModeFrequencyMins": 15,
"domain": "[variables('names').domain]",
"name": "dscLcm",
"nodeConfigurationName": "[variables('names').config.ad]",
"rebootNodeIfNeeded": true,
"refreshFrequencyMins": 30
},
"protectedSettings": null,
}
}
The next template receives the parameters and uses them in the properties of the VM's resources section:
"properties": {
"publisher": "Microsoft.Powershell",
"type": "DSC",
"typeHandlerVersion": "2.22",
"autoUpgradeMinorVersion": true,
"settings": {
"configuration": "[parameters('extn-settings').configuration]",
"configurationArguments": "[parameters('extn-settings').configurationArguments]"
},
"protectedSettings": "[parameters('extn-settings').protectedSettings]"
}
So why is the URL being corrupted, with the first '/' being changed to '///'?
I don't know why the endpoint URL has three forward slashes, but that wasn't the issue... I wish I had found the real issue before I posted this question.
The node configuration name was wrong due to a spelling mistake (hangs head in shame).
Thanks anyway!
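For anyone who hits the same 'No NodeConfiguration was found for the agent' error: Azure Automation DSC node configuration names take the form <ConfigurationName>.<NodeName>, so the value that ends up in nodeConfigurationName must exactly match a compiled node configuration in the Automation Account. A sketch with a hypothetical node name:
"nodeConfigurationName": "ConfigureLCMforAAPull.webServer"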