Azure Data Factory: pass activity output to a dataset

I am using a SQL Server query that returns the last 3 months since a customer last purchased a product. For instance, customer 100 last made a purchase in August 2022, so the query returns June, July, and August in the format 062022, 072022, 082022. I now need to pass these values to the Copy data activity's REST API dataset Relative URL (/salemonyr/062022) inside a ForEach activity.
During the first iteration the Relative URL should be set to /salemonyr/062022, during the second to /salemonyr/072022, and during the third to /salemonyr/082022.
Error: The expression 'length(activity('MonYear').output.value)' cannot be evaluated because property 'value' doesn't exist, available properties are 'resultSetCount, recordsAffected, resultSets, outputParameters, outputLogs, outputLogsLocation, outputTruncated, effectiveIntegrationRuntime, executionDuration, durationInQueue, billingReference'.
Script activity json:
{
    "resultSetCount": 1,
    "recordsAffected": 0,
    "resultSets": [
        {
            "rowCount": 3,
            "rows": [
                { "MonYear": "062022" },
                { "MonYear": "072022" },
                { "MonYear": "082022" }
            ]
        }
    ],
    "outputParameters": {},
    "outputLogs": "",
    "outputLogsLocation": "",
    "outputTruncated": false,
    "effectiveIntegrationRuntime": "",
    "executionDuration": 0,
    "durationInQueue": {
        "integrationRuntimeQueue": 3
    },
    "billingReference": {
        "activityType": "PipelineActivity",
        "billableDuration": [
            {
                "meterType": "",
                "duration": 0.016666666666666666,
                "unit": "Hours"
            }
        ]
    }
}
How can I read these values dynamically from the SQL query?

You can use @split(item().colname,',')[0], @split(item().colname,',')[1] and @split(item().colname,',')[2] in the Relative URL path.

You can add a parameter to the REST dataset and use it in the Relative URL.
Use your query in a Lookup activity, and give the Lookup output to the ForEach activity. Inside the ForEach, in the Copy activity sink (the REST dataset), use the below expression for the dataset parameter:
/salemonyr/@{item().sample_date}
In the source, you can give your source. By this, you can copy the data to the respective Relative URL.
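If you keep the Script activity from the question rather than a Lookup, the rows sit under resultSets, which is exactly what the "property 'value' doesn't exist" error is pointing at. A sketch of the expressions, based on the Script activity output shape shown above (the activity name 'MonYear' is taken from the question):
ForEach activity Items:
@activity('MonYear').output.resultSets[0].rows
Dataset parameter / Relative URL inside the ForEach:
/salemonyr/@{item().MonYear}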


What is the OData standard for odata.nextLink in case of 1:N $expand queries?

We can see the odata.nextLink standard for server-driven paging of normal queries here, but no odata.nextLink standard is mentioned for 1:N $expand queries in the $expand docs.
Can someone please confirm the OData standard for 1:N $expand queries?
Example:
If we have multiple account_tasks for a single account, what should the result look like?
GET [Organization URI]/api/data/v9.1/accounts?$top=1&$expand=Account_Tasks($select=subject)
Option-1: Data is shown inline up to the page size, and odata.nextLink is shown only if the data count exceeds the page size, so odata.nextLink returns the next set of results (similar to standard pagination here).
{
    "@odata.context": "[Organization URI]/api/data/v9.1/$metadata#accounts(name,Account_Tasks(subject,scheduledstart))",
    "value": [
        {
            "@odata.etag": "W/\"37867294\"",
            "name": "Contoso, Ltd. (sample)",
            "accountid": "7a4814f9-b0b8-ea11-a812-000d3a122b89",
            "Account_Tasks": [
                {
                    "@odata.etag": "W/\"28876919\"",
                    "subject": "Task 1 for Contoso, Ltd."
                }
                // More account_tasks here. No @odata.nextLink if data count < page-size.
            ]
        }
    ]
}
Option-2: Empty results are shown inline, and an odata.nextLink points to the actual data.
{
    "@odata.context": "[Organization URI]/api/data/v9.1/$metadata#accounts(name,Account_Tasks(subject,scheduledstart))",
    "value": [
        {
            "@odata.etag": "W/\"37867294\"",
            "name": "Contoso, Ltd. (sample)",
            "accountid": "7a4814f9-b0b8-ea11-a812-000d3a122b89",
            "Account_Tasks": [],
            // Empty list shown above; the URL below returns the full results.
            "Account_Tasks@odata.nextLink": "[Organization URI]/api/data/v9.1/accounts(7a4814f9-b0b8-ea11-a812-000d3a122b89)/Account_Tasks?$select=subject,scheduledstart"
        }
    ]
}
Option-3: Data is shown inline up to the page size, and odata.nextLink is shown every time (even if the data count is smaller than the page size), so odata.nextLink returns the full expand results, including the inline results.
{
    "@odata.context": "[Organization URI]/api/data/v9.1/$metadata#accounts(name,Account_Tasks(subject,scheduledstart))",
    "value": [
        {
            "@odata.etag": "W/\"37867294\"",
            "name": "Contoso, Ltd. (sample)",
            "accountid": "7a4814f9-b0b8-ea11-a812-000d3a122b89",
            "Account_Tasks": [
                {
                    "@odata.etag": "W/\"28876919\"",
                    "subject": "Task 1 for Contoso, Ltd."
                }
                // More account tasks here
            ],
            "Account_Tasks@odata.nextLink": "[Organization URI]/api/data/v9.1/accounts(7a4814f9-b0b8-ea11-a812-000d3a122b89)/Account_Tasks?$select=subject,scheduledstart"
        }
    ]
}
Thanks in advance.
Good question -- paging of nested results is often misunderstood.
Nested results are paged individually, so where the nested account_tasks for a particular account exceed a server-defined threshold, the account_tasks up to that threshold are included, along with a nextLink to retrieve the additional account_tasks for that account. That, I believe, is your Option 1.
Note that, since the threshold is server-defined, it is also valid to have a threshold of 0 and only include a nextLink for the nested account_tasks. However, each account will still have a different nextLink, and following that nextLink will return only those account_tasks for the account in which the nextLink was returned.
Does that make sense?
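As a sketch of what following one of those nested nextLinks looks like (the response shape here is assumed from the examples above, not taken from a live instance):
GET [Organization URI]/api/data/v9.1/accounts(7a4814f9-b0b8-ea11-a812-000d3a122b89)/Account_Tasks?$select=subject,scheduledstart
{
    "@odata.context": "[Organization URI]/api/data/v9.1/$metadata#tasks(subject,scheduledstart)",
    "value": [
        {
            "subject": "Task 1 for Contoso, Ltd."
        }
        // ...only this account's tasks; a further @odata.nextLink would
        // appear here if these again exceed the server-defined threshold.
    ]
}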

Create entities and training phrases for values in functions for a Google Action

I have created a trivia game using the SDK. It takes user input and then compares it to a value in my DB to see if it's correct.
At the moment I am just passing a raw input variable through my conversation, which means it regularly fails when it mishears the user, since the exact string that was picked up is rarely == to the value in the DB.
Specifically, I would like it to only pick up numbers and, for example, realise that it must extract '10' from a speech input of 'my answer is 10'.
{
    "actions": [
        {
            "description": "Default Welcome Intent",
            "name": "MAIN",
            "fulfillment": {
                "conversationName": "welcome"
            },
            "intent": {
                "name": "actions.intent.MAIN"
            }
        },
        {
            "description": "response",
            "name": "Raw input",
            "fulfillment": {
                "conversationName": "rawInput"
            },
            "intent": {
                "name": "raw.input",
                "parameters": [{
                    "name": "number",
                    "type": "org.schema.type.Number"
                }],
                "trigger": {
                    "queryPatterns": [
                        "$org.schema.type.Number:number is the answer",
                        "$org.schema.type.Number:number",
                        "My answer is $org.schema.type.Number:number"
                    ]
                }
            }
        }
    ],
    "conversations": {
        "welcome": {
            "name": "welcome",
            "url": "https://us-central1-triviagame",
            "fulfillmentApiVersion": 2
        },
        "rawInput": {
            "name": "rawInput",
            "url": "https://us-central1-triviagame",
            "fulfillmentApiVersion": 2
        }
    }
}
app.intent('actions.intent.MAIN', (conv) => {
    conv.data.answers = answersArr;
    conv.data.questions = questionsArr;
    conv.data.counter = answersArr.length;
    var thisQuestion = conv.data.questions;
    conv.ask((conv.data.answers)[0]);
});

app.intent('raw.input', (conv, input) => {
    if (input == (conv.data.answers)[0]) {
        conv.ask(nextQuestion());
    }
});

app.intent('actions.intent.TEXT', (conv, input) => {
    // verifying if input and db value are equal
    // at the moment input is equal to 'my number is 10' (for example) instead of '10'
    // therefore the string verification never works
    conv.ask(nextQuestion());
});
In a previous project I used the Dialogflow UI, where I used the @sys.number system entity along with some training phrases so it understands different speech patterns.
The input parameter I am passing through my conv is only a raw string; I'd like it to be filtered through some sort of entity schema.
How do I create the same effect of training phrases/entities using the JSON file?
You can't do this using just the Action SDK. You need a Natural Language Processing system (such as Dialogflow) to handle this as well. The Action SDK, by itself, will do speech-to-text, and will use the actions.json configuration to help shape how to interpret the text. But it will only return the entire text from the user - it will not try to determine how it might match an Intent, nor what parameters may exist in it.
To do that, you need an NLP/NLU system. You don't need to use Dialogflow, but you will need something that does the parsing. Trying to do it with simple pattern matching or regular expressions will lead to nightmares - find a good system to do it.
If you want to stick to things you can edit yourself, Dialogflow does allow you to download its configuration files (they're just JSON), edit them, and update or replace the configuration through the UI or an API.
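For example, with the Dialogflow route the actions-on-google library hands the parsed parameter straight to your handler. A minimal sketch ('Give Answer' and its 'number' parameter are hypothetical, assumed to be defined in a Dialogflow agent with training phrases like "my answer is 10" marking a parameter of type @sys.number):
const { dialogflow } = require('actions-on-google');
const app = dialogflow();

// 'number' arrives already extracted by Dialogflow, e.g. '10'
// out of "my answer is 10" -- no string surgery needed here.
app.intent('Give Answer', (conv, { number }) => {
    if (number == conv.data.answers[0]) {
        conv.ask('Correct! Next question...');
    } else {
        conv.ask('Not quite, try again.');
    }
});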

SSAS Tabular Add Column via TMSL

Good Morning,
Objective: I am trying to add new columns to a SSAS Tabular Model table, with a long-term aim to programmatically make large-batch changes when needed.
Resources I've found:
https://learn.microsoft.com/en-us/sql/analysis-services/tabular-models-scripting-language-commands/create-command-tmsl
This one gives the template I've been following, but it doesn't seem to work.
What I have tried so far:
{
    "create": {
        "parentObject": {
            "database": "TabularModel_1_dev",
            "table": "TableABC"
        },
        "columns": [
            {
                "name": "New Column",
                "dataType": "string",
                "sourceColumn": "Column from SQL Source"
            }
        ]
    }
}
This first one is the most true to the example but returns the following error:
"The JSON DDL request failed with the following error: Unrecognized JSON property: columns. Check path 'create.columns', line 7, position 15.."
Attempt Two:
{
    "create": {
        "parentObject": {
            "database": "TabularModel_1_dev",
            "table": "TableABC"
        },
        "table": {
            "name": "Item Details by Branch",
            "columns": [
                {
                    "name": "New Column",
                    "dataType": "string",
                    "sourceColumn": "New Column"
                }
            ]
        }
    }
}
Adding a table within the child list returns an error too:
"...Cannot execute the Create command: the specified parent object cannot have a child object of type Table.."
Omitting the table within the parentObject is unsuccessful as well.
I know it's been three years since the post, but I too was attempting the same thing and stumbled across this post in my quest. I ended up reaching out to Microsoft and was told that the Add Column example in their documentation was a "doc bug". In fact, you can't add just a column; you have to feed it an entire table definition via createOrReplace.
See also: SSAS Error Creating Column with TMSL
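For reference, a minimal sketch of what such a createOrReplace could look like, assuming a single SQL-backed partition (the partition name, query, and data source name are invented for illustration; the other names come from the question):
{
    "createOrReplace": {
        "object": {
            "database": "TabularModel_1_dev",
            "table": "TableABC"
        },
        "table": {
            "name": "TableABC",
            "columns": [
                {
                    "name": "Existing Column",
                    "dataType": "string",
                    "sourceColumn": "Existing Column"
                },
                {
                    "name": "New Column",
                    "dataType": "string",
                    "sourceColumn": "Column from SQL Source"
                }
            ],
            "partitions": [
                {
                    "name": "TableABC Partition",
                    "source": {
                        "query": "SELECT * FROM dbo.TableABC",
                        "dataSource": "SqlDataSource"
                    }
                }
            ]
        }
    }
}
Note that because it is a replace, the definition must include every column and partition you want to keep, not just the new column.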

Retrieve UserName from ServiceNow

I am able to retrieve records for a particular Incident ID using Invoke-RestMethod. However, while retrieving the data, values like Resolved By, Updated By, etc. come back populated with a sysid.
Resolved By comes in this format:
https://devinstance.servicenow.com/api/sysid, value = sysid
I would like to view the username instead of the sysid.
The 'User ID' (user_name) isn't on the Incident, it's on the sys_user table, so you'll have to dot-walk to it.
If you're using the table API, you'll need to specify a dot-walked field to return, using the sysparm_fields query parameter.
This is no problem, just specify your endpoint like this:
$uri = "https://YOUR_INSTANCE.service-now.com/api/now/table/incident?sysparm_query=number%3DINC0000001&sysparm_fields=resolved_by.user_name"
I've specified a query for a specific incident number, but you can replace that with whatever your query is. The important part is sysparm_fields=resolved_by.user_name. You'll want to specify any other fields you need here as well.
The JSON I get as a result of running this API call is the following:
{
    "result": [
        {
            "resolved_by.user_name": "admin"
        }
    ]
}
Note the element name: "resolved_by.user_name".
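A minimal PowerShell sketch of the whole call, assuming a credential object in $cred (the instance name and incident number come from the example above):
$uri = "https://YOUR_INSTANCE.service-now.com/api/now/table/incident?sysparm_query=number%3DINC0000001&sysparm_fields=resolved_by.user_name"
$response = Invoke-RestMethod -Uri $uri -Method Get -Credential $cred -Headers @{ Accept = "application/json" }

# The dot-walked field name contains a literal '.', so quote the property name:
$response.result[0].'resolved_by.user_name'   # => admin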
Another option is to tell the API to return both display and actual values by specifying the sysparm_display_value parameter: set it to all to return both sys_id and display value, or to true to return only display values.
Your URI would then look like this:
https://dev12567.service-now.com/api/now/table/incident?sysparm_query=resolved_byISNOTEMPTY%5Enumber%3DINC0000001&sysparm_display_value=all
And your JSON would contain the following:
"number": {
"display_value": "INC0000001",
"value": "INC0000001"
},
"resolved_by": {
"display_value": "System Administrator",
"link": "https://YOUR_INSTANCE.service-now.com/api/now/table/sys_user/6816f79cc0a8016401c5a33be04be441",
"value": "6816f79cc0a8016401c5a33be04be441"
},
"sys_updated_by": {
"display_value": "admin",
"value": "admin"
},
This would be accessed by:
answer.result[n].resolved_by.display_value

ADF cannot parse DateTimeOffset

We have JSONs that contain timestamps in the following formats:
2016-11-03T03:05:21.673Z
2016-11-03T03:05:21.63Z
So the appropriate format to parse the data is yyyy-MM-ddTHH:mm:ss.FFF\Z
I tried all of these variants to explain to ADF how to parse it:
"structure": [
{
"name": "data_event_time",
"type": "DateTime",
"format": "yyyy-MM-ddTHH:mm:ss.FFF\\Z"
},
...
]
"structure": [
{
"name": "data_event_time",
"type": "DateTimeOffset",
"format": "yyyy-MM-ddTHH:mm:ss.FFFZ"
},
...
]
"structure": [
{
"name": "data_event_time",
"type": "DateTimeOffset"
},
...
]
"structure": [
{
"name": "data_event_time",
"type": "DateTime"
},
...
]
In all of the cases above, ADF fails with the error:
Copy activity encountered a user error at Sink side: ErrorCode=UserErrorInvalidDataValue,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=Column 'data_event_time' contains an invalid value '2016-11-13T00:44:50.573Z'. Cannot convert '2016-11-13T00:44:50.573Z' to type 'DateTimeOffset' with format 'yyyy-MM-dd HH:mm:ss.fffffff zzz'.,Source=Microsoft.DataTransfer.Common,''Type=System.FormatException,Message=String was not recognized as a valid DateTime.,Source=mscorlib,'.
What am I doing wrong, and how can I fix it?
Update: the previous issue has been fixed, thanks wBob. But now I have a new issue at the sink level.
I'm trying to load data from Azure Blob Storage to Azure DWH via ADF + PolyBase:
"sink": {
"type": "SqlDWSink",
"sqlWriterCleanupScript": "$$Text.Format('DELETE FROM [stage].[events] WHERE data_event_time >= \\'{0:yyyy-MM-dd HH:mm}\\' AND data_event_time < \\'{1:yyyy-MM-dd HH:mm}\\'', WindowStart, WindowEnd)",
"writeBatchSize": 6000000,
"writeBatchTimeout": "00:15:00",
"allowPolyBase": true,
"polyBaseSettings": {
"rejectType": "percentage",
"rejectValue": 10.0,
"rejectSampleValue": 100,
"useTypeDefault": true
}
},
"enableStaging": true,
"stagingSettings": {
"linkedServiceName": "AppInsight-Stage-BlobStorage-LinkedService"
},
"translator": {
"type": "TabularTranslator",
"columnMappings": "..."
}
But the process fails with error:
Database operation failed. Error message from database execution : ErrorCode=FailedDbOperation,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=Error happened when loading data into SQL Data Warehouse.,Source=Microsoft.DataTransfer.ClientLibrary,''Type=System.Data.SqlClient.SqlException,Message=107091;Query aborted-- the maximum reject threshold (10 %) was reached while reading from an external source: 6602 rows rejected out of total 6602 rows processed. Rows were rejected while reading from external source(s). 52168 rows rejected from external table [ADFCopyGeneratedExternalTable_0530887f-f870-4624-af46-249a39472bf3] in plan step 2 of query execution: Location: '/13/2cd1d10f-4f62-4983-a38d-685fc25c40a2_20161102_135850.blob' Column ordinal: 0, Expected data type: DATETIMEOFFSET(7) NOT NULL, Offending value: 2016-11-02T13:56:19.317Z (Column Conversion Error), Error: Conversion failed when converting the NVARCHAR value '2016-11-02T13:56:19.317Z' to data type DATETIMEOFFSET. Location: '/13/2cd1d10f-4f62-4983-a38d-685fc25c40a2_20161102_135850.blob' Column ordinal: 0, Expected ...
I read the Azure SQL Data Warehouse loading patterns and strategies article, which says:
If the DATE_FORMAT argument isn’t designated, the following default formats are used:
DateTime: ‘yyyy-MM-dd HH:mm:ss’
SmallDateTime: ‘yyyy-MM-dd HH:mm’
Date: ‘yyyy-MM-dd’
DateTime2: ‘yyyy-MM-dd HH:mm:ss’
DateTimeOffset: ‘yyyy-MM-dd HH:mm:ss’
Time: ‘HH:mm:ss’
It looks like I have no way at the ADF level to specify the datetime format for PolyBase.
Does anyone know a workaround?
We looked at a similar issue recently here:
What's reformatting my input data before I get to it?
JSON does not have a datetime format as such, so leave the type and format elements out. Then your challenge is with the sink; inserting these values into an Azure SQL Database, for example, should work.
"structure": [
{
"name": "data_event_time"
},
...
Looking at your error message, I would expect that to work when inserting into a DATETIME column in SQL Data Warehouse (or SQL Database, or SQL Server on a VM), since it is ordinary DATETIME data, not DATETIMEOFFSET.
If you have issues inserting into the target sink, you may have to work around it by not using the PolyBase checkbox and coding that side of the process yourself, e.g. (see the sketch after this list):
1. Copy the raw files to blob storage or Azure Data Lake (PolyBase now supports ADLS).
2. Create external tables over the files, with the datetime data declared as a varchar data type.
3. CTAS the data into an internal table, converting the string datetime format to a proper DATETIME using T-SQL.
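A minimal T-SQL sketch of steps 2 and 3, assuming the external data source and file format (ext_source, ext_format) already exist, and using hypothetical table and column names beyond data_event_time:
-- Step 2: external table over the files, keeping the timestamp as varchar
CREATE EXTERNAL TABLE ext.events
(
    data_event_time varchar(30) NOT NULL,
    payload         nvarchar(4000)
)
WITH
(
    LOCATION    = '/events/',
    DATA_SOURCE = ext_source,   -- assumed to exist already
    FILE_FORMAT = ext_format    -- assumed to exist already
);

-- Step 3: CTAS into an internal table, converting the ISO 8601 string;
-- CONVERT style 127 matches 'yyyy-MM-ddTHH:mm:ss.fffZ'.
CREATE TABLE stage.events
WITH (DISTRIBUTION = ROUND_ROBIN)
AS
SELECT
    CONVERT(datetime2, data_event_time, 127) AS data_event_time,
    payload
FROM ext.events;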