I am having an issue with the inline data set for Common Data Model in Azure Data Factory.
Simply put, everything in ADF appears to connect and read from my manifest file and entity definition, but when I click the "Data preview" button I always get "No output data". I find this bizarre, as the data can be read perfectly when using the CDM connector against the same files in Power BI. What am I doing wrong that prevents the data from being read into the data preview and the subsequent transformations in the mapping data flow?
My Manifest file looks as below (referring to an example entity):
{
  "$schema": "CdmManifest.cdm.json",
  "jsonSchemaSemanticVersion": "1.0.0",
  "imports": [
    {
      "corpusPath": "cdm:/foundations.cdm.json"
    }
  ],
  "manifestName": "manifestExample",
  "explanation": "example",
  "entities": [
    {
      "type": "LocalEntity",
      "entityName": "Entityname",
      "entityPath": "folder/EntityName.cdm.json/Entityname",
      "dataPartitions": [
        {
          "location": "folder/data/Entityname/Entityname.csv",
          "exhibitsTraits": [
            {
              "traitReference": "is.partition.format.CSV",
              "arguments": [
                {
                  "name": "columnHeaders",
                  "value": "true"
                },
                {
                  "name": "delimiter",
                  "value": ","
                }
              ]
            }
          ]
        }
      ]
    },
    ...
I am getting exactly the same "No output data" message. I am using json, not a manifest. If I sink the source, it moves no data but raises no error. My CDM originates from a Power BI dataflow. The PowerApps works fine, but historization and privileges make it useless.
Edit:
In Microsoft's information on the preview feature we can find this screen. My guess is that the CDM which the ADS sources is not the same as the one which originates from Power BI.
I am trying to create a pipeline where I want to store a particular value from a web activity in azure data factory, in a variable, so that I can pass it to other activities.
I want to get the export ID but I keep running into errors.
The response of the web activity looks like this:
{
  "requestId": "----",
  "result": [
    {
      "exportId": "---",
      "format": "CSV",
      "status": "Created",
      "createdAt": "2020-12-15T16:03:01Z"
    }
  ],
  "success": true
}
I have tried the following methods but they all fail:
@string(activity('Web1').output.result.exportId
@string(activity('Web1').output.result[0].exportId
@string(activity('Web1').output.result.[0]
first(@string(activity('Web1').output.result)
I have tried this. Your second expression should work: @string(activity('Web1').output.result[0].exportId)
My test
Output of Web activity
These expressions also work fine on my side, so you can give them a try:
@string(activity('Web1').output['result'][0]['exportId'])
@string(activity('Web1').output.result[0].exportId)
@string(first(activity('Web1').output['result']).exportId)
@string(json(activity('Web1').output.response)[0]['Id'])
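To see why the index is needed, it can help to walk the same response outside ADF. This is only a rough Python sketch using the sample response from the question (placeholder IDs kept as they are); it is not ADF code, it just mirrors the shape of the data. "result" is an array, so it has to be indexed (or passed through first()) before exportId can be read.

```python
import json

# Sample response copied from the question (placeholder IDs kept as-is).
response_text = """
{
  "requestId": "----",
  "result": [
    {"exportId": "---", "format": "CSV", "status": "Created",
     "createdAt": "2020-12-15T16:03:01Z"}
  ],
  "success": true
}
"""

output = json.loads(response_text)

# "result" is a list, so index it before reading "exportId".
# This corresponds to output.result[0].exportId in the pipeline expression.
export_id = output["result"][0]["exportId"]

# Reading the property straight off the list fails, which mirrors
# why output.result.exportId does not resolve.
try:
    output["result"]["exportId"]
    direct_access_works = True
except TypeError:
    direct_access_works = False

print(export_id, direct_access_works)
```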
We are working on implementing a custom logging solution. Most of the information we need is already present in Log Analytics from the Data Factory Analytics solution, but getting log information on data flows is a challenge. When querying, we get this error in the output: "Too large to parse".
Since data flows are a complex and critical piece of a pipeline, we badly need data such as rows copied, skipped, and read for each activity within a data flow. Can you please help us get that information?
You can get the same information shown in the ADF portal UI by making a POST request to the below REST endpoint. You can find more information and read about authentication on the following link https://learn.microsoft.com/en-us/rest/api/datafactory/pipelineruns/querybyfactory
You can choose to query by factory or for a specific pipeline run id depending on your needs.
https://management.azure.com/subscriptions/<subscription id>/resourcegroups/<resource group name>/providers/Microsoft.DataFactory/factories/<ADF resource Name>/pipelineruns/<pipeline run id>/queryactivityruns?api-version=2018-06-01
Below is an example of the data you can get from one stage:
{
  "stage": 7,
  "partitionTimes": [
    950
  ],
  "lastUpdateTime": "2020-07-28 18:24:55.604",
  "bytesWritten": 0,
  "bytesRead": 544785954,
  "streams": {
    "CleanData": {
      "type": "select",
      "count": 241231,
      "partitionCounts": [
        950
      ],
      "cached": false
    },
    "ProductData": {
      "type": "source",
      "count": 241231,
      "partitionCounts": [
        950
      ],
      "cached": false
    }
  },
  "target": "MergeWithDeltaLakeTable",
  "time": 67589,
  "progressState": "Completed"
}
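As a rough sketch of how this could be scripted, the Python below builds the POST request for the queryactivityruns endpoint shown above and pulls the per-stream row counts out of one stage object. The function names are made up for illustration, the bearer token is assumed to be obtained separately, and the lastUpdatedAfter / lastUpdatedBefore body fields are my understanding of the filter the endpoint expects. Where exactly the stage objects sit inside the activity run output is not shown here, so you would need to locate them in your own response.

```python
import json
import urllib.request

API_VERSION = "2018-06-01"


def build_query_request(subscription_id, resource_group, factory_name,
                        run_id, token, updated_after, updated_before):
    """Build (but do not send) the POST request for queryactivityruns."""
    url = (
        "https://management.azure.com/subscriptions/{}/resourcegroups/{}"
        "/providers/Microsoft.DataFactory/factories/{}/pipelineruns/{}"
        "/queryactivityruns?api-version={}"
    ).format(subscription_id, resource_group, factory_name, run_id, API_VERSION)
    # Assumption: the body is a time-window filter with these two fields.
    body = json.dumps({
        "lastUpdatedAfter": updated_after,
        "lastUpdatedBefore": updated_before,
    }).encode("utf-8")
    headers = {
        "Authorization": "Bearer " + token,  # token acquired elsewhere
        "Content-Type": "application/json",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def stream_row_counts(stage):
    """Return {stream name: row count} for one stage object."""
    return {name: info["count"] for name, info in stage.get("streams", {}).items()}


# Trimmed copy of the example stage above.
example_stage = {
    "stage": 7,
    "streams": {
        "CleanData": {"type": "select", "count": 241231},
        "ProductData": {"type": "source", "count": 241231},
    },
    "target": "MergeWithDeltaLakeTable",
}
print(stream_row_counts(example_stage))
```

Sending the request is then a matter of passing it to urllib.request.urlopen and parsing the JSON body of the response.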
I am accessing a RESTful API that pages results in groups of 50 using the HTTP connector. The REST connector doesn't seem to support Client Certificates so I can't use the pagination in that.
I have a Pipeline Variable called SkipIndex that defaults to 0. Inside the Until loop I have a Copy Data Activity that works (HTTP source to BLOB sink), then a Set Variable Activity that I am trying to get to increment this Variable.
{
  "name": "Add 50 to SkipIndex",
  "type": "SetVariable",
  "dependsOn": [
    {
      "activity": "Copy next to temp",
      "dependencyConditions": [
        "Succeeded"
      ]
    }
  ],
  "userProperties": [],
  "typeProperties": {
    "variableName": "SkipIndex",
    "value": {
      "value": "50++",
      "type": "Expression"
    }
  }
}
Everything I have tried results in errors such as "The expression contains self referencing variable. A variable cannot reference itself in the expression." and the one above with 50++ causes a sink error during debug.
How can I get the Until loop to increment this variable after it retrieves data?
Agreed: the REST connector does support pagination, but it does not support the Client Certificate authentication type.
As for your Until activity scenario, I was also tripped up by the limitation that a variable cannot reference itself in an expression. You could work around it with a small trick: add one more variable to persist the index number.
For example, I have 2 variables: count and indexValue
Until Activity:
Inside Until Activity:
V1:
V2:
BTW, 50++ is not valid in ADF expressions.
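Here is a rough sketch of how the two-variable trick could look as activity definitions, generated from Python so the shape is easy to check. It assumes both variables are of type String, uses the asker's SkipIndex plus a hypothetical helper variable named SkipIndexTemp, and the activity names are made up. The first Set Variable computes the next value into the helper, and the second copies it back, so neither expression references the variable it is setting.

```python
import json


def set_variable(name, variable, expression, depends_on):
    """Return a Set Variable activity definition as a dict."""
    return {
        "name": name,
        "type": "SetVariable",
        "dependsOn": [
            {"activity": depends_on, "dependencyConditions": ["Succeeded"]}
        ],
        "userProperties": [],
        "typeProperties": {
            "variableName": variable,
            "value": {"value": expression, "type": "Expression"},
        },
    }


def increment_steps(target, helper, step, after_activity):
    """Two chained activities that add `step` to the `target` variable."""
    # Step 1: helper = target + step (the helper does not reference itself).
    compute = set_variable(
        "Compute next " + target,
        helper,
        "@string(add(int(variables('{}')), {}))".format(target, step),
        after_activity,
    )
    # Step 2: target = helper (the target does not reference itself either).
    copy_back = set_variable(
        "Update " + target,
        target,
        "@variables('{}')".format(helper),
        compute["name"],
    )
    return [compute, copy_back]


steps = increment_steps("SkipIndex", "SkipIndexTemp", 50, "Copy next to temp")
print(json.dumps(steps, indent=2))
```

Both activities would sit inside the Until loop after the Copy Data activity, and the Until condition would still be evaluated against whatever tells you the last page has been reached.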
Reading the Concourse documentation about Implementing a Resource Type, in regards to what the check, in, and out scripts must emit, it is not clear why this output is needed or how Concourse uses it. My questions are:
1) How does Concourse use the output of the check script, the in script, and the out script?
2) And, why is it required that the in and out script emit the version? What happens if you don't?
For context, here is the relevant parts of the documentation:
1) For the check script:
...[it] must print the array of new versions, in chronological order,
to stdout, including the requested version if it's still valid.
For example:
[
  { "ref": "61cbef" },
  { "ref": "d74e01" },
  { "ref": "7154fe" }
]
2) For the in script:
The script must emit the fetched version, and may emit metadata as a list of key-value pairs. This data is intended for public consumption and will make it upstream, intended to be shown on the build's page.
For example:
{
  "version": { "ref": "61cebf" },
  "metadata": [
    { "name": "commit", "value": "61cebf" },
    { "name": "author", "value": "Hulk Hogan" }
  ]
}
3) Similar to the in script, the out script:
The script must emit the resulting version of the resource. For
example, the git resource emits the sha of the commit that it just
pushed.
For example:
{
  "version": { "ref": "61cebf" },
  "metadata": [
    { "name": "commit", "value": "61cebf" },
    { "name": "author", "value": "Mick Foley" }
  ]
}
Concourse uses the check result to determine whether any new version of the resource is available. Depending on your pipeline definition, the presence of a new version can trigger a job. The in script is then used to read that specific version using the parameters provided by the pipeline, whilst the out script takes care of writing.
As your in script is going to use the information provided by check, you may want to use a similar structure, but you are not obliged to. It is useful to echo the same version information in check, in, and out so that you can log it and understand which version each resource in your pipeline belongs to.
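To make the check contract a bit more concrete, here is a minimal Python sketch of the version-selection logic, written as a plain function rather than a full script. The function name and the list of known refs are made up for illustration; a real check script would read the request JSON from stdin, look up the versions from the actual source, and print the result to stdout. Returning only the latest version when no version is given, or when the requested one is no longer known, reflects my understanding of the documented behaviour and should be treated as an assumption.

```python
import json


def check(request, known_refs):
    """Return the versions Concourse should see, oldest first.

    request    -- the parsed JSON a check script receives on stdin
    known_refs -- all refs at the source, in chronological order
    """
    requested = (request.get("version") or {}).get("ref")
    if requested in known_refs:
        # Include the requested version itself, then everything newer.
        start = known_refs.index(requested)
        return [{"ref": ref} for ref in known_refs[start:]]
    # First check, or the requested version is gone: report only the latest.
    return [{"ref": ref} for ref in known_refs[-1:]]


refs = ["61cbef", "d74e01", "7154fe"]
print(json.dumps(check({"version": {"ref": "d74e01"}}, refs)))
```

The versions that come back are what Concourse records for the resource, and the in script is later handed one of them, which is why the same version object shows up again in the in and out output.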
Good Morning,
Objective: I am trying to add new columns to an SSAS Tabular Model table, with the long-term aim of programmatically making large-batch changes when needed.
Resources I've found:
https://learn.microsoft.com/en-us/sql/analysis-services/tabular-models-scripting-language-commands/create-command-tmsl
This one gives the template I've been following, but it does not seem to work.
What I have tried so far:
{
  "create": {
    "parentObject": {
      "database": "TabularModel_1_dev"
      , "table": "TableABC"
    },
    "columns": [
      {
        "name": "New Column"
        , "dataType": "string"
        , "sourceColumn": "Column from SQL Source"
      }
    ]
  }
}
This first one is the most true to the example but returns the following error:
"The JSON DDL request failed with the following error: Unrecognized JSON property: columns. Check path 'create.columns', line 7, position 15.."
Attempt Two:
{
  "create": {
    "parentObject": {
      "database": "TabularModel_1_dev"
      , "table": "TableABC"
    },
    "table": {
      "name": "Item Details by Branch",
      "columns": [
        {
          "name": "New Column"
          , "dataType": "string"
          , "sourceColumn": "New Column"
        }
      ]
    }
  }
}
Adding the table within the child list returns an error too:
"...Cannot execute the Create command: the specified parent object cannot have a child object of type Table.."
Omitting the table within the parentObject is unsuccessful as well.
I know it's been three years since the post, but I too was attempting the same thing and stumbled across this post in my quest. I ended up reaching out to Microsoft and was told that the Add Column example they gave in their documentation was a "doc bug". In fact, you can't add just a column; you have to feed it an entire table definition via createOrReplace.
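For the programmatic angle, here is a rough Python sketch of building such a createOrReplace command by taking an existing table definition and appending the new column. The helper name and the "Existing Column" entry are hypothetical, and the object / table layout is my understanding of the createOrReplace shape rather than something confirmed in this thread. The important caveat is that the table definition you pass in must be the complete one scripted from your model, including its partitions and every existing column, since the command replaces the whole table.

```python
import json


def add_column_command(database, table_definition, new_column):
    """Build a createOrReplace command for a table with one extra column."""
    table = dict(table_definition)  # shallow copy, the input is left untouched
    table["columns"] = list(table_definition.get("columns", [])) + [new_column]
    return {
        "createOrReplace": {
            "object": {"database": database, "table": table["name"]},
            "table": table,
        }
    }


# Hypothetical, heavily trimmed table definition. A real one must also
# carry the table's partitions and any other existing properties.
existing_table = {
    "name": "TableABC",
    "columns": [
        {"name": "Existing Column", "dataType": "string",
         "sourceColumn": "Existing Column"},
    ],
}

new_column = {
    "name": "New Column",
    "dataType": "string",
    "sourceColumn": "Column from SQL Source",
}

command = add_column_command("TabularModel_1_dev", existing_table, new_column)
print(json.dumps(command, indent=2))
```

Looping this over several tables and new columns would cover the large-batch case, with each generated command executed separately against the server.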
SSAS Error Creating Column with TMSL