I have an SSIS package that successfully uses the Microsoft SAP BW connector. The SAP administrator has set up his side to use a process chain and ProgramId as the connection criteria. I start my SSIS package and it runs in "Wait" mode until the SAP job executes, and this all works great. I now need to replicate this using Azure Data Factory's SAP BW connector, but the Azure connector does not have the same look and feel, so I am attempting to edit the code in the Connections tab for the SapBw connection to include the Wait mode etc.
The SAP BW connection to the SAP BW system successfully passes the "Test Connection" in the Data Factory.
In the SSIS SAP BW connector the advanced properties display these values which I am trying to replicate (hope this image works):
So I added the "Custom Properties" to the code in the Connections -> linked Services->SapBw
{
  "name": "SapBw",
  "type": "Microsoft.DataFactory/factories/linkedservices",
  "properties": {
    "type": "SapBw",
    "typeProperties": {
      "server": "sapdb.compnme.local",
      "systemNumber": "00",
      "clientId": "400",
      "userName": "myUser",
      "encryptedCredential": "abc123"
    },
    "connectVia": {
      "referenceName": "ARuntime",
      "type": "IntegrationRuntimeReference"
    }
  },
  "Custom Properties": {
    "DbTableName": "/BIC/OHCSST_OHD",
    "DestinationName": "CSST_OHD",
    "ExecutionMode": "W",
    "GatewayHost": "sapdb.compnme.local",
    "GatewayService": "sapgw00",
    "ProcessChain": "Z_CS_STAT_OHD",
    "ProgramId": "ProgId_P23",
    "Timeout": "1200"
  }
}
Unfortunately, when I click "Finish" the connection is successfully published, but when I go to view the code my Custom Properties have disappeared. Is there a different process to connect to SAP Open Hub with Azure Data Factory? There does not appear to be anything on the MS website to guide me.
Your image attachment did not display correctly. Based on what I understand, I wonder if you have confused the ADF SSIS IR with the ADF self-hosted IR.
Because you used the BW connector in SSIS, it appears you built an SSIS package and deployed it to the ADF SSIS IR. That IR has nothing to do with the self-hosted IR, which is what the ADF Copy activity requires to read from SAP BW. You mentioned that you defined custom properties in the linked service, but the linked service only applies to ADF's native BW MDX connection interface; nothing you define in an ADF linked service will affect the SSIS IR. Also be aware that the ADF native BW interface is for MDX access only, i.e. querying BW InfoCube and BEx query data. It has nothing to do with Open Hub.
Tactically, you should apply the custom properties to the BW connection inside your SSIS package, but I suspect you may not be fully aware of the pros and cons of the SSIS BW connector, the ADF BW connector, Open Hub, and MDX. From real project experience, there are major robustness issues with the SSIS BW connector's integration with Open Hub and process chains: the DTP jobs inside the process chain can fail frequently, and "resetting" DTP jobs is a frustrating experience. I suggest you describe your requirement before spending too much energy on a connection property issue.
I did some work with a Microsoft contact; the process we wanted was to use an Open Hub connection in the Data Factory. This link to the Microsoft Azure Data Factory forum has a document that describes how to achieve this.
DataFactory Forum
Unfortunately this process didn't work for me because our SAP version is 4, when it needs to be at least 7.3.
I have a source of SAP BW Open Hub in Data Factory and a sink of Azure Data Lake Gen2, and am using a copy activity to move the data.
I am attempting to transfer the data to the lake and split it into numerous files, with 200,000 rows per file. I would also like to be able to prefix all of the filenames, e.g. 'cust_', so the files would be something along the lines of cust_1, cust_2, cust_3, etc.
This only seems to be an issue when using SAP BW Open Hub as a source (it works fine when using SQL Server as a source). Please see the warning message below. After checking with our internal SAP BW team, they assure me that the data is in a tabular format and no explicit partitioning is enabled, so there shouldn't be an issue.
When executing the copy activity, the files are transferred to the lake but the file name prefix setting is ignored, and the filenames instead are set automatically, as below (the name seems to be automatically made up of the SAP BW Open Hub table and the request ID):
Here is the source config:
All other properties on the other tabs are set to default and have been unchanged.
QUESTION: without using a data flow, is there any way to split the files when pulling from SAP BW Open Hub and also be able to dictate the filenames in the lake?
I tried to reproduce the issue, and it works fine with a workaround. Instead of splitting the data while copying from SAP BW to Azure Data Lake Storage, simply copy the data as-is (without partitioning) into an Azure SQL Database. Please follow Copy data from SAP Business Warehouse by using Azure Data Factory (make sure to use Azure SQL Database as the sink).
Now that the data is in your Azure SQL Database, you can use a second copy activity to copy it to Azure Data Lake Storage.
In source configuration, keep “Partition option” as None.
Source Config:
Sink config:
Output:
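For the second copy (Azure SQL Database to the lake), the row splitting and the filename prefix are configured on the copy activity sink rather than the source. A minimal sketch of the relevant sink fragment, assuming a delimited-text (CSV) sink on ADLS Gen2 and the row count and prefix from the question (property names should be verified against the current copy activity documentation):

"sink": {
  "type": "DelimitedTextSink",
  "storeSettings": {
    "type": "AzureBlobFSWriteSettings"
  },
  "formatSettings": {
    "type": "DelimitedTextWriteSettings",
    "fileExtension": ".csv",
    "maxRowsPerFile": 200000,
    "fileNamePrefix": "cust_"
  }
}

With settings along these lines, the files written to the lake pick up the cust_ prefix instead of the auto-generated Open Hub table/request names.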
I have an Azure Data Factory with a pipeline that I'm using to pick up data from an on-premises database and copy it to Cosmos DB in the cloud. I'm using a data flow step at the end to delete documents from the sink that don't exist in the source.
I have 3 integration runtimes set up:
AutoResolveIntegrationRuntime (default set up by Azure)
Self hosted integration runtime (I set this up to connect to the on-premise database so it's used by the source dataset)
Data flow integration runtime (I set this up to be used by the data flow step, with a TTL setting; see the sketch below)
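For reference, the data flow runtime here is an Azure (managed) integration runtime whose definition carries the TTL; a rough sketch, with illustrative name and values:

{
  "name": "DataFlowRuntime",
  "properties": {
    "type": "Managed",
    "typeProperties": {
      "computeProperties": {
        "location": "AutoResolve",
        "dataFlowProperties": {
          "computeType": "General",
          "coreCount": 8,
          "timeToLive": 15
        }
      }
    }
  }
}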
The issue I'm seeing is that when I trigger the pipeline, the AutoResolveIntegrationRuntime is the one being used, so I'm not getting the optimisation I need from the data flow integration runtime with the TTL.
Any thoughts on what might be going wrong here?
In my experience, only the AutoResolveIntegrationRuntime (the default set up by Azure) supports the optimization:
When we choose to run the data flow on a non-default integration runtime, the optimization isn't available:
And once the integration runtime is created, we can't change its settings:
The Data Factory documentation doesn't say much about this. When I ran the pipeline, I found that the data flow runtime wasn't used:
That means that no matter which integration runtime you used to connect to the dataset, the data flow will always use the default Azure integration runtime.
A self-hosted IR (SHIR) doesn't support data flow execution.
I am trying to automate the publishing of Power BI reports to different workspaces that act as Dev, Test and Prod environments. Using PowerShell commands, I am able to achieve this automation via Connect-PowerBIServiceAccount.
I am stuck on how to automate the mapping of datasources to the servers under the Gateway connection in the dataset's Settings.
Is there a PowerShell command or Power BI REST API that I can use to automate this process?
Probably your solution is here:
POST https://api.powerbi.com/v1.0/myorg/groups/{groupId}/datasets/{datasetId}/Default.BindToGateway
And in the request body, you need to specify the datasources:
{
  "gatewayObjectId": "1f69e798-5852-4fdd-ab01-33bb14b6e934",
  "datasourceObjectIds": [
    "dc2f2dac-e5e2-4c37-af76-2a0bc10f16cb",
    "3bfe5d33-ab7d-4d24-b0b5-e2bb8eb01cf5"
  ]
}
https://learn.microsoft.com/en-us/rest/api/power-bi/datasets/bindtogatewayingroup
As posted earlier in a thread about syncing data from on-premises MySQL to Azure SQL (referring to this article), I found that the lookup component for watermark detection is only available for SQL Server.
So I tried a workaround: while using the "Copy" data flow task, pick up the data from MySQL that is greater than the last stored watermark.
Issue:
I am able to validate the package successfully but not able to publish it.
Question:
In the Copy data flow task I'm using the query below to get data from MySQL that is greater than the available watermark.
Can't we use a query like the one below on other relational sources like MySQL?
select * from #{item().TABLE_NAME} where #{item().WaterMark_Column} > '#{activity('LookupOldWaterMark').output.firstRow.WatermarkValue}'
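For reference, in the pipeline JSON such a source query is normally passed as an expression; a minimal sketch of the copy activity, assuming illustrative dataset names and ADF's @{...} string-interpolation syntax:

{
  "name": "CopyFromMySqlIncremental",
  "type": "Copy",
  "inputs": [ { "referenceName": "MySqlSourceDataset", "type": "DatasetReference" } ],
  "outputs": [ { "referenceName": "AzureSqlSinkDataset", "type": "DatasetReference" } ],
  "typeProperties": {
    "source": {
      "type": "MySqlSource",
      "query": {
        "value": "select * from @{item().TABLE_NAME} where @{item().WaterMark_Column} > '@{activity('LookupOldWaterMark').output.firstRow.WatermarkValue}'",
        "type": "Expression"
      }
    },
    "sink": {
      "type": "AzureSqlSink"
    }
  }
}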
CopyTask SQL Query Preview
Validate Successfully
Error With no Details
Debug Successfully
Errors after following the steps mentioned by Franky:
Azure SQL linked service error (resolved by reconfiguring the connection / editing credentials in the connection tab)
Source query went blank (resolved by re-selecting the source type and rewriting the query)
Could you verify whether you have access to create a template deployment in the Azure portal?
1) Export the ARM template: in the top-right of the ADFv2 portal, click on ARM Template -> Export ARM Template, extract the zip file and copy the content of the "arm_template.json" file.
2) Create the ARM template deployment: Go to https://portal.azure.com/#create/Microsoft.Template and log in with the same credentials you use in the ADFv2 portal (you can also get to this page in the Azure portal by clicking "Create a resource" and searching for "Template deployment"). Now click on "Build your own template in the editor", paste the ARM template from the previous step into the editor and Save.
3) Deploy the template: Click on existing resource group and select the same resource group as the one where your Data Factory is. Fill out the parameters that are missing (for this test it doesn't really matter if the values are valid); the factory name should already be there. Agree to the terms and click Purchase.
4) Verify that the deployment succeeded. If not, let me know the error; it might be an access issue, which would explain why your publish fails. (The ADF team is working on providing a better error for this issue.)
Did any of the objects publish into your Data Factory?
In SLC ARC the list of connectors available via the UI (when creating datasources and thus generating models) was hard-coded (link to overview of the issue). Does the same hold true for API Connect?
Effectively, I'd like to create a fork of the mssql connector to address some issues with how schemas are processed when generating models from existing tables. If I create such a connector, will I be able to install it so that I can use it via the GUI (again, I could not via SLC ARC due to the hard-coding)? Any help is greatly appreciated!
EDIT: I've installed the loopback-connector-redis connector into a throwaway project. When I spin up APIC it does not appear on the data sources screen. So, rephrasing my question: are there settings or other means that would allow such connectors to be included? Ideally, APIC would scan my project, determine what I have installed, and expose those connectors.
As you've seen, the list is currently fixed and doesn't detect additional installed connectors.
If you want to use your own custom connector, create a new datasource using the API Designer, select the MSSQL connector and fill in the values per usual.
Next, you'll need to open a file on your system to tweak the connector target.
In your project directory, open ./server/datasources.json and you should see the datasource you just created. Then, just change the connector value to the name of the custom version you created, save, and continue developing your APIs like normal.
{
  "db": {
    "name": "db",
    "connector": "memory"
  },
  "DB2 Customers": {
    "host": "datbase.acme-air.com",
    "port": 50000,
    "database": "customers",
    "password": "",
    "name": "Customer DB",
    "connector": "db2-custom",
    "user": "mhamann#us.ibm.com"
  }
}
Unfortunately, you're now on your own in terms of managing datasources, as they won't show up in the Designer's datasource editor. They will still be usable in other parts of the Designer, so you can connect up your models, etc.
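One more note: the forked connector still has to be installed as a dependency of the project so that the connector name in datasources.json can be resolved to a module. A rough sketch of the package.json entry, assuming a hypothetical fork called loopback-connector-mssql-custom hosted on GitHub:

{
  "dependencies": {
    "loopback-connector-mssql-custom": "git+https://github.com/your-org/loopback-connector-mssql-custom.git"
  }
}

The value you put in the connector field then just needs to match a module LoopBack can require (either the full module name or, typically, the part after loopback-connector-); verify the resolution behaviour against your LoopBack version.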