Azure PolyBase external data source creation error - T-SQL

I am trying to create an external table in Synapse Analytics, but I am getting an error while creating the external data source.
Below is the code:
CREATE MASTER KEY ENCRYPTION BY PASSWORD = 'xxxxxxxxxxxx';   -- executed

CREATE DATABASE SCOPED CREDENTIAL storageCred WITH           -- executed
    IDENTITY = 'SHARED ACCESS SIGNATURE',
    SECRET = 'xxxxxxxxxxxxxx';

CREATE EXTERNAL DATA SOURCE adls WITH                        -- execution failed
(   TYPE = HADOOP,
    LOCATION = 'abfss://staging#devedw2021.dfs.core.windows.net',
    CREDENTIAL = storageCred
)

The syntax looks right for the external data source, but the problem may be with the database scoped credential. I spent a LOT of time on this, and the only way I could get this to work was with the storage account name and key:
CREATE DATABASE SCOPED CREDENTIAL CausewayAdlsCredentials
WITH
IDENTITY = '<storage_account_name>' ,
SECRET = '<storage_account_key>'
;
One word of warning: beware the documentation. There are several different locations that discuss this problem, and they have conflicting messaging or refer to old versions. This one is OK, but only section C worked for me.
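Putting the two pieces together, here is a minimal sketch of the full sequence using the account name/key credential (the container and account names echo the question; note that an abfss location takes the form 'abfss://<container>@<account>.dfs.core.windows.net'):

CREATE MASTER KEY ENCRYPTION BY PASSWORD = 'xxxxxxxxxxxx';

-- Credential based on the storage account name and key, as above.
CREATE DATABASE SCOPED CREDENTIAL CausewayAdlsCredentials
WITH
    IDENTITY = '<storage_account_name>',
    SECRET   = '<storage_account_key>';

-- External data source pointing at the ADLS Gen2 container.
CREATE EXTERNAL DATA SOURCE adls
WITH
(   TYPE = HADOOP,
    LOCATION = 'abfss://staging@devedw2021.dfs.core.windows.net',
    CREDENTIAL = CausewayAdlsCredentials
);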

Related

External Table on DELTA format files in ADLS Gen 1

We have a number of Databricks DELTA tables created on ADLS Gen1, and there are also external tables built on top of each of those tables in one of the Databricks workspaces.
Similarly, I am trying to create the same sort of external tables on the same DELTA-format files, but in a different workspace.
I have read-only access via a service principal on ADLS Gen1, so I can read the DELTA files through Spark DataFrames, as shown below:
read_data_df = spark.read.format("delta").load('dbfs:/mnt/data/<foldername>')
I am even able to create Hive external tables, but I see the following error while reading data from the same table:
Error in SQL statement: AnalysisException: Incompatible format detected.
A transaction log for Databricks Delta was found at `dbfs:/mnt/data/<foldername>/_delta_log`,
but you are trying to read from `dbfs:/mnt/data/<foldername>` using format("hive"). You must use
'format("delta")' when reading and writing to a delta table.
To disable this check, SET spark.databricks.delta.formatCheck.enabled=false
To learn more about Delta, see https://learn.microsoft.com/azure/databricks/delta/index
;
If I create the external table 'using DELTA', then I see a different access error:
Caused by: org.apache.hadoop.security.AccessControlException:
OPEN failed with error 0x83090aa2 (Forbidden. ACL verification failed.
Either the resource does not exist or the user is not authorized to perform the requested operation.).
Does it mean that I would need full access, rather than just read-only access, on the underlying file system?
Thanks
Resolved after upgrading the Databricks Runtime environment to version DBR 7.3.
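For reference, a minimal sketch of the external table definition that worked after the runtime upgrade, using the mount path from the question (the table name my_delta_table is a placeholder):

# Register an external table over the existing Delta files; the table name is hypothetical.
spark.sql("""
    CREATE TABLE IF NOT EXISTS my_delta_table
    USING DELTA
    LOCATION 'dbfs:/mnt/data/<foldername>'
""")

# Reads then go through the Delta reader instead of format("hive").
df = spark.table("my_delta_table")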

How NOT to create an azurerm_mssql_database_extended_auditing_policy

I'm trying to deploy my infrastructure with Terraform.
I have an MSSQL server and database and I'm using azurerm 2.32.
While deploying MSSQL I'm getting the following error:
Error: issuing create/update request for SQL Server "itan-mssql-server" Blob Auditing Policies(Resource Group "itan-west-europe-resource-group"): sql.ExtendedServerBlobAuditingPoliciesClient#CreateOrUpdate: Failure sending request: StatusCode=400 -- Original Error: Code="DataSecurityInvalidUserSuppliedParameter" Message="Invalid parameter 'storageEndpoint'. Value should be a blob storage endpoint (e.g. https://MyAccount.blob.core.windows.net)."
I have already tried:
defining extended_auditing_policy at the database level - failed
defining extended_auditing_policy at the server level - failed
defining azurerm_mssql_database_extended_auditing_policy at the root level - failed
leaving extended_auditing_policy empty - failed
The root-level definition looks like this (copy-pasted from the Terraform documentation, adjusted to my project):
resource "azurerm_mssql_database_extended_auditing_policy" "db-policy" {
database_id = azurerm_mssql_database.itan-mssql-database.id
storage_endpoint = azurerm_storage_account.itan_storage_account.primary_blob_endpoint
storage_account_access_key = azurerm_storage_account.itan_storage_account.primary_access_key
storage_account_access_key_is_secondary = false
retention_in_days = 1
depends_on = [
azurerm_mssql_database.itan-mssql-database,
azurerm_storage_account.itan_storage_account]
}
I'm looking for one of two possible solutions:
totally disabling audits (I don't really need them now)
fixing error and enabling the audit
Thanks!
Jarek
This is caused by a breaking change in the SQL Extended Auditing Settings API. Please also check this issue in the Terraform provider.
As a workaround you may try calling an ARM template from Terraform. However, I'm not sure whether under the hood they use the same or a different API.
The workaround that seems to be working for me is this. I followed the tip by ddarwent from GitHub:
https://github.com/terraform-providers/terraform-provider-azurerm/issues/8915#issuecomment-711029508
So basically it's like this:
terraform apply
Go to terraform.tfstate and delete the "tainted" MSSQL server
terraform apply
Go to terraform.tfstate and delete the "tainted" MSSQL database
terraform apply
It looks like all my stuff is up and working.
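For reference, a rough scripted equivalent of the manual terraform.tfstate editing above is to remove the tainted resources from state with terraform state rm and re-apply. The server address below is a guess (only the database address appears in the config above):

# Sketch only: resource addresses are assumptions based on the config shown earlier.
terraform apply
terraform state rm azurerm_mssql_server.itan-mssql-server       # hypothetical resource address
terraform apply
terraform state rm azurerm_mssql_database.itan-mssql-database
terraform apply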

AWS MobileAnalyticsManager access to folder 'AWS Mobile Services\M4SP' is denied

I am trying to add the AWSSDK DLL into my C# code to collect my event data and pass the data to the AWS bucket. My C# code is created with the VS SharePoint template, and the project contains WSP files. The following code shows how I use the AWSSDK:
using Amazon;
using Amazon.CognitoIdentity;
using Amazon.MobileAnalytics.MobileAnalyticsManager;
CognitoAWSCredentials credentials = new CognitoAWSCredentials(
    "us-east-1:xxxxxx",    // Pool ID
    RegionEndpoint.USEast1
);

Amazon.AWSConfigs.ApplicationName = "M4SP";
AWSConfigs.LoggingConfig.LogMetrics = true;
AWSConfigs.LoggingConfig.LogResponses = ResponseLoggingOption.Always;
AWSConfigs.LoggingConfig.LogMetricsFormat = LogMetricsFormatOption.JSON;

MobileAnalyticsManager manager = MobileAnalyticsManager.GetOrCreateInstance(
    "xxxxxxxxxxxxxxxxxxx", // App ID
    credentials,
    RegionEndpoint.USEast1 // Region
);

CustomEvent customEvent = new CustomEvent("TestRecordEvent");
customEvent.AddAttribute("label", "M4SP");
customEvent.AddAttribute("action", "invoke");
customEvent.AddAttribute("details", "run the workflow test");
manager.RecordEvent(customEvent);
I found that the code inside the AWSSDK DLL tries to log the data to a local folder before passing it to the AWS database. The location of the folder is C:\Users\[userid]\AppData\Roaming\AWS Mobile Services.
There is no problem in a standalone project, since it always uses the current user's identity to run the application and therefore has access to the folder. But because of the authentication mechanism of SharePoint solutions, it uses the Application Pool Identity to access the folder, gets an access-denied error, and the whole process fails.
Here is the error:
"Access to the path 'AWS Mobile Services\M4SP' is denied."
I modified the access rights for the SharePoint Application Pool Identity (in my case, the "Network Service" account), but it still can't access the folder.
Does anyone have a solution for this issue? Thanks very much for the help!!

How to share information across notebooks in a DSX project

Is it possible to share information (such as credentials) across multiple notebooks in a DSX project, e.g. with environment variables?
For example, a Cloud Foundry application in Bluemix has a control setting where environment variables can be defined. Is there a similar concept for a DSX project? (I couldn't see anything in the various project-level settings.)
Separate notebooks have separate runtimes in the background, and at the moment it is not possible to share credentials among notebooks by defining environment variables. But there are helper methods for the most obvious credential requirements in a project, via the "Insert to code" feature.
For example, if you have an object store associated with your project:
Select the "Data" tab in the top bar.
Add some file to the object store by browsing or simple drag-and-drop.
Insert the credentials of that object store container into your notebook by selecting the "Insert credentials" option, right beside your file in the right-hand side panel.
You can then directly insert those credentials (step 3) into any other notebook in that project.
Besides "Insert to code" there are other helper functions like "Insert SparkR dataframe", "Pandas dataframe" etc. to speed up the analytics process of data scientists. Hope that was a bit helpful.
FYI - I've added a feature request on UserVoice to allow Bluemix services to be bound to a project so that the credentials can be accessed the same way a Bluemix application accesses credentials. Please vote if you think this would be useful.
Currently, one pattern I use quite a lot is to create a notebook in my project that is used to save credentials to a file on DSX:
! echo '{ "username": "xxxx", "password": "xxxx", ... }' > cloudant_creds.json
That file is now available to all of your notebooks in the project. NOTE: the file is saved on the Spark service file system. If you use the same Spark service in other DSX projects, they will also be able to access the file.
The credentials for Cloudant normally include other fields such as host; I haven't shown these fields here to keep the example simple, and have indicated that there are more fields with the "...". I normally copy this JSON from the Bluemix service credentials field.
In your other notebooks, you would read the credentials something like this:
import json

with open('cloudant_creds.json') as data_file:
    sourceDB = json.load(data_file)
You can then refer to the credentials like this (note that json.load returns a dict, so the fields are accessed by key):
dfReader = sqlContext.read.format("com.cloudant.spark")
dfReader.option("cloudant.host", sourceDB['host'])
if sourceDB.get('username'):
    dfReader.option("cloudant.username", sourceDB['username'])
if sourceDB.get('password'):
    dfReader.option("cloudant.password", sourceDB['password'])
df = dfReader.load(sourceDB['database']).cache()

How to insert data into my SQLDB service instance (Bluemix)

I have created an SQLDB service instance and bound it to my application. I have created some tables and need to load data into them. If I write an INSERT statement into RUN DDL, I receive a SQL -104 error. How can I run INSERT statements against my SQLDB service instance?
If you need to run your SQL from an application, there are several examples (sample code included) of how to accomplish this at the site listed below:
http://www.ng.bluemix.net/docs/services/SQLDB/index.html#run-a-query-in-java
Additionally, you can execute SQL in the SQL Database Console by navigating to Manage -> Work with Database Objects. More information can be found here:
http://www.ng.bluemix.net/docs/services/SQLDB/index.html#sqldb_005
s.executeUpdate("CREATE TABLE MYLIBRARY.MYTABLE (NAME VARCHAR(20), ID INTEGER)");
s.executeUpdate("INSERT INTO MYLIBRARY.MYTABLE (NAME, ID) VALUES ('BlueMix', 123)");
Full Code
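The snippet above assumes an already-open java.sql.Statement s. A minimal sketch of the surrounding JDBC setup, with placeholder connection details (in practice the host, user, and password come from the bound SQLDB service's credentials), might look like this:

// Sketch only: connection details are placeholders, and the DB2 JDBC driver
// must be on the classpath.
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class InsertExample {
    public static void main(String[] args) throws Exception {
        String url = "jdbc:db2://<host>:50000/SQLDB";   // from the service credentials
        try (Connection conn = DriverManager.getConnection(url, "<user>", "<password>");
             Statement s = conn.createStatement()) {
            s.executeUpdate("CREATE TABLE MYLIBRARY.MYTABLE (NAME VARCHAR(20), ID INTEGER)");
            s.executeUpdate("INSERT INTO MYLIBRARY.MYTABLE (NAME, ID) VALUES ('BlueMix', 123)");
        }
    }
}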
Most people do initial database population or migrations when they deploy their application. Often these database commands are programming-language specific. The poster didn't include the programming language. You can accomplish this in two ways.
Append a bash script that would call your database scripts that you uploaded. This project shows how you can call that bash script from within your manifest file as part of doing a CF Push.
Some languages or frameworks offer a file type or service that is automatically used to populate the database on initial deploy or when you migrate/sync the DB. For example, Python's Django offers a "fixtures" file that will automatically take a JSON file and populate your database tables.
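As an illustration of that last point, a Django fixture is just a JSON file of model instances, e.g. myapp/fixtures/initial_data.json (the app, model, and field names here are hypothetical):

[
    { "model": "myapp.book", "pk": 1, "fields": { "name": "BlueMix", "code": 123 } }
]

It is then loaded during deployment (or from a deploy script) with:

python manage.py loaddata initial_data.json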