Synapse suddenly started having a problem with the hash distribution column in a MERGE

I started getting an error on 06-25-2022 in my Fact table flows. Before that there was no problem, and nothing has changed.
The error is:
Operation on target Fact_XX failed: Operation on target Merge_XX failed: Execution fail against sql server. Sql error number: 100090. Error Message: Updating a distribution key column in a MERGE statement is not supported.

Sql error number: 100090. Error Message: Updating a distribution key column in a MERGE statement is not supported.
You got this error because updating a distribution key column through the MERGE command is not currently supported in Azure Synapse.
MERGE is currently in preview for Azure Synapse Analytics.
You can refer to the official documentation for more details:
https://learn.microsoft.com/en-us/sql/t-sql/statements/merge-transact-sql?view=azure-sqldw-latest&preserve-view=true
It clearly states that the MERGE command in Azure Synapse Analytics, which is presently in preview, may under certain conditions leave the target table in an inconsistent state, with rows placed in the wrong distribution, causing later queries to return wrong results in some cases.
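The usual workaround is to keep the hash-distributed column out of the UPDATE SET list and use it only for matching. The sketch below is only an illustration: the staging table and column names (Stage_XX, CustomerKey, Amount, LoadDate) are assumptions, not taken from the original flow.
-- Hedged sketch: CustomerKey stands in for the hash distribution column.
-- It appears only in the ON clause and the INSERT list, never in UPDATE SET,
-- so the MERGE no longer tries to update the distribution key.
MERGE INTO dbo.Fact_XX AS tgt
USING dbo.Stage_XX AS src
    ON tgt.CustomerKey = src.CustomerKey
WHEN MATCHED THEN
    UPDATE SET Amount   = src.Amount,     -- non-key columns only
               LoadDate = src.LoadDate
WHEN NOT MATCHED THEN
    INSERT (CustomerKey, Amount, LoadDate)
    VALUES (src.CustomerKey, src.Amount, src.LoadDate);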

Related

How to delete only the tables from the target database that are not in the source DB, using DACPAC or SqlPackage.exe arguments? [Azure DevOps]

I have tried multiple ways to delete only the tables from the target database that are not in the source database using a DACPAC.
Does anyone have a better suggestion or solution to keep the source and target DBs identical in terms of tables only?
A solution in any of these would help:
A DACPAC file
A SQL project in .NET
SqlPackage.exe arguments
I added these SqlPackage.exe arguments:
/p:DropObjectsNotInSource=true /p:BlockOnPossibleDataLoss=false /p:AllowDropBlockingAssemblies=true
I am facing these errors:
*** Could not deploy package.
Warning SQL72012:
Warning SQL72015:
Warning SQL72014:
Warning SQL72045:
These Stack Overflow links and similar ones didn't help me:
Deploy DACPAC with SqlPackage from Azure Pipeline is ignoring arguments and dropping users
Is it possible to exclude objects/object types from sqlpackage?
I am expecting my source and target DBs to end up with the same set of tables.

Azure Synapse exception while reading a table from Synapse DWH

While reading from a table I'm getting:
jdbc.SQLServerException: Create External Table As Select statement failed as the path ####### could not be used for export.
Error Code: 105005
jdbc.SQLServerException: Create External Table As Select statement failed as the path ####### could not be used for export. Error Code: 105005
This error occurs because PolyBase can't complete the operation. The failure can be due to the following reasons:
A network failure when you try to access Azure Blob storage.
The configuration of the Azure storage account.
You can fix this issue by following the first article below; it helps you resolve the problem that occurs when you run a CREATE EXTERNAL TABLE AS SELECT.
For more detail, please refer to the links below:
https://learn.microsoft.com/en-us/troubleshoot/sql/analytics-platform-system/error-cetas-to-blob-storage
https://www.sqlservercentral.com/articles/access-external-data-from-azure-synapse-analytics-using-polybase
https://knowledge.informatica.com/s/article/000175628?language=en_US
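As a point of reference while checking the storage configuration, this is roughly what a working CETAS setup against blob storage looks like. It is only a hedged sketch: every name here (BlobCred, BlobStore, ParquetFmt, dbo.Export_Sales, dbo.Sales, the container, account, and secret) is a placeholder, and a database master key is assumed to already exist.
-- Credential used by PolyBase to reach the storage account (placeholder secret).
CREATE DATABASE SCOPED CREDENTIAL BlobCred
WITH IDENTITY = 'user', SECRET = '<storage-account-key>';
-- External data source pointing at the blob container the export should land in.
CREATE EXTERNAL DATA SOURCE BlobStore
WITH ( TYPE = HADOOP,
       LOCATION = 'wasbs://<container>@<account>.blob.core.windows.net',
       CREDENTIAL = BlobCred );
CREATE EXTERNAL FILE FORMAT ParquetFmt
WITH ( FORMAT_TYPE = PARQUET );
-- The CETAS itself; error 105005 typically surfaces here when the
-- location or credential above is wrong or unreachable.
CREATE EXTERNAL TABLE dbo.Export_Sales
WITH ( LOCATION = '/export/sales/',
       DATA_SOURCE = BlobStore,
       FILE_FORMAT = ParquetFmt )
AS SELECT * FROM dbo.Sales;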

Azure Data Factory CICD error: The document creation or update failed because of invalid reference

All, when running a build pipeline using Azure DevOps with an ARM template, the process is consistently failing when trying to deploy a dataset or a reference to a dataset, with this error:
ARM Template deployment: Resource Group scope (AzureResourceManagerTemplateDeployment)
BadRequest: The document creation or update failed because of invalid reference 'dataset_1'.
I've tried renaming the dataset and also recreating it to see if that would help.
I then deleted the dataset_1.json file from the repo and still get the same message, so I think it's some reference to this dataset and not the dataset itself. I've looked through all the other files for references to it, but they all look fine.
Any ideas on how to troubleshoot this?
thanks
Try this:
It looks like you have created the 'myTestLinkedService' linked service and tested the connection, but you haven't published it yet and are trying to reference that linked service in the new dataset that you are trying to create using PowerShell.
In order to reference any Data Factory entity from PowerShell, please make sure those entities are published first. Please try publishing the linked service first from the portal, and then run your PowerShell script to create the new dataset/activity.
I think I found the issue. When I went into the detailed logs, I found that in addition to this error there was an error message about an invalid SQL connection string, so I thought it might be related, since the dataset in question uses an Azure SQL Database linked service.
I adjusted the connection string and this seems to have solved the issue.

SonarQube - Cannot insert duplicate key in object 'dbo.ce_activity'

Currently, we are using SonarQube 8.8.
Our Azure DevOps builds that use SonarQube had been running fine for a while with no issues. Recently our builds have been hanging on the “publish quality gate result” step. When looking at the logs, we found that we receive this error:
Error updating database. Cause: com.microsoft.sqlserver.jdbc.SQLServerException: Violation of PRIMARY KEY constraint 'pk_ce_activity'. Cannot insert duplicate key in object 'dbo.ce_activity'. The duplicate key value is ....
It looks like there is something in our pipeline that is trying to use a UUID that is already in our Microsoft SQL Server database.
Any ideas on how to mitigate this issue?
This looks like a SQL Server database issue.
You can reference the following articles to try solving the issue:
https://support.esri.com/en/technical-article/000016425
Violation of PRIMARY KEY constraint. Cannot insert duplicate key in object
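As a hedged starting point, you could check whether the colliding row already exists before deciding how to clean it up. The query below assumes the primary key column of ce_activity is named uuid, which is an assumption about the SonarQube schema rather than something confirmed here:
-- Diagnostic sketch only; 'uuid' as the key column is an assumption,
-- and the value comes from the duplicate key reported in the error message.
SELECT *
FROM dbo.ce_activity
WHERE uuid = '<duplicate key value from the error>';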

External Table on DELTA format files in ADLS Gen 1

We have a number of Databricks Delta tables created on ADLS Gen1, and there are also external tables built on top of each of those tables in one of the Databricks workspaces.
Similarly, I am trying to create the same sort of external tables on the same Delta format files, but in a different workspace.
I have read-only access via a service principal on ADLS Gen1, so I can read the Delta files through Spark DataFrames, as given below:
read_data_df = spark.read.format("delta").load('dbfs:/mnt/data/<foldername>')
I am even able to create Hive external tables, but I see the following warning while reading data from the same table:
Error in SQL statement: AnalysisException: Incompatible format detected.
A transaction log for Databricks Delta was found at `dbfs:/mnt/data/<foldername>/_delta_log`,
but you are trying to read from `dbfs:/mnt/data/<foldername>` using format("hive"). You must use
'format("delta")' when reading and writing to a delta table.
To disable this check, SET spark.databricks.delta.formatCheck.enabled=false
To learn more about Delta, see https://learn.microsoft.com/azure/databricks/delta/index
;
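For context, registering the table with the Delta format, as the warning suggests, would look roughly like the sketch below; the database and table names are placeholders, and only the mount path comes from the question.
-- Hedged sketch: register an external Delta table over the existing files.
-- my_db and my_delta_table are placeholder names.
CREATE TABLE IF NOT EXISTS my_db.my_delta_table
USING DELTA
LOCATION 'dbfs:/mnt/data/<foldername>';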
If I create the external table 'USING DELTA', then I see a different access error:
Caused by: org.apache.hadoop.security.AccessControlException: OPEN failed with error 0x83090aa2 (Forbidden. ACL verification failed. Either the resource does not exist or the user is not authorized to perform the requested operation.).
Does it mean that I would need full access, rather than just read-only, on the underlying file system?
Thanks
Resolved after upgrading the Databricks Runtime environment to runtime version DBR 7.3.