Post-load scripts for PostgreSQL database - postgresql

We successfully migrated data with Azure Data Factory to an Azure Virtual Machine that hosts a PostgreSQL database. We now need to run some post-load scripts on this database, such as creating views, creating indexes, and so on.
For a regular SQL database I would put the scripts into a stored procedure and trigger it from Azure Data Factory.
What is the best way to trigger these scripts for PostgreSQL, also from Azure Data Factory?

The ADF Stored Procedure activity does not support PostgreSQL, so I think you should use an Azure Function as a workaround. You can always invoke the Azure Function from ADF.
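As a minimal sketch of that workaround (my own illustration, not part of the original answer): an HTTP-triggered Python Azure Function that runs the post-load statements with psycopg2. The connection-string app setting and the statements themselves are placeholders.

    import os
    import azure.functions as func
    import psycopg2

    # Placeholder post-load statements; substitute your real views/indexes.
    POST_LOAD_STATEMENTS = [
        "CREATE OR REPLACE VIEW my_view AS SELECT * FROM my_table",
        "CREATE INDEX IF NOT EXISTS idx_my_table_col ON my_table (col)",
    ]

    def main(req: func.HttpRequest) -> func.HttpResponse:
        # Connection string kept in the Function App settings, pointing at
        # the PostgreSQL instance on the VM.
        conn = psycopg2.connect(os.environ["PG_CONNECTION_STRING"])
        try:
            with conn, conn.cursor() as cur:
                for statement in POST_LOAD_STATEMENTS:
                    cur.execute(statement)
        finally:
            conn.close()
        return func.HttpResponse("Post-load scripts executed.", status_code=200)

In the pipeline, call this with an Azure Function activity (or a Web activity) after the copy step.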

Related

How to connect VectorWise Database to Azure Data Factory or Azure Synapse?

I have a VectorWise database that contains multiple large tables, and I need to copy those tables into Azure Storage, Azure SQL DB, or Synapse. There is no direct connector for VectorWise in Azure Data Factory.
Is there any way to connect to VectorWise in ADF by creating an API or something similar?
You will need to set up a virtual machine on a network that can reach your database, install the self-hosted integration runtime on it, install the proper ODBC driver from your database vendor, and then create an ODBC linked service.
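As an illustration beyond the original answer: once the driver is installed on the integration runtime machine, a few lines of Python with pyodbc can sanity-check the connection before you create the linked service. The DSN name, credentials, and test query below are placeholders.

    import pyodbc

    # Placeholder DSN and credentials for the vendor's ODBC driver.
    conn = pyodbc.connect("DSN=vectorwise_dsn;UID=user;PWD=secret")
    cursor = conn.cursor()
    # Any cheap query proves the driver and the network path both work.
    for row in cursor.execute("SELECT 1"):
        print(row)
    conn.close()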

Execute PostgreSQL stored procedure in Azure Data Factory

I have a requirement to execute an Azure PostgreSQL stored procedure from Azure Data Factory. Is there any way to do this? I found that the Stored Procedure activity only supports SQL procedures.
Basically I have some staging tables in PostgreSQL and I want to load data into target tables (also in PostgreSQL). I have written some stored procedures to apply transformations and load the data, and I want to run those stored procedures from ADF. Is there any other suggested option to achieve this? TIA!
There is a way to execute a PostgreSQL function from Azure Synapse Analytics or Azure Data Factory: use a Lookup activity with a query such as "select function-name()".
Correct, the Stored Procedure activity doesn't support Azure PostgreSQL stored procedures. It only supports Azure SQL Database, Azure Synapse Analytics, and SQL Server. So you can try delegating the call to PostgreSQL with an Azure Function, as Joel said.
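To make the Lookup trick concrete: the Lookup activity needs a result set, so one option is to wrap the procedure in a function that returns at least one row. A hypothetical sketch (function and procedure names are placeholders), driven from Python with psycopg2 just to show the SQL:

    import psycopg2

    # Hypothetical wrapper: runs the transformation procedure and returns a
    # scalar so ADF's Lookup activity gets a row back. Note that CALL from
    # inside a function works only if the procedure does no transaction control.
    WRAPPER_DDL = """
    CREATE OR REPLACE FUNCTION run_staging_load()
    RETURNS int AS $$
    BEGIN
        CALL load_target_tables();
        RETURN 1;
    END;
    $$ LANGUAGE plpgsql;
    """

    conn = psycopg2.connect("host=myserver dbname=mydb user=me password=secret")
    with conn, conn.cursor() as cur:
        cur.execute(WRAPPER_DDL)
        # This is the query you would put in the ADF Lookup activity:
        cur.execute("SELECT run_staging_load()")
        print(cur.fetchone())
    conn.close()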

Is it possible to copy GeoJson data into PostGIS with Azure Data Factory?

I am looking into whether it is possible to transition from Airflow to Azure Data Factory.
I have a REST API from which I extract GeoJSON, and I would like to export this to a Postgres database with PostGIS. I tried to do this with the Copy Data activity, but that only provides a simple mapping between the GeoJSON fields and similar fields in my table.
Normally I would use ogr2ogr for this, but I am not sure how to approach it with Azure Data Factory.
Does anyone know if my use case is possible? If yes, how would you suggest doing it?
I solved my own question. I created an Azure Function which runs Python in a custom Docker container (one of the options in Azure Functions). I installed GDAL in the standard Azure Functions Python Docker image and use subprocess.run() to execute ogr2ogr with the parameters I pass to it via the body of the Azure Function's POST request. I can run this Azure Function from Azure Data Factory.
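A rough sketch of how such a function body can look (the request-body field names here are my own invention, and it assumes ogr2ogr is on the container's PATH):

    import subprocess
    import azure.functions as func

    def main(req: func.HttpRequest) -> func.HttpResponse:
        body = req.get_json()
        # Hypothetical body fields: source GeoJSON, PostGIS connection, target table.
        src = body["source"]
        pg = body["pg_connection"]  # e.g. "PG:host=... dbname=... user=... password=..."
        table = body["table"]

        # Shell out to ogr2ogr exactly as on the command line.
        result = subprocess.run(
            ["ogr2ogr", "-f", "PostgreSQL", pg, src, "-nln", table],
            capture_output=True, text=True,
        )
        if result.returncode != 0:
            return func.HttpResponse(result.stderr, status_code=500)
        return func.HttpResponse("Loaded %s into %s" % (src, table), status_code=200)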
Hope this can help anyone else searching for a similar approach.

Connect to Azure SQL Database from Databricks Notebook

I want to load data from Azure Blob Storage into Azure SQL Database using a Databricks notebook. Could anyone help me with this?
I'm new to this, so I cannot comment, but why use Databricks for this? It would be much easier and cheaper to use Azure Data Factory.
https://learn.microsoft.com/en-us/azure/data-factory/tutorial-copy-data-dot-net
If you really need to use Databricks, you would need to either mount your Blob Storage account, or access it directly from your Databricks notebook or JAR, as described in the documentation (https://docs.azuredatabricks.net/spark/latest/data-sources/azure/azure-storage.html).
You can then read the files into DataFrames for whatever format they are in, and use the SQL JDBC connector to create a connection for writing the data to SQL (https://docs.azuredatabricks.net/spark/latest/data-sources/sql-databases.html).
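Putting those two steps together, a minimal notebook sketch (the storage account, container, and JDBC details are placeholders; spark and dbutils are the notebook's built-in globals):

    # Give Spark the storage account key so it can read wasbs:// paths.
    spark.conf.set(
        "fs.azure.account.key.mystorageaccount.blob.core.windows.net",
        dbutils.secrets.get(scope="my-scope", key="storage-key"),
    )

    # Read the source files (CSV here; use the json/parquet readers as appropriate).
    df = (spark.read
          .option("header", "true")
          .csv("wasbs://mycontainer@mystorageaccount.blob.core.windows.net/path/"))

    # Write to Azure SQL Database over JDBC.
    jdbc_url = "jdbc:sqlserver://myserver.database.windows.net:1433;database=mydb"
    (df.write
       .format("jdbc")
       .option("url", jdbc_url)
       .option("dbtable", "dbo.MyTable")
       .option("user", "myuser")
       .option("password", dbutils.secrets.get(scope="my-scope", key="sql-password"))
       .mode("append")
       .save())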

How to schedule U-SQL procedure in ADF?

I created a stored procedure using U-SQL in Azure Data Lake. I want to schedule that stored procedure in Azure Data Factory.
Is it possible?
I've tried the following steps:
I created a stored procedure using U-SQL in Azure Data Lake.
I've created a script which executes the same procedure.
Now, I am trying to run that U-SQL script from ADF.
Is this the right way to execute a U-SQL stored procedure?
Yes, if you want to schedule the execution of a U-SQL stored procedure/script, you should be able to run the script by using the "DataLakeAnalyticsU-SQL" activity type in ADF.
You can schedule U-SQL activities in ADF. You can find detailed documentation below:
https://learn.microsoft.com/en-us/azure/data-factory/data-factory-usql-activity
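For illustration (not from the original answer), the same activity can also be created programmatically with the azure-mgmt-datafactory Python SDK. All names below are placeholders, and the exact parameter shapes vary somewhat between SDK versions:

    from azure.identity import DefaultAzureCredential
    from azure.mgmt.datafactory import DataFactoryManagementClient
    from azure.mgmt.datafactory import models

    client = DataFactoryManagementClient(DefaultAzureCredential(), "<subscription-id>")

    # U-SQL activity pointing at a script that calls the stored procedure.
    usql_activity = models.DataLakeAnalyticsUSQLActivity(
        name="RunUSqlScript",
        script_path="scripts/call_my_procedure.usql",
        script_linked_service=models.LinkedServiceReference(
            reference_name="AdlsStoreLinkedService", type="LinkedServiceReference"),
        # Linked service for the Data Lake Analytics account that runs the job.
        linked_service_name=models.LinkedServiceReference(
            reference_name="AdlaLinkedService", type="LinkedServiceReference"),
        degree_of_parallelism=3,
    )

    client.pipelines.create_or_update(
        "<resource-group>", "<factory-name>", "RunUSqlPipeline",
        models.PipelineResource(activities=[usql_activity]),
    )

A schedule trigger attached to the pipeline then runs the procedure on whatever cadence you need.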