Azure Data Factory: PostgreSQL to Blob Storage when PostgreSQL is in a private subnet (Azure)

I want to create a copy activity in Azure Data Factory from PostgreSQL to Azure Blob Storage.
My VM (running PostgreSQL) is in a private subnet in Azure.
So my question is: is it possible to create a pipeline from a VM that is in a private subnet?
Update
This is the current situation:
I have created a private endpoint postgresql-2-data-storage, and now I want to connect Data Factory to my VM, which is in an Azure VNet with the private IP address 172.16.101.4.
When I click Create new linked service, I don't see an option for the Azure resource (VNet) or the private endpoint.

AFAIK, to access the database from on-premises or from an Azure private network, you need to configure a self-hosted integration runtime (SHIR) to connect to it.
Using Azure Private Link, you can connect to various platform-as-a-service (PaaS) deployments in Azure via a private endpoint. To access data from a private network, you need to create a private endpoint on Azure Data Factory and add that endpoint to the same virtual network where your VM is present.
Go to your ADF settings >> Networking >> Private endpoint connections >> Private endpoint.
Then fill in all the details and configure it. After this, install the SHIR on your VM and connect your PostgreSQL to Data Factory.
Follow the document Install Self-Hosted Integration Runtime on an Azure VM by using a Private Endpoint for more information.
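As a rough sketch of the runtime side, assuming the Az PowerShell module and placeholder names (myResourceGroup, myDataFactory and myShir are not from the question), the SHIR definition and the key you paste into the installer on the VM can be created like this:

# Create a self-hosted integration runtime definition in the data factory
Set-AzDataFactoryV2IntegrationRuntime -ResourceGroupName "myResourceGroup" `
    -DataFactoryName "myDataFactory" `
    -Name "myShir" `
    -Type SelfHosted `
    -Description "SHIR on the VM next to PostgreSQL"

# Retrieve the authentication key used to register the SHIR installer running on the VM
Get-AzDataFactoryV2IntegrationRuntimeKey -ResourceGroupName "myResourceGroup" `
    -DataFactoryName "myDataFactory" `
    -Name "myShir"

After the node is registered, the PostgreSQL linked service should point at this runtime via its "Connect via integration runtime" setting.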

Related

How to create a DigitalOcean service endpoint connection when using the DigitalOcean Tools task in Azure DevOps

I am trying to upload a build APK file to DigitalOcean using Azure DevOps.
In Azure DevOps, there is a task called DigitalOcean Tools that we can use to upload files to DigitalOcean. Below is the link for your reference:
https://marketplace.visualstudio.com/items?itemName=marcelo-formentao.digitalocean-tools&ssr=false#overview
I installed that task in my organization.
First it asks to create a DigitalOcean connection using a service endpoint in Azure DevOps.
I searched the service endpoints in Azure DevOps and didn't find a service connection for DigitalOcean (I did find GitLab, SSH, Azure, etc.).
My question is: which service connector do I need to use for DigitalOcean?
Please help me with this.
DigitalOcean connection: it's based on the AWS configuration (only an Access Key ID and Secret Access Key are required).
You can choose the AWS connection type to create the DigitalOcean service endpoint connection in Azure DevOps, and fill in the Access Key ID and Secret Access Key that you get from your DigitalOcean account.
Result
UPDATE
If you don't find the AWS endpoint connection, install the AWS Toolkit for Azure DevOps extension from the marketplace. URL: AWS Toolkit for Azure DevOps
After installing the extension, you will see the AWS connection type among your service connections. I tested this in my DevOps organization and it works.
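If you want to sanity-check the same key pair outside the task, DigitalOcean Spaces is S3-compatible, so a rough sketch with the AWS CLI from PowerShell (the region, Space name, and file path here are made-up examples) would be:

# DigitalOcean Spaces credentials, supplied the same way as AWS keys
$env:AWS_ACCESS_KEY_ID     = "<your-spaces-access-key>"
$env:AWS_SECRET_ACCESS_KEY = "<your-spaces-secret-key>"

# Upload the APK to a Space called my-space, pointing the CLI at the Spaces endpoint
aws s3 cp .\app-release.apk s3://my-space/builds/app-release.apk `
    --endpoint-url https://nyc3.digitaloceanspaces.com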

Error connecting to Azure Data Lake in Azure Data Factory

I am trying to create a linked service in Azure Data Factory to an Azure Data Lake Storage Gen2 data store. Below is my linked service configuration:
I get the following error message when I test the connection:
Error code 24200 Details ADLS Gen2 operation failed for: Storage operation '' on container 'testconnection' get failed with 'Operation returned an invalid status code 'Forbidden''. Possible root causes: (1). It's possible because some IP address ranges of Azure Data Factory are not allowed by your Azure Storage firewall settings. Azure Data Factory IP ranges please refer https://learn.microsoft.com/en-us/azure/data-factory/azure-integration-runtime-ip-addresses.
I have found a very similar question here, but I'm not using Managed Identity as my authentication method. Perhaps I should be using that method. How can I overcome this error?
I tried to create a linked service to my Azure Data Lake Storage, and when I tested its connection, it gave me the same error:
Error code 24200 Details ADLS Gen2 operation failed for: Storage operation '' on container 'testconnection' get failed with 'Operation returned an invalid status code 'Forbidden''. Possible root causes: (1). It's possible because some IP address ranges of Azure Data Factory are not allowed by your Azure Storage firewall settings. Azure Data Factory IP ranges please refer https://learn.microsoft.com/en-us/azure/data-factory/azure-integration-runtime-ip-addresses
As indicated by the possible root causes in the error details, this occurred because of the Azure Data Lake Storage account's firewall settings.
Navigate to your data lake storage account and go to Networking -> Firewalls and virtual networks.
Here, when public network access is either disabled or enabled only from selected virtual networks and IP addresses, the linked service creation fails with the error message above.
Change it to Enabled from all networks, save the changes, and try creating the linked service again.
Now when we test the connection before creating the linked service, it succeeds, and we can proceed to create it.
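If you prefer to make the same change from PowerShell rather than the portal, a minimal sketch with the Az.Storage module (the resource group and account names are placeholders) would be:

# Allow traffic from all networks on the storage account's firewall
Update-AzStorageAccountNetworkRuleSet -ResourceGroupName "myResourceGroup" `
    -Name "mydatalake" `
    -DefaultAction Allow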
UPDATE:
If your data lake storage account only has public network access enabled from selected virtual networks and IP addresses, you can still create a successful connection via the linked service using the following approach.
Assuming your data lake storage has public network access enabled from selected virtual networks and IP addresses, first create an integration runtime in your Azure Data Factory.
In your Data Factory Studio, navigate to Manage -> Integration runtimes -> New. Select Azure, Self-Hosted as the type of integration runtime.
Select Azure in the next window, click Continue, and enter the details for the integration runtime.
In the Virtual network tab, enable the virtual network configuration and check the interactive authoring checkbox.
Now continue to create the integration runtime. Once it is up and running, start creating the linked service for the data lake storage.
In Connect via integration runtime, select the IR created above. To complete the creation, we also need to create a managed private endpoint (you will be prompted, as shown in the image below).
Click Create new, set the account selection method to From Azure subscription, select the data lake storage you are creating the linked service for, and click Create.
Once you create this, a private endpoint request is sent to your data lake storage account. Open the storage account and navigate to Networking -> Private endpoint connections. You will see a pending request; approve it.
Once this is approved, you can successfully create the linked service while your data lake storage allows access only from selected virtual networks and IP addresses.
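The pending approval can also be done from PowerShell. A hedged sketch with the Az modules (the resource group, account name, and description are placeholder values):

# Resource ID of the data lake storage account
$storageId = (Get-AzStorageAccount -ResourceGroupName "myResourceGroup" -Name "mydatalake").Id

# Find the pending connection created by the ADF managed private endpoint
$pending = Get-AzPrivateEndpointConnection -PrivateLinkResourceId $storageId |
    Where-Object { $_.PrivateLinkServiceConnectionState.Status -eq "Pending" }

# Approve it so the managed private endpoint becomes usable by the linked service
Approve-AzPrivateEndpointConnection -ResourceId $pending.Id -Description "Approved for ADF managed VNet IR"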
The error occurred because of firewall and network access restrictions. One way to overcome it is to add your client IP to the firewall and network settings of your storage account. Navigate to your data lake storage account, go to Networking -> Firewalls and virtual networks, and under the Firewall option click "Add your client IP address".
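The same firewall rule can be added from PowerShell if that is more convenient; a minimal sketch, assuming placeholder resource names and an example public IP:

# Add a single client IP to the storage account firewall
Add-AzStorageAccountNetworkRule -ResourceGroupName "myResourceGroup" `
    -Name "mydatalake" `
    -IPAddressOrRange "203.0.113.25"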

How to use Azure Data Factory, Key Vaults and ADF Private Endpoints together

I've created new ADF instance on Azure with Managed Virtual Network integration enabled.
I planned to connect to Azure Key Vault to retrieve credentials for my pipeline’s source and sink systems using Key Vault Private Endpoint. I was able to successfully create it using Azure Data Factory Studio. I have also created Azure Key Vault linked service.
However, when I try to configure other linked services for the source and destination systems, the only option available for retrieving credentials from Key Vault is the AKV linked service. I'm not able to select the related private endpoint anywhere (please see the screen below).
Do I miss something?
Are there any additional configuration steps required? Is the scenario I've described possible at all?
Any help will be appreciated!
UPDATE: Screenshot comparing two linked services (one with the managed network and private endpoint selected, and another where I'm not able to set these options up):
With Managed Virtual Network integration enabled, make sure to check which region you are using; unfortunately, ADF managed virtual network is not supported in East Asia.
I have tried in my environment, and that option is not available there either.
From the information I have gathered, even if you create a private endpoint for Key Vault, this column is always shown as blank. It validates the URL format but doesn't perform any network operation.
As per the official documentation, if you want to use a new linked service, instead of Key Vault try creating it for other database services such as Azure SQL or Azure Synapse, as below.
For your reference:
Store credentials in Azure Key Vault - Azure Data Factory | Microsoft Docs
Azure Data Factory and Key Vault - Tech Talk Corner
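For completeness, a rough sketch of how a linked service can pull its connection string from a Key Vault secret, following the pattern in the "Store credentials in Azure Key Vault" doc (all names here, including AzureKeyVaultLS and the secret name, are placeholders), deployed with the Az.DataFactory module:

# Linked service definition that references a Key Vault secret instead of an inline connection string
$definition = @'
{
    "name": "AzureSqlLS",
    "properties": {
        "type": "AzureSqlDatabase",
        "typeProperties": {
            "connectionString": {
                "type": "AzureKeyVaultSecret",
                "store": { "referenceName": "AzureKeyVaultLS", "type": "LinkedServiceReference" },
                "secretName": "sql-connection-string"
            }
        }
    }
}
'@
$definition | Set-Content -Path .\AzureSqlLS.json

# Deploy the linked service to the data factory
Set-AzDataFactoryV2LinkedService -ResourceGroupName "myResourceGroup" `
    -DataFactoryName "myDataFactory" `
    -Name "AzureSqlLS" `
    -DefinitionFile .\AzureSqlLS.json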

Accessing Amazon RDS Postgresql from Azure DevOps Hosted Agent

How can I allow an Azure DevOps hosted agent to access my Amazon RDS PostgreSQL without setting the security group to Anywhere? I was looking for an IP range or something similar to whitelist Azure DevOps agents but can't find it.
In Azure, I can check a box to grant all "Azure DevOps Services" access to my Azure SQL Database, but of course that isn't present in AWS.
I don't think we can access Amazon RDS PostgreSQL directly from the Azure DevOps hosted agent, that is, using the hosted service account.
However, Amazon RDS for PostgreSQL supports user authentication with Kerberos and Microsoft Active Directory, so we can try writing a script that accesses it using specific credentials, then run the script in the pipeline by adding the corresponding tasks (e.g., AWS CLI or AWS PowerShell).
Also check How do I allow users to connect to Amazon RDS with IAM credentials?
For the IP ranges, please refer to Allowed address lists and network connections and Microsoft-hosted Agents for details.
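Building on the IAM credentials idea, a hedged sketch of what such a pipeline script could look like in PowerShell (the endpoint, user, and region below are invented examples; IAM database authentication must already be enabled on the RDS instance):

# Generate a short-lived IAM auth token for the RDS instance
$token = aws rds generate-db-auth-token `
    --hostname mydb.abc123.us-east-1.rds.amazonaws.com `
    --port 5432 `
    --username iam_db_user `
    --region us-east-1

# psql picks up the token from PGPASSWORD; SSL is required for IAM auth
$env:PGPASSWORD = $token
psql "host=mydb.abc123.us-east-1.rds.amazonaws.com port=5432 dbname=postgres user=iam_db_user sslmode=require" -c "SELECT version();"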
The IPs used for the hosted agents are linked through here. I have not had much success using that list for hosted agents: it is big, and the documentation is not really clear about which types of services you need to whitelist.
I would go with whitelisting the hosted agent IP just-in-time during the pipeline run, then removing it as a final step. First you can grab the IP of the hosted agent:
$hostedIPAddress = Invoke-RestMethod http://ipinfo.io/json | Select -exp ip
Then you could use the AWS CLI or the AWS PowerShell module to add that specific IP. The Azure DevOps AWS Tools tasks include the CLI.
Do the needed work against the DB, then make sure you clean up the rule/temporary security group at the end.
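A minimal sketch of the add/remove steps with the AWS CLI, reusing $hostedIPAddress from above (the security group ID and port are placeholders):

# Open PostgreSQL (5432) only for this agent's IP
aws ec2 authorize-security-group-ingress `
    --group-id sg-0123456789abcdef0 `
    --protocol tcp --port 5432 `
    --cidr "$hostedIPAddress/32"

# ... run the database work here ...

# Remove the temporary rule as the final pipeline step
aws ec2 revoke-security-group-ingress `
    --group-id sg-0123456789abcdef0 `
    --protocol tcp --port 5432 `
    --cidr "$hostedIPAddress/32"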

On-Prem SQL connection throwing SqlException in Datafactory custom activity

I have added code for an Azure Data Factory custom activity to an Azure Batch service and pointed the Data Factory pipeline to that Batch service. When I execute the code in my local environment, it works fine, but when I upload it to run in the Azure Batch service, it throws an SqlException:
System.Data.SqlClient.SqlException: A network-related or instance-specific error occurred while establishing a connection to SQL Server. The server was not found or was not accessible. Verify that the instance name is correct and that SQL Server is configured to allow remote connections.
Today, the custom activity cannot access on-premises resources. The data management gateway can only be used in copy/stored procedure activity scenarios, and it doesn't have an interface to execute customer code.
The solution here is:
Use a copy activity to copy your data to Azure Storage, or another store the public cloud can access, and then run the custom activity.
Otherwise, you can try a VNet and ExpressRoute to connect the Azure public cloud with your on-premises environment.