Is there any GitHub repo where I can integrate my PySpark Spark Streaming code with Power BI? And I want it in real time.
This is the last piece in my puzzle.
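One common pattern for this is to push rows from each micro-batch to a Power BI streaming (push) dataset through its REST push URL. Below is a minimal sketch, assuming a push dataset has already been created in Power BI and `POWERBI_PUSH_URL` is its push URL; the source, column names, and payload shape are placeholders and should be matched to the dataset's actual schema.

```python
# Minimal sketch: push each micro-batch to a Power BI streaming (push) dataset.
# Assumes a push dataset already exists and POWERBI_PUSH_URL is its "Push URL"
# (including the key query parameter). Column names here are hypothetical.
import requests
from pyspark.sql import SparkSession

POWERBI_PUSH_URL = "https://api.powerbi.com/beta/<tenant>/datasets/<dataset-id>/rows?key=<key>"

spark = SparkSession.builder.appName("powerbi-push-demo").getOrCreate()

# Hypothetical streaming source; replace with your own (Kafka, Event Hubs, etc.).
stream_df = (spark.readStream
             .format("rate")               # built-in test source: timestamp, value
             .option("rowsPerSecond", 5)
             .load())

def push_to_powerbi(batch_df, batch_id):
    # collect() is only acceptable for small micro-batches; aggregate first if needed.
    rows = [{"timestamp": str(r["timestamp"]), "value": r["value"]}
            for r in batch_df.collect()]
    if rows:
        requests.post(POWERBI_PUSH_URL, json=rows, timeout=10)

query = (stream_df.writeStream
         .foreachBatch(push_to_powerbi)
         .start())

query.awaitTermination()
```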
Related
We would like to know whether, if we use Databricks Jobs instead of ADF for orchestration, Databricks Jobs support file-based triggers. Kindly advise.
Our ultimate goal: we have different ADF environments and subscriptions, and we know the subscriptions and environments are not an obstacle to that goal.
Kindly help.
There is an upcoming feature to trigger jobs based on file events. It was mentioned in the latest Databricks quarterly roadmap webinar, which you can watch.
I doubt that. But ADF does have support for file-based triggers, and it also has a Databricks notebook activity. You can stitch these together.
https://learn.microsoft.com/en-us/azure/data-factory/transform-data-using-databricks-notebook
I am investigating whether Spark 3.1 and Prometheus have a push mechanism between them.
I know it's possible to pull, but I'd like to send the metrics from Spark to Prometheus.
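As far as I know, Spark 3's native Prometheus support (the PrometheusServlet sink) is pull-based, so pushing usually means going through a Prometheus Pushgateway. Below is a rough sketch assuming a Pushgateway is reachable and the `prometheus_client` package is installed; the gateway address, metric name, and the `lastProgress` polling loop are my own illustration, not something from the question.

```python
# Rough sketch: push Structured Streaming progress metrics from the driver to a
# Prometheus Pushgateway. Assumes a Pushgateway at PUSHGATEWAY_ADDR and the
# prometheus_client package; the metric name is hypothetical.
import time
from prometheus_client import CollectorRegistry, Gauge, push_to_gateway
from pyspark.sql import SparkSession

PUSHGATEWAY_ADDR = "localhost:9091"

spark = SparkSession.builder.appName("prometheus-push-demo").getOrCreate()

stream_df = spark.readStream.format("rate").option("rowsPerSecond", 10).load()
query = stream_df.writeStream.format("noop").start()  # sink is irrelevant here

registry = CollectorRegistry()
rows_per_sec = Gauge("spark_processed_rows_per_second",
                     "Processed rows per second reported by the streaming query",
                     registry=registry)

while query.isActive:
    progress = query.lastProgress  # dict with the latest micro-batch progress
    if progress:
        rows_per_sec.set(progress.get("processedRowsPerSecond", 0.0))
        push_to_gateway(PUSHGATEWAY_ADDR, job="spark_streaming_demo",
                        registry=registry)
    time.sleep(10)
```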
My aim is to make changes to my Azure Databricks notebooks using an IDE rather than in Databricks, while at the same time implementing some sort of version control.
Reading the Databricks Connect documentation, it doesn't look like it supports this kind of functionality. Has anyone else tried to do this and had any success?
How do I trigger a notebook in my Azure Machine Learning workspace from Azure Data Factory?
I want to run a notebook in my Azure ML workspace when there are changes to my Azure storage account.
My understanding is that your use case is 100% valid and that it is currently possible with the azureml-sdk. It requires that you create the following (a rough sketch follows the list):
1. Create an Azure ML Pipeline. Here's a great introduction.
2. Add a NotebookRunnerStep to your pipeline. Here is a notebook demoing the feature. I'm not confident that this feature is still being maintained/supported, but IMHO it's a valid and valuable feature. I've opened this issue to learn more.
3. Create a trigger using Logic Apps to run your pipeline any time a change in the datastore is detected.
There's certainly a learning curve to Azure ML Pipelines, but I'd argue the payoff is in the flexibility you get in composing steps together and easily scheduling and orchestrating the result.
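For steps 1 and 2, a rough sketch might look like the following. It uses the azureml-sdk plus the azureml-contrib-notebook package; the NotebookRunnerStep arguments, notebook name, and compute target name below are placeholders and should be checked against the demo notebook linked above.

```python
# Rough sketch only: an Azure ML Pipeline containing a NotebookRunnerStep,
# published so it can be invoked (e.g. by a Logic App) via a REST endpoint.
# NotebookRunnerStep lives in the azureml-contrib-notebook package; treat the
# parameter names below as illustrative, since the contrib API may have changed.
from azureml.core import Workspace
from azureml.pipeline.core import Pipeline
from azureml.contrib.notebook import NotebookRunnerStep

ws = Workspace.from_config()

notebook_step = NotebookRunnerStep(
    name="run-my-notebook",
    notebook_name="my_notebook.ipynb",   # hypothetical notebook name
    source_directory=".",
    compute_target="cpu-cluster",         # hypothetical compute target name
)

pipeline = Pipeline(workspace=ws, steps=[notebook_step])
published = pipeline.publish(name="notebook-pipeline",
                             description="Runs my notebook on demand")

# The Logic App from step 3 can POST to this REST endpoint to kick off a run
# whenever a change in the storage account is detected.
print(published.endpoint)
```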
This feature is currently supported by Azure ML Notebooks. You can also use Logic Apps to trigger a run of your Machine Learning pipeline when there are changes to your Azure storage account.
I made this Dataflow pipeline that connects Pub/Sub to BigQuery. Any ideas on where the right place would be to commit this upstream in Apache Beam?
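For reference, a minimal sketch of such a streaming Pub/Sub-to-BigQuery pipeline in the Beam Python SDK might look like the following; the topic, table, schema, and JSON message format are placeholders, not details from the actual pipeline being discussed.

```python
# Minimal sketch of a streaming Pub/Sub -> BigQuery pipeline with the Beam
# Python SDK. Topic, table, and schema are placeholders; messages are assumed
# to be JSON objects matching the BigQuery schema.
import json

import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

options = PipelineOptions(streaming=True)

with beam.Pipeline(options=options) as p:
    (p
     | "ReadFromPubSub" >> beam.io.ReadFromPubSub(
           topic="projects/<project>/topics/<topic>")
     | "ParseJson" >> beam.Map(lambda msg: json.loads(msg.decode("utf-8")))
     | "WriteToBigQuery" >> beam.io.WriteToBigQuery(
           "<project>:<dataset>.<table>",
           schema="event_time:TIMESTAMP,payload:STRING",
           write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
           create_disposition=beam.io.BigQueryDisposition.CREATE_IF_NEEDED))
```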