How to restore DatabricksRoot (FileStore) data after the workspace is decommissioned? - azure-devops

My Azure Databricks workspace was decommissioned. I forgot to copy files stored in the DatabricksRoot storage (dbfs:/FileStore/...).
Can the workspace be recommissioned/restored? Is there any way to get my data back?

Unfortunately, an end user cannot restore a decommissioned Databricks workspace.
It can only be done by raising a support ticket with Microsoft support.
It is a best practice not to store any data elements in the root Azure Blob storage that backs root DBFS access for the workspace; the root DBFS storage is not supported for production customer data. You might, however, store other objects there such as libraries, configuration files, init scripts, and similar data. Either develop an automated process to replicate these objects, or have processes in place to update the secondary deployment manually.
Refer - https://learn.microsoft.com/en-us/azure/databricks/administration-guide/disaster-recovery#general-best-practices
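While a workspace is still alive, a small replication job is the cheapest insurance. Below is a minimal sketch of the kind of copy the guidance above describes, assuming it runs inside a Databricks notebook or job (where dbutils is available) and that the target abfss:// account and container names are placeholders for storage you control:

```python
# Minimal sketch: replicate non-data objects (init scripts, libraries, configs)
# from the root DBFS FileStore into external storage you own, so they survive
# the workspace. Assumes a Databricks notebook/job context where `dbutils`
# exists; the target storage account/container names are placeholders.

SOURCE = "dbfs:/FileStore/"
TARGET = "abfss://backup@mystorageacct.dfs.core.windows.net/dbfs-backup/FileStore/"

# Recursive copy of the whole FileStore tree.
dbutils.fs.cp(SOURCE, TARGET, recurse=True)

# Quick sanity check of what landed on the other side.
for f in dbutils.fs.ls(TARGET):
    print(f.path, f.size)
```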

Related

Where does Azurite store blobs, queues and tables on mac?

I'm developing an Azure Function on VSCode. I see that a bunch of files are created in my workspace folder. However, even if I delete them, when I open Azure Storage Explorer, I still see a bunch of containers etc. How can I delete all of them in one command?
Folders in Azure Storage aren't really created or deleted; they exist only as long as there are blobs stored in them. Azure Blob storage has no concept of folders: everything inside a container is a blob, and a "folder" is just a name prefix (Storage Explorer simply lets you delete a folder together with all of its contents). The way to delete a folder programmatically is to retrieve all the blobs under it, for example with ListBlobsSegmentedAsync, and call DeleteIfExists() on each of them.
Ref: There are similar discussion threads on this; refer to the suggestions mentioned in this Q&A thread and this SO thread.
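The answer above names the older .NET ListBlobsSegmentedAsync/DeleteIfExists API; the same idea in Python with the azure-storage-blob SDK might look like the sketch below, where the connection string, container name, and prefix are placeholders (UseDevelopmentStorage=true is the local Azurite shortcut):

```python
# Delete every blob under a "folder" prefix; once the blobs are gone, the
# virtual folder disappears with them. Names below are placeholders.
from azure.storage.blob import ContainerClient

conn_str = "UseDevelopmentStorage=true"  # local Azurite emulator shortcut
container = ContainerClient.from_connection_string(conn_str, "my-container")

prefix = "my-folder/"  # the "folder" to remove
for blob in container.list_blobs(name_starts_with=prefix):
    container.delete_blob(blob.name)
    print("deleted", blob.name)
```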

Move files in FTP with Logic apps

In an FTP server I need to move files from a folder to the archive folder once they are deposited. I've built previous pipelines in Azure Data Factory, but since FTP is not supported in Copy Data I resorted to Logic Apps, though I don't know which tasks to use. I also need to trigger the Logic App from ADF.
Thank you,
There are several ways to implement the workflow you are trying to achieve with the SFTP/FTP connector, depending on how frequently files are added and how big they are. After that, you can use Azure Blob Storage to archive the files from the FTP folder.
The following steps give you an overall outline to follow.
In the Azure portal, search for Logic App and create one. Open the Logic App, and under DEVELOPMENT TOOLS select Logic App Designer; from the list of templates click Blank Logic App and search for the FTP – When a file is added or modified trigger.
Then provide the connection details for the remote FTP (or SFTP) server you wish to connect to.
Once you have the connection created, specify the folder in which the files will reside.
Then click New step and Add an action. Now configure the target Blob storage account to transfer the FTP file to: search for Blob and select Azure Blob Storage – Create blob.
This way you will be able to archive the FTP files. You should also refer to this article for more information on how to copy files from FTP to Blob Storage in a Logic App.
There is also a Quickstart template from Microsoft, Copy FTP files to Azure Blob logic app. This template creates a Logic App that triggers on files in an FTP server and copies them to an Azure Blob container.
And for your second problem -
I also need to trigger the logic app from ADF
Check this Execute Logic Apps in Azure Data Factory (V2) Microsoft document.
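That document uses an ADF Web activity to call a Logic App that starts with an HTTP Request trigger (rather than the FTP trigger used above). Purely as an illustration of the call the Web activity makes, here is a rough sketch with a placeholder callback URL and request body:

```python
# Illustration of invoking a Logic App that exposes an HTTP Request trigger.
# Inside ADF you would configure a Web activity to POST to this same callback
# URL; the URL and request body below are placeholders.
import requests

logic_app_url = (
    "https://prod-00.westeurope.logic.azure.com/workflows/<workflow-id>"
    "/triggers/manual/paths/invoke?api-version=2016-10-01&sp=...&sig=..."
)

payload = {"sourceFolder": "/inbox", "archiveFolder": "/archive"}  # example body
resp = requests.post(logic_app_url, json=payload, timeout=30)
resp.raise_for_status()
print("Logic App triggered:", resp.status_code)
```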

copy files from azure file storage to azure website after release

I have files that need to be copied over to my website (Azure website) after a deployment has been made. Usually these files are server specific (I have multiple different servers for different releases), and in the past, before I used Azure, I just had a backup folder with these files and a PowerShell script that I ran after deployment to copy those files right over.
Now that I'm moving to Azure, I'd like to keep this functionality. I'm interested in copying these files into Azure File Storage, and then, in my release task after the Azure website deployment, just copying from that file storage into the site\wwwroot folder. I'm not really seeing an easy way to do this. Is there a release task I can use with this in mind?
Is there a release task i can use with this in mind?
Yes, we could use the Azure File Copy task. I also did a demo copying the zip file to Azure storage, and it works correctly on my side. For more information, please refer to the screenshot.
Note: If you don't want to zip the files, you can remove the Archive Files task.
Test result:
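As an alternative to the built-in task, you could also script the copy yourself: pull the share contents onto the release agent and let a later step copy them into site\wwwroot. A rough sketch with the azure-storage-file-share Python SDK, where the connection string and share name are placeholders:

```python
# Rough sketch: download the contents of an Azure file share to the release
# agent (here into ./drop), so a later step can copy them into site\wwwroot.
# Connection string and share name are placeholders.
import os
from azure.storage.fileshare import ShareClient

conn_str = os.environ["AZURE_STORAGE_CONNECTION_STRING"]  # placeholder
share = ShareClient.from_connection_string(conn_str, share_name="release-config")

def download_dir(dir_path: str, local_root: str) -> None:
    """Recursively download one directory of the share."""
    dir_client = share.get_directory_client(dir_path)
    for item in dir_client.list_directories_and_files():
        remote_path = f"{dir_path}/{item['name']}".lstrip("/")
        local_path = os.path.join(local_root, *remote_path.split("/"))
        if item["is_directory"]:
            download_dir(remote_path, local_root)
        else:
            os.makedirs(os.path.dirname(local_path), exist_ok=True)
            with open(local_path, "wb") as f:
                f.write(share.get_file_client(remote_path).download_file().readall())

download_dir("", "drop")  # pull the whole share into ./drop
```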

Share large datasets between a group

Can someone please suggest an online service to share large files, over 100 GB, among a group of people?
Specifically, we are working on a machine learning project that requires constant access to the files, but without the need to download them. For this project we will manipulate the files with Python and R. I know that I can upload and share the code with Git, but is there a service (like Docker?) where you can store information and 'play' with it online?
Thanks!
Common practice: use Git for your code and S3 for data.
You can also check the open-source tool DVC - http://dataversioncontrol.com -
which orchestrates Git-managed modeling code with S3 or GCP storage. It was designed for ML scenarios, and both Python and R are supported by DVC.
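If you end up trying DVC, its Python API lets collaborators read a tracked dataset straight out of the Git repo plus its S3 remote without cloning the data first. A small sketch, where the repo URL, file path, and revision are placeholders:

```python
# Read a DVC-tracked dataset without downloading the whole repository.
# The Git repo URL, file path, and revision are placeholders; the bytes
# themselves are fetched from the DVC remote (e.g. S3) behind that repo.
import dvc.api
import pandas as pd

with dvc.api.open(
    "data/train.csv",                          # path tracked in the repo
    repo="https://github.com/org/ml-project",  # Git repo holding the .dvc files
    rev="v1.0",                                # tag/branch/commit pinning the version
) as f:
    df = pd.read_csv(f)

print(df.shape)
```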

How to deploy only worker/web role in Azure

If you have a web AND a worker role in an Azure solution, all the waiting for publishing an update package, uploading it to cloud storage, and waiting for the package to be deployed can be exhausting and waste a lot of time.
How do you upload/deploy only the worker or web role of a Microsoft Azure solution that contains both roles, and save both internet traffic and time?
There is no option to build a package for only one of the two roles, but if you have limited bandwidth or traffic and want to save on upload time (which can be quite a big portion if you have a lot of static content; look here for an example), there is one option.
As you may know, the package generated by Visual Studio for deployment (the .cspkg file) is nothing more than an archive file.
Suppose you want to update the WORKER role only. The steps are:
1. Create the update package as normal.
2. Open it with your favorite archive manager (e.g. 7-Zip File Manager).
3. Inside, besides the other files, there are two .cssx files, one for each role. Delete the unnecessary .cssx file (a sketch of scripting this step follows after this answer).
4. Upload the package to Azure Blob Storage (optional).
5. Update the instances from the Azure Management Portal using the 'local' or 'storage' source as normal.
6. On the Role dropdown, select only the role you want to update.
7. Press OK :)
Hope this helps.
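If you would rather not do step 3 by hand in an archive manager, the same edit can be scripted, since the .cspkg is just a zip archive. A sketch with placeholder package and role file names:

```python
# Sketch of step 3 in code: rewrite the .cspkg (a plain zip archive) without
# the .cssx of the role you are NOT redeploying. File names are placeholders.
import zipfile

package = "MyCloudService.cspkg"
role_to_drop = "MyWebRole.cssx"   # keep only the worker role in this example
trimmed = "MyCloudService.workeronly.cspkg"

with zipfile.ZipFile(package) as src, \
     zipfile.ZipFile(trimmed, "w", zipfile.ZIP_DEFLATED) as dst:
    for item in src.infolist():
        if item.filename.endswith(role_to_drop):
            continue  # drop the role we don't want to update
        dst.writestr(item, src.read(item.filename))

print("Wrote", trimmed)
```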
It is a lot easier to just add two additional cloud projects to your solution. In one project, have it reference only your web role. In the other project, have it reference only your worker role.
You can keep the cloud project that references both roles and use it for local debugging, but when it is time to deploy, right-click the cloud project that references only the role you wish to deploy and click "Publish".
You will end up maintaining configuration files for each cloud project, but that sounds a lot easier than manually messing around with editing the package file each time.