How to copy many CSV files using Synapse Pipelines from an online source with the date in the file name?

There is a git repository that is publicly available and refreshed daily. It contains several CSV files named like "DA-01-12-2022", "DA-02-12-2022", "DA-03-12-2022", and so on; the date is in the file name and also in the GitHub link, so I can copy one file without a problem. But since there are many CSV files in the git folder, how can I use Synapse pipelines to copy all the files in the repository to a storage account in Azure? I feel like I have to use loops, but how can I tell the pipeline to use the date?
Thanks and best regards!

You can use a Copy activity to load all the CSV files and store them in one Parquet file.
To loop over the dates, use a ForEach activity: iterate over @range(0, n) and build each dated file name from the loop index with the addDays() and formatDateTime() expression functions.
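To make the loop concrete, here is a minimal PowerShell sketch of the same date logic, assuming a hypothetical raw-content base URL and date range; inside the pipeline, the equivalent file-name expression would be @formatDateTime(addDays('2022-12-01', item()), 'dd-MM-yyyy').

```powershell
# Hypothetical raw-content base URL and date range; adjust to the real repository.
$baseUrl   = "https://raw.githubusercontent.com/some-org/some-repo/main"
$startDate = Get-Date "2022-12-01"
$dayCount  = 31

0..($dayCount - 1) | ForEach-Object {
    # Build the dated file name, e.g. DA-01-12-2022.csv
    $fileName = "DA-{0:dd-MM-yyyy}.csv" -f $startDate.AddDays($_)
    Invoke-WebRequest -Uri "$baseUrl/$fileName" -OutFile $fileName
}
```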

Related

Azure Synapse Notebook Folders Structure not saved in GitHub

My Synapse workspace is configured with GitHub. The code is organized in folders under "Notebook". Example: under Notebook, the Dev1 folder contains notebook1 and notebook2, and the Dev2 folder contains notebook3 and notebook4.
When Synapse publishes, the GitHub repo does not maintain the folder structure: all 4 files end up under "repo_name/notebook/" as notebook1, notebook2, notebook3, notebook4.
How do I configure Synapse GitHub integration to keep the folder structure?
The short answer is you don't. When you save a notebook (or SQL script, or anything else for that matter), you are actually saving a JSON representation of the asset. The "folders" in Synapse are not actually folders, but rather a property inside that JSON (a folder property holding the folder name, e.g. "folder": {"name": "Dev1"}).
This is why file names need to be globally unique, so you can't have two "Notebook1" files in different folders. Again, the same goes for SQL scripts, datasets, pipelines, etc.
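You can see this for yourself by pulling a notebook's JSON with the Azure CLI; a minimal sketch, assuming the synapse CLI commands are available and using placeholder workspace/notebook names:

```powershell
# Placeholders: replace the workspace and notebook names with your own.
# The "folder" lives in the asset's JSON, not on disk.
az synapse notebook show `
    --workspace-name myworkspace `
    --name notebook1 `
    --query "properties.folder"
# Illustrative output: { "name": "Dev1" }
```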

How to backup the data on Azure Devops?

I would like to schedule (for my company) a backup of our most important data in Azure DevOps, for different reasons: security, urgent recovery, viruses, migration, etc.
I can back up the repositories and the wikis (they are under Git, so they're easy to download), but how can I back up the "Boards" section (backlogs, work items, etc.) and the build pipeline definitions?
In current Azure DevOps, there is no out-of-the-box solution for this. You could manually save the project data in the following ways:
Source code and custom build templates: You can download your files as a zip file. Open the "..." (Repository actions) menu for the repository, file, or folder and choose Download as Zip. You can also choose Download from the right side of the screen to download either all of the files in the currently selected folder, or the currently selected file. This process doesn't save any change history or links to other artifacts.
If you use Git, clone your repositories to retain the full project history and all the branches.
Build data: To save logs and data in your drop build folders, see View build results.
Work item tracking data: Create a work item query and open it using Excel, then save the Excel spreadsheet. This process doesn't save any attachments, change history, or links to other artifacts.
Build/release definitions: You could export their JSON files and then import them again when restoring.
There is a related user voice item you could monitor and vote up: https://developercommunity.visualstudio.com/content/idea/365441/provide-a-backup-service-for-visual-studio-team-se.html.
Here are some tickets (ticket1, ticket2) with the same issue that you can refer to.
If you want to create scheduled tasks, you can write a script using the Azure CLI with the Azure DevOps extension.
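For example, here is a rough PowerShell sketch of such a scheduled backup, assuming the azure-devops CLI extension is installed and you are already logged in (the organization and project names are placeholders):

```powershell
# Placeholders: adjust the organization URL and project name to your own.
$org     = "https://dev.azure.com/myorg"
$project = "MyProject"
az devops configure --defaults organization=$org project=$project

# 1. Clone every Git repository (keeps full history and all branches).
az repos list --query "[].remoteUrl" -o tsv | ForEach-Object {
    git clone --mirror $_
}

# 2. Export build pipeline definitions as JSON for later re-import.
az pipelines build definition list -o json | Out-File definitions.json

# 3. Dump work items via a WIQL query (attachments and history not included).
az boards query --wiql "SELECT [System.Id], [System.Title] FROM WorkItems" -o json |
    Out-File workitems.json
```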
As you said, for the repositories it's quite easy, as they are Git repositories.
I wrote such a script, and we could improve it to also back up the work items, backlog, etc.
It's open source; let me know what you would like to back up first and I'll improve it.
GitHub: azure-devops-repository-backup

How can I copy just new and changed files with an Azure Devops pipeline?

I have a large (lots of dependencies, thousands of files) Node.js app that I am deploying with an Azure DevOps YAML build and an Azure DevOps "classic editor" release pipeline.
Is there some way to copy JUST new and changed files during a file copy, not every single file? My goal is to reduce the time it takes to complete the copy files step of the deploy, as I deploy frequently, but usually with just changes to one or a few files.
About copying only the changed files into artifacts for releasing: if the changed files are in a specific folder, you can copy them by specifying the SourceFolder and Contents arguments in the Copy Files task.
If the changed files are distributed across different folders, I am afraid there is no out-of-the-box way to pick only the changed files with the Copy Files or Publish Artifacts task.
As a workaround, we could add a PowerShell task that deletes (recursively) all files whose timestamp is < (Now - x min); this way the artifact directory contains ONLY changed files, as sketched below. For more detailed info, please refer to this similar case.
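A minimal sketch of that cleanup step; the age threshold is an assumption you would tune to your build times:

```powershell
# Keep only files modified in the last 10 minutes (i.e. touched by this build);
# everything older is deleted from the artifact staging directory.
$cutoff = (Get-Date).AddMinutes(-10)
Get-ChildItem -Path $env:BUILD_ARTIFACTSTAGINGDIRECTORY -Recurse -File |
    Where-Object { $_.LastWriteTime -lt $cutoff } |
    Remove-Item -Force
```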
Alternatively, you can call the Commits - Get Changes REST API through a script in a PowerShell task, retrieve the changed files from the response, and copy them to a specific target folder.
GET https://dev.azure.com/{organization}/{project}/_apis/git/repositories/{repositoryId}/commits/{commitId}/changes?api-version=5.0
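For instance, a PowerShell sketch of that call; the personal access token, the {organization}/{project}/{repositoryId}/{commitId} placeholders, and the target folder are all values you would substitute:

```powershell
# Placeholders: personal access token, URL segments, and target folder.
$pat    = "<personal-access-token>"
$auth   = [Convert]::ToBase64String([Text.Encoding]::ASCII.GetBytes(":$pat"))
$url    = "https://dev.azure.com/{organization}/{project}/_apis/git/repositories/{repositoryId}/commits/{commitId}/changes?api-version=5.0"
$result = Invoke-RestMethod -Uri $url -Headers @{ Authorization = "Basic $auth" }

# Copy each changed (non-deleted) file from the checked-out sources to a target folder.
$result.changes |
    Where-Object { $_.changeType -ne "delete" -and $_.item.gitObjectType -eq "blob" } |
    ForEach-Object {
        $relative = $_.item.path.TrimStart("/")
        $source   = Join-Path $env:BUILD_SOURCESDIRECTORY $relative
        $target   = Join-Path "C:\drop" $relative   # hypothetical target folder
        New-Item -ItemType Directory -Path (Split-Path $target) -Force | Out-Null
        Copy-Item $source $target
    }
```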

copy files from azure file storage to azure website after release

I have files that need to be copied over to my website (an Azure website) after a deployment. Usually these files are server-specific (I have multiple servers for different releases). In the past, before I used Azure, I just had a backup folder with these files and a PowerShell script that I ran after deployment to copy them over.
Now that I'm moving to Azure, I'd like to keep this functionality. I'm interested in copying these files into Azure File Storage, and then, in my release task after the Azure website deployment, copying from that file storage into the site\wwwroot folder. I'm not really seeing an easy way to do this. Is there a release task I can use with this in mind?
Is there a release task i can use with this in mind?
Yes, we could use the Azure File Copy task. I also did a demo copying a zip file to Azure Storage, and it works correctly on my side.
Note: If you don't want to zip the files, you can remove the Archive Files task.
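If you prefer scripting the copy instead of the built-in task, here is a rough sketch using the Azure CLI plus Kudu's /api/zip endpoint, which extracts an archive into site\wwwroot without deleting the files already deployed there (unlike /api/zipdeploy); all account, share, and app names below are placeholders:

```powershell
# Placeholders: storage account, file share, web app name, and credentials.
# Pull the server-specific files down from the Azure file share.
az storage file download-batch `
    --account-name mystorageacct `
    --source myshare `
    --destination ./sitefiles

# Zip the files so they can be pushed as one overlay.
Compress-Archive -Path ./sitefiles/* -DestinationPath overlay.zip -Force

# Deployment credentials come from the app's publish profile.
$user = '$mywebapp'                 # deployment username (note the leading $)
$pass = "<deployment-password>"
$auth = [Convert]::ToBase64String([Text.Encoding]::ASCII.GetBytes("${user}:$pass"))

# Kudu /api/zip extracts into the target path without removing other files.
Invoke-RestMethod -Method Put `
    -Uri "https://mywebapp.scm.azurewebsites.net/api/zip/site/wwwroot/" `
    -Headers @{ Authorization = "Basic $auth" } `
    -InFile overlay.zip -ContentType "application/zip"
```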

Script to Compare TFS labels in a folder

I have several branches in TFS (dev, test, stage), and when I merge changes into the test branch I want an automated script to find all the updated SQL files based on the labels and deploy them to the SQL database.
Currently I manually use Compare in the TFS Source Control Explorer to get the changed files, and I use a custom PowerShell script to deploy them to the database.
I am looking for a script that would copy the changed SQL files to a repository so that my PowerShell script can do the rest.
Any help would be appreciated.
Thanks
Nit
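Not a full answer, but a possible starting point: tf.exe accepts label version specs (the L prefix), so a PowerShell sketch along these lines, with placeholder labels, paths, and a hypothetical staging folder, could feed your existing deployment script:

```powershell
# Placeholders: server path, label names, and staging folder. Run from a
# Developer Command Prompt (so tf.exe is on the PATH) inside a mapped workspace.
$serverPath = '$/MyProject/Test'
$diff = tf diff $serverPath /recursive /version:LLabelOld~LLabelNew /format:brief

# Grep the changed .sql server paths out of the brief listing, map each one
# to its local workspace path, and copy it to the staging folder.
$diff | Select-String '\$\/\S+\.sql' | ForEach-Object {
    $item  = $_.Matches[0].Value          # e.g. $/MyProject/Test/Proc1.sql
    $local = tf resolvePath $item         # server path -> local workspace path
    Copy-Item $local -Destination 'C:\sql-staging'
}
```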