Azure Data Factory - Event based triggers on multiple files/blobs

I am invoking an ADF V2 pipeline via an event-based trigger when new files/blobs are created in a folder within a blob container.
Blob container structure:
BlobContainer/
    FolderName/
        File1.csv
        File2.csv
        File3.csv
I've created the trigger with the following configuration:
Container Name: BlobContainer
Blob path begins with: FolderName/
Blob path ends with: .csv
Event checked: Blob Created
Problem: Three CSV files are created in the folder on an ad hoc basis. The trigger that invokes the pipeline runs 3 times (presumably because 3 blobs are created). The pipeline actually moves the files to another blob container, so the 1st trigger run succeeds and the remaining 2 fail because the files have already been moved. How can I configure the trigger so that it runs only once per folder even though 3 files are created within it?
Because the files are generated together, I am required to move them together into a new location using ADF.

Your blob event trigger fires the pipeline once for each file. To handle this, you can use a Lookup activity that gets the file names, then a Filter activity that filters down to the required file names; the Filter activity exposes a FilteredItemsCount attribute that can be checked in an If Condition activity. When there are no files left, FilteredItemsCount returns 0 and the rest of your pipeline is not executed.
Summary:
Lookup Activity -> Filter Activity -> If Condition Activity -> Your Pipeline
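A minimal sketch of how the Filter and If Condition pair might look as a fragment of the pipeline's activities array in ADF JSON, assuming a preceding Lookup activity named 'LookupFiles' that returns an array of objects with a name property (the activity names and the .csv condition are illustrative, not from the original post; your Execute Pipeline / move logic goes inside ifTrueActivities):

{
    "name": "FilterCsv",
    "type": "Filter",
    "dependsOn": [ { "activity": "LookupFiles", "dependencyConditions": [ "Succeeded" ] } ],
    "typeProperties": {
        "items": { "value": "@activity('LookupFiles').output.value", "type": "Expression" },
        "condition": { "value": "@endswith(item().name, '.csv')", "type": "Expression" }
    }
},
{
    "name": "IfAnyCsvLeft",
    "type": "IfCondition",
    "dependsOn": [ { "activity": "FilterCsv", "dependencyConditions": [ "Succeeded" ] } ],
    "typeProperties": {
        "expression": {
            "value": "@greater(activity('FilterCsv').output.FilteredItemsCount, 0)",
            "type": "Expression"
        },
        "ifTrueActivities": [ ]
    }
}

With this guard in place, the two extra trigger runs still start, but they find no files, evaluate the condition to false, and finish without attempting the move.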

Related

Azure DevOps deployment fails with the error message "The operation was canceled."

My Azure DevOps pipeline tasks successfully complete without issues except for the final deployment step:
Job Issues - 1 Error
The job running on agent XXXX ran longer than the maximum time of 00:05:00 minutes. For more information, see https://go.microsoft.com/fwlink/?linkid=2077134
The build logs state the operation was canceled:
2021-03-02T20:50:00.4223027Z Folders: 695
2021-03-02T20:50:00.4223319Z Files: 10645
2021-03-02T20:50:00.4223589Z Size: 672611102
2021-03-02T20:50:00.4223851Z Compressed: 249144045
2021-03-02T20:50:03.6023001Z ##[warning]Unable to apply transformation for the given package. Verify the following.
2021-03-02T20:50:03.6032907Z ##[warning]1. Whether the Transformation is already applied for the MSBuild generated package during build. If yes, remove the <DependentUpon> tag for each config in the csproj file and rebuild.
2021-03-02T20:50:03.6034584Z ##[warning]2. Ensure that the config file and transformation files are present in the same folder inside the package.
2021-03-02T20:50:04.5268038Z Initiated variable substitution in config file : C:\azagent\A2\_work\_temp\temp_web_package_3012195912183888\Areas\Admin\sitemap.config
2021-03-02T20:50:04.5552027Z Skipped Updating file: C:\azagent\A2\_work\_temp\temp_web_package_3012195912183888\Areas\Admin\sitemap.config
2021-03-02T20:50:04.5553082Z Initiated variable substitution in config file : C:\azagent\A2\_work\_temp\temp_web_package_3012195912183888\web.config
2021-03-02T20:50:04.5642868Z Skipped Updating file: C:\azagent\A2\_work\_temp\temp_web_package_3012195912183888\web.config
2021-03-02T20:50:04.5643366Z XML variable substitution applied successfully.
2021-03-02T20:51:00.8934630Z ##[error]The operation was canceled.
2021-03-02T20:51:00.8938641Z ##[section]Finishing: Deploy IIS Website/App:
When I examine the deployment stages, I notice one of my tasks takes quite a while for what should be a fairly simple operation.
The file transform portion takes over half of the allotted 5 minutes. Could this be the issue?
steps:
- task: FileTransform@1
  displayName: 'File Transform: '
  inputs:
    folderPath: '$(System.DefaultWorkingDirectory)/_site.com/drop/Release/Nop.Web.zip'
    fileType: json
    targetFiles: '**/dataSettings.json'
It may be inefficient, but the FileTransform log shows a significant amount of time spent after the variable has been substituted. I'm not sure what's causing the long delay; the logs don't account for the time after the variable was successfully substituted:
2021-03-02T23:04:44.3796910Z Folders: 695
2021-03-02T23:04:44.3797285Z Files: 10645
2021-03-02T23:04:44.3797619Z Size: 672611002
2021-03-02T23:04:44.3797916Z Compressed: 249143976
2021-03-02T23:04:44.3970596Z Applying JSON variable substitution for **/App_Data/dataSettings.json
2021-03-02T23:04:45.2396016Z Applying JSON variable substitution for C:\azagent\A2\_work\_temp\temp_web_package_0182869515217865\App_Data\dataSettings.json
2021-03-02T23:04:45.2399264Z Substituting value on key DataConnectionString with (string) value: ***
2021-03-02T23:04:45.2446986Z JSON variable substitution applied successfully.
2021-03-02T23:07:25.4881687Z ##[section]Finishing: File Transform:
> The job running on agent XXXX ran longer than the maximum time of 00:05:00 minutes.
Based on the error message, the cause of this issue is that the running time of the agent job has reached the configured maximum.
If you are using a release pipeline, you can set the timeout under Agent job -> Execution plan -> Timeout.
If you are using a build pipeline, you can set the timeout under Agent job -> Execution plan -> Timeout (for an agent job) and under Options -> Build job -> Build job timeout in minutes (for the whole build pipeline).
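If the job is defined in a YAML pipeline instead, the equivalent setting is timeoutInMinutes on the job. A minimal sketch, with an illustrative job name and value (0 means the maximum the platform allows):

jobs:
- job: DeployWebsite
  timeoutInMinutes: 30  # raise the cap from the current 5 minutes
  steps:
  - task: FileTransform@1
    displayName: 'File Transform: '
    inputs:
      folderPath: '$(System.DefaultWorkingDirectory)/_site.com/drop/Release/Nop.Web.zip'
      fileType: json
      targetFiles: '**/dataSettings.json'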
> The file transform portion takes over half of the allotted 5 minutes
From the task log, the zip package contains many files and folders, so the transform task takes more time to traverse them to find the target files. One way to reduce that is sketched below.
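As a hedged suggestion: if dataSettings.json only ever sits under App_Data, as the log above suggests, anchoring the pattern instead of using the broad **/ wildcard may cut the search down (the path here is an assumption based on the log, not a confirmed fix):

- task: FileTransform@1
  displayName: 'File Transform: '
  inputs:
    folderPath: '$(System.DefaultWorkingDirectory)/_site.com/drop/Release/Nop.Web.zip'
    fileType: json
    targetFiles: 'App_Data/dataSettings.json'  # anchored path instead of '**/dataSettings.json'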

How to regenerate meson for newly added YAML files

I have added YAML files to add new D-Bus objects, and I added PHOSPHOR_MAPPER_SERVICE_append = " com/newCoName"
(newCoName is the name of my company).
But when I run bitbake, do_configure for phosphor_mapper bails when it passes the option -Ddata_com_newCoName to meson. The following README says I need to run ./regenerate_meson from the gen directory when I add new YAML files. But how do I do that from a recipe file?
https://github.com/openbmc/phosphor-dbus-interfaces
One option is to generate these files outside of the Yocto environment (i.e. not involving bitbake). Thus:
- clone that git repo
- place your YAML file where you cloned the repo
- do what the README tells you, i.e. go to the gen directory and execute the meson-regenerate script
- collect the changes made by the script and create a patch
- add the patch to your layer and reference it in a .bbappend file (meta-/recipes-phosphor/dbus/phosphor-dbus-interfaces_git.bbappend)
Another option would be to add to the .bbappend file an additional task that runs before do_configure and call that script from there:
do_configure_prepend() {
    cd ${S}/gen && ./meson-regenerate
}
Alongside this .bbappend you should add your YAML so that it lands in the source tree, either via a patch or directly from your layer (check FILESEXTRAPATHS); a rough sketch follows.
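A rough .bbappend sketch of that second option, shipping the new interface file from the layer; the file name Example.interface.yaml and its destination path are illustrative, not taken from the repo:

FILESEXTRAPATHS_prepend := "${THISDIR}/${PN}:"
SRC_URI += "file://com/newCoName/Example.interface.yaml"

do_configure_prepend() {
    # copy the new YAML into the source tree, then regenerate the meson files
    install -D -m 0644 ${WORKDIR}/com/newCoName/Example.interface.yaml \
        ${S}/com/newCoName/Example.interface.yaml
    cd ${S}/gen && ./meson-regenerate
}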
In both cases you'll need to patch meson_options.txt to add the option:
option('data_com_newCoName', type: 'boolean', value: true)

Replace values in multiple appsettings files stored in different artifacts during VSTS Release pipeline

I am trying to replace variables in appsettings.json and e2e-appsettings.json, which are stored in two different artifacts, in the release pipeline.
As configured, ONLY appsettings.json is updated; for the second file the task gives an error:
error: NO JSON file matched with specific pattern: e2e/XXX.EndToEnd.Integration.Tests/e2e-appsettings.json
According to the documentation, the pattern should be a path relative to the root, but I am not sure what the root is in this case, as there are two build artifacts.
Further, the download artifact log says:
2019-04-09T02:33:55.9132583Z Downloaded e2e/XXX.EndToEnd.Integration.Tests/e2e-appsettings.json to D:\a\r1\a\EstimationCore\e2e\XXX.EndToEnd.Integration.Tests\e2e-appsettings.json
The artifact containing the other appsettings.json file, which works fine, is a zip file; its log shows: Downloading app/app.zip to D:\a\r1\a\EstimationCore\app\app.zip
I have already tried these patterns, which all gave the same error:
- NO JSON file matched with specific pattern: e2e/XXX.EndToEnd.Integration.Tests/e2e-appsettings.json.
- NO JSON file matched with specific pattern: **/*e2e-appsettings.json
- NO JSON file matched with specific pattern: d:\a\r1\a\EstimationCore\e2e\XXX.EndToEnd.Integration.Tests\e2e-appsettings.json.
- NO JSON file matched with specific pattern: d:\a\r1\a\EstimationCore\**\**\e2e-appsettings.json.

Azure Data Factory Lookup

I have a Lookup activity which reads data from a SQL table, and its output is passed to multiple Execute Pipeline activities as parameters. The flow is as follows:
Lookup -> Exct Pipeline 1 -> Exct Pipeline 2 -> Exct Pipeline 3
This works fine for the first pipeline; however, the second Execute Pipeline fails with the following error:
> "The template validation failed: 'The inputs of template action 'Exct
> Pipeline 2' at line '1 and column '178987' cannot reference action
> 'Lookup'. Action 'Lookup' must either be in 'runAfter' path or within
> a scope action on the 'runAfter' path of action 'Execute Pipeline 3',
> or be a Trigger"
Another point to note is that the pipeline runs fine when triggered. It only fails in debug.
Has anyone else seen this issue?

Spring batch, handle failed files again, skip successfully handled

What is the proper way to organize file handling?
I have a folder for new files (NEW), a folder for old files (OLD), and a folder for failed files (ERR). A new file is put in NEW; if handling succeeds, the file goes to OLD, and if handling fails, it goes to ERR. Then we take the failed file again, correct it, and put it back in NEW: if all is OK it goes to OLD, if it fails again it goes to ERR. And so on, over and over.
I have a job with the constant name "fileHandlingJob"; the job has the steps "extract", "handling", and "utilize", and the job parameters "filePath" and "fileName".
Thanks!
If the uniqueness criterion for a file is its file name, then you are on the right track.
If a job ended in state FAILED (ERR folder), you can re-trigger it with the same set of parameters. If the job COMPLETED, you can't run it again; Spring Batch will complain.
You can ensure this behaviour by making the unique file name an identifying job parameter, so no other job instance can be started with the same file name; Spring Batch will simply prevent it.
The second parameter, filePath, can be an additional non-identifying parameter.
JobParametersBuilder jobParametersBuilder = new JobParametersBuilder()
        .addString("fileName", "myfile.xml", true)
        .addString("filePath", "C:\\new\\myfile.xml", false);
The true/false flag here marks whether the parameter is identifying, i.e. whether it counts toward job instance uniqueness.
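A minimal launch sketch under these assumptions (jobLauncher and fileHandlingJob are wired by Spring elsewhere; the class and method names are illustrative), showing where Spring Batch enforces the uniqueness:

import org.springframework.batch.core.Job;
import org.springframework.batch.core.JobExecutionException;
import org.springframework.batch.core.JobParameters;
import org.springframework.batch.core.JobParametersBuilder;
import org.springframework.batch.core.launch.JobLauncher;
import org.springframework.batch.core.repository.JobInstanceAlreadyCompleteException;

public class FileJobStarter {

    private final JobLauncher jobLauncher;
    private final Job fileHandlingJob;

    public FileJobStarter(JobLauncher jobLauncher, Job fileHandlingJob) {
        this.jobLauncher = jobLauncher;
        this.fileHandlingJob = fileHandlingJob;
    }

    public void start(String fileName, String filePath) throws JobExecutionException {
        JobParameters params = new JobParametersBuilder()
                .addString("fileName", fileName, true)   // identifying: one job instance per file name
                .addString("filePath", filePath, false)  // non-identifying: may differ between runs
                .toJobParameters();
        try {
            jobLauncher.run(fileHandlingJob, params);
        } catch (JobInstanceAlreadyCompleteException e) {
            // this file name already completed successfully (OLD folder) -- skipped, as intended
        }
    }
}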