I hope you are all staying healthy and strong during the COVID-19 pandemic.
I have a question about Azure Data Factory. I have created a pipeline with a Get Metadata activity, with the details below:
I have files in a folder and a subfolder like this:
I have a Get Metadata activity followed by a ForEach. The first Get Metadata reads the child items of the folder, like this:
Inside the ForEach, a second Get Metadata reads Last modified, like this (with this setting, it only reads the last modified date of the subfolder):
After that I added a variable and used @item().Name to read each file in that folder, like this:
After running the pipeline on a folder that contains a subfolder, I get an error like this:
The error says that @item().Name cannot read the subfolder in that folder. The Get Metadata call succeeds for each file, but fails for the subfolder.
Many thanks in advance for any answers. Thank you.
If you need to access the folder:
Create a clone of the same dataset and set up a parameter as below, leaving the file field empty.
If you need to access the files inside a directory, use the condition @equals(item().type,'Folder') to identify directories, then inside that branch use a dataset with parameters for directory and file.
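The branching described above can be sketched outside ADF. This is a minimal Python illustration, assuming the Get Metadata output has the usual childItems shape (a list of objects with `name` and `type`, where `type` is 'File' or 'Folder'). The sample names and the `split_child_items` helper are hypothetical.

```python
def split_child_items(child_items):
    """Separate Get Metadata child items into folders and files by their 'type'."""
    folders = [item["name"] for item in child_items if item["type"] == "Folder"]
    files = [item["name"] for item in child_items if item["type"] == "File"]
    return folders, files


# Hypothetical childItems payload, for illustration only.
sample = [
    {"name": "file1.csv", "type": "File"},
    {"name": "Subfolder", "type": "Folder"},
]

folders, files = split_child_items(sample)
print("folders:", folders)
print("files:", files)
```

Folders would then go to the dataset with only the directory parameter set, and files to the dataset with both directory and file parameters.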
I am trying to save a table as a partition using .Q.dpt[hdbroot;.z.d;`tablename].
It generates a "No such file or directory" error, even though the directory is present.
Can you please help me with this?
I created a blank folder to store the data, but it checks for the sym file when storing data.
I created one blank folder and assigned its path to the hdbroot variable, but it's not working.
I could replicate your error by trying to save to a location that doesn't exist on the machine.
q).Q.dpt[`:/does/not/exist;.z.d;`t]
'/does/not/exist/sym. OS reports: No such file or directory
[0] .Q.dpt[`:/does/not/exist;.z.d;`t]
Like I mentioned in my comment, make sure that the hdbroot variable is exactly the location you're expecting. key can help you determine this, here is a quick helper function for you.
q)exists:{"Folder/file ",$[11=abs type key x;"exists";"does not exist"]}
q)exists`:/does/not/exist
"Folder/file does not exist"
q)exists`:/tmp
"Folder/file exists"
I am trying to check whether any zip file exists in my SFTP folder. The GetMetadata activity works fine if I explicitly provide the file name, but I can't know the file name in advance because it embeds a timestamp and a sequence number, both of which are dynamic.
I tried specifying *.zip, but that never works: the GetMetadata activity always returns false even though the zip file actually exists. Is there any way to get this working? Suggestions please.
A sample file name is below; the last part, 0000000004_20210907080426, is dynamic and changes every time:
TEST_TEST_9999_OK_TT_ENTITY_0000000004_20210907080426
You could possibly do a Get Metadata on the folder and include the Child items under the Field List.
You'll have to iterate with a ForEach using the expression
@activity('Get Folder Files').output.childItems
and then check if item().name (within the ForEach) ends with '.zip'.
I know it's a pain when the wildcard stuff doesn't work for a given dataset, but this alternative ought to work for you.
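The check performed inside the ForEach can be sketched as follows. This is a minimal Python illustration of the logic only, assuming childItems is a list of objects with `name` and `type`; the `has_zip` helper is hypothetical, and the `.zip` suffix on the sample name is an assumption, since the sample name in the question is shown without an extension.

```python
def has_zip(child_items):
    """Return True if any child item is a file whose name ends with '.zip'."""
    return any(
        item["type"] == "File" and item["name"].lower().endswith(".zip")
        for item in child_items
    )


# Hypothetical folder listing; the '.zip' suffix is assumed for illustration.
listing = [
    {"name": "TEST_TEST_9999_OK_TT_ENTITY_0000000004_20210907080426.zip", "type": "File"},
    {"name": "readme.txt", "type": "File"},
]

print(has_zip(listing))
```

Testing the end of each name, rather than searching the whole listing for the substring, avoids matching names such as archive.zip.bak.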
If you are using exists in the Get Metadata activity, you need to provide the file name in it.
As a workaround, you can get the child items (with filename *.zip) using the Get Metadata activity.
Output:
Pass the output to If Condition activity, to check if the required file exists.
@contains(string(json(string(activity('Get Metadata1').output.childItems))),'.zip')
You can use other activities inside True and False activities based on If Condition.
If no file exists, or no child items are found by the Get Metadata activity:
If condition output:
For an SFTP dataset, if you want to use a wildcard to filter files under the specified folderPath, you have to skip the file name setting in the dataset and specify the file name in the activity source settings (Get Metadata activity).
However, wildcard filters on folders/files are not supported for the Get Metadata activity.
I want to copy a file from the source to the target container, but only when the source file is new
(that is, when a newer file has been placed in the source). I am not sure how to proceed, or what the syntax is to check whether the source file is newer than the target. Should I use two Get Metadata activities to read the source and target last modified dates, and then an If Condition? I tried a few ways but none worked.
Any help would be appreciated.
The syntax I used for the condition gives me an error:
@if(greaterOrEquals(ticks(activity('Get Metadata_File').output.lastModified),activity('Get Metadata_File2')),True,False)
Error message:
The function 'greaterOrEquals' expects all of its parameters to be either integer or decimal numbers. Found invalid parameter types: 'Object'
You can try one of the Pipeline Templates that ADF offers.
Use this template to copy new and changed files only by using
LastModifiedDate. This template first selects the new and changed
files only by their attributes "LastModifiedDate", and then copies
them from the data source store to the data destination store. You can
also go to "Copy Data Tool" to get the pipeline for the same scenario
with more connectors.
OR...
You can use Storage Event Triggers to trigger the pipeline with copy activity to copy when each new file is written to storage.
Follow detailed example here: Create a trigger that runs a pipeline in response to a storage event
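On the error in the question: the second argument passed to greaterOrEquals is the whole activity object rather than a number, which is consistent with the 'Object' reported in the message. Both sides presumably need to be reduced to ticks of their own lastModified values before comparing. The comparison logic can be sketched in Python. The `to_ticks` and `source_is_newer` helpers and the sample timestamps are hypothetical, and ticks are taken to be 100-nanosecond intervals since 0001-01-01, as in .NET.

```python
from datetime import datetime, timezone

# Start of the tick count: 0001-01-01T00:00:00Z (assumed .NET-style ticks).
_TICK_EPOCH = datetime(1, 1, 1, tzinfo=timezone.utc)


def to_ticks(timestamp: str) -> int:
    """Convert an ISO 8601 timestamp to 100-nanosecond ticks since 0001-01-01."""
    dt = datetime.fromisoformat(timestamp.replace("Z", "+00:00"))
    if dt.tzinfo is None:
        # Treat timestamps without an offset as UTC.
        dt = dt.replace(tzinfo=timezone.utc)
    delta = dt - _TICK_EPOCH
    whole_seconds = delta.days * 86_400 + delta.seconds
    return whole_seconds * 10_000_000 + delta.microseconds * 10


def source_is_newer(source_last_modified: str, target_last_modified: str) -> bool:
    """Compare two timestamps as numbers, the way the If Condition would need to."""
    return to_ticks(source_last_modified) > to_ticks(target_last_modified)


# Hypothetical lastModified values for the source and target files.
print(source_is_newer("2021-09-07T08:04:26Z", "2021-09-06T08:04:26Z"))
```

The point of the sketch is that both operands are plain integers by the time they are compared; use a greater-or-equal comparison instead if equal timestamps should also trigger the copy.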
I am trying to create a new dataset in ADF that looks for csv files that meet a certain naming convention. These files are located within a series of different folders in my Azure Blob Storage.
For instance, in the sample directory below, I am trying to pull out csv files that contain the word "cars".
Folder A
fastcars.csv
fasttrucks.csv
Folder B
slowcars.csv
slowtrucks.csv
Ideally, I would end up with the files "slowcars.csv" and "fastcars.csv". I've seen examples out there where people were able to wildcard the file name. I have been playing around with that, but have had no luck. (See image below for one example of what I have been doing).
Is what I am trying to do even possible? Would appreciate any advice you guys may have. Please let me know if I can provide further clarification.
According to the description of filename in this documentation,
The file name under the given fileSystem + folderPath. If you want to
use a wildcard to filter files, skip this setting and specify it in
activity source settings.
So you need to specify the wildcard in the activity source settings, not in the dataset file path.
A simple example in a copy activity:
Hope this can help you.
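To see which of the files from the question a file-name wildcard would select, here is a small Python sketch. It only approximates the wildcard semantics with Python's fnmatch module and is not ADF's implementation; the pattern `*cars*.csv` and the `match_files` helper are assumptions chosen for this example.

```python
from fnmatch import fnmatchcase


def match_files(paths, pattern):
    """Keep the paths whose file name (last path segment) matches the wildcard."""
    return [p for p in paths if fnmatchcase(p.rsplit("/", 1)[-1], pattern)]


# Directory layout taken from the question.
paths = [
    "Folder A/fastcars.csv",
    "Folder A/fasttrucks.csv",
    "Folder B/slowcars.csv",
    "Folder B/slowtrucks.csv",
]

print(match_files(paths, "*cars*.csv"))
```

Because the files sit in different folders, the folder part needs its own wildcard or recursive setting in the activity source as well.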
The 'pickAndStore' method allows me to specify the full path to the file, but I don't know its extension at that point (the file path has to be defined before the file is uploaded, so it's not possible to provide a path with the correct extension).
If I use 'pick' and then 'store', I end up with 2 files (because both methods upload the file to S3). I can delete the 'old' file, but that's not optimal and can be a pain (it takes ages) with really big files.
Is there a better solution? Ideally one that renames the existing file.
Currently, there is no workaround for renaming a file.
However, in our JavaScript API v2 we are planning to add a new callback function. The onStart callback will fire after the user picks a file but before the file is uploaded. There could be an option such as renaming the file based on the original filename.
We will keep you updated.