Is there a way to know filenames generated with MultiResourceItemWriter? - spring-batch

I'm writing a spring-batch application with spring-boot support and I'm looking for a way to know which files were generated by MultiResourceItemWriter. The first solution I have in mind is to use a folder that holds only the generated files and check its content, but it would be great if spring-batch already provided something for this!
The intention is to encrypt and then upload each file to an sftp server.

The file names generated by the MultiResourceItemWriter are the combination of the resource name + the suffix created by the ResourceSuffixCreator. For example, if you create the writer like the following:
MultiResourceItemWriter<String> writer = new MultiResourceItemWriter<>();
writer.setResource(new FileSystemResource(new File("data.txt")));
writer.setResourceSuffixCreator(index -> ".part" + index); // the suffix is appended as-is, so include the dot
Then the generated files will be data.txt.part1, data.txt.part2, etc.
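If you only need the list of names and you know (or track) how many files were produced, the naming rule above can be reproduced in plain Java. This is just a sketch that does not call Spring Batch at all: the class and method names are made up for illustration, and it assumes the index starts at 1 as in the example above and that the suffix function includes its own leading dot.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntFunction;

// Hypothetical helper, not part of Spring Batch.
class ExpectedFileNames {

    // Builds "base name + suffix" for indices 1..count (assumed 1-based, as in the example).
    static List<String> expectedNames(String baseName, IntFunction<String> suffixCreator, int count) {
        List<String> names = new ArrayList<>();
        for (int index = 1; index <= count; index++) {
            names.add(baseName + suffixCreator.apply(index));
        }
        return names;
    }

    public static void main(String[] args) {
        // prints [data.txt.part1, data.txt.part2, data.txt.part3]
        System.out.println(expectedNames("data.txt", index -> ".part" + index, 3));
    }
}
```

The drawback is that you still have to find out how many files were written, so checking that each predicted file actually exists before encrypting and uploading it is probably a good idea.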

MultiResourceItemWriter does not write directly; it delegates that job to other components.
All of those components implement ResourceAwareItemWriterItemStream, so you can write a ResourceAwareItemWriterItemStreamDelegate (a wrapper around the real delegate), intercept its setResource() method, and store each resource in the current step's execution context as a collection.
If you want to pass this list of resources to later steps, you can use an ExecutionContextPromotionListener.
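The interception idea could look roughly like the sketch below. To keep it runnable without Spring Batch on the classpath, ResourceAwareWriter is a hypothetical stand-in for the parts of ResourceAwareItemWriterItemStream that matter here, and resources are plain path strings; in a real job the wrapper would implement the Spring Batch interface and put the collected paths into the step's execution context instead of only holding them in a field.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for the relevant part of ResourceAwareItemWriterItemStream.
interface ResourceAwareWriter<T> {
    void setResource(String resourcePath);
    void write(List<? extends T> items);
}

// Wraps the real writer and remembers every resource it is pointed at.
class ResourceTrackingWriter<T> implements ResourceAwareWriter<T> {
    private final ResourceAwareWriter<T> delegate;
    // LinkedHashSet keeps insertion order and ignores a resource that is set twice.
    private final Set<String> seenResources = new LinkedHashSet<>();

    ResourceTrackingWriter(ResourceAwareWriter<T> delegate) {
        this.delegate = delegate;
    }

    @Override
    public void setResource(String resourcePath) {
        seenResources.add(resourcePath);     // record the resource first
        delegate.setResource(resourcePath);  // then let the real writer use it
    }

    @Override
    public void write(List<? extends T> items) {
        delegate.write(items);               // writing itself is untouched
    }

    // Snapshot of the resources seen so far, in the order they were set.
    List<String> seenResources() {
        return new ArrayList<>(seenResources);
    }
}
```

The wrapper would be set as the delegate of the MultiResourceItemWriter in place of the real writer, so every setResource() call goes through it.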

Related

Azure Factory v2 Wildcard

I am trying to create a new dataset in ADF that looks for csv files that meet a certain naming convention. These files are located within a series of different folders in my Azure Blob Storage.
For instance, in the sample directory below, I am trying to pull out csv files that contain the word "cars".
Folder A
    fastcars.csv
    fasttrucks.csv
Folder B
    slowcars.csv
    slowtrucks.csv
Ideally, I would end up with the files "slowcars.csv" and "fastcars.csv". I've seen examples out there where people were able to wildcard the file name. I have been playing around with that but have had no luck. (See image below for one example of what I have been doing.)
Is what I am trying to do even possible? Would appreciate any advice you guys may have. Please let me know if I can provide further clarification.
According to the description of filename in this documentation,
The file name under the given fileSystem + folderPath. If you want to
use a wildcard to filter files, skip this setting and specify it in
activity source settings.
So you need to specify the wildcard in the activity source settings, not in the dataset's file path.
An easy sample in a copy activity:
Hope this can help you.

"How to embed resources" or "How to access a Resource"

I am struggling with embedded resources, or resources in general, in Dynamics365. My goal is to add an XML file as a resource to a model and use that resource in some test code.
I tried to add the XML as a resource element, but it seems this does not embed the XML into the compiled DLL, so I don't know how to pick up that XML file in my test code. Currently my test code loads the XML from "C:\Temp\test.xml", where I copied it, but that's not a viable solution and I thought adding the XML as a resource would work. Or is there a better approach to this scenario?
You can use class SysResource to interact with resources. I used the following code in one of my unit tests to load the content of a file resource into a file and create a CommaStreamIo instance from that file. You should be able to modify that to do your stuff with an xml file.
ResourceNode textFileResourceNode = SysResource::getResourceNode(resourceStr(MyTextFileResourceName));
str textFilename = SysResource::saveToTempFile(textFileResourceNode);
CommaStreamIo commaStreamIo = CommaStreamIo::constructForRead(File::UseFileFromURL(textFilename));
Also take a look at reading a resource into a string.
You could also take a look at how some of the standard resources are used. For example, there are several .xslt file resources that are used to transform bank statement formats.

Reading files from Apache Spark textFileStream

I'm trying to read/monitor txt files from a Hadoop file system directory. But I've noticed that all the txt files inside this directory are directories themselves, as shown in the example below:
/crawlerOutput/b6b95b75148cdac44cd55d93fe2bbaa76aa5cccecf3d723c5e47d361b28663be-1427922269.txt/_SUCCESS
/crawlerOutput/b6b95b75148cdac44cd55d93fe2bbaa76aa5cccecf3d723c5e47d361b28663be-1427922269.txt/part-00000
/crawlerOutput/b6b95b75148cdac44cd55d93fe2bbaa76aa5cccecf3d723c5e47d361b28663be-1427922269.txt/part-00001
I want to read all the data inside the part files. I'm trying to use the following code:
val testData = ssc.textFileStream("/crawlerOutput/*/*")
Unfortunately, it reports that /crawlerOutput/*/* doesn't exist. Doesn't textFileStream accept wildcards? What should I do to solve this problem?
The textFileStream() is just a wrapper for fileStream() and does not support subdirectories (see https://spark.apache.org/docs/1.3.0/streaming-programming-guide.html).
You would need to list the specific directories to monitor. If you need to detect new directories, a StreamingListener could be used to check for them, then stop the streaming context and restart it with the new values.
Just thinking out loud: if you intend to process each subdirectory once and only need to detect new directories, you could key off another location that contains job info or a token file. Once that file is present, it could be consumed in the streaming context, which would then call the appropriate textFile() to ingest the new path.

How can I rename file uploaded to s3 using javascript api?

The 'pickAndStore' method allows me to specify the full path to the file, but I don't know its extension at this point (the file path has to be defined before the file is uploaded, so it's not possible to provide a path with the correct extension).
If I use 'pick' and then 'store', I end up with 2 files (because both methods upload the file to S3). I can delete the 'old' file, but that's not optimal and can be a pain (it takes ages) with really big files.
Is there a better solution? Ideally one that renames the existing file.
Currently, there is no workaround for renaming a file.
However, in our Javascript API v2 we are planning to add a new callback function. The onStart callback will be fired after the user picks a file but before the upload starts. It could offer an option such as renaming the file based on the original filename.
We will keep you updated.

Jenkins How can i upload a text file and use it as a parameter

I have a txt file that holds a string, and I want to use this string in one of my scripts. Is there a way to set the content of the file as a build property or parameter that my scripts can use, the same way they use the built-in build environment properties?
For example, ${JOB_NAME} holds the job name; in the same way, I want to access the value stored in the file.
Is it possible?
You can upload a file from your computer to the workspace through the File parameter of the job.
You can use the Extended Choice Parameter plugin to read value(s) from a file and display them in a dropdown/radio-button/checkbox for the user to select, dynamically, every time the build is triggered.
You can use the EnvInject plugin to read value(s) from a file and inject them into the build as environment variables, so that they can be used by the rest of the build steps/scripts.
Your question is unclear about what you are trying to do. Pick one of the 3 methods above based on what you need, or clarify your question.