Merging multiple PDF files into one per file name using PDFtk Pro

I have a situation where I need to merge files by file name again. Right now I have files in one folder like this:
A1.pdf,
A2.pdf,
B1.pdf,
C1.pdf,
C2.pdf,
C3.pdf.
The goal is to merge the files by file name so that I end up with A.pdf, B.pdf and C.pdf. I tried different things in a batch file, but none has worked so far. Can you please help?
The real file names look like this:
115_11W0755_70258130_841618403_01.PDF
115_12W0332_70258122_202990692_01.PDF
115_12W0332_70258122_202990692_02.PDF
115_12W0332_70258122_202990692_03.PDF
115_14W0491_70258174_562605608_01.PDF
115_14W0491_70258174_562605608_02.PDF
115_14W0776_70258143_680477806_01.PDF
115_16W0061_70258083_942231888_01.PDF
115_16W0065_70258176_202990692_01.PDF
115_16W0065_70258176_202990692_02.PDF
The 3rd part (for example 70258083) is the element that is unique per batch. In other words, I want to merge the files that share this element. From the file names listed above, there should be 6 merged PDF files.
I am using the batch script below to merge two files into one. I don't know how to tweak it so that it merges more than 2 files, or leaves a single file alone.
Please help.
setlocal enabledelayedexpansion
for %%# in (115_*.pdf) do (
set n=%%~n#
set n=!n:~,-30!
pdftk A=!n!.pdf B=%%# cat B A output C:\IN\_fileNames\Merge\Files\!n!.pdf
)
here is the error screen
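One possible way to approach the grouping, sketched in Python rather than batch: split each file name on underscores, use the third field as the key, and build one `pdftk ... cat output ...` command per key. The function names and the `merged` output folder are hypothetical, and the sketch assumes every file name has at least three underscore-separated fields. A group with a single file simply produces a command with one input.

```python
from collections import defaultdict
from pathlib import Path


def group_by_field(names, field=2, sep="_"):
    """Group file names by one separator-delimited field (0-based index)."""
    groups = defaultdict(list)
    for name in sorted(names):
        parts = Path(name).stem.split(sep)
        if len(parts) <= field:
            continue  # name does not follow the expected pattern; skip it
        groups[parts[field]].append(name)
    return dict(groups)


def pdftk_commands(groups, out_dir="merged"):
    """Build one 'pdftk <inputs> cat output <target>' argument list per group."""
    return [
        ["pdftk", *files, "cat", "output", str(Path(out_dir) / f"{key}.pdf")]
        for key, files in groups.items()
    ]
```

Each returned list could be passed to `subprocess.run(cmd, check=True)` once the output folder exists. Sorting the names first keeps the `_01`, `_02`, `_03` parts in order inside each merged file.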

Related

Create a file listing that contains the created date of the file

I'm trying to copy photos from someone's iPhone to my Windows laptop. The problem is that the photos on the iPhone are saved with file names like IMG 360, IMG 361, etc., which isn't helpful when I want to copy them and organise them by a meaningful file name and creation date.
I use Google Photos and my own backup to organise photos in chronological order.
We went on holiday together and I am trying to find the best way to get their files organised and merged in with my own photos so that they appear in the right chronological order.
Unless there is a better way to do this, I am trying to create a file listing using a BAT file to list all the files together with their CREATED DATE and then I will create another BAT file to rename those files by incorporating their CREATED DATE.
Any ideas?
Thanks Chirag
I tried the below but this is supposed to only organise in chronological order, but it doesn't seem to even do that.
dir /a /b /-p /s /T:C /o:gen >filelisting.txt
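You can use the command dir /T:C /O:D > filelisting.txt to create a file listing that contains the created date of each file in a directory, sorted by that date. In your command, /b (bare format) prints only the names and hides the dates, and /o:gen sorts by extension and name rather than by date.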
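If the listing is going to feed a later rename step, a script may be easier to extend than parsing `dir` output. The following is a minimal sketch, not a definitive implementation: `created_listing` is a hypothetical helper, and it assumes the file system still reports the original creation time. On Windows `st_ctime` is the creation time, while on Linux it is the last metadata change, so the result there means something different.

```python
from datetime import datetime
from pathlib import Path


def created_listing(root):
    """Return (created, relative_path) pairs for all files under root, oldest first."""
    rows = []
    for path in Path(root).rglob("*"):
        if path.is_file():
            st = path.stat()
            # Prefer st_birthtime where the platform provides it;
            # otherwise fall back to st_ctime (creation time on Windows only).
            ts = getattr(st, "st_birthtime", st.st_ctime)
            rel = path.relative_to(root).as_posix()
            rows.append((datetime.fromtimestamp(ts), rel))
    return sorted(rows)


def write_listing(root, out_file):
    """Write one 'YYYY-MM-DD HH:MM:SS<TAB>path' line per file."""
    with open(out_file, "w", encoding="utf-8") as fh:
        for created, rel in created_listing(root):
            fh.write(f"{created:%Y-%m-%d %H:%M:%S}\t{rel}\n")
```

Note that copying photos off a phone often resets the creation time to the moment of the copy, so the date stored inside the photo itself may be a more reliable source for ordering.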

How to generate a 10000 lines test file from original file with 10 lines?

I want to test an application with a file containing 10000 lines of records (plus header and footer lines). I have a test file with 10 lines now, so I want to duplicate these lines 1000 times. I don't want to write C# code in my app to generate that file (it is only for testing), so I am looking for a different, simple way to do that.
What kind of tool can I use to do that? CMD? Visual Studio/VS Code extension? Any thought?
If your data is textual, load the 10 records from your test file into an editor. Select all, copy, and paste at the end of the file. Repeat until the file has 10000 or more lines; each cycle doubles the line count.
This procedure requires ceil(log_2(1000)) cycles, 10 in your case, in general ceil(log_2(<target_number_of_lines>/<base_number_of_lines>)).
Alternative (large files)
Modern editors should not have performance problems here. However, the same principle can be applied with the cat command-line tool. Assuming that you copy the original file into a file named dup0.txt, proceed as follows:
cat dup0.txt dup0.txt >dup1.txt
cat dup1.txt dup1.txt >dup0.txt
This leaves dup0.txt with four times the original number of lines. Repeat the pair of commands until the target size is reached.
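The doubling count and the header/footer handling from the question can also be sketched in a few lines of Python. Both function names are hypothetical, and the sketch assumes the header and footer are a fixed number of lines at the start and end of the file.

```python
import math


def doubling_cycles(base_lines, target_lines):
    """Number of copy-and-paste doublings needed to reach target_lines."""
    return math.ceil(math.log2(target_lines / base_lines))


def expand(lines, copies, header=0, footer=0):
    """Repeat the record lines `copies` times, keeping header and footer once."""
    body_end = len(lines) - footer
    head, body, foot = lines[:header], lines[header:body_end], lines[body_end:]
    return head + body * copies + foot
```

For example, calling `expand` on the lines of the 10-record file with `copies=1000` and the appropriate `header` and `footer` counts yields 10000 record lines while leaving the header and footer untouched, which plain doubling of the whole file would not do.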

Combining pdftk strings for specific pages

I've checked "Similar questions" and went through a lot of search but I can't seem to find a way to combine the snippets I already figured out; would be awesome if someone is able to help.
Using pdftk, alternatively running through PowerShell
I have two .pdf files (e.g. A = 1000 pages, B = 5000 pages) which I need to combine in a specific way to generate a new .pdf file. In detail, I need pages 1-3, 4-6[...] of file A merged with pages 1-4, 4-8[...] of file B, with a blank page between 1-3 & 4-6.
So far I have figured out how to burst the files, add a blank page and combine them into a new .pdf file. Yet I'm only able to do that for one needed document at a time (a new file with 8 pages).
pdftk fileC.pdf fileD.pdf cat output fileE.pdf
pdftk A=fileE.pdf B=blankpage.pdf cat A1-1 B1-1 A2-4 output conclusion.pdf
Now I'm wondering if there's a way to output the complete file with a command? Otherwise I'd have to do it for every merge of two long files.
Thanks in advance!
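Since pdftk accepts many page ranges in a single `cat`, one option is to generate the whole range list with a short script instead of typing it. The sketch below is only one reading of the requirement and rests on assumptions: chunks of 3 pages from A and 4 pages from B, with the single blank page (handle C) placed after each chunk of A. The function names and file names are hypothetical.

```python
def interleave_ranges(pages_a, pages_b, chunk_a=3, chunk_b=4):
    """Build pdftk page ranges: a chunk of A, the blank page C1, then a chunk of B."""
    args = []
    start_a, start_b = 1, 1
    while start_a <= pages_a and start_b <= pages_b:
        end_a = min(start_a + chunk_a - 1, pages_a)
        end_b = min(start_b + chunk_b - 1, pages_b)
        args += [f"A{start_a}-{end_a}", "C1", f"B{start_b}-{end_b}"]
        start_a, start_b = end_a + 1, end_b + 1
    return args


def build_command(file_a, file_b, blank, output, pages_a, pages_b):
    """Assemble the full pdftk argument list for one run."""
    ranges = interleave_ranges(pages_a, pages_b)
    return ["pdftk", f"A={file_a}", f"B={file_b}", f"C={blank}",
            "cat", *ranges, "output", output]
```

The loop stops as soon as either file runs out of pages, so a different rule is needed if the leftover pages should be kept. With inputs of this size the argument list becomes long, so it is worth checking that it stays within the command-line length limit of the shell or launcher being used.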

Compare different versions of the same directory (by date modified)

This is a multi-part question. I can fill in details once I get to a working prototype.
Situation: Due to a comedy of errors, I have three copies of a very large directory, each copy has some new files/versions of files that are unique. I would like to combine these, keeping the newest version of every file.
Breakdown of things I don't know: how to compare directories to one another recursively (I will probably do two at a time; 1 vs 2 = 1+2, then 1+2 vs 3 = 1+2+3). The crucial step is how to use the path/file name of a file in directory 1 to first see whether it can be found in directory 2 and then, if found, use the date modified to decide whether to copy the version from 1 or from 2 to the new combined directory.
I think with these 3 pieces of information (recursively compare files between two directories, by path, and by date modified), I can piece together how to script this. While I can look up these bits separately, it's going to be tough to convince myself this process was done correctly, and I'd like a little help with the actual assessment/moving step so that I worry less about having overlooked some small but crucial detail.
Will post the script when I have it put together, along with any caveats about my confidence in it.
Don't waste time writing a script when robocopy is built for file copying and has enough options to cover pretty much any situation...
By default it will only copy a file if the source and destination have different time stamps or different file sizes.
Using /XO will exclude older files that differ, so you will only end up with the newest files in destination.
/E includes subfolders, including empty ones; change it to /S to skip empty ones.
robocopy C:\source1 C:\destination /E /XO
robocopy C:\source2 C:\destination /E /XO
[etc]
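For anyone who wants to see the comparison spelled out, or who is not on Windows, the "newest version wins" rule can be sketched in Python. This is an approximation of the idea and not a reimplementation of robocopy: the hypothetical `copy_newer` helper compares modification times only and ignores file sizes.

```python
import shutil
from pathlib import Path


def copy_newer(source, destination):
    """Copy files from source into destination unless destination is as new or newer."""
    copied = []
    for src in Path(source).rglob("*"):
        if not src.is_file():
            continue
        rel = src.relative_to(source)
        dst = Path(destination) / rel
        # Same relative path already present and not older: keep it.
        if dst.exists() and dst.stat().st_mtime >= src.stat().st_mtime:
            continue
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)  # copy2 preserves the modification time
        copied.append(rel.as_posix())
    return sorted(copied)
```

Running it once per source directory against the same destination mirrors the sequence of robocopy calls above, and the returned list of copied paths gives something to review afterwards.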

Talend tWaitForFile insufficiency

We have a producer process that runs continuously and writes files into a specific folder. We have to read the files one by one using Talend, and there are 2 issues:
The 1st: tWaitForFile only reads files that exist before it starts, so files created after the component has started are not visible to it.
The 2nd: There is no way to know whether a file has been released by the producer process, so it may be read while it is not completely written. The parameter _wait_release_ of tWaitForFile does not work on Linux systems!
So how can we make Talend read completely written files from a directory that contains an increasing number of files?
I'm not sure what you mean by your first issue. tWaitForFile has options to trigger when files are created, modified or deleted in a folder.
As for the second issue, your best bet here is for the file producer to create an OK or control file, which is a 0-byte touch, when it has finished writing the file you want.
In this case you simply look for the appearance of the OK file and then pick up the relevant completed file. If you give the 2 files the same name but a different file extension (the OK file is typically called ".OK"), then this should be easy enough to look for. So you would set your tWaitForFile to look for "*.OK" files, connect it through an iterate to a tFileInputDelimited (in case you want to pick up a delimited text file), and then declare the file name as ((String)globalMap.get("tWaitForFile_1_CREATED_FILE")).substring(0,((String)globalMap.get("tWaitForFile_1_CREATED_FILE")).length()-3) + ".txt"
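The marker-file convention itself is independent of Talend. As a rough illustration outside Talend, the sketch below scans a folder for ".OK" markers and returns the matching data files. `ready_files` is a hypothetical helper, and the ".OK" and ".txt" suffixes are the ones assumed in the answer above.

```python
from pathlib import Path


def ready_files(folder, marker_suffix=".OK", data_suffix=".txt"):
    """Return data files whose marker file exists, i.e. files the producer has finished."""
    ready = []
    for marker in sorted(Path(folder).glob(f"*{marker_suffix}")):
        # Same base name, different extension: orders.OK -> orders.txt
        data = marker.with_suffix(data_suffix)
        if data.exists():
            ready.append(data)
    return ready
```

A data file without a marker is skipped, which is the point of the convention: the producer creates the marker only after it has closed the data file, so the consumer never reads a partially written one.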
