Merging PDFs with Ghostscript, ignoring outlines and using pdfmark instead

I am using a Batch script to merge different PDFs in one complete file.
%gsc% -dBATCH -sDEVICE=pdfwrite -sPAPERSIZE=letter -dEPSFitPage -o %dsk%%zus%%ext% %mfd% %pth%tmp\pdfmarks
%dsk%%zus%%ext%: path and name of the final (complete) document
%mfd%: path and names of the docs to be merged (c:\test\1.pdf c:\test\2.pdf ...)
%pth%tmp: path to the pdfmarks file
Additionally, I am creating a pdfmarks file inside the script, which gs uses to create the bookmarks. Unfortunately, some of the docs I am merging already have their own bookmarks, and I have not yet found a way to ignore those. GS should only use the bookmarks from the pdfmarks file.
How can this be done?

Firstly, you are not 'merging' PDF files when you use Ghostscript's pdfwrite device. The process is described in detail here.
The important point is that the way the input file(s) are constructed has no bearing on the way the output file is constructed. If any other software you use relies on the file being constructed in a particular fashion, it may not work on the output PDF file.
The -dEPSFitPage switch only has any effect when the input is an EPS file. If you want to 'fit' PostScript or PDF files then you need to use -dPDFFitPage, -dPSFitPage or just -dFitPage. However, all of these rely on you first selecting a media size and then preventing it being altered by setting -dFIXEDMEDIA. For EPS files you would more normally use -dEPSCrop, which sets the media size to the EPS's declared BoundingBox.
You can prevent the PDF interpreter from reading the Outlines tree (what you are calling bookmarks) and turning it into pdfmarks for the pdfwrite device by using the -dNO_PDFMARK_OUTLINES switch, which oddly isn't documented; presumably an oversight.
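Putting that together, the batch line might look something like the sketch below; keep -dFIXEDMEDIA and -dFitPage only if you actually want the input pages scaled to letter, otherwise drop them (and -sPAPERSIZE) entirely. The pdfmarks lines are placeholders showing the /OUT syntax, not your real titles or page numbers.
%gsc% -dBATCH -sDEVICE=pdfwrite -sPAPERSIZE=letter -dFIXEDMEDIA -dFitPage -dNO_PDFMARK_OUTLINES -o %dsk%%zus%%ext% %mfd% %pth%tmp\pdfmarks
with a pdfmarks file along the lines of:
[ /Title (First document) /Page 1 /OUT pdfmark
[ /Title (Second document) /Page 5 /OUT pdfmark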

How do I configure mplayer to use a default edl file name?

I want to configure mplayer to look for an edl when playing a video. Specifically, I want it to use "show.edl" when playing "show.mp4", assuming both are in the same directory. Very similar to how it looks for subtitles.
I can add a default edl in the config file by adding the following:
edl=default.edl
And this will look for the file "default.edl" IN THE CURRENT DIRECTORY, rather than in the directory where the media file is. It isn't named after the media file either, so even if it did look in the right place, I'd have one single edl file shared by every media file in that directory.
Not really what I wanted.
So, is there a way, in the "~/.mplayer/config" file, to specify the edl relative to the input file name?
Mplayer's config file format doesn't seem to support any sort of replacement syntax. So there's no way to do this?
MPlayer does not have a native method to specify strings in the config file relative to the input file name. So there's no native way to deal with this.
There's a variety of approaches you could use to get around that. Writing a wrapper around mplayer to parse out the input file and add an "-edl=" parameter is fairly general, but will fail on playlists, and I'm sure lots of other edge cases. The most general solution would of course be to add the functionality to mplayer's config parser (m_parse.c, iirc.)
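For what it's worth, such a wrapper can be very small; the sketch below handles exactly one file and is precisely the sort of thing that breaks on playlists and multi-file invocations:
#!/bin/sh
# Hypothetical wrapper: play a single file, adding -edl if a matching .edl sits beside it.
f="$1"
edl="${f%.*}.edl"
if [ -f "$edl" ]; then
    exec mplayer -edl "$edl" "$f"
else
    exec mplayer "$f"
fi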
The simplest, though, is to (ab)use media-specific configuration files.
pros:
Doesn't require recompiling mplayer!
Well defined and limited failure modes, i.e. the ways it fails and when it fails are easily understood, and there aren't "oops, didn't expect that" behaviors hidden anywhere.
Construction and updating of the edl files is easily automated.
cons:
Fails if you move the media around, as the config files need the full path to the edl file to function correctly.
Requires you to have a ".conf" file as well as an EDL file, which adds clutter to the file system.
Malicious config files in the media directory may be a security issue. (Though if you're allowing general upload of media files, you probably have bigger problems. mplayer is not at all a security-hardened codebase, nor generally are the codecs it uses.)
To make this work:
Add "use-filedir-conf=yes" to "/etc/mplayer.conf" or "~/.mplayer/config". This is required, as looking in the media directory for config files is turned off by default,
For each file "clip.mp4" which has an edl "clip.edl" in the same directory, create a file "clip.mp4.conf" which contains the single line "edl=/path/to/clip.edl". The complete path is required.
Enjoy!
Automatic creation and updating of the media-specific .conf files is left as an exercise for the student.
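For the impatient student, here is a rough sketch in POSIX shell; it assumes every EDL sits in the same directory as its media file and shares its base name:
#!/bin/sh
# For each *.edl in the given directory, write a <media file>.conf next to the
# matching media file, pointing mplayer at the EDL via an absolute path.
dir=$(cd "${1:-.}" && pwd)
for edl in "$dir"/*.edl; do
    [ -f "$edl" ] || continue
    base="${edl%.edl}"
    for media in "$base".*; do
        case "$media" in
            *.edl|*.conf) continue ;;
        esac
        printf 'edl=%s\n' "$edl" > "$media.conf"
    done
done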

Set filename to ditamap title in DITA-OT Command Line PDF transformation

I have a script that builds a system on a regular schedule, and as part of that system, I need to convert several documents from dita to PDF.
I can run the following command line from my script just fine:
dita --input=<file location> --output=<output location> --format=pdf
But due to naming conventions and other restrictions, the names of the ditamap files are not always well-formed or human-readable (and I am not able to change the names of the files). I'm aware of the outputBase.file parameter that I can pass in on the command line, but I would like dita to be able to scan/read the file and substitute the document title as the filename, something along the lines of:
dita --input=<file> --output=<output> --format=pdf --outputBase.file=$title
Is this even possible?
You don't have to change the dita command-line format. Instead, you can change the output PDF file name to the document title with the following steps:
1. At the start of your PDF plug-in processing, read the main map's title (bookmap or map) using an XSLT task, and output an XML file that contains the title.
2. Set the title into a property of your choice (such as document.title). To set the property, it is useful to use the <xmlproperty> task in the Ant build script, as sketched below.
3. After generating the PDF file, rename the PDF in <output location> to ${document.title}.pdf in the last phase of the build process.
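As a sketch of steps 2 and 3 in Ant (the file name title.xml and the properties output.dir and map.filename below are made up for illustration; they are not DITA-OT defaults):
<target name="rename-pdf-to-title">
  <!-- title.xml is assumed to be the small file produced in step 1,
       e.g. <doc><title>My Document</title></doc> -->
  <xmlproperty file="${output.dir}/title.xml" keepRoot="false"/>
  <!-- with keepRoot="false" the <title> text becomes available as ${title} -->
  <move file="${output.dir}/${map.filename}.pdf"
        tofile="${output.dir}/${title}.pdf"/>
</target>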
In my experience, one user wanted to output a PDF authored as a bookmap, and the above technique worked fine for them.
Hope this helps your development.

Doxygen-produced PDF - change url color?

I’m using Doxygen 1.8.10 (on Windows) to generate LaTeX files, and MiKTex 2.9 to generate a PDF. The PDF is functional, but not very pretty. I’ve figured out how to customize the title page (I added graphics and non-default text) and how to get the images into the PDF.
But... how do I change the styling for things such as the color of URLs (which are just text in the Doxygen comments, and then Doxygen turns them into \href items)?
I believe I need to change something in the hyperref package's configuration or in what Doxygen writes to the .tex files, but I'm not sure which approach is right, nor how to do either one...
I've created a custom_doxygen.sty file and assigned it to LATEX_EXTRA_STYLESHEET. I assume it's being picked up by Doxygen, because Doxygen is successfully picking up my custom LATEX_HEADER file, which is in the same directory as the custom_doxygen.sty file. But I don't know what to put into the custom_doxygen.sty file.
If I run everything as default (that is, no LATEX_EXTRA_STYLESHEET), the following code gets written to the refman.tex file:
% Hyperlinks (required, but should be loaded last)
\usepackage{ifpdf}
\ifpdf
\usepackage[pdftex,pagebackref=true]{hyperref}
\else
\usepackage[ps2pdf,pagebackref=true]{hyperref}
\fi
\hypersetup{%
colorlinks=true,%
linkcolor=blue,%
citecolor=blue,%
unicode%
}
And what I need is for the “urlcolor” to also be blue (its default in the hyperref package is magenta—an odd choice for sure).
I tried just basically copying what was in the refman.tex file to the custom_doxygen.sty file (and making sure that the custom_doxygen.sty file is assigned to the LATEX_EXTRA_STYLESHEET setting in my Doxyfile) and adding a “urlcolor=blue,%” to the setup section, but there’s no change in the output.
If I manually edit the refman.tex file (that is, I add "urlcolor=blue,%" to the \hypersetup block) after it's output from Doxygen, and then use the edited file as input to MiKTeX, I get the desired output.
So a workaround could be to just script the desired change and run the script every time. But it would certainly be better to get Doxygen to write the necessary configuration. Plus, there are other things I want to customize (such as the font of explicit html hrefs), so I'd like to learn how to do things properly.
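For reference, one thing that may be worth trying in custom_doxygen.sty is deferring the call until hyperref has definitely been loaded, rather than copying the whole \usepackage block; this is a sketch, not a confirmed fix:
% custom_doxygen.sty - sketch only
% Wait until \begin{document}, by which point refman.tex has loaded hyperref,
% then add urlcolor on top of the colours Doxygen already configures.
\AtBeginDocument{%
  \hypersetup{urlcolor=blue}%
}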

Method to decompress a PDF (non-Adobe) while retaining form fields?

I found a similar question that involves Acrobat, but in this case the PDF was made with a combination of MS Word and CenoPDF v3, with which I'm unfamiliar. Additionally the PDF is version 1.3. I'd like to decompress it, to see its low-level workings and make some changes. It's easy with Ghostscript's -dCompressPages=false parameter, but that simultaneously strips all the fill-in form functionality. Is there a method for decompressing the file while leaving everything else intact? A quick search of the docs for tcpdf and fpdi (cited in the link) didn't reveal a compression option.
Ghostscript and pdfwrite aren't a good combination for this. The PDF file you get out is NOT the same as the one you put in. This is because of the way that Ghostscript and pdfwrite work: the input is fully interpreted into a sequence of graphics primitives, which is sent to the Ghostscript graphics library. These are then sent to the requested device; most devices render the result to a bitmap, but the pdfwrite family reassembles those graphics primitives into a new PDF file.
Note that the contents of the new PDF file have no relationship to the original, other than the appearance when rendered. Ghostscript and pdfwrite do maintain much of the non-marking content of PDF files such as hyperlinks and so on (which obviously don't get turned into graphics primitives), by interpreting them into pdfmark operations (an extension to the PostScript language defined by Adobe). However, even if Ghostscript and pdfwrite maintained all this content, the resulting PDF file wouldn't be the same as the original one decompressed....
There are tools which will decompress PDF files, and I would recommend one of our other products, MuPDF. Part of MuPDF is the mutool utility, and "mutool clean -d in.pdf out.pdf" will decompress pretty much everything in a PDF file.
QPDF can decompress PDF documents (among other things). I used this tool in the past and it preserved forms and data.
The tool has some issues with large PDFs (decompression can take a lot of time and memory). It can also produce incomplete output (with warnings in the console) for some partially broken or nonstandard PDFs.
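A typical invocation for this purpose is something like the line below; the flags are standard QPDF options, but check them against your version's documentation:
qpdf --qdf --object-streams=disable input.pdf output.pdf
The --qdf mode writes an uncompressed, human-editable file, and --object-streams=disable unpacks object streams so individual objects stay visible; after hand-editing the result, the companion fix-qdf tool can repair the cross-reference offsets.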

C# folder and subfolder

After numerous searches, I am here to see if someone has any idea how I should go about tackling this issue.
I have a folder with sub-folders. Each sub-folder contains files of different types, e.g. pdf, png, jpeg, tiff, avi and Word documents.
My goal is to write code in C# that will go into each subfolder and combine all of its files into one PDF, using the name of the folder. The only exception is that a file such as an avi will not be PDF'ed, in which case I want a nudge as to which folder it is in and, ideally, the file name. I am trying to use the forms approach, so that you can paste in the folder pathname and also the destination of the created PDF.
Thanks.
To start, create a FolderBrowserDialog to get the root folder. Alternatively, just make a textbox in which you paste the folder name (less preferred, since the first method gives you nicer error handling straight out of the box).
To iterate through the tree, see How to: Iterate Through a Directory Tree.
To find the file type, check System.IO.FileInfo.Extension for each file you iterate through. Add those to a list with the data you need (hint: create a list of objects in which each object reflects the data you need, such as path, type, etc.). If it's an avi, don't toss it in the list but flash a warning (a MessageBox?) instead.
From here the original question gets fuzzy. What exactly do you need in the PDF: just the filenames and locations, or do you actually want to put the contents of each file into the PDF?
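Either way, here is a minimal sketch of the iteration and classification part; the PDF merging itself depends on which PDF library you pick, so it is left out, and the class and variable names are made up for illustration:
using System;
using System.Collections.Generic;
using System.IO;

// Sketch only: walk the root folder, collect convertible files per sub-folder
// and flag .avi files instead of queuing them for conversion.
class MediaScanner
{
    static void Main(string[] args)
    {
        string root = args.Length > 0 ? args[0] : @"C:\test";

        foreach (string subFolder in Directory.EnumerateDirectories(root))
        {
            var convertible = new List<string>();

            foreach (string file in Directory.EnumerateFiles(subFolder, "*", SearchOption.AllDirectories))
            {
                string ext = Path.GetExtension(file).ToLowerInvariant();
                if (ext == ".avi")
                {
                    // The "nudge": report the folder and file instead of converting it.
                    Console.WriteLine($"Skipping video: {file} (in {subFolder})");
                }
                else
                {
                    convertible.Add(file);
                }
            }

            // Hand the collected files to whatever PDF library you choose;
            // the output name comes from the sub-folder itself.
            string outputPdf = Path.Combine(root, Path.GetFileName(subFolder) + ".pdf");
            Console.WriteLine($"Would merge {convertible.Count} file(s) into {outputPdf}");
        }
    }
}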