Is there a way to add 'crop marks' to a generated PDF?
I mean the marks used for cutting the final product; this is sometimes referred to as bleed.
I'm creating a PDF document using iTextSharp. I generate all of my content in a C# List<Chapter>, where the Chapters contain one or more Sections and have not yet been added to the document. I then enumerate through my List<Chapter> to generate a table of contents at the start of the document, and then add the Chapters to the document after my TOC.
That works great when my Sections contain text and images, but now I need to generate a Section containing boxes and lines. I don't want to draw my boxes and lines into an image and drop the image into the Section, that won't look as good as if I have actual PDF boxes and lines.
The Sections containing graphical elements can be intermixed with Sections containing text, so I need a way to add some kind of element to a Section such that that graphical Section works like text Sections in terms of going onto a new page only if necessary.
What's the best way to do this? I feel like it somehow involves PdfTemplates but I'm not sure how. Or maybe I need to create a PdfPTable and create my graphical elements in an IPdfPCellEvent?
You are on the right track when you want to involve PdfTemplate elements. PdfTemplate is an iText object that corresponds with the concept of Form XObjects in the PDF specification. We chose another name because the word Form is somewhat misleading (people confuse it with form fields, interactive forms, etc).
The content stream of a page in PDF is a sequence of PDF syntax, consisting of operands and operators. An XObject is an object that is external to this content stream. The content of an XObject is stored inside the PDF document only once, but it can be reused many times, on the same page or on different pages.
There are different types of XObjects, but Image XObjects and Form XObjects are the most important ones.
Image XObjects are used when we work with raster images. You are absolutely right when you write: "I don't want to draw my boxes and lines into an image and drop the image into the Section, that won't look as good as if I have actual PDF boxes and lines."
Form XObjects are used when we want to reuse PDF syntax. This is what you need: you want to define moveTo(), lineTo(), curveTo(), stroke(), fill(),... operations, and you want these lines and shapes to be stored as vector data.
The solution to your problem is to draw lines and shapes to a PdfTemplate object and to wrap the PdfTemplate object inside an Image object. When you add that Image object to a Section or a Chapter, it will be added as a Form XObject. You don't have to fear that it will be degraded into a raster image.
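To make the Form XObject idea more concrete, here is a rough Python sketch of the raw PDF syntax that ends up in the file when a box and two lines are drawn into a template. It does not use iText; the function names (box_with_diagonals, form_xobject, place) and the resource name Fm0 are made up for illustration. Only the operators (w, re, m, l, S, cm, Do, q, Q) and the dictionary keys come from the PDF specification.

```python
def box_with_diagonals(width, height, line_width=1.0):
    """Return PDF content-stream operators for a box and its two diagonals."""
    ops = [
        f"{line_width:g} w",                # set the line width
        f"0 0 {width:g} {height:g} re",     # rectangle path: x y w h re
        "S",                                # stroke the current path
        f"0 0 m {width:g} {height:g} l S",  # moveTo, lineTo, stroke
        f"0 {height:g} m {width:g} 0 l S",  # the other diagonal
    ]
    return "\n".join(ops)


def form_xobject(content, width, height):
    """Wrap a content stream in the body of a Form XObject stream object."""
    data = content.encode("ascii")
    header = (
        "<< /Type /XObject /Subtype /Form "
        f"/BBox [0 0 {width:g} {height:g}] /Length {len(data)} >>"
    )
    return header + "\nstream\n" + content + "\nendstream"


def place(name, x, y):
    """Page-level operators that paint the XObject with its origin at (x, y)."""
    return f"q 1 0 0 1 {x:g} {y:g} cm /{name} Do Q"


if __name__ == "__main__":
    stream = box_with_diagonals(100, 50)
    print(form_xobject(stream, 100, 50))
    # The same XObject can be painted several times without repeating its content.
    print(place("Fm0", 72, 700))
    print(place("Fm0", 72, 600))
```

The drawing operators are stored once in the XObject, and each page only contains the short "Do" invocation, which is why the shapes stay vector data however often they are reused.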
You can find some examples of this technique on the official web site. For instance in the answer to the question
How to generate 2D barcode as vector image?
Here we create a PdfTemplate with a bar code and we return it as an Image object. The screen shot that shows you the internals of the resulting PDF proves that the bar code is added as a vector image.
public Image createBarcode(PdfContentByte cb, String text,
        float mh, float mw) throws BadElementException {
    BarcodePDF417 pf = new BarcodePDF417();
    // Encode the text passed in by the caller.
    pf.setText(text);
    Rectangle size = pf.getBarcodeSize();
    // The template (Form XObject) is sized to the scaled bar code.
    PdfTemplate template = cb.createTemplate(
            mw * size.getWidth(), mh * size.getHeight());
    pf.placeBarcode(template, BaseColor.BLACK, mh, mw);
    // Wrapping the template in an Image keeps it as vector data.
    return Image.getInstance(template);
}
To create a PdfTemplate object, you need a PdfContentByte instance (e.g. obtained with writer.getDirectContent()); call its createTemplate() method, passing a width and a height as parameters. Then you draw content to the PdfTemplate and turn it into an Image object using Image.getInstance().
You'll find more info on drawing lines and shapes in the chapter on Absolute positioning of lines and shapes and in the example section of Chapter 3 and Chapter 14 of my book.
I want to extract a selection of data from my high-quality PDF document, which has no textual elements (just a plot) and was originally produced by Matlab.
I do not want to give the whole picture to my colleagues because it is too overwhelming.
#1 Tools in Matlab
I know the thread "How can I read an image file that is stored in PDF format (much like reading a jpeg file with I = imread('image.jpg'))?", but I have heard negative experiences from my colleagues, and for my task PDF should be enough because my data is just a high-quality plot without textual elements.
The most relevant thread is "How to extract data from pdf file in matlab?"
Most attempts are based on converting PDF to TXT, as in "How to Read PDF file in Matlab?", which is about pdftotext.
I now want to imcrop the PDF so that the output can be used in the time-series analysis of Mathematica, but as far as I can tell from "Crop an Image", Matlab's default imcrop tool does not support PDF.
Some findings
Show and save as PDF, based on the answer. I do pdf = Import["filename.pdf"]; Show[pdf[[1]], PlotRange -> {{50, 200}, {100, 300}}] and I see a correctly selected picture in the image viewer, but when I export the picture and import it back into Mathematica, I see the complete picture. Why? PlotRange does not crop; it only puts a white mask on top of the picture, which can be separated again in Mathematica.
Going from Show to ImageCrop based on this answer. Wrong approach, confusion with ImageTake.
Going from Show to ImageTake based on this answer.
Show and ImageTake do not map one-to-one onto each other, because ImageTake at least takes its parameters in the reverse order, {ymin,ymax}, {xmin,xmax}, according to the manual. However, I could not select the correct region by just reversing the parameters. Why?
Comments for Mathematica
It would be nice if the regions selected would correspond to each other.
Therefore, I would like to have some visual tool to select appropriate area from the figure.
I notice some aliasing when enlarging the original image.
It would be nice to know how Mathematica handles such cases with ImageTake.
How can you prepare an imcrop of a PDF image for the time-series toolbox of Mathematica?
I think this question is about image extraction.
However, I extended the question to the thread Better Colormap of Matlab and Image Extraction for Time-Series Toolbox of Mathematica? for Mathematica.
Mathematica will import your pdf as a graphics object, which you can 'crop' using PlotRange.
pdf = Import["filename.pdf"];
Show[pdf[[1]], PlotRange -> {{50, 200}, {100, 300}}]
Note that the values are {{xmin,xmax},{ymin,ymax}} in "points".
You can also rasterize and then use ImageTake:
ImageTake[Rasterize[pdf[[1]]], {10, 100}, {20, 100}]
Here the values are {ymin,ymax}, {xmin,xmax} (note the reverse order).
Note the [[1]] here is effectively the page number. I'm pretty sure Import returns a list of pages even if the pdf is a single page.
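Swapping the two pairs is not enough to get the same region from both calls: ImageTake counts rows from the top of the image, while PlotRange measures y upward from the bottom. A small Python sketch of the conversion, assuming the rasterized image covers the graphic exactly (no padding, origin at the bottom-left corner) and that scale is the number of pixels per point. The function name and both assumptions are mine, not something Mathematica provides.

```python
def plot_range_to_image_take(x_range, y_range, image_height, scale=1.0):
    """Convert a PlotRange {{xmin,xmax},{ymin,ymax}} in points into
    ImageTake-style ({rowmin,rowmax}, {colmin,colmax}) pixel ranges.

    Rows and columns are 1-based, and rows are counted from the top.
    """
    xmin, xmax = x_range
    ymin, ymax = y_range
    # Columns run left to right, just like x.
    col_min = int(xmin * scale) + 1
    col_max = int(xmax * scale)
    # Rows run top to bottom, so the y axis has to be flipped.
    row_min = image_height - int(ymax * scale) + 1
    row_max = image_height - int(ymin * scale)
    return (row_min, row_max), (col_min, col_max)


if __name__ == "__main__":
    # PlotRange -> {{50, 200}, {100, 300}} on a raster that is 500 pixels high.
    rows, cols = plot_range_to_image_take((50, 200), (100, 300), image_height=500)
    print(rows, cols)  # (201, 400) (51, 200)
```

The resulting pairs would go into ImageTake in the order rows first, then columns.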
If you want to actually extract plot data, that's a whole other question. For that I'd suggest asking on mathematica.stackexchange.com and providing an example file.
I am developing image processing software that extracts (crops) and enhances a single-page form from an image taken with a cellphone camera. The form has no rectangular boundaries that would simplify the extraction. It is black text on a white background, but nothing apart from that is fixed. Some text will be present that verifies that the image shows the required form. So my questions are these.
1) Can I search for a specific regular expression using the Leptonica library itself, or do I have to shift focus to other libraries such as the Tesseract API to do this? So far I have not found anything of this sort.
2) Now suppose I know the text at the top left corner and the bottom right corner, and I find it successfully. Can I get the coordinates of the particular text that I am searching for and then crop the image accordingly?
Leptonica doesn't do anything with text, it's an image processing library.
To get the positions of the text, add tessedit_create_hocr 1 to your Tesseract config file (or set this option in whichever way you configure Tesseract if you're using it as a library).
The result is no longer a text file, but a UTF-8-encoded HTML file (note: it's not valid XML). Its format is self-explanatory. It will contain the positions and dimensions of all words on all pages in pixels, as found on the input image. You need to parse that HTML, find the words you're looking for, and then get the bounding boxes of those words.
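As a rough illustration of that parsing step, here is a Python sketch using only the standard library. It assumes the usual hOCR layout, in which each word is a span with class ocrx_word and a title attribute of the form "bbox x0 y0 x1 y1; ...". The sample markup, the class name HocrWords and the helper crop_box are invented for the example.

```python
import re
from html.parser import HTMLParser


class HocrWords(HTMLParser):
    """Collect (text, (x0, y0, x1, y1)) for every ocrx_word span."""

    def __init__(self):
        super().__init__()
        self.words = []
        self._bbox = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if "ocrx_word" in (a.get("class") or "").split():
            m = re.search(r"bbox (\d+) (\d+) (\d+) (\d+)", a.get("title") or "")
            if m:
                self._bbox = tuple(int(v) for v in m.groups())
                self._text = []

    def handle_data(self, data):
        if self._bbox is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "span" and self._bbox is not None:
            self.words.append(("".join(self._text).strip(), self._bbox))
            self._bbox = None


def crop_box(words, top_left_pattern, bottom_right_pattern):
    """Crop rectangle spanned by the first words matching the two regexes."""
    tl = next(b for t, b in words if re.fullmatch(top_left_pattern, t))
    br = next(b for t, b in words if re.fullmatch(bottom_right_pattern, t))
    return (tl[0], tl[1], br[2], br[3])


# Hypothetical hOCR fragment, shortened for the example.
SAMPLE = """
<div class='ocr_page' title='bbox 0 0 800 1000'>
 <span class='ocr_line' title='bbox 40 50 300 80'>
  <span class='ocrx_word' title='bbox 40 50 120 80; x_wconf 93'>Form</span>
  <span class='ocrx_word' title='bbox 130 50 300 80; x_wconf 90'>A-17</span>
 </span>
 <span class='ocr_line' title='bbox 600 900 760 930'>
  <span class='ocrx_word' title='bbox 600 900 760 930; x_wconf 88'>Signature</span>
 </span>
</div>
"""

if __name__ == "__main__":
    parser = HocrWords()
    parser.feed(SAMPLE)
    print(crop_box(parser.words, r"Form", r"Signat\w+"))  # (40, 50, 760, 930)
```

The returned rectangle is in pixel coordinates of the input image, so it can be passed to whatever cropping routine you use. Matching the regular expression against single words is the simplest case; a phrase that spans several words would need the words of a line joined first.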
How can the line smoothness in a contour plot be improved for publications? For instance, the dotted lines look really bad, and the continuous lines look as if their thickness varies. See below
Here's part of the code:
Vals = [0:5:200];
contourf(X,Y,W,Vals,'EdgeColor','k','LineWidth',1.2,'LineStyle',':');
axis square; grid; hold on
Vals = [10:10:200];
contour(X,Y,W,Vals,'EdgeColor','k','LineWidth',1.2);
Vals = [20:20:200];
[C,h] = contour(X,Y,W,Vals,'Color','k','LineWidth',1.8);
clabel(C,h,'FontName','Palatino Linotype','FontAngle','italic','Fontsize',9,'Color','w')
print -djpeg -r300 filename
Thanks!
Saved as png doesn't help much... check the lines :/ See below:
Check the dotted lines now...
Here's saving as eps (-r1200)... it looks better
Exporting as vector graphics will definitely improve the image over what you see on your screen. I use LaTeX for publications: you can export to eps for PostScript output, or run epstopdf on the result for PDF output, and embed the files directly in your document. That would be the best solution.
Additionally, there are a bunch of general utilities for making your plots look better for camera-ready publications. The most notable one that comes to mind is export_fig, which has a load of features to help even with pixel graphics. These go above and beyond just generating smoother-looking images.
http://www.mathworks.us/matlabcentral/fileexchange/23629-exportfig
(copied from that page):
This function saves a figure or single axes to one or more vector and/or bitmap file formats, and/or outputs a rasterized version to the workspace, with the following properties:
Figure/axes reproduced as it appears on screen
Cropped borders (optional)
Embedded fonts (pdf only)
Improved line and grid line styles
Anti-aliased graphics (bitmap formats)
Render images at native resolution (optional for bitmap formats)
Transparent background supported (pdf, eps, png)
Semi-transparent patch objects supported (png only)
RGB, CMYK or grayscale output (CMYK only with pdf, eps, tiff)
Variable image compression, including lossless (pdf, eps, jpg)
Optionally append to file (pdf, tiff)
Vector formats: pdf, eps
Bitmap formats: png, tiff, jpg, bmp, export to workspace
This function is especially suited to exporting figures for use in publications and presentations, because of the high quality and portability of media produced.
Update: I see your example code now. Did you try changing -r300 to some really high value? More pixels per inch should make everything look smoother. For publication, crank it up really high, like -r1200.
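To get a feeling for what the -r value does to the output, the pixel dimensions are simply the printed size in inches times the resolution. A quick Python sketch, assuming a hypothetical printed size of 8 by 6 inches (check the figure's PaperPosition for the real value):

```python
def pixel_size(width_in, height_in, dpi):
    """Pixel dimensions of a figure printed at the given resolution."""
    return round(width_in * dpi), round(height_in * dpi)


if __name__ == "__main__":
    for dpi in (300, 600, 1200):
        w, h = pixel_size(8, 6, dpi)
        print(f"-r{dpi}: {w} x {h} pixels, {w * h / 1e6:.1f} megapixels")
```

Going from -r300 to -r1200 gives four times as many pixels along each edge, so sixteen times as many in total, which is also why the files get large quickly.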
Original:
One thing you can try is exporting the plot in some format that supports vector graphics. Matlab supports both PDF and EMF, so try one of those. Export using the saveas command or from the figure's "File -> Save as" menu item. After that, open or import the image file in some other application and hopefully it will look better.
Please add a new screenshot if you get a nicer image!
On this website, one can create stitch charts from images. I'm trying to do this in MATLAB. I have implemented everything using the Image Processing Toolbox (reducing the number of colors, mapping to the color space of the available yarn colors). I'm done with all of this; the only thing I still need to do is to create an output from MATLAB similar to these files, which basically show which yarn to use for each raster point of the stitch chart:
BW
Color
My question is how to print a table with a lot of very small fields with the color and/or symbol inside.
It should look somehow like in these PDF files. How can I print out a table like this? Directly from MATLAB? Can I create a PDF file like this in MATLAB? Should I export it to Excel somehow?
For the "color" PDF you linked, this looks just like a pixelated image. Why don't you save each "field" as one pixel, for example in a TIF file using imwrite(I, 'filename')? You could then print this TIF into a PDF using an appropriate scaling factor to make the pixels large enough.
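The scaling step is just nearest-neighbour enlargement: every field becomes a block of factor by factor identical pixels, so the fields stay sharp instead of being blurred. A minimal Python sketch of that idea on a plain list of lists (the function name upscale and the sample grid are made up; in MATLAB you would do the same on the image matrix before or while printing):

```python
def upscale(grid, factor):
    """Repeat every cell of a 2D grid factor times in both directions."""
    out = []
    for row in grid:
        # Repeat each value horizontally.
        wide = [value for value in row for _ in range(factor)]
        # Repeat the widened row vertically (copies, not shared references).
        out.extend(list(wide) for _ in range(factor))
    return out


if __name__ == "__main__":
    chart = [[1, 2],
             [3, 4]]  # hypothetical yarn colour indices, one per stitch
    for row in upscale(chart, 2):
        print(row)
```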
For the "BW" PDF which basically contains a large table of symbols, it would probably be easiest to go through HTML or RTF file format to get the table of symbols, and then use some html2pdf or rtf2pdf converter...
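For the HTML route, the table itself is easy to generate. Here is a rough Python sketch that turns a grid of symbols into an HTML page with small fixed-size cells. The function name symbol_table, the cell size and the CSS are only illustrative choices; the resulting file would then go through whichever HTML-to-PDF converter you prefer.

```python
from html import escape


def symbol_table(grid, cell_px=12):
    """Build an HTML page with one small table cell per stitch symbol."""
    rows = []
    for row in grid:
        cells = "".join(f"<td>{escape(sym)}</td>" for sym in row)
        rows.append(f"<tr>{cells}</tr>")
    style = (
        f"td{{width:{cell_px}px;height:{cell_px}px;border:1px solid #000;"
        "text-align:center;font:8px monospace;padding:0}"
        "table{border-collapse:collapse}"
    )
    body = "".join(rows)
    return (
        f"<html><head><style>{style}</style></head>"
        f"<body><table>{body}</table></body></html>"
    )


if __name__ == "__main__":
    chart = [["x", "o", "x"],
             ["o", "+", "o"]]  # hypothetical symbols, one per yarn colour
    with open("chart.html", "w", encoding="utf-8") as f:
        f.write(symbol_table(chart))
```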