I am extracting the text of a PDF as a string so that I can annotate it, and in the same process I need to find the positions of the images in that PDF so that their placement is preserved. The problem is that I am not getting the exact positions of the images. Is it possible to use something like OCR for this? If yes, how?
Can anybody help me find the exact position of an image in a PDF file? I want to implement a PDF-reader kind of application for the iPad, just for learning.
Thank you.
Isn't OCR a little bit heavyweight for an iPhone?
Take a look at tools like pdftotext from Xpdf. It is much simpler to read the data directly than to render it and then recognize it back again.
I'm using Swift on iOS and used code based on this SO post to save a UIImage as a BMP:
Convert UIImage to NSData and convert back to UIImage in Swift?
The data I create and save is recognized as a BMP in Photoshop and Preview but can’t be read by the Adafruit PyPortal. The only difference I can find is that when I resave the file from Photoshop as a BMP, “Flip row order” shows as selected in the BMP options screen that appears right after the main save screen.
If I uncheck this option and save the file, the PyPortal can then read this resaved file. This post above was great for getting the UIImage into .bmp format, but I need to get this additional file change done programmatically on iOS, so opening in a third-party product, or working with shell commands won't work as a solution. I’ve not been able to find anything in Apple’s docs that looks like it corresponds to Flip row order and there isn’t much online about this option within Photoshop so it’s unclear even what this does.
For the curious I have samples of the bmp my app creates as well as options resaved or run through an online converter (both these options work on PyPortal).
https://drive.google.com/drive/folders/1DQYes-cJXKm3ue8Z9cACDLEN5bxnnkJc
Any suggestions are appreciated. Thx!
The “Flip row order” option means the rows are stored top-down: the reader takes the first row of pixels in the file first and the last row last. This is not the “normal” way most BMP readers interpret the format. By default a BMP is stored bottom-up, so most implementations start from the last row of the image and work their way up to the first row. A top-down file is marked by a negative height in the header.
Your options are either to rewrite the bitmap-creating software so that it matches the abilities of the BMP reader, or to change the BMP reader so that it can read the file.
I’m fully aware this doesn’t solve the issue, but it may help you understand what is going wrong and guide you in the right direction.
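For the first option, here is a rough sketch of what unchecking “Flip row order” would amount to at the byte level. It is written in Python for brevity; the same byte manipulation should port to Swift's `Data`. It assumes the file your app writes is uncompressed, uses a BITMAPINFOHEADER-style header (width and height as 32-bit integers at offsets 18 and 22), and is stored top-down with a negative height. I have not checked your sample files, so treat those as assumptions. The function name `to_bottom_up` is made up for illustration.

```python
import struct

def to_bottom_up(bmp: bytes) -> bytes:
    """Rewrite a top-down BMP (negative height) as a bottom-up one.

    Assumes an uncompressed BMP with a BITMAPINFOHEADER-style header.
    """
    if bmp[:2] != b"BM":
        raise ValueError("not a BMP file")

    pixel_offset = struct.unpack_from("<I", bmp, 10)[0]   # start of pixel data
    width, height = struct.unpack_from("<ii", bmp, 18)    # signed 32-bit values
    bits_per_pixel = struct.unpack_from("<H", bmp, 28)[0]
    compression = struct.unpack_from("<I", bmp, 30)[0]

    if compression not in (0, 3):  # 0 = BI_RGB, 3 = BI_BITFIELDS
        raise ValueError("compressed BMPs are not handled in this sketch")
    if height >= 0:
        return bmp  # already bottom-up, nothing to do

    rows = -height
    # Each row is padded to a multiple of 4 bytes.
    stride = ((width * bits_per_pixel + 31) // 32) * 4
    end = pixel_offset + rows * stride
    pixels = bmp[pixel_offset:end]

    # Reverse the order of the rows; bytes inside each row stay as they are.
    flipped = b"".join(
        pixels[r * stride:(r + 1) * stride] for r in reversed(range(rows))
    )

    out = bytearray(bmp)
    struct.pack_into("<i", out, 22, rows)  # store the height as positive
    out[pixel_offset:end] = flipped
    return bytes(out)
```

If the header of your file already shows a positive height, the row order is not the difference and something else in the header is tripping up the PyPortal.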
First, sorry for my English. I have a question regarding Tesseract. Is there a way to recognize text in a graphic or a picture without having to clean up the image first? From what I have read, you have to clean the image first: remove graphics and photographs and leave only the text. But I want users to upload newspaper clippings to the server and have the news text recognized without human intervention. It may be tricky, but if you know any other way to do this I would be grateful. Thank you very much.
No, you can't.
Tesseract is made for reading text and only text. When you perform OCR on a subject with both text and an image, Tesseract spits out things it finds in the image (garbled crap).
You can detect image regions and crop them out, though. I think that would be a better question to ask.
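To illustrate only the second half of that idea: once you have bounding boxes for the picture regions, blanking them out before OCR is simple. The sketch below is plain Python on a grayscale image held as a list of rows. The boxes are assumed to come from some separate layout-detection step that is not shown here, and `mask_regions` is a made-up name, not part of Tesseract.

```python
from typing import List, Tuple

# (left, top, right, bottom) in pixels; right and bottom are exclusive.
Box = Tuple[int, int, int, int]

def mask_regions(pixels: List[List[int]], boxes: List[Box],
                 fill: int = 255) -> List[List[int]]:
    """Return a copy of a grayscale image with every box painted `fill`."""
    height = len(pixels)
    width = len(pixels[0]) if height else 0
    out = [row[:] for row in pixels]  # leave the caller's image untouched
    for left, top, right, bottom in boxes:
        # Clamp each box to the image so out-of-range boxes are harmless.
        for y in range(max(top, 0), min(bottom, height)):
            for x in range(max(left, 0), min(right, width)):
                out[y][x] = fill
    return out
```

The masked copy is what you would then hand to the OCR engine, so that only the text areas are left for it to read.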
In my iPad application I want to add a signature to a PDF file.
I currently perform the following steps:
1. Open the PDF in a UIView (zooming is not implemented yet).
2. Add a transparent subview (UIImageView) and draw the signature on it.
3. Save the whole screen as an image using UIGraphicsGetImageFromCurrentImageContext().
4. Convert and save the image as a PDF.
This works fine, but the PDF quality is very poor.
Instead, I now want to add the signature/image to the PDF as metadata, the same way the markup and commenting features of PDF work.
Is there any help or sample code available for this?
It should be possible to improve the quality of the output by skipping the image-to-PDF conversion, but afaik there's no lib that will help you edit the metadata of a PDF on the iPad (at least, none that's freely available).
Depending on what exactly you want to do, you may have to write a parser from scratch to know exactly what you have to append to your document to get the desired effect:
It is very easy to append data to a PDF, but it has to be "registered" in the right locations so that a reader can use this information.
I have a problem showing *.mpp (Microsoft Project) files in my app. I thought of showing them as images, but I don't know how to convert them to an image format. Or is there any other way to view .mpp files?
thanks in advance
I doubt you can with the standard SDK.
You should first convert your .mpp file to a more convenient format, such as PDF or an image format like PNG, using a tool like Zamzar or something similar. Then, depending on your output, on the iPhone you would use a UIWebView to display a PDF or a UIImageView to view an image.
I'm trying to write an eBook for the iPhone, using the PDF format.
The problem is, I can't create a PDF with a page size of 5 cm x 5 cm (for example).
I've tried Adobe Acrobat Pro 9. It didn't work, since it is not possible to customize the paper size.
I've tried Pages 08, but it's also not possible (it's possible to set the custom size, but it doesn't work, might be a bug).
I've tried Microsoft Word. The generated PDF is a mess... Doesn't work right.
So... I can't create a PDF with a custom paper size. This is nuts... There must be a tool or something that works right.
Does anyone know of a tool that works well?
Thanks
On the Mac (since the underlying drawing system Quartz is based on the same ancestor as PDF), you can always generate PDFs by doing Print->Save as PDF...
This generally gives good results.
Only a suggestion, but maybe OpenOffice?
Given that the iPhone screen is 320 x 480px at 163ppi, we need to optimise the print settings before exporting our document to PDF. I’m tipping most people will view their document with the iPhone oriented in landscape mode, so we’ll base our document width on 480px.
So here are the settings you need to use when exporting or printing your document to PDF:
Page size: 125mm x 225mm.
That’s it. Now just print your document to PDF using a PDF printer driver like doPDF and email the document to your iPhone.
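If you would rather derive a page size from the screen itself, the conversion is just pixels divided by pixel density. A small sketch, assuming the 320 x 480px, 163ppi panel of the original iPhone; the helper names are made up:

```python
MM_PER_INCH = 25.4
POINTS_PER_INCH = 72  # PDF page sizes are expressed in points

def px_to_mm(pixels: float, ppi: float) -> float:
    """Physical length in millimetres of `pixels` at the given density."""
    return pixels / ppi * MM_PER_INCH

def px_to_points(pixels: float, ppi: float) -> float:
    """The same length expressed in PDF points."""
    return pixels / ppi * POINTS_PER_INCH

# Landscape orientation: the long edge of the screen becomes the width.
width_mm = px_to_mm(480, 163)   # roughly 74.8 mm
height_mm = px_to_mm(320, 163)  # roughly 49.9 mm
```

This only gives the physical size of the screen. Whether you want the page to match it exactly or be larger and scaled down is a separate choice.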
Have a look at LaTeX. You can typeset the document to any size that you want. http://www.latex-project.org/
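For what it's worth, the PDF format itself has no fixed paper sizes: each page just carries a `/MediaBox` rectangle measured in points (1/72 inch). As a minimal sketch, not a replacement for a real authoring tool, this plain-Python function writes a one-page blank PDF of any size. `blank_pdf` is a made-up name, and the page has no content or fonts.

```python
def blank_pdf(width_cm: float, height_cm: float) -> bytes:
    """Build a minimal one-page, empty PDF with the given page size."""
    # Convert centimetres to PDF points (72 points per inch).
    w = width_cm / 2.54 * 72
    h = height_cm / 2.54 * 72

    objects = [
        b"<< /Type /Catalog /Pages 2 0 R >>",
        b"<< /Type /Pages /Kids [3 0 R] /Count 1 >>",
        ("<< /Type /Page /Parent 2 0 R /Resources << >> "
         f"/MediaBox [0 0 {w:.2f} {h:.2f}] >>").encode("ascii"),
    ]

    out = bytearray(b"%PDF-1.4\n")
    offsets = []
    for number, body in enumerate(objects, start=1):
        offsets.append(len(out))  # byte offset of this object, for the xref
        out += f"{number} 0 obj\n".encode("ascii") + body + b"\nendobj\n"

    # Cross-reference table: every entry must be exactly 20 bytes long.
    xref_position = len(out)
    out += f"xref\n0 {len(objects) + 1}\n".encode("ascii")
    out += b"0000000000 65535 f \n"
    for offset in offsets:
        out += f"{offset:010d} 00000 n \n".encode("ascii")

    out += (f"trailer\n<< /Size {len(objects) + 1} /Root 1 0 R >>\n"
            f"startxref\n{xref_position}\n%%EOF\n").encode("ascii")
    return bytes(out)

pdf_bytes = blank_pdf(5, 5)  # the 5 cm x 5 cm example from the question
```

Writing `pdf_bytes` to a file with `open("page.pdf", "wb")` should give something a viewer reports as 5 x 5 cm. For real content you would still want LaTeX or another tool to do the typesetting.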