PDF Parsing in iPhone

I am trying to read a PDF on the iPhone. I learned that the CGPDFDocumentGetCatalog function gives access to information about the PDF, but it returns a CGPDFDictionaryRef. I have browsed through the documentation and didn't find any method to extract its keys and values. Does anybody have a solution for this problem? Alternatively, is there any other way to extract data from PDFs?

It seems that you have to extract the names of the values first. Take a look at this site, especially the allScriptsInPDFDocument method.

I have just checked the documentation, and it has a load of functions for getting key/value pairs. If you don't know what the keys are, you can use CGPDFDictionaryApplyFunction with an appropriate callback.
Alternatively, check out the PDF Specification for a detailed description of the catalog (section 7.7.2).

Related

Can I convert PDF to XML/JSON using iText?

I have a requirement to convert PDF to XML/JSON in order to get the properties of the text. Can I do that using iText?
If so, please let me know which class can do that, if you are aware of one.
I tried looking into the API but could not find any suitable method.

generate word file through open xml

I am using Open XML to generate a Word file that contains user input messages and attachments, if there are any. Now I am stuck: I don't know how to display a PDF or JPEG/JPG if the user attached one to the message.
Is there any way I can show these attachments in my generated Word file?
Thanks

MSDN has a specific example of adding images to a document. The sample code you could use is provided here:
http://msdn.microsoft.com/en-us/library/bb497430%28v=office.14%29.aspx

Insert Objects into Word 2010 with metadata

Please excuse my ignorance on this subject; I am very new to this.
I need to be able to insert images and HTML that contains tabular data into a Word document from a Word add-in. I have managed to do this in its most basic form using the InsertFile method. Word converts the HTML into its native syntax, WordprocessingML, which is fine.
However, I need to be able to store some metadata with each inserted object so that it can be regenerated externally and replaced in the document when requested by the user. I have been looking at Open XML but can't see how, or whether, this is possible with it either.
Please can you point me in the right direction as to how best to achieve this?
Thanks in advance.

How does one add custom metadata tags to an image on the iPhone?

I tried adding custom entries to the EXIF dictionaries I received from an image. This didn't work. I'm assuming this is because EXIF is a standard with an already defined set of tags.
Basically, I am trying to create a metadata tag with no character limit that can be placed in JPEGs.
I read that XMP metadata tags do not have character limits. Is this true? If so, how would I create these on the iPhone?
Thank you.

I'm not sure what you've tried code-wise, but Caffeinated Cocoa has a pretty good blog entry on image metadata. I used it a while ago in an application for a client, and it might help you.
Also, the SO post "Problem setting exif data for an image" looks like it references the Caffeinated Cocoa entry. Although that question is a little over a year old, it still might help.
Try giving these a shot.

iPhone RSS Reader -- parseXML won't Load some XML feeds

I am using the SIMPLE RSS reading example found at http://theappleblog.com/2008/08/04/tutorial-build-a-simple-rss-reader-for-iphone/
It uses parseXML to load the RSS feeds.
Here is the problem I am having: the app fails to load the RSS feed below and reports an error that it cannot connect. However, the feed works fine in my Mac RSS reader, so I know the link is good.
Any ideas on why it cannot load this particular feed but it can load others fine?
http://www.okstate.com/rss.dbml?db_oem_id=200&media=news
Thanks.

I've just released an open source RSS/Atom Parser for iPhone and hopefully it might be of some use.
I'd love to hear your thoughts on it too!

In my experience, HTML markup is what causes an RSS parser to fail in most cases. I've experienced a problem like this with a lot of the parser classes I've come across (in search of the ultimate one, which I didn't find).
My guess is that entities such as &#39;s are responsible for your crash. That was usually the case with my crashes. This also led to my decision to create a 'proxy server' that pre-parses the XML before sending it to the iPhone (which gives me the advantage of caching, scaling, and some other things). I do believe there are solid solutions out there, but it is always difficult to write a parser for so many RSS implementations.
P.S.: W3C validates this feed as 'valid', so it really is 'our' problem.
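One way to check the entity guess off the device is to feed small fragments to a conforming XML parser. The sketch below is only an illustration in Python on the desktop, not the iPhone code from the question; `try_parse` and the sample titles are made up for the example. It shows that a numeric character reference such as &#39; is well-formed XML, whereas an HTML-only named entity such as &nbsp; is rejected unless a DTD declares it.

```python
import xml.etree.ElementTree as ET

def try_parse(xml_text):
    """Return the root element's text, or a short error string if parsing fails."""
    try:
        return ET.fromstring(xml_text).text
    except ET.ParseError as err:
        return "parse error: " + str(err)

# Numeric character reference: valid in any XML document.
numeric = try_parse("<title>It&#39;s here</title>")

# HTML named entity: undefined in plain XML, so a strict parser refuses it.
named = try_parse("<title>It&nbsp;is here</title>")

print(numeric)  # It's here
print(named)
```

If a fragment taken from the real feed behaves like the second case, the parser is probably tripping over HTML entities; if every fragment parses, the cause is more likely elsewhere (for example the connection itself).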

Your problem could lie with one of the following:
Unicode characters (e.g. I see some o's with two dots above them in the feed).
Code that doesn't respect CDATA sections correctly.
To find out which is the case, save the feed file to your local disk and load it from there with your code, to make sure the error still happens.
Then do a binary search on the file to find out whether a particular RSS entry is causing the problem: remove half of the entries and see whether the problem persists. If it does, the bad entry is in the half you kept; if it doesn't, it is in the half you removed. Repeat on the failing half until a single entry is left.
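The binary search described above can also be automated once the feed is saved locally. What follows is a minimal sketch in Python, meant to be run on the desktop rather than on the phone. The names `parses` and `find_bad_item` are hypothetical, the `<rss><channel>` wrapper is a simplification, and the sketch assumes that a single entry breaks parsing on its own.

```python
import xml.etree.ElementTree as ET

def parses(items):
    """True if a minimal RSS document wrapping these <item> strings is well-formed."""
    doc = "<rss><channel>" + "".join(items) + "</channel></rss>"
    try:
        ET.fromstring(doc)
        return True
    except ET.ParseError:
        return False

def find_bad_item(items, check=parses):
    """Bisect to the index of an entry that breaks parsing; None if everything parses."""
    if check(items):
        return None
    lo, hi = 0, len(items)          # invariant: items[lo:hi] fails the check
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if not check(items[lo:mid]):
            hi = mid                # the failure is in the first half
        else:
            lo = mid                # first half is fine, so it is in the second
    return lo

# Made-up sample entries: entry 5 contains an entity that plain XML does not define.
sample = ["<item><title>Entry %d</title></item>" % i for i in range(8)]
sample[5] = "<item><title>Entry&nbsp;5</title></item>"
bad_index = find_bad_item(sample)
print(bad_index)  # 5
```

To test the parser actually used in the app, the same loop could be driven by a different `check` function, for instance one that writes the reduced feed to disk and reports whether the app's code loads it.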

I've been experiencing a similar issue. I haven't yet pinned down the answer, but I've noticed that RSS 2 tends to parse more successfully than the rest.

There are many RSS feeds that contain invalid XML, usually because they were hacked together on the server side using HTML templates by somebody who didn't understand XML. I've seen improperly escaped (or non-escaped) HTML post contents, missing close tags, badly nested tags, and so on.
If you want to be able to parse arbitrary feeds, you have to clean up bad XML. The usual way is to use the "htmlTidy" library, which is included in the OS. This can clean up XML as well as HTML.
The example you're following uses NSXMLParser, and I have no idea why. It's a lower-level API and it doesn't support tidying. I would suggest using NSXMLDocument instead, where it is available (it is part of Foundation on the Mac but is not included in the iPhone SDK). It has an option, NSXMLDocumentTidyXML, that tells it to use tidy when parsing the XML. This API also returns the XML as a handy tree of elements that's easy to work with.
