How can I parse raw email source and extract the HTML part?

In my iPhone app, I'm handed the raw source of an email, in RFC822 (or "eml") format. I'd like the HTML part of this message (if one exists).
Rather than attempting to parse it out myself, converting escape characters and so on, I thought I'd check whether anyone knows of an Objective-C library that does this for me.
In .NET, I've always used the Mailbee classes for anything email related, but I can't seem to find anything similar for Cocoa.

You may want to look at the Pantomime framework for Mac OS X. It provides a full-fledged email package, including RFC822 parsing. It can be downloaded directly from here.
As far as I know, it has not been ported to iPhone, but it should give you a good starting point.
Claus
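To show what the parsing step involves regardless of library, here is a minimal sketch in Python using its standard email package. It is not an Objective-C solution and not how Pantomime works; it only illustrates the steps any library has to perform: walk the MIME parts, pick the text/html one, and undo the transfer encoding and charset. The names extract_html and SAMPLE are made up for this illustration.

```python
from email import message_from_bytes
from email.policy import default


def extract_html(raw_bytes):
    """Return the decoded text/html body of a raw RFC 822 message, or None."""
    # policy=default makes the parser return an EmailMessage, which has get_body()
    msg = message_from_bytes(raw_bytes, policy=default)
    # get_body() searches multipart containers for the preferred body type
    part = msg.get_body(preferencelist=("html",))
    if part is None:
        return None
    # get_content() undoes quoted-printable/base64 and decodes the charset
    return part.get_content()


# Hypothetical sample message, used only to exercise the function
SAMPLE = (
    b"From: a@example.com\r\n"
    b"To: b@example.com\r\n"
    b"Subject: Test\r\n"
    b"MIME-Version: 1.0\r\n"
    b'Content-Type: multipart/alternative; boundary="XYZ"\r\n'
    b"\r\n"
    b"--XYZ\r\n"
    b"Content-Type: text/plain; charset=utf-8\r\n"
    b"\r\n"
    b"Hello\r\n"
    b"--XYZ\r\n"
    b"Content-Type: text/html; charset=utf-8\r\n"
    b"Content-Transfer-Encoding: quoted-printable\r\n"
    b"\r\n"
    b"<p>Caf=C3=A9</p>\r\n"
    b"--XYZ--\r\n"
)

if __name__ == "__main__":
    print(extract_html(SAMPLE))
```

A message with no HTML part yields None, so the caller can fall back to the plain-text body.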

Related

Support displaying emojis in UILabel

I'm having problems displaying emojis in a UILabel.
In some cases, it even causes a crash when laying out the characters in the label.
These characters are returned from the server as Unicode and are parsed with the AFNetworking framework.
This is an example of how they are returned from the server (console logs):
\U05d4\U05d9\U05d9
I have tried different approaches, like lowercasing this to "\u05d4" or playing with the encoding of the returned string.
Nothing seems to work.
I did manage to show a couple of emojis properly (which makes me think it may be a server-related issue?). Does the server need to support sets of Unicode characters so that it can return them in the appropriate encoding? I'd be happy if someone could clarify this point for me. (By the way, I believe the server is written in Ruby on Rails.)
Should I parse the data with a different parser (SBJSON)? Switching the networking framework at this point would be impossible, though, given the time and resources available.
What other options do I have?
Thanks

I think you should be able to paste an emoji character directly into the code as text.
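As a side note on the logged value: \U05d4 and \U05d9 look like escaped renderings of the code points U+05D4 and U+05D9, which are Hebrew letters rather than emojis. The sketch below, in Python rather than Objective-C, shows how JSON escapes of this kind decode to real characters, including an emoji that JSON has to carry as a surrogate pair. The payload is invented for illustration and is not taken from the question's server.

```python
import json
import unicodedata

# Hypothetical JSON payload: three Hebrew letters and one emoji,
# the latter written as a UTF-16 surrogate pair as JSON requires
payload = '{"hebrew": "\\u05d4\\u05d9\\u05d9", "emoji": "\\ud83d\\ude00"}'

decoded = json.loads(payload)


def describe(text):
    """Return the Unicode name of every character in text."""
    return [unicodedata.name(ch) for ch in text]


if __name__ == "__main__":
    for key, value in decoded.items():
        print(key, describe(value))
```

If the decoded string already contains the right characters, the escaped form in the console is likely only how the log prints them, and the display problem would lie elsewhere.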

Access adobe digital editions from the command line

I'm looking to create a script for my 80-year-old grandmother that downloads the books she needs and converts them to Kindle format, using the command-line version of Calibre, so she can read them on her Kindle. She gets a lot of her books from a service in the form of Adobe .epub books. AFAIK, none of these books have DRM on them prior to being converted, so let me be clear: I'm not asking how to strip DRM from an ebook.
What I am asking is whether there is a way to programmatically (from the command line is fine if Adobe Digital Editions supports CL args) use the ticket file to request a book from the library, and download it, in .epub form, to the local hard drive. I simply don't want my grandmother to have to go through all of the unneeded screens in Adobe Digital Editions' interface - she gets confused easily, and the interface tends to be overwhelming for her. I simply want to write a function (it can be a system() call to a command... that's fine) that will allow her to take a file received from the library or digital service and automatically retrieve the proper .epub file.
I have all of the other steps ready to go... I just can't find any way to retrieve the book from the service without using the DE interface.
Any suggestions?

Check this S.O. posting, I know it will help ;-)
pdf-adobe-digital-edition

c/c++/objective-c library for encoding mpegs

I am looking to encode a JPEG sequence into an MPEG format in an iPhone project I am working on. My Google searches are coming up pretty short. Does anyone happen to know of a library that would let me do something like this?

Have a look at ffmpeg.

Parsing source of a webpage with Objective-C

Is there a way to parse a website's source on the iPhone to get the URLs of photos on that page? If so, how would you do that?
Thanks

I'd say go for regular expressions. There is a one-page library that wraps C regexes that you can drop into your project.

I recommend regular expressions. There's a great open source Regex library for Cocoa called RegexKit. For the most part, you can just drop it in your code and it'll "just work".
Getting all the urls of images wouldn't be too difficult (less than 20 lines of code) if you assume that all images are going to be in <img> tags. You'd just grab all the image tags (something like: <img\s+[^>]+>), then iterate through those matches. For each match, you'd pull out whatever's in the src attribute: src\s*=\s*("|')?\s*([^\s"']+)(\s|"|')
You might need to tweak that a bit, but it shouldn't be too bad.
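The two-step approach described above can be sketched as follows. This is Python rather than RegexKit, purely to show the patterns in action; the two expressions are the ones quoted in the answer, with case-insensitive matching added as an assumption, and the function name image_urls is made up.

```python
import re

# Step 1: grab every <img ...> tag (pattern from the answer above)
IMG_TAG = re.compile(r"<img\s+[^>]+>", re.IGNORECASE)

# Step 2: pull the value of the src attribute out of one tag
SRC_ATTR = re.compile(r"""src\s*=\s*("|')?\s*([^\s"']+)(\s|"|')""", re.IGNORECASE)


def image_urls(html):
    """Return the src values of all <img> tags found in html."""
    urls = []
    for tag in IMG_TAG.findall(html):
        match = SRC_ATTR.search(tag)
        if match:
            # group 2 is the URL without the surrounding quotes
            urls.append(match.group(2))
    return urls


if __name__ == "__main__":
    page = '<p><img src="a.png" alt="x"><IMG class="c" src=\'http://example.com/b.jpg\' /></p>'
    print(image_urls(page))
```

One tweak that may be needed: the src pattern expects a quote or whitespace after the URL, so an unquoted value directly followed by ">" is not picked up.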

There is no super easy way. When I had to do it I wrote a libxml2 SAX parser. libxml2 has an html reader that works fairly well with malformed html, and libxml2 is included with the base system.

You could try it using regular expressions, but I wouldn't recommend that. You should have a look at NSXMLParser, assuming the webpage is coded to be XHTML compliant. TouchXML is another good library.

Take a look at Event-Driven XML Parsing in the iPhone reference library.

Are you OK with your approach not picking up images that are loaded dynamically via JavaScript?
The closest thing I could see working is to parse out any JavaScript imports, load those up too, and then use a regular expression across the whole file looking for anything that ends in ".jpg/.gif/.png" and grab the full URL out from that. The libxml approach would miss out on references to images not in img tags, but it might well be good enough.

iPhone RSS Reader -- parseXML won't Load some XML feeds

I am using the SIMPLE RSS reading example found at http://theappleblog.com/2008/08/04/tutorial-build-a-simple-rss-reader-for-iphone/
It uses parseXML to load the RSS feeds.
Here is the problem I am having: I can't get the following RSS feed to load. It comes up with an error saying that it cannot connect. However, it works fine in my Mac RSS reader, so I know the link is good.
Any ideas on why it cannot load this particular feed but it can load others fine?
http://www.okstate.com/rss.dbml?db_oem_id=200&media=news
Thanks.

I've just released an open source RSS/Atom Parser for iPhone and hopefully it might be of some use.
I'd love to hear your thoughts on it too!

In my experience, HTML markup causes an RSS parser to fail in most cases. I've experienced a problem like this with a lot of the parser classes I've come across (in search of the ultimate one, which I didn't find).
My guess is that entities such as
&#39;s
are responsible for your crash. That was usually the case with my crashes. This also led to my decision to create a 'proxy server' to pre-parse the XML before sending it to the iPhone (which gives me the advantage of caching, scaling, and some other stuff). I do believe there are solid solutions out there, but it is always difficult to write a parser for so many RSS implementations.
P.S: W3C validates this feed as 'valid', so it really is 'our' problem..

Your problem could lie with:
Unicode characters (e.g. I see some o's with two dots above them in the feed)
The code you have doesn't respect CDATA sections correctly
To find out which is the case, save the feed file to your local disk and load it via your code to make sure the error still happens.
Then do a binary search on the file to find out whether a particular RSS entry is causing the problem (i.e. remove all but the first RSS entry and see if the problem exists. If it does, then the problem is there; if it doesn't, put half the RSS entries back in the file and repeat).
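The binary search described above can be sketched as follows, in Python with its standard XML parser standing in for whatever parser the app uses. It assumes the feed has already been split into one string per entry and that a bad entry fails on its own, independently of its neighbours; the names parses and find_bad_item, and the minimal rss/channel wrapper, are made up for this illustration.

```python
import xml.etree.ElementTree as ET


def parses(items):
    """Return True if the given entries form a well-formed document."""
    doc = "<rss><channel>" + "".join(items) + "</channel></rss>"
    try:
        ET.fromstring(doc)
        return True
    except ET.ParseError:
        return False


def find_bad_item(items):
    """Return the index of one entry that breaks parsing, or None."""
    if parses(items):
        return None
    lo, hi = 0, len(items)
    # Invariant: items[lo:hi] contains at least one entry that fails
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if not parses(items[lo:mid]):
            hi = mid  # the failure is in the first half
        else:
            lo = mid  # first half is clean, so look in the second half
    return lo


if __name__ == "__main__":
    entries = [
        "<item><title>ok</title></item>",
        "<item><title>R&D</title></item>",  # unescaped ampersand
        "<item><title>fine</title></item>",
    ]
    print(find_bad_item(entries))
```

With several bad entries this finds one of them; fixing it and running the search again finds the next.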

I've been experiencing a similar issue. I haven't yet pinned down the answer, but I've noticed that RSS 2 tends to parse more successfully than the rest.

There are many RSS feeds that contain invalid XML, usually because they were hacked together on the server side using HTML templates by somebody who didn't understand XML. I've seen improperly escaped (or non-escaped) HTML post contents, missing close tags, badly nested tags, and so on.
If you want to be able to parse arbitrary feeds, you have to clean up bad XML. The usual way is to use the "htmlTidy" library, which is included in the OS. This can clean up XML as well as HTML.
This example you're following uses NSXMLParser -- I have no idea why. It's a lower-level API and it doesn't support tidying. I would suggest using NSXMLDocument instead. There's a flag in that API that will tell it to use tidy when parsing the XML. This API also returns you the XML as a handy tree of elements that's easy to work with.
