Objective C HTML escape/unescape - iphone

Wondering if there is an easy way to do a simple HTML escape/unescape in Objective C. What I want is something like this pseudocode:
NSString *string = @"&lt;span&gt;Foo&lt;/span&gt;";
[string stringByUnescapingHTML];
Which returns
<span>Foo</span>
Hopefully it would unescape all other HTML entities as well, and even numeric codes like &#1234; and the like.
Are there any methods in Cocoa Touch/UIKit to do this?

Check out my NSString category for XMLEntities. There are methods to decode XML entities (including all HTML character references), encode XML entities, strip tags, and remove newlines and whitespace from a string:
- (NSString *)stringByStrippingTags;
- (NSString *)stringByDecodingXMLEntities; // Including all HTML character references
- (NSString *)stringByEncodingXMLEntities;
- (NSString *)stringWithNewLinesAsBRs;
- (NSString *)stringByRemovingNewLinesAndWhitespace;

Another HTML NSString category from Google Toolbox for Mac
Despite the name, this works on iOS too.
http://google-toolbox-for-mac.googlecode.com/svn/trunk/Foundation/GTMNSString+HTML.h
/// Get a string where internal characters that are escaped for HTML are unescaped
//
/// For example, '&amp;' becomes '&'
/// Handles &#32; and &#x32; cases as well
///
// Returns:
// Autoreleased NSString
//
- (NSString *)gtm_stringByUnescapingFromHTML;
And I had to include only three files in the project: header, implementation and GTMDefines.h.

This link contains the solution below. Cocoa CF has the CFXMLCreateStringByUnescapingEntities function but that's not available on the iPhone.
@interface MREntitiesConverter : NSObject <NSXMLParserDelegate> {
    NSMutableString* resultString;
}
@property (nonatomic, retain) NSMutableString* resultString;
- (NSString*)convertEntitiesInString:(NSString*)s;
@end
@implementation MREntitiesConverter
@synthesize resultString;
- (id)init
{
    // Assign the result of [super init] so a nil or substituted self is handled.
    if ((self = [super init])) {
        resultString = [[NSMutableString alloc] init];
    }
    return self;
}
// Called by the parser with decoded text; entities arrive already converted.
- (void)parser:(NSXMLParser *)parser foundCharacters:(NSString *)s {
    [self.resultString appendString:s];
}
- (NSString*)convertEntitiesInString:(NSString*)s {
    if (!s) {
        NSLog(@"ERROR : Parameter string is nil");
        return nil;
    }
    // Wrap the input in a dummy element so the parser sees a single root.
    NSString* xmlStr = [NSString stringWithFormat:@"<d>%@</d>", s];
    NSData *data = [xmlStr dataUsingEncoding:NSUTF8StringEncoding allowLossyConversion:YES];
    NSXMLParser* xmlParse = [[[NSXMLParser alloc] initWithData:data] autorelease];
    [xmlParse setDelegate:self];
    [xmlParse parse];
    return [NSString stringWithFormat:@"%@",resultString];
}
- (void)dealloc {
    [resultString release];
    [super dealloc];
}
@end

This is an incredibly hacked together solution I did, but if you want to simply unescape a string without worrying about parsing, do this:
-(NSString *)htmlEntityDecode:(NSString *)string
{
    string = [string stringByReplacingOccurrencesOfString:@"&quot;" withString:@"\""];
    string = [string stringByReplacingOccurrencesOfString:@"&apos;" withString:@"'"];
    string = [string stringByReplacingOccurrencesOfString:@"&lt;" withString:@"<"];
    string = [string stringByReplacingOccurrencesOfString:@"&gt;" withString:@">"];
    string = [string stringByReplacingOccurrencesOfString:@"&amp;" withString:@"&"]; // Do this last so that, e.g. @"&amp;lt;" goes to @"&lt;" not @"<"
return string;
}
I know it's by no means elegant, but it gets the job done. You can then decode an element by calling:
string = [self htmlEntityDecode:string];
Like I said, it's hacky but it works. If you want to encode a string, just reverse the stringByReplacingOccurrencesOfString parameters, and replace & first so the ampersands of the newly created entities are not escaped again.
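For what it's worth, the encoding direction might look like the sketch below. The function name HTMLEntityEncode is made up for illustration and is not part of the answer above. It covers the same five entities, with the ampersand handled first.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (the name is an assumption): the reverse of htmlEntityDecode.
static NSString *HTMLEntityEncode(NSString *string)
{
    // Escape the ampersand first. Doing it later would re-escape the
    // ampersands introduced by the other replacements.
    string = [string stringByReplacingOccurrencesOfString:@"&" withString:@"&amp;"];
    string = [string stringByReplacingOccurrencesOfString:@"\"" withString:@"&quot;"];
    string = [string stringByReplacingOccurrencesOfString:@"'" withString:@"&apos;"];
    string = [string stringByReplacingOccurrencesOfString:@"<" withString:@"&lt;"];
    string = [string stringByReplacingOccurrencesOfString:@">" withString:@"&gt;"];
    return string;
}
```

Calling HTMLEntityEncode(@"a < b & c") should give a string that htmlEntityDecode turns back into the original.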

In iOS 7 you can use NSAttributedString's ability to import HTML to convert HTML entities to an NSString.
Eg:
@interface NSAttributedString (HTML)
+ (instancetype)attributedStringWithHTMLString:(NSString *)htmlString;
@end
@implementation NSAttributedString (HTML)
+ (instancetype)attributedStringWithHTMLString:(NSString *)htmlString
{
    NSDictionary *options = @{ NSDocumentTypeDocumentAttribute : NSHTMLTextDocumentType,
                               NSCharacterEncodingDocumentAttribute : @(NSUTF8StringEncoding) };
    NSData *data = [htmlString dataUsingEncoding:NSUTF8StringEncoding];
    return [[NSAttributedString alloc] initWithData:data options:options documentAttributes:nil error:nil];
}
@end
Then in your code when you want to clean up the entities:
NSString *cleanString = [[NSAttributedString attributedStringWithHTMLString:question.title] string];
This is probably the simplest way, but I don't know how performant it is. You should probably be pretty damn sure the content you're "cleaning" doesn't contain any <img> tags or stuff like that because this method will download those images during the HTML to NSAttributedString conversion. :)

Here's a solution that neutralizes all characters (by making them all HTML encoded entities for their unicode value)... Used this for my need (making sure a string that came from the user but was placed inside of a webview couldn't have any XSS attacks):
Interface:
@interface NSString (escape)
- (NSString*)stringByEncodingHTMLEntities;
@end
Implementation:
@implementation NSString (escape)
- (NSString*)stringByEncodingHTMLEntities {
    // Rather than mapping each individual entity and checking if it needs to be replaced, we simply replace every character with the hex entity
    NSMutableString *resultString = [NSMutableString string];
    for(int pos = 0; pos<[self length]; pos++)
        [resultString appendFormat:@"&#x%x;",[self characterAtIndex:pos]];
    return [NSString stringWithString:resultString];
}
@end
Usage Example:
UIWebView *webView = [[UIWebView alloc] init];
NSString *userInput = @"<script>alert('This is an XSS ATTACK!');</script>";
NSString *safeInput = [userInput stringByEncodingHTMLEntities];
[webView loadHTMLString:safeInput baseURL:nil];
Your mileage will vary.

The least invasive and most lightweight way to encode and decode HTML or XML strings is to use the GTMNSStringHTMLAdditions CocoaPod.
It is simply the Google Toolbox for Mac NSString category GTMNSString+HTML, stripped of the dependency on GTMDefines.h. So all you need to add is one .h and one .m, and you're good to go.
Example:
#import "GTMNSString+HTML.h"
// Encoding a string with XML / HTML elements
NSString *stringToEncode = @"<TheBeat>Goes On</TheBeat>";
NSString *encodedString = [stringToEncode gtm_stringByEscapingForHTML];
// encodedString looks like this now:
// &lt;TheBeat&gt;Goes On&lt;/TheBeat&gt;
// Decoding a string with XML / HTML encoded elements
NSString *stringToDecode = @"&lt;TheBeat&gt;Goes On&lt;/TheBeat&gt;";
NSString *decodedString = [stringToDecode gtm_stringByUnescapingFromHTML];
// decodedString looks like this now:
// <TheBeat>Goes On</TheBeat>

This is an easy to use NSString category implementation:
http://code.google.com/p/qrcode-scanner-live/source/browse/trunk/iphone/Classes/NSString%2BHTML.h
http://code.google.com/p/qrcode-scanner-live/source/browse/trunk/iphone/Classes/NSString%2BHTML.m
It is far from complete but you can add some missing entities from here: http://code.google.com/p/statz/source/browse/trunk/NSString%2BHTML.m
Usage:
#import "NSString+HTML.h"
NSString *raw = [NSString stringWithFormat:@"<div></div>"];
NSString *escaped = [raw htmlEscapedString];

The MREntitiesConverter above is an HTML stripper, not encoder.
If you need an encoder, go here: Encode NSString for XML/HTML

MREntitiesConverter doesn't work for escaping malformed xml. It will fail on a simple URL:
http://www.google.com/search?client=safari&rls=en&q=fail&ie=UTF-8&oe=UTF-8

If you need to generate a literal you might consider using a tool like this:
http://www.freeformatter.com/java-dotnet-escape.html#ad-output
to accomplish the work for you.
See also this answer.

The easiest solution is to create a category as below:
Here’s the category’s header file:
#import <Foundation/Foundation.h>
@interface NSString (URLEncoding)
-(NSString *)urlEncodeUsingEncoding:(NSStringEncoding)encoding;
@end
And here’s the implementation:
#import "NSString+URLEncoding.h"
@implementation NSString (URLEncoding)
-(NSString *)urlEncodeUsingEncoding:(NSStringEncoding)encoding {
    return (NSString *)CFURLCreateStringByAddingPercentEscapes(NULL,
        (CFStringRef)self,
        NULL,
        (CFStringRef)@"!*'\"();:@&=+$,/?%#[]% ",
        CFStringConvertNSStringEncodingToEncoding(encoding));
}
@end
And now we can simply do this:
NSString *raw = @"hell & brimstone + earthly/delight";
NSString *url = [NSString stringWithFormat:@"http://example.com/example?param=%@",
    [raw urlEncodeUsingEncoding:NSUTF8StringEncoding]];
NSLog(@"%@", url);
Credit for this answer goes to the website below:
http://madebymany.com/blog/url-encoding-an-nsstring-on-ios

Why not just use this?
NSData *data = [s dataUsingEncoding:NSUTF8StringEncoding allowLossyConversion:YES];
NSString *result = [[[NSString alloc] initWithData:data encoding:NSUTF8StringEncoding] autorelease];
return result;
Noob question but in my case it works...

This is an old answer that I posted some years ago. My intention was
not to provide a "good" and "respectable" solution, but a "hacky" one
that might be useful under some circumstances. Please don't use this solution unless nothing else works.
Actually, it works perfectly fine in many situations where other
answers don't, because the UIWebView is doing all the work. And you can
even inject some JavaScript (which can be dangerous and/or useful). The performance should be horrible, but it actually is not that bad.
There is another solution that has to be mentioned. Just create a UIWebView, load the encoded string and get the text back. It strips tags ("<>"), and also decodes all HTML entities (e.g. "&gt;"), and it might work where others don't (e.g. with Cyrillic text). I don't think it's the best solution, but it can be useful if the above solutions don't work.
Here is a small example using ARC:
@interface YourClass() <UIWebViewDelegate>
@property UIWebView *webView;
@end
@implementation YourClass
- (void)someMethodWhereYouGetTheHtmlString:(NSString *)htmlString {
    self.webView = [[UIWebView alloc] init];
    // Set the delegate before loading so the callbacks are not missed.
    self.webView.delegate = self;
    NSString *page = [NSString stringWithFormat:@"<html><body>%@</body></html>", htmlString];
    [self.webView loadHTMLString:page baseURL:nil];
}
- (void)webView:(UIWebView *)webView didFailLoadWithError:(NSError *)error {
    self.webView = nil;
}
- (void)webViewDidFinishLoad:(UIWebView *)webView {
    // Read the text before releasing the web view.
    NSString *escapedString = [self.webView stringByEvaluatingJavaScriptFromString:@"document.body.textContent;"];
    self.webView = nil;
    // Use escapedString here.
}
- (void)webViewDidStartLoad:(UIWebView *)webView {
    // Do Nothing
}
@end

Related

Xml parser attributes

I am working with an XML parser. I used NSLog and got the following XML string:
<table><tr><td><img src="http://www.24h.com.vn/upload/3-2012/images/2012-09-16/1347762760_bong-da-genoa-juve.jpg"width='80' height='80' /></td><td>(20h, 16/9) Juventus sẽ có trận đấu khó khăn tới sân của Genoa.</td></tr></table>
How do I get the link in the img src attribute?
I tried:
else if([elementName isEqualToString:@"img"])
{
    currentString=[attributeDict objectForKey:@"src"];
    self.storingCharacter=YES;
}
But it was unsuccessful. Any help?
You need to use an HTML parser to get the elements you want. I suggest you use hpple.
I took the snippet from Albaregar's solution in "parsing HTML on the iPhone" and modified it to your needs. I didn't test the adapted snippet, but it should work.
#import "TFHpple.h"
NSData *data = [[NSData alloc] initWithContentsOfFile:@"yourfile.html"];
// Create parser
TFHpple *xpathParser = [[TFHpple alloc] initWithHTMLData:data];
// Get all img tags (XPath positions start at 1, so //img[0] would match nothing)
NSArray *elements = [xpathParser searchWithXPathQuery:@"//img"];
// Access the first img element
TFHppleElement *element = [elements objectAtIndex:0];
// Get the value of the src attribute
NSString *src_attr = [element objectForKey:@"src"];
[xpathParser release];
[data release];
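If you would rather stay with NSXMLParser, a sketch along these lines could collect the src values. It assumes ARC and well-formed input. ImageSourceCollector and ImageSourcesInXML are names invented for this example. Note that the string in the question has no space between the src and width attributes, which a strict XML parser may reject, so check the return value of parse.

```objc
#import <Foundation/Foundation.h>

// Hypothetical delegate that records the src attribute of every <img> element.
@interface ImageSourceCollector : NSObject <NSXMLParserDelegate>
@property (nonatomic, strong) NSMutableArray *sources;
@end

@implementation ImageSourceCollector
- (void)parser:(NSXMLParser *)parser
didStartElement:(NSString *)elementName
  namespaceURI:(NSString *)namespaceURI
 qualifiedName:(NSString *)qName
    attributes:(NSDictionary *)attributeDict
{
    // Attributes are delivered here, not through foundCharacters:.
    if ([elementName isEqualToString:@"img"]) {
        NSString *src = [attributeDict objectForKey:@"src"];
        if (src) {
            [self.sources addObject:src];
        }
    }
}
@end

// Returns the collected src values, or nil if the parser reported an error.
static NSArray *ImageSourcesInXML(NSString *xml)
{
    ImageSourceCollector *collector = [[ImageSourceCollector alloc] init];
    collector.sources = [NSMutableArray array];
    NSXMLParser *parser = [[NSXMLParser alloc] initWithData:[xml dataUsingEncoding:NSUTF8StringEncoding]];
    parser.delegate = collector;
    return [parser parse] ? collector.sources : nil;
}
```

The attribute value is available directly in didStartElement, so there is no need to set a flag and wait for character data.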

unrecognized selector

I have a problem with the next code:
NSDictionary * imagen = [[NSDictionary alloc] initWithDictionary:[envio resultValue]];
NSString *imagenS = [imagen valueForKey:@"/Result"];
ClaseMaestra *b1 = [[ClaseMaestra alloc]init];
NSData *imagenDecode = [[NSData alloc] initWithData:[b1 base64DataFromString:imagenS]];
NSLog(@"Decode Image:");
NSLog(@"%@", imagenDecode);
//SAVE IMAGE
NSArray *sysPaths = NSSearchPathForDirectoriesInDomains(NSDocumentDirectory,NSUserDomainMask, YES);
NSString *docDirectory = [sysPaths objectAtIndex:0];
NSString *filePath = [NSString stringWithFormat:@"%@david.png",docDirectory];
[imagenDecode writeToFile:filePath atomically:YES];
[envio resultValue] returns an NSDictionary with one image encoded in Base64.
I want to decode and save this image, but my console shows this message:
2011-08-23 19:19:39.750 WSStub[38501:a0f] *************************
2011-08-23 19:19:39.752 WSStub[38501:a0f] SendImage
2011-08-23 19:19:39.752 WSStub[38501:a0f] *************************
2011-08-23 19:19:39.759 WSStub[38501:a0f] -[ClaseMaestra base64DataFromString:]: unrecognized selector sent to instance 0xd00ad0
Program received signal: “EXC_BAD_ACCESS”.
ClaseMaestra interface is:
#import <Foundation/Foundation.h>
@class NSString;
@interface ClaseMaestra : NSObject
+ (NSMutableData *)base64DataFromString: (NSString *)string;
@end
I can´t understand the "unrecognized selector" error...
This is a class method and you call it on an instance of the class. You should either change it to an instance method, so instead of:
+ (NSMutableData *)base64DataFromString: (NSString *)string;
Use:
- (NSMutableData *)base64DataFromString: (NSString *)string;
Or, change the call, instead of:
NSData *imagenDecode = [[NSData alloc] initWithData:[b1 base64DataFromString:imagenS]];
Use:
NSData *imagenDecode = [[NSData alloc] initWithData:[ClaseMaestra base64DataFromString:imagenS]];
What to choose depends on your needs.
base64DataFromString: is a class method (starts with a +). So instead of
ClaseMaestra *b1 = [[ClaseMaestra alloc]init];
NSData *imagenDecode = [[NSData alloc] initWithData:[b1 base64DataFromString:imagenS]];
You should do
NSData *data = [ClaseMaestra base64DataFromString:imagenS];
You are sending a class message to an instance. The receiver should be a class.
So do:
NSData *imagenDecode = [[NSData alloc] initWithData:[ClaseMaestra base64DataFromString:imagenS]];
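To see the difference at runtime, here is a small sketch. Base64Helper is a hypothetical stand-in for ClaseMaestra and IsClassOnlySelector is a made-up helper. It uses only respondsToSelector: and instancesRespondToSelector: from NSObject.

```objc
#import <Foundation/Foundation.h>

// Hypothetical stand-in for ClaseMaestra: one class method, no instance methods.
@interface Base64Helper : NSObject
+ (NSString *)tagged:(NSString *)input;
@end

@implementation Base64Helper
+ (NSString *)tagged:(NSString *)input {
    return [@"decoded:" stringByAppendingString:input];
}
@end

// YES when the selector exists as a class method but not as an instance method,
// which is the situation that produces "unrecognized selector sent to instance".
static BOOL IsClassOnlySelector(Class cls, SEL selector) {
    return [cls respondsToSelector:selector] && ![cls instancesRespondToSelector:selector];
}
```

Here IsClassOnlySelector([Base64Helper class], @selector(tagged:)) is true, so the message has to go to the class: [Base64Helper tagged:@"x"].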
You'll also get this error if you use the name of a private framework class, e.g. MPMovieView. Everyone knows you're not supposed to use those, but what I didn't know is that I was using one!
What's odd is, if you use Xibs, they load the system one and give you the same type of error (Class methods).
But if you load it in code, it shadows the system framework one. I spent a decent hour scratching my head, ensuring everything was hooked up right... it was, just needed to change how I named my custom stuff. Posting this for anyone with similar

internationalization in iphone application

My application runs very well, but now I need to switch the whole application to Spanish when a UISwitch is on and back to English when it is off. How is this possible? Please reply.
i18n
Create the following structure:
resources/i18n/en.lproj/Localizable.strings
resources/i18n/es.lproj/Localizable.strings
Create an additional directory with the corresponding two letter code for each additional language supported.
It's recommended to encode Localizable.strings in UTF-16. You can convert between encodings in the inspector pane of Xcode.
If the files are recognized as i18n resources, they will be presented like this:
A sample file has the following content:
"hello"="hola";
Then use the following in your program:
NSString *string = NSLocalizedString(@"hello", nil);
Choose language dynamically
To change the language for your application dynamically use this code:
@implementation Language
static NSBundle *bundle = nil;
+(void)initialize {
    NSUserDefaults* defs = [NSUserDefaults standardUserDefaults];
    NSArray* languages = [defs objectForKey:@"AppleLanguages"];
    NSString *current = [[languages objectAtIndex:0] retain];
    [self setLanguage:current];
}
/*
example calls:
[Language setLanguage:@"es"];
[Language setLanguage:@"en"];
*/
+(void)setLanguage:(NSString *)code {
    NSLog(@"preferredLang: %@", code);
    NSString *path = [[ NSBundle mainBundle ] pathForResource:code ofType:@"lproj" ];
    // Use bundle = [NSBundle mainBundle] if you
    // dont have all localization files in your project.
    bundle = [[NSBundle bundleWithPath:path] retain];
}
+(NSString *)get:(NSString *)key alter:(NSString *)alternate {
    return [bundle localizedStringForKey:key value:alternate table:nil];
}
@end
Then translate your strings like this:
NSString *hello = [Language get:@"hello" alter:nil];
The code above was originally posted by Mauro Delrio as an answer to How to force NSLocalizedString to use a specific language.
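To tie this to the UISwitch from the question, something like the following sketch might work. It assumes ARC, a header named Language.h for the class above, and a hypothetical SettingsViewController with a greetingLabel outlet. None of those names come from the original answer.

```objc
#import <UIKit/UIKit.h>
#import "Language.h" // assumed header name for the Language class above

// Hypothetical settings screen; the class and outlet names are illustrative.
@interface SettingsViewController : UIViewController
@property (nonatomic, strong) IBOutlet UILabel *greetingLabel;
- (IBAction)languageSwitchChanged:(UISwitch *)sender;
@end

@implementation SettingsViewController

// Connect this action to the switch's Value Changed event.
- (IBAction)languageSwitchChanged:(UISwitch *)sender {
    [Language setLanguage:(sender.isOn ? @"es" : @"en")];
    [self reloadLocalizedText];
}

// Strings already on screen do not change by themselves, so set them again.
- (void)reloadLocalizedText {
    self.greetingLabel.text = [Language get:@"hello" alter:@"hello"];
}

@end
```

Every visible screen needs its own version of reloadLocalizedText, since changing the bundle only affects strings that are looked up afterwards.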

Problems with NSKeyedArchiver- Simply does not archive at all!

I am probably not seeing something here, that is why I am asking for help :)
Here is the deal: I have an NSMutableArray of items that conform to the NSCoding protocol, but NSKeyedArchiver always fails to archive it. Here is my object implementation:
@implementation YTVideo
@synthesize URL,thumb,titulo;
#pragma mark NSCoding
#define kTituloKey @"titulo"
#define kURLKey @"URL"
#define kThumbKey @"thumb"
-(id)initWithData:(NSString *)ktitle :(UIImage *)kThumb :(NSURL *)kURL{
self.titulo = ktitle;
self.thumb = kThumb;
self.URL = kURL;
return self;
}
- (void) encodeWithCoder:(NSCoder *)encoder {
[encoder encodeObject:titulo forKey:kTituloKey];
[encoder encodeObject:URL forKey:kURLKey];
NSData *thumbData = UIImagePNGRepresentation(thumb);
[encoder encodeObject:thumbData forKey:kThumbKey];
}
- (id)initWithCoder:(NSCoder *)decoder {
NSString* ktitulo = [decoder decodeObjectForKey:kTituloKey];
NSURL* kURL = [decoder decodeObjectForKey:kURLKey];
NSData* kThumbdata = [decoder decodeObjectForKey:kThumbKey];
UIImage* kThumb=[UIImage imageWithData:kThumbdata];
return [self initWithData:ktitulo:kThumb:kURL];
}
#end
During program execution I have an NSMutableArray of those objects called videosArray.
Then, eventually, I try:
NSString* path =[NSHomeDirectory() stringByAppendingPathComponent:@"teste.wrapit"];
NSLog(@"PATH =%@",path);
bool teste = [NSKeyedArchiver archiveRootObject:videosArray toFile:path];
NSLog(@"aramzenamento:%@",teste ? @"sucesso!" :@"Nope");
BOOL fileExists = [[NSFileManager defaultManager] fileExistsAtPath:path];
NSLog(@"Arquivo armazenado existe?%@",fileExists ?@"Sim":@"Nao");
And I always get a fail on my boolean checks...
Any Ideas where I am completely wrong??
Thanks!!
The problem you're experiencing has nothing to do with NSKeyedArchiver. By the looks of it, you're trying to archive your object at the root-level of your sandbox (the directory returned by NSHomeDirectory()). Try replacing the first line of the second block of code with
NSString *path = [NSHomeDirectory() stringByAppendingPathComponent:@"Documents"];
path = [path stringByAppendingPathComponent:@"teste.wrapit"];
Another (perhaps cleaner) way to get the path of your Documents folder is to use the C function NSSearchPathForDirectoriesInDomains, which returns an array of paths that match the first argument:
NSArray * paths = NSSearchPathForDirectoriesInDomains(NSDocumentDirectory, NSUserDomainMask, YES);
NSString *path = [[paths objectAtIndex:0] stringByAppendingPathComponent:@"teste.wrapit"];
It's worth pointing out that, on iOS, the function NSSearchPathForDirectoriesInDomains is guaranteed to return an NSArray with a single element when you use a built-in constant (like NSDocumentDirectory) for the first argument, so you can safely use objectAtIndex:0 on the array.
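Putting the pieces together, a round trip might be sketched like this. ArchiveRoundTrip is a name made up for the example. It uses the same archiveRootObject:toFile: call as the question, which newer SDKs mark as deprecated but which still exists.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: archives `object` into Documents/<fileName> and reads it back.
// Returns nil if either step fails.
static id ArchiveRoundTrip(id object, NSString *fileName) {
    NSArray *paths = NSSearchPathForDirectoriesInDomains(NSDocumentDirectory, NSUserDomainMask, YES);
    NSString *path = [[paths objectAtIndex:0] stringByAppendingPathComponent:fileName];
    // Writing inside Documents succeeds where the sandbox root does not.
    if (![NSKeyedArchiver archiveRootObject:object toFile:path]) {
        return nil;
    }
    return [NSKeyedUnarchiver unarchiveObjectWithFile:path];
}
```

With the question's data this would be called as ArchiveRoundTrip(videosArray, @"teste.wrapit"), and a nil result means either the write or the read failed.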

function to get the file name of an URL

I have some source code to get the file name of an url
for example:
http://www.google.com/a.pdf
I hope to get a.pdf
Because the only way I know to join two NSStrings is appendString, which only adds a string on the right side, I planned to check each character one by one from the right side of the string 'http://www.google.com/a.pdf'. When it reaches the character '/', it stops checking and returns the string fdp.a, which I then reverse to a.pdf.
The source code is below:
-(NSMutableString *) getSubStringAfterH : originalString:(NSString *)s0
{
NSInteger i,l;
l=[s0 length];
NSMutableString *h=[[NSMutableString alloc] init];
NSMutableString *ttt=[[NSMutableString alloc] init ];
for(i=l-1;i>=0;i--) //check each char one by one from the right side of string 'http://www.google.com/a.pdf', when it reach at the char '/', stop
{
ttt=[s0 substringWithRange:NSMakeRange(i, 1)];
if([ttt isEqualToString:@"/"])
{
break;
}
else
{
[h appendString:ttt];
}
}
[ttt release];
NSMutableString *h1=[[[NSMutableString alloc] initWithFormat:@""] autorelease];
for (i=[h length]-1;i>=0;i--)
{
NSMutableString *t1=[[NSMutableString alloc] init ];
t1=[h substringWithRange:NSMakeRange(i, 1)];
[h1 appendString:t1];
[t1 release];
}
[h release];
return h1;
}
h1 returns the correct string a.pdf, but after control returns to the code that called it, after a while the system reports
'double free
*** set a breakpoint in malloc_error_break to debug'
I checked for a long time and found that if I removed the code
ttt=[s0 substringWithRange:NSMakeRange(i, 1)];
everything is OK (of course getSubStringAfterH then cannot return the result I expected), and no error is reported.
I tried to fix the bug for a few hours, but I still have no clue.
Any comment is welcome.
Thanks
interdev
The following line does the job if url is a NSString:
NSString *filename = [url lastPathComponent];
If url is a NSURL, then the following does the job:
NSString *filename = [[url path] lastPathComponent];
Try this:
Edit: from the comment below
NSString *url = @"http://www.google.com/a.pdf";
NSArray *parts = [url componentsSeparatedByString:@"/"];
NSString *filename = [parts lastObject];
I think if you already have the NSURL object, there is a lastPathComponent method available from iOS 4 onwards.
NSURL *url = [NSURL URLWithString:@"http://www.google.com/a.pdf"];
NSString *filename = [url lastPathComponent];
Swift 3
Let's say that your url is http://www.google.com/a.pdf
let filename = url.lastPathComponent
// filename = "a.pdf"
This is more error free and meant for getting the localized name in the URL.
NSString *localizedName = nil;
[url getResourceValue:&localizedName forKey:NSURLLocalizedNameKey error:NULL];
I haven't tried this yet, but it seems like you might be trying to do this the hard way. The iPhone libraries have the NSURL class, and I imagine that you could simply do:
NSURL *url = [NSURL URLWithString:@"http://www.google.com/a.pdf"];
NSString *path = [url path];
Definitely look for a built in function. The libraries have far more testing and will handle the edge cases better than anything you or I will write in an hour or two (generally speaking).
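One edge case worth a sketch is a URL with a query string. As far as I know, calling lastPathComponent on the plain NSString keeps the query text, while going through NSURL drops it. FileNameFromURLString is a made-up helper name.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns the file name of a URL string, ignoring any
// query or fragment. Returns nil if the string is not a valid URL.
static NSString *FileNameFromURLString(NSString *urlString) {
    NSURL *url = [NSURL URLWithString:urlString];
    // NSURL separates the path from the query before taking the last component.
    return [url lastPathComponent];
}
```

For example, FileNameFromURLString(@"http://www.google.com/a.pdf?download=1") should give just the a.pdf part.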