How can I escape this regex properly in Objective-C? - iphone

I have the following regex I would like to escape in Objective-C
/\B\$((?:[0-9]+(?=[a-z])|(?![0-9\.\:\_\-]))(?:[a-z0-9]|[\_\.\-\:](?![\.\_\.\-\:]))*[a-z0-9]+)/ig;
Not exactly sure how to escape it so it works in Objective-C
Update:
NSString* pattern = @"/\\B\\$((?:[0-9]+(?=[a-z])|(?![0-9\\.\\:\\_\\-]))(?:[a-z0-9]|[\\_\\.\\-\\:](?![\\.\\_\\.\\-\\:]))*[a-z0-9]+)/ig;";
NSRegularExpression *usernameRegex = [[[NSRegularExpression alloc] initWithPattern:pattern
options:NSRegularExpressionCaseInsensitive
error:nil];
Gives me an error about Parse Issue - Unexpected Identifier

Backslashes are used as escape characters in C strings. To make a regexp that contains backslashes as regex escapes, you need to double them.
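As a rough sketch of what that doubling looks like in practice: NSRegularExpression patterns take no `/.../` delimiters or trailing `ig` flags, so those are dropped and case-insensitivity is passed through `options` instead. The pattern below is a deliberately simplified stand-in for the one in the question, and the function name `FirstDollarUsername` is made up for illustration.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: returns the name from the first "$name" token, or nil.
// The regex itself is \B\$([a-z0-9]+); every backslash is doubled because
// it sits inside an Objective-C string literal.
static NSString *FirstDollarUsername(NSString *text) {
    NSError *error = nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"\\B\\$([a-z0-9]+)"
                                                  options:NSRegularExpressionCaseInsensitive
                                                    error:&error];
    if (regex == nil) {
        NSLog(@"Could not compile pattern: %@", error);
        return nil;
    }
    NSTextCheckingResult *match =
        [regex firstMatchInString:text options:0 range:NSMakeRange(0, text.length)];
    if (match == nil) {
        return nil; // no "$name" token found
    }
    // Capture group 1 holds the name without the leading "$".
    return [text substringWithRange:[match rangeAtIndex:1]];
}
```

Calling `FirstDollarUsername(@"pay $alice now")` should yield `alice`. To collect every token rather than the first (the role of the `g` flag), `matchesInString:options:range:` can be used in place of `firstMatchInString:options:range:`.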

Following on from the correct solution by millimoose, here is an NSString category method I use to escape backslashes for regex patterns in Objective-C.
+ (NSString *)escapeBackslashes:(NSString *)regexString
{
NSError *error = NULL;
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@"\\\\" options:NSRegularExpressionCaseInsensitive | NSRegularExpressionDotMatchesLineSeparators | NSRegularExpressionAnchorsMatchLines | NSRegularExpressionAllowCommentsAndWhitespace error:&error];
if (regex != nil)
{
// The replacement template also treats a backslash as an escape, so emitting two literal backslashes takes eight in source.
return [regex stringByReplacingMatchesInString:regexString options:0 range:NSMakeRange(0, [regexString length]) withTemplate:@"\\\\\\\\"];
}
else
{
return regexString;
}
}
Usage example:
NSString *escapedPattern = [NSString escapeBackslashes:pattern];

Related

Matching HTML with NSRegularExpression

Basically I'm looking for a good example of matching HTML (also newlines and whitespace) using NSRegularExpression.
I have this PHP code I wrote a while back:
preg_match_all("/<dt>(.+?)<\/dt>\W+<dd>(.+?)<\/dd>/si", $data, $m['deets']);
Now I know this works in PHP but for the life of me I can't translate it to Objective-C. Here was my attempt.
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@"<dt>(.+?)<\/dt>\W+<dd>(.+?)<\/dd>" options:(NSRegularExpressionCaseInsensitive) error:&error];
return [regex matchesInString:target options:NSCaseInsensitiveSearch range:NSMakeRange(0, [target length])];
My target in this case is a bunch of HTML.
I never used NSRegularExpression, but NSPredicate instead :
NSError *error = NULL;
NSString* pattern = @"/<dt>(.+?)<\/dt>\W+<dd>(.+?)<\/dd>/si";
NSPredicate* predicate = [NSPredicate predicateWithFormat:@"SELF MATCHES %@", pattern];
if ([predicate evaluateWithObject:myTargetString] == YES) {
// Okay
} else {
// Not found
}
Hope this helps.
EDIT :
NSPredicate is cool, but it doesn't work if you want to get the matching range in your target string.
Your code is right, but the problem comes from the regular expression: you must escape the \ characters, and there is no need to escape the / ones.
@"<dt>(.+?)</dt>\\W+<dd>(.+?)</dd>"
So:
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@"<dt>(.+?)</dt>\\W+<dd>(.+?)</dd>" options:(NSRegularExpressionCaseInsensitive) error:&error];
// The matching call takes NSMatchingOptions; NSCaseInsensitiveSearch is an NSString comparison option, so pass 0 here.
return [regex matchesInString:target options:0 range:NSMakeRange(0, [target length])];
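For pulling the captured text out of each match, a minimal sketch might look like the following. It assumes the `<dt>`/`<dd>` markup from the question, adds `NSRegularExpressionDotMatchesLineSeparators` to stand in for the `s` flag of the PHP pattern, and uses a made-up function name, `DefinitionPairs`.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: returns an array of @[term, definition] pairs.
static NSArray *DefinitionPairs(NSString *html) {
    NSError *error = nil;
    // Case-insensitive plus "dot matches newlines" mirror PHP's /si flags.
    NSRegularExpressionOptions opts =
        NSRegularExpressionCaseInsensitive | NSRegularExpressionDotMatchesLineSeparators;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"<dt>(.+?)</dt>\\W+<dd>(.+?)</dd>"
                                                  options:opts
                                                    error:&error];
    if (regex == nil) {
        NSLog(@"Could not compile pattern: %@", error);
        return @[];
    }
    NSMutableArray *pairs = [NSMutableArray array];
    NSArray *matches = [regex matchesInString:html
                                      options:0
                                        range:NSMakeRange(0, html.length)];
    for (NSTextCheckingResult *match in matches) {
        // Range 0 is the whole match; ranges 1 and 2 are the capture groups.
        NSString *term = [html substringWithRange:[match rangeAtIndex:1]];
        NSString *definition = [html substringWithRange:[match rangeAtIndex:2]];
        [pairs addObject:@[term, definition]];
    }
    return pairs;
}
```

Regular expressions are fragile against real-world HTML, so this is only reasonable for markup whose shape is known in advance.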

How to find if the first character of last word in a NSString value is Ampersand using NSRegularExpression?

I would like to find if the first letter of last word starts with Ampersand in a NSString value using NSRegularExpression.
I used the following expression, but it reports a match even if the ampersand is anywhere in the last word.
Please advise me how I can achieve this.
Thank you.
BOOL flagSymbolFound = NO;
NSError *error = NULL;
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@"[&]\\b\\w*$" options:NSRegularExpressionCaseInsensitive error:&error];
if(!error) {
NSUInteger numberOfMatches = [regex numberOfMatchesInString:stringValue options:0 range:NSMakeRange(0, [stringValue length])];
if(numberOfMatches > 0)
flagSymbolFound = YES;
else
flagSymbolFound = NO;
}
Try "\\s[&]\\w+$" pattern. It should match space-separated words, e.g. foo &bar
Try adding the "^" anchor to the beginning of your regex:
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@"^[&]\\b\\w*$" options:NSRegularExpressionCaseInsensitive error:&error];
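One way to combine the two suggestions, sketched here under the assumption that words are separated by whitespace, is to require the ampersand to be preceded by either the start of the string or a whitespace character. The pattern and the function name `LastWordStartsWithAmpersand` are illustrative and do not come from either answer.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: YES when the last whitespace-separated word begins with "&".
static BOOL LastWordStartsWithAmpersand(NSString *text) {
    // (?:^|\s) : the "&" must sit at the start of the string or after whitespace
    // &\w*$    : "&" followed only by word characters up to the end
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"(?:^|\\s)&\\w*$"
                                                  options:0
                                                    error:NULL];
    if (regex == nil) {
        return NO;
    }
    NSUInteger count = [regex numberOfMatchesInString:text
                                              options:0
                                                range:NSMakeRange(0, text.length)];
    return count > 0;
}
```

With this pattern, `foo &bar` and `&bar` are accepted, while `foo b&ar` is rejected because the ampersand there follows a word character.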

Match NSArray of characters Objective-C

I have to match the number of occurrences of n special characters in a string.
I thought to create an array with all these chars (they are 20+) and create a function to match each of them.
I just have the total amount of special characters in the string, so I can make some math count on them.
So in the example:
NSString *myString = @"My string #full# of speci@l ch@rs & symbols";
NSArray *myArray = [NSArray arrayWithObjects:@"#",@"@",@"&",nil];
The function should return 5.
Would it be easier to match the characters that are not in the array, then take the difference in length between the original string and the one without special chars?
Is this the best solution?
NSString *myString = @"My string #full# of speci@l ch@rs & symbols";
// This also works when the string begins with consecutive special characters; the string below returns 8:
//NSString *myString = @"#&@My string #full# of speci@l ch@rs & symbols";
NSArray *arr=[myString componentsSeparatedByCharactersInSet:[NSMutableCharacterSet characterSetWithCharactersInString:@"#@&"]];
NSLog(@"resulted string : %@ \n\n",arr);
NSLog(@"count of special characters : %lu \n\n",(unsigned long)([arr count]-1));
OUTPUT:
resulted string : (
"My string ",
full,
" of speci",
"l ch",
"rs ",
" symbols"
)
count of special characters : 5
You should use an NSRegularExpression; it's perfect for your scenario. You can create one like this:
NSError *error = NULL;
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@"(#|&)" options:NSRegularExpressionCaseInsensitive error:&error];
NSUInteger numberOfMatches = [regex numberOfMatchesInString:string options:0 range:NSMakeRange(0, [string length])];
Caveat: I ripped the code from the Apple Developer site. And I'm no regex guru so you will have to tweak the pattern. But you get the gist.
You should look also at NSRegularExpression:
- (NSUInteger)numberOfCharacters:(NSArray *)arr inString:(NSString *)str {
// Build an alternation such as (#|@|&) from the array entries.
NSMutableString *mutStr = [NSMutableString stringWithString:@"("];
for (NSUInteger i = 0; i < [arr count]; i++) {
// Escape each entry so that regex metacharacters are matched literally.
[mutStr appendString:[NSRegularExpression escapedPatternForString:[arr objectAtIndex:i]]];
if (i + 1 < [arr count]) [mutStr appendString:@"|"];
}
[mutStr appendString:@")"];
NSRegularExpression *regEx = [NSRegularExpression regularExpressionWithPattern:mutStr options:NSRegularExpressionCaseInsensitive error:nil];
NSUInteger occur = [regEx numberOfMatchesInString:str options:0 range:NSMakeRange(0, [str length])];
return occur;
}
Usage example:
NSString *myString = @"My string #full# of speci@l ch@rs & symbols";
NSArray *myArray = [NSArray arrayWithObjects:@"#",@"@",@"&",nil];
NSLog(@"%lu",(unsigned long)[self numberOfCharacters:myArray inString:myString]); // will print 5
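For comparison, a regex-free variant can walk the string once and test each character against an NSCharacterSet. This is only a sketch: the function name `CountCharactersFromSet` is hypothetical, and it works on UTF-16 code units, so it assumes the special characters are all in the Basic Multilingual Plane.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: counts how many characters of `text` appear in `specials`.
static NSUInteger CountCharactersFromSet(NSString *text, NSString *specials) {
    NSCharacterSet *set = [NSCharacterSet characterSetWithCharactersInString:specials];
    NSUInteger count = 0;
    for (NSUInteger i = 0; i < text.length; i++) {
        // characterAtIndex: returns one UTF-16 code unit.
        if ([set characterIsMember:[text characterAtIndex:i]]) {
            count++;
        }
    }
    return count;
}
```

Since no pattern is built, nothing needs escaping, which can matter when the special characters include regex metacharacters such as `.` or `*`.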

Text extraction with NSRegularExpression

Given a NSString *test = @"...href="/functions?q=KEYWORD\x26amp...";
How can I extract the word KEYWORD from the string using NSRegularExpression?
I have tried the following NSRegularExpression on iOS SDK 4.2, but it is not able to find the text. Does the following code look okay?
NSRegularExpression *testRegex = [NSRegularExpression regularExpressionWithPattern:@"(?<=href=\"\\/functions\\?q=).+?(?=\\x26amp])" options:0 error:nil];
NSRange result = [testRegex rangeOfFirstMatchInString:test options:0 range:NSMakeRange(0, [test length])];
You have a stray "]" in your regex, right before the end, which is probably causing a problem. You also need to use four backslashes to match a backslash in the input string. (Double it to escape it in the C string, and then double again to escape it in the regex.) I'd suggest two things. First, pass something in the error parameter and take a look at it in the debugger. Second, I'm not a big fan of lookahead/lookbehind expressions. I think this style is more readable:
NSString *regexStr = @"href=\"\\/functions\\?q=(.+?)\\\\x26amp";
NSError *error = nil;
NSRegularExpression *testRegex = [NSRegularExpression regularExpressionWithPattern:regexStr options:0 error:&error];
if( testRegex == nil ) NSLog( @"Error making regex: %@", error );
NSTextCheckingResult *result = [testRegex firstMatchInString:test options:0 range:NSMakeRange(0, [test length])];
NSRange range = [result rangeAtIndex:1];
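Wrapped up as a function, the capture-group approach could be sketched as below. It assumes the input really contains a literal backslash followed by `x26amp`, as the answer implies; the function name `ExtractKeyword` is made up, and a nil check is added because messaging a nil result with `rangeAtIndex:` would not yield a usable range.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: returns the text between "?q=" and a literal "\x26amp", or nil.
static NSString *ExtractKeyword(NSString *text) {
    NSError *error = nil;
    // Regex seen by the engine: href="/functions\?q=(.+?)\\x26amp
    // The four backslashes in source become two in the regex, matching one in the input.
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"href=\"/functions\\?q=(.+?)\\\\x26amp"
                                                  options:0
                                                    error:&error];
    if (regex == nil) {
        NSLog(@"Error making regex: %@", error);
        return nil;
    }
    NSTextCheckingResult *result =
        [regex firstMatchInString:text options:0 range:NSMakeRange(0, text.length)];
    if (result == nil) {
        return nil; // pattern did not match
    }
    return [text substringWithRange:[result rangeAtIndex:1]];
}
```

A call such as `ExtractKeyword(@"<a href=\"/functions?q=KEYWORD\\x26amp;x=1\">")` should return `KEYWORD` under these assumptions.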

NSRegularExpression and capture groups on iphone

I need a little kickstart on regex on the iphone.
Basically I have a list of dates in a private MediaWiki in the form of
*185 BC: SOME EVENT HERE
*2001: SOME OTHER EVENT MUCH LATER
I now want to parse that into an Object that has a NSDate property and a -say- NSString property.
I have this so far: (rawContentString contains the mediawiki syntax of the page)
NSString* regexString =@"\\*( *[0-9]{1,}.*): (.*)";
NSRegularExpressionOptions options = NSRegularExpressionCaseInsensitive;
NSError* error = NULL;
NSRegularExpression* regex = [NSRegularExpression regularExpressionWithPattern:regexString options:options error:&error];
if (error) {
NSLog(@"%@", [error description]);
}
NSArray* results = [regex matchesInString:rawContentString options:0 range:NSMakeRange(0, [rawContentString length])];
for (NSTextCheckingResult* result in results) {
NSString* resultString = [rawContentString substringWithRange:result.range];
NSLog(@"%@",resultString);
}
Unfortunately, I think the regex is not working the way I hoped, and I don't know how to capture the matched date and text.
Any help would be great.
BTW: there is not by any chance a regex Pattern compilation for MediaWiki Syntax out there somewhere ?
Thanks in advance
Heiko
My issue was that I was using matchesInString and I needed to use firstMatchInString because it returns multiple ranges in a single NSTextCheckingResult.
This is counter intuitive, but it worked.
I got the answer from http://snipplr.com/view/63340/
My Code (to parse credit card track data):
NSRegularExpression *track1Pattern = [NSRegularExpression regularExpressionWithPattern:@"%.(.+?)\\^(.+?)\\^([0-9]{2})([0-9]{2}).+?\\?." options:NSRegularExpressionCaseInsensitive error:&error];
NSTextCheckingResult *result = [track1Pattern firstMatchInString:trackString options:NSMatchingReportCompletion range:NSMakeRange(0, trackString.length)];
self.cardNumber = [trackString substringWithRange: [result rangeAtIndex:1]];
self.cardHolderName = [trackString substringWithRange: [result rangeAtIndex:2]];
self.expirationMonth = [trackString substringWithRange: [result rangeAtIndex:3]];
self.expirationYear = [trackString substringWithRange: [result rangeAtIndex:4]];
As for the regex, I think something along these lines:
\*([ 0-9]{1,}.*):(.*)
should work better for what you need. You're not escaping the first *, and why is there a * in the first group statement?
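Putting the pieces together for the MediaWiki list, a minimal sketch could read each `*DATE: EVENT` line and pull both capture groups with `rangeAtIndex:`. The pattern here is a stricter variant written for illustration, not the one from either post, and the name `ParseTimelineEntries` is hypothetical. Turning the date text into an NSDate is left out, since a value such as "185 BC" would need custom handling.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: returns @[dateText, eventText] pairs, one per "*DATE: EVENT" line.
static NSArray *ParseTimelineEntries(NSString *wikiText) {
    NSError *error = nil;
    // Regex seen by the engine: ^\*[ \t]*([0-9]+[^:\n]*):[ \t]*(.*)$
    // AnchorsMatchLines lets ^ and $ match at every line, not just the whole string.
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"^\\*[ \\t]*([0-9]+[^:\\n]*):[ \\t]*(.*)$"
                                                  options:NSRegularExpressionAnchorsMatchLines
                                                    error:&error];
    if (regex == nil) {
        NSLog(@"Could not compile pattern: %@", error);
        return @[];
    }
    NSMutableArray *entries = [NSMutableArray array];
    NSArray *matches = [regex matchesInString:wikiText
                                      options:0
                                        range:NSMakeRange(0, wikiText.length)];
    for (NSTextCheckingResult *match in matches) {
        // Group 1 is the date text, group 2 the event description.
        NSString *dateText = [wikiText substringWithRange:[match rangeAtIndex:1]];
        NSString *eventText = [wikiText substringWithRange:[match rangeAtIndex:2]];
        [entries addObject:@[dateText, eventText]];
    }
    return entries;
}
```

Each NSTextCheckingResult carries the capture-group ranges whether it comes from `matchesInString:options:range:` or `firstMatchInString:options:range:`, so the loop form works for multi-line input.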