Parsing a string starting with @ and # in Objective-C - iPhone

So I am trying to parse a string that has the following format:
baz#marroon#red#blue #big#cat#dog
or, it can also be separated by spaces:
baz #marroon #red #blue #big #cat #dog
and here's how I am doing it now:
- (void) parseTagsInComment:(NSString *) comment
{
    if ([comment length] > 0){
        NSArray * stringArray = [comment componentsSeparatedByString:@" "];
        for (NSString * word in stringArray){
        }
    }
}
I've got the components separated by spaces working, but what if there are no spaces? How do I iterate through these words then? I was thinking of using a regex, but I have no idea how to write one in Objective-C. Any ideas for a regex that would cover both of these cases?
Here's my first attempt:
NSError * error;
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@"(@|#)\\S+" options:NSRegularExpressionCaseInsensitive error:&error];
NSArray* wordArray = [regex matchesInString:comment
                                    options:0 range:NSMakeRange(0, [comment length])];
for (NSString * word in wordArray){
}
This doesn't work. I think my regex is wrong.

Here is a way to do it using NSScanner that puts the separated strings and a string representation of their ranges into an array (this assumes that your original string started with a # -- if it doesn't and you need it to, then just prepend the hash to the string at the start).
NSMutableArray *array = [NSMutableArray array];
NSString *str = @"#baz#marroon#red#blue #big#cat#dog";
NSScanner *scanner = [NSScanner scannerWithString:str];
NSCharacterSet *searchSet = [NSCharacterSet characterSetWithCharactersInString:@"@#"];
NSString *outputString;
while (![scanner isAtEnd]) {
    // Skip anything in front of the next @ or #.
    [scanner scanUpToCharactersFromSet:searchSet intoString:nil];
    // Read the symbol itself.
    [scanner scanCharactersFromSet:searchSet intoString:&outputString];
    NSString *symbol = [outputString copy];
    // Read the word that follows, up to the next symbol.
    [scanner scanUpToCharactersFromSet:searchSet intoString:&outputString];
    NSString *wholePiece = [[symbol stringByAppendingString:outputString] stringByTrimmingCharactersInSet:[NSCharacterSet whitespaceCharacterSet]];
    // rangeOfString: returns the range of the first occurrence of the piece in str.
    NSString *rangeString = NSStringFromRange([str rangeOfString:wholePiece]);
    [array addObject:wholePiece];
    [array addObject:rangeString];
}
NSLog(@"%@", array);

I think the regular expression you really want is [@#]?\\w+. It will find groups of letters optionally preceded by an @ or #. Your expression wouldn't work because it looks for any non-space character, which includes @ and #. (Depending on what can be in the "words," you might want something more or less specific than \w, but it isn't clear from the question.)

If you need the ranges, then NSRegularExpression probably works well:
NSString *comment = @"#baz#marroon#red#blue #big#cat#dog";
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@"[@#]\\w+" options:0 error:nil];
NSArray* wordArray = [regex matchesInString:comment
                                    options:0
                                      range:NSMakeRange(0, [comment length])];
for (NSTextCheckingResult *result in wordArray)
    NSLog(@"%@", [comment substringWithRange:result.range]);
Or, [@#][a-zA-Z]+ works if you're OK with ASCII alpha words only.

Related

NSRegularExpression ISSUE

I'm working with NSRegularExpression to read a text and find the hashtags in it.
This is the NSString that I pass to regularExpressionWithPattern:
- (NSString *)hashtagRegex
{
    return @"#((?:[A-Za-z0-9-_]*))";
    //return @"#{1}([A-Za-z0-9-_]{2,})";
}
And this is my method:
// Handle Twitter Hashtags
detector = [NSRegularExpression regularExpressionWithPattern:[self hashtagRegex] options:0 error:&error];
links = [detector matchesInString:theText options:0 range:NSMakeRange(0, theText.length)];
current = [NSMutableArray arrayWithArray:links];
NSString *hashtagURL = @"http://twitter.com/search?q=%23";
//hashtagURL = [hashtagURL stringByAddingPercentEscapesUsingEncoding:NSASCIIStringEncoding];
for ( int i = 0; i < [links count]; i++ ) {
    NSTextCheckingResult *cr = [current objectAtIndex:i];
    NSString *url = [theText substringWithRange:cr.range];
    NSString *nohashURL = [url stringByReplacingOccurrencesOfString:@"#" withString:@""];
    nohashURL = [nohashURL stringByReplacingOccurrencesOfString:@" " withString:@""];
    [theText replaceOccurrencesOfString:url
                             withString:[NSString stringWithFormat:@"<a href=\"%@%@\">%@</a>", hashtagURL, nohashURL, url]
                                options:NSLiteralSearch
                                  range:NSMakeRange(0, theText.length)];
    current = [NSMutableArray arrayWithArray:[detector matchesInString:theText options:0 range:NSMakeRange(0, theText.length)]];
}
[theText replaceOccurrencesOfString:@"\n" withString:@"<br />" options:NSLiteralSearch range:NSMakeRange(0, theText.length)];
[_aWebView loadHTMLString:[self embedHTMLWithFontName:[self fontName]
                                                 size:[self fontSize]
                                                 text:theText]
                  baseURL:nil];
Everything works, but I found a small issue when I use a string like this:
NSString * theText = @"#twitter #twitterapp #twittertag";
My code highlights only the #twitter part of each word and not the rest of it (#twitter #twitter(app) #twitter(tag)).
I hope someone will help me!
Thank you :)
The statement
[theText replaceOccurrencesOfString:url
                         withString:[NSString stringWithFormat:@"<a href=\"%@%@\">%@</a>", hashtagURL, nohashURL, url]
                            options:NSLiteralSearch
                              range:NSMakeRange(0, theText.length)];
is replacing all instances of the string url with the replacement string. In the example you give, the first time through the loop, url is @"#twitter", and all three occurrences of that string within theText are replaced in one go. This is what theText looks like then:
<a href="http://twitter.com/search?q=%23twitter">#twitter</a> <a href="http://twitter.com/search?q=%23twitter">#twitter</a>app <a href="http://twitter.com/search?q=%23twitter">#twitter</a>tag
So, of course, the next two times round the loop, the results are not quite what you expect... !
I think the fix is to limit the range of the replacement:
[theText replaceOccurrencesOfString:url
                         withString:[NSString stringWithFormat:@"<a href=\"%@%@\">%@</a>", hashtagURL, nohashURL, url]
                            options:NSLiteralSearch
                              range:cr.range];
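Another way to keep one replacement from touching the other hashtags is to compute the matches once and rewrite them from last to first, so the ranges that are still pending stay valid while the string grows. This is only a sketch: the function name LinkifyHashtags, the link markup, and the slightly tightened pattern (at least one character after the #) are my assumptions, not the asker's code.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: wraps each hashtag in an HTML link.
// Matches are processed back to front so that replacing one match
// does not shift the ranges of the matches still to be processed.
static NSString *LinkifyHashtags(NSString *text) {
    // Assumed pattern: '#' followed by one or more letters, digits, '_' or '-'.
    NSRegularExpression *detector =
        [NSRegularExpression regularExpressionWithPattern:@"#([A-Za-z0-9_-]+)"
                                                  options:0
                                                    error:NULL];
    NSMutableString *result = [text mutableCopy];
    NSArray *matches = [detector matchesInString:text
                                         options:0
                                           range:NSMakeRange(0, text.length)];
    for (NSTextCheckingResult *match in [matches reverseObjectEnumerator]) {
        // Capture group 1 is the tag without the leading '#'.
        NSString *tag = [text substringWithRange:[match rangeAtIndex:1]];
        // "%%23" produces a literal "%23" (the URL-encoded '#').
        NSString *link = [NSString stringWithFormat:
            @"<a href=\"http://twitter.com/search?q=%%23%@\">#%@</a>", tag, tag];
        [result replaceCharactersInRange:match.range withString:link];
    }
    return result;
}
```

Because each match is replaced by its own range, #twitterapp and #twittertag are linked as whole tags instead of being cut after #twitter.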

Take part of string in-between symbols?

I would like to take the digits directly in front of the ` symbol (back to the nearest non-numeric character) and convert them into an integer.
Ex.
Original String: 2*3*(123`)
Result: 123
Original String: 4`12
Result: 4
Thanks,
Regards.
You can use regular expressions. You can find all the occurrences like this:
NSString *mystring = @"123(12`)456+1093`";
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@"([0-9]+)`" options:0 error:nil];
NSArray *matches = [regex matchesInString:mystring options:0 range:NSMakeRange(0, mystring.length)];
for (NSTextCheckingResult *match in matches) {
    NSLog(@"%@", [mystring substringWithRange:[match rangeAtIndex:1]]);
}
// 12 and 1093
If you only need one occurrence, then replace the for loop with the following:
if (matches.count > 0) {
    NSTextCheckingResult *match = [matches objectAtIndex:0];
    NSLog(@"%@", [mystring substringWithRange:[match rangeAtIndex:1]]);
}
There may be a better way to do this, but here is what I could quickly come up with:
NSString *mystring = @"123(12`)";
NSString *neededString = nil;
NSScanner *scanner = [NSScanner scannerWithString:mystring];
[scanner scanUpToString:@"`" intoString:&neededString];
neededString = [self reverseString:neededString];
NSLog(@"%@", [self reverseString:[NSString stringWithFormat:@"%d", [neededString intValue]]]);
reverseString: is a helper method that is not shown here; it returns the string with its characters in reverse order.
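Since the question asks for an integer rather than a string, the regex approach can be wrapped in a small function. This is a minimal sketch; the name IntegerBeforeBacktick and the fallback parameter are made up for illustration.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns the integer formed by the digits directly
// in front of the first backtick that has digits before it, or `fallback`
// when there is no such backtick.
static NSInteger IntegerBeforeBacktick(NSString *input, NSInteger fallback) {
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"([0-9]+)`"
                                                  options:0
                                                    error:NULL];
    NSTextCheckingResult *match =
        [regex firstMatchInString:input
                          options:0
                            range:NSMakeRange(0, input.length)];
    if (match == nil) {
        return fallback;
    }
    // Capture group 1 holds only the digits, without the backtick.
    return [[input substringWithRange:[match rangeAtIndex:1]] integerValue];
}
```

With the two examples from the question, IntegerBeforeBacktick(@"2*3*(123`)", -1) gives 123 and IntegerBeforeBacktick(@"4`12", -1) gives 4.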

Objective-C: Find consonants in string

I have a string that contains words with consonants and vowels. How can I extract only consonants from the string?
NSString *str = #"consonants.";
Result must be:
cnsnnts
You could make a character set with all the vowels (@"aeiouy")
+ (id)characterSetWithCharactersInString:(NSString *)aString
then use the
- (NSString *)stringByTrimmingCharactersInSet:(NSCharacterSet *)set
method.
EDIT: This will only remove vowels at the beginning and end of the string as pointed out in the other post, what you could do instead is use
- (NSArray *)componentsSeparatedByCharactersInSet:(NSCharacterSet *)separator
then stick the components back together. You may also need to include capitalized versions of the vowels in the set, and if you want to also deal with accents (à á è è ê ì etc...) you'll probably have to include that also.
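The split-and-rejoin idea could look roughly like this. It is a sketch under a few assumptions: the function name StringByRemovingVowels is made up, only unaccented ASCII vowels are listed, and y is treated as a consonant here.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: removes vowels by splitting the string on them
// and joining the remaining pieces back together.
static NSString *StringByRemovingVowels(NSString *input) {
    // Both cases are listed because character sets are case-sensitive.
    NSCharacterSet *vowels =
        [NSCharacterSet characterSetWithCharactersInString:@"aeiouAEIOU"];
    NSArray *pieces = [input componentsSeparatedByCharactersInSet:vowels];
    return [pieces componentsJoinedByString:@""];
}
```

Note that this keeps everything that is not a vowel, so for @"consonants." the trailing period survives; it would have to be added to the set if it should be dropped too.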
Unfortunately, stringByTrimmingCharactersInSet: won't work because it only trims leading and trailing characters, but you could try a regular expression with substitution like this:
[[NSRegularExpression
    regularExpressionWithPattern:@"[^bcdfghjklmnpqrstvwxz]"
                         options:NSRegularExpressionCaseInsensitive
                           error:NULL]
    stringByReplacingMatchesInString:str
                             options:0
                               range:NSMakeRange(0, [str length])
                        withTemplate:@""]
You probably want to tune the regex and options for your needs.
Possible, for sure not-optimal, solution. I'm printing intermediate results for your learning. Take care of memory allocation (I didn't care). Hopefully someone will send you a better solution, but you can copy and paste this for the moment.
NSString *test = @"Try to get all consonants";
NSMutableString *found = [[NSMutableString alloc] init];
NSInteger loc = 0;
NSCharacterSet *consonants = [NSCharacterSet characterSetWithCharactersInString:@"bcdfghjklmnpqrstvwxyz"];
while (loc != NSNotFound && loc < [test length]) {
    NSRange r = [[test lowercaseString] rangeOfCharacterFromSet:consonants options:0 range:NSMakeRange(loc, [test length] - loc)];
    if (r.location != NSNotFound) {
        NSString *temp = [test substringWithRange:r];
        NSLog(@"Range: %@ Temp: %@", NSStringFromRange(r), temp);
        [found appendString:temp];
        loc = r.location + r.length;
    } else {
        loc = NSNotFound;
    }
}
NSLog(@"Found: %@", found);
Here is a NSString category that does the job:
- (NSString *)consonants
{
    NSString *result = [NSString stringWithString:self];
    NSCharacterSet *characterSet = [NSCharacterSet characterSetWithCharactersInString:@"aeiou"];
    while(1)
    {
        NSRange range = [result rangeOfCharacterFromSet:characterSet options:NSCaseInsensitiveSearch];
        if(range.location == NSNotFound)
            break;
        result = [result stringByReplacingCharactersInRange:range withString:@""];
    }
    return result;
}

Match NSArray of characters Objective-C

I have to match the number of occurrences of n special characters in a string.
I thought to create an array with all these chars (they are 20+) and create a function to match each of them.
I just have the total amount of special characters in the string, so I can make some math count on them.
So in the example:
NSString *myString = @"My string #full# of speci@l ch@rs & symbols";
NSArray *myArray = [NSArray arrayWithObjects:@"#",@"@",@"&",nil];
The function should return 5.
Would it be easier match the characters that are not in the array, take the string length and output the difference between the original string and the one without special chars?
Is this the best solution?
NSString *myString = @"My string #full# of speci@l ch@rs & symbols";
// Even when the string starts with consecutive special characters, this works: the string below returns 8
//NSString *myString = @"@&#My string #full# of speci@l ch@rs & symbols";
NSArray *arr = [myString componentsSeparatedByCharactersInSet:[NSMutableCharacterSet characterSetWithCharactersInString:@"#@&"]];
NSLog(@"resulted string : %@ \n\n", arr);
NSLog(@"count of special characters : %lu \n\n", (unsigned long)([arr count] - 1));
OUTPUT:
resulted string : (
"My string ",
full,
" of speci",
"l ch",
"rs ",
" symbols"
)
count of special characters : 5
You should use an NSRegularExpression; it's perfect for your scenario. You can create one like this:
NSError *error = NULL;
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@"(#|&)" options:NSRegularExpressionCaseInsensitive error:&error];
NSUInteger numberOfMatches = [regex numberOfMatchesInString:string options:0 range:NSMakeRange(0, [string length])];
Caveat: I ripped the code from the Apple Developer site. And I'm no regex guru so you will have to tweak the pattern. But you get the gist.
You should look also at NSRegularExpression:
- (NSUInteger)numberOfCharacters:(NSArray *)arr inString:(NSString *)str {
    // Build an alternation such as (#|@|&), escaping each entry for use in a regex.
    NSMutableString *mutStr = [NSMutableString stringWithString:@"("];
    for (NSUInteger i = 0; i < [arr count]; i++) {
        [mutStr appendString:[NSRegularExpression escapedPatternForString:[arr objectAtIndex:i]]];
        if (i + 1 < [arr count]) [mutStr appendString:@"|"];
    }
    [mutStr appendString:@")"];
    NSRegularExpression *regEx = [NSRegularExpression regularExpressionWithPattern:mutStr options:NSRegularExpressionCaseInsensitive error:nil];
    NSUInteger occur = [regEx numberOfMatchesInString:str options:0 range:NSMakeRange(0, [str length])];
    return occur;
}
Usage example:
NSString *myString = @"My string #full# of speci@l ch@rs & symbols";
NSArray *myArray = [NSArray arrayWithObjects:@"#",@"@",@"&",nil];
NSLog(@"%lu", (unsigned long)[self numberOfCharacters:myArray inString:myString]); // will print 5
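If the special characters are all single characters, a regex is not strictly needed: the count can also be taken directly with an NSCharacterSet. A minimal sketch follows; the function name CountCharactersInSet is made up for illustration, and it compares UTF-16 code units, so it assumes the special characters are not emoji or other characters outside the Basic Multilingual Plane.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: counts the characters of `string` that belong to `set`.
static NSUInteger CountCharactersInSet(NSString *string, NSCharacterSet *set) {
    NSUInteger count = 0;
    for (NSUInteger i = 0; i < string.length; i++) {
        // characterAtIndex: returns one UTF-16 code unit.
        if ([set characterIsMember:[string characterAtIndex:i]]) {
            count++;
        }
    }
    return count;
}
```

It would be called with a set built from the special characters, for example [NSCharacterSet characterSetWithCharactersInString:@"#@&"].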

Text extraction with NSRegularExpression

Given an NSString *test = @"...href="/functions?q=KEYWORD\x26amp...";
How can I extract the word KEYWORD from the string using NSRegularExpression?
I have tried the following NSRegularExpression on iOS SDK 4.2, but it is not able to find the text. Does the following code look okay?
NSRegularExpression *testRegex = [NSRegularExpression regularExpressionWithPattern:@"(?<=href=\"\\/functions\\?q=).+?(?=\\x26amp])" options:0 error:nil];
NSRange result = [testRegex rangeOfFirstMatchInString:test options:0 range:NSMakeRange(0, [test length])];
You have a stray "]" in your regex, right before the end, which is probably causing a problem. You also need four backslashes to match a backslash in the input string (double it to escape it in the C string, then double it again to escape it in the regex). I'd suggest two things. First, pass something in the error parameter and take a look at it in the debugger. Second, I'm not a big fan of lookahead/lookbehind expressions; I think this style is more readable:
NSString *regexStr = @"href=\"\\/functions\\?q=(.+?)\\\\x26amp";
NSError *error = nil;
NSRegularExpression *testRegex = [NSRegularExpression regularExpressionWithPattern:regexStr options:0 error:&error];
if (testRegex == nil) NSLog(@"Error making regex: %@", error);
NSTextCheckingResult *result = [testRegex firstMatchInString:test options:0 range:NSMakeRange(0, [test length])];
NSRange range = [result rangeAtIndex:1];
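To get from the capture-group range to the actual keyword, the steps above can be put together as follows. This is a sketch, assuming the input really contains a literal backslash before x26amp; the function name KeywordInString is made up for illustration.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns the text captured between "?q=" and the
// literal characters \x26amp, or nil when the pattern does not match.
static NSString *KeywordInString(NSString *html) {
    // After C-string unescaping the regex is: href="/functions\?q=(.+?)\\x26amp
    NSString *pattern = @"href=\"/functions\\?q=(.+?)\\\\x26amp";
    NSError *error = nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:pattern
                                                  options:0
                                                    error:&error];
    if (regex == nil) {
        NSLog(@"Error making regex: %@", error);
        return nil;
    }
    NSTextCheckingResult *result =
        [regex firstMatchInString:html
                          options:0
                            range:NSMakeRange(0, html.length)];
    if (result == nil) {
        return nil;  // no match, so there is no capture group to read
    }
    return [html substringWithRange:[result rangeAtIndex:1]];
}
```

Checking result for nil before using the range matters: messaging a nil result yields a zeroed NSRange rather than NSNotFound, which would silently produce an empty string.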