How to find twitter handles in a NSString - iphone

If I had the following:
NSString *tweet = @"Shoutout to @somebody and @somebodyElse for your help on this one #shoutouts";
How would I go about finding the ranges of the Twitter handles (e.g. @somebody)?
I want to make them bold in my Attributed String which is the next step.
Bonus points if you can help me find the # hash tags as well, but I assume it's the same algorithm.

NSRegularExpression is your friend.

Use the NSRegularExpression class:
http://developer.apple.com/library/ios/#documentation/Foundation/Reference/NSRegularExpression_Class/Reference/Reference.html
Then try this online tool to build the regex:
http://www.gskinner.com/RegExr/
I tried it, and it seems you can build a good one there.
SAMPLE CODE - NOT TESTED
NSString *yourString = @"Shoutout to @somebody and @somebodyElse for your help on this one #shoutouts";
NSError *error = nil;
// Matches an @handle or a #hashtag; backslashes are doubled inside an Objective-C string literal
NSRegularExpression *regex = [NSRegularExpression
regularExpressionWithPattern:@"@\\S+|#\\S+"
options:NSRegularExpressionCaseInsensitive
error:&error];
[regex enumerateMatchesInString:yourString options:0 range:NSMakeRange(0, [yourString length]) usingBlock:^(NSTextCheckingResult *match, NSMatchingFlags flags, BOOL *stop){
// your code to handle matches here
}];
Test the pattern in the online tool first!
Good luck!
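To make the "find the ranges" step concrete, here is a minimal sketch that collects the match ranges into an array. The helper name MentionAndHashtagRanges is made up for illustration, and the pattern uses \w+ instead of \S+ (my assumption, not part of the answer above) so that trailing punctuation such as a comma is not swallowed into the match.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: returns the ranges of @handles and #hashtags,
// boxed as NSValue objects so they can live in an NSArray.
static NSArray<NSValue *> *MentionAndHashtagRanges(NSString *text) {
    NSError *error = nil;
    // "@" or "#" followed by one or more word characters.
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"[@#]\\w+"
                                                  options:0
                                                    error:&error];
    if (regex == nil) {
        NSLog(@"Could not build regex: %@", error);
        return @[];
    }
    NSMutableArray<NSValue *> *ranges = [NSMutableArray array];
    [regex enumerateMatchesInString:text
                            options:0
                              range:NSMakeRange(0, text.length)
                         usingBlock:^(NSTextCheckingResult *match, NSMatchingFlags flags, BOOL *stop) {
        // match.range is the range of the whole match in the original string.
        [ranges addObject:[NSValue valueWithRange:match.range]];
    }];
    return ranges;
}
```

Each range can then be handed to addAttribute:value:range: on an NSMutableAttributedString with a bold font. Note that this simple pattern would also pick up the "@domain" part of an email address, so tighten it if your text can contain those.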

Related

How can I find a dynamic number in a long NSString?

I have a very long NSString of around 1500 characters. I need to extract a phone number from it, and the number may change frequently because the data is dynamic. The phone number will be in the format 251-221-2000. How can I extract it?
Check out this previous question on regular expressions and NSString.
Search through NSString using Regular Expression
In your case an appropriate regular expression would be @"\\d{3}-\\d{3}-\\d{4}".
This sounds like a perfect candidate for a regular expression. You can use the NSRegularExpression class to achieve this. You can test your regular expression at http://www.regextester.com
NSString *yourString = @"Your 1500 characters string ";
NSError *error = nil;
// Backslashes must be doubled inside an Objective-C string literal
NSRegularExpression *regex = [NSRegularExpression
regularExpressionWithPattern:@"\\d{3}-\\d{3}-\\d{4}"
options:NSRegularExpressionCaseInsensitive
error:&error];
[regex enumerateMatchesInString:yourString options:0 range:NSMakeRange(0, [yourString length]) usingBlock:^(NSTextCheckingResult *match, NSMatchingFlags flags, BOOL *stop){
// your code to handle matches here
}];
Let me know whether it works.
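If only the first number is needed, a small sketch like the following may be enough. FirstPhoneNumber is a hypothetical helper name, and the \b word boundaries are my own addition so that a longer digit run is not partially matched.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: returns the first NNN-NNN-NNNN number in text,
// or nil when there is none.
static NSString *FirstPhoneNumber(NSString *text) {
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"\\b\\d{3}-\\d{3}-\\d{4}\\b"
                                                  options:0
                                                    error:NULL];
    NSTextCheckingResult *match =
        [regex firstMatchInString:text
                          options:0
                            range:NSMakeRange(0, text.length)];
    if (match == nil) {
        return nil;
    }
    // The match range indexes into the original string.
    return [text substringWithRange:match.range];
}
```

Calling FirstPhoneNumber(@"Call 251-221-2000 today") should give back @"251-221-2000".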

NSRegularExpression not getting exact text

I have a string like:
<book>MyBook</book><value>myValue</value>
Now I want to get the text "myValue" out of this string. I want to use NSRegularExpression to do this. I tried this:
NSError *error = nil;
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@"(<book>MyBook</book>\\s*<value>).*?(</value>)"
options:NSRegularExpressionCaseInsensitive
error:&error];
NSArray *textArray = [regex matchesInString:myData options:0 range:NSMakeRange(0, [myData length])];
NSTextCheckingResult *result = [regex firstMatchInString:myData
options:0
range:NSMakeRange(0, [myData length])];
The result is:
<book>MyBook</book><value>myValue</value>
So I get the whole string, but I only want "myValue". How can I do this? What am I missing here?
Thanks in advance!
That happens because you wrote a regex that matches the entire string. I'd reckon that writing a regex that will only match the myValue part of the string is way too complicated to be bothered with (due to the fact that you've got MyBook string that will probably match anything myValue does).
I'd recommend not using regex for this, as they are not intended for the use you've described here. If you don't want to use any XML deserialization, you could use an NSScanner or any of the NSString class methods, which will yield simpler code that is easier to maintain.
For example, using an NSScanner and a few other methods:
NSString *stringToBeScanned = @"<book>MyBook</book><value>myValue</value>";
NSString *myValue;
NSScanner *scanner = [NSScanner scannerWithString:stringToBeScanned];
[scanner scanUpToString:@"<value>" intoString:nil];
// After the above, we've got "<value>myValue</value>" left to scan
[scanner scanUpToString:@"</value>" intoString:&myValue];
// We ended up with a "<value>myValue" type of a string
// This will trim the remaining of the string we don't need
myValue = [myValue stringByReplacingOccurrencesOfString:@"<value>" withString:@""];
The above could probably be written better, and I might have made a mistake or two writing it from memory, but the principle should work.
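If you would still rather stay with NSRegularExpression, one possible sketch is to put the parentheses around the part you want and read capture group 1 instead of the whole match. ValueForMyBook is a hypothetical helper name; the pattern is the one from the question with the groups moved.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: returns the text between <value> and </value>
// that follows <book>MyBook</book>, or nil when nothing matches.
static NSString *ValueForMyBook(NSString *xml) {
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"<book>MyBook</book>\\s*<value>(.*?)</value>"
                                                  options:NSRegularExpressionCaseInsensitive
                                                    error:NULL];
    NSTextCheckingResult *match =
        [regex firstMatchInString:xml
                          options:0
                            range:NSMakeRange(0, xml.length)];
    if (match == nil) {
        return nil;
    }
    // Index 0 is the whole match; index 1 is the first parenthesized group.
    return [xml substringWithRange:[match rangeAtIndex:1]];
}
```

This works for the flat string shown in the question; for real XML with nesting or attributes, a proper parser is still the safer choice.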

Regarding regular expression in iPhone

I am working on regular expressions, but wherever I search I only find code and explanations for validating email addresses. I need to do something different.
The contents of the file look like this (the file format may be .rtf, .txt, etc.):
[Title:languageC]
[Author:Dennis Ritchie]
[Description:this book is nice to learn the C language]
From this file I want to extract "languageC", "Dennis Ritchie", and "this book is nice to learn the C language". I have achieved this using NSString, NSScanner and NSRange, but now I want to achieve the same thing using regular expressions. Is that possible?
NSString *regexStr = @"\[Title:([.]+)\][ ]+\[Author:([.]+)\][ ]+\[Description:([.]+)\]";
NSError *error;
NSRegularExpression *testRegex = [NSRegularExpression regularExpressionWithPattern:regexStr options:0 error:&error];
if( testRegex == nil ) NSLog( @"Error making regex: %@", error );
NSTextCheckingResult *result = [testRegex firstMatchInString:test options:0 range:NSMakeRange(0, [test length])];
NSRange range = [result rangeAtIndex:1]; // This will give you Title,
You should use a regex like this:
/\[[^:]+:([^\]]+)\]/
E.g.:
NSError *error = nil;
// [^\]]+ captures everything up to the closing bracket; backslashes are doubled in the string literal
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@"\\[[^:]+:([^\\]]+)\\]" options:0 error:&error];
NSArray *matches = [regex matchesInString:YOUR_STRING options:0 range:NSMakeRange(0, [YOUR_STRING length])];
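As a rough sketch of what to do with the matches, the following collects every [Key:Value] pair into a dictionary. FieldsFromText is a hypothetical helper name, and capturing the key in its own group is my addition on top of the regex idea above.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: turns "[Key:Value]" entries into a dictionary.
static NSDictionary<NSString *, NSString *> *FieldsFromText(NSString *text) {
    // Group 1: the key (no ':' or ']'); group 2: the value (no ']').
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"\\[([^:\\]]+):([^\\]]+)\\]"
                                                  options:0
                                                    error:NULL];
    NSMutableDictionary<NSString *, NSString *> *fields = [NSMutableDictionary dictionary];
    NSArray<NSTextCheckingResult *> *matches =
        [regex matchesInString:text options:0 range:NSMakeRange(0, text.length)];
    for (NSTextCheckingResult *match in matches) {
        NSString *key = [text substringWithRange:[match rangeAtIndex:1]];
        NSString *value = [text substringWithRange:[match rangeAtIndex:2]];
        fields[key] = value;
    }
    return fields;
}
```

With the sample file contents, fields[@"Title"] would be @"languageC" and fields[@"Author"] would be @"Dennis Ritchie". If a key appears more than once, the last occurrence wins.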

Parse NSString from right hand side?

> (2009 RX7)</font></td>
>monospace" size="-1">214869 (2007 PAZ)</font></td>
>monospace" size="-1"> 4155 Accord</font></td>
I wonder if someone could offer me a little help, I have a list of NSString items (See Above) that I want to parse some data from. My problem is that there are no tags that I can use within the strings nor do the items I want have fixed positions. The data I want to extract is:
2009 RX7
2007 PAZ
4155 Accord
My thinking is that it's going to be easier to parse from the right-hand end, remove the </font></td> and then use ";" to separate the data items:
(2009&nbsp RX7)
(2007&nbsp PAZ)
4155&nbsp Accord
which can then be cleaned up to match the example given. Any pointers on doing this or working through from the right would be very much appreciated.
Personally I think you are better off with a regex. So my solution would be:
Regex of: ([0-9]+)[^;]+;([A-Za-z0-9]+)
Which for all the example text provides 3 matches. ie for:
(2009 RX7)</font></td>
0: 2009 RX7)<
1: 2009
2: RX7
I haven't coded this up, but did test the Regex at www.regextester.com
Regexes are implemented via NSRegularExpression, which is available in iOS 4.0 and later.
Edit
Given that this appears to be a web scraping application, you never know when those pesky HTML code monkeys will change their output and break your carefully crafted matching methodology. As such I would change my regex to:
([0-9]+)([^;]+;)+([A-Za-z0-9]+)
Which adds an extra group, but allows for any number of elements between the number and the string.
Try this code:
NSString *str = @">&nbsp;(2009&nbsp;RX7)</font></td>";
// Search backwards for the closing tag (lowercase, to match the input)
NSRange fontRange = [str rangeOfString:@"</font>" options:NSBackwardsSearch];
NSRange lastSemi = [str rangeOfString:@";" options:NSBackwardsSearch range:NSMakeRange(0, fontRange.location)];
NSRange priorSemi = [str rangeOfString:@";" options:NSBackwardsSearch range:NSMakeRange(0, lastSemi.location)];
// The second argument of NSMakeRange is a length, not an end index
NSUInteger start = priorSemi.location + 1;
NSString *yourString = [str substringWithRange:NSMakeRange(start, fontRange.location - start)];
The key element here is the NSBackwardsSearch search option.
This should do the trick:
NSString *s = @">monospace\" size=\"-1\">&nbsp;4155&nbsp;Accord</font></td>";
NSArray *strArray = [s componentsSeparatedByString:@";"];
// you're interested in last two objects
NSArray *tmp = [strArray subarrayWithRange:NSMakeRange(strArray.count - 2, 2)];
In tmp you'll have something like:
"4155&nbsp",
"Accord</font></td>"
Strip the unneeded characters and you're all set.
Using NSRegularExpression:
NSRegularExpression *regex;
NSTextCheckingResult *match;
NSString *pattern = @"([0-9]+) ([A-Za-z0-9]+)[)]?</font></td>";
NSString *string = @"> (2009 RX7)</font></td>";
regex = [NSRegularExpression
regularExpressionWithPattern:pattern
options:NSRegularExpressionCaseInsensitive
error:nil];
match = [regex firstMatchInString:string options:0 range:NSMakeRange(0, [string length])];
NSLog(@"'%@'", [string substringWithRange:[match rangeAtIndex:1]]);
NSLog(@"'%@'", [string substringWithRange:[match rangeAtIndex:2]]);
NSLog output:
'2009'
'RX7'

what's the best way to detect if a word in NSString has a number?

Example: a word with a number in a string
NSString *str = [NSString stringWithFormat:@"this is an 101 example1 string"];
example1 has a number at the end, and I want to remove it. I could break the string into an array and filter it with a predicate, but that seems slow, since I need to do this about a million times.
What would be a more efficient way?
Thanks!
Probably NSRegularExpression. I think ([^0-9 ]+)\d+|\d+([^0-9 ]+) should do it. Just replace it with $1.
Based on Chuck's response, here is the complete code in case someone might find it useful:
NSError *error = nil;
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@"([^0-9 ]+)\\d+|\\d+([^0-9 ]+)"
options:NSRegularExpressionCaseInsensitive
error:&error];
// str is the input string from the question
NSString *modifiedString = [regex stringByReplacingMatchesInString:str
options:0
range:NSMakeRange(0, [str length])
withTemplate:@"$1"];
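Since this has to run about a million times, it is probably worth building the regex once and reusing it. The sketch below does that, and it swaps in a different pattern from the one above: lookarounds instead of capture groups, so only the digits are matched and the replacement template can be empty. Both helper names are hypothetical.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: build the regex once and reuse it for every string.
static NSRegularExpression *DigitsInWordsRegex(void) {
    // Digits directly after a non-digit, non-space character,
    // or digits directly before one. Standalone numbers do not match.
    return [NSRegularExpression regularExpressionWithPattern:@"(?<=[^0-9\\s])\\d+|\\d+(?=[^0-9\\s])"
                                                     options:0
                                                       error:NULL];
}

// Hypothetical helper: removes digits attached to words, keeps pure numbers.
static NSString *StripDigitsFromWords(NSString *text, NSRegularExpression *regex) {
    return [regex stringByReplacingMatchesInString:text
                                           options:0
                                             range:NSMakeRange(0, text.length)
                                      withTemplate:@""];
}
```

For @"this is an 101 example1 string" this gives @"this is an 101 example string": the standalone 101 stays, and the 1 attached to example is removed. Create the regex once outside your loop and pass it in for each string.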