How to ignore whitespaces in regular expression - iphone

I am new to iPhone development and have a question about regular expressions. At present I am using the following regular expression in my project:
NSRegularExpression *regularExpression =
[NSRegularExpression regularExpressionWithPattern:@"href=\"(.*).zip\""
options:NSRegularExpressionCaseInsensitive
error:&error];
It searches the page source of a website and returns results that match the pattern below:
href="kjv/36_Zep.zip"
href="kjv/37_Hag.zip"
But one of the links in the page source looks like this:
href="kjv/38_Zec.zip "
I want to ignore the whitespace after the .zip.
How can this be done? If anybody knows, please help me.

One way is to replace all whitespace with the empty string, or to use a strip function on the string to remove trailing spaces. See "String replacement in Objective-C".
If you don't want to do that, use the whitespace pattern in your regular expression to match one or more whitespace characters.
\s includes \n (newline), \r (return), \t (tab), \v (vertical tab), \f (form feed) and space. If you want to match only a space, use " ", which is a literal blank space.

You can match the examples you provided with the following regex:
@"href=\"(.+)\\.zip\\s*\""
I modified your regex by adding:
1) + (matches 1 or more of the preceding character) so that the group captures the entire name before the .zip,
2) a backslash before the . so that it matches a literal dot instead of any character,
3) \s* to match (skip, in your case) zero or more whitespace characters.
The backslashes are doubled because the pattern sits inside an Objective-C string literal.
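The pattern from this answer can be tried out with a small helper like the one below. This is only a sketch: the function name ZipStemsInHTML and the sample input are made up for illustration, and it assumes each link sits on its own line (the greedy .+ would otherwise run on to the last .zip on the line).

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not from the original thread): returns the text
// captured before ".zip" for every href="...zip" found in html, allowing
// optional whitespace between ".zip" and the closing quote.
static NSArray<NSString *> *ZipStemsInHTML(NSString *html) {
    NSError *error = nil;
    // Backslashes are doubled so the regex engine receives \. and \s.
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"href=\"(.+)\\.zip\\s*\""
                                                  options:NSRegularExpressionCaseInsensitive
                                                    error:&error];
    if (regex == nil) {
        NSLog(@"Invalid pattern: %@", error);
        return @[];
    }
    NSMutableArray<NSString *> *stems = [NSMutableArray array];
    NSArray<NSTextCheckingResult *> *matches =
        [regex matchesInString:html options:0 range:NSMakeRange(0, html.length)];
    for (NSTextCheckingResult *match in matches) {
        // Range 1 is the first capture group, i.e. the name before ".zip".
        [stems addObject:[html substringWithRange:[match rangeAtIndex:1]]];
    }
    return stems;
}
```

Given the two lines href="kjv/37_Hag.zip" and href="kjv/38_Zec.zip " as input, the helper returns kjv/37_Hag and kjv/38_Zec, so the trailing space no longer prevents the match.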

Suppose you are given NSString *test = @"...href="/functions?q=KEYWORD\x26amp... " and you want to run an NSRegularExpression over it. You could also simply trim the string first, like this:
NSString *trimmed = [test stringByTrimmingCharactersInSet:[NSCharacterSet whitespaceAndNewlineCharacterSet]];
NSTextCheckingResult *result = [testRegex firstMatchInString:trimmed options:0 range:NSMakeRange(0, [trimmed length])];
That way you don't have to change anything in your NSRegularExpression. Note that trimming only removes whitespace at the start and end of the whole string, not inside it.

I commonly use groups to gather the item I want, but you need to know how groups work.
Unfortunately you cannot name them (at least not in the SDK this was written against), but think of it this way:
groups are indexed by number, in the order their opening ( is encountered.
0 is the entire match.
1 is the first set of ().
2 is the second set of (), and so on.
If you have a pattern like this:
NSString *matchString = @"(href)=\"((.*)[.]zip)\"";
you would have 4 groups.
Group 0 is the entire match, group 1 is the "href", group 2 is the entire filename and group 3 is the filename without the extension.
Hope that helps.
NSRegularExpression *regex =
[NSRegularExpression regularExpressionWithPattern:@"href=\"(.*[.]zip)[^\"]*\""
options:NSRegularExpressionCaseInsensitive
error:&error];
NSMutableArray *foundMatches = [NSMutableArray array];
[regex enumerateMatchesInString:originalString
options:0
range:NSMakeRange(0, [originalString length])
usingBlock:^(NSTextCheckingResult *result, NSMatchingFlags flags, BOOL *stop) {
if (result.numberOfRanges == 2){
[foundMatches addObject:[originalString substringWithRange:[result rangeAtIndex:1]]];
}
}];
The pattern I used here would go wrong if .zip appears in the filename somewhere other than the extension.
e.g. href="my.zip.file" would still match, and group 1 would be "my.zip" even though the link does not end in .zip.
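One possible way around that caveat is to keep the capture inside the quotes by using [^"]* instead of .*, and to require only optional whitespace between .zip and the closing quote. The sketch below is an alternative pattern, not the one from the answer; the function name ZipFilenamesInHTML and the sample strings are assumptions for illustration.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns every quoted href value that ends in ".zip",
// ignoring whitespace between ".zip" and the closing quote.
static NSArray<NSString *> *ZipFilenamesInHTML(NSString *html) {
    NSError *error = nil;
    // [^\"]* cannot cross a quote, so each match stays inside one attribute.
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"href=\"([^\"]*\\.zip)\\s*\""
                                                  options:NSRegularExpressionCaseInsensitive
                                                    error:&error];
    if (regex == nil) {
        NSLog(@"Invalid pattern: %@", error);
        return @[];
    }
    NSMutableArray<NSString *> *names = [NSMutableArray array];
    [regex enumerateMatchesInString:html
                            options:0
                              range:NSMakeRange(0, html.length)
                         usingBlock:^(NSTextCheckingResult *result, NSMatchingFlags flags, BOOL *stop) {
        // Two ranges are expected: the whole match and capture group 1.
        if (result.numberOfRanges == 2) {
            [names addObject:[html substringWithRange:[result rangeAtIndex:1]]];
        }
    }];
    return names;
}
```

With this pattern, href="my.zip.file.zip" yields my.zip.file.zip, href="kjv/38_Zec.zip " yields kjv/38_Zec.zip, and href="my.zip.file" is skipped because the value does not end in .zip.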

Related

NSRegularExpression is not working correctly

Please, I need your help.
Here is the relevant part of my code; I cannot find where my mistake is:
NSString *inputString = @"11111111111";
NSError *error = nil;
NSRegularExpression *regExpression = [NSRegularExpression regularExpressionWithPattern:@"[[a-zA-Z]]*"
options:NSRegularExpressionCaseInsensitive error:&error];
NSUInteger numberOfMatches = [regExpression numberOfMatchesInString:inputString
options:0
range:NSMakeRange(0, [inputString length])];
NSLog(@"numberOfMatches=%lu", (unsigned long)numberOfMatches);
// this prints "numberOfMatches=7"
But when I checked the result here, that answer appears to be incorrect:
http://gskinner.com/RegExr/
So my question is: where is my mistake?
I don't know why you are getting only 7 matches; I think there should be 12. You don't specify any requirements, so I am guessing a bit.
The problem is your quantifier *. It matches 0 or more, which means [[a-zA-Z]]* also matches when it finds 0 characters (the empty string), and an empty string is found before every digit and at the end of the string.
It will probably help to use the + quantifier, which matches 1 or more. So maybe the regex [a-z]+ is what you want.
By the way, [[a-zA-Z]]* is most probably wrong; I think you want [a-zA-Z]. Also, when you use options:NSRegularExpressionCaseInsensitive you don't need to list both upper and lower case letters in your character class; [a-z] would be fine.
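To see the effect of the + quantifier, a small counting helper can be used. This is a sketch under assumptions: the function name CountMatches and the sample inputs are invented for illustration and are not part of the original question.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: counts how many times pattern matches in input,
// using case-insensitive matching as in the question.
static NSUInteger CountMatches(NSString *pattern, NSString *input) {
    NSError *error = nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:pattern
                                                  options:NSRegularExpressionCaseInsensitive
                                                    error:&error];
    if (regex == nil) {
        NSLog(@"Invalid pattern: %@", error);
        return 0;
    }
    return [regex numberOfMatchesInString:input
                                  options:0
                                    range:NSMakeRange(0, input.length)];
}
```

Because + requires at least one letter, CountMatches(@"[a-z]+", @"11111111111") is 0, while CountMatches(@"[a-z]+", @"abc 123 DEF") is 2 (the case-insensitive option lets [a-z] match DEF as well).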
I think you have to change the regular expression to [[a-z]*]
You can give this a try if your input is 1111...
NSRegularExpression * regExpression = [NSRegularExpression
regularExpressionWithPattern:@"\\W-?1?[0-9]{2}(\\.[0-9]{1,2})?\\W"
options:0
error:&error];

NSRegularExpression omitting certain character

So I had the following regex:
@"(@|#)\\S+"
However, the \S here matches @ and # as well. How do I change this regex so that it behaves like \S but does not include @ or #?
Basically I want a non-whitespace character excluding @ and #.
Try (@|#)[^\s@#]+. [^\s@#] will match everything except whitespace characters, @ and #.
And remember to double-escape the \ when you put it in an Objective-C string literal.
To exclude a set of characters, you simply put ^ at the start of a character class, e.g. [^@#].
You can use the following:
NSString *string = @"Your String";
NSError *error = nil;
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@"(@|#|\\s)" options:NSRegularExpressionCaseInsensitive error:&error];
NSString *modifiedString = [regex stringByReplacingMatchesInString:string options:0
range:NSMakeRange(0, [string length]) withTemplate:@""];
As discussed on the other thread where you originally asked this question,
@"(@|#)\\w+"
will do what I think you want. If you really want every character except @, #, and whitespace, then
@"[@#][^@#\\s]+"
should do it. Both of these will take your string:
@"#baz#marroon#red#blue #big#cat#dog"
and, if you use matchesInString:options:range:, give you:
@"#baz"
@"#marroon"
@"#red"
@"#blue"
@"#big"
@"#cat"
@"#dog"
If this is not what you want, you should give us the input string and what you want as output, and we can tell you how to get it.
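For reference, here is a minimal sketch of how the second pattern could be run with matchesInString:options:range:. The function name TagsInString and the sample input are assumptions made for this example.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns every token that starts with @ or # and
// continues up to the next @, # or whitespace character.
static NSArray<NSString *> *TagsInString(NSString *input) {
    NSError *error = nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"[@#][^@#\\s]+"
                                                  options:0
                                                    error:&error];
    if (regex == nil) {
        NSLog(@"Invalid pattern: %@", error);
        return @[];
    }
    NSMutableArray<NSString *> *tags = [NSMutableArray array];
    NSArray<NSTextCheckingResult *> *matches =
        [regex matchesInString:input options:0 range:NSMakeRange(0, input.length)];
    for (NSTextCheckingResult *match in matches) {
        // match.range covers the whole token, including the leading @ or #.
        [tags addObject:[input substringWithRange:match.range]];
    }
    return tags;
}
```

For the input @baz#marroon@red #big this returns @baz, #marroon, @red and #big, because each token stops as soon as the next @, # or space is reached.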

Array contains extra symbols other than those I added: trimming the array in an iPhone app using Objective-C

In my iPhone app I am adding string values to an array. At first the value of strSelectedDir will be
xxx-jan16-2011-10.30AM and later its value will be xxx-feb16-2011-02.30PM. I am adding these 2 values to the array arrDownloadedDirNames using the following code:
[arrDownloadedDirNames addObject:strSelectedDir];
But in the output array some newline symbols (\n) and symbols like \ and " appear, as shown below:
(
"(\n \"xxx-jan16-2011-10.30AM\"\n)",
"xxx-feb16-2011-02.30PM"
)
But I want the array to look like this, with no extra symbols other than those in the input strings:
(
xxx-jan16-2011-10.30AM ,
xxx-feb16-2011-02.30PM
)
How can I do this? Why are the extra symbols added, and how can I remove them?
Can anyone please help me? Thanks in advance.
I guess your string comes from a parser or something, right?
The extra symbols you see are called escape sequences:
\n = line break
\" = "
You can replace these characters pretty easily:
NSMutableString *yourNewString = [NSMutableString stringWithFormat:@"%@", [arrDownloadedDirNames objectAtIndex:i]];
[yourNewString replaceOccurrencesOfString:@"\n \"" withString:@"" options:NSLiteralSearch range:NSMakeRange(0, [yourNewString length])];
[yourNewString replaceOccurrencesOfString:@"\"\n" withString:@"" options:NSLiteralSearch range:NSMakeRange(0, [yourNewString length])];
Cheers
nettz

NSRegularExpression: How to match word starting with @

I'd like to convert references to usernames such as @me, @you, @them (Twitter style) to links. What would be the correct NSRegularExpression pattern for this?
Based on the input from answers below, here's what I ended up using...
NSError *error = NULL;
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@"(?<!\\w)@([\\w\\._-]+)?" options:NSRegularExpressionCaseInsensitive error:&error];
NSString *newString = [regex stringByReplacingMatchesInString:stringIn options:0 range:NSMakeRange(0, [stringIn length]) withTemplate:@"<a href='http://mydomain.com/$1'>$0</a>"];
This may not be a perfect match for Twitter, as my own site supports dots, dashes and underscores.
(?<!\w)@\w+ should be pretty safe.
The (?<!\w) is called a negative lookbehind and makes sure there is no word character before the @, which prevents matching email addresses.
@[a-zA-Z0-9_]+ should get the job done pretty well.
I would recommend throwing some test data into RegExr and fiddling with the regex until it matches exactly what you need.

NSRegularExpression's numberOfMatchesInString always returns one!

I have the following code:
NSString *text = @"http://bit.ly/111 http://bit.ly/222 http://www.www.www";
NSRegularExpression *aLinkRegex = [NSRegularExpression regularExpressionWithPattern:@".*http://.*" options:NSRegularExpressionCaseInsensitive error:nil];
NSUInteger numberOfMatches = [aLinkRegex numberOfMatchesInString:text options:0 range:NSMakeRange(0, [text length])];
I want to find the number of http's in the text (I know this isn't a good regex), but numberOfMatchesInString always returns 1, while it should return 3 in the above code.
Could someone please tell me what's wrong with the above code?
Cheers,
There is only one match, because your regular expression matches the first http:// and the .* "eats" the rest of the string.
Why not search for something more like:
http://
or if you are trying to capture each URL in full, something like:
http://[^ ]*
Which means search for anything after http:// that is not a space.
You should really look into reading through some kind of regular expression guide.
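As a rough sketch of the second suggestion, the helper below collects each URL using http://[^ ]*. The function name LinksInText is hypothetical; the sample text is the one from the question.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns every substring that starts with http://
// and runs up to (but not including) the next space.
static NSArray<NSString *> *LinksInText(NSString *text) {
    NSError *error = nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"http://[^ ]*"
                                                  options:NSRegularExpressionCaseInsensitive
                                                    error:&error];
    if (regex == nil) {
        NSLog(@"Invalid pattern: %@", error);
        return @[];
    }
    NSMutableArray<NSString *> *links = [NSMutableArray array];
    NSArray<NSTextCheckingResult *> *matches =
        [regex matchesInString:text options:0 range:NSMakeRange(0, text.length)];
    for (NSTextCheckingResult *match in matches) {
        [links addObject:[text substringWithRange:match.range]];
    }
    return links;
}
```

For the text in the question this returns three entries (http://bit.ly/111, http://bit.ly/222 and http://www.www.www), so the count of the returned array is 3 rather than 1.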