Confusion with case used by CFURLCreateStringByAddingPercentEscapes encoding - iphone

I need to URL-encode a string. My input string is "ChBdgzQ3qUpNRBEHB+bOXQNjRTQ="
I get "ChBdgzQ3qUpNRBEHB%2BbOXQNjRTQ%3D" as output, which is correct except for the case of the hex digits in the escapes.
Ideally, it should have been "ChBdgzQ3qUpNRBEHB%2bbOXQNjRTQ%3d".
That is, I should have got %2b and %3d instead of %2B and %3D.
Can this be done?
The code I used is as below :
NSString* inputStr = @"ChBdgzQ3qUpNRBEHB+bOXQNjRTQ=";
NSString* outputStr = (NSString *)CFURLCreateStringByAddingPercentEscapes(NULL,
    (CFStringRef)inputStr,
    NULL,
    (CFStringRef)@"!*'\"();:@&=+$,/?%#[]% ",
    CFStringConvertNSStringEncodingToEncoding(encoding));

Another, perhaps more elegant but slower, way would be to loop over your string and convert it one character at a time: get the length of your string, then take one-character substrings at locations 0 through length-1 and translate each substring on its own. If the returned string has a length > 1, then CFURLCreateStringByAddingPercentEscapes encoded the character, and you can safely convert the result to lower case.
In all cases you append the returned (and possibly modified) string to a mutable string, and when done you have exactly what you want for any possible string. Even though this would appear to be a real processor hog, the reality is you would probably never notice the extra consumed cycles.
Likewise, a second approach would be to just convert your whole string first, then copy it byte by byte to a mutable string, and if you find a "%", then turn the next two characters into lower case. Just a slightly different way to slice the problem.
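The second approach above (scan the already-encoded string and lower-case the two characters after each "%") can be sketched in plain C, which compiles unchanged inside an Objective-C file. This is only an illustration: the function name lowercase_percent_escapes is made up here, and it assumes you have the encoded text in a mutable C buffer (for example a copy of the bytes from -UTF8String).

```c
#include <ctype.h>

/* Hypothetical helper: lower-cases the hex digits of every %XX escape
 * in place. A '%' that is not followed by two hex digits is left alone. */
void lowercase_percent_escapes(char *s)
{
    for (; *s != '\0'; s++) {
        if (s[0] == '%' &&
            isxdigit((unsigned char)s[1]) &&
            isxdigit((unsigned char)s[2])) {
            s[1] = (char)tolower((unsigned char)s[1]);
            s[2] = (char)tolower((unsigned char)s[2]);
            s += 2; /* skip the two digits that were just handled */
        }
    }
}
```

Because the replacement never changes the length of the string, the conversion can be done in place without any extra allocation.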

You can use a regular expression to perform the post operation:
NSMutableString *finalStr = [outputStr mutableCopy];
NSRegularExpression *re = [[NSRegularExpression alloc] initWithPattern:@"(?<=%)[0-9A-F]{2}" options:0 error:nil];
// Match against the immutable outputStr and patch the mutable copy.
for (NSTextCheckingResult *match in [re matchesInString:outputStr options:0 range:NSMakeRange(0, outputStr.length)]) {
    [finalStr replaceCharactersInRange:match.range withString:[[outputStr substringWithRange:match.range] lowercaseString]];
}
The code uses this regular expression:
(?<=%)[0-9A-F]{2}
It matches two hexadecimal characters, only if preceded by a percent sign. Each match is then iterated and replaced within a mutable string. We don't have to worry about offset changes because the replacement string is always the same length.

Searching for multiple strings in an NSString

Is it possible in Objective C to search an NSString for a number of different strings at the same time?
For example, I want to search for all occurrences of the strings "good", "great", "awesome", "incredible", "fantastic" and "brilliant" in a very long string.
My first thought is to use NSString:rangeOfString: and cycle through multiple times (once for each string), but it strikes me that with longer sets of strings, this may become inefficient and slow.
Is there an in-built way of searching for multiple strings like this, or should I create my own method?
EDIT: The results are in!
After finding some time to benchmark, I found that the RegEx method is indeed slower (more than 2x slower) than the looping rangeOfString method. The numbers, for your delectation, are as follows:
With a list of 150,000 words (~1,103,500 characters) and 20 match-words, with 5412 matches present
NSString:rangeOfString search = 231.077ms
Regular Expression search = 530.113ms
Quoting the question: "it strikes me that with longer sets of strings, this may become inefficient and slow."
So, have you benchmarked it? If not, then you don't have the right to judge it as "inefficient" and "slow". Premature optimization is evil. Just stick with those nice and simple for loops and the -[NSString rangeOfString:] method.
But: to actually answer your question, it's not impossible to avoid the manual looping. If you use NSRegularExpression with a regex like good|great|awesome, then you can find all occurrences in one pass. The use of regular expressions would probably be slower than a simple string search, though.
Regular expressions are so widely used that the implementation will be efficient. Specifically, a regex match will traverse the input string once.
NSRegularExpression *regex =
    [NSRegularExpression regularExpressionWithPattern: @"(good|great|...)"
                                              options: NSRegularExpressionCaseInsensitive
                                                error: ...];
NSArray *matches = [regex matchesInString: string
options: 0
range: NSMakeRange(0, [string length])];
for (NSTextCheckingResult *match in matches)
...
Here is a test snippet:
NSString *string = @"not good nor great";
// as above
for (NSTextCheckingResult *match in matches)
    NSLog (@"Match: %@", match);
produces:
2013-08-22 10:21:11.644 foo[2454:707] Match: <NSSimpleRegularExpressionCheckingResult: 0x7fc954301650>{4, 4}{<NSRegularExpression: 0x7fc9543001c0> (good|great) 0x1}
2013-08-22 10:21:11.644 foo[2454:707] Match: <NSSimpleRegularExpressionCheckingResult: 0x7fc954301540>{13, 5}{<NSRegularExpression: 0x7fc9543001c0> (good|great) 0x1}
Yes, internally the NSString is a data blob of unichars. You could retrieve a pointer to that and then have multiple queues search parts of it, though you'd have to make sure that you divide on whitespace characters so that you don't miss a word that straddles two ranges.
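For comparison, the plain looping search that the benchmark above favoured can be sketched in C (usable as-is from Objective-C). This is a rough illustration only: count_keyword_hits is a made-up name, the search is case-sensitive, and it matches substrings rather than whole words, so "goodness" would count as a hit for "good".

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: counts non-overlapping occurrences of each keyword
 * in text by running one strstr() scan per keyword. */
size_t count_keyword_hits(const char *text,
                          const char *const keywords[],
                          size_t keyword_count)
{
    size_t hits = 0;
    for (size_t k = 0; k < keyword_count; k++) {
        size_t len = strlen(keywords[k]);
        if (len == 0)
            continue; /* an empty keyword would match everywhere */
        for (const char *p = strstr(text, keywords[k]);
             p != NULL;
             p = strstr(p + len, keywords[k])) {
            hits++;
        }
    }
    return hits;
}
```

The text is scanned once per keyword, which is the same shape as calling rangeOfString: in a loop.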

How to use regular expression in iPhone app to separate string by , (comma)

I have to read a .csv file which has three columns. While parsing the .csv file, I get a string in this format: Christopher Bass,\"Cry the Beloved Country Final Essay\",cbass@cgs.k12.va.us. I want to store the values of the three columns in an array, so I used the componentsSeparatedByString:@"," method. It successfully returns an array with three components:
Christopher Bass
Cry the Beloved Country Final Essay
cbass@cgs.k12.va.us
but when there is already a comma in a column value, like this
Christopher Bass,\"Cry, the Beloved Country Final Essay\",cbass@cgs.k12.va.us
it separates the string into four components because there is a comma after "Cry":
Christopher Bass
Cry
the Beloved Country Final Essay
cbass@cgs.k12.va.us
So how can I handle this using a regular expression? I have the RegexKitLite classes, but which regular expression should I use? Please help!
Thanks-
Any regular expression would probably turn out with the same problem, what you need is to sanitize your entries or strings, either by escaping your commas or by highlighting strings this way: "My string". Otherwise you will have the same problem. Good luck.
For your example you would probably need to do something like:
\"Christopher Bass\",\"Cry\, the Beloved Country Final Essay\",\"cbass@cgs.k12.va.us\"
That way you could use a regexp or even the same method from the NSString class.
Not related at all, but the importance of sanitizing strings: http://xkcd.com/327/ hehehe.
How about this:
componentsSeparatedByRegex:@",\\\"|\\\","
This should split your string wherever " and , appear together in either order, resulting in a three-member array. This of course assumes that the second element in the string is always enclosed in quotation marks, and that the characters " and , never appear consecutively within the three components.
If either of these assumptions is incorrect, other methods to identify string components may be used, but it should be made clear that no generic solution exists. If the three component strings can contain " and , anywhere, not even a limited solution is possible in such cases:
Doe, John,\"\"Why Unescaped Strings Suck\", And Other Development Horror Stories\",Doe, John <john.doe@dev.null>
Hopefully there is nothing like the above in your CSV data. If there is, the data is basically unusable, and you should look into a better CSV exporter.
The regex you're searching for is: \\"(.*)\\"[ ^,]*|([^,]*),
in ObjC: (('\"' && string_1 && '\"' && 0-n spaces) || string_2 except comma) && comma
NSString *str = @"Christopher Bass,\"Cry, the Beloved Country ,Final Essay\",cbass@cgs.k12.va.us,som";
NSString *regEx = @"\\\"(.*)\\\"[ ^,]*|([^,]*),";
NSMutableArray *split = [[str componentsSeparatedByRegex:regEx] mutableCopy];
[split removeObject:@""]; // both capture groups are always returned, even when one of them is empty
NSLog(@"%@", split);
// OUTPUT:
2012-02-07 17:42:18.778 tmpapp[92170:c03] (
"Christopher Bass",
"Cry, the Beloved Country ,Final Essay",
"cbass@cgs.k12.va.us",
som
)
RegexKitLite will add both capture groups to the array, so you will end up with empty objects in your array. removeObject:@"" will delete those, but if you need to keep genuinely empty values (e.g. your source has val,,ue) you have to modify the code to the following:
str = [str stringByReplacingOccurrencesOfRegex:regEx withString:@"$1$2∏"];
NSArray *split = [str componentsSeparatedByString:@"∏"];
$1 and $2 are those two strings mentioned above, ∏ is in this case a character which will most likely never appear in normal text (and is easy to remember: option-shift-p).
The last part looks like it will never contain a comma. Neither will the first one as far as I can see...
What about splitting the string like this:
NSArray *splitArr = [str componentsSeparatedByString:@","];
NSString *nameStr = [splitArr objectAtIndex:0];
NSString *emailStr = [splitArr lastObject];
NSString *contentStr = @"";
for(int i=1; i<[splitArr count]-1; ++i) {
    // Put back the commas that the split removed from inside the content.
    if (i > 1) contentStr = [contentStr stringByAppendingString:@","];
    contentStr = [contentStr stringByAppendingString:[splitArr objectAtIndex:i]];
}
This will use the first and last string as is, and combine the rest into the content.
Kind of a hack, but a name and an email address will never contain a comma, right?
Is the title guaranteed to have the quotation marks? And is it the only component that can have them? Because then componentsSeparatedByString:@"\"" should get you this:
Christopher Bass,
Cry, the Beloved Country Final Essay
,cbass@cgs.k12.va.us
Then use componentsSeparatedByString:@"," or substringFrom/ToIndex: to get rid of the two commas in the first and last component.
Here's a solution using substring:
NSString* input = @"Christopher Bass,\"Cry, the Beloved Country Final Essay\",cbass@cgs.k12.va.us";
NSArray* split = [input componentsSeparatedByString:@"\""];
NSString* part1 = [split objectAtIndex:0];
NSString* part2 = [split objectAtIndex:1];
NSString* part3 = [split objectAtIndex:2];
part1 = [part1 substringToIndex:[part1 length] - 1]; // drop the trailing comma
part3 = [part3 substringFromIndex:1]; // drop the leading comma
// Never pass data as the format string; log it through %@ instead.
NSLog(@"%@", part1);
NSLog(@"%@", part2);
NSLog(@"%@", part3);
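Several of the answers above boil down to "treat commas inside quotation marks differently". As an alternative to a regex, that idea can be written as a small character-by-character scanner. The sketch below is plain C (so it can be dropped into an Objective-C file) and is not taken from any of the answers: split_csv_line is a hypothetical name, it assumes quotes are plain " characters with "" standing for a literal quote inside a quoted field, and it does not handle fields that span several lines.

```c
#include <stddef.h>

/* Hypothetical helper: splits one CSV line in place.
 * Commas inside double quotes do not end a field, the surrounding quotes
 * are dropped, and "" inside a quoted field becomes one literal quote.
 * Pointers to the fields are stored in fields[]; at most max_fields are
 * stored and anything after that is ignored. Returns the field count. */
size_t split_csv_line(char *line, char *fields[], size_t max_fields)
{
    size_t count = 0;
    char *read = line;   /* next character to examine */
    char *write = line;  /* where the next output character goes */
    int in_quotes = 0;

    if (max_fields == 0)
        return 0;
    fields[count++] = write;

    while (*read != '\0') {
        char c = *read++;
        if (in_quotes) {
            if (c != '"') {
                *write++ = c;
            } else if (*read == '"') {
                *write++ = '"';  /* doubled quote inside a quoted field */
                read++;
            } else {
                in_quotes = 0;   /* closing quote */
            }
        } else if (c == '"') {
            in_quotes = 1;       /* opening quote */
        } else if (c == ',') {
            *write++ = '\0';     /* end of the current field */
            if (count == max_fields)
                return count;
            fields[count++] = write;
        } else {
            *write++ = c;
        }
    }
    *write = '\0';
    return count;
}
```

Run on the line from the question, with the title in quotes, this yields three fields and keeps the comma after "Cry" inside the second one.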

Splitting a number off prefix of a string on iPhone

Say I have a string like "123alpha". I can use NSNumber to get the 123 out, but how can I determine the part of the string that NSNumber didn't use?
You can use NSScanner to both get the value and the rest of the string.
NSString *input = @"123alpha";
NSScanner *scanner = [NSScanner scannerWithString:input];
float number;
[scanner scanFloat:&number];
NSString *rest = [input substringFromIndex:[scanner scanLocation]];
If it is important to know exactly what is left after parsing the value this is a better approach than trying to trim characters. While I can't think of any particular bad input at the moment that would fail the solution suggested by the OP in the comment to this answer, it looks like a bug waiting to happen.
If your numbers are always at the beginning or end of a string and you want only the remaining characters, you could trim with a character set.
NSString *alpha = @"123alpha";
NSString *stripped = [alpha stringByTrimmingCharactersInSet:[NSCharacterSet characterSetWithCharactersInString:@"0123456789"]];
If it starts out as a char * (as opposed to an NSString *), you can use strtol() to get the number and discover where the number ends in a single call.
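A minimal sketch of the strtol() suggestion, assuming a base-10 number at the start of a C string. The wrapper name split_leading_number is invented for this example; strtol() itself reports where parsing stopped through its second argument.

```c
#include <stdlib.h>

/* Hypothetical wrapper: returns the leading base-10 number in input and
 * sets *rest to the first character that strtol() did not consume.
 * If input does not start with a number, 0 is returned and *rest == input. */
long split_leading_number(const char *input, const char **rest)
{
    char *end = NULL;
    long value = strtol(input, &end, 10);
    *rest = end;
    return value;
}
```

For "123alpha" this gives 123 with rest pointing at "alpha". Comparing rest with input afterwards is a simple way to tell "no digits found" apart from a genuine 0.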

How do you split NSString into component parts?

In Xcode, if I have an NSString containing a number, i.e. @"12345", how do I split it into an array representing its component parts, i.e. "1", "2", "3", "4", "5"? There is a componentsSeparatedByString: method on NSString, but in this case there is no delimiter...
There is a ready-made NSString method for splitting on a delimiter:
NSString* foo = @"safgafsfhsdhdfs/gfdgdsgsdg/gdfsgsdgsd";
NSArray* stringComponents = [foo componentsSeparatedByString:@"/"];
It may seem like characterAtIndex: would do the trick, but that returns a unichar, which isn't an NSObject-derived data type and so can't be put into an array directly. You'd need to construct a new string with each unichar.
A simpler solution is to use substringWithRange: with 1-character ranges. Run your string through a simple for (int i=0;i<[myString length];i++) loop to add each 1-character range to an NSMutableArray.
An NSString already is an array of its components, if by components you mean single characters. Use [string length] to get the length of the string and [string characterAtIndex:] to get the characters.
If you really need an array of string objects with only one character you will have to create that array yourself. Loop over the characters in the string with a for loop, create a new string with a single character using [NSString stringWithFormat:] and add that to your array. But this usually is not necessary.
In your case, since you have no delimiter, you have to get separate chars by
- (void)getCharacters:(unichar *)buffer range:(NSRange)aRange
or this one
- (unichar)characterAtIndex:(NSUInteger) index inside a loop.
That's the only way I see at the moment.
Don't know if this works for what you want to do but:
const char *foo = [myString UTF8String];
char third_character = foo[2];
Make sure to read the docs on UTF8String
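Building on the C-string idea, here is a rough sketch of turning a string such as "12345" into one-character strings. It is only safe for single-byte (ASCII) text, since a UTF-8 buffer can use several bytes for one character; the name split_ascii_chars and the fixed-size output layout are assumptions made for this example.

```c
#include <stddef.h>

/* Hypothetical helper: copies each byte of s into its own NUL-terminated
 * one-character string in out[]. Stops at the end of s or after max_out
 * entries. Returns the number of entries written. ASCII input only. */
size_t split_ascii_chars(const char *s, char out[][2], size_t max_out)
{
    size_t n = 0;
    while (n < max_out && s[n] != '\0') {
        out[n][0] = s[n];
        out[n][1] = '\0';
        n++;
    }
    return n;
}
```

For text that may contain non-ASCII characters, the NSString-based loops described above are the safer route.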

How to get a part of the string from main String in Objective C

I have mainString, from which I need to get the part of the string that follows a keyword.
NSString *mainString = @"Hi how are you GET=dsjghdsghghdsjkghdjkhsg";
now I need to get the string after the keyword "GET=".
Waiting for a reply.
Have a look at the NSString documentation.
Assuming your string really is so totally straightforward, you could do something like this:
NSArray *components = [mainString componentsSeparatedByString: @"GET="];
NSString *stringYouWant = [components objectAtIndex: 1];
Obviously, this performs absolutely no error checking and makes a number of assumptions about the actual contents of mainString, but it should get you started.
Note, also, that the code is somewhat defensive in that it assumes that you are looking for GET= and not separating on =. Either way is a hack in terms of parsing, but... hey... hacks are sometimes the right answer.
You can use a regex via RegexKitLite:
NSString *mainString = @"Hi how are you GET=dsjghdsghghdsjkghdjkhsg";
NSString *matchedString = [mainString stringByMatching:@"GET=(.*)" capture:1L];
// matchedString == @"dsjghdsghghdsjkghdjkhsg";
The regex used, GET=(.*), basically says "Look for GET=, and then grab everything after that". The () specifies a capture group, which are useful for extracting just part of a match. Capture groups begin at 1, with capture group 0 being "the entire match". The part inside the capture group, .*, says "Match any character (the .) zero or more times (the *)".
If the string, in this case mainString, is not matched by the regex, then matchedString will be NULL.
You can get the location of the first occurrence of = and then just take a substring of mainString from the character after the = to the end of the string.
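The "find the keyword, then take everything after it" idea looks like this in plain C, for the case where the text is available as a C string. text_after_keyword is a hypothetical name; strstr() and strlen() are the standard library calls doing the work.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns a pointer to the text that follows the
 * first occurrence of keyword in haystack, or NULL if keyword is absent.
 * The result points into haystack; nothing is copied. */
const char *text_after_keyword(const char *haystack, const char *keyword)
{
    const char *hit = strstr(haystack, keyword);
    return hit != NULL ? hit + strlen(keyword) : NULL;
}
```

Unlike the objectAtIndex: 1 version, this reports a missing keyword by returning NULL, which the caller has to check.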