Format NSString (Char-Replacement, UTF, ...) - iphone

The Google Maps API returns a string that contains the German letters ö, ä, ü and probably several other special characters.
The string looks like:
@" (several spaces ...) Frankfurt an der Oder (several spaces ...) "
(1) If I use stringByReplacing ... to make the spaces disappear, I get:
@"FrankfurtanderOder" ... which is even worse. So I need to delete the spaces before the first word and after the last word, but not the spaces in between. How can I do this?
(2) Sometimes Google returns @"W\U00fcrzburg, Deutschland"
... the JSON request says nothing about encodings ... could the JSON parser, and not the API, be the problem?
Either way, I still have to solve it. Any ideas?
Thank you!
EDIT:
For (2) I'll use a workaround and replace some escaped characters ... (even if this is definitely not the best solution ...)
&auml; -> ä
&ouml; -> ö
&uuml; -> ü
&Auml; -> Ä
&Ouml; -> Ö
&Uuml; -> Ü
&szlig; -> ß
&quot; -> "
\u00C4 -> Ä
\u00E4 -> ä
\u00D6 -> Ö
\u00F6 -> ö
\u00DC -> Ü
\u00FC -> ü
\u00DF -> ß

Use -stringByTrimmingCharactersInSet:
NSString *str = @" Frankfurt an der Oder ";
NSString *trimmed = [str stringByTrimmingCharactersInSet:[NSCharacterSet whitespaceAndNewlineCharacterSet]];
NSLog(@"\"%@\"", str);
NSLog(@"\"%@\"", trimmed);
2012-03-26 14:10:49.302 xx[3752:f803] " Frankfurt an der Oder "
2012-03-26 14:10:49.333 xx[3752:f803] "Frankfurt an der Oder"
About the ü: does the \U00fc appear in a UILabel, or did you only get it from NSLog? In my experience NSLog sometimes prints the escaped form instead of the decoded letter, while the same string displays correctly in interface elements.

You need a few steps here:
#import <Foundation/Foundation.h>
#include <assert.h>

NSString *unescapeBackslashes(NSString *input)
{
    NSUInteger index = 0;
    NSRange range;
    NSMutableString *output = [NSMutableString string];
    // find occurrences of "\u" (pass NSCaseInsensitiveSearch as the option to match "\U" as well)
    while ((range = [input rangeOfString:@"\\u" options:0 range:NSMakeRange(index, input.length - index)]).location != NSNotFound) {
        // four hex digits must follow the "\u"
        assert(input.length >= range.location + 6);
        // copy the text that precedes the escape
        [output appendString:[input substringWithRange:NSMakeRange(index, range.location - index)]];
        // parse the four hex digits and append the Unicode character
        NSString *hex = [input substringWithRange:NSMakeRange(range.location + 2, 4)];
        [output appendFormat:@"%C", (unichar)strtol([hex UTF8String], NULL, 16)];
        index = range.location + 6;
    }
    // copy whatever follows the last escape
    [output appendString:[input substringWithRange:NSMakeRange(index, input.length - index)]];
    return output;
}

int main(int argc, const char *argv[])
{
    @autoreleasepool {
        NSString *input = @" W\\u00fcrzburg, Deutschland ";
        NSLog(@"Input: %@", input);
        NSString *trimmed = [input stringByTrimmingCharactersInSet:[NSCharacterSet whitespaceCharacterSet]];
        NSString *unescaped = unescapeBackslashes(trimmed);
        NSLog(@"Trimmed: %@", trimmed);
        NSLog(@"Unescaped: %@", unescaped);
    }
    return 0;
}
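For comparison, the same unescaping idea can be sketched in plain C (which also compiles inside an Objective-C file) for a UTF-8 byte buffer. The names unescape_unicode and put_utf8 are made up for this illustration. The sketch assumes exactly four hex digits per escape, accepts both \u and \U, handles code points up to U+FFFF only, and does not combine surrogate pairs.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: writes code point cp (at most U+FFFF) to out as UTF-8
   and returns the number of bytes written (1 to 3). */
static size_t put_utf8(unsigned long cp, char *out)
{
    if (cp < 0x80) {
        out[0] = (char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (char)(0xC0 | (cp >> 6));
        out[1] = (char)(0x80 | (cp & 0x3F));
        return 2;
    }
    out[0] = (char)(0xE0 | (cp >> 12));
    out[1] = (char)(0x80 | ((cp >> 6) & 0x3F));
    out[2] = (char)(0x80 | (cp & 0x3F));
    return 3;
}

/* Copies in to out, replacing every \uXXXX or \UXXXX escape with UTF-8 bytes.
   out must hold at least strlen(in) + 1 bytes; the result never grows, because
   a six-byte escape becomes at most three bytes. */
void unescape_unicode(const char *in, char *out)
{
    assert(in != NULL && out != NULL);
    while (*in != '\0') {
        if (in[0] == '\\' && (in[1] == 'u' || in[1] == 'U')
            && isxdigit((unsigned char)in[2]) && isxdigit((unsigned char)in[3])
            && isxdigit((unsigned char)in[4]) && isxdigit((unsigned char)in[5])) {
            char hex[5];
            memcpy(hex, in + 2, 4);   /* the four hex digits */
            hex[4] = '\0';
            out += put_utf8(strtoul(hex, NULL, 16), out);
            in += 6;
        } else {
            *out++ = *in++;           /* ordinary byte, copy unchanged */
        }
    }
    *out = '\0';
}
```

Calling unescape_unicode("W\\u00fcrzburg, Deutschland", buffer) leaves the UTF-8 bytes for "Würzburg, Deutschland" in buffer, which could then be handed to +[NSString stringWithUTF8String:]. Incomplete escapes such as "\u12" are copied through unchanged.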

Related

Read new lines with NSScanner

I'm trying to read the characters between words in a string.
NSCharacterSet* whiteSpace = [NSCharacterSet characterSetWithCharactersInString:@" \n\r\t"];
NSScanner* testScanner = [NSScanner scannerWithString:@"space newline\n space space newline\r end"];
while([testScanner isAtEnd] == NO) {
NSString* spaceBetweenWords = @"";
[testScanner scanUpToCharactersFromSet:whiteSpace intoString:NULL];
[testScanner scanCharactersFromSet:whiteSpace intoString:&spaceBetweenWords];
NSLog(@"x%@x", spaceBetweenWords);
}
the output is:
xx
xx
xx
xx
I would expect it to be:
x x
x
x
x x
x
x
Any ideas how to make it work?
NSScanner skips whitespace and newline characters by default.
To fix, add this before the loop:
[testScanner setCharactersToBeSkipped:nil];

UILabel Convert Unicode(Japanese) and display

After hours of research I gave up.
I receive text data from a web service. In some cases the text is in Japanese, and the web service returns it in escaped Unicode form. For example: \U00e3\U0082\U008f
I know that this is a Japanese character.
I am trying to display this Unicode character or string inside a UILabel.
Since a plain setText: doesn't display the correct characters, I used this (copied) routine:
unichar unicodeValue = (unichar) strtol([[[p innerData] valueForKey:@"title"] UTF8String], NULL, 16);
char buffer[2];
int len = 1;
if (unicodeValue > 127) {
buffer[0] = (unicodeValue >> 8) & (1 << 8) - 1;
buffer[1] = unicodeValue & (1 << 8) - 1;
len = 2;
} else {
buffer[0] = unicodeValue;
}
[[cell title] setText:[[NSString alloc] initWithBytes:buffer length:len encoding:NSUTF8StringEncoding] ];
But no success: the UILabel is empty.
I know that one way could be to convert the characters to hex and then from hex to a string... is there a simpler way?
SOLVED
First you must make sure that your server is sending UTF-8 and not escaped Unicode code points. The only way I found is to json_encode strings which contain Unicode characters.
Then, on iOS, unescape the text as described in "Using Objective C/Cocoa to unescape unicode characters, ie \u1234".
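One possible reading of the example in the question (an assumption, not something the poster confirmed): the values e3, 82 and 8f look like the three bytes of a single UTF-8 sequence that were escaped one by one, rather than three separate code points. The plain C sketch below, with the made-up name utf8_decode3, decodes such a three-byte sequence back into one code point. It only checks the lead and continuation bit patterns and does not reject overlong or surrogate encodings.

```c
#include <assert.h>
#include <stddef.h>

/* Decodes one three-byte UTF-8 sequence (1110xxxx 10xxxxxx 10xxxxxx) into its
   code point. Returns 0 if the bytes do not have that bit pattern. */
unsigned long utf8_decode3(const unsigned char *bytes)
{
    assert(bytes != NULL);
    if ((bytes[0] & 0xF0) != 0xE0 ||
        (bytes[1] & 0xC0) != 0x80 ||
        (bytes[2] & 0xC0) != 0x80) {
        return 0; /* not a three-byte sequence */
    }
    return ((unsigned long)(bytes[0] & 0x0F) << 12)
         | ((unsigned long)(bytes[1] & 0x3F) << 6)
         |  (unsigned long)(bytes[2] & 0x3F);
}
```

For the bytes e3 82 8f this yields 0x308F, which is HIRAGANA LETTER WA (わ). If that reading is right, it would be consistent with the fix above of making the server send real UTF-8.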

Validating the phone numbers

I want to accept exactly 13 digits, optionally prefixed with a "+" symbol, so the + is not mandatory.
Example: 1234567891234
Another example: +1234567891234
The telephone number format should be international. I looked at "Is there any Regex for phone number validation in iPhone".
I tried the pattern from that question, but it accepts input such as +1234545, whereas I want exactly 13 numerals, optionally prefixed with +.
Please let me know what I should change.
This is the code I tried:
NSString * forNumeric = @"^\\+(?:[0-9] ?){6,14}[0-9]$";
BOOL isMatch = [[textFieldRounded text] isMatchedByRegex:forNumeric];
if (isMatch == YES){
NSLog(#"Matched");
}
else {
NSLog(#"Not matched");
}
NSString * regex = @"((07|00447|004407|\\+4407|\\+447)\\d{9})";
Having found the leading 0 or the leading +44 once, why search for it again?
Basic simplification leads to
NSString * regex = @"((07|00440?7|\\+440?7)\\d{9})";
then to
NSString * regex = @"((07|(00|\\+)440?7)\\d{9})";
then to
NSString * regex = @"((0|(00|\\+)440?)7\\d{9})";
But 00 isn't the only common international dial prefix; 011 is used in the US and Canada.
Adding that, and turning the order round, gives:
NSString * regex = @"(^((0(0|11)|\\+)440?|0)7\\d{9}$)";
or preferably
NSString * regex = @"(^(?:(?:0(?:0|11)|\\+)(44)0?|0)(7\\d{9}$))";
allowing 00447, 011447, +447, 004407, 0114407, +4407, 07 at the beginning, and with non-capturing groups.
For wider input format matching, allowing various punctuation (hyphens, brackets, spaces), use
NSString * regex = @"(^\\(?(?:(?:0(?:0|11)\\)?[\\s-]?\\(?|\\+)(44)\\)?[\\s-]?\\(?(?:0\\)?[\\s-]?\\(?)?|0)(7\\d{9})$)";
Extract the 44 country code in $1 (null if the number was entered as 07...) and the 10-digit NSN in $2.
However, be aware that numbers beginning 070 and 076 (apart from 07624) are NOT mobile numbers.
The final pattern:
NSString * regex = @"(^\\(?(?:(?:0(?:0|11)\\)?[\\s-]?\\(?|\\+)(44)\\)?[\\s-]?\\(?(?:0\\)?[\\s-]?\\(?)?|0)(7([1-5789]\\d{2}|624)\\)?[\\s-]?\\d{6})$)";
Extract the NSN in $2, then remove all non-digits from it for further processing.
^(\+?)(\d{13})$ should do the trick; escape the backslashes for Objective-C usage.
That is 13 digits, with an optional + prefix.
If you want to play with regular expressions you can use services like this one for visual feedback, very handy.
NSString * forNumeric = @"^(\\+?)(\\d{13})$";
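The rule that this pattern expresses (an optional leading +, then exactly 13 digits, then end of input) can also be checked without a regex engine. The following is a minimal sketch in plain C, which also compiles inside an Objective-C file; the function name is_valid_number is made up for this illustration, and it assumes the input is a NUL-terminated string in which only ASCII digits count.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Returns true if s is an optional '+' followed by exactly 13 ASCII digits
   and nothing else. */
bool is_valid_number(const char *s)
{
    assert(s != NULL);
    if (*s == '+') {
        s++;                      /* the prefix is optional */
    }
    size_t digits = 0;
    while (s[digits] >= '0' && s[digits] <= '9') {
        digits++;                 /* count the leading run of digits */
    }
    /* exactly 13 digits, and the run must end at the terminator */
    return digits == 13 && s[digits] == '\0';
}
```

With this check, 1234567891234 and +1234567891234 are accepted, while +1234545 is rejected. From an NSString, the argument could come from -UTF8String.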
How about this?
NSString *forNumeric = @"\\+?[0-9]{6,13}";
NSPredicate *predicate;
predicate = [NSPredicate predicateWithFormat:@"self matches %@", forNumeric];
BOOL isMatch = [predicate evaluateWithObject:@"+1234567890123"];
if (isMatch) NSLog(@"Matched");
else NSLog(@"Not matched");
// string is the NSString to check
NSError *error = nil;
NSDataDetector *matchdetector = [NSDataDetector dataDetectorWithTypes:NSTextCheckingTypePhoneNumber
error:&error];
NSUInteger matchNumber = [matchdetector numberOfMatchesInString:string options:0 range:NSMakeRange(0, [string length])];
If you display the text in a UITextView then:
textView.dataDetectorTypes = UIDataDetectorTypePhoneNumber;
You could try using an NSDataDetector:
http://developer.apple.com/library/ios/#documentation/Foundation/Reference/NSDataDetector_Class/Reference/Reference.html
It is available in iOS 4 and later.
The following is what I do for validating UK mobile numbers (the method evaluates self, so it is assumed to live in a category on NSString):
- (BOOL) isValidPhoneNumber
{
NSString * regex = @"((07|00447|004407|\\+4407|\\+447)\\d{9})";
NSPredicate *testPredicate = [NSPredicate predicateWithFormat:@"SELF MATCHES %@", regex];
BOOL validationResult = [testPredicate evaluateWithObject: self];
return validationResult;
}
See if it helps you.

What is a good way to remove the formatting from a phone number to only get the digits?

Is there a better or shorter way of stripping out all the non-digit characters with Objective-C on the iPhone?
NSString * formattedNumber = @"(123) 555-1234";
NSCharacterSet * nonDigits = [[NSCharacterSet decimalDigitCharacterSet] invertedSet];
NSString * digits;
NSArray * parts = [formattedNumber componentsSeparatedByCharactersInSet:nonDigits];
if ( [parts count] > 1 ) {
digits = [parts componentsJoinedByString:@""];
} else {
digits = [parts objectAtIndex:0];
}
return digits;
You could use a regex replacement that replaces \D with nothing.
Dupe of "Remove all but numbers from NSString".
The accepted answer there involves using NSScanner, which seems heavy-handed for such a simple task. I'd stick with what you have there (though someone in the other thread suggested a more compact version of it):
NSString *digits = [[formattedNumber componentsSeparatedByCharactersInSet:
[[NSCharacterSet decimalDigitCharacterSet] invertedSet]]
componentsJoinedByString:@""];
Phone numbers can contain asterisks and number signs (* and #), and may start with a +. ITU-T Recommendation E.123 recommends that the + symbol be used to indicate that the number is an international number and also to serve as a reminder that the country-specific international dialling sequence must be used in place of it.
Spaces, hyphens and parentheses cannot be dialled, so they do not have any significance in a phone number. To strip out all useless symbols, remove every character that is not a decimal digit, except * and #, and also remove any + that is not at the start of the phone number.
To my knowledge, there is no standardised or recommended way to represent manual extensions (some use x, some use ext, some use E). That said, I have not encountered a manual extension in a long time.
NSUInteger inLength, outLength, i;
NSString *formatted = @"(123) 555-5555";
inLength = [formatted length];
unichar result[inLength];
for (i = 0, outLength = 0; i < inLength; i++)
{
unichar thisChar = [formatted characterAtIndex:i];
if (iswdigit(thisChar) || thisChar == '*' || thisChar == '#')
result[outLength++] = thisChar; // diallable number or symbol
else if (i == 0 && thisChar == '+')
result[outLength++] = thisChar; // international prefix
}
NSString *stripped = [NSString stringWithCharacters:result length:outLength];
You could do something like this:
NSString *digits = [[formattedNumber componentsSeparatedByCharactersInSet:[[NSCharacterSet decimalDigitCharacterSet] invertedSet]] componentsJoinedByString:@""];
Noting 0xA3's comment above, you could optionally start from a different NSCharacterSet, one that also includes + and other non-digits that are valid in phone numbers, before inverting it.
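The idea of keeping digits plus a configurable set of extra characters can be sketched in plain C as well (it compiles inside an Objective-C file). The function name keep_digits_and and its extra parameter are made up for this illustration. The sketch treats only ASCII digits as digits, and unlike the rule described earlier it keeps a + wherever it appears, not only at the start.

```c
#include <assert.h>
#include <string.h>

/* Copies to out only the ASCII digits of in, plus any character that also
   appears in extra. out must hold at least strlen(in) + 1 bytes. */
void keep_digits_and(const char *in, const char *extra, char *out)
{
    assert(in != NULL && extra != NULL && out != NULL);
    for (; *in != '\0'; in++) {
        if ((*in >= '0' && *in <= '9') || strchr(extra, *in) != NULL) {
            *out++ = *in;         /* digit or explicitly allowed symbol */
        }
    }
    *out = '\0';
}
```

Passing an empty string as extra gives the digits-only behaviour of the one-liner; passing "+*#" also keeps the dialable symbols.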

Escape Double-Byte Characters for RTF

I am trying to escape double-byte (usually Japanese or Chinese) characters from a string so that they can be included in an RTF file. Thanks to poster falconcreek, I can successfully escape special characters (e.g. umlaut, accent, tilde) that are single-byte.
- (NSString *)stringFormattedRTF:(NSString *)inputString
{
NSMutableString *result = [NSMutableString string];
for ( int index = 0; index < [inputString length]; index++ ) {
NSString *temp = [inputString substringWithRange:NSMakeRange( index, 1 )];
unichar tempchar = [inputString characterAtIndex:index];
if ( tempchar > 127) {
[result appendFormat:@"\\\'%02x", tempchar];
} else {
[result appendString:temp];
}
}
return result;
}
It appears this is looking for any unicode characters with a decimal value higher than 127 (which basically means anything not ASCII). If I find one, I escape it and translate that to a hex value.
EXAMPLE: Small "e" with acute accent gets escaped and converted to its hex value, resulting in "\'e9"
While Asian characters are above decimal 127, the output from the above appears to encode the first byte of the double-byte character and then pass the second byte through as is. The end user sees ????.
Suggestions are greatly appreciated. Thanks.
UPDATED Code sample based on suggestion. Not detecting. :(
NSString *myDoubleByteTestString = @"blah は凄くいいアップです blah åèüñ blah";
NSMutableString *resultDouble = [NSMutableString string];
for ( int index = 0; index < [myDoubleByteTestString length]; index++ )
{
NSString *tempDouble = [myDoubleByteTestString substringWithRange:NSMakeRange( index, 1 )];
NSRange doubleRange = [tempDouble rangeOfComposedCharacterSequenceAtIndex:index];
if(doubleRange.length > 2)
{
NSLog(@"%@ is a double-byte character. Escape it.", tempDouble);
// How to escape double-byte?
[resultDouble appendFormat:tempDouble];
}
else
{
[resultDouble appendString:tempDouble];
}
}
Take a look at the documentation for rangeOfComposedCharacterSequenceAtIndex: to see how to get all the characters in a composed character sequence. You'll then need to encode each of the characters in the resulting range.
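One way to encode each character for RTF is the \uN control word, where N is the UTF-16 code unit written as a signed 16-bit decimal number, followed by a fallback character for readers that do not understand \u. This is a different technique from the \'xx escape used in the question, not the poster's method. The plain C sketch below assumes the default \uc1 setting (one fallback character, here ?), works on UTF-16 code units such as those returned by characterAtIndex:, and uses the made-up name rtf_escape_utf16.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Writes count UTF-16 code units as RTF text into out (capacity cap bytes,
   including the terminator). Returns the length written, or -1 if out is
   too small. */
int rtf_escape_utf16(const uint16_t *units, size_t count, char *out, size_t cap)
{
    assert(units != NULL && out != NULL);
    if (cap == 0) {
        return -1;
    }
    out[0] = '\0';
    size_t used = 0;
    for (size_t i = 0; i < count; i++) {
        uint16_t u = units[i];
        int n;
        if (u == '\\' || u == '{' || u == '}') {
            /* RTF syntax characters need a backslash */
            n = snprintf(out + used, cap - used, "\\%c", (char)u);
        } else if (u < 128) {
            n = snprintf(out + used, cap - used, "%c", (char)u);
        } else {
            /* \uN takes a signed 16-bit value: units above 32767 go negative */
            int value = (u > 32767) ? (int)u - 65536 : (int)u;
            n = snprintf(out + used, cap - used, "\\u%d?", value);
        }
        if (n < 0 || (size_t)n >= cap - used) {
            return -1;            /* output truncated */
        }
        used += (size_t)n;
    }
    return (int)used;
}
```

For example, the code units for "aはé" (0x61, 0x306F, 0xE9) come out as a\u12399?\u233?. Characters outside the Basic Multilingual Plane arrive as two surrogate code units and are written as two consecutive \u escapes.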