Our code calls stringWithUTF8String but some data we have uses an octal sequence \340 in the string. This causes some code to break because we never expect the function to return nil. I did some research and found that any octal sequence from \200-\777 will give the same result. I know I can handle this returning nil but I want to understand why it would return nil, and what those octal escapes are interpreted as.
NSString *result = [NSString stringWithUTF8String:"Mfile \340 xyz.jpg"];
Running this code returns nil for result. It appears that, to code defensively, we will have to check for a nil result everywhere we use this function, which seems unfortunate. The documentation for the function does not mention nil as a possible return value. I would bet that there is a lot of code out there that does not check for it either.
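\340 is the single byte 0xE0, which is not valid UTF-8 on its own: in UTF-8 it starts a three-byte sequence, and the bytes that follow it here are not continuation bytes. You need to use the ASCII encoding for this. Do,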
NSString * result = [NSString stringWithCString:"Mfile \340 xyz.jpg" encoding:NSASCIIStringEncoding];
NSLog(@"%@", result);
If you want iOS to handle it as UTF-8 you have to make sure it's valid UTF-8 characters you pass to it, so you may need to convert the octal characters to something human readable first.
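To see why the byte is rejected, here is a minimal sketch in Python rather than Objective-C. It is an analogy for the decoding rules, not Foundation's implementation. The helper names are hypothetical, and the Latin-1 fallback is an assumption about what the data really is.

```python
# \340 is an octal escape for the single byte 0xE0.
raw = b"Mfile \340 xyz.jpg"


def decode_utf8_or_none(data: bytes):
    """Return the decoded text, or None when data is not valid UTF-8."""
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError:
        # 0xE0 announces a three-byte sequence, but a space follows it.
        return None


def decode_with_fallback(data: bytes) -> str:
    """Try UTF-8 first, then fall back to Latin-1, which accepts every byte."""
    text = decode_utf8_or_none(data)
    return text if text is not None else data.decode("latin-1")


print(decode_utf8_or_none(raw))   # None
print(decode_with_fallback(raw))  # Mfile à xyz.jpg
```

Whether Latin-1 is the right fallback depends on where the bytes came from. It never fails, but it can silently produce the wrong characters.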
I added a category method called safeStringWithUTF8String: which is now called everywhere instead. It simply checks the return value for nil and returns the empty string if the input is not valid. Not great, but I'm not sure what else to do; we have to be able to handle any data passed in.
Related
How can I normalize locale characters like Turkish "İĞŞÇ" to "igsc" in Dart-Flutter?
var string = "İĞŞÇ".normalize();
print(string);
Output: igsc
Is there a way to do this like above?
The compiler cannot know that "İĞŞÇ" should become "igsc".
The characters have different bit patterns, so unfortunately nothing does this for you automatically,
unless you declare the mapping in your code, for example
"Ğ":"g"
and so on...
Then, whenever your code finds one of the special characters, it converts it to its declared equivalent.
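The declared-mapping idea can be sketched as follows. This is Python rather than Dart, purely for illustration. The table and the `normalize` name are hypothetical, and the table covers only the Turkish-specific letters.

```python
# Hypothetical mapping table: each Turkish-specific letter and its ASCII stand-in.
TURKISH_TO_ASCII = str.maketrans({
    "İ": "I", "ı": "i",
    "Ğ": "G", "ğ": "g",
    "Ş": "S", "ş": "s",
    "Ç": "C", "ç": "c",
    "Ö": "O", "ö": "o",
    "Ü": "U", "ü": "u",
})


def normalize(text: str) -> str:
    """Replace the mapped letters first, then lowercase the ASCII result."""
    return text.translate(TURKISH_TO_ASCII).lower()


print(normalize("İĞŞÇ"))  # igsc
```

The mapping is applied before lowercasing so that "İ" is already a plain "I" by the time the lowercase rules run.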
I have to decode a base64 string using Perl, and I want to know whether the decode succeeded.
How can I know the decode is OK? What happens if the decode fails?
MIME::Base64::decode_base64 never reports a failed decode. It simply ignores anything that does not fit: characters that are not valid base64 characters, incomplete padding at the end, or any data following the end marker '='. Thus it always returns something, and in the worst case this is an empty string.
Note that this behavior is not even wrong. At least some of the various Base64 standards explicitly require invalid characters to be skipped and none defines error handling in case of incomplete padding or data after '='. Still, the output of MIME::Base64 might be different compared to other implementations in case of invalid data.
When using MIME::Base64's decode_base64, the decode is always deemed to be successful. Disallowed characters are ignored.
You could strictly verify that you have a valid base64 using the following:
my $c1 = '[A-Za-z0-9+/]';
my $c2 = '[AQgw]';
my $c3 = '[AEIMQUYcgkosw048]';
die "Invalid data\n"
if $s !~ m{^(?:$c1{4})*+(?>$c1(?>$c2==|$c1$c3=)|)\z};
Whitespace is often used in the middle, so you might want to allow whitespace. (In fact, encode_base64 includes whitespace in its output by default!)
The = are often left out, so you might want to allow missing =.
If you're worried about data corruption, include a hash of the data with the data.
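For comparison, the strict check above can be sketched in Python with the same three character classes. This is an illustrative port, not part of MIME::Base64. The function name is hypothetical, and like the Perl pattern it rejects whitespace and missing padding.

```python
import re

C1 = "[A-Za-z0-9+/]"        # any base64 character
C2 = "[AQgw]"               # valid 2nd character when one byte remains ("XY==")
C3 = "[AEIMQUYcgkosw048]"   # valid 3rd character when two bytes remain ("XYZ=")

# Full groups of four, then optionally one padded final group.
STRICT_BASE64 = re.compile(rf"(?:{C1}{{4}})*(?:{C1}(?:{C2}==|{C1}{C3}=))?")


def is_strict_base64(s: str) -> bool:
    """True only for canonical, fully padded base64 with no stray characters."""
    return STRICT_BASE64.fullmatch(s) is not None


print(is_strict_base64("aGVsbG8="))   # True  ("hello")
print(is_strict_base64("aGVsbG9="))   # False (non-zero unused bits)
print(is_strict_base64("aGVsbG8"))    # False (padding missing)
```

To accept whitespace or unpadded input, strip the whitespace or re-add the padding before calling the check, rather than loosening the pattern.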
I am getting some weird characters when converting a NSArray containing NSDictionaries to a JSON string.
I tried using both SBJson and NSJSONSerialization with the same result.
Each NSDictionary is populated from the address book with the contact's name, email and phone number, and the values are mostly in Hebrew.
The characters look like this:
\327\237
I could not find any information about this, help anyone?
Thanks in advance!
EDIT *
Here is a snippet of the JSON:
[
{"fname":"סתם טקסט"},
{"fname":"סתם טקסט"},
{"fname":"נ\327\231ר"}
]
It's supposed to be:
[
{"fname":"סתם טקסט"},
{"fname":"סתם טקסט"},
{"fname":"ניר"}
]
And I am getting the JSON by using the following code:
NSData *jsonData = [NSJSONSerialization dataWithJSONObject:ContactsArray options:NSJSONReadingMutableLeaves error:&err];
NSLog(@"JSON: %@", [NSString stringWithUTF8String:[jsonData bytes]]);
These characters are octal escape codes. I prefer to look at things in hex. \327 and \237 are 0xD7 and 0x9F in hex.
I looked up U+00D7 and U+009F (Unicode code points). They are MULTIPLICATION SIGN and APPLICATION PROGRAM COMMAND. That doesn't make sense in this context, so a straight conversion is not the way to go.
Next, I thought UTF-8 encoding. D7 9F decodes as U+05DF. This is HEBREW LETTER FINAL NUN. That makes sense in this context.
So I'm guessing that what you are seeing is UTF-8 bytes that were not understood and were octal-escaped instead. JSON doesn't support octal escapes, so I'm guessing it's NSLog() or whatever you are using to print the JSON that is doing the escaping.
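The decoding step above can be checked with a short Python sketch. The `describe` helper is a hypothetical name, and it assumes the escaped bytes are UTF-8.

```python
import unicodedata


def describe(data: bytes) -> list:
    """Decode the bytes as UTF-8 and name each resulting code point."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in data.decode("utf-8")]


# \327\237 are the bytes 0xD7 0x9F; \327\231 are 0xD7 0x99.
print(describe(b"\327\237"))  # ['U+05DF HEBREW LETTER FINAL NUN']
print(describe(b"\327\231"))  # ['U+05D9 HEBREW LETTER YOD']
```

The second pair is the one from the JSON snippet: it is the missing middle letter of "ניר".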
For some reason, I can not get an autohotkey string comparison to work in the script I need it in, but it is working in a test script.
Tester
password = asdf
^!=::
InputBox,input,Enter Phrase,Enter Phrase,,,,,,,30,
if ( input == password ){
MsgBox, How original your left home row fingers are
Return
} else {
MsgBox, You entered "%input%"
Return
}
Main
password = password
!^=::
InputBox,input,Enter Password,Enter Password,HIDE,,,,,,30,
if ( input == password ){
MsgBox,"That is correct sir"
;Run,C:\Copy\Registry\disable.bat
return
}else{
MsgBox,That is not correct sir you said %input%
Return
}
Main keeps telling me the password is not correct. Any ideas?
Your "main" script works just fine.
The == comparator is case-sensitive, you know.
I found that strings in the clipboard were not comparing properly to strings in my source files when the strings contained in the source file contained non-ascii characters. After converting the file to UTF-8 with BOM, it would correctly compare.
The documentation doesn't say directly that the file encoding affects string comparisons, but it does say that it has an effect. The FAQ section states:
Why are the non-ASCII characters in my script displaying or sending
incorrectly?
Short answer: Save the script as UTF-8 with BOM.
Although AutoHotkey supports Unicode text, it is optimized for
backward-compatibility, which means defaulting to the ANSI encoding
rather than the more internationally recommended UTF-8. AutoHotkey
will not automatically recognize a UTF-8 file unless it begins with a
byte order mark.
Source: https://web.archive.org/web/20230203020016/https://www.autohotkey.com/docs/v1/FAQ.htm#nonascii
So perhaps the wrong encoding does more than make characters display and send incorrectly; it may also cause values to be stored incorrectly, which makes comparisons fail.
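The effect can be illustrated with a small Python sketch. It is an analogy, not AutoHotkey's actual loader. It assumes the ANSI code page is Windows-1252, and the sample word is made up.

```python
typed = "café"                      # what the user types or copies at runtime
file_bytes = typed.encode("utf-8")  # the same text saved in a UTF-8 script file

# Without a BOM, the bytes are interpreted as ANSI (Windows-1252 assumed here).
as_ansi = file_bytes.decode("cp1252")

# With a BOM, the file is recognized as UTF-8; "utf-8-sig" strips the BOM.
as_utf8 = (b"\xef\xbb\xbf" + file_bytes).decode("utf-8-sig")

print(as_ansi, as_ansi == typed)  # cafÃ© False
print(as_utf8, as_utf8 == typed)  # café True
```

The misread string is not just displayed wrongly; it is a different sequence of characters, so an equality test against the real text cannot succeed.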
I have been having some problem with the stringByAddingPercentEscapesUsingEncoding: method.
Here's what happens:
When I try to use the method to convert the NSString:
"..City=Cl&PostalCode=Rh6 0Nt"
I get this:
"City=Cl&PostalCode=Rh62t"
It should be:
"..City=Cl&PostalCode=Rh6%200Nt"
What can I do about this? Thanks in advance !!
For me, this:
NSString *s=[@"..City=Cl&PostalCode=Rh6 0Nt" stringByAddingPercentEscapesUsingEncoding:NSUTF8StringEncoding];
NSLog(@"s=%@",s);
... outputs:
s=..City=Cl&PostalCode=Rh6%200Nt
You're most likely using the wrong encoding.
This happens when you try to encode a string with NSASCIIStringEncoding and it contains characters that ASCII does not support.
Make sure you encode with NSUTF8StringEncoding if the string can contain non-ASCII characters; otherwise the method returns nil.
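The same behavior can be sketched in Python with `urllib.parse.quote`. This is an analogy, not the Foundation method. The wrapper name and the choice of `=` and `&` as safe characters are assumptions made to match the expected output above.

```python
from urllib.parse import quote


def add_percent_escapes(text: str, encoding: str = "utf-8") -> str:
    """Percent-escape text, leaving '=' and '&' alone so the query stays readable."""
    return quote(text, safe="=&", encoding=encoding)


print(add_percent_escapes("..City=Cl&PostalCode=Rh6 0Nt"))
# ..City=Cl&PostalCode=Rh6%200Nt

try:
    add_percent_escapes("Zürich", encoding="ascii")
except UnicodeEncodeError:
    # Python raises here; the Objective-C method signals the same problem with nil.
    print("cannot encode non-ASCII text as ASCII")
```

With the default UTF-8 encoding, the same non-ASCII input is escaped byte by byte instead of failing.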