Swift Thai Localization Problems - swift

There seems to be a problem with the String library that Apple uses.
Here's my Localizable.strings
"error_failed_to_retrieve_certificate" = "เกิิดผิดพลาดในการกู้คะแนน";
Here's how I set it to any view
anyView.text = NSLocalizedString("error_failed_to_retrieve_certificate", comment: "")
But somehow the string gets warped when it is displayed (the second character becomes different).
Here's what it looks like too when I search it using the Project Search.
But on the Strings it looks different (notice the third character)
Here's one image that is side by side

Note that I don't know any Thai.
It seems that your string has an extra ิ (U+0E34 THAI CHARACTER SARA I) in it. The character before that, กิ, is already two code points combined, ก (U+0E01 THAI CHARACTER KO KAI) and ิ, so the extra ิ gets displayed on its own. I would say it's an Xcode bug.
I've removed the extra character here:
เกิดผิดพลาดในการกู้คะแนน
Copy and paste that and it should be fine.
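If you can't read the script, one way to spot this kind of stray mark is to scan the string's Unicode scalars for a nonspacing mark that immediately repeats. This is only a rough sketch: repeatedCombiningMarks(in:) is a hypothetical helper, not an Apple API, and it assumes a doubled mark is always a typo, which may not be true for every script.

```swift
// Hypothetical helper: reports every nonspacing mark (General Category Mn)
// that directly repeats the scalar before it.
func repeatedCombiningMarks(in text: String) -> [(offset: Int, scalar: Unicode.Scalar)] {
    var hits: [(offset: Int, scalar: Unicode.Scalar)] = []
    var previous: Unicode.Scalar? = nil
    for (offset, scalar) in text.unicodeScalars.enumerated() {
        if scalar == previous, scalar.properties.generalCategory == .nonspacingMark {
            hits.append((offset: offset, scalar: scalar))
        }
        previous = scalar
    }
    return hits
}

// First five scalars of the string from the question, written as escapes:
// U+0E40, U+0E01, U+0E34, U+0E34 (the extra SARA I), U+0E14.
let broken = "\u{0E40}\u{0E01}\u{0E34}\u{0E34}\u{0E14}"
let fixed = "\u{0E40}\u{0E01}\u{0E34}\u{0E14}"

for hit in repeatedCombiningMarks(in: broken) {
    print("repeated mark U+\(String(hit.scalar.value, radix: 16, uppercase: true)) at scalar offset \(hit.offset)")
}
print(repeatedCombiningMarks(in: fixed).isEmpty)
```

Running something like this over each value in Localizable.strings would flag the duplicated SARA I without needing to know any Thai.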

Also check that the key "error_failed_to_retrieve_certificate" appears only once in Localizable.strings. Each key should be unique.

Related

crystal reports attempting to link two tables by matching string with no luck

As stated in the title, I have two tables I'm attempting to link. Both strings appear to match, however Crystal Reports is not picking it up. The only thing I can think of is that the length of the field is different, even though the strings are the same. Could that cause a discrepancy? If so, how can I correct for it? Thank you
A difference in length will prevent a match. Trim(string) only removes spaces at the beginning or end of the string, so the two strings can still have different lengths afterwards. To compare a fixed-length substring instead, use Left(string, length) on both fields so that they are the same length.
If they still do not match, you may have non-printable characters in one or both of your strings. Carriage Return and Line Feed tend to be the most commonly found non-printable characters. A Carriage Return is represented as Chr(13), while a Line Feed is represented as Chr(10). These are Built In Constants similar to those found in VBA and Visual Basic.
You can use a find and replace to remove them with the following formula. It's not a bad idea to include the Trim and Left functions as well to get the best match possible.
Replace(Replace(Left(Trim({YourStringField}), 10),Chr(10), ""),Chr(13), "")
There are a few additional Built In Constants you may need to check for if this doesn't work. A Tab is represented as Chr(9), for example. It's very rare for strings to contain the other Built In Constants, though. In most cases Carriage Return and Line Feed are the only ones found in plain text. Tabs and the other constants should only be found in rich text and are very rare in string data.

How to know which CharacterSet contains a given character?

Is there a way to check if a character belongs to a CharacterSet?
I want to know which CharacterSet I should use for the character -. Do I use symbols?
I've checked this documentation but still no idea. https://developer.apple.com/documentation/foundation/characterset
When removing extra whitespace at the end of a string, we do it like this:
let someString = " "
print("\(11111) - \(someString)".trimmingCharacters(in: .whitespaces))
But what if I just want to remove the -? Or any special character such as *?
EDIT: I was looking for a complete set of characters per each CharacterSet if it's possible.
What you want is defined in the Unicode standard. It is referred to as Unicode General Categories. Each Unicode character is in a category.
The Unicode website provides a complete character list showing the character's code, category, and name. You can also find a complete list of Unicode categories as well.
The - is U+002D (HYPHEN-MINUS). It is listed as being in the "Pd" (dash punctuation) category.
If you look at the documentation for CharacterSet, you will see punctuationCharacters which is documented as:
Returns a character set containing the characters in Unicode General Category P*.
The "Pd" category is included in "P*" (which means any "P" category).
I also found https://www.compart.com/en/unicode/category which is a third party list of each character by category. A bit more user friendly than the Unicode reference.
To summarize: if you want to know which CharacterSet to use for a given character, look up the character's category using one of the charts I linked. Once you know its category, look at the documentation for CharacterSet to see which predefined character set applies to that category.
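The lookup can also be done in code. The sketch below asks a predefined CharacterSet whether it contains the scalar, reads the General Category straight from the scalar, and then trims specific characters with a custom set. The trimming(_:removing:) helper is a made-up name for illustration only.

```swift
import Foundation

let hyphen: Unicode.Scalar = "-"

// Membership test against the predefined sets.
let inPunctuation = CharacterSet.punctuationCharacters.contains(hyphen)
let inSymbols = CharacterSet.symbols.contains(hyphen)

// The General Category is also available directly on the scalar.
let category = hyphen.properties.generalCategory

// Hypothetical helper: trims whitespace plus any extra characters you list.
func trimming(_ text: String, removing unwanted: String) -> String {
    let set = CharacterSet(charactersIn: unwanted).union(.whitespaces)
    return text.trimmingCharacters(in: set)
}

print(inPunctuation, inSymbols, category == .dashPunctuation)
print(trimming("11111 -  ", removing: "-*"))
```

If you only need to strip one or two known characters, a custom CharacterSet(charactersIn:) is usually simpler than finding the predefined set that happens to include them.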

Is Localizable.strings required for the root language of an app?

As we're enabling our (English) application to be localized, we've replaced all in-line strings with NSLocalizedString() calls. Since all of our English strings are there in-line with the code, e.g. NSLocalizedString(@"OK", @"OK button in a message box"), is there any reason we need the English version of Localizable.strings? When we try removing the strings from the English Localizable.strings, the program seems to work fine. Just wanted to double check if there was some side-effect to not having that around. Thanks, alex
One of the main points of using the NSLocalizedString() macro is so that your programming code can be parsed with the genstrings command-line tool to generate the corresponding Localizable.strings file(s) (see Resource Programming Guide: About the String-Loading Macros and Using the Genstrings Tool to Create Strings Files).
That Localizable.strings file then serves as a starting point for your translators, to use to translate to another language. Without that file to work with, your translators would basically need access to your source code in order to see all the strings you want to use (which kind of defeats the purpose).
Yes, your English version works fine right now: if a localized version of the string you request in code, for example NSLocalizedString(@"OK", @""), cannot be found in a .strings file, it simply uses the @"OK" string that you passed in.
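That fallback is easy to confirm. Here is a small sketch in Swift rather than the Objective-C used in the question; the key name is hypothetical and is assumed to have no entry in any .strings file of the running bundle.

```swift
import Foundation

// Hypothetical key, assumed to be absent from every .strings file.
let missingKey = "hypothetical_missing_key"

// With no matching entry, the key itself is returned.
let fallback = NSLocalizedString(missingKey, comment: "")

// Passing `value:` supplies an explicit default instead of the key.
let withDefault = NSLocalizedString(missingKey, value: "Add Bookmark…", comment: "")

print(fallback)
print(withDefault)
```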
Another reason why you should likely be keeping the English Localizable.strings is that you should generally try to avoid using high-ASCII characters in your code, but should use the full range of available characters in your actual user interface. For example, you may not want to put the following characters in your code, but would want to use them in your user interface:
… (horizontal ellipsis) (U+2026)
“ (left double quotation mark) (U+201C)
” (right double quotation mark) (U+201D)
‘ (left single quotation mark) (U+2018)
’ (right single quotation mark) (U+2019)
So in code, you'd do something like this:
NSLocalizedString(@"Add Bookmark...", @"")
and then in your .strings file (which is UTF-16, so this is fine):
"Add Bookmark..." = "Add Bookmark…";
Using the words themselves as keys is not recommended. If you want to change that "OK" to something else and you have other Localizable.strings files, you will have to edit every one of them to update the "OK" key.
Surely your translators will want your Localizable.strings file to work from. And want it to be ALL the strings.
(It is true that the system will fall back to the key, but relying on that seems like bad practice. And you'll find it more reliable to use proper punctuation in a Unicode file, which source code seldom is.)

Diamonds with question marks

I'm getting these little diamonds with question marks in them in my HTML attributes when I present data from my database. I'm using EPiServer and a few custom properties.
This is the information I've gathered,
I save my data as a XML document, since I use custom EPiServer properties which need more than one defined value. This is saved as UTF8.
It's only attributes in element tags which have this problem, such as align=left becomes align=�left�. There is no " character there, but I get the diamonds anyway.
If I use " outside an element, it works and shows correctly.
Any clues?
This is a problem with your character encoding scheme.
I would recommend reading this article, where (close to the bottom of it), he shows you why you get that little diamond with question marks.
Has the XML been touched by any of the Microsoft Office suite products?
These are notorious for switching vanilla quotes (") x'22' to smart quotes x'93' and x'94' (“”), which are Windows-1252 bytes.
Also, the single quote (') is often converted from x'27' to x'91' and x'92' pairs (‘’).
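To see why those bytes turn into diamonds, here is a small Swift sketch (not EPiServer code). It assumes the attribute was stored with single-byte Windows-1252 smart quotes and then read back as UTF-8. The bytes 0x93 and 0x94 are not valid UTF-8 on their own, so a lenient decoder substitutes U+FFFD, which is the diamond with a question mark.

```swift
import Foundation

// Build "align=" + 0x93 + "left" + 0x94, i.e. smart quotes as Windows-1252 bytes.
var bytes = Array("align=".utf8)
bytes.append(0x93)
bytes += Array("left".utf8)
bytes.append(0x94)

// Decoding as UTF-8 replaces each invalid byte with U+FFFD.
let asUTF8 = String(decoding: bytes, as: UTF8.self)

// Decoding with the encoding the bytes were written in recovers the quotes.
let asCP1252 = String(bytes: bytes, encoding: .windowsCP1252)

print(asUTF8)
print(asCP1252 ?? "could not decode")
```

So the fix is either to read the data with the encoding it was actually written in, or to replace the smart quotes with plain quotes before saving it as UTF-8.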

Search or compare within a Grapheme Cluster in Korean

In my current implementation of a UISearchBarController I'm using [NSString compare:] inside the filterContentForSearchText:scope: delegate method to return relevant objects based on their name property to the results UITableView as you start typing.
So far this works great in English and Korean, but what I'd like to be able to do is search within NSString's defined character clusters. This is only applicable for a handful of languages, of which Korean is one.
In English, compare: returns new results after every letter you enter, but in Korean the results are generated once you complete a recognized grapheme cluster. I would like to be able to search through my Korean objects' name property via the individual elements that make up a syllable.
Can anyone shed any light on how to approach this? I'm sure it has something to do with searching through UTF16 characters manually, or by utilising a lower level class.
Cheers!
Here is a specific example that's just not working:
NSString *string1 = @"이";
NSString *string2 = @"ㅣ";
NSRange resultRange = [[string1 decomposedStringWithCanonicalMapping] rangeOfString: [string2 decomposedStringWithCanonicalMapping] options:(NSLiteralSearch)];
The result is always NSNotFound, with or without decomposedStringWithCanonicalMapping.
Any ideas?
I'm no expert, but I think you're very unlikely to find a clean solution for what you want. There doesn't seem to be any relationship between a Korean character's Unicode value and the graphemes that it's made up of.
e.g. "이" is \uc774 and "ㅣ" is \u3163. From the perspective of the NSString, they're just two different characters with no specific relationship to each other.
I suspect that you will have to find or create an explicit mapping between characters and their graphemes, and then write your own search function that consults this mapping.
This very long page on Unicode Korean can help you, if it comes to that. It has a table of all the characters which suggests some structured relation between the way characters are numbered and their components.
If you use compare:options: with NSLiteralSearch, it should compare character by character, that is, the Unicode code points, regardless of the grapheme. The default behavior of compare: is to use no options. You could use -decomposedStringWithCanonicalMapping to get the decomposed form of the input string, but I'm not sure how that would interact with compare:.
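One likely reason the example in the question finds nothing: canonical decomposition turns 이 (U+C774) into the conjoining jamo U+110B and U+1175, but ㅣ (U+3163) is a compatibility jamo with no canonical decomposition, so it stays U+3163. Its compatibility decomposition, however, is U+1175. The Swift sketch below shows this. scalarContains is a hypothetical helper that compares raw Unicode scalars, so that Swift's grapheme-aware string comparison does not get in the way.

```swift
import Foundation

let syllable = "\u{C774}"    // 이
let typedJamo = "\u{3163}"   // ㅣ as typed (Hangul Compatibility Jamo)

// Hypothetical helper: scalar-by-scalar substring search.
func scalarContains(_ haystack: String, _ needle: String) -> Bool {
    let h = Array(haystack.unicodeScalars)
    let n = Array(needle.unicodeScalars)
    guard !n.isEmpty, h.count >= n.count else { return false }
    return (0...(h.count - n.count)).contains { start in
        h[start..<(start + n.count)].elementsEqual(n)
    }
}

let decomposed = syllable.decomposedStringWithCanonicalMapping
let needleCanonical = typedJamo.decomposedStringWithCanonicalMapping
let needleCompat = typedJamo.decomposedStringWithCompatibilityMapping

print(decomposed.unicodeScalars.map { String($0.value, radix: 16) })
print(scalarContains(decomposed, needleCanonical))
print(scalarContains(decomposed, needleCompat))
```

This only covers the simple vowel case. Compatibility consonants generally decompose to the leading-consonant forms, while final consonants in a syllable use a separate range, so a real search would probably still need some explicit mapping on top of this.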