How to remove the last unicode symbol from NSString - iphone

I have implemented a custom keyboard associated with a text field, so when the user presses the delete button, I remove the last character from the string, and manually update the current text field text.
NSRange range = NSMakeRange(currentTextFieldString.length-1, 1);
[currentTextFieldString replaceCharactersInRange:range withString:@""];
So far so good.
Now the problem: the user can also enter some special Unicode symbols. These are not always 1 byte; they can be 2 bytes too. When the delete button is pressed I have to remove the entire symbol, but with the above approach the user has to press the delete button twice.
Here, if I do:
NSRange range = NSMakeRange(currentTextFieldString.length-2, 2);
[currentTextFieldString replaceCharactersInRange:range withString:@""];
it works fine, but then the normal characters, which are just 1 byte, get deleted two at a time.
How to handle such scenarios?
Thanks in advance.
EDIT:
It is strange, that if I switch to the iPhone keyboard, it handles both cases appropriately. There must be some way to do it, there is something that I am missing, but am not able to figure out what.

Here's the problem. NSStrings are encoded using UTF-16. Many common Unicode characters take up only one unichar (a 16-bit unsigned value). However, some characters take up two unichars (a surrogate pair). Even worse, some characters can be composed or decomposed, e.g. é might be one Unicode code point or it might be two: an e followed by a combining acute accent. This makes it quite difficult to do what you want, namely delete one "character", because it is really hard to tell how many unichars it takes up.
Fortunately, NSString has a method that helps with this: -rangeOfComposedCharacterSequenceAtIndex:. What you need to do is get the index of the last unichar, run this method on it, and the returned NSRange will tell you where to delete from. It goes something like this (not tested):
NSUInteger lastCharIndex = [myString length] - 1; // I assume string is not empty
NSRange rangeOfLastChar = [myString rangeOfComposedCharacterSequenceAtIndex: lastCharIndex];
myNewString = [myString substringToIndex: rangeOfLastChar.location];
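A slightly fuller sketch of the same idea, for the mutable string used in the question. The helper name DeleteLastComposedCharacter is made up for illustration; the only framework call it relies on is -rangeOfComposedCharacterSequenceAtIndex:, and it guards the empty-string case that the snippet above assumes away.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: removes the last user-perceived character in place.
// Does nothing for an empty string, so length - 1 cannot underflow.
static void DeleteLastComposedCharacter(NSMutableString *text) {
    if (text.length == 0) {
        return;
    }
    // The returned range covers the whole sequence that contains the last
    // unichar: one unit for "c", two for a surrogate pair, and two for an
    // "e" followed by a combining accent.
    NSRange last = [text rangeOfComposedCharacterSequenceAtIndex:text.length - 1];
    [text deleteCharactersInRange:last];
}
```

Called from the delete-button handler, this should remove an emoji or an accented letter with a single press, and an ordinary letter one at a time.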

If you can't get this to work by default, then use an if/else block and test if the last character is part of a special character. If it is, use the substring to length-2, otherwise use the substring to length-1.

I don't know exactly what the problem is with the byte length of the special characters.
What I suggest is:
Store the string length in a variable before adding any new characters.
If the user presses backspace (remove last character), cut the string back to the stored length. For example, if the last saved length is 5 and the new string length is 7, take the substring from index 0 to 4, which crops the characters that were added last.
This is a workaround, as I don't know exactly what the problem is internally.
But I guess logically this solution should work.
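A minimal sketch of that bookkeeping, assuming every insertion goes through one function. AppendSymbol, DeleteLastSymbol and the history array are hypothetical names, not part of any framework. A single saved length only supports one backspace, so this version keeps a stack of lengths.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: remembers the length before each insertion.
static void AppendSymbol(NSMutableString *text, NSMutableArray *history, NSString *symbol) {
    [history addObject:@(text.length)];
    [text appendString:symbol];
}

// Hypothetical helper: cuts the text back to the length it had before the
// most recent insertion, however many unichars that insertion added.
static void DeleteLastSymbol(NSMutableString *text, NSMutableArray *history) {
    NSNumber *previousLength = [history lastObject];
    if (previousLength == nil) {
        return; // nothing was inserted through AppendSymbol
    }
    [history removeLastObject];
    NSUInteger start = [previousLength unsignedIntegerValue];
    [text deleteCharactersInRange:NSMakeRange(start, text.length - start)];
}
```

This only works while the history and the text stay in sync, so pasted or programmatically set text would need to reset the history.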
Enjoy Coding :)

Related

Remove characters from string iOS

I have an XML which I am using to parse news. News have a description. I'm using NSString to show that description in UILabel.
But, the description comes like this:
Bad news for Windows&#8217; the researches show that Windows&#8217; for years....
And it is shown with those unwanted characters in the UILabel. The numbers change in every string. They are not the same.
I want to remove the characters that begins with &# and the numbers that follow. How can I do that? Which string encoding format should I use?
Thanks a lot.
EDIT: I don't have just one string. If I remove &#8217 from this one, there might be &#7610 in another one. It won't be removed.
I can remove the &# characters and the numbers too. But when I do that, in a string like "In 1980, Jobs told us to&#2540 do something" the output will be "In , Jobs told us to do something". 1980 will be gone too, and I don't want that. That's a problem as well.
These are ASCII symbols, so you need to use a UTF-8 string.
So use this line of code:
NSString *resultString = [NSString stringWithUTF8String:myAsciiString];
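If the sequences are HTML numeric character references (an assumption, but &#8217; has that shape and 8217 is the code point of the right single quotation mark), changing the string encoding alone would probably leave them in place. A rough sketch that decodes them instead of deleting them follows. DecodeNumericReferences is a hypothetical helper, and treating the trailing semicolon as optional is a choice made here because the examples in the question omit it.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: replaces decimal references such as &#8217; with the
// character they name. Plain digits like "1980" are never touched because
// the pattern requires the "&#" prefix.
static NSString *DecodeNumericReferences(NSString *input) {
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"&#([0-9]+);?"
                                                  options:0
                                                    error:NULL];
    NSArray *matches = [regex matchesInString:input
                                      options:0
                                        range:NSMakeRange(0, input.length)];
    NSMutableString *result = [input mutableCopy];
    // Walk backwards so earlier match ranges stay valid while editing.
    for (NSTextCheckingResult *match in [matches reverseObjectEnumerator]) {
        NSString *digits = [input substringWithRange:[match rangeAtIndex:1]];
        long long value = [digits longLongValue];
        BOOL isSurrogate = (value >= 0xD800 && value <= 0xDFFF);
        if (value <= 0 || value > 0x10FFFF || isSurrogate) {
            continue; // not a valid Unicode scalar value, leave the text as is
        }
        // Build a one-character string from the code point via UTF-32.
        unsigned int littleEndian = NSSwapHostIntToLittle((unsigned int)value);
        NSString *character =
            [[NSString alloc] initWithBytes:&littleEndian
                                     length:sizeof(littleEndian)
                                   encoding:NSUTF32LittleEndianStringEncoding];
        if (character != nil) {
            [result replaceCharactersInRange:match.range withString:character];
        }
    }
    return result;
}
```

Named references such as &amp; are outside the scope of this sketch.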

Can you split a NSString into an NSArray of strings based on a content size?

Basically i want to pass an NSString and a CGSize (the content size) and get an NSArray of NSStrings that will fit into this content size.
So for instance, if I have @"something really long" and specify a size CGSizeMake(20, 10), I would get back an NSArray of @[@"something", @" really", @" long"], for example.
Anyone have any ideas / sample code?
Not sure this is that straightforward. Is the CGSize enough? Doesn't the amount of text that can fit inside a contentSize depend on the font size too? Not sure you would want to do all this. It can get pretty complex, and I don't see the need for it.
Instead, define a comfortable contentSize for your UILabel and truncate the rest, or even better, opt for Adjust to Fit along with a minimum font size. With this, iOS will try its best to fit your text while respecting the minimum font size. This can be set up in IB.
Assuming that you want to break by words (break where there is a space), here is a pseudo-code algorithm:
1) Get Width as parameter
2) Declare String temp, int breakPoint, array list
3) For Each (character in String)
       Add character to temp String
       if (character is space)
           set breakPoint to currentIndexInTemp
       end if
       if (temp String size > Width OR character is newline)
           declare String newString
           set newString equal to substring of temp(from 0 to breakPoint)
           add newString to list
           set temp to substring of temp(from breakPoint + 1 to end)
           reset breakPoint
       end if
   end For Each
4) if (length of temp > 0)
       add temp to list
   end if
5) return list
I'm sorry if that's terrible pseudo-code, I don't write pseudo-code often (actually, at all) and I'm not familiar enough with Objective-C to write it in the language.
NOTE
The breakPoint is the last occurrence of a space, or break in
characters, if the length is too long then you want to cut at the
breakPoint to prevent chopping words. This is the index in the
temp String only, not the String you're breaking up.
There may be a simpler way to do this in Objective-C, I don't know; this algorithm is based on a function I wrote in Java to break a string into 80-character lines.
EDIT
Added New Line testing. In my example, if it's a new line then break it there whether it's too long or not. Also, when you break the String into an array (the second if) you'll want to reset the breakPoint variable. I'd also agree that for overly large portions of text this will add overhead, I've not had issues with text-processing of several hundred characters. Although that was done on a desktop/laptop program.
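For reference, here is one way the pseudo-code above might look in Objective-C. It is a sketch under a few assumptions: WrapText and its measure block are hypothetical (the caller supplies the block, which returns the width of a candidate line), a newline always ends the current line, and a single word wider than the limit is left unbroken.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper. `measure` is supplied by the caller and returns the
// rendered width of a candidate line; it is not part of any framework.
static NSArray *WrapText(NSString *text, double maxWidth, double (^measure)(NSString *line)) {
    NSMutableArray *lines = [NSMutableArray array];
    NSMutableString *temp = [NSMutableString string];
    NSUInteger breakPoint = NSNotFound; // index of the last space in temp
    NSCharacterSet *newlines = [NSCharacterSet newlineCharacterSet];

    NSUInteger i = 0;
    while (i < text.length) {
        // Step by composed character sequence so surrogate pairs stay whole.
        NSRange r = [text rangeOfComposedCharacterSequenceAtIndex:i];
        NSString *piece = [text substringWithRange:r];
        i = NSMaxRange(r);

        if ([piece rangeOfCharacterFromSet:newlines].location != NSNotFound) {
            // A newline ends the current line whether or not it is full.
            [lines addObject:[temp copy]];
            [temp setString:@""];
            breakPoint = NSNotFound;
            continue;
        }
        [temp appendString:piece];
        if ([piece isEqualToString:@" "]) {
            breakPoint = temp.length - 1;
        }
        if (measure(temp) > maxWidth && breakPoint != NSNotFound) {
            // Cut at the last space so words are not chopped.
            [lines addObject:[temp substringToIndex:breakPoint]];
            [temp deleteCharactersInRange:NSMakeRange(0, breakPoint + 1)];
            breakPoint = NSNotFound;
        }
    }
    if (temp.length > 0) {
        [lines addObject:[temp copy]];
    }
    return lines;
}
```

With a measure block that simply returns the string length, a width of 9 splits "something really long" into "something", "really" and "long". In a UIKit app the block could presumably wrap a call such as -sizeWithAttributes: and return the width, which also addresses the point that the result depends on the font.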

NSString stringWithCharacters Unicode Problem

This has got to be simple -- surely this method is supposed to work -- but I'm having some kind of two-byte-to-one-byte problem, I think.
The purpose of the code is to generate a string of 0 characters of a certain length (10 minus the number of digits that will be tacked onto the end). It looks like this:
const unichar zero = 0x0030;
NSString *zeroBuffer = [NSString stringWithCharacters:&zero length:(10 - [[NSString stringWithFormat:@"%i", photoID] length])];
Alternate second line (casting the thing at address &zero):
NSString *zeroBuffer = [NSString stringWithCharacters:(unichar *)&zero length:(10 - [[NSString stringWithFormat:@"%i", photoID] length])];
0x0030 is the address of the numeral 0 in the Basic Latin portion of the unicode table.
If photoID is 123 I'd want zeroBuffer to be @"0000000". What it actually ends up as is a zero and then some crazy unicode characters along the lines of (not sure how this will show) this:
0䪨 燱ܾ뿿﹔
I'm assuming that I've got data crossing character boundaries or something. I've temporarily rewritten it as a dumb substring thing, but this seems like it would be more efficient.
What am I doing wrong?
stringWithCharacters:length: expects the first argument to be the address of a buffer containing each of the characters to be inserted in the string in sequence. It's reading your character zero for the first character, then advancing to the following memory address and reading whatever data is there for the next character, and so on. This is not the right method for doing what you're trying to do.
Alas, there isn't a built-in repeat-this-string method. See the answers here for suggestions.
Alternatively, you can avoid the issue completely and just do this:
[NSString stringWithFormat:@"%010i", photoID];
That causes the number formatter to output a decimal number padded with leading zeroes to a total width of ten digits.
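To make the difference concrete, here is a small sketch with three hypothetical helpers. ZeroRunFromBuffer shows the kind of buffer stringWithCharacters:length: expects (one unichar per character, capped at 10 for this example). ZeroRun gets the same result through -stringByPaddingToLength:withString:startingAtIndex:, which can repeat a pad string. PaddedPhotoID is the format-string route from above.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: fills a real buffer, so every unichar that
// stringWithCharacters:length: reads is one we wrote.
static NSString *ZeroRunFromBuffer(NSUInteger count) {
    unichar buffer[10];
    if (count > 10) {
        count = 10; // keep within the fixed-size buffer
    }
    for (NSUInteger i = 0; i < count; i++) {
        buffer[i] = 0x0030; // code point of the digit 0
    }
    return [NSString stringWithCharacters:buffer length:count];
}

// Hypothetical helper: pads the empty string with "0" up to `count` units.
static NSString *ZeroRun(NSUInteger count) {
    return [@"" stringByPaddingToLength:count withString:@"0" startingAtIndex:0];
}

// Hypothetical helper: the whole padded ID in one call.
static NSString *PaddedPhotoID(int photoID) {
    return [NSString stringWithFormat:@"%010d", photoID];
}
```

For a photoID of 123, the first two helpers called with 7 give the seven-zero prefix the question asks for, and the third gives the full ten-digit string directly.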

Determine if non-numerical characters have been pasted into UITextField

For a specialized calculator I would like to allow copy / paste for a textfield which is meant for numerical values only. So, only numerical characters should be actually pasted or the pasted string should be rejected if it contains non-numerical characters.
I was thinking about using UITextFieldDelegate's textField:shouldChangeCharactersInRange:replacementString: method to check the pasted string for non-numerical characters. But NSString offers no method for checking whether it does NOT contain characters specified in a single set. So this way I would need to check occurrences of characters from several sets, which is clumsy, and these checks would run for every single digit that is typed in, which seems like quite some overhead to me.
Another way would be to iterate and check for every character in the replacement string whether there's a match in a numerical set.
Either way would probably work, but I feel like I'm missing something.
Do you have any advice? Is there a convenience method to achieve this?
"But NSString offers no method for checking whether it does NOT contain characters specified in a single set"
Sure it does.
if([myString rangeOfCharacterFromSet:myCharacterSet].location ==NSNotFound)
{
//means there is no character from specified set in specified string
}
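Applied to the numeric-only case, the set to search for is the inverse of the allowed characters. A minimal sketch follows; ContainsOnlyASCIIDigits is a hypothetical helper, and restricting the allowed set to the ten ASCII digits is an assumption (decimalDigitCharacterSet would also accept digits from other scripts).

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: YES when every character is one of 0-9.
// An empty string (e.g. a deletion) also returns YES.
static BOOL ContainsOnlyASCIIDigits(NSString *candidate) {
    NSCharacterSet *nonDigits =
        [[NSCharacterSet characterSetWithCharactersInString:@"0123456789"] invertedSet];
    // No character from the inverted set means nothing but digits is present.
    return [candidate rangeOfCharacterFromSet:nonDigits].location == NSNotFound;
}
```

Returning this value for the replacement string from textField:shouldChangeCharactersInRange:replacementString: should reject a paste that contains anything else, with a single search per change.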

Search or compare within a Grapheme Cluster in Korean

In my current implementation of a UISearchBarController I'm using [NSString compare:] inside the filterContentForSearchText:scope: delegate method to return relevant objects based on their name property to the results UITableView as you start typing.
So far this works great in English and Korean, but what I'd like to be able to do is search within NSString's defined character clusters. This is only applicable for a handful of languages, of which Korean is one.
In English, compare: returns new results after every letter you enter, but in Korean the results are generated once you complete a recognized grapheme cluster. I would like to be able to search through my Korean objects name property via the individual elements that make up a syllable.
Can anyone shed any light on how to approach this? I'm sure it has something to do with searching through UTF16 characters manually, or by utilising a lower level class.
Cheers!
Here is a specific example that's just not working:
NSString *string1 = @"이";
NSString *string2 = @"ㅣ";
NSRange resultRange = [[string1 decomposedStringWithCanonicalMapping] rangeOfString: [string2 decomposedStringWithCanonicalMapping] options:(NSLiteralSearch)];
The result is always NSNotFound, with or without decomposedStringWithCanonicalMapping.
Any ideas?
I'm no expert, but I think you're very unlikely to find a clean solution for what you want. There doesn't seem to be any relationship between a Korean character's Unicode value and the graphemes that it's made up of.
e.g. "이" is \uc774 and "ㅣ" is \u3163. From the perspective of the NSString, they're just two different characters with no specific relationship to each other.
I suspect that you will have to find or create an explicit mapping between characters and their graphemes, and then write your own search function that consults this mapping.
This very long page on Unicode Korean can help you, if it comes to that. It has a table of all the characters which suggests some structured relation between the way characters are numbered and their components.
If you use compare:options: with NSLiteralSearch, it should compare character by character, that is, by Unicode code points, regardless of the grapheme. The default behavior of compare: is to use no options. You could use -decomposedStringWithCanonicalMapping to get the decomposed form of the input string, but I'm not sure how that would interact with compare:.
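One likely reason the example in the question fails: "ㅣ" typed on its own is U+3163, a compatibility jamo, while canonical decomposition of "이" yields the conjoining jamo U+110B U+1175. Canonical mapping leaves U+3163 unchanged, so the two never meet. Compatibility decomposition maps U+3163 to U+1175, which gives a common form to compare. The sketch below relies on that; ContainsJamo is a hypothetical helper, and the comparison is done by hand over UTF-16 units to make explicit that it ignores grapheme boundaries.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: YES if the jamo in `needle` occur, as a contiguous run
// of UTF-16 units, inside the fully decomposed form of `haystack`.
static BOOL ContainsJamo(NSString *haystack, NSString *needle) {
    NSString *h = [haystack decomposedStringWithCompatibilityMapping];
    NSString *n = [needle decomposedStringWithCompatibilityMapping];
    if (n.length == 0 || n.length > h.length) {
        return NO;
    }
    for (NSUInteger start = 0; start + n.length <= h.length; start++) {
        NSUInteger k = 0;
        while (k < n.length &&
               [h characterAtIndex:start + k] == [n characterAtIndex:k]) {
            k++;
        }
        if (k == n.length) {
            return YES; // every unit of the needle matched at this offset
        }
    }
    return NO;
}
```

One caveat: compatibility consonants such as ㄱ decompose to the leading-consonant form, so a consonant in final position of a syllable (a different code point) would not match without an extra mapping.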