decoding quoted-printables - iphone

I am looking for a way to decode quoted-printables.
The quoted-printables are for arabic characters and look like this:
=D8=B3=D8=B9=D8=A7=D8=AF
I need to convert it to a string and then store or display it.
I've seen posts on Stack Overflow for the other direction (encoding), but couldn't find anything on decoding.

Uhm, it's a little hacky but you could replace the = characters with a % character and use NSString's stringByReplacingPercentEscapesUsingEncoding: method. Otherwise, you could essentially split the string on the = characters, convert each element to a byte value (easily done using NSScanner), put the byte values into a C array, and use NSString's initWithBytes:length:encoding: method.
Note that your example isn't technically in quoted-printable format, which specifies that a quoted-printable is a three character sequence consisting of an = character followed by two hex digits.
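The second approach (turn each =XX into a byte value and collect the bytes in a C array) can be sketched in plain C, which also compiles inside an Objective-C file. This is only a minimal sketch under stated assumptions: the names qp_hex_value and qp_decode are made up for illustration, the input is assumed to be a NUL-terminated C string, and a malformed "=" is simply copied through rather than reported.

```c
#include <stddef.h>

/* Returns the value of one hex digit, or -1 if c is not a hex digit. */
static int qp_hex_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

/* Decodes quoted-printable text from `in` into `out` and returns the number
 * of bytes written. `out` must have room for at least strlen(in) bytes,
 * because decoding never grows the data. The output is raw bytes and is
 * not NUL-terminated. (Hypothetical helper, not a library function.) */
size_t qp_decode(const char *in, unsigned char *out)
{
    size_t n = 0;
    while (*in != '\0') {
        if (in[0] == '=') {
            /* Soft line break: "=\r\n" or "=\n" is dropped entirely. */
            if (in[1] == '\r' && in[2] == '\n') { in += 3; continue; }
            if (in[1] == '\n') { in += 2; continue; }
            /* Escaped byte: '=' followed by two hex digits. */
            int hi = qp_hex_value(in[1]);
            int lo = (hi >= 0) ? qp_hex_value(in[2]) : -1;
            if (hi >= 0 && lo >= 0) {
                out[n++] = (unsigned char)(hi * 16 + lo);
                in += 3;
                continue;
            }
        }
        /* Anything else, including a malformed '=', is copied as-is. */
        out[n++] = (unsigned char)*in++;
    }
    return n;
}
```

The resulting bytes and count could then be handed to NSString's initWithBytes:length:encoding: with the encoding the text was written in (UTF-8 for the example in the question).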

In my case I was coming from EML... bensnider's answer worked great... quoted-printable (at least in EML) uses an = sign followed by \r\n to signify a line wrapping, so this was the code needed to cleanly translate:
(Made as a category cause I loves dem)
@interface NSString (QuotedPrintable)
- (NSString *)quotedPrintableDecode;
@end
@implementation NSString (QuotedPrintable)
- (NSString *)quotedPrintableDecode
{
    NSString *decodedString = [self stringByReplacingOccurrencesOfString:@"=\r\n" withString:@""]; // Ditch the line wrap indicators
    decodedString = [decodedString stringByReplacingOccurrencesOfString:@"=" withString:@"%"]; // Change the ='s to %'s
    decodedString = [decodedString stringByReplacingPercentEscapesUsingEncoding:NSUTF8StringEncoding]; // Replace the escaped strings.
    return decodedString;
}
@end
Which worked great for decoding my EML / UTF-8 objects!

Bensnider's answer is correct, and it is the easy way to do it.
You'll need to replace each "=" with "%":
NSString *s = @"%D8%B3%D8%B9%D8%A7%D8%AF";
NSString *s2 = [s stringByReplacingPercentEscapesUsingEncoding:NSUTF8StringEncoding];
s2 then holds "سعاد", which makes sense, so this should work in a straightforward way without a hack.

In some cases the line ends are not "=\r\n" but are only "=\n", in which case you need another step:
decodedString = [decodedString stringByReplacingOccurrencesOfString:@"=\n" withString:@""];
Otherwise, the final step fails due to the unbalanced "%" at the end of a line.

I know nothing of the iPhone, but most email processing libraries will contain functions to do this, as email is where this format is used. I suggest searching for MIME decoding functions.
The earlier poster's approach also seems fine to me. I feel he is being a little too self-deprecating in describing it as hacky :)

Please see this working solution, which takes a string containing quoted-printable sequences and resolves them to graphemes. The only thing you need to pay attention to is the encoding (that answer assumes UTF-8, but it can easily be switched to any other): https://stackoverflow.com/a/32903103/2799410

Related

How to remove the last unicode symbol from NSString

I have implemented a custom keyboard associated with a text field, so when the user presses the delete button, I remove the last character from the string, and manually update the current text field text.
NSRange range = NSMakeRange(currentTextFieldString.length-1, 1);
[currentTextFieldString replaceCharactersInRange:range withString:@""];
So far so good.
Now, the problem is that the user can also enter some special Unicode symbols. These are not always 1 byte; they can be 2 bytes too. When the delete button is pressed I have to remove the entire symbol, but with the approach above the user has to press delete twice.
Here, if I do:
NSRange range = NSMakeRange(currentTextFieldString.length-2, 2);
[currentTextFieldString replaceCharactersInRange:range withString:@""];
it works fine, but then the normal characters, which are just 1 byte, get deleted two at a time.
How to handle such scenarios?
Thanks in advance.
EDIT:
It is strange, that if I switch to the iPhone keyboard, it handles both cases appropriately. There must be some way to do it, there is something that I am missing, but am not able to figure out what.
Here's the problem. NSStrings are encoded using UTF-16. Many common Unicode characters take up only one unichar (a 16-bit unsigned value). However, some characters take up two unichars. Even worse, some characters can be composed or decomposed, e.g. é might be one Unicode code point or it might be two: an e followed by a combining acute accent. This makes it quite difficult to do what you want, namely delete one "character", because it is really hard to tell how many unichars it takes up.
Fortunately, NSString has a method that helps with this: -rangeOfComposedCharacterSequenceAtIndex:. What you need to do is get the index of the last unichar, run this method on it, and the returned NSRange will tell you where to delete from. It goes something like this (not tested):
NSUInteger lastCharIndex = [myString length] - 1; // I assume string is not empty
NSRange rangeOfLastChar = [myString rangeOfComposedCharacterSequenceAtIndex: lastCharIndex];
myNewString = [myString substringToIndex: rangeOfLastChar.location];
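To illustrate why the last "character" may span one or two unichars, here is a small sketch in plain C (which also compiles as Objective-C). It only detects UTF-16 surrogate pairs; it does not handle combining sequences such as an e followed by a combining accent, which the NSString method above does cover. The function name last_code_point_units is made up for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns how many UTF-16 code units (1 or 2) the last code point of the
 * buffer occupies, or 0 for an empty buffer. Combining marks are NOT
 * considered here; this only looks at surrogate pairs. */
size_t last_code_point_units(const uint16_t *units, size_t length)
{
    if (length == 0) return 0;
    uint16_t last = units[length - 1];
    /* A low (trailing) surrogate lies in 0xDC00..0xDFFF ... */
    if (length >= 2 && last >= 0xDC00 && last <= 0xDFFF) {
        uint16_t prev = units[length - 2];
        /* ... and pairs with a high (leading) surrogate in 0xD800..0xDBFF. */
        if (prev >= 0xD800 && prev <= 0xDBFF) return 2;
    }
    return 1;
}
```

A delete key handler built on this idea would remove that many units from the end of the string instead of a fixed 1 or 2.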
If you can't get this to work by default, then use an if/else block and test if the last character is part of a special character. If it is, use the substring to length-2, otherwise use the substring to length-1.
I don't know exactly what the problem with the byte length of the special characters is.
What I suggest is:
Store the string length in a variable before adding any new characters.
If the user presses backspace, truncate the string back to that saved length. For example, if the saved length is 5 and the new length is 7, take the substring covering indexes 0 to 4, which crops the characters that were added last.
This is a workaround, as I don't know what exactly the problem is internally.
But I guess that logically this solution should work.
Enjoy Coding :)

NSLog outputs unicode characters as garbage when debugging on the iPhone

EDIT: NSLog output works well in the simulator, but doesn't work when connected to a real device. It seems that this is a bug: http://openradar.appspot.com/11148883. It also turns out to be related to LLDB; switching Xcode to GDB resolves the problem. Alternatively, it's possible to use JetBrains' AppCode, which works well with LLDB.
I have a bunch of Unicode strings in the application, and if I try to output any of those strings using something like NSLog(@"%@", aString), then all the ASCII characters in the string are printed fine but all the Cyrillic letters are messed up, so instead of
newLocation: coordinate:60.019584,30.284954 'Удельная'
I'm getting:
newLocation: coordinate:60.019584,30.284954 '–ü–æ–∫–ª–æ–Ω–Ω–æ–≥–æ—Ä—Å–∫–∞—è'
And that's quite hard to do any debugging with that kind of output. And because that app is targeted for the Russian market only I can't just change locale and use English strings.
So I wonder if there is any way to make NSLog work well with Unicode characters? I'm looking only for some kind of one-liner solution. I know that there are ways to write half a page of code and output Unicode characters, but I'm looking for something shorter. Ideally I'm looking for some method of NSString that will make it all work, e.g.
NSLog(@"%@", [aString someThingThatMakesUnicodeWorkWithXcodeConsole]);
Yes, obviously you can create a string that contains and outputs Cyrillic letters. When I was learning Objective-C, I had the same problem in the beginning (I was also working with Russian words and the like). The solution is to create the string from a C string with an explicit encoding, like this:
NSString *string = [NSString stringWithCString:"Привет, как дела?" encoding:4];
NSLog(@"%@", string);
or
NSString *string = [NSString stringWithUTF8String:"Этот вариант короче!"];
NSLog(@"%@", string);
Hope it helps you!
P.S. It means that you need to create your strings as C-style strings and set their encoding parameter to 4 (NSUTF8StringEncoding). You can see the full list of available values in the documentation for NSStringEncoding in NSString.
As far as I know it is relevant to NSLog() and LLDB on some Xcode versions. Have a try with one of these solutions:
Check log in Xcode Organizer >> Devices >> your device >> Console.
Use GDB as your debugger instead of LLDB if you are using the latter. This can be changed in the scheme options. Please refer to the steps in the comment by "cocos2d man" below.
Upgrade to Xcode 4.3.2. Some people say it solved this issue, but I haven't confirmed this myself.
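Try converting it to a UTF-8 C string:
const char *str = [aString UTF8String]; // UTF8String returns a C string, not an NSString
NSLog(@"%s", str); // note: NSLog interprets %s in the system default encoding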
Hope this helps.
Try putting it like NSLog(@"%@", aString);
EDIT:
You can convert it to a UTF-8 C string. This could get you through.
const char *str = [aString UTF8String];
Hope this helps.
Try this. It works for me.
NSLog(@"%@", [NSString stringWithCString:[[places description] cStringUsingEncoding:NSASCIIStringEncoding] encoding:NSNonLossyASCIIStringEncoding]);

how to write special character in objective-C NSString

when I try to write this JSON:
{"author":"mehdi","email":"email@hotmail.fr","message":"Hello"}
like this in Objective-C:
NSString *myJson=@"{"author":"mehdi","email":"email@hotmail.fr","message":"Hello"}";
it doesn't work. Can someone help me?
You need to escape quote characters with a backslash:
NSString *myJson = @"{\"author\":\"mehdi\",\"email\":\"email@hotmail.fr\",\"message\":\"Hello\"}";
Otherwise the compiler will think that your string literal ends right after the first {.
The backslashes will not be present as characters in the resulting NSString. They are merely there as hints for the compiler and are removed from the actual string during compilation.
Newbie note: JSON strings that you read directly from a file via Objective C of course do not need any escaping! (JSON itself may need such, but that's about it. No need for additional escaping on the ObjC-side of it.)
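The point that the backslashes never reach the stored string can be checked with a small plain C sketch, since C string literals use the same escaping rule for double quotes as Objective-C's string literals. The names kSampleJson and contains_backslash are made up for this example.

```c
#include <string.h>

/* The same JSON text written as a C string literal. The backslashes exist
 * only in the source code; the compiled string holds plain quote characters. */
static const char kSampleJson[] =
    "{\"author\":\"mehdi\",\"email\":\"email@hotmail.fr\",\"message\":\"Hello\"}";

/* Returns 1 if the stored text contains a backslash, 0 otherwise. */
int contains_backslash(const char *text)
{
    return strchr(text, '\\') != NULL;
}
```

Calling contains_backslash(kSampleJson) returns 0, and the second character of the stored text is the double quote itself.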

Newline chars somehow get added to my strings. And cant remove them

On some of my strings there seems to be some kind of newline character. I think this is the case because when I do a simple NSLog
NSLog(@"Test: %@",aNSMutableString);
I would get output like below
Test:
I am a String
I've tried using
[mutableString stringByTrimmingCharactersInSet:[NSCharacterSet newlineCharacterSet]];
But it does not remove whatever it is that's forcing the newline to happen.
A string that I parse out of a file, which should contain the 4 characters 'm3u8', has 5 characters when I check the length of the new string.
Anybody got an idea of what might be going on?
Thanks
-Code
P.S.
I know I could just zap the first char out of all my strings, but it feels like a hack and I still won't know what's going on.
[mutableString stringByTrimmingCharactersInSet:[NSCharacterSet newlineCharacterSet]];
The above will not directly modify your mutableString. It returns a new autoreleased NSString with the characters trimmed. See NSString doc.
e.g.
NSString *trimmedString = [mutableString stringByTrimmingCharactersInSet:[NSCharacterSet newlineCharacterSet]];
NSLog(@"Test: %@", trimmedString);
should give you expected results.
I think @Sam's answer will fix your problem, but I think the origin of your problem is the file source. Do you know how it is encoded? Is it part of a download? My guess is that you have a Windows file whose lines end in "\r\n" and you are using string tools that split on only one of those two characters, which leaves the other one attached to a neighbouring string.
Verify the source of the file and read the document lines with the appropriate encoding.
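If the stray character does turn out to be a leftover carriage return or line feed, stripping both kinds from the ends of each string is straightforward. Here is a minimal sketch in plain C (it also compiles as Objective-C); the function name trim_line_endings is made up for this example, and it assumes a writable, NUL-terminated buffer.

```c
#include <string.h>

/* Removes every leading and trailing '\r' and '\n' from `s` in place and
 * returns `s`. Characters in the middle of the string are left alone. */
char *trim_line_endings(char *s)
{
    size_t start = strspn(s, "\r\n");   /* number of leading CR/LF characters */
    size_t len = strlen(s + start);
    while (len > 0 && (s[start + len - 1] == '\r' || s[start + len - 1] == '\n')) {
        len--;                          /* drop trailing CR/LF characters */
    }
    memmove(s, s + start, len);         /* shift the kept part to the front */
    s[len] = '\0';
    return s;
}
```

On the NSString side, the same effect comes from keeping the value returned by stringByTrimmingCharactersInSet: as described above.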

NSString stringWithCharacters Unicode Problem

This has got to be simple -- surely this method is supposed to work -- but I'm having some kind of two-byte-to-one-byte problem, I think.
The purpose of the code is to generate a string of 0 characters of a certain length (10 minus the number of digits that will be tacked onto the end). It looks like this:
const unichar zero = 0x0030;
NSString *zeroBuffer = [NSString stringWithCharacters:&zero length:(10 - [[NSString stringWithFormat:@"%i", photoID] length])];
Alternate second line (casting the thing at address &zero):
NSString *zeroBuffer = [NSString stringWithCharacters:(unichar *)&zero length:(10 - [[NSString stringWithFormat:@"%i", photoID] length])];
0x0030 is the code point of the numeral 0 in the Basic Latin portion of the Unicode table.
If photoID is 123 I'd want zeroBuffer to be @"0000000". What it actually ends up as is a zero and then some crazy Unicode characters along the lines of (not sure how this will show) this:
0䪨 燱ܾ뿿﹔
I'm assuming that I've got data crossing character boundaries or something. I've temporarily rewritten it as a dumb substring thing, but this seems like it would be more efficient.
What am I doing wrong?
stringWithCharacters:length: expects the first argument to be the address of a buffer containing each of the characters to be inserted in the string in sequence. It's reading your character zero for the first character, then advancing to the following memory address and reading whatever data is there for the next character, and so on. This is not the right method for doing what you're trying to do.
Alas, there isn't a built-in repeat-this-string method. See the answers here for suggestions.
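If you do want to stay with stringWithCharacters:length:, the buffer has to really contain as many code units as the length you pass. A minimal sketch in plain C (which also compiles as Objective-C), with the hypothetical helper name fill_zero_digits and uint16_t standing in for unichar:

```c
#include <stddef.h>
#include <stdint.h>

/* Fills `buffer` with `count` copies of the UTF-16 code unit for the digit
 * '0' (0x0030). `buffer` must have room for at least `count` elements. */
void fill_zero_digits(uint16_t *buffer, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        buffer[i] = 0x0030;
    }
}
```

The filled buffer and the same count could then be passed to stringWithCharacters:length:, so that every unit the method reads is one you wrote.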
Alternatively, you can avoid the issue completely and just do this:
[NSString stringWithFormat:@"%010i", photoID];
That causes the number formatter to output a decimal number zero-padded to a width of ten digits.
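The same format specifier behaves identically in plain C's printf family, which makes it easy to try outside of Cocoa. A minimal sketch, with the hypothetical helper name format_photo_id:

```c
#include <stdio.h>

/* Writes `photo_id` into `out` as a decimal number zero-padded to a width
 * of ten characters and returns snprintf's result. `out_size` should be at
 * least 11 so that ten digits plus the terminating NUL fit. */
int format_photo_id(int photo_id, char *out, size_t out_size)
{
    return snprintf(out, out_size, "%010d", photo_id);
}
```

With photo_id 123 the buffer ends up holding 0000000123, which is the seven padding zeroes from the question followed by the digits.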