NSXMLParser foundCharacters: method is not reading words in one go?

NSXMLParser's parser:foundCharacters: method does not deliver a string in one call when it contains special characters, as in København, which is a Danish word.
It breaks the word at ø and delivers the rest separately...

What's your question? This is documented behavior:
The parser object may send the delegate several parser:foundCharacters: messages to report the characters of an element. Because string may be only part of the total character content for the current element, you should append it to the current accumulation of characters until the element changes.
If you wish to capture the entire textual contents of a tag, you'll have to catch all these messages and join the contents in a string.
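The accumulate-until-the-element-ends pattern can be sketched outside Cocoa as well. The following is only an analogue in Python's xml.sax, whose characters() callback may likewise be called several times per element; it is not NSXMLParser code, and the `city` tag name and the `parse_cities` helper are made up for illustration.

```python
import xml.sax


class TextCollector(xml.sax.ContentHandler):
    """Collects the complete text of every <city> element (hypothetical tag)."""

    def __init__(self):
        super().__init__()
        self.buffer = None  # None while we are outside a <city> element
        self.cities = []

    def startElement(self, name, attrs):
        if name == "city":
            self.buffer = []  # start a fresh accumulation

    def characters(self, content):
        # May arrive in several chunks per element, so only append here.
        if self.buffer is not None:
            self.buffer.append(content)

    def endElement(self, name):
        if name == "city":
            # The text is known to be complete only when the element ends.
            self.cities.append("".join(self.buffer))
            self.buffer = None


def parse_cities(xml_bytes):
    """Parse UTF-8 XML bytes and return the text of each <city> element."""
    handler = TextCollector()
    xml.sax.parseString(xml_bytes, handler)
    return handler.cities
```

In an NSXMLParser delegate the same roles would fall to parser:didStartElement:..., parser:foundCharacters: and parser:didEndElement:..., with an NSMutableString as the buffer.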

Related

Why does an email subject contain linefeed or carriage return characters?

I'm writing code to check a mailbox and forward unseen mails to another user.
But sometimes it fails with an error:
ValueError: Header values may not contain linefeed or carriage return characters
I checked the raw fetched data and found out that the 'Subject' value contains \r\n.
Not all mails contain it, but some do.
The subject looks normal in the mailbox, and I have no idea why some contain such characters.
Does it have to do with the length of the subject?
How can I deal with these situations?
Thanks :)
Email messages have a maximum line length. That's historical, and the rule isn't upheld 100% of the time, so to speak. But in header fields, a CR LF followed by a sequence of spaces or htab characters is to be treated the same as a single space. This is a really long subject, encoded in that way:
Subject: Pretend this is about 80-90
 characters long
The simplest way to deal with it is to consider any sequence of whitespace characters (including CR and LF) to be a single space.
Read the source of any email message and you'll see this wrapping in most of them. The Received field is almost always wrapped, for instance, and quite often To if there are many addressees, or Content-Type/Content-Disposition for attachments.
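A minimal sketch of that whitespace-collapsing approach in Python, assuming the raw subject is already available as a str. The names `unfold_header` and `build_forward` are hypothetical helpers, not part of any library, and the sketch does not decode RFC 2047 encoded words.

```python
import re
from email.message import EmailMessage


def unfold_header(value):
    """Collapse every run of whitespace, including CR LF, into one space."""
    return re.sub(r"\s+", " ", value).strip()


def build_forward(raw_subject):
    """Build a message whose Subject is safe to assign (hypothetical helper)."""
    msg = EmailMessage()
    # Assigning the raw, folded value here is what triggers the ValueError.
    msg["Subject"] = unfold_header(raw_subject)
    return msg


raw = "Pretend this is about 80-90\r\n characters long"
forward = build_forward(raw)
print(forward["Subject"])
```

Cleaning the value before assigning it keeps the fix in one place, so the same helper can be reused for other wrapped headers such as To.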

How to parse special characters in XML for iPad?

I am having a problem parsing XML files that contain special characters such as single and double quotes (', "). I am using NSXMLParser's parser:foundCharacters: method to collect characters in my code.
<synctext type = "word" >They raced to the park Arthur pointed to a sign "Whats that say" he asked Zoo said DW Easy as pie</synctext>
When I parse and save the text from the above tag of my XML file, the resulting string appears in GDB as
"\n\t\tThey raced to the park Arthur pointed to a sign \"Whats that say\" he asked Zoo said DW Easy as pie";
Observe that there are 2 issues:
1) Unwanted characters at the beginning of the string.
2) The double quotes around Whats that say.
Can anyone please help me get rid of these unwanted characters and read special characters properly?
NSString *trimmed = [string stringByTrimmingCharactersInSet:[NSCharacterSet characterSetWithCharactersInString:@" \n\t"]];
The parser is apparently returning exactly what's in the string. That is, the XML was coded with the starting tag on one line, then a newline, two tabs, and the start of the string. The quotes in the string are obviously there in the original (and it's not clear, at least in this example, why you'd want to delete them). The backslashes in front of them are only how GDB displays the string; they are not part of the data.
But if you want these characters gone, then you need to post-process the string. You can use Rams' statement to eliminate the newline and tabs, and stringByReplacingOccurrencesOfString:withString: to zap the quotes.
(Note that some XML parsers can be instructed to return strings like this with the leading/trailing whitespace stripped, but I'm not sure about this one. The quotes will always be there, though.)

query about xml parsing

I just want to know whether there are any limitations in XML parsing with regard to characters.
For example, can we parse a word containing characters like
"frühe" containing "ü"
"böser" containing "ö"
My XML is in a few different languages, and some characters are like the ones above.
When I look at the console, the text gets interrupted exactly when it reaches "ü",
because the console prints "fr".
Can someone give me some ideas about this?
regards
shishir
If you are using the standard NSXMLParser class and the XML file has the correct encoding= attribute, then you shouldn't have anything to worry about. The console output probably isn't Unicode-aware, so it is interpreting the multi-byte UTF-8 characters literally. Try showing the parsed text in a UIAlertView or some other UI element and see if you still have problems.

NSURL doesn't work every time

I have the following problem: sometimes my openURL dialog works perfectly. When I look at the URL variable in that case, it contains:
www.brehm-gmbh.de
But at other times there are some strange characters at the end of the variable, like this:
www.adamczyk-fenster.de%E2%80%8E
I get these pages from an .asc file, and both look normal in that file, without these characters.
What can I do to solve this problem?
Thanks in advance for your help.
From Wikipedia:
The left-to-right mark (LRM) is a control character or non-printing character, used in the computerized typesetting of bi-directional text, containing mixed left-to-right scripts (such as English and Russian) and right-to-left scripts (such as Arabic and Hebrew). It is used to change the way adjacent characters are grouped with respect to text direction.
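%E2%80%8E is the percent-encoded UTF-8 form of that character (U+200E). Because it is invisible, it can sit in the file right after the URL without showing up in an editor. You're getting this because (1) you've got non-English URLs, are composing URLs from non-English strings, or have some other non-English elements nearby, so a direction mark came along with the text, or (2) it's garbage being interpreted as an encoding (unlikely if it is consistent).
Check which encoding the strings are read with: +[NSString localizedNameOfStringEncoding:] gives a readable name for an NSStringEncoding value, such as the one reported by -initWithContentsOfFile:usedEncoding:error:. You probably need to explicitly establish an encoding when you read in the strings before you put them in the NSURL.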
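To check what the trailing %E2%80%8E really is, it helps to decode it. The sketch below does this in Python rather than Objective-C, purely as an illustration; `strip_format_chars` is a hypothetical helper, and dropping every Unicode format character (category Cf) is an assumption that suits plain host names like the ones in the question.

```python
import unicodedata
from urllib.parse import unquote


def strip_format_chars(text):
    """Remove invisible Unicode format characters (category Cf), e.g. U+200E."""
    return "".join(ch for ch in text if unicodedata.category(ch) != "Cf")


# Decode the percent-escapes as UTF-8 to see the hidden character.
decoded = unquote("www.adamczyk-fenster.de%E2%80%8E")
print(unicodedata.name(decoded[-1]))

# Clean the string before building a URL from it.
cleaned = strip_format_chars(decoded)
print(cleaned)
```

The equivalent clean-up on the iOS side would be to remove the character from each string read from the file before creating the NSURL.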

NSXMLParser shreds umlauts (ä, ö, ü)

I use NSXMLParser for parsing XML documents from a server. They are encoded as UTF-8.
My problem is that NSXMLParser breaks at umlauts (ä, ö, ü) and starts a new element.
For example:
Lösen -- NSXMLParser ---> L + ösen
How do I get NSXMLParser to read my umlaut words completely, like every other word?
Regards
Sorry but based on your comment on the original question (foundCharacters receiving the text in two calls) the parser is behaving perfectly well. See the "Discussion" section for the parser:foundCharacters: method quoted below:
The parser object may send the delegate several parser:foundCharacters: messages to report the characters of an element. Because string may be only part of the total character content for the current element, you should append it to the current accumulation of characters until the element changes.
As you can see, the parser is free to pass your delegate the characters in as many chunks as it sees fit.
foundCharacters: is not delimited by tags; you need to concatenate the characters passed in until the next call to didEndElement.
I ran into that issue with Spanish characters in this method:
- (void)parser:(NSXMLParser *)parser foundCharacters:(NSString *)string
I'm sure that if you get foundCharacters: working together with didEndElement, you'll be fine.