Unity 2019 - linebreak \n not working for UI text elements - unity3d

I am having some difficulty getting linebreaks to work for my Unity UI elements. (Unity 2019.2.17f1 Personal)
What I'm doing is:
string twoLinesOfText = LanguagePack.getTextByID(ID);
result:
twoLinesOfText = "Text line 1\nText line 2"
Expected output:
Text line 1
Text line 2
Reality:
Text line 1\nText line 2
I have tried using "\n", "\\n" and "\r\n". None of these give the intended result.
I assign the text to the component using
UITextComponent.GetComponent<Text>().text = twoLinesOfText;
Can this direct assignment be a problem? Do I need to push my string through a ToString() or parse it somehow for the \n to be recognised?
Workaround:
I have a workaround. By using an XML file for my LanguagePack, and inserting (enter) linebreaks in the base file, I feed the linebreaks into my Unity UI elements. Obviously this is not ideal.
Reading back the strings in Debug.Log does not show which linebreak code was ultimately used: it just breaks the string according to the (enter) linebreaks in the XML file.

You can't import the line break through the language pack. What you should do is:
string line1 = LanguagePack.getTextByID(ID1);
string line2 = LanguagePack.getTextByID(ID2);
string twoLinesOfText = line1 + "\n" + line2;
UITextComponent.GetComponent<Text>().text = twoLinesOfText;

I ran into this problem myself. A little investigation showed that what I thought was \n in the string had been converted to \\n, so it showed in the text box as \n.
Converting it during debugging to just \n got me the multiline text I wanted.
Now to investigate where in my data chain it got converted :-)
OK, investigation complete. A file was saved on my PC by a Visual Basic program using the File.WriteAllLines function, and one of those lines had a couple of instances of \n. A look at that file in Notepad shows the line was written correctly. The problem came when I used File.ReadAllLines in my Unity program, as it converted those \n instances to \\n. As far as I can tell this is not documented behaviour; in fact, reading the MS docs, you might think it would have split that line into multiple lines, which it doesn't do.
I checked in my VB program, and File.ReadAllLines does not behave this way there. It probably has something to do with the environment: VB does not treat \n as an escape sequence, C# does. I fixed the problem by tagging a replace onto the string, e.g. string.Replace("\\n", "\n"). It's entirely possible that writing a string from C# with File.WriteAllLines could also mess with \n.
Geez, this was hard to write, as the editor here messes with \\n and converts it to \n, so I end up having to use \\\n.
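To illustrate the Replace fix in this answer: escape sequences such as \n are only interpreted in source-code string literals, so text loaded from a file usually contains the two characters backslash and n. The sketch below is in Python rather than C#, and the helper name unescape_newlines is made up for illustration. It is a naive replacement that assumes the data never contains an escaped backslash followed by n.

```python
def unescape_newlines(text):
    """Turn the two-character sequences backslash+n (and backslash+r
    backslash+n) into real line breaks. Hypothetical helper, naive on purpose."""
    return text.replace("\\r\\n", "\n").replace("\\n", "\n")


# What a language file typically delivers: a literal backslash followed by 'n'.
loaded = "Text line 1\\nText line 2"

# repr() makes the difference visible: the doubled backslash means a literal one.
print(repr(loaded))
print(repr(unescape_newlines(loaded)))
print(unescape_newlines(loaded))
```

The same check with repr() (or the debugger's raw view in C#) is a quick way to see whether a string holds a real newline or a backslash followed by n.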

For people who encounter this issue: you could try an HTML-like syntax and see whether it works.
E.g.:
Use <br> for the newline instead of \n

Related

Last string appears at beginning of line in formatted output

Has anyone any idea why the following would format itself in a weird way? In several years I've had no problem with creating simple text output but this problem has me baffled.
I'm using the line
print "$BC,$Ttl,$FN,$SN,$Finalage,$OurLoc,$OurDT,$FinalPC\n";
Every value is a simple text string on which I've run "chomp" to remove return characters.
I would expect the output to look like the following:
*DD10099999,,Information Services,Guest Ticket 2,41,C G,03/11/2020,NE8 9BB*
$BC is the first item and $FinalPC is the postcode at the end.
Instead I get:
*,NE8 9BB99, ,Information Services,Guest Ticket 2,41,C G,03/11/2020*
The final item has somehow moved to the beginning of the line and overwritten the first item. This is happening consistently on every line of my screen and text file output and I'm completely stumped as to why. The data is read from a text file and compared with database output which is also simple text. There are no occurrences of \b anywhere in my code. Why would a backspace character get into it?
The string in $OurDT ends with a carriage return, which causes your terminal to home the cursor. Presumably, the value of $OurDT came from a Windows file read on a unixy machine.
One option is to fix the file (e.g. by using the dos2unix utility).
Another is to accept both CRLF and LF as line endings (e.g. by using s/\s+\z// instead of chomp).
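The difference between removing only the LF and removing every trailing whitespace character can be sketched like this. It is Python rather than Perl, and the function names chomp_like and strip_line_ending are made up for illustration; the sample date is taken from the question.

```python
def chomp_like(value):
    # Roughly what Perl's chomp does by default: drop one trailing LF only.
    return value[:-1] if value.endswith("\n") else value


def strip_line_ending(value):
    # Drop all trailing whitespace, CR included (the idea behind s/\s+\z//).
    return value.rstrip()


# A line from a CRLF (Windows) file, read on a Unix-like system.
raw = "03/11/2020\r\n"

print(repr(chomp_like(raw)))         # the carriage return is still there
print(repr(strip_line_ending(raw)))  # clean value
```

Printing with repr() is also a handy way to spot the stray carriage return in the first place, since the terminal otherwise just moves the cursor to the start of the line.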

Save UITextView String With \n Instead of Line Breaks (Swift)

I have a UITextView in my Swift app in which users can input text. They can input text with as many line breaks as they like but I need to save the string with the newline command (\n). How would I do this?
For example, my user inputs
Line 1
Line 2
Line 3
in the UITextView. If I was to retrieve the string...
let string = textview.text!
this would return
"Line 1
Line 2
Line 3"
when I would like for it to return
"Line 1\nLine 2\nLine 3"
How would I go about doing this? Can I use a form of replacingOccurrences(of:with:)? I feel like I'm missing a fairly obvious solution...
Eureka! After WAY too much research and learning all about String escapes, I found a very simple solution. I'm quite surprised that this isn't an answer out there already (as far as I can tell haha) so hopefully, this helps someone!
It's actually quite simple, and this will work for any String you could be using.
textView.text!.replacingOccurrences(of: "\n", with: "\\n")
Explanation:
OK, so as you can tell, it's quite simple. We want to replace the newline character \n with the literal two-character text "\n" (a backslash followed by n). The problem is that if we write the replacement as "\n", Swift interprets it as a newline again, not as literal text. This is why escapes are so important. As you can see, I am replacing "\n" with "\\n". The extra \ escapes the backslash itself, so the result contains a literal backslash followed by n.
I hope this helps someone! Have a great day!
Have you tried replacing \r with \n? or even \r\n with \n?
I hope I am not making an obvious assumption you considered, but maybe this may come in handy:
Is a new line = \n OR \r\n?

Why is this LSEP symbol showing up on Chrome and not Firefox or Edge?

So this web page is rendering with these symbols and they are found throughout this website/application but on no other sites. Can anyone tell me
What this symbol is?
Why it is showing up only in one browser?
That character is U+2028 Line Separator, which is a kind of newline character. Think of it as the Unicode equivalent of HTML’s <br>.
As to why it shows up here: my guess would be that an internal database uses LSEP to not conflict with literal newlines or HTML tags (which might break the database or cause security errors), and either:
The server-side scripts that convert the database to HTML neglected to replace LSEP with <br>
Chrome just breaks standards by displaying LSEP as a printing (visible) character, or
You have a font installed that displays LSEP as a printing character that only Chrome detects. To figure out which font it is, right click on the offending text and click “Inspect”, then switch to the “Computed” tab on the right-hand panel. At the very bottom you should see a section labeled “Rendered Fonts” which will help you locate the offending font.
More information on the line separator, excerpted from the Unicode standard, Chapter 5.8, Newline Guidelines (on p. 12 of this PDF):
Line Separator and Paragraph Separator
A paragraph separator—independent of how it is encoded—is used to indicate a
separation between paragraphs. A line separator indicates where a line break
alone should occur, typically within a paragraph. For example:
This is a paragraph with a line separator at this point,
causing the word “causing” to appear on a different line, but not causing
the typical paragraph indentation, sentence breaking, line spacing, or
change in flush (right, center, or left paragraphs).
For comparison, line separators basically correspond to HTML <BR>, and
paragraph separators to older usage of HTML <P> (modern HTML delimits
paragraphs by enclosing them in <P>...</P>). In word processors, paragraph
separators are usually entered using a keyboard RETURN or ENTER; line
separators are usually entered using a modified RETURN or ENTER, such as
SHIFT-ENTER.
A record separator is used to separate records. For example, when exchanging
tabular data, a common format is to tab-separate the cells and to use a CRLF
at the end of a line of cells. This function is not precisely the same as line
separation, but the same characters are often used.
Traditionally, NLF started out as a line separator (and sometimes record
separator). It is still used as a line separator in simple text editors such as
program editors. As platforms and programs started to handle word processing
with automatic line-wrap, these characters were reinterpreted to stand for
paragraph separators. For example, even such simple programs as the Windows
Notepad program and the Mac SimpleText program interpret their platform’s NLF
as a paragraph separator, not a line separator. Once NLF was reinterpreted to
stand for a paragraph separator, in some cases another control character was
pressed into service as a line separator. For example, vertical tabulation VT
is used in Microsoft Word. However, the choice of character for line separator
is even less standardized than the choice of character for NLF. Many Internet
protocols and a lot of existing text treat NLF as a line separator, so an
implementer cannot simply treat NLF as a paragraph separator in all
circumstances.
Further reading:
Unicode Technical Report #13: Newline Guidelines
General Punctuation (U+2000–U+206F) chart PDF
SE: Why are there so many spaces and line breaks in Unicode?
SO: What is unicode character 2028 (LS / Line Separator) used for?
U+2028 on codepoints.net A misprint here says that U+2028 was added in v. 1.1 of the Unicode standard, which is false — it was added in 1.0
I found that in WordPress the easiest way to remove "L SEP" and "P SEP" characters is to execute these two SQL queries:
UPDATE wp_posts SET post_content = REPLACE(post_content, UNHEX('e280a9'), '');
UPDATE wp_posts SET post_content = REPLACE(post_content, UNHEX('e280a8'), '');
The javascript way (mentioned in some of the answers) can break some things (in my case some modal windows stopped working).
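For anyone who wants to check text for these characters outside the database, here is a minimal Python sketch. The function names and the choice of replacement strings are assumptions for illustration, not part of any answer above. It also shows where the e280a8 and e280a9 values in the SQL come from: they are the UTF-8 bytes of U+2028 and U+2029.

```python
LINE_SEPARATOR = "\u2028"       # shown by Chrome as "L SEP"
PARAGRAPH_SEPARATOR = "\u2029"  # shown by Chrome as "P SEP"


def find_separators(text):
    """Return (index, code point) pairs for every LS/PS character in text."""
    return [
        (i, "U+%04X" % ord(ch))
        for i, ch in enumerate(text)
        if ch in (LINE_SEPARATOR, PARAGRAPH_SEPARATOR)
    ]


def replace_separators(text, line_repl="\n", para_repl="\n\n"):
    """Swap LS/PS for ordinary newlines (replacement values are a guess at
    what is wanted; pass "" to delete them instead)."""
    return text.replace(LINE_SEPARATOR, line_repl).replace(
        PARAGRAPH_SEPARATOR, para_repl
    )


sample = "first line\u2028second line\u2029next paragraph"
print(find_separators(sample))
print(repr(replace_separators(sample)))
print(LINE_SEPARATOR.encode("utf-8").hex(), PARAGRAPH_SEPARATOR.encode("utf-8").hex())
```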
You can use this tool...
http://www.nousphere.net/cleanspecial.php
...to remove all the special characters that Chrome displays.
Steps:
Paste your HTML and Clean using HTML option.
You can manually delete the characters in the editor on this page and see the result.
Paste back your HTML in file and save :)
I recently ran into this issue, tried a number of fixes but ultimately I had to paste the text into VIM and there was an extra space I had to delete. I tried a number of HTML cleaners but none of them worked, VIM was the key!
9999years' answer is great.
If you use Symfony with Twig templates, I would recommend checking for an empty Twig block. In my case it was an empty Twig block with an invisible character inside.
The LSEP character was only displayed on certain devices and browsers.
On the others I had a blank space above the header and could not see any invisible character.
I had to inspect the GET request to see that the value 1f18 appeared before the opening html tag.
Once I removed the empty Twig block, it was gone.
Hope this can help someone one day...
My problem was similar, it was "PSEP" or "P SEP". Similar issue, an invisible character in my file.
I replaced \x{2029} with a normal space. Fixed. This problem only appeared on Windows Chrome. Not on my Mac.
I agree with @Kapil Bathija: basically you can copy and paste your HTML code into http://www.nousphere.net/cleanspecial.php and convert it.
It will then convert the special characters for you. Just remove the spaces in between the words and you will realize you have to press backspace twice, meaning there is an invalid character that can't be translated.
I had the same issue and it worked just fine afterwards.
You can also copy the text, paste it into a HTML editor such as Coda, remove the linebreak, copy it and paste it back into your site.
Video here: https://www.loom.com/share/501498afa7594d95a18382f1188f33ce
Looks like my client pasted HTML into WordPress after initially creating it with MS Word. Even deleting the &nbsp; and visible spaces did not fix the issue. The extended characters became visible in vi/vim.
If you don't have vi/vim available, try highlighting from 2 chars before the LSEP to 2 chars after the LSEP; delete that chunk, and re-type the correct characters.

PyGame: Proper use of Unicode

My goal is to create a program with which the user can learn Bible verses by being shown a problem and solving it through input (e.g. "Quote vers Gen 3:15"). As the Bible translation I have to work with is German, it contains a ton of umlauts, which never display properly.
My PyGame file's header:
#!/usr/bin/python
# -*- coding: utf-8 -*-
Later on, I list the three German umlauts:
u'ö'.encode('utf-8')
u'ä'.encode('utf-8')
u'ü'.encode('utf-8')
The txt-file is parsed by this function:
import codecs
import os

def load_list(listname):
    fullname = os.path.join("daten", listname + ".txt")
    # "utf-8-sig" decodes UTF-8 and drops a leading BOM if there is one
    with codecs.open(fullname, "r", "utf-8-sig") as name:
        lines = name.readlines()
    for x in range(0, len(lines)):
        lines[x] = lines[x].strip("\n")
        lines[x] = lines[x].strip("\r")
    print lines
I'm aware that I could combine the two strip calls into one, but that's not the topic here.
How can I get PyGame to display the umlauts from the text file correctly, and also display the umlauts in the user's input correctly? I have checked hundreds of suggestions, but I can't get anything to really work.
Any help is highly appreciated, before I lose my sane mind (well, as I'm sitting here, coding games, I probably did already anyway :D )
I'll try to summarize:
Printing something other than a string or unicode object triggers that object's __repr__() method. If it is a sequence, this applies to the contained elements as well, causing any non-ASCII character to be escaped with \xXX (or \uXXXX) notation. Note the difference between print 'text' and print ['text']: in the latter case, the string's quotes will be printed as well (besides the brackets, of course). Use str.join() for concatenating lists of strings in order to control the way the output looks.
It's a good idea to always explicitly decode input (as you do by using codecs) and encode the output (which is not done in the code snippets in your question).
The source file encoding (the # coding: utf8 line in the header) has nothing to do with the encoding of input and output. It only enables you to type non-ASCII characters in string literals (= characters inside quotes in the source file), instead of using \xXX escapes.
Hope that makes some things clearer. There's a lot that can go wrong that looks like an encoding error, and it's not always easy to find out what's actually happening.
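A small sketch of the decode-on-input, encode-on-output idea. Note that this is written for Python 3, not the Python 2 used in the question, so the print behaviour differs: Python 3 no longer escapes non-ASCII characters when printing a list, and ascii() is used here only to reproduce the escaped look. The sample bytes and the function name decode_lines are made up for illustration.

```python
def decode_lines(raw):
    """Decode UTF-8 bytes (dropping a BOM if present) and split into lines
    without their line endings. Hypothetical helper for illustration."""
    return raw.decode("utf-8-sig").splitlines()


# Bytes as they might sit in a UTF-8 file with a BOM and CRLF line endings.
raw = b"\xef\xbb\xbfGr\xc3\xbc\xc3\x9fe\r\nM\xc3\xa4dchen\r\n"

lines = decode_lines(raw)

print(ascii(lines))      # escaped form, similar to what printing a list showed
print("\n".join(lines))  # joined text: the umlauts appear as typed

# Encode explicitly when writing bytes back out.
encoded = "\n".join(lines).encode("utf-8")
print(encoded)
```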

Encoding newlines in iCal files

I'm trying to figure out how to encode newlines in the DESCRIPTION part of an iCal file in such a way that they will import properly into Outlook, Google Calendar and the Apple Calendar.
The original code I inherited used "=0D=0A" with a quoted-printable encoding, which works great in Outlook, but not in Google Calendar.
The spec seems to say you should use "\n" to represent a newline. This works great in Google Calendar, but Outlook just puts the literal "\n" characters in there.
Is there a way you've done this that will work consistently across calendaring systems?
OK, looks like I'm answering my own question.
The correct way to do it is to use "\n" for line breaks. Outlook did not recognize this because I had "ENCODING=quoted-printable" on the description. Once I removed that, Outlook displayed the new lines correctly.
Also, to get the file to open correctly in Apple iCal, you need to use "VERSION:2.0" for the file version. If you use "VERSION:1.0", it will tell you it can't read the file (even though it conforms to the 1.0 spec).
NOTE: As others have mentioned, the file actually has to contain the literal string \n. Since most languages treat that as an escape sequence meaning a newline character, you probably need to use the string \\n in your code.
The comment with the link to the RFC from Matthew Bucket above in the original post helped me. Quoting from there:
A BACKSLASH character in a "TEXT" property value MUST
be escaped with another BACKSLASH character
So, I did a
$description = str_replace("\r\n", "\\n", $description);
and it worked
Might be worth saying that you need the literal \n, not the newline symbol, literally backslash then n in the ical. Also don't forget to do the 75 character "folding" too.
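The escaping and folding mentioned in these answers can be sketched as follows. This is Python rather than the PHP used elsewhere in the thread, the function names are made up, and the folding is deliberately simplified: it counts characters and so assumes ASCII-only content, whereas the limit in the spec is 75 octets.

```python
def escape_ical_text(value):
    """Escape a TEXT value: backslash first, then ; and , and finally turn
    real line breaks into the literal two characters backslash + n."""
    value = value.replace("\\", "\\\\")
    value = value.replace(";", "\\;").replace(",", "\\,")
    value = value.replace("\r\n", "\n").replace("\r", "\n")
    return value.replace("\n", "\\n")


def fold_line(line, limit=75):
    """Fold a content line: continuation lines start with one space and
    lines are joined with CRLF. Assumes ASCII (1 character = 1 octet)."""
    chunks = [line[:limit]]
    rest = line[limit:]
    while rest:
        # The leading space counts towards the limit of the folded line.
        chunks.append(" " + rest[: limit - 1])
        rest = rest[limit - 1 :]
    return "\r\n".join(chunks)


description = "Agenda: intro, demo; questions\r\nSecond line"
print(fold_line("DESCRIPTION:" + escape_ical_text(description)))
```

Escaping the backslash before anything else matters: done last, it would double the backslashes that the other replacements just added.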
Your output file should be like below---
BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//2013//#Ur Site Name#//EN
BEGIN:VEVENT
UID:[event]2012
DTSTART:20130101T100000
DTEND:20130101T120000
LOCATION:
SUMMARY:#Meeting Title here#
DESCRIPTION:What is realistic for financial services companies to achieve via Social Media channels? \n\nJoin us on 11th September 2013 at 4pm (BST) where we
-----bla bla bla ----
END:VEVENT
END:VCALENDAR
Here you have to take care of the version, which should be 2.0, and of the escaped characters: \n (newline), semicolon (;) and comma (,). If you are writing in .NET, they should look like "\\n", "\\;" and "\\,".
You can check your output file on this site as well... https://icalendar.org/validator.html
Thanks,
Bhaskar
According to this RFC:
Content lines are delimited by a line break,
which is a CRLF sequence (CR character followed by LF character).
So you should use \r\n. I used this in strings without additional backslash escaping.
This is my answer for DESCRIPTION
$filev = str_replace("\r\n", '\\n', $p);
$filev = str_replace("<br>",'\\n',$filev);
$filev = (str_replace(";","\;",str_replace(",",'\,',$filev)));
I had to escape the output in the string to set a literal "\n" in the output file. Like so. Worked a charm.
$events .= "DESCRIPTION:" . str_replace("\n","\\n",str_replace(";","\;",str_replace(",",'\,',get_event_contents()))) . "\n";
=0D=0A works with Outlook, but you'll need to change the DESCRIPTION key, so that line breaks can be interpreted.
DESCRIPTION;ENCODING=QUOTED-PRINTABLE:
Enter your text after the colon, using =0D=0A for line breaks. Outlook will read the line breaks correctly. Using \\n only works if you're using DESCRIPTION without ENCODING=QUOTED-PRINTABLE.
I'm using VERSION:2.0