Error: The input character is not valid in MATLAB statements or expressions - matlab

I get an error when I try to run this in MATLAB R2012b.
It yields the error:
>> t=-2:.1:5;
Error: The input character is not
valid in MATLAB statements or
expressions.
Editor's note:
The first code line of the original post contained a "hidden" character (the single source of the error) which was lost, due to SO formatting, in the first edit (intended to fix code formatting). Even after rolling back to revision 1, the "hidden" character is lost.
t={Character: ASCII Code 2}-2:.1:5;
Original code (thanks Daniel) can be found here

In your code, the third character of t=-2:.1:5; is not a space (ASCII code 32), as MATLAB displays it, but a start-of-text character (ASCII code 2). I have no idea how this control character got into your code, but to clean it up I recommend a text editor that can display all hidden characters.
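If no such editor is at hand, a short script can locate hidden characters as well. The following is only a minimal sketch in Python (not MATLAB); the function name find_hidden_chars and the sample line are assumptions made for illustration, with the STX character placed where the answer above says it was.

```python
import unicodedata

def find_hidden_chars(text):
    """Return (index, code point, name) for each control/format character in text."""
    hits = []
    for i, ch in enumerate(text):
        # Cc = control characters (e.g. U+0002), Cf = invisible format characters
        if unicodedata.category(ch) in ("Cc", "Cf"):
            # Control characters have no Unicode name, so fall back to a placeholder
            name = unicodedata.name(ch, "<unnamed control>")
            hits.append((i, f"U+{ord(ch):04X}", name))
    return hits

# Hypothetical reconstruction of the faulty line: STX (code 2) as the third character
line = "t=\x02-2:.1:5;"
print(find_hidden_chars(line))
```

Indices are zero-based, so index 2 corresponds to the third character. Note that tabs and newlines are also in category Cc and would be reported too.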


SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 0-5: truncated \UXXXXXXXX escape

I am using AutoKey 95.8 (Python 3 version) on Linux Mint 19.3, and I have a series of keyboard macros that generate Unicode characters. This example works:
# alt+shift+a = á
import sys
char = "\u00E1"
But the attempt to print an mdash [—] generates the following error:
SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 0-5: truncated \UXXXXXXXX escape
# alt+shift+- = —
import sys
char = "\u2014"
Any idea how to overcome this problem in Autokey is greatly appreciated.
The code you posted above would not generate the error you are getting. "truncated \UXXXXXXXX" requires an uppercase \U, which expects 8 hex digits. If you put char = "\U2014" in the Python source, you will get that error message (and you probably got it while experimenting with the file in this way).
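The difference between the two escapes can be checked directly. This is a small sketch in plain Python, independent of AutoKey; the helper name compiles is made up for the example, and the faulty source is kept inside a string (with a doubled backslash) so that the script itself stays valid.

```python
# Three equivalent spellings of the em dash
short_form = "\u2014"        # \u takes exactly 4 hex digits
long_form = "\U00002014"     # \U takes exactly 8 hex digits
named_form = "\N{EM DASH}"   # lookup by Unicode character name

def compiles(source):
    """Return True if the given Python source compiles without a SyntaxError."""
    try:
        compile(source, "<example>", "exec")
        return True
    except SyntaxError:
        return False

# Source text containing the malformed escape \U2014 (only 4 hex digits after \U)
bad_source = 'char = "\\U2014"'
# Source text containing the well-formed escape \u2014
good_source = 'char = "\\u2014"'

print(short_form == long_form == named_form)
print(compiles(bad_source), compiles(good_source))
```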
The sequence char = "\u2014" will create an em dash Unicode character on the Python side, but that does not mean it is possible to send it as a keyboard symbol via AutoKey to the window. That is the point where your program is likely failing (and since there is no programming error, you won't get a Python error message; it just won't work, although AutoKey might be nice and print an appropriate error message in this case).
You'd have to look up how to type an arbitrary Unicode character in your OS configuration (on Linux Mint it should be in the docs for "wayland", I guess) and send that character-composing sequence to AutoKey instead. If there is no such sequence, find a way to copy the desired character to the window environment's clipboard and then send AutoKey the "paste" sequence (usually ctrl + v, but it can vary by app; terminal emulators use ctrl + shift + v, for example).
When you need to emit non-English US characters in AutoKey, you have two choices. The simplest is to put them into the clipboard with clipboard.fill_clipboard(your characters) and paste them into the window using keyboard.send_keys("<ctrl>+v"). This almost always works.
If you need to define a phrase with multibyte characters in it, select the Paste using Clipboard (Ctrl+V) option. (I'm trying to get that to be the default option in a future release.)
The other choice, that I'm still not quite sure of, is directly sending the Unicode escape sequence to the window, letting it convert that into the actual Unicode character. Something like keyboard.send_keys("\U2014"). Assigning that to a variable first, as in the question, creates the actual Unicode character which that API call can't handle correctly.
The problem is that the underlying code for keyboard.send_keys() wants to send keycodes that actually exist on your keyboard or that it can add to an unused key in your layout. Most of the time that doesn't work for anything multibyte.

Identify hidden control character and ignore when scanning csv file

I am trying to use textscan in MATLAB to read in mixed format data from a .csv file. I am currently running into a problem that there are a number of nonvisible characters which are getting read in as a string when I am not expecting them. I believe if I set this character as a delimiter or whitespace it will solve my text scanning issue.
My main problem at the moment is that I don't know what character it is to be able to identify it. I have used isstrprop to determine that it is a control character. I guessed that it was the NUL character, so I tried adding \0 to the delimiter set for textscan. Unfortunately MATLAB does not recognize that as a valid \ constant.
Below is one line of the data file, copied from Notepad. The characters preceding each of the commas are the ones in question. The following line is the command I used in MATLAB to read it.
1 ,T,171215,173201,21.982413N,159.342881W,150 ,0 ,0 ,3D,SPS ,2.7 ,2.5 ,1.0 ,
C = textscan(fid,'%d%s%d%d%s%s%d%d%d%s%s%f%f%f%s','delimiter',',','headerlines',1,'MultipleDelimsAsOne',1)
Also, for what it's worth, using deblank on the string of characters that is read in does remove them. However, I only know how to apply this after the textscan, so the characters still throw off the parsing.
How can I identify this character and set it to be ignored by textscan?
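One possible way to approach this outside MATLAB is to identify the characters by code point and strip them in a preprocessing step before parsing. The sketch below is in Python and rests on assumptions: the sample line uses NUL (code 0) as the hidden character, which is only the guess made in the question, and the helper names are invented for the example.

```python
import csv
import unicodedata

KEEP = "\t\r\n"  # control characters that are legitimate in a text file

def report_control_chars(line):
    """Return the distinct control code points found in line, e.g. ['U+0000']."""
    return sorted({f"U+{ord(ch):04X}" for ch in line
                   if unicodedata.category(ch) == "Cc" and ch not in KEEP})

def strip_control_chars(line):
    """Drop control characters (category Cc) except tab and line endings."""
    return "".join(ch for ch in line
                   if ch in KEEP or unicodedata.category(ch) != "Cc")

# Hypothetical sample: assumes NUL padding before some of the commas
raw = "1\x00,T,171215,173201,21.982413N,159.342881W,150\x00,0\x00,3D,SPS\x00,2.7\x00\n"

print(report_control_chars(raw))                        # which hidden characters occur
fields = next(csv.reader([strip_control_chars(raw)]))   # parse the cleaned line
print(fields)
```

Once the code point is known, the cleaned file could be written back to disk and read with textscan as before.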

Why is this LSEP symbol showing up on Chrome and not Firefox or Edge?

So this web page is rendering with these symbols and they are found throughout this website/application but on no other sites. Can anyone tell me
What this symbol is?
Why it is showing up only in one browser?
That character is U+2028 Line Separator, which is a kind of newline character. Think of it as the Unicode equivalent of HTML’s <br>.
As to why it shows up here: my guess would be that an internal database uses LSEP to not conflict with literal newlines or HTML tags (which might break the database or cause security errors), and either:
The server-side scripts that convert the database to HTML neglected to replace LSEP with <br>
Chrome just breaks standards by displaying LSEP as a printing (visible) character, or
You have a font installed that displays LSEP as a printing character that only Chrome detects. To figure out which font it is, right click on the offending text and click “Inspect”, then switch to the “Computed” tab on the right-hand panel. At the very bottom you should see a section labeled “Rendered Fonts” which will help you locate the offending font.
More information on the line separator, excerpted from the Unicode standard, Chapter 5.8, Newline Guidelines (on p. 12 of this PDF):
Line Separator and Paragraph Separator
A paragraph separator—independent of how it is encoded—is used to indicate a
separation between paragraphs. A line separator indicates where a line break
alone should occur, typically within a paragraph. For example:
This is a paragraph with a line separator at this point,
causing the word “causing” to appear on a different line, but not causing
the typical paragraph indentation, sentence breaking, line spacing, or
change in flush (right, center, or left paragraphs).
For comparison, line separators basically correspond to HTML <BR>, and
paragraph separators to older usage of HTML <P> (modern HTML delimits
paragraphs by enclosing them in <P>...</P>). In word processors, paragraph
separators are usually entered using a keyboard RETURN or ENTER; line
separators are usually entered using a modified RETURN or ENTER, such as
A record separator is used to separate records. For example, when exchanging
tabular data, a common format is to tab-separate the cells and to use a CRLF
at the end of a line of cells. This function is not precisely the same as line
separation, but the same characters are often used.
Traditionally, NLF started out as a line separator (and sometimes record
separator). It is still used as a line separator in simple text editors such as
program editors. As platforms and programs started to handle word processing
with automatic line-wrap, these characters were reinterpreted to stand for
paragraph separators. For example, even such simple programs as the Windows
Notepad program and the Mac SimpleText program interpret their platform’s NLF
as a paragraph separator, not a line separator. Once NLF was reinterpreted to
stand for a paragraph separator, in some cases another control character was
pressed into service as a line separator. For example, vertical tabulation VT
is used in Microsoft Word. However, the choice of character for line separator
is even less standardized than the choice of character for NLF. Many Internet
protocols and a lot of existing text treat NLF as a line separator, so an
implementer cannot simply treat NLF as a paragraph separator in all
Further reading:
Unicode Technical Report #13: Newline Guidelines
General Punctuation (U+2000–U+206F) chart PDF
SE: Why are there so many spaces and line breaks in Unicode?
SO: What is unicode character 2028 (LS / Line Separator) used for?
U+2028 on A misprint here says that U+2028 was added in v. 1.1 of the Unicode standard, which is false — it was added in 1.0
I found that in WordPress the easiest way to remove the "L SEP" and "P SEP" characters is to execute these two SQL queries:
UPDATE wp_posts SET post_content = REPLACE(post_content, UNHEX('e280a9'), '');
UPDATE wp_posts SET post_content = REPLACE(post_content, UNHEX('e280a8'), '');
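The hex strings in those queries are the UTF-8 encodings of the two separator characters. The sketch below checks that in Python and shows the same replacement on a plain string; the function name remove_separators and the sample text are assumptions for illustration, not part of WordPress.

```python
LSEP = "\u2028"  # LINE SEPARATOR
PSEP = "\u2029"  # PARAGRAPH SEPARATOR

def remove_separators(text, replacement=""):
    """Replace every U+2028 / U+2029 in text with `replacement`."""
    return text.replace(LSEP, replacement).replace(PSEP, replacement)

# UTF-8 byte sequences of the two characters, printed as hex
print(LSEP.encode("utf-8").hex())
print(PSEP.encode("utf-8").hex())

sample = "first line\u2028second line\u2029"  # hypothetical post content
print(remove_separators(sample, " "))
```

Passing a space as the replacement, rather than an empty string, keeps words from running together where the separator stood in for a line break.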
The javascript way (mentioned in some of the answers) can break some things (in my case some modal windows stopped working).
You can use this tool... remove all the special characters that Chrome displays.
Paste your HTML and Clean using HTML option.
You can manually delete the characters in the editor on this page and see the result.
Paste back your HTML in file and save :)
I recently ran into this issue and tried a number of fixes, but ultimately I had to paste the text into Vim, where there was an extra space I had to delete. I tried a number of HTML cleaners, but none of them worked; Vim was the key!
9999years' answer is great.
If you use Symfony with Twig templates, I would recommend checking for an empty Twig block. In my case it was an empty Twig block with an invisible character inside.
The LSEP character was only displayed on certain devices and browsers.
On the others I had a blank space above the header and could not see any invisible character.
I had to inspect the GET request to see that the value 1f18 was before the opening html tag.
Once I removed the empty Twig block, it was gone.
I hope this can help someone one day.
My problem was similar, it was "PSEP" or "P SEP". Similar issue, an invisible character in my file.
I replaced \x{2029} with a normal space. Fixed. This problem only appeared on Windows Chrome. Not on my Mac.
I agree with #Kapil Bathija - Basically you can copy & paste your HTML code into and convert it.
Then it will convert the special characters for you. Just remove the spaces between the words, and you will notice that you have to press backspace twice, meaning there is an invalid character that can't be translated.
I had the same issue and it worked just fine afterwards.
You can also copy the text, paste it into a HTML editor such as Coda, remove the linebreak, copy it and paste it back into your site.
Looks like my client pasted HTML into Wordpress after initially creating it with MS-Word. Even deleting the and visible spaces did not fix the issue. The extended characters became visible in vi/vim.
If you don't have vi/vim available, try highlighting from 2 chars before the LSEP to 2 chars after the LSEP; delete that chunk, and re-type the correct characters.

utf8 inputenc error (RStudio & knitr & pdflatex); unknown unicode character 150=U+0096

I originally ran my .Rnw-file with the latex option:
It produced an error:
"! Package inputenc Error: Unicode char \u8: not set up for use with LaTeX."
I switched to [utf8x], which generated a somewhat more helpful error message:
"! Package ucs Error: Unknown Unicode character 150 = U+0096,
(ucs) possibly declared in uni-0.def."
I tried to replace the 0096 character using \DeclareUnicodeCharacter{0096}{\"o} to easily detect where the problem was, but with [utf8x] the error message remained the same, and with [utf8] there was an additional error: "! Package inputenc Error: Cannot define Unicode char value < 00A0"
Thanks for any help!
I had the same issue with my bibliography. In my editor (TeXstudio), the character U+0096 is rendered as whitespace. For some unknown reason, the line pdflatex reports as containing the offending character is inaccurate.
I solved the problem by running a regular expression search for \x0096 and it found the offending character immediately. Deleting the character and replacing it with a true space fixed the issue.
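The same search can be scripted when the editor's reported line is unreliable. U+0096 belongs to the C1 control range (U+0080 to U+009F); byte 0x96 is the en dash in Windows-1252, so such characters often come from text copied out of a Windows-1252 source, though that is only a likely cause. This Python sketch uses a made-up function name and a hypothetical .bib fragment.

```python
def find_c1_controls(text):
    """Return (line, column, code point) for each C1 control character (U+0080-U+009F)."""
    hits = []
    # Split on "\n" only: str.splitlines() would also break lines at U+0085,
    # which is itself a C1 control, and shift the reported positions.
    for line_no, line in enumerate(text.split("\n"), start=1):
        for col, ch in enumerate(line, start=1):
            if 0x80 <= ord(ch) <= 0x9F:
                hits.append((line_no, col, f"U+{ord(ch):04X}"))
    return hits

# Hypothetical .bib fragment with U+0096 where a dash was intended
bib = "@article{key,\n  pages = {10\u009620},\n}\n"
print(find_c1_controls(bib))
```

To scan a real file, read it with open(path, encoding="utf-8").read() and pass the result to the function; line and column numbers are one-based.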
Incidentally, I tried the \DeclareUnicodeCharacter{0096}{ } fix and it did nothing for me. This could be because the offending character was in the .bib file rather than the .tex file where I placed the command.
I do not think that switching to [utf8x] is a workable solution.
Just check your code carefully, particularly the parts you copied from somewhere rather than typed yourself.
I had the same problem recently.
Here is how I solved it.
I removed the code from the R Markdown file part by part to find which part caused the problem. Finally, I found that the part below resulted in the error in my code.
### Platform:Affymetrix A-AFFY-2-Affymetrix GeneChip Arabidopsis Genome [ATH1-121501].
I remember that I copied this information from a web page. So I deleted it and typed this part myself. After that, it ran and generated the PDF file without any error.
To be clear, here is the difference between the copied version and the version I typed:
This is just one example, I think. I want to point out that copying something from an unknown source into your code is always risky.
I hope this can help you and other people who were frustrated by this problem.

Line 1, Column 1: character "‍" not allowed in prolog

When I am going to validate my page using w3c validator, I am getting : Line 1, Column 1: character "‍" not allowed in prolog error.
There is a character, or data interpreted as a character, in the document before the doctype declaration. In the error message quoted, there is the character U+200D ZERO WIDTH JOINER (ZWJ) between the quotation marks, so this seems to be the culprit. ZWJ is an invisible control character. There is no point in having it at the start of a file, as it is supposed to cause ligature or joining behavior for the characters (usually letters) around it. ZWJ is invalid at the start of a document by HTML rules.
You may need a good editor, like BabelPad, to detect and remove the ZWJ.
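If no such editor is available, the leading invisible characters can also be removed with a few lines of script. This is a minimal Python sketch; the function name and the sample page are assumptions, and it strips any format (Cf) or control (Cc) characters before the first visible one, which covers ZWJ as well as a byte order mark.

```python
import unicodedata

def strip_leading_invisibles(text):
    """Remove format (Cf) and control (Cc) characters that precede the first other character."""
    i = 0
    while i < len(text) and unicodedata.category(text[i]) in ("Cf", "Cc"):
        i += 1
    return text[i:]

# Hypothetical page whose first character is U+200D ZERO WIDTH JOINER
page = "\u200d<!DOCTYPE html>\n<html></html>\n"
cleaned = strip_leading_invisibles(page)
print(repr(page[0]), "->", repr(cleaned[0]))
```

Only the start of the text is touched, so joiners used legitimately inside the document body are left in place.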
I copied all my code into a new fresh file and used that file instead. It worked for me