How to represent any unicode character in Julia? - unicode

I am trying to figure out whether there is a list or any documentation of all the Unicode characters I can represent and access with \ commands. I know I can do things like \sin, but is there a comprehensive list? I searched around but didn't find anything. I was specifically looking to see whether there is an x-hat character.

As it turns out, there is a comprehensive list of all supported Unicode characters in the Unicode Input section near the bottom of the Julia manual. According to the docs:
You can also get information on how to type a symbol by entering it in the REPL help, i.e. by typing ? and then entering the symbol in the REPL
which is how I figured out how to represent x hat!

In addition to logankilpatrick's answer, if you need them in a program, you can access all the REPL completions directly from the REPL package:
julia> import REPL
julia> REPL.REPLCompletions.latex_symbols
Dict{String, String} with 2500 entries:
"\\1/8" => "⅛"
"\\bscra" => "𝓪"
"\\guilsinglright" => "›"
"\\blacktriangleright" => "▶"
⋮ => ⋮
julia> REPL.REPLCompletions.emoji_symbols
Dict{String, String} with 829 entries:
"\\:ghost:" => "👻"
"\\:metro:" => "🚇"
"\\:children_crossing:" => "🚸"
"\\:suspension_railway:" => "🚟"
⋮ => ⋮
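If you need the mapping in the other direction (character to shortcut), the same table can be inverted. A quick sketch using the latex_symbols dict shown above:

```julia
import REPL

# Invert the completion table: character => LaTeX-style shortcut.
const to_shortcut = Dict(v => k for (k, v) in REPL.REPLCompletions.latex_symbols)

to_shortcut["⅛"]  # "\\1/8", per the entry shown above
```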

Related

Noweb does not cross-reference Perl identifiers delimited on the left by #

Consider this Noweb source file named quux.nw:
\documentclass{article}
\usepackage{noweb}
\usepackage[colorlinks]{hyperref}
\begin{document}
<<quux.pl>>=
my @foo ;
my $bar ;
my %baz ;
@ %def foo bar baz
\end{document}
and compiled using the commands:
$ noweb quux.nw
$ latexmk -pdf quux.tex
The identifiers bar and baz are properly highlighted as identifiers and cross referenced in the PDF output. The identifier foo is not.
It's my understanding that Noweb has a very simple heuristic for recognizing identifiers. foo should be recognizable as an identifier because, like bar and baz, it begins with an alphanumeric, is delimited on the left by a symbol (at-sign), and is delimited on the right by a delimiter (whitespace).
I considered the possibility that the at-sign was being interpreted by Noweb as an escape and tried doubling it, but that (i) did not solve the problem, and (ii) introduced the syntax error my @@foo into quux.pl. This makes sense because, according to the fine manual, a double at-sign is only treated specially in columns 1–2.
Noweb treats @ as alphanumeric, with the rationale that it “helps LaTeX”. I did not find anything about this in the Noweb manual. It is documented only in the Noweb source file finduses.nw, line 24, in Noweb version 2.12.
Apparently, when writing your own LaTeX package, any macro you define has public scope. To write “private” macros, the trick is to temporarily reclass @ as a letter at the top of the package, incorporate an @ into the name of each “private” macro, and restore the class of @ at the bottom of the package. The macro remains public, but is impossible to call because the name gets broken up into multiple lexemes. (A user can still call such a macro by reclassing @ as a letter before the call, but if they do that, they assume the risk.)
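The trick described above looks roughly like this in LaTeX (the macro names here are made up for illustration):

```latex
% Inside a .sty file @ is already a letter; in a document you
% toggle its category code explicitly:
\makeatletter                 % reclass @ as a letter (catcode 11)
\newcommand{\my@private}{42}  % @ in the name makes it "private"
\newcommand{\public}{\my@private}
\makeatother                  % restore @ to catcode 12 (other)
% From here on, \my@private can no longer be typed as a single name.
```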
So yes, @ should be included as an alphanumeric character when the code block contains a LaTeX package.
The full list of symbols treated as alphanumeric by Noweb is:
_ ' @ #
The _ is treated as an identifier character in many programming languages, so Noweb is right to treat it as alphanumeric.
The # is treated as alphanumeric to “avoid false hits on C preprocessor directives”.
No explanation is given for treating the ' as alphanumeric.
Ideally, Noweb would support separate character class schemes for each source language. But as I understand it, Noweb has only the one global character class scheme, and no support for changing it (other than modifying the source).
Fortunately, Perl has alternate syntaxes for array identifiers that work around this limitation. Instead of @foo you can write @{foo} or even @ foo and it will work.
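A quick sketch of those alternate spellings (assuming a plain Perl script; the braced form relies on Perl treating a braced bare identifier as the variable name):

```perl
use strict;
use warnings;

my @foo = (1, 2, 3);
push @{foo}, 4;       # braced identifier: refers to the same array as @foo
print "@foo\n";       # the sigil-separated spelling @ foo also parses
```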

What does the "?" operator do in Elixir?

The Ecto source code makes use of expressions ?0, ?1, etc. You can see how they evaluate:
iex(14)> ?0
48
iex(15)> ?1
49
iex(16)> ?2
50
What does that mean though? This is very hard to search for. What does the ?<character> actually do?
From: https://elixir-lang.org/getting-started/binaries-strings-and-char-lists.html#unicode-and-code-points
In Elixir you can use a ? in front of a character literal to reveal its code point:
If you aren't familiar with code points:
Unicode organizes all of the characters in its repertoire into code charts, and each character is given a unique numerical index. This numerical index is known as a Code Point.
The ?<character> can also be used in interesting ways for pattern matching and guard clauses.
defp parse_unsigned(<<digit, rest::binary>>) when digit in ?0..?9,
do: parse_unsigned(rest, false, false, <<digit>>)
...
defp parse_unsigned(<<?., digit, rest::binary>>, false, false, acc) when digit in ?0..?9,
do: parse_unsigned(rest, true, false, <<acc::binary, ?., digit>>)
The Elixir docs on it also clarify that it is only syntax. As @sabiwara points out:
Those constructs exist only at the syntax level. quote do: ?A just returns 65 and doesn't show any ? operator
As @Everett noted in the comments, there is a helpful package called Xray that provides some handy utility functions to help understand what's happening.
For example Xray.codepoint(some_char) can do what ?<char> does but it works for variables whereas ? only works with literals. Xray.codepoints(some_string) will do the whole string.
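To illustrate the guard-clause usage, here is a minimal sketch (the module and function names are invented for the example, not from any library):

```elixir
defmodule CharCheck do
  # ?0 and ?9 are just the integers 48 and 57, so a plain
  # range membership test works in a guard.
  def digit?(c) when c in ?0..?9, do: true
  def digit?(_), do: false
end

IO.inspect(CharCheck.digit?(?7))  # true
IO.inspect(CharCheck.digit?(?a))  # false
```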

Scala - Why does dotless not apply to this case

I'm parsing some XML, and I'm chaining calls without a dot. All of these methods take no parameters (except \\, which takes one), so it should be fairly possible to chain them without a dot, right?
This is the code that does not work:
val angle = filteredContextAttributes.head \\ "contextValue" text toDouble
The error is: not found: value toDouble
However, it does work like this:
(filteredContextAttributes.head \\ "contextValue" text) toDouble
text returns only a String and does not take parameters, and I don't see any other parameters needed in \\ to cause an issue.
What am I missing? I don't want to hack around it; I want to understand what the problem is.
And also I can't use head without the dot. When removing the dot it says: Cannot resolve symbol head
It's because text is a postfix method call: the method follows the object and takes no parameters. The catch with postfix notation is that it can only appear at the end of an expression. That's why it works when you add parentheses: each expression is then bounded by the parentheses, and you get two postfix calls, one ending with text and the other ending with toDouble. In your example that's not the case, as you are trying to call a further method in the chain.
That's also the reason why you need filteredContextAttributes.head and not filteredContextAttributes head. I'm sure that (filteredContextAttributes head) will work, as the postfix notation is again at the end of the expression!
There are also prefix and infix notations in Scala, and I urge you to read about them to get the hang of when you can skip . and () (for instance, why you need () when using the map method, etc.).
To add to what @Mateusz already answered, this is because of mixing postfix notation and arity-0 suffix notation.
There's also a great write up in another answer: https://stackoverflow.com/a/5597154/125901
You can even see a warning on your shorter example:
scala> filteredContextAttributes.head \\ "contextValue" text
<console>:10: warning: postfix operator text should be enabled
by making the implicit value scala.language.postfixOps visible.
This can be achieved by adding the import clause 'import scala.language.postfixOps'
or by setting the compiler option -language:postfixOps.
See the Scala docs for value scala.language.postfixOps for a discussion
why the feature should be explicitly enabled.
Which is a pretty strong hint that this isn't the best construct style-wise. So, unless you are specifically working in a DSL, you should prefer explicit dots and parentheses, especially when mixing infix, postfix, and/or suffix notations.
For example, you can prefer doc \\ "child" over doc.\\("child"), but once you step outside the DSL (in this example, once you have your NodeSeq), prefer adding in parens.
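Concretely, the original expression can be written with explicit dots and parentheses, needing no postfixOps feature at all. This sketch assumes the scala-xml library and a stand-in value for filteredContextAttributes:

```scala
import scala.xml._

// Stand-in for the XML the question parses.
val filteredContextAttributes: NodeSeq =
  <attr><contextValue>42.5</contextValue></attr>

// Explicit dots and parentheses: unambiguous parse, no feature flags.
val angle: Double = (filteredContextAttributes.head \\ "contextValue").text.toDouble
```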

Julia: How to deal with special unicode characters

I am working with the Distributions package which uses special unicode characters for many of the variables within types. The normal distribution, for instance, uses μ and σ. If I want to edit the standard deviation, I need to somehow type:
n.σ = 5.0
Is it possible to type these values into the repl (outside of using copy-paste)? How does one create these characters with one's keyboard?
Thank you
At the REPL, use LaTeX shortcuts, e.g. type \sigma and press Tab to autocomplete. Note you need to be using Julia 0.3 or higher for this to work.
Many text editors have add-ins to do something similar, e.g. https://github.com/mvoidex/UnicodeMath for SublimeText.
In Windows 10, and on Linux under most modern desktops, you can add a Greek keyboard layout and then switch between Greek and English with Win+Space. Since many Latin/English letters are derived from ancient Greek, they have analogues: S types a sigma (σ/Σ), D a delta (δ/Δ), etc. ετψ.
Figured it out by searching for "Entering Unicode in Linux" on Google.
One can press Ctrl+Shift+U, then type the character's hexadecimal Unicode code point. For example, σ is u03c3.
It is difficult (at least for me) to remember all the code points or LaTeX shortcuts, or to search for them on the web. When working in the REPL or a Jupyter notebook, Julia provides a simple way to look them up, as mentioned here:
You can also get information on how to type a symbol by entering it in the REPL help, i.e. by typing ? and then entering the symbol in the REPL (e.g., by copy-paste from somewhere you saw the symbol).
For example, typing ? and then pasting σ at the help prompt reports that it can be entered as \sigma followed by Tab.

Using Ruby 1.9.1, how could I access this string's characters one at a time?

I don't know how else to explain this, so I'll point you at the List of Greek words with English derivatives. Look at the first column of the table. Notice there are words like ἄβαξ. Using Ruby 1.9.1, which has better encoding support than Ruby 1.8, how can I iterate over each character of such a word? For example, I'd like to get the letters in order, one at a time, like:
ἄ
β
α
ξ
I'm not really sure how to go about that, as the .size method can report a different length than the number of characters we perceive. Can you help?
Example:
#!/usr/bin/env ruby19
str = "ἄβαξ"
puts "#{str} - encoding: #{str.encoding.name} / size: #{str.size}"
str.each_char do |c|
  puts c
end
Using some Google-fu, you'll find a lot of good articles on Ruby 1.9 and character encoding.
The following seems to work in 1.9
testStr = "ἄβαξ"
testStr.each_char { |k|
  puts k
}
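Beyond each_char, Ruby 1.9+ offers a few related iteration APIs; a quick sketch:

```ruby
str = "ἄβαξ"

chars  = str.each_char.to_a       # one single-character string per character
points = str.each_codepoint.to_a  # one Integer code point per character

chars.each { |c| puts c }
puts points.inspect
```

Note that each_char yields whole characters regardless of how many bytes each occupies in UTF-8, which is exactly what the byte-oriented 1.8 API could not do.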