Unicode characters in ggplot labels

I can get ggplot to print Japanese Unicode characters in axis labels and legends, but not in the point labels drawn with geom_label(). Is this a bug?
library(extrafont)
library(ggplot2)
data_frame <- cbind.data.frame("number"=c(1:3), "kana"=c("い","ろ","は"))
ggplot(data=data_frame, aes(kana, number)) +
  geom_point() + theme_gray(base_family = "Meiryo") ## works great
ggplot(data=data_frame, aes(kana, number, label=kana)) +
  geom_point() + geom_label() + theme_gray(base_family = "Meiryo") ## no such luck

Separator for five or more digits number in Tableau

I am working on a Tableau project. We want a separator only for numbers with five or more digits.
For example:
1 as 1
12 as 12
123 as 123
1234 as 1234
12345 as 12,345
123456 as 1,23,456
Can you please help me achieve this?
I am fairly sure this cannot be done while the values are formatted as numbers. As a workaround, I have developed a method that converts the numbers to strings. Let's say you have a column col containing the numbers.
Copy the column as, say, col2 (keep the original for future use) and convert its type to string.
Create a new calculated field, say desired, using this calculation:
// Slice by position with LEFT/MID/RIGHT so that repeated digit
// patterns (e.g. 12121) are grouped correctly.
IF LEN([Col2]) <= 4 THEN
    [Col2]
ELSEIF LEN([Col2]) < 6 THEN
    LEFT([Col2], LEN([Col2]) - 3) + "," + RIGHT([Col2], 3)
ELSEIF LEN([Col2]) < 8 THEN
    LEFT([Col2], LEN([Col2]) - 5) + "," +
    MID([Col2], LEN([Col2]) - 4, 2) + "," + RIGHT([Col2], 3)
ELSE
    LEFT([Col2], LEN([Col2]) - 7) + "," +
    MID([Col2], LEN([Col2]) - 6, 2) + "," +
    MID([Col2], LEN([Col2]) - 4, 2) + "," + RIGHT([Col2], 3)
END
This calculated field works as desired for up to 9 digits.
The result is a string, so its alignment may differ from that of a number, but that is easy to adjust if it matters.
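For comparison, the same grouping rule can be sketched outside Tableau. The Python function below is only an illustration, assuming non-negative integers; the name group_digits is made up for this example and is not a Tableau function.

```python
def group_digits(n: int) -> str:
    """Return n with Indian-style separators, applied only from five digits up."""
    s = str(n)
    if len(s) <= 4:
        # Up to four digits: no separator at all.
        return s
    head, tail = s[:-3], s[-3:]  # the last three digits form one group
    groups = []
    while head:
        # The remaining digits are grouped in pairs, from the right.
        groups.insert(0, head[-2:])
        head = head[:-2]
    return ",".join(groups + [tail])


for value in (1, 1234, 12345, 123456, 12121):
    print(value, "as", group_digits(value))
```

Slicing by position, rather than searching for the trailing digits inside the string, keeps values with repeating patterns such as 12121 intact.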

Is there an uppercase letter that has two lowercase alternatives?

Are there two distinct lowercase letters ch1 and ch2 (ch1 <> ch2) such that uppercase(ch1) == uppercase(ch2)? Does such a pair actually exist in Unicode?
A follow-up question is: For any ch that's a lowercase letter, is the following expression always true?
ch == lowercase(uppercase(ch))
Did a quick test in Java:
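import static java.lang.Character.isLetter;
import static java.lang.Character.isLowerCase;

import java.util.Locale;

public class CaseCollisions {
    public static void main(String[] args) {
        // Compare every pair of lowercase letters in the Basic Multilingual Plane.
        for (char ch1 = 0; ch1 < 65534; ch1++) {
            if (!isLetter(ch1) || !isLowerCase(ch1)) {
                continue;
            }
            String s1 = "" + ch1;
            for (char ch2 = (char) (ch1 + 1); ch2 < 65535; ch2++) {
                if (!isLetter(ch2) || !isLowerCase(ch2)) {
                    continue;
                }
                String s2 = "" + ch2;
                // Report pairs whose uppercase forms are identical.
                if (s1.toUpperCase(Locale.US).equals(s2.toUpperCase(Locale.US))) {
                    System.out.println("ch1=" + ch1 + " (" + (int) ch1 + "), ch2=" + ch2 + " (" + (int) ch2 + ")");
                }
            }
        }
    }
}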
It prints:
ch1=i (105), ch2=ı (305)
ch1=s (115), ch2=ſ (383)
ch1=µ (181), ch2=μ (956)
ch1=ΐ (912), ch2=ΐ (8147)
ch1=ΰ (944), ch2=ΰ (8163)
ch1=β (946), ch2=ϐ (976)
ch1=ε (949), ch2=ϵ (1013)
ch1=θ (952), ch2=ϑ (977)
ch1=ι (953), ch2=ι (8126)
ch1=κ (954), ch2=ϰ (1008)
ch1=π (960), ch2=ϖ (982)
ch1=ρ (961), ch2=ϱ (1009)
ch1=ς (962), ch2=σ (963)
ch1=φ (966), ch2=ϕ (981)
ch1=в (1074), ch2=ᲀ (7296)
ch1=д (1076), ch2=ᲁ (7297)
ch1=о (1086), ch2=ᲂ (7298)
ch1=с (1089), ch2=ᲃ (7299)
ch1=т (1090), ch2=ᲄ (7300)
ch1=т (1090), ch2=ᲅ (7301)
ch1=ъ (1098), ch2=ᲆ (7302)
ch1=ѣ (1123), ch2=ᲇ (7303)
ch1=ᲄ (7300), ch2=ᲅ (7301)
ch1=ᲈ (7304), ch2=ꙋ (42571)
ch1=ṡ (7777), ch2=ẛ (7835)
ch1=ſt (64261), ch2=st (64262)
So the answer is: yes, there are distinct characters that have the same upper-case representation.
Upper, title, and lower case mappings are locale-specific, so different locales may give different results (e.g. French uppercase letters may lose their accents).
Unicode also defines a standard, locale-independent way to convert to upper or lower case, with exceptions for Turkic languages (marked with T in the CaseFolding.txt Unicode data file) and further special cases for Turkish, Greek, and Lithuanian in SpecialCasing.txt.
In most cases there is a unique way to convert lower case to upper case (and back), but there are exceptions such as KELVIN SIGN (U+212A), which maps to the same lowercase k as the letter K, and other signs that share glyphs with ordinary letters (these go away if you remove compatibility characters with a normalization).
One case is the Greek letter sigma: there is only one uppercase form, but two lowercase forms, and which one is used depends on whether it is at the end of a word.
You will find more information in the Unicode report on the character database, http://www.unicode.org/reports/tr44/#Casemapping, and in the Unicode Standard (linked from that report, as are the two files named above).
Note: some case conversions increase the number of code points (e.g. ß uppercases to SS), so when converting back, one should check for the longest match.
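The follow-up question, whether ch == lowercase(uppercase(ch)) always holds, can be checked with a few of the characters mentioned above. This is a small sketch in Python rather than Java; it relies on Python's built-in str.upper() and str.lower(), which use Unicode's locale-independent mappings, so results can differ from a locale-aware API. The helper name round_trip is made up for the example.

```python
def round_trip(ch: str) -> str:
    """Uppercase a string, then lowercase the result."""
    return ch.upper().lower()


# Long s, dotless i, final sigma, micro sign, sharp s.
samples = ["\u017f", "\u0131", "\u03c2", "\u00b5", "\u00df"]

for ch in samples:
    back = round_trip(ch)
    print(f"U+{ord(ch):04X} {ch!r}: upper={ch.upper()!r}, back={back!r}, unchanged={back == ch}")
```

None of these five characters survives the round trip, so the expression is not always true. The sharp s also shows the length change noted above: its uppercase form is the two-letter string SS.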

Why is my code returning 0? And not the numbers of Upper and Lower characters?

I'm trying to write code that counts how many uppercase and lowercase characters are in a string. Here's my code.
I've tried converting the counts to strings, but it's still not working.
def up_low(string):
    result1 = 0
    result2 = 0
    for x in string:
        if x == x.upper():
            result1 + 1
        elif x == x.lower():
            result2 + 1
    print('You have ' + str(result1) + ' upper characters and ' +
          str(result2) + ' lower characters!')

up_low('Hello Mr. Rogers, how are you this fine Tuesday?')
I expect my outcome to calculate the upper and lower characters. Right now I'm getting "You have 0 upper characters and 0 lower characters!".
It's not adding up to result1 and result2.
It seems your error is in the assignment: a '=' symbol is missing (e.g. result1 += 1).
    for x in string:
        if x == x.upper():
            result1 += 1
        elif x == x.lower():
            result2 += 1
The problem is in the lines result1 + 1 and result2 + 1. Each is an expression, not an assignment. In other words, the incremented value is computed and then goes nowhere.
The solution is to assign the result back to the counter, e.g. result1 += 1 or result1 = result1 + 1.
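Putting the fix together, a complete version might look like the sketch below. It also swaps the comparison x == x.upper() for str.isupper() and str.islower(), which is an addition on my part and not something the answers above suggest: spaces and punctuation are equal to their own uppercase form, so the original test would count them as upper characters. The name count_case is made up for the example.

```python
def count_case(text: str) -> tuple[int, int]:
    """Return (uppercase_count, lowercase_count) for the letters in text."""
    upper = 0
    lower = 0
    for ch in text:
        if ch.isupper():
            upper += 1  # += assigns the incremented value back
        elif ch.islower():
            lower += 1
    return upper, lower


upper, lower = count_case('Hello Mr. Rogers, how are you this fine Tuesday?')
print('You have ' + str(upper) + ' upper characters and ' +
      str(lower) + ' lower characters!')
```

For the sample sentence this counts 4 upper and 33 lower characters.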

How do you add spaces in front of 'y' integers in a nested loop?

for x in range(1, 4):
    print(x)
    for y in range(5, 10):
        print(y)
I tried adding " " + in front of y within print. Essentially,
print(" " + y)
I tried creating a string w that equals " " to add in front of y. Essentially,
w = " "
print(w + y)
I'd like the output to look like:
1
 5
 6
2
 5
 6
3
 5
 6
I'm exploring .join() at the moment to see if this method can provide a solution.
Thank you.
for x in range(1, 4):
    print(x)
    for y in range(5, 10):
        w = " "
        print(w + str(y))  # or: print(" " + str(y))
output:
1
 5
 6
2
 5
 6
Python string formatting is easy and clean in this case. You can see https://pyformat.info/ for a quick introduction.
Example using the str.format method:
for x in range(1, 4):
    print("{:d}".format(x))
    for y in range(5, 10):
        print("{:>3d}".format(y))
Short explanation:
{:d}: format the argument as a decimal integer
{:>3d}: align right, pad the string to at least 3 characters, and format the argument as a decimal integer
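The same format specification also works inside an f-string (Python 3.6 or later). The sketch below collects the lines in a list before printing so the padding is easy to inspect; the function name nested_lines and its width parameter are made up for this example.

```python
def nested_lines(outer, inner, width=3):
    """Build the output lines, right-aligning each inner value to `width` characters."""
    lines = []
    for x in outer:
        lines.append(f"{x:d}")
        for y in inner:
            # ">" aligns right, `width` is the minimum field width, "d" means decimal integer
            lines.append(f"{y:>{width}d}")
    return lines


for line in nested_lines(range(1, 4), range(5, 10)):
    print(line)
```

A wider indent is then just a matter of passing a larger width.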

Clarification on bias of a perceptron

Isn't it true that, if no bias is present, a line passing through the origin should be able to linearly separate the two data sets?
But the most popular answer to this question shows:
y
^
| - + \\ +
| - +\\ + +
| - - \\ +
| - - + \\ +
---------------------> x
stuck like this
I am confused about it. Do you mean the origins in figure above are somewhere in middle of x-axis and y-axis? Can somebody please help me and clarify this?
Alright, so perhaps the original ASCII graph was not 100% accurate! Let me try to depict this again:
              y                                 y
              ^                                 ^
   -  +  \\   |   +                  -\\+       |   +
    -    +\\  |  +   +               - \\ +     |  +   +
  -   -    \\ |    +               -  - \\      |    +
 -   -   +  \\|   +               -   -  \\+    |   +
------------------------> x       ---------------------------> x
   -    -     |\\   +              -   -  \\    |     +
  -   -    +  | \\   +             -   -   \\ + |      +
 -   -   -    |  \\  +  +          -   -   -\\  |  +  +
--  -   -     |  +\\  ++          --  -   -  \\ |  +  ++
      stuck like this                 needs to get like this
           y = ax                          y = ax + b
     (w0*x + w1*y = 0)               (w0*x + w1*y + w2*1 = 0)
I think your intuition is correct on this issue:
Do you mean the origins in figure above are somewhere in middle of x-axis and y-axis?
In my reading of the graph, yes.
I think the ASCII graph, as cool as it is, is a bit confusing here, because it shows a line that is not traveling through what would normally be considered as the origin. Normally one would think of the intersection of the x- and y-axis lines as the origin, but in this diagram the separating line is clearly not passing through said intersection. As you've noted, a perceptron without a bias term can only define a separating line that passes through the origin, so the ASCII graph must have some sort of odd origin that is floating out in space somewhere.
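The point about the bias term can also be checked numerically. The sketch below is a minimal perceptron in plain Python, not the code from the linked answer; the function names and the two-point toy data set are made up for illustration. Both points lie on the same ray from the origin, so w·(3, 3) always has the same sign as w·(1, 1), and no line through the origin can put them on opposite sides.

```python
def train_perceptron(points, labels, use_bias, epochs=100):
    """Classic perceptron updates; the bias stays at zero unless use_bias is True."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        for (x, y), target in zip(points, labels):
            score = w[0] * x + w[1] * y + b
            if target * score <= 0:  # misclassified, or exactly on the boundary
                w[0] += target * x
                w[1] += target * y
                if use_bias:
                    b += target
    return w, b


def accuracy(points, labels, w, b):
    """Fraction of points whose predicted sign matches the label."""
    correct = 0
    for (x, y), target in zip(points, labels):
        predicted = 1 if w[0] * x + w[1] * y + b > 0 else -1
        correct += predicted == target
    return correct / len(points)


# Toy data (an assumption for this example): two points on the line y = x.
points = [(1.0, 1.0), (3.0, 3.0)]
labels = [-1, 1]

for use_bias in (False, True):
    w, b = train_perceptron(points, labels, use_bias)
    print("bias" if use_bias else "no bias", w, b, accuracy(points, labels, w, b))
```

With the bias enabled, the updates settle on a boundary that passes between the two points. Without it, the weights keep oscillating and at least one point stays misclassified, which is the "stuck like this" situation in the diagram.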
Also, note that a standard perceptron always defines a linear separator, but a linear separator is not guaranteed to be able to partition a given dataset correctly -- that depends completely on the dataset. There are also variants of the perceptron that use the "kernel trick" to define nonlinear separators, but that's a whole different story. :)
Hope that helps.