Alphanumeric identifiers and '$' character - scala

From Programming in Scala section 6.10 (Page 151):
Identifiers in user programs should not contain the '$' character, even though such code will compile; if they do, this might lead to name clashes with identifiers generated by the Scala compiler.
I am sure there's a reason for this, but why not prevent use of the '$' character in alphanumeric identifiers?

Some of the identifiers generated internally by the Scala compiler contain '$' characters. If you create new identifiers with '$' characters, you might clash with the internally generated identifiers, and chaos ensues. OTOH, you sometimes need '$' characters, either on those (now very rare) occasions when access to the internally generated Scala identifiers is necessary, or because someone used such an identifier in Java code you wish to call (where it's legal, if also discouraged).
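To make the clash concrete, here is a minimal sketch of the operator-name encoding, written in Python purely for illustration. The table follows the spellings used by scala.reflect.NameTransformer as I understand them; the helper names encode and clashes are hypothetical, and the sketch ignores the compiler's other uses of '$' (anonymous classes, companion objects and so on).

```python
# Simplified model of how operator characters are spelled in JVM-level names.
# The table mirrors scala.reflect.NameTransformer; treat it as an illustration,
# not a complete reimplementation of what the compiler does.
OP_NAMES = {
    "~": "$tilde", "=": "$eq", "<": "$less", ">": "$greater",
    "!": "$bang", "#": "$hash", "%": "$percent", "^": "$up",
    "&": "$amp", "|": "$bar", "*": "$times", "/": "$div",
    "+": "$plus", "-": "$minus", ":": "$colon", "\\": "$bslash",
    "?": "$qmark", "@": "$at",
}

def encode(name: str) -> str:
    """Replace each operator character with its '$'-prefixed spelling."""
    return "".join(OP_NAMES.get(ch, ch) for ch in name)

def clashes(names):
    """Group source-level names that end up with the same encoded name."""
    by_encoded = {}
    for n in names:
        by_encoded.setdefault(encode(n), []).append(n)
    return {enc: srcs for enc, srcs in by_encoded.items() if len(srcs) > 1}

# A user-written "$plus" lands on the same encoded name as the operator "+".
print(clashes(["+", "$plus", "size", "::"]))  # {'$plus': ['+', '$plus']}
```

Names without '$' can never collide with an encoded operator name in this model, which is the practical argument for keeping '$' out of user identifiers.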

Related

Are periods in object names bad practice?

For example, a constraint for a default value of 0 could be named DF__tablename.columnname.
Although my search for this being bad practice doesn't yield results, in the numerous constraints examples I've seen on SO and many other sites, I never spotted a period.
Using period in an object name is bad practice.
Don't use dot character in an identifier. Yes it can be done but the drawbacks outweigh any benefits.
tl;dr
Special characters, such as a dot, are not allowed in regular identifiers. If an identifier does not follow the rules for regular identifiers, then references to it must be enclosed in square brackets (or ANSI double quotes).
https://learn.microsoft.com/en-us/sql/relational-databases/databases/database-identifiers?view=sql-server-2017
As for the period (dot character), it is not allowed in a regular identifier, but it can be used in an identifier enclosed in square brackets.
The dot character is even more of a special-ish character in SQL; it's used to separate an identifier from a preceding qualifier.
SELECT mytable.mycolumn FROM mytable
We could also write that as
SELECT [mytable].[mycolumn] FROM mytable
We could also write
SELECT [mytable.mycolumn] FROM mytable
but that means something very different. With that, we aren't referencing a column named mycolumn, we are now referencing an identifier that contains a dot character.
SQL Server will deal with this just fine.
But if we do this, and start using the dot character in our identifiers, we will be causing confusion and frustration to future readers. Any benefit we would gain by using dot characters in identifiers is going to be far outweighed by the downside for others.
Similarly, this is why we don't create tables named WHERE (1=1) OR, or columns named SUBSTR(foo.bar,1,10): to avoid monstrosities like
SELECT [SUBSTR(foo.bar,1,10)] FROM [WHERE (1=1) OR]
Which may be valid SQL, but it will cause future readers to become very upset, and cause them to curse us, our descendants and loved ones. Don't make them do that. For the love of all that is good and beautiful in this world, don't use dot characters in identifiers.
It is perfectly valid to have periods in the object names. However, this requires you to use square brackets around the object name when referring to it. In case you forget these square brackets you will get some error messages that can be less intuitive to the inexperienced developer. For this reason I recommend not to use periods in the object names. I would also guess this is the main reason you don't often see examples of periods in object names on the internet.
In your example, you could use another underscore instead of the period, like this: DF__tablename_columnname
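The difference between a qualified reference and a bracketed name containing a dot can be tried out quickly. The sketch below uses Python's built-in sqlite3 module as a stand-in, on the assumption that SQLite's acceptance of square-bracket quoting is close enough to SQL Server for this one point; the table and column names are the ones from the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# One ordinary column and one column whose name contains a dot.
conn.execute("CREATE TABLE mytable (mycolumn INTEGER, [mytable.mycolumn] INTEGER)")
conn.execute("INSERT INTO mytable VALUES (1, 2)")

# Qualifier.column: the dot separates the table name from the column name.
qualified = conn.execute("SELECT mytable.mycolumn FROM mytable").fetchone()[0]
# Bracketed: the dot is part of one single identifier.
bracketed = conn.execute("SELECT [mytable.mycolumn] FROM mytable").fetchone()[0]

print(qualified, bracketed)  # 1 2
```

Two queries that look nearly identical return different columns, which is exactly the kind of confusion the answers warn about.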

How should Fancytree node keys be escaped to avoid special characters?

I discovered that the '+' character in a custom node key is converted silently to a space character. I obviously need to escape these special characters, but I could not find documentation about which characters are not allowed in keys.
Thanks!
There should be no conversion, except for casting non-strings to a string.
When the generateIds option is used, the key is added as an id="KEY" attribute to the generated HTML element, so the standard restrictions apply.
The key is also used internally as a JavaScript hash key.
I'd recommend plain ASCII keys, but '{', '.', '~', ... should be no problem as well.
As far as I know, + is interpreted as space by browsers, when part of a URL, so maybe you see the conversion there.
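The '+' to space conversion is easy to reproduce outside the browser. This sketch uses Python's urllib.parse only to illustrate form-style URL decoding; it says nothing about Fancytree itself, and the key "a+b" is a made-up example.

```python
from urllib.parse import quote, unquote, unquote_plus

key = "a+b"  # hypothetical node key

# Form-style decoding (as used for query strings) turns '+' into a space.
print(unquote_plus(key))   # a b
# Plain percent-decoding leaves '+' alone.
print(unquote(key))        # a+b

# Percent-encoding the key before it goes into a URL keeps the '+' intact.
safe = quote(key, safe="")
print(safe)                # a%2Bb
print(unquote_plus(safe))  # a+b
```

If the key travels through a URL, percent-encoding it first (encodeURIComponent is the JavaScript counterpart) avoids the silent conversion.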

Why Kotlin does not allow slash in identifiers

Unicode characters are allowed in identifiers enclosed in backticks
val `💾id` = "1"
But slash is not allowed
val `application/json` = "application/json"
In Scala we can have such names.
This is a JVM limitation. From the specification section 4.2.2:
Names of methods, fields, local variables, and formal parameters are stored as unqualified names. An unqualified name must contain at least one Unicode code point and must not contain any of the ASCII characters . ; [ / (that is, period or semicolon or left square bracket or forward slash).
In Scala names are mangled to avoid this limitation, in Kotlin they are not.
Kotlin's identifiers are used as-is, without any mangling, in the names of JVM classes and methods generated from the Kotlin code. The slash has a special meaning in JVM names (it separates packages and class names). Therefore, Kotlin doesn't allow using it in an identifier.
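The rule quoted from the JVM specification is simple enough to check mechanically. The following Python sketch implements only that quoted sentence (non-empty, none of . ; [ /); the function name is hypothetical, and the string application$divjson is my assumption of what Scala's name encoding would produce for the backticked name.

```python
# The four ASCII characters the quoted rule forbids in unqualified names.
FORBIDDEN = set(".;[/")

def is_valid_unqualified_name(name: str) -> bool:
    """True if the name is non-empty and contains none of . ; [ /"""
    return len(name) > 0 and not (FORBIDDEN & set(name))

candidates = [
    "💾id",                  # Unicode is fine
    "application/json",      # contains '/', rejected
    "application$divjson",   # assumed mangled form, accepted
    "",                      # empty, rejected
]
for candidate in candidates:
    print(repr(candidate), is_valid_unqualified_name(candidate))
```

The mangled spelling passes because '$' is not among the forbidden characters, which is why a mangling compiler can accept the slash at source level while one that emits names as-is cannot.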

Mathematical formula terms in Scala

Our application relies on lots of equations, which, to correspond with the standard scientific names, use variable names like mu_k (if the standard is $\mu_k$). (We could debate whether scientists should switch to CS-style descriptive variable names, but often the terms don't really describe anything, they are just part of equations, and, moreover, we need our code to match the known literature.)
In C it is easy to name vars this way: int mu_k. We are considering porting our code to Scala, but I know that val mu_k is discouraged in Scala, because underscores have special meanings.
If we use underscores only in the middle of the var name (e.g. mu_k) and not beginning or end (e.g. _x or x_), will this present a problem in Scala?
What is the recommended naming convention for Scala in this case?
You are right that underscores are discouraged in variable names in Scala, which implies that they are not forbidden. In my opinion, a convention should be followed wherever sensible.
In the case of mathematical formulae, I disagree that the Greek letters don't convey a meaning; the meaning is not necessarily intuitively descriptive for non-mathematicians, but as you say, the reference to the usage in a paper may be meaningful and important. Therefore, sticking with the underscore won't hurt, although I would probably prefer a more Scala-style way as muX when possible and meaningful. If you want a perfect answer, you might need to perform a usability test with your developers. In the specific example, I personally find mu_x more readable than muX, but that might differ among individuals.
I don't think the Scala compiler has a problem with underscores in the examples you described. Presumably, even leading and trailing underscores are fine, but should indeed be avoided strictly because they have a special meaning: http://docs.scala-lang.org/style/naming-conventions.html#methods.
Underscores are not special in any way in identifiers. There are a lot of special meanings for the underscore in Scala, but not in identifiers. (There is a special rule in identifiers that if you want to mix alphanumeric characters and operator characters in the same identifier, they have to be separated by an underscore, e.g. foo? is not a legal identifier, but foo_? is.)
So, there is no problem using an identifier with an underscore in it.
It is generally preferred to use camelCase and PascalCase for alphanumeric identifiers, and not mix alphanumeric and operator characters in the same identifier (i.e. use maxBy instead of max_by and use isFoo instead of foo_?) but that's just a coding convention whose purpose is to reduce the number of "unspecial" underscores, so that you can quickly scan for the "special" ones.
But in your case, you are using special naming conventions anyway, so you don't need to adhere to the community naming conventions as strictly.
However, I personally would actually prefer the name µ_k over mu_k.
That's as far as it goes with Scala, unfortunately. The Fortress programming language by Sun/Oracle did allow boldface, overstrike, superscripts and subscripts in identifier names, so something like µₖ would have been possible as a legal identifier, but sadly, Fortress was abandoned a couple of years ago.
I'm not saying this is the correct way, and I would rather discourage doing it myself, but you can use full string literals (in backquotes) as identifiers:
From: http://www.scala-lang.org/files/archive/spec/2.11/01-lexical-syntax.html
id ::= plainid
     | ‘`’ stringLiteral ‘`’
Finally, an identifier may also be formed by an arbitrary string
between back-quotes (host systems may impose some restrictions on
which strings are legal for identifiers). The identifier then is
composed of all characters excluding the backquotes themselves.
So this is valid:
val `mu k` = 1.0

Using # on Variable Names

Googling I've found this DB2 Function declaration:
CREATE FUNCTION QGPL.SPLIT (
#Data VARCHAR(32000),
#Delimiter VARCHAR(5)
)
What does the # symbol before the variable name mean?
Regards,
Pedro
The # character is simply the first character of the SQL identifier (variable name) that names each parameter of the User Defined Function (UDF). Here is the declaration slightly reformatted (at first glance I thought the revision might make the symbols look more conspicuously like part of the name, though now I think probably not):
CREATE FUNCTION QGPL.SPLIT
( #Data VARCHAR(32000)
, #Delimiter VARCHAR(5)
) returns ...
Put simply, the use of the # character in an identifier is highly discouraged. Such variant characters are supported in standard object naming, but they can cause great pain and difficulty, including some problems that are insurmountable:
http://www.ibm.com/support/knowledgecenter/api/content/ssw_ibm_i_71/db2/rbafzch2iden.htm
Identifiers
An identifier is a token used to form a name. An identifier in an SQL statement is an SQL identifier, a system identifier, or a host identifier.
Note: $, @, #, and all other variant characters should not be used in identifiers because the code points used to represent them vary depending on the CCSID of the string in which they are contained. If they are used, unpredictable results may occur. [...]
[Edit-addendum 17May2015]
http://www.ibm.com/support/knowledgecenter/api/content/nl/en-us/SSEPGG_10.5.0/com.ibm.db2.luw.admin.dbobj.doc/doc/c0004625.html
Naming rules in a multiple national language environment
The basic character set that can be used in database names consists of the single-byte uppercase and lowercase Latin letters (A…Z, a…z), the Arabic numerals (0…9) and the underscore character (_).
This list is augmented with three special characters (#, @, and $) to provide compatibility with host database products. Use special characters #, @, and $ with care in a multiple national language environment because they are not included in the multiple national language host (EBCDIC) invariant character set. Characters from the extended character set can also be used, depending on the code page that is being used. If you are using the database in a multiple code page environment, you must ensure that all code pages support any elements from the extended character set you plan to use.
[...]
[/Edit-addendum 17May2015]
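The point about code points varying by CCSID can be illustrated with the EBCDIC codecs that ship with Python. This is only a sketch: the four code pages below are an arbitrary selection of mine, not the set of CCSIDs a given DB2 installation uses, and the helper names are hypothetical.

```python
# A few EBCDIC code pages available as Python standard-library codecs.
CODE_PAGES = ["cp037", "cp273", "cp500", "cp1026"]

def code_points(ch: str) -> dict:
    """Map each code page to the byte (as hex) that represents ch in it."""
    return {cp: ch.encode(cp).hex() for cp in CODE_PAGES}

def varies(ch: str) -> bool:
    """True if ch is not encoded as the same byte in every code page."""
    return len(set(code_points(ch).values())) > 1

# Compare the special characters with a letter and the underscore.
for ch in "$@#A_":
    status = "variant" if varies(ch) else "invariant"
    print(ch, code_points(ch), status)
```

A character reported as variant is stored as different bytes depending on the code page, so an identifier containing it may not match across systems, whereas the basic letters stay on the same byte everywhere.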