Cool class and method names wrapped in ``: class `This is a cool class` {}? - scala

I just found some Scala code that has a strange class name:
class `This is a cool class` {}
and method name:
def `cool method` = {}
We can use a sentence for a class or method name!
It's very cool and useful for unit-testing:
class UserTest {
  // Explicit ": Unit =" replaces the deprecated procedure syntax
  def `user can be saved to db`: Unit = {
    // testing
  }
}
But why can we do this? How should I understand it?

This feature exists for the sake of interoperability. If Scala has a reserved word (with, for example), then you can still refer to code from other languages which use it as a method or variable or whatever, by using backticks.
Since there was no reason to forbid nearly arbitrary strings, you can use nearly arbitrary strings.

As Rex Kerr answered, this feature is for interoperability. For example,
to call the Java method
Thread.yield()
you need to write
Thread.`yield`()
since yield is a keyword in Scala.
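A minimal sketch of the same idea, assuming nothing beyond the standard library. The names KeywordInterop and demo are made up for illustration; `type` and `match` are used as identifiers only to show that reserved words become legal once backquoted.

```scala
object KeywordInterop {
  // `type` is a reserved word in Scala, so it needs backticks to be used as a name.
  val `type`: String = "admin"

  // `match` is reserved too; backticks make it usable as a method name.
  def `match`(a: String, b: String): Boolean = a == b

  def demo(): Boolean = {
    // Calls java.lang.Thread.yield(); without backticks this line would not parse.
    Thread.`yield`()
    `match`(`type`, "admin")
  }
}
```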

The Scala Language Specification:
There are three ways to form an identifier. First, an identifier can
start with a letter which can be followed by an arbitrary sequence of
letters and digits. This may be followed by underscore ‘_’ characters
and another string composed of either letters and digits or of
operator characters. Second, an identifier can start with an operator
character followed by an arbitrary sequence of operator characters.
The preceding two forms are called plain identifiers. Finally, an
identifier may also be formed by an arbitrary string between
back-quotes (host systems may impose some restrictions on which
strings are legal for identifiers). The identifier then is composed of
all characters excluding the backquotes themselves.
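To make the three forms from the specification concrete, here is a small sketch. All names in it (IdentifierForms, count_+, +++, `total count`) are hypothetical and chosen only to show one identifier of each kind.

```scala
object IdentifierForms {
  // Form 1: letters/digits, then '_' followed by operator characters.
  // The space before '=' matters: "count_+=" would be lexed as a single identifier.
  val count_+ = 1

  // Form 2: an identifier made only of operator characters.
  def +++(a: Int, b: Int): Int = a + b

  // Form 3: an arbitrary string between backquotes.
  val `total count` = 3
}
```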

Strings wrapped in ` are valid identifiers in Scala, not only for class and method names but for functions and variables, too.

To me it is just that the parser and the compiler were built in a way that enables that, so the Scala team implemented it.
I think that it can be cool for a coder to be able to give real names to functions instead of getThisIncredibleItem or get_this_other_item.
Thanks for your question, which taught me something new about Scala!

Related

Scala style guideline for underscore in identifiers

Coming from many other languages, I have accepted that underscores have as much freedom as letters in an identifier, hence _v and v_. I have also accepted that trailing underscores are recommended to avoid clashes with reserved keywords (class_, case_).
val abc_=0
<console>:1: error: '=' expected but integer literal found.
val abc_=0
Since underscores are an important part of Scala's type system, what is the recommended way to use them in identifiers so that both the parser and humans are happy? What are all the possible ambiguities that identifiers with underscores bring?
Leading whitespace seems to add to the confusion: _class instead of class_.
Related questions:
What are all the uses of an underscore in Scala?
Scala underscores in names
Trailing underscores are a bad idea because things like x_+ are valid variable names on their own. Don't use trailing underscores at all.
Leading underscores are less bad of an idea, but it's still hard to visually parse things like _myfunc _. There is something of a convention to make private members that hold constructor arguments of the same name start with _: class X(x: Int) { private var _x = x }. My recommendation is don't do it. You're asking for confusion. Use myX or theX or xLocal or xi or something for your internal variable. Still, if you do go with _x, you'll have good company; people will tend to know what you mean.
Underscores within a name are not widely used, since camel case is the standard. The exception that I make is that I use underscores within implicit defs that are not expected to be used by hand, and instead state why the conversion is taking place: tuple2_can_expand might add an expand method to convert a Tuple2 into a Tuple3, for example.
There is only one place you need underscores in identifiers: between alphanumeric characters and operator characters. In fact, that's just what happens in your case: the parser thinks you are declaring val abc_= and finds no = after it! The most common use is for "setter" methods:
def prop: String // or some other type
def prop_=(v: String)
I've also seen predicate_? instead of more Java-like isPredicate.
Identifiers of the form keyword_ aren't often used, but if you do use them, don't skimp on the whitespace. Write, e.g., val abc_ = 0. But for that matter, val abc = 0 is more readable than val abc=0 as well, so you should have whitespace there anyway. As Rex Kerr says, _privateVariable is acceptable, but not recommended practice.
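Here is a rough sketch of the setter convention described above. The Account class, its balance property, and the non-negative check are all invented for the example; only the prop / prop_= pairing comes from the answer.

```scala
class Account {
  // Backing field, using the leading-underscore convention mentioned earlier.
  private var _balance: Int = 0

  // Getter.
  def balance: Int = _balance

  // Setter: the '_' joins the alphanumeric part "balance" to the operator character '='.
  def balance_=(v: Int): Unit = {
    require(v >= 0, "balance must be non-negative")
    _balance = v
  }
}

object AccountDemo {
  def run(): Int = {
    val acct = new Account
    acct.balance = 42 // the compiler rewrites this to acct.balance_=(42)
    acct.balance
  }
}
```

The assignment syntax only works when both the getter and the setter are defined.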

Why don't scala collections have any human-readable methods like .append, .push, etc

Scala collections have a bunch of readable and almost readable operators like :+ and +:, but why aren't there any human readable synonyms like append?
Mutable buffers in Scala (such as ArrayBuffer and ListBuffer) have the BufferLike trait, which defines an append method.
Immutable collections do not have the BufferLike trait and hence only define the other methods, which do not change the collection in place but generate a new one.
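A small sketch of that difference, assuming only the standard library. AppendDemo and its two methods are hypothetical names; a single-element append is used because that call should compile on both older and newer collection libraries.

```scala
import scala.collection.mutable.ArrayBuffer

object AppendDemo {
  // Mutable buffer: append changes the buffer in place.
  def mutableAppend(): List[Int] = {
    val buf = ArrayBuffer(1, 2, 3)
    buf.append(4)
    buf.toList
  }

  // Immutable list: :+ builds a new list and leaves the original untouched.
  def immutableAppend(): (List[Int], List[Int]) = {
    val xs = List(1, 2, 3)
    val ys = xs :+ 4
    (xs, ys)
  }
}
```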
Symbolic method names allow the combination with the assignment operation =.
For instance, if you have a method ++ which creates a new collection, you can automatically use ++= to assign the new collection to some variable:
var array = Array(1,2,3)
array ++= Array(4,5,6)
// array is now Array(1,2,3,4,5,6)
This is not possible without symbolic method names.
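The same rewrite can be seen with an immutable Vector; this is only a sketch, and AssignOpDemo is a made-up name.

```scala
object AssignOpDemo {
  def run(): Vector[Int] = {
    var xs = Vector(1, 2, 3)
    // Vector has no ++= method, so the compiler treats this as xs = xs ++ Vector(4, 5, 6).
    xs ++= Vector(4, 5, 6)
    xs
  }
}
```

Note that xs has to be a var here, because the variable is reassigned to the newly built collection.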
In fact, they often have human-readable synonyms:
foldLeft is equivalent to /:
foldRight is equivalent to :\
The remaining ones are addition operators, which are quite human readable as they are:
++ is equivalent to Java's addAll
:+ is append
+: is prepend
The position of the colon indicates the receiver instance: the collection sits on the colon side of the operator.
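As an illustration of the operators listed above, here is a short sketch on a List. OperatorNames and the val names are hypothetical; the explicit xs.+:(1) call is included only to show which operand is the receiver.

```scala
object OperatorNames {
  val xs = List(2, 3)

  val appended  = xs :+ 4               // collection on the left, element appended
  val prepended = 1 +: xs               // collection on the right, element prepended
  val joined    = xs ++ List(4, 5)      // concatenation, similar in spirit to addAll
  val total     = xs.foldLeft(0)(_ + _) // named alternative to the symbolic fold

  // Operators ending in ':' are invoked on their right operand, so this equals 1 +: xs.
  val prependedExplicit = xs.+:(1)
}
```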
Finally, some weird operators are legacies of other functional programming languages, such as list construction (SML) or actor messaging (Erlang).
Is it any different than any other language?
Let's take Java. What's the human readable version of +, -, * and / on int? Or, let's take String: what's the human readable version of +? Note that concat is not the same thing -- it doesn't accept non-String parameters.
Perhaps you are bothered by it because in Java -- unlike, say, C++ -- things use either exclusively non-alphabetic operators or exclusively alphabetic ones, with the exception of String's +.
The Scala standard library does not set out to be Java friendly. Instead, adapters are provided to convert between Java and Scala collections.
Attempting to provide a Java friendly API would not only constrain the choice of identifiers (or mandate that aliases should be provided), but also limit the way that generics and function types were used. Substantially more testing would be required to validate the design.
On the same topic, I remember some debate as to whether the 2.8 collections should implement java.util.Iterable.
http://scala-programming-language.1934581.n4.nabble.com/How-to-set-the-scale-for-scala-BigDecimal-s-method-td1948885.html
http://www.scala-lang.org/node/2177

Lucene.Net Underscores causing token split

I've scripted an MS SQL Server database's tables, views and stored procedures into a directory structure that I am then indexing with Lucene.net. Most of my table, view and procedure names contain underscores.
I use the StandardAnalyzer. If I query for a table named tIr_InvoiceBtnWtn01, for example, I receive hits for tIr and for InvoiceBtnWtn01, rather than for just tIr_InvoiceBtnWtn01.
I think the issue is the tokenizer is splitting on _ (underscore) since it is punctuation.
Is there a (simple) way to remove underscores from the punctuation list or is there another analyzer that I should be using for sql and programming languages?
Yes, the StandardAnalyzer splits on underscore. WhitespaceAnalyzer does not. Note that you can use a PerFieldAnalyzerWrapper to use different analyzers for each field - you might want to keep some of the standard analyzer's functionality for everything except table/column name.
WhitespaceAnalyzer only does whitespace splitting, though. It won't lowercase your tokens, for example. So you might want to make your own analyzer which combines WhitespaceTokenizer and LowerCaseFilter, or look into LowerCaseTokenizer.
EDIT: Simple custom analyzer (in C#, but you can translate it to Java pretty easily):
// Chains together standard tokenizer, standard filter, and lowercase filter
class MyAnalyzer : Analyzer
{
    public override TokenStream TokenStream(string fieldName, System.IO.TextReader reader)
    {
        StandardTokenizer baseTokenizer = new StandardTokenizer(Lucene.Net.Util.Version.LUCENE_29, reader);
        StandardFilter standardFilter = new StandardFilter(baseTokenizer);
        LowerCaseFilter lcFilter = new LowerCaseFilter(standardFilter);
        return lcFilter;
    }
}
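To see why the choice of tokenizer matters for names with underscores, here is a plain Scala sketch. It does not use the Lucene API at all: punctuationSplit and whitespaceSplit are hypothetical helpers that only imitate "split on punctuation" versus "split on whitespace, then lowercase".

```scala
object TokenizeSketch {
  // Splits on any run of characters that is not an ASCII letter or digit,
  // so '_' acts as a separator. Tokens are lowercased.
  def punctuationSplit(text: String): List[String] =
    text.split("[^A-Za-z0-9]+").filter(_.nonEmpty).map(_.toLowerCase).toList

  // Splits on whitespace only, so '_' stays inside the token. Tokens are lowercased.
  def whitespaceSplit(text: String): List[String] =
    text.trim.split("\\s+").filter(_.nonEmpty).map(_.toLowerCase).toList
}
```

With the table name from the question, the first helper yields two tokens (tir and invoicebtnwtn01) while the second keeps tir_invoicebtnwtn01 whole.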

Purpose of Scala's Symbol? [duplicate]

Possible Duplicate:
What are some example use cases for symbol literals in Scala?
What's the purpose of Symbol and why does it deserve some special literal syntax, e.g. 'FooSymbol?
Symbols are used where you have a closed set of identifiers that you want to be able to compare quickly. When you have two String instances they are not guaranteed to be interned[1], so to compare them you must often check their contents by comparing lengths and even checking character-by-character whether they are the same. With Symbol instances, comparisons are a simple eq check (i.e. == in Java), so they are constant time (i.e. O(1)) to compare.
This sort of structure tends to be used more in dynamic languages (notably Ruby and Lisp code tends to make a lot of use of symbols) since in statically-typed languages one usually wants to restrict the set of items by type.
Having said that, if you have a key/value store where there are a restricted set of keys, where it is going to be unwieldy to use a static typed object, a Map[Symbol, Data]-style structure might well be good for you.
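Such a structure might look roughly like this. Data, the keys, and the lookup helper are all hypothetical; Symbol("...") is used instead of the quote literal so the sketch does not depend on the literal syntax being available.

```scala
object SymbolKeys {
  // Hypothetical payload type for the example.
  final case class Data(value: String)

  // A small, closed set of keys represented as symbols.
  val store: Map[Symbol, Data] = Map(
    Symbol("host") -> Data("localhost"),
    Symbol("port") -> Data("8080")
  )

  def lookup(key: Symbol): Option[Data] = store.get(key)
}
```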
A note about String interning on Java (and hence Scala): Java Strings are interned in some cases anyway; in particular string literals are automatically interned, and you can call the intern() method on a String instance to return an interned copy. Not all Strings are interned, though, which means that the runtime still has to do the full check unless they are the same instance; interning makes comparing two equal interned strings faster, but does not improve the runtime of comparing different strings. Symbols benefit from being guaranteed to be interned, so in this case a single reference equality check is sufficient to prove either equality or inequality.
[1] Interning is a process whereby when you create an object, you check whether an equal one already exists, and use that one if it does. It means that if you have two objects which are equal, they are precisely the same object (i.e. they are reference equal). The downsides to this are that it can be costly to look up which object you need to be using, and allowing objects to be garbage collected can require complex implementation.
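The interning behaviour can be checked with reference equality (eq). This is just a sketch; InterningDemo and its member names are made up for illustration.

```scala
object InterningDemo {
  // Two symbols with the same name refer to the same instance.
  val a = Symbol("status")
  val b = Symbol("status")
  val symbolsShareInstance: Boolean = a eq b

  // Two explicitly constructed strings are equal but are distinct objects.
  val s1 = new String("status")
  val s2 = new String("status")
  val stringsShareInstance: Boolean = s1 eq s2
  val stringsAreEqual: Boolean = s1 == s2

  // After intern(), both strings resolve to the same pooled instance.
  val internedShareInstance: Boolean = s1.intern() eq s2.intern()
}
```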
Symbols are interned.
The purpose is that Symbols are more efficient than Strings, and Symbols with the same name refer to the same Symbol object instance.
Have a look at this read about Ruby symbols: http://glu.ttono.us/articles/2005/08/19/understanding-ruby-symbols
You can only get the name of a Symbol:
scala> val aSymbol = 'thisIsASymbol
aSymbol: Symbol = 'thisIsASymbol
scala> assert("thisIsASymbol" == aSymbol.name)
It's not very useful in Scala and thus not widely used. In general, you can use a symbol where you'd like to designate an identifier.
For example, the reflection invocation feature which was planned for 2.8.0 used the syntax obj o 'method(arg1, arg2), where 'o' was a method added to Any and Symbol was given an apply(Any*) method (both with 'pimp my library').
Another example could be if you want to create an easier way to create HTML documents: instead of using "div" to designate an element, you'd write 'div. Then one can imagine adding operators to Symbol to make syntactic sugar for creating elements.
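One way this could look, purely as a sketch: HtmlSketch, TagOps, and the containing method are invented for the example and are not part of any library. Symbol("div") is written out instead of 'div so the sketch does not depend on the literal syntax.

```scala
object HtmlSketch {
  // Enrichment ("pimp my library") that adds an element-building method to Symbol.
  implicit class TagOps(tag: Symbol) {
    // Wraps the body in an opening and closing tag named after the symbol.
    def containing(body: String): String =
      s"<${tag.name}>$body</${tag.name}>"
  }

  val page: String =
    Symbol("div").containing(Symbol("p").containing("hello"))
}
```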

Scala: keyword as package name

I'm trying to use a Java library (no source code available) which defines some xxx.xxx.object package. Scala complains about the presence of "object" in the package name, so I can't import from it, and I can't refer to its classes with fully qualified name either.
Is there a way around it?
Wrapping object in backticks (the ` key next to 1) should work.
xxx.xxx.`object`
To complete agilefall's answer, the Scala Language Specification mentions that an import is composed of id:
id ::= plainid
     | ‘`’ stringLit ‘`’
an identifier may also be formed by an arbitrary string between back-quotes (host systems may impose some restrictions on which strings are legal for identifiers). The identifier then is composed of all characters excluding the backquotes themselves.
Backquote-enclosed strings are a solution when one needs to access Java identifiers that are reserved words in Scala.
For instance, the statement Thread.yield() is illegal, since yield is a reserved word in Scala. However, here’s a work-around:
Thread.`yield`()
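The import case from the question can be imitated without the Java library. In this sketch a nested Scala object stands in for the xxx.object package, and Widget and ImportDemo are hypothetical names; only the backquoted path segment is the point.

```scala
object xxx {
  // Stand-in for the Java package segment named "object".
  object `object` {
    class Widget { def name: String = "widget" }
  }
}

object ImportDemo {
  // The reserved word is backquoted inside the import path.
  import xxx.`object`.Widget

  def run(): String = new Widget().name
}
```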