cats' NonEmptyList vs scala stdlib :: - scala

I am studying the cats library recently, and I have come across this class called NonEmptyList.
After reading the api, I couldn't help wondering what is it that made the cats authors to create a new class, instead of utilizing something that is built in (::) and use type classes to extend it. It is not even listed in the cats github page, so I have come here to ask about it. Maybe it is because cons is a subtype of List? (Though I do not know the implications of it)
What are the differences between :: and NEL? And why did cats authors have to write NEL instead of using ::?

The main reason for having NonEmptyList which doesn't extend from List is developer experience of including assumptions in the APIs.
Firstly, note that :: has all the methods List has which can be misleading and it makes it harder to design better APIs with more powerful assumptions. Additionally, List doesn't have any methods that directly return ::, which means that developer needs to maintain non-empty abstraction manually.
Let me show you an example which shows what I mean in practice:
// NonEmptyList usage is intuitive and types fit together nicely
val nonEmpty: NonEmptyList[Int] = NonEmptyList.of(1, 2, 3)
val biggerNonEmpty: NonEmptyList[Int] = 0 :: nonEmpty
val nonEmptyMapped: NonEmptyList[Int] = nonEmpty.map(_ * 2)
// :: has lots of problems
// PROBLEM: we can't easily instantiate ::
val cons: ::[Int] = 1 :: 2 :: 3 :: Nil // type mismatch; found: List[Int]; required: ::[Int]
val cons: ::[Int] = new ::[Int](1, ::(2, ::(3, Nil)))
// PROBLEM: adding new element to Cons returns List
val biggerCons: ::[Int] = 0 :: cons // type mismatch; found: List[Int]; required: ::[Int]
// PROBLEM: ::.map returns List
val consMapped : ::[Int] = cons.map(_ * 2) // type mismatch; found: List[Int]; required: ::[Int]
Note that NonEmptyList has methods that return List, namely filter, filterNot and collect. Why? Because filtering through NonEmptyList may mean that you filter out all elements and the list can become an empty one.
This is what makes the whole non-empty abstraction so powerful. By properly using function input and output types, you can encode assumptions about the API. :: doesn't provide this abstraction.

Related

what's the conceptual purpose of the Tuple2?

take this as a follow up to this SO question
I'm new to scala and working through the 99 problems. The given solution to p9 is:
object P09 {
def pack[A](ls: List[A]): List[List[A]] = {
if (ls.isEmpty) List(List())
else {
val (packed, next) = ls span { _ == ls.head }
if (next == Nil) List(packed)
else packed :: pack(next)
}
}
}
The span function is doing all the work here. As you can see from the API doc (it's the link) span returns a Tuple2 (actually the doc says it returns a pair - but that's been deprecated in favor or Tuple2). I was trying to figure out why you don't get something back like a list-of-lists or some such thing and stumbled across the SO link above. As I understand it, the reason for the Tuple2 has to do with increasing performance by not having to deal with Java's 'boxing/unboxing' of things like ints into objects like Integers. My question is
1) is that an accurate statement?
2) are there other reasons for something like span to return a Tuple2?
thx!
A TupleN object has at least two major differences when compared to a "standard" List+:
(less importantly) the size of the tuple is known beforehand, allowing to better reason about it (by the programmer and the compiler).
(more importantly) a tuple preserves the type information for each of its elements/"slots".
Note that, as alluded, the type Tuple2 is a part of the TupleN family, all utilizing the same concept. For example:
scala> ("1",2,3l)
res0: (String, Int, Long) = (1,2,3)
scala> res0.getClass
res1: Class[_ <: (String, Int, Long)] = class scala.Tuple3
As you can see here, each of the elements in the 3-tuple has a distinct type, allowing for better pattern matching, stricter type protection etc.
+heterogeneous lists are also possible in Scala, but, so far, they're not part of the standard library, and arguably harder to understand, especially for newcomers.
span returns exactly two values. A Tuple2 can hold exactly two values. A list can contain arbitrarily many values. Therefore Tuple2 is just a better fit than using a list.

When should .empty be used versus the singleton empty instance?

Working with collections in Scala, it's common to need to use the empty instance of the collection for a base case. Because the default empty instances extend the collection type class with a type parameter of Nothing, it sometimes defeats type inference to use them directly. For example:
scala> List(1, 2, 3).foldLeft(Nil)((x, y) => x :+ y.toString())
<console>:8: error: type mismatch;
found : List[String]
required: scala.collection.immutable.Nil.type
List(1, 2, 3).foldLeft(Nil)((x, y) => x :+ y.toString())
^
fails, but the following two corrections succeed:
scala> List(1, 2, 3).foldLeft(Nil: List[String])((x, y) => x :+ y.toString())
res9: List[String] = List(1, 2, 3)
scala> List(1, 2, 3).foldLeft(List.empty[String])((x, y) => x :+ y.toString())
res10: List[String] = List(1, 2, 3)
Another place I've run into a similar dilemma is in defining default parameters. These are the only examples I could think of off the top of my head, but I know I've seen others. Is one method of providing the correct type hinting preferable to the other in general? Are there places where each would be advantageous?
I tend to use Nil (or None) in combination with telling the type parameterized method the type (like Kigyo) for the specific use case given. Though I think using explicit type annotation is equally OK for the use case given. But I think there are use cases where you want to stick to using .empty, for example, if you try to call a method on Nil: List[String] you first have to wrap it in braces, so that's 2 extra characters!!.
Now the other argument for using .empty is consistency with the entire collections hierarchy. For example you can't do Nil: Set[String] and you can't do Nil: Option[String] but you can do Set.empty[String] and Option.empty[String]. So unless your really sure your code will never be refactored into some other collection then you should use .empty as it will require less faff to refactor. Furthermore it's just generally nice to be consistent right? :)
To be fair I often use Nil and None as I'm often quite sure I'd never want to use a Set or something else, in fact I'd say it's better to use Nil when your really sure your only going to deal with lists because it tells the reader of the code "I'm really really dealing with lists here".
Finally, you can do some cool stuff with .empty and duck typing check this simple example:
def printEmptyThing[K[_], T <: {def empty[A] : K[A]}](c: T): Unit =
println("thing = " + c.empty[String])
printEmptyThing[List, List.type](List)
printEmptyThing[Option, Option.type](Option)
printEmptyThing[Set, Set.type](Set)
will print:
> thing = List()
> thing = None
> thing = Set()

Is cons a method or a class?

Marting Odersky in his book writes:
Class ::, pronounced “cons” for “construct,” represents non-empty lists.
and
The list construction methods :: and ::: are special. Because they end
in a colon, they are bound to their right operand. That is, an
operation such as x :: xs is treated as the method call xs.::(x), not
x.::(xs). In fact, x.::(xs) would not make sense, as x is of the list
element type, which can be arbitrary, so we cannot assume that this
type would have a :: method. For this reason, the :: method should
take an element value and yield a new list
So is :: a method or a class?
It is both a class and a method. Cons is a type parametised class. List[A] has two sub classes: Cons and Nil. As Cons is a class it can be created by its constructor as follows:
val s = new ::[Int](4, Nil)
Cons is a case class and we use the constructor when we do a pattern match. Cons is also a method on the list class that is implemented in its two sub classes. Hence we can use the cons method on the cons class that we have created above.
val s1 = s.::(5)
Part of the confusion may arise, because we normally create Lists using the apply method of the List object:
val s2 = List(1, 2, 3)
Normally the apply method of an object returns a new instance of the Class with the same name as an object. This is however just convention. In this particular case the List Object returns a new instance of the Cons subclass. The List class itself is a sealed abstract class so can not be instantiated. So the above apply method does the following:
val s2 = 1 :: 2 :: 3 :: Nil
Any method that ends in a ':' is a method on the operand to its right so it can be rewritten as
val s2 = 1 :: (2 :: (3 :: Nil)) //or as
val s2 = Nil.::(3).::(2).::(1)
So the Cons(::) method on the Nil object takes the 3 as a parameter and produces an anonymous instantiation of the Cons class with 3 as its head and a reference to the Nil object as its tail. Lets call this anonymous object c1. The Cons method is then called on c1 taking 2 as its parameter returning a new anonymous Cons instantiation lets call it c2, which has 2 for its head and a reference to c1 as its tail. Then finally the cons method on the c2 object takes the 1 as a parameter and returns the named object s2 with 1 as its head and a reference to c2 as its tail.
A second point of confusion is that the REPL and Scala worksheets use a class' toString methods to display results. So the worksheet gives us:
val s3 = List(5, 6, 7) // s3 : List[Int] = List(5, 6, 7)
val s4 = List() // s4 : List[Nothing] = List()
val s5: List[Int] = List() // s5 : List[Int] = List()
s3.getClass.getName // res3: String = scala.collection.immutable.$colon$colon
s4.getClass.getName // res4: String = scala.collection.immutable.Nil$
s5.getClass.getName // res5: String = scala.collection.immutable.Nil$
As said above List is sealed so new sub sub classes can not be created as Nil is an object and Cons is final. As Nil is an object it can't be parametrised. Nil inherits from List[Nothing]. At first glance that doesn't sound that useful but remember these data structures are immutable, so we can never add to them directly and Nothing is a sub class of every other class. So we can add the Nil class (indirectly) to any List without problem. The Cons class has two members a head and another List. Its a rather neat solution when you clock it.
I'm not sure if this has any practical use but you can use Cons as a type:
var list: ::[Int] = List (1,2,3).asInstanceOf[::[Int]]
println("Initialised List")
val list1 = Nil
list = list1.asInstanceOf[::[Int]] //this will throw a java class cast exception at run time
println("Reset List")
Short answer: both.
There is a subclass of list called ::, but you won't refer to it explicitly very often.
When you write e.g. 1 :: 2 :: Nil, the :: is a method on List, which creates an instance of the class :: behind the scenes.
:: is best thought of not as a method or a class, though, but as a constructor in the algebraic data type (ADT) sense. Wikipedia calls ADT constructors "quasi-functional entit[ies]", which makes them sound more complicated than they are, but isn't necessarily a bad way to think about them.
List has two constructors, :: (called cons in some languages) and Nil. The Wikipedia article I've linked above gives a good introduction to the idea of lists as algebraic data types.
It's worth noting that in some languages (like Haskell) ADT constructors don't have their own types associated with them—they are just functions that create an instance of the ADT.
This is typically effectively the case in Scala as well, where it's fairly rare to refer to the type of an ADT constructor like ::. It is possible, though—we can write this:
def foo(xs: ::[Int]) = ???
Or this (where Some is one of the constructors for the Option ADT).
def bar(s: Some[Int]) = ???
But this is in general not very useful, and could be considered an artifact of the way Scala implements algebraic data types.

Testing an assertion that something must not compile

The problem
When I'm working with libraries that support type-level programming, I often find myself writing comments like the following (from an example presented by Paul Snively at Strange Loop 2012):
// But these invalid sequences don't compile:
// isValid(_3 :: _1 :: _5 :: _8 :: _8 :: _2 :: _8 :: _6 :: _5 :: HNil)
// isValid(_3 :: _4 :: _5 :: _8 :: _8 :: _2 :: _8 :: _6 :: HNil)
Or this, from an example in the Shapeless repository:
/**
* If we wanted to confirm that the list uniquely contains `Foo` or any
* subtype of `Foo`, we could first use `unifySubtypes` to upcast any
* subtypes of `Foo` in the list to `Foo`.
*
* The following would not compile, for example:
*/
//stuff.unifySubtypes[Foo].unique[Foo]
This is a very rough way of indicating some fact about the behavior of these methods, and we could imagine wanting to make these assertions more formal—for unit or regression testing, etc.
To give a concrete example of why this might be useful in the context of a library like Shapeless, a few days ago I wrote the following as a quick first attempt at an answer to this question:
import shapeless._
implicit class Uniqueable[L <: HList](l: L) {
def unique[A](implicit ev: FilterAux[L, A, A :: HNil]) = ev(l).head
}
Where the intention is that this will compile:
('a' :: 'b :: HNil).unique[Char]
While this will not:
('a' :: 'b' :: HNil).unique[Char]
I was surprised to find that this implementation of a type-level unique for HList didn't work, because Shapeless would happily find a FilterAux instance in the latter case. In other words, the following would compile, even though you'd probably expect it not to:
implicitly[FilterAux[Char :: Char :: HNil, Char, Char :: HNil]]
In this case, what I was seeing was a bug—or at least something bug-ish—and it has since been fixed.
More generally, we can imagine wanting to check the kind of invariant that was implicit in my expectations about how FilterAux should work with something like a unit test—as weird as it may sound to be talking about testing type-level code like this, with all the recent debates about the relative merit of types vs. tests.
My question
The problem is that I don't know of any kind of testing framework (for any platform) that allows the programmer to assert that something must not compile.
One approach that I can imagine for the FilterAux case would be to use the old implicit-argument-with-null-default trick:
def assertNoInstanceOf[T](implicit instance: T = null) = assert(instance == null)
Which would let you write the following in your unit test:
assertNoInstanceOf[FilterAux[Char :: Char :: HNil, Char, Char :: HNil]]
The following would be a heck of a lot more convenient and expressive, though:
assertDoesntCompile(('a' :: 'b' :: HNil).unique[Char])
I want this. My question is whether anyone knows of any testing library or framework that supports anything remotely like it—ideally for Scala, but I'll settle for anything.
Not a framework, but Jorge Ortiz (#JorgeO) mentioned some utilities he added to the tests for Foursquare's Rogue library at NEScala in 2012 which support tests for non-compilation: you can find examples here. I've been meaning to add something like this to shapeless for quite a while.
More recently, Roland Kuhn (#rolandkuhn) has added a similar mechanism, this time using Scala 2.10's runtime compilation, to the tests for Akka typed channels.
These are both dynamic tests of course: they fail at (test) runtime if something that shouldn't compile does. Untyped macros might provide a static option: ie. a macro could accept an untyped tree, type check it and throw a type error if it succeeds). This might be something to experiment with on the macro-paradise branch of shapeless. But not a solution for 2.10.0 or earlier, obviously.
Update
Since answering the question, another approach, due to Stefan Zeiger (#StefanZeiger), has surfaced. This one is interesting because, like the untyped macro one alluded to above, it is a compile time rather than (test) runtime check, however it is also compatible with Scala 2.10.x. As such I think it is preferable to Roland's approach.
I've now added implementations to shapeless for 2.9.x using Jorge's approach, for 2.10.x using Stefan's approach and for macro paradise using the untyped macro approach. Examples of the corresponding tests can be found here for 2.9.x, here for 2.10.x and here for macro paradise.
The untyped macro tests are the cleanest, but Stefan's 2.10.x compatible approach is a close second.
ScalaTest 2.1.0 has the following syntax for Assertions:
assertTypeError("val s: String = 1")
And for Matchers:
"val s: String = 1" shouldNot compile
Do you know about partest in the Scala project? E.g. CompilerTest has the following doc:
/** For testing compiler internals directly.
* Each source code string in "sources" will be compiled, and
* the check function will be called with the source code and the
* resulting CompilationUnit. The check implementation should
* test for what it wants to test and fail (via assert or other
* exception) if it is not happy.
*/
It is able to check for example whether this source https://github.com/scala/scala/blob/master/test/files/neg/divergent-implicit.scala will have this result https://github.com/scala/scala/blob/master/test/files/neg/divergent-implicit.check
It's not a perfect fit for your question (since you don't specify your test cases in terms of asserts), but may be an approach and/or give you a head start.
Based on the links provided by Miles Sabin I was able to use the akka version
import scala.tools.reflect.ToolBox
object TestUtils {
def eval(code: String, compileOptions: String = "-cp target/classes"): Any = {
val tb = mkToolbox(compileOptions)
tb.eval(tb.parse(code))
}
def mkToolbox(compileOptions: String = ""): ToolBox[_ <: scala.reflect.api.Universe] = {
val m = scala.reflect.runtime.currentMirror
m.mkToolBox(options = compileOptions)
}
}
Then in my tests I used it like this
def result = TestUtils.eval(
"""|import ee.ui.events.Event
|import ee.ui.events.ReadOnlyEvent
|
|val myObj = new {
| private val writableEvent = Event[Int]
| val event:ReadOnlyEvent[Int] = writableEvent
|}
|
|// will not compile:
|myObj.event.fire
|""".stripMargin)
result must throwA[ToolBoxError].like {
case e =>
e.getMessage must contain("value fire is not a member of ee.ui.events.ReadOnlyEvent[Int]")
}
The compileError macro in µTest does just that:
compileError("true * false")
// CompileError.Type("value * is not a member of Boolean")
compileError("(}")
// CompileError.Parse("')' expected but '}' found.")

Is there any fundamental limitations that stops Scala from implementing pattern matching over functions?

In languages like SML, Erlang and in buch of others we may define functions like this:
fun reverse [] = []
| reverse x :: xs = reverse xs # [x];
I know we can write analog in Scala like this (and I know, there are many flaws in the code below):
def reverse[T](lst: List[T]): List[T] = lst match {
case Nil => Nil
case x :: xs => reverse(xs) ++ List(x)
}
But I wonder, if we could write former code in Scala, perhaps with desugaring to the latter.
Is there any fundamental limitations for such syntax being implemented in the future (I mean, really fundamental -- e.g. the way type inference works in scala, or something else, except parser obviously)?
UPD
Here is a snippet of how it could look like:
type T
def reverse(Nil: List[T]) = Nil
def reverse(x :: xs: List[T]): List[T] = reverse(xs) ++ List(x)
It really depends on what you mean by fundamental.
If you are really asking "if there is a technical showstopper that would prevent to implement this feature", then I would say the answer is no. You are talking about desugaring, and you are on the right track here. All there is to do is to basically stitch several separates cases into one single function, and this can be done as a mere preprocessing step (this only requires syntactic knowledge, no need for semantic knowledge). But for this to even make sense, I would define a few rules:
The function signature is mandatory (in Haskell by example, this would be optional, but it is always optional whether you are defining the function at once or in several parts). We could try to arrange to live without the signature and attempt to extract it from the different parts, but lack of type information would quickly come to byte us. A simpler argument is that if we are to try to infer an implicit signature, we might as well do it for all the methods. But the truth is that there are very good reasons to have explicit singatures in scala and I can't imagine to change that.
All the parts must be defined within the same scope. To start with, they must be declared in the same file because each source file is compiled separately, and thus a simple preprocessor would not be enough to implement the feature. Second, we still end up with a single method in the end, so it's only natural to have all the parts in the same scope.
Overloading is not possible for such methods (otherwise we would need to repeat the signature for each part just so the preprocessor knows which part belongs to which overload)
Parts are added (stitched) to the generated match in the order they are declared
So here is how it could look like:
def reverse[T](lst: List[T]): List[T] // Exactly like an abstract def (provides the signature)
// .... some unrelated code here...
def reverse(Nil) = Nil
// .... another bit of unrelated code here...
def reverse(x :: xs ) = reverse(xs) ++ List(x)
Which could be trivially transformed into:
def reverse[T](list: List[T]): List[T] = lst match {
case Nil => Nil
case x :: xs => reverse(xs) ++ List(x)
}
// .... some unrelated code here...
// .... another bit of unrelated code here...
It is easy to see that the above transformation is very mechanical and can be done by just manipulating a source AST (the AST produced by the slightly modified grammar that accepts this new constructs), and transforming it into the target AST (the AST produced by the standard scala grammar).
Then we can compile the result as usual.
So there you go, with a few simple rules we are able to implement a preprocessor that does all the work to implement this new feature.
If by fundamental you are asking "is there anything that would make this feature out of place" then it can be argued that this does not feel very scala. But more to the point, it does not bring that much to the table. Scala author(s) actually tend toward making the language simpler (as in less built-in features, trying to move some built-in features into libraries) and adding a new syntax that is not really more readable goes against the goal of simplification.
In SML, your code snippet is literally just syntactic sugar (a "derived form" in the terminology of the language spec) for
val rec reverse = fn x =>
case x of [] => []
| x::xs = reverse xs # [x]
which is very close to the Scala code you show. So, no there is no "fundamental" reason that Scala couldn't provide the same kind of syntax. The main problem is Scala's need for more type annotations, which makes this shorthand syntax far less attractive in general, and probably not worth the while.
Note also that the specific syntax you suggest would not fly well, because there is no way to distinguish one case-by-case function definition from two overloaded functions syntactically. You probably would need some alternative syntax, similar to SML using "|".
I don't know SML or Erlang, but I know Haskell. It is a language without method overloading. Method overloading combined with such pattern matching could lead to ambiguities. Imagine following code:
def f(x: String) = "String "+x
def f(x: List[_]) = "List "+x
What should it mean? It can mean method overloading, i.e. the method is determined in compile time. It can also mean pattern matching. There would be just a f(x: AnyRef) method that would do the matching.
Scala also has named parameters, which would be probably also broken.
I don't think that Scala is able to offer more simple syntax than you have shown in general. A simpler syntax may IMHO work in some special cases only.
There are at least two problems:
[ and ] are reserved characters because they are used for type arguments. The compiler allows spaces around them, so that would not be an option.
The other problem is that = returns Unit. So the expression after the | would not return any result
The closest I could come up with is this (note that is very specialized towards your example):
// Define a class to hold the values left and right of the | sign
class |[T, S](val left: T, val right: PartialFunction[T, T])
// Create a class that contains the | operator
class OrAssoc[T](left: T) {
def |(right: PartialFunction[T, T]): T | T = new |(left, right)
}
// Add the | to any potential target
implicit def anyToOrAssoc[S](left: S): OrAssoc[S] = new OrAssoc(left)
object fun {
// Use the magic of the update method
def update[T, S](choice: T | S): T => T = { arg =>
if (choice.right.isDefinedAt(arg)) choice.right(arg)
else choice.left
}
}
// Use the above construction to define a new method
val reverse: List[Int] => List[Int] =
fun() = List.empty[Int] | {
case x :: xs => reverse(xs) ++ List(x)
}
// Call the method
reverse(List(3, 2, 1))