In Scala how do I represent a mix of things and groups of things in a parameter list? X = (X | atom)* - scala

I have a *-parameter method. I would like to be able to pass a mix of atoms and groups
of atoms into the method. Ideally I would like the groups to be able to hold groups too.
The grammar rule would be:
X = (X | atom)*
The groups need to be ordered, but not necessarily of class List.
The motivation is that there are many calls to the *-parameter method and some groups of parameters occur more than once amongst these calls. I would like to be able to store these groups in vals to re-use them.

Why not something like this?
trait GroupOrAtom // or any other nicer name!
class Atom extends GroupOrAtom
class AtomGroup(val atoms: Seq[Atom]) extends GroupOrAtom
def process(elements: GroupOrAtom*) = ...
If you're looking for a more fancy way to do it using union types, try reading Miles Sabin's amazing post on how to implement union types in Scala. This should probably not be your first choice, though, as a solution implementing a common trait like GroupOrAtom is clearer and easier.
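If groups should also be able to hold other groups, as the grammar in the question asks, the common-trait approach extends naturally. The following is only a minimal sketch: the names Element, Atom, Group and flatten are hypothetical and do not come from the answer above.

```scala
// Hypothetical names: Element, Atom, Group and flatten are illustrative only.
sealed trait Element
final case class Atom(name: String) extends Element
// A group holds an ordered sequence of atoms and/or nested groups.
final case class Group(elements: Element*) extends Element

// Flattens any mix of atoms and (nested) groups into an ordered list of atoms.
def flatten(elements: Element*): List[Atom] =
  elements.toList.flatMap {
    case atom: Atom   => List(atom)
    case group: Group => flatten(group.elements: _*)
  }

// A group stored in a val can be reused across many calls.
val common = Group(Atom("b"), Group(Atom("c"), Atom("d")))
val result = flatten(Atom("a"), common, Atom("e"))
```

Here result keeps the atoms in the order in which they were written, with the nested groups expanded in place.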

Could just passing tuples work for you?
def processor(tokens: Any) = // pattern match on tuples
processor('atom)
processor('atom1, 'atom2)
processor('atom1, ('atom2a, 'atom2b))
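One possible way to write the pattern match on tuples is sketched below. It assumes that atoms are plain Strings (instead of the symbols used above) and uses a hypothetical helper name, flattenTokens. Note that this approach gives up static type checking of the arguments.

```scala
// Recursively flattens a value that is either a single atom or a (possibly
// nested) tuple of atoms. Atoms are assumed to be plain Strings here.
def flattenTokens(tokens: Any): List[String] = tokens match {
  case atom: String => List(atom)
  case tuple: Product if tuple.productPrefix.startsWith("Tuple") =>
    // Every TupleN is a Product; visit its elements in order.
    tuple.productIterator.flatMap(flattenTokens).toList
  case other =>
    throw new IllegalArgumentException(s"Unsupported token: $other")
}

val flat = flattenTokens(("atom1", ("atom2a", "atom2b"), "atom3"))
```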

Related

Combine two enums

How to combine two enums in scala?
object FilterDimensions extends Enumeration {
type FilterDimensions = Value
val Instance, Usage, Cost = Value
}
object Filter2Dimensions extends Enumeration {
type Filter2Dimensions = Value
val Instance, Savings, Coverage = Value
}
Output needs to be a single enumeration which contains only the distinct values
Enum - Instance, Usage, Cost, Savings, Coverage
I don't think that's possible.
Even if you have an Enumeration A_1 with values B and C, and another Enumeration A_2 with values C and D, the correct names are A_1.B, A_1.C, A_2.C and A_2.D. They are completely unrelated types, so there is no automatic way to discard what you consider a duplicate, unless you rely on the String representation.
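A minimal sketch of the String-based route, assuming a list of distinct names is an acceptable result (it is not a new Enumeration, and distinctNames is a hypothetical name):

```scala
object FilterDimensions extends Enumeration {
  val Instance, Usage, Cost = Value
}
object Filter2Dimensions extends Enumeration {
  val Instance, Savings, Coverage = Value
}

// Merge on the String names only; the two Value types stay unrelated.
val distinctNames: List[String] =
  (FilterDimensions.values.toList.map(_.toString) ++
    Filter2Dimensions.values.toList.map(_.toString)).distinct
```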
Maybe you are approaching the larger problem the wrong way (or maybe you really have to do this because someone else defined the enumerations that way). Have you thought about having a common trait on both Enumerations? That would allow your methods to receive values from either Enumeration (I'm not sure it works, because what you want is the Enumeration.Value...)
Another option would be a two-in-one type. Either[FilterDimensions, Filter2Dimensions] could do it, but I'm almost sure cats, scalaz, or shapeless have a better-suited type.
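A rough sketch of the Either idea, using only the standard library. The alias AnyDimension and the helper describe are hypothetical names; note that the type parameters have to be the Value types of the two enumerations.

```scala
object FilterDimensions extends Enumeration {
  val Instance, Usage, Cost = Value
}
object Filter2Dimensions extends Enumeration {
  val Instance, Savings, Coverage = Value
}

// One type that accepts a value from either enumeration.
type AnyDimension = Either[FilterDimensions.Value, Filter2Dimensions.Value]

def describe(d: AnyDimension): String = d match {
  case Left(first)   => s"FilterDimensions.$first"
  case Right(second) => s"Filter2Dimensions.$second"
}

val a: AnyDimension = Left(FilterDimensions.Cost)
val b: AnyDimension = Right(Filter2Dimensions.Instance)
```

The two Instance values remain distinguishable, which is exactly why this does not remove "duplicates" by itself.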
I wrote some posts about enumerations in Scala that you may find useful. They cover some alternatives to the native Scala Enumeration, and some of them may have the features you need:
http://pedrorijo.com/blog/scala-enums/
http://pedrorijo.com/blog/scala-enums-part2/

How to define a union type that works at runtime?

Following on from this excellent set of answers on how to define union types in Scala: I've been using the Miles Sabin definition of union types, but one question remains.
How do you work with these if the type isn't known until runtime? For example:
trait inv[-A] {}
type Or[A,B] = {
type check[X] = (inv[A] with inv[B]) <:< inv[X]
}
case class Foo[A : (Int Or String)#check](a: A)
Foo(1) // Foo[Int] = Foo(1)
Foo("hi") // Foo[String] = Foo(hi)
Foo(2.0) // Error!
This example works since the parameter A is known at compile time, and calling Foo(1) is really calling Foo[Int](1). However, what do you do if parameter A isn't known until runtime? Maybe you're parsing a file that contains the data for Foos, in which case the type parameter of Foo isn't known until you read the data. There's no easy way to set parameter A in this case.
The best solutions I've been able to come up with are:
Pattern match on the data you've read and then create different Foos based on that type. In my case this isn't feasible because my case class actually contains dozens of union types, so there'd be hundreds of combinations of types to pattern match.
Cast the type you've just read to be (String or Int), so you have a single type to pass around, that passes the Type Class constraint when you create Foo with it. Then return Foo[_] instead. This puts the onus back on the Foo user to work out the type of each field (since they'll appear to be type Any), but at least it defers having to know the type until the field is actually used, in which case a pattern match seems more tractable.
The second solution looks like this:
def parseLine: Any // Parses data point, but can be either a String or
// Int, so returns Any.
def mkFoo: Foo[_] = {
val a = parseLine.asInstanceOf[Int with String]
Foo(a) // Passes type constraint now
}
In practice I've ended up using the second solution, but I'm wondering if there's something better I can do?
Another way to state the problem is: What does it mean to return a Union Type? Functions can only return a single type, and the trickery we use with Miles Sabin union types is only useful for the types you pass in, not for the types you return.
PS. For context, this is a problem in my case because I'm generating a set of case classes from a JSON schema file. JSON naturally supports union types, so I would like to make my case classes reflect that too. This works great in one direction: users creating case classes to be serialized out to JSON. But it gets sticky in the other direction: users parsing JSON files to have a set of populated case classes returned to them.
The "standard" Scala solution to this problem is to use an ordinary discriminated-union type (i.e., to forgo true union types altogether):
sealed trait Foo
case class IntFoo(x: Int) extends Foo
case class StringFoo(x: String) extends Foo
This reflects the fact that, as you observe, the particular type of the member is a runtime value; the JVM type-tag of the Foo instance provides this runtime value.
Miles Sabin's implementation of union types is very clever, but I'm not sure if it provides any practical benefit, because it only restricts the type of thing that can go into a Foo, but provides the user of a Foo with no computable version of that restriction, in the way a match provides you with a computable version of the sealed trait. In general, for a restriction to be useful, it needs two sides: a check that only the right things are put in, and an extractor (aka an eliminator) that allows the same right things to come out the other end.
Perhaps if you gave some explanation of why you're looking for a purer union type it would illuminate whether regular discriminated unions are sufficient or if you really need something more.
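To make the "two sides" point concrete, here is a minimal sketch of a discriminated union whose case is chosen at runtime while parsing. The names FieldValue, IntValue, StringValue, parseField and describe are hypothetical, and the parsing rule (anything that is not an Int is treated as a String) is an assumption made for illustration.

```scala
import scala.util.Try

sealed trait FieldValue
final case class IntValue(value: Int) extends FieldValue
final case class StringValue(value: String) extends FieldValue

// The check: only an Int or a String can go in, decided from the raw input at runtime.
def parseField(raw: String): FieldValue =
  Try(raw.trim.toInt).map(IntValue(_)).getOrElse(StringValue(raw))

// The eliminator: a match recovers the precise type on the way out.
def describe(v: FieldValue): String = v match {
  case IntValue(n)    => s"int $n"
  case StringValue(s) => s"string of length ${s.length}"
}
```

Because the trait is sealed, the compiler can warn when a match forgets one of the cases.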
There's a reason every JSON parser for Scala requires well defined types into which the JSON will be converted, even if some fields have to be dropped: you cannot work with something you don't know the type of.
To give an example, say you have a, and maybe a is a String, maybe it's an Int, but you don't know which. What computation could you possibly perform with a, not knowing its type? Why would your code compute the sum of all a's, for instance, if you didn't know in advance it was a number?
Generally, the answer to that is to perform user-provided data manipulation at runtime over data with unknown characteristics, as the users themselves see that it's a number and decide they want to know what the sum of that field is. Fine, but you are going the wrong way about it if so.
There is a well defined way to represent JSON data in Scala (and, for that matter, any data that has the same characteristics as JSON): a hierarchy of classes. A JSON value may be a JSON object, an array or one of a number of primitives. A JSON object contains a list of key/value pairs, whose keys are JSON strings and whose values are JSON values. And so on. This is easy to represent, and there are many libraries doing so already. In fact, there are so many that there's a project called Json4s which presents a unified API that can be used with, and is implemented by, many of the aforementioned libraries.
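Such a hierarchy could look roughly like the sketch below. The class names and the sumField helper are hypothetical and are not the API of Json4s or of any other library; they only illustrate how "sum this field if it is a number" can be written against a class hierarchy.

```scala
// Hypothetical hierarchy, not taken from any real JSON library.
sealed trait JsonValue
case object JsonNull extends JsonValue
final case class JsonBoolean(value: Boolean) extends JsonValue
final case class JsonNumber(value: BigDecimal) extends JsonValue
final case class JsonString(value: String) extends JsonValue
final case class JsonArray(items: List[JsonValue]) extends JsonValue
final case class JsonObject(fields: List[(String, JsonValue)]) extends JsonValue

// Sums every numeric value stored under `key`, at any nesting depth.
def sumField(json: JsonValue, key: String): BigDecimal = json match {
  case JsonObject(fields) =>
    fields.map {
      case (`key`, JsonNumber(n)) => n
      case (_, nested)            => sumField(nested, key)
    }.sum
  case JsonArray(items) => items.map(sumField(_, key)).sum
  case _                => BigDecimal(0)
}

val doc: JsonValue = JsonArray(List(
  JsonObject(List("a" -> JsonNumber(BigDecimal(1)), "b" -> JsonString("x"))),
  JsonObject(List("a" -> JsonNumber(BigDecimal(2)), "c" -> JsonNull))
))
val total = sumField(doc, "a")
```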
Things like the records which Miles Sabin's Shapeless library provide are intended to be used when the input doesn't have a well defined schema, but the program knows what it needs from that input. And, yes, the program might know what to do with a if it is an Int or a String, but not every possible value.
Scala 3 (expected in mid-2020), which is based on Dotty, will implement the union type proposal from September 2018.
You can see it in "A Tour of Scala 3" (June 2019):
Union Types Provide ad-hoc combinations of types
Subsetting = Subtyping
No boxing overhead
case class UserName(name: String)
case class Password(hash: Hash)
def help(id: UserName | Password) = {
val user = id match {
case UserName(name) => lookupName(name)
case Password(hash) => lookupPassword(hash)
}
...
}
Union Types Work also with singleton types
Great for JS interop
type Command = "Click" | "Drag" | "KeyPressed"
def handleEvent(kind: Command) = kind match {
case "Click" => MouseClick()
case "Drag" => MoveTo()
case "KeyPressed" => KeyPressed()
}

Scala Style Guide: Why Mimic a function?

I’m reading the Scala style guide: http://docs.scala-lang.org/style/naming-conventions.html
and they mention this:
Objects
Objects follow the class naming convention (camelCase with a
capital first letter) except when attempting to mimic a package or a
function. These situations don’t happen often, but can be expected in
general development.:
object ast {
sealed trait Expr
case class Plus(e1: Expr, e2: Expr) extends Expr
...
}
object inc {
def apply(x: Int): Int = x + 1
}
I can think of maybe a few thin use cases for the "object ast". But I can't think of why anyone would want to "mimic a function" in the manner of "object inc". It feels a bit unconventional, and likely to confuse other developers.
Are there any example cases where the core Scala libraries do this? Or when would it be good practice to define a function like this?
As mentioned in the comments, one good example is shapeless.Poly functions.
A Poly function is a polymorphic version of a function. It needs to be represented as an object for three main reasons:
it contains multiple functions (to handle multiple cases, since they're polymorphic)
an object's companion object is the object itself. This allows for defining the various cases as implicit methods inside the object and have them picked up by the compiler
objects provide a stable identifier, so the compiler won't complain when passing the instance of the function to any of shapeless's methods
Technicalities aside, they're conceptually functions, hence the same naming style for regular functions is used.
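The idea can be imitated without shapeless. The sketch below is a hand-rolled stand-in, not the shapeless API: size, Case and the individual cases are hypothetical names. It relies on the implicits defined inside the object being found when the object is called like a function.

```scala
// Hypothetical, hand-rolled "polymorphic function"; it does not use shapeless.
object size {
  // One Case instance per argument type the function supports.
  trait Case[A] { def apply(a: A): Int }

  implicit val intCase: Case[Int] =
    new Case[Int] { def apply(a: Int): Int = 1 }
  implicit val stringCase: Case[String] =
    new Case[String] { def apply(a: String): Int = a.length }
  implicit def pairCase[A, B](implicit ca: Case[A], cb: Case[B]): Case[(A, B)] =
    new Case[(A, B)] { def apply(p: (A, B)): Int = ca(p._1) + cb(p._2) }

  // Calling size(x) selects the matching case at compile time.
  def apply[A](a: A)(implicit c: Case[A]): Int = c(a)
}

val n = size((23, "foo"))
```

Since size(...) reads like a function call, a lower-case name fits the convention quoted from the style guide.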

Understanding GenericTraversableTemplate and other Scala collection internals

I was exchanging emails with an acquaintance who is a big Kotlin, Clojure and Java 8 fan, and I asked him why not Scala. He gave many reasons (Scala is too academic, it has too many features; this is not the first time I've heard this, and I think it is very subjective),
but his biggest pain point was that he doesn't like a language where he can't understand the implementation of basic data structures, and he gave LinkedList as an example.
I took a look at scala.collection.LinkedList and counted the things I either understand or somewhat understand.
CanBuildFrom - after some effort, I get it: type classes, not the longest suicide note in history [1]
LinkedListLike - I can't remember where I read it, but I got convinced this is there for a good reason
But then I started to stare at these
GenericTraversableTemplate - now I'm scratching my head as well...
SeqFactory, GenericCompanion - OK, now you lost me, I start to understand his point
Can someone who understands this well please explain GenericTraversableTemplate, SeqFactory and GenericCompanion in the context of LinkedList? What are they for, and what impact do they have on the end user? (I'm sure they are there for a good reason; what is that reason?)
Are they there for a practical reason? or is it a level of abstraction that could have been simplified?
I like Scala collections because I don't have to understand the internals to be able to use them effectively. I don't mind a complex implementation if it helps me keep my usage simpler; that is, I don't mind paying the price of a complex library if in return I get the ability to write cleaner, more elegant code with it. But it would sure be nice to understand it better.
[1] - Is the Scala 2.8 collections library a case of "the longest suicide note in history"?
I will try to describe the concepts from the point of view of a random pedestrian (I've never contributed a single line to the Scala collection library, so don't hit me too hard if I'm wrong).
Since LinkedList is now deprecated, and because Maps provide a better example, I will use TreeMap as example.
CanBuildFrom
The motivation is this: If we take a TreeMap[Int, Int] and map it with
case (x, y) => (2 * x, y * y * 0.3d)
we get TreeMap[Int, Double]. This type safety alone would already explain the necessity for
simple genericBuilder[X] constructs.
However, if we map it with
case (x, y) => x
we obtain an Iterable[Int] (more precisely: a List[Int]). This is no longer a Map: the type of the container has changed. This is where CBFs come into play:
CanBuildFrom[This, X, That]
can be seen as a kind of "type-level function" that tells us: if we map a collection of type
This with a function that returns values of type X, we can build a That. The most specific
CBF is provided at compile time, in the first case it will be something like
CanBuildFrom[TreeMap[_,_], (X,Y), TreeMap[X,Y]]
in the second case it will be something like
CanBuildFrom[TreeMap[_,_], X, Iterable[X]]
and so we always get the right type of the container. The pattern is pretty general.
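The two cases can be observed directly. This is a small sketch using only the standard library; the result types are written out explicitly so that the compiler checks them. In Scala 2.13 and later CanBuildFrom itself was removed from the collections, but as far as I can tell the static result types shown here are the same.

```scala
import scala.collection.immutable.TreeMap

val m = TreeMap(1 -> 2, 3 -> 4)

// Mapping to pairs keeps the container: the result is again a TreeMap.
val scaled: TreeMap[Int, Double] = m.map { case (x, y) => (2 * x, y * y * 0.3d) }

// Mapping to non-pairs cannot produce a Map; the result is only an Iterable.
val keys: Iterable[Int] = m.map { case (x, _) => x }
```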
Every time you have a generic function
foo[X1, ..., Xn](x1: X1, ..., xn: Xn): Y
where the result type Y depends on X1, ..., Xn, you can introduce an implicit parameter as
follows:
foo[X1, ..., Xn, Y](x1: X1, ..., xn: Xn)(implicit cff: CanFooFrom[X1, ..., Xn, Y]): Y
and then define the type-level function X1, ..., Xn -> Y piecewise by providing multiple
implicit CanFooFrom's.
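A concrete, self-contained instance of this pattern might look as follows. CanAdd and add are hypothetical names chosen for illustration; each implicit instance defines one "piece" of the type-level function from the argument types to the result type.

```scala
// Hypothetical type class: "adding an A and a B yields an Out".
trait CanAdd[A, B, Out] { def apply(a: A, b: B): Out }

object CanAdd {
  // (Int, Int) -> Int
  implicit val intInt: CanAdd[Int, Int, Int] =
    new CanAdd[Int, Int, Int] { def apply(a: Int, b: Int): Int = a + b }
  // (Int, Double) -> Double
  implicit val intDouble: CanAdd[Int, Double, Double] =
    new CanAdd[Int, Double, Double] { def apply(a: Int, b: Double): Double = a + b }
}

// Out is not chosen by the caller; it is determined by the implicit that is found.
def add[A, B, Out](a: A, b: B)(implicit ev: CanAdd[A, B, Out]): Out = ev(a, b)

val i: Int = add(1, 2)
val d: Double = add(1, 2.5)
```

A call such as add("a", 1) does not compile, because no CanAdd[String, Int, _] instance is defined.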
LinkedListLike
In the class definition, we see something like this:
TreeMap[A, B] extends SortedMap[A, B] with SortedMapLike[A, B, TreeMap[A, B]]
This is Scala's way to express the so-called F-bounded polymorphism.
The motivation is as follows: Suppose we have a dozen (or at least two...) implementations of the trait SortedMap[A, B]. Now we want to implement a method withoutHead, it could look
somewhat like this:
def withoutHead = this.remove(this.head)
If we move the implementation into SortedMap[A, B] itself, the best we can do is this:
def withoutHead: SortedMap[A, B] = this.remove(this.head)
But is this the most specific result type we can get? No, that's too vague.
We would like to return TreeMap[A, B] if the original map is a TreeMap, and
CrazySortedLinkedHashMap (or whatever...) if the original was a CrazySortedLinkedHashMap.
This is why we move the implementation into SortedMapLike, and give the following signature to the withoutHead method:
trait SortedMapLike[A, B, Repr <: SortedMap[A, B]] {
...
def withoutHead: Repr = this.remove(this.head)
}
Now, because TreeMap[A, B] extends SortedMapLike[A, B, TreeMap[A, B]], the result type of
withoutHead is TreeMap[A,B]. The same holds for CrazySortedLinkedHashMap: we get the exact type back. In Java, you would either have to return SortedMap[A, B] or override the method in each subclass (which turned out to be a maintenance nightmare for the feature-rich traits in Scala).
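The same trick in a self-contained form, as a sketch with hypothetical names (QueueLike, PlainQueue, NamedQueue and build are not part of the real collection library):

```scala
// Hypothetical names throughout; this is not the real collection hierarchy.
trait QueueLike[A, Repr <: QueueLike[A, Repr]] {
  def elems: List[A]
  // Each concrete class says how to rebuild itself from a list of elements.
  protected def build(elems: List[A]): Repr
  // Implemented once here, yet it returns the concrete subtype.
  def withoutHead: Repr = build(elems.tail)
}

final class PlainQueue[A](val elems: List[A]) extends QueueLike[A, PlainQueue[A]] {
  protected def build(es: List[A]): PlainQueue[A] = new PlainQueue(es)
}

final class NamedQueue[A](val name: String, val elems: List[A])
    extends QueueLike[A, NamedQueue[A]] {
  protected def build(es: List[A]): NamedQueue[A] = new NamedQueue(name, es)
}

// No casts are needed: the static types are the concrete classes.
val plain: PlainQueue[Int] = new PlainQueue(List(1, 2, 3)).withoutHead
val named: NamedQueue[Int] = new NamedQueue("jobs", List(1, 2, 3)).withoutHead
```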
GenericTraversableTemplate
The type is: GenericTraversableTemplate[+A, +CC[X] <: GenTraversable[X]]
As far as I can tell, this is just a trait that provides implementations of
methods that somehow return regular collections with same container type but
possibly different content type (stuff like flatten, transpose, unzip).
Stuff like foldLeft, reduce, exists are not here because these methods care only about content type, not container type.
Stuff like flatMap is not here, because the container type can change (again, CBF's).
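A short illustration of such methods, using only the standard library: the container type (Vector or List) is preserved while the element type changes.

```scala
val nested = Vector(Vector(1, 2), Vector(3, 4))

// Same container (Vector), different content type.
val flat: Vector[Int] = nested.flatten
val transposed: Vector[Vector[Int]] = nested.transpose

// unzip on a List yields two Lists.
val (numbers, letters) = List(1 -> "a", 2 -> "b").unzip
```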
Why is it a separate trait, is there a fundamental reason why it exists?
I don't think so... It would probably be possible to group the gazillion methods somewhat differently. But this is just what happens naturally: you start to implement a trait, and it turns out that it has very many methods. So instead you group loosely related methods, put them into 10 different traits with awkward names like "GenericTraversableTemplate", and then mix them all into the traits/classes where you need them...
GenericCompanion
This is just an abstract class that implements some basic functionality which is common
for companion objects of most collection classes (essentially, it just implements very
simple factory methods apply(varargs) and empty).
For example there is method apply that takes varargs of some type A and returns a collection of type CC[A]:
Array(1, 2, 3, 4) // calls Array.apply[A](elems: A*) on the companion object
List(1, 2, 3, 4) // same for List
The implementation is very simple, it's something like this:
def apply[A](varargs: A*): CC[A] = {
val builder = newBuilder[A]
for (arg <- varargs) builder += arg
builder.result()
}
This is obviously the same for Arrays and Lists and TreeMaps and almost everything else, except 'constrained irregular collections' like BitSet. So this is just common functionality in a common ancestor class of most companion objects. Nothing special about that.
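The idea can be reproduced in a few lines. In this sketch SimpleCompanion, MyList and MyVector are hypothetical names; only List.newBuilder, Vector.newBuilder and scala.collection.mutable.Builder come from the standard library.

```scala
import scala.collection.mutable

// Hypothetical stand-in for the shared ancestor of companion objects.
abstract class SimpleCompanion[CC[_]] {
  // The only thing a concrete companion has to provide.
  def newBuilder[A]: mutable.Builder[A, CC[A]]

  // Factory methods written once for every collection type.
  def empty[A]: CC[A] = newBuilder[A].result()
  def apply[A](elems: A*): CC[A] = {
    val builder = newBuilder[A]
    builder ++= elems
    builder.result()
  }
}

object MyList extends SimpleCompanion[List] {
  def newBuilder[A]: mutable.Builder[A, List[A]] = List.newBuilder[A]
}

object MyVector extends SimpleCompanion[Vector] {
  def newBuilder[A]: mutable.Builder[A, Vector[A]] = Vector.newBuilder[A]
}
```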
SeqFactory
Similar to GenericCompanion, but this time more specifically for Sequences.
Adds some common factory methods like fill() and iterate() and tabulate() etc.
Again, nothing particularly rocket-scientific here...
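For reference, these factory methods are used as follows (standard library calls only):

```scala
// fill: the same element repeated n times.
val filled = List.fill(3)("x")

// tabulate: element i is computed from its index.
val squares = Vector.tabulate(4)(i => i * i)

// iterate: start value, length, and the function applied repeatedly.
val powers = List.iterate(1, 5)(_ * 2)
```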
Few general remarks
In general: I don't think that one should attempt to understand every single trait in this library. Rather, one should try to look at the library as a whole. As a whole, it has a very interesting architecture. And in my personal opinion, it's actually a very aesthetic piece of software, but one has to stare at it for quite a while (and try to re-implement the whole architectural pattern several times) to grasp it. On the other hand, CBFs for example are a kind of "design pattern" that clearly should be eliminated in successors of this language. The whole story with the scope of implicit CBFs still seems like a total nightmare to me. But many things seemed completely inscrutable at first, and almost always it ended with an epiphany (which is very specific to Scala: in the majority of other languages, such struggles usually end with the thought "Author of this is a complete idiot").

How to abstract the data and control to support easy extensibility?

I have a case:
There is SomeNode, which is composed of different basic parts, say of types A, B and C. There is also a transformation function that transforms an instance of SomeNode into another SomeNode.
However, other parts can be added to SomeNode in addition to A, B and C, so there might be a D as the fourth part of SomeNode. The transformation function's interface might then also need to change to account for the newly added component, although some of the logic might be shared.
So I have been wondering: what is a good design for abstracting SomeNode and its transformation function to allow easy extensibility? Using traits? How? Are there any examples for inspiration?
Thanks,
For managing modularity, check out the Cake pattern (aka the Bakery of Doom pattern). There have been some interesting articles on the topic:
Real-World Scala: Dependency Injection (DI)
Dependency injection vs. Cake pattern
Existential Types FTW
Pushing the envelope on OO and functional with Scala
For managing transformations and traits, Scala compiler is the best example I can think of.
Maybe I'm missing something, but aren't you looking for inheritance?
class SomeNode(val a: Int, val b: String, val c: Char)
class SpecialNode(from: SomeNode, val d: List[Int]) extends
SomeNode(from.a, from.b, from.c)
SpecialNode shares the fields of SomeNode and adds a new field d. The transformation looks something like this (of course it can be more complex):
def transform(node: SomeNode) = new SpecialNode(node, Nil)
I'm hesitant to write this at the expense of showing my noob identity, but based on keywords used in the question, it sounds like 'OOP fundamentals' are what is being asked for.
"How to abstract the data and control to support easy extensibility?" (Generalization)
"There is SomeNode, which is composed by different basic parts:" (Aggregation)
"There is also a transformation function that will transform an instance of SomeNode to another SomeNode." (Polymorphism)
"However, there can be some other parts added to SomeNode, in addition to A,B,C, ... function's interface might also need to change accordingly for the newly added component SomeNode, but there might be some same logic shared. " (Interface)
"Then I have been wondering, what's a good design to abstract SomeNode and its transformation function for easy extensibility?" (Inheritance)
Based on my interpretation of the question, I would recommend researching S.O.L.I.D. design principles http://www.youtube.com/watch?NR=1&v=05jVWgKZ6MY&feature=endscreen