In Scala, is it possible to have a Set with exactly two elements?

In the end, I want to have a case class Swap so that Swap(a, b) == Swap(b, a).
I thought I could use a Set of two elements, and it almost does the job:
scala> case class Swap(val s:Set[Int])
defined class Swap
scala> Swap(Set(2, 1)) == Swap(Set(1, 2))
res0: Boolean = true
But this allows any number of elements, and I would like to limit it to exactly two. I found the class Set.Set2, which is the default implementation of an immutable Set with two elements, but it doesn't work the way I tried it, nor do variations of it:
scala> val a = Set(2, 1)
a: scala.collection.immutable.Set[Int] = Set(2, 1)
scala> a.getClass
res3: Class[_ <: scala.collection.immutable.Set[Int]] = class scala.collection.immutable.Set$Set2
scala> case class Swap(val s:Set.Set2[Int])
defined class Swap
scala> val swp = Swap(a)
<console>:10: error: type mismatch;
found : scala.collection.immutable.Set[Int]
required: Set.Set2[Int]
val swp = Swap(a)
^
So my questions are:
Is there a way to use Set2 the way I tried?
Is there a better way to implement my case class Swap? I read that one shouldn't override equals in a case class, though that was my first idea.

This is a generic implementation:
import scala.collection.immutable.Set.Set2
// Note: Set(a, b) is a Set1 when a == b, so this cast throws a ClassCastException in that case
def set2[T](a: T, b: T): Set2[T] = Set(a, b).asInstanceOf[Set2[T]]
case class Swap[T](s: Set2[T])
Swap(set2(1,2)) == Swap(set2(2,1)) // true
Your solution didn't work because of the signature of Set.apply:
def apply[A](elems: A*): Set[A]
With two distinct elements the concrete runtime class will be Set2, but the compiler doesn't know that, so you have to cast the result to Set2.
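If the unchecked cast is a concern, a pattern match avoids the ClassCastException when both arguments are equal. This is only a minimal sketch: the names Set2Example and set2Option are made up for illustration, and it still relies on the implementation detail that Set(a, b) with two distinct elements is backed by Set2.

```scala
import scala.collection.immutable.Set.Set2

object Set2Example {
  // Hypothetical helper: Some(set) when the two elements are distinct,
  // None when they collapse into a one-element set.
  def set2Option[T](a: T, b: T): Option[Set2[T]] = Set(a, b) match {
    case s: Set2[T] => Some(s)
    case _          => None
  }

  case class Swap[T](s: Set2[T])
}
```

With this, Set2Example.set2Option(1, 2).map(Set2Example.Swap(_)) gives an Option of Swap, and set2Option(1, 1) is None instead of failing at runtime.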

You can always hide the implementation details of Swap, in this case you actually should.
You could implement it using Set or you could implement it as:
// invariant a <= b
class Swap private (val a: Int, val b: Int)
object Swap {
def apply(a: Int, b: Int): Swap =
if (a <= b) new Swap(a, b) else new Swap(b, a)
}
Unfortunately you have to use a plain class here and implement equals, hashCode etc. yourself, as we cannot get rid of the scalac auto-generated apply of a case class (see the related SO Q/A).
You also have to make all functions on Swap maintain that invariant.
Then the equals comparison is essentially this.a == other.a && this.b == other.b; we don't need to care about swapping anymore.
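Spelled out, the class with the a <= b invariant and hand-written equals and hashCode might look like the following. This is a sketch only; the wrapper object SwapExample and the toString format are my own additions.

```scala
object SwapExample {
  // Invariant: a <= b. The constructor is private, so instances can only be
  // created through the companion's apply, which enforces the invariant.
  final class Swap private (val a: Int, val b: Int) {
    override def equals(other: Any): Boolean = other match {
      case that: Swap => a == that.a && b == that.b
      case _          => false
    }
    // Must be consistent with equals
    override def hashCode: Int = (a, b).hashCode
    override def toString: String = s"Swap($a, $b)"
  }

  object Swap {
    // Normalizes the argument order before constructing
    def apply(x: Int, y: Int): Swap =
      if (x <= y) new Swap(x, y) else new Swap(y, x)
  }
}
```

Because both Swap(1, 2) and Swap(2, 1) are stored as (1, 2), they compare equal and hash identically, so they also behave as one key in a Set or Map.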

The problem is that you don't know statically that a is a Set2 - as far as the compiler is concerned you called Set(as: A*) and got back some kind of Set.
You could use shapeless sized collections to enforce a statically known collection size.

Related

Shapeless - turn a case class into another with fields in different order

I'm thinking of doing something similar to Safely copying fields between case classes of different types but with reordered fields, i.e.
case class A(foo: Int, bar: Int)
case class B(bar: Int, foo: Int)
And I'd like to have something to turn an A(3, 4) into a B(4, 3). Shapeless' LabelledGeneric comes to mind; however,
LabelledGeneric[B].from(LabelledGeneric[A].to(A(12, 13)))
results in
<console>:15: error: type mismatch;
found : shapeless.::[shapeless.record.FieldType[shapeless.tag.##[Symbol,String("foo")],Int],shapeless.::[shapeless.record.FieldType[shapeless.tag.##[Symbol,String("bar")],Int],shapeless.HNil]]
(which expands to) shapeless.::[Int with shapeless.record.KeyTag[Symbol with shapeless.tag.Tagged[String("foo")],Int],shapeless.::[Int with shapeless.record.KeyTag[Symbol with shapeless.tag.Tagged[String("bar")],Int],shapeless.HNil]]
required: shapeless.::[shapeless.record.FieldType[shapeless.tag.##[Symbol,String("bar")],Int],shapeless.::[shapeless.record.FieldType[shapeless.tag.##[Symbol,String("foo")],Int],shapeless.HNil]]
(which expands to) shapeless.::[Int with shapeless.record.KeyTag[Symbol with shapeless.tag.Tagged[String("bar")],Int],shapeless.::[Int with shapeless.record.KeyTag[Symbol with shapeless.tag.Tagged[String("foo")],Int],shapeless.HNil]]
LabelledGeneric[B].from(LabelledGeneric[A].to(A(12, 13)))
^
How do I reorder the fields in the record (?) so this can work with a minimum of boilerplate?
I should leave this for Miles but it's happy hour where I'm from and I can't resist. As he points out in a comment above, the key is ops.hlist.Align, which will work just fine for records (which are just special hlists, after all).
If you want a nice syntax, you need to use a trick like the following for separating the type parameter list with the target (which you want to provide explicitly) from the type parameter list with all the other stuff (which you want to be inferred):
import shapeless._, ops.hlist.Align
class SameFieldsConverter[T] {
def apply[S, SR <: HList, TR <: HList](s: S)(implicit
genS: LabelledGeneric.Aux[S, SR],
genT: LabelledGeneric.Aux[T, TR],
align: Align[SR, TR]
) = genT.from(align(genS.to(s)))
}
def convertTo[T] = new SameFieldsConverter[T]
Then, given these case classes:
case class A(foo: Int, bar: Int)
case class B(bar: Int, foo: Int)
And then:
scala> convertTo[B](A(12, 13))
res0: B = B(13,12)
Note that finding alignment instances will get expensive at compile time for large case classes.
As @MilesSabin (godlike shapeless creator) noticed, there is an align operation. It is used like this:
import shapeless._, ops.hlist.Align
val aGen = LabelledGeneric[A]
val bGen = LabelledGeneric[B]
val align = Align[aGen.Repr, bGen.Repr]
bGen.from(align(aGen.to(A(12, 13)))) //> res0: B = B(13,12)
P.S. Noticed that there is an example on GitHub.

Scala default Set Implementation

I can see from the Scala documentation that scala.collection.immutable.Set is only a trait. Which of the Set implementations is used by default? HashSet or TreeSet (or something else)?
I would like to know/plan the running time of certain functions.
Example:
scala> val s = Set(1,3,6,2,7,1)
res0: scala.collection.immutable.Set[Int] = Set(1, 6, 2, 7, 3)
What would be the running time of s.find(5), O(1) or O(log(n))?
Since the same applies to Map, what is the best way to figure this out?
By looking at the source code, you can find that sets up to four elements have an optimized implementation provided by EmptySet, Set1, Set2, Set3 and Set4, which simply hold the single values.
For example here's Set2 declaration (as of scala 2.11.4):
class Set2[A] private[collection] (elem1: A, elem2: A) extends AbstractSet[A] with Set[A] with Serializable
And here's the contains implementation:
def contains(elem: A): Boolean =
elem == elem1 || elem == elem2
or the find implementation
override def find(f: A => Boolean): Option[A] = {
if (f(elem1)) Some(elem1)
else if (f(elem2)) Some(elem2)
else None
}
Very straightforward.
For sets with more than 4 elements, the underlying implementation is a HashSet. We can easily verify this in the REPL:
scala> Set(1, 2, 3, 4).getClass
res1: Class[_ <: scala.collection.immutable.Set[Int]] = class scala.collection.immutable.Set$Set4
scala> Set(1, 2, 3, 4, 5, 6).getClass
res0: Class[_ <: scala.collection.immutable.Set[Int]] = class scala.collection.immutable.HashSet$HashTrieSet
That being said, find takes an arbitrary predicate and has to test the elements one by one until one matches, so in the worst case it iterates over the whole HashSet and is O(n).
Conversely, a lookup operation like contains is effectively O(1).
Here's a more in-depth reference about the performance of Scala collections in general.
Speaking of Map, pretty much the same concepts apply. There are optimized Map implementations for up to 4 elements, and beyond that it's a HashMap.
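To make the difference between find and contains concrete, here is a small sketch using the set from the question. The object name SetLookupExample and the call counter are illustrative only; the counter shows how often the predicate runs when nothing matches.

```scala
object SetLookupExample {
  // The duplicate 1 is dropped, leaving 5 elements (so a HashSet is used)
  val s: Set[Int] = Set(1, 3, 6, 2, 7, 1)

  // contains looks the element up directly
  val hasSix: Boolean = s.contains(6)

  // find takes a predicate and tests elements until one matches;
  // which even element comes first depends on the iteration order
  val someEven: Option[Int] = s.find(_ % 2 == 0)

  // When no element matches, the predicate is evaluated once per element
  var predicateCalls = 0
  val missing: Option[Int] = s.find { x => predicateCalls += 1; x == 5 }
}
```

So if the goal is a membership test, s.contains(5) is the operation to use; s.find(_ == 5) gives the same answer but walks the elements.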

Comparing None's of Different Types

Why do Nones of different Option types compare as equal?
scala> val x: Option[String] = None
x: Option[String] = None
scala> val y: Option[Int] = None
y: Option[Int] = None
Both x and y are Options of separate types, yet their Nones equal each other.
scala> x == y
res0: Boolean = true
Why?
Without looking at the actual implementation, I would assume that None is actually a case object. Hence, there is exactly one None in your memory. So both are the same thing. And identity obviously implies equality.
As for why you can actually compare the two: this is due to Scala's subtyping. Every object has an equals method, and this method is what you are using with the == operator.
Edit: I found the implementation on GitHub. You can see that None is indeed a case object. There is also no explicit equals() method, so you are using the one that is automatically generated for case classes. Hence the comment below about type erasure also applies to your Some() case.
And 1 == "string" returns false. Standard equality check in Scala is not type safe, i.e:
final def ==(arg0: Any): Boolean
If you want type safety, use === with the Equal type class from Scalaz.
There are really two questions here:
Why does x == y typecheck in your REPL?
Why do they equal each other?
x == y compiles because == in Scala is not type safe:
scala> "x" == 1
res4: Boolean = false
So why do they equal each other? An Option in Scala is conceptually similar to an Algebraic Data Type in Haskell:
data Maybe a = Nothing | Just a
But if you look at the Option.scala source, you'll see that an Option is defined (simplifying somewhat) as:
sealed abstract class Option[+A] extends Product with Serializable
final case class Some[+A](x: A) extends Option[A]
case object None extends Option[Nothing]
On the Some side, you can see a case class with parameterized type +A - so a someish Option[Int] becomes Some[Int].
However, on the None side, you see an Option[Nothing] object, so a noneish Option[Int] and a noneish Option[String] both become an Option[Nothing] object, and hence equal each other.
As @TravisBrown points out, Scalaz catches this much earlier, at compile time:
scala> import scalaz._
scala> import Scalaz._
scala> val x: Option[String] = None
scala> val y: Option[Int] = None
scala> x === y
<console>:16: error: could not find implicit value for parameter F0: scalaz.Equal[Object]
x === y
scala> val z: Option[String] = None
scala> x === z
res3: Boolean = true
Option is (roughly speaking) implemented like this:
trait Option[+A]
case class Some[A](value: A) extends Option[A]
case object None extends Option[Nothing]
Because None is an object (a singleton in Java terms), there's only one instance of it. And because Nothing is at the very bottom of the type hierarchy, meaning it's a subtype of EVERY type in Scala (even Null), that one None is a valid value of Option[A] for every A, so the two Nones are the very same object.
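The singleton argument can be checked directly with reference equality. A minimal sketch (the object name NoneEqualityExample is made up); it calls equals and eq explicitly rather than ==, so it does not depend on how lenient the compiler is about comparing differently typed Options.

```scala
object NoneEqualityExample {
  val x: Option[String] = None
  val y: Option[Int] = None

  // eq is reference equality: both values point at the single None object
  val sameObject: Boolean = x eq y

  // equals is what == delegates to for non-null references
  val equalByEquals: Boolean = x.equals(y)

  // Two Somes wrapping different values are different objects and not equal
  val someDiffers: Boolean = Some(1).equals(Some("1"))
}
```

Here sameObject and equalByEquals are both true, while someDiffers is false because the wrapped values 1 and "1" are not equal.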

Orderings and TreeMap

I'm building a MultiSet[A] and using a TreeMap[A, Int] to keep track of the elements.
class MultiSet[A <: Ordered[A] ](val tm: TreeMap[A, Int]) { ... }
Now I want to create a MultiSet[Int] using this framework. In particular, I want a method that will take a Vector[Int] and produce a TreeMap[Int, Int] that I can use to make a MultiSet[Int].
I wrote the following vectorToTreeMap, which compiles without complaint.
def vectorToTreeMap[A <: Ordered[A]](elements: Vector[A]): TreeMap[A, Int] =
elements.foldLeft(new TreeMap[A, Int]())((tm, e) => tm.updated(e, tm.getOrElse(e, 0) + 1))
But when I try
val tm: TreeMap[Int, Int] = vectorToTreeMap(Vector(1, 2, 3))
I get compiler complaints saying that Int doesn't conform to A <: Ordered[A]. What does it take to create a TreeMap[Int, Int] in this context? (I want the more general case because the MultiSet[A] is not always MultiSet[Int].)
I also tried A <: scala.math.Ordered[A] and A <: Ordering[A] but with no better results. (I'll admit that I don't understand the differences among the three possibilities and whether it matters in this situation.)
Thanks for your help.
The problem is that Int is an alias for the Java int, which does not implement Ordered[Int]. How could it, since Java does not even know that the Ordered[T] trait exists?
There are two ways to solve your problem:
View bounds:
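The first approach is to change the constraint <: to a view bound <%. (Note that view bounds have since been deprecated and no longer compile in Scala 3, so for new code the type class approach is the safer choice.)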
def vectorToTreeMap[A <% Ordered[A]](elements: Vector[A]): TreeMap[A, Int] =
elements.foldLeft(new TreeMap[A, Int]())((tm, e) => tm.updated(e, tm.getOrElse(e, 0) + 1))
A <: Ordered[A] means that the method vectorToTreeMap is only defined for types that directly implement Ordered[A], which excludes Int.
A <% Ordered[A] means that the method vectorToTreeMap is defined for all types that "can be viewed as" implementing Ordered[A], which includes Int because there is an implicit conversion defined from Int to Ordered[Int]:
scala> implicitly[Int => Ordered[Int]]
res7: Int => Ordered[Int] = <function1>
Type classes
The second approach is to not require any (direct or indirect) inheritance relationship for the type A, but just require that there exists a way to order instances of type A.
Basically you always require an ordering to be able to create a TreeMap from a vector, but to avoid having to pass it every single time you call the method you make the ordering an implicit parameter.
def vectorToTreeMap[A](elements: Vector[A])(implicit ordering:Ordering[A]): TreeMap[A, Int] =
elements.foldLeft(new TreeMap[A, Int]())((tm, e) => tm.updated(e, tm.getOrElse(e, 0) + 1))
It turns out that there are instances of Ordering[A] for all java primitive types as well as for String, as you can see with the implicitly method in the scala REPL:
scala> implicitly[Ordering[Int]]
res8: Ordering[Int] = scala.math.Ordering$Int$@5b748182
Scala is even able to derive orderings for composite types. For example if you have a Tuple where there exists an ordering for each element type, scala will automatically provide an ordering for the tuple type as well:
scala> implicitly[Ordering[(Int, Int)]]
res9: Ordering[(Int, Int)] = scala.math.Ordering$$anon$11@66d51003
The second approach of using so-called type classes is much more flexible. For example, if you want a tree of plain old ints, but with reverse order, all you have to do is to provide a reverse int ordering either directly or as an implicit val.
This approach is also very common in idiomatic scala. So there is even special syntax for it:
def vectorToTreeMap[A : Ordering](elements: Vector[A]): TreeMap[A, Int] = ???
is equivalent to
def vectorToTreeMap[A](elements: Vector[A])(implicit ordering:Ordering[A]): TreeMap[A, Int] = ???
It basically means that you want the method vectorToTreeMap defined only for types for which an ordering exists, but you do not care about giving the ordering a name. Even with the short syntax you can use vectorToTreeMap with an implicitly resolved Ordering[A], or pass an Ordering[A] explicitly.
The second approach has two big advantages:
it allows you to define functionality for types you do not "own".
it allows you to decouple the behavior regarding some aspect, e.g. ordering, from the type itself, whereas with the inheritance approach you couple the behavior to the type. For example, you can have a normal Ordering and a caseInsensitiveOrdering for a String. But if you let String extend Ordered, you must decide on one ordering behavior.
That is why the second approach is used in the scala collections itself to provide an ordering for TreeMap.
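Putting the type class version to work on the original problem, here is a sketch that builds a TreeMap[Int, Int] both with the default ordering and with an explicitly passed reverse ordering. The wrapper object MultiSetExample and the sample data are assumptions for illustration.

```scala
import scala.collection.immutable.TreeMap

object MultiSetExample {
  // Counts occurrences; the implicit Ordering[A] is picked up by TreeMap.empty
  def vectorToTreeMap[A](elements: Vector[A])(implicit ordering: Ordering[A]): TreeMap[A, Int] =
    elements.foldLeft(TreeMap.empty[A, Int])((tm, e) => tm.updated(e, tm.getOrElse(e, 0) + 1))

  // Uses the implicit Ordering[Int] from the standard library
  val counts: TreeMap[Int, Int] = vectorToTreeMap(Vector(3, 1, 2, 3))

  // Same data, but with the ordering passed explicitly
  val reversed: TreeMap[Int, Int] = vectorToTreeMap(Vector(3, 1, 2, 3))(Ordering[Int].reverse)
}
```

The keys of counts iterate as 1, 2, 3 and those of reversed as 3, 2, 1, while the counts per key are the same in both maps.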
Edit: here is an example to provide an ordering for a type that does not have one:
scala> case class Person(name:String, surname:String)
defined class Person
scala> implicitly[Ordering[Person]]
<console>:10: error: No implicit Ordering defined for Person.
implicitly[Ordering[Person]]
^
Case classes do not have orderings automatically defined. But we can easily define one:
scala> :paste
// Entering paste mode (ctrl-D to finish)
case class Person(name:String, surname:String)
object Person {
// just convert to a tuple, which is ordered by the individual elements
val nameSurnameOrdering : Ordering[Person] = Ordering.by(p => (p.name, p.surname))
// make the nameSurnameOrdering the default that is in scope unless something else is specified
implicit def defaultOrdering: Ordering[Person] = nameSurnameOrdering
}
// Exiting paste mode, now interpreting.
defined class Person
defined module Person
scala> implicitly[Ordering[Person]]
res1: Ordering[Person] = scala.math.Ordering$$anon$9@50148190
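Once an implicit Ordering[Person] lives in the companion object, methods such as sorted pick it up automatically, and a different ordering can still be passed explicitly. A short sketch; the wrapper object PersonOrderingExample and the sample people are invented for illustration.

```scala
object PersonOrderingExample {
  case class Person(name: String, surname: String)

  object Person {
    // Default ordering: by name, then by surname (via the tuple ordering)
    implicit val defaultOrdering: Ordering[Person] =
      Ordering.by((p: Person) => (p.name, p.surname))
  }

  val people = List(Person("Bob", "Smith"), Person("Alice", "Jones"), Person("Bob", "Adams"))

  // Uses the implicit ordering found in the companion object
  val byDefault: List[Person] = people.sorted

  // Overrides the default by passing another ordering explicitly
  val bySurname: List[Person] = people.sorted(Ordering.by((p: Person) => p.surname))
}
```

byDefault starts with Alice Jones followed by the two Bobs (Adams before Smith), whereas bySurname starts with Bob Adams.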

What is the Scala syntax for summing a List of objects?

For example
case class Blah(security: String, price: Double)
val myList = List(Blah("a", 2.0), Blah("b", 4.0))
val sum = myList.sum(_.price) // does not work
What is the syntax for obtaining the sum?
Try this:
val sum = myList.map(_.price).sum
Or alternately:
val sum = myList.foldLeft(0.0)(_ + _.price)
You appear to be trying to use this method:
def sum [B >: A] (implicit num: Numeric[B]): B
and the compiler can't figure out how the function you're providing is an instance of Numeric, because it isn't.
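For reference, the standard-library options side by side, as a self-contained sketch (the wrapper object SumExample is just for illustration). The iterator variant is an addition of mine that avoids building the intermediate list that map creates.

```scala
object SumExample {
  case class Blah(security: String, price: Double)

  val myList = List(Blah("a", 2.0), Blah("b", 4.0))

  // Project to the numeric field first, then sum
  val viaMap: Double = myList.map(_.price).sum

  // Accumulate directly, starting from 0.0
  val viaFold: Double = myList.foldLeft(0.0)(_ + _.price)

  // Same as viaMap, but lazily, without an intermediate List[Double]
  val viaIterator: Double = myList.iterator.map(_.price).sum
}
```

All three values are 6.0 for this list.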
Scalaz has this method under the name foldMap. The signature is (in pseudo-Scala):
def M[A].foldMap[B](f: A => B)(implicit F: Foldable[M], m: Monoid[B]): B
Usage:
scala> case class Blah(security: String, price: Double)
defined class Blah
scala> val myList = List(Blah("a", 2.0), Blah("b", 4.0))
myList: List[Blah] = List(Blah(a,2.0), Blah(b,4.0))
scala> myList.foldMap(_.price)
res11: Double = 6.0
B here doesn't have to be a numeric type. It can be any monoid. Example:
scala> myList.foldMap(_.security)
res12: String = ab
As an alternative to missingfaktor's Scalaz example, if you really want to sum a list of objects (as opposed to mapping each of them to a number and then summing those numbers), scalaz supports this as well.
This depends on the class in question having an instance of Monoid defined for it (which in practice means that it must have a Zero and a Semigroup defined). The monoid can be considered a weaker generalisation of core scala's Numeric trait specifically for summing; after all, if you can define a zero element and a way to add/combine two elements, then you have everything you need to get the sum of multiple objects.
Scalaz' logic is exactly the same as the way you'd sum integers manually - list.foldLeft(0) { _ + _ } - except that the Zero provides the initial zero element, and the Semigroup provides the implementation of + (called append).
It might look something like this:
import scalaz._
import Scalaz._
// Define Monoid for Blah
object Blah {
implicit def zero4Blah: Zero[Blah] = zero(Blah("", 0))
implicit def semigroup4Blah: Semigroup[Blah] = semigroup { (a, b) =>
// Decide how to combine security names - just append them here
Blah(a.security + b.security, a.price + b.price)
}
}
// Now later in your class
val myList = List(Blah("a", 2.0), Blah("b", 4.0))
val mySum = myList.asMA.sum
In this case mySum will actually be an instance of Blah equal to Blah("ab", 6.0), rather than just being a Double.
OK, for this particular example you don't really gain that much because getting a "sum" of the security names isn't very useful. But for other classes (e.g. if you had a quantity as well as a price, or multiple relevant properties) this can be very useful. Fundamentally it's great that if you can define some way of adding two instances of your class together, you can tell scalaz about it (by defining a Semigroup); and if you can define a zero element too, you can use that definition to easily sum collections of your class.
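The same idea can be sketched without Scalaz. The Monoid trait and the sumAll helper below are hand-rolled for illustration and are not the Scalaz API; they only show that a zero element plus an append operation is all that a fold needs.

```scala
object MonoidSketch {
  // Hypothetical minimal type class: a zero element and a way to combine two values
  trait Monoid[A] {
    def zero: A
    def append(x: A, y: A): A
  }

  case class Blah(security: String, price: Double)

  // Monoid instance for Blah: concatenate the names, add the prices
  implicit val blahMonoid: Monoid[Blah] = new Monoid[Blah] {
    val zero: Blah = Blah("", 0.0)
    def append(x: Blah, y: Blah): Blah =
      Blah(x.security + y.security, x.price + y.price)
  }

  // Sums any list whose element type has a Monoid instance in scope
  def sumAll[A](xs: List[A])(implicit m: Monoid[A]): A =
    xs.foldLeft(m.zero)(m.append)

  val total: Blah = sumAll(List(Blah("a", 2.0), Blah("b", 4.0)))
}
```

Here total is Blah("ab", 6.0), and summing an empty list simply yields the zero element.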