When is one Set less than another in Scala? - scala

I wanted to compare the cardinality of two sets in Scala. Since stuff sometimes "just work" in Scala, I tried using < between the sets. It seems to go through, but I can't make any sense out of the result.
Example:
scala> Set(1,2,3) < Set(1,4)
res20: Boolean = true
What does it return?
Where can I read about this method in the API?
Why isn't it listed anywhere under scala.collection.immutable.Set?
Update: Even the order(??) of the elements in the sets seem to matter:
scala> Set(2,3,1) < Set(1,3)
res24: Boolean = false
scala> Set(1,2,3) < Set(1,3)
res25: Boolean = true

This doesn't work with 2.8. On Scala 2.7, what happens is this:
scala.Predef.iterable2ordered(Set(1, 2, 3): Iterable[Int]) < (Set(1, 3, 2): Iterable[Int])
In other words, there's an implicit conversion defined on scala.Predef, which is "imported" for all Scala code, from an Iterable[A] to an Ordered[Iterable[A]], provided there's an implicit A => Ordered[A] available.
Given that the order of an iterable for sets is undefined, you can't really predict much about it. If you add elements to make the set size bigger than four, for instance, you'll get entirely different results.

If you want to compare the cardinality, just do so directly:
scala> Set(1, 2, 3).size < Set(2, 3, 4, 5).size
res0: Boolean = true

My knowledge of Scala is not extensive, but doing some test, I get the following:
scala> Set(1,2) <
<console>:5: error: missing arguments for method < in trait Ordered;
follow this method with `_' if you want to treat it as a partially applied function
Set(1,2) <
^
That tells me that < comes from the trait Ordered. More hints:
scala> Set(1,2) < _
res4: (Iterable[Int]) => Boolean = <function>
That is, the Set is evaluated into an Iterable, because maybe there is some implicit conversion from Iterable[A] to Ordered[Iterable[A]], but I'm not sure anymore... Tests are not consistent. For example, these two might suggest a kind of lexicographical compare:
scala> Set(1,2,3) < Set(1,2,4)
res5: Boolean = true
1 is equal, 2 is equal, 3 is less than 4.
scala> Set(1,2,4) < Set(1,2,3)
res6: Boolean = false
But these ones don't:
scala> Set(2,1) < Set(2,4)
res11: Boolean = true
scala> Set(2,1) < Set(2,2)
res12: Boolean = false
I think the correct answer is that found in the Ordered trait proper: There is no implementation for < between sets more than comparing their hashCode:
It is important that the hashCode method for an instance of Ordered[A] be consistent with the compare method. However, it is not possible to provide a sensible default implementation. Therefore, if you need to be able compute the hash of an instance of Ordered[A] you must provide it yourself either when inheiriting or instantiating.

Related

Lists in Scala - plus colon vs double colon (+: vs ::)

I am little bit confused about +: and :: operators that are available.
It looks like both of them gives the same results.
scala> List(1,2,3)
res0: List[Int] = List(1, 2, 3)
scala> 0 +: res0
res1: List[Int] = List(0, 1, 2, 3)
scala> 0 :: res0
res2: List[Int] = List(0, 1, 2, 3)
For my novice eye source code for both methods looks similar (plus-colon method has additional condition on generics with use of builder factories).
Which one of these methods should be used and when?
+: works with any kind of collection, while :: is specific implementation for List.
If you look at the source for +: closely, you will notice that it actually calls :: when the expected return type is List. That is because :: is implemented more efficiently for the List case: it simply connects the new head to the existing list and returns the result, which is a constant-time operation, as opposed to linear copying the entire collection in the generic case of +:.
+: on the other hand, takes CanBuildFrom, so you can do fancy (albeit, not looking as nicely in this case) things like:
val foo: Array[String] = List("foo").+:("bar")(breakOut)
(It's pretty useless in this particular case, as you could start with the needed type to begin with, but the idea is you can prepend and element to a collection, and change its type in one "go", avoiding an additional copy).

Scala syntax ## [duplicate]

What is the difference between methods ## and hashCode?
They seem to be outputting the same values no matter which class or hashCode overloading I use. Google doesn't help, either, as it cannot find symbol ##.
"Subclasses" of AnyVal do not behave properly from a hashing perspective:
scala> 1.0.hashCode
res14: Int = 1072693248
Of course this is boxed to a call to:
scala> new java.lang.Double(1.0).hashCode
res16: Int = 1072693248
We might prefer it to be:
scala> new java.lang.Double(1.0).##
res17: Int = 1
scala> 1.0.##
res15: Int = 1
We should expect this given that the int 1 is also the double 1. Of course this issue does not arise in Java. Without it, we'd have this problem:
Set(1.0) contains 1 //compiles but is false
Luckily:
scala> Set(1.0) contains 1
res21: Boolean = true
## was introduced because hashCode is not consistent with the == operator in Scala. If a == b then a.## == b.## regardless of the type of a and b (if custom hashCode implementations are correct). The same is not true for hashCode as can be seen in the examples given by other posters.
Just want to add to the answers of other posters that although the ## method strives to keep the contract between equality and hash codes, it is apparently not good enough in some cases, like when you are comparing doubles and longs (scala 2.10.2):
> import java.lang._
import java.lang._
> val lng = Integer.MAX_VALUE.toLong + 1
lng: Long = 2147483648
> val dbl = Integer.MAX_VALUE.toDouble + 1
dbl: Double = 2.147483648E9
> lng == dbl
res65: Boolean = true
> lng.## == dbl.##
res66: Boolean = false
> (lng.##, lng.hashCode)
res67: (Int, Int) = (-2147483647,-2147483648)
> (dbl.##, dbl.hashCode)
res68: (Int, Int) = (-2147483648,1105199104)
In addition to what everyone else said, I'd like to say that ## is null-safe, because null.## returns 0 whereas null.hashCode throws NullPointerException.
From scaladoc:
Equivalent to x.hashCode except for boxed numeric types and null. For numerics, it returns a hash value which is consistent with value equality: if two value type instances compare as true, then ## will produce the same hash value for each of them. For null returns a hashcode where null.hashCode throws a NullPointerException.

Scala comparison error

I am trying to compare an item from a List of type Strings to an integer. I tried doing this but I get an error saying that:
'value < is not a member of List[Int]'
The line of code that compares is something similar to this:
if(csvList.map(x => x(0).toInt) < someInteger)
Besides the point of why this happens, I wondered why I didn't get an error
when I used a different type of comparison, such as ' == '.
So if I run the line:
if( csvList.map(x => x(0).toInt) == someInteger)
I don't get an error. Why is that?
Let's start with some introductions before answering the questions
Using the REPL you can understand a bit more what you are doing
scala> List("1", "2", "3", "33").map(x => x(0).toInt)
res1: List[Int] = List(49, 50, 51, 51)
The map function is used to transform every element, so x inside the map will be "1" the first time, "2" the second, and so on.
When you are using x(0) you are accessing the first character in the String.
scala> "Hello"(0)
res2: Char = H
As you see the type after you have mapped your strings is a List of Int. And you can compare that with an Int, but it will never be equals.
scala> List(1, 2, 3) == 5
res0: Boolean = false
This is very much like in Java when you try
"Hello".equals(new Integer(1));
If you want to know more about the reasons behind the equality problem you can check out Why has Scala no type-safe equals method?
Last but not least, you get an error when using less than because there is no less than in the List class.
Extra:
If you want to know if the second element in the list is smaller than 2 you can do
scala> val data = List("1", "10", "20")
data: List[String] = List(1, 10, 20)
scala> 5 < data(1).toInt
res2: Boolean = true
Although it is a bit strange, maybe you should transform the list of string is something a bit more typed like a case class and then do your business logic with a more clear data model.
You can refer to
Why == operator and equals() behave differently for values of AnyVal in Scala
Every class support operator ==, but may not support <,> these operators.
in your code
csvList.map(x => x(0).toInt)
it returns a List<int>, and application use it to compare with a int,
so it may process a implicit type conversion. Even the compiler doesn't report it as a error. Generally, it's not good to compare value with different types.
csvList.map(x => x(0).toInt) converts the entire csvList to a List[Int] and then tries to apply the operator < to List[Int] and someInteger which does not exist. This is essentially what the error message is saying.
There is no error for == since this operator exists for List though List[T] == Int will always return false.
Perhaps what you are trying to do is compare each item of the List to an Int. If that is the case, something like this would do:
scala> List("1","2","3").map(x => x.toInt < 2)
res18: List[Boolean] = List(true, false, false)
The piece of code csvList.map(x => x(0).toInt) actually returns a List[Int], that is not comparable with a integer (not sure what it would mean to say that List(1,2) < 3).
If you want to compare each element of the list to your number, making sure they are all inferior to it, you would actually write if(csvList.map(x => x.toInt).forall { _ < someInteger })

Why does Slick require using three equal signs (===) for comparison?

I was reading through coming from SQL to Slick and it states to use === instead of == for comparison.
For example,
people.filter(p => p.age >= 18 && p.name === "C. Vogt").run
What is the difference between == and ===, and why is the latter used here?
== is defined on Any in Scala. Slick can't overload it for Column[...] types like it can for other operators. That's why slick needs a custom operator for equality. We chose === just like several other libraries, such as scalatest, scalaz, etc.
a == b will lead to true or false. It's a client-side comparison. a === b will lead to an object of type Column[Boolean], with an instance of Library.Equals(a,b) behind it, which Slick will compile to a server-side comparison using the SQL "a = b" (where a and b are replaced by the expressions a and b stand for).
== calls for equals, === is a custom defined method in slick which is used for column comparison:
def === [P2, R](e: Column[P2])(implicit om: o#arg[B1, P2]#to[Boolean, R]) =
om.column(Library.==, n, e.toNode)
The problem of using == for objects is this (from this question):
Default implementation of equals() class provided by java.lang.Object compares memory location and only return true if two reference variable are pointing to same memory location i.e. essentially they are same object.
What this means is that two variables must point to the same object to be equal, example:
scala> class A
defined class A
scala> new A
res0: A = A#4e931efa
scala> new A
res1: A = A#465670b4
scala> res0 == res1
res2: Boolean = false
scala> val res2 = res0
res2: A = A#4e931efa
scala> res2 == res0
res4: Boolean = true
In the first case == returns false because res0 and res1 point to two different objects, in the second case res2 is equal to res0 because they point to the same object.
In Slick columns are abstracted in objects, so having column1 == column2 is not what you are looking for, you want to check equality for the value a column hold and not if they point to the same object. Slick then probably translates that === in a value equality in the AST (Library.== is a SqlOperator("="), n is the left hand side column and e the right hand side), but Christopher can explain that better than me.
'==' compare value only and result in Boolean 'True' & 'False
'===' compare completely (i.e. compare value with its data types) and result in column
example
1=='1' True
1==='1' False

hashcode and ## result differ for Double,Float [duplicate]

What is the difference between methods ## and hashCode?
They seem to be outputting the same values no matter which class or hashCode overloading I use. Google doesn't help, either, as it cannot find symbol ##.
"Subclasses" of AnyVal do not behave properly from a hashing perspective:
scala> 1.0.hashCode
res14: Int = 1072693248
Of course this is boxed to a call to:
scala> new java.lang.Double(1.0).hashCode
res16: Int = 1072693248
We might prefer it to be:
scala> new java.lang.Double(1.0).##
res17: Int = 1
scala> 1.0.##
res15: Int = 1
We should expect this given that the int 1 is also the double 1. Of course this issue does not arise in Java. Without it, we'd have this problem:
Set(1.0) contains 1 //compiles but is false
Luckily:
scala> Set(1.0) contains 1
res21: Boolean = true
## was introduced because hashCode is not consistent with the == operator in Scala. If a == b then a.## == b.## regardless of the type of a and b (if custom hashCode implementations are correct). The same is not true for hashCode as can be seen in the examples given by other posters.
Just want to add to the answers of other posters that although the ## method strives to keep the contract between equality and hash codes, it is apparently not good enough in some cases, like when you are comparing doubles and longs (scala 2.10.2):
> import java.lang._
import java.lang._
> val lng = Integer.MAX_VALUE.toLong + 1
lng: Long = 2147483648
> val dbl = Integer.MAX_VALUE.toDouble + 1
dbl: Double = 2.147483648E9
> lng == dbl
res65: Boolean = true
> lng.## == dbl.##
res66: Boolean = false
> (lng.##, lng.hashCode)
res67: (Int, Int) = (-2147483647,-2147483648)
> (dbl.##, dbl.hashCode)
res68: (Int, Int) = (-2147483648,1105199104)
In addition to what everyone else said, I'd like to say that ## is null-safe, because null.## returns 0 whereas null.hashCode throws NullPointerException.
From scaladoc:
Equivalent to x.hashCode except for boxed numeric types and null. For numerics, it returns a hash value which is consistent with value equality: if two value type instances compare as true, then ## will produce the same hash value for each of them. For null returns a hashcode where null.hashCode throws a NullPointerException.