Scala Set with a Tuple of 3 Elements - scala

How can I add a tuple containing 3 elements to a Set?
Say, I have a Set of type:
Set[(String, String, String)]
How can I simply add 3 Strings to my Set?
The following makes the compiler complain:
set + ("a", "b", "c")
Why is the tuple treated differently? It is just like any other type, so why does it fail in the case above?

It doesn't parse the way you expect. Wrapping the tuple in an extra pair of parentheses works:
scala> Set[(String,String,String)]() + (("a", "b", "c"))
res3: scala.collection.immutable.Set[(String, String, String)] = Set((a,b,c))
What you wrote is parsed as Set.+(String x, String y, String z)
i.e., a function + with 3 string arguments, where what you wanted was a function + with a single 3-tuple as argument.
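As a minimal sketch of the point above, here are three ways to make sure + receives one 3-tuple rather than three separate arguments. The names empty, viaParens, triple, viaName and viaDot are made up for illustration.

```scala
// Sketch: three equivalent ways to pass ONE 3-tuple to Set.+
val empty = Set.empty[(String, String, String)]

// 1. Double parentheses: the outer pair is the argument list, the inner pair is the tuple.
val viaParens = empty + (("a", "b", "c"))

// 2. Bind the tuple to a name first, so no literal parentheses follow the operator.
val triple = ("a", "b", "c")
val viaName = empty + triple

// 3. Ordinary method-call syntax, again with an explicit tuple argument.
val viaDot = empty.+(("a", "b", "c"))
```

All three should produce the same one-element set; binding the tuple to a name is probably the easiest form to read.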

Note the signature of + for HashSet as an example:
def +(elem1: A, elem2: A, elems: A*): HashSet[A]
(from http://www.scala-lang.org/api/2.11.7/#scala.collection.immutable.HashSet)
This signature means that adding a tuple to a set of tuples requires double parentheses: the outer pair delimits the argument list of + and the inner pair forms the tuple. For example:
scala> import scala.collection.immutable.HashSet
import scala.collection.immutable.HashSet
scala> val set: Set[(String, String, String)] = new HashSet[(String, String, String)]()
set: Set[(String, String, String)] = Set()
scala> val newset = set + (("one", "two", "three"))
newset: scala.collection.immutable.Set[(String, String, String)] = Set((one,two,three))
This issue does not occur for sets of elements that are not bounded by parentheses since then there is no confusion with the syntax of +.

Related

Weird scala tuple behavior

I've noticed this behavior in Scala:
val list = List[(Int, Int)]()
val set = HashSet[(Int, Int)]()
scala> list :+ (1, 2)
res30: List[(Int, Int)] = List((1,2))
scala> list :+ (1 -> 2)
res31: List[(Int, Int)] = List((1,2))
scala> list :+ 1 -> 2
res32: List[(Int, Int)] = List((1,2))
// These work
// But the same does not work for a set
set += (1, 2)
<console>:14: error: type mismatch;
found : Int(2)
required: (Int, Int)
set += (1, 2)
// OK, maybe += on a set means "add all", so these should work
set += ((1, 2))
set += ((1, 2), (3,4))
// These worked
// But why do these work?
set += 1 -> 2
set += (1 -> 2)
set += ((1 -> 2))
Now I'm confused. Could you explain why a tuple is not a tuple?
scala> (4->5).getClass
res28: Class[_ <: (Int, Int)] = class scala.Tuple2
scala> (4,7).getClass
res29: Class[_ <: (Int, Int)] = class scala.Tuple2$mcII$sp
The parser stage -Xprint:parser gives
set.$plus$eq(1, 2)
which seems to resolve to
def += (elem1: A, elem2: A, elems: A*)
That is a method that accepts multiple arguments, so the compiler probably binds elem1 = 1 and elem2 = 2 instead of treating (1,2) as a tuple.
missingfaktor points to SLS 6.12.3 Infix Operations as the explanation:
The right-hand operand of a left-associative operator may consist of
several arguments enclosed in parentheses, e.g. e op (e1,…,en).
This expression is then interpreted as e.op(e1,…,en).
Now the operator += is left-associative because it does not end in :, and the right-hand operand of += consists of several arguments enclosed in parentheses (1,2). Therefore, by design, the compiler does not treat (1,2) as Tuple2.
I think the difference is that HashSet[T] defines two overloads for +=, one of which takes a single T, and the other takes multiple (as a T* params list). This is inherited from Growable[T], and shown here.
List[T].:+ can only take one T on the right hand side, which is why the compiler works out that it's looking at a tuple, not something that should be turned into a params list.
If you do set += ((1, 2)) then it compiles. Also, val tuple = (1,2); set += tuple works too.
See Mario Galic's answer for why, in the case of HashSet[T].+=, the compiler chooses the overload that can't type-check over the one that can.
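To pull the answers above together, here is a small sketch of the forms that should add exactly one pair to a mutable set. The names pairs and p are illustrative only.

```scala
import scala.collection.mutable

val pairs = mutable.HashSet[(Int, Int)]()

pairs += ((1, 2))   // extra parentheses: a single tuple argument
pairs += 3 -> 4     // -> builds the Tuple2 before += is applied
val p = (5, 6)
pairs += p          // a named tuple is never read as an argument list
pairs.add((7, 8))   // plain method call with an explicit tuple
```

In each case the right-hand side is already a single expression of type (Int, Int), so the infix rule quoted from the SLS has no parenthesized argument list to reinterpret.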

Spark RDD tuple transformation

I'm trying to transform an RDD of tuples of Strings from this format:
(("abc","xyz","123","2016-02-26T18:31:56"),"15") TO
(("abc","xyz","123"),"2016-02-26T18:31:56","15")
Basically, separating out the timestamp string as a separate tuple element. I tried the following, but it's still not clean and correct.
val result = rdd.map(r => (r._1.toString.split(",").toVector.dropRight(1).toString, r._1.toString.split(",").toList.last.toString, r._2))
However, it results in
(Vector(("abc", "xyz", "123"),"2016-02-26T18:31:56"),"15")
The expected output I'm looking for is
(("abc", "xyz", "123"),"2016-02-26T18:31:56","15")
This way I can access the elements using r._1, r._2 (the timestamp string) and r._3 in a separate map operation.
Any hints/pointers will be greatly appreciated.
Vector.toString will include the String 'Vector' in its result. Instead, use Vector.mkString(",").
Example:
scala> val xs = Vector(1,2,3)
xs: scala.collection.immutable.Vector[Int] = Vector(1, 2, 3)
scala> xs.toString
res25: String = Vector(1, 2, 3)
scala> xs.mkString
res26: String = 123
scala> xs.mkString(",")
res27: String = 1,2,3
However, if you want to be able to access (abc,xyz,123) as a Tuple and not as a string, you could also do the following:
val res = rdd.map{
case ((a:String,b:String,c:String,ts:String),d:String) => ((a,b,c),ts,d)
}
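The same pattern match can be tried without a Spark cluster. The sketch below uses a local List as a stand-in for the RDD, which is an assumption made purely for illustration; rows, reshaped and first are made-up names.

```scala
// Stand-in for the RDD: a local List with the same element shape (assumption).
val rows = List((("abc", "xyz", "123", "2016-02-26T18:31:56"), "15"))

// Destructure the nested 4-tuple and rebuild it as a 3-tuple plus two strings.
val reshaped = rows.map {
  case ((a, b, c, ts), d) => ((a, b, c), ts, d)
}

// Each element now has type ((String, String, String), String, String).
val first = reshaped.head
```

The elements are then reachable as first._1, first._2 (the timestamp) and first._3, which is the access pattern the question asks for.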

What is the structure that is only enclosed by parentheses in scala?

Here's the problem:
I intend to retrieve a (Int, Int) object from a function, but I don't know how to get the second element. I've tried the following commands so as to retrieve the second value, or convert it to a Seq or List, but with no luck.
scala> val s = (1,2)
s: (Int, Int) = (1,2)
scala> s(1)
<console>:9: error: (Int, Int) does not take parameters
s(1)
^
scala> val ss = List(s)
ss: List[(Int, Int)] = List((1,2))
scala> ss(0)
res10: (Int, Int) = (1,2)
Could anyone give me some idea? Thanks a lot!
val s = (1, 2)
is syntactic sugar and creates a Tuple2, or in other words is equivalent to new Tuple2(1, 2). You can access elements in tuples with
s._1 // => 1
s._2 // => 2
Likewise, (1, 2, 3) would create a Tuple3, which also has a method _3 to access the third element.
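Since the question also asks about converting the pair to a Seq or List, here is a short sketch of the usual options. The names second, a, b, asList and asAnyList are illustrative.

```scala
val s = (1, 2)

// Positional accessors are 1-based: _1, _2, ...
val second = s._2

// Destructuring binds both elements at once.
val (a, b) = s

// A tuple is not a Seq, but a List can be built from its elements.
val asList: List[Int] = List(s._1, s._2)

// productIterator also walks the elements, but types them as Any.
val asAnyList: List[Any] = s.productIterator.toList
```

Building the List by hand keeps the element type Int, whereas productIterator loses it, so the first form is usually preferable when all elements share a type.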

Trying to append to Iterable[String]

I'm trying to add another string to Iterable[String] for easy concatenation, but the result is not what I expect.
scala> val s: Iterable[String] = "one string" :: "two string" :: Nil
s: Iterable[String] = List(one string, two string)
scala> s.mkString(";\n")
res3: String =
one string;
two string
scala> (s ++ "three").mkString(";\n")
res5: String =
one string;
two string;
t;
h;
r;
e;
e
How should I rewrite this snippet to have 3 strings in my iterable?
Edit: I should add, that order of items should be preserved
++ is for collection aggregation. There is no method +, :+ or add in Iterable, but you can use method ++ like this:
scala> (s ++ Seq("three")).mkString(";\n")
res3: String =
one string;
two string;
three
The ++ function expects a Traversable argument. If you pass just "three", the String is converted to a sequence of characters and every character is appended to s. That's why you get this result.
Instead, you can wrap "three" in an Iterable and the concatenation should work correctly:
scala> (s ++ Iterable[String]("three")).mkString(";\n")
res6: String =
one string;
two string;
three
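As a compact, non-REPL version of the same idea, the sketch below wraps the new element in a Seq and checks that order is preserved. The names three and joined are made up for the example.

```scala
val s: Iterable[String] = "one string" :: "two string" :: Nil

// Wrap the new element in a collection so ++ appends it as a single item.
val three: Iterable[String] = s ++ Seq("three")

// The elements keep their order because s is backed by a List here.
val joined = three.mkString(";\n")
```

Any single-element collection should do as the right-hand side; Seq("three") is just the shortest to write.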
I like to use toBuffer and then +=
scala> val l : Iterable[Int] = List(1,2,3)
l: Iterable[Int] = List(1, 2, 3)
scala> val e : Iterable[Int] = l.toBuffer += 4
e: Iterable[Int] = ArrayBuffer(1, 2, 3, 4)
or in your example:
scala> (s.toBuffer += "three").mkString("\n")
I have no idea why this operation isn't supported in the standard library. You can also use toArray, but I would assume that is less performant if you add more than one element, since the buffer should return itself when another element is added.

Simple question about tuple of scala

I'm new to scala, and what I'm learning is tuple.
I can define a tuple as following, and get the items:
val tuple = ("Mike", 40, "New York")
println("Name: " + tuple._1)
println("Age: " + tuple._2)
println("City: " + tuple._3)
My question is:
How to get the length of a tuple?
Is tuple mutable? Can I modify its items?
Is there any other useful operation we can do on a tuple?
Thanks in advance!
1] tuple.productArity
2] No.
3] Some interesting operations you can perform on tuples: (a short REPL session)
scala> val x = (3, "hello")
x: (Int, java.lang.String) = (3,hello)
scala> x.swap
res0: (java.lang.String, Int) = (hello,3)
scala> x.toString
res1: java.lang.String = (3,hello)
scala> val y = (3, "hello")
y: (Int, java.lang.String) = (3,hello)
scala> x == y
res2: Boolean = true
scala> x.productPrefix
res3: java.lang.String = Tuple2
scala> val xi = x.productIterator
xi: Iterator[Any] = non-empty iterator
scala> while(xi.hasNext) println(xi.next)
3
hello
See scaladocs of Tuple2, Tuple3 etc for more.
One thing that you can also do with a tuple is to extract the content using the match expression:
def tupleview(tup: Any): Unit = {
tup match {
case (a: String, b: String) =>
println("A pair of strings: "+a + " "+ b)
case (a: Int, b: Int, c: Int) =>
println("A triplet of ints: "+a + " "+ b + " " +c)
case _ => println("Unknown")
}
}
tupleview( ("Hello", "Freewind"))
tupleview( (1,2,3))
Gives:
A pair of strings: Hello Freewind
A triplet of ints: 1 2 3
Tuples are immutable, but, like all case classes, they have a copy method that can be used to create a new Tuple with a few changed elements:
scala> (1, false, "two")
res0: (Int, Boolean, java.lang.String) = (1,false,two)
scala> res0.copy(_2 = true)
res1: (Int, Boolean, java.lang.String) = (1,true,two)
scala> res1.copy(_1 = 1f)
res2: (Float, Boolean, java.lang.String) = (1.0,true,two)
Concerning question 3:
A useful thing you can do with Tuples is to store parameter lists for functions:
def f(i:Int, s:String, c:Char) = s * i + c
List((3, "cha", '!'), (2, "bora", '.')).foreach(t => println((f _).tupled(t)))
//--> chachacha!
//--> borabora.
[Edit] As Randall remarks, you'd better use something like this in "real life":
def f(i:Int, s:String, c:Char) = s * i + c
val g = (f _).tupled
List((3, "cha", '!'), (2, "bora", '.')).foreach(t => println(g(t)))
In order to extract the values from tuples in the middle of a "collection transformation chain" you can write:
val words = List((3, "cha"),(2, "bora")).map{ case(i,s) => s * i }
Note the curly braces around the case, parentheses won't work.
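For comparison, here is a small sketch of the braces form next to an ordinary lambda that needs no pattern and therefore fits inside parentheses. The names pairs and sameWords are illustrative.

```scala
val pairs = List((3, "cha"), (2, "bora"))

// Pattern-matching function literal: written with curly braces.
val words = pairs.map { case (i, s) => s * i }

// Equivalent without a pattern: a plain lambda using the tuple accessors.
val sameWords = pairs.map(t => t._2 * t._1)
```

The braces form tends to read better once the tuple has more than two elements, because each element gets a name.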
Another nice trick regarding question 3 (as 1 and 2 are already answered by others):
val tuple = ("Mike", 40, "New York")
tuple match {
case (name, age, city) =>{
println("Name: " + name)
println("Age: " + age)
println("City: " + city)
}
}
Edit: in fact it's rather a feature of pattern matching and case classes, a tuple is just a simple example of a case class...
1. You know the size of a tuple, it's part of its type. For example, if you define a function def f(tup: (Int, Int)), you know the length of tup is 2 because values of type (Int, Int) (aka Tuple2[Int, Int]) always have a length of 2.
2. No.
3. Not really. Tuples are useful for storing a fixed number of items of possibly different types and passing them around, putting them into data structures etc. There's really not much you can do with them, other than creating tuples, and getting stuff out of tuples.
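Although the size is fixed by the type, it can still be read at runtime through the Product interface that tuples implement. A minimal sketch, with arity and city as made-up names:

```scala
val tuple = ("Mike", 40, "New York")

// The arity is determined by the static type (String, Int, String).
val arity = tuple.productArity

// Elements can also be read by zero-based index, but only typed as Any.
val city: Any = tuple.productElement(2)
```

The typed accessors _1, _2, _3 are usually the better choice; productElement is mainly useful in generic code that handles tuples of unknown arity.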
1 and 2 have already been answered.
A very useful thing that you can use tuples for is to return more than one value from a method or function. Simple example:
// Get the min and max of two integers
def minmax(a: Int, b: Int): (Int, Int) = if (a < b) (a, b) else (b, a)
// Call it and assign the result to two variables like this:
val (x, y) = minmax(10, 3) // x = 3, y = 10
Using shapeless, you easily get a lot of useful methods, that are usually available only on collections:
import shapeless.syntax.std.tuple._
val t = ("a", 2, true, 0.0)
val first = t(0)
val second = t(1)
// etc
val head = t.head
val tail = t.tail
val init = t.init
val last = t.last
val v = (2.0, 3L)
val concat = t ++ v
val append = t :+ 2L
val prepend = 1.0 +: t
val take2 = t take 2
val drop3 = t drop 3
val reverse = t.reverse
val zip = t zip (2.0, 2, "a", false)
val (unzip, other) = zip.unzip
val list = t.toList
val array = t.toArray
val set = t.to[Set]
Everything is typed as one would expect (that is first has type String, concat has type (String, Int, Boolean, Double, Double, Long), etc.)
The last method above (.to[Collection]) should be available in the next release (as of 2014/07/19).
You can also "update" a tuple
val a = t.updatedAt(1, 3) // gives ("a", 3, true, 0.0)
but that will return a new tuple instead of mutating the original one.