What does param: _* mean in Scala?

What does param: _* mean in Scala? - scala

Being new to Scala (2.9.1), I have a List[Event] and would like to copy it into a Queue[Event], but the following Syntax yields a Queue[List[Event]] instead:
val eventQueue = Queue(events)
For some reason, the following works:
val eventQueue = Queue(events : _*)
But I would like to understand what it does, and why it works? I already looked at the signature of the Queue.apply function:
def apply[A](elems: A*)
And I understand why the first attempt doesn't work, but what's the meaning of the second one? What is :, and _* in this case, and why doesn't the apply function just take an Iterable[A] ?

a: A is type ascription; see What is the purpose of type ascriptions in Scala?
: _* is a special instance of type ascription which tells the compiler to treat a single argument of a sequence type as a variable argument sequence, i.e. varargs.
It is completely valid to create a Queue using Queue.apply that has a single element which is a sequence or iterable, so this is exactly what happens when you give a single Iterable[A].

This is a special notation that tells the compiler to pass each element as its own argument, rather than all of it as a single argument. See here.
It is a type annotation that indicates a sequence argument and is mentioned as an "exception" to the general rule in section 4.6.2 of the language spec, "Repeated Parameters".
It is useful when a function takes a variable number of arguments, e.g. a function such as def sum(args: Int*), which can be invoked as sum(1), sum(1,2) etc. If you have a list such as xs = List(1,2,3), you can't pass xs itself, because it is a List rather than an Int, but you can pass its elements using sum(xs: _*).

For Python folks:
Scala's _* operator is more or less the equivalent of Python's *-operator.
Example
Converting the scala example from the link provided by Luigi Plinge:
def echo(args: String*) =
for (arg <- args) println(arg)
val arr = Array("What's", "up", "doc?")
echo(arr: _*)
to Python would look like:
def echo(*args):
for arg in args:
print "%s" % arg
arr = ["What's", "up", "doc?"]
echo(*arr)
and both give the following output:
What's
up
doc?
The Difference: unpacking positional parameters
While Python's *-operator can also deal with unpacking of positional parameters/parameters for fixed-arity functions:
def multiply (x, y):
return x * y
operands = (2, 4)
multiply(*operands)
8
Doing the same with Scala:
def multiply(x:Int, y:Int) = {
x * y;
}
val operands = (2, 4)
multiply (operands : _*)
will fail:
not enough arguments for method multiply: (x: Int, y: Int)Int.
Unspecified value parameter y.
But it is possible to achieve the same with scala:
def multiply(x:Int, y:Int) = {
x*y;
}
val operands = (2, 4)
multiply _ tupled operands
According to Lorrin Nelson this is how it works:
The first part, f _, is the syntax for a partially applied function in which none of the arguments have been specified. This works as a mechanism to get a hold of the function object. tupled returns a new function which of arity-1 that takes a single arity-n tuple.
Futher reading:
stackoverflow.com - scala tuple unpacking

Related

Apache-Spark : What is map(_._2) shorthand for?

I read a project's source code, found:
val sampleMBR = inputMBR.map(_._2).sample
inputMBR is a tuple.
the function map's definition is :
map[U classTag](f:T=>U):RDD[U]
it seems that map(_._2) is the shorthand for map(x => (x._2)).
Anyone can tell me rules of those shorthand ?

The _ syntax can be a bit confusing. When _ is used on its own it represents an argument in the anonymous function. So if we working on pairs:
map(_._2 + _._2) would be shorthand for map(x, y => x._2 + y._2). When _ is used as part of a function name (or value name) it has no special meaning. In this case x._2 returns the second element of a tuple (assuming x is a tuple).

collection.map(_._2) emits a second component of the tuple. Example from pure Scala (Spark RDDs work the same way):
scala> val zipped = (1 to 10).zip('a' to 'j')
zipped: scala.collection.immutable.IndexedSeq[(Int, Char)] = Vector((1,a), (2,b), (3,c), (4,d), (5,e), (6,f), (7,g), (8,h), (9,i), (10,j))
scala> val justLetters = zipped.map(_._2)
justLetters: scala.collection.immutable.IndexedSeq[Char] = Vector(a, b, c, d, e, f, g, h, i, j)

Two underscores in '_._2' are different.
First '_' is for placeholder of anonymous function; Second '_2' is member of case class Tuple.
Something like:
case class Tuple3 (_1: T1, _2: T2, _3: T3)
{...}

I have found the solutions.
First the underscore here is as placeholder.
To make a function literal even more concise, you can use underscores
as placeholders for one or more parameters, so long as each parameter
appears only one time within the function literal.
See more about underscore in Scala at What are all the uses of an underscore in Scala?.

The first '_' is referring what is mapped to and since what is mapped to is a tuple you might call any function within the tuple and one of the method is '_2' so what below tells us transform input into it's second attribute.

Why doesn't the scala compiler recognize this as a tuple?

If I create a map:
val m = Map((4, 3))
And try to add a new key value pair:
val m_prime = m + (1, 5)
I get:
error: type mismatch;
found : Int(1)
required: (Int, ?)
val m_prime = m + (1, 5)
If I do:
val m_prime = m + ((1, 5))
Or:
val m_prime = m + (1 -> 5)
Then it works. Why doesn't the compiler accept the first example?
I am using 2.10.2

This is indeed very annoying (I run into this frequently). First of all, the + method comes from a general collection trait, taking only one argument—the collection's element type. Map's element type is the pair (A, B). However, Scala interprets the parentheses here as method call parentheses, not a tuple constructor. The explanation is shown in the next section.
To solve this, you can either avoid tuple syntax and use the arrow association key -> value instead, or use double parentheses, or use method updated which is specific to Map. updated does the same as + but takes key and value as separate arguments:
val m_prime = m updated (1, 5)
Still it is unclear why Scala fails here, as in general infix syntax should work and not expect parentheses. It appears that this particular case is broken because of a method overloading: There is a second + method that takes a variable number of tuple arguments.
Demonstration:
trait Foo {
def +(tup: (Int, Int)): Foo
}
def test1(f: Foo) = f + (1, 2) // yes, it works!
trait Baz extends Foo {
def +(tups: (Int, Int)*): Foo // overloaded
}
def test2(b: Baz) = b + (1, 2) // boom. we broke it.
My interpretation is that with the vararg version added, there is now an ambiguity: Is (a, b) a Tuple2 or a list of two arguments a and b (even if a and b are not of type Tuple2, perhaps the compiler would start looking for an implicit conversion). The only way to resolve the ambiguity is to use either of the three approaches described above.

Curried update method

I'm trying to have curried apply and update methods like this:
def apply(i: Int)(j: Int) = matrix(i)(j)
def update(i: Int, j: Int, value: Int) =
new Matrix(n, m, (x, y) => if ((i,j) == (x,y)) value else matrix(x)(y))
Apply method works correctly, but update method complains:
scala> matrix(2)(1) = 1
<console>:16: error: missing arguments for method apply in class Matrix;
follow this method with `_' if you want to treat it as a partially applied function
matrix(2)(1) = 1
Calling directly update(2)(1)(1) works, so it is a conversion to update method that doesn't work properly. Where is my mistake?

The desugaring of assignment syntax into invocations of update maps the concatenation of a single argument list on the LHS of the assignment with the value on the RHS of the assignment to the first parameter block of the update method definition, irrespective of how many other parameter blocks the update method definition has. Whilst this transformation in a sense splits a single parameter block into two (one on the LHS, one on the RHS of the assignment), it will not further split the left parameter block in the way that you want.
I also think you're mistaken about the example of the explicit invocation of update that you show. This doesn't compile with the definition of update that you've given,
scala> class Matrix { def update(i: Int, j: Int, value: Int) = (i, j, value) }
defined class Matrix
scala> val m = new Matrix
m: Matrix = Matrix#37176bc4
scala> m.update(1)(2)(3)
<console>:10: error: not enough arguments for method update: (i: Int, j: Int, value: Int)(Int, Int, Int).
Unspecified value parameters j, value.
m.update(1)(2)(3)
^
I suspect that during your experimentation you actually defined update like so,
scala> class Matrix { def update(i: Int)(j: Int)(value: Int) = (i, j, value) }
defined class Matrix
The update desugaring does apply to this definition, but probably not in the way that you expect: as described above, it only applies to the first argument list, which leads to constructs like,
scala> val m = new Matrix
m: Matrix = Matrix#39741f43
scala> (m() = 1)(2)(3)
res0: (Int, Int, Int) = (1,2,3)
Here the initial one-place parameter block is split to an empty parameter block on the LHS of the assignment (ie. the ()) and a one argument parameter block on the RHS (ie. the 1). The remainder of the parameter blocks from the original definition then follow.
If you're surprised by this behaviour you won't be the first.
The syntax you're after is achievable via a slightly different route,
scala> class Matrix {
| class MatrixAux(i : Int) {
| def apply(j : Int) = 23
| def update(j: Int, value: Int) = (i, j, value)
| }
|
| def apply(i: Int) = new MatrixAux(i)
| }
defined class Matrix
scala> val m = new Matrix
m: Matrix = Matrix#3af30087
scala> m(1)(2) // invokes MatrixAux.apply
res0: Int = 23
scala> m(1)(2) = 3 // invokes MatrixAux.update
res1: (Int, Int, Int) = (1,2,3)

My guess is, that it is simply not supported. Probably not due to an explicit design decision, because I don't see why it shouldn't work in principle.
The translation concerned with apply, i.e., the one performed when converting m(i)(j) into m.apply(i, j) seems to be able to cope with currying. Run scala -print on your program to see the code resulting from the translation.
The translation concerned with update, on the other hand, doesn't seem to be able to cope with currying. Since the error message is missing arguments for method apply, it even looks as if the currying confuses the translator such that it tries to translate m(i)(j) = v into m.apply, but then screws up the number of required arguments. scala -print unfortunately won't help here, because the type checker terminates the translation too early.
Here is what the language specs (Scala 2.9, "6.15 Assignments") say about assignments. Since currying is not mentioned, I assume that it is not explicitly supported. I couldn't find the corresponding paragraph for apply, but I guess it is purely coincidental that currying works there.
An assignment f(args) = e with a function application to the left of
the ‘=’ operator is interpreted as f.update(args, e), i.e. the
invocation of an update function defined by f.

How to invoke a method taking String * with elements of Array[String]

Suppose I have a method
def f(s:String *) = s.foreach( x => println(x) )
Now I have an array:
val a = Array("1", "2", "3")
How do I invoke f with elements of a as parameters?
EDIT:
So given a, I want to do the following:
f(a(0), a(1), a(2)) // f("1", "2", "3")

There is an operator for that:
f(a: _*)
This operation is defined in chapter 4.6.2 Repeated Parameters of The Scala Language Specification Version 2.9 and further explained in 6.6 Function Applications:
The last argument in an application may be marked as a sequence argument, e.g.
e : _*. Such an argument must correspond to a repeated parameter (§4.6.2) of type
S * [...]. Furthermore, the type of e must conform to scala.Seq[T], for some type T which conforms to S. In this
case, the argument list is transformed by replacing the sequence e with its elements.
BTW your f function can be simplified:
def f(s:String *) = s foreach println
Or better (equals sign is discouraged as it suggests that the method actually returns something, however it "only" return Unit):
def f(s:String *) {s foreach println}

Syntax sugar: _* for treating Seq as method parameters

I just noticed this construct somewhere on web:
val list = List(someCollection: _*)
What does _* mean? Is this a syntax sugar for some method call? What constraints should my custom class satisfy so that it can take advantage of this syntax sugar?

Generally, the : notation is used for type ascription, forcing the compiler to see a value as some particular type. This is not quite the same as casting.
val b = 1 : Byte
val f = 1 : Float
val d = 1 : Double
In this case, you're ascribing the special varargs type. This mirrors the asterisk notation used for declaring a varargs parameter and can be used on a variable of any type that subclasses Seq[T]:
def f(args: String*) = ... //varargs parameter, use as an Array[String]
val list = List("a", "b", "c")
f(list : _*)

That's scala syntax for exploding an array. Some functions take a variable number of arguments and to pass in an array you need to append : _* to the array argument.

Variable (number of) Arguments are defined using *.
For example,
def wordcount(words: String*) = println(words.size)
wordcount expects a string as parameter,
scala> wordcount("I")
1
but accepts more Strings as its input parameter (_* is needed for Type Ascription)
scala> val wordList = List("I", "love", "Scala")
scala> wordcount(wordList: _*)
3