Get a different result when calling the same function - scala

I am a Scala newbie.
This is my code. Calling the same method in two different ways gives different results.
Can anyone explain why?

The thing is that in Scala, all methods (or "operators") whose names end with a colon ':' are right-associative when used in infix notation.
So... for your function,
def ::(t: TG) = ???
When, you are writing
val lxx3 = lxx1 :: lxx2
The function :: associates to the right (i.e. it is called on lxx2). So it is actually equivalent to
val lxx3 = lxx2.::(lxx1)
instead of this,
val lxx3 = lxx1.::(lxx2)
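A small sketch can make the direction of the call visible. The TG class below is a hypothetical stand-in (only the names TG and :: come from the question), and its :: does nothing except report which object received the call:

```scala
object RightAssocDemo {
  // Hypothetical stand-in for the asker's class: only the names TG and ::
  // come from the question, the body is an assumption for illustration.
  final case class TG(name: String) {
    // Reports which object received the call and which one was passed in.
    def ::(t: TG): String = s"receiver=$name, argument=${t.name}"
  }

  val lxx1 = TG("lxx1")
  val lxx2 = TG("lxx2")

  val infix    = lxx1 :: lxx2   // name ends with ':', so lxx2 is the receiver
  val explicit = lxx2.::(lxx1)  // the same call in dot notation
  val flipped  = lxx1.::(lxx2)  // the call the asker probably expected
}
```

Here infix and explicit are the same string, while flipped has the two names swapped.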

Related

What does an underscore after a scala method call mean?

The scala documentation has a code example that includes the following line:
val numberFunc = numbers.foldLeft(List[Int]())_
What does the underscore after the method call mean?
It's a partially applied function. You only provide the first parameter to foldLeft (the initial value), but you don't provide the second one; you postpone it for later. In the docs you linked they do it in the next line, where they define squares:
val numberFunc = numbers.foldLeft(List[Int]())_
val squares = numberFunc((xs, x) => xs:+ x*x)
See that (xs, x) => xs:+ x*x, that's the missing second parameter which you omitted while defining numberFunc. If you had provided it right away, then numberFunc would not be a function - it would be the computed value.
So basically the whole thing can also be written as a one-liner in the curried form:
val squares = numbers.foldLeft(List[Int]())((xs, x) => xs:+ x*x)
However, if you want to reuse foldLeft over and over with the same collection and initial value but a different function each time, it's very convenient to define a separate numberFunc (as they did in the docs) and reuse it with different functions, e.g.:
val squares = numberFunc((xs, x) => xs:+ x*x)
val cubes = numberFunc((xs, x) => xs:+ x*x*x)
...
Note that the compiler error message is pretty straightforward in case you forget the underscore:
Error: missing argument list for method foldLeft in trait LinearSeqOptimized
Unapplied methods are only converted to functions when a function type is expected.
You can make this conversion explicit by writing `foldLeft _` or `foldLeft(_)(_)` instead of `foldLeft`.
val numberFunc = numbers.foldLeft(List[Int]())
EDIT: Haha I just realized that they did the exact same thing with cubes in the documentation.
I don't know if it helps, but I prefer this syntax:
val numberFunc = numbers.foldLeft(List[Int]())(_)
Then numberFunc is basically a delegate for an instance method (the instance being numbers) that is still waiting for a parameter. In the Scala documentation example, that parameter is later supplied as a lambda expression.
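To see what the underscore stands for, it may help to write the resulting function out by hand. The sketch below is an assumption-laden illustration: the numbers list is sample data rather than the one from the docs, and numberFunc is defined as an explicit lambda that should behave like the underscore version:

```scala
object PartialFoldDemo {
  val numbers = List(1, 2, 3) // sample data, not the list from the docs

  // Hand-written equivalent of numbers.foldLeft(List[Int]()) _ :
  // a function that still needs the combining operation.
  val numberFunc: ((List[Int], Int) => List[Int]) => List[Int] =
    op => numbers.foldLeft(List[Int]())(op)

  // The missing second argument is supplied later, once per use.
  val squares = numberFunc((xs, x) => xs :+ x * x)
  val cubes   = numberFunc((xs, x) => xs :+ x * x * x)
}
```

The type annotation on numberFunc spells out what the compiler infers in the underscore version: a function from a combining operation to the folded result.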

Spark Scala MLlib assignment syntax

I've been going through the guide at https://spark.apache.org/docs/latest/ml-statistics.html and I've noticed that they're using this syntax for val assignment:
val Row(coeff1: Matrix) = Correlation.corr(df, "features").head
Can someone elaborate on what this means? It seems similar to how Scala handles regex group extraction...
It is nothing more than pattern matching. To make it more obvious, you can rewrite it as:
val coeff1 = Correlation.corr(df, "features").head match {
case Row(coeff1: Matrix) => coeff1
}
In other words, it tries to match the object returned from the .head call and, on a successful match, creates a reference (coeff1) to the Matrix object contained in the returned Row.
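The same mechanism can be tried without Spark. In the sketch below, Wrapper is a hypothetical case class playing the role of Row (it is not part of any library), and the last line shows what happens when the pattern does not match:

```scala
object PatternValDemo {
  // Hypothetical wrapper playing the role of Spark's Row; its field is
  // typed Any, so the pattern has to check the runtime type.
  final case class Wrapper(payload: Any)

  val source = Wrapper("hello")

  // Pattern on the left-hand side of a val definition.
  val Wrapper(text: String) = source

  // The same extraction spelled out as a match expression.
  val sameText = source match {
    case Wrapper(t: String) => t
  }

  // A pattern that does not match fails at runtime with scala.MatchError;
  // Try is used here only to capture that failure.
  val failed = scala.util.Try { val Wrapper(n: Int) = source; n }
}
```

Because the pattern can fail, this form is best kept for cases where the shape of the value is known, as with the single-cell Row in the Spark guide. Newer compilers may warn about such refutable patterns.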

How to refer Spark RDD element multiple times using underscore notation?

How can I refer to a Spark RDD element multiple times using underscore notation?
For example, I need to convert an RDD[String] to an RDD[(String, Int)]. I can do this with an anonymous function that names its parameter, but I would like to use underscore notation. How can I achieve this?
Please find sample code below.
val x = List("apple", "banana")
val rdd1 = sc.parallelize(x)
// Working
val rdd2 = rdd1.map(x => (x, x.length))
// Not working
val rdd3 = rdd1.map((_, _.length))
Why does the last line above not work?
An underscore (more commonly called placeholder syntax) is a marker for a single input parameter. It's nice for simple functions, but it gets tricky to get right with two or more occurrences.
You can find the definitive answer in the Scala language specification's Placeholder Syntax for Anonymous Functions:
An expression (of syntactic category Expr) may contain embedded underscore symbols _ at places where identifiers are legal. Such an expression represents an anonymous function where subsequent occurrences of underscores denote successive parameters.
Note that one underscore references one input parameter, two underscores are for two different input parameters and so on.
With that said, you cannot use the placeholder twice and expect that they'll reference the same input parameter. That's not how it works in Scala and hence the compiler error.
// Not working
val rdd3 = rdd1.map((_, _.length))
Each underscore introduces its own parameter, so the line above behaves roughly as if you had written two separate parameters:
// Not working
val rdd3 = rdd1.map { (a: String, b: String) => (a, b.length) }
which is clearly incorrect as map expects a function of one input parameter.
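As a rough illustration, here is the same situation with a plain List standing in for the RDD (an assumption made so the snippet runs without Spark). A parameter that is used twice gets a name, while two underscores fit naturally where two parameters are expected:

```scala
object PlaceholderDemo {
  val words = List("apple", "banana") // a plain List standing in for the RDD

  // One parameter used twice has to be given a name.
  val pairs = words.map(x => (x, x.length))

  // Two underscores are the right tool when two parameters are expected.
  val totalLength  = words.map(_.length).reduce(_ + _)
  // The same computation with every parameter named.
  val totalLength2 = words.map(w => w.length).reduce((a, b) => a + b)
}
```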

Why can't a Scala lambda with _ use && to combine two boolean expressions?

As far as I understand, _ is a shorthand for a lambda that lets you omit a =>.
I found this code (from scala-function-true-power):
val file = List("warn 2013 msg", "warn 2012 msg", "error 2013 msg", "warn 2013 msg")
val size = file.filter(_.contains("warn")).filter(_.contains("2013")).size
//val size1 = file.filter(_.contains("warn") && _.contains("2013")).size
val size2 = file.filter( a=> a.contains("warn") && a.contains("2013")).size
println("cat file | grep 'warn' | grep '2013' | wc : " +size )
The line that computes size1 has a syntax error; it looks like the second "_" is not recognized as an element of the file list.
But when I use the ordinary a => form, it works fine.
So why does Scala work this way?
Is there more of a difference between _ and a => ?
In Scala, each _ placeholder is matched against the arguments passed by the calling function. For example, if you have a function f : A ⇒ B and you call collectionOfFunctA.map(_.f), the compiler infers the type and substitutes each item of the collection for the underscore, calling f on it. But if you write collectionOfFunctA.map(_.f + _.size), that fails: the compiler binds the first placeholder to the first parameter and treats the second underscore as a second parameter, so it expects a function that takes two parameters instead of one.
As jdevelop says, but here in the words of the compiler/REPL:
scala> val size1 = file.filter(_.contains("warn") && _.contains("2013")).size
<console>:8: error: missing parameter type for expanded function ((x$1, x$2) => x$1.contains("warn").$amp$amp(x$2.contains("2013")))
val size1 = file.filter(_.contains("warn") && _.contains("2013")).size
^
<console>:8: error: missing parameter type for expanded function ((x$1: <error>, x$2) => x$1.contains("warn").$amp$amp(x$2.contains("2013")))
val size1 = file.filter(_.contains("warn") && _.contains("2013")).size
^
You see that hint: for expanded function ((x$1, x$2) => x$1.contains("warn").$amp$amp(x$2.contains("2013")))
The expanded function takes two parameters, while filter supplies just one.
You can think of the placeholder as being matched with the lambda's arguments positionally.
The first occurrence of _ is matched with the first argument, the second occurrence is matched with the second argument, etc.
As the other answers have shown, this means that using the placeholder twice is desugared into a lambda with two arguments, which is then passed to filter, which only expects one.
In your example :
val size = file.filter(_.contains("warn") && _.contains("2013")).size
would be desugared as
val size = file.filter((a,b)=>a.contains("warn") && b.contains("2013")).size
which will not compile since filter expects a predicate p: A => Boolean
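For comparison, here is a sketch of forms that do compile, using the file list from the question. The isWarn and is2013 helper names are made up for illustration; each of them uses a single underscore inside its own expression:

```scala
object FilterDemo {
  val file = List("warn 2013 msg", "warn 2012 msg", "error 2013 msg", "warn 2013 msg")

  // One underscore per filter call: each expands to a one-argument function.
  val chained = file.filter(_.contains("warn")).filter(_.contains("2013")).size

  // A named parameter can be used as many times as needed.
  val named = file.filter(a => a.contains("warn") && a.contains("2013")).size

  // Hypothetical helpers: each predicate gets its own underscore, and the
  // combination is written with a named parameter.
  val isWarn: String => Boolean = _.contains("warn")
  val is2013: String => Boolean = _.contains("2013")
  val combined = file.filter(a => isWarn(a) && is2013(a)).size
}
```

All three counts agree, since only the first and last lines contain both "warn" and "2013".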
Now, one reason the placeholder is matched positionally is to avoid ambiguity in lambdas with more than one argument.
How could the compiler guess the correct implementation for the following case if the placeholder could be reused multiple times for the same argument?
file.fold("")(_++_)
Should it be desugared as :
file.fold("")((a,b)=> a++b )
or as
file.fold("")((a,b)=> a++a )
or as
file.fold("")((a,b)=> b++b )
and worse, what would you expect for
file.fold("")(_++_++_)
There is no general way for the compiler to infer the correct implementation.
One might argue for relaxing the constraint when the expected lambda only accepts one argument. I suggest doing more detailed research before taking the first steps in the Scala Improvement Process, as it seems likely that this particular design decision has been challenged and explained before.
If you are worried about the performance of iterating over the list twice, which is what happens when you write
file.filter(_.contains("warn")).filter(_.contains("2013")).size
then in theory it should be possible for the compiler to detect that both filters can be applied within the same iteration.
In Scala, collections are strict (eager) by default, but you can get lazy evaluation by using views.
The current implementation has known issues which are being worked on. Other collection implementations in Scala are actively being developed to combine transformations and computations by default (see psp-std for example).
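A minimal sketch of two ways to avoid building the intermediate list, assuming the standard library collections: count with a named parameter, and a view, which applies the filters lazily as the elements are traversed:

```scala
object SinglePassDemo {
  val file = List("warn 2013 msg", "warn 2012 msg", "error 2013 msg", "warn 2013 msg")

  // count evaluates the combined predicate once per element and builds
  // no intermediate collection.
  val viaCount = file.count(a => a.contains("warn") && a.contains("2013"))

  // A view defers both filters until size walks over the elements.
  val viaView = file.view.filter(_.contains("warn")).filter(_.contains("2013")).size
}
```

Whether this matters in practice depends on the size of the collection; for a four-element list the difference is negligible.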

What does { val x = a; b.:::(x) } mean in Scala?

I am new to Scala and studying a book about it (Programming in Scala). I am really lost, what is the author trying to explain with the code below. Can anyone explain it in more detail ?
{ val x = a; b.:::(x) }
::: is a method that prepends the list given as its argument to the list it is called on.
You could look at this as
val a = List(1, 2)
val b = List(3, 4)
val x = a
b.:::(x) // think of ::: as a method that could have been named prependList
But for single-argument methods, when it is not ambiguous, Scala allows you to skip the dot and the parentheses, and that is how this method is meant to be used so it does not look ugly:
x ::: b
It simply joins the two lists, but there is a trick here:
if a method name ends with :, it is bound the other way around (to the right operand).
So typing x ::: b works as if you had written something like (x):::.b. You obviously can't type it like that in Scala, as it won't compile, but that is what happens. Thanks to this, x is on the left side of the operator and its elements end up on the left side (the beginning) of the resulting list.
Oh well, I have now found some more explanation for you, together with the very same piece of code you posted, in an answer to this question: What good are right-associative methods in Scala?
Assuming a and b are lists: It assigns a to x, then returns the list b prepended with the list x.
For example, if val a = List(1,2,3) and val b = List(4,5,6) then it returns List(1,2,3,4,5,6).
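One reason the expansion is written as a block with a temporary x is that the left operand is still evaluated first, even though the method is called on the right operand. The sketch below makes that visible with two hypothetical helpers, left and right, that record when they are evaluated:

```scala
object PrependDemo {
  import scala.collection.mutable.ListBuffer

  // Records the order in which the operands are evaluated.
  val log = ListBuffer.empty[String]

  // Hypothetical helpers standing in for the expressions a and b.
  def left(): List[Int]  = { log += "left";  List(1, 2, 3) }
  def right(): List[Int] = { log += "right"; List(4, 5, 6) }

  // Infix form: left() is evaluated before right().
  val infix = left() ::: right()

  // The expansion from the book: same result, same evaluation order.
  val expanded = { val x = left(); right().:::(x) }

  // A plain method call evaluates the receiver first, so the order flips.
  val direct = right().:::(left())
}
```

All three values are List(1, 2, 3, 4, 5, 6); only the order of the entries in log differs for the last one.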