Why DataFrame.collect() not returning array - scala

I am trying to call collect() on a DataFrame in Scala 2.12. Instead of showing an Array[Row], it gives me this: [Lorg.apache.spark.sql.Row;@58131fc

It's annoying, but on the JVM, in both Java and Scala, that's just how the toString method on arrays works. Instead of seeing the contents, you get a cryptic string made of the array's JVM type name, an @ sign, and a hash code, beginning with e.g. [L:
scala 2.12.10> Array("foo").toString
res0: String = [Ljava.lang.String;@8bffb8b
So it appears to me that you do in fact have an Array[Row].
See also Why does the toString method in java not seem to work for an array
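To actually see the contents, something like mkString should do. This is only a minimal sketch: it uses a plain Array[String] as a stand-in for the Array[Row] from the question, and the names ArrayPrinting and render are made up for the example.

```scala
object ArrayPrinting {
  // Hypothetical helper: joins the elements into a readable string.
  def render(xs: Array[String]): String = xs.mkString("[", ", ", "]")

  def main(args: Array[String]): Unit = {
    // Stand-in for the Array[Row] that collect() returns.
    val rows = Array("foo", "bar")

    // Default toString: type name plus hash code, not the contents.
    println(rows.toString)

    // mkString walks the elements and calls toString on each one.
    println(render(rows))
  }
}
```

The same call should work on the result of collect(), since each Row has its own readable toString.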

Related

forEach in scala shows expected: Consumer[_ >:Path] actual: (Path) => Boolean

Wrong syntax problem in recursively deleting scala files
Files.walk(path, FileVisitOption.FOLLOW_LINKS)
.sorted(Comparator.reverseOrder())
.forEach(Files.deleteIfExists)
The issue is that you're trying to pass a Scala-style function to a method expecting a Java 8-style function. There are a couple of libraries out there that can do the conversion, or you could write it yourself (it's not complicated). Probably the simplest option is to convert the Java stream's iterator to a Scala Iterator, whose foreach method takes a Scala-style function as an argument:
import scala.collection.JavaConverters._
Files.walk(path, FileVisitOption.FOLLOW_LINKS)
.sorted(Comparator.reverseOrder())
.iterator().asScala
.foreach(Files.deleteIfExists)
In Scala 2.12 I expect this should work:
...forEach(Files.deleteIfExists(_: Path))
You need to specify the argument type because the expected type is Consumer[_ >: Path], not Consumer[Path] as it would be in Scala.
If it doesn't work (can't test at the moment), try
val deleteIfExists: Consumer[Path] = Files.deleteIfExists(_)
...forEach(deleteIfExists)
Before Scala 2.12, Joe K's answer is the correct one.
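For illustration, here is a sketch that builds the Consumer explicitly as an anonymous class, so it should not depend on 2.12's conversion of Scala functions to Java functional interfaces. The names DeleteTree and deleteRecursively are made up for this example, and FOLLOW_LINKS is deliberately left out (an assumption on my part) so that the targets of symbolic links are not traversed.

```scala
import java.nio.file.{Files, Path}
import java.util.Comparator
import java.util.function.Consumer

object DeleteTree {
  // Hypothetical helper: deletes root and everything below it.
  def deleteRecursively(root: Path): Unit = {
    // Explicit Java-style Consumer; the Boolean result of
    // deleteIfExists is discarded on purpose.
    val deleter = new Consumer[Path] {
      override def accept(p: Path): Unit = {
        Files.deleteIfExists(p)
        ()
      }
    }

    val stream = Files.walk(root)
    try {
      // Reverse order puts children before their parent directories.
      stream.sorted(Comparator.reverseOrder[Path]()).forEach(deleter)
    } finally {
      stream.close() // Files.walk holds open directory handles
    }
  }
}
```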

Groovy: getClass method on map literal returns null

In Groovy I use the map literal notation quite frequently in my code, and was curious as to what concrete implementation of Map it was.
After trying a few things, this script best illustrates my confusion:
def map = ["A":"B"]
println map // I assume this avoids any lazy evaluation of the map
println map instanceof HashMap // I tried some other impls too
println map.class
and receive this output:
[A:B]
true
null
This tells me that the map is apparently a HashMap, but the getClass method doesn't want to tell me that.
So my question is: why is getClass returning null, and is there a more appropriate way to get runtime class info from Groovy?
You need to use
map.getClass()
Otherwise, map.class is treated as a lookup of a key called class in the map, which returns null because no such key exists.
Nearly a duplicate of Why does groovy .class return a different value than .getClass()

How to iterate through lazy iterable in scala? from stanford-tmt

Scala newbie here,
I'm using stanford's topic modelling toolkit
and it has a lazy iterable of type LazyIterable[(String, Array[Double])]
How should I iterate through all the elements of this iterable (call it it) to print all of these values?
I tried doing this by
while(it.hasNext){
System.out.println(it.next())
}
Gives an error
error: value next is not a member of scalanlp.collection.LazyIterable[(String, Array[Double])]
This is the API source -> iterable_name ->
InferCVB0DocumentTopicDistributions in
http://nlp.stanford.edu/software/tmt/tmt-0.4/api/edu/stanford/nlp/tmt/stage/package.html
Based on its source code, I can see that the LazyIterable implements the standard Scala Iterable interface, which means you have access to all the standard higher-order functions that all Scala collections implement - such as map, flatMap, filter, etc.
The one you will be interested in for printing all the values is foreach. So try this (no need for the while-loop):
it.foreach(println)
This looks like a method invocation problem. Check the source code of LazyIterable, at line 46:
override def iterator : Iterator[A]
Once you have an instance of LazyIterable, call its iterator method. The resulting Iterator has hasNext and next, so you can do what you want.
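Both suggestions can be sketched as follows. Note the assumptions: a plain List stands in for the LazyIterable from the toolkit (it only needs to be a Scala Iterable), and the names IterateExample, docs and format are invented for the example.

```scala
object IterateExample {
  // Stand-in for LazyIterable[(String, Array[Double])].
  val docs: Iterable[(String, Array[Double])] = List(
    "doc1" -> Array(0.25, 0.75),
    "doc2" -> Array(0.5, 0.5)
  )

  // Hypothetical helper: an Array prints as a type name and hash code
  // by default, so the values are joined explicitly.
  def format(entry: (String, Array[Double])): String =
    entry._1 + ": " + entry._2.mkString(", ")

  def main(args: Array[String]): Unit = {
    // Option 1: foreach, available on every Iterable.
    docs.foreach(entry => println(format(entry)))

    // Option 2: ask for an Iterator first, then use hasNext and next.
    val iter = docs.iterator
    while (iter.hasNext) {
      println(format(iter.next()))
    }
  }
}
```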

.eq causing warning. How do I get rid of it?

I'm using JDO with the DataNucleus typesafe query language in Scala. I therefore have code that looks like this:
val id: Long = // something
val cand: QDbObject = QDbObject.candidate()
pm.query[DbObject].filter(cand.id.eq(id))...
In a nutshell, this runs a query for all the DbObjects whose id field is equal to id. Unfortunately, I get the following warning:
NumericExpression[Long] and Long are unrelated: they will most likely
never compare equal
Clearly, the Scala compiler thinks that NumericExpression[Long] is using the built-in definition of eq(), which is similar to ==, but since this comes from Java, the eq() method has absolutely nothing to do with Scala's eq() method.
Is there any way to get rid of the warning? Obviously, this is going to happen a lot and I'm afraid these non-warnings will hide real warnings.
Update (2013-06-29)
This was fixed in Scala 2.10.2. The warnings are gone.
I was more concerned whether the eq method would actually be called instead of Scala's eq! But it is. I don't think you can get rid of it, though. If you are using Scala 2.10, you can create an implicit value class with a different method calling eq -- it will be effectively the same thing, but the warning will be limited to one file.

Using scala vararg methods in java

Why do all Scala vararg methods, when used from Java, seem to accept a Seq of values instead of working as native Java vararg methods? Is this a bug?
For instance, Buffer has method def append(elems: A*): Unit. But in java it has another signature: void append(Seq<A>).
If you control the Scala code you can use @varargs (from scala.annotation.varargs) to make it generate a Java-compatible varargs method, e.g. @varargs def append(elems: A*): Unit = {}
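A minimal sketch of the annotation in use. The class Appender and its methods are hypothetical and not part of the collections API.

```scala
import scala.annotation.varargs
import scala.collection.mutable.ListBuffer

// Hypothetical class used only to illustrate @varargs.
class Appender {
  private val buf = ListBuffer.empty[String]

  // The annotation asks the compiler to also generate a Java-style
  // forwarder taking String..., next to the Seq-based Scala method.
  @varargs def appendAllOf(elems: String*): Unit = {
    buf ++= elems
    ()
  }

  def contents: List[String] = buf.toList
}
```

From Java this should then be callable as new Appender().appendAllOf("x", "y"), while Scala callers keep the usual vararg syntax.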
It is not a bug. It is a design choice that favors vararg use within Scala over interoperability with Java. For example, it allows you to pass a List into a Scala varargs method without having to convert it to an Array on the way.
If you need to use Scala varargs from Java, you should create some scala Seq instead. You can, for example, write a Java wrapper to get an array automatically created, and then use the genericWrapArray method from the Predef object.
You can easily pass a Seq as varargs using : _*. For example:
val b = collection.mutable.ListBuffer.empty[Int]
b.append(List(1, 2):_*)
So this avoids code duplication in the collection API.
You can also simply use appendAll :
b.appendAll(List(1, 2))