Can someone please explain what this means?
Error:(32, 28) discarded non-Unit value
dataFrameReader.load() wasCalled once
I've looked at some online articles and I don't quite understand it.
This is my code snippet from a ScalaTest suite using mockito-scala:
...
val dataFrameReader = mock[DataFrameReader]
dataFrameReader.format(anyString) shouldReturn dataFrameReader
dataFrameReader.option(anyString, anyString) shouldReturn dataFrameReader
dataFrameReader.load() wasCalled once
If I take out the wasCalled once then it works fine.
I don't understand what this means, though, since I'm invoking wasCalled on what load() returns, and wasCalled once resolves to Unit.
What am I missing here?
Assuming you are mocking DataFrameReader.load from Apache Spark, its return type is actually DataFrame, not Unit:
def load(): DataFrame
On the other hand, return type of wasCalled is indeed Unit:
def wasCalled(t: Times)(implicit order: VerifyOrder): Unit
Thus we have a situation similar to
def f(): Unit = {
  g() // g returns a DataFrame, which gets discarded by f
}
def g(): DataFrame
which gets flagged by the compiler if scalacOptions += "-Ywarn-value-discard" is set.
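For illustration, here is a minimal, self-contained sketch of the same warning using made-up names (g stands in for load()); compile with scalacOptions += "-Ywarn-value-discard":
object ValueDiscardDemo {
  def g(): String = "result" // stand-in for load(): DataFrame

  def f(): Unit = {
    g() // warning: discarded non-Unit value
  }

  def fSilenced(): Unit = {
    g() // evaluated only for its (hypothetical) side effect
    ()  // ending the block with () avoids the value-discarding adaptation
  }
}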
The issue has been resolved since mockito-scala 1.2.2.
Here is a simple example in shapeless:
it("from Witness") {
val ttg = implicitly[TypeTag[Witness.Lt[String]]]
val ctg = implicitly[ClassTag[Witness.Lt[String]]]
}
it("... from macro") {
val ttg = implicitly[TypeTag[Witness.`1`.T]]
val ctg = implicitly[ClassTag[Witness.`1`.T]]
}
it("... doesn't work") {
val ttg = implicitly[TypeTag[w1.T]] // failed!
val ctg = implicitly[ClassTag[w2.T]]
}
The second and third it blocks have very similar bytecode (after the Witness.? macro has been invoked), yet one succeeds and the other fails:
[Error] /home/peng/git-spike/scalaspike/common/src/test/scala/com/tribbloids/spike/scala_spike/Reflection/InferTypeTag.scala:55: No TypeTag available for com.tribbloids.spike.scala_spike.Reflection.InferTypeTag.w1.T
What could have caused this? And how do I circumvent this problem?
If you switch on scalacOptions += "-Xlog-implicits", you'll see:
val w1 = Witness(1)
val ttg = implicitly[TypeTag[w1.T]] // doesn't compile
//materializing requested reflect.runtime.universe.type.TypeTag[App.w1.T] using scala.reflect.api.`package`.materializeTypeTag[App.w1.T](scala.reflect.runtime.`package`.universe)
//scala.reflect.api.`package`.materializeTypeTag[App.w1.T](scala.reflect.runtime.`package`.universe) is not a valid implicit value for reflect.runtime.universe.TypeTag[App.w1.T] because:
//failed to typecheck the materialized tag:
//cannot create a TypeTag referring to type shapeless.Witness.<refinement>.T local to the reifee: use WeakTypeTag instead
//No TypeTag available for App.w1.T
So try to use WeakTypeTag, as recommended:
val w1 = Witness(1)
val ttg2 = implicitly[WeakTypeTag[w1.T]] // compiles
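For reference, here is a self-contained sketch of the workaround (assuming shapeless is on the classpath; the names are illustrative):
import scala.reflect.runtime.universe.WeakTypeTag
import shapeless.Witness

object WitnessTagDemo extends App {
  val w1 = Witness(1)

  // implicitly[TypeTag[w1.T]]            // does not compile: no TypeTag for the
  //                                      // refinement-local type w1.T
  val wtt = implicitly[WeakTypeTag[w1.T]] // compiles: weak tags may stay unresolved
  println(wtt.tpe)
}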
Here's an example:
$ scala
Welcome to Scala 2.11.8 (OpenJDK 64-Bit Server VM, Java 1.8.0_112).
Type in expressions for evaluation. Or try :help.
scala> val a: Unit = 1
<console>:11: warning: a pure expression does nothing in statement position; you may be omitting necessary parentheses
val a: Unit = 1
^
a: Unit = ()
In the Scala documentation:
There is only one value of type Unit, ()
Why is the Scala compiler silently coercing values to Unit?
A bit of context: I used the Future[Unit] type to describe a procedure that does not return anything. And since Future[Unit] now effectively behaves like a subtype of Unit, I got bitten by some funny bugs (someFuture.map(a => Future(a)) silently skips calling the operation instead of giving a compilation warning). What am I supposed to use as the type of an operation that does not return any meaningful result?
Unit is not a supertype of other types. What happens instead is called value discarding: when the expected type of an expression e is Unit, the compiler replaces it by {e; ()}. This is done to make some behavior more familiar. E.g.
val sb = new StringBuilder
val strings: List[String] = ...
for (str <- strings) { sb.append(str) }
By analogy with for loops in other languages, we would expect it to compile. But without value discarding it wouldn't: the loop is equivalent to strings.foreach(str => sb.append(str)), the type of str => sb.append(str) is String => StringBuilder (because all append methods on StringBuilder return the builder itself), and foreach on List[String] expects a String => Unit.
You can add the -Ywarn-value-discard compiler option to be warned when this happens (and then write for (str <- strings) { sb.append(str); () } explicitly).
Or you can actually go with a trick of defining your own Unit (possibly changing the name to avoid confusion for anyone reading your code):
object Unit
type Unit = Unit.type
implicit def unit2scalaunit(a: Unit): scala.Unit = ()
implicit def scalaunit2unit(a: scala.Unit): Unit = Unit
This should avoid running into the problem with Future you describe.
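To connect this back to the Future[Unit] bug from the question, here is a hedged sketch (someFuture and operation are made-up names) of how value discarding drops the inner future, and what -Ywarn-value-discard flags:
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global

def operation(i: Int): Future[Unit] = Future(println(i))

val someFuture: Future[Int] = Future.successful(42)

// The expected type Future[Unit] forces the lambda body to Unit, so the inner
// Future[Unit] is discarded: it is started but never awaited or chained.
// -Ywarn-value-discard warns on this line.
val fireAndForget: Future[Unit] = someFuture.map(a => operation(a))

// The usual fix: flatMap, so that completion of operation is actually chained.
val chained: Future[Unit] = someFuture.flatMap(a => operation(a))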
Unit is not a supertype of everything! Scala actually has a pretty wide variety of conversions that happen automatically and this is one of them. From section 6.26.1 Value Conversions of the Scala language spec, one of the conversions is
Value Discarding
If e has some value type and the expected type is Unit, e is converted
to the expected type by embedding it in the term { e; () }.
So when you write something like val a: Unit = 1, it gets processed into val a: Unit = { 1; () }, which is quite different. The warning is actually very helpful here, because it tells you that you probably did something wrong: the expression you are trying to put into statement position is pure (it has no side effects), so evaluating it has no effect on the final output (except possibly to make the program diverge).
I'm using the latest SJS version (master) and the application extends SparkHiveJob. In the runJob implementation, I have the following:
val eDF1 = hive.applySchema(rowRDD1, schema)
I would like to persist eDF1, so I tried the following:
val rdd_topersist = namedObjects.getOrElseCreate("cleanedDF1", {
  NamedDataFrame(eDF1, true, StorageLevel.MEMORY_ONLY)
})
where the following compile errors occur
could not find implicit value for parameter persister: spark.jobserver.NamedObjectPersister[spark.jobserver.NamedDataFrame]
not enough arguments for method getOrElseCreate: (implicit timeout:scala.concurrent.duration.FiniteDuration, implicit persister:spark.jobserver.NamedObjectPersister[spark.jobserver.NamedDataFrame])spark.jobserver.NamedDataFrame. Unspecified value parameter persister.
Obviously something is wrong, but I can't figure out what. I'm fairly new to Scala.
Can someone help me understand this syntax from NamedObjectSupport?
def getOrElseCreate[O <: NamedObject](name: String, objGen: => O)
    (implicit timeout: FiniteDuration = defaultTimeout,
     persister: NamedObjectPersister[O]): O
I think you need to define an implicit persister. Looking at the test code, I see something like this:
https://github.com/spark-jobserver/spark-jobserver/blob/ea34a8f3e3c90af27aa87a165934d5eb4ea94dee/job-server-extras/test/spark.jobserver/NamedObjectsSpec.scala#L20
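As a sketch only (assuming job-server-extras provides a DataFramePersister, as the linked NamedObjectsSpec suggests), bringing a persister into implicit scope before the call should satisfy the missing parameter:
import org.apache.spark.storage.StorageLevel
import spark.jobserver.{DataFramePersister, NamedDataFrame, NamedObjectPersister}

// Assumption: DataFramePersister comes from job-server-extras, as in the linked spec.
implicit val dataFramePersister: NamedObjectPersister[NamedDataFrame] = new DataFramePersister

val rdd_topersist = namedObjects.getOrElseCreate("cleanedDF1", {
  NamedDataFrame(eDF1, true, StorageLevel.MEMORY_ONLY)
})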
Why does this compile?
scala> import scala.concurrent.Future
import scala.concurrent.Future
scala> val f: Unit = Future.successful(())
f: Unit = ()
I expected the compiler to complain about the assignment.
This is called "Value Discarding". Citing the scala specification (6.26.1):
Value Discarding
If e has some value type and the expected type is Unit, e is converted to the expected type by embedding it in the term { e; () }.
In other words, any value, whatever its type, is implicitly converted to Unit, effectively discarding it.
If you want to be warned about such discarding (which can in some cases hide a bug), you can pass the -Ywarn-value-discard option to the compiler. You'll then have to explicitly return () every time you call a method only for its side effect when that method returns a non-Unit value.
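For example, a minimal sketch of what that looks like in practice (a hypothetical method, with -Ywarn-value-discard enabled):
import scala.concurrent.Future

def fireAndForget(): Unit =
  Future.successful(()) // warning: discarded non-Unit value

def fireAndForgetSilenced(): Unit = {
  Future.successful(()) // called only for its (hypothetical) side effect
  ()                    // explicitly return () to avoid the warning
}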
The compiler is fine with this, since evaluating f will only execute the call
val f: Unit = Future.successful(())
and the return value will go into nirvana (it is simply discarded).
Basically this is the same as:
val f: Unit = {
  Future.successful(())
  ()
}
If the compiler doesn't find the Unit it expects as the last value of the block, it will put it there.
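If the intent was to keep the future rather than discard it, annotating the expected type as Future[Unit] avoids the conversion entirely; a trivial sketch:
import scala.concurrent.Future

val f: Future[Unit] = Future.successful(()) // no value discarding: f keeps the Future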
I am writing a set of methods that extend Spark's RDD API.
I have to implement a general method for storing the RDDs, and for a start I tried to wrap spark-cassandra-connector's saveAsCassandraTable, without success.
Here's the "extending RDD's API" part:
object NewRDDFunctions {
  implicit def addStorageFunctions[T](rdd: RDD[T]): RDDStorageFunctions[T] =
    new RDDStorageFunctions(rdd)
}

class RDDStorageFunctions[T](rdd: RDD[T]) {
  def saveResultsToCassandra() {
    rdd.saveAsCassandraTable("ks_name", "table_name") // this line produces errors!
  }
}
...and importing the object as: import ...NewRDDFunctions._.
The marked line produces the following errors:
Error:(99, 29) could not find implicit value for parameter rwf: com.datastax.spark.connector.writer.RowWriterFactory[T]
rdd.saveAsCassandraTable("ks_name", "table_name")
^
Error:(99, 29) not enough arguments for method saveAsCassandraTable: (implicit connector: com.datastax.spark.connector.cql.CassandraConnector, implicit rwf: com.datastax.spark.connector.writer.RowWriterFactory[T], implicit columnMapper: com.datastax.spark.connector.mapper.ColumnMapper[T])Unit.
Unspecified value parameters rwf, columnMapper.
rdd.saveAsCassandraTable("ks_name", "table_name")
^
I don't get why this doesn't work since saveAsCassandraTable is designed to work on any RDD. Any suggestions?
I had a similar problem with the example in the spark-cassandra-connector docs:
case class WordCount(word: String, count: Long)
val collection = sc.parallelize(Seq(WordCount("dog", 50), WordCount("cow", 60)))
collection.saveAsCassandraTable("test", "words_new", SomeColumns("word", "count"))
...and the solution was to move the case class definition out of the "main" function (but I don't really know if this applies to the problem mentioned above...).
saveAsCassandraTable needs three implicit parameters. The first one (connector) has a default value, but the last two (rwf and columnMapper) are not in implicit scope in your saveResultsToCassandra method; as a consequence, your method doesn't compile.
Look at this answer on another question, if you need some more information about implicits.
Turning your saveResultsToCassandra into the method below should work, provided you have defined your tables (TableDef) beforehand.
import com.datastax.spark.connector.mapper.ColumnMapper
import com.datastax.spark.connector.writer.RowWriterFactory

def saveResultsToCassandra()(
  // implicit parameters as a separate list!
  implicit rwf: RowWriterFactory[T],
  columnMapper: ColumnMapper[T]
) {
  rdd.saveAsCassandraTable("ks_name", "table_name")
}
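Hypothetically, with a concrete element type the two implicits then resolve at the call site. Here is a sketch reusing the WordCount case class from the connector docs (defined at the top level, not inside main) and assuming a SparkContext named sc:
import com.datastax.spark.connector._ // brings the connector's implicits into scope
import NewRDDFunctions._              // your package prefix elided, as in the question

case class WordCount(word: String, count: Long) // top-level definition

val words = sc.parallelize(Seq(WordCount("dog", 50), WordCount("cow", 60)))
words.saveResultsToCassandra() // ColumnMapper and RowWriterFactory are derived for WordCount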