I found that a Mockito mock object throws a ClassNotFoundException when used in Spark. Here is a minimal example:
import org.apache.spark.{SparkConf, SparkContext}
import org.mockito.{Matchers, Mockito}
import org.scalatest.FlatSpec
import org.scalatest.mockito.MockitoSugar

trait MyTrait {
  def myMethod(a: Int): Int
}

class MyTraitTest extends FlatSpec with MockitoSugar {
  "Mock" should "work in Spark" in {
    val m = mock[MyTrait](Mockito.withSettings().serializable())
    Mockito.when(m.myMethod(Matchers.any())).thenReturn(1)
    val conf = new SparkConf().setAppName("testApp").setMaster("local")
    val sc = new SparkContext(conf)
    assert(sc.makeRDD(Seq(1, 2, 3)).map(m.myMethod).first() == 1)
  }
}
Running the test throws the following exception:
[info] MyTraitTest:
[info] Mock
[info] - should work in Spark *** FAILED ***
[info] org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 1 times, most recent failure: Lost task 0.0 in stage 0.0 (TID 0, localhost, executor driver): java.lang.ClassNotFoundException: MyTrait$$EnhancerByMockitoWithCGLIB$$6d9e95a8
[info] at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
[info] at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
[info] at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
[info] at java.lang.Class.forName0(Native Method)
[info] at java.lang.Class.forName(Class.java:348)
[info] at org.apache.spark.serializer.JavaDeserializationStream$$anon$1.resolveClass(JavaSerializer.scala:67)
[info] at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1819)
[info] at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1713)
[info] at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1986)
[info] at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1535)
[info] at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2231)
[info] at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2155)
[info] at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2013)
[info] at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1535)
[info] at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2231)
[info] at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2155)
[info] at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2013)
[info] at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1535)
[info] at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2231)
[info] at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2155)
[info] at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2013)
[info] at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1535)
[info] at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2231)
[info] at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2155)
[info] at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2013)
[info] at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1535)
[info] at java.io.ObjectInputStream.readObject(ObjectInputStream.java:422)
[info] at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:75)
[info] at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:114)
[info] at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:80)
[info] at org.apache.spark.scheduler.Task.run(Task.scala:99)
[info] at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:322)
[info] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
[info] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
[info] at java.lang.Thread.run(Thread.java:745)
The stack trace hints that this is related to dynamic class loading, but I don't know how to fix it.
Update:
Apparently, changing
val m = mock[MyTrait](Mockito.withSettings().serializable())
to
val m = mock[MyTrait](Mockito.withSettings().serializable(SerializableMode.ACROSS_CLASSLOADERS))
makes the exception disappear. However, I do not follow why this fix is necessary. I thought that in Spark local mode a single JVM hosts both the driver and the executor. So it must be that a different ClassLoader is used to load the deserialized class on the executor?
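For completeness, the working version also needs an import; a minimal sketch of the fixed mock creation (SerializableMode lives in org.mockito.mock):
import org.mockito.Mockito
import org.mockito.mock.SerializableMode

// My understanding is that ACROSS_CLASSLOADERS makes Mockito serialize the
// mock through a proxy that re-creates the mock on deserialization, instead
// of referencing the generated class only the creating ClassLoader knows.
val m = mock[MyTrait](
  Mockito.withSettings().serializable(SerializableMode.ACROSS_CLASSLOADERS))
That would be consistent with the hypothesis above: even in local mode, Spark's executor appears to deserialize tasks with its own MutableURLClassLoader rather than the loader that generated the mock class, so plain serializable() cannot resolve the MyTrait$$EnhancerByMockitoWithCGLIB class.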
I cannot get basic tests using matchers to work with Scala.js and a Three.js facade. The following code fails with "scala.scalajs.js.JavaScriptException: TypeError: Cannot read property 'isArray__Z' of null".
import org.denigma.threejs.Vector3
import org.scalatest.{FlatSpec, Matchers}

class TestVector extends FlatSpec with Matchers {
  "Vector" should "work" in {
    val v000 = new Vector3(0, 0, 0)
    v000 should be (v000)
  }
}
The complete project is shared on GitHub.
I guess it must be something related to the facade's native JS types, as tests with classes defined in Scala work fine.
I did not find any documented gotchas regarding Scala.js and matchers. Am I missing something obvious?
Complete stack trace of the error:
[info] - should work *** FAILED ***
[info] scala.scalajs.js.JavaScriptException: TypeError: Cannot read property 'isArray__Z' of null
[info] at scala.runtime.ScalaRunTime$.isArrayClass(W:\Test\JSTest\target\scala-2.11\jstest-test-fastopt.js:19470:14)
[info] at scala.runtime.ScalaRunTime$.isArray(W:\Test\JSTest\target\scala-2.11\jstest-test-fastopt.js:19461:32)
[info] at org.scalatest.Assertions$.areEqualComparingArraysStructurally(W:\Test\JSTest\target\scala-2.11\jstest-test-fastopt.js:30299:29)
[info] at org.scalatest.words.BeWord$$anon$15.apply(W:\Test\JSTest\target\scala-2.11\jstest-test-fastopt.js:26853:49)
[info] at org.scalatest.Matchers$ShouldMethodHelper$.shouldMatcher(W:\Test\JSTest\target\scala-2.11\jstest-test-fastopt.js:8592:29)
[info] at org.scalatest.Matchers$AnyShouldWrapper.should(W:\Test\JSTest\target\scala-2.11\jstest-test-fastopt.js:8523:115)
[info] at {anonymous}()(W:\Test\JSTest\target\scala-2.11\jstest-test-fastopt.js:54138:20)
[info] at scala.scalajs.runtime.AnonFunction0.apply(W:\Test\JSTest\target\scala-2.11\jstest-test-fastopt.js:29505:23)
[info] at org.scalatest.OutcomeOf$class.outcomeOf(W:\Test\JSTest\target\scala-2.11\jstest-test-fastopt.js:2911:7)
[info] at org.scalatest.Transformer.apply(W:\Test\JSTest\target\scala-2.11\jstest-test-fastopt.js:37772:10)
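A possible workaround, assuming the root cause is that getClass returns null for JS native types (which would make the be matcher's call into areEqualComparingArraysStructurally NPE, as the trace suggests): compare the vector's numeric components, which are plain Doubles, instead of the facade object itself. A minimal sketch, assuming the Vector3 facade exposes x/y/z as Doubles:
import org.denigma.threejs.Vector3
import org.scalatest.{FlatSpec, Matchers}

class TestVectorComponents extends FlatSpec with Matchers {
  "Vector" should "work" in {
    val v000 = new Vector3(0, 0, 0)
    // Matching on Doubles never calls getClass on the JS facade type.
    v000.x shouldBe 0.0
    v000.y shouldBe 0.0
    v000.z shouldBe 0.0
  }
}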
I need some help understanding the errors generated for the Scala class RandomForestAlgorithm.scala (https://github.com/PredictionIO/PredictionIO/blob/develop/examples/scala-parallel-classification/custom-attributes/src/main/scala/RandomForestAlgorithm.scala).
I am building the project as is (the custom-attributes classification template) in PredictionIO and am getting a pio build error:
hduser@hduser-VirtualBox:~/PredictionIO/classTest$ pio build --verbose
[INFO] [Console$] Using existing engine manifest JSON at /home/hduser/PredictionIO/classTest/manifest.json
[INFO] [Console$] Using command '/home/hduser/PredictionIO/sbt/sbt' at the current working directory to build.
[INFO] [Console$] If the path above is incorrect, this process will fail.
[INFO] [Console$] Uber JAR disabled. Making sure lib/pio-assembly-0.9.5.jar is absent.
[INFO] [Console$] Going to run: /home/hduser/PredictionIO/sbt/sbt package assemblyPackageDependency
[INFO] [Console$] [info] Loading project definition from /home/hduser/PredictionIO/classTest/project
[INFO] [Console$] [info] Set current project to template-scala-parallel-classification (in build file:/home/hduser/PredictionIO/classTest/)
[INFO] [Console$] [info] Compiling 1 Scala source to /home/hduser/PredictionIO/classTest/target/scala-2.10/classes...
[INFO] [Console$] [error] /home/hduser/PredictionIO/classTest/src/main/scala/RandomForestAlgorithm.scala:28: class RandomForestAlgorithm needs to be abstract, since method train in class P2LAlgorithm of type (sc: org.apache.spark.SparkContext, pd: com.test1.PreparedData)com.test1.PIORandomForestModel is not defined
[INFO] [Console$] [error] class RandomForestAlgorithm(val ap: RandomForestAlgorithmParams) // CHANGED
[INFO] [Console$] [error] ^
[INFO] [Console$] [error] one error found
[INFO] [Console$] [error] (compile:compile) Compilation failed
[INFO] [Console$] [error] Total time: 6 s, completed Jun 8, 2016 4:37:36 PM
[ERROR] [Console$] Return code of previous step is 1. Aborting.
so when I address the line causing the error and make the class abstract:
// extends P2LAlgorithm because the MLlib's RandomForestModel doesn't
// contain RDD.
abstract class RandomForestAlgorithm(val ap: RandomForestAlgorithmParams) // CHANGED
  extends P2LAlgorithm[PreparedData, PIORandomForestModel, // CHANGED
    Query, PredictedResult] {

  def train(data: PreparedData): PIORandomForestModel = { // CHANGED
    // CHANGED
    // Empty categoricalFeaturesInfo indicates all features are continuous.
    val categoricalFeaturesInfo = Map[Int, Int]()
    val m = RandomForest.trainClassifier(
      data.labeledPoints,
      ap.numClasses,
      categoricalFeaturesInfo,
      ap.numTrees,
      ap.featureSubsetStrategy,
      ap.impurity,
      ap.maxDepth,
      ap.maxBins)
    new PIORandomForestModel(
      gendersMap = data.gendersMap,
      educationMap = data.educationMap,
      randomForestModel = m
    )
  }
}
pio build is then successful, but training fails: the stack trace shows PredictionIO trying to instantiate the algorithm reflectively, which cannot work for an abstract class:
[INFO] [Engine] Extracting datasource params...
[INFO] [WorkflowUtils$] No 'name' is found. Default empty String will be used.
[INFO] [Engine] Datasource params: (,DataSourceParams(6))
[INFO] [Engine] Extracting preparator params...
[INFO] [Engine] Preparator params: (,Empty)
[INFO] [Engine] Extracting serving params...
[INFO] [Engine] Serving params: (,Empty)
[WARN] [Utils] Your hostname, hduser-VirtualBox resolves to a loopback address: 127.0.1.1; using 10.0.2.15 instead (on interface eth0)
[WARN] [Utils] Set SPARK_LOCAL_IP if you need to bind to another address
[INFO] [Remoting] Starting remoting
[INFO] [Remoting] Remoting started; listening on addresses :[akka.tcp://sparkDriver@10.0.2.15:59444]
[WARN] [MetricsSystem] Using default name DAGScheduler for source because spark.app.id is not set.
Exception in thread "main" java.lang.InstantiationException
at sun.reflect.InstantiationExceptionConstructorAccessorImpl.newInstance(InstantiationExceptionConstructorAccessorImpl.java:48)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at io.prediction.core.Doer$.apply(AbstractDoer.scala:52)
at io.prediction.controller.Engine$$anonfun$1.apply(Engine.scala:171)
at io.prediction.controller.Engine$$anonfun$1.apply(Engine.scala:170)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.immutable.List.foreach(List.scala:318)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
at scala.collection.AbstractTraversable.map(Traversable.scala:105)
at io.prediction.controller.Engine.train(Engine.scala:170)
at io.prediction.workflow.CoreWorkflow$.runTrain(CoreWorkflow.scala:65)
at io.prediction.workflow.CreateWorkflow$.main(CreateWorkflow.scala:247)
at io.prediction.workflow.CreateWorkflow.main(CreateWorkflow.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:672)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:120)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
So two questions:
1. Why is the following model not considered defined during the build:
class PIORandomForestModel(
  val gendersMap: Map[String, Double],
  val educationMap: Map[String, Double],
  val randomForestModel: RandomForestModel
) extends Serializable
2. How can I define PIORandomForestModel in a way that does not throw a pio build error and lets training assign attributes to the model?
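For reference, the compile error itself spells out the signature the compiler expects: train must take both a SparkContext and the prepared data. A sketch of a concrete class matching that signature (assuming the template's surrounding types; keeping the class concrete would avoid the reflective InstantiationException at training time):
import io.prediction.controller.P2LAlgorithm
import org.apache.spark.SparkContext
import org.apache.spark.mllib.tree.RandomForest

// Concrete class; train matches the two-argument signature named
// in the compile error: (sc: SparkContext, pd: PreparedData).
class RandomForestAlgorithm(val ap: RandomForestAlgorithmParams)
  extends P2LAlgorithm[PreparedData, PIORandomForestModel,
    Query, PredictedResult] {

  def train(sc: SparkContext, data: PreparedData): PIORandomForestModel = {
    // Empty categoricalFeaturesInfo indicates all features are continuous.
    val categoricalFeaturesInfo = Map[Int, Int]()
    val m = RandomForest.trainClassifier(
      data.labeledPoints,
      ap.numClasses,
      categoricalFeaturesInfo,
      ap.numTrees,
      ap.featureSubsetStrategy,
      ap.impurity,
      ap.maxDepth,
      ap.maxBins)
    new PIORandomForestModel(
      gendersMap = data.gendersMap,
      educationMap = data.educationMap,
      randomForestModel = m)
  }

  // predict(...) as in the template, unchanged.
}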
I have posted this question in the PredictionIO Google group but have not gotten a response.
Thanks in advance for your help.
This code just calls getPackage and getName on a class (it does not use any mocking techniques yet), but it fails.
Has anyone seen this problem before?
Code:
import mai.MyScala1
import org.junit.Test
import org.junit.runner.RunWith
import org.powermock.modules.junit4.PowerMockRunner
import org.scalatest.junit.JUnitSuite

@RunWith(classOf[PowerMockRunner])
class MyTest extends JUnitSuite {
  @Test def test1() {
    classOf[MyScala1].getPackage // this one returns null
    classOf[MyScala1].getPackage.getName // raises java.lang.NullPointerException
  }
}
Error logs:
[info] - test1 *** FAILED ***
[info] java.lang.NullPointerException:
[info] at org.apache.tmp.MyTest.test1(MyTest.scala:15)
[info] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
[info] at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
[info] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
[info] at java.lang.reflect.Method.invoke(Method.java:497)
[info] at org.junit.internal.runners.TestMethod.invoke(TestMethod.java:68)
[info] at org.powermock.modules.junit4.internal.impl.PowerMockJUnit44RunnerDelegateImpl$PowerMockJUnit44MethodRunner.runTestMethod(PowerMockJUnit44RunnerDelegateImpl.java:310)
[info] at org.junit.internal.runners.MethodRoadie$2.run(MethodRoadie.java:89)
[info] at org.junit.internal.runners.MethodRoadie.runBeforesThenTestThenAfters(MethodRoadie.java:97)
[info] at org.powermock.modules.junit4.internal.impl.PowerMockJUnit44RunnerDelegateImpl$PowerMockJUnit44MethodRunner.executeTest(PowerMockJUnit44RunnerDelegateImpl.java:294)
[info] ...
I think I found the answer, but I don't understand why it works.
Answer: inside sbt, we should first run "set fork := true"; then everything works fine.
The problem might be a process-versus-thread issue.
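If that is right, the setting can be made permanent in build.sbt rather than typed into the sbt shell each time; a minimal sketch (sbt 0.13 syntax):
// build.sbt: run tests in a forked JVM instead of inside sbt's own JVM
fork in Test := true
A plausible explanation for why forking matters: inside sbt's JVM, test classes are loaded by sbt's layered classloader, which does not define java.lang.Package entries, so Class.getPackage returns null; a forked JVM loads classes with a standard classloader that does.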
I am currently testing a web service, and I keep running into an error where the test fails because it times out. I am trying to extend that timeout to 5 seconds, mimicking a solution that someone posted on the Scala Spray Google Groups forum, to no avail. Here is the code I am trying to use in my test:
import akka.testkit._
import akka.actor.ActorSystem
import com.github.nfldb.config.{NflDbApiActorSystemConfig, NflDbApiDbConfigTest}
import org.scalatest.MustMatchers
import org.specs2.mutable.Specification
import spray.testkit.Specs2RouteTest
import spray.routing.HttpService
import spray.http.StatusCodes._
import spray.json.DefaultJsonProtocol._
import spray.httpx.SprayJsonSupport._
import concurrent.duration._

/**
 * Created by chris on 8/25/15.
 */
class NflPlayerScoringSvcTest extends Specification with Specs2RouteTest with NflPlayerScoringService
  with NflDbApiDbConfigTest with NflDbApiActorSystemConfig {

  import PlayerScoreProtocol.playerScoreProtocol

  implicit def actorRefFactory = actorSystem
  implicit def default(system: ActorSystem = actorSystem) = RouteTestTimeout(new DurationInt(5).second.dilated)

  "NflPlayerScoringSvc" should {
    "return hello" in {
      Get("/hello") ~> nflPlayerScoringServiceRoutes ~> check {
        responseAs[String] must contain("Say hello")
      }
    }

    "calculate a player's score for a given week" in {
      import PlayerScoreProtocol.playerScoreProtocol
      Get("/playerScore?playerId=00-0031237&gsisId=2015081551") ~> nflPlayerScoringServiceRoutes ~> check {
        val playerScore: DfsNflScoringEngineComponent.PlayerScore = responseAs[DfsNflScoringEngineComponent.PlayerScore]
        playerScore.playerId must be ("00-0031237")
      }
    }
  }
}
and here is the error that I am receiving:
> test-only *NflPlayerScoringSvcTest*
[info] Compiling 1 Scala source to /home/chris/dev/suredbits-dfs/target/scala-2.11/test-classes...
15:54:54.639 TKD [com-suredbits-dfs-nfl-scoring-NflPlayerScoringSvcTest-akka.actor.default-dispatcher-4] INFO akka.event.slf4j.Slf4jLogger - Slf4jLogger started
15:54:55.158 TKD [NflDbApiActorSystemConfig-akka.actor.default-dispatcher-2] INFO akka.event.slf4j.Slf4jLogger - Slf4jLogger started
15:54:55.228 TKD [NflDbApiActorSystemConfig-akka.actor.default-dispatcher-2] INFO test test test - Trying to find score for player: 00-0031237 and optional gsisId: Some(2015081551)
15:54:55.228 TKD [NflDbApiActorSystemConfig-akka.actor.default-dispatcher-2] INFO test test test - Searching for player 00-0031237 with optional game: Some(2015081551)
15:54:55.268 TKD [NflDbApiActorSystemConfig-akka.actor.default-dispatcher-4] INFO c.s.d.n.s.NflPlayerScoringSvcTest - Creating database for class com.suredbits.dfs.nfl.scoring.NflPlayerScoringSvcTest
[info] NflPlayerScoringSvcTest
[info]
[info] NflPlayerScoringSvc should
[info] + return hello
[info] x calculate a player's score for a given week
[error] Request was neither completed nor rejected within 1 second (NflPlayerScoringSvcTest.scala:33)
[info]
[info]
[info] Total for specification NflPlayerScoringSvcTest
[info] Finished in 1 second, 310 ms
[info] 2 examples, 1 failure, 0 error
[info] ScalaTest
[info] Run completed in 3 seconds, 455 milliseconds.
[info] Total number of tests run: 0
[info] Suites: completed 0, aborted 0
[info] Tests: succeeded 0, failed 0, canceled 0, ignored 0, pending 0
[info] No tests were executed.
[error] Failed: Total 2, Failed 1, Errors 0, Passed 1
[error] Failed tests:
[error] com.suredbits.dfs.nfl.scoring.NflPlayerScoringSvcTest
[error] (test:testOnly) sbt.TestsFailedException: Tests unsuccessful
[error] Total time: 11 s, completed Aug 25, 2015 3:54:56 PM
> 15:54:56.799 TKD [NflDbApiActorSystemConfig-akka.actor.default-dispatcher-2] INFO c.s.d.n.s.NflPlayerScoringSvcTest - Calculating score for game: NflGame(2015081551,Some(56772),2015-08-16T00:00:00.000Z,NflPreSeasonWeek1,Saturday,2015,Preseason,true,HomeTeam(MIN,26,9,14,3,0,2),AwayTeam(TB,16,3,6,7,0,1),2015-05-22T21:54:43.143Z,2015-08-16T17:29:01.729Z) and player: NflPlayer(00-0031237,Some(T.Bridgewater),Some(Teddy Bridgewater),Some(Teddy),Some(Bridgewater),MIN,QB,Some(2543465),Some(http://www.nfl.com/player/teddybridgewater/2543465/profile),Some(5),Some(11/10/1992),Some(Louisville),Some(2),Some(74),Some(215),Active)
Can anyone provide any insight as to what I can do to extend the timeout in Scala Spray?
Here is the solution:
implicit def default(implicit system: ActorSystem) = RouteTestTimeout(new DurationInt(5).second.dilated(system))
I have to explicitly pass the system parameter to the dilated method because of implicit conflicts in Scala.
Similar to the previous answer, but a bit shorter:
implicit def default(implicit system: ActorSystem) = RouteTestTimeout(5.seconds)
Here is the change you will need to make:
implicit def default(implicit system: ActorSystem): RouteTestTimeout = RouteTestTimeout(new DurationInt(8).second.dilated(system))
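For context, a minimal sketch of where such a declaration lives (assuming spray 1.x with Specs2RouteTest; MyServiceTimeoutSpec is a hypothetical name). The check block resolves an implicit RouteTestTimeout, so a declaration inside the spec takes precedence over the 1-second default from the RouteTestTimeout companion object:
import akka.actor.ActorSystem
import akka.testkit._
import org.specs2.mutable.Specification
import spray.testkit.{RouteTestTimeout, Specs2RouteTest}
import scala.concurrent.duration._

class MyServiceTimeoutSpec extends Specification with Specs2RouteTest {
  // dilated scales the 5-second timeout by akka.test.timefactor,
  // which is useful on slow CI machines.
  implicit def default(implicit system: ActorSystem): RouteTestTimeout =
    RouteTestTimeout(5.seconds.dilated(system))

  // route tests using Get(...) ~> route ~> check { ... } go here
}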