I'm playing with Akka Streams and have been trying to do some enrichment and processing of events polled from a MongoDB collection. I'm having some doubts, however, about the best approach to implement my event enrichers, which may need to connect to an external data source.
mapFuture seems like a good fit, but I'm running into issues:
class EventEnricherActor extends Actor with ActorLogging {
  // ...
  def receive = {
    case e: Event =>
      sender ! augmentEvent(e)
  }
}
And my app:
val enricherActor = actorSystem.actorOf(Props[EventEnricherActor])

Flow(mongodbConsumer).
  mapFuture(msg => enricherActor ? msg).
  onComplete(materializer) { _ => actorSystem.shutdown() }
However, I'm stuck at this error:
java.lang.ClassCastException: messages.Event cannot be cast to scala.runtime.Nothing$
on the call to mapFuture.
What am I missing?
Any better ideas to deal with these enrichments?
Update
Stack trace:
[ERROR] [08/12/2014 11:48:07.314] [actor-system-akka.actor.default-dispatcher-2] [akka://actor-system/user/flow-1-2-transform] failure during processing
java.lang.ClassCastException: messages.Event cannot be cast to scala.runtime.Nothing$
at apps.Consume$$anonfun$1$$anonfun$apply$1.apply(Consume.scala:47)
at akka.stream.impl.MapFutureProcessorImpl$$anonfun$1.apply$mcV$sp(MapFutureProcessorImpl.scala:125)
at akka.stream.impl.Pump$$anonfun$pump$1.apply$mcV$sp(Transfer.scala:163)
at akka.stream.impl.Pump$$anonfun$pump$1.apply(Transfer.scala:163)
at akka.stream.impl.Pump$$anonfun$pump$1.apply(Transfer.scala:163)
at akka.stream.impl.ActorBasedFlowMaterializer$.withCtx(ActorBasedFlowMaterializer.scala:133)
at akka.stream.impl.Pump$class.pump(Transfer.scala:163)
at akka.stream.impl.ActorProcessorImpl.pump(ActorProcessor.scala:238)
at akka.stream.impl.BatchingInputBuffer.enqueueInputElement(ActorProcessor.scala:93)
at akka.stream.impl.BatchingInputBuffer$$anonfun$upstreamRunning$1.applyOrElse(ActorProcessor.scala:140)
at scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33)
at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33)
at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25)
at akka.stream.impl.SubReceive.apply(Transfer.scala:18)
at akka.stream.impl.SubReceive.apply(Transfer.scala:14)
at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:118)
at akka.stream.impl.SubReceive.applyOrElse(Transfer.scala:14)
at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:165)
at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:166)
at akka.actor.Actor$class.aroundReceive(Actor.scala:465)
at akka.stream.impl.ActorProcessorImpl.aroundReceive(ActorProcessor.scala:238)
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
at akka.actor.ActorCell.invoke(ActorCell.scala:487)
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:238)
at akka.dispatch.Mailbox.run(Mailbox.scala:220)
at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:393)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Thanks
My mistake, mongodbConsumer (an ActorProducer) was untyped!
So this:
val mongodbConsumer = ActorProducer[Event](system.actorOf(MongodbConsumerActor.props(db)))
Instead of:
val mongodbConsumer = ActorProducer(system.actorOf(MongodbConsumerActor.props(db)))
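For what it's worth, since the ask (?) returns a Future[Any], it can also help to map the enriched result to the expected element type explicitly. A minimal sketch, assuming an implicit Timeout is in scope for the ask and keeping the same Flow API as above:
import akka.pattern.ask
import akka.util.Timeout
import scala.concurrent.duration._

implicit val timeout = Timeout(5.seconds)

Flow(mongodbConsumer).
  mapFuture(msg => (enricherActor ? msg).mapTo[Event]).  // cast the Future[Any] to Future[Event]
  onComplete(materializer) { _ => actorSystem.shutdown() }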
I'm trying to develop a KNN (k-nearest neighbours) model in Flink (Scala), using Zeppelin.
This is part of my simple code:
// Reading data
val mapped: DataSet[Vector] = data.map { x => DenseVector(x._1, x._2) }

// Create algorithm
val knn = KNN()
  .setK(3)
  .setBlocks(10)
  .setDistanceMetric(SquaredEuclideanDistanceMetric())
  .setUseQuadTree(false)
  .setSizeHint(CrossHint.SECOND_IS_SMALL)
...

// Just to learn, I predict with the model on the same data
val result = knn.predict(mapped).collect()
When I print the data or call the predict method, I get this error:
org.apache.flink.client.program.ProgramInvocationException: The program execution failed: Job execution failed.
at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:409)
at org.apache.flink.client.program.StandaloneClusterClient.submitJob(StandaloneClusterClient.java:95)
at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:382)
at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:369)
at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:344)
at org.apache.flink.client.RemoteExecutor.executePlanWithJars(RemoteExecutor.java:211)
at org.apache.flink.client.RemoteExecutor.executePlan(RemoteExecutor.java:188)
at org.apache.flink.api.java.RemoteEnvironment.execute(RemoteEnvironment.java:172)
at org.apache.flink.api.java.ExecutionEnvironment.execute(ExecutionEnvironment.java:896)
at org.apache.flink.api.scala.ExecutionEnvironment.execute(ExecutionEnvironment.scala:637)
at org.apache.flink.api.scala.DataSet.collect(DataSet.scala:547)
... 36 elided
Caused by: org.apache.flink.runtime.client.JobExecutionException: Job execution failed.
at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$7.apply$mcV$sp(JobManager.scala:822)
at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$7.apply(JobManager.scala:768)
at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$7.apply(JobManager.scala:768)
at scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
at scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:41)
at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:401)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.pollAndExecAll(ForkJoinPool.java:1253)
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1346)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Caused by: org.apache.flink.api.common.io.ParseException: Line could not be parsed: '-6.59 -44.68'
ParserError NUMERIC_VALUE_FORMAT_ERROR
Expect field types: class java.lang.Double, class java.lang.Double
in file: /home/borja/flink/kmeans/points
at org.apache.flink.api.common.io.GenericCsvInputFormat.parseRecord(GenericCsvInputFormat.java:407)
at org.apache.flink.api.java.io.CsvInputFormat.readRecord(CsvInputFormat.java:110)
at org.apache.flink.api.common.io.DelimitedInputFormat.nextRecord(DelimitedInputFormat.java:470)
at org.apache.flink.api.java.io.CsvInputFormat.nextRecord(CsvInputFormat.java:78)
at org.apache.flink.runtime.operators.DataSourceTask.invoke(DataSourceTask.java:162)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:585)
at java.lang.Thread.run(Thread.java:748)
I don't know whether it's my fault in how I load the data, or whether it's related to something else.
Thanks for any help! :)
You haven't shown us the code you use to read and parse the data, which is where the error occurs. But given the error message, I'll hazard a guess that you are using readCsvFile with data that is delimited by spaces or tabs, and didn't specify the fieldDelimiter (which defaults to a comma). If that's the case, see the docs for how to configure the CSV parser.
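For example, a minimal sketch of reading a space-delimited points file with an explicit field delimiter (the path comes from the error message above; env is assumed to be the Flink ExecutionEnvironment):
// parse two space-separated doubles per line into a DataSet of tuples
val data: DataSet[(Double, Double)] = env.readCsvFile[(Double, Double)](
  "/home/borja/flink/kmeans/points",
  fieldDelimiter = " ")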
I have a simple ReactiveMongo (version 0.11.9) insert query, as below:
class MongoInsertQuery extends Query {
  val DbName = "events-db"
  val CollectionName = "EventStream"

  val driver = new MongoDriver
  val connection = driver.connection(List("127.0.0.1:27018"))

  def insert() = {
    val db = connection(DbName)
    val collection: BSONCollection = db(CollectionName)
    val futureResult: Future[WriteResult] = collection.insert(BSONDocument("type" -> "Product_Received"))
    futureResult onComplete {
      case Success(s) => println("Success " + s)
      case Failure(e) => {
        println("error " + e.getMessage)
        e.printStackTrace()
      }
    }
  }
}
Not that important, but a Scala Play REST controller calls that method:
def sayBeard = Action { request =>
  new MongoInsertQuery().insert()
  Ok(Json.obj("message" -> "TODO respond writeResult"))
}
When I call the endpoint,
$ curl -XGET localhost:9000/sayBeard
{"message":"TODO respond writeResult"}
It inserts successfully into MongoDB 2.4.14:
precise64(mongod-2.4.14) events-db> db.EventStream.find({})
{
"_id": ObjectId("5683337120906c127504eb79"),
"type": "Product_Received"
}
But the future completes with a Failure, and I get a readAndDeserialize error in onComplete:
[info] - play.api.Play - Application started (Dev)
error 83
java.lang.ArrayIndexOutOfBoundsException: 83
at org.jboss.netty.buffer.LittleEndianHeapChannelBuffer.getInt(LittleEndianHeapChannelBuffer.java:69)
at reactivemongo.api.SerializationPack$class.readAndDeserialize(serializationpack.scala:31)
at reactivemongo.api.BSONSerializationPack$.readAndDeserialize(serializationpack.scala:41)
at reactivemongo.api.collections.GenericCollection$$anonfun$insert$1$$anonfun$apply$12.apply(genericcollection.scala:279)
at reactivemongo.api.collections.GenericCollection$$anonfun$insert$1$$anonfun$apply$12.apply(genericcollection.scala:279)
at scala.util.Success$$anonfun$map$1.apply(Try.scala:237)
at scala.util.Try$.apply(Try.scala:192)
at scala.util.Success.map(Try.scala:237)
at scala.concurrent.Future$$anonfun$map$1.apply(Future.scala:235)
at scala.concurrent.Future$$anonfun$map$1.apply(Future.scala:235)
at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:32)
at scala.concurrent.impl.ExecutionContextImpl$AdaptedForkJoinTask.exec(ExecutionContextImpl.scala:121)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
My intention here is to read the WriteResult (n and the message) after inserting into the collection.
With MongoDB 3.0.6, it can't even find the server:
23:02:02.172 [ForkJoinPool-6-worker-9] ERROR reactivemongo.api.Failover2 - Got an error, no more attempts to do. Completing with a failure...
reactivemongo.core.actors.Exceptions$PrimaryUnavailableException$: MongoError['No primary node is available!']
at reactivemongo.core.actors.Exceptions$PrimaryUnavailableException$.<clinit>(actors.scala) ~[reactivemongo_2.11-0.11.9.jar:0.11.9]
at reactivemongo.core.actors.MongoDBSystem$$anonfun$reactivemongo$core$actors$MongoDBSystem$$pickChannel$4.apply(actors.scala:681) ~[reactivemongo_2.11-0.11.9.jar:0.11.9]
at reactivemongo.core.actors.MongoDBSystem$$anonfun$reactivemongo$core$actors$MongoDBSystem$$pickChannel$4.apply(actors.scala:681) ~[reactivemongo_2.11-0.11.9.jar:0.11.9]
at scala.Option.getOrElse(Option.scala:121) ~[scala-library-2.11.7.jar:?]
at reactivemongo.core.actors.MongoDBSystem$class.reactivemongo$core$actors$MongoDBSystem$$pickChannel(actors.scala:681) ~[reactivemongo_2.11-0.11.9.jar:0.11.9]
at reactivemongo.core.actors.MongoDBSystem$$anonfun$4.applyOrElse(actors.scala:335) ~[reactivemongo_2.11-0.11.9.jar:0.11.9]
at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:170) ~[scala-library-2.11.7.jar:?]
at akka.actor.Actor$class.aroundReceive(Actor.scala:467) ~[akka-actor_2.11-2.3.13.jar:?]
at reactivemongo.core.actors.LegacyDBSystem.aroundReceive(actors.scala:796) ~[reactivemongo_2.11-0.11.9.jar:0.11.9]
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516) ~[akka-actor_2.11-2.3.13.jar:?]
at akka.actor.ActorCell.invoke(ActorCell.scala:487) ~[akka-actor_2.11-2.3.13.jar:?]
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:238) ~[akka-actor_2.11-2.3.13.jar:?]
at akka.dispatch.Mailbox.run(Mailbox.scala:220) ~[akka-actor_2.11-2.3.13.jar:?]
at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397) ~[akka-actor_2.11-2.3.13.jar:?]
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) [scala-library-2.11.7.jar:?]
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) [scala-library-2.11.7.jar:?]
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) [scala-library-2.11.7.jar:?]
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) [scala-library-2.11.7.jar:?]
Though I do see a connection logged on the mongo side:
2015-12-30T23:21:57.688-0500 I NETWORK [initandlisten] connection accepted from 127.0.0.1:51663 #11 (11 connections now open)
Reference
http://reactivemongo.org/releases/0.11/documentation/tutorial/write-documents.html
https://groups.google.com/forum/#!topic/reactivemongo/0jxS1cgXjP8
So, the thing is that ReactiveMongo 0.11.4 supports MongoDB 3.0+, and then I had to update the MongoDB host (port 27017 instead of 27018):
val driver = new MongoDriver
val connection = driver.connection(List("127.0.0.1:27017"))
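With that in place, reading the fields of the WriteResult the question asks about could look roughly like this (a sketch against the ReactiveMongo 0.11 API, where n is the number of documents affected and errmsg the optional error message):
futureResult onComplete {
  case Success(writeResult) =>
    // ok tells whether the write succeeded, n how many documents were affected
    println(s"inserted n=${writeResult.n}, ok=${writeResult.ok}, errmsg=${writeResult.errmsg}")
  case Failure(e) => e.printStackTrace()
}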
Read
java.lang.ArrayIndexOutOfBoundsException on update with mongo 2.4.12 #394
I am using Scala 2.11.1, Hazelcast 3.5, and twitter-chill for serialization.
I have a class User and a serializer UserSerializer, both on the server (an sbt project) and on the client side (a Play 2 project).
I am storing User objects in Hazelcast.
My serializer works fine on the server, but the read method of the same serializer throws an exception on the client.
Here is my serializer:
// takes the bytes and converts them into a User object (deserializer)
@throws(classOf[IOException])
override def read(in: ObjectDataInput): User = {
  log.info("*********** Deserializer from Client called ")
  val bytes = ByteArrayConverter.getByteArray(in).toByteArray()
  log.info("Bytes size: " + bytes.length)
  val tryDecode: scala.util.Try[Object] = KryoInjection.invert(bytes)
  tryDecode match {
    case Success(obj) =>
      log.info(" ------ Success case executed ")
      obj.asInstanceOf[User]
    case Failure(e) =>
      log.info("------ Failure case executed ")
      log.error("deserializer from Client : {}", e.printStackTrace())
      null
  }
}
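For context, the Kryo round trip this serializer relies on looks roughly like this (just a sketch; the User fields here are hypothetical):
import com.twitter.chill.KryoInjection
import scala.util.Try

case class User(id: Long, name: String)                    // hypothetical fields

val bytes: Array[Byte] = KryoInjection(User(1L, "alice"))   // serialize to bytes
val decoded: Try[Any]  = KryoInjection.invert(bytes)        // deserialize, may fail
val user: Try[User]    = decoded.map(_.asInstanceOf[User])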
The problem is at this line:
at UserSerializer.read(UserSerializer.scala:44)
which is this line from the method above:
val tryDecode: scala.util.Try[Object] = KryoInjection.invert(bytes)
I don't know what the problem is, because the same code works fine on the Hazelcast server but fails on the client side.
Here is the complete stack trace from the client side (Play 2):
17:24:48.822 276580 [play-akka.actor.default-dispatcher-7] UserSerializer INFO - *********** Deserializer from Client called
17:24:48.839 276597 [play-akka.actor.default-dispatcher-7] ByteArrayConverter$ INFO - ---- object to byte array converter
17:24:48.841 276599 [play-akka.actor.default-dispatcher-7] ByteArrayConverter$ INFO - ---- conversion done now returning the byte array
17:24:48.841 276599 [play-akka.actor.default-dispatcher-7] UserSerializer INFO - Bytes size: 596
17:24:48.917 276675 [play-akka.actor.default-dispatcher-7] UserSerializer INFO - ------ Failure case executed
com.twitter.bijection.InversionFailure: Failed to invert: [B#1296562
at com.twitter.bijection.InversionFailure$$anonfun$partialFailure$1.applyOrElse(InversionFailure.scala:43)
at com.twitter.bijection.InversionFailure$$anonfun$partialFailure$1.applyOrElse(InversionFailure.scala:42)
at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:36)
at scala.util.Failure.recoverWith(Try.scala:203)
at com.twitter.bijection.Inversion$.attempt(Inversion.scala:30)
at com.twitter.chill.KryoInjection$.invert(KryoInjection.scala:29)
at UserSerializer.read(UserSerializer.scala:44)
at UserSerializer.read(UserSerializer.scala:22)
at com.hazelcast.nio.serialization.StreamSerializerAdapter.read(StreamSerializerAdapter.java:41)
at com.hazelcast.nio.serialization.SerializationServiceImpl.toObject(SerializationServiceImpl.java:276)
at com.hazelcast.client.spi.ClientProxy.toObject(ClientProxy.java:173)
at com.hazelcast.client.spi.ClientProxy.invoke(ClientProxy.java:131)
at com.hazelcast.client.proxy.ClientMapProxy.get(ClientMapProxy.java:200)
at play.api.mvc.ActionBuilder$$anonfun$apply$17.apply(Action.scala:464)
at play.api.mvc.ActionBuilder$$anonfun$apply$17.apply(Action.scala:464)
at play.api.mvc.ActionBuilder$$anonfun$apply$16.apply(Action.scala:433)
at play.api.mvc.ActionBuilder$$anonfun$apply$16.apply(Action.scala:432)
at play.api.mvc.Action$.invokeBlock(Action.scala:556)
at play.api.mvc.Action$.invokeBlock(Action.scala:555)
at play.api.mvc.ActionBuilder$$anon$1.apply(Action.scala:518)
at play.api.mvc.Action$$anonfun$apply$1$$anonfun$apply$4$$anonfun$apply$5.apply(Action.scala:130)
at play.api.mvc.Action$$anonfun$apply$1$$anonfun$apply$4$$anonfun$apply$5.apply(Action.scala:130)
at play.utils.Threads$.withContextClassLoader(Threads.scala:21)
at play.api.mvc.Action$$anonfun$apply$1$$anonfun$apply$4.apply(Action.scala:129)
at play.api.mvc.Action$$anonfun$apply$1$$anonfun$apply$4.apply(Action.scala:128)
at scala.Option.map(Option.scala:146)
at play.api.mvc.Action$$anonfun$apply$1.apply(Action.scala:128)
at play.api.mvc.Action$$anonfun$apply$1.apply(Action.scala:121)
at play.api.libs.iteratee.DoneIteratee$$anonfun$mapM$2.apply(Iteratee.scala:705)
at play.api.libs.iteratee.DoneIteratee$$anonfun$mapM$2.apply(Iteratee.scala:705)
at scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
at scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:41)
at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:393)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Caused by: com.esotericsoftware.kryo.KryoException: Unable to find class: User
at com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:138)
at com.esotericsoftware.kryo.util.DefaultClassResolver.readClass(DefaultClassResolver.java:115)
at com.esotericsoftware.kryo.Kryo.readClass(Kryo.java:610)
at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:721)
at com.twitter.chill.SerDeState.readClassAndObject(SerDeState.java:61)
at com.twitter.chill.KryoPool.fromBytes(KryoPool.java:94)
at com.twitter.chill.KryoInjection$$anonfun$invert$1.apply(KryoInjection.scala:30)
at com.twitter.chill.KryoInjection$$anonfun$invert$1.apply(KryoInjection.scala:30)
at com.twitter.bijection.Inversion$$anonfun$attempt$1.apply(Inversion.scala:30)
at scala.util.Try$.apply(Try.scala:192)
... 36 more
Caused by: java.lang.ClassNotFoundException: User
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:348)
at com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:136)
... 45 more
17:24:48.942 276700 [play-akka.actor.default-dispatcher-7] UserSerializer ERROR - deserializer from Client : ()
This is example code that came with Spark. I copied the code here, and this is the link to it: https://github.com/apache/spark/blob/master/examples/src/main/scala/org/apache/spark/examples/streaming/StatefulNetworkWordCount.scala. However, when I try to run the program using the command "bin/run-example org.apache.spark.examples.streaming.StatefulNetworkWordCount localhost 9999", I get the following error:
14/07/20 11:52:57 ERROR ActorSystemImpl: Uncaught fatal error from thread [spark-akka.actor.default-dispatcher-4] shutting down ActorSystem [spark]
java.lang.NoSuchMethodError: java.util.concurrent.ConcurrentHashMap.keySet()Ljava/util/concurrent/ConcurrentHashMap$KeySetView;
at org.apache.spark.streaming.scheduler.JobScheduler.getPendingTimes(JobScheduler.scala:114)
at org.apache.spark.streaming.Checkpoint.<init>(Checkpoint.scala:43)
at org.apache.spark.streaming.scheduler.JobGenerator.doCheckpoint(JobGenerator.scala:259)
at org.apache.spark.streaming.scheduler.JobGenerator.org$apache$spark$streaming$scheduler$JobGenerator$$processEvent(JobGenerator.scala:167)
at org.apache.spark.streaming.scheduler.JobGenerator$$anonfun$start$1$$anon$1$$anonfun$receive$1.applyOrElse(JobGenerator.scala:76)
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)
at akka.actor.ActorCell.invoke(ActorCell.scala:456)
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)
at akka.dispatch.Mailbox.run(Mailbox.scala:219)
at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Exception in thread "Thread-37" org.apache.spark.SparkException: Job cancelled because SparkContext was shut down
at org.apache.spark.scheduler.DAGScheduler$$anonfun$cleanUpAfterSchedulerStop$1.apply(DAGScheduler.scala:639)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$cleanUpAfterSchedulerStop$1.apply(DAGScheduler.scala:638)
at scala.collection.mutable.HashSet.foreach(HashSet.scala:79)
at org.apache.spark.scheduler.DAGScheduler.cleanUpAfterSchedulerStop(DAGScheduler.scala:638)
at org.apache.spark.scheduler.DAGSchedulerEventProcessActor.postStop(DAGScheduler.scala:1215)
at akka.actor.dungeon.FaultHandling$class.akka$actor$dungeon$FaultHandling$$finishTerminate(FaultHandling.scala:201)
at akka.actor.dungeon.FaultHandling$class.terminate(FaultHandling.scala:163)
at akka.actor.ActorCell.terminate(ActorCell.scala:338)
at akka.actor.ActorCell.invokeAll$1(ActorCell.scala:431)
at akka.actor.ActorCell.systemInvoke(ActorCell.scala:447)
at akka.dispatch.Mailbox.processAllSystemMessages(Mailbox.scala:262)
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:240)
at akka.dispatch.Mailbox.run(Mailbox.scala:219)
at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
14/07/20 11:53:00 ERROR Executor: Exception in task ID 0
java.lang.IllegalStateException: cannot create children while terminating or terminated
at akka.actor.dungeon.Children$class.makeChild(Children.scala:184)
at akka.actor.dungeon.Children$class.attachChild(Children.scala:42)
at akka.actor.ActorCell.attachChild(ActorCell.scala:338)
at akka.actor.ActorSystemImpl.actorOf(ActorSystem.scala:518)
at org.apache.spark.streaming.receiver.ReceiverSupervisorImpl.<init>(ReceiverSupervisorImpl.scala:67)
at org.apache.spark.streaming.scheduler.ReceiverTracker$ReceiverLauncher$$anonfun$9.apply(ReceiverTracker.scala:263)
at org.apache.spark.streaming.scheduler.ReceiverTracker$ReceiverLauncher$$anonfun$9.apply(ReceiverTracker.scala:257)
at org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:1080)
at org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:1080)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:111)
at org.apache.spark.scheduler.Task.run(Task.scala:51)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:187)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
14/07/20 11:53:06 ERROR ExecutorUncaughtExceptionHandler: Uncaught exception in thread Thread[spark-akka.actor.default-dispatcher-13,5,main]
org.apache.spark.SparkException: Error sending message to BlockManagerMaster [message = HeartBeat(BlockManagerId(<driver>, x-131-212-225-148.uofm-secure.wireless.umn.edu, 47668, 0))]
at org.apache.spark.storage.BlockManagerMaster.askDriverWithReply(BlockManagerMaster.scala:251)
at org.apache.spark.storage.BlockManagerMaster.sendHeartBeat(BlockManagerMaster.scala:51)
at org.apache.spark.storage.BlockManager.org$apache$spark$storage$BlockManager$$heartBeat(BlockManager.scala:113)
at org.apache.spark.storage.BlockManager$$anonfun$initialize$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(BlockManager.scala:158)
at org.apache.spark.util.Utils$.tryOrExit(Utils.scala:790)
at org.apache.spark.storage.BlockManager$$anonfun$initialize$1.apply$mcV$sp(BlockManager.scala:158)
at akka.actor.Scheduler$$anon$9.run(Scheduler.scala:80)
at akka.actor.LightArrayRevolverScheduler$$anon$3$$anon$2.run(Scheduler.scala:241)
at akka.actor.LightArrayRevolverScheduler$TaskHolder.run(Scheduler.scala:464)
at akka.actor.LightArrayRevolverScheduler$$anonfun$close$1.apply(Scheduler.scala:281)
at akka.actor.LightArrayRevolverScheduler$$anonfun$close$1.apply(Scheduler.scala:280)
at scala.collection.Iterator$class.foreach(Iterator.scala:727)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
at akka.actor.LightArrayRevolverScheduler.close(Scheduler.scala:279)
at akka.actor.ActorSystemImpl.stopScheduler(ActorSystem.scala:630)
at akka.actor.ActorSystemImpl$$anonfun$_start$1.apply$mcV$sp(ActorSystem.scala:582)
at akka.actor.ActorSystemImpl$$anonfun$_start$1.apply(ActorSystem.scala:582)
at akka.actor.ActorSystemImpl$$anonfun$_start$1.apply(ActorSystem.scala:582)
at akka.actor.ActorSystemImpl$$anon$3.run(ActorSystem.scala:596)
at akka.actor.ActorSystemImpl$TerminationCallbacks$$anonfun$run$1.runNext$1(ActorSystem.scala:750)
at akka.actor.ActorSystemImpl$TerminationCallbacks$$anonfun$run$1.apply$mcV$sp(ActorSystem.scala:753)
at akka.actor.ActorSystemImpl$TerminationCallbacks$$anonfun$run$1.apply(ActorSystem.scala:746)
at akka.actor.ActorSystemImpl$TerminationCallbacks$$anonfun$run$1.apply(ActorSystem.scala:746)
at akka.util.ReentrantGuard.withGuard(LockUtil.scala:15)
at akka.actor.ActorSystemImpl$TerminationCallbacks.run(ActorSystem.scala:746)
at akka.actor.ActorSystemImpl$$anonfun$terminationCallbacks$1.apply(ActorSystem.scala:593)
at akka.actor.ActorSystemImpl$$anonfun$terminationCallbacks$1.apply(ActorSystem.scala:593)
at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:32)
at akka.dispatch.BatchingExecutor$Batch$$anonfun$run$1.processBatch$1(BatchingExecutor.scala:67)
at akka.dispatch.BatchingExecutor$Batch$$anonfun$run$1.apply$mcV$sp(BatchingExecutor.scala:82)
at akka.dispatch.BatchingExecutor$Batch$$anonfun$run$1.apply(BatchingExecutor.scala:59)
at akka.dispatch.BatchingExecutor$Batch$$anonfun$run$1.apply(BatchingExecutor.scala:59)
at scala.concurrent.BlockContext$.withBlockContext(BlockContext.scala:72)
at akka.dispatch.BatchingExecutor$Batch.run(BatchingExecutor.scala:58)
at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:42)
at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Caused by: akka.pattern.AskTimeoutException: Recipient[Actor[akka://spark/user/BlockManagerMaster#1887396223]] had already been terminated.
at akka.pattern.AskableActorRef$.ask$extension(AskSupport.scala:134)
at org.apache.spark.storage.BlockManagerMaster.askDriverWithReply(BlockManagerMaster.scala:236)
****************CODE********************
object StatefulNetworkWordCount {
  def main(args: Array[String]) {
    if (args.length < 2) {
      System.err.println("Usage: StatefulNetworkWordCount <hostname> <port>")
      System.exit(1)
    }

    StreamingExamples.setStreamingLogLevels()

    val updateFunc = (values: Seq[Int], state: Option[Int]) => {
      val currentCount = values.foldLeft(0)(_ + _)
      val previousCount = state.getOrElse(0)
      Some(currentCount + previousCount)
    }

    val sparkConf = new SparkConf().setAppName("StatefulNetworkWordCount")

    // Create the context with a 1 second batch size
    val ssc = new StreamingContext(sparkConf, Seconds(1))
    ssc.checkpoint(".")

    // Create a NetworkInputDStream on target ip:port and count the
    // words in the input stream of \n delimited text (e.g. generated by 'nc')
    val lines = ssc.socketTextStream(args(0), args(1).toInt)
    val words = lines.flatMap(_.split(" "))
    val wordDstream = words.map(x => (x, 1))

    // Update the cumulative count using updateStateByKey
    // This will give a DStream made of state (which is the cumulative count of the words)
    val stateDstream = wordDstream.updateStateByKey[Int](updateFunc)
    stateDstream.print()
    ssc.start()
    ssc.awaitTermination()
  }
}
I wonder if it is because it is trying to set up the checkpoint on my local file system via ssc.checkpoint("."), while that path is not Hadoop-compatible (the checkpoint path must be Hadoop-compatible)? If so, how can I fix it? Thanks!
Which JRE are you running with, 1.7 or 1.8? I had a similar issue: I compiled the Spark source code with 1.8, but if I ran the code with 1.7, this error appeared; switching the runtime back to 1.8 solved it.
In JDK 1.8, ConcurrentHashMap.java (around line 812) has:
// views
private transient KeySetView<K,V> keySet;
private transient ValuesView<K,V> values;
private transient EntrySetView<K,V> entrySet;
JDK 1.7 does not have the code above.
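In other words, code compiled against JDK 8 records the JDK 8 return type of keySet() at the call site, and that type does not exist on a JDK 7 runtime. A minimal sketch of such a call site:
import java.util.concurrent.ConcurrentHashMap

val map = new ConcurrentHashMap[String, Int]()
// Compiled on JDK 8, this call site resolves to ConcurrentHashMap.KeySetView;
// on a JDK 7 runtime keySet() only returns java.util.Set, hence the
// NoSuchMethodError when JDK 8-compiled bytecode runs there.
val keys = map.keySet()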
Hope it will help ^_^
I have an Elasticsearch server installed on a machine; this is part of its config:
transport: {
  bound_address: inet[/192.168.1.42:9300]
  publish_address: inet[/192.168.1.42:9300]
}
http: {
  bound_address: inet[/192.168.1.42:9200]
  publish_address: inet[/192.168.1.42:9200]
  max_content_length_in_bytes: 104857600
}
From a Scala program, I try to connect to this server using the Elasticsearch Java library.
This is the code that connects to the server using nodeBuilder:
val builder: Node = nodeBuilder()
  .client(true)
  .local(false)
  .loadConfigSettings(false)
  .clusterName("elasticsearch")
  .settings(
    ImmutableSettings.settingsBuilder()
      //.put("network.bind_host", "192.168.1.42")
      .put("network.host", "192.168.1.42")
      .put("transport.tcp.port", 9300)
      .put("http.port", 9200)
      .build()
  ).node()

val client: Client = builder.client()
but I receive this exception:
java.lang.ExceptionInInitializerError
at actors.DaasySocketActor$$anonfun$receive$1.applyOrElse(DaasySocketActor.scala:25)
at akka.actor.Actor$class.aroundReceive(Actor.scala:465)
at actors.DaasySocketActor.aroundReceive(DaasySocketActor.scala:17)
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
at akka.actor.ActorCell.invoke(ActorCell.scala:487)
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:238)
at akka.dispatch.Mailbox.run(Mailbox.scala:220)
at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:393)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Caused by: org.elasticsearch.transport.BindTransportException: Failed to bind to [9300]
at org.elasticsearch.transport.netty.NettyTransport.doStart(NettyTransport.java:389)
at org.elasticsearch.common.component.AbstractLifecycleComponent.start(AbstractLifecycleComponent.java:85)
at org.elasticsearch.transport.TransportService.doStart(TransportService.java:91)
at org.elasticsearch.common.component.AbstractLifecycleComponent.start(AbstractLifecycleComponent.java:85)
at org.elasticsearch.node.internal.InternalNode.start(InternalNode.java:231)
at org.elasticsearch.node.NodeBuilder.node(NodeBuilder.java:166)
at lib.int_elasticsearch$.<init>(int_elasticsearch.scala:55)
at lib.int_elasticsearch$.<clinit>(int_elasticsearch.scala)
... 12 more
Caused by: org.elasticsearch.common.netty.channel.ChannelException: Failed to bind to: /192.168.1.42:9300
If I connect with this code instead, it works well:
val client:Client = new TransportClient()
.addTransportAddress(new InetSocketTransportAddress("192.168.1.42", 9300))
But I want to connect with a node client, not a TransportClient.
What is causing the error?
Thank you very much.
Strip the hostname and port out of the node client settings. A node client is a node in its own right, so pointing its host/port at another node is why you get a clash (manifested in the bind exception).
val builder: Node = nodeBuilder()
  .client(true)
  .local(false)
  .loadConfigSettings(false)
  .clusterName("elasticsearch")
  .node()

val client: Client = builder.client()
A node joins the cluster. A transport client just connects to a node.
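If the client and the server sit on different machines and multicast discovery cannot reach the server, a common variation is to point the node client at a known node through unicast discovery settings rather than bind settings. A sketch under that assumption (setting names from the Elasticsearch 1.x zen discovery options):
val builder: Node = nodeBuilder()
  .client(true)
  .local(false)
  .loadConfigSettings(false)
  .clusterName("elasticsearch")
  .settings(
    ImmutableSettings.settingsBuilder()
      // join the cluster via a known node instead of binding its address locally
      .put("discovery.zen.ping.multicast.enabled", false)
      .put("discovery.zen.ping.unicast.hosts", "192.168.1.42")
      .build()
  ).node()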