Error in twitter4j classes while feeding data from Twitter using Flume - twitter4j

I am using the flume-sources-1.0-SNAPSHOT sources provided by Cloudera on GitHub.
I am getting the error below: the method does not exist in the Configuration class of the twitter4j jar provided by Cloudera.
How do I solve this issue?
14/06/12 03:04:56 INFO node.PollingPropertiesFileConfigurationProvider: Configuration provider starting
14/06/12 03:04:56 INFO node.PollingPropertiesFileConfigurationProvider: Reloading configuration file:conf/tagent
14/06/12 03:04:56 INFO conf.FlumeConfiguration: Processing:HDFS
14/06/12 03:04:56 INFO conf.FlumeConfiguration: Processing:HDFS
14/06/12 03:04:56 INFO conf.FlumeConfiguration: Processing:HDFS
14/06/12 03:04:56 INFO conf.FlumeConfiguration: Added sinks: HDFS Agent: TwitterAgent
14/06/12 03:04:56 INFO conf.FlumeConfiguration: Processing:HDFS
14/06/12 03:04:56 INFO conf.FlumeConfiguration: Processing:HDFS
14/06/12 03:04:56 INFO conf.FlumeConfiguration: Processing:HDFS
14/06/12 03:04:56 INFO conf.FlumeConfiguration: Processing:HDFS
14/06/12 03:04:56 INFO conf.FlumeConfiguration: Processing:HDFS
14/06/12 03:04:56 INFO conf.FlumeConfiguration: Post-validation flume configuration contains configuration for agents: [TwitterAgent]
14/06/12 03:04:56 INFO node.AbstractConfigurationProvider: Creating channels
14/06/12 03:04:56 INFO channel.DefaultChannelFactory: Creating instance of channel MemChannel type memory
14/06/12 03:04:57 INFO node.AbstractConfigurationProvider: Created channel MemChannel
14/06/12 03:04:57 INFO source.DefaultSourceFactory: Creating instance of source Twitter, type com.cloudera.flume.source.TwitterSource
14/06/12 03:04:57 ERROR node.PollingPropertiesFileConfigurationProvider: Unhandled error
java.lang.NoSuchMethodError: twitter4j.conf.Configuration.getRequestHeaders()Ljava/util/Map;
at twitter4j.StreamingReadTimeoutConfiguration.getRequestHeaders(TwitterStreamImpl.java:664)
at twitter4j.internal.http.HttpClientWrapper.<init>(HttpClientWrapper.java:47)
at twitter4j.TwitterStreamImpl.<init>(TwitterStreamImpl.java:54)
at twitter4j.TwitterStreamFactory.<clinit>(TwitterStreamFactory.java:40)
at com.cloudera.flume.source.TwitterSource.<init>(TwitterSource.java:64)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
at java.lang.Class.newInstance0(Class.java:355)
at java.lang.Class.newInstance(Class.java:308)
at org.apache.flume.source.DefaultSourceFactory.create(DefaultSourceFactory.java:42)
at org.apache.flume.node.AbstractConfigurationProvider.loadSources(AbstractConfigurationProvider.java:327)
at org.apache.flume.node.AbstractConfigurationProvider.getConfiguration(AbstractConfigurationProvider.java:102)
at org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run(PollingPropertiesFileConfigurationProvider.java:140)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:204)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)

Check the version of the twitter4j jars you are using.
The twitter4j.conf.Configuration.getRequestHeaders() method is not present in the latest versions, but it is in twitter4j 2.x.

You have to replace all the twitter4j-3.0.3 libraries in the flume/lib folder with the twitter4j-2.x.x versions, because the same class names conflict between twitter4j-3.0.3 and flume-sources-1.0-SNAPSHOT.jar.
To avoid this, you also have to add the following property to a twitter4j.properties file on the classpath:
streamBaseURL=https://stream.twitter.com/1.1/
Alternatively, you can download the new flume-sources-1.0-SNAPSHOT.jar from here.
This issue has been resolved by the Twitter community; for a full explanation you can check this discussion.
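For reference, a minimal twitter4j.properties sketch might look like the one below. Only streamBaseURL is needed for this particular fix; the OAuth keys are placeholders shown purely to illustrate the file format and may be redundant if you already pass credentials through the Flume agent configuration.
# twitter4j.properties (sketch) -- place it on the Flume agent classpath, e.g. in its conf directory
oauth.consumerKey=YOUR_CONSUMER_KEY
oauth.consumerSecret=YOUR_CONSUMER_SECRET
oauth.accessToken=YOUR_ACCESS_TOKEN
oauth.accessTokenSecret=YOUR_ACCESS_TOKEN_SECRET
# point the twitter4j 2.x stream client at the 1.1 streaming endpoint
streamBaseURL=https://stream.twitter.com/1.1/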

When you receive the twitter4j.conf.Configuration.getRequestHeaders() method error, try the steps below; they resolved my issue.
Remove the following three jar files from the /usr/lib/flume-ng/lib/ directory:
twitter4j-core-3.0.3.jar
twitter4j-media-support-3.0.3.jar
twitter4j-stream-3.0.3.jar
Add the following jar files to the /usr/lib/flume-ng/lib/ directory (the twitter4j jars should be the latest version); a shell sketch of the whole swap follows the list:
flume-sources-1.0-SNAPSHOT.jar
twitter4j-core-4.0.7.jar
twitter4j-async-4.0.7.jar
twitter4j-stream-4.0.7.jar
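A minimal shell sketch of the swap, assuming the replacement jars have already been downloaded to the current directory (adjust paths and versions to your installation):
# remove the conflicting twitter4j 3.0.3 jars
sudo rm /usr/lib/flume-ng/lib/twitter4j-core-3.0.3.jar \
        /usr/lib/flume-ng/lib/twitter4j-media-support-3.0.3.jar \
        /usr/lib/flume-ng/lib/twitter4j-stream-3.0.3.jar
# copy in the custom source and the 4.0.7 twitter4j jars
sudo cp flume-sources-1.0-SNAPSHOT.jar twitter4j-core-4.0.7.jar \
        twitter4j-async-4.0.7.jar twitter4j-stream-4.0.7.jar \
        /usr/lib/flume-ng/lib/
Restart the Flume agent afterwards so the new jars are picked up.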

Related

neo4j - graphaware plugins

I downloaded the GraphAware NLP, OpenNLP, and framework plugins and copied the jar files to the plugins directory.
As per the Neo4j setup steps, I included the following lines in the neo4j.conf file:
dbms.unmanaged_extension_classes=com.graphaware.server=/graphaware
com.graphaware.runtime.enabled=true
com.graphaware.module.NLP.2=com.graphaware.nlp.module.NLPBootstrapper
After inserting these lines, localhost:7474 does not start.
When I comment out these lines, localhost starts and works properly, but the plugins are not included.
Version: Enterprise 3.1.3
Error on localhost after commenting out those lines:
Failed to invoke procedure `ga.nlp.annotate`: Caused by: java.lang.RuntimeException: java.lang.IllegalStateException: No GraphAware Runtime is registered with the given database
Error in log file:
2017-11-07 10:41:03.839+0000 INFO ======== Neo4j 3.1.3 ========
2017-11-07 10:41:04.120+0000 INFO Starting...
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/share/neo4j/lib/slf4j-nop-1.7.22.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/var/lib/neo4j/plugins/nlp-opennlp-3.1.3.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.helpers.NOPLoggerFactory]
2017-11-07 10:41:04.985+0000 INFO Bolt enabled on localhost:7687.
2017-11-07 10:41:05.010+0000 INFO Initiating metrics...
2017-11-07 10:41:07.374+0000 INFO [c.g.r.b.RuntimeKernelExtension] GraphAware Runtime enabled, bootstrapping...
2017-11-07 10:41:07.444+0000 INFO [c.g.r.b.RuntimeKernelExtension] Bootstrapping module with order 2, ID NLP, using com.graphaware.nlp.module.NLPBootstrapper
2017-11-07 10:41:07.523+0000 INFO Registering module NLP with GraphAware Runtime.
2017-11-07 10:41:07.523+0000 INFO [c.g.r.b.RuntimeKernelExtension] GraphAware Runtime bootstrapped, starting the Runtime...
2017-11-07 10:41:21.893+0000 INFO Starting GraphAware...
2017-11-07 10:41:21.894+0000 INFO Loading module metadata...
2017-11-07 10:41:21.894+0000 INFO Loading metadata for module NLP
2017-11-07 10:41:21.946+0000 INFO Module NLP seems to have been registered for the first time.
2017-11-07 10:41:21.947+0000 INFO Module NLP seems to have been registered for the first time, will try to initialize...
2017-11-07 10:41:21.947+0000 INFO InitializeUntil set to 9223372036854775807 and it is 1510051281947. Will initialize.
2017-11-07 10:41:24.709+0000 INFO Started.
2017-11-07 10:41:24.811+0000 INFO Mounted REST API at: /db/manage
2017-11-07 10:41:24.823+0000 INFO [c.g.s.f.b.GraphAwareServerBootstrapper] started
2017-11-07 10:41:24.825+0000 INFO Mounted unmanaged extension [com.graphaware.server] at [/graphaware]
Exception in thread "GraphAware Starter" java.lang.RuntimeException: Error while initializing model of class: class opennlp.tools.namefind.TokenNameFinderModel
at com.graphaware.nlp.processor.opennlp.OpenNLPPipeline.loadModel(OpenNLPPipeline.java:503)
at com.graphaware.nlp.processor.opennlp.OpenNLPPipeline.lambda$loadNamedEntitiesFinders$2(OpenNLPPipeline.java:161)
at java.util.HashMap$EntrySpliterator.forEachRemaining(HashMap.java:1691)
at java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:580)
at com.graphaware.nlp.processor.opennlp.OpenNLPPipeline.loadNamedEntitiesFinders(OpenNLPPipeline.java:158)
at com.graphaware.nlp.processor.opennlp.OpenNLPPipeline.init(OpenNLPPipeline.java:118)
at com.graphaware.nlp.processor.opennlp.OpenNLPPipeline.<init>(OpenNLPPipeline.java:108)
at com.graphaware.nlp.processor.opennlp.PipelineBuilder.build(PipelineBuilder.java:79)
at com.graphaware.nlp.processor.opennlp.OpenNLPTextProcessor.createPhrasePipeline(OpenNLPTextProcessor.java:106)
at com.graphaware.nlp.processor.opennlp.OpenNLPTextProcessor.init(OpenNLPTextProcessor.java:56)
at com.graphaware.nlp.processor.TextProcessorsManager.lambda$initiateTextProcessors$0(TextProcessorsManager.java:61)
at java.util.HashMap$Values.forEach(HashMap.java:980)
at com.graphaware.nlp.processor.TextProcessorsManager.initiateTextProcessors(TextProcessorsManager.java:60)
at com.graphaware.nlp.processor.TextProcessorsManager.<init>(TextProcessorsManager.java:37)
at com.graphaware.nlp.NLPManager.init(NLPManager.java:95)
at com.graphaware.nlp.module.NLPModule.initialize(NLPModule.java:52)
at com.graphaware.runtime.manager.ProductionTxDrivenModuleManager.initialize(ProductionTxDrivenModuleManager.java:57)
at com.graphaware.runtime.manager.BaseTxDrivenModuleManager.initializeIfAllowed(BaseTxDrivenModuleManager.java:128)
at com.graphaware.runtime.manager.BaseTxDrivenModuleManager.handleNoMetadata(BaseTxDrivenModuleManager.java:72)
at com.graphaware.runtime.manager.BaseTxDrivenModuleManager.handleNoMetadata(BaseTxDrivenModuleManager.java:39)
at com.graphaware.runtime.manager.BaseModuleManager.loadMetadata(BaseModuleManager.java:143)
at com.graphaware.runtime.manager.BaseModuleManager.loadMetadata(BaseModuleManager.java:125)
at com.graphaware.runtime.TxDrivenRuntime.loadMetadata(TxDrivenRuntime.java:130)
at com.graphaware.runtime.ProductionRuntime.loadMetadata(ProductionRuntime.java:80)
at com.graphaware.runtime.BaseGraphAwareRuntime.startModules(BaseGraphAwareRuntime.java:154)
at com.graphaware.runtime.TxDrivenRuntime.startModules(TxDrivenRuntime.java:146)
at com.graphaware.runtime.ProductionRuntime.startModules(ProductionRuntime.java:70)
at com.graphaware.runtime.BaseGraphAwareRuntime.start(BaseGraphAwareRuntime.java:134)
at com.graphaware.runtime.bootstrap.RuntimeKernelExtension.lambda$start$8(RuntimeKernelExtension.java:117)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.GeneratedConstructorAccessor29.newInstance(Unknown Source)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at com.graphaware.nlp.processor.opennlp.OpenNLPPipeline.loadModel(OpenNLPPipeline.java:499)
... 29 more
Caused by: java.lang.OutOfMemoryError: Java heap space
at opennlp.tools.ml.model.AbstractModelReader.getParameters(AbstractModelReader.java:140)
at opennlp.tools.ml.maxent.io.GISModelReader.constructModel(GISModelReader.java:78)
at opennlp.tools.ml.model.GenericModelReader.constructModel(GenericModelReader.java:62)
at opennlp.tools.ml.model.AbstractModelReader.getModel(AbstractModelReader.java:85)
at opennlp.tools.util.model.GenericModelSerializer.create(GenericModelSerializer.java:32)
at opennlp.tools.util.model.GenericModelSerializer.create(GenericModelSerializer.java:29)
at opennlp.tools.util.model.BaseModel.finishLoadingArtifacts(BaseModel.java:309)
at opennlp.tools.util.model.BaseModel.loadModel(BaseModel.java:239)
at opennlp.tools.util.model.BaseModel.<init>(BaseModel.java:173)
at opennlp.tools.namefind.TokenNameFinderModel.<init>(TokenNameFinderModel.java:103)
at sun.reflect.GeneratedConstructorAccessor29.newInstance(Unknown Source)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at com.graphaware.nlp.processor.opennlp.OpenNLPPipeline.loadModel(OpenNLPPipeline.java:499)
at com.graphaware.nlp.processor.opennlp.OpenNLPPipeline.lambda$loadNamedEntitiesFinders$2(OpenNLPPipeline.java:161)
at com.graphaware.nlp.processor.opennlp.OpenNLPPipeline$$Lambda$239/1188677545.accept(Unknown Source)
at java.util.HashMap$EntrySpliterator.forEachRemaining(HashMap.java:1691)
at java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:580)
at com.graphaware.nlp.processor.opennlp.OpenNLPPipeline.loadNamedEntitiesFinders(OpenNLPPipeline.java:158)
at com.graphaware.nlp.processor.opennlp.OpenNLPPipeline.init(OpenNLPPipeline.java:118)
at com.graphaware.nlp.processor.opennlp.OpenNLPPipeline.<init>(OpenNLPPipeline.java:108)
at com.graphaware.nlp.processor.opennlp.PipelineBuilder.build(PipelineBuilder.java:79)
at com.graphaware.nlp.processor.opennlp.OpenNLPTextProcessor.createPhrasePipeline(OpenNLPTextProcessor.java:106)
at com.graphaware.nlp.processor.opennlp.OpenNLPTextProcessor.init(OpenNLPTextProcessor.java:56)
at com.graphaware.nlp.processor.TextProcessorsManager.lambda$initiateTextProcessors$0(TextProcessorsManager.java:61)
at com.graphaware.nlp.processor.TextProcessorsManager$$Lambda$234/2094381213.accept(Unknown Source)
at java.util.HashMap$Values.forEach(HashMap.java:980)
at com.graphaware.nlp.processor.TextProcessorsManager.initiateTextProcessors(TextProcessorsManager.java:60)
at com.graphaware.nlp.processor.TextProcessorsManager.<init>(TextProcessorsManager.java:37)
at com.graphaware.nlp.NLPManager.init(NLPManager.java:95)
at com.graphaware.nlp.module.NLPModule.initialize(NLPModule.java:52)
at com.graphaware.runtime.manager.ProductionTxDrivenModuleManager.initialize(ProductionTxDrivenModuleManager.java:57)
Please help me out.
You do not have sufficient memory for the NLP plugins to load; hence the NLP module is not registered and thus not available once the database has started.
As stated in the NLP plugin README, you need at least 4 GB of heap for the modules to run. Adjust this in your neo4j.conf and restart.
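For example, a heap setting along these lines in neo4j.conf should do it (these key names apply to Neo4j 3.x; size the heap to your machine):
# neo4j.conf (sketch) -- give the JVM enough heap for the OpenNLP models
dbms.memory.heap.initial_size=4g
dbms.memory.heap.max_size=4g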

RuntimeException when running nutch generate

I'm new to Nutch. I have installed Nutch 2.3.1 and configured it to use MongoDB. The inject operation was successful, but when I try to generate, an exception is thrown (see below).
NB: This error occurs with a seed file containing 60K URLs. I tried with 100 URLs and everything went well.
Do you have an idea what the cause of this error is? Thanks!
2016-12-30 00:01:48,446 INFO crawl.GeneratorJob - GeneratorJob: starting at 2016-12-30 00:01:48
2016-12-30 00:01:48,447 INFO crawl.GeneratorJob - GeneratorJob: Selecting best-scoring urls due for fetch.
2016-12-30 00:01:48,447 INFO crawl.GeneratorJob - GeneratorJob: starting
2016-12-30 00:01:48,448 INFO crawl.GeneratorJob - GeneratorJob: filtering: true
2016-12-30 00:01:48,448 INFO crawl.GeneratorJob - GeneratorJob: normalizing: true
2016-12-30 00:01:48,448 INFO crawl.GeneratorJob - GeneratorJob: topN: 100000
2016-12-30 00:01:48,816 WARN util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2016-12-30 00:01:48,857 INFO crawl.FetchScheduleFactory - Using FetchSchedule impl: org.apache.nutch.crawl.DefaultFetchSchedule
2016-12-30 00:01:48,867 INFO crawl.AbstractFetchSchedule - defaultInterval=2592000
2016-12-30 00:01:48,867 INFO crawl.AbstractFetchSchedule - maxInterval=7776000
2016-12-30 00:01:51,568 WARN conf.Configuration - file:/tmp/hadoop-mehdi/mapred/staging/mehdi1740651658/.staging/job_local1740651658_0001/job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval; Ignoring.
2016-12-30 00:01:51,573 WARN conf.Configuration - file:/tmp/hadoop-mehdi/mapred/staging/mehdi1740651658/.staging/job_local1740651658_0001/job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts; Ignoring.
2016-12-30 00:01:51,753 WARN conf.Configuration - file:/tmp/hadoop-mehdi/mapred/local/localRunner/mehdi/job_local1740651658_0001/job_local1740651658_0001.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval; Ignoring.
2016-12-30 00:01:51,760 WARN conf.Configuration - file:/tmp/hadoop-mehdi/mapred/local/localRunner/mehdi/job_local1740651658_0001/job_local1740651658_0001.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts; Ignoring.
2016-12-30 00:01:52,408 INFO crawl.FetchScheduleFactory - Using FetchSchedule impl: org.apache.nutch.crawl.DefaultFetchSchedule
2016-12-30 00:01:52,408 INFO crawl.AbstractFetchSchedule - defaultInterval=2592000
2016-12-30 00:01:52,408 INFO crawl.AbstractFetchSchedule - maxInterval=7776000
2016-12-30 00:01:52,591 INFO regex.RegexURLNormalizer - can't find rules for scope 'generate_host_count', using default
2016-12-30 00:02:03,229 ERROR mapreduce.GoraRecordReader - Error reading Gora records: Read operation to server localhost:27017 failed on database nutch
2016-12-30 00:02:04,607 WARN mapred.LocalJobRunner - job_local1740651658_0001
java.lang.Exception: java.lang.RuntimeException: com.mongodb.MongoException$Network: Read operation to server localhost:27017 failed on database nutch
at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
Caused by: java.lang.RuntimeException: com.mongodb.MongoException$Network: Read operation to server localhost:27017 failed on database nutch
at org.apache.gora.mapreduce.GoraRecordReader.nextKeyValue(GoraRecordReader.java:122)
at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:533)
at org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: com.mongodb.MongoException$Network: Read operation to server localhost:27017 failed on database nutch
at com.mongodb.DBTCPConnector.innerCall(DBTCPConnector.java:298)
at com.mongodb.DBTCPConnector.call(DBTCPConnector.java:269)
at com.mongodb.DBTCPConnector.call(DBTCPConnector.java:235)
at com.mongodb.QueryResultIterator.getMore(QueryResultIterator.java:145)
at com.mongodb.QueryResultIterator.hasNext(QueryResultIterator.java:135)
at com.mongodb.DBCursor._hasNext(DBCursor.java:626)
at com.mongodb.DBCursor.hasNext(DBCursor.java:657)
at org.apache.gora.mongodb.query.MongoDBResult.nextInner(MongoDBResult.java:71)
at org.apache.gora.query.impl.ResultBase.next(ResultBase.java:111)
at org.apache.gora.mapreduce.GoraRecordReader.nextKeyValue(GoraRecordReader.java:118)
... 12 more
Caused by: java.io.EOFException
at org.bson.io.Bits.readFully(Bits.java:75)
at org.bson.io.Bits.readFully(Bits.java:50)
at org.bson.io.Bits.readFully(Bits.java:37)
at com.mongodb.Response.<init>(Response.java:42)
at com.mongodb.DBPort$1.execute(DBPort.java:164)
at com.mongodb.DBPort$1.execute(DBPort.java:158)
at com.mongodb.DBPort.doOperation(DBPort.java:187)
at com.mongodb.DBPort.call(DBPort.java:158)
at com.mongodb.DBTCPConnector.innerCall(DBTCPConnector.java:290)
... 21 more
2016-12-30 00:02:04,846 ERROR crawl.GeneratorJob - GeneratorJob: java.lang.RuntimeException: job failed: name=nutch-maven-1.0-SNAPSHOT.jar, jobid=job_local1740651658_0001
at org.apache.nutch.util.NutchJob.waitForCompletion(NutchJob.java:120)
at org.apache.nutch.crawl.GeneratorJob.run(GeneratorJob.java:227)
at org.apache.nutch.crawl.GeneratorJob.generate(GeneratorJob.java:256)
at org.apache.nutch.crawl.GeneratorJob.run(GeneratorJob.java:322)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.nutch.crawl.GeneratorJob.main(GeneratorJob.java:330)
I figured out that the problem comes from the MongoDB version. Nutch uses mongo-java-driver-2.13.1.jar and I had installed MongoDB 3.4.1. I installed MongoDB 2.6.7 instead and now it works fine. I'll try to update the driver in Nutch and report back whether it works with the newer version of MongoDB.
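If you do attempt the driver update, a hedged sketch of the change in a source build of Nutch 2.x is below; whether mongo-java-driver is listed directly in ivy/ivy.xml or pulled in transitively through gora-mongodb depends on your checkout, and the revision shown is only illustrative. Note also that the stack trace above goes through the 2.x driver API (DBTCPConnector, DBCursor), so bumping the driver alone may not be enough without a newer Gora release.
<!-- ivy/ivy.xml (sketch): bump the MongoDB Java driver used by the Gora MongoDB store -->
<dependency org="org.mongodb" name="mongo-java-driver" rev="3.4.1" conf="*->default"/>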

What causes this OrientDB startup error?

I just started getting this error when running OrientDB. Any advice on what to do?
2015-11-10 08:09:52:003 INFO OrientDB auto-config DISKCACHE=489MB (heap=491MB os=52,339MB disk=978MB) [orientechnologies]
2015-11-10 08:09:52:111 INFO Loading configuration from: /home/ubuntu/workspace/orient215/config/orientdb-server-config.xml... [OServerConfigurationLoaderXml]
2015-11-10 08:09:52:391 INFO OrientDB Server v2.1.5 (build 2.1.x#r; 2015-10-29 16:54:25+0000) is starting up... [OServer]
2015-11-10 08:09:52:430 INFO Databases directory: /home/ubuntu/workspace/orient215/databases [OServer]
2015-11-10 08:09:52:484 INFO Listening binary connections on 0.0.0.0:2424 (protocol v.32, socket=default) [OServerNetworkListener]Exception in thread "main" java.lang.NegativeArraySizeException
at com.orientechnologies.orient.server.network.OServerNetworkListener.getPorts(OServerNetworkListener.java:113)
at com.orientechnologies.orient.server.network.OServerNetworkListener.listen(OServerNetworkListener.java:305)
at com.orientechnologies.orient.server.network.OServerNetworkListener.<init>(OServerNetworkListener.java:79)
at com.orientechnologies.orient.server.OServer.activate(OServer.java:334)
at com.orientechnologies.orient.server.OServerMain.main(OServerMain.java:41)
Well, I deleted this question quickly because I found my own answer. On second thought, though, if someone else makes a typo in the port specification in orientdb-server-config.xml, they might find this post useful.
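In case it helps: the NegativeArraySizeException in OServerNetworkListener.getPorts points at a malformed port range for one of the listeners. A well-formed binary listener entry in orientdb-server-config.xml looks roughly like this (a sketch based on a default install; the HTTP listener uses 2480-2490 in the same way):
<!-- the port-range must be a valid low-high range, e.g. 2424-2430 -->
<listener protocol="binary" ip-address="0.0.0.0" port-range="2424-2430" socket="default"/>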

Unable to view Twitter stream using a Spark streaming application

I am trying to write a Spark Streaming app in Scala that is supposed to read the Twitter feed every second, following the instructions provided here.
My code is:
import org.apache.spark._
import org.apache.spark.SparkContext._
import org.apache.spark.streaming._
import org.apache.spark.streaming.twitter._
import org.apache.spark.streaming.StreamingContext._
import TutorialHelper._
object Tutorial {
  def main(args: Array[String]) {
    // Checkpoint directory
    val checkpointDir = TutorialHelper.getCheckpointDirectory()
    // Configure Twitter credentials
    val apiKey = "blabla"
    val apiSecret = "blabla"
    val accessToken = "blabla"
    val accessTokenSecret = "blabla"
    TutorialHelper.configureTwitterCredentials(apiKey, apiSecret, accessToken, accessTokenSecret)
    val ssc = new StreamingContext(new SparkConf(), Seconds(1))
    val tweets = TwitterUtils.createStream(ssc, None)
    val statuses = tweets.map(status => status.getText())
    statuses.print()
    ssc.checkpoint(checkpointDir)
    ssc.start()
    ssc.awaitTermination()
  }
}
When I try to execute it using spark-submit, I get the following log:
Configuring Twitter OAuth
Property twitter4j.oauth.consumerKey set as [blabla]
Property twitter4j.oauth.accessToken set as [blabla]
Property twitter4j.oauth.consumerSecret set as [blabla]
Property twitter4j.oauth.accessTokenSecret set as [blabla]
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
15/09/28 13:22:11 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/09/28 13:22:11 WARN Utils: Your hostname, ubuntu resolves to a loopback address: 127.0.1.1; using 192.168.163.145 instead (on interface eth0)
15/09/28 13:22:11 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
15/09/28 13:22:13 INFO Slf4jLogger: Slf4jLogger started
15/09/28 13:22:13 INFO Remoting: Starting remoting
15/09/28 13:22:13 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriver#192.168.163.145:59422]
15/09/28 13:22:15 WARN MetricsSystem: Using default name DAGScheduler for source because spark.app.id is not set.
15/09/28 13:22:17 INFO WriteAheadLogManager : Recovered 1 write ahead log files from file:/home/nikos/Desktop/spark-1.5.0-bin-hadoop2.6/streaming/scala/checkpoint/receivedBlockMetadata
-------------------------------------------
Time: 1443471738000 ms
-------------------------------------------
15/09/28 13:22:18 INFO WriteAheadLogManager : Attempting to clear 1 old log files in file:/home/nikos/Desktop/spark-1.5.0-bin-hadoop2.6/streaming/scala/checkpoint/receivedBlockMetadata older than 1443471737000: file:/home/nikos/Desktop/spark-1.5.0-bin-hadoop2.6/streaming/scala/checkpoint/receivedBlockMetadata/log-1443468716010-1443468776010
15/09/28 13:22:18 INFO WriteAheadLogManager : Cleared log files in file:/home/nikos/Desktop/spark-1.5.0-bin-hadoop2.6/streaming/scala/checkpoint/receivedBlockMetadata older than 1443471737000
-------------------------------------------
Time: 1443471739000 ms
-------------------------------------------
15/09/28 13:22:19 INFO WriteAheadLogManager : Attempting to clear 0 old log files in file:/home/nikos/Desktop/spark-1.5.0-bin-hadoop2.6/streaming/scala/checkpoint/receivedBlockMetadata older than 1443471738000:
15/09/28 13:22:19 INFO WriteAheadLogManager : Cleared log files in file:/home/nikos/Desktop/spark-1.5.0-bin-hadoop2.6/streaming/scala/checkpoint/receivedBlockMetadata older than 1443471738000
-------------------------------------------
Time: 1443471740000 ms
-------------------------------------------
15/09/28 13:22:20 INFO WriteAheadLogManager : Attempting to clear 0 old log files in file:/home/nikos/Desktop/spark-1.5.0-bin-hadoop2.6/streaming/scala/checkpoint/receivedBlockMetadata older than 1443471739000:
15/09/28 13:22:20 INFO WriteAheadLogManager : Cleared log files in file:/home/nikos/Desktop/spark-1.5.0-bin-hadoop2.6/streaming/scala/checkpoint/receivedBlockMetadata older than 1443471739000
-------------------------------------------
Time: 1443471741000 ms
-------------------------------------------
15/09/28 13:22:21 INFO WriteAheadLogManager : Attempting to clear 0 old log files in file:/home/nikos/Desktop/spark-1.5.0-bin-hadoop2.6/streaming/scala/checkpoint/receivedBlockMetadata older than 1443471740000:
15/09/28 13:22:21 INFO WriteAheadLogManager : Cleared log files in file:/home/nikos/Desktop/spark-1.5.0-bin-hadoop2.6/streaming/scala/checkpoint/receivedBlockMetadata older than 1443471740000
15/09/28 13:22:21 ERROR Executor: Exception in task 0.0 in stage 0.0 (TID 0)
java.lang.AbstractMethodError
at org.apache.spark.Logging$class.log(Logging.scala:52)
at org.apache.spark.streaming.twitter.TwitterReceiver.log(TwitterInputDStream.scala:60)
at org.apache.spark.Logging$class.logInfo(Logging.scala:59)
at org.apache.spark.streaming.twitter.TwitterReceiver.logInfo(TwitterInputDStream.scala:60)
at org.apache.spark.streaming.twitter.TwitterReceiver.onStart(TwitterInputDStream.scala:93)
at org.apache.spark.streaming.receiver.ReceiverSupervisor.startReceiver(ReceiverSupervisor.scala:148)
at org.apache.spark.streaming.receiver.ReceiverSupervisor.start(ReceiverSupervisor.scala:130)
at org.apache.spark.streaming.scheduler.ReceiverTracker$ReceiverTrackerEndpoint$$anonfun$9.apply(ReceiverTracker.scala:542)
at org.apache.spark.streaming.scheduler.ReceiverTracker$ReceiverTrackerEndpoint$$anonfun$9.apply(ReceiverTracker.scala:532)
at org.apache.spark.SparkContext$$anonfun$37.apply(SparkContext.scala:1975)
at org.apache.spark.SparkContext$$anonfun$37.apply(SparkContext.scala:1975)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
at org.apache.spark.scheduler.Task.run(Task.scala:88)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
15/09/28 13:22:21 INFO TwitterStreamImpl: Establishing connection.
15/09/28 13:22:21 ERROR SparkUncaughtExceptionHandler: Uncaught exception in thread Thread[Executor task launch worker-0,5,main]
java.lang.AbstractMethodError
at org.apache.spark.Logging$class.log(Logging.scala:52)
at org.apache.spark.streaming.twitter.TwitterReceiver.log(TwitterInputDStream.scala:60)
at org.apache.spark.Logging$class.logInfo(Logging.scala:59)
at org.apache.spark.streaming.twitter.TwitterReceiver.logInfo(TwitterInputDStream.scala:60)
at org.apache.spark.streaming.twitter.TwitterReceiver.onStart(TwitterInputDStream.scala:93)
at org.apache.spark.streaming.receiver.ReceiverSupervisor.startReceiver(ReceiverSupervisor.scala:148)
at org.apache.spark.streaming.receiver.ReceiverSupervisor.start(ReceiverSupervisor.scala:130)
at org.apache.spark.streaming.scheduler.ReceiverTracker$ReceiverTrackerEndpoint$$anonfun$9.apply(ReceiverTracker.scala:542)
at org.apache.spark.streaming.scheduler.ReceiverTracker$ReceiverTrackerEndpoint$$anonfun$9.apply(ReceiverTracker.scala:532)
at org.apache.spark.SparkContext$$anonfun$37.apply(SparkContext.scala:1975)
at org.apache.spark.SparkContext$$anonfun$37.apply(SparkContext.scala:1975)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
at org.apache.spark.scheduler.Task.run(Task.scala:88)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
15/09/28 13:22:21 WARN TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0, localhost): java.lang.AbstractMethodError
at org.apache.spark.Logging$class.log(Logging.scala:52)
at org.apache.spark.streaming.twitter.TwitterReceiver.log(TwitterInputDStream.scala:60)
at org.apache.spark.Logging$class.logInfo(Logging.scala:59)
at org.apache.spark.streaming.twitter.TwitterReceiver.logInfo(TwitterInputDStream.scala:60)
at org.apache.spark.streaming.twitter.TwitterReceiver.onStart(TwitterInputDStream.scala:93)
at org.apache.spark.streaming.receiver.ReceiverSupervisor.startReceiver(ReceiverSupervisor.scala:148)
at org.apache.spark.streaming.receiver.ReceiverSupervisor.start(ReceiverSupervisor.scala:130)
at org.apache.spark.streaming.scheduler.ReceiverTracker$ReceiverTrackerEndpoint$$anonfun$9.apply(ReceiverTracker.scala:542)
at org.apache.spark.streaming.scheduler.ReceiverTracker$ReceiverTrackerEndpoint$$anonfun$9.apply(ReceiverTracker.scala:532)
at org.apache.spark.SparkContext$$anonfun$37.apply(SparkContext.scala:1975)
at org.apache.spark.SparkContext$$anonfun$37.apply(SparkContext.scala:1975)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
at org.apache.spark.scheduler.Task.run(Task.scala:88)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
15/09/28 13:22:21 ERROR ActorSystemImpl: Uncaught fatal error from thread [sparkDriver-akka.actor.default-dispatcher-7] shutting down ActorSystem [sparkDriver]
java.lang.AbstractMethodError
at org.apache.spark.Logging$class.log(Logging.scala:52)
at org.apache.spark.streaming.twitter.TwitterReceiver.log(TwitterInputDStream.scala:60)
at org.apache.spark.Logging$class.logInfo(Logging.scala:59)
at org.apache.spark.streaming.twitter.TwitterReceiver.logInfo(TwitterInputDStream.scala:60)
at org.apache.spark.streaming.twitter.TwitterReceiver.onStop(TwitterInputDStream.scala:101)
at org.apache.spark.streaming.receiver.ReceiverSupervisor.stopReceiver(ReceiverSupervisor.scala:169)
at org.apache.spark.streaming.receiver.ReceiverSupervisor.stop(ReceiverSupervisor.scala:136)
at org.apache.spark.streaming.receiver.ReceiverSupervisorImpl$$anon$2$$anonfun$receive$1.applyOrElse(ReceiverSupervisorImpl.scala:79)
at org.apache.spark.rpc.akka.AkkaRpcEnv.org$apache$spark$rpc$akka$AkkaRpcEnv$$processMessage(AkkaRpcEnv.scala:177)
at org.apache.spark.rpc.akka.AkkaRpcEnv$$anonfun$actorRef$lzycompute$1$1$$anon$1$$anonfun$receiveWithLogging$1$$anonfun$applyOrElse$4.apply$mcV$sp(AkkaRpcEnv.scala:126)
at org.apache.spark.rpc.akka.AkkaRpcEnv.org$apache$spark$rpc$akka$AkkaRpcEnv$$safelyCall(AkkaRpcEnv.scala:197)
at org.apache.spark.rpc.akka.AkkaRpcEnv$$anonfun$actorRef$lzycompute$1$1$$anon$1$$anonfun$receiveWithLogging$1.applyOrElse(AkkaRpcEnv.scala:125)
at scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33)
at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33)
at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25)
at org.apache.spark.util.ActorLogReceive$$anon$1.apply(ActorLogReceive.scala:59)
at org.apache.spark.util.ActorLogReceive$$anon$1.apply(ActorLogReceive.scala:42)
at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:118)
at org.apache.spark.util.ActorLogReceive$$anon$1.applyOrElse(ActorLogReceive.scala:42)
at akka.actor.Actor$class.aroundReceive(Actor.scala:467)
at org.apache.spark.rpc.akka.AkkaRpcEnv$$anonfun$actorRef$lzycompute$1$1$$anon$1.aroundReceive(AkkaRpcEnv.scala:92)
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
at akka.actor.ActorCell.invoke(ActorCell.scala:487)
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:238)
at akka.dispatch.Mailbox.run(Mailbox.scala:220)
at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
15/09/28 13:22:21 ERROR TaskSetManager: Task 0 in stage 0.0 failed 1 times; aborting job
15/09/28 13:22:21 ERROR ErrorMonitor: Uncaught fatal error from thread [sparkDriver-akka.actor.default-dispatcher-7] shutting down ActorSystem [sparkDriver]
java.lang.AbstractMethodError
at org.apache.spark.Logging$class.log(Logging.scala:52)
at org.apache.spark.streaming.twitter.TwitterReceiver.log(TwitterInputDStream.scala:60)
at org.apache.spark.Logging$class.logInfo(Logging.scala:59)
at org.apache.spark.streaming.twitter.TwitterReceiver.logInfo(TwitterInputDStream.scala:60)
at org.apache.spark.streaming.twitter.TwitterReceiver.onStop(TwitterInputDStream.scala:101)
at org.apache.spark.streaming.receiver.ReceiverSupervisor.stopReceiver(ReceiverSupervisor.scala:169)
at org.apache.spark.streaming.receiver.ReceiverSupervisor.stop(ReceiverSupervisor.scala:136)
at org.apache.spark.streaming.receiver.ReceiverSupervisorImpl$$anon$2$$anonfun$receive$1.applyOrElse(ReceiverSupervisorImpl.scala:79)
at org.apache.spark.rpc.akka.AkkaRpcEnv.org$apache$spark$rpc$akka$AkkaRpcEnv$$processMessage(AkkaRpcEnv.scala:177)
at org.apache.spark.rpc.akka.AkkaRpcEnv$$anonfun$actorRef$lzycompute$1$1$$anon$1$$anonfun$receiveWithLogging$1$$anonfun$applyOrElse$4.apply$mcV$sp(AkkaRpcEnv.scala:126)
at org.apache.spark.rpc.akka.AkkaRpcEnv.org$apache$spark$rpc$akka$AkkaRpcEnv$$safelyCall(AkkaRpcEnv.scala:197)
at org.apache.spark.rpc.akka.AkkaRpcEnv$$anonfun$actorRef$lzycompute$1$1$$anon$1$$anonfun$receiveWithLogging$1.applyOrElse(AkkaRpcEnv.scala:125)
at scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33)
at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33)
at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25)
at org.apache.spark.util.ActorLogReceive$$anon$1.apply(ActorLogReceive.scala:59)
at org.apache.spark.util.ActorLogReceive$$anon$1.apply(ActorLogReceive.scala:42)
at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:118)
at org.apache.spark.util.ActorLogReceive$$anon$1.applyOrElse(ActorLogReceive.scala:42)
at akka.actor.Actor$class.aroundReceive(Actor.scala:467)
at org.apache.spark.rpc.akka.AkkaRpcEnv$$anonfun$actorRef$lzycompute$1$1$$anon$1.aroundReceive(AkkaRpcEnv.scala:92)
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
at akka.actor.ActorCell.invoke(ActorCell.scala:487)
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:238)
at akka.dispatch.Mailbox.run(Mailbox.scala:220)
at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
15/09/28 13:22:21 WARN ReceiverTracker: Receiver 0 exited but didn't deregister
15/09/28 13:22:21 INFO WriteAheadLogManager : Stopped write ahead log manager
15/09/28 13:22:22 INFO RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
15/09/28 13:22:22 INFO RemoteActorRefProvider$RemotingTerminator: Remote daemon shut down; proceeding with flushing remote transports.
15/09/28 13:22:22 WARN AkkaRpcEndpointRef: Error sending message [message = RemoveBroadcast(0,true)] in 1 attempts
org.apache.spark.rpc.RpcTimeoutException: Recipient[Actor[akka://sparkDriver/user/BlockManagerMaster#1080029491]] had already been terminated.. This timeout is controlled by spark.rpc.askTimeout
at org.apache.spark.rpc.RpcTimeout.org$apache$spark$rpc$RpcTimeout$$createRpcTimeoutException(RpcEnv.scala:214)
at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcEnv.scala:229)
at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcEnv.scala:225)
at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:33)
at scala.util.Failure$$anonfun$recover$1.apply(Try.scala:185)
at scala.util.Try$.apply(Try.scala:161)
at scala.util.Failure.recover(Try.scala:185)
at scala.concurrent.Future$$anonfun$recover$1.apply(Future.scala:324)
at scala.concurrent.Future$$anonfun$recover$1.apply(Future.scala:324)
at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:32)
at org.spark-project.guava.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:293)
at scala.concurrent.impl.ExecutionContextImpl$$anon$1.execute(ExecutionContextImpl.scala:133)
at scala.concurrent.impl.CallbackRunnable.executeWithValue(Promise.scala:40)
at scala.concurrent.impl.Promise$DefaultPromise.scala$concurrent$impl$Promise$DefaultPromise$$dispatchOrAddCallback(Promise.scala:280)
at scala.concurrent.impl.Promise$DefaultPromise.onComplete(Promise.scala:270)
at scala.concurrent.Future$class.recover(Future.scala:324)
at scala.concurrent.impl.Promise$DefaultPromise.recover(Promise.scala:153)
at org.apache.spark.rpc.akka.AkkaRpcEndpointRef.ask(AkkaRpcEnv.scala:319)
at org.apache.spark.rpc.RpcEndpointRef.askWithRetry(RpcEndpointRef.scala:100)
at org.apache.spark.rpc.RpcEndpointRef.askWithRetry(RpcEndpointRef.scala:77)
at org.apache.spark.storage.BlockManagerMaster.removeBroadcast(BlockManagerMaster.scala:128)
at org.apache.spark.broadcast.TorrentBroadcast$.unpersist(TorrentBroadcast.scala:228)
at org.apache.spark.broadcast.TorrentBroadcastFactory.unbroadcast(TorrentBroadcastFactory.scala:45)
at org.apache.spark.broadcast.BroadcastManager.unbroadcast(BroadcastManager.scala:67)
at org.apache.spark.ContextCleaner.doCleanupBroadcast(ContextCleaner.scala:214)
at org.apache.spark.ContextCleaner$$anonfun$org$apache$spark$ContextCleaner$$keepCleaning$1$$anonfun$apply$mcV$sp$2.apply(ContextCleaner.scala:170)
at org.apache.spark.ContextCleaner$$anonfun$org$apache$spark$ContextCleaner$$keepCleaning$1$$anonfun$apply$mcV$sp$2.apply(ContextCleaner.scala:161)
at scala.Option.foreach(Option.scala:236)
at org.apache.spark.ContextCleaner$$anonfun$org$apache$spark$ContextCleaner$$keepCleaning$1.apply$mcV$sp(ContextCleaner.scala:161)
at org.apache.spark.util.Utils$.tryOrStopSparkContext(Utils.scala:1136)
at org.apache.spark.ContextCleaner.org$apache$spark$ContextCleaner$$keepCleaning(ContextCleaner.scala:154)
at org.apache.spark.ContextCleaner$$anon$3.run(ContextCleaner.scala:67)
Caused by: akka.pattern.AskTimeoutException: Recipient[Actor[akka://sparkDriver/user/BlockManagerMaster#1080029491]] had already been terminated.
at akka.pattern.AskableActorRef$.ask$extension(AskSupport.scala:132)
at org.apache.spark.rpc.akka.AkkaRpcEndpointRef.ask(AkkaRpcEnv.scala:307)
... 14 more
This question was also answered in:
Spark streaming StreamingContext.start() - Error starting receiver 0
It can be resolved by changing the scalaVersion and libraryDependencies in your build.sbt to match your Spark version.
e.g.:
scalaVersion := "2.11.8"
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-streaming" % "1.6.1" % "provided",
  "org.apache.spark" %% "spark-streaming-twitter" % "1.6.1"
)
Be sure to use the newly generated scala-2.11 classes, e.g.:
../../../../spark-1.6.1/bin/spark-submit --class Tutorial /Users/mendezr/development/spark/exercises_spark_submit_2014/usb/streaming/scala/target/scala-2.11/Tutorial-assembly-0.1-SNAPSHOT.jar
For Spark 1.6.x, use Scala version 2.10.6, and make sure it is set in the build.sbt file.
Make sure that your hostname is mapped to your current IP; in a Linux environment, hostnames are resolved via the /etc/hosts file.
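Based on the warning in the log above (the hostname ubuntu resolving to the loopback address 127.0.1.1), an /etc/hosts sketch would be the following; the address is the one reported by Spark and is only illustrative:
# /etc/hosts (sketch) -- map the hostname to the machine's real address instead of 127.0.1.1
127.0.0.1        localhost
192.168.163.145  ubuntu
Alternatively, the log also suggests setting SPARK_LOCAL_IP if you need to bind to another address.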

Embedding HornetQ 2.2.14 with JBoss-5.1.0.GA

I have embedded HornetQ 2.2.14 with JBoss-5.1.0.GA by following this link: http://docs.jboss.org/hornetq/2.2.2.Final/quickstart-guide/en/html_single/#installation.jboss.as5.
However, I am getting the following exception:
Error installing to Real: name=vfsfile:/home/jboss/jboss-5.1.0.GA/server/default-with-hornetq/deploy/hornetq.sar/ state=PreReal mode=Manual requiredState=Real
org.jboss.deployers.spi.DeploymentException: Error deploying: vfsfile:/home/jboss/jboss-5.1.0.GA/server/default-with-hornetq/deploy/hornetq.sar/hornetq-jboss-beans.xml
at org.jboss.deployers.spi.DeploymentException.rethrowAsDeploymentException(DeploymentException.java:49)
at org.jboss.deployers.vfs.deployer.kernel.BeanMetaDataFactoryVisitor.deploy(BeanMetaDataFactoryVisitor.java:136)
at org.jboss.deployers.spi.deployer.helpers.AbstractRealDeployerWithInput.deploy(AbstractRealDeployerWithInput.java:125)
at org.jboss.deployers.spi.deployer.helpers.AbstractRealDeployerWithInput.internalDeploy(AbstractRealDeployerWithInput.java:102)
Caused by: java.lang.IllegalArgumentException: Exception loading class for ScopeKey addition.
at org.jboss.deployers.vfs.deployer.kernel.BeanMetaDataFactoryVisitor.addBeanComponent(BeanMetaDataFactoryVisitor.java:67)
at org.jboss.deployers.vfs.deployer.kernel.BeanMetaDataFactoryVisitor.deploy(BeanMetaDataFactoryVisitor.java:126)
... 35 more
Caused by: java.lang.ClassNotFoundException: org.hornetq.jms.server.recovery.AS5RecoveryRegistry from BaseClassLoader#bf5dc1{VFSClassLoaderPolicy#1a45a7f{name=vfsfile:/home/jboss/jboss-5.1.0.GA/server/default-with-hornetq/deploy/hornetq.sar/ domain=ClassLoaderDomain#152e961{name=DefaultDomain parentPolicy=BEFORE parent=org.jboss.bootstrap.NoAnnotationURLClassLoader#1479feb} roots=[MemoryContextHandler#20020219[path= context=vfsmemory://5c4oz19-xy0sz8-hghy7wgr-1-hghy8fcm-10 real=vfsmemory://5c4oz19-xy0sz8-hghy7wgr-1-hghy8fcm-10], FileHandler#29532276[path=hornetq.sar context=file:/home/jboss/jboss-5.1.0.GA/server/default-with-hornetq/deploy/ real=file:/home/jboss/jboss-5.1.0.GA/server/default-with-hornetq/deploy/hornetq.sar/]] delegates=null exported=[] <IMPORT-ALL>NON_EMPTY}}
at org.jboss.classloader.spi.base.BaseClassLoader.loadClass(BaseClassLoader.java:448)
at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
at org.jboss.deployers.vfs.deployer.kernel.BeanMetaDataFactoryVisitor.addBeanComponent(BeanMetaDataFactoryVisitor.java:63)
... 36 more
DEPLOYMENTS IN ERROR:
Deployment "vfsfile:/home/jboss/jboss-5.1.0.GA/server/default-with-hornetq/deploy/hornetq.sar/" is in error due to the following reason(s): java.lang.ClassNotFoundException: org.hornetq.jms.server.recovery.AS5RecoveryRegistry from BaseClassLoader#bf5dc1{VFSClassLoaderPolicy#1a45a7f{name=vfsfile:/home/jboss/jboss-5.1.0.GA/server/default-with-hornetq/deploy/hornetq.sar/ domain=ClassLoaderDomain#152e961{name=DefaultDomain parentPolicy=BEFORE parent=org.jboss.bootstrap.NoAnnotationURLClassLoader#1479feb} roots=[MemoryContextHandler#20020219[path= context=vfsmemory://5c4oz19-xy0sz8-hghy7wgr-1-hghy8fcm-10 real=vfsmemory://5c4oz19-xy0sz8-hghy7wgr-1-hghy8fcm-10], FileHandler#29532276[path=hornetq.sar context=file:/home/jboss/jboss-5.1.0.GA/server/default-with-hornetq/deploy/ real=file:/home/jboss/jboss-5.1.0.GA/server/default-with-hornetq/deploy/hornetq.sar/]] delegates=null exported=[] <IMPORT-ALL>NON_EMPTY}}
18:35:56,302 INFO [Http11Protocol] Starting Coyote HTTP/1.1 on http-127.0.0.1-8080
18:35:56,345 INFO [AjpProtocol] Starting Coyote AJP/1.3 on ajp-127.0.0.1-8009
18:35:56,372 INFO [Http11Protocol] Starting Coyote HTTP/1.1 on http-127.0.0.1-8443
18:35:56,402 INFO [ServerImpl] JBoss (Microcontainer) [5.1.0.GA (build: SVNTag=JBoss_5_1_0_GA date=200905221634)] Started in 49s:604ms
First, try the fix mentioned in this thread.
Change
<bean name="AS5RecoveryRegistry"
      class="org.hornetq.jms.server.recovery.AS5RecoveryRegistry">
to
<bean name="AS5RecoveryRegistry"
      class="org.jboss.as.integration.hornetq.recovery.AS5RecoveryRegistry">
in deploy/hornetq.sar/hornetq-jboss-beans.xml.
If that doesn't work, then according to https://issues.jboss.org/browse/HORNETQ-943 you're out of luck:
HornetQ is only supported with JBoss EAP 5.1.2.
Note that it is also possible to simply comment out the automatic recovery, though that could potentially be harmful.
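If you go that route, a sketch of commenting the recovery registry out of deploy/hornetq.sar/hornetq-jboss-beans.xml is shown below; keep the caveat above about losing automatic recovery in mind.
<!-- automatic recovery registry disabled; see the caveat above
<bean name="AS5RecoveryRegistry"
      class="org.jboss.as.integration.hornetq.recovery.AS5RecoveryRegistry"/>
-->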