Zeppelin unable to start IPython kernel

For quite some time I've been facing an issue with Zeppelin, which seems unable to launch IPython. I followed this guide and this one. The pyspark interpreter is correctly set with the right Python path, and IPython is activated by default. However, when I try to run any of the examples in the guides, such as:
%ipyspark
import pandas as pd
df = pd.DataFrame({'name':['a','b','c'], 'count':[12,24,18]})
z.show(df)
I get the following error from the logs, which doesn't tell me much:
INFO [2018-11-30 15:17:08,653] ({pool-3-thread-2} IPythonInterpreter.java[setAdditionalPythonPath]:103) - setAdditionalPythonPath: /usr/hdp/current/spark2-client/python/lib/pyspark.zip:/usr/hdp/current/spark2-client/python/lib/py4j-0.10.7-src.zip:/usr/hdp/current/zeppelin-server/interpreter/lib/python
INFO [2018-11-30 15:17:08,654] ({pool-3-thread-2} IPythonInterpreter.java[open]:135) - Python Exec: python3
INFO [2018-11-30 15:17:09,189] ({pool-3-thread-2} IPythonInterpreter.java[checkIPythonPrerequisite]:195) - IPython prerequisite is meet
INFO [2018-11-30 15:17:09,191] ({pool-3-thread-2} IPythonInterpreter.java[open]:146) - Launching IPython Kernel at port: 39753
INFO [2018-11-30 15:17:09,191] ({pool-3-thread-2} IPythonInterpreter.java[open]:147) - Launching JVM Gateway at port: 36511
INFO [2018-11-30 15:17:09,402] ({pool-3-thread-2} IPythonInterpreter.java[setupIPythonEnv]:315) - PYTHONPATH:/usr/hdp/current/spark2-client/python/lib/pyspark.zip:/usr/hdp/current/spark2-client/python/lib/py4j-0.10.7-src.zip:/usr/hdp/current/zeppelin-server/interpreter/lib/python:/usr/hdp/current/spark2-client//python/lib/py4j-0.10.7-src.zip:/usr/hdp/current/spark2-client//python/:/usr/hdp/current/spark2-client//python:/usr/hdp/current/spark2-client//python/lib/py4j-0.8.2.1-src.zip
INFO [2018-11-30 15:17:09,743] ({pool-3-thread-2} IPythonInterpreter.java[launchIPythonKernel]:293) - Wait for IPython Kernel to be started
INFO [2018-11-30 15:17:09,844] ({pool-3-thread-2} IPythonInterpreter.java[launchIPythonKernel]:293) - Wait for IPython Kernel to be started
WARN [2018-11-30 15:17:09,926] ({Exec Default Executor} IPythonInterpreter.java[onProcessFailed]:394) - Exception happens in Python Process
org.apache.commons.exec.ExecuteException: Process exited with an error: 1 (Exit value: 1)
at org.apache.commons.exec.DefaultExecutor.executeInternal(DefaultExecutor.java:404)
at org.apache.commons.exec.DefaultExecutor.access$200(DefaultExecutor.java:48)
at org.apache.commons.exec.DefaultExecutor$1.run(DefaultExecutor.java:200)
at java.lang.Thread.run(Thread.java:745)
INFO [2018-11-30 15:17:09,944] ({pool-3-thread-2} IPythonInterpreter.java[launchIPythonKernel]:293) - Wait for IPython Kernel to be started
INFO [2018-11-30 15:17:10,044] ({pool-3-thread-2} IPythonInterpreter.java[launchIPythonKernel]:293) - Wait for IPython Kernel to be started
INFO [2018-11-30 15:17:39,465] ({pool-3-thread-2} IPythonInterpreter.java[launchIPythonKernel]:293) - Wait for IPython Kernel to be started
WARN [2018-11-30 15:17:39,466] ({pool-3-thread-2} PySparkInterpreter.java[open]:134) - Fail to open IPySparkInterpreter
java.lang.RuntimeException: Fail to open IPythonInterpreter
at org.apache.zeppelin.python.IPythonInterpreter.open(IPythonInterpreter.java:157)
at org.apache.zeppelin.spark.IPySparkInterpreter.open(IPySparkInterpreter.java:66)
at org.apache.zeppelin.spark.PySparkInterpreter.open(PySparkInterpreter.java:129)
at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:69)
at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:617)
at org.apache.zeppelin.scheduler.Job.run(Job.java:188)
at org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:140)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: Fail to launch IPython Kernel in 30 seconds
at org.apache.zeppelin.python.IPythonInterpreter.launchIPythonKernel(IPythonInterpreter.java:297)
at org.apache.zeppelin.python.IPythonInterpreter.open(IPythonInterpreter.java:154)
at org.apache.zeppelin.spark.IPySparkInterpreter.open(IPySparkInterpreter.java:66)
at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:69)
at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:617)
at org.apache.zeppelin.scheduler.Job.run(Job.java:188)
at org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:140)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
INFO [2018-11-30 15:17:39,466] ({pool-3-thread-2} PySparkInterpreter.java[open]:140) - IPython is not available, use the native PySparkInterpreter
INFO [2018-11-30 15:17:39,533] ({pool-3-thread-2} PySparkInterpreter.java[createPythonScript]:118) - File /tmp/zeppelin_pyspark-5362368451576072994.py created
INFO [2018-11-30 15:17:39,534] ({pool-3-thread-2} Py4JUtils.java[createGatewayServer]:44) - Launching GatewayServer at 127.0.0.1:34508
INFO [2018-11-30 15:17:39,565] ({pool-3-thread-2} PySparkInterpreter.java[createGatewayServerAndStartScript]:265) - pythonExec: python3
INFO [2018-11-30 15:17:39,567] ({pool-3-thread-2} PySparkInterpreter.java[setupPySparkEnv]:236) - PYTHONPATH: /usr/hdp/current/spark2-client/python/lib/pyspark.zip:/usr/hdp/current/spark2-client/python/lib/py4j-0.10.7-src.zip:/usr/hdp/current/zeppelin-server/interpreter/lib/python:/usr/hdp/current/spark2-client//python/lib/py4j-0.10.7-src.zip:/usr/hdp/current/spark2-client//python/:/usr/hdp/current/spark2-client//python:/usr/hdp/current/spark2-client//python/lib/py4j-0.8.2.1-src.zip
INFO [2018-11-30 15:17:41,953] ({pool-3-thread-2} SchedulerFactory.java[jobFinished]:115) - Job 20181129-172919_2135817500 finished by scheduler interpreter_131607019
I'm using HDP 3.0.1, which comes with Zeppelin 0.8.0. All the nodes have Python 3.7.1 installed, along with the latest versions of jupyter and grpcio. From a Zeppelin notebook I checked the IPython and Python versions:
%pyspark
import sys
import IPython
print(IPython.__version__)
print(sys.version)
7.2.0
3.7.1 (default, Nov 29 2018, 17:37:37)
I can start IPython from any node without a problem, and Zeppelin correctly detects IPython's version. I looked for logs other than Zeppelin's that might report the error, but couldn't find anything.
Any idea what could be preventing Zeppelin from launching the IPython kernel?

Try upgrading setuptools and pip:
pip install --upgrade setuptools pip
or potentially IPython itself:
pip install --upgrade ipython
There are a few other quick things to try here: github.com/jupyter/notebook/issues/270
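If upgrading doesn't help, it is also worth verifying that the exact python3 binary configured for the interpreter can start a kernel at all; Zeppelin's log swallows the child process's stderr, so launching by hand often surfaces the real error. A minimal sketch (run as the zeppelin user on the interpreter host; these module names are the standard kernel dependencies, not anything Zeppelin-specific):
# Confirm the kernel dependencies import cleanly under the same python3:
python3 -c "import ipykernel, grpc, jupyter_client; print('kernel deps ok')"
# Start a kernel directly to see the error hidden behind
# "Process exited with an error: 1" in the Zeppelin log (Ctrl-C to stop):
python3 -m ipykernel_launcher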

I had this problem too running zeppelin-0.9.0-preview2. In my case, the cause was that Zeppelin did not recognize conda-installed packages in the pip freeze output.
For instance, I installed jupyter-client using conda, so pip freeze looked like this:
➜ pip freeze | grep jupyter-client
jupyter-client # file:///tmp/build/80754af9/jupyter_client_1616770841739/work
Notice that jupyter-client does not follow the package==version format. The solution is to remove the conda-installed version of jupyter and install it with pip:
➜ conda uninstall jupyter
➜ pip install jupyter
➜ pip freeze | grep jupyter-client
jupyter-client==6.1.12
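To spot any other conda-installed packages that would trip the same check, you can list every pip freeze entry that does not follow the name==version format (a quick sketch, assuming GNU grep):
# Conda-installed packages appear as "name @ file:///..." or
# "name # file:///..." instead of name==version.
pip freeze | grep -v '=='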
It seems Zeppelin should support both cases soon, if it does not already.
Remember too that zeppelin-0.9.0 does not support Python 3.8 yet.

I recently ran into a similar issue with Apache Zeppelin 0.10.0 and Apache Zeppelin 0.10.1.
The reason (in my case) was that the newest releases installed with pip install jupyter pull in jupyter_client==7.4.4. This is not recognized by Apache Zeppelin 0.10.0, which looks for the older name of the package (jupyter-client instead of jupyter_client).
See here for more details and a workaround.
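In case the link rots, a hedged sketch of the kind of workaround this implies (an assumption based on the name change described above, not a verified fix): pin jupyter-client below 7 so that pip freeze reports the hyphenated name Zeppelin 0.10.0 looks for.
pip install 'jupyter-client<7'
pip freeze | grep jupyter-client    # should now show jupyter-client==6.x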

Related

Script hangs only when started via SystemD

I am trying to start the Kafka Connect component alongside Cloudera and want to wrap the Kafka Connect startup script in a systemd service file so that it can start on boot once the Cloudera services are up.
For some strange reason, if I start this script outside of systemd it works just fine, but when I start it via systemctl start kafka-connect it just hangs. Here is my unit file, followed by the log lines where it stalls:
[Unit]
Description=Kafka Connect - Distributed
After=network.target cloudera-scm-agent
[Service]
Type=forking
ExecStart=/bin/bash -c "/app/cloudera/parcels/KAFKA/lib/kafka/bin/connect-distributed.sh -daemon /app/cloudera/kafka-connect/connect-distributed.properties"
SuccessExitStatus=143
[Install]
WantedBy=multi-user.target
2019-07-18 13:40:27,962 INFO (main) [ConnectDistributed(main:69)] Scanning for plugin classes. This might take a moment ...
2019-07-18 13:40:27,992 INFO (main) [DelegatingClassLoader(registerPlugin:184)] Loading plugin from: /app/cloudera/kafka-connect/plugins/splunk-kafka-connect.jar
2019-07-18 13:40:27,993 DEBUG (main) [DelegatingClassLoader(registerPlugin:191)] Loading plugin urls: [file:/app/cloudera/kafka-connect/plugins/splunk-kafka-connect.jar]
2019-07-18 13:40:29,206 DEBUG (main) [VersionUtils(getVersionFromProperties:63)] found git version string=v1.1.0 in version.properties file
2019-07-18 13:40:29,219 INFO (main) [DelegatingClassLoader(scanUrlsAndAddPlugins:207)] Registered loader: PluginClassLoader{pluginLocation=file:/app/cloudera/kafka-connect/plugins/splunk-kafka-connect.jar}
2019-07-18 13:40:29,220 INFO (main) [DelegatingClassLoader(addPlugins:136)] Added plugin 'com.splunk.kafka.connect.SplunkSinkConnector'
2019-07-18 13:40:29,220 INFO (main) [DelegatingClassLoader(addPlugins:136)] Added plugin 'org.apache.kafka.connect.storage.StringConverter'
My next thought was to try something a bit simpler and just create an init.d script at /etc/init.d/kafka-connect, which works fine because it is nothing more than a shell script wrapper, UNTIL I source the /etc/init.d/functions file, which causes this to be started via systemd again.
Two main questions:
What does systemctl do differently from a regular shell script that would cause this script (which in turn launches a Java process) to hang at that exact same step every time?
If my /etc/init.d/kafka-connect script works, is it okay to use on RHEL 7? If so, is there a way to load that init.d script on boot of the server?
Thanks in advance!
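One concrete difference worth ruling out (a sketch, not a verified fix for this exact setup): systemd starts services with a minimal environment and no controlling terminal, so variables your login shell exports (JAVA_HOME, Cloudera-specific PATH entries) are absent. You can approximate that from a shell to see whether the same hang reproduces:
# Run the script with a scrubbed, systemd-like environment.
env -i PATH=/usr/sbin:/usr/bin:/sbin:/bin HOME=/root \
  /app/cloudera/parcels/KAFKA/lib/kafka/bin/connect-distributed.sh -daemon \
  /app/cloudera/kafka-connect/connect-distributed.properties
# If it hangs here too, add Environment=JAVA_HOME=... lines to [Service],
# or use Type=simple and drop -daemon so systemd tracks the foreground
# process instead of waiting for the forked parent to exit.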

spark-submit failed due to Exception in thread "main" java.lang.NoSuchMethodException

I got the error below when I tried to submit my Spark job for testing purposes.
jianrui@spark:~$ sudo $SPARK_HOME/bin/spark-submit --class com.test.spark.FirstScalaExample --master spark://spark.sparkstreaming.i10.internal.cloudapp.net:7077 /opt/spark/FirstScalaExample-0.0.1.jar
Exception in thread "main" java.lang.NoSuchMethodException: com.test.spark.FirstScalaExample.main([Ljava.lang.String;)
at java.lang.Class.getMethod(Class.java:1786)
at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:42)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:879)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:197)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:227)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:136)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
2018-04-06 13:13:00 INFO ShutdownHookManager:54 - Shutdown hook called
2018-04-06 13:13:00 INFO ShutdownHookManager:54 - Deleting directory /tmp/spark-7f47cab1-f8b3-4731-bd67-e0d0ad013617
[Scala version - 2.11.6]
[Hadoop version - 2.7.5]
[Spark version - 2.3.0]
Note:
I specified the Hadoop native lib with "export LD_LIBRARY_PATH=$HADOOP_HOME/lib/native" in "spark-env.sh", and I only extracted Spark, I did not install it.
I don't know exactly what the problem is.
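For what it's worth, that NoSuchMethodException means spark-submit could not reflectively find a static main(String[]) method on com.test.spark.FirstScalaExample; in Scala, that static forwarder only exists when the entry point is declared in an object (or an object extending App), not in a plain class. A hedged way to inspect the jar, assuming the paths from the question and a JDK on the box:
# List the methods of the compiled class; a runnable Spark app must show:
#   public static void main(java.lang.String[])
javap -cp /opt/spark/FirstScalaExample-0.0.1.jar -p com.test.spark.FirstScalaExample
# If main is missing, move it into a top-level `object FirstScalaExample`.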

Running Apache Atlas standalone

I am trying to run Apache Atlas in a standalone fashion on Ubuntu - meaning without having to set up Solr and/or HBase.
What I did (according to the documentation: http://atlas.apache.org/0.8.1/InstallationSteps.html) was clone the Git repository and build the Maven project with embedded HBase and Solr:
mvn clean package -Pdist,embedded-hbase-solr
I unpacked the resulting tar.gz file and executed bin/atlas_start.py - without having changed any configuration. To my understanding of the documentation, that should actually start up HBase along with Atlas - right?
This is what I find in logs/application.log:
2017-11-30 17:14:24,093 INFO - [main:] ~ >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> (Atlas:216)
2017-11-30 17:14:24,093 INFO - [main:] ~ Server starting with TLS ? false on port 21000 (Atlas:217)
2017-11-30 17:14:24,093 INFO - [main:] ~ <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< (Atlas:218)
2017-11-30 17:14:27,684 INFO - [main:] ~ No authentication method configured. Defaulting to simple authentication (LoginProcessor:102)
2017-11-30 17:14:28,527 INFO - [main:] ~ Logged in user daniel (auth:SIMPLE) (LoginProcessor:77)
2017-11-30 17:14:31,777 INFO - [main:] ~ Not running setup per configuration atlas.server.run.setup.on.start. (SetupSteps$SetupRequired:189)
2017-11-30 17:14:39,456 WARN - [main-SendThread(localhost:2181):] ~ Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect (ClientCnxn$SendThread:110$
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
2017-11-30 17:14:39,594 WARN - [main:] ~ Possibly transient ZooKeeper, quorum=localhost:2181, exception=org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = Connecti$
2017-11-30 17:14:40,593 WARN - [main-SendThread(localhost:2181):] ~ Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect (ClientCnxn$SendThread:110$
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
...
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
2017-11-30 17:14:56,185 WARN - [main:] ~ Possibly transient ZooKeeper, quorum=localhost:2181, exception=org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = Connecti$
2017-11-30 17:14:56,186 ERROR - [main:] ~ ZooKeeper exists failed after 4 attempts (RecoverableZooKeeper:277)
2017-11-30 17:14:56,186 WARN - [main:] ~ hconnection-0x1dba4e060x0, quorum=localhost:2181, baseZNode=/hbase Unable to set watcher on znode (/hbase/hbaseid) (ZKUtil:544)
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/hbaseid
at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1045)
at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:221)
at org.apache.hadoop.hbase.zookeeper.ZKUtil.checkExists(ZKUtil.java:541)
at org.apache.hadoop.hbase.zookeeper.ZKClusterId.readClusterIdZNode(ZKClusterId.java:65)
To me it reads as if neither HBase nor ZooKeeper is started by the script.
Am I missing something?
Thanks for your hints!
OK, meanwhile I figured out the issue. The start script obviously does not execute the script conf/atlas-env.sh, which sets some environment variables. Among these are MANAGE_LOCAL_HBASE and MANAGE_LOCAL_SOLR. So if you set those two env vars to true (and set JAVA_HOME properly, which is needed for the embedded HBase), then Atlas automatically starts HBase and Solr - and we get a locally running instance of Atlas!
Maybe this helps someone who comes across the same issue in the future!
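A sketch of the fix described above (the JAVA_HOME path is illustrative; point it at whatever JDK is installed):
# Either set these in conf/atlas-env.sh or export them before starting:
export MANAGE_LOCAL_HBASE=true
export MANAGE_LOCAL_SOLR=true
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64    # hypothetical path
bin/atlas_start.py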
Update March 2021
There are two ways of running Apache Atlas:
A) Building it from scratch:
git clone https://github.com/apache/atlas
mvn clean install -DskipTests
mvn clean package -Pdist -DskipTests
Then run atlas_start.py:
python <atlas-directory>/bin/atlas_start.py
B) Using docker image:
docker-compose.yml
version: "3.3"
services:
atlas:
image: sburn/apache-atlas
container_name: atlas
ports:
- "21000:21000"
volumes:
- "./bash_script:/app"
command: bash -exc "/opt/apache-atlas-2.1.0/bin/atlas_start.py"
docker-compose up

Apache Toree and Spark Scala Not Working in Jupyter

I'm having problems running Scala Spark on Jupyter. Below is my error message when I load an Apache Toree - Scala notebook in Jupyter.
root@ubuntu-2gb-sgp1-01:~# jupyter notebook --ip 0.0.0.0 --port 8888
[I 03:14:54.281 NotebookApp] Serving notebooks from local directory: /root
[I 03:14:54.281 NotebookApp] 0 active kernels
[I 03:14:54.281 NotebookApp] The Jupyter Notebook is running at: http://0.0.0.0:8888/
[I 03:14:54.281 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[W 03:14:54.282 NotebookApp] No web browser found: could not locate runnable browser.
[I 03:15:09.976 NotebookApp] 302 GET / (61.6.68.44) 1.21ms
[I 03:15:15.924 NotebookApp] Creating new notebook in
[W 03:15:16.592 NotebookApp] 404 GET /nbextensions/widgets/notebook/js/extension.js?v=20161120031454 (61.6.68.44) 15.49ms referer=http://188.166.235.21:8888/notebooks/Untitled2.ipynb?kernel_name=apache_toree_scala
[I 03:15:16.677 NotebookApp] Kernel started: 94a63354-d294-4de7-a12c-2e05905e0c45
Starting Spark Kernel with SPARK_HOME=/usr/local/spark
16/11/20 03:15:18 [INFO] o.a.t.Main$$anon$1 - Kernel version: 0.1.0.dev8-incubating-SNAPSHOT
16/11/20 03:15:18 [INFO] o.a.t.Main$$anon$1 - Scala version: Some(2.10.4)
16/11/20 03:15:18 [INFO] o.a.t.Main$$anon$1 - ZeroMQ (JeroMQ) version: 3.2.2
16/11/20 03:15:18 [INFO] o.a.t.Main$$anon$1 - Initializing internal actor system
Exception in thread "main" java.lang.NoSuchMethodError: scala.collection.immutable.HashSet$.empty()Lscala/collection/immutable/HashSet;
at akka.actor.ActorCell$.<init>(ActorCell.scala:336)
at akka.actor.ActorCell$.<clinit>(ActorCell.scala)
at akka.actor.RootActorPath.$div(ActorPath.scala:185)
at akka.actor.LocalActorRefProvider.<init>(ActorRefProvider.scala:465)
at akka.actor.LocalActorRefProvider.<init>(ActorRefProvider.scala:453)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at akka.actor.ReflectiveDynamicAccess$$anonfun$createInstanceFor$2.apply(DynamicAccess.scala:78)
at scala.util.Try$.apply(Try.scala:192)
at akka.actor.ReflectiveDynamicAccess.createInstanceFor(DynamicAccess.scala:73)
at akka.actor.ReflectiveDynamicAccess$$anonfun$createInstanceFor$3.apply(DynamicAccess.scala:84)
at akka.actor.ReflectiveDynamicAccess$$anonfun$createInstanceFor$3.apply(DynamicAccess.scala:84)
at scala.util.Success.flatMap(Try.scala:231)
at akka.actor.ReflectiveDynamicAccess.createInstanceFor(DynamicAccess.scala:84)
at akka.actor.ActorSystemImpl.liftedTree1$1(ActorSystem.scala:585)
at akka.actor.ActorSystemImpl.<init>(ActorSystem.scala:578)
at akka.actor.ActorSystem$.apply(ActorSystem.scala:142)
at akka.actor.ActorSystem$.apply(ActorSystem.scala:109)
at org.apache.toree.boot.layer.StandardBareInitialization$class.createActorSystem(BareInitialization.scala:71)
at org.apache.toree.Main$$anon$1.createActorSystem(Main.scala:35)
at org.apache.toree.boot.layer.StandardBareInitialization$class.initializeBare(BareInitialization.scala:60)
at org.apache.toree.Main$$anon$1.initializeBare(Main.scala:35)
at org.apache.toree.boot.KernelBootstrap.initialize(KernelBootstrap.scala:72)
at org.apache.toree.Main$delayedInit$body.apply(Main.scala:40)
at scala.Function0$class.apply$mcV$sp(Function0.scala:34)
at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:12)
at scala.App$$anonfun$main$1.apply(App.scala:76)
at scala.App$$anonfun$main$1.apply(App.scala:76)
at scala.collection.immutable.List.foreach(List.scala:381)
at scala.collection.generic.TraversableForwarder$class.foreach(TraversableForwarder.scala:35)
at scala.App$class.main(App.scala:76)
at org.apache.toree.Main$.main(Main.scala:24)
at org.apache.toree.Main.main(Main.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:736)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:185)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:210)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:124)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
[W 03:15:26.738 NotebookApp] Timeout waiting for kernel_info reply from 94a63354-d294-4de7-a12c-2e05905e0c45
When running the Spark shell, these are my output logs:
root@ubuntu-2gb-sgp1-01:~# spark-shell
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel).
16/11/20 03:17:11 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/11/20 03:17:12 WARN Utils: Your hostname, ubuntu-2gb-sgp1-01 resolves to a loopback address: 127.0.1.1; using 10.15.0.5 instead (on interface eth0)
16/11/20 03:17:12 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
16/11/20 03:17:13 WARN SparkContext: Use an existing SparkContext, some configuration may not take effect.
Spark context Web UI available at http://10.15.0.5:4040
Spark context available as 'sc' (master = local[*], app id = local-1479611833426).
Spark session available as 'spark'.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/ '_/
   /___/ .__/\_,_/_/ /_/\_\   version 2.0.2
      /_/
Using Scala version 2.11.8 (OpenJDK 64-Bit Server VM, Java 1.8.0_111)
Type in expressions to have them evaluated.
Type :help for more information.
scala>
This problem was highlighted before in JIRA: https://issues.apache.org/jira/browse/TOREE-336. However, I'm still unable to get it working for some reason.
I followed the instructions listed on their official site.
https://toree.apache.org/documentation/user/quick-start
This is my PATH:
root@ubuntu-2gb-sgp1-01:~# echo $PATH
/root/bin:/root/.local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin:/usr/local/spark:/usr/local/spark/bin
Please note I didn't install Scala, as it comes with Spark.
Thanks
We haven't used Spark 2.0 in production with Scala 2.11 and notebooks yet.
The root cause of your error is compatibility. Based on the Toree description on GitHub, the latest supported Scala version is 2.10.4, and you have 2.11.8.
Try downgrading to 2.10 if there is no production need for 2.11.
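A sketch of how one might line the versions up, assuming a Spark distribution built against Scala 2.10 (the Spark version and path are illustrative, not prescriptive):
# Point Toree at a Spark build compiled for Scala 2.10,
# e.g. a Spark 1.6.x binary distribution:
export SPARK_HOME=/usr/local/spark-1.6.3-bin-hadoop2.6    # hypothetical path
pip install toree
jupyter toree install --spark_home=$SPARK_HOME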

Run protractor test script in multiple nodes (os platform) via selenium grid

I am currently trying to run Protractor test scripts using Selenium Grid. My goal is to distribute the Protractor test scripts to multiple nodes running different flavors of OS. It should also run different parts of the test scripts in parallel to save execution time. The current setup works when I use webdriver-manager to distribute tests with multiple instances, but webdriver-manager lets me use only one node. I know it's possible to overcome this with Selenium Grid, but I am getting the following errors in the test console:
node_modules/selenium-webdriver/error.js:26
constructor(opt_error) {
^
WebDriverError: The path to the driver executable must be set by the webdriver.chrome.driver system property; for more information, see https://github.com/SeleniumHQ/selenium/wiki/ChromeDriver. The latest version can be downloaded from http://chromedriver.storage.googleapis.com/index.html
at WebDriverError (./node_modules/selenium-webdriver/error.js:26:26)
at Object.checkLegacyResponse (./node_modules/selenium-webdriver/error.js:580:13)
at ./node_modules/selenium-webdriver/lib/webdriver.js:64:13
at Promise.invokeCallback_ (./node_modules/selenium-webdriver/lib/promise.js:1329:14)
at TaskQueue.execute_ (./node_modules/selenium-webdriver/lib/promise.js:2790:14)
at TaskQueue.executeNext_ (./node_modules/selenium-webdriver/lib/promise.js:2773:21)
at ./node_modules/selenium-webdriver/lib/promise.js:2652:27
at ./node_modules/selenium-webdriver/lib/promise.js:639:7
at process._tickCallback (internal/process/next_tick.js:103:7)
From: Task: WebDriver.createSession()
at acquireSession (./node_modules/selenium-webdriver/lib/webdriver.js:62:22)
at Function.createSession (./node_modules/selenium-webdriver/lib/webdriver.js:295:12)
at Builder.build (./node_modules/selenium-webdriver/builder.js:458:24)
at [object Object].DriverProvider.getNewDriver (./node_modules/protractor/built/driverProviders/driverProvider.js:42:27)
at [object Object].Runner.createBrowser (./node_modules/protractor/built/runner.js:203:37)
at ./node_modules/protractor/built/runner.js:293:21
at _fulfilled (./node_modules/q/q.js:834:54)
at self.promiseDispatch.done (./node_modules/q/q.js:863:30)
at Promise.promise.promiseDispatch (./node_modules/q/q.js:796:13)
at ./node_modules/q/q.js:556:49
[launcher] Process exited with error code 1
Also, selenium node that I registered complains with following messages:
$ java -jar /Users/user/Desktop/selenium-server-standalone-2.52.0.jar -role node -hub http://localhost:4444/grid/register Dwebdriver.chrome.driver="./node_modules/protractor/selenium/chromedriver_2.21" -browser "browserName=chrome,version=ANY,platform=MAC,maxInstances=20"
14:40:17.078 INFO - Launching a Selenium Grid node
14:40:17.126 INFO - Adding browserName=chrome,version=ANY,platform=MAC,maxInstances=20
14:40:17.515 INFO - Java: Oracle Corporation 25.74-b02
14:40:17.515 INFO - OS: Mac OS X 10.11.6 x86_64
14:40:17.521 INFO - v2.52.0, with Core v2.52.0. Built from revision 4c2593c
14:40:17.570 INFO - Driver provider org.openqa.selenium.ie.InternetExplorerDriver registration is skipped:
registration capabilities Capabilities [{ensureCleanSession=true, browserName=internet explorer, version=, platform=WINDOWS}] does not match the current platform MAC
14:40:17.570 INFO - Driver provider org.openqa.selenium.edge.EdgeDriver registration is skipped:
registration capabilities Capabilities [{browserName=MicrosoftEdge, version=, platform=WINDOWS}] does not match the current platform MAC
14:40:17.570 INFO - Driver class not found: com.opera.core.systems.OperaDriver
14:40:17.570 INFO - Driver provider com.opera.core.systems.OperaDriver is not registered
14:40:17.630 INFO - Selenium Grid node is up and ready to register to the hub
14:40:17.650 INFO - Starting auto registration thread. Will try to register every 5000 ms.
14:40:17.650 INFO - Registering the node to the hub: http://localhost:4444/grid/register
14:40:17.669 INFO - The node is registered to the hub and ready to use
14:41:32.968 INFO - Executing: [new session: Capabilities [{rootElement=*[ng-app], count=1, browserName=chrome, maxInstances=4, shardTestFiles=true}]])
14:41:32.977 INFO - Creating a new session for Capabilities [{rootElement=*[ng-app], count=1, browserName=chrome, maxInstances=4, shardTestFiles=true}]
14:41:32.998 WARN - Exception thrown
java.util.concurrent.ExecutionException: org.openqa.selenium.WebDriverException: java.lang.reflect.InvocationTargetException
Build info: version: '2.52.0', revision: '4c2593c', time: '2016-02-11 19:06:42'
System info: host: 'user-M-C19J', ip: '10.128.164.26', os.name: 'Mac OS X', os.arch: 'x86_64', os.version: '10.11.6', java.version: '1.8.0_74'
Driver info: driver.version: unknown
at java.util.concurrent.FutureTask.report(FutureTask.java:122)
at java.util.concurrent.FutureTask.get(FutureTask.java:192)
at org.openqa.selenium.remote.server.DefaultSession.execute(DefaultSession.java:183)
at org.openqa.selenium.remote.server.DefaultSession.<init>(DefaultSession.java:119)
at org.openqa.selenium.remote.server.DefaultSession.createSession(DefaultSession.java:95)
at org.openqa.selenium.remote.server.DefaultDriverSessions.newSession(DefaultDriverSessions.java:124)
at org.openqa.selenium.remote.server.handler.NewSession.handle(NewSession.java:59)
at org.openqa.selenium.remote.server.handler.NewSession.handle(NewSession.java:1)
at org.openqa.selenium.remote.server.rest.ResultConfig.handle(ResultConfig.java:111)
at org.openqa.selenium.remote.server.JsonHttpCommandHandler.handleRequest(JsonHttpCommandHandler.java:79)
at org.openqa.selenium.remote.server.DriverServlet.handleRequest(DriverServlet.java:202)
at org.openqa.selenium.remote.server.DriverServlet.doPost(DriverServlet.java:164)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
at org.openqa.selenium.remote.server.DriverServlet.service(DriverServlet.java:130)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
at org.openqa.jetty.jetty.servlet.ServletHolder.handle(ServletHolder.java:428)
at org.openqa.jetty.jetty.servlet.ServletHandler.dispatch(ServletHandler.java:680)
at org.openqa.jetty.jetty.servlet.ServletHandler.handle(ServletHandler.java:571)
at org.openqa.jetty.http.HttpContext.handle(HttpContext.java:1526)
at org.openqa.jetty.http.HttpContext.handle(HttpContext.java:1479)
at org.openqa.jetty.http.HttpServer.service(HttpServer.java:920)
at org.openqa.jetty.http.HttpConnection.service(HttpConnection.java:820)
at org.openqa.jetty.http.HttpConnection.handleNext(HttpConnection.java:986)
at org.openqa.jetty.http.HttpConnection.handle(HttpConnection.java:837)
at org.openqa.jetty.http.SocketListener.handleConnection(SocketListener.java:243)
at org.openqa.jetty.util.ThreadedServer.handle(ThreadedServer.java:358)
at org.openqa.jetty.util.ThreadPool$PoolThread.run(ThreadPool.java:537)
Caused by: org.openqa.selenium.WebDriverException: java.lang.reflect.InvocationTargetException
Build info: version: '2.52.0', revision: '4c2593c', time: '2016-02-11 19:06:42'
System info: host: 'user-M-C19J', ip: '10.128.164.26', os.name: 'Mac OS X', os.arch: 'x86_64', os.version: '10.11.6', java.version: '1.8.0_74'
Driver info: driver.version: unknown
at org.openqa.selenium.remote.server.DefaultDriverProvider.callConstructor(DefaultDriverProvider.java:113)
at org.openqa.selenium.remote.server.DefaultDriverProvider.newInstance(DefaultDriverProvider.java:97)
at org.openqa.selenium.remote.server.DefaultDriverFactory.newInstance(DefaultDriverFactory.java:60)
at org.openqa.selenium.remote.server.DefaultSession$BrowserCreator.call(DefaultSession.java:222)
at org.openqa.selenium.remote.server.DefaultSession$BrowserCreator.call(DefaultSession.java:1)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at org.openqa.selenium.remote.server.DefaultSession$1.run(DefaultSession.java:176)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.openqa.selenium.remote.server.DefaultDriverProvider.callConstructor(DefaultDriverProvider.java:103)
... 9 more
Caused by: java.lang.IllegalStateException: The path to the driver executable must be set by the webdriver.chrome.driver system property; for more information, see https://github.com/SeleniumHQ/selenium/wiki/ChromeDriver. The latest version can be downloaded from http://chromedriver.storage.googleapis.com/index.html
at com.google.common.base.Preconditions.checkState(Preconditions.java:199)
at org.openqa.selenium.remote.service.DriverService.findExecutable(DriverService.java:109)
at org.openqa.selenium.chrome.ChromeDriverService.access$0(ChromeDriverService.java:1)
at org.openqa.selenium.chrome.ChromeDriverService$Builder.findDefaultExecutable(ChromeDriverService.java:137)
at org.openqa.selenium.remote.service.DriverService$Builder.build(DriverService.java:296)
at org.openqa.selenium.chrome.ChromeDriverService.createDefaultService(ChromeDriverService.java:88)
at org.openqa.selenium.chrome.ChromeDriver.<init>(ChromeDriver.java:138)
... 14 more
14:41:33.002 WARN - Exception: The path to the driver executable must be set by the webdriver.chrome.driver system property; for more information, see https://github.com/SeleniumHQ/selenium/wiki/ChromeDriver. The latest version can be downloaded from http://chromedriver.storage.googleapis.com/index.html
This is my current configuration:
seleniumAddress: 'http://localhost:4444/wd/hub',
chromeDriver: './node_modules/protractor/selenium/chromedriver_2.21',
capabilities: {
  'rootElement': '*[ng-app]',
  'browserName': 'chrome', // chrome, firefox, or phantomjs
  'shardTestFiles': true,
  'maxInstances': 3
}
I ran the following command to launch the hub:
java -jar /Users/xxx/selenium-server-standalone-3.0.0-beta2.jar -role hub
and the following to run the node:
java -jar /Users/xxx/selenium-server-standalone-3.0.0-beta2.jar -role node -hub http://localhost:4444/grid/register
Chrome driver is under: /node_modules/protractor/selenium/chromedriver_2.21
Has anyone successfully configured Protractor with Selenium Grid?
webdriver-manager is bundled with npm as a quick way to start a server; it is not as comprehensive a solution as Selenium Grid. Grid should be used for any professional distributed execution environment.
Check this link for the response from Julie:
Webdriver-Manager doesn't have options for hub & node
Regarding Protractor with Selenium Grid, I have used both a local Grid and Sauce Labs.
You should pass the chromedriver property when initializing the node:
java -jar /Users/xxx/selenium-server-standalone-3.0.0-beta2.jar -role node -hub http://*******:4444/grid/register/ -Dwebdriver.chrome.driver=C:/Users/<>/AppData/Roaming/npm/node_modules/protractor/selenium/chromedriver.exe
The chromeDriver option in the Protractor conf is not passed over the grid; it works only with webdriver-manager.
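Adapted to the Mac paths from the question (a sketch; note the -D flag goes before -jar so the JVM registers it as a system property, and the chromedriver path is the one reported in the question - verify it exists and is executable):
# Hub
java -jar /Users/xxx/selenium-server-standalone-3.0.0-beta2.jar -role hub
# Node, with chromedriver passed as a JVM system property
java -Dwebdriver.chrome.driver=./node_modules/protractor/selenium/chromedriver_2.21 \
  -jar /Users/xxx/selenium-server-standalone-3.0.0-beta2.jar \
  -role node -hub http://localhost:4444/grid/register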