Brunel not working on IBM Data Science Experience - Scala

I am trying to use Brunel in a Spark Scala notebook on IBM Data Science Experience.
%AddJar -magic https://brunelvis.org/jar/spark-kernel-brunel-all-2.2.jar
%%brunel data(leadsDF) map x(state) y(count) color(state)
I always get this error:
Name: Error parsing magics!
Message: Magics [brunel] do not exist!
StackTrace:
Is there an import needed for Brunel?

I tested this on Scala 2.11 with Spark 2.0 and it worked for me.
%AddJar -magic https://brunelvis.org/jar/spark-kernel-brunel-all-2.2.jar
Then I used the line below to display the data, and it showed me the map.
%%brunel data('co2agg') map(low) x(CO2_per_capita) color(Mean_Co2) tooltip(#all):: width=800, height=500
Example reference is from
https://github.com/Brunel-Visualization/Brunel/tree/master/spark-kernel/examples
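For reference, the name inside data(...) must be a Spark DataFrame that already exists in the notebook. A minimal sketch of such a cell for the question's spec (column names and values are illustrative only, and it assumes a Spark 2.x SparkSession called spark is available in the notebook):
import spark.implicits._
// Hypothetical leads data; replace with your real DataFrame
val leadsDF = Seq(("NY", 120), ("CA", 340), ("TX", 210)).toDF("state", "count")
Once leadsDF exists in the session, %%brunel data(leadsDF) map x(state) y(count) color(state) can refer to it.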
I have a fully functioning example here for Scala 2.11:
https://apsportal.ibm.com/analytics/notebooks/97e83c35-06a2-476a-aa57-078a20f04356/view?access_token=8d2f4f749aab9abbf45a08351f7b50f7fb09f06ad9aa7bab5ebc9a2cc98e902f
I tested this with Scala 2.10 and got a dependency error; I am guessing this is because Brunel 2.2 may depend on Scala 2.11.
I hope that helps.
Thanks,
Charles.

I had the same problem where magics were not working correctly. For me, this action from the "Known Issues" fixed it:
Run the following code in a Python notebook to remove the existing Scala libraries: !rm -rvf ~/data/libs/*

Related

Spark Scala - TokenizerExample - IntelliJ Error

I am facing an issue compiling the 'TokenizerExample' that comes with the Spark Scala package.
I have set up my environment in IntelliJ and I am able to successfully compile other Spark Scala examples such as NaiveBayes, CosineSimilarity, etc.
But when I load the 'TokenizerExample' into the IntelliJ IDE, the system displays an error on the lines below, stating 'Cannot resolve reference transform with such signature':
val tokenized = tokenizer.transform(sentenceDataFrame)
val regexTokenized = regexTokenizer.transform(sentenceDataFrame)
I have not made any edits, and the issue appears to be with the transform method. Could you please help me address this issue? I appreciate your support.
Thanks!
If you are using SBT or Maven, make sure you have all the dependencies for the mllib module listed in your .sbt or pom.xml file and rebuild. I haven't personally worked with mllib, but these kinds of errors show up when the dependencies are not resolved properly. Thanks.
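As a hedged illustration, if the project is built with sbt, the mllib dependency could be declared along these lines (version numbers are assumptions; match them to the Spark release you are building against):
// build.sbt -- illustrative only; pick versions matching your Spark install
scalaVersion := "2.11.8"
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core"  % "2.0.0",
  "org.apache.spark" %% "spark-mllib" % "2.0.0"
)
After editing the build file, reimport the project in IntelliJ (or rebuild with sbt) so the new dependency is resolved.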

Spark Kafka - Issue while running from Eclipse IDE

I am experimenting with Spark Kafka integration, and I want to test the code from my Eclipse IDE. However, I get the error below:
java.lang.NoClassDefFoundError: scala/collection/GenTraversableOnce$class
at kafka.utils.Pool.<init>(Pool.scala:28)
at kafka.consumer.FetchRequestAndResponseStatsRegistry$.<init>(FetchRequestAndResponseStats.scala:60)
at kafka.consumer.FetchRequestAndResponseStatsRegistry$.<clinit>(FetchRequestAndResponseStats.scala)
at kafka.consumer.SimpleConsumer.<init>(SimpleConsumer.scala:39)
at org.apache.spark.streaming.kafka.KafkaCluster.connect(KafkaCluster.scala:52)
at org.apache.spark.streaming.kafka.KafkaCluster$$anonfun$org$apache$spark$streaming$kafka$KafkaCluster$$withBrokers$1.apply(KafkaCluster.scala:345)
at org.apache.spark.streaming.kafka.KafkaCluster$$anonfun$org$apache$spark$streaming$kafka$KafkaCluster$$withBrokers$1.apply(KafkaCluster.scala:342)
at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.WrappedArray.foreach(WrappedArray.scala:35)
at org.apache.spark.streaming.kafka.KafkaCluster.org$apache$spark$streaming$kafka$KafkaCluster$$withBrokers(KafkaCluster.scala:342)
at org.apache.spark.streaming.kafka.KafkaCluster.getPartitionMetadata(KafkaCluster.scala:125)
at org.apache.spark.streaming.kafka.KafkaCluster.getPartitions(KafkaCluster.scala:112)
at org.apache.spark.streaming.kafka.KafkaUtils$.createDirectStream(KafkaUtils.scala:403)
at org.apache.spark.streaming.kafka.KafkaUtils$.createDirectStream(KafkaUtils.scala:532)
at org.apache.spark.streaming.kafka.KafkaUtils.createDirectStream(KafkaUtils.scala)
at com.capiot.platform.spark.SparkTelemetryReceiverFromKafkaStream.executeStreamingCalculations(SparkTelemetryReceiverFromKafkaStream.java:248)
at com.capiot.platform.spark.SparkTelemetryReceiverFromKafkaStream.main(SparkTelemetryReceiverFromKafkaStream.java:84)
UPDATE:
The versions that I am using are:
scala - 2.11
spark-streaming-kafka - 1.4.1
spark - 1.4.1
Can anyone resolve this issue? Thanks in advance.
You have the wrong version of Scala. You need 2.10.x per
https://spark.apache.org/docs/1.4.1/
"For the Scala API, Spark 1.4.1 uses Scala 2.10."
This might be too late to help the OP, but when using Kafka streaming with Spark, you need to make sure you use the right jar file.
For example, in my case I have Scala 2.11 (the minimum required for Spark 2.0, which I'm using), and since Spark Kafka requires version 2.0.0, I have to use the artifact spark-streaming-kafka-0-8-assembly_2.11-2.0.0-preview.jar.
Notice that my Scala version and the artifact version can be seen in the suffix 2.11-2.0.0.
Hope this helps (someone).
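For the Spark 2.0 / Scala 2.11 combination described in that answer, the equivalent sbt coordinates could look roughly like this (versions are illustrative; the important part is that the _2.11 suffix matches your Scala version):
// build.sbt -- Kafka 0.8 connector for Spark 2.0.0 on Scala 2.11
scalaVersion := "2.11.8"
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-streaming"           % "2.0.0",
  "org.apache.spark" %% "spark-streaming-kafka-0-8" % "2.0.0"
)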

import scala.io.StdIn

I'm using Eclipse ScalaIDE and for some reason I'm not able to
import scala.io.StdIn
I'm getting a red squiggly that tells me:
object StdIn is not a member of package io
And I can see that it's not in that scala.io package in the jar file. The ScalaDoc, however, says it should be there. I've tried both Scala 2.10.4 and 2.11.5. I've used the Eclipse ScalaIDE to create the Scala project, and I've also created an sbt Eclipse project directly using the scalasbt.plugin, which I use all the time to manage ScalaIDE dependencies.
sbt "eclipse with-source=true"
Neither way is picking it up.
I'm currently taking the Coursera Reactive Programming course, and an assignment file has this import. I'm able to compile the project with sbt directly, but Eclipse ScalaIDE is not doing the job. Any clues? There may be a good reason not to use scala.io.StdIn, but my question is why I can't get it to import in the ScalaIDE.
Thank you.
scala.io.StdIn is new in Scala 2.11.x and does not exist in previous versions.
The problem you are likely encountering is that ScalaIDE is not picking up the Scala version you are specifying. Since you say that you tried it with 2.10.4, it probably still has that cached or set somewhere, and it's failing because it cannot find the specified class.
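A minimal sketch of the fix, assuming an sbt project: pin scalaVersion := "2.11.5" in build.sbt, regenerate the Eclipse project with sbt "eclipse with-source=true", and the import then resolves:
// Requires Scala 2.11+; scala.io.StdIn does not exist in 2.10.x
import scala.io.StdIn

object ReadExample extends App {
  val name = StdIn.readLine("Enter your name: ")  // prompt and read one line from stdin
  println(s"Hello, $name")
}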

Scala: trying to get log4j working

Scala newbie here (it's my second day using it). I want to get log4j logging working in my Scala script. The script and the results are below; any ideas as to what's going wrong?
[sean#ibmp2 pybackup]$ cat backup.scala
import org.apache.log4j._
val log = LogFactory.getLog()
log.info("started backup")
[sean#ibmp2 pybackup]$ scala -cp log4j-1.2.16.jar:. backup.scala
/home/sean/projects/personal/pybackup/backup.scala:1: error: value apache is not a member of package org
import org.apache.log4j._
^
one error found
I reproduced it under Windows: the '-classpath' delimiter must be ';' there (not ':'). Are you using Cygwin or some sort of Unix emulator?
But the Scala script works anywhere without the current dir in the classpath. Try:
$ scala -cp log4j-1.2.16.jar backup.scala
FYI: LogFactory is not a log4j class; it comes from the commons-logging library.
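For illustration, a minimal sketch of the script using log4j's own API instead (Logger and BasicConfigurator are standard log4j 1.2 classes; this is not the original poster's exact code):
// backup.scala -- run with: scala -cp log4j-1.2.16.jar backup.scala
import org.apache.log4j.{BasicConfigurator, Logger}

BasicConfigurator.configure()        // attach a simple console appender
val log = Logger.getLogger("backup")
log.info("started backup")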
UPDATE
Another possible cause: a broken jar in the classpath, perhaps corrupted during download or something else. The Scala interpreter only reports that the member of the package is unavailable.
$ echo "qwerty" > example.jar
$ scala -cp example.jar backup.scala
backup.scala:1: error: value apache is not a member of package org
...
Inspect the contents of the jar file:
$ jar -tf log4j-1.2.16.jar
...
org/apache/log4j/Appender.class
...
Did you remember to put log4j.jar in your classpath?
I had a similar issue when I started doing Scala development using Eclipse; doing a clean build solved the problem.
I guess the Scala tools are not mature yet.
Instead of using log4j directly, you might try using Configgy. It's the Scala Way™ to work with log4j, as well as configuration files. It also plays nicely with SBT and Maven.
I asked and answered this question myself; have a look.
Put it under src/main/resources/logback.xml. It will be copied to the right location when SBT does the artifact assembly.

Bad class file error when using Scala 2.8.x (2.8.0 and 2.8.1) in JavaFX 1.x (1.2 and 1.3.1)

When trying to import scala.Option in a JavaFX script, I get the following javafxc error:
cannot access scala.Option.$anonfun$orNull$1
bad class file: scala/Option$$anonfun$orNull$1.class(scala:Option$$anonfun$orNull$1.class)
undeclared type variable: A1
Please remove or make sure it appears in the correct subdirectory of the classpath.
import scala.Option;
I am using Scala 2.8.1, Javafxc 1.3.1_b101, JVM 1.6.0_21-b06, OS Ubuntu 10.10. The same code was working in Scala 2.7.7.
Later edit:
The same error is reported if I import scala.immutable.Seq/List/Traversable/Iterable. I have tried the imports in a default NetBeans 6.9.1 JavaFX project which has only scala-library.jar in the classpath.
It reminded me first of #4067, but this one looks quite different.
I would suggest that you try to reproduce the error with 2.8.1 or the 2.9 trunk; maybe it has been fixed?