How to run DocumentDB (MongoDB) in Zeppelin using AWS EMR? - mongodb

I am using Zeppelin inside an AWS EMR cluster.
I then created a DocumentDB instance and tried to query it from Zeppelin,
but when I run the code it fails because the MongoDB Java driver dependency is not available:
<console>:25: error: object mongodb is not a member of package org
import org.mongodb.scala._
^
<console>:26: error: object bson is not a member of package org
import org.bson._
^
Is there any way to add a Maven dependency so that I can run Mongo in Zeppelin?
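One approach that generally works is to load the driver with Zeppelin's dynamic dependency loader before the Spark interpreter starts (or, equivalently, to add the artifact under the spark interpreter's Dependencies in the interpreter settings). A minimal sketch, assuming the Scala driver artifact org.mongodb.scala:mongo-scala-driver_2.12:2.9.0 and a placeholder DocumentDB connection string, neither of which comes from the question:

%dep
// must run before the Spark interpreter has been initialized in this note
z.load("org.mongodb.scala:mongo-scala-driver_2.12:2.9.0")

%spark
import org.mongodb.scala._

// DocumentDB speaks the MongoDB wire protocol; the host, credentials and TLS
// settings below are placeholders to adapt to your cluster endpoint
val client = MongoClient("mongodb://user:password@docdb-endpoint:27017/?ssl=true&replicaSet=rs0&retryWrites=false")
val db = client.getDatabase("test")

If %dep is disabled on the EMR build of Zeppelin, setting spark.jars.packages to the same coordinates in the interpreter configuration should have the same effect.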

Related

object neo4j is not a member of package org

I am trying to read data from Neo4j using a Spark job. The import itself shows this error. I tried to import org.neo4j.spark._ in IntelliJ IDEA, but it shows the error "Cannot resolve symbol spark". When trying it in the spark-shell, it throws:
":23: error: object neo4j is not a member of package org
import org.neo4j.spark._"
Spark version - 3.1.1. How should the dependencies be declared?
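For Spark 3.x the connector coordinates differ from the old neo4j-spark-connector releases, so a 2.x-era coordinate will not put org.neo4j.spark on the classpath. A build.sbt sketch, assuming the Neo4j Connector for Apache Spark artifact; the exact version string is an assumption, so check Maven Central for the release matching Spark 3.1:

// build.sbt - the "_for_spark_3" suffix is how the connector tags its Spark 3 builds; the version number is a guess
libraryDependencies += "org.neo4j" % "neo4j-connector-apache-spark_2.12" % "4.1.0_for_spark_3"

With that connector on the classpath, reads typically go through the DataSource API rather than a helper object, for example:

// `spark` is the job's SparkSession (provided automatically in spark-shell);
// the bolt URL, credentials and label below are placeholders, not from the question
val df = spark.read
  .format("org.neo4j.spark.DataSource")
  .option("url", "bolt://localhost:7687")
  .option("authentication.basic.username", "neo4j")
  .option("authentication.basic.password", "password")
  .option("labels", "Person")
  .load()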

Error while running Scala code - Databricks 7.3LTS and above

I am running Databricks 7.3 LTS and getting errors while trying to use the Scala bulk copy API.
The error is:
object sqldb is not a member of package com.microsoft
I have installed the correct SQL connector driver but am not sure how to fix this error.
The installed driver is:
com.microsoft.azure:spark-mssql-connector_2.12:1.1.0
I have also installed the JAR dependency as below:
spark_mssql_connector_2_12_1_1_0.jar
I couldn't find any Scala code example for the above configuration on the internet.
My Scala code sample is as below:
%scala
import com.microsoft.azure.sqldb.spark.config.Config
As soon as I run this command I get the error:
object sqldb is not a member of package com.microsoft.azure
Any help, please?
In the new connector you need to use com.microsoft.sqlserver.jdbc.spark.SQLServerBulkJdbcOptions class to specify bulk copy options.
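For context, here is a write sketch against the new connector; the format name com.microsoft.sqlserver.jdbc.spark comes from the spark-mssql-connector documentation, while the URL, table, credentials and the sample DataFrame are placeholders:

import org.apache.spark.sql.SaveMode

// `spark` is the notebook's SparkSession; the DataFrame here is just a stand-in
val df = spark.range(0, 1000).toDF("id")

// connection values are placeholders
val url = "jdbc:sqlserver://yourserver.database.windows.net:1433;databaseName=yourdb"

df.write
  .format("com.microsoft.sqlserver.jdbc.spark")
  .mode(SaveMode.Append)
  .option("url", url)
  .option("dbtable", "dbo.target_table")
  .option("user", "your_user")
  .option("password", "your_password")
  .option("tableLock", "true")      // bulk-insert style option exposed by the connector
  .option("batchsize", "100000")
  .save()

The old com.microsoft.azure.sqldb.spark.config.Config API belongs to the retired azure-sqldb-spark library, which is why that import cannot resolve when only spark-mssql-connector is installed.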

Cannot import Cosmosdb in databricks

I set up a new cluster on Databricks using Databricks Runtime 10.1 (includes Apache Spark 3.2.0, Scala 2.12). I also installed azure_cosmos_spark_3_2_2_12_4_6_2.jar in Libraries.
I created a new notebook with Scala:
import com.microsoft.azure.cosmosdb.spark.schema._
import com.microsoft.azure.cosmosdb.spark.CosmosDBSpark
import com.microsoft.azure.cosmosdb.spark.config.Config
But I still get the error: object cosmosdb is not a member of package com.microsoft.azure
Does anyone know which step I am missing?
Thanks
Looks like the imports you are doing are for the older Spark Connector (https://github.com/Azure/azure-cosmosdb-spark).
For the Spark 3.2 Connector, you might want to follow the quickstart guides: https://learn.microsoft.com/azure/cosmos-db/sql/create-sql-api-spark
The official repository is: https://github.com/Azure/azure-sdk-for-java/tree/main/sdk/cosmos/azure-cosmos-spark_3-2_2-12
Complete Scala sample: https://github.com/Azure/azure-sdk-for-java/blob/main/sdk/cosmos/azure-cosmos-spark_3_2-12/Samples/Scala-Sample.scala
Here is the configuration reference: https://github.com/Azure/azure-sdk-for-java/blob/main/sdk/cosmos/azure-cosmos-spark_3_2-12/docs/configuration-reference.md
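A minimal read sketch with that Spark 3 connector, using the cosmos.oltp format from the quickstart; the endpoint, key, database and container values are placeholders:

// `spark` is the SparkSession Databricks provides; all account values are placeholders
val cosmosEndpoint = "https://<your-account>.documents.azure.com:443/"
val cosmosKey = "<your-account-key>"

val df = spark.read
  .format("cosmos.oltp")
  .option("spark.cosmos.accountEndpoint", cosmosEndpoint)
  .option("spark.cosmos.accountKey", cosmosKey)
  .option("spark.cosmos.database", "SampleDB")
  .option("spark.cosmos.container", "SampleContainer")
  .load()

df.printSchema()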
You may be missing the pip install step:
pip install azure-cosmos

error not found value spark import spark.implicits._ import spark.sql

I am using Hadoop 2.7.2, HBase 1.4.9, Spark 2.2.0, Scala 2.11.8 and Java 1.8 on a Hadoop cluster composed of one master and two slaves.
When I run spark-shell after starting the cluster, it works fine.
I am trying to connect to HBase using Scala by following this tutorial: https://www.youtube.com/watch?v=gGwB0kCcdu0
But when I try, like he does, to run spark-shell adding those jars as an argument, I get this error:
spark-shell --jars
"hbase-annotations-1.4.9.jar,hbase-common-1.4.9.jar,hbase-protocol-1.4.9.jar,htrace-core-3.1.0-incubating.jar,zookeeper-3.4.6.jar,hbase-client-1.4.9.jar,hbase-hadoop2-compat-1.4.9.jar,metrics-json-3.1.2.jar,hbase-server-1.4.9.jar"
<console>:14: error: not found: value spark
import spark.implicits._
^
<console>:14: error: not found: value spark
import spark.sql
^
After that, even if I log out and run spark-shell again, I have the same issue.
Can anyone please tell me what the cause is and how to fix it?
In your import statement, spark should be an object of type SparkSession. That object should have been created for you previously, or you need to create it yourself (see the Spark docs). I didn't watch your tutorial video.
The point is that it doesn't have to be called spark. It could, for instance, be called sparkSession, and then you could do import sparkSession.implicits._
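A sketch of what that looks like, with a generic app name (any name for the val works, as long as the imports refer to it):

import org.apache.spark.sql.SparkSession

// Create (or reuse) the session yourself when the shell has not provided `spark`
val sparkSession = SparkSession.builder()
  .appName("hbase-shell-test")   // placeholder app name
  .getOrCreate()

// These imports must reference the val defined above, whatever it is called
import sparkSession.implicits._
import sparkSession.sql

val df = Seq((1, "a"), (2, "b")).toDF("id", "value")   // toDF comes from the implicits
sql("SELECT 1 AS ok").show()                           // sql comes from the second import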

Neo4j Spark connector error: import.org.neo4j.spark._ object neo4j is not found in package org

I have my Scala code running in Spark, connecting to Neo4j, on my Mac. I wanted to test it on my Windows machine but cannot seem to get it to run; I keep getting the error:
Spark context Web UI available at http://192.168.43.4:4040
Spark context available as 'sc' (master = local[*], app id = local-1508360735468).
Spark session available as 'spark'.
Loading neo4jspark.scala...
<console>:23: error: object neo4j is not a member of package org
import org.neo4j.spark._
^
Which gives subsequent errors of:
changeScoreList: java.util.List[Double] = []
<console>:87: error: not found: value neo
val initialDf2 = neo.cypher(noBbox).partitions(5).batch(10000).loadDataFrame
^
<console>:120: error: not found: value neo
Not sure what I am doing wrong; I am executing it like this:
spark-shell --conf spark.neo4j.bolt.password=TestNeo4j --packages neo4j-contrib:neo4j-spark-connector:2.0.0-M2,graphframes:graphframes:0.2.0-spark2.0-s_2.11 -i neo4jspark.scala
It says it finds all the dependencies, yet the code throws the error when using neo. Not sure what else to try, or why this works on my Mac but not on my Windows box. Spark version 2.2 is the same, Neo4j is up and running with the same version, Scala too, even Java (save for a few minor revision differences).
This is a known issue (with a related one here), the fix for which is part of the Spark 2.2.1 release.
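For reference, once the connector jar actually ends up on the classpath (for example after moving to Spark 2.2.1), the failing part of such a script only needs something along these lines; the Cypher query here is a placeholder standing in for the original noBbox, and the connection settings still come from the spark.neo4j.bolt.* conf options:

import org.neo4j.spark._

// `sc` is the SparkContext the shell provides; the 2.0.x connector builds its helper from it
val neo = Neo4j(sc)

val noBbox = "MATCH (n) RETURN n.score AS score LIMIT 100"   // placeholder query
val initialDf2 = neo.cypher(noBbox).partitions(5).batch(10000).loadDataFrame
initialDf2.show(5)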