Error while running Scala code - Databricks 7.3 LTS and above - scala

I am running Databricks 7.3 LTS and getting an error while trying to use the Scala bulk copy API.
The error is:
object sqldb is not a member of package com.microsoft.
I have installed the correct SQL connector driver but am not sure how to fix this error.
The installed driver is:
com.microsoft.azure:spark-mssql-connector_2.12:1.1.0
I have also installed the JAR dependency below:
spark_mssql_connector_2_12_1_1_0.jar
I couldn't find any Scala code example for this configuration on the internet.
My Scala code sample is below:
%scala
import com.microsoft.azure.sqldb.spark.config.Config
As soon as I run this command I get the error:
object sqldb is not a member of package com.microsoft.azure
Any help, please?

In the new connector you need to use the com.microsoft.sqlserver.jdbc.spark.SQLServerBulkJdbcOptions class to specify bulk copy options.
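The com.microsoft.azure.sqldb.spark imports belong to the older azure-sqldb-spark library; with spark-mssql-connector the write goes through the DataFrame API instead. A minimal sketch is below (server, database, table, user, and password values are placeholders; tableLock and batchsize are two of the connector's documented bulk copy options):

// Minimal sketch for com.microsoft.azure:spark-mssql-connector_2.12:1.1.0.
// "df" is an existing DataFrame; connection values are placeholders.
val url = "jdbc:sqlserver://<server>.database.windows.net:1433;databaseName=<database>"
df.write
  .format("com.microsoft.sqlserver.jdbc.spark")
  .mode("append")
  .option("url", url)
  .option("dbtable", "dbo.MyTable")
  .option("user", "<user>")
  .option("password", "<password>")
  .option("tableLock", "true")   // bulk copy option
  .option("batchsize", "100000") // bulk copy option
  .save()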

Related

Cannot import Cosmosdb in databricks

I set up a new cluster on Databricks using Databricks Runtime 10.1 (includes Apache Spark 3.2.0, Scala 2.12). I also installed azure_cosmos_spark_3_2_2_12_4_6_2.jar in Libraries.
I created a new notebook with Scala:
import com.microsoft.azure.cosmosdb.spark.schema._
import com.microsoft.azure.cosmosdb.spark.CosmosDBSpark
import com.microsoft.azure.cosmosdb.spark.config.Config
But I still get the error: object cosmosdb is not a member of package com.microsoft.azure
Does anyone know which step I am missing?
Thanks
Looks like the imports you are doing are for the older Spark Connector (https://github.com/Azure/azure-cosmosdb-spark).
For the Spark 3.2 Connector, you might want to follow the quickstart guides: https://learn.microsoft.com/azure/cosmos-db/sql/create-sql-api-spark
The official repository is: https://github.com/Azure/azure-sdk-for-java/tree/main/sdk/cosmos/azure-cosmos-spark_3-2_2-12
Complete Scala sample: https://github.com/Azure/azure-sdk-for-java/blob/main/sdk/cosmos/azure-cosmos-spark_3_2-12/Samples/Scala-Sample.scala
Here is the configuration reference: https://github.com/Azure/azure-sdk-for-java/blob/main/sdk/cosmos/azure-cosmos-spark_3_2-12/docs/configuration-reference.md
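For reference, a minimal read with the Spark 3 connector looks roughly like the sketch below; the endpoint, key, database, and container values are placeholders.

// Minimal sketch for the azure-cosmos-spark Spark 3 connector; the endpoint,
// key, database, and container values below are placeholders.
val cosmosCfg = Map(
  "spark.cosmos.accountEndpoint" -> "https://<account>.documents.azure.com:443/",
  "spark.cosmos.accountKey"      -> "<account-key>",
  "spark.cosmos.database"        -> "<database>",
  "spark.cosmos.container"       -> "<container>"
)
val df = spark.read
  .format("cosmos.oltp")
  .options(cosmosCfg)
  .load()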
You may be missing the pip install step:
pip install azure-cosmos

Reference uploaded JAR library

I've built a set of support functions into a helper.jar library and imported it to a Databricks cluster. The jar is installed on the cluster, but I'm not able to reference the functions in the library.
The jar import has been tested, the cluster restarted, and the jar can be referenced in IntelliJ, where it was developed as an Azure Spark/HDInsight project.
// next line generates error: value helper is not a member of org.apache.spark.sql.SparkSession
import helper
// next line generates error: not found: value fn_conversion
display(df.withColumn("RevenueConstantUSD", fn_conversion($"Revenue")))
I'd expect the helper functions to be visible after library deployment, or possibly after adding the import command.
Edit: added information about IntelliJ project type
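For comparison, functions packaged in a jar are addressed by their package and object name rather than by the jar file name; a minimal sketch with hypothetical names (the com.example.helper package, the Helper object, and the conversion rate are all placeholders) would look like:

// Hypothetical source inside helper.jar; package, object, and rate are placeholders.
package com.example.helper

import org.apache.spark.sql.Column
import org.apache.spark.sql.functions.lit

object Helper {
  // example conversion: multiply a revenue column by a constant rate
  def fn_conversion(c: Column): Column = c * lit(1.1)
}

The notebook import would then be import com.example.helper.Helper._ rather than import helper, after which fn_conversion($"Revenue") resolves.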

import uploaded library to Databricks

I uploaded the spark-ts time series library to Databricks using the Maven coordinate option in Create Library. I was able to successfully create the library and attach it to my cluster. But when I try to import the spark-ts library in Databricks using org.apache.spark.spark-ts, it throws an error stating that notebook:1: error: object ts is not a member of package org.apache.spark. Please let me know how to handle this issue.
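If the library attached is the Cloudera spark-ts package (Maven coordinate com.cloudera.sparkts:sparkts, which is an assumption about the artifact used), its classes live under com.cloudera.sparkts rather than org.apache.spark, so the import would look roughly like:

// Sketch, assuming the Cloudera spark-ts artifact (com.cloudera.sparkts:sparkts)
// is the library that was attached to the cluster.
import com.cloudera.sparkts.{DateTimeIndex, DayFrequency, TimeSeriesRDD}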

brunel not working on IBM data science experience

I am trying to use Brunel in a Spark Scala notebook on IBM Data Science Experience.
%AddJar -magic https://brunelvis.org/jar/spark-kernel-brunel-all-2.2.jar
%%brunel data(leadsDF) map x(state) y(count) color(state)
I always get this error:
Name: Error parsing magics!
Message: Magics [brunel] do not exist!
StackTrace:
Is there an import needed for Brunel?
I tested this with Scala 2.11 and Spark 2.0, and it worked for me.
%AddJar -magic https://brunelvis.org/jar/spark-kernel-brunel-all-2.2.jar
Then I used the line below to display the data, and it showed me the map.
%%brunel data('co2agg') map(low) x(CO2_per_capita) color(Mean_Co2) tooltip(#all):: width=800, height=500
Example reference is from
https://github.com/Brunel-Visualization/Brunel/tree/master/spark-kernel/examples
I have a fully functioning example here for Scala 2.11:
https://apsportal.ibm.com/analytics/notebooks/97e83c35-06a2-476a-aa57-078a20f04356/view?access_token=8d2f4f749aab9abbf45a08351f7b50f7fb09f06ad9aa7bab5ebc9a2cc98e902f
I tested this with Scala 2.10 and I get a dependency error; I am guessing it is because Brunel 2.2 may depend on Scala 2.11.
I hope that helps.
Thanks,
Charles.
I had the same problem where magics were not working correctly. For me, this action from the "Known Issues" fixed it:
Run the following code in a Python notebook to remove the existing Scala libraries: !rm -rvf ~/data/libs/*

Using SBT on a remote node without internet access via SSH

I am trying to write a Spark program with Scala on a remote machine, but that machine has no internet access. Since I am using the pre-built version for Hadoop, I am able to run the pre-compiled examples:
[user#host spark-0.7.2]$ ./run spark.examples.LocalPi
but I can't compile anything that references spark on the machine:
$ scalac PiEstimate.scala
PiEstimate.scala:1: error: not found: object spark
import spark.SparkContext
^
Normally, I would use SBT to take care of any dependencies, but the machine does not have internet access, and tunneling internet through SSH is not possible.
Is it possible to compile an SBT project on a remote machine that has no internet access? Or how could I manually link the Spark dependencies to the Scala compiler?
If you're compiling your Spark program through scalac, you'll have to add Spark's jars to scalac's classpath; I think this should work:
scalac -classpath "$SPARK_HOME/target/scala-*/*.jar" PiEstimate.scala

I know this is an old post, but I had to deal with this issue recently. I solved it by removing the dependencies from my .sbt file and adding the Spark jar (spark-home/assembly/target/scala-2.10/spark-[...].jar) under the my-project-dir/lib directory. You can also point to it with unmanagedBase := file("/path/to/jars/"). Then I could use sbt package as usual.
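As an illustration of the unmanagedBase approach, a minimal build.sbt sketch (project name, Scala version, and jar directory are placeholders, written with current sbt syntax) might look like:

// build.sbt -- minimal sketch; the Spark assembly jar is assumed to have been
// copied to the offline machine by hand (for example with scp).
name := "pi-estimate"
scalaVersion := "2.10.4"
// jars dropped into the default lib/ directory are picked up automatically;
// a different directory can be declared instead:
unmanagedBase := baseDirectory.value / "jars"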
I know this is an old post but I had to deal with this issue recently. I solved it by removing the dependencies from my .sbt file and adding the spark jar (spark-home/assembly/target/scala.2-10/spark-[...].jar) under my-project-dir/lib directory. You can also point to it using unmanagedBase = file("/path/to/jars/") Then I could use sbt package as usually