Scala ClosedByInterruptException using os.lib watch service

I'm new to Scala!
I'm trying to use os.watch (from os-lib-watch) to read JSON files whenever a file name changes in a directory. The problem is that the JSON file cannot be read when I manually rename it to something else.
object Main extends App {
  // Reading a fixed path directly works without problems
  val jsonPath = os.Path("/users/tst.json")
  val jsonString = os.read(jsonPath)
  val data = ujson.read(jsonString)
  println(data)
  def readFileContent(file: os.Path): Unit = {
    println("Reading Input Json..")
    val jsonString = os.read(file) // fails here
    val data = ujson.read(jsonString)
    println(data)
  }
  def processFile(filePath: os.Path): Unit = {
    println("FileName:" + filePath)
    readFileContent(filePath)
  }
  // The callback receives the set of changed paths; only the last one is processed
  os.watch.watch(Seq(os.pwd / "output"),
    f => processFile(f.last))
}
sbt:
"com.lihaoyi" %% "os-lib" % "0.7.8", "com.lihaoyi" %% "os-lib-watch" % "0.4.2"
Error when reading the json file:
JNA: Callback os.watch.FSEventsWatcher$$anon$1@82009c threw the following exception:
java.nio.channels.ClosedByInterruptException
at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:202)
at sun.nio.ch.FileChannelImpl.read(FileChannelImpl.java:164)
at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:65)
at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:109)
at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:103)
at java.io.InputStream.read(InputStream.java:101)
at os.Internals$.transfer0(Internals.scala:15)
at os.Internals$.transfer(Internals.scala:23)
at os.read$bytes$.apply(ReadWriteOps.scala:257)
at os.read$.apply(ReadWriteOps.scala:216)
at os.read$.apply(ReadWriteOps.scala:214)
at Main$.readFileContent(Main.scala:21)
at Main$.processFile(Main.scala:28)
at Main$.$anonfun$new$1(Main.scala:33)
at Main$.$anonfun$new$1$adapted(Main.scala:33)
at os.watch.FSEventsWatcher$$anon$1.invoke(FSEventsWatcher.scala:34)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at com.sun.jna.CallbackReference$DefaultCallbackProxy.invokeCallback(CallbackReference.java:520)
at com.sun.jna.CallbackReference$DefaultCallbackProxy.callback(CallbackReference.java:551)
at com.sun.jna.Native.invokeVoid(Native Method)
at com.sun.jna.Function.invoke(Function.java:414)
at com.sun.jna.Function.invoke(Function.java:360)
at com.sun.jna.Library$Handler.invoke(Library.java:244)
at com.sun.proxy.$Proxy3.CFRunLoopRun(Unknown Source)
at os.watch.FSEventsWatcher.run(FSEventsWatcher.scala:75)
at os.watch.package$.$anonfun$watch$1(package.scala:39)
at java.lang.Thread.run(Thread.java:748)
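For background on the exception itself: java.nio raises a ClosedByInterruptException when a thread whose interrupt flag is set performs I/O on an interruptible channel such as a FileChannel. The following is a minimal sketch of that mechanism using only the standard library. It does not use os-lib, the names InterruptedReadDemo and readOutcome are made up for illustration, and it is not a diagnosis of where the interrupt in the trace above comes from.

```scala
import java.nio.ByteBuffer
import java.nio.channels.{ClosedByInterruptException, FileChannel}
import java.nio.file.{Files, Path, StandardOpenOption}
import java.util.concurrent.atomic.AtomicReference

object InterruptedReadDemo {
  /** Reads `file` through a FileChannel on a fresh thread and reports what happened.
    * If `interruptFirst` is true, the thread sets its own interrupt flag before reading. */
  def readOutcome(file: Path, interruptFirst: Boolean): String = {
    val outcome = new AtomicReference[String]("not run")
    val worker = new Thread(() => {
      if (interruptFirst) Thread.currentThread().interrupt()
      val channel = FileChannel.open(file, StandardOpenOption.READ)
      try {
        val n = channel.read(ByteBuffer.allocate(64))
        outcome.set(s"read $n bytes")
      } catch {
        // The channel is closed and this exception is thrown when the reading thread is interrupted
        case _: ClosedByInterruptException => outcome.set("ClosedByInterruptException")
      } finally channel.close()
    })
    worker.start()
    worker.join()
    outcome.get()
  }

  def main(args: Array[String]): Unit = {
    val tmp = Files.createTempFile("interrupt-demo", ".json")
    Files.write(tmp, """{"a":1}""".getBytes("UTF-8"))
    println(readOutcome(tmp, interruptFirst = false))
    println(readOutcome(tmp, interruptFirst = true))
    Files.delete(tmp)
  }
}
```

If this mechanism applies here, the thing to look for is whatever interrupts the watcher thread while the callback is still reading the file.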

Related

Scala: Reading data from scylla throws exception

I am new to Scala and am trying to run a simple query to retrieve some data from Scylla. Here is my code:
val my_name = "test"
val cluster = ScyllaConnector.getCluster(clusterIpString, scyllaPreferredDc, scyllaUsername, scyllaPassword)
val session = cluster.connect(keySpace)
val preparedStatement: PreparedStatement = session.prepare(GOID_QUERY)
val nameResults = session.execute(preparedStatement.bind(my_name))
val nameResult = nameResults.one()
if(nameResult != null){
println("Here")
val id_recent = nameResult.getSet("id_recent", classOf[String])
println(id_recent)
}
session.close()
cluster.close()
Throws:
Exception in thread "main" com.datastax.driver.core.exceptions.CodecNotFoundException: Codec not found for requested operation: [varchar <-> java.util.Set<java.lang.String>]
at com.datastax.driver.core.CodecRegistry.notFound(CodecRegistry.java:679)
at com.datastax.driver.core.CodecRegistry.createCodec(CodecRegistry.java:526)
at com.datastax.driver.core.CodecRegistry.findCodec(CodecRegistry.java:506)
at com.datastax.driver.core.CodecRegistry.access$200(CodecRegistry.java:140)
at com.datastax.driver.core.CodecRegistry$TypeCodecCacheLoader.load(CodecRegistry.java:211)
at com.datastax.driver.core.CodecRegistry$TypeCodecCacheLoader.load(CodecRegistry.java:208)
at shadeio.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3527)
at shadeio.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2319)
at shadeio.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2282)
at shadeio.common.cache.LocalCache$Segment.get(LocalCache.java:2197)
at shadeio.common.cache.LocalCache.get(LocalCache.java:3937)
at shadeio.common.cache.LocalCache.getOrLoad(LocalCache.java:3941)
at shadeio.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4824)
at com.datastax.driver.core.CodecRegistry.lookupCodec(CodecRegistry.java:480)
at com.datastax.driver.core.CodecRegistry.codecFor(CodecRegistry.java:448)
at com.datastax.driver.core.AbstractGettableByIndexData.codecFor(AbstractGettableByIndexData.java:73)
at com.datastax.driver.core.AbstractGettableByIndexData.getSet(AbstractGettableByIndexData.java:318)
at com.datastax.driver.core.AbstractGettableData.getSet(AbstractGettableData.java:26)
at com.datastax.driver.core.AbstractGettableByIndexData.getSet(AbstractGettableByIndexData.java:307)
at com.datastax.driver.core.AbstractGettableData.getSet(AbstractGettableData.java:26)
at com.datastax.driver.core.AbstractGettableData.getSet(AbstractGettableData.java:215)
at class.path$.main(CodeName.scala:184)
at class.path.main(CodeName.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:845)
at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:161)
at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:184)
at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:920)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:929)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
I am sure the problem arises in the getSet line, where it asks for classOf[String], but I'm not sure what to replace it with.
Here is my table definition:
-- auto-generated definition
create table name_table
(
name text,
id_recent text,
primary key (name)
)

You have incompatible types: the column has type text in the database, but you are trying to retrieve it as a set of strings (the [varchar <-> java.util.Set<java.lang.String>] part of the message says exactly that).
Replace getSet with getString. If you need a set, construct it yourself from the retrieved string.
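As a rough sketch of the "construct it yourself" step, the snippet below turns a delimited string into a Scala Set. It assumes the text column holds comma-separated values, which the question does not state; the object name TextColumnToSet, the helper toSet, and the sample value are made up for illustration, and the string literal stands in for what nameResult.getString("id_recent") would return.

```scala
object TextColumnToSet {
  /** Splits a delimited text value into a set of trimmed, non-empty tokens.
    * The comma default is an assumption; use the separator the column really holds. */
  def toSet(raw: String, delimiter: String = ","): Set[String] =
    if (raw == null) Set.empty
    else raw.split(java.util.regex.Pattern.quote(delimiter)).map(_.trim).filter(_.nonEmpty).toSet

  def main(args: Array[String]): Unit = {
    // Stand-in for the value read from the text column via getString
    val idRecent = "id1, id2,id3"
    println(toSet(idRecent))
  }
}
```

If the column is really meant to hold a collection, declaring it as set<text> in the table definition would be the alternative, and getSet would then match the column type.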

spark dealing with carbondata

Below is the code snippet I'm using to create a CarbonData table in S3. However, in spite of setting the AWS credentials in the Hadoop configuration, it still complains that the secret key and access key are not set. What is the issue here?
import org.apache.spark.sql.CarbonSession._
import org.apache.spark.sql.SparkSession
val carbon = SparkSession.builder().config(sc.getConf).getOrCreateCarbonSession("s3n://url")
carbon.sparkContext.hadoopConfiguration.set("fs.s3n.awsAccessKeyId","<accesskey>")
carbon.sparkContext.hadoopConfiguration.set("fs.s3n.awsSecretAccessKey","<secretaccesskey>")
carbon.sql("CREATE TABLE IF NOT EXISTS test_table(id string,name string,city string,age Int) STORED BY 'carbondata'")
Last command yields error:
java.lang.IllegalArgumentException: AWS Access Key ID and Secret
Access Key must be specified as the username or password
(respectively) of a s3n URL, or by setting the fs.s3n.awsAccessKeyId
or fs.s3n.awsSecretAccessKey properties (respectively)
Spark Version : 2.2.1
Command used to start spark-shell:
$SPARK_PATH/bin/spark-shell --jars /localpath/jar/apache-carbondata-1.3.1-bin-spark2.2.1-hadoop2.7.2/apache-carbondata-1.3.1-bin-spark2.2.1-hadoop2.7.2.jar,/localpath/jar/spark-avro_2.11-4.0.0.jar --packages com.amazonaws:aws-java-sdk-pom:1.9.22,org.apache.hadoop:hadoop-aws:2.7.2,org.slf4j:slf4j-simple:1.7.21,asm:asm:3.2,org.xerial.snappy:snappy-java:1.1.7.1,com.databricks:spark-avro_2.11:4.0.0
UPDATE:
I found that S3 support is only available in 1.4.0 RC1, so I built RC1 and tested the code below against it. I still seem to be running into issues. Any help is appreciated.
Code:
import org.apache.spark.sql.CarbonSession._
import org.apache.hadoop.fs.s3a.Constants.{ACCESS_KEY, ENDPOINT, SECRET_KEY}
import org.apache.spark.sql.SparkSession
import org.apache.carbondata.core.constants.CarbonCommonConstants
object sample4 {
  def main(args: Array[String]): Unit = {
    val (accessKey, secretKey, endpoint) = getKeyOnPrefix("s3n://")
    //val rootPath = new File(this.getClass.getResource("/").getPath
    // + "../../../..").getCanonicalPath
    val path = "/localpath/sample/data1.csv"
    val spark = SparkSession
      .builder()
      .master("local")
      .appName("S3UsingSDKExample")
      .config("spark.driver.host", "localhost")
      .config(accessKey, "<accesskey>")
      .config(secretKey, "<secretkey>")
      //.config(endpoint, "s3-us-east-1.amazonaws.com")
      .getOrCreateCarbonSession()
    spark.sql("Drop table if exists carbon_table")
    spark.sql(
      s"""
         | CREATE TABLE if not exists carbon_table(
         | shortField SHORT,
         | intField INT,
         | bigintField LONG,
         | doubleField DOUBLE,
         | stringField STRING,
         | timestampField TIMESTAMP,
         | decimalField DECIMAL(18,2),
         | dateField DATE,
         | charField CHAR(5),
         | floatField FLOAT
         | )
         | STORED BY 'carbondata'
         | LOCATION 's3n://bucketName/table/carbon_table'
         | TBLPROPERTIES('SORT_COLUMNS'='', 'DICTIONARY_INCLUDE'='dateField, charField')
       """.stripMargin)
  }
  // Picks the Spark config keys for credentials based on the URL scheme of the store path
  def getKeyOnPrefix(path: String): (String, String, String) = {
    val endPoint = "spark.hadoop." + ENDPOINT
    if (path.startsWith(CarbonCommonConstants.S3A_PREFIX)) {
      ("spark.hadoop." + ACCESS_KEY, "spark.hadoop." + SECRET_KEY, endPoint)
    } else if (path.startsWith(CarbonCommonConstants.S3N_PREFIX)) {
      ("spark.hadoop." + CarbonCommonConstants.S3N_ACCESS_KEY,
        "spark.hadoop." + CarbonCommonConstants.S3N_SECRET_KEY, endPoint)
    } else if (path.startsWith(CarbonCommonConstants.S3_PREFIX)) {
      ("spark.hadoop." + CarbonCommonConstants.S3_ACCESS_KEY,
        "spark.hadoop." + CarbonCommonConstants.S3_SECRET_KEY, endPoint)
    } else {
      throw new Exception("Incorrect Store Path")
    }
  }
  def getSparkMaster(args: Array[String]): String = {
    if (args.length == 6) args(5)
    else if (args(3).contains("spark:") || args(3).contains("mesos:")) args(3)
    else "local"
  }
}
Error:
18/05/17 12:23:22 ERROR SegmentStatusManager: main Failed to read metadata of load
org.apache.hadoop.fs.s3.S3Exception: org.jets3t.service.ServiceException: Request Error: Empty key
I also tried the sample code at the link below (with the s3, s3n and s3a protocols as well):
https://github.com/apache/carbondata/blob/master/examples/spark2/src/main/scala/org/apache/carbondata/examples/S3Example.scala
Ran as:
S3Example.main(Array("accesskey","secretKey","s3://bucketName/path/carbon_table","https://bucketName.s3.amazonaws.com","local"))
Error stacktrace:
org.apache.hadoop.fs.s3.S3Exception: org.jets3t.service.S3ServiceException: Request Error: Empty key
at org.apache.hadoop.fs.s3.Jets3tFileSystemStore.get(Jets3tFileSystemStore.java:175)
at org.apache.hadoop.fs.s3.Jets3tFileSystemStore.retrieveINode(Jets3tFileSystemStore.java:221)
at sun.reflect.GeneratedMethodAccessor42.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:191)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at com.sun.proxy.$Proxy21.retrieveINode(Unknown Source)
at org.apache.hadoop.fs.s3.S3FileSystem.getFileStatus(S3FileSystem.java:340)
at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1426)
at org.apache.carbondata.core.datastore.filesystem.AbstractDFSCarbonFile.isFileExist(AbstractDFSCarbonFile.java:426)
at org.apache.carbondata.core.datastore.impl.FileFactory.isFileExist(FileFactory.java:201)
at org.apache.carbondata.core.statusmanager.SegmentStatusManager.readTableStatusFile(SegmentStatusManager.java:246)
at org.apache.carbondata.core.statusmanager.SegmentStatusManager.readLoadMetadata(SegmentStatusManager.java:197)
at org.apache.carbondata.core.cache.dictionary.ManageDictionaryAndBTree.clearBTreeAndDictionaryLRUCache(ManageDictionaryAndBTree.java:101)
at org.apache.spark.sql.hive.CarbonFileMetastore.dropTable(CarbonFileMetastore.scala:460)
at org.apache.spark.sql.execution.command.table.CarbonCreateTableCommand.processMetadata(CarbonCreateTableCommand.scala:148)
at org.apache.spark.sql.execution.command.MetadataCommand.run(package.scala:68)
at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:58)
at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:56)
at org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:67)
at org.apache.spark.sql.Dataset.(Dataset.scala:183)
at org.apache.spark.sql.CarbonSession$$anonfun$sql$1.apply(CarbonSession.scala:107)
at org.apache.spark.sql.CarbonSession$$anonfun$sql$1.apply(CarbonSession.scala:96)
at org.apache.spark.sql.CarbonSession.withProfiler(CarbonSession.scala:144)
at org.apache.spark.sql.CarbonSession.sql(CarbonSession.scala:94)
at $line19.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$S3Example$.main(:68)
at $line26.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw.(:31)
at $line26.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw.(:36)
at $line26.$read$$iw$$iw$$iw$$iw$$iw$$iw.(:38)
at $line26.$read$$iw$$iw$$iw$$iw$$iw.(:40)
at $line26.$read$$iw$$iw$$iw$$iw.(:42)
at $line26.$read$$iw$$iw$$iw.(:44)
at $line26.$read$$iw$$iw.(:46)
at $line26.$read$$iw.(:48)
at $line26.$read.(:50)
at $line26.$read$.(:54)
at $line26.$read$.()
at $line26.$eval$.$print$lzycompute(:7)
at $line26.$eval$.$print(:6)
at $line26.$eval.$print()
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at scala.tools.nsc.interpreter.IMain$ReadEvalPrint.call(IMain.scala:786)
at scala.tools.nsc.interpreter.IMain$Request.loadAndRun(IMain.scala:1047)
at scala.tools.nsc.interpreter.IMain$WrappedRequest$$anonfun$loadAndRunReq$1.apply(IMain.scala:638)
at scala.tools.nsc.interpreter.IMain$WrappedRequest$$anonfun$loadAndRunReq$1.apply(IMain.scala:637)
at scala.reflect.internal.util.ScalaClassLoader$class.asContext(ScalaClassLoader.scala:31)
at scala.reflect.internal.util.AbstractFileClassLoader.asContext(AbstractFileClassLoader.scala:19)
at scala.tools.nsc.interpreter.IMain$WrappedRequest.loadAndRunReq(IMain.scala:637)
at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:569)
at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:565)
at scala.tools.nsc.interpreter.ILoop.interpretStartingWith(ILoop.scala:807)
at scala.tools.nsc.interpreter.ILoop.command(ILoop.scala:681)
at scala.tools.nsc.interpreter.ILoop.processLine(ILoop.scala:395)
at scala.tools.nsc.interpreter.ILoop.loop(ILoop.scala:415)
at scala.tools.nsc.interpreter.ILoop$$anonfun$process$1.apply$mcZ$sp(ILoop.scala:923)
at scala.tools.nsc.interpreter.ILoop$$anonfun$process$1.apply(ILoop.scala:909)
at scala.tools.nsc.interpreter.ILoop$$anonfun$process$1.apply(ILoop.scala:909)
at scala.reflect.internal.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:97)
at scala.tools.nsc.interpreter.ILoop.process(ILoop.scala:909)
at org.apache.spark.repl.Main$.doMain(Main.scala:74)
at org.apache.spark.repl.Main$.main(Main.scala:54)
at org.apache.spark.repl.Main.main(Main.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:775)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:119)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: org.jets3t.service.S3ServiceException: Request Error: Empty key
at org.jets3t.service.S3Service.getObject(S3Service.java:1470)
at org.apache.hadoop.fs.s3.Jets3tFileSystemStore.get(Jets3tFileSystemStore.java:163)
Are any of the arguments I'm passing wrong?
I'm able to access the S3 path using the AWS CLI:
aws s3 ls s3://bucketName/path
so the path exists in S3.

You can try it using this example: https://github.com/apache/carbondata/blob/master/examples/spark2/src/main/scala/org/apache/carbondata/examples/S3Example.scala
You have to provide the AWS credential properties to Spark first, and only after that create the CarbonSession.
If you have already created a SparkContext without the AWS properties, it will not pick them up even if you later give them to the CarbonContext.

Hi Vikas, looking at your exception, "Empty key" simply means that your access key and secret key are not bound in the Carbon session. In the S3 implementation, the logic is that if the user does not provide one of the keys, its value is taken as empty.
To make things easy, first build the CarbonData jar using this command:
mvn -Pspark-2.1 clean package
Then run spark-submit with this command:
./spark-submit --jars file:///home/anubhav/Downloads/softwares/spark-2.2.1-bin-hadoop2.7/carbonlib/apache-carbondata-1.4.0-SNAPSHOT-bin-spark2.2.1-hadoop2.7.2.jar --class org.apache.carbondata.examples.S3Example /home/anubhav/Documents/carbondata/carbondata/carbondata/examples/spark2/target/carbondata-examples-spark2-1.4.0-SNAPSHOT.jar local
Replace my jar paths with yours and it should work; it is working for me.

Custom akka mail box configuration (Scala)

I created a custom mailbox called CustomMailBox, which is derived from the MyUnboundedMessageQueueSemantics trait. Then I put this configuration into application.conf:
custom-dispatcher {
mailbox-requirement =
"com.MyUnboundedMessageQueueSemantics"
}
akka.actor.mailbox.requirements {
"com.MyUnboundedMessageQueueSemantics" =
custom-dispatcher-mailbox
}
custom-dispatcher-mailbox {
mailbox-type = "com.CustomMailBox"
}
akka.actor.deployment {
/myactor {
dispatcher = custom-dispatcher
}
}
After that I create my actor this way:
val system = ActorSystem("mySystem")
val quadrocopter = system.actorOf(Quadrocopter.props(vrep, clientID, handler)
.withDispatcher("custom-dispatcher"), "myactor")
After running my program I got this error:
Exception in thread "main" akka.ConfigurationException: Dispatcher [custom-dispatcher] not configured for path akka://mySystem/user/myactor
at akka.actor.LocalActorRefProvider.actorOf(ActorRefProvider.scala:758)
at akka.actor.dungeon.Children$class.makeChild(Children.scala:273)
at akka.actor.dungeon.Children$class.attachChild(Children.scala:46)
at akka.actor.ActorCell.attachChild(ActorCell.scala:374)
at akka.actor.ActorSystemImpl.actorOf(ActorSystem.scala:732)
at Simulation$.delayedEndpoint$Simulation$1(Simulation.scala:24)
at Simulation$delayedInit$body.apply(Simulation.scala:12)
at scala.Function0$class.apply$mcV$sp(Function0.scala:34)
at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:12)
at scala.App$$anonfun$main$1.apply(App.scala:76)
at scala.App$$anonfun$main$1.apply(App.scala:76)
at scala.collection.immutable.List.foreach(List.scala:381)
at scala.collection.generic.TraversableForwarder$class.foreach(TraversableForwarder.scala:35)
at scala.App$class.main(App.scala:76)
at Simulation$.main(Simulation.scala:12)
at Simulation.main(Simulation.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at com.intellij.rt.execution.application.AppMain.main(AppMain.java:147)
Can anybody explain what is wrong here?

The problem was solved by changing /myactor to " */myactor" in the config file.
The second step was to remove the call to withDispatcher() when creating the actor.

How to connect (Py)Spark to Postgres database using JDBC

I have followed the instructions from this posting to read data from an existing Postgres database with a table named "objects", as defined and created by the Objects class in SQLAlchemy. In my Jupyter notebook, my code is:
from pyspark import SparkContext
from pyspark import SparkConf
from random import random
#spark conf
conf = SparkConf()
conf.setMaster("local[*]")
conf.setAppName('pyspark')
sc = SparkContext(conf=conf)
from pyspark.sql import SQLContext
sqlContext = SQLContext(sc)
properties = {
"driver": "org.postgresql.Driver"
}
url = 'jdbc:postgresql://PG_USER:PASSWORD@PG_SERVER_IP/db_name'
df = sqlContext.read.jdbc(url=url, table='objects', properties=properties)
The last line results in the following:
Py4JJavaError: An error occurred while calling o25.jdbc.
: java.lang.NullPointerException
at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD$.resolveTable(JDBCRDD.scala:158)
at org.apache.spark.sql.execution.datasources.jdbc.JDBCRelation.<init>(JDBCRelation.scala:117)
at org.apache.spark.sql.DataFrameReader.jdbc(DataFrameReader.scala:237)
at org.apache.spark.sql.DataFrameReader.jdbc(DataFrameReader.scala:159)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:237)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
at py4j.Gateway.invoke(Gateway.java:280)
at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:128)
at py4j.commands.CallCommand.execute(CallCommand.java:79)
at py4j.GatewayConnection.run(GatewayConnection.java:211)
at java.lang.Thread.run(Thread.java:745)
So it looks like it can't resolve the table. How do I test from here to make sure that I am connected to the database properly?

Problems with name resolution are indicated by org.postgresql.util.PSQLException and don't result in an NPE. The source of the issue is actually the connection string, in particular the way you provide the user credentials. At first glance it looks like a bug, but if you're looking for a quick solution you can either use URL properties:
url = 'jdbc:postgresql://PG_SERVER_IP/db_name?user=PG_USER&password=PASSWORD'
or properties argument:
properties = {
"user": "PG_USER",
"password": "PASSWORD",
"driver": "org.postgresql.Driver"
}

How can i pass a URL explicitly in Scala

Hello, I am new to Scala. I tried this code:
def web ( url : Any) {
| val ur= new URL("url")
| val content=fromInputStream(ur.openStream).getLines.mkString("\n")
| print(content)
| }
When I pass a URL like web("http://contentexplore.com/iphone-6-amazing-looks/")
it shows this error:
java.net.MalformedURLException: no protocol: url
at java.net.URL.<init>(URL.java:585)
at java.net.URL.<init>(URL.java:482)
at java.net.URL.<init>(URL.java:431)
at .web(<console>:22)
at .<init>(<console>:23)
at .<clinit>(<console>)
at .<init>(<console>:11)
at .<clinit>(<console>)
at $print(<console>)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at scala.tools.nsc.interpreter.IMain$ReadEvalPrint.call(IMain.scala:704)
at scala.tools.nsc.interpreter.IMain$Request$$anonfun$14.apply(IMain.scala:920)
at scala.tools.nsc.interpreter.Line$$anonfun$1.apply$mcV$sp(Line.scala:43)
at scala.tools.nsc.io.package$$anon$2.run(package.scala:25)
at java.lang.Thread.run(Thread.java:722)
My question is: how can I pass a URL explicitly in Scala? Kindly suggest an idea. Thanks in advance.

As mentioned in the comments, this line is the problem:
val ur= new URL("url")
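If you want to create a URL from the input parameter url, the code should be:
val ur= new URL(url)
For this to compile, the parameter also has to be declared as url: String rather than url: Any, because the URL constructor expects a String.
The error occurred because the Java URL class was trying to parse a String with the literal value "url": it looks first for a recognized protocol (http, https, etc.), did not find one, and threw MalformedURLException.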
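To make the difference concrete, here is a small sketch that only parses the URL and does not open a connection. The object name UrlParamDemo and the helper parse are made up for illustration; the parameter is typed as String, and the variable is passed to the constructor instead of the quoted literal.

```scala
import java.net.{MalformedURLException, URL}

object UrlParamDemo {
  /** Parses the caller's string into a URL, or returns the parser's error message. */
  def parse(url: String): Either[String, URL] =
    try Right(new URL(url)) // the variable url, not the literal "url"
    catch { case e: MalformedURLException => Left(e.getMessage) }

  def main(args: Array[String]): Unit = {
    // The literal "url" has no protocol, so parsing fails as in the question
    println(parse("url"))
    // A real address parses fine; print only its host part
    println(parse("http://contentexplore.com/iphone-6-amazing-looks/").map(_.getHost))
  }
}
```

Returning an Either is just one way to surface the failure; letting the exception propagate, as in the original function, works as well.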
