How to resolve an ' Unable to get public no-arg constructor' error while trying to push data to GCS and load it into BigQuery? - pyspark

I've set up a pyspark session and provided it specific configuration settings based off what I've read:
self.spark_session = SparkSession.builder.appName(
"Example Session"
).config("spark.jars", "../../.jars/spark-bigquery-with-dependencies_2.13-0.28.0.jar")\
.config("spark.hadoop.fs.gs.impl", "com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem")\
.config("spark.driver.extraClassPath", "../../.jars/gcs-connector-hadoop3-latest.jar")\
.config("spark.executor.extraClassPath", "../../.jars/gcs-connector-hadoop3-latest.jar").getOrCreate()
and I'm able to work with the dataset I pull in just fine, transforming the data and the like. It's when I try to write to GCS to eventually write to BigQuery that I get an error:
dataframe.write.format("bigquery").option("temporaryGcsBucket", bucket_path).save(table_name)
The error I receive is:
E py4j.protocol.Py4JJavaError: An error occurred while calling o65.json.
E : java.util.ServiceConfigurationError: org.apache.spark.sql.sources.DataSourceRegister: com.google.cloud.spark.bigquery.BigQueryRelationProvider Unable to get public no-arg constructor
E at java.base/java.util.ServiceLoader.fail(ServiceLoader.java:582)
E at java.base/java.util.ServiceLoader.getConstructor(ServiceLoader.java:673)
E at java.base/java.util.ServiceLoader$LazyClassPathLookupIterator.hasNextService(ServiceLoader.java:1233)
E at java.base/java.util.ServiceLoader$LazyClassPathLookupIterator.hasNext(ServiceLoader.java:1265)
E at java.base/java.util.ServiceLoader$2.hasNext(ServiceLoader.java:1300)
E at java.base/java.util.ServiceLoader$3.hasNext(ServiceLoader.java:1385)
E at scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:45)
E at scala.collection.Iterator.foreach(Iterator.scala:943)
E at scala.collection.Iterator.foreach$(Iterator.scala:943)
E at scala.collection.AbstractIterator.foreach(Iterator.scala:1431)
E at scala.collection.IterableLike.foreach(IterableLike.scala:74)
E at scala.collection.IterableLike.foreach$(IterableLike.scala:73)
E at scala.collection.AbstractIterable.foreach(Iterable.scala:56)
E at scala.collection.TraversableLike.filterImpl(TraversableLike.scala:303)
E at scala.collection.TraversableLike.filterImpl$(TraversableLike.scala:297)
E at scala.collection.AbstractTraversable.filterImpl(Traversable.scala:108)
E at scala.collection.TraversableLike.filter(TraversableLike.scala:395)
E at scala.collection.TraversableLike.filter$(TraversableLike.scala:395)
E at scala.collection.AbstractTraversable.filter(Traversable.scala:108)
E at org.apache.spark.sql.execution.datasources.DataSource$.lookupDataSource(DataSource.scala:657)
E at org.apache.spark.sql.execution.datasources.DataSource$.lookupDataSourceV2(DataSource.scala:725)
E at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:207)
E at org.apache.spark.sql.DataFrameReader.json(DataFrameReader.scala:361)
E at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
E at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
E at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
E at java.base/java.lang.reflect.Method.invoke(Method.java:566)
E at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
E at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
E at py4j.Gateway.invoke(Gateway.java:282)
E at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
E at py4j.commands.CallCommand.execute(CallCommand.java:79)
E at py4j.ClientServerConnection.waitForCommands(ClientServerConnection.java:182)
E at py4j.ClientServerConnection.run(ClientServerConnection.java:106)
E at java.base/java.lang.Thread.run(Thread.java:829)
E Caused by: java.lang.NoClassDefFoundError: scala/$less$colon$less
E at java.base/java.lang.Class.getDeclaredConstructors0(Native Method)
E at java.base/java.lang.Class.privateGetDeclaredConstructors(Class.java:3137)
E at java.base/java.lang.Class.getConstructor0(Class.java:3342)
E at java.base/java.lang.Class.getConstructor(Class.java:2151)
E at java.base/java.util.ServiceLoader$1.run(ServiceLoader.java:660)
E at java.base/java.util.ServiceLoader$1.run(ServiceLoader.java:657)
E at java.base/java.security.AccessController.doPrivileged(Native Method)
E at java.base/java.util.ServiceLoader.getConstructor(ServiceLoader.java:668)
E ... 33 more
E Caused by: java.lang.ClassNotFoundException: scala.$less$colon$less
E at java.base/java.net.URLClassLoader.findClass(URLClassLoader.java:476)
E at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:589)
E at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:522)
E ... 41 more
I've looked everywhere but unsure how to resolve this. It looks like I'm missing another .jar file at best guess, but not sure which one.

Related

Runing in local model the wordcount lines=sc.textFile('../spark_test.txt') lines.count()

I run pyspark in the local model to learn the wordcount case. I write a spark_test.txt in current file, and run pyspark command , lines = sc.textFile('../spark_test.txt') lines.count(), it throws out a bug:
Input path does not exist: file:/Users/lisl/myproject/spark_test.txt
** I tried:**
sc.textFile('spark_test.txt')
lines = sc.textFile('../spark_test.txt')
lines.count()
i expect the outpub be the number of lines of the file , but the actual output is :
file "/Users/lisl/.pyenv/versions/3.7.1/lib/python3.7/site-packages/pyspark/python/lib/py4j-0.10.7-src.zip/py4j/protocol.py", line 328, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling z:org.apache.spark.api.python.PythonRDD.collectAndServe.
: org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: file:/Users/lisl/myproject/learn_test/spark_test.txt
at org.apache.hadoop.mapred.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:287)
at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:229)
at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:315)
at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:204)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:253)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:251)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:251)
at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:49)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:253)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:251)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:251)
at org.apache.spark.api.python.PythonRDD.getPartitions(PythonRDD.scala:55)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:253)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:251)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:251)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:2126)
at org.apache.spark.rdd.RDD$$anonfun$collect$1.apply(RDD.scala:945)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
at org.apache.spark.rdd.RDD.withScope(RDD.scala:363)
at org.apache.spark.rdd.RDD.collect(RDD.scala:944)
at org.apache.spark.api.python.PythonRDD$.collectAndServe(PythonRDD.scala:166)
at org.apache.spark.api.python.PythonRDD.collectAndServe(PythonRDD.scala)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:567)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
at py4j.Gateway.invoke(Gateway.java:282)
at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
at py4j.commands.CallCommand.execute(CallCommand.java:79)
at py4j.GatewayConnection.run(GatewayConnection.java:238)
at java.base/java.lang.Thread.run(Thread.java:835)
First of all, some lines of code are missing. Please try the code below and make sure that the path to your file exist and you have given read access for all to it (chmod a+r yourFile)
text_file = sc.textFile("path/to/your/input/file")
wordCounts = text_file.flatMap(lambda line: line.split(" ")) \
.map(lambda word: (word, 1)) \
.reduceByKey(lambda a, b: a + b)
wordCounts.saveAsTextFile("path/to/your/output/file")

In Spark, What is the reason of getting EOF exception, Seek past end of file?

I am reading some data (8 GB) data from multiple files, filter the data doing some null check and performing some upliftings (operations) on columns like cleaning column value for this I have 6 to 7 functions (custom functionality, cannot use spark functions) that are registered as UDFs. Then I write the final result to tables and CSV files, now on 'dataframe.write.saveAsTable()' and on writing 'CSV' I get EOF exception Seek past end of file. This exception does not occur everytime, like if I run 20 times it may occur once. I am unable to find its reason and cause because it is not reproduce-able, (Getting this in both in scala and pyspark), will appreciate any help or hint. Looking forward. Thanks
---------------------------------------------------------------------------
Py4JJavaError Traceback (most recent call last)
<command-4146672194126555> in <module>()
331 saveMergedLogs(
332 dataframeLogs
333 );
<command-3860558353011740> in saveMergedLogs(dataframeLogs)
45 # ---------------------------------------------------------------------------------------------------------------------------------
46 spark.sql("DROP TABLE IF EXISTS dbo.UsageLogs");
---> 47 dataframeLogs.write.saveAsTable("dbo.UsageLogs")
/databricks/spark/python/pyspark/sql/readwriter.py in saveAsTable(self, name, format, mode, partitionBy, **options)
773 if format is not None:
774 self.format(format)
--> 775 self._jwrite.saveAsTable(name)
776
777 #since(1.4)
/databricks/spark/python/lib/py4j-0.10.7-src.zip/py4j/java_gateway.py in __call__(self, *args)
1255 answer = self.gateway_client.send_command(command)
1256 return_value = get_return_value(
-> 1257 answer, self.gateway_client, self.target_id, self.name)
1258
1259 for temp_arg in temp_args:
/databricks/spark/python/pyspark/sql/utils.py in deco(*a, **kw)
61 def deco(*a, **kw):
62 try:
---> 63 return f(*a, **kw)
64 except py4j.protocol.Py4JJavaError as e:
65 s = e.java_exception.toString()
/databricks/spark/python/lib/py4j-0.10.7-src.zip/py4j/protocol.py in get_return_value(answer, gateway_client, target_id, name)
326 raise Py4JJavaError(
327 "An error occurred while calling {0}{1}{2}.\n".
--> 328 format(target_id, ".", name), value)
329 else:
330 raise Py4JError(
Py4JJavaError: An error occurred while calling o3615.saveAsTable.
: org.apache.spark.SparkException: Job aborted.
at org.apache.spark.sql.execution.datasources.FileFormatWriter$.write(FileFormatWriter.scala:196)
at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand.run(InsertIntoHadoopFsRelationCommand.scala:192)
at org.apache.spark.sql.execution.datasources.DataSource.writeAndRead(DataSource.scala:553)
at org.apache.spark.sql.execution.command.CreateDataSourceTableAsSelectCommand.saveDataIntoTable(createDataSourceTables.scala:216)
at org.apache.spark.sql.execution.command.CreateDataSourceTableAsSelectCommand.run(createDataSourceTables.scala:175)
at org.apache.spark.sql.execution.command.DataWritingCommandExec.sideEffectResult$lzycompute(commands.scala:110)
at org.apache.spark.sql.execution.command.DataWritingCommandExec.sideEffectResult(commands.scala:108)
at org.apache.spark.sql.execution.command.DataWritingCommandExec.doExecute(commands.scala:128)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:143)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:131)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$5.apply(SparkPlan.scala:183)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:180)
at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:131)
at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:114)
at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:114)
at org.apache.spark.sql.DataFrameWriter$$anonfun$runCommand$1.apply(DataFrameWriter.scala:690)
at org.apache.spark.sql.DataFrameWriter$$anonfun$runCommand$1.apply(DataFrameWriter.scala:690)
at org.apache.spark.sql.execution.SQLExecution$$anonfun$withCustomExecutionEnv$1.apply(SQLExecution.scala:99)
at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:228)
at org.apache.spark.sql.execution.SQLExecution$.withCustomExecutionEnv(SQLExecution.scala:85)
at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:158)
at org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:690)
at org.apache.spark.sql.DataFrameWriter.createTable(DataFrameWriter.scala:487)
at org.apache.spark.sql.DataFrameWriter.saveAsTable(DataFrameWriter.scala:466)
at org.apache.spark.sql.DataFrameWriter.saveAsTable(DataFrameWriter.scala:414)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:380)
at py4j.Gateway.invoke(Gateway.java:295)
at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
at py4j.commands.CallCommand.execute(CallCommand.java:79)
at py4j.GatewayConnection.run(GatewayConnection.java:251)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: Task 9 in stage 58.0 failed 4 times, most recent failure: Lost task 9.3 in stage 58.0 (TID 8630, 10.139.64.7, executor 0): java.io.EOFException: Cannot seek past end of file
at com.microsoft.azure.datalake.store.ADLFileInputStream.seek(ADLFileInputStream.java:262)
at com.databricks.adl.AdlFsInputStream.seek(AdlFsInputStream.java:64)
at org.apache.hadoop.fs.FSDataInputStream.seek(FSDataInputStream.java:62)
at com.databricks.spark.metrics.FSInputStreamWithMetrics.seek(FileSystemWithMetrics.scala:207)
at org.apache.hadoop.fs.FSDataInputStream.seek(FSDataInputStream.java:62)
at org.apache.hadoop.mapreduce.lib.input.LineRecordReader.initialize(LineRecordReader.java:107)
at org.apache.spark.sql.execution.datasources.HadoopFileLinesReader.<init>(HadoopFileLinesReader.scala:65)
at org.apache.spark.sql.execution.datasources.HadoopFileLinesReader.<init>(HadoopFileLinesReader.scala:47)
at org.apache.spark.sql.execution.datasources.csv.TextInputCSVDataSource$.readFile(CSVDataSource.scala:201)
at org.apache.spark.sql.execution.datasources.csv.CSVFileFormat$$anonfun$buildReader$2.apply(CSVFileFormat.scala:147)
at org.apache.spark.sql.execution.datasources.csv.CSVFileFormat$$anonfun$buildReader$2.apply(CSVFileFormat.scala:140)
at org.apache.spark.sql.execution.datasources.FileFormat$$anon$1.apply(FileFormat.scala:147)
at org.apache.spark.sql.execution.datasources.FileFormat$$anon$1.apply(FileFormat.scala:134)
at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1$$anon$2.getNext(FileScanRDD.scala:226)
at org.apache.spark.util.NextIterator.hasNext(NextIterator.scala:73)
at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.hasNext(FileScanRDD.scala:196)
at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.nextIterator(FileScanRDD.scala:338)
at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.hasNext(FileScanRDD.scala:196)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage443.processNext(Unknown Source)
at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$11$$anon$1.hasNext(WholeStageCodegenExec.scala:622)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
at org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:125)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:99)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:55)
at org.apache.spark.scheduler.Task.doRunTask(Task.scala:139)
at org.apache.spark.scheduler.Task.run(Task.scala:112)
at org.apache.spark.executor.Executor$TaskRunner$$anonfun$13.apply(Executor.scala:497)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1432)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:503)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Driver stacktrace:
at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:2100)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:2088)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:2087)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:2087)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:1076)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:1076)
at scala.Option.foreach(Option.scala:257)
at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:1076)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:2319)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2267)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2255)
at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:49)
at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:873)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:2252)
at org.apache.spark.sql.execution.datasources.FileFormatWriter$.write(FileFormatWriter.scala:166)
... 36 more
Caused by: java.io.EOFException: Cannot seek past end of file
at com.microsoft.azure.datalake.store.ADLFileInputStream.seek(ADLFileInputStream.java:262)
at com.databricks.adl.AdlFsInputStream.seek(AdlFsInputStream.java:64)
at org.apache.hadoop.fs.FSDataInputStream.seek(FSDataInputStream.java:62)
at com.databricks.spark.metrics.FSInputStreamWithMetrics.seek(FileSystemWithMetrics.scala:207)
at org.apache.hadoop.fs.FSDataInputStream.seek(FSDataInputStream.java:62)
at org.apache.hadoop.mapreduce.lib.input.LineRecordReader.initialize(LineRecordReader.java:107)
at org.apache.spark.sql.execution.datasources.HadoopFileLinesReader.<init>(HadoopFileLinesReader.scala:65)
at org.apache.spark.sql.execution.datasources.HadoopFileLinesReader.<init>(HadoopFileLinesReader.scala:47)
at org.apache.spark.sql.execution.datasources.csv.TextInputCSVDataSource$.readFile(CSVDataSource.scala:201)
at org.apache.spark.sql.execution.datasources.csv.CSVFileFormat$$anonfun$buildReader$2.apply(CSVFileFormat.scala:147)
at org.apache.spark.sql.execution.datasources.csv.CSVFileFormat$$anonfun$buildReader$2.apply(CSVFileFormat.scala:140)
at org.apache.spark.sql.execution.datasources.FileFormat$$anon$1.apply(FileFormat.scala:147)
at org.apache.spark.sql.execution.datasources.FileFormat$$anon$1.apply(FileFormat.scala:134)
at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1$$anon$2.getNext(FileScanRDD.scala:226)
at org.apache.spark.util.NextIterator.hasNext(NextIterator.scala:73)
at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.hasNext(FileScanRDD.scala:196)
at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.nextIterator(FileScanRDD.scala:338)
at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.hasNext(FileScanRDD.scala:196)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage443.processNext(Unknown Source)
at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$11$$anon$1.hasNext(WholeStageCodegenExec.scala:622)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
at org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:125)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:99)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:55)
at org.apache.spark.scheduler.Task.doRunTask(Task.scala:139)
at org.apache.spark.scheduler.Task.run(Task.scala:112)
at org.apache.spark.executor.Executor$TaskRunner$$anonfun$13.apply(Executor.scala:497)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1432)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:503)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
... 1 more
So I worked a lot on it, but there is no solution for it. The way I managed was I specified a retry in ADF databricks activity. So when ever this issue occurs it can rerun the notebook in databricks and it passes. Still this is not a perfect solution but it works.
I met same issue when reading data from multiple files. The files have same columns but the order of the columns are not exactly the same. Since in pandas we can set sort=True when append DataFrame so that same columns in different position won't get confused, I suspected it is the order of the columns that cause the issue. I'm not 100% sure but it worked after I re-write the files with same order of columns.

ERROR when I use 'textFile.count()'

I don't know where is the problem and how to fix it.The spark version is 2.1.0 and the python version is 3.4.6.
/ __/__ ___ _____/ /__
_\ \/ _ \/ _ `/ __/ '_/
/__ / .__/\_,_/_/ /_/\_\ version 2.1.0
/_/
Using Python version 3.4.6 (default, Mar 9 2017 19:57:54)
SparkSession available as 'spark'.
##This is the command i code as the official document
>>> input_data = sc.textFile('my python')
>>> input_data.count()
but it is not work.
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/spark/python/pyspark/rdd.py", line 1041, in count
return self.mapPartitions(lambda i: [sum(1 for _ in i)]).sum()
File "/usr/local/spark/python/pyspark/rdd.py", line 1032, in sum
return self.mapPartitions(lambda x: [sum(x)]).fold(0, operator.add)
File "/usr/local/spark/python/pyspark/rdd.py", line 906, in fold
vals = self.mapPartitions(func).collect()
File "/usr/local/spark/python/pyspark/rdd.py", line 809, in collect
port = self.ctx._jvm.PythonRDD.collectAndServe(self._jrdd.rdd())
File "/usr/local/spark/python/lib/py4j-0.10.4-src.zip/py4j/java_gateway.py", line 1133, in __call__
File "/usr/local/spark/python/pyspark/sql/utils.py", line 63, in deco
return f(*a, **kw)
File "/usr/local/spark/python/lib/py4j-0.10.4-src.zip/py4j/protocol.py", line 319, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling z:org.apache.spark.api.python.PythonRDD.collectAndServe.
: java.net.ConnectException: Call From hadoop/192.168.81.129 to localhost:9000 failed on connection exception: java.net.ConnectException: 拒绝连接; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:792)
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:732)
at org.apache.hadoop.ipc.Client.call(Client.java:1479)
at org.apache.hadoop.ipc.Client.call(Client.java:1412)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
at com.sun.proxy.$Proxy19.getFileInfo(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:771)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:191)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at com.sun.proxy.$Proxy20.getFileInfo(Unknown Source)
at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:2108)
at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1305)
at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1301)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1301)
at org.apache.hadoop.fs.Globber.getFileStatus(Globber.java:57)
at org.apache.hadoop.fs.Globber.glob(Globber.java:252)
at org.apache.hadoop.fs.FileSystem.globStatus(FileSystem.java:1674)
at org.apache.hadoop.mapred.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:259)
at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:229)
at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:315)
at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:202)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:252)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:250)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:250)
at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:252)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:250)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:250)
at org.apache.spark.api.python.PythonRDD.getPartitions(PythonRDD.scala:53)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:252)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:250)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:250)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1958)
at org.apache.spark.rdd.RDD$$anonfun$collect$1.apply(RDD.scala:935)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
at org.apache.spark.rdd.RDD.withScope(RDD.scala:362)
at org.apache.spark.rdd.RDD.collect(RDD.scala:934)
at org.apache.spark.api.python.PythonRDD$.collectAndServe(PythonRDD.scala:453)
at org.apache.spark.api.python.PythonRDD.collectAndServe(PythonRDD.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
at py4j.Gateway.invoke(Gateway.java:280)
at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
at py4j.commands.CallCommand.execute(CallCommand.java:79)
at py4j.GatewayConnection.run(GatewayConnection.java:214)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.net.ConnectException: 拒绝连接
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:744)
at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:495)
at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:614)
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:712)
at org.apache.hadoop.ipc.Client$Connection.access$2900(Client.java:375)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1528)
at org.apache.hadoop.ipc.Client.call(Client.java:1451)
... 56 more
>>>

tMatchGroupHadoop_2 component talend

I am working with Talend tMatchGroupHadoop component with Amazon EMR cluster,
it is giving an error: "could only be replicated to 0 nodes, instead of 1".
Actually data node is running in the AMR cluster.
hadoop fsck
..............Status: HEALTHY
Total size: 315153 B
Total dirs: 12
Total files: 14 (Files currently being written: 1)
Total blocks (validated): 13 (avg. block size 24242 B)
Minimally replicated blocks: 13 (100.0 %)
Over-replicated blocks: 0 (0.0 %)
Under-replicated blocks: 0 (0.0 %)
Mis-replicated blocks: 0 (0.0 %)
Default replication factor: 1
Average block replication: 1.0
Corrupt blocks: 0
Missing replicas: 0 (0.0 %)
Number of data-nodes: 1
Number of racks: 1
FSCK ended at Mon Sep 08 06:07:58 UTC 2014 in 158 milliseconds
I am getting the following error:
[statistics] connecting to socket on port 3645
[statistics] connected
[INFO ]: org.apache.hadoop.hdfs.DFSClient - Exception in createBlockOutputStream 10.230.30.124:9200 java.net.ConnectException: Connection timed out: no further information
[INFO ]: org.apache.hadoop.hdfs.DFSClient - Abandoning block blk_-3580819895919001579_2135
[INFO ]: org.apache.hadoop.hdfs.DFSClient - Excluding datanode 10.230.30.124:9200
Exception in component tMatchGroupHadoop_2_GroupOut
org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /in could only be replicated to 0 nodes, instead of 1
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1569)
at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:701)
at sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:583)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1393)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1389)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1140)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1387)
at org.apache.hadoop.ipc.Client.call(Client.java:1092)
[WARN ]: org.apache.hadoop.hdfs.DFSClient - DataStreamer Exception: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /in could only be replicated to 0 nodes, instead of 1
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1569)
at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:701)
at sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:583)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1393)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1389)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1140)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1387)
at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:229)
at com.sun.proxy.$Proxy1.addBlock(Unknown Source)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:85)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:62)
at com.sun.proxy.$Proxy1.addBlock(Unknown Source)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:3595)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:3456)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2600(DFSClient.java:2672)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2912)
at org.apache.hadoop.ipc.Client.call(Client.java:1092)
at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:229)
at com.sun.proxy.$Proxy1.addBlock(Unknown Source)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:85)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:62)
at com.sun.proxy.$Proxy1.addBlock(Unknown Source)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:3595)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:3456)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2600(DFSClient.java:2672)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2912)
[WARN ]: org.apache.hadoop.hdfs.DFSClient - Error Recovery for block blk_-3580819895919001579_2135 bad datanode[0] nodes == null
[WARN ]: org.apache.hadoop.hdfs.DFSClient - Could not get block locations. Source file "/in" - Aborting...
[statistics] disconnected
[ERROR]: org.apache.hadoop.hdfs.DFSClient - Exception closing file /in : org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /in could only be replicated to 0 nodes, instead of 1
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1569)
at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:701)
at sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:583)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1393)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1389)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1140)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1387)
org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /in could only be replicated to 0 nodes, instead of 1
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1569)
at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:701)
at sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:583)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1393)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1389)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1140)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1387)
at org.apache.hadoop.ipc.Client.call(Client.java:1092)
at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:229)
at com.sun.proxy.$Proxy1.addBlock(Unknown Source)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:85)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:62)
at com.sun.proxy.$Proxy1.addBlock(Unknown Source)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:3595)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:3456)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2600(DFSClient.java:2672)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2912)
Job HadoopMatch ended at 17:54 08/09/2014. [exit code=1]
What is wrong here?
to work this, we have to check the option, use datanode hostname.
We have do modification in the windows host file
C:\Windows\System32\Drivers\etc\hosts
EC2PublicIP PrivateDNS
example :
10.210.202.106 ip-101-210-141-188.ec2.internal
then it is working now.

jdbc:mysql:replication causes Unknown column in 'field list' exception at runtime

I am using JBOSS EAP 5.1
want to implement Master/slave mechanism.
xyz-mysql-ds.xml (for ex. customer db)
xyz-mysql-ds.common.xml (for common db)
I have added master/slave ip in xyz-mysql-ds.xml
<local-tx-datasource>
<jndi-name>customerDS</jndi-name>
<use-java-context>false</use-java-context>
<connection-url>jdbc:mysql:replication://master_ip_address:3306,slave_ip_address:3306/customer_v2_local?loadBalancePingTimeout=1500;loadBalanceBlacklistTimeout=7000;autoReconnect=true;failOverReadOnly=true;roundRobinLoadBalance=true</connection-url>
<driver-class>com.mysql.jdbc.ReplicationDriver</driver-class>
It starts JBOSS successfully. but geetting following error at runtime.
MySQLSyntaxErrorException: Unknown column 'lkpmenuopt0_.SortOrder' in 'field list'com.mysql.jdbc.exceptions.jdbc4.M
ySQLSyntaxErrorException: Unknown column 'lkpmenuopt0_.SortOrder' in 'field list'
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
at com.mysql.jdbc.Util.handleNewInstance(Util.java:411)
at com.mysql.jdbc.Util.getInstance(Util.java:386)
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1052)
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3597)
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3529)
at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:1990)
at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2151)
at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2625)
at com.mysql.jdbc.LoadBalancedMySQLConnection.execSQL(LoadBalancedMySQLConnection.java:155)
at com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:2119)
at com.mysql.jdbc.PreparedStatement.executeQuery(PreparedStatement.java:2281)
at org.jboss.resource.adapter.jdbc.WrappedPreparedStatement.executeQuery(WrappedPreparedStatement.java:342)
at org.hibernate.jdbc.AbstractBatcher.getResultSet(AbstractBatcher.java:209)
at org.hibernate.loader.Loader.getResultSet(Loader.java:1811)
at org.hibernate.loader.Loader.doQuery(Loader.java:696)
at org.hibernate.loader.Loader.doQueryAndInitializeNonLazyCollections(Loader.java:258)
at org.hibernate.loader.Loader.doList(Loader.java:2233)
[wrapped] org.hibernate.exception.SQLGrammarException: could not execute query
at org.hibernate.exception.SQLStateConverter.convert(SQLStateConverter.java:90)
at org.hibernate.exception.JDBCExceptionHelper.convert(JDBCExceptionHelper.java:66)
at org.hibernate.loader.Loader.doList(Loader.java:2236)
>>org.springframework.dao.InvalidDataAccessResourceUsageException: could not execute query; nested exception is org
.hibernate.exception.SQLGrammarException: could not execute query
>>org.hibernate.exception.SQLGrammarException: could not execute query
[SQL: 1054, 42S22]
>>com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: Unknown column 'lkpmenuopt0_.SortOrder' in 'field list
'
>> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>> at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
>> at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
>> at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
>> at com.mysql.jdbc.Util.handleNewInstance(Util.java:411)
>> at com.mysql.jdbc.Util.getInstance(Util.java:386)
>> at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1052)
>> at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3597)
>> at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3529)
>> at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:1990)
>> at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2151)
>> at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2625)
>> at com.mysql.jdbc.LoadBalancedMySQLConnection.execSQL(LoadBalancedMySQLConnection.java:155)
...
...
And if remove word relication from : <connection-url>jdbc:mysql:replication:
It working with NO runtime exception, but all read / write request goes to master only.
Any hint to solve this issue ?
Thanks !
<connection-url>jdbc:mysql:replication://master:3306,slave:3306/Customerdb?
loadBalancePingTimeout=1500&loadBalanceBlacklistTimeout=7000&autoReconnect=true&failOverReadOnly=true&roundRobinLoadBalance=true</connection-url>
<connection-url>jdbc:mysql:replication://master:3306,slave:3306/Customerdb?loadBalancePingTimeout=1500&loadBalanceBlacklistTimeout=7000&autoReconnect=true&failOverReadOnly=true&roundRobinLoadBalance=true</connection-url>
**finally this This works for me separating properties with &