Getting a Redshift error while using RDS Postgres with PySpark - postgresql

I'm trying to load data in PySpark from a normal Amazon RDS Postgres DB:
df = (
    sqlCtx.read.format('jdbc')
    .option('url', f"jdbc:postgresql://{DB_HOSTNAME}:{DB_PORT}/{DATABASE}?ssl=false")
    .option("dbtable", "public.pc")
    .option("user", DB_USERNAME)
    .option("password", DB_PASSWORD)
    .load()
)
But strangely, I'm getting an authentication error that looks like it comes from the Redshift libraries (even though this is not a Redshift DB):
Py4JJavaError: An error occurred while calling o81.load.
: java.sql.SQLException: [Amazon](500101) Error authenticating with database:
Authentication: isAuthenticationOK - false
isKerberos5Required - false
isClearTextPasswordRequired - false
isMD5PasswordRequired - false
isSCMCredentialsMessageRequired - false
isGSSAPIAuthenticationRequired - false
isSSPIAuthenticationRequired - false
isGSSContinue - false
at com.amazon.redshift.client.PGClient.startSession(Unknown Source)
at com.amazon.redshift.client.PGClient.<init>(Unknown Source)
at com.amazon.redshift.core.PGJDBCConnection.connect(Unknown Source)
at com.amazon.jdbc.common.BaseConnectionFactory.doConnect(Unknown Source)
at com.amazon.jdbc.common.AbstractDriver.connect(Unknown Source)
I also know my credentials are valid, as they work with a DB explorer app as well as with the psycopg2 library. Any ideas?
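One likely culprit (an assumption on my part, not something stated in the question): an Amazon Redshift JDBC driver on the Spark classpath also claims jdbc:postgresql:// URLs, so DriverManager hands the connection to it instead of the stock Postgres driver. A minimal sketch of forcing org.postgresql.Driver explicitly via Spark's "driver" option, assuming the PostgreSQL JDBC jar is available to the driver and executors:
# Force the stock PostgreSQL JDBC driver instead of letting DriverManager
# pick the Redshift driver that also matches jdbc:postgresql:// URLs.
df = (
    sqlCtx.read.format("jdbc")
    .option("url", f"jdbc:postgresql://{DB_HOSTNAME}:{DB_PORT}/{DATABASE}?ssl=false")
    .option("driver", "org.postgresql.Driver")  # explicit driver class
    .option("dbtable", "public.pc")
    .option("user", DB_USERNAME)
    .option("password", DB_PASSWORD)
    .load()
)
Alternatively, removing the Redshift JDBC jar from the job's --jars/classpath (if it isn't needed) avoids the ambiguity altogether.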

Related

Postgres driver not found error when the microservice is using MongoDB on K8s

I am trying to deploy some microservices to an Azure Kubernetes cluster using a Helm chart. Some of those services use a PostgreSQL DB (Type-1 services) and others use MongoDB (Type-2 services).
Type-1 services are working fine, but Type-2 services fail with the following error:
Caused by: org.springframework.beans.BeanInstantiationException: Failed to instantiate [com.zaxxer.hikari.HikariDataSource]: Factory method 'dataSource' threw exception; nested exception is java.lang.IllegalStateException: Cannot load driver class: org.postgresql.Driver
at org.springframework.beans.factory.support.SimpleInstantiationStrategy.instantiate(SimpleInstantiationStrategy.java:185)
at org.springframework.beans.factory.support.ConstructorResolver.instantiate(ConstructorResolver.java:653)
... 43 common frames omitted
Caused by: java.lang.IllegalStateException: Cannot load driver class: org.postgresql.Driver
at org.springframework.util.Assert.state(Assert.java:97)
at org.springframework.boot.autoconfigure.jdbc.DataSourceProperties.determineDriverClassName(DataSourceProperties.java:242)
at org.springframework.boot.autoconfigure.jdbc.DataSourceProperties.initializeDataSourceBuilder(DataSourceProperties.java:194)
at org.springframework.boot.autoconfigure.jdbc.DataSourceConfiguration.createDataSource(DataSourceConfiguration.java:48)
at org.springframework.boot.autoconfigure.jdbc.DataSourceConfiguration$Hikari.dataSource(DataSourceConfiguration.java:90)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at org.springframework.beans.factory.support.SimpleInstantiationStrategy.instantiate(SimpleInstantiationStrategy.java:154)
... 44 common frames omitted
I don't understand why services that use MongoDB as a database look for the Postgres driver, as seen in the error message. The values.yaml looks like this:
type1-service:
  imagePullSecrets:
    - name: regcred
  global:
    envVariable:
      common:
        spring_datasource_url: "jdbc:postgresql://postgres01.some.hostname:5432/jarvis?sslfactory=org.postgresql.ssl.NonValidatingFactory"
        spring_datasource_username: "tonystark"
        spring_datasource_password: "peter"
        kafka_bootstrap_servers: "kafka01.some.hostname:9092"
        spring_kafka_bootstrap_servers: "kafka01.some.hostname:9092"
        oauth2_defaultSuccessURL: "https://some.web.hostname/"
        CLUSTERNAME: "demo.some.aks.hostname"

type2-service:
  imagePullSecrets:
    - name: regcred
  global:
    envVariable:
      common:
        spring_datasource_url: "mongodb://mongo01.some.hostname:27017/?authSource=admin&replicaSet=rs1"
  envVariable:
    service:
      vars:
        spring_data_mongodb_database: mydatabase
        kafka_topics_query_event: "search_text"
        kafka_topics_result_event: "result_name"
        spring_data_mongodb_auto-index-creation: false
I am not getting this error when running the same microservice on my local system.
I'd appreciate any help here.
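One plausible explanation (an assumption, not stated in the post): because the Helm values set spring_datasource_url even for the Type-2 services, Spring Boot's JDBC DataSource auto-configuration runs for them too and then cannot load the Postgres driver, which is not on their classpath. A minimal sketch of opting a Mongo-only service out of that auto-configuration; the class and package names are illustrative:
// Hypothetical Mongo-only service: exclude JDBC DataSource auto-configuration
// so startup no longer requires org.postgresql.Driver on the classpath.
package com.example.type2;

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.boot.autoconfigure.jdbc.DataSourceAutoConfiguration;

@SpringBootApplication(exclude = DataSourceAutoConfiguration.class)
public class Type2ServiceApplication {
    public static void main(String[] args) {
        SpringApplication.run(Type2ServiceApplication.class, args);
    }
}
Dropping the spring_datasource_* entries from the type2-service values would be another way to keep the auto-configuration from triggering, assuming nothing else in those images sets them.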

Rundeck Oracle Integration: ORA-00904: "TRUE": invalid identifier

I am working on a Rundeck POC for migrating jobs from Jenkins, but when I use Oracle as the backend for Rundeck I get the following exception:
Rundeck version: Rundeck 3.1.2-20190927
"/etc/rundeck/rundeck-config.properties" file entry for dialect:
dataSource.dialect=org.rundeck.hibernate.RundeckOracleDialect
Dialect Jar:
https://repo1.maven.org/maven2/org/rundeck/hibernate/rundeck-oracle-dialect/1.0.0/rundeck-oracle-dialect-1.0.0.jar
Exception during rundeck start-up:
Caused by: java.sql.SQLSyntaxErrorException: ORA-00904: "TRUE": invalid identifier
at oracle.jdbc.driver.T4CTTIoer.processError(T4CTTIoer.java:450)
at oracle.jdbc.driver.T4CTTIoer.processError(T4CTTIoer.java:399)
at oracle.jdbc.driver.T4C8Oall.processError(T4C8Oall.java:1059)
at oracle.jdbc.driver.T4CTTIfun.receive(T4CTTIfun.java:522)
at oracle.jdbc.driver.T4CTTIfun.doRPC(T4CTTIfun.java:257)
at oracle.jdbc.driver.T4C8Oall.doOALL(T4C8Oall.java:587)
at oracle.jdbc.driver.T4CStatement.doOall8(T4CStatement.java:210)
at oracle.jdbc.driver.T4CStatement.doOall8(T4CStatement.java:30)
at oracle.jdbc.driver.T4CStatement.executeForRows(T4CStatement.java:931)
at oracle.jdbc.driver.OracleStatement.doExecuteWithTimeout(OracleStatement.java:1150)
at oracle.jdbc.driver.OracleStatement.executeInternal(OracleStatement.java:1792)
at oracle.jdbc.driver.OracleStatement.execute(OracleStatement.java:1745)
at oracle.jdbc.driver.OracleStatementWrapper.execute(OracleStatementWrapper.java:334)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.tomcat.jdbc.pool.StatementFacade$StatementProxy.invoke(StatementFacade.java:114)
Since Rundeck 3, the Oracle dialect ships out of the box; you just need to configure the database and tell Rundeck to use it. For example, the following config in rundeck-config.properties works without adding anything special (only the Oracle JDBC driver in the /var/lib/rundeck/lib directory):
dataSource.url = jdbc:oracle:thin:@192.168.33.30:1521:xe
dataSource.driverClassName = oracle.jdbc.driver.OracleDriver
dataSource.username = rundeckuser
dataSource.password = rundeckpass
dataSource.dialect = org.rundeck.hibernate.RundeckOracleDialect
dataSource.properties.validationQuery = SELECT 1 FROM DUAL
Take a look at this:
https://docs.rundeck.com/docs/administration/configuration/database/oracle.html

PostgreSQL jsonb testing using in-memory db

I am using Postgres in production and have tables with jsonb columns. I am trying to test the queries that touch them using JUnit and an in-memory embedded database.
In the past, I have used H2 and HSQL for testing queries that run on MySQL or Sybase. However, I am running into trouble doing the same for Postgres, as the jsonb type is not supported by H2/HSQL.
Caused by: org.hsqldb.HsqlException: type not found or user lacks privilege: JSONB
at org.hsqldb.error.Error.error(Unknown Source)
at org.hsqldb.error.Error.error(Unknown Source)
at org.hsqldb.ParserDQL.readTypeDefinition(Unknown Source)
at org.hsqldb.ParserTable.readColumnDefinitionOrNull(Unknown Source)
at org.hsqldb.ParserTable.readTableContentsSource(Unknown Source)
at org.hsqldb.ParserTable.compileCreateTableBody(Unknown Source)
at org.hsqldb.ParserTable.compileCreateTable(Unknown Source)
at org.hsqldb.ParserDDL.compileCreate(Unknown Source)
at org.hsqldb.ParserCommand.compilePart(Unknown Source)
at org.hsqldb.ParserCommand.compileStatements(Unknown Source)
at org.hsqldb.Session.executeDirectStatement(Unknown Source)
at org.hsqldb.Session.execute(Unknown Source)
... 18 more
Is there an alternate approach, or a trick I am missing, that could make jsonb work with H2/HSQL?
H2 does not support the JSONB column type. I found this workaround: create a test DB in Postgres and use it in your integration tests:
@RunWith(SpringJUnit4ClassRunner.class)
@SpringBootTest(webEnvironment = SpringBootTest.WebEnvironment.RANDOM_PORT)
@ActiveProfiles({"test"})
and define the DB properties in application-test.yml.
For H2 versions up to 2.1.212 you can create a custom jsonb type as an alias for json (which H2 does support) by adding the following script to the schema.sql file (in the resources folder):
CREATE TYPE JSONB AS json;
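To make that concrete, a complete (hypothetical) schema.sql for the H2 test profile could define the alias and then use it in the DDL; the table and column names below are illustrative, not from the original post:
-- schema.sql for the H2 test profile (H2 <= 2.1.212)
-- Define JSONB as an alias for H2's built-in JSON type, then use it in DDL
-- the same way the production Postgres column is declared.
CREATE TYPE JSONB AS JSON;

CREATE TABLE orders (
    id      BIGINT PRIMARY KEY,
    payload JSONB
);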
P.S. Thanks @Sarajog. I've updated my post accordingly.

Bluemix dashDB couchDB(?) error

After an attempt to create a new dashDB instance, a distinctly non-Netezza/DB2 error is thrown when trying to "manage" this newly purchased instance.
Exception thrown by application class 'org.lightcouch.CouchDbClientBase.executeRequest:-1'
org.lightcouch.CouchDbException: Error executing request.
at org.lightcouch.CouchDbClientBase.executeRequest(Unknown Source)
at org.lightcouch.CouchDbClientBase.get(Unknown Source)
at org.lightcouch.CouchDbClientBase.get(Unknown Source)
at org.lightcouch.CouchDbClientBase.get(Unknown Source)
at org.lightcouch.CouchDatabaseBase.find(Unknown Source)
at com.cloudant.client.api.Database.find(Unknown Source)
at com.ibm.datatools.dsweb.repository.CloudantRepo.getProvisionedServiceInstance(CloudantRepo.java:382)
at com.ibm.datatools.dsweb.controller.BluShiftHTTPController.getInstanceStatus(BluShiftHTTPController.java:870)
at com.ibm.datatools.dsweb.controller.RestEndPoint.launchDashboard(RestEndPoint.java:513)
at sun.reflect.GeneratedMethodAccessor22.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at java.lang.reflect.Method.invoke(Unknown Source)
at org.apache.wink.server.internal.handlers.InvokeMethodHandler.handleRequest(InvokeMethodHandler.java:63)
at org.apache.wink.server.handlers.AbstractHandler.handleRequest(AbstractHandler.java:33)
at org.apache.wink.server.handlers.RequestHandlersChain.handle(RequestHandlersChain.java:26)
--- clipped for your sanity ---
at org.apache.wink.server.internal.RequestProcessor.handleRequestWithoutFaultBarrier(RequestProcessor.java:207)
at org.apache.wink.server.internal.RequestProcessor.handleRequest(RequestProcessor.java:154)
at org.apache.wink.server.internal.servlet.RestServlet.service(RestServlet.java:124)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:668)
at com.ibm.ws.webcontainer.servlet.ServletWrapper.service(ServletWrapper.java:1287)
at [internal classes]
Caused by: java.net.SocketTimeoutException: Read timed out
... 72 more
I'm not quite sure what CouchDB has to do with dashDB, but in any event: another day, another ungracefully handled exception.
I'll just try again tomorrow; that usually fixes it.
Per your description you got the error above when trying to launch the console to manage the dashDB instance you just created.
This is confirmed by the exception you are seeing, specifically this line:
com.ibm.datatools.dsweb.controller.RestEndPoint.launchDashboard(RestEndPoint.java:513)
The dashDB console is a Web UI whose backend is built on the Cloudant NoSQL DB, which is based on CouchDB; hence the CouchDB exception you are seeing.
The Cloudant NoSQL DB was probably offline at the moment you tried to launch the console, but I agree that the exception should be handled more gracefully. I will create an internal defect to have the dashDB team provide a fix for this.

error writing to mongodb from pig

I'm trying to use the mongo-hadoop connector with Pig or streaming to load/store data from MongoDB. Using Pig I have the following problem:
$cat process.pig
REGISTER /usr/hdp/2.2.4.2-2/hadoop/lib/mongo-java-driver-3.0.2.jar
REGISTER /usr/hdp/2.2.4.2-2/hadoop/lib/mongo-hadoop-core-1.4.0.jar
REGISTER /usr/hdp/2.2.4.2-2/hadoop/lib/mongo-hadoop-pig-1.4.0.jar
SET mapreduce.map.speculative false
SET mapreduce.reduce.speculative false
SET mapreduce.fileoutputcommitter.marksuccessfuljobs false
SET mongo.auth.uri 'mongodb://hadoop:password@127.0.0.1:27017/admin'
raw = LOAD 'mongodb://hadoop:password@127.0.0.1:27017/hadoop.collection'
USING com.mongodb.hadoop.pig.MongoLoader('id:chararray, t:chararray, c_s:map[]');
Writing the data into a BSON file with
STORE raw
INTO 'file:///tmp/pig_without_limit_bson'
USING com.mongodb.hadoop.pig.BSONStorage('id');
works, and I'm able to import the file with mongorestore.
Writing to MongoDB with
STORE raw
INTO 'mongodb://hadoop:password@127.0.0.1:27017/hadoop.out'
USING com.mongodb.hadoop.pig.MongoInsertStorage('id:chararray, t:chararray', 'id');
does not work and produces the following error:
Input(s):
Failed to read data from "mongodb://hadoop:password@127.0.0.1:27017/hadoop.collection"
Output(s):
Failed to produce result in "mongodb://hadoop:password@127.0.0.1:27017/hadoop.out"
$cat pig.log
Error: java.lang.IllegalStateException: state should be: open
at com.mongodb.assertions.Assertions.isTrue(Assertions.java:70)
at com.mongodb.connection.BaseCluster.selectServer(BaseCluster.java:79)
at com.mongodb.binding.ClusterBinding$ClusterBindingConnectionSource.<init>(ClusterBinding.java:75)
at com.mongodb.binding.ClusterBinding$ClusterBindingConnectionSource.<init>(ClusterBinding.java:71)
at com.mongodb.binding.ClusterBinding.getWriteConnectionSource(ClusterBinding.java:68)
at com.mongodb.operation.OperationHelper.withConnection(OperationHelper.java:175)
at com.mongodb.operation.MixedBulkWriteOperation.execute(MixedBulkWriteOperation.java:141)
at com.mongodb.operation.MixedBulkWriteOperation.execute(MixedBulkWriteOperation.java:72)
at com.mongodb.Mongo.execute(Mongo.java:745)
at com.mongodb.Mongo$2.execute(Mongo.java:728)
at com.mongodb.DBCollection.executeBulkWriteOperation(DBCollection.java:1968)
at com.mongodb.DBCollection.executeBulkWriteOperation(DBCollection.java:1962)
at com.mongodb.BulkWriteOperation.execute(BulkWriteOperation.java:98)
at com.mongodb.hadoop.output.MongoOutputCommitter.commitTask(MongoOutputCommitter.java:133)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputCommitter.commitTask(PigOutputCommitter.java:356)
at org.apache.hadoop.mapred.Task.commit(Task.java:1163)
at org.apache.hadoop.mapred.Task.done(Task.java:1025)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:345)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Pig Stack Trace
---------------
ERROR 0: java.io.IOException: No FileSystem for scheme: mongodb
org.apache.pig.backend.executionengine.ExecException: ERROR 0: java.io.IOException: No FileSystem for scheme: mongodb
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:535)
at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.launchPig(HExecutionEngine.java:280)
at org.apache.pig.PigServer.launchPlan(PigServer.java:1390)
at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1375)
at org.apache.pig.PigServer.execute(PigServer.java:1364)
at org.apache.pig.PigServer.executeBatch(PigServer.java:415)
at org.apache.pig.PigServer.executeBatch(PigServer.java:398)
at org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:171)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:234)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:205)
at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:81)
at org.apache.pig.Main.run(Main.java:495)
at org.apache.pig.Main.main(Main.java:170)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: java.io.IOException: No FileSystem for scheme: mongodb
at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2607)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2614)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:91)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2653)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2635)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:370)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:296)
at org.apache.pig.StoreFunc.cleanupOnFailureImpl(StoreFunc.java:193)
at org.apache.pig.StoreFunc.cleanupOnFailure(StoreFunc.java:161)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:526)
... 18 more
However, if I use the LIMIT operator (even when limiting to an enormous figure), all documents are saved into MongoDB.
raw_limited = limit raw 1000000;
STORE raw_limited
INTO 'mongodb://hadoop:password@127.0.0.1:27017/hadoop.out'
USING com.mongodb.hadoop.pig.MongoInsertStorage('id:chararray, t:chararray', 'id');
results in
Input(s):
Successfully read 100 records (638 bytes) from:
Output(s):
Successfully stored 100 records (18477 bytes) in:
$mongo hadoop
>> db.out.count()
100
Why is that, and how can it be fixed? Did I miss something?
This seems to be a bug in the MongoDB Java driver.
It works when using version 3.0.4 of the Java driver.
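For example (a sketch only; the jar path mirrors the layout used in the script above and may differ on your cluster), swapping the registered driver jar in the Pig script:
-- Register the 3.0.4 MongoDB Java driver instead of 3.0.2
REGISTER /usr/hdp/2.2.4.2-2/hadoop/lib/mongo-java-driver-3.0.4.jar
REGISTER /usr/hdp/2.2.4.2-2/hadoop/lib/mongo-hadoop-core-1.4.0.jar
REGISTER /usr/hdp/2.2.4.2-2/hadoop/lib/mongo-hadoop-pig-1.4.0.jar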