Anyone using hadoop_fdw with Cloudera 5.2.0? - postgresql

After a painful installation of hadoop_fdw into our running pgsql 9.3.4, I am trying to connect it to a Cloudera 5.2.0 cluster, with no luck.
Is there a way of debugging the FDW? After creating the foreign table and selecting from it, I just get an error: ERROR: failed to connect to Hive: No more data to read.
BTW: an old version of hadoop_fdw accepted a full URL (jdbc://server:port/args), but the recent version only takes an address and port.

Hadoop_fdw didn't make it. There's probably something wrong/old/obsolete in hive.c. But with even more effort we managed to make jdbc_fdw work with the Cloudera JDBC drivers. The steps were as follows:
1) install jdbc_fdw extension
2) merge all driver jar files into one
3) CREATE SERVER cloudera2 FOREIGN DATA WRAPPER jdbc_fdw OPTIONS (drivername 'com.cloudera.hive.jdbc4.HS2Driver', url 'jdbc:hive2://fqdn:10000;user=hive', querytimeout '15', jarfile '/opt/cloudera/combined.jar');
Mental note: SET client_min_messages TO debug5; can help you identify where the problem is, e.g. driver not found, etc.
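A hedged sketch of the next step, assuming jdbc_fdw's table option and a hypothetical Hive table default.sample whose columns match:

CREATE FOREIGN TABLE hive_sample (
  id integer,
  name text
) SERVER cloudera2 OPTIONS (table 'default.sample');

-- smoke test; if this fails, the debug5 output shows where it breaks
SELECT * FROM hive_sample LIMIT 10;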

How to create a connection for a DB2 database in Apache Airflow (2.2.5)

I need to poll a DB2 database table until a record is created, and once that succeeds, trigger a task in the DAG.
For this I have to do the following:
Create a DB2 connection in Airflow
Use a SQL sensor to poll the DB table, using the connection created in the above step.
But I don't see a connection type option for DB2 (screenshot attached). Do I have to install something? Please help.
PS: I am new to Airflow.
There is no DB2-specific connection type.
For general connections you can just use the Generic connection.
For older Airflow versions that don't have the Generic connection you can use any other connection type (HTTP or MySQL, for example). It doesn't really matter: Airflow looks up connections by their connection id; the type is almost meaningless in that respect.
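For example, such a connection can be created from the CLI (the connection id, host and credentials here are hypothetical):

airflow connections add 'db2_default' --conn-type 'generic' --conn-host 'db2-host.example.com' --conn-port '50000' --conn-login 'db2inst1' --conn-password 'secret' --conn-schema 'MYDB'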
Yes, Airflow does not appear to have a DB2-specific connection type, so you have to choose Generic as the connection type.
I also recommend installing the security mechanism for the connection, via the following package: pip install airflow-provider-db2
I have successfully connected to DB2 on Airflow 2.2.5 this way.
Airflow doesn't have a specific provider for DB2; you need to use JDBC to connect.
So you need to install some Airflow libraries and download the DB2 JDBC driver to fulfill this task.
The libraries and provider you need to install:
apache-airflow-providers-jdbc (version 2.1.3)
JayDeBeApi (latest version)
JPype1 (latest version)
You also need to download the DB2 JDBC driver package, which includes several jars and a license file, such as:
db2jcc4.jar
sqlj4.zip
jdbc4_LI_en
Because the connection goes through JDBC, OpenJDK also needs to be installed in your Docker image.
Setting the environment variables JAVA_HOME, PATH and CLASSPATH is also needed.
All of the above can be done in your Dockerfile; a sketch follows.
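A minimal Dockerfile sketch of those steps, assuming a Debian-based Airflow image and that the driver jar sits next to the Dockerfile (paths, package and user names are assumptions):

# install a Java runtime for JDBC
USER root
RUN apt-get update && apt-get install -y --no-install-recommends default-jre && rm -rf /var/lib/apt/lists/*
# copy the Db2 driver and point Java and the classpath at it
COPY db2jcc4.jar /opt/db2/db2jcc4.jar
ENV JAVA_HOME=/usr/lib/jvm/default-java
ENV PATH="${JAVA_HOME}/bin:${PATH}"
ENV CLASSPATH=/opt/db2/db2jcc4.jar
USER airflow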
Once your image is built, you can start Airflow and set up the connection.
Connection URL syntax: jdbc:db2://Host:Port/Database
Replace Host, Port and Database with your own values.
Driver Path: enter the location of your driver jar (db2jcc4.jar).
Hope these tips help. I am new to Airflow too.
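As a concrete (hypothetical) filled-in example, the URL might be jdbc:db2://db2-host.example.com:50000/MYDB. The SQL sensor from the question's step 2 can then poll with an ordinary query that returns a non-zero count once the expected record exists; schema, table and column here are assumptions:

SELECT COUNT(*) FROM MYSCHEMA.MYTABLE WHERE STATUS = 'READY'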

Bareos Postgres Plugin NOT backing up remote PostgreSQL13 database

I've installed Bareos 20.0.1 on Ubuntu 20.04.3 according to their documentation here.
I'm trying to back up a remote PostgreSQL database. Apparently there are three possible scenarios, and the pros of the PostgreSQL plugin (the third solution) make it the obvious choice.
Following the PostgreSQL plugin documentation, in the Prerequisites for the PostgreSQL Plugin section, there is a line saying:
The plugin must be installed on the same host where the PostgreSQL database runs.
Now what I'm failing to understand is this: if I'm supposed to install the plugin on my database node, how will the Bareos machine and the plugin on the DB machine communicate?
Furthermore, I've checked the source code for this module on GitHub, and I see that the plugin tries to find files locally, which confirms the statement above.
In a desperate act, I tried installing the plugin and its dependencies on the Bareos node, and I keep getting the error Error: python3-fd-mod: Could not read Label File /var/lib/postgresql/13/main/backup_label, which is actually trying to find the backup_label file on the Bareos node.
Here is the configuration for my fileset:
FileSet {
  Name = "psql"
  Include {
    Options {
      compression = GZIP
      signature = MD5
    }
    Plugin = "python"
             ":module_path=/usr/lib/bareos/plugins"
             ":module_name=bareos-fd-postgres"
             ":postgresDataDir=/var/lib/postgresql/13/main"
             ":walArchive=/var/lib/postgresql/13/wal_archive/"
             ":dbHost=DATABASE_DNS"
             ":dbuser=DATABASE_USER"
  }
}
Note that the plugin documentation describes the dbHost parameter as:
useful, if socket is not in default location. Specify socket-directory with a leading / here
However, since I'm targeting a remote database, I'm using its DNS address instead. I verified Bareos's connection to the database and made sure the backup_label file gets created while the PostgreSQL backup job runs.
I'll be happy to provide more details if necessary. Appreciate any help or even guesses :-D

Is there a way to use Flyway on AS400?

I need to implement a migration tool like Flyway in order to deploy DB changes with Jenkins.
I tried adding the jt400.jar file and configured it as the driver as follows:
flyway.url=jdbc:as400://192.168.171.251:446/DBDEV
flyway.driver=com.ibm.as400.access.AS400JDBCDriver
but it would not connect, giving this message:
ERROR: No database found to handle jdbc:as400://192.168.171.251:446/DBDEV
I also tried the IBM DB2 driver with this configuration:
flyway.url=jdbc:db2://192.168.171.251:50000/DBDEV
flyway.driver=com.ibm.db2.jcc.DB2Driver
and this time I got this kind of refusal message:
ERROR:
Unable to obtain connection from database (jdbc:db2://192.168.171.251:50000/DBDEV) for user 'DEVUSER':
[jcc][t4][2043][11550][4.26.14] Exception java.net.ConnectException: Error opening socket to server
/192.168.171.251 on port 50,000 with message: Connection refused (Connection refused).
ERRORCODE=-4499, SQLSTATE=08001
With this test migration I am trying to create a simple table by executing this SQL:
CREATE TABLE PERSON (
ID INT NOT NULL,
NAME VARCHAR(100) NOT NULL
);
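For reference, a minimal sketch of the whole attempted setup in flyway.conf (the user and locations lines are assumptions; the migration is saved under Flyway's default naming convention as sql/V1__create_person.sql):

flyway.url=jdbc:as400://192.168.171.251:446/DBDEV
flyway.driver=com.ibm.as400.access.AS400JDBCDriver
flyway.user=DEVUSER
flyway.locations=filesystem:sql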
Has anyone had this situation and solved it?
I believe that at present there is no support in Flyway for IBM i (AS/400), regardless of whether you use jt400.jar or an IBM JDBC driver.
You can either use a different database-schema versioning tool, or find a fork of Flyway that supports IBM i (or pay someone to create and support such a fork; it is open source...).
It seems that Flyway (currently 7.7.2) does not recognize a URL that starts with "jdbc:as400:" as a Db2 URL, so it throws an exception; that is why the jt400.jar-style URL is rejected with:
"No database found to handle ..."
The GitHub history tells a story (see https://github.com/flyway/flyway/issues/105).
It looks like the devs did not succeed in getting AS/400 support added, due to the lack of a suitable IBM i testing/dev environment (one also available to Travis CI). There may have been at least one PR for such support in the past, although it seems to have been removed.
If you try to use the IBM db2jcc4.jar driver to connect to IBM i (AS/400) with a URL similar to jdbc:db2://hostname/dbname, explicitly use an IBM JRE, and have the relevant license file (e.g. db2jcc_license_cisuz.jar) on the CLASSPATH, then Flyway will connect and then report an exception similar to:
Unsupported Database: AS 7.4
The Flyway source code shows that Flyway does not recognize this database product name and version, as of Flyway 7.7.2. (As an aside, the "Connection refused" on port 50000 in the question is a separate issue: IBM i listens for DRDA connections on port 446, the port the jt400 URL already used, not the 50000 default of Db2 for Linux/Unix/Windows.)
Are you sure DBDEV is the name of your Db2 database on the IBM i?
Use the Work with RDB Directory Entry (WRKRDBDIRE) command from the green screen and look for the *LOCAL entry.
Or use the Access Client Solutions (ACS) "Schemas" tool to see a list of databases on your system.
The above shows two DBs, UT29p63 and Dbtest.

JDBC Producer in StreamSets could not write data into MySQL

I configured the JDBC connection in the pipeline, and when the application executes I get the following error in the logs:
"java.sql.SQLSyntaxErrorException: Table 'databaseName.aim_table' doesn't exist"
The databaseName is not what I set. I have tried many times, and it shows the same message that it could not find the table, each time in a different database. All the databases that show up in sdc.log are ones I never configured, and the correct database is never used. So I want to know how it could pick the wrong database; the validation I ran before starting the pipeline was successful.
Do you have anything set in the Schema Name configuration for JDBC Producer? This should be blank for MySQL, since you're setting the database/schema name in the connect URL.
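For example, a typical MySQL connect URL that carries the database name (host and database here are hypothetical):

jdbc:mysql://mysql-host:3306/mydb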
Check that your MySQL driver matches the server. In particular, using the current version 8.0.x JDBC driver with a 5.x.x server seems to result in this problem. Download the older 5.1.x driver (currently 5.1.46) and it should work.
This problem was indeed caused by the wrong version of the driver package. I found the correct driver package and successfully wrote the data to the target table. One more point: I set the Schema Name to blank and defined the database name in the connect URL for MySQL.
My English is not good. Please forgive me.

DB2 10.1 table data copy to IBM VSE 7.4 table

We have an application with DB2 10.1 as its database.
Now a requirement came in where we need to interface a few tables to the HOST, which is on IBM DB2 VSE 7.4.
I tried to execute the LOAD command with the CLIENT option, but it gives the error "SQL1325N The remote database environment does not support the command or one of the command options."
The command is: D:\tempdata>db2 load client from app.tbl of ixf insert into host.tbl
Many posts say that LOAD is not allowed from 10.1 to VSE/z/OS.
Another option I tried is IMPORT, but it's too slow, and we need to delete the records every time since TRUNCATE is not available.
Replication could be an option, but we would like to avoid it.
Can anyone suggest a way to achieve this? Can LOAD be used or not?
It seems LOAD is not allowed from a remote machine, but then I wonder what the CLIENT option of LOAD is for.
Finally we decided to use the IMPORT utility after deleting the HOST DB2 records. We have to execute the DELETE and IMPORT commands on one part of the table at a time: if we try to import or delete a big table in one go, it fails with a log-full error. A sketch of one such chunk is below.
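A hedged sketch of one chunk, assuming a numeric column part_key to slice on (the column name and input file are hypothetical); COMMITCOUNT makes IMPORT commit periodically so the log does not fill up:

db2 "DELETE FROM host.tbl WHERE part_key BETWEEN 1 AND 100000"
db2 import from app_part1.ixf of ixf commitcount 10000 insert into host.tbl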
Hope this will help.