How to access hive tables in pentaho - mongodb

I am using Pentaho Data Integration 4.4.0 with the mongo-hadoop connector, and with it I successfully created the Hadoop and Mongo connections. Then I installed Hive 0.11.0 and, following the same connector instructions, successfully created the Hive and Mongo connections. My Mongo instance contains one database named pentaho. In Hive I created a database named demo, and with the following command I created a new table named pentaho:
CREATE TABLE pentaho
(
id INT,
region STRING,
year INT,
q1 INT,
q2 INT,
q3 INT,
q4 INT
)
STORED BY 'com.mongodb.hadoop.hive.MongoStorageHandler'
WITH SERDEPROPERTIES('mongo.columns.mapping'='{"id":"_id"}')
TBLPROPERTIES('mongo.uri'='mongodb://localhost:27017/pentaho.sales');
When I then run select * from pentaho; in the Hive shell, it shows all the records present in the sales collection.
Next I created a model in Pentaho using the Hadoop Hive data source, with host set to localhost, database name set to demo, and port set to 10000. When I click Test, a popup reports that the connection succeeded. But when I click OK, a new popup window appears with options such as Schemas, Tables, Views, and Synonyms, and the Tables list does not contain any of the tables I created in Hive. So how can I access Hive tables in a Pentaho data source?

After some time I found what I had missed: in my local Hadoop conf/mapred-site.xml file I had set the port to localhost:9001, and the BI server uses that same port for HSQL when it starts. After changing the port in mapred-site.xml to something other than 9001, it works fine. :)
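A port clash like this can be spotted before starting either service. The following is a minimal sketch, not part of the original answer: it only checks whether something already accepts TCP connections on a given port. The helper name port_in_use is hypothetical, and the ports probed (9001 and 10000) are the ones mentioned in this thread.

```python
import socket


def port_in_use(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success instead of raising an exception
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    # 9001: the port shared by mapred-site.xml and the BI server's HSQL
    # 10000: the Hive port used in the Pentaho data source
    for port in (9001, 10000):
        state = "in use" if port_in_use("localhost", port) else "free"
        print(f"localhost:{port} is {state}")
```

If 9001 is reported as in use before the BI server is started, another process (here, Hadoop) already holds it.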

Related

How to ETL my PostgreSQL data into a ClickHouse datawarehouse?

I have data stored in PostgreSQL as my data source, and I want to load it into the dimension and fact tables of a ClickHouse data warehouse. I am new to ClickHouse and am used to traditional integration tools like Talend and Microsoft SSIS for ETL.
(PS: I'm using Docker images for both ClickHouse and PostgreSQL.)
Have a look at the PostgreSQL engine integration here, which lets you run SELECT and INSERT queries from ClickHouse on data stored in a remote PostgreSQL.
You can also make use of the table function.
You can ingest data from Postgres into ClickHouse using:
External ETL tools like Airbyte, by setting up a connector from Postgres to ClickHouse
A ClickHouse integration table engine that exposes the Postgres data as a table in ClickHouse, followed by an INSERT INTO query that copies the data from that table into the real ClickHouse table
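The table-function route can be sketched as follows. This is a minimal sketch, not a definitive implementation: it only builds the INSERT ... SELECT statement around ClickHouse's postgresql(...) table function, which you would then run with your ClickHouse client of choice. The helper name and all table names, hosts, and credentials are hypothetical, and no escaping of the values is attempted. With Docker, the Postgres host is typically the name of the Postgres container on a network shared with the ClickHouse container.

```python
def build_pg_to_clickhouse_insert(
    target_table: str,
    pg_host: str,
    pg_port: int,
    pg_db: str,
    pg_table: str,
    pg_user: str,
    pg_password: str,
) -> str:
    """Build an INSERT ... SELECT that pulls rows from a remote Postgres table.

    Uses ClickHouse's postgresql('host:port', 'database', 'table', 'user',
    'password') table function. Values are interpolated as-is (no escaping),
    so only use trusted inputs.
    """
    source = (
        f"postgresql('{pg_host}:{pg_port}', '{pg_db}', "
        f"'{pg_table}', '{pg_user}', '{pg_password}')"
    )
    return f"INSERT INTO {target_table} SELECT * FROM {source}"


if __name__ == "__main__":
    # All names below are placeholders for illustration only.
    print(
        build_pg_to_clickhouse_insert(
            "dwh.dim_customer", "postgres", 5432,
            "shop", "customers", "etl_user", "secret",
        )
    )
```

SELECT * assumes the target table's columns match the Postgres table's columns in order and type; for real dimension and fact loads you would normally list and transform the columns explicitly.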

external table in DB2

I am trying to create an external table in DB2 warehouse in the cloud with the following command.
CREATE EXTERNAL TABLE EXT_5 (ID INT, AGE INT)
USING (DATAOBJECT ('abc.csv')
DELIMITER ','
MAXERRORS 10
SOCKETBUFSIZE 30000
REMOTESOURCE 'LOCAL')
The statement itself succeeds, but I get an "unsupported client" error when I try to insert rows into or select rows from this external table.
EXTERNAL TABLEs require version 11.1.2.2 or above of the Db2 Client.
E.g. see the line item "External table support by ODBC/CLI" here https://www.ibm.com/support/knowledgecenter/SSEPGG_11.1.0/com.ibm.db2.luw.wn.doc/doc/c0060895.html
You can download the latest version of your Client from here http://www-01.ibm.com/support/docview.wss?uid=swg27016878

How to Connect Mongodb to tableau

We are still in the development phase.
Our requirement is to parse the XML to JSON and store the results as flat documents in MongoDB.
Then we want to use Tableau for the analysis.
Part 1 of the requirement is done; now I need to connect to Tableau.
Versions we have are
Mongo 3.2
Tableau 9.1
I have googled and couldn't find any steps to integrate tableau with mongodb.
I also saw Mongodb has released a connector but there is no Windows BI connector.
Do we now need to migrate to the Enterprise version for Tableau connectivity?
Many thanks for your input.
Detailed Instructions (for Windows), using localhost server of mongoDB:
1) Installation: Install Tableau, MongoDB, and MongoDB BI Connector for Tableau.
2) From the command prompt, you will want to serve your MongoDB instance as well as the mongosqld server that the Tableau MongoDB BI Connector needs to connect to. Add the MongoDB and MongoDB BI Connector bin directories to your system path, for example: C:\Program Files\MongoDB\Server\3.6\bin\ and C:\Program Files\MongoDB\Connector for BI\2.3\bin\.
3) Serve your local MongoDB server. Example command: mongod. (Let's assume it is served on localhost:27017.)
4) Create a schema of the database you want Tableau to integrate with. Command to do this: mongodrdl --out <path_that_you_want_to_save_schema_to> /db:<name_of_database>
5) Validate the schema, and serve your local server of mongoDB as an SQL server (Tableau expects this server to be running). Command to do this: mongosqld --schema <path_to_schema> (** this will typically serve to localhost:3307)
6) You can now go to Tableau, under connectors, click on the MongoDB BI Connector, and enter localhost for the server, and 3307 for the port. (assuming in step 5 you have validated that the sql server is running on localhost with port 3307).
I hope this helps, these exact steps worked well for me.
The Mongo BI connector is implemented as a multicorn (Python) based Foreign Data Wrapper embedded in the supplied PostgreSQL server. Tools are provided to set up the PostgreSQL "biuser" user, to create the collection-to-table mappings from data sampling, and to import the resulting schema into PostgreSQL. The PostgreSQL database contains non-materialized views corresponding to the (flattened) Mongo collections. Access is through the PostgreSQL server using standard PostgreSQL JDBC/ODBC drivers.
I think running the MongoDB BI Connector in Docker on Ubuntu/CentOS is an option if the connector does not support Windows and Tableau does not support Linux, though that is an open question.
There is an example of creating a BI connection with mongosqld at www.mongodb.com/tableau.
I hope this works for your issue.
I have described how we connect Tableau to data in MongoDB Community Edition. First create an API for your DB, then a Web Data Connector for Tableau (it consists of HTML and JS files). After that you can use the WDC connector in Tableau to connect to your URL.
Here is the detailed description how we did it: https://medium.com/#katya.neulinger/tableau-web-data-connector-to-mongodb-c1477d7d5ac9

connect to postgresql from enterprise architect

I've Enterprise Architect v12 and I want to use its Database builder in order to create a live connection between it and a postgresql database.
This is what I've done:
I've downloaded and installed PostgreSQL 9.5
During installation I've installed with StackBuilder Npgsql v2.2.4.3-2 and psqlODBC (64 bit) v09.03.0400-1.
I've started pgAdmin III and I can connect to my local database. I've created a new one without problems.
I've created in EA a new project: in technology I've selected Database and Data model: PostgreSQL.
In DataModel -> Tables -> Tables I've added a table Table1 and I've created a column id (varchar and primary key)
In the same table I've added a Database connection named Database connection 1.
I double click on Database connection 1 and then I select ODBC based database. Then click OK
At this point I'm not able to select my postgresql database. The data origin window (I don't know if it's the right name I've the italian version of windows) gives me two tabs: File origin data and computer origin data. In the first case inside the postgresql data folder that I've selected during installation I don't find anything useful, while in the second case I can see only Excel files and MS Access Database. If I click on New button I can select the data origin only for the current user and in the list I can't see postgresql (SQL Server, Microsoft paradox, etc).
What am I doing wrong? How can I read my PostgreSQL database from EA?
Enterprise Architect only connects to 32 bit ODBC datasources.
You can access the 32 bit ODBC manager from within EA from Tools|ODBC Data Sources or directly in windows from c:\Windows\SysWOW64\odbcad32.exe

Oracle CDC in Talend

I am trying to setup a CDC in Talend on an Oracle database. I am following the steps as listed here: https://www.talendforge.org/tutorials/tutorial.php?language=english&idTuto=42
However, when I try to establish the CDC in the main DB connection, I get a message saying "Database information must be filled to use the cdc!" and the Create Subscriber button is disabled.
What could I be doing differently? I have created two connections pointing to the same database, and have also retrieved the table schema in my main connection.