I am new to dbt. I understand that dbt is used for transforming data once the raw data is already available in your DWH. However, I wanted to see whether anyone has used dbt to do an initial select/import of data into staging tables from a remote database server, using a dblink in the select statement, where both the staging database server and the remote database server are Postgres databases.
Also, in my case the data volume isn't large.
I did some testing and it looks like this is doable.
Below is the set-up:
The profiles.yml file points to the target DB server.
Below is the query used in the model file targetTable.sql:
{{ config(materialized='table') }}

SELECT i.col1,
       i.col2
FROM dblink('dbname=sourceDBName port=portNo hostaddr=sourceHostName user=userID password=****',
            'SELECT a.col1,
                    a.col2
             FROM (SELECT col1, col2
                   FROM public.tableName) a')
     i (col1 integer, col2 varchar(20))
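One thing worth noting: dblink() is only available if the dblink extension is installed in the target database that dbt connects to. A minimal sketch of enabling it, run once outside dbt by a suitably privileged user:

CREATE EXTENSION IF NOT EXISTS dblink;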
I have data stored in PostgreSQL as the data source, and I want to load the dimension and fact tables of a ClickHouse data warehouse. I am new to ClickHouse and am used to traditional integration tools like Talend and Microsoft SSIS for ETL.
(P.S. I'm using Docker images for both ClickHouse and PostgreSQL.)
Have a look at the PostgreSQL engine integration here, which lets you run SELECT and INSERT queries from ClickHouse against data stored in a remote PostgreSQL database.
You can also make use of the table function.
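As a rough illustration of the table function, assuming a recent ClickHouse with the postgresql() table function available (host, database, table, and credentials below are placeholders):

-- Query the remote Postgres table directly from ClickHouse
SELECT *
FROM postgresql('postgres-host:5432', 'source_db', 'customers', 'pg_user', 'pg_password')
LIMIT 10;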
You can ingest data from Postgres into ClickHouse using:
An external ETL tool such as Airbyte, with a connector from Postgres to ClickHouse.
The ClickHouse PostgreSQL integration table engine, to expose the Postgres data as a table inside ClickHouse, followed by an INSERT INTO ... SELECT to copy the data from that table into the real ClickHouse table (see the sketch below).
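A minimal sketch of the second approach, with placeholder names, types, and credentials, assuming the PostgreSQL table engine and MergeTree are available:

-- Expose the remote Postgres table inside ClickHouse
CREATE TABLE pg_orders
(
    order_id    UInt64,
    customer_id UInt64,
    amount      Decimal(12, 2)
)
ENGINE = PostgreSQL('postgres-host:5432', 'source_db', 'orders', 'pg_user', 'pg_password');

-- The "real" ClickHouse table
CREATE TABLE fact_orders
(
    order_id    UInt64,
    customer_id UInt64,
    amount      Decimal(12, 2)
)
ENGINE = MergeTree
ORDER BY order_id;

-- Copy the data across
INSERT INTO fact_orders
SELECT order_id, customer_id, amount
FROM pg_orders;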
I'm trying to link existing MS Access 2013 tables into a PostgreSQL 12 database, both 64-bit and installed on a small network, using odbc_fdw. The databases are on different machines on a Windows network. I have a System DSN installed and checked (using pyodbc).
In PostgreSQL I am able to create the extension, the foreign data wrapper, the foreign server, the foreign table, and the user mappings, but when trying to run a query I get "ERROR: Connecting to the driver". I tried numerous options following the little literature I found, without any luck. I can use odbc_fdw to connect to a MySQL server straightforwardly, but I could not figure out how to do it with MS Access.
I would deeply appreciate it if somebody could help me figure out how to connect MS Access to the FDW, if that is possible.
These are my basic settings in PostgreSQL:
CREATE FOREIGN DATA WRAPPER odbc_data_wrapper
HANDLER public.odbc_fdw_handler
VALIDATOR public.odbc_fdw_validator;
CREATE SERVER odbc_msaccess
FOREIGN DATA WRAPPER odbc_data_wrapper
OPTIONS (dsn 'msaccess');
CREATE USER MAPPING FOR postgres SERVER odbc_msaccess
OPTIONS ("odbc_UID" 'Admin', "odbc_PWD" '');
CREATE FOREIGN TABLE test(
id integer NOT NULL,
name character varying NOT NULL
)
SERVER odbc_msaccess
OPTIONS (layer 'test',
sql_query 'SELECT id, name FROM test');
DSN: msaccess is working. Tested with pyodbc.
odbc_data_wrapper: tested just fine connecting to a MySQL database.
The databases are on different machines
Yeah, that's likely not going to work.
PostgreSQL needs direct access to the Microsoft Access database file, so it either has to be on the same machine or on a network share. If it is on a network share, you need to make sure that the user account running PostgreSQL has access to that share, that the DSN is installed on the machine running PostgreSQL, and that you are referring to the file by its network path.
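For illustration only (the share, file name, and option spellings below are assumptions, not verified against the Access ODBC driver): with the CartoDB odbc_fdw, ODBC connection attributes can be passed on the server definition with an odbc_ prefix, so something along these lines might let you point at the Access file by its UNC path instead of relying solely on the DSN:

-- Sketch only: odbc_DRIVER comes from the odbc_fdw docs; DBQ is the Access ODBC
-- attribute for the database file. Server name, share, and file are made up.
CREATE SERVER odbc_msaccess_unc
    FOREIGN DATA WRAPPER odbc_data_wrapper
    OPTIONS (
        "odbc_DRIVER" 'Microsoft Access Driver (*.mdb, *.accdb)',
        "odbc_DBQ"    '\\fileserver\share\mydata.accdb'
    );

Either way, the Windows account that runs the PostgreSQL service still needs read access to the share, and any DSN you use must be defined on the machine running PostgreSQL.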
We have two databases on AWS RDS, OMS and SFMS, each with its own read replica. We use dblink in SFMS to fetch data for table A from OMS. It works perfectly on my SFMS DB instance with the master role, but I get ERROR: could not establish connection on our read-replica DB.
Here is how I have set up the dblink:
SELECT * FROM dblink(
'dbname=<DB End Point> user=<username> password=<password>',
'SELECT id, <Other fields> from A') AS oms_A
(id int, <Remaining Schema>)
I can always create a materialized view on SFMS to get it to work. Is there some mistake I am making while setting up dblink to use it on a read-replica instance?
This works on Aiven's PostgreSQL service. Please check out aiven.io.
To set it up you first need to create the extension on the master server with CREATE EXTENSION dblink;
The foreign server definitions and user name mappings also have to be created on the master which will then replicate them to the read-only replicas.
Once those are set up, you can run things like SELECT dblink_connect('myconn', 'db2remote'); and SELECT * FROM dblink('myconn', 'SELECT * FROM foo') AS t(id int); on the read replica.
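A minimal sketch of the whole sequence, with placeholder endpoints, names, and credentials (the foreign server and user mapping are created on the master and then replicate to the read replicas):

-- On the master / primary:
CREATE EXTENSION IF NOT EXISTS dblink;

CREATE SERVER db2remote
    FOREIGN DATA WRAPPER dblink_fdw
    OPTIONS (host 'oms-endpoint.rds.amazonaws.com', dbname 'oms', port '5432');

CREATE USER MAPPING FOR CURRENT_USER
    SERVER db2remote
    OPTIONS (user 'readonly_user', password '****');

-- On the read replica:
SELECT dblink_connect('myconn', 'db2remote');
SELECT * FROM dblink('myconn', 'SELECT id FROM foo') AS t(id int);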
Hope this helps.
I am connecting to my PostgreSQL DB on AWS RDS through SQL Workbench. I created a new table and it was created successfully; the table name is something like public.xyz.
Now when I try to run a select query on public.xyz I get an error like
'Relation public.xyz does not exist'.
I have checked that my search_path contains $user, public, so there is no case issue.
I have tried select queries like select * from public.xyz and select * from xyz etc.; all have the same issue.
Please suggest.
It seems to be a problem with SQL Workbench. When I tried to create the table using the psql client on a Linux machine, it worked properly, and now I am able to run the same select query that was failing earlier. I don't know the inner details, but it seems to be an issue with SQL Workbench.
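If you run into something similar, a couple of diagnostic queries (not from the original post; the table name is a placeholder) can confirm what the session's search path is and which schema the table actually landed in:

SHOW search_path;

SELECT table_schema, table_name
FROM information_schema.tables
WHERE table_name ILIKE 'xyz';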
I was using Pentaho Data Integration 4.4.0 with the mongo-hadoop connector, and with it I successfully created the Hadoop and Mongo connections. Then I installed Hive 0.11.0 and, using the link above, successfully created the Hive and Mongo connections. My Mongo instance contains a database named pentaho, I created a database in Hive named demo, and with the following command I created a new table named pentaho:
CREATE TABLE pentaho
(
id INT,
region STRING,
year INT,
q1 INT,
q2 INT,
q3 INT,
q4 INT
)
STORED BY 'com.mongodb.hadoop.hive.MongoStorageHandler'
WITH SERDEPROPERTIES('mongo.columns.mapping'='{"id":"_id"}')
TBLPROPERTIES('mongo.uri'='mongodb://localhost:27017/pentaho.sales');
Then, when I run select * from pentaho; in the Hive shell, it shows all the records present in the sales collection.
Then I created a model in Pentaho using the Hadoop Hive data source, set the host to localhost, the database name to demo, and the port to 10000, and clicked Test; a popup confirmed the connection was made successfully. But when I click OK, the next popup window shows options like Schemas, Tables, Views, Synonyms, yet Tables does not contain any of the tables I created in Hive. So how can I access Hive tables in the Pentaho data source?
After some time I found what I was missing: in my local Hadoop conf/mapred-site.xml I had set the port to localhost:9001, and when I start the BI server the same port is used for HSQL. After changing the mapred-site.xml port to something other than 9001, it works fine. :)
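A sketch of what that change might look like in conf/mapred-site.xml (the property name matches Hadoop 1.x-era configs; 9002 is just an example of a port other than 9001):

<configuration>
  <!-- Job tracker moved off 9001 so it no longer collides with the BI server's HSQL port -->
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9002</value>
  </property>
</configuration>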