Load a PostgreSQL database using CloudConnect

Alongside my GoodData project, I maintain a small PostgreSQL database that contains a few tables.
I would like to integrate both ETL processes in the same tool, and CloudConnect seems the easiest way, since my whole GoodData ETL already lives in it.
Here is what I have tried, without success:
From the documentation, it seems that the CloverETL components that would enable this (DBOutput, PostgreSQLDataWriter) are not available in CloudConnect.
I managed to connect to the Agile Data Warehousing Service (ADS, the database attached to GoodData), but it seems that only the ADS database understands a request such as:
COPY MyDataBaseTable (field1,field2) FROM LOCAL '${DATA_TMP_DIR}/CIforADS.csv'
Even when I adapt the syntax for PostgreSQL it fails, because the dynamic addressing I use here does not seem to work.
Is there any way to proceed that I'm missing? Can anyone think of a workaround?

In general this could be achieved with the DBExecute component, but I'm not sure I understand correctly: do you want to load data into your own PostgreSQL instance using CloudConnect?
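If the built-in components fall short, one workaround is to do the load outside CloudConnect with a small script. Below is a minimal sketch, assuming the node-postgres (pg) and pg-copy-streams packages; the connection string, table, columns and file name are all placeholders. Note that COPY ... FROM LOCAL is Vertica-style syntax (which ADS speaks); plain PostgreSQL's COPY ... FROM expects a path on the database server, which is why the sketch streams the file through the client connection with COPY ... FROM STDIN.

import { createReadStream } from 'node:fs';
import { pipeline } from 'node:stream/promises';
import { Client } from 'pg';
import { from as copyFrom } from 'pg-copy-streams';

// Placeholder connection string; point this at your own PostgreSQL instance.
const client = new Client({ connectionString: 'postgres://user:pass@host:5432/mydb' });

async function load(): Promise<void> {
  await client.connect();
  try {
    // COPY ... FROM STDIN streams the local CSV through the client connection,
    // so the file does not have to live on the database server.
    const copyStream = client.query(
      copyFrom('COPY mydatabasetable (field1, field2) FROM STDIN WITH (FORMAT csv)')
    );
    await pipeline(createReadStream('CIforADS.csv'), copyStream);
  } finally {
    await client.end();
  }
}

load().catch((err) => {
  console.error(err);
  process.exit(1);
});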

Related

Connect SAS to MongoDB without using ODBC

One of our statisticians is stuck trying to read data from MongoDB using SAS.
In my experience, connecting Mongo to other languages always requires a native driver, but in this case I've found it is only possible using ODBC.
I've tried to find a better way to connect these two pieces of software, but the only idea that came to mind is to expose Mongo via a web service.
Does anyone have a better solution to connect SAS to MongoDB?
After some attempts we found that a web service is the most convenient way to solve MongoDB access in our case.
Some statisticians needed to load data on their laptops from outside the corporate network, so we decided to extend our web service to expose some more information and read it in SAS with the JSON libname engine, as described here: https://blogs.sas.com/content/sasdummy/2016/12/02/json-libname-engine-sas/
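For reference, the web-service side can be quite small. A rough sketch, assuming Express and the official MongoDB Node driver; the port, database, collection and route names are hypothetical:

import express from 'express';
import { MongoClient } from 'mongodb';

// Hypothetical connection URI, database and collection names.
const mongo = new MongoClient('mongodb://localhost:27017');
const app = express();

// Read-only JSON endpoint that SAS can consume via the JSON libname engine.
app.get('/measurements', async (_req, res) => {
  try {
    const docs = await mongo
      .db('stats')
      .collection('measurements')
      .find({}, { projection: { _id: 0 } })
      .limit(1000)
      .toArray();
    res.json(docs);
  } catch (err) {
    res.status(500).json({ error: String(err) });
  }
});

async function main(): Promise<void> {
  await mongo.connect();
  app.listen(8080, () => console.log('listening on :8080'));
}

main().catch(console.error);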
Thanks all for the clarification regarding ODBC; it was a real surprise to me that it is still the preferred way to load data in enterprise environments.

Postgres to Elasticsearch data indexing for the ELK Stack

I know the JDBC Rivers plugin is deprecated, so even though it is still being used, I'd ideally not want to build on something that is no longer supported.
However, I have a few tables in a Postgres database with values that I need to be able to search in a Kibana view. I'm new to the ELK stack, but I've been messing around with some of the samples to get familiar.
I've seen some mentions of using stored procedures/triggers in Postgres to send data to Logstash, although I'm not sure this is the best way. I'm not a developer but a QA, so my coding skills are "ok"; I'm used to writing automation tests and the like.
What would be the best way to do this? I would probably want to capture changes to these tables (new inserts or updates), or be able to poll the data every X period of time (30 s or so). Let's pretend it's for a weather station and the tables contain humidity data from different weather sensors.
I'd want to be able to search the values, station ID, etc. in a Kibana view.
Is this doable? Is there maybe a better way than using triggers/stored procedures?
I ended up using the Logstash JDBC input plugin, following https://www.elastic.co/blog/logstash-jdbc-input-plugin, to get the data moving and working (which it does). Fair warning for anyone who finds this answer: it was a lot of setup.
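To give a feel for that setup, here is a rough sketch of the kind of Logstash pipeline config involved; all connection details, the table, columns and index name are hypothetical, and the schedule polls every 30 seconds:

input {
  jdbc {
    # Hypothetical connection details; adjust to your database.
    jdbc_connection_string => "jdbc:postgresql://localhost:5432/weather"
    jdbc_user => "logstash"
    jdbc_password => "secret"
    jdbc_driver_library => "/opt/drivers/postgresql.jar"
    jdbc_driver_class => "org.postgresql.Driver"
    # Poll every 30 seconds; :sql_last_value tracks rows already fetched.
    schedule => "*/30 * * * * *"
    statement => "SELECT station_id, humidity, updated_at FROM sensor_readings WHERE updated_at > :sql_last_value"
    use_column_value => true
    tracking_column => "updated_at"
    tracking_column_type => "timestamp"
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "weather-sensors"
  }
}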

How to visualize data from PostgreSQL in Kibana?

I need to visualize some data from PostgreSQL in Kibana. I also have Elasticsearch installed, just in case. So how do I visualize PostgreSQL data in Kibana? Of course, I don't need the whole database, only the data returned by a custom SQL query.
Also, I want it to be as simple as possible; I wouldn't like to use libraries I really don't need.
Kibana was built with Elasticsearch in mind.
Having used it quite a lot at a startup I worked for, I can tell you that even the front-end query DSL (built on Lucene) will only work with Elasticsearch (or might need some serious tweaks).
I would advise you to push your data into Elasticsearch, and just use Kibana the way it was made to be used :)
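Pushing the result of a custom SQL query is only a few lines of scripting. A minimal sketch, assuming the pg and @elastic/elasticsearch Node packages; the connection details, query and index name are placeholders:

import { Client as PgClient } from 'pg';
import { Client as EsClient } from '@elastic/elasticsearch';

// Placeholder connection details.
const pg = new PgClient({ connectionString: 'postgres://user:pass@host:5432/mydb' });
const es = new EsClient({ node: 'http://localhost:9200' });

async function push(): Promise<void> {
  await pg.connect();
  // Run the custom SQL query whose results should show up in Kibana.
  const { rows } = await pg.query('SELECT id, label, value, created_at FROM my_view');
  for (const row of rows) {
    // One document per row ('document' is the 8.x client parameter; older clients use 'body').
    await es.index({ index: 'reports', id: String(row.id), document: row });
  }
  await pg.end();
}

push().catch(console.error);

Once the index exists, point a Kibana index pattern (data view) at it and build your visualizations from there.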

Multiple server instance load and database balancing

Please correct me if I am wrong, but I understand that handling more requests and load by adding more machines, balancing the load between multiple servers, is horizontal scaling. So if I add more servers, how do I distribute the database? Do I create only one database to hold the user records, shared by the multiple servers? Or do I split the database too? Then what about database integrity? How do I synchronize it?

I am a newbie and really confused, but eager to learn. I would like to use PostgreSQL for my project and would like to know some basic things before I start. What I want to do is create two servers for application load balancing (please correct me if that's not what I should do). How will I manage the database across these without compromising its integrity? Do I have to replicate the data between the two servers? How do you manage a database across multiple instances? Do I need to go through sharding for this? What would be the best approach to running many instances without losing database integrity with PostgreSQL? I would really appreciate it if anyone could explain this to me. Thank you!
I'm not sure whether you are just looking for a service that gives you what you need, so you don't have to spend time on it, or whether you would like to implement it on your end, which I guess could be quite complex.
If you want your own solution, you could take a look at Postgres-XC, a project that provides a database cluster built upon PostgreSQL.
On the other hand, if you are mainly interested in the development process and don't want to spend time on this when you can have it in the cloud, you could take a look at EnterpriseDB, which provides PostgreSQL in the cloud.
For your application, you can also use a cloud service that can even auto-scale your app depending on certain parameters, as explained here.

How to have complete offline functionality in a web app with a PostgreSQL database?

I would like to give a web app with a PostgreSQL database 100% offline functionality. Ideally, the database would be completely replicated in the browser per user and synchronized when online, so that the same code can be used to talk to both the offline and online database. I know this is possible with PouchDB and CouchDB, but I have not found a solution that works with PostgreSQL. Is this at all possible?
Short answer: I don't know of anything like this that currently exists.
However, in theory, this could be made to work...(long answer:)
Write a PostgreSQL backend for levelup (one exists for MySQL: https://github.com/kesla/mysqldown)
Wire up pouch-server to read/write from your PostgreSQL db using pouchdb's existing leveldb adapter (which in turn will have to be configured to use your postgres backend). Congrats, you can now sync data using PouchDB!
Whether an approach like this is practical in reality for your application is a different question you'll have to answer.
You may be wondering, for example, "will I be able to sync an existing complex schema with multiple tables to the client with this approach?" The answer is probably not - the mysqldown implementation of leveldown uses a single MySQL table with three fields: id, key, and value (source), and I imagine any general-purpose PostgreSQL adapter would be similar (nothing says you can't do a special-purpose adapter just for your app though!).
On the other hand, if you were to implement a couchdb-compatible API (or a subset- you may not need attachments, for example) over your existing database schema, there's nothing stopping you from using PouchDB on the client to talk directly to that as if it were an actual CouchDB - just pop in the URL and call replicate()! Implementing the replication protocol might be a fair bit of work, since you'd need to track revisions and so on somewhere - but again, technically not impossible!
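For a sense of scale, the client side of that last approach is tiny. A sketch, assuming a hypothetical server URL that speaks the CouchDB replication protocol over your Postgres schema:

import PouchDB from 'pouchdb';

// Local, in-browser database.
const local = new PouchDB('app');

// Hypothetical endpoint: your server exposing a CouchDB-compatible API over PostgreSQL.
const remote = new PouchDB('https://example.com/couch/app');

// Live two-way sync; PouchDB retries automatically when the client comes back online.
local.sync(remote, { live: true, retry: true })
  .on('change', (info) => console.log('synced', info))
  .on('error', (err) => console.error(err));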
There are also implementations of levelup's backend storage that are designed for browsers. See level.js, which could be another way to sync between a server-side Postgres levelup backend and the browser.
TL;DR: There's tons of work being done around JavaScript databases right now. Is syncing with Postgres impossible? Probably not. Would it be a lot of work? Definitely. Worth it? Who knows, but it would be cool.
Without installing PostgreSQL on the client? No. Obviously you can cache data for offline use, but an entire RDBMS plus procedural languages in JavaScript, no.