I am new to Datadog. We are planning to use it as a reporting tool. Here is the scenario:
Create a log table in Amazon RDS.
Fetch some rows from the log table and send them to Datadog.
I have read some documentation and found two ways to send data to Datadog. Here are the links:
https://www.datadoghq.com/blog/collect-postgresql-data-with-datadog/#collecting-custom-postgresql-metrics-with-datadog
https://docs.datadoghq.com/logs/log_collection/?tab=serverless
Is there any other way to send query results to Datadog, or to have it query Aurora PostgreSQL (AWS) directly?
Help will be highly appreciated.
Related
I would like to automatically stream data from an external PostgreSQL database into a Google Cloud Platform BigQuery database in my GCP account. So far, I have seen that one can query external databases (MySQL or PostgreSQL) with the EXTERNAL_QUERY() function, e.g.:
https://cloud.google.com/bigquery/docs/cloud-sql-federated-queries
But for that to work, the database has to be in GCP Cloud SQL. I looked for ways to stream from the external PostgreSQL into a Cloud SQL PostgreSQL database, but I could only find information about replicating it as a one-time copy, not streaming:
https://cloud.google.com/sql/docs/mysql/replication/replication-from-external
The reason I want this streaming into BigQuery is that I am using Google Data Studio to create reports from the external PostgreSQL, which works great, but GDS can only accept SQL query parameters if the data comes from a Google BigQuery database. E.g. if we have a table with 1M entries, and we want the user to supply a Google Data Studio parameter, this will turn into:
SELECT * from table WHERE id=@parameter;
which means that the query will be faster, and won't hit the 100K records limit in Google Data Studio.
What's the best way of creating a connection between an external PostgreSQL (read-only access) and Google BigQuery so that when querying via BigQuery, one gets the same live results as querying the external PostgreSQL?
Perhaps you missed the options described in the Google Cloud user guide?
https://cloud.google.com/sql/docs/mysql/replication/replication-from-external#setup-replication
Notice in this section, it says:
"When you set up your replication settings, you can also decide whether the Cloud SQL replica should stay in-sync with the source database server after the initial import is complete. A replica that should stay in-sync is online. A replica that is only updated once, is offline."
I suspect online mode is what you are looking for.
What you are looking for will require some architecture design based on your needs, and some coding. There isn't a feature to automatically sync your PostgreSQL database with BigQuery (apart from the EXTERNAL_QUERY() functionality, which has some limitations: one connection per database, performance, total number of connections, etc.).
If you do not need the data in real time, you can use Airflow, for instance: have a DAG connect to all your DBs once per day (using KubernetesPodOperator, for instance), extract the data from the past day, and load it into BQ. This is a typical ETL process, but in this case more EL(T). You can run this process more often if you cannot wait a day for the previous day's data.
On the other hand, if streaming is what you are looking for, then I can think of a Dataflow job. I guess you can connect using a JDBC connector.
In addition, depending on how your pipeline is structured, it might be easier to implement (but harder to maintain) to stream your data into BigQuery at the same moment you write it to your PostgreSQL DB.
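As a rough illustration of the "extract the past day" step, here is a minimal sketch in plain Python. It is not an Airflow DAG: sqlite3 stands in for the external PostgreSQL database, and the `events` table with its `created_at` column is hypothetical. The load into BigQuery is deliberately left out; a scheduler task would call something like `extract_previous_day` and hand the rows to whatever loader you use.

```python
import sqlite3
from datetime import date, datetime, timedelta


def previous_day_window(run_date: date) -> tuple[datetime, datetime]:
    """Return the half-open window [start, end) covering the day before run_date."""
    end = datetime.combine(run_date, datetime.min.time())
    return end - timedelta(days=1), end


def extract_previous_day(conn: sqlite3.Connection, run_date: date) -> list[tuple]:
    """Extract only the rows created during the previous day."""
    start, end = previous_day_window(run_date)
    cur = conn.execute(
        "SELECT id, payload, created_at FROM events "
        "WHERE created_at >= ? AND created_at < ? ORDER BY id",
        (start.isoformat(sep=" "), end.isoformat(sep=" ")),
    )
    return cur.fetchall()


def demo() -> list[tuple]:
    # sqlite3 is an assumption for the sketch; the real source would be PostgreSQL.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE events (id INTEGER PRIMARY KEY, payload TEXT, created_at TEXT)"
    )
    conn.executemany(
        "INSERT INTO events VALUES (?, ?, ?)",
        [
            (1, "old", "2024-01-01 23:59:59"),
            (2, "yesterday-a", "2024-01-02 00:00:00"),
            (3, "yesterday-b", "2024-01-02 18:30:00"),
            (4, "today", "2024-01-03 00:00:00"),
        ],
    )
    rows = extract_previous_day(conn, date(2024, 1, 3))
    conn.close()
    return rows
```

Using a half-open window means a row stamped exactly at midnight belongs to one run only, so consecutive daily runs neither skip nor duplicate rows.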
Not sure if you have tried this already, but instead of adding a parameter, if you add a dropdown filter based on a dimension, Data Studio will push that down to the underlying Postgres db in this form:
SELECT * from table WHERE id=$filter_value;
This should achieve the same results you want without going through BigQuery.
Queries are getting queued when multiple users try to access the database (same table) on Db2 Warehouse on Cloud.
We are using Db2 Warehouse on Cloud, and our analytical reporting tools are Dundas and Cognos. When multiple users use the same dashboard/report at the same time, all the SQL statements get queued and the users do not get a response. Users have no option to customize the backend SQL generated by the reporting tool.
Is there any parameter we can tune to keep the SQL statements from getting queued?
I need to migrate tables from BigQuery to an on-prem Postgres database.
How can I efficiently achieve that?
Some thoughts so far:
I will use Google APIs to export the data from the tables
Store it locally
And finally, import to Postgres
But I am not sure whether that can be done for a huge amount of data (terabytes). Also, how can I automate this process? Can I use Jenkins for that?
Exporting the data from BigQuery, storing it, and importing it into PostgreSQL is a good approach. Here are two other alternatives that you can consider:
1) There's a PostgreSQL wrapper for BigQuery that allows you to query BigQuery directly. Depending on your scenario, this might be the easiest way to transfer the data, although for TBs it might not be the best approach. This suggestion was made by @David in this SO question.
2) Using Dataflow. You can create an ETL process using Apache Beam to make the transfer. Take a look at this how-to for transferring data from BigQuery to CloudSQL. You would need to adapt it for a local PostgreSQL, but the idea still holds.
Here's another SO answer that gives more context on this approach.
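For the export, store, import route from the question, the steps could be scripted roughly as below. This is only a sketch: it builds the command lines for the `bq`, `gsutil` and `psql` CLIs without running them, and every table name, bucket, path and connection string in the usage note is a placeholder. Staging the export in a Cloud Storage bucket is an assumption of this sketch, not something the question specified.

```python
import subprocess


def export_cmd(bq_table: str, gcs_uri: str) -> list[str]:
    """Export a BigQuery table to CSV files in Cloud Storage.

    A wildcard (*) in gcs_uri lets BigQuery shard a large table into many files.
    """
    return ["bq", "extract", "--destination_format=CSV", bq_table, gcs_uri]


def download_cmd(gcs_uri: str, local_dir: str) -> list[str]:
    """Copy the exported files to local storage (-m copies in parallel)."""
    return ["gsutil", "-m", "cp", gcs_uri, local_dir]


def import_cmd(pg_dsn: str, pg_table: str, csv_path: str) -> list[str]:
    """Load one CSV file into PostgreSQL with psql's client-side \\copy."""
    copy = f"\\copy {pg_table} FROM '{csv_path}' WITH (FORMAT csv, HEADER true)"
    return ["psql", pg_dsn, "-c", copy]


def run_all(commands: list[list[str]]) -> None:
    """Run the commands in order, stopping at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)
```

A Jenkins job (or any scheduler) could call `run_all` with `export_cmd("project:dataset.table", "gs://my-bucket/table-*.csv")`, then `download_cmd(...)`, then one `import_cmd(...)` per downloaded file. For terabytes of data the import step is likely to dominate, so loading the sharded files one at a time also gives you natural restart points.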
I have two databases, one on MySQL and one on SQL Server, and I want to connect them with Mule ESB. The desired result is a table with fields (MACC, tencc, ngaysinh) on MySQL and a table with fields (ID, NAME, ADDRESS) on SQL Server, so that when I add a row (NAME, ADDRESS) on MySQL, the data also changes on SQL Server.
Thanks.
Mule's JDBC connectivity provides very good connectors for MySQL and SQL Server databases.
For your requirement, kindly go through the official MuleSoft documentation here and learn how to connect to databases.
There is a good tutorial on SQL Server connectivity in MuleSoft here.
Based on the above tutorial, you can design a Mule flow that connects to the MySQL and SQL Server databases using a Mule timer component: the timer triggers an event that reads data from the MySQL table and populates the SQL Server table as needed.
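The polling pattern behind such a timer-driven flow can be illustrated outside Mule. The sketch below is plain Python, not Mule configuration: two in-memory sqlite3 databases stand in for MySQL and SQL Server, and the `customers` table and its columns are hypothetical. Each "tick" copies only the rows added since the last synced id.

```python
import sqlite3

# Hypothetical schema used on both sides of the sketch.
SCHEMA = "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, address TEXT)"


def sync_new_rows(source: sqlite3.Connection, target: sqlite3.Connection, last_id: int) -> int:
    """Copy rows with id > last_id from source to target; return the new high-water mark."""
    rows = source.execute(
        "SELECT id, name, address FROM customers WHERE id > ? ORDER BY id",
        (last_id,),
    ).fetchall()
    target.executemany(
        "INSERT OR REPLACE INTO customers (id, name, address) VALUES (?, ?, ?)",
        rows,
    )
    target.commit()
    return rows[-1][0] if rows else last_id


def demo() -> tuple[int, list[tuple]]:
    source = sqlite3.connect(":memory:")  # stands in for MySQL
    target = sqlite3.connect(":memory:")  # stands in for SQL Server
    source.execute(SCHEMA)
    target.execute(SCHEMA)

    source.execute("INSERT INTO customers VALUES (1, 'An', 'Hanoi')")
    last_id = sync_new_rows(source, target, 0)        # first timer tick
    source.execute("INSERT INTO customers VALUES (2, 'Binh', 'Hue')")
    last_id = sync_new_rows(source, target, last_id)  # second tick picks up row 2
    last_id = sync_new_rows(source, target, last_id)  # third tick finds nothing new

    copied = target.execute(
        "SELECT id, name, address FROM customers ORDER BY id"
    ).fetchall()
    return last_id, copied
```

Note that tracking the highest id only catches newly added rows, which matches the "adding" case in the question; updates and deletes on the source would need a different change marker, such as a last-modified timestamp.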
Note: In my opinion, replicating data in this manner is not good design. If it is for a PoC or for learning purposes, it is fine. If possible, can you please share your use case?
I want to connect two data sources and establish a relationship between them in Tableau: one from SQL Server and the other from a Microsoft Excel sheet. How do I do that?
I have googled a lot for this but could not find a suitable answer.
You are talking about data blending.
For connecting data across databases, see:
Cross Database Querying is a Flagship Upgrade to Tableau 10.0
However, you cannot use cross-database joins with the following connection types:
Tableau Server
Firebird
Google Analytics
Microsoft Analysis Services
Microsoft PowerPivot
Odata
Oracle Essbase
Salesforce
SAP BW
Splunk
Teradata OLAP Connector
You just need to connect to each database separately and make sure they have the same column names. When you create a sheet and switch between data sources, you will see a chain icon on the linked fields.
Do note that this is not a proper join but just blended data; it would be best to create another table in your SQL database for the Excel sheet.