Is it possible to use Grafana to write query results from SQL databases (Postgres / MySQL) into InfluxDB?

I would like to query several different databases using Grafana, and to keep a history of the metrics I would like to store the results in InfluxDB.
I know that I can write my own little process that runs the queries and sends the results to InfluxDB, but I wonder if it's possible with Grafana alone?

You won't be able to use Grafana to do that. Grafana isn't really an appropriate tool for transforming/writing data. But either way, its query engine generally just works with one single datasource/database at a time, rather than multiple, which is what you'd need here.

Related

How to combine data from postgreSQL and dynamic json in grafana

I have a Grafana dashboard where I want to use an Orchestra Cities map to show the status of some stations. The status is available as JSON from an HTTP server (using Nagios for this part), but the status data has no idea of the location of the stations. That I have in a PostGIS database.
I know I can set up a script that reads the status JSON and inserts the data into a table in the PostGIS database. This could run every five minutes or so. It feels a bit kludgy, so I wonder if there are other ways of doing this.
Would it be possible to use a foreign data wrapper to fetch the JSON into PostGIS? The only JSON FDW I have found reads a set of files; I would need to read from an HTTP server.
If not, is it possible to combine data from JSON and Postgres into one data set in Grafana? I can read data from both sources and present them, e.g. as time series in one panel, but here I need to join the two so that I can use some of the attributes from the JSON to categorize the points from PostGIS (or the other way around, if that is easier).
In theory you can do that in Grafana. You need two queries, one returning results from each source (how to write the queries and configure the datasources is outside the scope of this question), plus a key that is present in both results and can be used for the join (e.g. city_id).
Then you can use the join transformation to combine both query results into a single dataset.
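To illustrate what joining on a shared key produces, here is a small Python sketch of an inner join over two result sets. It is not Grafana's implementation, only the same idea outside Grafana; the field names (station_id, status, lat, lon) and the sample rows are hypothetical.

```python
def join_on_key(left, right, key):
    """Inner join: keep rows from `left` that have a match in `right` on `key`."""
    right_by_key = {row[key]: row for row in right}
    joined = []
    for row in left:
        match = right_by_key.get(row[key])
        if match is not None:
            joined.append({**row, **match})  # merge the fields of both rows
    return joined


# Hypothetical result of the JSON (status) query
status_rows = [
    {"station_id": 1, "status": "OK"},
    {"station_id": 2, "status": "CRITICAL"},
    {"station_id": 9, "status": "OK"},  # no location known, dropped by the join
]
# Hypothetical result of the PostGIS (location) query
location_rows = [
    {"station_id": 1, "lat": 59.91, "lon": 10.75},
    {"station_id": 2, "lat": 60.39, "lon": 5.32},
]

joined = join_on_key(status_rows, location_rows, "station_id")
```

Each joined row then carries both the status and the coordinates, which is the shape a map panel needs.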

Streaming PostgreSQL tables into Google BigQuery

I would like to automatically stream data from an external PostgreSQL database into a Google Cloud Platform BigQuery database in my GCP account. So far, I have seen that one can query external databases (MySQL or PostgreSQL) with the EXTERNAL_QUERY() function, e.g.:
https://cloud.google.com/bigquery/docs/cloud-sql-federated-queries
But for that to work, the database has to be in GCP Cloud SQL. I tried to see what options are there for streaming from the external PostgreSQL into a Cloud SQL PostgreSQL database, but I could only find information about replicating it in a one time copy, not streaming:
https://cloud.google.com/sql/docs/mysql/replication/replication-from-external
The reason I want to stream into BigQuery is that I am using Google Data Studio to create reports from the external PostgreSQL database, which works great, but GDS only accepts SQL query parameters if the data comes from a Google BigQuery database. E.g. if we have a table with 1M entries and we want the user to supply a Google Data Studio parameter, this will turn into:
SELECT * from table WHERE id=@parameter;
which means that the query will be faster, and won't hit the 100K records limit in Google Data Studio.
What's the best way of creating a connection between an external PostgreSQL (read-only access) and Google BigQuery so that when querying via BigQuery, one gets the same live results as querying the external PostgreSQL?
Perhaps you missed the options described in the Google Cloud user guide?
https://cloud.google.com/sql/docs/mysql/replication/replication-from-external#setup-replication
Notice in this section, it says:
"When you set up your replication settings, you can also decide whether the Cloud SQL replica should stay in-sync with the source database server after the initial import is complete. A replica that should stay in-sync is online. A replica that is only updated once, is offline."
I suspect online mode is what you are looking for.
What you are looking for will require some architecture design based on your needs, plus some coding. There isn't a feature that automatically syncs your PostgreSQL database with BigQuery (apart from the EXTERNAL_QUERY() functionality, which has some limitations: one connection per database, performance, the total number of connections, etc.).
If you don't need the data in real time, you could use Airflow, for instance, with a DAG that connects to all your databases once per day (using KubernetesPodOperator, for instance), extracts the previous day's data, and loads it into BigQuery. It is a typical ETL process, though in this case more EL(T). You can run the process more often if you cannot wait a day for the previous day's data.
On the other hand, if streaming is what you are looking for, then I can think of a Dataflow job. I guess you can connect using a JDBC connector.
In addition, depending on how your pipeline is structured, it might be easier to implement (but harder to maintain) to stream your data into BigQuery at the same moment you write it to your PostgreSQL database.
Not sure if you have tried this already, but instead of adding a parameter, if you add a dropdown filter based on a dimension, Data Studio will push that down to the underlying Postgres db in this form:
SELECT * from table WHERE id=$filter_value;
This should achieve the same results you want without going through BigQuery.

Can I modify PostgreSQL SQL before executing it?

I use Grafana to view metrics in TimescaleDB.
For large-scale metrics I created a view that aggregates them into a smaller dataset. The SQL I configure in Grafana has a fixed table name, but I want the table to change according to the time range: for a time range of less than 6 hours, query the detail table; for a time range greater than 24 hours, query the aggregate view.
So I am looking for a proxy or PostgreSQL plugin that can be used to modify the SQL before it is executed.
AFAIK there is no PostgreSQL extension that modifies SQL queries, but there is a proxy that says it can rewrite and filter SQL queries: https://github.com/wgliang/pgproxy.
You might alternatively look at TimescaleDB's real-time aggregates, which were released in 1.7.
Basically, it transparently takes the "union" of the pre-calculated aggregates (> 6 hours) and the "raw" data (< 6 hours).
Not quite what you are asking for, but might get you to the same place, and works transparently with grafana.
https://blog.timescale.com/blog/achieving-the-best-of-both-worlds-ensuring-up-to-date-results-with-real-time-aggregation/
I would suggest taking a look at Gallium Data. It's a free database proxy that allows you to change database requests before they hit the database, and database responses before they reach the clients.
Disclosure: I'm the founder of Gallium Data.
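If the queries pass through a small service or proxy of your own, the table switch described in the question comes down to a comparison on the requested time range. The sketch below is only an illustration under assumptions: the table and view names (metrics_detail, metrics_hourly) and the column names are hypothetical, the single 6-hour threshold is a choice (the question leaves the range between 6 and 24 hours unspecified), and the %s placeholders assume a psycopg-style driver.

```python
from datetime import datetime, timedelta


def pick_table(
    start: datetime,
    end: datetime,
    threshold: timedelta = timedelta(hours=6),
    detail_table: str = "metrics_detail",    # hypothetical raw table
    aggregate_view: str = "metrics_hourly",  # hypothetical aggregate view
) -> str:
    """Return the detail table for short ranges, the aggregate view otherwise."""
    return detail_table if (end - start) < threshold else aggregate_view


def build_query(start: datetime, end: datetime):
    """Build the SQL text and its parameters for the given time range."""
    # The table name comes from a fixed set of names, never from user input;
    # the time bounds are passed as query parameters.
    table = pick_table(start, end)
    sql = (
        f"SELECT time, value FROM {table} "
        "WHERE time >= %s AND time < %s ORDER BY time"
    )
    return sql, (start, end)


end = datetime(2024, 1, 2, 12, 0)
short_sql, _ = build_query(end - timedelta(hours=1), end)
long_sql, _ = build_query(end - timedelta(days=2), end)
```

With these inputs, the one-hour range targets the detail table and the two-day range targets the aggregate view.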

Can I use grafana with a relational database not listed in the supported data source list?

I need to show metrics in real time but my metrics are stored in a relational database not supported by the datasources listed here https://grafana.com/docs/grafana/latest/http_api/data_source/
Can I somehow provide the JDBC (or other DB driver) to Grafana?
As @danielle clearly mentioned, "There is no direct support for JDBC or ODBC currently. You could get this data in time series form and into Grafana if you are prepared to do some programming.
The simple json data source is a generic backend that could make JDBC/ODBC calls to MapD and then transform the data into the right form for Grafana."
https://github.com/grafana/grafana/issues/8739#issuecomment-312118425
Though this comment is a bit old, I'm pretty sure there is still no out-of-the-box way to visualize data using JDBC/ODBC.
One possible approach makes use of two facts:
Grafana can access PostgreSQL.
PostgreSQL can transparently present data from other databases as though it were a PostgreSQL table, through Foreign Data Wrappers.
Doing it this way, you'd use PostgreSQL as a gateway to the data. Depending on the table structure, you might also need to create a view in PostgreSQL to shape the data to match Grafana's requirements for the PostgreSQL data source.

Is Grafana used for analyzing system metrics alone?

I am new to Grafana. I want to know whether Grafana is used only for monitoring system metrics.
1) If not: I have a PostgreSQL database with some live data in it. Can I use Grafana to access those Postgres tables directly, without any conversion such as to JSON?
2) If it is possible to access a Postgres database directly from Grafana, which data source can I use?
Please correct me if I am wrong.
Grafana can be used to visualize any time-series or metrics and not just system metrics.
PostgreSQL can be used through a datasource plugin: https://github.com/sraoss/grafana-sqldb-datasource (I haven't tried it out myself).
There's also a generic SQL datasource being developed. Here's the PR for your reference: https://github.com/grafana/grafana/pull/5364
I want to know whether grafana is used for only monitoring system metrics?
You can use grafana to display a lot of different metrics. I for example use grafana + influxdb to display different sensor values from my apartment.
Can i use the grafana for accessing those postgres tables directly into grafana
I am not sure about that. But if you take a look at the available data-sources LINK you will see that there is no PostgreSQL. So I think this is a no.