I am trying to create a Grafana dashboard for a large system. There are thousands of metadata variables which I need to store and access, e.g. SLAs for hundreds of applications. What is the best way to achieve this? My data source for logs and metrics is Elasticsearch.
Should I store the static data as an Elasticsearch index and query it along with the main data, or is it possible to store it in some other DB and access it together with the main Elasticsearch data?
tl;dr The best approach is to handle all metadata beforehand and feed Grafana only with indices that are ready for display.
The only source of data in Grafana is the 'data source'. There is no separate way to pull any sort of metadata into Grafana, especially with Elasticsearch (ES) as a data source, which is still fairly new to Grafana.
The best way to handle metadata is to keep it in an ES index, or to model your data together with the metadata using a transformation or ingest pipeline in ES. As suggested in the tl;dr, it is best to do all the correlation and transformation beforehand and let Grafana simply query indices to render graphs.
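As a minimal sketch of that ingest-time approach, assuming Elasticsearch 7.5 or later (which has enrich processors): keep the SLA metadata in its own index and merge it into each log/metric document as it is indexed, so Grafana only ever queries one enriched index. The index names ("app-metadata", "app-logs") and fields ("app_name", "sla", "owner") below are placeholders for illustration.

```python
import requests

ES = "http://localhost:9200"

# 1. Enrich policy: look up documents in the metadata index by app_name.
requests.put(f"{ES}/_enrich/policy/sla-policy", json={
    "match": {
        "indices": "app-metadata",
        "match_field": "app_name",
        "enrich_fields": ["sla", "owner"],
    }
})
requests.post(f"{ES}/_enrich/policy/sla-policy/_execute")

# 2. Ingest pipeline: copy the matching SLA metadata into an "app_meta" object
#    on every incoming document.
requests.put(f"{ES}/_ingest/pipeline/add-sla", json={
    "processors": [
        {"enrich": {"policy_name": "sla-policy",
                    "field": "app_name",
                    "target_field": "app_meta"}}
    ]
})

# 3. Index (or reindex) the main data through the pipeline.
requests.post(f"{ES}/app-logs/_doc?pipeline=add-sla", json={
    "app_name": "billing-service",
    "response_time_ms": 420,
})
```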
However, if you need aggregations to be performed on the data, Grafana does support that; see the official documentation.
Related
I am currently working on a project with InfluxDB, where I have to track and store several measurements. My issue: for all my measurements I also need to store some additional relational metadata (for reporting), which only changes once or twice a week. Moreover, I have some config data for my applications.
So my question: do you have any suggestions or experience with storing time-series and relational data like that? Should I try to store everything (even the relational data) within InfluxDB, or should I use an external relational DB for that?
Looking forward to your help!
Marko
I need to show metrics in real time, but my metrics are stored in a relational database that is not supported by the data sources listed here: https://grafana.com/docs/grafana/latest/http_api/data_source/
Can I somehow provide the JDBC (or other DB driver) to Grafana?
As #danielle clearly mentioned, "There is no direct support for JDBC or ODBC currently. You could get this data in time series form and into Grafana if you are prepared to do some programming.
The simple json data source is a generic backend that could make JDBC/ODBC calls to MapD and then transform the data into the right form for Grafana."
https://github.com/grafana/grafana/issues/8739#issuecomment-312118425
Though this comment is a bit old, I'm pretty sure there is still no out-of-the-box way to visualize data via JDBC/ODBC.
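For reference, the Simple JSON data source mentioned in that comment only needs a small HTTP backend that implements a handful of routes (/, /search, /query). Here is a minimal sketch in Python/Flask; the actual database access is left as a hypothetical helper, since it depends on whichever driver your relational database provides.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

@app.route("/")
def health():
    # Grafana calls this when you click "Save & Test" on the data source.
    return "OK"

@app.route("/search", methods=["POST"])
def search():
    # Return the metric names selectable in the panel editor.
    return jsonify(["cpu_load", "request_count"])

@app.route("/query", methods=["POST"])
def query():
    req = request.get_json()
    start = req["range"]["from"]   # ISO 8601 strings sent by Grafana
    end = req["range"]["to"]
    results = []
    for target in req["targets"]:
        # Here you would run the equivalent SQL through your DB driver, e.g.
        #   SELECT ts, value FROM metrics WHERE name = ? AND ts BETWEEN ? AND ?
        rows = fetch_rows(target["target"], start, end)  # hypothetical helper
        results.append({
            "target": target["target"],
            # Simple JSON expects [value, unix timestamp in milliseconds]
            "datapoints": [[value, ts_ms] for value, ts_ms in rows],
        })
    return jsonify(results)

def fetch_rows(metric, start, end):
    # Placeholder: replace with a real query against your database.
    return []

if __name__ == "__main__":
    app.run(port=8081)
```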
One possible approach makes use of two facts:
Grafana can access PostgreSQL.
PostgreSQL can transparently expose data in other databases as though it were a PostgreSQL table, through foreign data wrappers (FDWs).
Doing it this way, you'd use PostgreSQL as a gateway to the data. Depending on the table structure, you might also need to create a view in PG to shape the data to match the requirements of Grafana's PostgreSQL data source, as in the sketch below.
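A rough sketch of that gateway setup, driven from Python with psycopg2. The FDW extension name, its OPTIONS, and the table/column names are placeholders: they depend entirely on which foreign data wrapper exists for your particular database, and each wrapper defines its own option keys.

```python
import psycopg2

ddl = """
CREATE EXTENSION IF NOT EXISTS my_fdw;               -- placeholder: the FDW for your DB

CREATE SERVER legacy_db
    FOREIGN DATA WRAPPER my_fdw
    OPTIONS (host 'legacy-host', port '1521');       -- option names vary per FDW

CREATE USER MAPPING FOR grafana_reader
    SERVER legacy_db
    OPTIONS (user 'report', password 'secret');      -- option names vary per FDW

CREATE FOREIGN TABLE legacy_metrics (
    ts    timestamptz,
    name  text,
    value double precision
) SERVER legacy_db OPTIONS (table_name 'METRICS');   -- option names vary per FDW

-- Optional: a view shaped the way Grafana's PostgreSQL data source expects
-- (a time column plus metric/value columns).
CREATE OR REPLACE VIEW grafana_metrics AS
    SELECT ts AS "time", name AS metric, value FROM legacy_metrics;
"""

# The gateway database and role are placeholders as well.
with psycopg2.connect("dbname=gateway user=postgres") as conn:
    with conn.cursor() as cur:
        cur.execute(ddl)
```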
We want to use Grafana to show measurement data. Our measuring setup creates a huge amount of data that is saved to files. We keep the files as-is and do post-processing on them directly with Spark (the "data lake" approach).
We now want to create some visualization, and I thought of setting up Cassandra on the cluster running Spark and HDFS (where the files are stored). There will be a service (or Spark Streaming job) that dumps selected channels from the measurement data files to a Kafka topic, and another job that puts them into Cassandra. I use this approach because we have other stream-processing jobs that do on-the-fly calculations as well.
I now thought of writing a small REST service that makes Grafana's Simple JSON datasource usable to pull the data in and visualize it. So far so good, but as the amount of data we are collecting is huge (sometimes about 300MiB per minute) the Cassandra database should only hold the most recent few hours of data.
My question now is: if someone looks at the data, finds something interesting, and creates a snapshot of a dashboard or panel (or a certain event occurs and a snapshot is taken automatically), and the original data is later deleted from Cassandra, can the snapshot still be viewed? Is the data saved with it? Or does the snapshot only save metadata, with the data source queried anew?
According to Grafana docs:
Dashboard snapshot
A dashboard snapshot is an instant way to share an interactive dashboard publicly. When created, we strip sensitive data like queries (metric, template and annotation) and panel links, leaving only the visible metric data and series names embedded into your dashboard. Dashboard snapshots can be accessed by anyone who has the link and can reach the URL.
So the data is saved inside the snapshot and no longer depends on the original data.
As far as I understand, a local snapshot is stored in the Grafana DB. At your data scale, using external storage (WebDAV, etc.) for snapshots may be a better option.
I connected my SonarQube server to my Postgres DB; however, when I view the "metrics" table, it lacks the actual values of the metrics.
Those are all the columns I get, which are not particularly helpful. How can I get the actual values of the metrics?
My end goal is to obtain metrics such as duplicate code, function size, complexity, etc. for my projects. I understand I could also use the REST API to do this; however, another application I am using will need a DB to extract data from.
As far as I know, connecting to the DB just lets SonarQube store its data; it is not meant for displaying data.
You can check the stored data in SonarQube's GUI:
Click on the project
Click on Activity
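Since the question also mentions the REST API, here is a short sketch of pulling measure values through SonarQube's web API rather than reading the tables directly. The project key and token below are placeholders.

```python
import requests

SONAR_URL = "http://localhost:9000"
TOKEN = "your-user-token"        # generated under My Account > Security
PROJECT_KEY = "my-project"       # placeholder project key

resp = requests.get(
    f"{SONAR_URL}/api/measures/component",
    params={
        "component": PROJECT_KEY,
        "metricKeys": "duplicated_lines_density,complexity,functions,ncloc",
    },
    auth=(TOKEN, ""),            # token goes in the username field, empty password
)
resp.raise_for_status()

for measure in resp.json()["component"]["measures"]:
    print(measure["metric"], measure["value"])
```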
I am using the ELK stack (my first project) to centralize logs from a server and visualize some real-time statistics with Kibana. The logs are stored in an ES index, and I have another index with user information (IP, name, demographics). I am trying to:
1. Join the user information with the server logs, matching on IP. I want to include this information in the Kibana dashboard (e.g. to show the usernames of the connected users in real time).
2. Create new indexes with filtered and processed information (e.g. users that have visited a certain URL more than 3 times).
What is the best design to solve those problems (e.g. include the username at the Logstash stage through a filter, run scheduled jobs, ...)? If the processing task (2) gets more complex, would it be better to use MongoDB instead?
Thank you!
I recently wanted to cross-reference some log data with user data (containing IPs among other fields) and just used Elasticsearch's bulk import API. This meant extracting the data from an RDBMS, converting it to JSON, and outputting a flat file that adhered to the format expected by the bulk import API (basically prefixing each document with a row that describes the index and type), as sketched below.
That should work for an initial import; the deltas could then be captured using triggers in whatever stores your user data. You might simply write them to a flat file and process it like your other logs. Another option might be the JDBC River.
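For illustration, a small sketch of that export step: pull the user rows out of the RDBMS and write them as newline-delimited JSON in the bulk API's action/source format. The table, field, and index names are made up, and sqlite3 stands in for whatever RDBMS driver you actually use.

```python
import json
import sqlite3  # stand-in for your real RDBMS driver

conn = sqlite3.connect("users.db")
rows = conn.execute("SELECT ip, name, age, country FROM users")

with open("users_bulk.json", "w") as out:
    for ip, name, age, country in rows:
        # Action/metadata line: which index (and, on older ES, type) to write to.
        out.write(json.dumps({"index": {"_index": "users", "_type": "user"}}) + "\n")
        # Source document line.
        out.write(json.dumps({"ip": ip, "name": name,
                              "age": age, "country": country}) + "\n")

# Then feed the file to the bulk endpoint, e.g.:
#   curl -s -XPOST localhost:9200/_bulk --data-binary @users_bulk.json
# (newer ES versions also need -H 'Content-Type: application/x-ndjson')
```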
I am also interested to know where the data is stored originally (a DB, pushed straight from a server, ...). However, I initially used the ELK stack to pull data back from a DB server using a batch file utilizing BCP (running as a scheduled task), storing it to a flat file, monitoring the file with Logstash, and manipulating the data inside the Logstash config (grok filter). You may also consider a simple console/web application to manipulate the data before grokking it with Logstash.
If possible, I would attempt to pull your data via SQL Server SPROC/BCP command and match the returned, complete message within Logstash. You can then store the information in a single index.
I hope this helps as I am by no means an expert, but I will be happy to answer more questions for you if you get a little more specific with the details of your current data storage; namely how the data is entering Logstash. RabbitMQ is another valuable tool to take a look at for your input source.
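To make the "simple console application" idea mentioned above concrete, here is one possible sketch: normalize the flat file produced by the scheduled BCP export into one clean, delimited line per record before Logstash picks it up, so the grok/csv filter has an easier job. The file names and column layout are assumptions.

```python
import csv

BCP_EXPORT = "metrics_export.txt"      # file produced by the scheduled BCP task
LOGSTASH_INPUT = "metrics_clean.log"   # file Logstash is configured to watch

with open(BCP_EXPORT, newline="") as src, open(LOGSTASH_INPUT, "a") as dst:
    reader = csv.reader(src, delimiter="\t")   # BCP's default field terminator
    for row in reader:
        if len(row) != 4:
            continue  # skip ragged/partial rows instead of letting grok fail
        timestamp, host, metric, value = (field.strip() for field in row)
        # One predictable line per record, e.g.:
        # 2016-01-01T00:00:00|web01|cpu_load|0.73
        dst.write(f"{timestamp}|{host}|{metric}|{value}\n")
```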