I have a Prometheus dashboard that displays "No Data" when I view it from a remote machine with the time range set to 6 hours or less. The issue does not happen on the local machine. The Grafana and Prometheus services are both running on the same CentOS 7 server. I've tested this from both Windows and Linux machines.
Edit: Setting the range to 12 hours still doesn't populate the data properly, but 24 hours and more does.
I have many Java microservices running in a Kubernetes cluster. All of them have APM agents sending data to an APM server in our Elastic Cloud cluster.
Everything was working fine, but suddenly every microservice started showing the error below in its logs.
I tried restarting the cluster, scaling up the hardware, and following the hints, but with no success.
Note: the disk is almost empty and the memory usage is OK.
Everything is on version 7.5.2.
I deleted all the APM-related indexes, and everything worked again after a few minutes.
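For reference, a deletion like the one described can be done with the Elasticsearch Delete Index API; a minimal sketch, assuming the default apm-* index pattern and a locally reachable cluster (adjust the URL and credentials for an Elastic Cloud deployment, and note this is destructive):
curl -X DELETE "http://localhost:9200/apm-*"
New APM indexes are then created from the index templates as fresh events arrive, which would explain why things recovered after a few minutes.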
For better performance you can fine-tune these fields in the apm-server.yml file:
queue.mem.events: the internal queue size. Increase it so that queue.mem.events = output.elasticsearch.worker * output.elasticsearch.bulk_max_size. The default is 4096.
output.elasticsearch.worker: increase it. The default is 1.
output.elasticsearch.bulk_max_size: increase it. The default of 50 is very low.
Example: for my use case I used the following settings for 2 apm-server nodes and 3 Elasticsearch nodes (1 master, 2 data nodes):
queue.mem.events=40000
output.elasticsearch.worker=4
output.elasticsearch.bulk_max_size=10000
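For reference, in apm-server.yml those settings would look roughly like this (flat key syntax; the values are just the ones from the example above, so tune them for your own load):
# apm-server.yml
queue.mem.events: 40000
output.elasticsearch.worker: 4
output.elasticsearch.bulk_max_size: 10000
Note that 4 workers * 10000 bulk_max_size matches the queue size of 40000, per the formula above.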
I'm new to IBM Cloud and did not know how to stop the IBM Analytics Engine. I received a mail telling me it would be suspended, with little time remaining. I did not see anywhere to stop it, so I deleted the instance. When I tried to create another engine, I got a message telling me I have to wait about 30 days.
I'm using a Lite account with credit from cognitiveclass (245 days duration).
My question is: is it possible to retrieve my instance by contacting support?
Once a service instance is deleted, the underlying cluster is also deleted, and all data and metadata on the cluster, including all logs, are lost.
Also, here are a few restrictions of the Lite plan:
Maximum of one tile per IBM Cloud account every 30 days.
Maximum of one cluster with up to 3 compute nodes.
Free usage limit is 50 node hours. After 50 node hours, the cluster will be disabled. This means, for example, that a cluster with 4 nodes (3 compute node and 1 management node) will be disabled after 12.5 hours. While the cluster is disabled, it cannot be scaled up or customized.
A grace period of 24 hours is given to upgrade your user account to a paid account, and to upgrade the service instance to the Standard-Hourly plan.
If the service instance is not upgraded, then it will expire and be deleted.
Note: You are entitled to one service instance per month. If you delete the service instance or it expires after the free 50 node hours, you will not be able to create a new one until after the month has passed.
Check this link for the other supported plans.
I have a 5-node MariaDB/Galera cluster running in a production environment.
I also have a monitor that checks for cluster size changes every 20 seconds. One of our other engineers has been running queries using MySQL Workbench, and when that application is running, I start seeing alerts from my monitor reporting a cluster size of 1. It recovers back to the correct size of 5 within a few seconds, but it's disconcerting that this client app is causing issues on the cluster. I've asked everyone on our team not to use this app; however, I wonder if anyone else has seen this, or knows what it is doing to the cluster.
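For reference, a cluster-size check of the kind this monitor performs can be as simple as the following status query (connection options omitted):
mysql -e "SHOW GLOBAL STATUS LIKE 'wsrep_cluster_size';"
A healthy node here should report a value of 5; the alerts correspond to it momentarily reporting 1.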
For the past couple of weeks, the reported uptimes for most of my pods have been incorrect and reset to 0 frequently, but at a random rate (sometimes after a couple of minutes or seconds, sometimes after a couple of hours).
The data is sunk to InfluxDB and displayed with Grafana.
Here is a screenshot of the uptime of some MongoDB nodes over a week (none of them have restarted). Only the blue line (node-2) is correct; all the others are reset randomly.
Versions:
kubernetes: 1.8.3
heapster: 1.4.3 amd64
influxdb: 1.1.1 amd64
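For context, heapster's InfluxDB sink in a setup like this is usually wired up with a flag along these lines (the service URL is an assumption, not taken from the cluster above):
--sink=influxdb:http://monitoring-influxdb.kube-system.svc:8086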
Any idea of what is going wrong?
I have a VM (vm1) on which I installed everything I needed, and it runs a cron job every 5 hours, let's say.
Using a snapshot of this VM instance, I then create many more virtual machines.
Now, how do I ensure that my pre-scheduled "every 5 hours" cron job runs on all VMs at the same time? I want them to start at the same time, but I am not sure how to synchronize the clock/time on all the VMs. Any pointers? My VMs are running CentOS 7.
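For reference, a minimal sketch of the kind of crontab entry involved (the script path is hypothetical):
# runs at 00:00, 05:00, 10:00, 15:00 and 20:00 on each VM's local clock
0 */5 * * * /opt/jobs/run-task.sh
Since cron fires on each machine's local clock, identical entries only line up across VMs if the clocks themselves are synchronized, which is what the answer below addresses.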
The solution that worked for me was:
1) Fix the network.
2) Since my VMs were running CentOS, I ended up using chrony:
yum install chrony
systemctl enable chronyd
systemctl start chronyd
Check with:
chronyc tracking
chronyc sources
and make sure you can ping the sources.
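For example, with the default CentOS pool servers (your chrony.conf sources may differ):
ping -c 3 0.centos.pool.ntp.org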