Grafana upgrade causes error in rolling up dashboard - grafana

I upgraded Grafana from version 8.1.8 to 8.5.20 and after I did, I started seeing these errors in the logs about rolling up dashboards:
logger=analytics.summaries t=2023-02-09T21:41:06.67+0000 lvl=eror msg="error during daily rollup" err="context deadline exceeded"
logger=analytics.summaries t=2023-02-09T21:41:06.67+0000 lvl=eror msg="got error while rolling up dashboard" dashboard=8928
There isn't any other information in the logs and I'm not sure why this is occurring. What does it mean to roll up a dashboard, and how would I resolve this error?
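In case someone suggests turning up logging: this is the grafana.ini change I was thinking of trying, assuming the [log] filters setting accepts the analytics.summaries logger name shown in the errors above (that mapping is my guess, and it may not produce any extra output):
[log]
level = info
# raise verbosity only for the rollup logger from the error lines above
filters = analytics.summaries:debug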

Related

[ERROR][o.o.a.a.AlertIndices] info deleteOldIndices

I am running an OpenSearch 2.3 cluster, and in the logs I can see the following error:
[2023-02-13T03:37:44,711][ERROR][o.o.a.a.AlertIndices ] [opensearch-node1] info deleteOldIndices
What triggers this error? I've never set up any alerts. In the past I did have some ISM policies for the indices on this cluster, but I removed them all; could that be linked to the error I am seeing?
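For context, here is the check I ran to see which alerting indices exist on the cluster; the .opendistro-alert* pattern and the admin credentials are assumptions based on the default alerting index names and a default security setup, so adjust for your environment:
curl -s -k -u admin:admin 'https://localhost:9200/_cat/indices/.opendistro-alert*?v&expand_wildcards=all'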
Thanks.

Azure AKS fluxconfig-agent getting 401 and reporting unhealthy

I have an AKS environment based on the AKS-Construction templates.
At some point fluxconfig-agent started reporting unhealthy. I checked the logs, and it looks like there is a 401 when it tries to fetch config from https://eastus.dp.kubernetesconfiguration.azure.com:
{"Message":"2022/10/03 17:09:01 URL:\u003e https://eastus.dp.kubernetesconfiguration.azure.com/subscriptions/xxx/resourceGroups/my-aks/provider/Microsoft.ContainerService-managedclusters/clusters/my-aks/configurations/getPendingConfigs?api-version=2021-11-01","LogType":"ConfigAgentTrace","LogLevel":"Information","Environment":"prod","Role":"ClusterConfigAgent","Location":"eastus","ArmId":"/subscriptions/xxx/resourceGroups/my-aks/providers/Microsoft.ContainerService/managedclusters/my-aks","CorrelationId":"","AgentName":"FluxConfigAgent","AgentVersion":"1.6.0","AgentTimestamp":"2022/10/03 17:09:01"}
{"Message":"2022/10/03 17:09:01 GET configurations returned response code {401}","LogType":"ConfigAgentTrace","LogLevel":"Information","Environment":"prod","Role":"ClusterConfigAgent","Location":"eastus","ArmId":"/subscriptions/xxx/resourceGroups/my-aks/providers/Microsoft.ContainerService/managedclusters/my-aks","CorrelationId":"","AgentName":"FluxConfigAgent","AgentVersion":"1.6.0","AgentTimestamp":"2022/10/03 17:09:01"}
{"Message":"2022/10/03 17:09:01 Failed to GET configurations with ResponseCode : {401}","LogType":"ConfigAgentTrace","LogLevel":"Information","Environment":"prod","Role":"ClusterConfigAgent","Location":"eastus","ArmId":"/subscriptions/xxx/resourceGroups/my-aks/providers/Microsoft.ContainerService/managedclusters/my-aks","CorrelationId":"","AgentName":"FluxConfigAgent","AgentVersion":"1.6.0","AgentTimestamp":"2022/10/03 17:09:01"}
{"Message":"Error in the getting the Configurations: error {%!s(\u003cnil\u003e)}","LogType":"ConfigAgentTrace","LogLevel":"Error","Environment":"prod","Role":"ClusterConfigAgent","Location":"eastus","ArmId":"/subscriptions/xxx/resourceGroups/my-aks/providers/Microsoft.ContainerService/managedclusters/my-aks","CorrelationId":"","AgentName":"FluxConfigAgent","AgentVersion":"1.6.0","AgentTimestamp":"2022/10/03 17:09:01"}
{"Message":"2022/10/03 17:09:01 \"Errorcode: 401, Message Unauthorized client credentials., Target /subscriptions/xxx/resourceGroups/my-aks/provider/Microsoft.ContainerService-managedclusters/clusters/my-aks/configurations/getPendingConfigs\"","LogType":"ConfigAgentTrace","LogLevel":"Information","Environment":"prod","Role":"ClusterConfigAgent","Location":"eastus","ArmId":"/subscriptions/xxx/resourceGroups/my-aks/providers/Microsoft.ContainerService/managedclusters/my-aks","CorrelationId":"","AgentName":"FluxConfigAgent","AgentVersion":"1.6.0","AgentTimestamp":"2022/10/03 17:09:01"}
Is anyone here familiar with how fluxconfig-agent authenticates and what might cause a 401 here?
It seems to have gone away for now after upgrading my AKS cluster and nodes to the latest Kubernetes version.
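For anyone else debugging this, these are the checks I was running while it was failing; the resource group and cluster names are taken from the ArmId in the logs above, so substitute your own, and the second command needs the k8s-extension CLI extension installed:
az provider show -n Microsoft.KubernetesConfiguration --query registrationState -o tsv
az k8s-extension list --cluster-name my-aks --resource-group my-aks --cluster-type managedClusters -o table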

EC2 metadata upgrade from IMDSv1 to IMDSv2 causes 403 and 401 errors - kube2iam

I recently updated my EC2 instances to use IMDSv2 but had to roll back because of the following issue:
It looks like after I did the upgrade my init containers started failing, and I saw the following in the logs:
time="2022-01-11T14:25:01Z" level=info msg="PUT /latest/api/token (403) took 0.753220 ms" req.method=PUT req.path=/latest/api/token req.remote=XXXXX res.duration=0.75322 res.status=403
time="2022-01-11T14:25:37Z" level=error msg="Error getting instance id, got status: 401 Unauthorized"
We are using kube2iam to provide IAM roles to the pods. Any advice on what changes need to be made on the kube2iam side to support IMDSv2? Below is some info from my kube2iam daemonset:
EKS =1.21
image = "jtblin/kube2iam:0.10.9"
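For reference, this is roughly the command I used to enforce IMDSv2 (the instance id is a placeholder). I'm also wondering whether --http-put-response-hop-limit matters here, since AWS documents raising it for containerized workloads; treat that as an open question rather than a confirmed fix:
aws ec2 modify-instance-metadata-options \
  --instance-id i-0123456789abcdef0 \
  --http-tokens required \
  --http-put-response-hop-limit 2 \
  --http-endpoint enabled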

Kubernetes audit log showing 404 not found on event

I'm seeing the following log continuously in the Kubernetes audit log file.
Can anyone explain what this error is and what causes it?
{
"kind":"Event",
"apiVersion":"audit.k8s.io/v1beta1",
"metadata":{"creationTimestamp":"2018-08-29T06:59:04Z"},
"level":"Request",
"timestamp":"2018-08-29T06:59:04Z",
"auditID":"97187fc8-76c1-42f0-9435-c11928b6ec49",
"stage":"ResponseComplete",
"requestURI":"/apis/admissionregistration.k8s.io/v1alpha1/initializerconfigurations",
"verb":"list",
"user":{"username":"system:apiserver","uid":"44rrd678-859a-4f663-bt79-23bar678uj66","groups":["system:masters"]},
"sourceIPs":["X.X.X.X"],
"objectRef":{"resource":"initializerconfigurations","apiGroup":"admissionregistration.k8s.io","apiVersion":"v1alpha1"},
"responseStatus":{"metadata":{},"status":"Failure","reason":"NotFound","code":404},
"requestReceivedTimestamp":"2018-08-29T06:59:04.350346Z",
"stageTimestamp":"2018-08-29T06:59:04.350425Z",
"annotations":{"authorization.k8s.io/decision":"allow","authorization.k8s.io/reason":""}
}
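Adding what I found so far, in case it narrows things down: the event shows system:apiserver listing the alpha initializerconfigurations resource, so I checked which admissionregistration versions the cluster actually serves, and sketched an audit-policy rule that would stop logging just that request (the rule below is my own guess at how to phrase it; match the apiVersion and catch-all level to whatever your audit policy already uses):
kubectl api-versions | grep admissionregistration
apiVersion: audit.k8s.io/v1beta1
kind: Policy
rules:
# drop the noisy list requests for the alpha initializerconfigurations resource
- level: None
  users: ["system:apiserver"]
  resources:
  - group: admissionregistration.k8s.io
    resources: ["initializerconfigurations"]
# keep a catch-all so everything else is still audited as before
- level: Request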

fluentd isn't shipping logs to Stackdriver

I have an application deployed on Kubernetes on GKE.
Kubernetes version: v1.7.11-gke.1
Stackdriver Logging is enabled on my cluster.
fluentd-gcp image on my cluster (the default):
gcr.io/google-containers/fluentd-gcp:2.0.9
My logs were all fine and visible in Stackdriver, but a few days ago logs from one deployment (let's call it my-app) stopped arriving in Stackdriver, even though the app is still logging them:
kubectl logs -f my-app-3270987706-cx0r2 --namespace=production
{"time":"2018-01-30 16:11:13.155","msg":"ignoring xml"}
{"time":"2018-01-30 16:11:14.155","msg":"success blabla"}
I see the following logs from fluentd:
2018-01-30 16:11:46 +0000 [warn]: emit transaction failed: error_class=Errno::ENOENT error="No such file or directory # sys_fail2 - (/var/log/fluentd-buffers/kubernetes.system.buffer..b563203c1da7cb5e1.log, /var/log/fluentd-buffers/kubernetes.system.buffer..q563203c1da7cb5e1.log)" tag="docker"
2018-01-30 16:11:46 +0000 [warn]: suppressed same stacktrace
2018-01-30 16:11:46 +0000 [error]: Exception emitting record: No such file or directory # sys_fail2 - (/var/log/fluentd-buffers/kubernetes.system.buffer..b563203c1da7cb5e1.log, /var/log/fluentd-buffers/kubernetes.system.buffer..q563203c1da7cb5e1.log)
Why aren't the logs shipped to Stackdriver, and how can I fix it?
Edit:
I'll note that the logs of other apps do appear in Stackdriver.
The logs of the failing app are very big - maybe that's why they fail to log?
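In case it helps, the next thing I plan to try is recreating the fluentd-gcp pods so they reopen their buffer files under /var/log/fluentd-buffers; the k8s-app=fluentd-gcp label is what the DaemonSet uses on my cluster, so treat it as an assumption for yours:
kubectl -n kube-system get pods -l k8s-app=fluentd-gcp
# deleting the pods lets the DaemonSet recreate them with fresh buffers
kubectl -n kube-system delete pods -l k8s-app=fluentd-gcp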