Prometheus is not able to talk to InfluxDB - Kubernetes

I am running Prometheus as a Kubernetes pod and want Prometheus to write data to InfluxDB. I have added the following entries to prometheus.yml:
remote_read:
- url: "http://localhost:8086/api/v1/prom/write?u=xxxxxx&p=ids3pr0m&db=xxxxxx"
remote_write:
- url: "http://localhost:8086/api/v1/prom/read?u=xxxxxx&p=ids3pr0m&db=xxxxxx"
The pod is running fine and is able to read, but it keeps giving me the error below:
time="2018-05-03T17:38:31Z" level=warning msg="Error sending 100 samples to remote storage: server returned HTTP status 400 Bad Request: {"error":"proto: wrong wireType = 2 for field StartTimestampMs"}" source="queue_manager.go:500"
time="2018-05-03T17:38:31Z" level=warning msg="Error sending 100 samples to remote storage: server returned HTTP status 400 Bad Request: {"error":"proto: wrong wireType = 2 for field StartTimestampMs"}" source="queue_manager.go:500"
time="2018-05-03T17:38:31Z" level=warning msg="Error sending 100 samples to remote storage: server returned HTTP status 400 Bad Request: {"error":"proto: wrong wireType = 2 for field StartTimestampMs"}" source="queue_manager.go:500"
Can someone help me with this?

Ran into this as well, and in my case it was caused by using Prometheus 2.x.
It looks like InfluxDB only supports Prometheus version 1.8.
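If you want to double-check which InfluxDB version you are actually running before changing anything, InfluxDB 1.x reports it in a response header of the /ping endpoint. A quick check, assuming InfluxDB is reachable on localhost:8086 as in the config above:
curl -i http://localhost:8086/ping
# the response should contain a header such as:
# X-Influxdb-Version: 1.8.x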

Related

failed to do request: Head "https://192.168.56.2:5000/v2/ubn_mysql/manifests/latest": http: server gave HTTP response to HTTPS client

Warning Failed 22s (x2 over 37s) kubelet Failed to pull image "192.168.56.2:5000/ubn_mysql:latest": rpc error: code = Unknown desc = failed to pull and unpack image "192.168.56.2:5000/ubn_mysql:latest": failed to resolve reference "192.168.56.2:5000/ubn_mysql:latest": failed to do request: Head "https://192.168.56.2:5000/v2/ubn_mysql/manifests/latest": http: server gave HTTP response to HTTPS client
Getting the above error while creating a pod in Kubernetes. I am able to pull this image on the worker nodes using
"docker pull 192.168.56.2:5000/ubn_mysql:latest" from the registry 'http://192.168.56.2:5000/ubn_mysql:latest', but the issue appears only when creating the pod.
I already have entries in the files below:
vagrant@kubemaster:~/pods_yaml$ cat /etc/docker/daemon.json
{"insecure-registries" : ["192.168.56.2:5000"]}
vagrant@kubemaster:~/pods_yaml$ cat /etc/default/docker
DOCKER_OPTS="--config-file=/etc/docker/daemon.json"
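Note that daemon.json only configures Docker itself. The wording of the pull error ("rpc error: code = Unknown desc = failed to pull and unpack image ...") comes from the CRI runtime, which on many clusters is containerd rather than Docker, and containerd does not read /etc/docker/daemon.json. If that is the case here, the registry also has to be declared as plain HTTP in /etc/containerd/config.toml, roughly like this (a sketch for a default containerd install; adjust the plugin section name to your containerd version):
# /etc/containerd/config.toml
[plugins."io.containerd.grpc.v1.cri".registry.mirrors."192.168.56.2:5000"]
  endpoint = ["http://192.168.56.2:5000"]
Then restart containerd (and kubelet) on every node, e.g. systemctl restart containerd, and retry creating the pod.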

Prometheus --> Getting "Server returned HTTP status 400 Bad Request" attempting scraping

I have a Grafana Agent with Kube State Metrics in an EKS cluster, and I am getting this error when attempting to scrape a kubelet.
Any reason why such an error can appear?
ts=2022-11-14T19:42:49.643418574Z caller=scrape.go:1302
level=debug agent=prometheus instance=6adb012c62a4c8a8139b7fbf04a08668 component="scrape manager"
scrape_pool=integrations/kubernetes/kubelet target=http://kubernetes.default.svc.cluster.local:443/api/v1/nodes/ip-192-168-0-8.ec2.internal/proxy/metrics
msg="Scrape failed"
err="server returned HTTP status 400 Bad Request"

REST API slow response on production server

I have a Spring Boot application with Java on the backend and React.js on the frontend. Some API calls are too slow in the production environment (a request takes about 1 minute in production versus 4 ms locally). The slow APIs are not fetching any large data set. While trying to debug the code I found the following Nginx error log entries:
[error] 29755#29755: *632803 upstream timed out (110: Connection timed out) while connecting to upstream, client: 172.XX.X.XX, server: test.apps.com, request: "GET /apiv1/master/modules?&isglobal=all&startIndex=0&pageSize=10&sortBy=id HTTP/1.1", upstream: "https://172.XX.X.XX:443/mhk-cmt-app/master/modules?&isglobal=all&startIndex=0&pageSize=10&sortBy=id", host: "test.apps.com", referrer: "https://test.apps.com/admin/app-setting"
How can I improve the API response time in production?
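The log line shows Nginx timing out while connecting to the upstream, so the real fix is on the backend (slow queries, exhausted connection or thread pools, and similar). If you only want Nginx to wait longer instead of returning an error while you investigate, these are the relevant proxy timeout directives, shown with illustrative values only (the location path and upstream name are placeholders):
location /apiv1/ {
    proxy_connect_timeout 60s;   # time allowed to establish the upstream connection
    proxy_send_timeout    120s;  # max time between two successive writes to the upstream
    proxy_read_timeout    120s;  # max time between two successive reads from the upstream
    proxy_pass https://backend_app;
}
Raising these numbers only hides the symptom; a request that takes a minute in production and 4 ms locally usually points at the environment (database, network path, connection pools) rather than the code itself.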

Increase the upload limit of HAProxy

When using HAProxy, I've been getting the error 413: Request Entity Too Large.
This error occurs when I try to upload files that are too large; however, I cannot find any documentation on how to increase this limit.
How can you increase the maximum upload limit to a specified number of MB?
This is not an HAProxy error; as you can see here http://cbonte.github.io/haproxy-dconv/configuration-1.7#1.3.1, the 413 error is not in the list of status codes HAProxy generates itself.
So this is probably an error returned by the backend server, and HAProxy is just "forwarding" it to the client.
To be 100% sure, you can check the logs:
An error returned by HAProxy:
127.0.0.1:35494 fe_main be_app/websrv1 0/0/-1/-1/3002 503 212 - - SC-- 0/0/0/0/3 0/0 "GET /test HTTP/1.1"
An error returned by the backend server:
127.0.0.1:39055 fe_main be_app/websrv2 0/0/0/0/0 404 324 - - --NI 1/1/0/1/0 0/0 "GET /test HTTP/1.1"
Notice the "-1" values in the timers of the error returned by HAProxy.
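In other words, the limit has to be raised on whatever sits behind HAProxy. As an illustration only, if the backend happened to be Nginx, the 413 would come from its default 1 MB body limit and could be raised with:
# in the http, server, or location block of the backend's nginx configuration
client_max_body_size 100m;    # allow request bodies up to 100 MB
For other backends (Apache, an application server, etc.) the equivalent body-size setting is what needs changing; HAProxy itself does not reject large request bodies by default.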

Meteor Error: write after end

EDIT
It seems that the second server DOES occasionally get this error, which makes me nearly certain it's a config problem. Could it be one of these settings?
net.ipv4.tcp_fin_timeout = 2
net.ipv4.tcp_tw_recycle = 1
net.ipv4.tcp_tw_reuse =1
Version information, as requested: Meteor: 1.5.0
OS: Ubuntu 16.04
Provider: AWS EC2
I'm getting the following error, intermittently and seemingly randomly, on both processes running on one server (of a pair). The other server never gets this error, and the error doesn't refer to any code I've written, so I can only assume it's (a) a bug in Meteor or (b) a bug in my server config. The server whose processes are crashing also hosts two other Meteor sites, both of which occasionally get this error:
Error: write after end
at writeAfterEnd (_stream_writable.js:167:12)
at PassThrough.Writable.write (_stream_writable.js:212:5)
at IncomingMessage.ondata (_stream_readable.js:542:20)
at emitOne (events.js:77:13)
at IncomingMessage.emit (events.js:169:7)
at IncomingMessage.Readable.read (_stream_readable.js:368:10)
at flow (_stream_readable.js:759:26)
at resume_ (_stream_readable.js:739:3)
at nextTickCallbackWith2Args (node.js:511:9)
at process._tickDomainCallback (node.js:466:17)
Things I've already checked:
memory limits (nowhere near close)
connection limits - very small, around 20 per server at the time of failure, and the processes were bumped to the second server within 1 minute, which handled them plus its own just fine
process limits - both processes on server 1 failed within 7 minutes of each other
server config - while I was trying to eke out a little extra performance during load testing, I modified sysctl.conf based on a post I saw about high-load Node.js servers. This is the contents of the faulty server's sysctl.conf; however, the functioning server has an identical config:
fs.file-max = 1000000
fs.nr_open = 1000000
ifs.file-max = 70000
net.nf_conntrack_max = 1048576
net.ipv4.netfilter.ip_conntrack_max = 32768
net.ipv4.tcp_fin_timeout = 2
net.ipv4.tcp_max_orphans = 8192
net.ipv4.ip_local_port_range = 16768 61000
net.ipv4.tcp_max_syn_backlog = 10024
net.ipv4.tcp_max_tw_buckets = 360000
net.core.netdev_max_backlog = 2500
net.ipv4.ip_local_port_range = 1024 65535
net.ipv4.tcp_tw_recycle = 1
net.ipv4.tcp_tw_reuse =1
net.core.somaxconn = 20048
I have an NGINX balancer on server 1 which load balances across the 4 processes (2 per server). The NGINX error log is littered with lines like the following:
2017/08/17 16:15:01 [warn] 1221#1221: *6233472 an upstream response is buffered to a temporary file /var/lib/nginx/proxy/1/46/0000029461 while reading upstream, client: 164.68.80.47, server: server redacted, request: "GET path redacted HTTP/1.1", upstream: "path redacted", host: "host redacted", referrer: "referrer redacted"
At the time of the error, I see a pair of lines like this:
2017/08/17 15:07:19 [error] 1222#1222: *6215301 connect() failed (111: Connection refused) while connecting to upstream, client: ip redacted, server: server redacted, request: "GET /admin/sockjs/info?cb=o2ziavvsua HTTP/1.1", upstream: "http://127.0.0.1:8080/admin/sockjs/info?cb=o2ziavvsua", host: "hostname redacted", referrer: "referrer redacted"
2017/08/17 15:07:19 [warn] 1222#1222: *6215301 upstream server temporarily disabled while connecting to upstream, client: ip redacted, server: server redacted, request: "GET /admin/sockjs/info?cb=o2ziavvsua HTTP/1.1", upstream: "http://127.0.0.1:8080/admin/sockjs/info?cb=o2ziavvsua", host: "hostname redacted", referrer: "referrer redacted"
If it matters at all, I'm using a 3-node MongoDB replica set, with both servers pointing at all 3 nodes.
I'm also using a custom-hosted version of Kadira (since the hosted service went offline).
If there is no way to stop the errors, is there any way to stop them from taking down the entire process? There are times when 50-100 users are connected per process; booting them all because of one error seems excessive.
It's been two days without a crash, so I think the solution was changing:
net.ipv4.tcp_fin_timeout = 2
net.ipv4.tcp_tw_recycle = 1
net.ipv4.tcp_tw_reuse = 1
to
net.ipv4.tcp_fin_timeout = 15
net.ipv4.tcp_tw_recycle = 0
net.ipv4.tcp_tw_reuse = 0
I don't know which of those was causing the problem (probably the timeout). I still think it's a "bug" that a single "write after end" error crashes the entire Meteor process. Perhaps this should simply be logged.
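As for keeping a single "write after end" from killing the whole process in the meantime, a process-level handler is the usual blunt instrument. This is only a sketch using the standard Node.js API; whether swallowing an unknown error is safe for your app is a judgment call, and the generally recommended pattern is still to let a supervisor restart the process:
process.on('uncaughtException', function (err) {
  // log otherwise-fatal errors such as "write after end" instead of exiting
  console.error('Uncaught exception:', err.stack || err);
  // optionally report to monitoring (e.g. the self-hosted Kadira) here
});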