Spring Boot Admin – high CPU usage on client

I have one application that uses the Spring Boot Admin Server library (v2.2.1) to monitor other applications with the Spring Boot Admin Client library (v2.2.1) integrated. It works very well; the server tracks the status of the client applications with low performance impact.
Nevertheless, when I open the Insight -> Details page (the default page), the CPU usage of the client application grows to 90–100%, which causes my application (running on a single-CPU system) to respond very slowly. Other pages of Spring Boot Admin Server are fine.
From my observation, the high CPU usage is caused by the frequent refresh (every second) of the information on that page, especially the charts. In my case, Spring Boot Admin sends about 16 requests per second, and processing each one takes about 400 ms (the last number in each log line):
2020-03-11 08:31:45.290 127.0.0.1 - GET "/actuator/metrics/cache.gets?tag=name:ui-logbook,result:miss" 200 HTTP/1.0 308 421
2020-03-11 08:31:45.291 127.0.0.1 - GET "/actuator/metrics/cache.gets?tag=name:area-services,result:miss" 200 HTTP/1.0 312 438
2020-03-11 08:31:45.291 127.0.0.1 - GET "/actuator/metrics/cache.gets?tag=name:area-services,result:hit" 200 HTTP/1.0 286 437
2020-03-11 08:31:45.290 127.0.0.1 - GET "/actuator/metrics/cache.size?tag=name:area-services" 200 HTTP/1.0 314 420
2020-03-11 08:31:45.292 127.0.0.1 - GET "/actuator/metrics/cache.gets?tag=name:ui-logbook,result:hit" 200 HTTP/1.0 282 428
2020-03-11 08:31:45.426 127.0.0.1 - GET "/actuator/metrics/cache.size?tag=name:ui-logbook" 200 HTTP/1.0 310 100
2020-03-11 08:31:46.513 127.0.0.1 - GET "/actuator/metrics/jvm.threads.peak" 200 HTTP/1.0 219 436
2020-03-11 08:31:46.520 127.0.0.1 - GET "/actuator/metrics/jvm.threads.live" 200 HTTP/1.0 215 434
2020-03-11 08:31:46.520 127.0.0.1 - GET "/actuator/metrics/process.cpu.usage" 200 HTTP/1.0 207 434
2020-03-11 08:31:46.520 127.0.0.1 - GET "/actuator/metrics/system.cpu.usage" 200 HTTP/1.0 177 433
2020-03-11 08:31:46.521 127.0.0.1 - GET "/actuator/metrics/jvm.gc.pause" 200 HTTP/1.0 401 433
2020-03-11 08:31:46.945 127.0.0.1 - GET "/actuator/metrics/jvm.threads.daemon" 200 HTTP/1.0 179 398
2020-03-11 08:31:46.991 127.0.0.1 - GET "/actuator/metrics/jvm.memory.max?tag=area:heap" 200 HTTP/1.0 282 425
2020-03-11 08:31:46.998 127.0.0.1 - GET "/actuator/metrics/jvm.memory.max?tag=area:nonheap" 200 HTTP/1.0 369 420
2020-03-11 08:31:46.998 127.0.0.1 - GET "/actuator/metrics/jvm.memory.used?tag=area:nonheap" 200 HTTP/1.0 318 422
2020-03-11 08:31:46.999 127.0.0.1 - GET "/actuator/metrics/jvm.memory.used?tag=area:heap" 200 HTTP/1.0 233 420
Is there any way to reduce the refresh rate in order to reduce the CPU load on the client system?
So far I have found only this answer, but the suggested solution did not work.
Thanks for sharing ideas.
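One thing that may help, assuming you can upgrade: later Spring Boot Admin 2.x releases expose poll-timer settings for exactly these Details-page requests, which as far as I know are not available in 2.2.1. The property names below are taken from the newer Spring Boot Admin UI documentation, so verify them against your version; they go into the admin server's application.yml:
spring:
  boot:
    admin:
      ui:
        poll-timer:
          cache: 10000      # poll the cache metrics every 10 s instead of every refresh tick
          datasource: 10000
          gc: 10000
          process: 10000
          memory: 10000
          threads: 10000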

Related

How to query Go-micro (v2) services inside Docker with curl or Postman

I use Go-micro (v2) to deploy services with docker-compose:
user-service:
  build:
    context: ./user-service
  restart: always
  ports:
    - "8086:8086"
  deploy:
    mode: replicated
    replicas: 1
  environment: ....
Here is the service configuration:
srv := micro.NewService(
    micro.Name("my.user"),
    // Note: inside a container, 127.0.0.1 is only reachable from that container itself; 0.0.0.0 is the usual choice for published ports.
    micro.Address("127.0.0.1:8086"),
)
When running docker-compose, the container logs show:
2022-07-31 05:43:53 file=v2#v2.9.1/service.go:200 level=info Starting [service] my.user
2022-07-31 05:43:53 file=grpc/grpc.go:864 level=info Server [grpc] Listening on [::]:8086
2022-07-31 05:43:53 file=grpc/grpc.go:697 level=info Registry [mdns] Registering node: my.user-00ee4795-06df-47f1-a07a-cc362e135864
All looks good.
But when I try to query some handlers using curl or Postman (for development purposes), it doesn't work.
Here is an example of a failed request with Postman:
GET http://127.0.0.1:8086/my.user/Get
Error: Parse Error: Expected HTTP/
Request Headers
Content-Type: application/json
User-Agent: PostmanRuntime/7.29.2
Accept: */*
Postman-Token: b5ab718a-341b-40ff-81fa-37c66fd4d9f2
Host: 127.0.0.1:8086
Accept-Encoding: gzip, deflate, br
Connection: keep-alive
Request Body
GET http://127.0.0.1:8086/my.user/userService/Get // same error
With curl it is no better:
curl --header "Content-Type:application/json" --http0.9 --output GET http://localhost:8086/my.user/Get
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 15 0 15 0 0 10638 0 --:--:-- --:--:-- --:--:-- 15000
curl --header "Content-Type:application/json" --http0.9 --output GET http://localhost:8086/my.user/userService/Get
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 15 0 15 0 0 13550 0 --:--:-- --:--:-- --:--:-- 15000
Any idea how to query go-micro services locally? Thank you.
PS: note that the 'Get' handler is working.
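For what it's worth, the log line Server [grpc] Listening on [::]:8086 shows that go-micro v2 serves gRPC by default, so a plain HTTP/1.1 client never gets a valid HTTP response back, which is exactly what Postman's Parse Error: Expected HTTP/ indicates. One option for local development is the micro CLI, which speaks the service's own protocol; the endpoint name UserService.Get below is only a guess, so adjust it to whatever name your handler is registered under:
micro call my.user UserService.Get '{}'
Alternatively, running the micro api gateway exposes HTTP/JSON endpoints that it translates into RPC calls against the service.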

Server-Sent Events with Play: response only received when process killed

I'm trying to get the sample webapp play-streaming-scala to run, and in some circumstances I get weird behavior.
I've got the app running directly on port 80 of some host and I'm checking the output with curl -iv --raw http://somehost/scala/eventSource/liveClock.
What I'm expecting is something like this:
* Hostname was NOT found in DNS cache
* Trying 195.176.3.71...
* Connected to somehost (0.0.0.0) port 80 (#0)
> GET /scala/eventSource/liveClock HTTP/1.1
> User-Agent: curl/7.39.0
> Host: somehost
> Accept: */*
>
< HTTP/1.1 200 OK
HTTP/1.1 200 OK
< Transfer-Encoding: chunked
Transfer-Encoding: chunked
< Content-Type: text/event-stream; charset=utf-8
Content-Type: text/event-stream; charset=utf-8
< Date: Wed, 18 Jan 2017 13:24:55 GMT
Date: Wed, 18 Jan 2017 13:24:55 GMT
<
10
data: 14 24 56
10
data: 14 24 56
10
data: 14 24 56
etc., where I clearly see the chunks appear one after the other as time goes by.
Now, on some machines, this works well. On some others on campus, this fails. curl only shows this and then stops:
* Trying 195.176.3.71...
* Connected to somehost (0.0.0.0) port 80 (#0)
> GET /scala/eventSource/liveClock HTTP/1.1
> Host: somehost
> User-Agent: curl/7.43.0
> Accept: */*
>
Now the interesting thing is: if I kill the webapp on the host, curl suddenly “catches up” and spits out all the chunks together, closing the connection like this:
10
data: 14 35 20
* transfer closed with outstanding read data remaining
* Closing connection 0
curl: (18) transfer closed with outstanding read data remaining
What could be causing this behavior? What on earth is going on and intercepting these events? Is there any way I can “force flush” something from the Play response?
It turns out the local “hidden” proxy set up automatically by OS X's parental controls system does not forward chunked responses properly, making a system based on Server-Sent Events inoperable. A shame.
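Two standard curl flags are useful for narrowing down cases like this: -N (--no-buffer) disables curl's own output buffering so chunks print as they arrive, and --noproxy '*' makes curl ignore any http_proxy/HTTPS_PROXY environment variables (it cannot defeat a transparent, network-level intercept like the one that turned out to be the culprit here, but it quickly rules the environment out):
curl -N --noproxy '*' -iv --raw http://somehost/scala/eventSource/liveClock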

Deployment fails with a 504

I'm trying to deploy an application to the Swisscom App Cloud from the console. It reports progress until, at the end, a 504 is reported without further explanation:
Updating app helloclass-fe-develop in org UCID-Bern Team / space HELLOCLASS-TEST as christian.cueni#iterativ.ch...
OK
Uploading helloclass-fe-develop...
FAILED
Error processing app files: Error uploading application.
Server error, status code: 504, error code: 0, message:
The log of the app reports that the app has been updated:
2017-01-03 09:37:39 [RTR/0] OUT helloclass-develop.scapp.io - [03/01/2017:08:37:39.584 +0000] "GET / HTTP/1.1" 200 0 594 "-" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/49.0.2623.75 Safari/537.36 Google Favicon" 66.249.93.201:50868 10.0.18.35:64341 x_forwarded_for:"83.76.152.96" x_forwarded_proto:"https" vcap_request_id:8a8adcc7-9e97-4bd9-4492-68e92883ee3d response_time:0.001739219 app_id:310166b4-f3a6-4168-a9ac-530e45dbfb10 app_index:0
2017-01-03 09:37:39 [APP/PROC/WEB/0] OUT 83.76.152.96, 66.249.93.201, 66.249.93.201 - - - [03/Jan/2017:08:37:39 +0000] "GET / HTTP/1.1" 200 606
2017-01-03 10:05:50 [API/2] OUT Updated app with guid 310166b4-f3a6-4168-a9ac-530e45dbfb10 ({"name"=>"helloclass-fe-develop"})
2017-01-03 10:57:15 [API/1] OUT Updated app with guid 310166b4-f3a6-4168-a9ac-530e45dbfb10 ({"state"=>"STOPPED"})
2017-01-03 10:57:15 [CELL/0] OUT Exit status 0
2017-01-03 10:57:15 [APP/PROC/WEB/0] OUT Exit status 0
2017-01-03 10:57:15 [CELL/0] OUT Destroying container
2017-01-03 10:57:15 [CELL/0] OUT Successfully destroyed container
2017-01-03 10:57:16 [API/1] OUT Updated app with guid 310166b4-f3a6-4168-a9ac-530e45dbfb10 ({"state"=>"STARTED"})
2017-01-03 10:57:16 [CELL/0] OUT Creating container
2017-01-03 10:57:16 [CELL/0] OUT Successfully created container
2017-01-03 10:57:17 [CELL/0] OUT Starting health monitoring of container
2017-01-03 10:57:19 [CELL/0] OUT Container became healthy
In spite of those messages, which indicate that the app has been updated, I still see the old version of the app being served.
EDIT
After running the command with the -v parameter, I see that the reason for the failure is a gateway timeout:
RESPONSE: [2017-01-03T13:32:39+01:00]
HTTP/1.1 504 Gateway Timeout
Connection: close
Content-Length: 176
Cache-Control: no-cache, no-store, max-age=0, must-revalidate
Content-Type: text/html
Date: Tue, 03 Jan 2017 12:32:39 GMT
Expires: 0
Pragma: no-cache
Strict-Transport-Security: max-age=15768000; includeSubDomains
X-Content-Type-Options: nosniff
X-Frame-Options: DENY
X-Vcap-Request-Id: 3ac831ef-e70b-4f4e-7c56-e308806f039e
X-Xss-Protection: 1; mode=block
<html>
<head><title>504 Gateway Time-out</title></head>
<body bgcolor="white">
<center><h1>504 Gateway Time-out</h1></center>
<hr><center>nginx</center>
</body>
</html>
FAILED
Error processing app files: Error uploading application.
Server error, status code: 504, error code: 0, message:
Is this something Cloud Foundry specific, or rather related to the Swisscom App Cloud? Are there inherent Cloud Foundry timeout limits?
You can run cf push with -v or enable CF_TRACE to see more of the interaction of the CLI with your CF endpoint.
The error message looks similar to https://github.com/cloudfoundry/cli/issues/1042: the Cloud Controller could not complete a request in time and the router that routed the API request to Cloud Controller did not wait any longer and returned the 504 (Gateway timeout) to the CLI.
The trace should tell you which API call timed out.
The CLI aborted the operation there, while the Cloud Controller may have completed the operation successfully, eventually.
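For example, using the app name from your output (CF_TRACE=true prints the trace to the terminal, while pointing it at a file keeps the push output readable):
CF_TRACE=true cf push helloclass-fe-develop
CF_TRACE=/tmp/cf.trace cf push helloclass-fe-develop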
I would have thought the operations the CLI would perform here are:
send a list of files in your app and their checksums for resource matching (so it can skip uploading unmodified app bits that the CC cached from a previous push)
upload app files
(re)start app (which includes staging)
poll & wait until an app instance returns that it's running
From your CLI output I assume the first operation timed out, so it's not clear how your app got restarted.

grafana not showing in kubernetes heapster

I have tried to install Heapster with Grafana and InfluxDB on my Kubernetes cluster. I cannot manage to see the Grafana page; it only shows me alert.title.
I think I did everything right and all the logs seem good, but this is the last remaining problem. If someone would be kind enough to show me what's happening, I would be grateful.
Here is a peek at my log:
2016/06/23 13:31:23 [I] Completed 172.17.77.1 - "GET /favicon.ico HTTP/1.1" 404 Not Found 2929 bytes in 1224us
2016/06/23 13:31:30 [I] Completed 172.17.77.1 - "GET /grafana HTTP/1.1" 404 Not Found 2929 bytes in 1154us
2016/06/23 13:31:30 [I] Completed 172.17.77.1 - "GET /api/v1/proxy/namespaces/default/services/monitoring-grafana/public/app/app.ca0ab6f9.js HTTP/1.1" 404 Not Found 23 bytes in 545us
2016/06/23 13:31:30 [I] Completed 172.17.77.1 - "GET /api/v1/proxy/namespaces/default/services/monitoring-grafana/public/css/grafana.dark.min.a95b3754.css HTTP/1.1" 404 Not Found 23 bytes in 786us
2016/06/23 13:31:40 [I] Completed 172.17.77.1 - "GET /monitoring-grafana HTTP/1.1" 404 Not Found 2929 bytes in 1409us
2016/06/23 13:31:40 [I] Completed 172.17.77.1 - "GET /api/v1/proxy/namespaces/default/services/monitoring-grafana/public/app/app.ca0ab6f9.js HTTP/1.1" 404 Not Found 23 bytes in 879us
2016/06/23 13:31:40 [I] Completed 172.17.77.1 - "GET /api/v1/proxy/namespaces/default/services/monitoring-grafana/public/css/grafana.dark.min.a95b3754.css HTTP/1.1" 404 Not Found 23 bytes in 1349us
2016/06/23 13:31:46 [I] Completed 172.17.77.1 - "GET /api/v1/proxy/namespaces/default/services/monitoring-grafana/public/app/app.ca0ab6f9.js HTTP/1.1" 404 Not Found 23 bytes in 837us
2016/06/23 13:31:46 [I] Completed 172.17.77.1 - "GET /api/v1/proxy/namespaces/default/services/monitoring-grafana/public/css/grafana.dark.min.a95b3754.css HTTP/1.1" 404 Not Found 23 bytes in 1181us
Update:
OK, I found something in influxdb-grafana-controller.yaml: I changed value: /api/v1/proxy/namespaces/kube-system/services/monitoring-grafana/ to value: /.
I don't know if it's a good solution, but it's working.
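For reference, in the stock Heapster manifests that value belongs to Grafana's root-URL environment variable; the variable name below is my assumption based on the upstream manifests, so check your own yaml:
env:
  - name: GF_SERVER_ROOT_URL
    value: /api/v1/proxy/namespaces/kube-system/services/monitoring-grafana/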
OK, I found the real solution: my cluster was flawed. I had to install flannel on the master too, with the option --iface=eth1 because of Vagrant.
I followed this guide http://severalnines.com/blog/installing-kubernetes-cluster-minions-centos7-manage-pods-services, but it didn't say to install flannel on the master.
You can remove NodePort from influxdb-grafana-controller.yaml, and you can also put the value api/v1/proxy/namespaces/kube-system/services/monitoring-grafana/ back.
Now everything is working.
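For the record, on a CentOS 7 setup like the one in that guide, the interface option typically goes into /etc/sysconfig/flanneld (my assumption; adjust to however flannel is started on your nodes):
FLANNEL_OPTIONS="--iface=eth1"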

Understanding Locust summary results

I have a problem understanding the Locust results, as this is the first time I have load tested my server. I ran Locust from the command line at 00:00 local time with 1000 total users, a hatch rate of 100 per second, and 10000 requests. Below are the results:
Name # reqs # fails Avg Min Max | Median req/s
--------------------------------------------------------------------------------------------------------------------------------------------
GET /api/v0/business/result/22918 452 203(30.99%) 9980 2830 49809 | 6500 1.70
GET /api/v0/business/result/36150 463 229(33.09%) 10636 2898 86221 | 7000 1.50
GET /api/v0/business/result/55327 482 190(28.27%) 10401 3007 48228 | 7000 1.60
GET /api/v0/business/result/69274 502 203(28.79%) 9882 2903 48435 | 6800 1.50
GET /api/v0/business/result/71704 469 191(28.94%) 10714 2748 62271 | 6900 1.70
POST /api/v0/business/query 2268 974(30.04%) 10528 2938 55204 | 7100 7.10
GET /api/v0/suggestions/query/?q=na 2361 1013(30.02%) 10775 2713 63359 | 6800 7.80
--------------------------------------------------------------------------------------------------------------------------------------------
Total 6997 3003(42.92%) 22.90
Percentage of the requests completed within given times
Name # reqs 50% 66% 75% 80% 90% 95% 98% 99% 100%
--------------------------------------------------------------------------------------------------------------------------------------------
GET /api/v0/business/result/22918 452 6500 8300 11000 13000 20000 35000 37000 38000 49809
GET /api/v0/business/result/36150 463 7000 9400 12000 14000 21000 35000 37000 38000 86221
GET /api/v0/business/result/55327 482 7000 9800 12000 13000 21000 34000 38000 39000 48228
GET /api/v0/business/result/69274 502 6800 9000 11000 12000 20000 35000 37000 38000 48435
GET /api/v0/business/result/71704 469 6900 9500 11000 13000 21000 36000 38000 40000 62271
POST /api/v0/business/query 2268 7100 9600 12000 13000 21000 35000 37000 38000 55204
GET /api/v0/suggestions/query/?q=na 2361 6800 9900 12000 14000 22000 35000 37000 39000 63359
--------------------------------------------------------------------------------------------------------------------------------------------
Error report
# occurences Error
--------------------------------------------------------------------------------------------------------------------------------------------
80 GET /api/v0/business/result/71704: "HTTPError('502 Server Error: Bad Gateway',)"
111 GET /api/v0/business/result/71704: "HTTPError('504 Server Error: Gateway Time-out',)"
134 GET /api/v0/business/result/22918: "HTTPError('504 Server Error: Gateway Time-out',)"
69 GET /api/v0/business/result/22918: "HTTPError('502 Server Error: Bad Gateway',)"
92 GET /api/v0/business/result/69274: "HTTPError('502 Server Error: Bad Gateway',)"
594 GET /api/v0/suggestions/query/?q=na: "HTTPError('504 Server Error: Gateway Time-out',)"
111 GET /api/v0/business/result/69274: "HTTPError('504 Server Error: Gateway Time-out',)"
419 GET /api/v0/suggestions/query/?q=na: "HTTPError('502 Server Error: Bad Gateway',)"
69 GET /api/v0/business/result/55327: "HTTPError('502 Server Error: Bad Gateway',)"
121 GET /api/v0/business/result/55327: "HTTPError('504 Server Error: Gateway Time-out',)"
397 POST /api/v0/business/query: "HTTPError('502 Server Error: Bad Gateway',)"
145 GET /api/v0/business/result/36150: "HTTPError('504 Server Error: Gateway Time-out',)"
577 POST /api/v0/business/query: "HTTPError('504 Server Error: Gateway Time-out',)"
84 GET /api/v0/business/result/36150: "HTTPError('502 Server Error: Bad Gateway',)"
--------------------------------------------------------------------------------------------------------------------------------------------
Here is what I am confused about:
What is the meaning of the numbers below # reqs, # fails, Avg, and all the numbers after the name in the first and second tables? Do they show the total number of requests sent, or the n-th request sent?
In the Error report, below # occurences, does each number represent the number of requests that caused that error?
Thanks for your answers.
The first table shows the statistics for each row, with the timing columns (Avg, Min, Max, Median) given in milliseconds, while the Total row shows the totals for each column. However, in your example there is a problem with the calculation of the percentage of failures per row: in the first row, 452 requests were sent and 203 of them failed, which would mean 203/452 ≈ 44.91%, yet the table shows 30.99% (which happens to equal 203/(452+203), i.e. failures divided by requests plus failures); in the Total row, by contrast, it is calculated as you would expect (3003/6997 ≈ 42.92%).
The second table is the distribution table, which shows the percentage of requests completed within a given time: for the first row it means that 50% of those requests completed within 6500 ms, 66% within 8300 ms, and so on.
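As a worked example of reading the distribution table: in the POST /api/v0/business/query row, 50% of its 2268 recorded requests completed within 7100 ms and 90% within 21000 ms, and the 100% column (55204 ms) is simply the slowest request, which matches the Max column of the first table.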