I am investigating options for monitoring our installation in Swisscom's cloud-foundry. My objectives are the following:
monitor performance indicators for deployed application (such as cpu, disk, memory)
monitor performance indicators for services (slow queries, number of queries, ideally also some metrics on hitting quotas)
So far, I understand the options are the following (including some BUTs):
I used a very nice TOP cf-plugin (github)
This works very well. It seems that it registers itself to get the required firehose nozzles and consume data.
That is very useful for tracing / ad-hoc monitoring, but not very good for a serious infrastructure monitoring.
Another way I found is to use firehose-syslog solution.
This can be deployed as an app to (as far as I understand) do the job in similar way, as the TOP cf plugin.
The problem is, that it requires registered client, so it can authenticate with the doppler endpoint. For some reason, the top-cf-plugin does that automatically / in another way.
Last option i am considering is to build the monitoring itself to the App (using a special buildpack)
That can be for example done with Datadog. But it seems to also require a dedicated uaa client to register the Nozzle.
I would like to check, if somebody is (was) on the similar road, has some findings.
Eventually I would like to raise the following questions towards the swisscom community support:
is it possible to register uaac client to be able to ingest events through the firehose nozzle from external service? (this requires admin credentials if I was reading correctly)
is there an alternative way to authenticate with the nozzle (for example using a special user and his authentication token?)
is there any alternative to monitor the CF deployments in Swisscom? Eventually, is there a paper, blogpost or other form of documentation, that would be helpful in this respect (also for other users of AppCloud)?
Since it requires admin permissions, we can not give out UAA clients for the firehose.
However, there are different ways to get metrics in context of a user.
CF API
You can obtain basic metrics of a specific app by polling the CF API:
https://apidocs.cloudfoundry.org/5.0.0/apps/get_detailed_stats_for_a_started_app.html
However, since you have to poll (and for each app), it's not the recommended way.
Metrics in syslog drain
CF allows devs to forward their logs to syslog drains; in more recent versions, CF also sends metrics to this syslog drain (see https://docs.cloudfoundry.org/devguide/deploy-apps/streaming-logs.html#container-metrics).
For example, you could use Swisscom's Elasticsearch service to store these metrics and then analyze it using Kibana.
Metrics using loggregator (firehose)
The firehose allows streaming logs to clients for two types of roles:
Streaming all logs to admins (which requires a UAA client with admin permissions) and streaming app logs and metrics to devs with permissions in the app's space. This is also what the cf logs command uses. cf top also works this way (it enumerates all apps and streams the logs of each app).
However, you will find out that most open source tools that leverage the firehose only work in admin mode, since they're written for the platform operator.
Of course you also have the possibility to monitor your app by instrumenting it (white box approach), for example by configuring Spring actuator in a Spring boot app or by including an agent of your favourite APM vendor (Dynatrace, AppDynamics, ...)
I guess this is the most common approach; we've seen a lot of teams having success by instrumenting their applications. Especially since advanced monitoring anyway requires you to create your own metrics as the firehose provided cpu/memory metrics are not that powerful in a microservice world.
However, option 2. would be worth a try as well, especially since the ELK's stack metric support is getting better and better.
How can I centrally log my spring boot REST services which are running in different applications on the cloud foundry platform? For example I want to log how much a particular services is requested. The log should also be persistent even if I have to restart / reset my application. I don't only want to see the last log entries with cf logs --recent. Is there a best practise?
Your applications should configure their logging to write to stdout and stderr. The Cloud Foundry logging subsystem will automatically pick up everything written to stdout and stderr and send it to the log aggregator. See the Application Logging docs for more info.
To persist the logs and make them available for viewing and analysis, they should be streamed to an external log capture system. Some Cloud Foundry docs contain some general information about configuring log streaming and some specific instructions for some popular log capture systems.
Has anyone tried connecting to IBM bluemix using bosh-cli. I am seeing performance issues in my requests and was going through this article on cloud foundry. I am planning to login to ssh to gorouter and monitor go-router CPU utilization.
Can someone recommend any way to capture the following metrics from Bluemix:
CPU utilization
Latency
Requests per second
what do you mean by "connecting to IBM bluemix using bosh-cli"?
When you think about the public available IBM Cloud (formerly Bluemix) that's represented here https://console.bluemix.net/ it's not possible. The bosh cli is to maintain the platform, thus Cloudfoundry and potentially other deployments but not your apps.
If you have a private installation you might check the metrics that the system provides. Infos here https://docs.cloudfoundry.org/running/all_metrics.html
When you want to have metrics about your app I could think off your app is providing these metrics. Or you put something in place like the New Relic monitoring. The have a bunch of application performance monitoring (APM). Info here https://docs.newrelic.com/docs/agents
HP
Consider a scenario where I have a watcher-service which is registered in consul. In this watcher service, I want to trigger some emails based on the behaviour of other services i.e. when other services are up, when a new service is added and when a running service is down.
My question is how can I subscribe to the events of other services.
One way I got is to use a scheduler and keep hitting consulClient.agentServices but this seems not an optimal way. I am hoping that I should be able to add a listener / watcher which invoke method which inturn tells watcher-service which service is down/added.
Looking for solutions which are more specific to spring cloud consul, but all hints are warmly welcome.
I have app that uses several Bluemix services including database service.
Is it possible to have access to all available operational info via single dashboard or I need inspect all components' logs separately?
-Thanks in advance
Service log files are not available. You might consider looking at the Monitoring and Analytics service to see if will meet your needs. There is both a free and a diagnostics plan available.