Has anyone tried connecting to IBM Bluemix using the BOSH CLI? I am seeing performance issues in my requests and was going through this article on Cloud Foundry. I am planning to SSH into the gorouter and monitor the gorouter's CPU utilization.
Can someone recommend any way to capture the following metrics from Bluemix:
CPU utilization
Latency
Requests per second
What do you mean by "connecting to IBM Bluemix using bosh-cli"?
If you mean the publicly available IBM Cloud (formerly Bluemix) at https://console.bluemix.net/, that's not possible. The BOSH CLI is there to maintain the platform, i.e. Cloud Foundry and potentially other deployments, but not your apps.
If you have a private installation, you might check the metrics that the system provides. Details here: https://docs.cloudfoundry.org/running/all_metrics.html
If you want metrics about your app, I could think of the app providing these metrics itself (see the sketch below), or you could put something like New Relic monitoring in place. They have a range of application performance monitoring (APM) agents. Info here: https://docs.newrelic.com/docs/agents
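To illustrate the "app provides its own metrics" idea, here is a minimal sketch. It assumes Flask purely for illustration (the framework, the /metrics route, and the field names are my choices, nothing Bluemix-specific); the app counts requests and accumulates latency so it can report requests per second and average latency itself.

```python
# Minimal sketch: an app that tracks its own request count and latency.
# Flask and the /metrics route are illustrative choices, not part of Bluemix.
import time
from flask import Flask, g, jsonify

app = Flask(__name__)
stats = {"requests": 0, "total_latency_s": 0.0, "started": time.time()}

@app.before_request
def start_timer():
    g.start = time.time()

@app.after_request
def record(response):
    elapsed = time.time() - g.start
    stats["requests"] += 1
    stats["total_latency_s"] += elapsed
    return response

@app.route("/metrics")
def metrics():
    uptime = time.time() - stats["started"]
    reqs = stats["requests"]
    return jsonify(
        requests_total=reqs,
        requests_per_second=reqs / uptime if uptime else 0.0,
        avg_latency_ms=(stats["total_latency_s"] / reqs * 1000) if reqs else 0.0,
    )

if __name__ == "__main__":
    app.run(port=8080)
```

An APM agent essentially automates this kind of instrumentation and adds storage, dashboards, and alerting on top.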
Is there any advantage if I use Cloud Run instead of deploying a normal service/container in GKE?
I will try to add my perspective.
This answer does not cover running containers with Cloud Run on GKE. The reason is that we wanted an almost zero-cost solution for a legacy PHP website. Cloud Run fit perfectly, and we had an easy time both porting the code and learning Cloud Run.
We needed to do something with a legacy PHP website. This website was running on Windows Server 2012, IIS and PHP 7.0x. The cost was over $100.00 per month - mostly for Windows licensing fees for a VM in the cloud. The site was not accessed very much but was needed for various business reasons.
A decision was made on Thursday (4/18/2019) that we needed to learn Google Cloud Run, so we decided to port this site to a container and try to run the container in Google Cloud. Nothing like a real-world example to learn the details.
Friday, we ported the PHP code to Apache. Very easy process. We did not worry about SSL as we intend to use Cloud Run SSL.
Saturday we started to learn Cloud Run. Within an hour we had the Hello World PHP example running. Link.
Within two hours we had the containerized website running in Cloud Run. Again, very simple.
Then we learned how to configure Cloud Run SSL with our DNS server.
End result:
Almost zero cost for a PHP website running in Cloud Run.
Approximately 1.5 days of effort to port the legacy code and learn Cloud Run.
Savings of about $100.00 per month (no Windows IIS server).
We do not have to worry about SSL certificates from now on for this site.
For small websites that are static, Cloud Run is a killer product. The learning curve is very small even if you do not know Google Cloud. You just need to configure gcloud for container builds and deployment. This means developers can be independent and do not need to master GCP.
There are many distinctions between using Cloud Run to expose a service and running it natively in GKE. The primary one is that Cloud Run provides more of a serverless infrastructure. Basically, you declare that you want to expose a service and then let GCP do the rest. Contrast this with creating a Kubernetes cluster and then defining your service in pods. With a manually created GKE cluster, the nodes and environment are always on, which means that you are billed for them regardless of utilization. With Cloud Run, your service is merely available and you are only billed for actual consumption. If your service is not being called, your costs are zero. Another advantage is that you don't have to predict your utilization needs and allocate sufficient nodes; scaling happens automatically for you.
See also these presentations from Google Next 19:
Migrating from a Monolith to Microservices (Cloud Next '19)
What's New in Serverless Compute? (Cloud Next '19)
Run Containers on GCP's Serverless Infrastructure (Cloud Next '19)
Run Cloud Functions Everywhere (Cloud Next '19)
Container Once, Serverless Anywhere (Cloud Next '19)
I am investigating options for monitoring our installation in Swisscom's Cloud Foundry. My objectives are the following:
monitor performance indicators for deployed applications (such as CPU, disk, memory)
monitor performance indicators for services (slow queries, number of queries, ideally also some metrics on hitting quotas)
So far, I understand the options are the following (including some BUTs):
I used a very nice TOP cf-plugin (github)
This works very well. It seems that it registers itself to get the required firehose nozzles and consume data.
That is very useful for tracing / ad-hoc monitoring, but not very good for serious infrastructure monitoring.
Another way I found is to use the firehose-syslog solution.
This can be deployed as an app and (as far as I understand) does the job in a similar way to the TOP cf plugin.
The problem is that it requires a registered client so it can authenticate with the Doppler endpoint. For some reason, the top-cf-plugin does that automatically / in another way.
The last option I am considering is to build the monitoring into the app itself (using a special buildpack).
That can, for example, be done with Datadog. But it also seems to require a dedicated UAA client to register the nozzle.
I would like to check whether somebody is (or was) on a similar road and has some findings.
Finally, I would like to raise the following questions to the Swisscom community support:
Is it possible to register a UAA client to be able to ingest events through a firehose nozzle from an external service? (This requires admin credentials, if I read correctly.)
Is there an alternative way to authenticate with the nozzle (for example, using a special user and their authentication token)?
Is there any alternative way to monitor the CF deployments in Swisscom? Is there a paper, blog post, or other form of documentation that would be helpful in this respect (also for other users of the AppCloud)?
Since it requires admin permissions, we can not give out UAA clients for the firehose.
However, there are different ways to get metrics in the context of a user.
CF API
You can obtain basic metrics of a specific app by polling the CF API:
https://apidocs.cloudfoundry.org/5.0.0/apps/get_detailed_stats_for_a_started_app.html
However, since you have to poll (and for each app), it's not the recommended way.
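As a rough illustration of this polling approach, here is a Python sketch; the API URL, app GUID, and token are placeholders (the token is what `cf oauth-token` prints, the GUID comes from `cf app <name> --guid`), and the response shape follows the v2 stats endpoint linked above.

```python
# Sketch: poll the CF API for basic per-instance stats of one app.
# API_URL, APP_GUID, and TOKEN are placeholders you must supply.
import time
import requests

API_URL = "https://api.example.cf.local"   # your CF API endpoint (placeholder)
APP_GUID = "00000000-0000-0000-0000-000000000000"
TOKEN = "bearer eyJ..."                    # output of `cf oauth-token`

def fetch_stats():
    resp = requests.get(
        f"{API_URL}/v2/apps/{APP_GUID}/stats",
        headers={"Authorization": TOKEN},
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json()

while True:
    stats = fetch_stats()
    for index, instance in stats.items():
        usage = instance["stats"]["usage"]
        print(f"instance {index}: cpu={usage['cpu']:.3f} "
              f"mem={usage['mem']} disk={usage['disk']}")
    time.sleep(30)  # keep the polling interval modest to avoid rate limits
```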
Metrics in syslog drain
CF allows devs to forward their logs to syslog drains; in more recent versions, CF also sends metrics to this syslog drain (see https://docs.cloudfoundry.org/devguide/deploy-apps/streaming-logs.html#container-metrics).
For example, you could use Swisscom's Elasticsearch service to store these metrics and then analyze them using Kibana.
Metrics using loggregator (firehose)
The firehose allows streaming logs to clients for two types of roles:
Streaming all logs to admins (which requires a UAA client with admin permissions) and streaming app logs and metrics to devs with permissions in the app's space. This is also what the cf logs command uses. cf top also works this way (it enumerates all apps and streams the logs of each app).
However, you will find out that most open source tools that leverage the firehose only work in admin mode, since they're written for the platform operator.
Of course, you also have the possibility to monitor your app by instrumenting it (a white-box approach), for example by configuring Spring Boot Actuator in a Spring Boot app or by including an agent of your favourite APM vendor (Dynatrace, AppDynamics, ...).
I guess this is the most common approach; we've seen a lot of teams have success by instrumenting their applications, especially since advanced monitoring requires you to create your own metrics anyway, as the firehose-provided CPU/memory metrics are not that powerful in a microservice world.
However, option 2 would be worth a try as well, especially since the ELK stack's metric support is getting better and better.
Is there a way to get a notification when a Cloud Foundry application fails or is unreachable? I mean to register for a deployed app so that if the status of the application changes to failed or similar, I receive a notification.
On Pivotal Cloud Foundry, when an app crashes, an event is emitted through the firehose.
The PCF Metrics tile, available from Pivotal, can be deployed to your PCF foundation. PCF Metrics tracks all events for apps running on the foundation, and they are accessible to developers (through Apps Manager). I believe the Metrics tile tracks history for up to two weeks. I am not aware of any alerting capabilities in the PCF Metrics tile (I could be wrong, in which case please correct me) that will prompt you when an app crashes.
Other approaches are to implement event logging tools like Splunk, New Relic, etc. They support alerts, but you will have to build those alerts yourself.
API monitoring tools like AppDynamics, Apigee, and New Relic provide alerting and can notify you when the response time of an app has degraded (as in, your app has crashed). This approach is a little more involved. You may need to add an agent to your buildpack, depending on the tool you choose.
IMHO there is no such built-in feature for Cloud Foundry, but IBM Cloud offers the Availability Monitoring service to monitor apps and send out alerts in case of unavailability or other similar events. The service is part of the DevOps category in the IBM Cloud catalog.
There is also Alert Notification to manage alerts, the notification of the right groups via all kinds of channels and to track the alert status. For your question you should start with the Availability Monitoring and then work towards how those events are handled.
You can use the cf events appname command to get a list of all events for the application; this will print out all the recent events, such as application crashes.
If you run cf events appname -v, you will see the JSON REST calls the cf CLI makes to Cloud Foundry.
You can use the Cloud Foundry Java Client to write your own code to interact with Cloud Foundry.
Another thing you can do is stream your application logs to any syslog-compatible log aggregation service, for example Splunk, and then have Splunk monitor for app crash events in the log. You can read how to configure app log streaming in the docs.
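If you want to roll your own notification on top of the cf events idea, here is a rough Python sketch that polls the same underlying /v2/events endpoint for app.crash events and posts to a webhook. The API host, app GUID, token, and webhook URL are placeholders; running cf events appname -v shows the exact requests your installation uses, so verify the query against that output.

```python
# Rough sketch: poll CF v2 events for app.crash and post a notification.
# API_URL, APP_GUID, TOKEN, and WEBHOOK_URL are placeholders you must supply.
import time
import requests

API_URL = "https://api.example.cf.local"
APP_GUID = "00000000-0000-0000-0000-000000000000"
TOKEN = "bearer eyJ..."                    # e.g. output of `cf oauth-token`
WEBHOOK_URL = "https://hooks.slack.com/services/XXX/YYY/ZZZ"  # hypothetical

seen = set()

def poll_crash_events():
    resp = requests.get(
        f"{API_URL}/v2/events",
        headers={"Authorization": TOKEN},
        params={"q": [f"actee:{APP_GUID}", "type:app.crash"],
                "order-direction": "desc"},
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json().get("resources", [])

while True:
    for event in poll_crash_events():
        guid = event["metadata"]["guid"]
        if guid not in seen:
            seen.add(guid)
            entity = event["entity"]
            requests.post(WEBHOOK_URL, json={
                "text": f"App crash: {entity.get('actee_name')} "
                        f"at {entity.get('timestamp')}"
            }, timeout=10)
    time.sleep(60)
```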
This functionality is scheduled to be available with PCF Metrics 1.5 and can be seen with PWS (Pivotal Web Services) in Alpha Mode.
The functionality is available under the Monitors Tab inside of PCF Metrics (1.5).
Webhook notifications (e.g. Slack) can be configured for a number of events (including, as you discussed, crashes).
You can create a user-provided service with a syslog drain URL and then bind the service to your application. When events happen, Cloud Foundry will forward the logs to the URL you have provided.
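If you do not yet have a log aggregation product on the receiving end, a toy drain endpoint is enough to see what arrives. Below is a sketch of a minimal plain-TCP receiver (assumption: a syslog:// drain URL pointing at port 9999; production setups should use TLS), which flags lines containing "CRASHED", the marker that typically appears in CF crash log lines; verify the exact wording against your own logs.

```python
# Toy syslog drain receiver: listens on TCP and flags crash-looking lines.
# Intended as the target of a `syslog://<host>:9999` drain URL (assumption:
# plain TCP syslog with newline framing).
import socketserver

class DrainHandler(socketserver.StreamRequestHandler):
    def handle(self):
        for raw in self.rfile:
            line = raw.decode("utf-8", errors="replace").rstrip()
            if "CRASHED" in line:          # marker seen in CF crash events
                print("ALERT:", line)      # replace with a real notification
            else:
                print(line)

if __name__ == "__main__":
    with socketserver.ThreadingTCPServer(("0.0.0.0", 9999), DrainHandler) as srv:
        srv.serve_forever()
```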
I have an app running in Bluemix on a node.js runtime and I want to integrate it with APIs exposed by my on-prem system -- which is connected via the Secure Gateway. What is the best way to measure the latency between Bluemix and my on-prem system to determine viability of this architecture?
I am not sure you will get away with using ping or traceroute. Ping uses ICMP, which is probably blocked by the BSO firewall. As for traceroute, it's not dependable and is deprecated.
It might be possible to use Snort to gain more insight into the latency between your on-premises resources and Bluemix.
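Since ICMP is likely blocked, an application-level measurement is often the most representative option: time round trips from the app to one of the on-prem APIs through the Secure Gateway. A rough Python sketch follows (the destination URL is a placeholder for your gateway endpoint; the original app is Node.js, so the same idea applies there with any HTTP client, and it should be run from inside the Bluemix runtime so it measures the actual path).

```python
# Rough latency probe: time round trips to an on-prem API reached via the
# Secure Gateway. ONPREM_URL is a placeholder for your gateway destination.
import statistics
import time
import requests

ONPREM_URL = "https://my-secure-gateway-host.example.com:12345/health"  # placeholder
SAMPLES = 20

latencies = []
for _ in range(SAMPLES):
    start = time.perf_counter()
    requests.get(ONPREM_URL, timeout=10)
    latencies.append((time.perf_counter() - start) * 1000)  # milliseconds
    time.sleep(1)

print(f"samples={len(latencies)} "
      f"min={min(latencies):.1f} ms  "
      f"median={statistics.median(latencies):.1f} ms  "
      f"max={max(latencies):.1f} ms")
```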
I'm working on trying to set up some monitoring on a Google Cloud SQL node and am not seeing how to do it. I was able to install the monitoring agent on my Google Compute Engine instances to monitor CPU, network, etc. I have not been able to figure out how to do so on the Cloud SQL instance. I have access to these types of monitoring:
Storage Usage (GB)
Number of Read/Write operations
Egress Bytes
Active Connections
MySQL Queries
MySQL Questions
InnoDB Pages Read/Written (pages/sec)
InnoDB Data fsyncs (operations/sec)
InnoDB Log fsyncs (operations/sec)
I'm sure these are great options, but at this point all I want to pay attention to is how my node is performing from a CPU/RAM standpoint, as those seem to be the first and foremost measures of performance.
If I'm missing something, or misunderstanding what I'm trying to do, any advice is appreciated.
Thanks!
Google has Stackdriver, which provides logging and monitoring for Google Cloud and AWS infrastructure. It can monitor most resources on GCP, and you can create a dashboard with charts for your Cloud SQL instance. You just have to:
1. Log in to Stackdriver and go to any existing dashboard; if you don't have one, create one.
2. Add a chart and select Cloud SQL as the resource name.
3. Select CPU Utilization as the metric and save. You can also monitor memory, disk I/O, delta count of queries, server uptime, and many more.
If you want to monitor any other GCP resource, such as Compute Engine, App Engine, Kubernetes Engine, a storage bucket, Bigtable, or Pub/Sub, you just have to select the appropriate resource name from the list. Hope you got your answer.
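If you prefer to read the same metric programmatically rather than through the dashboard, the Cloud Monitoring (Stackdriver) API exposes it. Here is a sketch using the google-cloud-monitoring Python client; the project ID is a placeholder, it assumes application default credentials are configured, and the metric type should be double-checked against the current Cloud SQL metrics list.

```python
# Sketch: read Cloud SQL CPU utilization for the last hour via the
# Cloud Monitoring (Stackdriver) API. PROJECT_ID is a placeholder.
# Requires: pip install google-cloud-monitoring
import time
from google.cloud import monitoring_v3

PROJECT_ID = "my-gcp-project"

client = monitoring_v3.MetricServiceClient()
now = time.time()
interval = monitoring_v3.TimeInterval({
    "end_time": {"seconds": int(now)},
    "start_time": {"seconds": int(now - 3600)},
})

results = client.list_time_series(
    request={
        "name": f"projects/{PROJECT_ID}",
        "filter": 'metric.type = "cloudsql.googleapis.com/database/cpu/utilization"',
        "interval": interval,
        "view": monitoring_v3.ListTimeSeriesRequest.TimeSeriesView.FULL,
    }
)

for series in results:
    instance = series.resource.labels.get("database_id", "unknown")
    for point in series.points:
        print(instance, point.interval.end_time, f"{point.value.double_value:.1%}")
```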
You can view all of them directly from the "Overview" tab of the Cloud SQL console.
I have added this as a feature request, issue 110:
https://code.google.com/p/googlecloudsql/issues/detail?id=110