1-Click App / add-on installation failed on DigitalOcean Kubernetes

I have a Kubernetes cluster running on DigitalOcean with multiple applications. Recently I tried using the 1-Click App feature but got an error without any explanation. It just says: Installation failed try again.
The app I tried to install is: Kubernetes monitoring stack.
I hope someone can tell me how to fix this or where to find more information about the error.

Related

Google Cloud Composer failed after restart

I have Google Cloud Composer running in 2 GCP projects. I updated a Composer environment variable in both. One environment restarted fine within a few minutes. The other one has a problem and shows the error below.
Update operation failed. Couldn't start composer-agent, a GKE job that updates kubernetes resources. Please check if your GKE cluster exists and is healthy.
[Screenshots: the error shown when opening the Composer environment, the environment overview, the GKE cluster notification, and the GKE pods overview]
I have been trying to find out how to resolve the problem but haven't found any satisfactory answers. My colleagues suspect a firewall or org policy issue, but I haven't changed either.
Can someone tell me what caused this problem, given that Cloud Composer is managed by Google, and how to resolve it now?
Since Cloud Composer is a managed resource, you should contact Google Cloud Support when the GKE cluster that serves your Composer environment is unhealthy. That GKE cluster should just work, and you should not even need to know it exists.
Also check that you have not reached any limits or quotas in your project.
If nothing else helps, recreating the Cloud Composer environment is always an option.

Is KubeFlow still supported on GCP?

I am trying to use KubeFlow on GCP and I am following this codelab, but "click-to-deploy" is no longer supported, so I followed the "kubectl and kpt" documentation instead. However, I keep getting the error "You cannot perform this action because the Cloud SDK component manager is disabled for this installation." and none of the solutions I found worked. Two friends told me they have been trying to get KubeFlow to work since last year and never succeeded, but I still see people posting questions about KubeFlow on Stack Overflow. So I want to ask whether it still works and, if so, where I can find a decent guide to follow.
Thanks!
I finally got it working. For that error message, it turned out that I just didn't install the Cloud SDK properly. There will be a lot of other issues too down the road, but at least the KubeFlow web UI is working for me now.
Yes. As the "kubectl and kpt" documentation says, the first step in preparing to install the cluster is installing gcloud, the CLI that manages authentication, local configuration, developer workflow, and interactions with Google Cloud APIs.
Without it you simply can't work with objects (in your case you need to enable the kpt, anthoscli and beta components) or perform tasks like creating a Compute Engine VM instance, managing a Google Kubernetes Engine cluster, and deploying an App Engine application, either from the command line or in scripts and other automations.

Error copying pip.conf from bucket to Cloud Composer Airflow environment

Similar to this lonely questioner I'm trying to install a Python package from a private PyPI repo such that it's available to our Google Cloud Composer Airflow instance.
I've followed these instructions but Airflow continues not to know about my package:
No module named 'foopackage'
I can't find any reference to my pip.conf in any logs anywhere so I'm not sure whether the file is in the right place, or has the right contents.
How can I proceed with debugging this problem?
The Cloud Composer environment logs show that there was a problem with copying pip.conf from the bucket, but don't give any other details:
{
insertId: "16qa4c8g540zxs3"
logName: "projects/{my-env}/logs/composer-agent"
receiveTimestamp: "2020-02-06T15:59:03.164564368Z"
resource: {…}
severity: "ERROR"
textPayload: "Copying gs://{my-bucket}/config/pip/pip.conf...
"
timestamp: "2020-02-06T15:59:00.857642186Z"
}
I first thought this might be a permissions issue, but the file seems to have the same set of permissions as other files in this bucket.
Where can I get more detailed information on what went wrong when copying that file?
update
I'm on composer-1.7.2-airflow-1.10.2.
update
The service account for my Composer environment already has the project.editor role.
This is an indicator that the Docker image used for the web server failed to build. To find the root cause, please view the Cloud Build logs in your project.
The reason is an operation that failed or took too long and timed out on the Composer backend. In some cases these errors persist in the backend and block future attempts.
The first solution that comes to mind is re-enabling the API by running the following commands in Cloud Shell:
gcloud services disable composer.googleapis.com
gcloud services enable composer.googleapis.com
After enabling the API, please update your Composer environment as usual.
When you install packages, the Composer environment re-creates the Docker containers for the Airflow workers and scheduler, then performs a rolling update within the GKE cluster so that workers stay available during the update. You can check Kubernetes Engine > Workloads to see whether your environment timed out while waiting for the scheduler and workers to come back online.
If the Composer environment uses a custom service account that does not have IAM access to Cloud Build, builds will fail immediately, so please check that. You can diagnose this under Cloud Build > History: builds without a log failed before even trying to build a container.
If your package implements bindings to shared libraries, it will fail at runtime if those libraries don't exist on the system. Such a package is incompatible with Cloud Composer, because getting shared libraries into the build environment is not currently supported.
Also make sure your project is packaged correctly.
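Since the question also asks whether pip.conf has the right contents, the file can be sanity-checked locally before it is uploaded to the bucket. The following is only a minimal sketch: the function name check_pip_conf and the example index URL are hypothetical, and it assumes the private index is configured under [global] via index-url or extra-index-url. pip also accepts per-command sections such as [install], which this sketch does not look at.

```python
import configparser


def check_pip_conf(text):
    """Return a list of problems found in the text of a pip.conf file."""
    # Disable interpolation so that a literal '%' in a URL is not misread.
    parser = configparser.ConfigParser(interpolation=None)
    try:
        parser.read_string(text)
    except configparser.Error as exc:
        return [f"not parseable as INI: {exc.__class__.__name__}"]
    if not parser.has_section("global"):
        return ["missing [global] section"]
    keys = ("index-url", "extra-index-url")
    if not any(parser.has_option("global", key) for key in keys):
        return ["[global] has neither index-url nor extra-index-url"]
    return []


# Hypothetical file contents; replace with the text of your own pip.conf.
sample = """[global]
extra-index-url = https://pypi.example.invalid/simple/
"""
print(check_pip_conf(sample))
```

An empty list means the file parsed and names an index. It does not prove that Composer picked the file up or that the index is reachable from the environment.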
I hope you find the above pieces of information useful.

network deployed in IBM cloud, but having an issue instantiating the chaincode

I deployed a bna archive file to my IBM Cloud instance. It has all the files you'd expect, including the package.json. This was done following the tutorial here: https://console.bluemix.net/docs/services/blockchain/develop_starter.html#deploying-a-business-network
The last step in the process is a ping issued to ensure the network is up and running. I am getting an error telling me that the "chaincode is not instantiated".
I went to the web interface ( https://blockchain-starter.eu-gb.bluemix.net/network/myid ) and under My Code / Install Code section I can see my network. Under Actions there is an option to instantiate it on a peer. Clicking that gives me this error : Unknown error occurred when instantiating chaincode, check your peer logs.
Looking at the logs on the peer I can see this:
{"log":"npm ERR! enoent ENOENT: no such file or directory, open '/usr/local/src/package.json'\n","stream":"stderr","time":"2018-06-19T13:20:48.455812355Z"}
That particular file IS part of my bna archive.
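A .bna file is a zip archive, so the presence of package.json can be double-checked locally. This is only a minimal sketch: the helper names are hypothetical and the demo archives are built in memory purely for illustration. For a real archive, pass the bytes read from the .bna file instead. It only confirms the archive contents and says nothing about why the peer cannot find /usr/local/src/package.json.

```python
import io
import zipfile


def bna_has_root_package_json(bna_bytes):
    """Return True if the archive has package.json at its top level."""
    with zipfile.ZipFile(io.BytesIO(bna_bytes)) as archive:
        return "package.json" in archive.namelist()


def build_demo_archive(names):
    """Build an in-memory zip with the given member names (demo data only)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as archive:
        for name in names:
            archive.writestr(name, "{}")
    return buf.getvalue()


good = build_demo_archive(["package.json", "models/org.example.cto"])
nested = build_demo_archive(["dist/package.json"])
print(bna_has_root_package_json(good))    # True
print(bna_has_root_package_json(nested))  # False
```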
I can deploy the bna file to both my local composer-playground and also IBM's one ( https://blockchaindevelop.mybluemix.net/ ) and it works fine in both environments.
The same issue happens if I deploy the bna using the web interface, I simply can't instantiate it.
Any suggestions what I can do to get this network running?
In the end it was a software version issue.
The original documentation specified using composer 0.18.1 as the only one compatible with the IBM cloud infrastructure.
This has recently been updated to 0.19.x.
In IBM Cloud, I removed the original peer with all the old chaincode, and removed the old certificate as well.
On my local machine I started from scratch:
reinstalled the latest composer
recreated the bna file
I then re-did all the steps as in the original documentation and this time everything worked and I managed to start the network and ping it.
Everything is up and running now. There was one last timeout issue when I tried to start the network, but I simply ran the command again and the problem went away.

We are deploying Open Loyalty on to Google Cloud and we are receiving a Yarn error?

We are installing Open Loyalty Program on to the Google Cloud. Please Google Open Loyalty by Divante Ltd.
We have been trying to deploy this application on google cloud using Kubernetes.
The instance used to deploy this application runs Debian v4.9. We installed Docker, gcloud, Kubernetes and Kompose as the deployment tools. We built two Docker images, one for the frontend and one for the backend, and linked them in the docker-compose file. The frontend Dockerfile uses the node:5 image from Docker Hub as its base.
We also changed the docker-compose file:
[screenshot of the modified docker-compose file]
After changing the docker-compose file, we ran “kompose up” within the same directory, which created the deployment and service ‘yaml’ files and then proceeded to run them.
The pods are listed below, but the frontend pod shows an error in its logs:
[screenshot of the pod list and the frontend pod logs]
It says yarn not found. When we execute the same process on a local machine, it works as expected.
We are also trying to seek help from Google Support but your help and suggestions will also be highly appreciated.
Yarn is available from node:6. Your front-end image is too old.
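As a rough illustration of that check, the sketch below reads the FROM line of a Dockerfile and flags node base images older than major version 6. The function name, the threshold constant, and the sample Dockerfile text are all hypothetical. The cut-off of 6 is taken from the answer above and is not verified independently here.

```python
import re

# Taken from the answer above; treat it as an assumption, not a verified fact.
MIN_NODE_MAJOR_WITH_YARN = 6


def node_base_too_old(dockerfile_text):
    """Return True or False for a `FROM node:<major>` line, None if absent."""
    match = re.search(
        r"^\s*FROM\s+node:(\d+)",
        dockerfile_text,
        re.IGNORECASE | re.MULTILINE,
    )
    if match is None:
        # No numeric node tag found (for example another base image).
        return None
    return int(match.group(1)) < MIN_NODE_MAJOR_WITH_YARN


# Hypothetical Dockerfile contents, for illustration only.
print(node_base_too_old("FROM node:5\nRUN yarn install\n"))  # True
print(node_base_too_old("FROM node:6\n"))                    # False
```

If the check returns True, changing the base image tag in the frontend Dockerfile and rebuilding the image would be the next thing to try.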