How to run the Kubernetes sample controller

I am trying to run the Kubernetes sample controller example by following the link https://github.com/kubernetes/sample-controller. I have the repo set up on an Ubuntu 18.04 system and was able to build the sample-controller package. However, when I try to run the go package, I am getting some errors and am unable to debug the issue. Can someone please help me with this?
I followed the steps given in "Running Kubernetes Sample Controller" but could not achieve the same result: I ended up with an error after running the first four commands. I would appreciate any help with it.

Your kubeconfig is not located at ~/.kube/config. Pass the --kubeconfig=<path> flag along with the kubectl command.
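To see which kubeconfig file would be picked up when no flag is given, here is a minimal sketch. It assumes the usual lookup order (the KUBECONFIG environment variable if set, otherwise ~/.kube/config); the helper name find_kubeconfig is made up for this example.

```python
import os
from pathlib import Path
from typing import Optional


def find_kubeconfig(env=None, home=None) -> Optional[Path]:
    """Return the first existing kubeconfig file, or None if there is none."""
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)

    # KUBECONFIG may hold several paths separated by os.pathsep (":" on Linux).
    entries = [e for e in env.get("KUBECONFIG", "").split(os.pathsep) if e]
    if entries:
        candidates = [Path(e).expanduser() for e in entries]
    else:
        # Default location, only consulted when KUBECONFIG is not set.
        candidates = [home / ".kube" / "config"]

    for path in candidates:
        if path.is_file():
            return path
    return None
```

Calling find_kubeconfig() with no arguments inspects the real environment. If it returns None, there is no file in the default places and the path has to be passed explicitly with --kubeconfig=<path>.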


`gcloud run deploy` raises "Revision <revision_name> is not ready and cannot serve traffic."

Command
gcloud run deploy api --region=$REGION --image=$IMAGE
Logs
Deploying container to Cloud Run service [api] in project [[MASKED]] region [[MASKED]]
Deploying...
Creating Revision...........interrupted
Deployment failed
ERROR: (gcloud.run.deploy) Revision [[MASKED]] is not ready and cannot serve traffic.
I searched the Google Cloud documentation, but it does not mention this problem.
How can I solve the "Revision is not ready and cannot serve traffic." error?
Try to wait a few minutes and then just re-launch the procedure. The good old "let's retry without changing anything" worked for me! :)
EDIT: I talked with a Cloud Architect who works with me and he told me that this is the actual solution: if you restart the deploy too quickly, GCP may still have some pending operations from the previous one!
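If you want to script the "wait, then retry" approach, something like the following could work. It is only a sketch: the helper names, the three attempts and the 120-second pause are my own assumptions (the answer only says "a few minutes"), and the deploy command is the one from the question.

```python
import subprocess
import time


def retry_with_delay(action, attempts=3, delay_seconds=120, sleep=time.sleep):
    """Call action() until it returns True, pausing between failed attempts."""
    for attempt in range(1, attempts + 1):
        if action():
            return attempt  # number of the attempt that succeeded
        if attempt < attempts:
            sleep(delay_seconds)  # give pending operations time to finish
    return None  # every attempt failed


def deploy():
    """Run the deploy command from the question; True when it exits with 0."""
    # $REGION and $IMAGE are expanded by the shell, as in the original command.
    command = "gcloud run deploy api --region=$REGION --image=$IMAGE"
    return subprocess.run(command, shell=True).returncode == 0
```

Calling retry_with_delay(deploy) then runs the deploy up to three times and returns the number of the successful attempt, or None if all of them failed.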
I faced the same error in Cloud Run after getting the container working correctly locally. In my case the revisions weren't showing as failing; they had a grey checkmark, and when hovering over it I got the message:
The revision is healthy but not currently serving traffic.
I just needed to click Manage Traffic and set 100% of the traffic to the new revision.
I faced this problem as well. In my case I opened the "Cloud Run" section from the hamburger menu of the Google Cloud console. The "Logs" section should give you a better idea of what went wrong. I was missing a Python library, and adding the correct dependency to my requirements.txt solved the issue for me. Somehow my local testing went well without this issue. I hope this helps. :)
I faced this problem too. In my case the Docker image was missing a required dependency package at the build stage: my Dockerfile lacked some steps that copy the files needed to install the package.
If the Cloud Build logs do not make sense to you, I think you can find the problem like this:
In the Google Cloud console, go to the "Container Registry" service > Images.
Select your repository name.
For the image version you want to check (maybe latest), choose more actions > show pull command, then copy that command, e.g. docker pull gcr.io/..
In the console header, select "Activate Cloud Shell".
In the Cloud Shell terminal, pull the Docker image of your latest build by running the pull command you copied before.
Start a container from this image to see exactly what happens with your revision.
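The last two steps can be sketched as a small helper that prints the commands to paste into Cloud Shell. The image name below is a placeholder and the helper name is made up. Setting PORT reflects that Cloud Run tells the container which port to listen on through that variable (8080 by default), so running the image the same way locally should reproduce the startup behaviour.

```python
import shlex


def local_debug_commands(image, port=8080):
    """Return the shell commands for pulling an image and running it locally."""
    pull = ["docker", "pull", image]
    run = [
        "docker", "run", "--rm",
        "-e", f"PORT={port}",    # Cloud Run passes the listening port in PORT
        "-p", f"{port}:{port}",  # expose the same port on the local machine
        image,
    ]
    return [shlex.join(pull), shlex.join(run)]


# Placeholder image name; use the one from your own "show pull command".
for line in local_debug_commands("gcr.io/PROJECT_ID/IMAGE:latest"):
    print(line)
```

If the container exits right away or never starts listening on that port, the output of docker run usually shows the missing dependency or file directly.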

Dags in Airflow UI do not show on/off and keep on loading

I installed Airflow using Kubernetes and have logged in to the Airflow UI. It shows all DAGs, but they are not displayed correctly.
1/ There are no on/off buttons to the left of the DAG names; only an empty checkbox is shown.
2/ The "Recent Tasks" and "DAG Runs" columns look like they are trying to load something.
3/ If I click on any DAG to open it, the page also looks like it is trying to load something.
I tried both Airflow 2.0.0 and 1.10.11 and they behave the same, so it is not a version issue.
What is the problem with Airflow, and how can I fix it?
------- Here is more information, following Ofek Hod's suggestion:
1/ I ran "kubectl logs <pod_id> webserver". After logging in to the Airflow web UI, I got many HTTP 404 responses.
After clicking any DAG in the Airflow web UI, I got some other 404 responses.
Airflow first interprets all the .py files in your dags folder; I guess something goes wrong there.
As a rule of thumb, first check the webserver and scheduler logs (on Kubernetes, kubectl logs?); maybe you can find a hint there.
If not, first try a "clean" Airflow instance without any of your DAG code or related .py files: point the dags folder to an empty directory and see what happens (better still, turn on the example DAGs configuration).
If that works, add the .py files from the original dags folder incrementally until you find the problematic code.
If it does not work, the scheduler or webserver is probably broken; check the logs again more carefully.
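Instead of adding files back one by one, you could also try importing each file outside of Airflow and see which ones blow up. This is only a rough sketch (the function name is made up): it uses plain importlib rather than Airflow's own loader, so it only catches errors raised while a file's top-level code runs, and it should be run in the same Python environment as Airflow.

```python
import importlib.util
from pathlib import Path


def find_broken_modules(dags_folder):
    """Import every .py file in dags_folder; return {filename: error summary}."""
    errors = {}
    for path in sorted(Path(dags_folder).glob("*.py")):
        spec = importlib.util.spec_from_file_location(path.stem, path)
        module = importlib.util.module_from_spec(spec)
        try:
            spec.loader.exec_module(module)  # runs the file's top-level code
        except Exception as exc:
            # Record the failure and keep going with the remaining files.
            errors[path.name] = f"{type(exc).__name__}: {exc}"
    return errors
```

An empty result means every file imported cleanly, which points away from the DAG code and back towards the webserver or scheduler setup.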
I found the answer myself:
The package I use to set up Airflow on Kubernetes has a step that runs ./airflow/www/compile_assets.sh using npm, but the package was missing the step to install npm. I added "apt install -y npm" to that step, and now the Airflow page is displayed correctly.

GCR Cloud Run says "Image [name] not found"

I'm trying to take my first baby steps with podman (instead of Docker) and Google Cloud Run. I've managed to build an image with a gcr.io tag and push it to Google. I then create a new service, and I can select the image in the "Select Image URL" pop-up dialog. But then the service fails to start, saying "Image [full name] not found".
I can't find anything on Google's support pages, or anywhere else. I can pull the image, I can push new versions, and they appear on the pop-up dialog. But the service still reports that they can't be found.
What am I doing wrong?
Edit in answer to DazWilkin's questions below:
Can you run the podman-created container locally using Docker?
I can't run Docker locally because it is not compatible with Fedora 31 (hence podman). But I can run it locally using podman run
Can you deploy a Docker-created container in Cloud Run?
As above: F31. However podman is supposed to be a drop-in replacement.
Is the container registry in the same project as Cloud Run?
Yes. I did have a problem with that, but I got a permissions message rather than "not found".
Have you tried deploying via gcloud rather than the console?
Yes.
$ podman push eu.gcr.io/my-project/hs-hello-world
Getting image source signatures
Copying blob c7f3d2e0289b done
Copying blob def7032cea8e done
Copying config f1c2e2615f done
Writing manifest to image destination
Storing signatures
$ gcloud run deploy --image eu.gcr.io/my-project/hs-hello-world --platform managed
Service name (hs-hello-world):
Deploying container to Cloud Run service [hs-hello-world] in project [my-project] region [europe-west1]
X Deploying... Image 'eu.gcr.io/my-project/hs-hello-world' not found.
X Creating Revision... Image 'eu.gcr.io/my-project/hs-hello-world' not found.
. Routing traffic...
Deployment failed
ERROR: (gcloud.run.deploy) Image 'eu.gcr.io/my-project/hs-hello-world' not found.
When I used a Google-built container it worked fine.
Update: 5 March 2020
In the end I just carried on with the Google build service, and it works fine. My initial wish for local builds was in large part because a build on Google was taking over half an hour (lots of Haskell libraries to import), but now I've figured out how to use staged builds and multi-processor VMs to avoid this. I appreciate the efforts of those who have tried to help, but right now it's not broke so I'm not going to try to fix it.
I had the same issue: it seems Cloud Run is picky about the kind of manifest it can pull.
By building my images with --format docker and pushing them with --remove-signatures (inspired by this issue), podman will create and push docker-style manifests to the Container Registry and everything ran smoothly!
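Spelled out, the two commands would look roughly like the output of the sketch below. The image name is taken from the question, the build context "." is an assumption, and the helper only prints the commands rather than running them.

```python
import shlex


def podman_commands(image, context="."):
    """Return build and push commands that produce a Docker-style manifest."""
    # --format docker makes podman write a Docker manifest instead of OCI.
    build = ["podman", "build", "--format", "docker", "-t", image, context]
    # --remove-signatures drops any image signatures when pushing.
    push = ["podman", "push", "--remove-signatures", image]
    return [shlex.join(build), shlex.join(push)]


for line in podman_commands("eu.gcr.io/my-project/hs-hello-world"):
    print(line)
```

After pushing this way, the gcloud run deploy command from the question can be run unchanged against the same image name.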
Too bad I spent a lot of time thinking it was a lack of permissions problem
I had the same error. My issue was that I was using the docker/setup-buildx-action in a GitHub action. When this was removed, Cloud Run was happy with the resulting manifest / container image.
Thanks to @André-Breda for providing the direction.
I've been having the same issue today. I'm using buildah to create the new image. I realized that the image I used successfully yesterday was built as root. So I built the new one as root and pushed it successfully.
Wish I knew why. The images built as my username ran fine locally with rootless podman.

Has anyone tried the HLF 2.0 feature "External Builders and Launchers" and wants to get in touch?

I'm working my way through the HLF 2.0 docs and would love to discuss and try out the new features "External Builders and Launchers" and "Chaincode as an external service".
My goal is to run HLF 2.0 on a K8s cluster (OpenShift). Does anyone want to get in touch, or has anyone already figured it out?
Cheers from Germany
I'm also trying to use the ExternalBuilder. I set up core.yaml and rebuilt the containers to use it. On "peer lifecycle chaincode install .tgz...", I get an error that the path to the scripts in core.yaml cannot be found.
I've added volume bind commands in peer-base.yaml and in docker-compose-cli.yaml, and am using the first-network setup. I dropped the part of byfn.sh that connects to the cli container so that I can do that part manually: the create, join, and update anchors steps succeed, and then the install fails. The install fails on /bin/detect, because it can't find that file to fork/exec it. To get that far, the peer was able to read my external configuration and the core.yaml file. At the moment I'm trying "mode: dev" in core.yaml, which seems to indicate that the scripts and the chaincode will be run "locally", which I think means they should run in the cli container. I also tried to walk the code to see how the Docker containers are created dynamically, and from what image, but haven't been able to nail that down yet.

Error creating template PredictionIO

I've created a lot of templates before. This time I was creating the Recommendation template, following the suggested steps.
$ pio template get PredictionIO/template-scala-parallel-recommendation Foo
Getting this error:
[ERROR] [Template$] Either PredictionIO/template-scala-parallel-universal-recommendation is not a valid GitHub repository, or it does not have any tag. Aborting.
How do I fix this, and why is it happening?
EDIT:
My PredictionIO version is 0.9.5, running on Ubuntu.
It seems this happens when you have run pio deploy for another template before pio template get, so you have to shut down the event server on its default port 7070:
$ lsof -wni tcp:7070
$ kill -9 PID
This solved the problem.
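To confirm that the port is really free after the kill, a quick check like this may help. It is a sketch with a made-up helper name; it only tests whether something accepts TCP connections on localhost, which is an assumption about where the event server listens.

```python
import socket


def port_in_use(port, host="127.0.0.1"):
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1.0)
        # connect_ex returns 0 on success and an error code otherwise.
        return sock.connect_ex((host, port)) == 0
```

port_in_use(7070) should return False once the event server has been stopped.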
I had this issue but this google group post had my solution. Basically pio template get is cloning a repository under the covers, so it can have git-related issues.
Check to see if you can access https://api.github.com/ from your web browser. If not, check the google group post.
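The same check can be done from the machine that runs pio, which matters if it sits behind a different proxy or firewall than your browser. A minimal sketch follows; the function name, the 10-second timeout and the opener parameter (there only so the check can be exercised without network access) are my own choices.

```python
import urllib.request


def can_reach(url="https://api.github.com/", timeout=10,
              opener=urllib.request.urlopen):
    """Return True if url answers with a success or redirect status."""
    try:
        with opener(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except OSError:
        # URLError and HTTPError are subclasses of OSError, as are timeouts
        # and refused connections.
        return False
```

If can_reach() returns False on that machine, the problem is likely network or Git related rather than anything in the template itself.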
Also there is no need to do pio template get, just clone it from github. The Universal Recommender is kept up-to-date in its home repo here: https://github.com/actionml/template-scala-parallel-universal-recommendation/tree/v0.3.0
Notice v0.3.0 is nearing release but is not in the template gallery yet.