How to modify Kubeflow source code before deploying it with Kubernetes? - kubernetes

I encountered the same issue as in https://github.com/kubeflow/kubeflow/issues/6014 with my Kubeflow app. The fix is very simple (just a type cast), so I would like to fix it myself and redeploy Kubeflow.
The problem is that I am running a k3s cluster on my local machine, where I have installed the Kubeflow bundle via Juju, so I cannot change the source code.
How to modify Kubeflow source code before deploying it with Kubernetes?
Should I use the manifests installation (https://github.com/kubeflow/manifests#installation), or a totally different method?
Thank you.

The bug was fixed in the latest version of the manifests, so I ended up installing Kubeflow directly from the manifests.
I am still in touch with a Kubeflow developer; I will post the right way to modify and redeploy here if anyone is interested.

You have to check out their GitHub repo, make your changes, and use kustomize to install as explained in their documentation. If you check the example folder you can see that it points to all the other component folders.
https://github.com/kubeflow/manifests#install-with-a-single-command
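For reference, the single-command install documented there is roughly the following (check the README for the exact, current command; the retry loop exists because some CRDs need a moment before dependent resources apply cleanly):

    # Clone the manifests repo and apply your own patches first
    git clone https://github.com/kubeflow/manifests.git
    cd manifests
    # ... edit the component manifests you need to change ...
    # Build everything from the example kustomization and apply it, retrying until it succeeds
    while ! kustomize build example | kubectl apply -f -; do
      echo "Retrying to apply resources"
      sleep 10
    done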
Another hack could be to look for the controllers in Kubernetes, e.g. the Deployments created for Kubeflow, and modify them directly; this works only if your changes are limited to Kubernetes resource definitions. I suggest going with the first option above for a clean development experience, and that way you can contribute back to the Kubeflow project as well if your changes will benefit others.
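As a sketch of that second option (the deployment, container, and image names here are placeholders, not real Kubeflow resource names):

    # List the controllers Kubeflow created
    kubectl get deployments -n kubeflow
    # Edit one in place; this opens the live manifest in $EDITOR
    kubectl edit deployment <some-kubeflow-deployment> -n kubeflow
    # Or make a targeted change, e.g. point a container at a patched image you built yourself
    kubectl set image deployment/<some-kubeflow-deployment> <container-name>=my-registry/my-patched-image:tag -n kubeflow

Keep in mind that edits made this way are lost if the resources are re-applied by the original installer (Juju, kustomize, etc.), so the first option is still the cleaner path.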

Related

Deploy Magento 2 on a Kubernetes cluster

I run Magento 2 locally with Docker.
It works like a charm.
But now I would like to take the next step and deploy a registry image of my Magento 2 (GitLab registry with GitLab CI in my case) to a k3d cluster on my personal server.
I have no problems generating the image in the GitLab registry and deploying the Service, the Node, and the Pod with Kubernetes on my server. To train myself and to be sure that what I did was correct, I tried to deploy another image in my cluster from the GitLab registry, built from a simple HTML page and a Dockerfile, and it works like a charm.
But the Magento 2 image does not. The problem is that we must run the setup / compile / etc. When I try to run composer install in my entrypoint, the pod logs say that there is no composer.json in the folder where Composer is supposed to run. So I guess I must configure ingress.yaml, the Services, and the Deployment in a specific way, but I don't know where to start or how.
My question is simply: if someone knows a good tutorial or some documentation links, I would be pleased to see them.
I have heard about Helm, hooks, etc., but I don't know how to use them at all. I don't even know what they are...
Thank you

Is there a way to package scripts in a container to run for diagnostic purposes against kubernetes?

The idea is that, instead of installing these scripts, they could perhaps be applied via YAML and run with access to kubectl and host tools to find potential issues in the running environment.
I figure the pod would need special elevated permissions, etc. I'm not quite sure if there is an example or even a better way of accomplishing the same idea.
Is there a way to package scripts in a container to run for diagnostic purposes against kubernetes?
It's an Alpha feature and not recommended for production use, but check out the ephemeral containers system: https://kubernetes.io/docs/concepts/workloads/pods/ephemeral-containers/
It's designed for exactly this: having a bundle of debugging tools that you can connect into an existing file/PID namespace. However, the feature is still incomplete, as it is being added incrementally.
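A minimal sketch of how that looks once the feature gate is enabled (on newer kubectl versions the subcommand is kubectl debug; while the feature was alpha it lived under kubectl alpha debug; the pod, container, and image names below are examples):

    # Attach an ephemeral debugging container to a running pod,
    # sharing the process namespace of one of its containers
    kubectl debug -it my-app-pod --image=busybox --target=my-app-container
    # From the resulting shell you can inspect the target container's
    # processes and filesystem with the tools bundled in the debug image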

What is the suggested workflow when working on a Kubernetes cluster using Dask?

I have set up a Kubernetes cluster using Kubernetes Engine on GCP to work on some data preprocessing and modelling using Dask. I installed Dask using Helm following these instructions.
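For reference, the install was roughly the standard chart install along these lines (release name is my own; the chart repo is the one the Dask docs point to):

    helm repo add dask https://helm.dask.org/
    helm repo update
    helm install my-dask dask/dask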
Right now, I see that there are two folders, work and examples.
I was able to execute the contents of the notebooks in the example folder confirming that everything is working as expected.
My questions now are as follows:
What is the suggested workflow to follow when working on a cluster? Should I just create a new notebook under work and begin prototyping my data preprocessing scripts?
How can I ensure that my work doesn't get erased whenever I upgrade my Helm deployment? Would you just manually move it to a bucket every time you upgrade (which seems tedious)? Or would you create a simple VM instance, prototype there, and then move everything to the cluster when running on the full dataset?
I'm new to working with data in a distributed environment in the cloud so any suggestions are welcome.
What is the suggested workflow to follow when working on a cluster?
There are many workflows that work well for different groups. There is no single blessed workflow.
Should I just create a new notebook under work and begin prototyping my data preprocessing scripts?
Sure, that would be fine.
How can I ensure that my work doesn't get erased whenever I upgrade my Helm deployment?
You might save your data to some more permanent store, like cloud storage, or a git repository hosted elsewhere.
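For example, a rough sketch of the bucket route before an upgrade (the namespace, pod name, and bucket are placeholders, and the work path assumes the chart's default Jupyter image layout):

    # Copy the work directory out of the Jupyter pod to your local machine
    kubectl cp <namespace>/<jupyter-pod-name>:/home/jovyan/work ./work-backup
    # Push it to a Cloud Storage bucket
    gsutil -m cp -r ./work-backup gs://my-dask-notebooks/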
Would you just manually move it to a bucket every time you upgrade (which seems tedious)?
Yes, that would work (and yes, it is).
Or would you create a simple VM instance, prototype there, and then move everything to the cluster when running on the full dataset?
Yes, that would also work.
In Summary
The Helm chart includes a Jupyter notebook server for convenience and easy testing, but it is no substitute for a full-fledged, long-term, persistent productivity suite. For that you might consider a project like JupyterHub (which handles the problems you list above) or one of the many enterprise-targeted variants on the market today. It would be easy to use Dask alongside any of those.

Deploying Kubernetes on bare metal rather than VM

Stupid question, but right now I'm deploying my Kubernetes cluster inside a VM. Is there a way to deploy it directly onto my machine?
I'm sure there has to be an easy fix, but many of the docs I've read have been focused on deploying it inside a VM.
I am assuming you are using some flavor of Linux; otherwise the information below won't be useful to you.
The easiest way to do a bare-metal deployment ("onto your machine") is to use kubeadm. The documentation for that is excellent.
(If you need help with that, reply with your exact OS flavor and version and I can edit this answer to reflect that specific situation.)
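The core of a kubeadm install looks something like the following (taken from the kubeadm docs; the pod-network CIDR shown is just the common Flannel default, and you still need to install a pod network add-on afterwards):

    # On the machine that will be the control plane, as root
    kubeadm init --pod-network-cidr=10.244.0.0/16
    # Then, as your regular user, copy the kubeconfig so kubectl works
    mkdir -p $HOME/.kube
    sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
    sudo chown $(id -u):$(id -g) $HOME/.kube/config
    # For a single-machine cluster, also remove the control-plane taint
    # so regular workloads can be scheduled on that node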

Kubernetes User Interface

I was going through a Kubernetes tutorial on YouTube and found the following UI, which shows the pod and service arrangement of a Kubernetes cluster. How can I install this UI in my Kubernetes setup?
In order to use this UI, go to the saturnism/gcp-live-k8s-visualizer GitHub repo and follow the steps there.
The code for that UI is from https://github.com/brendandburns/gcp-live-k8s-visualizer.
The visualizer expects some specific labels to be on the pods/services for them to be displayed. It was built for a demo, and I don't think it was generalized to work on arbitrary deployments.
As Robert Bailey pointed out, the brendandburns and saturnism versions are not generalized scripts, but require small modifications to your resource labels (such as labeling things with "name" or "uses").
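As a rough illustration of what that means (the label keys follow the convention mentioned above; the resource names and values are made up, and you should check the visualizer's README for the exact keys it expects):

    # Label a pod and a service so the visualizer can pick them up and draw the relationship
    kubectl label pod my-frontend-pod name=frontend uses=backend
    kubectl label service my-backend-service name=backend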
Maybe this version can help you:
https://github.com/0ortmann/k8s-visualizer
It features minimal configuration options: you can configure which labels you want the script to use, so you do not need to change your actual setup.
Please contact me if you run into issues.