Writing a startup script to google container engine

Writing a startup script to google container engine - kubernetes

I found that startup scripts can be added to Google compute instances using either the console or cli(gcloud). I want to add the startup scripts to google container engine.
The goal is to notify me when the google container engine has changed its state to Running. I though one efficient way is to use startup scripts in container engine, as these scripts will only be executed when the container's status is changed to running.
Any idea how to add startup scripts to container engine or any other way of notifying when the container's status changes to running.

First of all your question is fairly complicated. The concept of startup scripts do not belong to the containers world. As far as I know you can't add startup scripts in Google Container Engine. This is because Container Engine instances are immutable (e.g. you can't or you are not supposed to modify the operating system, you should just run containers).
If you're trying to run scripts when a container starts/stops you need to forget about startup scripts concept in the Compute Engine world. You can use container lifecycle hooks in Kubernetes (the orchestrator running in Container Engine).
Here's documentation and tutorial about it:
https://kubernetes.io/docs/concepts/containers/container-lifecycle-hooks/
https://kubernetes.io/docs/tasks/configure-pod-container/attach-handler-lifecycle-event/

You can approximate the behavior of startup scripts using a DaemonSet with a simple pod that runs in privileged mode. For example code, see https://github.com/kubernetes/contrib/tree/master/startup-script.

Project metadata works for this, here's a terraform example:
resource "google_compute_project_metadata_item" "main" {
project = abcdefg # this is optional
key = "startup-script"
value = "#! /bin/sh\necho hello > /tmp/world"
}

Related

Is there a Kubernetes pattern or concept equivalent to init containers that runs after the application has started?

I have an application that starts, is configured, then the configuration is run, via the application's API calls (i.e. Cannot be configured until after the application is running). The application also takes a bit of time to start up before it is ready to accept the configuration. This container is being run inside a Kubernetes cluster.
I want to create a helper container inside the pod that:
Uses a different language than what the application container has (add python, so it should be a separate container)
Loops/waits/sleeps until the application is ready (check for a url to be available)
Runs a script to configure the application, then start the configuration
Runs to completion and isn't always running
Is tied to the main application (automated, runs once if the main application is restarted, I can already do this if I just run it manually after a restart)
(Bonus) Runs the script to completion or shows the main application as failed
Basically I am looking for a lot of the same characteristics of an init container, except that it runs post application startup instead of pre application startup.
Anyone have an example or pattern to research that does this?
Thanks!

Best way to deploy long-running high-compute app to GCP

I have a python app that builds a dataset for a machine learning task on GCP.
Currently I have to start an instance of a VM that we have, and then SSH in, and run the app, which will complete in 2-24 hours depending on the size of the dataset requested.
Once the dataset is complete the VM needs to be shutdown so we don't incur additional charges.
I am looking to streamline this process as much as possible, so that we have a "1 click" or "1 command" solution, but I'm not sure the best way to go about it.
From what I've read about so far it seems like containers might be a good way to go, but I'm inexperienced with docker.
Can I setup a container that will pip install the latest app from our private GitHub and execute the dataset build before shutting down? How would I pass information to the container such as where to get the config file etc? It's conceivable that we will have multiple datasets being generated at the same time based on different config files.
Is there a better gcloud feature that suits our purpose more effectively than containers?
I'm struggling to get information regarding these basic questions, it seems like container tutorials are dominated by web apps.

It would be useful to have a batch-like container service that runs a container until its process completes. I'm unsure whether such a service exists. I'm most familiar with Google Cloud Platform and this provides a wealth of compute and container services. However -- to your point -- these predominantly scale by (HTTP) requests.
One possibility may be Cloud Run and to trigger jobs using Cloud Pub/Sub. I see there's async capabilities too and this may be interesting (I've not explored).
Another runtime for you to consider is Kubernetes itself. While Kubernetes requires some overhead in having Google, AWS or Azure manage a cluster for you (I strongly recommend you don't run Kubernetes yourself) and some inertia in the capacity of the cluster's nodes vs. the needs of your jobs, as you scale the number of jobs, you will smooth these needs. A big advantage with Kubernetes is that it will scale (nodes|pods) as you need them. You tell Kubernetes to run X container jobs, it does it (and cleans-up) without much additional management on your part.
I'm biased and approach the container vs image question mostly from a perspective of defaulting to container-first. In this case, you'd receive several benefits from containerizing your solution:
reproducible: the same image is more probable to produce the same results
deployability: container run vs. manage OS, app stack, test for consistency etc.
maintainable: smaller image representing your app, less work to maintain it
One (beneficial!?) workflow change if you choose to use containers is that you will need to build your images before using them. Something like Knative combines these steps but, I'd stick with doing-this-yourself initially. A common solution is to trigger builds (Docker, GitHub Actions, Cloud Build) from your source code repo. Commonly you would run tests against the images that are built but you may also run your machine-learning tasks this way too.
Your containers would container only your code. When you build your container images, you would pip install, perhaps pip install --requirement requirements.txt to pull the appropriate packages. Your data (models?) are better kept separate from your code when this makes sense. When your runtime platform runs containers for you, you provide configuration information (environment variables and|or flags) to the container.

The use of a startup script seems to better fit the bill compared to containers. The instance always executes startup scripts as root, thus you can do anything you like, as the command will be executed as root.
A startup script will perform automated tasks every time your instance boots up. Startup scripts can perform many actions, such as installing software, performing updates, turning on services, and any other tasks defined in the script.
Keep in mind that a startup script cannot stop an instance but you can stop an instance through the guest operating system.
This would be the ideal solution for the question you posed. This would require you to make a small change in your Python app where the Operating system shuts off when the dataset is complete.
Q1) Can I setup a container that will pip install the latest app from our private GitHub and execute the dataset build before shutting down?
A1) Medium has a great article on installing a package from a private git repo inside a container. You can execute the dataset build before shutting down.
Q2) How would I pass information to the container such as where to get the config file etc?
A2) You can use ENV to set an environment variable. These will be available within the container.
You may consider looking into Docker for more information about container.

NixOS within NixOS?

I'm starting to play around with NixOS deployments. To that end, I have a repo with some packages defined, and a configuration.nix for the server.
It seems like I should then be able to test this configuration locally (I'm also running NixOS). I imagine it's a bad idea to change my global configuration.nix to point to the deployment server's configuration.nix (who knows what that will break); but is there a safe and convenient way to "try out" the server locally - i.e. build it and either boot into it or, better, start it as a separate process?
I can see docker being one way, of course; maybe there's nothing else. But I have this vague sense Nix could be capable of doing it alone.

There is a fairly standard way of doing this that is built into the default system.
Namely nixos-rebuild build-vm. This will take your current configuration file (by default /etc/nixos/configuration.nix, build it and create a script allowing you to boot the configuration into a virtualmachine.
once the script has finished, it will leave a symlink in the current directory. You can then boot by running ./result/bin/run-$HOSTNAME-vm which will start a boot of your virtualmachine for you to play around with.
TLDR;
nixos-rebuild build-vm
./result/bin/run-$HOSTNAME-vm

nixos-rebuild build-vm is the easiest way to do this, however; you could also import the configuration into a NixOS container (see Chapter 47. Container Management in the NixOS manual and the nixos-container command).
This would be done with something like:
containers.mydeploy = {
privateNetwork = true;
config = import ../mydeploy-configuration.nix;
};
Note that you would not want to specify the network configuration in mydeploy-configuration.nix if it's static as that could cause conflicts with the network subnet created for the container.

As you may already know, system configurations can coexist without any problems in the Nix store. The problem here is running more than one system at once. For this, you need an isolation or virtualization tools like Docker, VirtualBox, etc.
NixOS Containers
NixOS provides an efficient implementation of the container concept, backed by systemd-nspawn instead of an image-based container runtime.
These can be specified declaratively in configuration.nix or imperatively with the nixos-container command if you need more flexibility.
Docker
Docker was not designed to run an entire operating system inside a container, so it may not be the best fit for testing NixOS-based deployments, which expect and provide systemd and some services inside their units of deployment. While you won't get a good NixOS experience with Docker, Nix and Docker are a good fit.
UPDATE: Both 'raw' Nix packages and NixOS run in Docker. For example, Arion supports images from plain Nix, NixOS modules and 'normal' Docker images.
NixOps
To deploy NixOS inside NixOS it is best to use a technology that is designed to run a full Linux system inside.
It helps to have a program that manages the integration for you. In the Nix ecosystem, NixOps is the first candidate for this. You can use NixOps with its multiple backends, such as QEMU/KVM, VirtualBox, the (currently experimental) NixOS container backend, or you can use the none backend to deploy to machines that you have created using another tool.
Here's a complete example of using NixOps with QEMU/KVM.
Tests
If the your goal is to run automated integration tests, you can make use of the NixOS VM testing framework. This uses Linux KVM virtualization (expose /dev/kvm in sandbox) to run integrations test on networks of virtual machines, and it runs them as a derivation. It is quite efficient because it does not have to create virtual machine images because it mounts the Nix store in the VM. These tests are "built" like any other derivation, making them easy to run.
Nix store optimization
A unique feature of Nix is that you can often reuse the host Nix store, so being able to mount a host filesystem in the container/vm is a nice feature to have in your solution. If you are creating your own solutions, depending on you needs, you may want to postpone this optimization, because it becomes a bit more involved if you want the container/vm to be able to modify the store. NixOS tests solve this with an overlay file system in the VM. Another approach may be to bind mount the Nix store forward the Nix daemon socket.

Can we consider using containers (and kubernetes) for monolith, stateful web applications?

I'm learning about Containers and Kubernetes and was evaluating if we can move our monolith, stateful appplication to kubernetes?
I was also looking at https://kubernetes.io/blog/2018/03/principles-of-container-app-design/ and "Self-Containment" looks close. We can consider using "storage".
Properties of my application:
1. Runs on a JVM
2. Does not have a database. Saves all its data/content to TAR files on the file-system
3. Should be able to backup and retain state if the container goes down.
In our current scenarios, we deploy the app to a VM and our IT teams generally take snapshots of these VM's as backups and restore them if the app fails or they have to restore to a point where the app was working good. I wanted to avoid this.
Please advice.

You call it as web application, but based on what it does it just a process which writes to file system.
If you move to k8s, write to NFS or persistent storage from pod. If you can only run one instance, then you can't use k8s horizontal scaling.

wait for other deployments to start running before other can be created?

I am creating the deployments/services using REST APIs. I send POST request with bodies which contain the JSON objects which create the applications on Openshift. After I call all the APIs, these objects get instantiated.
I have 2 deployments which are dependent on mongodb deployment but this mongodb takes a little longer to start running, while the two deployments which are dependent on mongodb start running earlier. This breaks the code inside the 2 deployments as the mongodb connection fails(since it is not up yet).
There could be 2 possible way I can fix this problem.
I put a delay after i create mongodb deployment and recursively call the API to check it's status if it is running or not.
Just like we make changes in docker-compose, with the key, depends-on which tell the docker-compose that all the dependencies should be started first and then the dependent container.
Is there any way this could be achieved in openshift?

Instead of implementing complex logic for dependency handling, use health checking mechanism of Kubernetes. If your application starts and doesn't see Mongo DB, let it crash. Kubernetes will keep restarting it until Mongo DB comes online, and your application becomes healthy and serving as well. Kubernetes won't send traffic to not yet healthy instances.
Docs: https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-probes/

Just like we make changes in docker-compose, with the key, depends-on which tell the docker-compose that all the dependencies should be started first and then the dependent container.
You might want to look into Init Containers for dependent container. They run to completion before container is actually started. Below excerpt is taken from referenced documentation (given below) for use cases that might be applicable to your issue:
They run to completion before any app Containers start, whereas app Containers run in parallel, so Init Containers provide an easy way to block or delay the startup of app Containers until some set of preconditions are met.
Examples
Here are some ideas for how to use Init Containers:
Wait for a service to be created with a shell command like:
for i in {1..100}; do sleep 1; if dig myservice; then exit 0; fi; done; exit 1
Register this Pod with a remote server from the downward API with a command like:
curl -X POST http://$MANAGEMENT_SERVICE_HOST:$MANAGEMENT_SERVICE_PORT/register -d ‘instance=$()&ip=$()’
Wait for some time before starting the app Container with a command like sleep 60.
Reference documentation:
https://kubernetes.io/docs/concepts/workloads/pods/init-containers/

Alex has pointed out correct practice to follow with kubernetes. But if you still want directly depend on other pod phase you can use this pod-dependency-init-container that I have build. This will check if any pod with given labels is running before starting your pod.

We Keep Coding

iphone swift flutter scala powershell matlab mongodb postgresql perl eclipse