Deploying with docker push is slow because there are many images

I'm trying to deploy via Docker. I'm using the following workflow (roughly the commands sketched below):
1. Build locally
2. Push my image to Docker Hub
3. On the server: pull the image
4. On the server: start the image
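In commands, that is roughly the following (image name, tag, and server are placeholders):
# on my machine: build and push
docker build -t myuser/myapp:latest .
docker push myuser/myapp:latest
# on the server: pull the new image and restart the container
ssh deploy@myserver '
  docker pull myuser/myapp:latest
  docker rm -f myapp 2>/dev/null || true
  docker run -d --name myapp -p 80:3000 myuser/myapp:latest
'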
But docker push takes FOREVER. There are like 30 images, and it has to walk through each one and say "Image already exists". Is there any way to speed this up?
Alternatively, should I be using a different process to deploy?

If you are pushing to AWS ECR, as I was, it may be that Docker on your local machine needs a restart. See this thread about AWS ECR slowness:
https://forums.aws.amazon.com/thread.jspa?threadID=222834
This may affect other platforms as well. At least with Docker around 1.12.1 on Mac, there are slowness issues that go away after restarting Docker.

If you're using a local registry, we recently added a Redis cache, which has helped speed things up tremendously. Details on how to set this up are on the registry's GitHub page:
https://github.com/docker/docker-registry
While pushing new images still takes time, pulls are very fast, as all layers are in the Redis cache.

The most likely reason you are pushing many or large layers on every deployment is that your Dockerfiles are not optimized. Here is a nice intro: http://blog.tutum.co/2014/10/22/how-to-optimize-your-dockerfile/.
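The gist of that advice, as a minimal sketch for a Node.js app (base image and file names are illustrative): copy the dependency manifest and install dependencies before copying the frequently changing source code, so that on most deployments only the final layers are rebuilt and pushed.
# Rarely changing layers first: base image and dependencies
FROM node:20-slim
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci --omit=dev
# Frequently changing layer last: the application source
COPY . .
CMD ["node", "server.js"]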

Related

Does Skaffold risk overloading a registry when used with a remote cluster?

While most layers of a given image would probably be reused during development and only pushed once, it seems that pushing new images/layers to a registry on each code change would fill up a registry rather quickly - especially with a team of developers.
Is this the case with Skaffold or does it have a way to manage that?

Heroku manual deploy keeps switching to Master

In Heroku, I'm connected to Git. I want to deploy my Dev branch, and I can select it.
When I manually deploy, it does its thing (deploys my website to Heroku). But my website has Master branch code. I go back to Heroku and it's on Master.
If I select Dev as the branch for either Manual or Automatic, then reload the page, it switches back to Master. Below is a screenshot of me setting the branch to dev. If I do a browser refresh, it resets to Master.
I tried reconnecting Github. Not sure what else it could be.
Deploying Dev was working up until yesterday.
Here is a screenshot of how I manually deploy (as opposed to auto deploy) from the Heroku Deployment tab.
Edit: I should also add, I was happily on Dev and could deploy Dev updates until recently. I deployed Master by mistake, but can't go back to Dev.
I ended up having a corrupt Collection / DB record. I was tipped off on another forum that the symptoms I was seeing (the Nightscout web app not displaying some data, which was the issue I was trying to work around with the Heroku deploy, not the Heroku deploy problem itself) could be caused by that. So as a last resort I dropped the entire Mongo Collection, and I can now deploy Master and Dev, and the branch selection sticks in Heroku.
I don't know the significance since the data should be separate from the web app source code itself.
The whole reason I wanted to try Dev was for a fix for parts of the app not working. After initialising the Mongo DB Collection, I can use Master, so Dev (and the fix it contained) is not needed.
I know this isn't the exact root cause, but I'll leave this here in case someone comes across it and hasn't thought to look at the data.

How to manage software updates on docker-compose with one machine per user architecture?

We are deploying a Java backend and React UI application using docker-compose. Our Docker containers are running Java, Caddy, and Postgres.
What's unusual about this architecture is that we are not running the application as a cluster. Each user gets their own server with their own subdomain. Everything is working nicely, but we need a strategy for managing/updating machines as the number of users grows.
We can accept some down time in the middle of the night, so we don't need to have high availability.
We're just not sure what would be the best way to update software on all machines. And we are pretty new to Docker and have no experience with Kubernetes or Ansible, Chef, Puppet, etc. But we are quick to pick things up.
We expect to have hundreds to thousands of users. Each machine runs the same code but has environment variables that are unique to the user. Our original provisioning takes care of that, so we do not anticipate having to change those with software updates. But a solution that can also provide that ability would not be a bad thing.
So, the question is, when we make code changes and want to deploy the updated Java jar or the React application, what would be the best way to get those out there in an automated fashion?
Some things we have considered:
Docker Hub (concerns about rate limiting)
Deploying our own Docker repo
Kubernetes
Ansible
https://containrrr.dev/watchtower/
Other things that we probably need include GitHub Actions to build and update the Docker images.
We are open to ideas that are not listed here, because there is a lot we don't know about managing many machines running docker-compose. So please feel free to offer suggestions. Many thanks!
In your case I advise using Kubernetes in combination with a CD tool. One such tool is Buddy. I think this is the best way to make such updates in an automated fashion. You can of course use just Kubernetes, but with Buddy or another CD tool the process is faster and easier. In this answer I describe Buddy, but there are many popular CD tools for automating workflows in Kubernetes, for example GitLab or CodeFresh.io - you should pick whichever actually fits you best. Take a look: CD-automation-tools-Kubernetes.
With Buddy you can automate updates and avoid most of these steps (running kubectl apply or kubectl set image) by doing a simple push to Git.
Every time you update your application code or Kubernetes configuration, you have two ways to update your cluster: kubectl apply or kubectl set image.
Such a workflow most often looks like this:
1. Edit the application code or configuration .yml file
2. Push the changes to your Git repository
3. Build a new Docker image
4. Push the Docker image
5. Log in to your K8s cluster
6. Run kubectl apply or kubectl set image to apply the changes to the K8s cluster (sketched below)
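For reference, the manual commands behind steps 5 and 6 look roughly like this (deployment, container, and image names are placeholders):
# apply an updated manifest...
kubectl apply -f deployment.yml
# ...or point the existing Deployment at a new image tag and watch the rollout
kubectl set image deployment/my-backend my-backend=registry.example.com/my-backend:v2
kubectl rollout status deployment/my-backend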
Buddy is a CD tool that you can use to automate your whole K8s release workflow, for example:
managing Dockerfile updates
building Docker images and pushing them to the Docker registry
applying new images on your K8s cluster
managing configuration changes of a K8s Deployment
etc.
With Buddy you will have to configure just one pipeline.
With every change in your app code or the YAML config file, this tool will apply the deployment and Kubernetes will start transforming the containers to the desired state.
Pipeline configuration for running Kubernetes pods or jobs
Assume that we have an application on a K8s cluster and its repository contains:
source code of our application
a Dockerfile with instructions on creating an image of your app
DB migration scripts
a Dockerfile with instructions on creating an image that will run the migration during the deployment (db migration runner)
In this case, we can configure a pipeline that will:
1. Build the application and migration images
2. Push them to Docker Hub
3. Trigger the DB migration using the previously built image. We can define the image, commands, and deployment in a YAML file.
4. Use either Apply K8s Deployment or Set K8s Image to update the image in your K8s application.
You can adjust the above workflow to fit your environment and application.
Buddy supports GitLab as a Git provider. Integrating the two tools is easy and only requires authorizing GitLab in your profile. Thanks to this integration you can create pipelines that will build, test, and deploy your app code to the server. Of course, if you are already using GitLab, there is no need to set up Buddy as an extra tool, because GitLab is itself a CD tool for automating workflows in Kubernetes.
You can find more information here: buddy-workflow-kubernetes.
Read also: automating-workflows-kubernetes.
As it turns out, we found that a paid Docker Hub plan addressed all of our needs. I appreciate the excellent information from @Malgorzata.

How do I update my application running in my users' clusters?

I'm building a cluster visualization tool for Kubernetes that runs inside users' clusters.
My goal is to make this tool freely available. The most obvious way to distribute it is to tell people to kubectl apply -f www.ourgithub/our-configs.yaml, which pulls our images and voila.
That's all fine. Now the problem is how do we push updates?
I've considered these options but none seem very good:
Using something like https://github.com/chartmuseum/helm-push
Having the apps themselves check for updates and "restart" themselves (i.e. imagePullPolicy: Always and scale to 0; see the sketch after this list)
Having users download an executable on their machines that periodically checks for updates
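For the second option, assuming the Deployment's containers use imagePullPolicy: Always, the restart could be sketched like this (names are placeholders):
# force the pods to be recreated so the node re-pulls the tag
kubectl scale deployment my-visualizer --replicas=0
kubectl scale deployment my-visualizer --replicas=1
# or, on kubectl 1.15+, simply:
kubectl rollout restart deployment/my-visualizer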
I want to be able to push updates reliably so I want to make sure I'm using the most robust method there is.
What is the best practice for this?
Use a separate CI/CD pipeline for building and testing Docker images, and a separate pipeline for deploying.
Your pipeline should deploy the version of the application that is already running on the environment, then deploy the new one, run e2e tests to verify everything is correct, and only then push the new version to the desired cluster.

Deploy a Docker image without using a repository

I'm building a Docker image on my build server (using TeamCity). After the build is done I want to take the image and deploy it to some server (staging, production).
All tutorials I have found either
push the image to some repository where it can be downloaded (pulled) by the server(s), which in small projects introduces additional complexity, or
use a Heroku-like approach and build the images "near" or at the machine where they will be run.
I really think that nothing special should be done on the (app) servers. Images, IMO, should act as closed, self-sufficient binaries that represent the application as a whole and can be passed between the build server, testing, QA, etc.
However, when I save a standard NodeJS app based on the official node repository, it is 1.2 GB. Passing such a file from server to server is not very comfortable.
Q: Is there some way to export/save and "upload" just the changed parts (layers) of an image via SSH without introducing the complexity of a Docker repository? The server would then pull the missing layers from the public hub.docker.com in order to avoid the slow upload from my network to the cloud.
Investigating the content of a saved tar file, it should not be difficult from a technical point of view. The push command does basically just that - it never uploads layers that are already present in the repo.
Q2: Do you think that running a small repo on the docker-host that I'm deploying to in order to achieve this is a good approach?
If your code can live on GitHub or BitBucket, why not just use Docker Hub automated builds for free? That way, on your node you just have to docker pull user/image. The GitHub repository and the Docker Hub automated build can both be private, so you don't have to expose your code to the world, although you may have to pay for more than one private repository or build.
If you do still want to build your own images, then when you run the build command you see output similar to the following:
Step 0 : FROM ubuntu
---> c4ff7513909d
Step 1 : MAINTAINER Maluuba Infrastructure Team <infrastructure@maluuba.com>
---> Using cache
---> 858ff007971a
Step 2 : EXPOSE 8080
---> Using cache
---> 493b76d124c0
Step 3 : RUN apt-get -qq update
---> Using cache
---> e66c5ff65137
Each of the hashes, e.g. ---> c4ff7513909d, is an intermediate layer. You can find folders named with that hash under /var/lib/docker/graph, for example:
ls /var/lib/docker/graph | grep c4ff7513909d
c4ff7513909dedf4ddf3a450aea68cd817c42e698ebccf54755973576525c416
As long as you copy all the intermediate layers to your deployment server, you won't need an external Docker repository. If you are only changing one of the intermediate layers, you only need to recopy that one for a redeployment. Notice that each step listed in the Dockerfile leads to an intermediate layer. As long as you only change the last line in the Dockerfile, you will only need to upload one layer. Therefore I would recommend putting your ADD code line at the end of your Dockerfile:
ADD MyGeneratedCode /var/my_generated_code
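For completeness, the registry-free route the question asks about can be sketched with docker save piped over SSH into docker load. Note that this transfers the whole image rather than only the changed layers; host and image names are placeholders:
# export the image, compress it, and load it on the target host in one pipe
docker save myuser/myapp:latest | gzip | ssh deploy@staging 'gunzip | docker load'
# then start it on the target host as usual
ssh deploy@staging 'docker run -d --name myapp -p 80:3000 myuser/myapp:latest'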