Has there been a fix for this issue related to celery? https://github.com/celery/celery/issues/3519 - celery

Celery seems to be picking up both my old code and my new code. I have tried clearing the cache, clearing the broker queue (Redis), restarting Celery, etc., but none of these fixed the issue.
For context, we push new releases to various servers periodically. The web application uses Django REST Framework on the backend and Celery for scheduling asynchronous tasks. Recently, when we deployed a new version of the code, the application behaved very strangely: it mixed artifacts from the old code with parts of the new code. This was baffling until we found a GitHub thread (https://github.com/celery/celery/issues/3519) that describes exactly the issue we faced. There are no good answers in that thread, so I'm posting here in case anyone with Celery knowledge knows a workaround to stop Celery from picking up the old artifacts.
The deployment is done through Jenkins build scripts; the relevant script is below. For obvious reasons, I have replaced our application name with "proj".
sudo /bin/systemctl stop httpd
sudo /bin/systemctl stop celery
/bin/redis-cli flushall
/srv/proj/bin/pip install --no-cache-dir --upgrade -r /srv/proj/requirements/staging.txt
/usr/bin/git fetch
/usr/bin/git fetch --tags
/usr/bin/git checkout $TAG
/srv/proj/bin/python manage.py migrate
sudo /bin/systemctl restart httpd
sudo /bin/systemctl restart proj
sudo /bin/systemctl start celery

OP, the problem is almost invariably that your servers include old code on them. There are two possible issues:
1. The checkout command fails. checkout can fail to switch branches if there are local changes in the directory. In addition, the deployment script doesn't pull the latest changes to the server. The more common approach we use is to keep the venv in a separate directory and, on each deploy, clone the repository into a new (versioned) directory, then switch the "main" app directory symlink to point to the latest version. The latter is, for example, the approach used by AWS Elastic Beanstalk.
2. There are stray Celery processes still running. Depending on how you started/stopped Celery, there could be stray worker processes still hanging around and running your old code.
Ultimately, you won't be able to diagnose this problem without confirming that all of your servers are 100% identical. So, if you have devs or admins that ssh to these boxes, chances are that one or more of them made some change that affected (1) or (2).
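As a rough illustration of both checks, here is a minimal sketch you could run on a server between the stop and start steps of the Jenkins script (the paths and unit names are taken from the script above and may differ from your actual setup):
# Confirm the working tree is clean and the intended tag is actually checked out
cd /srv/proj && /usr/bin/git status --porcelain && /usr/bin/git describe --tags
# After stopping the managed worker, look for stray Celery processes still running old code
sudo /bin/systemctl stop celery
pgrep -af celery
# If stray workers remain, terminate them before starting the new release
pkill -f 'celery worker'
If pgrep still lists workers after systemctl stop, that is a strong sign of issue (2).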

Related

Very Slow Upload Times to GCP Apt Artifact registry

I have a CI/CD system uploading numerous large .deb packages into a Google Cloud Artifact Registry Apt repository. The normal upload time is roughly 10 seconds for the average package. Yesterday, all uploads to this artifact registry started to hang until they were either terminated by an external trigger or timed out (after more than 30 minutes).
Any attempt to delete packages from the registry also times out without deleting the package.
The command I have been using to upload is:
gcloud artifacts apt upload ${ARTIFACT_REPOSITORY} --location=${ARTIFACT_LOCATION} --project ${ARTIFACT_PROJECT} --source=${debPackageName} --verbosity=debug
I started by updating the gcloud SDK to the latest version:
Google Cloud SDK 409.0.0
alpha 2022.11.04
beta 2022.11.04
bq 2.0.81
bundled-python3-unix 3.9.12
core 2022.11.04
gcloud-crc32c 1.0.0
gsutil 5.16
I tried deleting packages, thinking perhaps the artifact registry was getting bloated, using this command:
gcloud artifacts packages delete --location={LOCATION} --project {PROJECT} --repository={REPOSITORY} {PACKAGE} --verbosity=debug
But I consistently get:
"message": "Deadline expired before operation could complete."
The debug output from the original command and the delete command both spam this kind of message:
DEBUG: https://artifactregistry.googleapis.com:443 "GET /v1/projects/{PROJECT}/locations/{LOCATION}/operations/f9885192-e1aa-4273-9b61-7b0cacdd5023?alt=json HTTP/1.1" 200 None
When I created a new repository I was able to upload to it without the timeout issues.
I'm the lead for Artifact Registry. Firstly, apologies that you're seeing this kind of latency on update operations to Apt repositories. It is most likely caused by regenerating the index for the repo: the bigger the repo gets, the longer this takes.
If you do a bunch of individual uploads/deletes, the index generation queues up and you get timeouts. We did change some of the locking behavior around this recently, so we may have inadvertently swapped one performance issue for another.
We are planning to stop doing the index generation in the same transaction as the file modification. Instead we'll generate it asynchronously, and will look at batching or de-duping so that less work is done for a large number of individual updates. It will mean that the index isn't up-to-date the moment the upload call finishes, but will be eventually consistent.
We're working on this now as a priority but you may not see changes in performance for a few weeks. The only real workaround is to do less frequent updates or to keep the repositories smaller.
Apologies again, we definitely want to get this working in a performant way.
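Until then, a rough sketch of the "less frequent updates" workaround is to stage the .debs from a build and push them in one sequential pass, bounding each attempt so a stuck index rebuild cannot hang the pipeline (the staging directory and timeout value are placeholders; the gcloud flags are the same ones used above):
# Push all staged .debs in one batch instead of one CI job per package
for deb in /tmp/staged-debs/*.deb; do
  timeout 600 gcloud artifacts apt upload "${ARTIFACT_REPOSITORY}" \
    --location="${ARTIFACT_LOCATION}" \
    --project "${ARTIFACT_PROJECT}" \
    --source="${deb}" || echo "upload timed out or failed: ${deb}"
done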

How to manage software updates on docker-compose with one machine per user architecture?

We are deploying a Java backend and React UI application using docker-compose. Our Docker containers are running Java, Caddy, and Postgres.
What's unusual about this architecture is that we are not running the application as a cluster. Each user gets their own server with their own subdomain. Everything is working nicely, but we need a strategy for managing/updating machines as the number of users grows.
We can accept some down time in the middle of the night, so we don't need to have high availability.
We're just not sure what would be the best way to update software on all machines. And we are pretty new to Docker and have no experience with Kubernetes or Ansible, Chef, Puppet, etc. But we are quick to pick things up.
We expect to have hundreds to thousands of users. Each machine runs the same code but has environment variables that are unique to the user. Our original provisioning takes care of that, so we do not anticipate having to change those with software updates. But a solution that can also provide that ability would not be a bad thing.
So, the question is, when we make code changes and want to deploy the updated Java jar or the React application, what would be the best way to get those out there in an automated fashion?
Some things we have considered:
Docker Hub (concerns about rate limiting)
Deploying our own Docker repo
Kubernetes
Ansible
https://containrrr.dev/watchtower/
Other things that we probably need include GitHub actions to build and update the Docker images.
We are open to ideas that are not listed here, because there is a lot we don't know about managing many machines running docker-compose. So please feel free to offer suggestions. Many thanks!
In your case I advise using Kubernetes in combination with a CD tool. One of them is Buddy. I think it is the best way to make such updates in an automated fashion. Of course you can use just Kubernetes, but with Buddy or another CD tool you will make it faster and easier. In my answer I describe Buddy, but there are a lot of popular CD tools for automating workflows in Kubernetes, for example GitLab or CodeFresh.io; you should pick whichever is actually best for you. Take a look: CD-automation-tools-Kubernetes.
With Buddy you can avoid most of the steps below when automating updates (executing the kubectl apply / kubectl set image commands) by doing a simple push to Git.
Every time you update your application code or Kubernetes configuration, you have two ways to update your cluster: kubectl apply or kubectl set image (a minimal example follows the list below).
Such a workflow most often looks like this:
1. Edit application code or configuration .YML file
2. Push changes to your Git repository
3. Build a new Docker image
4. Push the Docker image
5. Log in to your K8s cluster
6. Run kubectl apply or kubectl set image to apply the changes to the K8s cluster
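For step 6, a minimal example of the two commands (the manifest path, Deployment name, container name, and image tag are hypothetical placeholders):
# Re-apply the edited manifest...
kubectl apply -f k8s/deployment.yml
# ...or point the existing Deployment's container at the new image tag
kubectl set image deployment/myapp myapp=registry.example.com/myapp:1.2.3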
Buddy is a CD tool that you can use to automate your whole K8s release workflow, for example:
managing Dockerfile updates
building Docker images and pushing them to the Docker registry
applying new images on your K8s cluster
managing configuration changes of a K8s Deployment
etc.
With Buddy you will have to configure just one pipeline.
With every change in your app code or the YAML config file, this tool will apply the deployment and Kubernetes will start transforming the containers to the desired state.
Pipeline configuration for running Kubernetes pods or jobs
Assume that we have an application on a K8s cluster and its repository contains:
source code of our application
a Dockerfile with instructions on creating an image of your app
DB migration scripts
a Dockerfile with instructions on creating an image that will run the migration during the deployment (db migration runner)
In this case, we can configure a pipeline that will:
1. Build application and migrate images
2. Push them to the Docker Hub
3. Trigger the DB migration using the previously built image. We can define the image, commands, and deployment in a YAML file (see the sketch after this list).
4. Use either Apply K8s Deployment or Set K8s Image to update the image in your K8s application.
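For step 3 in particular, the migration can be run as a one-off Job that the pipeline waits on before updating the Deployment; a minimal sketch, with the manifest path and job name as placeholders:
# Run the DB migration image as a one-off Job and block until it completes
kubectl apply -f k8s/db-migrate-job.yml
kubectl wait --for=condition=complete --timeout=300s job/db-migrate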
You can adjust the above workflow to suit your environment and application.
Buddy supports GitLab as a Git provider. Integrating the two tools is easy and only requires authorizing GitLab in your profile. Thanks to this integration you can create pipelines that will build, test, and deploy your app code to the server. Of course, if you are already using GitLab, there is no need to set up Buddy as an extra tool, because GitLab is itself a CD tool for automating workflows in Kubernetes.
You can find more information here: buddy-workflow-kubernetes.
Read also: automating-workflows-kubernetes.
As it turns out, we found that a paid Docker Hub plan addressed all of our needs. I appreciate the excellent information from @Malgorzata.
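For anyone landing here with a similar setup, that route boils down to pushing tagged images to Docker Hub from CI and having each user's machine pull and restart its compose stack; a minimal per-host sketch, where the host list, SSH user, and compose directory are hypothetical:
# Run the same update on every user's machine over SSH
while read -r host; do
  ssh "deploy@${host}" 'cd /opt/app && docker compose pull && docker compose up -d --remove-orphans'
done < hosts.txt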

Local testing of Perl repository using Travis CI (with docker)

I'd like to fix a bug in a Perl repository (not owned by me; I have just submitted some pull requests), but at the moment it's failing its Travis CI tests (it was already failing before my pull requests).
My goal is to be able to run Travis CI tests locally starting from the repository's .travis.yml.
Note that I'm totally new to Travis CI.
Following others' solutions that pointed to this FAQ (http://web.archive.org/web/20180929150027/https://docs.travis-ci.com/user/common-build-problems/#troubleshooting-locally-in-a-docker-image), which as you can see is no longer officially available on travis-ci.com, I tried:
sudo docker pull travisci/ci-amethyst:packer-1512508255-986baf0
sudo docker run --name travis-debug -dit travisci/ci-amethyst:packer-1512508255-986baf0 /sbin/init
sudo docker exec -it travis-debug bash -l
From the container:
su - travis
git clone https://github.com/{user}/{repo}.git
Now I don't know how to build the bash script to run the tests, as the last two steps (manually install dependencies / run your Travis CI build) look cryptic: I don't know how to run the build, and installing dependencies manually possibly leads to a lack of reproducibility (how do I know I'll get the same results as the cloud test?).
I also tried starting from the procedure described here (https://github.com/travis-ci/travis-build), but I get the error 'Could not locate Gemfile or .bundle/ directory', so I am probably missing some steps.
For what it's worth, I think you are going at it from the wrong angle.
Travis is just running your stuff remotely. Instead of bringing Travis to your machine, you need to make your tests pass locally first - cryptic or not - there is no way around it, especially if you are going to own this repo.
Another reason I am recommending this, is that - as you have already witnessed - the develop-debug-fix cycle is much lengthier when you rely on something to test your code remotely.
It has been my experience that your .travis.yml should be super simple, since it just runs one or two scripts or commands that can comfortably run locally.
If you are comfortable with Docker, I would consider building a local Dockerfile with all the dependencies, and bring your tests to work in your docker environment. Once you succeeded with this step, asking Travis to do the same (run tests in a docker) is trivial.
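As a rough sketch of that local Docker approach for a Perl repo, assuming the tests live under t/ and the dependencies are declared in a cpanfile or Makefile.PL (the image tag is only an example; the official perl images ship with cpanm):
# Run the test suite inside a throwaway Perl container, mounting the checkout
docker run --rm -v "$PWD":/work -w /work perl:5.36 \
  bash -c 'cpanm --installdeps --notest . && prove -lr t'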
Not sure if it is the answer you were looking for, but it was too long for a comment.

Building and deploying from a remote server with Capistrano

I'm new to Capistrano and struggling a little to get started. A brief description of what I need to do:
git pull the latest code from our git repo, on a central build server. This build server's environment matches the deployment environment exactly. I need the code to be built here. I don't want to deploy a binary that was built on a Mac laptop, for example.
compile the binary on this machine.
deploy it from this machine to all the target machines.
There is a shared user we can all SSH into on the build machine to do the builds.
The build machine is behind a gateway machine, not directly accessible.
All of the deployment target machines also have this shared user and are also behind the gateway.
The deployed binary is a single executable, and there is an init script on the target machines. After deploying the binary and changing the symlink to it, restart the service via the init script.
Everyone has appropriate SSH keys and agent forwarding for all necessary tasks.
So in principle it seems rather simple, but Capistrano seems opinionated and a bit magical. As a result I'm not sure how to accomplish all of this. It seems like it wants to check out my code and copy it to the remote machines, for example without building it first.
I think I need to ignore all of Capistrano's default smarts and just make it run some shell commands on the appropriate servers. In pseudo-code:
ssh buildmachine via gateway "cd repo && git pull && make"
ssh targetmachine(s) via gateway "scp buildmachine:repo/binary .; <mv && symlink>; service foo restart"
Am I even using the right tool for the job? It seems a lot like a round peg in a square hole.
Can someone explain to me what the contents of the Capistrano configuration files should be, and what cap commands I'd run to accomplish this?
BTW, I've searched around and looked at questions like "deploying with capistrano with remote git repo but without git running on production server" and "From manual pull on server to Capistrano".
The question is rather old, but you never know when someone steps onto it in need of information...
First and foremost, consider that Capistrano might just not be the right tool for the job you want to do.
That said, it is not impossible to accomplish what you expect. While I would avoid it in projects that deploy a large number of files and modify them (CSS/JS minification, JS builds, etc.), in your case you can consider running a "deployment repository" and configuring it in Capistrano as the source. Your process would look like this:
run the local build with whatever tools you need
upload resulting binary to a deployment repository
run Capistrano, which will connect to the application servers, fetch the fresh binary from the repository, perform any required server-side tasks, and symlink to "current"
As a side effect, you end up with a full history of deployed binaries.
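A rough sketch of the first two steps, run from a workstation that can reach the gateway (host names, the gateway alias, and repo paths are placeholders):
# Build on the machine that matches production, going through the gateway
ssh -J gateway builduser@buildhost 'cd ~/repo && git pull && make'
# Pull the resulting binary down and commit it to the deployment repository
scp -o ProxyJump=gateway builduser@buildhost:~/repo/binary deploy-repo/binary
cd deploy-repo && git add binary && git commit -m "build $(date +%Y%m%d%H%M)" && git push
# Capistrano is then pointed at deploy-repo as its :repo_url and deploys as usual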

How does capistrano "run_locally" work with branches?

I have a capistrano task that uses "run_locally" to compass compile/compress my css files and then upload them to the server.
Is it going to be smart and run that on the git branch that's getting deployed, or will it just run on the branch that I currently have in my working copy?
I'd want it to run on the branch that's getting deployed regardless of what I have checked locally. If it's not smart about this would I instead need to run_locally a git checkout on the branch that's getting deployed before running the compile command?
It runs on your current local code, so it matters what code is checked out there. As you mentioned, you can try to ensure that you run the version you are going to deploy.
A better approach would be to do the compilation work on the server.
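If you do want run_locally to compile the branch being deployed rather than whatever happens to be checked out, the task essentially has to switch to that revision first; a minimal sketch of the commands such a task would run, assuming a clean working copy (the branch name is a placeholder):
# Remember the current branch, build the deployed branch's assets, then switch back
current=$(git rev-parse --abbrev-ref HEAD)
git checkout deploy-branch
compass compile
# ...upload the compiled CSS here (scp/rsync), as the existing task already does
git checkout "$current"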