Very Slow Upload Times to GCP Apt Artifact registry - google-artifact-registry

I have a CI/CD system uploading numerous large .deb packages into a Google Cloud Artifact Registry repository for Apt packages. The normal upload time is roughly 10 seconds for the average package. Yesterday all of the upload commands to this artifact registry started to hang until they are either terminated by an external trigger or time out (over 30 minutes).
Any attempt to delete packages from the registry times out without deleting the package.
The command I have been using to upload is:
gcloud artifacts apt upload ${ARTIFACT_REPOSITORY} --location=${ARTIFACT_LOCATION} --project ${ARTIFACT_PROJECT} --source=${debPackageName} --verbosity=debug
I started by updating all gcloud components to the latest version:
Google Cloud SDK 409.0.0
alpha 2022.11.04
beta 2022.11.04
bq 2.0.81
bundled-python3-unix 3.9.12
core 2022.11.04
gcloud-crc32c 1.0.0
gsutil 5.16
I tried to delete packages, thinking perhaps the artifact registry was getting bloated, using this command:
gcloud artifacts packages delete --location={LOCATION} --project {PROJECT} --repository={REPOSITORY} {PACKAGE} --verbosity=debug
But I consistently get:
"message": "Deadline expired before operation could complete."
The debug output from the original command and the delete command both spam this kind of message:
DEBUG: https://artifactregistry.googleapis.com:443 "GET /v1/projects/{PROJECT}/locations/{LOCATION}/operations/f9885192-e1aa-4273-9b61-7b0cacdd5023?alt=json HTTP/1.1" 200 None
When I created a new repository I was able to upload to it without the timeout issues.
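For what it's worth, the operation ID from that debug output can also be polled by hand against the Artifact Registry API (a sketch; PROJECT, LOCATION, and OPERATION_ID stand in for the values from the log):
# Poll the long-running operation that the gcloud command keeps checking (sketch)
curl -s \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  "https://artifactregistry.googleapis.com/v1/projects/${PROJECT}/locations/${LOCATION}/operations/${OPERATION_ID}"
While the upload or delete is still hanging, the returned operation does not report "done": true.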

I'm the lead for Artifact Registry. Firstly, apologies that you're seeing this kind of latency with update operations to Apt repositories. The delays are most likely caused by regenerating the index for the repo: the bigger the repo gets, the longer this takes.
If you do a bunch of individual uploads/deletes, the index generation queues up and you get timeouts. We did change some of the locking behavior around this recently, so we may have inadvertently swapped one performance issue for another.
We are planning to stop doing the index generation in the same transaction as the file modification. Instead we'll generate it asynchronously, and will look at batching or de-duping so that less work is done for a large number of individual updates. It will mean that the index isn't up-to-date the moment the upload call finishes, but will be eventually consistent.
We're working on this now as a priority but you may not see changes in performance for a few weeks. The only real workaround is to do less frequent updates or to keep the repositories smaller.
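One way to keep individual repositories smaller is to shard packages across several Apt repositories, e.g. per release channel (a sketch; the repository names are purely illustrative):
# Sketch: shard packages across smaller Apt repositories (names are illustrative)
gcloud artifacts repositories create my-app-nightly \
  --repository-format=apt --location=${ARTIFACT_LOCATION} --project=${ARTIFACT_PROJECT}
gcloud artifacts repositories create my-app-stable \
  --repository-format=apt --location=${ARTIFACT_LOCATION} --project=${ARTIFACT_PROJECT}
# Upload frequent nightly builds to the nightly repo; promote releases to stable
gcloud artifacts apt upload my-app-nightly --location=${ARTIFACT_LOCATION} \
  --project=${ARTIFACT_PROJECT} --source=${debPackageName}
Each repo's index stays smaller, so regeneration after an upload or delete has less work to do.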
Apologies again, we definitely want to get this working in a performant way.

Related

Any way to cache signed binaries and pull them during Azure Build Pipeline run?

We have an Azure Build Pipeline that is building our product. We recently migrated to a custom Agent Pool, so we now have control over the VMs in the pool. I recently added a code signing step. This step is signing all of the binaries and is resulting in a doubling of the build time. Up until now, I had not thought about caching, as our full build runs were only about 20 minutes. Now we are at 45 or longer.
I am trying to think through how to cache, but if I cache the signed binaries, I don't have a hash yet or anything I can compare to the newly built unsigned files. I could cache the unsigned binaries and the signed binaries, then after building compare the unsigned binaries with what was just built. For matches, I could grab the signed version from the cache, and for the others go forward with signing.
This seems overly complex. Any other options?
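For concreteness, here is a rough sketch of the compare-and-reuse step I have in mind (the paths and cache layout are made up):
# Sketch: reuse previously signed binaries when the unsigned output is unchanged.
# cache/<sha256-of-unsigned-binary>/<filename> holds the signed copy from an earlier run.
CACHE_DIR=cache
mkdir -p "$CACHE_DIR" to-sign
for f in bin/Release/*.dll; do
  hash=$(sha256sum "$f" | cut -d' ' -f1)
  if [ -f "$CACHE_DIR/$hash/$(basename "$f")" ]; then
    # Byte-identical to a previous build: take the signed copy from the cache
    cp "$CACHE_DIR/$hash/$(basename "$f")" "$f"
  else
    # New or changed binary: queue it for signing; the signed result would then
    # be stored under "$CACHE_DIR/$hash/" for future runs
    cp "$f" to-sign/
  fi
done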
Azure DevOps provided Pipeline caching can help reduce build time by allowing the outputs or downloaded dependencies from one run to be reused in later runs, thereby reducing or avoiding the cost to recreate or redownload the same files again. Caching is especially useful in scenarios where the same dependencies are downloaded over and over at the start of each run.
Caching is currently supported in CI and deployment jobs, but not classic release jobs.
However, given your scenario, especially that you don't have a hash or anything you can compare to the newly built unsigned files, caching may not be suitable for this.
Caching can be effective at improving build time provided the time to restore and save the cache is less than the time to produce the output again from scratch. Because of this, caching may not be effective in all scenarios and may actually have a negative impact on build time.
To reduce your build time, consider improving the infrastructure of the servers that host your build agents. You could also try using parallel jobs for your build pipeline.
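If you do want to experiment with pipeline caching anyway, it is configured with the Cache task; a minimal sketch (the cache key, path, and the signing script name are only illustrative):
steps:
- task: Cache@2
  displayName: Restore signed binaries cache
  inputs:
    # Key the cache on the OS plus files that change when the build inputs change
    key: 'signing | "$(Agent.OS)" | **/*.csproj'
    path: $(Pipeline.Workspace)/signed-cache
    cacheHitVar: SIGNED_CACHE_RESTORED
- script: ./sign-binaries.sh   # illustrative signing step
  displayName: Sign binaries
  condition: ne(variables.SIGNED_CACHE_RESTORED, 'true')
The cache is saved automatically at the end of a successful job when there was no exact key hit.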

Unable to reclaim Storage for Actions and Packages after deleting all files

When I try to run a GitHub Action (it builds an Android APK) it shows this error:
You've used 100% of included services for GitHub Storage (GitHub Actions and Packages). GitHub Actions and Packages won’t work until a monthly spending limit is set.
So I deleted all the artifact files, but after deleting each artifact the Storage for Actions is not reducing. For example, I deleted 20 artifact files of 20 MB each, which means 400 MB, yet when I check the "Storage for Actions" it still shows it is over the limit. Why is this happening?
I encountered an identical problem. After looking at the docs, it seems it takes one hour for storage usage to update.
From the documentation:
Storage usage data synchronizes every hour.
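If you still have artifacts left to clear out, they can also be deleted in bulk through the REST API instead of one at a time (a sketch using the gh CLI; OWNER/REPO are placeholders):
# Sketch: list every workflow artifact in the repo and delete it
gh api --paginate repos/OWNER/REPO/actions/artifacts -q '.artifacts[].id' |
  while read -r id; do
    gh api -X DELETE "repos/OWNER/REPO/actions/artifacts/$id"
  done
The usage numbers should then drop on the next hourly sync.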

VSTS Hosted Agent, not enough space in the disk

I cannot build in VSTS with hosted agent (VS 2017) with error:
System.IO.IOException: There is not enough space on the disk
I have tried setting the "Clean" option to true on the Build > Repository definition without solving the issue. I didn't have this option set to true before, which I imagine led to the current situation.
I also installed the VSTS extension "Clean Agent Directories" and added it as the last step of the build process, without solving the issue either.
Is there an option that would allow me to solve this issue and continue using the hosted build agent?
Hosted agents offer 10 GB of space. You stated that your entire solution folder is 2.6 GB. Your build outputs will typically be somewhere in the range of 2x that size, if not larger, depending on various factors.
If you're a Git user, the entire repo that's being cloned may be significantly larger than 2.6 GB as well -- cloning the repo brings down not only the current working copy of the code, but also all of the history.
You can control the clone depth (i.e. how much history is pulled down) by enabling Shallow fetch under the Advanced options of your repo settings.
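If you're using a YAML pipeline, the same thing can be expressed on the checkout step (sketch):
steps:
- checkout: self
  fetchDepth: 1   # shallow fetch: only the most recent commit is cloned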
If you're a TFVC user, you can check your workspace mappings to ensure only relevant source code is being pulled down.
You may be in a situation where the 10 GB simply isn't sufficient for your purposes. If the 2.6 GB is purely code and contains no binary assets (images, PDFs, video files, etc), you may want to start modularizing your application so smaller subsections can be built and independently deployed. If the 2.6 GB contains a lot of binary assets, you'll likely want to separate static content (images, et al) from source code and devise a separate static content deployment process.
According to Microsoft's documentation,
(Microsoft-hosted agents) Provide at least 10 GB of storage for your source and build outputs.
So, if you are getting a "not enough space on the disk" error, it might mean that the amount of disk space used by your source code (files, repos, branches, etc.), together with the amount of disk space taken by your build output (files generated as a result of the build process), is exceeding the 10 GB of storage provided by your DevOps plan.
When getting this error I had to delete an old git repo and an old git branch, freeing 17 MB of space, which was enough for my build to proceed. Thus, in my case the space was being used up by source code. It could just as well be too many or too large files generated by the build. That is, you just need to find out which of these two is the cause of your lack of disk space, and work on freeing it up.
There is a trick to free agent space by removing the cached docker images (if you don't need them of course). With the Microsoft hosted agent there is a list of docker images pre-provisioned. This SO answer describes where to find the docs on the different images / cached container images.
It's as simple as adding an extra command-line task to clean up the cached images. For Linux / Ubuntu:
steps:
- script: |
    df -h
- script: |
    docker rmi -f $(docker images -aq)
- script: |
    df -h
The df (disk free) command shows you how much is saved. This will probably free up another 5 GB.

Deploy build files from continuous integration

I am working on a project with multiple people, a web application which requires webpack to build, uglify, and concatenate the assets into a few files, e.g. app.min.js, style.min.css, etc. As a result, in an effort to prevent merge conflicts, we recently added the build folder to .gitignore, under the assumption that we would be able to build during deployment.
When pushing to the Master branch, we automatically "deploy" through Semaphore CI (similar to Travis) which runs composer install, npm install, and finally "npm run build" which triggers the webpack build. This is all built and then tested on the CI side of things, and then Semaphore automatically deploys to Amazon's Elastic Beanstalk where our application is hosted.
The problem with this is that it seems Semaphore doesn't upload the build it has just tested, but rather the Master branch itself, which has no built JS or CSS. I'm wondering if there's a way to push these built files to deployment as well, or if running the entire build process AGAIN on Elastic Beanstalk is the only route. It seems unnecessary to have to do that process essentially three times: locally, on CI, and then on deployment. Every time a step like this is needed on EB the actual re-instantiation time gets longer, which I'd like to keep as short as possible.
Obviously if building it a 3rd time on EB is the only way to go about this then I'll have to, just wondering if there are better solutions for this whole workflow.
I haven't worked with Semaphore CI, but you might be able to use an .ebignore file.
If you create one, the EB CLI will use that instead of your .gitignore file.
I find in some deployment situations you want the inverse of your .gitignore (all compiled, no src). It essentially lets you pick the files from your project directory that you want to deploy, in the same way as the .gitignore file.
Edit: I just noticed the documentation on aws is lacking. It only mentions file exclusion, but you can include files too.
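For example, an .ebignore that deploys only the built output might look like this (just a sketch -- it uses the same pattern syntax as .gitignore, with ! re-including paths):
# .ebignore (sketch): exclude everything, then re-include what should be deployed
/*
!/build
!/package.json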
Edit 2: I don't think Semaphore supports the use of .ebignore, so right now this solution isn't of any use. :(
I just had a great first experience with https://deploybot.com/. They can deploy directly to Elastic Beanstalk. It might be interesting for you.

How do I speed up an EB deploy using ebignore?

I'm deploying my app to Elastic Beanstalk. I'm using an .ebignore file because there are files that I do not want to check into git, but that I do want deployed with the app (like application secrets, config vars, etc.). The issue I'm facing is that when using an .ebignore, the deploy takes FOREVER. I've used the --verbose flag, and I can see that it is recursing through my entire node_modules directory and skipping each file individually. When I deploy using .gitignore, it is very fast.
Has anyone else experienced this? How do I speed up this process?