VSTS Hosted Agent, not enough space on the disk - azure-devops

I cannot build in VSTS with hosted agent (VS 2017) with error:
System.IO.IOException: There is not enough space on the disk
I have tried setting the "Clean" option to true on the build definition's Repository settings, without solving the issue. I didn't originally have this option set to true, which I imagine led to the current situation.
I also installed the VSTS extension "Clean Agent Directories" and added it as the last step of the build process, without solving the issue either.
Is there an option that would allow me to solve this issue and continue using the hosted build agent?

Hosted agents offer 10 GB of space. You stated that your entire solution folder is 2.6 GB. Your build outputs will typically be somewhere in the range of 2x that size, if not larger, depending on various factors.
If you're a Git user, the entire repo that's being cloned may be significantly larger than 2.6 GB as well -- cloning the repo brings down not only the current working copy of the code, but also all of the history.
You can control the clone depth (i.e. how much history is pulled down) by enabling Shallow fetch under the Advanced options of your repo settings.
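If you're using YAML pipelines rather than the classic editor, the equivalent knob is fetchDepth on the checkout step; a minimal sketch (the depth of 1 is just an example, keeping only the latest commit):
steps:
- checkout: self
  fetchDepth: 1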
If you're a TFVC user, you can check your workspace mappings to ensure only relevant source code is being pulled down.
You may be in a situation where the 10 GB simply isn't sufficient for your purposes. If the 2.6 GB is purely code and contains no binary assets (images, PDFs, video files, etc), you may want to start modularizing your application so smaller subsections can be built and independently deployed. If the 2.6 GB contains a lot of binary assets, you'll likely want to separate static content (images, et al) from source code and devise a separate static content deployment process.

According to Microsoft's documentation,
(Microsoft-hosted agents) Provide at least 10 GB of storage for your source and build outputs.
So, if you are getting a "not enough space on the disk" error, it might mean that the amount of disk space used by your source code (files, repos, branches, etc.), together with the amount of disk space taken by your build output (files generated as a result of the build process), is exceeding the 10 GB of storage provided by your DevOps plan.
When I got this error I had to delete an old git repo and an old git branch, freeing 17 MB of space, which was enough for my build to proceed. So in my case the space was being used up by source code; it could just as well be too many or too large files generated by the build. You need to find out which of the two is the cause of your lack of disk space and work on freeing it.

There is a trick to free agent space by removing the cached Docker images (if you don't need them, of course). The Microsoft-hosted agents come with a list of pre-provisioned Docker images. This SO answer describes where to find the docs on the different images / cached container images.
It's as simple as adding an extra command task to clean up the cached images. For Linux / Ubuntu:
steps:
- script: |
    df -h
- script: |
    docker rmi -f $(docker images -aq)
- script: |
    df -h
The df (disk free) command shows you how much space is saved. This will probably free up another 5 GB.
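The question mentions the VS 2017 (Windows) hosted agent; the same idea can be applied there with a PowerShell step, assuming Docker and cached images are present on that image (a sketch, not verified against every image):
steps:
- powershell: |
    Get-PSDrive C
    docker rmi -f $(docker images -a -q)
    Get-PSDrive C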

Related

Very Slow Upload Times to GCP Apt Artifact registry

I have a CI/CD system uploading numerous large .deb packages into a Google Cloud Artifact Registry repository for Apt packages. The normal upload time is roughly 10 seconds for the average package. Yesterday all of the upload commands to this artifact registry started to hang until they are either terminated by an external trigger or time out (over 30 minutes).
Any attempt to delete packages from the registry times out without deleting the package.
The command I have been using to upload is:
gcloud artifacts apt upload ${ARTIFACT_REPOSITORY} --location=${ARTIFACT_LOCATION} --project ${ARTIFACT_PROJECT} --source=${debPackageName} --verbosity=debug
I started by updating all gcloud components to the latest version:
Google Cloud SDK 409.0.0
alpha 2022.11.04
beta 2022.11.04
bq 2.0.81
bundled-python3-unix 3.9.12
core 2022.11.04
gcloud-crc32c 1.0.0
gsutil 5.16
I tried deleting packages, thinking perhaps the artifact registry was getting bloated, using this command:
gcloud artifacts packages delete --location={LOCATION} --project {PROJECT} --repository={REPOSITORY} {PACKAGE} --verbosity=debug
But I consistently get:
"message": "Deadline expired before operation could complete."
The debug output from the original command and the delete command both spam this kind of message:
DEBUG: https://artifactregistry.googleapis.com:443 "GET /v1/projects/{PROJECT}/locations/{LOCATION}/operations/f9885192-e1aa-4273-9b61-7b0cacdd5023?alt=json HTTP/1.1" 200 None
When I created a new repository I was able to upload to it without the timeout issues.
I'm the lead for Artifact Registry. Firstly, apologies that you're seeing this kind of latency with update operations to Apt repositories. The delays are likely caused by regenerating the index for the repo: the bigger the repo gets, the longer this takes.
If you do a bunch of individual uploads/deletes, the index generation queues up and you get timeouts. We did change some of the locking behavior around this recently, so we may have inadvertently swapped one performance issue for another.
We are planning to stop doing the index generation in the same transaction as the file modification. Instead we'll generate it asynchronously, and will look at batching or de-duping so that less work is done for a large number of individual updates. It will mean that the index isn't up-to-date the moment the upload call finishes, but will be eventually consistent.
We're working on this now as a priority but you may not see changes in performance for a few weeks. The only real workaround is to do less frequent updates or to keep the repositories smaller.
Apologies again, we definitely want to get this working in a performant way.
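Until that change ships, one way to apply the "less frequent updates" workaround is to accumulate packages and push them in a single daily pass rather than from every CI run; a minimal sketch reusing the variables from the question (the staging directory is hypothetical):
for deb in /tmp/deb-staging/*.deb; do
  gcloud artifacts apt upload ${ARTIFACT_REPOSITORY} \
    --location=${ARTIFACT_LOCATION} \
    --project=${ARTIFACT_PROJECT} \
    --source="${deb}"
done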

Unable to reclaim Storage for Actions and Packages after deleting all files

When I try to run a GitHub Action (it builds an Android APK) it shows an error:
You've used 100% of included services for GitHub Storage (GitHub
Actions and Packages). GitHub Actions and Packages won’t work until a
monthly spending limit is set.
So I deleted all the artifact files, but after deleting each artifact the Storage for Actions is not reducing. For example, I deleted 20 artifact files of 20 MB each, which means 400 MB, yet when I check the "Storage for Actions" it still shows as over the limit. Why is this happening?
I encountered an identical problem. After looking at the docs, it seems it takes one hour for storage usage to update.
From the documentation:
Storage usage data synchronizes every hour.
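If you have a large number of artifacts to clear, deleting them one by one in the UI is tedious; here is a sketch of a scripted cleanup via the REST API, assuming the gh CLI is installed and authenticated and OWNER/REPO stands for your repository:
gh api repos/OWNER/REPO/actions/artifacts --paginate -q '.artifacts[].id' |
while read -r id; do
  gh api -X DELETE "repos/OWNER/REPO/actions/artifacts/${id}"
done
Even after such a cleanup, the billed storage figure can still take up to an hour to reflect the change, as noted above.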

Deploy a Docker image without using a repository

I'm building a Docker image on my build server (using TeamCity). After the build is done I want to take the image and deploy it to some server (staging, production).
All tutorials I have found either
push the image to some repository where it can be downloaded (pulled) by the server(s) which in small projects introduce additional complexity
use Heroku-like approach and build the images "near" or at the machine where it will be run
I really think that nothing special should be done on the (app) servers. Images, IMO, should act as closed, self-sufficient binaries that represent the application as a whole and can be passed between the build server, testing, QA, etc.
However, when I save a standard NodeJS app based on the official node image, it is 1.2 GB. Passing such a file from server to server is not very comfortable.
Q: Is there some way to export/save and "upload" just the changed parts (layers) of an image via SSH without introducing the complexity of a Docker repository? The server would then pull the missing layers from the public hub.docker.com in order to avoid the slow upload from my network to the cloud.
Investigating the content of a saved tarfile, it should not be difficult from a technical point of view. The push command does basically just that -- it never uploads layers that are already present in the repo.
Q2: Do you think that running a small repo on the docker-host that I'm deploying to in order to achieve this is a good approach?
If your code can live on GitHub or Bitbucket, why not just use Docker Hub automated builds for free? That way, on your node you just have to docker pull user/image. The GitHub repository and the Docker Hub automated builds can both be private, so you don't have to expose your code to the world, although you may have to pay for more than one private repository or build.
If you still want to build your own images, then when you run the build command you see output similar to the following:
Step 0 : FROM ubuntu
---> c4ff7513909d
Step 1 : MAINTAINER Maluuba Infrastructure Team <infrastructure#maluuba.com>
---> Using cache
---> 858ff007971a
Step 2 : EXPOSE 8080
---> Using cache
---> 493b76d124c0
Step 3 : RUN apt-get -qq update
---> Using cache
---> e66c5ff65137
Each of the hashes (e.g. ---> c4ff7513909d) is an intermediate layer. You can find folders named with these hashes under /var/lib/docker/graph, for example:
ls /var/lib/docker/graph | grep c4ff7513909d
c4ff7513909dedf4ddf3a450aea68cd817c42e698ebccf54755973576525c416
As long as you copy all the intermediate layers to your deployment server, you won't need an external Docker repository. If you change only one of the intermediate layers, you only need to re-copy that one for a redeployment. Notice that each step listed in the Dockerfile leads to an intermediate layer, so as long as you only change the last line in the Dockerfile, you will only need to upload one layer. Therefore I would recommend putting the ADD line for your code at the end of your Dockerfile:
ADD MyGeneratedCode /var/my_generated_code
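If copying folders out of /var/lib/docker by hand feels too fragile, another option that still avoids running a registry is to stream the whole image over SSH with docker save / docker load. This re-sends every layer rather than only the changed ones, but it uses only standard Docker commands; a sketch, with image name and host as placeholders:
docker save myapp:latest | gzip | ssh deploy@staging-server 'gunzip | docker load'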

What are the advantages of using Nuget automatic package restore in the enterprise?

I've been trying to implement a NuGet policy inside my company and I was wondering what the real value of Automatic Package Restore is when working on internal projects on an internally hosted TFS.
I understand that in open-source projects, or when using externally hosted source control, not checking in external packages can save a lot of disk space, but apart from this advantage (saving disk space on the server) I cannot see any other advantage in using automatic restore. It actually gives us some problems, as the build machine doesn't connect to the internet, and to use that feature we'd either have to change firewall rules or keep a local cache of the NuGet repository.
Thank you
As Steven indicated in his answer, the main reason for keeping the packages out of source control is to reduce the amount of data that needs to be stored by and transferred from/to the source control server. Obviously in a company where you control all the hardware neither the disk space nor the network transfer should be an issue, but why waste the disk space and time dealing with the packages if you don't need to? Note that the size of the packages directory can be quite a considerable percentage of the total size of a workspace. In my case the packages directory takes up between 80% and 95% of the total workspace when using TFS (for my work workspaces) and between 50% and 75% for my private workspaces (which are based on git and thus have the .git directory, which takes up some space). All in all, a significant amount of space could be saved on your source control server if you use package restore.
One way to solve the access problem with nuget.org is to have your own local package repository. This local package repository could be a local NuGet web service or just a shared directory on a server. Because the repository lives inside your company LAN, it should not be a big problem for the build server to reach it.
A side benefit of this approach is that your build process is independent from Nuget.org (in the very rare case it goes down) and, more importantly, that you know exactly which packages get pulled into the build (because they will be the approved packages in your local repository).
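If you go the shared-directory route, pointing a machine at it is a one-off step; a sketch using the nuget.exe command line, where the source name and server path are placeholders:
nuget sources add -Name "InternalPackages" -Source "\\fileserver\nuget-packages"
A nuget.config checked in next to the solution can carry the same source definition, so every developer machine and build agent resolves packages from the same place.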
For us the decision whether to use a local NuGet repository and the package restore option depended on our decision to package all our internal libraries as NuGet packages (I have described our development process in an answer to another NuGet question). Because of that we needed a local package repository to distribute those internal packages, which then meant that adding the third-party packages to this repository was easy, and thus using package restore made perfect sense.
If you don't package your internal libraries as NuGet packages, then you will put those in your source control alongside your solution and code files. In that case you may as well do the same for the third-party libraries.
In the end it is all a trade-off. Disk space vs ease of infrastructure set-up etc. etc. Pick the solution that suits your environment best.
From Nuget.org:
The original NuGet workflow has been to commit the Packages folder into source control. The reasoning is that it matches what developers typically do when they don't have NuGet: they create a Lib or ExternalDependencies folder, dump binaries into there and commit them to source control to allow others to build.
So the idea is that the packages would be restored on each machine that the project is built on, so the binary data stays out of source control. And with distributed source control systems like Mercurial or Git, that is disk space and bandwidth that gets used on client machines as well as the server.
But that does pose a problem when your build machine can't connect to nuget.org (as I'm assuming is the case here). I think you've hit on the major solutions: commit the packages to source control and avoid package restore, allow the build machine to connect to the internet, or set up a local mirror.
I'll stop short of saying there are no advantages of package restore in your environment. I don't see big harm in committing the packages. But it really depends on how your team functions, what your team expects and what type of enterprise environment you're working in.

jenkins continuous delivery with shared workspace

Background:
We have one Jenkins job (Production) to build a deliverable every night. We have another job (ProductionPush) that pushes out the deliverable over a proprietary protocol to production machines the next day. This is because some production machines are only available during certain hours during the day (It also gives us a chance to fix any last-minute build breaks). ProductionPush needs access to the deliverable built by the Production job (so it needs access to the same workspace). We have multiple nodes and concurrent builds (and thus unpredictable workspaces) and prefer not to tie the jobs to a fixed node/workspace since resources are somewhat limited.
Questions:
How can we make sure both jobs share the same workspace, and ensure that ProductionPush runs at a fixed time the next day only if Production succeeds, without tying both jobs to the same node/workspace? I know the Parameterized Trigger Plugin might help with some of this, but it does not seem to have a time-delay capability, and 12 hours seems too long for a quiet period.
Is sharing the workspace a bad idea?
Answer 2: Yes, sharing the workspace is a bad idea. There is the possibility of file locks. There is the issue of the workspace being wiped out. Just don't do it.
Answer 1: What you need is to archive the artifacts of the build. This way, the artifacts for a particular build (by build number) will always be available, regardless of whether another build is running or not, or what state the workspaces are in.
To Archive the artifacts
In your build job, under Post-build Actions, select Archive the artifacts
Specify what artifacts to archive (you can use a combination of the options below)
a) You can archive all: *.*
b) You can archive a particular file with wildcards: /path/to/file_version*.zip
c) You can ignore the intermediate directories like: **/file_version*.zip
To avoid storage problems with many artifacts, at the top of the configuration you can select Discard Old Builds, click the Advanced button, and play around with Days to keep artifacts and Max # of builds to keep with artifacts. Note that these two settings do not control how long the actual builds are kept (other settings control that).
To access artifacts from Jenkins
In the build history, select any previous build you want.
In addition to SCM changes and revisions data, you will now have a Build Artifacts link, under which you will find all the artifacts for that particular build.
You can also access them with Jenkins' permalinks, for example
http://JENKINS_URL/job/JOB_NAME/lastSuccessfulBuild/artifact/ and then the name of the artifact.
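That permalink also makes it easy for a downstream script (for example, the ProductionPush job) to pull the newest deliverable directly; a sketch with curl, where the artifact path and credentials are placeholders:
curl -fL -u "USER:API_TOKEN" -O "http://JENKINS_URL/job/Production/lastSuccessfulBuild/artifact/dist/deliverable.zip"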
To access artifacts from another job
I've extensively explained how to access previous artifacts from another deploy job (in your example, ProductionPush) over here:
How to promote a specific build number from another job in Jenkins?
If your requirement is to always deploy the latest build to production, you can skip the configuration of promotion in the above link. Just follow the steps for configuring the deploy job. Once you have your deploy job, if it always runs at the same time, just configure its Build periodically parameters. Alternatively, you can have yet another job that triggers the deploy job based on whatever conditions you want.
In either case, if your Default Selector is set to Latest successful build (as explained in the link above), the latest build will be pushed to production.
I'm not sure archiving artifacts is really a good idea. A staging repository might be better, as it enables cross-functional teams to share artifacts across different builds when required, by tweaking the Maven settings.xml file.
You really want a deployable (EAR/WAR) to be the thing that gets built, tested, and then promoted to production once confidence in the build is high.
Use a build number in your deployable's version (major.minor.buildnumber). This is what you promote to production, provided your tests can be relied upon. Don't use a hyphen to separate the minor version from the build number, as that forces Maven to perform a lexical comparison; a decimal point forces a numeric comparison, which will give you far fewer headaches.
Also, you didn't mention your target platform, but using the Maven APT/RPM plugin to push an APT/RPM package to an APT/YUM repo that's available to the production box (after successful testing!) would be a good fit, in line with industry standards.