Why did Jib drop support for the distroless base image? - jib

It looks like starting with Jib 3.0, you no longer get default distroless images for your Java applications. Instead, you get an AdoptOpenJDK base image if you don't specify one. You can still configure and use distroless base images as per this link. I am just wondering whether the AdoptOpenJDK image is more secure or slimmer than distroless. What's the benefit?

The Jib team was maintaining the Java-specific images for distroless. Debian, from which distroless obtains its packages, dropped support for Java 8 starting with Debian 10. The Java 8 builds used outdated Debian 9 dependencies, which led to many CVEs in the container image. This is a problem for users who require Java 8 (a lot of Jib users), and at the moment the Jib team does not have the bandwidth to put together a high-quality Java 8 product for distroless.
Switching the default to AdoptOpenJDK gives users consistently maintained images from the AdoptOpenJDK folks.
This is by no means a knock on distroless; it is still a great project. However, a lack of resources and a complicated Java situation has led us here. Distroless is an open-source project, and anyone willing to create or update the workflow for Java 8 can contribute directly. As far as I know, the distroless Java 11 image is still available, based on Debian 10 packages, and you can use it as a base image if you like.
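For reference, this is roughly what pointing Jib back at a distroless base image looks like with the jib-maven-plugin; the image name/tag here is only an assumption, so check the distroless project for the Java images it currently publishes:
<plugin>
  <groupId>com.google.cloud.tools</groupId>
  <artifactId>jib-maven-plugin</artifactId>
  <version>3.0.0</version>
  <configuration>
    <!-- Override the default (AdoptOpenJDK) base image with a distroless Java 11 image.
         Image name is illustrative; pin to the tag or digest you actually want. -->
    <from>
      <image>gcr.io/distroless/java11-debian11</image>
    </from>
  </configuration>
</plugin>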

Related

How can I make Service Fabric package sizes practical?

I'm working on a Service Fabric application that is deployed to Azure. It currently consists of only 5 stateless services. The zipped archive weighs in at ~200MB, which is already becoming problematic.
By inspecting the contents of the archive, I can see the primary problem is that many files are required by all services. An exact duplicate of those files is therefore present in each service's folder. However, the zip compression format does not do anything clever with respect to duplicate files within the archive.
As an experiment, I wrote a little script to find all duplicate files in the deployment and delete all but one copy of each. Then I tried zipping the result, and it came in at a much more practical 38 MB.
I also noticed that system libraries are bundled, including:
System.Private.CoreLib.dll (12MB)
System.Private.Xml.dll (8MB)
coreclr.dll (5MB)
These are all big files, so I'd be interested to know if there is a way for me to bundle them only once. I've tried removing them altogether, but then Service Fabric fails to start the application.
Can anyone offer any advice as to how I can drastically reduce my deployment package size?
NOTE: I've already read the docs on compressing packages, but I am very confused as to why their compression method would help. Indeed, I tried it and it didn't. All they do is zip each subfolder inside the primary zip, but there is no de-duplication of files involved.
There is a way to reduce the size of the package, but I would say it isn't a good way, or the way things should be done; still, I think it can be of use in some cases.
Please note: This approach requires target machines to have all prerequisites installed (including .NET Core Runtime etc.)
When building .NET Core app there are two deployment models: self-contained and framework-dependent.
In the self-contained mode, all required framework binaries are published with the application binaries, while in the framework-dependent mode only the application binaries are published.
By default, if the project has a runtime specified (<RuntimeIdentifier>win7-x64</RuntimeIdentifier> in the .csproj), then the publish operation is self-contained; that is why each of your services copies all the framework binaries.
In order to turn this off, you can simply add the <SelfContained>false</SelfContained> property to every service project you have.
Here is an example of a new .NET Core stateless service project:
<PropertyGroup>
  <TargetFramework>netcoreapp2.2</TargetFramework>
  <AspNetCoreHostingModel>InProcess</AspNetCoreHostingModel>
  <IsServiceFabricServiceProject>True</IsServiceFabricServiceProject>
  <ServerGarbageCollection>True</ServerGarbageCollection>
  <RuntimeIdentifier>win7-x64</RuntimeIdentifier>
  <TargetLatestRuntimePatch>False</TargetLatestRuntimePatch>
  <SelfContained>false</SelfContained>
</PropertyGroup>
I did a small test and created a new Service Fabric application with five services. The uncompressed package size in Debug was around ~500 MB. After I modified all the projects, the package size dropped to ~30 MB.
The deployed application worked well on the Local Cluster, which demonstrates that this approach is a working way to reduce package size.
In the end I will highlight the warning one more time:
Please note: This approach requires target machines to have all prerequisites installed (including .NET Core Runtime etc.)
You usually don't want to know which node runs which service, and you want to deploy service versions independently of each other, so sharing binaries between otherwise independent services creates a very unnatural run-time dependency. I'd advise against that, except for platform binaries like AspNet and DotNet, of course.
However, have you read about creating differential packages? https://learn.microsoft.com/en-us/azure/service-fabric/service-fabric-application-upgrade-advanced#upgrade-with-a-diff-package That would reduce the size of upgrade packages after the initial 200 MB hit.
Here's another option:
https://devblogs.microsoft.com/dotnet/app-trimming-in-net-5/
<SelfContained>True</SelfContained>
<PublishTrimmed>True</PublishTrimmed>
From a quick test just now, trimming one app reduced the package size from ~110 MB to ~70 MB (compared to ~25 MB for SelfContained=false).
The trimming process took several minutes for a single application, though, and the project I work on has 10-20 apps per Service Fabric project. I also suspect that this process isn't safe when your code relies heavily on dependency injection.
For debug builds we use SelfContained=False, because developers will have the required runtimes on their machines, but not for release deployments.
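If it helps, here is a sketch of how that split might be expressed in a .csproj with conditional property groups; the property names are standard MSBuild, but the exact per-configuration split is only an illustration:
<!-- Framework-dependent publish for Debug only; Release stays self-contained (and optionally trimmed). -->
<PropertyGroup Condition="'$(Configuration)' == 'Debug'">
  <SelfContained>false</SelfContained>
</PropertyGroup>
<PropertyGroup Condition="'$(Configuration)' == 'Release'">
  <SelfContained>true</SelfContained>
  <PublishTrimmed>true</PublishTrimmed>
</PropertyGroup>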
As a final note, since the OP mentioned file upload being a particular bottleneck:
A large proportion of the deployment time is just zipping and uploading the package
I noticed recently that we were using the deprecated Publish Build Artifacts task when uploading artifacts during our build pipeline. It was taking 20 minutes to upload 2 GB of files. I switched over to the suggested Publish Pipeline Artifact task, and it took our publish step down to 10-20 seconds. From what I can tell, this newer task uses all kinds of tricks under the hood to speed up uploads (and downloads), including file deduplication. I suspect that zipping up build artifacts yourself at that point would actually hurt your upload times.

Using ANT to source control the code in cloud (NetSuite)

For our ERP platform (NetSuite) customization code rests in the Cloud. We (different entities) can directly make changes to it but there is no source control available to us in cloud.
It is possible to fetch the code files through a SOAP API.
I was wondering if it is possible to get the files through API using Apache ANT and shove into TFS/SVN?
I am not familiar with Apache ANT, so I do not know whether it is capable of fetching any info through an API.
(you can also suggest any better approach to source control the code in cloud)
Ant has several third-party task plugins for version control tasks. Plus, you can always use the <exec/> task to build an equivalent command-line checkout. However, I do not recommend that people use Ant to fetch versions from the version control system. This ends up being a chicken-and-egg issue.
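Purely to illustrate the mechanics being discouraged here, a minimal Ant target using <exec> for a command-line checkout might look like this (the repository URL and paths are placeholders):
<project name="checkout-example" default="checkout" basedir=".">
  <!-- Placeholder repository URL and working directory. -->
  <property name="svn.url" value="https://svn.example.com/repos/netsuite/trunk"/>
  <property name="work.dir" value="working-copy"/>

  <target name="checkout">
    <!-- Shells out to the svn client, equivalent to running 'svn checkout' by hand. -->
    <exec executable="svn" failonerror="true">
      <arg value="checkout"/>
      <arg value="${svn.url}"/>
      <arg value="${work.dir}"/>
    </exec>
  </target>
</project>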
Your build script is in version control. You need to fetch it in order to run Ant against it. If you're fetching your build script, why not the rest of your project?
Once you've checked out your project into a working directory and want to do an update, why not let Ant do the update? Because your build script is also version controlled. Doing an update and a build at the same time could have you running the wrong version of your build script against your build.
Maybe you're going to check in the files that were modified by the build system. Not a good idea. You should rarely, if ever, check in files you built. If you need them, rebuild them. Built files are usually binary in nature and can vary greatly from one version to another. In most cases, your version control system will check in completely new versions of the built object instead of storing a diff. That takes up a lot of room in your version control system.
Even worse, you can't diff the built objects, so you can't really verify their content or trace their history. And built objects tend to age very quickly. Something built last month is already obsolete. Within a year, the vast majority of the information in your version control system will be nothing but obsolete binaries, with very little useful code stored.
Besides, your version control system has nothing to do with building your files. Imagine between Release 2.1 and Release 2.2, you change version control systems from Subversion to Git. Now, a bug in Release 2.1 needs to be fixed, and you need to create Release 2.1.1. Your checkout code in your build scripts will no longer work.
If you're using the NetSuite IDE, you're using Eclipse, and Eclipse is great at handling version control. Eclipse can handle both SVN and TFS (although I don't know why anyone would use TFS). Eclipse tracks file changes quite nicely. In fact, Eclipse gets confused when you change files behind its back (like when you do an update outside of Eclipse).
Let Eclipse handle your version control issues. It presents a common interface to almost all version control systems. This way, your build system can handle the builds.
I'm not sure what other requirements you might have, but if you use the NetSuite IDE (Eclipse + bundled plugin), you can use it to pull and push files to NetSuite. And then you can use any source control system you like (we use SVN, for instance).

Version control workflow with 'external' binary files

I'm working on an embedded system software project right now, and we're facing some problems dealing with precompiled binaries living inside our repository.
We have several repositories for different parts of our project: One for the application itself, one for the OS, one for the bootloader and several libraries. All of them, except the one for our application, are shared with other teams, for other projects. We are using git (and changing is not an option right now), but I think we'd have the same problem with any VCS.
Right now, we have a precompiled binary for each of those components living inside our application repository. The idea was to speed up the build time, since the OS alone takes about 20 minutes to build from scratch and most guys work only with the application.
Problem is, there are several bugs/features in those binaries (and related application code) to be integrated at any time and, as you know, diffing and merging binaries won't work.
So, how do you guys do when you have to work with those external dependencies?
Thanks a lot =)
One viable solution is to use an external binary repository like Nexus.
It is not linked to a VCS, meaning you can easily clean up old versions of binaries you don't need anymore.
It is lightweight (a simple HTTP client-server protocol; there is no need to clone a whole repo with all the versioned binaries, as you would with a DVCS like git or mercurial).
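As a rough sketch of how simple that fetch can be, here is an Ant target that pulls a prebuilt binary from a Nexus-style HTTP repository; the URL and artifact coordinates are made up for the example:
<target name="fetch-prebuilt-os">
  <!-- Plain HTTP download of a versioned binary; no VCS clone involved. -->
  <get src="https://nexus.example.com/repository/releases/com/example/os-image/1.2.3/os-image-1.2.3.bin"
       dest="prebuilt/os-image-1.2.3.bin"
       usetimestamp="true"/>
</target>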

How to install UML2Tools in Rational Business Developer

I am trying to install the Eclipse UML2Tools plugin in Rational Business Developer 7.5.1.
The most compatible version seems to be UML2Tools 0.8.0, but I am unable to get it working.
I could not get UML2Tools working either through the local update site mechanism or by placing it in the dropins folder.
One of the errors reported when updating from the local site was:
Cannot find a solution where both Match[requiredCapability: org.eclipse.equinox.p2.iu/org.eclipse.emf/[2.4.0.v200806091234,2.4.0.v200806091234]] and Match[requiredCapability: org.eclipse.equinox.p2.iu/org.eclipse.emf/[2.4.0.v200808251517,2.4.0.v200808251517]] can be satisfied.
Then I forced UML2Tools into RBD 7.5.1 by placing it in its features and plugins folders. This got it working, but when trying a class diagram, the generated diagram had large shapes, was laid out incorrectly, and classes were not connected.
What can be done to get UML2Tools working in RBD 7.5.1?
Thanks
I would question whether there is already some UML2Tools functionality running which is conflicting with your installation. I would just use Eclipse on the side or get RSM/RSA. According to the documentation, RBD can do model transformations from UML2, which means it already has the emf/gmf/equinox libraries somewhat installed. Who knows how IBM crippled RBD to not provide more UML2 features. If you really want to dig, I would use the plug-in registry view to see what is running at a high level (it takes time to open) and type in equinox; my RSA, of course, has many plug-ins around UML2, etc.

How to version control the build tools and libraries?

What are the recommendations for including your compiler, libraries, and other tools in your source control system itself?
In the past, I've run into issues where, although we had all the source code, building an old version of the product was an exercise in scurrying around trying to get the exact correct configuration of Visual Studio, InstallShield and other tools (including the correct patch version) used to build the product. On my next project, I'd like to avoid this by checking these build tools into source control, and then build using them. This would also simplify things in terms of setting up a new build machine -- 1) install our source control tool, 2) point at the right branch, and 3) build -- that's it.
Options I've considered include:
Copying the install CD ISO to source control - although this provides the backup we need if we have to go back to an older version, it isn't a good option for "live" use (each build would need to start with an install step, which could easily turn a 1 hour build into 3 hours).
Installing the software to source control. ClearCase maps your branch to a drive letter; we could install the software under this drive. This doesn't take into account the non-file parts of installing your tools, like registry settings.
Installing all the software and setting up the build process inside a virtual machine, storing the virtual machine in source control, and figuring out how to get the VM to do a build on boot. While we capture the state of the "build machine" with ease, we get the overhead of a VM, and it doesn't help with the "make the same tools available to developers" issue.
It seems such a basic idea of configuration management, but I've been unable to track down any resources for how to do this. What are the suggestions?
I think the VM is your best solution. We always used dedicated build machines to get consistency. In the old COM DLL Hell days, there were dependencies (COMCAT.DLL, anyone?) on non-development software being installed (Office). Your first two options don't solve anything involving shared COM components. If you don't have a shared-component issue, maybe they will work.
There is no reason the developers couldn't take a copy of the same VM to be able to debug in a clean environment. Your issues would be more complex if there are a lot of physical layers in your architecture, like mail server, database server, etc.
This is something that is very specific to your environment. That's why you won't see a guide to handle all situations. All the different shops I've worked for have handled this differently. I can only give you my opinion on what I think has worked best for me.
Put everything needed to build the application on a new workstation under source control.
Keep large applications out of source control, stuff like IDEs, SDKs, and database engines. Keep these in a directory as ISO files.
Maintain a text document, with the source code, that has a list of the ISO files that will be needed to build the app.
I would definitely consider the legal/licensing issues surrounding the idea. Would it be permissible according to the various licenses of your toolchain?
Have you considered ghosting a fresh development machine that is able to build the release, if you don't like the idea of a VM image? Of course, keeping that ghosted image running as hardware changes might be more trouble than it's worth...
Just a note on the versioning of libraries in your version control system:
it is a good solution, but it implies packaging (i.e. reducing the number of files of that library to a minimum)
it does not solve the 'configuration aspect' (that is, "what specific set of libraries does my '3.2' project need?").
Do not forget that this set will evolve with each new version of your project. UCM and its 'composite baseline' might give the beginning of an answer for that.
The packaging aspect (minimum number of files) is important because:
you do not want to access your libraries through the network (like through a dynamic view), because compilation times are much longer than when you use locally accessed library files.
you do want to get those libraries onto your disk, meaning a snapshot view, meaning downloading those files... and this is where you might appreciate the packaging of your libraries: the fewer files you have to download, the better off you are ;)
My organisation has a "read-only" filesystem, where everything is put into releases and versions. Releaselinks (essentially symlinks) point to the version being used by your project. When a new version comes along it is just added to the filesystem and you can swing your symlink to it. There is full audit history of the symlinks, and you can create new symlinks for different versions.
This approach works great on Linux, but it doesn't work so well for Windows apps that tend to like to use things local to the machine such as the registry to store things like configuration.
Are you using a continuous integration (CI) tool like NAnt to do your builds?
As a .Net example, you can specify specific frameworks for each build.
Perhaps the popular CI tool for whatever you're developing in has options that will allow you to avoid storing several IDEs in your version control system.
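For instance, with NAnt the target framework can be selected per build via a property in the build file; this is only a minimal sketch, and the framework identifier value depends on your NAnt installation:
<project name="app" default="compile">
  <!-- Select the .NET framework this build should compile against. -->
  <property name="nant.settings.currentframework" value="net-3.5"/>

  <target name="compile">
    <!-- Compile all sources into a single executable. -->
    <csc target="exe" output="bin/App.exe">
      <sources>
        <include name="src/**/*.cs"/>
      </sources>
    </csc>
  </target>
</project>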
In many cases, you can force your build to use compilers and libraries checked into your source control rather than relying on global machine settings that won't be repeatable in the future. For example, with the C# compiler, you can use the /nostdlib switch and manually /reference all libraries to point to versions checked in to source control. And of course check the compilers themselves into source control as well.
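In MSBuild terms, that might look like the following project fragment; the HintPath locations are assumptions and should point at the reference assemblies you have checked in alongside the source:
<PropertyGroup>
  <!-- Equivalent of the csc /nostdlib switch: do not implicitly reference the machine's mscorlib. -->
  <NoStdLib>true</NoStdLib>
</PropertyGroup>
<ItemGroup>
  <!-- Reference the checked-in copies explicitly instead of whatever is installed globally. -->
  <Reference Include="mscorlib">
    <HintPath>..\tools\ref\mscorlib.dll</HintPath>
  </Reference>
  <Reference Include="System">
    <HintPath>..\tools\ref\System.dll</HintPath>
  </Reference>
</ItemGroup>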
Following up on my own question, I came across this posting referenced in the answer to another question. Although it is more a discussion of the issue than an answer, it does mention the VM idea.
As for "figuring out how to build on boot": I've developed using a build farm system custom-created very quickly by one sysadmin and one developer. Build slaves query a taskmaster for suitable queued build requests. It's pretty nice.
A request is 'suitable' for a slave if its toolchain requirements match the toolchain versions on the slave - including what OS, since the product is multi-platform and a build can include automated tests. Normally this is "the current state of the art", but doesn't have to be.
When a slave is ready to build, it just starts polling the taskmaster, telling it what it's got installed. It doesn't have to know in advance what it's expected to build. It fetches a build request, which tells it to check certain tags out of SVN, then run a script from one of those tags to take it from there. Developers don't have to know how many build slaves are available, what they're called, or whether they're busy, just how to add a request to the build queue. The build queue itself is a fairly simple web app. All very modular.
Slaves needn't be VMs, but usually are. The number of slaves (and the physical machines they're running on) can be scaled to satisfy demand. Slaves can obviously be added to the system at any time, or nuked if the toolchain crashes. That's actually the main point of this scheme, rather than your problem with archiving the state of the toolchain, but I think it's applicable.
Depending how often you need an old toolchain, you might want the build queue to be capable of starting VMs as needed, since otherwise someone who wants to recreate an old build has to also arrange for a suitable slave to appear. Not that this is necessarily difficult - it might just be a question of starting the right VM on a machine of their choosing.