I am interested to know how people organise their code libraries, particularly with respect to reusable components. I am talking in OO terms below but I am interested in how your organise libraries for other types of language also.
For example:
Are you a stickler for class library projects for everything or do you prefer to keep everything in a single project?
Do you reuse your prebuilt DLLs or do you include individual classes from previous projects in your current work? If individual classes, do you share them between the projects to ensure all are kept up to date or do you permit branching?
How large are your reusable elements? How focussed are they? How are they focussed?
What level of reuse do you attain through your preferred practices?
etc.
EDIT
I am not looking for specific guidance here, I am just interested in people's thoughts and practices. I am particularly interested in the reuse of code between disparate projects, rather than within a single project. (Unfortunately the use of 'project' here is misleading - I mean reuse between real-world projects undertaken for customers, not projects in a Visual Studio sense.)
It generally can be guide by deployment considerations:
How will you deploy (i.e. what will you copy on your production machine) ?
If what you are deploying are packaged components (i.e. dll, jar, war, ...), it is wise to organize the "code library" as a collection of packaged set of files.
That way, you will develop directly with the -- dll, jar, war, ... -- which will be deployed on the production platform.
The idea being: if it works with those packaged files, it may still work in production.
the reuse of code between disparate projects, rather than within a single project.
I maintain that kind of reuse is easier in a "component" approach (like the one discussed in the question "Vendor Branches in GIT")
Over more than 40 current projects, we achieved:
technical reuse by systematically isolating any pure technical aspect into independent framework (typically, log framework, exception framework, KPI - Key Performance Indicator - framework, and so on).
Those technical components are reused into every other projects.
functional reuse by setting a clear applicative architecture in order to divide any functional domain (given the business and functional specifications) into well-defined applications. That would typically involve, for instance, a bus layer which is also a great candidate for exposing services reused by any other projects.
Summary:
For large functional domain, a single project being not manageable, a good applicative architecture will lead to natural code reuse.
We follow these principles:
The Release-Reuse Equivalency Principle: The granule of reuse is the granule of release.
The Common Closure Principle: The classes in a package should be closed together against the same kinds of changes.
The Common Reuse Principle: The classes in a package are reused together.
The Acyclic Dependencies Principle: Allow no cycles in the package dependency graph.
The Stable Dependency Principle: Depend in the direction of stability.
The Stable Abstraction Principle: A package should be as abstract as it is stable.
You can find out more over here and over here.
It depends on what platform you work. I'm a (proud) Java developer and we have nice tools to organise our dependencies such as Maven or Ivy
Whatever else you decide good source code control is crucial to this,as it allows you to implement your strategy whatever way you like without ending up with lots of unrelated copies of your libraries.good branching support is essential.
Related
Now firstly I realise the title is extremely broad, so let me describe the use case.
Background:
I'm currently teaching myself Scala+Gradle (because I like the flexibility and power of gradle and the much more legible build files)
As such with learning new languages its often best to make applications that you can actually use, and being primarily a PHP (with Symfony) programmer and formerly a Java programmer, there are many patterns that could carry across from both paradigms.
Use Case:
I'm writing an application where I am experimenting with a Provider+Interface(trait) layout, the goal is to define traits that encompass all the expected functionality for any particular type of component e.g. a ConfigReaderTrait and a YamlConfigReager as a provider. Theoretically the advantage of this would be to allow me to switch out core mechanisms or even architectural components with minimal effort, this allows for a great deal of R&D and experimenting.
PHP Symfony Influence
Now currently I work as a pure PHP dev, and as such has been introduced to Symfony, which has a brilliant Dependency Injection framework where the dependencies are defined in yaml files, and can be delegated to sub directories. I like this, because unlike with SBT I am unphased by using different languages for different purposes (eg groovy with gradle for build scripts) and I want to maintain a separation of concerns.
Such that each type of interface/trait or bundle of related functionality should be able to have its own DI config, and I would prefer it separate from the scala code itself.
Now for Scala....
Obviously things are not the same across different languages, and if you don't embrace the differences you may aswell go back to the previous language and leave things at that.
That said, I am not yet convinced by the DI frameworks I see for scala.
Guice for example is really a modified java framework (which is fine
because scala can use java libs, but because they don't function in
the entirely same paradigm of coding languages it feels as though
scala's capabilities are not leveraged)
MacWire annoyed me a bit,because you had to define the dependencies
in the files where you used them. Which does not assist in my
interface/provider concept.
SubCut so far seems to be the best suited to what I would expect.
But while going through all of this (and bare in mind this is all in the research phase, I havent used any of them yet) it seemed that DI in Scala is still very scattered, and in its infancy, by that I mean that there are different implementations with different applications, but not one flexible enough or power enough to compare to Symfonys DI. particularly not for my application.
Comments? Thoughts?
My 5 cents:
I have actually stopped using dependency injection frameworks after switching to Scala from Java.
The language allows for a few nice ways of doing it without a framework (multiple parameter lists and currying as well as the mixins for doing injection the way the 'cake pattern' does)
and I find myself more and more just using constructor or method parameter based injection as it clearly documents what dependencies a given piece of logic has and where it got those dependencies from.
It's also fairly easy to create different modules sets of implementations or factories for implementations using Scala objects and then selecting between those at runtime. This will give you the guarantee that it wont compile unless there is an implementation available, as opposed to the big ones in Java-land that will fail in runtime, effectively pushing a compile time problem into runtime.
This also removes the 'magic' of how dependencies are created and wired (reflection, runtime weaving, macros, XML, binding context to thread local etc). I think this makes it much easier for new developers to jump into a project and understand how the codebase is interconnected.
Regarding declaring implementations in non-code like XML I have found that projects rarely or never change those files without making a new release so then they might as well be code with all the benefits that bring (IDE support, performance, type checking).
What's the recommended way to use Maven for a project that might grow large in the future?
I work with Eclipse and I see different approaches. Some use one project with no sub modules, and some Like mahout for example, have different sub-projects for different modules (e.g., core, math, examples, etc.). You can see it in this link:
http://svn.apache.org/repos/asf/mahout/trunk/
Is there any advantage preferring one over the other?
Thanks.
The decision to split a project into modules should be driven by the way you want to design and maintain your application. Maven itself will work well whether you choose to create a multi-module project, or lump everything together in a single module. So then the question becomes, what are the advantages/disadvantages of splitting your application into multiple modules.
Some drivers for splitting your app are purely technical:
Generation of separate client and server artifacts
Generation a command line version and a web app version
Module dependency relationships
In other cases, it is more design related concerns will drive you to split up your application. This is especially important if your concern is an application that will grow over time. In these cases, you want to separate your application into specific areas of concern, and have defined service boundaries through which modules interact. This allows you to evolve individual modules over time, while minimizing the effect on the remaining parts of the application.
Note that maven is especially good with large multi-module projects, because of the support it provides for dependency management and resolution.
Modules are really a core concepts with Maven (and are well supported) and the obvious advantage of modular builds is... well, modularity.
Pros:
Allows better separation of concerns.
Promotes, enforces modular design of code.
Gives you finer grained control of what you "apply" to each parts.
Allows to work on subparts of the code in isolation of the other parts.
Using binary dependencies speeds up compilation (vs compiling a whole monolithic project).
Allows users to depend on subparts of your application only.
Cons:
If not needed, multiple modules would induce more maintenance than a monolithic project.
So, Maven supports and promotes modular builds, that's part of the design.
And in some case, you actually don't even have the choice because of the one (main) artifact per module golden rule: if you want to distribute an application as different parts, or to assemble several parts of an application (e.g. a client JAR and and a server JAR, or JARs in a WAR, or some EJB-JARs and WARs in a EAR), you must create dedicated modules for them (Maven can't produce a WAR and an EAR from a same module).
To sum up, modularity of your build is somehow driven by the nature of your application, it might just be required. But if your app is not modular by nature (say you develop a swing client), you can use a single module. Divide it as needed when things become too complex, too big.
It's recommended to use modules cause usually in a large project you have reltionships between the modules which are handled by Maven. On the other hand the related parts are related to each other so a reactor build is a good way and of course the release system of Maven give you many things to support you in your project ( branching etc.).
A repeating theme in my development work has been the use of or creation of an in-house plug-in architecture. I've seen it approached many ways - configuration files (XML, .conf, and so on), inheritance frameworks, database information, libraries, and others. In my experience:
A database isn't a great place to store your configuration information, especially co-mingled with data
Attempting this with an inheritance hierarchy requires knowledge about the plug-ins to be coded in, meaning the plug-in architecture isn't all that dynamic
Configuration files work well for providing simple information, but can't handle more complex behaviors
Libraries seem to work well, but the one-way dependencies have to be carefully created.
As I seek to learn from the various architectures I've worked with, I'm also looking to the community for suggestions. How have you implemented a SOLID plug-in architecture? What was your worst failure (or the worst failure you've seen)? What would you do if you were going to implement a new plug-in architecture? What SDK or open source project that you've worked with has the best example of a good architecture?
A few examples I've been finding on my own:
Perl's Module::Plugable and IOC for dependency injection in Perl
The various Spring frameworks (Java, .NET, Python) for dependency injection.
An SO question with a list for Java (including Service Provider Interfaces)
An SO question for C++ pointing to a Dr. Dobbs article
An SO question regarding a specific plugin idea for ASP.NET MVC
These examples seem to play to various language strengths. Is a good plugin architecture necessarily tied to the language? Is it best to use tools to create a plugin architecture, or to do it on one's own following models?
This is not an answer as much as a bunch of potentially useful remarks/examples.
One effective way to make your application extensible is to expose its internals as a scripting language and write all the top level stuff in that language. This makes it quite modifiable and practically future proof (if your primitives are well chosen and implemented). A success story of this kind of thing is Emacs. I prefer this to the eclipse style plugin system because if I want to extend functionality, I don't have to learn the API and write/compile a separate plugin. I can write a 3 line snippet in the current buffer itself, evaluate it and use it. Very smooth learning curve and very pleasing results.
One application which I've extended a little is Trac. It has a component architecture which in this situation means that tasks are delegated to modules that advertise extension points. You can then implement other components which would fit into these points and change the flow. It's a little like Kalkie's suggestion above.
Another one that's good is py.test. It follows the "best API is no API" philosophy and relies purely on hooks being called at every level. You can override these hooks in files/functions named according to a convention and alter the behaviour. You can see the list of plugins on the site to see how quickly/easily they can be implemented.
A few general points.
Try to keep your non-extensible/non-user-modifiable core as small as possible. Delegate everything you can to a higher layer so that the extensibility increases. Less stuff to correct in the core then in case of bad choices.
Related to the above point is that you shouldn't make too many decisions about the direction of your project at the outset. Implement the smallest needed subset and then start writing plugins.
If you are embedding a scripting language, make sure it's a full one in which you can write general programs and not a toy language just for your application.
Reduce boilerplate as much as you can. Don't bother with subclassing, complex APIs, plugin registration and stuff like that. Try to keep it simple so that it's easy and not just possible to extend. This will let your plugin API be used more and will encourage end users to write plugins. Not just plugin developers. py.test does this well. Eclipse as far as I know, does not.
In my experience I've found there are really two types of plug-in Architectures.
One follows the Eclipse model which is meant to allow for freedom and is open-ended.
The other usually requires plugins to follow a narrow API because the plugin will fill a specific function.
To state this in a different way, one allows plugins to access your application while the other allows your application to access plugins.
The distinction is subtle, and sometimes there is no distiction... you want both for your application.
I do not have a ton of experience with Eclipse/Opening up your App to plugins model (the article in Kalkie's post is great). I've read a bit on the way eclipse does things, but nothing more than that.
Yegge's properties blog talks a bit about how the use of the properties pattern allows for plugins and extensibility.
Most of the work I've done has used a plugin architecture to allow my app to access plugins, things like time/display/map data, etc.
Years ago I would create factories, plugin managers and config files to manage all of it and let me determine which plugin to use at runtime.
Now I usually just have a DI framework do most of that work.
I still have to write adapters to use third party libraries, but they usually aren't that bad.
One of the best plug-in architectures that I have seen is implemented in Eclipse. Instead of having an application with a plug-in model, everything is a plug-in. The base application itself is the plug-in framework.
http://www.eclipse.org/articles/Article-Plug-in-architecture/plugin_architecture.html
I'll describe a fairly simple technique that I have use in the past. This approach uses C# reflection to help in the plugin loading process. This technique can be modified so it is applicable to C++ but you lose the convenience of being able to use reflection.
An IPlugin interface is used to identify classes that implement plugins. Methods are added to the interface to allow the application to communicate with the plugin. For example the Init method that the application will use to instruct the plugin to initialize.
To find plugins the application scans a plugin folder for .Net assemblies. Each assembly is loaded. Reflection is used to scan for classes that implement IPlugin. An instance of each plugin class is created.
(Alternatively, an Xml file might list the assemblies and classes to load. This might help performance but I never found an issue with performance).
The Init method is called for each plugin object. It is passed a reference to an object that implements the application interface: IApplication (or something else named specific to your app, eg ITextEditorApplication).
IApplication contains methods that allows the plugin to communicate with the application. For instance if you are writing a text editor this interface would have an OpenDocuments property that allows plugins to enumerate the collection of currently open documents.
This plugin system can be extended to scripting languages, eg Lua, by creating a derived plugin class, eg LuaPlugin that forwards IPlugin functions and the application interface to a Lua script.
This technique allows you to iteratively implement your IPlugin, IApplication and other application-specific interfaces during development. When the application is complete and nicely refactored you can document your exposed interfaces and you should have a nice system for which users can write their own plugins.
I once worked on a project that had to be so flexible in the way each customer could setup the system, which the only good design we found was to ship the customer a C# compiler!
If the spec is filled with words like:
Flexible
Plug-In
Customisable
Ask lots of questions about how you will support the system (and how support will be charged for, as each customer will think their case is the normal case and should not need any plug-ins.), as in my experience
The support of customers (or
fount-line support people) writing
Plug-Ins is a lot harder than the
Architecture
Usualy I use MEF. The Managed Extensibility Framework (or MEF for short) simplifies the creation of extensible applications. MEF offers discovery and composition capabilities that you can leverage to load application extensions.
If you are interested read more...
In my experience, the two best ways to create a flexible plugin architecture are scripting languages and libraries. These two concepts are in my mind orthogonal; the two can be mixed in any proportion, rather like functional and object-oriented programming, but find their greatest strengths when balanced. A library is typically responsible for fulfilling a specific interface with dynamic functionality, whereas scripts tend to emphasise functionality with a dynamic interface.
I have found that an architecture based on scripts managing libraries seems to work the best. The scripting language allows high-level manipulation of lower-level libraries, and the libraries are thus freed from any specific interface, leaving all of the application-level interaction in the more flexible hands of the scripting system.
For this to work, the scripting system must have a fairly robust API, with hooks to the application data, logic, and GUI, as well as the base functionality of importing and executing code from libraries. Further, scripts are usually required to be safe in the sense that the application can gracefully recover from a poorly-written script. Using a scripting system as a layer of indirection means that the application can more easily detach itself in case of Something Badâ„¢.
The means of packaging plugins depends largely on personal preference, but you can never go wrong with a compressed archive with a simple interface, say PluginName.ext in the root directory.
I think you need to first answer the question: "What components are expected to be plugins?"
You want to keep this number to an absolute minimum or the number of combinations which you must test explodes. Try to separate your core product (which should not have too much flexibility) from plugin functionality.
I've found that the IOC (Inversion of Control) principal (read springframework) works well for providing a flexible base, which you can add specialization to to make plugin development simpler.
You can scan the container for the "interface as a plugin type advertisement" mechanism.
You can use the container to inject common dependencies which plugins may require (i.e. ResourceLoaderAware or MessageSourceAware).
The Plug-in Pattern is a software pattern for extending the behaviour of a class with a clean interface. Often behaviour of classes is extended by class inheritance, where the derived class overwrites some of the virtual methods of the class. A problem with this solution is that it conflicts with implementation hiding. It also leads to situations where derived class become a gathering places of unrelated behaviour extensions. Also, scripting is used to implement this pattern as mentioned above "Make internals as a scripting language and write all the top level stuff in that language. This makes it quite modifiable and practically future proof". Libraries use script managing libraries. The scripting language allows high-level manipulation of lower level libraries. (Also as mentioned above)
I've noticed in pretty much every company I've worked that they have a common library that is generally shared across a number of projects. More often than not this has been a single companyx-commons project that ends up as a dumping ground for common programs including:
Command Line Parsers
File Utilities
Framework Helpers
etc...
Some of these are well thought out and some duplicate functionality found in Apache commons-lang, commons-io etc..
What are the things you have in your common library and more importantly how do you structure the common libraries to make them easy to improve and incorporate across other projects?
In my experience, the single biggest factor in the success of a common library is user buy-in; users in this case being other developers; and culture of your workplace/team(s) will be a big factor.
Separate libraries (projects/assemblies if you're in .Net) for different application tiers is essential (e.g: there's obviously no point putting UI and data access code together).
Keep things as simple as possible; what you don't put in a common library is often at least as important as what you do. Users of the library won't want to have to think, so usage needs to be super easy.
The golden rule we stuck to was keeping individual functions focused on a single task - do one thing and do it well (or very very well); don't try and provide something that tries to take every possibility into account, the more reusable you think you're making it - the less likely it is to be used. Code Complete (the book) has some excellent content on common libraries.
A good approach to setting/improving a library up is to do regular code reviews and retrospectives; find good candidates that you've already come up with and consider re-factoring them into a library for future projects; a good candidate will be something that more than one developer has had to do on more that one project (for example).
Set-up some sort of simple and clear governance of the libraries - someone who can 'own' a specific library and ensure it's overal quality (such as a senior dev or team lead).
I have so far written most of the common libraries we use at our office.
We have certain button classes that are just slightly more useful to us than the standard buttons
A database management class that does some internal caching and can connect to ODBC, OLEDB, SQL, and Access databases without even the flip of a parameter
Some grid and list controls that are multi threaded so we can add large amounts of data to them without the program slowing and without having to write all the multithreading code every time there is a performance issue with a list box/combo box.
These classes make it easier for all of us to work on each other's code and know how exactly they work since we all use the exact same interfaces throughout our products.
As far as organization goes, all of the DLL's are stored along with their source code on a shared development drive in the office that we all have access to. (We're a pretty small shop)
We split our libraries by function.
Commmon.Ui.dll has base classes for ui elements.
Common.Data.Dll is sort of a wrapper around Enterprise library Data access classes.
Common.Business is a dumping ground for other common classes that don't fit into one of those.
We create other specialized dlls as needs arise.
At work we're developing a large-scale application with quite a few front-end, back-end and support components. Typically the front-end is developed in C# and the back-end is developed in Java, although parts of the back-end are also developed in C# and possibly later C++.
The choice of language and platform is not arbitrary; we try to weigh the relative merits of each in development time, tool-chain cost, familiarity with the language by the specific development team etc. What all these components have in common, though, is that they are all required for the complete operation of the product, and that they are being developed concurrently by independent (but highly communicative) teams.
Previously, we have used Team Foundation Server for our .NET code and Subversion for our Java code; because there was clear separation of the teams' responsibilities, this caused little problem beyond the inconvenience of placing binaries (WARs, in this case) generated from one source tree in another, and the high manual overhead of keeping the branches and revisions in sync. With this project, the degree of separation between the teams is intentionally much smaller, and the volume of branching/merging is expected to be considerably higher; as a result we're moving to a unified VCS, more specifically Subversion.
This brings me to the meat of the question: how does one mix Java and C# code effectively? In practice, we'll have .NET code dependent on a Java codebase; the Java binaries are required to run anything other than unit test code (integration tests already require the binaries, and QA, acceptance testing etc. certainly does as well). What we currently have in mind looks something like:
/trunk
/java
/component1
/component2
/library1
/library2
/net
/assembly1
/assembly2
/...
project.sln
The idea is that the entire source tree is placed under one branch; the .NET code is dependant on the Java code, so we'll add a post-build step to the solution which will (most likely) call the ant script for the Java components. This allows branching of the entire codebase (for .NET developers) or just the Java components (for Java developers).
The problems with this solution are:
What happens when one of the two codebases becomes so large that making copies of it for every branch gets impractical? (our thoughts: split to separate repositories for .NET and Java code and use svn:externals, any input on this would be greatly appreciated).
We use Eclipse for Java development. How do we manage the "shared" workspace (i.e. which projects are required for which components, the dependency graph etc.)? Up until now we've had relatively few Java components, so each developer could just keep all of them in the workspace at the same time. With the increase in Java components and Java developers I don't see how we can keep doing that; any suggestions on how to keep the workspace versioned (a la solution files) while still maintaining sync between the two code-bases?
I would love to hear your input!
1: I've found it best to group things by component, rather than langugage. If one component requires several languages for interface, you still need to develop, test and release them as one. So, splitting a component across several repos is not a good idea.
If one part of the code depends tightly on the other, keep it together. Better to split components across repos. (This even goes for internal structure, where, especially as things grow, it's difficult if you package things by type, rather than by function, i.e. in MVC, don't have three huge packages for each category, rather keep FooView, FooModel and FooController tight.)
svn:externals might work, and with the later versions I think you can use "internals", i.e. link to other dirs in the same repo. That is miles easier than managing separate repos, especially with tagging and branching. (shudder)
2: You could always have the developers setup different workspaces, or perhaps use working sets. Commercial Eclipse releases has better support for sharing workspace settings than the OS variant. (Haven't tried, only worked and been frustrated with the OS one)
I've done C++ (MSVS) and Java (Eclipse) in one repo, and it works pretty well. Also C++/Python similarly. Make sure your build system supports building and testing everything (even if your IDEs only build one part).