Hudson: Keeping track of number of changed files in each build - version-control

Does anyone know of a built-in way to have Hudson keep track of how many files are changed, added or deleted in the source code repository in each build? I'd like to plot the results in the same way that the JUnit test results graphs show the numbers of passing and failing tests for each build.
The Measurement Plots plugin and the Plot Plugin look like they might give me a starting point, but I'm wondering if there might be a more specific plugin or feature already available.
My SCM system is CVS, but I'd like a generic solution that would work with other SCM systems.

I don't believe there are any existing plugins that will do this directly.
If it doesn't need to be specifically tracked for each build (that is if you are really more interested in changes over time), then I would suggest setting up Sonar, which tracks daily changes from your builds and integrates fantastically with Hudson, or FishEye which connects directly to your SCM system.
But why not try to write the plugin for Hudson? Seems like the sort of thing that people might like to visualize as a per-build metric.

I think this question is more generic than specific to Hudson. You're probably going to have to write a little code by yourself. Unfortunately, I don't think any solution will be SCM agnostic because Hudson tends to use the SCM tools themselves to do the SCM bits.
I couldn't find any off-the-shelf solutions, so here's what I see would have to be done:
1) Find the SCM command that you are using (e.g., svn update or cvs -n update).
2) Use wc -l or some other command to count the number of lines in its output. This will give you an estimate of the number of changed/added/deleted files.
3) Parse the output if you want the names of the individual files that have been added/changed/deleted.
Unfortunately, I don't think there's an SCM-agnostic way of doing this. Perhaps the best you could do is find a pure-Java CVS/SVN client implementation that you could modify to keep track of files as they come in from the SCM.
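For CVS specifically, the counting step could be sketched in Python roughly as follows. This is only a sketch under assumptions: it expects the text printed by cvs -n update, where each affected file appears as a one-letter status followed by a path (U or P for updated, A added, R removed, M locally modified, C conflict, ? unknown to CVS). The function name count_cvs_changes and the category labels are made up for illustration, and any other SCM would need its own parser.

```python
from collections import Counter

# Status letters printed by `cvs update`, one "X path" line per file.
# The category names on the right are illustrative labels, not CVS terms.
CVS_STATUS = {
    "U": "updated",
    "P": "updated",   # patched: same end result as U
    "A": "added",
    "R": "removed",
    "M": "modified",
    "C": "conflict",
    "?": "unknown",
}


def count_cvs_changes(output: str) -> Counter:
    """Count files per category in the output of `cvs -n update`."""
    counts = Counter()
    for line in output.splitlines():
        # Informational lines such as "cvs update: Updating src" do not
        # have a status letter followed by a single space, so they are skipped.
        if len(line) > 2 and line[1] == " " and line[0] in CVS_STATUS:
            counts[CVS_STATUS[line[0]]] += 1
    return counts


if __name__ == "__main__":
    sample = (
        "cvs update: Updating src\n"
        "U src/Main.java\n"
        "M src/Util.java\n"
        "? notes.txt\n"
    )
    for category, n in sorted(count_cvs_changes(sample).items()):
        print(category, n)
```

The resulting numbers could then be written to a file per build and handed to whatever plotting plugin you settle on.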

Related

version control, configuration management and build combined

Could you please explain why such approach is not already existing and not widely in use?
Or if such a toolset exists, can you cite it?
Why is it that version control systems (VCS) such as ClearCase, svn, git etc. work on files and not on units/functions?
So to track changes in a functionality one has to analyze versions of a file (sometimes several files). As an example: if I want to analyze "Change functionality" I would get a history of that module/function and see it in one place.
If such a tool would exist... then, a software configuration tool (SCM) would put these units and/or functions together in a release configuration. Why is it, we still use Makefile, build.xml, plugin.xml etc?
And about the build: is it really necessary for a compiler to have files? What if the SCM could prepare an input for a build tool and get binaries?
C/C++ for example: such a SCM could prepare the whole source in one chunk and get binary out of compiler. In case of Java: SCM could prepare .java classes and get .jar out of compiler.
Thank you.
PS: I am not looking for a solution to any particular problem; this is more about method. Every project takes the same approach to source/config/build, albeit with different tools that keep evolving, but there is no new approach/method to address complex systems in a different way.
"if I want to analyze 'Change functionality' I would get a history of that module/function and see it in one place. If such a tool would exist"
It does actually: see "Can Git really track the movement of a single function from 1 file to another? If so, how?", and its git blame -C command.
"Why is it, we still use Makefile, build.xml, plugin.xml etc?"
It is about declaration: you declare what you want to build, but most importantly in which order, and which dependencies you need.
"if the SCM could prepare an input for a build tool and get binaries?"
The input for the build tool remains files, not "units/functions": compilation tools are much more evolved and better equipped to parse, analyze and extract those units, building the binary as a result.
Putting too many responsibilities into the SCM tool alone amounts to making it try to do it all, which means it will do "all" not too well, as opposed to doing one thing brilliantly.

Automate build and development pattern with Visual Studio

I'm currently working on a project that's been going on for several years straight. The development team is small (fewer than 5 programmers), source control is virtually non-existent, and the deployment process as it stands is just based on manually moving files from one server to another. The project is in classic ASP, so building isn't an issue: both deployment and testing are just about getting the files to where they need to be and pointing the browser at the correct location.
Currently all development is done on a network drive which is also the test server. The test server is only available from inside the local network (it can be accessed through VPN), and is reachable at the address 'site.test' in the browser (this requires editing the hosts file on all the clients, but since there are so few of us that hasn't proven to be any problem at all). All development is done in Visual Studio. Whenever a file is changed, the developer that changed it is required to write the file he changed into a word-document and include a small description as of what was changed and why. Then, whenever there's supposed to be a version bump (deployment), our lead developer goes through the Word document and copies every file that has changed (file by file) over to the production server.
Now, I don't think I need to tell you that this method is very error prone (a developer might, for instance, forget to add that he changed some dependency, which might cause problems when deployed), and that there's a lot of work involved in deploying.
And here comes the main question. I've been asked by the lead developer to spend some time and see if I can come up with a simple solution that can simplify and automate the "version-control" and the deployment. The important thing is that it's as easy as possible for the developers to use. Two of the existing developers have worked with computers for a long time and are pretty stuck up in their routines, so changing to something like git bash, for instance, wouldn't work at all. Don't get me wrong, I love git, but the first time one of them got a merge conflict they wouldn't know what to do at all. Also, it would be ideal to change to a more distributed development process where the developers wouldn't need to be logged into VPN (or need internet at all) to develop, and the changes they made offline could be synced up when they were done with them.
I've looked at Team Foundation Server (TFS) from Microsoft because of its strong integration with Visual Studio. As far as I've tested, it seems possible to make Visual Studio ask the user whether they want to check in changes whenever the user closes Visual Studio. Using TFS for source control would probably eliminate most of the problems with the development, but how about deployment? Not to mention versioning? As far as I've understood (I've only looked briefly at TFS), TFS has a running number for every check-in, but is it possible to tell TFS that this check-in should be version 2.0.1 of the system (for example), and then have it deploy it to the web server? Another problem: the whole solution consists of about 10 directories with hundreds of files in them, though the system itself (without images and such) is only 5 directories, and only these 5 should be deployed to the server. Is this possible to automate?
I know there are a lot of questions here, but what is most important is that I want to automate the development process (not the coding, but the managing of the code) and the deployment process, and I want to make it as simple as possible to use. I don't care if the setup is a bit of work, because I have enough time on my hands to set up whatever system fits our needs, but the other devs should not have to do a lot of setup. If all of the machines that should use the system need to be set up once, that's no problem at all, because I can do that, but there shouldn't be any need to do config and setup as we go.
Now, do any of you have any suggestions as to which systems to use and how to use them, in order to simplify the processes described above? I've worked with several types of SCM systems before (Git, Hg and Subversion), but I don't have any experience with build systems at all (if that is needed). Articles and discussions on how to efficiently set up systems like this would be greatly appreciated. Thanks in advance.
This is pretty subjective territory, but I think you need to get some easy wins first. The developers who are "stuck up in their ways" are the main roadblock here. They are going to see change as disruptive and not worth it. You need to slowly and carefully go for the easy wins.
First, TFS is probably not going to be a good choice. It's expensive, heavy, and the source control in TFS is pretty lousy. Go for Subversion: it's easy to set up and easy to use, and it's free. Get that in place first, and get the devs using it. Much easier said than done.
Later (possibly much later), once the devs are using it and couldn't imagine life without a VCS, then you could switch to Hg or Git if you need first class branching and all those other nice features.
Once you have Subversion in place, you can use something like JetBrains TeamCity or Jenkins, both of which are free and easy to use. However, I'm just assuming you don't have a lot of tests and build scripts that the CI server is really going to be running, so it's far more important that you get VCS first. In all things: keep it as simple as possible. Baby steps. Get some wins, build trust, repeat.
I can't even begin to think where to begin with this! No offense intended, but apart from the mention of git and HG, this post could have been written 10 years ago.
1) Source control - How can a team of developers possibly work effectively without some form of source control? Hell, even if it's Visual Source Safe (* shudder *) at least it would be something. You have to insist that the team implement source control. You know what's available so I won't get into preaching about that. (However, Subversion with TortoiseSVN has worked quite well for me.)
2) "write the file he changed into a word-document and include a small description as of what was changed and why"
You have got to be kidding... What happens if two developers change the same file? Does the lead then have to manually merge two changes that s/he extracts from the word doc? Please see #1 and explain to them how commit comments work.
Since you don't really need to "build" (i.e. compile, etc.) anything, you should be able to solve most of your problems with some simple tools. First and foremost you need to use a source control solution. Yes, the developers would have to learn how to use another tool (EEEK!). You could do the initial legwork of getting the code into the repository. If you have file access to the other developers' machines, you could even copy a checked-out working copy to their machines so they wouldn't have to do the checkout themselves (not that it is really that hard).
You could then use all the creamy goodness of version control to create version branches whenever a deployment needs to be done. You could write simple scripts using the command line SVN tools to check out said branches and automatically copy the files to the target server(s). Using a tool like BeyondCompare, the copy process could be restricted to only the files that are different (plus BC can handle an FTP target if that is an issue). By enforcing commit comments on the SVN repo, you'll guarantee that the developers provide comments, and for each set of changes between releases you could very easily generate a list of all those comments using the SCM log retrieval features.
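As a rough illustration of the "copy only what differs" step, here is a minimal Python sketch. It is not BeyondCompare and not what the answer above prescribes: it swaps in a plain byte-for-byte comparison from the standard library, works on local or mounted paths only (no FTP), and the function name deploy and the directory names in the usage note are hypothetical.

```python
import filecmp
import shutil
from pathlib import Path


def deploy(working_copy, target, include_dirs):
    """Copy new or changed files from selected directories to the target.

    Only the directories named in include_dirs are considered, which covers
    the "deploy 5 of the 10 directories" requirement. Returns the relative
    paths of the files that were actually copied.
    """
    working_copy, target = Path(working_copy), Path(target)
    copied = []
    for name in include_dirs:
        for src in sorted((working_copy / name).rglob("*")):
            # Skip directories and Subversion's own metadata.
            if not src.is_file() or ".svn" in src.parts:
                continue
            rel = src.relative_to(working_copy)
            dst = target / rel
            # shallow=False forces a content comparison, not just size/mtime.
            if dst.exists() and filecmp.cmp(src, dst, shallow=False):
                continue
            dst.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src, dst)
            copied.append(rel.as_posix())
    return copied
```

Called as, say, deploy("checkout/release-2.0.1", "//prodserver/wwwroot", ["pages", "includes"]) after checking out the release branch, it would leave files outside the listed directories untouched. Note that it never deletes anything on the target, so removed files would still need handling.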

Eclipse: collaborative dev on shared drive

I am using Eclipse with the "statet plugin for R".
I am looking for a way to do collaborative development (like in google docs: allow people to modify code at the same time). Any preferred plugin for that? I have seen eXtreme Collaborative Development Environment but I don't know if it is good?
I wouldn't use a shared drive; I'd set up a source code repository using Mercurial or Git and use that. I'm sure there are plugins for those for Eclipse.
How would you stop it being a free-for-all? In my experience, when developing code you want to control change, not have it forced upon you when you aren't ready for it.
Consider this scenario. You have a hard to explain defect in your code. You are steadily debugging it, throwing different data at it, looking at intermediate values etc. You are just about to trap it when BOOM, somebody else changes some other code and your results change because of that.
Spacedman is right, use a revision control system of your choice and keep in control of change.
If you want to do something like pair programming, but remotely, then use a remote screen, e.g. VNC, with Skype so that you can explain to the other what you are trying to achieve.
I've used egit to add git functionality to Eclipse with StatEt and it works well. Allows others to edit code in whatever way works best for them (one repository, several individual repositories with frequent merges etc)
There are plugins out there that will facilitate realtime code sharing. They commonly work by having all individual devs have their own copy of the files and synching changes back-n-forth on the fly. If conflicts are found you get to decide how to resolve them explicitly.
Here is one such plugin from ECF project:
http://wiki.eclipse.org/DocShare_Plugin
I would recommend a source control system for day-to-day development. Real-time code sharing works best for holding short-term collaborative editing or debugging sessions, doing code reviews, etc.

Checkstyle source control integration

I've been looking into checkstyle recently as part of some research into standard coding conventions. Though it seems like it is perfectly suitable for brand new projects, it seems to have a huge barrier to adoption for already existing projects as it doesn't seem to supply a method of only checking new or edited code. Maybe I'm wrong?
If you have a codebase that has never had a coding standard it could be a massive effort to get the whole codebase in line with a standard all at once. Allowing it to be done incrementally over time as code naturally evolves seems like a more reasonable approach. But it doesn't seem like a possibility with checkstyle.
I assume this would have to be a tie in with a source control system in order to be possible. Is that possible with Checkstyle or is there another tool that can provide this functionality?
As far as I know, Checkstyle is meant to analyze source without considering its history or revisions.
Adding that kind of feature means scripting the Checkstyle analysis so that it is fed the exact subset of files representing the delta.
But then certain kinds of checks, such as the duplicate code check, would be likely to fail or to miss things in their analysis.
So for that kind of incremental analysis, you need to restrict not only the set of sources but also the set of rules you want to enforce, since some of those rules only make sense when applied to all the sources.
So, why couldn't you run a full check on each file and then filter results based on changes managed by your source control system? Does anything like that exist?
Not to my knowledge, especially with a plugin like eclipse-cs for Eclipse: if it analyzes a file, it will display all warnings, even if source control reports that the file has not changed since a given revision.
Only an external script would be able to do this. The principle is simple (although it could be a bit slow at execution time):
1) For each file, do a diff to check whether modifications have been made.
2) If so, run svn blame to annotate each line with the revision number that contained its last change.
3) Then analyze the file with Checkstyle.
4) The script can then filter the warnings down to the lines currently being modified (or to all the lines modified after a given revision).
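A minimal Python sketch of the filtering step might look like the following. It makes some assumptions that are not in the answer above: instead of svn blame it reads the hunk headers of a unified diff for a single file (for example from svn diff) to find the changed lines, and it expects Checkstyle's XML report format (file elements containing error elements with line and message attributes). The function names changed_lines and new_warnings are invented for the example.

```python
import re
import xml.etree.ElementTree as ET

# Matches unified diff hunk headers such as "@@ -10,3 +12,4 @@".
HUNK = re.compile(r"^@@ -\d+(?:,\d+)? \+(\d+)(?:,\d+)? @@")


def changed_lines(unified_diff: str) -> set:
    """Return new-file line numbers added or modified by a single-file diff."""
    changed, new_line = set(), None
    for line in unified_diff.splitlines():
        match = HUNK.match(line)
        if match:
            new_line = int(match.group(1))
        elif new_line is None:
            continue                    # "---"/"+++" headers before first hunk
        elif line.startswith("+"):
            changed.add(new_line)
            new_line += 1
        elif line.startswith(("-", "\\")):
            continue                    # removed line or "\ No newline" marker
        else:
            new_line += 1               # unchanged context line
    return changed


def new_warnings(report_xml: str, file_name: str, lines: set) -> list:
    """Keep only the Checkstyle warnings that fall on the given lines."""
    kept = []
    for file_el in ET.fromstring(report_xml).iter("file"):
        if file_el.get("name") != file_name:
            continue
        for err in file_el.iter("error"):
            line = int(err.get("line", 0))
            if line in lines:
                kept.append((line, err.get("message")))
    return kept
```

As noted above, this only makes sense for per-line rules; checks that look at the whole code base would still report on untouched code or miss things.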
We developed a Checkstyle plugin for SCM-Manager, a tool for managing git, subversion and mercurial repositories. If activated, it can check committed source code against your Checkstyle rules. If the check finds errors, the commit is aborted.

Version control of deliverables

We need to regularly synchronize many dozens of binary files (project executables and DLLs) between many developers at several different locations, so that every developer has an up-to-date environment to build and test in. Due to the nature of the project, updates must be done often and on demand (overnight updates are not sufficient). This is not pretty, but we are stuck with it for a time.
We settled on using a regular version (source) control system: put everything into it as binary files, get-latest before testing and check-in updated DLL after testing.
It works fine, but a version control client has a lot of features which don't make sense for us and people occasionally get confused.
Are there any tools better suited for the task? Or maybe a completely different approach?
Update:
I need to clarify that it's not a tightly integrated project, more like an extensible system with a heap of "plugins", including third-party ones. We need to make sure those modules/plugins work nicely with recent versions of each other and the core. A centralised build, as was suggested, was considered initially, but it's not an option.
I'd probably take a look at rsync.
Just create a .CMD file that contains the call to rsync with all the correct parameters and let people call that. rsync is very smart in deciding what part of files need to be transferred, so it'll be very fast even when large files are involved.
What rsync doesn't do though is conflict resolution (or even detection), but in the scenario you described it's more like reading from a central place which is what rsync is designed to handle.
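If a plain .CMD file turns out to be too limited, the same idea can be wrapped in a few lines of Python. This is just a sketch: the host name and paths in the usage note are placeholders, rsync is assumed to be installed and on the PATH, and the helper names are made up. The flags used (--archive, --compress, --verbose, --dry-run) are standard rsync options.

```python
import subprocess


def rsync_command(source: str, dest: str, dry_run: bool = False) -> list:
    """Build the rsync argument list for pulling binaries from a central place."""
    cmd = ["rsync", "--archive", "--compress", "--verbose"]
    if dry_run:
        cmd.append("--dry-run")   # only report what would be transferred
    cmd += [source, dest]
    return cmd


def pull_binaries(source: str, dest: str, dry_run: bool = False) -> int:
    """Run rsync and return its exit code (0 means success)."""
    return subprocess.run(rsync_command(source, dest, dry_run)).returncode


if __name__ == "__main__":
    # Placeholder locations; a trailing slash on the source means
    # "the contents of this directory".
    print(" ".join(rsync_command("buildhost:/srv/binaries/", "./bin/", dry_run=True)))
```

Developers would then call pull_binaries("buildhost:/srv/binaries/", "./bin/") before testing; a dry run first shows which files are out of date without touching anything.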
Another option is Unison.
You should look into continuous integration and having some kind of centralised build process. I can only imagine the kind of hell you're going through with your current approach.
Obviously that doesn't help with the keeping your local files in sync, but I think you have bigger problems with your process.
Building the project should be a centralized process in order to allow for better control; otherwise your solution will turn into chaos in the long run. Anyway, here is what I'd do.
1) Create the usual repositories for source files, resources, documentation, etc. for each project.
2) Create a repository for resources. It will hold the latest binary versions for each project as well as any required resources, files, etc. Keep a good folder structure for each project so developers can "reference" the files directly.
3) Create a repository for final builds which will hold the actual stable release. This will get the stable files, produced in an automatic way (if possible) from the checked-in sources. This will hold the real product, the real version for integration testing and so on.
While far from perfect, this lets you define well-established protocols: check in your latest DLL here, generate the "real" version from the latest source here.
What about embedding a 'what' string in the executables and libraries? Then you can synchronise the desired list of versions with a manifest.
We tend to use CVS id strings as a part of the what string.
/* what(1) prints the text that follows the "@(#)" marker */
const char cvsid[] = "@(#)INETOPS_filter_ip_$Revision: 1.9 $";
Entering the command
what filter_ip | grep INETOPS
returns
INETOPS_filter_ip_$Revision: 1.9 $
We do this for all deliverables so we can see if the versions in a bundle of libraries and executables match the list in an associated manifest.
HTH.
cheers,
Rob
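Where the what command is not available (on Windows, for example), the manifest comparison described above could be approximated with a short Python sketch. The assumptions here: the identification strings start with the @(#) marker that what searches for and run until a NUL, newline, double quote, > or backslash; the manifest is simply a mapping from file name to the expected string; and the names what_strings and mismatches are made up for the example.

```python
import re
from pathlib import Path

# Text following the "@(#)" marker, up to a NUL, newline, '"', '\' or '>'.
WHAT = re.compile(rb'@\(#\)([^\0\n"\\>]*)')


def what_strings(data: bytes) -> list:
    """Return all identification strings embedded in a binary's bytes."""
    return [m.group(1).decode("ascii", "replace") for m in WHAT.finditer(data)]


def mismatches(manifest: dict, directory) -> dict:
    """Compare deliverables in a directory against a name -> string manifest.

    Returns a mapping of file name -> strings actually found, for every file
    whose expected identification string is missing.
    """
    bad = {}
    for name, expected in manifest.items():
        found = what_strings((Path(directory) / name).read_bytes())
        if expected not in found:
            bad[name] = found
    return bad
```

An empty result from mismatches means every deliverable in the bundle carries the version string the manifest lists for it.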
Subversion handles binary files really well, is pretty fast, and scriptable. VisualSVN and TortoiseSVN make dealing with Subversion very easy too.
You could set up a folder that's checked out from Subversion with all your binary files (that all developers can commit to and update from), then just type "svn update" at the command line, or use TortoiseSVN: right click on the folder, click "SVN Update" and it'll update all the files and tell you what's changed.