Code Repo/Wiki for Academic Research Cluster? - version-control

(Not sure if this is the best SE, but nothing else seemed close enough)
I'm a 'fresh' PhD Researcher, and after chatting with most of my cluster-colleagues (including staff), I suggested putting together a system for sharing 'acquired experiential knowledge' (Digital Communications within Electrical Engineering, so lots of code, lots of languages, and lots of algorithms, therefore a lot of things to 'work out' twice.)
Any time that I've done major coding projects, it's been a single repository containing a single project, and this seems to be the general state of 'HOWTO' articles in this area. I'd be looking to put something together that would have a wiki 'front end' (I've got experience with Mediawiki so I will probably stick with that), with 'context' info and theoretical stuff, with a VCS 'backend' that would hold archives of code-bases that people wanted to share. The reasoning for this archive is that there is a lot of person-turnover and any generated code can disappear into the ether on their departure, so that experience is lost.
Can anyone recommend any tools for this kind of multi-project VCS backend? Ideally I'd like something similar to bitbucket but locally served.

If you want it open, code.google.com is a possible solution

For self-hosting Git repositories, and fronting them with something like Gitweb, Gitolite is the current standard. Your wiki can then provide links to both gitweb and the actual repository URLs. Keep to the 'one repository per project' structure. Any cross-referencing (say by language, topic, and task) can happen in the wiki.

That sounds like a problem, which could be solved by using github or bitbucket. Both offer (distributed) VCS with a wiki and an issue tracker. Github using Git and bitbasket using Mercurial. You can even make everything private.

Related

Are there best-practice guidelines for maintaining a repository?

Are there best-practice guidelines for maintaining a GitHub repository? I've contributed to many open source projects and used GitHub for projects that I work on solo, but now I'm working with a team of six developers, including myself, to build a system, and I've been placed in charge of maintaining the repository. Nothing is to get merged into our main branch without my approval. As little as I know about maintaining a GitHub repository, of those within the organization (two team members are consultants) I've the most experience with the process.
But I've never maintained a GitHub repository, and while I'm doing OK, I know that there must be a body of knowledge out there of how to handle this correctly. I just haven't been able to find it.
One hurdle I've been jumping over repeatedly, for example, is merge conflicts. Usually they're minor, but not always. Is there some known system available that allows me to enforce who has the ability to edit which files at any given time, for example?
And yes, I realize this may not be the best Stack Exchange forum, but none of the others seemed more suited to the topic.
The Cloud Native Computing Foundation (CNCF) serves as the vendor-neutral home for many of the fastest-growing open source projects, including Kubernetes, Prometheus, and Envoy.
As such, it can be used as a starting point for your own project: see contribute.cncf.io/maintainers/github/, which offers:
template, to be usre you have your README, LICENSE and other important files.
labels, to better classify your issues
Add also a clear "release and maintenance policy", and you should be in good shape.

Can't Decide On a Suitable Open Source Project Host

I need a little bit of help deciding which project host (if any) to move our currently existing project to.
We currently have an SVN (but willing to migrate if necessary) closed-source project existing on Assembla. We're wondering about moving because we want to open source our existing project and:
We don't have the resources to actively promote our now open source project, and suspect that the project hosting we select might influence how our project is publicized. If we move from assembla to github, how likely is it that our project will get more attention?
We want it to be as easy as possible for new developers to pickup and start running.
Our project is also going to need very extensive wiki documentation, as it is a very complex enterprise web application framework (somewhat similar to spring). Does it make sense to put that documentation at the same place as our repo host? Or should we have a separate website for that? We also would like to have a blog as well as a forum. Same question for those.
Help?
We don't have the resources to actively promote our now open source project, and suspect that the project hosting we select might influence how our project is publicized.
Might - and might not. You can see a lot of solo-projects on any source-hosting
We want it to be as easy as possible for new developers to pickup and start running
Assembla is very good choice in this case. Do not be in a common disease Git-mania. Really "big community" in case of Github is just common marketing cheating, no more - it's not your community
Assembla have most needed (for big complex project) tools, compared to competitors. Pull requests on Github implemented better, yes. But I can't recall any other advantages. Support of almost all modern widely-used SCM (except Bazaar) is big plus also.
Around community size: Assembla have big plans of expansion to million users in nearest years (two, AFAICR)
NB: You can think about changing SCM to (some) DVCS - forking|merging are more natural in these systems and it will give one more level of freedom to contributors without big headache for any side
I think your project would get more attention being on github or bitbucket and developers would be able to easily clone from either host using git. With that said, you could try to promote it on sites like Hacker News.
Since your wiki would be very complex, I think a seperate website would be more suitable.

What version control use if I'm the only developer?

I want to use some version control for my projects, but I'm the only developer, there is not others. I want to use my pendrive like repository because I develop in many different places(but the same project).
I only worked with SVN, but in that case, was not good, I think an DVCS was better.
But now, I really don't know what to use, if SVN is the best option. I've looked for another solutions like Mercurial, Git, and Fossil, but I don't understand the differences, and mainly, if they are the best options for my situation.
I need to know what is best in this case.
If you're the only developer, then the best version control is the one that you are most comfortable with. The goal of version control is to make your life as a developer easier and safer, so there's no point in fighting with a version control that you don't know.
However, if you want to learn how to use a new version control system, this is a great opportunity.
If you think that you're going to have more developers working on this project later, then you want to think about a robust solution like SVN.
I would definitely recommend Git. It is a little cryptic at times, but it seems to be/become the de-facto standard for most open source projects. It's very powerful and it'll enable you to work with the great service that is Github :)
There are good topics on SO that compare Git, Mercurial, svn, etc.
What is the Difference Between Mercurial and Git?
To me an important requirement was easy, free, private repositories online so I started to use http://Bitbucket.org They support both Mercurial and Git.
+1 to #dudemonkey sentences. Except last - tastes can differ. Even as sole developer, you can (and have) use best techniques - i.e you may have non-linear development (thus - branching|merging), refactoring of code, different targets. Try and select best for you solution.
Nobody mentioned Fossil SCM - small, portable app in one exe, with all basic features of DVCS, integrated Wiki and tickets - which you can have (with repo) in USB, for example for max mobility
In first place stay away from SVN or any other CVCS, they suck. To learn more about DVCS, I recommend the Eric Sink's Book, it's free: http://www.ericsink.com/vcbe/
As you plan to work on different machines, the best solution would be to have an online repository, I think it's more pratical and safer than the pendrive. Some of the most known out there are: https://bitbucket.org/, https://github.com/, https://launchpad.net/, http://code.google.com/projecthosting/. Remember that with a DVCS you don't need to be online all the time, you can commit locally and push to the server later. If your project is not open source, you should stick with BitBucket as it's the only one that offers free hosting for closed source projects.
If you really want to work just in the pendrive, don't leave the repository just in the pendrive. It's safer to clone the repository from the pen drive to the machines you will work. Then you Push/Pull to the pen drive's repository to synchronize.

Going against the common source control used in the organisation

The company I work for exclusively uses Clearcase. I feel it is not worth the effort to learn and use it as my project would not involve too many people(max of upto 3 people), neither would it involve fancy development flow. How do I convince my manager on using a separate source control for this when they bring up the point of "IT supported and uniform source control through out the company" against my suggestion? Or is that particular point valid and I should go with Clearcase?
Thanks...
PS: I was thinking of using Subversion.
If it's the company standard, then there should be people in the company that know it that can help you get started with it. They should be able to set up your particular IDE to work with it, and make everything as seamless as possible for you. You shouldn't have to learn anything crazy in order to use it - there should be clear instructions.
If there's not, then you're not getting the benefit of "IT supported and uniform source control through out the company" and it makes no difference what you use.
After working through this process, if an argument can be made that there's a better source control that would work better through the firm, you should make it. If it's a company standard, it shouldn't be a pain in the neck to use.
First of all, my condolences.
Is Clearcase that difficult to learn? If yes, I'd consider that a strike against it. If not, you might be better off just going with the flow and learning it.
The 3 people on your team aren't likely to be the ones that maintain the code over its software life, so that argument goes against you as well.
How big is the organization? It's easier for 3 people to learn Clearcase than for X to learn Subversion.
It would depend a lot on your boss and his/her willingness to consider alternatives.
Subversion would certainly be easy to set up and use. I'd hate to be using Clearcase myself.
But there are a lot of factors against you. Good luck.
If I were you then I'd want to use SVN because I just love it but just like anything with developement most of the time you have to go with the flow. Same as if you were extending a framework and you decided to do it your own way completely different to the rest of the framwork it would be frowned upon even if your way was much better. Also what would happen if you and your team got hit by a bus? OK SVN is easy to learn but there is still a learning curve that could slow down other people continuing your work.
You could also view this as a chance to learn a new source control system and make yourself more marketable. You could still prefer subversion, you can't be forced to change your opinion. As #Erick says if this is the company standard they should be able to support you and I'd hope you'd be able to get this factored into your timescales.
There's always the tried & true (albeit risky) approach of having your team use Subversion (how appropriate!) within itself and making the occasional show check-in to Clearcase.
Your argument against Clearcase
are
it is not worth the effort to learn
my project would not involve too many
people neither would it involve
fancy development flow.
Is it difficult to learn? If your company already using it, you will have many people to ask help from. Since your company already have clearcase, you are 80% done with clercase admin works. It will be fairly simple to create vobs, project etc.
In fact your project is small makes it more important to use standard within company. We had some fancy projects done using our standard SCM and some projects without the standard SCM. For the one which used standard SCM, when the projects is scrapped, the code is still safe with in organization. Other products, we lost people and code.
Are you using ClearCase with UCM? It is simpler than base clearcase.
As I mentioned in "Are There Any Reasons To Still Use SVN?", you can't just decide to put a new server on the side, especially for sensitive data like code sources. At some point, they need to be versioned in a central referential.
Any server for that kind of persistent and important data should be:
funded (you can't just manage a VCS on your workstation, you need an actual server)
supported (your jobs is to develop, not to manage /troubleshoot your team in term of SVN operations)
integrated (no problem on that front for SVN)
administrated (again, your job isn't to manage/monitor the SVN server)
documented (where is your SVN server?, what are the backup procedures?, is there any backup/DRP server?, ...)
For all those reasons, your manager might consider leveraging the existing infrastructure/support around the main VCS (ClearCase, condolences ;) ).
But that won't prevent you to manage a DVCS within a ClearCase snapshot view though.
I don't know Clearcase, so I can't say whether it's good or bad or how it compares to Subversion.
But the whole point of source control is that there's supposed to be a single repository where everyone knows they can go to check out their code. If one team uses Clearcase and your team uses Subversion and another uses Git and another uses CVS, etc etc, then anyone wanting to check out source code not only has to look in ten places, but they have to learn to use ten different source code control systems.
Unless you can make a case that Subversion is clearly better than Clearcase in some relevant and important way, I'd say just bite the bullet and learn Clearcase. If it was me, I'd see it as an opportunity to learn something new. After I did, maybe I'd conclude that Clearcase sucks big time and I should try to convince the powers that be to switch to something else. Or maybe I'd conclude that Clearcase was the greatest advance since the invention of USB. Either way, you now know a little more. If nothing else, it's one more bullet to put on your resume the next time you go looking for another job.
Are you working full-time for this company or is this a one-off project as a contractor?
If it's your full-time role as an employee then it's time to buy that "Clearcase for Beginners" book.
If you're looking to deliver a project quickly, then perhaps just keep your head down and use Subversion as an expedient. Ask someone to check the finished article into Clearcase as you run out of the door.
:)
Use a DVCS like Mercurial or Git and make a branch of it to keep synced up with ClearCase. Your team can do the day to day work using a better tool but your organization still has your work products accessible the usual way (which is important). Sync up with ClearCase on a regular basis, and consider adding the output of your systems DVCS log (eg hg log) to a file in the ClearCase branch, so people can see the more granular steps if needed.
DVCS systems are also usually pretty easy to set up, and often can be easily hosted on a network share or something like that. Since all developers will have a copy of the full tree it provides a certain amount of built in backup and redundancy.
This workflow is not uncommon -- gitsvn and hgsvn are both scripts people have set up to let them use those tools by themselves but sync up with folks on another repository.

Source Control Beginners

What would be the best version control system to learn as a beginner to source control?
Anything but Visual Source Safe; preferably one which supports the concepts of branching and merging. As others have said, Subversion is a great choice, especially with the TortoiseSVN client.
Be sure to check out (pardon the pun) Eric Sink's classic series of Source Control HOWTO articles.
I'd suggest you try Subversion, for example with the 1-click SVN installer. Try searching SO for "Subversion", and you'll find loads of questions with answers that point to good tutorials.
Good luck!
I'd go straight for Git. I've used subversion before, but always felt like I was doing it wrong. Git made sense from day one.
Useful resources:
Linus Torvals on Git
Scott Chacon "Getting Git"
There are a few core concepts that I think are important to learn:
Check-ins/check-outs (obviously)
Local versions vs. server versions
Mapping/Binding a local workspace to a remote store or repository.
Merging your changes back into a file that contains changes from
others.
Branching (what it is, when/why to use it)
Merging changes from a branch back into a main branch or trunk.
Most modern source control systems require some knowledge of the above topics and should help facilitate you learning them. Then you have distributed source control, which I don't have any experience with but is supposed to be fairly complicated and may not be suitable for a beginner.
Subversion is great because it has all of the modern features you'd want and is free.
Git is also becoming an increasingly popular option and is another free or very low cost alternative to Subversion. Knowledge regarding the concepts of branching and merging become critical for using Git, however.
You can use unfuddle as a free and easy way to experiment with both Git and Subversion. I use it to host a couple of subversion repositories for some side projects I've worked on in the past.
I'm not and advanced source control user, but I'm learning. Here is my experience with source control products:
A long time ago, the company I was working for at the time decided to use source control. They introduced the concept to developers and got eveyone willing to give it a try. They chose to use PVCS, and implemented it. Before too long, developers would have to coordinate to lock/unlock modules and objects and we really didn't see much benefit.
A few years later, I was playing around with making an open source project and at the time rubyforge was offering CVS repositories. I tried it out and it was marginally better than PVCS. Granted I was the only one using the repository. I did however become frustrated when I tried to rearrange the structure of my files because I didn't like the way I had initially imported them. It didn't really work out in CVS.
A few years after that I was working on another personal project and my web hosting provider offered easy to setup Subversion (SVN) repositories. It took me a little bit of research to get it up and running correctly, but once I got past the initial learning curve, I liked it.
Not long after that I realized that I liked having source control and that my current job didn't have it. So I evangelized, and after a long period of time, my team implemented Source Safe because we work in Visual Studio and are generally a Microsoft shop. I was eager to use it, but before long I found that I was losing files and that Visual Studio was putting things in the wrong place and that I'd work on a project for a while and then go to export my work to another location and find that it either wouldn't export or would only export some of the projects in a solution. This made me realize that even though I thought I was using a "version control system", the copy of the code that was most secure, robust and complete was my working copy. The exact opposite of what source control is supposed to do.
So last week I was so fed up with Source Safe that I went searching. After looking into a few solutions, I decided to try git. I won't say it's all been roses, since I have again had some learning curve to get it to do what I want it to do, However, I have liked it enough to convert all of my work and personal projects over to it. One of the really nice things about it is that I don't need a centralized repository so I can use it without going through a ton of red tape at work to get it installed.
So in short I would reccommend git, I use Mysysgit in windows and it has the added bonus of giving me a bash shell. On Linux you can just install it from your package manager. If you don't like git, try subversion. If you don't like either of those you probably won't like CVS or PVCS either. Under no circumstances try Source Safe, it's awful.
I found http://unfuddle.com saved me messing about with installing SVN or git. You can get a free account in there and use either of those - plus you can use your OpenID there.
Then you avoid having to mess about setting it up right and focus on how you're going to use it!
Vault from SourceGear.com is superb. It is free for single users and provides a superb VS 2005/2008 interface. I love it!
rp
#Ian Nelson:
I agree with you that Source Safe is bad as a source control system, but keep in mind that using Source Safe is a lot better than "carrying around floppy disks" as Joel Spolsky said.
For a beginner it might not be a bad idea, since the cost of having no source control at all is a lot higher.
Each tool has it's strengths and weaknesses. It's very much a question of what your requirements are. Unfortunately with this issue, like many others, it's often not the best tool that is selected but the one that someone is familiar with. For instance, if you don't require many branches and your team is small and local, almost any vcs will do the job (except SourceSafe). Things change if you need branches (which almost by necessity means you also need to do merges), your team is distributed, you need advanced security (subcontractors are not allowed to entire source tree), task tracking, etc. There is also the question of cost in three different ways: cost of licenses, cost of maintenance (some tools are so complicated that you in practice need someone just to control the repositories) and cost of training.
Therefore suggesting one tool over another is like suggesting what would be the best programming language.
Just some pointers:
StarTeam is the easiest of the tools
I have used. It required very little
training. I got a one-day training
since I was to be the maintainer.
This maintaining took me less than 30
minutes per week. Users I "trained"
by writing a two-page manual and
after that I had very few questions
to answer.
Continuus was the other end of the
scale as far as ease of use is
concerned. On the other hand task handling was great and it offered good support for release management. Trouble is, even as a release manager I never thought ease of making releases (it was once you learned how, but that took a considerable amount of time) should be more important than the daily work that developers do.
Merging and branch creating differs
wildly between tools. Some tools make
this simple, like git and ClearCase
(although the latter is very slow)
some basically force you to do the
merge by hand. If you need to do
merges a lot, the cost can get high.
ClearCase was also expensive in all
three categories mentioned before
(although it has to be said we used
all the advanced stuff which isn't
necessary). Git on the other hand
lacks a good UI and some of concepts
differ from what you might be used
to. Git's security features are also
lacking (gitosis addresses some
issues but not all).
Most tools I have used are also quite
slow. Tools like PVCS/Dimensions was
just slow, no matter what (basic
things like opening a directory in
the repository), some very slow in
more specific ways (like ClearCase).
From the tools I have used I would select StarTeam if your developers are not very experienced (and if you don't mind paying the license, which is quite expensive) and git if you have some experienced vcs guys onboard who can set up the environment to other guys. Mercurial also looks like an interesting competitor and seems to have slightly better UI's.
Continuus, PVCS/Dimensions and ClearCase are just too slow, too complex and too expensive for almost any project. If someone insist on selecting one of these, I would go for ClearCase.
I haven't used Subversion which many seem to like (yet, I have a feeling this is about to change in the near future) so can't comment how it compares to the other tools I have used (usually as a build and/or release manager).
As for the first tool to choose, problem with Git, Bazaar and Mercurial is they are distributed vcs's. This is different from the traditional server-client model where you have a central repository. For just learning the stuff I would recommend also reading about the concepts. Branching for instance is something that you might not understand correctly just by trying yourself (there are different branch strategies for different situations). Plus it is very different if you are the only one accessing the repository, merge conflicts for instance wouldn't be a problem (you might get to see them but you would easily also fix them since you know the code in both branches). Of course you would learn about check outs, check ins, and such but I don't think these issues are particularily difficult in the first place.
Added problem with vcs's is that they tend to use different terms. In StarTeam which is otherwise easy to use they for some reason insist on using the terms "check out" and "check out and lock". The latter is what most people think the first does. There is a reason for this (you can edit files even if you don't have an exclusive lock), but it would still make much more sense to call these "Get" and "Check out" to avoid confusion.
Anything, but I would learn a modern system like git or subversion myself. My first VCS was RCS, but I got the basics down.
Well, if you are just wanting to learn on your own, I would say you should go with something free, like subversion. If you are a company who has never used source control before, then it really depends on your needs.
My first exposure was CVS with WinCVS as a client. it was horrid. Next was Subversion, with TortoiseSVN and Eclipse's integration. It was intuitive, and heavenly. I think that using CVS with TortoiseCVS and Eclipse's would be nice as well, though I prefer the way SVN handles revisioning. The entire repository is versioned with each check in, not individual files.
I'd also recommend Subversion. It does not take too long to set up, it is free, and there is a really good book available online that goes over the basics as well as some advanced topics: http://svnbook.red-bean.com/
Subversion with tortoisesvn. (tortoisesvn because you can see a lot of what goes on visually and will provide a good jumping off point for the command line stuff. ) There is tons of documentation out there and most likely you will see it at least one point in your career. Almost every company I have worked for and interviewed with runs SVN.
If you're looking to learn a commercial product while getting started Perforce provides a free client and server, with the server supporting two users and five client workspaces.
At my previous place of employment it was used religiously not only for code by our programmers, but for art assets and game levels, and my own documentation.
Subversion is good place to start with. It is very stable and modern version control system.
Best online resource to start learning about Subversion would be Version Control with Subversion. There are lot of choices as far as server and client softwares are concerned. I personally prefer (for Windows environment).
VisualSVN server
TortoiseSVN shell-integrated client and
AnkhSVN Visual Studio Subversion Add-On
Again, with Subversion there are lot of options available. Also, it is a continually evolving version control system (unlike outdated SourceSafe). It could be easily integrated with numerous automated build tools (CruiseControl, FinalBuilder) and bug/issue tracking systems (JIRA).
If you are looking for state-of-the-art version control systems, go for Git(developed by Linus Torvalds). But if you are totally new to version control systems, I would suggest start with subversion.