Splitting up a project into multiple smaller projects - version-control

I have a library containing a few classes. Now I want to split up this library into two separate libraries. What is the correct/best way to handle this in combination with source control?
My initial thought is to create a new repository for each new project and, in the initial commit, mention that it was split off from a now unmaintained project.
While I only have a few commits so far, an issue with this method is that the history of the project is lost.

It depends on which version control you are using. For instance, in git you can use filter-branch to do the trick.
You can make a copy of the original repository, then use git filter-branch to keep the history of the part you are interested in and drop everything else.
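$ git filter-branch --subdirectory-filter mydir1
# remove the backup refs and reflog entries that still reference the old history
$ git for-each-ref --format='%(refname)' refs/original/ | xargs -n 1 git update-ref -d
$ git reflog expire --expire=now --all
$ git gc --aggressive --prune=now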
Beware: this is destructive. Once the old objects are unreachable and pruned, you will see a considerable reduction in repository size, because only the history of mydir1 remains.
Then, repeat the same for other libraries/subdirectories. In that way, you will keep only the history that belongs to each part/library/directory.
If you are using a different version control system, then you have to figure out the equivalent way to do it.
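To make the "copy first, then filter each copy" workflow concrete, here is a minimal sketch that builds a throwaway repository and splits it into one repository per directory. The names monolith, mydir2, split-* and the file names are made up for the demo; only mydir1 comes from the answer above. filter-branch still ships with Git, though its manual page now points to git filter-repo as the preferred tool, so treat this as an illustration rather than the one right way.

```shell
#!/bin/sh
# Sketch: split one repository into one repository per subdirectory.
# "monolith", "mydir2" and the file names are hypothetical demo names.
set -e
export FILTER_BRANCH_SQUELCH_WARNING=1   # skip the filter-branch deprecation notice

work=$(mktemp -d)
cd "$work"

# Build a tiny throwaway repository with two library directories.
git init -q monolith
cd monolith
git config user.name "Example"
git config user.email "example@example.com"
mkdir mydir1 mydir2
echo "a" > mydir1/a.txt
git add mydir1
git commit -q -m "Add mydir1"
echo "b" > mydir2/b.txt
git add mydir2
git commit -q -m "Add mydir2"
echo "a2" >> mydir1/a.txt
git commit -q -am "Update mydir1"
cd ..

# Work on one fresh clone per library so the original is never rewritten.
for dir in mydir1 mydir2; do
    git clone -q --no-local monolith "split-$dir"
    (
        cd "split-$dir"
        # Keep only the commits that touch $dir and move its contents to the root.
        git filter-branch --subdirectory-filter "$dir" >/dev/null
        # Detach from the original so rewritten history is not pushed back by accident.
        git remote remove origin
    )
done

# Each split repository now carries only its own history.
git -C split-mydir1 log --oneline
git -C split-mydir2 log --oneline
```

Working on clones means the destructive rewrite never touches the original repository, which you can archive or mark as unmaintained afterwards.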

The rule of thumb I follow depends on whether you will be developing and/or deploying the libraries independently. If you are separating the libraries simply for code organization and the code is deployed as a single solution, then there is no need or benefit to creating separate repositories.
On the other hand, if you will be versioning and releasing the libraries independently, then having the code in separate repositories helps. So, for instance, if you are separating the code because some of it belongs in a shared framework, then put the framework code in its own repository. This will allow you to maintain, build and release the framework separately from any applications that are built using the framework.
HTH

You can create a new repository, but you can also create the new projects under the same repository and delete the old one over time. That's really up to you. If you regard the previous project as a test or pre-alpha stage, you may want to create a new repository. Otherwise, staying in the same repository is a perfectly reasonable choice in this situation.

Related

Should I add third-party js code as a git submodule, or add it to my own repo?

As far as I know, the generally accepted practice for adding third-party code like d3 is to add it as a git submodule. This reduces the size of the main repo, but I would imagine that having the d3 code (for example) in the main repo would help debug cases where a change in d3 breaks code that uses it.
Are there any reasons why I should not just check out the latest version, develop my code using it, and push it to my own repository?
I really like using git subtree for this purpose. It lets you keep a copy of the remote repository inside your own, while still maintaining that repository's history, and push/pull back and forth at will.
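For anyone who has not used it, a rough sketch of the subtree workflow follows. It runs entirely against local throwaway repositories; upstream, app, vendor/lib and lib.js are placeholder names, and in practice the repository argument would be the third-party project's URL. It assumes your Git installation includes the git subtree command, which some distributions package separately.

```shell
#!/bin/sh
# Sketch: vendoring a third-party repository with git subtree.
# "upstream", "app", "vendor/lib" and "lib.js" are hypothetical demo names.
set -e
export GIT_MERGE_AUTOEDIT=no   # never open an editor for merge messages

work=$(mktemp -d)
cd "$work"

# Stand-in for the third-party project.
git init -q upstream
cd upstream
git config user.name "Example"
git config user.email "example@example.com"
echo "v1" > lib.js
git add lib.js
git commit -q -m "Upstream v1"
branch=$(git symbolic-ref --short HEAD)   # whatever the default branch is called
cd ..

# Your own project.
git init -q app
cd app
git config user.name "Example"
git config user.email "example@example.com"
echo "app" > app.js
git add app.js
git commit -q -m "App: initial commit"

# Import upstream under vendor/lib; --squash keeps its history as one commit.
git subtree add --prefix=vendor/lib --squash ../upstream "$branch"

# Later, upstream publishes a change...
echo "v2" > ../upstream/lib.js
git -C ../upstream commit -q -am "Upstream v2"

# ...and you pull it into the vendored copy.
git subtree pull --prefix=vendor/lib --squash ../upstream "$branch"

cat vendor/lib/lib.js
```

Unlike a submodule, the vendored files live in your own repository, so a plain clone gets everything and you can bisect across upstream updates.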
The only reason is: you don't need to. Just use a build tool that manages your dependencies automatically (like Grunt). But if for any reason that's not an option for you, then use whatever fits your needs. You can make a separate directory for third-party libraries and it will work. Just make sure any developer can easily find out which version is currently in use (for example, put the version in the file name).

Multiple Git repositories for each Eclipse project or one Git repository

I am in the process of moving to Git from SVN. In SVN I had multiple Eclipse projects in a single SVN repository, which is convenient for browsing projects. I was going to move to having one Git repository per Eclipse project, but EGit suggests doing otherwise.
The guide for EGit suggests putting multiple projects into a single Git repository.
Similar questions, such as this one, suggest one project per repository.
Which approach is best practice and what do people implement?
It depends on how closely-related these projects are. Ask yourself the following questions:
Will they always need to be branched/tagged together?
Will you want to commit over all projects, or does a commit mostly only touch one project?
Does the build system operate on all of them or do they have a boundary there?
If you put them all in one, some things from above will be easier. You will only have to branch/tag/stash/commit in one repository, as opposed to doing it for every repository separately.
But if you need to have e.g. separate release cycles for the projects, then it's necessary to have each project in an independent repository.
Note that you can always split up a repository later, or combine multiple repositories into one again without losing history.
Combining is a bit harder to do than splitting, so I would go for one repository first and see how it goes.
I use 1 repo per project.
Some reasoning:
When you discover you messed something up several commits ago, it's much easier to fix when it's just one project. Just think about it: you have made commits to two other projects and now you need to fix a commit you made on the third.
As Fedir said, your history and log are much cleaner. They only show the commits for that project.
It works better with the development workflow I have. I have a master branch for production, develop branch for, well, development, and I create branches to implement features (you can read more about it here: http://blog.avirtualhome.com/development-workflow-using-git/)
When you work in a team, and so "share" the git repo, do the team members really need all the other projects as well?
Just a few thoughts, but what it boils down to: Do what works for you.
I have multiple projects (Eclipse projects) and have tried different things to find out what worked best in terms of actual daily development. Here is what I found and I think that most people would find the same thing if they kept track of the results and analyzed the results objectively.
In short applying the following rules will give the best results:
Make a separate repository for each project group.
Each project group consists of a group of projects that are tightly connected to each other, that should be administered together and that cannot be easily decoupled from each other.
A project group can contain a single project.
A project group that contains multiple projects should be examined to see whether some of its projects can be decoupled from each other, so that it can be split into smaller project groups that still contain projects that are tightly connected to each other, that should be administered together and that cannot be easily decoupled from each other.
The following guidelines explain this process for determining which projects to put in the same repository in more detail:
If a project is not closely connected to any other project (for example, the project can be opened without other projects being opened and no other project relies on the project being opened when it is opened) then you should definitely place it in its own repository for the reasons explained in the answers above this one.
If a project is dependent upon other projects or other projects depend upon the project, then it comes down to exactly how connected they are to each other, how well they can be packaged together and how easily they can be decoupled from each other.
A) For example a testing project that contains junit test classes to test the classes of a main project is a case where the two projects are very connected with each other, can easily be packaged together and cannot be easily decoupled from each other. These projects should be placed in the same repository for the reasons explained in part C below.
B) In a case where one project relies on another project to provide some sort of shared resources, it really comes down to how well they can be administered together and how easily they can be decoupled from each other. For example, if the project with the shared resources is relied upon by many projects, then it should be put in its own repository, because the other unrelated projects are impacted by changes to the shared source code project. In a case like this, the shared resources project should be decoupled from the dependent projects instead of being directly connected to them. (For example, it would be better to create versioned archive files [Jar files with a name like "projectName".1.0.1.0.jar for example] and include a copy of those in each project instead of sharing the resources by linking the projects together.)
C) If the multiple projects are connected, can be easily administered together but cannot be easily decoupled from each other, then it depends upon how tightly connected they are with each other.
I) If the projects are put into one repository, then the projects will be kept in sync with each other in the repository each time there is a commit, which can be a real life saver if the projects are tightly connected. However, this also creates the issues noted in the answers above this one.
II) If the projects are put into separate repositories, then you will have to take care to keep their commits in sync with each other and be sure to include some sort of mechanism to indicate which commits belong to the same sync point across the projects. (Perhaps something like including the same sync point number in the commit comments of each project when a group of commits is done across the projects.)
III) So in cases like this, it is almost always better to put these projects together into a single repository to reduce the overhead of human effort in syncing the commits and to avoid human error should the commits need to be backed out. The only time that it might be better to place them in separate repositories is when only one of the projects is being changed regularly and the other connected projects are rarely changed.
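If you do end up with separate repositories, the sync point idea from II) can be sketched roughly like this. The sketch puts the sync point identifier in the commit message, as suggested above, and additionally records it as an annotated Git tag with the same name in every repository; the tag is my own addition, and the repository names and the sync-0001 naming scheme are invented for the demo.

```shell
#!/bin/sh
# Sketch: mark the same sync point in several separate repositories.
# "main-project", "test-project" and "sync-0001" are hypothetical names.
set -e

work=$(mktemp -d)
cd "$work"
sync_point="sync-0001"

for repo in main-project test-project; do
    git init -q "$repo"
    git -C "$repo" config user.name "Example"
    git -C "$repo" config user.email "example@example.com"
    echo "$repo" > "$repo/README"
    git -C "$repo" add README
    # Mention the sync point in the commit message...
    git -C "$repo" commit -q -m "Initial commit ($sync_point)"
    # ...and also record it as an annotated tag with the same name everywhere.
    git -C "$repo" tag -a "$sync_point" -m "Sync point shared across repositories"
done

# To back out to a sync point later, look up the tagged commit in each repository.
for repo in main-project test-project; do
    printf '%s %s\n' "$repo" "$(git -C "$repo" rev-parse --short "$sync_point^{commit}")"
done
```

This is exactly the bookkeeping a single shared repository gives you for free with one commit, which is the human overhead point III) is about.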
I think this question is related to one I answered here. Basically, Git by its nature supports a very fine-grained structure when it comes to projects/repositories. I have read and been taught that one repository per project is almost always best practice. You lose almost nothing by keeping the projects separate and gain a lot, as others have described.
It will probably perform better if you create multiple Git repositories.
If you create a branch, only that project's files are branched, not all the projects.
A small project is faster to analyze and to commit; operations take less time.
The log will also be clearer, and you can configure things at a finer granularity if you have multiple Git repositories.

How to share code across multiple repository with Mercurial?

Over time, I developed a variety of utility functions, classes and controls that I reuse across multiple projects. For each of those projects I have a Mercurial repository and I copy the re-usable projects. Obviously this is bad: if I fix a bug in one of the reusable components, I have to copy the code manually into every repository, and I could make a mistake in the process.
How do you handle such a situation? How can I share code across multiple repositories with Mercurial in such a way that if I make an update in one repository, I can easily integrate it into the others?
Check out subrepositories: https://www.mercurial-scm.org/wiki/Subrepository
They won't help you keep the other copies up to date (you'll have to do that manually), but they will make that easy (you'd use hg pull; hg update in the subrepo, then commit the parent repo).
Another option (which I use on a different project) is to mandate a layout, then simply assume that the "utilities" repository is stored at ../utils, relative to each "real" repository.
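If you go with the mandated layout, it helps to have each project's build script fail loudly when the sibling checkout is missing. A minimal sketch, assuming a POSIX shell; the function name locate_utils and the directory names in the demo are made up, only the ../utils convention comes from the answer above.

```shell
#!/bin/sh
# Sketch: resolve the shared "utils" checkout that is assumed to sit next to
# each project repository. locate_utils is a hypothetical helper name.

# locate_utils REPO_ROOT
# Prints the absolute path of REPO_ROOT/../utils, or fails with a message.
locate_utils() {
    utils_dir="$1/../utils"
    if [ -d "$utils_dir" ]; then
        (cd "$utils_dir" && pwd)
    else
        echo "error: expected the shared utilities repository at $utils_dir" >&2
        return 1
    fi
}

# Demo with throwaway directories standing in for real checkouts.
work=$(mktemp -d)
mkdir "$work/utils" "$work/project-one"
locate_utils "$work/project-one"
```

A build script would call this once with its own repository root and use the printed path for includes or imports, so a developer with a missing or misplaced utils checkout gets a clear error instead of a confusing build failure.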

Creating a new project based on an existing project (Project Reuse)

I have a project A. This produces a product that's working and already submitted to the app store etc. Now, I'd like to create a new project, let's call it project B, and I want B to be based on A. Obviously B will add more UI and behavior on top of A.
After doing some research, the only option seems to be using cross-project referencing, because I'd like to reuse Project A's XIBs, images etc in Project B. Am I correct in assuming that cross-project referencing should work in that scenario?
Well I'm having some serious problems in getting this thing working. I'd like to achieve project level reuse. In Java or in .NET this wouldn't even be a consideration, the technology allows that. Because iPhone doesn't support frameworks, I think the developers are pushed towards more primitive approaches like code duplication.
So, how can I tackle this problem? How can I create my Project B, based on Project A (including XIBs, images, etc.)?
Thanks,
If A and B are so similar perhaps you could consider simply creating a new build target; this would give you a single project with target A and target B. Both targets would have access to any of the resources in the project.
If you have a fair bit of shared code then you can create a static library; iOS doesn't support dynamic linking to user-generated libraries, but it supports static linking just fine. This would make the cross-project dependencies useful, because you could have project B reference library A from project A and build it as a dependency.
I did this same thing at one point:
I copied and pasted the entire app and then had two separate apps that I could work on individually.
Contrary to popular opinion, it is possible to create iOS frameworks.
Maybe you could use an SCM tool like Git or Piston (http://piston.rubyforge.org) and 'clone' the code. Do something like:
#add original project to git
cd /your/base/project/code
git init
git add . #Stages all files to check-in index
git commit -m 'Your commit message here'
Then
#clone the original project into a new one
cd /your/new/project/directory
git clone /your/base/project/code
cd code #git clone creates a directory named after the source repository
git checkout -b aNewWorkingBranchName #create a new working branch to modify
#modify code to your <3's content, use git pull/push/merge/rebase/diff as required to track/update original project
This should let you develop the 'new' project independently, while allowing you to pull in changes when required. Piston allows 'vendor' branching against both Git and Subversion repositories, tying your new code to a particular remote revision. Have a look at its documentation.
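The "pull in changes when required" part could look roughly like the following. It is a self-contained sketch against throwaway repositories; project-a, project-b, project-b-work and the file names are invented for the demo, and a plain fetch plus merge is only one of several ways to track the original.

```shell
#!/bin/sh
# Sketch: develop project B on its own branch and merge later fixes from A.
# All repository, branch and file names here are hypothetical.
set -e
export GIT_MERGE_AUTOEDIT=no   # never open an editor for merge messages

work=$(mktemp -d)
cd "$work"

# Project A: the original, already shipped project.
git init -q project-a
cd project-a
git config user.name "Example"
git config user.email "example@example.com"
echo "shared" > shared.txt
git add shared.txt
git commit -q -m "Project A: initial commit"
base_branch=$(git symbolic-ref --short HEAD)   # whatever the default branch is called
cd ..

# Project B: a clone of A with its own working branch.
git clone -q project-a project-b
cd project-b
git config user.name "Example"
git config user.email "example@example.com"
git checkout -q -b project-b-work
echo "only in B" > b-only.txt
git add b-only.txt
git commit -q -m "Project B: add B-only file"
cd ..

# Later, a bug fix lands in project A.
cd project-a
echo "bug fix" >> shared.txt
git commit -q -am "Project A: fix bug"
cd ..

# Project B picks up the fix without losing its own work.
cd project-b
git fetch -q origin
git merge -q -m "Merge fixes from project A" "origin/$base_branch"
cat shared.txt
```

Because B is a real clone, the two projects share history, so Git can merge fixes intelligently instead of you copying files by hand.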

How do people manage changes to common library files stored across multiple (Mercurial) repositories?

This is perhaps not a question unique to Mercurial, but that's the SCM that I've been using most lately.
I work on multiple projects and tend to copy source code for libraries or utilities from a previous project to get a leg up on starting a new project. The problem comes in when I want to merge all the changes I made in my latest project, back into a "master" copy of those shared library files.
Since the files stored in disjoint repositories will have distinct version histories, Mercurial won't be able to perform an intelligent merge if I just copy the files back to the master repo (or even between two independent projects).
I'm looking for an easy way to preserve the change history so I can merge library files back to the master with a minimum of external record keeping (which is one of the reasons I'm using SVN less as merges require remembering when copies were made across branches).
Perhaps I need to do a bit more up-front organization of my repository to prepare for a future merge back to a common master.
Three solutions, pick your favorite:
Put all projects into one repository.
Make a separate repository for shared code and a different repository for each project.
One repository with Subrepositories: https://www.mercurial-scm.org/wiki/subrepos, keep all common code in one subrepo and different subrepos for each project.
Copying actual files between repositories with no common ancestors will never be optimal as history is not preserved.
I'd recommend against your "copy the source code" practice and suggest using binary distribution for your custom libraries instead. These binaries are checked in alongside the source code. The advantages:
reduces build-time
no overhead of tracking changes in all copies of the library
you can use different versions of the same library in different projects.
EDIT: And for the issue with "common" or "toolbox" libraries in general, read this post from ayende.
Use the transplant extension.