Using version control with non-hierarchical code? - version-control

I am looking at putting a code base that runs several websites into version control. There are several instances of this code base running websites on different virtual servers.
The problem I'm grappling with is that each of these separate instances of more or less the same code has sub-directories with site-specific functions. But it seems that version control systems want to control the entire directory hierarchy.
For instance, each instance has the directory
/www/smarty/libs/plugins/
where you'll find site-specific functions for Smarty. When we are ready to put it into version control, the folder /www would be the root.
So one option is to have all the site-specific functions going out to all sites. I don't see a problem in and of itself, but it seems somehow architecturally 'wrong'. There would be a bunch of files that only belong to one deployment.
Another option is to have a separate repository for each site's specific files within the code base. But that sounds like it could quickly become a nightmare when trying to get new sites deployed properly.
What's the best way to do this? The version control system we're looking at is Subversion.

Generally, source control systems should be used to control source. They are not at their best completely controlling file hierarchies, permissions, and other related things. These are best left to deployment configuration.
How about having each of the projects and directories you need represented once in the version control system? Then, in a separate directory (perhaps called /build/), keep the various configuration layouts. You might have an Ant or Maven file that builds each site. Or you can use tools like Capistrano or Fabric to have more control over each deployment.

The tools are made to be flexible (generally), so here are some suggestions:
Most VCSs allow you to ignore files and directories through some mechanism (e.g. Mercurial's .hgignore file), so you should be able to target what you want to control versus what shouldn't be controlled.
Separate the files/directories into a common resource project and site-specific projects, and then use a build system to integrate them to create a deployable package. The build system can be as simple as a shell script or as sophisticated as a full framework. If it's a really simple integration, the VCS may have some basic features for combining repositories (e.g. Mercurial subrepositories).
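As a rough illustration of the "common project plus site-specific project" idea, here is a minimal Python sketch of such a build step. It assumes Python 3.8 or later. The directory names (common, sites/foo, build/foo) and the plugin file names are hypothetical, chosen only to mirror the smarty/libs/plugins layout from the question; a real build would likely do more (permissions, config templating, packaging).

```python
import shutil
import tempfile
from pathlib import Path


def build_site(common_dir, site_dir, output_dir):
    """Assemble a deployable tree: shared code first, then the site overlay."""
    common_dir, site_dir, output_dir = Path(common_dir), Path(site_dir), Path(output_dir)
    if output_dir.exists():
        shutil.rmtree(output_dir)  # always start from a clean output directory
    shutil.copytree(common_dir, output_dir)
    # Site-specific files are layered on top and win on name collisions.
    shutil.copytree(site_dir, output_dir, dirs_exist_ok=True)


def demo():
    """Build a throwaway example and return the relative file paths produced."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        # Hypothetical layout: one shared project and one site-specific project.
        common_plugins = root / "common" / "smarty" / "libs" / "plugins"
        site_plugins = root / "sites" / "foo" / "smarty" / "libs" / "plugins"
        common_plugins.mkdir(parents=True)
        site_plugins.mkdir(parents=True)
        (common_plugins / "function.shared.php").write_text("<?php // shared\n")
        (site_plugins / "function.foo_only.php").write_text("<?php // foo only\n")

        out = root / "build" / "foo"
        build_site(root / "common", root / "sites" / "foo", out)
        return sorted(p.relative_to(out).as_posix() for p in out.rglob("*") if p.is_file())
```

Each site then gets only the shared files plus its own overlay, and the deployed tree itself never needs to be checked in.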

With subversion, you could have a bunch of repositories:
www would be in a general repository
plugins would each be in a site-specific repository
Then have nested working copies:
svn co http://www_repo www
cd www/smarty/libs
svn co http://foo_plugins_repo plugins
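Tip: add plugins to the svn:ignore property of www/smarty/libs, so the outer working copy does not report the nested one as unversioned (run this from the directory that contains www):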
svn propset svn:ignore "plugins" www/smarty/libs
You could certainly do that with git too (through .gitignore), and probably with other version control systems but I don't know them.
(Alternatively, you could skip the nested working copy part, which can freak some people out, and check things out side by side, using a symlink in place of smarty/libs/plugins. The ignore setting still applies in that case.)

You're missing a "build" step, which would take the source in source control and create the deployment bundles for the different sites. Only one source package is needed; different build configurations create the different deployment packages. Don't try to put the deployment set directly into source control. It is not the source!

I believe the best thing to do would be to create a top-level directory in your repository for each site (Site-01, Site-02, etc.) and put the source tree inside those directories. Then you can check out the projects separately. I think it's acceptable and somewhat standard to use the same repository for all the projects your company is involved with.
My terminology might be off kilter, but the fundamental idea is sound, I believe.

Related

SAP Cloud Platform WEB IDE Fullstack .che folder

Should we remove the .che folder from Git when we use Web IDE Full-Stack?
The rule of thumb is to never include IDE-specific files in a Git repository. There are several articles and blogs on this and I would point you to this one: IDE Project Files In Version Control - Yes or No? Of Course, Not!
The main drawbacks of having IDE specific files checked-in are the following:
Each IDE would add its own files. E.g. if some of your developers would decide to use VSCode, then you would also have a .vscode folder in there.
The file structure may be different depending on the IDE version (if you use the SAP Web IDE Cloud, this should not be an issue, but it might be if one developer is using the local WebIDE).
The files change very frequently and lead to merge conflicts. E.g. if you do a deploy and also one of your colleagues does a deploy, then you will have a conflict when you want to merge your branch with his (assuming that you work on parallel branches).
The files may contain environment-specific settings. E.g. the name of the project folder, which may actually be different for each developer.
The only clear advantage is that setting up the project after a clone might be marginally faster (without the files, the developer doing the clone might have to redo some settings locally on their copy).

How to manage common source files in source control?

I have the following directory structure:
CommonUtilities, DataStructure1 and DataStructure2 contain source files used by one or more projects.
I would like to publish one or more projects as open source using Mercurial and BitBucket. But I don't know how to manage the source files used by one or more projects.
I am new to source control and to software development in general, so I would like to know the best practices in this kind of situation.
Should I:
1. Include the common source files in more repositories as needed? (that is, hg add them to two or more repositories)
2. Include the common files in some other way than hg add?
3. Do something completely different?
Option 3: Do something completely different.
You can use the subrepository feature: keep the common files in a different repo, then reference them from your repository.
You are trying to solve the issue with the wrong tool. The best way to manage such a situation is a dependency management tool (look at Maven, Ivy or Gradle).

Share code across TFS folders (like svn:externals)?

I have multiple projects with common in-house JavaScript library dependencies. I want to share these dependencies across multiple projects.
Unfortunately we are using TFS. I'd like something like svn:externals, whereby I can link a particular folder to a different folder elsewhere in the source control tree. So I want to have
ProjectA
app
js
lib [should link to SharedProject/lib]
ProjectB
app
js
lib [should link to SharedProject/lib]
SharedProject
lib
library1.js
library2.js
I don't want to link across workspaces...I don't want a crazy custom per-developer setup. I just want developers to check out one project, and it knows "Oh, there are shared resources in this other project. I'll get those too." I don't care about it always getting a specific version; I'm just tired of copying files across projects.
Is this remotely possible in TFS? I have Googled and found nothing conclusive.
Just branch the shared project from its original location to where you want it to be.
Whenever you would have switched svn:externals to the next revision, simply merge the changes up to that revision into the branched copy instead.
(frankly I prefer this way even on SVN)
Using external links in source control is not a good idea; it creates a lot of side effects. You can package your library with NuGet, publish it to a private NuGet server, and then consume the published packages in all the dependent projects.

Keeping divergent versions of a hg version-controlled file on different machines?

I am working on a project that depends on external programs, and needs to know the paths to them. I develop and use the project on several machines, using mercurial for version control. The paths are machine-dependent, so I keep them in a machine-specific config file. I would like the config file for each host to be version-controlled, but I need to ensure that the config file from one host would never overwrite the config file for another host when pushing or pulling between hosts. Is there any way to accomplish this?
In principle, Wim is right: machine-specific configurations shouldn't be part of the project's source control. As long as you walk alone, this isn't a real problem, but once you want to provide generic releases of your project, you have to get rid of them. In that case you might not be happy that the change history contains files with machine-specific data.
Nevertheless, it may make sense to have machine-specific data in version-controlled files (personally I do this for my dot-rc files and shell scripts). In that case I would suggest separating generic and specific configurations into different files and including the specific one at build time or runtime, depending on the machine currently in use.
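To make the runtime variant concrete, here is a minimal Python sketch, assuming the configuration files are INI-style and are named generic.conf and specific-<hostname>.conf. The [paths] section and the tool/viewer keys are hypothetical. Note that socket.gethostname() may return a fully qualified name on some machines, so the file naming would have to match whatever it reports.

```python
import configparser
import socket
import tempfile
from pathlib import Path


def load_config(config_dir, hostname=None):
    """Read generic.conf, then overlay specific-<hostname>.conf if it exists."""
    host = hostname or socket.gethostname()
    parser = configparser.ConfigParser()
    # read() silently skips missing files, so an unknown host just gets the
    # generic values; files later in the list override earlier ones.
    parser.read([
        Path(config_dir) / "generic.conf",
        Path(config_dir) / f"specific-{host}.conf",
    ])
    return parser


def demo():
    """Return the resolved tool path for a known host and for an unknown host."""
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical section and keys, for illustration only.
        Path(tmp, "generic.conf").write_text(
            "[paths]\ntool = /usr/bin/tool\nviewer = /usr/bin/viewer\n"
        )
        Path(tmp, "specific-foo.conf").write_text("[paths]\ntool = /opt/foo/bin/tool\n")
        on_foo = load_config(tmp, hostname="foo")
        elsewhere = load_config(tmp, hostname="some-other-host")
        return on_foo["paths"]["tool"], on_foo["paths"]["viewer"], elsewhere["paths"]["tool"]
```

With this approach every host's file can be committed, and pulling another host's file never changes what the local machine reads.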
If it is not possible to detect the current machine automatically, you could still create an unversioned symbolic link on each machine, pointing to the appropriate specific configuration file. For instance, on the machine foo the file layout could look like this:
generic.conf                        version-controlled
specific-foo.conf                   version-controlled
specific-bar.conf                   version-controlled
specific.conf → specific-foo.conf   unversioned symbolic link
An alternative to symbolic links is to use a hook which automatically creates specific.conf, e.g. on each invocation of hg update. As hooks are set in a repository's hgrc file, they can be defined individually on each machine. Here's an example of a corresponding hooks section in the .hg/hgrc file of a repository clone on the machine foo:
[hooks]
update = cp specific-foo.conf specific.conf
Machine specific configuration settings should not be version controlled in the same repository as the project code.
However, it is still a good idea to put an inactive sample configuration file in your code repository. And this sample could show a bunch of typical locations for the external program paths you mentioned as lines that are commented out. That way you make it easier to get your project running on new machines.

Emulating symlink-like behaviour in a source control repository

Suppose I have the following (desired) folder structure:
*CommonProject
*Project#1
----> CommonProject(link)
*Project#2
----> CommonProject(link)
Where the CommonProject is the location of the source belonging to that project, and CommonProject(link) is merely a soft link to the main location. If we imagine this as a tree-view in a visual client, if I expand Project#1 I will see CommonProject there as a subdirectory, even though the files are not actually stored there.
The purpose of this is to enable the following behaviour:
When I check out Project#1 I get the files associated with that project as well as a subfolder CommonProject containing all of its files (as if Project#1 contained the copy of the files in the Version Control repository). Now if I were to modify CommonProject's files inside of Project#1 and was to submit my changes to the repository, the changes would go into the CommonProject location (no file is actually stored locally under Project#1 in the repository). Now if I was to sync Project#2, as it also contains symlink to CommonProject, it will now get my updates.
Essentially the duplication of files only exists on my machine, but in the repository there is only one version of CommonProject.
I know Perforce can’t do this, without juggling 3 specs. This is very complicated and error prone, especially when a lot of people do it. Is there a source control repository out there that can do this? (a pointer to some docs on how it can be done is a plus)
Thank you.
Subversion can directly store symlinks in the repository. This only works for operating systems that support symlinks though, as svn just stores the symlink the same way it would with any other file.
I think what you really want is to link to separate projects, though. Subversion supports this through externals and git through submodules. Another alternative is to manage this sort of thing within your build process, so that some static resources are gathered when you initialize the build. Generally, automatically picking up every update to a utilities library that changes often is going to cause stability problems, so you can do the update manually (or with clever scripts) when you need to.
You'd probably be much better off just storing the projects in a flat directory (one directory per project, all at the same level), and using whatever your build system or IDE is to link all the stuff together.
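If the build system or IDE has no linking feature of its own, a small setup script can do the linking after checkout. The following Python sketch is only an illustration under some assumptions: the projects sit side by side in one workspace directory, the operating system supports symlinks (on Windows this may need extra privileges), and the names Project1 and Project2 are hypothetical stand-ins for the projects in the question. The links themselves would stay unversioned.

```python
import os
import tempfile
from pathlib import Path


def link_common(workspace, common_name, project_names):
    """Create <project>/<common_name> symlinks pointing at the sibling common project."""
    workspace = Path(workspace)
    created = []
    for name in project_names:
        link = workspace / name / common_name
        if link.is_symlink() or link.exists():
            continue  # leave existing links or real directories untouched
        # A relative target keeps the link valid if the whole workspace is moved.
        os.symlink(os.path.join("..", common_name), link, target_is_directory=True)
        created.append(link)
    return created


def demo():
    """Edit the shared file through one project and read it back through another."""
    with tempfile.TemporaryDirectory() as tmp:
        ws = Path(tmp)
        for name in ("CommonProject", "Project1", "Project2"):
            (ws / name).mkdir()
        (ws / "CommonProject" / "util.txt").write_text("v1")
        link_common(ws, "CommonProject", ["Project1", "Project2"])
        # Writing through Project1's link changes the single shared copy.
        (ws / "Project1" / "CommonProject" / "util.txt").write_text("v2")
        return (ws / "Project2" / "CommonProject" / "util.txt").read_text()
```

Only one copy of CommonProject exists on disk and in the repository, yet each project sees it as a subdirectory.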