Best practices to keep up a diverging branch of code [closed] - version-control

I'm in a situation where some minor patches I've submitted to an open-source project were ignored or explicitly not accepted.
I consider them useful, but more important is that I need the functionality they implement.
I don't want to push my ideas and suggestions anymore to the main contributors, because I don't want to turn this into an ego issue. I've decided that my best bet would be just to use what I wrote for my own purposes. I don't want to fork the whole source code tree because I like how things are generally working, I'm just not happy with details.
But I do realize that the project will evolve and I would like to use the new features that will eventually appear. I understand that I'll have to merge all new things into my own source tree. Are there any best practices for this scenario?

The standard approach is to maintain a vendor branch in your repository. The idea is that you import a pristine copy of the original sources (called a vendor drop) into your local repository, and store it on a branch. This is the version of the code prior to applying your mods. You tag that with the version, then copy it to the main trunk and apply your patches.
When subsequent new versions of the vendor code are released, you check out the vendor branch (without your mods), and overlay the new version on top. Finally you merge the new branch with your mods, checking that they are still applicable/relevant, and you're ready to go again.
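As a rough sketch of that cycle in Subversion commands (the repository URLs, project name and version numbers below are made up):
# initial vendor drop: import the pristine 1.0 sources, tag them, copy them to trunk
svn import libfoo-1.0/ http://svn.example.org/repo/vendor/libfoo/current -m "Vendor drop of libfoo 1.0"
svn copy http://svn.example.org/repo/vendor/libfoo/current http://svn.example.org/repo/vendor/libfoo/1.0 -m "Tag vendor drop 1.0"
svn copy http://svn.example.org/repo/vendor/libfoo/1.0 http://svn.example.org/repo/trunk/libfoo -m "Bring libfoo 1.0 onto trunk"
# ...apply and commit your patches under trunk/libfoo...
# when upstream releases 1.1: overlay the new sources onto vendor/libfoo/current
# (check out current, copy the 1.1 files over it, svn add/rm as needed, commit), tag it,
# then merge the vendor delta into your patched trunk working copy
svn copy http://svn.example.org/repo/vendor/libfoo/current http://svn.example.org/repo/vendor/libfoo/1.1 -m "Tag vendor drop 1.1"
cd trunk-wc/libfoo
svn merge http://svn.example.org/repo/vendor/libfoo/1.0 http://svn.example.org/repo/vendor/libfoo/1.1 .
svn commit -m "Merge upstream 1.0 -> 1.1 into the patched copy"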
There can be complications, e.g. with files being renamed or deleted. The svn_load_dirs.pl script, which comes with Subversion, can help with this by letting you identify files that have changed name and automating some of the bureaucracy.
This approach is discussed in detail (and much more clearly) in the Subversion book, under the section Vendor Branches.

If you are using Git, or could get used to using it, perhaps you should take a look at Stacked Git or Guilt.
They are layers on top of Git to keep track of patches.
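For example, with Stacked Git your local patches live as a stack on top of the upstream branch, roughly like this (the patch name is made up, and command details vary between StGit versions):
stg init                   # start tracking a patch stack on the current branch (older StGit versions)
stg new smtp-fix -m "Fix SMTP timeout handling"
# ...edit files...
stg refresh                # record the edits into the top patch
# later, when upstream has moved on:
git fetch origin
stg rebase origin/master   # pop your patches, advance the base, and re-apply them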

Generally you would create a branch of the project in the repository. In your specific case, I would create a branch of just the directory that contains your code, if that is possible. A lot of version control systems, like Subversion, will let you check out your branch alongside the main trunk. That should allow you to maintain your patches and ensure that they work with future changes made to the trunk.
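In Subversion that can be as simple as copying the one directory to a branch and periodically merging from trunk (the URLs here are made up):
svn copy http://svn.example.org/repo/trunk/plugins/foo http://svn.example.org/repo/branches/foo-patched -m "Branch foo to carry local patches"
svn checkout http://svn.example.org/repo/branches/foo-patched foo-patched
# ...commit your patches on the branch...
# periodically pull in what changed on trunk
cd foo-patched
svn merge http://svn.example.org/repo/trunk/plugins/foo .
svn commit -m "Sync with trunk"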

Related

Should you commit un-buildable code to GitHub? [closed]

Just curious as to what the general consensus is for committing code to GitHub.
Should you commit only buildable code? Or is there a time when unbuildable code commits have their place?
Or am I misunderstanding GitHub completely? If so, please tell me how it should be used.
If the master needs to stay buildable, I recommend that you make a branch, and merge the branch only when the code is working as intended.
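As a minimal illustration (the branch name is just an example):
git checkout -b feature/report-export   # work on the feature away from master
git commit -am "WIP: export skeleton"   # commit freely here, even if it does not build yet
# once it builds and the tests pass, bring it back into master
git checkout master
git merge --no-ff feature/report-export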
GitHub is a place for sharing your code with everyone; it's up to you whether you make it public for others to use or contribute to, or keep it private.
There are many advantages to uploading code to GitHub:
1. Others can look at your code for reference and can also contribute to it.
2. It keeps all your coding history, so you can show it to a company when you apply for a job.
Yes, you can also upload unbuildable code, open an issue, and wait for someone else to fix it.
It's good practice to keep two branches: one for buildable code (the master branch) and another test branch for experimenting.
In general with Git, the master branch is reserved for buildable code only. Other branches can hold in-progress features to be merged into master once completed and tested. Projects on GitHub, for the most part, follow these conventions too.
To me, it really depends on the kind of project you are working on:
If the commit is for a private project (which therefore probably has little following), you can do what you want.
If the commit is for a highly visited project, think twice before submitting unbuildable code without noting in the commit message that it is unbuildable.
As always, if you own the repo, you can follow any rules you like, but if it is owned by another individual, be sure to follow the rules that they set out for the repo.
It just depends. Some argue that all commits should compile or be "build-able", but IMO that defeats the purpose of frequent commits.
Typically, when I'm developing a project, my rule of thumb is to commit after 20 minutes or so of developing, and push every hour or when I finish the branch/feature I was working on. So if you are working on a project individually, committing with issues in your code may not raise any problems. Committing frequently is the whole point of version control tools like Git, and you have the ability to go back to any previous version whenever you like. If you are working on a project with a team, there may be guidelines as to when you commit/push, so make sure to check with your team if that is the case. Check out "What are the differences between 'git commit' and 'git push'?" if you are trying to better understand GitHub.
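The distinction that matters here is that commits stay local until you push them, so frequent local commits cost nothing; a minimal sketch (the branch name is made up):
git add -A
git commit -m "Checkpoint: parser half done"   # local only, nobody else sees it yet
# ...more commits...
git push origin feature/parser                 # publish the branch once it is in good shape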

Mercurial branches with different codebase [closed]

I am struggling with figuring out a good method for handling my work flow using Mercurial. I have read many related questions here on SO and other places but couldn't find a reasonable solution.
Suppose I have two branches:
the default branch, where I do normal development, deploy releases from, and mark releases with tags
a branch for a specific customer that is maintained separately. It runs on a separate version number, branched from an older version of the application.
These two branches are mostly identical now but there are already some differences. Over time, they will drift apart more.
It means that I have 4 types of source files:
Type-A: Files that remain identical in both branches (changes, if introduced, should be present in both). These are most of the files.
Type-B: Files that are only in the default branch and do not need to be merged into the customer branch
Type-C: Files that are only in the customer branch and do not need to be merged into the default branch
Type-D: Files that are present in both branches, have shared code but also contain code that should remain separate and specific to each branch
Development is done on the default branch, and there are regular releases, which are mostly incremental. But I also have these two scenarios:
Some changes done to the default branch need to also be merged to the customer branch (e.g. a bug or feature that needs to be fixed/added in both).
A "hotfix" is done on the customer branch and cannot be immediately merged to the default branch, but needs to be merged eventually.
Problem is that I can't figure out a reasonably simple, clean and safe way to support these two scenarios. Mercurial doesn't have the concept of partial merges or ignoring files on merge. It merges changesets but it insists on including the files that were introduced in previous revisions.
Merging default into the customer branch or customer into default in these scenarios forces me to add files of Type-B and Type-C. Making a change to a file of Type-A that doesn't need to be added to the other branch (turning it into Type-D) introduces a challenge.
Now obviously I can work around some of these problems by using compiler defines (thus keeping source files the same in both branches), editing code manually and manually removing files after merges but this doesn't feel like the most efficient and clean way to handle this.
Surely this is a common enough work flow that wiser people than me already figured out. Can anyone suggest any method or best practices that can streamline the work in these scenarios? Or is there something fundamentally flawed with my setup?
Also, does Git handle these flows more gracefully?
Thanks.
I would do any development that the customer requires on the customer branch and merge it into default. This will work for hotfixes, and it is simpler than cherry-picking your development changesets from your default branch. It's easier to merge forwards than backwards. It also gets rid of the Type-B file problem, because there are no Type-B files in the customer branch.
Type-C files I would merge into default and then delete on the default branch. Any further modification to these files should generate a warning that a file modified on one branch was deleted on the other.
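In command form that is roughly (the file path is a placeholder):
# a customer hotfix goes on the customer branch first...
hg update customer
# ...fix and commit, then merge it forward into default
hg update default
hg merge customer
hg commit -m "Merge customer hotfix into default"
# customer-only (Type-C) files that came along can then be deleted on default
hg rm path/to/customer-only-file
hg commit -m "Drop customer-only files from default"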
Individual changesets can be exchanged between branches using hg graft -r SRCREV.
A more complex and probably harder (but also more flexible and manageable) approach may be to use MQ with more than one patch queue (a queue per type?), but even with a single queue of MQ patches and a good naming convention you will not get lost.
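A minimal MQ sketch, assuming the extension is enabled in your hgrc and with made-up queue and patch names:
# in ~/.hgrc or the repository's .hg/hgrc:
# [extensions]
# mq =
hg qqueue --create customer-patches   # a dedicated queue for customer-only changes
hg qnew customer-branding.patch -m "Customer-specific branding"
# ...edit files...
hg qrefresh                           # fold the edits into the top patch
hg qpop -a                            # unapply the whole queue before pulling/merging
hg qpush -a                           # re-apply it afterwards, resolving conflicts as you go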

What's a good strategy for an open source project to not share production settings? [closed]

I'm currently looking to release an MVC / RavenDB application as an open source repository, mostly just to let people look around rather than as a full-on project.
However, I don't want certain production settings, such as SMTP server details and connection strings, publicly exposed (though I still want them under source control).
I'm looking for suggestions on how I should structure public and private repositories so that I can easily work on the project and have reasonably hassle free deployment.
Cheers
In cases similar to yours, where I need to publish parts publicly and keep important data hidden, I usually simply keep a branch just for production.
In your case, you could make the dev branch, where you publish everything, open source, let people clone it, and receive contributions from others. Then have a production branch somewhere different (Heroku, for example).
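One way to wire that up with Git, assuming a public GitHub remote and a private deployment remote (the remote names and URLs are illustrative):
git remote add public git@github.com:you/project.git        # open-source mirror
git remote add deploy git@private.example.com:project.git   # private production repository
git push public dev        # only the dev branch ever goes to the public remote
git push deploy production # the branch holding real settings stays private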
You normally don't put those files under version control. Inside a company, say, you can put a template under version control and ask your developers to copy it into place and update as necessary. Like a config.ini.template file with
[smtp]
host = smtp.company.com
user = USERNAME # update this
pass = PASSWORD # and this
where it's clear that the developers need to update the credentials when they rename it to config.ini. The config.ini file should then be excluded from version control.
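For instance, with Git the template is versioned and the real file ignored (file names as in the example above):
git add config.ini.template
echo "config.ini" >> .gitignore       # with Subversion: svn propset svn:ignore config.ini .
git add .gitignore
git commit -m "Ship a config template; keep real credentials out of the repo"
# each developer then does:
cp config.ini.template config.ini     # and fills in real credentials locally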
For an open source project I would probably not put any template under version control. I would still configure it so that the config.ini file is excluded from version control so that I can have my own config.ini file in my working copy without committing it by accident.
I find the above system much easier than putting the real config files under version control. Even if I could put them in a private branch of some sort, it would require me to constantly merge with that branch, and I'd have to be careful not to accidentally push the config file to another repository.

What should go in the 'default' branch of a Hg repository? [closed]

In large Libre Source software projects, versioned with Mercurial or similar DVCS tools, which of the following is considered to be more conventional:
Keeping the latest "stable" version of the software in the default branch. Tagging each release in default so you know which revision got packaged up as a download. Merging patches into default as soon as they are tested. Keeping new features, etc. in named branches to be merged into default on the next release.
Keeping each release in a named branch, or similar. Using default to keep bleeding-edge code that's only intended to be run by developers or the very foolhardy.
Or... is there some better pattern of workflow that is widely accepted?
Mercurial has a fairly strong opinion on what you should use your default branch for. It's documented in the Standard Branching wiki page. The summary is:
You should not use a name other than default for your main development branch.
The reason is that default is the branch that is checked out by new clones. If you try to use some other name for your "main" branch, users will get a more or less random branch when they clone and commit things in the wrong place, which is generally undesirable.
Even with tons of documentation that says "branch before adding a new feature" (see next point), people will forget this when they send you patches. They then have the trouble of cleaning up things by moving changesets around.
So always put the bleeding-edge code in the default branch and use other branches for your stable releases.
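In practice that looks something like this (the branch and tag names are examples):
# day-to-day development happens on default
hg update default
# ...commit features...
# when you are ready to stabilise a release series, open a stable branch
hg branch stable-1.0
hg commit -m "Open the 1.0 stable branch"
hg tag 1.0.0                # tag the actual release on the stable branch
# bugfixes land on stable-1.0 and get merged back into default
hg update default
hg merge stable-1.0
hg commit -m "Merge 1.0 fixes into default"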
Don't treat branch names as disposable
Branch names are a permanent part of each commit and allow identifying on which branch each commit was introduced. Thus you will want to give some thought to your branch names so that you don't pollute the branch namespace.
Also, if you attempt to use a branch per bugfix, you may eventually run into performance issues. Mercurial and the tools surrounding it are designed to work well with hundreds of branches. Mercurial itself still works quite well with ten thousand branches, but some commands might show noticeable overhead which you will only see after your workflow has already stabilized.
We have caches in place internally in Mercurial, so the problems are mostly UI problems: hosting sites and log viewers might run hg branches to load all 10,000 branches into a single drop-down menu. That is really slow and useless for the poor user who wants to select a single branch from the gigantic menu.
If the branches are closed, then they won't show up in hg branches, and so the problem should be minimized. However, the tools might want to show closed branches too; it all depends on the tool.
I'm sorry this is a little vague. The main point is that Mercurial is built to scale in the number of changesets, not the number of named branches. We have addressed the biggest performance problems with named branches with the cache I mentioned before, so today I'm not too concerned about having many branches, especially if the number of open branches is kept small (less than, say, 100).
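For reference, closing a branch as mentioned above is just a single commit on that branch (the branch name here is made up):
hg update stable-0.9
hg commit --close-branch -m "The 0.9 series is no longer maintained"
hg branches            # no longer lists stable-0.9
hg branches --closed   # still shows it if you need it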
I have fallen into the habit of using default in Mercurial and master in Git for the actual work, the bleeding edge, and using tags and branches for the releases. hgsubversion and git-svn seem to take this tack.
In general there is no such thing as "most conventional": each workflow is a matter of local convention and the team's development policy.
I have often seen both of the policies you mention, and intermediate variations as well.
Where there is a strong testing/release policy and branches are used intensively ("branch per task"), the "default" branch often exists only as a merges-only branch (merges from feature branches before QA testing) and means "code in which the finished features work without throwing errors, but whose functionality is untested".
Minor versions get named branches, and each release on such a branch is a tag. Bugfix branches are merged, once complete, into "default" and into the active version branches.
But this workflow is just one more example, neither better nor worse than others; it suits mid-sized teams with an established separation of responsibilities and doesn't work well in "chaotic anarchy" development.
There's not a huge amount in it. If we're talking about just DEV and STABLE branches, which one is default is mainly a naming convention. I'd tend to have DEV as default, just because most work happens on the dev branch, and if that is the default branch it's less hassle.
Personally I prefer a named branch per release. Bugfixes can then go on those branches and be forward-ported with relative ease to all later releases using hg merge. If you try to do the same with DEV and STABLE, you can only ever have one maintained release (the last one), or your stable branch starts growing branches and you end up with a (possibly less organised) version of the branch-per-release structure.
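The forward porting is then just a chain of merges, for example (the release branch names are made up):
hg update release-1.0
# ...commit the bugfix on the oldest release that needs it...
hg update release-1.1
hg merge release-1.0        # carry the fix into each later release branch
hg commit -m "Forward-port fix from 1.0"
hg update default
hg merge release-1.1
hg commit -m "Forward-port fix to default"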

Any thoughts on Surround scm? [closed]

So I'm looking at different version control systems: Subversion, AccuRev, Surround, TFS, BitKeeper/Git/Mercurial.
Subversion: I see it's quite the popular standard
AccuRev: There seems to be a love-hate relationship around it.
Surround and TFS: I haven't seen many comments around them.
Bitkeeper/Git/Mercurial: Seem pretty popular, but I think "distributed" may scare my manager lol
For some reason he seems attracted to Surround, and it's not because of a sales pitch. We originally downloaded it for evaluation and played around with it, but nothing came of it. Now we are back to looking at SCMs, and he wants to try it again. So far I haven't seen any buzz around it like some other version control systems have. The same goes for TFS.
I've been using Surround SCM at my job and I'll say it is what it is, but there are a few things that I find lacking. I've heard that Surround SCM integrates well with Surround's issue tracking system, but I can't comment on that because we don't use it.
I personally find the UI to be buggy and confusing.
The workflows are confusing and often present you with prompts that don't apply, so you get used to ignoring warnings,
e.g. "Are you sure you don't want to auto-merge?", "Are you sure you want to overwrite files?"
The UI always badgers you to use the auto-merge feature, but every time I've tried it, it ends up messing up my code (C#).
On top of that, the packaged diff tool (Guiffy) is buggy and doesn't display text properly.
Weird workflow quirks can result in your changes being overwritten.
It doesn't do directory syncing
...which means that every time you add a new file to your project you must go and add it to the SCM repository by hand. If you don't, everything will look normal to you until one of your teammates emails you because you broke the build.
There's no good way to copy over revision histories when you are branching
... which means that you are less likely to branch when you should. There's nothing more frustrating than having to store code locally because you're making changes right before a release and your team refuses to branch the code into another repository.
There's no good way to blacklist certain files from being checked in or from being overwritten during an update.
If there's a file that you don't want to check in, then you're left with the painful chore of scanning through a long list of files and deselecting the ones you don't want every time you check in. Yuck.
Features aren't documented that well
Of course, they release a user's guide, but it's about as helpful as the Microsoft Windows help function. It tells you step by step how to do things in the UI (i.e. "click 'Create Shadow Directory', then click 'OK'"), but it doesn't tell you what those features are, how they are intended to be used, what actually happens server-side, etc.
Btw, if you know of any good way to get around these problems let me know :)
Danger! Danger, Will Robinson!
Surround is a data jail. Once you commit to it, you're stuck. There is no known way to get your history back out to another SCM. Don't get trapped!
This tends to be a problem with closed-source SCMs in general, but I have direct reports that it's especially bad with Surround.
Subversion, git, Mercurial, or Bazaar would be better choices.
I have used Surround at my job for about three years.
It does work well with their (Seapine's) test management and issue tracker program. If you are already using TestTrack, I would say Surround is a good choice.
In general I agree with #eremzeit, but the 'buggy and confusing' comment rarely applies to our workflow. The default diff tool (Guiffy) is bad, but often good enough.
One part I like is the easy ability to share files across repositories without needing to share a whole project/repository. Git does not have a mechanism to do this easily.
Last note: we have used Surround on Linux and Windows and it appears to work just as well on either. It is nice to have the same interface.
Surround SCM.
Pros:
You can apply a development workflow to all files; no two revisions of a file can be in the same status in the workflow.
Has a good UI.
Good licensing system.
Cons:
Stores all data in an RDBMS, which is heading for a performance problem if the repository gets huge.
Does not support atomic commits (you can do atomic commits, but the files are still individual revisions and cannot be referred to using the changelist number).
My ideas about other tools
Subversion suits a corporate setup well. Perforce is like Subversion but faster, with a good UI, simple licensing terms, and a really super support system.
Recently Accurev has gained a strong footing with its innovative branching methodology.
IMHO, go for tool sets that interact well with your defect tracking, test case management, and build management solutions. This will help you create a good developer ecosystem and thereby save time.