What is the standard workflow for applying conflicting patches?

This is a programming language and version control system agnostic question.
There is a source code tree and two patches, X and Y. Each of them applies cleanly to the source tree on its own. But applying one of them (either X or Y) and then the other results in the second patch failing to apply: the patches conflict.
Is my only option to apply one of them (probably the bigger one, so that most of the work gets done automatically) and then merge the other by hand, resolving the conflicts, or are there better tools or practices for this scenario?

The goal should be to keep this situation from arising in the first place, because there is no easy way to merge conflicting patches.
To avoid it, make small commits together with their tests and push them to the source repository often. Your teammates will then have to pull the latest changes before they can commit their own code, which helps catch conflicts early, before anything gets broken.
I also encourage you to avoid having multiple teams changing the same part of the source code. Create a good structure and, if possible, break the project down into smaller projects.
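If you want to spot likely conflicts before applying anything, one option is to compare which line ranges the two patches touch. Below is a minimal Python sketch, assuming both patches are unified diffs generated against the same tree with the same path prefixes. The function names touched_ranges and overlapping_files are made up for this illustration, and the check is only a heuristic: hunk ranges include context lines, so an overlap marks a candidate for conflict, not a certain one.

```python
import re

# Matches a unified-diff hunk header such as "@@ -10,3 +10,4 @@".
HUNK_RE = re.compile(r"^@@ -(\d+)(?:,(\d+))? \+\d+(?:,\d+)? @@")


def touched_ranges(patch_text):
    """Map each old-file path to the (start, end) line ranges its hunks cover."""
    lines = patch_text.splitlines()
    ranges = {}
    current = None
    for i, line in enumerate(lines):
        nxt = lines[i + 1] if i + 1 < len(lines) else ""
        # A file header is a "--- " line immediately followed by a "+++ " line.
        if line.startswith("--- ") and nxt.startswith("+++ "):
            current = line[4:].split("\t")[0].strip()
            continue
        m = HUNK_RE.match(line)
        if m and current is not None:
            start = int(m.group(1))
            length = int(m.group(2)) if m.group(2) is not None else 1
            # A zero-length hunk (pure insertion) is treated as touching one line.
            end = start + max(length, 1) - 1
            ranges.setdefault(current, []).append((start, end))
    return ranges


def overlapping_files(patch_x, patch_y):
    """Return the sorted paths where hunks of the two patches overlap."""
    rx, ry = touched_ranges(patch_x), touched_ranges(patch_y)
    hits = []
    for path in sorted(set(rx) & set(ry)):
        if any(a1 <= b2 and b1 <= a2
               for a1, a2 in rx[path]
               for b1, b2 in ry[path]):
            hits.append(path)
    return hits
```

Files reported by overlapping_files are the ones where the second patch is most likely to be rejected, so they are the places to review by hand first.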

Related

Storing code metrics

I'd like to write a pre-commit hook that tells you if you've improved/worsened some code metric of a project (e.g. average function length). The hook would have to know what the previous average function length was and I don't know where to store that information. One option would be to store an additional .metrics file in the repo but that sounds clunky. Another option would be to git stash, compute the metrics, git stash pop, compute the metrics again and print the delta. I'm inclined to go with the latter. Are there any other solutions?
Disclaimer: I am the author of the Metrix++ tool, which I use in the workflow described below. I guess the same workflow can be carried out with other tools that are able to compare results.
One of the ideas you suggested works perfectly if you add a couple of CI checks (see the steps below). I find it solid, and I am not sure why you consider it clunky.
I have a file with metrics results that is updated before each commit and stored in VCS. Let's call this file metrics.db and consider automating the following workflow as part of the project's build/test:
1) If metrics.db has not been changed since the last checkout (i.e. it still holds the original data for the previous/base revision), copy it to metrics-prev.db.
2) Collect metrics for the current code, which produces the metrics.db file again. Note: it is very helpful when the metrics tool can do iterative scans (i.e. recalculate metrics only for updated functions/classes), because that makes it fast enough to run on every build, including incremental ones.
3) Compare metrics-prev.db with metrics.db. If the metrics show regressions, fail the build and [optionally] do not allow the commit (a team rule). If the metrics are good, the build succeeds and the commit may happen.
4) [optionally] Run Continuous Integration (CI) to validate that the committed metrics.db file corresponds to the committed code for the same revision (i.e. repeat steps 1-3 and make sure the diff at step 3 is zero). If the diff is not zero, somebody forgot to update the metrics.db file and presumably did not run the pre-commit check, so revert the change.
5) [optionally] CI may do steps 1-3 itself if it fetches metrics.db from the previous revision as metrics-prev.db. In this case, CI may also check that the collected metrics.db is the same as the committed one (an alternative or an addition to step 4).
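To make the collect-and-compare part (steps 2 and 3) concrete, here is a rough Python sketch. It does not use Metrix++ or its file format: it assumes Python sources, takes the question's example metric (average function length), and represents a snapshot as a plain dict from file name to value. All function names are hypothetical.

```python
import ast


def average_function_length(source):
    """Average number of source lines per function definition in `source`."""
    tree = ast.parse(source)
    lengths = [
        node.end_lineno - node.lineno + 1
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
    ]
    return sum(lengths) / len(lengths) if lengths else 0.0


def collect_metrics(sources):
    """Step 2: compute the metric for each file in a {name: source} mapping."""
    return {name: average_function_length(text) for name, text in sources.items()}


def regressions(previous, current, tolerance=0.0):
    """Step 3: names whose metric is larger (worse) than in the previous snapshot."""
    return sorted(
        name
        for name, value in current.items()
        if name in previous and value > previous[name] + tolerance
    )
```

A hook or build step would load the previous snapshot (the role played by metrics-prev.db), call collect_metrics on the current sources, and fail when regressions returns a non-empty list. Storing the snapshot as JSON via json.dump would be one simple choice for the committed file.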
Another implementation I have seen: the metrics.db files are stored on a separate drive, outside VCS, and a custom script locates the metrics.db that corresponds to a revision. I find this solution unreliable, as the drive can disappear, files can be moved or renamed, and so on. So placing the file in VCS is the better solution, but either will work.
I have tried the alternative you suggested: switch to the previous revision and run the metrics tool twice. I abandoned this approach for two reasons. First, the metrics check script alters your source files, so it cannot be included in an iterative rebuild, and you cannot keep working smoothly in your IDE because it will complain about changed files. Second, it is very slow (compared with iterative re-scans, extremely slow).
Hope it helps.

Merge in local, base or other?

I have a project for which I've set up a Bitbucket repo using Mercurial.
There are currently three of us working on it, so we're using branches.
When we merged, we did it rather randomly, so it failed many times.
I'm currently using Meld, and I don't really know in which pane I should pick the parts of the source code I want to keep.
So, when I merge, where should I do it?
I'm not really sure whether I have to do it in local, base or other, even though I know local corresponds to my latest modifications and other corresponds to the latest modifications of the branch I want to merge, and, well, actually I'm not really sure what other is...
After careful review, I have figured it out: you want to merge into local.
Please correct me if I'm wrong, but I am pretty sure after doing some tests.
Local
The correct place to merge changes into. These are the local files that will result from the merge. They will likely already contain a mix of some auto-merged lines.
Base
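The common ancestor of the two versions being merged, i.e. the file as it was before the branches diverged. It is shown for reference only.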
Other
The changes you are pulling in from the branch being merged.
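For reference, in a three-way merge the base is the common ancestor of local and other, and it is what lets a tool decide which side actually changed a given chunk. A minimal Python sketch of that decision rule follows. It is only an illustration of the general idea, not how Meld or Mercurial are implemented, and merge_chunk is a made-up name.

```python
def merge_chunk(base, local, other):
    """Resolve one chunk of a three-way merge; return (result, is_conflict)."""
    if local == other:
        return local, False   # both sides agree (or neither side changed)
    if local == base:
        return other, False   # only the other side changed: take other
    if other == base:
        return local, False   # only the local side changed: keep local
    return None, True         # both changed differently: resolve by hand
```

Only the last case needs a human: that is where you compare the local and other panes and write the agreed result into the file that will be saved.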
This may not be the "right" answer, but when in doubt I make ALL of the panes match by making each of them look correctly merged (sometimes I have to discuss with my coworkers what the correct result is, based on their changes).
By doing this I ensure the merge will succeed, because Meld cannot and will not actually change upstream data in Mercurial, so there's no downside. For the life of me, I also cannot tell which pane to merge into (mostly because the term "base vs local" is ambiguous). So this is kind of an odd way to do it, but it works.

Merge two LLBLGEN 2 source files

I have two LLBLGen 2.6 Pro source files that I have to merge in my git repo (two different branches). Thanks to the "professional" work of the previous programmers on this project, the two projects contain changes (the fork is one year old) that are not tracked in any documents.
What would be the least painful way to complete my merge?
Thanks.
In my experience, it's easier to simply ignore the merge conflicts in the LLBL-generated code, re-sync the project with the database, and then regenerate the code completely after the merge.
This becomes a problem when there are a lot of (or even a few) customizations made to the LLBL project file (e.g. renamed fields, typed lists). There isn't much you can do about these other than tracking them down one by one. The good news is that the compiler will complain if something is missing or renamed.

Why should I combine code and tests in a single commit?

Let's say I write an atomic change of code. I also write some tests to make sure the change works and keeps working.
I think I should commit the code change together with the tests.
PRO:
This makes sure (as far as possible) that every changeset of the branch results in running code and passing tests.
It documents that these tests "belong to" the code change that I did.
CON:
I may want to back out the change at some point in the future without backing out the tests.
What other reasons are there for doing it one way or the other?
Has either strategy ever bitten you? How exactly?
Somewhat related: Should change to code be committed separately from corresponding change to test suite?
My only rule is to make sure that code builds prior to checking in.
Yes, ideally you want to check in atomic bits of code. However, how do you define atomic? And then what happens if, later, you have to change one of the bits? Your changes are no longer in one commit.
I think I should commit the code change together with the tests.
I never do, but I would, if I used a non-branching SCM.
In branching SCMs (I've worked with mercurial and git) it's a non-issue.
I use a different branch every time I start working on something. I often have local branch commits that just save outstanding changes (which may not even compile) when I move to something else for a while, or when I just need a backup point before starting a major code change. When I create them, I know that most of those intermediate commits will never be checked out again. They don't affect anything, though, and are really useful when needed.
I only merge branches back when I have code that compiles and tests that run successfully on it.

Best practices to deal with "slightly different" branches of source code

This question is tool-agnostic rather than tied to a particular version control program.
Assume there is a source code tree under some distributed version control system. Let's call it A.
At some point somebody else clones it and gets their own copy. Let's call it B.
I'll call A and B branches, even if some version control tools have different definitions for branches (some might call A and B repositories).
Let's assume that branch A is the "main" branch. In the context of distributed version control this only means that branch A is modified much more actively and the owner of branch B periodically syncs (pulls) new updates from branch A.
Consider a source file in branch B that contains a class (again, this is language-agnostic). The owner of branch B decides that some of the class methods are better grouped together and moves them next to each other inside the class body. Functionally nothing has changed: this is a very trivial refactoring of the code. But the change shows up in diffs. Now, assuming that this change from branch B will never get merged into branch A, the owner of branch B will always get this difference when pulling from branch A and merging into their own workspace. Even if there is only one such trivial change, the owner of branch B needs to resolve conflicts every time they pull from branch A. As long as branches A and B are modified independently, more and more conflicts like this appear. What is the workaround for this situation? Which workflow should the owner of branch B follow to minimize the effort of periodically syncing with branch A?
The owner of branch B should have discussed the change with the owner of branch A. They should have decided that either the change was worth making, in which case it should have been committed to the trunk (A) or it wasn't, in which case it should have never been made. VCS is not a substitute for communication between developers.
Dev B should ask Dev A to pull from him.
The situation is not avoidable if two branches diverge and never converge. It's similar to a fork in a project: it's very difficult to merge two different forks.
Dev B should either un-refactor, or else add some meaningful functionality that has enough momentum to warrant merging back into A. Otherwise tell him "Them's the breaks".
To take an extreme example, he could decide that he doesn't like C and rewrite everything in Pascal, line by line, using the same variable names, class names, etc., to the point that the only difference is the language. So what would you tell him then, when he insists that there aren't any changes, so there should be no diff?
One workaround is to have a fancy diff wrapper around code comparison that will intelligently ignore certain very specific diffs.
At a simple level this is usually implemented via gdiff; at a fancier level, with custom diff code.
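As a rough idea of what such a wrapper could look like, here is a minimal Python sketch built on the standard difflib module. It assumes that the differences to ignore are blank lines and leading/trailing whitespace. The names normalize and filtered_diff are hypothetical, and ignoring moved methods (as in the question) would need a smarter normalizer than this one.

```python
import difflib


def normalize(lines):
    """Drop blank lines and strip surrounding whitespace from each line.

    Assumption: indentation carries no meaning in the compared language,
    which is NOT true for Python sources, for example.
    """
    return [line.strip() for line in lines if line.strip()]


def filtered_diff(a_text, b_text, normalizer=normalize):
    """Unified diff of two texts after passing both through `normalizer`."""
    a = normalizer(a_text.splitlines())
    b = normalizer(b_text.splitlines())
    return list(difflib.unified_diff(a, b, "A", "B", lineterm=""))
```

An empty result means the two versions differ only in ways the normalizer was told to ignore. Swapping in a different normalizer is how more specific differences would be filtered out.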
However, I must say that such a situation is really a bad thing, and it's much better to merge the changes back into A, for example by submitting a patch to A's maintainers.