Git Concepts - Distinction between Head and Latest Commit

Having used various version control systems in the past including TFS, the whole concept of file revisions was always easy to interpret and grasp. If I ever wanted to reference the latest version of files for instance, I only needed to identify what was listed as the latest changeset (or whatever terminology was used by that particular vcs) and that was pretty much it. Very easy.
I am a relative newbie to Git and so with the transition of TFS to Azure DevOps, I really find it confusing trying to get a handle on the concept of file versioning in Git. I have a couple of questions which are perhaps best depicted by the below screenshot.
For example,
What is HEAD and what is the distinction between it and the latest commit?
Why is the HEAD id different from the last commit id, when a file comparison run against the two ids indicates they are identical? In my case, and from the image below, these would be 43593c12 and f493628c respectively.

HEAD in Git is essentially a pointer to the commit that is currently checked out in your local repository, usually the tip of the current branch.
The most important thing to note here is that HEAD is subjective (locally scoped) and will normally differ between clones of the repository.
The concept of a "latest commit" in Git is ambiguous: Git is a distributed system, so "latest" is not clearly defined.
Generally, if you're new to Git, I recommend spending 1-2 weeks reading the Git Book thoroughly - https://git-scm.com/book/en/v2
Specifically for your current question take a look at The Three States and Branching basics.
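To make the "HEAD is a pointer" idea concrete, here is a minimal toy model in Python. It is only a sketch, not how Git stores things internally: the Clone class, the branch names and the commit ids (c1, c3, c5) are all made up for illustration.

```python
# Toy model of one Git clone: branch names map to commit ids, and HEAD
# names the branch (or commit) that is currently checked out.
# The class and the commit ids are hypothetical, for illustration only.
from dataclasses import dataclass, field


@dataclass
class Clone:
    branches: dict = field(default_factory=dict)  # branch name -> commit id
    head: str = "main"                            # what is checked out

    def resolve_head(self) -> str:
        # HEAD usually names a branch; follow it to that branch's tip.
        # If it names no branch, treat it as a commit id ("detached HEAD").
        return self.branches.get(self.head, self.head)


# Two clones of the same repository, each with its own HEAD.
alice = Clone(branches={"main": "c3", "feature": "c5"}, head="feature")
bob = Clone(branches={"main": "c3"}, head="main")

print(alice.resolve_head())  # c5
print(bob.resolve_head())    # c3
```

Both clones hold the same "main" history, yet their HEADs resolve to different commits, which is why HEAD and "the latest commit" need not be the same thing.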

Related

Which VCS do 3D modellers use?

Which VCS do 3D modellers use, for instance with Blender or 3ds Max?

Like any project, it is the choice of the person starting it. Subversion and git are two popular choices, and each has its strong points. It would be hard to say one is more popular than the other.
There are two points I would highlight in making your decision -
Disk Usage - multimedia projects often use large files. git is a distributed system, which means every user cloning a project gets the entire repository; this can lead to a lot of extra disk usage for large projects. svn keeps all the revisions on the server, and each user gets two copies of each file, one original and one working, so that a comparison can be made locally.
This also extends to svn being able to check out a subdirectory of a project, while git needs to copy the entire repo. While recent git versions can check out a subset of the working files, the whole revision history is still copied locally.
Checking out previous revisions - git uses a unique hash string to identify each revision, while svn uses a numerical sequence. This makes it easier in svn to just check out the previous revision, or five revisions earlier. To get an earlier revision from git by its id, you need to list the history and copy what looks like a random string. That is at least the case when using the CLI; GUI apps can make this easier for both.
This can extend to discussions: svn users can say "I have revision 125" and "I have 122" to quickly know whether someone is way behind or just missing one update.
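As a side note to the point about earlier revisions: Git also accepts relative references such as HEAD~1 or HEAD~5, which walk back through parent commits without copying a hash. The Python sketch below only models that walk on a toy history; the parents table and the short ids in it are invented for illustration.

```python
# Toy first-parent history: each commit id maps to its parent (None for the root).
# The ids are invented; real Git ids are much longer hashes.
parents = {"a1f": None, "9c2": "a1f", "e07": "9c2", "43b": "e07"}
head = "43b"


def ancestor(commit, n):
    """Return the n-th first-parent ancestor of commit, like Git's commit~n."""
    for _ in range(n):
        commit = parents[commit]
        if commit is None:
            raise ValueError("history is shorter than requested")
    return commit


print(ancestor(head, 1))  # e07
print(ancestor(head, 3))  # a1f
```

So "one revision earlier" can be expressed relative to the current commit in both systems; only the absolute ids differ in how readable they are.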

What is the downside to managing a repository with multiple heads?

I have a project with a single branch, default. I have been iterating on this single named branch for some time now and I have been using tags to mark version number milestones.
The project's source code changed quite a bit between tags 1.0.7 and 1.1.0 (current). However, there are some users on 1.0.7 that need a bug fix. So I checked out the source, updated to tag 1.0.7, implemented a fix and committed. That was tagged 1.0.8, and will probably be the last commit on the 1.0.x line.
I now have two heads on the default branch. I expected that. But when I tried to push to our BitBucket account, I received a warning from hg: "push creates new remote head". Reading up on this message, I get a lot of answers explaining why the message is there and for most people the answer is just to merge. However, I don't think I want that in this case. The two branches aren't compatible.
It looks like I can just use the -f option to force push the new head to the remote repository, however this seems to be discouraged both by hg help and various posts on the web without much explanation as to why. So what is the downside to doing this? It seems as though I can still update to whatever tags/revisions I want to continue working on. If I push that head to the BitBucket account, will I be shooting myself in the foot in some way?

Having multiple heads is perfectly fine.
If there are several heads and there's little indication as to their purpose, it may be difficult for others to see where they should continue and which head contains the newest developments, i.e. which one gains new features.
However, by using tags on the branch with clear versioning like you do, that problem doesn't exist either.
There's one small catch though: Mercurial will, upon clone, update to the newest commit in the default branch, i.e. the head which received the last commit. If that's the 1.0.x head of yours, that might be unfortunate. However, you can fix this by attaching the special '@' bookmark to the mainline or development head. Mercurial will always update to the head which bears that bookmark, if it is present, irrespective of which head has the newest commit.
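The rule for what a fresh clone checks out can be sketched in a few lines of Python. This is a toy model of the behaviour described above, not Mercurial's code: the function name and the revision labels are hypothetical, and the special bookmark is assumed to be the one Mercurial documents as '@'.

```python
def clone_update_target(bookmarks, default_tip):
    """Pick the revision a fresh clone would check out in this toy model.

    bookmarks maps bookmark name -> revision id (an assumption of the sketch).
    """
    # Prefer the special '@' bookmark; otherwise fall back to the newest
    # commit on the default branch.
    return bookmarks.get("@", default_tip)


default_tip = "rev-112"  # hypothetical: the 1.0.8 fix, committed last

print(clone_update_target({}, default_tip))                 # rev-112
print(clone_update_target({"@": "rev-110"}, default_tip))   # rev-110
```

With the bookmark on the mainline head (the made-up rev-110 here), new clones land there even though the maintenance head received the most recent commit.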

Keeping experimental history out of shared repository in Mercurial

I'm fairly new to Mercurial, but one of the advantages I see using Mercurial is that while writing a feature you can be more free to experiment, check in changes, share them, etc, while still maintaining a "clean" repo for the finished feature.
The issue is one of history. If I tried 6 different ways to get something to work, now I'm stuck with all of the history for all my mistakes. What I'd like to do is go through and clean up my changes and "collapse" them into one changeset that can be pushed into a shared repository. This is complicated by the fact that I might pull in new changesets from the shared repository, and have those changesets intermingled with my own.
The best way I know of to do that is to use hg export to create a patch of my changes since cloning, clone a fresh repository, and apply the patch to the fresh repository.
These steps seem a little bit cumbersome and easy to mess up, particularly if this methodology is rolled out to the whole dev team, some of whom are a little resistant to change (don't get me started). TortoiseHg makes the process slightly better since you can highlight the changesets you want to be included in an export.
My question is this: Am I making this more complex than it needs to be? Is there a better workflow I can use to ease my troubles? Is it too much to expect a clean history where entire (small-ish) features are included in one changeset?
Or maybe my whole question could be summed up this way:
Is there an equivalent in Mercurial for "Collapsing a git repository's history"?

Although I think you should reconsider your use of branches in Mercurial (as per my comment on your post), using named branches doesn't really help with your concern of maintaining useless or unnecessary history - it just organizes them a bit.
I would recommend a combination of these tools:
mercurial queues
histedit (a separate extension at the time of writing; bundled with Mercurial since version 2.3)
the mq changeset strip feature
to rework a messy history before pushing to a blessed or master repo. The easiest thing would be to use strip to permanently remove any changeset with no children. Once you've done that you can use mq or histedit to combine, relocate, or modify existing commits. Histedit will even let you redo the comment associated with a changeset.
Some pitfalls:
In your opening paragraph you mention sharing changesets during feature development. Please understand that once you've shared a changeset it's not a good idea to modify it using mq, histedit, or strip. These extensions can change the revision hash, which makes the result look like a new changeset to everyone else.
Also, I agree with Paul Nathan's comment that mq (and histedit) are power features and can easily destroy a history. It's a good idea to make a safety clone before using these extensions.

Named branches are the simplest solution. Each experimental approach gets its own branch. This retains the history of the experiments.
The next solution is to have a fresh clone for each experiment. The working one gets pushed back to the main repo.
The next solution - and probably what you are really looking for - is the mq extension, which can "squash" a series of patches into a single commit. I consider mq to be "advanced", and "subject to accidentally shooting yourself in the foot". I also don't care to squash my commits - I like having my version history present for reference.
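For readers unsure what "squashing" or "collapsing" means in terms of content, here is a rough Python sketch of the net effect. It is not how mq or histedit work internally; the fold function, the file names and the dictionary representation of a changeset are all assumptions made for illustration.

```python
def fold(changesets):
    """Collapse an ordered list of changesets into one.

    Each changeset is modelled as {path: new_content}; None marks a deletion.
    """
    combined = {}
    for changes in changesets:
        combined.update(changes)  # later changesets override earlier ones
    return combined


# Three hypothetical attempts at the same feature.
attempts = [
    {"parser.py": "first try"},
    {"parser.py": "second try", "scratch.py": "debug helper"},
    {"parser.py": "working version", "scratch.py": None},
]

print(fold(attempts))
```

The folded changeset keeps only the final state of each file, which is what ends up in the shared repository; the intermediate attempts disappear from history, and that is exactly the trade-off discussed above.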

Using different "paths" in Mercurial - also called branching

Is it possible to have different development "paths" from a given point in Mercurial, without having to clone my project? I currently have 2-3 different implementation options for a project and I'd like to try them out. If I could just use one and at any point come back and start in another "path" without losing data from the older one that would be nice, but I am not even sure it is possible.
Thanks

This is exactly what branching is designed for:
https://www.mercurial-scm.org/wiki/Branch
The easiest way to create a branch in Mercurial is to simply check out an older version, and then commit again with something different from what you committed after it the first time. You won't lose the old following commit; the new commit will simply branch out into a new line of development and the original commit(s) will remain on the previous line of development.
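The effect of committing on top of an older revision can be modelled with a small parent table. This Python sketch is only a toy graph, not Mercurial's data model; the revision numbers and the helper names heads and commit are made up for illustration.

```python
# Toy changeset graph: revision number -> parent revision (None for the root).
parents = {0: None, 1: 0, 2: 1}


def heads(parents):
    """Return revisions that have no children, i.e. the heads."""
    with_children = {p for p in parents.values() if p is not None}
    return sorted(rev for rev in parents if rev not in with_children)


def commit(parents, parent_rev):
    """Add a new revision on top of parent_rev and return its number."""
    new_rev = max(parents) + 1
    parents[new_rev] = parent_rev
    return new_rev


print(heads(parents))  # [2]
commit(parents, 1)     # check out revision 1, then commit again
print(heads(parents))  # [2, 3]
```

Revision 2 is still there after the second commit; the graph simply has two heads, one per line of development.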

Yes, you probably want bookmarks for this - they're a lightweight way of marking various heads without recording the names forever in the revision (which branches do). See BookmarksExtension for more details.
http://stevelosh.com/blog/2009/08/a-guide-to-branching-in-mercurial/ may also be helpful - it's essentially the canonical document on branch management strategies in Mercurial.

Branching and Merging Strategies

I have been tasked with coming up with a strategy for branching, merging and releasing over the next 6 months.
The complication comes from the fact that we will be running multiple projects, all with different code changes and different release dates but approximately the same development start dates.
At present we are using VSS for code management, but are aware that it will probably cause some issues and will be migrating to TFS before new development starts.
What strategies should I be employing and what things should I be considering before setting a plan down?
Sorry if this is vague, feel free to ask questions and I will update with more information if required.

This is the single best source control pattern that I have come across. It emphasizes the importance of leaving the trunk free of any junk (no junk in the trunk). Development should be done in development branches, and regular merges (after the code has been tested) should be made back into the trunk (Pic 1), but the model also allows for source to be patched while still under development (Pic 2). I definitely recommend reading the post in its entirety, to completely understand.
Pic 1
Pic 2
Edit: The pictures are definitely confusing without words. I could explain, but I would basically be copying the original author. Having said that, I probably should have selected a better picture to describe the merge process, so hopefully this helps. I'd still recommend reading the post, however.

The simplest and most usual way I've seen branching work is based on two premises: Trunk and Release. I think this is known as the "unstable trunk, stable branch" philosophy.
Trunk is your main source. It contains the "latest and greatest" code and is forward looking. It isn't always stable.
Release has a many-to-one association with trunk: there is one trunk but many releases that derive from it. A release generally starts as a branch of the trunk once a particular functionality milestone has been hit, so the "only" things left to go in for a particular deployment should be bug fixes. You then branch the trunk, give it a label (e.g. 1.6 Release is our current latest Release), build and send the release to QA. We also push the version number (usually the minor number) of the trunk up at this point to ensure we don't have two releases with the same number.
Then you begin the testing cycle on your release branch. When sufficient testing has been performed you apply bug fixes to the release branch, merge these back to the trunk (to ensure bug fixes are carried forward!) and then re-release a build of the branch. This cycle with QA continues until you are both happy and the release is finally given to the customer(s). Any bug reports from the customer(s) that are accurate (i.e. they are a bug!) start another QA cycle with the branch in question.
As you create future releases it is a good idea to also try to move older customers onto newer branches to reduce the potential number of branches you might have to back-patch a bug fix into.
Using this technique you can deploy solutions using your technology to a variety of customers that require different levels of service (starting with least first), you can isolate your existing deployments from "dangerous" new code in the trunk and the worst merge scenario is one branch.
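The trunk/release cycle described above can be sketched as a tiny Python model. It is only an illustration of the bookkeeping under simple assumptions, not a TFS or Subversion workflow: the Line class, the helper functions and the bug id BUG-42 are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Line:
    """One line of development: trunk or a release branch."""
    version: tuple                          # (major, minor)
    fixes: set = field(default_factory=set)


def cut_release(trunk):
    """Branch a release from trunk, then bump trunk's minor version."""
    release = Line(version=trunk.version, fixes=set(trunk.fixes))
    trunk.version = (trunk.version[0], trunk.version[1] + 1)
    return release


def fix_on_release(release, trunk, bug_id):
    """Apply a fix on the release branch and merge it back to trunk."""
    release.fixes.add(bug_id)
    trunk.fixes.add(bug_id)


trunk = Line(version=(1, 6))
release_1_6 = cut_release(trunk)
fix_on_release(release_1_6, trunk, "BUG-42")

print(release_1_6.version, trunk.version)  # (1, 6) (1, 7)
print(trunk.fixes)                          # {'BUG-42'}
```

The release keeps the version it was cut with, trunk moves on to the next minor number, and every fix made on the release also lands on trunk so it is carried forward.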

My first recommendation would be to read Eric Sink's Source Control HOWTO - specifically the branches and branch merge chapters.
We have 3 containers - DEV, MAIN, and RELEASE for our work. MAIN contains all our "ready-to-release" code and we tend to think of it as "basically stable." DEV/Iteration (or DEV/Feature, or DEV/RiskyFeatureThatMightBreakSomeoneElse) are branches from MAIN and are merged up when the Iteration/Feature is ready to promote up past the DEV environment. We also have TFS builds set up from the DEV/Iteration branch and the MAIN branch.
Our RELEASE container contains numbered releases (similar to the "tags" container used in many Subversion repositories). We simply take a branch from MAIN each time - I like to say we're "cutting" a RELEASE branch to signify this shouldn't have a lot of activity going on once the merge is finished.
As for VSS->TFS - Microsoft supports an upgrade path which should keep your version history, but if you don't need the history, I would just get the latest version from VSS, check it into TFS and archive the VSS repository.
One final tip - get your team members familiar with source control. They must understand branching and merging or you will be stuck doing a lot of cleanup work :).
Good luck!

The Subversion book describes some common branching patterns. Maybe you can also apply these to TFS.
