Uploading patches to GitHub in chunks, then pushing, on a low-speed network

I am on a low-speed network. How can I push large git commits with some kind of resume capability? Like, can I upload chunks of patch files then combine them on the git server, then push them to master?

Usually I'd recommend git for such purposes, since it compresses everything and transmits quite efficiently.
Anywho -- you could have a branch from which you'd cherry-pick one commit at a time and push them individually, so to say.
Btw, smaller commits are the way to go! ;-)
Also, you could of course create patches and upload them some other way (scp, ...), which might then be resumable, and apply them on your git server. But I doubt this would be more efficient than actually using git.
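The "one commit at a time" idea can be sketched roughly as below. This is a minimal demo under assumptions, not a definitive recipe: the temporary bare repository stands in for the real remote, and the branch name `main`, the remote name `origin`, and the file names are all made up for illustration. Each push transfers only one commit's worth of objects, so a dropped connection costs at most that one step.

```shell
#!/bin/sh
# Sketch: push a backlog of local commits one at a time, oldest first.
# All paths and names here are hypothetical stand-ins for a real setup.
set -eu

work=$(mktemp -d)
git init -q --bare "$work/remote.git"      # stands in for the remote server
git init -q "$work/local"
cd "$work/local"
git symbolic-ref HEAD refs/heads/main      # name the unborn branch "main"
git config user.name "Demo"
git config user.email "demo@example.invalid"
git remote add origin "$work/remote.git"

# Build up three local commits that have not been pushed yet.
for n in 1 2 3; do
    echo "change $n" > "file$n.txt"
    git add "file$n.txt"
    git commit -q -m "commit $n"
done

# The remote is empty in this demo, so every commit on main is unpushed.
# Against an existing remote branch the range would be origin/main..main.
range=main

pushed=0
for sha in $(git rev-list --reverse "$range"); do
    # Each push is a fast-forward that sends a single new commit.
    git push -q origin "$sha:refs/heads/main"
    pushed=$((pushed + 1))
done

remote_head=$(git --git-dir="$work/remote.git" rev-parse refs/heads/main)
local_head=$(git rev-parse main)
echo "pushed $pushed commits; remote main is at $remote_head"
```

If the loop is interrupted, running it again with the `origin/main..main` range should pick up at the first commit the remote does not have yet.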

Related

Deleting draft PR's

I know github generally has a rule where PR's and stuff can be closed, but never deleted to preserve history, unless there is something very necessary about it (such as private keys being included in the PR by accident, etc.). I was wondering, however, if draft PR's can be deleted. Sometimes I use it for certain CI/CD testing, and I end up closing them, but they can start cluttering up my PR history. Since they were never converted to a full, real PR, is that a thing we can do without contacting github support?
Thanks!
Apparently not. Draft PRs are treated like regular PRs and cannot be deleted without contacting support, and support only deletes them if they contain sensitive data. See https://docs.github.com/en/authentication/keeping-your-account-and-data-secure/removing-sensitive-data-from-a-repository
You might want to set up a separate repo just for CI/CD testing if you are trying to test configurations like Actions. That would keep your main repo from getting cluttered.
If you are creating draft PRs to run tests, that could indicate your tests are difficult (or slow) to run locally. I'm just assuming though. I know that's why I sometimes rely on CI/CD instead of local testing.

How do I notify all forks of my code of a critical change?

Suppose I have a following situation. Long ago I published some useful code on Github and a lot of people forked it since then. Now I find some really serious error (like a buffer overrun) in my code and fix it and I realize that all forks should better have that fix, otherwise Bad Things™ might happen.
How do I notify owners of all forks that there's this critical change they'd better pull?
An upstream repo doesn't really know about its downstream repo (see "Definition of “downstream” and “upstream”").
And you cannot make a pull request to a fork (that wouldn't scale well anyway).
So the easiest way is to count on the other developers to update their local clones with your latest changes, which will include your latest fixes.
You can update your README.md for all to see, but you cannot really "broadcast" to all the forks (not to mention all the direct clones you have no knowledge about).
Anyway, if they want to contribute back, you will reject any pull request which isn't fast-forward.
That means they will have to rebase their work on top of the latest from "upstream" (your repo), before pushing to their fork and making said pull request.
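From the fork owner's side, the fetch-and-rebase step might look like the sketch below. It is only an illustration under assumptions: both repositories are throwaway local directories, and the remote name `upstream`, the branch name `main`, and the file names are hypothetical.

```shell
#!/bin/sh
# Sketch: a fork rebases its own work on top of upstream's critical fix.
# Repositories, branch and file names are hypothetical.
set -eu

work=$(mktemp -d)
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.invalid

# The original project, with one published commit.
git init -q "$work/upstream"
cd "$work/upstream"
git symbolic-ref HEAD refs/heads/main
echo "v1" > lib.c
git add lib.c
git commit -q -m "initial release"

# Someone forks it (a plain clone here) and adds their own work.
git clone -q -o upstream "$work/upstream" "$work/fork"
cd "$work/fork"
echo "feature" > feature.c
git add feature.c
git commit -q -m "fork-only feature"

# Meanwhile the original author publishes the critical fix.
cd "$work/upstream"
echo "v2 fixed" > lib.c
git commit -q -am "fix buffer overrun"

# The fork owner pulls the fix in by rebasing onto upstream.
cd "$work/fork"
git fetch -q upstream
git rebase -q upstream/main

git log --oneline
```

After the rebase the fork's history sits directly on top of the fix, so a pull request made from it would be a fast-forward.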

Can I use "Online Backup" to back up my DVCS instead of pushing to an external repo?

I'm currently signed up with a third party service that hosts my mercurial repositories as a central hub to push my changes to as a sort of backup.
Now, I'm looking at a system to back up my laptop and am considering Mozy. I'm a lone developer working on a laptop, usually connected to the internet via wifi, and the laptop is only really on when I'm working, so I feel something like Mozy is my best option.
My question is, if I'm the only developer, could I get away with just using local mercurial repos and using Mozy to back everything up, rather than pushing to an external repo?
Many thanks
Matt
Disclaimer: My experience is with git rather than hg, but as I understand it the concepts apply equally to both systems.
An advantage of backing up to a remote repo is that if your local repo becomes corrupted (perhaps due to a problem with the underlying filesystem), that corruption does not get transferred over to the backup, unless the files in your working tree themselves are corrupted.
For example, it's possible for some of the objects in the repository, perhaps those which are rarely accessed because you don't change them, to become corrupted. It could be months before you use one of those files again, and so months before you notice (though I think doing a garbage collect run, e.g. git gc, will detect corruption).
So if you are backing up by pushing commits, you're creating an independent version of those objects, and using checksums (i.e. the commit hash) to verify the transfer of any new files. Whereas if you are backing up to a backup provider, you're duplicating the actual objects in the repo, in whatever state they are in, and duplicating any changes to those files, including corruption of them.
Usually backup providers will give you rollback (spideroak seems to be particularly good for this) but you'll still have to sift through a lot of versions to figure out when the corruption happened; also with some providers, the rollback period is limited (especially for free accounts).
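For catching corruption early rather than months later, git ships a dedicated integrity check, `git fsck`, and Mercurial has `hg verify`. A minimal sketch of running the git check before a backup, assuming a throwaway repository (the path and commit are made up for the demo):

```shell
#!/bin/sh
# Sketch: verify repository integrity before trusting a file-level backup.
# The repository here is a hypothetical throwaway one.
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q .
git -c user.name=Demo -c user.email=demo@example.invalid \
    commit -q --allow-empty -m "initial"

# git fsck checks the validity and connectivity of the objects in the
# repository and exits non-zero if it finds a problem.
if git fsck --full >fsck.log 2>&1; then
    status=ok
else
    status=corrupt
fi
echo "repository check: $status"
```

Running something like this from the same scheduled job that triggers the backup would at least tell you which backup was the last known-good one.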

Do Distributed Version Control Systems promote poor backup habits?

In a DVCS, each developer has an entire repository on their workstation, to which they can commit all their changes. Then they can merge their repo with someone else's, or clone it, or whatever (as I understand it, I'm not a DVCS user).
To me that flags a side effect: being more vulnerable to forgetting to back up. In a traditional centralised system, both you as a developer and the people in charge know that if you commit something, it's held on a central server which can have decent backup solutions in place.
But using a DVCS, it seems you only have to push your work to a server when you feel like sharing it. It's all very well you have the repo locally so you can work on your feature branch for a month without bothering anyone, but it means (I think) that checking in your code to the repo is not enough, you have to remember to do regular pushes to a backed-up server.
It also means, doesn't it, that a team lead can't see all those nice SVN commit emails to keep a rough idea what's going on in the code-base?
Is any of this a real issue?
I can understand your concern about devs forgetting backups once their local diff is gone (because they've committed locally) and stops nagging them with copious output. I think the solution can lie in better tools, moar tools! You could set up a cron job on each dev's box that pushes every last reachable object in their repository to the central repo, and labels them in the central (backed-up) repo with namespaced branches. I think "git push" can do this, given the correct refspec. That way everything gets backed up, and the only thing you don't do is affect the state of your public branches.
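A refspec along those lines might look like the sketch below. The `backup/<name>/` prefix, the developer name `alice`, and the repository paths are all assumptions for the demo; a local bare repository stands in for the central server.

```shell
#!/bin/sh
# Sketch: back up every local branch under a per-developer namespace on
# the central repo. Names and paths are hypothetical.
set -eu

work=$(mktemp -d)
git init -q --bare "$work/central.git"     # stands in for the central repo
git init -q "$work/dev"
cd "$work/dev"
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo"
git config user.email "demo@example.invalid"

echo "stable" > app.txt
git add app.txt
git commit -q -m "stable work"

git checkout -q -b experiment
echo "half-baked" > idea.txt
git add idea.txt
git commit -q -m "work in progress"

dev_name=alice

# Map every local branch X to backup/alice/X on the central repo.
# The leading + allows non-fast-forward updates (e.g. after a rebase).
git push -q "$work/central.git" "+refs/heads/*:refs/heads/backup/$dev_name/*"

# A cron entry for this could look like (hypothetical path and URL):
#   0 * * * * cd /path/to/repo && git push -q <central-url> '+refs/heads/*:refs/heads/backup/alice/*'

count=$(git --git-dir="$work/central.git" for-each-ref \
    --format='%(refname)' "refs/heads/backup/$dev_name/" | grep -c .)
echo "backed up $count branches"
```

Because everything lands under the `backup/alice/` prefix, the public branches on the central repo are left untouched.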
But do you really need as aggressive a backup process as before, when the repo existed only in one place? With a DVCS, you need a far higher category of catastrophes to lose all your code. You now need an asteroid or a bomb hitting your office (and all your off-site team members), instead of just a hard disk or RAID controller going bad. Note, I'm not advocating sloppiness; I'm advocating equal risk at lower cost.
I don't think this follows automatically. Distributed or centralized VCS can be combined with backup (or not). It's all a question of discipline in the team.
Same for commit-emails. If the team has the discipline to regularly push changes to the right repositories, you can have a working commit-mailinglist too.
Bad habits can also grow in a team with a centralized VCS. You always have to fight bad habits.
In most places I imagine that there is probably still a 'central' repository from where builds are made and put to test. If you want your code in the build, it's got to be pushed centrally.
This is also a management issue - tell your team - push regularly (at least daily) so that your code is backed up. If it's not being done, then get out the big stick.
I'd also note, that if you're relying on looking at the commits to see what your staff are doing, you probably have some larger issues that you might look at addressing...
Having a local copy of the repository might encourage poor backup habits, if one were slack. However, your master repository SHOULD be backed up.
The "local copy of the entire repository" has a much more important use than being a backup. It reduces the latency of examining the history of the codebase - say, diffing against the latest version - from being a network round trip to a trip to your local hard drive.
That doesn't sound all that big a deal if your main repository's on your gigabit LAN. If you're a telecommuter, and the repository's a good 600+ ms away over a VPN, it makes a world of difference.
I've never looked into it, but I'm sure both Mercurial and Git support post-commit hooks, allowing you to set up commit mails going to the team lead. Then each developer could set up her repository accordingly, or have an interim repository that permits half-baked features with the commit mails, or whatever.
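On the git side, a `post-commit` hook is just an executable script in the repository's hooks directory. A rough sketch, with the notification written to a log file as a stand-in for actually sending mail (the log file name is made up, and a real setup would presumably pipe the same summary into whatever mail command is available):

```shell
#!/bin/sh
# Sketch: a git post-commit hook that records a one-line summary of each
# commit. The log file stands in for a real mail command (assumption).
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Demo"
git config user.email "demo@example.invalid"

mkdir -p .git/hooks
cat > .git/hooks/post-commit <<'EOF'
#!/bin/sh
# Runs after every successful commit in this repository.
git log -1 --format='%h %an: %s' >> "$(git rev-parse --git-dir)/commit-notify.log"
EOF
chmod +x .git/hooks/post-commit

echo "hello" > notes.txt
git add notes.txt
git commit -q -m "first commit"

log="$repo/.git/commit-notify.log"
cat "$log"
```

Since hooks are not copied by a clone, each developer would have to install this in their own repository, which matches the "each developer could set up her repository accordingly" part.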
Edit: Regarding John's comment about a long-running experiment being lost because it wasn't ready to commit to the master repo: work in a separate branch and regularly push your changes to the master. You still get all the benefits of working against a local repository (mainly, for me, very low latency), and still not annoy your colleagues with your half-baked feature... and you can still store your changes off your machine, in a place where your admin can properly back up the repository.

What is a good Mercurial usage pattern for this setup?

We've got two developers on the same closed (ugh, stupid gov) network, another developer a couple minutes' drive down the road, and a fourth developer half-way across the country. E-mail, ftp, and removable media are all possible methods of transfer for the people not on the same network.
I am one of the two closed network developers, consider us the "master" location.
What is the best Mercurial setup/pattern for this group? What is the best way to transmit changes to/from the remote developers? As I am in charge, I figured that I would have to keep at least one master repo with another local repo in which I can develop. Each other person should just need a clone of the master. Is this right? I guess this also makes me responsible for the merging?
As you can see, I'm still trying to wrap my head around distributed version control. I don't think there is any other way to do this with the connectivity situation.
Patches are a simple and versatile solution.
For moving around larger groups of changes (especially binary changes and merges), Mercurial offers binary bundles. A bundle is basically the binary stuff that is sent on the network when you do hg push, but here it is captured in a file.
Let's imagine I have gotten a clone somehow (by flash drive, DVD, etc.). Call it upstream. I then make a second clone, call it devel. I do all my development in devel and make lots of commits, merges, etc. Since Mercurial is distributed I can do all this offline.
To see which changesets are missing in upstream I do
% hg outgoing ../upstream
When I have something to send, I can use
% hg bundle changes.hg ../upstream
to get a compressed binary file which contains the changesets including all their metadata. I can then burn this file on a CD and send it by mail...
The recipient of the bundle can do
% hg incoming changes.hg
to see the changeset list and
% hg pull changes.hg
to unpack and add the changesets to his repository. He will then most likely have to merge -- this is exactly as if he had pulled directly from your repository over HTTP or SSH.
Note, the upstream repository is only used as a convenient way to remember which changesets are already found in the upstream repository. You can also just jot down the changeset ID and use hg bundle --base when bundling to specify the base (common) changeset. See hg help bundle or look in the wiki.
The users outside the network can make patches and email them to the main repo maintainer, or to someone like yourself, to merge. The other internal people can keep local copies, as you do, and do merges. If you are receiving these out-of-network patches, though, it might be better to have one person deal with them so nobody gets confused, but that's something you'd have to decide yourself.
Syncing the other way, you'd create a patch, and then email it or get a flash drive to the remote developers so they can patch their systems. You're going to need some good communication in the team, man; I am thankful I'm not in your shoes.
Those are my only suggestions --well, the obvious, get them a VPN connection! I'd love to hear how it goes, what plans stabilize into a weekly groove, et cetera.
Correct. The only way anything makes it onto the closed network is via flash drive.