Automatically store each compile on GitHub? - version-control

My code is compiled by multiple people (on multiple machines) multiple times per day. I would like each compile to be automatically uploaded to our GitHub account whenever someone completes a build. Basically, these compiled zips get sent to actual hardware via flash drive, email, or Dropbox (any number of ways based on many conditions). Since the zip is always named the same, sometimes the old version is deleted on the device, and sometimes it is stored in an /old directory. I would like to stop losing old versions and retain a central repository of each version, stored chronologically. GitHub seems the perfect place for that.
I could of course ask each user to upload the finished zip that they created to a central location, but I would like it to be an automatic process if possible. So - does GitHub offer a feature like that?

GitHub seems the perfect place for that.
Not really, since putting large binaries in a distributed repo (i.e., a repo that is cloned around in its entirety) is not a good idea.
To get a specific version of your binary, you would have to clone your "artifacts" repo from GitHub, before being able to select the right one to deploy.
And with multiple deliveries, that repo would get bigger and bigger, making each clone take longer.
However, if you have only one place to deploy, it is true that a git fetch would only get the new artifacts (incremental update).
But:
GitHub doesn't offer unlimited space (and again, that repo would grow rapidly with all the deliveries)
cleaning a repo (i.e., deleting old versions of binaries you no longer need) is hard.
So again, using GitHub or any other DVCS (Distributed VCS) for delivery purposes doesn't seem adequate.
If you can set up a public artifact repository like Nexus, then you will be able to deliver as many binaries as you want, and you will be able to clean them up (delete old ones) easily.
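As a rough sketch of how that could be automated from each build machine (the repository URL, path and credentials below are placeholders, not a real setup):

    # Hypothetical build hook: push the freshly built zip to an artifact
    # repository (e.g. a Nexus "raw" hosted repo) under a timestamped name,
    # so no old version is ever lost.
    import datetime
    import pathlib

    import requests  # pip install requests

    ARTIFACT_BASE = "https://nexus.example.com/repository/builds"  # assumed URL
    ZIP_PATH = pathlib.Path("firmware.zip")                        # the build output

    def upload_build(zip_path: pathlib.Path) -> str:
        stamp = datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
        target = f"{ARTIFACT_BASE}/{zip_path.stem}-{stamp}{zip_path.suffix}"
        with zip_path.open("rb") as fh:
            resp = requests.put(target, data=fh, auth=("ci-user", "ci-password"))
        resp.raise_for_status()
        return target

    if __name__ == "__main__":
        print("Stored build at", upload_build(ZIP_PATH))

Hooking a script like this into the end of the build keeps the "upload the finished zip" step out of each user's hands.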

GitHub has the concept of files attached to a repository but not actually stored in the repo itself (originally the Downloads feature, with the files kept on S3; it has since been replaced by Releases), and there's an API for uploading files to it.
You could call that API as part of your build process.
If you have a continuous integration server building your code after every commit, you should be able to get it to store the build products somewhere, but you might have to handle the integration yourself if you want them stored on GitHub (as opposed to on the CI server's hard disk).
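A hedged sketch of what that API call could look like today, using the Releases endpoints (the owner, repo, token and tag naming are placeholders):

    # Publish a build zip as a GitHub Release asset via the REST API.
    # OWNER, REPO and TOKEN are placeholders to replace with your own values.
    import datetime
    import pathlib

    import requests  # pip install requests

    OWNER, REPO = "your-org", "your-repo"
    TOKEN = "ghp_your_token_here"
    API = f"https://api.github.com/repos/{OWNER}/{REPO}"
    HEADERS = {"Authorization": f"Bearer {TOKEN}",
               "Accept": "application/vnd.github+json"}

    def publish_build(zip_path: pathlib.Path) -> None:
        # 1. Create a release tagged with the build timestamp.
        tag = datetime.datetime.now().strftime("build-%Y%m%d-%H%M%S")
        release = requests.post(f"{API}/releases", headers=HEADERS,
                                json={"tag_name": tag, "name": tag})
        release.raise_for_status()
        # 2. Attach the zip to that release via the upload endpoint.
        upload_url = (f"https://uploads.github.com/repos/{OWNER}/{REPO}"
                      f"/releases/{release.json()['id']}/assets?name={zip_path.name}")
        with zip_path.open("rb") as fh:
            requests.post(upload_url,
                          headers={**HEADERS, "Content-Type": "application/zip"},
                          data=fh).raise_for_status()

    if __name__ == "__main__":
        publish_build(pathlib.Path("firmware.zip"))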

While GitHub is perfect for collaborating on the sources, it is not built for managing the build and its artifacts. You may eventually want to look at companies like CloudBees, which provide hosted build and integration environments that target exactly the workflow steps beyond source management. But those are mostly targeted towards Java development, which may or may not fit your needs.
Besides that, if you really only want a lot of time-stamped zip files from your builds to be accessible by a lot of people, wouldn't a good old-fashioned FTP server be enough for your needs?
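If it is, the automation stays just as simple; a minimal sketch (host, credentials and paths are assumptions):

    # Store every build on an FTP server under a timestamped name.
    import datetime
    import ftplib
    import pathlib

    def store_build_on_ftp(zip_path: pathlib.Path) -> None:
        stamp = datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
        remote_name = f"{zip_path.stem}-{stamp}{zip_path.suffix}"
        with ftplib.FTP("ftp.example.com", "builds", "secret") as ftp:
            ftp.cwd("/builds")  # one flat, chronological archive
            with zip_path.open("rb") as fh:
                ftp.storbinary(f"STOR {remote_name}", fh)

    if __name__ == "__main__":
        store_build_on_ftp(pathlib.Path("firmware.zip"))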

Related

Azure DevOps (VSTS) and GitHub: current status on largest allowed repository size

We would like to give VFS for Git (formerly known as GVFS) a try, but currently this only works with VSTS and GitHub. So we can't run a local VFS for Git server, since no open-source server implementation exists yet.
However, we expect our repository to become hundreds of gigabytes large, since we store assets for 3D movie production. Not a lot of assets, just large ones.
What is the current status on the maximum allowed repository size (with a paid plan)?
According to the Azure DevOps documentation, there currently is no hard limit, only a recommendation:
In uncommon circumstances, repositories may be larger than 10GB. For instance, the Windows repository is at least 300GB. For that reason, we do not have a hard block in place. If your repository grows beyond 10GB, consider using Git-LFS, GVFS, or Azure Artifacts to refactor your development artifacts.
What is meant by "refactor your development artifacts"? How would using GVFS make a repository smaller (if that is what is meant by refactoring)? Maybe it can use another server for storing large files?
Another problem is the push size limit, which is currently 5 GB; that might be too small for, say, video files or some Substance Painter files, but for most cases this will work.
GVFS, currently being renamed to VFS for Git, is a system that only downloads (hydrates) the files you actually open, keeping the unused files as empty placeholders locally while they remain stored in the repository.
(source: vfsforgit.org)
"Refactoring" here refers to re-indexing the repository's files with tools such as Git LFS, which move the large content out of the repository itself and so reduce the size that has to be pushed to it.
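As a hedged illustration of that kind of refactoring (the file patterns are examples only; this rewrites history, so run it on a fresh clone and coordinate with your team):

    # Move large asset types to Git LFS by rewriting the repository.
    import subprocess

    LARGE_PATTERNS = ["*.exr", "*.mov", "*.spp"]  # assumed asset types

    def migrate_to_lfs() -> None:
        subprocess.run(["git", "lfs", "install"], check=True)
        subprocess.run(
            ["git", "lfs", "migrate", "import",
             *[f"--include={p}" for p in LARGE_PATTERNS],
             "--everything"],  # rewrite all refs, not just the current branch
            check=True,
        )
        # History changed, so the rewritten refs must be force-pushed.
        subprocess.run(["git", "push", "--force", "--all", "origin"], check=True)

    if __name__ == "__main__":
        migrate_to_lfs()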
Regarding usage, it is recommended that you take into account that:
Windows 10 must be installed
There are Git commands that GVFS does not support
It has to be used with Git Bash (GVFS)
Finally, the repository has no hard size limit for now (I currently have one at 1 TB)

GitHub Multiple Repositories vs. Branching for multiple environments

This might be a very beginner question, but I'm working on a large production website in a startup environment. I just recently started using Heroku, GitHub, and Ruby on Rails because I'm looking for much more flexibility and version control compared to just making changes locally and uploading them to a server.
My question, which might be very obvious, is if I should use a different repository for each environment (development, testing, staging, production, etc.) or just a main repository and branches to handle new features.
My initial thought is to create multiple repositories. For instance, if I add a new feature, like an image uploader, I would use the code from the development repository, make the changes, and commit along the way to keep track of the small changes. Once I had tested it locally, I would want to upload it to the test repository with a single commit that lists the feature added (e.g. "Added Image Uploader to account page").
My thought is this would allow micro-managing of commits within the development environment, while the testing environment commits would be more focused on bug fixes, etc.
This makes sense in my mind because as you move up in environments you remove the extraneous commits and focus on what is needed for each environment. I could also see how this could be achieved with branches though, so I was looking for some advice on how this is handled. Pros and cons, personal examples, etc.
I have seen a couple other related questions, but none of them seemed to touch on the same concerns I had.
Using different repos makes sense with a Distributed VCS, and I mention that publication aspect (push/pull) in:
"How do you keep changes separate and isolated across multiple deployment environments in git?"
"Reasons for not working on the master branch in Git"
The one difficult aspect of managing different environments is the configuration files, which can contain different values per environment.
For that, I recommend a content filter driver.
That helps generate the actual config files with the current values in them, depending on the current deployment environment.
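A minimal sketch of such a filter, assuming a smudge script that fills a committed template with per-environment values (the placeholder syntax, file names and DEPLOY_ENV variable are assumptions):

    #!/usr/bin/env python3
    # smudge_config.py: reads the committed template on stdin and writes the
    # environment-specific config to stdout.
    #
    # Wiring it up (names are examples):
    #   git config filter.envconfig.smudge "python3 smudge_config.py"
    #   git config filter.envconfig.clean  "cat"
    #   echo "config/settings.yml filter=envconfig" >> .gitattributes
    import os
    import sys

    VALUES = {
        "development": {"DB_HOST": "localhost", "BASE_URL": "http://localhost:3000"},
        "staging": {"DB_HOST": "db.staging.internal", "BASE_URL": "https://staging.example.com"},
        "production": {"DB_HOST": "db.prod.internal", "BASE_URL": "https://www.example.com"},
    }

    def main() -> None:
        env = os.environ.get("DEPLOY_ENV", "development")
        text = sys.stdin.read()
        for key, value in VALUES[env].items():
            text = text.replace(f"@{key}@", value)  # e.g. "@BASE_URL@" in the template
        sys.stdout.write(text)

    if __name__ == "__main__":
        main()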

Recommended DVCS mechanism for hosting many independent patches

I have a project just getting started at http://sourceforge.net/projects/iotabuildit/ (more details at http://sourceforge.net/p/iotabuildit/wiki/Home/) that is currently using Mercurial for revision control. And it seems like Mercurial and SourceForge almost have all the right features or elements to put together the collaboration mechanism I have in mind for this project, but I think I'm not quite there yet.
I want people to be able to submit, discuss and vote on individual changes from a large number of individuals (more developers than a project would normally have). And I want it to be as easy as possible for users to participate in this.
The thought right now is that people can clone the "free4all" fork, which is a clone of the base "code" repository, or they can create their own fork in their own SourceForge user project (SourceForge now provides a workspace for every user to host miscellaneous project-related content). Then they can clone that to their local repository (after downloading TortoiseHg or their preferred Mercurial client). Then they can make modifications, commit them, push them to the fork, and request a merge into the base "code" repository, at which point we can discuss/review the merge request. This all is still far too many steps, and more formal than I'd like.
I see there is such a thing as "shelving" in Mercurial, but I don't see how/if that is supported in the SourceForge repository. And there probably isn't a way to discuss shelved changes as there is for merge requests.
I'm looking for any suggestions that would make this easier. Ideally, I would like users to be able to:
Specify any version that they would like to play, and have that requested version extracted from source control and hosted at SourceForge for the user to play (because the game can't be played locally due to security restrictions the Chrome browser properly applies to javascript code accessing image content in independent files)
Allow the user to download the requested version of the project for local editing (a C# version built from the same source is also playable locally, or Internet Explorer apparently ignores the security restriction, allowing local play in a browser)
Accept submitted modifications in a form that can be merged with any other compatible "branch" or version of the game that has been submitted/posted (ideally this would be very simple -- perhaps the user just uploads the whole set of files back to the server and the compare and patch/diff extraction is performed there)
Other players can see a list of available submitted patches and choose any set to play/test with, then discuss and vote on changes.
Clearly some of these requirements are very specific, and I will probably need to write some server side code if I want to reach the ideal goal. But I want to take the path of least resistance and use the technologies available if much of the functionality I need is already almost there. Or I'd like to see if I can get any closer than the process I outlined earlier without writing any server code. So what pieces will help me do this? Does Mercurial & SourceForge support storing and sharing shelved code in the way I would want? Is there something to this "Patch Queue" (that I see, but can't understand or get to work yet) that might help? Is there a way to extract a patch file from a given set of files compared to a specific revision in a repository (on the server side), without having the user download any Mercurial components?
It sounds like something you could do with Mercurial Queues (mq). The patch queue can be its own separate, versioned repository, and people can use 'guards' to apply only the patches they want to try, as sketched below.
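A rough sketch of that guard workflow, driven from Python (the patch and guard names are made up; the mq extension must be enabled in .hgrc):

    # Create a guarded patch and apply only the patches matching a guard.
    import subprocess

    def hg(*args: str) -> None:
        subprocess.run(["hg", *args], check=True)

    if __name__ == "__main__":
        hg("qnew", "faster-renderer.patch")                     # new patch in the queue
        hg("qguard", "faster-renderer.patch", "+experimental")  # guard it
        hg("qselect", "experimental")                           # activate that guard...
        hg("qpush", "--all")                                    # ...so only matching patches apply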
But really, it sounds even easier to use Bitbucket or GitHub, both of which have excellent patch-submission, review, and acceptance workflows built into them.

Version Control from a different age

At my work I'm on a separate network from my colleague due to clearance reasons, and we both need to share code. I am wondering what the best versioning system would be? There's got to be something better than having project1.zip, project2.zip, etc. - but something not as expansive as git or hg.
I would still recommend Git, as it allows you to:
make a bundle (only one file, and it can be an incremental bundle)
mail that bundle to your colleague (meaning it will work even if your separate networks have no other way to communicate)
The idea is to exchange one file (from which you can pull any new history bundled in it).
And Git makes it very cheap to create a repo and add an existing code base to it.
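A small sketch of the bundle exchange for a single branch (the "last-sent" tag and file names are conventions assumed here; the very first bundle would have to include the full history, e.g. with --all):

    # Create an incremental bundle containing everything on master since the
    # last exchange, then remember what was sent.
    import subprocess

    def git(*args: str) -> None:
        subprocess.run(["git", *args], check=True)

    def make_incremental_bundle(path: str = "update.bundle") -> None:
        git("bundle", "create", path, "last-sent..master")
        git("tag", "-f", "last-sent", "master")  # mark the tip that was sent

    # On the receiving side, the bundle behaves like a read-only remote:
    #   git pull update.bundle master
    if __name__ == "__main__":
        make_incremental_bundle()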
That being said, any communication procedure will have to be approved by your employer: don't bypass any security measure ;)

I want to separate binary files (media) from my code repositories. Is it worth it? If so, how can I manage them?

Our repositories are getting huge because of the tons of media we have (hundreds of 1 MB JPEGs, hundreds of PDFs, etc.).
Our developers who check out these repositories have to wait an abnormally long time for certain repos because of this.
Has anyone else had this dilemma before? Am I going about it the right way by separating code from media? Here are some issues/worries I had:
If I migrate these onto a media server then I'm afraid it might be a pain for the developer to use. Instead of making updates to one server, he/she will now have to update two servers if they are doing both programming logic and media updates.
If I migrate these onto a media server, I'll still have to revision control the media, no? So the developer would have to commit code updates and commit media updates.
How would the developer test locally? I could make my site use absolute URLs, e.g. src="http://media.domain.com/site/blah/image.gif", but this wouldn't work locally. I assume I'd have to change my site templating to decide whether it's local/development or production and, based on that, change the BASE_URL.
Is it worth all the trouble to do this? We deal with about 100-150 sites, not a dozen or so major sites, and so we have around 100-150 repositories. We won't have the time or resources to change existing sites, and we can only implement this on brand new sites.
I would still have to keep scripts that generate media ( pdf generators ) and the generated media on the code repository, right? It would be a huge pain to update all those pdf generators to POST files to external media servers, and an extra pain taking caching into account.
I'd appreciate any insight into the questions I have regarding managing media and code.
First, yes, separating media and generated content (like the generated pdf) from the source control is a good idea.
That is because of:
disk space and checkout time (as you describe in your question)
the lack of VCS features actually usable with this kind of file (no diff, no merge; only labels and branches)
That said, any transition of this kind is costly to put in place.
You need to separate the release management process (generating the right files at the right places) from the development process (getting the right material from one or two referentials to develop/update your projects).
Binaries fall generally into two categories:
non-generated binaries:
They are best kept in an artifact repository (like Nexus, for instance), under a label that matches the label used for the text sources in the VCS (see the sketch after this list)
generated binaries (like your pdf):
ideally, they shouldn't be kept in any repository, but only generated during the release management phase in order to be deployed.
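As a small, hedged illustration of that "matching label" idea for the non-generated binaries, the label can be derived from the checked-out sources with git describe and used as the upload path (the artifact repository URL and credentials are placeholders):

    # Publish a binary under the label (tag) of the currently checked-out sources.
    import pathlib
    import subprocess

    import requests  # pip install requests

    ARTIFACT_BASE = "https://nexus.example.com/repository/site-media"  # assumed

    def current_label() -> str:
        out = subprocess.run(["git", "describe", "--tags", "--always"],
                             capture_output=True, text=True, check=True)
        return out.stdout.strip()

    def publish(binary: pathlib.Path) -> str:
        target = f"{ARTIFACT_BASE}/{current_label()}/{binary.name}"
        with binary.open("rb") as fh:
            requests.put(target, data=fh, auth=("deploy", "secret")).raise_for_status()
        return target

    if __name__ == "__main__":
        print(publish(pathlib.Path("assets/header.jpg")))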