Where does GitHub store my code and files? [closed] - github

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question does not appear to be about a specific programming problem, a software algorithm, or software tools primarily used by programmers. If you believe the question would be on-topic on another Stack Exchange site, you can leave a comment to explain where the question may be able to be answered.
Closed 1 year ago.
I have tried to find out where GitHub stores the code and files I commit. After a lot of searching, I found only that it is stored in the cloud. This is too broad for me. I don't have (or don't know) a method to find the exact answer.
Where does GitHub store my code and other data I commit? Who hosts GitHub?

Main servers are in the US based on my observations.
A git pull from a San Francisco based server is lightning fast, while from Australian servers it is noticeably slower.
There are possibly some additional regional hubs too.
They were at some point hosted with Rackspace; no idea if they still are or not.
https://www.quora.com/How-many-physical-virtual-servers-does-GitHub-have

Currently GitHub seems to have its own data centers in the USA: Northern Virginia and Seattle. https://github.blog/2017-10-12-evolution-of-our-data-centers/

GitHub is just a wrapper web service over Git technology.
Just like any other version control system, Git stores your committed files under a directory on the server, something like github/users/username/repositoryname. A hosting server typically keeps this as a bare repository: the directory holds the same history and objects as your local clone, but no checked-out working copy of the files.
To see this in more detail, you can set up your own Git server:
https://git-scm.com/book/en/v2/Git-on-the-Server-Setting-Up-the-Server
or you can install a self-hosted GitHub-like service such as Gogs:
https://gogs.io

GitHub uses Git, which can be seen as an object store. In this store, files are stored as Git blobs and directories as Git trees.
You may want to read about Git internals to understand its architecture.
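To make the "object store" idea concrete, here is a minimal sketch of how Git derives the ID of a blob from a file's contents. It assumes a repository using Git's default SHA-1 object format; the function name git_blob_id is made up for this illustration.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the SHA-1 object ID Git assigns to a blob with this content.

    Assumes the default SHA-1 object format (not a SHA-256 repository).
    """
    # A blob is hashed as the header "blob <size>\0" followed by the raw bytes.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


if __name__ == "__main__":
    # Should match: echo 'test content' | git hash-object --stdin
    print(git_blob_id(b"test content\n"))
```

Because the ID depends only on the content, two identical files anywhere in the history are stored once, which is part of why Git repositories stay compact.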
In addition, GitHub uses Elasticsearch as a primary software stack for indexing more than 8 million repositories, allowing full-text search on source code, issues, users...
You may want to read this article:
https://www.elastic.co/use-cases/github

Git is a "version control system".
Version control systems keep revisions to code straight, and store the modifications in a repository.
How does Git store the files?
Here is some reference material:
1) http://gitready.com/beginner/2009/02/17/how-git-stores-your-data.html
2) Git Internals PDF: https://github.com/pluralsight/git-internals-pdf

Related

Is it possible for GitHub to remove my email address from public repositories from a purely technical standpoint? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 2 years ago.
On GitHub, I contributed to Free/Libre & Open Source projects. Now, I want my email address from those commits to be removed.
I expect my user name to be in those commits, too. It contains at least my surname and the initial of my given name. Some commits might even contain my full name.
I do expect GDPR to treat the combination of email address and personal name as goods protected in its scope. GDPR does state that the storage of personal data has to happen in consent and with a limited time frame.
(About the latter, I guess I will have to disable the auto-enabled option to "store my data for future generations in an arctic vault", but let's discuss that at another time.)
It would be cumbersome to write to each maintained repository.
Most of the time, they even have 10+ forks with no commit activity by their owners, which nevertheless expose the information publicly. (GitHub does sometimes enforce public forks via an option, which at least works for lazy "fork-button-clickers".)
Therefore, I do not actually expect to get my personal data completely removed even if I put in a lot of manual work.
From a technical standpoint, git history has to be rewritten. Every DVCS user has to accept those changes [1].
Legally speaking, the case is clear. But:
Is it feasible with the help of GitHub to enforce my right to privacy in many projects? Would published NPM modules be affected as well? (I expect to have only changed their documentation, not actual executable scripts. But exactly the documentation is often hyperlinked to at github from npm.)
It would require all public repositories to accept such a change of history, and perhaps even put in the work to bulk-remove the mail address?
EDIT:
Accepted answer: GitHub can change these Projects and all Forks to private. Works for me, but would hurt these open source projects as well.
The effort to auto-rewrite history (via a script/programming) seems to be out of scope for such an infrequent request.
TY. I do regret asking too broadly and not about recent historical examples.
[1]: What I do not expect is that every user will purge my email address from their private repositories. My problem is with the easy accessibility of my email address to web scrapers at a central location.
GitHub stores repositories, so from a purely technical standpoint, they are physically capable of editing the data to change it in any way, shape, or form. This is true of literally anybody who stores data on a standard storage medium.
However, because GitHub retains relatively few legal rights to host repositories, they won't modify repositories without the consent of the owner. If there's a legal challenge, they just disable the access to the repository; they don't edit it in any way. The issue of whether your data can and should be removed is left to the project maintainers. As a result, there's no tooling for GitHub to modify any repositories in any way outside of the normal permissions model, and sending any sort of request, legal or otherwise, won't be effective in getting your data actually removed.
I am of course not a lawyer and nothing I say is legal advice, and if you have questions about the law, you should contact an attorney licensed in your jurisdiction. However, you should note that projects that use the Developer's Certificate of Origin (such as Linux and Git) explicitly require you to assent to the recording of your personal information for the life of the project, and if you've signed off your commit, you've made a legal statement agreeing to that, which people may rely on in good faith. If you cannot make a binding statement to that effect in your jurisdiction, then as a consequence the only legal thing would be to refrain from contributing at all.
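On the technical point raised in the question (that history has to be rewritten and every clone has to accept the change), the following is a rough sketch of why that is so. A commit's ID is a hash over its contents, including the author line and the IDs of its parents, so changing an email address in one commit changes that commit's ID and the ID of every descendant. The sketch assumes the default SHA-1 object format and a simplified commit layout with identical author and committer; the names and addresses are hypothetical.

```python
import hashlib

# Well-known ID of Git's empty tree, used here so no real files are needed.
EMPTY_TREE = "4b825dc642cb6eb9a060e54bf8d69288fbee4904"


def commit_id(tree, parents, name, email, timestamp, message):
    """Hash a simplified commit object the way Git does (SHA-1 format)."""
    lines = ["tree " + tree]
    lines += ["parent " + p for p in parents]
    ident = "{} <{}> {} +0000".format(name, email, timestamp)
    lines += ["author " + ident, "committer " + ident, "", message]
    body = "\n".join(lines).encode("utf-8")
    header = b"commit " + str(len(body)).encode("ascii") + b"\0"
    return hashlib.sha1(header + body).hexdigest()


def demo():
    """Return (old_root, old_child, new_root, new_child) commit IDs."""
    # Original history: the root commit carries the address to be removed.
    old_root = commit_id(EMPTY_TREE, [], "A. Contributor",
                         "old@example.com", 1500000000, "Initial commit\n")
    old_child = commit_id(EMPTY_TREE, [old_root], "Maintainer",
                          "maint@example.com", 1500000100, "Follow-up\n")
    # Rewritten history: only the email in the root commit changes.
    new_root = commit_id(EMPTY_TREE, [], "A. Contributor",
                         "redacted@example.com", 1500000000, "Initial commit\n")
    # The child itself is untouched, but it must now point at the new root.
    new_child = commit_id(EMPTY_TREE, [new_root], "Maintainer",
                          "maint@example.com", 1500000100, "Follow-up\n")
    return old_root, old_child, new_root, new_child


if __name__ == "__main__":
    for label, value in zip(("old root", "old child", "new root", "new child"), demo()):
        print(label, value)
```

Every fork and clone still holding the old IDs would have to discard them and fetch the rewritten history, which is why such a change cannot be applied quietly in one central place.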

Game development with multiple people in Unity3D: How could we work on the same project simultaneously? [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed 7 years ago.
First of all, we're all beginners, so I am really sorry if this is a trivial question.
We're developing a game in Unity3D. We have two programmers, and one artist. We'd like to make our life easier by not just simply communicating via Facebook and sending our stuff back and forth. I know about GitHub, but I have a couple of problems with it.
1) It's not free for closed source projects - which would be ideal. Is there an alternative? Is this even the right kind of site to use?
2) Stupid reason, but I just can't comprehend how it works/how to use it. Is there an easy tutorial for it or something?
3) Is it even 'compatible' with Unity3D? Since I don't really know how GitHub works, this might also be a really stupid question.
First of all, you can use Bitbucket to host your stuff. It's like GitHub without the open source community. I'm using it on a similar project I'm working on with some guys. It's important you understand that Git is version control software developed by Linus Torvalds (creator of the Linux kernel). Git can be used to "commit" changes to a project. Then your other coder could grab those code files (scripts in Unity?) and load them into his project. It is kind of overwhelming to learn at first, but it gets easy once you get it. Really, learning to use Git is one of the best things you can do for yourself.
As far as using Git goes, I use Linux, so I can just run 'man git' to look at the commands and then use them in the shell. Mac uses bash, so it can probably be run right from the shell there too. Honestly, I don't know at all for Windows.
Here are a couple of resources:
https://try.github.io/levels/1/challenges/1
https://www.youtube.com/watch?v=TI3yVcSahzk
If I had more time I would look for a really good one for you, but I'm going to be late for work!
I have developed some Unity3D projects using GitHub before. So to answer question 3 and the last part of 1 first: yes, Unity projects use a file-system architecture that is perfectly compatible with GitHub, and once you're used to it, it is a great tool for team development.
Answer for question 1:
GitHub is just one brand of hosting service for version control (it hosts Git, which is a distributed version control system), and there are other brands out there with similar offerings, such as Bitbucket. Google "version control" for more info, and look into distributed version control in particular.
In all honesty though, if you're new to developing, the product you will be making will most likely not be of much interest to other people on GitHub, and your public repository will probably go unnoticed. If you believe that what you are creating is of such great value that it needs to be kept secret, then investing a few dollars a month in a premium service is recommended anyway.
Another option would be to set up a central Git repository on a server (or one of your home computers) that you or one of your project mates is running. This might be a more complicated method, but you would learn a lot of other useful things along the way.
Answer for question 2:
See https://guides.github.com/activities/hello-world/ for GitHub's intro tutorial. YouTube also has some decent offerings if you search for how to use GitHub.
It can be a little daunting to work with something new and attempt to understand the documentation. If you are planning on getting serious about development though, especially in a corporate setting, you need to learn Git and practice reading and understanding documentation.
Good Luck!
I recommend git for just about any text-based version control. If the files are binary heavy, it still works but it's not git's strength.
Until you get the central hosting worked out, you can use git bundle to share the changes offline.
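As a rough illustration of the git bundle suggestion, the sketch below packs a small repository into a single file and then clones from that file, the way a teammate might after receiving it by email or USB stick. It assumes the git command-line tool is installed and on the PATH; the file name Player.cs, the demo identity, and the directory names are placeholders invented for this example.

```python
import subprocess
from pathlib import Path


def git(*args, cwd):
    """Run git in cwd with a throwaway identity; raise if the command fails."""
    cmd = ["git", "-c", "user.name=Demo", "-c", "user.email=demo@example.com",
           "-c", "commit.gpgsign=false", *args]
    subprocess.run(cmd, cwd=cwd, check=True, capture_output=True)


def demo_bundle(workdir: Path) -> Path:
    """Create a repo, bundle it into one file, and clone from the bundle."""
    src = workdir / "project"
    src.mkdir()
    git("init", cwd=src)
    (src / "Player.cs").write_text("// placeholder script\n")
    git("add", "Player.cs", cwd=src)
    git("commit", "-m", "Add player script", cwd=src)

    # Pack all refs and their history into a single file that can be
    # shared offline.
    bundle = workdir / "project.bundle"
    git("bundle", "create", str(bundle), "--all", cwd=src)

    # On the receiving side, the bundle file behaves like a read-only remote.
    dest = workdir / "teammate-copy"
    git("clone", str(bundle), str(dest), cwd=workdir)
    return dest
```

Calling demo_bundle(Path(some_empty_directory)) leaves a full clone in teammate-copy. Later changes can be exchanged the same way by creating a new bundle and fetching from it.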

Should I commit all my computer science homework assignments to GitHub? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 9 years ago.
After reading a community wiki on Quora, I decided it would be good to start experimenting with GitHub. I thought, "What better way to experiment than with introductory computer science homework?" However, this practice opens up my solutions to the web, and I am concerned that other students might plagiarize them. I have read other questions on Stack Overflow about version control and homework.
Thus, a few questions come to mind as I consider this practice:
Does putting homework code on GitHub open it up to be copied?
Are people that plagiarize familiar with GitHub?
Should I be concerned?
Would plagiarism detection software scan GitHub?
Does putting homework code on GitHub open it up to be copied?
If you create a public repository, then yes. Private repositories cost money ($7/month for 5 private repositories), though, as pointed out by carols10cents, there is a free student version: https://github.com/edu
Are people that plagiarize familiar with GitHub?
Open source is all about sharing. That is kind of its point. Don't store things you want to keep private in a public place.
Should I be concerned?
For general homework no. Again, don't put essays and personal writing in a public repository. That would be similar to putting your essays on a public blog.
Would plagiarism detection software scan GitHub?
I don't know. Probably, eventually.
Git can be used without github. To really learn git, you do not need github or bitbucket or any other paid service. GitHub is just a public set of servers to store/share/backup your work on.
Git is great for tracking revisions. If you have ever used Google Docs (Google Drive) and looked at its history feature, you are probably familiar with how nice it is to be able to revisit changes and old versions of your work. Git formalizes this by allowing you to comment on your commits, branch your work into multiple versions, or just experiment without worrying about overwriting the original work.
Update
I read the Quora post and thought I might add this.
The very best thing that you can do to improve your skills is rent a server of your own from a vendor like Rackspace, Digital Ocean, or Linode to name just a few of the providers. These services can run as little as $5/month though $10-$20 a month is more typical. From there you will have to learn how to configure a Linux machine. You can install a git repository, mail servers, web servers, whatever you want, in a very low risk environment. Make a mistake and you can just reset the server to its virgin state. I recommend installing an Ubuntu distro because of its large community and relative ease of installing new software.
One of the problems with developers is that they often are too dependent on sysadmins for tasks that really should be part of their repertoire.
Does putting homework code on GitHub open it up to be copied?
It depends. If the repository is public, anyone can see it, and fork it. They may even send you pull requests! If the repository is private, on the other hand, it can only be seen by people that you allow. You need a paid subscription to create private repos.
Are people that plagiarize familiar with GitHub?
That's off-topic. But IMO, you should always suppose plagiarizers are familiar with everything.
Should I be concerned?
It's just homework. Why do you care? It's not like that's your doctoral thesis or your next patent material, is it?
Would plagiarism detection software scan GitHub?
I know there's software that does that with Wikipedia. I wouldn't be surprised if someone made that for GitHub. But usually such software checks whether you've copied something from well-known sites - if you are the author of the original content, you have nothing to worry about. If other people are plagiarizing you, it means you are good at what you're doing.
Last but not least: you might want to read about Creative Commons. Unless you really want to keep your work top secret, it's better to use a CC license than to lose a night's sleep over people copying your work.
1) Yes, unless you use a private account.
2) How could we know?
3) By publicizing your work, you're not doing anything wrong. Those who would cheat by copying your work and pretending the work is theirs would be the bad guys. Now, if your teacher receives two identical homework submissions, you'll have to prove your innocence, which might not be so easy.
4) I guess so.
My advice:
experiment by opening a private account that only you have access to, or
experiment with Git (which is what matters, more than GitHub) by installing your own Git server on one of your own machines.

Lightweight versioning system for standalone development [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Questions asking us to recommend or find a tool, library or favorite off-site resource are off-topic for Stack Overflow as they tend to attract opinionated answers and spam. Instead, describe the problem and what has been done so far to solve it.
Closed 9 years ago.
I develop a lot of prototypes while I am trying out stuff. I wish I had a lightweight versioning system that would keep backups of these and make it easy for me to find them next time. It would also help me keep track of all the various techniques I have tried for solving a particular problem.
I would like to know your suggestions on the right tool for this job.
Update: A simple Google search would have given me the names of all the version control apps, and Git would be my preferred choice. But I would like to know which would be the lightest app for the job and why. I don't want a single repo to take GBs of space.
BitBucket + Mercurial is a good combination.
Bitbucket is a web-based hosting service for projects that use the Mercurial revision control system. Bitbucket offers both commercial plans and free accounts. Unusually - and possibly uniquely - for a project hosting service, as of September 2010, it offers free accounts with unlimited numbers of private repositories (which can have up to five users in the case of free accounts).
Git would be the most efficient in handling space, as that's their claim. Check the link below:
https://git.wiki.kernel.org/index.php/GitBenchmarks#Git.2C_Mercurial.2C_Bazaar_repository_size_benchmark
A DVCS such as git or Mercurial will let you create a repo directly in the project directory that you can use to track and manipulate changes.
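A minimal sketch of what "a repo directly in the project directory" can look like for throwaway prototypes, using Git as the example DVCS. It assumes the git command-line tool is installed; the helper names snapshot and history, and the demo identity, are made up for this illustration.

```python
import subprocess
from pathlib import Path


def git(*args, cwd):
    """Run git in cwd with a throwaway identity and return its stdout."""
    cmd = ["git", "-c", "user.name=Demo", "-c", "user.email=demo@example.com",
           "-c", "commit.gpgsign=false", *args]
    result = subprocess.run(cmd, cwd=cwd, check=True,
                            capture_output=True, text=True)
    return result.stdout


def snapshot(project_dir: Path, message: str) -> None:
    """Record the current state of project_dir, creating the repo on first use."""
    if not (project_dir / ".git").exists():
        # The whole repository lives in a .git folder inside the project;
        # no server or separate database is involved.
        git("init", cwd=project_dir)
    git("add", "-A", cwd=project_dir)
    git("commit", "-m", message, cwd=project_dir)


def history(project_dir: Path) -> list:
    """Return commit messages, newest first."""
    return git("log", "--format=%s", cwd=project_dir).splitlines()
```

Calling snapshot(Path("my-prototype"), "Try approach A") after each experiment builds up a searchable history, and deleting the .git folder removes the versioning again without touching the files. Note that git commit fails when nothing has changed since the last snapshot.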
Go for a DVCS like Git or Mercurial. I use the TortoiseHg version of Mercurial, and I have found it very easy to set up and to use for personal use. As @Ignacio says, you can set up the repository in your project directory. You can also set up consolidation repos to manage across projects, and to keep track of multiple different approaches on different projects. Setting up a new repository and populating it takes less than a minute. The learning time for this system was minimal too.
I would recommend Mercurial.
It is quick to setup for each working copy/repository, and doesn't require a separate server.
I do software development mainly on the Windows platform. Thus I use VisualSVN to help me keep my repositories on my home network storage (or on the local computer storage) acting as the server, and I also use TortoiseSVN to access local SVN - acting as the client.
This allows me to work on small local projects that are easy to maintain.

Collaborative Code Editing [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed 8 years ago.
I work for a small web development company (6 people) and we've been in the market for a new code editor/development environment for quite some time.
Currently, we're using Dreamweaver's (CS3) coding side for our site development. Each site's files are hosted on a Dreamhost FTP server. All 6 of us work on the same set of live files on the remote FTP server. Dreamweaver has a handy file locking functionality that prevents us from overwriting each other's changes by keeping us out of the same files.
Now, we've found that this form of development allows for very rapid development and love how easy it is to get things done. However there are many things we don't like. One of which is Dreamweaver's code editor. We also don't like our lack of code history for each site.
Does anyone know of a good alternative to Dreamweaver that has similar file locking/ftp functionality?
If not, could you explain to me the best configuration of a source control system for our team? We're willing to look at Git, Mercurial, and Subversion. The new system would ideally:
1). Support multiple different code editors on different operating systems. (Windows 1st choice.)
2). Be almost as easy and quick to push out code as currently.
3). Allow for working on the files outside of the office network.
4). Be inexpensive.
I'm probably just showing my ignorance of how to use a version control system, but it doesn't seem logical for each of us to have a testing server on our computers with every single site set up with our own test database... That's very time consuming.
What's your solution to our problem? I think we'll either have to upgrade to the latest version of Dreamweaver and stick with it forever, or we'll have to find some sort of ftp collaborative editor, or we'll have to implement version control.
Do the benefits of version control outweigh the extra amount of time it entails to push out code?
"it doesn't seem logical for each of us to have a testing server on our computers with every single site setup with our own test database... That's very time consuming"
That's generally the way to do it. Most modern frameworks will let you set up your development server in minutes, if not seconds -- using an embedded http server and database, for example. If you are stuck on an ancient platform, there are solutions like wamp that are only a little more difficult. Remember, that it's time that you spend once, but it lets you be faster. If the project is going to take any longer than a few hours, it should be beneficial. You don't waste time on debugging things your fellow developer just changed, or recovering production data from that silly database manipulation mistake you just made.
(Oh, and if your websites are just HTML+JavaScript, then you don't need any server locally, obviously.)
As for version control systems, the ones you mentioned are fine, with SVN requiring a little more setup and network access to the central server for commits. Git and Mercurial let you work and commit offline, and then push your changes to the central server or even just exchange them between developers. I think Mercurial works better on Windows at the moment.
Michael, I hear your pain.
I can't claim to have fully researched all avenues, but I have really begun to love Git recently.
My first hurdle was learning how revision control systems (RCS) work. Before I would pick SVN vs Git vs Hg vs Bazaar vs etc., I evaluated what I wanted to do: work locally, then share my work and push to a web server.
I found this great comparison website: http://whygitisbetterthanx.com
From that I could clearly see that Git was worth the time to learn. As the backwards learner I am I dove into a project and learned how quickly things could become messy, then I began reading: http://gitready.com/ and http://book.git-scm.com/ and http://progit.org/book/
Then I realized I needed an organizational strategy. I went searching and found something I (and a lot of others) liked: http://nvie.com/posts/a-successful-git-branching-model/
This is also a great resource:
http://danielmiessler.com/study/git/
There's a bit of a primer. Let me try to answer your questions more directly.
1.) Git is a command-line tool. For Windows there's Cygwin.
I found the documentation at github to be the best. Even if you don't plan on using them for code hosting. Have a look at http://help.github.com/ Use the setup git link to get started.
2.) Since you ask for versioning, there is a bit more work. It's a different model, a different way of thinking. Rather than being locked out of a file someone else is editing, which is what happens currently, your commits might collide, and in that case Git provides great diff tools to help resolve the conflict.
3.) Git is what's called a DVCS, or distributed version control system. Here's an example:
Let's say you need to do some work over the weekend. You do a git pull from the server before you leave work. At home you can continue to work, create new branches, etc. Then, when you have an internet connection, you can push your changes back to the server.
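The weekend scenario can be sketched end to end on one machine, with a bare repository in a local folder standing in for the office server. This is only an illustration under those assumptions: it needs the git command-line tool, and the directory names, file name, and demo identity are invented here. A fresh clone takes the place of the git pull done before leaving work.

```python
import subprocess
from pathlib import Path


def git(*args, cwd):
    """Run git in cwd with a throwaway identity and return its stdout."""
    cmd = ["git", "-c", "user.name=Demo", "-c", "user.email=demo@example.com",
           "-c", "commit.gpgsign=false", *args]
    result = subprocess.run(cmd, cwd=cwd, check=True,
                            capture_output=True, text=True)
    return result.stdout


def weekend_workflow(root: Path) -> int:
    """Clone from a stand-in server, commit offline, push back.

    Returns the number of commits the server holds afterwards.
    """
    # A bare repository (history only, no working files) plays the server.
    server = root / "server.git"
    git("init", "--bare", str(server), cwd=root)

    # Friday: get a full copy of the repository onto the laptop.
    home = root / "home"
    git("clone", str(server), str(home), cwd=root)

    # Weekend: commits are recorded locally, no connection required.
    (home / "index.html").write_text("<h1>Weekend work</h1>\n")
    git("add", "index.html", cwd=home)
    git("commit", "-m", "Work done offline", cwd=home)

    # Monday: publish the locally recorded commits to the server.
    git("push", "origin", "HEAD", cwd=home)
    return int(git("rev-list", "--all", "--count", cwd=server).strip())
```

With a real server, the only difference is the URL passed to git clone; the commit step in the middle never touches the network.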
4.) Git is free!
As for pushing your work to the web server, you'll need to set up something like this:
http://toroid.org/ams/git-website-howto
Looks pretty easy, I'm gonna try it out next weekend.
I hope you find some of what I wrote helpful, if not maybe the links are.