Hosting Open Source code with a restrictive license on GitHub


I want to know if it is allowed to host software on GitHub that is open sourced (in the sense that anyone can see the source) but with a license that does not allow redistribution (vanilla or modified), while still allowing users to do as they wish with the code without redistributing it.

Definitely. You can use whatever license you want.

Related

Eclipse Public License v1.0

I would like to use code that is under the EPL v1.0 license in a commercial project. As far as I know I can do so, but the problem is that I need to make some changes to this code.
Thus, I have two questions:
Can I change the EPL code and then use it in a commercial project without any restrictions?
If I am allowed to do so, should I remove the copyright notices in the files I changed, or should I add some additional notices?

Read Paragraph 4 in the official licence text.
You may use it for commercial products, but doing so must not create any liability for the other (previous open source) contributors. In particular, you alone are responsible if any problems occur.

How should I add licence information to maven project and its source files

I have several (Java) projects under maven control, developed in Eclipse, repo under Mercurial/bitbucket that I licence under Apache2 (though this question applies to any licences). What is the best way to licence this?
I have included a verbatim copy of the (Apache) LICENCE.txt in the top directory of the project. However there is no licence in any of the source files so that if they are re-used in other projects (as I hope they can be) they may get separated from the licence info. [Source files can be configuration/data as well as code and are not Java-specific]. If there are any changes to the licence then all these files will have to be edited. Possible approaches are:
use a brief sentence to refer back to LICENSE.txt
use a Maven licence tool if there is one?
use an Eclipse licence tool if there is one?
use a Bitbucket licence tool if it has one?
[I am on Windows so I don't want a sed/awk/grep approach]
UPDATE - have accepted @Nicmancol's answer, as the first answer given worked for me
UPDATE2 - Hmm. It has added a licence to all sorts of files in the distrib. Not such a good idea

You can use the Maven License Plugin or the License Maven Plugin

There are Eclipse plugins for adding / maintaining copyright notices in source file headers; e.g. see this SO question: How to manage license banners in source files of Eclipse plug-in projects. (The answers are more general than the question ...)
With a Maven project you can / should also add license details to the POM file.
From a purely legal perspective, it probably doesn't matter if a file gets separated from the "bundle" containing the copyright notice. Copyright applies irrespective of whether there is a copyright notice.
I agree that copyright applies irrespective, but authorship and licenses do not. So in an area where software is likely to be re-used we need to give the re-users that information.
Both authorship and licensing also apply irrespective of whether this is stated in each file.
Authorship is simply a fact: "Richard Stallman wrote Emacs" remains true even if someone strips the source headers. But knowing who the author of some piece of software is has no bearing on how someone else may use it, so it probably isn't of much relevance.
Licenses derive from copyright, and the default license is as set out in the relevant copyright law. That is, the default is that you do NOT have the right to make a copy, or have a copy that was made illegally.
If a file becomes separated from the license information, then it is up to the user of the file to deal with the problem; i.e. HE needs to find out what the license is. Because, the default is that he has no license.
Basically, if the copyright and/or license are unclear, the obligation is on the copier to find out what the copyright / license status is ... not the copyright owner / licensor. And that is as it should be. It is not possible for the copyright owner / licensor to PREVENT the information from BECOMING separated, and penalizing the copyright holder / licensor for something (illegal) that someone else did to achieve that separation would be manifestly unfair.

Automatically managing license/author/version header in source files

It is generally considered good practice to add some lines with author, version and license information to the top of source files. For instance, the GNU GPL v3 suggests adding
<one line to give the program's name and a brief idea of what it does.>
Copyright (C) <year> <name of author>
This program is free software: you can redistribute it and/or modify
it under the terms [SNIP]
I find it tedious to add it manually to each file, and to have to update them all every now and then when some of this information changes (new authors, copyright years, version bumps).
Is there a way to manage this automatically, so that I only have to edit this stuff in one place and it gets automagically copied around?
If needed, you may assume that I am using any modern revision control system.

It is generally considered good practice to add some lines with author, version and license information to the top of source files.
That depends. First of all there are two (and more) ways to do this:
manage licensing information per file
manage licensing information in a central location
If you start a project from scratch, the per-file method is often easy to do while keeping things clear. As you write, over time it becomes more difficult to keep track of things. So more and more projects switch to the central location variant.
The file-by-file method has the benefit that the scope of a work is clear. Often you write the name of the application in the file-comment. If a single file is taken out for some reason, the information is still in there and the documentation chain is not broken.
With the central location method, the benefit is that this is normally supported by your version control software, for example Git. Commits can be signed by the committing person, and an author can be given. Who has written which code is documented automatically, and that information is stored in a central location: the VCS.
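As a rough sketch of how the authorship recorded by the VCS could be turned into an AUTHORS file, assuming Git and its `git shortlog -sne` summary output (commit count, then name and email), something like the following might do. The function names are illustrative and not taken from any of the tools mentioned here.

```python
import re
import subprocess

# One summary line looks like: "    12<TAB>Alice Example <alice@example.com>"
_LINE_RE = re.compile(r"^\s*(\d+)\s+(.*?)\s*<([^<>]*)>\s*$")


def parse_shortlog(text):
    """Parse `git shortlog -sne` output into (commits, name, email) tuples."""
    authors = []
    for line in text.splitlines():
        match = _LINE_RE.match(line)
        if match:
            authors.append((int(match.group(1)), match.group(2), match.group(3)))
    return authors


def format_authors(authors):
    """Render one 'Name <email>' line per author, in the order given."""
    return "".join(f"{name} <{email}>\n" for _, name, email in authors)


def authors_from_repo(repo_dir="."):
    """Ask Git for the author summary of HEAD (requires git on the PATH)."""
    result = subprocess.run(
        ["git", "shortlog", "-sne", "HEAD"],
        cwd=repo_dir, check=True, capture_output=True, text=True,
    )
    return parse_shortlog(result.stdout)
```

Writing `format_authors(authors_from_repo())` to an AUTHORS file as part of a release step would keep the list in sync without editing it by hand.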
Keep a COPYING file with your package where you provide the main information centrally. You can easily generate the list of authors via the VCS. And per each file you can create one header that just specifies which software and where to look into, just a bare outline:
/**
* Flux Deluxe v3.2.0 - Vector Drawing Redefined
*
* Copyright 2010, 2012 by its authors.
* Some rights reserved. See COPYING, AUTHORS.
*/
If you release a new version in a new year it's a no-brainer to update all files.
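To illustrate why updating such a bare header is cheap, here is a minimal sketch of a script that keeps the header text in one place and rewrites it across a source tree. It assumes the header is a leading `/** ... */` comment containing the word "Copyright"; the template fields, function names and file suffixes are hypothetical choices for this example, not part of any existing tool.

```python
import re
from pathlib import Path

# Hypothetical single source of truth: edit the values passed to
# render_header() in one place and re-run the script.
HEADER_TEMPLATE = """\
/**
 * {name} v{version} - {tagline}
 *
 * Copyright {years} by its authors.
 * Some rights reserved. See COPYING, AUTHORS.
 */
"""

# A block comment at the very start of the file.
_HEADER_RE = re.compile(r"\A/\*\*.*?\*/\n?", re.DOTALL)


def render_header(name, version, tagline, years):
    return HEADER_TEMPLATE.format(
        name=name, version=version, tagline=tagline, years=years
    )


def apply_header(source, header):
    """Replace an existing leading copyright header, or prepend one."""
    match = _HEADER_RE.match(source)
    if match and "Copyright" in match.group(0):
        return header + source[match.end():]
    return header + source


def update_tree(root, header, suffixes=(".java", ".js", ".c")):
    """Rewrite matching files under root; return the paths that changed."""
    changed = []
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix in suffixes:
            old = path.read_text(encoding="utf-8")
            new = apply_header(old, header)
            if new != old:
                path.write_text(new, encoding="utf-8")
                changed.append(path)
    return changed
```

Because `apply_header` replaces an existing header rather than stacking a second one, the script can be re-run safely after every version bump. Restricting it to known source suffixes avoids stamping a header onto every file in the distribution.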

Use the License Header Manager

If working with Visual Studio, you could use macros and attach a shortcut to them.
Then, when creating a new file, use the shortcut to add a header.
If you want to be sure that a header has been included in each file, you can use StyleCop.
Following links might be helpful:
http://abhijitjana.net/2010/12/05/add-document-header-for-files-automatically-in-visual-studio/
http://stylecop.codeplex.com/
In Eclipse, there is also macro support so you should be able to do the same as suggested for VS. However, I do not have any experience with that.
For Java, Checkstyle is an alternative to StyleCop:
https://checkstyle.sourceforge.io/
I haven't heard of any SVN tools that modify the files themselves.
Using macros in your editor is the closest thing to what you want.

Documentation and version control

Given a project I'm about to start there will be documentation produced.
What is the best practice for this?
Should the documents live with the code and assets or should there be a separate documentation store?
Edit
I'd like a wiki but I will need to print the documents etc... It's a university project.

It really depends on your team. Where I work, we keep documentation in a wiki which is linked in with our team website. For the purposes of shipping documentation, the wiki can be exported and we run it through a parser that "fancifies" the look and feel of the documentation for customer purposes.
Storing the documentation with the code (typically in your source repository) is not a bad idea. Just make sure to keep them separated. For example, keep a docs folder which is on the same level with your src folder in your repository. This way, you can quickly ship the current documentation, you can easily track revisions, and anybody new to the project can immediately jump in without having to go to multiple locations for information.

Storing it in source control is fine.

This is an interesting question. Basically, what others are saying about generated documentation is right: source files, templates, etc. should be stored in source control, and the documentation should be generated during your build process.
As far as requirements/specs/etc. documentation, I have worked both ways, and I very much prefer using SharePoint or a Wiki/document portal that is designed for document sharing/versioning. The reason is, most non-developer folks aren't comfortable working with source control systems, and you don't gain any of the advantages of intelligent merging if you are using a binary format like Word. Plus it's nice to have internet-based access so you can reference and work on the docs in a distributed team without people having to install extra software.

Here's a 2017 summary of the options and my experience:
(extreme 1) Completely external (e.g. a wiki, Google Docs, LaTeX, MS Word, MS Onedrive)
People aren't bothered about keeping it up to date (half of them don't even know where to find the page that needs updating since it's so out of the trenches).
wiki platforms are “captive user interfaces” - your data gets stored in their proprietary schemas and is not easy to examine with a simple text editor (Confluence is even worse in that you have no access to the plaintext content at all anymore)
(extreme 2) Completely internal (e.g. javadoc)
pollutes the source code, and is usually too low level to be of any use. Well-written source code is still the best form of low level documentation.
However, I feel package-info.java files are underutilized.
(balance) Colocated documentation (e.g. README.md)
A good half way solution, with the benefits of version control. If a single README.md file is not enough, consider a doc/ folder. The only drawback of this I've seen is whether to source control helpful graphics (e.g. png files) and risk bloating the repo.
One interesting way to avoid this problem is to use plaintext diagram tools (I find Grapheasy and Text Diagram to be a breath of fresh air).
plaintext can be easily read even if your rendering engine changes as the years go by.
GitHub's success is in no small part thanks to its README.md located in the root of the project.
One tiny disadvantage of this approach though is that your continuous integration system will trigger a new build each time you make edits to the README.md file.
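One way to soften this, sketched below under the assumption that your CI system can hand a script the list of paths changed by a commit, is to skip the build when only documentation was touched. The suffixes and directory names are illustrative guesses and would need adapting to the project layout.

```python
from pathlib import PurePosixPath

# Assumed conventions: Markdown/reST files and anything under doc/ or docs/
# count as documentation. Adjust to the project's real layout.
DOC_SUFFIXES = {".md", ".rst"}
DOC_DIRS = {"doc", "docs"}


def is_doc_path(path):
    """True if the path looks like documentation rather than buildable source."""
    p = PurePosixPath(path)
    in_doc_dir = len(p.parts) > 1 and p.parts[0] in DOC_DIRS
    return p.suffix.lower() in DOC_SUFFIXES or in_doc_dir


def needs_build(changed_paths):
    """True if at least one changed path is not documentation."""
    return any(not is_doc_path(p) for p in changed_paths)
```

Many CI systems offer path filters that do the same thing declaratively; the function only shows the decision being made.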

If you are writing versioned user documentation associated with each release of the product, then it makes sense to put the documentation in source control along with its associated product release.
If you are writing internal developer documentation, use automated internal source code documentation (javadoc, doxygen, .net annotations, etc) for source level documentation and a project wiki for design level documentation.

I think most of us in the industry are not really following best-practices and it of course also depends a lot on your situation.
In an agile environment where you would have a very iterative process of release, you will want to "travel light". In this particular case, Jason's suggestion of a separate Wiki really works great.
In a water-fall/big bang model, you will have a better opportunity to have a decent documentation update with each new release. Also you will need to clearly document what version of the requirements was agreed on and have loads of documentation for every tiny change you do to requirements (due to the effects it has on subsequent stages). Often if the documentation can live together with the version controlled source code it is the best.

Are you using any sort of auto-documentation or is it completely manual? Assuming that you are using an auto-documentation system, the documentation is more or less generated on the fly, and would be part of the code itself.
To me, (assuming that it's possible with whatever code you are using), this would be the preferred method of handling it, as you wouldn't need to maintain the documentation source at all.

Storing third-party libraries in source control

Should libraries that the application relies on be stored in source control? One part of me says they should and another part says no. It feels wrong to add a 20 MB library that dwarfs the entire app just because you rely on a couple of functions from it (albeit rather heavily). Should you just store the jar/dll or maybe even the distributed zip/tar of the project?
What do other people do?

Store everything you will need to build the project 10 years from now. I store the entire zip distribution of any library, just in case.
Edit for 2017:
This answer did not age well :-). If you are still using something old like Ant or make, the above still applies. If you use something more modern with dependency management, like Maven or Gradle (or NuGet on .NET, for example), you should be running a dependency management server in addition to your version control server. As long as you have good backups of both, and your dependency management server does not delete old dependencies, you should be OK. Examples of dependency management servers include Sonatype Nexus and JFrog Artifactory, among many others.

As well as having third party libraries in your repository, it's worth doing it in a way that makes it easy to track and merge in future updates to the library (for example, security fixes). If you are using Subversion, using a proper vendor branch is worthwhile.
If you know that it'd be a cold day in hell before you'll be modifying your third party's code then (as @Matt Sheppard said) an external makes sense and gives you the added benefit that it becomes very easy to switch up to the latest version of the library should security updates or a must-have new feature make that desirable.
Also, you can skip externals when updating your code base saving on the long slow load process should you need to.
@Stu Thompson mentions storing documentation etc. in source control. In bigger projects I've stored our entire "clients" folder in source control including invoices / bills / meeting minutes / technical specifications etc. The whole shooting match. Although, ahem, do remember to store these in a SEPARATE repository from the one you'll be making available to: other developers; the client; your "browser source view"...cough... :)

Don't store the libraries; they're not strictly speaking part of your project and uselessly take up room in your revision control system. Do, however, use Maven (or Ivy for Ant builds) to keep track of what versions of external libraries your project uses. You should run a mirror of the repo within your organisation (that is backed up) to ensure you always have the dependencies under your control. This ought to give you the best of both worlds; external jars outside your project, but still reliably available and centrally accessible.

We store the libraries in source control because we want to be able to build a project by simply checking out the source code and running the build script. If you aren't able to get latest and build in one step then you're only going to run into problems later on.

Never store your 3rd party binaries in source control. Source control systems are platforms that support concurrent file sharing, parallel work, merging efforts, and change history. Source control is not an FTP site for binaries. 3rd party assemblies are NOT source code; they change maybe twice per SDLC. The desire to be able to wipe your workspace clean, pull everything down from source control and build does not mean 3rd party assemblies need to be stuck in source control. You can use build scripts to control pulling 3rd party assemblies from a distribution server. If you are worried about controlling what branch/version of your application uses a particular 3rd party component, then you can control that through build scripts as well. People have mentioned Maven for Java, and you can do something similar with MSBuild for .NET.

I generally store them in the repository, but I do sympathise with your desire to keep the size down.
If you don't store them in the repository, they absolutely do need to be archived and versioned somehow, and your build system needs to know how to get them. Lots of people in the Java world seem to use Maven for fetching dependencies automatically, but I've not used it, so I can't really recommend for or against it.
One good option might be to keep a separate repository of third party systems. If you're on Subversion, you could then use Subversion's externals support to automatically check out the libraries from the other repository. Otherwise, I'd suggest keeping an internal anonymous FTP (or similar) server which your build system can automatically fetch requirements from. Obviously you'll want to make sure you keep all the old versions of libraries, and have everything there backed up along with your repository.
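A minimal sketch of the "build system fetches requirements from an internal server" idea follows. It assumes the server exposes artifacts under a `<name>/<version>/<name>-<version>.<ext>` layout, which is a convention made up for this example; `artifact_url`, `fetch` and the host name in the usage note are hypothetical too. Keeping the version in the path is what lets old releases stay retrievable.

```python
import shutil
import urllib.request
from pathlib import Path


def artifact_url(base_url, name, version, ext="jar"):
    """Build a versioned URL so that old releases stay addressable."""
    return f"{base_url.rstrip('/')}/{name}/{version}/{name}-{version}.{ext}"


def fetch(base_url, name, version, cache_dir, ext="jar"):
    """Download one pinned artifact into cache_dir unless it is already there."""
    target = Path(cache_dir) / f"{name}-{version}.{ext}"
    if target.exists():
        return target
    target.parent.mkdir(parents=True, exist_ok=True)
    # Download to a temporary name first so an interrupted transfer
    # never leaves a truncated file that looks like a valid cache entry.
    partial = target.with_name(target.name + ".part")
    url = artifact_url(base_url, name, version, ext)
    with urllib.request.urlopen(url) as response, open(partial, "wb") as out:
        shutil.copyfileobj(response, out)
    partial.replace(target)
    return target
```

A build script would then call something like `fetch("http://deps.internal.example/repo", "some-lib", "1.4.2", "lib")` for each pinned dependency before compiling.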

What I have is an intranet Maven-like repository where all 3rd party libraries are stored (not only the libraries, but their respective source distributions with documentation, Javadoc and everything). The reasons are the following:
why store files that don't change in a system specifically designed to manage files that change?
it dramatically speeds up check-outs
each time I see "something.jar" stored under source control I ask "and which version is it?"
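Wherever the jars end up living, the "which version is it?" question in the last point can be made answerable by recording a version and checksum for each one. The sketch below assumes a `<name>-<version>.jar` naming convention and flags files that do not follow it. The function names and the manifest fields are illustrative only.

```python
import hashlib
import re
from pathlib import Path

# Assumed naming convention: <name>-<version>.jar, e.g. commons-io-2.11.0.jar
_JAR_RE = re.compile(r"^(?P<name>.+?)-(?P<version>\d[\w.\-]*)\.jar$")


def describe(jar_path):
    """Return name, version (None if not encoded in the file name) and SHA-256."""
    path = Path(jar_path)
    match = _JAR_RE.match(path.name)
    return {
        "file": path.name,
        "name": match.group("name") if match else path.stem,
        "version": match.group("version") if match else None,
        "sha256": hashlib.sha256(path.read_bytes()).hexdigest(),
    }


def build_manifest(lib_dir):
    """Describe every jar directly inside lib_dir, sorted by file name."""
    return [describe(p) for p in sorted(Path(lib_dir).glob("*.jar"))]


def unversioned(manifest):
    """List the files whose version cannot be read from their name."""
    return [entry["file"] for entry in manifest if entry["version"] is None]
```

Checking such a manifest into source control, even when the binaries themselves live elsewhere, gives each revision a record of exactly which library builds it was built against.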

I put everything except the JDK and IDE in source control.
Tony's philosophy is sound. Don't forget database creation scripts and data structure update scripts. Before wikis came out, I used to even store our documentation in source control.

My preference is to store third party libraries in a dependency repository (Artifactory with Maven for example) rather than keeping them in Subversion.
Since third party libraries aren't managed or versioned like source code, it doesn't make a lot of sense to intermingle them. Remote developers also appreciate not having to download large libraries over a slow VPN link when they can get them more easily from any number of public repositories.

At a previous employer we stored everything necessary to build the application(s) in source control. Spinning up a new build machine was a matter of syncing with the source control and installing the necessary software.

Store third party libraries in source control so they are available if you check your code out to a new development environment. Any "includes" or build commands that you may have in build scripts should also reference these "local" copies.
As well as ensuring that third party code or libraries that you depend on are always available to you, it should also mean that code is (almost) ready to build on a fresh PC or user account when new developers join the team.

Store the libraries! The repository should be a snapshot of what is required to build a project at any moment in time. As the project requires different versions of external libraries, you will want to update / check in the newer versions of these libraries. That way you will be able to get all the right versions to go with an old snapshot if you have to patch an older release etc.

Personally I have a dependencies folder as part of my projects and store referenced libraries in there.
I find this makes life easier as I work on a number of different projects, often with inter-depending parts that need the same version of a library, meaning it's not always feasible to update to the latest version of a given library.
Having all dependencies used at compile time for each project means that a few years down the line when things have moved on, I can still build any part of a project without worrying about breaking other parts. Upgrading to a new version of a library is simply a case of replacing the file and rebuilding related components, not too difficult to manage if need be.
Having said that, I find most of the libraries I reference are relatively small weighing in at around a few hundred kb, rarely bigger, which makes it less of an issue for me to just stick them in source control.

Use git subprojects, and either reference from the 3rd party library's main git repository, or (if it doesn't have one) create a new git repository for each required library. There's no reason why you're limited to just one git repository, and I don't recommend you use somebody else's project as merely a directory in your own.

Store everything you'll need to build the project, so you can check it out and build without doing anything.
(and, as someone who has experienced the pain - please keep a copy of everything needed to get the controls installed and working on a dev platform. I once got a project that could build - but without an installation file and reg keys, you couldn't make any alterations to the third-party control layout. That was a fun rewrite)

You have to store everything you need in order to build the project.
Furthermore different versions of your code may have different dependencies on 3rd parties.
You'll want to branch your code into maintenance version together with its 3rd party dependencies...

Personally what I have done and have so far liked the results is store libraries in a separate repository and then link to each library that I need in my other repositories through the use of the Subversion svn:externals feature. This works nicely because I can keep versioned copies of most of our libraries (mainly managed .NET assemblies) in source control without them bulking up the size of our main source code repository at all. Having the assemblies stored in the repository in this fashion makes it so that the build server doesn't have to have them installed to make a build. I will say that getting a build to succeed in the absence of Visual Studio being installed was quite a chore but now that we got it working we are happy with it.
Note that we don't currently use many commercial third-party control suites or that sort of thing much so we haven't run into licensing issues where it may be required to actually install an SDK on the build server but I can see where that could easily become a problem. Unfortunately I don't have a solution for that and will plan on addressing it when I first run into it.
