GitHub: uploading existing libraries

Should I re-upload an existing library to my GitHub repo if my code uses it? Or should I only reference the library?
I have some Python programs that use the Yowsup library, which is already on GitHub. Should I upload my copy of this library with my code in order to make my code easier to understand, or should I just tell people to download Yowsup from its own GitHub page?
Thanks!

Maintain your dependencies using a dependency manager.
For Python code, this usually means using pip to maintain a requirements file:
pip install yowsup
pip freeze > requirements.txt
Commit the requirements.txt file to your repository. Don't commit the yowsup code itself.
Now other users can clone your repository and install all of your project's dependencies using
pip install -r requirements.txt
Generally you will want to do this inside a virtual environment, which in the Python world usually means using virtualenv (and optionally virtualenvwrapper).
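For example, a fresh clone of such a project might be set up roughly like this (the environment name venv is just a placeholder; python -m venv is the standard-library alternative to the virtualenv tool):
python -m venv venv              # or: virtualenv venv
source venv/bin/activate         # on Windows: venv\Scripts\activate
pip install -r requirements.txt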
Many other languages have similar tools, so you can apply the same general technique.

Where to place custom packages that are being developed in github?

I want to get started developing my own packages. I am also adding version control via Github. I mainly develop on my Mac and a Windows laptop, but there is potential for me to develop on other machines down the line. My IDE of choice is PyCharm. I need to figure out where to place my packages both on Github and on my local machines so that my packages are always in sync regardless of where I am developing. Help??
First, let's clarify that git is the version control system and GitHub is a platform for hosting git repositories (there are many other platforms besides GitHub). You use git commands to manage your code, and GitHub is where you store a copy of it.
By adding version control and putting a copy on GitHub, you've already taken the first step towards managing your code on different machines. All you need to do is make sure the copy on GitHub is always the latest, maintained version.
Here's a sample workflow:
On machine 1 (Mac), clone a copy of the GitHub repo
Develop on machine 1
When you are satisfied with your changes, push your code from machine 1 to GitHub
On machine 2 (Windows), clone a copy of the GitHub repo
Develop on machine 2
When you are satisfied with your changes, push your code from machine 2 to GitHub
On machine 1, do a fetch to check for updates to the code
If there are updates, pull those changes to machine 1
Again, when done making changes, push them from machine 1 to GitHub
On machine 2 again, fetch and pull changes
Repeat this fetch-pull-push cycle on all machines
Basically, whichever machine you are on, when you are done working you should always push your changes to the remote (GitHub), so that the other machines can fetch and pull those changes and continue where you left off.
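In plain git commands, one round of that cycle looks roughly like this (the URL, branch name, and commit message are placeholders):
git clone https://github.com/<username>/<repo>.git   # once per machine
git add .
git commit -m "Describe your changes"
git push origin main                                 # or master, whichever your default branch is
# later, on another machine:
git fetch origin
git pull origin main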
UPDATE (based on comment):
Once you've got the workflow for your package's source code, the next step is to package it like any other regular Python package and install it into your site-packages (either directly for your system or, preferably, in a virtual environment).
I recommend taking a look at the Python docs on Packaging Python Projects, which uses setuptools to make your package compatible with pip.
Here's a sample workflow:
git clone <mypackage@github.com>  # or git pull if you already cloned it before
cd mypackage
pip install -r requirements.txt
pip install -e .  # or: pip install --user -e .
That last step will install your package into your site-packages folder, like any other pip-compatible package (assuming you've set up your setup.py file properly). If you are using virtual environments, you'll have to activate the virtual env first, then install your package there.
If you are not going to make any modifications to the source code and just want to install the package on a specific machine, you can also give pip the GitHub URL directly:
pip install -e git+https://git.repo/some_pkg.git#egg=SomeProject # from git
Lastly, if you are planning to upload this package to PyPI, check out the docs on Uploading the distribution archives. This just adds an extra step to your workflow: uploading your package to PyPI and then doing pip install from there next time.
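As a rough sketch of that extra step, following the packaging docs (this assumes you install the build and twine tools and have a working setup.py or pyproject.toml):
python -m pip install --upgrade build twine
python -m build                  # writes an sdist and a wheel into dist/
twine upload dist/*              # prompts for your PyPI credentials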

Can a specific file from a specific github repo tree be installed with pip?

I need to install tensorflow_backend.py from a specific tree in keras, ideally without updating the other keras files. This shows how to install a specific repo branch, but how can the same be done for this single file in a tree?
No. pip installs Python packages. You need a different tool to download just one file. For a remote git repository git archive is a good tool. For a remote web interface use curl or wget.
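For example, GitHub serves raw file contents over HTTP, so a single file can be fetched with curl (user, repo, branch, and path below are placeholders):
curl -o tensorflow_backend.py https://raw.githubusercontent.com/<user>/<repo>/<branch>/<path>/tensorflow_backend.py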

npm install and build of forked github repo

I'm using a module for my angular app called angular-translate. However, I've had to make a few small modifications to the source code to get everything working the way I'd like, and now I want to persist those changes on npm install. A colleague suggested that I fork the repo of the source code and point to my forked repo as a dependency, which I've tried in these ways, e.g.
npm install https://github.com/myRepo/angular-translate
npm install https://github.com/myRepo/angular-translate/archive/master.tar.gz
The first gives me a directory like this with no build, just a package.json, .npmignore, and some markdown files:
-angular-translate
.npmignore
.nvmrc
CHANGELOG.md
package.json
etc
The second npm install gives me the full repo, but again I don't get a build like when I use the command npm install angular-translate. I've seen some discussion of running the prepublish script, but I'm not sure how to do this when installing all the modules. I've also tried publishing the fork as my own module to the npm registry, but again I get no build, and I'm not sure that's the right thing to do...
I apologise for my ignorance on the topic. I don't have a huge amount of experience with npm. Would love to get some feedback on this issue. It seems like it could be a common enough issue when modifications need to be made to a package's source code? Maybe there's a better solution?
Try npm install <ghUsername>/<repoName>, where <ghUsername> is your GitHub username (without the @) and <repoName> is the name of the repository. That should install it correctly. You will most likely want to use the --save or --save-dev flag with the install command to save the dependency in your package.json.
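For the fork in the question, that would be something along the lines of:
npm install --save myRepo/angular-translate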
If that isn't working correctly, check the contents of your .npmignore file.
Don't panic if the install command takes a long time; installing from a git repository is slower than installing from the npm registry.
Edit:
Your problem is that dist/ is not committed to the repo (it is listed in .gitignore), and that is where the actual code lives. dist/ is built from the files in src/ before the package is published to the npm registry, but it is never committed to the repo.
It's ugly, but in this case you will have to remove dist/ from the .gitignore and then run:
npm run build
git add .
git commit
git push
(Ensure that you have run npm install first)
You then should be able to install from github.
There might be another way to do this using a prepare script, but I'm not sure if that's possible; I've never tried it. Edit: Cameron Tacklind has written an excellent answer detailing how to do this: https://stackoverflow.com/a/57829251/7127751
TL;DR use a prepare script
and don't forget package.json#files or .npmignore
Code published to npmjs.com is often not what's in the repository for the package. It is common to "compile" JavaScript source files into versions meant for general consumption in libraries. That's what's usually published to npmjs.com.
It is so common that it's a feature of npm to automatically run a "build" step before publishing (npm publish). This was originally called prepublish. It seems that npm thought it would be handy to also run the prepublish script on an npm install, since that was the standard way to initialize a development environment.
This ended up leading to some major confusion in the community. There are very long issues on GitHub about this.
In the end, in an effort to not change old behavior, they decided to add two more automatic scripts: prepublishOnly and prepare.
prepublishOnly does what you expect. It does not run on npm install. Many package maintainers just blindly switched to this.
But there was also this problem that people wanted to not depend on npmjs.com to distribute versions of packages. Git repositories were the natural choice. However it's common practice to not commit "compiled" files to git. That's what prepare was added to handle...
prepare is the correct way
If you have a repository with source files but a "build" step is necessary to use it,
prepare does exactly what you want in all cases (as of npm 4).
prepare: Run both BEFORE the package is packed and published, on local npm install without any arguments, and when installing git dependencies.
You can even put your build dependencies into devDependencies and they will be installed before prepare is executed.
Here is an example of a package of mine that uses this method.
Problems with .gitignore
There is one issue with this option that trips up many people.
When preparing a dependency, npm and Yarn will keep only the files that are listed in the files section of package.json.
One might see that files defaults to all files being included and think they're done.
What is easily missed is that .npmignore mostly overrides the files directive and, if .npmignore does not exist, .gitignore is used instead.
So, if you have your built files listed in .gitignore, like a sane person, and don't do anything else, prepare will seem broken.
If you fix files to only include the built files or add an empty .npmignore, you're all set.
My recommendation
Set files (or, by inversion, .npmignore) such that the only files actually published are those needed by users of the published package. Imho, there is no need to include uncompiled sources in published packages.
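Putting that together, a sketch of the fork workflow this answer describes might look like this (repo name taken from the question; the package.json edits are shown as comments, and the build command is assumed to be npm run build):
git clone https://github.com/myRepo/angular-translate.git
cd angular-translate
# in package.json: add  "prepare": "npm run build"  to scripts
# and set "files" to the built output, e.g. ["dist"]
git commit -am "Build on install via prepare script"
git push
# consumers can now install the fork from GitHub as shown earlier; npm installs
# the fork's devDependencies and runs prepare as part of that install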
Original answer: https://stackoverflow.com/a/57503862/4612476
Update for those using npm 5:
As of npm@5, prepublish scripts are deprecated.
Use prepare for build steps and prepublishOnly for upload-only.
I found adding a "prepare": "npm run build" to scripts fixed all my problems.
Just use the command npm install git+https://git@github.com/myRepo/angular-translate.git. Thanks.
To piggyback off of @RyanZim's excellent answer, postinstall is definitely a valid option for this.
Either do one of the following:
Update the package.json in your forked repo to add a postinstall element to scripts. In here, run whatever you need to get the compiled output (Preferred).
Update your package.json, and add a postinstall that updates the necessary directory in node_modules.
If you've forked another person's repository, then it might be worth raising an issue to explain that installing their package through GitHub does not work, because the repository does not provide the necessary means to build the package. From there, they can either accept a PR to resolve this with a postinstall script, or they can reject it and you can do #2.
If you are using Yarn like me, the equivalent command for installing a package straight from GitHub is:
yarn add ghasemikasra39/gridfs-easy --save
where ghasemikasra39 is the username and gridfs-easy is the name of the repo.

Creating an apt repository on GitHub

Can I create an apt repository from my GitHub repository?
I found How To Setup A Debian Repository but it is too complex to understand.
I believe that it would be possible, but it is not very practical. Git and GitHub are designed for storing source code, while an apt repository is set up to serve built packages. As far as git is concerned, even a Debian source package would not be source code. But since GitHub Pages allows serving largely arbitrary content via HTTP, it should be possible to serve an apt repository that way.
The page you referred to doesn't seem at all complex to me; it's just a list of tools which can be used to manage an apt repository. The part listing Archive Generation Tools should be especially helpful. In the past I've used mini-dinstall to manage my own package archive; you'd probably want to use either that or reprepro. The other tools are probably either trying to do too much (providing the capability to manage an entire distribution repository) or too little.
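As a rough sketch of that approach, assuming you manage the archive with reprepro and publish the generated directory via GitHub Pages (all names are placeholders; reprepro needs a conf/distributions file first, and a real setup should sign the archive rather than rely on [trusted=yes]):
reprepro -b repo includedeb stable mypackage_1.0_amd64.deb   # add a built .deb to the local archive
# commit and push the "repo" directory to the branch served by GitHub Pages, then on client machines:
echo "deb [trusted=yes] https://<username>.github.io/<project> stable main" | sudo tee /etc/apt/sources.list.d/<project>.list
sudo apt-get update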

Best practice for using one mercurial project in another

What are the best practices for using one mercurial project in another? I've got a django app that I'm working on, but I'm also using mercurial to version control a website that uses that app. I've looked at mercurial subrepositories, but apparently this is considered a "feature of last resort". Is there a good way of doing what I want to do, or do I just have to copy the code from my app into my website repo when I want to update to a new version of my app?
In your specific case I like to let pip handle my django application dependencies: http://guide.python-distribute.org/pip.html#installing-from-a-vcs
We have a requirements.txt in our "website" repo, and our deploy does a pip install --upgrade -r requirements.txt. That pulls the latest from the repo and installs it into the application's virtual env. This gives nice flexibility and separation while leaving the package management up to pip. With those VCS URLs in pip you can also point to a specific tag or branch if you want different sites using different revisions of the same underlying repo.
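For example, a pinned VCS requirement might look like this, whether given to pip directly or placed in requirements.txt (the host, path, and tag are placeholders; git+ URLs work the same way):
pip install "hg+https://hg.example.com/myapp@v1.2#egg=myapp"   # pin the app to tag or branch v1.2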
pip also has a -e /path/to/project mode for pointing to an "editable" clone that lives outside the website repo, which would work too, but I've not tried it.
That said, if you think subrepos fit your workflow better by all means use them. They work just fine, but people often get hung up on the workflow constraints ("What do you mean I can't commit my parent repo w/o also committing in the subrepo?!")