Using gitattributes for linguist examples

Are there any concrete examples of using Linguist attributes in .gitattributes to correct languages that GitHub detects wrongly?
Source: https://github.com/github/linguist
linguist-documentation
linguist-language
linguist-vendored

Examples can be found in Linguist's documentation. What you want can be achieved with linguist-language attributes.
linguist-language
With the following attribute, Linguist detects all .rb files as being Java files.
*.rb linguist-language=Java
linguist-vendored
With the following attribute, Linguist detects files in the special-vendored-path directory (notice the mandatory trailing *) as vendored and excludes them from statistics.
special-vendored-path/* linguist-vendored
linguist-documentation
Without the following attribute, Linguist would detect the file docs/formatter.rb as documentation and exclude it from statistics.
docs/formatter.rb linguist-documentation=false
linguist-detectable
With the following attribute, Linguist counts SQL files in statistics. Without this attribute, only programming and markup languages are counted in statistics.
*.sql linguist-detectable=true

Related

GitHub incorrectly detects languages of my project as "Roff"

In one of my repositories nearly all of my code is Python and some HTML.
However, GitHub thinks otherwise and lists Roff among the project's languages.
What causes that?
You were creating files through a script, with an unintended extension. That is, your script was inserting a dot in the file name.
Simply rename your file my_file_0.5ms to my_file_05ms.txt, and GitHub will display the correct languages.
What you could do to fix similar problems in the future is use a script to detect extensions and the total lines of code for each extension.
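A minimal sketch of such a script, assuming Python 3 and a local checkout of the repository. The function name lines_per_extension is hypothetical, and the tally is only a rough guide: Linguist also looks at filenames and file contents, not just extensions.

```python
from collections import Counter
from pathlib import Path


def lines_per_extension(root):
    """Return a Counter mapping file suffix -> total line count under root."""
    root = Path(root)
    totals = Counter()
    for path in root.rglob("*"):
        # Skip directories and anything inside the .git directory.
        if not path.is_file() or ".git" in path.relative_to(root).parts:
            continue
        # Path.suffix is the text after the last dot, e.g. ".5ms".
        suffix = path.suffix or "(none)"
        try:
            with path.open("rb") as handle:
                totals[suffix] += sum(1 for _ in handle)
        except OSError:
            continue  # unreadable file: ignored in this sketch
    return totals
```

Calling lines_per_extension(".").most_common() from the repository root lists the suffixes with the most lines first, so an unexpected suffix such as .5ms stands out.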
Solution
GitHub Linguist is the culprit in this situation, but luckily, it can be easily resolved in a number of ways.
Create a .gitattributes file and list patterns that match the files you want to ignore, and then append either linguist-vendored or linguist-documentation.
specific-file.5ms linguist-vendored
*.5ms linguist-vendored
specific-folder/* linguist-vendored
This will remove the files from your GitHub repository's statistics on the next run of Linguist (it may take some time).
Notes
If you'd like to attribute these files to a specific language, you can do that using linguist-language={name}. Full documentation on overriding Linguist can be found in the Linguist repository (https://github.com/github/linguist).
You can also run Linguist on your own computer, but note that any changes to .gitattributes will not take effect until you commit to your repository. Linguist will not see changes that exist only in the index.

Does GitHub Linguist support prefix wildcards for linguist-vendored?

In GitHub's documentation on linguist, the section on using the .gitattributes file says a path can be marked as vendored, and thus ignored in the repository's statistics tracking, with:
special-vendored-path/* linguist-vendored
However, is it possible to have linguist mark directories as vendored that may be nested in directories containing non-vendored code?
I tried adding a line styled as */special-vendored-path/* linguist-vendored to my .gitattributes, but that didn't cause the GitHub code-proportion information to change.
To match a directory nested at an arbitrary depth in the directory tree, you need double asterisks:
**/special-vendored-path/* linguist-vendored
Note, however, that double asterisks are not needed at the end of paths. For example, test1/* will match test1/test2/test3/file.

Issue with doxygen .dox files

I am trying to run doxygen on some source files for a project that I downloaded source files for. The files are located in the following directories:
doc/ - Documentation files, such as .dox files.
src/ - Source files
My settings in my doxygen.config file are:
INPUT = ../ .
FILE_PATTERNS = *.h *.dox *.dxx
When I run doxygen (doxygen doxygen.config), it generates all of the documentation from the .h files correctly, but it does not generate the mainpage correctly. I have a file titled intro.dox in the doc folder, with a command \mainpage Documentation Index, and a bunch of text, but doxygen is not using this to generate the main page.
What am I doing wrong?
There are (at least) two possible reasons for this:
You are not including the /doc directory in your INPUT list. Try modifying this to
INPUT = ../ . ../doc
Did you mean to write ../doc instead of ../? I am guessing that your doxygen.config file is in your src directory. If this is not the case, can you make this clear in the question?
Doxygen requires that your documentation files (your .dox files) are plain text with your text wrapped with Doxygen C++ comments (i.e. /** ... */).
Without knowing where doxygen.config is located, and since you are using relative paths in INPUT, it is difficult to determine what might cause this. However, since the files you are looking for are in sibling directories, it is possible that doxygen is not searching recursively for your files. You may want to confirm that RECURSIVE is set to YES in doxygen.config.

Concatenate content of TAGS files from different directories

I'm referring to TAGS file generated by ctags or etags in order to have some code navigation in Emacs with M-..
The typical project looks like this:
Large standard library (more than 100 files, but rarely updated).
Project-specific library (updated on a daily basis).
I would like the project to be able to use two (or maybe more) TAGS files, but regenerate only some of them: the ones for code that changes inside the particular project. How would I approach this problem?
etags --help:
-i FILE, --include=FILE
Include a note in tag file indicating that, when searching for
a tag, one should also consult the tags file FILE after
checking the current file.
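As a rough sketch of how that option could be used: generate the standard library's TAGS file once, then regenerate only the project's TAGS file with an --include entry pointing at it. The helper names and the paths in the usage note are hypothetical; only the --include and --output options come from etags itself.

```python
import subprocess


def etags_command(sources, included_tags, output="TAGS"):
    """Build an etags argv that tags `sources` and references other TAGS files."""
    argv = ["etags", f"--output={output}"]
    # Each --include adds a note telling Emacs to also consult that TAGS file.
    argv += [f"--include={tags}" for tags in included_tags]
    argv += list(sources)
    return argv


def regenerate_project_tags(sources, included_tags, output="TAGS"):
    """Run etags for the project only; the included TAGS files are left untouched."""
    subprocess.run(etags_command(sources, included_tags, output), check=True)
```

For example, regenerate_project_tags(["src/main.c"], ["/path/to/stdlib/TAGS"]) rewrites only the project's TAGS file; the large library's TAGS file needs regenerating only when the library itself changes.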

Making stable names for doxygen html docs pages

I need to refer to Doxygen documentation pages. The file names, however, are not stable, as they change after every generation. My idea is to create a symlink to each HTML file created by Doxygen, with a stable and human-friendly name. Has anyone tried this?
Actually, it might be very easy just to parse the annotated.html file Doxygen produces. Any documented class shows up there as a line like:
`<tr><td class="indexkey"><a class="el" href="dd/de6/a00548.html">
ImportantClass</a></td>`
The hard problem for me is that I would like to have my file names (i.e. the symlinks) be visible on my server like:
http://www.package.com/com.package.my.ImportantClass.html
[Yes, the code is in Java]. So the question actually reads: "How do I connect an HTML page generated by Doxygen with the right Java class name and its package name?"
You seem to have SHORT_NAMES enabled, which will indeed produce volatile names. When you set SHORT_NAMES to NO in the configuration file (the default), you will get longer names, but these are stable over multiple runs (i.e. they are based on the name, and for functions also on (a hash of) the parameters).
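If changing SHORT_NAMES is not an option, the symlink idea from the question could be sketched as follows, assuming annotated.html contains anchors shaped like the snippet quoted above. The names ClassLinkParser, class_links and make_symlinks are hypothetical. The sketch uses the link text as the stable name, so it does not by itself recover the Java package name, and it assumes that text is a valid file name.

```python
from html.parser import HTMLParser
from pathlib import Path


class ClassLinkParser(HTMLParser):
    """Collect {link text: href} for <a class="el" href="..."> anchors."""

    def __init__(self):
        super().__init__()
        self.links = {}
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "a" and attributes.get("class") == "el" and attributes.get("href"):
            self._href = attributes["href"]
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            name = "".join(self._text).strip()
            if name:
                self.links[name] = self._href
            self._href = None


def class_links(html_text):
    """Parse the text of annotated.html and return {class name: relative href}."""
    parser = ClassLinkParser()
    parser.feed(html_text)
    parser.close()
    return parser.links


def make_symlinks(html_dir, links):
    """Create <name>.html symlinks in html_dir pointing at the generated pages."""
    for name, href in links.items():
        stable = Path(html_dir) / f"{name}.html"
        if stable.is_symlink():
            stable.unlink()  # replace a link left over from a previous run
        stable.symlink_to(href)  # href is relative to html_dir
```

Running make_symlinks("html", class_links(Path("html/annotated.html").read_text())) after each Doxygen run would refresh the links; the html directory name is an assumption about the output location.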