ctags or similar tagging system for a kernel source tree

ctags or similar tagging system for a kernel source tree - tags

ctags is a simple source code tagging system, also integrated in vi (and its flavours nvi, vim, etc.). AFAIK, it builds a plain text file where all the elements (functions, macros, ...) of the source code are indexed. But this file may become too large and unmanageable when the source code tree is extremely huge: this is the case of a kernel (Linux, *BSD, or similar).
Is still ctags or exuberant-ctags suitable for a complex source tree like a kernel?
If not, what tools (with the same integration in vi as ctags) can replace it? This may become subjective, so if possible provide a list of suggested tools: any comments, and references to a guide with the keyboard shortcuts in vi, are welcome.
Supported languages should be at least C, C++, assembly. The tool should be usable through CLI. I would principally like to jump to the definition of functions, macros, struct and similar objects (with ctags, pressing Ctrl+] with the cursor over the item name), to their manpages if possible, and back to the code.
The only alternative tool I know so far is GNU global, with a pretty complex vi integration, which seems to be possible only through Perl (and I can't find the equivalent of Ctrl+]).

The answer to your first point is a resounding yes.
You can use ctags to generate a tags file for different subtrees, thus keeping the size of the generated file to a minimum. At this point, you need to have a mechanism in place for searching for these multiple tags files. Vim provides this, of course.
I have given some advice here, so you may want to check that out.
Of course, I use exuberant-ctags there, so keep that in mind.

Related

Lisp native functions source codes [duplicate]

In C, if I want to see a function that how to work, I open the library which provides the function and analyze the code. How can be implementations of the lisp functions seen? For example, intersection function

You can also look at the source code of lisp functions.
For example, the source files for CLISP, one Common Lisp implementation, are available here: http://www.clisp.org/impnotes/src-files.html
If you want to examine the implementation of functions related to lists, you can look at the file: http://clisp.cvs.sourceforge.net/viewvc/clisp/clisp/src/list.d

The usual answer is "M-."
Assuming you have a properly configured IDE, and the source code of the function, clicking on its name and pressing M-. (that's Meta, or Alt or Option or Escape, and dot/period; or whatever key your IDE uses) should reveal its definition (or, for a generic function, definitions, plural; including any compiler macros that might optimize out some cases). Sometimes it's on a right-click or other mouse menu or toolbar.
If the source isn't available, you can often see the actual compiled form by evaluating (disassemble 'function)
Most IDE's, including perennial favourite Emacs+Slime, have other Inspection operations on the menu as well.
In a non-IDE environment, most compilers have reflection tools of their own (compiler-dependant) which are usually also mapped by the Swank library that Slime uses; one might find useful function in that package.
And this really should be documented in your IDE's manual.
I should postscript this that:
You really shouldn't care about the implementation of the core library functions; their contractual behavior is very well documented in the CLHS standard, which is available online and eg, Quicklisp has an utility to link it to Slime (C-c C-d h on a symbol in the COMMON-LISP package); for all well-written Lisp libraries, there should be documentation attached to functions, variables, classes, etc. accessible via the documentation function in the REPL or the IDE's menus and Inspection windows.
Core library functions are often highly optimized and far more complex than most user-level code should want to be, and often call down into compiler-specific "guts" that one should avoid doing in application code.

Tags for Emacs: Relationship between etags, ebrowse, cscope, GNU Global and exuberant ctags

I work on C++ projects, and I went through Alex Ott's guide to CEDET and other threads about tags in StackOverflow, but I am still confused about how Emacs interfaces with these different tag systems to facilitate autocompletion, the looking up of definitions, navigation of source code base or the previewing of doc-strings.
What is the difference (e.g. in terms of features) between etags, ebrowse, exuberant ctags, cscope, GNU Global and GTags? What do I need to do to use them in Emacs?
Do I need semantic/senator (CEDET) if I want to use tags to navigate/autocomplete symbols?
What does semantic bring to the table on top of these different tag utilities? How does it interface with these tools?

That's as a good question as I've recently read here, so I'll try explain the difference in more detail:
Point 1:
etags and ctags both generate an index (a.k.a. tag/TAGS) file of language objects found in source files that allows these items to be quickly and easily located by a text editor or other utility. A tag signifies a language object for which an index entry is available (or, alternatively, the index entry created for that object). The tags generated by ctags are richer in terms of metadata, but Emacs cannot interpret the additional data anyways, so you should consider them more or less the same (the main advantage of ctags would be its support for more languages). The primary use for the tags files is looking up class/method/function/constant/etc declaration/definitions.
cscope is much more powerful beast (at least as far as C/C++ and Java are concerned). While it operates on more or less the same principle (generating a file of useful metadata) it allows you do some fancier things like find all references to a symbol, see where a function is being invoked, etc (you can find definitions as well).
To sum it up:
ctags one allows you to navigate to symbol declaration/definitions (what some would call a one-way lookup). ctags is a general purpose tool useful for many languages.
On the other hand (as mentioned on the project's page) cscope allows you to:
Go to the declaration of a symbol
Show a selectable list of all references to a symbol
Search for any global definition
Functions called by a function
Functions calling a function
Search for a text string
Search for a regular expression pattern
Find a file
Find all files including a file
It should come as no surprise to anyone at this point, that when I deal with C/C++ projects I make heavy use of cscope and care very little about ctags. When dealing with other languages the situation would obviously be reversed.
Point 2.
To have intelligent autocompletion you need a true source code parser (like semantic), otherwise you won't know the types of the objects (for instance) in your applications and the methods that can be invoked on them. You can have an autocompletion based on many different sources, but to get the best results you'll ultimately need a parser. Same goes for syntax highlighting - currently syntax highlighting in Emacs major modes is based simply on regular expressions and that's very fragile and error prone. Hopefully with the inclusion of semantic in Emacs 23.2 (it used to be an external package before that) we'll start seeing more uses for it (like using it to analyse a buffer source code to properly highlight it)
Since Emacs 24.1 semantic is usable from the Emacs completion framework. The easiest way to test it is to open up a C source code file and typing M-TAB or C-M-i and watch as semantic automagically completes for you. For languages where semantic is not enabled by default, you can add it the following line to your major mode hook of choice:
(add-to-list 'completion-at-point-functions 'semantic-completion-at-point-function)
Point 3.
semantic brings true code awareness (for the few languages it currently supports) and closes the gap between IDEs and Emacs. It doesn't really interface with tools like etags and cscope, but it doesn't mean you cannot use them together.
Hopefully my explanations make sense and will be useful to you.
P.S. I'm not quite familiar with global and ebrowse, but if memory serves me they made use of etags.

I'll try to add some explanations to 1.
What is it?
Etags is a command to generate 'TAGS' file which is the tag file for Emacs. You can use the file with etags.el which is part of emacs package.
Ctags is a command to generate 'tags' file which is the tag file for vi. Universal Ctags, the successor of Exuberant Ctags, can generate 'TAGS' file by the -e option, supporting more than 41 programming languages.
Cscope is an all-in-one source code browsing tool for C language. It has own fine CUI (character user interface) and tag databases (cscope.in.out, cscope.out, cscope.po.out). You can use cscope from Emacs using xcscope.el which is part of cscope package.
GNU GLOBAL is a source code tagging system. Though it is similar to above tools, it differs from them at the point of that it is dependent from any editor, and it has no user interface except for command line. Gtags is a command to generate tag files for GLOBAL (GTAGS, GRTAGS, GPATH). You can use GLOBAL from emacs using gtags.el which is part of GLOBAL package. In addition to this, there are many elisp libraries for it (xgtags.el, ggtags.el, anything-gtags.el, helm-gtags.el, etc).
Comparison
Ctags and etags treat only definitions. Cscope and GNU GLOBAL treat not only definitions but also references.
Ctags and etags use a flat text tag file. Cscope and GNU GLOBAL use key-value tag databases.
Cscope and GNU GLOBAL have a grep like search engine and incremental updating facility of tag files.
Combination
You can combine Universal Ctags's rich language support and GNU GLOBAL's database facility by using ctags as a plug-in parser of GLOBAL.
Try the following: (requires GLOBAL-6.5.3+ and Universal Ctags respectively)
Building GNU GLOBAL:
$ ./configure --with-universal-ctags=/usr/local/bin/ctags
$ sudo make install
Usage:
$ export GTAGSCONF=/usr/local/share/gtags/gtags.conf
$ export GTAGSLABEL=new-ctags
$ gtags # invokes Universal Ctags internally
$ emacs -f gtags-mode # load gtags.el
(However, you cannot treat references by this method, because ctags don't treat references.)
You can also use cscope as a client of GNU GLOBAL. GLOBAL package includes a command named 'gtags-cscope' which is a port of cscope, that is, it is cscope itself except that it use GLOBAL as a search engine instead of cscope's one.
$ gtags-cscope # this is GLOBAL version of cscope
With the combinations, you can use cscope for 41 languages.
Good luck!

TAGS files contain definitions
A TAGS file contains a list of where functions and classes are defined. It is usually placed in the root of a project and looks like this:
^L
configure,3945
as_fn_success () { as_fn_return 0; }^?as_fn_success^A180,5465
as_fn_failure () { as_fn_return 1; }^?as_fn_failure^A181,5502
as_fn_ret_success () { return 0; }^?as_fn_ret_success^A182,5539
as_fn_ret_failure () { return 1; }^?as_fn_ret_failure^A183,5574
This enables Emacs to find definitions. Basic navigation is built-in with find-tag, but etags-select provides a nicer UI when there are multiple matches.
You can also uses TAGS files for code completion. For example, company's etags backend uses TAGS files.
TAGS files can be built by different tools
ctags (formerly known as 'universal ctags' or 'exuberant ctags') can generate TAGS files and supports the widest range of languages. It is actively maintained on github.
Emacs ships with two programs that generate TAGS files, called etags and ctags. Emacs' ctags is just etags with the same CLI interface as universal ctags. To avoid confusion, many distros rename these programs (e.g. ctags.emacs24 on Debian).
There are also language specific tools for generating TAGS files, such as jsctags and hasktags.
Other file formats
ebrowse is a C program shipped with Emacs. It indexes C/C++ code and generates a BROWSE file. ebrowse.el provides the usual find definition and completion. You can also open the BROWSE file directly in Emacs to get an overview of the classes/function defined a codebase.
GNU Global has its own database format, which consists of a GTAGS, GRTAGS and GPATH file. You can generate these files with the gtags command, which parses C/C++ code. For other languages, GNU Global can read files generated by universal ctags.
GNU Global also provides a CLI interface for asking more sophisticated questions, like 'where is this symbol mentioned?'. It ships with an Emacs package gtags.el, but ggtags.el is also popular for accessing GNU Global databases.
Cscope is similar in spirit to GNU Global: it parses C/C++ into its own database format. It can also answer questions like 'find all callers/callees of this funciton'.
See also this HN discussion comparing global and cscope.
Client/Server projects
rtags parses and indexes C/C++ using a persistent server. It uses the clang parser, so it handles C++ really well. It ships with an Emacs package to query the server.
google-gtags was a project where a large TAGS file would be stored on a server. When you queried the server, it would provide a subset of the TAGS file that was relevant to your search.
Semantic (CEDET)
Semantic is a built-in Emacs package that contains a parser for C/C++, so it can find definitions too. It can also import data from TAGS files, csope databases, and other sources. CEDET also includes IDE style functionality that uses this data, e.g. generating UML diagrams of class hierarchies.

[answer updated from shigio's]
I'll try to add some explanations to part 1 of the question.
What is it?
Etags generates a TAGS file which is the tag file format for Emacs. You can use an Etags file with etags.el which is part of Emacs.
Ctags is the generic term for anything that can generate a tags file, which is the native tag file format for Vi. Universal Ctags (aka UCtags, formerly Exuberant Ctags) can also generate Etags with the -e option.
Cscope is an all-in-one source code browsing tool for C (with lesser support for C++ and Java), with its own tag databases (cscope.in.out, cscope.out, cscope.po.out) and TUI. Cscope support is built-in to Vim; you can use Cscope from Emacs using the xcscope.el package. There are also Cscope-based GUIs.
GNU GLOBAL (aka Gtags) is yet another source code tagging system (with significant differences--see next section), in that it also generates tag files.
Comparison
Ctags and Etags treat only definitions (of, e.g., variables and functions). Cscope and Gtags also treat references.
Ctags and Etags tag files are flat. Cscope and Gtags tagfiles are more powerful key-value databases, which allows (e.g.) incremental update.
Cscope and Gtags have a grep-like search engine.
Ctags has built-in support for more languages and data formats: see the current-in-repository list of Universal Ctags parsers. UCtags also has documented how to develop your own parser.
Cscope and Gtags are editor-independent.
Gtags does not provide its own user interface, but can currently (Oct 2016) be used from commandline (CLI), Emacs and relatives, Vi and relatives, less (pager), Doxygen, and any web browser.
Gtags provides gtags.el via the GLOBAL package, but there are also many other elisp extensions, including xgtags.el, ggtags.el, anything-gtags.el, helm-gtags.el.
Combination
You can combine Universal Ctags' rich language support with Gtags' database facility and numerous extensions by using Ctags as a GLOBAL plug-in parser:
# build GNU GLOBAL
./configure --with-exuberant-ctags=/usr/local/bin/ctags
sudo make install
# use it
export GTAGSCONF=/usr/local/share/gtags/gtags.conf
export GTAGSLABEL=ctags
gtags # invokes Universal Ctags internally
emacs -f gtags-mode # load gtags.el
Note again that if you use Ctags as the parser for your Gtags, you lose the ability to treat references (e.g., variable usage, function calls) which Gtags would otherwise provide. Essentially, you trade off Gtags' reference tracking for Ctags' greater built-in language support.
You can also use Cscope as a client of Gtags: gtags-cscope.
Good luck!

I haven't actually checked, but according to CEDET manual (http://www.randomsample.de/cedetdocs/common/cedet/CScope.html):
semantic can use CScope as a back end for database searches. To enable it, use:
(semanticdb-enable-cscope-databases)
This will enable the use of cscope for all C and C++ buffers.
CScope will then be used for project-wide searches as a backup when pre-existing semantic database searches may not have parsed all your files.

What Are The Most Important Parts Of ELisp To Learn If You Want To Programme Emacs?

I use Emacs everyday as it is the standard editor for Erlang.
I decided as my New Years Resolution to learn to programme eLisp. I decided that writing a book about eLisp was the best way to learn.
I have make pretty good progress:
Learn eLisp For Emacs
The strategic structure of the book is
getting started/basics
more advanced eLisp
writing a minor mode
writing a major mode
I have got through the basics (ie the first of these 4 points), covering:
evaluating expressions
debugging
adding menu items/toolbars
loading your own emacs files
etc, etc
If you are writing a book about a programming language you usually start by knowing the language well - well I don't - so my major problem now is a completeness problem:
How do I know that I have covered all the features that an Emacs programmer should have mastered?
How do I ensure that there aren't gaps in the content?
So I thought I would address these by asking the community here. My question is What Is Missing From My Table Of Contents? (in particular what should the more advanced eLisp Section contain).

Now that's an interesting way to learn a language...
I think you've probably skipped a bunch of the fundamentals in the getting started/basics section. If you've not already read it, I recommend reading "An Introduction To Programming In Emacs Lisp" (at gnu.org), which covers what I'd expect to see in the "basics" portion. I won't bother cut/paste the table of contents.
As far as knowing when you've written a complete book... Well, once you've re-written the Emacs Lisp manual in "how to" form, you know you're done. Alternatively, if you've written a book that can be used to answer/interpret all of the elisp and emacs questions, then you've probably got decent coverage.
What advanced features could you write about? There's advice, process communication, non-ASCII text, syntax tables, abbrevs, text properties, byte compilation, display tables, images, and a bunch more in the manual.
Note: The proper capitalization of elisp is either all lowercase, or possibly an uppercase E. The GNU documentation doesn't use "elisp" very much at all (mostly as a directory name, all lowercase), it prefers "Emacs Lisp."
Note: In the current version of your book, you treat global variables negatively. It's probably worth reading the RMS paper to gain some insight into the design decisions made, specifically on global variables as well as dynamic binding (the latter which you've yet to cover, which is a key (basic) concept which you've already gotten wrong in your book).

Instead of asking the community here, why not use what the community already offers? Review all the questions tagged "elisp" and see where they fit it your book. A survey of what people actually want to understand could be some of the best information you will get.

"I have read both the GNU manuals - but they are not so much use if you don't know Lisp/elisp."
Tip: Emacs is not much use if you don't know Lisp.
Not really true, of course, but you get the idea. This one is really true:
Tip: Emacs is much, much, much, much more useful if you know Lisp. Not to mention more fun.
Wrt what to learn:
symbols (they are simple objects, with properties -- not just identifiers)
lists -- cons cells, list structure (including modification/sharing)
evaluation
function application
regexps
text/overlay/string properties (values can be any Lisp entities)
buffers and windows

I'd suggest you'll take a look what the two Info manuals included with Emacs Emacs Lisp Intro "An Introduction to Programming in Emacs Lisp" and Elisp "The Emacs Lisp Reference Manual" already have to offer and then decide what you would like to add to or do differently than those.

What are some useful emacs functions for refactoring?

For now I've stuck with multi-occur-in-matching-buffers and rgrep, which, while powerful, is still pretty basic I guess.
Eventhough I realize anything more involved than matching a regexp and renaming will need to integrate with CEDET's semantic bovinator, I feel like there is still room for improvement here.
Built-in functions, packages, or custom-code what do you find helpful getting the job done ?
Cheers

In CEDET, there is a symbol reference tool. By default it also uses find/grep in a project to find occurrence of a symbol. It is better to use GNU Global, IDUtils, or CScope instead to create a database in your project. You can then use semantic-symref-symbol which will then use gnu global or whatever to find all the references.
Once in symref list buffer, you can look through the hits. You can then select various hits and perform operations such as symbol rename, or the more powerful, execute macro on all the hits.
While there are more focused commands that could be made, the macro feature allows almost anything to happen for the expert user who understands Emacs keyboard macros well.

It depends on which language you are using; if your language is supported by slime, there are the family of who commands: slime-who-calls, who-references, who-binds, calls-who, etc. They provide real, semantic based information, so are more reliable than regexp matching.

If you're editing lisp, I've found it useful (in general) to use the paredit.el package. Follow the link for documentation, and the video is a great introduction.

Emacs recursive project search

I am switching to Emacs from TextMate. One feature of TextMate that I would really like to have in Emacs is the "Find in Project" search box that uses fuzzy matching. Emacs sort of has this with ido, but ido does not search recursively through child directories. It searches only within one directory.
Is there a way to give ido a root directory and to search everything under it?
Update:
The questions below pertain to find-file-in-project.el from Michał Marczyk's answer.
If anything in this message sounds obvious it's because I have used Emacs for less than one week. :-)
As I understand it, project-local-variables lets me define things in a .emacs-project file that I keep in my project root.
How do I point find-file-in-project to my project root?
I am not familiar with regex syntax in Emacs Lisp. The default value for ffip-regexp is:
".*\\.\\(rb\\|js\\|css\\|yml\\|yaml\\|rhtml\\|erb\\|html\\|el\\)"
I presume that I can just switch the extensions to the ones appropriate for my project.
Could you explain the ffip-find-options? From the file:
(defvar ffip-find-options
""
"Extra options to pass to `find' when using find-file-in-project.
Use this to exclude portions of your project: \"-not -regex \\".vendor.\\"\"")
What does this mean exactly and how do I use it to exclude files/directories?
Could you share an example .emacs-project file?

I use M-x rgrep for this. It automatically skips a lot of things you don't want, like .svn directories.

(Updated primarily in order to include actual setup instructions for use with the below mentioned find-file-in-project.el from the RINARI distribution. Original answer left intact; the new bits come after the second horizontal rule.)
Have a look at the TextMate page of the EmacsWiki. The most promising thing they mention is probably this Emacs Lisp script, which provides recursive search under a "project directory" guided by some variables. That file begins with an extensive comments section describing how to use it.
What makes it particularly promising is the following bit:
;; If `ido-mode' is enabled, the menu will use `ido-completing-read'
;; instead of `completing-read'.
Note I haven't used it myself... Though I may very well give it a try now that I've found it! :-)
HTH.
(BTW, that script is part of -- to quote the description from GitHub -- "Rinari Is Not A Rails IDE (it is an Emacs minor mode for Rails)". If you're doing any Rails development, you might want to check out the whole thing.)
Before proceeding any further, configure ido.el. Seriously, it's a must-have on its own and it will improve your experience with find-file-in-project. See this screencast by Stuart Halloway (which I've already mentioned in a comment on this answer) to learn why you need to use it. Also, Stu demonstrates how flexible ido is by emulating TextMate's project-scoped file-finding facility in his own way; if his function suits your needs, read no further.
Ok, so here's how to set up RINARI's find-file-in-project.el:
Obtain find-file-in-project.el and project-local-variables.el from the RINARI distribution and put someplace where Emacs can find them (which means in one of the directories in the load-path variable; you can use (add-to-list 'load-path "/path/to/some/directory") to add new directories to it).
Add (require 'find-file-in-project) to your .emacs file. Also add the following to have the C-x C-M-f sequence bring up the find-file-in-project prompt: (global-set-key (kbd "C-x C-M-f") 'find-file-in-project).
Create a file called .emacs-project in your projects root directory. At a minimum it should contain something like this: (setl ffip-regexp ".*\\.\$clj\\|py\$$"). This will make it so that only files whose names and in clj or py will be searched for; please adjust the regex to match your needs. (Note that this regular expression will be passed to the Unix find utility and should use find's preferred regular expression syntax. You still have to double every backslash in regexes as is usual in Emacs; whether you also have to put backslashes before parens / pipes (|) if you want to use their 'magic' regex meaning depends on your find's expectations. The example given above works for me on an Ubuntu box. Look up additional info on regexes in case of doubt.) (Note: this paragraph has been revised in the last edit to fix some confusion w.r.t. regular expression syntax.)
C-x C-M-f away.
There's a number of possible customisations; in particular, you can use (setl ffip-find-options "...") to pass additional options to the Unix find command, which is what find-file-in-project.el calls out to under the hood.
If things appear not to work, please check and double check your spelling -- I did something like (setl ffip-regex ...) once (note the lack of the final 'p' in the variable name) and were initially quite puzzled to discover that no files were being found.

Surprised nobody mentioned https://github.com/defunkt/textmate.el (now gotta make it work on Windows...)

eproject has eproject-grep, which does exactly what you want.
With the right project definition, it will only search project files; it will ignore version control, build artifacts, generated files, whatever. The only downside is that it requires a grep command on your system; this dependency will be eliminated soon.

You can get the effect you want by using GNU Global or IDUtils. They are not Emacs specific, but they has Emacs scripts that integrate that effect. (I don't know too much about them myself.)
You could also opt to use CEDET and the EDE project system. EDE is probably a bit heavy weight, but it has a way to just mark the top of a project. If you also keep GNU Global or IDUtils index files with your project, EDE can use it to find a file by name anywhere, or you can use `semantic-symref' to find references to symbols in your source files. CEDET is at http://cedet.sf.net

For pure, unadulterated speed, I highly recommend a combination of the command-line tool The Silver Searcher (a.k.a. 'ag') with ag.el. The ag-project interactive function will make an educated guess of your project root if you are using git, hg or svn and search the entire project.

FileCache may also be an option. However you would need to add your project directory manually with file-cache-add-directory-recursively.

See these links for info about how Icicles can help here:
find files anywhere, matching any parts of their name (including directory parts)
projects: create, organize, manage, search them
Icicles completion matching can be substring, regexp, fuzzy (various kinds), or combinations of these. You can also combine simple patterns, intersecting the matches or complementing (subtracting) a subset of them

We Keep Coding

iphone swift flutter scala powershell matlab mongodb postgresql perl eclipse