What does internal mean in function names in Emacs Lisp? - emacs

Some people use double dash to indicate that the function is subject to change:
What does the double minus (--) convention in function names mean in Emacs Lisp
Does including internal in function names mean similar things?
Two examples
where-is-internal
internal-make-var-non-special
The function where-is-internal has a detailed docstring and is mentioned in the manual as well. Is where-is-internal an exception?
Is there a difference between having -internal as suffix and having internal- as prefix?
Adding to the confusion, there are also function names with internal-- (with a double dash) as prefix.

The confusion is not just in the naming convention (variability due to history and perhaps sometimes whim). The confusion is in the very notion of "internal" in free software, where the source code is open to everyone to use or modify (even fork) as they please.
To answer your question from (what I think is) the point of view of Emacs Dev, and thus in terms of the underlying intention: "internal" means that someone using such a function is perhaps more likely to encounter future changes in the Emacs-Dev implementation and use of that function than might be the case for a non-"internal" function. IOW, you might not want to count on it remaining as it is now. That's all.
But there's a lot of "perhaps", "more likely", and "might" in there. In practice, some non-"internal" functions change more radically or more quickly than some "internal" functions. It might be the case that for the former there will be a deprecation grace period, during which the pre-change situation is tolerated, i.e., still works. That might not be the case for something "internal". But again, in practice there is some gray between the black of "internal" and the white of non-"internal".
Someone from Emacs Dev (e.g. @Stefan) will perhaps put this differently or correct my interpretation.
My own take: there have sometimes (often) been functions and variables that the author did not expect users to make use of directly, and thus naturally thought of as "internal", which users have nevertheless put to good use, or even "had" to use (modulo rewriting lots of code). Some such have had their "internal" status removed (no, I don't have examples memorized). Or sometimes a new, non-"internal" function has been added to make the behavior available - e.g., a wrapper or function-valued argument has been added (again, I have no offhand examples to give).
IOW, for Emacs Dev too it is not always clear what should be considered "internal". Just take the label as a flag that you might not want to count too much on that function or variable.
Wrt the various notations: My impression is that the -- convention seems recently to be used more (though there is also some old code that uses it); using internal is an older convention, for the most part.

The "internal" and the "--" conventions are similar. Basically "internal" is used when there's no prefix after which to put a double dash (which is usually the case for functions implemented in C).
And yes, as Drew explains, the intention behind the notion of something being "internal" is just to recommend people not use it directly. IOW if they need the corresponding functionality, they should report a bug requesting to promote its status to "non-internal".

Related

gcc precompiler directive __attribute__ ((__cleanup__)) vs ((cleanup)) (with vs without underscores?)

I'm learning about gcc's cleanup attribute, and learning how it calls a function to be run when a variable goes out of scope, and I don't understand why you can use the word "cleanup" with or without underscores. Where is the documentation for, or documentation of, the version with underscores?
The gcc documentation above shows it like this:
__attribute__ ((cleanup(cleanup_function)))
However, most code samples I read, show it like this:
__attribute__ ((__cleanup__(cleanup_function)))
Ex:
http://echorand.me/site/notes/articles/c_cleanup/cleanup_attribute_c.html
http://www.nongnu.org/avr-libc/user-manual/atomic_8h_source.html
Note that the first example link states they are identical, and of course coding it proves this, but how did he know this originally? Where did this come from?
Why the difference? Where is __cleanup__ defined or documented, as opposed to cleanup?
My fundamental problem lies in the fact that I don't know what I don't know, therefore I am trying to expose some of my unknown unknowns so they become known unknowns, until I can study them and make them known knowns.
My thinking is that perhaps there is some globally-applied principle to gcc preprocessor directives, where you can arbitrarily add underscores before or after any of them? -- Or perhaps only some of them? -- Or perhaps it modifies the preprocessor directive or attribute somehow and there are cases where one method, with or without the extra underscores, is preferred over the other?
You are allowed to define a macro cleanup, as it is not a name that is reserved to the compiler. You are not allowed to define one named __cleanup__. This guarantees that your code using __cleanup__ is unaffected by other code (provided that other code behaves, of course).
As https://gcc.gnu.org/onlinedocs/gcc/Attribute-Syntax.html#Attribute-Syntax explains:
You may optionally specify attribute names with __ preceding and following the name. This allows you to use them in header files without being concerned about a possible macro of the same name. For example, you may use the attribute name __noreturn__ instead of noreturn.
(But note that attributes are not preprocessor directives.)
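To make the macro argument concrete, here is a minimal sketch, assuming GCC or Clang. The names free_buffer, use_buffer, demo and cleanup_calls are made up for illustration, and the #define stands in for a hypothetical third-party header that happens to define a macro called cleanup. With that macro in scope the plain spelling would no longer compile, while the underscored spelling is untouched:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical counter, only here so the handler's effect is observable. */
int cleanup_calls = 0;

/* Cleanup handler: the compiler passes a pointer to the variable
   that is going out of scope. */
void free_buffer(char **p)
{
    free(*p);
    *p = NULL;
    cleanup_calls++;
}

/* Stand-in for some other header defining a macro named cleanup.
   That is legal, because cleanup is not a reserved identifier. */
#define cleanup 42

int use_buffer(void)
{
    /* Writing __attribute__((cleanup(free_buffer))) here would expand
       to __attribute__((42(free_buffer))) and be rejected.
       The __cleanup__ spelling is not affected by the macro. */
    char *buf __attribute__((__cleanup__(free_buffer))) = malloc(16);
    if (buf == NULL)
        return -1;
    snprintf(buf, 16, "hello");
    /* The return value is computed first; free_buffer runs afterwards,
       when buf leaves scope. */
    return (int)buf[0];
}

/* Small self-check, meant to be called from main. */
void demo(void)
{
    int first = use_buffer();
    assert(first == 'h');
    assert(cleanup_calls == 1);
    printf("first byte: %c, cleanup calls: %d\n", first, cleanup_calls);
}
```

Calling demo() from main exercises both the handler and the macro collision. This is why headers tend to use the underscored form: they cannot know what macros the including code has defined.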

Emacs custom syntactic analysis

In Emacs, the syntactic analysis is surprisingly limited.
For example, if I wish to indent parameter names differently than the types in a function declaration, like so:
void myfunction(
    int
        test
);
int is considered an arglist-intro, and test is considered an arglist-cont. However, if I add any more parameters, they'll all be considered arglist-cont, so changing the indentation of arglist-cont wouldn't have the desired effect.
So here's what I'm wondering: Is it possible to make my own syntactic analysis thingy for emacs so that it'll recognize and differentiate cases like this (this isn't the only case, by the way)? And if so, how?
Yes, of course you can write whatever you want. Emacs is free software, it comes with sources, so you can modify them as you wish.
However, please be aware that Emacs is quite widely used, including by some very smart hackers. This means that Emacs limitations usually (but, of course, not always!) have a good reason behind them (in your case, the reason is that the C syntax is quite complex). The implication is that doing what you want to do might be harder than you might be thinking. Not that it should discourage you, of course!
PS. You asked "is it possible to make my own syntactic analysis", not "how to do that" :-)
PPS. As for "how", you will have to start with cc-engine.el.

What's the recommended replacement for Perl's deprecated-ish given/when?

Now that the Perl devs have decided to sort-of deprecate given/when statements, is there a recommended replacement, beyond just going back to if/elsif/else?
if/elsif/else chains are the best option most of the time — except when something completely different is better than both if/elsif/else and given/when, which is actually reasonably often. Examples of "completely different" approaches are creating different types of objects to handle different scenarios, and letting method dispatch do your work for you, or finding an opportunity to make your code more data-driven. Both of those, if they're appropriate and you do them right, can greatly reduce the number of "switch statement" constructs in your code.
Just as a supplement, I've found that a combination of 'for' and if/elsif/else is good if you have some given/when/default code that needs to be quickly updated. Just replace given with for and replace the when statements with a cascade of if & elsif, and replace default with else. This allows all your tests to continue using $_ implicitly, requiring less rewriting. (But be aware that other special smart match features will not work any more.)
This is just for rewriting code that already uses given/when, though. For writing new code, @hobbs has the right answer.

Using regexp to index a file for imenu, performance is unacceptable

I'm producing a function for imenu-create-index-function, to index a source code module, for csharp-mode.el
It works, but delivers completely unacceptable performance. Any tips for fixing this?
The Background
I looked at js.el, which is the rebadged "espresso", included in Emacs since v23.2. It indexes JavaScript files very nicely, and does a good job with anonymous functions and various coding styles and patterns in common use. For example, in JavaScript one can do:
(function() {
    var x = ... ;
    function foo() {
        if (x == 1) ...
    }
})();
...to define a scope where x is "private" or inaccessible from other code. This gets indexed nicely by js.el, using regexps, and it indexes the inner functions (anonymous or not) within that scope also. It works quickly. A big module can be indexed in less than a second.
I tried following a similar approach in csharp-mode, but it's quite a bit more complicated. In Js, everything that gets indexed is a function. So the starting regex is "function" with some elaboration on either end. Once an occurrence of the function keyword is found, then there are 4 - 8 other regexps that get tried via looking-at - the number depends on settings. One nice thing about js mode is that you can turn on or off regexps for various coding styles, to speed things along I suppose. The default "styles" work for most of the code I tried.
This approach doesn't carry over well to csharp-mode. It works, but it performs poorly enough to make it not very usable. I think the reasons are:
There is no single marker keyword in C#, the way function serves in JavaScript. In C# I need to look for namespace, class, struct, interface, enum, and so on.
There's a great deal of flexibility in how C# constructs can be defined. As one example, a class can define base classes as well as implemented interfaces. Another example: the return type for a method isn't a simple word-like string, but can be something messy like Dictionary<String, List<String>> . The index routine needs to handle all those cases, and capture the matches. This makes it run sloooooowly.
I use a lot of looking-back. The marker I use in the current approach is the open curly brace. Once I find one of those, I use looking-back to determine if the curly belongs to a class, interface, enum, method, etc. I read that looking-back can be slow; I'm not clear on how much slower it is than, say, looking-at.
Once I find an open-close pair of curlies, I call narrow-to-region in order to index what's inside. I'm not sure whether this kills performance. I suspect that it is not the main culprit, because the perf problems I see happen in modules with one namespace and 2 or 3 classes, which means narrow gets called 3 or 4 times total.
What's the Question?
My question is: do you have any tips for speeding up imenu-like indexing in a C# buffer?
I'm considering:
avoiding looking-back. I don't know exactly how to do this because when re-search-forward finds, say, the keyword class, the cursor is already in the middle of a class declaration. looking-back seems essential.
instead of using open-curly as the marker, use the keywords like enum, interface, namespace, class
avoid narrow-to-region
any hard advice? Further suggestions?
Something I've tried and I'm not really enthused about re-visiting: building a wisent-based parser for C#, and relying on semantic to do the indexing. I found semantic to be very very very (etc) difficult to use, hard to discover, and problematic. I had semantic working for a while, but then upgraded to v23.2, and it broke, and I never could get it working again. Simple things - like indexing the namespace keyword - took a very long time to solve. I'm very dissatisfied with it and don't want to try again.
I don't really know C# syntax, and without looking at your elisp it's hard to give an answer, but here goes anyway.
looking-back can be deadly slow. It's the first thing I'd experiment with. One thing that helps a lot is using the limit arg to, say, restrict your search to the beginning of the current line. A different approach is when you hit the open curly do backward-char then backward-sexp (or whatever) to get to the front of the previous word, then use looking-at.
Using keywords to search around instead of open curly is probably what I would have done. Maybe something like (re-search-forward "\\(enum\\|interface\\|namespace\\|class\\)[ \t\n]*{" nil t) then using match-string-no-properties on the first capture group to see which of the keywords was found. This might help with the looking-back problem as well.
I don't know how expensive narrow-to-region is, but you could avoid it: when you find an open curly, do a forward-sexp inside save-excursion and keep the resulting position as a limit for the current iteration of your (I assume recursive) searches.

Keeping CL and Scheme straight in your head

Depending on my mood I seem to waffle back and forth between wanting a Lisp-1 and a Lisp-2. Unfortunately beyond the obvious name space differences, this leaves all kinds of amusing function name/etc problems you run into. Case in point, trying to write some code tonight I tried to do (map #'function listvar) which, of course, doesn't work in CL, at all. Took me a bit to remember I wanted mapcar, not map. Of course it doesn't help when slime/emacs shows map IS defined as something, though obviously not the same function at all.
So, pointers on how to minimize this short of picking one or the other and sticking with it?
map is more general than mapcar. For example, you could do the following rather than using mapcar:
(map 'list #'function listvar)
How do I keep scheme and CL separate in my head? I guess when you know both languages well enough you just know what works in one and not the other. Despite the syntactic similarities they are quite different languages in terms of style.
Well, I think that as soon as you get enough experience in both languages this becomes a non-issue (just as with similar natural languages, like Italian and Spanish). If you usually program in one language and switch to the other only occasionally, then unfortunately you are doomed to write Common Lisp in Scheme or vice versa ;)
One thing that helps is to have a distinct visual environment for both languages, using syntax highlighting in some other colors etc. Then at least you will always know whether you are in Common Lisp or Scheme mode.
I'm definitely aware that there are syntactic differences, though I'm certainly not fluent enough yet to automatically use them, making the code look much more similar currently ;-).
And I had a feeling your answer would be the case, but can always hope for a shortcut <_<.
The easiest way to keep both languages straight is to do your thinking and code writing in Common Lisp. Common Lisp code can be converted into Scheme code with relative ease; however, going from Scheme to Common Lisp can cause a few headaches. I remember one time when I was using a letrec in Scheme to store both variables and functions, and had to split it up into separate CL constructs for the variable and function namespaces respectively.
In all practicality though I don't make a habit of writing CL code, which makes the times that I do have to all the more painful.