What do these symbols mean in Emacs Lisp? - emacs

When I read some elisp code, I found something like:
(\,(* 2 \#1))
\,(format "%s %s id%d %s" \1 \2 (1+ \#) \3)
#'(bla bla)
What do symbols like "\,", "#", and "#'" mean? Which section of the manual should I look into for this kind of thing?

\, is special in replacements when using query-replace-regexp. It means "evaluate the following elisp expression, and use the resulting value in the replacement".
n.b. It's not special elsewhere (that I'm aware of), so that should be the usage you've seen.
\# is also special in the replacement string, and is substituted with the number of replacements made thus far. (i.e. an incrementing counter).
\#N (where N is a number) is a variant of \N which treats the group in question as a number rather than a string, which is useful when the expression you're evaluating requires a number.
So (\,(* 2 \#1)) would be a replacement which evaluates the expression (* 2 \#1), multiplying the number matched by the first group of the regexp by 2 to produce some value N, such that the final replacement is (N).
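Outside of an interactive query-replace-regexp, roughly the same effect can be sketched with replace-regexp-in-string and a function as the replacement. This is only an illustration, not how \, is implemented; the function name my-double-numbers, the regexp, and the sample input are made up for the example.

```elisp
;; Hypothetical helper: a non-interactive analogue of replacing
;; \([0-9]+\) with (\,(* 2 \#1)).
(defun my-double-numbers (text)
  "Return TEXT with each run of digits N replaced by \"(2N)\"."
  (replace-regexp-in-string
   "[0-9]+"
   (lambda (match)
     ;; `string-to-number' plays the role of \#1: the match as a number.
     (format "(%d)" (* 2 (string-to-number match))))
   text))

(my-double-numbers "a 3 b 10")   ; => "a (6) b (20)"
```

The lambda receives the matched text as a string, which is why the explicit conversion is needed; in the interactive replacement string, \#1 does that conversion for you.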
You can find these detailed in the manual.
C-h i g (emacs) RET, followed by a search for the syntax in question, e.g. C-s \, and then C-s again when the search fails in the current node (as it will), so that it carries on into the subsequent nodes.
#'... is shorthand for (function ...), which is a variant of '... / (quote ...) that indicates that the quoted object is a function.
As this is elisp syntax, you find it in the elisp manual:
C-h i g (elisp) RET
You can either use C-s #' or in this case it's indexed, so I #' RET also works.
(In general check the index first, and then use isearch.)
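The reader shorthand can be checked directly by evaluating a few forms, for example with M-: or in the *scratch* buffer. A small sketch; car and 1+ are just convenient built-in functions chosen for the example.

```elisp
;; The reader expands #'X to (function X), just as 'X expands to (quote X).
(equal (read "#'car") '(function car))   ; => t
(equal (read "'car")  '(quote car))      ; => t

;; Either form can be passed where a function is expected.
(mapcar #'1+ '(1 2 3))                   ; => (2 3 4)
(mapcar '1+  '(1 2 3))                   ; => (2 3 4)
```

Using #' rather than ' tells the byte compiler that the symbol is meant as a function, so it can warn when that function is not known to be defined.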

For info on backquotes, see http://www.gnu.org/software/emacs/manual/html_node/elisp/Backquote.html.
# starts the reader syntax, for instance #' is a reader alias for function.
For more info see http://definitelyaplug.b0.cx/post/emacs-reader/

#' is shorthand for referring to functions; for more details see http://www.gnu.org/software/emacs/manual/html_node/elisp/Anonymous-Functions.html
Backslash \ has two functions: it quotes the special characters (including ‘\’), and it introduces additional special constructs. More here: https://www.gnu.org/software/emacs/manual/html_node/emacs/Regexps.html#Regexps

Related

Noweb does not cross-reference Perl identifiers delimited on the left by @

Consider this Noweb source file named quux.nw:
\documentclass{article}
\usepackage{noweb}
\usepackage[colorlinks]{hyperref}
\begin{document}
<<quux.pl>>=
my @foo ;
my $bar ;
my %baz ;
@ %def foo bar baz
\end{document}
and compiled using the commands:
$ noweb quux.nw
$ latexmk -pdf quux.tex
The identifiers bar and baz are properly highlighted as identifiers and cross referenced in the PDF output. The identifier foo is not.
It's my understanding that Noweb has a very simple heuristic for recognizing identifiers. foo should be recognizable as an identifier because, like bar and baz, it begins with an alphanumeric, is delimited on the left by a symbol (at-sign), and is delimited on the right by a delimiter (whitespace).
I considered the possibility that the at-sign was being interpreted by Noweb as an escape and tried doubling it, but that (i) did not solve the problem, and (ii) introduced the syntax error my @@foo into quux.pl. This makes sense because according to the fine manual, a double at-sign is only treated specially in columns 1–2.
Noweb treats @ as alphanumeric, with the rationale that it “helps LaTeX”. I did not find anything about this in the Noweb manual. This is documented only in the Noweb source file finduses.nw, line 24, in Noweb version 2.12.
Apparently, when writing your own LaTeX package, any macro you define has public scope. To write “private” macros, the trick is to temporarily reclass the @ as a letter at the top of the package, incorporate an @ into the name of each “private” macro, and restore the class of @ at the bottom of the package. The macro remains public, but is impossible to call because the name gets broken up into multiple lexemes. (A user can still call such a macro by reclassing @ as a letter before the call, but if they do that, they assume the risk.)
So yes, @ should be included as an alphanumeric character when the code block contains a LaTeX package.
The full list of symbols treated as alphanumeric by Noweb is:
_ ' @ #
The _ is treated as an identifier character in many programming languages, so Noweb is right to treat it as alphanumeric.
The # is treated as alphanumeric to “avoid false hits on C preprocessor directives”.
No explanation is given for treating the ' as alphanumeric.
Ideally, Noweb would support separate character class schemes for each source language. But as I understand it, Noweb has only the one global character class scheme, and no support for changing it (other than modifying the source).
Fortunately, Perl has alternate syntaxes for array identifiers that work around this limitation. Instead of @foo you can write @{foo} or even @ foo and it will work.

get symbol-name without uppercase

Is it possible in Common Lisp to get a symbol-name without the uppercase result?
(symbol-name 'aAbB)
;; => "AABB"
(OTHER_FUNCTION? 'aAbB)
;; => "aAbB"
I would like to use a symbol name as a string but case-sensitive.
Your symbol is actually all uppercase, because the reader already upcases it. In order to prevent that, you can either use a different readtable-case or escape the symbol, using either enclosing pipe symbols: '|aAbB| or a backslash for the next character: '\aA\bB.
There is quite a full answer on this question: Why is Common Lisp case insensitive
"The readtable objects has an attribute, readtable-case, that controls how the reader interns and evaluates the symbols read. you can setf readtable-case to :upcase(default), :downcase, :preserve, :invert.
By default, the readtable-case is set to :upcase, which causes all symbols to be converted to upcase."

What do the elisp expressions (1+ (buffer-size)) and (+ 1 (buffer-size)) mean?

I'm very very new in elisp and just started learning it. I have seen the following expressions in the document:
(1+ (buffer-size))
(+ 1 (buffer-size))
What do they mean? As far as I know, elisp uses prefix notation, so the second one should be the correct one. But both of them can be executed without any errors. The first one is from the documentation of the point-max function.
Thanks.
The token 1+ is an identifier which denotes a symbol. This symbol has a binding as a function, and so (1+ arg) means "call the 1+ function, with the value of arg as its argument". The 1+ function returns 1 plus the value of its argument.
The syntax (+ 1 arg) is a different way to achieve that effect. Here the function is named by the symbol +. The + function receives two arguments which it adds together.
In many mainstream programming languages popular today, the tokenization rules are such that there is no difference between 1+ and 1 +: both denote a numeric constant followed by a + token. Lisp tokenization is different. Languages in the Lisp family usually support tokens that can contain digits and non-alphanumeric characters. I'm looking at the Emacs Lisp reference manual and do not see a section about the logic which the read function uses to convert printed representations to objects.
Typically, "Lispy" tokenizing behavior is something like this: a token is scanned first without regard for what kind of token it is, by accumulating characters which are valid token constituents and stopping at a character which is not a token constituent. For instance, when the input is abcde(f, the token that will be extracted is abcde. The ( character terminates the token (and stays in the input stream). Then the resulting clump of characters abcde is re-examined, classified, and converted to an object based on what it looks like, according to the rules of the given Lisp dialect.
Across Lisp dialects, we can broadly depend on a token of all alphabetic characters to denote a symbol, and a token of all digits (possibly with a leading sign) to denote an integer. 1+ has a trailing + though, which is different!
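How Emacs Lisp classifies these tokens can be checked by evaluating a few forms, for example with M-: or in the *scratch* buffer. A small sketch; the numbers are arbitrary.

```elisp
(symbolp '1+)               ; => t     the token 1+ reads as a symbol
(fboundp '1+)               ; => t     and that symbol names a function
(symbol-name (read "1+"))   ; => "1+"
(integerp '+1)              ; => t     whereas +1 reads as the integer 1
(1+ 41)                     ; => 42
(+ 1 41)                    ; => 42
```

So a trailing sign makes the token a symbol, while a leading sign followed only by digits still makes it an integer.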

How to escape double quote?

In org mode, if I want to format text as monospace verbatim, i.e. ~...~, and it is inside quotes, as in ~"..."~, it is not formatted (it is left as is).
Also, are quotes a reserved symbol, if so, what do they mean? (they don't seem to affect the generated HTML / inside Emacs display).
The culprit in this case is the regular expression in org-emph-re / org-verbatim-re, which is responsible for determining whether a sequence of characters in the document is to be set verbatim or not.
org-verbatim-re is a variable defined in `org.el'.
Its value is
"\([ ('\"{]\|^\)\(\([=~]\)\([^
\n,\"']\|[^
\n,\"'].?\(?:\n.?\)\{0,1\}[^
\n,\"']\)\3\)\([- .,:!?;'\")}\]\|$\)"
quotes and double quotes are explicitly forbidden inside verbatim characters =~ by
[^
\n,\"']\|[^
\n,\"']
I found discussions dating back 3 years coming to the conclusion that you have to tinker with this regular expression and set the variable org-emph-re/org-verbatim-re to something that matches your wishes in your emacs setup (maybe a file local variable works as well). You can experiment by removing double quotes from the excluding character classes and from the outside matches, as in
"\([ ('{]\|^\)\(\([*/_=~+]\)\([^
\n,']\|[^
\n,'].?\(?:\n.?\)\{0,1\}[^
\n,']\)\3\)\([- .,:!?;')}\]\|$\)"
but looking at that regex, heaven knows what happens to complex documents -- you have to try...
Edit: as it happens, if I evaluate the following as a region, quotes inside = are exported correctly, but nothing else is :-). I'll investigate further when I have more time.
(setq org-emph-re "\([ ('{]\|^\)\(\([*/_=~+]\)\([^
\n,']\|[^
\n,'].?\(?:\n.?\)\{0,1\}[^
\n,']\)\3\)\([- .,:!?;')}]\|$\)")
Edit 2: Got it to work by changing org.el directly:
Change the line following (defvar org-emphasis-regexp-components from '(" \t('\"{" "- \t.,:!?;'\")}\\" " \t\r\n,\"'" "." 1) to '(" \t('{" "- \t.,:!?;')}\\" " \t\r\n,'" "." 1) and recompile org then restart emacs.
This was a defcustom prior to the 8.0 release; it isn't anymore, so you have to live with this manual modification.
regards,
Tom
Finally, I found a solution from http://comments.gmane.org/gmane.emacs.orgmode/82571
According to that thread, the regexp for verbatim is built from the variable org-emphasis-regexp-components, which defines the legal characters before, after, at the border of, or in the body of emphasis; and verbatim is one of the emphasis environments in org mode.
A workable setting given by that thread:
(setcar (nthcdr 2 org-emphasis-regexp-components) " \t\n,")
(custom-set-variables `(org-emphasis-alist ',org-emphasis-alist))
For small numbers of characters that have some unwanted effect in Emacs org-mode (because they are metacharacters), it may be helpful to have a look at the special symbols in org-mode (org-entities.el).
So for example " can be encoded by \quot{} (where the braces pair at the end is not mandatory, but needed if no whitespace follows).
So instead ="..."= you would write =\quot{}...\quot{}=.
That is some more typing and looks pretty ugly. For the latter, though, org-mode has a solution: with C-c C-x \ you can toggle a display magic for those symbols. If the magic is active, a " is displayed directly after you type \quot{}.
Besides, this symbols list can easily be extended, e.g.
(add-to-list 'org-entities
'("backslash" "\\textbackslash" nil "\\" "\\" "\\" "\\"))
Nevertheless I am heavily missing easier escaping in org-mode, besides the above solution and besides escaping a whole line by a : at its beginning.
I'd be happy if =verbatim= in all cases would leave the text between the ='s unchanged. Not =this*bold*text=, but =this *bold* text=. Like we know that from each well-designed markup/-down language.
But, of course, this is better placed at the org-mode development pages. Ideally with a fitting patch... :-)
I ran into a similar problem; thanks to @chaiko for a basic solution. However, @chaiko's solution only works for org-mode's fontification; it doesn't affect org-export. To get a correctly exported document, you need one more hack to org-mode's parser: (org-element--set-regexps).
So the full code snippets should be something like:
(setcar (nthcdr 2 org-emphasis-regexp-components) " \t\n\r")
(custom-set-variables `(org-emphasis-alist ',org-emphasis-alist))
(org-element--set-regexps)
I've integrated this to my oh-my-emacs project: https://github.com/xiaohanyu/oh-my-emacs/blob/e82fce10d47f7256df6d39e32ca288d0ec97a764/core/ome-org.org#code-block-fontification .

Is it possible to change an emacs syntax table based on context?

I'm working on improving an emacs major mode for UnrealScript. One of the (many) quirks is that it allows syntax like this for specifying tooltips in the Unreal editor:
var() int MyEditorVar <Foo=Bar|Tooltip=My tooltip text isn't quoted>;
The angle brackets after the variable declaration denote a pipe-separated list of Key=Value metadata pairs, and the metadata is not quoted but can contain quote marks -- a pipe (|) or right angle bracket (>) denotes the end.
Is there a way I can get the emacs syntax table to recognize this context-dependent syntax in a useful way? I'd like everything except for pipes and right angle brackets to be highlighted in some way inside of these variable metadata declarations, but otherwise retain their normal highlighting.
Right now, the single quote character is set up to be a quote delimiter (syntax designator "), so font-lock-mode interprets such a quote as starting a quoted string, which it's not in this very specific instance, so it mishighlights everything until it finds another supposedly matching single quote.
You'll need to set up a syntax-propertize-function, which lets you apply different syntax designators to different characters in the buffer, depending on their context.
Grep for syntax-propertize-function in Emacs's lisp directory to see various examples (from simple to pretty complex ones).
You'll probably want to mark the "=" chars after your "Foo" and after your "Tooltip" as "generic string delimiter", then do the same with the corresponding terminating "|" and ">". An alternative could be to mark the char before the ">" as a (closing) generic string delimiter, so that you can then mark the "<" and ">" as open&close parens.
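The first suggestion (generic string delimiters on each "=" and on the terminating "|" or ">") might look roughly like the sketch below. It is untested against real UnrealScript: the my-uscript- names are hypothetical, and the regexps assume that a metadata list sits on a single line and contains no nested "<" or ">", so a line with both a "<" and a ">" comparison would be misread as metadata.

```elisp
;; Sketch only: names are hypothetical, regexps are simplifying assumptions.
(defun my-uscript-syntax-propertize (start end)
  "Mark Key=Value metadata values between START and END as generic strings."
  (goto-char start)
  (while (re-search-forward "<[^<>\n]*>" end t)
    (let ((meta-end (match-end 0))
          (fence (string-to-syntax "|")))   ; generic string delimiter
      (save-excursion
        (goto-char (match-beginning 0))
        ;; Each value runs from its "=" to the next "|" or ">".
        (while (re-search-forward "\\(=\\)[^|>\n]*\\([|>]\\)" meta-end t)
          (put-text-property (match-beginning 1) (match-end 1)
                             'syntax-table fence)
          (put-text-property (match-beginning 2) (match-end 2)
                             'syntax-table fence))))))

;; In the major mode's body (assumed, not part of the existing mode):
;; (setq-local syntax-propertize-function #'my-uscript-syntax-propertize)
```

With this in place, the apostrophe in "isn't" falls inside a generic string and no longer opens a quoted string of its own, while the values themselves pick up string highlighting from font-lock.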