Is it possible to change an emacs syntax table based on context?


I'm working on improving an emacs major mode for UnrealScript. One of the (many) quirks is that it allows syntax like this for specifying tooltips in the Unreal editor:
var() int MyEditorVar <Foo=Bar|Tooltip=My tooltip text isn't quoted>;
The angle brackets after the variable declaration denote a pipe-separated list of Key=Value metadata pairs, and the metadata is not quoted but can contain quote marks -- a pipe (|) or right angle bracket (>) denotes the end.
Is there a way I can get the emacs syntax table to recognize this context-dependent syntax in a useful way? I'd like everything except for pipes and right angle brackets to be highlighted in some way inside of these variable metadata declarations, but otherwise retain their normal highlighting.
Right now, the single quote character is set up as a string delimiter (syntax designator "), so font-lock-mode treats the apostrophe in "isn't" as the start of a quoted string, which it is not in this specific context, and mishighlights everything up to the next single quote.

You'll need to set up a syntax-propertize-function, which lets you apply different syntax designators to different characters in the buffer, depending on their context.
Grep for syntax-propertize-function in Emacs's lisp directory to see various examples (from simple to pretty complex ones).
You'll probably want to mark the "=" chars after your "Foo" and after your "Tooltip" as "generic string delimiter", then do the same with the corresponding terminating "|" and ">". An alternative could be to mark the char before the ">" as a (closing) generic string delimiter, so that you can then mark the "<" and ">" as open&close parens.
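A minimal sketch of the first suggestion, assuming the whole metadata list sits on the same line as its var declaration. The function name and the unrealscript-mode-hook hook are hypothetical placeholders, not part of any existing mode; adapt them to the actual major mode.

```elisp
;; Hypothetical names; a sketch under the assumptions above.
(defun my-uscript-syntax-propertize (start end)
  "Mark metadata values between START and END as generic strings."
  (goto-char start)
  ;; Find a `var' line that contains a `<'.
  (while (re-search-forward "^[ \t]*var\\_>[^;<\n]*<" end t)
    (let ((limit (line-end-position)))
      ;; A value runs from its `=' to the next `|' or `>'.
      (while (re-search-forward "\\(=\\)[^|>\n]*\\([|>]\\)" limit t)
        (dolist (group '(1 2))
          ;; "|" is the syntax descriptor for a generic string delimiter.
          (put-text-property (match-beginning group) (match-end group)
                             'syntax-table (string-to-syntax "|")))))))

(add-hook 'unrealscript-mode-hook   ; assumed hook name
          (lambda ()
            (setq-local syntax-propertize-function
                        #'my-uscript-syntax-propertize)))
```

With this in place each value should be parsed as a string, so an apostrophe inside it no longer opens a quoted string. Values that span several lines are not handled by this sketch.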

Related

How to add a vector notation above a variable name composed of multiple characters?

In Julia it is possible to add Unicode characters with LaTeX-like syntax. All allowed Unicode characters can be found here. For example, it is possible to add a right arrow over a character with this simple code
F\vec[TAB]
and it produces the following character
But I couldn't find a syntax to add the same right arrow over a whole word as \vec seems to always add the arrow over the previous character and does not allow to group them. For example
force\vec[TAB]
produces
Does a syntax for this feature exist?

How to insert breakpoint using symbols include "<>" (angle brackets)

I want to insert a breakpoint in windbg, using symbols named "TSmartPointer::TSmartPointer".
bp TSmartPointer<class CDataMemberMgr>::TSmartPointer<class CDataMemberMgr>
WinDbg reported that no symbols were found.
I used the x command to search for the symbol, but again no symbols were found:
x TSmartPointer<class CDataMemberMgr>::TSmartPointer<class CDataMemberMgr>
When I replace "<" and ">" with "*", WinDbg can find the symbols:
x TSmartPointer*class CDataMemberMgr*::TSmartPointer*class CDataMemberMgr*
Am I wrong? How can I insert this breakpoint?

I could not find this in WinDbg's built-in help, but the Microsoft documentation covers it, which makes me wonder a bit about the spaces as well:
To set a breakpoint on complicated functions, including functions that contain spaces, as well as a member of a C++ public class, enclose the expression in parentheses. For example, use bp (??MyPublic) or bp (operator new).
Furthermore, it explicitly talks about angle brackets:
You must start with the three symbols @!" and end with a quotation mark ("). Without this syntax, you cannot use spaces, angle brackets (<, >), or other special characters in symbol names in the MASM evaluator.
(emphasis mine)
So, in your case, the following should work:
bp @!"TSmartPointer<class CDataMemberMgr>::TSmartPointer<class CDataMemberMgr>"
The quotation marks should take care of the spaces as well.
And to preserve a comment by @Kurt Hutchinson:
For template classes, it's important to use the exact spacing and angle bracket placement that WinDbg wants. Sometimes there will be an extra space in there that is significant. You can tell what it should be by doing a symbol lookup first, like x MSHTML!TSmartPointer*CDataMemberMgr*. WinDbg should do a wildcard match and print out a bunch of symbol names. Then copy and paste the correct name from that list, using the @!"..." quoting. Don't try to retype the symbol name yourself, because spaces matter and if you miss one, WinDbg won't match it correctly.

.tmlanguage escape sequences and rule priorities

I'm implementing a syntax highlighter in Apple's Swift language by parsing .tmlanguage files and applying styles to an NSMutableAttributedString.
I'm testing with JavaScript code, a javascript.tmlanguage file, and the monokai.tmtheme theme (the latter two bundled with Sublime Text 3) to check that the syntax gets highlighted correctly. By applying each rule (pattern) in the .tmlanguage file in the order it appears, the syntax is almost perfectly highlighted.
The problem I'm having right now is that I don't know how to tell that a quote (") is escaped when it has a backslash before it (\"). Am I missing something in the .tmlanguage file that specifies that? The other problem is that I have no idea how to tell that some rules should be ignored when inside others, for example:
I'm getting double slashes taken as comments when inside strings: "http://stackoverflow.com/" a url is recognised as comment after //
Also double or single quotes are taken as strings when inside comments: // press "Enter" to continue, the word "Enter" gets highlighted as string when should be same color as comments
So, I don't know if there is some priority for some rules over others in the convention, or if there is something in the files that I haven't noticed.
Help please!
Update:
Here is a better example of what I meant by escape quotes:
I'm getting this: while all the letters should be yellow except for the escaped sequence (/") which should be blue.
The question is. How do I know that /" should be escaped? The rule for that piece of code is:

This may be a late answer, but you can try one of the following.
(Ugly) In your end regex, use ([^/])(") and set endCaptures to:
1 = string.quote.double.js
2 = punctuation.definition.string.end.js
If the string must be on a single line, you can instead use match=(")(.*)("), with captures:
1 = punctuation.definition.string.begin.js
2 = string.quote.double.js
3 = punctuation.definition.string.end.js
and then apply your patterns.
You can also try applyEndPatternLast and see whether it is allowed; setting applyEndPatternLast=1 will do.

The priority is that earlier rules in the file are prioritized over later rules. As an example, in my Python Improved language definition, I have a scope that contains a series of all-caps constants used in Django, a popular Python web framework. I also have a generic constant.other.allcaps.python scope that recognizes (just about) anything in all caps. Since the Django constants rule is before the allcaps rule in the .tmLanguage file, I can color it with a theme using one color, while the later-occurring "highlight everything in all caps" only grabs identifiers that are NOT part of the first list.
Because of this, you should put your "comments" scope(s) as early in the file as possible, then write your parser in such a way that it obeys the rule I described above. However, it's slightly more complicated than that, as I believe items in the repository are prioritized based on where their include line is, not where the repository rule is defined in the file. You may want to do some testing to verify that, though.
Unfortunately I'm not sure what you mean about the escaped quotes - could you expand on that, and maybe add an example or two?
Hope this helps.

Assuming that / is the correct character for escaping a double quote mark, the following should work:
"str_double_quote": {
    "begin": "\"",
    "end": "\"",
    "name": "string.quoted.double.swift",
    "patterns": [
        {
            "name": "constant.character.escape.swift",
            "match": "/[\"/]"
        }
    ]
}
You can match an escaped double quote mark (/") and a literal forward slash (//) in the patterns to consume them before the end marker is used to handle them.
If the character for escaping is actually a backslash, then the tricky bit is that there are two levels of escaping, for the JSON encoding as well as the regular expression syntax. To match \", the regular expression requires you to escape the backslash (\\"). JSON requires you to escape backslashes and double quotes, resulting in \\\\\" in a TextMate JSON grammar file. The match expression would thus be \\\\[\"\\\\].

What do these symbols mean in Emacs Lisp?

When I read some elisp code, I found something like:
(\,(* 2 \#1))
\,(format "%s %s id%d %s" \1 \2 (+1 \#) \3)
#'(bla bla)
What do symbols like "\,", "#", and "#'" mean? Which section of the manual should I look into for this kind of thing?

\, is special in replacements when using query-replace-regexp. It means "evaluate the following elisp expression, and use the resulting value in the replacement".
n.b. It's not special elsewhere (that I'm aware of), so that should be the usage you've seen.
\# is also special in the replacement string, and is substituted with the number of replacements made thus far. (i.e. an incrementing counter).
\#N (where N is a number) is a variant of \N which treats the group in question as a number rather than a string, which is useful when the expression you're evaluating requires a number.
So (\,(* 2 \#1)) would be a replacement which evaluates the expression (* 2 \#1), multiplying the number matched by the first group of the regexp by 2 to produce some value N, such that the final replacement is (N).
You can find these detailed in the manual.
C-h i g (emacs) RET followed by a search for the syntax in question, e.g. C-s \, and then C-s again when the search fails in the current node (as it will), so that it continues into the subsequent nodes.
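As a rough non-interactive counterpart to the (\,(* 2 \#1)) replacement described above, the same effect can be sketched with replace-regexp-in-string and a function as the replacement. The regexp and the sample string are made up for illustration.

```elisp
;; Double every integer and wrap the result in parentheses.
(replace-regexp-in-string
 "\\([0-9]+\\)"
 (lambda (match)
   ;; Plays the role of \#1: group 1 converted to a number.
   (format "(%d)" (* 2 (string-to-number (match-string 1 match)))))
 "width 21 height 4")
;; evaluates to "width (42) height (8)"
```

The \, and \# forms themselves are only understood in the interactive replacement string, which is why Lisp code has to spell out the conversion and the arithmetic.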
#'... is short-hand for (function ...) which is a variant of '... / (quote...) which indicates that the quoted object is a function.
As this is elisp syntax, you find it in the elisp manual:
C-h i g (elisp) RET
You can either use C-s #' or in this case it's indexed, so I #' RET also works.
(In general check the index first, and then use isearch.)
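A short illustration of the #' shorthand, using only built-in functions:

```elisp
;; The reader expands #'car into the list (function car).
(equal (read "#'car") '(function car))   ; evaluates to t

;; Both calls pass the same function to mapcar.
(mapcar #'1+ '(1 2 3))                   ; evaluates to (2 3 4)
(mapcar (function 1+) '(1 2 3))          ; evaluates to (2 3 4)
```

Using #' rather than a plain quote tells the byte compiler that the symbol names a function, so it can warn if that function is not defined.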

For info on backquotes, see http://www.gnu.org/software/emacs/manual/html_node/elisp/Backquote.html.
# starts the reader syntax, for instance #' is a reader alias for function.
For more info see http://definitelyaplug.b0.cx/post/emacs-reader/

The #' is a short hand for using functions, for more details see here: http://www.gnu.org/software/emacs/manual/html_node/elisp/Anonymous-Functions.html
Backslash \ has two functions: it quotes the special characters (including ‘\’), and it introduces additional special constructs. More here: https://www.gnu.org/software/emacs/manual/html_node/emacs/Regexps.html#Regexps

In emacs, how do I force certain characters to act as end of statement delineators?

I've created a new major mode derived from cc-mode, because I'm using a meta-language that is mostly C-like, but is parsed to generate code automatically.
Say I have something like this:
struct MyNewStruct
{
    int newInt = 32;
    {
        [flag, different-flag]
        string newString = "foo";
    }
}
I need the ']' character to act like a ';'; otherwise the next line, which declares the string, doesn't indent properly.
I've tried using M-x modify-syntax-entry for ']' and making it both a closing character as well as a punctuation character (according to the GNU manual on syntax tables), but it doesn't look like it's allowed to belong to two character classes simultaneously (unless one of those character classes is a comment). (And if it's just a punctuation character, that causes other problems.)
I can't change the grammar of the meta-language, so adding a semicolon after the close bracket isn't possible.

In this case, the real answer was to pick something that was syntactically closer to my meta-language. csharp-mode already parses the brackets correctly and marks sections enclosed in brackets as statements, not statement-cont.
