Regex to match anything between = object and ( - visual-studio-code

I'm developing a VS Code extension to support a new language, and for syntax highlighting I want to match any text between = object and (.
I tried the following Regex:
{
"name": "entity.name.class",
"match": "(?<==\\s*object).*?(?=\\()"
},
But when I add this to my grammar file, it breaks all the other rules that were working: everything turns white again.
The regex /(?<==\s*object).*?(?=\()/g works on https://regexr.com/ with the following text:
!var = object REAL()
!var = object BORE(!bore)
!var =object REAL ()
!var =object BORE (!bore)
VS Code doesn't give me any exception or hint about why this regex is not working. Does anyone have a clue why it fails in VS Code?

You mention that adding your new pattern causes your test file to lose all syntax highlighting. This leads me to believe that you have a syntax issue within the actual grammar that's preventing any syntax pattern application. Your regular expression, while not optimal, seems to be syntactically correct, so I doubt that's the cause of your problem. For future reference, VS Code only applies syntax pattern matching one line at a time. This means that if your regex was incorrect, only the lines that match the pattern would lack the syntax highlighting, rather than the whole document.
Hopefully this answer isn't too late for your problem.
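One thing that may be worth ruling out, offered as an assumption rather than a confirmed diagnosis: regex engines differ in whether a lookbehind may contain a variable-width part such as \s*. regexr.com runs the browser's JavaScript engine, which accepts it, while other engines reject the pattern outright. The sketch below uses Python's re module only as a stand-in for a stricter engine (it is not the engine VS Code uses) and shows a capture-group alternative that avoids lookbehind entirely. The names SAMPLES, CLASS_NAME, compiles and class_names are made up for this illustration.

```python
import re

# The four test lines from the question.
SAMPLES = [
    "!var = object REAL()",
    "!var = object BORE(!bore)",
    "!var =object REAL ()",
    "!var =object BORE (!bore)",
]

# Capture-group alternative: match the whole "= object NAME (" run and
# put only the name in group 1, so no lookbehind is needed.
CLASS_NAME = re.compile(r"=\s*object\s+(\w+)\s*\(")


def compiles(pattern):
    """Return True if this engine accepts the pattern at all."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


def class_names(lines):
    """Extract the name between '= object' and '(' from each matching line."""
    return [m.group(1) for line in lines if (m := CLASS_NAME.search(line))]


if __name__ == "__main__":
    # Python's re rejects a lookbehind whose width is not fixed.
    print(compiles(r"(?<==\s*object).*?(?=\()"))
    print(class_names(SAMPLES))
```

In a TextMate-style grammar the same idea is usually expressed by giving the rule a "captures" entry that assigns the scope name to group 1 instead of putting "name" on the whole match.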

Related

How to fix ICU Lexing Error: Unexpected character in Flutter

I am using flutter_localizations to localize my app.
Since updating to Flutter 3.7 I am getting this error:
ICU Syntax Error: Expected "identifier" but found "}".
This =|(){}[] obviously
This =|\(){}[] obviously is the text that I have in my .arb file.
I understand that curly braces "{}" have a special meaning and should be escaped, but I cannot find a way to escape them correctly. Has anyone managed to do so?
One simple way to reproduce the issue is simply following the steps to add localization support here, and then instead of the hello world string, write anything that includes the character "{".
P.S.: There is a related issue open on GitHub. Be sure to check there for updates!
There is an escaping syntax that is implemented but not enabled by default as it is a new feature that wasn't completely backward compatible with existing ICU message strings.
First, add the following to your l10n.yaml file:
use-escaping: true
Then, this will allow you to wrap parts of your strings in single quotes to ignore any syntax within the single quotes; to use a single quote normally as a character and not an escape, use a double single quote. For example,
{
  "message": "This '{isn''t}' obvious"
}
becomes
String get message => "This {isn't} obvious";
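As a rough illustration of the quoting rule described above, here is a toy model in Python. It is not Flutter's actual parser and only handles literal text: a lone single quote toggles a quoted section, and a doubled single quote yields one literal quote. The function name apply_single_quote_escaping is hypothetical.

```python
def apply_single_quote_escaping(message: str) -> str:
    """Toy model: strip quoting marks and collapse '' into a literal '."""
    out = []
    in_quotes = False
    i = 0
    while i < len(message):
        ch = message[i]
        if ch == "'":
            if message[i + 1 : i + 2] == "'":
                # Doubled quote: emit one literal single quote.
                out.append("'")
                i += 2
            else:
                # Lone quote: start or end a quoted (literal) section.
                in_quotes = not in_quotes
                i += 1
        else:
            out.append(ch)
            i += 1
    if in_quotes:
        raise ValueError("unterminated quoted section")
    return "".join(out)


if __name__ == "__main__":
    print(apply_single_quote_escaping("This '{isn''t}' obvious"))
```

A real implementation would also have to parse placeholders outside the quoted sections, which this sketch leaves untouched.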
See here for information on the syntax. I'll add this to the documentation later.

.tmlanguage escape sequences and rule priorities

I'm implementing a syntax highlighter in Apple's Swift language by parsing .tmlanguage files and applying styles to an NSMutableAttributedString.
I'm testing with JavaScript code, a javascript.tmlanguage file, and the monokai.tmtheme theme (the latter two are included in Sublime Text 3) to check that the syntax gets highlighted correctly. By applying each rule (pattern) in the .tmlanguage file in the order in which it appears, the syntax is almost perfectly highlighted.
The problem I'm having right now is that I don't know how to tell that a quote (") should be treated as escaped when it has a backslash before it (\"). Am I missing something in the .tmlanguage file that specifies that? The other problem is that I have no idea how to tell that some rules should be ignored when inside others, for example:
I'm getting double slashes taken as comments when inside strings: "http://stackoverflow.com/" a url is recognised as comment after //
Also, double or single quotes are taken as strings when inside comments: in // press "Enter" to continue, the word "Enter" gets highlighted as a string when it should be the same color as the comment
So, I don't know if there is some priority for some rules over others in the convention, or if there is something in the files that I haven't noticed.
Help please!
Update:
Here is a better example of what I meant by escape quotes:
I'm getting this: while all the letters should be yellow except for the escaped sequence (/") which should be blue.
The question is: how do I know that /" should be escaped? The rule for that piece of code is:
Maybe I am late to answer this. You can apply one of the following methods.
(Ugly) In your end regex, use ([^/])(") and in your endCaptures, it would be
1 = string.quote.double.js
2 = punctuation.definition.string.end.js
If the string must be single line, you can use match=(")(.*)("), captures=
1 = punctuation.definition.string.begin.js
2 = string.quote.double.js
3 = punctuation.definition.string.end.js
and use your patterns
You can also try applyEndPatternLast and see if it is allowed. Setting applyEndPatternLast=1 should do.
The priority is that earlier rules in the file are prioritized over later rules. As an example, in my Python Improved language definition, I have a scope that contains a series of all-caps constants used in Django, a popular Python web framework. I also have a generic constant.other.allcaps.python scope that recognizes (just about) anything in all caps. Since the Django constants rule is before the allcaps rule in the .tmLanguage file, I can color it with a theme using one color, while the later-occurring "highlight everything in all caps" only grabs identifiers that are NOT part of the first list.
Because of this, you should put your "comments" scope(s) as early in the file as possible, then write your parser in such a way that it obeys the rule I described above. However, it's slightly more complicated than that, as I believe items in the repository are prioritized based on where their include line is, not where the repository rule is defined in the file. You may want to do some testing to verify that, though.
Unfortunately I'm not sure what you mean about the escaped quotes - could you expand on that, and maybe add an example or two?
Hope this helps.
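To make the priority idea concrete, here is a minimal Python sketch of a single-line tokenizer. It assumes the behaviour commonly described for TextMate-style engines: at each position the rule whose match starts earliest wins, and rule order only breaks ties. The RULES table and the tokenize function are hypothetical and much simpler than a real .tmlanguage grammar.

```python
import re

# Hypothetical rule list: (scope name, pattern). Order only breaks ties.
RULES = [
    ("comment.line", re.compile(r"//.*")),
    ("string.quoted.double", re.compile(r'"(?:\\.|[^"\\])*"')),
]


def tokenize(line):
    """Return (scope, text) pairs, letting the earliest-starting match win."""
    tokens = []
    pos = 0
    while pos < len(line):
        best = None
        for scope, pattern in RULES:
            m = pattern.search(line, pos)
            # Strict '<' means an earlier-listed rule wins when starts are equal.
            if m and m.end() > m.start() and (best is None or m.start() < best[1].start()):
                best = (scope, m)
        if best is None:
            break
        scope, m = best
        tokens.append((scope, m.group(0)))
        # Everything inside the winning match is consumed, so a "//" inside
        # a string (or a quote inside a comment) is never looked at again.
        pos = m.end()
    return tokens


if __name__ == "__main__":
    for token in tokenize('var u = "http://stackoverflow.com/"; // press "Enter"'):
        print(token)
```

Because the scanner resumes after the end of each winning match, the URL stays a string and the quoted word stays part of the comment, which are the two cases from the question.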
Assuming that / is the correct character for escaping a double quote mark, the following should work:
"str_double_quote": {
  "begin": "\"",
  "end": "\"",
  "name": "string.quoted.double.swift",
  "patterns": [
    {
      "name": "constant.character.escape.swift",
      "match": "/[\"/]"
    }
  ]
}
You can match an escaped double quote mark (/") and a literal forward slash (//) in the patterns to consume them before the end marker is used to handle them.
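A minimal sketch of how a parser might apply such a begin/end rule, assuming (as in the rule above) that / is the escape character. The inner pattern is tried at each position before the end marker, so an escaped quote never terminates the string. BEGIN, END, ESCAPE and scan_string are names invented for this example.

```python
import re

BEGIN = re.compile(r'"')
END = re.compile(r'"')
ESCAPE = re.compile(r'/["/]')  # mirrors the "match" of the inner pattern


def scan_string(text, start):
    """Scan the string that begins at text[start].

    Returns (end_index, escapes) where end_index is one past the closing
    quote and escapes lists the escape sequences found inside.
    """
    m = BEGIN.match(text, start)
    if not m:
        raise ValueError("no string starts here")
    pos = m.end()
    escapes = []
    while pos < len(text):
        esc = ESCAPE.match(text, pos)
        if esc:
            # Inner pattern wins: consume the escape so END never sees it.
            escapes.append(esc.group(0))
            pos = esc.end()
            continue
        if END.match(text, pos):
            return pos + 1, escapes
        pos += 1
    # Unterminated string: treat the rest of the text as string content.
    return len(text), escapes


if __name__ == "__main__":
    source = 'x = "say /"hi/" now" + y'
    end, found = scan_string(source, 4)
    print(source[4:end], found)
```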
If the character for escaping is actually a backslash, then the tricky bit is that there are two levels of escaping, for the JSON encoding as well as the regular expression syntax. To match \", the regular expression requires you to escape the backslash (\\"). JSON requires you to escape backslashes and double quotes, resulting in \\\\\" in a TextMate JSON grammar file. The match expression would thus be \\\\[\"\\\\].
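The two escaping levels can be checked mechanically. The short Python sketch below decodes a JSON fragment containing the match expression from the paragraph above and then compiles the decoded text as a regular expression. The variable names are made up, and Python's re is only a stand-in for the grammar's regex engine.

```python
import json
import re

# What you would type in the JSON grammar file (raw string: no Python escaping).
grammar_fragment = r'{"match": "\\\\[\"\\\\]"}'

# Level 1: JSON decoding turns \\\\ into \\ and \" into ".
pattern_text = json.loads(grammar_fragment)["match"]

# Level 2: the regex engine reads \\ as one literal backslash,
# followed by a character class containing " and \.
escape_re = re.compile(pattern_text)

if __name__ == "__main__":
    print(pattern_text)
    print(escape_re.findall(r'say \"hi\" and \\'))
```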

Sublime syntax highlighting perl qw, qq, q not working fully

I have been using Sublime for a week, and I'm very pleased with it. But I have a small problem: I write Perl in Sublime.
Here is the problem:
Sublime does not recognize that 'some string is quoted, and it highlights $test_scalar and everything after it as if it were a string. When I type it like this:
There is no problem.
I tried looking at the Perl.tmLanguage file, but I did not understand it.
Can someone help me please?
Perl is one of the few programming languages that use this type of construct for quoting strings, and many program editors simply don't get it.
Imagine you're writing a syntax highlighter, and you have to understand all of these are the same:
my $string = "this is my string";
my $string = qq(This is my string);
my $string = qq/This is my string/;
my $string = qq#This is my string#;
my $string = qq
(This is my string);
Your syntax highlighter would have to understand that q, qq, and qx are quoting options, and that the character following them (after possible white space) is the character that's doing the quoting. Oh, and also that if the character is a (, a {, or a [, the closing quote is a ), }, or a ]. And, that this can be on more than one line. And, you really only need this for Perl.
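The rules just listed can be sketched in a few lines of Python. This is a deliberately naive illustration, not how any particular editor does it: it only knows the operators and bracket pairs named above, treats a backslash as escaping the next character, and will happily misfire on code such as $q = 1 because it has no notion of context. QUOTE_OP, PAIRS and find_quoted are hypothetical names.

```python
import re

# Bracket-style delimiters close with their partner; anything else closes itself.
PAIRS = {"(": ")", "{": "}", "[": "]"}

# A quoting operator, optional whitespace (including newlines), then the delimiter.
QUOTE_OP = re.compile(r"\b(qq|qx|q)\s*([^\w\s])")


def find_quoted(source):
    """Return (operator, body) for the first q/qq/qx construct, or None."""
    m = QUOTE_OP.search(source)
    if m is None:
        return None
    opener = m.group(2)
    closer = PAIRS.get(opener, opener)
    body_start = m.end()
    depth = 1
    i = body_start
    while i < len(source):
        ch = source[i]
        if ch == "\\":
            i += 2  # skip the escaped character
            continue
        if ch == closer:
            depth -= 1
            if depth == 0:
                return m.group(1), source[body_start:i]
        elif ch == opener:
            depth += 1  # nested bracket inside a bracket-delimited string
        i += 1
    return None  # no closing delimiter found


if __name__ == "__main__":
    print(find_quoted("my $string = qq\n(This is my string);"))
```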
I know that VIM can handle the qq quoting issue, but many other program editors I have tried failed. Even Stackoverflow's syntax highlighter (Google's prettify) fails.
Try Notepad++ or Textpad if you're on Windows. Or, try Eclipse with the EPIC editor. I believe that one also works.
Because Perl5 can't be statically parsed, editors have to make guesses about syntax. Could they do a better job in this case? Probably, but do keep in mind that it's impossible to do this perfectly.
In any case, your best bet is to get in touch with the author of the Perl syntax highlighting plugin for your editor.
As you said, there is no problem and no syntax error. This is normal behavior for both Sublime and Vim. When you continue the qq operator on the next line, string highlighting doesn't work in either editor.
cperl-mode.el for Emacs does the job:
Maybe you can take a look at its source and try to use the same rules in Sublime, or at least point this out to the plugin author.

Regex to remove HTML-head-tag

How can I remove the entire head tag from an HTML file with NSRegularExpression? Can someone give me a regex?
Thanks in advance,
Ph99Ph
There is none! HTML is a type-2 language and thus not parsable with a regular expression (type-3).
See this wiki article in case of doubt.
Lots of people use regex for parsing/editing HTML. This works quite well in simple cases but is utterly error prone.
This being said: You should have fairly reliable results with this regex:
<head>.+?</head>
This requires "." to also match line breaks. If it doesn't, then use this:
<head>(?:.|\n|\r)+?</head>
Again: This is error prone, don't do it.
What you should use is an XML parser such as NSXMLParser.
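For the simple case only, the regex variant mentioned above can be tried out as follows. This Python sketch uses re.DOTALL so that "." also matches line breaks, and it also tolerates attributes on the opening tag, which is an addition to the pattern quoted above. The function name strip_head is made up, and the caveat stands: this breaks on input such as a literal </head> inside a script or comment.

```python
import re

# DOTALL lets "." cross line breaks; IGNORECASE accepts <HEAD> as well.
HEAD_RE = re.compile(r"<head\b[^>]*>.*?</head\s*>", re.DOTALL | re.IGNORECASE)


def strip_head(html: str) -> str:
    """Remove the first-to-last simple <head>...</head> block(s) from html."""
    return HEAD_RE.sub("", html)


if __name__ == "__main__":
    page = "<html><head>\n<title>t</title>\n</head><body>hi</body></html>"
    print(strip_head(page))
```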
Please see the accepted answer at RegEx match open tags except XHTML self-contained tags. Or any version of this exact same question posted each day since the beginning of Stack Overflow.
In short, you cannot reliably parse HTML with Regular Expressions. RegEx is simply not advanced enough because of the complexities of HTML.
Use something like this:
// Normalize any opening head tag (extra spaces, attributes) to <head>
result = System.Text.RegularExpressions.Regex.Replace(result,
@"<( )*head([^>])*>", "<head>",
System.Text.RegularExpressions.RegexOptions.IgnoreCase);
// Normalize any closing head tag to </head>
result = System.Text.RegularExpressions.Regex.Replace(result,
@"(<( )*(/)( )*head( )*>)", "</head>",
System.Text.RegularExpressions.RegexOptions.IgnoreCase);
// Remove the head block; Singleline lets "." match line breaks
result = System.Text.RegularExpressions.Regex.Replace(result,
"(<head>).*(</head>)", " ",
System.Text.RegularExpressions.RegexOptions.IgnoreCase |
System.Text.RegularExpressions.RegexOptions.Singleline);

Regex for strings in Bibtex

I'm trying to write a BibTeX parser with flex/bison. Here are the rules for strings in BibTeX:
Strings can be enclosed in double quotes "..." or in braces {...}
In a string, braces can be nested
Inside a string, the braces should be balanced (invalid string: {this is a { test})
Inside an inner {}, you can have any characters. So this string is valid: {This is a string {test"} and it is valid}
Any ideas on how to do this?
Now you're going into the field of a text parser. Surprisingly, nobody has made a bibtex library for Actionscript that I could find, so it's an interesting problem. If you do make one, do the community a favor and open source it :)
It won't be easy to do since you essentially have to go character by character and check for the chars that you need and do logic around that. However, I recommend you look at as3corelib's implementation of the JSON parser which is somewhat similar to what you're trying to accomplish. You'll at least get an idea of how to do it using a tokenizer and it's a very good start on your project.
Good luck.
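The character-by-character approach can be sketched briefly. The Python function below follows the four rules listed in the question: it accepts a string delimited by quotes or braces, tracks brace depth, and lets a double quote end a quoted string only at depth zero. It is an illustration under those stated rules, not a full BibTeX lexer, and read_bibtex_string is a hypothetical name.

```python
def read_bibtex_string(text, start=0):
    """Read one BibTeX string starting at text[start].

    Returns (content, next_index). Raises ValueError if the braces are
    unbalanced or the string is never closed.
    """
    opener = text[start]
    if opener not in '{"':
        raise ValueError("a string must start with { or \"")
    depth = 0  # nesting level of inner braces
    i = start + 1
    while i < len(text):
        ch = text[i]
        if ch == "{":
            depth += 1
        elif ch == "}":
            if depth == 0:
                if opener == "{":
                    return text[start + 1 : i], i + 1
                raise ValueError("unbalanced closing brace")
            depth -= 1
        elif ch == '"' and opener == '"' and depth == 0:
            # A quote inside inner braces does not end the string.
            return text[start + 1 : i], i + 1
        i += 1
    raise ValueError("unterminated string")


if __name__ == "__main__":
    print(read_bibtex_string('{This is a string {test"} and it is valid}'))
```

In a flex/bison setup the same depth counter is typically kept in the lexer, for example with a start condition that is entered on the opening delimiter, since balanced nesting cannot be expressed by a single regular expression.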