Wow, this was a terribly worded query, let me try again.
I'm still learning antlr and trying to understand grammars. I'm using a grammar (not written by me - so I'm trying not to adjust it too much as it's the standard used by many groups, found here).
I'm using it in a Flutter application. When I run it on Linux or Android, it runs without issue. When I try to run it on web, I immediately have issues. The full grammar I'm using is below.
grammar FhirPath;
// Grammar rules [FHIRPath](http://hl7.org/fhirpath/N1) Normative Release
//prog: line (line)*; line: ID ( '(' expr ')') ':' expr '\r'? '\n';
entireExpression: expression EOF;
expression:
term # termExpression
| expression '.' invocation # invocationExpression
| expression '[' expression ']' # indexerExpression
| ('+' | '-') expression # polarityExpression
| expression ('*' | '/' | 'div' | 'mod') expression # multiplicativeExpression
| expression ('+' | '-' | '&') expression # additiveExpression
| expression '|' expression # unionExpression
| expression ('<=' | '<' | '>' | '>=') expression # inequalityExpression
| expression ('is' | 'as') typeSpecifier # typeExpression
| expression ('=' | '~' | '!=' | '!~') expression # equalityExpression
| expression ('in' | 'contains') expression # membershipExpression
| expression 'and' expression # andExpression
| expression ('or' | 'xor') expression # orExpression
| expression 'implies' expression # impliesExpression;
//| (IDENTIFIER)? '=>' expression #lambdaExpression
term:
invocation # invocationTerm
| literal # literalTerm
| externalConstant # externalConstantTerm
| '(' expression ')' # parenthesizedTerm;
literal:
'{' '}' # nullLiteral
| ('true' | 'false') # booleanLiteral
| STRING # stringLiteral
| NUMBER # numberLiteral
| DATE # dateLiteral
| DATETIME # dateTimeLiteral
| TIME # timeLiteral
| quantity # quantityLiteral;
externalConstant: '%' ( identifier | STRING);
invocation: // Terms that can be used after the function/member invocation '.'
identifier # memberInvocation
| function # functionInvocation
| '$this' # thisInvocation
| '$index' # indexInvocation
| '$total' # totalInvocation;
function: identifier '(' paramList? ')';
paramList: expression (',' expression)*;
quantity: NUMBER unit?;
unit:
pluralDateTimePrecision
| dateTimePrecision
| STRING; // UCUM syntax for units of measure
pluralDateTimePrecision:
'years'
| 'months'
| 'weeks'
| 'days'
| 'hours'
| 'minutes'
| 'seconds'
| 'milliseconds';
dateTimePrecision:
'year'
| 'month'
| 'week'
| 'day'
| 'hour'
| 'minute'
| 'second'
| 'millisecond';
typeSpecifier: qualifiedIdentifier;
qualifiedIdentifier: identifier ('.' identifier)*;
identifier:
IDENTIFIER
| DELIMITEDIDENTIFIER
| 'as'
| 'is'
| 'contains'
| 'in'
| 'div';
/****************************************************************
Lexical rules ***************************************************************
*/
/*
NOTE: The goal of these rules in the grammar is to provide a date token to the parser. As such it
is not attempting to validate that the date is a correct date, that task is for the parser or
interpreter.
*/
DATE: '#' DATEFORMAT;
DATETIME:
'#' DATEFORMAT 'T' (TIMEFORMAT TIMEZONEOFFSETFORMAT?)?;
TIME: '#' 'T' TIMEFORMAT;
fragment DATEFORMAT:
[0-9][0-9][0-9][0-9] ('-' [0-9][0-9] ('-' [0-9][0-9])?)?;
fragment TIMEFORMAT:
[0-9][0-9] (':' [0-9][0-9] (':' [0-9][0-9] ('.' [0-9]+)?)?)?;
fragment TIMEZONEOFFSETFORMAT: (
'Z'
| ('+' | '-') [0-9][0-9]':' [0-9][0-9]
);
IDENTIFIER: ([A-Za-z] | '_') ([A-Za-z0-9] | '_')*;
// Added _ to support CQL (FHIR could constrain it out)
DELIMITEDIDENTIFIER: '`' (ESC | ~[\\`])* '`';
STRING: '\'' (ESC | ~['])* '\'';
// Also allows leading zeroes now (just like CQL and XSD)
NUMBER: [0-9]+ ('.' [0-9]+)?;
// Pipe whitespace to the HIDDEN channel to support retrieving source text through the parser.
WS: [ \r\n\t]+ -> channel(HIDDEN);
COMMENT: '/*' .*? '*/' -> channel(HIDDEN);
LINE_COMMENT: '//' ~[\r\n]* -> channel(HIDDEN);
fragment ESC:
'\\' ([`'\\/fnrt] | UNICODE); // allow \`, \', \\, \/, \f, etc. and \uXXX
fragment UNICODE: 'u' HEX HEX HEX HEX;
fragment HEX: [0-9a-fA-F];
I generate the code with the following:
antlr4 -Dlanguage=Dart FhirPath.g4 -visitor -no-listener
Then to test I use the following code:
final input = InputStream.fromString('name');
final lexer = FhirPathLexer(input);
final tokens = CommonTokenStream(lexer);
final parser = FhirPathParser(tokens);
parser.buildParseTree = true;
final tree = parser.expression();
If I run it in a simple dart script, it runs without issue. But if I put it in a Flutter application (again, only on web, otherwise it appears to run without issue), I get this error:
line 1:0 mismatched input 'name' expecting {'as', 'in', 'is', 'contains', 'div', 'mod', IDENTIFIER, DELIMITEDIDENTIFIER}
I assume there's something I don't understand about the grammar, so any insights would be appreciated.
I've concluded this is an error with transpiling to javascript. The antlr4 version works for all of my tests for Android and Linux, then throws a bunch of errors for web. I've gone back to using petitparser instead of antlr4. If anyone else has suggestions feel free to leave them, but for now, I'm going to close this. If you want to compare how the two look, I have both versions here: https://github.com/MayJuun/fhir/tree/main/fhir_path/lib
I'm new to Xtext and have a problem.
When I try to create a terminal for a string without quotes, I always get EOF errors.
If I comment out the code for the string without quotes, I don't get an error and everything works fine.
Can someone explain this to me?
Or give me a hint on how I could solve this better?
Thank you very much
// String without quotes
terminal STRINGWQ: ( ('a'..'z'|'A'..'Z')('a'..'z' | 'A'..'Z' | '_'| '-' | '§' | '?' | '!'| '#'
| '\n' | ':' |'%' | '.' | '*' | '^' | ',' | '&' | '('|')'| '0'..'9'|' ')*);
Rest of Code
grammar org.xtext.example.mydsl.MyDsl with org.eclipse.xtext.common.Terminals
generate myDsl "http://www.xtext.org/example/mydsl/MyDsl"
Model:
(elements += GITest)*
;
GITest:
KWHeader | KWTestCase
;
// KeyWords Header
KWHeader:
'Test' '!''?'
;
KWTestCase:
'testcase' int=INT ':' title = ID |
'Hello' names=ID '!'
;
UPDATE:
Data Type Rule
QSTRING returns ecore::EString: //custom terminal SurveyString
(('a'|'b'|'c'|'d'|'e'|'f'|'g'|'h'|'i'|'j'|'k'|'l'|'m'|'n'|'o'|'p'|'q'|'r'|'s'|'t'|'u'|'v'|'w'|'x'|'y'|'z'|
'A'|'B'|'C'|'D'|'E'|'F'|'G'|'H'|'I'|'J'|'K'|'L'|'M'|'N'|'O'|'P'|'Q'|'R'|'S'|'T'|'U'|'V'|'W'|'X'|'Y'|'Z'|' ')
('a'|'b'|'c'|'d'|'e'|'f'|'g'|'h'|'i'|'j'|'k'|'l'|'m'|'n'|'o'|'p'|'q'|'r'|'s'|'t'|'u'|'v'|'w'|'x'|'y'|'z'|
'A'|'B'|'C'|'D'|'E'|'F'|'G'|'H'|'I'|'J'|'K'|'L'|'M'|'N'|'O'|'P'|'Q'|'R'|'S'|'T'|'U'|'V'|'W'|'X'|'Y'|'Z'|
' '|'_'|'-'|'§'|'?'|'!'|'#'|'%'|'.'|'*'|'^'|','|'&'|'('|')'|'0'|'1'|'2'|'3'|'4'|'5'|'6'|'7'|'8'|'9')*);
UPDATE 2:
Got it working with a data type rule and by manipulating ID
Code:
STRINGWQ: ((' ')?ID)((ID)?(INT)? ' ' (ID)?);
terminal ID: '^'?('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'_'|'0'..'9'|'/'|';'|','|'#'|'!'|'§'|'$'|'%'|'&'|
'('|')'|'='|'?'|'\\'|'*'|'+'|'.'|'-'|'>'|'<'|'|'|'['|']'|'{'|'}')*;
But now I have the problem that Xtext does not recognize where STRINGWQ ends.
So I don't get keyword suggestions on the next line.
For example, if I use INT instead of STRINGWQ, I get suggestions on the next line.
But with STRINGWQ I don't.
How can I define where a data type rule ends?
Thank you
Your STRINGWQ shadows the terminal rule ID and basically all other rules including the keywords. Chances are good that the entire document will be consumed as a single terminal token of type STRINGWQ. You should try to model your string as a datatype rule.
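To make the shadowing concrete, here is a minimal sketch of a toy longest-match lexer in Python. It is not Xtext's actual lexer (which is generated by ANTLR), and the rule list, the `tokenize` helper and the sample input are assumptions made up for illustration. The STRINGWQ pattern is transcribed from the terminal in the question.

```python
import re

# Toy rule set (assumption): keywords from the question's grammar plus ID, INT, WS.
BASE_RULES = [
    ("KEYWORD", r"Hello|testcase|Test|[!?:]"),
    ("ID", r"[A-Za-z_][A-Za-z_0-9]*"),
    ("INT", r"[0-9]+"),
    ("WS", r"[ \t\r\n]+"),
]

# The STRINGWQ terminal from the question, rewritten as a regular expression.
STRINGWQ = ("STRINGWQ", r"[A-Za-z][A-Za-z_\-§?!#\n:%.*^,&()0-9 ]*")


def tokenize(text, rules):
    """At each position the longest match wins; ties go to the earlier rule."""
    tokens = []
    pos = 0
    while pos < len(text):
        best_name, best_len = None, 0
        for name, pattern in rules:
            match = re.compile(pattern).match(text, pos)
            if match and len(match.group()) > best_len:
                best_name, best_len = name, len(match.group())
        if best_name is None:
            raise ValueError(f"no rule matches at offset {pos}")
        if best_name != "WS":  # whitespace is skipped, as with a hidden terminal
            tokens.append((best_name, text[pos:pos + best_len]))
        pos += best_len
    return tokens


SAMPLE = "Hello World !\ntestcase 1 : foo"
without_stringwq = tokenize(SAMPLE, BASE_RULES)
with_stringwq = tokenize(SAMPLE, BASE_RULES + [STRINGWQ])
```

Without STRINGWQ the sample splits into seven keyword, ID and INT tokens. With STRINGWQ added, the whole input comes back as one STRINGWQ token, because every character after the first letter (including spaces and newlines) is allowed by that terminal. The parser then never sees a keyword.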
This works for me:
Property:
id=ID | int=INT | prop=PROPERTY_VALUE | spec=SPECIAL;
PROPERTY_VALUE:
(':');
SPECIAL:
(' ' | '/' | ';' | ',' | '!' | '§' | '%' | '&' | '(' | ')' | '?' | '*' | '+' | '.' | '-' | '|' | '[' | ']');
I separated the colon because I need to use PROPERTY_VALUE in another way.
But you could also add it to SPECIAL.
$computername = 'RSERV1234'
$computername.Substring(5,4) returns '1234' as expected
Get-ADOrganizationalUnit -Filter {Name -like $computername.Substring(5,4)}
returns:
Property 'Substring' not found in object of type: 'System.String'
Please help!
From about_ActiveDirectory_Filter:
Filter Syntax
The following syntax descriptions use Backus-Naur form to show the
PowerShell Expression Language for the Filter parameter.
<filter> ::= "{" <FilterComponentList> "}"
<FilterComponentList> ::= <FilterComponent> |
<FilterComponent> <JoinOperator> <FilterComponent> |
<NotOperator> <FilterComponent>
<FilterComponent> ::= <attr> <FilterOperator> <value> |
"(" <FilterComponent> ")"
<FilterOperator> ::= "-eq" | "-le" | "-ge" | "-ne" | "-lt" | "-gt" |
"-approx" | "-bor" | "-band" | "-recursivematch" | "-like" |
"-notlike"
<JoinOperator> ::= "-and" | "-or"
<NotOperator> ::= "-not"
<attr> ::= <PropertyName> | <LDAPDisplayName of the attribute>
<value>::= < this value will be compared to the object data for
attribute <ATTR> using the specified filter operator
The Filter parameter translates PowerShell-like expressions to an LDAP filter, but it doesn't support arbitrary PowerShell statements. It only accepts a specific set of comparison operations, with an attribute name as the left-hand operand and the comparison value on the right-hand side. A method call such as Substring() is not part of that grammar, which is why it fails inside the filter.
Do your Substring() call beforehand:
$substr = $computername.Substring(5,4)
Get-ADOrganizationalUnit -Filter {Name -like "$substr"}
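As a rough illustration of why the method call is rejected, here is a small Python sketch that checks a single filter component against the `<attr> <FilterOperator> <value>` shape from the BNF above. This is not how the ActiveDirectory module parses filters; the regular expression and the `parse_component` helper are assumptions made up for this example, and it ignores join operators and parentheses.

```python
import re

# Operators copied from the <FilterOperator> production quoted above.
FILTER_OPERATORS = {
    "-eq", "-le", "-ge", "-ne", "-lt", "-gt", "-approx", "-bor", "-band",
    "-recursivematch", "-like", "-notlike",
}

# Hypothetical shape of one component: attribute, operator, then a quoted
# string, a plain variable such as $substr, or a bare word with wildcards.
COMPONENT = re.compile(
    r"^\s*(?P<attr>[A-Za-z][\w-]*)\s+(?P<op>-[a-z]+)\s+"
    r"(?P<value>'[^']*'|\"[^\"]*\"|\$\w+|[\w*]+)\s*$"
)


def parse_component(text):
    """Return (attr, operator, value) or raise ValueError for anything else."""
    match = COMPONENT.match(text)
    if match is None or match.group("op") not in FILTER_OPERATORS:
        raise ValueError(f"not a simple <attr> <operator> <value> filter: {text!r}")
    return match.group("attr"), match.group("op"), match.group("value")


# Equivalent of $computername.Substring(5,4): start at index 5, take 4 characters.
computername = "RSERV1234"
substr = computername[5:5 + 4]

accepted = parse_component(f"Name -like '{substr}'")
```

A plain value or a simple variable fits the `<value>` slot, while `$computername.Substring(5,4)` does not, so computing the substring first keeps the filter inside the supported grammar.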
I've written a grammar which should allow me to define variables and arrays. Everything worked fine until I split the variables into local and global variables. Now my parser doesn't recognize arrays anymore (it treats them as variables and reports syntax errors for them).
My Grammar:
grammar sqf.Sqf with org.eclipse.xtext.common.Terminals
generate sqf "http://www.Sqf.sqf"
Model:
elements += Element*
;
Element:
Declaration ";" | Command ";"
;
Declaration:
Array | Variable
;
Variable:
LocalVariable | GlobalVariable
;
LocalVariable:
name=LOCALVARNAME "=" content=VARCONTENT (("+"|"-"|"*"|"/") content2+=VARCONTENT)*
;
GlobalVariable:
name=GLOBALVARNAME "=" content=VARCONTENT (("+"|"-"|"*"|"/") content2+=VARCONTENT)*
;
Array:
name=ID "=" content=ArrayLiteral | name=ID "=" "+" content2=[Array]
;
ArrayLiteral:
"[" (content += ArrayContent)* "]" (("+"|"-")content1+=Extension)*
;
ArrayContent:
content01=Acontent ("," content02+=Acontent)*
;
Acontent:
STRING | DOUBLE | ArrayLiteral
;
Extension:
STRING | DOUBLE
;
Command:
Interaction
;
Interaction:
hint
;
hint:
Normal | Format | Special
;
Normal:
name=("hint" | "hintC" | "hintCadet" | "hintSilent") content=STRING
;
Format:
name=("hint" | "hintC" | "hintCadet" | "hintSilent") "format" "[" content=STRING "," variable=DECREF "]"
;
Special:
hintCArray
;
hintCArray:
title=STRING "hintC" (content1=ArrayLiteral | content=STRING)
;
VARCONTENT:
STRING | DOUBLE | DECREF | "true" | "false" | "nil"
;
DOUBLE:
INT ("."INT)?
;
DECREF:
ref1=[Array|ID] | ref2=[LocalVariable|LOCALVARNAME] | ref3=[GlobalVariable|GLOBALVARNAME]
;
terminal LOCALVARNAME:
"_" ('a'..'z'|'A'..'Z') ('a'..'z'|'A'..'Z'|'_'|'0'..'9')*
;
terminal GLOBALVARNAME:
'^'?('a'..'z'|'A'..'Z') ('a'..'z'|'A'..'Z'|'_'|'0'..'9')*
;
Does anybody have an idea what the problem is?
(Any other code improvements are welcome, too.)
Greets Krzmbrzl
Your rule GLOBALVARNAME completely shadows the rule ID. You could simply use ID instead of GLOBALVARNAME.
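A quick way to see the overlap is to test sample names against the three terminals. The Python sketch below transcribes the terminal patterns as regular expressions and assumes that the grammar's own terminals are tried before the inherited ID from org.eclipse.xtext.common.Terminals. The `classify` helper and the sample names are made up for illustration.

```python
import re

# Terminals in the assumed priority order: own rules first, inherited ID last.
TERMINALS = [
    ("LOCALVARNAME", r"_[A-Za-z][A-Za-z_0-9]*"),
    ("GLOBALVARNAME", r"\^?[A-Za-z][A-Za-z_0-9]*"),
    ("ID", r"\^?[A-Za-z_][A-Za-z_0-9]*"),
]


def classify(word):
    """Return the name of the first terminal that matches the whole word."""
    for name, pattern in TERMINALS:
        if re.fullmatch(pattern, word):
            return name
    return None


results = {word: classify(word) for word in ["myArray", "_counter", "__x", "_1x"]}
```

Under that assumption an array name such as `myArray` is always tokenized as GLOBALVARNAME, so `name=ID` in the Array rule can never match it. ID only survives for unusual names like `__x` or `_1x`, which is why using ID for global variables (and dropping GLOBALVARNAME) removes the conflict.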
I would like to write a lexer generator to convert a basic subset of the MATLAB language to C#, C++, etc. To help me do this, I would like to find a document containing the formal grammar for MATLAB. Having spent a bit of time investigating this, it seems that Mathworks do not provide one.
Does anyone know where I could find such a document?
This is not a complete grammar, but it is a yacc-compatible grammar for MATLAB that was provided for a compiler course in 2000. From it, you can easily derive BNF and EBNF.
primary_expression
: IDENTIFIER
| CONSTANT
| STRING_LITERAL
| '(' expression ')'
| '[' ']'
| '[' array_list ']'
;
postfix_expression
: primary_expression
| array_expression
| postfix_expression TRANSPOSE
| postfix_expression NCTRANSPOSE
;
index_expression
: ':'
| expression
;
index_expression_list
: index_expression
| index_expression_list ',' index_expression
;
array_expression
: IDENTIFIER '(' index_expression_list ')'
;
unary_expression
: postfix_expression
| unary_operator postfix_expression
;
unary_operator
: '+'
| '-'
| '~'
;
multiplicative_expression
: unary_expression
| multiplicative_expression '*' unary_expression
| multiplicative_expression '/' unary_expression
| multiplicative_expression '\\' unary_expression
| multiplicative_expression '^' unary_expression
| multiplicative_expression ARRAYMUL unary_expression
| multiplicative_expression ARRAYDIV unary_expression
| multiplicative_expression ARRAYRDIV unary_expression
| multiplicative_expression ARRAYPOW unary_expression
;
additive_expression
: multiplicative_expression
| additive_expression '+' multiplicative_expression
| additive_expression '-' multiplicative_expression
;
relational_expression
: additive_expression
| relational_expression '<' additive_expression
| relational_expression '>' additive_expression
| relational_expression LE_OP additive_expression
| relational_expression GE_OP additive_expression
;
equality_expression
: relational_expression
| equality_expression EQ_OP relational_expression
| equality_expression NE_OP relational_expression
;
and_expression
: equality_expression
| and_expression '&' equality_expression
;
or_expression
: and_expression
| or_expression '|' and_expression
;
expression
: or_expression
| expression ':' or_expression
;
assignment_expression
: postfix_expression '=' expression
;
eostmt
: ','
| ';'
| CR
;
statement
: global_statement
| clear_statement
| assignment_statement
| expression_statement
| selection_statement
| iteration_statement
| jump_statement
;
statement_list
: statement
| statement_list statement
;
identifier_list
: IDENTIFIER
| identifier_list IDENTIFIER
;
global_statement
: GLOBAL identifier_list eostmt
;
clear_statement
: CLEAR identifier_list eostmt
;
expression_statement
: eostmt
| expression eostmt
;
assignment_statement
: assignment_expression eostmt
;
array_element
: expression
| expression_statement
;
array_list
: array_element
| array_list array_element
;
selection_statement
: IF expression statement_list END eostmt
| IF expression statement_list ELSE statement_list END eostmt
| IF expression statement_list elseif_clause END eostmt
| IF expression statement_list elseif_clause
ELSE statement_list END eostmt
;
elseif_clause
: ELSEIF expression statement_list
| elseif_clause ELSEIF expression statement_list
;
iteration_statement
: WHILE expression statement_list END eostmt
| FOR IDENTIFIER '=' expression statement_list END eostmt
| FOR '(' IDENTIFIER '=' expression ')' statement_list END eostmt
;
jump_statement
: BREAK eostmt
| RETURN eostmt
;
translation_unit
: statement_list
| FUNCTION function_declare eostmt statement_list
;
func_ident_list
: IDENTIFIER
| func_ident_list ',' IDENTIFIER
;
func_return_list
: IDENTIFIER
| '[' func_ident_list ']'
;
function_declare_lhs
: IDENTIFIER
| IDENTIFIER '(' ')'
| IDENTIFIER '(' func_ident_list ')'
;
function_declare
: function_declare_lhs
| func_return_list '=' function_declare_lhs
;
Excellent opportunity to write your own formal grammar :)
If you choose to write the grammar yourself, I can recommend BNFC, which can take a formal BNF grammar and construct data structures and lexers/parsers for a couple of target languages (C/C++, C#, Java, Haskell etc.). This would save you a lot of time and let you focus on formulating the grammar, and then get right to implementing the converter in your language of preference.
If nothing else, the link to BNFC contains some help and pointers on how to formulate a BNF grammar. Best of luck!
I'm not sure when exactly it appeared (possibly Mar-Apr 2019), but it is now available on Mathworks' GitHub. Here's the grammar-defining xml file (as of 09-Apr-2019; compressed to avoid the SO post character limit):
Copyright 2018 The MathWorks, Inc., under the BSD2 license.
<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd"><plist version="1.0"><dict><key>fileTypes</key><array><string>m</string></array><key>keyEquivalent</key><string>^~M</string><key>name</key><string>MATLAB</string><key>patterns</key><array><dict><key>include</key><string>#classdef</string></dict><dict><key>include</key><string>#function</string></dict><dict><key>include</key><string>#blocks</string></dict><dict><key>include</key><string>#control_statements</string></dict><dict><key>include</key><string>#global_persistent</string></dict><dict><key>include</key><string>#command_dual</string></dict><dict><key>include</key><string>#string</string></dict><dict><key>include</key><string>#line_continuation</string></dict><dict><key>include</key><string>#comments</string></dict><dict><key>include</key><string>#transpose</string></dict><dict><key>include</key><string>#constants</string></dict><dict><key>include</key><string>#variables</string></dict><dict><key>include</key><string>#end_in_parens</string></dict><dict><key>include</key><string>#numbers</string></dict><dict><key>include</key><string>#operators</string></dict></array><key>repository</key><dict><key>blocks</key><dict><key>patterns</key><array><dict><key>begin</key><string>(^\s*)(for)\b</string><key>beginCaptures</key><dict><key>0</key><dict><key>name</key><string>meta.for-quantity.matlab</string></dict><key>2</key><dict><key>name</key><string>keyword.control.for.matlab</string></dict></dict><key>end</key><string>^\s*(end)\b</string><key>endCaptures</key><dict><key>1</key><dict><key>name</key><string>keyword.control.end.for.matlab</string></dict></dict><key>name</key><string>meta.for.matlab</string><key>patterns</key><array><dict><key>begin</key><string>\G(?!$)</string><key>end</key><string>$\n?</string><key>name</key><string>meta.for-quantity.matlab</string><key>patterns</key><array><dict><key>include</key><string>$self</strin
g></dict></array></dict><dict><key>include</key><string>$self</string></dict></array></dict><dict><key>begin</key><string>(^\s*)(if)\b</string><key>beginCaptures</key><dict><key>0</key><dict><key>name</key><string>meta.if-condition.matlab</string></dict><key>2</key><dict><key>name</key><string>keyword.control.if.matlab</string></dict></dict><key>end</key><string>^\s*(end)\b</string><key>endCaptures</key><dict><key>1</key><dict><key>name</key><string>keyword.control.end.if.matlab</string></dict></dict><key>name</key><string>meta.if.matlab</string><key>patterns</key><array><dict><key>begin</key><string>\G(?!$)</string><key>end</key><string>$\n?</string><key>name</key><string>meta.if-condition.matlab</string><key>patterns</key><array><dict><key>include</key><string>$self</string></dict></array></dict><dict><key>captures</key><dict><key>0</key><dict><key>name</key><string>meta.elseif-condition.matlab</string></dict><key>2</key><dict><key>name</key><string>keyword.control.elseif.matlab</string></dict><key>3</key><dict><key>patterns</key><array><dict><key>include</key><string>$self</string></dict></array></dict></dict><key>end</key><string>^</string><key>match</key><string>(^\s*)(elseif)\b(.*)$\n?</string><key>name</key><string>meta.elseif.matlab</string></dict><dict><key>captures</key><dict><key>0</key><dict><key>name</key><string>meta.else-condition.matlab</string></dict><key>2</key><dict><key>name</key><string>keyword.control.else.matlab</string></dict><key>3</key><dict><key>patterns</key><array><dict><key>include</key><string>$self</string></dict></array></dict></dict><key>end</key><string>^</string><key>match</key><string>(^\s*)(else)\b(.*)?$\n?</string><key>name</key><string>meta.else.matlab</string></dict><dict><key>include</key><string>$self</string></dict></array></dict><dict><key>begin</key><string>(^\s*)(parfor)\b</string><key>beginCaptures</key><dict><key>0</key><dict><key>name</key><string>meta.parfor-quantity.matlab</string></dict><key>2</key><dict><key>name
</key><string>keyword.control.for.matlab</string></dict></dict><key>end</key><string>^\s*(end)\b</string><key>endCaptures</key><dict><key>1</key><dict><key>name</key><string>keyword.control.end.for.matlab</string></dict></dict><key>name</key><string>meta.parfor.matlab</string><key>patterns</key><array><dict><key>begin</key><string>\G(?!$)</string><key>end</key><string>$\n?</string><key>name</key><string>meta.parfor-quantity.matlab</string><key>patterns</key><array><dict><key>include</key><string>$self</string></dict></array></dict><dict><key>include</key><string>$self</string></dict></array></dict><dict><key>begin</key><string>(^\s*)(spmd)\b</string><key>beginCaptures</key><dict><key>0</key><dict><key>name</key><string>meta.spmd-statement.matlab</string></dict><key>2</key><dict><key>name</key><string>keyword.control.spmd.matlab</string></dict></dict><key>end</key><string>^\s*(end)\b</string><key>endCaptures</key><dict><key>1</key><dict><key>name</key><string>keyword.control.end.spmd.matlab</string></dict></dict><key>name</key><string>meta.spmd.matlab</string><key>patterns</key><array><dict><key>begin</key><string>\G(?!$)</string><key>end</key><string>$\n?</string><key>name</key><string>meta.spmd-statement.matlab</string><key>patterns</key><array><dict><key>include</key><string>$self</string></dict></array></dict><dict><key>include</key><string>$self</string></dict></array></dict><dict><key>begin</key><string>(^\s*)(switch)\b</string><key>beginCaptures</key><dict><key>0</key><dict><key>name</key><string>meta.switch-expression.matlab</string></dict><key>2</key><dict><key>name</key><string>keyword.control.switch.matlab</string></dict></dict><key>end</key><string>^\s*(end)\b</string><key>endCaptures</key><dict><key>1</key><dict><key>name</key><string>keyword.control.end.switch.matlab</string></dict></dict><key>name</key><string>meta.switch.matlab</string><key>patterns</key><array><dict><key>begin</key><string>\G(?!$)</string><key>end</key><string>$\n?</string><key>name<
/key><string>meta.switch-expression.matlab</string><key>patterns</key><array><dict><key>include</key><string>$self</string></dict></array></dict><dict><key>captures</key><dict><key>0</key><dict><key>name</key><string>meta.case-expression.matlab</string></dict><key>2</key><dict><key>name</key><string>keyword.control.case.matlab</string></dict><key>3</key><dict><key>patterns</key><array><dict><key>include</key><string>$self</string></dict></array></dict></dict><key>end</key><string>^</string><key>match</key><string>(^\s*)(case)\b(.*)$\n?</string><key>name</key><string>meta.case.matlab</string></dict><dict><key>captures</key><dict><key>0</key><dict><key>name</key><string>meta.otherwise-expression.matlab</string></dict><key>2</key><dict><key>name</key><string>keyword.control.otherwise.matlab</string></dict><key>3</key><dict><key>patterns</key><array><dict><key>include</key><string>$self</string></dict></array></dict></dict><key>end</key><string>^</string><key>match</key><string>(^\s*)(otherwise)\b(.*)?$\n?</string><key>name</key><string>meta.otherwise.matlab</string></dict><dict><key>include</key><string>$self</string></dict></array></dict><dict><key>begin</key><string>(^\s*)(try)\b</string><key>beginCaptures</key><dict><key>2</key><dict><key>name</key><string>keyword.control.try.matlab</string></dict></dict><key>end</key><string>^\s*(end)\b</string><key>endCaptures</key><dict><key>1</key><dict><key>name</key><string>keyword.control.end.try.matlab</string></dict></dict><key>name</key><string>meta.try.matlab</string><key>patterns</key><array><dict><key>captures</key><dict><key>0</key><dict><key>name</key><string>meta.catch-exception.matlab</string></dict><key>2</key><dict><key>name</key><string>keyword.control.catch.matlab</string></dict><key>3</key><dict><key>patterns</key><array><dict><key>include</key><string>$self</string></dict></array></dict></dict><key>end</key><string>^</string><key>match</key><string>(^\s*)(catch)\b(.*)?$\n?</string><key>name</key><string>meta.c
atch.matlab</string></dict><dict><key>include</key><string>$self</string></dict></array></dict><dict><key>begin</key><string>(^\s*)(while)\b</string><key>beginCaptures</key><dict><key>0</key><dict><key>name</key><string>meta.while-condition.matlab</string></dict><key>2</key><dict><key>name</key><string>keyword.control.while.matlab</string></dict></dict><key>end</key><string>^\s*(end)\b</string><key>endCaptures</key><dict><key>1</key><dict><key>name</key><string>keyword.control.end.while.matlab</string></dict></dict><key>name</key><string>meta.while.matlab</string><key>patterns</key><array><dict><key>begin</key><string>\G(?!$)</string><key>end</key><string>$\n?</string><key>name</key><string>meta.while-condition.matlab</string><key>patterns</key><array><dict><key>include</key><string>$self</string></dict></array></dict><dict><key>include</key><string>$self</string></dict></array></dict></array></dict><key>classdef</key><dict><key>patterns</key><array><dict><key>begin</key><string>(?x)
(^\s*) # Leading whitespace
(classdef)
\b\s*
( # Optional attributes
\( [^)]* \)
)?
\s*
(
([a-zA-Z][a-zA-Z0-9_]*) # Class name
(?: # Optional inheritance
\s*
(<)
\s*
([^%]*)
)?
)
\s*($|(?=%))
</string><key>beginCaptures</key><dict><key>2</key><dict><key>name</key><string>storage.type.class.matlab</string></dict><key>3</key><dict><key>patterns</key><array><dict><key>match</key><string>[a-zA-Z][a-zA-Z0-9_]*</string><key>name</key><string>variable.parameter.class.matlab</string></dict><dict><key>begin</key><string>=\s*</string><key>end</key><string>,|(?=\))</string><key>patterns</key><array><dict><key>match</key><string>true|false</string><key>name</key><string>constant.language.boolean.matlab</string></dict><dict><key>include</key><string>#string</string></dict></array></dict></array></dict><key>4</key><dict><key>name</key><string>meta.class-declaration.matlab</string></dict><key>5</key><dict><key>name</key><string>entity.name.section.class.matlab</string></dict><key>6</key><dict><key>name</key><string>keyword.operator.other.matlab</string></dict><key>7</key><dict><key>patterns</key><array><dict><key>match</key><string>[a-zA-Z][a-zA-Z0-9_]*(\.[a-zA-Z][a-zA-Z0-9_]*)*</string><key>name</key><string>entity.other.inherited-class.matlab</string></dict><dict><key>match</key><string>&</string><key>name</key><string>keyword.operator.other.matlab</string></dict></array></dict></dict><key>end</key><string>^\s*(end)\b</string><key>endCaptures</key><dict><key>1</key><dict><key>name</key><string>keyword.control.end.class.matlab</string></dict></dict><key>name</key><string>meta.class.matlab</string><key>patterns</key><array><dict><key>begin</key><string>(?x)
(^\s*) # Leading whitespace
(properties)\b(.*)$
\s*
( # Optional attributes
\( [^)]* \)
)?
\s*($|(?=%))
</string><key>beginCaptures</key><dict><key>2</key><dict><key>name</key><string>keyword.control.properties.matlab</string></dict><key>3</key><dict><key>patterns</key><array><dict><key>match</key><string>[a-zA-Z][a-zA-Z0-9_]*</string><key>name</key><string>variable.parameter.properties.matlab</string></dict><dict><key>begin</key><string>=\s*</string><key>end</key><string>,|(?=\))</string><key>patterns</key><array><dict><key>match</key><string>true|false</string><key>name</key><string>constant.language.boolean.matlab</string></dict><dict><key>match</key><string>public|protected|private</string><key>name</key><string>constant.language.access.matlab</string></dict></array></dict></array></dict></dict><key>end</key><string>^\s*(end)\b</string><key>endCaptures</key><dict><key>1</key><dict><key>name</key><string>keyword.control.end.properties.matlab</string></dict></dict><key>name</key><string>meta.properties.matlab</string><key>patterns</key><array><dict><key>include</key><string>$self</string></dict></array></dict><dict><key>begin</key><string>(?x)
(^\s*) # Leading whitespace
(methods)\b(.*)$
\s*
( # Optional attributes
\( [^)]* \)
)?
\s*($|(?=%))
</string><key>beginCaptures</key><dict><key>2</key><dict><key>name</key><string>keyword.control.methods.matlab</string></dict><key>3</key><dict><key>patterns</key><array><dict><key>match</key><string>[a-zA-Z][a-zA-Z0-9_]*</string><key>name</key><string>variable.parameter.methods.matlab</string></dict><dict><key>begin</key><string>=\s*</string><key>end</key><string>,|(?=\))</string><key>patterns</key><array><dict><key>match</key><string>true|false</string><key>name</key><string>constant.language.boolean.matlab</string></dict><dict><key>match</key><string>public|protected|private</string><key>name</key><string>constant.language.access.matlab</string></dict></array></dict></array></dict></dict><key>end</key><string>^\s*(end)\b</string><key>endCaptures</key><dict><key>1</key><dict><key>name</key><string>keyword.control.end.methods.matlab</string></dict></dict><key>name</key><string>meta.methods.matlab</string><key>patterns</key><array><dict><key>include</key><string>$self</string></dict></array></dict><dict><key>begin</key><string>(?x)
(^\s*) # Leading whitespace
(events)\b(.*)$
\s*
( # Optional attributes
\( [^)]* \)
)?
\s*($|(?=%))
</string><key>beginCaptures</key><dict><key>2</key><dict><key>name</key><string>keyword.control.events.matlab</string></dict><key>3</key><dict><key>patterns</key><array><dict><key>match</key><string>[a-zA-Z][a-zA-Z0-9_]*</string><key>name</key><string>variable.parameter.events.matlab</string></dict><dict><key>begin</key><string>=\s*</string><key>end</key><string>,|(?=\))</string><key>patterns</key><array><dict><key>match</key><string>true|false</string><key>name</key><string>constant.language.boolean.matlab</string></dict><dict><key>match</key><string>public|protected|private</string><key>name</key><string>constant.language.access.matlab</string></dict></array></dict></array></dict></dict><key>end</key><string>^\s*(end)\b</string><key>endCaptures</key><dict><key>1</key><dict><key>name</key><string>keyword.control.end.events.matlab</string></dict></dict><key>name</key><string>meta.events.matlab</string></dict><dict><key>begin</key><string>(?x)
(^\s*) # Leading whitespace
(enumeration)\b(.*)$
\s*($|(?=%))
</string><key>beginCaptures</key><dict><key>2</key><dict><key>name</key><string>keyword.control.enumeration.matlab</string></dict></dict><key>end</key><string>^\s*(end)\b</string><key>endCaptures</key><dict><key>1</key><dict><key>name</key><string>keyword.control.end.enumeration.matlab</string></dict></dict><key>name</key><string>meta.enumeration.matlab</string></dict><dict><key>include</key><string>$self</string></dict></array></dict></array></dict><key>command_dual</key><dict><key>captures</key><dict><key>1</key><dict><key>name</key><string>string.interpolated.matlab</string></dict><key>2</key><dict><key>name</key><string>variable.other.command.matlab</string></dict><key>28</key><dict><key>name</key><string>comment.line.percentage.matlab</string></dict></dict><key>comment</key><string> 1 2 3 4 5 6 7 8 9 10 11 12 13 14 1516 17 18 19 20 21 22 23 24 25 26 27 28</string><key>match</key><string>^\s*((?# A> )([b-df-hk-moq-zA-HJ-MO-Z]\w*|a|an|a([A-Za-mo-z0-9_]\w*|n[A-Za-rt-z0-9_]\w*|ns\w+)|e|ep|e([A-Za-oq-z0-9_]\w*|p[A-Za-rt-z0-9_]\w*|ps\w+)|in|i([A-Za-mo-z0-9_]\w*|n[A-Za-eg-z0-9_]\w*|nf\w+)|I|In|I([A-Za-mo-z0-9_]\w*|n[A-Za-eg-z0-9_]\w*|nf\w+)|j\w+|N|Na|N([A-Zb-z0-9_]\w*|a[A-MO-Za-z0-9_]\w*|aN\w+)|n|na|nar|narg|nargi|nargo|nargou|n([A-Zb-z0-9_]\w*|a([A-Za-mo-qs-z0-9_]\w*|n\w+|r([A-Za-fh-z0-9_]\w*|g([A-Za-hj-nq-z0-9_]\w*|i([A-Za-mo-z0-9_]\w*|n\w+)|o([A-Za-tv-z0-9_]\w*|u([A-Za-su-z]\w*|t\w+))))))|p|p[A-Za-hj-z0-9_]\w*|pi\w+)(?# <A )\s+(((?# B> )([^\s;,%()=.{&|~<>:+\-*/\\#^'"]|(?=')|(?="))(?# <B )|(?# C> )(\.\^|\.\*|\./|\.\\|\.'|\.\(|&&|==|\|\||&(?=[^&])|\|(?=[^\|])|~=|<=|>=|~(?!=)|<(?!=)|>(?!=)|:|\+|-|\*|/|\\|#|\^)(?# <C )(?# D> )([^\s]|\s*(?=%)|\s+$|\s+(,|;|\)|}|\]|&|\||<|>|=|:|\*|/|\\|\^|#|(\.[^\d.]|\.\.[^.])))(?# <D )|(?# E> )(\.[^^*/\\'(\sA-Za-z])(?# <E ))(?# F> )([^%]|'[^']*'|"[^"]*")*(?# <F )|(?# X> )(\.(?=\s)|\.[A-Za-z]|(?={))(?# <X )(?# Y> )([^(=\'"%]|==|'[^']*'|"[^"]*"|\(|\([^)%]*\)|\[|\[[^\]%]*\]|{|{[^}%]*})*(\.\.\.[^%]*)?((?=%)|$)(?# <Y 
)))(%.*)?$</string></dict><key>comment_block</key><dict><key>begin</key><string>(^[\s]*)%\{[^\n\S]*+\n</string><key>beginCaptures</key><dict><key>1</key><dict><key>name</key><string>punctuation.definition.comment.matlab</string></dict></dict><key>end</key><string>^[\s]*%\}[^\n\S]*+(?:\n|$)</string><key>name</key><string>comment.block.percentage.matlab</string><key>patterns</key><array><dict><key>include</key><string>#comment_block</string></dict><dict><key>match</key><string>^[^\n]*\n</string></dict></array></dict><key>comments</key><dict><key>patterns</key><array><dict><key>begin</key><string>(^[ \t]+)?(?=%%\s)</string><key>beginCaptures</key><dict><key>1</key><dict><key>name</key><string>punctuation.whitespace.comment.leading.matlab</string></dict></dict><key>end</key><string>(?!\G)</string><key>patterns</key><array><dict><key>begin</key><string>%%</string><key>beginCaptures</key><dict><key>0</key><dict><key>name</key><string>punctuation.definition.comment.matlab</string></dict></dict><key>end</key><string>\n</string><key>name</key><string>comment.line.double-percentage.matlab</string><key>patterns</key><array><dict><key>begin</key><string>\G[^\S\n]*(?![\n\s])</string><key>contentName</key><string>meta.cell.matlab</string><key>end</key><string>(?=\n)</string></dict></array></dict></array></dict><dict><key>include</key><string>#comment_block</string></dict><dict><key>begin</key><string>(^[ 
\t]+)?(?=%)</string><key>beginCaptures</key><dict><key>1</key><dict><key>name</key><string>punctuation.whitespace.comment.leading.matlab</string></dict></dict><key>end</key><string>(?!\G)</string><key>patterns</key><array><dict><key>begin</key><string>%</string><key>beginCaptures</key><dict><key>0</key><dict><key>name</key><string>punctuation.definition.comment.matlab</string></dict></dict><key>end</key><string>\n</string><key>name</key><string>comment.line.percentage.matlab</string></dict></array></dict></array></dict><key>control_statements</key><dict><key>captures</key><dict><key>1</key><dict><key>name</key><string>keyword.control.matlab</string></dict></dict><key>match</key><string>^\s*(break|continue|return)\b</string><key>name</key><string>meta.control.matlab</string></dict><key>function</key><dict><key>patterns</key><array><dict><key>begin</key><string>(?x)
(^\s*) # Leading whitespace
(function)
\s+
(?: # Optional
(?:
(\[) ([^\]]*) (\])
| ([a-zA-Z][a-zA-Z0-9_]*)
)
\s* = \s*
)?
([a-zA-Z][a-zA-Z0-9_]*(\.[a-zA-Z][a-zA-Z0-9_]*)*) # Function name
\s* # Trailing space
</string><key>beginCaptures</key><dict><key>2</key><dict><key>name</key><string>storage.type.function.matlab</string></dict><key>3</key><dict><key>name</key><string>punctuation.definition.arguments.begin.matlab</string></dict><key>4</key><dict><key>patterns</key><array><dict><key>match</key><string>\w+</string><key>name</key><string>variable.parameter.output.matlab</string></dict></array></dict><key>5</key><dict><key>name</key><string>punctuation.definition.arguments.end.matlab</string></dict><key>6</key><dict><key>name</key><string>variable.parameter.output.function.matlab</string></dict><key>7</key><dict><key>name</key><string>entity.name.function.matlab</string></dict></dict><key>end</key><string>^\s*(end)\b(\s*\n)?</string><key>endCaptures</key><dict><key>1</key><dict><key>name</key><string>keyword.control.end.function.matlab</string></dict></dict><key>name</key><string>meta.function.matlab</string><key>patterns</key><array><dict><key>begin</key><string>\G\(</string><key>end</key><string>\)</string><key>name</key><string>meta.arguments.function.matlab</string><key>patterns</key><array><dict><key>match</key><string>\w+</string><key>name</key><string>variable.parameter.input.matlab</string></dict></array></dict><dict><key>include</key><string>$self</string></dict></array></dict></array></dict><key>global_persistent</key><dict><key>captures</key><dict><key>1</key><dict><key>name</key><string>keyword.control.globalpersistent.matlab</string></dict></dict><key>match</key><string>^\s*(global|persistent)\b</string><key>name</key><string>meta.globalpersistent.matlab</string></dict><key>line_continuation</key><dict><key>captures</key><dict><key>1</key><dict><key>name</key><string>keyword.operator.symbols.matlab</string></dict><key>2</key><dict><key>name</key><string>comment.line.continuation.matlab</string></dict></dict><key>comment</key><string>Line 
continuations</string><key>match</key><string>(\.\.\.)(.*)$</string><key>name</key><string>meta.linecontinuation.matlab</string></dict><key>string</key><dict><key>patterns</key><array><dict><key>captures</key><dict><key>1</key><dict><key>name</key><string>string.interpolated.matlab</string></dict><key>2</key><dict><key>name</key><string>punctuation.definition.string.begin.matlab</string></dict></dict><key>comment</key><string>Shell command</string><key>match</key><string>^\s*((!).*$\n?)</string></dict><dict><key>begin</key><string>((?<=(\[|\(|\{|=|\s|;|:|,|~|<|>|&|\||-|\+|\*|/|\\|\.|\^))|^)'</string><key>beginCaptures</key><dict><key>0</key><dict><key>name</key><string>punctuation.definition.string.begin.matlab</string></dict></dict><key>comment</key><string>Character vector literal (single-quoted)</string><key>end</key><string>'(?=(\[|\(|\{|\]|\)|\}|=|~|<|>|&|\||-|\+|\*|/|\\|\.|\^|\s|;|:|,))</string><key>endCaptures</key><dict><key>0</key><dict><key>name</key><string>punctuation.definition.string.end.matlab</string></dict></dict><key>name</key><string>string.quoted.single.matlab</string><key>patterns</key><array><dict><key>match</key><string>''</string><key>name</key><string>constant.character.escape.matlab</string></dict><dict><key>match</key><string>'(?=.)</string><key>name</key><string>invalid.illegal.unescaped-quote.matlab</string></dict><dict><key>comment</key><string>Operator symbols</string><key>match</key><string>((\%([\+\-0]?\d{0,3}(\.\d{1,3})?)(c|d|e|E|f|g|G|s|((b|t)?(o|u|x|X))))|\%\%|\\(b|f|n|r|t|\\))</string><key>name</key><string>constant.character.escape.matlab</string></dict></array></dict><dict><key>begin</key><string>((?<=(\[|\(|\{|=|\s|;|:|,|~|<|>|&|\||-|\+|\*|/|\\|\.|\^))|^)"</string><key>beginCaptures</key><dict><key>0</key><dict><key>name</key><string>punctuation.definition.string.begin.matlab</string></dict></dict><key>comment</key><string>String literal 
(double-quoted)</string><key>end</key><string>"(?=(\[|\(|\{|\]|\)|\}|=|~|<|>|&|\||-|\+|\*|/|\\|\.|\^|\||\s|;|:|,))</string><key>endCaptures</key><dict><key>0</key><dict><key>name</key><string>punctuation.definition.string.end.matlab</string></dict></dict><key>name</key><string>string.quoted.double.matlab</string><key>patterns</key><array><dict><key>match</key><string>""</string><key>name</key><string>constant.character.escape.matlab</string></dict><dict><key>match</key><string>"(?=.)</string><key>name</key><string>invalid.illegal.unescaped-quote.matlab</string></dict></array></dict></array></dict><key>transpose</key><dict><key>match</key><string>((\w+)|(?<=\])|(?<=\)))\.?'</string><key>name</key><string>keyword.operator.transpose.matlab</string></dict><key>constants</key><dict><key>comment</key><string>MATLAB Constants</string><key>match</key><string>(?<!\.)\b(eps|false|Inf|inf|intmax|intmin|namelengthmax|NaN|nan|on|off|realmax|realmin|true|pi)\b</string><key>name</key><string>constant.language.matlab</string></dict><key>variables</key><dict><key>comment</key><string>MATLAB variables</string><key>match</key><string>(?<!\.)\b(nargin|nargout|varargin|varargout)\b</string><key>name</key><string>variable.other.function.matlab</string></dict><key>end_in_parens</key><dict><key>comment</key><string>end as operator symbol</string><key>match</key><string>\bend\b</string><key>name</key><string>keyword.operator.symbols.matlab</string></dict><key>numbers</key><dict><key>comment</key><string>Valid numbers: 1, .1, 1.1, .1e1, 1.1e1, 1e1, 1i, 1j, 1e2j</string><key>match</key><string>(?<=[\s\-\+\*\/\\=:\[\(\{,]|^)\d*\.?\d+([eE][+-]?\d)?([0-9&&[^\.]])*(i|j)?\b</string><key>name</key><string>constant.numeric.matlab</string></dict><key>operators</key><dict><key>comment</key><string>Operator 
symbols</string><key>match</key><string>(?<=\s)(==|~=|>|>=|<|<=|&|&&|:|\||\|\||\+|-|\*|\.\*|/|\./|\\|\.\\|\^|\.\^)(?=\s)</string><key>name</key><string>keyword.operator.symbols.matlab</string></dict></dict><key>scopeName</key><string>source.matlab</string><key>uuid</key><string>48F8858B-72FF-11D9-BFEE-000D93589AF6</string></dict></plist>
Dave Wingate provides some ANTLR resources that look like an excellent place to start.
As noted in his README file, he doesn't include the transpose operator and a few other tricky parses. See the mparser link here:
http://web.mit.edu/~wingated/www/resources.html
Some of the tricky bits of earlier versions of MATLAB (circa 1999) are also described in a document by a group from Northwestern. It includes EBNF-like descriptions and outlines some of the nastier cases in footnotes.
http://www.ece.northwestern.edu/cpdc/pjoisha/MAGICA/CPDC-TR-9909-017.pdf
I've collected a couple of other, less relevant sources, but Stack Overflow's editor bot tells me I don't have enough reputation points to post more than two links.
You can start by adapting the MATLAB -> Python converter smop that is itself written in Python. It uses PLY (Python lex-yacc). The files you would likely be interested in starting from are lexer.py and parse.py.
See also this answer for a list of converters from MATLAB to Python.
I'm trying to get this to work:
def emptyCond: Parser[Cond] = ("if" ~ "(") ~> regularStr <~ ")" ^^ { case s => Cond("",Nil,Nil) }
where regularStr is defined to accept a number of things, including ")". Of course, I want this to be an acceptable input: if(foo()). But for any if(x) it is taking the ")" as part of the regularStr and so this parser never succeeds.
What am I missing?
Edit:
regularStr is not a regular expression. It is defined thus:
def regularStr = rep(ident | numericLit | decimalLit | stringLit | stmtSymbol) ^^ { case s => s.mkString(" ") }
and the symbols are:
val stmtSymbol = "*" | "&" | "." | "::" | "(" | ")" | "*" | ">=" | "<=" | "=" |
"<" | ">" | "|" | "-" | "," | "^" | "[" | "]" | "?" | ":" | "+" |
"-=" | "+=" | "*=" | "/=" | "&&" | "||" | "&=" | "|="
I don't need an exhaustive language check, just the control structures. So I don't really care what's inside the "()" of an if(); I want to accept any sequence of identifiers, symbols, etc. For my purposes even if())) should be valid, where "))" is the if's "condition".
A regular expression cannot recognize a language that has nested, balanced constructs such as (...), [...], {...}, etc. So you're going to need to use further context-free productions (not regular expressions) to match the regularStr portions.
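To make the point concrete, here is a minimal Python sketch (Python rather than Scala, and not part of the parser in the question) of a bracket-balance check. The name `is_balanced` is hypothetical. The check needs a stack that can grow without bound, which is what a finite automaton, and therefore a true regular expression, does not have.

```python
# Map each closing bracket to the opening bracket it must match.
PAIRS = {")": "(", "]": "[", "}": "{"}


def is_balanced(text: str) -> bool:
    """Return True if every bracket in text is closed in the right order."""
    stack = []
    for ch in text:
        if ch in PAIRS.values():
            # Opening bracket: remember it until its partner shows up.
            stack.append(ch)
        elif ch in PAIRS:
            # Closing bracket: it must match the most recent opener.
            if not stack or stack.pop() != PAIRS[ch]:
                return False
    # Anything left on the stack was opened but never closed.
    return not stack
```

For example, `is_balanced("foo()")` is true, while `is_balanced("))")` is false because the first ")" arrives with nothing open.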
OK, accepting if())) was not really a requirement, just an example of what I would be willing to accept in order to make my parsing as cheap as possible, to just worry about capturing control structures.
However, it appears I can't be that cheap and still have it work. Since the if() construct has parentheses, all I have to do is require that whatever is inside has well-balanced parentheses. A closing ")" where one isn't expected cannot be part of the condition.
I did this:
val regularNoParens = ident | numericLit | decimalLit | stringLit | stmtSymbol
def regularParens: Parser[String] = "(" ~ rep(regularNoParens | regularParens) ~ ")" ^^ { case l ~ s ~ r => l + s.mkString(" ") + r }
def regularStr = rep(regularNoParens | regularParens) ^^ { case s => s.mkString(" ") }
And I took out "(" and ")" from stmtSymbol. Works!
Edit: it didn't support nesting, fixed it.
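For readers who don't use Scala, the same idea can be sketched in plain Python as a small recursive-descent routine. This is only an illustration under assumptions: the tokenizer is a simplification (it splits every symbol into single characters, unlike the Scala lexer), and the names `tokenize`, `regular_parens` and `if_condition` are hypothetical.

```python
import re

# Simplified tokens: words/numbers, double-quoted strings, parentheses,
# or any other single non-space symbol.
TOKEN = re.compile(r'\w+|"[^"]*"|[()]|[^\s\w()]')


def tokenize(src):
    """Split src into a flat list of tokens, dropping whitespace."""
    return TOKEN.findall(src)


def regular_parens(tokens, pos):
    """Parse '(' (non-paren token | regular_parens)* ')' starting at pos.

    Returns (text, next_pos) on success, or None on failure.
    """
    if pos >= len(tokens) or tokens[pos] != "(":
        return None
    parts, pos = [], pos + 1
    while pos < len(tokens) and tokens[pos] != ")":
        if tokens[pos] == "(":
            # A nested group: recurse so its ")" is consumed by the inner call.
            nested = regular_parens(tokens, pos)
            if nested is None:
                return None
            text, pos = nested
            parts.append(text)
        else:
            parts.append(tokens[pos])
            pos += 1
    if pos >= len(tokens):
        return None  # ran out of input before the closing ")"
    return "(" + " ".join(parts) + ")", pos + 1


def if_condition(src):
    """Return the parenthesized condition of 'if(...)', or None if unbalanced."""
    tokens = tokenize(src)
    if tokens[:1] != ["if"]:
        return None
    result = regular_parens(tokens, 1)
    if result is None:
        return None
    text, pos = result
    # Reject trailing tokens such as a stray ")".
    return text if pos == len(tokens) else None
```

With this sketch, `if_condition("if(foo())")` returns `"(foo ())"`, while `if_condition("if(x))")` returns `None`, because the extra ")" is left over once the balanced group has been consumed.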