Xtext - Custom made terminal string without quotes - eclipse

I'm new to Xtext and have a problem.
When I try to create a terminal string without quotes, I always get EOF errors.
If I comment out the code for the string without quotes, I don't get an error and everything works fine.
Can someone explain this to me?
Or give me a hint on how I could solve this better?
Thank you very much.
// String without quotes
terminal STRINGWQ: ( ('a'..'z'|'A'..'Z')('a'..'z' | 'A'..'Z' | '_'| '-' | '§' | '?' | '!'| '#'
| '\n' | ':' |'%' | '.' | '*' | '^' | ',' | '&' | '('|')'| '0'..'9'|' ')*);
Rest of the code:
grammar org.xtext.example.mydsl.MyDsl with org.eclipse.xtext.common.Terminals
generate myDsl "http://www.xtext.org/example/mydsl/MyDsl"
Model:
(elements += GITest)*
;
GITest:
KWHeader | KWTestCase
;
// KeyWords Header
KWHeader:
'Test' '!''?'
;
KWTestCase:
'testcase' int=INT ':' title = ID |
'Hello' names=ID '!'
;
UPDATE:
Data Type Rule
QSTRING returns ecore::EString: //custom terminal SurveyString
(('a'|'b'|'c'|'d'|'e'|'f'|'g'|'h'|'i'|'j'|'k'|'l'|'m'|'n'|'o'|'p'|'q'|'r'|'s'|'t'|'u'|'v'|'w'|'x'|'y'|'z'|
'A'|'B'|'C'|'D'|'E'|'F'|'G'|'H'|'I'|'J'|'K'|'L'|'M'|'N'|'O'|'P'|'Q'|'R'|'S'|'T'|'U'|'V'|'W'|'X'|'Y'|'Z'|' ')
('a'|'b'|'c'|'d'|'e'|'f'|'g'|'h'|'i'|'j'|'k'|'l'|'m'|'n'|'o'|'p'|'q'|'r'|'s'|'t'|'u'|'v'|'w'|'x'|'y'|'z'|
'A'|'B'|'C'|'D'|'E'|'F'|'G'|'H'|'I'|'J'|'K'|'L'|'M'|'N'|'O'|'P'|'Q'|'R'|'S'|'T'|'U'|'V'|'W'|'X'|'Y'|'Z'|
' '|'_'|'-'|'§'|'?'|'!'|'#'|'%'|'.'|'*'|'^'|','|'&'|'('|')'|'0'|'1'|'2'|'3'|'4'|'5'|'6'|'7'|'8'|'9')*);
UPDATE 2:
Got it working with a data type rule and by modifying ID.
Code:
STRINGWQ: ((' ')?ID)((ID)?(INT)? ' ' (ID)?);
terminal ID: '^'?('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'_'|'0'..'9'|'/'|';'|','|'#'|'!'|'§'|'$'|'%'|'&'|
'('|')'|'='|'?'|'\\'|'*'|'+'|'.'|'-'|'>'|'<'|'|'|'['|']'|'{'|'}')*;
But now I have the problem that Xtext does not recognize where STRINGWQ ends.
So I don't get keyword suggestions on the next line.
For example, if I use INT instead of STRINGWQ, I get suggestions on the next line.
But with STRINGWQ I don't.
How can I define the end of a data type rule?
Thank you

Your STRINGWQ shadows the terminal rule ID and basically all other rules, including the keywords. Chances are good that the entire document will be consumed as a single terminal token of type STRINGWQ. You should try to model your string as a datatype rule.
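To illustrate the shadowing, here is a minimal Python sketch of a greedy lexer. It is only a simplified model, not the lexer Xtext actually generates: the rule list, the `tokenize` helper and the sample input are made up for illustration, and the STRINGWQ pattern is a regex approximation of the terminal from the question.

```python
import re

# Hypothetical token rules, loosely mirroring the grammar in the question.
# STRINGWQ approximates the custom terminal as a regular expression.
RULES = [
    ("STRINGWQ", r"[A-Za-z][A-Za-z_\-§?!#\n:%.*^,&()0-9 ]*"),
    ("KEYWORD", r"testcase|Hello|Test"),
    ("ID", r"\^?[A-Za-z_][A-Za-z_0-9]*"),
    ("INT", r"[0-9]+"),
    ("SYMBOL", r"[:!?]"),
    ("WS", r"[ \t\r\n]+"),
]


def tokenize(text, rules):
    """Greedy lexer model: the longest match wins, ties go to the earlier rule."""
    tokens = []
    pos = 0
    while pos < len(text):
        best_name, best_len = None, 0
        for name, pattern in rules:
            match = re.compile(pattern).match(text, pos)
            if match and len(match.group()) > best_len:
                best_name, best_len = name, len(match.group())
        if best_name is None:
            raise ValueError(f"no rule matches at offset {pos}")
        tokens.append((best_name, text[pos:pos + best_len]))
        pos += best_len
    return tokens


source = "testcase 1: foo\nHello bar!"
with_stringwq = tokenize(source, RULES)          # STRINGWQ swallows everything
without_stringwq = tokenize(source, RULES[1:])   # keywords, INT and ID reappear
```

With the STRINGWQ rule present, the whole input comes back as one token, so the parser never sees a keyword. Without it, the input splits into the keyword, INT and ID tokens the parser rules expect.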

This works for me:
Property:
id=ID | int=INT | prop=PROPERTY_VALUE | spec=SPECIAL;
PROPERTY_VALUE:
(':');
SPECIAL:
(' ' | '/' | ';' | ',' | '!' | '§' | '%' | '&' | '(' | ')' | '?' | '*' | '+' | '.' | '-' | '|' | '[' | ']');
I separated the colon because I need to use PROPERTY_VALUE in another way.
But you could also add it to SPECIAL.

Related

Dart and Antlr4 on Web - Mismatched Input

Wow, this was a terribly worded query, let me try again.
I'm still learning ANTLR and trying to understand grammars. I'm using a grammar that I did not write, so I'm trying not to adjust it too much, as it's the standard used by many groups (found here).
I'm using it in a Flutter application. When I run it on Linux or Android, it runs without issue. When I try to run it on web, I immediately have issues. The full grammar I'm using is below.
grammar FhirPath;
// Grammar rules [FHIRPath](http://hl7.org/fhirpath/N1) Normative Release
//prog: line (line)*; line: ID ( '(' expr ')') ':' expr '\r'? '\n';
entireExpression: expression EOF;
expression:
term # termExpression
| expression '.' invocation # invocationExpression
| expression '[' expression ']' # indexerExpression
| ('+' | '-') expression # polarityExpression
| expression ('*' | '/' | 'div' | 'mod') expression # multiplicativeExpression
| expression ('+' | '-' | '&') expression # additiveExpression
| expression '|' expression # unionExpression
| expression ('<=' | '<' | '>' | '>=') expression # inequalityExpression
| expression ('is' | 'as') typeSpecifier # typeExpression
| expression ('=' | '~' | '!=' | '!~') expression # equalityExpression
| expression ('in' | 'contains') expression # membershipExpression
| expression 'and' expression # andExpression
| expression ('or' | 'xor') expression # orExpression
| expression 'implies' expression # impliesExpression;
//| (IDENTIFIER)? '=>' expression #lambdaExpression
term:
invocation # invocationTerm
| literal # literalTerm
| externalConstant # externalConstantTerm
| '(' expression ')' # parenthesizedTerm;
literal:
'{' '}' # nullLiteral
| ('true' | 'false') # booleanLiteral
| STRING # stringLiteral
| NUMBER # numberLiteral
| DATE # dateLiteral
| DATETIME # dateTimeLiteral
| TIME # timeLiteral
| quantity # quantityLiteral;
externalConstant: '%' ( identifier | STRING);
invocation: // Terms that can be used after the function/member invocation '.'
identifier # memberInvocation
| function # functionInvocation
| '$this' # thisInvocation
| '$index' # indexInvocation
| '$total' # totalInvocation;
function: identifier '(' paramList? ')';
paramList: expression (',' expression)*;
quantity: NUMBER unit?;
unit:
pluralDateTimePrecision
| dateTimePrecision
| STRING; // UCUM syntax for units of measure
pluralDateTimePrecision:
'years'
| 'months'
| 'weeks'
| 'days'
| 'hours'
| 'minutes'
| 'seconds'
| 'milliseconds';
dateTimePrecision:
'year'
| 'month'
| 'week'
| 'day'
| 'hour'
| 'minute'
| 'second'
| 'millisecond';
typeSpecifier: qualifiedIdentifier;
qualifiedIdentifier: identifier ('.' identifier)*;
identifier:
IDENTIFIER
| DELIMITEDIDENTIFIER
| 'as'
| 'is'
| 'contains'
| 'in'
| 'div';
/****************************************************************
 Lexical rules
 *****************************************************************/
/*
NOTE: The goal of these rules in the grammar is to provide a date token to the parser. As such it
is not attempting to validate that the date is a correct date, that task is for the parser or
interpreter.
*/
DATE: '#' DATEFORMAT;
DATETIME:
'#' DATEFORMAT 'T' (TIMEFORMAT TIMEZONEOFFSETFORMAT?)?;
TIME: '#' 'T' TIMEFORMAT;
fragment DATEFORMAT:
[0-9][0-9][0-9][0-9] ('-' [0-9][0-9] ('-' [0-9][0-9])?)?;
fragment TIMEFORMAT:
[0-9][0-9] (':' [0-9][0-9] (':' [0-9][0-9] ('.' [0-9]+)?)?)?;
fragment TIMEZONEOFFSETFORMAT: (
'Z'
| ('+' | '-') [0-9][0-9]':' [0-9][0-9]
);
IDENTIFIER: ([A-Za-z] | '_') ([A-Za-z0-9] | '_')*;
// Added _ to support CQL (FHIR could constrain it out)
DELIMITEDIDENTIFIER: '`' (ESC | ~[\\`])* '`';
STRING: '\'' (ESC | ~['])* '\'';
// Also allows leading zeroes now (just like CQL and XSD)
NUMBER: [0-9]+ ('.' [0-9]+)?;
// Pipe whitespace to the HIDDEN channel to support retrieving source text through the parser.
WS: [ \r\n\t]+ -> channel(HIDDEN);
COMMENT: '/*' .*? '*/' -> channel(HIDDEN);
LINE_COMMENT: '//' ~[\r\n]* -> channel(HIDDEN);
fragment ESC:
'\\' ([`'\\/fnrt] | UNICODE); // allow \`, \', \\, \/, \f, etc. and \uXXX
fragment UNICODE: 'u' HEX HEX HEX HEX;
fragment HEX: [0-9a-fA-F];
I generate the code with the following:
antlr4 -Dlanguage=Dart FhirPath.g4 -visitor -no-listener
Then to test I use the following code:
final input = InputStream.fromString('name');
final lexer = FhirPathLexer(input);
final tokens = CommonTokenStream(lexer);
final parser = FhirPathParser(tokens);
parser.buildParseTree = true;
final tree = parser.expression();
If I run it in a simple dart script, it runs without issue. But if I put it in a Flutter application (again, only on web, otherwise it appears to run without issue), I get this error:
line 1:0 mismatched input 'name' expecting {'as', 'in', 'is', 'contains', 'div', 'mod', IDENTIFIER, DELIMITEDIDENTIFIER}
I assume there's something I don't understand about the grammar, so any insights would be appreciated.
I've concluded this is an error with transpiling to JavaScript. The antlr4 version passes all of my tests on Android and Linux, then throws a bunch of errors on web. I've gone back to using petitparser instead of antlr4. If anyone else has suggestions, feel free to leave them, but for now I'm going to close this. If you want to compare how the two look, I have both versions here: https://github.com/MayJuun/fhir/tree/main/fhir_path/lib

Powershell script remove | from csv file

I have a CSV file that is outputting data in this format:
      sitename      |         groupname          |        grouprole
--------------------+----------------------------+--------------------------
 Administration     | Group1                     | NewRole
 Finance            | Group1                     | NewRole
 Default            | Group1                     | NewRole
I am trying to remove the | marks and replace them with a ; delimiter.
I am already using: ConvertTo-Csv -Delimiter ';' -NoType| % {$_ -replace ' ',''} to remove the padded spaces.
I tried using % {$_ -replace '|',';'} to have it replace the | with ; so that it would format to a proper CSV file. Instead the results were:
;s;i;t;e;n;a;m;e;|;g;r;o;u;p;n;a;m;e;|;g;r;o;u;p;r;o;l;e;
How do I go about removing the | in a CSV file and replacing it with a proper delimiter?
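The odd output is what a regex engine produces when | is not escaped: the pattern is then an alternation between two empty patterns, which matches the empty string at every position. PowerShell's -replace takes a regular expression, so the literal pipe has to be written as '\|'. The Python sketch below shows the same effect with the standard re module; the sample line is the header from the question with the padding already removed.

```python
import re

line = "sitename|groupname|grouprole"

# Unescaped: "|" is an alternation of two empty patterns, so it matches
# the empty string before every character and at the end of the line.
unescaped = re.sub("|", ";", line)

# Escaped: "\|" matches only a literal pipe character.
escaped = re.sub(r"\|", ";", line)

print(unescaped)
print(escaped)
```

Only the escaped pattern gives a line that a CSV reader can split on ;. The separator row of dashes and plus signs is a separate matter and would still need to be filtered out.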

postgres: Cannot use to_tsvector on a column named 'text'

I'm using PostgreSQL 11.5 and am trying to do the following:
update ccnc set fulltext_tokens = to_tsvector(title || '. ' || description || '. ' || text ) where fulltext_tokens is NULL;
Which results in
ERROR: column " text" does not exist
LINE 1: ...o_tsvector(title || '. ' || description || '. ' || text ) wh...
^
HINT: Perhaps you meant to reference the column "ccnc.text".
However, using ccnc.text does not help either:
felix=# update ccnc set fulltext_tokens = to_tsvector(title || '. ' || description || '. ' || ccnc.text) where fulltext_tokens is NULL;
ERROR: missing FROM-clause entry for table " ccnc"
LINE 1: ...o_tsvector(title || '. ' || description || '. ' || ccnc.text...
... nor is there anything odd about the column itself (no trailing spaces or the like):
felix=# \d+ ccnc
                                                               Table "public.ccnc"
     Column      |            Type             | Collation | Nullable |             Default              | Storage  | Stats target | Description
-----------------+-----------------------------+-----------+----------+----------------------------------+----------+--------------+-------------
 id              | integer                     |           | not null | nextval('ccnc_id_seq'::regclass) | plain    |              |
 description     | text                        |           |          | ''::text                         | extended |              |
 text            | text                        |           |          |                                  | extended |              |
 title           | text                        |           |          |                                  | extended |              |
Edit:
Quoting does not help either, e.g.:
update ccnc set fulltext = title || '. ' || description || '. ' || "text";
ERROR: syntax error at or near ""text""
LINE 1: ... fulltext = title || '. ' || description || '. ' || "text";
I'd greatly appreciate any help on how to create that new column named fulltext_tokens. Thank you in advance :-)
Apart from the very valuable comment by Richard (i.e., avoid a column name such as "text" in the first place), one way to avoid the error is to use concat or concat_ws instead of the concatenation operator ||.
For instance:
update ccnc set fulltext = concat_ws('. ', title, description, text)
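One side effect of the switch is worth knowing: in PostgreSQL, a || chain returns NULL as soon as any operand is NULL, whereas concat_ws skips NULL arguments. Since title and text are nullable with no default in the table above, the two forms can give different results. The Python functions below are only a rough model of that difference, with None standing in for NULL; they are not how PostgreSQL implements it.

```python
from typing import Optional


def concat_operator(*parts: Optional[str]) -> Optional[str]:
    """Model of a || b || c: any NULL (None) operand makes the result NULL."""
    if any(p is None for p in parts):
        return None
    return "".join(parts)


def concat_ws(sep: str, *parts: Optional[str]) -> str:
    """Model of concat_ws(sep, ...): NULL (None) arguments are skipped."""
    return sep.join(p for p in parts if p is not None)


# A row whose "text" column is NULL, as the table definition allows.
title, description, text = "Some title", "", None

with_operator = concat_operator(title, ". ", description, ". ", text)
with_concat_ws = concat_ws(". ", title, description, text)
```

For this row the operator version yields NULL, so to_tsvector would get nothing to index, while the concat_ws version still yields the title and description.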

How to preserve new line character while performing psql copy command

I have the following content in my CSV file (with 3 columns):
141413,"\"'/x=/></script></title><x><x/","Mountain View, CA\"'/x=/></script></title><x><x/"
148443,"CLICK LINK BELOW TO ENTER^^^^^^^^^^^^^^","model\
\
xxx lipsum as it is\
\
100 sometimes unknown\
\
travel evening market\
"
When I import the above CSV into MySQL using the following command, it treats the backslash (\) at the end of a line as a newline, which is the expected behavior.
LOAD DATA INFILE '1.csv' INTO TABLE users FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\"' LINES TERMINATED BY '\n';
MYSQL Output
But when I try to import it into PostgreSQL using the COPY command, it treats \ as a normal character.
copy users from '1.csv' WITH (FORMAT csv, DELIMITER ',', ENCODING 'utf8', NULL "\N", QUOTE E'\"', ESCAPE '\');
postgres Output
Try parsing these \ before importing the CSV file, e.g. using perl -pe or sed and the STDIN from psql:
$ cat 1.csv | perl -pe 's/\\\n/\n/g' | psql testdb -c "COPY users FROM STDIN WITH (FORMAT csv, DELIMITER ',', ENCODING 'utf8', NULL "\N", QUOTE E'\"', ESCAPE '\');"
This is what it looks like after the import:
testdb=# select * from users;
   id   |                 company                 |                    location
--------+-----------------------------------------+-------------------------------------------------
 141413 | "'/x=/></script></title><x><x/          | Mountain View, CA"'/x=/></script></title><x><x/
 148443 | CLICK LINK BELOW TO ENTER^^^^^^^^^^^^^^ | model                                          +
        |                                         |                                                +
        |                                         | xxx lipsum as it is                            +
        |                                         |                                                +
        |                                         | 100 sometimes unknown                          +
        |                                         |                                                +
        |                                         | travel evening market                          +
        |                                         |
(2 rows)
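If perl is not available, the same preprocessing step can be sketched in Python. The function name and the sample data below are made up for illustration. The idea is identical to the perl one-liner: drop every backslash that sits directly before a newline, so the quoted field keeps plain line breaks.

```python
import csv
import io
import re


def unescape_line_breaks(raw: str) -> str:
    """Drop a backslash that sits directly before a newline."""
    return re.sub(r"\\\n", "\n", raw)


# Hypothetical sample: one record whose quoted field spans several lines,
# each ending in a backslash as in the question's file.
sample = '148443,"model\\\n\\\nxxx lipsum as it is\\\n"\n'
cleaned = unescape_line_breaks(sample)

# Parsing the cleaned text shows that the line breaks survive inside the field.
rows = list(csv.reader(io.StringIO(cleaned)))
```

The cleaned text could then be written to a file or piped to psql on STDIN in place of the perl output.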

Rule not recognized

I've written a grammar which should allow me to define variables and arrays. Everything worked fine until I split the variables into local and global variables. Now my parser doesn't recognize the arrays anymore (it treats them as variables and reports syntax errors).
My Grammar:
grammar sqf.Sqf with org.eclipse.xtext.common.Terminals
generate sqf "http://www.Sqf.sqf"
Model:
elements += Element*
;
Element:
Declaration ";" | Command ";"
;
Declaration:
Array | Variable
;
Variable:
LocalVariable | GlobalVariable
;
LocalVariable:
name=LOCALVARNAME "=" content=VARCONTENT (("+"|"-"|"*"|"/") content2+=VARCONTENT)*
;
GlobalVariable:
name=GLOBALVARNAME "=" content=VARCONTENT (("+"|"-"|"*"|"/") content2+=VARCONTENT)*
;
Array:
name=ID "=" content=ArrayLiteral | name=ID "=" "+" content2=[Array]
;
ArrayLiteral:
"[" (content += ArrayContent)* "]" (("+"|"-")content1+=Extension)*
;
ArrayContent:
content01=Acontent ("," content02+=Acontent)*
;
Acontent:
STRING | DOUBLE | ArrayLiteral
;
Extension:
STRING | DOUBLE
;
Command:
Interaction
;
Interaction:
hint
;
hint:
Normal | Format | Special
;
Normal:
name=("hint" | "hintC" | "hintCadet" | "hintSilent") content=STRING
;
Format:
name=("hint" | "hintC" | "hintCadet" | "hintSilent") "format" "[" content=STRING "," variable=DECREF "]"
;
Special:
hintCArray
;
hintCArray:
title=STRING "hintC" (content1=ArrayLiteral | content=STRING)
;
VARCONTENT:
STRING | DOUBLE | DECREF | "true" | "false" | "nil"
;
DOUBLE:
INT ("."INT)?
;
DECREF:
ref1=[Array|ID] | ref2=[LocalVariable|LOCALVARNAME] | ref3=[GlobalVariable|GLOBALVARNAME]
;
terminal LOCALVARNAME:
"_" ('a'..'z'|'A'..'Z') ('a'..'z'|'A'..'Z'|'_'|'0'..'9')*
;
terminal GLOBALVARNAME:
'^'?('a'..'z'|'A'..'Z') ('a'..'z'|'A'..'Z'|'_'|'0'..'9')*
;
Does anybody have an idea what the problem is?
(Any other code improvements are welcome, too.)
Regards, Krzmbrzl
Your rule GLOBALVARNAME completely shadows the rule ID. You could simply use ID instead of GLOBALVARNAME.
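To see which terminal claims a given name, the three rules can be transcribed into regular expressions and tried in order. This Python sketch is only an approximation: the helper name is made up, and the ordering (the grammar's own terminals before the inherited ID from org.eclipse.xtext.common.Terminals) is an assumption about how ties are resolved.

```python
import re

# Terminal rules transcribed as regular expressions. The order is an
# assumption of this sketch: own terminals first, inherited ID last.
RULES = [
    ("LOCALVARNAME", r"_[A-Za-z][A-Za-z_0-9]*"),
    ("GLOBALVARNAME", r"\^?[A-Za-z][A-Za-z_0-9]*"),
    ("ID", r"\^?[A-Za-z_][A-Za-z_0-9]*"),
]


def first_full_match(word, rules):
    """Return the name of the first rule whose pattern matches the whole word."""
    for name, pattern in rules:
        if re.fullmatch(pattern, word):
            return name
    return None


array_name = first_full_match("myArray", RULES)
local_name = first_full_match("_counter", RULES)
leftover = first_full_match("_1", RULES)
```

Under these assumptions an ordinary name such as myArray is always reported as GLOBALVARNAME, so the name=ID in the Array rule can never match it. Only unusual names such as _1 would still arrive as ID.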