How to fix mismatch input x expecting y - character

I am new to ANTLR and to creating parse trees. I am trying to create tokens that include a special character, but when I do so I get an input mismatch.
I have tried to add a special character to my lexer rules by adding a '.' at the end; however, when I do so it gives me an input mismatch error. The snippet I am trying works on its own but not as part of the entire grammar.
This is the code I have so far:
grammar Grammar4;
r : WORD', 'NUMBER', 'BOOL', 'SENT+;
BOOL : 'true' | 'false';
WORD : [a-zA-Z]+;
NUMBER : [0-9]+;
SENT : [a-zA-Z ]+;
WS : [ \t\r\n]+ -> skip ;
If I add a period at the end of SENT to allow for special characters ([a-zA-Z ]+.;) then I get an input mismatch. If I take that line out and use it independently of the rest, then I can have a sentence like "How are you today!" and have it tokenize fine. Any help is greatly appreciated.
Edited for clarity:
I am trying to parse a statement like: Alex, 31, false, I let the dog out! (Note that I can get everything to parse as an individual token except the last special character, and I would like "I let the dog out!" to be one token.)


Regex : findall with a repeated capture group

I would like to understand why :
re.findall(r"(\d[A-Za-z]+)", "My user name is 3e4r 5fg")
returns
['3e', '4r', '5fg']
while :
re.findall(r"(\d[A-Za-z]+)+", "My user name is 3e4r 5fg")
returns
['4r', '5fg']
I tested some combinations with spaces between the "digit-letter" groups, and two things are clearly involved:
the spaces between those groups
the last "+".
I don't really understand why adding "+" after the group changes the result. Can someone explain to me the steps of the process that lead to those different answers? Thank you very much.
When you put + after the parentheses, you are searching for a pattern made of one or more repetitions of the sub-pattern: one digit followed by one or more letters.
So the pattern "(\d[A-Za-z]+)+" returns 2 matches:
3e4r
5fg
When you put a sub-pattern in parentheses, whatever it matches is captured in a group. When the group is repeated, it keeps only the last repetition it matched, so the groups are:
4r
5fg
The function re.findall returns only the groups (unless there are no groups, in which case it returns the full matches).
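The difference between the whole match and the captured group can be checked directly. This is a minimal sketch using only the standard re module; the variable names are illustrative, and the non-capturing variant (?:...) is an addition that is not part of the original question.

```python
import re

text = "My user name is 3e4r 5fg"

# Repeated capturing group: findall reports group 1, which holds
# only the last repetition inside each overall match.
repeated = re.findall(r"(\d[A-Za-z]+)+", text)

# finditer exposes both the whole match (group 0) and the captured group (group 1).
pairs = [(m.group(0), m.group(1)) for m in re.finditer(r"(\d[A-Za-z]+)+", text)]

# Non-capturing group: with no capturing groups, findall returns whole matches.
whole = re.findall(r"(?:\d[A-Za-z]+)+", text)

print(repeated)  # ['4r', '5fg']
print(pairs)     # [('3e4r', '4r'), ('5fg', '5fg')]
print(whole)     # ['3e4r', '5fg']
```

If the goal is to get every "digit-letters" piece, the pattern without the trailing + is the one to use; if the goal is to get each run as a whole, the non-capturing form does that.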

Microsoft graph Mail Search Strict value

I have an issue with the search parameters. I want to pass a phrase in my query. For example, I'm looking for emails where the subject is "Test 1".
For this I'm doing a GET on this resource:
https://graph.microsoft.com/v1.0/me/messages?$search="subject:Test 1"
But the behaviour of this query is: look for mails that contain "Test" in the subject OR "1" in any other field.
Referring to the KQL syntax:
A phrase (includes two or more words together, separated by spaces; however, the words must be enclosed in double quotation marks)
So, to do what I want, I have to put double quotes (") around my phrase to do a strict value search, like below:
subject:"Test 1"
This is where the problem is. The Microsoft Graph API already uses double quotes (") after the $search parameter.
?$search="Key words"
So I can't do what is mentioned in the KQL doc.
https://graph.microsoft.com/v1.0/me/messages?$search="subject:"Test 1""
It's throwing an error :
"Syntax error: character '1' is not valid at position 15 in '\"subject:\"test 1\"\"'.",
That is expected behaviour. I was pretty sure it would not work.
If someone has any suggestions for a solution or a workaround, I'm interested.
What I've already tried so far:
Using single quotes
Removing the quotes right after $search=
Removing the subject part, $search="Test 1": same behaviour as the first request mentioned in this post. It looks for emails that contain "test" or "1".
Best regards.
EDIT :
After sasfrog's answer:
I used $filter. It works well with the simple operators AND and OR. I get some errors when using the NOT operator. Also, you have to use the $orderby parameter to sort the results by date, and add that field to the filter parameters.
Example 1 (working, what I asked for first):
https://graph.microsoft.com/v1.0/me/messages/?$orderby=receivedDateTime desc &$filter=receivedDateTime ge 1900-01-01T00:00:00Z AND contains(subject,'test 1')
Example 2 (not working):
https://graph.microsoft.com/v1.0/me/messages/?$orderby=receivedDateTime desc &$filter=(receivedDateTime ge 1900-01-01T00:00:00Z AND contains(subject,'test 1')) NOT(contains(from/EmailAddress/address,[specific address]))
EDIT 2
After some tests with the filter parameters:
The NOT operator is still not working, so as a workaround use "ne" (not equals).
Example 2 becomes:
https://graph.microsoft.com/v1.0/me/messages/?$orderby=receivedDateTime desc&$filter=(receivedDateTime ge 1900-01-01T00:00:00Z AND contains(subject,'test 1')) AND (from/EmailAddress/address ne [specific address])
UPDATE : OTHER SOLUTION WITH $search
Using $filter is great, but it was sometimes pretty slow. So I found a workaround for my issue.
It is to use the AND operator between all terms.
Example 4:
I'm looking for the mails where the subject is test 1.
Let value = "test 1". You have to split it on the space separator, and then write some code to manipulate the resulting array, to obtain something like below.
$search="(subject:test AND subject:1)"
The brackets can be important if you search on multiple fields. And voilà.
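The split-and-join step described above could look like the following. This is only a sketch: build_search is a hypothetical helper name, it assumes terms are separated by whitespace, and it leaves out URL-encoding and the HTTP request itself.

```python
def build_search(field: str, phrase: str) -> str:
    """Build a $search value that requires every word of `phrase` in `field`.

    Hypothetical helper for illustration; it only assembles the string.
    """
    terms = phrase.split()  # split on whitespace
    clause = " AND ".join(f"{field}:{term}" for term in terms)
    # The brackets keep the clause together when several fields are combined.
    return f'"({clause})"'


search_value = build_search("subject", "test 1")
print("$search=" + search_value)  # $search="(subject:test AND subject:1)"
```

Note that this matches messages whose subject contains both words, which is not quite the same as an exact phrase match.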
Not sure if it's sufficient for what you're doing, but how about using the contains function within a filter query instead:
https://graph.microsoft.com/v1.0/me/messages?$filter=contains(subject,'Test 1')
Sounds like you're already looking at the doco but here it is just in case.
Update: this also worked for me using the search method:
https://graph.microsoft.com/v1.0/me/messages?$search="subject:'Test 1'"

Checking the format of a string in Matlab

So I'm reading multiple text files in Matlab that have, in their first columns, a column of "times". These times are either in the format 'MM:SS.milliseconds' (sorry if that's not the proper way to express it), where for example the string '29:59.9' would be (29*60)+(59)+(.9) = 1799.9 seconds, or in the format of straight seconds.milliseconds, where '29.9' would mean 29.9 seconds. The format is the same within a single file, but varies across different files. Since I would like the times to be in the second format, I would like to check whether the strings match the first format. If they do, convert them; otherwise, continue. The code below is my code to convert, so my question is: how do I approach checking the format of the string? In other words, I need some condition for an if statement to check whether the format is wrong.
%% Modify the textdata to convert time to seconds
timearray = textdata(2:end, 1);
if (timearray(1, 1) %{has format 'MM.SS.millisecond}%)
datev = datevec(timearray);
newtime = (datev(:, 5)*60) + (datev(:, 6));
elseif(timearray(1, 1) %{has format 'SS.millisecond}%)
newtime = timearray;
You can use regular expressions to help you out. Regular expressions are methods of specifying how to search for particular patterns in strings. As such, you want to find if a string follows the formats of either:
xx:xx.x
or:
xx.x
The regular expression syntax for each of these is defined as the following:
^[0-9]+:[0-9]+\.[0-9]+
^[0-9]+\.[0-9]+
Let's step through how each of these work.
For the first one, the ^[0-9]+ means that the string should start with a digit (^[0-9]), and the + means that there should be at least one digit. As such, 1, 2, ... 10, ... 20, ... etc. are all valid for this beginning. The number should then be followed by a :, followed by another sequence of one or more digits. After that there is a . separator, followed by another sequence of digits. Notice how I used \. to specify the . character. Using . by itself means that the character is a wildcard. This is obviously not what you want, so if you want to specify the actual . character, you need to prepend a \ to the . character.
For the second one, it's almost the same as the first one. However, there is no : delimiter, and we only have the . to work with.
To invoke regular expressions, use the regexp command in MATLAB. It is done using:
ind = regexp(str, expression);
str represents the string you want to check, and expression is a regular expression like the ones we talked about above. You need to make sure you enclose your expression in single quotes, since the regular expression is passed in as a string. ind would then return the starting index in your string where the match was found. As such, when we search for a particular format, ind should either be 1, indicating that we found the pattern at the beginning of the string, or empty ([]) if no match was found. Here's a reproducible example for you:
B = {'29:59.9', '29.9', '45:56.8', '24.5'};
for k = 1 : numel(B)
if (regexp(B{k}, '^[0-9]+:[0-9]+\.[0-9]+') == 1)
disp('I''m the first case!');
elseif (regexp(B{k}, '^[0-9]+\.[0-9]+') == 1)
disp('I''m the second case!');
end
end
As such, the code should print out I'm the first case! if it follows the format of the first case, and it should print I'm the second case! if it follows the format of the second case. As such, by running this code, we get:
I'm the first case!
I'm the second case!
I'm the first case!
I'm the second case!
Without knowing how your strings are formatted, I can't do the rest of it for you, but this should be a good start for you.
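For comparison, the check-then-convert logic can be sketched outside MATLAB as well. The Python version below is an illustration only: to_seconds is a hypothetical name, and unlike the patterns above it also anchors the end of the string with $, which is an added assumption about the data.

```python
import re

# Same two formats as above, with an added end-of-string anchor (an assumption).
MIN_SEC = re.compile(r"^([0-9]+):([0-9]+\.[0-9]+)$")  # e.g. 29:59.9
SEC_ONLY = re.compile(r"^[0-9]+\.[0-9]+$")            # e.g. 29.9


def to_seconds(s: str) -> float:
    """Convert 'MM:SS.f' or 'SS.f' to seconds (hypothetical helper)."""
    m = MIN_SEC.match(s)
    if m:
        # First case: minutes * 60 + seconds.
        return int(m.group(1)) * 60 + float(m.group(2))
    if SEC_ONLY.match(s):
        # Second case: already in seconds.
        return float(s)
    raise ValueError(f"unrecognized time format: {s!r}")


for value in ["29:59.9", "29.9", "45:56.8", "24.5"]:
    print(value, "->", to_seconds(value))
```

Checking only the first entry of a file and then converting the whole column, as the question suggests, would follow the same two branches.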

creating url with sprintf creates wrong url

I am trying to create a URL using sprintf. To open various websites, I change part of the URL using sprintf. Now the following code writes the URL several times instead of replacing part of it. Any suggestions? Many thanks!
current_stock = 'AAPL';
current_url = sprintf('http://www.finviz.com/quote.ashx?t=%d&ty=c&ta=0&p=d',current_stock)
web(current_url, '-browser')
%d should be the placeholder for AAPL. The result is:
http://www.finviz.com/quote.ashx?t=65&ty=c&ta=0&p=dhttp://www.finviz.com/quote.ashx?t=65&ty=c&ta=0&p=dhttp://www.finviz.com/quote.ashx?t=80&ty=c&ta=0&p=dhttp://www.finviz.com/quote.ashx?t=76&ty=c&ta=0&p=d
I'm not sure why you're using %d for a value that is clearly a string? You should be using %s.
The reason you're seeing what you're seeing is that it appears to be giving you a copy of your format string for each character in the AAPL string.
You can see that the differences lie solely in the ?t=XX bit, with XX being, in sequence, 65, 65, 80 and 76, the ASCII codes for the four letters in your string:
                                   vv
http://www.finviz.com/quote.ashx?t=65&ty=c&ta=0&p=d
http://www.finviz.com/quote.ashx?t=65&ty=c&ta=0&p=d
http://www.finviz.com/quote.ashx?t=80&ty=c&ta=0&p=d
http://www.finviz.com/quote.ashx?t=76&ty=c&ta=0&p=d
                                   ^^
Whether that's a feature or a bug in MATLAB (a), I couldn't say for sure, but I suspect it'll fix itself if you just use the correct format specifier.
(a) It's probably a feature since it does similarly intelligent stuff with other mismatches, as per here:
If you apply a string conversion (%s) to integer values, MATLAB converts values that correspond to valid character codes to characters. For example, '%s' converts [65 66 67] to ABC.
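The observed output can be reproduced by hand. The Python sketch below only emulates the behaviour described above (the format string being reused once per character code); it is an illustration, not how MATLAB implements sprintf.

```python
fmt = "http://www.finviz.com/quote.ashx?t=%d&ty=c&ta=0&p=d"
stock = "AAPL"

# Emulation of the symptom: the format is applied once per character,
# with each character passed as its numeric code (65, 65, 80, 76).
recycled = "".join(fmt % ord(ch) for ch in stock)

# With a string conversion (%s) the whole ticker is substituted once.
fixed = "http://www.finviz.com/quote.ashx?t=%s&ty=c&ta=0&p=d" % stock

print(recycled)
print(fixed)  # http://www.finviz.com/quote.ashx?t=AAPL&ty=c&ta=0&p=d
```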
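I would do it the easy way, with plain string concatenation:
current_stock = 'AAPL';
% Concatenate the ticker into the URL instead of using a format specifier
current_url = ['http://www.finviz.com/quote.ashx?t=', current_stock, '&ty=c&ta=0&p=d'];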
web(current_url,'-browser')
That redirected me to a valid webpage.

How to recognize string in Lex file

Hi, what would be an appropriate way to recognize a string in a lex file?
I have already tried:
import java_cup.runtime.*;
%%
%cup
%line
NUM = [0-9]
ID = [a-zA-Z]
Pun= [:=;##$^~]
WhiteSpace = [ \t\r\n\f]
SDQuo = [\"]
%%
({SDQuo}+) ({ID}|{NUM})* ({SDQuo}+) { return new Symbol(sym.STR, new String(yytext()));}
but the macro fails to be recognized.
The error message that I kept getting is:
Processing first section -- user code.
Processing second section -- JLex declarations.
Processing third section -- lexical rules.
Creating NFA machine representation.
Error: Parse error at line 39.
Description: Missing brace at start of lexical action.
Parse error.
Lose the = signs in the definitions of NUM etc. and don't place them between the %% markers. Instead, place the last rule between the %% markers.