Overloading "." produces errors - swift

I can't seem to overload "." and not sure if it's a compiler bug or something I'm doing:
@infix func . (a: Int, b: Int) -> Int {
return a * b
}
I get the errors:
Expected identifier in function declaration
Braced block of statements is an unused closure

You can't overload '.'; it's a reserved token in the language. You can, however, overload the .. and ... operators.
Operators are made up of one or more of the following characters: /, =, -, +, !, *, %, <, >, &, |, ^, ~, and .. That said, the tokens =, ->, //, /*, */, ., and the unary prefix operator & are reserved. These tokens can’t be overloaded, nor can they be used to define custom operators.
Language Reference

Swift does allow the definition and overload of custom operators, but it only allows certain characters to be considered operators.
Operators are made up of one or more of the following characters: /, =, -, +, !, *, %, <, >, &, |, ^, ~, and .. That said, the tokens =, ->, //, /*, */, ., and the unary prefix operator & are reserved. These tokens can’t be overloaded, nor can they be used to define custom operators.
It is therefore illegal to try to overload the period operator, though it can be used as part of another custom operator.
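As a minimal sketch of that last point, written in current Swift syntax rather than the early beta syntax used in the question: the operator name `.*` below is a hypothetical choice for illustration. In current Swift, an operator that contains a dot must also begin with one.

```swift
// `.*` is a hypothetical operator name picked for this example.
// A lone `.` is reserved, but a dot may start a longer operator.
infix operator .*: MultiplicationPrecedence

func .* (a: Int, b: Int) -> Int {
    return a * b
}

let product = 3 .* 4
print(product)  // 12
```

Whitespace on both sides of `.*` makes the compiler treat it as an infix operator rather than as member access.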

Related

Array not recognized by powershell parser when other operators are involved

Assigning an array looks like this:
PS> $x = "a", "b"
PS> $x
a
b
Now, I wanted to add a 'root string' ("r") to each element, so I did this (actually I used a variable, but for the sake of simplicity let's just use a string here):
PS> $x = "r" + "a" , "r" + "b"
PS> $x
ra rb
Looking at the output, I didn't get the array that I expected, but a single string containing a space (I checked: it's ASCII character 32, so a space, not a tab or another character).
That is: the comma seems to be interpreted as a string join operator, which I couldn't find any reference to.
Even worse, I get the feeling that I don't understand how the parser works here. I had a look at about_Parsing, but what I found doesn't seem to apply to this case.
Commas (,) introduce lists passed as arrays, except when the command
to be called is a native application, in which case they are
interpreted as part of the expandable string. Initial, consecutive or
trailing commas are not supported.
The first obvious fix that I came up with is the following:
PS> $x = ("r" + "a") , ("r" + "b")
PS> $x
ra
rb
Maybe there are others, and I am especially interested in the ones that reveal how the parser actually works. What I would most like to fix is my knowledge of the parsing rules.
To flesh out the helpful comments on the answer:
tl;dr
Due to operator precedence, your command is parsed as "r" + ("a" , "r") + "b", causing array "a", "r" to be implicitly stringified to verbatim a r, resulting in two string concatenation operations yielding a single string with verbatim content ra rb.
Using (...) is indeed the correct way to override operator precedence.
"r" + "a" , "r" + "b"
is an expression involving operators.
Expressions are parsed in expression mode, which contrasts with argument mode; the latter applies to commands, i.e. named units of functionality that are called with shell-typical syntax (whitespace-separated arguments, quotes around simple strings optional). Arguments (parameter values) in argument mode are parsed differently from operands in expression mode, as explained in the conceptual about_Parsing help topic. Your quote about , relates to argument mode, not expression mode.
The conceptual about_Operator_Precedence help topic describes the relative precedence among operators, from which you can glean that ,, the array constructor operator, has higher precedence than the + operator.
Therefore, your expression is parsed as follows (using (...), the grouping operator, to make the implicit rules explicit):
"r" + ("a" , "r") + "b"
+ is polymorphic in PowerShell, and with a [string] instance as the LHS the RHS is coerced to a string too.
Therefore, array "a" , "r" is stringified, which uses PowerShell's custom array stringification, namely joining the (potentially stringified) array elements with a space.[1]
That is, the array stringifies to a string with verbatim content a r.
As an aside: The same stringification is applied in the context of string interpolation via expandable (double-quoted) strings ("..."); that is, "$("a", "r")" also yields verbatim a r.
Therefore, the above is equivalent to:
"r" + "a r" + "b"
which yields verbatim ra rb.
(...) is indeed the appropriate way to ensure the desired precedence:
("r" + "a"), ("r" + "b") # -> array 'ra', 'rb'
[1] Space is the default separator character. Technically, you can override it via the $OFS preference variable, though that is rarely used in practice.
Another way to do it. The type of the first term controls what type of operation the plus performs. The first term here is an empty array. If you want the plus to do both kinds of operations, there's no getting around extra parentheses to change the operator precedence.
@() + 'ra' + 'rb'
ra
rb
Or more commonly:
'ra','rb' + 'rc'
ra
rb
rc

greek char functions (for fun)

Why can I write (in Swift)
func β(a: Double, b: Double) -> Double { exp( lgamma(a) + lgamma(b) - lgamma(a + b) ) }
or
func Γ(_ x: Double) -> Double { tgamma(x) }
but not
func √(_ x: Double) -> Double { return sqrt(x) }
See Identifiers in the Swift Language Reference:
Identifiers begin with an uppercase or lowercase letter A through Z, an underscore (_), a noncombining alphanumeric Unicode character in the Basic Multilingual Plane, or a character outside the Basic Multilingual Plane that isn’t in a Private Use Area. After the first character, digits and combining Unicode characters are also allowed.
β and Γ are each a "noncombining alphanumeric Unicode character in the Basic Multilingual Plane." √ is not (nor does it meet any of the other requirements).
That said, √ is a valid operator, so you can write:
prefix operator √
prefix func √(_ x: Double) -> Double { return sqrt(x) }
print(√2)
The basic rules for Operators (from the document above) are:
Custom operators can begin with one of the ASCII characters /, =, -, +, !, *, %, <, >, &, |, ^, ?, or ~, or one of the Unicode characters defined in the grammar below (which include characters from the Mathematical Operators, Miscellaneous Symbols, and Dingbats Unicode blocks, among others). After the first character, combining Unicode characters are also allowed.
√ is included in "Mathematical Operators."
The square root character appears to be a valid operation identifier in Swift.
Character: √; Unicode value: 221A; Unicode name: SQUARE ROOT
Have you checked the last method declaration without the return keyword?
func √(_ x: Double) -> Double { sqrt(x) }

Swift custom operators with Unicode combining characters

TL;DR
Can I coax the compiler to accept a combining character as a postfix operator?
The references at Swift.org and GitHub and this useful gist suggest that combining characters (e.g. U+0300 ff.) may serve as operators in Swift.
With judicious implementation (omitted here) I can say “Fiat Lux” and there is
prefix operator ‖ // Find the norm.
postfix operator ‖ // Does nothing.
func / // Scalar division.
which allows
let vHat = v / ‖v‖ // Readable as math.
or even
let v̂ = v / ‖v‖ // Loving it.
The OCD in me wants now to use the combining circumflex as a (topfix) operator like this:
let normalizedV = v̂ // Combining char is really a postfix.
So I leap in and try to write:
postfix operator ^ // Want this to be *combining* circumflex.
postfix func ^(v: Vector) -> Vector { v / ‖v‖ }
and can do it with plain old U+005E circumflex, but get (various) compiler errors when I try with the combining circumflex U+0302.
An operator name (or any other identifier) cannot start with the U+0302 character. Like all combining marks, it is an allowed “operator-character” but not an allowed “operator-head”. From Lexical Structure > Operators in “The Swift Programming Language”:
GRAMMAR OF OPERATORS
operator → operator-head operator-characters?
...
operator-character → U+0300–U+036F

What does the ?= operator do in Swift?

I just came across some code that looks like this:
var msg:String = "";
msg ?= err["ErrorMessage"].text;
The err variable is from SwiftyXMLParser from what I can see in the code. I'm at a loss about the meaning of the ?= (questionmark-equals) operator. I cannot find documentation about it. What is it doing?
This question touches on an interesting topic in the Swift language.
In other programming languages, the closest concept is operator overloading; in Swift terms, these are called custom operators. Swift has its own standard operators, but we can add additional operators too. Swift has 4 types of operators; the first 3 can be used for custom operators:
Infix: Used between two values, like the addition operator (e.g. 1 + 2)
Prefix: Added before a value, like the negative operator (e.g. -3).
Postfix: Added after a value, like the force-unwrap operator (e.g. objectNil!)
Ternary: Two symbols inserted between three values.
Custom operators can begin with one of the ASCII characters /, =, -, +, !, *, %, <, >, &, |, ^, ?, or ~, or one of the Unicode characters defined in the grammar.
New operators are declared at a global level using the operator keyword, and are marked with the prefix, infix or postfix modifiers.
Here is a sample in a playground (Swift 4):
infix operator ?=
func ?= (base: inout String, with: String) {
    base = base + " " + with
}
var str = "Stack"
str ?= "Overflow"
print(str)
Output:
Stack Overflow
See the "Advanced Operators" chapter in Apple's Swift documentation for more.
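As for the snippet in the question: the definition of `?=` used by that code base is not shown here, so what follows is only a guess at the kind of operator it might be. This sketch assumes a "assign only if the right-hand side is non-nil" meaning; the operator body and all variable names are hypothetical.

```swift
// Hypothetical definition: the real one in the questioner's code base may differ.
infix operator ?=: AssignmentPrecedence

// Assign `rhs` to `lhs` only when `rhs` actually holds a value.
func ?= <T>(lhs: inout T, rhs: T?) {
    if let value = rhs {
        lhs = value
    }
}

var msg = "unknown error"
let missingText: String? = nil
let presentText: String? = "Disk full"

msg ?= missingText   // nil on the right: msg keeps its old value
msg ?= presentText   // non-nil on the right: msg is overwritten
print(msg)           // Disk full
```

To find out what the operator really does in a given project, search the sources for `operator ?=` or jump to its definition in Xcode.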

How does ANTLR decide whether terminals should be separated by whitespace?

I'm writing a lexical analyzer in Swift for Swift. I used ANTLR's grammar, but I ran into a problem: I don't understand how ANTLR decides whether terminals should be separated by whitespace.
Here's the grammar: https://github.com/antlr/grammars-v4/blob/master/swift/Swift.g4
Consider casting in Swift. It works with optional types (Int?, String?) and with non-optional types (Int, String). Valid examples: "as? Int", "as Int", "as?Int". Invalid example: "asInt" (it isn't a cast). I implemented logic in which terminals in grammar rules can be separated by zero or more WS (whitespace) symbols. But with this logic, "asInt" matches a cast, because it contains "as" followed by the type "Int" with zero WS symbols between them. It should be invalid.
Swift grammar contains these rules:
DOT : '.' ;
LCURLY : '{' ;
LPAREN : '(' ;
LBRACK : '[' ;
RCURLY : '}' ;
RPAREN : ')' ;
RBRACK : ']' ;
COMMA : ',' ;
COLON : ':' ;
SEMI : ';' ;
LT : '<' ;
GT : '>' ;
UNDERSCORE : '_' ;
BANG : '!' ;
QUESTION: '?' ;
AT : '@' ;
AND : '&' ;
SUB : '-' ;
EQUAL : '=' ;
OR : '|' ;
DIV : '/' ;
ADD : '+' ;
MUL : '*' ;
MOD : '%' ;
CARET : '^' ;
TILDE : '~' ;
It seems that each of these terminals can be adjacent to other terminals with zero WS symbols in between, whereas other terminals cannot (e.g. "as" + Identifier).
Am I right? If so, the problem is solved. But there may be more complex logic.
Now if I have rules
WS : [ \n\r\t\u000B\u000C\u0000]+
a : 'str1' b
b : 'str2' c
c : '+' d
d : 'str3'
I use them as if they were these rules:
WS : [ \n\r\t\u000B\u000C\u0000]+
a : WS? 'str1' WS? 'str2' WS? '+' WS? 'str3' WS?
And I suppose that they should be like this (I don't know, and that is the question):
WS : [ \n\r\t\u000B\u000C\u0000]+
a: 'str1' WS 'str2' WS? '+' WS? 'str3'
(notice WS is not optional between 'str1' and 'str2')
So there are two questions:
Am I right?
What did I miss?
Thanks.
Here's the ANTLR WS rule in your Swift grammar:
WS : [ \n\r\t\u000B\u000C\u0000]+ -> channel(HIDDEN) ;
The -> channel(HIDDEN) instruction tells the lexer to put these tokens on a separate channel, so the parser won't see them at all. You shouldn't litter your grammar with WS rules - it'd become unreadable.
ANTLR works in two steps: you have the lexer and the parser. The lexer produces the tokens, and the parser tries to figure out a concrete syntax tree from these tokens and the grammar.
The lexer in ANTLR works like this:
Consume characters as long as they match any lexer rule.
If several rules match the text you've consumed, use the first one which appears in the grammar.
Literal strings in the grammar (like 'as') are turned into implicit lexer rules (equivalent to TOKEN_AS: 'as'; except the name will be just 'as'). These end up first in the lexer rules list.
Example 1
Let's see the consequences of these when lexing as?Int (with a space at the end):
a... potentially matches Identifier and 'as'
as... potentially matches Identifier and 'as'
as? does not match any lexer rule
Therefore, you consume as, which will become a token. Now you have to decide which will be the token type. Both Identifier and 'as' rules match. 'as' is an implicit lexer rule, and considered to appear first in the grammar, therefore it takes precedence. The lexer emits a token with text as of type 'as'.
Next token.
?... potentially matches the QUESTION rule
?I doesn't match any rule
Therefore, you consume ? from the input and emit a token of type QUESTION with text ?.
Next token.
I... potentially matches Identifier
In... potentially matches Identifier
Int... potentially matches Identifier
Int (followed by a space) does not match anything
Therefore, you consume Int from the input and emit a token of type Identifier with text Int.
Next token.
You have a space there, it matches the WS rule.
You consume that space, and emit a WS token on the HIDDEN channel. The parser won't see this.
Example 2
Now let's see how asInt is tokenized.
a... potentially matches Identifier and 'as'
as... potentially matches Identifier and 'as'
asI... potentially matches Identifier
asIn... potentially matches Identifier
asInt... potentially matches Identifier
asInt followed by a space doesn't match any lexer rule.
Therefore, you consume asInt from the input stream, and emit an Identifier token with text asInt.
The parser
The parser stage is only interested in the token types it gets. It does not care about what text they contain. Tokens outside the default channel are ignored, which means the following inputs:
as?Int - tokens: 'as' QUESTION Identifier
as? Int - tokens: 'as' QUESTION WS Identifier
as ? Int - tokens: 'as' WS QUESTION WS Identifier
Will all result in the parser seeing the following token types: 'as' QUESTION Identifier, as WS is on a separate channel.
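The walkthrough above can be sketched as a toy lexer in Swift. This is not ANTLR's implementation, only a hand-written approximation of the two rules described (longest match first, then the earliest rule on a tie), restricted to the four token types used in the examples. The names `Kind`, `Token`, `tokenize` and `visibleKinds` are made up for this illustration.

```swift
// Token types for the toy lexer; the case names are hypothetical.
enum Kind: Equatable {
    case kwAs        // the implicit 'as' rule
    case question    // QUESTION
    case identifier  // Identifier
    case ws          // WS, which goes to the hidden channel
}

struct Token: Equatable {
    let kind: Kind
    let text: String
}

// Consume the longest run of characters that some rule matches.
// When both 'as' and Identifier match the same text, 'as' wins
// because implicit literal rules come first.
func tokenize(_ input: String) -> [Token] {
    var tokens: [Token] = []
    let chars = Array(input)
    var i = 0
    while i < chars.count {
        let c = chars[i]
        var j = i + 1
        if c.isLetter || c == "_" {
            // Identifier: keep going while letters, digits or underscores follow.
            while j < chars.count, chars[j].isLetter || chars[j].isNumber || chars[j] == "_" {
                j += 1
            }
            let text = String(chars[i..<j])
            tokens.append(Token(kind: text == "as" ? .kwAs : .identifier, text: text))
        } else if c == "?" {
            tokens.append(Token(kind: .question, text: "?"))
        } else if c.isWhitespace {
            while j < chars.count, chars[j].isWhitespace {
                j += 1
            }
            tokens.append(Token(kind: .ws, text: String(chars[i..<j])))
        }
        // Any other character is silently skipped in this toy version.
        i = j
    }
    return tokens
}

// What the parser sees: WS tokens are on the hidden channel, so drop them.
func visibleKinds(_ input: String) -> [Kind] {
    return tokenize(input).filter { $0.kind != .ws }.map { $0.kind }
}

print(visibleKinds("as?Int") == visibleKinds("as ? Int"))  // true
print(visibleKinds("asInt") == [.identifier])               // true
```

Because the lexer has already glued `asInt` into a single Identifier token, the parser never gets a chance to match it as a cast, and no explicit WS handling is needed in the parser rules.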