The difference seems to very different between the two compared to other languages like C or C++. Things that could be statements in C seem to be expressions in Scala. How should I differentiate them, and any reason why it’s like that?
According to Scala Language Specification, it does have statements:
Statements occur as parts of blocks and templates. A statement can be an import, a definition or an expression, or it can be empty. Statements used in the template of a class definition can also be declarations.
C++ also has expression statements and declaration statements; the main difference is that it also has additional types of statements, and their equivalents are generally expressions in Scala, so they can be used either as expression statements or as part of larger expressions.
Related
The difference between the two is not so clear from the Cadence documentation.
Could someone please elaborate on the difference between the two?
A define as macro is just a plain old macro that you probably know from other programming languages. It just means that at some select locations in the macro code you can substitute your own code.
A define as computed macro allows you to construct your output code programmatically, by using control flow statements (if, for, etc.). It acts kind of like a function that returns a string, with the return value being the code that will be inserted in its place by the pre-processor.
With both define as and define as computed macros you define a new syntactic construct of a given syntactic category (for example, <statement> or <action>), and you implement the replacement code that replaces a construct matching the macro match expression (or pattern).
In both cases the macro match expression can have syntactic arguments that are used inside the replacement code and are substituted with the actual code strings used in the matched code.
The difference is that with a define as macro the replacement code is just written in the macro body.
With a define as computed macro you write a procedural code that computes the desired replacement code text and returns it as a string. It's effectively a method that returns string, you can even use the result keyword to assign the resulting string, just like in any e method.
A define as computed macro is useful when the replacement code is not fixed, and can be different depending on the exact macro argument values or even semantic context (for example, in some cases a reflection query can be used to decide on the exact replacement code).
(But it's important to remember that even define as computed macros are executed during compilation and not at run time, so they cannot query actual run time values of fields or variables to decide on the resulting replacement code).
Here are some important differences between the two macro kinds.
A define as macro is more readable and usually easier to write. You just write down the code that you want to be created.
Define as computed macros are stronger. Everything that can be implemented with define as, can also be implemented with define as computed, but not vice versa. When the replacement code is not fixed, define as is not sufficient.
A define as macro can be used immediately after its definition. If the construct it introduces is used in the statement just following the macro, it will already be matched. A define as computed macro can only be used in the next file, and is not usable in the same file in which the macro is defined.
Given that the official Scala style guide
http://docs.scala-lang.org/style/method-invocation.html
recommends using infix notation for arity-1 calls and dot notation for arity-0 calls, what's the recommended style for an arity-1 call followed by an arity-0 call?
For example, would this be the recommended way?
(bytes map (_.toChar)).mkString
The summary of the style guide is basically: use point-free style whenever it is simple and clear. Otherwise, don't. In this case, your options are
bytes.map(_.toChar).mkString
(bytes map (_.toChar)).mkString
and the former looks simpler and clearer to me, so it wins.
Really long chains are also not very clear in point-free notation
foo bar baz qux quux bippy
Say what again?
foo.bar(baz).qux(quux).bippy
Oh, okay.
Be pragmatic. Point-free style can be clean and elegant. Good coding style will often lead you to write code that looks nice point-free. But the point of the syntax is to be clear and avoid errors, so use whichever better accomplishes the goal.
I don't know who wrote this guide, but he definitely seemed biased, and I would advice against following this guide. Infix notation has a lot of pitfalls, which the author doesn't mention, and the benefits of it are questionable at very least. The arguments the author uses are not any less questionable.
The author argues that the following code makes it look like a method filter is being called on a function (_.toUpperCase):
names.map (_.toUpperCase).filter (_.length > 5)
But nobody formats the code like that. The following are the standard practices and they both introduce no ambiguity of such:
names.map(_.toUpperCase).filter(_.length > 5)
names.map( _.toUpperCase ).filter( _.length > 5 )
Pitfalls of the infix notation
Parameterless methods cannot be used inside a method calls chain.
Parameterless methods cannot be used in the end of a method calls chain, because they grab the terms from the next line.
It does not allow splitting to multiple lines. You'll either have to introduce some lispy bracketing or awkardly place the parameters on the next line. Another option is to switch back to the "dot" notation and end up with an inconsistent style.
All those options can hardly be referred to as increasing the readability.
It breeds misunderstandings like that.
Finally, it adds a layer of obfuscation of your intent. I.e., a reader has to analyse how the compiler will infer the dots and braces prior to actually comprehending the code.
Conclusion
The only argument for this notation that I've ever met is that it increases readability. While this argument is questionable itself, I find that it can hardly stand against any of the aforementioned drawbacks of this notation, due to which it often even decreases the readability.
The most consistent and safe standard is to use infix notation only for operators, i.e., methods with names like +, *, >>=.
According to this answer => in Scala is a keyword which has two different meanings: 1 to denote a function type: Double => Double and 2 to create a lambda expression: (x: Double): Double => 2*x.
How does this relate to formal grammars, i.e. does this make Scala context sensitive?
I know that most languages are not context free, but I'm not sure whether the situation I'm describing has anything to do with that.
Edit:
Seems like I don't understand context sensitive grammars well enough. I know how the production rules are supposed to look, and what they mean ("this production applies only if A is surrounded by these symbols"), but I'm just not sure how they relate to actual (programming) languages.
I think my confusion stems from reading something like "Chomsky introduced this term because a word's meaning can depend on its context", and I connected => with the term "word" in the quote, and those two uses of it being two separate contexts.
It be great if an answer would address my confusion.
It's been a while since I've handled formal language theory, but I'll bite.
"Context-free" means that the production rules required in the corresponding grammar do not have a "context". It does not mean that a specific symbol cannot appear in different rules.
Addressing the edit: in other words (and more informally), deciding whether a language is context-free or context-sensitive boils down not to looking at the "meaning" of a specific "word" or "words". Instead, it amounts to looking at the set of all legal expressions in that language, and seeing whether you can "encode" them only by taking into account the positional relationships the component "words" have with one another. This is essentially what the Pumping Lemma checks.
For example:
S → Type"="Body
Type → "Double"
Type → "Double""=>""Double"
Body → Lambda
Body → NormalBody
NormalBody → "x"
Lambda -> "x""=>"NormalBody
Where S is of course the start symbol, uppercased IDs are nonterminals, and quoted strings are terminals. Obviously, this can generate a string like:
Double=>Double=x=>x
but the grammar is still context-free.
So just this, as in the observation that the nonterminal "=>" can appear in two "places" of the program, does not make Scala context-sensitive.
However, it does not mean that:
the entire Scala language is context-free,
it is context-sensitive - it can be even more complex,
if you would like to encode the semantics of Scala into a grammar, you would end up with either a context-free or a context-sensitive one.
The last thing is especially relevant since you've mentioned "meaning" in the (nomen omen) context of formal languages.
As far as I know, the pattern&template-based Scheme macro system works by first pattern matching a macro invocation, obtaining a substitution in case of success, applying the resulted substitution to the corresponding template to build up a (maybe) partially-expanded expression, and then continuously expanding the resulted expression. If what I describe is true (please correct me otherwise), then it seems to me this building-up and expanding-again model is not efficient. Why does the expansion need to be done like this? Is it possible to finish the expansion by a run down the template once and for all?
Keep in mind that macros can be handled define time for procedures and compile time for a whole program.
Also, a macro expansion might turn into another (or similar) macro form that needs expanding. Eg. you can make a macro that ends up as a cond expression which of course is a macro for nested if expressions in most Schemes.
Have you seen Alexpander? It evaluates a program (in one expression) and returns an equal program without macros.
The semantics of the macro system are specified in the way you describe. However, implementations are free to implement that specification any way they want; in particular, they could "inline" macro expansions ahead of time to speed the process of macro expansion.
I'm not aware of any Scheme implementations that do what you describe, and I would guess it's because macro expansion is not usually a big bottleneck in compilation.
I've heard that Lisp lets you redefine the language itself, and I have tried to research it, but there is no clear explanation anywhere. Does anyone have a simple example?
Lisp users refer to Lisp as the programmable programming language. It is used for symbolic computing - computing with symbols.
Macros are only one way to exploit the symbolic computing paradigm. The broader vision is that Lisp provides easy ways to describe symbolic expressions: mathematical terms, logic expressions, iteration statements, rules, constraint descriptions and more. Macros (transformations of Lisp source forms) are just one application of symbolic computing.
There are certain aspects to that: If you ask about 'redefining' the language, then redefine strictly would mean redefine some existing language mechanism (syntax, semantics, pragmatics). But there is also extension, embedding, removing of language features.
In the Lisp tradition there have been many attempts to provide these features. A Lisp dialect and a certain implementation may offer only a subset of them.
A few ways to redefine/change/extend functionality as provided by major Common Lisp implementations:
s-expression syntax. The syntax of s-expressions is not fixed. The reader (the function READ) uses so-called read tables to specify functions that will be executed when a character is read. One can modify and create read tables. This allows you for example to change the syntax of lists, symbols or other data objects. One can also introduce new syntax for new or existing data types (like hash-tables). It is also possible to replace the s-expression syntax completely and use a different parsing mechanism. If the new parser returns Lisp forms, there is no change needed for the Interpreter or Compiler. A typical example is a read macro that can read infix expressions. Within such a read macro, infix expressions and precedence rules for operators are being used. Read macros are different from ordinary macros: read macros work on the character level of the Lisp data syntax.
replacing functions. The top-level functions are bound to symbols. The user can change the this binding. Most implementations have a mechanism to allow this even for many built-in functions. If you want to provide an alternative to the built-in function ROOM, you could replace its definition. Some implementations will raise an error and then offer the option to continue with the change. Sometimes it is needed to unlock a package. This means that functions in general can be replaced with new definitions. There are limitations to that. One is that the compiler may inline functions in code. To see an effect then one needs to recompile the code that uses the changed code.
advising functions. Often one wants to add some behavior to functions. This is called 'advising' in the Lisp world. Many Common Lisp implementations will provide such a facility.
custom packages. Packages group the symbols in name spaces. The COMMON-LISP package is the home of all symbols that are part of the ANSI Common Lisp standard. The programmer can create new packages and import existing symbols. So you could use in your programs an EXTENDED-COMMON-LISP package that provides more or different facilities. Just by adding (IN-PACKAGE "EXTENDED-COMMON-LISP") you can start to develop using your own extended version of Common Lisp. Depending on the used namespace, the Lisp dialect you use may look slighty or even radically different. In Genera on the Lisp Machine there are several Lisp dialects side by side this way: ZetaLisp, CLtL1, ANSI Common Lisp and Symbolics Common Lisp.
CLOS and dynamic objects. The Common Lisp Object System comes with change built-in. The Meta-Object Protocol extends these capabilities. CLOS itself can be extended/redefined in CLOS. You want different inheritance. Write a method. You want different ways to store instances. Write a method. Slots should have more information. Provide a class for that. CLOS itself is designed such that it is able to implement a whole 'region' of different object-oriented programming languages. Typical examples are adding things like prototypes, integration with foreign object systems (like Objective C), adding persistance, ...
Lisp forms. The interpretation of Lisp forms can be redefined with macros. A macro can parse the source code it encloses and change it. There are various ways to control the transformation process. Complex macros use a code walker, which understands the syntax of Lisp forms and can apply transformations. Macros can be trivial, but can also get very complex like the LOOP or ITERATE macros. Other typical examples are macros for embedded SQL and embedded HTML generation. Macros can also used to move computation to compile time. Since the compiler is itself a Lisp program, arbitrary computation can be done during compilation. For example a Lisp macro could compute an optimized version of a formula if certain parameters are known during compilation.
Symbols. Common Lisp provides symbol macros. Symbol macros allow to change the meaning of symbols in source code. A typical example is this: (with-slots (foo) bar (+ foo 17)) Here the symbol FOO in the source enclosed with WITH-SLOTS will be replaced with a call (slot-value bar 'foo).
optimizations, with so-called compiler macros one can provide more efficient versions of some functionality. The compiler will use those compiler macros. This is an effective way for the user to program optimizations.
Condition Handling - handle conditions that result from using the programming language in a certain way. Common Lisp provides an advanced way to handle errors. The condition system can also be used to redefine language features. For example one could handle undefined function errors with a self-written autoload mechanism. Instead of landing in the debugger when an undefined function is seen by Lisp, the error handler could try to autoload the function and retry the operation after loading the necessary code.
Special variables - inject variable bindings into existing code. Many Lisp dialects, like Common Lisp, provide special/dynamic variables. Their value is looked up at runtime on the stack. This allows enclosing code to add variable bindings that influence existing code without changing it. A typical example is a variable like *standard-output*. One can rebind the variable and all output using this variable during the dynamic scope of the new binding will go to a new direction. Richard Stallman argued that this was very important for him that it was made default in Emacs Lisp (even though Stallman knew about lexical binding in Scheme and Common Lisp).
Lisp has these and more facilities, because it has been used to implement a lot of different languages and programming paradigms. A typical example is an embedded implementation of a logic language, say, Prolog. Lisp allows to describe Prolog terms with s-expressions and with a special compiler, the Prolog terms can be compiled to Lisp code. Sometimes the usual Prolog syntax is needed, then a parser will parse the typical Prolog terms into Lisp forms, which then will be compiled. Other examples for embedded languages are rule-based languages, mathematical expressions, SQL terms, inline Lisp assembler, HTML, XML and many more.
I'm going to pipe in that Scheme is different from Common Lisp when it comes to defining new syntax. It allows you to define templates using define-syntax which get applied to your source code wherever they are used. They look just like functions, only they run at compile time and transform the AST.
Here's an example of how let can be defined in terms of lambda. The line with let is the pattern to be matched, and the line with lambda is the resulting code template.
(define-syntax let
(syntax-rules ()
[(let ([var expr] ...) body1 body2 ...)
((lambda (var ...) body1 body2 ...) expr ...)]))
Note that this is NOTHING like textual substitution. You can actually redefine lambda and the above definition for let will still work, because it is using the definition of lambda in the environment where let was defined. Basically, it's powerful like macros but clean like functions.
Macros are the usual reason for saying this. The idea is that because code is just a data structure (a tree, more or less), you can write programs to generate this data structure. Everything you know about writing programs that generate and manipulate data structures, therefore, adds to your ability to code expressively.
Macros aren't quite a complete redefinition of the language, at least as far as I know (I'm actually a Schemer; I could be wrong), because there is a restriction. A macro can only take a single subtree of your code, and generate a single subtree to replace it. Therefore you can't write whole-program-transforming macros, as cool as that would be.
However, macros as they stand can still do a whole lot of stuff - definitely more than any other language will let you do. And if you're using static compilation, it wouldn't be hard at all to do a whole-program transformation, so the restriction is less of a big deal then.
A reference to 'structure and interpretation of computer programs' chapter 4-5 is what I was missing from the answers (link).
These chapters guide you in building a Lisp evaluator in Lisp. I like the read because not only does it show how to redefine Lisp in a new evaluator, but also let you learn about the specifications of Lisp programming language.
This answer is specifically concerning Common Lisp (CL hereafter), although parts of the answer may be applicable to other languages in the lisp family.
Since CL uses S-expressions and (mostly) looks like a sequence of function applications, there's no obvious difference between built-ins and user code. The main difference is that "things the language provides" is available in a specific package within the coding environment.
With a bit of care, it is not hard to code replacements and use those instead.
Now, the "normal" reader (the part that reads source code and turns it into internal notation) expects the source code to be in a rather specific format (parenthesised S-expressions) but as the reader is driven by something called "read-tables" and these can be created and modified by the developer, it is also possible to change how the source code is supposed to look.
These two things should at least provide some rationale as to why Common Lisp can be considered a re-programmable programming language. I don't have a simple example at hand, but I do have a partial implementation of a translation of Common Lisp to Swedish (created for April 1st, a few years back).
From the outside, looking in...
I always thought it was because Lisp provided, at its core, such basic, atomic logical operators that any logical process can be built (and has been built and provided as toolsets and add-ins) from the basic components.
It is not so much that it can redefine itself as that its basic definition is so malleable that it can take any form and that no form is assumed/presumed into the structure.
As a metaphor, if you only have organic compounds you do organic chemistry, if you only have metal oxides you do metallurgy but if you have only elements you can do everything but you have extra initial steps to complete....most of which others have already done for you....
I think.....
Cool example at http://www.cs.colorado.edu/~ralex/papers/PDF/X-expressions.pdf
reader macros define X-expressions to coexist with S-expressions, e.g.,
? (cx <circle cx="62" cy="135" r="20"/>)
62
plain vanilla Common Lisp at http://www.AgentSheets.com/lisp/XMLisp/XMLisp.lisp
...
(eval-when (:compile-toplevel :load-toplevel :execute)
(when (and (not (boundp '*Non-XMLISP-Readtable*)) (get-macro-character #\<))
(warn "~%XMLisp: The current *readtable* already contains a #/< reader function: ~A" (get-macro-character #\<))))
... of course the XML parser is not so simple but hooking it into the lisp reader is.