Why does the predicate naming convention differ between functions? - lisp

I am a Common Lisp newbie who's beginning to learn the language. My question is: I've seen several functions follow the predicate naming convention in different ways. For example (I just looked this up), there are streamp and pathnamep, but there are also input-stream-p and output-stream-p. Why aren't all predicates standardized to either *p or *-p? Is this just a historical artifact?

The rule is that a one-word predicate name ends in just P, while a multi-word predicate name ends in -P; hence streamp but input-stream-p. The reason for the former is mostly historical; the reason for the latter is that a name like input-streamp would wrongly suggest that the question being asked is “is this input a stream?” or something like that.

Related

How to do higher-order term rewriting in Coq?

This question is based on my question https://cs.stackexchange.com/questions/96533/how-to-transform-lambda-function-to-multi-argument-lambda-function-and-how-to-re There are two functions and two terms in that question:
Functions:
is: (e->t)->(e->t)
IS: e->(e->t)->t
Terms:
(is(boss))(John): t
IS(John, boss): t
My question is this: how can terms involving is be rewritten into terms that use only IS? Does Coq (or some third-party tool) have such rewriting facilities? Does Coq have facilities to check the equality of the rewritten terms?
Maybe such rewriting can be done outside the Coq world; maybe there are other purely lambda-calculus tools that offer syntactic manipulation only?
There is no tool that directly performs the kind of textual transformation of Coq code you are describing. Without knowing much about GrammaticalFramework, I imagine that your best bet would be to write a Sed script that looks for occurrences of is applied to arguments and replaces those occurrences with equivalent expressions using IS.
The second “IS” form can be more easily converted to an is-boss predicate; that is why I am striving to arrive at it.
I think that if you used a Sed script you could just as easily go to the IS_BOSS form directly, without using IS.
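For illustration, a purely textual rewrite along these lines might look as follows. This is only a sketch (using Racket's regexp facilities rather than Sed), and the pattern assumes the terms appear literally in the shape (is(P))(x), with single-word arguments and no nesting:

#lang racket
;; Rewrite every textual occurrence of (is(P))(x) into IS(x, P).
;; The regular expression only handles single-word arguments.
(define (rewrite-is term)
  (regexp-replace* #px"\\(is\\((\\w+)\\)\\)\\((\\w+)\\)"
                   term
                   "IS(\\2, \\1)"))

(rewrite-is "(is(boss))(John)") ; => "IS(John, boss)"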

Implementing language translators in racket

I am implementing an interpreter in Racket that generates code in another language. As a novice I'm trying to avoid macros to the extent that I can ;) Hence I came up with the following "interpreter":
(define op (open-output-bytes))

(define (interpret arg)
  (define r
    (syntax-case arg (if)
      [(if a b) #'(fprintf op "if (~a) {~a}" a b)]))
  ; other cases here
  (eval r))
This looks a bit clumsy to me. Is there a "best practice" for doing this? Am I doing a totally crazy thing here?
Short answer: yes, this is a reasonable thing to do. The way in which you do it is going to depend a lot on the specifics of your situation, though.
You're absolutely right to observe that generating programs as strings is an error-prone and fragile way to do it. Avoiding this, though, requires being able to express the target language at a higher level, in order to circumvent that language's parser.
Again, it really has a lot to do with the language that you're targeting, and how good a job you want to do. I've hacked together things like this for generating Python myself, in a situation where I knew I didn't have time to do things right.
EDIT: oh, you're doing Python too? Bleah! :)
You have a number of different choices. Your cleanest choice is to generate a representation of Python AST nodes, so you can either inject them directly or use existing serialization. You're going to ask me whether there are libraries for this, and ... I fergits. I do believe that the current Python architecture includes ... okay, yes, I went and looked, and you're in good shape: Python's ast module parses source into ASTs, and it looks like AST nodes can also be constructed directly.
https://docs.python.org/3/library/ast.html#module-ast
I'm guessing your cleanest path would be to generate JSON that represents these AST nodes, then write a Python stub that translates the JSON into Python ASTs.
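To make that concrete, here is a minimal sketch of emitting such a JSON payload from Racket. The node layout shown ("type", "test", "body") is purely hypothetical; the real field names would have to mirror whatever the Python stub expects from the ast module.

#lang racket
(require json)

;; Hypothetical AST-node encoding; the field names are made up for
;; illustration and would need to match the Python side.
(define node
  (hasheq 'type "If"
          'test (hasheq 'type "Name" 'id "x")
          'body (list (hasheq 'type "Pass"))))

;; Prints something like:
;; {"type":"If","test":{"type":"Name","id":"x"},"body":[{"type":"Pass"}]}
(displayln (jsexpr->string node))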
All of this assumes that you want to take the high road; there's a broad spectrum of in-between approaches involving simple generalizations of Python syntax (e.g.: oh, it looks like this kind of statement has a colon followed by an indented block of code, etc.).
If your source language shares syntax with Racket, then use read-syntax to produce a syntax object representing the input program. Then walk it by recursive descent, using syntax-case or syntax-parse to distinguish the various constructs.
Instead of printing directly to an output port, I recommend building a tree of elements (strings, numbers, symbols, etc.); the last step is then to print all the elements of the tree. Representing the output as a tree is very flexible: it allows you to handle subexpressions out of order, and it lets you efficiently concatenate output from different sources. A sketch of this approach follows below.
Macros are not needed.
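A minimal sketch of the tree-building approach, assuming a Racket-like source language and a C-like target (the emit and tree->string names, and the handful of cases covered, are illustrative only):

#lang racket
;; Recursive descent over a syntax object, producing a tree of strings
;; instead of printing directly to a port.
(define (emit stx)
  (syntax-case stx (if)
    [(if a b) (list "if (" (emit #'a) ") {" (emit #'b) "}")]
    [(f arg ...)
     (list (emit #'f)
           "(" (add-between (map emit (syntax->list #'(arg ...))) ", ") ")")]
    [id (identifier? #'id) (symbol->string (syntax-e #'id))]
    [datum (format "~a" (syntax->datum #'datum))]))

;; The last step: flatten the tree and print it in one go.
(define (tree->string tree)
  (apply string-append (flatten tree)))

(tree->string (emit #'(if (positive? x) (displayln x))))
; => "if (positive?(x)) {displayln(x)}"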

When a keyword means different things in different contexts, is that an example of context sensitivity?

According to this answer, => in Scala is a keyword which has two different meanings: (1) to denote a function type, e.g. Double => Double, and (2) to create a lambda expression, e.g. (x: Double): Double => 2*x.
How does this relate to formal grammars, i.e. does this make Scala context sensitive?
I know that most languages are not context-free, but I'm not sure whether the situation I'm describing has anything to do with that.
Edit:
It seems I don't understand context-sensitive grammars well enough. I know how the production rules are supposed to look, and what they mean ("this production applies only if A is surrounded by these symbols"), but I'm just not sure how they relate to actual (programming) languages.
I think my confusion stems from reading something like "Chomsky introduced this term because a word's meaning can depend on its context": I connected => with the term "word" in that quote, and took its two uses to be two separate "contexts".
It would be great if an answer addressed this confusion.
It's been a while since I've handled formal language theory, but I'll bite.
"Context-free" means that the production rules required in the corresponding grammar do not have a "context". It does not mean that a specific symbol cannot appear in different rules.
Addressing the edit: in other words (and more informally), deciding whether a language is context-free or context-sensitive does not boil down to looking at the "meaning" of a specific "word" or "words". Instead, it amounts to looking at the set of all legal expressions in that language, and seeing whether you can "encode" them only by taking into account the positional relationships the component "words" have with one another. This is essentially the kind of structural property the Pumping Lemma tests.
For example:
S → Type"="Body
Type → "Double"
Type → "Double""=>""Double"
Body → Lambda
Body → NormalBody
NormalBody → "x"
Lambda → "x""=>"NormalBody
Where S is of course the start symbol, uppercased IDs are nonterminals, and quoted strings are terminals. Obviously, this can generate a string like:
Double=>Double=x=>x
but the grammar is still context-free.
So just this, as in the observation that the terminal "=>" can appear in two "places" of the program, does not make Scala context-sensitive.
However, this alone does not mean that:
the entire Scala language is context-free,
or that it is context-sensitive (it could be even more complex),
or that, if you wanted to encode the semantics of Scala into a grammar, you would end up with either a context-free or a context-sensitive one.
The last thing is especially relevant since you've mentioned "meaning" in the (nomen omen) context of formal languages.

Perl module for text comparison

Can anyone suggest a Perl module which can compare two strings and return a degree to which they match? I searched CPAN extensively, and although there are similar modules like String::Approx and Data::Compare, they are not what I am looking for. Suppose I have two strings: "I love you" and "I boht you". I want functionality that compares these two strings, taking numerous parameters into account: the matching of words in the correct order (love as the first word of one string should not "match" love as the 4th word of the other, even though both strings contain that word), words that do not match but are spelt almost identically (say love and loge), the number of words, and so on. It should then return an index, say a number from 0 to 1, representing the degree of similarity between the two strings. Is there any such Perl module?
There are many such modules. Often, though, you'll have to make use of them in some special way to account for your own assumptions. Most of the string comparison tools like this just implement some algorithm for comparing one string to another. Most assume that if you have specific policy decisions to make, you'll code them yourself.
Personally, I am not sure I'd recommend Text::Levenshtein because of bugs and a lack of UTF-8 support. I don't have a better recommendation either, though.
However, these searches will reveal lots of potential modules you could look into and determine what works best for your purpose (based on the names of common algorithms for doing this sort of thing):
https://metacpan.org/search?q=levenshtein
https://metacpan.org/search?q=wagner+fischer
https://metacpan.org/search?q=edit+distance
If you're interested in spoken similarities, you can also look into phonetic comparisons:
https://metacpan.org/search?q=phonetic
https://metacpan.org/search?q=soundex
https://metacpan.org/search?q=metaphone
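Whichever module you settle on, the 0-to-1 index usually comes from normalizing an edit distance by the length of the longer string. A minimal sketch of that arithmetic (in Racket purely for illustration; one of the edit-distance modules above would supply the distance itself):

#lang racket
;; Classic dynamic-programming edit distance, kept one row at a time.
(define (levenshtein a b)
  (define n (string-length b))
  (define d (build-vector (add1 n) values)) ; row 0: 0, 1, ..., n
  (for ([i (in-range 1 (add1 (string-length a)))])
    (define prev (vector-copy d))
    (vector-set! d 0 i)
    (for ([j (in-range 1 (add1 n))])
      (define cost (if (char=? (string-ref a (sub1 i))
                               (string-ref b (sub1 j)))
                       0 1))
      (vector-set! d j (min (add1 (vector-ref d (sub1 j)))          ; insertion
                            (add1 (vector-ref prev j))              ; deletion
                            (+ cost (vector-ref prev (sub1 j))))))) ; substitution
  (vector-ref d n))

;; Similarity index between 0 and 1: distance normalized by the
;; longer string's length.
(define (similarity a b)
  (- 1 (/ (levenshtein a b)
          (max 1 (string-length a) (string-length b)))))

(similarity "I love you" "I boht you") ; => 7/10, i.e. 0.7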

How can ported code be detected?

If you port code over from one language to another, how can this be detected?
Say you were porting code from c++ to Java, how could you tell?
What would be the difference between a program designed and implemented in Java, and a near identical program ported over to Java?
If the porting is done properly (by people expert in both languages and ready to translate the source language's idioms into the best similar idioms of the target language), there's no way you can tell that any porting has taken place.
If the porting is done incompetently, you can sometimes recognize goofily-transliterated idioms... but that can be hard to distinguish from someone writing a new program in a language they know little, goofily transliterating the idioms from the language they do know ;-).
Depending on how much effort went into hiding the porting, detection could be anywhere from very easy to impossible.
I would use pattern recognition for this task. Think about the "features" which would indicate code similarities, extract these features from each code base, and compare them.
E.g., one feature could be similar symbol names: extract all symbols using ctags or regular expressions, lower-case them, sort and deduplicate (uniq) both lists, and compare them. (A sketch of this follows after the list.)
Two more possible features:
A list of classes with their member counts, e.g.:
MyClass1 10
...
A list of methods with their sequences of control blocks, e.g.:
doSth() if, while, if, if, case
...
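A toy sketch of the first feature, the symbol-name comparison (in Racket for illustration; the regular expression is a crude stand-in for ctags, and the file names are placeholders):

#lang racket
;; Pull identifier-like tokens out of a source file, lower-cased and
;; deduplicated into a set. A real tool such as ctags would be more robust.
(define (symbols-of path)
  (for/set ([m (regexp-match* #px"[A-Za-z_][A-Za-z0-9_]*"
                              (file->string path))])
    (string-downcase m)))

;; Jaccard similarity of the two symbol sets: |A ∩ B| / |A ∪ B|,
;; between 0 (nothing shared) and 1 (identical vocabularies).
(define (symbol-similarity path-a path-b)
  (define a (symbols-of path-a))
  (define b (symbols-of path-b))
  (/ (set-count (set-intersect a b))
     (max 1 (set-count (set-union a b)))))

(symbol-similarity "original.cpp" "ported.java")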
Another easy way is to represent the code as a picture: e.g., load the code as text in Word and set the font size to 1. Human beings are very good at comparing pictures. For more ideas on code visualization, you may check http://www.se-radio.net/2009/03/episode-130-code-visualization-with-michele-lanza/