Compare models for identity, but with variables? Construct with minus? - scala

My team is implementing a variation of Ceusters's Referent Tracking. In our implementation, the original URI for an entity can be changed (to something containing a UUID), although a link to the original URI is always kept.
For example:
:Joey rdf:type :person .
:New_York_City rdf:type :locality .
:Joey :hometown :New_York_City .
might become:
:Joey :replacedWith :ABC123 .
:ABC123 rdf:type :person .
:New_York_City :replacedWith :FFF555 .
:FFF555 rdf:type :locality .
:ABC123 :hometown :FFF555 .
I am writing some Scala integration tests to see if our software does the referent tracking correctly.
Specifically, I know I should expect this CorrectPattern:
:Joey :replacedWith ?person .
?person rdf:type :person .
:New_York_City :replacedWith ?locale .
?locale rdf:type :locality .
?person :hometown ?locale .
But I don't know what the values of ?person and ?locale will be.
I can SPARQL ASK for CorrectPattern... that will tell me if the pattern is present. But I also want to confirm that nothing else has been added.
I thought I could CONSTRUCT { ?s ?p ?o }, MINUS out CorrectPattern, and check for an empty result, but Blazegraph is saying:
java.util.concurrent.ExecutionException: org.openrdf.query.MalformedQueryException: CONSTRUCT WHERE only permits statement patterns in the WHERE clause.
Any ideas? I want to check that the whole triple store contains nothing more and nothing less than CorrectPattern, but I think CorrectPattern must contain variables.

Once again bailed out by a comment from #AKSW, who doesn't seem to be especially obsessed with earning reputation points.
This CONSTRUCT, with an embedded SELECT, gets all triples from my model, depleted of any triples in the MINUS block, even when they contain variables. I'm pretty sure I can flesh out the MINUS block and finish my task.
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
CONSTRUCT
{
?s ?p ?o .
}
WHERE
{ SELECT ?s ?p ?o
WHERE
{ { ?s ?p ?o }
MINUS
{ ?s rdf:type ?o }
}
}

In order to determine that the triple-store contains only the triples which you expect and nothing more, you could test for the exact triples which you expect while simultaneously counting the number of triples which appear in the database before and after you run your program. If the difference in triples is higher or lower than you expect, you will know that you have some extraneous or missing data.
Use SPARQL SELECT Count function:
SELECT (count(?triples) as ?count)
WHERE {
?triples ?p ?o .}
Pseudocode:
val beforeTripleCount = countTriplesInDatabase()
//run your program, test for expected triples
val afterTripleCount = countTriplesInDatabase()
val diff = afterTripleCount - beforeTripleCount
//diff should be equal to the number of triples you expect to be added/removed
Additionally, if you are resistant to using variables in your tests, you could write additional queries to capture the URI of the node you have created, then add the results to your test queries using String concatenation.

Related

When do variables output properly in skeletons functions?

I'm trying to write a skeleton-function to output expressions in a loop. Out of a loop I can do,
(define-skeleton test
""
> "a")
When I evaluate this function it outputs "a" into the working buffer as desired. However, I'm having issues when inserting this into a loop. I now have,
(define-skeleton test
"A test skeleton"
(let ((i 1))
(while (< i 5)
>"a"
(setq i (1+ i)))))
I would expect this to output "aaaaa". However, instead nothing is outputted into the working buffer in this case. What is happening when I insert the loop?
The > somestring skeleton dsl does not work inside lisp forms.
You can however concatenate the string inside a loop:
(define-skeleton barbaz
""
""
(let ((s ""))
(dotimes (i 5)
(setq s (concat s "a")))
s)
)
My understanding is that code such as
> "a"
only works at the first nesting level inside a skeleton.
[EDIT] Regarding your question
What is happening when I insert the loop?
The return value of the let form (that is, the return value of the while form)is inserted. I do not know why it does not raise an error when evaluating > "a", but the return value of a while form is nil, so nothing is inserted.
I do agree that there's not much point using define-skeleton if you're going to need an (insert function within the skeleton.
This is also a rather trivial example to be using define-skeleton.
That said, they are often easier to read than a defun and useful when you want to create a function that inserts text (and optionally, takes input).
For example you may wish to have a different character repeated a set no. of times... Below, str refers to the argument supplied with the function (usually a string) and v1, v2 are the default names for local variables in a skeleton. Thus:
(define-skeleton s2 ""
nil ; don't prompt for value of 'str'
'(set 'v1 (make-string 5 (string-to-char str)))
\n v1 \n \n)
Below, calling the function leads to a newline, the string, then leaves the cursor at the position indicated by the square brackets [].
(s2 "a")
aaaaa
[]

How to use case and read-event with down-mouse-1

I have a function that uses (case (read-event) . . .) -- I have been unable to get down-mouse-1 to equal an integer for the duration of the function. The following is an example where down-mouse-1 yields a result of Try again instead of Hello world. All of the following examples work, except for down-mouse-1: ('f12 516); (?\s-k 517); ('f3 518); ('C-tab 519); ('C-M-s-right 520); (?m 522).
(let* (
(test (case (read-event)
('down-mouse-1 9999))))
(cond
((eq test 9999)
(message "Hello world."))
(t (message "Try again."))))
read-event never returns down-mouse-1. For a mouse click, the first event it will return will look like (down-mouse-1 ...). So you could do:
(pcase (read-event)
(`(down-mouse-1 . ,_) 9999))
Note that in my experience, 99% of the uses of read-event would be better rewritten some other way.
Not clear to me what you are trying to do. But you should not quote the keys in a case clause. E.g, use down-mouse-1, not 'down-mouse-1.

XText cross referencing

I have written following grammar
Model:
package = PackageDec?
greetings+=Greeting*
usage+=Usage* ;
PackageDec:
'package' name=QualifiedName ;
Greeting:
'greet' name=ID '{' ops += Operation* '}' ;
Operation:
'op' name=ID ('(' ')' '{' '}')? ;
QualifiedName:
ID ('.' ID)*;
Usage:
'use';
With above i can write following script.
package p1.p2
greet G1 {op f1 op f2 }
Now i need to write something like this:
package p1.p2
greet G1 {op f1 op f2 op f3}
use p1.p2.G1.f1
use p1.p2.G1
use p1.p2.G1.f3
To support that i changed Usage RULE like this
Usage:
'use' head=[Greet|QualifiedName] =>('.' tail=[Operation])?
However when i generate xtext artifacts it is complaining about multiple alternatives.
Please let me know how to write correct grammar rule for this.
This is because QualifiedName consumes dots (.). Adding ('.' ...)? makes two alternatives. Consider input
a.b.c
This could be parsed as
head="a" tail = "b.c"
head="a.b" tail = "c"
If I understand your intention of using predicate => right, than you just have to replace
head=[Greet|QualifiedName]
with
head=[Greet]
In this case however you will not be able to parse references with dots.
As a solution I would recommend to substitute your dot with some other character. For example with colon:
Usage:
'use' head=[Greet|QualifiedName] (':' tail=[Operation])?

NET::LDAP FIlter with OR

In PERL, NET::LDAP, I'm trying to use-
my $psearch-$ldap->search(
base => $my_base,
attr => ["mail","employeeNumber","physicalDeliveryOfficeName"],
filter => "(&(mail=*)(!(employeeNumber=9*)) (&(physicalDeliveryOfficeName=100)) (|(physicalDeliveryOfficeName=274)))");
Saying "give me everyone with a mail entry, where employee number does not begin with 9 and the physicalDeliveryOfficeName is either 100 or 274".
I can get it to work using just 100 or using just 274 but I can't seem to figure out how to specify 100 OR 274.
I can't seem to find the correct filter string, ready pull my hair out... please help!!
I can't test this, but LDAP queries use prefix notation while we're use to using infix notation. Imagine if you want something that's either a dog or a cat. In infix notation, it would look something like this:
((ANIMAL = "cat") OR (ANIMAL = "dog"))
With prefix notation, the boolean operator goes at the beginning of the query:
(OR (ANIMAL = "cat") (ANIMAL = "dog"))
The advantage to prefixed notation comes when you do more than two checks per boolean. Here I'm looking for something that's either a cat, a dog or a wombat:
(OR (ANIMAL = "cat") (ANIMAL = "dog") (ANIMAL = "wombat"))
Notice that I only needed a single boolean operator in the front of my statement. This will OR together all three statements. With our standard infix notation, I would have to have a second OR operator:
((ANIMAL = "cat") OR (ANIMAL = "dog") OR (ANIMAL = "wombat"))
Prefix notation was created by a Polish Mathematician named Jan Lukasiewicz back in 1924 in Warsaw Univeristy and thus became known as Polish Notation. Later on, it was discovered that computers could work an equation from front to back if the equation was written in postfix notation which is the reverse of Polish Notation. Thus, Reverse Polish Notation (or RPN) was born.
Early HP calculators used RPN notation which became the Geek Sheik thing back in the early 1970s. Imagine the sense of brain superiority you get when you hand your calculator to someone and they have no early idea how to use it. The only way to be cooler back then was to have a Curta.
Okay, enough walking down nostalgia lane. Let's get back to the problem...
The easiest way to construct an infix operation is to build a tree diagram of what you want. Thus, you should sketch out your LDAP query as a tree:
AND
/ | \
/ | \
/ | \
/ | \
/ | \
/ | \
/ | \
OR employee!=9* mail=*
/ \
/ \
/ \
/ \
/ \
phyDelOfficeName=100 phyDelOfficeName=274
To build a query based upon this tree, start with the bottom of the tree, and work your way up each layer. The bottom part of our tree is the OR part of our query:
(OR (physicalDeliveryOfficeName = 100) (physicalDeliveryOfficeName = 274))
Using LDAP's OR operator, the pipe (|) and removing the extra spaces, we get:
(|(physicalDeliveryOfficeName = 100)(physicalDeliveryOfficeName = 274))
When I build an LDAP query, I like to save each section as a Perl scalar variable. It makes it a bit easier to use:
$or_part = "|(physicalDeliveryOfficeName=100)(physicalDeliveryOfficeName=274)";
Notice I've left off the outer pair or parentheses. The outer set of parentheses return when you string all the queries back together. However, some people put them anyway. An extra set of parentheses doesn't hurt an LDAP query.
Now for the other two parts of the query:
$mailAddrExists = "mail=*";
$not_emp_starts_9 = "!(employee=9*)";
And, now we AND all three sections together:
"(&($mailAddrExists)($not_emp_starts_9)($or_part))"
Note that a single ampersand weaves it all together. I can substitute back each section to see the full query:
(&(mail=*)(!(employee=9*))(|(physicalDeliveryOfficeName=100)(physicalDeliveryOfficeName=274)))
Or like this:
my $psearch-$ldap->search(
base => $my_base,
attr => ["mail","employeeNumber","physicalDeliveryOfficeName"],
filter => "(&(mail=*)(!(employee=9*))(|(physicalDeliveryOfficeName=100)(physicalDeliveryOfficeName=274)))",
);
Or piecemeal:
my $mail = "mail=*";
my $employee = "!(employee=9*)";
my $physicalAddress = "|(physicalDeliveryOfficeName=100)"
. "(physicalDeliveryOfficeName=274)";
my $psearch-$ldap->search(
base => $my_base,
attr => ["mail","employeeNumber","physicalDeliveryOfficeName"],
filter => "(&($mail)($employee)($physicalAddress))",
);
As I said before, I can't test this. I hope it works. If nothing else, I hope you understand how to create an LDAP query and can figure out how to do it yourself.

Newbie transforming CSV files in Clojure

I'm both new and old to programming -- mostly I just write a lot of small Perl scripts at work. Clojure came out just when I wanted to learn Lisp, so I'm trying to learn Clojure without knowing Java either. It's tough, but it's been fun so far.
I've seen several examples of similar problems to mine, but nothing that quite maps to my problem space. Is there a canonical way to extract lists of values for each line of a CSV file in Clojure?
Here's some actual working Perl code; comments included for non-Perlers:
# convert_survey_to_cartography.pl
open INFILE, "< coords.csv"; # Input format "Northing,Easting,Elevation,PointID"
open OUTFILE, "> coords.txt"; # Output format "PointID X Y Z".
while (<INFILE>) { # Read line by line; line bound to $_ as a string.
chomp $_; # Strips out each line's <CR><LF> chars.
#fields = split /,/, $_; # Extract the line's field values into a list.
$y = $fields[0]; # y = Northing
$x = $fields[1]; # x = Easting
$z = $fields[2]; # z = Elevation
$p = $fields[3]; # p = PointID
print OUTFILE "$p $x $y $z\n" # New file, changed field order, different delimiter.
}
I've puzzled out a little bit in Clojure and tried to cobble it together in an imperative style:
; convert-survey-to-cartography.clj
(use 'clojure.contrib.duck-streams)
(let
[infile "coords.csv" outfile "coords.txt"]
(with-open [rdr (reader infile)]
(def coord (line-seq rdr))
( ...then a miracle occurs... )
(write-lines outfile ":x :y :z :p")))
I don't expect the last line to actually work, but it gets the point across. I'm looking for something along the lines of:
(def values (interleave (:p :y :x :z) (re-split #"," coord)))
Thanks, Bill
Please don't use nested def's. It doesn't do, what you think it does. def is always global! For locals use let instead. While the library functions are nice to know, here a version orchestrating some features of functional programming in general and clojure in particular.
(import 'java.io.FileWriter 'java.io.FileReader 'java.io.BufferedReader)
(defn translate-coords
Docstrings can be queried in the REPL via (doc translate-coords). Works eg. for all core functions. So supplying one is a good idea.
"Reads coordinates from infile, translates them with the given
translator and writes the result to outfile."
translator is a (maybe anonymous) function which extracts the translation from the surrounding boilerplate. So we can reuse this functions with different transformation rules. The type hints here avoid reflection for the constructors.
[translator #^String infile #^String outfile]
Open the files. with-open will take care, that the files are closed when its body is left. Be it via normal "drop off the bottom" or be it via a thrown Exception.
(with-open [in (BufferedReader. (FileReader. infile))
out (FileWriter. outfile)]
We bind the *out* stream temporarily to the output file. So any print inside the binding will print to the file.
(binding [*out* out]
The map means: take the seq and apply the given function to every element and return the seq of the results. The #() is a short-hand notation for an anonymous function. It takes one argument, which is filled in at the %. The doseq is basically a loop over the input. Since we do that for the side effects (namely printing to a file), doseq is the right construct. Rule of thumb: map: lazy => for result, doseq: eager => for side effects.
(doseq [coords (map #(.split % ",") (line-seq in))]
println takes care for the \n at the end of the line. interpose takes the seq and adds the first argument (in our case " ") between its elements. (apply str [1 2 3]) is equivalent to (str 1 2 3) and is useful to construct function calls dynamically. The ->> is a relatively new macro in clojure, which helps a bit with readability. It means "take the first argument and add it as last item to the function call". The given ->> is equivalent to: (println (apply str (interpose " " (translator coords)))). (Edit: Another note: since the separator is \space, we could here write just as well (apply println (translator coords)), but the interpose version allows to also parametrize the separator as we did with the translator function, while the short version would hardwire \space.)
(->> (translator coords)
(interpose " ")
(apply str)
println)))))
(defn survey->cartography-format
"Translate coords in survey format to cartography format."
Here we use destructuring (note the double [[]]). It means the argument to the function is something which can be turned into a seq, eg. a vector or a list. Bind the first element to y, the second to x and so on.
[[y x z p]]
[p x y z])
(translate-coords survey->cartography-format "survey_coords.txt" "cartography_coords.txt")
Here again less choppy:
(import 'java.io.FileWriter 'java.io.FileReader 'java.io.BufferedReader)
(defn translate-coords
"Reads coordinates from infile, translates them with the given
translator and writes the result to outfile."
[translator #^String infile #^String outfile]
(with-open [in (BufferedReader. (FileReader. infile))
out (FileWriter. outfile)]
(binding [*out* out]
(doseq [coords (map #(.split % ",") (line-seq in))]
(->> (translator coords)
(interpose " ")
(apply str)
println)))))
(defn survey->cartography-format
"Translate coords in survey format to cartography format."
[[y x z p]]
[p x y z])
(translate-coords survey->cartography-format "survey_coords.txt" "cartography_coords.txt")
Hope this helps.
Edit: For CSV reading you probably want something like OpenCSV.
Here's one way:
(use '(clojure.contrib duck-streams str-utils)) ;;'
(with-out-writer "coords.txt"
(doseq [line (read-lines "coords.csv")]
(let [[x y z p] (re-split #"," line)]
(println (str-join \space [p x y z])))))
with-out-writer binds *out* such that everything you print will go to the filename or stream you specify, rather than standard-output.
Using def as you're using it isn't idiomatic. A better way is to use let. I'm using destructuring to assign the 4 fields of each line to 4 let-bound names; then you can do what you want with those.
If you're iterating over something for the purpose of side-effects (e.g. I/O) you should usually go for doseq. If you wanted to collect up each line into a hash-map and do something with them later, you could use for:
(with-out-writer "coords.txt"
(for [line (read-lines "coords.csv")]
(let [fields (re-split #"," line)]
(zipmap [:x :y :z :p] fields))))