TraMineR:::seqerules help page?

Is there a help page for TraMineR:::seqerules? I cannot seem to find it, neither in the package nor online. The lack of a help page makes the output somewhat difficult to interpret. For example, what do the Conf and Lift columns specify? Below is an example of the output:
    Rules                     Support  Conf        Lift
308 (NR)-(QU)-(QU) => (IN)    8        0.61538462  2.666667
153 (IN)-(EX) => (IN)         11       0.55000000  2.383333
394 (NR)-(NR)-(QU) => (IN)    7        0.53846154  2.333333
390 (NR)-(NR) => (NR)-(FA)    7        0.14000000  2.298947
259 (QU)-(EX) => (IN)         9        0.52941176  2.294118

You are right that seqerules is not documented. Help pages are required for public R functions only, and seqerules is currently not a public function of TraMineR. That also is why you need the ::: operator to access it.
The function returns sequential association rules. The first rule in your example output says that when the subsequence (NR)-(QU)-(QU) occurs, it is (generally) followed by the subsequence (IN). This rule has a support of 8 (i.e., it is observed in 8 sequences).
Conf is the confidence, i.e., the probability to observe the conclusion of the rule among the sequences that contain the premise of the rule. For the first rule it is 61.5%.
Lift is the lift, i.e., the ratio of the confidence over the probability to observe the conclusion among all sequences (not only those that satisfy the premise). The higher the lift, the better the rule. A lift less than 1 would mean that the occurrence of the premise reduces the chances for the conclusion to occur, and indicates that the rule is of no interest.
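To make the two definitions concrete, here is a small Python sketch that computes confidence and lift from raw counts. It only follows the verbal definitions given above; it is not TraMineR's code. The function name `rule_metrics` and the counts (65 sequences in total, 15 of them containing the conclusion) are hypothetical, chosen so that the result is consistent with the first rule in the question.

```python
def rule_metrics(n_total, n_premise, n_conclusion, n_rule):
    """Return (support, confidence, lift) for a rule premise => conclusion.

    n_total      -- number of sequences in the data set
    n_premise    -- sequences that contain the premise
    n_conclusion -- sequences that contain the conclusion
    n_rule       -- sequences where the premise is followed by the conclusion
    """
    confidence = n_rule / n_premise      # P(conclusion | premise)
    baseline = n_conclusion / n_total    # P(conclusion) over all sequences
    lift = confidence / baseline
    return n_rule, confidence, lift


# Hypothetical counts (assumed, not taken from the question's data set).
support, conf, lift = rule_metrics(n_total=65, n_premise=13,
                                   n_conclusion=15, n_rule=8)
print(support, round(conf, 4), round(lift, 4))  # 8 0.6154 2.6667
```

With these assumed counts, the conclusion is about 2.67 times more likely among sequences that contain the premise than among all sequences, which is how a lift above 1 should be read.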

Boolean expression with a redundant overlapping term

I have simplified a Boolean expression using 14 Boolean Algebra law steps and now have a working resultant function which according to a KV map still has a redundant term.
In an attempt to remove this term I have tried various applications of the distributive, complement and identity laws followed by De Morgan's law, as well as a consensus theorem approach. The textbooks I've consulted all say there is no theory or set of rules for resolving such an issue, just experience!
After much simplification (page and half) my resultant expression is,
z = ~a~cd + b~a + b~d + bc [1]
Using a KV map I get a slightly simpler expression of,
z = ~a~cd + b~d + bc [2]
The truth tables of the two expressions are identical, so the b~a term of my first expression [1] appears to be redundant.
I expected to be able to cancel the redundant b~a term by applying the laws of Boolean algebra, but after much experimenting I'm unable to find an entry point.
This is an assignment question, so I do not expect anybody to do my homework, but advice on how to approach this challenge would be appreciated.
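The claim that [1] and [2] have the same truth table can be checked mechanically. The Python sketch below enumerates all 16 input combinations and, for every input where b~a is true, lists which of the remaining terms is also true. It is a brute-force check, not an algebraic derivation, and all names in it are made up for illustration.

```python
from itertools import product

# The three terms shared by expressions [1] and [2].
TERMS = {
    "~a~cd": lambda a, b, c, d: (not a) and (not c) and d,
    "b~d": lambda a, b, c, d: b and (not d),
    "bc": lambda a, b, c, d: b and c,
}


def extra_term(a, b, c, d):
    """The b~a term that appears only in expression [1]."""
    return b and (not a)


def z2(a, b, c, d):
    return any(term(a, b, c, d) for term in TERMS.values())


def z1(a, b, c, d):
    return z2(a, b, c, d) or extra_term(a, b, c, d)


inputs = list(product([False, True], repeat=4))
equivalent = all(z1(*bits) == z2(*bits) for bits in inputs)

# For each input where b~a is 1, record the other terms that are also 1.
coverage = {
    bits: [name for name, term in TERMS.items() if term(*bits)]
    for bits in inputs
    if extra_term(*bits)
}

print("equivalent:", equivalent)
for bits, names in coverage.items():
    print([int(x) for x in bits], "covered by", names)
```

Every input that makes b~a true is already covered by at least one other term, which is what the KV map shows as an overlapping group. Seeing which term covers which case may suggest which variables to expand b~a over before applying absorption.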

Drools RETE algorithm confusion

I am having trouble understanding the BetaNodes of the RETE algorithm, JoinNode and NotNode.
Documentation says :
There are two two-input nodes, JoinNode and NotNode, and both are
types of BetaNodes. BetaNodes are used to compare 2 objects, and their
fields, to each other. The objects may be the same or different types.
By convention, we refer to the two inputs as left and right. The left
input for a BetaNode is generally a list of objects; in Drools this is
a Tuple. The right input is a single object. Two Not Nodes can be used to
implement 'exists' checks. BetaNodes also have memory. The left input
is called the Beta Memory and remembers all incoming tuples. The right
input is called the Alpha Memory and remembers all incoming objects.
I understand AlphaNodes: they evaluate the various literal conditions of DRL rules. But the documentation above for BetaNodes is confusing me a bit.
Say the DRL condition for the diagram above is:
$person : Person( favouriteCheese == $cheddar )
Questions:
1) What exactly are the left and right inputs to the two-input BetaNodes described in the documentation above? I believe it is referring to facts and rules, where the tuples would be facts.
2) Would a NotNode basically be a DRL condition that matches a literal condition with not?
Update (6 Sep 2017):
3) I believe the diagram above represents a JoinNode. How would a NotNode be represented if the workflow above were altered to suit it?
The condition corresponding to the diagram would be
Cheese( $name: name == "Cheddar" )
Person( favouriteCheese == $name )
Once there is a match, a new tuple consisting of the matching Cheese and Person is composed and can act as a new tuple for further matches if there is a third pattern in the condition.
A not-node would be one that asserts the non-existence of some fact. It would fire only once.
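The join and not behaviour described above can be illustrated with a toy model in Python. This is only a sketch of the idea of a left memory of tuples and a right memory of single facts; the class and method names (`BetaNode`, `add_left`, `add_right`, `matches`) are hypothetical and are not Drools classes, and a real Rete network propagates matches incrementally instead of recomputing them as this sketch does.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cheese:
    name: str


@dataclass(frozen=True)
class Person:
    name: str
    favourite_cheese: str


class BetaNode:
    """Toy two-input node with a memory on each side (not Drools' real classes)."""

    def __init__(self, condition):
        self.condition = condition  # called as condition(left_tuple, right_fact)
        self.left_memory = []       # tuples of facts matched by earlier patterns
        self.right_memory = []      # single facts arriving on the right input

    def add_left(self, left_tuple):
        self.left_memory.append(left_tuple)

    def add_right(self, fact):
        self.right_memory.append(fact)


class JoinNode(BetaNode):
    def matches(self):
        # Each satisfied (tuple, fact) pair yields a longer tuple
        # that a third pattern could match against.
        return [t + (f,) for t in self.left_memory
                for f in self.right_memory if self.condition(t, f)]


class NotNode(BetaNode):
    def matches(self):
        # A left tuple passes only if no right fact satisfies the condition.
        return [t for t in self.left_memory
                if not any(self.condition(t, f) for f in self.right_memory)]


def likes(left_tuple, person):
    # Person( favouriteCheese == $name ), with $name bound by the Cheese pattern
    return person.favourite_cheese == left_tuple[0].name


cheeses = [Cheese("Cheddar"), Cheese("Stilton")]
people = [Person("Alice", "Cheddar"), Person("Bob", "Cheddar")]

join, negation = JoinNode(likes), NotNode(likes)
for node in (join, negation):
    for cheese in cheeses:
        node.add_left((cheese,))
    for person in people:
        node.add_right(person)

joined = join.matches()         # one tuple per matching (Cheese, Person) pair
unmatched = negation.matches()  # cheeses that nobody names as a favourite
```

Here the join yields two tuples (Cheddar with Alice, Cheddar with Bob), while the not variant passes the Stilton tuple exactly once, regardless of how many Person facts exist.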
You might find a much better description of "rete" on the web.

PLT Redex: parameterizing a language definition

This is a problem that's been nagging at me for some time, and I wonder if anyone here can help.
I have a PLT Redex model of a language called lambdaLVar that is more or less a garden-variety untyped lambda calculus, but extended with a store containing "lattice variables", or LVars. An LVar is a variable whose value can only increase over time, where the meaning of "increase" is given by a partially ordered set (aka a lattice) that the user of the language specifies. Therefore lambdaLVar is really a family of languages -- instantiate it with one lattice and you get one language; with a different lattice, and you get another. You can take a look at the code here; the important stuff is in lambdaLVar.rkt.
In the on-paper definition of lambdaLVar, the language definition is parameterized by that user-specified lattice. For a long time, I've wanted to do the same kind of parameterization in the Redex model, but so far, I haven't been able to figure out how. Part of the trouble is that the grammar of the language depends on how the user instantiates the lattice: elements of the lattice become terminals in the grammar. I don't know how to express a grammar in Redex that is abstract over the lattice.
In the meantime, I tried to make lambdaLVar.rkt as modular as I could. The language defined in that file is specialized to a particular lattice: natural numbers with max as the least-upper-bound (lub) operation. (Or, equivalently, natural numbers ordered by <=. It's a very boring lattice.) The only parts of the code that are specific to that lattice are the line (define lub-op max) near the top, and natural appearing in the grammar. (There's a lub metafunction that is defined in terms of the user-specified lub-op function. The latter is just a Racket function, so lub has to escape out to Racket to call lub-op.)
Barring the ability to actually specify lambdaLVar in a way that is abstract over the choice of lattice, it seems like I ought to be able to write a version of lambdaLVar with the most bare-bones of lattices -- just Bot and Top elements, where Bot <= Top -- and then use define-extended-language to add more stuff. For instance, I could define a language called lambdaLVar-nats that is specialized to the naturals lattice I described:
;; Grammar for elements of a lattice of natural numbers.
(define-extended-language lambdaLVar-nats
  lambdaLVar
  (StoreVal .... ;; Extend the original language
            natural))
;; All we have to specify is the lub operation; leq is implicitly <=
(define-metafunction/extension lub lambdaLVar-nats
  lub-nats : d d -> d
  [(lub-nats d_1 d_2) ,(max (term d_1) (term d_2))])
Then, to replace the two reduction relations slow-rr and fast-rr that I had for lambdaLVar, I could define a couple of wrappers:
(define nats-slow-rr
  (extend-reduction-relation slow-rr
                             lambdaLVar-nats))
(define nats-fast-rr
  (extend-reduction-relation fast-rr
                             lambdaLVar-nats))
My understanding from the documentation on extend-reduction-relation is that it should reinterpret the rules in slow-rr and fast-rr, but using lambdaLVar-nats. Putting all this together, I tried running the test suite that I had with one of the new, extended reduction relations:
> (program-test-suite nats-slow-rr)
The first thing I get is a contract violation complaint: small-step-base: input (((l 3)) new) at position 1 does not match its contract. The contract line of small-step-base is just #:contract (small-step-base Config Config), where Config is a grammar nonterminal that means something different when reinterpreted under lambdaLVar-nats than it did under lambdaLVar, because of the lattice-specific parts of the grammar. As an experiment, I got rid of the contracts on small-step-base and small-step-slow.
I was then able to actually run my 19 test programs, but 10 of them fail. Perhaps unsurprisingly, all the ones that fail are programs that use natural-number-valued LVars in some way. (The rest are "pure" programs that don't interact with the store of LVars at all.) So, the tests that fail are exactly the ones that use the extended grammar.
So I kept following the rabbit hole, and it seems like Redex wants me to extend all of the existing judgment forms and metafunctions to be associated with lambdaLVar-nats rather than lambdaLVar. That makes sense, and it seems to work OK for judgment forms as far as I can tell, but with metafunctions I get into trouble: I want the new metafunction to overload the old one of the same name (because existing judgment forms are using it) and there doesn't seem to be a way to do that. If I have to rename the metafunctions, it defeats the purpose, because I'll have to write whole new judgment forms anyway. I suppose that what I want is a sort of late binding of metafunction calls!
My question in a nutshell: Is there any way in Redex to parameterize the definition of a language in the way I want, or to extend the definition of a language in a way that will do what I want? Will I end up just having to write Redex-generating macros?
Thanks for reading!
I asked the Racket users mailing list; the thread begins here. To summarize the resulting discussion: In Redex as it stands today, the answer is no, there is no way to parameterize a language definition in the way I want. However, it should be possible in a future version of Redex with a module system, which is in the works right now.
It also doesn't work to try to use Redex's existing extension forms (define-extended-language, extend-reduction-relation, and so on) in the way I tried to do here, because -- as I discovered -- the original metafunctions do not get transitively reinterpreted to use the extended languages. But a module system would apparently help with this, too, because it would allow you to package up metafunctions, judgment-forms, and reduction relations together and simultaneously extend them (see the discussion here).
So, for now, the answer is, indeed, to write a Redex-generating macro. Something like this works:
(define-syntax-rule (define-lambdaLVar-language name lub-op lattice-values ...)
  (begin
    ;; Entire original Redex model goes here, with `natural` replaced with
    ;; `lattice-values ...`, and instances of `...` replaced with `(... ...)`
    ))
And then you can instantiate particular lattices with, e.g.:
(define-lambdaLVar-language lambdaLVar-nat max natural)
I hope Redex does get modules soon, but in the meantime, this seems to work well.

Documenting Scala functional chains [closed]

Scala (and functional programming, in general), advocates a style of programming where you produce functional "chains" of the form
collection.operation1(...).operation2(...)...
where the operations are various combinations of map, filter, etc.
Where the equivalent Java code might require 50 lines, the Scala code can be done in 1 or 2 lines. The functional chain can change an input collection to something completely different.
The disadvantage of the Scala code is that 10 minutes later (never mind 6 months later), I can't figure out what I was thinking, because the notation is so compact, and lacks type information (because of implied types).
How do you document this? Do you put a large block comment before the chain, changing an elegant 1 line solution into a bulky 40 line solution consisting of 39 lines of comment? Do you intersperse your comments like this?
collection.
// Select the items that meet condition X
filter(predicate_function).
// Change these items from A's to B's
map(transformation_function).
// etc.
Something else? No documentation? (Leave them guessing. They'll never "downsize" you then, because no one else can maintain the code. :-))
If you find yourself writing comments at that detail level, you're just repeating what the code says.
For long functional chains, define new functions to replace parts of the chain. Give these meaningful names. Then you might be able to avoid comments. The names of these functions themselves should explain what they do.
The best comments are the ones that explain why the code does something. Well-written code should make the "how" obvious from the code itself.
I don't write that code to begin with (unless it's a script for one-time use or playing around in the REPL).
If I can explain what the code does in one comment and the code reads okay, then I keep it as a one-liner:
// Find all real-valued square roots and group them in integer bins
ds.filter(_ >= 0).map(math.sqrt).groupBy(_.toInt).map(_._2)
If I can't understand this by reading carefully through the chain of commands, then I should break it up more into functionally distinct units. For example, if I expected someone to not realize that the square root of a negative number is not real-valued, I would say:
// Only non-negative numbers have a real-valued square root
val nonneg = ds.filter(_ >= 0)
// Find square roots and group them in integer bins
nonneg.map(math.sqrt).groupBy(_.toInt).map(_._2)
In particular, if someone doesn't know the Scala collections library well and doesn't have the patience to spend five to ten minutes understanding one line of code, then one of two things is true. Either they shouldn't be working on my code (or on anything else nontrivial that they don't understand and don't have the patience to understand), or I should know in advance that I'm providing a language and mathematics tutorial in addition to working code. In the latter case I would write a paragraph explaining how the following line works, break it out command by command, or add comments at the start of each anonymous function explaining what is going on, as appropriate.
Anyway, if you can't understand what it does, you probably need some intermediate values. They are very helpful for mental-resetting ("I can't see how to get from A to C!...but...okay, I can understand A to B. And I can understand B to C.")
If your chained operations are all monadic transforms: map, flatMap, filter, then it's often much, much clearer to rewrite the logic as a for-comprehension.
coll.filter(predicate).map(transform)
could become
for (elem <- coll if predicate(elem)) yield transform(elem)
It's even easier to show off the power of the technique if you have a longer sequence of operations, such as with Kassen's example:
def eligibleCustomers(products: Seq[Product]) = for {
  product <- products
  customer <- product.customers
  if customer.isPremium  // paying customers only
  if customer.age < 20   // eligible customers only
} yield customer
If you don't want to split it into multiple methods, as hammar suggested, you can split the line and give the intermediate values names (and optionally types).
def eligibleCustomers: List[Customer] = {
  val customers = products.flatMap(_.customers)
  val paying = customers.filter(_.isPremium)
  val eligible = paying.filter(_.age < 20)
  eligible
}
Line length is a fairly natural indicator of when your chain is getting too long. :)
Of course, it will depend upon how trivial the chain is:
customerdata.filter (_.age < 40).filter (_.city == "Rio").
  filter (_.income > 3000).filter (_.joined < 2005).
  filter (_.sex == 'f'). ...
I recently had the same impression with an application of 3 files, one of them a bit lengthy, consisting of 4 classes, one of them nontrivial, and about 10 to 20 methods. Each method was about 5 to 10 lines, and any two of them could easily have been combined into a larger one. I had to convince myself that although measuring elegance in lines of code saved isn't completely wrong, saving lines isn't the goal in itself.
Splitting a method in two often lowers the complexity per line, but not the overall complexity of understanding the whole program.
If the problem domain is complex - filter data at different levels, rowwise, columnwise, map it, group it, build averages, build graphs, paginate them ... - the complicated job has to be done somewhere.
The program isn't easier to understand; you just have to hit page down less often. The adjustment is that you have to read each line of code more slowly.
It doesn't bother me that much now I'm used to Scala. If you want to be more explicit with types, you can always, for example, replace things like map(_.foo) with map { a:A => a.foo } to make the code more readable in lengthy/complex operations. Not that I usually find the need to do that.

Implementing a Measured value in Scala

A Measured value consists of a (typically nonnegative) floating-point number and a unit of measure. The point is to represent real-world quantities and the rules that govern them. Here's an example:
scala> val oneinch = Measure(1.0, INCH)
oneinch : Measure[INCH] = Measure(1.0)
scala> val twoinch = Measure(2.0, INCH)
twoinch : Measure[INCH] = Measure(2.0)
scala> val onecm = Measure(1.0, CM)
onecm : Measure[CM] = Measure(1.0)
scala> oneinch + twoinch
res1: Measure[INCH] = Measure(3.0)
scala> oneinch + onecm
res2: Measure[INCH] = Measure(1.393700787)
scala> onecm * onecm
res3: Measure[CMSQ] = Measure(1.0)
scala> onecm * oneinch
res4: Measure[CMSQ] = Measure(2.54)
scala> onecm * Measure(1.0, LITER)
<console>:7: error: conformance mismatch
scala> oneinch * 2 == twoinch
res5: Boolean = true
Before you get too excited: I haven't implemented this; I just dummied up a REPL session. I'm not even sure of the syntax. I just want to be able to handle things like adding Measured quantities (even with mixed units), multiplying Measured quantities, and so on, and ideally I'd like Scala's vaunted type system to guarantee at compile time that expressions make sense.
My questions:
Is there extant terminology for this problem?
Has this already been done in Scala?
If not, how would I represent concepts like "length" and "length measured in meters"?
Has this been done in some other language?
A $330-million Mars probe was lost because the contractor's software reported values in pound-force seconds while NASA's software expected newton-seconds. A Measure library could have prevented the crash.
F# has support for it, see for example this link for an introduction. There has been some work done in Scala on Units, for example here and here. There is a Scala compiler plugin as well, as described in this blog post. I briefly tried to install it, but using Scala 2.8.1, I got an exception when I started up the REPL, so I'm not sure whether this plugin is actively maintained at the moment.
Well, this functionality exists in Java, meaning you can use it directly in Scala.
See JSR-275, which was moved to Google Code. JScience implements the spec. Here's a good introduction. If you want a better interface, I'd use this as a base and build a wrapper around it.
Your question is fully answered with one word. You can thank me later.
FRINK. http://futureboy.us/frinkdocs/
FYI, I have developed a Scalar class in Scala to represent physical units. I am currently using it for my R&D work in air traffic control, and it is working well for me. It does not check for unit consistency at compile time, but it checks at run time. I have a unique scheme for easily substituting it with basic numeric types for efficiency after the application is tested. You can find the code and the user guide at
http://russp.us/scalar-scala.htm
Here is the summary from the website:
Summary-- A Scala class was designed to represent physical scalars and to eliminate errors involving implicit physical units (e.g., confusing radians and degrees). The standard arithmetic operators are overloaded to provide syntax identical to that for basic numeric types. The Scalar class itself does not define any units but is part of a package that includes a complete implementation of the standard metric system of units and many common non-metric units. The scalar package also allows the user to define a specialized or reduced set of physical units for any particular application or domain. Once an application has been developed and tested, the Scalar class can be switched off at compile time to achieve the execution efficiency of operations on basic numeric types, which are an order of magnitude faster. The scalar class can also be used for discrete units to enforce type checking of integer counts, thereby enhancing the static type checking of Scala with additional dynamic type checking.
Let me clarify my previous post. I should have said, "These kinds of errors [meter/yard conversion errors] are automatically AVOIDED (not 'handled') by simply using my Scalar class." All unit conversions are done automatically. That's the easy part.
The harder part is the checking for unit inconsistencies, such as adding a length to a velocity. This is where the issue of dynamic vs. static type checking comes up. I agree that static checking is generally preferable, but only if it can be done without sacrificing usability and convenience.
I have seen at least two "projects" for static checking of units, but I have never heard of anyone actually using them for real work. If someone knows of a case where they were used, please let me know. Until you use software for real work, you don't know what sorts of issues will come up.
As I wrote above, I am currently using my Scalar class (http://russp.us/scalar-scala.htm) for my R&D work in ATC. I've had to make many tweaks along the way for usability and convenience, but it is working well for me. I would be willing to consider a static units implementation if a proven one comes along, but for now I feel that I have essentially 99% of the value of such a thing. Hey, the vast majority of scientists and engineers just use "Doubles," so cut me some slack!
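The distinction drawn above between automatic unit conversion (the easy part) and checking for unit inconsistencies at run time can be sketched in a few lines of Python. This is not the Scalar class from the linked page and not any existing library; the `Measure` class, the `UNITS` table and the unit names are all hypothetical, and only addition is shown.

```python
from dataclasses import dataclass

# Hypothetical unit table: unit name -> (dimension, factor to the base unit).
UNITS = {
    "inch": ("length", 0.0254),
    "cm": ("length", 0.01),
    "m": ("length", 1.0),
    "s": ("time", 1.0),
}


@dataclass(frozen=True)
class Measure:
    value: float
    unit: str

    def to(self, unit):
        """Convert to another unit of the same dimension, or raise TypeError."""
        dim_from, factor_from = UNITS[self.unit]
        dim_to, factor_to = UNITS[unit]
        if dim_from != dim_to:
            # The run-time consistency check: e.g. adding a length to a time.
            raise TypeError(f"cannot convert {dim_from} ({self.unit}) "
                            f"to {dim_to} ({unit})")
        return Measure(self.value * factor_from / factor_to, unit)

    def __add__(self, other):
        # Mixed units of one dimension are converted to the left operand's unit.
        return Measure(self.value + other.to(self.unit).value, self.unit)


one_inch = Measure(1.0, "inch")
one_cm = Measure(1.0, "cm")
total = one_inch + one_cm  # about 1.3937 inch; conversion happens automatically

try:
    one_inch + Measure(1.0, "s")
    inconsistent_caught = False
except TypeError:
    inconsistent_caught = True  # length + time is rejected, but only when run
```

The trade-off discussed in this thread is visible here: the inch/cm conversion can never go wrong, whereas the length-plus-time mistake surfaces only when that line actually executes, which is what a static units system would move to compile time.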
"Yeah, ATC software with run-time type checking? I can see headlines now: "Flight 34 Brought Down By Meter/Yard Conversion"."
Sorry, but you don't know what you're talking about. ATC software is tested for years before it is deployed. That is enough time to catch unit inconsistency errors.
More importantly, meter/yard conversions are not even an issue here. These kinds of errors are automatically handled simply by using my Scalar class. For those kinds of errors, you need neither static nor dynamic checking. The issue of static vs. dynamic checking comes up only for unit inconsistencies, as in adding length to time. These kinds of errors are less common and are typically caught with dynamic checking on the first test run.
By the way, the interface here is terrible.