Scala variance positions - theory behind it?

Scala has a notion of "variance position" and a set of rules around it, especially when variance is combined with method type bounds. The rules ensure type safety; one can read them in the Scala language specification, or see them summarized in "Programming in Scala", 5th edition:
To classify the positions, the compiler starts from the declaration of a type parameter and then moves inward through deeper nesting levels. Positions at the top level of the declaring class are classified as positive. By default, positions at deeper nesting levels are classified the same as that at enclosing levels, but there are a handful of exceptions where the classification changes. Method value parameter positions are classified to the flipped classification relative to positions outside the method, where the flip of a positive classification is negative, the flip of a negative classification is positive, and the flip of a neutral classification is still neutral.
Here we see some sort of "variance positions algebra". I understand the flipping and nesting rules from trying them out in code, so my question is different.
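To make the flipping concrete, here is a minimal sketch (my own class names, not from the book or the spec) of which positions the compiler accepts and rejects:

```scala
// A covariant type parameter may only occur in positive positions.
class Box[+A](initial: A) {
  def get: A = initial                 // result type: positive position, fine for +A

  // def set(a: A): Unit = ()          // method parameter position is flipped to
  //                                   // negative, so the compiler rejects this

  def replace[B >: A](b: B): Box[B] =  // the lower bound B >: A is the usual fix
    new Box(b)
}

// Contravariance is the mirror image: only negative positions are allowed.
trait Printer[-A] {
  def print(a: A): Unit                // parameter: negative position, fine for -A
  // def last: A                       // result: positive position, rejected for -A
}
```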
Question:
Imagine I want to create my own language that supports variance (as rich as Scala's, or as rudimentary as Java's wildcard generics). Which theory (type theory?), and which part of it, do I need to grasp to understand all the "mechanics" behind this "variance positions algebra"?
I am looking for more abstract, more general knowledge (not just the list of rules in the Scala language specification) that would let me see particular languages' variance implementations simply as special cases.

Related

How to control ordering of Matlab *optimproblem.Variables*

Matlab's optimproblem class of objects allows users to define Integer Linear Programming (ILP) problems using symbolic variables. This is dubbed the "problem-based" formulation. Internal methods take care of setting up the detailed ILP formulation by assembling the coefficient arrays and matrices for the objective function, equality constraints, and inequality constraints. In Matlab, these details are referred to as the "structure" for the "solver-based" formulation.
Users can see the order in which the optimproblem.Variables are taken in setting up the solver-based formulation by using prob2struct to explicitly convert an optimizationproblem object into a solver-based structure. According to the Algorithms section of the prob2struct page, the variables are taken in the order in which they appear in the optimizationproblem.Variables property.
I haven't been able to find what determines this order. Is there any way to control the order, maybe even change it if necessary? This would allow one to control the order of the scalar variables in the archetypal ILP problem setup, i.e., the solver-based formulation.
Thanks.
Reason for this question
I'm using Matlab as a prototyping environment, and may be relying on others to develop based on the prototype, possibly calling other solver engines. An uncontrolled ordering of variables makes it hard to compare, especially if the development has a deterministic way of arranging the variables. Hence my wish to control the variable ordering. If this is not possible, it would be nice to know. I would then know to turn my attention completely to mitigating the challenge of disparately ordered variables.

Is Dijkstra's a special case of A*?

According to, among others, this accepted answer, Dijkstra's algorithm is a special case of the A* algorithm.
Special case
In logic, especially as applied in mathematics, concept A is a special case or specialization of concept B precisely if every instance of A is also an instance of B but not vice versa, or equivalently, if B is a generalization of A.
There are, however, cases that produce different results for A* with a constant zero heuristic when compared to Dijkstra's algorithm.
One of these is their behaviour when the graph contains negative cycles: while A*'s closed-set handling completely avoids such cycles, Dijkstra's follows a negative cycle indefinitely before proceeding.
Another edge case where their results may differ, this time even with an admissible heuristic, is when the heuristic underestimates the cost of the final edge in a graph. Given a graph like this:
Most A* implementations, including the pseudo-code example at Wikipedia, would incorrectly report A-B-D as the shortest path if the heuristic function rates the cost from B to D below 2, while Dijkstra correctly yields A-C-D.
Can Dijkstra's algorithm still be viewed as a special case of A*, given the above?
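For reference, here is a minimal generic A* sketch (my own Scala code, not the Wikipedia pseudo-code referenced above) in which the heuristic is just a parameter. Passing a constant-zero heuristic makes the priority equal to the accumulated path cost, so the expansion order on graphs with non-negative weights matches Dijkstra's, which is the usual sense of the "special case" claim:

```scala
import scala.collection.mutable

// Minimal A* sketch: `h` is the heuristic. The closed set mirrors the
// behaviour described above: a node is never expanded twice.
def aStar[N](start: N, goal: N,
             neighbors: N => Iterable[(N, Double)],
             h: N => Double): Option[Double] = {
  val dist = mutable.Map(start -> 0.0)
  val open = mutable.PriorityQueue.empty[(Double, N)](
    Ordering.by[(Double, N), Double](t => -t._1)) // smallest f-value first
  val closed = mutable.Set.empty[N]
  open.enqueue((h(start), start))

  while (open.nonEmpty) {
    val (_, node) = open.dequeue()
    if (node == goal) return Some(dist(node))
    if (!closed(node)) {
      closed += node
      for ((next, weight) <- neighbors(node)) {
        val candidate = dist(node) + weight
        if (candidate < dist.getOrElse(next, Double.PositiveInfinity)) {
          dist(next) = candidate
          open.enqueue((candidate + h(next), next))
        }
      }
    }
  }
  None
}

// With a zero heuristic this degenerates to Dijkstra's expansion order:
// aStar(start, goal, neighbors, _ => 0.0)
```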

Self-organized map understanding

I have a self-organized map created with Som_pak-3.1 here
I have three different types of elements, and they are different. Why are the elements not in different parts of the map? Why are "A", "B" and "C" in so many cases together in the same hexagon? Why are "B" and "C" never alone in a hexagon?
Thanks in advance!
I feel that this is a normal result for a SOM. The unsupervised SOM algorithm is not aware of the element labels. Using the distance metric, the neurons have learned the vectors, and the elements were then placed as labels at their best-matching neuron.
One possible reason for two elements appearing on the same node is that they have the same values for each of the features. Otherwise, they have different feature values, but the vectors still look similar according to the distance metric.
The spatial resolution can be increased by increasing the map size. This may allow the classes to become separable. However, the trade-off is that the statistical significance of each neuron goes down when it is associated with fewer data points. So I would suggest trying different map sizes to find one that is appropriate for your data set and goals.
Actually, I was just reading about this exact point: see p. 19 of Kohonen's book "MATLAB Implementations and Applications of the Self-Organizing Map", available at http://docs.unigrafia.fi/publications/kohonen_teuvo/. It covers the MATLAB SOM-Toolkit that was created after SOM-PAK. The book only briefly covers SOM-PAK, but I believe the theory in it would help.

Text patterns from classification

Say I have some kind of multi-class text/conversation classifier (naive Bayes or similar) and I want to find text patterns that were significant for the classification. How would I best go about finding these text patterns? The motivation is that these patterns could be used to better understand the process behind the classification.
A pattern is defined as a (multi)set of words s = {w1, ..., wn}. The pattern has a class probability for each class c, P(c|s), inferred by the classifier. A pattern is then significant if the inferred probability is high (a local maximum, top n, something like that).
Now it wouldn't be such a problem to run the classifier on parts of the text in the dataset you are looking at. However, these patterns do not have to be natural sentences; they can be any (multi)subset of the vocabulary. You would then be looking at running the classifier on all (multi)subsets of the vocabulary, which is computationally unrealistic.
I think what could work is to search the text space using a heuristic search algorithm such as hill climbing to maximize the likelihood of a certain class. You could run the hill climber a bunch of times from different initial conditions and then just take the top 10 or so unique results as patterns.
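Roughly, the idea could look like the sketch below (my own Scala code; the `score` function is a stand-in for whatever P(c|s) your classifier exposes, and all names are hypothetical):

```scala
import scala.util.Random

// Hill climbing over word (multi)sets: greedily add or drop one word at a
// time as long as the class score improves.
def climbPattern(vocabulary: Vector[String],
                 score: Set[String] => Double,   // stand-in for P(c | s)
                 maxWords: Int = 5,
                 rng: Random = new Random()): Set[String] = {
  var current = Set(vocabulary(rng.nextInt(vocabulary.size)))
  var improved = true
  while (improved) {
    // Neighbourhood: add one word (up to maxWords) or remove one word.
    val additions =
      if (current.size < maxWords) vocabulary.filterNot(current).map(current + _)
      else Vector.empty
    val removals =
      if (current.size > 1) current.toVector.map(current - _)
      else Vector.empty
    val best = (additions ++ removals).maxByOption(score)
    improved = best.exists(s => score(s) > score(current))
    if (improved) current = best.get
  }
  current
}

// Restart from many random seeds and keep the highest-scoring unique patterns.
def topPatterns(vocabulary: Vector[String],
                score: Set[String] => Double,
                restarts: Int = 50, keep: Int = 10): Seq[Set[String]] =
  Vector.fill(restarts)(climbPattern(vocabulary, score))
    .distinct.sortBy(score).reverse.take(keep)
```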
Is this a good approach, or are there better ones? Thanks for any suggestions.

How to find the time complexity of the algebra operation in algebraixlib

How can I calculate the time complexity, using mathematics or Big O notation, of the algebra operations used in the algebra of data?
I will use a book example to explain my question. Consider the following example given in the book.
In the above example I would like to calculate the time complexity of the transpose and compose operations.
If possible, I would also like to find out the time complexity of other data algebra operations.
Please let me know if you need more explanation.
@wesholler I edited my question based on your explanation. The following is a real-life example; suppose we want to calculate the time complexity of the operations used below.
Suppose I have the following algebra of data operations:
Could you describe how we would calculate the time complexity in the above example, preferably in Big O?
Thanks
This answer has three parts:
General Time Complexity Analysis
Generally, the time complexity/Big O can be determined by considering the origin of an operation - that is, which operations were extended from more primitive algebras to derive this one.
The following rules describe the upper bound on the time complexity for both unary and binary operations that are extended into their power set algebras.
Unary extension can be thought of similarly to a map operation and so has linear time complexity. Binary extension evaluates the cross product of the operation's arguments and so has a worst-case time complexity similar to O(n^2). However, it is important to consider that the real upper bound is the product of the cardinalities of the two arguments; this often comes up in practice when the right-hand argument to a composition or superstriction operation is a singleton.
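As a language-agnostic illustration (my own Scala sketch, not algebraixlib's actual code), extending a unary operation to a set is essentially a map, while extending a binary operation pairs up every element of both arguments:

```scala
// Illustration only: extending operations from elements to sets of elements.

// Unary extension visits each element once: O(|xs|).
def extendUnary[A, B](xs: Set[A])(unary: A => B): Set[B] =
  xs.map(unary)

// Binary extension evaluates the cross product, keeping only the results
// that are defined: O(|xs| * |ys|). This is O(n^2) only when both arguments
// have similar size; a singleton right-hand side makes it effectively linear.
def extendBinary[A, B, C](xs: Set[A], ys: Set[B])(binary: (A, B) => Option[C]): Set[C] =
  for {
    x <- xs
    y <- ys
    c <- binary(x, y)
  } yield c
```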
Time Complexity for algebraixlib Implementations
We can take a look at a few examples of how extension affects the time complexity while at the same time analyzing the complexity of the implementations in algebraixlib (the last part talks about other implementations.)
Being a reference implementation for data algebra, algebraixlib implements the extended operations very literally. For that reason, Big Theta is used below, because the formulas represent both the lower and upper bounds of the time complexity.
Here is the unary operation transpose being extended from couplets to relations and then to clans.
Likewise, here is the binary operation compose being extended from couplets to relations and then to clans.
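The original formulas are not reproduced here, but under the extension rules above they take roughly the following shape (my own notation: C, D are clans, R, S are relations, and |R| is the number of couplets in R):

```latex
% transpose (unary): linear at every level of extension
\operatorname{transpose}(R) \in \Theta(|R|), \qquad
\operatorname{transpose}(C) \in \Theta\Big(\sum_{R \in C} |R|\Big)

% compose (binary): cross product at every level of extension
\operatorname{compose}(R, S) \in \Theta(|R| \cdot |S|), \qquad
\operatorname{compose}(C, D) \in \Theta\Big(\sum_{R \in C} \sum_{S \in D} |R| \cdot |S|\Big)
```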
It is clear that the complexity of both clan operations is influenced by both the number of relations in the clan and the number of couplets in those relations.
Time Complexity for Other Implementations
It is important to note that the above section describes the time complexity that is specific to the algorithms implemented in algebraixlib.
One could imagine implementing e.g. clans.cross_union with a method similar to sort-merge-join or hash-join. In this case, the upper bound would remain the same, but the lower bound (and expected) time complexity would be reduced by one or more degrees.
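One way to picture this is a hash-join-style pairing (a generic Scala sketch of my own, not algebraixlib's actual Python code): only elements that share a join key are combined, so the expected cost drops from the full cross product to roughly the input sizes plus the number of matches, while the worst case (all keys equal) remains quadratic.

```scala
// Generic hash-join sketch: instead of testing every (left, right) pair,
// bucket the right-hand side by its join key and only combine elements that
// can actually match. Expected cost ~ O(|left| + |right| + matches) instead
// of O(|left| * |right|); worst case is still the full cross product.
def hashJoin[L, R, K, O](left: Iterable[L], right: Iterable[R])
                        (leftKey: L => K, rightKey: R => K)
                        (combine: (L, R) => O): Seq[O] = {
  // Build phase: O(|right|).
  val buckets: Map[K, Iterable[R]] = right.groupBy(rightKey)
  // Probe phase: each left element only meets candidates with a matching key.
  left.toSeq.flatMap { l =>
    buckets.getOrElse(leftKey(l), Nil).map(r => combine(l, r))
  }
}
```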