Can all values of types with unit be created with unit?

In Functional Programming in Scala, when introducing the concept of the algebra of an API, they propose the following law
map(unit(x))(f) == unit(f(x))
for some value x and function f. So far unit has been a way to create a unit of parallelism, Par[A], but I guess it doesn't have to be here; I expect it's the same unit as in monads. map is what you think it is.
They substitute the identity function id for f
map(unit(x))(id) == unit(id(x))
map(unit(x))(id) == unit(x)
and finally substitute y for unit(x)
map(y)(id) == y
I don't get how they make this last step, unless all possible values of that type can be constructed by unit. That doesn't seem right, since Par[A] is a type alias for a function (along with some object-level functions), and I'd have thought it trivial to hand-craft some new Par[A] values without unit. Given they're talking quite abstractly, my question is how this works generally, not just for their Par[A] type.

I have the same book here. If you take a look at the notes, the author says:
This is the same sort of substitution and simplification one might do when solving an algebraic equation.
Which, in terms of functional programming, means that mapping any value y over id should be equal to y. If we take the concept of referential transparency, which states that:
An expression is called referentially transparent if it can be replaced with its corresponding value without changing the program's behavior
y could represent any value, be it unit(x), x, or any other value.
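To make the substitution concrete, here's a minimal, self-contained Scala sketch (my own illustration; the thunk encoding of Par is an assumption, as the book's Par is more elaborate). Once the law is stated as map(y)(identity) == y for every y, it covers hand-crafted values that were never built with unit:

object MapUnitLaw {
  // Toy stand-in for the book's Par: just a deferred computation.
  type Par[A] = () => A

  def unit[A](a: A): Par[A] = () => a
  def map[A, B](pa: Par[A])(f: A => B): Par[B] = () => f(pa())

  def main(args: Array[String]): Unit = {
    val x = 41
    val f = (n: Int) => n + 1

    // The original law: map(unit(x))(f) == unit(f(x)).
    // Functions can't usefully be compared with ==, so we compare results.
    assert(map(unit(x))(f)() == unit(f(x))())

    // The generalized law map(y)(identity) == y, checked on a value
    // that was hand-crafted without unit.
    val y: Par[Int] = () => 20 + 22
    assert(map(y)(identity)() == y())
  }
}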

Related

Are PyTorch activation functions best stored as fields?

An example of a simple neural network in PyTorch can be found at https://visualstudiomagazine.com/articles/2020/10/14/pytorch-define-network.aspx
class Net(T.nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.hid1 = T.nn.Linear(4, 8)  # 4-(8-8)-1
        self.hid2 = T.nn.Linear(8, 8)
        self.oupt = T.nn.Linear(8, 1)
        T.nn.init.xavier_uniform_(self.hid1.weight)
        T.nn.init.zeros_(self.hid1.bias)
        T.nn.init.xavier_uniform_(self.hid2.weight)
        T.nn.init.zeros_(self.hid2.bias)
        T.nn.init.xavier_uniform_(self.oupt.weight)
        T.nn.init.zeros_(self.oupt.bias)

    def forward(self, x):
        z = T.tanh(self.hid1(x))
        z = T.tanh(self.hid2(z))
        z = T.sigmoid(self.oupt(z))
        return z
A distinctive feature of the above is that the layers are stored as fields within the Net object (as they need to be, in the sense that they contain the weights, which need to be remembered across training epochs), but the activation functors such as tanh are re-created on every call to forward. The author says:
The most common structure for a binary classification network is to define the network layers and their associated weights and biases in the __init__() method, and the input-output computations in the forward() method.
Fair enough. On the other hand, perhaps it would be marginally faster to store the functors rather than re-create them on every call to forward. On the third hand, it's unlikely to make any measurable difference, which means it might end up being a matter of code style.
Is the above, indeed the most common way to do it? Does either way have any technical advantage, or is it just a matter of style?
On "storing" functors
The snippet is not "re-creating" anything -- calling torch.tanh(x) is literally just calling the function tanh exported by the torch package with argument x.
Other ways of doing it
I think the snippet is a fair example for small neural blocks that are use-and-forget or are just not meant to be parameterizable.
Depending on your intentions, there are of course alternatives, but you'd have to weigh for yourself whether the added complexity offers any value.
activation functions as strings
allow a selection of an activation function from a fixed set
class Model(torch.nn.Module):
    def __init__(..., activation_function: Literal['tanh'] | Literal['relu']):
        ...
        if activation_function == 'tanh':
            self.activation_function = torch.tanh
        elif activation_function == 'relu':
            self.activation_function = torch.relu
        else:
            raise ValueError(f'activation function {activation_function} not allowed, use tanh or relu.')

    def forward(...) -> Tensor:
        output = ...
        return self.activation_function(output)
activation functions as callables
use arbitrary modules or functions as activations
class Model(torch.nn.Module):
    def __init__(..., activation_function: torch.nn.Module | Callable[[Tensor], Tensor]):
        self.activation_function = activation_function

    def forward(...) -> Tensor:
        output = ...
        return self.activation_function(output)
which would for instance work like
def cube(x: Tensor) -> Tensor: return x**3
cubic_model = Model(..., activation_function=cube)
The key difference between the above examples and your snippet is the fact that the former are transparent and adjustable with respect to the activation used; you can inspect the activation function (i.e. model.activation_function), and change it (before or after initialization), whereas in the case of the original snippet it is invisible and baked into the model's functionality (to replicate the model with a different function, you'd need to define it from scratch).
Overall, I think the best way to go is to create small, locally tunable blocks that are as parametric as you need them to be, and wrap them into bigger blocks that make generalizations over the contained parameters. For instance, if your big model consists of 5 linear layers, you could make a single, activation-parametric wrapper for 1 layer (including dropouts, layer norms, whatever), and then another wrapper for a flow of N layers, which asks once which activation function to initialize its children with. In other words, generalize and parameterize when you anticipate this will save you extra effort and copy-pasted code in the future, but don't overdo it or you'll end up far away from your original specifications and needs.
PS: I don't know whether calling activation functions "functors" is justifiable.

How to define custom functions in Maple?

I'm new to Maple and I'm looking for a simple way to automate some tasks. In particular, I'm looking for a way to define custom "actions" that perform some steps automatically.
As an example, I would like to define a quick way to compute the determinant of the Hessian of a polynomial. Currently the way I do this is opening Maple, creating a new worksheet, then performing the following commands:
p := (x, y) -> x^2*y + 3*x^3 + y^3
with(VectorCalculus):
h := Hessian(p(x, y), [x, y])
Determinant(h)
What I would like to do is to compute the hessian determinant directly with something like
HessDet(p)
where HessDet would be a custom command that performs the operations above. How does one achieve something like this in Maple?
First things first: The value assigned to your p is a procedure which can return a polynomial expression, but not itself a polynomial. It's important not to muddle expressions and procedures. Doing so is a common cause of problems for new users.
Being able to throw around p(x,y) may be visually pleasing to your eye, but it serves little programmatic purpose here. The fact that the formal parameters of procedure p happen to be called x and y, along with the fact that you called procedure p with arguments x and y, is actually just another common source of confusion. Don't create procedures merely to call them in this way.
Also, your call p(x,y) makes it look like magic that your code snippet "knows" how many arguments are required by procedure p. So it's already a muddle to have your candidate HessDet accept p as a procedure.
So instead let's keep it straightforward, by writing HessDet to accept a polynomial rather than a procedure. We can programmatically ascertain the names in which this expression is of type polynom.
restart;
HessDet := proc(p::algebraic)
  local H, vars;
  vars := indets(p,
                 And(name, Non(constant),
                     satisfies(u -> type(p, polynom(anything, u)))));
  H := VectorCalculus:-Hessian(p, [vars[]]);
  LinearAlgebra:-Determinant(H);
end proc:
Now some examples of using it,
P := x^2*y + 3*x^3 + y^3;
HessDet(P);
p := (x, y) -> x^2*y + 3*x^3 + y^3;
HessDet(p(x,y));
HessDet(x^3-x^2+4*x);
HessDet(s^2*t + 3*s^3 + t^3);
HessDet(s[r]^2*t[r] + 3*s[r]^3 + t[r]^3);
You might also wonder how you could re-use this custom procedure across sessions, without having to type it in each time. Two reasonable ways are:
1. Put the (above) plaintext definition of HessDet inside a personal initialization file.
2. Create a (.mla) Maple Library Archive file, then Save your HessDet to it, and augment the Library search path in your initialization file.
It might look like 2) is more effort, but only the Save step is needed for repeats, and you can store many custom procedures in the same archive. Your choice...
[edit] The OP has asked for clarification of the first part of the above procedure HessDet, which I suspect means the call to indets.
If P is assigned an expression then the call indets(P, name) will return a set of all the names present in that expression. Basically, it returns the set of all indeterminate subexpressions of the expression which are of type name in Maple's technical sense.
For example,
P := x*y + sin(a*Pi)*x;
x y + sin(a Pi) x
indets( P,
name );
{Pi, a, x, y}
Perhaps the name of the constant Pi is not wanted here. I.e.,
indets( P,
And( name,
Non(constant) ) );
{a, x, y}
Perhaps we want only the non-constant names in which the expression is a polynomial? I.e.,
indets( P,
        And( name,
             Non(constant),
             satisfies(u->type(P,polynom(anything,u))) ) );

{x, y}
That last result is an advanced way of using the following tests:
type(P, polynom(anything, x));
true
type(P, polynom(anything, y));
true
type(P, polynom(anything, a));
false
A central issue here is that the OP made no mention of what kind of polynomials are to be handled by the custom procedure. So I guessed, with some defensive coding, in the hope of fewer surprises later on. The original Question states that the input could be a "polynomial", but we weren't told what kind of coefficients there might be.
Perhaps the coefficients will always be real and exact or numeric. Perhaps the custom procedure should throw an error when not supplied such. These details weren't mentioned in the Question.

Distinguishing cryptographic properties: hiding and collision resistance

I saw the following definitions in another question, which clarifies things somewhat:
Collision-resistance:
Given: x and h(x)
Hard to find: y that is distinct from x and such that h(y)=h(x).
Hiding:
Given: h(r|x), where r|x is the concatenation of r and x
Secret: x and a highly-unlikely-and-randomly-chosen r
Hard to find: y such that h(y) = h(r|x).
This is different from collision-resistance in that it doesn't matter whether or not y = r|x.
My question:
Does this mean that any hash function h(x) is non-hiding if there is no secret r, that is, if the hash is h(x) rather than h(r|x)?
Example:
Say I make a simple hash function h(x) = g^x mod n, where g is a generator of the group. The hash should be collision resistant with P(x_1 != x_2, h(x_1) = h(x_2)) = 1/2^(n/2), but I would think it is hiding as well?
Hash functions can kinda offer collision resistance.
Commitments have to be hiding.
Contrary to popular opinion, these primitives are not the same!
Very strictly speaking, the thing that you think of as a hash function cannot offer collision resistance: there always ARE collisions, because the input space is infinite in theory, yet the function always produces a fixed-length output. The terminology should actually be "H is randomly drawn from a family of collision-resistant functions". In practice, however, we will just call that function collision-resistant and ignore that it technically isn't.
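To make the pigeonhole point tangible, here's a small Scala sketch (my own illustration, with SHA-256 truncated to 16 bits purely for demonstration): with only 65536 possible outputs, a collision falls out after a few hundred attempts.

import java.security.MessageDigest
import scala.collection.mutable

object TinyCollision {
  // SHA-256 truncated to its first 16 bits -- a deliberately tiny output space.
  def h16(s: String): Int = {
    val d = MessageDigest.getInstance("SHA-256").digest(s.getBytes("UTF-8"))
    ((d(0) & 0xff) << 8) | (d(1) & 0xff)
  }

  def main(args: Array[String]): Unit = {
    val seen = mutable.Map.empty[Int, String]
    // Hash "0", "1", "2", ... until two distinct inputs share an output.
    // By the birthday bound this takes roughly 2^8 = 256 attempts.
    val (a, b) = Iterator.from(0).map(_.toString).flatMap { s =>
      val hv = h16(s)
      val hit = seen.get(hv).filter(_ != s).map(prev => (prev, s))
      seen.getOrElseUpdate(hv, s)
      hit
    }.next()
    println(s"collision: h16($a) == h16($b) == ${h16(a)}")
  }
}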
A commitment has to offer two properties: hiding and binding. Binding means that you can only open it to one value (this is where the relation to collision resistance comes in). Hiding means that it is impossible to learn anything about the element contained in it. This is why a secure commitment MUST use randomness (or nonces, but after all is said and done, those boil down to the same thing). Imagine any hash function, no matter how perfect you want it to be: you can use a random oracle if you want. If I give you a hash H(m) of a value m, you can compute H(0), compare the result, and learn whether m = 0, meaning it is not hiding.
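And here is that guessing attack as a runnable Scala sketch (SHA-256 and the ten-digit message space are my assumptions, not part of the original answer): with no randomness r, anyone holding H(m) can simply hash every candidate m and compare.

import java.security.MessageDigest

object NotHiding {
  def h(s: String): Seq[Byte] =
    MessageDigest.getInstance("SHA-256").digest(s.getBytes("UTF-8")).toSeq

  def main(args: Array[String]): Unit = {
    val m = "7"              // the secret: a single digit
    val published = h(m)     // the unsalted "commitment" H(m)

    // The attacker enumerates the tiny message space.
    val recovered = (0 to 9).map(_.toString).find(c => h(c) == published)
    println(s"recovered secret: $recovered") // prints Some(7)
  }
}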
This is also why g^x is not a hiding commitment scheme. Whether it is binding depends on what you allow as the message space: if you allow all integers, then the simple attack y = x + phi(n) produces H(y) = H(x). If you define the message space as ℤ_p, where p is the group order, then it is perfectly binding, as it is an information-theoretically collision-resistant one-way function. (Since message space and target space are of the same size, this time a single function actually CAN be truly collision-resistant!)

How does the Scala compiler locate the positions for variance annotations

I am trying to understand how the compiler checks whether the position of a type parameter is covariant or contravariant.
As far as I know, if the type parameter is annotated with +, the covariant annotation, then no method can have an input parameter typed with that class/trait's type parameter.
For example, bar cannot have a parameter of type T.
class Foo[+T] {
  def bar(param: T): Unit =
    println("Hello foo bar")
}
This is because the position of the parameter of bar() is considered negative, which means any type parameter in that position is in a contravariant position.
I am curious how the Scala compiler decides whether every location in the class/trait is positive, negative, or neutral. It seems that there are rules, like flipping the position under certain conditions, but I couldn't understand them clearly.
Also, if possible, I would like to know how these rules are defined. For example, it seems that parameters of methods defined in a class with a covariant type parameter, like the bar() method in class Foo, are in contravariant position. Why?
I am curious how the Scala compiler decides whether every location in the class/trait is positive, negative, or neutral. It seems that there are rules, like flipping the position under certain conditions, but I couldn't understand them clearly.
The Scala compiler has a phase called parser (like most compilers), which goes over the source text and parses out tokens. One of the things it parses out is the variance annotation. If we dive into the details, there's a method called Parsers.typeParamClauseOpt which is responsible for parsing the type parameter clause. The part relevant to your question is this:
def typeParam(ms: Modifiers): TypeDef = {
  var mods = ms | Flags.PARAM
  val start = in.offset
  if (owner.isTypeName && isIdent) {
    if (in.name == raw.PLUS) {
      in.nextToken()
      mods |= Flags.COVARIANT
    } else if (in.name == raw.MINUS) {
      in.nextToken()
      mods |= Flags.CONTRAVARIANT
    }
  }
The parser looks for the + and - signs in the type parameter signature and creates a TypeDef (an AST node) which describes the type parameter and records whether it is covariant, contravariant or invariant.
Also, if possible, I would like to know how these rules are defined.
Variance rules are universal, and they stem from a branch of mathematics called category theory. More specifically, they're derived from covariant and contravariant functors and the composition of the two. If you want to learn more about these rules, that would be the path I'd take.
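For a flavor of what that means in Scala terms, here's a minimal sketch (the type class names are mine): a covariant functor lifts a function A => B "forwards", while a contravariant functor lifts it "backwards", which is exactly the producer/consumer intuition behind + and -.

object VarianceFunctors {
  // Covariant functor: lifts f: A => B to F[A] => F[B].
  trait Functor[F[_]] {
    def map[A, B](fa: F[A])(f: A => B): F[B]
  }

  // Contravariant functor: lifts f: B => A to F[A] => F[B] (direction reversed).
  trait Contravariant[F[_]] {
    def contramap[A, B](fa: F[A])(f: B => A): F[B]
  }

  // A producer of A (think: a method result) is covariant in A.
  type Producer[A] = () => A
  val producerFunctor: Functor[Producer] = new Functor[Producer] {
    def map[A, B](fa: Producer[A])(f: A => B): Producer[B] = () => f(fa())
  }

  // A consumer of A (think: a method parameter) is contravariant in A.
  type Consumer[A] = A => Unit
  val consumerContravariant: Contravariant[Consumer] = new Contravariant[Consumer] {
    def contramap[A, B](fa: Consumer[A])(f: B => A): Consumer[B] = b => fa(f(b))
  }
}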
Additionally, there is a class called Variance in the Scala compiler which looks like a helper class in regards to variance rules, if you want to take a deeper look.
To verify the correctness of variance annotation, the compiler classifies all positions in a class or trait body as positive (+), negative (-) or neutral.
A "position" is any location in the class or trait (but from now on I'll just write "class") body where a type parameter may be used.
The compiler checks each use of each of the class's type parameters:
+T type parameters (covariant/flexible) may only be used in positive positions.
-T type parameters (contravariant) may only be used in negative positions.
T type parameters (invariant/rigid) may be used in any position, and are therefore the only kind of type parameter that can be used in neutral positions.
To classify the positions, the compiler starts from the declaration of the type parameter's class and moves inward through deeper nesting levels. Positions at the top level of the declaring class are classified as positive. By default, positions at deeper nesting levels are classified the same as their enclosing level, but there are three exceptions where the classification changes:
1. Method value parameter positions are classified to the flipped classification relative to positions outside the method, where the flip of a positive classification is negative, the flip of a negative classification is positive, and the flip of a neutral classification remains neutral.
2. Besides method value parameter positions, the current classification is also flipped at the type parameters of methods.
3. A classification is sometimes flipped at the type argument position of a type, such as the Arg in C[Arg], depending on the variance of the corresponding type parameter: if C's type parameter is +T, the classification stays the same; if it is -T, the current classification is flipped; if it is T, the current classification is changed to neutral.
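As a compileable illustration of rules 1 and 2 (my own sketch, not from the book): a covariant T is rejected as a plain method parameter, but may legally reappear in the lower bound of a method type parameter, because positions flip at the method's type parameters (rule 2) and flip once more at a lower bound (a detail the list above doesn't spell out).

class Box[+T](val value: T) {
  // Rejected by the compiler (rule 1): T would sit in a method value
  // parameter position, which is classified negative.
  // def set(newValue: T): Box[T] = new Box(newValue)

  // Accepted: U is the method's own type parameter, and T occurs only
  // in U's lower bound, where the classification has flipped back to positive.
  def set[U >: T](newValue: U): Box[U] = new Box(newValue)
}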
To understand better, here's a contrived example, where all positions are annotated with their classifications, given by the compiler:
abstract class Cat[-T, +U] {
  def meow[W-](volume: T-, listener: Cat[U+, T-]-): Cat[Cat[U+, T-]-, U+]+
}
The type parameter W is in a negative (contravariant) position because of rule number 2 stated above. (So it is flipped relative to positions outside the method - which are stated to be positive by default, so the compiler flips it and that is why it becomes negative.).
The value parameter volume is in a negative position (contravariant) because of rule number 1. (So it is flipped in the same manner as W)
The value parameter listener, is in a negative position for the same reason as volume. Looking at the positions of its type arguments U and T inside the type Cat, they are flipped because Cat is in a negative position, and according to rule number 3 it must be flipped.
The result type of the method is positive because it's considered outside the method. Looking inside the result type of meow: the position of the first type argument, Cat[U, T], is negative because Cat's first type parameter, T, is annotated with a -; the second type argument, U, is positive since Cat's second type parameter, U, is annotated with a +. These two slots simply take the declared variances (no extra flip) because the enclosing Cat is in a positive position. But the positions of U and T inside the first argument of Cat are flipped, because there the flipping rule does apply: that inner Cat is in a negative position.
As you can see, it's quite hard to keep track of variance positions. That's why it's a welcome relief that the Scala compiler does this job for you.
Once the classification is computed, the compiler checks that each type parameter is only used in positions that are classified appropriately. In this case, T is only used in negative positions, while U is only used in positive positions. So class Cat is type correct.
The rules and example are taken directly from the Programming in Scala book by M. Odersky, B. Venners and L. Spoon.
This also answers your second question: from these rules we can infer that method value parameters will always be in contravariant positions, while method result types will always be in covariant positions. This is why you can't have a covariant type parameter in a method value parameter position in your example.
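To see why the parameter restriction in your Foo example is necessary, here's a short sketch of the unsoundness that would arise if a covariant T were allowed as a method value parameter (the Animal/Dog names are mine):

class Animal
class Dog extends Animal { def bark(): Unit = println("woof") }

class Pen[+T] {
  // The compiler rejects this with:
  //   covariant type T occurs in contravariant position
  // def keep(animal: T): Unit = ()
}

// If keep were allowed, this would compile and smuggle a plain Animal
// into a Pen[Dog], breaking the guarantee that a Pen[Dog] holds Dogs:
//   val dogPen: Pen[Dog] = new Pen[Dog]
//   val animalPen: Pen[Animal] = dogPen // legal, Pen is covariant
//   animalPen.keep(new Animal)          // an Animal enters a Pen[Dog]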

Variance annotation: keeping track of "positive" and "negative" positions by the Scala compiler

In Programming in Scala page 436, the author gives an example of the compiler checking that each type parameter is only used in positions that are classified appropriately.
abstract class Cat[-T, +U] {
  def meow[W^-](volume: T^-, listener: Cat[U^+, T^-]^-): Cat[Cat[U^+, T^-]^-, U^+]^+
}
How does the example work out? Why do W and the first T get a negative sign? How does the algorithm actually work?
See section 19.4 in the 1st edition: http://www.artima.com/pins1ed/type-parameterization.html
"Method value parameter positions are classified to the flipped classification relative to positions outside the method."
"Besides method value parameter positions, the current classification is also flipped at the type parameters of methods."
Flipped in this case means "flipped from positive", hence, negative.
For bonus points, generate a LOLcats that illustrates a physical interpretation of this model.
Additional Q&A:
Okay, let's look at the value parameter listener.
It has an annotation of Cat[U^+, T^-]^-.
Why does U have +? Why does T have -? Why does the whole thing have a -?
The method param is a contravariant position, hence the outermost (right-most) minus.
The type params to Cat are [-T, +U], so in this flipped position, [+, -]. (The actual params being applied, [U, T], aren't relevant.) It checks because the actual params are co- and contra-variant, respectively.
More questions:
Could you kindly describe why the return value type has the annotation it does, for the sake of completeness?
Also could you be so kind as to give an example of the following rule?
A classification is sometimes flipped at the type argument position of a type...
This second additional question is the same as your first additional question. The two inner Cat[+,-] illustrate flipping, and the result type Cat[-,+] illustrates not flipping.
This thread provides further motivation for variance of params (things you pass in) and results (things you get out):
https://groups.google.com/forum/#!topic/scala-user/ViwLKfvo3ec
I found the Java discussion and examples (PECS or Naftalin and Wadler) useful background for what Scala provides.