Instantiate a module in OCaml dynamically - command-line

I have several modules implementing the same interface. I want to load only one of these modules, depending on an argument given on the command line.
I was thinking of using a first-class module, but the problem is that I want to execute some functions before the module is instantiated.
For the moment I have this:
module Arch = (val RetrolixAbstractArch.get_arch () : RetrolixAbstractArch.AbstractArch)
let get_arch () =
  let arch = Options.get_arch () in
  if arch = "" then
    Error.global_error "During analysis of compiler's architecture"
      "No architecture specified"
  else if arch = "mips" then
    (module MipsArch : AbstractArch)
  else
    Error.global_error "During analysis of compiler's architecture"
      (Printf.sprintf "Architecture %s not supported or unknown" arch)
But since the command line is not parsed yet, Options.get_arch gives me the empty string.
I would like to parse the command line before this function is executed (without adding the parsing to the function). Is it possible? Should I find another way to achieve this?

It is possible, but you must use local modules. This is a minor issue that requires only a little refactoring.
let arch_of_name = function
  | "mips" -> (module MipsArch : AbstractArch)
  | "arm" -> (module Arm)
  | _ -> invalid_arg "unknown arch"
let main () =
  ...
  let arch_name = get_arch () in
  let module Arch = (val arch_of_name arch_name) in
  (* here you can use module Arch as usual *)
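To make the ordering concrete, here is a minimal self-contained sketch of the same idea. The interface contents (name, word_size), the -arch flag, and the run function are assumptions made up for illustration; they are not taken from the original code base. The point it shows is that the command line is parsed first, and the first-class module is unpacked into a local module only afterwards.

```ocaml
(* Hypothetical stand-ins for AbstractArch, MipsArch and Arm. *)
module type AbstractArch = sig
  val name : string
  val word_size : int (* in bytes *)
end

module MipsArch : AbstractArch = struct
  let name = "mips"
  let word_size = 4
end

module Arm : AbstractArch = struct
  let name = "arm"
  let word_size = 4
end

(* Map a name to a first-class module; nothing is unpacked yet. *)
let arch_of_name : string -> (module AbstractArch) = function
  | "mips" -> (module MipsArch)
  | "arm" -> (module Arm)
  | other -> invalid_arg ("unknown arch: " ^ other)

(* Parse the command line first, then unpack the module locally. *)
let run argv =
  let arch_name = ref "" in
  let spec =
    [ ("-arch", Arg.Set_string arch_name, "NAME target architecture") ]
  in
  Arg.parse_argv ~current:(ref 0) argv spec (fun _ -> ())
    "usage: prog -arch NAME";
  let module Arch = (val arch_of_name !arch_name) in
  Printf.sprintf "%s (%d-byte words)" Arch.name Arch.word_size

(* A fixed argument vector is used here so the sketch runs as-is. *)
let () = print_endline (run [| "prog"; "-arch"; "arm" |])
```

In a real program you would presumably pass Sys.argv to run instead of the fixed array.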
Another approach is to functorize your modules over the arch structure and instantiate your functors as soon as you know the architecture. You can see a full-fledged example here (see the function target_of_arch, which creates a first-class module for a particular architecture).
If your AbstractArch interface doesn't contain type definitions, then you can use other abstractions instead of modules: records of functions or objects. They may work more smoothly, and may even allow you to replace the arch instance dynamically (by making the arch instance a reference, although I wouldn't suggest this, as it is quite unclean, imo).
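As a rough sketch of the record-of-functions alternative: the field names (name, word_size, register_name) and the two instances below are hypothetical, chosen only to show the shape. An architecture becomes an ordinary value, so it can be selected at any point at run time without local modules.

```ocaml
(* Hypothetical architecture description as a plain record of values
   and functions; no type definitions are needed in the interface. *)
type arch = {
  name : string;
  word_size : int;
  register_name : int -> string;
}

let mips =
  { name = "mips"; word_size = 4;
    register_name = (fun i -> Printf.sprintf "$%d" i) }

let arm =
  { name = "arm"; word_size = 4;
    register_name = (fun i -> Printf.sprintf "r%d" i) }

(* Selection is an ordinary function returning an ordinary value. *)
let arch_of_name = function
  | "mips" -> Some mips
  | "arm" -> Some arm
  | _ -> None

let describe arch =
  Printf.sprintf "%s uses %s as register 1" arch.name (arch.register_name 1)

let () =
  match arch_of_name "arm" with
  | Some arch -> print_endline (describe arch)
  | None -> prerr_endline "unknown arch"
```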

Related

How to load multiple modules implementing the same behaviour

I do not understand how one is supposed to use multiple modules that each implement the same behaviour, since I get this error at compile time:
Function is already imported from
In my case I have two modules implementing the gen_event behaviour and I am trying to import them in a third module.
I get the error message whenever I try to compile this code:
-module(mgr).
-import(h1,[init/1]). % implements gen_event
-import(h2,[init/1]). % implements gen_event
You can't do that. Import is a simple trick to avoid writing the full module:function name. It does nothing more than tell the compiler: when you see init(P) in this module, replace it with h1:init(P).
Thus it is not possible to import several functions with the same name and arity.
For short names, I do not see any benefit in using import.
If you are calling module:function with long names and you want to shorten the lines in the code, you can use macros instead. They have no such limitation (and there is also little chance that the macro names collide):
-define(Func1(Var1,...,VarN), module1:func(Var1,...,VarN)).
-define(Func2(Var1,...,VarN), module2:func(Var1,...,VarN)).
...
?Func1(A1,...,AN);
...
?Func2(B1,...,BN);
Edit
The next example illustrates how it works. First I create the module mod1 as follows:
-module (mod1).
-export ([test/1]).
test(P) ->
    case P of
        1 -> ok;
        2 -> mod2:test()
    end.
and I test it in the shell:
1> c(mod1).
{ok,mod1}
2> mod1:test(1).
ok
3> mod1:test(2).
** exception error: undefined function mod2:test/0
4> % this call failed because mod2 was not defined.
4> % lets define it and compile.
mod2 is created as:
-module (mod2).
-export ([test/0]).
test() ->
    io:format("now it works~n").
continue in the shell:
4> c(mod2).
{ok,mod2}
5> mod1:test(1).
ok
6> mod1:test(2).
now it works
ok
7>
As you can see, it is not necessary to modify mod1, only to create and compile mod2 (note that it would be the same if mod2 already existed but the function test/0 was not exported).
If you want to verify that your code does not call undefined functions, you can use external tools. As I use rebar3 to manage my projects, I run the command rebar3 xref to perform this check. Note that calling an undefined function is only a warning, which makes sense in the context of application upgrades. This verification is not bulletproof: it is done at build time, which does not guarantee that the modules you need will be present, with the right version, on a production system. That opens a lot more interesting questions about versioning, code loading...

How do you resolve an OCaml circular build error?

I have code that produced a circular build error, and I looked up the error. This page gives a similar but smaller example of what's in my .mli file: https://ocaml.org/learn/tutorials/ocamlbuild/New_kinds_of_build_errors.html
Essentially the problem is that my file is both defining a type and defining functions that use arguments and return values of that same type. However, that's exactly what I want my program to do. My type is not private, it's declared explicitly in the .mli file:
type state = {
  current_pos : int*int;
  contents : int*int list;
}
val update_state : state -> state
It seems to me reasonable to want to build a module that defines a type and then to share that type with other files, but it seems like the circular build error will always prevent that. Is there some "more proper" way of doing this sharing?
There's nothing at all wrong with the code you posted. It compiles fine. So the problem is in your .ml file.
The page you point to shows code that is incorrect. The only point being made is that you'll get a different error if you use ocamlbuild than you would if you just compile the file directly.
The key point is that you should not use the name of a module inside the definition of the module.
Instead of this (in a.ml):
type t = int
let x : A.t = 14
You should have this:
type t = int
let x: t = 14
If your code is really like this example, you just need to remove the module names inside the .ml file.
As you say, what you want to do is by far the most common use of a module.
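Here is a small single-file sketch of that pattern. The nested module State stands in for a pair of files state.ml and state.mli, and the body of update_state is made up for illustration. Inside the module the type is referred to simply as state; only code outside the module uses the qualified name.

```ocaml
(* Stands in for state.mli (the signature) and state.ml (the struct). *)
module State : sig
  type state = {
    current_pos : int * int;
    contents : int * int list;
  }
  val update_state : state -> state
end = struct
  type state = {
    current_pos : int * int;
    contents : int * int list;
  }

  (* Inside its own module the type is written [state],
     never [State.state]. The body is a hypothetical example. *)
  let update_state (s : state) : state =
    let x, y = s.current_pos in
    { s with current_pos = (x + 1, y) }
end

(* Code outside the module uses the qualified names. *)
let initial : State.state =
  { State.current_pos = (0, 0); contents = (0, []) }

let next = State.update_state initial

let () = Printf.printf "%d\n" (fst next.State.current_pos)
```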

Can I instantiate classes containing values with side effects at the top level?

This question is related to and overlaps with the question in Should one wrap type providers containing values that have side effects inside a class?, kindly answered by Aaron M. Eshbach.
I am trying to implement in my code the excellent advice in the F# coding conventions page
https://learn.microsoft.com/en-us/dotnet/fsharp/style-guide/conventions.
The section Use classes to contain values that have side effects is particularly interesting. It says
There are many times when initializing a value can have side effects, such as instantiating a context to a database or other remote resource. It is tempting to initialize such things in a module and use it in subsequent functions.
and provides an example. Then it points out three problems with this practice (I omit those for lack of space, but they can be seen at the linked article) and recommends using a simple class to hold dependencies.
Following that advice, I implemented a simple class to contain a value that has side effects:
type Roots() =
    let msg = "Roots: Computer must be one of THREADRIPPER, LAPTOP or HPW8"
    member this.dropboxRoot =
        let computerName = Environment.MachineName
        match computerName with
        | "THREADRIPPER" -> @"C:\"
        | "HP-LAPTOP" -> @"C:\"
        | "HPW8" -> @"H:\"
        | _ -> failwith msg
Then I can use it inside a function
let foo (name: string) =
    let roots = Roots()
    let path = Path.Combine(roots.dropboxRoot, @"Dropbox\Temp\" + name + ".csv")
    printfn "%s" path
foo "SomeName"
So far so good. In the example above the class is quite "light" and I can instantiate it inside any function.
However, the class containing the values with side effects could as well be computationally intensive. In that case I would like to instantiate it only once and call it from different functions:
let roots = Roots()
let csvPrinter (name: string) =
    let path = Path.Combine(roots.dropboxRoot, @"Dropbox\Folder1\" + name + ".csv")
    printfn "%s" path
let xlsxPrinter (name: string) =
    let path = Path.Combine(roots.dropboxRoot, @"Dropbox\Folder2\" + name + ".xlsx")
    printfn "%s" path
csvPrinter "SomeName"
xlsxPrinter "AnotherName"
So my question is: if I instantiate the class Roots at the top level in a module am I defeating the purpose of creating a class, which was to avoid the problems described in the F# coding conventions page? If that is the case, how do I deal with computationally intensive definitions?
Short answer is - yes, that defeats the purpose of having this sort of wrapper in the first place.
The guideline, however, misses the forest for the trees a bit. The real problem there is the more fundamental question of managing stateful dependencies and external data in an environment that advocates function purity and referential transparency, especially when you're looking at a large codebase that needs to grow and change over time (if we're looking at one-off throwaway scripts, just do what gets the job done). The issue lies more in how the roots field is populated and consumed (as a hardcoded, static dependency) than in whether the value there is wrapped in a class or not.
The approach I would recommend here is to write your business logic as a module (or multiple modules) of pure functions, and pass dependencies explicitly as arguments. This way, you defer making decisions about the dependencies to the caller. This may go all the way up, to the entry point of your program (the main function in a console application, the Startup class in an API and so on). In the dreaded OOP parlance, what you're looking at is the equivalent of a composition root - the one place in your program where you assemble your dependencies.
This may involve having a class wrapper around an otherwise purely functional module, as the convention you link to suggests, but that's not a foregone conclusion. You may very well have one (side-effecting) function to produce the value for you, and you may just pass this one single value down.
let getDropboxRoot () : string option =
    let computerName = Environment.MachineName
    match computerName with
    | "THREADRIPPER" -> Some @"C:\"
    | "HP-LAPTOP" -> Some @"C:\"
    | "HPW8" -> Some @"H:\"
    | _ -> None
let csvPrinter (dropboxRoot: string) (name: string) =
    let path = Path.Combine(dropboxRoot, @"Dropbox\Folder1\" + name + ".csv")
    printfn "%s" path
This way you have full control over your effectful operation: you can call the function whenever you want, and you can call it again for a new value if the environment changes. The rest of the code neither knows nor cares that the value you feed in comes from an effectful operation, which makes reasoning about what it does, as well as testing, simple.
Having a class wrapper around it adds nothing to those properties by itself. It might provide a nicer API for a bit more boilerplate, but the real problem that is being discussed there is elsewhere.

OCaml interface vs. signature?

I'm a bit confused about interfaces vs. signatures in OCaml.
From what I've read, interfaces (the .mli files) are what govern what values can be used/called by the other programs. Signature files look like they're exactly the same, except that they name it, so that you can create different implementations of the interface.
For example, if I want to create a module that is similar to a set in Java:
I'd have something like this:
the set.mli file:
type 'a set
val is_empty : 'a set -> bool
val ....
etc.
The signature file (setType.ml)
module type Set = sig
type 'a set
val is_empty : 'a set -> bool
val ...
etc.
end
and then an implementation would be another .ml file, such as SpecialSet.ml, which includes a struct that defines all the values and what they do.
module SpecialSet : Set =
struct
...
I'm a bit confused as to what exactly the "signature" does, and what purpose it serves. Isn't it acting like a sort of interface? Why is both the .mli and .ml needed? The only difference in lines I see is that it names the module.
Am I misunderstanding this, or is there something else going on here?
OCaml's module system is tied into separate compilation (the pairs of .ml and .mli files). So each .ml file implicitly defines a module, each .mli file defines a signature, and if there is a corresponding .ml file that signature is applied to that module.
It is useful to have an explicit syntax to manipulate modules and interfaces to one's liking inside a .ml or .mli file. This allows signature constraints, as in S with type t = M.t.
Not least is the possibility it gives to define functors, modules parameterized by one or several modules: module F (X : S) = struct ... end. All these would be impossible if the only way to define a module or signature was as a file.
I am not sure how that answers your question, but I think the answer to your question is probably "yes, it is as simple as you think, and the system of having .mli files and explicit signatures inside files is redundant on your example. Manipulating modules and signatures inside a file allows more complicated tricks in addition to these simple things".
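To illustrate what a named signature buys you beyond a .mli file, here is a minimal sketch. The operations empty, add and mem, the list-based implementation ListSet, and the functor Tools are all assumptions added for the example; only the Set signature name and is_empty come from the question.

```ocaml
(* A named signature: it can constrain several implementations and
   serve as the parameter type of a functor. *)
module type Set = sig
  type 'a set
  val empty : 'a set
  val is_empty : 'a set -> bool
  val add : 'a -> 'a set -> 'a set
  val mem : 'a -> 'a set -> bool
end

(* One hypothetical implementation, sealed by the signature, so the
   list representation is hidden from users. *)
module ListSet : Set = struct
  type 'a set = 'a list
  let empty = []
  let is_empty = function [] -> true | _ -> false
  let add x s = if List.mem x s then s else x :: s
  let mem x s = List.mem x s
end

(* A functor: code written once against any module matching Set. *)
module Tools (S : Set) = struct
  let of_list xs = List.fold_left (fun acc x -> S.add x acc) S.empty xs
end

module ListTools = Tools (ListSet)

let () =
  let s = ListTools.of_list [1; 2; 3] in
  Printf.printf "%b %b\n" (ListSet.mem 2 s) (ListSet.is_empty s)
```

Neither the functor nor a second implementation of Set could be written if the signature existed only as a set.mli file.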
This question is old but maybe this is useful to someone:
A file named a.ml appears as a module A in the program...
The interface of the module a.ml can be written in file named a.mli
This is from the OCaml MOOC from Université Paris Diderot.

Redundancy in OCaml type declaration (ml/mli)

I'm trying to understand a specific thing about ocaml modules and their compilation:
am I forced to redeclare types already declared in a .mli inside the specific .ml implementations?
Just to give an example:
(* foo.mli *)
type foobar = Bool of bool | Float of float | Int of int
(* foo.ml *)
type baz = foobar option
This, according to my normal way of thinking about interfaces/implementations, should be ok but it says
Error: Unbound type constructor foobar
while trying to compile with
ocamlc -c foo.mli
ocamlc -c foo.ml
Of course the error disappears if I declare foobar inside foo.ml too, but that seems cumbersome, since I have to keep the two declarations in sync on every change.
Is there a way to avoid this redundancy, or am I forced to redeclare types every time?
Thanks in advance
OCaml tries to force you to separate the interface (.mli) from the implementation (.ml). Most of the time, this is a good thing; for values, you publish the type in the interface, and keep the code in the implementation. You could say that OCaml is enforcing a certain amount of abstraction (interfaces must be published; no code in interfaces).
For types, very often, the implementation is the same as the interface: both state that the type has a particular representation (and perhaps that the type declaration is generative). Here, there can be no abstraction, because the implementer doesn't have any information about the type that he doesn't want to publish. (The exception is basically when you declare an abstract type.)
One way to look at it is that the interface already contains enough information to write the implementation. Given the interface type foobar = Bool of bool | Float of float | Int of int, there is only one possible implementation. So don't write an implementation!
A common idiom is to have a module that is dedicated to type declarations, and make it have only a .mli. Since types don't depend on values, this module typically comes in very early in the dependency chain. Most compilation tools cope well with this; for example ocamldep will do the right thing. (This is one advantage over having only a .ml.)
The limitation of this approach is when you also need a few module definitions here and there. (A typical example is defining a type foo, then an OrderedFoo : Map.OrderedType module with type t = foo, then a further type declaration involving 'a Map.Make(OrderedFoo).t.) These can't be put in interface files. Sometimes it's acceptable to break down your definitions into several chunks, first a bunch of types (types1.mli), then a module (mod1.mli and mod1.ml), then more types (types2.mli). Other times (for example if the definitions are recursive) you have to live with either a .ml without a .mli or duplication.
Yes, you are forced to redeclare types. The only ways around it that I know of are:
- Don't use a .mli file; just expose everything with no interface. Terrible idea.
- Use a literate-programming tool or other preprocessor to avoid duplicating the interface declarations in the One True Source. For large projects, we do this in my group.
For small projects, we just duplicate type declarations. And grumble about it.
You can let ocamlc generate the mli file for you from the ml file:
ocamlc -i some.ml > some.mli
In general, yes, you are required to duplicate the types.
You can work around this, however, with Camlp4 and the pa_macro syntax extension (findlib package: camlp4.macro). It defines, among other things, an INCLUDE construct. You can use it to factor the common type definitions out into a separate file and include that file in both the .ml and .mli files. I haven't seen this done in a deployed OCaml project, however, so I don't know that it would qualify as recommended practice, but it is possible.
The literate programming solution, however, is cleaner IMO.
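No, in the mli file, just say "type foobar". This will work, but note that it makes the type abstract: the full definition then lives only in the .ml file, and other modules can no longer see or use the constructors directly.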
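A single-file sketch of that last suggestion follows. The nested module Foo stands in for foo.mli and foo.ml, and the functions of_bool, of_float, of_int and to_string are hypothetical additions. It shows that the type is declared in full only once, in the implementation, at the cost that outside code must go through exported functions.

```ocaml
(* The signature plays the role of foo.mli, the struct that of foo.ml. *)
module Foo : sig
  type foobar (* abstract: constructors are not exported *)
  val of_bool : bool -> foobar
  val of_float : float -> foobar
  val of_int : int -> foobar
  val to_string : foobar -> string
end = struct
  (* The only full declaration of the type. *)
  type foobar = Bool of bool | Float of float | Int of int

  let of_bool b = Bool b
  let of_float f = Float f
  let of_int n = Int n

  let to_string = function
    | Bool b -> string_of_bool b
    | Float f -> string_of_float f
    | Int n -> string_of_int n
end

(* Outside the module, [Foo.Int 42] would not type-check;
   values are built through the exported functions. *)
let () = print_endline (Foo.to_string (Foo.of_int 42))
```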