Custom fields that somehow get all the way to buildHook? - plugins

I would like to pass some custom, per-executable/library configurations (ideally a whole bag of key-value pairs, but at the very least a single String) from my .cabal file all the way to Setup.hs's buildHook.
For reference, buildHook's parameters are:
buildHook
  :: PackageDescription
  -> LocalBuildInfo
  -> UserHooks
  -> BuildFlags -> IO ()
So what I am hoping for is something in the PackageDescription's library / executables field that gives me access to custom fields, without disrupting all other Cabal phases, that I could put in the .cabal file. Here's a made-up example that would basically be as good as it gets:
...
executable my-exe
  main-is: my-main.hs
  ...
  plugin-args:
    myplugin:
      foo: bar
      baz: quux
so I could retrieve all myplugin key/value pairs to get "foo" |-> "bar", "baz" |-> "quux" in some kind of associative data structure like HashMap.
Note that I am already doing intense violence in my Setup.hs, so any kind of hacky suggestions are welcome. If need be, I can override ALL Setup.hs hooks to ignore some settings in everything-but-buildHook, if that is needed for some solution.

Although I haven't found it in the user documentation, there is this nugget in the BuildInfo type:
customFieldsBI :: [(String, String)]
Custom fields starting with x-, stored in a simple assoc-list.
So it turns out you can write
...
executable my-exe
  main-is: my-main.hs
  ...
  plugin-args:
  x-myplugin: foo
and then access it with lookup "x-myplugin" . view customFieldsBI :: (HasBuildInfo bi) => bi -> Maybe String.
In particular, Executable and Library have HasBuildInfo instances, so you can just traverse the PackageDescription in buildHook and process their String values there.


Unique symbol value on type level

Is it possible to have some kind of unique symbol value on the type level that could be used to distinguish (tag) some record without the need to supply a unique string value?
In JS, Symbol is often used for such things. But I would like to have it without using Effect, in a pure context.
It could even be something like accessing the fully qualified module name (which is unique enough for the task), but I'm not sure whether this is relevant or possible in the PureScript context.
Example:
Say there is some module that exposes:
type Worker value state =
  { tag :: String
  , work :: value -> state -> Effect state
  }
makeWorker :: forall value state. Worker value state
performWork :: forall value state. woker -> Worker value state -> value -> Unit
This module manages the state of workers: it passes each worker a value and its current state, gets back an Effect with the new state, and stores that state in a map whose keys are the tags.
Users of the module:
In one module:
worker = makeWorker { tag: "WorkerOne", work }
-- Then this tagged `worker` is used to performWork:
-- performWork worker "Some value"
In another module we use worker with another tag:
worker = makeWorker { tag: "WorkerTwo", work }
So it would be nice if there were no need to supply a unique string ("WorkerOne", "WorkerTwo") as a tag, and some "generated" unique value could be used instead. The constraint is that the worker should be created at the top level of the module, in a pure context.
The semantics of PureScript are pure and pretty much incompatible with this sort of thing. The same expression always produces the same result. The results can be represented differently at a lower level, but in the language semantics they're the same.
And this is a feature, not a bug. In my experience, more often than not, a requirement like yours is an indication of a flawed design somewhere upstream.
An exception to this rule is FFI: if you have to interact with the underlying platform, there is no choice but to play by that platform's rules. One example I can give is React, which uses JavaScript's implicit object identity as a way to tell components apart.
So the bottom line is: I urge you to reconsider the requirement. Chances are, you don't really need it. And even if you do, manually specified strings might actually be better than automatically generated ones, because they may help you troubleshoot later.
But if you really insist on doing it this way, good news: you can cheat! :-)
You can generate your IDs effectfully and then wrap them in unsafePerformEffect to make it look pure to the compiler. For example:
import Effect.Unsafe (unsafePerformEffect)
import Data.UUID (toString, genUUID)
workerTag :: String
workerTag = toString $ unsafePerformEffect genUUID

Pass information from one compiler component to another without mutation

I am building a compiler plugin that has two components
Permission Accumulator: Load the function definitions and some extra metadata about them into a structure like a Map[String, (...)], where the String keys represent the function names and the tuple contains the metadata plus the definition in scope.
Function Transformer: Recursively traverse the function bodies to check whether the metadata of the caller aligns with that of the callee. More specifically, caller.metadata ⊆ callee.metadata
This kind of preloading is a rather common thing in compilers (Zinc, Unison etc. all have similar tricks they pull). The first component needs to pass this information it has accumulated to the second component.
Unfortunately, the current implementation uses a mutable.Map in the Plugin class and initiates the phases with a reference to this mutable Map. This code won't be surfaced to the end user, so some amount of mutation could be tolerated. Still, if someone (including myself) were to add another component/phase that touched this Map in the future, things could go very wrong, resulting in a situation that is painful to debug.
Question: I am wondering if there is a way to instantiate one component, extract some information from it, use that info to init the second component and run it.
Current Implementation:
import scala.collection.mutable.{ Map => MMap }
class Contentable(override val global: Global) extends Plugin {
  val functions: MMap[String, (String, List[String])] = MMap()
  val components = new PermissionAccumulator(global, functions) :: new FunctionRewriter(global, functions) :: Nil
}
The first component mutates the Map as such:
functions += (dd.name.toString -> ((md5HashString(dd.rhs.toString()), roles)))
What I have tried:
My original plan was to encapsulate the mutation inside the first component and do something like secondComponent(global, firstComponent.functions), but because Scala classes seem to create a copy of their arguments when an instance is created, the changes to this Map are not reflected in the second component.
Note: I have no problem turning these component to phases if that makes a difference.
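One possible direction, sketched with plain classes rather than the real compiler plugin API. Scala passes object references to constructors and does not copy them, so the first component can keep the mutable map private and hand the second component only a read-only view of it. Accumulator, Rewriter, record and rolesOf are hypothetical names used for illustration, not part of scala.tools.nsc.

```scala
import scala.collection.mutable

// Hypothetical stand-in for the first component: it owns the only mutable reference.
final class Accumulator {
  private val table = mutable.Map.empty[String, (String, List[String])]

  // Called while traversing definitions (hash and roles are supplied by the caller here).
  def record(name: String, hash: String, roles: List[String]): Unit =
    table += (name -> ((hash, roles)))

  // Read-only interface over the same underlying map: later writes are visible
  // through it, but this type offers no methods to add or remove entries.
  def functions: collection.Map[String, (String, List[String])] = table
}

// Hypothetical stand-in for the second component: it can look entries up, nothing more.
final class Rewriter(functions: collection.Map[String, (String, List[String])]) {
  def rolesOf(name: String): List[String] =
    functions.get(name).map(_._2).getOrElse(Nil)
}
```

A Rewriter built from acc.functions before any call to acc.record still sees entries recorded afterwards, because both objects refer to the same map. The view is only a static-type restriction; a determined caller could still downcast it.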

How to use Scala Cats Validated the correct way?

Following is my use case
I am using Cats for validation of my config. My config file is in json.
I deserialize my config file to my case class Config using lift-json and then validate it using Cats. I am using this as a guide.
My motive for using Cats is to collect all errors, if present, at validation time.
My problem is that the examples given in the guide are of this form:
case class Person(name: String, age: Int)
def validatePerson(name: String, age: Int): ValidationResult[Person] = {
  (validateName(name), validate(age)).mapN(Person)
}
But in my case I already deserialized my config into my case class ( below is a sample ) and then I am passing it for validation
case class Config(source: List[String], dest: List[String], extra: List[String])
def validateConfig(config: Config): ValidationResult[Config] = {
  (validateSource(config.source), validateDestination(config.dest))
    .mapN { case _ => config }
}
The difference here is mapN { case _ => config }. As I already have a config, if everything is valid I don't want to create the config anew from its members. This arises because I am passing the config to the validate function, not its members.
A person at my workplace told me this is not the correct way, as Cats Validated provides a way to construct an object if its members are valid. The object should not exist or should not be constructible if its members are invalid. Which makes complete sense to me.
So should I make any changes? Is what I'm doing above acceptable?
PS: The above Config is just an example; my real config can have other case classes as its members, which themselves can depend on other case classes.
One of the central goals of the kind of programming promoted by libraries like Cats is to make invalid states unrepresentable. In a perfect world, according to this philosophy, it would be impossible to create an instance of Config with invalid member data (through the use of a library like Refined, where complex constraints can be expressed in and tracked by the type system, or simply by hiding unsafe constructors). In a slightly less perfect world, it might still be possible to construct invalid instances of Config, but discouraged, e.g. through the use of safe constructors (like your validatePerson method for Person).
It sounds like you're in an even less perfect world where you have instances of Config that may or may not contain invalid data, and you want to validate them to get "new" instances of Config that you know are valid. This is totally possible, and in some cases reasonable, and your validateConfig method is a perfectly legitimate way to solve this problem, if you're stuck in that imperfect world.
The downside, though, is that the compiler can't track the difference between the already-validated Config instances and the not-yet-validated ones. You'll have Config instances floating around in your program, and if you want to know whether they've already been validated or not, you'll have to trace through all the places they could have come from. In some contexts this might be just fine, but for large or complex programs it's not ideal.
To sum up: ideally you'd validate Config instances whenever they are created (possibly even making it impossible to create invalid ones), so that you don't have to remember whether any given Config is good or not—the type system can remember for you. If that's not possible, because of e.g. APIs or definitions you don't control, or if it just seems too burdensome for a simple use case, what you're doing with validateConfig is totally reasonable.
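A minimal sketch of the "hide the unsafe constructor" idea, assuming a cut-down config with two fields. It uses the standard library's Either with a list of messages as a stand-in for Cats' Validated, so the error accumulation is written by hand. ValidConfig, from and nonEmpty are hypothetical names, and the non-empty rule is just an example constraint.

```scala
// The constructor is private, so a ValidConfig can only come from ValidConfig.from.
final class ValidConfig private (val source: List[String], val dest: List[String])

object ValidConfig {
  // Stand-in for ValidationResult: Left carries every error message found.
  type Result[A] = Either[List[String], A]

  // Example rule (an assumption for this sketch): a field must not be empty.
  private def nonEmpty(field: String, xs: List[String]): Result[List[String]] =
    if (xs.nonEmpty) Right(xs) else Left(List(s"$field must not be empty"))

  // Runs both checks and collects all failures instead of stopping at the first.
  def from(source: List[String], dest: List[String]): Result[ValidConfig] = {
    val s = nonEmpty("source", source)
    val d = nonEmpty("dest", dest)
    (s, d) match {
      case (Right(src), Right(dst)) => Right(new ValidConfig(src, dst))
      case _                        => Left(List(s, d).collect { case Left(errs) => errs }.flatten)
    }
  }
}
```

Any function that takes a ValidConfig parameter can then rely on the checks having run, because there is no other way to obtain an instance.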
As a footnote, since you say above that you're interested in looking in more detail at Refined, what it provides for you in a situation like this is a way to avoid even more functions of the shape A => ValidationResult[A]. Right now your validateName method, for example, probably takes a String and returns a ValidationResult[String]. You can make exactly the same argument against this signature as I have against Config => ValidationResult[Config] above—once you're working with the result (by mapping a function over the Validated or whatever), you just have a string, and the type doesn't tell you that it's already been validated.
What Refined allows you to do is write a method like this:
def validateName(in: String): ValidationResult[Refined[String, SomeProperty]] = ...
…where SomeProperty might specify a minimum length, or the fact that the string matches a particular regular expression, etc. The important point is that you're not validating a String and returning a String that only you know something about—you're validating a String and returning a String that the compiler knows something about (via the Refined[A, Prop] wrapper).
Again, this may be (okay, probably is) overkill for your use case—you just might find it nice to know that you can push this principle (tracking validation in types) even further down through your program.

Configuration data in Scala -- should I use the Reader monad?

How do I create a properly functional configurable object in Scala? I have watched Tony Morris' video on the Reader monad and I'm still unable to connect the dots.
I have a hard-coded list of Client objects:
// A case class, so that Client("Bob", 20) compiles and name/age are readable fields
case class Client(name: String, age: Int) { /* etc */ }
object Client {
  // Horrible!
  val clients = List(Client("Bob", 20), Client("Cindy", 30))
}
I want Client.clients to be determined at runtime, with the flexibility of either reading it from a properties file or from a database. In the Java world I'd define an interface, implement the two types of source, and use DI to assign a class variable:
trait ConfigSource {
  def clients : List[Client]
}
object ConfigFileSource extends ConfigSource {
  override def clients = buildClientsFromProperties(Properties("clients.properties"))
  //...etc, read properties files
}
object DatabaseSource extends ConfigSource { /* etc */ }
object Client {
  @Resource("configuration_source")
  private var config : ConfigSource = _ //Inject it at runtime
  val clients = config.clients
}
This seems like a pretty clean solution to me (not a lot of code, clear intent), but that var does jump out (OTOH, it doesn't seem to me really troublesome, since I know it will be injected once-and-only-once).
What would the Reader monad look like in this situation and, explain it to me like I'm 5, what are its advantages?
Let's start with a simple, superficial difference between your approach and the Reader approach, which is that you no longer need to hang onto config anywhere at all. Let's say you define the following vaguely clever type synonym:
type Configured[A] = ConfigSource => A
Now, if I ever need a ConfigSource for some function, say a function that gets the n'th client in the list, I can declare that function as "configured":
def nthClient(n: Int): Configured[Client] = {
  config => config.clients(n)
}
So we're essentially pulling a config out of thin air, any time we need one! Smells like dependency injection, right? Now let's say we want the ages of the first, second and third clients in the list (assuming they exist):
def ages: Configured[(Int, Int, Int)] =
  for {
    a0 <- nthClient(0)
    a1 <- nthClient(1)
    a2 <- nthClient(2)
  } yield (a0.age, a1.age, a2.age)
For this, of course, you need some appropriate definition of map and flatMap. I won't get into that here, but will simply say that Scalaz (or Rúnar's awesome NEScala talk, or Tony's which you've seen already) gives you all you need.
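A minimal sketch of what such map and flatMap definitions could look like for the Configured synonym, using only the standard library instead of Scalaz. ReaderSketch, ConfiguredOps and FixedSource are made-up names for illustration, and Client is assumed to be a case class with public name and age fields.

```scala
case class Client(name: String, age: Int)

trait ConfigSource {
  def clients: List[Client]
}

object ReaderSketch {
  type Configured[A] = ConfigSource => A

  // Plain functions have no map/flatMap, so this (hypothetical) wrapper adds them.
  implicit class ConfiguredOps[A](val run: Configured[A]) extends AnyVal {
    // Transform the result once a config is supplied.
    def map[B](f: A => B): Configured[B] =
      config => f(run(config))

    // Feed the same config to this step and to the step that depends on its result.
    def flatMap[B](f: A => Configured[B]): Configured[B] =
      config => f(run(config))(config)
  }

  def nthClient(n: Int): Configured[Client] =
    config => config.clients(n)

  def ages: Configured[(Int, Int, Int)] =
    for {
      a0 <- nthClient(0)
      a1 <- nthClient(1)
      a2 <- nthClient(2)
    } yield (a0.age, a1.age, a2.age)

  // Illustrative in-memory source, standing in for a properties file or a database.
  object FixedSource extends ConfigSource {
    val clients = List(Client("Bob", 20), Client("Cindy", 30), Client("Dave", 40))
  }
}
```

Nothing is read until a source is supplied: ReaderSketch.ages(ReaderSketch.FixedSource) threads that one source through all three lookups.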
The important point here is that the ConfigSource dependency and its so-called injection are mostly hidden. The only "hint" that we can see here is that ages is of type Configured[(Int, Int, Int)] rather than simply (Int, Int, Int). We didn't need to explicitly reference config anywhere.
As an aside, this is the way I almost always like to think about monads: they hide their effect so it's not polluting the flow of your code, while explicitly declaring the effect in the type signature. In other words, you needn't repeat yourself too much: you say "hey, this function deals with effect X" in the function's return type, and don't mess with it any further.
In this example, of course, the effect is to read from some fixed environment. Other monadic effects you might be familiar with include error handling: we can say that Option hides error-handling logic while making the possibility of errors explicit in your method's type. Or, sort of the opposite of reading, the Writer monad hides the thing we're writing to while making its presence explicit in the type system.
Now finally, just as we normally need to bootstrap a DI framework (somewhere outside our usual flow of control, such as in an XML file), we also need to bootstrap this curious monad. Surely we'll have some logical entry point to our code, such as:
def run: Configured[Unit] = // ...
It ends up being pretty simple: since Configured[A] is just a type synonym for the function ConfigSource => A, we can just apply the function to its "environment":
run(ConfigFileSource)
// or
run(DatabaseSource)
Ta-da! So, contrasting with the traditional Java-style DI approach, we don't have any "magic" occurring here. The only magic, as it were, is encapsulated in the definition of our Configured type and the way it behaves as a monad. Most importantly, the type system keeps us honest about which "realm" dependency injection is occurring in: anything with type Configured[...] is in the DI world, and anything without it is not. We simply don't get this in old-school DI, where everything is potentially managed by the magic, so you don't really know which portions of your code are safe to reuse outside of a DI framework (for example, within your unit tests, or in some other project entirely).
update: I wrote up a blog post which explains Reader in greater detail.

OCaml interface vs. signature?

I'm a bit confused about interfaces vs. signatures in OCaml.
From what I've read, interfaces (the .mli files) are what govern what values can be used/called by the other programs. Signature files look like they're exactly the same, except that they name it, so that you can create different implementations of the interface.
For example, if I want to create a module that is similar to a set in Java:
I'd have something like this:
the set.mli file:
type 'a set
val is_empty : 'a set -> bool
val ....
etc.
The signature file (setType.ml)
module type Set = sig
  type 'a set
  val is_empty : 'a set -> bool
  val ...
  etc.
end
and then an implementation would be another .ml file, such as SpecialSet.ml, which includes a struct that defines all the values and what they do.
module SpecialSet : Set = struct
  ...
end
I'm a bit confused as to what exactly the "signature" does, and what purpose it serves. Isn't it acting like a sort of interface? Why are both the .mli and the .ml needed? The only difference in lines I see is that it names the module.
Am I misunderstanding this, or is there something else going on here?
OCaml's module system is tied into separate compilation (the pairs of .ml and .mli files). So each .ml file implicitly defines a module, each .mli file defines a signature, and if there is a corresponding .ml file that signature is applied to that module.
It is useful to have an explicit syntax to manipulate modules and interfaces to one's liking inside a .ml or .mli file. This allows signature constraints, as in S with type t = M.t.
Not least is the possibility it gives to define functors, modules parameterized by one or several modules: module F (X : S) = struct ... end. All these would be impossible if the only way to define a module or signature was as a file.
I am not sure how that answers your question, but I think the answer to your question is probably "yes, it is as simple as you think, and the system of having .mli files and explicit signatures inside files is redundant on your example. Manipulating modules and signatures inside a file allows more complicated tricks in addition to these simple things".
This question is old but maybe this is useful to someone:
A file named a.ml appears as a module A in the program...
The interface of the module a.ml can be written in file named a.mli
This is from the OCaml MOOC from Université Paris Diderot.