Using STArray and ignoring the return value of modify in PureScript

I think I'm close to what I want, though I suspect I'm not understanding how thaw / TH Region works.
Here is what I'm trying to implement (at least roughly)
modifyPerIndex :: forall t a. Foldable t => t (Tuple Int (a -> a)) -> Array a -> Array a
modifyPerIndex foldableActions array = run do
  mutableArray <- thaw array
  let actions = fromFoldable foldableActions
  foreach actions (\(Tuple index action) -> modify index action mutableArray)
  freeze mutableArray
This is sort of how I imagine updateAtIndices works. I suppose I could write modifyPerIndex to use updateAtIndices by reading in the values, applying the (a -> a) and mapping the result into a list of Tuples to be sent to updateAtIndices.
I'm curious how to do it this way though.
In the code above modify returns ST h Boolean, which I'd like to change into ST h Unit. That's where I'm lost. I get that h here is a constraint put on mutable data to stop it from leaving run, what I don't understand is how to use that.

There are a few options. But it has nothing to do with h. You don't have to "use" it for anything, and you don't have to worry about it at all.
First, the bluntest and most straightforward approach: bind the result to an ignored variable, then return unit separately:
foreach actions \(Tuple index action) -> do
  _ <- modify index action mutableArray
  pure unit
Alternatively, you can use void, which does more or less the same thing under the hood:
foreach actions \(Tuple index action) -> void $ modify index action mutableArray
But I would go straight for for_, which does the same job as foreach but works for any applicative functor, and therefore any monad (not just ST), and discards each iteration's return value:
for_ actions \(Tuple index action) -> modify index action mutableArray
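For readers more at home outside PureScript, the same copy, mutate, freeze shape can be sketched in Python. This is only an analogy, not PureScript: `modify` and `modify_per_index` are hypothetical helpers written for this illustration, and the meaning of the returned bool ("the index was in range") is an assumption of the sketch. The point is simply that the loop discards each step's result.

```python
from typing import Callable, Iterable, List, Tuple, TypeVar

A = TypeVar("A")


def modify(index: int, f: Callable[[A], A], arr: List[A]) -> bool:
    """Hypothetical helper: apply f in place at index, report whether it was in range."""
    if 0 <= index < len(arr):
        arr[index] = f(arr[index])
        return True
    return False


def modify_per_index(
    actions: Iterable[Tuple[int, Callable[[A], A]]], array: List[A]
) -> List[A]:
    mutable = list(array)                # "thaw": work on a private copy
    for index, action in actions:        # any iterable is accepted
        modify(index, action, mutable)   # the bool result is deliberately ignored
    return mutable                       # "freeze": hand back the finished copy


original = [1, 2, 3]
updated = modify_per_index([(0, lambda x: x + 10), (5, lambda x: x * 2)], original)
print(updated)   # [11, 2, 3]
print(original)  # [1, 2, 3]
```

The out-of-range index 5 is silently skipped, which is exactly the information the ignored return value would have carried.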

Related

Function generation with arbitrary signature - revisited

I am revisiting a question asked almost a decade ago on this site (link), which is not as generic as I would like.
What I am hoping for is a way to construct a function from a list of types, where the final output type can have an arbitrary/default value (such as 0.0 for a float, or "" for a string). So, from
[float; int; float;]
I would get something that amounts to
fun (f: float) ->
    fun (i: int) ->
        0.0
I am hopeful of achieving this, but have so far been unable to. It would help me a lot if I could see a sample that does the above.
The answer in the above link goes some of the way, but the example seems to know its function signature at compile time, which I won't, and also generates a compiler warning.
The scenario I have, for those that find context helpful, is that I want to be able to open a dll and one way or another identify a method which will have a given signature with argument-types limited to a known set of types (i.e. float, int). For each input parameter in this function signature I will run code to generate a 'buffer' object, which will have
a buffer of data items of the given type, i.e. [1.2; 3.2; 4.5]
a supplier of that data type (supplies may be intermittent so the receiving buffer may be empty at any one time)
a generator function that transforms data items before being dispatched. This function can be updated at any time.
a dispatch function. The dispatch target of bufferA will be bufferB, and for bufferB it will be a pub-sub thing where subscribers can subscribe to the end result of the calculation, in this case a stream of floats. Data accumulates in applicative style down the chain of buffers, until the final result is published as a new stream.
a regulator that turns the stream of data heading out to the consumer on or off. This ensures orderly function application.
The function from the dll will eventually be given to BufferA to apply to a float and pass the result on to buffer B (to pick up an int). However, while setting up the buffer infrastructure I only need a function with the correct signature, so a dummy value, such as 0.0, is fine.
For a function of a known signature I can handcraft the code that creates the necessary infrastructure, but I would like to be able to automate this, and ideally register dlls and have new calculated streams available plugin-style without rebuilding the application.
If you're willing to throw type safety out the window, you could do this:
let rec makeFunction = function
    | ["int"] -> box 0
    | ["float"] -> box 0.0
    | ["string"] -> box ""
    | "int" :: types ->
        box (fun (_ : int) -> makeFunction types)
    | "float" :: types ->
        box (fun (_ : float) -> makeFunction types)
    | "string" :: types ->
        box (fun (_ : string) -> makeFunction types)
    | _ -> failwith "Unexpected"
Here's a helper function for invoking one of these monstrosities:
let rec invokeFunction types (values : List<obj>) (f : obj) =
    match types, values with
    | [_], [] -> f
    | ("int" :: types'), (value :: values') ->
        let f' = f :?> (int -> obj)
        let value' = value :?> int
        invokeFunction types' values' (f' value')
    | ("float" :: types'), (value :: values') ->
        let f' = f :?> (float -> obj)
        let value' = value :?> float
        invokeFunction types' values' (f' value')
    | ("string" :: types'), (value :: values') ->
        let f' = f :?> (string -> obj)
        let value' = value :?> string
        invokeFunction types' values' (f' value')
    | _ -> failwith "Unexpected"
And here it is in action:
let types = ["int"; "float"; "string"] // int -> float -> string
let f = makeFunction types
let values = [box 1; box 2.0]
let result = invokeFunction types values f
printfn "%A" result // output: ""
Caveat: This is not something I would ever recommend in a million years, but it works.
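The shape of that recursion may be easier to see in a dynamically typed language, where no boxing or casting is needed. Below is a minimal Python sketch of the same idea, not a translation of the F# semantics: `make_function` and `DEFAULTS` are names invented for this illustration, and unlike the F# version nothing checks the type of the argument actually passed at each level.

```python
from typing import Any, List

# Dummy result for each supported type name (mirrors the F# match arms).
DEFAULTS = {"int": 0, "float": 0.0, "string": ""}


def make_function(types: List[str]) -> Any:
    """Build a curried function from type names; the last name is the result type."""
    if not types or any(t not in DEFAULTS for t in types):
        raise ValueError("Unexpected")
    head, *rest = types
    if not rest:
        return DEFAULTS[head]        # end of the chain: the dummy result value

    def step(_arg: Any) -> Any:      # one argument consumed per nesting level
        return make_function(rest)

    return step


f = make_function(["int", "float", "string"])  # int -> float -> string
result = f(1)(2.0)
print(repr(result))  # ''
```

Each call peels one type name off the list, so the nesting depth of the returned function always equals the number of argument types.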
I got 90% of what I needed from a blog post by James Randall entitled "compiling and executing fsharp dynamically at runtime". I was unable to avoid concretely specifying the top-level function signature, but a workaround was to generate an fsx script file containing that signature (determined from the relevant MethodInfo in the inspected dll), then load and run that script. James' blog and GitHub repository also describe loading and running functions contained in script files. Having obtained the curried function from the dll, I then apply it to default arguments to get representative functions of arity n-1 using:
let p1: 'p1 = Activator.CreateInstance(typeof<'p1>) :?> 'p1
let fArity2 = fArity3 p1
Creating and running a script file is slow, of course, but I only need to do this once, when setting up the calculation stream.

How do I cache hash codes for an AST?

I am working on a language in F# and upon testing, I find that the runtime spends over 90% of its time comparing for equality. Because of that the language is so slow as to be unusable. During instrumentation, the GetHashCode function shows fairly high up on the list as a source of overhead. What is going on is that during method calls, I am using method bodies (Expr) along with the call arguments as keys in a dictionary and that triggers repeated traversals over the AST segments.
To improve performance I'd like to add memoization nodes in the AST.
type Expr =
    | Add of Expr * Expr
    | Lit of int
    | HashNode of int * Expr
In the simplified example above, I would like the HashNode to represent the hash of its Expr, so that GetHashCode does not have to travel any deeper into the AST to calculate it.
That said, I am not sure how I should override the GetHashCode method. Ideally, I'd like to reuse the built-in hash function and make it ignore only the HashNode somehow, but I am not sure how to do that.
More likely, I am going to have to make my own hash function, but unfortunately I know nothing about hash functions so I am a bit lost right now.
An alternative idea that I have would be to replace nodes with unique IDs while keeping that hash function as it is, but that would introduce additional complexities into the code that I'd rather avoid unless I have to.
I needed a similar thing recently in TheGamma (GitHub) where I build a dependency graph (kind of like AST) that gets recreated very often (when you change code in editor and it gets re-parsed), but I have live previews that may take some time to calculate, so I wanted to reuse as much of the previous graph as possible.
The way I'm doing that is that I attach a "symbol" to each node. Two nodes with the same symbol are equal, which I think you could use for efficient equality testing:
type Expr =
    | Add of ExprNode * ExprNode
    | Lit of int
and ExprNode(expr:Expr, symbol:int) =
    member x.Expression = expr
    member x.Symbol = symbol
    override x.GetHashCode() = symbol
    override x.Equals(y) =
        match y with
        | :? ExprNode as y -> y.Symbol = x.Symbol
        | _ -> false
I keep a cache of nodes. The key consists of a code for the node kind (0 for Lit, 1 for Add, etc.) followed by the symbols of all nested nodes. For literals, I also add the number itself, so creating the same literal twice gives you the same node. Creating a node looks like this:
let node expr ctx =
    // Get the key from the kind of the expression
    // and symbols of all nested nodes in this expression
    let key =
        match expr with
        | Lit n -> [0; n]
        | Add(e1, e2) -> [1; e1.Symbol; e2.Symbol]
    // Return either a node from cache or create a new one
    match ListDictionary.tryFind key ctx with
    | Some res -> res
    | None ->
        let res = ExprNode(expr, nextId())
        ListDictionary.set key res ctx
        res
The ListDictionary module is a mutable dictionary where the key is a list of integers and nextId is the usual function to generate next ID:
open System.Collections.Generic

type ListDictionaryNode<'K, 'T> =
    { mutable Result : 'T option
      Nested : Dictionary<'K, ListDictionaryNode<'K, 'T>> }
type ListDictionary<'K, 'V> = Dictionary<'K, ListDictionaryNode<'K, 'V>>
[<CompilationRepresentation(CompilationRepresentationFlags.ModuleSuffix)>]
module ListDictionary =
    let tryFind ks dict =
        let rec loop ks node =
            match ks, node with
            | [], { Result = Some r } -> Some r
            | k::ks, { Nested = d } when d.ContainsKey k -> loop ks (d.[k])
            | _ -> None
        loop ks { Nested = dict; Result = None }
    let set ks v dict =
        let rec loop ks (dict:ListDictionary<_, _>) =
            match ks with
            | [] -> failwith "Empty key not supported"
            | k::ks ->
                if not (dict.ContainsKey k) then
                    dict.[k] <- { Nested = Dictionary<_, _>(); Result = None }
                if List.isEmpty ks then dict.[k].Result <- Some v
                else loop ks (dict.[k].Nested)
        loop ks dict
let nextId =
    // A ref cell is used because closures cannot capture 'let mutable' locals
    let id = ref 0
    fun () ->
        id.Value <- id.Value + 1
        id.Value
So, I guess I'm saying that you'll need to implement your own caching mechanism, but this worked quite well for me and may hint at how to do this in your case!
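The core of this scheme (often called hash-consing or interning) fits in a few lines. The Python sketch below is an illustration under assumptions, not the TheGamma code: `ExprNode` and `NodeCache` are names made up here, expressions are plain tuples such as `("lit", 1)` or `("add", left, right)`, and a flat dict keyed by a tuple stands in for the nested ListDictionary, since Python tuples are hashable.

```python
import itertools
from typing import Dict, Tuple


class ExprNode:
    """An expression plus a unique symbol; equality and hashing use only the symbol."""

    def __init__(self, expr: tuple, symbol: int) -> None:
        self.expr = expr
        self.symbol = symbol

    def __hash__(self) -> int:
        return self.symbol               # constant time: never walks the subtree

    def __eq__(self, other: object) -> bool:
        return isinstance(other, ExprNode) and other.symbol == self.symbol


class NodeCache:
    """Interns nodes so structurally equal expressions share one ExprNode."""

    def __init__(self) -> None:
        self._nodes: Dict[Tuple[int, ...], ExprNode] = {}
        self._next_id = itertools.count(1)

    def node(self, expr: tuple) -> ExprNode:
        # Key = kind code followed by the literal value or the children's symbols.
        if expr[0] == "lit":
            key = (0, expr[1])
        elif expr[0] == "add":
            key = (1, expr[1].symbol, expr[2].symbol)
        else:
            raise ValueError(f"unknown expression kind: {expr[0]!r}")
        cached = self._nodes.get(key)
        if cached is None:
            cached = ExprNode(expr, next(self._next_id))
            self._nodes[key] = cached
        return cached


cache = NodeCache()
one = cache.node(("lit", 1))
two = cache.node(("lit", 2))
first = cache.node(("add", one, two))
second = cache.node(("add", cache.node(("lit", 1)), cache.node(("lit", 2))))
print(first is second)  # True
print(first.symbol)     # 3
```

Because children are interned before their parent, the key for a parent only ever needs the children's symbols, so building and comparing nodes never traverses more than one level.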

Using F#'s hash function inside GetHashCode() evil?

I encountered a couple of places online where code looked something like this:
[<CustomEquality;NoComparison>]
type Test =
    | Foo
    | Bar
    override x.Equals y =
        match y with
        | :? Test as y' ->
            match y' with
            | Foo -> false
            | Bar -> true // silly, I know, but not the question here
        | _ -> failwith "error" // don't do this at home
    override x.GetHashCode() = hash x
But when I run the above in FSI, the prompt does not return when I either call hash foo on an instance of Test or when I call foo.GetHashCode() directly.
let foo = Test.Foo;;
hash foo;; // no returning to the console until Ctrl-break
foo.GetHashCode();; // no return
I couldn't readily prove it, but this suggests that hash x calls GetHashCode() on the object, which means the above code is dangerous. Or is it just FSI playing up?
I thought code like the above just means "please implement custom equality, but leave the hash function as default".
I have meanwhile implemented this pattern differently, but am still wondering whether I am correct in assuming that hash just calls GetHashCode(), leading to an eternal loop.
As an aside, using equality inside FSI returns immediately, suggesting that it either does not call GetHashCode() prior to comparison, or it does something else. Update: this makes sense as in the example above x.Equals does not call GetHashCode(), and the equality operator calls into Equals, not into GetHashCode().
It's not quite as simple as the hash function being a wrapper for GetHashCode, but I can comfortably tell you that the implementation override x.GetHashCode() = hash x is definitely not safe.
If you trace the hash function through, you end up here:
let rec GenericHashParamObj (iec : System.Collections.IEqualityComparer) (x: obj) : int =
    match x with
    | null -> 0
    | (:? System.Array as a) ->
        match a with
        | :? (obj[]) as oa -> GenericHashObjArray iec oa
        | :? (byte[]) as ba -> GenericHashByteArray ba
        | :? (int[]) as ba -> GenericHashInt32Array ba
        | :? (int64[]) as ba -> GenericHashInt64Array ba
        | _ -> GenericHashArbArray iec a
    | :? IStructuralEquatable as a ->
        a.GetHashCode(iec)
    | _ ->
        x.GetHashCode()
You can see here that the wild-card case calls x.GetHashCode(), hence it's very possible to find yourself in an infinite recursion.
The only case I can see where you might want to use hash inside an implementation of GetHashCode() would be when you are manually hashing some of an object's members to produce a hash code.
There is a (very old) example of using hash inside GetHashCode() in this way in Don Syme's WebLog.
By the way, that's not the only thing unsafe about the code you posted.
Overrides for object.Equals absolutely must not throw exceptions. If the types do not match, they are to return false. This is clearly documented in System.Object.
Implementations of Equals must not throw exceptions; they should
always return a value. For example, if obj is null, the Equals method
should return false instead of throwing an ArgumentNullException.
(Source)
If the GetHashCode() method is overridden, then the hash operator will use that:
[The hash operator is a] generic hash function, designed to return equal hash values for items that are equal according to the = operator. By default it will use structural hashing for F# union, record and tuple types, hashing the complete contents of the type. The exact behavior of the function can be adjusted on a type-by-type basis by implementing System.Object.GetHashCode for each type.
So yes, this is a bad idea and it makes sense that it would lead to an infinite loop.
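The same trap, and the safe pattern of hashing selected members instead of the object itself, can be reproduced in Python, whose built-in `hash` also dispatches to the object's own `__hash__`. This is an analogy rather than a statement about F# internals; the classes `Bad` and `Point` are invented for the illustration.

```python
class Bad:
    def __hash__(self) -> int:
        return hash(self)   # hash() dispatches back to __hash__: unbounded recursion


class Point:
    def __init__(self, x: int, y: int) -> None:
        self.x = x
        self.y = y

    def __eq__(self, other: object) -> bool:
        # Never raise on a foreign type; just report "not equal".
        return isinstance(other, Point) and (self.x, self.y) == (other.x, other.y)

    def __hash__(self) -> int:
        return hash((self.x, self.y))   # hash the members, not self


try:
    hash(Bad())
    recursed = False
except RecursionError:
    recursed = True

print(recursed)                                  # True
print(Point(1, 2) == Point(1, 2))                # True
print(hash(Point(1, 2)) == hash(Point(1, 2)))    # True
print(Point(1, 2) == "not a point")              # False
```

Python stops the runaway recursion with a RecursionError, whereas the FSI session in the question simply never returned.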

append if element in list

I'm trying to create a parser for a program. For example, given the input "(2+3)-4", I want the result to be something like "(minus, (plus, num 2, num 3), num 4)".
What I've done so far:
I split "(2+3)-4" into the list Z = ["(","2","+","3",")","-","4"]. Then I check whether "-" is a member of Z and, if so, append the element "-" to a new list ["-"].
I'm not sure whether my approach is correct. I'm new to Erlang and struggling quite a lot. If anyone is able to offer some insight, thanks.
Consider the following, which returns a tuple-based representation of its input:
parse(Expr) ->
    Elems = re:split(Expr, "([-+)(])", [{return,list}]),
    parse(lists:filter(fun(E) -> E /= [] end, Elems), []).

parse([], [Result]) ->
    Result;
parse([], [V2,{op,Op},V1|Tacc]) ->
    parse([], [{Op,V1,V2}|Tacc]);
parse(["("|Tail], Acc) ->
    parse(Tail, [open|Acc]);
parse([")"|Tail], [Op,open|TAcc]) ->
    parse(Tail, [Op|TAcc]);
parse(["+"|Tail], Acc) ->
    parse(Tail, [{op,plus}|Acc]);
parse(["-"|Tail], Acc) ->
    parse(Tail, [{op,minus}|Acc]);
parse([V2|Tail], [{op,Op},V1|Tacc]) ->
    parse(Tail, [{Op,V1,{num,list_to_integer(V2)}}|Tacc]);
parse([Val|Tail], Acc) ->
    parse(Tail, [{num,list_to_integer(Val)}|Acc]).
The first function, parse/1, splits the expression along the + and - operators and parentheses, preserving these in the resulting list. It then filters that list to remove empty elements, and passes it with an empty accumulator to parse/2.
The parse/2 function has eight clauses, described below:
The first two handle the case when the parsed input list has been exhausted. The second of these handles the case where multiple elements in the accumulator need to be collapsed into a single tuple consisting of operator and operands.
The next two clauses handle parentheses. When we see an open parenthesis, we push an atom open into the accumulator. Upon seeing the matching close parenthesis, we expect to see an operation tuple and the atom open in the accumulator, and we replace them with just the tuple.
Clauses 5 and 6 handle + and - respectively. Each just pushes a {op,Operator} tuple into the accumulator, where Operator is either the atom plus or the atom minus.
The final two clauses handle values. The first one handles the case where the accumulator holds a left operand and an op tuple on top of it; these get replaced with a full operation tuple consisting of the atom plus or minus, the left operand, and a num tuple holding the new integer operand. The last clause just handles plain values.
Putting this in a module p, compiling it, and running it in an Erlang shell yields the following:
1> p:parse("2+3").
{plus,{num,2},{num,3}}
2> p:parse("(2+3)-4").
{minus,{plus,{num,2},{num,3}},{num,4}}
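The accumulator technique carries over to other languages with little change. Here is a loose Python sketch of it, offered as an illustration and not as a line-by-line port: the names `tokenize`, `push_value`, and `parse` are choices made for this example, the stack's top is the end of the list, a parenthesized group is folded into a pending operator as soon as the parenthesis closes (the Erlang version defers that to the end of input), and only well-formed input of integers, +, -, and parentheses is assumed.

```python
import re
from typing import List

OPS = {"+": "plus", "-": "minus"}


def tokenize(expr: str) -> List[str]:
    """Split into integers, operators, and parentheses, skipping whitespace."""
    return re.findall(r"\d+|[-+()]", expr)


def push_value(stack: list, value: tuple) -> None:
    """Push an operand, folding it into a pending operator if one is on top."""
    if stack and isinstance(stack[-1], tuple) and stack[-1][0] == "op":
        _, op = stack.pop()
        left = stack.pop()
        stack.append((op, left, value))
    else:
        stack.append(value)


def parse(expr: str) -> tuple:
    stack: list = []
    for tok in tokenize(expr):
        if tok == "(":
            stack.append("open")                 # marker for the group's start
        elif tok == ")":
            value = stack.pop()                  # the finished group
            if not stack or stack.pop() != "open":
                raise ValueError("unbalanced parenthesis")
            push_value(stack, value)
        elif tok in OPS:
            stack.append(("op", OPS[tok]))       # wait for the right operand
        else:
            push_value(stack, ("num", int(tok)))
    if len(stack) != 1:
        raise ValueError("incomplete expression")
    return stack[0]


print(parse("2+3"))      # ('plus', ('num', 2), ('num', 3))
print(parse("(2+3)-4"))  # ('minus', ('plus', ('num', 2), ('num', 3)), ('num', 4))
```

As in the Erlang version, operators are applied strictly left to right, which is sufficient for + and - but would need a precedence step before adding * or /.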

Passing anonymous vars in While loop syntax

I'm just starting with CoffeeScript and running the examples presented in the book "Programming in CoffeeScript".
In the section on while loops, I was intrigued as to why the call to the times function has to be written as shown below.
times = (number_of_times, callback) ->
  index = 0
  while index++ < number_of_times
    callback(index)
  return null

times 5, (index) ->
  console.log index
I was struggling a bit to read the code, and when I tried:
times (5, (index)) ->
  console.log index
it returned an error.
Could you help me understand this code, please?
A standard function definition is structured like this:
name = (arg, ...) ->
  body
so there's not much to say about your times definition. So let us look at your call to times:
times 5, (index) ->
  console.log index
This part:
(index) ->
  console.log index
is just another function definition but this one is anonymous. We can rewrite your call using a named function to help clarify things:
f = (index) -> console.log index
times 5, f
And we can fill in the optional parentheses to really spell it out:
f = (index) -> console.log(index)
times(5, f)
Once everything has been broken down you should see that the 5 and (index) in:
times 5, (index) ->
  console.log index
have nothing to do with each other so grouping them in parentheses:
times (5, (index)) ->
  console.log index
doesn't make sense. If you wanted to add parentheses to that times call to clarify the structure (which is quite useful when the callback function is longer) you need to know two things:
No space between the function name and the opening parenthesis around the arguments. If there is a space then CoffeeScript will think you're using the parentheses to group things within the argument list.
The parentheses need to surround the entire argument list and that includes the callback function's body.
Given that, you'd write:
times(5, (index) ->
  console.log index
)
or perhaps:
times(5, (index) -> console.log(index))
If console.log were a non-native function you could even write:
times(5, console.log)
but that will give you a TypeError in some browsers so don't go that far.
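For comparison, the same call structure written in Python, where parentheses around the argument list are mandatory and so the grouping is never ambiguous. This is a sketch for illustration only; `times`, `seen`, and `record` are names chosen here, and the loop spells out what `index++ < number_of_times` does (compare first, then increment before the body runs).

```python
from typing import Callable, List


def times(number_of_times: int, callback: Callable[[int], None]) -> None:
    index = 0
    while index < number_of_times:
        index += 1          # index++ has already incremented when the body runs
        callback(index)


seen: List[int] = []
# Two separate arguments: the number 5 and an anonymous one-parameter function.
times(5, lambda index: seen.append(index))
print(seen)  # [1, 2, 3, 4, 5]


named: List[int] = []

def record(index: int) -> None:
    named.append(index)

times(2, record)   # passing a named function works the same way
print(named)  # [1, 2]
```

The callback therefore receives 1 through 5, not 0 through 4, which is worth knowing when reading the CoffeeScript original.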