SML/ML Int to String conversion - smlnj

I have this code:
datatype 'a Tree = Empty | LEAF of 'a | NODE of ('a Tree) list;
val iL1a = LEAF 1;
val iL1b = LEAF 2;
val iL1c = LEAF 3;
val iL2a = NODE [iL1a, iL1b, iL1c];
val iL2b = NODE [iL1b, iL1c, iL1a];
val iL3 = NODE [iL2a, iL2b, iL1a, iL1b];
val iL4 = NODE [iL1c, iL1b, iL3];
val iL5 = NODE [iL4];
fun treeToString f Node = let
fun treeFun (Empty) = ["(:"]
| treeFun (NODE([])) = [")"]
| treeFun (LEAF(v)) = [f v]
| treeFun (NODE(h::t)) = [""] @ ( treeFun (h)) @ ( treeFun (NODE(t)) )
in
String.concat(treeFun Node)
end;
treeToString Int.toString iL5;
When I run my function I get the output: "32123)231)12)))".
The answer should be "((32((123)(231)12)))".
I've tried modifying my function to add ( in every place I can think but I cannot figure out where I should be adding "(". Where have I messed up?
Edit: I believe I need to use map or List.filter somewhere, but am not sure where.

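It looks like the way you recurse over the tail of a node's list is the problem. NODE(t) is just the rest of the same node, not a new node, so there is no single place in that recursion to emit the opening parenthesis. Instead of appending treeFun h to treeFun (NODE(t)), try using this for the NODE case:
treeFun (NODE(items)) = ["("] @ List.concat (map treeFun items) @ [")"]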
That is, map treeFun over the entire contents of the node, and surround the results with "(" and ")". That definition might be a bit too terse for you to understand what's going on, so here's a more verbose form that you might find clearer:
| treeFun (NODE(items)) =
    let val subtree_strings : string list list = map treeFun items
        val concatenated_subtrees : string list = List.concat subtree_strings
    in ["("] @ concatenated_subtrees @ [")"]
    end
subtree_strings is the result of taking all the subtrees in the given node and turning each of them into a list of strings by recursively calling treeFun on it. Since treeFun gives back a list of strings each time it's called, and we're calling it on an entire list of subtrees, the result is a corresponding list of lists of strings. So for instance, if we called map treeFun [LEAF 1, LEAF 2, LEAF 3], we'd get back [["1"], ["2"], ["3"]].
That's not the answer we want, since it's a list of lists of strings rather than a list of plain strings. We can fix that using List.concat, which takes a list of lists, and forms a single list of all the underlying items. So for instance List.concat [["1"], ["2"], ["3"]] returns ["1", "2", "3"]. Now all we have to do is put the parentheses around the result, and we're done.
Notice that this strategy works just as well for completely empty nodes as it does for nodes with one or more subtrees, so it eliminates the need for the second case of treeFun in your original definition. Generally, in ML, it's a code smell if a function of one argument doesn't have exactly one case for each constructor of the argument's type.
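For readers who want to see the map-then-flatten step outside SML, here is a minimal sketch of the same idea in Scala. The names TreeDemo, Leaf, Node and go are illustrative and not from the question; flatMap plays the role of List.concat (map treeFun items), and the Empty case simply mirrors the one in the question.

```scala
object TreeDemo {
  sealed trait Tree[+A]
  case object Empty extends Tree[Nothing]
  final case class Leaf[+A](value: A) extends Tree[A]
  final case class Node[+A](children: List[Tree[A]]) extends Tree[A]

  // One case per constructor; the Node case wraps its children in parentheses.
  def treeToString[A](f: A => String)(tree: Tree[A]): String = {
    def go(t: Tree[A]): List[String] = t match {
      case Empty       => List("(:") // kept as in the question's Empty case
      case Leaf(v)     => List(f(v))
      case Node(items) => List("(") ++ items.flatMap(go) ++ List(")")
    }
    go(tree).mkString
  }
}
```

Built from the same leaves and nodes as iL5 in the question, this sketch should produce the string the question asks for, and an empty Node comes out as "()" without needing a separate case.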

Related

Filling in desired lines in Scala

I currently have a value of result that is a string which represents cycles in a graph
scala> result
String =
0:0->52->22;
5:5->70->77;
8:8->66->24;8->42->32;
. //
. // trimmed to get my point across
. //
71:71->40->45;
77:77->34->28;77->5->70;
84:84->22->29
However, I want the output to also include the numbers that are missing in between, up to and including a certain value. For this example, value = 90:
0:0->52->22;
1:
2:
3:
4:
5:5->70->77;
6:
7:
8:8->66->24;8->42->32;
. //
. // trimmed
. //
83:
84:84->22->29;
85:
86:
87:
88:
89:
90:
If it helps or makes any difference, this value is turned into a list for later purposes, like so:
list_result = result.split("\n").toList
List[String] = List(0:0->52->22;, 5:5->70->77;, 8:8->66->24;8->42->32;, 11:11->26->66;11->17->66;
My initial thought was to insert the missing numbers into the list and then sort it, but I had trouble with the sorting so I instead look here for a better method.
Turn your list_result into a Map with default values. Then walk through the desired number range, exchanging each for its Map value.
val map_result: Map[String,List[String]] =
list_result.groupBy("\\d+:".r.findFirstIn(_).getOrElse("bad"))
.withDefault(List(_))
val full_result: String =
(0 to 90).flatMap(n => map_result(s"$n:")).mkString("\n")
Here's a Scastie session to see it in action.
One option would be to use a Map as an intermediate data structure:
val l: List[String] = List("0:0->52->22;", "5:5->70->77;", "8:8->66->24;8->42->32;", "11:11->26->66;11->17->66;")
val byKey: List[Array[String]] = l.map(_.split(":"))
val stop = 90
// start at 0 so the range is complete even if the input has no line for 0
val mapOfValues = (0 to stop).map(_ -> "").toMap
val output = byKey.foldLeft(mapOfValues)((acc, nxt) => acc + (nxt.head.toInt -> nxt.tail.head))
// foreach (not map) because we only print; "key:value" matches the requested format
output.toList.sorted.foreach { case (key, value) => println(s"$key:$value") }
This will give you the output you are after. It breaks your input strings into pseudo key-value pairs, creates a map to hold the results, inserts the elements of byKey into the map, then returns a sorted list of the results.
Note: If you are using this in anything like production code, you'd need to check that each Array in byKey really has two elements, to prevent exceptions (for example a NoSuchElementException) from the later calls to head and tail.head.
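One way the check from the note could look is sketched below. This is only an illustration under a few assumptions: FillLines, parse and fill are made-up names, split(":", 2) is used so that a line like "5:" still yields two parts, and toIntOption requires Scala 2.13 or later. Lines that do not parse are silently dropped here, which may or may not be what you want.

```scala
object FillLines {
  // Split on the first ':' only; return None for anything that is not "number:rest".
  def parse(line: String): Option[(Int, String)] =
    line.split(":", 2) match {
      case Array(k, v) => k.trim.toIntOption.map(_ -> v)
      case _           => None
    }

  // Build one output line per number in 0..stop, empty where the input had none.
  def fill(lines: List[String], stop: Int): List[String] = {
    val known = lines.flatMap(parse).toMap
    (0 to stop).toList.map(n => s"$n:${known.getOrElse(n, "")}")
  }
}
```

Calling FillLines.fill(list_result, 90).mkString("\n") would then give the padded text without any call to head or tail.head.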
The provided solutions are fine, but I would like to suggest one that can process the data lazily and doesn't need to keep all data in memory at once.
It uses a nice function called unfold, which lets you "unfold" a collection from a starting state, up to the point where you deem the collection to be over (docs).
It's not perfectly polished but I hope it may help:
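// Using.resource closes the Source when this method returns, before the returned
// iterator is consumed; fine for an in-memory string, but take care with files.
def readLines(s: String): Iterator[String] =
  util.Using.resource(io.Source.fromString(s))(_.getLines())
def emptyLines(from: Int, until: Int): Iterator[String] =
  Iterator.range(from, until).map(n => s"$n:")
def indexOf(line: String): Int =
  Integer.parseInt(line.substring(0, line.indexOf(':')))
def withDefaults(from: Int, to: Int, it: Iterator[String]): Iterator[String] = {
  Iterator.unfold((from, it)) { case (n, lines) =>
    if (lines.hasNext) {
      val next = lines.next()
      val i = indexOf(next)
      Some((emptyLines(n, i) ++ Iterator.single(next), (i + 1, lines)))
    } else if (n <= to) {
      // emit the remaining empty lines n..to; moving the state past `to` stops the next step
      // (with `n < to` the line for `to` would be lost when the last input line is `to - 1`)
      Some((emptyLines(n, to + 1), (to + 1, lines)))
    } else {
      None
    }
  }.flatten
}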
You can see this in action here on Scastie.
What unfold does is start from a state (in this case, the line number from and the iterator with the lines) and at every iteration:
- if there are still elements in the iterator, it gets the next item, identifies its index and returns:
  - as the next item, an Iterator with the empty lines up to that line number, followed by the actual line (e.g. when 5 is reached, the empty lines between 1 and 4 are emitted, terminated by the line starting with 5)
  - as the next state, the index of the line after the last one in the emitted item, and the iterator itself (which, being stateful, is consumed by the repeated calls made by unfold at each iteration); e.g. after processing 5, the next state is 6 and the iterator
- if there are no elements in the iterator anymore but the to index has not been reached, it emits another Iterator with the remaining items to be printed (in your example, those after 84)
- if both conditions are false, we don't need to emit anything anymore and we can close the "unfolding" collection, signalling this by returning None instead of Some[(Item, State)]
This returns an Iterator[Iterator[String]] where every nested iterator is a range of values from one line to the next, with the default empty lines "sandwiched" in between. The call to flatten turns it into the desired result.
I used an Iterator to make sure that only the essential state is kept in memory at any time and only when it's actually used.
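If unfold is new to you, a much smaller sketch may make the Some((item, nextState)) / None protocol easier to see. UnfoldDemo and countdown are made-up names for illustration; Iterator.unfold itself is part of the standard library from Scala 2.13 on.

```scala
object UnfoldDemo {
  // State: the next number to emit.
  // Each step returns Some((item, nextState)) to continue, or None to stop.
  def countdown(from: Int): Iterator[Int] =
    Iterator.unfold(from) { n =>
      if (n > 0) Some((n, n - 1)) else None
    }
}
```

Nothing is computed until the iterator is consumed, for example with UnfoldDemo.countdown(3).toList.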

Traversing a tree and assigning a subtree in Julia

I am trying to manipulate a tree in Julia. The tree is created as an object. All I want is substituting the one of the branches with another one. I can do it manually but can not do it by using a recursion function.
mutable struct ILeaf
majority::Any # +1 when prediction is correct
values::Vector # num_of_samples
indicies::Any # holds the index of training samples
end
mutable struct INode
featid::Integer
featval::Any
left::Union{ILeaf,INode}
right::Union{ILeaf,INode}
end
ILeafOrNode = Union{ILeaf,INode}
And my function for changing the tree is below (tree is the original one; using LR_STACK, I want to replace one of its branches with the subtree):
function traverse_and_assign(tree, subtree, lr_stack) # by using Global LR_stack
if top(lr_stack) == 0
tree = subtree
elseif top(lr_stack) == :LEFT
pop!(lr_stack)
return traverse_and_assign(tree.left, subtree, lr_stack)
else # right otherwise
pop!(lr_stack)
return traverse_and_assign(tree.right, lr_stack)
end
end
What happens is that I cannot change the original tree.
On the other hand :
tree.left.left = subtree
works perfectly fine.
What is wrong with my code ? Do I have to write a macro for this ?
B.R.
edit#1
In order to generate data :
n, m = 10^3, 5 ;
features = randn(n, m);
labels = rand(1:2, n);
edit#2
use 100 samples for training the decision tree :
base_learner = build_iterative_tree(labels, features, [1:20;])
then give other samples one by one :
i = 21
feature = features[21, :], label = labels[21]
gtree_stack, lr_stack = enter_iterate_on_tree(base_learner, feature[:], i, label[1])
get the indices of incorrect samples
ids = subtree_ids(gtree_stack)
build the subtree:
subtree = build_iterative_tree(l, f, ids)
update the original tree(base_learner):
traverse_and_assign(base_learner, subtree, lr_stack)
There's still no MWE, but maybe I can help with one problem without it.
In Julia, values are bound to variables, and function parameters are new variables. Let's test what that means:
function test_assign!(tree, subtree)
tree = subtree
return tree
end
a = 4;
b = 5;
test_assign!(a, b) # return 5
show(a) # 4 ! a is not changed!
What happened? The value 4 was bound to tree and the value 5 was bound to subtree.
Then subtree's value (5) was bound to tree.
And nothing else! That means a is still bound to 4.
How could we change a? This will work:
mutable struct SimplifiedNode
featid::Integer
end
function test_assign!(tree, subtree)
tree.featid = subtree.featid
end
a = SimplifiedNode(4)
b = SimplifiedNode(5)
test_assign!(a, b)
show(a) # SimplifiedNode(5)
Why? What happened?
The value of a (which is something like a pointer to a mutable struct) is bound to tree, and the value of b is bound to subtree.
So a and tree are bound to the same structure! That means that if we change that structure, a is bound to the changed structure.
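The same distinction applies to the tree in the question: assigning to the local variable tree only rebinds it, so the recursion has to stop at the parent and assign to one of its fields, as tree.left.left = subtree does. Below is a minimal sketch of that shape, written in Scala rather than Julia purely for illustration. Node, Leaf, Dir and replaceAt are hypothetical names, and the path is assumed to list the turns starting from the root, which may differ from how lr_stack is ordered.

```scala
object RebindVsMutate {
  sealed trait Tree
  final case class Leaf(label: String) extends Tree
  final class Node(var left: Tree, var right: Tree) extends Tree

  sealed trait Dir
  case object GoLeft extends Dir
  case object GoRight extends Dir

  // Walk down to the PARENT of the target slot, then mutate that parent's field.
  def replaceAt(node: Node, path: List[Dir], subtree: Tree): Unit = path match {
    case Nil            => throw new IllegalArgumentException("path must name at least one turn")
    case GoLeft :: Nil  => node.left = subtree
    case GoRight :: Nil => node.right = subtree
    case dir :: rest =>
      val child = if (dir == GoLeft) node.left else node.right
      child match {
        case n: Node => replaceAt(n, rest, subtree)
        case _: Leaf => throw new IllegalArgumentException("path runs through a leaf")
      }
  }
}
```

The key point is that the last step is a field assignment on an object the caller can also reach, not an assignment to a parameter.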

How do I cache hash codes for an AST?

I am working on a language in F# and upon testing, I find that the runtime spends over 90% of its time comparing for equality. Because of that the language is so slow as to be unusable. During instrumentation, the GetHashCode function shows fairly high up on the list as a source of overhead. What is going on is that during method calls, I am using method bodies (Expr) along with the call arguments as keys in a dictionary and that triggers repeated traversals over the AST segments.
To improve performance I'd like to add memoization nodes in the AST.
type Expr =
| Add of Expr * Expr
| Lit of int
| HashNode of int * Expr
In the above simplified example, what I would like is that the HashNode represent the hash of its Expr, so that the GetHashCode does not have to travel any deeper in the AST in order to calculate it.
Having said that, I am not sure how I should override the GetHashCode method. Ideally, I'd like to reuse the built-in hash method and make it ignore only the HashNode somehow, but I am not sure how to do that.
More likely, I am going to have to make my own hash function, but unfortunately I know nothing about hash functions so I am a bit lost right now.
An alternative idea that I have would be to replace nodes with unique IDs while keeping that hash function as it is, but that would introduce additional complexities into the code that I'd rather avoid unless I have to.
I needed a similar thing recently in TheGamma (GitHub) where I build a dependency graph (kind of like AST) that gets recreated very often (when you change code in editor and it gets re-parsed), but I have live previews that may take some time to calculate, so I wanted to reuse as much of the previous graph as possible.
The way I'm doing that is that I attach a "symbol" to each node. Two nodes with the same symbol are equal, which I think you could use for efficient equality testing:
type Expr =
| Add of ExprNode * ExprNode
| Lit of int
and ExprNode(expr:Expr, symbol:int) =
member x.Expression = expr
member x.Symbol = symbol
override x.GetHashCode() = symbol
override x.Equals(y) =
match y with
| :? ExprNode as y -> y.Symbol = x.Symbol
| _ -> false
I do keep a cache of nodes - the key is some code of the node kind (0 for Add, 1 for Lit, etc.) and symbols of all nested nodes. For literals, I also add the number itself, which will mean that creating the same literal twice will give you the same node. So creating a node looks like this:
let node expr ctx =
// Get the key from the kind of the expression
// and symbols of all nested node in this expression
let key =
match expr with
| Lit n -> [0; n]
| Add(e1, e2) -> [1; e1.Symbol; e2.Symbol]
// Return either a node from cache or create a new one
match ListDictionary.tryFind key ctx with
| Some res -> res
| None ->
let res = ExprNode(expr, nextId())
ListDictionary.set key res ctx
res
The ListDictionary module is a mutable dictionary where the key is a list of integers and nextId is the usual function to generate next ID:
type ListDictionaryNode<'K, 'T> =
{ mutable Result : 'T option
Nested : Dictionary<'K, ListDictionaryNode<'K, 'T>> }
type ListDictionary<'K, 'V> = Dictionary<'K, ListDictionaryNode<'K, 'V>>
[<CompilationRepresentation(CompilationRepresentationFlags.ModuleSuffix)>]
module ListDictionary =
let tryFind ks dict =
let rec loop ks node =
match ks, node with
| [], { Result = Some r } -> Some r
| k::ks, { Nested = d } when d.ContainsKey k -> loop ks (d.[k])
| _ -> None
loop ks { Nested = dict; Result = None }
let set ks v dict =
let rec loop ks (dict:ListDictionary<_, _>) =
match ks with
| [] -> failwith "Empty key not supported"
| k::ks ->
if not (dict.ContainsKey k) then
dict.[k] <- { Nested = Dictionary<_, _>(); Result = None }
if List.isEmpty ks then dict.[k].Result <- Some v
else loop ks (dict.[k].Nested)
loop ks dict
let nextId =
let mutable id = 0
fun () -> id <- id + 1; id
So, I guess I'm saying that you'll need to implement your own caching mechanism, but this worked quite well for me and may hint at how to do this in your case!
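For comparison, the core of this scheme (often called hash-consing) can be sketched in a few lines of Scala. This is a simplified illustration, not the TheGamma code: HashConsing and Context are made-up names, and a plain HashMap keyed by List[Int] stands in for the nested ListDictionary above.

```scala
import scala.collection.mutable

object HashConsing {
  sealed trait Expr
  final case class Lit(n: Int) extends Expr
  final case class Add(l: ExprNode, r: ExprNode) extends Expr

  // Equality and hashing look only at the symbol, never at the nested expression.
  final class ExprNode(val expr: Expr, val symbol: Int) {
    override def hashCode: Int = symbol
    override def equals(other: Any): Boolean = other match {
      case n: ExprNode => n.symbol == symbol
      case _           => false
    }
  }

  final class Context {
    private val cache = mutable.HashMap.empty[List[Int], ExprNode]
    private var lastId = 0

    // Key = node kind plus the symbols of the direct children (or the literal itself).
    def node(expr: Expr): ExprNode = {
      val key = expr match {
        case Lit(n)    => List(0, n)
        case Add(l, r) => List(1, l.symbol, r.symbol)
      }
      cache.getOrElseUpdate(key, { lastId += 1; new ExprNode(expr, lastId) })
    }
  }
}
```

Because every node is created through the context, building the same expression twice gives back the same node, and comparing or hashing it is a single integer operation.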

get one random letter from each tuple then return them all as a string

3 tuples in a list
val l = List(("a","b"),("c","d"),("e","f"))
I want to choose one element from each tuple and return the resulting 3-letter word each time,
for example: fca or afd or cbf ...
How can I do this?
the same as:
echo {a,b}{c,d}{e,f}|xargs -n1|shuf -n1|sed 's/\B/\n/g'|shuf|paste -sd ''
Working with tuples can be a bit of a pain. You can't easily index them and tuples of different sizes are considered different types in the type system.
val ts = List(("a","b"),("c","d"),("e","f"))
val str = ts.map{t =>
t.productElement(util.Random.nextInt(t.productArity))
}.mkString("")
Every time I run this I get a different result: bde, acf, bdf, etc.
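Note that productElement returns Any, and that the examples in the question (fca, cbf) also shuffle the order of the letters. A possible variation that keeps the String type and shuffles as well is sketched below; PickLetters and pick are made-up names, and passing the Random in is just an assumption to make the result reproducible with a seed.

```scala
import scala.util.Random

object PickLetters {
  // Pick one side of every pair at random, then shuffle the chosen letters.
  def pick(ts: List[(String, String)], rng: Random): String = {
    val chosen = ts.map { case (a, b) => if (rng.nextBoolean()) a else b }
    rng.shuffle(chosen).mkString
  }
}
```

For everyday use you could call PickLetters.pick(ts, new Random()); drop the shuffle line if the letters should stay in tuple order.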

.pop() equivalent in scala

I have worked with Python.
In Python there is a method .pop() which deletes the last value in a list and returns that
deleted value,
e.g. x=[1,2,3,4]
x.pop() will return 4
I was wondering if there is a Scala equivalent for this function?
If you just wish to retrieve the last value, you can call x.last. This won't remove the last element from the list, however, which is immutable. Instead, you can call x.init to obtain a list consisting of all elements in x except the last one - again, without actually changing x. So:
val lastEl = x.last
val rest = x.init
will give you the last element (lastEl), the list of all bar the last element (rest), and you still also have the original list (x).
There are a lot of different collection types in Scala, each with its own set of supported and/or well performing operations.
In Scala, a List is an immutable cons-cell sequence like in Lisp. Getting the last element is slow because the whole list has to be traversed (only access to the head element is fast). Similarly, Queue and Stack are optimised for retrieving an element and the rest of the structure from one particular end. You could use either of them if your order is reversed.
Otherwise, Vector is a good performing general structure which is fast both for head and last calls:
val v = Vector(1, 2, 3, 4)
val init :+ last = v // uses pattern matching extractor `:+` to get both init and last
Where last would be the equivalent of your pop operation, and init is the sequence with the last element removed (you can also use dropRight(1) as suggested in the other answers). To just retrieve the last element, use v.last.
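Note that the pattern above throws a MatchError on an empty Vector. A small sketch of a safer wrapper, plus a mutable variant that is closer to what Python's pop() does, could look like this. PopDemo, pop and popInPlace are made-up names; :+ and ArrayBuffer.remove are from the standard library.

```scala
import scala.collection.mutable.ArrayBuffer

object PopDemo {
  // Immutable: return the last element together with the remaining Vector.
  def pop[A](v: Vector[A]): Option[(A, Vector[A])] = v match {
    case init :+ last => Some((last, init))
    case _            => None // empty Vector
  }

  // Mutable: remove the last element in place and return it, like Python's list.pop().
  def popInPlace[A](buf: ArrayBuffer[A]): Option[A] =
    if (buf.isEmpty) None else Some(buf.remove(buf.length - 1))
}
```

Returning an Option is a design choice for the empty case; Python raises an IndexError there instead.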
I tend to use
val popped :: newList = list
which assigns the first element of the list to popped and the remaining list to newList
The first answer is correct but you can achieve the same doing:
val last = x.last
val rest = x.dropRight(1)
If you're willing to relax your need for immutable structures, there's always Stack and Queue:
val poppable = scala.collection.mutable.Stack[String]("hi", "ho")
val popped = poppable.pop()
Similar to Python's ability to pop multiple elements, Queue handles that:
val multiPoppable = scala.collection.mutable.Queue[String]("hi", "ho")
val allPopped = multiPoppable.dequeueAll(_ => true)
If it is a mutable.Queue, use the dequeue function:
/** Returns the first element in the queue, and removes this element
* from the queue.
*
* @throws java.util.NoSuchElementException
* @return the first element of the queue.
*/
def dequeue(): A =
if (isEmpty)
throw new NoSuchElementException("queue empty")
else {
val res = first0.elem
first0 = first0.next
decrementLength()
res
}