How to do fast prefix string matching in Scala

I'm using some Java code to do fast prefix lookups with java.util.TreeSet. Could I use Scala's TreeSet instead, or is there a different solution?
import scala.collection.mutable.ListBuffer
/** A class that uses a TreeSet to do fast prefix matching
*/
class PrefixMatcher {
private val _set = new java.util.TreeSet[String]
def add(s: String) = _set.add(s)
def findMatches(prefix: String): List[String] = {
val matches = new ListBuffer[String]
val tailSet = _set.tailSet(prefix)
for ( tail <- tailSet.toArray ) {
val tailString = tail.asInstanceOf[String]
if ( tailString.startsWith(prefix) )
matches += tailString
else
return matches.toList
}
matches.toList
}
}

Use a Trie. Nobody's actually posted a Trie here yet, despite the fact that some people have posted sorted TreeMap data structures that they have misnamed as tries. Here is a fairly representative sample of a Trie implementation in Java. I don't know of any in Scala. See also an explanation of Tries on Wikipedia.
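Since no Scala version is linked, here is a minimal sketch of what a trie-based matcher could look like. TrieNode and PrefixTrie are made-up names, not part of any library, and the sketch favors clarity over memory efficiency:

```scala
import scala.collection.mutable

// Hypothetical names: TrieNode and PrefixTrie are not from any library.
final class TrieNode {
  val children: mutable.Map[Char, TrieNode] = mutable.HashMap.empty[Char, TrieNode]
  var isWord: Boolean = false
}

final class PrefixTrie {
  private val root = new TrieNode

  // Walk (and create) one node per character, then mark the end of the word.
  def add(word: String): Unit = {
    var node = root
    for (c <- word) node = node.children.getOrElseUpdate(c, new TrieNode)
    node.isWord = true
  }

  // Returns every stored word starting with `prefix`, in no particular order.
  def findMatches(prefix: String): List[String] = {
    // Descend to the node for the prefix; None means no word has this prefix.
    var node: Option[TrieNode] = Some(root)
    for (c <- prefix) node = node.flatMap(_.children.get(c))
    val out = mutable.ListBuffer.empty[String]
    def collect(n: TrieNode, path: String): Unit = {
      if (n.isWord) out += path
      for ((c, child) <- n.children) collect(child, path + c)
    }
    node.foreach(collect(_, prefix))
    out.toList
  }
}
```

Lookup cost depends on the prefix length and the number of matches, not on the total number of stored words.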

The from & takeWhile approach:
class PrefixMatcher {
private val _set = new TreeSet[String]
def add(s: String) = _set.add(s)
def findMatches(prefix: String): Iterable[String] =
_set from prefix takeWhile(_ startsWith prefix)
}
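On newer Scala versions the same idea can be sketched with the standard immutable TreeSet. This assumes Scala 2.11 or later, where iteratorFrom is available, and SortedPrefixMatcher is a made-up name:

```scala
import scala.collection.immutable.TreeSet

// Hypothetical class name; assumes Scala 2.11+ for iteratorFrom.
final class SortedPrefixMatcher {
  private var words = TreeSet.empty[String]

  def add(s: String): Unit = words += s

  // Start iterating at the first element >= prefix and stop at the
  // first element that no longer starts with it.
  def findMatches(prefix: String): List[String] =
    words.iteratorFrom(prefix).takeWhile(_.startsWith(prefix)).toList
}
```

Because the set is sorted, all matches are contiguous, so takeWhile stops as soon as the prefix range ends.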
An alternative is to select a subset from prefix to prefix++ (the smallest string after the prefix). This selects only the range of the tree that actually starts with the given prefix. Filtering of entries is not necessary. The subSet method will create a view of the underlying set.
The increment method still needs work (it does not handle overflow or empty strings), but the intent should be clear.
class PrefixMatcher {
private val _set = new java.util.TreeSet[String]
def add(s: String) = _set.add(s)
def findMatches(prefix: String) : java.util.SortedSet[String] = {
def inc(x : String) = { //ignores overflow
assert(x.length > 0)
val last = x.length - 1
(x take last) + (x(last) + 1).toChar
}
// subSet returns a live view of the range [prefix, inc(prefix))
_set.subSet(prefix, inc(prefix))
}
}
The same works with the scala jcl.TreeSet#range implementation.
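One possible way to deal with the overflow and empty-string cases is sketched below. PrefixRange, upperBound and findMatches are hypothetical names, and the sketch assumes an immutable TreeSet with the default String ordering:

```scala
import scala.collection.immutable.TreeSet

// Hypothetical helper, not part of any library.
object PrefixRange {
  // Smallest string greater than every string starting with `prefix`,
  // or None when no such bound exists (empty prefix, or a prefix made
  // only of Char.MaxValue characters).
  def upperBound(prefix: String): Option[String] = {
    val trimmed = prefix.reverse.dropWhile(_ == Char.MaxValue).reverse
    if (trimmed.isEmpty) None
    else Some(trimmed.init + (trimmed.last + 1).toChar)
  }

  def findMatches(words: TreeSet[String], prefix: String): List[String] =
    upperBound(prefix) match {
      case Some(end) => words.range(prefix, end).toList
      // No upper bound: everything from the prefix onward matches.
      case None      => words.iteratorFrom(prefix).toList
    }
}
```

Trailing Char.MaxValue characters are dropped before incrementing, so "ab" followed by Char.MaxValue gets the bound "ac" instead of wrapping around.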

As I understand it, the Scala TreeSet is backed by the Java TreeSet, but using the Scala variant lets you shorten the code with a sequence comprehension (http://www.scala-lang.org/node/111), giving an implementation that looks something like this (for Scala 2.7):
import scala.collection.jcl.TreeSet;
class PrefixMatcher
{
private val _set = new TreeSet[String]
def add(s: String) = _set.add(s)
def findMatches(prefix: String): Iterable[String] =
for (s <- _set.from(prefix) if s.startsWith(prefix)) yield s
}
object Main
{
def main(args: Array[String]): Unit =
{
val pm = new PrefixMatcher()
pm.add("fooBar")
pm.add("fooCow")
pm.add("barFoo")
pm.findMatches("foo").foreach(println)
}
}
Apologies for any bad Scala style on my part; I'm just getting used to the language myself.

I blogged about finding matches for a combination of prefixes a while ago. It's a harder problem, as you don't know when one prefix ends and the other begins. It might interest you. I'll even post below the code that I did not blog (yet, hopefully :), though it is stripped of all comments, none of which were made in English:
package com.blogspot.dcsobral.matcher.DFA
object DFA {
type Matched = List[(String, String)]
def words(s : String) = s.split("\\W").filter(! _.isEmpty).toList
}
import DFA._
import scala.runtime.RichString
class DFA {
private val initialState : State = new State(None, "")
private var currState : State = initialState
private var _input : RichString = ""
private var _badInput : RichString = ""
private var _accepted : Boolean = true
def accepted : Boolean = _accepted
def input : String = _input.reverse + _badInput.reverse
def transition(c : Char) : List[(String, Matched)] = {
if (c == '\b') backtrack
else {
if (accepted) {
val newState = currState(c)
newState match {
case Some(s) => _input = c + _input; currState = s
case None => _badInput = c + _badInput; _accepted = false
}
} else {
_badInput = c + _badInput
}
optionList
}
}
def transition(s : String) : List[(String, Matched)] = {
s foreach (c => transition(c))
optionList
}
def apply(c : Char) : List[(String, Matched)] = transition(c)
def apply(s : String) : List[(String,Matched)] = transition(s)
def backtrack : List[(String, Matched)] = {
if(_badInput isEmpty) {
_input = _input drop 1
currState.backtrack match {
case Some(s) => currState = s
case None =>
}
} else {
_badInput = _badInput drop 1
if (_badInput isEmpty) _accepted = true
}
optionList
}
def optionList : List[(String, Matched)] = if (accepted) currState.optionList else Nil
def possibleTransitions : Set[Char] = if (accepted) (currState possibleTransitions) else Set.empty
def reset : Unit = {
currState = initialState
_input = ""
_badInput = ""
_accepted = true
}
def addOption(s : String) : Unit = {
initialState addOption s
val saveInput = input
reset
transition(saveInput)
}
def removeOption(s : String) : Unit = {
initialState removeOption s
val saveInput = input
reset
transition(saveInput)
}
}
class State (val backtrack : Option[State],
val input : String) {
private var _options : List[PossibleMatch] = Nil
private val transitions : scala.collection.mutable.Map[Char, State] = scala.collection.mutable.Map.empty
private var _possibleTransitions : Set[Char] = Set.empty
private def computePossibleTransitions = {
if (! options.isEmpty)
_possibleTransitions = options map (_.possibleTransitions) reduceLeft (_++_)
else
_possibleTransitions = Set.empty
}
private def computeTransition(c : Char) : State = {
val newState = new State(Some(this), input + c)
options foreach (o => if (o.possibleTransitions contains c) (o computeTransition (newState, c)))
newState
}
def options : List[PossibleMatch] = _options
def optionList : List[(String, Matched)] = options map (pm => (pm.option, pm.bestMatch))
def possibleTransitions : Set[Char] = _possibleTransitions
def transition(c : Char) : Option[State] = {
val t = c.toLowerCase
if (possibleTransitions contains t) Some(transitions getOrElseUpdate (t, computeTransition(t))) else None
}
def apply(c : Char) : Option[State] = transition(c)
def addOption(option : String) : Unit = {
val w = words(option)
addOption(option, w.size, List(("", w.head)), w)
}
def addOption(option : String, priority : Int, matched : Matched, remaining : List[String]) : Unit = {
options find (_.option == option) match {
case Some(pM) =>
if (!pM.hasMatchOption(matched)) {
pM.addMatchOption(priority, matched, remaining)
if (priority < pM.priority) {
val (before, _ :: after) = options span (_ != pM)
val (highPriority, lowPriority) = before span (p => p.priority < priority ||
(p.priority == priority && p.option < option))
_options = highPriority ::: (pM :: lowPriority) ::: after
}
transitions foreach (t => pM computeTransition (t._2, t._1))
}
case None =>
val (highPriority, lowPriority) = options span (p => p.priority < priority ||
(p.priority == priority && p.option < option))
val newPM = new PossibleMatch(option, priority, matched, remaining)
_options = highPriority ::: (newPM :: lowPriority)
transitions foreach (t => newPM computeTransition (t._2, t._1))
}
computePossibleTransitions
}
def removeOption(option : String) : Unit = {
options find (_.option == option) match {
case Some(possibleMatch) =>
val (before, _ :: after) = options span (_ != possibleMatch)
(possibleMatch.possibleTransitions ** Set(transitions.keys.toList : _*)).toList foreach (t =>
transition(t).get.removeOption(option))
_options = before ::: after
computePossibleTransitions
case None =>
}
}
}
class PossibleMatch (val option : String,
thisPriority : Int,
matched : Matched,
remaining : List[String]) {
private var _priority = thisPriority
private var matchOptions = List(new MatchOption(priority, matched, remaining))
private var _possibleTransitions = matchOptions map (_.possibleTransitions) reduceLeft (_++_)
private def computePossibleTransitions = {
_possibleTransitions = matchOptions map (_.possibleTransitions) reduceLeft (_++_)
}
def priority : Int = _priority
def hasMatchOption(matched : Matched) : Boolean = matchOptions exists (_.matched == matched)
def addMatchOption(priority : Int, matched : Matched, remaining : List[String]) : Unit = {
if (priority < _priority) _priority = priority
val (highPriority, lowPriority) = matchOptions span (_.priority < priority)
val newMO = new MatchOption(priority, matched, remaining)
matchOptions = highPriority ::: (newMO :: lowPriority)
computePossibleTransitions
}
def bestMatch : Matched = matchOptions.head.matched.reverse.map(p => (p._1.reverse.toString, p._2)) :::
remaining.tail.map(w => ("", w))
def possibleTransitions : Set[Char] = _possibleTransitions
def computeTransition(s: State, c : Char) : Unit = {
def computeOptions(state : State,
c : Char,
priority : Int,
matched : Matched,
remaining : List[String]) : Unit = {
remaining match {
case w :: ws =>
if (!w.isEmpty && w(0).toLowerCase == c.toLowerCase) {
val newMatched = (w(0) + matched.head._1, matched.head._2.substring(1)) :: matched.tail
val newPriority = if (matched.head._1 isEmpty) (priority - 1) else priority
if (w.drop(1) isEmpty)
s.addOption(option, newPriority - 1, ("", ws.head) :: newMatched , ws)
else
s.addOption(option, newPriority, newMatched, w.substring(1) :: ws)
}
if (ws != Nil) computeOptions(s, c, priority, ("", ws.head) :: matched, ws)
case Nil =>
}
}
if(possibleTransitions contains c)
matchOptions foreach (mO => computeOptions(s, c, mO.priority, mO.matched, mO.remaining))
}
}
class MatchOption (val priority : Int,
val matched : Matched,
val remaining : List[String]) {
lazy val possibleTransitions : Set[Char] = Set( remaining map (_(0) toLowerCase) : _* )
}
It really needs some refactoring, though. I always do that when I start explaining code for the blog.

Ok, I just realized what you want is pretty much what a friend of mine suggested for another problem. So, here is his answer, simplified for your needs.
class PrefixMatcher {
// import scala.collection.Set // Scala 2.7 needs this -- and returns a gimped Set
private var set = new scala.collection.immutable.TreeSet[String]()
private def succ(s : String) = s.take(s.length - 1) + ((s.charAt(s.length - 1) + 1)).toChar
def add(s: String) = set += s
def findMatches(prefix: String): Set[String] =
if (prefix.isEmpty) set else set.range(prefix, succ(prefix))
}


Why can't I get the accumulator values from a map instance correctly?

First method
I want to use accumulators to count the number of "NULL" strings in different columns, so I wrote the Spark code below (simplified). When I feed some input through the map operation on appData, I can see the output in stdout in the Spark web UI and the accumulator values increase, but when I read the final values in the driver, the accumulators are always zero. I would appreciate any help.
val mapAC = collection.mutable.Map[String, LongAccumulator]()
for (ei <- eventList) {
val idNullCN = sc.longAccumulator(ei + "_idNullCN")
mapAC.put(ei + "_idNullCN", idNullCN)
val packNullCN = sc.longAccumulator(ei + "_packNullCN")
mapAC.put(ei + "_packNullCN", packNullCN)
val positionNullCN = sc.longAccumulator(ei + "_positionNullCN")
mapAC.put(ei + "_positionNullCN", positionNullCN)
}
val mapBC = sc.broadcast(mapAC)
val res = appData.map(d => {
val ei = d.eventId
val map = mapBC.value
if (d.id.toUpperCase == "NULL") map(ei + "_idNullCN").add(1)
if (d.pack.toUpperCase == "NULL") map(ei + "_packNullCN").add(1)
if (d.position.toUpperCase == "NULL") map(ei + "_positionNullCN").add(1)
ei
})
res.count()
mapBC.value.foreach(ac=>{
println(ac._1 + ": " + ac._2.value)
})
Second method
I tried another way to calculate the values by creating a map accumulator like this:
import java.util
import java.util.Collections
import org.apache.spark.util.AccumulatorV2
import scala.collection.JavaConversions._
class CountMapAccumulator extends AccumulatorV2[String, java.util.Map[String, Long]] {
private val _map = Collections.synchronizedMap(new util.HashMap[String, Long]())
override def isZero: Boolean = _map.isEmpty
override def copy(): CountMapAccumulator = {
val newAcc = new CountMapAccumulator
_map.synchronized {
newAcc._map.putAll(_map)
}
newAcc
}
override def reset(): Unit = _map.clear()
override def add(key: String): Unit = _map.synchronized{_map.put(key, _map.get(key) + 1L)}
override def merge(other: AccumulatorV2[String, java.util.Map[String, Long]]): Unit = other match {
case o: CountMapAccumulator => for ((k, v) <- o.value) {
val oldValue = _map.put(k, v)
if (oldValue != null) {
_map.put(k, oldValue.longValue() + v)
}
// println("merge key: "+k+" old val: "+oldValue+" new Value: "+v+" current val: "+_map.get(k))
}
case _ => throw new UnsupportedOperationException(
s"Cannot merge ${this.getClass.getName} with ${other.getClass.getName}")
}
override def value: util.Map[String, Long] = _map.synchronized {
java.util.Collections.unmodifiableMap(new util.HashMap[String, Long](_map))
}
def setValue(value: Map[String, Long]): Unit = {
val newValue = mapAsJavaMap(value)
_map.clear()
_map.putAll(newValue)
}
}
Then I invoke it as follows:
val tmpMap = collection.mutable.Map[String, Long]()
for (ei <- eventList) {
tmpMap.put(ei + "_idNullCN", 0L)
tmpMap.put(ei + "_packNullCN", 0L)
tmpMap.put(ei + "_positionNullCN", 0L)
}
val accumulator = new CountMapAccumulator
accumulator.setValue(collection.immutable.Map[String,Long](tmpMap.toSeq:_*))
sc.register(accumulator, "CustomAccumulator")
val res = appData.map(d => {
val ei = d.eventId
if (d.id.toUpperCase == "NULL") accumulator.add(ei + "_idNullCN")
if (d.pack.toUpperCase == "NULL") accumulator.add(ei + "_packNullCN")
if (d.position.toUpperCase == "NULL") accumulator.add(ei + "_positionNullCN")
if (d.modulePos.toUpperCase == "NULL") accumulator.add(ei + "_modulePosNullCN")
ei
})
res.count()
accumulator.value.foreach(println)
But the accumulator value is still zero.
Second method, corrected
Because the program finished without failing, I had not checked the log. When I did, I found this error:
java.lang.UnsupportedOperationException: Cannot merge $line105198665522.$read$$iw$$iw$CountMapAccumulator with $line105198665522.$read$$iw$$iw$CountMapAccumulator
So I changed the pattern matching in the merge method like this:
override def merge(other: AccumulatorV2[String, java.util.Map[String, Long]]): Unit = other match {
case o: AccumulatorV2[String, java.util.Map[String, Long]] => for ((k, v) <- o.value) {
val oldValue: java.lang.Long = _map.get(k)
if (oldValue != null) {
_map.put(k, oldValue.longValue() + v)
} else {
_map.put(k, v)
}
println(s"key: ${k} oldValue: ${oldValue} newValue: ${v} finalValue: ${_map.get(k)}")
}
case _ => throw new UnsupportedOperationException(
s"Cannot merge ${this.getClass.getName} with ${other.getClass.getName}")
}
After changing o's type it finally works, but I am still confused about how the first method behaves.
Your custom accumulator has a mistake in its merge function. Here is the corrected version:
val oldValue: java.lang.Long = _map.get(k)
if (oldValue != null) {
_map.put(k, oldValue.longValue() + v)
} else {
_map.put(k, v)
}
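The same read-then-sum logic can be tried without a Spark cluster. The following is only a sketch on plain java.util maps; CountMerge and mergeInto are hypothetical names, not part of the Spark API:

```scala
import java.util.{Map => JMap}

// Hypothetical helper that mirrors the corrected merge logic outside Spark.
object CountMerge {
  // Adds every count from `other` into `target`, summing on key collisions.
  def mergeInto(target: JMap[String, java.lang.Long],
                other: JMap[String, java.lang.Long]): Unit = {
    val it = other.entrySet().iterator()
    while (it.hasNext) {
      val e = it.next()
      val old = target.get(e.getKey) // null when the key is absent
      val sum =
        if (old == null) e.getValue.longValue
        else old.longValue + e.getValue.longValue
      target.put(e.getKey, java.lang.Long.valueOf(sum))
    }
  }
}
```

Reading the old value with get before calling put is what keeps the existing count from being overwritten.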

What's the point of receiving a `PrefixMap` and returning an empty `PrefixMap`?

Here is an example from the stairway book:
object Example1 {
import collection._
class PrefixMap[T]
extends mutable.Map[String, T]
with mutable.MapLike[String, T, PrefixMap[T]] {
var suffixes: immutable.Map[Char, PrefixMap[T]] = Map.empty
var value: Option[T] = None
def get(s: String): Option[T] = {
// base case, you are at the root
if (s.isEmpty) value
// recursive
else suffixes get (s(0)) flatMap (_.get(s substring 1))
}
def iterator: Iterator[(String, T)] = {
(for (v <- value.iterator) yield ("", v)) ++
(for ((chr, m) <- suffixes.iterator; (s, v) <- m.iterator) yield (chr +: s, v))
}
def +=(kv: (String, T)): this.type = {
update(kv._1, kv._2)
this
}
def -=(key: String): this.type = {
remove(key)
this
}
def withPrefix(s: String): PrefixMap[T] = {
if (s.isEmpty) this
else {
val leading = s(0)
suffixes get leading match {
case None => {
// key does not exist, create it
suffixes = suffixes + (leading -> empty)
}
case _ =>
}
// recursion
suffixes(leading) withPrefix (s substring 1)
}
}
override def update(s: String, elem: T) = {
withPrefix(s).value = Some(elem)
}
override def remove(key: String): Option[T] = {
if (key.isEmpty) {
// base case. you are at the root
val prev = value
value = None
prev
} else {
// recursive
suffixes get key(0) flatMap (_.remove(key substring 1))
}
}
override def empty = PrefixMap.empty
}
import collection.mutable.{Builder, MapBuilder}
import collection.generic.CanBuildFrom
object PrefixMap {
def empty[T] = new PrefixMap[T]
def apply[T](kvs: (String, T)*): PrefixMap[T] = {
val m: PrefixMap[T] = empty
for(kv <- kvs) m += kv
m
}
def newBuilder[T]: Builder[(String, T), PrefixMap[T]] = {
new mutable.MapBuilder[String, T, PrefixMap[T]](empty)
}
implicit def canBuildFrom[T]: CanBuildFrom[PrefixMap[_], (String, T), PrefixMap[T]] = {
new CanBuildFrom[PrefixMap[_], (String, T), PrefixMap[T]] {
def apply(from: PrefixMap[_]) = newBuilder[T]
def apply() = newBuilder[T]
}
}
}
}
I don't understand this line:
def apply(from: PrefixMap[_]) = newBuilder[T]
What's the point of receiving a PrefixMap and returning an empty PrefixMap?
Read a little more of the official docs.
In short: a CanBuildFrom (CBF) can return a builder that knows the properties of the whole source collection.
For example, it could preinitialize a buffer of the needed size to collect entries.
It could even reuse parts of a collection whose type and structure are known.
By default, though, in many cases it just collects element by element into an empty collection. That is what happens in your case.
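To make the "knowledge of the source collection" point concrete, here is a small sketch. MiniBuildFrom is a hypothetical stand-in reduced to the two apply methods, not the real CanBuildFrom, and SizedVectorBuildFrom is likewise made up:

```scala
import scala.collection.mutable.Builder

// Hypothetical stand-in for CanBuildFrom, reduced to its two apply methods.
trait MiniBuildFrom[-From, -Elem, +To] {
  def apply(from: From): Builder[Elem, To]
  def apply(): Builder[Elem, To]
}

object SizedVectorBuildFrom extends MiniBuildFrom[Seq[Any], Int, Vector[Int]] {
  // Without a source collection there is nothing to learn from.
  def apply(): Builder[Int, Vector[Int]] = Vector.newBuilder[Int]

  // With a source collection, its size can be passed on as a hint.
  // A builder is free to ignore the hint.
  def apply(from: Seq[Any]): Builder[Int, Vector[Int]] = {
    val b = Vector.newBuilder[Int]
    b.sizeHint(from.size)
    b
  }
}
```

The PrefixMap version ignores its argument because it has nothing useful to take from the source map, which is why it looks like it receives a PrefixMap only to return an empty builder.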

Scala unique sequence

In Scala (2.10), I'd like an immutable SeqLike collection (supporting indexing) which offers a SetLike interface to the user, and won't allow duplicate elements. Ideally, this would implement both SetLike and SeqLike, but this isn't possible, so I have to pick one. My first idea was as follows:
sealed class IndexableSet[A] ( private val values : Seq[A] )
extends Set[A]
with SetLike[A,IndexableSet[A]]
{
override def empty : IndexableSet[A] = new IndexableSet[A]( Seq[A]() )
def + ( elem : A ) : IndexableSet[A] = values.contains( elem ) match
{
case true => this
case false => new IndexableSet[A]( values :+ elem )
}
def - ( elem : A ) : IndexableSet[A] = values.contains( elem ) match
{
case true => new IndexableSet[A]( values.filter( _ != elem ) )
case false => this
}
def iterator = values.iterator
def contains( elem : A ) = values.contains( elem )
def apply( index : Int ) = values( index )
def length : Int = values.size
def contents : Seq[A] = values
}
This exposes a suitable interface, but not sortability (no sortBy or sorted).
I'm wondering, therefore, whether to change my implementation to something which implements Seq and SeqLike instead, and fakes the Set interface:
sealed class UniqueSeq[A] private ( private val values : IndexedSeq[A] )
extends SeqLike[A,UniqueSeq[A]]
with Seq[A]
with GenericTraversableTemplate[A,UniqueSeq]
{
def apply( idx : Int ) : A = values( idx )
def iterator = values.iterator
def length = values.length
override def companion: GenericCompanion[UniqueSeq] = new GenericCompanion[UniqueSeq]()
{
def newBuilder[A]: Builder[A, UniqueSeq[A]] = new Builder[A, UniqueSeq[A]]
{
val elems = new ArrayBuffer[A]()
def +=(a:A) = { elems += a; this }
def clear() { elems.clear }
def result(): UniqueSeq[A] = new UniqueSeq[A](elems)
}
}
def + ( elem : A ) : UniqueSeq[A] = values.contains( elem ) match
{
case true => this
case false => new UniqueSeq[A]( values :+ elem )
}
def - ( elem : A ) : UniqueSeq[A] = values.contains( elem ) match
{
case true => new UniqueSeq[A]( values.filter( _ != elem ) )
case false => this
}
}
I'm not sure which is better - or whether there's another way. I know there are things like TreeSet but the SortedSet trait doesn't offer the critical indexability.
So the questions are:
Is there a clear winner between these two implementations?
Is there another way, which is better, in the standard collections?
I would create a SeqLike backed by a Seq. Override any functions which add elements to add the element to the underlying Seq followed by a call to distinct to eliminate duplicates.
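A stripped-down sketch of that idea follows. It does not extend SeqLike, to keep it short, and DistinctSeq is a hypothetical name:

```scala
// Hypothetical wrapper: a Vector that is de-duplicated after every addition.
final class DistinctSeq[A] private (val items: Vector[A]) {
  // distinct keeps the first occurrence, so insertion order is preserved.
  def :+(elem: A): DistinctSeq[A] = new DistinctSeq((items :+ elem).distinct)
  def ++(elems: Iterable[A]): DistinctSeq[A] = new DistinctSeq((items ++ elems).distinct)
  def apply(i: Int): A = items(i)
  def contains(elem: A): Boolean = items.contains(elem)
  def length: Int = items.length
}

object DistinctSeq {
  def apply[A](elems: A*): DistinctSeq[A] = new DistinctSeq(elems.toVector.distinct)
}
```

Calling distinct on every addition costs time linear in the size of the sequence, which may matter for large collections.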
I'd prefer the second solution, possibly modified to maintain both a Set and a Seq internally:
sealed class UniqueSeq[A] private ( values : IndexedSeq[A], valueSet : Set[A] )
extends SeqLike[A,UniqueSeq[A]]
with Seq[A]
with GenericTraversableTemplate[A,UniqueSeq]
{
def apply( idx : Int ) : A = values( idx )
def iterator = values.iterator
def length = values.length
override def companion: GenericCompanion[UniqueSeq] = new GenericCompanion[UniqueSeq]()
{
def newBuilder[A]: Builder[A, UniqueSeq[A]] = new Builder[A, UniqueSeq[A]]
{
val elems = new ArrayBuffer[A]()
def +=(a:A) = { elems += a; this }
def clear() { elems.clear }
def result(): UniqueSeq[A] = { val unique = elems.distinct.toIndexedSeq; new UniqueSeq[A](unique, unique.toSet) }
}
}
def + ( elem : A ) : UniqueSeq[A] = valueSet.contains( elem ) match
{
case true => this
case false => new UniqueSeq[A]( values :+ elem, valueSet + elem )
}
def - ( elem : A ) : UniqueSeq[A] = valueSet.contains( elem ) match
{
case true => new UniqueSeq[A]( values.filter( _ != elem ), valueSet - elem )
case false => this
}
}
The following code works in my application. It mixes your code with code from this question:
Create a custom scala collection where map defaults to returning the custom collection?
class Unique[A] private (list: Vector[A], set: Set[A]) extends Traversable[A]
with TraversableLike[A, Unique[A]]
with GenericTraversableTemplate[A, Unique]
{
def apply(index: Int): A = list(index)
override def companion: GenericCompanion[Unique] = Unique
def foreach[U](f: A => U) { list foreach f }
override def seq = list
def +(elem: A): Unique[A] = {
if (set.contains(elem)) this
else new Unique(list :+ elem, set + elem)
}
def -(elem: A): Unique[A] = {
if (set.contains(elem)) new Unique(list.filter(_ != elem), set - elem)
else this
}
def --(elems: Traversable[A]): Unique[A] = {
val set = elems.toSet
val values2 = list.filterNot(set.contains)
new Unique(values2, values2.toSet)
}
def ++(elems: Traversable[A]): Unique[A] = {
val list2 = elems.filterNot(set.contains)
val values2 = list ++ list2
new Unique(values2, values2.toSet)
}
def contains(elem: A): Boolean = set.contains(elem)
def zipWithIndex: Traversable[(A, Int)] = list.zipWithIndex
}
object Unique extends TraversableFactory[Unique] {
def newBuilder[A] = new UniqueBuilder[A]
implicit def canBuildFrom[A]: CanBuildFrom[Coll, A, Unique[A]] = {
new CanBuildFrom[Coll, A, Unique[A]] {
def apply(): Builder[A, Unique[A]] = new UniqueBuilder()
def apply(from: Coll): Builder[A, Unique[A]] = apply()
}
}
class UniqueBuilder[A] extends Builder[A, Unique[A]] {
private val list = Vector.newBuilder[A]
private val set = new HashSet[A]()
def += (elem: A): this.type = {
if (!set.contains(elem)) {
list += elem
set += elem
}
this
}
def clear() {
list.clear()
set.clear()
}
def result(): Unique[A] = new Unique(list.result, set.toSet)
}
}

How to get a name of a class member?

I want to be able to do something like this:
prepare form:
val formDescription = formBuilder(_.textField[User](_.firstName)
.textField[User](_.lastName)
).build
showForm(formDescription)
extract data from user filled form, using User:
//contains data of a form submitted by a user:
val formData: Map[String, String] = getFormData
val newUser = User(id = randomUuid, firstName = formData.extract[User](_.firstName))
One solution I see is to use a dynamic proxy that extends the provided class and remembers what was invoked on it:
def getFieldName[T:Manifest](foo: T => Any) = {
val clazz = implicitly[Manifest[T]].erasure
val proxy = createDynamicProxy(clazz)
foo(proxy)
proxy.lastInvokedMethodName
}
Is there a better way to do it? Is there any lib that implements it already?
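For reference, the proxy idea can be sketched with the JDK's built-in java.lang.reflect.Proxy, with the limitation that it only works when T is a trait (a Java interface) and the accessed member returns a reference type. UserFields and MemberNames are hypothetical names used only for illustration:

```scala
import java.lang.reflect.{InvocationHandler, Method, Proxy}
import scala.reflect.ClassTag

// Hypothetical interface used only for this sketch.
trait UserFields {
  def firstName: String
  def lastName: String
}

object MemberNames {
  // Records the name of the last method invoked on a proxy of T.
  // Limitation: T must be an interface, and the handler returns null,
  // so members with primitive return types would fail.
  def memberName[T <: AnyRef](accessor: T => Any)(implicit ct: ClassTag[T]): String = {
    var last: String = null
    val handler = new InvocationHandler {
      def invoke(proxy: AnyRef, method: Method, args: Array[AnyRef]): AnyRef = {
        last = method.getName
        null
      }
    }
    val iface = ct.runtimeClass
    val proxy = Proxy.newProxyInstance(iface.getClassLoader, Array[Class[_]](iface), handler)
    accessor(proxy.asInstanceOf[T])
    last
  }
}
```

A call such as MemberNames.memberName[UserFields](_.firstName) then yields the string "firstName". Proxying concrete classes needs a bytecode library instead.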
This reflective approach takes a case class and invokes its companion apply, calling getField and fetching default args if the field is not in the data.
import scala.reflect.runtime.{currentMirror => cm, universe => uni}
import uni._
def fromXML(xml: Node): Option[PluginDescription] = {
def extract[A]()(implicit tt: TypeTag[A]): Option[A] = {
// extract one field
def getField(field: String): Option[String] = {
val text = (xml \\ field).text.trim
if (text == "") None else Some(text)
}
val apply = uni.newTermName("apply")
val module = uni.typeOf[A].typeSymbol.companionSymbol.asModule
val ts = module.moduleClass.typeSignature
val m = (ts member apply).asMethod
val im = cm reflect (cm reflectModule module).instance
val mm = im reflectMethod m
def getDefault(i: Int): Option[Any] = {
val n = uni.newTermName("apply$default$" + (i+1))
val m = ts member n
if (m == NoSymbol) None
else Some((im reflectMethod m.asMethod)())
}
def extractArgs(pss: List[List[Symbol]]): List[Option[Any]] =
pss.flatten.zipWithIndex map (p => getField(p._1.name.encoded) orElse getDefault(p._2))
val args = extractArgs(m.paramss)
if (args exists (!_.isDefined)) None
else Some(mm(args.flatten: _*).asInstanceOf[A])
}
// check the top-level tag
xml match {
case <plugin>{_*}</plugin> => extract[PluginDescription]()
case _ => None
}
}
The idea was to do something like:
case class User(id: Int = randomUuid, firstName: String, lastName: String)
val user = extract[User]()
That's my own solution:
package utils
import javassist.util.proxy.{MethodHandler, MethodFilter, ProxyFactory}
import org.specs2.mutable._
import javassist.util.proxy.Proxy
import java.lang.reflect.{Constructor, Method}
class DynamicProxyTest extends Specification with MemberNameGetter {
"Dynamic proxy" should {
"extract field name" in {
memberName[TestClass](_.a) must ===("a")
memberName[TestClass](_.i) must ===("i")
memberName[TestClass](_.b) must ===("b")
memberName[TestClass](_.variable) must ===("variable")
memberName[TestClass](_.value) must ===("value")
memberName[TestClass](_.method) must ===("method")
}
}
}
trait MemberNameGetter {
def memberName[T: Manifest](foo: T => Any) = {
val mf = manifest[T]
val clazz = mf.erasure
val proxyFactory = new ProxyFactory
proxyFactory.setSuperclass(clazz)
proxyFactory.setFilter(new MethodFilter {
def isHandled(p1: Method) = true
})
val newClass = proxyFactory.createClass()
var lastInvokedMethod: String = null
val mh = new MethodHandler {
def invoke(p1: Any, p2: Method, p3: Method, p4: Array[AnyRef]) = {
lastInvokedMethod = p2.getName
p3.invoke(p1, p4: _*)
}
}
val constructor = defaultConstructor(newClass)
val parameters = defaultConstructorParameters(constructor)
// val proxy = constructor.newInstance("dsf", new Integer(0))
val proxy2 = constructor.newInstance(parameters: _*)
proxy2.asInstanceOf[Proxy].setHandler(mh)
foo(proxy2.asInstanceOf[T])
lastInvokedMethod
}
private def defaultConstructor(c: Class[_]) = c.getConstructors.head
private def defaultConstructorParameters(constructor: Constructor[_]) = {
val parameterTypes = constructor.getParameterTypes
// primitives cannot be null, so give each one a zero value
parameterTypes.map{
case Integer.TYPE => Integer.valueOf(0)
case java.lang.Double.TYPE => java.lang.Double.valueOf(0)
case java.lang.Long.TYPE => java.lang.Long.valueOf(0)
case java.lang.Boolean.TYPE => java.lang.Boolean.FALSE
case _ => null
}
}
}
case class TestClass(a: String, i: Int, b: Boolean) {
var variable = "asdf"
val value = "asdfasdfasd"
def method = "method"
}

Make a lazy var in Scala

Scala does not permit lazy vars, only lazy vals. That makes sense.
But I've bumped into a use case where I'd like a similar capability: I need a lazy variable holder. It may be assigned a value that is calculated by a time-consuming algorithm, but it may later be reassigned to another value, and in that case I'd like the first calculation not to run at all.
Example, assuming there is some magic var definition:
lazy var value : Int = _
val calc1 : () => Int = ... // some calculation
val calc2 : () => Int = ... // other calculation
value = calc1
value = calc2
val result : Int = value + 1
This piece of code should only call calc2(), not calc1().
I have an idea of how to write this container with implicit conversions and a special container class. I'm curious whether there is any built-in Scala feature that would spare me from writing unnecessary code.
This works:
var value: () => Int = _
val calc1: () => Int = () => { println("calc1"); 47 }
val calc2: () => Int = () => { println("calc2"); 11 }
value = calc1
value = calc2
implicit def invokeUnitToInt(f: () => Int): Int = f()
var result = value + 1 /* prints "calc2" */
Having the implicit worries me slightly because it is widely applicable, which might lead to unexpected applications or compiler errors about ambiguous implicits.
Another solution is using a wrapper object with a setter and a getter method that implement the lazy behaviour for you:
lazy val calc3 = { println("calc3"); 3 }
lazy val calc4 = { println("calc4"); 4 }
class LazyVar[A] {
private var f: () => A = _
def value: A = f() /* getter */
def value_=(f: => A) = this.f = () => f /* setter */
}
var lv = new LazyVar[Int]
lv.value = calc3
lv.value = calc4
var result = lv.value + 1 /* prints "calc4" */
You could simply do the compiler's work yourself and do something like this:
class Foo {
private[this] var _field: String = _
def field = {
if(_field == null) {
_field = "foo" // calc here
}
_field
}
def field_=(str: String) {
_field = str
}
}
scala> val x = new Foo
x: Foo = Foo@11ba3c1f
scala> x.field
res2: String = foo
scala> x.field = "bar"
x.field: String = bar
scala> x.field
res3: String = bar
edit: This is not thread safe in its current form!
edit2:
The difference from mhs's second solution is that the calculation happens only once, whereas in mhs's solution it is run again on every access.
I've summarized all the advice given into a custom container:
object LazyVar {
class NotInitialized extends Exception
case class Update[+T]( update : () => T )
implicit def implicitUpdate[T](update: () => T) : Update[T] = Update(update)
final class LazyVar[T] (initial : Option[Update[T]] = None ){
private[this] var cache : Option[T] = None
private[this] var data : Option[Update[T]] = initial
def put(update : Update[T]) : LazyVar[T] = this.synchronized {
data = Some(update)
this
}
def set(value : T) : LazyVar[T] = this.synchronized {
data = None
cache = Some(value)
this
}
def get : T = this.synchronized { data match {
case None => cache.getOrElse(throw new NotInitialized)
case Some(Update(update)) => {
val res = update()
cache = Some(res)
res
}
} }
def := (update : Update[T]) : LazyVar[T] = put(update)
def := (value : T) : LazyVar[T] = set(value)
def apply() : T = get
}
object LazyVar {
def apply[T]( initial : Option[Update[T]] = None ) = new LazyVar[T](initial)
def apply[T]( value : T) = {
val res = new LazyVar[T]()
res.set(value)
res
}
}
implicit def getLazy[T](lazyvar : LazyVar[T]) : T = lazyvar.get
object Test {
val getInt1 : () => Int = () => {
print("GetInt1")
1
}
val getInt2 : () => Int = () => {
print("GetInt2")
2
}
val li : LazyVar[Int] = LazyVar()
li := getInt1
li := getInt2
val si : Int = li
}
}
var value: () => Int = _
lazy val calc1 = {println("some calculation"); 1}
lazy val calc2 = {println("other calculation"); 2}
value = () => calc1
value = () => calc2
scala> val result : Int = value() + 1
other calculation
result: Int = 3
If you want to keep on using a lazy val (it can be used in path-dependent types and it's thread safe), you can add a layer of indirection in its definition (previous solutions use vars as an indirection):
lazy val value: Int = thunk()
@volatile private var thunk: () => Int = ..
thunk = ...
thunk = ...
You could encapsulate this in a class if you wanted to reuse it, of course.
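One way such a reusable class might look is sketched below. LazyCell is a hypothetical name, and unlike the lazy val version above it allows reassignment even after the value has been read:

```scala
// Hypothetical reusable holder: reassignable, evaluated at most once per assignment.
final class LazyCell[A](initial: => A) {
  private var thunk: () => A = () => initial
  private var cached: Option[A] = None

  // Replace the pending computation and forget any cached result.
  def set(next: => A): Unit = synchronized {
    thunk = () => next
    cached = None
  }

  // Run the current computation on first access, then reuse the result.
  def get: A = synchronized {
    cached.getOrElse {
      val v = thunk()
      cached = Some(v)
      v
    }
  }
}
```

A computation that is replaced by set before the first get is never evaluated, which is the behaviour the question asks for.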