(Scala) Creating a data structure for English words

I'm currently working on a problem that involves English words. I'm fairly new to functional programming and I want to write the best code I can. It's a really simple question, but I just want to get this right!
How do I create a data structure for English words? Plain strings are not good enough: a word cannot contain digits or other invalid characters, but a String allows them.
I'm thinking of making a case class that overrides its apply method (or constructor; I come from an OOP background, so I still mix these up) so that it returns Either[String, EnglishWord], where Left would carry an error message such as "Found a number in your word". Am I thinking correctly? Any suggestions?
Thank you so much!
Cheers =)

You should override the apply method, since you can't make the constructor return anything other than EnglishWord. If you want, you can make the constructor private so that people have to use the apply method.
You can use Try[EnglishWord] instead of Either[String, EnglishWord].
For this sort of thing (wrapping a type in another type for type safety), you may want a value class.
And, of course, make sure you allow corner-case words like "you're", "résumé", "façade", and, as Andrey pointed out, words with digits.
Here is an example:
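import scala.util.{ Try, Success, Failure }

// Private constructor: callers must go through EnglishWord.apply.
case class EnglishWord private(text: String) extends AnyVal

object EnglishWord {
  // Validating factory: returns a Failure instead of building an invalid word.
  def apply(text: String): Try[EnglishWord] = {
    if (isValid(text)) {
      Success(new EnglishWord(text))
    } else {
      Failure(new IllegalArgumentException("Invalid word: " + text))
    }
  }

  // Left unimplemented: the actual validation rule goes here.
  def isValid(s: String): Boolean = ???
}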
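For the Either[String, EnglishWord] variant that the question asks about, a minimal sketch could look like the following. The validation rule (letters, apostrophes and hyphens only), the fromString name and the error messages are assumptions made for illustration, not part of the answer above. A plain class is used instead of a value class only to keep the snippet self-contained.

```scala
// Hypothetical sketch: the rule and the messages are assumptions, adjust as needed.
final class EnglishWord private (val text: String) {
  override def toString: String = text
}

object EnglishWord {
  // Assumed rule: non-empty, and every character is a letter,
  // an apostrophe or a hyphen (so "you're" and "résumé" pass, "w0rd" does not).
  def fromString(text: String): Either[String, EnglishWord] =
    if (text.isEmpty) Left("Word is empty")
    else
      text.find(c => !(c.isLetter || c == '\'' || c == '-')) match {
        case Some(bad) => Left(s"Found invalid character '$bad' in word: $text")
        case None      => Right(new EnglishWord(text))
      }
}
```

With this shape the caller gets the reason for a rejection as a value in Left rather than as an exception. If words with digits should be accepted, as the answer suggests, the predicate is the only thing that has to change.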

Related

How to design abstract classes if methods don't have the exact same signature?

This is a "real life" OO design question. I am working with Scala, and interested in specific Scala solutions, but I'm definitely open to hear generic thoughts.
I am implementing a branch-and-bound combinatorial optimization program. The algorithm itself is pretty easy to implement. For each different problem we just need to implement a class that contains information about what are the allowed neighbor states for the search, how to calculate the cost, and then potentially what is the lower bound, etc...
I also want to be able to experiment with different data structures. For instance, one way to store a logic formula is using a simple list of lists of integers. This represents a set of clauses, each integer a literal. We can have a much better performance though if we do something like a "two-literal watch list", and store some extra information about the formula in general.
That all would mean something like this
object BnBSolver[S<:BnBState]{
  def solve(states: Seq[S], best_state: Option[S]): Option[S] =
    if (states.isEmpty) best_state
    else {
      val next_state = states.head
      /* compare to best state, etc... */
      val new_states = new_branches ++ states.tail
      solve(new_states, new_best_state)
    }
}
class BnBState[F<:Formula](clauses:F, assigned_variables) {
  def cost: Int
  def branches: Seq[BnBState] = {
    val ll = clauses.pick_variable
    List(
      BnBState(clauses.assign(ll), ll :: assigned_variables),
      BnBState(clauses.assign(-ll), -ll :: assigned_variables)
    )
  }
}
case class Formula[F<:Formula[F]](clauses:List[List[Int]]) {
  def assign(ll: Int): F =
    Formula(clauses.filterNot(_ contains ll)
      .map(_.filterNot(_ == -ll)))
}
Hopefully this is not too crazy, wrong or confusing. The whole issue here is that this assign method from a formula would usually take just the current literal that is going to be assigned. In the case of two-literal watch lists, though, you are doing some lazy thing that requires you to know later what literals have been previously assigned.
One way to fix this is you just keep this list of previously assigned literals in the data structure, maybe as a private thing. Make it a self-standing lazy data structure. But this list of the previous assignments is actually something that may be naturally available by whoever is using the Formula class. So it makes sense to allow whoever is using it to just provide the list every time you assign, if necessary.
The problem here is that we cannot now have an abstract Formula class that just declares an assign(ll: Int): Formula. In the normal case this is OK, but for a two-literal watch list Formula the method is actually assign(literal: Int, previous_assignments: Seq[Int]).
From the point of view of the classes using it, it is kind of OK. But then how do we write generic code that can take all these different versions of Formula? Because of the drastic signature change, it cannot simply be an abstract method. We could maybe force the user to always provide the full assigned variables, but then this is a kind of a lie too. What to do?
The idea is the watch list class just becomes a kind of regular assign(Int) class if I write down some kind of adapter method that knows where to take the previous assignments from... I am thinking maybe with implicit we can cook something up.

I'll try to make my answer a bit general, since I'm not convinced I'm completely following what you are trying to do. Anyway...
Generally, the first thought should be to accept a common super-class as a parameter. Obviously that won't work with Int and Seq[Int].
You could just have two methods; have one call the other. For instance just wrap an Int into a Seq[Int] with one element and pass that to the other method.
You can also wrap the parameter in some custom class, e.g.
class Assignment {
  ...
}
def int2Assignment(n: Int): Assignment = ...
def seq2Assignment(s: Seq[Int]): Assignment = ...
case class Formula[F<:Formula[F]](clauses:List[List[Int]]) {
  def assign(ll: Assignment): F = ...
}
And of course you would have the option to make those conversion methods implicit so that callers just have to import them, not call them explicitly.
Lastly, you could do this with a typeclass:
trait Assigner[A] {
  ...
}
implicit val intAssigner = new Assigner[Int] {
  ...
}
implicit val seqAssigner = new Assigner[Seq[Int]] {
  ...
}
case class Formula[F<:Formula[F]](clauses:List[List[Int]]) {
  def assign[A : Assigner](ll: A): F = ...
}
You could also make that type parameter at the class level:
case class Formula[A:Assigner,F<:Formula[A,F]](clauses:List[List[Int]]) {
  def assign(ll: A): F = ...
}
Which one of these paths is best is up to preference and how it might fit in with the rest of the code.
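To make the typeclass option concrete, here is one way the elided parts might be filled in. Everything beyond the Assigner name is an assumption for illustration: the literal and previous methods, the convention that a Seq[Int] carries the new literal first and the earlier assignments after it, and the SimpleFormula class, which ignores the history the way the plain list-of-lists formula from the question would.

```scala
// Hypothetical typeclass: extracts the literal to assign and the
// previously assigned literals from whatever the caller passes in.
trait Assigner[A] {
  def literal(a: A): Int
  def previous(a: A): Seq[Int]
}

object Assigner {
  // A bare Int carries no history.
  implicit val intAssigner: Assigner[Int] = new Assigner[Int] {
    def literal(a: Int): Int = a
    def previous(a: Int): Seq[Int] = Seq.empty
  }

  // Assumed convention: head is the new literal, tail the earlier assignments.
  implicit val seqAssigner: Assigner[Seq[Int]] = new Assigner[Seq[Int]] {
    def literal(a: Seq[Int]): Int = a.head
    def previous(a: Seq[Int]): Seq[Int] = a.tail
  }
}

// Simple formula: only needs the literal, so the history is never read.
final case class SimpleFormula(clauses: List[List[Int]]) {
  def assign[A](a: A)(implicit ev: Assigner[A]): SimpleFormula = {
    val ll = ev.literal(a)
    // Drop satisfied clauses, then remove the negated literal from the rest.
    SimpleFormula(clauses.filterNot(_.contains(ll)).map(_.filterNot(_ == -ll)))
  }
}
```

A watch-list formula could use the same assign signature and additionally call ev.previous(a), so generic solver code would not need to know which representation it is working with.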

Map an instance using function in Scala

Say I have a local method/function
def withExclamation(string: String) = string + "!"
Is there a way in Scala to transform an instance by supplying this method? Say I want to append an exclamation mark to a string. Something like:
val greeting = "Hello"
val loudGreeting = greeting.applyFunction(withExclamation) //result: "Hello!"
I would like to be able to invoke (local) functions when writing a chain transformation on an instance.
EDIT: Multiple answers show how to program this capability, so it seems that this feature is not present on an arbitrary class. To me this feature seems incredibly powerful. Consider the case in Java where I want to execute a number of operations on a String:
appendExclamationMark(" Hello ".trim().toUpperCase()); //"HELLO!"
The order of operations is not the same as how they read. The last operation, appendExclamationMark is the first word that appears. Currently in Java I would sometimes do:
Function.<String>identity()
    .andThen(String::trim)
    .andThen(String::toUpperCase)
    .andThen(this::appendExclamationMark)
    .apply(" Hello "); //"HELLO!"
Which reads better in terms of expressing a chain of operations on an instance, but also contains a lot of noise, and it is not intuitive to have the String instance at the last line. I would want to write:
" Hello "
    .applyFunction(String::trim)
    .applyFunction(String::toUpperCase)
    .applyFunction(this::withExclamation); //"HELLO!"
Obviously the name of the applyFunction function can be anything (shorter please). I thought backwards compatibility was the sole reason Java's Object does not have this.
Is there any technical reason why this was not added on, say, the Any or AnyRef classes?

You can do this with an implicit class which provides a way to extend an existing type with your own methods:
object StringOps {
  implicit class RichString(val s: String) extends AnyVal {
    def withExclamation: String = s"$s!"
  }

  def main(args: Array[String]): Unit = {
    val m = "hello"
    println(m.withExclamation)
  }
}
Yields:
hello!

If you want to apply any functions (anonymous, converted from methods, etc.) in this way, you can use a variation on Yuval Itzchakov's answer:
object Combinators {
  // Adds applyFunction to every type A that is in scope of the import.
  implicit class Combinators[A](val x: A) {
    def applyFunction[B](f: A => B): B = f(x)
  }
}
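Assuming Scala 2.13 or later, the standard library already ships this combinator as pipe in scala.util.chaining, so the implicit class does not have to be written by hand. A short sketch (the PipeDemo object and the loudGreeting name are only there to make the snippet self-contained):

```scala
import scala.util.chaining._

object PipeDemo {
  def withExclamation(s: String): String = s + "!"

  // Each step receives the result of the previous one, in reading order.
  val loudGreeting: String =
    " Hello "
      .pipe(_.trim)
      .pipe(_.toUpperCase)
      .pipe(withExclamation)
}
```

The same import also provides tap, which applies a side-effecting function and returns the original value unchanged.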

A while after asking this question, I noticed that Kotlin has this built in:
inline fun <T, R> T.let(block: (T) -> R): R
Calls the specified function block with this value as its argument and returns
its result.
A lot more, quite useful variations of the above function are provided on all types, like with, also, apply, etc.

Input validation with the scala type system

Having played a bit with Scala now, I question myself how you should do input validation in Scala.
This is what I have seen many times:
def doSomethingWithPositiveIntegers(i: Int) = {
require(i>0)
//do something
}
To bring matters to a head, it feels like doing this in Java:
void doSomething(Object o) {
    if (!(o instanceof Integer))
        throw new IllegalArgumentException();
}
There, you first accept more than you are willing to accept, and then introduce some "guard" that only lets the "good ones" in. To be exact, you'd need these guards in every function that does something with positive integers, and if you later decided, for example, to include zero, you'd need to change every function. Of course you can move the check into another function, but you'd still always need to remember to call the correct function, and it might not survive type refactorings etc. That does not sound like something I'd want. I was thinking about pushing this validation code into the data type itself, like this:
import scala.util.Try
object MyStuff {
  implicit class PositiveInt(val value: Int) {
    require(value>0)
  }
  implicit def positiveInt2Int(positiveInt: PositiveInt): Int = positiveInt.value
}
import MyStuff._
val i: MyStuff.PositiveInt = 5
val j: Int = i+5
println(i) //Main$$anon$1$MyStuff$PositiveInt@3a16cef5
println(j) //10
val sum = i + i
println(sum) //10
def addOne(i: MyStuff.PositiveInt) = i + 1
println(Try(addOne(-5))) //Failure(java.lang.IllegalArgumentException: requirement failed)
println(Try(addOne(5))) //Success(6)
Then I have a type PositiveInt that can only contain positive integers, and I can use it (almost) everywhere like an Int. Now, my API defines what I am willing to take - this is what I'd like to have! The function itself has nothing to validate, because it knows it can only get valid positive integers - they cannot be constructed without validation. You'd have to run your validation only once - upon creation of the type! Think of other cases, where validation might be more expensive (validate an email address or URL, or that a number is a prime).
Advantages:
Your API tells you directly what kind of objects you accept (no more do(String, String, String) what could be do(User, Email, Password))
Your objects get validated "automatically"
The compiler can help you reduce the risk of bugs. Some things that you'd before see on run time can be seen on compile time. Example:
def makeNegative(i: PositiveInt): NegativeInt = -i
addOne(makeNegative(1)) //will create a compile-time error!
However, there are some drawbacks:
Unfortunately, you break many functions that work due to implicit conversions. E.g., this will not work:
val i: PositiveInt = 5
val range = i to 10 //error: value to is not a member of this.MyStuff.PositiveInt
val range = i.value to 10 //will work
It could be solved if you could extend Int and just add the require, because then every PositiveInt would be an Int (which really is the case!), but Int is final :). You could add implicit conversions for all the cases you need, but that would be pretty verbose.
More objects are created. Maybe one can lower that burden with value classes (can anybody show me how?).
These are my questions:
Am I missing something? I have not seen anybody do this before, and I wonder why. Maybe there are good reasons for not doing this.
Is there a better way to integrate validation into my types?
How can I avoid the problems with the need of duplicate implicits (drawback #1)? Maybe some kind of macro that looks at other implicits in scope and adds implicits at compile time for me (Example: implicit conversion from PositiveInt to RichInt)?

You can create a class with a private constructor that is visible to its companion object, and put a factory method in the companion, e.g.
class PositiveInt private[PositiveInt](val i: Int)
object PositiveInt {
  def apply(i: Int): Option[PositiveInt] = if(i > 0) Some(new PositiveInt(i)) else None
}
Clients cannot create instances of PositiveInt directly, so they have to go through the apply method, which performs the validation and returns an instance only if the input value is valid.
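As a rough sketch of how calling code might look with such a factory: the from name, the PositiveIntDemo object and the Long result of addOne (chosen so that Int.MaxValue + 1 does not overflow) are assumptions for illustration, not part of the answer.

```scala
// Hypothetical variant of the class above with a named factory.
final class PositiveInt private (val value: Int) {
  override def toString: String = s"PositiveInt($value)"
}

object PositiveInt {
  // The only way to obtain an instance: None for zero or negative input.
  def from(i: Int): Option[PositiveInt] =
    if (i > 0) Some(new PositiveInt(i)) else None
}

object PositiveIntDemo {
  // No require needed here: the parameter type already guarantees i > 0.
  def addOne(p: PositiveInt): Long = p.value.toLong + 1

  val ok: Option[Long]       = PositiveInt.from(5).map(addOne)
  val rejected: Option[Long] = PositiveInt.from(-5).map(addOne)
}
```

Validation happens once, at the boundary where the raw Int enters the program, and every function that takes a PositiveInt can rely on it.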

How to pass around string values type-safely?

E.g.:
def updateAsinRecords(asins:Seq[String], recordType:String)
Above method takes a Seq of ASINs and a record type. Both have type of String. There are also other values that are passed around with type String in the application. Needless to say, this being Scala, I'd like to use the type system to my advantage. How to pass around string values in a type safe manner (like below)?
def updateAsinRecords(asins:Seq[ASIN], recordType:RecordType)
                                ^                 ^
I can imagine, having something like this:
trait ASIN { val value:String }
but I'm wondering if there's a better approach...

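There is an excellent bit of new Scala functionality known as Value Classes and Universal Traits. In many cases they impose no runtime overhead (a wrapper instance is still allocated in some situations, for example when a value is stored in a collection), and you can use them to work in a type-safe manner: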
class AnsiString(val inner: String) extends AnyVal
class Record(val inner: String) extends AnyVal
def updateAnsiRecords(ansi: Seq[AnsiString], record: Record)
They were created specifically for this purpose.

You could add thin wrappers with case classes:
case class ASIN(asin: String)
case class RecordType(recordType: String)
def updateAsinRecords(asins: Seq[ASIN], recordType: RecordType) = ???
updateAsinRecords(Vector(ASIN("a"), ASIN("b")), RecordType("c"))
This will not only make your code safer, but it will also make it much easier to read! The other big advantage of this approach is that refactoring later will be much easier. For example, if you decide later that an ASIN should have two fields instead of just one, then you just update the ASIN class definition instead of every place it's used. Likewise, you can do things like add methods to these types whenever you decide you need them.

In addition to the suggestions about using a Value Class / extends AnyVal, you should probably control the construction to allow only valid instances, since presumably not any old string is a valid ASIN. (And... is that an Amazon thing? It rings a bell somehow.)
The best way to do this is to make the constructor private and put a validating factory method in a companion object. The reason for this is that throwing exceptions in constructors (when an attempt is made to instantiate with an invalid argument) can lead to puzzling failure modes (I often see it manifest as a NoClassDefFoundError error when trying to load a different class).
So, in addition to:
case class ASIN private (asin: String) extends AnyVal { /* other stuff */ }
You should include something like this:
object ASIN {
  import scala.util.{Try, Success, Failure}
  // validASIN is not shown in this answer; it is assumed to be defined elsewhere.
  def fromString(str: String): Try[ASIN] =
    if (validASIN(str))
      Success(new ASIN(str))
    else
      Failure(new IllegalArgumentException(s"Invalid ASIN string: $str"))
}

How about a type alias?
type ASIN = String
def update(asins: Seq[ASIN])
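One caveat with the alias approach: a type alias is only another name for String, so the compiler treats the two as interchangeable and will not catch mixed-up arguments. A small sketch of that behaviour (the AliasDemo object, the RecordType alias and the Int return value are made up for illustration):

```scala
object AliasDemo {
  type ASIN = String
  type RecordType = String

  // Hypothetical signature; returns the number of ASINs it was given.
  def update(asins: Seq[ASIN], recordType: RecordType): Int = asins.size

  // Compiles even though the arguments are swapped in meaning:
  // both aliases are plain String as far as the compiler is concerned.
  val n: Int = update(Seq("some-record-type"), "B000123456")
}
```

An alias therefore documents intent but adds no type safety; the wrapper-class approaches in the other answers do.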

How to convert a scala.xml.Elem into a JsExp in Lift?

My latest problem is one that I already have a solution for, it just
feels like there should be a better way.
The problem:
I want to send a PartialUpdate to a comet service, and I need to XML
escape the string, so that when it is used on the client it gets the
correct results. I currently have:
override def lowPriority = {
  case v: List[TaskOwner] => {
    partialUpdate(
      taskOwners.foldLeft(JsCrVar("table", Call("$", Str("table#userTable"))) &
        Call("table.dataTable().fnClearTable"))((r, c) => {
          r & Call("table.dataTable().fnAddData",
            JsArray(Str(Text(c.name).toString),
              Str(Text(c.initials).toString),
              Str(makeDeleteButton(c).toString)),
            Num(0))
        }) & Call("table.dataTable().fnDraw"))
  }
}
And this works fine, however the Str(Text(c.name).toString) feels
quite wordy to me. Now, I can, of course, create a pair of implicit
conversion functions for this, but it seems like this should have
already been done somewhere, I just don't know where. And so, in the
spirit of reducing the code that I have written, I ask if anyone knows
a better way to do this, or if the implicit conversions already exist
somewhere?
I have seen reference to a solution here. However the code is summarized as:
def xmlToJson(xml: Elem): JsExp = {
// code to map XML responses to JSON responses. Handles tricky things like always returning
// js arrays for some fields even if only 1 element appears in the XML
}

A possibly better way of escaping the names is, instead of:
JsArray(Str(Text(c.name).toString),
Str(Text(c.initials).toString),
Str(makeDeleteButton(c).toString))
to use
JsArray(Str(c.name.asHtml.toString),
Str(c.initials.asHtml.toString),
Str(makeDeleteButton(c).toString))
This can be further reduced by using an implicit within the class like:
implicit def elemToJsExp(elem: NodeSeq): JsExp = Str(elem.toString)
…
JsArray(c.name.asHtml,
  c.initials.asHtml,
  makeDeleteButton(c))

I don't know what Str does, but maybe you mean Str(xml.Utility.escape(c.name))?
Well, how about:
def JsStrArray(strings: String*) = JsArray(strings map xml.Utility.escape map Str : _*)
And then just use
JsStrArray(c.name, c.initials, makeDeleteButton(c).toString)
Mmmmm. It might incorrectly escape the result of makeDeleteButton. Anyway, you can play with it and see what looks good.
