How can I dynamically (at runtime) generate a sorted collection in Scala using java.lang.reflect.Type?

Given an array of items, I need to generate a sorted collection in Scala for a java.lang.reflect.Type, but I'm unable to do so. The following snippet might explain better.
def buildList(paramType: Type): SortedSet[_] = {
  val collection = new Array[Any](5)
  for (i <- 0 until 5) {
    collection(i) = new EasyRandom().nextObject(paramType.asInstanceOf[Class[Any]])
  }
  SortedSet(collection: _*)
}
This fails to compile; I get the error "No implicits found for parameter ord: Ordering[Any]". I'm able to work around this if I swap to an unsorted type such as Set.
def buildList(paramType: Type): Set[_] = {
  val collection = new Array[Any](5)
  for (i <- 0 until 5) {
    collection(i) = new EasyRandom().nextObject(paramType.asInstanceOf[Class[Any]])
  }
  Set(collection: _*)
}
How can I dynamically build a sorted set at runtime? I've been looking into how Jackson tries to achieve the same thing, but I couldn't quite follow how to get T here: https://github.com/FasterXML/jackson-module-scala/blob/0e926622ea4e8cef16dd757fa85400a0b9dcd1d3/src/main/scala/com/fasterxml/jackson/module/scala/introspect/OrderingLocator.scala#L21
(Please excuse me if my question is unclear.)

This happens because SortedSet needs a contextual (implicit) Ordering[A] type class instance for its element type A.
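If you really must keep the reflect.Type-based signature, one workaround is to supply that Ordering yourself at runtime. Here is a minimal sketch, assuming the jeasy EasyRandom from your snippet and, crucially, that every generated class implements Comparable (the cast fails at runtime otherwise); this appears to be roughly the Comparable fallback that the linked OrderingLocator implements:

import java.lang.reflect.Type
import org.jeasy.random.EasyRandom
import scala.collection.immutable.SortedSet

def buildSortedSet(paramType: Type): SortedSet[Any] = {
  val clazz = paramType.asInstanceOf[Class[Any]]
  val items = (0 until 5).map(_ => new EasyRandom().nextObject(clazz))
  // Build the Ordering at runtime by delegating to the objects' own compareTo
  // (Scala 2.12+ SAM syntax; unsafe if the runtime class is not Comparable).
  implicit val runtimeOrdering: Ordering[Any] =
    (a: Any, b: Any) => a.asInstanceOf[Comparable[Any]].compareTo(b)
  SortedSet(items: _*)
}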
However, as Luis said in the comments, I'd strongly advise you against this approach; use a safer, strongly typed one instead.
Generating random case classes (which I suppose you're using, since you're using Scala) should be easy with the help of libraries like ScalaCheck, whose Arbitrary type class appears below (derivation libraries like magnolia can generate such instances for your own case classes). That would turn your code into something like this:
def randomList[A : Ordering : Arbitrary]: SortedSet[A] = {
  val arb: Arbitrary[A] = implicitly[Arbitrary[A]]
  // sample returns an Option[A], so flatMap drops the (rare) failed samples
  val sampleData = (1 to 5).flatMap(_ => arb.arbitrary.sample)
  SortedSet(sampleData: _*)
}
This approach involves some heavy concepts like implicits and type classes, but is way safer.
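For example, with ScalaCheck on the classpath the standard instances are found automatically:

import scala.collection.immutable.SortedSet

// Arbitrary[Int] and Ordering[Int] are resolved from the companion objects.
val ints: SortedSet[Int] = randomList[Int]  // e.g. TreeSet(-2043727, 0, 1341)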

Related

Generic class wrapper in Scala

Hello, I would like to create a generic wrapper in Scala in order to track changes to a value of any type. I haven't found any other way to do this so far, so I've been trying to create a class using Dynamic, but it has some limitations.
case class Wrapper[T](value: T) extends Dynamic {
  private val valueClass = value.getClass

  def applyDynamic(id: String)(parameters: Any*) = {
    val objectParameters = parameters map (x => x.asInstanceOf[Object])
    val parameterClasses = objectParameters map (_.getClass)
    val method = valueClass.getMethod(id, parameterClasses: _*)
    val res = method.invoke(value, objectParameters: _*)
    // TODO: Logic that will eventually create some kind of event about the method invoked.
    new Wrapper(res)
  }
}
With this code I run into trouble when invoking the plus ("+") method on two Integers, and I don't understand why. Isn't there a "+" method in the Int class? The error I get when I try the addition, with either a Wrapper[Int] or a plain Int on the right-hand side, is:
var wrapped1 = Wrapper(1)
wrapped1 = wrapped1 + Wrapper(2) // or just 2
type mismatch;
found : Wrapper[Int]/Int
required: String
Why is it expecting a string?
If possible it would also be nice to be able to work with both the Wrapper[T] and the T methods seamlessly, e.g.
val a = Wrapper[Int](1)
val b = Wrapper[Int](2)
val c = 3
a + b // Wrapper[Int].+(Wrapper[Int])
a + c // Wrapper[Int].+(Int)
c + a // Int.+(Wrapper[Int])
Well, if you're trying to make a proxy that tracks any changes to the desired values, you'll probably fail without agents (https://dzone.com/articles/java-agent-1), because final classes and primitives would force you to modify bytecode so that it accepts your proxy in their place. It would also require more than intercepting the changes of "just the class": you would have to cover all the classes of its members, perform origin-of-value analysis, and so on. It is by no means a trivial problem.
Another approach is to produce diffs of case classes by comparing them at certain points of execution. There is a generic implementation that does this, using derivation to compute the diffs: https://github.com/ivan71kmayshan27/ShapelesDerivationExample. I believe you could come up with an easier solution with magnolia. Note that this approach cannot handle plain (non-case) classes unless you write your own macro, and it has some problems regarding ordered and unordered collections.
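To make the diff-at-checkpoints idea concrete, here is a minimal hand-written sketch for a single hypothetical case class (the linked project, or a magnolia-based solution, would derive this generically):

// Hypothetical example type, not from the question.
case class Person(name: String, age: Int)

// Compare two snapshots field by field and report what changed.
def diff(before: Person, after: Person): List[String] = {
  val fields = List(
    ("name", before.name, after.name),
    ("age", before.age, after.age)
  )
  fields.collect { case (name, b, a) if b != a => s"$name: $b -> $a" }
}

diff(Person("John", 40), Person("John", 41)) // List("age: 40 -> 41")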

Prevent empty values in an array being inserted into Mongo collection

I am trying to prevent empty values from being inserted into my MongoDB collection. The field in question looks like this:
MongoDB Field
"stadiumArr" : [
"Old Trafford",
"El Calderon",
...
]
Sample of (mapped) case class
case class FormData(_id: Option[BSONObjectID], stadiumArr: Option[List[String]], ..)
Sample of Scala form
object MyForm {
  val form = Form(
    mapping(
      "_id" -> ignored(Option.empty[BSONObjectID]),
      "stadiumArr" -> optional(list(text)),
      ...
    )(FormData.apply)(FormData.unapply)
  )
}
I am also using the Repeated Values functionality in Play Framework like so:
Play Template
@(myForm: Form[models.db.FormData])(implicit request: RequestHeader, messagesProvider: MessagesProvider)

@import helper._

@repeatWithIndex(myForm("stadiumArr"), min = 5) { (stadium, idx) =>
  @inputText(stadium, '_label -> ("stadium #" + (idx + 1)))
}
This ensures that, whether or not there are at least 5 values in the array, there will still be (at least) 5 input boxes created. However, if one (or more) of the input boxes is empty when the form is submitted, an empty string is still added as a value in the array, e.g.
"stadiumArr" : [
"Old Trafford",
"El Calderon",
"",
"",
""
]
Based on some other ways of converting types to and from the database, I've tried playing around with a few solutions, such as:
implicit val arrayWrite: Writes[List[String]] = new Writes[List[String]] {
  def writes(list: List[String]): JsValue = Json.arr(list.filterNot(_.isEmpty))
}
…but this isn't working. Any ideas on how to prevent empty values from being inserted into the database collection?
Without knowing the specific versions or libraries you're using it's hard to give you an answer, but since you linked to the Play 2.6 documentation I'll assume that's what you're using. The other assumption I'm going to make is that you're using the ReactiveMongo library. Whether or not you're also using the Play plugin for that library is why I'm giving you two different answers here:
In that library, with no plugin, you'll have defined a BSONDocumentReader and a BSONDocumentWriter for your case class. These might be auto-generated for you with macros or not, but regardless of how you get them, these two types have useful methods you can use to transform the reads/writes you have into other ones. So, let's say I defined a reader and writer for you like this:
import reactivemongo.bson._

case class FormData(_id: Option[BSONObjectID], stadiumArr: Option[List[String]])

implicit val formDataReaderWriter = new BSONDocumentReader[FormData] with BSONDocumentWriter[FormData] {
  def read(bson: BSONDocument): FormData = {
    FormData(
      _id = bson.getAs[BSONObjectID]("_id"),
      stadiumArr = bson.getAs[List[String]]("stadiumArr").map(_.filterNot(_.isEmpty))
    )
  }

  def write(formData: FormData) = {
    BSONDocument(
      "_id" -> formData._id,
      "stadiumArr" -> formData.stadiumArr
    )
  }
}
Great you say, that works! You can see in the reads I went ahead and filtered out any empty strings. So even if it's in the data, it can be cleaned up. That's nice and all, but let's notice I didn't do the same for the writes. I did that so I can show you how to use a useful method called afterWrite. So pretend the reader/writer weren't the same class and were separate, then I can do this:
val initialWriter = new BSONDocumentWriter[FormData] {
  def write(formData: FormData) = {
    BSONDocument(
      "_id" -> formData._id,
      "stadiumArr" -> formData.stadiumArr
    )
  }
}

implicit val cleanWriter = initialWriter.afterWrite { bsonDocument =>
  val fixedField = bsonDocument.getAs[List[String]]("stadiumArr").map(_.filterNot(_.isEmpty))
  bsonDocument.remove("stadiumArr") ++ BSONDocument("stadiumArr" -> fixedField)
}
Note that cleanWriter is the implicit one, that means when the insert call on the collection happens, it will be the one chosen to be used.
Now, that's all a bunch of work. If you're using the plugin/module for Play that lets you use JSONCollections, then you can get by with just defining Play JSON Reads and Writes. If you look at the documentation you'll see that the Reads trait has a useful map function you can use to transform one Reads into another.
So, you'd have:
val jsonReads = Json.reads[FormData]
implicit val cleanReads = jsonReads.map(formData => formData.copy(stadiumArr = formData.stadiumArr.map(_.filterNot(_.isEmpty))))
And again, because only the clean Reads is implicit, the collection methods for mongo will use that.
NOW, all of that said, doing this at the database level is one thing, but really, I personally think you should be dealing with this at your Form level.
val form = Form(
  mapping(
    "_id" -> ignored(Option.empty[BSONObjectID]),
    "stadiumArr" -> optional(list(text)),
    ...
  )(FormData.apply)(FormData.unapply)
)
Mainly because, surprise surprise, Form has a way to deal with this; specifically, the Mapping class itself. If you look there you'll find a transform method you can use to filter out empty values easily. Just call it on the mapping you need to modify, for example:
"stadiumArr" -> optional(
list(text).transform(l => l.filter(_.nonEmpty), l => l.filter(_.nonEmpty))
)
To explain a little more about this method, in case you're not used to reading the signatures in the Scaladoc:
def transform[B](f1: T => B, f2: B => T): Mapping[B]
This says that by calling transform on some mapping of type Mapping[T] you can create a new mapping of type Mapping[B]. In order to do this you must provide functions that convert from one to the other. So the code above turns the list mapping (a Mapping[List[String]]) into another Mapping[List[String]] (the type does not change here), but in doing so it removes any empty elements. If I break this code down a little it might be clearer:
def convertFromTtoB(list: List[String]): List[String] = list.filter(_.nonEmpty)
def convertFromBtoT(list: List[String]): List[String] = list.filter(_.nonEmpty)
...
list(text).transform(convertFromTtoB, convertFromBtoT)
You might be wondering why you need to provide both functions. The reason is that when you call Form.fill and the form is populated with values, the second function is called so that the data goes back into the format the Play form expects. This is more obvious if the type actually changes. For example, if you had a text area where people could enter CSV, but you wanted to map it to a form model with a proper List[String], you might do something like:
def convertFromTtoB(raw: String): List[String] = raw.split(",").filter(_.nonEmpty)
def convertFromBtoT(list: List[String]): String = list.mkString(",")
...
text.transform(convertFromTtoB, convertFromBtoT)
Note that when I've done this in the past, I've sometimes had to write a separate method and just pass it in when I didn't want to fully specify all the types, but you should be able to work from here given the documentation and the type signature of the transform method on Mapping.
The reason I suggest doing this in the form binding is that, I think, the form/controller should be the one concerned with dealing with your user data and cleaning it up. But you can always have multiple layers of cleaning and whatnot; it's not bad to be safe!
I've gone for this (which always seems obvious when it's written and tested):
implicit val arrayWrite: Writes[List[String]] = new Writes[List[String]] {
  def writes(list: List[String]): JsValue = Json.toJson(list.filterNot(_.isEmpty).toIndexedSeq)
}
But I would be interested to know how to .map the existing Reads rather than redefining it from scratch, as @cchantep suggests.
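For what it's worth, here is a sketch of what that could look like for the Writes above, assuming play-json's functional syntax (which provides contramap on Writes, letting you reuse the default instance instead of redefining it):

import play.api.libs.json._
import play.api.libs.functional.syntax._ // provides contramap on Writes

// Grab the default Writes first, outside the implicit definition,
// so the implicit search doesn't resolve to the val being defined.
val defaultListWrites = implicitly[Writes[List[String]]]

implicit val arrayWrite: Writes[List[String]] =
  defaultListWrites.contramap[List[String]](_.filterNot(_.isEmpty))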

Can Scala infer the actual type from the return type actually expected by the caller?

I have the following question. Our project has a lot of Scala test code, and much of it fills in fields like this:
production.setProduct(new Product)
production.getProduct.setUuid("b1253a77-0585-291f-57a4-53319e897866")
production.setSubProduct(new SubProduct)
production.getSubProduct.setUuid("89a877fa-ddb3-3009-bb24-735ba9f7281c")
Eventually I grew tired of this code: all those fields are actually subclasses of a base class that has the uuid field, so after thinking a while I wrote an auxiliary function like this:
def createUuid[T <: GenericEntity](uuid: String)(implicit m: Manifest[T]): T = {
  val constructor = m.runtimeClass.getConstructors()(0)
  val instance = constructor.newInstance().asInstanceOf[T]
  instance.setUuid(uuid)
  instance
}
Now my code is half as long, since I can write something like this:
production.setProduct(createUuid[Product]("b1253a77-0585-291f-57a4-53319e897866"))
production.setSubProduct(createUuid[SubProduct]("89a877fa-ddb3-3009-bb24-735ba9f7281c"))
That's good, but I am wondering, if I could somehow implement the function createUuid so the last bit would like this:
// Is that really possible?
production.setProduct(createUuid("b1253a77-0585-291f-57a4-53319e897866"))
production.setSubProduct(createUuid("89a877fa-ddb3-3009-bb24-735ba9f7281c"))
Can the Scala compiler guess that setProduct expects not just a generic entity but actually something like Product (or a subclass of it)? Or is there no way in Scala to make this even shorter?
The Scala compiler won't infer/propagate the type outside-in. You could, however, create implicit conversions like:
implicit def stringToSubProduct(uuid: String): SubProduct = {
  val n = new SubProduct
  n.setUuid(uuid)
  n
}
and then just call
production.setSubProduct("89a877fa-ddb3-3009-bb24-735ba9f7281c")
and the compiler will automatically use the stringToSubProduct because it has applicable types on the input and output.
Update: to keep the code better organized, I suggest wrapping the implicit defs in a companion object, like:
case class EntityUUID(uuid: String) {
  // actually enforce the format check; a bare .matches would discard its result
  require(uuid.matches("[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}"), s"$uuid is not a valid UUID")
}

object EntityUUID {
  implicit def toProduct(e: EntityUUID): Product = {
    val p = new Product
    p.setUuid(e.uuid)
    p
  }

  implicit def toSubProduct(e: EntityUUID): SubProduct = {
    val p = new SubProduct
    p.setUuid(e.uuid)
    p
  }
}
and then you'd do
production.setProduct(EntityUUID("b1253a77-0585-291f-57a4-53319e897866"))
so anyone reading this could have an intuition where to find the conversion implementation.
Regarding your comment about some generic approach (having 30 types): I won't say it's not possible, but I just do not see how to do it. The reflection you used bypasses the type system. If all 30 cases are the same piece of code, maybe you should reconsider your object design. That said, you can still implement the 30 implicit defs by calling some method that uses reflection, similar to what you have provided; then you'll have the option to change it in the future in just this one place.
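For illustration, a sketch of that last option: keep the companion object shown above, but let every implicit def delegate to the reflective createUuid helper from the question, so the reflection lives in exactly one place:

object EntityUUID {
  // One line per entity type; only createUuid touches reflection.
  implicit def toProduct(e: EntityUUID): Product = createUuid[Product](e.uuid)
  implicit def toSubProduct(e: EntityUUID): SubProduct = createUuid[SubProduct](e.uuid)
  // ...and so on for the remaining entity types
}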

What are good examples of: "operation of a program should map input values to output values rather than change data in place"

I came across this sentence about Scala, explaining its functional behavior:
operation of a program should map input values to output values rather than change data in place
Could somebody explain it with a good example?
Edit: Please explain the above sentence in its context or give an example; please don't complicate it and cause more confusion.
The most obvious pattern this refers to is the difference between how you would write collection-handling code in Java compared with Scala. If you were writing Scala in the idiom of Java, you would work with collections by mutating data in place. Idiomatic Scala code doing the same job would favour mapping input values to output values.
Let's have a look at a few things you might want to do to a collection:
Filtering
In Java, if I have a List<Trade> and I am only interested in those trades executed with Deutsche Bank, I might do something like:
for (Iterator<Trade> it = trades.iterator(); it.hasNext();) {
    Trade t = it.next();
    if (t.getCounterparty() != DEUTSCHE_BANK) it.remove(); // MUTATION
}
Following this loop, my trades collection only contains the relevant trades. But, I have achieved this using mutation - a careless programmer could easily have missed that trades was an input parameter, an instance variable, or is used elsewhere in the method. As such, it is quite possible their code is now broken. Furthermore, such code is extremely brittle for refactoring for this same reason; a programmer wishing to refactor a piece of code must be very careful to not let mutated collections escape the scope in which they are intended to be used and, vice-versa, that they don't accidentally use an un-mutated collection where they should have used a mutated one.
Compare with Scala:
val db = trades filter (_.counterparty == DeutscheBank) //MAPPING INPUT TO OUTPUT
This creates a new collection! It doesn't affect anyone who is looking at trades and is inherently safer.
Mapping
Suppose I have a List<Trade> and I want to get a Set<Stock> for the unique stocks which I have been trading. Again, the idiom in Java is to create a collection and mutate it.
Set<Stock> stocks = new HashSet<Stock>();
for (Trade t : trades) stocks.add(t.getStock()); //MUTATION
Using scala the correct thing to do is to map the input collection and then convert to a set:
val stocks = (trades map (_.stock)).toSet //MAPPING INPUT TO OUTPUT
Or, if we are concerned about performance:
(trades.view map (_.stock)).toSet
(trades.iterator map (_.stock)).toSet
What are the advantages here? Well:
My code can never observe a partially-constructed result
The application of a function A => B to a Coll[A] to get a Coll[B] is clearer.
Accumulating
Again, in Java the idiom has to be mutation. Suppose we are trying to sum the decimal quantities of the trades we have done:
BigDecimal sum = BigDecimal.ZERO;
for (Trade t : trades) {
    sum.add(t.getQuantity()); // MUTATION
}
Again, we must be very careful not to accidentally observe a partially-constructed result! In Scala, we can do this in a single expression:
val sum = (0 /: trades)(_ + _.quantity) //MAPPING INPUT TO OUTPUT
Or the various other forms:
trades.foldLeft(0)(_ + _.quantity)
(trades.iterator map (_.quantity)).sum
(trades.view map (_.quantity)).sum
Oh, by the way, there is a bug in the Java implementation! Did you spot it?
I'd say it's the difference between:
var counter = 0

def updateCounter(toAdd: Int): Unit = {
  counter += toAdd
}

updateCounter(8)
println(counter)
and:
val originalValue = 0
def addToValue(value: Int, toAdd: Int): Int = value + toAdd
val firstNewResult = addToValue(originalValue, 8)
println(firstNewResult)
This is a gross oversimplification, but fuller examples are things like using a foldLeft to build up a result rather than doing the hard work yourself: foldLeft example
What it means is that if you write pure functions like this you always get the same output from the same input, and there are no side effects, which makes it easier to reason about your programs and ensure that they are correct.
So, for example, the function:
def times2(x:Int) = x*2
is pure, while
def add5ToList(xs: MutableList[Int]) {
  xs += 5
}
is impure because it edits data in place as a side effect. This is a problem because that same list could be in use elsewhere in the program, and now we can't guarantee its behaviour because it has changed.
A pure version would use immutable lists and return a new list:
def add5ToList(xs: List[Int]) = {
  5 :: xs
}
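For instance:
val xs = List(1, 2, 3)
val ys = add5ToList(xs) // ys is List(5, 1, 2, 3)
// xs is still List(1, 2, 3): nothing was changed in place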
There are plenty of examples with collections, which are easy to come by but might give the wrong impression. This concept works at all levels of the language (though not at the VM level). One example is case classes. Consider these two alternatives:
// Java-style
class Person(initialName: String, initialAge: Int) {
  def this(initialName: String) = this(initialName, 0)

  private var name = initialName
  private var age = initialAge

  def getName = name
  def getAge = age
  def setName(newName: String) { name = newName }
  def setAge(newAge: Int) { age = newAge }
}

val employee = new Person("John")
employee.setAge(40) // we changed the object
// Scala-style
case class Person(name: String, age: Int) {
  def this(name: String) = this(name, 0)
}

val employee = new Person("John")
val employeeWithAge = employee.copy(age = 40) // employee still exists!
This concept is applied in the construction of the immutable collections themselves: a List never changes. Instead, new List objects are created when necessary. The use of persistent data structures reduces the copying that would otherwise happen with a mutable data structure.
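A quick sketch of that structural sharing:

// Prepending to an immutable List allocates one new cell;
// the tail of the new list IS the old list, not a copy of it.
val original = List(2, 3)
val extended = 1 :: original
assert(extended.tail eq original) // reference equality: nothing was copied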

Why does Scala maintain the type of collection not return Iterable (as in .Net)?

In Scala, you can do
val l = List(1, 2, 3)
l.filter(_ > 2) // returns a List[Int]
val s = Set("hello", "world")
s.map(_.length) // returns a Set[Int]
The question is: why is this useful?
Scala collections are probably the only existing collection framework that does this. The Scala community seems to agree that this functionality is needed, yet no one seems to miss it in other languages. An example in C# (with the naming modified to match Scala's):
var l = new List<int> { 1, 2, 3 }
l.filter(i => i > 2) // always returns Iterable[Int]
l.filter(i => i > 2).toList // if I want a List, no problem
l.filter(i => i > 2).toSet // or I want a Set
In .NET, I always get back an Iterable and it is up to me what I want to do with it. (This also makes .NET collections very simple.)
The Scala example with Set forces me to make a Set of lengths out of a Set of strings. But what if I just want to iterate over the lengths, or construct a List of lengths, or keep the Iterable around to filter it later? Constructing a Set right away seems pointless. (EDIT: collection.view provides the simpler .NET functionality, nice.)
I am sure you will show me examples where the .NET approach is absolutely wrong or kills performance, but I just can't see any (after using .NET for years).
Not a full answer to your question, but Scala never forces you to use one collection type over another. You're free to write code like this:
import collection._
import immutable._
val s = Set("hello", "world")
val l: Vector[Int] = s.map(_.length)(breakOut)
Read more about breakOut in Daniel Sobral's detailed answer to another question.
If you want your map or filter to be evaluated lazily, use this:
s.view.map(_.length)
This whole behavior makes it easy to integrate your new collection classes and inherit all the powerful capabilities of the standard collections with no code duplication, all of this ensuring that YourSpecialCollection#filter returns an instance of YourSpecialCollection; that YourSpecialCollection#map returns an instance of YourSpecialCollection if it supports the type being mapped to, or a built-in fallback collection if it doesn't (like what happens if you call map on a BitSet). Surely, a C# iterator has no .toMySpecialCollection method.
See also: “Integrating new sets and maps” in The Architecture of Scala Collections.
Scala follows the "uniform return type principle", assuring that you always end up with the appropriate return type, instead of losing that information as in C#.
The reason C# does it this way is that its type system is not good enough to provide these assurances without overriding the whole implementation of every method in every single subclass. Scala solves this with the use of higher-kinded types.
Why does Scala have the only collection framework doing this? Because it is harder than most people think, especially when things like Strings and Arrays, which are not "real" collections, should be integrated as well:
// This stays a String:
scala> "Foobar".map(identity)
res27: String = Foobar
// But this falls back to the "nearest" appropriate type:
scala> "Foobar".map(_.toInt)
res29: scala.collection.immutable.IndexedSeq[Int] = Vector(70, 111, 111, 98, 97, 114)
If you have a Set, and an operation on it returns an Iterable while its runtime type is still a Set, then you're losing important information about its behavior, and the access to Set-specific methods.
BTW: there are other languages that behave similarly, like Haskell, which influenced Scala a lot. The Haskell version of map would look like this when translated to Scala (without implicit magic):
// the functor type class
trait Functor[C[_]] {
  def fmap[A, B](f: A => B, coll: C[A]): C[B]
}

// an instance
object ListFunctor extends Functor[List] {
  def fmap[A, B](f: A => B, list: List[A]): List[B] = list.map(f)
}

// usage
val list = ListFunctor.fmap((x: Int) => x * x, List(1, 2, 3))
And I think the Haskell community values this feature as well :-)
It is a matter of consistency. Things are what they are, and return things like them. You can depend on it.
The difference you make here is one of strictness. A strict method is immediately evaluated, while a non-strict method is only evaluated as needed. This has consequences. Take this simple example:
def print5(it: Iterable[Int]) = {
  var flag = true
  it.filter(_ => flag).foreach { i =>
    flag = i < 5
    println(i)
  }
}
Test it with these two collections:
print5(List.range(1, 10))
print5(Stream.range(1, 10))
Here, List is strict, so its methods are strict; conversely, Stream is non-strict, so its methods are non-strict. That changes the output: with the List, the filter runs to completion (while flag is still true) before foreach starts, so every number is printed, whereas with the Stream the filtering is interleaved with the printing, the flag update takes effect, and the output stops after 5.
So this isn't really related to Iterable at all; after all, both List and Stream are Iterable. Changing the collection return type can cause all sorts of problems; at the very least, it would make the task of keeping a persistent data structure harder.
On the other hand, there are advantages to delaying certain operations, even on a strict collection. Here are some ways of doing it:
// Get an iterator explicitly, if it's going to be used only once
def print5I(it: Iterable[Int]) = {
  var flag = true
  it.iterator.filter(_ => flag).foreach { i =>
    flag = i < 5
    println(i)
  }
}

// Get a Stream explicitly, if the result will be reused
def print5S(it: Iterable[Int]) = {
  var flag = true
  it.toStream.filter(_ => flag).foreach { i =>
    flag = i < 5
    println(i)
  }
}

// Use a view, which provides non-strictness for some methods
def print5V(it: Iterable[Int]) = {
  var flag = true
  it.view.filter(_ => flag).foreach { i =>
    flag = i < 5
    println(i)
  }
}

// Use withFilter, which is explicitly designed to be used as a non-strict filter
def print5W(it: Iterable[Int]) = {
  var flag = true
  it.withFilter(_ => flag).foreach { i =>
    flag = i < 5
    println(i)
  }
}
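With any of these variants the early exit now works even for the strict List; for example (under the same assumptions as the earlier snippets):

print5W(List.range(1, 10)) // prints 1 2 3 4 5 and stops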