Query a SetColumn - scala

How can I search NOT CONTAINS in a set?
Say I have the following model:
case class ClassRoom(id: String, age: Double, name: String, kids: Set[String])

abstract class ClassRooms extends CassandraTable[ClassRooms, ClassRoom] {
  override def tableName = "ClassRooms"
  object id extends StringColumn(this) with PartitionKey[String]
  object age extends DoubleColumn(this) with PrimaryKey[Double]
  object kids extends SetColumn[String](this)
}
I want to do the following query:
def findMissing(minAge: Double, kid: String) =
  select
    .where(_.age > minAge)
    .and(_.kids not contains kid) // desired syntax; no such operator exists
    .fetch()

Sadly, because Cassandra's engine is based on "map style" indexing, a NOT operator is not an inherently suitable concept. Arguably it could be exposed through things like the key cache for partition key hits, but for things like secondary indexes it may not be a technical reality.
First of all, you will need object kids extends SetColumn[String](this) with Index[Set[String]] for the normal CONTAINS to work, hence my mention of a secondary index.
To achieve what you want, you have two options depending on cardinality: either use a secondary index and a normal CONTAINS, or de-normalise and store the data in a separate table, which may improve performance but will still require "diffing".
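For the supported direction only, a minimal sketch (this assumes the phantom DSL, where contains requires the Index mixin; operator names may differ between phantom versions):
object kids extends SetColumn[String](this) with Index[Set[String]]

def findWithKid(kid: String) =
  select.where(_.kids contains kid).fetch()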

Related

Advice on Case Classes/Objects/Matching

I am trying to model (in my Scala application) a list of options presented in my web page and am scratching my head coming up with a solution for mapping a String value posted from the client to its corresponding object in the backend.
E.g. let's say it is a list of Animals and the user can choose one, which gets posted to the backend.
Animals
Polar Bear
Rabbit
Great White Shark
When a request comes in, I want to convert the "Great White Shark" String to an Animal, but I am not sure how best to match the String to the appropriate type in the backend.
So far I have this.
sealed abstract class Animal(val name: String)
case object GreatWhite extends Animal("Great White Shark")
case object PolarBear extends Animal("Polar Bear")
Which allows me to do this to match the String from the UI to its corresponding case object in my Scala application.
def matcher(animal: String) = animal match {
  case GreatWhite.name => GreatWhite
  case PolarBear.name => PolarBear
}
Problem
If the list of Animals grows long, however, this matcher is going to be very cumbersome, since I need a case expression for every Animal.
I would much appreciate any experienced Scala guys giving me a pointer on a more elegant solution.
It looks like what you need is simply a hash table from String to Animal.
Such an approach gives you the ability to get a result in constant time, O(1), even with an extensively growing list.
val mapping = Map[String, Animal]("Rabbit" -> Rabbit, "Polar Bear" -> PolarBear /* ... */ )
// matcher
mapping.get(animal)
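Note that mapping.get returns an Option[Animal], so unknown names get handled explicitly at the call site; for example:
mapping.get(animal) match {
  case Some(a) => println(a)
  case None    => println(s"unknown animal: $animal")
}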
Update, incorporating some useful suggestions from the comments below:
sealed abstract class Animal(val name: String)
case object GreatWhite extends Animal("Great White Shark")
case object PolarBear extends Animal("Polar Bear")
val mapping: Map[String, Animal] = Seq(GreatWhite, PolarBear).map(x => x.name -> x).toMap
Have you looked at Enums? If they are usable for you, Enums have a .withName method http://yefremov.net/blog/scala-enum-by-name/
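If Enumeration fits your case, a small sketch (withName throws NoSuchElementException for unknown names):
object Animals extends Enumeration {
  val PolarBear = Value("Polar Bear")
  val Rabbit = Value("Rabbit")
  val GreatWhite = Value("Great White Shark")
}

Animals.withName("Great White Shark") // Animals.GreatWhite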

Using options for auto-incrementing model ids in slick?

In many slick examples in which the type parameter of the table is a case class and the table has an auto-incrementing primary key, I have seen an Option used in the case class for the id field:
case class Item(id: Option[Long], name: String)

object Items extends Table[Item]("item") {
  def id = column[Long]("id", O.PrimaryKey, O.AutoInc)
  def name = column[String]("name")
  def * = id.? ~ name <> (Item.apply _, Item.unapply _)
}
This kind of makes sense to me, because the id field will have no meaningful value until the object is inserted into the table. However, database queries will always give me back Items that have the id set to something and it gets extremely tedious always folding or pattern matching on something that I know will not be None. I could just put a 0L in the id field when I create a new Item, but this doesn't seem like a good choice.
How is this typically dealt with? Are those the only two options?
This related question has some possible answers in it. The question itself is about a much more specific issue with postgres though.
Update: see Rikard's comment below.
You could just call .get in the places where you know the id is not None, which is what people usually do, I suppose.
An alternative would be having two different classes. One with an id field and one without. Or an ID trait which you only mix in for classes that have an ID.
trait WithID { def id: Int }
case class Person(name: String)

// create a person with an id
(newid: Int, name: String) => new Person(name) with WithID { def id = newid }
The mapping you have to provide to Slick will be more verbose, but the usage code will be simpler and type-safe. I believe @nafg has an abstraction for this in https://github.com/nafg/slick-additions, but I may be mistaken.
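A rough sketch of the two-class variant mentioned above (these names are illustrative, not Slick API):
case class NewItem(name: String)
case class SavedItem(id: Long, name: String)

// once the insert returns the generated key, switch to the saved shape
def afterInsert(generatedId: Long, item: NewItem): SavedItem =
  SavedItem(generatedId, item.name)

This keeps id non-optional wherever a persisted row is expected, at the cost of a second class and one extra mapping.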

scala programming without vars

val and var in Scala: the concept is understandable enough, I think.
I wanted to do something like this (Java-like):
trait PersonInfo {
  var name: Option[String] = None
  var address: Option[String] = None
  // plus another 30 vars, for example
}

case class Person() extends PersonInfo

object TestObject {
  def main(args: Array[String]): Unit = {
    val p = new Person()
    p.name = Some("someName")
    p.address = Some("someAddress")
  }
}
so I can change the name, address, etc...
This works well enough, but the thing is, in my program I end up with everything as vars.
As I understand it, vals are "preferred" in Scala. How can vals work in this type of example without having to rewrite all 30+ arguments every time one of them is changed?
That is, I could have
trait PersonInfo {
  val name: Option[String]
  val address: Option[String]
  // plus another 30 vals, for example
}

case class Person(name: Option[String] = None, address: Option[String] = None /* ...plus another 30... */) extends PersonInfo

object TestObject {
  def main(args: Array[String]): Unit = {
    val p = new Person(Some("someName"), Some("someAddress"), .....)
    // and if I want to change one thing, the address for example
    val p2 = new Person(Some("someName"), Some("someOtherAddress"), .....)
  }
}
Is this the "normal" Scala way of doing things (notwithstanding the 22-parameter limit)?
As can be seen, I'm very new to all this.
At first the basic option of Tony K.:
def withName(n : String) = Person(n, address)
looked promising, but I have quite a few classes that extend PersonInfo.
That means in each one I would have to re-implement the defs: lots of typing and cutting and pasting just to do something simple.
If I convert the trait PersonInfo to a normal class and put all the defs in it, then I have the problem of how to return a Person, not a PersonInfo.
Is there a clever Scala way to implement this once in the trait or a super class and have all subclasses really extend it?
As far as I can see it all works very well in Scala when the examples are very simple, with 2 or 3 parameters, but when you have dozens it becomes very tedious and unworkable.
The PersonContext approach of weirdcanada is, I think, similar; I am still thinking about that one. I guess if I have 43 parameters I would need to break them up into multiple temp classes just to pump the parameters into Person.
The copy option is also interesting: cryptic, but a lot less typing.
Coming from Java I was hoping for some clever tricks from Scala.
Case classes have a pre-defined copy method which you should use for this.
case class Person(name: String, age: Int)
val mike = Person("Mike", 42)
val newMike = mike.copy(age = 43)
How does this work? copy is just one of the methods (besides equals, hashCode etc) that the compiler writes for you. In this example it is:
def copy(name: String = name, age: Int = age): Person = new Person(name, age)
The values name and age in this method shadow the values in the outer scope. As you can see, default values are provided, so you only need to specify the ones that you want to change. The others default to what they are in the current instance.
The reason for the existence of var in Scala is to support mutable state. In some cases, mutable state is truly what you want (e.g. for performance or clarity reasons).
You are correct, though, that there is much evidence and experience behind the encouragement to use immutable state. Things work better on many fronts (concurrency, clarity of reasoning, etc.).
One answer to your question is to provide mutator methods to the class in question that don't actually mutate the state, but instead return a new object with a modified entry:
case class Person(name: String, address: String) {
  def withName(n: String) = Person(n, address)
  ...
}
This particular solution does involve coding potentially long parameter lists, but only within the class itself. Users of it get off easy:
val p = Person("Joe", "N St")
val p2 = p.withName("Sam")
...
If you consider the reasons you'd want to mutate state, then things become clearer. If you are reading data from a database, you could have many reasons for mutating an object:
The database itself changed, and you want to auto-refresh the state of the object in memory
You want to make an update to the database itself
You want to pass an object around and have it mutated by methods all over the place
In the first case, immutable state is easy:
val updatedObj = oldObj.refresh
The second is much more complex, and there are many ways to handle it (including mutable state with dirty-field tracking). It pays to look at libraries like Squeryl, where you can write things in a nice DSL (see http://squeryl.org/inserts-updates-delete.html) and avoid direct object mutation altogether.
The final one is the one you generally want to avoid for reasons of complexity. Such things are hard to parallelize, hard to reason about, and lead to all sorts of bugs where one class has a reference to another, but no guarantees about the stability of it. This kind of usage is the one that screams for immutable state of the form we are talking about.
Scala has adopted many paradigms from functional programming, one of them being a focus on objects with immutable state. This means moving away from getters and setters within your classes and instead opting to do what @Tony K. has suggested above: when you need to change the "state" of an inner object, define a function that will return a new Person object.
Trying to use immutable objects is likely the preferred Scala way.
In regards to the 22 parameter issue, you could create a context class that is passed to the constructor of Person:
case class PersonContext(all: String, of: String, your: String, parameters: Int)
class Person(context: PersonContext) extends PersonInfo { ... }
If you find yourself changing an address often and don't want to have to go through the PersonContext rigamarole, you can define a method:
def addressChanger(person: Person, address: String): Person = {
  val contextWithNewAddress = ...
  new Person(contextWithNewAddress)
}
You could take this even further, and define a method on Person:
class Person(context: PersonContext) extends PersonInfo {
  ...
  def newAddress(address: String): Person = {
    addressChanger(this, address)
  }
}
In your code, you just need to remember that when you are updating your objects you're often getting new objects in return. Once you get used to that concept, it becomes very natural.
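Since PersonContext is itself a case class, its generated copy method is one way to fill in a body like addressChanger without listing every parameter. A small sketch using the (deliberately silly) field names from above:
val ctx = PersonContext("all", "of", "your", 43)
val moved = ctx.copy(your = "newValue") // change one field, keep the rest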

What would be an appropriate collection type for storing few elements?

Of the different collection types supported in Scala (lists, maps, hashmaps, sets, etc.), what would be an appropriate collection type for implementing something that can be done by the C code below?
typedef enum { GOOD, BAD, MAX_QUALITY } quality;
struct student_data s_data[MAX_QUALITY];
The collection size is small... 2 or 3 elements, but having a collection helps to keep the code elegant when performing similar operations on the data.
Thanks!
List or Seq of different case classes should do the trick. When I say different case classes, I really mean:
case class CaseClass1(arg1: String, arg2: Int, arg3: OtherCaseClass)
case class OtherCaseClass(arg1: String, arg2: String)
val foo: List[CaseClass1] = ...
Then instances of CaseClass1 are composed and stored in your list.
If your goal is to create an enum-like data structure which provides a fast index lookup, I would go for:
sealed trait Quality { val index: Int }
case object BAD extends Quality { val index = 0 }
case object GOOD extends Quality { val index = 1 }
case object MAX_QUALITY extends Quality { val index = 2 }
This allows you to pattern match on an arbitrary quality: Quality, and the verbose syntax quality.index makes it explicit that quality is used as an Int index at that point.
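For example, because Quality is sealed, the compiler checks a match over it for exhaustiveness:
def label(q: Quality): String = q match {
  case BAD         => "bad"
  case GOOD        => "good"
  case MAX_QUALITY => "max quality"
}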
Can you please elaborate further?
If your intention is to create a structure in which you associate to something of type quality a data structure of type student_data, then you can use a Map.
Depending on whether you need mutable access, or you just create the structure once and then access it read-only, you would use mutable or immutable maps.
I suggest, in your case, using immutable Maps. The Scala collection library provides specialized versions for small maps, up to four inserted key-value pairs, which provide better performance and a smaller memory footprint than the unspecialized map types.
val x = Map(1 -> 2, 2 -> 3)
println(x.getClass)
>> class scala.collection.immutable.Map$Map2
You can also define an alias to work more easily with your structures:
type StudentStruct = Map[Quality, StudentData]
Usage:
val studentStruct = Map(BAD -> studentData)
val studentStruct2 = studentStruct + (GOOD -> studentData2)
Hope it helps.

scala: map-like structure that doesn't require casting when fetching a value?

I'm writing a data structure that converts the results of a database query. The raw structure is a Java ResultSet, and it would be converted to a map or class which permits accessing different fields on that data structure by either a named method call or by passing a string into apply(). Clearly different values may have different types. In order to reduce the burden on the clients of this data structure, my preference is that one should not need to cast the values, but that the value fetched still has the correct type.
For example, suppose I'm doing a query that fetches two column values, one an Int, the other a String. The names of the columns are "a" and "b" respectively. Some ideal syntax might be the following:
val javaResultSet = dbQuery("select a, b from table limit 1")
// with ResultSet, particular values can be accessed like this:
val a = javaResultSet.getInt("a")
val b = javaResultSet.getString("b")
// but this syntax is undesirable.
// since I want to convert this to a single data structure,
// the preferred syntax might look something like this:
val newStructure = toDataStructure[Int, String](javaResultSet)("a", "b")
// that is, I'm willing to state the types during the instantiation
// of such a data structure.
// then,
val a: Int = newStructure("a") // OR
val a: Int = newStructure.a
// in both cases, "val a" does not require asInstanceOf[Int].
I've been trying to determine what sort of data structure might allow this and I could not figure out a way around the casting.
The other requirement is obviously that I would like to define a single data structure used for all db queries. I realize I could easily define a case class or similar per call and that solves the typing issue, but such a solution does not scale well when many db queries are being written. I suspect some people are going to propose using some sort of ORM, but let us assume for my case that it is preferred to maintain the query in the form of a string.
Anyone have any suggestions? Thanks!
To do this without casting, one needs more information about the query, and one needs that information at compile time.
I suspect some people are going to propose using some sort of ORM, but let us assume for my case that it is preferred to maintain the query in the form of a string.
Your suspicion is right and you will not get around this. If current ORMs or DSLs like Squeryl don't suit your fancy, you can create your own. But I doubt you will be able to use query strings.
The basic problem is that you don't know how many columns there will be in any given query, and so you don't know how many type parameters the data structure should have and it's not possible to abstract over the number of type parameters.
There is, however, a data structure that exists in different variants for different numbers of type parameters: the tuple. (E.g. Tuple2, Tuple3, etc.) You could define parameterized mapping functions for different numbers of parameters that return tuples like this:
def toDataStructure2[T1, T2](rs: ResultSet)(c1: String, c2: String) =
  (rs.getObject(c1).asInstanceOf[T1],
   rs.getObject(c2).asInstanceOf[T2])

def toDataStructure3[T1, T2, T3](rs: ResultSet)(c1: String, c2: String, c3: String) =
  (rs.getObject(c1).asInstanceOf[T1],
   rs.getObject(c2).asInstanceOf[T2],
   rs.getObject(c3).asInstanceOf[T3])
You would have to define these for as many columns as you expect to have in your tables (max 22).
This of course depends on it being safe to call getObject and cast the result to the given type.
In your example you could use the resulting tuple as follows:
val (a, b) = toDataStructure2[Int, String](javaResultSet)("a", "b")
If you decide to go the route of heterogeneous collections, there are some very interesting posts on heterogeneously typed lists:
One, for instance, is
http://jnordenberg.blogspot.com/2008/08/hlist-in-scala.html
http://jnordenberg.blogspot.com/2008/09/hlist-in-scala-revisited-or-scala.html
with an implementation at
http://www.assembla.com/wiki/show/metascala
A second great series of posts starts with
http://apocalisp.wordpress.com/2010/07/06/type-level-programming-in-scala-part-6a-heterogeneous-list%C2%A0basics/
The series continues with parts b, c, and d, linked from part a.
Finally, there is a talk by Daniel Spiewak which touches on HOMaps:
http://vimeo.com/13518456
So all this is to say that perhaps you can build your solution from these ideas. Sorry that I don't have a specific example, but I admit I haven't tried these out yet myself!
Joshua Bloch has introduced a heterogeneous collection, which can be written in Java. I once adapted it a little; it now works as a value register. It is basically a wrapper around two maps. Here is the code and this is how you can use it. But this is just FYI, since you are interested in a Scala solution.
In Scala I would start by playing with tuples. Tuples are kind of heterogeneous collections. The results can be, but do not have to be, accessed through fields like _1, _2, _3 and so on. But you don't want that; you want names. This is how you can assign names to those:
scala> val tuple = (1, "word")
tuple: (Int, String) = (1,word)
scala> val (a, b) = tuple
a: Int = 1
b: String = word
So, as mentioned before, I would try to build a ResultSetWrapper around tuples.
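A minimal sketch of that wrapper idea (like the toDataStructure2 answer above, it assumes getObject plus a cast is acceptable inside the wrapper; all names are illustrative):
import java.sql.ResultSet

class ResultSetWrapper2[T1, T2](rs: ResultSet)(c1: String, c2: String) {
  // read both columns once; the casts live here, not at the call site
  val asTuple: (T1, T2) =
    (rs.getObject(c1).asInstanceOf[T1], rs.getObject(c2).asInstanceOf[T2])
}

// usage: val (a, b) = new ResultSetWrapper2[Int, String](javaResultSet)("a", "b").asTuple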
If you want to "extract the column value by name" from a plain bean instance, you can probably:
use reflection and casts, which you (and I) don't like;
use a ResultSetToJavaBeanMapper provided by most ORM libraries, which is a little heavy and coupled;
write a Scala compiler plugin, which is too complex to control.
So I guess a lightweight ORM with the following features may satisfy you:
support raw SQL
support a lightweight, declarative and adaptive ResultSetToJavaBeanMapper
nothing else.
I made an experimental project based on that idea. Note that it's still an ORM, but I think it may be useful to you, or at least bring you some hints.
Usage:
Declare the model:
// declare the DB schema
trait UserDef extends TableDef {
  var name = property[String]("name", title = Some("Name"))
  var age1 = property[Int]("age", primary = true)
}

// declare the model; it mixes in properties as { var name = "" }
@BeanInfo class User extends Model with UserDef

// declare an object.
// it mixes in properties as { var name = Property[String]("name") },
// and object User is a Mapper[User]; thus it can translate a ResultSet to a User instance.
object `package` {
  @BeanInfo implicit object User extends Table[User]("users") with UserDef
}
Then call raw SQL, and the implicit Mapper[User] works for you:
val users = SQL("select name, age from users").all[User]
users.foreach{user => println(user.name)}
Or even build a type-safe query:
val users = User.q.where(User.age > 20).where(User.name like "%liu%").all[User]
For more, see the unit test:
https://github.com/liusong1111/soupy-orm/blob/master/src/test/scala/mapper/SoupyMapperSpec.scala
project home:
https://github.com/liusong1111/soupy-orm
It uses abstract types and implicits heavily to make the magic happen; you can check the source code of TableDef, Table, and Model for details.
Several million years ago I wrote an example showing how to use Scala's type system to push and pull values from a ResultSet. Check it out; it matches up with what you want to do fairly closely.
implicit val conn = connect("jdbc:h2:f2", "sa", "");
implicit val s: Statement = conn << setup;

val insertPerson = conn prepareStatement "insert into person(type, name) values(?, ?)";
for (name <- names)
  insertPerson << rnd.nextInt(10) << name <<!;

for (person <- query("select * from person", rs => Person(rs, rs, rs)))
  println(person.toXML);

for (person <- "select * from person" <<! (rs => Person(rs, rs, rs)))
  println(person.toXML);
Primitive types are used to guide the Scala compiler into selecting the right functions on the ResultSet.
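A sketch of that trick (my simplification, not the linked example's actual code): a cursor wrapper plus implicit conversions, so the expected parameter types pick the getters, reading columns left to right:
import java.sql.ResultSet

class Cursor(val rs: ResultSet) {
  private var col = 0
  def next(): Int = { col += 1; col }
}

object Cursor {
  // each conversion reads the next column with the getter the expected type demands
  implicit def cursorToInt(c: Cursor): Int = c.rs.getInt(c.next())
  implicit def cursorToString(c: Cursor): String = c.rs.getString(c.next())
}

case class Person(kind: Int, name: String)

// given a ResultSet positioned on a row:
// val c = new Cursor(rs)
// val p = Person(c, c) // getInt for column 1, getString for column 2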