How to filter FrequentItemset? - scala

I have a beginner question about filtering FreqItemset results in Scala.
The code starts by the book :
import org.apache.spark.mllib.fpm.FPGrowth
import org.apache.spark.rdd.RDD
val transactions: RDD[Array[String]] = data.map(s => s.trim.split(','))
val fpg = new FPGrowth()
.setMinSupport(0.04)
.setNumPartitions(10)
val model = fpg.run(transactions)
Now I'd like to filter out items that start with "aaa", for example "aaa_ccc", from the result.
I tried :
val filtered_result = model.freqItemsets.itemset.filter{ item => startwith("aaa")}
and
val filtered_result = model.freqItemsets.filter( itemset.items => startwith("aaa"))
and
val filtered_result = model.freqItemsets.filter( itemset => items.startwith("aaa"))
What did I do wrong?

None of the suggested snippets would compile, so I am not sure whether you are asking why they do not compile or why you get wrong results.
Scala collections can be filtered with the filter method, passing a predicate function as the parameter. This can be written in a couple of ways:
coll filter { item => filterLogic(item) }
so to filter freqItemsets you would use something like:
model.freqItemsets filter { itemSet => filterLogic(itemSet) }
If you wanted to keep all freqItemsets that contain at least one string that starts with "aaa":
model.freqItemsets filter { itemSet => itemSet.items exists { item => item.startsWith("aaa") } }
Or, if your goal was to filter the items inside each freqItemset (FreqItemset is a plain class rather than a case class, so build a new instance instead of calling copy):
model.freqItemsets map { itemSet => new FPGrowth.FreqItemset(itemSet.items filter { _.startsWith("aaa") }, itemSet.freq) }
Note that in Scala you can use:
someCollection filter { item => item.startsWith("string") } which is the same as: someCollection filter { _.startsWith("string") }
Hope this helps.

items is an Array[String]. If you want to keep any itemset that contains an item starting with aaa, you'll need something like this:
model.freqItemsets.filter(_.items.exists(_.startsWith("aaa")))
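The variants discussed above can be compared in a small sketch that runs without Spark. It assumes a stand-in FreqItemset case class with the same two fields (items and freq) and uses a plain Seq in place of model.freqItemsets; the sample data and the hasPrefix helper are hypothetical. Since the question says "filter out", the sketch also shows the exclusion case with filterNot.

```scala
// Stand-in for Spark's FPGrowth.FreqItemset (assumption: same two fields).
// It is a case class here only to keep the sketch short.
final case class FreqItemset(items: Array[String], freq: Long)

object FreqItemsetFilterSketch {
  // Hypothetical sample data in place of model.freqItemsets.
  val itemsets: Seq[FreqItemset] = Seq(
    FreqItemset(Array("aaa_ccc", "bbb"), 5L),
    FreqItemset(Array("bbb", "ddd"), 3L),
    FreqItemset(Array("aaa_eee"), 2L)
  )

  private def hasPrefix(item: String): Boolean = item.startsWith("aaa")

  // Keep only the itemsets that contain at least one "aaa" item.
  val withPrefix: Seq[FreqItemset] =
    itemsets.filter(_.items.exists(hasPrefix))

  // Drop every itemset that contains an "aaa" item.
  val withoutPrefix: Seq[FreqItemset] =
    itemsets.filterNot(_.items.exists(hasPrefix))

  // Keep every itemset but strip the "aaa" items from it,
  // discarding itemsets that end up empty.
  val stripped: Seq[FreqItemset] = itemsets
    .map(s => s.copy(items = s.items.filterNot(hasPrefix)))
    .filter(_.items.nonEmpty)
}
```

On an RDD the same filter, filterNot and map calls should apply unchanged, since RDD exposes filter and map; filterNot would need to be written as filter with a negated predicate.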

Related

Simplify two filters in Scala

Is there a way to simplify this scala code into a for comprehension?
val selectedNames = names filter {setOfNames}
val selectedPersons = persons filter {p => selectedNames contains p.name}
Here I'm assuming that persons have a name attribute.
Edit
Of course the value names is obtained as
val names = persons map _.name
How about
val selectedPersons = persons filter { person => setOfNames contains person.name }
I'm not sure this is much of a simplification. It's just doing the same thing via a for comprehension as requested.
val selectedPersons = for {
p <- persons
if setOfNames(p.name)
} yield p
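To check that the two forms really select the same people, here is a minimal sketch. The Person case class and the sample values are assumptions made for illustration; only the name attribute comes from the question.

```scala
// Hypothetical model: the question only states that persons have a name.
final case class Person(name: String, age: Int)

object SelectPersonsSketch {
  val persons: List[Person] =
    List(Person("Ann", 31), Person("Bob", 45), Person("Cid", 27))
  val setOfNames: Set[String] = Set("Ann", "Cid")

  // Single filter: a Set can be queried with contains ...
  val viaFilter: List[Person] =
    persons.filter(p => setOfNames.contains(p.name))

  // ... or applied like a function, as in the for comprehension.
  val viaFor: List[Person] =
    for { p <- persons if setOfNames(p.name) } yield p
}
```

Both forms go through the collection's filtering machinery, so the choice between them is mostly a matter of readability.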

Scala/Play/Squeryl Retrieve multiple params

I have the following url : http://localhost/api/books/?bookId=21&bookId=62?authorId=2
I want to retrieve all the bookId values with Scala and then use Squeryl to fetch them from the database.
I'm using the Play Framework as the web server, so here's my code:
val params = request.queryString.map { case (k, v) => k -> v(0) } // Retrieves only the first occurrence of each param
So params.get("bookId") will only get the first value of the bookId params, e.g. 21.
To retrieve all my bookId params I tried this:
val params = request.queryString.map { case (k, v) => k -> v } so that I get a Seq[String], but what about authorId, which is not a Seq[String]?
In the end I want to query on the bookIds and authorId in my DB using Squeryl:
(a.author_id === params.get("authorId").?) and
(params.get("bookId").map(bookIds: Seq[String] => b.bookId in bookIds))
In my controller I get the params and open the DB connection:
val params = request.queryString.map { case (k, v) => k -> v(0) }
DB.withTransaction() { where(Library.whereHelper(params)) }
In my model I use the queries:
def whereHelper(params : Map[String,String]) = {
(a.author_id === params.get("authorId").?) and
(params.get("bookId").map{bookIds: Seq[String] => b.bookId in bookIds})
}
Since bookIds is a list, I need to use Seq[String]. Is there a way to use request.queryString.map { case (k, v) => k -> v } for both a single string (authorId) and a list of strings (bookIds)?
Thanks,
If I understand correctly what you are trying to do, you want to know how to get the parameters from the query string. This is pretty simple; you can do the following in your controller:
def myAction = Action { request =>
// Get all the values of the parameter named bookId and
// convert them to Long. If you don't need the conversion,
// just remove the map.
val bookIds: Seq[Long] = request.queryString("bookId").map(_.toLong)
// Here I'm using getQueryString, a helper method for
// accessing a single query string parameter. It returns an
// Option[String], which we map to an Option[Long].
// Again, if you don't need the mapping, just remove it.
val authorId: Option[Long] = request.getQueryString("authorId").map(_.toLong)
DB.withTransaction() { where(Library.whereHelper(authorId, bookIds)) }
// Do something with the result
}
In your model you will have:
def whereHelper(authorId: Option[Long], bookIds: Seq[Long]) = authorId match {
case Some(author_id) =>
(a.author_id === author_id) and
(b.bookId in bookIds)
case None =>
(b.bookId in bookIds)
}
I've left the explicit types in to help you understand what is happening. Now that you have both values, you can simply use them in your query.
Edit after chat:
But since you want to receive a params: Map[String, Seq[String]] in your model and are only having trouble getting the authorId, here is what you can do:
def whereHelper(params: Map[String, Seq[String]]) = {
// Here I'm being defensive about the possibility that there is no
// "bookId" key in the map. If there is none, Seq.empty
// is returned. The map method only runs if there is something
// in the Seq.
val booksIds = params.getOrElse("bookId", Seq.empty).map(_.toLong)
// The same defensive approach is being used here, and also getting
// the head as an Option, so if the Seq is empty, a None will be
// returned. Again, the map will be executed only if the Option
// is a Some, returning another Some with the value as a Long.
val authorId = params.getOrElse("authorId", Seq.empty).headOption
authorId.map(_.toLong) match {
case Some(author_id) =>
(a.author_id === author_id) and
(b.bookId in booksIds)
case None =>
(b.bookId in booksIds)
}
}
Of course, the more parameters you have, the more complicated this method becomes.
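The defensive extraction described above can be tried on its own, without Play or Squeryl. The following sketch assumes a plain Map[String, Seq[String]], which is the shape of Play's request.queryString; the helper names longs and firstLong are hypothetical. Unlike a bare toLong, it silently skips values that are not valid numbers, which may or may not be what you want.

```scala
import scala.util.Try

object QueryParamsSketch {
  // Same shape as Play's request.queryString.
  type QueryString = Map[String, Seq[String]]

  // All values of a key that parse as Long; a missing key gives an empty Seq.
  def longs(params: QueryString, key: String): Seq[Long] =
    params.getOrElse(key, Seq.empty).flatMap(v => Try(v.toLong).toOption)

  // The first parseable value of a key, if there is one.
  def firstLong(params: QueryString, key: String): Option[Long] =
    longs(params, key).headOption

  // Hypothetical request: ?bookId=21&bookId=62&authorId=2
  val sample: QueryString =
    Map("bookId" -> Seq("21", "62"), "authorId" -> Seq("2"))

  val bookIds: Seq[Long] = longs(sample, "bookId")
  val authorId: Option[Long] = firstLong(sample, "authorId")
}
```

Keeping every parameter as a Seq[String] and deciding per key whether to take all values or only the head avoids needing two different map shapes.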

How to avoid any mutable things in this builder?

I have a simple Scala class like this:
class FiltersBuilder {
def build(filter: CommandFilter) = {
val result = collection.mutable.Map[String, String]()
if (filter.activity.isDefined) {
result += ("activity" -> """ some specific expression """)
} // I'm well aware that handling an Option like this is not recommended;
// it's just to keep the example simple
if (filter.gender.isDefined) {
result += ("gender" -> """ some specific expression """)
}
result.toMap //in order to return an immutable Map
}
}
It uses this class:
case class CommandFilter(activity: Option[String] = None, gender: Option[String] = None)
The result content depends on the nature of the selected filters and their associated and hardcoded expressions (String).
Is there a way to transform this code snippet by removing this "mutability" of the mutable.Map?
Map each filter field to a tuple inside a Seq, then drop the Nones with flatten, and finally convert the Seq of tuples to a Map with toMap.
To filter on more fields, you just add a new line to the Seq:
def build(filter: CommandFilter) = {
// map each filter field to the proper tuple
// as they are Options, map transforms only the Some values and leaves None as None
val result = Seq(
filter.activity.map(value => "activity" -> s""" some specific expression using $value """),
filter.gender.map(value => "gender" -> s""" some specific expression using $value """)
).flatten // flatten will filter out all the Nones
result.toMap // transform list of tuple to a map
}
Hope it helps.
Gaston.
Since there are at most 2 elements in your Map:
val activity = filter.activity.map(_ => Map("activity" -> "xx"))
val gender = filter.gender.map(_ => Map("gender" -> "xx"))
val empty = Map[String, String]()
activity.getOrElse(empty) ++ gender.getOrElse(empty)
I've just managed to achieve it with this solution:
class FiltersBuilder(commandFilter: CommandFilter) {
def build = {
val result = Map[String, String]()
buildGenderFilter(buildActivityFilter(result))
}
private def buildActivityFilter(expressions: Map[String, String]) =
commandFilter.activity.fold(expressions)(activity => expressions + ("activity" -> """ expression regarding activity """))
private def buildGenderFilter(expressions: Map[String, String]) =
commandFilter.gender.fold(expressions)(gender => expressions + ("gender" -> """ expression regarding gender """))
}
Any better way?
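One more compact variant, offered as a sketch rather than a definitive answer: an Option can be appended to an immutable Map with ++, because an Option behaves like a collection of zero or one elements. The expression strings below are hypothetical placeholders for the hardcoded expressions in the question.

```scala
final case class CommandFilter(
  activity: Option[String] = None,
  gender: Option[String] = None
)

object FiltersBuilderSketch {
  // Each Option contributes zero or one (key -> expression) pair,
  // so no mutable Map and no isDefined checks are needed.
  def build(filter: CommandFilter): Map[String, String] =
    Map.empty[String, String] ++
      filter.activity.map(v => "activity" -> s"activity = '$v'") ++
      filter.gender.map(v => "gender" -> s"gender = '$v'")
}
```

For example, FiltersBuilderSketch.build(CommandFilter(gender = Some("f"))) yields a Map with only the "gender" entry, and an empty CommandFilter yields an empty Map.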

Given a Field and a Record, how to obtain the value of that Field within that Record?

I have something like:
val jobs = Job.where(...).fetch()
val fieldsToDisplay = Seq(Job.status, Job._id, ...)
val header = fieldsToDisplay map { _.name }
val tbl = jobs map { j => fieldsToDisplay map { _.getValueIn(j) } }
renderTable(header, tbl)
...and it's that hypothetical getValueIn I'm looking for.
I've been unable to find anything, but perhaps more experienced Lift'ers know a trick.
Each Field has a name that is unique within the Record
jobs map { j =>
fieldsToDisplay map { f =>
j.fieldByName(f.name)
}
}

Why my list is empty?

I have this code:
val products = List()
def loadProducts(report: (Asset, Party, AssetModel, Location, VendingMachineReading)) = {
report match {
case (asset, party, assetModel, location, reading) =>
EvadtsParser.parseEvadts(reading.evadts, result)
(result.toMap).map(product => ReportData(
customer = party.name,
location = location.description,
asset = asset.`type`,
category = "",
product = product._1,
counter = product._2,
usage = 0,
period = "to be defined")).toList
}
}
results.foreach(result => products ::: loadProducts(result))
println(products)
Can you please tell me what I am doing wrong, because the products list is empty? If I println products inside the loadProducts method, products is not empty. Is the concatenation I am doing wrong?
PS: I am a Scala beginner.
As I've already said, ::: yields a new list instead of mutating the one you already have in place.
http://take.ms/WDB
You have two options to go: immutable and mutable
Here is what you can do in immutable and idiomatic style:
def loadProducts(report: (...)): List[...] = {
...
}
val products = results.flatMap(result => loadProducts(result))
println(products)
Alternatively, you can go with mutability and use a ListBuffer to do what you originally wanted:
def loadProducts(report: (...)): List[T] = {
...
}
val buffer = scala.collection.mutable.ListBuffer[T]()
results.foreach(result => buffer ++= loadProducts(result))
val products = buffer.toList
println(products)
P.S. flatMap(...) is equivalent to map(...).flatten, so don't be confused that Tomasz and I have written it differently.
List type is immutable, val implies a reference that never changes. Thus you can't really change the contents of products reference. I suggest building a "list of lists" first and then flattening:
val products = results.map(loadProducts).flatten
println(products)
Notice that map(loadProducts) is just a shorthand for map(loadProducts(_)), which is a shorthand for map(result => loadProducts(result)).
Once you become more experienced, try the foldLeft() approach, which builds the products list incrementally, much as you originally intended:
results.foldLeft(List[ReportData]())((agg, result) => agg ++ loadProducts(result))
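The behaviour described in these answers can be reproduced in a few lines. This is a simplified sketch: loadProducts here is a hypothetical stand-in that turns an Int into a List[String], not the real function from the question.

```scala
object ListConcatSketch {
  // Hypothetical stand-in for loadProducts: one report yields several rows.
  def loadProducts(report: Int): List[String] =
    List(s"report$report-a", s"report$report-b")

  val results: List[Int] = List(1, 2)

  // The original mistake: ::: returns a new list, and that result is
  // discarded on every iteration, so products never changes.
  val products: List[String] = List.empty[String]
  results.foreach(r => products ::: loadProducts(r))

  // Immutable alternatives that actually collect the rows.
  val viaFlatMap: List[String] = results.flatMap(loadProducts)
  val viaFold: List[String] =
    results.foldLeft(List.empty[String])((acc, r) => acc ++ loadProducts(r))
}
```

After this runs, products is still empty, while viaFlatMap and viaFold hold the same four rows in the same order.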