How to create two sequences out of one by comparing each custom object with the others in that sequence? - scala

import java.time.LocalDate
case class Submission(name: String, plannedDate: Option[LocalDate], revisedDate: Option[LocalDate])
val submission_1 = Submission("Åwesh Care", Some(LocalDate.parse("2020-05-11")), Some(LocalDate.parse("2020-06-11")))
val submission_2 = Submission("robin Dore", Some(LocalDate.parse("2020-05-11")), Some(LocalDate.parse("2020-05-30")))
val submission_3 = Submission("AIMS Hospital", Some(LocalDate.parse("2020-01-24")), Some(LocalDate.parse("2020-07-30")))
val submissions = Seq(submission_1, submission_2, submission_3)
Split the submissions so that those sharing a plannedDate and/or revisedDate with another submission
go to sameDateGroup and the others go to remainder.
val (sameDateGroup, remainder) = someFunction(submissions)
Example result as below:
sameDateGroup should have
Seq(Submission("Åwesh Care", Some(2020-05-11), Some(2020-06-11)),
Submission("robin Dore", Some(2020-05-11), Some(2020-05-30)))
and remainder should have:
Seq(Submission("AIMS Hospital", Some(2020-01-24), Some(2020-07-30)))

So, if I understand the logic here, submission A shares a date with submission B (and both would go in the sameDateGroup) IFF:
subA.plannedDate == subB.plannedDate
OR subA.plannedDate == subB.revisedDate
OR subA.revisedDate == subB.plannedDate
OR subA.revisedDate == subB.revisedDate
Likewise, and conversely, submission C belongs in the remainder category IFF:
subC.plannedDate // is unique among all planned dates
AND subC.plannedDate // does not exist among all revised dates
AND subC.revisedDate // is unique among all revised dates
AND subC.revisedDate // does not exist among all planned dates
Given all that, I think this does what you're describing.
import java.time.LocalDate
case class Submission(name : String
,plannedDate : Option[LocalDate]
,revisedDate : Option[LocalDate])
val submission_1 = Submission("Åwesh Care"
,Some(LocalDate.parse("2020-05-11"))
,Some(LocalDate.parse("2020-06-11")))
val submission_2 = Submission("robin Dore"
,Some(LocalDate.parse("2020-05-11"))
,Some(LocalDate.parse("2020-05-30")))
val submission_3 = Submission("AIMS Hospital"
,Some(LocalDate.parse("2020-01-24"))
,Some(LocalDate.parse("2020-07-30")))
val submissions = Seq(submission_1, submission_2, submission_3)
val pDates = submissions.groupBy(_.plannedDate)
val rDates = submissions.groupBy(_.revisedDate)
val (sameDateGroup, remainder) = submissions.partition(sub =>
pDates(sub.plannedDate).lengthIs > 1 ||
rDates(sub.revisedDate).lengthIs > 1 ||
pDates.keySet(sub.revisedDate) ||
rDates.keySet(sub.plannedDate))

A simple way to do this is to count the number of matching submissions for each submission in the list, and use that to partition the list:
def matching(s1: Submission, s2: Submission) =
s1.plannedDate == s2.plannedDate || s1.revisedDate == s2.revisedDate
val (sameDateGroup, remainder) =
submissions.partition { s1 =>
submissions.count(s2 => matching(s1, s2)) > 1
}
The matching function can contain whatever specific test is required.
This is O(n^2) so a more sophisticated algorithm would be needed for very long lists.
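For long lists, one possible linear-time variant is to count each date once up front and then partition against those counts. This is only a sketch under the same matching rule as the answer (planned against planned, revised against revised); the `LinearPartition` object and the `counts` helper are names made up for illustration, and `groupMapReduce` assumes Scala 2.13 or later.

```scala
import java.time.LocalDate

object LinearPartition {
  case class Submission(name: String,
                        plannedDate: Option[LocalDate],
                        revisedDate: Option[LocalDate])

  // Hypothetical helper: number of occurrences of each key, computed in one pass.
  private def counts[K](keys: Seq[K]): Map[K, Int] =
    keys.groupMapReduce(identity)(_ => 1)(_ + _)

  // A submission goes to the first group when some other submission has the
  // same plannedDate or the same revisedDate (a count above 1 means "not only me").
  def split(subs: Seq[Submission]): (Seq[Submission], Seq[Submission]) = {
    val planned = counts(subs.map(_.plannedDate))
    val revised = counts(subs.map(_.revisedDate))
    subs.partition(s => planned(s.plannedDate) > 1 || revised(s.revisedDate) > 1)
  }
}
```

Each submission is then checked with two map lookups instead of a scan over the whole list.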

I think this will do the trick.
Sorry, some variable names are not very meaningful, because I used a different case class when trying this. For some reason I only thought about using .groupBy later, so I don't really recommend this: it is a bit hard to follow, and the problem can be solved more easily with groupBy.
case class Submission(name: String, plannedDate: Option[String], revisedDate: Option[String])
val l =
List(
Submission("Åwesh Care", Some("2020-05-11"), Some("2020-06-11")),
Submission("robin Dore", Some("2020-05-11"), Some("2020-05-30")),
Submission("AIMS Hospital", Some("2020-01-24"), Some("2020-07-30")))
val t = l
.map((_, 1))
.foldLeft(Map.empty[Option[String], (List[Submission], Int)])((acc, idnTuple) => idnTuple match {
case (idn, count) => {
acc
.get(idn.plannedDate)
.map {
case (mapIdn, mapCount) => acc + (idn.plannedDate -> (idn :: mapIdn, mapCount + count))
}.getOrElse(acc + (idn.plannedDate -> (List(idn), count)))
}})
.values
.partition(_._2 > 1)
val r = (t._1.map(_._1).flatten, t._2.map(_._1).flatten)
println(r)
It basically follows the map-reduce wordcount schema.
If someone sees this, and knows how to do the tuple deconstruction easier, please let me know in the comments.
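As a rough sketch of the groupBy route mentioned in this answer: grouping on plannedDate and partitioning the groups by size avoids the manual tuple handling entirely. Like the fold, it only looks at plannedDate; the `GroupByPartition` object name is an assumption for illustration.

```scala
object GroupByPartition {
  case class Submission(name: String,
                        plannedDate: Option[String],
                        revisedDate: Option[String])

  // Groups with more than one member share a plannedDate; the rest are unique.
  def split(subs: List[Submission]): (List[Submission], List[Submission]) = {
    val (shared, unique) = subs.groupBy(_.plannedDate).values.partition(_.size > 1)
    (shared.flatten.toList, unique.flatten.toList)
  }
}
```

Within each group the original order is kept, but the order of the groups themselves is not guaranteed.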

Related

Create Map from Elements from List of case class

case class Student(id:String, name:String, teacher:String )
val myList = List( Student("1","Ramesh","Isabela"), Student("2","Elena","Mark"),Student("3","invalidKey","Someteacher"))
val a = myList.foreach( i=> (i.name -> i.teacher)).toMap.filter(i.name != "invalidKey")
I have a list of Student case class instances. I want to build a map from student name to teacher, where the name (the key of the map) will always be unique. Preferably the map should also filter out a certain name.
You're using foreach, which returns Unit as the result.
I would suggest either of these 2 below. First one is as Luis Miguel mentioned:
val myMap = myList.collect {
case student if student.name != "invalidKey" => student.name -> student.teacher
}.toMap
Or:
val myMap2 = myList.foldLeft[Map[String, String]](Map.empty) {
case (elementsMap, newElement) if newElement.name != "invalidKey" =>
elementsMap + (newElement.name -> newElement.teacher)
case (elementsMap, _) => elementsMap
}
Differences:
The first approach is much easier to read and shorter (although being shorter is not an advantage in itself :D). The second one has fewer iterations (the first needs another pass to convert to a Map).
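If the extra pass matters, one option (a sketch, not part of the original answers) is to run collect on an iterator, so no intermediate list is built before toMap. The `StudentMap` object and the `skip` parameter are hypothetical names; `Iterator#toMap` assumes Scala 2.13 or later.

```scala
object StudentMap {
  case class Student(id: String, name: String, teacher: String)

  // collect on an Iterator is lazy, so filtering, mapping and building the Map
  // happen in a single traversal of the list.
  def teacherByName(students: List[Student], skip: String): Map[String, String] =
    students.iterator
      .collect { case s if s.name != skip => s.name -> s.teacher }
      .toMap
}
```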

How can I take the highest ranking filter condition that ends up matching in a dataframe?

Wording of my question might be confusing so let me explain. Say I have an array of strings. They are ranked in order of the best case scenario match. So at index 0 we want this to always exist in the dataframe column, but if it doesn't then index 1 is the next best option. I have written this logic like this, but I don't feel like this is the most efficient way to have done this. Is there another way of doing it that is better?
The datasets are quite small, but I fear this can't really scale very well.
val df = spark.createDataFrame(data)
val nameArray = Array[String]("Name", "Name%", "%Name%", "Person Name", "Person Name%", "%Person Name%")
nameArray.foreach(x => {
val nameDf = df.where("text like '" + x + "'")
if(nameDf.count() > 0){
nameDf.show(1)
break()
}
})
If the values are ordered by preference from left (highest precedence) to right (lowest precedence), and the lower-precedence patterns already cover the higher-precedence ones (which doesn't appear to be the case in your example), you can generate an expression like this:
import org.apache.spark.sql._
import org.apache.spark.sql.functions._ // lit, when, col, max
def matched(df: DataFrame, nameArray: Seq[String], c: String = "text") = {
val matchIdx = nameArray.zipWithIndex.foldRight(lit(-1)){
case ((s, i), acc) => when(col(c) like s, lit(i)).otherwise(acc)
}
df.select(max(matchIdx)).first match {
case Row(-1) => None // No pattern matches all records
case Row(i: Int) => Some(nameArray(i))
}
}
Usage examples:
matched(Seq("Some Name", "Name", "Name Surname").toDF("text"), Seq("Name", "Name%", "%Name%"))
// Option[String] = Some(%Name%)
There are two advantages of this method:
Only one action is required.
Pattern matching can be short circuited.
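To see what the foldRight builds without running Spark, here is a plain-Scala analogue over an in-memory Seq. It is only an illustration: the `like` helper is a hypothetical stand-in that supports the '%' wildcard alone, and unlike a SQL CASE expression this version evaluates every pattern for every row.

```scala
import java.util.regex.Pattern

object PrecedenceSketch {
  // Hypothetical helper: a LIKE pattern containing only '%' wildcards, as a regex test.
  def like(pattern: String)(text: String): Boolean =
    text.matches(pattern.split("%", -1).map(Pattern.quote).mkString(".*"))

  // Index of the highest-precedence pattern matching `text`, or -1 if none does.
  // foldRight nests the tests so that the leftmost pattern is the outermost check.
  def matchIndex(patterns: Seq[String])(text: String): Int =
    patterns.zipWithIndex.foldRight(-1) {
      case ((p, i), otherwise) => if (like(p)(text)) i else otherwise
    }

  // The largest index over all rows: the pattern needed to cover the matched rows.
  def matched(texts: Seq[String], patterns: Seq[String]): Option[String] =
    texts.map(t => matchIndex(patterns)(t)).maxOption
      .filter(_ >= 0)
      .map(i => patterns(i))
}
```

With the rows and patterns from the usage example, the per-row indices are 2, 0 and 1, so the result is the pattern at index 2.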
If the preconditions are not satisfied, you can instead count, for each pattern, the rows that do not match it:
import org.apache.spark.sql.functions._
val unmatchedCount: Map[String, Long] = df.select(
nameArray.map(s => count(when(not($"text" like s), 1)).alias(s)): _*
).first.getValuesMap[Long](nameArray)
Unlike the first approach it will check all patterns, but it requires only one action.

Functional match-maker algorithm in scala

Suppose, in scala, I have a collection of Person objects where each person has an identifier and quantity value:
case class Person(identifier: String, quantity : Int)
A positive quantity represents supply and a negative quantity represents demand. Similarly, a transfer can be represented as:
case class Transfer(quantity : Int, supplier : String, consumer : String)
What is a "functional" algorithm that can maximize the transfers from suppliers to consumers by matching as much supply as possible with demand?
The signature would look something like
def matchMaker(people : Iterable[Person]) : Iterable[Transfer] = ???
Note: the collection type Iterable is not strictly necessary for either the input or output. A Set, List, etc. will suffice.
Example:
If our population is:
val people = Iterable(Person("Alice", 10),
Person("Charlie", -5),
Person("Bob", 4))
The matchmaker algorithm would create an Iterable that could be:
Iterable(Transfer(5, "Alice", "Charlie"))
Or, another possible solution could be
Iterable(Transfer(4, "Bob", "Charlie"),
Transfer(1, "Alice", "Charlie"))
A bad solution would be where only some of the possible transfer was identified:
Iterable(Transfer(4, "Bob", "Charlie")) //Charlie still has demand left
Thank you in advance for your review.
It's not pretty, and has much room for improvement in the details. But it should at least convey the idea of the algorithm and showcase some awesome Scala language features like implicits, the Ordering trait, tail recursion, ...
import scala.annotation.tailrec
implicit val peopleOrdering: Ordering[Person] = Ordering.by(_.quantity)
def matchMaker(people : Iterable[Person]) : Iterable[Transfer] = {
@tailrec
def matchMaker(sortedPeople : Vector[Person], transfers: List[Transfer]) : Iterable[Transfer] = {
// just to make a point, because I am expecting the incoming vector to be sorted
// If you are confident about your code you probably don't need the require
// However, imo, it is always a good idea to double check
// (note that require, unlike assert, is not elidable, so this check also runs in production builds)
require(sortedPeople.sorted == sortedPeople, "Passed person Vector MUST be sorted.")
if (sortedPeople.forall(_.quantity >= 0) || sortedPeople.forall(_.quantity <= 0)) {
// nothing more that can be done
transfers
} else {
val sender = sortedPeople.last
val receiver = sortedPeople.head
val transferQuantity = if (receiver.quantity + sender.quantity >= 0) {
-receiver.quantity
} else {
sender.quantity
}
val transfer = Transfer(transferQuantity, sender.identifier, receiver.identifier)
val nextPeople = sortedPeople.map {
case `sender` => sender.copy(quantity = sender.quantity - transfer.quantity)
case `receiver` => receiver.copy(quantity = receiver.quantity + transfer.quantity)
case other => other
}
matchMaker(nextPeople.sorted, transfer :: transfers)
}
}
matchMaker(people.toVector.sorted, Nil)
}
The algorithm recursively transfers from the richest person to the person with the highest debt until either no one is left in debt or no one has anything left to give. If everyone is in debt or penniless from the beginning, no transfers are made.
Split into those with supply and those wanting:
val (suppliers, demanders) = people.partition(_.quantity > 0)
Define a function to consume one supplier's quantity, returning an updated list of those wanting, and an updated list of transfers done
def consume(supplier: Person, demanders: Iterable[Person], transfers: List[Transfer]) = {
val (q, ds, ts) = demanders.foldLeft(
(supplier.quantity, Iterable[Person](), transfers))
{
case ((quantity, ds, ts), d) =>
val amount = Math.min(quantity, -d.quantity)
if (amount != 0) (quantity - amount,
ds ++ Iterable(d.copy(quantity = d.quantity + amount)),
Transfer(amount, supplier.identifier, d.identifier) :: ts)
else
(quantity, ds ++ Iterable(d), ts)
}
(ds, ts)
}
Go over the suppliers, consuming each, and passing along the updated demanders and current list of transfers
val (_, transfers) = suppliers.foldLeft((demanders, List[Transfer]()))
{ case ((ds, ts), s) => consume(s, ds, ts) }
transfers
// List(Transfer(5,Alice,Charlie))
An optimisation: drop a demander in consume if their demand is now satisfied. That way it won't be considered for later suppliers.
def consume(supplier: Person, demanders: Iterable[Person], transfers: List[Transfer]) = {
val (q, ds, ts) = demanders.foldLeft((supplier.quantity, Iterable[Person](), transfers))
{
case ((quantity, ds, ts), d) =>
val amount = Math.min(quantity, -d.quantity)
val remaining = d.quantity + amount
if (amount != 0) (quantity - amount,
if (remaining != 0) ds ++ Iterable(d.copy(quantity = remaining))
else ds,
Transfer(amount, supplier.identifier, d.identifier) :: ts)
else (quantity, ds ++ Iterable(d), ts)
}
(ds, ts)
}
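Putting the pieces of this answer together under the signature from the question might look roughly like the following. It is a sketch, not the answerer's exact code: it uses List instead of Iterable, skips people with zero quantity up front, and the `MatchMakerSketch` object name is made up.

```scala
object MatchMakerSketch {
  case class Person(identifier: String, quantity: Int)
  case class Transfer(quantity: Int, supplier: String, consumer: String)

  // One supplier serves the demanders in order. Returns the demanders that
  // still want something, plus the transfer list (newest transfer first).
  private def consume(supplier: Person,
                      demanders: List[Person],
                      transfers: List[Transfer]): (List[Person], List[Transfer]) = {
    val (_, remaining, ts) =
      demanders.foldLeft((supplier.quantity, List.empty[Person], transfers)) {
        case ((left, ds, acc), d) =>
          val amount = math.min(left, -d.quantity)
          if (amount == 0) (left, d :: ds, acc)
          else {
            val updated = d.copy(quantity = d.quantity + amount)
            val next = Transfer(amount, supplier.identifier, d.identifier) :: acc
            // Drop a demander whose demand is now fully satisfied.
            (left - amount, if (updated.quantity == 0) ds else updated :: ds, next)
          }
      }
    (remaining.reverse, ts)
  }

  def matchMaker(people: Iterable[Person]): List[Transfer] = {
    val (suppliers, demanders) =
      people.toList.filter(_.quantity != 0).partition(_.quantity > 0)
    val (_, transfers) = suppliers.foldLeft((demanders, List.empty[Transfer])) {
      case ((ds, ts), s) => consume(s, ds, ts)
    }
    transfers.reverse // oldest transfer first
  }
}
```

Which of the valid solutions comes out depends on the order of the suppliers in the input.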

How to create a List of Wildcard elements Scala

I'm trying to write a function that returns a list (for querying purposes) that has some wildcard elements:
def createPattern(query: List[(String,String)]) = {
val l = List[(_,_,_,_,_,_,_)]
var iter = query
while(iter != null) {
val x = iter.head._1 match {
case "userId" => 0
case "userName" => 1
case "email" => 2
case "userPassword" => 3
case "creationDate" => 4
case "lastLoginDate" => 5
case "removed" => 6
}
l(x) = iter.head._2
iter = iter.tail
}
l
}
So, the user enters some query terms as a list. The function parses through these terms and inserts them into val l. The fields that the user doesn't specify are entered as wildcards.
Val l is causing me troubles. Am I going the right route or are there better ways to do this?
Thanks!
Gosh, where to start. I'd begin by getting an IDE (IntelliJ / Eclipse) which will tell you when you're writing nonsense and why.
Read up on how List works. It's an immutable linked list so your attempts to update by index are very misguided.
Don't use tuples - use case classes.
You shouldn't ever need to use null and I guess here you mean Nil.
Don't use var and while - use for-expression, or the relevant higher-order functions foreach, map etc.
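To illustrate the point about immutability: a List cannot be assigned to by index, but `updated` returns a new list with one position changed. The values below are made up for the example.

```scala
object ImmutableListSketch {
  // Seven wildcard placeholders, one per field.
  val wildcards: List[String] = List.fill(7)("*")

  // `updated` leaves `wildcards` untouched and returns a modified copy.
  val withEmail: List[String] = wildcards.updated(2, "fred@example.com")
}
```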
Your code doesn't make much sense as it is, but it seems you're trying to return a 7-element list with the second element of each tuple in the input list mapped via a lookup to position in the output list.
To improve it... don't do that. What you're doing (as programmers have done since arrays were invented) is to use the index as a crude proxy for a Map from Int to whatever. What you want is an actual Map. I don't know what you want to do with it, but wouldn't it be nicer if it were from these key strings themselves, rather than by a number? If so, you can simplify your whole method to
def createPattern(query: List[(String,String)]) = query.toMap
at which point you should realise you probably don't need the method at all, since you can just use toMap at the call site.
If you insist on using an Int index, you could write
def createPattern(query: List[(String,String)]) = {
def intVal(x: String) = x match {
case "userId" => 0
case "userName" => 1
case "email" => 2
case "userPassword" => 3
case "creationDate" => 4
case "lastLoginDate" => 5
case "removed" => 6
}
val tuples = for ((key, value) <- query) yield (intVal(key), value)
tuples.toMap
}
Not sure what you want to do with the resulting list, but you can't create a List of wildcards like that.
What do you want to do with the resulting list, and what type should it be?
Here's how you might build something if you wanted the result to be a List[String], and if you wanted wildcards to be "*":
def createPattern(query:List[(String,String)]) = {
val wildcard = "*"
def orElseWildcard(key:String) = query.find(_._1 == key).map(_._2).getOrElse(wildcard)
orElseWildcard("userId") ::
orElseWildcard("userName") ::
orElseWildcard("email") ::
orElseWildcard("userPassword") ::
orElseWildcard("creationDate") ::
orElseWildcard("lastLoginDate") ::
orElseWildcard("removed") ::
Nil
}
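The same idea can be written a little more compactly by converting the query to a Map once and mapping over the field names. This is a sketch; the `PatternSketch` object, the `fields` list and the default `wildcard` parameter are assumptions made for illustration.

```scala
object PatternSketch {
  // Field names in the order the resulting list should have.
  val fields: List[String] = List(
    "userId", "userName", "email", "userPassword",
    "creationDate", "lastLoginDate", "removed")

  // Every field the query does not mention falls back to the wildcard.
  def createPattern(query: List[(String, String)], wildcard: String = "*"): List[String] = {
    val supplied = query.toMap
    fields.map(field => supplied.getOrElse(field, wildcard))
  }
}
```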
You're not using List, Tuple, iterator, or wild-cards correctly.
I'd take a different approach - maybe something like this:
case class Pattern ( valueMap:Map[String,String] ) {
def this( valueList:List[(String,String)] ) = this( valueList.toMap )
val Seq(
userId,userName,email,userPassword,creationDate,
lastLoginDate,removed
):Seq[Option[String]] = Seq( "userId", "userName",
"email", "userPassword", "creationDate", "lastLoginDate",
"removed" ).map( valueMap.get(_) )
}
Then you can do something like this:
scala> val pattern = new Pattern( List( "userId" -> "Fred" ) )
pattern: Pattern = Pattern(Map(userId -> Fred))
scala> pattern.email
res2: Option[String] = None
scala> pattern.userId
res3: Option[String] = Some(Fred)
Or just use the map directly.

Scala: Detecting a Straight in a 5-card Poker hand using pattern matching

For those who don't know what a 5-card Poker Straight is: http://en.wikipedia.org/wiki/List_of_poker_hands#Straight
I'm writing a small Poker simulator in Scala to help me learn the language, and I've created a Hand class with 5 ordered Cards in it. Each Card has a Rank and Suit, both defined as Enumerations. The Hand class has methods to evaluate the hand rank, and one of them checks whether the hand contains a Straight (we can ignore Straight Flushes for the moment). I know there are a few nice algorithms for determining a Straight, but I wanted to see whether I could design something with Scala's pattern matching, so I came up with the following:
def isStraight() = {
def matchesStraight(ranks: List[Rank.Value]): Boolean = ranks match {
case head :: Nil => true
case head :: tail if (Rank(head.id + 1) == tail.head) => matchesStraight(tail)
case _ => false
}
matchesStraight(cards.map(_.rank).toList)
}
That works fine and is fairly readable, but I was wondering if there is any way to get rid of that if. I'd imagine something like the following, though I can't get it to work:
private def isStraight() = {
def matchesStraight(ranks: List[Rank.Value]): Boolean = ranks match {
case head :: Nil => true
case head :: next(head.id + 1) :: tail => matchesStraight(next :: tail)
case _ => false
}
matchesStraight(cards.map(_.rank).toList)
}
Any ideas? Also, as a side question, what is the general opinion on the inner matchesStraight definition? Should this rather be private or perhaps done in a different way?
You can't pass information to an extractor, and you can't use information from one bound value in another part of the same pattern, except in the if guard, which is there to cover all these cases.
What you can do is create your own extractors to test these things, but it won't gain you much if there isn't any reuse.
For example:
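// Matches a sequence whose mapped values each exceed the previous one by exactly 1.
class SeqExtractor[A](f: A => Int) {
  def unapply(s: Seq[A]): Option[Seq[A]] =
    if (s.map(f).sliding(2).forall {
      case Seq(a, b) => b == a + 1
      case _ => true // a window shorter than 2 has nothing to compare
    }) Some(s)
    else None
}
val Straight = new SeqExtractor((_: Card).rank.id) // assumes rank is an Enumeration value, as in the question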
Then you can use it like this:
listOfCards match {
case Straight(cards) => true
case _ => false
}
But, of course, all that you really want is that if test inside SeqExtractor. So, don't fall too much in love with a solution, as you may miss simpler ways of doing things.
You could do something like:
val ids = ranks.map(_.id)
ids.max - ids.min == 4 && ids.distinct.length == 5
Handling aces correctly requires a bit of work, though.
Update: Here's a much better solution:
(ids zip ids.tail).forall{case (p,q) => q%13==(p+1)%13}
The % 13 in the comparison handles aces being both rank 1 and rank 14.
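For the max/min check, one way to handle the ace is simply to test the hand twice, once with the ace high and once with it low. This is a sketch under the assumption that ranks are plain Ints from 2 to 14 with the ace as 14; the `StraightSketch` object name is made up.

```scala
object StraightSketch {
  // Assumption: ranks are Ints 2..14, with the ace represented as 14.
  def isStraight(ranks: Seq[Int]): Boolean = {
    // Five distinct ranks spanning exactly four steps must be consecutive.
    def consecutive(rs: Seq[Int]): Boolean =
      rs.distinct.length == 5 && rs.max - rs.min == 4

    // Second attempt treats the ace as 1 so that A-2-3-4-5 also counts.
    ranks.length == 5 &&
      (consecutive(ranks) || consecutive(ranks.map(r => if (r == 14) 1 else r)))
  }
}
```

Unlike the modulo comparison, this does not depend on the order of the cards and does not accept wrap-around hands such as Q-K-A-2-3.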
How about something like:
def isStraight(cards:List[Card]) = (cards zip cards.tail) forall { case (c1,c2) => c1.rank+1 == c2.rank}
val cards = List(Card(1),Card(2),Card(3),Card(4))
scala> isStraight(cards)
res2: Boolean = true
This is a completely different approach, but it does use pattern matching. It produces warnings in the match clause which seem to indicate that it shouldn't work, but it actually produces the correct results:
Straight !!! 34567
Straight !!! 34567
Sorry no straight this time
I ignored the suits for now and I also ignored the possibility of an ace under a 2.
abstract class Rank {
def value : Int
}
case class Next[A <: Rank](a : A) extends Rank {
def value = a.value + 1
}
case class Two() extends Rank {
def value = 2
}
class Hand(a : Rank, b : Rank, c : Rank, d : Rank, e : Rank) {
val cards = List(a, b, c, d, e).sortWith(_.value < _.value)
}
object Hand{
def unapply(h : Hand) : Option[(Rank, Rank, Rank, Rank, Rank)] = Some((h.cards(0), h.cards(1), h.cards(2), h.cards(3), h.cards(4)))
}
object Poker {
val two = Two()
val three = Next(two)
val four = Next(three)
val five = Next(four)
val six = Next(five)
val seven = Next(six)
val eight = Next(seven)
val nine = Next(eight)
val ten = Next(nine)
val jack = Next(ten)
val queen = Next(jack)
val king = Next(queen)
val ace = Next(king)
def main(args : Array[String]) {
val simpleStraight = new Hand(three, four, five, six, seven)
val unsortedStraight = new Hand(four, seven, three, six, five)
val notStraight = new Hand (two, two, five, five, ace)
printIfStraight(simpleStraight)
printIfStraight(unsortedStraight)
printIfStraight(notStraight)
}
def printIfStraight[A](h : Hand) {
h match {
case Hand(a: A , b : Next[A], c : Next[Next[A]], d : Next[Next[Next[A]]], e : Next[Next[Next[Next[A]]]]) => println("Straight !!! " + a.value + b.value + c.value + d.value + e.value)
case Hand(a,b,c,d,e) => println("Sorry no straight this time")
}
}
}
If you are interested in more stuff like this google 'church numerals scala type system'
How about something like this?
def isStraight = {
cards.map(_.rank).toList match {
case first :: second :: third :: fourth :: fifth :: Nil if
first.id == second.id - 1 &&
second.id == third.id - 1 &&
third.id == fourth.id - 1 &&
fourth.id == fifth.id - 1 => true
case _ => false
}
}
You're still stuck with the if (which is in fact larger) but there's no recursion or custom extractors (which I believe you're using incorrectly with next and so is why your second attempt doesn't work).
If you're writing a poker program, you are already checking for n-of-a-kind. A hand is a straight when it has no n-of-a-kinds (n > 1) and the difference between the minimum denomination and the maximum is exactly four.
I was doing something like this a few days ago, for Project Euler problem 54. Like you, I had Rank and Suit as enumerations.
My Card class looks like this:
case class Card(rank: Rank.Value, suit: Suit.Value) extends Ordered[Card] {
def compare(that: Card) = that.rank compare this.rank
}
Note I gave it the Ordered trait so that we can easily compare cards later. Also, when parsing the hands, I sorted them from high to low using sorted, which makes assessing values much easier.
Here is my straight test which returns an Option value depending on whether it's a straight or not. The actual return value (a list of Ints) is used to determine the strength of the hand, the first representing the hand type from 0 (no pair) to 9 (straight flush), and the others being the ranks of any other cards in the hand that count towards its value. For straights, we're only worried about the highest ranking card.
Also, note that you can make a straight with Ace as low, the "wheel", or A2345.
case class Hand(cards: Array[Card]) {
...
def straight: Option[List[Int]] = {
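// cards are sorted high to low; compare ids directly, since compare on Enumeration values only yields -1, 0 or 1
if( cards.sliding(2).forall { case Array(x, y) => x.rank.id - y.rank.id == 1 } )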
Some(5 :: cards(0).rank.id :: 0 :: 0 :: 0 :: 0 :: Nil)
else if ( cards.map(_.rank.id).toList == List(12, 3, 2, 1, 0) )
Some(5 :: cards(1).rank.id :: 0 :: 0 :: 0 :: 0 :: Nil)
else None
}
}
Here is a complete idiomatic Scala hand classifier for all hands (handles 5-high straights):
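// Assumed companion (not shown in the original post): Card.ranks is referenced below,
// and an Ordering[Card] is needed for hand.max, hand.min and sorted.
object Card {
  val ranks = "23456789TJQKA"
  val suits = "♣♠♦♥"
  implicit val cardOrdering: Ordering[Card] = Ordering.by(_.rank)
}
case class Card(rank: Int, suit: Int) { override def toString = s"${Card.ranks(rank)}${Card.suits(suit)}" }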
object HandType extends Enumeration {
val HighCard, OnePair, TwoPair, ThreeOfAKind, Straight, Flush, FullHouse, FourOfAKind, StraightFlush = Value
}
case class Hand(hand: Set[Card]) {
val (handType, sorted) = {
def rankMatches(card: Card) = hand count (_.rank == card.rank)
val groups = hand groupBy rankMatches mapValues {_.toList.sorted}
val isFlush = (hand groupBy {_.suit}).size == 1
val isWheel = "A2345" forall {r => hand exists (_.rank == Card.ranks.indexOf(r))} // A,2,3,4,5 straight
val isStraight = groups.size == 1 && (hand.max.rank - hand.min.rank) == 4 || isWheel
val (isThreeOfAKind, isOnePair) = (groups contains 3, groups contains 2)
val handType = if (isStraight && isFlush) HandType.StraightFlush
else if (groups contains 4) HandType.FourOfAKind
else if (isThreeOfAKind && isOnePair) HandType.FullHouse
else if (isFlush) HandType.Flush
else if (isStraight) HandType.Straight
else if (isThreeOfAKind) HandType.ThreeOfAKind
else if (isOnePair && groups(2).size == 4) HandType.TwoPair
else if (isOnePair) HandType.OnePair
else HandType.HighCard
val kickers = ((1 until 5) flatMap groups.get).flatten.reverse
require(hand.size == 5 && kickers.size == 5)
(handType, if (isWheel) (kickers takeRight 4) :+ kickers.head else kickers)
}
}
object Hand {
import scala.math.Ordering.Implicits._
implicit val rankOrdering = Ordering by {hand: Hand => (hand.handType, hand.sorted)}
}