Functional version of a typical nested while loop - scala

I hope this question pleases functional programming lovers. Could I ask for a way to translate the following fragment of code into a purely functional implementation in Scala, with a good balance between readability and execution speed?
Description: for each element in a sequence, produce a sub-sequence containing the elements that come after the current element (including the element itself) and lie at a distance smaller than a given threshold. Once the threshold is crossed, the remaining elements need not be processed.
def getGroupsOfElements(input: Seq[Element]): Seq[Seq[Element]] = {
  val maxDistance = 10 // put any number you like
  var outerSequence = Seq.empty[Seq[Element]]
  for (index <- 0 until input.length) {
    var anotherIndex = index + 1
    var innerSequence = Seq(input(index))
    // distance(a, b) is assumed to be a separate function computing the distance;
    // the bounds check comes first so that the last element is never read out of range
    while (anotherIndex < input.length && distance(input(index), input(anotherIndex)) < maxDistance) {
      innerSequence = innerSequence :+ input(anotherIndex)
      anotherIndex = anotherIndex + 1
    }
    outerSequence = outerSequence :+ innerSequence
  }
  outerSequence
}

You know, this would be a ton easier if you added a description of what you're trying to accomplish along with the code.
Anyway, here's something that might get close to what you want.
def getGroupsOfElements(input: Seq[Element]): Seq[Seq[Element]] =
  input.tails.map(x => x.takeWhile(y => distance(x.head, y) < maxDistance)).toSeq
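To make the one-liner above easy to try, here is a minimal runnable sketch. It rests on assumptions that are not in the original: elements are plain `Int`s, `distance` is the absolute difference, and `maxDistance` is passed as a parameter. Since `tails` also yields a final empty sequence, this variant filters it out.

```scala
// Assumption: Int elements and absolute difference as the distance.
def distance(a: Int, b: Int): Int = math.abs(a - b)

def getGroupsOfElements(input: Seq[Int], maxDistance: Int): List[Seq[Int]] =
  input.tails
    .filter(_.nonEmpty) // tails ends with an empty sequence; skip it
    .map(tail => tail.takeWhile(y => distance(tail.head, y) < maxDistance))
    .toList
```

For example, `getGroupsOfElements(Seq(1, 3, 12, 14, 30), 10)` groups each element with the followers that stay within distance 10 of it.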

Optimize "range-join" in plain scala (not Spark!)

I have two ordered sequences: one (large) is a range of positions, the other (small) is a sequence of attributes defined on position_from/position_to ranges (position_von/position_bis in the code), which I'd like to join.
So for each element of positions I need to traverse the other sequence, which is not optimal:
def interpolateCurveOnPos(position: Seq[Double], curveAttributes: Seq[CurveSegment]) = {
  position.map { pos =>
    // range join
    val cs = curveAttributes.find(c => pos >= c.position_von && pos < c.position_bis).get
    // interpolate curve attribute
    val curve = cs.curve_von + (pos - cs.position_von) * (cs.curve_bis - cs.curve_von) / (cs.position_bis - cs.position_von)
    curve
  }
}
What I've tried:
As the index at which the matching CurveSegment is found will always increase, I've introduced some state variables to reduce the search for the correct entry:
def interpolateCurveOnPos(position: Seq[Double], curveAttributes: Seq[CurveSegment]) = {
  var idxSave = 0
  var csSave: CurveSegment = curveAttributes.head
  position.map { pos =>
    // range join
    val cs = curveAttributes.drop(idxSave).find(c => pos >= c.position_von && pos < c.position_bis).get
    if (cs != csSave) {
      csSave = cs
      idxSave = idxSave + 1
    }
    // interpolate
    val curve = cs.curve_von + (pos - cs.position_von) * (cs.curve_bis - cs.curve_von) / (cs.position_bis - cs.position_von)
    curve
  }
}
I wonder if there is a more elegant way to do it?
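One possible direction, offered only as a sketch: because both sequences are sorted, a single cursor into the segments can advance monotonically, so each segment is visited at most once overall. The `CurveSegment` definition below is an assumption reconstructed from the field names used in the question, and `interpolateSorted` is a hypothetical name.

```scala
// Assumed shape of CurveSegment, inferred from the fields used in the question.
final case class CurveSegment(position_von: Double, position_bis: Double,
                              curve_von: Double, curve_bis: Double)

// Merge-style range join: positions and segments must both be sorted ascending.
def interpolateSorted(positions: Seq[Double], segments: Seq[CurveSegment]): Vector[Double] = {
  val segs = segments.toVector // indexed access in O(1)
  val out = Vector.newBuilder[Double]
  var i = 0 // cursor into segs; it never moves backwards
  positions.foreach { pos =>
    // skip segments that end at or before the current position
    while (i < segs.length && pos >= segs(i).position_bis) i += 1
    require(i < segs.length && pos >= segs(i).position_von, s"no segment covers $pos")
    val cs = segs(i)
    out += cs.curve_von + (pos - cs.position_von) * (cs.curve_bis - cs.curve_von) / (cs.position_bis - cs.position_von)
  }
  out.result()
}
```

The mutable cursor is confined to the function body, so callers still see a pure function; the total work is proportional to the number of positions plus the number of segments.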

Minimum cost solution to connect all elements in set A to at least one element in set B

I need to find the shortest set of paths to connect each element of Set A with at least one element of Set B. Repetitions in A OR B are allowed (but not both), and no element can be left unconnected. Something like this:
I'm representing the elements as integers, so the "cost" of a connection is just the absolute value of the difference. I also have a cost for crossing paths, so if Set A = [60, 64] and Set B = [63, 67], then (60 -> 67) incurs an additional cost. There can be any number of elements in either set.
I've calculated the table of transitions and costs (distances and crossings), but I can't find the algorithm to find the lowest-cost solution. I keep ending up with either too many connections (i.e., repetitions in both A and B) or greedy solutions that omit elements (e.g., when A and B are non-overlapping). I haven't been able to find examples of precisely this kind of problem online, so I hoped someone here might be able to help, or at least point me in the right direction. I'm not a graph theorist (obviously!), and I'm writing in Swift, so code examples in Swift (or pseudocode) would be much appreciated.
UPDATE: The solution offered by @Daniel is almost working, but it does occasionally add unnecessary duplicates. I think this may have something to do with the sorting of the priorityQueue: the duplicates always involve identical elements with identical costs. My first thought was to add some kind of "positional encoding" (yes, Transformer-speak) to the costs, so that the costs are offset by their positions (though of course, this doesn't guarantee unique costs). I thought I'd post my Swift version here, in case anyone has any ideas:
public static func voiceLeading(from chA: [Int], to chB: [Int]) -> Set<[Int]> {
    var result: Set<[Int]> = Set()
    let im = intervalMatrix(chA, chB: chB)
    if im.count == 0 { return [[0]] }
    let vc = voiceCrossingCostsMatrix(chA, chB: chB, cost: 4)
    // NOTE: cm contains the weights
    let cm = VectorUtils.absoluteAddMatrix(im, toMatrix: vc)
    var A_links: [Int:Int] = [:]
    var B_links: [Int:Int] = [:]
    var priorityQueue: [Entry] = []
    for (i, a) in chA.enumerated() {
        for (j, b) in chB.enumerated() {
            priorityQueue.append(Entry(a: a, b: b, cost: cm[i][j]))
            if A_links[a] != nil {
                A_links[a]! += 1
            } else {
                A_links[a] = 1
            }
            if B_links[b] != nil {
                B_links[b]! += 1
            } else {
                B_links[b] = 1
            }
        }
    }
    priorityQueue.sort { $0.cost > $1.cost }
    while priorityQueue.count > 0 {
        let entry = priorityQueue[0]
        if A_links[entry.a]! > 1 && B_links[entry.b]! > 1 {
            A_links[entry.a]! -= 1
            B_links[entry.b]! -= 1
        } else {
            result.insert([entry.a, (entry.b - entry.a)])
        }
        priorityQueue.remove(at: 0)
    }
    return result
}
Of course, since the duplicates have identical scores, it shouldn't be a problem to just remove the extras, but it feels a bit hackish...
UPDATE 2: Slightly less hackish (but still a bit!); since the requirement is that my result should have equal cardinality to max(|A|, |B|), I can actually just stop adding entries to my result when I've reached the target cardinality. Seems okay...
UPDATE 3: Resurrecting this old question, I've recently had some problems arise from the fact that the above algorithm doesn't fulfill my requirement |S| == max(|A|, |B|) (where S is the set of pairings). If anyone knows of a simple way of ensuring this it would be much appreciated. (I'll obviously be poking away at possible changes.)
This is an easy task:
Add all edges of the graph to a priority_queue, where the highest priority goes to the edge with the biggest weight.
Look at each edge e = (u, v, w) in the priority_queue, where u is in A, v is in B and w is the weight.
If removing e from the graph doesn't leave u or v isolated, remove it.
Otherwise, e is part of the answer.
This should be enough for your case:
#include <bits/stdc++.h>
using namespace std;

struct edge {
    int u, v, w;
    edge(){}
    edge(int up, int vp, int wp){u = up; v = vp; w = wp;}
    void print(){ cout<<"("<<u<<", "<<v<<")"<<endl; }
    // max-heap ordering: the heaviest edge comes out of the queue first
    bool operator<(const edge& rhs) const {return w < rhs.w;}
};

vector<edge> E; // edge set
priority_queue<edge> pq;
vector<edge> ans;
int grade[5] = {3, 3, 2, 2, 2}; // degree of each vertex in the example graph

int main(){
    E.push_back(edge(0, 2, 1)); E.push_back(edge(0, 3, 1)); E.push_back(edge(0, 4, 4));
    E.push_back(edge(1, 2, 5)); E.push_back(edge(1, 3, 2)); E.push_back(edge(1, 4, 0));
    for(int i = 0; i < E.size(); i++) pq.push(E[i]);
    while(!pq.empty()){
        edge e = pq.top();
        // drop the edge only if both endpoints keep at least one other edge
        if(grade[e.u] > 1 && grade[e.v] > 1){
            grade[e.u]--; grade[e.v]--;
        }
        else ans.push_back(e);
        pq.pop();
    }
    for(int i = 0; i < ans.size(); i++) ans[i].print();
    return 0;
}
Complexity: O(E lg(E)).
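The greedy steps above can also be sketched in Scala. This is a minimal sketch under stated assumptions, not a definitive implementation: `Edge` and `reverseDeleteCover` are hypothetical names, the vertices of A and B are assumed to carry distinct integer ids (as in the C++ example), degrees are counted from the edge list instead of being hard-coded, and ties between equal weights are broken by input order.

```scala
import scala.collection.mutable

// Hypothetical edge type: u is a vertex of A, v a vertex of B, w the weight.
final case class Edge(u: Int, v: Int, w: Int)

def reverseDeleteCover(edges: Seq[Edge]): List[Edge] = {
  // count how many edges touch each vertex
  val degree = mutable.Map.empty[Int, Int].withDefaultValue(0)
  edges.foreach { e =>
    degree(e.u) += 1
    degree(e.v) += 1
  }
  val kept = List.newBuilder[Edge]
  // visit edges from heaviest to lightest (sortBy is stable)
  for (e <- edges.sortBy(e => -e.w)) {
    if (degree(e.u) > 1 && degree(e.v) > 1) {
      // both endpoints stay connected without e, so drop it
      degree(e.u) -= 1
      degree(e.v) -= 1
    } else kept += e // removing e would isolate a vertex, so keep it
  }
  kept.result()
}
```

On the six edges of the C++ example this keeps (0, 2), (0, 3) and (1, 4).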
I think this problem is "minimum weighted bipartite matching" (searching for "maximum weighted bipartite matching" would also be relevant; it's just the opposite).

Programs for printing reverse triangle patterns with * in scala

I am new to Scala and trying to explore it. This might be a simple question; I searched Google for the scenario below but couldn't find answers, only Java-related results instead of Scala.
I need to print a pattern like the one below.
* * * * *
* * * *
* * *
*
Can someone suggest how to get this format?
Thanks in advance.
Kanti
Just for the sake of illustration, here are two possible solutions to the problem.
The first one is completely imperative, while the second one is more functional.
The idea is that this serves as an example to help you think about how to solve problems programmatically.
As many of us have already commented, if you do not understand the basic ideas behind the solution, then this code will be useless in the long term.
Here is the imperative solution. The idea is simple: we need to print n lines, and each line contains n - i stars (where i is the number of the line, starting at 0). The stars are separated by a space.
Finally, before printing the stars, we need some padding. Looking at the example output, you can see that the padding starts at 0 and increases by 1 for each line.
def printReverseTriangle(n: Int): Unit = {
  var i = 0
  var padding = 0
  while (i < n) {
    var j = padding
    while (j > 0) {
      print(" ")
      j -= 1
    }
    var k = n - i
    while (k > 0) {
      print("* ")
      k -= 1
    }
    println()
    i += 1
    padding += 1
  }
}
And here is a more functional approach.
As you can see, in this case we do not need to mutate anything; the high-level operators do that for us, and we only need to focus on describing the solution.
def printReverseTriangle(size: Int): Unit = {
  def makeReverseTriangle(size: Int): List[String] =
    List.tabulate(size) { i =>
      // row i holds i + 1 stars, so the widest row has `size` stars and no padding
      (" " * (size - i - 1)) + ("* " * (i + 1))
    }.reverse
  println(makeReverseTriangle(size).mkString("\n"))
}
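As a small illustration of the functional style described above, the rows can also be built by a pure function that returns them instead of printing, which makes the result easy to inspect. `reverseTriangleLines` is a hypothetical name, and this variant joins the stars with `mkString` so that rows carry no trailing space.

```scala
// Row `row` (0-based) has `row` spaces of padding and `size - row` stars.
def reverseTriangleLines(size: Int): List[String] =
  List.tabulate(size) { row =>
    val stars = List.fill(size - row)("*").mkString(" ")
    (" " * row) + stars
  }

// Printing stays at the edge of the program.
def printLines(size: Int): Unit =
  reverseTriangleLines(size).foreach(println)
```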
To add an alternative to Luis's answer, here's a recursive solution:
import scala.annotation.tailrec
def printStars(i: Int): Unit = {
  @tailrec
  def loop(j: Int): Unit = {
    if (j > 0) {
      val stars = Range(0, j).map(_ => "*").mkString(" ") // make stars
      if (i == j) println(stars) // no need for spaces
      else println((" " * (i - j)) + stars) // spaces before the stars
      loop(j - 1)
    }
  }
  loop(i)
}
printStars(3)
// * * *
//  * *
//   *
This function takes a maximum triangle size (i) and, starting from that size, prints the correct number of stars (and spaces) for each j, decrementing j by 1 until it is no longer greater than 0.
Note: Range(0, j).map(_ => "*").mkString(" ") can be replaced with List.tabulate(j)(_ => "*").mkString(" ") per Luis's answer - I'm not sure which is faster (I've not tested it).

How to optimize this algorithm that finds all maximal matchings in a graph?

In my app, people give grades to each other, out of ten points. Each day, an algorithm computes a match for as many people as possible (it's impossible to compute a match for everyone). It builds a graph where the vertices are users and the edges are the grades.
I simplify the problem by saying that if two people have graded each other, there is an edge between them whose weight is the average of their respective grades. But if A gives a grade to B and B doesn't grade A, there is no edge between them and they can never match; this way, the graph is no longer directed.
I would like everybody to be happy on average, but at the same time I would like as few people as possible to be left without a match.
Being very deterministic, I made an algorithm that finds ALL maximal matchings in a graph. I did that because I thought I could analyse all these maximal matchings and apply a value function that could look like:
V(Matching) = exp(|M| / max(|M|)) * sum(weight of all Edge in M)
That is to say, a matching is high-valued if its cardinality is close to that of the maximum matching and if the sum of the grades between people is high. I apply an exponential function to the ratio |M|/max|M| because I consider it a big problem if that ratio is lower than 0.8 (so the exp will be tuned to sharply decrease V as |M|/max|M| approaches 0.8).
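For concreteness, the value function as literally written above can be sketched in Scala. This is only an illustration: `WeightedEdge` and `matchingValue` are hypothetical names, and the exponential is used untuned, exactly as in the formula, without the adjustment around 0.8 that the text mentions.

```scala
// Hypothetical edge type: two user ids and the averaged grade as weight.
final case class WeightedEdge(v1: Int, v2: Int, w: Double)

// V(M) = exp(|M| / max|M|) * sum of the weights of the edges in M
def matchingValue(matching: Seq[WeightedEdge], maxCardinality: Int): Double = {
  require(maxCardinality > 0, "the maximum matching must contain at least one edge")
  math.exp(matching.size.toDouble / maxCardinality) * matching.map(_.w).sum
}
```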
I would then take the matching for which V(M) is maximal. The big problem, though, is that my function that computes all maximal matchings takes a lot of time. For only 15 vertices and 20 edges, it takes almost 10 minutes...
Here is the algorithm (in Swift) :
import Foundation

struct Edge : CustomStringConvertible {
    var description: String {
        return "e(\(v1), \(v2))"
    }
    let v1:Int
    let v2:Int
    let w:Int?
    init(_ arrint:[Int])
    {
        v1 = arrint[0]
        v2 = arrint[1]
        w = nil
    }
    init(_ v1:Int, _ v2:Int)
    {
        self.v1 = v1
        self.v2 = v2
        w = nil
    }
    init(_ v1:Int, _ v2:Int, _ w:Int)
    {
        self.v1 = v1
        self.v2 = v2
        self.w = w
    }
}
let mygraph:[Edge] =
[
    Edge([1, 2]), Edge([1, 5]), Edge([2, 5]), Edge([2, 3]),
    Edge([3, 4]), Edge([3, 6]), Edge([5, 6]), Edge([2, 6]),
    Edge([4, 1]), Edge([3, 5]), Edge([4, 2]), Edge([7, 1]),
    Edge([7, 2]), Edge([8, 1]), Edge([9, 8]), Edge([11, 2]),
    Edge([11, 8]), Edge([12, 13]), Edge([1, 6]), Edge([4, 7]),
    Edge([5, 7]), Edge([3, 5]), Edge([9, 1]), Edge([10, 11]),
    Edge([10, 4]), Edge([10, 2]), Edge([10, 1]), Edge([10, 12]),
]
// remove all the edges and vertices "touching" the edges and vertices in "edgePath"
func reduce (graph:[Edge], edgePath:[Edge]) -> [Edge]
{
    var alreadyUsedV:[Int] = []
    for edge in edgePath
    {
        alreadyUsedV.append(edge.v1)
        alreadyUsedV.append(edge.v2)
    }
    return graph.filter({ edge in
        return alreadyUsedV.first(where:{ edge.v1 == $0 }) == nil && alreadyUsedV.first(where:{ edge.v2 == $0 }) == nil
    })
}
func findAllMaximalMatching(graph Gi:[Edge]) -> [[Edge]]
{
    var matchings:[[Edge]] = []
    var G = Gi // current graph (reduced at each depth)
    var M:[Edge] = [] // current matching being built
    var Cx:[Int] = [] // current path in the possibilities tree
    // eg : Cx[1] = 3 : for the depth 1, we are at the 3rd edge
    var d:Int = 0 // current depth
    var debug_it = 0
    while(true)
    {
        if(G.count == 0) // if there is no available edge in graph, it means we have a matching
        {
            if(M.count > 0) // security, if initial Graph is empty we cannot return an empty matching
            {
                matchings.append(M)
            }
            if(d == 0)
            {
                // depth = 0, we cannot decrement d, we have finished all the tree possibilities
                break
            }
            d = d - 1
            _ = M.popLast()
            G = reduce(graph: Gi, edgePath: M)
        }
        else
        {
            let indexForThisDepth = Cx.count > d ? Cx[d] + 1 : 0
            if(G.count < indexForThisDepth + 1)
            {
                // depth ended,
                _ = Cx.popLast()
                if( d == 0)
                {
                    break
                }
                d = d - 1
                _ = M.popLast()
                // reduce from initial graph to the decremented depth
                G = reduce(graph: Gi, edgePath: M)
            }
            else
            {
                // matching not finished to be built
                M.append( G[indexForThisDepth] )
                if(indexForThisDepth == 0)
                {
                    Cx.append(indexForThisDepth)
                }
                else
                {
                    Cx[d] = indexForThisDepth
                }
                d = d + 1
                G = reduce(graph: G, edgePath: M)
            }
        }
        debug_it += 1
    }
    print("matching counts : \(matchings.count)")
    print("iterations : \(debug_it)")
    return matchings
}
let m = findAllMaximalMatching(graph: mygraph)
// we have computed all the maximal matchings; now we loop through them to find the one with maximum V(Mi)
// ....
Finally, my question is: how can I optimize this algorithm so that it finds all maximal matchings and evaluates my value function on them to find the best matching for my app in polynomial time?
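As a point of comparison, here is a compact recursive enumeration sketched in Scala. It is an illustration under assumptions, not a fix for the polynomial-time requirement: the number of maximal matchings can itself grow exponentially, so any full enumeration stays exponential in the worst case. The sketch decides "take or skip" for each edge in a fixed order, so every matching is produced once rather than once per ordering of its edges. `allMaximalMatchings` is a hypothetical name, edges are plain vertex pairs, and duplicate edges in the input should be removed first.

```scala
// Enumerates each maximal matching exactly once (input edges assumed distinct).
def allMaximalMatchings(edges: Vector[(Int, Int)]): Vector[Vector[(Int, Int)]] = {
  def go(i: Int, used: Set[Int], acc: Vector[(Int, Int)]): Vector[Vector[(Int, Int)]] =
    if (i == edges.length) {
      // maximal iff every edge of the graph touches an already matched vertex
      if (edges.forall { case (a, b) => used(a) || used(b) }) Vector(acc)
      else Vector.empty
    } else {
      val (a, b) = edges(i)
      val skipped = go(i + 1, used, acc)
      if (used(a) || used(b)) skipped // edge i conflicts with the matching so far
      else go(i + 1, used + a + b, acc :+ edges(i)) ++ skipped
    }
  go(0, Set.empty, Vector.empty)
}
```

On the path 1-2-3-4 this yields the two maximal matchings {(1,2), (3,4)} and {(2,3)}.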
I may be missing something since the question is quite complicated, but why not simply treat this as a maximum flow problem, with every vertex appearing twice and the edge weights being the average grades where they exist? If configured correctly, it will return the maximum flow and runs in polynomial time.

Scala - Project euler #8

I'm currently learning Scala and I'm trying to solve some of the Euler Challenges with it.
I have some problems getting the right answer to the 8th challenge and I really don't know where my bug is.
object Product{
  def main(args: Array[String]): Unit = {
var s = "7316717653133062491922511967442657474235534919493496983520312774506326239578318016984801869478851843858615607891129494954595017379583319528532088055111254069874715852386305071569329096329522744304355766896648950445244523161731856403098711121722383113622298934233803081353362766142828064444866452387493035890729629049156044077239071381051585930796086670172427121883998797908792274921901699720888093776657273330010533678812202354218097512545405947522435258490771167055601360483958644670632441572215539753697817977846174064955149290862569321978468622482839722413756570560574902614079729686524145351004748216637048440319989000889524345065854122758866688116427171479924442928230863465674813919123162824586178664583591245665294765456828489128831426076900422421902267105562632111110937054421750694165896040807198403850962455444362981230987879927244284909188845801561660979191338754992005240636899125607176060588611646710940507754100225698315520005593572972571636269561882670428252483600823257530420752963450";
    var len = 13;
    var bestSet = s.substring(0,len);
    var currentSet = "";
    var i = 0;
    var compare = 0;
    for(i <- 1 until s.length - len){
      currentSet = s.substring(i,i+len);
      compare = compareBlocks(bestSet,currentSet);
      if(compare == 1) bestSet = currentSet;
    }
    println(bestSet);
    var result = 1L;
    var c = ' ';
    for(c <- bestSet.toCharArray){
      result = result * c.asDigit.toLong;
    }
    println(result);
  }
  def compareBlocks(block1: String, block2: String): Int = {
    var i = 0;
    var v1 = 0;
    var v2 = 0;
    if((block1 contains "0") && !(block2 contains "0")) return 1;
    if(!(block1 contains "0") && (block2 contains "0")) return -1;
    if((block1 contains "0") && (block2 contains "0")) return 0;
    var chars = block1.toCharArray;
    for(i <- 0 until chars.length){
      v1 = v1 + chars(i).asDigit;
    }
    chars = block2.toCharArray;
    for(i <- 0 until chars.length)
    {
      v2 = v2 + chars(i).asDigit;
    }
    if(v1 < v2) return 1;
    if(v2 < v1) return -1;
    return 0;
  }
}
My result is:
9753697817977 <- Digit sequence
8821658160 <- Multiplication
Using Project Euler to challenge yourself and learn a new language is a pretty good idea, but just coming up with the correct answer doesn't mean that you're using the language well.
It's obvious from your code that you have yet to learn idiomatic Scala. Would it surprise you to learn that the desired product can be calculated from the 1000-character input string with just one line of code? That one line of code will:
turn each input character into a digit (Int)
slide a fixed size (13-digit) window over all the digits
multiply all the digits within each window
select the maximum from all those products
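The four steps listed above could look roughly like this. It is a sketch of one way to write such a line, not necessarily the one the answer had in mind; `maxWindowProduct` is a hypothetical name, and the input is assumed to contain only digit characters and to be at least `window` characters long.

```scala
// Each step of the pipeline maps to one item in the list above.
def maxWindowProduct(digits: String, window: Int): Long =
  digits
    .map(_.asDigit.toLong) // turn each character into a digit (Long avoids overflow)
    .sliding(window)       // slide a fixed-size window over the digits
    .map(_.product)        // multiply the digits within each window
    .max                   // select the maximum of those products
```

Note that this compares products of the windows, whereas the `compareBlocks` function in the question compares digit sums.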
There's a handy little web site that has solved Euler challenges in Scala. I recommend that every time you solve an Euler problem, compare your code with what's found on that site. (But be careful. It's too easy to look ahead at solutions that you haven't tackled yet.)