How do you remove empty rows and add descriptive columns? - expss

A follow-up question to this one
Once I introduce some more complexity in my table, I'm seeing empty rows where no group-subgroup combination exists. Could those be removed?
I'd also like to add a "descriptive" column that does not fit into the cell-row-column tabulation. Could I do that?
Here's an example:
animals_2 <- data.table(
family = rep(c(1, 1, 1, 1, 1, 1, 2, 2 ,2 ,3 ,3 ,3), 2),
animal = rep(c(1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4), 2),
name = rep(c(rep("fred", 3), rep("tod", 3), rep("timmy", 3), rep("johnno", 3)), 2),
age = rep(c(1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3), 2),
field = c(rep(1, 12), rep(2, 12)),
value = c(c(25, 45, 75, 10, 25, 50, 10, 15, 25, 5, 15, 20), c(5, 15, 30, 3, 9, 13, 2, 5, 9, 1, 2, 3.5))
)
animals_2 <- expss::apply_labels(
animals_2,
family = "|",
family = c("mammal" = 1, "reptilia" = 2, "amphibia" = 3),
animal = "|",
animal = c("dog" = 1, "cat" = 2, "turtle" = 3, "frog" = 4),
name = "|",
age = "age",
age = c("baby" = 1, "young" = 2, "mature" = 3),
field = "|",
field = c("height" = 1, "weight" = 2),
value = "|"
)
expss::expss_output_viewer()
animals_2 %>%
expss::tab_cells(value) %>%
expss::tab_cols(age %nest% field) %>%
expss::tab_rows(family %nest% animal) %>%
expss::tab_stat_sum(label = "") %>%
expss::tab_pivot()
You will see the column "name" doesn't feature in the table currently. I would just like to put it next to each animal and before the Age/Field summaries. Is this possible?
Thanks in advance!

As for empty categories - there is a special function for that - 'drop_empty_rows':
library(expss)
animals_2 <- data.table(
family = rep(c(1, 1, 1, 1, 1, 1, 2, 2 ,2 ,3 ,3 ,3), 2),
animal = rep(c(1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4), 2),
name = rep(c(rep("fred", 3), rep("tod", 3), rep("timmy", 3), rep("johnno", 3)), 2),
age = rep(c(1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3), 2),
field = c(rep(1, 12), rep(2, 12)),
value = c(c(25, 45, 75, 10, 25, 50, 10, 15, 25, 5, 15, 20), c(5, 15, 30, 3, 9, 13, 2, 5, 9, 1, 2, 3.5))
)
animals_2 <- expss::apply_labels(
animals_2,
family = "|",
family = c("mammal" = 1, "reptilia" = 2, "amphibia" = 3),
animal = "|",
animal = c("dog" = 1, "cat" = 2, "turtle" = 3, "frog" = 4),
name = "|",
age = "age",
age = c("baby" = 1, "young" = 2, "mature" = 3),
field = "|",
field = c("height" = 1, "weight" = 2),
value = "|"
)
expss::expss_output_viewer()
animals_2 %>%
expss::tab_cells(value) %>%
expss::tab_cols(age %nest% field) %>%
expss::tab_rows(family %nest% animal %nest% name) %>%
expss::tab_stat_sum(label = "") %>%
expss::tab_pivot() %>%
drop_empty_rows()
As for the "name" column, you can add the name to the value label with a pipe separator, e.g. 'dog|fred', or, as in the example above, via %nest%.
UPDATE:
If you need it as a column with its own heading, it is better to compute the names as a statistic:
animals_2 %>%
expss::tab_rows(family %nest% animal) %>%
# here we create separate column for name
expss::tab_cols(total(label = "name")) %>%
expss::tab_cells(name) %>%
expss::tab_stat_fun(unique) %>%
# end of creation
expss::tab_cols(age %nest% field) %>%
expss::tab_cells(value) %>%
expss::tab_stat_sum(label = "") %>%
expss::tab_pivot(stat_position = "outside_columns") %>%
drop_empty_rows()

How do you remove the cell label from your table?

I'm trying to leverage expss to automate some reporting currently done in Excel via R. I'm generally needing to summarise a lot of values across some grouping (rows) relative to some fields (columns). I'm finding it difficult to get rid of the cell description.
Here's an example:
animals <- data.table(
animal = c(1, 1, 2, 2, 3, 3, 4, 4),
standing = c(1, 2, 1, 2, 1, 2, 1 ,2),
height = c(50, 70, 75, 105, 25, 55, 10, 20)
)
animals <- expss::apply_labels(
animals,
animal = "animal",
animal = c("cat" = 1, "dog" = 2, "turtle" = 3, "rat" = 4),
standing = "standing",
standing = c("no" = 1, "yes" = 2),
height = "height"
)
expss::expss_output_viewer()
animals %>%
expss::tab_cells(height) %>%
expss::tab_cols(animal) %>%
expss::tab_rows(standing) %>%
expss::tab_stat_sum(label = "") %>%
expss::tab_pivot()
You will see that "height" is printed as a label, how do I get rid of it please?
Thanks!
Assigning "|" as the label suppresses both the label and the variable name:
library(expss)
animals <- data.table(
animal = c(1, 1, 2, 2, 3, 3, 4, 4),
standing = c(1, 2, 1, 2, 1, 2, 1 ,2),
height = c(50, 70, 75, 105, 25, 55, 10, 20)
)
animals <- expss::apply_labels(
animals,
animal = "animal",
animal = c("cat" = 1, "dog" = 2, "turtle" = 3, "rat" = 4),
standing = "standing",
standing = c("no" = 1, "yes" = 2),
height = "|" # to suppress label
)
expss::expss_output_viewer()
animals %>%
expss::tab_cells(height) %>%
expss::tab_cols(animal) %>%
expss::tab_rows(standing) %>%
expss::tab_stat_sum(label = "") %>%
expss::tab_pivot()

Generate Adjacency matrix from a Map

I know this is a lengthy question :) I'm trying to implement Hamiltonian Cycle on a dataset in Scala 2.11, as part of this I'm trying to generate Adjacency matrix from a Map of values.
Explanation:
Keys 0 to 4 are the different cities, so in below "allRoads" Variable
0 -> Set(1, 2) Means city0 is connected to city1 and city2
1 -> Set(0, 2, 3, 4) Means City1 is connected to city0,city2,city3,city4
.
.
I need to generate adj Matrix, for E.g:
I need to generate 1 if the city is connected, or else I've to generate 0, meaning
for: "0 -> Set(1, 2)", I need to generate: Map(0 -> Array(0,1,1,0,0))
input-
var allRoads = Map(0 -> Set(1, 2), 1 -> Set(0, 2, 3, 4), 2 -> Set(0, 1, 3, 4), 3 -> Set(2, 4, 1), 4 -> Set(2, 3, 1))
My Code:
val n: Int = 5
val listOfCities = (0 to n-1).toList
var allRoads = Map(0 -> Set(1, 2), 1 -> Set(0, 2, 3, 4), 2 -> Set(0, 1, 3, 4), 3 -> Set(2, 4, 1), 4 -> Set(2, 3, 1))
var adjmat: Array[Int] = Array()
for( i <- 0 until allRoads.size;j <- listOfCities) {
allRoads.get(i) match {
case Some(elem) => if (elem.contains(j)) adjmat = adjmat:+1 else adjmat = adjmat :+0
case _ => None
}
}
which outputs:
output: Array[Int] = Array(0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 1, 1, 0, 1, 1, 0, 1, 1, 0, 1, 0, 1, 1, 1, 0)
Expected output - Something like this, please suggest if there's something better to generate input to Hamiltonian Cycle
Map(0 -> Array(0, 1, 1, 0, 0),1 -> Array(1, 0, 1, 1, 1),2 -> Array(1, 1, 0, 1, 1),3 -> Array(0, 1, 1, 0, 1),4 -> Array(0, 1, 1, 1, 0))
Not sure how to store the above output as a Map or a Plain 2D Array.
Try
allRoads.map { case (city, roads) =>
  // 1 if `other` is directly connected to `city`, 0 otherwise
  city -> listOfCities.map(other => if (roads.contains(other)) 1 else 0)
}
which outputs
Map(0 -> List(0, 1, 1, 0, 0), 1 -> List(1, 0, 1, 1, 1), 2 -> List(1, 1, 0, 1, 1), 3 -> List(0, 1, 1, 0, 1), 4 -> List(0, 1, 1, 1, 0))
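If a plain 2D array is preferred over a Map, the same idea can be sketched in a few lines of Python. This is only an illustration of the row-by-row construction; the helper name adjacency_matrix is hypothetical, and it assumes the cities are numbered 0 to n-1 as in the question.

```python
def adjacency_matrix(roads, n):
    """Build an n x n 0/1 matrix from a dict {city: set of connected cities}."""
    # Row i, column j is 1 when city j appears in the road set of city i.
    return [[1 if j in roads.get(i, set()) else 0 for j in range(n)]
            for i in range(n)]


all_roads = {0: {1, 2}, 1: {0, 2, 3, 4}, 2: {0, 1, 3, 4}, 3: {2, 4, 1}, 4: {2, 3, 1}}
matrix = adjacency_matrix(all_roads, 5)
for row in matrix:
    print(row)
```

Each row of matrix corresponds to one of the arrays in the expected output of the question, so matrix[i][j] can be fed directly to a Hamiltonian cycle search.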

Get the list of Triad nodes , who fall under the category of individual Triadic Census

By executing Networkx triadic_census Algorithm, I'm able to get the dictionary of the number of nodes falling on each type of triadic census
triad_census_social=nx.triadic_census(social_graph.to_directed())
Now I'd like to get the list of triads that match a given census code such as "201" or "120U", or any other of the 16 existing types.
How can I get those node lists under a census count?
There is no function in networkx that allows you to do this, so you have to implement it manually. I modified the networkx.algorithms.triads code so that it returns the triads themselves, not their count:
import networkx as nx
G = nx.DiGraph()
G.add_nodes_from([1,2,3,4,5])
G.add_edges_from([(1,2),(2,3),(2,4),(4,5)])
triad_census_social=nx.triadic_census(G)
# '003': 2,
# '012': 4,
# '021C': 3,
# '021D': 1,
# another: 0
#: The integer codes representing each type of triad.
#:
#: Triads that are the same up to symmetry have the same code.
TRICODES = (1, 2, 2, 3, 2, 4, 6, 8, 2, 6, 5, 7, 3, 8, 7, 11, 2, 6, 4, 8, 5, 9,
9, 13, 6, 10, 9, 14, 7, 14, 12, 15, 2, 5, 6, 7, 6, 9, 10, 14, 4, 9,
9, 12, 8, 13, 14, 15, 3, 7, 8, 11, 7, 12, 14, 15, 8, 14, 13, 15,
11, 15, 15, 16)
#: The names of each type of triad. The order of the elements is
#: important: it corresponds to the tricodes given in :data:`TRICODES`.
TRIAD_NAMES = ('003', '012', '102', '021D', '021U', '021C', '111D', '111U',
'030T', '030C', '201', '120D', '120U', '120C', '210', '300')
#: A dictionary mapping triad code to triad name.
TRICODE_TO_NAME = {i: TRIAD_NAMES[code - 1] for i, code in enumerate(TRICODES)}
def _tricode(G, v, u, w):
    """Returns the integer code of the given triad.
    This is some fancy magic that comes from Batagelj and Mrvar's paper. It
    treats each edge joining a pair of `v`, `u`, and `w` as a bit in
    the binary representation of an integer.
    """
    combos = ((v, u, 1), (u, v, 2), (v, w, 4), (w, v, 8), (u, w, 16),
              (w, u, 32))
    return sum(x for u, v, x in combos if v in G[u])
census = {name: set([]) for name in TRIAD_NAMES}
n = len(G)
m = {v: i for i, v in enumerate(G)}
for v in G:
    vnbrs = set(G.pred[v]) | set(G.succ[v])
    for u in vnbrs:
        if m[u] <= m[v]:
            continue
        neighbors = (vnbrs | set(G.succ[u]) | set(G.pred[u])) - {u, v}
        # Calculate dyadic triads instead of counting them.
        for w in neighbors:
            if v in G[u] and u in G[v]:
                census['102'].add(tuple(sorted([u, v, w])))
            else:
                census['012'].add(tuple(sorted([u, v, w])))
        # Count connected triads.
        for w in neighbors:
            if m[u] < m[w] or (m[v] < m[w] < m[u] and
                               v not in G.pred[w] and
                               v not in G.succ[w]):
                code = _tricode(G, v, u, w)
                census[TRICODE_TO_NAME[code]].add(tuple(sorted([u, v, w])))
# null triads, I implemented them manually because the original algorithm computes
# them as _number_of_all_possible_triads_ - _number_of_all_found_triads_
for v in G:
    vnbrs = set(G.pred[v]) | set(G.succ[v])
    not_vnbrs = set(G.nodes()) - vnbrs
    for u in not_vnbrs:
        unbrs = set(G.pred[u]) | set(G.succ[u])
        not_unbrs = set(G.nodes()) - unbrs
        for w in not_unbrs:
            wnbrs = set(G.pred[w]) | set(G.succ[w])
            if v not in wnbrs and len(set([u, v, w])) == 3:
                census['003'].add(tuple(sorted([u, v, w])))
# '003': {(1, 3, 4), (1, 3, 5)},
# '012': {(1, 2, 3), (1, 2, 4), (2, 3, 4), (2, 4, 5)},
# '021C': {(1, 2, 3), (1, 2, 4), (2, 4, 5)},
# '021D': {(2, 3, 4)},
# another: empty
Building on vurmux's answer (by fixing the '102' and '012' triads):
import networkx as nx
import itertools
def _tricode(G, v, u, w):
    """Returns the integer code of the given triad.
    This is some fancy magic that comes from Batagelj and Mrvar's paper. It
    treats each edge joining a pair of `v`, `u`, and `w` as a bit in
    the binary representation of an integer.
    """
    combos = ((v, u, 1), (u, v, 2), (v, w, 4), (w, v, 8), (u, w, 16),
              (w, u, 32))
    return sum(x for u, v, x in combos if v in G[u])
G = nx.DiGraph()
G.add_nodes_from([1, 2, 3, 4, 5])
G.add_edges_from([(1, 2), (2, 3), (2, 4), (4, 5)])
#: The integer codes representing each type of triad.
#: Triads that are the same up to symmetry have the same code.
TRICODES = (1, 2, 2, 3, 2, 4, 6, 8, 2, 6, 5, 7, 3, 8, 7, 11, 2, 6, 4, 8, 5, 9,
9, 13, 6, 10, 9, 14, 7, 14, 12, 15, 2, 5, 6, 7, 6, 9, 10, 14, 4, 9,
9, 12, 8, 13, 14, 15, 3, 7, 8, 11, 7, 12, 14, 15, 8, 14, 13, 15,
11, 15, 15, 16)
#: The names of each type of triad. The order of the elements is
#: important: it corresponds to the tricodes given in :data:`TRICODES`.
TRIAD_NAMES = ('003', '012', '102', '021D', '021U', '021C', '111D', '111U',
'030T', '030C', '201', '120D', '120U', '120C', '210', '300')
#: A dictionary mapping triad code to triad name.
TRICODE_TO_NAME = {i: TRIAD_NAMES[code - 1] for i, code in enumerate(TRICODES)}
triad_nodes = {name: set([]) for name in TRIAD_NAMES}
m = {v: i for i, v in enumerate(G)}
for v in G:
    vnbrs = set(G.pred[v]) | set(G.succ[v])
    for u in vnbrs:
        if m[u] > m[v]:
            unbrs = set(G.pred[u]) | set(G.succ[u])
            neighbors = (vnbrs | unbrs) - {u, v}
            not_neighbors = set(G.nodes()) - neighbors - {u, v}
            # Find dyadic triads
            for w in not_neighbors:
                if v in G[u] and u in G[v]:
                    triad_nodes['102'].add(tuple(sorted([u, v, w])))
                else:
                    triad_nodes['012'].add(tuple(sorted([u, v, w])))
            for w in neighbors:
                if m[u] < m[w] or (m[v] < m[w] < m[u] and
                                   v not in G.pred[w] and
                                   v not in G.succ[w]):
                    code = _tricode(G, v, u, w)
                    triad_nodes[TRICODE_TO_NAME[code]].add(
                        tuple(sorted([u, v, w])))
# find null triads
all_tuples = set()
for s in triad_nodes.values():
    all_tuples = all_tuples.union(s)
triad_nodes['003'] = set(itertools.combinations(G.nodes(), 3)).difference(all_tuples)
Result
# print(triad_nodes)
# {'003': {(1, 3, 4), (1, 3, 5)},
# '012': {(1, 2, 5), (1, 4, 5), (2, 3, 5), (3, 4, 5)},
# '102': set(),
# '021D': {(2, 3, 4)},
# '021U': set(),
# '021C': {(1, 2, 3), (1, 2, 4), (2, 4, 5)},
# '111D': set(),
# '111U': set(),
# '030T': set(),
# '030C': set(),
# '201': set(),
# '120D': set(),
# '120U': set(),
# '120C': set(),
# '210': set(),
# '300': set()}
In agreement with nx.triadic_census
# print(nx.triadic_census(G))
# {'003': 2,
# '012': 4,
# '102': 0,
# '021D': 1,
# '021U': 0,
# '021C': 3,
# '111D': 0,
# '111U': 0,
# '030T': 0,
# '030C': 0,
# '201': 0,
# '120D': 0,
# '120U': 0,
# '120C': 0,
# '210': 0,
# '300': 0}
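As a cross-check that does not depend on networkx, every 3-node combination can also be classified directly with the same TRICODES table. The sketch below is a brute-force O(n^3) version, so it is only meant for small graphs; the function name classify_triads and the (nodes, edges) input format are assumptions made for this illustration.

```python
from itertools import combinations

# Same lookup tables as in networkx.algorithms.triads.
TRICODES = (1, 2, 2, 3, 2, 4, 6, 8, 2, 6, 5, 7, 3, 8, 7, 11, 2, 6, 4, 8, 5, 9,
            9, 13, 6, 10, 9, 14, 7, 14, 12, 15, 2, 5, 6, 7, 6, 9, 10, 14, 4, 9,
            9, 12, 8, 13, 14, 15, 3, 7, 8, 11, 7, 12, 14, 15, 8, 14, 13, 15,
            11, 15, 15, 16)
TRIAD_NAMES = ('003', '012', '102', '021D', '021U', '021C', '111D', '111U',
               '030T', '030C', '201', '120D', '120U', '120C', '210', '300')


def classify_triads(nodes, edges):
    """Map each triad name to the set of sorted node triples of that type."""
    edges = set(edges)
    result = {name: set() for name in TRIAD_NAMES}
    for v, u, w in combinations(sorted(nodes), 3):
        # One bit per possible directed edge, in the same order as _tricode.
        pairs = ((v, u), (u, v), (v, w), (w, v), (u, w), (w, u))
        code = sum(1 << bit for bit, pair in enumerate(pairs) if pair in edges)
        result[TRIAD_NAMES[TRICODES[code] - 1]].add((v, u, w))
    return result


triads = classify_triads([1, 2, 3, 4, 5], [(1, 2), (2, 3), (2, 4), (4, 5)])
print({name: found for name, found in triads.items() if found})
```

For the example graph this produces the same non-empty sets as the result listed above.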

Finding overlapping DateTime intervals of elements in multiple lists

I have a construct of n lists that are being used to record the beginning and ending times associated with something I want to monitor (say a task). A task can be repeated multiple times (although the same task cannot overlap / run concurrently). Each task has a unique id and its begin / end times are stored in its own list.
I’m trying to find the period of time where all tasks were running at the same time.
So as an example, below I have 3 tasks; taskId 1 happens 7 times, taskId 2 happens twice and taskId 3 happens only once;
import org.joda.time.DateTime
case class CVT(taskId: Int, begin: DateTime, end: DateTime)
val cvt1: CVT = CVT (3, new DateTime(2015, 1, 1, 1, 0), new DateTime(2015, 1, 1, 20,0) )
val cvt2: CVT = CVT (1, new DateTime(2015, 1, 1, 2, 0), new DateTime(2015, 1, 1, 3, 0) )
val cvt3: CVT = CVT (1, new DateTime(2015, 1, 1, 4, 0), new DateTime(2015, 1, 1, 6, 0) )
val cvt4: CVT = CVT (2, new DateTime(2015, 1, 1, 5, 0), new DateTime(2015, 1, 1, 11,0) )
val cvt5: CVT = CVT (1, new DateTime(2015, 1, 1, 7, 0), new DateTime(2015, 1, 1, 8, 0) )
val cvt6: CVT = CVT (1, new DateTime(2015, 1, 1, 9, 0), new DateTime(2015, 1, 1, 10, 0) )
val cvt7: CVT = CVT (1, new DateTime(2015, 1, 1, 12, 0), new DateTime(2015, 1, 1, 14,0) )
val cvt8: CVT = CVT (2, new DateTime(2015, 1, 1, 13, 0), new DateTime(2015, 1, 1, 16,0) )
val cvt9: CVT = CVT (1, new DateTime(2015, 1, 1, 15, 0), new DateTime(2015, 1, 1, 17,0) )
val cvt10: CVT = CVT (1, new DateTime(2015, 1, 1, 18, 0), new DateTime(2015, 1, 1, 19,0) )
val combinedTasks: List[CVT] = List(cvt1, cvt2, cvt3, cvt4, cvt5, cvt6, cvt7, cvt8, cvt9, cvt10).sortBy(_.begin)
The result I’m trying to get is :
CVT(123, DateTime(2015, 1, 1, 5, 0), DateTime(2015, 1, 1, 6, 0) )
CVT(123, DateTime(2015, 1, 1, 7, 0), DateTime(2015, 1, 1, 8, 0) )
CVT(123, DateTime(2015, 1, 1, 9, 0), DateTime(2015, 1, 1, 10, 0) )
CVT(123, DateTime(2015, 1, 1, 13, 0), DateTime(2015, 1, 1, 14, 0) )
CVT(123, DateTime(2015, 1, 1, 15, 0), DateTime(2015, 1, 1, 16, 0) )
Note : I don’t mind what the ‘taskId’ is in the result, I’m just showing ‘123’ to try and show in this example that all three tasks were running between these start and end times.
I’ve looked at trying to use both a recursive fn and also the Joda Interval with the .gap method but can’t seem to find the solution.
Any tips on how I could achieve what I’m trying to do would be great.
Tks
I have a library for sets of non-overlapping intervals at https://github.com/rklaehn/intervalset. It is going to be included in the next version of spire.
Here is how you would use it:
import org.joda.time.DateTime
import spire.algebra.Order
import spire.math.Interval
import spire.math.extras.interval.IntervalSeq
// define an order for DateTimes
implicit val dateTimeOrder = Order.from[DateTime](_ compareTo _)
// create three sets of DateTime intervals
val intervals = Map[Int, IntervalSeq[DateTime]](
1 -> (IntervalSeq.empty |
Interval(new DateTime(2015, 1, 1, 2, 0), new DateTime(2015, 1, 1, 3, 0)) |
Interval(new DateTime(2015, 1, 1, 4, 0), new DateTime(2015, 1, 1, 6, 0)) |
Interval(new DateTime(2015, 1, 1, 7, 0), new DateTime(2015, 1, 1, 8, 0)) |
Interval(new DateTime(2015, 1, 1, 9, 0), new DateTime(2015, 1, 1, 10, 0)) |
Interval(new DateTime(2015, 1, 1, 12, 0), new DateTime(2015, 1, 1, 14, 0)) |
Interval(new DateTime(2015, 1, 1, 15, 0), new DateTime(2015, 1, 1, 17, 0)) |
Interval(new DateTime(2015, 1, 1, 18, 0), new DateTime(2015, 1, 1, 19, 0))),
2 -> (IntervalSeq.empty |
Interval(new DateTime(2015, 1, 1, 5, 0), new DateTime(2015, 1, 1, 11, 0)) |
Interval(new DateTime(2015, 1, 1, 13, 0), new DateTime(2015, 1, 1, 16, 0))),
3 -> (IntervalSeq.empty |
Interval(new DateTime(2015, 1, 1, 1, 0), new DateTime(2015, 1, 1, 20, 0))))
// calculate the intersection of all intervals
val result = intervals.values.foldLeft(IntervalSeq.all[DateTime])(_ & _)
// print the result
for (interval <- result.intervals)
println(interval)
Note that spire intervals are significantly more powerful than what you probably need. They distinguish between open and closed interval bounds, and can handle infinite intervals. But nevertheless the above should be pretty fast.
In addition to Rüdiger's library, which I believe is powerful, fast and extensible, here is a simple implementation using only the built-in collections library.
I redefined your CVT class so that it can carry intersections:
case class CVT[Id](taskIds: Id, begin: DateTime, end: DateTime)
All your individual cvt definitions now change to
val cvtN: CVT[Int] = ???
We will track the events where a task enters and leaves scope within our collection. For that algorithm we define the following ADT:
sealed class Event
case object Enter extends Event
case object Leave extends Event
And corresponding ordering instances:
implicit val eventOrdering = Ordering.fromLessThan[Event](_ == Leave && _ == Enter)
implicit val dateTimeOrdering = Ordering.fromLessThan[DateTime](_ isBefore _)
Now we can write the following:
val combinedTasks: List[CVT[Set[Int]]] = List(cvt1, cvt2, cvt3, cvt4, cvt5, cvt6, cvt7, cvt8, cvt9, cvt10)
  .flatMap { case CVT(id, begin, end) => List((id, begin, Enter), (id, end, Leave)) }
  .sortBy { case (id, time, evt) => (time, evt: Event) }
  .foldLeft((Set.empty[Int], List.empty[CVT[Set[Int]]], DateTime.now())) { (state, event) =>
    val (active, accum, last) = state
    val (id, time, evt) = event
    evt match {
      case Enter => (active + id, accum, time)
      case Leave => (active - id, CVT(active, last, time) :: accum, time)
    }
  }._2.filter(_.taskIds == Set(1,2,3)).reverse
The most important part here is the foldLeft. After ordering the events so that Leave events come before Enter events at the same instant, we simply carry the set of currently running jobs from event to event, adding to this set when a new job enters and, whenever a job leaves, capturing an interval that starts at the last entering time.
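For comparison, the same sweep over enter/leave events can be sketched in Python with only the standard library. The function name common_intervals and the (task_id, begin, end) tuple layout are assumptions made for this illustration; it also assumes, as stated in the question, that a task never overlaps with itself.

```python
from datetime import datetime

LEAVE, ENTER = 0, 1  # LEAVE sorts first, so touching intervals do not overlap


def common_intervals(tasks):
    """Return the (begin, end) periods during which every task id is running."""
    all_ids = {task_id for task_id, _, _ in tasks}
    events = []
    for task_id, begin, end in tasks:
        events.append((begin, ENTER, task_id))
        events.append((end, LEAVE, task_id))
    events.sort()

    active, start, result = set(), None, []
    for time, kind, task_id in events:
        if kind == ENTER:
            active.add(task_id)
            if active == all_ids:
                start = time  # all tasks are running from this instant
        else:
            if active == all_ids and start < time:
                result.append((start, time))
            active.discard(task_id)
            start = None
    return result


def at(hour):
    return datetime(2015, 1, 1, hour, 0)


tasks = [(3, at(1), at(20)), (1, at(2), at(3)), (1, at(4), at(6)),
         (2, at(5), at(11)), (1, at(7), at(8)), (1, at(9), at(10)),
         (1, at(12), at(14)), (2, at(13), at(16)), (1, at(15), at(17)),
         (1, at(18), at(19))]
overlaps = common_intervals(tasks)
for begin, end in overlaps:
    print(begin, end)
```

With the data from the question this yields the five periods 05:00-06:00, 07:00-08:00, 09:00-10:00, 13:00-14:00 and 15:00-16:00.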

Idiomatic scala solution to combining sequences

Imagine a function combineSequences: (seqs: Set[Seq[Int]])Set[Seq[Int]] that combines sequences when the last item of first sequence matches the first item of the second sequence. For example, if you have the following sequences:
(1, 2)
(2, 3)
(5, 6, 7, 8)
(8, 9, 10)
(3, 4, 10)
The result of combineSequences would be:
(5, 6, 7, 8, 8, 9, 10)
(1, 2, 2, 3, 3, 4, 10)
Because sequences 1, 2, and 5 combine together. If multiple sequences could combine to create a different result, the decision is arbitrary. For example, if we have the sequences:
(1, 2)
(2, 3)
(2, 4)
There are two correct answers. Either:
(1, 2, 2, 3)
(2, 4)
Or:
(1, 2, 2, 4)
(2, 3)
I can only think of a very imperative and fairly opaque implementation. I'm wondering if anyone has a solution that would be more idiomatic scala. I've run into related problems a few times now.
Certainly not the most optimized solution but I've gone for readability.
def combineSequences[T]( seqs: Set[Seq[T]] ): Set[Seq[T]] = {
  if ( seqs.isEmpty ) seqs
  else {
    val (seq1, otherSeqs) = (seqs.head, seqs.tail)
    otherSeqs.find(_.headOption == seq1.lastOption) match {
      case Some( seq2 ) => combineSequences( otherSeqs - seq2 + (seq1 ++ seq2) )
      case None =>
        otherSeqs.find(_.lastOption == seq1.headOption) match {
          case Some( seq2 ) => combineSequences( otherSeqs - seq2 + (seq2 ++ seq1) )
          case None => combineSequences( otherSeqs ) + seq1
        }
    }
  }
}
REPL test:
scala> val seqs = Set(Seq(1, 2), Seq(2, 3), Seq(5, 6, 7, 8), Seq(8, 9, 10), Seq(3, 4, 10))
seqs: scala.collection.immutable.Set[Seq[Int]] = Set(List(1, 2), List(2, 3), List(8, 9, 10), List(5, 6, 7, 8), List(3, 4, 10))
scala> combineSequences( seqs )
res10: Set[Seq[Int]] = Set(List(1, 2, 2, 3, 3, 4, 10), List(5, 6, 7, 8, 8, 9, 10))
scala> val seqs = Set(Seq(1, 2), Seq(2, 3, 100), Seq(5, 6, 7, 8), Seq(8, 9, 10), Seq(100, 4, 10))
seqs: scala.collection.immutable.Set[Seq[Int]] = Set(List(100, 4, 10), List(1, 2), List(8, 9, 10), List(2, 3, 100), List(5, 6, 7, 8))
scala> combineSequences( seqs )
res11: Set[Seq[Int]] = Set(List(5, 6, 7, 8, 8, 9, 10), List(1, 2, 2, 3, 100, 100, 4, 10))
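The recursion above can also be read as a simple worklist loop: take one sequence, look for a partner that continues it or precedes it, merge, and repeat until nothing matches. Here is a rough Python sketch of that reading, with sequences represented as tuples; the name combine_sequences is hypothetical and the sequences are assumed to be non-empty.

```python
def combine_sequences(seqs):
    """Merge tuples whose last item equals the first item of another tuple."""
    pending = set(seqs)
    done = set()
    while pending:
        seq1 = pending.pop()
        # A sequence that starts where seq1 ends?
        successor = next((s for s in pending if s[:1] == seq1[-1:]), None)
        if successor is not None:
            pending.remove(successor)
            pending.add(seq1 + successor)
            continue
        # A sequence that ends where seq1 starts?
        predecessor = next((s for s in pending if s[-1:] == seq1[:1]), None)
        if predecessor is not None:
            pending.remove(predecessor)
            pending.add(predecessor + seq1)
            continue
        # No partner left, so seq1 is final.
        done.add(seq1)
    return done


combined = combine_sequences({(1, 2), (2, 3), (5, 6, 7, 8), (8, 9, 10), (3, 4, 10)})
print(combined)
```

As in the Scala version, when several partners are possible the choice is arbitrary, because it depends on the iteration order of the set.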