What is the best way to translate the generation of a multidimensional cell array from Matlab to Clojure

I'm halfway through figuring out a solution to my question, but I have a feeling that it won't be very efficient. I've got a 2 dimensional cell structure of variable length arrays that is constructed in a very non-functional way in Matlab that I would like to convert to Clojure. Here is an example of what I'm trying to do:
pre = cell(N,1);
aux = cell(N,1);
for i=1:Ne
for j=1:D
for k=1:length(delays{i,j})
pre{post(i, delays{i, j}(k))}(end+1) = N*(delays{i, j}(k)-1)+i;
aux{post(i, delays{i, j}(k))}(end+1) = N*(D-1-j)+i; % takes into account delay
end;
end;
end;
My current plan for implementation is to use 3 loops where the first is initialized with a vector of N vectors of an empty vector. Each subloop is initialized by the previous loop. I define a separate function that takes the overall vector and the subindices and value and returns the vector with an updated subvector.
There's got to be a smarter way of doing this than using 3 loop/recurs. Possibly some reduce function that simplifies the syntax by using an accumulator.

I'm not 100% sure I understand what your code is doing (I don't know Matlab) but this might be one approach for building a multi-dimensional vector:
(defn conj-in
"Based on clojure.core/assoc-in, but with vectors instead of maps."
[coll [k & ks] v]
(if ks
(assoc coll k (conj-in (get coll k []) ks v))
(assoc coll k v)))
(defn foo []
(let [w 5, h 4, d 3
indices (for [i (range w)
j (range h)
k (range d)]
[i j k])]
(reduce (fn [acc [i j k :as index]]
(conj-in acc index
;; do real work here
(str i j k)))
[] indices)))
user> (pprint (foo))
[[["000" "001" "002"]
["010" "011" "012"]
["020" "021" "022"]
["030" "031" "032"]]
[["100" "101" "102"]
["110" "111" "112"]
["120" "121" "122"]
["130" "131" "132"]]
[["200" "201" "202"]
["210" "211" "212"]
["220" "221" "222"]
["230" "231" "232"]]
[["300" "301" "302"]
["310" "311" "312"]
["320" "321" "322"]
["330" "331" "332"]]
[["400" "401" "402"]
["410" "411" "412"]
["420" "421" "422"]
["430" "431" "432"]]]
This only works if indices go in the proper order (increasing), because you can't conj or assoc onto a vector anywhere other than one-past-the-end.
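The Matlab loops in the question append to pre{target} for targets that arrive in no particular order, so the one-past-the-end restriction matters there. Here is a minimal sketch of the same reduce-with-accumulator idea, written in Python rather than Clojure and not taken from the answer above. It assumes the number of outer slots is known up front; the names append_at and build are made up for illustration.

```python
from functools import reduce


def append_at(acc, item):
    """Return a new tuple of tuples with `value` appended to slot `index`.

    Nothing is mutated: the accumulator is rebuilt at every step, which
    mirrors assoc/conj on persistent vectors (without their structural sharing).
    """
    index, value = item
    return acc[:index] + (acc[index] + (value,),) + acc[index + 1:]


def build(n, items):
    """Fold (index, value) pairs into n initially empty slots."""
    empty = tuple(() for _ in range(n))
    return reduce(append_at, items, empty)


# Targets arrive out of order, as post(i, ...) does in the Matlab code.
result = build(3, [(2, "a"), (0, "b"), (2, "c")])
print(result)  # (('b',), (), ('a', 'c'))
```

Starting the fold from n empty slots, rather than from an empty container, is what removes the ordering requirement; in Clojure that would correspond to reducing into a vector of N empty vectors.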
I also think it would be acceptable to use make-array and build your array via aset. This is why Clojure offers access to Java mutable arrays; some algorithms are much more elegant that way, and sometimes you need them for performance. You can always dump the data into Clojure vectors after you're done if you want to avoid leaking side-effects.
(I don't know which of this or the other version performs better.)
(defn bar []
(let [w 5, h 4, d 3
arr (make-array String w h d)]
(doseq [i (range w)
j (range h)
k (range d)]
(aset arr i j k (str i j k)))
(vec (map #(vec (map vec %)) arr)))) ;yikes?

Take a look at the Incanter project, which provides routines for working with data sets, among other things.


equivalent of assoc-in(clojure) in scala

I am trying to find an equivalent of assoc-in (clojure) in scala. I am trying to convert
(defn- organiseDataByTradeId [data]
(reduce #(let [a (assoc-in %1
[(%2 "internaltradeid") (read-string (%2 "paramseqnum")) "levelcols"]
(reduce (fn [m k](assoc m k (get %2 k)))
{}
(string/split xmlLevelAttributesStr #",")))
b (assoc-in a
[(%2 "internaltradeid") (read-string (%2 "paramseqnum")) "subLevelCols" (read-string (%2 "cashflowseqnum"))]
(reduce (fn [m k] (assoc m k (get %2 k)))
{}
(string/split xmlSubLevelAttributesStr #","))
)]
b)
{}
data))
to scala.
Have tried this :
def organiseDataByTradeId(data: List[Map[String, String]]) = {
data.map { entry => Map(entry("internaltradeid") -> Map(entry("paramseqnum").toInt -> Map("levelcols" -> (xmlLevelAttributesStr.split(",")).map{key=> (key,entry(key))}.toMap,
"subLevelCols" -> Map(entry("cashflowseqnum").asInstanceOf[String].toInt -> (xmlSubLevelAttributesStr.split(",")).map{key=> (key,entry(key))}.toMap)))) }
}
Not sure how to merge the list of maps I got without overwriting.
Here data, a List[Map[String,String]], basically describes a table. Each entry is a row. Column names are the keys of the maps and the cell contents are the values. xmlLevelAttributesStr and xmlSubLevelAttributesStr are two Strings in which column names are separated by commas.
I am fairly new to Scala. I converted each row (Map[String,String]) to a Scala Map and am now not sure how to merge them so that previous data is not overwritten and the result behaves exactly like the Clojure code. Also, I am not allowed to use external libraries such as scalaz.
This Clojure code is not a good pattern to copy: it has a lot of duplication, and little explanation of what it is doing. I would write it more like this:
(defn- organiseDataByTradeId [data]
(let [level-reader (fn [attr-list]
(let [levels (string/split attr-list #",")]
(fn [item]
(into {} (for [level levels]
[level (get item level)])))))
attr-levels (level-reader xmlLevelAttributesStr)
sub-levels (level-reader xmlSubLevelAttributesStr)]
(reduce (fn [acc item]
(update-in acc [(item "internaltradeid"),
(read-string (item "paramseqnum"))]
(fn [trade]
(-> trade
(assoc "levelcols" (attr-levels item))
(assoc-in ["subLevelCols", (read-string (item "cashflowseqnum"))]
(sub-levels item))))))
{}, data)))
It's more lines of code than your original, but I've taken the opportunity to name a number of useful concepts and extract the repetition into a local function so that it's more self-explanatory.
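For readers who don't read Clojure, here is a rough Python sketch of what that reduce with update-in does: it walks the rows once and merges each into a nested map without discarding earlier cash flows. It is only an illustration. It assumes the two attribute strings have already been split into lists (level_cols and sub_level_cols are made-up parameter names), and it mutates a local dict instead of threading an immutable accumulator.

```python
def organise_by_trade_id(rows, level_cols, sub_level_cols):
    """Group rows as {tradeid: {paramseqnum: {"levelcols": ..., "subLevelCols": ...}}}."""
    result = {}
    for row in rows:
        # setdefault plays the role of update-in: reuse the nested map if it
        # already exists, otherwise create an empty one first.
        trade = result.setdefault(row["internaltradeid"], {})
        param = trade.setdefault(int(row["paramseqnum"]), {})
        # Like assoc, a later row replaces "levelcols" for the same parameter.
        param["levelcols"] = {k: row.get(k) for k in level_cols}
        # Like assoc-in, each cash flow gets its own entry, so nothing is lost.
        subs = param.setdefault("subLevelCols", {})
        subs[int(row["cashflowseqnum"])] = {k: row.get(k) for k in sub_level_cols}
    return result


rows = [
    {"internaltradeid": "T1", "paramseqnum": "1", "cashflowseqnum": "1", "a": "x", "b": "y"},
    {"internaltradeid": "T1", "paramseqnum": "1", "cashflowseqnum": "2", "a": "x2", "b": "y2"},
]
print(organise_by_trade_id(rows, ["a"], ["b"]))
```

The second row shares a trade id and parameter number with the first, yet both cash flows survive under "subLevelCols".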
It's even easier if you know there will be no duplication of internaltradeid: you can simply generate a number of independent maps and merge them together:
(defn- organiseDataByTradeId [data]
(let [level-reader (fn [attr-list]
(let [levels (string/split attr-list #",")]
(fn [item]
(into {} (for [level levels]
[level (get item level)])))))
attr-levels (level-reader xmlLevelAttributesStr)
sub-levels (level-reader xmlSubLevelAttributesStr)]
(apply merge (for [item data]
{(item "internaltradeid")
{(read-string (item "paramseqnum"))
{"levelcols" (attr-levels item),
"subLevelCols" {(read-string (item "cashflowseqnum")) (sub-levels item)}}}}))))
But really, neither of these approaches will work well in Scala, because Scala has a different data modeling philosophy than Clojure does. Clojure encourages loosely-defined heterogeneous maps like this, where Scala would prefer that your maps be homogeneous. When your data mixes multiple types, Scala suggests you define a class (or perhaps a case class - I'm no Scala expert) and then create instances of that class.
So here you'd want a Map[String, Map[Int, TradeInfo]], where TradeInfo is a class with two fields, levelcols : List[Attribute], and subLevelCols as some sort of pair (or perhaps a single-element map) containing a cashflowseqnum and another List[Attribute].
Once you've modeled your data in the Scala way, you'll be quite far away from using anything that looks like assoc-in because your data won't be a single giant map, so the question won't arise.

Pass a data structure in to a macro for filling in

I'm trying to solve a problem: I need to create a map from passed-in values, but while the symbol names for the values are consistent, the keys they map to are not. For instance: I might be passed a value that is a user ID. In the code, I can always use the symbol user-id -- but depending on other factors, I might need to make a map {"userId" user-id} or {"user_id" user-id} or {:user-id user-id} or -- well, you get the picture.
I can write a macro that gets me part-way there:
(defmacro user1 [user-id] `{"userId" ~user-id})
(defmacro user2 [user-id] `{"user_id" ~user-id})
But what I'd much rather do is define a set of maps, then combine them with a given set of symbols:
(def user-id-map-1 `{"userId" `user-id}
(defn combiner [m user-id] m) ;; <-- Around here, a miracle occurs.
I can't figure out how to get this evaluation to occur. It seems like I should be able to make a map containing un-evaluated symbols, then look up those symbols in the lexical scope of a function or macro that binds those symbols as locals -- but how?
Instead of standardizing your symbolic names, use maps with standard keyword keys. You don't need to go near macros, and you can turn your maps into records if need be without much trouble.
What you know as
(def user1 {:id 3124, :surname "Adabolo", :forenames ["Julia" "Frances"]})
... can be transformed by mapping the keys with whatever function you choose:
(defn map-keys [keymap m]
(zipmap (map keymap (keys m)) (vals m)))
For example,
(map-keys name user1)
;{"id" 3124, "surname" "Adabolo", "forenames" ["Julia" "Frances"]}
or
(map-keys {:id :user-id, :surname :family-name} user1)
;{:user-id 3124, :family-name "Adabolo", nil ["Julia" "Frances"]}
If you want rid of the nil entry, wrap the expression in (dissoc ... nil):
(defn map-keys [keymap m]
(dissoc
(zipmap (map keymap (keys m)) (vals m))
nil))
Then
(map-keys {:id :user-id, :surname :family-name} user1)
;{:user-id 3124, :family-name "Adabolo"}
I see from Michał Marczyk's answer, which has priority, that the above essentially rewrites clojure.set/rename-keys, which, however ...
leaves missing keys untouched:
For example,
(clojure.set/rename-keys user1 {:id :user-id, :surname :family-name})
;{:user-id 3124, :forenames ["Julia" "Frances"], :family-name "Adabolo"}
doesn't work with normal functions:
For example,
(clojure.set/rename-keys user1 name)
;IllegalArgumentException Don't know how to create ISeq from: clojure.core$name ...
If you forgo the use of false and nil as keys, you can leave missing keys untouched and still use normal functions:
(defn map-keys [keymap m]
(zipmap (map #(or (keymap %) %) (keys m)) (vals m)))
Then
(map-keys {:id :user-id, :surname :family-name} user1)
;{:user-id 3124, :family-name "Adabolo", :forenames ["Julia" "Frances"]}
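The same key-renaming idea carries over to other languages with plain dictionaries. As a hedged illustration only (Python, with string keys standing in for keywords; map_keys here is a hypothetical helper, not a library function), the renaming can accept either a translation dict or an ordinary function:

```python
def map_keys(keymap, m):
    """Return a copy of m with its keys renamed.

    keymap is either a function of one key or a dict of old -> new names.
    With a dict, keys it does not mention are kept unchanged.
    """
    if callable(keymap):
        rename = keymap
    else:
        rename = lambda k: keymap.get(k, k)
    return {rename(k): v for k, v in m.items()}


user1 = {"id": 3124, "surname": "Adabolo", "forenames": ["Julia", "Frances"]}

renamed = map_keys({"id": "user_id", "surname": "family_name"}, user1)
print(renamed)
# {'user_id': 3124, 'family_name': 'Adabolo', 'forenames': ['Julia', 'Frances']}

upper = map_keys(str.upper, user1)
print(upper)
# {'ID': 3124, 'SURNAME': 'Adabolo', 'FORENAMES': ['Julia', 'Frances']}
```

Because the fallback uses a default lookup rather than an `or`, unmentioned keys are preserved without restricting which values may be used as keys.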
How about putting your passed-in values in a map keyed by keywords forged from the formal parameter names:
(defmacro zipfn [map-name arglist & body]
`(fn ~arglist
(let [~map-name (zipmap ~(mapv keyword arglist) ~arglist)]
~@body)))
Example of use:
((zipfn argmap [x y z]
argmap)
1 2 3)
;= {:z 3, :y 2, :x 1}
Better yet, don't use macros:
;; could take varargs for ks (though it would then need another name)
(defn curried-zipmap [ks]
#(zipmap ks %))
((curried-zipmap [:x :y :z]) [1 2 3])
;= {:z 3, :y 2, :x 1}
Then you could rekey this map using clojure.set/rename-keys:
(clojure.set/rename-keys {:z 3, :y 2, :x 1} {:z "z" :y "y" :x "x"})
;= {"x" 1, "z" 3, "y" 2}
The second map here is the "translation map" for the keys; you can construct it by merging maps like {:x "x"} describing how the individual keys ought to be renamed.
For the problem you described I can't find a reason to use macros.
I'd recommend something like
(defn assoc-user-id
[m user-id other-factors]
(assoc m (key-for other-factors) user-id))
Where you implement key-for so that it selects the key based on other-factors.

Working with Isabelle's code generator: Data refinement and higher order functions

This is a follow-up on Isabelle's Code generation: Abstraction lemmas for containers?:
I want to generate code for the_question in the following theory:
theory Scratch imports Main begin
typedef small = "{x::nat. x < 10}" morphisms to_nat small
by (rule exI[where x = 0], simp)
code_datatype small
lemma [code abstype]: "small (to_nat x) = x" by (rule to_nat_inverse)
definition a_pred :: "small ⇒ bool"
where "a_pred = undefined"
definition "smaller j = [small i . i <- [0 ..< to_nat j]]"
definition "the_question j = (∀i ∈ set (smaller j). a_pred j)"
The problem is that the equation for smaller is not suitable for code generation, as it mentions the abstraction function small.
Now according to Andreas’ answer to my last question and the paper on data refinement, the next step is to introduce a type for sets of small numbers, and create a definition for smaller in that type:
typedef small_list = "{l. ∀x∈ set l. (x::nat) < 10}" by (rule exI[where x = "[]"], auto)
code_datatype Abs_small_list
lemma [code abstype]: "Abs_small_list (Rep_small_list x) = x" by (rule Rep_small_list_inverse)
definition "smaller' j = Abs_small_list [ i . i <- [0 ..< to_nat j]]"
lemma smaller'_code[code abstract]: "Rep_small_list (smaller' j) = [ i . i <- [0 ..< to_nat j]]"
unfolding smaller'_def
by (rule Abs_small_list_inverse, cases j, auto elim: less_trans simp add: small_inverse)
Now smaller' is executable. From what I understand I need to redefine operations on small list as operations on small_list:
definition "small_list_all P l = list_all P (map small (Rep_small_list l))"
lemma[code]: "the_question j = small_list_all a_pred (smaller' j)"
unfolding small_list_all_def the_question_def smaller'_code smaller_def Ball_set by simp
I can define a good looking code equation for the_question. But the definition of small_list_all is not suitable for code generation, as it mentions the abstraction morphism small. How do I make small_list_all executable?
(Note that I cannot unfold the code equation of a_pred, as the problem actually occurs in the code equation of the actually recursive a_pred. Also, I’d like to avoid hacks that involve re-checking the invariant at runtime.)
I don't have a good solution to the general problem, but here's an idea that will let you generate code for the_question in this particular case.
First, define a function predecessor :: "small ⇒ small" with an abstract code equation (possibly using lift_definition from λn::nat. n - 1).
Now you can prove a new code equation for smaller whose rhs uses if-then-else, predecessor and normal list operations:
lemma smaller_code [code]:
"smaller j = (if to_nat j = 0 then []
else let k = predecessor j in smaller k @ [k])"
(More efficient implementations are of course possible if you're willing to define an auxiliary function.)
Code generation should now work for smaller, since this code equation doesn't use function small.
The short answer is no, it does not work.
The long answer is that there are often workarounds possible. One is shown by Brian in his answer. The general idea seems to be
Separate the function that has the abstract type in covariant positions besides the final return value (i.e. higher order functions or functions returning containers of abstract values) into multiple helper functions so that abstract values are only constructed as a single return value of one of the helper functions.
In Brian’s example, this function is predecessor. Or, as another simple example, assume a function
definition smallPrime :: "nat ⇒ small option"
where "smallPrime n = (if n ∈ {2,3,5,7} then Some (small n) else None)"
This definition is not a valid code equation, due to the occurrence of small. But this derives one:
definition smallPrimeHelper :: "nat ⇒ small"
where "smallPrimeHelper n = (if n ∈ {2,3,5,7} then small n else small 0)"
lemma [code abstract]: "to_nat (smallPrimeHelper n) = (if n ∈ {2,3,5,7} then n else 0)"
by (auto simp add: smallPrimeHelper_def intro: small_inverse)
lemma [code_unfold]: "smallPrime n = (if n ∈ {2,3,5,7} then Some (smallPrimeHelper n) else None)"
unfolding smallPrime_def smallPrimeHelper_def by simp
If one wants to avoid the redundant calculation of the predicate (which might be more complex than just ∈ {2,3,5,7}), one can make the return type of the helper smarter by introducing an abstract view, i.e. a type that contains both the result of the computation, and the information needed to construct the abstract type from it:
typedef smallPrime_view = "{(x::nat, b::bool). x < 10 ∧ b = (x ∈ {2,3,5,7})}"
by (rule exI[where x = "(2, True)"], auto)
setup_lifting type_definition_small
setup_lifting type_definition_smallPrime_view
For the view we have a function building it and accessors that take the result apart, with some lemmas about them:
lift_definition smallPrimeHelper' :: "nat ⇒ smallPrime_view"
is "λ n. if n ∈ {2,3,5,7} then (n, True) else (0, False)" by simp
lift_definition smallPrimeView_pred :: "smallPrime_view ⇒ bool"
is "λ spv :: (nat × bool) . snd spv" by auto
lift_definition smallPrimeView_small :: "smallPrime_view ⇒ small"
is "λ spv :: (nat × bool) . fst spv" by auto
lemma [simp]: "smallPrimeView_pred (smallPrimeHelper' n) ⟷ (n ∈ {2,3,5,7})"
by transfer simp
lemma [simp]: "n ∈ {2,3,5,7} ⟹ to_nat (smallPrimeView_small (smallPrimeHelper' n)) = n"
by transfer auto
lemma [simp]: "n ∈ {2,3,5,7} ⟹ smallPrimeView_small (smallPrimeHelper' n) = small n"
by (auto intro: iffD1[OF to_nat_inject] simp add: small_inverse)
With that we can derive a code equation that does the check only once:
lemma [code]: "smallPrime n =
(let spv = smallPrimeHelper' n in
(if smallPrimeView_pred spv
then Some (smallPrimeView_small spv)
else None))"
by (auto simp add: smallPrime_def Let_def)

Standard ML permutations

I am working on a function to compute the permutations of all values in a list.
Here is what I have so far:
(* MY ROTATE FUNCTION *)
fun rotate e [] = [[e]]
| rotate e (x::xs)= (e::x::xs)::(List.map (fn l => x::l) (rotate e xs));
(* MY CURRENT PERMUTATION FUNCTION *)
fun perm [] = []
| perm (x::xs) = List.concat(List.map (fn l => (rotate x xs)) xs) @ perm xs;
OUTPUT:
- perm [1,2,3];
val it = [[1,2,3],[2,1,3],[2,3,1],[1,2,3],[2,1,3],[2,3,1],[2,3],[3,2]]
The output should be something like [[1, 2, 3], [1, 3, 2], [2, 1, 3], [2, 3, 1], [3, 1, 2], [3, 2, 1]]. As you can see I am missing something here. I believe the issue is my 3 is not being passed to rotate as rotate 3 [1,2] is what I am missing from my code along with two 2 element lists being here for some reason.
How can I correct my perm function to show the output correctly? Any help no matter how big or small would help me a lot.
Here is a simple fix for your attempted solution. You were nearly there.
fun interleave x [] = [[x]]
| interleave x (h::t) =
(x::h::t)::(List.map(fn l => h::l) (interleave x t))
fun permute nil = [[]]
| permute (h::t) = List.concat( List.map (fn l => interleave h l) (permute t))
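To see why this fix works, it can help to trace the same recursion in another notation. The following is a small Python sketch of the interleave/permute pair above, not a replacement for the SML; the function names simply mirror the ones in the answer. Each permutation of the tail gets the head inserted at every possible position.

```python
def interleave(x, xs):
    """All lists obtained by inserting x at each position of xs."""
    return [xs[:i] + [x] + xs[i:] for i in range(len(xs) + 1)]


def permute(xs):
    """All permutations of xs, built from the permutations of its tail."""
    if not xs:
        # One permutation of the empty list: the empty list itself.
        # Returning [] here instead is what made the original perm lose results.
        return [[]]
    head, tail = xs[0], xs[1:]
    return [p for perm in permute(tail) for p in interleave(head, perm)]


print(permute([1, 2, 3]))
# [[1, 2, 3], [2, 1, 3], [2, 3, 1], [1, 3, 2], [3, 1, 2], [3, 2, 1]]
```

The key differences from the attempt in the question are the base case ([[]] rather than []) and that the head is interleaved into each permutation of the tail rather than into the tail itself.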
I don't think that the rotate approach is the one you'll want to take. Rather, as Shivindap describes here, a good way to do this sort of thing is to pull the first element from the argument list, and append it to all permutations of the tail. Rinse and repeat this for every element of the list, and you'll end up with all the permutations.
You'll find an in depth explanation of this approach here. For code samples in ML, you could also check this out.
Best of luck to you!

Calculating the Moving Average of a List

This weekend I decided to try my hand at some Scala and Clojure. I'm proficient with object oriented programming, and so Scala was easy to pick up as a language, but I wanted to try out functional programming. This is where it got hard.
I just can't seem to get my head into a mode of writing functions. As an expert functional programmer, how do you approach a problem?
Given a list of values and a defined period of summation, how would you generate a new list of the simple moving average of the list?
For example: Given the list values (2.0, 4.0, 7.0, 6.0, 3.0, 8.0, 12.0, 9.0, 4.0, 1.0), and the period 4, the function should return: (0.0, 0.0, 0.0, 4.75, 5.0, 6.0, 7.25, 8.0, 8.25, 6.5)
After spending a day mulling it over, the best I could come up with in Scala was this:
def simpleMovingAverage(values: List[Double], period: Int): List[Double] = {
(for (i <- 1 to values.length)
yield
if (i < period) 0.00
else values.slice(i - period, i).reduceLeft(_ + _) / period).toList
}
I know this is horribly inefficient, I'd much rather do something like:
where n < period: ma(n) = 0
where n = period: ma(n) = sum(value(1) to value(n)) / period
where n > period: ma(n) = ma(n -1) - (value(n-period) / period) + (value(n) / period)
Now that would be easily done in a imperative style, but I can't for the life of me work out how to express that functionally.
Interesting problem. I can think of many solutions, with varying degrees of efficiency. Having to add stuff repeatedly isn't really a performance problem, but let's assume it is. Also, the zeroes at the beginning can be prepended later, so let's not worry about producing them. If the algorithm provides them naturally, fine; if not, we correct it later.
Starting with Scala 2.8, the following would give the result for n >= period by using sliding to get a sliding window of the List:
def simpleMovingAverage(values: List[Double], period: Int): List[Double] =
List.fill(period - 1)(0.0) ::: (values sliding period map (_.sum) map (_ / period)).toList
Nevertheless, although this is rather elegant, it doesn't have the best performance possible, because it doesn't take advantage of already computed additions. So, speaking of them, how can we get them?
Let's say we write this:
values sliding 2 map (_.sum)
We have a list of the sums of each pair of adjacent elements. Let's try to use this result to compute the moving average of 4 elements. The above formula made the following computation:
from d1, d2, d3, d4, d5, d6, ...
to (d1+d2), (d2+d3), (d3+d4), (d4+d5), (d5+d6), ...
So if we take each element and add it to the second next element, we get the moving sum for 4 elements:
(d1+d2)+(d3+d4), (d2+d3)+(d4+d5), (d3+d4)+(d5+d6), ...
We may do it like this:
res zip (res drop 2) map Function.tupled(_+_)
We could then compute the moving average for 8 elements, and so on. Well, there is a well known algorithm to compute things that follow such pattern. It's most known for its use on computing the power of a number. It goes like this:
def power(n: Int, e: Int): Int = e match {
case 0 => 1
case 1 => n
case 2 => n * n
case odd if odd % 2 == 1 => power(n, (odd - 1)) * n
case even => power(power(n, even / 2), 2)
}
So, let's apply it here:
def movingSum(values: List[Double], period: Int): List[Double] = period match {
case 0 => throw new IllegalArgumentException
case 1 => values
case 2 => (values sliding 2 map (_.sum)).toList
case odd if odd % 2 == 1 =>
values zip movingSum(values drop 1, (odd - 1)) map Function.tupled(_+_)
case even =>
val half = even / 2
val partialResult = movingSum(values, half)
partialResult zip (partialResult drop half) map Function.tupled(_+_)
}
So, here's the logic. Period 0 is invalid, period 1 is equal to the input, period 2 is sliding window of size 2. If greater than that, it may be even or odd.
If odd, we add each element to the movingSum of the next (odd - 1) elements. For example, if 3, we add each element to the movingSum of the next 2 elements.
If even, we compute the movingSum for n / 2, then add each element to the one n / 2 steps afterwards.
With that definition, we can then go back to the problem and do this:
def simpleMovingAverage(values: List[Double], period: Int): List[Double] =
List.fill(period - 1)(0.0) ::: (movingSum(values, period) map (_ / period))
There's a slight inefficiency with regards to the use of :::, but it's O(period), not O(values.size). It can be made more efficient with a tail recursive function. And, of course, the definition of "sliding" I provided is horrendous performance-wise, but there will be a much better definition of it on Scala 2.8. Note that we can't make an efficient sliding method on a List, but we can do it on an Iterable.
Having said all that, I'd go with the very first definition, and optimize only if a critical path analysis pinpointed this as a big deal.
To conclude, let's consider how I went about the problem. We have a moving average problem. A moving average is the sum of a moving "window" on a list, divided by the size of that window. So, first, I try to get a sliding window, sum everything on it, and then divide by the size.
The next problem was to avoid repetition of already computed additions. In this case, I went to the smallest addition possible, and tried to figure out how to compute bigger sums reusing such results.
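The sum-reuse idea can be sketched outside Scala as well. Below is a rough Python rendering of the same halving scheme, offered as an illustration under the assumption that values is an in-memory list; moving_sum is just a name chosen to mirror movingSum above. An even period is built from two half-period sums, and an odd period adds one element to the sum for period - 1.

```python
def moving_sum(values, period):
    """Sums of every run of `period` consecutive elements of `values`."""
    if period < 1:
        raise ValueError("period must be at least 1")
    if period == 1:
        return list(values)
    if period % 2 == 1:
        # Odd: each element plus the (period - 1)-sum that starts right after it.
        rest = moving_sum(values[1:], period - 1)
        return [v + r for v, r in zip(values, rest)]
    # Even: add each half-period sum to the one `half` steps later.
    half = period // 2
    partial = moving_sum(values, half)
    return [a + b for a, b in zip(partial, partial[half:])]


values = [2.0, 4.0, 7.0, 6.0, 3.0, 8.0, 12.0, 9.0, 4.0, 1.0]
averages = [s / 4 for s in moving_sum(values, 4)]
print(averages)  # [4.75, 5.0, 6.0, 7.25, 8.0, 8.25, 6.5]
```

zip stops at the shorter list, which is what trims the result to the positions where a full window exists.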
Finally, let's try to solve the problem the way you figured it, by adding and subtracting from the previous result. Getting the first average is easy:
def movingAverage(values: List[Double], period: Int): List[Double] = {
val first = (values take period).sum / period
Now we make two lists. First, the list of elements to be subtracted. Next, the list of elements to be added:
val subtract = values map (_ / period)
val add = subtract drop period
We can add these two lists by using zip. This method will only produce as many elements as the smaller list has, which avoids the problem of subtract being bigger than necessary:
val addAndSubtract = add zip subtract map Function.tupled(_ - _)
We finish by composing the result with a fold:
val res = (addAndSubtract.foldLeft(first :: List.fill(period - 1)(0.0)) {
(acc, add) => (add + acc.head) :: acc
}).reverse
which is the answer to be returned. The whole function looks like this:
def movingAverage(values: List[Double], period: Int): List[Double] = {
val first = (values take period).sum / period
val subtract = values map (_ / period)
val add = subtract drop period
val addAndSubtract = add zip subtract map Function.tupled(_ - _)
val res = (addAndSubtract.foldLeft(first :: List.fill(period - 1)(0.0)) {
(acc, add) => (add + acc.head) :: acc
}).reverse
res
}
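For comparison, here is a short Python sketch of the same add-and-subtract scheme. It is a variation rather than a line-by-line port: it keeps a running sum and divides once per output instead of dividing every element by the period first, and it assumes the input has at least period elements. The name running_average is made up for this example.

```python
def running_average(values, period):
    """Simple moving average, padded with period - 1 leading zeros."""
    total = sum(values[:period])
    out = [0.0] * (period - 1) + [total / period]
    # Pair each element leaving the window with the one entering it.
    for leaving, entering in zip(values, values[period:]):
        total += entering - leaving
        out.append(total / period)
    return out


values = [2.0, 4.0, 7.0, 6.0, 3.0, 8.0, 12.0, 9.0, 4.0, 1.0]
print(running_average(values, 4))
# [0.0, 0.0, 0.0, 4.75, 5.0, 6.0, 7.25, 8.0, 8.25, 6.5]
```

Each output costs one addition and one subtraction regardless of the period, which is the property the question was after.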
I know Clojure better than Scala, so here goes. As I write this the other Clojure entry here is imperative; that's not really what you're after (and isn't idiomatic Clojure). The first algorithm that comes to my mind is repeatedly taking the requested number of elements from the sequence, dropping the first element, and recurring.
The following works on any kind of sequence (vector or list, lazy or not) and gives a lazy sequence of averages---which could be helpful if you're working on a list of indefinite size. Note that it takes care of the base case by implicitly returning nil if there aren't enough elements in the list to consume.
(defn moving-average [values period]
(let [first (take period values)]
(if (= (count first) period)
(lazy-seq
(cons (/ (reduce + first) period)
(moving-average (rest values) period))))))
Running this on your test data yields
user> (moving-average '(2.0, 4.0, 7.0, 6.0, 3.0, 8.0, 12.0, 9.0, 4.0, 1.0) 4)
(4.75 5.0 6.0 7.25 8.0 8.25 6.5)
It doesn't give "0" for the first few elements in the sequence, though that could easily be handled (somewhat artificially).
The easiest thing of all is to see the pattern and be able to bring to mind an available function that fits the bill. partition gives a lazy view of portions of a sequence, which we can then map over:
(defn moving-average [values period]
(map #(/ (reduce + %) period) (partition period 1 values)))
Someone asked for a tail recursive version; tail recursion vs. laziness is a bit of a tradeoff. When your job is building up a list then making your function tail recursive is usually pretty simple, and this is no exception---just build up the list as an argument to a subfunction. We'll accumulate to a vector instead of a list because otherwise the list will be built up backwards and will need to be reversed at the end.
(defn moving-average [values period]
(loop [values values, period period, acc []]
(let [first (take period values)]
(if (= (count first) period)
(recur (rest values) period (conj acc (/ (reduce + first) period)))
acc))))
loop is a way to make an anonymous inner function (sort of like Scheme's named let); recur must be used in Clojure to eliminate tail calls. conj is a generalized cons, appending in the manner natural for the collection---the beginning of lists and the end of vectors.
Here is another (functional) Clojure solution:
(defn average [coll]
(/ (reduce + coll)
(count coll)))
(defn ma [period coll]
(map average (partition period 1 coll)))
The zeros at the beginning of the sequence must still be added if that is a requirement.
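The partition-with-step-1 idea translates to a lazy generator in other languages too. As a sketch only (Python; sliding_windows and lazy_moving_average are hypothetical names, and the bounded deque is my choice for holding the current window), this version works on any iterable, including infinite ones, and like the Clojure versions it omits the leading zeros:

```python
from collections import deque
from itertools import count, islice


def sliding_windows(values, period):
    """Lazily yield each run of `period` consecutive items, advancing by one."""
    window = deque(maxlen=period)  # the oldest item falls out automatically
    for v in values:
        window.append(v)
        if len(window) == period:
            yield tuple(window)


def lazy_moving_average(values, period):
    """Lazily yield the mean of each full window."""
    return (sum(w) / period for w in sliding_windows(values, period))


data = [2.0, 4.0, 7.0, 6.0, 3.0, 8.0, 12.0, 9.0, 4.0, 1.0]
print(list(lazy_moving_average(data, 4)))
# [4.75, 5.0, 6.0, 7.25, 8.0, 8.25, 6.5]

# Works on an infinite input because nothing is computed until requested.
print(list(islice(lazy_moving_average(count(1), 2), 3)))  # [1.5, 2.5, 3.5]
```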
Here's a purely functional solution in Clojure. More complex than those already provided, but it is lazy and only adjusts the average at each step, instead of recalculating it from scratch. It's actually slower than a simple solution which calculates a new average at each step if the period is small; for larger periods, however, it experiences virtually no slowdown, whereas something doing (/ (take period ...) period) will perform worse for longer periods.
(defn moving-average
"Calculates the moving average of values with the given period.
Returns a lazy seq, works with infinite input sequences.
Does not include initial zeros in the output."
[period values]
(let [gen (fn gen [last-sum values-old values-new]
(if (empty? values-new)
nil
(let [num-out (first values-old)
num-in (first values-new)
new-sum (+ last-sum (- num-out) num-in)]
(lazy-seq
(cons new-sum
(gen new-sum
(next values-old)
(next values-new)))))))]
(if (< (count (take period values)) period)
nil
(map #(/ % period)
(gen (apply + (take (dec period) values))
(cons 0 values)
(drop (dec period) values))))))
Here's a partially point-free one line Haskell solution:
ma p = reverse . map ((/ (fromIntegral p)) . sum . take p) . (drop p) . reverse . tails
First it applies tails to the list to get the "tails" lists, so:
Prelude List> tails [2.0, 4.0, 7.0, 6.0, 3.0]
[[2.0,4.0,7.0,6.0,3.0],[4.0,7.0,6.0,3.0],[7.0,6.0,3.0],[6.0,3.0],[3.0],[]]
Reverses it and drops the first 'p' entries (taking p as 2 here):
Prelude List> (drop 2 . reverse . tails) [2.0, 4.0, 7.0, 6.0, 3.0]
[[6.0,3.0],[7.0,6.0,3.0],[4.0,7.0,6.0,3.0],[2.0,4.0,7.0,6.0,3.0]]
In case you aren't familiar with the (.) dot/nipple symbol, it is the operator for function composition, meaning it passes the output of one function as the input of another, "composing" them into a single function. (g . f) means "run f on a value then pass the output to g", so ((g . f) x) is the same as (g (f x)). Generally its usage leads to a clearer programming style.
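For anyone more at home outside Haskell, composition can be mimicked with a two-line helper. This is a Python sketch for illustration; compose is not a built-in, and the point is only that the right-hand function is applied first.

```python
def compose(f, g):
    """Return a function that applies g first, then f, like Haskell's (f . g)."""
    return lambda x: f(g(x))


halve = lambda x: x / 2

# Mirrors ((/ 2) . sum): sum the list first, then divide the total by 2.
average_of_two = compose(halve, sum)
print(average_of_two([6.0, 3.0]))  # 4.5
```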
It then maps the function ((/ (fromIntegral p)) . sum . take p) onto the list. So for every list in the list it takes the first 'p' elements, sums them, then divides them by 'p'. Then we just flip the list back again with "reverse".
Prelude List> map ((/ (fromIntegral 2)) . sum . take 2) [[6.0,3.0],[7.0,6.0,3.0]
,[4.0,7.0,6.0,3.0],[2.0,4.0,7.0,6.0,3.0]]
[4.5,6.5,5.5,3.0]
This all looks a lot more inefficient than it is; "reverse" doesn't physically reverse the order of a list until the list is evaluated, it just lays it out onto the stack (good ol' lazy Haskell). "tails" also doesn't create all those separate lists, it just references different sections of the original list. It's still not a great solution, but it is one line long :)
Here's a slightly nicer but longer solution that uses mapAccum to do a sliding subtraction and addition:
ma p l = snd $ mapAccumL ma' a l'
where
(h, t) = splitAt p l
a = sum h
l' = (0, 0) : (zip l t)
ma' s (x, y) = let s' = (s - x) + y in (s', s' / (fromIntegral p))
First we split the list into two parts at "p", so:
Prelude List> splitAt 2 [2.0, 4.0, 7.0, 6.0, 3.0]
([2.0,4.0],[7.0,6.0,3.0])
Sum the first bit:
Prelude List> sum [2.0, 4.0]
6.0
Zip the second bit with the original list (this just pairs off items in order from the two lists). The original list is obviously longer, but we lose this extra bit:
Prelude List> zip [2.0, 4.0, 7.0, 6.0, 3.0] [7.0,6.0,3.0]
[(2.0,7.0),(4.0,6.0),(7.0,3.0)]
Now we define a function for our mapAccum(ulator). mapAccumL is the same as "map", but with an extra running state/accumulator parameter, which is passed from the previous "mapping" to the next one as map runs through the list. We use the accumulator as our moving sum, and as our list is formed of the element that has just left the sliding window and the element that just entered it (the list we just zipped), our sliding function takes the first number 'x' away from the sum and adds the second number 'y'. We then pass the new sum 's'' along and return 's'' divided by 'p'. "snd" (second) just takes the second member of a pair (tuple), which is used to take the second return value of mapAccumL, as mapAccumL will return the accumulator as well as the mapped list.
For those of you not familiar with the $ symbol, it is the "application operator". It doesn't really do anything but it has a "low, right-associative binding precedence", so it means you can leave out the brackets (take note LISPers), i.e. (f x) is the same as f $ x
Running (ma 4 [2.0, 4.0, 7.0, 6.0, 3.0, 8.0, 12.0, 9.0, 4.0, 1.0]) yields [4.75, 5.0, 6.0, 7.25, 8.0, 8.25, 6.5] for either solution.
Oh and you'll need to import the module "List" to compile either solution.
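For readers who don't read Haskell, here is a rough Python sketch of the same sliding-sum idea (subtract the element leaving the window, add the one entering it). The name moving_average_sliding is made up for illustration, and it assumes Python 3.8+ for the initial= argument of itertools.accumulate:

```python
from itertools import accumulate

def moving_average_sliding(values, period):
    """Sliding-sum moving average (hypothetical helper, not from the answer above)."""
    if period < 1 or len(values) < period:
        return []
    first_sum = sum(values[:period])
    # Pair each element leaving the window with the one entering it,
    # like zipping the list with its own tail after `period` elements.
    deltas = (entering - leaving
              for leaving, entering in zip(values, values[period:]))
    # accumulate threads the running sum through the deltas,
    # playing the role of the mapAccumL accumulator.
    sums = accumulate(deltas, initial=first_sum)
    return [s / period for s in sums]

print(moving_average_sliding([2.0, 4.0, 7.0, 6.0, 3.0, 8.0, 12.0, 9.0, 4.0, 1.0], 4))
```

Like the Haskell version, this does one subtraction and one addition per output element, regardless of the period.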
Here are 2 more ways to do moving average in Scala 2.8.0 (one strict and one lazy). Both assume there are at least p Doubles in vs.
// strict moving average
def sma(vs: List[Double], p: Int): List[Double] =
((vs.take(p).sum / p :: List.fill(p - 1)(0.0), vs) /: vs.drop(p)) {(a, v) =>
((a._1.head - a._2.head / p + v / p) :: a._1, a._2.tail)
}._1.reverse
// lazy moving average
def lma(vs: Stream[Double], p: Int): Stream[Double] = {
def _lma(a: => Double, vs1: Stream[Double], vs2: Stream[Double]): Stream[Double] = {
val _a = a // caches value of a
_a #:: _lma(_a - vs2.head / p + vs1.head / p, vs1.tail, vs2.tail)
}
Stream.fill(p - 1)(0.0) #::: _lma(vs.take(p).sum / p, vs.drop(p), vs)
}
scala> sma(List(2.0, 4.0, 7.0, 6.0, 3.0, 8.0, 12.0, 9.0, 4.0, 1.0), 4)
res29: List[Double] = List(0.0, 0.0, 0.0, 4.75, 5.0, 6.0, 7.25, 8.0, 8.25, 6.5)
scala> lma(Stream(2.0, 4.0, 7.0, 6.0, 3.0, 8.0, 12.0, 9.0, 4.0, 1.0), 4).take(10).force
res30: scala.collection.immutable.Stream[Double] = Stream(0.0, 0.0, 0.0, 4.75, 5.0, 6.0, 7.25, 8.0, 8.25, 6.5)
The J programming language facilitates programs such as moving average. Indeed, there are fewer characters in (+/ % #)\ than in their label, 'moving average.'
For the values specified in this question (including the name 'values') here is a straightforward way to code this:
values=: 2 4 7 6 3 8 12 9 4 1
4 (+/ % #)\ values
4.75 5 6 7.25 8 8.25 6.5
We can describe this by using labels for components.
periods=: 4
average=: +/ % #
moving=: \
periods average moving values
4.75 5 6 7.25 8 8.25 6.5
Both examples use exactly the same program. The only difference is the use of more names in the second form. Such names can help readers who don't know the J primaries.
Let's look a bit further into what's going on in the subprogram, average. +/ denotes summation (Σ) and % denotes division (like the classical sign ÷). Calculating a tally (count) of items is done by # . The overall program, then, is the sum of values divided by the tally of values: +/ % #
The result of the moving-average calculation written here does not include the leading zeros expected in the original question. Those zeros are arguably not part of the intended calculation.
The technique used here is called tacit programming. It is pretty much the same as the point-free style of functional programming.
Here is Clojure pretending to be a more functional language. This is fully tail-recursive, btw, and includes leading zeroes.
(defn moving-average [period values]
(loop [[x & xs] values
window []
ys []]
(if (and (nil? x) (nil? xs))
;; base case
ys
;; inductive case
(if (< (count window) (dec period))
(recur xs (conj window x) (conj ys 0.0))
(recur xs
(conj (vec (rest window)) x)
(conj ys (/ (reduce + x window) period)))))))
(deftest test-moving-average
(is (= [0.0 0.0 0.0 4.75 5.0 6.0 7.25 8.0 8.25 6.5]
(moving-average 4 [2.0 4.0 7.0 6.0 3.0 8.0 12.0 9.0 4.0 1.0]))))
Usually I put the collection or list parameter last to make the function easier to curry. But in Clojure...
(partial moving-average 4)
... is so cumbersome, I usually end up doing this ...
#(moving-average 4 %)
... in which case, it doesn't really matter what order the parameters go.
Here's a clojure version:
Because of the lazy-seq, it's perfectly general and won't blow stack
(defn partialsums [start lst]
(lazy-seq
(if-let [lst (seq lst)]
(cons start (partialsums (+ start (first lst)) (rest lst)))
(list start))))
(defn sliding-window-moving-average [window lst]
(map #(/ % window)
(let [start (apply + (take window lst))
diffseq (map - (drop window lst) lst)]
(partialsums start diffseq))))
;; To help see what it's doing:
(sliding-window-moving-average 5 '(1 2 3 4 5 6 7 8 9 10 11))
start = (+ 1 2 3 4 5) = 15
diffseq = - (6 7 8 9 10 11)
(1 2 3 4 5 6 7 8 9 10 11)
= (5 5 5 5 5 5)
(partialsums 15 '(5 5 5 5 5 5) ) = (15 20 25 30 35 40 45)
(map #(/ % 5) (15 20 25 30 35 40 45)) = (3 4 5 6 7 8 9)
;; Example
(take 20 (sliding-window-moving-average 5 (iterate inc 0)))
This example makes use of state, since to me it's a pragmatic solution in this case, and a closure to create the windowing averaging function:
(defn make-averager [#^Integer period]
  (let [buff (atom (vec (repeat period nil)))
        pos (atom 0)]
    (fn [nextval]
      (reset! buff (assoc @buff @pos nextval))
      (reset! pos (mod (+ 1 @pos) period))
      (if (some nil? @buff)
        0
        (/ (reduce + @buff)
           (count @buff))))))
(map (make-averager 4)
[2.0, 4.0, 7.0, 6.0, 3.0, 8.0, 12.0, 9.0, 4.0, 1.0])
;; yields =>
(0 0 0 4.75 5.0 6.0 7.25 8.0 8.25 6.5)
It is still functional in the sense of making use of first class functions, though it is not side-effect free. The two languages you mentioned both run on top of the JVM and thus both allow for state-management when necessary.
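The same closure-over-state idea can be sketched in Python. This is only an illustration under my own assumptions: make_averager mirrors the name above, but it uses a bounded deque instead of a vector plus position counter.

```python
from collections import deque

def make_averager(period):
    # The deque is the mutable state captured by the closure;
    # with maxlen set, the oldest value drops out automatically.
    window = deque(maxlen=period)

    def averager(next_value):
        window.append(next_value)
        if len(window) < period:
            return 0  # window not full yet, same placeholder as above
        return sum(window) / period

    return averager

data = [2.0, 4.0, 7.0, 6.0, 3.0, 8.0, 12.0, 9.0, 4.0, 1.0]
print(list(map(make_averager(4), data)))
```

Each call to make_averager produces an independent averager, so the state never leaks between two uses.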
This solution is in Haskell, which is more familiar to me:
slidingSums :: Num t => Int -> [t] -> [t]
slidingSums n list = case (splitAt (n - 1) list) of
    (window, []) -> [] -- list contains less than n elements
    (window, rest) -> slidingSums' list rest (sum window)
  where
    slidingSums' _ [] _ = []
    slidingSums' (hl : tl) (hr : tr) sumLastNm1 = sumLastN : slidingSums' tl tr (sumLastN - hl)
      where sumLastN = sumLastNm1 + hr
movingAverage :: Fractional t => Int -> [t] -> [t]
movingAverage n list = map (/ (fromIntegral n)) (slidingSums n list)
paddedMovingAverage :: Fractional t => Int -> [t] -> [t]
paddedMovingAverage n list = replicate (n - 1) 0 ++ movingAverage n list
Scala translation:
def slidingSums1(list: List[Double], rest: List[Double], n: Int, sumLastNm1: Double): List[Double] = rest match {
case Nil => Nil
case hr :: tr => {
val sumLastN = sumLastNm1 + hr
sumLastN :: slidingSums1(list.tail, tr, n, sumLastN - list.head)
}
}
def slidingSums(list: List[Double], n: Int): List[Double] = list.splitAt(n - 1) match {
case (_, Nil) => Nil
case (firstNm1, rest) => slidingSums1(list, rest, n, firstNm1.reduceLeft(_ + _))
}
def movingAverage(list: List[Double], n: Int): List[Double] = slidingSums(list, n).map(_ / n)
def paddedMovingAverage(list: List[Double], n: Int): List[Double] = List.make(n - 1, 0.0) ++ movingAverage(list, n)
A short Clojure version that has the advantage of being O(list length) regardless of your period:
(defn moving-average [list period]
(let [accums (let [acc (atom 0)] (map #(do (reset! acc (+ @acc %1))) (cons 0 list)))
zeros (repeat (dec period) 0)]
(concat zeros (map #(/ (- %1 %2) period) (drop period accums) accums))))
This exploits the fact that you can calculate the sum of a range of numbers by creating a cumulative sum of the sequence (e.g. [1 2 3 4 5] -> [0 1 3 6 10 15]) and then subtracting the two numbers with an offset equal to your period.
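To make the cumulative-sum trick concrete, here is a small Python sketch of it. The function name moving_average_prefix is hypothetical, and it assumes Python 3.8+ for accumulate(..., initial=0):

```python
from itertools import accumulate

def moving_average_prefix(values, period):
    # prefix[i] is the sum of the first i values, so prefix[0] == 0,
    # e.g. [1, 2, 3, 4, 5] -> [0, 1, 3, 6, 10, 15].
    prefix = list(accumulate(values, initial=0))
    # The sum of any window is the difference of two prefix sums
    # that are `period` positions apart.
    averages = [(hi - lo) / period
                for lo, hi in zip(prefix, prefix[period:])]
    # Leading zeros pad the positions where the window is not full yet.
    return [0] * (period - 1) + averages

print(moving_average_prefix([2.0, 4.0, 7.0, 6.0, 3.0, 8.0, 12.0, 9.0, 4.0, 1.0], 4))
```

One caveat with this approach in general: with floating-point input, prefix sums over a very long sequence can accumulate rounding error, which the windowed difference then inherits.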
It looks like you are looking for a recursive solution. In that case, I would suggest slightly changing the problem and aiming for (4.75, 5.0, 6.0, 7.25, 8.0, 8.25, 6.5, 0.0, 0.0, 0.0) as a solution.
In that case, you can write the below elegant recursive solution in Scala:
def mavg(values: List[Double], period: Int): List[Double] = {
if (values.size < period) List.fill(values.size)(0.0) else
if (values.size == period) (values.sum / values.size) :: List.fill(period - 1)(0.0) else {
val rest: List[Double] = mavg(values.tail, period)
(rest.head + ((values.head - values(period))/period)):: rest
}
}
I know how I would do it in python (note: the first 3 elements with the values 0.0 are not returned since that is actually not the appropriate way to represent a moving average). I would imagine similar techniques will be feasible in Scala. Here are multiple ways to do it.
data = (2.0, 4.0, 7.0, 6.0, 3.0, 8.0, 12.0, 9.0, 4.0, 1.0)
terms = 4
expected = (4.75, 5.0, 6.0, 7.25, 8.0, 8.25, 6.5)
# Method 1 : Simple. Uses slices
assert expected == \
tuple((sum(data[i:i+terms])/terms for i in range(len(data)-terms+1)))
# Method 2 : Tracks slots each of terms elements
# Note: slot, and block mean the same thing.
# Block is the internal tracking deque, slot is the final output
from collections import deque
def slots(data, terms):
    block = deque()
    for datum in data :
        block.append(datum)
        if len(block) > terms : block.popleft()
        if len(block) == terms :
            yield block
assert expected == \
tuple(sum(slot)/terms for slot in slots(data, terms))
# Method 3 : Reads value one at a time, computes the sums and throws away read values
def moving_average((avgs, sums),val):
    sums = tuple((sum + val) for sum in sums)
    return (avgs + ((sums[0] / terms),), sums[1:] + (val,))
assert expected == reduce(
moving_average,
tuple(data[terms-1:]),
((),tuple(sum(data[i:terms-1]) for i in range(terms-1))))[0]
# Method 4 : Semantically same as method 3, intentionally obfuscates just to fit in a lambda
assert expected == \
reduce(
lambda (avgs, sums),val: tuple((avgs + ((nsum[0] / terms),), nsum[1:] + (val,)) \
for nsum in (tuple((sum + val) for sum in sums),))[0], \
tuple(data[terms-1:]),
((),tuple(sum(data[i:terms-1]) for i in range(terms-1))))[0]
Late to the party, and new to functional programming too, I came up with this solution with an inner function:
def slidingAvg (ixs: List [Double], len: Int) = {
val dxs = ixs.map (_ / len)
val start = (0.0 /: dxs.take (len)) (_ + _)
val head = List.make (len - 1, 0.0)
def addAndSub (sofar: Double, from: Int, to: Int) : List [Double] =
if (to >= dxs.length) Nil else {
val current = sofar - dxs (from) + dxs (to)
current :: addAndSub (current, from + 1, to + 1)
}
head ::: start :: addAndSub (start, 0, len)
}
val xs = List(2, 4, 7, 6, 3, 8, 12, 9, 4, 1)
slidingAvg (xs.map (1.0 * _), 4)
I adopted the idea of dividing the whole list by the period (len) in advance.
Then I compute the starting sum from the first len elements.
And I generate the leading placeholder elements (0.0, 0.0, ...).
Then I recursively subtract the first and add the last value.
In the end I concatenate the pieces into one list.
In Haskell pseudocode:
group4 (a:b:c:d:xs) = [a,b,c,d] : group4 (b:c:d:xs)
group4 _ = []
avg4 xs = sum xs / 4
running4avg nums = (map avg4 (group4 nums))
or pointfree
running4avg = map avg4 . group4
(Now one really should abstract the 4 out ....)
Using Haskell:
import Data.List (tails)
import Data.Maybe (catMaybes)
movingAverage :: Int -> [Double] -> [Double]
movingAverage n xs = catMaybes . fmap (avg . take n) . tails $ xs
  where avg list = case length list == n of
          True -> Just . (/ (fromIntegral n)) . (foldl (+) 0) $ list
          _    -> Nothing
The key is the tails function, which maps a list to a list of copies of the original list, with the property that the n-th element of the result is missing the first n-1 elements.
So
[1,2,3,4,5] -> [[1,2,3,4,5], [2,3,4,5], [3,4,5], [4,5], [5], []]
We apply fmap (avg . take n) to the result, which means we take the n-length prefix from the sublist, and compute its avg. If the length of the list we are avg'ing is not n, then we do not compute the average (since it is undefined). In that case, we return Nothing. If it is, we do, and wrap it in "Just". Finally, we run "catMaybes" on the result of fmap (avg . take n), to get rid of the Maybe type.
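The tails-based pipeline translates fairly directly to Python. This is just a sketch: tails and moving_average_tails are names I made up, the filter on prefix length stands in for the Just/Nothing plus catMaybes step, and n is assumed to be at least 1.

```python
def tails(xs):
    # All suffixes of xs, from the full list down to the empty list.
    return [xs[i:] for i in range(len(xs) + 1)]

def moving_average_tails(n, xs):
    # Take the n-length prefix of every suffix ...
    prefixes = (t[:n] for t in tails(xs))
    # ... and average only the prefixes that are actually n long,
    # silently dropping the short ones near the end of the list.
    return [sum(p) / n for p in prefixes if len(p) == n]

print(tails([1, 2, 3, 4, 5]))
print(moving_average_tails(4, [2.0, 4.0, 7.0, 6.0, 3.0, 8.0, 12.0, 9.0, 4.0, 1.0]))
```

Unlike Haskell's tails, this version copies every suffix, so it is meant to show the shape of the computation rather than to be efficient.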
I was (surprised and) disappointed by the performance of what seemed to me the most idiomatic Clojure solutions, @JamesCunningham's lazy-seq solutions.
(def integers (iterate inc 0))
(def coll (take 10000 integers))
(def n 1000)
(time (doall (moving-average-james-1 coll n)))
# "Elapsed time: 3022.862 msecs"
(time (doall (moving-average-james-2 coll n)))
# "Elapsed time: 3433.988 msecs"
So here's a combination of James' solution with @DanielC.Sobral's idea of adapting fast-exponentiation to moving sums:
(defn moving-average
[coll n]
(letfn [(moving-sum [coll n]
(lazy-seq
(cond
(= n 1) coll
(= n 2) (map + coll (rest coll))
(odd? n) (map + coll (moving-sum (rest coll) (dec n)))
:else (let [half (quot n 2)
hcol (moving-sum coll half)]
(map + hcol (drop half hcol))))))]
(cond
(< n 1) nil
(= n 1) coll
:else (map #(/ % n) (moving-sum coll n)))))
(time (doall (moving-average coll n)))
# "Elapsed time: 42.034 msecs"
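If the halving trick is hard to see in the Clojure, this strict Python sketch shows the same recursion on plain lists. The names moving_sum and moving_average_doubling are mine, and it assumes n >= 1 and a sliceable list as input:

```python
def moving_sum(values, n):
    """Sums of every run of n consecutive values (assumes n >= 1)."""
    if n == 1:
        return list(values)
    if n % 2 == 1:
        # Odd window: one element plus the (n - 1)-sum starting right after it.
        rest = moving_sum(values[1:], n - 1)
        return [a + b for a, b in zip(values, rest)]
    # Even window: add two adjacent half-width sums, `half` positions apart.
    half = n // 2
    halves = moving_sum(values, half)
    return [a + b for a, b in zip(halves, halves[half:])]

def moving_average_doubling(values, n):
    return [s / n for s in moving_sum(values, n)]

print(moving_average_doubling([2.0, 4.0, 7.0, 6.0, 3.0, 8.0, 12.0, 9.0, 4.0, 1.0], 4))
```

The recursion depth grows with the logarithm of n rather than with n itself, which is where the fast-exponentiation analogy comes from.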
Edit: this one, based on @mikera's solution, is even faster.
(defn moving-average
[coll n]
(cond
(< n 1) nil
(= n 1) coll
:else (let [sums (reductions + 0 coll)]
(map #(/ (- %1 %2) n) (drop n sums) sums))))
(time (doall (moving-average coll n)))
# "Elapsed time: 9.184 msecs"