How do I set the initial value of an accumulator in KDB? - kdb

I've written a function r that represents a functional state s update, of the form s: r[s;q]
I'm trying to get it to scan a list of integers q, but the problem I'm getting is that the application of scan / consumes the first and second elements of the list as its seed, rather than starting from the initial state s. The result should be a list of states.
What am I doing wrong?
Actual code:
a: 5+cos (til 20)
s: `peg`ent`exi!3#enlist[::]
r: {[s;q]
  l: 0.9; h: 1.1;
  init: `peg`ent`exi!({null x};{null x};{null x});
  s1: `peg`ent`exi!({not null x};{null x};{null x});
  s2: `peg`ent`exi!({not null x};{not null x};{null x});
  s: $[all init#'s; s, enlist[`peg]!enlist[q]; s];
  s: $[(all s1#'s) & q<l*s[`peg]; s, enlist[`ent]!enlist[q]; s];
  s: $[(all s2#'s) & q>h*s[`peg]; s, enlist[`exi]!enlist[q]; s];
  s }
r scan a

The answer to this was to prepend the initial state to the list, so that scan uses it as the seed:
r scan enlist[s], a
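For comparison (my own addition, not part of the original answer), the same seeded-scan pattern in Scala is scanLeft, where the initial state is passed explicitly rather than prepended to the input. Here the state update r is a hypothetical running sum standing in for the question's dictionary update:

```scala
object SeededScan extends App {
  // Hypothetical state update, analogous to q's r[s;q]:
  // here the "state" is just a running sum.
  def r(s: Int, q: Int): Int = s + q

  val qs = List(1, 2, 3)
  val initial = 10

  // scanLeft seeds the scan with `initial`, so the first input element
  // is combined with the seed instead of being consumed as the seed.
  val states = qs.scanLeft(initial)(r)
  println(states) // List(10, 11, 13, 16)
}
```

The result includes the seed itself, followed by one intermediate state per input element, matching q's behavior with `r scan enlist[s], a`.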

Related

Scala Set intersection performance

Using Scala's scala.collection.Set[T]. Given a small set s with only a few elements and another big set b with lots of elements, is there any performance difference between:
s & b // s intersect b
and
b & s // b intersect s.
If yes, which is the fastest?
The answer is: it's complicated.
The default implementation of an immutable set is scala.collection.immutable.Set. This has special cases for sizes 1 to 4. As soon as you have more than 4 elements, it will use scala.collection.immutable.HashSet.
Very small s (1..4)
So let's say you have a large set b and a small set s, with s containing at most 4 elements.
Then it will make a large difference:
b & s will filter all elements of b against membership in s and therefore takes roughly b.size * s.size equality comparisons. Since b is large this will be quite slow.
s & b will filter the few elements of s against membership in b, which is s.size hash lookups, each followed by an equality comparison if the hashes match (remember b is a HashSet). Since s is small this should be very fast.
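A quick sketch of the asymmetry described above (my own illustration, not from the original answer): both orders return the same set, but the left operand determines which side is iterated and which side is probed for membership:

```scala
object IntersectOrder extends App {
  val s = Set(1, 2, 3)                 // small set
  val b = (0 until 1000000 by 2).toSet // large set of even numbers

  // s & b iterates over the small side, doing |s| membership tests in b.
  // b & s iterates over the large side, doing |b| membership tests in s.
  // The results are identical; only the amount of work differs.
  assert((s & b) == (b & s))
  println(s & b) // Set(2)
}
```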
Larger s (n > 4)
As soon as s is larger than 4 elements, it also will be a HashSet. Intersection for HashSets is written in a symmetric and very efficient way. It will combine the tree structures of s and b and perform equality comparisons when the hashes match. It will use maximum structural sharing. E.g. if b contains all elements of s, the result will be the same instance as s, so no objects will be allocated.
General advice
If you just write generic code where you don't care much about high performance, it is fine to use the default implementations such as scala.collection.Set. However, if you care about performance characteristics it is preferable to use a concrete implementation. E.g. if s and b are declared as scala.collection.immutable.HashSet, you will have consistent high performance independent of order, provided that you have a good hash function.
The generic implementation seen in GenSetLike using intersect is overridden for HashSet with an implementation which looks quite complex to me (see scala.collection.immutable.HashSet.HashTrieSet#intersect0). Based on my rough benchmark its performance is similar for both a & b and b & a, and it is similar to the performance of a filter b, which is an order of magnitude faster than b filter a. My testing code is:
object Sets extends App {
  def time[R](block: => R): R = {
    val t0 = System.nanoTime()
    val result = block // call-by-name
    val t1 = System.nanoTime()
    println("Elapsed time: " + (t1 - t0) / 1e6 + "ms")
    result
  }

  val a = (0 until 10000 by 1).toSet // smaller data
  val b = (0 until 1000000 by 2).toSet

  time { a & b }
  time { b & a }
  time { a & b }
  time { b & a }
  time { a & b }
  time { b & a }

  println("Filter")
  time { a filter b }
  time { b filter a }
  time { a filter b }
  time { b filter a }
  time { a filter b }
  time { b filter a }
}
Result is:
Elapsed time: 6.990442ms
Elapsed time: 4.25017ms
Elapsed time: 4.1089ms
Elapsed time: 4.480789ms
Elapsed time: 3.71588ms
Elapsed time: 3.160159ms
Filter
Elapsed time: 7.781613ms
Elapsed time: 68.33023ms
Elapsed time: 5.858472ms
Elapsed time: 42.491131ms
Elapsed time: 2.982364ms
Elapsed time: 52.762474ms
Let us create two sets as per the conditions described in the question:
val a = (0 until 10000 by 1).toSet //smaller data
val b = (0 until 1000000 by 2).toSet //Relatively larger data
We can define a timing function to measure the execution time as below:
def time[R](block: => R): R = {
  val t0 = System.nanoTime()
  val result = block // call-by-name
  val t1 = System.nanoTime()
  println("Elapsed time: " + (t1 - t0) + "ns")
  result
}
Now we can check the intersection in both orders:
scala> time {a & b}
Elapsed time: 5895220ns
res2: scala.collection.immutable.Set[Int] = Set(892, 5810, 8062, ..)
scala> time {b & a}
Elapsed time: 6038271ns
res3: scala.collection.immutable.Set[Int] = Set(892, 5810, 8062, ...)
From this we can conclude that intersecting a smaller and a larger dataset does show a performance difference, and for Scala sets it is better to put the smaller set on the left-hand side for faster execution.

return a list of numbers in scala for which a predicate holds

I want to write a function calcMod(a, b, c) in Scala, where a serves as a predicate and b and c give a range of numbers (e.g. 3 (incl.) to 9 (excl.)) to be evaluated; it should return a list of the numbers in this range for which the predicate holds.
For example, the call calcMod(k => k % 2 == 0, 3, 9) should return List(4, 6, 8).
The fact that I use mod 2 == 0 makes it clear that only even numbers will be returned. I want to solve this with linear recursion.
def calcMod(a: Int => Boolean, b: Int, c: Int): List[Int] = ?
The function below goes from b until c, then applies filter with the predicate passed as an argument.
def calcMod(a: Int => Boolean, b: Int, c: Int): List[Int] = (b until c filter a).toList
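Since the question asks for linear recursion specifically, here is one possible recursive sketch (my own, not from the original answer) that is equivalent to the filter version: one recursive call per number in the range.

```scala
object CalcMod extends App {
  // Linear recursion: one call per number in [b, c).
  // Keep b when the predicate holds, skip it otherwise.
  def calcMod(a: Int => Boolean, b: Int, c: Int): List[Int] =
    if (b >= c) Nil
    else if (a(b)) b :: calcMod(a, b + 1, c)
    else calcMod(a, b + 1, c)

  println(calcMod(k => k % 2 == 0, 3, 9)) // List(4, 6, 8)
}
```

Note this version is not tail-recursive (the `b :: ...` cons happens after the recursive call returns), so for very large ranges the filter version or an accumulator-based variant would be safer.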

Fibonacci in linear time by using an extra pointer

I have a function to find the nth number in a Fibonacci sequence, in which I am recursively calling the function. The sum is stored in a class variable, and I have an extra pointer I increment every time the function gets called. This extra pointer is the gatekeeper that dictates the base case for exiting the recursion. The performance I get with this algorithm is O(n) linear time with O(1) space. I get the expected answer, but I am unsure whether this is an acceptable solution from a coding-interview standpoint.
var x = 0
var sum = 0

func myFibonacci(of n: Int, a: Int, b: Int) -> Int {
    x += 1
    if (x == n) {
        return sum
    } else {
        sum = a + b
        return myFibonacci(of: n, a: b, b: sum)
    }
}

let finalAns = myFibonacci(of: 9, a: 0, b: 1)
print("The nth number in Fibonacci sequence is \(finalAns)")
Output: 34
Time complexity: O(n) linear time
Space complexity O(1)
Is this an acceptable solution for a coding interview?
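For comparison (my own sketch, not part of the question), the same O(n) idea can be expressed without any global mutable state by threading the running pair through the recursion. Shown here in Scala with standard zero-based Fibonacci indexing, where fib(9) = 34, matching the output above:

```scala
object Fib extends App {
  // Tail-recursive Fibonacci: no globals, O(n) time, and with
  // @tailrec the compiler guarantees constant stack space.
  @annotation.tailrec
  def fib(n: Int, a: Int = 0, b: Int = 1): Int =
    if (n == 0) a
    else fib(n - 1, b, a + b)

  println(fib(9)) // 34
}
```

Avoiding the `x` and `sum` globals also makes the function re-entrant: the original version returns wrong answers if called twice, since the counters are never reset.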

Nim operator overloading

Just started programming in the Nim language (which I really like so far). As a learning exercise I am writing a small matrix library. I have a bunch more code, but I'll just show the part that's relevant to this question.
type
  Matrix*[T; nrows, ncols: static[int]] = array[0 .. (nrows * ncols - 1), T]

# Get the index in the flattened array corresponding
# to row r and column c in the matrix
proc index(mat: Matrix, r, c: int): int =
  result = r * mat.ncols + c

# Return the element at r, c
proc `[]`(mat: Matrix, r, c: int): Matrix.T =
  result = mat[mat.index(r, c)]

# Set the element at r, c
proc `[]=`(mat: var Matrix, r, c: int, val: Matrix.T) =
  mat[mat.index(r, c)] = val

# Add a value to every element in the matrix
proc `+=`(mat: var Matrix, val: Matrix.T) =
  for i in 0 .. mat.high:
    mat[i] += val

# Add a value to element at r, c
proc `[]+=`(mat: var Matrix, r, c: int, val: Matrix.T) =
  mat[mat.index(r, c)] += val

# A test case
var mat: Matrix[float, 3, 4] # matrix with 3 rows and 4 columns
mat[1, 3] = 7.0
mat += 1.0
# add 8.0 to entry 1, 3 in matrix
`[]+=`(mat, 1, 3, 8.0) # works fine
All this works fine, but I'd like to be able to replace the last line with something like
mat[1, 3] += 4.0
This won't work (wasn't expecting it to either). If I try it, I get
Error: for a 'var' type a variable needs to be passed
How would I create an addition assignment operator that has this behavior? I'm guessing I need something other than a proc to accomplish this.
There are two ways you can do this:
Overload [] for var Matrix and return a var T (This requires the current devel branch of Nim):
proc `[]`(mat: Matrix, r, c: int): Matrix.T =
  result = mat[mat.index(r, c)]

proc `[]`(mat: var Matrix, r, c: int): var Matrix.T =
  result = mat[mat.index(r, c)]
Make [] a template instead:
template `[]`(mat: Matrix, r, c: int): expr =
  mat[mat.index(r, c)]
This causes a problem when mat is not a simple variable but a more complex expression, because the template expands mat twice:
proc x: Matrix[float, 2, 2] =
  echo "x()"
var y = x()[1, 0]
This prints x() twice.

Return type of Scala for/yield

I'm reading through Scala for the Impatient and I've come across something that's got me scratching my head.
The following returns a String:
scala> for ( c<-"Hello"; i <- 0 to 1) yield (c+i).toChar
res68: String = HIeflmlmop
But this returns a Vector:
scala> for (i <- 0 to 1; c <- "Hello") yield (c + i).toChar
res72: scala.collection.immutable.IndexedSeq[Char] = Vector(H, e, l, l, o, I, f, m, m, p)
The text preceding these two examples reads...
"When the body of the for loop starts with yield, then the loop
constructs a collection of values, one for each iteration... This type of loop is called a for comprehension. The generated collection is compatible with the first generator."
If the generated collection is compatible with the first generator, then why isn't the second example returning a type of Range, as in the following:
scala> val range = 0 to 1
range: scala.collection.immutable.Range.Inclusive = Range(0, 1)
Or am I misinterpreting entirely what the text means by, "...the generated collection is compatible with the first generator."
for-comprehensions are desugared into a series of map, flatMap and withFilter operations.
When you use map on a Range, you get a Vector output:
scala> 0 to 2 map (x => x * x)
res12: scala.collection.immutable.IndexedSeq[Int] = Vector(0, 1, 4)
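To make the desugaring concrete (a sketch showing one level of the translation), the second comprehension from the question corresponds roughly to a flatMap over the first generator and a map over the second:

```scala
object Desugared extends App {
  // for (i <- 0 to 1; c <- "Hello") yield (c + i).toChar
  // desugars (roughly) into:
  val result = (0 to 1).flatMap(i => "Hello".map(c => (c + i).toChar))
  println(result) // Vector(H, e, l, l, o, I, f, m, m, p)
}
```

Because the outermost operation is flatMap on a Range, the result type is determined by the Range side of the chain, which is why a Vector (an IndexedSeq) comes back rather than a String.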
This is because a Range is a very simple sort of collection that is essentially just three numbers: a start value, an end value and a step. If you look at the result of the mapping above, you can see that the resulting values cannot be represented by something of the Range type.
In the for (i <- 0 to 1; c <- "Hello") yield (c + i).toChar comprehension,
the first generator is of type scala.collection.immutable.Range.Inclusive,
and the yielded result is of type scala.collection.immutable.IndexedSeq[Char].
If you check the Range class:
http://www.scala-lang.org/api/current/index.html#scala.collection.immutable.Range
it shows that Range extends/mixes in IndexedSeq. The supertype IndexedSeq is compatible with the subtype Range.
If the result cannot be represented by a Range (as the previous answer explained), the compiler falls back to the supertype to represent the result.