Does `Random.nextString()` take O(n) time in Scala? - scala

When calling nextString() from the built-in scala.util.Random library, what is its running time? Is it O(n)?

Yes, it's O(n). It can't be any lower, because it creates a new string of length n, and that alone costs O(n). It shouldn't be any higher, because generating a random number is O(1) and one random number is enough to pick each character. And in practice it is indeed O(n).
The constant factor is pretty high, though, due to how it's implemented. If it is important to you to make random strings really fast, you should get your own high-performance random number generator and pack chars into a char array.
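A minimal sketch of the "pack chars into a char array" idea, under some assumptions: java.util.concurrent.ThreadLocalRandom stands in for "your own high-performance random number generator", the character range mirrors the safeChar() range used by nextString, and the object and method names are hypothetical. It has not been benchmarked here.

```scala
import java.util.concurrent.ThreadLocalRandom

object FastRandomString {
  // Hypothetical helper: fills a preallocated Array[Char] in a while loop
  // instead of building a List and calling mkString.
  def randomString(length: Int): String = {
    val rng = ThreadLocalRandom.current() // assumption: stand-in for a faster RNG
    val chars = new Array[Char](length)
    var i = 0
    while (i < length) {
      // nextInt(0xD800 - 1) yields 0..0xD7FE, so adding 1 gives 1..0xD7FF,
      // which stays below the surrogate range that starts at 0xD800.
      chars(i) = (rng.nextInt(0xD800 - 1) + 1).toChar
      i += 1
    }
    new String(chars)
  }
}
```

This is still O(n); only the constant factor should change, since one array and one string are allocated instead of a list node per character.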

I couldn't find anything in the Scala docs, but from the source code:
def nextString(length: Int) = {
  def safeChar() = {
    val surrogateStart: Int = 0xD800
    val res = nextInt(surrogateStart - 1) + 1
    res.toChar
  }
  List.fill(length)(safeChar()).mkString
}
I would say it is O(n) in the length of the requested string, assuming nextInt() is O(1).

Related

Swift - should I create a local variable for a string's "count"?

Does it matter if I use a string's 'count' multiple times within a function? That is, does Swift cache the 'count' after it first computes it? Below are two examples; does it matter which one I use? I assume the second is definitely okay, but what about the first? I see example code like the first one all the time.
func Foo1 (str: String) {
    ...
    // calling str.count twice
    if x < str.count && y < str.count {
        ...
    }
}
func Foo2 (str: String) {
    ...
    // calling str.count once
    let c = str.count
    if x < c && y < c {
        ...
    }
}
.count is defined by the Collection protocol with the following complexity:
Complexity: O(1) if the collection conforms to RandomAccessCollection; otherwise, O(n), where n is the length of the collection.
String is not a RandomAccessCollection. It's a BidirectionalCollection, so it does not promise O(1). It only promises O(n).
It definitely does not promise any caching (and you shouldn't expect any).
It happens to be true that in many (probably most) cases, String's count is cached. It's part of _StringObject, which is part of the low-level storage abstraction, and it's often inlined by the optimizer. But none of this is promised.
That said, unless you expect the String to be extremely large (10kB at a minimum, possibly more), it is difficult to imagine this being a major bottleneck by being called twice outside a tight loop. As with most things, you should write clearly, and then profile. I would likely create an extra variable just for clarity, but you shouldn't second-guess here too much. Write clearly. Then profile.
Do you have particularly large strings that you're working with?

chisel3 arithmetic operations on Doubles

I'm having trouble performing arithmetic on doubles in Chisel. The examples I have seen use only the following types: Int, UInt, SInt.
I saw here that arithmetic operations were described only for SInt and UInt. What about Double?
I tried to declare my output out as a Double, because the result of my code is a Double, but I didn't know how.
Is there a way to declare an input and an output of type Double in a Bundle?
Here is my code:
class hashfunc(val k:Int, val n: Int ) extends Module {
  val a = k + k
  val io = IO(new Bundle {
    val b=Input(UInt(k.W))
    val w=Input(UInt(k.W))
    var out = Output(UInt(a.W))
  })
  val tabHash1 = new Array[Array[Double]](n)
  val x = new ArrayBuffer[(Double, Data)]
  val tabHash = new Array[Double](tabHash1.size)
  for (ind <- tabHash1.indices){
    var sum=0.0
    for (ind2 <- 0 until x.size){
      sum += ( x(ind2) * tabHash1(ind)(ind2) )
    }
    tabHash(ind) = ((sum + io.b) / io.w)
  }
  io.out := tabHash.reduce(_ + _)
}
When I compile the code, I get the following error:
[image: code error]
Thank you for your kind attention, looking forward to your responses.
Chisel does have a native FixedPoint type which may be of use. It is in the experimental package:
import chisel3.experimental.FixedPoint
There is also a project, DspTools, that has simulation support for Doubles. It has some nice features; for example, it allows modules to be parameterized on the numeric type (Complex, Double, FixedPoint, SInt), so that you can run simulations on Double to validate the desired mathematical behavior and then switch to a synthesizable number format that meets your precision criteria.
DspTools is an ongoing research project, and the team would appreciate feedback from outside users.
Operations on floating point numbers (Double in this case) are not supported directly by any HDL. The reason is that, while addition, subtraction and multiplication of fixed point numbers are well defined, floating point hardware is much more complex and involves a lot of design space trade-offs.
That is to say, a high performance floating point unit is a significant piece of hardware in its own right and would be time-shared in any realistic design.
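To illustrate why fixed point arithmetic is "well defined" in the sense used above, here is a rough sketch in plain Scala (not Chisel, and not the FixedPoint API). It assumes a value is stored as an integer scaled by 2^binaryPoint; the object and helper names are hypothetical.

```scala
object FixedPointSketch {
  // Hypothetical helper: encode a Double as an integer scaled by 2^binaryPoint.
  def toFixed(x: Double, binaryPoint: Int): BigInt =
    BigInt(math.round(x * (1L << binaryPoint)))

  // Hypothetical helper: decode the scaled integer back to a Double.
  def toDouble(raw: BigInt, binaryPoint: Int): Double =
    raw.toDouble / (1L << binaryPoint)

  // Addition is plain integer addition when both operands share a binary point.
  def add(a: BigInt, b: BigInt): BigInt = a + b

  // Multiplication doubles the number of fractional bits, so shift back down.
  def mul(a: BigInt, b: BigInt, binaryPoint: Int): BigInt =
    (a * b) >> binaryPoint
}
```

With 8 fractional bits, 1.5 is stored as 384 and 2.25 as 576; adding or multiplying them only needs integer adders, multipliers and shifts, which is why this maps onto hardware far more directly than floating point.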

List of random values with Rng library

I am looking through Rng sources to see how they generate a list of random values.
They define a function fill:
def fill(n: Int): Rng[List[A]] = sequence(List.fill(n)(this))
where sequence is just an invocation of Traverse.sequence from scalaz:
def sequence[T[_], A](x: T[Rng[A]])(implicit T: Traverse[T]): Rng[T[A]] =
  T.sequence(x)
In other words, they create a temporary list List[Rng[A]] and then apply sequence: List[Rng[A]] => Rng[List[A]]. I see how it works, but the temporary list looks like a waste of memory to me. Is it absolutely necessary? Can it be improved?
This is a slightly faster implementation. I didn't profile to see if there was a noticeable impact on the heap. I did a rough timing test and it took roughly 70% of the time Rng.fill took to fill a 1M item list with random Ints. I didn't attempt to find out how these scaled with different size lists. See https://gist.github.com/drstevens/77db6bab6b1e995dac13
def fill[A](a: Rng[A], count: Int): Rng[List[A]] =
  Stream.from(0).take(count).traverseU(_ => a).map(_.toList)
The interesting thing about this is that the toList isn't evaluated until unsafePerformIO.

For loop in scala without sequence?

So, while working my way through "Scala for the Impatient" I found myself wondering: Can you use a Scala for loop without a sequence?
For example, there is an exercise in the book that asks you to build a counter object that cannot be incremented past Integer.MAX_VALUE. In order to test my solution, I wrote the following code:
var c = new Counter
for( i <- 0 to Integer.MAX_VALUE ) c.increment()
This throws an error: sequences cannot contain more than Int.MaxValue elements.
It seems to me that means that Scala is first allocating and populating a sequence object, with the values 0 through Integer.MaxValue, and then doing a foreach loop on that sequence object.
I realize that I could do this instead:
var c = new Counter
while(c.value < Integer.MAX_VALUE ) c.increment()
But is there any way to do a traditional C-style for loop with the for statement?
In fact, 0 to N does not actually populate anything with integers from 0 to N. It instead creates an instance of scala.collection.immutable.Range, which applies its methods to all the integers generated on the fly.
The error you ran into is only because you have to be able to fit the number of elements (whether they actually exist or not) into the positive part of an Int in order to maintain the contract for the length method. 1 to Int.MaxValue works fine, as does 0 until Int.MaxValue. And the latter is what your while loop is doing anyway (to includes the right endpoint, until omits it).
Anyway, since the Scala for is a very different (much more generic) creature than the C for, the short answer is no, you can't do exactly the same thing. But you can probably do what you want with for (though maybe not as fast as you want, since there is some performance penalty).
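A small sketch of the points above, using only the standard Range API. The object and helper names are hypothetical. It deliberately avoids 0 to Int.MaxValue, since how that case fails may depend on the Scala version.

```scala
object RangeDemo {
  // Constructing these is cheap: a Range stores its bounds and step,
  // not the individual integers.
  val exclusive: Range = 0 until Int.MaxValue
  val inclusive: Range = 1 to Int.MaxValue

  // Hypothetical helper: counts how many times a for loop body runs.
  def countIterations(r: Range): Int = {
    var n = 0
    for (_ <- r) n += 1
    n
  }
}
```

Both ranges report a length of Int.MaxValue, which still fits in an Int, so the length contract holds for them.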
Wow, some nice technical answers for a simple question (which is good!) But in case anyone is just looking for a simple answer:
//start from 0, stop at 9 inclusive
for (i <- 0 until 10){
  println("Hi " + i)
}
//or start from 0, stop at 9 inclusive
for (i <- 0 to 9){
  println("Hi " + i)
}
As Rex pointed out, "to" includes the right endpoint, "until" omits it.
Yes and no, it depends what you are asking for. If you're asking whether you can iterate over a sequence of integers without having to build that sequence first, then yes you can, for instance using streams:
def fromTo(from : Int, to : Int) : Stream[Int] =
  if(from > to) {
    Stream.empty
  } else {
    // println("one more.") // uncomment to see when it is called
    Stream.cons(from, fromTo(from + 1, to))
  }
Then:
for(i <- fromTo(0, 5)) println(i)
Writing your own iterator by defining hasNext and next is another option.
If you're asking whether you can use the 'for' syntax to write a "native" loop, i.e. a loop that works by incrementing some native integer rather than iterating over values produced by an instance of an object, then the answer is, as far as I know, no. As you may know, 'for' comprehensions are syntactic sugar for a combination of calls to flatMap, filter, map and/or foreach (all defined in the FilterMonadic trait), depending on the nesting of generators and their types. You can try to compile some loop and print its compiler intermediate representation with
scalac -Xprint:refchecks
to see how they are expanded.
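The iterator option mentioned above could look roughly like this. It is a sketch with hypothetical names; the cursor is kept as a Long on the assumption that the range may end at Int.MaxValue, where an Int cursor would overflow.

```scala
object InclusiveCounter {
  // Hypothetical helper: yields from..to inclusive, one value at a time,
  // without ever needing to report a length.
  def fromTo(from: Int, to: Int): Iterator[Int] = new Iterator[Int] {
    private var cursor: Long = from // Long so that cursor can step past Int.MaxValue

    def hasNext: Boolean = cursor <= to

    def next(): Int = {
      if (!hasNext) throw new NoSuchElementException("iterator exhausted")
      val result = cursor.toInt
      cursor += 1
      result
    }
  }
}
```

It can be used with the for syntax in the same way as the stream version, for example for (i <- InclusiveCounter.fromTo(0, 5)) println(i), and that loop still desugars to a foreach call on the iterator.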
There's a bunch of these out there, but I can't be bothered googling them at the moment. The following is pretty canonical:
@scala.annotation.tailrec
def loop(from: Int, until: Int)(f: Int => Unit): Unit = {
  if (from < until) {
    f(from)
    loop(from + 1, until)(f)
  }
}
loop(0, 10) { i =>
  println("Hi " + i)
}

Adding immutable Vectors

I am trying to work more with Scala's immutable collections, since they are easy to parallelize, but I am struggling with some newbie problems. I am looking for a way to efficiently create a new Vector from an operation. To be precise, I want something like
val v : Vector[Double] = RandomVector(10000)
val w : Vector[Double] = RandomVector(10000)
val r = v + w
I tested the following:
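import scala.collection.immutable.VectorBuilder

// 1) zip and map
val r : Vector[Double] = (v.zip(w)).map{ t:(Double,Double) => t._1 + t._2 }
// 2) while loop with a VectorBuilder (alternative to 1)
val vb = new VectorBuilder[Double]()
var i=0
while(i<v.length){
  vb += v(i) + w(i)
  i = i + 1
}
val r = vb.result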
Both take really long compared to the work with Array:
[Vector Zip/Map ] Elapsed time 0.409 msecs
[Vector While Loop] Elapsed time 0.374 msecs
[Array While Loop ] Elapsed time 0.056 msecs
// with warm-up (10000) and avg. over 10000 runs
Is there a better way to do it? I think the work with zip/map/reduce has the advantage that it can run in parallel as soon as the collections have support for this.
Thanks
Vector is not specialized for Double, so you're going to pay a sizable performance penalty for using it. If you are doing a simple operation, you're probably better off using an array on a single core than a Vector or other generic collection on the entire machine (unless you have 12+ cores). If you still need parallelization, there are other mechanisms you can use, such as using scala.actors.Futures.future to create instances that each do the work on part of the range:
val a = Array(1,2,3,4,5,6,7,8)
(0 to 4).map(_ * (a.length/4)).sliding(2).map(i => scala.actors.Futures.future {
  var s = 0
  var j = i(0)
  while (j < i(1)) {
    s += a(j)
    j += 1
  }
  s
}).map(_()).sum  // _() applies the future--blocks until it's done
Of course, you'd need to use this on a much longer array (and on a machine with four cores) for the parallelization to improve things.
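The scala.actors package used above was removed from later Scala releases. As a sketch only, the same chunked sum can be written with scala.concurrent.Future, assuming the global execution context is acceptable; the object and method names are hypothetical and no timing was measured.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object ChunkedSum {
  // Hypothetical helper: splits the array into `chunks` index ranges and
  // sums each range in its own Future with a plain while loop.
  def parallelSum(a: Array[Int], chunks: Int): Int = {
    require(chunks > 0, "chunks must be positive")
    // Chunk boundaries: 0, len/chunks, 2*len/chunks, ..., len
    val bounds = (0 to chunks).map(k => (k.toLong * a.length / chunks).toInt)
    val parts = bounds.sliding(2).map { pair =>
      val lo = pair(0)
      val hi = pair(1)
      Future {
        var s = 0
        var j = lo
        while (j < hi) {
          s += a(j)
          j += 1
        }
        s
      }
    }.toList // force creation of all futures before waiting on any of them
    Await.result(Future.sequence(parts), 10.seconds).sum
  }
}
```

For example, ChunkedSum.parallelSum(Array(1,2,3,4,5,6,7,8), 4) sums four chunks of two elements each and returns 36.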
You should use lazily built collections when you chain more than one higher-order method:
v1.view zip v2 map { case (a,b) => a+b }
If you don't use a view or an iterator, each method will create a new intermediate immutable collection even when it is not needed.
Immutable code probably won't be as fast as mutable code, but a lazy collection will improve the execution time of your code a lot.
Arrays are not type-erased; Vectors are. Basically, the JVM gives Array an advantage over other collections when handling primitives, and that advantage cannot be overcome. Scala's specialization might decrease that advantage but, given its cost in code size, it can't be used everywhere.
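Putting the two suggestions from the answers side by side, as a sketch with hypothetical names: an Array[Double] loop that avoids boxing, and a Vector version that zips iterators so that no intermediate collection of pairs is built. No timings are claimed for either.

```scala
object VectorAdd {
  // Hypothetical helper: element-wise sum over primitive arrays, no boxing.
  def addArrays(v: Array[Double], w: Array[Double]): Array[Double] = {
    require(v.length == w.length, "inputs must have the same length")
    val r = new Array[Double](v.length)
    var i = 0
    while (i < v.length) {
      r(i) = v(i) + w(i)
      i += 1
    }
    r
  }

  // Hypothetical helper: stays with immutable Vectors but zips lazily
  // through iterators, so only the final Vector is materialized.
  def addVectors(v: Vector[Double], w: Vector[Double]): Vector[Double] =
    v.iterator.zip(w.iterator).map { case (a, b) => a + b }.toVector
}
```

The Vector version still boxes each Double, so it would be expected to remain slower than the array loop; it only removes the intermediate zipped collection.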