How to understand this line of chisel code - scala

I'm in the process of learning Chisel and the Scala language, and I'm trying to analyse some lines of rocket-chip code. Could anyone explain this line to me? https://github.com/chipsalliance/rocket-chip/blob/54237b5602a273378e3b35307bb47eb1e58cb9fb/src/main/scala/rocket/RocketCore.scala#L957
I understand what the log2Up function does, but I don't understand why log2Up(n)-1 and 0 are passed like "arguments" to addr, which is a val of type UInt.

I could not find where UInt is defined, but if I had to guess, UInt is a class that has an apply method. This is a special method that allows us to use the parenthesis operator on an instance of the class.
For example, let's say we have a class called Multiply that defines an apply method:
class Multiply {
  def apply(a: Int, b: Int): Int = a * b
}
This allows you to call the () operator on any instance of that class. For example:
val mul = new Multiply()
println(mul(5, 6)) // prints 30

What I concluded is that we use addr(log2Up(n)-1, 0) to get the address bits from bit zero up to bit log2Up(n)-1. Let's take an example.
If we create an object of class RegFile like this:
val reg = RegFile(31, 10)
then first the memory rf is created. That memory holds 31 entries of type UInt, each 10 bits wide, indexed from 0 up to 30.
When we compute log2Up(n)-1 we get 4, so we have something like addr(4, 0). This gives us the low five bits of addr. As Jack Koenig said in one of the comments above, "Rocket's register file uses a little trick where it reverses the order of the registers physically compared to the RISC-V"; that's why we use ~addr. Finally, rf(~addr) gives us back what is stored at that memory location.
This is implemented this way to keep every access inside the bounds of the memory.
Consider what would happen if we tried to read a memory location that doesn't exist. If the method access were called like this:
access(42)
we would be trying to access a location past the end, but we only have 31 memory locations (30 is the top index). 42 in binary is 101010. Using what I said above,
~addr(log2Up(n)-1, 0)
would return 10101, or 21 in decimal. Because the order of the registers is reversed, this is the 10th memory location (we tried to access location 41 but only have 31, and 41 minus 31 is 10).
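To make the trick concrete, here is a minimal, hypothetical Chisel sketch of just the read path (the class and port names are mine; the real RocketCore code also handles the zero register and write ports):
import chisel3._
import chisel3.util.log2Up

// Stripped-down illustration of the register-file access under discussion.
class RegFileSketch(n: Int, w: Int) extends Module {
  val io = IO(new Bundle {
    val raddr = Input(UInt(log2Up(n).W))
    val rdata = Output(UInt(w.W))
  })
  val rf = Mem(n, UInt(w.W))
  // Keep only the low log2Up(n) bits of the address, then invert them:
  // the inversion reverses the physical ordering of the registers.
  private def access(addr: UInt): UInt = rf(~addr(log2Up(n) - 1, 0))
  io.rdata := access(io.raddr)
}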

Related

Hashcode doesn't change between reruns

object Main extends App {
  val a = new AnyRef()
  println(a.hashCode)
}
I have this code in IntelliJ IDEA. I noticed that the hash code does not change between reruns. Even more, it doesn't change if I restart the IDE or make some light modifications to the code. I can rename the variable a or add a few more variables and I still get the same hash code.
Is it cached somewhere? Or is it just the OS allocating the same address to the variable? Are there any consequences of this?
I'd expect it to be new each time, as the OS should allocate a new address on each run.
The implementation for Object.hashCode() can vary between JVMs as long as it obeys the contract, which doesn't require the numbers to be different between runs. For HotSpot there is even an option (-XX:hashCode) to change the implementation.
HotSpot's default is to use a random number generator, so if you are using that (with no -XX:hashCode option) then it seems it uses the same seed on each run, resulting in the same sequence of hash codes. There's nothing wrong with that.
lmm's answer is not correct, unless perhaps you are using HotSpot with -XX:hashCode=4 or another JVM that uses that technique by default. But I'm not at all certain about that (you can try it yourself by running HotSpot with -XX:hashCode=4 and seeing whether you get another value that also stays the same between runs).
Check out the code for the different options:
http://hg.openjdk.java.net/jdk8/jdk8/hotspot/file/tip/src/share/vm/runtime/synchronizer.cpp#l555
There is a comment in there about making the "else" branch the default; that branch is the Xorshift pattern, which is indeed a pseudo-random number generator that always produces the same sequence.
The answer from "apangin" on this question says that this has indeed become the default since JDK 8, which explains the change from JDK 7 that you described in your comment.
I can confirm that this is correct, look at the JDK8 source:
http://hg.openjdk.java.net/jdk8/jdk8/hotspot/file/87ee5ee27509/src/share/vm/runtime/globals.hpp#l1127
--> Default value is now 5, which corresponds to the "else" branch (Xorshift).
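If you want to observe this yourself, a tiny sketch like the following (my own example, not from the answer) prints the identity hash codes of a few fresh objects; if the above is right, the same sequence appears on every run:
object HashSeqDemo extends App {
  // With HotSpot's default hash-code algorithm (a pseudo-random generator
  // with the same initial state each run), these values repeat across runs.
  for (_ <- 1 to 3) println(new AnyRef().hashCode)
}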
Some experiment:
scala> class A extends AnyRef
defined class A
scala> val a1= new A
a1: A = A@5f6b1f19
scala> val a2 = new A
a2: A = A@d60aa4
scala> a1.hashCode
res19: Int = 1600855833
scala> a2.hashCode
res20: Int = 14027428
scala> val a3 = new AnyRef
a3: Object = java.lang.Object@16c3388e
scala> a3.hashCode
res21: Int = 381892750
So it looks as if the AnyRef hash code is equal to the address of the object. If the hashes are equal, that means the object's address is the same on every rerun. And that is true for me across two REPL sessions.
The API says this about the AnyRef hashCode method:
The hashCode method for reference types. See hashCode in scala.Any.
And this about the Any method:
Calculate a hash code value for the object.
The default hashing algorithm is platform dependent.
I guess the platform determines the location of the object and therefore the value of hashCode.
Any new process gets its own virtual address space from the OS. So while the process might live at a different physical address each time the program runs, it will be mapped to the same virtual address each time. (ASLR exists, but I understand the JVM doesn't participate in it.) You can see this with e.g. a small C program containing a string constant (you might have to deliberately disable ASLR for that program): if you take a pointer to the string constant and print that pointer as an integer, it will be the same value every time.
hashCode() is not a random number. It is a digest computed from some part of an object. Objects with the same values will, more than likely, have the same hash code. This is true in your case, since the "value" of an AnyRef with no fields is essentially empty.
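To see the difference between value-based and identity-based hashing in Scala, compare a case class (which generates a hashCode from its fields) with a plain AnyRef; this is my own illustration:
case class Point(x: Int, y: Int)

object HashKindDemo extends App {
  // Case classes hash on field values, so equal values hash alike.
  println(Point(1, 2).hashCode == Point(1, 2).hashCode)   // true
  // Plain AnyRef instances get identity hash codes from the JVM.
  println(new AnyRef().hashCode == new AnyRef().hashCode) // almost surely false
}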

Scala - What is the difference between size and length of a Seq?

What is the difference between size and length of a Seq? When to use one and when the other?
scala> var a :Seq[String] = Seq("one", "two")
a: Seq[String] = List(one, two)
scala> a.size
res6: Int = 2
scala> a.length
res7: Int = 2
Are they the same?
Thanks
Nothing. In the Seq doc, the size method clearly states: "The size of this sequence, equivalent to length."
size is defined in GenTraversableOnce, whereas length is defined in GenSeqLike, so length only exists for Seqs, whereas size exists for all Traversables. For Seqs, however, as was already pointed out, size simply delegates to length (which probably means that, after inlining, you will get identical bytecode).
In a Seq they are the same, as others have mentioned. However, for information, this is what IntelliJ warns about on a scala.Array:
Replace .size with .length on arrays and strings
Inspection info: This inspection reports array.size and string.size calls. While such calls are legitimate, they require an additional implicit conversion to SeqLike to be made. Calling length on arrays and strings instead may provide significant advantages.
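In other words (a hypothetical two-liner of mine), arr.length reads the array's length directly, while arr.size first wraps the array via an implicit conversion:
val arr = Array(1, 2, 3)
arr.length // direct access to the underlying Java array's length
arr.size   // goes through an implicit conversion (e.g. to ArrayOps) first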
Nothing, one delegates to the other. See SeqLike trait.
/** The size of this $coll, equivalent to `length`.
 *
 *  $willNotTerminateInf
 */
override def size = length
I did an experiment, using Scala version 2.12.8 and a million-item list. On first use, length is 7 or 8 times faster than size. But on a second try on the same list, size is about the same speed as length.
However, after some time, presumably once the cache is gone, size is slower by 7 or 8 times again.
This suggests that length is preferred for sequences; it's not just another name for size.
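For reference, a rough sketch of such a measurement might look like the following (my own code; for trustworthy numbers you would want a proper harness such as JMH):
object SizeVsLength extends App {
  val xs = List.fill(1000000)("x")

  // Crude wall-clock timer; JIT warm-up and GC will skew single runs.
  def time[A](label: String)(body: => A): Unit = {
    val t0 = System.nanoTime()
    body
    println(s"$label took ${System.nanoTime() - t0} ns")
  }

  time("length")(xs.length)
  time("size")(xs.size)
}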

Scala Buffer: Size or Length?

I am using a mutable Buffer and need to find out how many elements it has.
Both size and length methods are defined, inherited from separate traits.
Is there any actual performance difference, or can they be considered exact synonyms?
They are synonyms, mostly as a result of Java's decision to have size for collections and length for Array and String. One is always defined in terms of the other, and you can easily see which is which by looking at the source code, a link to which is provided in the scaladoc. Just find the defining trait, open the source code, and search for def size or def length.
In this case, they can be considered synonyms. You may want to watch out with some other cases such as Array - whilst length and size will always return the same result, in versions prior to Scala 2.10 there may be a boxing overhead for calling size (which is provided by a Scala wrapper around the Array), whereas length is provided by the underlying Java Array.
In Scala 2.10, this overhead has been removed by use of a value class providing the size method, so you should feel free to use whichever method you like.
As of Scala-2.11, these methods may have different performance. For example, consider this code:
val bigArray = Array.fill(1000000)(0)
val beginTime = System.nanoTime()
var i = 0
while (i < 2000000000) {
  i += 1
  bigArray.length // swap in bigArray.size for the second measurement
}
val endTime = System.nanoTime()
println(endTime - beginTime)
sys.exit(-1)
Running this on my amd64 computer takes about 2423834 nanoseconds (it varies from run to run).
Now, if I change the length call to size, it takes about 70764719 nanoseconds.
That is more than 20x slower.
Why does this happen? I haven't dug into it, so I don't know. But there are scenarios where length and size perform drastically differently.
They are synonyms, as the scaladoc for Buffer.size states:
The size of this buffer, equivalent to length.
The scaladoc for Buffer.length is explicit too:
The length of the buffer. Note: xs.length and xs.size yield the same result.
Simple advice: refer to the scaladoc before asking a question.
UPDATE: Just saw your edit adding a mention of performance. As Daniel C. Sobral said, one is normally implemented in terms of the other, so they have the same performance.

Optimal HashSet Initialization (Scala | Java)

I'm writing an A.I. to solve a "Maze of Life" puzzle. Attempting to store states in a HashSet slows everything down; it's faster to run it without a set of explored states. I'm fairly confident my node (state storage) implements equals and hashCode well, as tests show a HashSet doesn't add duplicate states. I may need to rework the hashCode function, but I believe what's slowing it down is the HashSet rehashing and resizing.
I've tried setting the initial capacity to a very large number, but it's still extremely slow:
val initCapacity = java.lang.Math.pow(initialGrid.width*initialGrid.height,3).intValue()
val frontier = new QuickQueue[Node](initCapacity)
Here is the quick queue code:
class QuickQueue[T](capacity: Int) {
  val hashSet = new HashSet[T](capacity)
  val queue = new Queue[T]
  // methods below
For more info, here is the hash function. I store the grid values as bytes in two arrays and access them using tuples:
override def hashCode(): Int = {
  var sum = Math.pow(grid.goalCoords._1, grid.goalCoords._2).toInt
  for (y <- 0 until grid.height) {
    for (x <- 0 until grid.width) {
      sum += Math.pow(grid((x, y)).doubleValue(), x.toDouble).toInt
    }
    sum += Math.pow(sum, y).toInt
  }
  return sum
}
Any suggestions on how to set up a HashSet that won't slow things down? Or maybe another suggestion for how to remember explored states?
P.S. I'm using java.util.HashSet, and even with the initial capacity set, it takes 80 seconds vs. < 7 seconds without the set.
Okay, for a start, please replace
override def hashCode(): Int =
with
override lazy val hashCode: Int =
so you don't calculate (grid.height*grid.width) floating point powers every time you need to access the hash code. That should speed things up by an enormous amount.
Then, unless you somehow rely upon close cells having close hash codes, don't re-invent the wheel. Use scala.util.hashing.MurmurHash3.seqHash or somesuch to calculate your hash. This should speed your hash up by another factor of 20 or so. (Still keep the lazy val.)
Then you only have overhead from the required set operations. Right now, unless you have a lot of 0x0 grids, you are using up the overwhelming majority of your time waiting for math.pow to give you a result (and risking everything becoming Double.PositiveInfinity or 0.0, depending on how big the values are, which will create hash collisions which will slow things down still further).
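A minimal sketch of both suggestions combined might look like this (the shape of Node is assumed here, since the original class isn't shown):
import scala.util.hashing.MurmurHash3

class Node(val grid: Array[Byte]) {
  // Computed once per object instead of on every HashSet lookup.
  override lazy val hashCode: Int = MurmurHash3.bytesHash(grid)

  override def equals(other: Any): Boolean = other match {
    case that: Node => java.util.Arrays.equals(this.grid, that.grid)
    case _          => false
  }
}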
Note that the following assumes all your objects are immutable. This is a sane assumption when using hashing.
Also you should profile your code before applying optimization (use e.g. the free jvisualvm, that comes with the JDK).
Memoization for fast hashCode
Computing the hash code is usually a bottleneck. By computing the hash code only once for each object and storing the result you can reduce the cost of hash code computation to a minimum (once at object creation) at the expense of increased space consumption (probably moderate). To achieve this turn the def hashCode into a lazy val or val.
Interning for fast equals
Once you have the cost of hashCode eliminated, computing equals becomes a problem. equals is particularly expensive for collection fields and deep structures in general.
You can minimize the cost of equals by interning. This means you acquire new objects of the class through a factory method, which checks whether the requested object already exists and, if so, returns a reference to the existing instance. If you ensure that every object of this type is constructed this way, you know there is only one instance of each distinct object, and equals becomes equivalent to object identity, which is a cheap reference comparison (eq in Scala).
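A sketch of the interning pattern (names are illustrative): a private constructor plus a factory that returns the canonical instance, after which the default reference-equality equals is already correct:
import scala.collection.mutable
import scala.util.hashing.MurmurHash3

final class State private (val cells: Vector[Byte]) {
  override lazy val hashCode: Int = MurmurHash3.seqHash(cells)
  // No equals override needed: interning guarantees one instance per
  // distinct value, so AnyRef's reference equality (eq) is sufficient.
}

object State {
  private val pool = mutable.HashMap.empty[Vector[Byte], State]
  def apply(cells: Vector[Byte]): State =
    pool.getOrElseUpdate(cells, new State(cells))
}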

Parsing of binary data with scala

I need to parse some simple binary files. (The files contain n entries, each consisting of several signed/unsigned integers of different sizes, etc.)
At the moment I do the parsing "by hand". Does somebody know a library which helps with this type of parsing?
Edit: "By hand" means that I get the data byte by byte, sort it into the correct order, and convert it to an Int/Byte etc. Also, some of the data is unsigned.
I've used the sbinary library before and it's very nice. The documentation is a little sparse but I would suggest first looking at the old wiki page as that gives you a starting point. Then check out the test specifications, as that gives you some very nice examples.
The primary benefit of sbinary is that it gives you a way to describe the wire format of each object as a Format object. You can then encapsulate those formatted types in a higher level Format object and Scala does all the heavy lifting of looking up that type as long as you've included it in the current scope as an implicit object.
As I say below, I'd now recommend people use scodec instead of sbinary. As an example of how to use scodec, I'll implement how to read a binary representation in memory of the following C struct:
struct ST
{
    long long ll; // offset 0
    int i;        // offset 8
    short s;      // offset 12
    char ch1;     // offset 14
    char ch2;     // offset 15
} ST;
A matching Scala case class would be:
case class ST(ll: Long, i: Int, s: Short, ch1: String, ch2: String)
I'm making things a bit easier for myself by just saying we're storing Strings instead of Chars and I'll say that they are UTF-8 characters in the struct. I'm also not dealing with endian details or the actual size of the long and int types on this architecture and just assuming that they are 64 and 32 respectively.
Scodec parsers generally use combinators to build higher-level parsers from lower-level ones. So below, we'll define a parser which combines an 8-byte value, a 4-byte value, a 2-byte value, a 1-byte value, and one more 1-byte value. The result of this combination is a tuple codec:
val myCodec: Codec[Long ~ Int ~ Short ~ String ~ String] =
int64 ~ int32 ~ short16 ~ fixedSizeBits(8L, utf8) ~ fixedSizeBits(8L, utf8)
We can then transform this into the ST case class by calling the xmap function on it which takes two functions, one to turn the Tuple codec into the destination type and another function to take the destination type and turn it into the Tuple form:
val stCodec: Codec[ST] = myCodec.xmap[ST](
  { case ll ~ i ~ s ~ ch1 ~ ch2 => ST(ll, i, s, ch1, ch2) },
  st => st.ll ~ st.i ~ st.s ~ st.ch1 ~ st.ch2
)
Now, you can use the codec like so:
stCodec.encode(ST(1L, 2, 3.shortValue, "H", "I"))
res0: scodec.Attempt[scodec.bits.BitVector] = Successful(BitVector(128 bits, 0x00000000000000010000000200034849))
res0.flatMap(stCodec.decode)
res1: scodec.Attempt[scodec.DecodeResult[ST]] = Successful(DecodeResult(ST(1,2,3,H,I),BitVector(empty)))
I'd encourage you to look at the Scaladocs and not at the Guide as there's much more detail in the Scaladocs. The guide is a good start at the very basics but it doesn't get into the composition part much but the Scaladocs cover that pretty well.
Scala itself doesn't have a binary data input library, but the java.nio package does a decent job. It doesn't explicitly handle unsigned data (neither does Java, so you need to figure out how you want to manage it), but it does have convenience "get" methods that take byte order into account.
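For example, here is a sketch of how one might handle byte order and unsigned fields with java.nio (my own illustration, with made-up sample bytes):
import java.nio.{ByteBuffer, ByteOrder}

object NioUnsignedDemo extends App {
  // One unsigned byte, one unsigned short, one signed int, little-endian.
  val bytes = Array[Byte](0xFF.toByte, 0x34, 0x12, 0x2A, 0x00, 0x00, 0x00)
  val buf = ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN)

  val u8  = buf.get() & 0xFF        // widen to Int: 255
  val u16 = buf.getShort() & 0xFFFF // widen to Int: 0x1234 = 4660
  val i32 = buf.getInt()            // signed as-is: 42
  println(s"$u8 $u16 $i32")
}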
I don't know what you mean by "by hand", but using a simple DataInputStream (apidoc here) is quite concise and clear:
import java.io.DataInputStream

val dis = new DataInputStream(yourSource)
dis.readFloat()
dis.readDouble()
dis.readInt()
// and so on
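Note that DataInputStream is big-endian only, but it does cover the unsigned cases mentioned in the question:
dis.readUnsignedByte()  // 0..255 as an Int
dis.readUnsignedShort() // 0..65535 as an Int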
Taken from another SO question: http://preon.sourceforge.net/. It should be a framework for binary encoding/decoding; see if it has the capabilities you need.
If you are looking for a Java based solution, then I will shamelessly plug Preon. You just annotate the in memory Java data structure, and ask Preon for a Codec, and you're done.
Byteme is a parser combinator library for binary data. You can try using it for your tasks.