Chisel persist value in module until new write - scala

I have created a basic module that is meant to represent a unit of memory in Chisel3:
class MemCellBundle() extends Bundle {
  val writeBus = Input(UInt(1.W))
  val dataBus = Input(UInt(8.W))
  val cellBus = Output(UInt(8.W))
}

class MemCell() extends Module {
  val io = IO(new MemCellBundle())

  val write = Wire(UInt())
  write := io.writeBus

  val internalValue = Reg(UInt(8.W))

  // Write when the write bus is logic HIGH.
  when(write === 1.U) {
    internalValue := io.dataBus
    io.cellBus := io.dataBus
  } .otherwise {
    io.cellBus := internalValue
  }
}
What I want is for it to output the internalValue when the write bus is logic LOW, and change this value with logic HIGH. My understanding of Chisel is that the register can persist this internalValue between clock cycles, so that this basically acts as a single unit of memory.
I'm doing it in this way as part of a larger project. However when writing a unit test I am finding that the 'read-after-write' scenario fails.
class MemCellTest extends FlatSpec with ChiselScalatestTester with Matchers {
  behavior of "MemCell"

  it should "read and write" in {
    test(new MemCell()) { c =>
      c.io.dataBus.poke(5.U)
      c.io.writeBus.poke(0.U)
      c.io.cellBus.expect(0.U)

      // Write
      c.io.dataBus.poke(5.U)
      c.io.writeBus.poke(1.U)
      c.io.cellBus.expect(5.U)

      // Verify read-after-write
      c.io.dataBus.poke(12.U)
      c.io.writeBus.poke(0.U)
      c.io.cellBus.expect(5.U)
    }
  }
}
The first two expectations work just as I would expect. However, when I try to read after writing, the cellBus returns to 0 instead of persisting the 5 that I had written previously.
test MemCell Success: 0 tests passed in 1 cycles in 0.035654 seconds 28.05 Hz
[info] MemCellTest:
[info] MemCell
[info] - should read and write *** FAILED ***
[info] io_cellBus=0 (0x0) did not equal expected=5 (0x5) (lines in MyTest.scala: 10) (MyTest.scala:21)
Clearly the register is not keeping this value, and so internalValue reverts to 0. But why does this happen, and how would I be able to create a value that can persist?

Drakinite's comment is correct. You need to make sure to step the clock in order to see the register latch the value. I tweaked your test to include a couple of steps and it works as expected:
c.io.dataBus.poke(5.U)
c.io.writeBus.poke(0.U)
c.io.cellBus.expect(0.U)
c.clock.step() // Added step
// Write passthrough (same cycle)
c.io.dataBus.poke(5.U)
c.io.writeBus.poke(1.U)
c.io.cellBus.expect(5.U)
c.clock.step() // Added step
// Verify read-after-write
c.io.dataBus.poke(12.U)
c.io.writeBus.poke(0.U)
c.io.cellBus.expect(5.U)
Here's an executable example showing that this works (using chisel3 v3.4.4 and chiseltest v0.3.4): https://scastie.scala-lang.org/5E1rOEsYSzSUrLXZCvoyNA
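As a side note (not needed for the fix above): if you want the initial read of 0 to be guaranteed by the design itself rather than by the simulator's default register initialization, you can give the register an explicit reset value. A minimal sketch of the same module using RegInit:
class MemCell() extends Module {
  val io = IO(new MemCellBundle())
  // Reset to 0, so the very first read returns 0 by construction
  val internalValue = RegInit(0.U(8.W))
  when(io.writeBus === 1.U) {
    internalValue := io.dataBus
    io.cellBus := io.dataBus
  } .otherwise {
    io.cellBus := internalValue
  }
}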

Adding an MMIO peripheral to a small Rocket core

I have successfully added and simulated my MMIO peripheral coupled to a normal-sized Rocket core before.
But now I want to try to add it to a small core (the so-called TinyCore), and this is the part where I am having problems. Also, just in case it is relevant, the connections with my peripheral all go through FIFOs.
First, the error I am getting when trying to generate the design:
[error] java.lang.IllegalArgumentException: requirement failed: Ports cannot overlap: AddressSet(0x80000000, 0x3fff) AddressSet(0x80000000, 0xfffffff)
I imagine this comes from the fact that the small rocket config has a different memory map, which I don't know, and I am trying to add the peripheral to an address that doesn't exist in this configuration.
Here is the configuration I am using:
class myTinyRocketConfig2 extends Config(
  new freechips.rocketchip.subsystem.WithInclusiveCache(nBanks = 1, nWays = 4, capacityKB = 128) ++
  new freechips.rocketchip.subsystem.With1TinyCore ++ // single tiny rocket-core
  new chipyard.config.AbstractConfig)
And this is how I added the peripheral; it shows the address and some other parameters:
class TLTxWriteQueue
(
  depth: Int = 4,
  csrAddress: AddressSet = AddressSet(0x2000, 0xff),
  beatBytes: Int = 4,
)(implicit p: Parameters) extends TxWriteQueue(depth) with TLHasCSR {
  val devname = "tlQueueIn"
  val devcompat = Seq("ucb-art", "dsptools")

  val device = new SimpleDevice(devname, devcompat) {
    override def describe(resources: ResourceBindings): Description = {
      val Description(name, mapping) = super.describe(resources)
      Description(name, mapping)
    }
  }

  // make diplomatic TL node for regmap
  override val mem = Some(TLRegisterNode(address = Seq(csrAddress), device = device, beatBytes = beatBytes))
}
I apologize in advance for any stupid mistake, as I am a beginner trying to get through his first project. Thanks
The Rocket TinyCore uses a default scratchpad instead of a backing memory. This scratchpad (0x80000000 to 0x80003fff) overlaps with the memory port's address range.
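For a quick sanity check of that overlap, the two ranges from the error message can be compared directly (a small sketch, assuming rocket-chip's freechips.rocketchip.diplomacy.AddressSet API):
import freechips.rocketchip.diplomacy.AddressSet

val scratchpad = AddressSet(0x80000000L, 0x3fffL)    // 0x80000000 - 0x80003fff (16 KiB)
val memPort    = AddressSet(0x80000000L, 0xfffffffL) // 0x80000000 - 0x8fffffff (256 MiB)
scratchpad.overlaps(memPort)                         // true, hence the "Ports cannot overlap" requirement failure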
You'll have to remove the memport. This is what chipyard's TinyRocketConfig does. This config should generate a design (just without an L2 Cache or backing memory).
class TinyRocketConfig extends Config(
  new chipyard.config.WithTLSerialLocation(
    freechips.rocketchip.subsystem.FBUS,
    freechips.rocketchip.subsystem.PBUS) ++           // attach TL serial adapter to f/p busses
  new chipyard.WithMulticlockIncoherentBusTopology ++ // use incoherent bus topology
  new freechips.rocketchip.subsystem.WithNBanks(0) ++ // remove L2$
  new freechips.rocketchip.subsystem.WithNoMemPort ++ // remove backing memory
  new freechips.rocketchip.subsystem.With1TinyCore ++ // single tiny rocket-core
  new chipyard.config.AbstractConfig)
If you wanted to include an InclusiveCache in your design, you can try using a modified version of chipyard's TinyRocketConfig. Though currently, it doesn't seem like you're addressing the entire L2 Cache, and I think it's microarchitecturally unused with TinyCore. If you simply need a larger scratchpad, you can modify the scratchpad to contain more sets:
class WithModifiedScratchPad extends Config((site, here, up) => {
  case RocketTilesKey => up(RocketTilesKey, site) map { r =>
    // each set is currently 64 bytes
    r.copy(dcache = r.dcache.map(_.copy(nSets = 2048 /* 128 KiB scratchpad */)))
  }
})

Flink: ProcessWindowFunction

I have recently been studying ProcessWindowFunction in Flink's new release. It says that ProcessWindowFunction supports global state and window state. I used the Scala API to give it a try. So far I can get the global state working, but I have not had any luck with the window state. What I'm doing is processing system logs and counting the number of logs keyed by hostname and severity level. I would like to calculate the difference in log count between two adjacent windows. Here is my code implementing ProcessWindowFunction.
class LogProcWindowFunction extends ProcessWindowFunction[LogEvent, LogEvent, Tuple, TimeWindow] {
  // Create a descriptor for ValueState
  private final val valueStateWindowDesc = new ValueStateDescriptor[Long](
    "windowCounters",
    createTypeInformation[Long])

  private final val reducingStateGlobalDesc = new ReducingStateDescriptor[Long](
    "globalCounters",
    new SumReduceFunction(),
    createTypeInformation[Long])

  override def process(key: Tuple, context: Context, elements: Iterable[LogEvent], out: Collector[LogEvent]): Unit = {
    // Initialize the per-key and per-window ValueState
    val valueWindowState = context.windowState.getState(valueStateWindowDesc)
    val reducingGlobalState = context.globalState.getReducingState(reducingStateGlobalDesc)

    val latestWindowCount = valueWindowState.value()
    println(s"lastWindowCount: $latestWindowCount ......")
    val latestGlobalCount = if (reducingGlobalState.get() == null) 0L else reducingGlobalState.get()

    // Compute the necessary statistics and determine if we should launch an alarm
    val eventCount = elements.size

    // Update the related state
    valueWindowState.update(eventCount.toLong)
    reducingGlobalState.add(eventCount.toLong)

    for (elem <- elements) {
      out.collect(elem)
    }
  }
}
I always get a value of 0 from the window state instead of the previously updated count it should hold. I've been struggling with this problem for several days. Can someone please help me figure it out? Thanks.
The scope of the per-window state is a single window instance. In the case of your process method above, every time it is called a new window is in scope, and so the latestWindowCount is always zero.
For a normal, vanilla window that is only going to fire once, per-window state is useless. Only if a window somehow has multiple firings (e.g., late firings) can you make good use of the per-window state. If you are trying to remember something from one window to the next, then you can do this with the global window state.
For an example of using per-window state to remember data to use in late firings, see slides 13-19 in Flink's advanced window training.
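To make the global-state suggestion concrete, here is a minimal sketch of a ProcessWindowFunction that keeps the previous window's count per key in globalState and derives the difference between adjacent windows. The class and state names are illustrative, not from the original code:
import org.apache.flink.api.common.state.ValueStateDescriptor
import org.apache.flink.api.java.tuple.Tuple
import org.apache.flink.streaming.api.scala._
import org.apache.flink.streaming.api.scala.function.ProcessWindowFunction
import org.apache.flink.streaming.api.windowing.windows.TimeWindow
import org.apache.flink.util.Collector

class LogDeltaWindowFunction extends ProcessWindowFunction[LogEvent, LogEvent, Tuple, TimeWindow] {

  // Previous window's count, kept in the per-key *global* state so it survives across windows
  private final val prevCountDesc =
    new ValueStateDescriptor[Long]("previousWindowCount", createTypeInformation[Long])

  override def process(key: Tuple, context: Context,
                       elements: Iterable[LogEvent], out: Collector[LogEvent]): Unit = {
    val prevCountState = context.globalState.getState(prevCountDesc)

    val currentCount = elements.size.toLong
    val previousCount = prevCountState.value() // unset Long state reads back as 0, as observed in the question
    val delta = currentCount - previousCount   // difference between adjacent windows for this key

    prevCountState.update(currentCount)
    // ... use `delta` here, e.g. to decide whether to raise an alarm ...
    elements.foreach(e => out.collect(e))
  }
}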

Spark - how to handle lazy evaluation in the case of iterative (or recursive) function calls

I have a recursive function that needs to compare the results of the current call to the previous call to figure out whether it has reached convergence. My function does not contain any action - it only contains map, flatMap, and reduceByKey. Since Spark does not evaluate transformations (until an action is called), my next iteration does not get the proper values to compare for convergence.
Here is a skeleton of the function -
def func1(sc: SparkContext, nodes: RDD[List[Long]], didConverge: Boolean, changeCount: Int): RDD[List[Long]] = {
  if (didConverge)
    nodes
  else {
    val currChangeCount = sc.accumulator(0, "xyz")
    val newNodes = performSomeOps(nodes, currChangeCount) // does a few map/flatMap/reduceByKey operations
    if (currChangeCount.value == changeCount) {
      func1(sc, newNodes, true, currChangeCount.value)
    } else {
      func1(sc, newNodes, false, currChangeCount.value)
    }
  }
}
performSomeOps only contains map, flatMap, and reduceByKey transformations. Since it does not have any action, the code in performSomeOps does not execute, so my currChangeCount does not get the actual count. This implies that the condition used to check for convergence (currChangeCount.value == changeCount) is invalid. One way to overcome this is to force an action within each iteration by calling count, but that is an unnecessary overhead.
I am wondering what I can do to force an action w/o much overhead or is there another way to address this problem?
I believe there is a very important thing you're missing here:
For accumulator updates performed inside actions only, Spark guarantees that each task’s update to the accumulator will only be applied once, i.e. restarted tasks will not update the value. In transformations, users should be aware of that each task’s update may be applied more than once if tasks or job stages are re-executed.
Because of that accumulators cannot be reliably used for managing control flow and are better suited for job monitoring.
Moreover executing an action is not an unnecessary overhead. If you want to know what is the result of the computation you have to perform it. Unless of course the result is trivial. The cheapest action possible is:
rdd.foreach { case _ => }
but it won't address the problem you have here.
In general iterative computations in Spark can be structured as follows:
def func1(checkpointInterval: Int)(sc: SparkContext, nodes: RDD[List[Long]],
    didConverge: Boolean, changeCount: Int, iteration: Int): RDD[List[Long]] = {
  if (didConverge) nodes
  else {
    // Compute and cache new nodes
    val newNodes = performSomeOps(nodes).cache
    // Periodically checkpoint to avoid stack overflow
    if (iteration % checkpointInterval == 0) newNodes.checkpoint
    /* Call a function which computes values
       that determine control flow. This executes an action on newNodes. */
    val newChangeCount = computeChangeCount(newNodes)
    // Unpersist old nodes
    nodes.unpersist
    func1(checkpointInterval)(
      sc, newNodes, newChangeCount == changeCount,
      newChangeCount, iteration + 1
    )
  }
}
I see that these map/flatMap/reduceByKey transformations are updating an accumulator. Therefore the only way to perform all the updates is to execute these transformations, and count is the easiest way to achieve that, with the lowest overhead compared to the alternatives (cache + count, first, or collect).
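For reference, a minimal sketch of that accumulator-plus-count pattern, using the same sc.accumulator API and performSomeOps from the question; the accumulator value is only meaningful after the action has run, and is still subject to the re-execution caveat quoted above:
val currChangeCount = sc.accumulator(0, "xyz")
val newNodes = performSomeOps(nodes, currChangeCount).cache()
newNodes.count()                                  // action: forces the map/flatMap/reduceByKey to actually run
val changesThisIteration = currChangeCount.value  // now populated for this iteration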
Previous answers put me on the right track to solve a similar convergence detection problem.
foreach is presented in the docs as:
foreach(func) : Run a function func on each element of the dataset. This is usually done for side effects such as updating an Accumulator or interacting with external storage systems.
It seems like instead of using rdd.foreach() as a cheap action to trigger accumulator increments placed in various transformations, it should be used to do the incrementing itself.
I'm unable to produce a Scala example, but here's a basic Java version, if it can still help:
// Convergence is reached when two iterations
// return the same number of results
long previousCount = -1;
long currentCount = 0;
while (previousCount != currentCount) {
    rdd = doSomethingThatUpdatesRdd(rdd);

    // Count entries in new rdd with foreach + accumulator
    rdd.foreach(tuple -> accumulator.add(1));

    // Update helper values
    previousCount = currentCount;
    currentCount = accumulator.sum();
    accumulator.reset();
}
// Convergence is reached
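For readers working in Scala, a rough equivalent of the loop above -- a sketch assuming Spark 2.x's LongAccumulator, the same hypothetical doSomethingThatUpdatesRdd, and a hypothetical initialRdd to start from:
// Convergence is reached when two iterations return the same number of results
val accumulator = sc.longAccumulator("rowCount")
var previousCount = -1L
var currentCount = 0L
var rdd = initialRdd                      // hypothetical starting RDD
while (previousCount != currentCount) {
  rdd = doSomethingThatUpdatesRdd(rdd)
  // Count entries in the new rdd with foreach + accumulator
  rdd.foreach(_ => accumulator.add(1))
  // Update helper values
  previousCount = currentCount
  currentCount = accumulator.sum
  accumulator.reset()
}
// Convergence is reached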

Pytest - skip (xfail) mixed with parametrize

Is there a way to use the @incremental plugin as described in Pytest: how to skip the rest of tests in the class if one has failed? mixed with @pytest.mark.parametrize, like below:
@pytest.mark.incremental
class TestClass:
    @pytest.mark.parametrize("input", data)
    def test_preprocess_check(self, input):
        # prerequisite for test
        ...

    @pytest.mark.parametrize("input", data)
    def test_process_check(self, input):
        # test only if test_preprocess_check succeeded
        ...
The problem I encountered is that, at the first failure of test_preprocess_check with a given input from my data set, the following test_preprocess_check and test_process_check runs are all labeled "xfail".
The behaviour I expect is that, for each new "input" of my parametrized data set, the tests act in an incremental fashion.
For example, with data = [0, 1, 2]:
if only test_preprocess_check(0) failed,
I get the following report:
1 failed, 5 xfailed
but I expect the report:
1 failed, 1 xfailed, 4 passed
Thanks
After some experiments I found a way to generalize @incremental so it works with the parametrize annotation. Simply rewrite the _previousfailed attribute to make it unique for each input. The _genid attribute was exactly what was needed.
I added a @pytest.mark.incrementalparam marker to achieve this.
The code becomes:
def pytest_runtest_setup(item):
    previousfailed_attr = getattr(item, "_genid", None)
    if previousfailed_attr is not None:
        previousfailed = getattr(item.parent, previousfailed_attr, None)
        if previousfailed is not None:
            pytest.xfail("previous test failed (%s)" % previousfailed.name)
    previousfailed = getattr(item.parent, "_previousfailed", None)
    if previousfailed is not None:
        pytest.xfail("previous test failed (%s)" % previousfailed.name)


def pytest_runtest_makereport(item, call):
    if "incrementalparam" in item.keywords:
        if call.excinfo is not None:
            previousfailed_attr = item._genid
            setattr(item.parent, previousfailed_attr, item)
    if "incremental" in item.keywords:
        if call.excinfo is not None:
            parent = item.parent
            parent._previousfailed = item
It's interesting to mention that this can't be used without parametrize, because the parametrize annotation automatically creates the _genid attribute.
Hope this helps others too.

Spark: run an external process in parallel

Is it possible with Spark to "wrap" and run an external process managing its input and output?
The process is represented by a normal C/C++ application that usually runs from the command line. It accepts a plain text file as input and generates another plain text file as output. As I need to integrate the flow of this application with something bigger (still within Spark), I was wondering if there is a way to do this.
The process can easily be run in parallel (at the moment I use GNU Parallel) by splitting its input into (for example) 10 part files, running 10 instances of it in memory, and re-joining the final 10 output part files into one file.
The simplest thing you can do is to write a simple wrapper which takes data from standard input, writes it to a file, executes the external program, and writes the results to standard output. After that all you have to do is use the pipe method:
rdd.pipe("your_wrapper")
The only serious consideration is IO performance. If possible, it would be better to adjust the program you want to call so it can read and write data directly, without going through disk.
Alternatively, you can use mapPartitions combined with process and the standard IO tools to write to a local file, call your program, and read the output.
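A rough sketch of that mapPartitions approach, assuming an RDD[String] and a hypothetical /path/to/external_program that takes an input file and an output file as arguments:
import java.nio.file.Files
import scala.collection.JavaConverters._
import scala.sys.process._

val processed = rdd.mapPartitions { lines =>
  val inFile  = Files.createTempFile("part-in-", ".txt")
  val outFile = Files.createTempFile("part-out-", ".txt")
  Files.write(inFile, lines.toSeq.asJava)                  // dump this partition to a local file
  val exitCode = Seq("/path/to/external_program", inFile.toString, outFile.toString).! // run and wait
  require(exitCode == 0, s"external program failed with exit code $exitCode")
  val output = Files.readAllLines(outFile).asScala.toList  // read the program's output back
  Files.delete(inFile)
  Files.delete(outFile)
  output.iterator
}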
If you end up here based on the question title from a Google search, but you don't have the OP restriction that the external program needs to read from a file--i.e., if your external program can read from stdin--here is a solution. For my use case, I needed to call an external decryption program for each input file.
import org.apache.commons.io.IOUtils
import sys.process._
import scala.collection.mutable.ArrayBuffer
val showSampleRows = true
val bfRdd = sc.binaryFiles("/some/files/*,/more/files/*")
val rdd = bfRdd.flatMap{ case (file, pds) => { // pds is a PortableDataStream
  val rows = new ArrayBuffer[Array[String]]()
  var errors = List[String]()
  val io = new ProcessIO(
    in => { // "in" is an OutputStream; write the encrypted contents of the
            // input file (pds) to this stream
      IOUtils.copy(pds.open(), in) // open() returns a DataInputStream
      in.close
    },
    out => { // "out" is an InputStream; read the decrypted data off this stream.
             // Even though this runs in another thread, we can write to rows, since it
             // is part of the closure for this function
      for (line <- scala.io.Source.fromInputStream(out).getLines) {
        // ...decode line here... for my data, it was pipe-delimited
        rows += line.split('|')
      }
      out.close
    },
    err => { // "err" is an InputStream; read any errors off this stream
             // errors is part of the closure for this function
      errors = scala.io.Source.fromInputStream(err).getLines.toList
      err.close
    }
  )
  val cmd = List("/my/decryption/program", "--decrypt")
  val exitValue = cmd.run(io).exitValue // blocks until subprocess finishes

  println(s"-- Results for file $file:")
  if (exitValue != 0) {
    // TBD write to string accumulator instead, so driver can output errors
    // string accumulator from #zero323: https://stackoverflow.com/a/31496694/215945
    println(s"exit code: $exitValue")
    errors.foreach(println)
  } else {
    // TBD, you'll probably want to move this code to the driver, otherwise
    // unless you're using the shell, you won't see this output
    // because it will be sent to stdout of the executor
    println(s"row count: ${rows.size}")
    if (showSampleRows) {
      println("6 sample rows:")
      rows.slice(0, 6).foreach(row => println("  " + row.mkString("|")))
    }
  }
  rows
}}
scala> :paste "test.scala"
Loading test.scala...
...
rdd: org.apache.spark.rdd.RDD[Array[String]] = MapPartitionsRDD[62] at flatMap at <console>:294
scala> rdd.count // action, causes Spark code to actually run
-- Results for file hdfs://path/to/encrypted/file1: // this file had errors
exit code: 255
ERROR: Error decrypting
my_decryption_program: Bad header data[0]
-- Results for file hdfs://path/to/encrypted/file2:
row count: 416638
sample rows:
<...first row shown here ...>
...
<...sixth row shown here ...>
...
res43: Long = 843039
References:
https://www.scala-lang.org/api/current/scala/sys/process/ProcessIO.html
https://alvinalexander.com/scala/how-to-use-closures-in-scala-fp-examples#using-closures-with-other-data-types