Order in always_comb block - system-verilog

I have the impression that in an always_comb block, all the non-blocking assignment should work in parallel. That is, if I have
always_comb
begin
a = b;
b = c;
end
Then, a should be equal to c regardless of the order of above two lines in the always_comb block, as they are evaluated concurrently anyway. However, today I experienced an issue that change the order of above two lines, the results are different!!! Whay is that?

The statements within a begin/end block execute serially. It does not matter if you are using an always_comb or any other kind of always block. But you are using blocking assignments, not non-blocking assignments, which is the proper thing to do in an always_comb block. Non-blocking assignments are used to assign sequential logic, which implies storage of the current and next state.

This difference stems from that combinational always blocks cannot "self-trigger". The way the simulator works when a signal changes value is to locate all always blocks with that signal in the sensitivity list, then execute them one by one sequentially. But, only if that block is not already running! In your case the expected behavior would require the block to run twice, but instead only one iteration occurs for every update of c.
The situation is unfortunate since the sensitivity list is a simulator concept and generally ignored altogether for synthesis. Most synthesis tools would generate a wire from your code without producing any warning, creating a simulation-synthesis mismatch.
Note that an explicit sensitivity list (e.g., always #(b or c)) does not make any difference. One solution is to always ensure that the assignments are in the right order. Another is to use non-blocking assignments, but this is generally advised against since it slows down the simulator. (Note that VHDL does not have blocking assignments, and would thus always have this performance penalty. On the plus side you do not have problems like this.)

Related

Can someone explain the control flow of modules in System Verilog

I know how to link modules but could someone explain the flow of calling the modules to be used when I want it to be used.
Like have a state machine and depending on the state I can call a module to activate, or like if I need to repeat a process how to go back to a module earlier in a state machine.
again I get the instantiating part like this
wire clk;
wire sig;
wire out;
A a(clk, sig, topout);
B b(clk, sig);
endmodule
but can someone explain how to call modules and how the control flow works in general for them?
(I am new to HDLs so I appreciate any help)
Verilog is a language specifically developed to simulate behavior of hardware. Hardware is a set of transistors and other elements which always statically presented and function in parallel. Functioning of such elements could be enabled or disabled, but the hardware is still present.
Verilog is similar to the hardware in the sense that all its elements are always present, intended for parallel functioning.
The basic functional elements of Verilog are gates, primitives and procedural blocks (i.e., always blocks). Those blocks are connected by wires.
All those elements are then grouped in modules. Modules are used to create logical partitioning of the hardware mode. They cannot be 'called'. They can be instantiated in a hierarchical manner to describe a logical structure of the model. They cannot be instantiated conditionally since they represent pieces of hardware. Different module instances are connected via wires to express hierarchical connectivity between lower-level elements.
There is one exception however, the contents of an always block is pure software. It describes an algorithmic behavior of it and as such, software flow constructs are allowed inside always block (specific considerations must be used to make it synthesizable).
As it comes to simulation, Verilog implements an event-driven simulation mode which is intended to mimic parallel execution of hardware. In other words, a Verilog low level primitive (gate or always block) is executed only if at least one of its inputs changes.
The only flow control which is usually used in such models is a sequence of input events and clocks. The latter are used to synchronize results of multiple parallel operations and to organize pipes or other sequential functions.
As I mentioned before, hardware elements can be enabled/disabled by different methods, so the only further control you can use by implementing such methods in your hardware description. For example, all hardware inside a particular module can be turned off by disabling clock signal which the module uses. There could be specific enable/disable signals on wires or in registers, and so on.
Now to your question: your code defines hierarchical instantiation of a couple of modules.
module top(out);
output wire out;
wire clk;
wire sig;
A a(clk, sig, out);
B b(clk, sig);
endmodule
Module 'top' (missing in your example) contains instances of two other modules, A and B. A and B are module definitions. They are instantiated as corresponding instances 'a' and 'b'. The instances are connected by signals 'clk', which is probably a clock signal, some signal 'sig' which is probably an output of one of the modules and input in another. 'out' is output of module top, which is probably connected to another module or an element in a higher level of hierarchy, not shown here.
The flow control in some sense is defined by the input/output relation between modules 'A' and 'B'. For example:
module A(input clk, input sig, output out);
assign out = sig;
...
endmodule
module B(input clk, output sig);
always#(posedge clk) sig <= some-new-value;
...
endmodule
However, in general it is defined by the input/output relation of the internal elements inside module (always blocks in the above example). input/output at the module port level is mostly used for semantic checking.
In the event-driven simulation it does not matter hardware of which module is executed first. However as soon as the value of the 'sig' changes in always#(posedge clk) of module 'B', simulation will cause hardware in module 'A' (the assign statement to be evaluated (or re-evaluated). This is the only way you can express a sequence in the flow at this level. Same as in hardware.
If you are like me you are looking at Verilog with the background of a software programmer. Confident in the idea that a program executes linearly. You think of ordered execution. Line 1 before line 2...
Verilog at its heart wants to execute all the lines simultaneously. All the time.
This is a very parallel way to program and until you get it, you will struggle to think the right way about it. It is not how normal software works. (I recall it took me weeks to get my head around it.)
You can prefix blocks of simultaneous execution with conditions, which are saying execute the lines in this block when the condition is true. All the time the condition is true. One class of such conditions is the rising edge of a clock: always #(posedge clk). Using this leads to a block of code that execute once every time the clk ticks (up).
Modules are not like subroutines. They are more like C macros - they effectively inline blocks of code where you place them. These blocks of code execute all the time any conditions that apply to them are true. Typically you conditionalize the internals of a module on the state of the module arguments (or internal register state). It is the connectivity of the modules through the shared arguments that ensures the logic of a system works together.

System verilog:: Static Variable non-blocking Assignent outside program-block?

I'm new to system verilog and i'm stuck with a basic concept, kindly provide rationale behind the following behavior:
In System verilog, Why Static class properties declared in other than program-block scope cannot be assigned with blocking assignment from program block?
2.Why is that, even if static variable is assigned with non-blocking statement, the change in that static variable is no visible ($display) immediately, it is available after a delay of say #1.
Example:
class A ;
static int i;
endclass
program main ;
A obj;
initial
begin
obj.i = 123; // Not Allowed, can only be done using <= ... WHY ??
$display(obj.i);
#1 $display(obj.i);
end
endprogram
There is no such rule in the IEEE 1800-2012 LRM Earlier version of SystemVerilog had more restrictions on the types of assignments allowed, but those have all been removed. I do not recommend that anyone use program blocks anymore. There are a big source of unnecessary confusion. See http://go.mentor.com/programblocks
The purpose of "program" block in SystemVerilog is to guarantee that there will not be any race condition between the testbench and the DUT if the user encloses his testbench in program block(s) and keeps the DUT outside of program block(s). Another way to avoid race conditions is implemented by limiting DUT/testbench interaction to interfaces/clocking blocks. Also note that:
a) blocking assignments (since they are executed immediately and therefor the result of the execution can vary with the order of execution of threads) can lead to race conditions
b) hardware (RTL) variables can only be static
Given the whole scenario, the compiler makes out that the blocking statement could lead to a race condition between the DUT and the testbench. And hence the error.
When you use non-blocking assignment, the assignment is scheduled and not executed immediately. It would get executed once the scheduler gets a chance to execute it. And that would happen only after the present thread yields because of a blocking expression that involves time increment. In the given code snippet that happens once the executing thread encounters #1; the $display after #1 sees the result of the non-blocking assignment while the one before does not.

Force parfor to respect some order

I understand that some indeterminism stems from parfor's parallel nature but I don't understand why it should be entirely random. Is there any way to force parfor to respect (at least loosely) the order of the loop? More specifically I would like that in the case of:
parfor i=1:100
do_independent_stuff()
end
each worker of the pool when asking for a new task (i.e. in this case a new iteration of the loop) to be affected the lowest i that hasn't been computed or affected to a worker yet.
I think its by design that running something in parallel assumes that order is not important, at least in Matlab. Each thread/worker should be independent of each other. However, as indicated in this question, you could try job and task control interface to give you some level of control.
Firstly, in practice, PARFOR isn't "entirely random" - you can easily observe that it sends out chunks of loop iterates in reverse order. In R2013b and later, if you need more control over ordering (if, for example, you know that certain of your independent things are likely to take a long time, and therefore wish to start computing them first), you can use PARFEVAL.
If you need to loosely synchronize things, for instance wait until some thread as finished or has reach some point before to start another one, best should be to use semaphores, locks, mutex, etc...
I don't know if 'Parallel toolbox' includes such synchronization objects, but here is some workaround to create semaphore for instance:
https://stackoverflow.com/a/22874669/684399
You can also use objects in 'System.Threading' namespace (requires .NET):
Init:
someResultAvailable = System.Threading.ManualResetEvent(false);
In some job:
... do work ...
someResultAvailable .Set();
... continue ...
In another one:
... do work ...
if (!someResultAvailable.WaitOne(10000))
{
error('Timeout waiting for result from other thread');
}
... continue now knowing that results are available ...

What is the architecture behind Scratch programming blocks?

I need to build a mini version of the programming blocks that are used in Scratch or later in snap! or openblocks.
The code in all of them is big and hard to follow, especially in Scratch which is written in some kind of subset of SmallTalk, which I don't know.
Where can I find the algorithm they all use to parse the blocks and transform it into a set of instructions that work on something, such as animations or games as in Scratch?
I am really interested in the algorithmic or architecture behind the concept of programming blocks.
This is going to be just a really general explanation, and it's up to you to work out specifics.
Defining a block
There is a Block class that all blocks inherit from. They get initialized with their label (name), shape, and a reference to the method. When they are run/called, the associated method is passed the current context (sprite) and the arguments.
Exact implementations differ among versions. For example, In Scratch 1.x, methods took arguments corresponding to the block's arguments, and the context (this or self) is the sprite. In 2.0, they are passed a single argument containing all of the block's arguments and context. Snap! seems to follow the 1.x method.
Stack (command) blocks do not return anything; reporter blocks do.
Interpreting
The interpreter works somewhat like this. Each block contains a reference to the next one, and any subroutines (reporter blocks in arguments; command blocks in a C-slot).
First, all arguments are resolved. Reporters are called, and their return value stored. This is done recursively for lots of Reporter blocks inside each other.
Then, the command itself is executed. Ideally this is a simple command (e.g. move). The method is called, the Stage is updated.
Continue with the next block.
C blocks
C blocks have a slightly different procedure. These are the if <> style, and the repeat <> ones. In addition to their ordinary arguments, they reference their "miniscript" subroutine.
For a simple if/else C block, just execute the subroutine normally if applicable.
When dealing with loops though, you have to make sure to thread properly, and wait for other scripts.
Events
Keypress/click events can be dealt with easily enough. Just execute them on keypress/click.
Something like broadcasts can be done by executing the hat when the broadcast stack is run.
Other events you'll have to work out on your own.
Wait blocks
This, along with threading, is the most confusing part of the interpretation to me. Basically, you need to figure out when to continue with the script. Perhaps set a timer to execute after the time, but you still need to thread properly.
I hope this helps!

Simulink: Synchronizing and timing

in order to simulate some processes I have a problem with getting a predefined working order of my self modeled blocks.
How can I be sure, that for example Block A must be finished before Block B and C start working?
The problem is, that some Blocks shall work after some others and some shall not. I must admit I have not much experience with Simulink in order of doing time-depending things (althought basic knowledge of simulink is available).
For instance this scenario shall be realised:
A -> B, C -> D, E, F
The main thing is, that all blocks A-F have no logic correlation to each other, they all do several things. My aim is to make B and C start working, after A has finished. And D/E/F after B AND C have finished.
In this case, the word "parallel" was the wrong word, this does not have to be calculated really parallely. Just making sure, that this complies with a predifined steady order.
Edit:
My new idea is to use the matlab workspace als buffer, so my block A can push its results to workspace (by the "to workspace" block). But now I have to make sure, that block B and C may read the results (with "From workspace") of A AFTER A pushed its information to workspace...how to do this?
Edit2:
Here's a screenshot which should make some thinks clearer:
As the documentation of "Sorted order" refers, the set up seems to be ok (including the subsystems timing). But unfortunately problem still exists. The variable "simin" is loaded from workspace, before it was written :( As you see, the display shows "1", which it shouldn't do. In the very first run of the simulation I get an exception, that the variable "simin" does not exist.
It would be nice, if you can help me with my issue.
Greets, poeschlorn
So in your example, if you have Block A connected with the same wire to both B and C, when Block A is finished, Block B and C will work in parallel.
EDIT:
I am using the same blocks as you are, but it works for me. I think you are over complicating things. The way you are setting block priorities is no different than how Simulink runs the blocks without them. Below you can see my setup and the output on both binary displays.
The error you see on the first run is due to Simulink not creating the variable until the first time step is being executed. When Simulink builds the simulation it sees that the variable used as input from the workspace is not created.
If the connection between the block are not enough to set the order, you can use block priorities.
A tip to test the execution order is to add an "embedded Matlab block" with a disp command displaying the name of the block.
It's not really clear what you're asking. When you say that Block A must be finished, do you mean the Output function? The way simulation works in Simulink is that the blocks are run serially, so Block B and C would never run until Block A finished it's Output function.
I don't know of any obvious way of running blocks B and C in parallel currently in Simulink.