SystemVerilog: Static variable non-blocking assignment outside program block?

I'm new to SystemVerilog and I'm stuck on a basic concept. Kindly provide the rationale behind the following behavior:
1. In SystemVerilog, why can't static class properties declared outside program-block scope be assigned with a blocking assignment from a program block?
2. Why is it that, even when the static variable is assigned with a non-blocking statement, the change is not visible ($display) immediately, but only after a delay of, say, #1?
Example:
class A ;
static int i;
endclass
program main ;
A obj;
initial
begin
obj.i = 123; // Not Allowed, can only be done using <= ... WHY ??
$display(obj.i);
#1 $display(obj.i);
end
endprogram

There is no such rule in the IEEE 1800-2012 LRM. Earlier versions of SystemVerilog had more restrictions on the types of assignments allowed, but those have all been removed. I do not recommend that anyone use program blocks anymore. They are a big source of unnecessary confusion. See http://go.mentor.com/programblocks

The purpose of "program" block in SystemVerilog is to guarantee that there will not be any race condition between the testbench and the DUT if the user encloses his testbench in program block(s) and keeps the DUT outside of program block(s). Another way to avoid race conditions is implemented by limiting DUT/testbench interaction to interfaces/clocking blocks. Also note that:
a) blocking assignments (since they are executed immediately, and therefore the result of the execution can vary with the order of execution of threads) can lead to race conditions
b) hardware (RTL) variables can only be static
Given the whole scenario, the compiler makes out that the blocking statement could lead to a race condition between the DUT and the testbench. And hence the error.
When you use non-blocking assignment, the assignment is scheduled and not executed immediately. It would get executed once the scheduler gets a chance to execute it. And that would happen only after the present thread yields because of a blocking expression that involves time increment. In the given code snippet that happens once the executing thread encounters #1; the $display after #1 sees the result of the non-blocking assignment while the one before does not.
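The scheduling behavior described above can be illustrated with a minimal sketch. It assumes a simulator that follows IEEE 1800-2012 (where non-blocking assignments to static class properties are permitted) and uses a plain module instead of a program block; the module name tb is made up for the illustration.

```systemverilog
class A;
  static int i;   // static property, default value 0
endclass

module tb;        // hypothetical testbench module, not from the question
  initial begin
    A::i <= 123;                        // update is scheduled, not applied yet
    $display("before delay: %0d", A::i); // still sees the old value
    #1 $display("after delay:  %0d", A::i); // scheduled update has been applied
  end
endmodule
```

The first $display would be expected to show the old value, and the second the assigned value, because the non-blocking update is applied only after the initial block suspends.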

Related

Is it allowed to use #1step as a procedural delay?

I am not sure if the LRM is clear about the #1step usage, but I have a case of creating a smallest possible delay a simulator could detect. So, I have written the following code:
virtual task drive_signal();
// Initialise mysignal to a value of '1'.
m_vif.mysignal= 1;
#1step; // Advance with 1 time step
m_vif.mysignal= 0;
#m_cfg.configured_delay; //Delay by configured value
m_vif.mysignal= 1;
endtask
Is this a valid way to do so?
I did however use #0 instead of #1step but it did not create any runtime delay.
This is currently an open issue in the IEEE 1800-2017 SystemVerilog LRM, but the intent was not to allow it.
The use of simple delays like #0 or #1 is bad practice, as they increase the potential for race conditions. Since you tagged this question with UVM: the use of any delays in a driver is highly discouraged; instead you should synchronize to a clock edge in an interface or top-level testbench.
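As a rough sketch of the clock-synchronous alternative, the task from the question could be rewritten along these lines. The interface my_if, its clk input, the class name driver_sketch and the delay_cycles field are all assumptions made for the illustration; the original code does not show them.

```systemverilog
// Hypothetical interface: the original does not show its definition.
interface my_if(input logic clk);
  logic mysignal;
endinterface

class driver_sketch;
  virtual my_if m_vif;
  int unsigned  delay_cycles = 3;  // assumed stand-in for m_cfg.configured_delay

  task drive_signal();
    m_vif.mysignal <= 1;           // drive the initial value
    @(posedge m_vif.clk);          // advance by one clock edge instead of #1step
    m_vif.mysignal <= 0;
    repeat (delay_cycles) @(posedge m_vif.clk); // delay counted in clock cycles
    m_vif.mysignal <= 1;
  endtask
endclass
```

Counting the delay in clock cycles rather than in time units keeps the driver aligned with the DUT's sampling edges.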

UVM End of test Objection Mechanism and Phase Ready to End Implementation

I am exploring different ways to end a UVM test. One method that has come often from studying different blogs from Verification Academy and other sites is to use the Phase Ready to End. I have some questions regarding the implementation of this method.
I am using this method in the scoreboard class, where my understanding is that after my usual run phase is finished, it will call the phase_ready_to_end method and execute it. The reason I am using it is that my scoreboard's run_phase finishes early, and there is still some data in the queues that needs to be processed. So I am trying to prolong the scoreboard's run_phase using this method. Here is some pseudo-code that I have used.
function void phase_ready_to_end(uvm_phase phase);
if (phase.get_name() != "run") return;
if (queue.size() != 0) begin
phase.raise_objection(.obj(this));
fork
begin
delay_phase(phase);
end
join_none
end
endfunction
task delay_phase(uvm_phase phase);
wait(queue.size() == 0);
phase.drop_objection(.obj(this));
endtask
I have taken inspiration for this implementation from this link, UVM-End of Test Mechanism, for your reference. Here are some of the thoughts on which I need guidance and help.
1. To the best of my understanding, phase_ready_to_end is called at the end of run_phase, and when it runs it raises the objection for the scoreboard's run_phase and runs the delay_phase task.
2. The delay_phase task just waits for the queue to empty, but I am not seeing any method or task that pops items from the queue. Do I have to call some method to pop from the queue, or, as per point 1 above, will the raised objection restart the run phase, so there is no need for that and we just have to wait for a considerable amount of time?
Let me give you some pre-context to this question. I have a scoreboard where there are two queues whose write methods are implemented and they are being fed correctly by their source.
task run_phase (uvm_phase phase);
forever begin
compare_queues(); // takes data from the two queues and compares them
end
endtask
Both queue implementations are fine and they take data from their respective sources. Let me give you a scenario: suppose a total of 10 transactions are generated, but the scoreboard was able to process only 6 of them and 4 transactions are left when all objections are dropped. To tackle that, I implemented this phase_ready_to_end method in my scoreboard.
The problem with this method that I am having is that, when I raise the objection in this phase_ready_to_end and call delay_phase method, nothing happens. And I am curious is there more to this implementation or not?
Sorry for the delay. I have shared more context to the existing question. Please see to that, let me know if it is confusing.
We have a pair of monitors that calls write method implemented inside the scoreboard. The monitors typically capture the transaction from BUS and call these WR methods to push the transactions. Thus two source and destination monitors WR into two - source and destination - queues as and when they find the transactions.
We have a checker task with RD-n-check running in forever loop in the run-phase of scoreboard. It's in a while loop and watches if the destination queue has non-zero entry. Once it finds so, it pops the head entry from destination queue and then pops the head entry from source queue as well and compares the two entries to declare if the check was a PASS or FAIL.
There are more than 2 queues and more than a pair of source/destination of course, but broadly this is the architecture around here.
Now in the current scenario, it seems that the checker tasks stop printing after a certain point in time in some of the test cases. After adding debug prints thoroughly, it seems that the checker tasks that do job #2/#3 above, and get called inside the forever loop of the run phase, exit gracefully one last time. However, they are never entered again, which is to say that the forever loop that should be calling them didn't call them. It is as if the forever loop of the run phase stopped completely.
We also added another forever loop in run-phase that observes whether the queues are empty. From prints inside that parallel loop and from the monitor prints, we know that the queues aren't empty and monitors did push WRs into the queues for a long time.
Going by the prints, the forever loop stopped working all of a sudden, but another set of threads that we added to the run phase in another forever loop, just to monitor those queues, keeps printing that the queues have contents. So the run phase shouldn't be over, yet the checker tasks running in the forever loop have stopped.
We are using Vivado 2020.2 for the simulation. This is a baffling problem for us, and we went through the prints multiple times to make sure nothing had been missed. It seems we are missing something very basic, or have hit a bug, or have broken some basic rule of UVM coding to land here.
If you have any help, thoughts here, will appreciate that greatly.
The function phase_ready_to_end() gets called at the end of every task-based phase when all objections have been dropped (or never raised at all).
Typically a scoreboard has a queue or some kind of array of transactions waiting to be checked sent from a monitor via an analysis_port write() method. If your scoreboard is an in-order comparison checker, the queue size is zero when there are no more transactions waiting to be received.
If you look at the code in the link you shared, there is the following in the write_south method doing exactly that:
if (!item.compare(item_stream.pop_front()))
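To make the idea concrete, here is a minimal sketch of an in-order scoreboard in which the analysis write methods themselves push and pop the queue, so that the thread forked from phase_ready_to_end only has to wait for the queue to drain. The names my_item, sb_sketch, write_src, write_dst and src_q are hypothetical and not taken from the linked code; only the UVM calls (uvm_analysis_imp_decl, raise_objection, drop_objection, compare) are standard.

```systemverilog
import uvm_pkg::*;
`include "uvm_macros.svh"

// Hypothetical transaction type used only for this sketch.
class my_item extends uvm_sequence_item;
  rand int data;
  `uvm_object_utils_begin(my_item)
    `uvm_field_int(data, UVM_ALL_ON)
  `uvm_object_utils_end
  function new(string name = "my_item");
    super.new(name);
  endfunction
endclass

`uvm_analysis_imp_decl(_src)
`uvm_analysis_imp_decl(_dst)

class sb_sketch extends uvm_scoreboard;
  `uvm_component_utils(sb_sketch)

  uvm_analysis_imp_src #(my_item, sb_sketch) src_export;
  uvm_analysis_imp_dst #(my_item, sb_sketch) dst_export;
  my_item src_q[$];  // source items waiting for their destination counterpart

  function new(string name, uvm_component parent);
    super.new(name, parent);
    src_export = new("src_export", this);
    dst_export = new("dst_export", this);
  endfunction

  // Called by the source monitor: just remember the item.
  function void write_src(my_item t);
    src_q.push_back(t);
  endfunction

  // Called by the destination monitor: pop the oldest source item and compare.
  function void write_dst(my_item t);
    my_item exp;
    if (src_q.size() == 0) begin
      `uvm_error("SB", "destination item received with no source item pending")
      return;
    end
    exp = src_q.pop_front();
    if (!t.compare(exp))
      `uvm_error("SB", "source/destination mismatch")
  endfunction

  // Extend the run phase while source items are still waiting.
  function void phase_ready_to_end(uvm_phase phase);
    if (phase.get_name() != "run") return;
    if (src_q.size() != 0) begin
      phase.raise_objection(this, "source items still pending");
      fork
        begin
          wait (src_q.size() == 0);  // drained by later write_dst calls
          phase.drop_objection(this, "source queue drained");
        end
      join_none
    end
  endfunction
endclass
```

In this arrangement nothing in run_phase needs to pop the queue: the queue shrinks only because the destination monitor keeps calling write_dst while the objection holds the phase open.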

Can someone explain the control flow of modules in System Verilog

I know how to link modules but could someone explain the flow of calling the modules to be used when I want it to be used.
Like have a state machine and depending on the state I can call a module to activate, or like if I need to repeat a process how to go back to a module earlier in a state machine.
again I get the instantiating part like this
wire clk;
wire sig;
wire out;
A a(clk, sig, topout);
B b(clk, sig);
endmodule
but can someone explain how to call modules and how the control flow works in general for them?
(I am new to HDLs so I appreciate any help)
Verilog is a language specifically developed to simulate the behavior of hardware. Hardware is a set of transistors and other elements which are always statically present and function in parallel. The functioning of such elements can be enabled or disabled, but the hardware is still present.
Verilog is similar to the hardware in the sense that all its elements are always present, intended for parallel functioning.
The basic functional elements of Verilog are gates, primitives and procedural blocks (i.e., always blocks). Those blocks are connected by wires.
All those elements are then grouped in modules. Modules are used to create a logical partitioning of the hardware model. They cannot be 'called'. They can be instantiated in a hierarchical manner to describe the logical structure of the model. They cannot be instantiated conditionally at run time, since they represent pieces of hardware. Different module instances are connected via wires to express hierarchical connectivity between lower-level elements.
There is one exception however, the contents of an always block is pure software. It describes an algorithmic behavior of it and as such, software flow constructs are allowed inside always block (specific considerations must be used to make it synthesizable).
As it comes to simulation, Verilog implements an event-driven simulation mode which is intended to mimic parallel execution of hardware. In other words, a Verilog low level primitive (gate or always block) is executed only if at least one of its inputs changes.
The only flow control which is usually used in such models is a sequence of input events and clocks. The latter are used to synchronize results of multiple parallel operations and to organize pipes or other sequential functions.
As I mentioned before, hardware elements can be enabled or disabled by different methods, so the only further control you can exercise is by implementing such methods in your hardware description. For example, all hardware inside a particular module can be turned off by disabling the clock signal which the module uses. There could be specific enable/disable signals on wires or in registers, and so on.
Now to your question: your code defines hierarchical instantiation of a couple of modules.
module top(out);
output wire out;
wire clk;
wire sig;
A a(clk, sig, out);
B b(clk, sig);
endmodule
Module 'top' (missing in your example) contains instances of two other modules, A and B. A and B are module definitions. They are instantiated as corresponding instances 'a' and 'b'. The instances are connected by signals 'clk', which is probably a clock signal, some signal 'sig' which is probably an output of one of the modules and input in another. 'out' is output of module top, which is probably connected to another module or an element in a higher level of hierarchy, not shown here.
The flow control in some sense is defined by the input/output relation between modules 'A' and 'B'. For example:
module A(input clk, input sig, output out);
assign out = sig;
...
endmodule
module B(input clk, output reg sig);
always @(posedge clk) sig <= some-new-value;
...
endmodule
However, in general it is defined by the input/output relation of the internal elements inside module (always blocks in the above example). input/output at the module port level is mostly used for semantic checking.
In event-driven simulation it does not matter which module's hardware is executed first. However, as soon as the value of 'sig' changes in the always @(posedge clk) block of module 'B', simulation will cause the hardware in module 'A' (the assign statement) to be evaluated (or re-evaluated). This is the only way you can express a sequence in the flow at this level. Same as in hardware.
If you are like me you are looking at Verilog with the background of a software programmer. Confident in the idea that a program executes linearly. You think of ordered execution. Line 1 before line 2...
Verilog at its heart wants to execute all the lines simultaneously. All the time.
This is a very parallel way to program and until you get it, you will struggle to think the right way about it. It is not how normal software works. (I recall it took me weeks to get my head around it.)
You can prefix blocks of simultaneous execution with conditions, which say: execute the lines in this block when the condition is true, for as long as the condition is true. One class of such conditions is the rising edge of a clock: always @(posedge clk). Using this leads to a block of code that executes once every time the clk ticks (up).
Modules are not like subroutines. They are more like C macros - they effectively inline blocks of code where you place them. These blocks of code execute all the time any conditions that apply to them are true. Typically you conditionalize the internals of a module on the state of the module arguments (or internal register state). It is the connectivity of the modules through the shared arguments that ensures the logic of a system works together.
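A small sketch may help tie these answers together: instead of "calling" a module from a state, the state machine drives an enable signal into a module instance that is always present. The module names worker and top, the states, and the terminal count of 9 are invented for the illustration.

```systemverilog
// Always-present hardware: it only counts while 'en' is high.
module worker(input  logic clk, input logic rst, input logic en,
              output logic [3:0] count);
  always_ff @(posedge clk)
    if (rst)     count <= '0;
    else if (en) count <= count + 4'd1;
    else         count <= '0;        // cleared while disabled
endmodule

module top(input logic clk, input logic rst, output logic [3:0] count);
  typedef enum logic [1:0] {IDLE, RUN, DONE} state_t;
  state_t state;
  logic   en;

  // The "call" is just an enable derived from the current state.
  assign en = (state == RUN);

  always_ff @(posedge clk)
    if (rst) state <= IDLE;
    else
      case (state)
        IDLE:    state <= RUN;                      // start the worker
        RUN:     if (count == 4'd9) state <= DONE;  // leave RUN once count reaches 9
        DONE:    state <= IDLE;                     // going back repeats the process
        default: state <= IDLE;
      endcase

  // Instantiated once, unconditionally; it exists in every state.
  worker u_worker(.clk(clk), .rst(rst), .en(en), .count(count));
endmodule
```

Repeating a process is then a matter of the state machine returning to the state that asserts the enable, not of re-entering the module.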

Order in always_comb block

I have the impression that in an always_comb block, all the assignments should work in parallel. That is, if I have
always_comb
begin
a = b;
b = c;
end
Then a should be equal to c regardless of the order of the two lines above in the always_comb block, as they are evaluated concurrently anyway. However, today I ran into an issue where changing the order of those two lines gives different results! Why is that?
The statements within a begin/end block execute serially. It does not matter if you are using an always_comb or any other kind of always block. But you are using blocking assignments, not non-blocking assignments, which is the proper thing to do in an always_comb block. Non-blocking assignments are used to assign sequential logic, which implies storage of the current and next state.
This difference stems from the fact that combinational always blocks cannot "self-trigger". When a signal changes value, the simulator locates all always blocks with that signal in their sensitivity list, then executes them one by one sequentially. But only if that block is not already running! In your case the expected behavior would require the block to run twice, but instead only one iteration occurs for every update of c.
The situation is unfortunate since the sensitivity list is a simulator concept and generally ignored altogether for synthesis. Most synthesis tools would generate a wire from your code without producing any warning, creating a simulation-synthesis mismatch.
Note that an explicit sensitivity list (e.g., always @(b or c)) does not make any difference. One solution is to always ensure that the assignments are in the right order. Another is to use non-blocking assignments, but this is generally advised against since it slows down the simulator. (Note that VHDL does not have blocking assignments, and would thus always have this performance penalty. On the plus side you do not have problems like this.)
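The "right order" fix can be sketched as follows: with blocking assignments, each value is written before it is read, so one pass through the block is enough. The module name order_demo and the stimulus are made up for the illustration.

```systemverilog
module order_demo;
  logic a, b, c;

  // Statements run top to bottom: b is updated first, then a reads the new b.
  always_comb begin
    b = c;
    a = b;
  end

  initial begin
    c = 1'b0;
    #1 c = 1'b1;                               // triggers the always_comb block once
    #1 $display("a=%b b=%b c=%b", a, b, c);    // all three should agree
  end
endmodule
```

With the two assignments swapped, a would pick up the value b had before the current pass, which is the mismatch described in the question.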

How can I create a task which drives an output across time without using globals?

I want to write some tasks in a package, and then import that package in some files that use those tasks.
One of these tasks toggles a reset signal. The task is reset_board and the code is as follows:
package tb_pkg;
task reset_board(output logic rst);
rst <= 1'b0;
#1;
rst <= 1'b1;
#1;
rst <= 1'b0;
#1;
endtask
endpackage
However, if I understand this correctly, outputs are only assigned at the end of execution, so in this case, the rst signal will just get set to 0 at the end of the task's execution, which is obviously not what I want.
If this task were declared locally in the module in which it is used, I could refer to the rst signal directly (since it is declared in the module). However, this would not allow me to put the task in a separate package. I could put the task in a file and then `include it in the module, but I'm trying to avoid the nasty complications that come with the way SystemVerilog handles includes (long-story-short, it doesn't work the way C does).
So, is there any way that the task can drive an output with different values across the duration of its execution without it having to refer to a global variable?
A quick solution is to use a ref that passes the task argument by reference instead of an output argument that is copied after returning from the task.
task automatic reset_board(ref logic rst);
There are a few drawbacks of doing it this way. The task must have an automatic lifetime, because pass-by-reference is not allowed for static subroutines. You can only pass variables of matching types by reference, so when you call reset_board(*signal*), signal cannot be a wire. Another problem is that you cannot use an NBA <= to assign a variable passed by reference; you must use a blocking assignment =. This is because you are allowed to pass automatic variables by reference to a task, but automatic variables are not allowed to be assigned by NBAs. There is no way for the task to check the storage type of the argument passed to it.
Standard methodologies like the UVM recommend using virtual interfaces or abstract classes to create these kinds of connections from the testbench to the DUT. See my DVCon paper for more information.
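Putting the ref suggestion together with the original package gives roughly the following sketch. The task is declared automatic (pass-by-reference requires an automatic lifetime) and uses blocking assignments; the testbench module tb and its $monitor call are additions for the illustration only.

```systemverilog
package tb_pkg;
  // 'ref' makes every assignment visible to the caller immediately.
  task automatic reset_board(ref logic rst);
    rst = 1'b0;   // blocking assignments: NBAs are not allowed on ref arguments
    #1;
    rst = 1'b1;
    #1;
    rst = 1'b0;
    #1;
  endtask
endpackage

module tb;        // hypothetical testbench, not part of the question
  import tb_pkg::*;

  logic rst;      // must be a variable of matching type, not a wire

  initial $monitor("time=%0t rst=%b", $time, rst);

  initial begin
    reset_board(rst);   // rst toggles while the task is still running
    $finish;
  end
endmodule
```

Because the argument is a reference, the pulse on rst is seen by the rest of the testbench as the task executes, not only when it returns.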