System Verilog Testbench waveforms no data - simulation

I'm trying to develop code that acts like a logical calculator. I've managed to compile both the code and the testbench without any errors. Here is the code:
module AriLogCal(
input logic [3:0] OpA, OpB, //Operands A and B. The two numbers we will operate on.
input logic [2:0] DoOpt, //Operator. Determines the operation we will do.
input logic EqualTo, AC, //Interrupts. AC resets, EqualTo transfers data to display.
output logic [6:0] S2, S1, S0 //Seven-Segement LEDS. Shows each digit separately.
);
logic [7:0] result; //Result. Mathematical operation result data is stored here.
logic [3:0] D2, D1, D0; //Digits. Determines the number/symbol/deactivation for each respective SevenSeg.
always begin
if(AC)begin //Makes all the numbers display 0 if AC returns TRUE
result=8'b00000000;
S0=7'b1111110;
S1=7'b1111110;
S2=7'b1111110;
end
else if(EqualTo)begin //Does this stuff if EqualTo returns TRUE
//Part 1: Operation. Decides the relationship between Operand A and B and stores data under "result"
case(DoOpt)
3'b000:result=OpA+OpB; //Addition
3'b001:begin //Subtraction
if(OpB>OpA)
result=OpB-OpA;
else
result=OpA-OpB;
end
3'b010:result=OpA*OpB; //Multiplication
3'b011:begin //Division
if(OpB)
result=OpA/OpB;
else
result=0;
end
3'b100:begin
if(OpA&&OpB) //Logical AND
result=8'b00000001;
else
result=8'b00000000;
end
3'b101:begin
if(OpA||OpB) //Logical OR
result=8'b00000001;
else result=8'b00000000;
end
endcase
//Part 2: Digits. Dissects the value of "result" into its decimal digits and stores them in logic "D"
if(!OpB&&DoOpt==3'b011) //This will show "Err" on LED displays
D0=4'b1010;
else if(result<10)begin //Single Digit. S1 and S2 is temporarily set to zero
D0=result;
D1=4'b0000;
D2=4'b0000;
end
else if(result<100)begin //Double digit. S2 is temporarily set to zero
D0=result%10;
D1=result/10;
D2=4'b0000;
end
else begin //Triple digit.
D2=result/100;
result=result%100;
D1=result/10;
D0=result%10;
end
//Part 3: Blanks. Adds blanks and negative sign depending on operation type, according to requirements
case(DoOpt)
3'b000:D2=4'b1011; //Addition deactivates S2
3'b001:begin
if(OpB>OpA) //Subtraction deactivates or shows negative sign for S2
D2=4'b1100;
else
D2=4'b1011;
end
3'b011:begin //Multiplcation is skipped.
if(!OpB)begin //Division has two options:
D0=4'b1010; //If divider is 0, this will show "Err" on LED displays
D1=4'b1010;
D2=4'b1010;
end else //Otherwise, S2 is deactivated
D2=4'b0000;
end
3'b100:begin //Logical AND deactivates S2 and S1
D2=4'b1011;
D1=4'b1011;
end
3'b101:begin //Logical OR deactivates S2 and S1
D2=4'b1011;
D1=4'b1011;
end
endcase
//Part 4: Display. Prints the digits from "D" onto its respective Seven Segment LED S
case(D0)
4'b1010: S0<=7'b0000101; //D0=10 means S0 displays R
4'b1001: S0<=7'b1110011; //9
4'b1000: S0<=7'b1111111; //8
4'b0111: S0<=7'b1110000; //7
4'b0110: S0<=7'b1011111; //6
4'b0101: S0<=7'b1011011; //5
4'b0100: S0<=7'b0110011; //4
4'b0011: S0<=7'b1111001; //3
4'b0010: S0<=7'b1101101; //2
4'b0001: S0<=7'b0110000; //1
4'b0000: S0<=7'b1111110; //0
endcase
case(D1)
4'b1011: S1<=7'b0000000; //D1=11 means S1 deactivates
4'b1010: S1<=7'b0000101; //D1=10 means S1 displays R
4'b1001: S1<=7'b1110011; //9
4'b1000: S1<=7'b1111111; //8
4'b0111: S1<=7'b1110000; //7
4'b0110: S1<=7'b1011111; //6
4'b0101: S1<=7'b1011011; //5
4'b0100: S1<=7'b0110011; //4
4'b0011: S1<=7'b1111001; //3
4'b0010: S1<=7'b1101101; //2
4'b0001: S1<=7'b0110000; //1
4'b0000: S1<=7'b1111110; //0
endcase
case(D2)
4'b1100: S2<=7'b0000001; //D2=12 means S2 shows negative sign
4'b1011: S2<=7'b0000000; //D2=11 means S2 deactivates
4'b1010: S2<=7'b1001111; //D2=10 means S2 displays E
4'b1001: S2<=7'b1110011; //9
4'b1000: S2<=7'b1111111; //8
4'b0111: S2<=7'b1110000; //7
4'b0110: S2<=7'b1011111; //6
4'b0101: S2<=7'b1011011; //5
4'b0100: S2<=7'b0110011; //4
4'b0011: S2<=7'b1111001; //3
4'b0010: S2<=7'b1101101; //2
4'b0001: S2<=7'b0110000; //1
4'b0000: S2<=7'b1111110; //0
endcase
end
end
endmodule
and here is the current testbench (this is a shorter version; I'm still trying to find the problem behind this)
`timescale 1ns/1ps
module AriLogCal_tb;
logic [3:0] in_OpA;
logic [3:0] in_OpB;
logic [2:0] in_DoOpt;
logic in_EqualTo;
logic in_AC;
logic [6:0] out_S2, out_S1, out_S0;
AriLogCal AriLogCal_inst0(.OpA(in_OpA), .OpB(in_OpB), .DoOpt(in_DoOpt),
.EqualTo(in_EqualTo), .AC(in_AC), .S2(out_S2), .S1(out_S1), .S0(out_S0));
initial begin
in_EqualTo=1'b0;
in_AC=1'b0;
in_OpA = 4'b0111; in_OpB = 4'b0010; in_DoOpt = 3'b000;
in_EqualTo = 1'b0;#100;
$finish;
end
endmodule
Both of these files are able to individually compile successfully, with no errors. However, when I try to compile them in the RTL Simulator, I get these results:
https://drive.google.com/file/d/0By4LCb9TUml0WWVsZEYtcG03LVk/view?usp=sharing
Why do I still get "No Data" in my results, despite successful compilation? Immediate help will be appreciated. Thanks in advance.

There are no event or time controls in AriLogCal's always block. always begin is an infinite loop: it continuously re-evaluates and prevents the simulator from moving to the next time step.
It should be changed to always_comb begin, which inherently blocks until an input changes, triggering only at time 0 and when a stimulus signal changes. Alternatively, you could use the Verilog auto-sensitivity @* (or the synonymous @(*)) and change the statement to always @* begin. always_comb is superior to always @* because it will throw compile errors if basic synthesis requirements are not satisfied (e.g., a variable is assigned in only one always block, and there are no blocking @ or # statements in the always block).
FYI: You should not be using non-blocking (<=) assignments in combinational logic; blocking (=) assignments are preferred. Non-blocking assignments should be used in always_ff and the occasional always_latch.
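As a rough illustration of the points above, here is a minimal sketch of a combinational block written with always_comb and blocking assignments. The module names add_comb_sketch and add_comb_sketch_tb are hypothetical, and the design covers only the addition path, not the full calculator.

```systemverilog
// Hypothetical cut-down module: only the addition path of the calculator.
module add_comb_sketch (
  input  logic [3:0] OpA, OpB,
  output logic [7:0] result
);
  // always_comb runs once at time 0 and again whenever OpA or OpB changes,
  // so the simulator is free to advance time between evaluations.
  always_comb begin
    result = OpA + OpB;  // blocking assignment for combinational logic
  end
endmodule

// Hypothetical testbench: drives one pair of operands and lets time pass.
module add_comb_sketch_tb;
  logic [3:0] a, b;
  logic [7:0] r;

  add_comb_sketch dut (.OpA(a), .OpB(b), .result(r));

  initial begin
    a = 4'd7;
    b = 4'd2;
    #100 $display("t=%0t result=%0d", $time, r);
    $finish;
  end
endmodule
```

Because the block suspends between input changes, the #100 delay in the testbench can elapse and the waveform viewer has data to show.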

Related

Verilog Pipelined Multiplier Intuition

I'm trying to understand how the following code works, but I'm struggling to put it together in my head. Could someone give me a more intuitive (visual) explanation of how this pipelined multiplier stage works?
// This is one stage of an 8 stage (9 depending on how you look at it)
// pipelined multiplier that multiplies 2 64-bit integers and returns
// the low 64 bits of the result. This is not an ideal multiplier but
// is sufficient to allow a faster clock period than straight *
module mult_stage(
input clock, reset, start,
input [63:0] product_in, mplier_in, mcand_in,
output logic done,
output logic [63:0] product_out, mplier_out, mcand_out
);
logic [63:0] prod_in_reg, partial_prod_reg;
logic [63:0] partial_product, next_mplier, next_mcand;
assign product_out = prod_in_reg + partial_prod_reg;
assign partial_product = mplier_in[7:0] * mcand_in;
assign next_mplier = {8'b0,mplier_in[63:8]};
assign next_mcand = {mcand_in[55:0],8'b0};
//synopsys sync_set_reset "reset"
always_ff @(posedge clock) begin
prod_in_reg <= #1 product_in;
partial_prod_reg <= #1 partial_product;
mplier_out <= #1 next_mplier;
mcand_out <= #1 next_mcand;
end
// synopsys sync_set_reset "reset"
always_ff @(posedge clock) begin
if(reset)
done <= #1 1'b0;
else
done <= #1 start;
end
endmodule

Output is Always X

Whenever I write a testbench for my systemverilog code, the output seems to always be X even though the implementation is correct. Where is my error?
`timescale 1ns / 1ps
module fsm( input logic clk, input logic reset,
input logic start, clockwise,
output logic [3:0] pattern);
parameter A=4'b1100,
B=4'b0110,
Ab=4'b0011,
Bb=4'b1001;
typedef enum logic [1:0] {S0,S1,S2,S3} statetype;
statetype state, nextstate;
//state register
always @(posedge clk)
begin
if (reset)
state= S0;
else
state = nextstate;
end
//nextstate logic
always_comb
case(state)
S0: if(start==1 && clockwise==0)
nextstate<= S3;
else if(start==1&&clockwise==1)
nextstate<=S1;
else
nextstate<=S0;
S1: if(start==1 && clockwise==0)
nextstate<= S0;
else if(start==1&&clockwise==1)
nextstate<=S2;
else
nextstate<=S1;
S2: if(start==1 && clockwise==0)
nextstate<= S1;
else if(start==1&&clockwise==1)
nextstate<=S3;
else
nextstate<=S2;
S3: if(start==1 && clockwise==0)
nextstate<= S2;
else if(start==1&&clockwise==1)
nextstate<=S0;
else
nextstate<=S3;
endcase
//output logic
always @(posedge clk)
case(state)
S0: pattern= A;
S1: pattern= B;
S2: pattern= Ab;
S3: pattern= Bb;
endcase
endmodule
and here is my testbench
module fsmtest();
logic clk, reset, clockwise, start;
logic [3:0] pattern;
fsm dut(clk, reset, start, clockwise, pattern);
//generate clock
always
begin
clk=0; #5; clk=1; #5;
end
initial
begin
reset=0;
start=1;
clockwise=1;
#10;
start=0;
#10;
end
endmodule
I'm not sure if it is my finite state machine that is wrong or if it's the testbench. Hoping to get some help, thanks in advance.
You never asserted reset, so your state machine remains uninitialized. You should fix this by adding a default branch to your case statement. Then, if your DUT ever comes up in an un-encoded state, it is guaranteed to get into a known state.
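A minimal sketch of a testbench that does assert reset, assuming the fsm module from the question is compiled alongside it. The module name fsmtest_with_reset and the choice to hold reset for two clock cycles are assumptions made for illustration.

```systemverilog
// Hypothetical testbench: same DUT hookup as the question, plus a reset pulse.
module fsmtest_with_reset;
  logic clk = 0;
  logic reset, clockwise, start;
  logic [3:0] pattern;

  fsm dut(clk, reset, start, clockwise, pattern);

  // Free-running clock with a 10 ns period
  always #5 clk = ~clk;

  initial begin
    // Hold reset high across two rising clock edges so state gets a known value
    reset     = 1;
    start     = 0;
    clockwise = 1;
    repeat (2) @(negedge clk);

    // Release reset away from the rising edge to avoid a race with the DUT
    reset = 0;
    start = 1;
    repeat (4) @(negedge clk);

    start = 0;
    repeat (2) @(negedge clk);
    $finish;
  end
endmodule
```

Changing the stimulus on the falling edge is a design choice here: the DUT samples on the rising edge, so its inputs are stable when it reads them.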

How to cover latency between request and response

Let's say we have a protocol where request req is asserted with req_id and the corresponding rsp will be asserted with rsp_id. These can be out of order. I want to cover the number of clks, or latency, between a req with a particular req_id and the rsp with the same id. I tried something like this. Is this the correct way of doing it? Is there a more efficient way?
covergroup cg with function sample(int a);
coverpoint a {
a1: bins short_latency = {[0:10]};
a2: bins med_latency = {[11:100]};
a3: bins long_latency = {[101:1000]};
}
endgroup
// Somewhere in code
cg cg_inst = new();
sequence s;
int lat;
int id;
@(posedge clk) disable iff (~rst)
(req, id = req_id, lat = 0) |-> ##[1:$] ((1'b1, lat++) and (rsp && rsp_id == id, cg_inst.sample(lat)));
endsequence
You're trying to use the |-> operator inside a sequence, but it is only allowed inside a property.
If rsp can come no earlier than one cycle after req, then this code should work:
property trans;
int lat, id;
(req, id = req_id, lat = 0) |=> (1, lat++) [*0:$] ##1 rsp && rsp_id == id
##0 (1, $display("lat = %0d", lat));
endproperty
The element after ##0 is there for debugging. You can omit it in production code.
I wouldn't mix assertions and coverage like this, though, as I've seen that the implication operators can cause issues with variable flow (i.e. lat won't get updated properly). You should have a property that just covers that you've seen a matching response after a request:
property cov_trans;
int lat, id;
(req, id = req_id, lat = 0) ##1 (1, lat++) [*0:$] ##1 rsp && rsp_id == id
##0 (1, $display("cov_lat = %0d", lat));
endproperty
cover property (cov_trans);
Notice that I've used ##1 to separate the request from the response.
Basically your idea is right, but it looks like the right-hand side of the sequence will be evaluated only once when the condition is true, and hence lat will be incremented only once.
You will need a loop mechanism to count the latency.
Below is a sample working example. You can change [1:$], ##1, etc. based on how close together the signals are generated.
property ps;
int lat;
int id;
@(posedge clk)
disable iff (~rst)
(req, id = req_id, lat = 0) |=> (1'b1, lat++)[*1:$] ##1 (rsp && rsp_id == id, cg_inst.sample(lat));
endproperty
assert property (ps);
Alternatively...
Properties and sequences may look like small pieces of code, but in this case a separate process with its own counter is forked for every req that has not yet received a rsp. This results in many counters doing very similar work. If there are many requests in flight (and/or many instances of the property or sequence), this starts adding to simulation run time, even though it is just a small block of code.
So another approach is to keep the trigger simpler and the processing linear.
int counter=0; // you can use a larger variable size to avoid the roll-over issue
int arr1[int] ; // can use array[MAX_SIZE] if you know the max request id is small
always @( posedge clk ) counter <= counter + 1 ; // simple counter
function int latency (int type_set_get , int a ) ;
if ( type_set_get == 0 ) arr1[a] = counter; // set
//DEBUG $display(" req id %d latency %d",a,counter-arr1[a]);
// for roll-over - if ( arr1[a] > counter ) return ( MAX_VAL_SIZE - arr1[a] + counter ) ;
return (counter - arr1[a]); //return the difference between captured clock and current clock .
endfunction
property ps();
@(posedge clk)
disable iff (~rst)
##[0:$]( (req,latency(0,req_id) ) or (rsp,cg_inst.sample(latency(1,rsp_id))) );
endproperty
assert property (ps);
The above property is triggered only when req or rsp is seen, and only one thread is active looking for them.
If needed, extra checks can be added to the function, but for latency counting this should be fine.
Anecdote:
Dan, a Mentor AE, discovered an assertion that was slowing our simulations by as much as 40%. The poorly written assertion was part of our block-level testbench, and its effects went unnoticed there because our block-level test run times were limited. It then sneaked into our top-level testbench, causing untold runtime losses until it was discovered a year later :). [Guess we should have profiled our simulation runs earlier.]
For example, if the above protocol implemented an abort at a later time, the req-rsp thread would continue to process and wait (until the simulation ends) for an aborted transaction. It would not affect functionality, but it would quietly keep hogging processor resources while doing nothing useful in return, until finally a vendor AE steps in to save the day :)

Verilog testbench design for my MSB downsampling module

A couple of days ago I asked (here) about a module I wanted to implement, which takes the MSBs of input samples, accumulates them (by shifting), and combines them into the output sample when the 32-bit output is "filled".
Thanks to the help there, I got this implementation, which doesn't produce any compilation errors and seemed fine with Xilinx 12.1:
module my_rx_dsp0
#(
//frontend bus width
parameter WIDTH = 24
)
(
//control signals
input clock, //dsp clock
input reset, //active high synchronous reset
input clear, //active high on packet control init
input enable, //active high when streaming enabled
//user settings bus, controlled through user setting regs API
input set_stb, input [7:0] set_addr, input [31:0] set_data,
//full rate inputs directly from the RX frontend
input [WIDTH-1:0] frontend_i,
input [WIDTH-1:0] frontend_q,
//full rate outputs directly to the DDC chain
output [WIDTH-1:0] ddc_in_i,
output [WIDTH-1:0] ddc_in_q,
//strobed samples {I16,Q16} from the RX DDC chain
input [31:0] ddc_out_sample,
input ddc_out_strobe, //high on valid sample
output ddc_out_enable, //enables DDC module
//strobbed baseband samples {I16,Q16} from this module
output reg [31:0] bb_sample,
output reg bb_strobe //high on valid sample
);
reg [3:0] i_msb;
reg [3:0] q_msb;
reg [31:0] temp_buff = 0;
reg [1:0] count = 0;
always @(posedge clock) begin
if(ddc_out_strobe) begin
// bit shifter for MSB
temp_buff <= {i_msb,q_msb,temp_buff[31:8]};
// to avoid if-else condition
count <= (count==2'd3) ? 2'd0 : (count+1);
end
end
always @(*) begin
i_msb = ddc_out_sample[31:28];
q_msb = ddc_out_sample[15:12];
// to avoid if-else condition
bb_strobe = (count==2'd3);
bb_sample = bb_strobe ? temp_buff : 32'd0;
end
assign ddc_in_i = frontend_i;
assign ddc_in_q = frontend_q;
assign ddc_out_enable = enable;
endmodule //my_rx_dsp0_custom
Now I wanted to implement a testbench that tests my_rx_dsp0.v with some examples.
I implemented a my_rx_dsp0_tb_2.v, which reads 32 bit samples from a file named my_input.dat to feed to the module as inputs ddc_out_sample.
They are then compared to the correct values stored at my_output.dat.
Note: I did not write this testbench myself, I adapted it from another testbench from an open-source project.
Here is the implementation:
module my_rx_dsp0_tb ( );
reg clk;
reg reset;
reg enable;
reg ddc_out_strobe; //high on valid sample
reg [31:0] ddc_out_sample;
wire [31:0] bb_sample = 32'd0;
wire bb_strobe;
wire ddc_out_enable = 1'b1; //enables DDC module
parameter WIDTH = 24;
parameter clocks = 2; // number of clocks per input
reg endofsim = 0;
integer number_of_errors;
initial number_of_errors = 0;
wire set_stb = 1;
wire [7:0] set_addr;
wire [31:0] set_data;
wire [WIDTH-1:0] frontend_i;
wire [WIDTH-1:0] frontend_q;
wire [WIDTH-1:0] ddc_in_i;
wire [WIDTH-1:0] ddc_in_q;
reg signed [31:0] compare_out;
// Setup the clock
initial clk = 1'b0;
always #5 clk <= ~clk ;
// Come out of reset after a while
initial reset = 1'b1 ;
initial #1000 reset = 1'b0 ;
// Enable the entire system
initial enable = 1'b1 ;
// Instantiate UUT
my_rx_dsp0 #(.WIDTH(WIDTH)) UUT_rx_dsp0
( .clock(clk), .reset(reset), .clear(clear), .enable(enable),
.set_stb(set_stb), .set_addr(set_addr), .set_data(set_data),
.frontend_i(frontend_i), .frontend_q(frontend_q),
.ddc_in_i(ddc_in_i), .ddc_in_q(ddc_in_q),
.ddc_out_sample(ddc_out_sample), .ddc_out_strobe(ddc_out_strobe), .ddc_out_enable(ddc_out_enable),
.bb_sample(bb_sample), .bb_strobe(bb_strobe) );
//-------Setup file IO-------//
//
integer i, r_in, r_out, infile, outfile;
initial begin
infile = $fopen("my_input.dat","r");
outfile = $fopen("my_output.dat","r");
$timeformat(-9, 2, " ns", 10) ;
// for n=9,p=2 digits after decimal pointer
//min_field_width=10 number of character positions for %t
end
//-------Get sim values and display errors-------//
//
initial begin
// Initialize inputs
ddc_out_strobe <= 1'd0;
ddc_out_sample <= 32'd0;
// Wait for reset to go away
@(negedge reset) #0;
while(!endofsim) begin
// Write the input from the file or 0 if EndOfFile(EOF)
@(posedge clk) begin
#1
ddc_out_strobe <= 1'b1;
if(!$feof(infile))
r_in = $fscanf(infile,"%b\n",ddc_out_sample);
else
ddc_out_sample <= 32'd0;
end
//
// Clocked in; set the strobe to 0 if the # of clocks/sample
// is greater than 1
if( clocks > 1 ) begin
@(posedge clk) begin
ddc_out_strobe <= 1'b0 ;
end
// Wait for the specified # of cycles
for( i = 0 ; i < (clocks-2) ; i = i + 1 ) begin
@(posedge clk) #1 ;
end
end
//
//
// Print out the number of errors that occured
if(number_of_errors) begin
$display("FAILED: %d errors during simulation",number_of_errors) ;
end else begin
$display("PASSED: Simulation successful") ;
end
//
end
end
//-------Comparison btwn simulated values vs known good values-------//
//
always @(posedge clk) begin
if(reset)
endofsim <= 1'b0 ;
else begin
if(!$feof(outfile)) begin
if(bb_strobe) begin
r_out = $fscanf(outfile,"%b\n",compare_out);
if(compare_out != bb_sample) begin
$display("%t: %b != %b",$realtime,bb_sample,compare_out);
number_of_errors = number_of_errors + 1;
end else begin
$display("%t: %b = %b",$realtime,bb_sample,compare_out);
end
end
end else begin
// Signal end of simulation when no more outputs
endofsim <= 1'b1 ;
end
end
end
endmodule // my_rx_dsp0_tb
When simulating with ISim from Xilinx ISE Suite Edition 12.1 I do not get the desired functionality from the module. I am afraid the output contains several x states (unknown states), instead of 1s or 0s as expected.
Question Is this due to:
1) The way the files are being read with $fscanf?
2) Did I go wrong by initializing reg [31:0] temp_buff = 0?
3) Or does someone have an idea on what went wrong?
The error prompts from the testbench are (as an example):
xx000x00xxx00x0xx000x0x000000000 != 10000110111001011100010001101100
The X comes from having multiple conflicting drivers on bb_sample and ddc_out_enable. The wire type merges the drivers; conflicting bit values of the same strength resolve to X.
UUT_rx_dsp0 is the intended driver. However, you added additional drivers through the way you declared your wires.
...
wire [31:0] bb_sample = 32'd0; // "= 32'd0" is a continuous driver
wire bb_strobe;
wire ddc_out_enable = 1'b1; // "= 1'b1" is a continuous driver
...
What you want is:
...
wire [31:0] bb_sample;
wire bb_strobe;
wire ddc_out_enable;
...
Correcting the above will resolve the X issue. Based on the example error, it looks like there are also data mismatches. With the information provided, it is hard to tell whether that is a testbench or a design issue. It could be just clock or propagation skew.
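To make the driver conflict concrete, here is a small standalone sketch, separate from the testbench above. The module name multi_driver_sketch and the signals drv and w are hypothetical; drv stands in for the UUT output.

```systemverilog
// Hypothetical demo: a wire with a declaration assignment plus a second driver.
module multi_driver_sketch;
  logic drv = 1'b1;   // stands in for the value driven by the UUT port
  wire  w   = 1'b0;   // "= 1'b0" in the declaration is a continuous driver

  assign w = drv;     // second driver on the same wire

  initial begin
    // The two drivers disagree (0 vs 1) at equal strength, so w resolves to x
    #1 $display("drivers disagree: w = %b", w);

    // Once both drivers agree on 0, the wire resolves to 0
    drv = 1'b0;
    #1 $display("drivers agree:    w = %b", w);
    $finish;
  end
endmodule
```

Only the bits where the two drivers disagree become x, which fits the mixed pattern of x and 0 bits in the error message from the question.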

Removing the need to reset the device before using it

I'm having trouble implementing a controller block for an 8-bit multiplier. It works normally, but only if I turn the reset wire on, then off, such as in the following stimulus (which works fine):
`timescale 1ns / 100ps
module Controller_tb(
);
reg reset;
reg START;
reg clk;
reg LSB;
wire STOP;
wire ADD_cmd;
wire SHIFT_cmd;
wire LOAD_cmd;
Controller dut (.reset(reset),
.START(START),
.clk(clk),
.LSB(LSB),
.STOP(STOP),
.ADD_cmd(ADD_cmd),
.SHIFT_cmd(SHIFT_cmd),
.LOAD_cmd(LOAD_cmd)
);
always
begin
clk <= 0;
#25;
clk <= 1;
#25;
end
initial
begin
LSB <= 0;
START <= 0;
reset <= 1;
#55;
reset <= 0;
#10;
START <= 1;
#100;
START <= 0;
LSB <= 1;
#200;
#20;
#100;
end
initial
$monitor ("stop,shift_cmd,load_cmd, add_cmd: " , STOP,SHIFT_cmd,LOAD_cmd,ADD_cmd);
endmodule
Here's the simulation result for the working stimulus:
Now, when I set the reset to zero, without ever bringing it high, here's what happens:
Clearly, I'm using the reset wire to bring my Controller to the IDLE state. Here's the code for the controller block:
`timescale 1ns / 1ps
module Controller(
input reset,
input START,
output STOP,
input clk,
input LSB,
output ADD_cmd,
output SHIFT_cmd,
output LOAD_cmd
);
//Five states:
//IDLE : 000 , INIT: 001, TEST: 011, ADD: 010, SHIFT: 110
localparam [2:0] S_IDLE = 0;
localparam [2:0] S_INIT = 1;
localparam [2:0] S_TEST = 2;
localparam [2:0] S_ADD = 3;
localparam [2:0] S_SHIFT = 4;
reg [2:0] state,next_state;
reg [3:0] count;
// didn't assign the outputs to wire.. if not work, check this.
assign ADD_cmd = (state == S_ADD);
assign SHIFT_cmd = (state == S_SHIFT);
assign LOAD_cmd = (state == S_INIT);
assign STOP = (state == S_IDLE);
always @(*) begin
case(state)
S_INIT: begin
count = 3'b000;
end
S_SHIFT: begin
count = count + 1;
end
endcase
end
always @(*)
begin
next_state = state;
case (state)
S_IDLE: next_state = START ? S_INIT : S_IDLE;
S_INIT: next_state = S_TEST;
S_TEST: next_state = LSB ? S_ADD : S_SHIFT;
S_ADD: next_state = S_SHIFT;
S_SHIFT: next_state = (count == 8) ? S_IDLE : S_TEST;
endcase
end
always @(posedge clk)
begin
//state <= S_IDLE;
if(reset) state <= S_IDLE;
else state <= next_state;
end
reg [8*6-1:0] statename;
always @* begin
case( state )
S_IDLE: statename <= "IDLE";
S_INIT: statename <= "INIT";
S_TEST: statename <= "TEST";
S_ADD: statename <= "ADD";
S_SHIFT: statename <= "SHIFT";
default: statename <= "???";
endcase
end
endmodule
I don't know how to fix this. As you can see from the code above, there is a commented portion which is basically always initializing the state to IDLE. But even that doesn't work. Here's the simulation for the code above removing the comment from '//state <= S_IDLE;':
It's going into a different state than any listed above, and I have no idea why.
So I'd like to know:
Why is it going into an unknown state? Why doesn't my uncommented code work?
What can I change for it to work as I intend?
Your problem is that without a reset or initial value, state and next_state will be X. Your case statement assigning to statename will take the default branch and decode to ???. Since your process that assigns next_state does not handle cases where state is X it will get stuck in this state forever.
Your attempt to fix this will not work:
state <= S_IDLE;
if(reset) state <= S_IDLE;
else state <= next_state;
When reset is low you are making two assignments to state, the first as S_IDLE and the second as next_state. This is not a race condition. The Verilog standard states that:
Nonblocking assignments shall be performed in the order the statements were executed.
Since no re-ordering of the event queue occurs for sequential statements within a process this translates to last assignment wins. Therefore your state <= S_IDLE; is effectively optimised away since regardless of the value of reset the assignment will be overridden.
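The "last assignment wins" behaviour can be seen in a small standalone sketch. The module name nba_order_sketch and the values 0 and 5 are hypothetical and chosen only to make the two assignments distinguishable.

```systemverilog
// Hypothetical demo: two nonblocking assignments to one variable in one process.
module nba_order_sketch;
  logic       clk = 0;
  logic [2:0] state;

  always @(posedge clk) begin
    state <= 3'd0;  // scheduled first
    state <= 3'd5;  // scheduled second, so this update is applied last
  end

  initial begin
    #1 clk = 1;                         // single rising edge
    #1 $display("state = %0d", state);  // state holds the later value, 5
    $finish;
  end
endmodule
```

Both updates are applied in the order they were scheduled, so the first one never becomes visible after the time step completes.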
There are two ways you could fix this so that you don't need a reset:
1. Use the default clause to make your state machine safe
always @(*)
begin
next_state = state;
case (state)
S_IDLE: next_state = START ? S_INIT : S_IDLE;
S_INIT: next_state = S_TEST;
S_TEST: next_state = LSB ? S_ADD : S_SHIFT;
S_ADD: next_state = S_SHIFT;
S_SHIFT: next_state = (count == 8) ? S_IDLE : S_TEST;
default: next_state = S_IDLE;
endcase
end
This will ensure that your state-machine is 'safe' and drops into S_IDLE if state is a non-encoded value (including X).
2. Initialise the variable
reg [2:0] state = S_IDLE;
For some synthesis targets (e.g. FPGAs) this will initialise the register to a specific value and can be used alongside or instead of a reset (see Altera Documentation on power-up values).
A couple of general points:
Depending on your synthesis tool it may be better to use an enumeration rather than explicitly defining values for your states. This allows the tool to optimise based on the overall design or use a global configuration for encodings (for example safe, one-hot).
Using a reset on registers that hold state is standard practice, so you should carefully consider whether you really want to avoid using a reset.
The uncommented code is an example of poor coding practice because you are making 2 nonblocking assignments to state in the same timestep. Synthesis linting tools are likely to warn you of this situation.
Since using a reset is a common, good practice, I don't think you need to fix anything.