Topic 7 · Finite State Machines

The Three-Block Template

Video 2 of 4 · ~12 minutes

Dr. Mike Borowczak · Electrical & Computer Engineering · CECS · UCF


🌍 Where This Lives

Where it shows up

At Intel, AMD, Qualcomm, and ARM, internal style guides mandate one specific structural pattern for control logic, and lint and synthesis tools (SpyGlass, Design Compiler) flag deviations. Senior reviewers scan the structure before they read the logic. If the structure is wrong, the review stops there. The pattern is so widespread it works as a calling card: engineers can spot each other's training in five seconds.

When it goes wrong

“Works in sim, broken in silicon” is the most expensive sentence an engineer can write in an email. Almost every instance traces back to two things that should have been kept apart — present and next, combinational and sequential — getting blurred into a single line of code. The bug is not reproducible. It does not show in code review. It passes simulation. It appears once, in the field, on a customer's device.

Career tag: Show a 3-block FSM in an interview and you signal “I've been trained.” Show a tangled 1-block FSM and you signal the opposite.

⚠️ Why Not Write It All in One Block?

❌ Wrong Model

“I'll just use one big always @(posedge clk) that does state transitions, state register, and output logic all together. Fewer blocks = cleaner code.”

✓ Right Model

Three blocks separate three different concerns: (1) state register (how state persists), (2) next-state logic (how state changes), (3) output logic (what state produces). Merging them invites mixed blocking/nonblocking assignments, turns outputs into extra flip-flops in a clocked block or into inferred latches in a combinational one, and makes the circuit hard to reason about. The template exists because engineers have broken FSMs every other way first.

The receipt: 1-block and 2-block FSMs appear in old textbooks. They work for trivial cases. They scale badly. They also teach habits that break at the first non-trivial design.
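
For contrast, here is a minimal sketch of the one-block style described under Wrong Model. The module name, port names, and state encoding are assumptions made for illustration; only one output is shown to keep it short.

```verilog
// Sketch only: the one-block style the slide warns against.
// Module name, ports, and encoding values are assumptions.
module one_block_sketch (
    input  wire i_clk,
    input  wire i_reset,
    input  wire i_timer_done,
    output reg  o_green
);
    localparam [1:0] S_GREEN  = 2'd0,
                     S_YELLOW = 2'd1,
                     S_RED    = 2'd2;

    reg [1:0] r_state;

    // State register, transitions, and output all share one clocked block.
    always @(posedge i_clk) begin
        if (i_reset) begin
            r_state <= S_GREEN;
            o_green <= 1'b1;
        end else begin
            case (r_state)
                S_GREEN:  if (i_timer_done) begin
                              r_state <= S_YELLOW;
                              o_green <= 1'b0;   // must remember to clear here
                          end
                S_YELLOW: if (i_timer_done) r_state <= S_RED;
                S_RED:    if (i_timer_done) begin
                              r_state <= S_GREEN;
                              o_green <= 1'b1;   // and to set again here
                          end
                default:  begin
                              r_state <= S_GREEN;
                              o_green <= 1'b1;
                          end
            endcase
        end
    end
endmodule
```

Because o_green is assigned inside the clocked block, it becomes its own flip-flop and has to be updated by hand on every transition into or out of GREEN. The output is described per transition instead of per state, which is the bookkeeping the three-block template removes.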

👁️ I Do — Block 1: State Register

[Figure: Picture / FSM / RTL views of the three blocks: Block 1 (State Reg), Block 2 (Next-State), Block 3 (Outputs)]
// --- Block 1: state register ---
always @(posedge i_clk) begin
    if (i_reset) r_state <= S_INIT;
    else         r_state <= r_next_state;
end
Block 1 RTL: r_next_state into D pin of DFF; i_clk into clock pin; i_reset into reset pin; r_state at Q output (3 inputs · 1 output · pure DFF)
My thinking: Block 1 is always trivial. It says: “on every clock edge, load r_next_state into r_state; on reset, go to S_INIT.” That's all it ever does. If you find yourself putting logic here, you're doing it wrong — logic goes in Block 2.
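
The Block 1 fragment leaves its declarations off the slide. A minimal sketch of what it assumes is below; the encoding values, the choice S_INIT = S_GREEN, and the stand-in port i_next_state are assumptions, not taken from the lecture's source file.

```verilog
// Sketch only: declarations that the Block 1 fragment relies on.
// Encoding values and the port list are assumptions.
module state_reg_sketch (
    input  wire       i_clk,
    input  wire       i_reset,        // synchronous, active-high
    input  wire [1:0] i_next_state,   // stands in for r_next_state from Block 2
    output wire [1:0] o_state
);
    // Named states instead of magic numbers.
    localparam [1:0] S_GREEN  = 2'd0,
                     S_YELLOW = 2'd1,
                     S_RED    = 2'd2;
    localparam [1:0] S_INIT   = S_GREEN;  // assumed: reset lands in GREEN

    reg [1:0] r_state;

    // Block 1: the only clocked block, and the only nonblocking assignments.
    always @(posedge i_clk) begin
        if (i_reset) r_state <= S_INIT;
        else         r_state <= i_next_state;
    end

    assign o_state = r_state;
endmodule
```

In the full design r_next_state is also declared as a reg, because it is assigned inside an always block, even though Block 2 is purely combinational and builds no flip-flop.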

🤝 We Do — Block 2: Next-State Logic

// --- Block 2: next-state logic (combinational) ---
always @(*) begin
    r_next_state = r_state;            // DEFAULT: stay (no latch)
    case (r_state)
        S_GREEN:  if (timer_done) r_next_state = S_YELLOW;
        S_YELLOW: if (timer_done) r_next_state = S_RED;
        S_RED:    if (timer_done) r_next_state = S_GREEN;
        default:                  r_next_state = S_INIT;   // illegal-state safety
    endcase
end
Block 2 RTL: r_state and timer_done feed a case-decoder that produces r_next_state; the default assignment at the top holds r_state to prevent latches, and the default branch sends illegal states to S_INIT
Combinational case-decoder · no flops
Together — two critical habits:
  1. Default assignment first: r_next_state = r_state; — prevents latch inference and means “if no transition fires, stay put.”
  2. default case at end: recovers from illegal states. For a 2-bit state with only 3 legal values, the 4th is unreachable in theory, but silicon glitches happen in practice.

🤝 We Do — Block 3: Output Logic (Moore)

// --- Block 3: output logic (Moore: f(state) only) ---
always @(*) begin
    // defaults for ALL outputs (no latches)
    o_red    = 1'b0;
    o_yellow = 1'b0;
    o_green  = 1'b0;

    case (r_state)
        S_GREEN:  o_green  = 1'b1;
        S_YELLOW: o_yellow = 1'b1;
        S_RED:    o_red    = 1'b1;
        default: ;  // all off
    endcase
end
Block 3 RTL: r_state into an output decoder producing o_green, o_yellow, o_red. Default-0 above the case prevents latches.
Moore: only r_state drives outputs
Together — the same two habits reappear: defaults at top (for every output, not just one), default: case at end. Same discipline, same reasons, same template. Moore = outputs depend only on r_state; Mealy = outputs also depend on inputs, but the structure stays identical.
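
To make the Moore/Mealy remark concrete, here is a minimal sketch of a Block 3 with one Mealy output added. The input i_ped_request and output o_walk are hypothetical names invented for this example, and the encoding values are assumed.

```verilog
// Sketch only: Block 3 with one Mealy output.
// i_ped_request and o_walk are hypothetical; encoding values are assumed.
module mealy_output_sketch (
    input  wire [1:0] i_state,        // stands in for r_state from Block 1
    input  wire       i_ped_request,  // hypothetical input
    output reg        o_red,          // Moore output
    output reg        o_walk          // hypothetical Mealy output
);
    localparam [1:0] S_GREEN  = 2'd0,
                     S_YELLOW = 2'd1,
                     S_RED    = 2'd2;

    always @(*) begin
        // Defaults for ALL outputs, exactly as in the Moore version.
        o_red  = 1'b0;
        o_walk = 1'b0;

        case (i_state)
            S_RED: begin
                o_red  = 1'b1;           // Moore: depends on state only
                o_walk = i_ped_request;  // Mealy: depends on state AND input
            end
            default: ;  // all off
        endcase
    end
endmodule
```

The block keeps the same shape: defaults at the top, a case on the state, a default: branch at the end. The only change is that an input appears on the right-hand side of one assignment.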

🧪 You Do — Spot the Bug

always @(*) begin
    case (r_state)
        S_GREEN:  begin r_next_state = S_YELLOW; o_green = 1; end
        S_YELLOW: begin r_next_state = S_RED;                 end
        S_RED:    begin r_next_state = S_GREEN;  o_red   = 1; end
    endcase
end

Three bugs. Find them.

Buggy combined block infers latches on o_green, o_red, and r_next_state (o_yellow is never driven at all): outputs become 'stuck' memory cells
⚠ Inferred latches in synthesized silicon
Answers:
  1. Combined blocks: state and output logic are in the same always block — violates the template.
  2. No defaults: o_green and o_red are each assigned in only one branch and never cleared → latch inference. o_yellow is never assigned at all.
  3. No default: case: if r_state takes an illegal value, r_next_state isn't assigned → another latch, plus no recovery.
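
One way to repair the exercise is sketched below. It keeps the unconditional transitions of the buggy version and fixes only the three listed problems. The module wrapper, the encoding values, S_INIT = S_GREEN, and driving o_yellow in S_YELLOW are assumptions.

```verilog
// Sketch only: the "Spot the Bug" block rewritten with the template.
// Wrapper, encoding, and the o_yellow assignment are assumptions.
module spot_the_bug_fixed (
    input  wire i_clk,
    input  wire i_reset,
    output reg  o_red,
    output reg  o_yellow,
    output reg  o_green
);
    localparam [1:0] S_GREEN  = 2'd0,
                     S_YELLOW = 2'd1,
                     S_RED    = 2'd2;
    localparam [1:0] S_INIT   = S_GREEN;

    reg [1:0] r_state, r_next_state;

    // Block 1: state register.
    always @(posedge i_clk) begin
        if (i_reset) r_state <= S_INIT;
        else         r_state <= r_next_state;
    end

    // Block 2: next-state logic only (fix 1: split from outputs).
    always @(*) begin
        r_next_state = r_state;                 // default assignment
        case (r_state)
            S_GREEN:  r_next_state = S_YELLOW;
            S_YELLOW: r_next_state = S_RED;
            S_RED:    r_next_state = S_GREEN;
            default:  r_next_state = S_INIT;    // fix 3: illegal-state recovery
        endcase
    end

    // Block 3: output logic only.
    always @(*) begin
        o_red    = 1'b0;                        // fix 2: defaults for every output
        o_yellow = 1'b0;
        o_green  = 1'b0;
        case (r_state)
            S_GREEN:  o_green  = 1'b1;
            S_YELLOW: o_yellow = 1'b1;
            S_RED:    o_red    = 1'b1;
            default: ;                          // all off
        endcase
    end
endmodule
```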
▶ LIVE DEMO

Traffic Light FSM — From Diagram to Working Chip

~6 minutes

▸ COMMANDS

cd lecture_examples/week2_day07/d07_s2_ex1/
cat day07_ex01_fsm_template.v   # see 3-block template
make sim             # self-check
make wave            # see sequence
make stat            # check LUT count
make prog            # flash Go Board

▸ EXPECTED STDOUT

PASS: After reset: GREEN
PASS: GREEN->YELLOW
PASS: YELLOW->RED
PASS: RED->GREEN (cycle 2)
PASS: Mid-cycle reset -> GREEN
=== 5 passed, 0 failed ===

▸ GTKWAVE

Signals: r_state · o_red · o_yellow · o_green · timer_done. Watch state cycle GREEN → YELLOW → RED → GREEN while outputs change in lockstep with state (pure Moore: output = f(state)).
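
For readers without the course repository, the following is a minimal self-checking sketch in the spirit of the PASS lines above. It is not the lecture's day07_ex01_fsm_template.v or its testbench: the module names, the i_timer_done port, the encoding, and the check wording are all assumptions.

```verilog
`timescale 1ns/1ps
// Sketch only: a stand-in DUT and testbench, not the lecture's files.
module traffic_light_sketch (
    input  wire i_clk, i_reset, i_timer_done,
    output reg  o_red, o_yellow, o_green
);
    localparam [1:0] S_GREEN = 2'd0, S_YELLOW = 2'd1, S_RED = 2'd2;
    localparam [1:0] S_INIT  = S_GREEN;   // assumed reset state
    reg [1:0] r_state, r_next_state;

    always @(posedge i_clk)               // Block 1
        if (i_reset) r_state <= S_INIT;
        else         r_state <= r_next_state;

    always @(*) begin                     // Block 2
        r_next_state = r_state;
        case (r_state)
            S_GREEN:  if (i_timer_done) r_next_state = S_YELLOW;
            S_YELLOW: if (i_timer_done) r_next_state = S_RED;
            S_RED:    if (i_timer_done) r_next_state = S_GREEN;
            default:                    r_next_state = S_INIT;
        endcase
    end

    always @(*) begin                     // Block 3
        o_red = 1'b0; o_yellow = 1'b0; o_green = 1'b0;
        case (r_state)
            S_GREEN:  o_green  = 1'b1;
            S_YELLOW: o_yellow = 1'b1;
            S_RED:    o_red    = 1'b1;
            default: ;
        endcase
    end
endmodule

module tb_traffic_light_sketch;
    reg clk = 1'b0, reset = 1'b1, timer_done = 1'b0;
    wire red, yellow, green;
    integer n_pass = 0, n_fail = 0;

    traffic_light_sketch dut (
        .i_clk(clk), .i_reset(reset), .i_timer_done(timer_done),
        .o_red(red), .o_yellow(yellow), .o_green(green)
    );

    always #5 clk = ~clk;

    // Compare {red, yellow, green} against the expected pattern.
    task check(input [2:0] expected, input [8*32-1:0] name);
        begin
            if ({red, yellow, green} === expected) begin
                n_pass = n_pass + 1; $display("PASS: %0s", name);
            end else begin
                n_fail = n_fail + 1; $display("FAIL: %0s", name);
            end
        end
    endtask

    // Change inputs on a falling edge, then wait just past the rising edge.
    task step(input td, input rst);
        begin
            @(negedge clk); timer_done = td; reset = rst;
            @(posedge clk); #1;
        end
    endtask

    initial begin
        step(1'b0, 1'b1); check(3'b001, "After reset: GREEN");
        step(1'b1, 1'b0); check(3'b010, "GREEN->YELLOW");
        step(1'b1, 1'b0); check(3'b100, "YELLOW->RED");
        step(1'b1, 1'b0); check(3'b001, "RED->GREEN");
        step(1'b1, 1'b0);                 // advance to YELLOW, no check
        step(1'b0, 1'b1); check(3'b001, "Mid-cycle reset -> GREEN");
        $display("=== %0d passed, %0d failed ===", n_pass, n_fail);
        $finish;
    end
endmodule
```

Each check samples the outputs shortly after a rising clock edge, so a pure Moore design should show the new state's light immediately after the state register updates.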

🔧 What Did the Tool Build?

$ yosys -p "read_verilog day07_ex01_fsm_template.v; synth_ice40 -top traffic_light; stat" -q

=== traffic_light ===  (3 states, binary encoded → 2 bits)
   Number of wires:                 10
   Number of cells:                  7
     SB_DFF                          2    ← 2 bits of state
     SB_LUT4                         5    ← next-state + output decode
What to notice: 2 flops for state (binary encoding uses $clog2(3) = 2 bits). 5 LUTs for all the combinational logic (next-state + output). Seven cells total for a complete traffic-light controller. That's the efficiency of a well-written FSM on an FPGA.
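
The state-width arithmetic can be checked directly in simulation. This is a small sketch; the module and parameter names are made up for the example.

```verilog
// Sketch only: state-register width for 3 states under two encodings.
// Module and parameter names are assumptions.
module state_width_sketch;
    localparam NUM_STATES  = 3;
    localparam BINARY_BITS = $clog2(NUM_STATES);  // ceil(log2(3)) = 2
    localparam ONEHOT_BITS = NUM_STATES;          // one flop per state = 3

    initial
        $display("binary=%0d one-hot=%0d", BINARY_BITS, ONEHOT_BITS);
        // prints: binary=2 one-hot=3
endmodule
```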
Preview: Video 3 shows how changing the state encoding (one-hot vs binary) changes these numbers — one of the few Verilog choices that visibly affects silicon cost.

🤖 Check the Machine

Ask AI: “Write a vending-machine FSM using the 3-block template: states IDLE, COIN_IN, DISPENSE, RETURN_CHANGE. Include all defaults and default cases.”

TASK

Ask AI for a 3-block FSM with 4 states.

BEFORE

Predict: 3 separate always blocks, defaults at top, default case at end.

AFTER

Strong AI uses localparam for states + has defaults. Weak AI uses magic numbers.

TAKEAWAY

If AI skips the output defaults, it'll synth with latches. Catch it before synth does.
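
As a reference point for grading the AI's answer, here is one possible shape of a passing response. The state names come from the prompt above; every port name and every transition condition is a hypothetical choice made for this sketch, since the prompt does not specify them.

```verilog
// Sketch only: one possible 3-block vending-machine FSM.
// All ports and transition conditions are hypothetical.
module vending_fsm_sketch (
    input  wire i_clk,
    input  wire i_reset,
    input  wire i_coin,         // hypothetical: coin detected
    input  wire i_select,       // hypothetical: product selected
    input  wire i_change_due,   // hypothetical: overpayment to return
    output reg  o_dispense,
    output reg  o_return_change
);
    // localparam names, not magic numbers.
    localparam [1:0] IDLE          = 2'd0,
                     COIN_IN       = 2'd1,
                     DISPENSE      = 2'd2,
                     RETURN_CHANGE = 2'd3;

    reg [1:0] r_state, r_next_state;

    // Block 1: state register.
    always @(posedge i_clk) begin
        if (i_reset) r_state <= IDLE;
        else         r_state <= r_next_state;
    end

    // Block 2: next-state logic.
    always @(*) begin
        r_next_state = r_state;                       // default: stay
        case (r_state)
            IDLE:          if (i_coin)   r_next_state = COIN_IN;
            COIN_IN:       if (i_select) r_next_state = DISPENSE;
            DISPENSE:      r_next_state = i_change_due ? RETURN_CHANGE : IDLE;
            RETURN_CHANGE: r_next_state = IDLE;
            default:       r_next_state = IDLE;       // kept per the template
        endcase
    end

    // Block 3: Moore output logic.
    always @(*) begin
        o_dispense      = 1'b0;                       // defaults for all outputs
        o_return_change = 1'b0;
        case (r_state)
            DISPENSE:      o_dispense      = 1'b1;
            RETURN_CHANGE: o_return_change = 1'b1;
            default: ;
        endcase
    end
endmodule
```

When reviewing generated code against this, the things to look for are the ones from the prediction: three separate always blocks, a default assignment at the top of each combinational block, and a default: branch in each case.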

Key Takeaways

 Three blocks: state register, next-state logic, output logic.

 Block 1 is trivial. Block 2 defaults to “stay.” Block 3 defaults outputs.

 Every case has a default: branch. Every output has a default assignment.

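 This template prevents the three most common FSM bugs: tangled state and output logic, latches from missing default assignments, and no recovery from illegal states.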

Three blocks. Always. No exceptions. No cleverness.

🔗 Transfer

State Encoding

Video 3 of 4 · ~8 minutes

▸ WHY THIS MATTERS NEXT

Your traffic light FSM had 3 states and used 2 state bits. Is that optimal? What if you used 3 bits with one-hot encoding? Video 3 shows you the tradeoff and the Yosys output that proves it. This is one of the few Verilog decisions that visibly changes silicon cost.