Topic 5 · Counters, Shifters & Sync

Metastability & Synchronizers

Video 3 of 4 · ~12 minutes

Dr. Mike Borowczak · Electrical & Computer Engineering · CECS · UCF

🌍 Where This Lives

Where it shows up

Anywhere two clocks meet. Your finger presses a key whenever it feels like it, but the chip inside the keyboard ticks at 100 MHz. A Bluetooth packet arrives on the radio's clock and needs to land in the CPU's. Your laptop's CPU, GPU, memory bus, USB controller, and WiFi radio all keep their own time, and they hand each other signals millions of times a second.

When it goes wrong

The fault is invisible. It doesn't appear on the schematic. It doesn't appear in simulation. The chip works on the bench all afternoon. Then a customer ships ten thousand units and one in five locks up after eleven hours. Every senior FPGA/ASIC engineer has a story about chasing this for weeks. Every chip vendor's errata sheet has a paragraph about it.

Career tag: “Explain metastability” is one of the most-asked FPGA/ASIC interview questions. An engineer who can't answer it is unlikely to get the job.

⚠️ Real Silicon Doesn't Care About Your Simulator

❌ Wrong Model

“Flip-flops output 0 or 1 on every clock edge. My testbench always shows clean transitions. Asynchronous inputs work fine in simulation.”

✓ Right Model

Real flip-flops have setup and hold time windows around the clock edge. If an input changes during that window, the flop's output can enter a metastable state (neither 0 nor 1) and may oscillate, decay slowly, or resolve randomly. RTL simulators don't model this. Your waveform will be clean while your real chip glitches once a week.

The receipt: Metastability causes physical failures that simulation never catches. Vendors characterize it with MTBF (mean time between failures); for unsynchronized asynchronous inputs, the per-sample failure probability is on the order of 10⁻⁹ to 10⁻¹². At 25 MHz that can add up to ~1 failure per week without mitigation.

The Setup / Hold Window

If D changes during the setup/hold window, the flop's output becomes metastable — stuck between 0 and 1 — until it resolves (probabilistically).

iCE40 specs: t_setup ≈ 0.4 ns, t_hold ≈ 0.3 ns. That is a 0.7 ns window in every 40 ns clock period at 25 MHz, so each asynchronous edge has roughly a 0.7/40 ≈ 1.75% chance of landing inside it. You'll hit it eventually.

👁️ I Do — The 2-FF Synchronizer

module synchronizer (
    input  wire i_clk,
    input  wire i_async_in,    // from external world, any timing
    output wire o_sync_out     // clean, synchronous to i_clk
);
    reg r_meta, r_sync;
    always @(posedge i_clk) begin
        r_meta <= i_async_in;  // first flop: may go metastable
        r_sync <= r_meta;      // second flop: has a full clock period to resolve
    end
    assign o_sync_out = r_sync;
endmodule

My thinking: The first flop may go metastable, but metastability resolves exponentially (probabilistic decay). Given a full clock period to settle (40 ns at 25 MHz), the probability it's still metastable at the second flop's capture edge is astronomically small. At 25 MHz with iCE40's τ, MTBF is measured in centuries at the very least.

🔧 2-FF Synchronizer — RTL View

Two-flop synchronizer RTL diagram: async input feeds first flop which may go metastable, then drives a second flop that resolves cleanly

Why two flops, not one? Metastability decays exponentially: $P(\text{still meta after } t) \propto e^{-t/\tau}$. One flop gives metastability zero recovery time before its Q is consumed. A second flop hands it a full clock period — and on iCE40 ($\tau \approx 200$ ps, $T_{clk} = 40$ ns), that's $\sim e^{-200}$, i.e. astronomically improbable.

🤝 We Do — When To Synchronize

Does this input need a synchronizer?

(1) A push button wired to an FPGA pin
(2) The output of a Topic 4 counter feeding a comparator
(3) UART RX line from a USB-serial chip
(4) A register's output feeding an ALU
(5) A signal from another FPGA running on a different crystal

Answers: (1) YES — human input, no clock relationship. (2) NO — already synchronous to your clock. (3) YES — external chip, different clock domain. (4) NO — internal signal. (5) YES — different crystal = different clock domain = async by definition.

The rule is absolute: Every input that crosses from one clock domain into another (including “from outside the chip”) needs a synchronizer. No exceptions.

🧪 You Do — Spot the Bug

module reader (
    input wire clk, async_ready, async_data,
    output reg captured
);
    always @(posedge clk)
        if (async_ready) captured <= async_data;
endmodule

Buggy reader RTL: both async_ready and async_data go straight into a flop (CE and D pins) — both can cause metastability. Fix is to insert a 2-FF synchronizer on each async input.

Find the metastability bugs.

Answer: Both inputs (async_ready and async_data) are asynchronous and feed flops directly, so each needs a synchronizer. Worse: async_ready drives the flop's clock-enable (equivalently, the select of a feedback mux), so an unsynchronized async_ready can open the capture window while async_data is mid-transition, even if async_data itself were synchronized. Fix: pass both signals through 2-FF synchronizers first, then use only the synchronized copies.
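The fix described in the answer can be sketched as follows. This is a minimal illustration, not the course's reference solution: the module name reader_fixed and the internal register names are hypothetical, and it assumes the sender holds async_data stable for as long as async_ready is asserted.

```verilog
// Sketch: reader with a 2-FF synchronizer on each asynchronous input.
// Assumption: async_data is held stable while async_ready is high.
module reader_fixed (
    input  wire clk,
    input  wire async_ready,
    input  wire async_data,
    output reg  captured
);
    reg ready_meta, ready_sync;   // 2-FF chain for async_ready
    reg data_meta,  data_sync;    // 2-FF chain for async_data

    always @(posedge clk) begin
        // First stage may go metastable; second stage gets a full
        // clock period for it to resolve.
        ready_meta <= async_ready;
        ready_sync <= ready_meta;
        data_meta  <= async_data;
        data_sync  <= data_meta;

        // Only synchronized copies reach the capture logic.
        if (ready_sync) captured <= data_sync;
    end
endmodule
```

Both chains have the same depth, so the synchronized ready and data stay aligned to within one clock cycle; the stable-data assumption is what makes that small uncertainty harmless.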

▶ LIVE DEMO

Synchronizer on an Asynchronous Button

~4 minutes

▸ COMMANDS

cd lecture_examples/week2_day05/d05_s3_ex3/
make sim
make wave
make stat

▸ EXPECTED STDOUT

PASS: synced follows async
      with 2-cycle latency
PASS: glitches on async do
      not propagate
=== 10 passed, 0 failed ===

  SB_DFF: 2

▸ GTKWAVE

Signals: i_async_in · r_meta · r_sync. Note: i_async_in may show glitches in simulation (if the TB drives pulses shorter than the clock period); r_meta and r_sync change only on clock edges, so r_sync is always clean. Latency is 2 clock cycles. That's the cost of safety.

🧪 Can We Emulate Metastability?

Standard Verilog simulators model flops as ideal — they snap to 0 or 1 every edge. So real metastability can't be observed in a vanilla testbench. But we can emulate the consequences in several useful ways:

① Race the clock (TB)

Drive the async input with edges placed at deliberately bad offsets, e.g., 1 ps before/after the rising clock. A plain RTL flop has no timing checks, so it silently captures the new or the old value depending on which side of the edge the change lands; a flop model with $setup/$hold checks will additionally report a timing violation. Either way the synchronizer's output stays consistent, just one cycle earlier or later. Useful for visualizing latency, not the metastable analog state itself.
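A testbench along these lines could place the edges. It is only a sketch: the testbench name tb_synchronizer_race is hypothetical, it is not the course's demo testbench, and it assumes it is compiled together with the synchronizer module shown earlier.

```verilog
`timescale 1ns/1ps
// Sketch: drive i_async_in 1 ps before and 1 ps after rising clock edges.
module tb_synchronizer_race;
    reg  clk      = 1'b0;
    reg  async_in = 1'b0;
    wire sync_out;

    synchronizer dut (
        .i_clk      (clk),
        .i_async_in (async_in),
        .o_sync_out (sync_out)
    );

    // 25 MHz clock: 40 ns period, rising edges at 20, 60, 100, 140 ns, ...
    always #20 clk = ~clk;

    initial begin
        $monitor("t=%0t ps  async_in=%b  sync_out=%b",
                 $time, async_in, sync_out);
        #59.999 async_in = 1'b1;  // 1 ps BEFORE the rising edge at 60 ns
        #40.002 async_in = 1'b0;  // 1 ps AFTER the rising edge at 100 ns
        #200    $finish;
    end
endmodule
```

In RTL simulation the rising input edge is caught by the 60 ns clock edge, while the falling edge just misses the 100 ns edge and waits for the next one, which makes the one-cycle sampling uncertainty visible on the waveform.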

② Inject randomness (TB)

When a setup/hold violation is detected, force the flop's Q to $urandom % 2 for one cycle. This emulates the “flop resolves to either value, randomly” behavior. Run the testbench 10,000× and watch downstream FSMs misbehave without a synchronizer, and behave correctly with one.
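One way to sketch this idea is a simulation-only flop model that watches its own timing. Everything here is an assumption for illustration: the module name meta_dff, its parameters, and the default 400 ps / 300 ps windows (taken from the iCE40 figures quoted earlier). $urandom is a SystemVerilog system function, so compile in SystemVerilog mode.

```verilog
`timescale 1ps/1ps
// Simulation-only sketch (not synthesizable): a D flop that resolves
// to a random value when D changes inside the setup/hold window.
module meta_dff #(
    parameter integer T_SETUP_PS = 400,  // assumed setup time
    parameter integer T_HOLD_PS  = 300   // assumed hold time
)(
    input  wire i_clk,
    input  wire i_d,
    output reg  o_q
);
    time t_last_d;    // time of the most recent change on i_d
    time t_last_clk;  // time of the most recent rising clock edge

    initial begin
        t_last_d   = 0;
        t_last_clk = 0;
        o_q        = 1'b0;
    end

    // Hold check: D moved too soon after the last rising edge.
    always @(i_d) begin
        t_last_d = $time;
        if (t_last_clk != 0 && ($time - t_last_clk) < T_HOLD_PS)
            o_q <= $urandom % 2;
    end

    // Setup check: D moved too close before this rising edge.
    always @(posedge i_clk) begin
        t_last_clk = $time;
        if (($time - t_last_d) < T_SETUP_PS)
            o_q <= $urandom % 2;   // "resolves" randomly
        else
            o_q <= i_d;            // normal capture
    end
endmodule
```

Swapping this model in for the first flop of a design under test, with and without a second synchronizing stage, gives the randomized runs described above. The random value persists only until the next clean capture.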

③ Use 'X' propagation

Drive the async input as 1'bx for one cycle around the violation point. The first flop captures X; if the design doesn't synchronize, X propagates through downstream logic and lights up the waveform. SystemVerilog assertions (for example, a check built on $isunknown) and assertion-based verification (Topic 9) make this systematic.

④ Gate-level + SDF

Post-synthesis simulation with SDF (Standard Delay Format) back-annotation and specify blocks does model setup/hold checks. Violations typically produce X on the flop output until the next clean capture. This is the closest free simulators get to real silicon behavior, and gate-level simulation with SDF is a standard step in pre-tapeout signoff.

Bottom line: RTL simulation will not surprise you with metastability — that's why missing synchronizers ship to silicon. The fix is process (always synchronize external inputs), not testbench cleverness.

🔧 What Did the Tool Build?

$ yosys -p "read_verilog day05_ex03_synchronizer.v; synth_ice40 -top synchronizer; stat" -q

=== synchronizer ===
   Number of wires:                  4
   Number of cells:                  2
     SB_DFF                          2    ← exactly 2 flops, no frills
     SB_LUT4                         0

Cost of safety: 2 flops per synchronized signal. On iCE40 HX1K: 0.16% of the chip per synchronizer. You can afford to synchronize every external signal — there is no reason not to.

Advanced note: Some FPGA flows use ASYNC_REG or similar synthesis attributes to ensure these two flops are kept close together in silicon and are not optimized away or merged. For iCE40, the Yosys/nextpnr flow used in this course handles placement without one. For Xilinx, you'd add (* ASYNC_REG="TRUE" *) attributes.
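For the Xilinx case, the attribute would sit on the register declaration. A minimal sketch, assuming a Vivado flow; the module name synchronizer_xilinx is hypothetical, and the logic is otherwise the same as the synchronizer above.

```verilog
// Sketch: 2-FF synchronizer with the Xilinx ASYNC_REG attribute.
module synchronizer_xilinx (
    input  wire i_clk,
    input  wire i_async_in,
    output wire o_sync_out
);
    // Marks both flops as a synchronizer chain for the tools.
    (* ASYNC_REG = "TRUE" *) reg r_meta, r_sync;

    always @(posedge i_clk) begin
        r_meta <= i_async_in;  // first flop: may go metastable
        r_sync <= r_meta;      // second flop: resolved value
    end

    assign o_sync_out = r_sync;
endmodule
```

Tools that don't recognize the attribute generally ignore it, so the same source can usually be shared across flows.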

🤖 Check the Machine

Ask AI: “I'm reading a button press directly into my Verilog state machine. Do I need a synchronizer? Calculate MTBF with and without one for a 25 MHz clock.”

TASK

Ask AI about sync + MTBF.

BEFORE

Predict: without sync, MTBF ~hours-days. With 2FF sync, MTBF ~centuries.

AFTER

Strong AI shows the MTBF formula. Weak AI handwaves “it's fine” — dangerous advice.

TAKEAWAY

Any AI that says “buttons don't need synchronizers” is wrong. Don't trust that model for RTL work.

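MTBF formula: MTBF = exp(T_resolve / τ) / (T_window · F_clock · F_async). For iCE40 (τ ≈ 200 ps) with T_resolve = 40 ns, T_window ≈ 0.7 ns (the setup + hold window quoted earlier), F_clock = 25 MHz, and a 10 Hz button as F_async, the formula gives an MTBF on the order of 10⁸⁰ years with the synchronizer: effectively never. With no resolve time at all (T_resolve = 0), the input lands in the setup/hold window about once every 6 seconds; how often that turns into a visible failure depends on how much slack the downstream logic leaves.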
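As a rough order-of-magnitude check, the formula can be evaluated step by step. This assumes $T_{window} \approx 0.7$ ns (the setup + hold window quoted earlier) and that the full 40 ns period is available as $T_{resolve}$; real designs lose some of that to routing and setup time, so treat the result as an upper bound rather than a datasheet figure.

$$
\frac{T_{resolve}}{\tau} = \frac{40\ \text{ns}}{200\ \text{ps}} = 200,
\qquad
T_{window} \cdot F_{clock} \cdot F_{async} = 0.7\ \text{ns} \times 25\ \text{MHz} \times 10\ \text{Hz} = 0.175\ \text{s}^{-1}
$$

$$
\text{MTBF}_{\text{2-FF}} \approx \frac{e^{200}}{0.175\ \text{s}^{-1}} \approx \frac{7.2 \times 10^{86}}{0.175}\ \text{s} \approx 4 \times 10^{87}\ \text{s} \approx 10^{80}\ \text{years}
$$

$$
\text{MTBF}_{T_{resolve} = 0} \approx \frac{1}{0.175\ \text{s}^{-1}} \approx 5.7\ \text{s}
$$

The exponential term dominates everything else: each extra $\tau$ of settling time multiplies the MTBF by $e \approx 2.7$, which is why one spare clock period buys so much.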

Key Takeaways

① Asynchronous signals + setup/hold violations = metastability.

② The 2-FF synchronizer gives metastability time to resolve.

③ Every external input and clock-domain crossing needs one.

④ Cost: 2 flops, 2 cycles latency. Cheap insurance.

Every external input gets 2 flops. Every time. No exceptions.

🔗 Transfer

Button Debouncing

Video 4 of 4 · ~10 minutes

▸ WHY THIS MATTERS NEXT

Synchronization handles metastability, but buttons have a second problem: they bounce mechanically for up to 20 ms. At 25 MHz that's 500,000 clock cycles per press during which the input can flicker and produce false edges. Video 4 combines the synchronizer you just saw with a counter-based debouncer to build the complete input pipeline you'll use everywhere.