This paper presents a 50-Gb/s receiver (RX) with an adaptive phase-shifting (APS) phase detector (PD) for four-level pulse amplitude modulation (PAM-4) clock and data recovery (CDR). The APS PD adopts a $\beta$ detector to achieve a unique locking point, resolving the dead-zone problem caused by combining a conventional baud-rate PD with an adaptive decision feedback equalizer (DFE). The APS CDR is configured with a sign-sign minimum mean squared error (SS-MMSE) PD plus a digital coefficient that the $\beta$ detector controls adaptively by detecting the pre-cursor inter-symbol interference (ISI) dependency of 1-level transitions. The proposed CDR therefore does not rely on external coefficients. Furthermore, adaptive programmable gain amplifiers (PGAs) and a DFE are implemented with the APS CDR to compensate for the pre- and post-cursor ISI and the main-cursor level. Since the adaptive equalizers and the APS CDR share the error samplers, no additional analog hardware is required. Fabricated in 28-nm CMOS technology, a prototype PAM-4 RX operates at 50 Gb/s and occupies an active area of 0.16 mm$^{2}$. Tested over a channel with 25.3-dB loss, the RX achieves a bit error rate (BER) of less than $10^{-12}$ and an energy efficiency of 2.52 pJ/b.
This brief presents an 8-GHz Octa-phase Error Corrector (OEC) employing a digital delay-locked loop (DLL) with a coprime phase comparison scheme. To relax the timing constraint during phase comparison, clock phases spaced by a number of T/8 steps that is coprime to 8 are used, enabling link operation at up to 64 Gb/s. In particular, this brief uses clocks spaced by 3T/8 rather than T/8. In addition, a clock-divided 5-bit selection scheme lets a high-speed 8:2 multiplexer (MUX) operate seamlessly without glitches. To minimize mismatch and calibration-induced jitter, a single shared phase comparator and a finite-state machine (FSM) that tracks the minimum total delay are employed. The test chip is fabricated in 40-nm CMOS technology with an active area of 0.0814 mm$^{2}$. The core phase calibration loop consumes 10.8 mW at 8 GHz from a 0.9-V supply and achieves a maximum residual phase error of 0.95 ps.
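The role of coprimality can be illustrated with a short sketch. It assumes the eight phases are indexed 0 to 7 in T/8 steps and that successive comparisons advance by a fixed number of steps; the function name and this indexing are illustrative assumptions, not the brief's implementation.

```python
from math import gcd

NUM_PHASES = 8            # octa-phase clock
CLOCK_HZ = 8e9            # clock frequency stated in the abstract


def phase_visit_order(step: int, num_phases: int = NUM_PHASES) -> list[int]:
    """Phase indices reached by repeatedly advancing `step` slots of T/num_phases.

    Hypothetical helper: models only the index arithmetic, not the circuit.
    """
    return [(k * step) % num_phases for k in range(num_phases)]


period_ps = 1e12 / CLOCK_HZ                    # clock period T in picoseconds
spacing_1_ps = 1 * period_ps / NUM_PHASES      # T/8 spacing
spacing_3_ps = 3 * period_ps / NUM_PHASES      # 3T/8 spacing

for step in (1, 2, 3):
    order = phase_visit_order(step)
    covers_all = len(set(order)) == NUM_PHASES
    print(f"step={step} gcd={gcd(step, NUM_PHASES)} order={order} covers_all={covers_all}")

print(f"T/8 = {spacing_1_ps} ps, 3T/8 = {spacing_3_ps} ps")
```

Because gcd(3, 8) = 1, stepping by 3T/8 still reaches every one of the eight phases, while the compared edges sit three times further apart than with T/8 spacing. A step of 2 shares a factor with 8 and reaches only half of the phases.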
This brief presents a quadrature resonant clock generator with tuning capacitors and an amplitude control feedback loop for driving four 4.3-mm load wires. By using frequency tuning capacitors, which reduce the mismatch between the operating and LC resonant frequencies, the proposed clock generator reduces power by 20-25% compared with a conventional CMOS clock driver and by 23-34% compared with a conventional CML clock driver over a wide voltage swing. The amplitude control feedback loop, which sets the bias current of the negative-gm cell, maintains a constant, optimized clock swing over wide PVT variations. Measurement results from the prototype chip fabricated in 65-nm CMOS show that the total power consumption of the proposed quadrature resonant clock generator is 11.92 mW at 7 GHz with four 559-fF load wire capacitances. The measured period jitter is 573.6 fs rms, and the phase noise at a 1-MHz offset is -138.37 dBc/Hz.
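The sensitivity of an LC tank to capacitance mismatch can be sketched with the ideal resonance formula f = 1/(2π√(LC)). The sketch assumes each tank sees only the stated 559-fF wire capacitance, ignoring tuning capacitors and parasitics. The 10% capacitance shift is a hypothetical value chosen for illustration, not a figure from the brief.

```python
import math


def resonant_inductance(freq_hz: float, cap_f: float) -> float:
    """Inductance that resonates with `cap_f` at `freq_hz` (ideal lossless LC tank)."""
    return 1.0 / ((2.0 * math.pi * freq_hz) ** 2 * cap_f)


def resonant_frequency(ind_h: float, cap_f: float) -> float:
    """Resonant frequency of an ideal LC tank."""
    return 1.0 / (2.0 * math.pi * math.sqrt(ind_h * cap_f))


F_CLK = 7e9          # operating frequency from the abstract
C_WIRE = 559e-15     # load wire capacitance from the abstract

# Assumption: the tank capacitance is the wire capacitance alone.
L_TANK = resonant_inductance(F_CLK, C_WIRE)

# Hypothetical +10% capacitance shift (for example, from process variation).
f_detuned = resonant_frequency(L_TANK, 1.10 * C_WIRE)

print(f"inductance for resonance at 7 GHz: {L_TANK * 1e9:.3f} nH")
print(f"resonance after +10% capacitance: {f_detuned / 1e9:.3f} GHz")
```

Under these assumptions, a modest capacitance shift moves the resonance several hundred megahertz away from the operating frequency. Tuning capacitors are the mechanism the brief uses to pull the two back together.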
This paper presents a 375-GB/s/mm power-efficient memory interface that consists of PAM-4 transceivers with a per-pin training system for next-generation HBM controllers. The self-training system executes foreground driver calibration, 2-D sampling point optimization, FFE coefficient adaptation, and sampler offset calibration. Using DC levels and SBR patterns, the entire training sequence for 8 DQ transceivers and 2 DQS transceivers takes less than 1 ms. In addition, a charge-recycling sampler is proposed that consumes 44.5% less power than the StrongARM latch sampler. The proposed memory interface, fabricated in a 40-nm CMOS process, shows a power efficiency of 0.41 pJ/b and a power efficiency per channel length of 68.7 fJ/b/mm, significantly better than those of standard state-of-the-art memory interfaces.
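The two efficiency figures can be related by simple arithmetic. The sketch below assumes the per-length figure is the per-bit energy divided by the channel length. The abstract does not state that length, so the result is an inference rather than a reported value.

```python
ENERGY_PER_BIT_PJ = 0.41           # pJ/b, from the abstract
ENERGY_PER_BIT_PER_MM_FJ = 68.7    # fJ/b/mm, from the abstract
SAMPLER_POWER_SAVING = 0.445       # 44.5% saving versus a StrongARM latch sampler

# Assumption: (fJ/b/mm) = (fJ/b) / (channel length in mm).
implied_length_mm = ENERGY_PER_BIT_PJ * 1e3 / ENERGY_PER_BIT_PER_MM_FJ

# Power of the charge-recycling sampler relative to the StrongARM baseline.
relative_sampler_power = 1.0 - SAMPLER_POWER_SAVING

print(f"implied channel length: {implied_length_mm:.2f} mm")
print(f"sampler power relative to StrongARM: {relative_sampler_power:.3f}")
```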
This paper presents a 10-to-12-GHz dual-loop quadrature clock corrector consisting of a quadrature phase error corrector (QEC) and a duty-cycle corrector (DCC) using a digital delay-locked loop (DLL). To ensure stability, the QEC and DCC loops operate concurrently with different bandwidths. The correctors use a shared phase comparator scheme to minimize calibration-induced error. The chip is implemented in 28-nm CMOS technology with an active area of 0.016 mm$^{2}$. The calibration loops consume 16.5 mW at 12 GHz from a 1.0-V supply, with a residual clock phase inaccuracy of 0.6 ps and a duty-cycle error of 0.7%.
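To put the residual errors in perspective, the time-domain figures can be converted to fractions of the clock period. This sketch assumes the 0.6-ps and 0.7% figures apply at 12 GHz, the frequency at which the power is quoted; the helper names are illustrative.

```python
def ps_to_degrees(error_ps: float, freq_hz: float) -> float:
    """Convert a timing error in picoseconds to degrees of one clock period."""
    period_ps = 1e12 / freq_hz
    return 360.0 * error_ps / period_ps


def duty_error_to_ps(duty_error_percent: float, freq_hz: float) -> float:
    """Convert a duty-cycle error in percent of the period to picoseconds."""
    period_ps = 1e12 / freq_hz
    return duty_error_percent / 100.0 * period_ps


F_CLK = 12e9  # assumed measurement frequency

phase_error_deg = ps_to_degrees(0.6, F_CLK)
duty_error_ps = duty_error_to_ps(0.7, F_CLK)

print(f"0.6 ps at 12 GHz = {phase_error_deg:.3f} degrees")
print(f"0.7% duty-cycle error at 12 GHz = {duty_error_ps:.3f} ps")
```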
The data-rate requirement for data center interconnects is shifting to 400 Gb/s. For the optical interface, electro-optical transmitter chips designed in CMOS rather than BiCMOS processes have been proposed as technology advances. Meanwhile, the vertical-cavity surface-emitting laser (VCSEL) has been a popular candidate for the optical modulation device of the 400 GbE interconnect because of its cost and packaging efficiency [1]-[7]. However, the VCSEL driver must overcome a high operating voltage, nonlinear effects, and low bandwidth to transmit high-speed four-level pulse-amplitude modulation (PAM-4). Pattern-detecting equalization has been proposed [3], [6] to compensate for the nonlinearity, and the driver combines the MSB and LSB with a 1-bit DAC scheme to produce the PAM-4 output [3], [5]-[6]. However, pattern detection requires additional power-consuming blocks, and combining at the driver needs an additional pre-driver and increases the driver drain capacitance, which lowers the bandwidth. This paper presents a 64-Gb/s PAM-4 VCSEL transmitter (TX) for 400 GbE with a 3-tap sub-UI asymmetric feed-forward equalizer (FFE), fabricated in 40-nm CMOS technology. A quarter-rate system and a PAM-4 combining 8:1 multiplexer (MUX) are employed to reduce clocking power. The sub-UI FFE and the combining-ahead scheme compensate for the low bandwidth. The TX achieves a power efficiency of 2.09 pJ/bit at 64 Gb/s PAM-4.
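The rates implied by a quarter-rate 64-Gb/s PAM-4 transmitter can be derived in a few lines. The sketch assumes the 2.09-pJ/bit figure covers the whole TX and that the eight MUX inputs run at equal rates. The derived power and lane rate are arithmetic consequences of those assumptions, not values quoted in the abstract.

```python
DATA_RATE_BPS = 64e9          # from the abstract
BITS_PER_SYMBOL = 2           # PAM-4 carries 2 bits per symbol
ENERGY_PER_BIT_J = 2.09e-12   # from the abstract
MUX_INPUTS = 8                # 8:1 PAM-4 combining MUX

symbol_rate_baud = DATA_RATE_BPS / BITS_PER_SYMBOL   # symbols per second
unit_interval_ps = 1e12 / symbol_rate_baud           # one UI in picoseconds
quarter_rate_clock_hz = symbol_rate_baud / 4         # quarter-rate clock frequency
lane_rate_bps = DATA_RATE_BPS / MUX_INPUTS           # assumed equal-rate MUX inputs
tx_power_mw = ENERGY_PER_BIT_J * DATA_RATE_BPS * 1e3 # implied total power

print(f"symbol rate: {symbol_rate_baud / 1e9} GBaud, UI: {unit_interval_ps} ps")
print(f"quarter-rate clock: {quarter_rate_clock_hz / 1e9} GHz")
print(f"per-lane input rate: {lane_rate_bps / 1e9} Gb/s")
print(f"implied TX power: {tx_power_mw:.2f} mW")
```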
This paper proposes a 112-Gb/s quarter-rate four-level pulse-amplitude modulation (PAM-4) transmitter (TX) using an 8:1 multiplexer (MUX) based on 1-UI pulse generation, which is suited to high-speed operation. The transmitter includes a current-mode logic (CML) driver, a quadrature clock generator, and a phase interpolator (PI) for clock generation, as well as tap generation for a 3-tap feed-forward equalizer (FFE). The key feature of the 8:1 MUX is that it combines the MSB/LSB paths with 4:1 serialization to reduce area and power consumption. The chip is implemented in 28-nm CMOS technology, and the core block occupies an area of 0.202 mm$^{2}$. The simulated power efficiency of the proposed MUX is 0.21 pJ/b.
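A behavioral sketch can show what combining the MSB/LSB paths with 4:1 serialization means at the data level. It assumes binary weighting (level = 2·MSB + LSB) and a lane ordering in which lane k is selected during the k-th 1-UI window of each quarter-rate clock period. Both are illustrative assumptions, and the circuit-level pulse generation is not modeled.

```python
def pam4_mux_8to1(msb_lanes: list[list[int]], lsb_lanes: list[list[int]]) -> list[int]:
    """Serialize 4 MSB lanes and 4 LSB lanes into one PAM-4 level stream (0..3).

    Hypothetical behavioral model: each lane is a list of bits at quarter rate.
    """
    if len(msb_lanes) != 4 or len(lsb_lanes) != 4:
        raise ValueError("expected 4 MSB lanes and 4 LSB lanes")
    words = len(msb_lanes[0])
    if any(len(lane) != words for lane in msb_lanes + lsb_lanes):
        raise ValueError("all lanes must have the same length")

    symbols = []
    for word in range(words):          # one quarter-rate clock period
        for phase in range(4):         # one 1-UI window per quadrature phase
            msb = msb_lanes[phase][word]
            lsb = lsb_lanes[phase][word]
            symbols.append(2 * msb + lsb)  # assumed binary-weighted PAM-4 level
    return symbols


# Two quarter-rate words per lane yield eight PAM-4 symbols.
msb = [[1, 0], [0, 0], [1, 1], [0, 1]]
lsb = [[1, 0], [1, 1], [0, 1], [0, 0]]
stream = pam4_mux_8to1(msb, lsb)
print(stream)
```

Merging the MSB and LSB selection into a single 8:1 stage, rather than using two separate 4:1 serializers followed by a combiner, is what the abstract credits for the area and power reduction.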
This paper presents a quadrature resonant clock generator with tuning capacitors for driving four 2.1-mm load wires. By using frequency tuning capacitors, which reduce the mismatch between the operating and LC resonant frequencies, the proposed clock generator reduces power by 19-22% compared with a conventional CMOS clock driver and by 16-32% compared with a conventional CML clock driver. Measurement results from the prototype chip fabricated in 65-nm CMOS show that the total power consumption of the proposed quadrature resonant clock generator is 14.5 mW at 12.5 GHz with four 370-fF load wire capacitances. The measured phase noise at a 1-MHz offset is -141.36 dBc/Hz.