## A 2.5-20kSps in-pixel direct digitization front-end for ECoG with in-stimulation recording

Aditi Jain, Eric Fogleman, Paul Botros, Ritwik Vatsyayan, Corentin Pochet, Andrew Bourhis, Zhaoyi Liu, Suhas Chethan, Hanh-Phuc Le, Ian Galton, Shadi Dayeh, and Drew A. Hall

## University of California, San Diego

Closed-loop neuromodulation promises to enhance treatment for movement disorders, pain, and epilepsy. Advancements in low-impedance, high-density recording grids [1] have paved the way for low-noise neural recording systems with high spatial and temporal resolution. However, a conventional high-density neural recording signal path with programmable gain amplifiers (PGAs) and a shared ADC [2] saturates during stimulation because of the high amplifier gain. Due to a fundamental trade-off with the input high-pass cutoff frequency (for dc electrode offset elimination), it takes hundreds of ms to recover, leading to critical data loss. Recent advances in direct digitization-based analog front-ends (AFEs) overcome this limitation by forgoing the amplifier and directly connecting the electrode to a high dynamic range ADC. Directly using a continuous-time $\Delta\Sigma$ modulator (CTDSM) for this application has several notable challenges: slow recovery/instability during artifacts beyond the input range, power and area limitations, and low input impedance ($Z_{in}$). We report a 4×2 array of per-pixel 2<sup>nd</sup>-order $\Delta\Sigma$ ADCs (including the decimation filter) for ECoG with the fastest (sub-ms) artifact recovery time, enabling in-stimulation recording and power-efficient bandwidth scaling.

The 2<sup>nd</sup>-order $\Delta\Sigma$ ADC architecture is shown in **Fig. 1**. While time-domain-only ADC architectures, such as voltage-controlled oscillator (VCO)-based ADCs, are popular for their ability to run off a low supply voltage and process scalability, they contend with issues such as VCO flicker (1/f) noise and compromised noise efficiency when utilized as the first integrator [3]. This system instead uses a G<sub>m</sub>-C filter as the first integrator, utilizing chopping to reduce the flicker noise and a complementary input stage to double the noise efficiency. The second stage is an area-efficient time-based integrator realized by a G<sub>m</sub>-current-starved current-controlled oscillator (CCO) with a counter. The 4-bit feedback capacitive DAC (CDAC) is mismatch-shaped with a 1<sup>st</sup>-order shaped tree-structured dynamic element matching (DEM) encoder and resampled with a delay of 6.25% of the modulator sampling period to account for the quantizer and DEM delay. The pseudo-virtual ground feedforward architecture [3] allows the modulator to have high linearity and a compact area since it only needs a single feedback DAC despite being a 2<sup>nd</sup>-order cascade of integrators feedback (CIFB) modulator. Deadband (DB) switches suppress the differential chopping artifacts at the G<sub>m</sub> input; however, they must be large for low input-referred noise applications [4]. To remedy this, the DB switches are strategically moved from the sensitive input node to the G<sub>m</sub>-C output and the feedforward input. Finally, an auxiliary-amplifier-based impedance booster [5] increases $Z_{in}$ to >30M$\Omega$.

In theory, direct digitization offers uninterrupted monitoring; however, the modulator oscillates when overloaded if the phase difference at the *n*-bit counter output wraps modulo $2^n$ and is fed back into the loop. The stimulation artifacts in this application are large enough to momentarily over-range the ADC. This phase wrapping-induced instability requires resetting the modulator, leading to recording data corruption. Fig. 2 shows the fast-recovery, over-range detecting phase quantizer implementation. We introduce a saturation detector and expand the counter depth by two additional bits for faster recovery. The first additional bit detects saturation, while the second bit allows for reaction time without phase wrapping and loss of dynamic range (DR). The counters are implemented with a Gray code to minimize asynchronous sampling errors. The added bits are clocked at a fraction of the CCO frequency, $f_{CCO}$, and thus require negligible additional power. Recovery is nearly instantaneous after the artifact, with the delay determined by the decimation filter.
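The wrapping problem and the effect of the extra counter bits can be illustrated with a small behavioural sketch. This is a toy model under stated assumptions, not the circuit in Fig. 2: the 4-bit nominal width, the signed (two's-complement) representation, the clamping of the output on saturation, and all function names are hypothetical choices made for illustration.

```python
def wrapped_count(phase_diff, n_bits):
    """Reduce a signed phase difference modulo 2**n_bits (two's-complement wrap)."""
    span = 1 << n_bits
    half = span >> 1
    return (phase_diff + half) % span - half


def guarded_count(phase_diff, n_bits, extra_bits=2):
    """Model a counter widened by extra_bits: flag over-range, then clamp.

    Clamping is an assumed reaction to the saturation flag, not taken from the paper.
    """
    wide = wrapped_count(phase_diff, n_bits + extra_bits)
    lo, hi = -(1 << (n_bits - 1)), (1 << (n_bits - 1)) - 1
    saturated = not (lo <= wide <= hi)
    return max(lo, min(hi, wide)), saturated


def binary_to_gray(value):
    """Standard reflected Gray code: adjacent counts differ in exactly one bit."""
    return value ^ (value >> 1)


if __name__ == "__main__":
    N = 4  # hypothetical nominal counter width
    for diff in (5, 9, -12):
        print(diff, wrapped_count(diff, N), guarded_count(diff, N))
```

In this model, a phase difference of +9 wraps to -7 in a plain 4-bit counter, which flips the sign of the value fed back into the loop, whereas the widened counter reports the over-range condition and holds the value at the positive limit.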

Neural signals are categorized by their frequency band and require an AFE with scalable power based on the operation mode. This design supports four modes covering 2.5–20kSps recording through a power-efficient bandwidth-scalable CTDSM. Fig. 3 shows the implementation of the G<sub>m</sub>-C integrator with transconductance-based coefficient scaling, comprising four parallel G<sub>m</sub> branches with drive strength ratios 1:1:2:4, which are turned on/off through the cascode bias nodes to maintain a constant input capacitance. This switching is more power efficient than scaling the integration capacitor, $C_I$, and scaling the bias current alone is inaccurate. Each branch comprises a dual-tail, complementary-input transconductor biased in subthreshold with a $G_m/I_D$ of 40S/A for high noise efficiency.
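The branch ratios 1:1:2:4 allow binary-weighted totals of 1, 2, 4, and 8 when branches are enabled cumulatively. The sketch below shows one such enable pattern. The mapping of sample rates to relative G<sub>m</sub> (2.5, 5, 10, and 20kSps to 1, 2, 4, and 8) and the cumulative enable order are assumptions made for illustration; the text states only that four modes span 2.5–20kSps.

```python
BRANCH_WEIGHTS = (1, 1, 2, 4)  # drive-strength ratios stated in the text

# Hypothetical mapping from output rate (kSps) to relative transconductance.
MODE_RELATIVE_GM = {2.5: 1, 5.0: 2, 10.0: 4, 20.0: 8}


def enabled_branches(relative_gm):
    """Enable branches in order until their summed weight equals relative_gm."""
    total = 0
    enables = []
    for weight in BRANCH_WEIGHTS:
        turn_on = total < relative_gm
        enables.append(turn_on)
        if turn_on:
            total += weight
    if total != relative_gm:
        raise ValueError(f"relative Gm {relative_gm} is not reachable cumulatively")
    return tuple(enables)


if __name__ == "__main__":
    for rate_ksps, gm in MODE_RELATIVE_GM.items():
        print(rate_ksps, gm, enabled_branches(gm))
```

Because every branch stays connected to the input and is only switched at its cascode bias, the enable pattern changes the total transconductance without changing the capacitance seen at the input, which is the property the text highlights.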



Die micrograph.

Since this block's power efficiency is essential, medium V<sub>t</sub> (MVT) devices were used to reduce the supply to 0.9V. With a 3.2fF unit CDAC element, the input devices are near minimum size to preserve SNR against input capacitive network attenuation. The G<sub>m</sub> is cascoded to reduce leakage and enable down-modulation at a low-impedance node for faster settling. A triode-based common-mode feedback (CMFB) topology was chosen for its high input linear range and stability. To minimize area, $C_I$ uses anti-parallel NMOS varactors to cancel the even-order nonlinearity. The high capacitor density and differential topology save 4× area over MIM or MOM caps. The second integrator comprises a CCO that combines the G<sub>m</sub>-cells of the direct and feedforward paths. Its gain is scaled across modes through the number of CCO stages, which saves 2-8× power compared to frequency divider-based scaling. Part of the current is bled off to reduce $f_{CCO}$, reducing the counter's power by 2×, which is important as $f_{CCO}$ is ~10× the sampling frequency, $f_s$. The CCO uses a pseudo-differential NMOS cross-coupled inverter architecture for noise efficiency.

Several system-level decisions were implemented to array this design and allow it to be scalable to thousands of channels on a chip. First, the decimation filter is implemented within each pixel, substantially reducing the datalink power, which is proportional to the oversampling ratio (OSR). This also offers stronger anti-aliasing rejection of out-of-band signals compared to the PGA+shared ADC architecture. Finally, for improved noise immunity, the serial data clock frequency, f<sub>DL</sub>, is an integer multiple of f<sub>s</sub> (across all modes), thus placing the energy in a null of the modulator's signal transfer function.
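The datalink saving from in-pixel decimation can be estimated with a short calculation. The sketch below uses the 1.28MHz modulator rate and 10kHz bandwidth reported in the measurements. The 4-bit raw modulator word (inferred from the 4-bit feedback DAC), the 16-bit decimated word, and the example $f_{DL}$ value are assumptions for illustration only.

```python
def oversampling_ratio(fs_hz, bandwidth_hz):
    """OSR = fs / (2 * BW) for a Nyquist-rate decimated output."""
    return fs_hz / (2 * bandwidth_hz)


def link_rate_bps(sample_rate_hz, bits_per_sample):
    """Per-channel serial data rate in bits per second."""
    return sample_rate_hz * bits_per_sample


def is_integer_multiple(f_dl_hz, fs_hz):
    """Check that the datalink clock is an integer multiple of fs."""
    return f_dl_hz % fs_hz == 0


if __name__ == "__main__":
    fs = 1_280_000          # modulator sampling rate from the measurements
    bw = 10_000             # signal bandwidth from the measurements
    raw = link_rate_bps(fs, 4)               # assumed 4-bit modulator output
    decimated = link_rate_bps(2 * bw, 16)    # assumed 16-bit decimated word
    print(oversampling_ratio(fs, bw), raw, decimated, raw / decimated)
    print(is_integer_multiple(8 * fs, fs))   # hypothetical f_DL = 8 * fs
```

Under these assumptions, sending decimated words instead of the raw modulator output lowers the per-channel data rate from 5.12Mb/s to 320kb/s.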

This 8-channel chip is fabricated in 180nm CMOS, with each channel occupying 0.09mm<sup>2</sup>, including the digital blocks (decimation filter (18%), DEM (13%), and clock generation (3%)). Each channel consumes 14µW from 0.9V and 0.7V supplies for analog (53%) and digital (47%), respectively, where the latter can be reduced with technology scaling. This technology provides a cost-effective path for scaling to a 1024-channel array with bumped die chip-scale packaging. Fig. 4 shows a measured spectrum with and without decimation for a 125mV<sub>pp</sub> full-scale (FS) sinusoidal input where the ADC achieves a 78.6dB SNDR and DR in a 10kHz bandwidth and an SFDR of 97.7dB with $f_s$ = 1.28MHz and an $f_s/2$ chopping clock. The characteristic 40dB/decade noise-shaping from a 2<sup>nd</sup>-order modulator is apparent in the spectrum. Due to the scalable architecture, the integrated input-referred noise is constant across the different modes. Channel-to-channel isolation of >60dBc was measured by injecting a full-scale input on one channel while shorting the inputs of the others. The ADC output recovers in one decimated sample (50µs for BW = 10kHz) with a 250mV (4× over-range) input. Fig. 5 compares this work with state-of-the-art neural recording front-ends.
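Two of the reported figures can be cross-checked with standard relations. The sketch below converts the measured SNDR to an effective number of bits using the usual ideal-quantizer formula and derives the one-sample recovery time from the bandwidth. It assumes the decimated output rate is twice the signal bandwidth, which is consistent with the 50µs quoted for BW = 10kHz; the function names are illustrative.

```python
def enob_bits(sndr_db):
    """Effective number of bits from SNDR: (SNDR - 1.76 dB) / 6.02 dB."""
    return (sndr_db - 1.76) / 6.02


def decimated_sample_period_s(bandwidth_hz):
    """Period of one decimated sample, assuming an output rate of 2 * BW."""
    return 1.0 / (2.0 * bandwidth_hz)


if __name__ == "__main__":
    print(f"ENOB: {enob_bits(78.6):.2f} bits")
    print(f"Recovery: {decimated_sample_period_s(10e3) * 1e6:.1f} us")
```

With the reported 78.6dB SNDR, this gives roughly 12.8 effective bits, and a 10kHz bandwidth gives a 50µs decimated sample period.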

An in vivo rat whisker barrel experiment was performed to compare the performance with and without stimulation against a commercial Intan chip [6]. Platinum-nanorod (PtNR) electrodes [1] captured ECoG activity (70-200Hz) simultaneously across all 8 channels, as shown in **Fig. 6**. During monopolar stimulation (500µA at 100Hz), the Intan chip was driven beyond its input range (~5mV) and saturated, resulting in data loss. In contrast, this chip recovered fast enough that neural activity could be observed immediately after each artifact. This work showcases a scalable 2<sup>nd</sup>-order $\Delta\Sigma$ ADC featuring the fastest in-pixel artifact recovery backed by in vivo data and unique power-efficient bandwidth scaling.

### Acknowledgment:

This work was funded by the National Institutes of Health (UG3NS123723).

#### References:

- [1] Y. Tchoe, et al., *Sci. Transl. Med.*, 14.628 (2022): eabj1441.
- [2] D. Yoon, et al., *Symp. VLSI Circuits*, pp. 1-2, 2021.
- [3] C. Pochet, et al., *ISSCC*, vol. 65, 2022.
- [4] H. Chandrakumar, et al., *JSSC*, 53.12 (2018): 3470-3483.
- [5] H. Chandrakumar, et al., *JSSC*, 52.11 (2017): 2811-2828.
- [6] Intan RHD2164, https://intantech.com/files/Intan\_RHD2164\_datasheet.pdf.

# **IEEE CICC 2024**


