International Conference

[ 50 ] Yujin Na, Jin-Ku Kang, “A 2.5-12.3 Gb/s continuous-rate referenceless CDR with counter-based unlimited frequency detection”, ISOCC 2024.

Abstract

This paper presents a referenceless half-rate clock and data recovery (CDR) circuit that employs a counter-based frequency detector with unlimited frequency acquisition capability. For frequency acquisition, the proposed frequency detector requires only the modified output of the conventional half-rate bang-bang phase detector (BBPD). Regulating the gain of the frequency-locked loop (FLL) according to the frequency deviation effectively reduces the overall lock time and minimizes the variation of lock time under different conditions. Simulation results show that the proposed CDR operates over a wide range of input data rates from 2.5 to 12.3 Gb/s. The power consumption is 3.66 mW at 2.5 Gb/s and 13.44 mW at 12.3 Gb/s. This work is designed using a 65 nm process.

[ 49 ] Su-A Kim, Jin-Ku Kang, “A 12-28 Gb/s Temperature Compensated PAM-4 Transmitter with 7B4Q Maximum Transition Avoidance and Fractional-Spaced FFE”, ISOCC 2024.

Abstract

This paper presents a pulse amplitude modulation 4-level (PAM-4) transmitter with wide eye-opening performance in the temperature range from -20 to 100 °C. The maximum transition avoidance (MTA) encoder can reduce power consumption and gate count based on the proposed 7B4Q MTA algorithm. A fractional-spaced feed-forward equalizer (FFE) compensates channel loss for 12 to 28 Gb/s. The proposed transmitter increases average eye width by 66.7% and improves the level separation mismatch ratio (RLM) from 0.915 to 0.966. This work is designed using a 65 nm process.

[ 48 ] Hyunjae Park, Jin-Ku Kang, Yongwoo Kim, “The hardware implementation of QARMA-64 with RoCC on FPGA for Memory encryption”, ISOCC 2024.

Abstract

This paper proposes a hardware accelerator for the lightweight encryption cipher QARMA-64, integrated into the RISC-V Rocket Chip using the Rocket Custom Coprocessor (RoCC). The accelerator achieves significant performance gains, with encryption speedup of 106x and decryption speedup of 98x compared to software. FPGA utilization on the Xilinx VC707 Evaluation board shows only a modest increase of 3.64% in Look-Up Tables (LUTs) and 0.82% in Flip-Flops (FFs) relative to the base Rocket Chip design.

[ 47 ] Yugwon Seo, Jin-Ku Kang, Yongwoo Kim, “Autoencoder-based Knowledge Distillation For Quantized YOLO Detector”, ISOCC 2024.

Abstract

Object detectors like YOLO often suffer performance degradation when quantized to lower bit precision. To mitigate this, research explores knowledge distillation (KD) methods alongside quantization. This paper proposes a novel KD method using an autoencoder for quantizing YOLO. This method involves training an autoencoder on full-precision network features, effectively transferring them to the quantized network. Applied to YOLOv7-tiny with the PASCAL VOC dataset, our method improved the mAP(0.5:0.95) of LSQ and LLTQ methods by 0.6 and 0.9, respectively. Compared to the conventional KD method, we observed an improvement in mAP(0.5:0.95) of up to 0.5.

[ 46 ] Hyunjun Ko, Jin-Ku Kang, Yongwoo Kim, “An Efficient and Fast Filter Pruning Method for Object Detection in Embedded Systems”, AICAS 2024.

Abstract

Recently, CNN-based networks have exhibited high performance in computer vision. On the other hand, as the networks become deeper and wider, it is hard to implement the models in real-time embedded environments. To overcome this drawback, filter pruning has been widely studied for neural network compression. Filter pruning removes filters from the CNN and accelerates inference without requiring any special hardware or software. In this paper, we propose efficient and fast filter pruning (EFFP), which focuses on reducing the training computation resources and searching for optimal pruned networks. The success stems from two significant improvements upon other pruning methods. (1) Short training time: in the pruning stage, we set redundant filters to zero so that the output feature map is the same as that of a lightweight model. (2) Adjusting to changes in redundancy using regrowing: it is difficult to get an optimal pruned model by pruning redundant filters all at once, so we use a pruning/regrowing method that gradually removes unimportant filters and avoids permanently pruning important filters, yielding an optimal lightweight model. Experimental results indicate that EFFP can reduce the FLOPs and parameters more efficiently and faster than other pruning methods on the object detection model. The inference time is measured on NVIDIA Jetson Xavier NX. As a result, we improve mAP and inference time by a maximum of 45% compared to other pruning methods.

[ 45 ] Jaemyung Kim, Jin-Ku Kang, Yongwoo Kim, “Fast, Efficient and Lightweight Compressed Image Super-Resolution Network for Edge Devices”, AICAS 2024.

Abstract

In many applications, images are reduced in size and compressed to save storage and transmission bandwidth. This process leads to loss of detail and often generates undesirable artifacts that degrade visual quality and impact the performance of vision tasks. To solve this challenge, many studies have been proposed on compressed image super-resolution (CISR). However, most previous works have designed complicated architectures that require substantial computational resources, limiting their applicability in edge devices. To address this problem, we propose a fast, efficient and lightweight compressed image super-resolution network (FELCSRN) for edge devices. The proposed FELCSRN is a single network that reduces compression artifacts and enhances the resolution simultaneously. Furthermore, the reparameterization and quantization methods are utilized to further reduce computational and memory costs. Experimental results demonstrate that the proposed FELCSRN outperforms existing efficient super-resolution methods in terms of quality metrics and efficiency. In addition, compared to state-of-the-art CISR methods, it significantly reduces computational costs and model size. As a result of evaluating the performance of the proposed FELCSRN by deploying it on the Xilinx ZCU104 board, it was confirmed that CISR tasks are performed in real time.

[ 44 ] Seoung-geun Cho, Jin-Ku Kang, “A PAM-4 Baud-Rate CDR with High-Gain Phase Detector Using Shared Sampler”, ISOCC 2023.

Abstract

This paper proposes a pulse amplitude modulation-4 (PAM-4) baud-rate clock and data recovery circuit that improves phase detector (PD) gain by increasing transition density. The proposed idea is that one sampler (shared sampler) serves as a data sampler and an error sampler simultaneously to minimize the number of samplers. It can reduce power consumption while having a high phase detector gain. In addition, the accuracy of the early/late signal is improved through the pattern-dependent phase detector. Simulation results show that the power consumption of the proposed receiver is 0.79 pJ/bit at -17.8 dB channel loss. This work is designed using a 65 nm process.

[ 43 ] Jeong-Mi Park, Jin-Ku Kang, “A PAM-4 Receiver with Selective Reference Voltage Adaptation for Low Sensitivity to Sampler Voltage Variations”, ISOCC 2023.

Abstract

This paper presents a pulse amplitude modulation-4 (PAM-4) receiver with low sensitivity to sampler voltage variation. Based on a time-based LSB decoder, it improves bit error rate (BER) performance in a voltage fluctuation environment by selectively applying data levels as reference voltages. In addition, a single error sampler is utilized to perform continuous reference voltage adaptation based on random data. By applying the data levels as reference voltages, the adaptation algorithm is simplified to finding only two data levels, which can reduce power consumption and hardware complexity. Simulation results show that the power consumption of the proposed receiver is only 1.45 pJ/bit at 20 Gb/s. This work is designed using a 65 nm process.

[ 42 ] Jaemyung Kim, Jin-Ku Kang, and Yongwoo Kim, “An FPGA-based Lightweight Deblocking CNN for Edge Devices”, ISCAS 2023.

Abstract

The demand for multimedia data is rapidly increasing in many applications running on edge devices. The data compression method is essential for efficient communication in limited network bandwidth. However, the compressed data contains blocking artifacts that cause perceptual quality degradation and visual recognition problems. Recently, deep learning-based works have shown excellent progress in deblocking tasks. However, most previous works have used complex and deep network architectures that require high computational cost and large amounts of memory to improve performance. Therefore, deploying these networks as edge device applications is highly challenging work. In this paper, we propose an FPGA-based lightweight deblocking convolutional neural network (CNN) for edge devices. We present a lightweight CNN architecture to efficiently reduce blocking artifacts in the color domain. In addition, two optimization methods were introduced to further decrease the computation and memory requirement. Compared with CNNs proposed in other works, the number of MAC operations and the model size were reduced by approximately x145 and x2,300, respectively. Finally, the proposed deblocking CNN has been implemented on a ZCU104 FPGA board. As a result of measuring the frame rate, it achieved 30.73 FPS.

[ 41 ] Jaemyung Kim, Jin-Ku Kang, and Yongwoo Kim, “An FPGA Implementation of CNN-based Compression Artifact Reduction”, ISOCC 2022.

Abstract

This paper proposes a convolutional neural network (CNN) based compression artifact reduction hardware. The proposed CNN architecture is applied to re-parameterization and INT8 quantization methods for efficient inference in edge devices. As a result of applying the optimization methods, the model size was reduced by ×5.62, and the number of operations was reduced by ×1.72. The proposed hardware achieves a frame rate of 33.33 FPS when implemented on a Xilinx ZCU104 SoC.

[ 40 ] Jihun Jeon, Jin-Ku Kang, “Filter Pruning Method for Inference Time Acceleration Based on YOLOX in Edge Device”, ISOCC 2022

Abstract

Convolutional neural network (CNN) has a lot of parameters and floating point operations (FLOPs), so it is difficult to use it in edge devices with limited resources. To solve this problem, the filter pruning method of our previous study was extended and applied to the state-of-the-art object detection network, YOLOX. In addition, the inference time of the pruned network was measured on NVIDIA Jetson Xavier NX using the PASCAL VOC dataset to confirm performance improvement in the actual edge device. When the target pruning rates of parameters and FLOPs were 40% and 30%, mean average precision (mAP)(0.5) improved by 0.07%, mAP(0.5:0.95) decreased by 0.8%, and inference time improved by 19.48%. Also, when the target pruning rates of parameters and FLOPs were 40% and 50%, mAP(0.5) decreased by 0.57%, and mAP(0.5:0.95) decreased by 2.84%, but the inference time was improved by 36.21%.

[ 39 ] Sang‐Ung Shin, Jin‐Ku Kang, and Yongwoo Kim, “A Design and Implementation of MIPI A-PHY RTS Layer”, ISOCC 2022

Abstract

This paper implements and verifies the A-PHY and RTS layers, which are the newly proposed SerDes standards. The A-PHY and RTS layers were verified using a Xilinx KC705 FPGA board and a loopback module. As a result of synthesis in FPGA, it was confirmed that 3,924 LUTs, 2,019 registers, and 132 block memories were used, and the maximum operating speed was 200 MHz.

[ 38 ] Jin-ho Kim and Jin-ku Kang, “A Wide-range Low Power Quarter Rate Single Loop CDR”, ISOCC 2022.

Abstract

This paper presents a reference-less single loop clock and data recovery (CDR) circuit. The proposed CDR operates with unlimited capture range. The problem caused by oversampling is removed by using a bang-bang phase detector (BBPD) with two samples per 1 UI. Simulation results show that the proposed CDR achieves a wide capture range from 2.6 Gb/s to 13.2 Gb/s, with a power consumption of 0.363 pJ/bit at 13.2 Gb/s. This work is designed using a 28 nm CMOS process.

[ 37 ] Hwang-ung and Jin-ku Kang, “High-speed serial interface using PWAM signaling scheme”, ISOCC 2022.

Abstract

This paper presents a novel PWAM signaling scheme, which improves the high-speed data transmission capability by increasing the minimum pulse width compared to the conventional PWAM scheme. In addition, versus the existing PAM, the power efficiency of the transceiver employing a novel PWAM signaling scheme is improved by the PWM modulator based on CMOS logic. The 10-Gb/s transceiver designed for a 0.18-μm CMOS process consumes 229 mW and has a power efficiency of 22.9 pJ/bit.
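The power-efficiency figure quoted above follows from the other two numbers: energy per bit is power divided by bit rate, and 1 mW divided by 1 Gb/s is 1 pJ/bit. A minimal sketch of that arithmetic (the function name is illustrative and does not come from the paper):

```python
def energy_per_bit_pj(power_mw: float, data_rate_gbps: float) -> float:
    """Return energy per bit in pJ/bit.

    1 mW / 1 Gb/s = 1e-3 J/s / 1e9 bit/s = 1e-12 J/bit = 1 pJ/bit,
    so no extra scale factor is needed with these units.
    """
    if data_rate_gbps <= 0:
        raise ValueError("data rate must be positive")
    return power_mw / data_rate_gbps


# Figures taken from the abstract above: 229 mW at 10 Gb/s.
print(energy_per_bit_pj(229.0, 10.0))
```

With the abstract's figures this reproduces the stated 22.9 pJ/bit.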

[ 36 ] Hyunin-Kim and Jin-ku Kang, “A Low-Power Counter-based Digital CDR”, ISOCC 2022.

Abstract

This paper presents a counter-based digital CDR. For frequency acquisition, the frequency direction is determined by comparing the number of edges of the input data and the recovered data. For fine frequency acquisition, the FLL gain is reduced by counting the number of times the direction signal changes. The proposed digital CDR is designed and simulated in a 28 nm CMOS technology and consumes 6.5 mW from a 1 V supply with 10 Gb/s input data.

[ 35 ] Sang-Ung Shin, Jin-Ku Kang and Yongwoo Kim, “A Design and Implementation of Automotive SerDes A-PHY RTS Layer on FPGA”, ICNGC 2022.

Abstract

A-PHY interface, an automotive high-speed Serializer/Deserializer (SerDes) interface standard, was proposed by the mobile industry processor interface (MIPI). A-PHY interface created the retransmission (RTS) layer to ensure data transmission in noisy automotive environments. In this paper, we propose and design a detailed structure of the RTS layer of the A-PHY interface standard. The proposed RTS layer is designed to meet the A-PHY interface standard's retransmission specification and is implemented on the FPGA and verified by simulation and video data transmission. As a result of FPGA implementation, 2,019 registers, 3,924 LUTs, and 132 block memories were used, with a maximum operating frequency of 200 MHz.

[ 34 ] Kyeong-Min Ko, Dohyeon Kwon and Jin-Ku Kang, "Design of 20Gb/s PAM4 Transmitter with Maximum Transition Elimination and Transition Compensation Techniques", ISOCC 2021, pp. 405-406, Nov 25, 2021.

Abstract

In this paper, a pulse amplitude modulation 4-level (PAM4) transmitter with maximum transition elimination (MTE) and transition compensation techniques is presented. In the PAM4 signal, inter-symbol interference (ISI) and crosstalk noise are more affected by the maximum transition than by the middle or the minimum transition. The proposed PAM4 transmitter uses an encoder that maps data into maximum-transition-eliminated data. The proposed PAM4 transmitter also uses the transition compensation (TC) technique, with additional logic and a driver that expand the middle eye in the output eye diagram to balance the middle eye with the top and bottom eyes. Compared to the original PAM4 transmitter, the top and bottom eye height and width of the PAM4 transmitter with the MTE encoder are expanded by about 50%. With the TC technique, the difference between eye areas is reduced to less than 15%.

[ 33 ] Hyung-Wook Lee, Kyeong-Min Ko, Jin-Ku Kang, "An 8 - 26 Gb/s Single Loop Reference-less CDR with Unrestricted Frequency Acquisition", ISOCC 2021, pp. 45-46, Nov 25, 2021.

Abstract

This paper presents an 8–26 Gb/s single loop referenceless CDR with unrestricted frequency acquisition. The CDR circuit is designed in a 28 nm CMOS technology. A frequency detector (FD) controller and mode switch are proposed to extend the frequency capture range of the frequency detector. The phase frequency detector (PFD) with the FD controller and the mode switch has unlimited capture range as long as the target frequency is within the frequency range of the VCO. The simulation shows the proposed CDR achieves the capture range from 8 Gb/s to 26Gb/s and the frequency acquisition time of 0.47μs.

[ 32 ] Jaemyung Kim, Yongwoo Kim, Jin-Ku Kang, “An FPGA Implementation of Quantized CNN Hardware for IoT Devices” 2021 ICNGC.

Abstract

Due to the recent improvement in the computational power of hardware and the growth of data, a deep learning-based approach that optimizes parameters using massive data showed excellent performance. In computer vision, research using a convolutional neural network(CNN) is being actively conducted. However, it is challenging to apply to IoT devices due to the high computational complexity and massive memory usage required. In this paper, we propose a quantized CNN hardware for IoT devices that optimized memory usage and computation complexity. In addition, we present a quantization framework for the proposed hardware design. The presented framework includes floating-point training, quantization, fully integer arithmetic inference, and hardware design processes. As a result of implementing the quantized CNN on the Xilinx ZC702 evaluation board, power consumption and inference speed improved by 4.86× and 2.58×, respectively, compared to 32-bit floating-point hardware.

[ 31 ] Chang han Rho, Jin-ku Kang, Jin Liu, “Two-step Time-to-Digital Converter using pulse-shifting time-difference repetition circuit” ISOCC 2021, pp. 333-334, Nov 25, 2021

Abstract

This paper proposes a two-step TDC (time-to-digital converter) with a pulse-shifting TDR (time-difference repetition) circuit, improved from the conventional time-difference repetition circuit, which only served as a time amplifier. The proposed TDC requires no time amplifier. Within the pulse-shifting TDR, two pulses carrying the residual time difference information rotate in one shared loop, and fine quantization is performed by shifting the pulse with 5 ps resolution. This mechanism not only effectively reduces the significant delay mismatches caused by the devices, but is also area and power efficient in comparison with conventional two-step TDCs that utilize both a time amplifier and a fine TDC. The proposed circuit is fabricated in a 180 nm process and achieves 8-bit, 5 ps resolution. The conversion rate is 10 MS/s while consuming 2.43 mW and occupying 0.18 mm^2 area with 1275 ps dynamic range.
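The resolution, bit count, and dynamic range quoted above are mutually consistent: the largest 8-bit output code is 255, and 255 steps of 5 ps span 1275 ps. A minimal sketch of that check (the function name is illustrative, not from the paper):

```python
def tdc_dynamic_range_ps(n_bits: int, lsb_ps: float) -> float:
    """Return the time span covered by an n-bit TDC with the given LSB.

    The span is taken as the largest output code times one LSB,
    i.e. (2**n_bits - 1) * lsb_ps.
    """
    if n_bits <= 0 or lsb_ps <= 0:
        raise ValueError("n_bits and lsb_ps must be positive")
    max_code = (1 << n_bits) - 1
    return max_code * lsb_ps


# Figures taken from the abstract above: 8 bits at 5 ps resolution.
print(tdc_dynamic_range_ps(8, 5.0))
```

For 8 bits and a 5 ps LSB this gives the 1275 ps stated in the abstract.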

[ 30 ] Kyung-Sub Son, Namyong Kim, Jin-Ku Kang, “Counter-based Eye-open Monitoring System Design for High-speed Serial Interface”, ISOCC 2019, pp. 311-312, Apr 27, 2020.

Abstract

An eye-open monitoring system based on signal counting is introduced. Data is sampled 2048 times and "0" or "1" is counted to determine eye-opening at each sampling point. The FPGA stores the counter value and outputs the estimated eye-diagram. From the estimated eye-opening information, the system calculates the open area of the eye and the optimal sampling point. The size and phase of the sampling point are each controlled with 5 bits. The proposed eye-open monitor was fabricated in a 180-nm CMOS process and consumes 86 mW at a 2 Gb/s data rate from a 1.8 V supply.

[ 29 ] Min Kim, Kyung-Sub Son, Jin-Ku Kang, "A two-step time-to-digital converter using ring oscillator time amplifier", ISOCC 2018, pp.143-144, Feb. 24, 2019.

Abstract

A two-step time-to-digital converter using a ring oscillator time amplifier is presented. The time amplifier structure does not accumulate the error in the iterative process of time. There are 8 bits in total, of which 4 bits are obtained in the coarse conversion and 4 bits are obtained in the fine conversion by amplifying the remaining time. The TDC circuit occupies an area of 0.34 mm^2 using a 180 nm CMOS process. The effective number of bits is 7.42 bits. The TDC circuit has shown 10.5 ps resolution for a 50 MHz. The DNL and INL are 0.7 LSB and 0.5 LSB, respectively. The power consumption is 1.34 mW with a 1.8 V supply.

[ 28 ] Nguyen Huu Tho, Kyung-Sub Son, Kyongsu Lee, and Jin-Ku Kang, "A 200-Mb/s to 3-Gb/s Wide-band Referenceless", IEICE Electronics Express, Vol.14(2017), No.8, pp. 20161279, Apr. 25, 2017.

Abstract

This paper presents a 200-Mb/s to 3-Gb/s half-rate referenceless clock and data recovery (CDR) circuit in a 180 nm CMOS process. A bidirectional frequency detector (FD) is proposed to eliminate the harmonic locking issue and reduce the frequency acquisition time. A frequency band selector for the wide-range voltage-controlled oscillator (VCO) is also presented to select an exact frequency band of the VCO. The simulation shows the CDR achieves 10-ps peak-to-peak jitter at 3 Gb/s and a frequency acquisition time of 12.9 μs.

[ 27 ] Eunho Yang, Kyoungsu Lee, Jin-Ku Kang, "A low power 120-to-520Mb/s clock and data recovery circuit for PWM signaling scheme", Circuits and Systems (ISCAS), 2015 IEEE International Symposium on, pp. 345-348, May 2015.

Abstract

This paper presents a 120-to-520Mb/s clock and data recovery (CDR) circuit that utilizes pulse width modulation (PWM) signaling scheme. Compared to the conventional approach, the proposed retiming scheme improves sampling margin over 200%, which results in lower BER. The proposed idea has been simulated in a 65nm CMOS technology. The post layout simulation result has shown that recovered clock and data have 3.42ps and 7.55ps rms jitter at 500Mb/s data rate. The CDR circuit consumes 1.97mW (1.2V supply) at 500Mb/s of MIPI M-PHY signaling format.

[ 26 ] Kyung-Sub Son, Jin-Ku Kang, "On-chip jitter tolerance measurement technique with independent jitter frequency modulation from VCO in CDR", IEICE Electronics Express, Vol.12(2015), No.15, pp. 20150570, July 2015.

Abstract

We present an on-chip measurement technique to characterize the jitter tolerance of a clock and data recovery (CDR) circuit. The proposed jitter modulation scheme incorporates a modulated charge pump and a pulse generation circuit to apply a periodic triangular voltage directly to the control voltage of the CDR circuit. This jitter frequency generation scheme, independent from the VCO in the CDR, allows a wide and linear control of jitter. The modulated jitter amplitude range was 0.05–2 UIpp at 10 MHz, and the jitter frequency range was 100 kHz–20 MHz. The circuit was fabricated in 65 nm CMOS, and the jitter tolerance was successfully measured at 5 Gbps with a 2^7-1 PRBS pattern. The accuracy was within 10% error from the external BER equipment measurement result. The whole CDR circuit consumes 29.9 mW at a supply voltage of 1.2 V.

[ 25 ] Taek-Joon An, Kyung-Sub Son, Young-Jin Kim, In-Seok Kong, Jin-Ku Kang, "A 8.7mW 5-Gb/s clock and data recovery circuit with 0.18-µm CMOS", Circuits and Systems (ISCAS), 2014 IEEE International Symposium on, pp. 2329-2332, June 2014.

Abstract

The rapid growth of the data rate in serial links reveals the problem of power consumption, motivating the use of low-power building blocks. This paper presents a low-power clock and data recovery (CDR) circuit. Power reduction in the phase detector (PD) is achieved by employing a dynamic CML latch, which draws current during half of the clock cycle, and a voltage-to-current (V/I) converter, which performs the XOR function itself. The CDR circuit is simulated using 5-Gb/s data with 0.18-μm CMOS technology, and the circuit consumes 8.7 mW from a 1.8-V supply.

[ 24 ] Taek-joon An, Jin-Ku Kang, "A 5-Gb/s 11.4mW half-rate CDR in 0.18μm CMOS", ISOCC, 2013 International, pp. 333-334, Nov. 2013.

Abstract

A low-power clock and data recovery (CDR) circuit with a phase detector (PD) using dynamic current-mode logic latches and a novel V/I converter is described. The proposed latch draws a current during half of the clock cycle and the proposed V/I converter includes the XOR function by itself. The half-rate CDR circuit is simulated using 5-Gb/s data with 0.18-μm CMOS technology, and the circuit consumes only 11.4 mW from a 1.8-V supply.

[ 23 ] Jin-Cheol Seo, Sang-Soon Im, Kwan Yoon, Seung-Wook Oh, Taek-Joon An, Gi-Yeol Bae and Jin-Ku Kang, "A 1.62/2.7/5.4Gbps clock and data recovery circuit for DisplayPort 1.2", IEEE SOCC 2012, pp.57-60, Sept. 2012.

Abstract

In this paper, a clock and data recovery (CDR) circuit that supports triple data rates of 1.62, 2.7 and 5.4 Gbps for the DisplayPort 1.2 standard is described. The proposed CDR circuit employs a dual-loop architecture that includes a phase-locked loop and a frequency-locked loop. The circuit with a half-rate phase detector has a triple-mode voltage-controlled oscillator (VCO) which changes the operating frequency with a 3-bit code. The prototype chip is designed and verified using a 65 nm CMOS technology. The recovered-clock jitter at data rates of 1.62/2.7/5.4 Gbps with a 2^31-1 PRBS is measured to be 7/5.6/4.7 ps rms, respectively, while consuming 11 mW with a 1.2 V supply.

[ 22 ] Seung-Wuk Oh, Sang-Ho Kim, Jin-Ku Kang, "An Audio Clock Regenerator with a Wide Dividing Ratio for HDMI", International Symposium on Circuit and Systems, pp.2019-2022, May. 2012.

Abstract

This paper presents a clock regenerator using two 2nd-order Σ-Δ (sigma-delta) modulators for the wide range of dividing ratios specified in the HDMI standard. The proposed circuit adopts a fractional-N frequency synthesis architecture for PLL-based clock regeneration. The source device sends N (Dividing ratio of video clock to TMDS clock) and CTS (Cycle Time Stamp) values to the sink device for regenerating the audio clock. By processing the integer and fractional parts of the N and CTS values separately at two different Σ-Δ modulators, the proposed circuit covers the very wide range of dividing ratios specified in the HDMI standard and occupies a small chip area. The circuit is fabricated using 0.18um CMOS and shows 13 mW power consumption with an on-chip loop filter.

[ 21 ] Benjamin P. Wilkerson, Tae-Ho Kim, Jin-Ku Kang, "Low-Power Non-Coherent Data and Power Recovery Circuit for Implantable Biomedical Devices", International SOC Design Conference, Nov. 2011.

Abstract

In this paper, we present low-power non-coherent amplitude shift keying (ASK) and phase shift keying (PSK) demodulators with an inductive power self-recovery system, and the data-clock recovery circuit for implantable biomedical devices. Both circuits use different coupling factors, from 0.1 to 0.5 in ASK and 0.5 in PSK, for the inductive link. The recovered power regulator, which consists of a beta multiplier reference (BMR) with band-gap, uses the bridge rectifier output as an unregulated DC input and produces a regulated output from 1.73 to 1.8 V for the demodulator with a 2 MHz carrier. The PSK demodulator uses a new demodulation method that uses signals from two differential comparators with low-pass pre-filtered (LPPF) and high-pass pre-filtered (HPPF) outputs for detecting phase changing edges to recover data and clock signals. The full wave detecting signals are used as differential inputs in the LPPF comparator of the ASK demodulator. The results of the demodulators with the self-power supply show less than 62 μW and 115 μW power consumption in ASK and PSK, respectively, at a 1 Mbps data transfer rate.

[ 20 ] Tae-Ho Kim, Jong-Seok Han, Sang-Soon Im, Jae-Young Jang, Jin-Ku Kang, "A 4Gb/s Adaptive FFE/DFE Receiver with Data-Dependent Jitter Measurement", European solid-state circuit conference, Sept. 2011.

Abstract

This paper presents an adaptive FFE/DFE receiver with data-dependent jitter measuring algorithm. The proposed adaptive algorithm determines the compensation level by measuring the input data-dependent jitter. The adaptive algorithm is combined with a CDR phase detector. The receiver is fabricated in a 0.13-μm CMOS technology and the compensation range of equalization is up to 26 dB at 2GHz. Test chip is verified for 40-inch FR4 trace and 53-cm FPC (Flexible Printed Circuit) channel. The receiver occupies 440μm × 520μm, and power dissipation is 49mW (excluding I/O buffers) from a 1.2-V supply.

[ 19 ] Hyun-Bae Jin, Jong-Seok Han, Jin-Ku Kang, "A 720Mbps Fast Auxiliary Channel Design for DisplayPort 1.2", ISOCC 2010, pp. 404-407, Nov. 2010.

Abstract

This paper presents the design of a fast auxiliary channel bus for the DisplayPort 1.2 interface. The fast auxiliary channel supports Manchester transactions at 1 Mbps and fast auxiliary transactions at 720 Mbps. The Manchester transaction is used for managing the main link and auxiliary channel, and the fast auxiliary transaction is for data transfer via the auxiliary channel. A simplified serial bus architecture is proposed for implementation in the fast auxiliary channel. The fast auxiliary channel is synthesized using an FPGA board and it operates at 72 MHz to support 720 Mbps.

[ 18 ] Jae-Wook Yoo, Tae-Ho Kim, Dong-Kyun Kim, Jin-Ku Kang, "A CMOS 5.4/3.24Gbps Dual-rate Clock and Data Recovery", SOCC 2010, pp. 88-91, Sept. 2010.

Abstract

This paper presents a clock and data recovery (CDR) circuit that supports dual data rates of 5.4 Gbps and 3.24 Gbps for a DisplayPort v1.2 sink device. The quarter-rate linear PD in the proposed CDR reduces jitter by enhancing the up and down pulse width. A charge pump (CP) is designed to compensate for the different up and down pulse widths of the PD and to reduce the current mismatch and power consumption. A voltage-controlled oscillator (VCO) is designed with a “Mode” switching control for operating frequency selection. The measured RMS jitter of the recovered clock signal is 3.0 ps and the peak-to-peak jitter is 24.89 ps under the simulation of a 2^31-1 bit-long pseudo random bit sequence (PRBS) at the bit rate of 5.4 Gbps. The chip area is 1.0 mm x 1.3 mm and the power consumption is 117 mW from a 1.8 V supply using a 0.18 μm CMOS process.

[ 17 ] Sang-Ho Kim, Hyung-Min Park, Tae-Ho Kim, Jin-Ku Kang, Jin-Ho Kim, Jae-Youl Lee, Yoon-Kyung Choi, Myung-Hee Lee, "A 1.7Gbps DLL-based Clock Data Recovery in 0.35um CMOS", SOCC 2010, pp. 84-87, Sept. 2010.

Abstract

This paper presents a DLL (Delay Locked Loop)-based CDR (Clock Data Recovery) design with an nB(n+2)B data formatting scheme. Due to the proposed data formatting scheme, the CDR does not require an external reference clock. The proposed nB(n+2)B data formatting is done by inserting the ‘01’ pattern in every N-bit data. To prove the feasibility of the scheme, a 1.7 Gbps CDR is designed, simulated and fabricated. The proposed CDR achieves less jitter due to the DLL structure. The proposed 1.7 Gbps CDR with the 10B12B data formatting consumes approximately 8 mA under a 3.3 V power supply using a 0.35 μm CMOS process.
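The nB(n+2)B formatting described above can be illustrated with bit strings. This is a minimal sketch, not the paper's implementation: the abstract only says that ‘01’ is inserted in every N-bit data, so appending the marker after each word is an assumption, and both function names are hypothetical. The fixed ‘01’ gives one guaranteed rising edge per (n+2)-bit frame, which is presumably what the DLL aligns to in place of a reference clock.

```python
def format_nb_n2b(payload: str, n: int = 10, marker: str = "01") -> str:
    """Split payload bits into n-bit words and attach the marker to each.

    Assumption: the marker is appended after each word; its exact position
    within the frame is not given in the abstract.
    """
    if set(payload) - {"0", "1"}:
        raise ValueError("payload must contain only '0' and '1'")
    if len(payload) % n != 0:
        raise ValueError("payload length must be a multiple of n")
    words = [payload[i:i + n] for i in range(0, len(payload), n)]
    return "".join(word + marker for word in words)


def strip_nb_n2b(stream: str, n: int = 10, marker: str = "01") -> str:
    """Inverse of format_nb_n2b: check each frame's marker and drop it."""
    frame = n + len(marker)
    if len(stream) % frame != 0:
        raise ValueError("stream length must be a multiple of the frame size")
    words = []
    for i in range(0, len(stream), frame):
        chunk = stream[i:i + frame]
        if chunk[n:] != marker:
            raise ValueError("marker missing in frame starting at bit %d" % i)
        words.append(chunk[:n])
    return "".join(words)


# 10B12B example: every 10 payload bits become a 12-bit frame.
framed = format_nb_n2b("1111111111" + "0000000000")
print(framed)
```

Even the all-ones and all-zeros words in the example end up with a 0-to-1 transition in every frame, at the cost of 2 overhead bits per 10 payload bits.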

[ 16 ] Jae-Wook Yoo, Dong-Kyun Kim, Jin-Ku Kang, "A 5.4Gbps/3.24Gbps Dual-rate CDR with Strengthened Up/Down Pulse Ratio", ISOCC 2009, pp. 528-531, Nov. 2009.

Abstract

This paper describes a clock and data recovery (CDR) circuit that supports dual data rates of 5.4 Gbps and 3.24 Gbps for a DisplayPort 1.2 sink device. The proposed CDR uses a quarter-rate linear phase detector (PD). A charge pump is designed to compensate for the different up and down pulse widths of the PD and to reduce the current mismatch and power consumption. The proposed PD strengthens the up/down pulse ratio to 5:4. The simulated peak-to-peak jitter is reduced to 7.715 ps from the 25.16 ps of the conventional approach at 5.4 Gbps. A voltage-controlled oscillator (VCO) is designed for changing the operating frequency of the quarter-rate clock with a “Mode” switching control. This work is designed based on a 0.18 μm CMOS process. Simulation shows the power consumption is 117 mW from a 1.8 V supply.

[ 15 ] Seung-Won Lee, Tae-Ho Kim, Jae-Wook Yoo, Jin-Ku Kang, "A 2.7Gbps & 1.62Gbps Dual-Mode Clock and Data Recovery for DisplayPort in 0.18um CMOS", SOCC 2009, pp. 179-182, Sept. 2009.

Abstract

This paper describes a clock and data recovery (CDR) circuit that supports dual data rates of 2.7 Gbps and 1.62 Gbps for the DisplayPort standard. The proposed CDR has a dual-mode voltage-controlled oscillator (VCO) that changes the operating frequency with a “Mode” switch control. The chip has been implemented using a 0.18 μm CMOS process. Measured results show the circuit exhibits peak-to-peak jitter of 37 ps (@2.7 Gbps) and 27 ps (@1.62 Gbps) in the recovered data. The power dissipation is 80 mW at the 2.7 Gbps rate from a 1.8 V supply.

[ 14 ] Jae-Wook Yoo, Tae-Ho Kim, Kwang-Su Ko, Jin-Ku Kang, "5.4Gbps/3.24Gbps Dual-rate CDR with Quarter-rate Linear Phase Detector", ITC-CSCC 2009, pp. 779-782, Jul. 2009.

Abstract

This paper describes a clock and data recovery (CDR) circuit that supports dual data rates of 5.4Gbps and 3.24Gbps for a DisplayPort 1.2 sink device. The proposed CDR uses a quarter-rate linear phase detector (PD). A voltage-controlled oscillator (VCO) is designed for changing the operating frequency of the quarter-rate clock with a “Mode” switching control. This work is designed based on 0.18

[ 13 ] Suk-Won Lee, Hyung-Min Park, Sang-Ho Kim, Jin-Ku Kang, "A High-Resolution Coarse-Fine Time-to-Digital Converter Reducing a Delay Mismatch in the Vernier Delay Line", ITC-CSCC 2009, pp. 773 -776, Jul. 2009.

Abstract

This paper describes a coarse-fine time-to-digital converter (TDC) design using a vernier delay line (VDL) for a digital frequency synthesizer. We propose a coarse-fine TDC architecture using a dual-DLL and a multiplexer that reduces the delay mismatch caused by process, voltage and temperature (PVT) variations. The TDC circuit is designed and simulated in 0.18 μm CMOS technology. The proposed TDC achieves 11.1 ps fine resolution. The INL is less than ±0.29 LSB and the DNL is within +0.15 LSB to −0.29 LSB.
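As a rough illustration of coarse-fine conversion with a vernier delay line, the sketch below splits a time interval into a coarse count and a fine count. Only the 11.1 ps fine resolution comes from the abstract; the two vernier cell delays, the coarse step of 16 fine steps, and the function names are assumptions chosen for the example, not values from the paper. Times are kept as integer femtoseconds to avoid rounding.

```python
# Hypothetical vernier cell delays (femtoseconds). Their difference sets the
# fine resolution; only the 11.1 ps difference is taken from the abstract.
SLOW_CELL_FS = 111_100
FAST_CELL_FS = 100_000
FINE_STEP_FS = SLOW_CELL_FS - FAST_CELL_FS   # 11_100 fs = 11.1 ps
COARSE_STEP_FS = 16 * FINE_STEP_FS           # assumed coarse step (177.6 ps)


def coarse_fine_tdc(interval_fs: int) -> tuple:
    """Quantize an interval into (coarse_code, fine_code)."""
    if interval_fs < 0:
        raise ValueError("interval must be non-negative")
    coarse = interval_fs // COARSE_STEP_FS
    residue = interval_fs - coarse * COARSE_STEP_FS
    fine = residue // FINE_STEP_FS           # vernier stage resolves the residue
    return coarse, fine


def reconstruct_fs(coarse: int, fine: int) -> int:
    """Convert the two codes back to an estimated interval."""
    return coarse * COARSE_STEP_FS + fine * FINE_STEP_FS


if __name__ == "__main__":
    c, f = coarse_fine_tdc(400_000)          # a 400 ps interval
    print(c, f, reconstruct_fs(c, f))
```

In this idealized model the quantization error is always smaller than one fine step; delay mismatch, which the paper targets, is not modeled.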

[ 12 ] Yong-Woo Kim, Jin-Ku Kang, "An 8B/10B encoder with a modified coding table", APCCAS 2008, pp. 1522-1525, Dec. 2008.

Abstract

This paper presents the design of an 8B/10B encoder with a modified coding table. The proposed encoder is based on a reduced coding table with a modified disparity control block. Synthesized in a 0.18μm CMOS process, the proposed encoder operates at 343 MHz and occupies a chip area of 1886 μm² with 189 logic gates. It consumes 2.74mW of power. Compared to conventional approaches, the operating frequency is improved by 25.6% and chip area is decreased to 43%.

[ 11 ] Tae-Ho Kim, Sang-Ho Kim, Jin-Ku Kang, "A 5-Gb/s Continuous-time Adaptive Equalizer and CDR using 0.18um CMOS", ISOCC 2008, pp. 49-52, Nov. 2008.

Abstract

In this paper, a 5-Gb/s receiver with an adaptive equalizer and clock and data recovery (CDR) for a serial link interface is proposed. To operate adaptively at a 5-Gb/s data rate, the LMS algorithm uses two internal signals from the slicers, which do not affect the gain boosting performance. In addition, this scheme allows operation without a passive filter, since the two internal slicer signals have similar DC magnitudes. The proposed adaptive equalizer can compensate up to 20 dB of loss and adapts to various environments, including a 15-m shielded twisted pair (STP) cable for DisplayPort and flame retardant 4 (FR-4) traces up to 60 inches. This work is implemented in 0.18-μm 1-poly 4-metal CMOS technology. The equalizer dissipates only 6 mW and occupies 200μm x 350μm. The total power dissipation with the combined CDR is 164 mW (including output buffers), and the receiver operates at data rates up to 5 Gb/s.

[ 10 ] Hyun-Chul Lee, Suk-Won Lee, Jin-Ku Kang, "Spread Spectrum Clock Generator for DisplayPort", ISOCC 2008, pp. 5-8, Nov. 2008.

Abstract

This paper describes a spread spectrum clock generator (SSCG) for the DisplayPort transmitter system. The proposed architecture generates the spread spectrum clock using a fractional-N PLL. The SSCG uses a digital 2nd-order MASH 1-1 sigma-delta modulator and a 9-bit up/down counter. The SSCG generates clocks at 270MHz and 162MHz with 0.25% down-spreading and a 33 kHz triangular frequency modulation profile. The peak power reduction is about 5dBm. The circuit has been simulated in 0.18um CMOS technology.
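For readers unfamiliar with the modulator named in the abstract above, here is a minimal behavioral sketch of a generic accumulator-based MASH 1-1 sigma-delta modulator. It is a textbook-style model, not the paper's circuit: the modulus of 16, the input value, and the function name `mash_1_1` are illustrative assumptions, and the link to the 9-bit up/down counter is not modeled.

```python
def mash_1_1(x: int, modulus: int, n_samples: int) -> list:
    """Generic MASH 1-1: two cascaded first-order accumulators.

    Output y[n] = c1[n] + c2[n] - c2[n-1], where c1 and c2 are the carry
    bits of the two accumulators. The long-run average of y is x / modulus.
    """
    if not 0 <= x < modulus:
        raise ValueError("x must satisfy 0 <= x < modulus")
    s1 = s2 = 0          # accumulator states
    prev_c2 = 0          # delayed carry of the second stage
    out = []
    for _ in range(n_samples):
        s1 += x
        c1, s1 = divmod(s1, modulus)      # first-stage carry and residue
        s2 += s1                          # second stage integrates the residue
        c2, s2 = divmod(s2, modulus)
        out.append(c1 + c2 - prev_c2)     # noise-shaped divider offset
        prev_c2 = c2
    return out


if __name__ == "__main__":
    seq = mash_1_1(x=3, modulus=16, n_samples=1600)
    print(sum(seq) / len(seq))            # close to 3/16
```

In a fractional-N PLL the output sequence would be added to the integer divide value, so the average division ratio lands between two integers. If the 0.25% down-spread is read as the peak deviation below the nominal frequency, a 270 MHz clock sweeps between 269.325 MHz and 270 MHz.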

[ 9 ] Seung-Won Lee, Jae-Wook Yoo, Jin-Ku Kang, "A 2.7Gbps & 1.62Gbps Dual-Mode Clock and Data Recovery for DisplayPort", ISOCC 2008, pp. 13-16, Nov. 2008.

Abstract

This paper describes a clock and data recovery (CDR) circuit that supports dual data rates of 2.7Gbps and 1.62Gbps for a DisplayPort sink device. The CDR uses a half-rate linear phase detector (PD). A voltage-controlled oscillator (VCO) is proposed to change the operating frequency of the half-rate clock with a “Mode” switch control. This work is implemented in a 0.18μm CMOS process. The device exhibits peak-to-peak jitter of 12ps and 14ps in the recovered clock with random data inputs. The power dissipation is 81mW from a 1.8V supply.

[ 8 ] Yong-Woo Kim, Seong-Bok Cha, Jin-Ku Kang, "A Design of DisplayPort Link Layer", ISOCC 2008, pp. 45-48, Nov. 2008.

Abstract

This paper presents an implementation of the DisplayPort 1.1 link layer. The DisplayPort link layer provides isochronous transport service, link service, and device service. The isochronous transport service in the source device maps the video and audio streams onto the main link under a set of rules, so that the streams can be properly reconstructed in their original format and synchronized by the sink device. The link service is used for discovering, configuring, and maintaining the link by accessing the DPCD via the AUX CH. The main link transmitter and receiver are implemented with 4,820 ALUTs, 4,496 registers and 557,110 block memory bits, synthesized using Quartus II for an Altera Stratix II GX board, and can operate at 200.32MHz. The AUX CH block is implemented with 765 ALUTs and 298 registers.

[ 7 ] Wan-Sik Lim, Jin-Ku Kang, "Spread Spectrum Clock Generator by VCO Current Modulation", ISOCC 2007, pp. 469-472, Oct. 2007.

Abstract

Spread spectrum clock generation is an efficient way to reduce EMI radiation in modern mixed-signal chip systems. The proposed architecture generates the spread clock by directly injecting the modulation voltage into the VCO current source. This method has a fixed frequency ratio and has the advantages of a simple structure, lower power consumption and smaller area. The peak power reduction is 10dBm.

[ 6 ] Wan-Sik Lim, Jin-Ku Kang, "A Spread Spectrum Clock Generator using modulation on VCO current source for SATA-Ⅱ application", ITC-CSCC 2006, Jul. 2007.

Abstract

In this paper, we propose a spread spectrum clock generator phase locked loop (SSCG PLL) for the Serial Advanced Technology Attachment Ⅱ (SATA Ⅱ). We use a conventional integer PLL topology and a modulation block. The proposed SSCG generates a clock at 1.5 ㎓ with a ± 0.47% center spread ratio.

[ 5 ] Ho-Kyoung Lee, Jin-Ku Kang, "3.125Gb/s Reduced Word-aligning 1:10 Demultiplexer for Serial Communication", ITC-CSCC 2007, Jul. 2007.

Abstract

In this paper, a reduced 1:10 demultiplexer with word alignment and a comma detect block for the receiver in serial data communication is designed using 0.18 um CMOS technology. The proposed architecture simplifies the circuit structure with a new comma detect scheme and net routing. It operates from 1 Gb/s to 3.125 Gb/s using a single clock edge. Power consumption is 16 mW.
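The idea of comma detection followed by word-aligned 1:10 demultiplexing can be sketched in software as follows. This is a behavioral illustration only: it assumes the link uses the standard 8B/10B comma sequences (0011111 and 1100000, which begin the K28.5 code group), which the abstract does not state, and the function names are hypothetical.

```python
# 7-bit comma sequences of 8B/10B coding (assumed coding scheme).
COMMAS = ("0011111", "1100000")


def find_word_offset(bits: str) -> int:
    """Return the bit offset (0..9) of the 10-bit word boundary, or -1 if no comma."""
    positions = [p for p in (bits.find(c) for c in COMMAS) if p >= 0]
    if not positions:
        return -1
    # The comma starts at the first bit of its code group, so its position
    # modulo 10 is the alignment offset of every word in the stream.
    return min(positions) % 10


def demux_1_to_10(bits: str, offset: int) -> list:
    """Split the serial stream into aligned 10-bit words, dropping partial words."""
    return [bits[i:i + 10] for i in range(offset, len(bits) - 9, 10)]


if __name__ == "__main__":
    k28_5 = "0011111010"                       # K28.5, negative running disparity
    stream = "101" + k28_5 + "1010101010" + "0101010101" + "11"
    offset = find_word_offset(stream)
    print(offset, demux_1_to_10(stream, offset))
```

A hardware implementation would perform the same search on parallel data with a barrel shifter or clock-phase selection rather than on a stored string.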

[ 4 ] Ki-Hyuk Ha, Jin-Ku Kang, "A Low Power and Fully Differential 2.5GHz 70dBΩ CMOS TIA for Optical Communication", ITC-CSCC 2007, Jul. 2007.

Abstract: In this paper, a differential TIA is developed in 0.18㎛ CMOS technology. The proposed TIA consists of an input stage, an inverter gain stage and a fully differential amplifier. It is insensitive to the parasitic capacitance of the ESD protection and the PD, and its bandwidth is maximized. The differential TIA, with a gain of 70dB and a 3.125Gbps bandwidth, consumes 6.5mW of power. Because no inductor is used to extend the bandwidth for high-speed operation, the chip area is reduced.

[ 3 ] Jung-Yong Lee, Jin-Ku Kang, "1.25Gbps Clock/Data Recovery with a Wide Frequency Tracking", ISOCC 2006, pp. 249-252, Oct. 2006.

Abstract - An integrated 1.25Gbps half-rate clock and data recovery (CDR) circuit is presented. The circuit does not need a reference clock. It has a phase and frequency detector (PFD), which incorporates a bang-bang type oversampling PD and a rotational frequency detector (FD). It also has a ring oscillator type VCO with four delay stages and three zero-offset charge pumps. The currents of the three charge pumps A, B, and C are 10, 50 and 200 ㎂, respectively. The circuit recovers data at rates from 900Mbps to 1.5Gbps, while the tuning range of the VCO is from 400MHz to 800MHz. With the proposed PD and FD, a tracking range of 20% (up tracking) to 28% (down tracking) can be achieved. The chip has been fabricated in a 0.25㎛ CMOS process and consumes 150㎽ from a 2.5 / 3.3 V supply.

[ 2 ] Il-Do Kim, Byeong-Jin Noh, Ki-Hyuk Ha, Jin-Ku Kang, "A multi-spread ratio spread-spectrum clock generator for LCD panel", ISOCC 2006, pp. 447-450, Oct. 2006.

Abstract

In this paper, a multi-spread ratio spread spectrum clock generator (SSCG) for an LCD timing controller is described. A PLL-based spread spectrum clock generator with six different spread ratios is designed by connecting a programmable charge pump to the low pass filter (LPF). To obtain the “Hershey Kiss” modulation profile, a number of charge pump circuits are used in parallel. To suit the timing controller of an LCD panel, the circuit supports input and spread output frequencies from 25MHz to 100MHz. Spread ratios from 0.5% to 12% are obtained through the ratio-selecting signals. It dissipates 100mW of power at 65MHz. The proposed circuit is designed using the TSMC 0.18um CMOS technology.

[ 1 ] Jung-Young Lee, Wan-Sik Lim, Ki-Hyuk Ha, Jin-Ku Kang, "10GHz LC Tank Multiphase PLL for 40Gbps CDR", ITC-CSCC 2006, pp. 137-140, Jul. 2006.

Abstract

In this paper, a 10GHz LC tank multiphase phase locked loop (PLL) is designed for 40 Gb/s clock and data recovery (CDR). The LC tank VCO generates an 8-phase clock signal, operating from 9.7 GHz to 10.5 GHz, with adjacent clock phases 45 degrees apart. The divider achieves a division ratio of 160 in two stages, a divide-by-4 followed by a divide-by-40. The phase frequency detector (PFD) compares the divided signal with the reference signal from a crystal oscillator and generates Up and Down signals for the charge pump, which produces the control voltage for the VCO. The proposed circuit is designed using 0.18um CMOS technology, and the operating voltage is 1.8V.
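The numbers in the abstract above imply a few derived quantities, worked out in the short sketch below. The reference frequency is not stated in the abstract; it is derived here under the assumption that the loop locks with the VCO frequency equal to 160 times the reference. The function name `pll_plan` and the returned keys are hypothetical.

```python
def pll_plan(vco_hz: int, div_stages: tuple, n_phases: int) -> dict:
    """Derive divider ratio, implied reference frequency and phase spacing."""
    ratio = 1
    for stage in div_stages:
        ratio *= stage                              # cascaded dividers multiply
    return {
        "divide_ratio": ratio,
        "reference_hz": vco_hz / ratio,             # assumes f_vco = ratio * f_ref at lock
        "phase_step_deg": 360 / n_phases,
        "phase_step_s": 1 / (vco_hz * n_phases),    # time between adjacent phases
    }


if __name__ == "__main__":
    nominal = pll_plan(10_000_000_000, (4, 40), 8)
    print(nominal)
    # Implied reference range over the stated 9.7 GHz to 10.5 GHz VCO range.
    low = pll_plan(9_700_000_000, (4, 40), 8)["reference_hz"]
    high = pll_plan(10_500_000_000, (4, 40), 8)["reference_hz"]
    print(low, high)
```

At 10 GHz the eight phases are spaced 12.5 ps apart, which is half of the 25 ps bit period at 40 Gb/s.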
