International Journal

2024

Jin-ho Kim, Tae ho Kim, Hyung-wook Lee, Jeong-mi Park, Jin-ku Kang, “1.4-8 Gb/s Low Power Quarter-Rate Single-Loop Referenceless CDR With Unlimited Capture Range”, IEEE Transactions on Circuits and Systems II: Express Briefs, March, 2024

Abstract

This paper describes a low power quarter-rate single-loop clock and data recovery circuit (CDR) without a reference clock. A new frequency acquisition method is proposed, featuring unlimited frequency capture range and a short locking time. The proposed CDR has been designed and fabricated in a 28nm CMOS process, and measurement results show a capture range of 1.4Gb/s to 8Gb/s over the full voltage-controlled oscillator (VCO) operating range, with a locking time of approximately 1.37μs. The power efficiency is 0.71 pJ/bit at 8Gb/s input data. 

2023

Jaemyung Kim, Hyun-Ho Kim, Doo-Chun Seo, Jae-Heon Jeong, Jin-Ku Kang, Yongwoo Kim, “MASCAR: Multi-domain Adaptive Spatial-spectral Variable Compression Artifact Removal Network for Multi-spectral Remote Sensing Images”, IEEE Transactions on Geoscience and Remote Sensing, December, 2023

Abstract

In remote sensing environments, image compression is essential to efficiently transmit and store high-resolution images due to the limited bandwidth and storage capacity. However, compression often leads to image quality degradation, requiring compression artifact removal technology in the post-processing stage. Although deep neural networks have shown remarkable performance in image restoration, most existing methods have not adequately considered the compression conditions specific to remote sensing environments and have been evaluated primarily on synthetic datasets. To solve these issues, we propose a multi-domain adaptive spatial-spectral variable compression artifact removal network (MASCAR), which effectively restores the earth surface details of compressed images in remote sensing environments. We introduce a multi-domain local patch collaborative learning strategy that extracts diverse features by decomposing the input local patch into different domains. In addition, we propose a detail focusing approach to direct the network’s focus towards fine-texture detail restoration and ensure stable training of remote sensing images with significant deviations in pixel distribution of local patches. Furthermore, a detail enhancement approach is presented to enhance the details of the restored images. Moreover, we propose an incorporated compressed image quality adaptation mechanism to respond flexibly to unknown compression ratios in remote sensing environments. The performance of MASCAR applied with the proposed method is evaluated on synthetic and real-world remote sensing datasets. Experimental results demonstrate that the proposed method has better quantitative performance and visual quality than existing methods. 

Jeong-Mi Park, Jin-Ku Kang, “A 20-Gb/s PAM-4 Receiver with Dual-mode Threshold Voltage Adaptation using a Time-based LSB Decoder”, JOURNAL OF SEMICONDUCTOR TECHNOLOGY AND SCIENCE, Vol. 23, No. 5, pp. 303~313, October, 2023

Abstract

This paper presents a pulse amplitude modulation-4 (PAM-4) receiver with dual-mode threshold voltage applied to a time-based LSB decoder. The proposed receiver can select the threshold voltage that improves the robustness to sampler voltage variations. It also presents a random data-based threshold voltage adaptation using a single error sampler. Compared to the conventional PAM-4 threshold voltage adaptation that finds four data levels, this method finds only two levels, which reduces the overall power consumption, hardware complexity and adaptation time. The 20-Gb/s PAM-4 serial link was designed in a 65 nm CMOS technology and analyzed with XMODEL and Cadence Design Systems' Spectre. A channel with 15.36 dB loss at the Nyquist frequency was compensated through a two-stage continuous-time linear equalizer (CTLE) and a variable gain amplifier (VGA). The simulation results demonstrate proper convergence of the threshold voltage and a reduced threshold adaptation time compared to the conventional method. The power consumption of the receiver is only 29 mW, which corresponds to a power efficiency of 1.45 pJ/bit.
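The power-efficiency figure quoted in this abstract follows directly from the power and data rate. A minimal sketch of that arithmetic is shown here; the helper name `energy_per_bit_pj` is an illustrative assumption and does not come from the paper.

```python
def energy_per_bit_pj(power_mw: float, data_rate_gbps: float) -> float:
    """Energy per bit in pJ/bit.

    1 mW / 1 Gb/s = 1e-3 J/s / 1e9 bit/s = 1e-12 J/bit = 1 pJ/bit,
    so the ratio of the two numbers is already in pJ/bit.
    """
    if data_rate_gbps <= 0:
        raise ValueError("data rate must be positive")
    return power_mw / data_rate_gbps


# Values taken from the abstract: 29 mW receiver at 20 Gb/s.
receiver_pj_per_bit = energy_per_bit_pj(29.0, 20.0)
print(f"{receiver_pj_per_bit:.2f} pJ/bit")
```

The same helper can be used to cross-check the other power-efficiency figures in this list.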

Yong-Sung Ahn, Jeong-Mi Park, Jin-Ku Kang, Jaehoon Jun, “A ±0.48°C (3σ) Inaccuracy BJT-Based Temperature Sensor With 241 μs Conversion Time for Display Driver IC in 40 nm CMOS”, IEEE ACCESS, Vol.11, pp. 132843~132851, 2023

Abstract

This paper describes a fast BJT-based temperature sensor with ±0.48°C inaccuracy embedded in a display driver integrated circuit (DDIC) for detecting the temperature of a display module. It utilizes the base-emitter voltage difference between two BJT elements in a bandgap reference (BGR) circuit to create a voltage proportional to the absolute temperature, which is then converted to a digital value through an analog-to-digital converter (ADC). The voltage varies proportionally with the temperature change obtained from the temperature sensor and is directly digitized without removing the offset errors from the analog circuit stage. The error is mitigated through a proposed digital correction method. The proposed on-chip temperature sensing circuit for sophisticated DDIC applications shows an inaccuracy of ±0.48°C and a resolution of 0.25°C by applying a digital compensation method including thermal resistance calibration considering an operation mode of a display. The conversion time of the temperature-to-digital converter is only 241 μs. The prototype dissipates only 129.17 μW and achieves a high energy efficiency of 31.1 nJ/conversion.

Sang-ung Shin, Jin-Ku Kang, Yongwoo Kim, “Design and Implementation of an MIPI A-PHY Retransmission Layer for Automotive Applications”, Electronics 2023, Vol. 12, Issue 20, 4243

Abstract

Recently, with the development of automobile technologies such as advanced driver assistance systems (ADASs), the performance and number of cameras and displays required for a vehicle have significantly increased. Therefore, the need for in-vehicle high-speed data transmission has increased, but there is difficulty in handling the required high-speed data transmission in existing in-vehicle networks. The MIPI A-PHY interface for automobiles has been proposed as a new standard to solve this issue. To ensure data transmission in noisy automotive environments, the A-PHY interface contains an added retransmission (RTS) layer within the new physical layer. In this paper, we propose and design in detail the structure of an RTS layer presented in the standard A-PHY interface. The proposed RTS layer was designed to satisfy the RTS specification of the MIPI A-PHY standard and was verified through simulations. Moreover, the A-PHY SerDes environment was configured in an FPGA using a Xilinx KC705 FPGA development board and an FPGA Mezzanine Card (FMC) loopback module, and RTS layer operation was verified through the process of transmitting video data to the A-Packet. The A-PHY interface with the RTS layer designed on the FPGA uses 3924 LUTs, 2019 registers, and 132 block memories and operates at a maximum speed of 200 MHz. In addition, as a result of designing the A-PHY interface as an ASIC implementation using the Synopsys SAED 28 nm process, the number of logic gates is 25 K, the chip area is 0.40 mm², and the maximum operating speed is 200 MHz.

Seongho Kim, Taek-Joon An, Yongwoo Kim, and Jin-Ku Kang, “A Spread Spectrum Clock Generator with Dual-tone Hershey-Kiss Modulation Profile”, JOURNAL OF SEMICONDUCTOR TECHNOLOGY AND SCIENCE, Vol. 23, No. 1, p.39~49, FEBRUARY, 2023

Abstract

This paper presents a spread spectrum clock generator (SSCG) using a dual-tone Hershey-Kiss modulation profile. The modulation controller has two up/down counters and one delta-sigma modulator, and the output of the modulation controller is provided to a multi-modulus divider in a fractional-N PLL. The proposed SSCG is designed to operate in either single-tone modulation mode or dual-tone modulation mode. Once the targeted modulation frequency and spread ratio are given, the design variables for the SSCG can be controlled digitally. The proposed SSCG was designed and fabricated using the 65 nm CMOS process and consumes 8.5 mW while generating a 5 GHz spectrum-spread clock signal with 1.2 V supply voltage. After all design parameters are set for a 0.5% spread ratio using 30 and 33 kHz modulation frequencies, the measured EMI reduction is 24.6 dB while single-tone modulation is applied and 28.7 dB while dual-tone modulation is applied.
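To illustrate how a delta-sigma modulator can drive a multi-modulus divider in a fractional-N PLL, the sketch below uses a first-order accumulator modulator. The modulator order, the integer divide value, and the fraction are all assumptions chosen for illustration; the abstract does not state the values used in the fabricated SSCG.

```python
def first_order_dsm(frac_num: int, frac_den: int, n_cycles: int) -> list[int]:
    """First-order accumulator delta-sigma modulator.

    Emits 1 on each accumulator overflow, so the long-run average
    of the output equals frac_num / frac_den.
    """
    acc = 0
    bits = []
    for _ in range(n_cycles):
        acc += frac_num
        if acc >= frac_den:
            acc -= frac_den
            bits.append(1)
        else:
            bits.append(0)
    return bits


def divider_sequence(n_int: int, frac_num: int, frac_den: int, n_cycles: int) -> list[int]:
    """Divide values (N or N+1) applied to the multi-modulus divider each reference cycle."""
    return [n_int + b for b in first_order_dsm(frac_num, frac_den, n_cycles)]


# Hypothetical example: average division ratio of 50 + 1/4.
seq = divider_sequence(n_int=50, frac_num=1, frac_den=4, n_cycles=8)
average_ratio = sum(seq) / len(seq)
print(seq, average_ratio)
```

In an SSCG, a modulation controller would vary the fractional input over time so that the average ratio, and hence the output frequency, follows the desired modulation profile.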

2022

Hwan-ung Kim, Jin-Ku Kang, “A novel PWAM signaling scheme for high-speed serial interface”, JOURNAL OF SEMICONDUCTOR TECHNOLOGY AND SCIENCE, Vol. 22, Issue 5, pp. 326~339, 2022

Abstract

This paper presents a novel PWAM signaling scheme, which combines dual-mode pulse amplitude modulation-4 (dual-mode PAM-4) with pulse width modulation-2 (PWM-2). This combination differs from that of the conventional PWAM scheme [4], and as a result the minimum pulse width of the proposed PWAM scheme is increased. The proposed PWAM scheme can reduce the power consumption of the transceiver by decreasing the number of differential levels (X) compared with the existing 4-bit/symbol PAM-X schemes (i.e., dual-mode PAM-10 [2] and PAM-16 [13]). The proposed PWAM transceiver was designed in a 180 nm CMOS process with a target data rate of 10 Gb/s. The power consumption of the transmitter and receiver is 134 mW and 95 mW, respectively. The power efficiencies of the transmitter and receiver are 13.4 pJ/bit and 9.5 pJ/bit, respectively.

Jaemyung Kim, Jin-Ku Kang, Yongwoo Kim, “A Low-Cost Fully Integer-Based CNN Accelerator on FPGA for Real-Time Traffic Sign Recognition”, IEEE ACCESS, 2022

Abstract

Traffic sign recognition (TSR) technology allows the vehicle to recognize road signs through a camera and use them for driving. For traffic safety, TSR is one of the core technologies constituting advanced driver assistance systems (ADAS), and it has been studied extensively. The advent of convolutional neural networks (CNNs) has opened up new possibilities in automotive environments, especially for ADAS.

However, deploying a real-time TSR application in resource-constrained ADAS is challenging because most CNNs require high computing resources and memory usage. To address this problem, several works have considered optimization for embedded platforms, but they either used many hardware resources or showed low computation performance. In this paper, we propose a low-cost CNN-based real-time TSR hardware accelerator. Firstly, we extend a novel hardware-friendly quantization method to reduce computational complexity. The quantization method can reconstruct the CNN so that all operations, including the skip connection path of residual blocks, use only integer arithmetic and reduce the computational overhead by replacing the quantization affine mapping process with a shift operation. Secondly, the proposed hardware accelerator applies two parallelization strategies to balance real-time inference and resource consumption.

In addition, we present a simple and effective hardware design scheme that handles the skip connection path of residual blocks. This design scheme can optimize the dataflow of the skip connection path and reduce additional internal memory usage. Experimental results show that the reconstructed fully integer-based CNN only requires 24M integer operations (IOPs) and has a model size of 0.17 MB. Compared with the previous work, the proposed CNN model size was reduced by a factor of 105, and the number of operations was reduced by a factor of 58. In addition, the proposed CNN can achieve a TSR accuracy of 99.07%, which is the highest accuracy among CNN-based TSR works implemented on embedded platforms. The proposed hardware accelerator achieves a computation performance of 960 MOPS and a frame rate of 40 FPS when implemented on a Xilinx ZC706 SoC. Consequently, this work improves computation performance and frame rate by factors of 11.87 and 36.7, respectively, compared to the previous work.

Jihoon Jeon, Jaemyung Kim, Jin-Ku Kang, Sungtae Moon, Yongwoo Kim, “Target Capacity Filter Pruning Method for Optimized Inference Time Based on YOLOv5 in Embedded Systems”, IEEE ACCESS, Vol.10, pp. 70840~70849, 2022

Abstract

Recently, convolutional neural networks (CNNs), which exhibit excellent performance in the field of computer vision, have been in the spotlight. However, as the networks become wider for higher accuracy, the number of parameters and the computational costs increase exponentially. Therefore, it is challenging to use deep learning networks in embedded environments with limited resources, computational performance, and power. Moreover, CNNs consume a great deal of time for inference. To solve this problem, we propose a practical method for filter pruning to provide an optimal network architecture for target capacity and inference acceleration. After revealing the correlation between the inference time and the FLOPs, we proposed a method to generate a network with the desired inference time. Various object detection datasets were used to evaluate the performance of the proposed filter pruning method. The inference time of the pruned network was measured and analyzed using the NVIDIA Jetson Xavier NX platform. As a result of pruning the number of parameters and FLOPs of the YOLOv5 network in the PASCAL VOC dataset by 30%, 40%, and 50%, the mAP decreased by 0.6%, 2.3%, and 2.9%, respectively, while the inference time was improved by 14.3%, 26.4%, and 34.5%, respectively.

Jaepil Bak, Taek-Joon An, Yongwoo Kim, Jin-Ku Kang, “An Overhead-Reduced Key Coding Technique for High-Speed Serial Interface”, IEEE ACCESS, Vol.10, pp. 21187~21192, 2022

Abstract

This paper describes a packet-based overhead-reduced (OR) key coding technique for a high-speed serial interface. The 8B10B code is a de facto standard coding technique in this application, but its bit overhead is 25%. The proposed key coding technique reduces the coding overhead while still providing enough bit transitions to facilitate clock and data recovery in the receiver. After a key pattern is generated from a certain data stream, input data are encoded and framed as packets along with the generated key for transmission. The packets are transmitted and then decoded into the original data in the receiver. Using the proposed coding scheme, 4-, 6-, and 8-bit key coding systems are designed and compared. When a 6-bit key coding encoder/decoder is tested, a packet is composed of a 6-bit OR key header followed by 30 encoded sub-packets, each of which carries 6 bits of data. In the 6-bit case, the bit overhead is only 3.33% and the maximum continuous run length is 10 bits. To control the running disparity for the AC coupling interface, logic for selecting the optimal key is implemented to keep the running disparity as low as possible. The running disparity of the encoded data with the 6-bit key code is controlled within +/−12.
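The overhead figures quoted in this abstract can be reproduced with a short calculation, taking overhead as extra bits divided by data bits (the definition that gives 25% for 8B10B). The function name below is an illustrative assumption, not something defined in the paper.

```python
def bit_overhead(extra_bits: int, data_bits: int) -> float:
    """Coding overhead as the ratio of added bits to payload data bits."""
    return extra_bits / data_bits


# 8B10B: 2 extra bits for every 8 data bits.
overhead_8b10b = bit_overhead(extra_bits=2, data_bits=8)

# 6-bit key coding: one 6-bit key header per 30 sub-packets of 6 data bits each.
overhead_6bit_key = bit_overhead(extra_bits=6, data_bits=30 * 6)

print(f"8B10B: {overhead_8b10b:.2%}, 6-bit key: {overhead_6bit_key:.2%}")
```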

2021

Jaemyung Kim, Jin-Ku Kang, Yongwoo Kim, “A Resource Efficient Integer-Arithmetic-Only FPGA-Based CNN Accelerator for Real-Time Facial Emotion Recognition”, IEEE ACCESS, Vol.9, pp. 104367~104381, 2021

Abstract

Recently, much research has been conducted on facial emotion recognition using convolutional neural networks (CNNs), which show excellent performance in computer vision. To obtain a high classification accuracy, a CNN architecture with many parameters and high computational complexity is required. However, this is not suitable for embedded systems where hardware resources are limited. In this paper, we present a lightweight CNN architecture optimized for embedded systems. The proposed CNN architecture has a small memory footprint and low computational complexity. Furthermore, a novel hardware-friendly quantization method that uses only integer arithmetic is proposed. The proposed hardware-friendly quantization method maps the scale factors to power-of-two terms and replaces multiplication and division operations using scale factors with shift operations. To improve the generalization and classification performance of the CNN, we create the FERPlus-A dataset. This is a new training dataset created using a variety of image processing algorithms. After training with FERPlus-A, quantization is performed. The size of the quantized CNN parameters is about 0.39 MB, and the number of operations is about 28 M integer operations (IOPs). When the quantized CNN that uses only integer arithmetic is evaluated on the FERPlus test dataset, the classification accuracy is approximately 86.58%, which is higher than that of other lightweight CNNs in prior studies. The proposed CNN architecture that uses only integer arithmetic is implemented on the Xilinx ZC706 SoC platform for real-time facial emotion recognition by applying parallelism strategies and efficient data caching strategies. The FPGA-based CNN accelerator implemented for real-time facial emotion recognition achieves about 10 frames per second (FPS) at 250 MHz and consumes 2.3 W.
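The power-of-two scale mapping described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions: the rounding mode (round-half-up via an added offset), the signed 8-bit output range, and both function names are choices made here, not details given in the abstract.

```python
import math


def nearest_power_of_two_shift(scale: float) -> int:
    """Return n such that 2**-n is the power of two closest to scale (in the log domain)."""
    if scale <= 0:
        raise ValueError("scale must be positive")
    return round(-math.log2(scale))


def requantize(acc: int, shift: int, bits: int = 8) -> int:
    """Rescale an integer accumulator by 2**-shift using only shifts, then clamp."""
    if shift > 0:
        acc = (acc + (1 << (shift - 1))) >> shift  # rounding right shift
    elif shift < 0:
        acc <<= -shift
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return max(lo, min(hi, acc))


# Hypothetical scale factor 0.0078 is approximated by 2**-7 = 0.0078125.
shift = nearest_power_of_two_shift(0.0078)
quantized = requantize(1000, shift)  # close to round(1000 * 0.0078) = 8
print(shift, quantized)
```

Because the scale is a power of two, the rescaling step needs no multiplier or divider, which is the property that makes this kind of quantization attractive for FPGA implementation.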

Do-Hyeon Kwon, Hyung-Wook, Kyeong-Min Ko, “Adaptive Non-speculative DFE with Extended Time Constraint for PAM-4 Receiver”, JOURNAL OF SEMICONDUCTOR TECHNOLOGY AND SCIENCE, Vol.21, Issue 2, pp. 166~173, 2021

Abstract

This paper presents a novel approach to solve the time constraint issue of a decision feedback equalizer (DFE) with PAM-4 signaling. By using a track-and-hold operation to sample signals of the same level at two points, the time constraint of 1 UI in a direct DFE can be extended to 1.5 UI. The FIR tap employs an LVDS structure to maintain the common-mode voltage, and the SS-LMS algorithm is used to obtain the optimal tap weight. The first post-cursor ISI cancellation is done by the LVDS tap, and a sufficient settling time is provided by the proposed DFE. The proposed structure may eliminate the need for a loop-unrolled speculative DFE for PAM-4, which leads to less hardware for PAM-4 DFE implementation. A PAM-4 serial link using the proposed DFE was designed in a 65 nm CMOS technology and analyzed. Channels with 11.9 dB and 13.8 dB losses were compensated through a CTLE and the proposed 1-tap DFE, and simulation results demonstrate that the time constraint can be extended without deterioration of the eye opening.
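The sign-sign LMS (SS-LMS) tap adaptation mentioned in this abstract can be sketched in a few lines. For brevity the sketch uses binary (NRZ) symbols rather than PAM-4, a single post-cursor ISI term `h1`, and a step size `mu`; all of these, and the function names, are assumptions for illustration and are not taken from the paper.

```python
import random


def sign(x: float) -> int:
    """Return +1, -1, or 0 according to the sign of x."""
    return (x > 0) - (x < 0)


def adapt_dfe_tap(h1: float, mu: float, n_symbols: int, seed: int = 0) -> float:
    """Adapt a 1-tap DFE weight with the sign-sign LMS update.

    Channel model (assumed): x[k] = d[k] + h1 * d[k-1].
    Update rule: w += mu * sign(error) * sign(previous decision).
    """
    rng = random.Random(seed)
    w = 0.0
    prev = 1  # assumed initial previous decision
    for _ in range(n_symbols):
        d = rng.choice((-1, 1))
        x = d + h1 * prev           # received sample with post-cursor ISI
        y = x - w * prev            # DFE subtracts the estimated ISI
        decision = sign(y) or 1     # slicer; treat an exact zero as +1
        error = y - decision        # distance from the ideal level
        w += mu * sign(error) * sign(prev)
        prev = decision
    return w


tap = adapt_dfe_tap(h1=0.3, mu=0.01, n_symbols=2000)
print(f"adapted tap weight: {tap:.3f}")
```

Only the signs of the error and the data are used, so the update needs no multiplier; the weight settles near `h1` and then dithers by about one step.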

Nguyen Tho, Joon-Ho Lee, Taek-Joon An, Jin-Ku Kang, “A 0.32-2.7Gb/s Reference-less Continuous-rate Clock and Data Recovery Circuit with Unrestricted and Fast Frequency Acquisition”, IEEE Transactions on Circuits and Systems II: Express Briefs, 2021

Abstract

This brief presents the design of a fast-frequency-locking 320 Mb/s to 2.7 Gb/s continuous-rate reference-less clock and data recovery (CDR) circuit. Coarse and fine frequency acquisition processes are performed simultaneously to achieve an unrestricted frequency acquisition range and a fast frequency acquisition time. The CDR is implemented in a 180 nm CMOS process and consumes 62 mW of power, including I/O buffers, at 2.7 Gb/s with a 1.8 V supply. The CDR has a maximum locking time of 15.2 µs when the data rate is switched from 2.7 Gb/s to 320 Mb/s. The CDR circuit has shown 59 ps and 75.4 ps peak-to-peak jitter in recovered clock and data, respectively, with 2.7 Gb/s input data.

2020

Kyung-Sub Son, Taek-Joon An, Yong-Hwan Moon, Jin-Ku Kang, "A 0.42-3.45 Gb/s Referenceless Clock and Data Recovery Circuit with Counter-based Unrestricted Frequency Acquisition", IEEE Transactions on Circuits and Systems II: Express Briefs, 2020

Abstract

A 0.42 to 3.45 Gb/s counter-based referenceless clock and data recovery (CDR) circuit that has an unrestricted and continuous-rate frequency acquisition capability is presented. The proposed frequency detector first selects a frequency driving direction of the recovered clock using counters, and frequency locking is then achieved with the frequency driving direction plus phase information. After that, phase locking is done with the phase-locked loop. The CDR circuit occupies an area of 0.442 mm² in a 180-nm CMOS process. A locking time of less than 17.9 ms has been achieved when switching from the highest data rate of 3.45 Gb/s to the lowest rate of 0.42 Gb/s, and vice versa. The CDR circuit has shown 4.33 ps rms jitter in recovered data for a 3.45 Gb/s PRBS31 pattern. The power consumption is 20.3 mW, including the I/O buffer, at 3.45 Gb/s with a 1.8 V supply.

2019

Seong-Mun An, Kyung-Sub Son, Taek-Joon An, Jin-Ku Kang, "Design of a third-order delta-sigma TDC with error-feedback structure", IEICE Electronics Express, Vol.16, Issue 3, Feb. 10, 2019

Abstract

A 1-1-1 MASH delta-sigma TDC with a simpler structure was designed using an error feedback structure. The proposed 1-1-1 MASH delta-sigma TDC modulator has a single subtractor without any explicit integrator. Each modulator stage is composed of a subtractor, a digital-to-time converter (DTC), and a quantizer. The subtractor generates the timing difference between the input signal interval and the feedback signal interval. The DTC adds or subtracts fixed delays depending on the subtractor output and the quantizer values. The proposed circuit was designed using a 180 nm CMOS process. The simulation results show a resolution of 2.07 ps and a valid bit count of 11.5 bits at a sampling frequency of 50 MHz. The area is 0.14 mm², and the power consumption is 1.34 mW.

Yong-Hwan Moon, Kyung-Sub Son, Jin-Ku Kang, "A 2.41-pJ/bit 5.4-Gb/s Dual-Loop Reference-Less CDR With Fully Digital Quarter-Rate Linear Phase Detector for Embedded DisplayPort", IEEE Transactions on Circuits and Systems I: Regular Papers, Aug. 2019

Abstract

This paper describes a low-power reference-less 5.4-Gb/s clock and data recovery (CDR) circuit with a fully digital quarter-rate linear phase detector (QLPD) having an extended pulse width output. By using a fully digital circuit and merging the XOR function with the charge pump, the power efficiency and linearity of the phase detector are improved. The proposed QLPD responds correctly up to 0.75 UI of phase difference at 5.4 Gb/s. The CDR was designed to conform to the Embedded DisplayPort (eDP) standard of the Video Electronics Standards Association (VESA). The proposed CDR circuit has been fabricated using a 40-nm CMOS technology. The jitter tolerance (JTOL) margin was measured at 0.75 UI, which is 20% higher than the eDP specification of 0.624 UI at the 20-MHz jitter frequency. The BER was measured as less than 10⁻¹² with a 2⁷−1 pseudo-random bit sequence (PRBS) pattern. The CDR consumes 12.99 mW, including a bandgap reference circuit, achieving a power efficiency of 2.41 pJ/bit at 5.4 Gb/s. The area of the proposed CDR is 0.13 mm².

2017

Byeonggyu Park, Tae-Gwon Yun, Kyongsu Lee and Jin-Ku Kang, "An Inductively Coupled Power and Data Link with Self-referenced ASK Demodulator and Wide-range LDO for Bio-implantable Devices", JSTS, Vol. 17, Issue 1, pp.120-128, Feb. 2017

Abstract

This paper describes a neural stimulation system that employs an inductive coupling link to transfer power and data wirelessly. For reliable data and power delivery, a self-referenced amplitude-shift keying (ASK) demodulator and a wide-range voltage regulator are proposed and implemented in the stimulator system. The prototype, fabricated in a 0.35 μm BCD process, successfully transferred 1.2 kbps data bi-directionally while supplying 4.5 mW of power to the internal MCU and stimulation block.

Nguyen Huu Tho, Kyung-Sub Son, and Jin-Ku Kang, "A 200Mb/s ~ 3.2Gb/s referenceless clock and data recovery circuit with bidirectional frequency detector", IEICE ELECTRONICS EXPRESS, Vol. 14, No.8, pp. 20161279, Apr. 6, 2017

Abstract

This paper presents a 200-Mb/s to 3.2-Gb/s half-rate reference-less clock and data recovery (CDR) circuit in a 180 nm CMOS process. A bidirectional frequency detector (FD) is proposed to eliminate harmonic locking and reduce the frequency acquisition time. A frequency band selector for the wide-range voltage-controlled oscillator (VCO) is also presented to select the exact frequency band of the VCO. The simulation shows that the CDR achieves 11-ps peak-to-peak jitter at 3 Gb/s and a frequency acquisition time of 11.8 μs.

2016

Bum-Hee Choi, Kyung-Sub Son, Taek-Joon An, Jin-Ku Kang, "A burst-mode clock and data recovery circuit with two symmetric quadrature VCO’s", IEICE Electronics Express, Vol.13, No.24, pp. 20161086, Dec. 8, 2016

Abstract

This paper presents a burst-mode clock and data recovery (CDR) circuit based on two symmetric quadrature phase VCO’s. A reduced loop locking time of less than 5 bits was achieved without any of the extra delay circuits that are added in conventional schemes for timing control. The proposed circuit is designed in a 350 nm CMOS process, and its feasibility has been demonstrated by successful operation at 1.25 Gb/s.

2015

Kyung-Sub Son, Jin-Ku Kang, "On-chip Jitter Tolerance Measurement Technique with Independent Jitter Frequency Modulation from VCO in CDR", IEICE Electronics Express, Vol.12(2015) No.15 pp. 20150570, Aug. 10, 2015

Abstract

We present an on-chip measurement technique to characterize the jitter tolerance of a clock and data recovery (CDR) circuit. The proposed jitter modulation scheme incorporates a modulated charge pump and a pulse generation circuit to apply a periodic triangular voltage directly to the control voltage of the CDR circuit. This jitter frequency generation scheme, which is independent of the VCO in the CDR, allows wide and linear control of jitter. The modulated jitter amplitude range was 0.05 - 2 UIpp at 10 MHz, and the jitter frequency range was 100 kHz - 20 MHz. The circuit was fabricated in 65 nm CMOS, and the jitter tolerance was successfully measured at 5 Gbps with a 2⁷−1 PRBS pattern. The accuracy was within 10% error of the external BER equipment measurement result. The whole CDR circuit consumes 29.9 mW at a supply voltage of 1.2 V.

Inseok Kong, Kyung-Sub Son, Kyongsu Lee, Jin-Ku Kang, "Precise time-difference repetition for TDC with delay mismatch cancelling scheme", IEICE Electronics Express, Vol.12(2015) No.21 pp. 20150752, Nov. 2015

Abstract

This paper presents a precise time-difference repetition technique to enhance the timing accuracy in repetition-based time-to-digital converters (TDCs). In the proposed scheme, any delay mismatches during the timing difference repetition process can be removed. The proposed circuit could be used for multi-step TDCs, delta-sigma TDCs, and SAR-type TDCs. The proposed scheme was designed and simulated with a 65-nm CMOS process. The proposed circuit shows a delay variation of about 100 fs in the presence of device mismatches, which is much less than that of conventional approaches. When applied to a 2-step TDC, the input time range and the conversion rate are 480 ps and 100 Msps, respectively.

2014

Hyun-Bae Jin, Gi-Yeol Bae, Kwang-Hee Yoon, Tae-Ho Kim, Ji-Hoon Jang, Byung-Cheol Song, Jin-Ku Kang, "A Link Layer Design for DisplayPort Interface with State Machine Based Packet Processing", J Sign Process Syst, January 2014

Abstract

This paper presents a link layer design for the DisplayPort interface with state machine based packet processing. The DisplayPort link layer provides isochronous video/audio transport service, link service, and device service. The merged video, audio main link, and AUX channel controller are implemented with 7,648 ALUTs (look-up tables), 6,020 registers, and 451,425 block memory bits when synthesized for an FPGA board, and the design operates at 203.32 MHz.

Taek-Joon Ahn, Sang-Soon Im, Yong-Sung Ahn, Jin-Ku Kang, "A low jitter clock and data recovery with a single edge sensing Bang-Bang PD", Vol.11, No.7, pp.20140088, March 2014

Abstract

This letter describes a low jitter clock and data recovery (CDR) circuit with a modified bang-bang phase detector (BBPD). The proposed PD senses the phase relationship using a single edge of the input data to reduce ripples in the VCO control voltage. A 2.5 Gbps CDR circuit with the proposed BBPD has been designed in 0.13 μm CMOS technology and compared with a CDR using a conventional BBPD. Measured results reveal that the proposed CDR shows a peak-to-peak jitter of 17 ps on a 2⁵−1 PRBS input pattern, compared to 26 ps for the CDR with a conventional BBPD. The proposed CDR is best suited to 8B10B encoded input data. Power consumption can also be reduced by about 3 mW with the proposed BBPD.
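For readers unfamiliar with bang-bang phase detection, the sketch below shows the decision logic of a conventional (Alexander-type) BBPD from three consecutive samples. The `rising_only` flag is one possible reading of "single edge sensing" and is purely an assumption made here; the abstract does not describe the actual circuit at this level of detail.

```python
def bang_bang_pd(prev_data: int, edge: int, curr_data: int,
                 rising_only: bool = False) -> int:
    """Early/late decision from data, edge, data samples.

    Returns +1 if the clock is early, -1 if it is late, and 0 when
    there is no data transition (no phase information).
    """
    if prev_data == curr_data:
        return 0  # no transition between the two data samples
    if rising_only and not (prev_data == 0 and curr_data == 1):
        return 0  # assumed variant: ignore falling transitions
    # The edge sample still equals the old bit when the clock samples too early.
    return 1 if edge == prev_data else -1


# Hypothetical sample triplets (previous data, edge sample, current data).
samples = [(0, 0, 1), (1, 1, 0), (0, 1, 1), (1, 1, 1)]
votes_all_edges = [bang_bang_pd(*s) for s in samples]
votes_rising_only = [bang_bang_pd(*s, rising_only=True) for s in samples]
print(votes_all_edges, votes_rising_only)
```

Restricting the detector to one edge polarity halves the number of updates, which is consistent in spirit with the reduced control-voltage ripple reported in the abstract, though the exact mechanism in the paper may differ.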

Hak Gu Kim, Jin-Ku Kang, Byung Cheol Song, "Automatic SfM-Based 2D-to-3D Conversion for Multi-Object Scenes", IEICE Transactions on Fundamentals of Electronics Communications and Computer Sciences E97A(5), 1159-1161, May 2014

Abstract

This letter presents an automatic 2D-to-3D conversion method using a structure from motion (SfM) process for multi-object scenes. The foreground and background regions may have different depth values in an image. First, we detect the foreground objects and the background by using a depth histogram. Then, the proposed method creates the virtual image by projecting each region with its computed projective matrix. Experimental results compared to previous research show that the proposed method provides realistic stereoscopic images.

Yong-Hwan Moon, In-Seok Kong, Young-Soo Ryu, Jin-Ku Kang, "A 2.2-mW 20–135-MHz False-Lock-Free DLL for Display Interface in 0.15-μm CMOS", IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II-EXPRESS BRIEFS, Vol.61, No.8, pp.554-558, August. 2014

Abstract

This brief describes a wide-range operating false-lock-free delay-locked loop (DLL) for a low-voltage differential signaling (LVDS) display interface. A false-lock detector circuit and a self-reset circuit internally prevent any possible false locks in a robust way. The proposed DLL immediately removes stuck false locks caused by an improper phase detector state. The DLL circuit does not require the duty ratio of the input clock to be 50%. The proposed circuit has been fabricated using the 0.15-μm 1P-6M mixed-mode CMOS technology. The proposed DLL is implemented for an LVDS display interface and supports operating from 20 to 135 MHz without any error. It consumes 2.2 mW under a 130-MHz operation.

Kyongsu Lee, Youngjin Kim, Kyungsub Son, Sangmin Lee, Jin-Ku Kang, "A 1.1mW/Gb/s 10Gbps Half-rate Clock-Embedded Transceiver for High-Speed Links in 65nm CMOS", IEICE ELECTRONICS EXPRESS, Vol.11, No.17, pp.20140671, August. 2014

Abstract

This paper presents a low-power half-rate clock-embedded transceiver architecture that employs a quarter-rate multiplexing/de-multiplexing circuit technique, a low-Vdd current-mode driver topology embedding a half-rate clock, and a multi-functional injection-locked oscillator (ILRO) for a digital clock and data recovery (CDR) design. The whole transceiver circuit was simulated in a 65 nm CMOS process, and its feasibility was demonstrated by successful operation at 10 Gb/s across a band-limited channel. The achievable power efficiencies of the receiver and transceiver were 0.7 mW/Gb/s and 1.1 mW/Gb/s, respectively.

Taek-Joon Ahn, Kyung-Sub Son, Yong-Sung Ahn, Jin-Ku Kang, "A low-power CDR using dynamic CML latches and V/I converter merged with XOR for half-rate linear phase detection", IEICE ELECTRONICS EXPRESS, Vol.11, No.17, pp.20140657, August. 2014

Abstract

A low-power clock and data recovery (CDR) circuit using dynamic current-mode logic (CML) latches followed by a V/I (voltage-to-current) converter merged with XOR for half-rate linear phase detection is described. The loop latency of the CDR is also reduced with the proposed scheme. Thus, a faster locking time and reduced jitter can be achieved compared to a CDR using conventional static CML latches and XOR gates in a linear phase detector (PD) followed by a V/I converter. A CDR circuit with the proposed circuit topology has been designed and fabricated in 0.18-µm CMOS technology and has shown 5-Gb/s data recovery with a 14.9 mW power saving compared to the conventional CDR structure under a 1.8-V supply.

Yong-Sung Ahn, Taek-Joon Ahn, Kyongsu Lee, Jin-Ku Kang, "Avoiding noise frequency interference with binary phase pulse driving and CDS for capacitive TSP controller", IEICE ELECTRONICS EXPRESS, Vol.11, No.21, pp.20140837, October 2014

Abstract

Noise frequency interference avoidance (NFIA) using binary phase pulse driving and a correlated double sampling (CDS) circuit technique is applied to avoid interference from noise signals in a capacitive touch screen panel (TSP). The proposed analog front-end circuitry is composed of a charge transfer circuit and a two-stage cascaded CDS circuit. The purpose of the two-stage cascaded CDS circuit is to remove harmonic noise frequencies interfering with touched data in the TSP. If the system detects that noise frequencies interfere with the touched data, the proposed NFIA technique can be applied. The proposed methodology is implemented in a real TSP system using a 0.13-µm CMOS process, and the measured SNR is 46 dB with a scan rate of 120 Hz on a 21.5-inch TSP.

2013

Seung-Wook Oh, Hyung-Min Park, Yong-Hwan Moon, Jin-Ku Kang, "A Spread Spectrum Clock Generator for DisplayPort 1.2 with a Hershey-Kiss Modulation Profile", Journal of Semiconductor Technology and Science, Vol.13, No.4, August. 2013

Abstract

This paper describes a spread spectrum clock generator (SSCG) circuit for the DisplayPort 1.2 standard. A Hershey-Kiss modulation profile is generated by dual sigma-delta modulators. The structure generates various modulation slopes to shape a non-linear modulation profile. The proposed SSCG for DisplayPort 1.2 generates clock signals with 5000 ppm down spreading with a Hershey-Kiss modulation profile at three different clock frequencies, 540 MHz, 270 MHz and 162 MHz. The measured peak power reduction is about 15.6 dB at 540 MHz with the chip fabricated using a 0.13 µm CMOS technology.
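As a rough illustration of what 5000 ppm down spreading means for the three clock frequencies above, the sketch below computes the frequency window each clock sweeps through. It assumes the usual definition of down spreading (the clock only deviates below its nominal frequency); the function name `down_spread_range` is hypothetical and the code does not model the Hershey-Kiss profile shape itself. Under that assumption, the 540 MHz clock sweeps between 537.3 MHz and 540 MHz.

```python
def down_spread_range(f_nominal_hz: float, spread_ppm: float) -> tuple[float, float]:
    """Return (f_min, f_max) for a down-spread clock.

    Assumes down spreading: the modulated clock never exceeds the nominal
    frequency, and its lowest frequency is nominal * (1 - ppm / 1e6).
    """
    f_min = f_nominal_hz * (1.0 - spread_ppm / 1e6)
    return f_min, f_nominal_hz


if __name__ == "__main__":
    # The three clock frequencies named in the abstract, at 5000 ppm.
    for f_mhz in (540, 270, 162):
        f_min, f_max = down_spread_range(f_mhz * 1e6, 5000)
        print(f"{f_mhz} MHz nominal: {f_min / 1e6:.2f} MHz to {f_max / 1e6:.2f} MHz")
```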

Benjamin P. Wilkerson, Joon-Hyup Seo, Jin-Cheol Seo, Jin-Ku Kang, "An ultra-low power BPSK demodulator with dual band filtering for implantable biomedical devices", IEICE ELECTRONICS EXPRESS, Vol. 10, No. 7, April. 2013

Abstract

In this letter, a low-power non-coherent BPSK demodulator applicable to implantable biomedical devices is described. The proposed demodulator adopts dual band filtering to recover the timing and data in a non-coherent way. The circuit has been fabricated in a 0.18µm CMOS technology, and the measured power consumption of the proposed demodulator is 82µW with a 2MHz carrier frequency, achieving a 1Mbps data rate.

2012

Tae-Ho Kim, Jin-Cheol Seo, Yong-Sung Ahn, Jin-Ku Kang, "A 10Gb/s Adaptive Equalizer with ISI Level Measurement", IEICE ELECTRONICS EXPRESS, Vol. 9, No.17, Sept. 2012

Abstract

In this letter, an adaptive equalizer with an inter-symbol interference (ISI) level measurement using a periodic training pattern is presented. The compensation level of the equalizer is determined by measuring ISI level and fed back to the feed-forward equalizer. The proposed algorithm is verified with a 100-cm flexible flat cable (FFC) at 10Gb/s data rate using a 90nm CMOS process technology. Compensation range is from 15 dB to 33 dB with a tuning range of 18 dB, and the current consumption is 7.6mA at 1 V power supply.

Tae Hwan Lee, Jin-Ku Kang, Byung Cheol Song, "Video denoising using overlapped motion compensation and advanced collaborative filtering", JOURNAL OF ELECTRONIC IMAGING, Vol. 21, No.2, April. 2012

Abstract

We present spatiotemporal denoising based on overlapped motion compensation and advanced collaborative filtering. First, noise-robust overlapped motion compensation is performed on a block basis for temporal grouping. Next, the K-nearest neighbors of each block are grouped in a 3D array, and the 3D array is transformed. Then, adaptive soft thresholding is performed in the 3D transform domain. In addition, a modified weighting strategy for aggregation is applied for better visual quality. Simulation results show that the proposed algorithm improves the peak signal-to-noise ratio performance by about 2 dB in comparison with the state-of-the-art technique while providing much better subjective visual quality.

Seung-Wuk Oh, Sang-Ho Kim, Sang-Soon Im, Yong-Sung Ahn, Jin-Ku Kang, "A Clock Regenerator using Two 2nd Order Sigma-Delta Modulators for Wide Range of Dividing Ratio", Journal of Semiconductor Technology and Science, Vol.12, No.1, Mar. 2012.

Abstract

This paper presents a clock regenerator using two 2nd order sigma-delta modulators for a wide range of dividing ratios as defined in the HDMI standard. The proposed circuit adopts a fractional-N frequency synthesis architecture for PLL-based clock regeneration. By converting the integer and decimal parts of the N and CTS values in HDMI format and processing them separately in two different sigma-delta modulators, the proposed circuit covers the very wide range of dividing ratios defined in the HDMI standard. The circuit is fabricated using 0.18 µm CMOS and shows 13 mW power consumption with an on-chip loop filter implementation.

Yong-Hwan Moon, Sang-Ho Kim, Tae-Ho Kim, Hyung-Min Park, Jin-Ku Kang, "A 1.7 Gbps DLL-based Clock Data Recovery for a Serial Display Interface in 0.35 μm CMOS", ETRI Journal, Vol.34, No.1, pp.35-43, Feb. 2012.

Abstract

This paper presents a delay-locked loop (DLL)-based clock and data recovery (CDR) circuit design with a nB(n+2)B data formatting scheme for a high-speed serial display interface. The nB(n+2)B data is formatted by inserting a ‘01’ clock information pattern in every N-bit data. The proposed CDR recovers clock and data in 1:10 demultiplexed form without an external reference clock.

To validate the feasibility of the scheme, a 1.7 Gbps CDR based on the proposed scheme is designed, simulated, and fabricated. Input data patterns were formatted as 10B12B for a high-performance display interface. The proposed CDR consumes approximately 8 mA under a 3.3 V power supply using a 0.35 μm CMOS process, and the measured peak-to-peak jitter of the recovered clock is 44 ps.

2011

Tae-Ho Kim, Yong-Hwan Moon, Jin-Ku Kang, "A 4 Gb/s Adaptive FFE/DFE Receiver with a Data-Dependent Jitter Measurement", IEICE transactions on electronics, Vol.E94-C, No.11, pp.171-174, Nov. 2011.

Abstract

This paper presents an adaptive FFE/DFE receiver with an algorithm that measures the data-dependent jitter. The proposed adaptive algorithm determines the compensation level by measuring the input data-dependent jitter. The adaptive algorithm is combined with a clock and data recovery phase detector. The receiver is fabricated in 0.13 µm CMOS technology, and the compensation range of equalization is up to 26 dB at 2 GHz. The test chip is verified for a 40 inch FR4 trace and a 53 cm flexible printed circuit channel. The receiver occupies an area of 440 µm × 520 µm and has a power dissipation of 49 mW (excluding the I/O buffers) from a 1.2 V supply.

Jae-Wook Yoo, Tae-Ho Kim, Dong-Kyun Kim, Jin-Ku Kang, "A CMOS 5.4/3.24Gbps Dual-Rate CDR with Enhanced Quarter-rate Linear Phase Detector", ETRI Journal, Vol.33, No.5, pp.643-649, Oct. 2011.

Abstract

This paper presents a clock and data recovery circuit that supports dual data rates of 5.4 Gbps and 3.24 Gbps for a DisplayPort v1.2 sink device. A quarter-rate linear phase detector (PD) is used in order to mitigate high-speed circuit design effort. The proposed linear PD results in better jitter performance by increasing the up and down pulse widths of the PD, and it removes the dead-zone problem of the charge pump circuit. A voltage-controlled oscillator is designed with a ‘Mode’ switching control for frequency selection. The measured RMS jitter of the recovered clock signal is 2.92 ps, and the peak-to-peak jitter is 24.89 ps under a 2^31-1 bit-long pseudo-random bit sequence at the bitrate of 5.4 Gbps. The chip area is 1.0 mm×1.3 mm, and the power consumption is 117 mW from a 1.8 V supply using a 0.18 μm CMOS process.

H. K. Jeon, Y. H. Moon, J. K. Kang, L. S. Kim, "An Intra-Panel Interface With Clock-Embedded Differential Signaling for TFT-LCD Systems", JOURNAL OF DISPLAY TECHNOLOGY, Vol. PP, Issue. 99, pp. 1-10, Jul. 2011.

Abstract

In this paper, an intra-panel interface with clock-embedded differential signaling for TFT-LCD systems is proposed. The proposed interface reduces the number of signal lines between the timing controller and the column drivers in a TFT-LCD panel by adopting the embedded clock scheme. The protocol of the proposed interface provides a delay-locked loop (DLL)-based clock recovery scheme for the receiver. The timing controller and the column driver integrated with the proposed interface are fabricated in 0.13-µm CMOS process technology and 0.18-µm high voltage CMOS process technology, respectively. The proposed interface is verified on a 47-inch Full High-Definition (FHD) (1920RGB × 1080) TFT-LCD panel with 8-bit RGB and 120-Hz driving technology. The maximum data rate per differential pair was measured to be as high as 2.0 Gb/s in a wafer test.

2010

Hyung-Min Park, Hyun-Bae Jin, Jin-Ku Kang, "SSCG with Hershey-Kiss modulation profile using Dual Sigma-Delta modulators", IEICE ELECTRONICS EXPRESS, Vol. 7, No. 18, pp. 1349-1353, Sept. 2010.

Abstract

This letter describes a spread spectrum clock generator (SSCG) circuit with the Hershey-Kiss modulation profile using two stacked sigma-delta modulators. The proposed Hershey-Kiss profile modulator generates various slopes to achieve a non-linear modulation profile. Since the modulators are implemented as digital blocks, they can be modified for other applications. Simulation results show a peak power reduction level of 10.2dBm with 5000ppm down spreading at 340MHz operation using 0.13µm CMOS.

Tae-Ho Kim, Sang-Ho Kim, Jin-Ku Kang, "A DLL-based Clock Data Recovery with a modified input format", IEICE ELECTRONICS EXPRESS, Vol. 7, No. 8, pp. 539-545, Apr. 2010.

Abstract

This letter presents a DLL (Delay Locked Loop)-based CDR (Clock Data Recovery) design with a modified input data format. The proposed CDR recovers the clock and tracks the phase using the proposed training and real data patterns. The proposed input data formatting is done by inserting the ‘01’ pattern in every N-bit data. To prove the feasibility, a 2.4Gbps CDR is designed and simulated. The training and real data patterns were formatted as 10B12B for a high-performance display interface. The CDR achieves less jitter due to the DLL structure. The proposed CDR with the 10B12B format consumes approximately 8mA under a 3.3V power supply using a 0.25µm CMOS process.
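The data formatting described above can be sketched in a few lines. This is only an illustration under stated assumptions: the abstract does not say where in each word the ‘01’ pattern is placed, so the sketch assumes it is prepended to every N-bit word, and the function names `format_nb_n2b` and `strip_nb_n2b` are hypothetical. With N = 10, each 10-bit word becomes a 12-bit frame (10B12B).

```python
def format_nb_n2b(bits: str, n: int = 10, marker: str = "01") -> str:
    """Insert the clock-information marker before every n-bit word.

    Assumption: the marker is prepended; the abstract only states that
    '01' is inserted in every N-bit data.
    """
    if len(bits) % n != 0:
        raise ValueError("input length must be a multiple of n")
    return "".join(marker + bits[i:i + n] for i in range(0, len(bits), n))


def strip_nb_n2b(stream: str, n: int = 10, marker: str = "01") -> str:
    """Remove the marker from every (n + 2)-bit frame and return the payload."""
    frame_len = n + len(marker)
    if len(stream) % frame_len != 0:
        raise ValueError("stream length must be a multiple of the frame length")
    payload = []
    for i in range(0, len(stream), frame_len):
        frame = stream[i:i + frame_len]
        if not frame.startswith(marker):
            raise ValueError("frame does not start with the expected marker")
        payload.append(frame[len(marker):])
    return "".join(payload)


if __name__ == "__main__":
    data = "1111111111" + "0000000000"
    framed = format_nb_n2b(data)
    print(framed)
    print(strip_nb_n2b(framed) == data)
```

Because every frame contains a guaranteed 0-to-1 transition at a known position, a receiver has a periodic edge to lock to regardless of the payload, which is what allows clock recovery without a separate reference.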

2008

Yong-Woo Kim, Beomseok Shin, Jin-ku Kang, "High-speed 8B/10B encoder design using a simplified coding table", IEICE ELECTRONICS EXPRESS, Vol. 5, No. 16, pp. 581-585, Aug. 2008.

Abstract

This letter presents a high-speed 8B/10B encoder design using a simplified coding table. The proposed encoder also includes a modified disparity control block. Logic simulation and synthesis have been done for performance verification. After synthesis with a 0.18µm CMOS process, the proposed design shows an operating frequency of 343MHz with no latency. The synthesized chip area is 1886µm² with 189 logic gates. The proposed 8B/10B encoder shows an overall performance improvement compared to previous approaches.

2006

Hyun-Shik Lee, Shinmo An, Young Kim, Do-Kyoon Kim, Jin-Ku Kang, Young-Wan Choi, Seung Gol Lee, Beom Hoan O, El-Hang Lee, "Fabrication of a 2.5 Gbps x 4 channel optical micro-module for O-PCB application", MICROELECTRONIC ENGINEERING, 83, 2006

Abstract

We report on the fabrication of a polymer-based 2.5 Gbps × 4 channel optical interconnecting micro-module for optical printed circuit board (O-PCB) application. An optical waveguide array is used for optical transmission from a vertical-cavity surface-emitting laser (VCSEL) array to a photodiode (PD) array, and the built-in 45° waveguide mirrors are used for vertical coupling. The optical waveguide array and the 45° mirrors are fabricated by a UV imprint process in one step. We fabricate microlensed VCSELs by a micro-inkjetting method, which reduced the radiation angle of the VCSEL from 18° to 15° for better light coupling. We use a solder ball array and a pin array for alignment between the O-PCB and the electrical sub-boards, with alignment mismatch below 10 µm in the x, y and z axes. The fabricated optical interconnection module transmits data at the rate of 2.5 Gbps per channel.

2004

Jin-Ho Choi, Jin-Ku Kang, "All digital DLL with three phase tuning stages", IEICE Trans. Fund. Electron. Comm. Comput. Sci., Vol. E87-A, No.6, pp.1305-1309, Jun. 2004.

Abstract

This paper describes an all-digital DLL (Delay Locked Loop) circuit with a high phase resolution. The proposed architecture is based on three-stage phase tuning blocks for coarse, fine and ultra fine phase control. Each block has a phase detector, a phase selection block and a delay line. It was simulated in a 0.35 µm CMOS technology under a 3.3 V power supply. The simulation result shows the maximum phase error can be reduced to 13-42 ps with the operating range of 250 MHz to 800 MHz.

2003

Yong-Hwan Moon, Jin-Ku Kang, "2x oversampling 2.5Gbps clock and data recovery with phase picking method", CURRENT APPLIED PHYSICS, No. 6, Dec. 2003.

2002

Lee CH. Park SH. Kang JK. Kim CW, "A Real time Image Processor for Reproduction of Gray Levels in Dark Areas on Plasma Display Panel (PDP)", IEEE TRANSACTIONS ON CONSUMER ELECTRONICS, Vol. 48, pp 812-818, Nov. 2002

Abstract

Due to the non-balanced and linear luminous characteristics of RGB color channels of plasma display panels (PDPs), it is required to determine the white point of each gray level and perform inverse gamma correction. However, the two measures cause degradation of gray level representation and undesirable false contouring in the dark areas on PDPs. We describe the implementation of a real time image processor with an error diffusion algorithm and unsharp masking operation. Experimental results show improvements of gray level representation and reduction of undesirable false contouring on the PDP.
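For readers unfamiliar with error diffusion, the sketch below shows the general idea on a single scanline: each pixel is quantized to the nearest available gray level, and the quantization error is carried into the next pixel so that the average brightness is preserved even when few levels are available. This is a generic one-dimensional variant chosen for brevity, not the algorithm used in the paper; the function name `error_diffuse_row` and its parameters are hypothetical.

```python
def error_diffuse_row(row, levels=4, max_value=255):
    """Quantize one scanline to `levels` gray levels with 1-D error diffusion.

    Illustrative only: the full quantization error of each pixel is pushed
    onto the next pixel in the row.
    """
    step = max_value / (levels - 1)   # spacing between representable levels
    out = []
    carry = 0.0                       # error inherited from the previous pixel
    for pixel in row:
        target = pixel + carry
        index = round(target / step)
        index = min(max(index, 0), levels - 1)   # stay inside the level range
        value = index * step
        out.append(value)
        carry = target - value        # error to diffuse into the next pixel
    return out


if __name__ == "__main__":
    dark_ramp = [10] * 32             # a dark, flat region of the image
    print(error_diffuse_row(dark_ramp, levels=2))
```

In a dark flat region, most output pixels stay at the lowest level and an occasional brighter pixel appears, so the local average tracks the input instead of collapsing to black.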

2001

Jin-Ku Kang, "Self-Timed Pipelining using latest arriving signal detection", ELECTRONICS LETTERS, Vol. 37 No. 10, pp. 615-617, May. 2001.

Abstract

A self-timed pipelining methodology using latest arriving signal detection is presented. The self-timing control block in the algorithm consists of a self-timing signal generator and pipelining latches. The computation completion of a logic block can be detected and the data latched by the pulse-type self-timing signal for further processing. Using the algorithm, a 32-bit carry look-ahead adder is implemented. Simulation results show that the adder can operate at 800 MHz in 0.25 µm CMOS technology.

2000

Yeo-San Song, Jin-Ku Kang, Kwang-Sub Yoon, "A Delay Locked Loop Circuit with Mixed Mode Phase Tuning Technique", IEICE Trans. Fund. Electron. Comm. Comput. Sci. Vol. E83-A, No. 9, pp. 1860-1862, Sept. 2000.

Abstract

This paper describes a DLL (Delay Locked Loop) circuit with a mixed-mode phase tuning method. The circuit accomplishes unlimited phase shift and accurate phase alignment through a coarse and fine phase tuning technique. It is based on a dual delay locked loop structure. The main loop generates coarsely spaced clocks, and the second loop performs fast and accurate phase tuning with digital and analog phase detection. Simulations show that this circuit has 360 degree phase shift capability and can resolve 10 ps phase error using 0.6 μm CMOS technology.

Dong-Hee Kim, Jin-Ku Kang, "Clock and data recovery with two exclusive-OR phase frequency detector", ELECTRONICS LETTERS, Vol. 36, No. 16, pp. 1347-1349, Aug. 2000.

Abstract

This paper describes a 1.0 Gbps Clock and Data Recovery circuit with a simple PFD structure. The proposed circuit is based on a single loop controlled by a Phase Frequency Detector (PFD) which has two-XOR gates. The VCO composed of four differential buffer stages generates eight differential clocks each spaced by 45°. The PFD generates the VCO control signal by comparing two different phase clocks and input data. The circuit operates on 800 Mbps to 1.2 Gbps data rate under 2.5 V supply using 0.25 μm-CMOS HSPICE simulation. The circuit is under fabrication. The measured results are presented.

Jun-young Park, Jin-Ku Kang, "A 1.0 Gbps CMOS oversampling Data Recovery Circuit with Fine Delay Generation Method", IEICE Trans. Fund. Electron. Comm. Comput. Sci. Vol. E83-A, No. 6, pp. 1100-1105, Jun. 2000.

Abstract

This paper describes an oversampling data recovery circuit composed of an analog delay locked loop and digital decision logic. The novel oversampling technique is based on a delay locked loop circuit locked to multiple clock periods rather than a single clock period, which yields a timing resolution finer than the gate delay of the delay chain. The digital logic for data recovery was implemented with the assumption that there is no frequency deviation that hurts the center of acquired data. The chip has been fabricated using 0.6μm CMOS technology. The chip has been tested at 1.0Gb/s NRZ input data with a 125MHz clock and recovers the serial input data into eight 125Mb/s output streams.

1997

Jin-Ku Kang, W. Liu, R. Cain III, "A CMOS High-Speed Data Recovery Circuit Using the Matched Delay Sampling Technique", IEEE Journal of Solid-State Circuits, Vol. 32, No. 10, pp. 1588-1596, Oct. 1997.

Abstract

This paper presents a scheme and circuitry for demultiplexing and synchronizing high-speed data using the matched delay sampling technique. By simultaneously propagating data and clock signals through two different delay taps, the sampler achieves a very fine sampling resolution, which is determined by the difference between the data and clock delays. This high resolution sampling capability of the matched delay sampler can be used in an oversampling data recovery circuit. A data recovery circuit using the matched delay sampling technique has been designed and fabricated in 1.2-µm CMOS technology. The chip has been tested at 417 Mb/s [2.4 ns nonreturn to zero (NRZ)] input data and demultiplexes the serial input data into four 104 Mb/s (9.6 ns NRZ) output streams with 800 mW power consumption at a 4 V power supply. While recovering data, the sampling clock running at 1/4 of the data frequency phase-tracks the input data based on information extracted from a digital phase control circuit.