# ON-CHIP CIRCUIT DESIGN TECHNIQUES FOR HIGH-SPEED SERIAL LINKS

Ph.D. Thesis

by

Mahesh Kumawat



# DISCIPLINE OF ELECTRICAL ENGINEERING INDIAN INSTITUTE OF TECHNOLOGY INDORE DECEMBER 2019

# ON-CHIP CIRCUIT DESIGN TECHNIQUES FOR HIGH-SPEED SERIAL LINKS

# A THESIS

Submitted in partial fulfillment of the requirements for the award of the degree of

# **DOCTOR OF PHILOSOPHY**

by

Mahesh Kumawat



# DISCIPLINE OF ELECTRICAL ENGINEERING INDIAN INSTITUTE OF TECHNOLOGY INDORE DECEMBER 2019



## INDIAN INSTITUTE OF TECHNOLOGY INDORE

## **CANDIDATE'S DECLARATION**

I hereby certify that the work which is being presented in the thesis entitled **ON-CHIP CIRCUIT DESIGN TECHNIQUES FOR HIGH-SPEED SERIAL LINKS** in partial fulfillment of the requirements for the award of the degree of **DOCTOR OF PHILOSOPHY** and submitted in the **DISCIPLINE OF ELECTRICAL ENGINEERING, Indian Institute of Technology Indore**, is an authentic record of my own work carried out during the time period from July 2014 to December 2019 under the supervision of Dr. Santosh Kumar Vishvakarma, Associate Professor, Discipline of Electrical Engineering, Indian Institute of Technology Indore.

The matter presented in this thesis has not been submitted by me for the award of any other degree of this or any other institute.

Signature of the student with date (MAHESH KUMAWAT)

This is to certify that the above statement made by the candidate is correct to the best of my/our knowledge.

Signature of Thesis Supervisor with date (Dr. SANTOSH KUMAR VISHVAKARMA)

MAHESH KUMAWAT has successfully given his/her Ph.D. Oral Examination held on 18-May-2020.

Signature of Chairperson (OEB) Date: May 18, 2020

Signature of PSPC Member #1

Date:

Signature of Head of Discipline Date:

Digitally signed by Dr. Sudeb Dasgupta. Location: IIT Roorkee. Date: 2020-05-28 12:33+05:30

Signature of PSPC Member #2 Date:

Signature of Convener, DPGC Date: 18-05-2020

# ACKNOWLEDGEMENTS

It is the journey, not the arrival, which matters. And yes, Ph.D. is that small beginning towards the upcoming endeavors in my research career. In this part of the journey, I have been accompanied by many people who helped me to make this expedition happy and memorable. In this regard, I would like to express my appreciation to all of them.

Firstly, I would like to express my sincere gratitude to my advisor Dr. Santosh Kumar Vishvakarma, for the continuous support of my Ph.D. study and related research, for his patience, motivation, and immense knowledge. His guidance helped me in research and writing of this thesis. I could not have imagined having a better advisor and mentor for my Ph.D. study.

I also extend my heartiest thanks to my thesis committee: Dr. Amod Umarikar, Dr. Shaikh M. Mobin, for their encouragement and insightful comments and interesting discussions, which helped me to widen my research perspectives. I would also like to express my gratitude to Prof. Pradeep Mathur, Director IIT Indore. He always goes out of the way to help students of IIT Indore.

I want to express my deep gratitude to my seniors, the late Dr. Dheeraj Sharma, Dr. Chandrabhan Singh Kushwaha, Dr. Bhupendra Singh Reniwal, Dr. Pooran Singh, and Dr. Deepika Gupta, and to my colleagues Dr. Ankur Beohar, Dr. Maisagalla Gopal, Dr. Vishal Sharma, Dr. Ambika Prasad Shah, Dr. Nandakishor Yadav, Sajid Khan, Neha Gupta, Gopal R. Raut, and Gunjan, who helped me in various ways during my thesis work and provided invaluable support and motivation.

My special thanks to Dr. Vikas Vijayvargiya, Dr. Abhishek Kumar Upadhyay, Gaurav Singh, Ravi Kumar, Pooja Bohara, Neha Singh, Deepak Mittal, and my loving sisters Meena and Mangla for their faith and motivation during the toughest phase of my Ph.D. Without their support, I could not even have imagined achieving my goals in doctoral studies.

My sincere thanks to Abhijeet, Sharad, Sudheer, and the staff of the Electrical Engineering Department, Ms. Shagufta, Mr. Raghvendra, Mr. Ram Kumar, for their enormous help during my research work.

This thesis is the outcome of many sacrifices made on my behalf by my parents, my wife, and my kids Tarun and Anshika. My mother and father have always been a source of encouragement and inspiration throughout my life. I would like to extend my special heartfelt thanks to my wife Meena for her affection, absolute love, and precious support. Her support, patience, and encouragement in the toughest period of my Ph.D. were a great treasure. This thesis would not have been possible without their support and help.

Last but not least, I thank god almighty for showering his blessings since my childhood.

Mahesh Kumawat

Dedicated

to

My Parents

#### ABSTRACT

Advances in semiconductor manufacturing allow a higher degree of integration for application-specific integrated circuits (ASICs). High transmission bandwidth requires an interface system that meets the need for error-free transmission and communication. Advances in high-speed computing and communication networks require scaling of silicon technology to integrate a larger number of components on a small chip area with low power consumption.

Serial link transceivers are a crucial element in achieving high data transmission rates under the I/O pin limitations of systems-on-chip (SoCs) and ASICs. To meet these high-speed data computing requirements, power consumption and data recovery become key aspects. The focus of this thesis is on the circuit design of serial links using different methods of serialization.

A new serial link transceiver design is presented for high-speed synchronous transmission. The design consists of a wave combining and driving unit at the transmitter end and a decombiner at the receiver end. The wave combining and driving unit combines the different serial data streams and drives them over the transmission channel, whereas the decombiner separates the information according to the clock signal. Continuous-Time Linear Equalizer (CTLE) taps help to limit the jitter tolerance up to 10% of the data received. The other commonly used serial link method is asynchronous transmission. Asynchronous serial links are independent of a clock, and handshake protocols ensure valid data transmission. However, handshaking signals slow down the circuit, so current mode logic blocks are used for faster transmission of the signal. To support asynchronous serial link communication, an improved Current Mode Logic (CML) latch design is proposed. The improved CML latch boosts the output voltage swing, and a delay model of the latch is also proposed, based on small-signal equivalent circuit analysis. As an application of the CML latch, an asynchronous wave-pipelined serial link is proposed, in which the serializer and deserializer are built using the proposed CML latch. The CML circuit operates at a higher data rate with low power consumption.

#### **List of Publications**

#### List of publications included in thesis:

#### **Peer Reviewed Journals: (04)**

1. Mahesh Kumawat, Abhishek Dalal, Mohit S. Choudhary, Ravi Kumar, Gaurav Singh and S. K. Vishvakarma, "Wave Combining Driver based Serial Data Link Transceiver Design for Multi-Standard Applications," *Journal of Nanoelectronics and Optoelectronics*, vol. 14, no. 5, pp. 675-679, May 2019.
2. Mahesh Kumawat, Abhishek Kumar Upadhyay, Sanjay Sharma, Ravi Kumar, Gaurav Singh and S. K. Vishvakarma, "An Improved Current Mode Logic Latch for High-Speed Applications," *International Journal of Communication Systems*, Wiley. (Accepted)
3. Mahesh Kumawat, Mohit S. Choudhary, Ravi Kumar, Gaurav Singh and S. K. Vishvakarma, "A Novel CML latch based Wave Pipelined Serial Link Transceiver for Low power application," *Journal of Circuits, Systems and Computers*, World Scientific. (Accepted)
4. Mohit S. Choudhary, Mahesh Kumawat, Pramod K. Bharti and S. K. Vishvakarma, "16.64Gbps Synchronous CML SerDes Transceiver Design Technique with Process Corner Variations for Low Power Application," *IEEE Computer Society TC-VLSI Circuits & Systems Letter*, vol. 2, issue 1, April 2016. (*ESCI Journal*)

#### List of publications apart from thesis:

#### **Conference Proceedings: (01)**

1. Tuhina Bhalla, Mahesh Kumawat, Atul Awadhiya, S. K. Vishvakarma, Vaibhav Neema, "Energy Efficient Low Power DC Balanced Full Custom Circuit Design of 8b/10b Encoder and Decoder," 3<sup>rd</sup> IEEE International Conference on Microelectronics, Circuits and Systems (Micro2016), 9<sup>th</sup>-10<sup>th</sup> July 2016, Kolkata, India.

# CONTENTS

| Section | Title |
|---------|-------|
|         | ABSTRACT |
|         | LIST OF PUBLICATION |
|         | LIST OF FIGURES |
|         | LIST OF TABLES |
|         | LIST OF ABBREVIATIONS |
| 1       | Introduction |
| 1.1     | Motivation |
| 1.2     | Research Objectives |
| 1.3     | Thesis Organization |
| 2       | Background and Related Work |
| 2.1     | Overview of Serial Link |
| 2.1.1   | Transmitter |
| 2.1.2   | Receiver |
| 2.2     | Equalization |
| 2.3     | Related Work |
| 3       | Wave Combining Driver based Serial Data Link Transceiver Design for Multi Standard Applications |
| 3.1     | Introduction |
| 3.2     | Proposed Transceiver Design |
| 3.2.1   | Serializer Design |
| 3.2.2   | Deserializer Design |
| 3.3     | Wave Combiner and Decombiner |
| 3.4     | Results and Discussion |
| 3.5     | Conclusion |
| 4       | An Improved Current Mode Logic Latch for High-Speed Application |
| 4.1     | Introduction |
| 4.2     | CML Latch |
| 4.3     | Delay Model |
| 4.4     | Proposed CML Design |
| 4.5     | Static Model and Transistor Sizing |
| 4.6     | Results and Discussion |
| 4.7     | Conclusion |
| 5       | A Novel CML Latch Based Wave-Pipelined Asynchronous SerDes Transceiver for Low Power Application |
| 5.1     | Introduction |
| 5.2     | Proposed Serializer and Deserializer |
| 5.3     | CML Latch and Other Building Blocks |
| 5.4     | Results and Discussions |
| 6       | Conclusions and Future Works |
| 6.1     | Conclusions |
| 6.2     | Future Works |
|         | REFERENCES |

# **LIST OF FIGURES**

| Figure | Caption |
|--------|---------|
| 1.1    | Internet of Things and its Interconnected area |
| 1.2    | Role of LTE in Internet of Things |
| 2.1    | Block diagram of serial link |
| 2.2    | Synchronous and asynchronous types of serial link |
| 2.3    | Serial Link building blocks |
| 2.4    | Eye diagram |
| 2.5    | CTLE (a) Passive (b) Active |
| 3.1    | Block diagram of proposed SerDes with proposed wave combiner and decombiner |
| 3.2    | Serializer design with stage wise serialization |
| 3.3    | Deserializer design with outputs |
| 3.4    | (a) Wave combiner and driving unit design (b) Decombiner design |
| 3.5    | Simulation result waveform (a) Transmitter end (b) Receiver end |
| 3.6    | Eye diagram (a) Transmitter and Receiver Section (b) Decombined signal |
| 4.1    | CML Latch (a) Conventional CML latch (b) Modified CML Latch of Ref. [24] |
| 4.2    | Equivalent circuit model for Delay |
| 4.3    | Proposed CML Latch |
| 4.4    | Proposed CML Latch Equivalent circuit model for Delay |
| 4.5    | Layout of CML Latch |
| 4.6    | Proposed CML latch output at 1.25 GHz |
| 4.7    | (a) Frequency divider (b) Post layout simulation of frequency divider |
| 5.1    | Asynchronous Transceiver (a) Serializer (b) Deserializer |
| 5.2    | Conventional CML Latch |
| 5.3    | Proposed Novel CML Latch |
| 5.4    | Delay element (DE) |
| 5.5    | Control Block |
| 5.6    | CML Mux |
| 5.7    | CML Inverter |
| 5.8    | Initial stage circuit |
| 5.9    | CML to CMOS Converter |
| 5.10   | CML latch Output (a) Proposed Latch Output Waveform (b) Conventional latch Output Waveform |
| 5.11   | Serializer output for the input bits 10101010 |
| 5.12   | Deserializer output for the input bits 10101010 |

# LIST OF TABLES

| Table | Title |
|-------|-------|
| 3.1   | Power consumption of block associated with design |
| 3.2   | Comparative analysis with previous work |
| 4.1   | Comparative analysis with previous work |
| 5.1   | Comparison of SerDes Architectures |
| 5.2   | PVT Corner of various SerDes Transceiver |

# LIST OF ABBREVIATIONS

| ASIC     | : | Application Specific Integrated Circuits  |
|----------|---|-------------------------------------------|
| BSIM3v3  | : | Berkeley Short-Channel IGFET Model        |
| CML      | : | Current Mode Logic                        |
| C-L      | : | CML Latches                               |
| CMOS     | : | Complementary Metal-Oxide Semiconductor   |
| CTLE     | : | Continuous Time Linear Equalizer          |
| DC       | : | Delay Control                             |
| DE       | : | Delay Element                             |
| DETFF    | : | Double Edge Triggered Flip-Flops          |
| FF       | : | Fast NMOS Fast PMOS                       |
| HDMI     | : | High-Definition Multimedia Interface      |
| ICs      | : | Integrated Circuits                       |
| IGFET    | : | Insulated-Gate Field-Effect Transistor    |
| IoT      | : | Internet of Things                        |
| ISI      | : | Inter-Symbol Interference                 |
| LEDR     | : | Level Encoded Dual-Rail                   |
| LTE-A    | : | Long Term Evolution Advanced              |
| MOS      | : | Metal-Oxide Semiconductor                 |
| MUX      | : | Multiplexer                               |
| NMOS     | : | N-type Metal-Oxide Semiconductor          |
| NoC      | : | Network on Chip                           |
| PCIE     | : | Peripheral Component Interconnect Express |
| PLL      | : | Phase-Locked Loop                         |
| PMOS     | : | P-type Metal-Oxide Semiconductor          |
| PVT      | : | Process Voltage Temperature               |
| SATA     | : | Serial Advanced Technology Attachment     |
| SerDes   | : | Serializer and De-serializer              |
| SF       | : | Slow NMOS Fast PMOS                       |
| SIPO     | : | Serial-In-Parallel-Out                    |
| SoC      | : | System on Chip                            |
| SS       | : | Slow NMOS Slow PMOS                       |
| TT       | : | Typical NMOS Typical PMOS                 |
| UHF      | : | Ultra-High Frequency                      |
| USB      | : | Universal Serial Bus                      |
| WP       | : | Wave Pipeline                             |
| 3G/4G/5G | : | Third/Fourth/Fifth Generation             |

# **Chapter 1**

# Introduction

### **1.1 Motivation**

The era of the Internet of Things (IoT) and Big Data requires a significant amount of data to be generated and transmitted through various hardware and software protocols. IoT networks carry an enormous amount of information from multiple connected devices. Because this data usage is increasing day by day, the interconnected devices of an IoT network require standard protocols to work together for data handling, so that the data can also be used with other network protocols [1][2].



Figure 1.1 Internet of Things and its Interconnected area [3]

Figure 1.1 shows the role of the IoT network in daily life; it covers every aspect connected to human needs, e.g., healthcare, infrastructure, automation, and energy [3]. It uses various communication protocols to fulfill these requirements, which play a significant role in wireline and wireless communication techniques [2]. Information transfer among multiple devices over the network requires a fast data transmission rate, and the internet bandwidth requirement has grown to match the needs of IoT systems. As the IoT network uses machine-to-machine communication, it requires a large number of transmitters and receivers for real-time applications in this emerging field [4]. It also requires the deployment of sensors and control units at remote locations, which helps in data sensing.

The growth rate of data transfer in cellular networks is reaching new milestones with IoT and Big Data. In mobile networks, the latest standards such as Long Term Evolution Advanced (LTE-A) [5] also play an essential part in the IoT network. They help in connecting the various IoT nodes and in sharing information with data centres, as shown in Figure 1.2 [6-7], and they address demands such as high bandwidth, data rate, and control.



Figure 1.2 Role of LTE in Internet of Things [7]

In machine-to-machine communication, intermediate nodes are used to send data from the sensor to the network. Data processing through these intermediate nodes may reduce the cost of the system, but there are issues related to topological or routing changes. The deployment of sensing devices in next-generation communication standards such as 4G/LTE-A or its advanced versions demands communication efficiency, reliability, and data security [8],[9]. To match this surging demand for high-speed networks, operators are turning towards 4G networks and boosting their connectivity, capacity, and speed [9].

As the number of I/O pins in integrated circuits (ICs) increases, it results in high power dissipation, capacitive load, inter-symbol interference (ISI), and signal losses over the medium [10-12]. These performance degradation issues make conditions worse for networks-on-chip, routers, and crossbar switches [11]. In addition, the scaling of MOS devices improves the performance of various circuit blocks, but the global interconnect does not follow the same scaling improvement as the logic blocks due to low congestion communication infrastructure. As chip sizes shrink day by day, the design of high-speed links becomes more complex and requires power optimization within the reduced area. Another major design issue is clock synchronization: timing errors due to jitter and skew on the parallel bus make receiver synchronization critical and limit the bandwidth [11-12].

To achieve connectivity and high speed in on-chip communication, one can choose either parallel communication or a serial link. Parallel data transmission requires parallel transmission lines, which increase crosstalk as well as chip area. It also creates routing congestion at the receiver end, which results in signal recovery problems. To overcome these issues of parallel transmission, researchers have turned to the serial link as a promising solution for data transfer [13].

Serial link transceivers can meet the bandwidth requirement with small chip area and low cost, which makes them a most promising solution. They find application in Serial Advanced Technology Attachment (SATA), Peripheral Component Interconnect Express (PCIE), High-Definition Multimedia Interface (HDMI), and Universal Serial Bus (USB) 3.0. They are also widely used in systems-on-chip (SoCs) for multi-core communication, such as networks-on-chip (NoCs). Serial link transceivers have been designed using synchronous and asynchronous methods [14]. A synchronous transceiver uses clock signal control to select among the various parallel signals. Different types of clock selection are used to convert the low-frequency parallel data into high-frequency serialized data, which is then transmitted through the channel. At the receiver end, a deserializer recasts the serialized data into parallel signals. An asynchronous transceiver is controlled entirely by handshaking protocols instead of depending on clock selection. The serial link transceiver is used in different backplane communication circuits such as equalizers, analog-to-digital converters (ADCs), and data converters. These circuits are required to operate at high speed, which is difficult to achieve using conventional metal-oxide-semiconductor (MOS) based circuit design.

Additionally, circuit design using MOS devices offers different transition frequencies for NMOS and PMOS transistors, which results in several design issues. Carrier mobility in a PMOS transistor is lower than in an NMOS transistor, which can restrict the operating speed of the circuit. The PMOS device has an inferior unity-gain frequency and cannot operate in the higher-frequency range, which may degrade high-speed performance [13–23].

The local oscillator signal for frequency synthesis and modulation is typically generated with a phase-locked loop (PLL) based frequency synthesizer. The divider is a crucial element of a high-frequency synthesizer because it operates at the highest frequency. Two main divider architectures are widely used for high-frequency implementation: the true single-phase clocked (TSPC) divider and the master-slave flip-flop based frequency divider [15-21]. The divider can be designed using CMOS rail-to-rail logic or current mode logic (CML). For low-frequency applications, CMOS logic is preferred owing to its simplicity, while CML is used for high-frequency applications because of its rapid switching speed, low power dissipation, and smaller output voltage swing [23–26].

This thesis focuses on the design and development of circuits for high-speed serial link transceivers using synchronous and asynchronous transmission methods, which can help the transceivers meet high data speed requirements. In addition, a current mode logic latch is discussed, and its time delay model is developed along with that of a previously published latch.

#### **1.2 Research Objectives**

Circuit design techniques for both synchronous and asynchronous high-speed serial link transceivers are presented. The specific contributions of this thesis for serial link transceivers are as follows.

- Design of a synchronous transceiver based on wave combiner and decombiner circuit blocks.
- Analysis and design of a current mode logic latch with improved performance parameters, together with the development of a delay model for the latch.
- Design of an asynchronous serial link transceiver block using the improved current mode logic latch and delay elements.

#### **1.3 Thesis Organization**

The chapters of this thesis provide the background and related introductory information, followed by a compilation of the published research work. The thesis is organized as follows.

Chapter 1 describes the design requirements and challenges posed by the present standards of IoT, Big Data, and machine-to-machine communication. It introduces the serial link, the need for it, and the various serializing methods.

Chapter 2 provides a brief discussion of high-speed serial link blocks. It also explains the need for continuous time linear equalizers and current mode logic circuits.

Chapter 3 is concerned with the wave combining driver based serial data link transceiver design for multi-standard applications. In this chapter, a new serial link transceiver design is presented to achieve high-speed transmission.

Chapter 4 presents an improved Current Mode Logic (CML) latch design for high-frequency applications. A delay model based on the small-signal equivalent circuit is also developed for the analysis of the proposed CML latch.

Chapter 5 discusses the high-speed asynchronous wave-pipelined serializer and deserializer (SerDes) transceiver design implemented using current mode logic (CML).

Chapter 6 summarizes the contributions of the thesis and provides suggestions for future research.

# **Chapter 2**

# Background and Related Work

#### 2.1 Overview of Serial Link

Before discussing serial links, it is useful to understand why many applications have moved from parallel link communication to serial links. Because of the parallel nature of the transmission link, wide data buses are required to transmit the information. As every data bit is transferred through its own link, each bit requires its own conductor. This results in limited data transmission speed and large area [9], [27-28]. The higher performance systems are typically used in supercomputers or workstations.

In the last two decades, the demand for data transmission has grown exponentially. Meeting this demand with parallel link communication creates major issues regarding cost and area, and serial link transmission avoids these bottlenecks [28]. The serial link is the most promising solution, and some of its interfaces, such as PCIe and SATA, are used in computer applications because they address all the design specifications, i.e., area, power, cost, and data bandwidth. Replacing the parallel method with the serial link method for data transmission therefore yields immediate improvements in cost and area [29].

Moreover, in serial link communication the data is transmitted over a single line, which reduces crosstalk and data skew, the major concerns of parallel line transmission [30]. Its smaller area also helps in chip packaging. Serial links also follow Moore's law under technology scaling, which helps in reducing the supply voltage in proportion to the scaling of technology. With all these advantages, serial links have proven to be a favorable option for achieving the required data transmission and efficiency [31].

Figure 2.1 shows the basic block diagram of a serial data link. It consists of three blocks. The first is the transmitter, which modulates the digital input bits into an analog signal according to the clock input and drives it onto the channel. The second is the transmission channel, which carries the data from the transmitter to the receiver; this medium can be an optical fiber, a copper wire, or a metal line. The third, on the other side of the channel, is the receiver, which samples the incoming signal and deserializes the transmitted bits according to the clock input [32-33].



Figure 2.1 Block diagram of Serial Link

Figure 2.2 shows the two basic types of serial link, synchronous and asynchronous. A synchronous serial link transceiver uses a reference clock signal and serializes the data according to the clock edges. The data from the transmitter (Tx) is sent over the channel along with the clock, which helps the receiver recover the serialized bits using clock and data recovery. An asynchronous serial link uses standard baud rates over the transmission medium and a reference clock generator at both the transmitter and the receiver end for clock and data recovery [32-33].





Figure 2.2 Serial link type (a) synchronous (b) asynchronous

A typical implementation of a synchronous serial link is shown in Figure 2.3. The transmitter and receiver are explained below.



Figure 2.3 Serial Link Building Blocks

### 2.1.1 Transmitter

The transmitter consists of a serializer, an output driver, and a PLL. The output driver and serializer transmit data bits at maximum speed and consume a large share of the transmitter power. The PLL provides the clock edges on which the data inputs are serialized, and it must also meet the jitter requirement of the transmitter. Drivers are used at both the transmitting and the receiving end of the channel for signal amplification [34]. The output driver in the transmitter must have a fixed output impedance that is matched to the transmission channel, independent of the output swing, and controllable with equalization methods. The conventional output drivers are current mode logic (CML) drivers and voltage mode drivers. CML drivers have all the desirable features needed to match the transmission channel requirements, and their only disadvantage is large current consumption, whereas voltage mode drivers allow more energy-efficient operation [35-36].

### 2.1.2 Receiver

The role of the receiver is to recover the input data bits by sampling them with a recovered clock. In the transmission channel, the information is attenuated and distorted by noise and channel losses. To compensate for these losses, equalizers can be used at both ends of the serial link. Different types of equalizers are used in serial links; equalization is explained in Section 2.2. The recovered clock must be aligned with the center of the received bit, so that a proper voltage margin is maintained for both the high and the low voltage levels. After the signal has been equalized, the clock and data recovery circuit recovers the clock from the received bit stream and uses it to sample the data. The received bit stream must then be converted back into a set of parallel data. This deserialization is performed by the deserializer, which is fundamentally designed with flip-flop based shift registers or de-multiplexers [33-35].
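The serialize and deserialize steps described above can be illustrated with a purely behavioral Python sketch. It is only an illustration under stated assumptions and not the circuit proposed in this thesis: the function names `serialize` and `deserialize`, the 8-bit word width, and the ideal (error-free, already recovered) clock are all hypothetical choices made for the example.

```python
from typing import List


def serialize(words: List[List[int]]) -> List[int]:
    """Parallel-in serial-out: emit each parallel word one bit per clock edge."""
    stream = []
    for word in words:
        for bit in word:
            stream.append(bit)
    return stream


def deserialize(stream: List[int], width: int) -> List[List[int]]:
    """Serial-in parallel-out (SIPO) shift register built from `width` stages.

    Assumes an ideal recovered clock: one call to the loop body per bit,
    with the word boundary already aligned to the start of the stream.
    """
    shift_reg = [0] * width
    words = []
    for count, bit in enumerate(stream, start=1):
        # On every clock edge each stage takes the value of its neighbour
        # and the last stage captures the incoming serial bit.
        shift_reg = shift_reg[1:] + [bit]
        # After `width` edges the register holds one complete parallel word.
        if count % width == 0:
            words.append(list(shift_reg))
    return words


# Hypothetical example data: two 8-bit parallel words.
words = [[1, 0, 1, 0, 1, 0, 1, 0], [1, 1, 0, 0, 1, 1, 0, 0]]
stream = serialize(words)
recovered = deserialize(stream, width=8)
print(stream)
print(recovered)
```

In this idealized model the recovered words equal the transmitted words; in a real link the quality of the recovered clock and the channel losses determine whether that holds.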

### 2.2 Equalization

An important part of any data communication method is the proper recovery of the data. As discussed in Section 2.1, crosstalk and skew are major contributors to distortion and loss of information in the channel. In a serial link, an equalization circuit is used to remove these distortion effects and recover the signal properly. The equalization circuit also helps to reduce the bit error rate and to open a closed eye diagram. At the receiver end, the signal is sampled repetitively to generate the eye diagram, which helps the circuit designer to evaluate the effect of skew, noise and inter-symbol interference during transmission. Figure 2.4 shows an eye diagram for the transitions from 0 to 1 and from 1 to 0.



Figure 2.4 Eye Diagram

An eye diagram is interpreted according to its opening and closure. The height of the eye opening (peak to peak) reflects the noise in the signal, whereas the eye width reflects jitter effects and the behavior of the circuit architecture. Any inter-symbol interference appears as eye closure and hinders proper recovery of the signal [31]. To overcome inter-symbol interference, equalizer circuits are used. Figure 2.5 (a) shows the passive continuous time linear equalizer (CTLE) used for the removal of inter-symbol interference [37].






Figure 2.5 CTLE (a) Passive (b) Active

For the passive equalizer proposed by Hanumolu [37], the transfer function is as follows.

$$H(s) = \frac{R_2}{R_1 + R_2} \frac{1 + R_1 C_1 s}{1 + \frac{R_1 R_2}{R_1 + R_2} (C_1 + C_2) s} \tag{2.1}$$

The pole and the zero frequencies of the CTLE are as follows.

$$\omega_z = \frac{1}{R_1 C_1} \tag{2.2}$$

$$\omega_p = \frac{1}{\frac{R_1 R_2}{R_1 + R_2} (C_1 + C_2)} \tag{2.3}$$

The overall DC gain of the passive CTLE depends on the resistor values:

$$DC \ Gain = \frac{R_2}{R_1 + R_2} \tag{2.4}$$

When this CTLE is used at the receiver of a serial link, the equalizer boosts the high frequencies relative to the low frequencies according to the selected pole and zero. If the pole frequency of the CTLE is higher than the pole frequency of the transmission channel, the bandwidth of the equalized channel increases.
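
As a rough numerical illustration of Eqs. (2.2)-(2.4), the following Python sketch evaluates the zero, the pole, the DC gain and the high-frequency gain of the passive CTLE. The component values and the helper name `passive_ctle` are assumptions made only for this example; they are not taken from [37] or from the designs in this thesis.

```python
import math

def passive_ctle(r1, r2, c1, c2):
    """Return (w_z, w_p, dc_gain, hf_gain) of the passive CTLE; w in rad/s."""
    w_z = 1.0 / (r1 * c1)                  # zero frequency, Eq. (2.2)
    r_par = r1 * r2 / (r1 + r2)            # R1 in parallel with R2
    w_p = 1.0 / (r_par * (c1 + c2))        # pole frequency, Eq. (2.3)
    dc_gain = r2 / (r1 + r2)               # Eq. (2.4)
    hf_gain = dc_gain * w_p / w_z          # |H| for s -> infinity, equals C1/(C1+C2)
    return w_z, w_p, dc_gain, hf_gain

# Hypothetical component values, chosen only for illustration.
R1, R2 = 900.0, 100.0          # ohms
C1, C2 = 200e-15, 50e-15       # farads

w_z, w_p, dc_gain, hf_gain = passive_ctle(R1, R2, C1, C2)
boost_db = 20.0 * math.log10(hf_gain / dc_gain)

print(f"zero : {w_z / (2 * math.pi) / 1e9:.2f} GHz")
print(f"pole : {w_p / (2 * math.pi) / 1e9:.2f} GHz")
print(f"DC gain {dc_gain:.2f}, high-frequency gain {hf_gain:.2f}, boost {boost_db:.1f} dB")
```

Because the network is passive, both gains stay below one: the equalizer works by attenuating low frequencies more than high frequencies.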

The passive CTLE is limited by its single zero. For a lossy channel, a CTLE with a single zero is not sufficient to compensate the channel response; another zero is required to add a further +20dB/decade over the frequency range of interest. Another form of linear equalizer is the active CTLE shown in Figure 2.5 (b), which provides gain, is easy to design and can be integrated on silicon. The additional zero in this design is obtained with the parallel combination of a resistor and a capacitor [37]. The transfer function of the active CTLE is given by

$$H(s) = \frac{g_m}{C_L} \frac{s + \frac{1}{R_s C_s}}{\left(s + \frac{1 + g_m R_s/2}{R_s C_s}\right)\left(s + \frac{1}{R_L C_L}\right)} \tag{2.5}$$

The zero and the poles of the active CTLE are as follows.

$$\omega_z = \frac{1}{R_s C_s} \tag{2.6}$$

$$\omega_{p1} = \frac{1}{R_L C_L} \tag{2.7}$$

$$\omega_{p2} = \frac{1 + g_m R_s/2}{R_s C_s} \tag{2.8}$$

For high-frequency gain, the pole frequency is selected higher than the zero frequency, and the peaking gain is controlled by the ratio of the selected pole and zero frequencies. The peaking gain is as follows.

$$A = \frac{\omega_{p1}}{\omega_z} \frac{g_m R_L}{1 + g_m R_s/2} \tag{2.9}$$

This method of equalization helps the serial link to support a higher data transmission rate with a proper eye height and width.
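
To make the behavior of Eq. (2.5) concrete, the sketch below evaluates the active CTLE transfer function numerically and reports the DC gain and the peak of the magnitude response. The element values and the function name `active_ctle_h` are hypothetical and serve only as an illustration; the frequency sweep is a simple numerical search, not a design procedure from [37].

```python
import math

def active_ctle_h(s, gm, rs, cs, rl, cl):
    """Transfer function of the active CTLE, Eq. (2.5)."""
    w_z = 1.0 / (rs * cs)                        # zero, Eq. (2.6)
    w_p1 = 1.0 / (rl * cl)                       # load pole, Eq. (2.7)
    w_p2 = (1.0 + gm * rs / 2.0) / (rs * cs)     # degeneration pole, Eq. (2.8)
    return (gm / cl) * (s + w_z) / ((s + w_p2) * (s + w_p1))

# Hypothetical element values (siemens, ohms, farads), for illustration only.
GM, RS, CS, RL, CL = 10e-3, 400.0, 500e-15, 200.0, 20e-15

dc_gain = abs(active_ctle_h(0.0, GM, RS, CS, RL, CL))

# Logarithmic sweep from 10 MHz to 100 GHz to locate the gain peak.
freqs = [10 ** (7 + 4 * k / 400) for k in range(401)]
mags = [abs(active_ctle_h(2j * math.pi * f, GM, RS, CS, RL, CL)) for f in freqs]
peak_gain = max(mags)
f_peak = freqs[mags.index(peak_gain)]

print(f"DC gain   : {dc_gain:.3f}")
print(f"peak gain : {peak_gain:.3f} at {f_peak / 1e9:.1f} GHz")
print(f"peaking   : {20 * math.log10(peak_gain / dc_gain):.1f} dB")
```

Setting $s = 0$ in Eq. (2.5) gives a DC gain of $g_m R_L / (1 + g_m R_s/2)$, and the magnitude can never exceed $g_m R_L$, so the amount of peaking is set by how far the zero lies below the two poles.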

The differential pair design is also useful for the driver in a transceiver. Before data are transmitted, the driver needs a fixed output impedance that matches the channel for signal integrity and that stays unchanged under equalization and under control of the output voltage swing. The differential pair design is the basis of the current mode logic driver, whose output impedance is easy to match with the channel and is independent of the output signal swing. To identify the research findings and gaps of published work in the field of high-speed serial links, the literature was surveyed; the related work in serial link design is discussed in Section 2.3.

#### 2.3 Related Work

S. Safwat *et al.* [32] proposed a self-timed signaling technique to multiplex data at a speed of 12Gbps with a clock frequency of 24GHz. A new three-level signaling scheme is introduced to extract data and clock at the receiver side. The SerDes transceiver is tested on a lossy on-chip transmission line and shows 15mW power consumption. A modified self-timed design is published in [33]. This amendment removes the limitation on the minimum frequency found in the previous technique, and the data transmission speed is improved to 16Gbps. A single-ended transmission line is used to solve routing congestion at the receiver side. The authors also show switching threshold inverter techniques for correction of the signal at the receiver side. The work proposed in Chapter 3 uses a combiner at the transmitter side and a decombiner at the receiver side along with threshold inverters; this method maintains two-level signaling.

Jaiswal *et al.* [36] presented an on-chip asynchronous wave-pipelined CML SerDes, which uses a delay element and a MUX to propagate bits through the serial link while new bits are loaded via the MUX. This SerDes is implemented in 65nm technology, resulting in a total power consumption of 14.3mW and a data rate of 12.67Gbps.

Tadros *et al.* [34] showed a differential self-timed three-level signaling scheme, which uses a clock at half the data rate and achieves 24Gbps data transmission in 65nm technology. With a 12GHz input clock, it serializes eight parallel input bits, each at a speed of 3Gbps.

Mu-Shan Lin *et al.* [38] present a SerDes system with a speed of 5Gbps for low power PCI Express. It occupies an area of 510 $\mu$m × 710 $\mu$m and consumes 125mW with a supply voltage of 0.9 V in 40 nm technology. An asynchronous method of serialization is used by Bui Chinh Hien *et al.* in [39]. In this technique, data is fed in through a transmission gate with the help of the *load* signal; at this time the *en\_se* signal is LOW, so the tri-state inverters are disabled, which prevents data from propagating before it is properly loaded. After loading, the *load* signal goes LOW and *en\_se* goes HIGH to propagate the data. The deserializer has the same structure as the serializer, which provides a timing reference and prevents jitter and data corruption. The circuit operates at a speed of 3.9Gbps with 2.44mW power consumption.

In [40], a clocked static CMOS/CML 16:1 serializer is designed by using 2:1 multiplexers to combine the inputs for serialization. Both CMOS and CML are used so that the advantages of both can be utilized. The low-speed branches are implemented in CMOS to reduce the static power loss, and CML is used in the high-frequency stages. The circuit operates at a speed of 10Gbps with a power of 106mW. In [34], an asynchronous CML SerDes is designed with multiplexers, all of which are implemented in CML. A *load* signal is used to control the flow of the data. Data is loaded from one input of the multiplexer and propagated from the other input. The serializer and deserializer have the same structure and a control block for proper functioning. A *pilot* bit is added to every serialized 8-bit word so that the control signal can sense this bit and determine that the bit stream has arrived for deserialization. The control block then generates the signal to stop the transmission, and the parallel data becomes available at the output pins.

In [24], an ultra-high speed CML latch is designed. This latch uses only one additional transistor, active at the negative level of the clock, so that the tail current at that clock level is increased, which reduces errors and thereby increases the speed. Dobkin *et al.* [42] proposed a wave-pipelined bit-serial link with the level encoded dual-rail (LEDR) asynchronous protocol to reduce per-bit synchronization.

# Chapter 3

# Wave Combining Driver based Serial Data Link Transceiver Design for Multi Standard Applications

## 3.1 Introduction

The internet revolution has increased data traffic, and supporting this traffic requires higher bandwidth and growth in equipment performance. A wide variety of devices now need to send and receive information over the internet. Previously, data inside a system was transmitted using parallel communication. Parallel links require a large cable area to carry the data and suffer from crosstalk between the multiple signal lines. Serial transmission was adopted as a solution that simplified data transmission protocols.

This chapter presents a new serial link transceiver design that targets high-speed data transmission. The proposed design comprises a wave combining and driving unit at the transmitter end and a decombiner at the receiver end. The continuous time linear equalizer (CTLE) helps to limit the jitter to 10% of the data received. The simulation results at PVT corners show its tolerance to process corner variations. The wave combiner design may help to double the transmission speed of existing serial link standards such as PCI, HDMI, USB, and SATA.

#### 3.2 Proposed Transceiver Design

Figure 3.1 shows the block diagram of the proposed synchronous transceiver design. At the transmitter side, two serializers are used to serialize 8-bit parallel data according to the clocking sequences. The transmitter section includes the serialization blocks and the wave combining and driving unit. The wave combining unit generates a combined bit sequence with driving capability so that it can be transferred over the transmission channel.

In the transmitter, an oscillator is used to generate the operating frequency of 8GHz. Initially, a 1GHz clock is used to select the data bits at the input of each serializer, which serializes the data at a rate of 8Gbps; the serialized data is then passed through buffers. The wave combining unit combines these serialized streams and drives the result onto the transmission channel with the help of the clock signal. A single dispersion-less transmission line with a length of 3mm is used as the transmission channel. At the receiver, a differential pair based active equalizer is used to compensate the signal for noise and distortion. A de-combiner then separates the data into data stream 1 and data stream 2, and the deserializers recast the serial data into parallel form.
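
The rate budget implied by these clock frequencies can be written out as a short worked calculation. It assumes that each serializer emits one 8-bit word per period of the 1GHz clock and that the combiner places one bit on each edge of the 8GHz clock; the symbols $R_{ser}$, $R_{comb}$ and $T_{bit}$ are introduced here only for illustration.

$$R_{ser} = 8 \times 1\,\text{GHz} = 8\,\text{Gbps}, \qquad R_{comb} = 2 \times 8\,\text{GHz} = 16\,\text{Gbps}, \qquad T_{bit} = \frac{1}{R_{comb}} = 62.5\,\text{ps}$$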



Figure 3.1 Block diagram of proposed SerDes with proposed wave combiner and decombiner

## 3.2.1 Serializer Design

The serializer design shown in Figure 3.2 consists of a combination of double edge triggered flip-flops (DETFFs) for serializing the data [32]. The DETFFs are the basic cells of the serialization block and select the data on the positive and negative edges of the clock. These DETFFs are designed using transmission gate logic, which offers higher speed and lower power than static CMOS.



Figure 3.2 Serializer design with stage wise serialization (panel (d): third stage clock)

In Figure 3.2 (b), a different clock is applied at every stage. Each stage selects data bits on the positive and negative edges of its clock signal, so the transmitter output follows the order of the serializer input bits.

In the first stage, where the clock is one fourth of the main clock, the data is serialized in the pattern shown in Figure 3.2 (b). This serialized data is then applied to the second stage of DETFFs, which serializes it into odd and even bit streams according to its clock/2 signal, as shown in Figure 3.2 (c). Finally, the last stage uses the same principle to serialize all 8 bits into a single serial stream, as shown in Figure 3.2 (d).
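
The stage-wise serialization can be sketched behaviorally in Python. This is a minimal model under stated assumptions, not the transistor-level DETFF circuit: each 2:1 stage is modeled as interleaving two streams on alternating clock edges, and the first-stage pairing (b1,b5), (b3,b7), (b2,b6), (b4,b8) is an assumption chosen so that the second stage yields the odd and even streams described above. The function names are hypothetical.

```python
def interleave(a, b):
    """2:1 double-edge stage: emit a[i] on one clock edge and b[i] on the other."""
    out = []
    for x, y in zip(a, b):
        out.extend([x, y])
    return out

def serialize8(word):
    """Serialize one 8-bit word [b1, ..., b8] through three 2:1 stages."""
    b1, b2, b3, b4, b5, b6, b7, b8 = word
    # Stage 1 (clock/4): four 2:1 cells, pairing assumed for this sketch.
    s1 = [interleave([b1], [b5]), interleave([b3], [b7]),
          interleave([b2], [b6]), interleave([b4], [b8])]
    # Stage 2 (clock/2): produces the odd and the even bit streams.
    odd = interleave(s1[0], s1[1])     # b1 b3 b5 b7
    even = interleave(s1[2], s1[3])    # b2 b4 b6 b8
    # Stage 3 (full-rate clock): merges odd and even into one serial stream.
    return interleave(odd, even)

word = [1, 0, 1, 1, 0, 0, 1, 0]
serial = serialize8(word)
print(serial)
```

With this pairing the serial output reproduces the input word in its original bit order, which matches the stage-wise behavior described for Figure 3.2.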

#### **3.2.2 Deserializer Design**

The deserializer is designed using a serial-in parallel-out (SIPO) shift register, as shown in Figure 3.3. This SIPO architecture divides the bit stream according to the clock signal. The design uses a half-rate clock, the same clock that is used in the input serialization blocks. It distinguishes the bits on the positive and negative edges of the clock, and the parallel data is captured using D flip-flops.
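
A behavioral sketch of such a half-rate deserializer is given below. It assumes that bits arriving on positive clock edges and bits arriving on negative clock edges are captured by two separate register chains and then merged into 8-bit words; the function name `half_rate_sipo` and this two-chain arrangement are illustrative assumptions, not a description of the exact circuit in Figure 3.3.

```python
def half_rate_sipo(bits, width=8):
    """Split a serial stream into parallel words using both clock edges.

    `width` is assumed to be even; trailing bits that do not fill a
    complete word are dropped.
    """
    rising = bits[0::2]     # bits captured on positive clock edges
    falling = bits[1::2]    # bits captured on negative clock edges
    half = width // 2
    words = []
    for w in range(len(bits) // width):
        word = []
        pairs = zip(rising[w * half:(w + 1) * half],
                    falling[w * half:(w + 1) * half])
        for r, f in pairs:
            word.extend([r, f])     # restore the original bit order
        words.append(word)
    return words

stream = [1, 0, 1, 1, 0, 0, 1, 0, 0, 1, 1, 1, 0, 0, 0, 1]
parallel_words = half_rate_sipo(stream)
print(parallel_words)
```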



Figure 3.3 Deserializer design with outputs.

#### 3.3 Wave Combiner and Decombiner

The proposed wave combining and driving unit, shown in Figure 3.4 (a), has two blocks. The first block selects the information, while the second drives it through the transmission channel. Two different data streams are applied as inputs to the first block.

The data streams are combined on the positive and negative edges of the input clock signal (clk). The input clock and its complement (clkbar) select the data stream on the rising and falling edges of the clock. Data stream 2 is selected on the rising edge of the clock signal, while data stream 1 is selected on the falling edge, so that the information does not overlap at the output after wave combination. An 8GHz clock is used to combine the data streams. In the driving stage of the wave combiner, the combined data is boosted with the help of a bias voltage signal to maintain the logic "high" and "low" levels, and it is then fed to the transmission channel.





Figure 3.4 (a) Wave combiner and driving unit design (b) Decombiner design

A single-ended dispersion-less transmission line with ground termination is used with the proposed model instead of a differential transmission line because it reduces the routing area and power dissipation. With the dispersion-less model, the results do not depend on frequency: the inductance and capacitance of this transmission line model make its characteristic impedance frequency independent, so signal propagation on the line is dispersion-less.

After the data has been transmitted over this transmission line, a CTLE is used at the receiver side to reduce the jitter introduced in the channel. We use a differential pair based active CTLE, which needs differential signaling, so CMOS to CML (current mode logic) blocks are placed before the CTLE to prepare the received data streams for equalization. The equalizer taps are used to maintain the offset voltage level at which the data is collected at the receiver end. The decombiner takes a single-ended data stream as input, so a CML to CMOS converter is used after the equalizer to send the data to the wave decombiner. The wave decombining stage uses the same edge-triggered approach as the wave combiner unit and divides the received data into two separate streams. As shown in Figure 3.4 (b), it generates data stream 1 and data stream 2 on the positive and negative edges of the clock signal, respectively. For proper recovery of the data streams, the clock rates of the decombiner and the combining block should be the same. Therefore, a clock of the same frequency, i.e. 8GHz, is chosen at the decombiner end. The two separated data streams are then forwarded to the deserializers, which convert each data stream into parallel bits.
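
The combining and decombining operations can be illustrated with a small behavioral model. The sketch below ignores the driver, the channel and the equalizer and treats each half clock period as one bit slot; the function names are hypothetical, and the edge-to-stream assignment in the decombiner is assumed to mirror the one used in the combiner.

```python
CLK_HZ = 8e9    # combiner and decombiner clock, as stated in the text

def wave_combine(stream1, stream2):
    """Merge two equal-rate streams into one stream at twice the rate."""
    if len(stream1) != len(stream2):
        raise ValueError("streams must have equal length")
    combined = []
    for d1, d2 in zip(stream1, stream2):
        combined.append(d2)    # rising edge of clk selects data stream 2
        combined.append(d1)    # falling edge of clk selects data stream 1
    return combined

def wave_decombine(combined):
    """Split the combined stream back into (stream1, stream2)."""
    stream2 = combined[0::2]   # slots written on rising edges
    stream1 = combined[1::2]   # slots written on falling edges
    return stream1, stream2

s1 = [1, 0, 0, 1, 1, 0, 1, 0]
s2 = [0, 0, 1, 1, 0, 1, 1, 1]
line = wave_combine(s1, s2)
r1, r2 = wave_decombine(line)

combined_rate = 2 * CLK_HZ     # one bit per clock edge
print(line)
print(f"combined rate: {combined_rate / 1e9:.0f} Gbps")
```

The round trip only succeeds when both ends use the same clock frequency and the same edge assignment, which is why the decombiner clock is matched to the combiner clock.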

#### **3.4 Results and Discussion**

The synchronous SerDes transceiver is designed in 65nm standard CMOS technology with a 1.2V supply. Each serializer block serializes data at 8Gbps, and the wave combiner delivers 16Gbps after combining the two separate bit streams. The transceiver also boosts the voltage levels of the transmitted signal. Simulated waveforms of the wave combiner and decombiner at room temperature are shown in Figure 3.5. The speed of 16Gbps is achieved with low power consumption at each stage. Table 3.1 shows the power consumption of the various blocks at the process corners. The circuit is simulated with a temperature variation from -45°C to 125°C and a voltage variation from 1.08V to 1.32V. Table 3.2 compares this work with [30-32]. The use of transmission gates instead of static CMOS reduces the power consumption at the different stages.

| Blocks                 | Average power, TT (mW) | Average power, SS (mW) | Average power, FF (mW) |
|------------------------|------------------------|------------------------|------------------------|
| Serializer             | 57.8                   | 39.4                   | 64.24                  |
| Wave-Combiner & Driver | 4.16                   | 2.3                    | 4.3                    |
| Equalization block     | 15                     | 5.56                   | 29.84                  |
| Decombiner             | 5.71                   | 4                      | 7.3                    |
| Buffers                | 10                     | 4.69                   | 11.6                   |

TABLE 3.1 POWER CONSUMPTION OF BLOCKS ASSOCIATED WITH DESIGN



Figure 3.5 Simulation waveform (a) Transmitter end (b) Receiver end

|                        | Our Work | Safwat *et al.* [32] | Hussein *et al.* [33] | Tadros *et al.* [34] | Duvvuri *et al.* [42] |
|------------------------|----------|----------------------|-----------------------|----------------------|-----------------------|
| Technology             | 65nm     | 65nm                 | 65nm                  | 65nm                 | 65nm                  |
| Supply voltage         | 1.2 V    | 1.2 V                | 1.2 V                 | 1.2 V                | 1.1 V                 |
| Data rate (Gbps)       | 16       | 12                   | 16                    | 24                   | 15                    |
| Transmission line (TL) | 3mm      | 3mm                  | 3mm                   | 5mm                  | 7.5 in FR4            |
| Freq. (GHz)            | 8        | 24                   | 16                    | 12                   |                       |
| Notes                  | 2 level  | 3 level              | 3 level               | 3 level              |                       |

TABLE 3.2 COMPARATIVE ANALYSIS WITH PREVIOUS WORK

The eye diagram of the combined data at the transmitter and receiver is shown in Figure 3.6 (a), which shows proper transmission of the information. The eye diagrams of decombined bit streams 1 and 2 are shown in Figure 3.6 (b). They show a proper eye opening, which indicates proper recovery of the input signal at the receiver end. Due to edge triggering at the wave decombiner unit, a jitter window of around 10% occurs in output stream 2, which shows more occurrences of "0" than of "1". The vertical eye opening of the output signal is 0.9V and the horizontal eye opening is 10ps.

For the output to match the input signal well, the voltage swing needs to be as large as possible. In [30], the three-level signaling limits the voltage swing to 500mV, so that the input and output voltage swings are around 500-600mV at a speed of 12Gbps, and the output eye diagram shows an eye crossing of around 75% due to noise and other pulse symmetry issues. The eye diagram of the serializer in [40] shows a 1V peak-to-peak output swing.





Figure 3.6 Eye diagram (a) transmitter and receiver Section (b) Eye diagram of decombined signal.

In the proposed wave combiner design, which uses two-level signaling, the eye diagram shows a vertical eye opening of around 0.9V at both the transmitter and the receiver end. The eye diagrams of the decombined data streams show a vertical eye opening of 1.1V peak to peak, with the eye crossing at 0.5V after decombination.

## **3.5 Conclusion**

In this chapter, a synchronous serial link design based on the proposed wave combiner and decombiner is discussed. The proposed wave combiner merges different data streams into a single data stream, which supports efficient transmission over high-speed serial links. The wave combiner and decombiner are designed with a binary tree approach and a half clock rate structure, so a lower clock frequency is needed to transmit a high-speed signal than in the literature. The other type of serial link, the asynchronous serial link, uses current mode logic based circuits for fast operation because of the swift current switching in the differential pair transistors and their output swing.

# **Chapter 4**

# An Improved Current Mode Logic Latch for High Speed Applications

#### 4.1 Introduction

Current mode logic circuits are an essential element of analog circuit design. The differential pair of a CML circuit steers the tail current and produces differential outputs that are complements of each other. The main distinguishing feature of CML is its output voltage swing: the outputs complement each other with a small voltage swing, which results in fast switching. The limitation of CML is its power consumption, which is tolerable when high-speed operation is required.

In this chapter, several current mode logic (CML) architectures are discussed along with an improved CML latch design. The improved CML latch is designed for an asynchronous transceiver to boost the output voltage swing. A frequency divider for an operating frequency of 16GHz is also proposed using the same improved CML latch. A delay model based on the small-signal equivalent circuit is then developed for the analysis of the proposed latch. The output voltage behavior of the proposed latch is analyzed using 180nm standard CMOS technology.

#### 4.2 CML Latch

The conventional CML latch is shown in Figure 4.1. It works in two different modes: tracking mode and latch mode. In the tracking mode, the transistor pair $M_1$ and $M_2$ senses the information, and the regenerative pair $M_3$ and $M_4$ stores that information accordingly. These modes depend on the clock signal. When $V_{clk+}$ is high, transistor $M_5$ conducts and the latch operates in tracking mode, in which transistors $M_1$ and $M_2$ sense the input data variation. In this mode, the tail current $I_0$ is steered through transistor $M_5$, which allows the output to track the input signal; the cross-coupled regenerative transistors store the data [26]. When the complementary clock signal $V_{clk-}$ is high, the latch mode is enabled: the tail current $I_0$ flows through $M_6$, and the cross-coupled transistors $M_3$ and $M_4$ store the data [23–26]. The output voltage levels are $V_{DD}$ and $(V_{DD} - RI_0)$, and they are independent of the input common mode voltage level [44]. For a large output swing, the common mode voltage needs to be increased, but it should not be greater than the input clock common mode level. In the conventional CML latch, a fixed tail current source supplies a fixed bias current, so the transistors always work in the saturation region. This is why the CML topology exhibits a higher operating speed [24].





Figure 4.1 (a) Conventional CML Latch (b) Modified CML Latch of Ref. [24]

At higher frequencies (> 7GHz), the conventional latch suffers functional failures because of the limited gain in the latching branch [44], so different architectures have been proposed to improve its gain at high frequency [22],[45]. In a conventional latch, the capacitance of transistors $M_1$ and $M_2$ increases during the tracking operation, which degrades the small-signal gain needed for proper tracking. The conventional CML latch has only a single tail current source for both modes of operation. Its main limitation is that, at ultra-high data rates, the parasitics of the input transistors degrade the performance during the tracking operation. The tail current therefore needs to be high to achieve sufficient linearity, whereas a large bias current is not required in latch mode. A modification of the conventional latch is therefore proposed in [22], with distinct tail current branches for the latch and track modes controlled by a reference voltage, which makes the new latch suitable for high-frequency operation.

This modified CML latch introduces an additional tail current source so that the tracking and latching modes use distinct currents. When the clock signal $V_{clk+}$ is high, the entire tail current flows through the tracking branch. The additional transistors $M_7$ and $M_8$, controlled by $V_{ref}$, allow tracking of the input signal $V_{in}$. In latch mode, the tracking branch is disabled, so the latch pair holds the stored logic at the output.

Another modification of the CML latch is proposed by Zhang *et al.* [24], which boosts the tail current during latch mode. An additional NMOS transistor, controlled by the $V_{clk}$ signal, is introduced in the latch branch. During tracking mode, the additional transistor is off and the latch works as a conventional latch, so the total current in track mode is $I_0$. In latching mode, the additional NMOS transistor is turned on, so the total current through the latching branch increases to $I(I_0) + I(\text{NMOS transistor})$. This results in higher gain at high frequency.

#### 4.3 Delay Model

Propagation delay is one of the most important parameters of any sequential or combinational logic design. We have developed a propagation delay model for the CML latch introduced by Zhang *et al.* [24]; it is shown in Figure 4.2. The circuit is divided into three parts. The delay $\tau_A$ is calculated for the output node ($V_{out+}$) with reference to the input signal $V_{in}$, and the delay $\tau_B$ is associated with the output node ($V_{out-}$) with reference to the input signal $V_{in}$. The third delay ($\tau_C$) originates from $V_{clk+}$.

For Output node  $(V_{out+})$ 

$$\tau_A = R_{eq,A} C_{eq,A} \tag{4.1}$$

where $R_{eq,A}$ is the equivalent resistance and $C_{eq,A}$ is the equivalent capacitance of Figure 4.2 (a), given as

$$R_{eq,A} = \frac{1}{(2G_{m1,2} \parallel G_{m5})} \tag{4.2}$$

and

$$C_{eq,A} = C_{gs5} + C_{gd5} + 2C_{gs1,2} + 2C_{gb1,2} \tag{4.3}$$

For Output node  $(V_{out-})$ 

$$\tau_B = R_{eq,B} C_{eq,B} \tag{4.4}$$

where $R_{eq,B}$ is the equivalent resistance and $C_{eq,B}$ is the equivalent capacitance of Figure 4.2 (b), given as

$$R_{eq,B} \cong R_D \tag{4.5}$$

and

$$C_{eq,B} = 2C_{gd1,3} + C_{gs3,4} + 2C_{db1,3} + C_{RD} + C_L \tag{4.6}$$

The third time constant is due to the input clock signal $V_{clk+}$ and the output node.

$$\tau_C = R_{eq,C} C_{eq,C} \tag{4.7}$$







Figure 4.2 Equivalent circuit model for delay for (a) $\tau_A$ (b) $\tau_B$ (c) $\tau_C$

where

$$R_{eq,C} = \frac{1}{G_{m5}} \tag{4.8}$$

and

$$C_{eq,C} = C_{gs5} \tag{4.9}$$

#### 4.4 Proposed CML Design

The proposed CML latch is shown in Figure 4.3, in which we introduce additional transistors to provide current flow in the track mode transition. In both modes, the transistor pairs $M_5$-$M_8$ and $M_6$-$M_7$ are controlled by the differential clock signal. When the $V_{clk+}$ signal is high, transistors $M_5$-$M_8$ are turned on and allow the tail current to flow in tracking mode. In latch mode, the transistor pair $M_6$-$M_7$ lets the tail current flow in the regenerative transistor pair. The additional transistors $M_7$ and $M_8$ result in high gain and make the circuit more suitable for high-frequency and low-power applications.

The propagation delay model of the proposed CML latch is discussed in this section. The delay model considers the output signal with respect to the clock ($V_{clk}$) [18], [45–47]. The equivalent propagation delay model of the proposed CML latch is shown in Figure 4.4. It is again split into three parts, similar to the time constants of [24].

For Output node  $(V_{out+})$ 

$$\tau_A = R_{eq,A} C_{eq,A} \tag{4.10}$$

Where

$$R_{eq,A} = \frac{1}{(G_{m1} + G_{m2})} \tag{4.11}$$

and

$$C_{eq,A} = C_{gs5,6} + C_{gd5} + 2(C_{gs1,2} + C_{gb1,2}) \tag{4.12}$$



Figure 4.3 Proposed CML Latch











Figure 4.4 Proposed CML latch equivalent circuit model for delay for (a) $\tau_A$ (b) $\tau_B$ (c) $\tau_C$

For Output node  $(V_{out-})$ 

$$\tau_B = R_{eq,B} C_{eq,B} \tag{4.13}$$

where,

$$R_{eq,B} \cong R_D \tag{4.14}$$

and

$$C_{eq,B} = 2C_{gd1,3} + C_{gs3,4} + 2C_{db1,3} + C_{RD} + C_L \tag{4.15}$$

The third time constant in the proposed design is due to the input clock signal ($V_{clk+}$), which is also connected to the additional transistor $M_8$; this affects both the time constant and the output node.

The third time constant is

$$\tau_C = R_{eq,C} C_{eq,C} \tag{4.16}$$

where

$$R_{eq,C} = \frac{(G_{m5} + G_{m8})}{(G_{m5}G_{m8})} \tag{4.17}$$

and

$$C_{eq,C} = 2C_{gb5,8} + 2C_{gs5,8} \tag{4.18}$$

The proposed design has a slightly higher time constant $\tau_C$ than the work reported in [22].
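
The three time constants of the proposed latch can be evaluated with a short script once the small-signal parameters are known. The sketch below implements Eqs. (4.10)-(4.18) directly; all numerical values and the function name `proposed_latch_time_constants` are hypothetical placeholders for illustration and were not extracted from the 180nm design.

```python
def proposed_latch_time_constants(p):
    """Evaluate Eqs. (4.10)-(4.18); `p` maps parameter names to SI values."""
    r_a = 1.0 / (p["gm1"] + p["gm2"])                                  # Eq. (4.11)
    c_a = p["cgs5_6"] + p["cgd5"] + 2.0 * (p["cgs1_2"] + p["cgb1_2"])  # Eq. (4.12)
    r_b = p["rd"]                                                      # Eq. (4.14)
    c_b = (2.0 * p["cgd1_3"] + p["cgs3_4"] + 2.0 * p["cdb1_3"]
           + p["c_rd"] + p["cl"])                                      # Eq. (4.15)
    r_c = (p["gm5"] + p["gm8"]) / (p["gm5"] * p["gm8"])                # Eq. (4.17)
    c_c = 2.0 * p["cgb5_8"] + 2.0 * p["cgs5_8"]                        # Eq. (4.18)
    return {"tau_a": r_a * c_a, "tau_b": r_b * c_b, "tau_c": r_c * c_c}

# Hypothetical small-signal values (siemens, ohms, farads), for illustration only.
params = {
    "gm1": 1e-3, "gm2": 1e-3, "gm5": 2e-3, "gm8": 2e-3, "rd": 1e3,
    "cgs5_6": 4e-15, "cgd5": 1e-15, "cgs1_2": 2e-15, "cgb1_2": 0.5e-15,
    "cgd1_3": 1e-15, "cgs3_4": 2e-15, "cdb1_3": 1e-15, "c_rd": 1e-15, "cl": 10e-15,
    "cgb5_8": 0.5e-15, "cgs5_8": 4e-15,
}

taus = proposed_latch_time_constants(params)
for name, tau in taus.items():
    print(f"{name} = {tau * 1e12:.1f} ps")
```

Equation (4.17) is the sum $1/G_{m5} + 1/G_{m8}$, so for any positive $G_{m8}$ the clock-path resistance is larger than the $1/G_{m5}$ of Eq. (4.8).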

#### 4.5 Static Model and Transistor Sizing

The proposed CML latch is designed using a resistive load. The static model derived in [14] and [16] represents the load transistors by an equivalent resistance $R_P$. With the BSIM3v3 model, the linear resistance is computed as follows

$$R_P = \frac{R_{int}}{1 - \frac{(R_{DSW} \cdot 10^{-6})/W_P}{R_{int}}} \tag{4.19}$$

where $R_{DSW}$ is an empirical model parameter, $W_P$ is the channel width of the load transistor and $R_{int}$ is the intrinsic resistance of the PMOS transistor in the linear region.

However, since we use a resistive load structure, the linear resistance model of the PMOS transistor is not applied in the proposed design.

Another essential design parameter is the output voltage swing. According to [9], the voltage swing should be lower than twice the threshold voltage ($V_{Th}$) to ensure that the transistors $M_{1,2,3,4}$ shown in Figure 4.3 operate in the saturation region. The voltage swing determined by [16] is

$$V_{swing} = R_p I_{ss} \tag{4.20}$$

The small-signal voltage gain $A_v$ and the noise margin *NM* of the CML latch are computed according to the method outlined in [26].

$$A_{v} = \frac{V_{swing}}{2} \sqrt{2\mu_{eff,n} C_{ox} \frac{W_{n}}{L_{n}} \frac{1}{I_{ss}}} \tag{4.21}$$

$$NM = \frac{V_{swing}}{2} \left[ 1 - \frac{\sqrt{2}}{A_v} \right] \tag{4.22}$$

where $\mu_{eff,n}$ is the effective electron mobility, $C_{ox}$ is the gate oxide capacitance per unit area, and $W_n$ and $L_n$ are the width and length of the transistor pairs $M_1$-$M_2$ and $M_3$-$M_4$.
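
Equations (4.21) and (4.22) can be checked numerically for a candidate sizing. The following sketch is only an illustration: the swing, tail current, process transconductance and aspect ratio are hypothetical values, and `cml_gain_and_noise_margin` is a helper name introduced here.

```python
import math

def cml_gain_and_noise_margin(v_swing, i_ss, mu_cox, w_over_l):
    """Return (A_v, NM) from Eqs. (4.21) and (4.22)."""
    a_v = (v_swing / 2.0) * math.sqrt(2.0 * mu_cox * w_over_l / i_ss)
    nm = (v_swing / 2.0) * (1.0 - math.sqrt(2.0) / a_v)
    return a_v, nm

# Hypothetical design point, for illustration only.
V_SWING = 0.4        # output swing in volts
I_SS = 200e-6        # tail current in amperes
MU_COX = 300e-6      # mu_eff,n * C_ox in A/V^2
W_OVER_L = 40.0      # aspect ratio W_n / L_n

a_v, nm = cml_gain_and_noise_margin(V_SWING, I_SS, MU_COX, W_OVER_L)
print(f"A_v = {a_v:.2f}, NM = {nm * 1e3:.1f} mV")
```

Equation (4.22) yields a positive noise margin only when $A_v > \sqrt{2}$, which sets a lower bound on the aspect ratio for a given swing and tail current.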

The sizing of the transistors is determined by the small-signal voltage gain and the noise margin. Because the design uses a resistive load, the maximum current $I_{max}$ for the design is

$$I_{max} = \frac{V_{swing}}{R_{load}} \tag{4.23}$$

The transistors $M_{5,6}$ operate as in a conventional latch for the track and latch modes. To maintain the voltage swing, the aspect ratio of the additional transistors $M_7$ and $M_8$ in the current source section is chosen such that the total voltage swing does not exceed twice the threshold voltage.

With reference to [24], in the latching operation the transistor $M_7$ turns on and the current flowing through the latching branch equals $I(I_0) + I(M_7)$, which is larger than in the conventional CML latch. To satisfy the voltage swing condition, the total current should not exceed $I_{max}$. Therefore, the following criterion for the current source should be followed in the tracking and latching operations:

$$I_{max} \le I_0 + I(M_7) \tag{4.24}$$

The transistors $M_7$ and $M_8$ have the same aspect ratio, and their width can be increased until the maximum current matches the voltage swing condition.

In this design, the transistors $M_{1,2,3,4}$ share the same aspect ratio, whereas $M_{5,6}$ are given twice that aspect ratio to handle the worst-case current of the transistor pairs $M_1$-$M_2$ and $M_3$-$M_4$. The additional transistors $M_7$ and $M_8$ use a higher aspect ratio to provide a path for $I_{max}$ when it exceeds $I_{bias}$ by a large amount.
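The current budget of Eqs. (4.23) and (4.24) can be evaluated with a small sketch. It checks the criterion exactly as printed in Eq. (4.24); the function names and all numerical values are hypothetical and serve only to illustrate the calculation.

```python
def max_current(v_swing, r_load):
    """Maximum current for a resistive load, Eq. (4.23)."""
    return v_swing / r_load


def satisfies_eq_4_24(i0, i_m7, v_swing, r_load):
    """Evaluate Eq. (4.24) as printed: I_max <= I_0 + I(M_7)."""
    return max_current(v_swing, r_load) <= i0 + i_m7


# Hypothetical operating point, for illustration only.
V_SWING = 0.6    # target output swing in volts
R_LOAD = 2000.0  # load resistance in ohms
I_0 = 200e-6     # bias (sink) current in amperes

i_max = max_current(V_SWING, R_LOAD)
print(f"I_max = {i_max * 1e6:.0f} uA")
for i_m7 in (50e-6, 120e-6):
    ok = satisfies_eq_4_24(I_0, i_m7, V_SWING, R_LOAD)
    print(f"I(M7) = {i_m7 * 1e6:.0f} uA -> Eq. (4.24) satisfied: {ok}")
```

In this reading, widening $M_7$ raises $I(M_7)$ until the branch can carry the current required by the swing condition.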

#### 4.6 **Results and Discussion**

The circuit is designed in a standard 180 nm CMOS technology with a 1.2 V power supply. The layout of the proposed design is shown in Figure 4.5. Post-layout simulations were carried out using the Spectre simulator. Output waveforms of the proposed latch at a clock frequency of 1.25 GHz are shown in Figure 4.6. In both the tracking and latching operations, the output offset voltage range is increased. In the output waveform, the voltage lies between 0.1 V and 0.3 V for "logic low" and between 0.9 V and 1.15 V for "logic high". The additional transistors in the tail branch increase the overall current in both operations of the CML latch. This technique therefore offers an improved offset voltage window compared to the conventional and previously published designs.

A comparative analysis of the maximum operating frequency of the proposed latch and previously reported CML latches is given in Table 4.1. The proposed design operates at frequencies up to 16 GHz with a maximum power dissipation of 0.280 mW.







Figure 4.6 Proposed CML latch output at 1.25 GHz

| Latch structure | Technology (nm) | Power at 1.25 GHz (mW) | Maximum frequency (GHz) | No. of devices | Extra bias voltage |
|-----------------|-----------------|------------------------|-------------------------|----------------|--------------------|
| Conventional | 180 | 0.180 | 8.5 | 9 | No |
| Heydari *et al.* [22] | 180 | 0.262 | 14.3 | 14 | Yes |
| Zhang *et al.* [24] | 130 | 0.182 | 15.2 | 10 | No |
| This work | 180 | 0.280 | 16 | 11 | No |

Table 4.1 Comparative analysis with previous work








Figure 4.7 (a) Frequency divider (b) Post layout simulation of frequency divider.

Because CML latches are widely used in a variety of circuits, a flip-flop based frequency divider has been designed to verify the performance of the proposed latch. Figure 4.7(a) shows the frequency divider built from two D flip-flops; its functionality is verified with a supply voltage of 1.2 V. It divides the 16 GHz input frequency by four. The output voltage swing of the frequency divider is around 0.6 V.
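The divide-by-four behaviour can be illustrated with a purely logical model. The thesis does not spell out the internal wiring of the divider, so this sketch assumes a ripple arrangement of two toggle-connected D flip-flops; the function names are hypothetical and the model ignores all analog effects.

```python
def divide_by_four(clk):
    """Behavioural model of two cascaded toggle flip-flops (assumed topology)."""
    q1 = 0
    q2 = 0
    prev_clk = 0
    out = []
    for c in clk:
        if prev_clk == 0 and c == 1:      # rising edge of the input clock
            prev_q1 = q1
            q1 ^= 1                       # first flip-flop toggles: divide by 2
            if prev_q1 == 0 and q1 == 1:  # rising edge of q1 clocks the second stage
                q2 ^= 1                   # second flip-flop toggles: divide by 4
        prev_clk = c
        out.append(q2)
    return out


def count_rising_edges(signal):
    """Count 0 -> 1 transitions, assuming the signal starts from 0."""
    edges = 0
    prev = 0
    for s in signal:
        if prev == 0 and s == 1:
            edges += 1
        prev = s
    return edges


clk = [0, 1] * 8                  # eight input clock periods
out = divide_by_four(clk)
ratio = count_rising_edges(clk) / count_rising_edges(out)
print(f"division ratio = {ratio:.0f}, 16 GHz in -> {16 / ratio:.0f} GHz out")
```

With a division ratio of four, the 16 GHz input quoted above corresponds to a 4 GHz output.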

#### 4.7 Conclusion

The need for current mode logic in high-speed circuits is discussed in this chapter. The delay model and transistor sizing of the proposed CML latch are also presented. CML latches of this type are very useful for asynchronous serial link transceivers. The next chapter describes an asynchronous wave-pipelined transceiver using the improved CML latch.

# Chapter 5

# A Novel CML Latch Based Wave-Pipelined Asynchronous SerDes Transceiver for Low Power Application

#### **5.1 Introduction**

With present technology, billions of transistors are fabricated on a single chip, which improves circuit performance in terms of data transmission speed and power consumption. The required data transmission speed is achieved with the help of high-speed transceivers. In this chapter, a high-speed asynchronous wave-pipelined Serializer and Deserializer (SerDes) transceiver implemented in current mode logic (CML) is presented. This asynchronous transceiver does not require a clock and therefore saves the large amount of power consumed in PLL and frequency synthesizer circuits. Furthermore, the proposed design is built using CML, which saves more power. CML circuits can operate at higher speed than CMOS circuits, which helps in higher data rate applications. Instead of the conventional CML latch, a novel CML latch is proposed in our design to increase the speed. The circuit is implemented in a standard 65 nm CMOS technology. The total power consumed by the Serializer and Deserializer is 9.32 mW, which is considerably lower than that of related published works. The proposed asynchronous SerDes transceiver operates at a data transmission rate of 18.1 Gbps with low power dissipation.

#### **5.2 Proposed Serializer and Deserializer**

The proposed asynchronous SerDes is designed using CML. Figure 5.1(a) shows the Serializer, in which load, delay control (dc) and shift are the handshaking signals for serialization. The Serializer contains CML multiplexers, CML latches (C-L) and delay element blocks. The CML latches are controlled by the load signal, whereas the multiplexers are controlled by the shift signal. The CML latch inputs are connected to *D7 ... D0* and *Pilot* for loading the parallel data, and the multiplexers are used to propagate the received parallel data. To send the parallel data to the Serializer, the load signal is kept high, which passes the input data to the input of the delay element (DE) through the multiplexer. All delay elements are selected according to the state of the dc signal and transfer the data to the multiplexers. The multiplexers transfer the bit received from the delay element depending on the shift signal. At the end, the serialized output is received along with one pilot bit. After the Pilot bit arrives at the control block, data transmission is stopped and loading of the next bit stream at the Serializer is initiated.






Figure 5.1 Asynchronous Transceiver (a) Serializer (b) Deserializer

The CML latches are controlled by the load signal: whenever the load signal is HIGH, they transfer the parallel data to the multiplexer blocks. The multiplexers are controlled by the shift signal and are connected such that they are turned ON whenever the shift signal goes LOW. Each of the load and shift signals turns ON its respective connected blocks, and the two signals are always in opposite states to each other for proper transmission of data. Therefore, when the load signal is HIGH, it turns ON the vertical CML latches to load the parallel data, and the LOW state of the shift signal enables the multiplexers for data transmission.

The proposed asynchronous Deserializer is shown in Figure 5.1(b). It has the same structure as the Serializer, with the same timing references, to avoid jitter and noise. The serialized bit stream, prefixed with the pilot bit, is received at the Deserializer and passed through the multiplexers. Before the bit stream enters, the delay control (dc) signal wipes all data from the Deserializer, so every multiplexer has zero at its input. When the dc signal is low, the data stream propagates through all the multiplexer latches with the pilot bit at its head. When the pilot bit reaches the control block, the control block generates a signal that stops the received stream and turns ON the CML latches. With propagation stopped, the respective bits are held at the inputs of the CML latches, and the latches pass the data to the output.
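The pilot-bit framing described above can be sketched as a bit-level model. This is only a behavioural illustration under stated assumptions: eight data bits per word, a logic HIGH pilot sent first, and a list of stages standing in for the multiplexer chain. The function names `serialize` and `deserialize` are hypothetical, and no timing or analog behaviour is modelled.

```python
def serialize(word, pilot=1):
    """Prefix the parallel word (D7..D0) with the pilot bit."""
    return [pilot] + list(word)


def deserialize(stream, initial_stages=None, n_data=8):
    """Shift the stream through n_data + 1 stages until a HIGH bit reaches the end."""
    # The dc signal is assumed to have wiped every stage to zero beforehand.
    stages = list(initial_stages) if initial_stages else [0] * (n_data + 1)
    for bit in stream:
        stages = [bit] + stages[:-1]   # every bit advances by one stage
        if stages[-1] == 1:            # HIGH bit at the control block stops the shift
            break
    return stages[-2::-1]              # held bits, in the order D7..D0


word = [1, 0, 1, 0, 1, 0, 1, 0]
recovered = deserialize(serialize(word))
print(recovered == word)

# A residual HIGH bit left in the chain triggers the control block too early.
stale = [0, 0, 0, 0, 1, 0, 0, 0, 0]
corrupted = deserialize(serialize(word), initial_stages=stale)
print(corrupted == word)
```

The second call shows why the stages are cleared before a new stream enters: a leftover HIGH bit reaches the control block ahead of the pilot and the held word is wrong.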

#### 5.3 CML latch and other building blocks

Operating a CMOS circuit at the transition frequency of the MOS device is very challenging. Gigabit communication requires high-speed signals in the transceiver, so the use of PMOS devices should be avoided. The limitation of CMOS circuits at gigahertz frequencies makes current mode logic (CML) the most promising alternative. CML circuits support high-speed applications with a low output swing. For the proposed asynchronous SerDes transceiver, the CML latch is the basic building block [18-23].



Figure 5.2 Conventional CML Latch

A conventional current mode logic latch operates in a sample stage and a hold stage. The conventional latch is shown in Figure 5.2. During the sample stage, transistors $M_1$, $M_2$ and $M_5$ are active, and during the hold stage transistors $M_3$, $M_4$ and $M_6$ are active. The clock signal connected to $M_5$ and $M_6$ selects the stage of the latch: the sample stage is active when $V_{clk+}$ is HIGH, and the hold stage is active when $V_{clk-}$ is HIGH. The sink current $I_0$ maintains the voltage swing of the CML output under high-frequency conditions. Because of its limited gain, the conventional latch is difficult to operate in ultra-high-frequency (UHF) applications (>10 GHz) [22]; therefore, a new CML latch with improved tail current is introduced, as shown in Figure 5.3.
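The sample and hold behaviour of the latch can be captured by a minimal logical model. The class name `CmlLatch` and the test sequences are hypothetical; the sketch abstracts the differential signals to single bits and ignores gain, swing and delay.

```python
class CmlLatch:
    """Level-sensitive latch: transparent while V_clk+ is HIGH, holding otherwise."""

    def __init__(self):
        self.q = 0

    def step(self, d, clk_p):
        if clk_p:          # sample stage: the tracking pair drives the output
            self.q = d
        return self.q      # hold stage: the cross-coupled pair keeps the last value


latch = CmlLatch()
data = [1, 0, 0, 0, 1, 1]
clk_p = [1, 0, 1, 0, 0, 1]
outputs = [latch.step(d, c) for d, c in zip(data, clk_p)]
print(outputs)
```

The output changes only in steps where `clk_p` is 1, which is the logical counterpart of the tail current being steered between the sample branch and the hold branch.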

In the proposed CML latch, an additional parallel path for current flow is introduced for the sample and hold stages of operation. The clock signal $V_{clk+}$ is HIGH during the sample stage, which turns ON the transistors $M_5-M_8$ and allows more current to flow in the additional tail branch. During the hold stage, the transistor pair $M_6-M_7$ operates and the tail current flows through the regenerative transistor pair. These transistors provide high gain and stability for high-speed applications. Another advantage of the proposed design is its output offset voltage range: the additional MOS transistors improve the current as well as the margin of the output voltage levels (HIGH and LOW).



Figure 5.3 Proposed Novel CML Latch

The delay element (DE) used in the asynchronous transceiver, shown in Figure 5.4, consists of a CML AND gate followed by CML buffers. One input of the AND gate is connected to the data and the other to the dc signal. This topology produces a zero output when the dc signal is low; otherwise the output follows the input signal. The dc signal is used to wipe off any residual bit at the multiplexer inputs in the Deserializer to prevent false triggering of the control block. Because the control block is triggered by the logic HIGH *Pilot* bit, any logic HIGH bit remaining in the data would falsely trigger the control block when propagation starts for the next group of data. Each buffer in the delay element provides a delay of 4.352 ps, and the overall delay provided by the DE is 34.65 ps.
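The gating and delay behaviour of the DE can be summarised in a small event-based sketch. Only the overall delay of 34.65 ps is taken from the text; the event representation and the function name `delay_element` are assumptions made for illustration.

```python
DE_DELAY_PS = 34.65  # overall delay of the delay element quoted in the text


def delay_element(events, dc):
    """Apply the AND gating and the DE delay to a list of (time_ps, bit) events.

    With dc = 0 every output bit is forced to zero, which models the wipe;
    with dc = 1 the output follows the input after the DE delay.
    """
    return [(t + DE_DELAY_PS, bit & dc) for t, bit in events]


events = [(0.0, 1), (55.24, 0), (110.48, 1)]   # hypothetical input events
passed = delay_element(events, dc=1)
wiped = delay_element(events, dc=0)
print([bit for _, bit in passed])
print([bit for _, bit in wiped])
```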



Figure 5.4 Delay Element (DE)

The control block is an important part of the asynchronous CML SerDes. In an asynchronous circuit, the data rate is bounded by the delay signal because no clock signal is available. For this reason, the control block must switch the handshaking signals for proper data recovery. The proposed control block generates two handshaking signals, "shift" and "load", for proper functioning of the Deserializer. Jaiswal *et al.* [36] presented a control block with fast propagation and optimum driving capability for the control signals, as shown in Figure 5.5. It consists of a *Pilot* hold circuit and buffer chain blocks that generate the handshaking signals. For fast switching, the multiplexers must react as quickly as possible, so parallel buffer chain blocks are used to generate multiple outputs. To overcome the voltage swing issue of the propagating signal, a larger current is employed in the buffer chain.



Figure 5.6 CML Mux

The CML multiplexer, shown in Figure 5.6, operates with the select signals $S_p$ and $S_n$ and is controlled by the shift signal. Initially, the shift signal is set to "1" so that all serialized bits are inserted into the Deserializer and the pilot bit is received at the control block. The control block then generates shift = "0" to hold the information at each multiplexer and load = "1" to receive the deserialized data bits. The shift and load signals are complementary. For logic "1", the shift signal turns OFF the propagation whereas the load signal turns ON the CML latches, and the opposite is true for logic "0". The modified CML latch operates with a larger voltage swing, so the output data bits have a larger voltage swing without a higher sink current. The CML buffer and CML inverter are the important circuits used in the buffer chain and delay element of the control block. They have identical structures, as shown in Figure 5.7. The only difference between the buffer and the inverter is the output selection: the output of the CML buffer is taken with inverted polarity compared to the CML inverter.
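The statement that the buffer and inverter share one structure and differ only in output selection can be illustrated with an idealised differential-pair model. The voltage levels `V_HIGH` and `V_LOW` and all function names are hypothetical; the sketch assumes ideal current steering and ignores gain and bandwidth.

```python
V_HIGH = 1.2  # hypothetical CML logic-high level in volts
V_LOW = 0.6   # hypothetical CML logic-low level in volts


def differential_pair(in_p, in_n):
    """Return (drain voltage on the in_p side, drain voltage on the in_n side)."""
    if in_p > in_n:
        return V_LOW, V_HIGH   # tail current flows through the in_p branch
    return V_HIGH, V_LOW       # tail current flows through the in_n branch


def cml_inverter(in_p, in_n):
    """(out_p, out_n) with out_p taken from the in_p branch."""
    drain_p, drain_n = differential_pair(in_p, in_n)
    return drain_p, drain_n


def cml_buffer(in_p, in_n):
    """Same circuit as the inverter, with the two outputs swapped."""
    drain_p, drain_n = differential_pair(in_p, in_n)
    return drain_n, drain_p


print(cml_inverter(V_HIGH, V_LOW))
print(cml_buffer(V_HIGH, V_LOW))
```

Because the signal is differential, inversion costs no extra devices: it is only a matter of which drain is labelled as the positive output.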



Figure 5.7 CML Inverter

As shown in Figure 5.1, the Serializer and Deserializer circuit blocks are controlled by the same *load* and *shift* signals. Therefore, loading of the parallel data at the Serializer and read-out of the parallel data at the Deserializer take place at the same time.

Before data transmission begins, the start-up circuit must be switched on. The initialization circuit shown in Figure 5.8 [36] is placed at the control block input to bring the system into the reset condition. The circuit consists of an NMOS transistor together with a resistance and a capacitance. When the circuit is switched on, it generates an active HIGH and an active LOW signal for a short duration so that the circuit starts correctly.

Because the SerDes transceiver uses current mode logic, an input in CMOS logic must first be converted to CML. This work uses a differential CML inverter as the CMOS-to-CML converter: the CMOS signal is applied to the InP input of the converter and the inverted CMOS signal to the InN input. After the information has been received at the receiver end, a CML-to-CMOS converter ensures that the output is compatible with CMOS signal levels. As shown in Figure 5.9, the CML-to-CMOS converter is designed using a push-pull output operational amplifier with regenerative inverter blocks.



Figure 5.8 Initial state circuits



Figure 5.9 CML to CMOS Converter
#### **5.4 Results**

In this section, the modified CML latch is compared with the conventional CML latch; as shown in Figure 5.10, it exhibits a larger output voltage swing. The conventional latch has a 0.4 V output swing, between 0.7 V and 1.1 V, whereas the modified latch has a 0.9 V output swing, between 0.2 V and 1.1 V. The results are captured with an input voltage swing from 0 to 1.2 V.



Figure 5.10 CML latch Output waveform (a) Proposed latch (b) Conventional latch

The asynchronous SerDes transceiver described in the previous sections is built on the novel CML latch. PVT corner simulations are also performed to ensure stable operation of the transceiver. The circuit is designed and simulated using Spectre in a standard 65 nm CMOS technology. The serialized bit stream for the input 10101010 is shown in Figure 5.11. Because the circuit is implemented in CML, the final serialized bits have a voltage swing between 0.6 V and 1.2 V. The uppermost waveform in the figure shows the differential *shift* signal, which is responsible for transmission of the serial information. The bottom waveform shows the complete serialized output for the input 10101010. The outputs of the other multiplexers are shown in the second to ninth waveforms. Figure 5.12 shows the output of the Deserializer, with a voltage swing between 0.7 V and 1.18 V.

Figure 5.11 Serializer output for the input bits 10101010



Figure 5.12 Deserialized output for the bits 10101010

The asynchronous CML-based SerDes achieves a bit width of 55.24 ps, corresponding to a data rate of 18.1 Gbps. The proposed circuit has a dynamic power consumption of 9.32 mW. The comparison of the proposed design with the literature is given in Table 5.1.
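The relation between bit width and data rate can be verified with a short calculation. The energy-per-bit figure computed below is derived here from the two quoted numbers and is not a value reported in the thesis; the function names are hypothetical.

```python
def data_rate_gbps(bit_width_ps):
    """Data rate in Gbps for a given bit width in picoseconds."""
    return 1e3 / bit_width_ps


def energy_per_bit_pj(power_mw, rate_gbps):
    """Energy per bit in picojoules (mW divided by Gbps gives pJ/bit)."""
    return power_mw / rate_gbps


rate = data_rate_gbps(55.24)             # bit width quoted in the text
energy = energy_per_bit_pj(9.32, 18.1)   # power and rate quoted in the text
print(f"data rate = {rate:.2f} Gbps")
print(f"energy per bit = {energy:.3f} pJ/bit (derived, not a thesis figure)")
```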

| Architecture    | Technology (nm) | Speed (Gbps) | Power (mW) |
|-----------------|-----------------|--------------|------------|
| CMOS-CML [40]   | 45/65           | 10           | 10/106     |
| Self-timed [32] | 65              | 12           | 15.5       |
| Self-timed [33] | 65              | 16           | 18.1       |
| WP-CML [36]     | 65              | 12.67        | 14.3       |
| This work       | 65              | 18.1         | 9.32       |

Table 5.1 Comparison of SerDes Architectures

Table 5.2 PVT corner of various SerDes transceiver



The process corner analysis is also compared with other published results. For the SS process corner, a supply voltage of 1.08 V and a temperature of $125^{\circ}C$ are selected, while for the FF corner a supply voltage of 1.32 V and a temperature of $-45^{\circ}C$ are chosen. For the TT corner, a supply voltage of 1.2 V at room temperature is chosen. Table 5.2 shows that the proposed design achieves a higher data transmission rate across the different process corner conditions.

Compared to WP-CML, the proposed design shows a 42.8% higher data transmission rate under typical conditions. In the SS corner, the proposed design also outperforms the WP-CMOS and WP-CML techniques. In the FF PVT corner, the proposed technique shows a 39% improvement over WP-CML and a 50% improvement over WP-CMOS. Hence, the proposed SerDes design performs better across the different corner conditions.
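The typical-corner improvement can be reproduced from the speeds listed in Table 5.1. The sketch below uses only those tabulated values; the helper name `improvement_percent` is hypothetical, and the corner-specific figures of Table 5.2 are not included because that table's values are not reproduced here.

```python
def improvement_percent(new, reference):
    """Relative improvement of `new` over `reference`, in percent."""
    return 100.0 * (new - reference) / reference


# Speeds in Gbps taken from Table 5.1.
reported_gbps = {
    "Self-timed [32]": 12.0,
    "Self-timed [33]": 16.0,
    "WP-CML [36]": 12.67,
}
THIS_WORK_GBPS = 18.1

gains = {name: improvement_percent(THIS_WORK_GBPS, rate)
         for name, rate in reported_gbps.items()}
for name, gain in gains.items():
    print(f"{name}: {gain:.1f}% higher data rate")
```

For WP-CML this evaluates to about 42.9%, in line with the 42.8% figure quoted above.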

# Chapter 6

# Conclusions and Future Works

#### 6.1 Conclusions

In this thesis, simulations of all the proposed design techniques are carried out using Cadence and Mentor Graphics tools.

A SerDes transceiver link based on wave combining and de-combining blocks is designed and analysed. It helps in connecting the separate input streams and in their levelled transmission with proper driving capability. The two-level signalling improves the communication of signal bits and the signal recovery at the receiver. A single-ended transmission line is used to transmit the signal, which shows 10% jitter in the output de-serialized stream. Further, the CTLE maintains a 400 mV signal offset window, so that the output signal is readily recovered at the receiver end.

Additionally, the thesis presents an improved on-chip low-voltage, high-speed CML latch designed in a standard 180 nm CMOS technology with a 1.2 V supply. Delay models of the proposed latch and of the reported design have been developed and presented. The overall time constant of the proposed circuit is described and compared with that of the reported latch structure. The performance of the proposed latch is evaluated for power dissipation and offset swing at a clock frequency of 1.25 GHz. The output voltage swing of this design is around 0.6 V, which is higher than that of the conventional latch. Hence, the proposed design has the advantage of working with a large tail current in both the tracking and latching modes of operation.

Furthermore, the thesis reports a new wave-pipelined technique for asynchronous serial link transmission. Using the improved CML latch design, a high-speed serial link is presented, which results in improved voltage swing, reduced error and increased speed. This design also offers reduced power dissipation compared to other structures. With this design approach, a data transmission speed of 18.1 Gbps is achieved with a low dynamic power dissipation of 9.32 mW in 65 nm technology. The proposed design was also compared across different process corners, and it was found to operate between 13.33 Gbps and 19.1 Gbps in the different corner conditions with low power dissipation, making it suitable for higher data transmission rates.

#### **6.2 Future works**

The research vision of this thesis is to design circuit blocks for high-speed serial links. A CML-based latch design is proposed and studied through simulation. The design is verified using a frequency divider and asynchronous transceiver links as examples, which helps this thesis to propose a half-circuit-architecture-based wave combiner and decombiner approach for synchronous high-speed serial links. Because this thesis deals with the design of serial link circuit blocks that use external clock signals, the future work is to complete the SerDes with a PLL design.

# References

- [1] T. Padiya, M. Bhise and P. Rajkotiya, "Data Management for Internet of Things," *IEEE Region 10 Symposium*, Ahmedabad, pp. 62-65, May 2015.
- [2] H. Shamoto, K. Shirahata, A. Drozd, H. Sato, and S. Matsuoka, "GPU-Accelerated Large-Scale Distributed Sorting Coping with Device Memory Capacity," *IEEE Transactions on Big Data*, vol. 2, no. 1, pp. 57–69, March 2016
- [3] Schem-IoT, https://www.snyxius.com/software-development-companytexas/schema-iot/.
- [4] A. Nessa and M. Kadoch, "Joint Network Channel Fountain Schemes for Machine-Type Communications Over LTE-Advanced," *IEEE Internet of Things Journal*, vol. 3, no. 3, pp. 418-427, June 2016.
- [5] D. Boswarthick, O. Elloumi, and O. Hersent, "M2M Communications: A Systems Approach." Hoboken, NJ, USA: Wiley, March 2012
- [6] R. Lu, X. Li, X. Liang, X. Shen and X. Lin, "GRS: The Green, Reliability, and Security of Emerging Machine to Machine Communications," in *IEEE Communications Magazine*, vol. 49, no. 4, pp. 28-35, April 2011
- [7] FPGA for Internet of Things, https://iot.electronicsforu.com/expertopinion/fpga-internet-things/
- [8] N. Mysore Balasubramanya, L. Lampe, G. Vos and S. Bennett, "DRX With Quick Sleeping: A Novel Mechanism for Energy-Efficient IoT Using LTE/LTE-A," *IEEE Internet of Things Journal*, vol. 3, no. 3, pp. 398-407, June 2016.
- [9] K. Chang, G. Zhang and C. Borrelli, "Evolution of Wireline Transceiver Standards: Various, Most-Used Standards for the Bandwidth Demand," *IEEE Solid-State Circuits Magazine*, vol. 7, no. 4, pp. 47-52, November 2015
- [10] S. Kim, T. Kim, D. Kwon and W. Choi, "A 5–8 Gb/s Low-Power Transmitter with 2-Tap Pre-Emphasis Based on Toggling Serialization," 2016 IEEE Asian Solid-State Circuits Conference (A-SSCC), Toyama, pp. 249-252, Nov 2016

- [11] R. R. Dobkin, A. Morgenshtein, A. Kolodny, and R. Ginosar, "Parallel vs. Serial On-Chip Communication," Proceedings of the 2008 international workshop on System level interconnect prediction (SLIP '08) ACM, New York, NY, USA, pp 43-50.
- [12] C. O. Chen, S. Park, T. Krishna and L. Peh, "A Low-Swing Crossbar and Link Generator for Low-Power Networks-on-Chip," 2011 IEEE/ACM International Conference on Computer-Aided Design (ICCAD), San Jose, pp. 779-786, December 2011.
- [13] M. Alioto, G. Palumbo, "Model and Design of Bipolar and MOS Current-Mode logic: CML, ECL and SCL Digital Circuits," Dordrecht (The Netherlands): Springer, 2005.
- [14] N. Pandey, K.Gupta, G. Bhatia, B. Choudhary "MOS Current Mode Logic Exclusive-OR Gate using Multi-Threshold Triple-Tail Cells," *Microelectronics Journal.*, vol. 57, no. 11, pp. 13–20, November 2016
- [15] L. Szilagyi, G. Belfiore, R. Henker, F. Ellinger "Low Power Inductor-less CML Latch and Frequency Divider for Full-Rate 20 Gbps in 28-nm CMOS," *Proceeding of 10th Conference Ph.D. Research in Microelectronics and Electronics (PRIME)*, pp. 1–4, June 2014.
- [16] K. Gupta, N. Pandey, M. Gupta "Low-Voltage MOS Current Mode Logic Multiplexer," *Radioengineering Journal*, vol. 22, no. 1, pp. 259–268, April 2013.
- [17] M. Alioto, R. Mita and G. Palumbo, "Design of High-Speed Power-Efficient MOS Current-Mode Logic Frequency Dividers," *IEEE Transactions on Circuits and Systems II: Express Briefs*, vol. 53, no. 11, pp. 1165-1169, Nov. 2006
- [18] M. Usama and T. Kwasniewski, "Design and Comparison of CMOS Current Mode Logic Latches," 2004 IEEE International Symposium on Circuits and Systems (ISCAS), Vancouver, BC, pp. 353-356, May 2004

- [19] K. Gupta, N. Pandey , M. Gupta, "MCML D-Latch Using Triple-Tail Cells: Analysis and Design," *Active Passive Electronic Component*, vol. 2013, pp. 1–9, 2013.
- [20] M. Yamashina, H. Yamada. "An MOS Current Mode Logic (MCML) Circuit for Low Power Sub-GHz Processors". *IEICE Transactions on Electronics*, vol. E75-C, no. 10, p. 1181 -1187, October 1992.
- [21] Z. Toprak and Y. Leblebici, "Low-Power Current Mode Logic for Improved DPA-Resistance in Embedded Systems," *IEEE International Symposium on Circuits and Systems*, Kobe, pp. 1059-1062 Vol. 2, May 2005.
- [22] P. Heydari and R. Mohanavelu, "Design of Ultrahigh-Speed Low-Voltage CMOS CML Buffers and Latches," *IEEE Transactions on Very Large Scale Integration (VLSI) Systems*, vol. 12, no. 10, pp. 1081-1093, October 2004
- [23] O. Lozada, G. Espinosa, "An Improved High Speed, and Low Voltage CMOS Current Mode Logic Latch," *Analog Integrated Circuits and Signal Processing*, vol. 90, no. 1, pp. 247–252, January 2017.
- [24] X. Zhang, Y. Wang, S. Jia, G. Zhang, X. Zhang "A Novel CML Latch for Ultra High Speed Applications," *IEEE International Conference on Electron Devices and Solid-State Circuits (EDSSC)*, pp. 1–2, June 2014
- [25] G. Scotti, D. Bellizia, A. Trifiletti, G.Palumbo, "Design of Low-Voltage High-Speed CML D-Latches in Nano meter CMOS Technologies," *IEEE Transactions on Very Large Scale Integration (VLSI) Systems*, vol. 25, no. 12, pp. 3509–3520, December 2017.
- [26] P. Payandehnia, H. Maghami, S. Sheikhaei, *et al*, "High Speed CML Latch using Active Inductor in 0.18 μm CMOS Technology," 2011 19th Iranian Conference on Electrical Engineering (ICEE), pp. 1–4, May 2011
- [27] R. Vetter, D. Du, and A. Klietz, "Network Supercomputing," IEEE Network, vol. 6, no. 3, pp. 38-44, May 1992.
- [28] B. Razavi, "Monolithic Phase-Locked Loops and Clock Recovery Circuits." *IEEE Press*, 1996.

- [29] J. C. Chen, "Multi-Gigabit SERDES: The CornerStone, of High Speed Serial Interconnects,"2011.
- [30] S. Palermo, "CMOS Nanoelectronics Analog and RF VLSI Circuits." New York City, NY: McGraw-Hill, 2011.
- [31] "AND9075/D Application Note: Understanding Data Eye Diagram Methodology for Analyzing High Speed Signals," On Semiconductor, June 2015.
- [32] S. Safwat, E. E. Hussein, M. Ghoneima and Y. Ismail, "A 12Gbps All Digital Low Power SerDes Transceiver for On-Chip Networking," 2011 IEEE International Symposium of Circuits and Systems (ISCAS), Rio de Janeiro, pp. 1419-1422, May 2011.
- [33] E. E. Hussein, S. Safwat, M. Ghoneima and Y. Ismail, "A 16Gbps Low Power Self-Timed SerDes Transceiver for Multi-Core Communication," 2012 IEEE International Symposium on Circuits and Systems (ISCAS), Seoul, pp. 1660-1663, May 2012.
- [34] R. N. Tadros, A. H. Ahmed, M. Ghoneima and Y. Ismail, "A 24 Gbps SerDes Transceiver for On-Chip Networks Using a new Half-Data-Rate Self-Timed 3-level Signaling Scheme," 5<sup>th</sup> International Conference on Energy Aware Computing Systems & Applications, Cairo, pp. 1-4, March 2015.
- [35] M. P. Flynn and J. J. Kang, "Global Signaling over Lossy Transmission Lines," *IEEE/ACM International Conference on Computer-Aided Design( ICCAD)*, San Jose, CA, USA, pp. 985-992, November 2005.
- [36] A. Jaiswal, D. Walk, Y. Fang and K. Hofmann, "Low-Power High-Speed On-Chip Asynchronous Wave-Pipelined CML SerDes," 27<sup>th</sup> IEEE International System-on-Chip Conference (SOCC), Las Vegas, NV, 2014, pp. 5-10, September 2014.
- [37] P. K. Hanumolu, G. Wei, and U. Moon, "Equalizers for High Speed Serial Links," *International Journal of High Speed Electronics and systems*, vol.15, no. 2, pp.429-458, 2005.

- [38] M. Lin, C.C. Tsai, C.H. Chang et al., "A 5Gb/s Low-Power PCI express/USB3.0 Ready PHY in 40nm CMOS Technology with High-Jitter Immunity," 2009 IEEE Asian Solid-State Circuits Conference, Taipei, pp. 177-180, November 2009.
- [39] B. C. Hien, S.M. Kim and K. Cho, "Design of a Wave-Pipelined Serializer-Deserializer with an Asynchronous Protocol for High Speed Interfaces," 2012 4th Asia Symposium on Quality Electronic Design (ASQED), Penang, pp. 265-268, July 2012.
- [40] D. F. Tondo and R. R. Lopez, "A Low-Power, High-Speed CMOS/CML 16:1 Serializer," Argentine School of Micro-Nanoelectronics, Technology and Applications, San Carlos de Bariloche, 2009, pp. 81-86, October 2009.
- [41] R. R. Dobkin, Y. Perelman, T. Liran, R. Ginosar and A. Kolodny, "High Rate Wave-pipelined Asynchronous On-chip Bit-serial Data Link," 13<sup>th</sup> IEEE International Symposium on Asynchronous Circuits and Systems (ASYNC'07), Berkeley, CA, pp. 3-14, March 2007.
- [42] D. Duvvuri and V. S. R. Pasupureddi, "An Integrated Common Gate CTLE Receiver Front End with Charge Mode Adaptation," *IEEE Computer Society Annual Symposium on VLSI (ISVLSI)*, Pittsburgh, PA, pp. 12-17, July 2016.
- [43] B. Razavi "Design of Analog CMOS Integrated Circuits," Electrical Engineering. Los Angeles McGraw Hill: University of California; 2001.
- [44] B. Razavi, "The Role of PLLs in Future Wireline Transmitters," *IEEE Transactions on Circuits and Systems I: Regular Papers*, vol. 56, no. 8, pp. 1786-1793, Aug. 2009.
- [45] I. Jang, J. Kim, S. Y. Kim, "Accurate Delay Models of CMOS CML Circuits for Design Optimization," *Analog Integrated Circuits and Signal Processing*, vol. 82, no. 1, pp. 297-307, Jan 2015.
- [46] A. Kapoor, Y. Hu and R. Bashirullah, "A Current-Density Centric Logical Effort Delay and Power Model for High-Speed CML Gates," *IEEE Transactions on Circuits and Systems I: Regular Papers*, vol. 60, no. 10, pp. 2618-2630, Oct. 2013
- [47] B. Kuo, "Automatic Control Systems," 3<sup>rd</sup> edition, Englewood Cliffs, NJ, USA: Prentice-Hall; 1975.