# HIGH SPEED SYNCHRONOUS SERDES TRANSCEIVER DESIGN

**M.Tech.** Thesis

## By **PRAMOD KUMAR BHARTI**



## DISCIPLINE OF ELECTRICAL ENGINEERING INDIAN INSTITUTE OF TECHNOLOGY INDORE JUNE 2016

# HIGH SPEED SYNCHRONOUS SERDES TRANSCEIVER DESIGN

### A THESIS

Submitted in partial fulfillment of the requirements for the award of the degree

*of* Master of Technology

*by* **PRAMOD KUMAR BHARTI** 



## DISCIPLINE OF ELECTRICAL ENGINEERING INDIAN INSTITUTE OF TECHNOLOGY INDORE JUNE 2016



## INDIAN INSTITUTE OF TECHNOLOGY INDORE

### **CANDIDATE'S DECLARATION**

I hereby certify that the work which is being presented in the thesis entitled **HIGH SPEED SYNCHRONOUS SERDES TRANSCEIVER DESIGN** in the partial fulfillment of the requirements for the award of the degree of **MASTER OF TECHNOLOGY** and submitted in the **DISCIPLINE OF ELECTRICAL ENGINEERING, Indian Institute of Technology Indore**, is an authentic record of my own work carried out during the time period from July 2014 to June 2016 under the supervision of Dr. Santosh Kumar Vishvakarma, Associate Professor, IIT Indore.

The matter presented in this thesis has not been submitted by me for the award of any other degree of this or any other institute.

#### Signature of the student with date PRAMOD KUMAR BHARTI ROLL NO:1402102009

\_\_\_\_\_

This is to certify that the above statement made by the candidate is correct to the best of my/our knowledge.


Signature of the Supervisor of M.Tech. thesis (with date) (Dr. SANTOSH KUMAR VISHVAKARMA)

**PRAMOD KUMAR BHARTI** has successfully given his/her M.Tech. Oral Examination held on **29<sup>th</sup> June, 2016**.

Signature of Supervisor of M.Tech. thesis Date:

Convener, DPGC Date:

| Signature of PSPC Member #1 |  |
|-----------------------------|--|
| Date:                       |  |

Signature of PSPC Member #1 Date:

## Acknowledgments

First of all, I would like to express my sincere gratitude to my guide, Dr. Santosh Kumar Vishvakarma, for giving me the opportunity to work on the topic of High-Speed Synchronous SerDes Transceiver Design. I am very grateful to him for all his guidance, encouragement and support during my research work. Working under his supervision has been a great experience.

Besides my guide, I am very thankful to the PG Student's Progress Committee (PSPC) members, Dr. M. Ambarasu and Dr. Sarika Jalan, for their kind support and encouragement, which helped me to improve my knowledge in all ways.

I would also like to give special thanks to my fellow lab mates: Mahesh, Nand Kishore, Bhupendra, Vishal, Abhishek, Deepika, Gopal, Puran, Ankur, Vikas, Gourav, Puja, Ravi and Mohit for their continuous assistance and valuable discussions which helped me to improve my work.

I express my heartfelt gratitude to all my friends and colleagues for being there with me during my ups and downs and for making my life better here.

I also thank my family members for their love and support and for encouraging me every time. Thank you.

## Abstract

Advances in wireless and wireline technologies are increasing the demand for devices that support high data rates. Network devices must handle and process high-speed data signals reliably, with minimum electromagnetic interference (EMI) and bit error rate (BER). A promising solution to these requirements is the SerDes transceiver, which can handle high-speed data transfer. This thesis proposes a synchronous on-chip SerDes (serializer and deserializer) based on the current mode logic (CML) technique. Instead of multiplexers (MUX), the serializer uses CML-based double-edge-triggered flip-flops (DETFF). The deserializer uses both positive- and negative-edge-triggered CML-based D flip-flops. A clock and data recovery circuit based on a phase locked loop (PLL) consumes more power than a PLL-less clock and data recovery circuit. Hence, a novel PLL-less clock and data recovery circuit based on a 3-level encoder-decoder technique is used. Apart from recovering clock and data, it also eliminates the need for a transmit (feed forward) equalizer and a receiver (decision feedback) equalizer. A 3 mm lossy transmission line is used to transmit the serialized stream. This work is implemented in UMC 65 nm technology. Compared with recent works, the proposed serializer and deserializer improve total power consumption by 44.03% and data rate by 25%, and the SerDes with the proposed PLL-less clock and data recovery circuit improves power consumption by 62% and data rate by 12%.

SerDes is used in various applications, including stackable Ethernet switch expansion, rack-to-rack and shelf-to-shelf datacom/telecom interconnect, video/camera links, base stations, automotive imaging/video, sensor systems, telecom add-drop multiplexers and pseudo-optical switches.

## **List of Publications**

- Mohit S. Choudhary, Mahesh Kumawat, Pramod K. Bharti, and S. K. Vishvakarma, "16.64Gbps Synchronous CML SerDes Transceiver Design Technique with Process Corner Variations for Low Power Application," VLSI Circuit and System Letters: Regular Papers, vol. 2, no. 1, pp. 02–07, April 2016.
- Pramod K. Bharti, Mahesh Kumawat, Mohit S. Choudhary, and S. K. Vishvakarma, "20 Gbps High Speed CML based Synchronous On-Chip SerDes Design," IETE Journal of Research, Taylor & Francis, 2016 (Under Review).

# Contents

- Acknowledgments
- Abstract
- 1 Introduction
  - 1.1 Overview
    - 1.1.1 Disadvantage of parallel lines and their solutions
  - 1.2 High speed SerDes
    - 1.2.1 Serializer / Deserializer Blocks
    - 1.2.2 Equalizer Block
    - 1.2.3 Clock and Data Recovery Circuit (CDR)
  - 1.3 Different SerDes Architecture and its Applications
    - 1.3.1 Parallel clock SerDes
    - 1.3.2 Embedded Clock Bits SerDes
    - 1.3.3 8b/10b SerDes
    - 1.3.4 Bit Interleaving SerDes
  - 1.4 Current Mode Logic (CML)
    - 1.4.1 Need of CML Technique
    - 1.4.2 Operation of CML Technique
    - 1.4.3 Switching Condition of Inverter
  - 1.5 Motivation
  - 1.6 Summary of Contributions
  - 1.7 Organization of the Thesis
- 2 Literature Survey
- 3 Research Objectives and Methodology
  - 3.1 Overview
  - 3.2 Research Objectives
  - 3.3 Research Methodology
- 4 Proposed Design of Synchronous Serializer and Deserializer
  - 4.1 Design of Serializer
  - 4.2 Design of Deserializer
- 5 Proposed Design of Clock and Data Recovery Circuit
  - 5.1 Design of 3-Level Encoder
  - 5.2 Construction of Channel
  - 5.3 3-Level Decoder
  - 5.4 Clock Recovery Circuit
  - 5.5 Data Recovery Circuit
- 6 Conclusion and Future Works
  - 6.1 Conclusion
  - 6.2 Future Works
- References

# **List of Figures**

| Figure | Caption                                                  |
|--------|----------------------------------------------------------|
| 1.1    | Block diagram of SerDes                                  |
| 1.2    | Channel characteristics [1]                              |
| 1.3    | Output response of SerDes without transmit equalizer [1] |
| 1.4    | Output waveform of signal after equalization [1]         |
| 1.5    | Transmit equalizer                                       |
| 1.6    | Receiver equalizer [1]                                   |
| 1.7    | 5-tap DFE [1]                                            |
| 1.8    | Parallel clock SerDes [9]                                |
| 1.9    | Block diagram of embedded clock bits SerDes              |
| 1.10   | 8b/10b SerDes                                            |
| 1.11   | Bit interleaving SerDes                                  |
| 1.12   | CML buffer/inverter                                      |
| 1.13   | Transfer characteristics of CML inverter/buffer          |
| 1.14   | Design of CML buffer/inverter using level shifter        |
| 3.1    | The adopted research methodology for proposed work       |
| 4.1    | Block diagram of proposed SerDes                         |
| 4.2    | Design of serializer                                     |
| 4.3    | Double-edge triggered flip-flop                          |
| 4.4    | Design of CML MUX                                        |
| 4.5    | CML D-latch                                              |
| 4.6    | Output waveform of serializer                            |
| 4.7    | Conventional deserializer                                |
| 4.8    | Deserializer design                                      |
| 4.9    | D flip-flop                                              |
| 5.1    | Design of an encoder                                     |
| 5.2    | Model graph of 3-level encoder                           |
| 5.3    | Three level encoder                                      |
| 5.4    | Decoder design                                           |
| 5.5    | Desired output waveform of clock recovery circuit        |
| 5.6    | Desired output of data recovery circuit based on A and B |
| 5.7    | Clock and data recovery circuit                          |
| 5.8    | Output waveform of clock and data recovery circuit       |
| 5.9    | Eye diagram of extracted data                            |
| 5.10   | Eye diagram of extracted clock                           |

# **List of Tables**

| Table | Caption                                             |
|-------|-----------------------------------------------------|
| 4.1   | Power consumption of different components of SerDes |
| 4.2   | Comparison results                                  |
| 5.1   | Truth table of 3 level encoder                      |
| 5.2   | Truth table of clock recovery circuit               |
| 5.3   | Truth table of data recovery circuit                |
| 5.4   | Truth table for J                                   |
| 5.5   | Truth table for K                                   |
| 5.6   | Power consumption                                   |
| 5.7   | Design comparison                                   |

## Chapter 1

## Introduction

### 1.1 Overview

This chapter explains the basics of SerDes in detail, along with its uses in various fields.

#### **1.1.1** Disadvantage of parallel lines and their solutions

In recent years, there have been tremendous advancements in the field of high-speed data communication. Research in this area is developing newer models for high-speed data transmission on chip as well as for propagation through a channel in off-chip communication. Complex System-on-Chips (SoCs) have increased the number of interconnections required on a single chip. As a result, high-speed data transmission with a high quality of service has become a challenging task.

The simplest way to transmit data between an input and an output is to connect a direct path between the two. However, the data to be transmitted usually consists of more than one bit, so the data path must be equally wide. Such a data path is called a parallel data bus, since the bits are transmitted in parallel. Present technology transmits over parallel links with good accuracy and high speed. The disadvantages of these links are large area and high static power consumption. Parallel links also suffer from crosstalk, skew and leakage power [2]. Moreover, global interconnects do not scale with technology [3], so parallel lines require more area than the other processing elements, which increases the packaging cost of the chip, the wiring complexity and the routing congestion [1]. To overcome these disadvantages, the parallel lines are replaced by a serial link, which can be realized with a SerDes transceiver [4] [5] [6]. For longer links, the serial link outperforms the parallel link in terms of active area, leakage, and dynamic power [7]. SerDes stands for Serializer and Deserializer: the serializer converts parallel data into a serial stream at the transmitter, whereas the deserializer performs the reverse operation at the receiver and converts the serial stream back into parallel data.

Several modifications have been employed to overcome the drawbacks of the parallel data bus, such as reducing the number of I/O pins, clock forwarding, and higher-speed source-synchronous interfaces. The first is to reduce the number of I/O pins used to transfer the data, by placing a multiplexer at the transmitter and a demultiplexer at the receiver. The multiplexer converts n data lines at the transmitter to k lines (where k < n), and the demultiplexer at the receiver converts the k lines back to n. Although this lowers the number of I/O pins, the clock frequency increases by the inverse of the ratio by which the pin count is reduced (that is, by n/k), which tends to create timing issues. So the first modification fails to solve the problems arising from the parallel data bus approach.
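The pin-count versus clock-rate trade-off can be made concrete with a small calculation. The sketch below is only illustrative: the function name and the numbers (32 lines, 8 pins, 100 MHz) are assumed values, not figures from this work.

```python
def muxed_clock_mhz(n_lines, k_pins, parallel_clock_mhz):
    """Clock rate needed on k pins to carry the traffic of n parallel lines."""
    if not 0 < k_pins < n_lines:
        raise ValueError("expected 0 < k < n")
    # Throughput is conserved: n * f_parallel = k * f_muxed
    return parallel_clock_mhz * n_lines / k_pins

# Assumed example: 32 data lines clocked at 100 MHz multiplexed onto 8 pins
print(muxed_clock_mhz(32, 8, 100.0))
```

Reducing the pin count by a factor of four raises the required clock by the same factor, which is the source of the timing issues mentioned above.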

The next modification is to add a high-speed clock source to the data path between the transmitter and receiver. The source has a lower frequency than that required to clock the data flip-flops on the chip. Phase locked loops (PLLs) are used in each chip to generate clock frequencies of higher multiples of this frequency. These clocks are used to transmit and capture data at the transmitter and receiver respectively. This method is called clock forwarding, and it has the advantage that the high-speed clock used to transmit data at the transmitter is also given as a reference to the receiver to receive the data. This approach helps to meet the timing requirements easily.

Although the methods described above have been applied, none of them satisfies the demand for high-speed data transmission, as each modification has its own disadvantages. An emerging development in this field is high-speed SerDes design, which outperforms the above methods to a great extent.

### **1.2 High speed SerDes**

High-speed SerDes devices provide a prominent solution to the problem described above. These devices are high-speed I/O interfaces that work at speeds of 2.5 Gbps and higher. Rather than using a separate line for clock information, the clock information is transferred along with the data [1]. As a result, the area required for the separate clock lines of clock forwarding techniques is eliminated, which in turn reduces the problem of timing synchronization.

Figure 1.1 shows the block diagram of a high-speed SerDes.

The following sections describe each block shown in the figure.



Figure 1.1: Block diagram of SerDes

#### **1.2.1** Serializer / Deserializer Blocks

The serializer at the transmitter end takes an n-bit data path as its input and serializes it into a single-bit serial data path, which is fed to the feed forward equalizer (FFE) and driver stages. The number of inputs to the serializer is a multiple of 8 or 10, and it may be programmable or non-programmable. For un-encoded and scrambled data transfer, multiples of 8 data lines are used, whereas multiples of 10 data lines are useful for 8b/10b coding protocols.

In Figure 1.1, the serializer converts the n-bit data lines into a single-bit line and feeds it to the driver stage. The number of bits at the output of the serializer depends mostly on the implementation, and it may vary from one to m (where m < n). A wider data path results in more complex design issues, but at the same time it requires a lower operating frequency. In these cases, another stage of serialization may be performed at the driver stage.

At the serializer, the high-speed clock is divided down by n to generate the sample clock for parallel data transmission. This sampling clock is provided to the driver stage along with the data so that it can be transferred to the receiver through the transmitting channel.

The deserializer at the receiver end performs the inverse function of the serializer. It deserializes the serial data into an n-bit data path, which it provides as its output. Note that the number of data lines at the output of the deserializer is exactly the same as at the input of the serializer. A sample clock signal is generated by dividing down the high-speed internal clock, and this clock signal is supplied to the latching circuit, which latches the parallel data.
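The behavior of the two blocks can be sketched at the bit level as follows. This is a minimal functional model, not the CML circuit proposed in this thesis; the function names and the MSB-first bit order are assumptions made for illustration.

```python
def serialize(words, n):
    """Flatten n-bit parallel words into one serial bit stream (MSB first, assumed)."""
    stream = []
    for w in words:
        for i in reversed(range(n)):
            stream.append((w >> i) & 1)
    return stream

def deserialize(stream, n):
    """Regroup the serial stream into n-bit words, one word per divided-by-n clock."""
    words = []
    for start in range(0, len(stream) - n + 1, n):
        w = 0
        for bit in stream[start:start + n]:
            w = (w << 1) | bit
        words.append(w)
    return words

words = [0xA5, 0x3C]
stream = serialize(words, 8)      # 16 serial bits for two 8-bit words
print(deserialize(stream, 8) == words)
```

The serial stream runs n times faster than the parallel word clock, which corresponds to the divide-by-n sample clock described above.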

#### **1.2.2 Equalizer Block**

The channel between the transmitter and receiver acts as a low pass filter, which distorts the signals. This characteristic of the channel is shown in Figure 1.2.



Figure 1.2: Channel characteristics [1]

As shown in Figure 1.3, the clean input waveform is distorted at the output of the channel.



Figure 1.3: Output response of SerDes without transmit equalizer [1]

The channel can be modeled as a typical low pass filter whose cut-off frequency is lower than the frequency of the signal. As a result, signal distortion occurs.

The equalizer equalizes the signal at both the transmitter and the receiver to eliminate the effect of the channel and to allow proper decoding of the signal. It can do so because its transfer function is roughly the inverse of the channel transfer function. The equalizer modifies, or pre-distorts, the signal in such a way that the resulting signal at the output of the channel becomes a clean waveform. Pre-emphasis circuits are used as the equalizer at the transmitter.



Figure 1.4 shows the output waveform of equalized signal.

Figure 1.4: Output waveform of signal after equalization [1]

Most SerDes transceivers implement a Feed Forward Equalizer (FFE) at the transmitter. A chain of flip-flops delays the serial data signal, and the flip-flop outputs serve as the taps of the filter. Each tap is multiplied by a tap weight (filter coefficient), and the sum of the weighted taps drives the serial data output.

Figure 1.5 shows the block diagram of a 3-tap FFE [1].



Figure 1.5: Transmit equalizer
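The tap-and-sum structure of the FFE is that of a finite impulse response filter. The sketch below models a 3-tap FFE behaviorally; the tap weights and the +1/-1 symbol mapping are assumed values chosen for illustration and are not taken from [1] or from this work.

```python
def ffe(symbols, taps):
    """FIR feed-forward equalizer: out[n] = sum over k of taps[k] * symbols[n - k]."""
    out = []
    for n in range(len(symbols)):
        acc = 0.0
        for k, weight in enumerate(taps):
            if n - k >= 0:          # samples before the start of the stream count as 0
                acc += weight * symbols[n - k]
        out.append(acc)
    return out

# Bits mapped to +1/-1; main tap plus two negative post-cursor taps (assumed weights)
symbols = [1, 1, 1, -1, -1, 1]
taps = [0.75, -0.125, -0.125]
print(ffe(symbols, taps))
```

With these weights, the output amplitude is largest right after a transition and smaller when the same bit repeats, which is the pre-emphasis effect that compensates the low-pass channel.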

An equalizer is also present at the receiver side to counter the effect of signal distortion at the input of the receiver. In SerDes, Decision Feedback Equalizer (DFE) circuits are used as the receiver equalizer. One frequently used receiver equalizer circuit is a peaking amplifier. It amplifies the higher frequency components more than the lower frequency components. If the peaking equals the difference between the channel attenuation at high and low frequencies, the channel is said to be equalized.

Figure 1.6 shows the output waveform of the signal at the receiver equalizer, and Figure 1.7 shows the block diagram of a 5-tap DFE.



Figure 1.6: Receiver equalizer [1]



Figure 1.7: 5 - tap DFE [1]

#### **1.2.3** Clock and Data Recovery Circuit(CDR)

The clock circuitry at the transmitter and receiver needs to be synchronized so that no information is lost. With clock forwarding, the clock signal is transmitted along with the data signals over the channel, so the transmitter and receiver are synchronized, but an additional link is needed to carry the clock. A clock and data recovery circuit instead extracts both the data and the transmitter's clock from the received data signal. In this case, no clock signal is sent along with the data, and hence no extra line is needed during transmission.


The CDR circuit is present at the receiver, and it recovers the timing information at the receiver end by monitoring the transitions of the data signal and selecting an optimal timing sample. CDR should be robust to inter-symbol interference (ISI) and other jitter components.
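The idea of recovering timing from data transitions can be illustrated with a simple oversampling model. This is a generic textbook-style sketch, not the PLL-less 3-level technique proposed later in this thesis; the oversampling ratio, the phase offset and the function names are assumptions.

```python
def oversample(bits, osr, offset):
    """Receiver view: each bit seen osr times, starting at an unknown phase offset."""
    samples = [b for b in bits for _ in range(osr)]
    return samples[offset:]

def recover_bits(samples, osr):
    """Locate the dominant transition phase, then sample half a bit period later."""
    edge_hist = [0] * osr
    for i in range(1, len(samples)):
        if samples[i] != samples[i - 1]:
            edge_hist[i % osr] += 1          # transitions pile up at one phase
    edge_phase = edge_hist.index(max(edge_hist))
    sample_phase = (edge_phase + osr // 2) % osr   # middle of the bit period
    return [samples[i] for i in range(sample_phase, len(samples), osr)]

bits = [1, 0, 1, 1, 0, 0, 1, 0]
rx = oversample(bits, osr=8, offset=3)
print(recover_bits(rx, 8) == bits)
```

Sampling in the middle of the bit period keeps the decision point as far as possible from the transitions, where ISI and jitter have the most effect.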

### **1.3 Different SerDes Architecture and its Applications**

It is also relevant to consider the different SerDes architectures, as this provides better insight into performance evaluation and comparison regarding topology, protocol, data flow, additional buffering, clocking, etc. [8]. The choice of SerDes therefore plays a prominent role in system cost and performance.

There are four distinct SerDes architectures: parallel clock SerDes, 8b/10b SerDes, embedded clock bits (alias start-stop bit) SerDes, and bit interleaving SerDes. [8] [9] [10]

#### **1.3.1** Parallel clock SerDes:

#### Architecture:

These are used to serialize wide data-address-control parallel buses, such as processor buses, control buses and Peripheral Component Interconnect (PCI). A conventional SerDes employs a single MUX to multiplex the whole data bus, whereas a parallel clock SerDes uses a bank of n-to-1 MUXs, each of which serializes its own section of the data bus. All the serial data paths then travel to the receiver in parallel. An additional clock signal is sent along with the data to help the receiver latch and recover the data properly. This arrangement carries a risk of pair-to-pair skew, which should be minimized for better performance. Figure 1.8 shows the block diagram of parallel clock SerDes.

#### **Application:**

Parallel clock SerDes are generally used to serialize the traditional data-address-control buses. These act as a virtual ribbon cable unidirectional bridge. Typical applications of this architecture include stackable Ethernet switch expansion, rack to rack and shelf to shelf data communication interconnect and video or camera links.

Parallel clock SerDes requires multiple serial lines, but it still provides better performance than that of non-serialized circuitry as it has fewer wires, lower power, longer cable driving capability, less noise or EMI and lower cable/connector costs. As parallel clock SerDes is not associated with a single serial line, it can be made arbitrarily wide, i.e. many serial lines can be present, and it also eliminates the issues related to ultra high-speed serial data rates.

Figure 1.8: Parallel clock SerDes [9]

Parallel clock SerDes outperforms other architectures in this respect, and hence it is the only practical way to transmit a traditional wide parallel bus over a cable several meters long. Common parallel bus widths include 21, 28 and 48 bits.
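The way a wide bus is divided among several serial lanes can be sketched as below. The 21-bit width is one of the common widths mentioned above; the split into three lanes of seven bits each, and the function names, are assumptions made for illustration.

```python
def split_into_lanes(word_bits, n_lanes):
    """Give each lane its own contiguous section of the bus to serialize."""
    if len(word_bits) % n_lanes:
        raise ValueError("bus width must divide evenly across lanes")
    per_lane = len(word_bits) // n_lanes
    return [word_bits[i * per_lane:(i + 1) * per_lane] for i in range(n_lanes)]

def merge_lanes(lanes):
    """Receiver side: concatenate the recovered lane sections back into the bus word."""
    return [bit for lane in lanes for bit in lane]

bus = [i % 2 for i in range(21)]     # one 21-bit bus word (arbitrary pattern)
lanes = split_into_lanes(bus, 3)     # assumed: three lanes, each a 7-to-1 MUX
print([len(lane) for lane in lanes], merge_lanes(lanes) == bus)
```

Because every lane carries only its own section, each serial line runs at 7 times the bus clock rather than 21 times, at the cost of the pair-to-pair skew that the forwarded clock must tolerate.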

#### **1.3.2 Embedded Clock Bits SerDes**

#### Architecture:

The embedded clock bits architecture serializes the data bus and the clock signal into a single signal pair. In each cycle, two clock bits, one high and one low, are embedded into the serial stream. These bits frame the start and end of each serialized word and hence create a periodic rising edge in the serial data stream. Data widths of 10 and 18 bits are popular for this architecture.

The receiver automatically searches for this periodic clock rising edge embedded in the data stream. The data bits change over time, but the clock bits do not. Hence the receiver locates this unique clock edge and synchronizes to it. After that, data recovery from the serial stream becomes easy regardless of the payload data pattern. This automatic synchronization capability is especially useful where the receiver is not under direct control of the system. Here the receiver is locked to an incoming clock embedded in the signal and not to an external reference clock. Hence, jitter requirements are relaxed significantly for both the transmitter and the receiver.



Figure 1.9: Block diagram of embedded clock bits SerDes

Figure 1.9 shows the block diagram of embedded clock bits SerDes.
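The receiver's search for the periodic rising edge can be sketched as follows. The frame layout (a high clock bit before the 10 data bits and a low clock bit after them), the function names and the test words are assumptions made for illustration; a real receiver would observe many more frames before declaring lock.

```python
FRAME_BITS = 12  # 10 data bits plus one high and one low clock bit (assumed layout)

def frame_words(words):
    """Wrap each 10-bit word as [1] + data + [0], so every frame boundary is a 0->1 edge."""
    stream = []
    for w in words:
        stream += [1] + list(w) + [0]
    return stream

def find_clock_phase(stream):
    """Return the first bit offset at which a rising edge repeats every FRAME_BITS bits."""
    for phase in range(FRAME_BITS):
        positions = [i for i in range(phase, len(stream), FRAME_BITS) if i >= 1]
        if positions and all(stream[i - 1] == 0 and stream[i] == 1 for i in positions):
            return phase
    return None

def recover_words(stream, phase):
    """Strip the clock bits from every complete frame that starts at the locked phase."""
    return [stream[i + 1:i + 11]
            for i in range(phase, len(stream) - FRAME_BITS + 1, FRAME_BITS)]

words = [[int(c) for c in s]
         for s in ("1011001110", "0100110001", "1110000101", "0001111010")]
rx = frame_words(words)[5:]          # receiver joins in the middle of the first frame
phase = find_clock_phase(rx)
print(phase, recover_words(rx, phase) == words[1:])
```

Only the clock-bit position shows a rising edge in every frame, because the payload bits change from word to word; this is why the receiver can lock without any help from the system.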

#### **Application:**

This type of SerDes is used in applications that transmit raw data together with other signals such as control, parity, frame, sync and status signals. These extra non-data bytes require the SerDes to work faster than the normal data conversion rate, which places higher demands on cable design. They also require an idle insertion and deletion flow control mechanism.

Another application of embedded clock bits SerDes is where one transmitter broadcasts to more than one receiver, and where the receiver is in a remote module and not under system control. In these cases, a new receiver locks to the random data without help from the other existing receivers.

These SerDes are also applied to non-byte-based applications, where unpackaged raw data is transmitted along with the control signals. Such systems include base stations, automotive imaging or video, and sensor systems.

#### 1.3.3 8b/10b SerDes

#### Architecture:

The 8-bit/10-bit serializer serializes the parallel data bus in two consecutive steps. First, it maps the parallel data byte to a 10-bit code and then it serializes that 10-bit code to a serial line.

The 10-bit transmission codes were developed by IBM in the early 1980s, and they ensure both multiple edge transitions in each cycle and DC balance. DC balance means having a balanced number of ones and zeros. Multiple edge transitions allow the receiver to synchronize to the incoming data stream, and DC balance helps in driving AC-coupled loads, optical cables, and long modules.

Most 8b/10b de-serializer architectures lock the edges by comparing the recovered clock frequency to an external reference clock. So, they need fixed external clock source frequency and more jitter control.



Figure 1.10: 8b/10b SerDes

The block diagram of 8b/10b SerDes is shown in Figure 1.10.

#### **Application:**

8b/10b SerDes are suitable for byte-oriented data such as cell or packet traffic across cable and fiber. Many standard communication links, such as Ethernet and Fibre Channel, use 8b/10b coding. The code has a maximum run length of 5 bits. This limits the spectral content of the serial bit stream, which in turn suppresses electromagnetic radiation. The short run length also reduces the jitter caused by ISI on lossy interconnects.

The 8b/10b code is also DC balanced, which helps extend the cable driving capability. DC-balanced coding and short run lengths are needed to reliably drive AC-coupled environments as well as fiber optic cables. 8b/10b coding also provides a way to check for errors and to send control signals.
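The two properties discussed above, run length and DC balance, are easy to measure on a bit stream. The sketch below does not implement the 8b/10b encoder itself; it only checks both properties on the two standard encodings of the K28.5 comma character sent back to back. The function names are illustrative.

```python
def max_run_length(bits):
    """Length of the longest run of identical consecutive bits."""
    if not bits:
        return 0
    longest = run = 1
    for prev, cur in zip(bits, bits[1:]):
        run = run + 1 if cur == prev else 1
        longest = max(longest, run)
    return longest

def disparity(bits):
    """Number of ones minus number of zeros; 0 means the stream is DC balanced."""
    ones = sum(bits)
    return ones - (len(bits) - ones)

# K28.5 in its two running-disparity forms, transmitted one after the other
stream = [int(c) for c in "0011111010" + "1100000101"]
print(max_run_length(stream), disparity(stream))
```

Each 10-bit code on its own has a disparity of +2 or -2, and alternating the two forms keeps the running disparity bounded, which is what makes AC coupling practical.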

#### **1.3.4 Bit Interleaving SerDes**

#### Architecture:

Bit interleaving SerDes multiplexes a number of slower 8b/10b serial streams into a faster serial stream by interleaving the bits. At the other end, the receiver demultiplexes the interleaved bits back into the original slower bit streams. This type of SerDes works with high-speed and low jitter requirements. Hence, it requires very precise external clocks. Figure 1.11 shows the block diagram of bit interleaving SerDes.

Figure 1.11: Bit interleaving SerDes

#### **Application:**

These types of SerDes are commonly used in telecommunication transmission equipment such as add-drop MUXs and pseudo optical switches to assemble SONET/SDH streams for transmission over cable or optical fiber. These configurations include 4 x 155 Mbps to 622 Mbps and 4 x 622 Mbps to 2.488 Gbps MUX/DEMUX functions.

Another type of bit interleaving SerDes multiplexes 8-bits/10-bits coded streams. These are placed in switching and router equipment to get more bandwidth extension than the existing ones.
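Bit interleaving itself is a round-robin operation, which the following sketch models for four tributary streams. The function names and the example bit patterns are assumptions made for illustration.

```python
def interleave(streams):
    """Take one bit from each slower tributary in turn (round robin)."""
    return [bit for group in zip(*streams) for bit in group]

def deinterleave(stream, n):
    """Receiver side: every n-th bit belongs to the same tributary."""
    return [stream[i::n] for i in range(n)]

tributaries = [
    [1, 0, 1, 1],
    [0, 0, 1, 0],
    [1, 1, 0, 0],
    [0, 1, 0, 1],
]
fast = interleave(tributaries)       # runs at 4x the tributary bit rate
print(fast)
print(deinterleave(fast, 4) == tributaries)
```

The output stream carries four bits in the time each tributary carries one, which matches the 4 x 155 Mbps to 622 Mbps configuration mentioned above.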

### **1.4 Current Mode Logic(CML)**

#### **1.4.1** Need of CML Technique

Most system-on-chip (SoC) designs use CMOS logic because it has only dynamic power dissipation and a high noise margin [11]. However, at high speed, logic gates implemented in CMOS are rather slow and do not perform well because of their dependence on the pMOS device. The mobility of a pMOS device is approximately 2.7 times lower than that of an nMOS device. So, to obtain equal rise and fall times, the width of the pMOS devices is increased, which increases both the area and the power dissipation. Also, CMOS is a single-ended logic, which is highly influenced by environmental noise [2]. To overcome these problems, gates are built in CML, which uses differential nMOS transistor pairs. CML does not require pMOS devices, so the circuit can operate at higher speed. However, unlike CMOS logic, which has only dynamic power dissipation, CML also suffers from static power dissipation.

#### **1.4.2** Operation of CML Technique

CML logic is frequently used in designing of very high speed MUX, flip flops and logic gates [12]. CML logic gates are designed using differential pairs. The differential input is applied to the gates of the transistors, and their corresponding differential output is taken from the drains of the corresponding transistors. Figure 1.12 shows the simple inverter/buffer using CML technique.

$$\text{Input Voltage} = V_{in\_p} - V_{in\_n} \tag{1.1}$$

$$\text{Output Voltage} = V_{out\_p} - V_{out\_n} \tag{1.2}$$

$$\text{Biasing Current, } I_0 = I_1 + I_2 \tag{1.3}$$

(1.4)

Transfer characteristics of differential CML buffer/inverter give the condition for the transistor to operate properly in saturation region which is shown in Figure 1.13.

The minimum differential input required to saturate the differential output is:

$$V_{in,min} = \sqrt{\frac{2I_0}{\mu C_{ox} \frac{W}{L}}} \tag{1.5}$$

Figure 1.12: CML buffer/inverter

Figure 1.13: Transfer characteristics of CML inverter/buffer

The differential output saturates when the total current $I_0$ passes through only one transistor; that is, when transistor M1 goes into saturation, transistor M2 must be in the cut-off region. So,

$$V_{M2} = V_T \tag{1.6}$$

$$V_{M1} = V_T + \sqrt{\frac{2I_0}{\mu C_{ox} \frac{W}{L}}} \tag{1.8}$$

For proper operation of logic gates, the gain must be greater than 1 [13]; that is,

$$\frac{V_{out}}{V_{in}} > 1 \tag{1.9}$$

$$\Rightarrow \frac{I_0 R}{\sqrt{\frac{2I_0}{\mu.C_{ox}\frac{W}{L}}}} > 1 \tag{1.10}$$

$$\Rightarrow I_0 R > \sqrt{\frac{2I_0}{\mu . C_{ox} \frac{W}{L}}} \tag{1.11}$$

If $\sqrt{2I_0/(\mu C_{ox}W/L)} > I_0R$, the differential output of one logic gate will not be able to drive the next logic gate when gates are cascaded.

For better performance, the gain $I_0R/\sqrt{2I_0/(\mu . C_{ox}W/L)}$ should be sufficiently greater than 1, which can be achieved by increasing $I_0$ (for a fixed $R$, the gain grows as $\sqrt{I_0}$) or by increasing the (W/L) ratio. $I_0$ cannot be increased freely, because the power consumption of the logic gate increases with it. Increasing the (W/L) ratio improves performance, but it cannot be raised to a large extent, because a larger (W/L) ratio also increases the input capacitance of the logic gate, which slows down the device.
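As a rough numerical illustration of conditions (1.9) to (1.11), the sketch below evaluates the term $\sqrt{2I_0/(\mu C_{ox}W/L)}$ and the ratio of $I_0R$ to that term for a few (W/L) values. The device values (`MU_COX`, `I0`, `R` and the list of W/L ratios) are hypothetical numbers chosen only for illustration; they are not taken from this design.

```python
import math

def switching_voltage(i0, mu_cox, w_over_l):
    """Input swing sqrt(2*I0 / (mu*Cox*W/L)) that steers all of I0 to one side."""
    return math.sqrt(2.0 * i0 / (mu_cox * w_over_l))

def cml_gain(i0, r, mu_cox, w_over_l):
    """Ratio of the output swing I0*R to the required input swing (Eq. 1.10)."""
    return i0 * r / switching_voltage(i0, mu_cox, w_over_l)

# Hypothetical values, for illustration only.
MU_COX = 200e-6   # A/V^2
I0 = 200e-6       # A
R = 2e3           # ohm

for w_over_l in (5, 20, 80):
    v_sw = switching_voltage(I0, MU_COX, w_over_l)
    gain = cml_gain(I0, R, MU_COX, w_over_l)
    print(f"W/L={w_over_l:3d}  sqrt-term={v_sw:.3f} V  I0*R={I0 * R:.3f} V  "
          f"gain={gain:.2f}  cascadable={gain > 1}")
```

With these assumed numbers, the smallest (W/L) fails the gain condition while the larger ones satisfy it, at the cost of the larger input capacitance discussed above.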

#### 1.4.3 Switching Condition of Inverter

For the inverter shown in Figure 1.12 to act as a switch, one of the transistors (either M1 or M2) should be in saturation while the other is in cut-off.

For M1 to be in saturation:

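$$V_{DS\_M1} \ge V_{GS\_M1} - V_{T\_M1} \tag{1.12}$$

Let VDD be applied to the gate of M1. With the full current $I_0$ flowing through M1, its drain is at $VDD - I_0R$, and the source voltage cancels from both sides of (1.12):

$$VDD - I_0 R \ge VDD - V_{T\_M1} \tag{1.13}$$

$$I_0 R \le V_T \tag{1.14}$$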



Figure 1.14: Design of CML buffer/ inverter using level shifter

$V_T$ must not be too low; otherwise, M1 goes into the triode region. To overcome the problem of low-$V_T$ transistors, level shifters are used. Figure 1.14 shows the CML buffer/inverter design using level shifters.

Let voltages of VDD and $VDD - I_0R$ be applied to transistors M3 and M4 respectively. The level shifter prevents transistor M1 from going into the triode region even if the threshold voltage of M1 is very low.

#### Condition for M1 to be in saturation:

$$V_{G\_M1} < V_{D\_M1} + V_{T\_M1}$$

$$\Rightarrow VDD - V_{DSAT\_M3} - V_{T\_M3} < VDD - I_0R + V_{T\_M1}$$

$$\Rightarrow I_0 R < V_{T\_M1} + V_{DSAT\_M3} + V_{T\_M3}$$

This design adds $V_{DSAT\_M3} + V_{T\_M3}$ to $V_{T\_M1}$ in the bound on $I_0R$, so the low-$V_{T\_M1}$ problem of the circuit in Figure 1.12 is eliminated.
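The two saturation bounds can be compared numerically. The following is a minimal sketch, assuming a hypothetical operating point; the voltages `SWING`, `VT_M1`, `VDSAT_M3` and `VT_M3` are illustrative values, not figures from this design.

```python
def saturates_without_shifter(i0_r, vt_m1):
    """Condition (1.14): I0*R <= V_T keeps M1 in saturation."""
    return i0_r <= vt_m1

def saturates_with_shifter(i0_r, vt_m1, vdsat_m3, vt_m3):
    """Level-shifted condition: I0*R < V_T_M1 + V_DSAT_M3 + V_T_M3."""
    return i0_r < vt_m1 + vdsat_m3 + vt_m3

# Hypothetical operating point in volts, for illustration only.
SWING = 0.40      # output swing I0 * R
VT_M1 = 0.30
VDSAT_M3 = 0.10
VT_M3 = 0.30

print("without level shifter:", saturates_without_shifter(SWING, VT_M1))
print("with level shifter:   ", saturates_with_shifter(SWING, VT_M1, VDSAT_M3, VT_M3))
```

Under these assumed values, a swing that would push M1 into the triode region in the plain buffer stays within the saturation bound once the level shifter is added.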

For transistor M0 to act as a current source, M0 must be in saturation.

### **1.5** Motivation

Nowadays, SerDes is a growing area of research and development. Various industries are focusing on the development of high-speed SerDes that combine low power with a high quality of service.

Recently Intel has demonstrated a general-purpose, 14 nm SerDes chip designed to reduce the size of its successful 22 nm SerDes offering by 40%, and also to cut power consumption by 20% compared to the 22 nm SerDes product.

Various start-up companies in the SerDes field work specifically on improving data rates and power consumption. In May 2016, Credo announced the availability of its 28G and 56G PAM-4 SerDes transceiver IP on TSMC's 16-nm FinFET Compact (16FFC) process.

Marvell Technology Group Ltd. (Santa Clara, Calif.) has licensed the Glasswing chip-to-chip SerDes technology from startup company Kandou Bus SA (Lausanne, Switzerland). The technology used by Kandou is capable of delivering 1Tbps bandwidth at less than 1 watt and is suitable for short chip-to-chip links inside and outside of packaging, according to Kandou.

SerDes meets the demand for next-generation high-speed data transfer, and it is also an emerging topic of research. Hence, I have chosen it as my research area.

### **1.6 Summary of Contributions**

The significant contributions of this thesis are as follows:

- The dependency of CMOS techniques on pMOS devices slows down the overall performance of a SerDes. Hence, CML logic is used to design the SerDes. The MUX-based serializer design is more sensitive to inter-symbol interference; hence, a DETFF-based serializer is used, which operates at higher speed.
- The conventional deserializer using D-FFs samples the incoming signal at every positive clock edge. Hence, 8 clock periods are required to de-serialize an 8-bit sequence. It is replaced by a deserializer using both positive and negative edge triggered D-FFs, which requires only 4 clock periods instead of 8.
- A new design of an ultra-low-power clock and data recovery circuit is proposed, which does not require power-hungry Phase Locked Loop (PLL) circuits. It also eliminates the requirement for equalizers.
- The implementation of the above-listed techniques is compared with both synchronous and asynchronous SerDes designs.

## 1.7 Organization of the Thesis

The thesis is organized as follows:

- Chapter 1 gives an introduction to SerDes (serializer and deserializer).
- Chapter 2 presents a literature survey of previously proposed SerDes designs.
- Chapter 3 discusses the research objectives and methodology.
- Chapter 4 explains the proposed design of the serializer and deserializer in detail.
- Chapter 5 explains the proposed design of the clock and data recovery circuit in detail.
- Chapter 6 presents the conclusions and discusses possible future work.

## Chapter 2

## **Literature Survey**

This chapter surveys the different types of SerDes in detail.

Jaiswal et al. [14] present an on-chip asynchronous wave-pipelined CML SerDes that uses delay elements and MUXes: bits propagate through the serial link via the delay elements, and new bits are loaded via the MUXes. The advantage of this design is that it does not require a clock generation circuit or a clock and data recovery circuit, which consume much power due to the presence of a PLL circuit. New bits are loaded via the MUX when the load input is high. After the loading process, the load input goes low, which enables data transmission through the delay element. The delay element enables wave propagation through the injected stage. The serialized signal is applied as an input to the deserializer, which should have a structure similar to that of the serializer; otherwise, the waves of the information signal would propagate differently, preventing correct sampling of the received signal. This SerDes is implemented in 65 nm technology, with a total power consumption of 14.3 mW and a data rate of 12.67 Gbps. Although an asynchronous SerDes transceiver does not require a power-hungry clock generation circuit, delay control is a crucial issue. Moreover, it needs to wait for the acknowledge signal, which slows down the device.

These disadvantages are eliminated by the synchronous high-speed CMOS/CML 16:1 serializer proposed by Tondo et al. [2]. Sixteen parallel bits are serialized using a 16:1 MUX, which in turn is built from 2:1 MUXes. The MUXes driven by the low-frequency clock signals are designed in CMOS to take advantage of its low power dissipation, while those driven by the high-frequency clock signals are designed in CML for faster switching. Tondo et al. [2] used 65 nm and 45 nm technologies to perform the data multiplexing; the total power consumption is 106 mW and 50 mW respectively, at a data rate of 10 Gbps in both cases. Tondo et al. [2] designed only the serializer using 2:1 MUXes. The same topology is used in Chen et al. [15] for both the serializer and the deserializer. The CDR in Chen et al. [15] requires a PLL circuit, which increases the overall power consumption of the SerDes chip. The PLL-based CDR designs in Harwood et al. [16] and Lee et al. [17] have a power consumption of 130 mW and 144 mW in 65 nm and 0.18 μm CMOS technology respectively.

A DETFF is more robust to inter-symbol interference (ISI) than a MUX. A MUX-based design is also noisy, because any noise in the enable signal easily disrupts the MUX output. Hence, unlike the MUX-based serializer in Chen et al. [15], a DETFF-based serializer is robust to noise and ISI. A serializer is designed using DETFFs in Safwat et al. [18], which presents a self-timed synchronous CMOS SerDes transceiver in TSMC 65 nm technology. A new signaling technique is presented in Safwat et al. [18] to overcome the power disadvantage of Chen et al. [15]. After serialization, a 3-level encoder circuit at the transmitter converts the 2-level signal into a 3-level encoded signal. The outstanding characteristic of this technique is that the clock and data signals are easily recovered using simple circuitry that requires very little power. The CDR does not require a PLL circuit. The three-level signaling technique is used for high-speed data transmission. This SerDes transceiver consumes a total power of 15.5 mW and operates at data rates up to 12 Gbps. In Hussein et al. [19], the same authors propose a new three-level encoder design to further increase the data rate to 16 Gbps, but in this case the total power consumption grows to 18.1 mW. The SerDes designs proposed in Hussein et al. [19] and Safwat et al. [18] use CMOS logic, and the dependency of CMOS on pMOS devices slows down the overall performance of the SerDes.

## **Chapter 3**

## **Research Objectives and Methodology**

## 3.1 Overview

In this chapter, the research objectives are defined and the methodology is given.

### **3.2 Research Objectives**

Synchronous SerDes design objectives are as follows:

- SerDes design capable of handling high data rates:
   A CML-based SerDes is proposed using double-edge triggered flip-flops.
- Low-power SerDes transceiver design:
   A combination of CML and CMOS techniques is used.
- Minimum chip area requirement.

### 3.3 Research Methodology

Simulations based on the Cadence EDA tools are used as the methodology for the proposed research work. The methodology is described in Figure 3.1.



Figure 3.1: The adopted research methodology for proposed work

## **Chapter 4**

# **Proposed Design of Synchronous Serializer and Deserializer**

In this chapter, the synchronous serializer and deserializer are designed using CML logic. The advantage of CML over CMOS is its independence from pMOS transistors. Hence, CML performs better than CMOS logic at higher speeds [20] [21]. The only disadvantage of CML is static power dissipation. However, proper selection of the drain resistance and biasing current reduces the power consumption to a significant extent.

The serializer and deserializer are designed using a CML-based DETFF and a CML-based D-FF respectively. A CMOS-to-CML converter converts the digital input signal into a differential signal, whereas a CML-to-CMOS converter converts the differential signal back into a digital signal [22] [14].

The channel used in this work is a 3 mm lossy differential transmission line [18], chosen to keep distortion low. The clock and data recovery circuit is designed using CMOS techniques.

Figure 4.1 shows the block diagram of proposed work.



Figure 4.1: Block diagram of proposed SerDes

### 4.1 Design of Serializer

Figure 4.2 shows the serializer design, which consists of a ring oscillator circuit, a frequency divider circuit [23] and DETFFs. The ring oscillator works as the clock generation circuit and generates a 10 GHz clock signal. The parallel bits are serialized using time division multiplexing (TDM) of 8 bits [24] [25] [26]. At the first stage, DETFFs double the input data rate, and at the second stage the data rate is doubled again. This process continues until the final stage, whose DETFF output is the desired serialized output. In the first stage, all four DETFFs are clocked at 2.5 GHz. The second-stage DETFFs are clocked at 5 GHz. The final-stage DETFF is clocked at 10 GHz and produces the serialized output. A frequency divider circuit derives the 5 GHz and 2.5 GHz clock signals by dividing the original 10 GHz clock by factors of 2 and 4 respectively.
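The three-stage structure can be illustrated with a purely behavioural sketch in which each DETFF is modelled as a 2:1 interleaver that emits one bit per clock edge. The function names are hypothetical, and the bit-reversed assignment of parallel inputs to the first-stage DETFFs is an assumption made here so that the bits leave in order; the actual wiring is the one defined by Figure 4.2.

```python
def detff_merge(first, second):
    """Interleave two equal-rate streams, one bit per clock edge."""
    out = []
    for a, b in zip(first, second):
        out.extend((a, b))
    return out

def bit_reversed_order(n):
    """Leaf order that makes a binary interleaving tree emit bits 0..n-1 in sequence."""
    width = n.bit_length() - 1
    return [int(format(i, f"0{width}b")[::-1], 2) for i in range(n)]

def serialize(word):
    """Behavioural power-of-two serializer built from 2:1 DETFF stages."""
    streams = [[word[i]] for i in bit_reversed_order(len(word))]
    while len(streams) > 1:
        streams = [detff_merge(streams[i], streams[i + 1])
                   for i in range(0, len(streams), 2)]
    return streams[0]

def stage_clocks_ghz(f_final_ghz, stages):
    """Clock of each stage, first stage first (final clock divided by 4, 2, 1 for 3 stages)."""
    return [f_final_ghz / 2 ** (stages - 1 - s) for s in range(stages)]

word = [1, 1, 0, 0, 0, 1, 0, 0]        # the 8-bit pattern 11000100
print(serialize(word))                  # serial bit order
print(stage_clocks_ghz(10.0, 3))        # clock per stage in GHz
```

Because each DETFF outputs a bit on both clock edges, the output bit rate of a stage in this model is twice its clock frequency.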



Figure 4.2: Design of serializer

In this work, a DETFF design is used that improves on the one proposed in [19] in both power and area. The DETFF in [19] consists of two positive edge triggered flip-flops, a negative edge triggered latch and a 2:1 MUX. The area requirement is decreased to a large extent by replacing that DETFF design with only two latches and a 2:1 MUX, as shown in Figure 4.3. Both positive and negative edge triggered latches are used, so that data is fetched on both the positive and negative edges of the clock. On the positive edge of the clock, D1 is selected as the output of the DETFF, and on the negative edge of the clock, D0 is selected.
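The two-latch-plus-MUX behaviour can be sketched at the logic level as follows. This is a minimal model, not the CML circuit: the class and function names are hypothetical, time advances in half clock periods, and the latch polarities (the D1 latch transparent while the clock is low, the D0 latch transparent while it is high) are an assumption consistent with D1 appearing at the output after a positive edge.

```python
class Latch:
    """Level-sensitive D latch: transparent while enabled, holds otherwise."""

    def __init__(self):
        self.q = 0

    def step(self, d, enable):
        if enable:
            self.q = d
        return self.q

def detff_run(clk, d0, d1):
    """Simulate the DETFF for per-half-period samples of clk, d0 and d1."""
    latch0, latch1 = Latch(), Latch()
    out = []
    for c, a0, a1 in zip(clk, d0, d1):
        q1 = latch1.step(a1, enable=(c == 0))   # tracks D1 while clock is low
        q0 = latch0.step(a0, enable=(c == 1))   # tracks D0 while clock is high
        out.append(q1 if c == 1 else q0)        # MUX: D1 path when high, D0 path when low
    return out

clk = [0, 1, 0, 1, 0, 1]
d1 = [1, 1, 0, 0, 1, 1]
d0 = [0, 0, 1, 1, 0, 0]
print(detff_run(clk, d0, d1))
```

After the first half period, which only shows the reset value of the D0 latch, the output alternates between the D1 value held at each rising edge and the D0 value held at each falling edge.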



Figure 4.3: Double-edge triggered flip-flop

The MUX inputs are controlled by the select line inputs $S_p$ and $S_n$. Only one differential pair is enabled by the select line input at a time. When $S_p$ is at level 0, input $I_0$ is selected as the output, and when $S_p$ is at level 1, input $I_1$ is selected as the output of the MUX. The circuit diagram of the MUX is shown in Figure 4.4.

The circuit diagram of the CML latch is shown in Figure 4.5. The CML latch output follows input A when the clock input S is enabled, i.e. at level 1. When S is at level 0, the latch holds its previous output. For proper latching, the transistors of the cross-coupled differential pair must be wider than those of the input differential pair.

Figure 4.6 shows the output waveform of the serializer. The 8 parallel bits 11000100 are applied to the inputs of the serializer, serialized, and transmitted through the serial link.

### 4.2 Design of Deserializer

Figure 4.7 shows the circuit diagram of the conventional deserializer using D flip-flops [27]. The D flip-flop used in [27] samples the incoming data at every positive edge of the clock. So, 8 clock periods are required to de-serialize 8 bits of serial data.



Figure 4.4: Design of CML MUX

In this work, the deserializer design consists of both positive edge triggered and negative edge triggered D flip-flops. The advantage of using both is that the serialized data is sampled at both the positive and negative edges of the clock. So, only 4 clock periods are required to de-serialize 8 bits of serial data. Hence, the deserializer used in this work is faster than the conventional deserializer [27].
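The saving in clock periods can be shown with a small behavioural sketch. The function names are hypothetical, and the assumption that the first serial bit is captured on a positive edge is made only for illustration.

```python
def deserialize_single_edge(serial):
    """Conventional scheme: one bit captured per clock period (positive edges only)."""
    return list(serial), len(serial)

def deserialize_dual_edge(serial):
    """Dual-edge scheme: positive-edge D-FFs take even bits, negative-edge D-FFs odd bits."""
    pos_bits = serial[0::2]          # captured on positive clock edges
    neg_bits = serial[1::2]          # captured on negative clock edges
    word = [None] * len(serial)
    word[0::2] = pos_bits
    word[1::2] = neg_bits
    return word, len(pos_bits)       # one clock period per positive edge used

serial = [1, 1, 0, 0, 0, 1, 0, 0]
print(deserialize_single_edge(serial))
print(deserialize_dual_edge(serial))
```

Both functions return the same parallel word; only the number of clock periods differs.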

Figure 4.8 shows the design of deserializer. The D-FF used in the design of deserializer is shown in Figure 4.9 whereas output waveform of deserializer is shown in Figure 4.10.

The power consumption of the different components of the SerDes is shown in Table 4.1. The total power consumption of this work is lower than that of the recent 65 nm designs compared in Table 4.2.

| Block | Component    | Power (mW) |
|-------|--------------|------------|
| Tx    | ADPLL        | 2.9        |
| Tx    | Serializer   | 2.61       |
| Rx    | Deserializer | 1.72       |
| Rx    | ADPLL        | 2.9        |

Table 4.1: Power consumption of different components of SerDes



Figure 4.5: CML D-latch



Figure 4.6: Output waveform of serializer



Figure 4.7: Conventional deserializer



Figure 4.8: Deserializer design







Figure 4.10: Output waveform of deserializer

Table 4.2: Comparison results

| Architecture                      | Technology(nm) | Speed(Gbps) | Power(mW) |
|-----------------------------------|----------------|-------------|-----------|
| Self - timed SerDes [19]          | 65             | 16          | 18.1      |
| All digital low power SerDes [18] | 65             | 12          | 15.5      |
| WP-CML SerDes [14]                | 65             | 12.67       | 14.3      |
| WP-CMOS SerDes [27]               | 180            | 3.9         | 2.44      |
| CMOS-CML SerDes [2]               | 45/65          | 10          | 50/106    |
| This work                         | 65             | 20          | 10.13     |

## **Chapter 5**

# **Proposed Design of Clock and Data Recovery Circuit**

A synchronous SerDes uses the clock as a control signal. The same clock is used for both serialization and de-serialization. An asynchronous SerDes, in contrast, does not require a clock as a control signal; it uses delay elements to convert the parallel bits into a serial stream.

In a synchronous SerDes, a clock generation circuit generates the clock. To make the same clock signal available at the receiver, a clock and data recovery circuit recovers the clock at the receiver end.

At low frequency, the channel between transmitter and receiver behaves as a simple RC interconnect, which introduces a low-pass effect. At high frequency, the inductance also becomes significant, and the signal propagates as a wave at a speed set by the inductance and capacitance of the line. The resistive component attenuates the transmitted signal. To nullify the effect of the channel at high frequency, a transmit equalizer is normally used.

In this work, a 3-level encoder-decoder circuit is used to replace the equalizers. The 3-level encoder converts a 2-level signal into a 3-level signal. The 3-level encoded signal is generated using the clock as a control signal.

The data rates of multilevel signals are defined as:

$$R = 2B\log_2 M \tag{5.1}$$

where,

$$R = \text{data rate} \tag{5.2}$$

$$B = \text{bandwidth of the channel} \tag{5.3}$$

$$M = \text{number of levels} \tag{5.4}$$

By (5.1), the data rate of 3-level signaling is $\log_2 3 \approx 1.58$ times that of 2-level signaling. Hence, the increased channel bandwidth utilization eliminates the need for equalizers [4].
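The factor of 1.58 follows directly from Equation (5.1), as the short check below shows. The function name is hypothetical, and the bandwidth value is arbitrary because it cancels in the ratio.

```python
import math

def data_rate(bandwidth_hz, levels):
    """Equation (5.1): R = 2 * B * log2(M)."""
    return 2.0 * bandwidth_hz * math.log2(levels)

B = 1.0e9  # arbitrary bandwidth in Hz; it cancels in the ratios below
ratio_3_to_2 = data_rate(B, 3) / data_rate(B, 2)
ratio_4_to_2 = data_rate(B, 4) / data_rate(B, 2)
print(f"3-level vs 2-level: {ratio_3_to_2:.3f}")
print(f"4-level vs 2-level: {ratio_4_to_2:.3f}")
```

The ratio depends only on $\log_2 M$, so each additional level gives a diminishing increase in data rate.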

At the receiver, the 3-level encoded signal is received and converted back into a 2-level signal using a phase detector, which reconstructs both the clock information and the serialized data signal. A CDR circuit using a PLL is power hungry. In comparison to a PLL-based CDR, a PLL-less CDR using a 3-level encoder-decoder consumes very little power [4] [5] [6].

This CDR design is insensitive to jitter incurred during signal propagation and at the receiver end, because the clock information is extracted from the encoded data stream. A timing error in the received signal therefore appears in both the data and the extracted clock, so the data is still sampled correctly. The main advantage of this CDR is very low power dissipation, as it does not require a PLL circuit. The encoder and decoder are designed using complementary CMOS logic to take advantage of its very low power dissipation.

In this work, the parallel bits 10101010 are serialized at the transmitter end. The serialized signal is a 2-level signal, which is converted into a 3-level signal by the 3-level encoder.

### 5.1 Design of 3-Level Encoder

The encoder consists of only two transmission gates, which are controlled by the transmitter clock. The encoding technique used by this circuitry maintains a DC level of VDD/2 that is independent of the data signal.

Figure 5.1 shows the design of the encoder.

Figure 5.2 shows the encoded output waveform based on the transmitter clock.

The truth table is shown in Table 5.1. From the truth table, it is evident that whenever the clock signal is at level '0', the encoded output is a DC value of VDD/2; otherwise, the encoded output follows the data. This signaling technique eliminates the need to send the clock through an extra wire or to use a conventional complex CDR circuit requiring a PLL. Also, only simple circuitry is needed at the receiver to reconstruct the clock.



Figure 5.1: Design of an encoder



Figure 5.2: Model graph of 3-level encoder

Table 5.1: Truth table of 3-level encoder

| Clock | Data | Encoded Output |
|-------|------|----------------|
| 1     | 0    | 0              |
| 0     | 0    | VDD/2          |
| 1     | 1    | VDD            |
| 0     | 1    | VDD/2          |
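The truth table in Table 5.1 can be expressed as a small behavioural function. This is a sketch of the encoding rule only, not of the transmission-gate circuit; the function names and the supply value `vdd=1.0` are hypothetical, and `encode_stream` assumes one clock period per bit with the clock-high half first.

```python
def encode_3level(clock, data, vdd=1.0):
    """Table 5.1: VDD/2 while the clock is 0, otherwise the data level (0 or VDD)."""
    if clock == 0:
        return vdd / 2.0
    return vdd if data == 1 else 0.0

def encode_stream(bits, vdd=1.0):
    """Encode a bit sequence, two half-period samples per bit (assumed timing)."""
    levels = []
    for bit in bits:
        levels.append(encode_3level(1, bit, vdd))   # clock high: output follows data
        levels.append(encode_3level(0, bit, vdd))   # clock low: output at VDD/2
    return levels

for clock, data in [(1, 0), (0, 0), (1, 1), (0, 1)]:
    print(clock, data, encode_3level(clock, data))
print(encode_stream([1, 0, 1, 0]))
```

In this model the encoded signal returns to VDD/2 in every clock period, which is what lets the receiver recover the clock from the data stream itself.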

### 5.2 Construction of Channel

At very high speed, the interconnect does not act like a standard RC line; the inductance effect also comes into the picture. The channel used between transmitter and receiver is a transmission line with a characteristic impedance of 64 ohms.

Figure 5.3: Three level encoder

To minimize distortion, a distortionless transmission line is used. In this work, the line is 3 mm long and lossy. The resistance of the interconnect attenuates the signal, which makes it difficult for the receiver to reliably recover the data sent by the transmitter. An FFE and a DFE are normally required to nullify the effect of the channel and of impairments such as reflection, crosstalk, and EMI at the receiver. The use of the 3-level encoder replaces the power-consuming equalizer circuit, which in turn results in a smaller area and lower power consumption [4].

At very high frequency, a major issue in signal transmission is the reflection of the incoming signal at both the source and load ends of the transmission line, so the line must be matched. Instead of matching the transmission line at both the source and load terminals, matching at the source end only is sufficient [18]. The receiver takes advantage of the signal reflection, which doubles the amplitude of the signal at the load end.

A transmission line with an attenuation constant of $\alpha = 0.313\ mm^{-1}$ is chosen. The characteristic impedance of the line is $Z_0 = 64$ ohms and the propagation speed is V = 37.6 mm/ns. The parasitic components of the distortionless transmission line are calculated using the equations [28] given below:

$$\alpha = R \sqrt{\frac{C}{L}} \tag{5.5}$$

$$Z_0 = \sqrt{\frac{L}{C}} \tag{5.6}$$

$$V = \sqrt{\frac{1}{LC}} \tag{5.7}$$

(5.8)

The condition for a distortionless channel is:

$$\frac{L}{R} = \frac{C}{G} \tag{5.9}$$
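Equations (5.5) to (5.9) can be solved for the per-unit-length parameters from the stated $\alpha$, $Z_0$ and V. The sketch below does this arithmetic; the function name is hypothetical, and the unit choices (ohm/mm, nH/mm, nF/mm, S/mm) follow from expressing V in mm/ns. The resulting numbers are simply what the equations yield for the stated inputs and are not quoted from the thesis.

```python
import math

def distortionless_line(alpha_per_mm, z0_ohm, v_mm_per_ns):
    """Solve Eqs. (5.5)-(5.9) for R, L, C, G per millimetre."""
    r = alpha_per_mm * z0_ohm             # ohm/mm, from alpha = R*sqrt(C/L) = R/Z0
    l = z0_ohm / v_mm_per_ns              # nH/mm, from Z0 = sqrt(L/C) and V = 1/sqrt(LC)
    c = 1.0 / (z0_ohm * v_mm_per_ns)      # nF/mm
    g = r * c / l                         # S/mm, from the condition L/R = C/G
    return r, l, c, g

ALPHA = 0.313     # 1/mm
Z0 = 64.0         # ohm
V = 37.6          # mm/ns
LENGTH_MM = 3.0

r, l, c, g = distortionless_line(ALPHA, Z0, V)
print(f"R = {r:.3f} ohm/mm, L = {l:.4f} nH/mm, C = {c * 1e3:.4f} pF/mm, G = {g:.5f} S/mm")
print(f"amplitude factor over {LENGTH_MM} mm: {math.exp(-ALPHA * LENGTH_MM):.3f}")
```

The last line evaluates $e^{-\alpha \ell}$ for the 3 mm line, which indicates how much of the launched amplitude survives the channel before any reflection at the load.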

### 5.3 3-Level Decoder

The decoder consists of inverters, low-threshold inverters, three-input AND gates, a two-input NAND gate and a J-K latch. The signal received from the differential transmission line is fed to the low-threshold inverters, which convert the three-level signal into the two-level signals A and B.



Figure 5.4: Decoder design

### 5.4 Clock Recovery Circuit

The characteristics of two level signals A and B will determine the logical expression for clock recovery circuit. The clock recovery circuit must generate the same clock signal which was used during serialization. Figure 5.5 shows the output of low threshold inverter A and B.



Figure 5.5: Desired output waveform of clock recovery circuit

From the truth table shown in Table 5.2, the expression for clock recovery circuit is calculated. A simple OR gate is used for reconstruction of the clock from signals A and B.

| A | B | Clk out     |
|---|---|-------------|
| 1 | 0 | 1           |
| 0 | 0 | 0           |
| 0 | 1 | 1           |
| 1 | 1 | Do not care |

Table 5.2: Truth table of clock recovery circuit
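That an OR gate satisfies Table 5.2 can be checked row by row. The sketch below is a logic-level check only; the function name is hypothetical, and the don't-care row (A = 1, B = 1) is left out of the comparison.

```python
def recover_clock(a, b):
    """Clock reconstruction from the low-threshold inverter outputs: Clk = A OR B."""
    return a | b

# Defined rows of Table 5.2: (A, B) -> Clk out
TABLE_5_2 = {(1, 0): 1, (0, 0): 0, (0, 1): 1}

for (a, b), expected in TABLE_5_2.items():
    print(a, b, recover_clock(a, b), recover_clock(a, b) == expected)
```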

### 5.5 Data Recovery Circuit

The data recovery circuit should reliably reconstruct the serialized signal at the receiver side from the 3-level encoded signal. The same low-threshold inverters are used to convert the three-level signal into two levels. Based on the output signals A and B of the low-threshold inverters and the desired output of the data recovery circuit shown in Figure 5.6, a characteristic table is formed and compared with the excitation table of the J-K latch; the result is given in Table 5.3.

Figure 5.6: Desired output of data recovery circuit based on A and B.

| A | B | $Q_{n-1}$ | $Q_n$ | J | K |
|---|---|-----------|-------|---|---|
| 1 | 0 | 1         | 0     | × | 1 |
| 0 | 0 | 0         | 0     | 0 | × |
| 0 | 1 | 0         | 1     | 1 | × |
| 0 | 0 | 1         | 1     | × | 0 |
| 1 | 0 | 1         | 0     | × | 1 |
| 0 | 0 | 0         | 0     | 0 | × |

Table 5.3: Truth table of data recovery circuit

Table 5.4: Truth table for J

From the truth table shown in Table 5.4, the expression for J is calculated as:

$$J = A'BQ' \tag{5.10}$$

Table 5.5: Truth table for K

From the truth table shown in Table 5.5, the expression for K is:

$$K = AB'Q \tag{5.11}$$

Tables 5.4 and 5.5 provide the expressions for the inputs J and K of the J-K latch. Hence, the data recovery circuit requires only simple circuitry consisting of a J-K latch, AND gates and inverters.
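Equations (5.10) and (5.11) can be checked against Table 5.3 with a logic-level model of the J-K latch. The sketch below uses the standard J-K next-state relation $Q_n = JQ' + K'Q$; the function names and the initial state `q0` are hypothetical.

```python
def jk_inputs(a, b, q):
    """Equations (5.10) and (5.11): J = A'BQ', K = AB'Q."""
    j = (1 - a) & b & (1 - q)
    k = a & (1 - b) & q
    return j, k

def jk_next(j, k, q):
    """Standard J-K next state: Q_n = J*Q' + K'*Q."""
    return (j & (1 - q)) | ((1 - k) & q)

def recover_data(a_seq, b_seq, q0=0):
    """Run the data recovery logic over sequences of A and B samples."""
    q, out = q0, []
    for a, b in zip(a_seq, b_seq):
        j, k = jk_inputs(a, b, q)
        q = jk_next(j, k, q)
        out.append(q)
    return out

# Distinct rows of Table 5.3: (A, B, Q_{n-1}, Q_n)
TABLE_5_3 = [(1, 0, 1, 0), (0, 0, 0, 0), (0, 1, 0, 1), (0, 0, 1, 1)]
for a, b, q_prev, q_next in TABLE_5_3:
    j, k = jk_inputs(a, b, q_prev)
    print(a, b, q_prev, jk_next(j, k, q_prev) == q_next)

print(recover_data([0, 0, 1, 0], [1, 0, 0, 0]))
```

In this model the output is set when B alone is high, cleared when A alone is high, and held when both are low, which is the behaviour listed in Table 5.3.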



Figure 5.7: Clock and data recovery circuit

Figure 5.7 shows the circuit diagram of clock and data recovery circuit.

Figure 5.8 shows the output waveform of the clock and data recovery circuit. In this figure, the first waveform corresponds to the output waveform of the data recovery circuit whereas the second one corresponds to the output waveform of the clock recovery circuit. The third waveform corresponds to the waveform of the encoded data, and the fourth one is its inverted waveform. Similarly, the fifth and sixth waveforms indicate the waveforms of inverted serialized signal and serialized signal respectively. Finally, the last one corresponds to the clock signal.



Figure 5.8: Output waveform of clock and data recovery circuit

The power consumption of the different components of the SerDes is shown in Table 5.6. The total power consumption of this work is compared with that of recent works in Table 5.7.

| Block | Component      | Power   |
|-------|----------------|---------|
| Tx    | Serializer     | 2.06 mW |
| Tx    | Encoder        | 10 µW   |
| Tx    | ADPLL          | 2.9 mW  |
| Rx    | Phase Detector | 0.24 mW |
| Rx    | Deserializer   | 0.5 mW  |

Table 5.6: Power consumption

Table 5.7: Design comparison

| Architecture                      | Technology(nm) | Speed(Gbps) | Power(mW) |
|-----------------------------------|----------------|-------------|-----------|
| All digital low power SerDes [18] | 65             | 12          | 15.5      |
| WP-CML SerDes [14]                | 65             | 12.67       | 14.3      |
| WP-CMOS SerDes [27]               | 180            | 3.9         | 2.44      |
| CMOS-CML SerDes [2]               | 45/65          | 10          | 50/106    |
| This Work                         | 65             | 13          | 5.94      |

Figure 5.9 shows the eye diagram of the extracted data. The vertical and horizontal eye openings are 1.03 V and 1 UI respectively.




Figure 5.9: Eye diagram of extracted data

Figure 5.10 shows the eye diagram of the extracted clock. The vertical and horizontal eye openings are 1.1 V and 0.98 UI respectively.



Figure 5.10: Eye diagram of extracted clock

## Chapter 6

## **Conclusion and Future Works**

### 6.1 Conclusion

Achieving high data rates is a challenging task. Flip-flops built in CMOS logic are rather slow in a SerDes, because CMOS logic depends on pMOS devices. The nMOS device is about 2.7 times faster than the pMOS device. Hence, to achieve the same $\beta$ for both transistors, the width of the pMOS devices is increased, which increases the area of the chip. Moreover, it also increases the input capacitance at the converging node, which further slows down the device. In this work, the CML technique is used to design a high-speed SerDes. Although CML suffers from static power dissipation, the overall power dissipation can be minimized by proper selection of the drain resistance and biasing current.

A MUX-based design is more sensitive to inter-symbol interference than a DETFF-based one. In this work, an improved DETFF design is used for serialization of bits using the CML technique. The proposed design has an advantage in both power and area over recent work. The deserializer consists of both positive edge triggered and negative edge triggered D flip-flops. The advantage of using both is that the serialized data is sampled at both the positive and negative edges of the clock, so only 4 clock periods are required to de-serialize 8 bits of serial data. Hence, the deserializer used in this work is faster than the conventional deserializer, which samples the signal only at positive edges. The proposed SerDes works at data rates up to 20 Gbps with a power consumption of 10.13 mW.

The PLL used in a conventional clock and data recovery circuit makes the circuit power hungry. A PLL-less clock and data recovery circuit is designed using a 3-level encoder-decoder. The proposed circuit consumes very little power. Also, as it converts the 2-level signal into a 3-level signal, which has a higher data rate, the need for equalizers is eliminated. The proposed SerDes consumes only 5.94 mW at a data rate of 13 Gbps.

## 6.2 Future Works

Increasing the number of signal levels increases the data rate. By (5.1), the data rate grows as $\log_2 M$, so going from 2 to 4 levels doubles the data rate. Hence, higher-order multilevel signaling techniques can be used in the design of the encoder and decoder, which can eliminate the need for an equalizer even for a highly lossy channel.

Dynamic CML logic has an advantage over conventional CML logic in both speed and power dissipation. Its use would improve the performance of the SerDes.

## **Bibliography**

- [1] D. R. Stauffer, J. T. Mechler, M. A. Sorna, K. Dramstad, C. R. Ogilvie, A. Mohammad, and J. D. Rockrohr, *High speed SerDes devices and applications*. Springer Science & Business Media, 2008.
- [2] D. F. Tondo and R. R. Lopez, "A low-power, high-speed CMOS/CML 16:1 serializer," in *Micro-Nanoelectronics, Technology and Applications, 2009. EAMTA 2009. Argentine School* of, Oct 2009, pp. 81–86.
- [3] W. M. Arden, "The international technology roadmap for semiconductors - perspectives and challenges for the next 15 years," *Current Opinion in Solid State and Materials Science*, vol. 6, no. 5, pp. 371–377, 2002.
- [4] T. Geurts, W. Rens, J. Crols, S. Kashiwakura, and Y. Segawa, "A 2.5 Gbps 3.125 Gbps multi-core serial-link transceiver in 0.13 μm CMOS," in *Solid-State Circuits Conference*, 2004. ESSCIRC 2004. Proceeding of the 30th European, Sept 2004, pp. 487–490.
- [5] J. Park, J. Kang, S. Park, and M. P. Flynn, "A 9-gbit/s serial transceiver for on-chip global signaling over lossy transmission lines," *IEEE Transactions on Circuits and Systems I: Regular Papers*, vol. 56, no. 8, pp. 1807–1817, Aug 2009.
- [6] M. Harwood, N. Warke, R. Simpson, T. Leslie, A. Amerasekera, S. Batty, D. Colman, E. Carr, V. Gopinathan, S. Hubbins, P. Hunt, A. Joy, P. Khandelwal, B. Killips, T. Krause, S. Lytollis, A. Pickering, M. Saxton, D. Sebastio, G. Swanson, A. Szczepanek, T. Ward, J. Williams, R. Williams, and T. Willwerth, "A 12.5gb/s SerDes in 65nm CMOS using a baud-rate adc with digital receiver equalization and clock recovery," in *2007 IEEE International Solid-State Circuits Conference. Digest of Technical Papers*, Feb 2007, pp. 436–591.
- [7] R. R. Dobkin, A. Morgenshtein, A. Kolodny, and R. Ginosar, "Parallel vs. serial on-chip communication," in *Proceedings of the 2008 international workshop on System level interconnect prediction.* ACM, 2008, pp. 43–50.

- [8] A. X. Widmer and P. A. Franaszek, "A dc-balanced, partitioned-block, 8b/10b transmission code," *IBM Journal of research and development*, vol. 27, no. 5, pp. 440–451, 1983.
- [9] D. Lewis et al., "Designcon 2004 SerDes architectures and applications," 2004.
- [10] K. Nguyen, X. Wang, I. W. Kim, C. Sung, R. G. Cliff, J. Huang, B. I. Wang, and W. Yeung, "Programmable logic integrated circuit devices with low voltage differential signaling capabilities," May 22 2001, uS Patent 6,236,231.
- [11] B. Razavi, "Prospects of CMOS technology for high-speed optical communication circuits," *IEEE Journal of Solid-State Circuits*, vol. 37, no. 9, pp. 1135–1145, 2002.
- [12] M. Mizuno, M. Yamashina, K. Furuta, H. Igura, H. Abiko, K. Okabe, A. Ono, and H. Yamada, "A GHz mos adaptive pipeline technique using mos current-mode logic," *IEEE Journal of Solid-State Circuits*, vol. 31, no. 6, pp. 784–791, 1996.
- [13] M. M. Green, "CMOS design techniques for 10 Gb/s optical transceivers," in VLSI Technology, Systems, and Applications, 2003 International Symposium on. IEEE, 2003, pp. 209–212.
- [14] A. Jaiswal, D. walk, Y. Fang, and K. Hofmann, "Low-power high-speed on-chip asynchronous wave-pipelined CML SerDes," in 2014 27th IEEE International System-on-Chip Conference (SOCC), Sept 2014, pp. 5–10.
- [15] F.-T. Chen, J.-M. Wu, and M.-C. F. Chang, "40-Gb/s 0.7-V 2:1 MUX and 1:2 DEMUX with transformer-coupled technique for SerDes interface," *IEEE Transactions on Circuits and Systems I: Regular Papers*, vol. 62, no. 4, pp. 1042–1051, 2015.
- [16] M. Harwood, N. Warke, R. Simpson, T. Leslie, A. Amerasekera, S. Batty, D. Colman, E. Carr, V. Gopinathan, S. Hubbins *et al.*, "A 12.5 Gb/s SerDes in 65nm CMOS using a baud-rate ADC with digital receiver equalization and clock recovery," in 2007 IEEE International Solid-State Circuits Conference. Digest of Technical Papers. IEEE, 2007, pp. 436–591.
- [17] J. Lee and B. Razavi, "A 40-Gb/s clock and data recovery circuit in 0.18-μm CMOS technology," *IEEE Journal of Solid-State Circuits*, vol. 38, no. 12, pp. 2181–2190, 2003.
- [18] S. Safwat, E. E. D. Hussein, M. Ghoneima, and Y. Ismail, "A 12 Gbps all digital low power SerDes transceiver for on-chip networking," in 2011 IEEE International Symposium of Circuits and Systems (ISCAS), May 2011, pp. 1419–1422.

- [19] E. E. D. Hussein, S. Safwat, M. Ghoneima, and Y. Ismail, "A 16 Gbps low power self-timed SerDes transceiver for multi-core communication," in 2012 IEEE International Symposium on Circuits and Systems, May 2012, pp. 1660–1663.
- [20] J. Cao, M. Green, A. Momtaz, K. Vakilian, D. Chung, K.-C. Jen, M. Caresosa, X. Wang, W.-G. Tan, Y. Cai *et al.*, "OC-192 transmitter and receiver in standard 0.18-μm CMOS," *IEEE Journal of Solid-State Circuits*, vol. 37, no. 12, pp. 1768–1780, 2002.
- [21] M. M. Green and U. Singh, "Design of CMOS CML circuits for high-speed broadband communications," in *Circuits and Systems*, 2003. ISCAS'03. Proceedings of the 2003 International Symposium on, vol. 2. IEEE, 2003, pp. II–204.
- [22] D. R. Holberg and P. E. Allen, "CMOS analog circuit design," http://www.cicmaa.com, 2002.
- [23] U. Singh and M. M. Green, "High-frequency CML clock dividers in 0.13-μm CMOS operating up to 38 GHz," *IEEE Journal of Solid-State Circuits*, vol. 40, no. 8, pp. 1658–1661, 2005.
- [24] J. H. Shim, S. Byun, J. C. Lee, K. Kim, and C. S. Kim, "A low-power 10-Gb/s 0.13-μm CMOS transmitter for OC-192/STM-64 applications," in 2007 50th Midwest Symposium on Circuits and Systems. IEEE, 2007, pp. 1165–1168.
- [25] K. Ishii, H. Nakajima, H. Nosaka, M. Ida, K. Kurishima, S. Yamahata, T. Enoki, and T. Shibata, "Over 40 Gbit/s 16:1 multiplexer IC using InP/InGaAs HBT technology," *Electronics Letters*, vol. 39, no. 12, pp. 911–913, 2003.
- [26] F. Znidarsic, E. Mullner, and R. Strunz, "16:1 retiming multiplexer for 10 Gbit/s in Si production technology," *Electronics Letters*, vol. 32, no. 3, pp. 207–209, 1996.
- [27] B. C. Hien, S.-M. Kim, and K. Cho, "Design of a wave-pipelined serializer-deserializer with an asynchronous protocol for high speed interfaces," in *Quality Electronic Design (ASQED)*, 2012 4th Asia Symposium on, July 2012, pp. 265–268.
- [28] M. N. Sadiku, *Elements of Electromagnetics*. Oxford University Press, New York, 2001, vol. 428.