

# **Ultra-low Power SRAM Design in Nanoscale CMOS and Multigate FinFET Technologies**

**Ph.D. Thesis**

by

**CHANDRABHAN KUSHWAH**



**DISCIPLINE OF ELECTRICAL ENGINEERING  
INDIAN INSTITUTE OF TECHNOLOGY INDORE  
FEBRUARY 2015**

# Ultra-low Power SRAM Design in Nanoscale CMOS and Multigate FinFET Technologies

A THESIS

*Submitted in partial fulfillment of the  
requirements for the award of the degree  
of*

**DOCTOR OF PHILOSOPHY**

*by*

**CHANDRABHAN KUSHWAH**



**DISCIPLINE OF ELECTRICAL ENGINEERING  
INDIAN INSTITUTE OF TECHNOLOGY INDORE  
FEBRUARY 2015**



# INDIAN INSTITUTE OF TECHNOLOGY INDORE

## CANDIDATE'S DECLARATION

I hereby certify that the work which is being presented in the thesis entitled **Ultra-low Power SRAM Design in Nanoscale CMOS and Multigate FinFET Technologies** in the partial fulfillment of the requirements for the award of the degree of **DOCTOR OF PHILOSOPHY** and submitted in the **DISCIPLINE OF ELECTRICAL ENGINEERING, Indian Institute of Technology Indore**, is an authentic record of my own work carried out during the time period from January 2011 to December 2014 under the supervision of Dr. S. K. Vishvakarma, Assistant Professor, Electrical Engineering, IIT Indore, M. P., India and Dr. Devesh Dwivedi, Senior Manager, HSS Links, Analog Mixed Signal & Memory Development, System & Technology Group, IBM, Bangalore, Karnataka and Adjunct Faculty, IIT Indore, India.

The matter presented in this thesis has not been submitted by me for the award of any other degree of this or any other institute.

**Signature of the student with date  
(CHANDRABHAN KUSHWAH)**

---

This is to certify that the above statement made by the candidate is correct to the best of our knowledge.

Signature of Thesis Supervisor #1 with date

**(DR. S. K. VISHVAKARMA, IIT INDORE)**

Signature of Thesis Supervisor #2 with date

**(DR. DEVESH DWIVEDI, IBM)**

---

**CHANDRABHAN KUSHWAH** has successfully given his **Ph.D. Oral Examination** held on **22<sup>nd</sup> June 2015**.

Signature(s) of Thesis Supervisor(s)

Date:

Convener, DPGC

Date:

Signature of PSPC Member #1

Date:

Signature of PSPC Member #2

Date:

Signature of External Examiner

Date:

## ACKNOWLEDGEMENTS

I am blessed to have had the best supervisor at IIT Indore. I would like to gratefully acknowledge the enthusiastic supervision of my supervisor, Dr. S. K. Vishvakarma. Life during the PhD program was very challenging, interesting and enjoyable.

I am fortunate to have had the parallel supervision of Dr. Devesh Dwivedi. I am deeply indebted to him for his constant encouragement and support throughout my PhD and IBM internship. His vast technical expertise and insight have given me an excellent background in the field of memory design.

My sincere thanks go to Dr. Amod C. Umarikar and Dr. Anil Kumar Emadabathuni for their guidance and suggestions on the progress of my research work. I am also thankful to Prof. Pradeep Mathur, Director, Indian Institute of Technology Indore, for his motivation and support.

There are many people whom I would like to thank individually. I consider them my friends, advisers and motivators. They are Shruti Verma, Mahendra Sakere, Dheeraj Sharma, Pooja Jain, Kesav Patidar, Vikas Vijayvergiya, Bhupendra Singh Reniwal and Pooran Singh Bisht.

I am grateful to Prof. Sudeb Das Gupta (IIT Roorkee) for teaching SRAM operation, Prof. Saibal Mukhopadhyay (Georgia Institute of Technology) for helping me choose sub-threshold regime operation, Saurabh P. Sinha (ARM Inc.) and Yu Cao (Arizona State University) for guiding me to a suitable FinFET model, and Cadence support personnel for their support.

I would like to extend my gratitude to Prof. N. K. Jain for his liberal support on every academic matter.

I am thankful to every B. Tech. student I met, and to the academic department, accounts department, transportation department, library staff and security staff for their obliging support.

I am highly grateful to Mr. Krishnan S. Rengarajan, Sreenivasula R. Dhani Reddy and Sushma N. Sambatur for their guidance and valuable feedback throughout my work at IBM, Bangalore.

I take this opportunity to express my gratitude to Amit, Arjun, Binu, Bringi, Chaitanya, Dinesh, Eswara, Komal, Navin, Pankaj, Ravi, Shiju, Sathisha, Sreejith, Vinay, who have been instrumental in the successful completion of this research work.

Finally, I thank my parents and family for their encouragement and guidance. I am grateful to them for their complete love and support.

I am eternally grateful to my wife, Shruti, for putting all the needed effort into this thesis. She is the reason behind this accomplishment and I thank her for her care and support.

*Dedicated to my family, whom I love the most.*

## ABSTRACT

Embedded SRAMs are a critical component in modern digital systems, and their role is steadily increasing. Highly energy-constrained systems (e.g. implantable biomedical devices, multimedia handsets and wearable devices) are an important class of applications driving ultra-low-power SRAMs. As a result, SRAMs strongly impact the overall power, performance, and area, and, in order to manage these severely constrained trade-offs, they must be specially designed for target applications.

A novel 8-transistor (8T) static random access memory cell with improved data stability in sub-threshold operation is designed. The proposed single-ended 8T SRAM cell with dynamic feedback control enhances the static noise margin (SNM) at ultra-low supply voltage. Its write SNM is 1.4x and 1.28x that of the iso-area 6T and read-decoupled 8T (RD-8T) cells, respectively, at a 300mV power supply. The standard deviation of the write SNM for the 8T cell is reduced to 0.4x and 0.56x that of 6T and RD-8T, respectively. Its read SNM is ~2.33x, 1.23x and 0.89x that of 5T, 6T and RD-8T, respectively, and its hold SNM is 1.43x, 1.23x and 1.05x that of 5T, 6T and RD-8T, respectively. The write time is 71% less than that of the single-ended asymmetrical 8T cell. The proposed 8T consumes write power of 0.72x, 0.6x and 0.85x that of 5T, 6T and iso-area RD-8T, respectively. The read power is 0.49x that of 5T, 0.48x that of 6T and 0.64x that of RD-8T. The power/energy consumption of a 1kb 8T SRAM array during read and write operations is 0.43x and 0.34x, respectively, that of a 1kb 6T array. These features enable ultra-low power applications of the proposed 8T cell.

A novel single-ended boost-less (SE-BL) 7T static random access memory (SRAM) cell with high write-ability and reduced read failure is proposed. The proposed 7T cell utilizes dynamic feedback cutting (DFC) during write and read operations. It also uses dynamic read decoupling during a read operation to reduce read disturb. The proposed 7T writes “1” through one NMOS and writes “0” using two NMOS pass transistors. At 300mV, the 7T has a mean ($\mu$) write trip point (WTP) of 222.3mV (74.1% of the supply voltage), whereas 5T fails to write “1”. Its mean ($\mu$) read margin is 276mV (92% of the supply voltage), while 5T fails due to read disturb at 300mV. The hold static noise margin of 7T is maintained close to that of 5T. Compared with 5T, the 7T has 22.5% lower read delay and saves 10.8% of read power consumption. Compared with the conventional 6T, it saves 36.9% of read and 50% of write power consumption. The techniques used by the proposed 7T SRAM cell allow it to operate at an ultra-low voltage (ULV) supply without any write assist in the UMC 90nm technology node.

A 20nm FinFET-based 7T SRAM cell is presented. For the proposed 7T, the mean-to-standard-deviation ratio ($\mu/\sigma$) of the hold static noise margin is 6.3% higher than that of the conventional iso-area 5T at 0.2V VDD. The 7T has a 28.55% higher $\mu/\sigma$ of read margin than 5T at 0.4V VDD. The write static noise margin of 7T is $\sim$50% of VDD for all VDD values, whereas 5T fails to write. During write ‘0’, the proposed cell consumes only 0.11x the power of 5T at 0.8V VDD. The read operation of 7T consumes 0.34x lower power than that of 5T for all values of bit-line capacitance at 0.2V VDD. At 0.2V VDD, the 7T has 0.46x lower write ‘0’ delay than 5T. The write delay of 7T is 0.32x lower than that of 5T at 0.8V VDD.

A novel differential 8T SRAM cell is proposed. This novel 8T structure results in 6% higher HSNM and 66% higher WSNM compared to a conventional 6T cell. The proposed 8T cell allows 29% faster write operation compared to 6T with 20% lower leakage power. The proposed 8T cell can sustain 7 sigma variations in process parameters in 14nm FinFET technology.

A single-ended 10T SRAM cell is also proposed, in which voltage scaling and read port decoupling techniques are used. Minimum-sized transistors are used to reduce the area overhead of the 10T configuration. With its new circuit topology, the proposed 10T cell has an RSNM 4.95x that of 6T and a 6.42% higher HSNM than 6T; its write power is reduced by 50% and its read power by 35%. Moreover, at a 300mV power supply, the conventional 6T shows write failure while the 10T gives a write trip point of 142mV.

We extend our discussion and present results on the advantages of using charge sharing to increase the sensing speed using a single-ended read configuration. The charge sharing scheme shows 6.8% to 9.7% performance improvement over the conventional sensing scheme.

All these schemes, topologies and analyses can be helpful to design ultra-low power SRAMs that can be useful for implantable biomedical devices, multimedia handsets, mobile phones etc.

# LIST OF PATENTS AND PUBLICATIONS

## PATENTS

1. C. B. Kushwah, D. Dwivedi and Sathisha N., “8T Based SRAM Cell and Related Method”, *U. S. A., IBM docket no. IN920130218US1*, Filed April 2013, **Patent Pending**.
2. C. B. Kushwah, S. K. Vishvakarma, “P-N Tuned Differential 8T SRAM Cell”, Disclosure submitted at IIT Indore on 28<sup>th</sup> January 2015 for *Indian Patent*. **Accepted for Filing**.

## JOURNALS

1. C. B. Kushwah, S. K. Vishvakarma, “A Single-Ended with Dynamic Feedback Control 8T Sub-Threshold SRAM Cell”, *IEEE Trans. Very Large Scale Integr. (VLSI) Syst.*, Issue: 99, January 2015, DOI:[10.1109/TVLSI.2015.2389891](https://doi.org/10.1109/TVLSI.2015.2389891). **Available Online**
2. C. B. Kushwah, S. K. Vishvakarma and D. Dwivedi, “Single-Ended Boost-Less (SE-BL) 7T Process Tolerant SRAM Cell Design in Sub-Threshold Regime for Ultra-Low Power Applications”, *Circuits, Systems & Signal Processing*, Springer, May 2015, DOI:[10.1007/s00034-015-0086-5](https://doi.org/10.1007/s00034-015-0086-5). **Available Online**
3. C. B. Kushwah, S. K. Vishvakarma and D. Dwivedi, “A boost-less write optimized single ended robust 7T SRAM cell for ultra-low power memory design”, *International Journal of Electronics Letters*, Taylor & Francis, May 2015, **Accepted**.
4. C. B. Kushwah, S. K. Vishvakarma and D. Dwivedi, “A 20nm Robust Single-Ended Boost-Less 7T FinFET Sub-threshold SRAM Cell under Process-Voltage-Temperature Variations”, *Microelectronics Journal*, Elsevier, **Under Review**.

## CONFERENCES

1. C. B. Kushwah, S. K. Vishvakarma and D. Dwivedi, “Stability Analysis of Single-Ended Boostless Sub-threshold 7T FinFET SRAM Cell under Process-Voltage-Temperature Variations,” in *Proc. IEEE International Symposium on Quality Electronic Design (ISQED)-2015*, Santa Clara, California, USA, pp. 18-22, March 2015.
2. C. B. Kushwah, S. K. Vishvakarma and D. Dwivedi, “Single-ended sub-threshold FinFET 7T SRAM cell without boosted supply,” in *Proc. IEEE International Conference on IC Design & Technology (ICICDT)-2014*, AMD, Austin, Texas, USA, pp. 1-4, May 2014.
3. C. B. Kushwah and S. K. Vishvakarma, “A Sub-threshold Eight Transistor (8T) SRAM Cell Design for Stability Improvement,” in *Proc. IEEE International Conference on IC Design & Technology (ICICDT)-2014*, AMD, Austin, Texas, USA, pp. 1-4, May 2014.
4. C. B. Kushwah, Devesh Dwivedi, Sathisha N, Krishnan S Rengarajan, “Fully Differential 8T SRAM Cell for Low Voltage Applications with Improved Stability,” in *Proc. India Circuit Layout Design Conference (ICLDC)-2013*, IBM, Bangalore, India, November 2013.
5. C. B. Kushwah and S. K. Vishvakarma, “Subthreshold 8T SRAM Cell Immune to Process Variations at ULV Supply,” in *Proc. Electronic Devices and Solid State Circuits (EDSSC)-2013*, Hong Kong, China, May 2013.
6. C. B. Kushwah and S. K. Vishvakarma, “Ultra-Low Power Sub-threshold SRAM Cell Design to Improve Read Static Noise Margin”, *Lecture Notes in Computer Science, Springer*, vol. 7373, pp. 139-146, July 2012.

## TABLE OF CONTENTS

- **LIST OF FIGURES**
- **LIST OF TABLES**
- **ACRONYMS**
- **Chapter 1: Introduction**
  - **1.1 Ultra-low Power SRAM**
    - 1.1.1 Sub-threshold Regime
    - 1.1.2 SRAM in Sub-threshold Regime
  - **1.2 Conventional 6T SRAM Cell: In ultra-low voltage**
    - 1.2.1 Conventional 6T SRAM Cell: Process Variation in ultra-low voltage
    - 1.2.2 Inter-die: global variation
    - 1.2.3 Intra-die: local variation
  - **1.3 Solutions from the Literature**
    - 1.3.1 Circuit Design Approach
    - 1.3.2 Use of multigate MOSFETs
  - **1.4 Current Status of Conventional 6T and 8T SRAM Cells**
    - 1.4.1 Current status of FinFET based Conventional 6T and 8T SRAM cells
    - 1.4.2 Best Performance Reported
    - 1.4.3 Challenges for SRAM Designers
    - 1.4.4 Solutions Suggested by Our Thesis
  - **1.5 Motivation and Problem Statement**
    - 1.5.1 Cascading in Both Inverters
    - 1.5.2 Cascading in One Inverter
  - **1.6 Thesis Contribution**
  - **1.7 Organization of Thesis**
- **Chapter 2: A Single-Ended with Dynamic Feedback Control 8T Sub-threshold SRAM Design**
  - **2.1 Proposed 8T SRAM Cell Design**
    - 2.1.1 Write Operation
    - 2.1.2 Read Operation
    - 2.1.3 Half-selected Issue
    - 2.1.4 Control Signal Generation
    - 2.1.5 Cell Layout
  - **2.2 Simulation & Analysis**
    - 2.2.1 Write Static Noise Margin (WSNM)
    - 2.2.2 Read Static Noise Margin (RSNM)
    - 2.2.3 Hold Static Noise Margin (HSNM)
    - 2.2.4 Write Time
    - 2.2.5 Read Time
    - 2.2.6 Write Power
    - 2.2.7 Read Power
  - **2.3 Comparison and Discussion**
    - 2.3.1 Write Static Noise Margin (WSNM)
    - 2.3.2 Read Static Noise Margin (RSNM)
    - 2.3.3 Hold Static Noise Margin (HSNM)
    - 2.3.4 Write and Read Time
    - 2.3.5 Write and Read Power Consumption
    - 2.3.6 Array Design
  - **2.4 Comparison Summary**
  - **2.5 Chapter Summary**
- **Chapter 3: Single-Ended Boost-Less (SE-BL) 7T Process Tolerant SRAM Design in Sub-Threshold Regime for Ultra-Low Power Applications**
  - **3.1 Proposed 7T SRAM Cell: Schemes and Operations**
    - 3.1.1 Area
    - 3.1.2 Write Operation
    - 3.1.3 Read Operation
    - 3.1.4 Control Signal Generation
    - 3.1.5 Half Select Issue
  - **3.2 Simulation Results and Analysis of 7T, 5T and 6T**
    - 3.2.1 Hold Static Noise Margin (HSNM)
    - 3.2.2 Read Margin
    - 3.2.3 Write Trip Point (WTP)
    - 3.2.4 Delay
    - 3.2.5 Power Consumption
  - **3.3 Statistical Analysis of 7T, 5T and 6T**
    - 3.3.1 Hold Static Noise Margin (HSNM)
    - 3.3.2 Read Margin
    - 3.3.3 Write Trip Point (WTP)
  - **3.4 Comparison of 7T with State-of-art SRAM Cells**
    - 3.4.1 Design and Scheme Comparison
    - 3.4.2 Stability Comparison
    - 3.4.3 Delay Comparison
    - 3.4.4 Power Comparison
  - **3.5 Comparison Summary**
    - 3.5.1 Statistical Analysis
    - 3.5.2 Array Design
  - **3.6 Chapter Summary**
- **Chapter 4: A 20nm Robust Single-Ended Boost-Less 7T FinFET Sub-threshold SRAM Design under Process-Voltage-Temperature Variations**
  - **4.1 Conventional 5T and Proposed 7T Cell Design**
  - **4.2 Data Retention in 5T and 7T**
    - 4.2.1 Hold Static Noise Margin (HSNM)
    - 4.2.2 HSNM under PVT Variations
    - 4.2.3 Summary of HSNM under PVT Variations
  - **4.3 Write Operation of 5T and 7T**
    - 4.3.1 Write Operation of 5T under PVT Variations
    - 4.3.2 Write Operation of 7T under PVT Variations
    - 4.3.3 Summary of Write Operation
  - **4.4 Read Operation of 5T and 7T**
    - 4.4.1 Read Operation of 5T and 7T under PVT Variations
    - 4.4.2 Summary of Read Operation
  - **4.5 Half-Select Condition**
  - **4.6 Statistical Results and Comparison**
    - 4.6.1 Hold Static Noise Margin
    - 4.6.2 Write Static Noise Margin
    - 4.6.3 Read Margin
  - **4.7 Chapter Summary**
- **Chapter 5: Robust 8T and 10T Based FinFET SRAM Design and Fast Sensing Scheme**
  - **5.1 Review of Broken Feedback Bit-cells**
  - **5.2 Proposed 8T SRAM Cell Design**
    - 5.2.1 Write Operation
    - 5.2.2 Read Operation
    - 5.2.3 Results and Brief Discussion
  - **5.3 Proposed Single Ended 10T SRAM cell**
    - 5.3.1 Write Operation
    - 5.3.2 Read Operation
  - **5.4 Simulation Results of Proposed 10T SRAM Cell**
    - 5.4.1 Read Static Noise Margin (RSNM)
    - 5.4.2 Hold Static Noise Margin (HSNM)
    - 5.4.3 Write Trip Point (WTP)
    - 5.4.4 Write Delay
    - 5.4.5 Read Delay
    - 5.4.6 Write and Read Power Consumption
    - 5.4.7 Standby Power Consumption
    - 5.4.8 Comparison summary
  - **5.5 High Speed Sensing for Single Ended Bit-cells**
    - 5.5.1 Review of Sensing Schemes
    - 5.5.2 Proposed Sensing Scheme
    - 5.5.3 Design of Proposed Scheme
    - 5.5.4 Sensing Operation
    - 5.5.5 Results
  - **5.6 Chapter Summary**
- **Chapter 6: Conclusions and Scope for Future Work**

## LIST OF FIGURES

- 1.1 MOSFET normalized drain current ($I_D$) versus gate to source voltage ($V_{GS}$)
- 1.2 Conventional 6T SRAM cell
- 1.3 Operation of 6T SRAM cell (a) Read (b) Write
- 1.4 Effect of ultra-low voltage on conventional 6T SRAM cell (a) $VDD=0.5V$ (b) $VDD=0.4V$ (c) $VDD=0.3V$ (d) $VDD=0.2V$
- 1.5 Effect of process and mismatch parameters variation on conventional 6T SRAM cell at $VDD=0.5V$
- 1.6 Effect of inter-die variation on conventional 6T SRAM cell at ultra-low voltage ($VDD=0.2V$)
- 1.7 Effect of intra-die variation on conventional 6T SRAM cell at ultra-low voltage ($VDD=0.2V$)
- 1.8 Effect of inter-intra die (process and mismatch parameters) variation on conventional 6T SRAM cell at ultra-low voltage ($VDD=0.2V$)
- 1.9 Read-Decoupled 8T
- 1.10 Read-Decoupled 10T
- 1.11 Differential 10T
- 1.12 Single gate MOSFET
- 1.13 Multi Gate MOSFETs
- 1.14 Conventional 5T SRAM cell
- 1.15 Conventional 5T SRAM cell (a) write ‘1’ operation (b) read ‘1’ operation
  
- 2.1 Schematic of (a) Conventional 6T (b) Read decoupled 8T (RD-8T) (c) Conventional 5T and (d) Proposed 8T SRAM cell.
- 2.2 Schematic of 8T during (a) Write ‘1’ (b) Write ‘0’ operation.
- 2.3 Waveforms of 8T during (a) Write ‘1’ (b) Write ‘0’ operation.
- 2.4 (a) Schematic and (b) Waveforms of 8T during read ‘1’ operation.
- 2.5 Waveforms of 8T during read operation (a) Normal Read ‘1’ (b) FCS1/FCS2 turns ‘1’ before RWL turns ‘0’.
- 2.6 Schematic of row half-selected 8T (a) Write (b) Read.

- 2.7 1000 MC simulations of row half-selected 8T (a) Write (b) Read.
- 2.8 Schematic of column half-selected 8T (a) Write ‘0’ (b) Read.
- 2.9 Waveforms of column half-selected 8T (a) Write ‘0’ (b) Read.
- 2.10 1000 MC simulations of column half-selected 8T (a) Write (b) Read.
- 2.11 Layout of (a) 6T (b) RD-8T (c) 5T and (d) Proposed 8T.
- 2.12 (a) Butterfly curve of HSNM for 8T at VDD=0.3V (b) Calculation of SNM
- 2.13 WSNM of 8T (MC 1000, SF corner). Inset: $\log \mu$ and $\sigma$.
- 2.14 WSNM of 6T (MC 1000, SF corner). Inset: $\log \mu$ and $\sigma$.
- 2.15 RSNM of 8T (MC 1000, FS corner). Inset: $\log \mu$ and $\sigma$.
- 2.16 RSNM of 6T (MC 1000, FS corner). Inset: $\log \mu$ and $\sigma$.
- 2.17 HSNM of 8T (MC 1000, FS corner). Inset: $\log \mu$ and $\sigma$.
- 2.18 HSNM of 6T (MC 1000, FS corner). Inset: $\log \mu$ and $\sigma$.
- 2.19 (a) Write ‘1’ time of 8T and (b) Write ‘0’ time of 8T.
- 2.20 (a) Write ‘1’ time of 6T and (b) Write ‘0’ time of 6T.
- 2.21 Read ‘1’ time (a) 8T and (b) 6T.
- 2.22 (a) Write ‘1’ power of 8T and (b) Write ‘0’ power of 8T.
- 2.23 (a) Write ‘1’ power of 6T and (b) Write ‘0’ power of 6T.
- 2.24 Read ‘1’ power (a) 8T and (b) 6T.
- 2.25 Comparison of WSNM at SF corner.
- 2.26 Comparison of RSNM at FS corner.
- 2.27 Comparison of HSNM at FS corner.
- 2.28 Comparison at SS corner (a) Write ‘1’ (b) Read ‘1’ time.
- 2.29 Comparison of power at FF corner (a) Write ‘0’ and (b) Read ‘1’.
- 2.30 Comparison of energy at FF corner (a) Write ‘0’ and (b) Read ‘1’
- 2.31 Block diagram of the proposed 1kb SRAM array
- 2.32 Layout of 1kb array of the proposed 8T
- 2.33 Layout of 1kb array of conventional 6T
- 3.1 Proposed 7T SRAM cell in UMC 90nm (a) Schematic (b) Layout
- 3.2 Basic operation of the proposed 7T SRAM cell in UMC 90nm (a) Write “1” (b) Write “0” (c) Read “0”

- 3.3 Column half-selected cell during write “1” or read operation (a) Schematic (b) Waveforms
- 3.4 (a) Butterfly curve to find HSNM for 7T and 5T at 200mV power supply (b) HSNM for 7T and 5T at different process corners and 200mV power supply
- 3.5 Absolute value curves of diagonal’s length for different power supplies (200-500mV) at FS corner (a) 5T (b) 7T
- 3.6 Post layout read operation for 8 cells connected to a single bit-line (a) Waveforms and Read margin calculation at FS corner (b) Read margin for different process corners (FF, FS, SF, SS and TT)
- 3.7 (a) Read margin versus VDD at FS corner (b) Change in read margin versus VDD
- 3.8 Write Trip Point (WTP) for 200mV power supply at SF corner (a) Write “1” (b) Write “0”
- 3.9 Write Trip Point (WTP) against power supply at SF corner during (a) Write “1” (b) Write “0”
- 3.10 Post layout delay vs. VDD for 8 cells per bit-line at SS corner (a) Read delay (b) Write delay
- 3.11 Post layout average power consumption vs. VDD for 8 cells per bit-line at FF corner (a) Read power (b) Write power
- 3.12 Monte Carlo (MC) simulation for 1000 samples at FS corner (a) HSNM at 300mV (b) HSNM at 400mV
- 3.13 Monte Carlo (MC) simulation for 1000 samples at FS corner (a) HSNM at 500mV (b) Comparison of HSNM mean for 5T, 6T and 7T at all power supplies
- 3.14 Monte Carlo (MC) simulation for 1000 samples (a) Read margin mean against VDD at FS corner (b) WTP mean against VDD at SF corner
- 3.15 (a) Read margin against VDD at FS corner (b) Write Trip Point (WTP) against VDD at SF corner
- 3.16 Post layout delay against VDD for 8 cells per bit-line at SS corner (a) Read delay (b) Write delay
- 3.17 Post layout average power consumption against VDD for 8 cells per bit-line at FF corner (a) Read power (b) Write power
- 3.18 Layout of 1kb array of the proposed 7T SRAM cell

- 4.1 Schematic representation of FinFET SRAM cells (a) conventional 5T (b) proposed 7T
- 4.2 Butterfly curves for 5T and 7T at VDD=0.2V for HSNM determination at 27°C
- 4.3 Absolute value curves of diagonal's length for different voltages (0.1-0.8V) at FS corner and 27°C temperature (a) 7T (b) 5T
- 4.4 Absolute value curves of diagonal's length for different temperature values at 0.2V VDD (a) 7T (b) 5T at FS corner
- 4.5 HSNM of 7T and 5T for different temperature values with variation in VDD at FS corner
- 4.6 Comparison of 7T and 5T at FS corner (a) HSNM versus VDD at 27°C temperature (b) HSNM versus temperature at 0.2V VDD
- 4.7 (a) Schematic representation of Write '1' operation of 7T (b) Write '1' waveforms for 5T and 7T
- 4.8 (a) Schematic representation of Write '0' operation of 7T (b) Write '0' waveforms for 5T and 7T
- 4.9 Write vs. temperature with variation in all process corners and VDD for 5T (a) Write '1' delay (b) Write '0' delay (c) WSNM '1' (d) WSNM '0' (e) Write '1' power (f) Write '0' power
- 4.10 Write vs. temperature variation in all process corners and VDD for 7T (a) Write '1' delay (b) Write '0' delay (c) WSNM '1' (d) WSNM '0' (e) Write '1' power (f) Write '0' power
- 4.11 Comparison of 7T and 5T vs. VDD (a) Write delay at SF corner (b) Write delay ratio (c) WSNM at SF corner (d) WSNM ratio (e) Write power at FF corner (f) Write power ratio
- 4.12 (a) Schematic representation of read operation of 7T (b) Read waveforms for 5T and 7T
- 4.13 Read vs. VDD with variation in all process corners and temperature (a) Read delay of 7T (b) Read delay of 5T (c) Read margin of 7T (d) Read margin of 5T (e) Read power of 7T (f) Read power of 5T

- 4.14 Comparison of 5T and 7T vs bit-line capacitance at worst case corners and 0.2V VDD (a) Read delay at SS corner (b) Read delay ratio (c) Read power at FF corner (d) Read power ratio (e) Read margin at FS corner (f) Read margin ratio
- 4.15 Half-Selected 7T cell at 125°C temperature (a) Schematic diagram (b) Timing diagram with 1000 MC samples at 0.2V and FF corner

- 5.1 Conventional 6T cell
- 5.2 Read SNM free (RSNF) 7T Cell
- 5.3 A 7T Cell
- 5.4 A read disturb free 8T Cell
- 5.5 Proposed 8T cell
- 5.6 Write operation of the proposed 8T SRAM cell
- 5.7 Read operation of proposed SRAM cell
- 5.8 Block diagram of Proposed 10T cell
- 5.9 Block diagram of Proposed 10T cell
- 5.10 Read operation of Proposed 10T cell
- 5.11 Read static noise margin vs power supply
- 5.12 Hold static noise margin vs power supply
- 5.13 Write trip point vs power supply
- 5.14 Write delay vs power supply
- 5.15 Read delay vs power supply
- 5.16 Write power consumption vs power supply
- 5.17 Read power consumption vs power supply
- 5.18 Standby power consumption vs power supply
- 5.19 Conventional single ended read sensing scheme
- 5.20 Charge sharing between two capacitors
- 5.21 Conventional single ended read sensing scheme
- 5.22 Read delay comparison when global bit line capacitance is 8x of local bit-line capacitance.
- 5.23 Read delay comparison when global bit line capacitance is 2x of local bit-line capacitance.

## LIST OF TABLES

- 1.1 Applications of SRAM and design tradeoff
- 1.2 Comparison of SRAM cells at VDD=1.2V in 65nm
- 1.3 Comparison of SRAM cells at 500mV
- 1.4 6T cell metric presented in [43] at VDD=500mV
- 1.5 6T cell metric presented in [44] at VDD=1V
- 1.6 6T and 8T cell metric presented in [45] at VDD=324mV
  
- 2.1 Operation table of proposed 8T SRAM Cell.
- 2.2 Layout area in UMC 90nm technology.
- 2.3 Comparison of 1Kb Array of 8T and 6T SRAM at 300mV.
- 2.4 Comparison of mean ( $\mu$ ) and standard deviation ( $\sigma$ ) for proposed 8T, iso-area 5T, 6T and RD-8T SRAM cells
- 2.5 Comparison of leakage of iso-area bit-cells
- 2.6 Comparison of SNM at 300mV with [14] and [17]
- 2.7 Comparison of the proposed 8T with 9T [28] and 7T [25]
  
- 3.1 Layout area in UMC 90nm technology
- 3.2 Operation table of the proposed 7T SRAM Cell
- 3.3 Design details of various SRAM cells.
- 3.4 Comparison summary of 5T, 6T and 7T at 300mV at worst case corners
- 3.5 Percentage change in various parameters of different cells w.r.t 7T at 300mV at worst case corners
- 3.6 Comparison of mean ( $\mu$ ) and standard deviation ( $\sigma$ ) of 7T with 5T and 6T in UMC 90nm CMOS technology with temp=27°C and VDD range of 200-500mV
- 3.7 Percentage change in various parameters of different cells w.r.t 7T in UMC 90nm CMOS technology and 300mV power supply at worst case corners
- 3.8 Comparison of 1Kb Array of the proposed 7T and 6T SRAM at 300mV
- 3.9 Comparison of the proposed 7T with 9T [28] and 7T [25]

- 3.10 Design details of the proposed SRAM cells
- 3.11 Mean ( $\mu$ ) and standard deviation ( $\sigma$ ) for the proposed SRAM cell
- 3.12 Table of mean ( $\mu$ ) and standard deviation ( $\sigma$ ) of 7T in UMC 90nm CMOS technology with temp=27°C and VDD range of 200-500mV
- 3.13 Comparison of 1kb array of 7T and 8T SRAM at 300mV
  
- 4.1 Sizing strategy for reported bit-cells
- 4.2 Operation table of the proposed 7T SRAM cell
- 4.3 Comparison of mean ( $\mu$ ) and standard deviation ( $\sigma$ ) of 7T and 5T in 20nm FinFET technology with temp=27°C and VDD range of 0.2V-0.5V.
- 4.4 Comparison of 7T and 5T in 20nm FinFET technology at worst case corners ( $R = \mu/\sigma$ )
  
- 5.1 Basic idea of the proposed 8T cell and its outcome
- 5.2 Results of the proposed 8T cell with reference to conventional 6T cell
- 5.3 Operation summary for proposed 10T SRAM cell
- 5.4 Comparison of the proposed 10T, conventional 6T and conventional 8T

## ACRONYMS

SRAM: Static Random Access Memory

CMOS: Complementary Metal Oxide Semiconductor

MOSFET: Metal Oxide Semiconductor Field Effect Transistor

FinFET: Fin Shaped Field Effect Transistor

PFET: P-channel Field Effect Transistor

NFET: N-channel Field Effect Transistor

PMOS: P-channel Metal Oxide Semiconductor

NMOS: N-channel Metal Oxide Semiconductor

$V_{TH}$ : Threshold Voltage

$V_{DS}$ : Drain to Source Voltage

$V_{GS}$ : Gate to Source Voltage

W/L: Width to Length ratio of MOSFET

Q: True Storage Node

QB: Q-Bar

MC: Monte Carlo

TSMC: Taiwan Semiconductor Manufacturing Company

UMC: United Microelectronics Corporation

PTM: Predictive Technology Model

IoT: Internet of Things

TCAD: Technology Computer Aided Design

RSCE: Reverse Short Channel Effect

DFC: Dynamic Feedback Cutting

RD: Read Decoupling

SE-DFC: Single-Ended Dynamic Feedback Cutting

ULP: Ultra-low Power

ULV: Ultra-low Voltage

PVT: Process-Voltage-Temperature

SNM: Static Noise Margin

HSNM: Hold Static Noise Margin  
WSNM: Write Static Noise Margin  
RSNM: Read Static Noise Margin  
WTP: Write Trip Point  
WNM: Write Noise Margin  
RM: Read Margin  
ATL: Access Transistor Left  
ATL2: Access Transistor Left 2  
ATR: Access Transistor Right  
ATR2: Access Transistor Right 2  
WWL: Write Word Line  
WL: Word Line  
Inv: Inverter  
BL: Bit Line  
BLB: Bit Line Bar  
PUL: Pull Up Left  
PUR: Pull Up Right  
PDL: Pull Down Left  
PDR: Pull Down Right  
WBL: Write Bit Line  
WBLB: Write Bit Line Bar  
RPD: Read Pull Down  
RPG: Read Pass Gate  
NS: Series NMOS  
NS2: Series NMOS 2  
PDR2: Pull Down Right 2  
NL: NMOS Left  
NR: NMOS Right  
RWL: Read Word Line  
RBL: Read Bit Line  
PU: Pull Up

PD: Pull Down

AT: Access Transistor

Tru.: True Storage Node

Cmp.: Complementary Storage Node

WBR: Right Write Bit line

RBR: Right Read Bit Line

RAL: Left Read Access Transistor

RAR: Right Read Access Transistor

WE: Write Enable

MUX: Multiplexer

DIBL: Drain Induced Barrier Lowering

VDD: Drain Voltage Supply

VVDD: Virtual Drain Voltage Supply

VSS: Source Voltage Supply

Gnd: Ground

SoC: System-on-Chip

# Chapter 1

## Introduction

Technology scaling has been the main reason for the enhanced capabilities of state-of-the-art integrated circuits and their abundant use in electronic systems. According to Moore's Law [1], the minimum feature size reduces by a factor of 0.7 in every new technology node. Technology scaling has been done aggressively in the last few decades, resulting in higher integration density and improved performance. The resultant exponential growth in device count per chip has been led by the miniaturization of the static-random-access-memory (SRAM) bit-cell. Because of its systematic structure and broad applicability to most electronic systems, SRAM is one of the important components for the development of new technology and therefore modern digital systems progressively embed more SRAMs [2].

As SRAM occupies a dominant portion of the total die area, it accounts for the largest share of power consumption in a system, and more emphasis has therefore been placed on the design of low-power memories. More than half of the transistors in today's high performance microprocessors are devoted to cache memories and this ratio is expected to increase in the foreseeable future. Typically, SRAM is the choice for embedded memories because SRAM is robust to the noisy environment in such chips [2], [3]. As a result, considerable attention has been paid to the design of low-power, high-performance SRAMs, since they are a critical component in both hand-held devices and high-performance processors. To avoid unnecessary yield loss, a properly designed SRAM should be used for a system. This may lead to improvement in area, speed, and power. Therefore, depending on the application's need, an appropriate SRAM should be used [3].

Increasing process-induced variations in transistor performance with miniaturization down to nanoscale technology nodes and beyond is a major technical challenge for continued advancement of planar metal-oxide-semiconductor technology. In particular, continued SRAM cell-area scaling for increased storage density and reduction in operating voltage (VDD) for lower stand-by power consumption are design challenges. Moreover, the enhanced yield necessary to realize large capacity SRAM for microprocessors becomes increasingly difficult to achieve. This thesis explores the benefits of advanced transistor structures and bit-cell design co-optimization for continued SRAM scaling [4].

### 1.1 Ultra-low Power SRAM

Technology trends have resulted in static and dynamic power dissipation emerging as a primary design consideration in micro-processor design. To keep the resulting switching power dissipation at an increasingly lower level, successive technology generations have depended on reducing the supply voltage. In order to maintain performance, consistent reduction in the transistor threshold voltage is required [4], [5]. Since the leakage current increases exponentially with reduced threshold voltage, the static power dissipation has grown to be a significant fraction of overall chip power dissipation in modern deep-sub-micron processes. Now, micro-processor-controlled hand-held devices contain embedded memory which represents a large portion of the system-on-chip (SoC). These portable systems need ultra-low power consuming circuits to utilize the battery for longer duration. Applications of ultra-low power SRAM are extremely broad including neural signal processor, sub-threshold processor, biomedical implants, wireless sensing, FFT core, low voltage cache operation etc. [4]-[7]. These applications demand careful design within the associated trade-offs. In order to adhere to intense scaling trends, SRAM design is also highly constrained.

Moreover, parameter variation in MOSFETs and system-level power consumption increase the design challenges. Since the impact of SRAM on the whole processing unit is very significant, modern ultra-low power SRAMs must be developed with their own trade-offs. The trade-offs for SRAMs are mainly subject to power, performance, and density constraints as shown in Table 1.1.

TABLE 1.1 APPLICATIONS OF SRAM AND DESIGN TRADE-OFF

|                        | <b>Ultra Low Power SRAM</b>                                                     | <b>High Performance SRAM</b>                                | <b>High Density SRAM</b>                                    |
|------------------------|---------------------------------------------------------------------------------|-------------------------------------------------------------|-------------------------------------------------------------|
| <b>Application</b>     | Biomedical implants, wireless sensing                                           | High-end server, complex computing                          | Mobile, multimedia gadgets                                  |
| <b>Design Approach</b> | High- $V_{TH}$ MOSFET, low power supply, medium size bit-cells, short bit-lines | Low- $V_{TH}$ MOSFET, large size bit-cells, short bit-lines | High- $V_{TH}$ MOSFET, small size bit-cells, long bit-lines |

The difficulty is that an improvement in one kind of SRAM design affects the others. Therefore SRAM design involves compromises in order to fulfill specific requirements. The power consumption can be minimized using non-conventional device structures, new circuit topologies, and optimizing the architecture. Therefore, stable and ultra-low power on-chip memory is now mandatory to achieve higher reliability and longer battery life for portable applications [2]-[23].

#### 1.1.1 Sub-threshold Regime

Supply voltage scaling is one of the most effective techniques for power reduction in active as well as standby mode. However, voltage scaling is limited by the loss of static noise margin (SNM), current fluctuations due to process variations, and restrictions on the number of cells connected to a bit-line [3]-[8]. Sub-threshold operation can be achieved by fixing the supply voltage below the threshold voltage ($V_{TH}$) of the metal oxide semiconductor field effect transistor (MOSFET). With the supply voltage scaled below the device threshold, the drain current depends exponentially on the gate voltage. Figure 1.1 shows the normalized drain current ($I_D$) versus gate to source voltage ($V_{GS}$) of a planar MOSFET and its sub-threshold regime. The power consumption can be reduced by operating the MOSFET in the sub-threshold regime. The basic equations of the sub-threshold current and total off current [7], [8] of the MOSFET are as follows:

$$I_{D:sub-threshold} = I_0 \frac{W}{L} e^{\frac{V_{GS}-V_{TH}}{nV_t}}$$

Including  $V_{DS}$ :

$$I_{D:sub-threshold} = I_0 \frac{W}{L} e^{\frac{V_{GS}-V_{TH}}{nV_t}} \left( 1 - e^{\frac{-V_{DS}}{V_t}} \right)$$

where  $I_0$  is the drain current when  $V_{GS} = V_{TH}$ ,  $V_{DS} \gg V_t$  and  $W/L=1$  and is given by:

$$I_0 = \mu_0 C_{ox} (n - 1) V_t^2$$

where $V_{TH}$ is the transistor threshold voltage, $n$ is the sub-threshold slope factor ($n = 1 + C_d/C_{ox}$), $V_t = kT/q$ is the thermal voltage, $V_{DS}$ is the drain to source voltage, $V_{GS}$ is the gate to source voltage, $\mu_0$ is the mobility, $C_d$ is the depletion capacitance and $C_{ox}$ is the oxide capacitance.



Figure 1.1. MOSFET normalized drain current ( $I_D$ ) versus gate to source voltage ( $V_{GS}$ )

From the above equations, the circuit design parameter to be set is $I_D$. As $I_D \propto W/L$, the current depends only linearly on the aspect ratio, so transistor sizing is not very effective in changing $I_D$. On the other hand, $I_D$ depends exponentially on $V_{TH}$, so threshold voltage variation can be very effective in changing $I_D$ while designing a sub-threshold SRAM circuit. Gate current due to carrier tunneling through the oxide is negligible compared to $I_D$ in a sub-threshold circuit. Also, the junction leakage current is negligible in the sub-threshold regime as compared to $I_D$.
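
The relative sensitivity of $I_D$ to $W/L$ and to $V_{TH}$ can be illustrated with a minimal Python sketch of the sub-threshold current expression given above. The numerical values used here (mobility, oxide capacitance, slope factor, threshold voltage and bias point) are illustrative assumptions only and are not parameters of any technology used in this thesis.

```python
import math

K_B = 1.380649e-23      # Boltzmann constant (J/K)
Q_E = 1.602176634e-19   # elementary charge (C)

def thermal_voltage(temp_k=300.0):
    """Thermal voltage V_t = kT/q in volts."""
    return K_B * temp_k / Q_E

def subthreshold_current(v_gs, v_ds, v_th, w_over_l=1.0, n=1.5,
                         mu0=0.04, c_ox=1.5e-2, temp_k=300.0):
    """I_D = I_0 (W/L) exp((V_GS - V_TH)/(n V_t)) (1 - exp(-V_DS/V_t)).

    mu0 (m^2/V.s), c_ox (F/m^2), n and v_th are assumed, illustrative values.
    """
    v_t = thermal_voltage(temp_k)
    i_0 = mu0 * c_ox * (n - 1.0) * v_t ** 2
    return (i_0 * w_over_l * math.exp((v_gs - v_th) / (n * v_t))
            * (1.0 - math.exp(-v_ds / v_t)))

# Same bias point, changing one design knob at a time.
base = subthreshold_current(v_gs=0.2, v_ds=0.2, v_th=0.4)
wider = subthreshold_current(v_gs=0.2, v_ds=0.2, v_th=0.4, w_over_l=2.0)
lower_vth = subthreshold_current(v_gs=0.2, v_ds=0.2, v_th=0.3)

print(f"doubling W/L scales I_D by {wider / base:.2f}x")
print(f"lowering V_TH by 100 mV scales I_D by {lower_vth / base:.1f}x")
```

Under these assumed values, doubling $W/L$ only doubles the current, whereas a 100 mV reduction in $V_{TH}$ changes it by more than an order of magnitude.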

#### 1.1.2 SRAM in Sub-threshold Regime

Voltage scaling has led to circuit operation in the sub-threshold regime for minimum power consumption, along with the disadvantage of an exponential reduction in performance [9]. Circuit operation in the sub-threshold regime has paved the path towards ultra-low power embedded memories, mainly SRAMs [9]-[11]. The six-transistor (6T) cell, which uses a cross-coupled inverter pair, is the most commonly used bit-cell in current SRAM designs and is shown in Figure 1.2.



Figure 1.2. Conventional 6T SRAM cell

This 6T cell is comprised of a cross-coupled inverter latch and a pair of access transistors that allow differential read and write operations. The positive feedback loop of the cross-coupled inverters makes this structure very robust.

The robustness of this cell is commonly quantified by the SNM, generally calculated as the side of the largest square that fits inside one of the lobes of the butterfly curve [12]. Read and write operations of the 6T cell are shown in Figure 1.3. To transfer the data between the bit-lines and the internal nodes (Q and QB), the write word line is activated for both read and write operations.
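
The largest-square definition of the SNM can be sketched numerically as follows. This is a minimal illustration, not the simulation flow used in this thesis: it assumes two identical inverters described by a hypothetical tanh-shaped voltage transfer characteristic (`inverter_vtc`), assumes the cell is bistable (inverter gain greater than one), and takes the smaller of the two lobes. The curves are rotated by 45 degrees so that the diagonal of the largest square in each lobe becomes the peak separation between the two curves.

```python
import numpy as np

def inverter_vtc(v_in, vdd, gain):
    """Hypothetical smooth inverter VTC with slope -gain at VDD/2 (assumed model)."""
    return 0.5 * vdd * (1.0 - np.tanh(2.0 * gain * (v_in - 0.5 * vdd) / vdd))

def static_noise_margin(vdd, gain, points=4001):
    """SNM of two identical cross-coupled inverters via the 45-degree rotation."""
    x = np.linspace(0.0, vdd, points)
    y = inverter_vtc(x, vdd, gain)
    # Curve A is (x, y); curve B is its mirror image about y = x, i.e. (y, x).
    u_a, v_a = (x - y) / np.sqrt(2.0), (x + y) / np.sqrt(2.0)
    u_b, v_b = (y - x) / np.sqrt(2.0), (y + x) / np.sqrt(2.0)
    # np.interp needs increasing abscissae; u_b decreases along x, so reverse it.
    u = np.linspace(max(u_a[0], u_b[-1]), min(u_a[-1], u_b[0]), points)
    diff = np.interp(u, u_a, v_a) - np.interp(u, u_b[::-1], v_b[::-1])
    # In each lobe the diagonal of the largest square is the peak separation.
    diag_lobe_1 = diff.max()
    diag_lobe_2 = (-diff).max()
    return min(diag_lobe_1, diag_lobe_2) / np.sqrt(2.0)

for gain in (3.0, 10.0, 200.0):   # assumed inverter gains
    snm = static_noise_margin(vdd=0.3, gain=gain)
    print(f"gain = {gain:5.1f}: SNM = {1e3 * snm:.1f} mV")
```

In this idealized model the SNM approaches VDD/2 as the inverter gain becomes very large and shrinks as the gain drops, which is the trend expected when the supply voltage is lowered.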



Figure 1.3. Operation of 6T SRAM cell (a) Read (b) Write

### 1.2 Conventional 6T SRAM Cell: In ultra-low voltage

The stability problem of the conventional 6T SRAM cell design (Figure 1.2) arises during the read operation: a stored $Q=“0”$ can be overwritten by a “1” when the voltage at node $Q$ rises to the $V_{TH}$ of the NMOS PDR, which then pulls node $QB$ down to “0” and in turn pulls node $Q$ up even further to “1” through positive feedback.



Figure 1.4. Effect of ultra-low voltage on conventional 6T SRAM Cell (a)  $VDD=0.5V$  (b)  $VDD=0.4V$  (c)  $VDD=0.3V$  (d)  $VDD=0.2V$

For low VDD values, the stability degrades severely because of loss of bi-stability. This is due to the reduced signal levels at low VDD and to the impact of $V_{TH}$ variations. SRAM cell design can be optimized to minimize the impact of $V_{TH}$ variation on the read SNM. Figure 1.4 shows the effect of reducing VDD from 0.5V to 0.2V on the conventional 6T SRAM cell. It is clear from Figure 1.4 that operating the 6T cell at ultra-low voltages of VDD=0.3V and VDD=0.2V can cause cell flipping, as shown in Figure 1.4(c) and Figure 1.4(d) respectively.

#### 1.2.1 Conventional 6T SRAM Cell: Process Variation in ultra-low voltage

The impact of increased intra-die variations with voltage scaling is more pronounced on the 6T SRAM cell. The ratioed operations, both during read and write, leave the 6T bit-cell highly susceptible to both variation and manufacturing defects. In particular, since a typical SRAM is composed of bit-cell arrays of hundreds of kilo-bits to several mega-bits, extreme worst-case behavior at the 4 or 5 sigma level must be considered. Two forms of variation affect SRAMs:

- (a) Inter-die (which will be called global variation) and
- (b) Intra-die (which will be called local variation) [46], [47].
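
The need to consider the 4 or 5 sigma level for large arrays can be illustrated with a short calculation. The sketch below assumes that a cell metric is Gaussian distributed and counts the expected number of cells lying beyond $k$ sigma on one side; the array sizes are hypothetical examples and are not taken from this thesis.

```python
import math

def gaussian_tail(k_sigma):
    """One-sided tail probability P(X > mu + k*sigma) for a Gaussian."""
    return 0.5 * math.erfc(k_sigma / math.sqrt(2.0))

def expected_outliers(num_cells, k_sigma):
    """Expected number of cells lying beyond k sigma on one side."""
    return num_cells * gaussian_tail(k_sigma)

for bits in (256 * 1024, 8 * 1024 * 1024):   # assumed array sizes
    for k in (3, 4, 5):
        print(f"{bits:>8d} cells, {k} sigma: "
              f"{expected_outliers(bits, k):.3g} expected outliers")
```

Under the Gaussian assumption, a cell that is robust only out to 3 sigma would still leave hundreds of outlier cells in a 256 kb array, whereas the expected count falls below one only around the 5 sigma level.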



Figure 1.5. Effect of process and mismatch parameters variation on conventional 6T SRAM Cell at VDD=0.5 V.

In this thesis we have considered both inter-die and intra-die variation while simulating the circuits. To showcase the effect of inter-die and intra-die variation, statistical simulations were performed on the 6T SRAM cell; Figure 1.5 shows the resulting timing waveforms of the read operation.

Some of the waveforms in Figure 1.5 show the voltage at Q changing from low to high and the voltage at QB changing from high to low, which causes cell flipping during the read operation. This shows that the stability of the 6T SRAM cell degrades under process variation.

#### 1.2.2 Inter-die: global variation

Global variation is the difference in average parameter values from die to die; for instance, these can include the average NMOS/PMOS threshold voltage, dielectric thickness, or poly width. Global variation comes about due to systematic processing changes affecting individual dies.

The effect of inter-die variation on 6T cell is shown in Figure 1.6 at ultra-low voltage (VDD=0.2V). The storage nodes Q and QB are changing their states and causing the cell flipping because of Inter-die variation at ultra-low voltage.



Figure 1.6. Effect of inter-die variation on conventional 6T SRAM cell at ultra-low voltage (VDD=0.2 V).

#### 1.2.3 Intra-die: local variation

On the other hand, local variation is the difference between nominally matched devices on the same die. These can include the number of NMOS/PMOS channel-adjust doping ions, poly line-edge roughness, local-layout dependent lithography effects, as well as transient effects such as negative bias temperature instability (NBTI) [41], [42]. In advanced technologies, local variation sources have an increasingly dominating impact [42]; while global variation significantly degrades the operating margins of SRAMs, local variation represents the most urgent concern regarding the increasing rate of failures observed [43]. A complete treatment of variation in CMOS devices, and its impact on circuits, such as SRAMs, can be found in [41]-[47].

The effect of intra-die variation on the 6T cell is shown in Figure 1.7 at ultra-low voltage (VDD=0.2V). The storage nodes Q and QB are changing their states and causing the cell flipping because of intra-die variation at ultra-low voltage.



Figure 1.7. Effect of intra-die (mismatch parameters) variation on conventional 6T SRAM cell at ultra-low voltage (VDD=0.2 V).

It can be seen from Figure 1.6 and Figure 1.7 that inter-die variations reflect more systematic processing changes than intra-die variations. The combined effect of inter-die and intra-die process variations on the 6T SRAM cell is shown in Figure 1.8.
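
The distinction between the two forms of variation can be sketched as a simple sampling model. The Python sketch below is illustrative only and is not the statistical model of the foundry design kits used in this thesis: it assumes Gaussian distributions, draws one global $V_{TH}$ shift per die that is shared by all devices, and adds an independent local mismatch term per device. The function name and all numerical defaults are hypothetical.

```python
import random
import statistics

def sample_die_vth(num_devices, vth_nominal=0.40, sigma_global=0.02,
                   sigma_local=0.03, rng=random):
    """Return V_TH samples (V) for the devices of one die.

    A single global shift is drawn per die and shared by all devices;
    an independent local mismatch term is drawn per device.
    All numerical defaults are assumed, illustrative values.
    """
    global_shift = rng.gauss(0.0, sigma_global)
    return [vth_nominal + global_shift + rng.gauss(0.0, sigma_local)
            for _ in range(num_devices)]

rng = random.Random(1)   # fixed seed so that the sketch is repeatable
for die in range(3):
    vths = sample_die_vth(6, rng=rng)   # six transistors of one 6T cell
    print(f"die {die}: mean = {statistics.mean(vths):.3f} V, "
          f"spread = {max(vths) - min(vths):.3f} V")
```

In this model, global variation moves the mean $V_{TH}$ of a whole die, while local variation creates mismatch between the nominally matched transistors of a single cell.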



Figure 1.8. Effect of inter-intra die (process and mismatch parameters) variations on conventional 6T SRAM cell at ultra-low voltage (VDD=0.2 V).

### 1.3 Solutions from the Literature

A variety of solutions are available in the literature but, from our point of view, the solutions for the stability issue of the 6T SRAM cell can be categorized into two approaches. The first approach is to find a new circuit topology and the second is to use non-conventional MOSFETs.

#### 1.3.1 Circuit Design Approach

Different types of SRAM bit-cells have been proposed to improve the memory failure probability at a given supply voltage. Sub-threshold memories were first presented in 2005 [3]-[11]. The research groups of [5] and [11] showed that operation of a standard 6T SRAM under process variations is problematic. In 2007, Kim's group [13] introduced a standard 8T SRAM cell that functions at voltages as low as 200mV by utilizing the reverse short channel effect (RSCE). Increasing the length of a transistor actually lowers $V_{TH}$ in most modern processes until a minimum point. By using access transistors with a channel length at this minimum $V_{TH}$ point, the write current is increased, resulting in a write margin equivalent to that achieved with a boosted word-line. In addition, the standard 8T topology shown in Figure 1.9 decouples the cell node from the bit-line using additional read path transistors. By doing so, the SNM in read mode becomes equal to that in hold mode.



Figure 1.9. Read-Decoupled 8T

In 2008, Chandrakasan's group [14] proposed a standard 8T cell for increased density and achieved low voltage operation through peripheral modifications. Using the 8T cell, a 30% reduction in area as compared to their 10T [15] cell was achieved. The “zero leakage” readout scheme raises the source of the readout transistor to VDD when the row is deselected, minimizing its drain induced barrier lowering (DIBL) and thereby almost eliminating its leakage, as shown in Figure 1.10. The read is further improved using a differential sensing scheme that eliminates global variations [16], as shown in Figure 1.11.



Figure 1.10. Read-Decoupled 10T [15]



Figure 1.11. Differential 10T [16]

Sub-threshold and near-threshold design is becoming a popular selection for ultra-low power systems. The operation of standard SRAMs at sub-threshold or near-threshold voltage is unreliable, primarily due to the degraded static noise margins and extreme fluctuations in the device current under process variations at low voltage.

On the other hand, some popular assist techniques are used to maintain both desired write-ability and read stability of SRAM arrays. These techniques include lower column supply voltages during write, bit-line (BL) and word-line (WL) bias, pulsed BLs, and read and write assist column circuitry. Such techniques aim to increase the array robustness with smaller cells, but necessarily lower array efficiency, resulting in larger area [9]. Different topologies (7T, 8T and 9T) and techniques (feedback cutting and read decoupling) have also been proposed to address the above issues to an extent [13]-[22].

#### 1.3.2 Use of Multigate MOSFETs

Operating a circuit in the sub-threshold region can reduce the power consumption to the minimum possible range. In the sub-threshold regime, data stability of the SRAM cell is a severe problem that worsens with the scaling of the MOSFET (shown in Figure 1.12) in nanoscale technology. Moreover, intra-die and inter-die variations reduce the yield of SRAMs, which are more susceptible to failure in the nanoscale regime [20]-[24]. Therefore, the use of multi-gate solid-state devices, such as the fin-shaped field effect transistor (FinFET) shown in Figure 1.13, gives better channel control and enables SRAM scaling at the traditional rate. This would result in smaller die sizes [24]-[26] as compared to that obtained using conventional MOSFETs.



Figure 1.12. Single gate MOSFET [30]



Figure 1.13. Multi Gate MOSFETs [31]

In addition, the thin body of a multi-gate device is typically un-doped or lightly doped. Thus, the random dopant fluctuation is significantly decreased, which results in the reduction of $V_{TH}$ variation [25], [26]. On the other hand, FinFETs create new challenges such as added fringing capacitance, width-quantization, higher access resistance, 3D-factor, and low-field mobility [27]-[29].

### 1.4 Current Status of Conventional 6T and 8T SRAM cells

Current industry standard SRAM cells are composed of six transistors (6T) and eight transistors (8T), as shown in Figure 1.2 and Figure 1.9 respectively. Conventional 6T and 8T SRAM cells suffer from an intrinsic data instability problem due to directly-accessed data storage nodes during a read operation. Noise margins of memory cells further shrink with increasing variability and decreasing power supply voltage in scaled CMOS technologies. Conventional six-transistor (6T) and eight-transistor (8T) memory circuits are characterized for layout area, data stability, write voltage margin, data access speed, active power consumption, idle mode leakage currents, and minimum power supply voltage in [33]. A comprehensive electrical performance metric was evaluated by the authors of [33] to compare the memory cells considering process parameter and supply voltage fluctuations, and the results are presented in Table 1.2.

TABLE 1.2 COMPARISON OF SRAM CELLS REPORTED IN [33] AT VDD=1.2V IN 65nm.

| Cell             | Size<br>( $\mu\text{m}^2$ ) | RSNM at<br>1000 MC<br>(mV)<br>Mean/SD | WSNM at<br>1000 MC<br>(mV)<br>Mean/SD | Write<br>Delay<br>(ps) | Read<br>Delay<br>(ps) | Write<br>power<br>@8kb<br>(mW) | Read<br>power<br>@8kb<br>(mW) | VDD<br>-min<br>(V) |
|------------------|-----------------------------|---------------------------------------|---------------------------------------|------------------------|-----------------------|--------------------------------|-------------------------------|--------------------|
| <b>6T</b>        | 0.937                       | 188.1/21.7                            | 382.5/41.5                            | 64.1/4.7               | 1238.5/6.7            | 2.1                            | 1.8                           | 0.88               |
| <b>Tri-Vt 6T</b> | 1.61                        | 200/16.7                              | 556.6/47.1                            | 49.2/4                 | 666/7.5               | 2.9                            | 2.1                           | 1                  |
| <b>Tri Vt 8T</b> | 1.61                        | 417.7/15.8                            | 587.5/40.9                            | 62.1/4.6               | 453.8/22.5            | 2                              | 2.7                           | 0.39               |

In [34], a 32-kb macro containing an eight-transistor soft error robust SRAM cell with differential read and write capabilities was presented. The 8T cell does not have dedicated access transistors, and its quad-latch configuration stores data on four interlocked storage nodes. The cell demonstrates read data stability down to 0.55 V and is well suited for low-voltage, low-power applications.

The article [35] presented a novel sub-threshold 8T SRAM for ultra-low power applications. Although the use of this SRAM cell increased the total leakage power in super-threshold regions, the cell was able to work at supply voltages lower than 200mV with significantly improved robustness without any leakage increase.

Intelligent wearable devices and the Internet of things (IoT) require on-chip SRAM macros with (1) compact area to reduce costs; (2) single supply voltage and low minimum VDD to reduce power consumption; and (3) sufficient speed to facilitate real-time computing [36]. The conventional 6T SRAM is compact, but suffers write failure and half-select (HS) disturbance in read/write cycles at low VDD. Word line (WL) voltage under-drive is commonly used in 6T SRAMs [2-5] to improve half-select read SNM during read/write cycles; however, this tends to degrade WM and cell read current, resulting in slower read/cycle speeds and necessitating an increase in VDD. Thus, the tradeoff between half-select SNM and WM has not yet been solved for 6T cells, except by adding additional transistors (i.e., 8T to 10T). The 6T macro fails during read and write operations at 28nm. A 256kb array of the proposed 6T cell has an access time of 2.2ns at VDD=580mV with a cell size of 0.127 $\mu\text{m}^2$.

The paper [37] presented a new eight-transistor (8T) SRAM cell design; its results are compared with other cells in Table 1.3. The 8T SRAM cell decreases the write and read delays by 45% and 58%, respectively, over the conventional 6T SRAM cell at a supply voltage of 500mV, and the power consumption for a single write operation is decreased by 54%.

TABLE 1.3 COMPARISON OF SRAM CELLS AT 500mV

| Cells             | RSNM(mV) | HSNM(mV) | Write delay(ns) | Read delay (ns) | Power ( $\mu$ W) |
|-------------------|----------|----------|-----------------|-----------------|------------------|
| <b>WRE8T [37]</b> | 65.9     | 175.4    | 12              | 5.1             | 12.48            |
| <b>ST10T [38]</b> | 115.9    | 209.9    | 21.6            | 11.7            | 24.46            |
| <b>LP10T[39]</b>  | 205.6    | 72.2     | 21              | 14.4            | 25.18            |
| <b>6T conv.</b>   | 72.2     | 178      | 22              | 12.1            | 27.31            |
| <b>9T [40]</b>    | 178      | 178      | 22.19           | 13.7            | 26.6             |
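The percentage improvements quoted for the WRE8T cell can be cross-checked against the rows of Table 1.3. The following is a minimal sketch, not part of the cited work; the dictionary keys and function names are illustrative assumptions, and only the numeric values come from the table.

```python
# Values transcribed from Table 1.3 (VDD = 500 mV).
# Key names are illustrative; only the numbers come from the table.
TABLE_1_3 = {
    "WRE8T": {"write_delay_ns": 12.0, "read_delay_ns": 5.1, "power_uW": 12.48},
    "6T conv.": {"write_delay_ns": 22.0, "read_delay_ns": 12.1, "power_uW": 27.31},
}


def percent_reduction(baseline, improved):
    """Reduction of `improved` relative to `baseline`, in percent."""
    return 100.0 * (baseline - improved) / baseline


def reductions(cell="WRE8T", reference="6T conv."):
    """Per-metric reduction of `cell` relative to `reference`."""
    ref, new = TABLE_1_3[reference], TABLE_1_3[cell]
    return {metric: percent_reduction(ref[metric], new[metric]) for metric in ref}


if __name__ == "__main__":
    for metric, value in reductions().items():
        print(f"{metric}: {value:.1f}% lower than conventional 6T")
```

Rounded to whole percentages, the three results agree with the 45%, 58% and 54% figures stated in the text.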

#### 1.4.1 Current Status of FinFET based Conventional 6T and 8T SRAM cells

The growth of battery-powered mobile and wearable devices has increased the importance of low-power operation and cost in system-on-a-chip (SoC) design. Supply-voltage scaling is the predominant approach to active power reduction for SoC design, including voltage scaling for on-die memory given increasing levels of memory integration. SRAM can limit the minimum operating voltage (Vmin) of a design, often leading to the introduction of separate voltage supplies for on-die memory. Additional supplies increase platform cost, and operating memory at higher voltage leads to increased power consumption. The introduction of FinFET devices at the 22nm technology node delivered superior short channel effects and sub-threshold slope relative to existing bulk planar device technology, enabling reduction in threshold voltage within a fixed leakage constraint. Lower transistor  $V_{TH}$ , improvements to random device variability, and assist circuits to overcome device-size quantization enabled a  $>150$ mV reduction in SRAM  $V_{min}$  [41]. At the 14nm technology node, FinFET device-size quantization remains a challenge for compact 6T SRAM bitcells with minimum-size transistors. Careful co-optimization between technology and design of memory-assist circuits is required in order to deliver dense, low-power memory operation at low voltages.

Intel presented an 84Mb embedded SRAM fabricated in 14nm FinFET technology featuring the smallest bit cells to date at  $0.050\mu\text{m}^2$  for high density and  $0.058\mu\text{m}^2$  for low voltage. A 1.5GHz operation at 0.6V was demonstrated [42].

The authors of [43] presented a robust 6T SRAM design for the 7nm technology node that addresses low supply voltage and rising leakage; its metrics are given in Table 1.4. In this work, an asymmetric underlapped FinFET design was explored with the help of quantum mechanical device simulations, considering both the bit-cell and cache design constraints. It was shown that the optimized FinFET achieved a significant improvement in on-current over conventional symmetrically underlapped FinFETs. They demonstrated significant energy savings and performance improvements for an 8KB L1 cache and a 4MB last-level cache.

TABLE 1.4 6T CELL METRIC PRESENTED IN [43] AT VDD=500mV

| Cell | HSNM(mV) | RSNM(mV) | WNM   | Access Time(ns) | VDDmin |
|------|----------|----------|-------|-----------------|--------|
| 6T   | 189.5    | 107.18   | 74.58 | 2.52            | 158    |

The papers [44] and [49] evaluated the impacts of Read- and Write-Assist circuits on the GeOI FinFET 6T SRAM cells compared with the SOI counterparts in Table 1.5. The word-line under-drive (WLUD) read-assist was more efficient to improve the read static noise margin (RSNM) and Read VMIN of FNSP GeOI FinFET SRAM cells compared with the SOI counterparts. GeOI FinFET SRAM cells with WLUD show smaller cell read access time compared with the SOI FinFET SRAM cells at both 25°C and 125°C. Negative bit-line (NBL) write-assist was more efficient to improve the write static noise margin (WSNM) than cell supply lowering for both GeOI and SOI FinFET SRAM cells.

TABLE 1.5 6T CELL METRIC PRESENTED IN [44] AT VDD=1V

| Cell                 | RSNM(mV) | Write delay(ps) |
|----------------------|----------|-----------------|
| <b>GeOI conv. 6T</b> | 80       | 16              |
| <b>6T WRA</b>        | 170      | 18              |

In article [45], a cross-layer framework (spanning device and circuit levels) was presented for designing robust and energy-efficient SRAM cells made of deeply-scaled FinFET devices. Next, 6T and 8T SRAM cells composed of these devices were designed and optimized, as shown in Table 1.6. To enhance the cell stability and reduce leakage energy consumption, the dual (i.e., front and back) gate control feature of FinFETs was exploited. A dual-gate controlled 6T SRAM cell operating at 324mV (in the near-threshold supply regime) was finally presented as a high-yield and energy-efficient memory cell in the 7nm FinFET technology.

TABLE 1.6 6T AND 8T CELL METRIC PRESENTED IN [45] AT VDD=324mV

| Cell         | Area (nm <sup>2</sup> ) | SNM(mV) |
|--------------|-------------------------|---------|
| <b>6T-DG</b> | 6615                    | 82      |
| <b>8T-DG</b> | 9403                    | 109.1   |

In [48], a 6T high-density SRAM bitcell with write-assist circuitry was implemented using 16 nm high-k metal gate FinFET technology. In advanced nanometer CMOS technologies, the performance of SRAM is limited at the minimum supply voltage because of local threshold voltage variations resulting from random dopant fluctuations and lithography-dependent pattern variations. Technological solutions such as FinFET technology can provide superior short-channel effects and subthreshold slopes, as well as fewer random dopant fluctuations. Therefore, FinFET technology has become a widely applied technological solution, and it is anticipated to outperform other lower SRAM technologies. However, quantizing the channel length and width of FinFET transistors can cause problems in conventional 6T-SRAM bitcells.

In [50], a complete device threshold voltage targeting methodology for FinFET SRAM in 10-nm technology was presented. The SRAM cell reduces the cost of the technology by sharing P-channel field effect transistor (PFET) and N-channel field effect transistor (NFET) mask with the high threshold voltage logic devices, whereas the SRAM device shares only NFET mask. The SRAM can achieve target performance at less area by compromising read stability, which will result in lower yield. At 64-nm pitch, litho-etch litho-etch (LELE) double-patterned gate impacts device performance and alleviates variability; hence the read margin of SRAM cell should consider an additional  $1\sigma$  RSNM margin to retain the same yield in 10-nm-technology era.

#### 1.4.2 Best Performance Reported

It is very difficult to compare two SRAM cells from the literature because of the variety of parameters associated with SRAM. For instance, different process-voltage-temperature (PVT) conditions are used during analysis. For simplicity, if we restrict the comparison to an industry-specific technology node, the triple-threshold-voltage 8T (Tri  $V_{TH}$  8T) SRAM cell [33] provides the best performance, with up to 2.5x higher data stability and better overall electrical quality than the traditional 6T SRAM cells in a TSMC 65 nm CMOS technology.

Moreover, the novel FinFET device structure has changed the definition of circuit design beyond the conventional MOSFET. In recent research, asymmetric underlapped n-FinFETs have been used as bit-line access transistors to reduce the read/write conflict. Further, the best improvement in write noise margin as well as access time can also be achieved [43].

#### 1.4.3 Challenges for SRAM Designers

The implementation of the memory techniques discussed in Sections 1.1.1 to 1.1.6 would face further challenges in future scaled CMOS technologies. The mobility difference between electrons and holes tends to shrink with CMOS technology scaling [33]. The write voltage margins are degraded due to the higher contention currents that are produced by the PMOS transistors in 6T, 8T, and 7T SRAM cells in newer technology nodes. The lowest power supply voltage of the 6T, 8T, and 7T memory arrays can be therefore limited by the write ability degradation. Alternatively, the write-“0” operation can be equally critical as compared to the write-“1” operation due to the stronger PMOS transistors in the 7T SRAM cell that employ single-ended write scheme. Furthermore, process parameter variations are exacerbated with CMOS technology scaling. The data stability and write ability variations are increased due to more significant process parameter fluctuations in newer CMOS technology nodes. The yield target could be missed in scaled 6T, 7T and 8T SRAM cells that may suffer from data instability and write failure problems, respectively. The multi- $V_{TH}$  8T and 9T SRAM cells are expected to become even more attractive due to stronger data stability, wider write voltage margin, and higher immunity to process parameter variations as compared to the 6T and 7T SRAM cells in future CMOS technologies. The layout-dependent effect is an emerging problem as the spacing among devices shrinks with CMOS technology scaling [29]. The layout context (the neighborhood in which a device is placed) plays an important role on device performance and power consumption. The asymmetrical SRAM cells (7T and 8T SRAM cells) suffer more from layout-dependent effect as compared to the symmetrical SRAM cells (6T and 9T SRAM cells). Therefore, the impact of layout context has to be considered more carefully for asymmetrical SRAM cell design in future CMOS technologies [33]-[40].

#### 1.4.4 Solutions Suggested by Our Thesis

This thesis addresses some of the most critical issues of SRAM design. To remove the read and write sizing conflict, the thesis suggests using separate paths for the read and write currents. To make the SRAM more stable and rugged, the thesis opts for a symmetric inverter pair. To reduce power consumption, the thesis demonstrates sub-threshold operation together with single-ended SRAM cells. To speed up read and write operations, the thesis minimizes the fight between the respective conflicting MOSFETs. To allow continued technology scaling and improve performance, the thesis explores the use of FinFET devices with the proposed novel SRAM structures. To reduce the effect of fin quantization on sizing, the thesis suggests a novel SRAM topology compatible with FinFET technology.

### 1.5 Motivation and Problem Statement

In the sub-threshold regime, the data stability of the SRAM cell is a severe problem that worsens as MOSFETs are scaled to nanometer-scale technology nodes. This increases the failure rate of memory operations. Another problem is obtaining an optimized noise margin against process-voltage-temperature (PVT) variations during all operations. Moreover, FinFET technology poses some challenges for the conventional SRAM bit-cells. There is still a need for an SRAM bit-cell that improves both read and write stability in the sub-threshold regime for ultra-low power applications. There is also a need to find a novel topology that is compatible with conventional CMOS as well as FinFET technology.

#### 1.5.1 Cascading in Both Inverters

The motivation behind introducing the proposed SRAM cell with three cascaded transistors in one or each branch of the cross-coupled inverter pair is to reduce the write and read sizing conflict, thereby making it easier to write into the memory cell. It also provides the possibility of read decoupling.

The sizing constraint, and hence the reason for choosing a new SRAM topology, is explained as follows. The conventional 6T SRAM cell is designed using cross-coupled CMOS inverters. The circuit structure of the full CMOS SRAM cell is shown in Figure 1.2. The memory cell consists of two simple CMOS inverters connected back to back and two access transistors. The access transistors are turned on whenever a word line is activated for a read or write operation, connecting the cell to the complementary bit line columns.

To determine width to length (W/L) ratios of the transistors, a number of design criteria must be taken into consideration. The two basic requirements, which dictate W/L ratios, are that:

- (a) The read operation should not destroy the stored information in the cell.
- (b) The cell should allow stored information modification during write operation.

Consider the read operation shown in Figure 1.3(a), assuming that logic '1' is stored in the cell. The left pull-down (PDL) and right pull-up (PUR) transistors are turned off, while the right pull-down (PDR) and left pull-up (PUL) transistors operate in linear mode. Thus the internal node voltages are  $Q = 1$  and  $QB = 0$  before the cell access transistors are turned on.

After the left access transistor (ATL) and right access transistor (ATR) are turned on by the row selection circuitry, the voltage on BL will not change significantly since no current flows through ATL. On the other hand, ATR and PDR will conduct a nonzero current and the voltage level of BLB will begin to drop slightly. The node voltage QB will increase from its initial value of 0 V. The node voltage QB may exceed the threshold voltage of PDL during this process, forcing an unintended change of the stored state. Therefore, the voltage on QB must not exceed the threshold voltage of PDL, so that the transistor PDL remains turned off during the read phase. ATR must be weak and PDR must be strong enough to hold  $QB = 0$ .

Now consider the write '0' operation assuming that logic '1' is stored at Q in the SRAM cell initially. Figure 1.3(b) shows the voltage levels in the CMOS SRAM cell during the data write operation. The transistors PUR and PDL are turned off, while PUL and PDR are operating in the linear mode. Thus the internal node voltage  $Q = VDD$  and  $QB = 0$  before the access transistors are turned on. The BL is forced to '0' by the write circuitry. Once ATL and ATR are turned on, we expect the nodal voltage QB to remain below the threshold voltage of PDL, since PDL and ATR are designed according to read operation.

The voltage at node QB would not be sufficient to turn on PDL. To change the stored information, i.e., to force  $Q = 0$  and  $QB = VDD$ , the node voltage Q must be reduced below the threshold voltage of PDR, so that PDR turns off. As PDR is strong, Q needs to go near to '0' to turn PDR off. This slows down the write operation. The process variations make read and write more difficult.

One solution is to make ATR strong, but this leads to read failure because QB can rise beyond the threshold voltage of PDL. Another solution is to make PDR weak, but this also causes read failure for the same reason, and it additionally reduces the read current and read speed. Moreover, there is still a fight between ATR (driving  $QB=1$ ) and PDR (holding  $QB=0$ ).
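The read-stability side of this conflict can be illustrated with a first-order model. The sketch below treats ATR and PDR as linear resistors forming a divider between BLB (assumed to be held at VDD) and ground, with conductance proportional to a dimensionless drive strength standing in for W/L. This resistive model, the function names and the numeric values are assumptions made for illustration only; it is not the analysis used in this thesis and it ignores sub-threshold conduction.

```python
def read_bump_voltage(vdd, strength_atr, strength_pdr):
    """Voltage at QB during a read for the divider BLB(=VDD) -> ATR -> QB -> PDR -> GND.

    Each transistor is modeled as a resistor whose conductance is proportional
    to its drive strength (an assumed stand-in for W/L).
    """
    if strength_atr <= 0 or strength_pdr <= 0:
        raise ValueError("drive strengths must be positive")
    return vdd * strength_atr / (strength_atr + strength_pdr)


def read_is_safe(vdd, strength_atr, strength_pdr, vth_pdl):
    """True if the read bump at QB stays below the threshold voltage of PDL."""
    return read_bump_voltage(vdd, strength_atr, strength_pdr) < vth_pdl


if __name__ == "__main__":
    VDD, VTH_PDL = 1.0, 0.3  # hypothetical values, in volts
    cases = {
        "weak ATR, strong PDR": (1.0, 3.0),
        "stronger ATR": (2.0, 3.0),
        "weaker PDR": (1.0, 1.5),
    }
    for label, (atr, pdr) in cases.items():
        bump = read_bump_voltage(VDD, atr, pdr)
        print(f"{label}: QB = {bump:.2f} V, safe = {read_is_safe(VDD, atr, pdr, VTH_PDL)}")
```

Under these assumed numbers, only the weak-ATR, strong-PDR sizing keeps QB below the threshold of PDL; strengthening ATR or weakening PDR to help the write raises the read bump above it.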

This write and read sizing conflict raises a question: is it possible to have strong ATL/ATR and weak PDL/PDR for the write operation, and weak ATL/ATR and strong PDL/PDR for the read operation?

We came up with the idea that ATR can be made strong by increasing its size to make the write faster, but it cannot be made weak by cascading two access transistors, because this reduces the speed of both read and write operations.

The second option is to make PDR weak at the cost of degraded read speed. This problem can be solved by bypassing the read current through a separate MOS transistor, as in the RD-8T SRAM cell.

Now the question is how to make a MOS device weak. Basically, to make a MOS device weak, we can either use a high-threshold-voltage MOS or use cascaded MOS devices in series (because we cannot reduce the size below the minimum limit of the respective technology). We do not use a high-threshold-voltage MOS because it needs an extra process step and cannot be controlled once fabricated. Therefore, we use cascaded MOS devices in the pull-down path of the inverter pair. This creates the challenge of controlling the cascaded MOSFETs and requires an extra bit/word line. It also increases the size and complexity of the SRAM cell.
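The weakening effect of cascading can be estimated with a common first-order rule: devices stacked in series combine like conductances in series, so n identical transistors behave roughly like a single device with 1/n of the drive strength. The sketch below is an illustration under that assumption; the function name and strength values are hypothetical, and body effect and sub-threshold behavior are ignored.

```python
def series_strength(strengths):
    """Equivalent drive strength of stacked devices, modeled as conductances in series.

    `strengths` holds one dimensionless drive strength per stacked transistor
    (an assumed stand-in for W/L).
    """
    strengths = list(strengths)
    if not strengths or any(s <= 0 for s in strengths):
        raise ValueError("need at least one positive drive strength")
    return 1.0 / sum(1.0 / s for s in strengths)


if __name__ == "__main__":
    minimum_size = 1.0  # hypothetical minimum-size device
    for n in (1, 2, 3):
        eff = series_strength([minimum_size] * n)
        print(f"{n} stacked device(s): equivalent strength = {eff:.3f}")
```

In this model a stack of three minimum-size devices is about one third as strong as a single one, which is how a pull-down path can be made weaker than the minimum device size would otherwise allow.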

We have successfully met this challenge by dynamically controlling the cascaded MOSFETs during read/write operations. As writing is already improved, we can use only one pass transistor to write, which reduces the effect of the increased size due to cascading. It also saves power during read/write operations. The SRAM cell based on this concept is presented in Chapter 2.

#### 1.5.2 Cascading in One Inverter

To further reduce the size and complexity of the proposed 8T cell relative to the conventional 6T cell, another SRAM cell is proposed in Chapter 3. The idea was to read and write with a single bit-line. As the proposed 8T cell was able to write ‘1’ directly from BL to Q, we use cascaded MOSFETs in one inverter only. This idea was successful: writing ‘1’ was possible through a single NMOS pass gate because the cascaded MOSFET was used to control the feedback. We are also able to read from the common BL without directly disturbing the storage node because cascaded MOSFETs are used for read decoupling. To achieve this, we noticed that the single-ended 5T cell shown in Figure 1.14 is attractive due to its reduced area and considerable active and standby power savings compared to the conventional 6T SRAM cell. However, the write-ability and read stability of the single-ended 5T cell are severely degraded compared to the conventional 6T SRAM cell.



Figure 1.14. Conventional 5T SRAM Cell

The 5T cell has only one access transistor ‘M5’ and a single bit line ‘BL’. Writing ‘1’ into the 5T cell is performed by driving the bit line while the word line is asserted at VDD. M5 tries to transfer charge from BL to Q, but M3 tries to hold Q at ‘0’ because QB is ‘1’. This fight between M5 and M3 makes writing difficult. Therefore, the 5T cell needs a wide M5 transistor with a boosted supply on its gate to write successfully, as shown in Figure 1.15(a). This requires an external circuit for the boosted supply and increases the size of the access transistor. It also creates another problem: the wide M5 transistor causes a high voltage bump at Q, which increases the probability of cell flipping, as shown in Figure 1.15(b).



Figure 1.15. Conventional 5T SRAM Cell (a) Write ‘1’ operation (b) Read ‘1’ operation

Therefore, to write without affecting the read operation, we can focus on the pull-down transistor M3. We can make M3 weak by cascading (as discussed in Section 1.5.1) one more MOSFET in the pull-down path. Now, we can cut the feedback during write and allow unhindered charge transfer from BL to Q without a wider M5, eliminating the need for a boosted supply. Although this cascading allows an easy write ‘1’, it reduces the read current and increases the read delay. Therefore, we used read decoupling and feedback cutting with cascaded MOSFETs to prevent direct disturbance on Q.

In the sub-threshold regime, the data stability of the SRAM cell is a severe problem that worsens as MOSFETs are scaled to nanometer-scale technology nodes. This increases the failure rate of memory operations. Another problem is obtaining an optimized noise margin against process-voltage-temperature (PVT) variations during all operations. Moreover, FinFET technology poses some challenges for the conventional SRAM bit-cells. There is still a need for SRAM bit-cells that improve both read and write stability in the sub-threshold regime for ultra-low power applications.

Also, there is a need to find a novel topology that can be compatible with conventional as well as FinFET technology.

## 1.6 Thesis Contribution

This thesis identifies and solves some of the most critical issues of ultra-low power SRAM. The research work focuses on circuit techniques that are compatible with bulk CMOS and the most advanced FinFET technologies. Simulations and analyses have been done in 90nm bulk CMOS, 20nm Berkeley short-channel IGFET common multi-gate (BSIM-CMG) based FinFET and 14nm SOI FinFET technologies. Important contributions are as follows:

- A new sub-threshold 8T SRAM cell that operates at ultra-low supply voltage in a nanometer-scale technology node has been designed. It uses single-ended write with dynamic feedback cutting to enhance write-ability and dynamic read decoupling to avoid read disturb.
- We have focused mainly on the stability of the cell, which is affected by process parameter variations. We have also addressed delay, power and half-select issues for both row and column.
- A novel single-ended boost-less sub-threshold 7T SRAM cell using dynamic feedback cutting to enhance write-ability and dynamic read decoupling to avoid read disturb has been designed.
- To the best of our knowledge, this is the first time that feedback cutting using a pull-down path has been applied to a single-ended SRAM cell.
- A novel differential 8T SRAM cell and a read decoupled 10T SRAM cell are proposed.
- A novel approach of using charge sharing to increase the sensing speed in a single-ended read configuration is proposed.
- Apart from this, individual 1kb arrays are designed for the 6T, 7T and 8T SRAM cells in UMC 90nm bulk CMOS technology.

## 1.7 Organization of thesis

The thesis is organized in six chapters. In Chapter 2, Section 1 presents the proposed 8T SRAM cell. Section 2 analyzes the SNMs, power consumption and performance of the proposed 8T and 6T cells and 1kb arrays. In Section 3, the proposed 8T is compared with 5T, 6T, RD-8T, A-8T and 9T cells. Section 4 summarizes the comparison. Finally, the chapter summary is drawn in Section 5.

In Chapter 3, Section 1 presents the proposed 7T SRAM cell along with its operations and the schemes utilized. Section 2 analyzes and compares the results of the 5T, 6T and 7T cells in UMC 90nm technology. In Section 3, statistical results are compared and summarized. Finally, the chapter summary is drawn in Section 4.

In Chapter 4, Section 1 introduces a 7T SRAM cell designed in 20nm FinFET technology. Sections 2, 3 and 4 elaborate the hold, write and read operations, respectively. Section 5 discusses the half-select condition of the proposed FinFET 7T cell. In Section 6, statistical results are compared and summarized. Finally, the chapter summary is drawn in Section 7.

In Chapter 5, Section 1 reviews the feedback cutting cells. In Section 2, results of a novel differential 8T SRAM cell designed in 14nm FinFET technology are discussed. In Section 3, a read decoupled 10T SRAM cell is designed and simulated. In Section 5, high speed sensing for single-ended bit-cells is discussed. Then Section 6 summarizes the chapter.

Finally, Chapter 6 concludes all contributions and suggests directions for future research.

## References

- [1] Moore G. E. (1965), Cramming more components onto integrated circuits, *Electronics*, vol. 38, no. 8, pp. 1-4.
- [2] Bhavnagarwala et al. (2001), The impact of intrinsic device fluctuations on CMOS SRAM cell stability, *IEEE J. Solid-State Circuits*, vol. 36, no. 4, pp. 658–665.
- [3] Yoshinobu et al. (2003), Review and future prospects of low-voltage RAM circuits, *IBM J. Res. Devel.*, vol. 47, no. 5/6, pp. 525–552.
- [4] Marinissen et al. (2005), Challenges in embedded memory design and test, *Proceedings of Design, Automation and Test in Europe Conference and Exhibition*, Munich, Germany, pp. 722–727.
- [5] Mukhopadhyay et al. (2005), Modeling of failure probability and statistical design of SRAM array for yield enhancement in nano-scaled CMOS, *IEEE Trans. Comput.-Aided Design (CAD) Integr Circuits Syst.*, vol. 24, no. 12, pp. 1859–1880.
- [6] Zhang et al. (2008), Embedded memory design for nano-scale VLSI systems, *Integrated Circuits and Systems*, Springer (ISBN-13: 978-0387884967).
- [7] Jain et al. (2012), A 280 mV-to-1.2 V Wide-Operating-Range IA-32 Processor in 32nm CMOS, *Proceedings of IEEE International Solid-State Circuits Conference*, CA, USA, pp. 66-68.

- [8] Lütkemeier et al. (2013), A 65 nm 32b Subthreshold Processor With 9T Multi-Vt SRAM and adaptive Supply Voltage Control, *IEEE J. Solid-State Circuits*, vol. 48, no. 1, pp. 8-19.
- [9] Jeon et al. (2012), A Super-Pipelined Energy Efficient Sub-threshold 240 MS/s FFT Core in 65 nm CMOS, *IEEE J. Solid-State Circuits*, vol. 47, no. 1, pp. 23-34.
- [10] Zhu H. and Kursun V. (2014), A Comprehensive Comparison of Data Stability Enhancement Techniques With Novel Nanoscale SRAM Cells Under Parameter Fluctuations, *IEEE Trans. Circuit and Systems-I: Regular Papers*, vol. 61, no. 5, pp. 1473-1484.
- [11] Khellah et al. (2008), Read and write circuit assist techniques for improving of dense 6T SRAM cell, *Proceedings of Int. Conf. Integr. Circuit Design Technol.*, Texas, USA, pp. 185–189.
- [12] Seevinck E., List F., and Lohstroh J. (1987), Static-noise margin analysis of MOS transistors, *IEEE J. Solid-State Circuits*, vol. SC-22, no. 5, pp. 748-754.
- [13] Kim T.-H., Liu J. and Kim C.H. (2007), An 8T Subthreshold SRAM Cell Utilizing Reverse Short Channel Effect for Write Margin and Read Performance Improvement, *Proceedings of Custom Integrated Circuits Conference*, CA, USA, IEEE, pp. 241-244.
- [14] Verma N. and Chandrakasan A. P. (2008), A 256 kb 65 nm 8T Subthreshold SRAM Employing Sense-Amplifier Redundancy, *IEEE J. Solid-State Circuits*, vol. 43, pp. 141-149.
- [15] Calhoun B.H. and Chandrakasan A.P. (2007), A 256-kb 65-nm sub-threshold SRAM design for ultra-low-voltage operation, *IEEE J Solid State Circuits*, vol. 42, pp. 680-688.
- [16] Chang et al. (2009), A 32 kb 10T Sub-Threshold SRAM Array with Bit-Interleaving and Differential Read Scheme in 90 nm CMOS, *IEEE J. Solid-State Circuits*, vol. 44, no. 2, pp. 650-658.
- [17] Liu Tsung-Te and Rabaey Jan M. (2013), A 0.25 V 460 nW Asynchronous Neural Signal Processor With Inherent Leakage Suppression, *IEEE J. Solid-State Circuits*, vol. 48, no. 4, pp. 897-906.
- [18] Kim et al. (2009), A voltage scalable 0.26 V, 64 kb 8T SRAM with voltage lowering techniques and deep sleep mode, *IEEE J. Solid-State Circuits*, vol. 44, no. 6, pp. 1785–1795.
- [19] Calhoun, B. H., and Chandrakasan A. (2007), A 256kb 65nm Sub-threshold SRAM Design for Ultra-low Voltage Operation, *IEEE J. Solid-State Circuits*, vol. 42, pp. 680-688.
- [20] Tu et al. (2012), A Single-Ended Disturb-Free 9T Subthreshold SRAM With Cross-Point Data-Aware Write Word-Line Structure, Negative Bit-Line, and Adaptive Read Operation Timing Tracing, *IEEE J. Solid-State Circuits*, vol. 47, no. 6, pp. 1469-1482.

- [21] Lo C.-H. and Huang S.-Y. (2011), P-P-N based 10T SRAM cell for low-leakage and resilient subthreshold operation, *IEEE J. Solid-State Circuits*, vol. 46, no. 3, pp. 695–704.
- [22] Chiu et al. (2014), 40 nm Bit-Interleaving 12T Subthreshold SRAM With Data-Aware Write-Assist, *IEEE Trans. Circuit and Systems-I: Regular Papers*, vol. 61, no. 9, pp. 2578-2582.
- [23] Kang et al. (2010), FinFET SRAM Optimization With Fin Thickness and Surface Orientation, *IEEE Trans. Electron Devices*, vol. 57, no. 11, pp. 2785-2793.
- [24] Gupta et al. (2013), Tri-Mode Independent Gate FinFET-Based SRAM With Pass-Gate Feedback: Technology–circuit Co-Design for Enhanced Cell Stability, *IEEE Trans. Electron Devices*, vol. 60, no. 11, pp. 3696-3704.
- [25] Kang M. et al. (2010), FinFET SRAM Optimization With Fin Thickness and Surface Orientation, *IEEE Trans. Electron Devices*, vol. 57, no. 11, pp. 2785-2793.
- [26] Fan et al. (2010), Investigation of cell stability and write ability of FinFET subthreshold SRAM using analytical SNM model, *IEEE Trans. Electron Devices*, vol. 57, no. 6, pp. 1375–1381.
- [27] Kim J. and Roy K. (2004), Double gate-MOSFET subthreshold circuit for ultralow power applications, *IEEE Trans. Electron Devices*, vol. 51, no. 9, pp. 1468–1474.
- [28] Liu et al. (2004), A highly threshold voltage-controllable 4T FinFET with an 8.5-nm-thick Si-Fin channel, *IEEE Electron Device Lett.*, vol. 25, no. 7, pp. 510–512.
- [29] Baravelli et al. (2008), Impact of LER and random dopant fluctuations on FinFET matching performance, *IEEE Trans. Nanotechnol.*, vol. 7, no. 3, pp. 291–298.
- [30] MOSFET - University of Warwick (2015), <http://www2.warwick.ac.uk/fac/sci/physics/current/postgraduate/regs/mpags/ex5/devices/mosfet/>, Accessed 4 January 2015.
- [31] FinFET Process Modeling and Extraction at 16-nm and Below (2015), <https://www.semiwiki.com/forum/content/1908-finfet-process-modeling-extraction-16-nm-below.html>, Accessed 4 January 2015.
- [32] Sinha et al. (2012), Exploring Sub-20nm FinFET Design with Predictive Technology Models, *Proceedings of Design and Automation Conference (DAC)*, CA, USA, pp. 283-288.
- [33] Zhu H. and Kursun V. (2014), A comprehensive comparison of data stability enhancement techniques with novel nanoscale SRAM cells under parameter fluctuations, *IEEE Trans. Circuits and Systems*, vol. 61, no. 5, pp. 1473-1484.
- [34] Shah J. S. et al. (2015), A 32 kb macro with 8T soft error robust SRAM cell in 65-nm CMOS, *IEEE Trans. Nuclear Sci.*, vol. 62, no. 3, pp. 1-8.

- [35] Moradi F. and Madsen J. K. (2014), Improved Read and Write Margins Using a Novel 8T-SRAM Cell, *Proceedings of 22nd International Conference on Very Large Scale Integration (VLSI-SoC)*, Mumbai, India, pp. 1-5.
- [36] Chang M. F. et al. (2015), A 28nm 256kb 6T-SRAM with 280mV Improvement in VMIN Using a Dual-Split-Control Assist Scheme, *Proceedings of IEEE International Solid-State Circuits Conference*, CA, USA, pp. 1-3.
- [37] Pasandi G. and Fakhraie S. M. (2013), A New Sub-300mV 8T SRAM Cell Design in 90nm CMOS, *Proceedings of 17th CSI International Symposium on Computer Architecture and Digital Systems (CADS)*, Tehran, Iran, pp. 39-44.
- [38] Kulkarni J. et al. (2007), A 160 mV, fully differential, robust Schmitt trigger based sub-threshold SRAM, *Proceedings of ACM/IEEE International Symposium on Low Power Electronics and Design (ISLPED)*, Bavaria, Germany, pp. 171–176.
- [39] Islam A. and Hasan M. (2012), Leakage characterization of 10T SRAM cell, *IEEE Trans. Electron Devices*, vol. 59, no. 3, pp. 631–638.
- [40] Liu Z. and Kursun V. (2008), Characterization of a novel nine-transistor SRAM cell, *IEEE Trans. Very Large Scale Integr. (VLSI) Systems*, vol. 16, no. 4, pp. 488–492.
- [41] Karl E. et al. (2012), A 4.6GHz 162Mb SRAM Design in 22nm Tri-Gate CMOS Technology with Integrated Active VMIN-Enhancing Assist Circuitry, *Proceedings of IEEE International Solid-State Circuits Conference Digest of Technical Papers (ISSCC)*, CA, USA, pp. 230-231.
- [42] Karl E. et al. (2015), A 0.6V 1.5GHz 84Mb SRAM Design in 14nm FinFET CMOS Technology, *Proceedings of IEEE International Solid-State Circuits Conference*, CA, USA, pp. 310-312.
- [43] Goud A. et al. (2015), Asymmetric underlapped FinFET based robust SRAM design at 7nm node, *Proceedings of Design, Automation & Test in Europe Conference & Exhibition (DATE)*, Dresden, Germany, pp. 659-664.
- [44] Hu V. P. et al. (2014), Evaluation of read- and write-assist circuits for GeOI FinFET 6T SRAM cells, *Proceedings of IEEE International Symposium Circuits and Systems (ISCAS)*, Montreal, Canada, pp. 1122-1125.
- [45] Shafaei A. et al. (2015), A cross-layer framework for designing and optimizing deeply-scaled FinFET-based SRAM cells under process variations, *Proceedings of 20th Asia and South Pacific Design Automation Conference (ASP-DAC)*, Chiba, Japan, pp. 75-80.

- [46] Alioto M. et al. (2015), Variations in nanometer CMOS flip-flops: Part I—Impact of process variations on timing, *IEEE Trans. Circuit and Systems-I: Regular Papers*, vol. 99, pp. 1-9.
- [47] Alioto M. et al. (2015), Variations in Nanometer CMOS Flip-Flops: Part II— Energy Variability and Impact of Other Sources of Variations, *IEEE Trans. Circuit and Systems-I: Regular Papers*, vol. 62, pp. 835-843.
- [48] Chen Y-H. et al. (2015), A 16 nm 128 Mb SRAM in High-k Metal-Gate FinFET Technology With Write-Assist Circuitry for Low-VMIN Applications, *IEEE J. Solid-State Circuits*, vol. 50, no. 1, pp. 170-177.
- [49] Hu V. P. et al. (2015), Analysis of GeOI FinFET 6T SRAM Cells With Variation-Tolerant WLUD Read-Assist and TVC Write-Assist, *IEEE Trans. Electron Devices*, vol. 62, no. 6, pp. 1710-1715.
- [50] Sakhare S. S. et al. (2015), Simplistic Simulation-Based Device-VT-Targeting Technique to Determine Technology High-Density LELE-Gate-Patterned FinFET SRAM in Sub-10 nm Era, *IEEE Trans. Electron Devices*, vol. 62, no. 6, pp. 1716-1724.

## Chapter 2

### A Single-Ended with Dynamic Feedback Control 8T Sub-threshold SRAM Design

The yield of SRAM is highly susceptible to failure in the nanometer regime due to intra- and inter-die variations [1]-[10]. Due to these limitations, it becomes difficult to operate the conventional 6-transistor (6T) cell at ultra-low voltage (ULV) power supplies [10]-[12]. Also, the 6T SRAM cell, as shown in Figure 2.1(a), has a severe problem of read disturb [11], [12]. A basic and effective way to eliminate this problem is to decouple the true storing node from the bit-lines during the read operation [11]-[14], which leads to a higher read static noise margin (RSNM). This read decoupling approach is utilized by a conventional read decoupled 8-transistor (RD-8T) SRAM cell (Figure 2.1(b)), which offers RSNM comparable to the hold static noise margin (HSNM) [11]-[14]. As described in [15], the cell can function at 200mV by utilizing the reverse short channel effect (RSCE): $V_{TH}$ lowers as the length of a transistor is increased and the write current increases, leading to a write static noise margin (WSNM) equivalent to that obtained by a boosted word-line. However, RD-8T suffers from leakage introduced in the read path. This leakage current increases with scaling, thereby increasing the probability of failed read/write operations. This problem is overcome by a 10T cell [16] that reduces the leakage current at the cost of an area overhead of two transistors per cell. It has also been claimed that this increases the number of bit-cells per bit-line. Similar cells which maintain cell current without disturbing the storage node are also proposed in [17]-[19].

Further, to reduce the power consumption of a differential bit-line, a single bit-line write/read scheme can be used. Data transmission over a single bit-line not only reduces the power consumption compared to a differential bit-line scheme [20], but also provides higher density. A single-ended 5T bit-cell is shown in Figure 2.1(c). It is attractive due to its reduced area and considerable active and standby power saving compared to the conventional 6T SRAM cell [20]. However, writing ‘1’ through an NMOS pass transistor in a 5T cell is a design challenge. Another problem is to obtain an optimized noise margin against process variations in all operations. Also, the read stability of the single-ended 5T degrades severely in comparison to a conventional 6T SRAM cell [20].

Various techniques like boosted supply (gate voltage of access transistor M5 is greater than VDD) generated from an additional circuit [20], gated-feedback write-assist [21], 7T dual-Vt [22], asymmetrical write/read-assist 8T [23], and cross-point data-aware 9T [24] have been proposed to mitigate the above issues associated with 5T. Another 9T cell [28] improves the RSNM by 4.1x as compared to conventional 6T cell using a read decoupled mechanism. The 9T [28] cell not only has larger write margin, but also has faster write time [28]. Still, none of the cells could fulfill the requirement of improving both read and write stability in the sub-threshold regime for ultra-low power applications.



Figure 2.1: Schematic of (a) Conventional 6T (b) Read decoupled 8T (RD-8T) (c) Conventional 5T and (d) Proposed 8T SRAM cell.

To design an efficient and stable SRAM cell, we should know the basic difference between single-ended and differential SRAM cell design. Reducing the cell to a single-ended structure (Figure 1.9) is attractive due to the potential for cell area reduction, because it eliminates the need for two pass transistors per cell. One more clear advantage of single-ended design is that the bit-line can be driven from rail to rail, eliminating the need for a conventional differential pair sense amplifier (which can lead to density and variability problems in differential designs). Furthermore, noise during a read operation is isolated to the single bit-line, making single-ended design inherently more robust to read upsets than conventional differential design. Therefore, instead of using the traditional differential structure (Figure 1.2), we employ a single-ended cell.

The main problem with a single-ended cell is that writing a “one” through an NMOS pass transistor poses a difficult design challenge. However, a single-ended cell also has considerable potential for active and standby power reduction, even if the number of transistors in the cell is not reduced. Although differential sensing is faster than single-ended sensing, caches with single-ended sensing are easier to design, and the performance gap between differential and single-ended sensing diminishes with technology scaling. Moreover, for 64 rows or less, the bit-line swing development rate may be fast enough that comparable delay is achieved by a single-ended full-swing sensing scheme. Consequently, single-ended sensing is emerging as an attractive alternative for on-chip cache [8].

To reduce the power consumption, the proposed SRAM cell uses single-ended read/write operation in the sub-threshold regime. To speed up the read/write operation, the proposed cell minimizes the fight between the respective conflicting MOSFETs by separating the read and write paths and providing read decoupling.

In this chapter, to solve the above issues, we design a novel sub-threshold 8T SRAM cell that operates at ultra-low voltages in a nanometer technology node. The cell uses single-ended write with dynamic feedback cutting to enhance write-ability and dynamic read decoupling to avoid read disturb [25]-[28].

As the 8T is single-ended, it can save more power and area compared to [28]. Here, we focus mainly on the stability of the cell, which is affected by process parameter variations. This work is an elaborate discussion of our previous work [27] on 8T, including comparisons with other single-ended cells such as the conventional 5T and 8T [23]. We have also emphasized the delay, power and half-select issues for both row and column. Apart from this, a 1kb SRAM array was also designed for both the proposed 8T and the conventional 6T. The circuit simulations are done in the UMC 90nm process technology at different power supplies.

For clarity, the proposed 8T cell is referred to as ‘8T’, the conventional 6T as ‘6T’, the conventional 5T as ‘5T’, the conventional read decoupled 8T cell as ‘RD-8T’ and the asymmetrical 8T [23] as ‘A-8T’.

This chapter presents the proposed 8T SRAM cell and analyzes the static noise margins, power consumption and performance of the proposed 8T and conventional 6T cells and 1kb arrays. The proposed 8T is compared with 5T, 6T, RD-8T, A-8T [23] and 9T [28] cells.

## 2.1 Proposed 8T SRAM Cell Design

To make a stable cell in all operations, a single-ended cell with dynamic feedback control (SE-DFC) is presented in Figure 2.1(d). The single-ended design is used to reduce the differential switching power during read-write operation. The power consumed during switching/toggling of data on a single bit-line is less than that on differential bit-line pair.

The SE-DFC enables writing through a single NMOS in 8T. It also separates the read and write path and exhibits read decoupling.

The structural change of the cell is considered to enhance immunity against process-voltage-temperature (PVT) variations. It improves the SNM of the 8T cell in the sub-threshold/near-threshold region. The proposed 8T has one cross coupled inverter pair, in which each inverter is made up of three cascaded transistors.

The motivation behind introducing three cascaded transistors in each branch of the cross-coupled inverter pair is to reduce the write and read sizing conflict, thereby making it easier to write into the memory cell. It also provides the possibility of read decoupling. A detailed description is given in Chapter 1 (Section).

The two stacked cross-coupled inverters, M1-M2-M4 and M8-M6-M5, retain the data during hold mode. The write word-line controls only one NMOS transistor, M7, which transfers the data from the single write bit-line (WBL). A separate read bit-line (RBL) is used to transfer the data from the cell to the output when the read word-line (RWL) is activated. Two column-biased feedback control signals, FCS1 and FCS2, control the feedback cutting transistors M6 and M2, respectively.

### 2.1.1 Write Operation

It is challenging to maintain WSNM in sub-threshold/near-threshold SRAM design due to the small gate overdrive, large load capacitance, and severe PVT variations. The cell is designed to reduce the pull-down strength using M2 and M6 (Figure 2.1(d)) for achieving better WSNM.

The feedback cutting scheme is used to write into the 8T. In this scheme, during a write ‘1’ operation FCS1 is made low, which switches off M6. When the read word-line (RWL) is made low and FCS2 high, M2 conducts, connecting QB to ground. Now, if the data applied to the write bit-line (WBL) is ‘1’ and the write word-line (WWL) is activated, then current flows via M7 (the arrow pointing from WBL to Q in Figure 2.2(a)) and raises the voltage on Q (shown in Figure 2.3(a)), writing ‘1’ into the cell. Moreover, when Q changes its state from ‘0’ to ‘1’, the inverter (M1-M2-M4) changes the state of QB from ‘1’ to ‘0’ (the current flowing from QB to ground in Figure 2.2(a)), as shown in the waveform of Figure 2.3(a).

To write a ‘0’ at Q, WWL is made high, FCS2 low and WBL is pulled to the ground. The direction of switching currents during write ‘0’ operation is shown in Figure 2.2(b). The low going FCS2 leaves QB floating, which can go to a small negative value, and then the current from pull-up PMOS M1 charges QB to ‘1’ as shown in Figure 2.3(b).



Figure 2.2: Schematic of 8T during (a) Write ‘1’ (b) Write ‘0’ operation.




Figure 2.3: Waveforms of 8T during (a) Write '1' (b) Write '0' operation.

### 2.1.2 Read Operation

The read operation is performed by pre-charging the read bit-line (RBL) and activating RWL. If '1' is stored at node Q, then M4 turns ON and creates a low-resistance path for the cell current (I<sub>read</sub>, shown in Figure 2.4(a)) to flow through RBL to ground. This discharges RBL quickly to ground (waveform shown in Figure 2.4(b)), which can be sensed by the full-swing inverter sense amplifier. Since WWL, FCS1 and FCS2 are held low during the read operation, there is no direct disturbance on the true storing node QB while reading the cell.

The low-going FCS2 leaves QB floating; it goes to a negative value and then comes back to its original '0' value after a successful read operation, as shown in Figure 2.4(b). If Q is high, then the size ratio of M3 and M4 governs the read current and the voltage difference on RBL.




Figure 2.4: (a) Schematic and (b) Waveforms of 8T during read '1' operation.

During a read ‘0’ operation, Q is ‘0’, RBL holds its pre-charged high value and the inverter sense amplifier gives ‘0’ at the output. Since M2 is off, virtual QB (VQB) is isolated from QB. This prevents disturbance of the QB node voltage, which ultimately reduces the read failure probability and improves the read static noise margin (RSNM).



Figure 2.5: Waveforms of 8T during read operation (a) Normal Read ‘1’ (b) FCS1/FCS2 turns ‘1’ before RWL turns ‘0’.

During a read operation, if FCS1/FCS2 turns ‘1’ before RWL is turned ‘0’, then QB and VQB can share charge, as shown in Figure 2.5. As WWL is ‘0’, no strong path exists between WBL and Q, and any disturbance in QB will not affect Q. When RWL subsequently goes low, the positive feedback restores the respective states ($Q = '1'$ and $QB = '0'$).

### 2.1.3 Half-selected Issue

Whenever a cell is selected for write operation, the voltage of the true storage node (Q) of row half-selected cells will rise due to charge transfer from the write bit-line (Figure 2.6(a)). The complementary storage node (QB) does not have a strong connection to the bit-line (RWL is off) and therefore there is less chance to flip the cell as compared to conventional 6T/8T cell.



Figure 2.6: Schematic of row half-selected 8T (a) Write (b) Read.

This can be verified by 1000 Monte Carlo (MC) simulations (the sample count was limited by the available computing facilities), as shown in Figure 2.7(a). Similarly, during the read operation (Figure 2.6(b)), the 1000 MC simulations show leakage immunity in row half-selected cells (Figure 2.7(b)).



Figure 2.7: 1000 MC simulations of row half-selected 8T (a) Write (b) Read.

Write control signals are common for all the cells connected in a column, and during the write operation of a cell, the other cells in the same column retain their data successfully. When QB of a column half-selected cell is ‘0’ and FCS2 goes low (write ‘0’ operation in the selected cell in the same column), QB floats for the write period (Figure 2.8(a) and Figure 2.9(a)). The parasitic and gate capacitances of transistors M5 and M8 connected to the true storage node QB of the column half-selected cells hold their data during the write operation for the selected cell (Figure 2.9(a)). The pulse width needed for the write operation is very small compared to the data retention time (in the $\mu$s range) of the half-selected cells while FCS2 is OFF to write ‘0’ in the selected cell.



Figure 2.8: Schematic of column half-selected 8T (a) Write ‘0’ (b) Read.

During a read operation, FCS1 and FCS2 go low in the whole column and QB of the column half-selected cell will be floating for the read period, as shown in Figure 2.8(b). There is a small variation in the floating QB because of weak driving currents from the power supply charging it as shown in Figure 2.9(b).



Figure 2.9: Waveforms of column half-selected 8T (a) Write '0' (b) Read.

The column half-selected cells can retain their data successfully even if the write/read or FCS1/FCS2 period is greater than required. To verify leakage immunity, 1000 MC simulations were performed during write (Figure 2.10(a)) and read (Figure 2.10(b)) operations. The floating QB shows a small variation because weak leakage currents from the power supply charge it (Figure 2.10).



Figure 2.10: 1000 MC simulations of column half-selected 8T (a) Write (b) Read.

### 2.1.4 Control Signal Generation

The feedback control signals namely FCS1 and FCS2 are data dependent. These signals are connected in column-wise configuration [26]-[28]. Input data and column address signals are used to generate these control signals. A common circuit is used for a single column. Therefore, there would be a small area overhead at the array level.

The proposed 8T cell has a single-ended read port (like the conventional read decoupled RD-8T) and, therefore, the number of cells per bit-line would be small compared to a differential 6T. Because the RBL is short, its parasitic capacitances are small and the delay/power in read/write operation would not be affected significantly.

The operation of the proposed cell is based on the conditions of word-lines, bit-lines and control signals, as presented in Table 2.1.

TABLE 2.1: OPERATION TABLE OF PROPOSED 8T SRAM CELL.

|             | <b>Hold</b> | <b>Read</b> | <b>Write '1'</b> | <b>Write '0'</b> |
|-------------|-------------|-------------|------------------|------------------|
| <b>WWL</b>  | '0'         | '0'         | '1'              | '1'              |
| <b>RWL</b>  | '0'         | '1'         | '0'              | '0'              |
| <b>FCS1</b> | '1'         | '0'         | '0'              | '1'              |
| <b>FCS2</b> | '1'         | '0'         | '1'              | '0'              |
| <b>WBL</b>  | '1'         | '1'         | '1'              | '0'              |
| <b>RBL</b>  | '1'         | Discharge   | '1'              | '1'              |
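
The conditions in Table 2.1 can be captured as a small lookup. The following is a minimal Python sketch, not part of the design flow: the names `OPERATION_TABLE`, `control_signals` and `feedback_cut` are illustrative assumptions, logic levels are written as 1/0, and the read-mode RBL entry is kept as the string `"discharge"`. The mapping of FCS1 to M6 and FCS2 to M2 follows the description in Section 2.1.

```python
# Illustrative encoding of Table 2.1 (names are hypothetical, levels are from the table).
OPERATION_TABLE = {
    "hold":   {"WWL": 0, "RWL": 0, "FCS1": 1, "FCS2": 1, "WBL": 1, "RBL": 1},
    "read":   {"WWL": 0, "RWL": 1, "FCS1": 0, "FCS2": 0, "WBL": 1, "RBL": "discharge"},
    "write1": {"WWL": 1, "RWL": 0, "FCS1": 0, "FCS2": 1, "WBL": 1, "RBL": 1},
    "write0": {"WWL": 1, "RWL": 0, "FCS1": 1, "FCS2": 0, "WBL": 0, "RBL": 1},
}


def control_signals(operation):
    """Return a copy of the word-line, bit-line and feedback-control levels."""
    return dict(OPERATION_TABLE[operation])


def feedback_cut(operation):
    """List the feedback-cutting transistors that are OFF for an operation.

    M6 is controlled by FCS1 and M2 by FCS2; a low level switches the device off.
    """
    signals = OPERATION_TABLE[operation]
    off = []
    if signals["FCS1"] == 0:
        off.append("M6")
    if signals["FCS2"] == 0:
        off.append("M2")
    return off
```

For example, `feedback_cut("write1")` reports that only M6 is off, which matches the write ‘1’ description in Section 2.1.1.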

### 2.1.5 Cell Layout

For comparison of area, the layout of 5T, 6T, RD-8T and proposed 8T are drawn in UMC 90nm CMOS technology as illustrated in Figure 2.11.



Figure 2.11: Layout of (a) 6T (b) RD-8T (c) 5T and (d) Proposed 8T.

The sizes of the MOSFETs used in the proposed 8T cell are depicted in Figure 2.11(d). The RD-8T occupies 1.5x the area of the 6T. Due to the design constraints and the contact area between M2, M3, M4, and M8, the proposed 8T occupies 2.8x the area of the 6T cell. Table 2.2 lists the layout areas of the 5T, 6T, RD-8T, and 8T cells. Even though the proposed cell has 2.8x the area of the 6T, its better built-in process tolerance and dynamic voltage applicability enable it to be employed in the same way as the cells presented in [12]-[19].

TABLE 2.2: LAYOUT AREA IN UMC 90nm TECHNOLOGY.

|                                          | 5T  | 6T    | RD-8T | 8T   |
|------------------------------------------|-----|-------|-------|------|
| <b>Area (<math>\mu\text{m}^2</math>)</b> | 1.2 | 1.4   | 2.1   | 3.9  |
| <b>Area/(5T area)</b>                    | 1x  | 1.16x | 1.75x | 3.2x |
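
The area ratios quoted in the text and in Table 2.2 can be reproduced directly from the absolute areas. This is a small illustrative sketch; the names `AREA_UM2` and `area_ratio` are assumptions introduced here, and only the four area values come from the table.

```python
# Layout areas in square micrometres, taken from Table 2.2.
AREA_UM2 = {"5T": 1.2, "6T": 1.4, "RD-8T": 2.1, "8T": 3.9}


def area_ratio(cell, reference):
    """Area of `cell` relative to the area of `reference`."""
    return AREA_UM2[cell] / AREA_UM2[reference]


for cell in AREA_UM2:
    print(f"{cell}: {area_ratio(cell, '5T'):.2f}x of 5T, "
          f"{area_ratio(cell, '6T'):.2f}x of 6T")
```

Values rounded to two decimals may differ in the last digit from the entries of Table 2.2, which appear to be truncated.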

## 2.2 Simulation & Analysis

To validate the design of the proposed 8T, post-layout circuit simulations were performed under iso-area conditions (6T is upsized to the same layout area as the proposed 8T), as suggested by [14]. As the RD-8T cell has a separate read path, the additional area can be used for the access transistors to improve the WSNM. Thus, the iso-area RD-8T cell has access transistors upsized 3x compared to those of its min-cell (Figure 2.11(b)).

The 6T is upsized to 4x its min-cell size (minimum possible W/L ratios for the respective technology, as shown in Figure 2.11(a)). Similarly, the 5T is upsized to 5x its min-cell size (Figure 2.11(c)). All simulations were performed at 25°C and 50 MHz. The effect of PVT variations on the cells is shown to justify the respective SNM in the sub-threshold region at ULV power supply.

The Monte Carlo (MC) simulations for 1000 samples considering inter/intra die random variations in threshold voltage ( $V_{TH}$ ) were performed at different power supplies at different process corners.

In this work, the stability during data retention is determined using a butterfly curve [29], as shown in Figure 2.12(a). The HSNM is estimated graphically as the edge of the largest square that can be inserted inside the lobes of the butterfly curve [29]. The simulation method based on this graphical technique is shown in Figure 2.12. To estimate SNM values, a procedure is needed that finds the diagonals of the maximum squares, as shown in Figure 2.12(b).

Figure 2.12(b) shows a formalized version of Figure 2.12(a) in two coordinate systems which are rotated 45° relative to each other. The subtraction of the v values of normal and mirrored inverter characteristics at given u yields absolute curve, in (u, v) coordinate system, that is a measure of the diagonal's length.

The maximum of the absolute value curve represents the required maximum squares [29]. It can be noticed that the diagonal lengths L1 and L2 are different for negative and positive variation in voltage. This is due to the asymmetric inverter pair in the proposed 8T SRAM cell. It is found that the Fast NMOS Slow PMOS (FS) corner shows the lowest HSNM value among all process corners (Fast NMOS Fast PMOS (FF), Slow NMOS Fast PMOS (SF), Slow NMOS Slow PMOS (SS) and Typical NMOS Typical PMOS (TT)) and, therefore, FS can be selected as the worst case corner for HSNM analysis [21]. The butterfly curve in Figure 2.12(a) shows that the proposed 8T retains the data successfully at the FS corner. The approach followed by [29] is used to find WSNM, RSNM, and HSNM from the butterfly curve (HSNM of 8T for 1000 MC samples is shown in Figure 2.12(a)).



Figure 2.12: (a) Butterfly curves of HSNM for 8T at VDD=0.3V (b) Calculation of SNM
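
The rotated-coordinate procedure described above can be sketched numerically. The code below is a minimal illustration, not the simulation setup used in this work: the function name `snm_from_butterfly`, the choice $u = (x - y)/\sqrt{2}$, $v = (x + y)/\sqrt{2}$ for the 45° rotation, and the tanh-shaped transfer curve in the example are all assumptions. It takes two inverter transfer curves sampled on a common, strictly increasing input grid, subtracts their $v$ values at equal $u$, and converts the two maximum diagonals (L1 and L2) into the side of the smaller square.

```python
import numpy as np


def snm_from_butterfly(vin, vout_a, vout_b):
    """Estimate SNM from two inverter transfer curves on the same vin grid.

    Curve A is plotted as y = f_a(x); curve B is mirrored, i.e. x = f_b(y).
    vin must be strictly increasing and both curves non-increasing.
    Returns (snm, (diag_1, diag_2)).
    """
    vin = np.asarray(vin, dtype=float)
    vout_a = np.asarray(vout_a, dtype=float)
    vout_b = np.asarray(vout_b, dtype=float)
    s = np.sqrt(2.0)

    # Rotate by 45 degrees: u runs across the lobes, v along the square diagonal.
    u_a, v_a = (vin - vout_a) / s, (vin + vout_a) / s   # points (x, f_a(x))
    u_b, v_b = (vout_b - vin) / s, (vout_b + vin) / s   # points (f_b(y), y)

    # np.interp needs increasing abscissae; u_b decreases with vin, so sort it.
    order = np.argsort(u_b)
    u_b, v_b = u_b[order], v_b[order]

    lo = max(u_a.min(), u_b.min())
    hi = min(u_a.max(), u_b.max())
    u = np.linspace(lo, hi, 2001)
    diff = np.interp(u, u_a, v_a) - np.interp(u, u_b, v_b)

    diag_1 = diff.max()        # diagonal of the largest square in one lobe
    diag_2 = (-diff).max()     # diagonal of the largest square in the other lobe
    return min(diag_1, diag_2) / s, (diag_1, diag_2)


# Example with a synthetic, symmetric transfer curve at VDD = 0.3 V (assumed shape).
vdd = 0.3
vin = np.linspace(0.0, vdd, 3001)
vtc = 0.5 * vdd * (1.0 - np.tanh(200.0 * (vin - 0.5 * vdd)))
snm, diagonals = snm_from_butterfly(vin, vtc, vtc)
print("SNM (V):", snm, "diagonals (V):", diagonals)
```

With identical curves the two diagonals coincide; feeding in two different transfer curves reproduces the unequal L1 and L2 discussed above, and the smaller one sets the noise margin.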

### 2.2.1 Write Static Noise Margin (WSNM)

The WSNM for different supply voltages (200mV-500mV) is shown in Figure 2.13 and Figure 2.14 at the slow NMOS and fast PMOS (SF) worst case corner. The insets of these figures show the variation of the mean ($\mu$) and standard deviation ($\sigma$) with varying VDD. In the 8T, the feedback is broken by M2 and M6 to create a high-impedance path between the storage node and ground. This enhances WSNM, as evident from Figure 2.13. The narrow peaks indicate a Gaussian distribution with high mean and low $\sigma$. It can be observed from Figure 2.13 that $\mu$ of WSNM for the proposed 8T increases linearly from 200mV to 500mV, with low standard deviation for all power supplies.



Figure 2.13: WSNM of 8T (MC 1000, SF corner). Inset:  $\log \mu$  and  $\log \sigma$ .
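
The linear trend of the WSNM mean can be checked against the tabulated values. The sketch below fits a straight line to the four $\mu$ values of the proposed 8T listed in Table 2.4 (SF corner); the variable names are illustrative, and the least-squares fit is only a convenience check, not part of the original analysis.

```python
import numpy as np

# WSNM mean of the proposed 8T at the SF corner, taken from Table 2.4 (in mV).
vdd_mv = np.array([200.0, 300.0, 400.0, 500.0])
wsnm_mean_mv = np.array([139.9, 227.2, 314.2, 400.0])

# Least-squares straight line: wsnm_mean ~ slope * vdd + intercept.
slope, intercept = np.polyfit(vdd_mv, wsnm_mean_mv, 1)
residual_mv = wsnm_mean_mv - (slope * vdd_mv + intercept)

print(f"slope = {slope:.3f} mV/mV, intercept = {intercept:.1f} mV")
print(f"largest deviation from the line = {np.abs(residual_mv).max():.2f} mV")
```

A deviation well below 1 mV over the whole 200mV-500mV range supports the statement that $\mu$ grows linearly with VDD.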

Figure 2.14 reveals that the Gaussian curves of the iso-area 6T are wider than those of the proposed 8T. This indicates that the iso-area 6T has a relatively lower $\mu$ and larger $\sigma$ than the proposed 8T during the write operation. The effect of process variations (mainly in the threshold voltage of the MOSFETs) is stronger on the 6T than on the 8T. These variations lower the mean and increase the standard deviation, as shown in Figure 2.14.



Figure 2.14: WSNM of 6T (MC 1000, SF corner). Inset:  $\log \mu$  and  $\log \sigma$ .

### 2.2.2 Read Static Noise Margin (RSNM)

Figure 2.15 and Figure 2.16 show RSNM plots for 8T and 6T, respectively at different VDD at fast NMOS and slow PMOS (FS) worst case corner. As the 8T cell has a separate read path for the cell current, the RSNM curves show relatively narrow peaks as shown in Figure 2.15.



Figure 2.15: RSNM of 8T (MC 1000, FS corner). Inset:  $\log \mu$  and  $\log \sigma$ .



Figure 2.16: RSNM of 6T (MC 1000, FS corner). Inset:  $\log \mu$  and  $\log \sigma$ .

The RSNM is lower than the WSNM but still high enough to read without read failure. This is due to the isolation of QB from VQB and RBL, which ensures read access with low disturbance on QB even for low RSNM values. The wide and distorted distributions (not following a Gaussian distribution) in Figure 2.16 indicate that the 6T has a higher read failure probability than the 8T under process variations. The lower values of $\mu$ and increased $\sigma$ justify the lower RSNM of the 6T compared to that of the proposed 8T, which is associated with read disturb.

### 2.2.3 Hold Static Noise Margin (HSNM)

The data retention capability of a cell can be tested by determining HSNM. As the worst case condition for a column of half-selected cells does not affect the data stored in the 8T cell during read/write operation, the HSNM is fairly good for the proposed 8T cell, as shown in Figure 2.17 at FS corner. The narrow peaks (in Figure 2.17) indicate low  $\sigma$  and linear  $\mu$  curve for proposed 8T.



Figure 2.17: HSNM of 8T (MC 1000, FS corner). Inset:  $\log \mu$  and  $\log \sigma$ .



Figure 2.18: HSNM of 6T (MC 1000, FS corner). Inset:  $\log \mu$  and  $\log \sigma$ .

The 6T cell fails (Figure 2.18) to follow the Gaussian distribution at low voltage (200mV) but shows successful retention at high supply voltages (300-500mV) with slightly higher  $\sigma$  compared to 8T. All results for HSNM are presented for fast NMOS and slow PMOS (FS) worst case corner.

### 2.2.4 Write Time

The write time is measured from the instant the WWL signal rises to $VDD/2$ until the storage nodes intersect each other. The simulations for write time (WT) were performed at all process corners. The write time (for write '1' and write '0') for 6T and 8T increases with decreasing power supply (Figure 2.19 and Figure 2.20). The write time is highest at the slow NMOS and slow PMOS (SS) worst case corner, as shown in Figure 2.19 and Figure 2.20. Because the proposed 8T has a single-ended write, its write '1' time (Figure 2.19(a)) and write '0' time (Figure 2.19(b)) are higher at all process corners than the differential 6T write times (Figure 2.20(a) and Figure 2.20(b)).



Figure 2.19: (a) Write '1' time of 8T and (b) Write '0' time of 8T.



Figure 2.20: (a) Write '1' time of 6T and (b) Write '0' time of 6T.
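
The write-time definition used above (from WWL crossing $VDD/2$ to the crossing of the storage nodes) can be applied to sampled waveforms as follows. This is a minimal sketch under stated assumptions: the function names `first_crossing` and `write_time` are hypothetical, the waveforms are assumed to be available as arrays on a common time axis, and crossings are located by linear interpolation between samples.

```python
import numpy as np


def first_crossing(t, y, level=0.0, t_min=-np.inf):
    """Time of the first crossing of `level` by y(t) at or after t_min, else None."""
    t = np.asarray(t, dtype=float)
    d = np.asarray(y, dtype=float) - level
    for i in range(len(t) - 1):
        if d[i] * d[i + 1] <= 0.0 and d[i] != d[i + 1]:
            # Linear interpolation between the two samples that bracket the level.
            tc = t[i] - d[i] * (t[i + 1] - t[i]) / (d[i + 1] - d[i])
            if tc >= t_min:
                return tc
    return None


def write_time(t, wwl, q, qb, vdd):
    """Time from WWL reaching VDD/2 until the storage nodes Q and QB intersect."""
    t_start = first_crossing(t, wwl, vdd / 2.0)
    if t_start is None:
        return None
    diff = np.asarray(q, dtype=float) - np.asarray(qb, dtype=float)
    t_end = first_crossing(t, diff, 0.0, t_min=t_start)
    return None if t_end is None else t_end - t_start
```

A return value of `None` means that either WWL never reached $VDD/2$ or the storage nodes never crossed, i.e. the write did not complete within the simulated window.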

### 2.2.5 Read Time

Read time is measured from the instant the RWL signal is activated until the read bit-line is discharged to 90%. The SS process corner shows the maximum read time, as presented in Figure 2.21(a) and Figure 2.21(b). It is followed by the SF corner and then by the other process corners. The 6T shows a similar variation with respect to the process corners, with slightly lower read time than the 8T, as depicted in Figure 2.21(b).



Figure 2.21: Read '1' time (a) 8T and (b) 6T.

### 2.2.6 Write Power

During write '1'/'0' operation, the power consumption of the 8T is highest at the fast NMOS and fast PMOS (FF) process corner, where it is dominated by fast switching activity (Figure 2.22). As the write '0' operation is faster than write '1' (Figure 2.3), the power consumption during write '0' is higher than that during write '1' (Figure 2.22(a) and Figure 2.22(b)).



Figure 2.22: (a) Write '1' power of 8T and (b) Write '0' power of 8T.

During write ‘1’/‘0’ operation, the power consumption of 6T varies in similar fashion as the proposed 8T at all process corners. As 6T is symmetrical in nature, write ‘1’ and write ‘0’ power consumption are nearly the same, as shown in Figure 2.23(a) and Figure 2.23(b), respectively. However, there is a slight difference in write ‘1’ and write ‘0’ power consumption due to MOSFET mismatch.



Figure 2.23: (a) Write ‘1’ power of 6T and (b) Write ‘0’ power of 6T.



Figure 2.24: Read ‘1’ power (a) 8T and (b) 6T.

### 2.2.7 Read Power

Similar to write power, the FF process corner condition draws the highest read power. Read power consumption at other process corners closely follows for different power supplies (Figure 2.24). High current from wide transistors in read path for iso-area 6T cell results in higher read power for 6T as compared to that of 8T at all process corners (Figure 2.24(a) and Figure 2.24(b)).

## 2.3 Comparison & Discussion

The Monte Carlo simulation results of different cells for the worst case process corner are elaborately discussed in this section. All presented cells are simulated in iso-area (5T, 6T and RD-8T are upsized to same layout area as proposed 8T) condition [14].

### 2.3.1 Write Static Noise Margin (WSNM)

In RD-8T and 6T, during the write operation there is a fight between the access and pull-down transistors. In the proposed 8T, on the other hand, FCS1 is low during the write operation, which turns OFF M6, thereby cutting the feedback, preventing the fight between the access and pull-down transistors and restricting the current from node Q to ground. When WWL is asserted, this provides unhindered charging of Q through WBL without any boosted supply on the gate of access transistor M7. This SE-DFC scheme enhances WSNM significantly and results in the highest $\mu$ of WSNM for the proposed 8T, 1.4x and 1.28x that of the iso-area 6T and RD-8T, respectively, whereas the 5T fails to write. Apart from the high $\mu$, the proposed cell has the lowest $\sigma$, 0.4x and 0.56x that of the 6T and RD-8T, respectively, as shown in Figure 2.25.



Figure 2.25: Comparison of WSNM at SF corner.

### 2.3.2 Read Static Noise Margin (RSNM)

The conventional read decoupled RD-8T cell has two separate NMOS transistors for the read operation. Therefore, there is no read-write design conflict and its $\mu$ of RSNM is better, 1.18x that of the proposed 8T. However, the RSNM mean of the proposed cell is good enough for a stable read operation under process variations, and its $\sigma$ is 0.79x that of RD-8T, as shown in Figure 2.26. It can be noticed that the RSNM plots of 5T and 6T are complete but, unlike those of the proposed 8T and RD-8T, they fail to follow a Gaussian distribution (Figure 2.26).



Figure 2.26: Comparison of RSNM at FS corner.

### 2.3.3 Hold Static Noise Margin (HSNM)

In data retention mode, all cells (5T, 6T, RD-8T and 8T) are able to sustain process variations at 300mV at FS corner. The  $\mu$  of HSNM of the proposed 8T is the best among the cells under consideration. Its value is 1.43x, 1.23x and 1.05x as that of iso-area 5T, 6T and RD-8T, respectively, as evident from Figure 2.27. Moreover,  $\sigma$  is the least among all, i.e. 0.65x, 0.63x and 0.76x as that of iso-area 5T, 6T and RD-8T, respectively.



Figure 2.27: Comparison of HSNM at FS corner.

### 2.3.4 Write and Read Time

The write ‘1’ time of the proposed 8T is compared to that of the referenced A-8T [23] because the 5T fails to perform the write ‘1’ operation. Figure 2.28(a) compares the time to write ‘1’ at the worst case SS corner for different cells. It can be observed that the proposed 8T requires only 0.28x the time taken by A-8T at 300mV.

Being a single-ended SRAM cell, the proposed 8T requires relatively more time over differential cells (in this context 6T and RD-8T) to perform write operation.

The write ‘1’ time of the proposed 8T is 7.05x and 3.71x that of RD-8T and 6T, respectively, at 300mV. Also, the read ‘1’ time of the proposed cell is 2.22x that of 5T/6T and 1.22x that of RD-8T at 300mV at the SS corner, as depicted in Figure 2.28(b).



Figure 2.28: Comparison at SS corner (a) Write ‘1’ (b) Read ‘1’ time.

### 2.3.5 Write and Read Power Consumption

Figure 2.29 depicts the average write and read power consumption for 5T, 6T, RD-8T and the proposed 8T cells at worst case FF corner. It is evident from Figure 2.29 that the proposed 8T consumes less write ‘0’ power of 0.72x, 0.60x and 0.85x as that consumed by iso-area 5T, 6T and RD-8T respectively, at 300mV. The read ‘1’ power consumption of the proposed 8T is 0.49x, 0.48x and 0.64x of iso-area 5T, 6T and RD-8T respectively, at 300mV (Figure 2.29(b)). Figure 2.30 depicts write and read energy consumption for 5T, 6T, RD-8T and the proposed 8T cells at worst case FF corner.



Figure 2.29: Comparison of power at FF corner (a) Write '0' and (b) Read '1'.



Figure 2.30: Comparison of energy at FF corner (a) Write '0' and (b) Read '1'.

### 2.3.6 Array Design

The proposed 8T with feedback cutting and read decoupled schemes is implemented in a  $64 \times 16$  bit SRAM array. The 1kb SRAM comprises an address decoder, 4 banks and each bank consists of 16 words  $\times$  16 bits, a sense amplifier unit and a control block as shown in Figure 2.31.



Figure 2.31: Block diagram of the proposed 1kb SRAM array.
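
The array organization can be checked with simple arithmetic: 4 banks of 16 words by 16 bits give 64 words and 1024 bits. The sketch below also shows one possible way to split a 6-bit word address into a bank and a row index; this particular bit assignment (upper two bits select the bank) is an assumption made for illustration and is not stated in the text, and all names are hypothetical.

```python
# Organization of the 1kb array as described in the text.
BANKS = 4
WORDS_PER_BANK = 16
BITS_PER_WORD = 16

TOTAL_WORDS = BANKS * WORDS_PER_BANK        # number of addressable words
TOTAL_BITS = TOTAL_WORDS * BITS_PER_WORD    # total storage capacity in bits


def split_address(word_address):
    """Split a word address into (bank, row).

    Assumed mapping: the upper 2 bits select the bank, the lower 4 bits the row.
    """
    if not 0 <= word_address < TOTAL_WORDS:
        raise ValueError("word address out of range")
    return divmod(word_address, WORDS_PER_BANK)


print(TOTAL_WORDS, "words x", BITS_PER_WORD, "bits =", TOTAL_BITS, "bits")
```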

The physical layout of the proposed 8T 1kb SRAM array in 90nm UMC CMOS technology is shown in Figure 2.32. A similar architecture is used to design a 1kb array for the 6T SRAM (Figure 2.33).



Figure 2.32: Layout of 1kb array of the proposed 8T.



Figure 2.33: Layout of 1kb array of conventional 6T.

To save the power/energy consumption, the array has been operated in sub-threshold regime. Both arrays are compared in Table 2.3 at 300mV and 10MHz. The power/energy consumption of 8T SRAM array during read and write operations is lower than 6T SRAM array. However, the read and write times are higher than 6T SRAM array.

TABLE 2.3: COMPARISON OF 1kb ARRAY OF 8T AND 6T SRAM AT 300mV.

| UMC 90nm | Write '1'<br>Power( $\mu$ W)<br>/Energy(fJ) | Write '1'<br>Time<br>(ns) | Write '0'<br>Power( $\mu$ W)<br>/Energy(fJ) | Write '0'<br>Time<br>(ns) | Read<br>Power( $\mu$ W)<br>/Energy(fJ) | Read Time<br>(ns) |
|----------|---------------------------------------------|---------------------------|---------------------------------------------|---------------------------|----------------------------------------|-------------------|
| 8T       | 6.71/6.71                                   | 49.88                     | 9.39/9.39                                   | 36.25                     | 15.8/15.8                              | 26.37             |
| 6T       | 19.32/19.32                                 | 23.94                     | 15.04/15.05                                 | 22.92                     | 36.1/36.1                              | 17.93             |
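
The trade-off summarized in Table 2.3 can be expressed as ratios of the 8T array to the 6T array. The sketch below uses only the power and time entries of the table; the dictionary and function names are illustrative assumptions.

```python
# Values from Table 2.3 (300 mV): (power in microwatts, time in nanoseconds).
ARRAY_8T = {"write1": (6.71, 49.88), "write0": (9.39, 36.25), "read": (15.8, 26.37)}
ARRAY_6T = {"write1": (19.32, 23.94), "write0": (15.04, 22.92), "read": (36.1, 17.93)}


def ratios(operation):
    """Return (power ratio, time ratio) of the 8T array relative to the 6T array."""
    p8, t8 = ARRAY_8T[operation]
    p6, t6 = ARRAY_6T[operation]
    return p8 / p6, t8 / t6


for operation in ARRAY_8T:
    power_ratio, time_ratio = ratios(operation)
    print(f"{operation}: power {power_ratio:.2f}x, time {time_ratio:.2f}x of 6T")
```

For every operation the power ratio is below 1 and the time ratio is above 1, which is the power-for-speed trade-off stated above.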

## 2.4 Comparison Summary

The architecture and schemes used for the proposed 8T cell make it highly immune to process variations at ULV. This is supported by the measured  $\mu$  and  $\sigma$  for write, read and hold modes, as presented in Table 2.4. The WSNM of the proposed 8T is the highest among all cells under consideration (5T, 6T, RD-8T and A-8T [23]).

The RSNM is comparable to that of RD-8T, while the HSNM is slightly improved compared to the other cells (5T, 6T and RD-8T). The proposed 8T cell has a lower delay than the single-ended A-8T [23] during write operation and nearly the same delay as the single-ended RD-8T during read operation.

The power consumption of the proposed 8T during read operation is 0.49x, 0.48x and 0.64x that of 5T, 6T and RD-8T, respectively, at 300mV.

It can be observed that the proposed cell saves more power during read/write operations than the other cells under consideration. Because the iso-area 5T, 6T and RD-8T cells have wider transistors, they leak more; the proposed 8T therefore has the lowest leakage current ( $I_{Leak}$ ) during standby mode, as shown in Table 2.5.

TABLE 2.4: COMPARISON OF MEAN ( $\mu$ ) AND STANDARD DEVIATION ( $\sigma$ ) FOR THE PROPOSED 8T, ISO-AREA 5T, 6T AND RD-8T SRAM CELLS

| Bit-cell           | SNM at worst process corner | VDD=200mV      |                  | VDD=300mV     |                  | VDD=400mV     |                  | VDD=500mV     |                  |
|--------------------|-----------------------------|----------------|------------------|---------------|------------------|---------------|------------------|---------------|------------------|
|                    |                             | $\mu$<br>(mV)  | $\sigma$<br>(mV) | $\mu$<br>(mV) | $\sigma$<br>(mV) | $\mu$<br>(mV) | $\sigma$<br>(mV) | $\mu$<br>(mV) | $\sigma$<br>(mV) |
| <b>Proposed 8T</b> | WSNM (SF)                   | 139.9          | 6.07             | 227.2         | 7.0              | 314.2         | 7.34             | 400.0         | 7.53             |
|                    | RSNM (FS)                   | 39.42          | 6.63             | 70.33         | 5.98             | 83.6          | 5.60             | 85.59         | 6.10             |
|                    | HSNM (FS)                   | 49.82          | 4.77             | 89.57         | 5.44             | 123.4         | 6.93             | 151.5         | 8.54             |
| <b>6T</b>          | WSNM (SF)                   | 121.3          | 11.5             | 163.1         | 12.3             | 202.9         | 11.97            | 238.5         | 11.99            |
|                    | RSNM (FS)                   | 28.45          | 6.99             | 53.73         | 7.04             | 69.75         | 15.65            | 70.51         | 16.9             |
|                    | HSNM (FS)                   | 42.22          | 5.60             | 72.6          | 8.55             | 85.20         | 10.82            | 102.9         | 13.3             |
| <b>RD-8T</b>       | WSNM (SF)                   | 139.2          | 11.6             | 176.3         | 11.1             | 225.6         | 11.32            | 271.4         | 11.63            |
|                    | RSNM (FS)                   | 45.32          | 4.70             | 78.9          | 8                | 97.20         | 11.1             | 110.93        | 14               |
|                    | HSNM (FS)                   | 47.22          | 4.60             | 83.69         | 7.55             | 99.30         | 10.82            | 120.9         | 13.3             |
| <b>5T</b>          | WSNM (SF)                   | Fails to write |                  |               |                  |               |                  |               |                  |
|                    | RSNM (FS)                   | 22.13          | 8.7              | 30.1          | 8.2              | 52.4          | 9.4              | 75.2          | 11.2             |
|                    | HSNM (FS)                   | 39.3           | 7.60             | 62.2          | 8.3              | 72.3          | 9.9              | 89.1          | 10.4             |
| <b>A-8T [23]</b>   | WSNM (SF)                   | 70             | Not given        | 90            | Not given        | 120           | Not given        | 170           | Not given        |
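
Process tolerance is often summarized by the ratio  $\mu/\sigma$  of a margin, since a high mean with a small spread leaves more headroom against failure. The following is a minimal sketch that computes this ratio from the 300mV columns of Table 2.4; the data structure and function name are illustrative, and only the numbers come from the table.

```python
# (mu in mV, sigma in mV) at VDD = 300 mV, transcribed from Table 2.4.
TABLE_2_4_300MV = {
    "Proposed 8T": {"WSNM": (227.2, 7.0), "RSNM": (70.33, 5.98), "HSNM": (89.57, 5.44)},
    "6T": {"WSNM": (163.1, 12.3), "RSNM": (53.73, 7.04), "HSNM": (72.6, 8.55)},
    "RD-8T": {"WSNM": (176.3, 11.1), "RSNM": (78.9, 8.0), "HSNM": (83.69, 7.55)},
}


def mu_over_sigma(cell, metric):
    """Return the mean-to-standard-deviation ratio of one margin for one cell."""
    mu, sigma = TABLE_2_4_300MV[cell][metric]
    return mu / sigma


for metric in ("WSNM", "RSNM", "HSNM"):
    ratios = {cell: mu_over_sigma(cell, metric) for cell in TABLE_2_4_300MV}
    best = max(ratios, key=ratios.get)
    print(metric, {cell: round(r, 2) for cell, r in ratios.items()}, "best:", best)
```

At 300mV the proposed 8T has the largest ratio for all three margins, even though RD-8T has a higher mean RSNM.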

The leakage and single-ended write/read operation of the proposed 8T allow 64 bit-cells/WBL and 32 bit-cells/RBL at 200mV. Like the proposed 8T, the 9T cell [28] also utilizes feedback cutting and dynamic read decoupling. The 9T cell has an RSNM of 23mV ( $P_{fail}=10^{-9}$ ) and, owing to its differential write operation with feedback cutting, a write trip point of 160mV.

TABLE 2.5: COMPARISON OF LEAKAGE OF ISO-AREA BIT-CELLS

|              | $I_{Leak}(nA)$ at<br><b>VDD=0.2V</b> | $I_{Leak}(nA)$ at<br><b>VDD=0.3V</b> | $I_{Leak}(nA)$ at<br><b>VDD=0.4V</b> | $I_{Leak}(nA)$ at<br><b>VDD=0.5V</b> |
|--------------|--------------------------------------|--------------------------------------|--------------------------------------|--------------------------------------|
| <b>8T</b>    | 11.43                                | 22.24                                | 37.35                                | 57.72                                |
| <b>6T</b>    | 24.94                                | 50.79                                | 89.44                                | 144.9                                |
| <b>RD-8T</b> | 13.94                                | 26.78                                | 44.66                                | 68.75                                |
| <b>5T</b>    | 32.06                                | 65.5                                 | 115.69                               | 187.94                               |
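
The leakage advantage in Table 2.5 can be expressed as a percentage reduction relative to each iso-area cell. The sketch below uses only the values from the table; the names are illustrative.

```python
# Standby leakage current in nA, transcribed from Table 2.5.
VDD_V = [0.2, 0.3, 0.4, 0.5]
I_LEAK_NA = {
    "8T": [11.43, 22.24, 37.35, 57.72],
    "6T": [24.94, 50.79, 89.44, 144.9],
    "RD-8T": [13.94, 26.78, 44.66, 68.75],
    "5T": [32.06, 65.5, 115.69, 187.94],
}


def leakage_reduction_percent(reference):
    """Percentage by which the 8T leakage is below the reference cell, per VDD."""
    return [
        100.0 * (1.0 - i_8t / i_ref)
        for i_8t, i_ref in zip(I_LEAK_NA["8T"], I_LEAK_NA[reference])
    ]


for cell in ("6T", "RD-8T", "5T"):
    reductions = leakage_reduction_percent(cell)
    print(cell, [f"{vdd} V: {r:.1f}%" for vdd, r in zip(VDD_V, reductions)])
```

Relative to 6T at 0.3V, for instance, the reduction is about 56%.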

The proposed 8T cell is also compared with the 10T cells [14], [17] found in the literature; the results are tabulated in Table 2.6. It is worth noticing that the  $\mu$  of WSNM of the proposed 8T cell is the highest, while its RSNM and HSNM are close to those of the 10T cells (Table 2.6).

TABLE 2.6: COMPARISON OF SNM AT 300mV WITH [14] AND [17]

| UMC 90nm Technology, $25^\circ\mathrm{C}$ , 300mV | SNM (Monte Carlo analysis) | $\mu$ (mV) | $\sigma$ (mV) |
|--------------------------------------|----------------------------|------------|---------------|
| <b>Proposed 8T</b>                   | HSNM (FS corner)           | 89.57      | 5.44          |
|                                      | RSNM (FS corner)           | 70.33      | 5.98          |
|                                      | WSNM (SF corner)           | 227.29     | 7.00          |
| <b>10T cell [14]</b>                 | HSNM (FS corner)           | 130        | 8.6           |
|                                      | RSNM (FS corner)           | 43.1       | 13            |
|                                      | WSNM (SF corner)           | 38.5       | 28.2          |
| <b>10T cell [17]</b>                 | HSNM (FS corner)           | 74.2       | 11.4          |
|                                      | RSNM (FS corner)           | 84.3       | 9.2           |
|                                      | WSNM (SF corner)           | 44.5       | 13.3          |

Further, the comparison of the proposed 8T with 9T [28] and 7T [25] is given in Table 2.7. The available data are taken directly from the respective articles. It is worth noticing that the RSNM of the proposed 8T cell is higher than that of 9T [28], because removing the second access transistor reduces the chance of leakage. On the other hand, owing to the differential write scheme of 9T [28], its write delay is lower than that of the proposed 8T, as presented in Table 2.7.

TABLE 2.7: COMPARISON OF THE PROPOSED 8T WITH 9T [28] AND 7T [25]

| Cell               | RSNM (mV)              | WSNM (mV)                  | Write Delay (ns)      | VDD-min (mV) | Array |
|--------------------|------------------------|----------------------------|-----------------------|--------------|-------|
| <b>Proposed 8T</b> | 70.33<br>(at VDD=0.3V) | 227                        | 1.41                  | 200          | 1kb   |
| <b>9T [28]</b>     | 23<br>(at VDD=0.3V)    | WTP=160mV<br>(at VDD=0.3V) | 1.02<br>(at VDD=0.3V) | 160          | 32kb  |
| <b>7T [25]</b>     | 200<br>(at VDD=1V)     | -                          | -                     | 440          | 64kb  |

## 2.5 Chapter Summary

An 8T SRAM cell with high data stability (high  $\mu$  and low  $\sigma$ ) that operates at ultra-low supply voltages is presented. We attained enhanced static noise margin (SNM) in the sub-threshold regime using a single-ended cell with dynamic feedback cutting (SE-DFC) and read decoupling schemes. The proposed cell's area is 2.8x that of 6T. Still, its better built-in process tolerance and dynamic voltage applicability enable it to be employed in the same way as cells (8T, 9T and 10T) that come with a 1.8x-2x area overhead.

The proposed 8T cell has high stability and can be operated at ultra-low supply voltages of 200-300mV. Its reduced power consumption makes it suitable for battery-operated SoC designs. Future applications of the proposed 8T cell can potentially be in low/ultra-low voltage and medium frequency operation, such as neural signal processors, sub-threshold processors, wide-operating-range IA-32 processors, FFT cores and low-voltage cache operation.

## References

- [1] Roy K. and Prasad S. (2003), Low Power CMOS VLSI Circuit Design, 1st ed. New York: Wiley.
- [2] Yoshinobu et al. (2003), Review and future prospects of low-voltage RAM circuits, *IBM J. Res. Devel.*, vol. 47, no. 5/6, pp. 525–552.
- [3] Kim et al. (2009), A voltage scalable 0.26 V, 64 kb 8T SRAM with voltage lowering techniques and deep sleep mode, *IEEE J. of Solid-State Circuits*, vol. 44, no. 6, pp. 1785–1795.
- [4] Gonzalez et al. (1997), Supply and threshold voltage scaling for low power CMOS, *IEEE J. Solid-State Circuits*, vol. 32, no. 8, pp. 1210–1216.
- [5] Khellah et al. (2007), A 256-kb dual-VCC SRAM building block in 65-nm CMOS process with actively clamped sleep transistor, *IEEE J. Solid-State Circuits*, vol. 42, no. 1, pp. 233–242.
- [6] Bhavnagarwala et al. (2001), The impact of intrinsic device fluctuations on CMOS SRAM cell stability, *IEEE J. Solid-State Circuits*, vol. 36, no. 4, pp. 658–665.
- [7] Cheng et al. (2004), The impact of random doping effects on CMOS SRAM cell, in Proc. 30th European Solid-State Circuits Conf. (ESSCIRC), Belgium, pp. 219–222.
- [8] Mukhopadhyay et al. (2005), Modeling of failure probability and statistical design of SRAM array for yield enhancement in nanoscaled CMOS, *IEEE Trans. Comput.-Aided Design (CAD) Integr. Circuits Syst.*, vol. 24, no. 12, pp. 1859–1880.
- [9] Wang A. and Chandrakasan A. (2005), A 180-mV subthreshold FFT processor using a minimum energy design methodology, *IEEE J. Solid-State Circuits*, vol. 40, pp. 310–319.
- [10] Markovic et al. (2010), Ultralow-power design in near-threshold region, *Proceedings of IEEE*, vol. 98, no. 2, pp. 237–252.
- [11] Verma N. and Chandrakasan A. P. (2008), A 256 kb 65 nm 8T subthreshold SRAM employing sense-amplifier redundancy, *IEEE J. Solid-State Circuits*, vol. 43, pp. 141–149.
- [12] Kushwah C. B. and Vishvakarma S.K. (2012), Ultra-Low Power Sub-threshold SRAM Cell Design to Improve Read Static Noise Margin, *Lecture Notes in Computer Science*, 7373, pp. 139-146.
- [13] Liu Z. and Kursun V. (2008), Characterization of a novel nine-transistor SRAM cell, *IEEE Trans. Very Large Scale Integr. (VLSI) Syst.*, vol. 16, pp. 488–492.

- [14] Kulkarni J. P., Kim K. and Roy K. (2007), A 160 mV robust Schmitt trigger based subthreshold SRAM, *IEEE J. Solid-State Circuits*, vol. 42, no. 10, pp. 2303–2313.
- [15] Kim T.-H., Liu J., and Kim C. H. (2007), An 8T subthreshold SRAM cell utilizing reverse short channel effect for write margin and read performance improvement, *Proceedings of IEEE Custom Integr. Circuits Conf. (CICC)*, CA, USA, pp. 241-244.
- [16] Calhoun B. H. and Chandrakasan A. P. (2007), A 256-kb 65-nm sub-threshold SRAM design for ultra-low-voltage operation, *IEEE J. Solid-State Circuits*, vol. 42, no. 3, pp. 680–688.
- [17] Lo C.-H. and Huang S.-Y. (2011), P-P-N based 10T SRAM cell for low-leakage and resilient subthreshold operation, *IEEE J. Solid-State Circuits*, vol. 46, no. 3, pp. 695–704.
- [18] Chang I. J. et al. (2009), A 32 kb 10T sub-threshold SRAM array with bit-interleaving and differential read scheme in 90 nm CMOS, *IEEE J. Solid-State Circuits*, vol. 44, no. 2, pp. 650–658.
- [19] Kushwah C. B., Dwivedi D. and Sathisha N. (2013), 8T Based SRAM Cell and Related Method, U. S. A., IBM docket no. IN920130218US1, Patent Pending.
- [20] Carlson I. et al. (2004), A high density, low leakage, 5T SRAM for embedded caches, *Proceedings of 30th Eur. Solid State Circuits Conf.*, Leuven, Belgium, pp. 215–218.
- [21] Zhai Bo (2008), A Variation-Tolerant Sub-200 mV 6-T Subthreshold SRAM, *IEEE J. Solid-State Circuits*, vol. 43, no. 10, pp. 2338–2348.
- [22] Tawfik S. and Kursun V. (2008), Low power and robust 7T dual-Vt SRAM circuit, *Proceedings of Int. Symp. Circuits Syst.*, Knoxville, Tennessee, pp. 1452–1455.
- [23] Tu et al. (2010), Single-ended subthreshold SRAM with asymmetrical write/read-assist, *IEEE Trans. Circuit and Systems-I*, vol. 57, no. 12, pp. 3039-3047.
- [24] Tu Ming-Hsien et al. (2012), A Single-Ended Disturb-Free 9T Subthreshold SRAM With Cross-Point Data-Aware Write Word-Line Structure, Negative Bit-Line, and Adaptive Read Operation Timing Tracing, *IEEE J. of Solid-State Circuits*, vol.47, no.6, pp.1469-1482.
- [25] Takeda K. et al. (2006), A read-static-noise-margin-free SRAM cell for low-VDD and high-speed applications, *IEEE J. Solid-State Circuits*, vol. 41, no. 1, pp. 113–121.
- [26] Kushwah C. B., Vishvakarma S. K. and Dwivedi D. (2014), Single-ended sub-threshold FinFET 7T SRAM cell without boosted supply, *Proceedings of IEEE International Conference on IC Design & Technology (ICICDT)*, Texas, USA, pp.1-4.

- [27] Kushwah C. B. and Vishvakarma S. K. (2014), A sub-threshold eight transistor (8T) SRAM cell design for stability improvement, *Proceedings of IEEE International Conference on IC Design & Technology (ICICDT)*, Texas, USA, pp. 1-4.
- [28] Chang M.-F. et al. (2011), A 130 mV SRAM with expanded write and read margins for sub-threshold applications, *IEEE J. Solid-State Circuits*, vol. 46, no. 2, pp. 520–529.
- [29] Seevinck E. et al. (1987), Static noise margin analysis of MOS SRAM cells, *IEEE J. Solid-State Circuits*, vol. SC-22, no. 10, pp. 748–754.

# Chapter 3

## Single-Ended Boost-Less (SE-BL) 7T Process Tolerant SRAM Design in Sub-Threshold Regime for Ultra-Low Power Applications

In this chapter, we design a new sub-threshold 7T SRAM cell that supports ULV operation in nanoscale technology nodes. To solve the issues related to write-ability and read stability, we propose a novel single-ended boost-less sub-threshold 7T SRAM cell that uses dynamic feedback cutting to enhance write-ability and dynamic read decoupling to avoid read disturb [25]-[28]. The proposed cell is a modified version of [27]: it uses only one bit-line and one feedback cutting signal, and saves one transistor compared to [27]. This work elaborates on our previous work [26] on 7T and includes comparisons with other single-ended and differential cells, such as the conventional 5T and 6T and the state-of-the-art 5T [31], 6T [21], 7T [22] and read-decoupled 8T (RD-8T) [11] SRAM cells. We emphasize delay, power and detailed statistical analysis. In addition, 1kb SRAM arrays for the proposed 7T and the conventional 6T were designed. Here, we focus mainly on the stability of the cell under the influence of process parameter variations. The circuit simulations are done in the UMC 90nm commercial process technology. Among reported SRAM designs, this is the first in which feedback cutting using a pull-down path is applied to a single-ended SRAM cell.

For the sake of clarity and ease of description, the proposed 7T is referred to as ‘7T’, the conventional 5T as ‘5T’ and the conventional 6T as ‘6T’. This chapter presents the proposed 7T SRAM cell along with its operation and the schemes utilized. The statistical results are compared and summarized.

## 3.1 Proposed 7T SRAM Cell: Schemes and Operation

The single-ended 5T cell shown in Figure 2.1(c) is attractive due to its reduced area and considerable active and standby power savings compared to the conventional 6T SRAM cell shown in Figure 2.1(a). However, the write-ability and read stability of a single-ended 5T cell are severely degraded compared to the conventional 6T SRAM cell. In order to make a cell that is stable in all modes of operation, a single-ended boost-less (SE-BL) cell with dynamic feedback cutting (DFC) is presented in Figure 3.1. The single-ended design is used to reduce the differential switching power during read/write operations. The DFC in the pull-down path makes it possible to write “1” into the single-ended 7T cell through the single NMOS pass transistor. It is also used to separate the read and write paths, providing read decoupling.

The structural change in the 7T cell enhances its immunity to process-voltage-temperature (PVT) variations in the sub-threshold/near-threshold region. The proposed 7T has one cross-coupled inverter pair, in which the left inverter is made up of three cascaded transistors. The motivation behind introducing three cascaded transistors in each branch of a cross-coupled inverter is discussed in Chapter 1. The two cross-coupled inverters, M3-M4-M5 and M1-M2, retain data during hold mode. The write word-line controls only one NMOS transistor, M7, which is used to transfer data between the single bit-line (BL) and Q. When the read word-line (RWL) is activated, BL is used to transfer data from the cell to the output. One column-biased feedback control signal (FCS) controls the feedback cutting transistor M4. Figure 3.1(b) depicts the layout of the proposed 7T cell in the commercial UMC 90nm process technology. The design is similar to that of the 5T cell, except for two additional transistors, M4 and M6, which result in 1.3x the area of 5T. Transistor M6 is connected between the bit-line (BL) and the drain terminal of M3. The activation of RWL enables M6 to create a path for the read operation. Transistor M4, placed in the pull-down path of the left inverter between M7 and M3, acts as a feedback controlling switch.



Figure 3.1: The proposed 7T SRAM cell in UMC 90nm (a) Schematic (b) Layout

The gate terminal of transistor M4 is controlled by FCS, which is generated and controlled by input data, read-enable, write-enable and column address signals. In data retention mode, write word-line (WWL) and RWL are low while FCS is high.

### 3.1.1 Area

For comparison of area, the layouts of 5T, 6T and 7T are drawn in UMC 90nm CMOS process technology and illustrated in Figure 2.11 and Figure 3.1. The sizes of the MOSFETs used in the proposed 7T cell are depicted in Figure 3.1(b). Due to design constraints and contact area, the proposed 7T has 1.33x the area of the 5T cell and 1.14x the area of 6T. Table 3.1 summarizes the layout areas of the 5T, 6T and 7T cells. Even though 7T has 1.33x the area of 5T, its better built-in process tolerance and dynamic voltage applicability enable it to be employed in low power applications.

TABLE 3.1: LAYOUT AREA IN UMC 90NM TECHNOLOGY

|                          | 5T   | 6T    | 7T    |
|--------------------------|------|-------|-------|
| Area ( $\mu\text{m}^2$ ) | 1.24 | 1.44  | 1.65  |
| Normalized area          | 1x   | 1.16x | 1.33x |

### 3.1.2 Write Operation

It is difficult to write “1” through NMOS transistor M5 in the 5T cell without a boosted supply (Figure 3.2(a)), because M5 cannot charge Q to the full VDD voltage. The proposed 7T cell, on the other hand, is designed to reduce the pull-down strength using M4, achieving better write-ability without a boosted supply or any external read/write assist.

During the write “1” operation, FCS is low, which turns OFF M4, thereby cutting the feedback and blocking the current path from node Q to ground. When WWL is asserted, Q is charged through BL unhindered, without any boosted supply on the gate of access transistor M7, as depicted in Figure 3.2(a).

On the other hand, during write “0”, the signals FCS, WWL and RWL are high. Therefore, node Q can quickly be pulled to zero through M6 and M7. This balances the writing speed for write ‘0’ and write ‘1’ and allows a common write pulse width to be set. The access transistor M7 provides the path between BL and Q. The high current needed through M7 during the write operation is obtained by increasing the width of M7 to 2x, which improves writing speed and allows the 7T cell to function at ULV supply, as shown in Figure 3.2(b).

### 3.1.3 Read Operation

The read operation is performed by pre-charging BL to VDD and activating RWL while WWL is held low. A “1” at node QB turns ON M3, which creates a low-resistance path for the cell current ( $I_{read}$ , shown in Figure 3.2(c)) to flow from BL to ground without disturbing node Q. This quickly discharges the bit-line (BL) to a voltage level that can be sensed by a full-swing inverter sense amplifier. Because WWL and FCS are low during the read operation, the true storage node Q is not directly disturbed, which reduces the failure probability under inter/intra-die variations.



Figure 3.2: Basic operation of the proposed 7T SRAM cell in UMC 90nm (a) Write “1” (b) Write “0” (c) Read “0”

### 3.1.4 Control Signal Generation

The feedback control signal (FCS) is data dependent and connected in a column-wise configuration [26]-[28]. Input data and column address signals are used to generate FCS. Because one common circuit serves a whole column, the area overhead at array level is small.

The proposed 7T cell has a single-ended read port (like the conventional read-decoupled RD-8T [11]-[14]), and therefore the number of cells per bit-line would be small compared to the differential 6T.

Because the BL is short, its parasitic capacitance is small and the delay/power during read/write operation would not be affected significantly. The operation of the proposed cell is based on the conditions of the word-lines, bit-line and control signal, as presented in Table 3.2.

TABLE 3.2: OPERATION TABLE OF THE PROPOSED 7T SRAM CELL

|            | <b>Hold</b> | <b>Read</b> | <b>Write ‘1’</b> | <b>Write ‘0’</b> |
|------------|-------------|-------------|------------------|------------------|
| <b>WWL</b> | ‘0’         | ‘0’         | ‘1’              | ‘1’              |
| <b>RWL</b> | ‘0’         | ‘1’         | ‘0’              | ‘1’              |
| <b>FCS</b> | ‘1’         | ‘0’         | ‘0’              | ‘1’              |
| <b>BL</b>  | ‘1’         | Discharge   | ‘1’              | ‘0’              |
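
Table 3.2 can be read as a small truth table. The sketch below encodes its rows and adds one possible logic function for FCS that is consistent with the table and with the statement that FCS depends on input data, read-enable, write-enable and column address. The function `fcs_level` and its argument names are assumptions made for illustration; the actual FCS generator circuit is not specified in this section.

```python
from typing import NamedTuple


class Signals(NamedTuple):
    wwl: int
    rwl: int
    fcs: int
    bl: str


# Rows of Table 3.2 for the selected cell.
OPERATIONS = {
    "hold": Signals(wwl=0, rwl=0, fcs=1, bl="1"),
    "read": Signals(wwl=0, rwl=1, fcs=0, bl="discharge"),
    "write1": Signals(wwl=1, rwl=0, fcs=0, bl="1"),
    "write0": Signals(wwl=1, rwl=1, fcs=1, bl="0"),
}


def fcs_level(column_selected, read_en, write_en, data_in):
    """Hypothetical FCS logic: low only while the selected column reads or writes '1'."""
    cut_feedback = column_selected and (read_en or (write_en and data_in == 1))
    return 0 if cut_feedback else 1


for name, sig in OPERATIONS.items():
    print(f"{name:7s} WWL={sig.wwl} RWL={sig.rwl} FCS={sig.fcs} BL={sig.bl}")
```

Under this assumed logic, columns that are not selected keep FCS high, which matches the hold row of the table.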

### 3.1.5 Half Select Issue

Whenever a cell is selected for a write operation, the voltage on the true storage node (Q) of the cells connected in the same row will rise due to charge transfer from the write bit-line. In 7T, the complementary storage node (QB) does not have a strong connection to the bit-line (the design is single-ended and RWL is off), and therefore the cell is less likely to flip than a 6T/RD-8T cell.

During write “1” or read operation on the selected cell in a column, FCS is made low and if Q of the half-selected cell (in same column) stores “0”, then Q will be floating (Figure 3.3(a)). This column half-selected cell can hold the data if the data retention time during floating is longer than the required FCS off-period for write “1” or read operation. The parasitic and gate capacitance of transistors M1 and M2 connected to the true storage node Q of the column half-selected cells will hold data during write “1” or read operation for the selected cell (Figure 3.3(b)).



Figure 3.3: Column half-selected cell during write “1” or read operation (a) Schematic (b) Waveforms

There can be a small variation in the voltage level of the floating node Q because of the weak currents from the power supply/BL charging it, as shown in Figure 3.3(b). The required FCS off time (the pulse width for the write/read operation) is short because the 7T cell has a fast write “1”/read time; hence, the column half-selected cells can hold their data successfully.

## 3.2 Simulation Results and Analysis of 7T, 5T and 6T

To validate the design of the proposed 7T, post-layout circuit simulations were performed under iso-area conditions (5T/6T are upsized to the same layout area as the proposed 7T), as discussed in [14], [20]. A column with 8 cells per bit-line is designed and simulated for each SRAM cell. For the proposed 7T, the overhead due to FCS and RWL is considered when comparing with the other SRAM cells. The effect of process parameter variations on the cells is shown in order to justify operation in the sub-threshold region at ULV power supply.

The analysis is done at all process corners (TT, FF, SS, FS, SF). The first letter of a corner name refers to the NMOS corner and the second to the PMOS corner; for example, FS indicates fast NMOS (low threshold voltage) and slow PMOS (high threshold voltage). There are three ‘even’ corners, i.e. typical-typical (TT), fast-fast (FF) and slow-slow (SS), and two ‘skewed’ corners, i.e. fast-slow (FS) and slow-fast (SF).

The FF corner (low threshold voltage and high drain current) is used to analyze power consumption (dynamic and leakage), the SS corner (high threshold voltage and low drain current) is used to analyze performance (read/write delay), and the FS and SF corners are used to check the operational functionality (hold, read and write) of SRAM cells [1]-[31].
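
The corner naming convention and the corner chosen for each kind of analysis can be captured in a small lookup. This is only an illustrative sketch; the function and dictionary names are not from the original work.

```python
CORNER_SPEED = {"T": "typical", "F": "fast", "S": "slow"}

# Which corners the text uses for which analysis.
ANALYSIS_CORNERS = {
    "power": ["FF"],
    "delay": ["SS"],
    "functionality": ["FS", "SF"],
}


def decode_corner(name):
    """Split a two-letter corner name into its NMOS and PMOS parts."""
    nmos, pmos = name[0], name[1]
    return {
        "nmos": CORNER_SPEED[nmos],
        "pmos": CORNER_SPEED[pmos],
        "skewed": nmos != pmos,
    }


for corner in ("TT", "FF", "SS", "FS", "SF"):
    print(corner, decode_corner(corner))
```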

### 3.2.1 Hold Static Noise Margin (HSNM)

The butterfly curve in Figure 3.4(a) displays a slight improvement in 7T HSNM compared to 5T at the typical NMOS typical PMOS (TT) corner. The measured HSNM for different process corners is shown in the chart in Figure 3.4(b). The proposed 7T has improved HSNM at all process corners except the fast NMOS slow PMOS (FS) corner. The FS corner shows the lowest HSNM value among all process corners (FF, FS, SF, SS, TT) and can therefore be selected as the worst-case corner.



Figure 3.4: (a) Butterfly curve to find HSNM for 7T and 5T at 200mV power supply (b) HSNM for 7T and 5T at different process corners and 200mV power supply

Figure 3.5 depicts the absolute-value curves of the diagonal length of the squares fitted in the lobes of the butterfly curve [29]. As both cells (5T and 7T) are asymmetrical, the peaks are of different heights, and the smallest peak is chosen as the worst-case HSNM. The HSNM values are lower than the thermal voltage ( $\sim 26$ mV) at a 200mV power supply (Figure 3.5(a)) and, therefore, operating 5T/7T at 200mV at the FS corner is not advisable. On the other hand, the HSNM value is sufficiently high at a 300mV power supply (Figure 3.5(b)), so data can be retained in hold mode at the FS corner.




Figure 3.5: Absolute value curves of diagonal's length for different power supplies (200-500mV) at FS corner (a) 5T (b) 7T

The HSNM of 5T (Figure 3.5(a)) and 7T (Figure 3.5(b)) increases with power supply (200-500mV). Consequently, the data failure rate is also reduced as shown in Figure 3.5.
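
The diagonal-based extraction of HSNM from a butterfly curve [29] can be sketched numerically. The example below rotates both voltage transfer curves by 45 degrees, takes their difference along the diagonal direction, and converts the smaller of the two lobe peaks into the side of the fitted square. The `inverter_vtc` function is a hypothetical tanh-shaped transfer curve used only to make the sketch runnable; it is not a model of the 5T or 7T cell, and the switching points and gain are assumed values.

```python
import numpy as np


def inverter_vtc(vin, vdd, vm, gain):
    """Hypothetical smooth inverter VTC with switching point vm and peak slope -gain."""
    return 0.5 * vdd * (1.0 - np.tanh(2.0 * gain * (vin - vm) / vdd))


def snm_from_butterfly(f1, f2, vdd, n=2001):
    """Side of the largest square that fits in the smaller lobe of the butterfly curve."""
    sweep = np.linspace(0.0, vdd, n)
    # Curve A: QB = f1(Q), plotted as (x, y) = (Q, f1(Q)).
    xa, ya = sweep, f1(sweep)
    # Curve B: Q = f2(QB), plotted as (x, y) = (f2(QB), QB).
    xb, yb = f2(sweep), sweep
    # Rotate by 45 degrees: u runs along (1, -1), v along (1, 1).
    ua, va = (xa - ya) / np.sqrt(2.0), (xa + ya) / np.sqrt(2.0)
    ub, vb = (xb - yb) / np.sqrt(2.0), (xb + yb) / np.sqrt(2.0)
    # Resample both curves on a common u grid (ub decreases, so reverse it).
    u = np.linspace(max(ua[0], ub[-1]), min(ua[-1], ub[0]), n)
    diff = np.interp(u, ua, va) - np.interp(u, ub[::-1], vb[::-1])
    # Positive and negative excursions are the square diagonals in the two lobes.
    diagonal = min(diff.max(), (-diff).max())
    return diagonal / np.sqrt(2.0)


VDD = 0.3  # assumed supply voltage in volts


def inv_balanced(v):
    return inverter_vtc(v, VDD, 0.15, 10.0)


def inv_skewed(v):
    return inverter_vtc(v, VDD, 0.12, 10.0)


snm_symmetric = snm_from_butterfly(inv_balanced, inv_balanced, VDD)
snm_skewed = snm_from_butterfly(inv_skewed, inv_balanced, VDD)
print(f"symmetric pair: {snm_symmetric * 1e3:.1f} mV, skewed pair: {snm_skewed * 1e3:.1f} mV")
```

Skewing one inverter makes the two lobes unequal, and taking the smaller peak lowers the reported margin, which is the same worst-case choice described above for the asymmetrical 5T and 7T cells.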

### 3.2.2 Read Margin

The read margin represents dynamic read stability. Because the proposed SRAM holds the data dynamically in read mode, it is more suitable to find the read margin from the transient response, as depicted in Figure 3.6. The minimum difference between the storage node voltages (Q and QB) is the measure of read margin (Figure 3.6(a)). The values measured using the definitions in Figure 3.6(a) are collected in Figure 3.6(b) for all process corners (FF, FS, SF, SS and TT). The bars showing negative values represent failed read operations (read disturb causes cell flipping in 5T/6T). The proposed 7T can be read with less read disturb at all process corners (FF, FS, SF, SS and TT), as shown in Figure 3.6(b).



Figure 3.6: Post layout read operation for 8 cells connected to a single bit-line (a) Waveforms and Read margin calculation at FS corner (b) Read margin for different process corners (FF, FS, SF, SS and TT)

Dynamic read decoupling provides a separate read path, so the read margin of the 7T increases linearly with VDD, while 5T has a negative read margin for 200-400mV. The conventional 6T fails to read at 200mV but can be read for a 300-500mV power supply at the FS corner, as shown in Figure 3.7(a). The 7T has a 33% improvement over 6T at 300mV and a 139% improvement over 5T at a 500mV power supply (Figure 3.7(b)).



Figure 3.7: (a) Read margin versus VDD at FS corner (b) Change in read margin against VDD

### 3.2.3 Write Trip Point (WTP)

Write-ability is measured using the write trip point (WTP) [30], as shown in Figure 3.8. The bit-line voltage is swept from “0” to VDD and the flipping of the cell is captured. The bit-line voltage at the trip point (crossing) of the internal storage nodes (Q and QB) represents the WTP. The proposed 7T needs a low bit-line voltage (high write-ability), while 5T fails to write, as shown in Figure 3.8(a) (write “1” at 200mV). As Q can be pulled to GND through an NMOS, 5T can be written with “0” without a boosted supply. Therefore, the WTP for write “0” is lower than that for write “1” for both 5T and 7T (Figure 3.8(b)). It is worth noticing that feedback cutting provides an easy write “1” in 7T, which needs less bit-line voltage (149.5mV at VDD=200mV and 313.9mV at VDD=500mV) to trip Q and QB, whereas 5T fails to write “1” at the slow NMOS fast PMOS (SF) worst-case corner. Therefore, we compare the WTP of 7T with that of 6T in Figure 3.9(a). Due to its differential write scheme (BLB pulls QB to GND and BL charges Q to  $\sim$ VDD), the conventional 6T has a lower WTP than 7T (Figure 3.9(a)). The proposed 7T has two NMOS pass-transistors (M6 and M7) for the write “0” operation, which leads to a WTP similar to that of 6T. Because the 5T has only one NMOS (M5) between Q and BL, its WTP is higher than that of 6T and 7T (Figure 3.9(b)).



Figure 3.8: Write Trip Point (WTP) for 200mV power supply at SF corner (a) Write “1” (b) Write “0”



Figure 3.9: Write Trip Point (WTP) against power supply at SF corner during (a) Write “1” (b) Write “0”
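
The WTP definition used in this section (the bit-line voltage at which Q and QB cross during a bit-line sweep) can be extracted from sweep data as sketched below. The waveforms here are synthetic, tanh-shaped stand-ins with an assumed trip voltage of 150mV; in practice the arrays would come from a circuit simulator. The function name is illustrative.

```python
import numpy as np


def write_trip_point(v_bl, v_q, v_qb):
    """Bit-line voltage at the first crossing of Q and QB, or None if the cell never flips."""
    diff = np.asarray(v_q) - np.asarray(v_qb)
    crossings = np.where(np.sign(diff[:-1]) != np.sign(diff[1:]))[0]
    if crossings.size == 0:
        return None  # no crossing: the write fails over the whole sweep
    i = crossings[0]
    # Linear interpolation between the two sweep points that bracket the crossing.
    frac = diff[i] / (diff[i] - diff[i + 1])
    return float(v_bl[i] + frac * (v_bl[i + 1] - v_bl[i]))


VDD = 0.2  # assumed supply voltage in volts
v_bl = np.linspace(0.0, VDD, 201)

# Synthetic write "1": Q rises and QB falls around an assumed trip voltage of 0.15 V.
v_q = 0.5 * VDD * (1.0 + np.tanh((v_bl - 0.15) / 0.01))
v_qb = VDD - v_q
wtp = write_trip_point(v_bl, v_q, v_qb)

# Synthetic failed write: Q stays at 0 and QB stays at VDD for every bit-line voltage.
wtp_failed = write_trip_point(v_bl, np.zeros_like(v_bl), np.full_like(v_bl, VDD))

print("WTP:", wtp, "failed write:", wtp_failed)
```

Returning `None` when no crossing exists mirrors the "fails to write" entries reported for 5T.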

### 3.2.4 Delay

The plots of delay against supply voltage for a single column containing 8 cells per bit-line are displayed in Figure 3.10. The write delay is measured from the time the WWL signal rises to VDD/2 until the storage node is discharged or charged to VDD/2. The read delay is measured from the time the read word-line signal is activated until the bit-line is discharged to 10% of VDD. As the read path has similarly sized MOSFETs, the proposed 7T SRAM shows a read delay similar to that of 5T/6T for a 200-500mV power supply at the slow NMOS slow PMOS (SS) corner (Figure 3.10(a)).

Since 5T fails to write “1”, the write delay of 7T is compared with that of 6T. The write delay of 7T is 32.9ns at VDD=200mV, which is 3.45x that of 6T, whereas the two are similar at higher supply voltages up to 500mV at the SS corner (Figure 3.10(b)).



Figure 3.10: Post layout delay vs. VDD for 8 cells per bit-line at SS corner (a) Read delay (b) Write delay
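
The write-delay definition above (from WWL crossing VDD/2 to the storage node crossing VDD/2) can be applied to sampled waveforms as in the following sketch. The waveforms are synthetic: a linear WWL ramp and an exponential charging of Q with an assumed time constant. All numbers are placeholders rather than simulation results, and the function names are illustrative.

```python
import numpy as np


def crossing_time(t, v, level, rising=True):
    """Time of the first crossing of `level`, using linear interpolation between samples."""
    s = (v - level) if rising else (level - v)
    idx = np.where((s[:-1] < 0.0) & (s[1:] >= 0.0))[0]
    if idx.size == 0:
        return None
    i = idx[0]
    return float(t[i] - s[i] * (t[i + 1] - t[i]) / (s[i + 1] - s[i]))


def write_delay(t, v_wwl, v_q, vdd, q_rising=True):
    """Delay from WWL reaching VDD/2 to the storage node reaching VDD/2."""
    t_start = crossing_time(t, v_wwl, 0.5 * vdd, rising=True)
    t_stop = crossing_time(t, v_q, 0.5 * vdd, rising=q_rising)
    if t_start is None or t_stop is None:
        return None
    return t_stop - t_start


VDD = 0.3   # assumed supply voltage in volts
TAU = 5.0   # assumed charging time constant of node Q in ns
t = np.linspace(0.0, 40.0, 4001)                      # time axis in ns
v_wwl = VDD * np.clip(t - 1.0, 0.0, 1.0)              # WWL ramps from 1 ns to 2 ns
v_q = VDD * (1.0 - np.exp(-np.clip(t - 2.0, 0.0, None) / TAU))  # Q charges after 2 ns

delay_ns = write_delay(t, v_wwl, v_q, VDD)
print(f"write '1' delay = {delay_ns:.2f} ns")
```

For a write “0”, the same function can be called with `q_rising=False` so that the falling crossing of the storage node is detected instead.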

### 3.2.5 Power Consumption

The plots of average power consumption against supply voltage for a single column containing 8 cells per bit-line at the FF corner are shown in Figure 3.11. Both 5T and 7T are single-ended and save read power compared to 6T for all VDDs (200-500mV). The read operation of 7T saves more power than that of 5T for a 300-500mV power supply (Figure 3.11(a)). The write power is similar for 7T and 5T, but 7T saves more power than 5T and 6T for a 300-500mV power supply at the fast NMOS fast PMOS (FF) corner (Figure 3.11(b)).



Figure 3.11: Post layout average power consumption vs. VDD for 8 cells per bit-line at FF corner (a) Read power (b) Write power

### 3.3 Statistical Analysis of 7T, 5T and 6T

To validate the proposed design, circuit simulations were performed for similar conditions for the conventional and the proposed circuits. During simulations, 27°C temperature and 200-500mV power supply were used. The model parameters were taken from 90nm UMC FDK commercial technology. Monte Carlo (MC) simulations with  $V_{\text{TH}}$  variations were performed for 1000 samples and presented at worst case process corners.
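The structure of such a Monte Carlo run can be sketched as follows. This is only an outline under stated assumptions: in this work each sample is a full circuit simulation, whereas here the simulator is replaced by a placeholder callable. The names `monte_carlo` and `toy_hsnm_mv`, the threshold-voltage spread of 10 mV and the linear toy relation are all hypothetical and do not represent the 90nm device models.

```python
import random
import statistics
from typing import Callable, Tuple


def monte_carlo(metric: Callable[[float], float],
                sigma_vth_mv: float,
                n_samples: int = 1000,
                seed: int = 0) -> Tuple[float, float]:
    """Draw Gaussian V_TH shifts and return (mean, std) of the metric."""
    rng = random.Random(seed)  # fixed seed keeps the run repeatable
    samples = [metric(rng.gauss(0.0, sigma_vth_mv)) for _ in range(n_samples)]
    return statistics.mean(samples), statistics.stdev(samples)


def toy_hsnm_mv(delta_vth_mv: float) -> float:
    """Placeholder for a circuit simulation: NOT a device model.

    Assumes, purely for illustration, a 50 mV nominal margin that moves
    linearly with the threshold-voltage shift.
    """
    return 50.0 - 0.5 * delta_vth_mv


mu, sigma = monte_carlo(toy_hsnm_mv, sigma_vth_mv=10.0)
```

In a real flow, `metric` would wrap a simulator call that returns HSNM, read margin or WTP for one sampled set of threshold voltages.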

### 3.3.1 Hold Static Noise Margin (HSNM)

The HSNM of the discussed cells (5T, 6T and 7T) follows a Gaussian distribution over 1000 MC simulations at the worst case corner (FS). The closely spaced peaks indicate similar mean ( $\mu$ ) and standard deviation ( $\sigma$ ) values of HSNM for 5T, 6T and 7T (Figure 3.12(a)-(b) and Figure 3.13(a)). As 6T is structurally symmetrical, its HSNM is higher than that of 5T and 7T, as shown in Figure 3.13(b).



Figure 3.12: Monte Carlo (MC) simulation for 1000 samples at FS corner (a) HSNM at 300mV (b) HSNM at 400mV



Figure 3.13: Monte Carlo (MC) simulation for 1000 samples at FS corner (a) HSNM at 500mV (b) Comparison of HSNM mean for 5T, 6T and 7T at all power supplies

### 3.3.2 Read Margin

Statistical analysis (1000 Monte Carlo simulations) was undertaken for the read margin at the FS corner. The mean ( $\mu$ ) values of the read margin are collected in Figure 3.14(a). 5T fails to read at low VDD (200-300mV), whereas 7T has a linear read margin profile with a higher mean value than 5T and 6T. 6T fails to read at 200mV, but at 400-500mV its read margin is higher than that of 5T (and lower than that of 7T).

### 3.3.3 Write Trip Point (WTP)

The mean ( $\mu$ ) values (from 1000 Monte Carlo simulations) for WTP at the SF corner are collected in Figure 3.14(b). 5T fails to write, while 6T has a linear WTP profile that is the lowest among 5T, 6T and 7T for a 200-500mV power supply at the SF corner. In Table 3.6, 7T has the highest mean/standard deviation ( $\mu/\sigma$ ) ratio of read margin compared to 5T and 6T for 200-500mV. The  $\mu/\sigma$  of WTP is also low enough to write “1” while maintaining a  $\mu/\sigma$  of HSNM similar to that of 5T. This confirms that the 7T SRAM has higher immunity to variations in process parameters compared to 5T.
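The $\mu/\sigma$ comparison of read margin can be reproduced directly from the tabulated Monte Carlo results. The sketch below uses the read margin means and standard deviations (in mV) listed for 7T, 5T and 6T in Table 3.6; the dictionary layout and the helper name `mu_over_sigma` are illustrative choices.

```python
from typing import Dict, List, Tuple

VDD_MV = [200, 300, 400, 500]

# (mean, standard deviation) of read margin in mV at the FS corner,
# copied from the 7T, 5T and 6T rows of Table 3.6.
READ_MARGIN: Dict[str, List[Tuple[float, float]]] = {
    "7T": [(181.1, 14.81), (276.0, 7.04), (376.2, 6.72), (475.3, 6.67)],
    "5T": [(-160.5, 73.97), (-124.1, 225.0), (108.0, 271.9), (233.7, 240.8)],
    "6T": [(43.1, 35.76), (190.3, 16.07), (351.2, 13.76), (436.3, 11.85)],
}


def mu_over_sigma(cell: str) -> List[float]:
    """Return the mean-to-sigma ratio of read margin for each VDD."""
    return [mu / sigma for mu, sigma in READ_MARGIN[cell]]


ratios = {cell: mu_over_sigma(cell) for cell in READ_MARGIN}
for i, vdd in enumerate(VDD_MV):
    best = max(ratios, key=lambda cell: ratios[cell][i])
    print(f"VDD={vdd}mV: highest mu/sigma -> {best}")
```

A negative ratio, as obtained for 5T at 200-300mV, reflects a negative mean read margin, that is, a read failure on average.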



Figure 3.14: Monte Carlo (MC) simulation for 1000 samples (a) Read margin mean against VDD at FS corner (b) WTP mean against VDD at SF corner

### 3.4 Comparison of 7T with state-of-art SRAM cells

In this section, we compare the proposed 7T SRAM cell with some of the recently reported state-of-the-art SRAM designs. The proposed SRAM cell has many design similarities with the referenced designs like single-ended operation, feedback cutting and read decoupling. In spite of these similarities, 7T has some novelties in its design. To show these novelties and significance of this research work we have evaluated single-ended cells, differential cells, single-ended plus differential cells and cells with read/write assist.

### 3.4.1 Design and Scheme Comparison

The conventional and state-of-art SRAM cells and the schemes they use were described in the introduction section; against that background, the design novelty of the proposed 7T SRAM cell can be identified. To visualize the differences in cell design, we compare the 7T SRAM cell with the other referenced SRAM cells in Table 3.3.

TABLE 3.3: DESIGN DETAILS OF VARIOUS SRAM CELLS.

| Bit-cell           | Normalized Area | #BL      | #WL      | Feed-back Cutting | Read Decoupling | Writing             | Reading             |
|--------------------|-----------------|----------|----------|-------------------|-----------------|---------------------|---------------------|
| <b>Proposed 7T</b> | <b>1.33x</b>    | <b>1</b> | <b>2</b> | <b>Yes</b>        | <b>Yes</b>      | <b>Single Ended</b> | <b>Single Ended</b> |
| <b>5T</b>          | 1x              | 1        | 1        | No                | No              | Single Ended        | Single Ended        |
| <b>BW-5T [31]</b>  | 1x              | 1        | 1        | No                | No              | Single Ended        | Single Ended        |
| <b>6T</b>          | 1.16x           | 2        | 1        | No                | No              | Differential        | Differential        |
| <b>DVT-7T [22]</b> | 1.34x           | 2        | 2        | No                | Yes             | Single Ended        | Single Ended        |
| <b>SE-6T [21]</b>  | 2.32x           | 1        | 2        | Yes               | No              | Single Ended        | Single Ended        |
| <b>RD-8T [11]</b>  | 1.69x           | 3        | 2        | No                | No              | Differential        | Single Ended        |
| <b>A-8T [23]</b>   | 2.55x           | 1        | 2        | No                | No              | Single Ended        | Single Ended        |
| <b>7T [25]</b>     | 1.32x           | 2        | 3        | Yes               | Yes             | Differential        | Single Ended        |
| <b>9T [28]</b>     | 2.08x           | 3        | 2        | Yes               | Yes             | Differential        | Single Ended        |
| <b>10T [14]</b>    | 2.32x           | 2        | 1        | No                | No              | Differential        | Differential        |
| <b>10T [17]</b>    | 3.19x           | 2        | 1        | No                | Yes             | Differential        | Differential        |

The feedback cutting scheme is used by the proposed 7T, single-ended 6T (SE-6T) [21], 7T [25] and 9T [28]. The read decoupling scheme is used by the proposed 7T, dual-Vt 7T (DVT-7T) [22], read-decoupled 8T (RD-8T) [11], 10T [17], 7T [25] and 9T [28]. The proposed 7T, 5T, boosted word-line 5T (BW-5T) [31], DVT-7T [22], SE-6T [21] and asymmetrical 8T (A-8T) [23] use single-ended read/write schemes. The novelty of the proposed 7T design lies in the feedback cutting and read decoupling with single-ended read and write operations, while 7T [25] and 9T [28] use differential write operation.

### 3.4.2 Stability Comparison

The similar design of the cross-coupled inverters in 7T, 5T, BW-5T, DVT-7T and SE-6T gives similar HSNM for these cells. On the other hand, RD-8T and 6T (highest HSNM) have higher HSNM compared to 7T (Figure 3.15(a)). The dynamic read decoupling used in 7T provides a read margin similar to that of RD-8T and DVT-7T for 200-500mV (Figure 3.15(a)). In SE-6T, the feedback loop between the two cross-coupled inverters is weakened by applying virtual VDD and virtual GND. This results in a higher read margin than that of the single-ended BW-5T (lowest read margin).



Figure 3.15: (a) Read margin against VDD at FS corner (b) Write Trip Point (WTP) against VDD at SF corner

The single-ended 7T, DVT-7T, SE-6T and BW-5T cells have a higher WTP than the differential RD-8T (lowest WTP) for 200-500mV, as shown in Figure 3.15(b). The use of a boosted word-line in BW-5T allows it to write into the cell, but it still has a higher WTP than 7T, DVT-7T, SE-6T and RD-8T.

### 3.4.3 Delay Comparison

As 7T has small sized MOSFETs in the read path like those in DVT-7T, SE-6T and RD-8T, it shows a similar read delay, as given in Figure 3.16(a). The SE-6T has the lowest read delay because of the presence of a transmission-gate in its read path. The RD-8T has a differential write operation, and both bit-lines help to charge and discharge the storage nodes faster than in the single-ended 7T, DVT-7T, SE-6T and BW-5T cells (Figure 3.16(b)). The feedback cutting in 7T allows it to quickly charge Q, which results in a lower write delay than that of DVT-7T and BW-5T (highest write delay).



Figure 3.16: Post layout delay versus VDD for 8 cells per bit-line at SS corner (a) Read delay (b) Write delay

### 3.4.4 Power Comparison

The proposed 7T consumes the least read power of all the compared cells (5T, 6T, DVT-7T, SE-6T, RD-8T and BW-5T) at 200-500mV. On the other hand, SE-6T consumes the highest read power among all reported cells, as shown in Figure 3.17(a). Due to its differential write operation, RD-8T consumes the highest write power compared to the single-ended 7T, DVT-7T, SE-6T and BW-5T cells, as shown in Figure 3.17(b). The write power consumption of 7T lies between that of BW-5T and DVT-7T (lowest), as depicted in Figure 3.17(b).



Figure 3.17: Post layout average power consumption against VDD for 8 cells per bit-line at FF corner (a) Read power (b) Write power

### 3.5 Comparison Summary

As all the reported SRAM cells suffer from operation failure at 200mV, we compared the cells at 300mV under worst case conditions in Table 3.4 and summarize the comparison in Table 3.5. Table 3.4 shows that 7T, 5T, DVT-7T, BW-5T and SE-6T have similar HSNM. The symmetrical conventional 6T has 26% higher HSNM (the highest) and RD-8T has 17% higher HSNM than 7T, as shown in Table 3.5. SE-6T, 6T and BW-5T (lowest read margin) have lower read margin than 7T, while DVT-7T and RD-8T (highest read margin) have higher read margin. During write ‘1’, BW-5T and DVT-7T (highest WTP) have higher WTP than 7T, while 6T, SE-6T and RD-8T (lowest WTP) have lower WTP. During write ‘0’, 7T, BW-5T, SE-6T and RD-8T have similar WTP, while 5T has 22% higher WTP (the highest) and 6T has 14% lower WTP (the lowest) than 7T. The read delay of 5T is the highest (29% higher than 7T) and that of SE-6T is the lowest (21% lower than 7T). The write delay of BW-5T is the highest (57% higher than 7T) and that of RD-8T is the lowest (27% lower than 7T). The read power of 7T is the lowest among all the reported cells, while SE-6T has the highest read power. The write power of 7T is lower than that of 5T, BW-5T, 6T, SE-6T and RD-8T, but DVT-7T consumes 23% less write power than 7T.

TABLE 3.4: COMPARISON SUMMARY OF 5T, 6T AND 7T AT 300mV AT WORST CASE CORNERS

|                        | HSNM<br>(mV) | Read<br>Margin<br>(mV) | WTP<br>“1”<br>(mV) | WTP<br>“0”<br>(mV) | Read<br>Delay<br>(ns) | Write<br>Delay<br>(ns) | Read<br>Power<br>( $\mu$ W) | Write<br>Power<br>( $\mu$ W) |
|------------------------|--------------|------------------------|--------------------|--------------------|-----------------------|------------------------|-----------------------------|------------------------------|
| <b>7T</b>              | 49.9         | 259.6                  | 223.4              | 81.1               | 7.2                   | 8.4                    | 0.46                        | 0.48                         |
| <b>5T</b>              | 52.2         | Fails                  | Fails              | 99.1               | 9.3                   | Fails                  | 0.51                        | 0.52                         |
| <b>BW-5T [31]</b>      | 46.1         | 120.7                  | 267.35             | 85.32              | 7.4                   | 13.2                   | 0.55                        | 0.56                         |
| <b>6T</b>              | 63.01        | 195.8                  | 139.7              | 69.5               | 6.2                   | 8.1                    | 0.63                        | 0.72                         |
| <b>DVT-7T<br/>[22]</b> | 45.6         | 282.1                  | 290.17             | 94.2               | 7.06                  | 12.45                  | 0.54                        | 0.37                         |
| <b>SE-6T [21]</b>      | 48.3         | 225.3                  | 194.39             | 78.3               | 5.66                  | 9.51                   | 0.71                        | 0.57                         |
| <b>RD-8T [11]</b>      | 58.5         | 293.5                  | 134.74             | 76.5               | 6.63                  | 6.14                   | 0.61                        | 0.81                         |

TABLE 3.5: PERCENTAGE CHANGE IN VARIOUS PARAMETERS OF DIFFERENT CELLS W.R.T 7T AT 300mV AT WORST CASE CORNERS

| Bit-cell               | HSNM          | Read<br>Margin         | WTP<br>“1”         | WTP<br>“0”         | Read<br>Delay         | Write<br>Delay         | Read<br>Power               | Write<br>Power               |
|------------------------|---------------|------------------------|--------------------|--------------------|-----------------------|------------------------|-----------------------------|------------------------------|
| <b>5T</b>              | 5%<br>Higher  | Fails                  | Fails              | 22%<br>Higher      | 29%<br>Higher         | Fails                  | 11%<br>Higher               | 8%<br>Higher                 |
| <b>BW-5T<br/>[31]</b>  | 3%<br>Higher  | 54%<br>Lower           | 20%<br>Higher      | 5%<br>Higher       | 3%<br>Higher          | 57%<br>Higher          | 20%<br>Higher               | 17%<br>Higher                |
| <b>6T</b>              | 26%<br>Higher | 25%<br>Lower           | 37%<br>Lower       | 14%<br>Lower       | 14%<br>Lower          | 4%<br>Lower            | 37%<br>Higher               | 50%<br>Higher                |
| <b>DVT-7T<br/>[22]</b> | 9%<br>Lower   | 9%<br>Higher           | 30%<br>Higher      | 16%<br>Higher      | 2%<br>Lower           | 48%<br>Higher          | 17%<br>Higher               | 23%<br>Lower                 |
| <b>SE-6T [21]</b>      | 3%<br>Lower   | 13%<br>Lower           | 13%<br>Lower       | 3%<br>Lower        | 21%<br>Lower          | 13%<br>Higher          | 54%<br>Higher               | 19%<br>Higher                |
| <b>RD-8T<br/>[11]</b>  | 17%<br>Higher | 13%<br>Higher          | 40%<br>Lower       | 6%<br>Lower        | 8%<br>Lower           | 27%<br>Lower           | 33%<br>Higher               | 69%<br>Higher                |
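The percentage entries relative to 7T follow from the absolute values at 300mV. The sketch below recomputes two columns (read power and write delay) from the values in Table 3.4; the helper name `percent_change` and the dictionary layout are illustrative assumptions.

```python
from typing import Dict, Tuple


def percent_change(value: float, reference: float) -> Tuple[int, str]:
    """Return (rounded magnitude in %, direction) of value relative to reference."""
    change = (value - reference) / reference * 100.0
    if change == 0:
        return 0, "Same"
    return round(abs(change)), "Higher" if change > 0 else "Lower"


# Values at 300mV copied from Table 3.4.
READ_POWER_UW: Dict[str, float] = {
    "7T": 0.46, "5T": 0.51, "BW-5T": 0.55, "6T": 0.63,
    "DVT-7T": 0.54, "SE-6T": 0.71, "RD-8T": 0.61,
}
WRITE_DELAY_NS: Dict[str, float] = {
    "7T": 8.4, "BW-5T": 13.2, "6T": 8.1,
    "DVT-7T": 12.45, "SE-6T": 9.51, "RD-8T": 6.14,
}

for name, column in (("Read power", READ_POWER_UW), ("Write delay", WRITE_DELAY_NS)):
    ref = column["7T"]
    for cell, value in column.items():
        if cell != "7T":
            pct, direction = percent_change(value, ref)
            print(f"{name:11s} {cell:7s} {pct}% {direction}")
```

The 5T cell is omitted from the write delay column because it fails to write at this supply voltage.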

TABLE 3.6: COMPARISON OF MEAN ( $\mu$ ) AND STANDARD DEVIATION ( $\sigma$ ) OF 7T WITH 5T AND 6T IN UMC 90nm CMOS TECHNOLOGY WITH TEMP=27°C AND VDD RANGE OF 200-500mV.

| Bit-cell    | SNM at worst process corner | VDD=200 mV |               | VDD=300 mV |               | VDD=400 mV |               | VDD=500 mV |               |
|-------------|-----------------------------|------------|---------------|------------|---------------|------------|---------------|------------|---------------|
|             |                             | $\mu$ (mV) | $\sigma$ (mV) | $\mu$ (mV) | $\sigma$ (mV) | $\mu$ (mV) | $\sigma$ (mV) | $\mu$ (mV) | $\sigma$ (mV) |
| 7T          | WTP (SF)                    | 143.3      | 13.65         | 222.3      | 13.42         | 276.8      | 13.28         | 335.2      | 13.21         |
|             | Read Margin (FS)            | 181.1      | 14.81         | 276        | 7.04          | 376.2      | 6.72          | 475.3      | 6.67          |
|             | HSNM (FS)                   | 11.75      | 9.37          | 46.93      | 8.01          | 90.62      | 7.95          | 129.6      | 7.9           |
| 5T          | WTP (SF)                    | Fails      |               |            |               |            |               |            |               |
|             | Read Margin (FS)            | -160.5     | 73.97         | -124.1     | 225           | 108        | 271.9         | 233.7      | 240.8         |
|             | HSNM (FS)                   | 10.85      | 10.02         | 50.47      | 7.41          | 93.8       | 7.35          | 133.4      | 7.25          |
| BW-5T [31]  | WTP (SF)                    | 182.65     | 15.21         | 289.35     | 14.63         | 381.85     | 14.21         | 473.63     | 13.7          |
|             | Read Margin (FS)            | 29.9       | 18.64         | 119.21     | 14.3          | 198.21     | 13.56         | 293.13     | 12.65         |
|             | HSNM (FS)                   | 12.21      | 9.51          | 41.4       | 9             | 88.46      | 8.87          | 125.3      | 8.1           |
| 6T          | WTP (SF)                    | 84.75      | 16.12         | 138.1      | 15.56         | 188.6      | 14.98         | 241.3      | 14.74         |
|             | Read Margin (FS)            | 43.1       | 35.76         | 190.3      | 16.07         | 351.2      | 13.76         | 436.3      | 11.85         |
|             | HSNM (FS)                   | 14.29      | 9.29          | 56.58      | 9.12          | 99.05      | 9.06          | 137        | 8.95          |
| DVT-7T [22] | WTP (SF)                    | Fails      |               | 292.4      | 13.5          | 374.2      | 12.34         | 410.1      | 11.53         |
|             | Read Margin (FS)            | 178.42     | 15.2          | 280.1      | 12.98         | 382.4      | 12.2          | 485.6      | 11            |
|             | HSNM (FS)                   | 10.82      | 12.7          | 42.57      | 11.4          | 85.4       | 10.93         | 121.5      | 10.54         |
| SE-6T [21]  | WTP (SF)                    | 118.12     | 14.6          | 176.3      | 14.41         | 225.6      | 14.32         | 271.4      | 13.63         |
|             | Read Margin (FS)            | 100.32     | 14.70         | 182.7      | 13.68         | 265.3      | 12.1          | 376.5      | 11.4          |
|             | HSNM (FS)                   | 10.22      | 10.60         | 43.69      | 10.2          | 89.3       | 10.01         | 121.8      | 9.55          |
| RD-8T [11]  | WTP (SF)                    | 79.5       | 15.12         | 135.3      | 14.6          | 176.3      | 13.8          | 234.2      | 13.4          |
|             | Read Margin (FS)            | 189.6      | 9.76          | 286.3      | 15.3          | 389.2      | 12.6          | 490.3      | 10.5          |
|             | HSNM (FS)                   | 13.9       | 8.9           | 51.58      | 8.4           | 93.5       | 8.2           | 132.2      | 8.1           |

### 3.5.1 Statistical Analysis

The results of the Monte Carlo simulations are shown in Table 3.6. The percentage change in the different parameters with respect to (w.r.t) 7T is tabulated in Table 3.7. The single-ended 7T, DVT-7T, SE-6T and BW-5T cells have a higher WTP mean than the differential RD-8T, which has the lowest WTP mean (39% lower than 7T).

The dynamic read decoupling used in 7T provides a read margin mean similar to that of RD-8T and DVT-7T at 300mV. BW-5T has the lowest read margin mean (56% lower than 7T). The HSNM mean is similar for 7T, 5T, BW-5T, DVT-7T and SE-6T, while RD-8T and 6T have a higher HSNM mean than 7T, with 6T the highest (20.56% higher than 7T), as shown in Table 3.6 and Table 3.7. The robust design of 7T gives the lowest standard deviation for WTP and read margin, and an HSNM standard deviation lower than that of every other cell except 5T, as given in Table 3.6 and Table 3.7.

TABLE 3.7: PERCENTAGE CHANGE IN VARIOUS PARAMETERS OF DIFFERENT CELLS W.R.T 7T IN UMC 90nm CMOS TECHNOLOGY AND 300mV POWER SUPPLY AT WORST CASE CORNERS

| Bit-cell           | WTP              |                  | Read Margin     |                   | HSNM             |                  |
|--------------------|------------------|------------------|-----------------|-------------------|------------------|------------------|
|                    | $\mu$            | $\sigma$         | $\mu$           | $\sigma$          | $\mu$            | $\sigma$         |
| <b>5T</b>          | Fails            | Fails            | Fails           | Fails             | 7.54%<br>Higher  | 7.41%<br>Lower   |
| <b>BW-5T [31]</b>  | 30.16%<br>Higher | 9.02%<br>Higher  | 56.81%<br>Lower | 103.13%<br>Higher | 11.78%<br>Lower  | 12.36%<br>Higher |
| <b>6T</b>          | 37.88%<br>Lower  | 15.95%<br>Higher | 31.05%<br>Lower | 128.27%<br>Higher | 20.56%<br>Higher | 13.86%<br>Higher |
| <b>DVT-7T [22]</b> | 31.53%<br>Higher | 0.60%<br>Higher  | 1.49%<br>Higher | 84.38%<br>Higher  | 9.29%<br>Lower   | 42.32%<br>Higher |
| <b>SE-6T [21]</b>  | 20.69%<br>Lower  | 7.38%<br>Higher  | 33.80%<br>Lower | 94.32%<br>Higher  | 6.90%<br>Lower   | 27.34%<br>Higher |
| <b>RD-8T [11]</b>  | 39.14%<br>Lower  | 8.79%<br>Higher  | 3.73%<br>Higher | 117.33%<br>Higher | 9.91%<br>Higher  | 4.87%<br>Higher  |

### 3.5.2 Array Design

The proposed 7T with feedback cutting and read decoupled schemes is implemented in a  $64 \times 16$  bit SRAM array in 90nm UMC CMOS technology. To save power, the array is operated in the sub-threshold regime. The 1kb SRAM comprises 4 banks, and each bank consists of 16 words  $\times$  16 bits, as shown in Figure 3.18. A similar architecture is used to design a 1kb array for the 6T SRAM. Both arrays are compared in Table 3.8 at 300mV and 10MHz. The power consumption of the 6T SRAM array is 42% and 52.92% higher than that of the 7T SRAM array during read and write operations, respectively. However, the read and write delays of the 6T array are lower than those of the 7T SRAM array.
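The array organization can be checked with a short calculation. The capacity follows from the stated 4 banks of 16 words by 16 bits; the address split shown here (upper bits select the bank, lower bits select the word within the bank) is only an assumed mapping for illustration, since the decoder arrangement is not specified in the text, and the function names are hypothetical.

```python
from typing import Tuple

BANKS = 4
WORDS_PER_BANK = 16
BITS_PER_WORD = 16


def capacity_bits() -> int:
    """Total number of bit-cells in the array."""
    return BANKS * WORDS_PER_BANK * BITS_PER_WORD


def decode_address(word_address: int) -> Tuple[int, int]:
    """Split a word address (0..63) into (bank, word within bank).

    Assumed mapping: the two upper address bits select the bank and the
    four lower bits select the word-line inside that bank.
    """
    if not 0 <= word_address < BANKS * WORDS_PER_BANK:
        raise ValueError("word address out of range")
    return divmod(word_address, WORDS_PER_BANK)


print(capacity_bits())       # 4 * 16 * 16 bits
print(decode_address(17))    # bank and word index for word address 17
```

With this organization the 64 words of 16 bits give the 1kb capacity quoted for both the 7T and the 6T arrays.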



Figure 3.18: Layout of 1kb array of the proposed 7T SRAM cell.

TABLE 3.8: COMPARISON OF 1kb ARRAY OF THE PROPOSED 7T AND 6T SRAM AT 300mV.

| (UMC 90nm)                        | Read Delay<br>(ns) | Write Delay<br>(ns) | Read Power<br>( $\mu$ W) | Write Power<br>( $\mu$ W) |
|-----------------------------------|--------------------|---------------------|--------------------------|---------------------------|
| <b>7T</b>                         | 29.48              | 46.70               | 9.38                     | 11.81                     |
| <b>6T</b>                         | 21.86              | 32.42               | 13.32                    | 18.06                     |
| <b>%Change in 6T<br/>w.r.t 7T</b> | 25.85%<br>Lower    | 30.58%<br>Lower     | 42%<br>Higher            | 52.92%<br>Higher          |

Further, the comparison of the proposed 7T with 9T [28] and 7T [25] is given in Table 3.9. The available data are taken directly from the respective articles.

TABLE 3.9: COMPARISON OF THE PROPOSED 7T WITH 9T [28] AND 7T [25]

| Cell               | RSNM (mV)           | WSNM (mV)                  | Write Delay (ns)      | VDD-min (mV) | Array      |
|--------------------|---------------------|----------------------------|-----------------------|--------------|------------|
| <b>Proposed 7T</b> | <b>RM=259.6</b>     | <b>WTP=223.4</b>           | <b>8.4</b>            | <b>200</b>   | <b>1kb</b> |
| <b>9T [28]</b>     | 23<br>(at VDD=0.3V) | WTP=160mV<br>(at VDD=0.3V) | 1.02<br>(at VDD=0.3V) | 160          | 32kb       |
| <b>7T [25]</b>     | 200m<br>(at VDD=1V) | -                          | -                     | 440          | 64kb       |

It is worth noting that the read margin of the proposed 7T cell is higher than the RSNM of 9T [28], because removing the second access transistor reduces the chance of leakage. On the other hand, owing to the differential write scheme of 9T [28], its WTP and write delay are lower than those of the proposed 7T, as presented in Table 3.9.

Table 3.10 shows the basic architectural differences between the proposed 7T and 8T SRAM cells. The proposed 7T occupies much less cell area than 8T and uses only one bit-line for read/write operations. The 7T requires just one extra switching signal for quick write and read decoupling.

TABLE 3.10: DESIGN DETAILS OF THE PROPOSED SRAM CELLS

| Proposed Bit-cell  | Normalized Area w.r.t 5T | #BL | #WL | Feedback Cutting | Read De-coupling | Writing      | Reading      | Feedback cutting switch/signal |
|--------------------|--------------------------|-----|-----|------------------|------------------|--------------|--------------|--------------------------------|
| <b>7T</b>          | 1.16x                    | 1   | 2   | Yes              | Yes              | Single Ended | Single Ended | 1                              |
| <b>8T</b>          | 2x                       | 2   | 2   | Yes              | Yes              | Single Ended | Single Ended | 2                              |

Moreover, the parameters of the proposed 8T cell (Chapter 2) and the proposed 7T cell are presented in Table 3.11 and Table 3.12, respectively. As the read and write stability is defined differently for 7T than for 8T, we cannot directly compare the read and write data from Table 3.11 and Table 3.12. We can see, however, that the HSNM of 8T is higher than that of 7T because of its more symmetrical inverter pair.

TABLE 3.11: MEAN ( $\mu$ ) AND STANDARD DEVIATION ( $\sigma$ ) FOR THE PROPOSED SRAM CELL

| Bit-cell    | SNM at worst process corner | VDD=200mV     |                  | VDD=300mV     |                  | VDD=400mV     |                  | VDD=500mV     |                  |
|-------------|-----------------------------|---------------|------------------|---------------|------------------|---------------|------------------|---------------|------------------|
|             |                             | $\mu$<br>(mV) | $\sigma$<br>(mV) | $\mu$<br>(mV) | $\sigma$<br>(mV) | $\mu$<br>(mV) | $\sigma$<br>(mV) | $\mu$<br>(mV) | $\sigma$<br>(mV) |
| Proposed 8T | WSNM (SF)                   | 139.9         | 6.07             | 227.2         | 7.0              | 314.2         | 7.34             | 400.0         | 7.53             |
|             | RSNM (FS)                   | 39.42         | 6.63             | 70.33         | 5.98             | 83.6          | 5.60             | 85.59         | 6.10             |
|             | HSNM (FS)                   | 49.82         | 4.77             | 89.57         | 5.44             | 123.4         | 6.93             | 151.5         | 8.54             |

TABLE 3.12: MEAN ( $\mu$ ) AND STANDARD DEVIATION ( $\sigma$ ) OF 7T IN UMC 90nm CMOS TECHNOLOGY WITH TEMP=27°C AND VDD RANGE OF 200-500mV

| Bit-cell    | SNM at worst process corner | VDD=200 mV    |                  | VDD=300 mV    |                  | VDD=400 mV |                  | VDD=500 mV    |                  |
|-------------|-----------------------------|---------------|------------------|---------------|------------------|------------|------------------|---------------|------------------|
|             |                             | $\mu$<br>(mV) | $\sigma$<br>(mV) | $\mu$<br>(mV) | $\sigma$<br>(mV) | $\mu$ (mV) | $\sigma$<br>(mV) | $\mu$<br>(mV) | $\sigma$<br>(mV) |
| Proposed 7T | WTP (SF)                    | 143.3         | 13.65            | 222.3         | 13.42            | 276.8      | 13.28            | 335.2         | 13.21            |
|             | Read Margin (FS)            | 181.1         | 14.81            | 276           | 7.04             | 376.2      | 6.72             | 475.3         | 6.67             |
|             | HSNM (FS)                   | 11.75         | 9.37             | 46.93         | 8.01             | 90.62      | 7.95             | 129.6         | 7.9              |

Table 3.13 shows that a 1kb array of 8T has a lower read delay than that of 7T, while the write delays of the two arrays are comparable. On the other hand, the smaller number of switching signals used in 7T results in a considerable power saving during the read operation as compared to 8T.

TABLE 3.13: COMPARISON OF 1kb ARRAY OF 7T AND 8T SRAM AT 300mV

| Proposed Cells (UMC 90nm) | Read Delay (ns) | Write Delay (ns) | Read Power ( $\mu$ W) | Write Power ( $\mu$ W) |
|---------------------------|-----------------|------------------|-----------------------|------------------------|
| 7T                        | 29.48           | 46.70            | 9.38                  | 11.81                  |
| 8T                        | 26.37           | 49.88            | 15.8                  | 9.39                   |

### 3.6 Chapter Summary

A single-ended boost-less (SE-BL) 7T SRAM cell utilizing dynamic feedback cutting and dynamic read decoupling is presented in this chapter. The proposed 7T SRAM cell has improved read margin, write-ability and performance, and lower power consumption, compared to the conventional and state-of-art SRAM cells. The proposed SRAM cell features the best robustness among the reported SRAM cells against PVT variations. It has the highest write-ability among boosted and boost-less 5T single-ended cells. It consumes the lowest read power among all reported SRAM cells. Although it has a small area overhead over 5T, it has better built-in process tolerance. The high mean ( $\mu$ ) and low standard deviation ( $\sigma$ ) of 7T result in successful operation at a 300mV power supply. The significant power reduction of the 7T SRAM array over the 6T SRAM array justifies the scope of 7T. With these favorable properties, the proposed 7T cell can be employed for battery operated system on chip (SoC) designs.

### References

- [1] Roy K. and Prasad S. (2003), Low Power CMOS VLSI Circuit Design, 1st ed. New York: Wiley.
- [2] Yoshinobu et al. (2003), Review and future prospects of low-voltage RAM circuits, *IBM J. Res. Devel.*, vol. 47, no. 5/6, pp. 525–552.
- [3] Kim et al. (2009), A voltage scalable 0.26 V, 64 kb 8T SRAM with voltage lowering techniques and deep sleep mode, *IEEE J. of Solid-State Circuits*, vol. 44, no. 6, pp. 1785–1795.
- [4] Gonzalez et al. (1997), Supply and threshold voltage scaling for low power CMOS, *IEEE J. Solid-State Circuits*, vol. 32, no. 8, pp. 1210–1216.
- [5] Khellah et al. (2007), A 256-kb dual-VCC SRAM building block in 65-nm CMOS process with actively clamped sleep transistor, *IEEE J. Solid-State Circuits*, vol. 42, no. 1, pp. 233–242.
- [6] Bhavnagarwala et al. (2001), The impact of intrinsic device fluctuations on CMOS SRAM cell stability, *IEEE J. Solid-State Circuits*, vol. 36, no. 4, pp. 658–665.
- [7] Cheng et al. (2004), The impact of random doping effects on CMOS SRAM cell, *Proc. 30th European Solid-State Circuits Conf. (ESSCIRC)*, Belgium, pp. 219–222.
- [8] Mukhopadhyay et al. (2005), Modeling of failure probability and statistical design of SRAM array for yield enhancement in nanoscaled CMOS, *IEEE Trans. Comput.-Aided Design (CAD) Integr. Circuits Syst.*, vol. 24, no. 12, pp. 1859–1880.
- [9] Wang A. and Chandrakasan A. (2005), A 180-mV subthreshold FFT processor using a minimum energy design methodology, *IEEE J. Solid-State Circuits*, vol. 40, no. 1, pp. 310–319.
- [10] Markovic et al. (2010), Ultralow-power design in near-threshold region, *Proceedings of IEEE*, vol. 98, no. 2, pp. 237–252.
- [11] Verma N. and Chandrakasan A. P. (2008), A 256 kb 65 nm 8T subthreshold SRAM employing sense-amplifier redundancy, *IEEE J. Solid-State Circuits*, vol. 43, no. 1, pp. 141–149.
- [12] Kushwah C. B. and Vishvakarma S.K. (2012), Ultra-Low Power Sub-threshold SRAM Cell Design to Improve Read Static Noise Margin, *Lecture Notes in Computer Science*, 7373, pp. 139-146.
- [13] Liu Z. and Kursun V. (2008), Characterization of a novel nine-transistor SRAM cell, *IEEE Trans. Very Large Scale Integr. (VLSI) Syst.*, vol. 16, pp. 488–492.
- [14] Kulkarni J. P., K. Kim, and K. Roy (2007), A 160 mV robust schmitt trigger based subthreshold SRAM, *IEEE J. Solid-State Circuits*, vol. 42, no. 10, pp. 2303–2313.
- [15] Kim T.-H., Liu J., and Kim C. H. (2007), An 8T subthreshold SRAM cell utilizing reverse short channel effect for write margin and read performance improvement, *Proceedings of IEEE Custom Integr. Circuits Conf. (CICC)*, CA, USA, pp. 241-244.
- [16] Calhoun B. H. and Chandrakasan A. P. (2007), A 256-kb 65-nm sub-threshold SRAM design for ultra-low-voltage operation, *IEEE J. Solid-State Circuits*, vol. 42, no. 3, pp. 680–688.
- [17] Lo C.-H. and Huang S.-Y. (2011), P-P-N based 10T SRAM cell for low-leakage and resilient subthreshold operation, *IEEE J. Solid-State Circuits*, vol. 46, no. 3, pp. 695–704.
- [18] Chang I. J. et al. (2009), A 32 kb 10T sub-threshold SRAM array with bit-interleaving and differential read scheme in 90 nm CMOS, *IEEE J. Solid-State Circuits*, vol. 44, no. 2, pp. 650–658.
- [19] Kushwah C. B., Dwivedi D. and Sathisha N. (2013), 8T Based SRAM Cell and Related Method, U. S. A., IBM docket no. IN920130218US1, Patent Pending.
- [20] Carlson I. et al. (2004), A high density, low leakage, 5T SRAM for embedded caches, *Proceedings of 30th Eur. Solid State Circuits Conf.*, Leuven, Belgium, pp. 215–218.
- [21] Zhai Bo (2008), A Variation-Tolerant Sub-200 mV 6-T Subthreshold SRAM, *Solid-State Circuits, IEEE J. of Solid State Circuits*, vol.43, no.10, pp.2338-2348.
- [22] Tawfik S. and Kursun V. (2008), Low power and robust 7T dual-Vt SRAM circuit, *Proceedings of Int. Symp. Circuits Syst.*, Knoxville, Tennessee, pp. 1452–1455.

- [23] Tu et al. (2010), Single-ended subthreshold SRAM with asymmetrical write/read-assist, *IEEE Trans. Circuits and Systems-I*, vol. 57, no. 12, pp. 3039-3047.
- [24] Tu Ming-Hsien et al. (2012), A Single-Ended Disturb-Free 9T Subthreshold SRAM With Cross-Point Data-Aware Write Word-Line Structure, Negative Bit-Line, and Adaptive Read Operation Timing Tracing, *IEEE J. Solid-State Circuits*, vol. 47, no. 6, pp. 1469-1482.
- [25] Takeda K. et al. (2006), A read-static-noise-margin-free SRAM cell for low-VDD and high-speed applications, *IEEE J. Solid-State Circuits*, vol. 41, no. 1, pp. 113–121.
- [26] Kushwah C. B., Vishvakarma S. K. and Dwivedi D. (2014), Single-ended sub-threshold FinFET 7T SRAM cell without boosted supply, *Proceedings of IEEE International Conference on IC Design & Technology (ICICDT)*, Texas, USA, pp. 1-4.
- [27] Kushwah C. B. and Vishvakarma S. K. (2014), A sub-threshold eight transistor (8T) SRAM cell design for stability improvement, *Proceedings of IEEE International Conference on IC Design & Technology (ICICDT)*, Texas, USA, pp. 1-4.
- [28] Chang M.-F. et al. (2011), A 130 mV SRAM with expanded write and read margins for sub-threshold applications, *IEEE J. Solid-State Circuits*, vol. 46, no. 2, pp. 520-529.
- [29] Seevinck E. et al. (1987), Static noise margin analysis of MOS SRAM cells, *IEEE J. Solid-State Circuits*, vol. SC-22, no. 10, pp. 748–754.
- [30] Jiajing W., Nalam S., Calhoun B. H. (2008), Analyzing static and dynamic write margin for nanometer SRAMs, *Proceedings of ACM/IEEE International Symposium on Low Power Electronics and Design (ISLPED)*, Bavaria, Germany, pp. 129-134.
- [31] Cosemans S., Dehaene W., Catthoor F. (2007), A Low-Power Embedded SRAM for Wireless Applications, *IEEE J. Solid-State Circuits*, vol. 42, no. 7, pp. 1607-1617.
- [32] Yeoh et al. (2013), A 0.4V 7T SRAM with write through virtual ground and ultra-fine grain power gating switches, *Proceedings of IEEE International Symposium on Circuits and Systems (ISCAS)*, Beijing, China, pp. 3030-3033.
- [33] Pasandi G., Fakhraie S. M. (2013), A new sub-threshold 7T SRAM cell design with capability of bit-interleaving in 90 nm CMOS, *Proceedings of 21st Iranian Conference on Electrical Engineering (ICEE)*, Porto, Portugal, pp. 1-6.

# Chapter 4

## A 20nm Robust Single-Ended Boost-Less 7T FinFET Sub-threshold SRAM Design under Process-Voltage-Temperature Variations

In connection with the previous chapters, another challenge is to obtain an optimized noise margin against process-voltage-temperature (PVT) variations during all operations [1]-[7]. There is still considerable scope for improving both read and write stability in the sub-threshold regime for ultra-low power applications [8]-[16].

Therefore, we have designed a novel sub-threshold 7T SRAM cell that uses (i) dynamic feedback cutting to enhance write-ability and (ii) read decoupling to avoid read disturb. Circuit simulations in a 20nm FinFET process technology demonstrate that the 7T cell can be operated at ultra-low voltage (ULV) under PVT variations. The 5T and 7T cells are analyzed at all process corners, for power supply voltage (VDD) variation from 0.2V to 0.8V and temperature variation from -40°C to 125°C.

The 20nm FinFET predictive technology model (PTM) [17] is used for the 5T and 7T cell design and simulations. Under process parameter variations, the cells remain operational in the sub-threshold region at ULV supply.

Analysis is done at all process corners discussed in the previous chapter (TT, FF, SS, FS, and SF). In this chapter, FS indicates a 30mV lower threshold voltage for the fast NMOS and a 30mV higher threshold voltage for the slow PMOS. The $1\sigma$ value of $V_{TH}$ (30mV), calculated from Monte Carlo analysis, is used to differentiate the process corners. The simulation results are presented at the worst-case process corners mentioned in the previous chapter and reported in [18]-[21].
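The corner definitions above can be captured in a small lookup table. The following is a minimal sketch, not the simulation setup used in this work. Only the FS shifts (30mV lower for NMOS, 30mV higher for PMOS) are stated in the text; the FF, SS and SF rows assume the same $1\sigma$ magnitude applied in the usual fast/slow sense. The names `SIGMA_VTH`, `CORNER_SHIFTS` and `shifted_vth`, and the nominal threshold values in the usage line, are hypothetical.

```python
# 1-sigma threshold-voltage spread quoted in the text (volts).
SIGMA_VTH = 0.030

# (NMOS shift, PMOS shift) applied to the threshold-voltage magnitude.
# A negative shift means a faster device. Only the FS row is stated in the
# text; the remaining rows are assumed by analogy.
CORNER_SHIFTS = {
    "TT": (0.0, 0.0),
    "FF": (-SIGMA_VTH, -SIGMA_VTH),
    "SS": (+SIGMA_VTH, +SIGMA_VTH),
    "FS": (-SIGMA_VTH, +SIGMA_VTH),  # fast NMOS, slow PMOS
    "SF": (+SIGMA_VTH, -SIGMA_VTH),  # slow NMOS, fast PMOS
}


def shifted_vth(corner, vth_n, vth_p):
    """Return (NMOS |V_TH|, PMOS |V_TH|) in volts at the given corner."""
    shift_n, shift_p = CORNER_SHIFTS[corner]
    return vth_n + shift_n, vth_p + shift_p


# Hypothetical nominal thresholds of 0.30 V, used only for illustration.
for corner in CORNER_SHIFTS:
    print(corner, shifted_vth(corner, 0.30, 0.30))
```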

### 4.1 Conventional 5T and Proposed 7T Cell Design

The conventional 5T cell is designed using FinFET devices (Figure 4.1(a)) with the sizing discussed in [15]. The write-ability of the 5T cell is ensured by reducing the trip-point of the inverter M1-M2 and increasing the trip-point of the inverter M3-M4. Further, the pass-transistor M5 is sized to support both write and read operations. The 5T cell is upsized to approximately the same area as the 7T cell in order to make it an iso-area cell, as discussed in [20]. The ‘MOSFET ratios’ and ‘Fin’ quantization constraints lead to $M2=M4=M5=4$ fins and $M1=M3=2$ fins for the 5T cell. The sizing strategy for 5T and 7T is shown in Table 4.1. The high performance (HP) 20nm FinFET model is used to reduce the minimum operating voltage and delay [16], [17]. The model parameters for the 20nm technology node are taken from [16], using HP NMOS/PMOS FinFET devices [17]. Both cells (5T and 7T) are designed and simulated using SPICE and the Spectre (Cadence) simulator.

The proposed 7T SRAM cell is schematically represented in Figure 4.1(b). It utilizes the same predictive technology model (PTM) HP FinFET devices [16], [17]. The design is similar to that of the 5T cell, except for two additional transistors, namely $M4$ and $M6$, as shown in Figure 4.1(b). This structural change is intended to enhance robustness against process-voltage-temperature (PVT) variations and to improve the stability of the cell in the sub-threshold/near-threshold region.

The proposed 7T has a cross-coupled inverter pair. The left-hand side inverter is made up of three transistors, $M3-M4-M5$, and the other inverter of two transistors, $M1-M2$. To reduce the area overhead of the two extra transistors, we have chosen the minimum possible size (1 fin) for the MOSFETs.



Figure 4.1: Schematic representation of FinFET SRAM cells (a) conventional 5T (b) proposed 7T

We also kept in mind the benefit of a symmetric inverter pair for better stability [20], and therefore designed the inverter pair to be as symmetric as possible. Hence FinFETs $M3, M4, M6$ and $M7$ have 2 fins and $M1, M2$ and $M5$ have 1 fin (Table 4.1). The write word line (WWL) controls only one NMOS transistor, $M7$, which is used to transfer data from the single bit-line (BL) to $Q$. When the read word-line (RWL) is activated, BL is used to transfer data out of the cell. A column-biased feedback control signal (FCS) is used to control the feedback cutting transistor M4.

The FCS is data dependent and is connected in a column-wise configuration. Input data and column address signals are used to generate FCS. The states of all control signals during all operations are tabulated in Table 4.2.

TABLE 4.1: SIZING STRATEGY FOR REPORTED BIT-CELLS

| Bit-cell Design                              | Access MOS<br>(no. of Fins) | Pull-up<br>MOS (no. of<br>Fins) | Pull-down<br>MOS (no. of<br>Fins) | Feedback<br>Cutting<br>MOS | Total No. of<br>Fins |
|----------------------------------------------|-----------------------------|---------------------------------|-----------------------------------|----------------------------|----------------------|
| <b>5T (1x)</b>                               | M5=2                        | M1=1, M4=2                      | M2=2, M3=1                        | ----                       | 8                    |
| <b>5T(2x upsized)<br/>(approx. Iso-area)</b> | M5=4                        | M1=2, M4=4                      | M2=4, M3=2                        | ----                       | 16                   |
| <b>7T (1.3x)</b>                             | M6=M7=2                     | M1=M5=1                         | M2=1, M3=2                        | M4=2                       | 11                   |
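The totals in Table 4.1 can be cross-checked with a few lines of code. This is an illustrative sketch only: the fin counts are transcribed from the table, while the names `CELLS`, `total_fins` and `fin_ratio` are hypothetical. The fin-count ratio is only a rough proxy for cell area and need not equal the area factors quoted in the table labels.

```python
# Fin counts per transistor, transcribed from Table 4.1.
CELLS = {
    "5T (1x)": {"M1": 1, "M2": 2, "M3": 1, "M4": 2, "M5": 2},
    "5T (2x upsized)": {"M1": 2, "M2": 4, "M3": 2, "M4": 4, "M5": 4},
    "7T": {"M1": 1, "M2": 1, "M3": 2, "M4": 2, "M5": 1, "M6": 2, "M7": 2},
}


def total_fins(cell):
    """Sum the fins of every transistor in the named cell."""
    return sum(CELLS[cell].values())


def fin_ratio(cell, reference="5T (1x)"):
    """Total fin count of a cell relative to the reference cell."""
    return total_fins(cell) / total_fins(reference)


for name in CELLS:
    print(name, total_fins(name), round(fin_ratio(name), 3))
```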

TABLE 4.2: OPERATION TABLE OF THE PROPOSED 7T SRAM CELL

|            | Hold  | Read      | Write '1' | Write '0' |
|------------|-------|-----------|-----------|-----------|
| <b>WWL</b> | '0'   | '0'       | '1'       | '1'       |
| <b>RWL</b> | '0'   | '1'       | '0'       | '1'       |
| <b>FCS</b> | '1'   | '0'       | '0'       | '1'       |
| <b>BL</b>  | '1/0' | Discharge | '1'       | '0'       |
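For use in a testbench or a behavioural model, the operation table can be expressed as a lookup structure. The sketch below simply transcribes Table 4.2; the names `OPERATIONS`, `control_signals` and `feedback_cut` are hypothetical, and the rule that a low FCS turns M4 off follows the description of the write ‘1’ operation in Section 4.3.

```python
# Control-signal levels per operation, transcribed from Table 4.2.
# 1 = logic high, 0 = logic low; BL entries are kept as descriptive strings.
OPERATIONS = {
    "hold":    {"WWL": 0, "RWL": 0, "FCS": 1, "BL": "1/0"},
    "read":    {"WWL": 0, "RWL": 1, "FCS": 0, "BL": "discharge"},
    "write_1": {"WWL": 1, "RWL": 0, "FCS": 0, "BL": "1"},
    "write_0": {"WWL": 1, "RWL": 1, "FCS": 1, "BL": "0"},
}


def control_signals(operation):
    """Return a copy of the signal levels for the requested operation."""
    return dict(OPERATIONS[operation])


def feedback_cut(operation):
    """True when FCS is low, i.e. the feedback cutting transistor M4 is off."""
    return OPERATIONS[operation]["FCS"] == 0


for op in OPERATIONS:
    print(op, control_signals(op), "feedback cut:", feedback_cut(op))
```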

### 4.2 Data Retention in 5T and 7T

In the data retention mode of 5T, WL is low and no strong connection exists between BL and Q. The written data is retained by the inverter pair (M1-M2 and M3-M4). Similarly, during data retention in 7T, the write word line (WWL) and read word-line (RWL) are low while FCS is high.

The cross-coupled inverter pair (M1-M2 and M3-M4-M5) retains the data during the data retention period. The positive feedback between the cross-coupled inverters provides full-scale values at the true (Q) and complementary (QB) storage nodes.

#### 4.2.1 Hold Static Noise Margin (HSNM)

In this work, the stability during data retention is determined using the butterfly curve. Figure 4.2 depicts the butterfly curves at a VDD of 0.2V for both 5T and 7T cells. FS can be selected as the worst-case corner for HSNM analysis [21]. The butterfly curves in Figure 4.2 show that both 5T and 7T retain the data successfully at the FS corner. However, 7T has 8.5% higher HSNM than 5T at the FS corner.



Figure 4.2: Butterfly curves for 5T and 7T at VDD=0.2V for HSNM determination at 27°C

#### 4.2.2 HSNM under PVT Variations

The absolute value of the diagonal's length can be used to determine the edge of the square fitted in the lobe of the butterfly curve, which represents the HSNM [18]. Figure 4.3 shows the variation in the absolute value of the diagonal's length against VDD (0.1-0.8V). As both cells (5T and 7T) are asymmetrical, the peaks are of different heights, and the smallest peak is chosen for the worst-case HSNM.
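The diagonal-based extraction described above can be sketched numerically. The example below is a minimal illustration under stated assumptions, not the simulation flow used in this work: the inverter transfer curve is an idealised tanh shape (`inverter_vtc`, with hypothetical `vm` and `gain` parameters) standing in for simulated FinFET data, and all function names are hypothetical. Along each 45° line $x - y = c$, the separation between the two curves is the diagonal of a fitted square, so the square side equals the horizontal separation $|x_a - x_b|$.

```python
import math


def inverter_vtc(vin, vdd, vm, gain):
    """Idealised tanh-shaped inverter transfer curve (stand-in for simulated data)."""
    return 0.5 * vdd * (1.0 - math.tanh(gain * (vin - vm) / vdd))


def _root_increasing(func, lo, hi, iterations=60):
    """Bisection for an increasing function with func(lo) < 0 < func(hi)."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if func(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def lobe_squares(f1, f2, vdd, steps=400):
    """Side of the largest square in each lobe of the butterfly plot.

    Curve A is y = f1(x); curve B is the mirrored curve x = f2(y).
    On the line x - y = c the curves are separated by a diagonal of
    length sqrt(2) * |xa - xb|, so the fitted square has side |xa - xb|.
    """
    side_pos = side_neg = 0.0
    for i in range(steps + 1):
        c = -vdd + 2.0 * vdd * i / steps
        xa = _root_increasing(lambda x: x - f1(x) - c, -vdd, 2.0 * vdd)
        yb = _root_increasing(lambda y: y - f2(y) + c, -vdd, 2.0 * vdd)
        d = xa - (yb + c)  # signed separation; its sign identifies the lobe
        side_pos = max(side_pos, d)
        side_neg = max(side_neg, -d)
    return side_pos, side_neg


def static_noise_margin(f1, f2, vdd):
    """Worst-case noise margin: the smaller of the two lobe squares."""
    return min(lobe_squares(f1, f2, vdd))


vdd = 0.2
f = lambda v: inverter_vtc(v, vdd, vm=0.5 * vdd, gain=8.0)
side_a, side_b = lobe_squares(f, f, vdd)
print("diagonals:", side_a * math.sqrt(2.0), side_b * math.sqrt(2.0))
print("noise margin:", static_noise_margin(f, f, vdd))
```

For two identical inverters the two lobes come out equal; passing different curves for `f1` and `f2` gives unequal peaks, and the smaller one is reported, mirroring the worst-case choice described in the text.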

At 0.1V VDD, the HSNM values are lower than the thermal voltage (~26mV), and hence it is not advisable to operate 5T or 7T at 0.1V VDD at the FS corner. On the other hand, for 7T the HSNM is sufficiently high at 0.2V VDD, and data can be retained in data retention mode at the FS corner.
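The thermal-voltage threshold used in this comparison is $kT/q$. A short sketch, with hypothetical function names, shows how the threshold moves over the temperature range considered in this chapter and how a margin can be checked against it. The 0.073V value in the last line is the 7T HSNM mean from the first column of Table 4.3 and is used purely as an example input.

```python
BOLTZMANN = 1.380649e-23             # J/K (exact SI value)
ELEMENTARY_CHARGE = 1.602176634e-19  # C (exact SI value)


def thermal_voltage(temp_celsius):
    """Thermal voltage kT/q in volts at the given temperature."""
    return BOLTZMANN * (temp_celsius + 273.15) / ELEMENTARY_CHARGE


def margin_above_thermal(margin_volts, temp_celsius):
    """True when a noise margin exceeds the thermal voltage."""
    return margin_volts > thermal_voltage(temp_celsius)


for temp in (-40.0, 27.0, 125.0):
    print(temp, "C:", round(thermal_voltage(temp) * 1e3, 2), "mV")

print(margin_above_thermal(0.073, 27.0))
```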

The HSNM of 7T (Figure 4.3(a)) and 5T (Figure 4.3(b)) increases with an increase in VDD. Consequently, the data failure rate is reduced (high and similar height peaks). The effect of temperature variation on HSNM is depicted in Figure 4.4 (a) and Figure 4.4(b) for 7T and 5T, respectively. It can be noticed that HSNM decreases as the temperature rises from -40°C to 125°C. The overlapping peaks indicate close values of HSNM for temperature variation.



Figure 4.3: Absolute value curves of diagonal's length for different voltages (0.1-0.8V) at FS corner and 27°C temperature (a) 7T (b) 5T

The HSNM values obtained from Figure 4.3 and Figure 4.4 are plotted in Figure 4.5. The effect of temperature variation is smallest at lower values of VDD (~0.1V and 0.2V), as shown in Figure 4.5. The HSNM falls by ~1.5% and ~2.5% for temperature variation from -40°C to -15°C and from 50°C to 75°C, respectively. The HSNM falls by ~50% over the entire temperature range from -40°C to 125°C. A VDD rise of 0.1V improves the HSNM by a factor of ~2x up to VDD=0.5V. However, the improvement rate decreases thereafter.



Figure 4.4: Absolute value curves of diagonal's length for different temperature values at 0.2V VDD (a) 7T (b) 5T at FS corner



Figure 4.5: HSNM of 7T and 5T for different temperature values with variation in VDD at FS corner

#### 4.2.3 Summary of HSNM under PVT Variations

Figure 4.6 presents summarized effects of PVT variations on HSNM of 5T and 7T cells. Figure 4.6(a) and Figure 4.6(b) focus on the influence of VDD variation on HSNM. As 5T is more asymmetrical than 7T, it can be inferred that 7T has slightly higher HSNM than 5T for VDD variation from 0.2V to 0.8V. The effect of temperature on HSNM of 5T and 7T is presented in Figures 4.6(c) and 4.6(d). It is clear from Figure 4.6(d) that the percentage change of HSNM of 7T over 5T increases from 4.20% at  $-40^{\circ}\text{C}$  to 23.58% at  $125^{\circ}\text{C}$ .



Figure 4.6: Comparison of 7T and 5T at FS corner (a) HSNM versus VDD at  $27^{\circ}\text{C}$  temperature (b) HSNM versus temperature at 0.2V VDD

### 4.3 Write Operation of 5T and 7T

In 5T, when WL is high, the data on BL is transferred to Q through the NMOS transistor M5 (Figure 4.1(a)). Without a boosted supply, M5 cannot charge Q to a full high voltage and consequently the write fails (Figure 4.7(b)). The proposed 7T cell is designed to reduce the pull-down strength using M4. This helps to achieve better write-ability without a boosted supply or any external read/write assist.

During write ‘1’, FCS is pulled to ground (GND) which turns M4 off and creates a high impedance path from Q to GND (Figure 4.7(a)). M4 being off prevents a fight between M7 (which tries to make Q high) and M3 (which tries to make Q low). This makes writing easier and faster without the need of a boosted supply for pass gate (M7). Now WWL is activated and the data on BL creates a voltage hike on Q via M7 and writes ‘1’ into the cell. Moreover, high on Q enables inverter (M1-M2) to change the state of QB from ‘1’ to ‘0’.

The access transistor M7 provides a path between BL and Q. If M7 is strong it will pass high current, thereby improving write-ability of the cell. The high current through M7 can be obtained by increasing the number of fins to 2 which eventually speeds up the write operation and allows it to function at ULV supply as shown in Figure 4.7(b).



Figure 4.7: (a) Schematic representation of Write ‘1’ operation of 7T (b) Write ‘1’ waveforms for 5T and 7T

During write ‘0’, FCS is at VDD, BL is pulled to GND, the read word line (RWL) is high and M6 helps M7 to pull the charge from Q to GND (Figure 4.8(a)). The waveforms in Figure 4.8(b) clearly show that 7T writes successfully whereas 5T fails at 0.2V VDD. As RWL and WWL are in a row-wise configuration, the effect of switching these signals on the cells of the same row is discussed in Section 4.5.



Figure 4.8: (a) Schematic representation of Write ‘0’ operation of 7T (b) Write ‘0’ waveforms for 5T and 7T

#### 4.3.1 Write Operation of 5T under PVT Variations

Figure 4.9 shows different analyses (delay, write static noise margin (WSNM) [19] and power) of 5T for write ‘0’ and write ‘1’ under PVT variations. The direction of the arrow indicates variation of VDD (tail as low and head as high value) and process corners (Figure 4.9 and Figure 4.10).



Figure 4.9: Write vs. temperature with variation in all process corners and VDD for 5T (a) Write ‘1’ delay (b) Write ‘0’ delay (c) WSNM ‘1’ (d) WSNM ‘0’ (e) Write ‘1’ power (f) Write ‘0’ power

Figure 4.9(a) shows few waveforms because 5T fails to write ‘1’ (fails to flip the state of Q and QB) at low VDD and low temperature. On the other hand, as we move from the fast corner (FF) to the slow corner (SS), the write ‘1’ delay increases. At the skewed corners (FS, SF), it is more difficult to write ‘1’ into the 5T cell (Figure 4.9(a)).

During write ‘0’, at higher VDD the NMOS (M5) is able to pull the storage node to GND. This reduces the chance of write failure and the delay as compared to the write ‘1’ operation (Figure 4.9(b)). A rise in temperature reduces the write ‘1’/‘0’ failure chances and delay, as depicted in Figure 4.9(a) and Figure 4.9(b). Similarly, the WSNM increases with an increase in VDD from low to high and from SS to FF corners, as shown in Figure 4.9(c) and Figure 4.9(d). A change in temperature and process corner gives an unpredictable response for the write ‘1’/‘0’ cases (Figure 4.9(c) and Figure 4.9(d)).

The write power consumption increases with an increase in VDD, temperature and process corner from SS to FF (Figure 4.9(e) and Figure 4.9(f)).

#### 4.3.2 Write Operation of 7T under PVT Variations

Similar to the 5T write operation, PVT variations are applied to 7T, and the different analyses (delay, write static noise margin and power) are shown in Figure 4.10 for write ‘1’ and write ‘0’. Figure 4.10(a) shows all possible waveforms, since the write ‘1’ operation of 7T is successful in every case. It can be seen from Figure 4.10(a) and Figure 4.10(b) that the write delay reduces with variation in (i) VDD from low to high, (ii) process corners from SS to FF, and (iii) temperature from -40°C to 125°C. During write ‘0’, two NMOS transistors (M6 and M7) pull the storage node to GND quickly, and therefore the delay is lower than that of the write ‘1’ operation (Figure 4.10(b)). The WSNM increases with an increase in VDD and a change in process corners from SS to FF, as shown in Figure 4.10(c) and Figure 4.10(d). Due to feedback cutting, WSNM ‘1’ remains almost constant with increasing temperature over the entire range, for all VDD values and process corners from SS to FF (Figure 4.10(c)). However, WSNM ‘0’ increases with an increase in temperature, VDD and process corners from SS to FF, as evident from Figure 4.10(d).

Power consumption increases with VDD, temperature and changes in corner from SS to FF (Figure 4.10(e) and Figure 4.10(f)). The write ‘0’ power is lower than the write ‘1’ power consumption for both cells (5T and 7T).

#### 4.3.3 Summary of Write Operation

While comparing the write performance of the SRAM cells, the impact of effective bit-line loading, which directly influences performance, is considered in the analysis. The write '0' delay of 7T is lower than that of 5T for all VDD values (Figure 4.11(a)). At 0.2V VDD, 7T has a 0.46x lower write ‘0’ delay than 5T at the SS corner. The ratio of 7T/5T write delay decreases with an increase in VDD, as shown in Figure 4.11(b). Although 7T has a higher delay in write ‘1’ than in write ‘0’, 5T fails to write ‘1’ below 0.8V VDD (Figure 4.11(a)).

Figure 4.10: Write vs. temperature variation in all process corners and VDD for 7T (a) Write '1' delay (b) Write '0' delay (c) WSNM '1' (d) WSNM '0' (e) Write '1' power (f) Write '0' power



Figure 4.11: Comparison of 7T and 5T vs. VDD (a) Write delay at SF corner (b) Write delay ratio (c) WSNM at SF corner (d) WSNM ratio (e) Write power at FF corner (f) Write power ratio

At 0.8V VDD, the write delay of 7T is 0.32x lower than that of 5T (5T is able to write ‘1’ only at 0.8V VDD), as shown in Figure 4.11(b). During the write ‘1’ operation (direct writing through a single NMOS), 7T has a high WSNM at the SF corner, whereas the conventional 5T fails to write (Figure 4.11(c)). As no WSNM values are available for 5T, the WSNM of 7T is normalized by VDD and presented in Figure 4.11(d). The WSNM of 7T is about 0.45-0.51x of VDD (~50% of VDD) over the VDD range of 0.2-0.8V, as depicted in Figure 4.11(d). Moreover, in the write ‘0’ operation, 7T shows a higher WSNM than 5T for all VDD values (0.5-0.8V) (Figure 4.11(c)). This improvement in WSNM increases (from 5.86x to 23.87x) as VDD is reduced (from 0.8V to 0.5V), as presented in Figure 4.11(d).

In the write ‘0’ operation, the power consumption of 5T is the highest, as depicted in Figure 4.11(e). The proposed 7T consumes only 0.11x the power of 5T at 0.8V VDD at the FF corner. The power consumption of 7T relative to 5T decreases (from 0.23x to 0.11x) with an increase in VDD (from 0.2V to 0.8V), as shown in Figure 4.11(e) and Figure 4.11(f).

Although the 5T fails to write ‘1’, we have compared the power consumption with 7T. For write ‘1’ operation 7T consumes 3.39x more power compared to the power needed for 5T operation at 0.8V because of switching of FCS and its control circuit (Figure 4.11(e) and Figure 4.11(f)). At low VDD (0.2-0.3V) the power consumption of 7T is similar (0.96x-1.12x) to 5T.

### 4.4 Read Operation of 5T and 7T

The read operation of 5T is initiated by pre-charging BL to VDD, which then discharges through M3 and M5. The strong 4-fin pass gate M5 provides an easy path for the discharging current flowing from BL (Figure 4.1(a)). The resulting quick change in storage node voltages can flip the cell if any variation (PVT and/or increased bit-line capacitance) occurs.

For the proposed 7T, the read operation is performed by pre-charging BL and activating RWL. If ‘1’ is stored at node QB, M3 turns on and provides a low-resistance path for the cell current to flow from RBL to GND (Figure 4.12(a)). This quickly discharges RBL to a voltage level that can be sensed by the full-swing inverter sense amplifier (Figure 4.12(b)). Since WWL and FCS are held low during the read operation, no direct disturbance occurs on the true storage node QB. This ultimately reduces the failure probability under inter/intra-die variations. During the read operation, 7T holds the data dynamically, and hence the read margin is a more suitable metric than the read static noise margin (RSNM) for analyzing the read operation of 7T.

As FCS is in a column-wise configuration, the effect of switching it on the cells of the same column is discussed in Section 4.5.



Figure 4.12: (a) Schematic representation of read operation of 7T (b) Read waveforms for 5T and 7T

#### 4.4.1 Read Operation of 5T and 7T under PVT Variations

To validate the robustness of the SRAM cells during the read operation, PVT variations are applied and the results are presented in Figure 4.13. The direction of the arrow shows the variation in bit-line capacitance (BLcap), temperature (tail as low and head as high value) and process corners (Figure 4.13). As shown in Figure 4.13(a) and Figure 4.13(b), the read delay increases with (i) an increase in BLcap, (ii) a decrease in VDD and temperature, and (iii) a change in process corner from fast (FF) to slow (SS). The strong read path of 5T (M5=4 fins and M3=2 fins) reduces its read delay compared to that of 7T (M6=M3=2 fins), as shown in Figure 4.13(a) and Figure 4.13(b).

The read margin decreases as BLcap increases, VDD decreases, process corner changes from fast to slow (FF to SS) and temperature reduces (Figure 4.13(c) and Figure 4.13(d)). Read decoupling increases the read margin of 7T compared to 5T read margin against all PVT variations.

It can be seen from Figure 4.13(e) and Figure 4.13(f) that the read power for both cells increases with the increase in BLcap, VDD, process corner change from slow (SS) to fast (FF) but with the reduction in temperature. The power consumption of 7T is lower than 5T for all PVT variations because of the presence of stacked inverter (M3-M4-M5) and lower read current.



Figure 4.13: Read vs. VDD with variation in all process corners and temperature (a) Read delay of 7T (b) Read delay of 5T (c) Read margin of 7T (d) Read margin of 5T (e) Read power of 7T (f) Read power of 5T

#### 4.4.2 Summary of Read Operation

The delay, read margin (RM) and read power against bit-line capacitance (BLcap) at 0.2V VDD and worst-case corners are displayed in Figure 4.14. It can be seen from Figure 4.14(a) that 7T has a higher read delay than 5T for all BLcap values. The strong access ( $M5=4$  fins) and pull-down ( $M3=2$  fins) transistors of 5T provide a large read current, and therefore there is a constant difference in delay (7T has 1.19x higher read delay compared to 5T) at the SS corner, as shown in Figure 4.14(a) and Figure 4.14(b).

The 7T cell saves power during the read operation for all BLcap values at the FF corner, as depicted in Figure 4.14(c). The read operation of 7T consumes 0.34x less power than 5T for all values of BLcap at the FF corner (Figure 4.14(d)).

As the strong access transistor in 5T ( $M5=4$  fins) allows a quick voltage rise at Q, it can flip the cell and therefore reduces the read margin. The read-decoupled topology of 7T makes the read margin independent of BLcap, as shown in Figure 4.14(e).

As BLcap increases, the read margin of 5T reduces, as depicted in Figure 4.14(e). The difference between the read margins of 5T and 7T increases (1-1.12x) with the increase in BLcap (2-5fF) at the FS corner (Figure 4.14(f)).

### 4.5 Half-Select Condition

The FCS is common to all cells connected in a column. Whenever a cell is selected for a write/read operation, the voltage on Q of the write half-selected cells (connected in the same row) will rise due to charge transfer from BL. As QB does not have a strong connection to BL (the design is single-ended and RWL is off) and FCS is at VDD, there is little chance of the cell flipping.

During a write/read operation, the node storing ‘0’ in the column half-selected cells (connected in the same column) floats during the FCS pull-down period (Figure 4.15(a)). When node Q of a half-selected cell is ‘0’, Q can be charged by the leakage current flowing through M5. The capacitance of node Q, the leakage current of M5 and the  $V_{TH}$  of transistor M2 determine the retention time of node Q (Figure 4.15(b)).
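A first-order feel for this retention time can be obtained from a constant-current charging estimate, $t \approx C \, \Delta V / I_{leak}$. This is an assumption made for illustration rather than the analysis used in this work: it treats the leakage through M5 as constant and takes the allowed voltage rise $\Delta V$ as the threshold of M2. The function names and all numerical values below are hypothetical.

```python
def retention_time(c_node, delta_v, i_leak):
    """Time (s) for a constant leakage current to raise a floating node by delta_v."""
    if i_leak <= 0.0:
        raise ValueError("leakage current must be positive")
    return c_node * delta_v / i_leak


def holds_data(c_node, delta_v, i_leak, fcs_low_time):
    """True when the floating node survives the FCS pull-down period."""
    return retention_time(c_node, delta_v, i_leak) > fcs_low_time


# Hypothetical values: 0.1 fF node capacitance, 0.15 V allowed rise,
# 1 nA leakage through M5, 5 ns FCS pull-down period.
t_ret = retention_time(1e-16, 0.15, 1e-9)
print("retention time (ns):", t_ret * 1e9)
print("data held:", holds_data(1e-16, 0.15, 1e-9, 5e-9))
```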

The required FCS off time is short, since the 7T cell has fast write/read times (shown in Figure 4.10(a) and Figure 4.13(a)). Hence the column half-selected cells can hold the data successfully (Figure 4.15(b)).



Figure 4.14: Comparison of 5T and 7T vs. bit-line capacitance at worst case corners and 0.2V VDD (a) Read delay at SS corner (b) Read delay ratio (c) Read power at FF corner (d) Read power ratio (e) Read margin at FS corner (f) Read margin ratio



Figure 4.15: Half-selected 7T cell at 125°C temperature (a) Schematic diagram (b) Timing diagram with 1000 MC samples at 0.2V and FF corner

### 4.6 Statistical Results and Comparison

Monte Carlo (MC) simulations considering inter/intra-die variations and  $V_{TH}$  mismatch ( $1\sigma$  value of 30mV) are performed for 1000 samples, and the results are presented for the worst-case process corners. To validate the benefits of the proposed design, circuit simulations were performed under similar conditions for the conventional and proposed circuits. During simulations, a temperature of 27°C and a power supply of 200-500mV were used. The model parameters were taken from the 20nm FinFET technology. The mean/standard deviation ratio ( $\mu/\sigma$ ) represents yield [9]. In Table 4.3, the higher mean/standard deviation ratio ( $\mu/\sigma$ , represented by R) of 7T for read margin (independent of variation in BLcap), HSNM and WSNM, as compared to the conventional 5T, confirms that the 7T SRAM has higher immunity to variations in process parameters.
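The statistical post-processing described above, forming $R = \mu/\sigma$ from the Monte Carlo samples and comparing two cells through the percentage change of $R$, can be sketched as follows. This is an illustrative sketch only. The sample generator is a toy linear model with Gaussian $V_{TH}$ deviations, not the circuit simulation used in this work, and the function names, nominal margins and sensitivities are hypothetical. Its printed numbers are not expected to reproduce Table 4.3 or Table 4.4.

```python
import random
import statistics


def mu_over_sigma(samples):
    """Ratio of sample mean to sample standard deviation (R in the text)."""
    return statistics.mean(samples) / statistics.stdev(samples)


def percent_change(r_new, r_ref):
    """Percentage by which r_new exceeds r_ref."""
    return 100.0 * (r_new / r_ref - 1.0)


def toy_margin_samples(nominal, sensitivity, sigma_vth, n=1000, seed=0):
    """Toy model: margin = nominal + sensitivity * dVth, with dVth ~ N(0, sigma_vth)."""
    rng = random.Random(seed)
    return [nominal + sensitivity * rng.gauss(0.0, sigma_vth) for _ in range(n)]


# Hypothetical cells: same nominal margin, different sensitivity to V_TH shifts.
cell_ref = toy_margin_samples(0.070, 0.40, 0.030, seed=0)
cell_new = toy_margin_samples(0.070, 0.35, 0.030, seed=1)

r_ref = mu_over_sigma(cell_ref)
r_new = mu_over_sigma(cell_new)
print("R(ref):", r_ref, "R(new):", r_new)
print("% change in R:", percent_change(r_new, r_ref))
```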

#### 4.6.1 Hold Static Noise Margin

The HSNM of the discussed cells (5T and 7T) follows a Gaussian distribution over 1000 MC simulations at the worst-case corner (FS). Table 4.3 shows similar HSNM mean ( $\mu$ ) and standard deviation ( $\sigma$ ) values for 5T and 7T at all VDD values.

Due to a more robust inverter pair, the HSNM of 7T is slightly higher than that of 5T. The values of R help to analyze the tolerance against process variations, and therefore the ratio  $R(7T)/R(5T)$  is collected in Table 4.4. In the case of HSNM, R(7T) is 6.3% higher than R(5T) at 0.2V VDD. The ratio R(7T)/R(5T) is greater than 1 for all VDD values. This indicates higher robustness of 7T as compared to 5T in data retention mode (Table 4.4).

#### 4.6.2 Write Static Noise Margin

The WSNM of 7T is about 50% of VDD for all supply voltages (0.2-0.5V) at the SF corner (Table 4.3). Because 5T fails to write, the WSNM cannot be compared, and therefore only the WSNM  $\mu/\sigma$  of 7T is shown in Table 4.4. 7T has a high WSNM  $\mu/\sigma$  at low supply voltage (6.20 at 0.2V VDD), which increases with VDD (11.18 at 0.5V). This confirms that the 7T SRAM has higher immunity to variations in process parameters as compared to 5T.

TABLE 4.3: COMPARISON OF MEAN ( $\mu$ ) AND STANDARD DEVIATION ( $\sigma$ ) OF 7T AND 5T IN 20nm FINFET TECHNOLOGY AT WORST CASE CORNERS WITH TEMP=27°C AND VDD RANGE OF 0.2V-0.5V.

| Cell | Margin at 27°C                  | VDD=0.25V, $\mu$ (V) | VDD=0.25V, $\sigma$ (V) | VDD=0.3V, $\mu$ (V) | VDD=0.3V, $\sigma$ (V) | VDD=0.4V, $\mu$ (V) | VDD=0.4V, $\sigma$ (V) | VDD=0.5V, $\mu$ (V) | VDD=0.5V, $\sigma$ (V) |
|------|---------------------------------|----------------------|-------------------------|---------------------|------------------------|---------------------|------------------------|---------------------|------------------------|
| 7T   | WSNM (SF)                       | 0.123                | 0.0154                  | 0.146               | 0.0142                 | 0.195               | 0.0126                 | 0.248               | 0.011                  |
|      | Read Margin (FS)<br>(BLcap=5fF) | 0.193                | 0.066                   | 0.272               | 0.052                  | 0.368               | 0.043                  | 0.461               | 0.032                  |
|      | HSNM (FS)                       | 0.073                | 0.0115                  | 0.102               | 0.0104                 | 0.146               | 0.0092                 | 0.181               | 0.008                  |
| 5T   | WSNM (SF)                       | Fails                |                         |                     |                        |                     |                        |                     |                        |
|      | Read Margin (FS)                | 0.180                | 0.067                   | 0.252               | 0.0499                 | 0.364               | 0.051                  | 0.456               | 0.006                  |
|      | HSNM (FS)                       | 0.071                | 0.012                   | 0.101               | 0.0113                 | 0.142               | 0.0109                 | 0.177               | 0.009                  |

#### 4.6.3 Read Margin

The value of the read margin is close to VDD, which indicates reliable read operation against process variation. As 7T has a separate path for the read operation, it has an improved read margin compared to 5T for all VDD values (Table 4.3). The read margin ratio  $R(7T)/R(5T)$  shows a 2.12% improvement at 0.2V VDD. For higher VDD values, the improvement is larger (23.43% at 0.3V, 28.55% at 0.4V and 10.47% at 0.5V VDD), which confirms robust read operation (Table 4.4).

TABLE 4.4: COMPARISON OF 7T AND 5T IN 20nm FINFET TECHNOLOGY AT WORST CASE CORNERS.  
( $R = \mu/\sigma$ )

|                       | Margin at 27°C                             | <b>VDD=0.2V</b> | <b>VDD=0.3V</b> | <b>VDD=0.4V</b> | <b>VDD=0.5V</b> |
|-----------------------|--------------------------------------------|-----------------|-----------------|-----------------|-----------------|
| <b>7T wrt.<br/>5T</b> | WSNM ( $\mu/\sigma$ )<br><b>(5T Fails)</b> | 6.20            | 8.38            | 9.69            | 11.18           |
|                       | % Change in Read Margin $R(7T)/R(5T)$      | 2.12%           | 23.43%          | 28.55%          | 10.47%          |
|                       | % Change in HSNM $R(7T)/R(5T)$             | 6.30%           | 1.34%           | 2.07%           | 4.11%           |

### 4.7 Chapter Summary

A robust and reliable single-ended boost-less 7T FinFET SRAM cell with high data stability under PVT variations is presented. Using feedback cutting and read decoupling schemes, we attained high WSNM, HSNM and read margin in the sub/super-threshold regime. The 7T and 5T SRAM cells are studied under process corners varying from TT to FF, VDD from 0.1V to 0.8V, temperature from -40°C to 125°C and bit-line capacitance from 2fF to 5fF. The proposed 7T cell achieves a WSNM  $\mu/\sigma$  of 6.20 and 11.18 at 0.2V and 0.5V VDD, respectively, where 5T fails to write at the SF corner. The  $\mu/\sigma$  of the read margin is 28.55% higher for 7T than for 5T at 0.4V and the FS corner, and the  $\mu/\sigma$  of the HSNM is 6.30% higher for 7T than for 5T at 0.2V and the FS corner. The effect of temperature variation indicates that 7T is less affected than 5T. Simulation results using 20nm FinFET technology indicate that the proposed 7T cell can be operated at 0.2V. This ultra-low power FinFET 7T SRAM cell with improved stability and performance can be embedded in battery-operated systems. Future applications of the proposed 7T cell can potentially be in low-voltage, ultra-low-voltage and medium-frequency operation, such as neural signal processors, sub-threshold processors, wide-operating-range IA-32 processors, FFT cores and low-voltage cache operation.

## References

- [1] Wang A. and Chandrakasan A. (2005), A 180-mV subthreshold FFT processor using a minimum energy design methodology, *IEEE J. Solid-State Circuits*, vol. 40, no. 1, pp. 310–319.
- [2] Yoshinobu et al. (2003), Review and future prospects of low-voltage RAM circuits, *IBM J. Res. Devel.*, vol. 47, no. 5.6, pp. 525–552.
- [3] Kushwah C. B. and Vishvakarma S.K. (2012), Ultra-Low Power Sub-threshold SRAM Cell Design to Improve Read Static Noise Margin, *Lecture Notes in Computer Science*, Springer, vol. 7373, pp. 139-146.
- [4] Kang et al. (2010), FinFET SRAM Optimization With Fin Thickness and Surface Orientation, *IEEE Trans. Electron Devices*, vol. 57, no. 11, pp. 2785-2793.
- [5] Fan et al. (2010), Investigation of cell stability and write ability of FinFET subthreshold SRAM using analytical SNM model, *IEEE Trans. Electron Devices*, vol. 57, no. 6, pp. 1375–1381.
- [6] Kim J. and Roy K. (2004), Double gate-MOSFET subthreshold circuit for ultralow power applications, *IEEE Trans. Electron Devices*, vol. 51, no. 9, pp. 1468–1474.
- [7] Liu et al. (2004), A highly threshold voltage-controllable 4T FinFET with an 8.5-nm-thick Si-Fin channel, *IEEE Electron Device Lett.*, vol. 25, no. 7, pp. 510–512.
- [8] Baravelli et al. (2008), Impact of LER and random dopant fluctuations on FinFET matching performance, *IEEE Trans. Nanotechnology.*, vol. 7, no. 3, pp. 291–298.
- [9] Chuang et al. (2007), High-Performance SRAM in Nanoscale CMOS: Design Challenges and Techniques, *Proceedings of IEEE International Conference on Memory Technology, Design and Testing*, Taipei, Taiwan, pp. 4-12.
- [10] Kushwah C. B., Dwivedi D. and Sathisha N. (2013), 8T Based SRAM Cell and Related Method, U. S. A., IBM docket no. IN920130218US1, Patent Pending.
- [11] Takeda K. et al. (2006), A read-static-noise-margin-free SRAM cell for low-VDD and high-speed applications, *IEEE J. Solid-State Circuits*, vol. 41, no. 1, pp. 113–121.
- [12] Kushwah C.B. and Vishvakarma S. K. (2014), A sub-threshold eight transistor (8T) SRAM cell design for stability improvement, *Proceedings of IEEE International Conference on IC Design & Technology (ICICDT)*, Texas, USA, pp.1-4.
- [13] Chang M.-F. et al. (2011), A 130 mV SRAM with expanded write and read margins for sub-threshold applications, *IEEE J. Solid-State Circuits*, vol. 46, no. 2, pp. 520-529.

- [14] Kushwah C. B., Vishvakarma S. K. and Dwivedi D. (2014), Single-ended sub-threshold FinFET 7T SRAM cell without boosted supply, Proceedings of IEEE International Conference on IC Design & Technology (ICICDT), Texas, USA, pp.1-4.
- [15] Carlson I. et al. (2004), A high density, low leakage, 5T SRAM for embedded caches, Proceedings of 30th Eur. Solid State Circuits Conf., Leuven, Belgium, pp. 215–218.
- [16] Sinha et al. (2012), Exploring Sub-20nm FinFET Design with Predictive Technology Models, Proceedings of Design and Automation Conference (DAC), CA, USA, pp. 283-288.
- [17] PTM-MG Models (2013), Models for multi-gate FinFET transistors, for both HP and LSTP applications, <http://ptm.asu.edu/>, accessed August 2013.
- [18] Seevinck E. et al. (1987), Static noise margin analysis of MOS SRAM cells, IEEE J. Solid State Circuits, vol. SC-22, no. 10, pp. 748–754.
- [19] Jiajing W., Nalam S., Calhoun B.H. (2008), Analyzing static and dynamic write margin for nanometer SRAMs, Proceedings of ACM/IEEE International Symposium on Low Power Electronics and Design (ISLPED), Bavaria, Germany, pp.129-134.
- [20] Kulkarni J. P., K. Kim, and K. Roy (2007), A 160 mV robust schmitt trigger based subthreshold SRAM, IEEE J. Solid-State Circuits, vol. 42, no. 10, pp. 2303–2313.
- [21] Lo C.-H. and Huang S.-Y. (2011), P-P-N based 10T SRAM cell for low-leakage and resilient subthreshold operation, IEEE J. Solid-State Circuits, vol. 46, no. 3, pp. 695–704.

# Chapter 5

## Robust 8T and 10T Based SRAM Design and Fast Sensing Scheme

As technology moves into the nanoscale regime, the conventional 6T cell faces issues related to stable read and write operation. The 6T SRAM suffers from read-disturb and write failures at low supply voltage, especially in deep sub-threshold operation (Figure 5.1).

The feedback-breaking technique is used to optimize the bit-cell and solve the problems of the conventional 6T cell. A novel 8T topology based on the feedback cutting scheme is discussed in this chapter; therefore, we first present a thorough review of such cells.



Figure 5.1: Conventional 6T cell

## 5.1 Review of Broken Feedback Bit-cells

The 7T cell [1] shown in Figure 5.2 is a bit-cell that makes good use of the feedback cutting technique. A transistor N5 is inserted into the 6T cell structure for loop cutting. It enables differential write operation and single ended read operation. The data retention process is exactly the same as that of the 6T cell. During the read operation, word line bar (WLB) is deactivated and the logical threshold voltage of the CMOS inverter driving node becomes very high; therefore, the read static noise margin becomes high. A similar approach is used in [2], which proposes a 9T SRAM cell with a data-aware-feedback-cutoff scheme to enlarge the write margin and a dynamic-read-decoupled (DRD) scheme to prevent read-disturb, achieving deep sub-threshold operation. A negative-pumped word-line scheme is employed to suppress bit-line leakage current.



Figure 5.2: Read SNM free (RSNF) 7T cell [1]



Figure 5.3: A 7T cell [3]

The author in [3] uses feedback cutting to propose a novel write mechanism that depends on only one of the two bit-lines to perform a write operation. This 7T SRAM cell cuts off the feedback connection between the two inverters before a write operation, as shown in Figure 5.3. The feedback connection and disconnection are performed through an extra NMOS transistor, and the cell depends only on W to perform a write operation. Therefore, this 7T SRAM cell reduces the activity factor of discharging the bit-line pair to perform a write operation.

Another paper [4] proposes an approach similar to that of [3]. A 7T SRAM cell based on carbon nanotube FETs (CNTFETs) depends on only one of the bit-lines for the write operation and reduces the write-power consumption.

In [5], a read-disturb-free 8-transistor (8T) bit-cell is proposed that uses separate read/write access transistors and differential sensing, as shown in Figure 5.4. A distributed read-access transistor shared across the bit-cells of every row enables read-disturb-free differential sensing with eight transistors per bit-cell. Separate word/bit lines are used for read and write operations.

In [6], a 12-transistor SRAM bit-cell with buffered read and boosted word-line write schemes is presented. Separate word-lines are used for read and write. A self-adaptive leakage cut-off scheme is utilized to minimize the leakage energy dissipation.



Figure 5.4: A read disturb free 8T cell [22]

This leakage cut-off is done by using two PMOS transistors in the pull-down path, which are gated by the signals of the true and complementary storage nodes, respectively. A 10T SRAM bit-cell focusing on low leakage current is proposed in [7]. This symmetrical cell comprises two cross-coupled CMOS inverters and two pass-gate transistors in series, which guarantees a highly resistive path between the bit lines and the internal data nodes. A small degradation in write time is observed, due to the resistive path formed by the two series pass-gate transistors. The two series pass-gate transistors are used to ensure proper write/read operation.

A virtual read word-line is introduced to read the bit-cell through two decoupled transistors. In order to have proper functionality during the write operation (to avoid parasitic current between the read word-line and bit line), the bit-cell requires four bit lines. A hard coding technique is proposed to solve the half-select issue on the bit lines: the read word-line signal is multiplexed according to the number of words per line. This allows selecting only one word during the read operation and hence eliminates the parasitic dynamic energy losses from the unselected bit lines.

A 10T SRAM bit-cell employing a fully-gated grounding scheme (10T-RGND) to limit memory bit-cell sub-threshold leakage current (IOFF) is presented in [8]. The source voltage of the read-port of 10T-RGND is selectively grounded by a row decoder only when it is accessed, while those of inactive bit-cells are forced to a supply voltage. The small voltage swing of 10T-RGND can compensate for redundant power consumption for driving the RGND line on every clock cycle.

In [9], a nine-transistor (9T) bit-cell with common read/write bit-lines and separate read word-line is presented. In this design there is no effect on write speed by increasing the bit-cell ratio as there are different bit-cell ratio definitions under read and write operations.

The design in [9] comprises a standard 6T SRAM bit-cell, three additional NMOS transistors and an extra read word-line that controls the bottom read-path transistor during the read operation. It also employs a differential read operation for better read access time.

The bit-line leakage current in this 9T SRAM is reduced significantly due to the stacked combination of four transistors.

## 5.2 The Proposed 8T SRAM Cell Design

The proposed 8T bit-cell uses a pair of cross-coupled inverters forming a latch and access transistors. The access transistors provide access to the cell during the write operation and isolate the cell during the data retention state. The proposed 8T SRAM cell is designed to provide high write-ability and non-destructive read access with improved data retention. Figure 5.5 shows the basic block diagram of the proposed 8T cell. Due to confidentiality (patent pending), the transistor-level design is not shown.



Figure 5.5: The proposed 8T cell

### 5.2.1 Write Operation

The write operation shown in Figure 5.6 is similar to that of the conventional 6T cell, where one of the bit lines, BL or BLB, is driven from its pre-charged value to ground potential by a write driver through an access transistor.

If the cell is properly sized, the cell will flip and the data will be written. We have optimized the 8T cell's inverter pair and access transistors to achieve better write/read ability. Only one pass gate is used between the storage node and the bit-line; therefore, there is no write performance penalty.

Also, due to feedback cutoff, a sudden rise in the storage node voltage supports the fast write operation. No boosted supply is needed because the single access transistor and dynamic isolation provided by the NMOS pass gate allows for easy write operation.



Figure 5.6: Write operation of the proposed 8T SRAM cell

### 5.2.2 Read Operation

Sensing starts with biasing the sense amplifier (SA) in the high-gain metastable region by pre-charging and equalizing its inputs. Due to the presence of the column multiplexer (MUX)/isolation transistors, two precharge/equalize circuits are needed to ensure reliable sensing: global precharge and equalization for the column and a local precharge and equalization for the inputs of the SA as shown in Figure 5.7.

When a cell accessed by the word line WL has discharged the bit lines BL and BLB to a sufficient voltage differential, the SA is enabled by a high-to-low transition of the sense amplifier enable (SAE) pulse. Shortly after that, the column MUX/isolation is turned off by the YMUX control signal, isolating the highly capacitive bit lines from the SA latch and preventing the complete discharge of the bit-lines. Then, the positive feedback of the cross-coupled inverters quickly drives the low-capacitance outputs to the full-swing voltage.



Figure 5.7: Read operation of the proposed SRAM cell

### 5.2.3 Results and Brief Discussion

The proposed 8T bit-cell differs from the conventional 6T: although the cell is read in a similar way to the conventional 6T bit-cell, the proposed cell uses an extra NMOS pass gate in the pull-down path while reading. The read disturb problem is fixed by controlling the feedback path and optimizing the cell ratios during write conditions. The proposed design also employs a differential read operation to lower the read access time. No specific scheme is proposed for the half-select issue, as given in Table 5.1.

TABLE 5.1: BASIC IDEA OF THE PROPOSED 8T CELL AND ITS OUTCOME

| Challenge                 | Proposed Solution                             | Advantage                                              | Limitation                             |
|---------------------------|-----------------------------------------------|--------------------------------------------------------|----------------------------------------|
| <b>Read Disturb</b>       | Data Isolation<br>Enhanced Read (DIER)        | More immune to read disturb                            | Area and Read Delay increases          |
| <b>Write time</b>         | Reverse Bit Lines Controlled Feedback (RBLCF) | Low writing delay                                      | NMOS pass gate needed in pulldown path |
| <b>Write Margin</b>       | Reverse Bit Lines Controlled Feedback (RBLCF) | High WSNM                                              |                                        |
| <b>Process Variations</b> | DIER, RBLCF                                   | More robust against process variations                 | Area increases                         |
| <b>Area</b>               | Minimum possible size<br>MOS used             | Density increases                                      | 8 transistors per cell                 |
| <b>Half Select</b>        | Built-in assist                               | DIER and RBLCF provide the assist; no external assist is needed | Area increases                         |

Using two stacked transistors, the 8T bit-cell tries to control the leakage during all three operations (read, write and retention) with a fixed zero ground level (Table 5.2). A common word/bit line is used for read and write. There is no extra logic circuit, and no driving power is required. In the 8T bit-cell, the number of bit/word lines is the same as in the conventional 6T cell. The proposed 8T cell increases the write static noise margin and also improves the writing speed. Bit-cell ratios are jointly optimized for read and write operations. Higher bit-line capacitance and two series transistors in the pull-down path would increase the read access time.

TABLE 5.2: RESULTS OF THE PROPOSED 8T CELL WITH REFERENCE TO CONVENTIONAL 6T CELL

|                               | <b>Improvement</b> | <b>Degradation</b>                             |
|-------------------------------|--------------------|------------------------------------------------|
| <b>HSNM (mV)</b>              | 5%                 | NIL                                            |
| <b>WTP (mV)</b>               | 66%                | NIL                                            |
| <b>RSNM (mV)</b>              | NIL                | 50% (nil when pull down path is upsized to 2x) |
| <b>Write Performance (ps)</b> | 29%                | NIL                                            |
| <b>Read Performance (ps)</b>  | NIL                | 8%                                             |
| <b>Standby Power (nW)</b>     | 20% (saving)       | NIL                                            |

Careful selection of bit-cell ratios and differential read allows the read and write performance to be the same as that of the conventional 6T. The area of the proposed cell is approximately 5x (logic layout rules) that of the 6T (bit-cell layout rules), but it can be reduced to about 2x (bit-cell layout rules) by layout optimization. An optimized layout will also reduce the bit-line capacitance.

The proposed 8T cell can be optimized to read, write and hold data while sustaining 7-sigma variations in process parameters. In contrast, the 6T cell cannot maintain the balance among all three operations (read, write and hold) under 7-sigma variations when the same cell optimization conditions are used as for the proposed 8T cell.

## 5.3 The Proposed Single Ended 10T SRAM Cell

A bit-cell is the main SRAM component that stores one bit of information. The proposed 10T bit-cell uses a pair of cross-coupled inverters forming a latch, access transistors and a read buffer. The access transistors provide access to the cell during the write operation and isolate the cell during the read/hold state. The read buffer isolates the storage node from any disturbance during a read operation. The proposed 10T SRAM cell is designed to provide non-destructive read access, write-ability and data retention. The block diagram in Figure 5.8 shows the building blocks of the proposed 10T cell. Due to confidentiality (patent pending), the transistor-level design is not shown.



Figure 5.8: Block diagram of the proposed 10T cell

### 5.3.1 Write Operation

The write operation shown in Figure 5.9 is similar to that of the conventional 6T cell, where one of the bit lines, WBL or WBLB, is driven from its pre-charged value to ground potential by a write driver through the access transistor. If the cell is properly sized, the cell will flip and the data will be written. We have optimized the 10T cell's inverter pair and access transistors to achieve better write ability.



Figure 5.9: Write operation of the proposed 10T cell

### 5.3.2 Read Operation

Before reading the data stored in the cell, the bit line RBL is pre-charged to VDD. To read the cell, we activate the read word line (RWL); if a '1' is stored at the true storage node, the read buffer allows the read bit line RBL to discharge by creating a low-resistance path for current to flow from RBL to ground, as shown in Figure 5.10. Once RBL discharges to a voltage level sufficient for reliable sensing by the inverter sense amplifier, the sense amplifier is enabled and gives a full-swing output signal. Because an inverter is used as the sense amplifier for this single ended read operation, a larger signal swing is needed to sense the variation in bit-line voltage than with differential sensing. The switching conditions of all signals during write, read and hold operations are summarized in Table 5.3.



Figure 5.10: Read operation of the proposed 10T cell

TABLE 5.3: OPERATION SUMMARY FOR PROPOSED 10T SRAM CELL

|             | Hold | Read '1'  | Read '0' | Write '1' | Write '0' |
|-------------|------|-----------|----------|-----------|-----------|
| <b>WWL</b>  | 0    | 0         | 0        | 1         | 1         |
| <b>WBL</b>  | 1    | 1         | 1        | 1         | 0         |
| <b>WBLB</b> | 1    | 1         | 1        | 0         | 1         |
| <b>RWL</b>  | 0    | 1         | 1        | 0         | 0         |
| <b>RBL</b>  | 1    | Discharge | 1        | 1         | 1         |

## 5.4 Simulation Results of the proposed 10T SRAM Cell

To validate the design of the proposed 10T (P-10T) cell, circuit simulations were performed for the minimum cell size conditions. If the input voltage at the gate of a MOSFET drops below its threshold voltage, the device current becomes exponentially dependent on the difference between the gate voltage and the threshold voltage. Slightly improved results were observed for iso-area conditions over minimum cell size conditions. Simulations are done at a temperature of 27 °C with supply voltages from 300 mV to 800 mV for all process corners, and the worst cases are presented. The effect of Process Voltage Temperature (PVT) variations on the 6T and P-10T cells is shown to justify the stability in the sub/super-threshold region at Ultra Low Voltage (ULV) power supply. The approach followed by [17] is used to find static noise margins. For WSNM, write trip point analysis is used.
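The exponential dependence mentioned above can be illustrated with the standard first-order subthreshold current model. The sketch below is not the simulation model used in this work; the function names, the prefactor `i0` and the slope factor `n` are assumed illustrative values, and the drain-voltage term is ignored.

```python
import math

BOLTZMANN = 1.380649e-23           # J/K
ELECTRON_CHARGE = 1.602176634e-19  # C


def thermal_voltage(temp_kelvin):
    """Thermal voltage kT/q in volts."""
    return BOLTZMANN * temp_kelvin / ELECTRON_CHARGE


def subthreshold_current(vgs, vth, i0=1e-7, n=1.5, temp_kelvin=300.15):
    """First-order subthreshold drain current.

    i0 (A) and n are assumed illustrative parameters, not extracted values.
    """
    return i0 * math.exp((vgs - vth) / (n * thermal_voltage(temp_kelvin)))


# Gate-voltage swing that changes the current by one decade at 27 degC.
n_assumed = 1.5
swing = n_assumed * thermal_voltage(300.15) * math.log(10.0)
ratio = (subthreshold_current(0.30, 0.35, n=n_assumed)
         / subthreshold_current(0.30 - swing, 0.35, n=n_assumed))
print(f"swing = {swing * 1e3:.1f} mV/decade, current ratio = {ratio:.2f}")
```

With these assumed parameters, lowering the gate voltage by one swing reduces the current tenfold, which is why small threshold-voltage shifts matter so much in the sub-threshold region.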

### 5.4.1 Read Static Noise Margin (RSNM)

The read static noise margin can be found from the diagonal or the side of the maximum square embedded between the two inverter voltage transfer characteristics of an SRAM cell. Read decoupling schemes ensure the isolation of the true storage node from RBL, and read access without disturbing the stored value at this node ultimately increases RSNM. For the fast NMOS and slow PMOS (FS) worst-case corner, the RSNM distribution for 6T and P-10T is shown in Figure 5.11.
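The maximum-square definition can be sketched numerically as follows. This is only an illustration: the tanh-shaped transfer curve is an assumed idealized inverter model (in practice the curves come from circuit simulation), and all function names are hypothetical.

```python
import math


def tanh_inverter(vdd, gain):
    """Return an idealized inverter transfer curve (assumed model)."""
    def vtc(vin):
        return 0.5 * vdd * (1.0 - math.tanh(gain * (vin - 0.5 * vdd) / vdd))
    return vtc


def lobe_square_side(f_top, f_left, vdd, steps=400):
    """Side of the largest square in the lobe bounded by
    y = f_top(x) from above and x = f_left(y) from the left."""
    best = 0.0
    for i in range(steps + 1):
        y0 = vdd * i / steps
        x0 = f_left(y0)  # bottom-left corner lies on the mirrored curve
        if f_top(x0) - y0 <= 0.0:
            continue     # this starting point is outside the lobe
        # Grow the square until its top-right corner touches y = f_top(x).
        lo, hi = 0.0, vdd
        for _ in range(50):
            mid = 0.5 * (lo + hi)
            if f_top(x0 + mid) - (y0 + mid) >= 0.0:
                lo = mid
            else:
                hi = mid
        best = max(best, lo)
    return best


def static_noise_margin(f1, f2, vdd):
    """SNM is set by the smaller of the two butterfly lobes."""
    return min(lobe_square_side(f1, f2, vdd), lobe_square_side(f2, f1, vdd))


vdd = 0.3
snm_steep = static_noise_margin(tanh_inverter(vdd, 1000.0),
                                tanh_inverter(vdd, 1000.0), vdd)
snm_soft = static_noise_margin(tanh_inverter(vdd, 5.0),
                               tanh_inverter(vdd, 5.0), vdd)
print(f"steep inverters: {snm_steep * 1e3:.1f} mV, "
      f"soft inverters: {snm_soft * 1e3:.1f} mV")
```

For nearly ideal (very steep) inverters the margin approaches VDD/2, while lower inverter gain shrinks the square; for unequal inverters the smaller lobe sets the margin.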



Figure 5.11: Read static noise margin vs power supply

### 5.4.2 Hold Static Noise Margin (HSNM)

To find the HSNM we follow the same definition as used to find RSNM. In the hold mode, the robustness of the inverter pair determines the data retention capability. Both hold and write operations are satisfied with specific aspect ratios, which results in a slight improvement of HSNM while maintaining WSNM higher than 6T, as shown in Figure 5.12.



Figure 5.12: Hold static noise margin vs power supply

### 5.4.3 Write Trip Point (WTP)

The write trip point (WTP) is one measure of the write static noise margin (WSNM); it is the voltage difference between the bit-line and VSS when the cell data flips. The write margin value and its variation are functions of the cell design, SRAM array size and process variation. A cell is considered not writeable if the worst-case write margin becomes lower than the ground potential.



Figure 5.13: Write trip point vs power supply

Write-ability is a major concern for an SRAM cell operating in the sub-threshold region, so careful selection of the circuit topology, its design and dynamic supply variation is of crucial importance. Due to the independent write and read ports, the inverter pair can be optimized for better write conditions. It can be observed from Figure 5.13 that the WTP for P-10T is higher than that of 6T and 8T. The results indicate that the cell has high yield even in the low supply-voltage range.

### 5.4.4 Write Delay

The voltage level up to which the true storage node can be charged or discharged is closely related to the length of the write period. Because the requirements do not conflict, the aspect ratios of the P-10T cell can be optimized for write and read separately.



Figure 5.14: Write delay vs. power supply

Therefore, the P-10T cell achieves better write performance, as shown in Figure 5.14. Write delay is calculated as the time difference between the 50% rise in word line voltage and the intersection point of the storage nodes.
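The write delay definition above can be sketched as a measurement on sampled waveforms. The helper names and the example waveform values below are hypothetical and serve only to illustrate the definition; they are not simulation data from this work.

```python
def crossing_time(times, values, level):
    """First time at which `values` crosses `level` (linear interpolation).

    Returns None if the level is never reached.
    """
    for k in range(1, len(times)):
        a = values[k - 1] - level
        b = values[k] - level
        if a == 0.0:
            return times[k - 1]
        if a * b < 0.0:
            frac = a / (a - b)
            return times[k - 1] + frac * (times[k] - times[k - 1])
    if values[-1] == level:
        return times[-1]
    return None


def write_delay(times, wl, q, qb, vdd):
    """Time from the 50% point of the word line to the Q/QB intersection.

    Returns None when the storage nodes never cross (write failure).
    """
    t_wl = crossing_time(times, wl, 0.5 * vdd)
    diff = [x - y for x, y in zip(q, qb)]
    t_flip = crossing_time(times, diff, 0.0)
    if t_wl is None or t_flip is None:
        return None
    return t_flip - t_wl


# Hypothetical waveforms (time in ps, voltage in V) for VDD = 0.3 V.
t = [0.0, 100.0, 200.0, 300.0, 400.0]
wl = [0.0, 0.3, 0.3, 0.3, 0.3]
q = [0.3, 0.3, 0.2, 0.0, 0.0]
qb = [0.0, 0.0, 0.1, 0.3, 0.3]
delay_ps = write_delay(t, wl, q, qb, vdd=0.3)
failed = write_delay(t, wl, [0.3] * 5, [0.0] * 5, vdd=0.3)
print(delay_ps, failed)
```

Returning `None` when the storage nodes never intersect mirrors the "Write Fail" entries reported for cells that cannot be written at a given supply voltage.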

### 5.4.5 Read Delay

As Figure 5.15 shows, at 300-800mV the read delay of P-10T is equal to the 6T read delay because the isolation provided by the read buffer leads to an independent path for the cell current. It is calculated as the time taken to create a 100mV difference on the bit-line after activation of the read word line. As the same aspect ratios are used in the read path, the delay is the same for both 6T and the proposed 10T.
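As a first-order check on this definition, the read delay can be estimated by assuming that the bit-cell discharges the bit line with a roughly constant read current. This is a simplifying assumption rather than the simulation method used here; the symbols $C_{BL}$ (bit-line capacitance), $I_{read}$ (cell read current) and $\Delta V$ (the 100mV sensing differential) are introduced only for this sketch.

$$t_{read} \approx \frac{C_{BL}\,\Delta V}{I_{read}}, \qquad \Delta V = 100\,\text{mV}$$

Under this approximation, equal read-path aspect ratios imply a similar $I_{read}$ and therefore a similar read delay for the same bit-line capacitance.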



Figure 5.15: Read delay vs. power supply

### 5.4.6 Write and Read Power Consumption

Scaling the threshold voltages of the devices down along with the supply voltage exponentially increases the standby leakage current ($I_{off}$) of the circuit [P-87]. The growing leakage currents and power, which is wasted as generated heat, limit practical power supply scaling. Process variations together with low $V_{TH}$ devices can significantly increase the absolute leakage magnitude.

The die-to-die and intra-die parameter variations are also worsening with technology scaling. These variations affect the maximum clock frequency and leakage power distributions. Variation effects are more pronounced at low supply voltages (VDD).

Average power is calculated for one complete cycle of write and read operation which is shown in Figure 5.16 and Figure 5.17 respectively. The proposed 10T requires less power to write/read the cell as compared to the conventional 6T cell.
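The averaging over one complete cycle can be sketched as a numerical integration of instantaneous power. The function name and the sample values below are hypothetical; in practice the voltage and current samples would come from the transient simulation.

```python
def average_power(times, supply_voltage, supply_current):
    """Average of v(t) * i(t) over the sampled window (trapezoidal rule)."""
    energy = 0.0
    for k in range(1, len(times)):
        p_prev = supply_voltage[k - 1] * supply_current[k - 1]
        p_next = supply_voltage[k] * supply_current[k]
        energy += 0.5 * (p_prev + p_next) * (times[k] - times[k - 1])
    return energy / (times[-1] - times[0])


# Hypothetical samples: a triangular current pulse on a 0.3 V supply.
t = [0.0, 1e-9, 2e-9]          # seconds
v = [0.3, 0.3, 0.3]            # volts
i = [0.0, 2e-7, 0.0]           # amperes
p_avg = average_power(t, v, i)
print(f"average power = {p_avg * 1e9:.1f} nW")
```

The integration window should span exactly one write or read cycle so that the result is comparable between cells.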



Figure 5.16: Write power consumption vs power supply



Figure 5.17: Read power consumption vs power supply

### 5.4.7 Standby Power Consumption

Static power consumption is shown in Figure 5.18, where the proposed 10T consumes more power because the read buffer circuit is included when calculating the retention-period power.



Figure 5.18: Standby power consumption vs power supply

### 5.4.8 Comparison Summary

Table 5.4 compares the parameter values calculated for the sub-threshold operating region. At 0.3V, the proposed 10T cell is the only one of the three that completes a write; it also shows the highest WTP and the lowest write and read power, at the cost of slightly higher standby power than the conventional 6T cell.

TABLE 5.4: COMPARISON OF THE PROPOSED 10T, CONVENTIONAL 6T AND CONVENTIONAL 8T

| (At 0.3V power supply)    | Proposed 10T | Conventional 6T | Conventional 8T |
|---------------------------|--------------|-----------------|-----------------|
| <b>RSNM (mV)</b>          | 72.9         | 14.7            | 72.4            |
| <b>HSNM (mV)</b>          | 72.9         | 68.5            | 72.9            |
| <b>WTP (mV)</b>           | 142          | -ve             | 72.4            |
| <b>Write Delay (ps)</b>   | 307.1        | Write Fail      | Write Fail      |
| <b>Write Power (nW)</b>   | 37.2         | 75.4            | 74.6            |
| <b>Read Delay (ps)</b>    | 153.5        | 152.1           | 153.2           |
| <b>Read Power (nW)</b>    | 7.09         | 11              | 8.4             |
| <b>Standby Power (nW)</b> | 2            | 1.7             | 2               |

## 5.5 High Speed Sensing For Single Ended Bit-cells

Cell stability has become a critical component for low power SRAM operation. As the cell supply goes down to satisfy ultra-low power applications such as sensor nodes, the static noise margin (SNM) decreases, which directly affects the yield of the SRAM. Cell stability is enhanced significantly by introducing a separate bit line, which does not introduce any noise inside the SRAM cell during the read operation. For the single-ended large-signal sensing shown in Figure 5.19, all SRAM cells are connected to a common read bit line (RBLC). An inverter is used to sense the signal on a global bit-line (GRBL) when the read word-line (RWL) is activated. In addition to the variations in bit-line voltage levels due to device mismatch in the transistors connected to the bit-line, the inverter trip voltage is used to evaluate the sensing margins for sub-threshold/super-threshold operations.



Figure 5.19: Conventional single ended read sensing scheme

However, improving the access time is a major challenge for single ended cells. This can be a major hurdle, since SRAM memory is used in cache applications where read time is very critical for microprocessor performance.

### 5.5.1 Review of Sensing Schemes

- Data-aware Sensing Reference (DASR)

This scheme is capable of maintaining sensing margins for both read-0 and read-1 under given timing constraints. The key mechanism involved in maintaining the sensing margin is the adaptive changing of the reference voltage, such that the sensing headroom or potential range for read-1 and read-0 overlap, as in differential BL sensing [10].

- Pseudo differential single ended current mode sense amplifier.

It demonstrates that this design can deliver a performance similar to that of conventional current mode differential amplifier without using dual bit-line for read operation [11].

This is a scheme where dual rails are still used for the sensing circuit. Both bit lines for each column are present in standard single-ended scheme although only one of them is used. Compared to domino-based single-ended sensing scheme, the sensing circuit consists of more transistors. However, there is only a slight increase in area because a sensing circuit is shared by multiple bits [12].

- Current direction sense circuit

The sense circuit's input node is clamped at an intermediate voltage level, and the circuit transforms current direction into a logic value. It operates four times faster than a CMOS inverter when driver sizes are equal. When it is applied to a single-end multiport SRAM, access is 3.2 times faster than that with a CMOS inverter, with no increase in power consumption [13].

- Write bit-line swing control circuit.

The write bit-line swing control circuit reduces the bit-line precharge level within the limit of correct operation by using a memory cell replica. The control circuit reduces power consumption for bit-line driving and the pseudo-read cell current by 40% [13].

### 5.5.2 The proposed Sensing Scheme

There are many ways to improve the performance; we have chosen charge sharing. The proposed scheme uses the basic idea of charge sharing between two capacitors. When two capacitors are connected together as shown in Figure 5.20, charge is shared between them.



Figure 5.20: Charge sharing between two capacitors

Assume that initially $V_a = \text{VDD}$ and $V_b = 0$.

When the switch is closed, charge conservation gives:

$$C_a \cdot \text{VDD} = (C_a + C_b)\, V_{final}$$

$$V_{final} = \text{VDD} \cdot \frac{C_a}{C_a + C_b}$$

Because the charge/discharge rate of a capacitor depends on its voltage and capacitance, if $V_a \gg V_b$ and $C_a \gg C_b$, the voltage across the smaller capacitor $C_b$ changes faster than that across $C_a$.
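The charge-sharing relation can be checked with a short numerical sketch. It assumes an ideal switch and complete charge redistribution, which a real sensing circuit may not reach within the read window; the function name and the normalized values are hypothetical. Capacitance ratios of 8 and 2 are used because these ratios are examined in Section 5.5.5.

```python
def charge_share(c_a, v_a, c_b, v_b):
    """Final common voltage after two capacitors are connected (ideal switch)."""
    return (c_a * v_a + c_b * v_b) / (c_a + c_b)


vdd = 1.0      # normalized supply (assumed)
c_small = 1.0  # normalized capacitance of the smaller capacitor (assumed)

# Smaller capacitor precharged to VDD, larger capacitor predischarged to ground.
v_final_8x = charge_share(c_small, vdd, 8.0 * c_small, 0.0)
v_final_2x = charge_share(c_small, vdd, 2.0 * c_small, 0.0)
print(f"8x ratio: {v_final_8x:.3f} VDD, 2x ratio: {v_final_2x:.3f} VDD")
```

With the larger capacitor initially at ground, the shared voltage settles at VDD/9 for the 8x ratio and VDD/3 for the 2x ratio, so a larger ratio pulls the precharged node much closer to ground.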

### 5.5.3 Design of the proposed Scheme

Since the capacitance of a local bit line is lower than that of the global bit line, we can utilize charge sharing to increase the charge/discharge rate of both bit lines. As shown in Figure 5.21, the precharge block provides the initial charge on the bit line capacitances. Memory bit-cells are directly connected to the local read bit line (RBLC\_CS). One switch connects the local bit line to the global bit line (GRBL\_CS) according to the operational need. The evaluation block takes its input from both bit lines and outputs the data stored in the bit-cell.

### 5.5.4 Sensing Operation

First, we precharge RBLC\_CS to VDD and predischarge GRBL\_CS to ground. Then we activate RWL, S and the evaluation block at the same time. Charge sharing occurs between the RBLC\_CS and GRBL\_CS capacitances (Figure 5.21). If the true node of the accessed bit-cell has a high value, it allows RBLC\_CS to discharge, and charge sharing helps to increase the discharge rate of the local bit-line (RBLC\_CS); consequently, it speeds up the charging of GRBL\_CS. Since RBLC\_CS quickly goes low, there is a chance of a false read “1” when the true storage node of the cell holds a “0”.



Figure 5.21: The proposed charge-sharing based single ended read sensing scheme

### 5.5.5 Results

To test the functionality of the proposed scheme, we simulated the circuit for all process corners, and the worst-case conditions are presented. A four-sigma $V_{TH}$ value is applied to the accessed bit-cell while the other cells store the opposite binary value. We can check for a false read condition by operating the circuit at the fast corner, and the performance testing can be done at the slow corner. Moreover, high voltage (1.1V) with high temperature (125°C) and low voltage (0.85V) with low temperature (0°C) are set for the fast and slow corners, respectively.



Figure 5.22: Read delay comparison when global bit line capacitance is 8x of local bit-line capacitance.

While calculating the delay, a -4-sigma $V_{TH}$ for the fast corner and a +4-sigma value for the slow corner are applied in the read path of the accessed bit-cell. To activate a read operation, RWL with a 100ps rise/fall time is selected. When the global bit line capacitance (CGRBL/GRBL\_CS) is 8x the local bit line capacitance (CRBLC/RBLC\_CS), the read delay of the conventional and the proposed sensing schemes is the same at the fast corner. On the other hand, the proposed scheme shows a 6.8% performance improvement over conventional sensing at the slow corner, as depicted in Figure 5.22.

The capacitance ratio decides the charge sharing, and if CGRBL/GRBL\_CS is 8x CRBLC/RBLC\_CS, the read-0 failure probability increases. Therefore, if we can reduce CGRBL/GRBL\_CS, we will be able to reduce the read-0 failure probability along with the read delay. When CGRBL/GRBL\_CS is 2x CRBLC/RBLC\_CS, the read delay is 9.4% lower than the conventional read delay at the slow corner, as shown in Figure 5.23.



Figure 5.23: Read delay comparison when global bit line capacitance is 2x of local bit-line capacitance.

## 5.6 Chapter Summary

The large array size and high packing density of embedded SRAMs make them yield limiters in SoCs. Process variations, such as threshold offset and mismatch and the photo-lithography non-idealities that cause channel length and width variations, can severely degrade the static noise margin (SNM) of an SRAM cell. Minute defects and harsh operating conditions can degrade the stability of many cells in an SRAM array and cause them to flip their state. We reviewed the various definitions of the SNM encountered in the literature.

We analyzed the factors affecting SRAM cell stability, such as variations in process parameters and operating voltage. A feedback-breaking technique is used to optimize the bit-cell to cope with the problems of the conventional 6T cell (read disturb, write failure and half select).

We have shown that using separate write and read circuits significantly increases the write and read static noise margins. To save area, we have used minimum-size devices wherever possible.

We also reviewed different techniques for improving access time that are suitable for single-ended SRAM cells. We suggested a charge-sharing scheme to improve the performance of large-signal sensing. We checked the functionality of the proposed designs and presented simulation results for the worst cases with different parameters.

## References

- [1] Takeda K. et al. (2006), A read-static-noise-margin-free SRAM cell for low-VDD and high-speed applications, IEEE J. Solid-State Circuits, vol. 41, no. 1, pp. 113–121.
- [2] Chang M.-F. et al. (2011), A 130 mV SRAM with expanded write and read margins for sub-threshold applications, IEEE J. Solid-State Circuits, vol. 46, no. 2, pp. 520-529.
- [3] Aly Ramy E. and Bayoumi Magdy A. (2007), Low-Power Cache Design Using 7T SRAM Cell, IEEE Trans. Circuits and Systems-II: Express Brief, vol. 54, no. 4, pp. 318-322.
- [4] Prasad S. Rajendra, Madhavi B. K., Kishore K. Lal (2012), Design of a 32nm 7T SRAM Cell Based on CNTFET for Low Power Operation, Proceedings of International Conference on Devices, Circuits and Systems (ICDCS), Macau, China, pp. 443-446.
- [5] Kulkarni et al. (2011), A Read-Disturb-Free, Differential Sensing 1R/1W Port, 8T Bitcell Array, IEEE Trans. Very Large Scale Integr. (VLSI) Systems, vol. 19, pp. 1727-1730.
- [6] Na et al. (2010), A Differential Read Subthreshold SRAM Bitcell with Self-adaptive Leakage Cut Off Scheme, Proceedings of IEEE International SOC Conference (SOCC), Indiana, USA, pp. 455-460.
- [7] Feki et al. (2012), Proposal of a New Ultra Low Leakage 10T Sub threshold SRAM Bitcell, Proceedings of ISOCC, Jeju Island, South Korea, pp. 470-474.
- [8] Song et al. (2010), Fully-gated ground 10T-SRAM bitcell in 45nm SOI technology, Electronics Letters, vol. 46 no. 7, pp. 515-516.
- [9] Reddy et al. (2008), Process Variation Tolerant 9T SRAM Bitcell Design, Proceedings of 13th Int'l Symposium on Quality Electronic Design, CA, USA, pp. 493-497.
- [10] Yang et al. (2013), Low-Voltage Embedded NAND-ROM Macros Using Data-Aware Sensing Reference Scheme for VDDmin, Speed and Power Improvement, IEEE J. Solid-State Circuits, vol. 48, no. 2, pp. 611-623.
- [11] Sil et al. (2008), High Speed Single-Ended Pseudo Differential Current Sense Amplifier for SRAM Cell, Proceedings of IEEE International Symposium Circuits and Systems (ISCAS), Knoxville, Tennessee, pp. 3330-3333.
- [12] Ye et al. (2006), Evaluation of Differential vs. Single-Ended Sensing and Asymmetric Cells in 90nm Logic Technology for On-Chip Caches, Proceedings of IEEE International Symposium Circuits and Systems (ISCAS), San Juan, Puerto Rico, pp. 963-966.
- [13] Izumikawa Masanori and Yamashina Masakazu (1996), A Current Direction Sense Technique for Multiport SRAM's, IEEE J. Solid-State Circuits, vol. 31, no. 4, pp. 546-551.

# Chapter 6

## Conclusions and Scope of Future Work

The research vision of this project is to design new circuit architectures that achieve ultra-low power consumption in conventional bulk MOSFET and future cutting-edge FinFET technologies. The main focus of this work is to find novel circuit topologies that improve the practicality of ultra-low power SRAMs. To achieve this goal, a single-ended 8T SRAM cell with high data stability (high $\mu$ and low $\sigma$) that operates at ultra-low supply voltages is presented. We attained enhanced static noise margin (SNM) in the sub-threshold regime using single-ended with dynamic feedback cutting (SE-DFC) and read decoupling schemes. The area of the proposed cell is twice that of the 6T cell. Still, its better built-in process tolerance and dynamic voltage applicability enable it to be employed in the same way as other cells (8T, 9T and 10T), with an area overhead of $\sim 1.9x$. The proposed 8T cell has high stability and can be operated at an ultra-low supply voltage of 200-300mV. The reduced power consumption of the proposed 8T cell enables it to be employed in battery-operated SoC designs.

A single-ended boost-less (SE-BL) 7T SRAM cell utilizing dynamic feedback cutting and dynamic read decoupling is presented. The proposed 7T SRAM cell has improved read margin, write-ability and performance, and lower power consumption, compared with the conventional 5T SRAM cell, at a small area overhead. The high mean ($\mu$) and low standard deviation ($\sigma$) of the 7T cell result in successful operation at a 300mV power supply. With these favorable properties, the proposed 7T cell can be employed in battery-operated system-on-chip (SoC) designs.

A robust and reliable single-ended boost-less 7T FinFET SRAM with high data stability under PVT variations is presented. Using feedback cutting and read decoupling schemes, we attained high WSNM, HSNM and read margin in the sub/super-threshold regime. The temperature-variation analysis indicates that the 7T cell is less affected than the 5T cell. Simulation results using 20nm BSIM FinFET technology indicate that the proposed 7T cell can be operated at 0.2V. This ultra-low power FinFET 7T SRAM cell with improved stability and performance can be embedded in battery-operated systems.

We have analysed the factors affecting SRAM cell stability, such as variations in process parameters and operating voltage. The proposed differential 8T and read-decoupled 10T cells use a broken-feedback technique to mitigate the problems of the conventional 6T cell (read disturb, write failure and half select). We have shown that using separate write and read circuits significantly increases the write and read static noise margins. To save area, we have used minimum-size devices wherever possible.

Potential future applications of the proposed SRAMs lie in low/ultra-low voltage, medium-frequency operation, such as biomedical implants, wireless sensing, mobile devices, multimedia gadgets, neural signal processors and low-voltage caches. With the reduction in energy/power consumption and the improvement in reliability, we hope that the proposed SRAM systems can be employed as embedded SRAM and will open the possibility of many new architectural options. Moreover, highly energy-constrained applications will also benefit from these memory-rich processor architectures. So far, however, new techniques to overcome reliability issues have been considered only at the single-device level or the circuit level. As CMOS scaling continues, we believe there is a need for a new design paradigm, a device/circuit co-design methodology, leading to properly optimized circuits and systems that take new device innovations into account.

The capacity of the 1kb SRAM array designed in this thesis is not sufficient for all digital systems, especially for some signal processing or data collection systems. Therefore, the design of a sub-threshold SRAM macro with larger memory capacity is an interesting direction for further work. Although some novel single-ended and differential SRAM designs are proposed here, further possibilities exist to find ground-breaking ideas for other circuit topologies. Asymmetric SRAMs could also be useful for improving the performance of ultra-low power SRAMs. The SRAM design will be fabricated in the future, and the targeted applications of the proposed ultra-low-power SRAM designs are low-power biomedical and space applications. The designed sub-threshold SRAM is therefore intended to be integrated in digital systems, which inevitably involve sub-threshold microprocessors. Hence, the design and fabrication of a sub-threshold microprocessor is one of the main research topics following sub-threshold SRAM design. Developing new techniques that enable circuits to operate at ultra-low supply voltages below 200mV for biomedical applications is one of the most important goals for future research. Furthermore, process variation effects increase with aggressive transistor scaling. As a result, digital design at such a low voltage (e.g. 200mV) in the presence of large process variations (random dopant fluctuation) is an interesting challenge.

With the continuous advancement in technology, the density of embedded SRAM has grown substantially, and it has become the main consumer of power in a system-on-chip (SoC). Therefore, the future course of action involves effective reduction of leakage in SRAM through collaboration between research in device physics and integrated circuit design methodology.

Finding new methods to optimize memory arrays is one of the most important challenges in electronic design. Recently presented memory designs such as Spin Torque Transfer RAM (STT-RAM), phase-change memory, etc. show promising potential to improve the area, speed and power consumption of memory arrays.