Low-power high-level synthesis for nanoscale CMOS circuits

Low-Power High-Level Synthesis for Nanoscale CMOS Circuits addresses the need for analysis, characterization, estimation, and optimization of the various forms of power dissipation in the presence of process variations in nano-CMOS technologies. The authors show very large-scale integration (VLSI) researchers and engineers how to minimize the different types of power consumption in digital circuits. The material deals primarily with high-level (architectural or behavioral) energy dissipation because the behavioral level is neither as highly abstracted as the system level nor as complex as the gate/transistor level. The behavioral level offers a balanced degree of freedom to explore power reduction mechanisms and greater power reduction opportunities, and work at this level can cost-effectively help in investigating lower-power design alternatives before actual circuit layout or silicon implementation.

The book is a self-contained low-power, high-level synthesis text for nanoscale VLSI design engineers and researchers. Each chapter has simple, relevant examples for a better grasp of the principles presented. Several algorithms are given to provide a better understanding of the underlying concepts. The initial chapters deal with the basics of high-level synthesis, power dissipation mechanisms, and power estimation. Subsequent parts of the text discuss in detail methodologies for reducing different types of power, including:

* Power Reduction Fundamentals

* Energy or Average Power Reduction

* Peak Power Reduction

* Transient Power Reduction

* Leakage Power Reduction

Low-Power High-Level Synthesis for Nanoscale CMOS Circuits provides a valuable resource for the design of low-power CMOS circuits.

Acronym Definition	p. xxix
1 Introduction	p. 1
2 High-Level Synthesis Fundamentals	p. 5
2.1 Introduction	p. 5
2.2 The Complete Chip Story: From Customers' Requirements to Silicon Chips for Customers	p. 5
2.3 Various Phases of Circuit Design and Synthesis	p. 7
2.4 High-Level or Behavioral Synthesis: What and Why	p. 10
2.5 Various Phases of High-Level Synthesis	p. 11
2.5.1 Compilation	p. 12
2.5.2 Transformation	p. 12
2.5.3 Scheduling	p. 13
2.5.4 Selection or Allocation	p. 13
2.5.5 Binding or Assignment	p. 13
2.5.6 Output Generation	p. 14
2.5.7 A Demonstrative Example	p. 14
2.6 Behavioral HDL to CDFG Translation or Compilation	p. 14
2.7 Scheduling Algorithms	p. 16
2.7.1 ASAP and ALAP Scheduling and Mobility	p. 19
2.7.2 Integer Linear Programming (ILP) Scheduling	p. 19
2.7.3 List-Based Scheduling (LBS)	p. 23
2.7.4 Force-Directed Scheduling (FDS)	p. 25
2.7.5 Game Theory Scheduling (GTS)	p. 26
2.7.6 Tabu Search Scheduling (TSS)	p. 28
2.7.7 Simulated Annealing Scheduling (SAS)	p. 29
2.7.8 Genetic Algorithm Scheduling (GAS)	p. 30
2.7.9 Ant Colony Scheduling (ACS)	p. 30
2.7.10 Automata-Based Symbolic Scheduling	p. 31
2.7.11 Chaining, Multicycling and Pipelining Data Paths	p. 31
2.8 Binding or Allocation Algorithms	p. 32
2.8.1 Clique Partitioning Approach	p. 33
2.8.2 Graph Coloring Approach	p. 33
2.8.3 Left Edge Algorithm for Register Optimization	p. 35
2.8.4 Integer Linear Programming (ILP) Binding	p. 36
2.8.5 Heuristic Algorithm to Solve Clique Partitioning	p. 37
2.8.6 GTS Algorithm	p. 37
2.9 Control Synthesis	p. 38
2.10 High-Level Synthesis Benchmarks	p. 38
2.11 High-Level Synthesis Tools	p. 44
2.11.1 CatapultC from Mentor Graphics	p. 44
2.11.2 CyberWorkBench from NEC	p. 44
2.11.3 PICO Express from Synfora	p. 44
2.11.4 Cynthesizer from Forte Design Systems	p. 44
2.11.5 Cascade from Critical Blue	p. 45
2.11.6 Agility Compiler from Celoxica	p. 45
2.11.7 eXCite from Y Explorations	p. 45
2.11.8 ESEComp from BlueSpec	p. 45
2.11.9 VCS from Synopsys	p. 45
2.11.10 NC-SC, NC-Verilog and NC-VHDL from Cadence	p. 45
2.11.11 Synplify from Synplicity	p. 46
2.11.12 ISE from Xilinx	p. 46
2.11.13 Quartus from Altera	p. 46
2.12 Summary and Conclusions	p. 46
3 Power Modeling and Estimation at Transistor and Logic Gate Levels	p. 47
3.1 Introduction	p. 47
3.2 CMOS Technology Trends	p. 48
3.3 Current Conduction Mechanisms in Nano-CMOS Devices: A Resume	p. 49
3.3.1 The Ideal ON and OFF States	p. 49
3.3.2 Junction Reverse Bias Current	p. 50
3.3.3 Drain-Induced Barrier Lowering (DIBL)	p. 51
3.3.4 Subthreshold Leakage	p. 51
3.3.5 Gate-Induced Drain Leakage (GIDL)	p. 52
3.3.6 Punch-Through	p. 53
3.3.7 Hot-Carrier Injection	p. 53
3.3.8 Band-to-Band Tunneling (BTBT)	p. 54
3.3.9 Gate-Oxide Tunneling	p. 55
3.4 Power Dissipation in Nano-CMOS Logic Gates	p. 59
3.4.1 Static, Dynamic and Leakage Power Dissipation	p. 59
3.4.2 Case Study: The 45 nm NOT, NAND, NOR CMOS Gates	p. 60
3.5 Process Variation Effects	p. 67
3.5.1 Origins and Sources of Process Variation	p. 67
3.5.2 Methodologies to Accommodate Process Variation	p. 68
3.6 From Gates to Functional Units: A Power Modeling and Estimation Perspective	p. 72
3.6.1 SPICE level	p. 73
3.6.2 Probabilistic and Statistical Techniques	p. 75
3.7 Summary and Conclusions	p. 78
4 Architectural Power Modeling and Estimation	p. 81
4.1 Introduction	p. 81
4.2 Architecture-Level Estimation	p. 84
4.3 Dynamic Power Modeling and Estimation	p. 90
4.3.1 Abstract Data Path Power Estimation	p. 91
4.3.2 Capacitance Estimation	p. 92
4.3.3 Macro-modeling for Dynamic Power	p. 93
4.3.4 Estimation of Bounds on Average Power	p. 94
4.4 Leakage Modeling	p. 94
4.4.1 Subthreshold and Gate-Oxide Leakage Power Modeling and Estimation	p. 95
4.4.2 Methods for Total Leakage Estimation	p. 97
4.5 Modeling and Analysis of Architectural Components	p. 100
4.5.1 Design-Optimization-Aware Estimation	p. 100
4.5.2 Estimating Under Variation Effects	p. 103
4.5.3 Estimating Power in Control and Data Path Logic	p. 104
4.5.4 Communication Components	p. 106
4.6 Register Files	p. 108
4.6.1 Methodology	p. 109
4.6.2 Basic Power Model	p. 109
4.6.3 Pipelined Register Files	p. 111
4.6.4 Physical Dimensions and Latency	p. 11
4.6.5 Area, Power, Delay Models	p. 115
4.6.6 Device Sizing	p. 119
4.7 Cache Arrays	p. 120
4.7.1 CACTI Dynamic Power Model for Caches	p. 120
4.7.2 Leakage Modeling for Arrays	p. 123
4.8 Validation and Accuracy	p. 125
4.8.1 Model Validation: Arrays as an Example	p. 125
4.8.2 Simulator Accuracy	p. 127
4.8.3 Power Model Accuracy	p. 127
4.9 Effect of Temperature on Power	p. 128
4.10 Summary and Conclusions	p. 129
5 Power Reduction Fundamentals	p. 131
5.1 Introduction	p. 131
5.2 Power Dissipation or Consumption Profile of CMOS Circuits	p. 131
5.3 Why Low-Power Design?	p. 133
5.4 Why Energy or Average Power Reduction?	p. 135
5.5 Why Peak Power Minimization?	p. 136
5.6 Why Transient Power Minimization?	p. 137
5.7 Why Leakage Power Minimization?	p. 137
5.8 Power Reduction Mechanisms at Different Levels of Abstraction	p. 138
5.9 Why Power Optimization During High-Level or Behavioral Synthesis?	p. 138
5.10 Methods for Power Reduction in High-Level Synthesis	p. 139
5.11 Frequency and/or Voltage Scaling for Dynamic Power Reduction	p. 140
5.11.1 What Is Voltage or Frequency Scaling?	p. 140
5.11.2 Why Frequency and/or Voltage Scaling?	p. 142
5.11.3 Energy or Average Power Reduction Using Voltage or Frequency Scaling	p. 143
5.11.4 Peak Power Reduction Using Voltage and Frequency Scaling	p. 145
5.11.5 Issues in Multiple Supply Voltage-Based Design	p. 146
5.11.6 Voltage-Level Converter Design	p. 146
5.11.7 Dynamic Frequency Clocking Unit Design	p. 148
5.12 V_Th Scaling for Subthreshold Leakage Reduction	p. 150
5.12.1 The Concept	p. 150
5.12.2 Multiple Threshold CMOS (MTCMOS) Technology	p. 151
5.12.3 Variable Threshold CMOS (VTCMOS) Technology	p. 152
5.12.4 Dynamic Threshold CMOS (DTCMOS) Technology	p. 152
5.12.5 Leakage Control Transistor (LECTOR) Technique	p. 152
5.12.6 The Issues	p. 153
5.13 T_ox, K or L Scaling for Gate-Oxide Leakage Reduction	p. 153
5.13.1 The Concept	p. 153
5.13.2 Multiple Oxide Thickness CMOS (MOXCMOS) Technology	p. 154
5.13.3 Multiple Dielectric (k) (MKCMOS) Technology	p. 154
5.13.4 The Issues	p. 154
5.14 Transformation Techniques for Power Reduction	p. 155
5.14.1 Operation Reduction	p. 155
5.14.2 Operation Substitution	p. 156
5.15 Increased Parallelism and Pipelining with Architecture-Driven Voltage Scaling for Power Reduction	p. 156
5.15.1 Parallelism with Voltage Scaling	p. 157
5.15.2 Pipelining with Voltage Scaling	p. 157
5.16 Guarded Evaluation to Reduce Power	p. 159
5.17 Precomputation-Based Power Reduction	p. 160
5.18 Clock Gating to Reduce Clock Power Dissipation	p. 161
5.19 Interconnect Power Minimization	p. 161
5.20 Summary and Conclusions	p. 161
6 Energy or Average Power Reduction	p. 163
6.1 Introduction	p. 163
6.2 Target Architecture and Data Path Specifications for Multiple Voltage	p. 164
6.3 ILP-Based Scheduling for EDP Reduction	p. 165
6.3.1 Introduction	p. 165
6.3.2 EDP Modeling of a DFG	p. 166
6.3.3 ILP Formulations for EDPs	p. 168
6.3.4 ILP-Based Data Path Scheduling Algorithm	p. 170
6.3.5 Experimental Results	p. 172
6.3.6 Conclusions	p. 174
6.4 Heuristic-Based Scheduling Algorithm for Energy Minimization	p. 176
6.4.1 Introduction	p. 176
6.4.2 Time-Constrained Scheduling: TC-DFC	p. 177
6.4.3 Resource-Constrained Scheduling: RC-DFC	p. 183
6.4.4 Experimental Results	p. 188
6.4.5 Conclusions	p. 190
6.5 Data Path Scheduling for Energy or Average Power Reduction Using Voltage Reduction	p. 191
6.5.1 Time- or Resource-Constrained Scheduling Algorithms	p. 191
6.5.2 Time- and Resource-Constrained Scheduling Algorithms	p. 193
6.6 Switching Activity Reduction During High-Level Synthesis	p. 194
6.6.1 Scheduling and/or Allocation for Switching Activity Reduction	p. 195
6.6.2 Scheduling and/or Binding for Switching Activity Reduction	p. 198
6.7 Summary and Conclusions	p. 200
7 Peak Power Reduction	p. 201
7.1 Introduction	p. 201
7.2 Peak and Average Power Dissipation Modeling of a Data Path Circuit	p. 201
7.3 ILP-Based Scheduling for Peak Power Reduction	p. 204
7.3.1 ILP Formulations	p. 205
7.3.2 ILP-Based Scheduler	p. 207
7.3.3 Experimental Results	p. 211
7.3.4 Conclusions	p. 215
7.4 ILP-Based Scheduling for Simultaneous Peak and Average Power Reduction	p. 215
7.4.1 ILP Formulations	p. 215
7.4.2 ILP-Based Scheduler	p. 216
7.4.3 Experimental Results	p. 219
7.4.4 Conclusions	p. 222
7.5 Scheduling or Binding for Peak Power Reduction	p. 222
7.5.1 Scheduling Algorithms	p. 222
7.5.2 Binding Algorithms	p. 223
7.6 Summary and Conclusions	p. 224
8 Transient Power Reduction	p. 225
8.1 Introduction	p. 225
8.2 Modeling for Power Transience or Fluctuation of a Data Path Circuit	p. 225
8.2.1 Model 1: CPF Using Mean Deviation	p. 226
8.2.2 Model 2: CPF Using Cycle-to-Cycle Gradient	p. 229
8.2.3 Minimization of CPF as an Objective Function	p. 230
8.3 Heuristic-Based Scheduling Algorithm for CPF Minimization	p. 232
8.3.1 Introduction	p. 232
8.3.2 Algorithm Flow	p. 232
8.3.3 Pseudocode of the Algorithm Heuristic	p. 234
8.3.4 Algorithm Time Complexity	p. 236
8.3.5 Experimental Results	p. 236
8.3.6 Conclusions	p. 238
8.4 Modified Cycle Power Function (CPF*)	p. 242
8.5 Linear Programming Modeling of Non-linearities	p. 244
8.5.1 Linear Programming Formulation Involving the Sum of Absolute Deviations	p. 244
8.5.2 Linear Programming Formulation Involving Fractions	p. 245
8.6 ILP Formulations to Minimize (CPF*)	p. 246
8.6.1 For MVDFC Operation	p. 246
8.6.2 For MVMC Operation	p. 249
8.7 ILP-Based Scheduling Algorithm for CPF* Minimization	p. 251
8.7.1 Introduction	p. 251
8.7.2 Algorithm	p. 252
8.7.3 Experimental Results	p. 254
8.7.4 Conclusions	p. 259
8.8 Data Monitoring for Transient Power Minimization	p. 259
8.9 Summary and Conclusions	p. 259
9 Leakage Power Reduction	p. 261
9.1 Introduction	p. 261
9.2 Gate-Oxide Leakage Reduction	p. 262
9.2.1 Dual-T_ox Technique	p. 262
9.2.2 Dual-k Technique	p. 271
9.3 Subthreshold Leakage Reduction	p. 274
9.3.1 Prioritization Algorithm for Dual-V_Th-Based Optimization	p. 274
9.3.2 MTCMOS-Based Clique Partitioning for Subthreshold Leakage Reduction	p. 275
9.3.3 MTCMOS-Based Knapsack Binding for Subthreshold Leakage Reduction	p. 275
9.3.4 Power Island Technique for Subthreshold Leakage Reduction	p. 276
9.3.5 Maximum Weight-Independent Set (MWIS) Problem Heuristic for Dual-V_Th-Based Optimization	p. 276
9.4 Summary and Conclusions	p. 276
10 Conclusions and Future Directions	p. 277
References	p. 281
Index	p. 299
