## Towards an Ultra-Low-Power Architecture Using Single-Electron Tunneling Transistors





Queen's University: Changyun Zhu, Li Shang, Robert Knobel

Northwestern University: Zhenyu Gu, Robert Dick

6 June 2007

# Outline

- Introduction
  - Power, energy, and thermal challenges
- Background
  - SET properties and challenges
- Testbed design
  - IceFlex: a hybrid SET/CMOS reconfigurable architecture
- Evaluation
  - Possible uses of SETs in low-power design
- Conclusions

## What does history teach us about power consumption?



- Vacuum tube to semiconductor device in the 1960s
- Bipolar device to CMOS transistor in the 1990s



Based on diagram by C. Johnson, IBM Server and Technology Group.

## Single electron tunneling transistor behavior

### Physical principles

- Coulomb charging effect governs electron tunneling
- Coulomb blockade: $V_{GS} = me/C_G$
  - $m = \pm 1/2, \pm 3/2, \cdots$: OFF
  - $m = 0, \pm 1, \pm 2, \cdots$: ON



# Executive Summary

### Motivation

- CMOS is approaching fabrication, power, and thermal limits
- Can new device technologies solve these problems?

# Single electron tunneling transistor (SET)

- Unique property: lowest projected power consumption
- Challenges: fabrication for room-temperature operation, offset charge noise, etc.

# Goal: investigate possible uses of SETs in low-power design

- IceFlex: fault-tolerant, SET/CMOS reconfigurable architecture
- $100\times$ energy efficiency improvement over 22 nm CMOS
- Designed for unique challenges posed by SETs

# Power challenges

- High-performance applications: energy cost, temperature, reliability
- Portable embedded systems: battery lifetime



# Single electron tunneling transistor structure

### Device structure

- Island, terminals (source, drain, gate)
- Electron tunneling through tunneling junctions



# SET properties and challenges

### Ultra low power

- Projected energy per switching event: $1 \times 10^{-18}$ J

### Room temperature and fabrication challenge

- Electrostatic charging energy must be greater than thermal energy
- $e^2/C_{\Sigma} > k_BT$
- Requires $e^2/C_{\Sigma} > 10k_BT$ or even $e^2/C_{\Sigma} > 40k_BT$

### Performance challenge

- Electrons must be confined in the island
- $R_S, R_D > h/e^2$, where $h/e^2 = 25.8\,\mathrm{k}\Omega$
- High resistance, low driving strength
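
As a quick check of the quoted resistance, using the standard constants $h \approx 6.626\times10^{-34}$ J s and $e \approx 1.602\times10^{-19}$ C (values not given in the slides):

$$
\frac{h}{e^2} \approx \frac{6.626\times10^{-34}\,\mathrm{J\,s}}{\left(1.602\times10^{-19}\,\mathrm{C}\right)^2} \approx 2.58\times10^{4}\,\Omega = 25.8\,\mathrm{k}\Omega
$$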

#### Reliability concerns

- Tunneling between charge traps causes run-time errors
- Unknown before fabrication
- Device technology: improved by silicon islands
- Reliable design: post-fabrication adaptation
- Run-time error correction

# IceFlex architecture

## Fault-tolerant, hybrid SET/CMOS reconfigurable architecture

- Multi-gate SET-based reconfigurable look-up tables and switch fabric
- SET-based arithmetic unit
- SET-based reconfiguration memory
- SET threshold logic-based majority voting logic
- Hybrid SET/CMOS multi-level interconnect fabric



# SET configuration memory

#### Multi-context on-chip storage design

- Multi-context configuration cache
- Dual-island SET design






### Interconnect

### Local interconnect

- Requires limited driving strength
- Constant-latency, SET-based design
- Simplifies physical design, i.e., routing



# IceFlex: low-power, fault-tolerant, hybrid SET/CMOS reconfigurable architecture

#### Goal

Develop a testbed to investigate possible uses of SETs in low-power embedded system design

### Design metrics

Power consumption, performance, reliability, fabrication, cooling

### SET-specific design features

- Fabrication challenge: regular architecture to ease fabrication
- Reliability challenge: built-in redundancy, fault-tolerant design
- Performance challenge: hybrid SET/CMOS design
- Unique properties: multi-gate design for non-linearly-separable functions and voting logic

# Multi-gate SET reconfigurable lookup table

#### SET multi-gate integration

- Gate charging effect: a function of $\sum C_{G_i} V_{GS_i}$
- Multiplexer design: reduces logic depth, hence circuit delay
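
The summed gate charge is what lets a single multi-gate SET realize functions that one threshold gate cannot. The sketch below is an idealized model, not the authors' circuit. It assumes the ON/OFF convention stated on the physical-principles slide (integer $m$ is ON, half-integer $m$ is OFF), equal gate capacitances, and an input swing chosen so that each logic-high input adds $e/2$ of gate charge. The names `set_is_on`, `C_G`, and `V_HIGH`, and the 1 aF capacitance, are illustrative assumptions.

```python
from itertools import product

E = 1.602176634e-19     # elementary charge (C)
C_G = 1e-18             # assumed gate capacitance per input (F), illustrative only
V_HIGH = E / (2 * C_G)  # assumed logic-high voltage: adds e/2 of gate charge


def set_is_on(gate_caps, gate_voltages):
    """Idealized SET state from the summed gate charge sum(C_Gi * V_GSi)."""
    # Total gate charge expressed in units of the elementary charge.
    m = sum(c * v for c, v in zip(gate_caps, gate_voltages)) / E
    # Distance from m to the nearest integer: 0 at integer m, 0.5 at half-integer m.
    distance = abs(m - round(m))
    # Slide convention: integer m is ON, half-integer m is OFF.
    # The 0.25 cut-off is an arbitrary midpoint for this two-state idealization.
    return distance < 0.25


if __name__ == "__main__":
    # Enumerate all input combinations of a hypothetical two-gate device.
    for bits in product((0, 1), repeat=2):
        voltages = [b * V_HIGH for b in bits]
        state = "ON" if set_is_on([C_G, C_G], voltages) else "OFF"
        print(bits, state)
```

Under these assumptions the two-input device is ON for inputs 00 and 11 and OFF for 01 and 10, which is an XNOR and therefore not linearly separable.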



Figure panels: m-to-1 multi-gate multiplexer SET tree; $m_c$-to-1 multi-gate SET multiplexer.

# Efficient SET arithmetic function



Figure: plot against $V_{GS}$ (mV), with axis ticks spanning -60 to 60.

## Potential uses of single-electron tunneling transistors

### Application domains

- High-performance applications
- Battery-powered systems

### Design metrics

- Power, performance
- Fabrication, reliability

| Benchmark | Description                    | Benchmark   | Description                  |
|-----------|--------------------------------|-------------|------------------------------|
| AES       | AES (Rijndael) IP core         | ARM7        | Power-efficient RISC CPU     |
| AVR       | ATMega103 microcontroller      | ASPIDA DLX  | Synchronous / DLX core       |
| CORDIC    | Coordinate rotation computer   | Jam RISC    | Five-stage pipeline RISC CPU |
| ECC       | ECC core                       | LEON2 SPARC | Entire SPARC V8 processor    |
| FPU       | 32-bit IEEE 754 floating-point | Microblaze  | RISC CPU                     |
| RS        | Reed Solomon encoder           | miniMIPS    | MIPS I clone                 |
| USB       | USB 2.0 function               | MIPS        | MIPS processor               |
| VC        | Video compression systems      | Plasma      | Supports most MIPS I opcodes |
| UCore     | MIPS I integer only clone      | YACC        | MIPS I clone                 |





IceFlex optimized for battery-powered applications



## Reliability

### Impact of Majority Voting Logic

- Majority voting logic (MVL) can significantly reduce circuit failures
- IceFlex supports run-time failure detection and correction
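
A rough feel for the benefit of majority voting can be obtained from a textbook redundancy model. The sketch below is not the IceFlex evaluation method. It assumes `n` replicated copies that fail independently with the same probability `p`, and a fault-free voter. The function name `majority_failure_probability` is hypothetical.

```python
from math import comb


def majority_failure_probability(p, n=3):
    """Probability that a majority of n independent copies fail (n must be odd)."""
    if n % 2 == 0:
        raise ValueError("n must be odd so that a majority always exists")
    # Smallest number of failed copies that outvotes the correct ones.
    k_min = n // 2 + 1
    # Binomial tail: sum over all outcomes with at least k_min failed copies.
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(k_min, n + 1))


if __name__ == "__main__":
    for p in (0.1, 0.01, 0.001):
        voted = majority_failure_probability(p)
        print(f"p = {p}: single copy fails {p:.1e}, 3-way vote fails {voted:.1e}")
```

In this model voting only helps when a single copy fails less than half the time; at `p = 0.5` a three-way vote fails exactly as often as one copy.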



Recent advances in device technology may greatly reduce error rate.

Estimates by Likharev in "Single-electron devices and their applications," Proc. IEEE.

## Case study: Battery-powered applications

### Given one AA battery

IceFlex AVR can run for 20 years

### Given 5 cm<sup>3</sup> scavenging volume

- Can run at maximum frequency from vibrations (200 $\mu\mathrm{W/cm^3}$)
- Maximum frequency from temperature variations (10 $\mu\mathrm{W/cm^3}$)
- 3.7 MHz from indoor solar energy (4 $\mu\mathrm{W/cm^3}$)
- 2.8 kHz from 75 dB acoustic noise (0.003 $\mu\mathrm{W/cm^3}$)

Energy densities from Roundy, Wright, and Rabaey in "A Study of Low Level Vibrations as a Power Source for Wireless Sensor Nodes," Computer Communications.
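
The power budgets behind these operating points follow from multiplying each power density by the 5 cm<sup>3</sup> volume. The sketch below does that arithmetic and also derives an implied energy per clock cycle for the two sources with a stated frequency. It assumes that all harvested power reaches the processor and that power scales with frequency only. The function and variable names are illustrative.

```python
VOLUME_CM3 = 5.0  # scavenging volume from the slide

# Power densities in microwatts per cubic centimetre, as listed on the slide.
DENSITY_UW_PER_CM3 = {
    "vibrations": 200.0,
    "temperature variations": 10.0,
    "indoor solar": 4.0,
    "75 dB acoustic noise": 0.003,
}


def harvested_power_uw(density_uw_per_cm3, volume_cm3=VOLUME_CM3):
    """Available power in microwatts for a given power density and volume."""
    return density_uw_per_cm3 * volume_cm3


def energy_per_cycle_pj(power_uw, frequency_hz):
    """Implied energy per clock cycle in picojoules, if all power feeds the core."""
    return power_uw * 1e-6 / frequency_hz * 1e12


# Power budget per source, in microwatts.
budgets = {src: harvested_power_uw(d) for src, d in DENSITY_UW_PER_CM3.items()}

if __name__ == "__main__":
    for src, power in budgets.items():
        print(f"{src}: {power:g} uW")
    # Frequencies quoted on the slide for the two lowest-power sources.
    print(energy_per_cycle_pj(budgets["indoor solar"], 3.7e6), "pJ/cycle")
    print(energy_per_cycle_pj(budgets["75 dB acoustic noise"], 2.8e3), "pJ/cycle")
```

Both quoted operating points imply roughly 5 pJ per cycle, which is consistent with frequency scaling linearly with available power in this simple model.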

## IceFlex optimized for high-performance applications



## Room-temperature operation, cooling, and fabrication

| Temperature (K) | Condition                 | Island capacitance (aF), $C_{\Sigma} = e^2/(10k_BT)$ | Island diameter (nm), $C_{\Sigma} = e^2/(10k_BT)$ | Island capacitance (aF), $C_{\Sigma} = e^2/(40k_BT)$ | Island diameter (nm), $C_{\Sigma} = e^2/(40k_BT)$ |
|-----------------|---------------------------|------------------------------------------------------|---------------------------------------------------|------------------------------------------------------|---------------------------------------------------|
| 40              | CMOS operation            | 4.65                                                 | 52.48                                             | 1.16                                                 | 13.12                                             |
| 77              | Liquid nitrogen cooling   | 2.41                                                 | 27.26                                             | 0.60                                                 | 6.82                                              |
| 103             | Average cloud top temp.   | 1.80                                                 | 20.38                                             | 0.45                                                 | 5.10                                              |
| 120             | Cryogenic                 | 1.55                                                 | 17.49                                             | 0.39                                                 | 4.37                                              |
| 200             | SET device                | 0.93                                                 | 10.50                                             | 0.23                                                 | 2.62                                              |
| 250             | Stacked Peltier heat pump | 0.74                                                 | 8.40                                              | 0.19                                                 | 2.10                                              |
| 300             | Room temperature          | 0.62                                                 | 7.00                                              | 0.15                                                 | 1.75                                              |

#### Observations

- Nanometer-scale fabrication to enable room-temperature operation
- Compact cooling design at cryogenic temperature range
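
The capacitance columns of the table follow directly from the charging-energy conditions $C_{\Sigma} = e^2/(10k_BT)$ and $C_{\Sigma} = e^2/(40k_BT)$. The sketch below recomputes them from standard physical constants. The function name `max_island_capacitance_af` is an assumption for illustration, and the diameter columns are not reproduced because the island-geometry model behind them is not stated on the slide.

```python
E = 1.602176634e-19  # elementary charge (C)
K_B = 1.380649e-23   # Boltzmann constant (J/K)


def max_island_capacitance_af(temperature_k, margin):
    """Island capacitance in attofarads satisfying e^2 / C = margin * k_B * T."""
    capacitance_f = E**2 / (margin * K_B * temperature_k)
    return capacitance_f * 1e18  # farads to attofarads


if __name__ == "__main__":
    print("T (K)   C at 10 k_B T (aF)   C at 40 k_B T (aF)")
    for t in (40, 77, 103, 120, 200, 250, 300):
        c10 = max_island_capacitance_af(t, 10)
        c40 = max_island_capacitance_af(t, 40)
        print(f"{t:5d}   {c10:18.2f}   {c40:18.2f}")
```

The computed values agree with the tabulated capacitances to within about 0.01 aF; small differences in the last digit are expected if the original used rounded constants.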

## Case study: High-performance parallel applications

- Assume many-core systems can be efficiently used in the future
- Given 100 W power budget
- Supports approximately 4,500 LEON2 SPARC cores at 1 GHz
- Approximately 4.8 tera-instructions per second
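
The per-core figures implied by these numbers follow from simple division. This is a back-of-the-envelope reading of the slide, assuming the 100 W budget is shared evenly and spent entirely on the cores:

$$
P_{\text{core}} \approx \frac{100\,\mathrm{W}}{4500} \approx 22\,\mathrm{mW},
\qquad
\frac{4.8\times10^{12}\ \text{instructions/s}}{4500 \times 10^{9}\ \text{cycles/s}} \approx 1.07\ \text{instructions per cycle per core}
$$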

## Conclusions

# Investigated potential of SETs in low-power system design

Designed IceFlex, a low-power, fault-tolerant, hybrid SET/CMOS reconfigurable architecture

#### Opportunities and challenges

- Orders of magnitude power and energy efficiency improvement
- Fabrication, cooling design, and reliability challenges