

# L12: Reconfigurable Logic Architectures



**Acknowledgements:** 

Materials in this lecture are courtesy of the following sources and are used with permission.

**Frank Honore** 

Prof. Randy Katz (United Microelectronics Corporation Distinguished Professor in Electrical Engineering and Computer Science at the University of California, Berkeley) and Prof. Gaetano Borriello (University of Washington Department of Computer Science & Engineering). From Chapter 2 of R. Katz and G. Borriello, *Contemporary Logic Design*, 2nd ed. Prentice-Hall/Pearson Education, 2005.

L12: 6.111 Spring 2006

### History of Computational Fabrics



- Discrete devices: relays, transistors (1940s-50s)
- Discrete logic gates (1950s-60s)
- Integrated circuits (1960s-70s)
  - e.g. TTL packages: a data book listing hundreds of different parts
- Gate Arrays (IBM, 1970s)
  - Transistors are pre-placed on the chip, and place-and-route software puts the chip together automatically; only the interconnect is programmed (mask programming)
- Software-Based Schemes (1970s to present)
  - Run instructions on a general-purpose core
- Programmable Logic (1980s to present)
  - A chip that can be reprogrammed after it has been fabricated
  - Examples: PALs, EPROM, EEPROM, PLDs, FPGAs
  - Excellent support for mapping from Verilog
- ASIC Design (1980s to present)
  - Turn Verilog directly into layout using a library of standard cells
  - Effective for high-volume and efficient use of silicon area

# **Reconfigurable Logic**





- How many wires per logic block?


- Based on the fact that any combinational logic can be realized as a sum-of-products
- PALs feature an array of AND-OR gates with programmable interconnect






- Each input pin (and its complement) sent to the AND array
- OR gates for each output can take 8-16 product terms, depending on output pin
- "Macrocell" block provides additional output flexibility...
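As a rough illustration of the AND-OR structure described above, the following Verilog sketch models one PAL output as a sum of products. The module name `pal_sop`, the parameter names, and the fuse-map encoding (one "connect" bit per true input and per complemented input, for each product term) are assumptions made for this example, not the programming format of any real device.

```verilog
// Hypothetical model of one PAL output: an OR of N_TERMS product terms,
// each an AND over a programmable subset of the inputs and their complements.
module pal_sop #(
    parameter N_IN    = 4,   // number of input pins
    parameter N_TERMS = 2,   // product terms feeding the OR gate
    // Fuse maps, N_IN bits per product term (term 0 in the low bits).
    // USE_TRUE[t*N_IN+i] = 1 connects input i to term t;
    // USE_COMP[t*N_IN+i] = 1 connects the complement of input i to term t.
    parameter [N_TERMS*N_IN-1:0] USE_TRUE = 0,
    parameter [N_TERMS*N_IN-1:0] USE_COMP = 0
) (
    input  wire [N_IN-1:0] in,
    output reg             out
);
    integer t, i;
    reg term;

    always @* begin
        out = 1'b0;
        for (t = 0; t < N_TERMS; t = t + 1) begin
            // A term with no connections evaluates to 1; connecting both an
            // input and its complement forces the term to 0.
            term = 1'b1;
            for (i = 0; i < N_IN; i = i + 1) begin
                if (USE_TRUE[t*N_IN + i]) term = term &  in[i];
                if (USE_COMP[t*N_IN + i]) term = term & ~in[i];
            end
            out = out | term;   // OR plane
        end
    end
endmodule
```

For instance, with `in = {d, c, b, a}`, setting `USE_TRUE = 8'b1010_0111` and `USE_COMP = 0` would program the two product terms `a & b & c` and `b & d`.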

Image removed due to copyright restrictions.







#### **From Lattice Semiconductor**



Images courtesy of Lattice Semiconductor Corporation. Used with permission.

| S<sub>1</sub>  | S<sub>0</sub>   | Output Configuration      |
|----------------|-----------------|---------------------------|
| 0              | 0               | Registered/Active Low     |
| 0              | 1               | Registered/Active High    |
| 1              | 0               | Combinatorial/Active Low  |
| 1              | 1               | Combinatorial/Active High |

0 = Programmed EE bit

1 = Erased (charged) EE bit

#### Outputs may be registered or combinational, positive or inverted
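The configuration table above can be read as a two-bit mode select. The Verilog sketch below models it behaviorally. The module and port names (`macrocell_out`, `sop`, `pin`) are made up for illustration, S1 and S0 are treated as static configuration inputs, and the reset, preset, and output-enable logic found in real macrocells is omitted.

```verilog
// Hypothetical behavioral model of the output-select part of a PAL macrocell.
// s1 = 0: registered,   s1 = 1: combinatorial
// s0 = 0: active low,   s0 = 1: active high
module macrocell_out (
    input  wire clk,
    input  wire sop,   // sum-of-products result from the AND-OR array
    input  wire s1,    // configuration bit S1
    input  wire s0,    // configuration bit S0
    output wire pin    // value driven toward the output pin
);
    reg q;  // macrocell flip-flop

    always @(posedge clk)
        q <= sop;

    // Choose the registered or the combinatorial path, then set polarity.
    wire selected = s1 ? sop : q;
    assign pin = s0 ? selected : ~selected;
endmodule
```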



# RAM Based Field Programmable Logic - Xilinx



Courtesy of Xilinx. Used with permission.


#### Introductory Digital Systems Laboratory



### The Xilinx 4000 CLB





Simplified Block Diagram of XC4000 Series CLB (RAM and Carry Logic functions not shown)

#### Two 4-input Functions, Registered Output and a Two Input Function

Courtesy of Xilinx. Used with permission.




#### **5-input Function, Combinational Output**

Courtesy of Xilinx. Used with permission.






- An N-LUT is a direct implementation of a truth table: it can realize any function of N inputs.
- An N-LUT requires 2<sup>N</sup> storage elements (latches).
- The N inputs select one latch location (like a memory address).
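To make the truth-table view concrete, here is a minimal Verilog sketch of a 4-input LUT. The module name `lut4` and the `INIT` parameter are choices made for this example; this is not a vendor primitive. The 16 stored bits are the truth table, and the four inputs simply index into it.

```verilog
// Sketch of a 4-input look-up table: 2^4 = 16 stored bits,
// one per input combination.
module lut4 #(
    parameter [15:0] INIT = 16'h0000   // truth table; bit k is the output when addr == k
) (
    input  wire [3:0] addr,            // the four LUT inputs, used as a memory address
    output wire       f
);
    assign f = INIT[addr];             // the inputs select one storage location
endmodule
```

Under this encoding, `INIT = 16'h8000` would give a 4-input AND, `16'hFFFE` a 4-input OR, and `16'h6996` a 4-input XOR. Changing the function changes only the stored bits, never the wiring.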




### **Configuring the CLB as a RAM**




#### **Read is the same as a LUT function!**
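A rough sketch of this idea in Verilog: a 16x1 memory whose read path is exactly a LUT lookup, with a write port added. The names (`lut_ram16x1`, `we`, `din`, `dout`) and the choice of a clocked write are assumptions for illustration and do not describe the actual XC4000 write timing.

```verilog
// Hypothetical 16x1 LUT-based RAM: combinational read, clocked write.
module lut_ram16x1 (
    input  wire       clk,
    input  wire       we,     // write enable
    input  wire [3:0] addr,   // shared read/write address (the four LUT inputs)
    input  wire       din,    // data to store
    output wire       dout
);
    reg [15:0] mem;           // the same 16 storage cells a 4-LUT uses

    always @(posedge clk)
        if (we)
            mem[addr] <= din; // writing changes one truth-table entry

    assign dout = mem[addr];  // reading is identical to evaluating the LUT
endmodule
```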



### Xilinx 4000 Interconnect





Single- and Double-Length Lines, with Programmable Switch Matrices (PSMs)

Courtesy of Xilinx. Used with permission.



### **Xilinx 4000 Interconnect Details**









Wires are not ideal!

Courtesy of Xilinx. Used with permission.



### Xilinx 4000 Flexible IOB







### Add Bells & Whistles





**Courtesy of David B. Parlour, ISSCC 2004 Tutorial, "The Reality and Promise of Reconfigurable Computing in Digital Signal Processing," and Xilinx. Used with permission.**

# The Virtex II CLB (Half Slice Shown)



Used with permission.


### **Adder Implementation**







# **Carry Chain**







### **Virtex II Features**





#### **Double Data Rate registers**



#### **Embedded Multiplier**

Courtesy of Xilinx. Used with permission.



#### **Digital Clock Manager**



**Block SelectRAM** 


### **The Latest Generation: Virtex-II Pro**





#### Hardwired multipliers | High-speed I/O

Courtesy of Xilinx. Used with permission.

# **FPGA Evolution Summary [Parlour04]**



Courtesy of Xilinx. Used with permission.





- Technology mapping: translate the schematic/HDL into physical logic units
- Compile functions into basic LUT-based groups (which depend on the target architecture)



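```
// Registered sum-of-products with an asynchronous, active-low reset.
// The module name and port list are illustrative; the slide shows only the always block.
module example (
    input  wire Clock, Reset, a, b, c, d,
    output reg  q
);
    always @(posedge Clock or negedge Reset)
    begin
        if (!Reset)
            q <= 1'b0;                      // clear the flip-flop while Reset is low
        else
            q <= (a & b & c) | (b & d);     // 4-input function, registered
    end
endmodule
```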
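One way to picture the result of technology mapping for the block above: `q` depends on only four signals, so the combinational part fits in a single 4-input LUT feeding a flip-flop. The sketch below writes that out by hand. The module name `mapped_example`, the input ordering `{a, b, c, d}`, and the resulting constant `16'hE0A0` are specific to this illustration; a real mapper chooses its own pin ordering and packing.

```verilog
// Hand-mapped version of q <= (a & b & c) | (b & d):
// one 4-input LUT plus one flip-flop.
module mapped_example (
    input  wire Clock, Reset, a, b, c, d,
    output reg  q
);
    // Truth table indexed by {a, b, c, d}. The function is 1 for
    // indices 5, 7, 13, 14 and 15, which gives 16'hE0A0.
    localparam [15:0] LUT_INIT = 16'hE0A0;

    wire lut_out = LUT_INIT[{a, b, c, d}];   // LUT lookup

    always @(posedge Clock or negedge Reset)
        if (!Reset)
            q <= 1'b0;                        // asynchronous, active-low reset
        else
            q <= lut_out;                     // register the LUT output
endmodule
```

A function of five or more inputs would not fit in one such LUT and would have to be split across several, which is the grouping step that mapping performs.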

### Design Flow – Placement & Route



Placement – assign logic location on a particular device



Routing – iterative process to connect CLB inputs/outputs and IOBs. Optimizes critical path delay – can take hours or days for large, dense designs



Iterate placement if timing is not met.

Timing satisfied? → Generate the bitstream to configure the device.

Challenge! The full chip cannot be used at reasonable speeds (wires are not ideal). Typically no more than 50% utilization.



### **Example: Verilog to FPGA**






#### Prototyping

- Ensemble of gate arrays used to emulate a circuit to be manufactured
- Get more/better/faster debugging done than with simulation

#### Reconfigurable hardware

- One hardware block used to implement more than one function

#### Special-purpose computation engines

- Hardware dedicated to solving one problem (or class of problems)
- Accelerators attached to general-purpose computers (e.g., in a cell phone!)



- FPGAs provide a flexible platform for implementing digital computing
- A rich set of macros and I/Os is supported (multipliers, block RAMs, ROMs, high-speed I/O)
- A wide range of applications, from prototyping (to validate a design before ASIC mapping) to high-performance spatial computing
- Interconnect is a major bottleneck (physical design and locality are important considerations)

"College students will study concurrent programming instead of "C" as their first computing experience."

-- David B. Parlour, ISSCC 2004 Tutorial