ROUTER AND SWITCH ARCHITECTURE

Download Router internals — 3. What is routing? • Packet reception. ✓ Interface FIFO (ring buffer?) holds groups of bits as they arrive. ✓ Packet qu...

0 downloads 701 Views 909KB Size
Router and switch architecture Martin Heusse

Router internals — 1

Contenu

• Router architecture • Routing table data structure

Router internals — 2

What is routing? • Packet reception ✓ Interface FIFO (ring buffer?) holds groups of bits as they arrive ✓ Packet queued until treated by central CPU or interface card CPU (throw interrupt) ✓ Check CRC, is there space in memory… ✓ Packet classification (Dropped? Accepted? Switching method?) ✓ Moved to input hold queue Interface

Classify Ring buffer (sometimes)

Int. FIFO

Int. queue

Packet routing

• Packet forwarding ✓ Look up routing table ✓ Rewrite header (Ethernet, NAT?, TTL, checksum…) ✓ Packet moved to output hold queue Router internals — 3

Input and Output queues • Input queues absorb transient forwarding subsystem saturation Configurable • Output queues holds burst of packets directed to one interface

Switching

• Generally, queues hold a given number of packets (not bytes) How would you implement a queue? Ring? Chained list? What is the storage unit (MTU size bin, packet, particle…) ✓ There can be several queues in parallel (various priorities)

Router internals — 4

Shared memory — first generation • Ex.: conventional PC, Cisco 2800, HP ProCurve 7xxx

Bus

• Everything stored in same memory space

Shared memory

CPU memory (routing table)

CPU

• Limiting factor: memory access Router internals — 5

Shared memory — first generation (cont.)

Cisco 25xx Series





CIsco 25xx (1993)

Mgmt Card 2517-2519 Daughter and Hub Cards



WIC Slots 2524, 2525 Async 2509-2512 Hub Ports 2505, 2507 2516

Dual UART M 68030

Sys Ctrl ASIC

CPU Bus

WIC

System Bus



Boot ROM NVRAM PCMCIA Flash

Ether/TR WAN Intf

601 1094_06F9_c4

© 1999, Cisco Systems, Inc.

On Board DRAM

DRAM SIMM

60 Router internals — 6

Shared memory — first generation (cont.)  

CIsco 7200



PCI Bridge

PA 3

PCI Bridge

PA 1

PCI Bridge

PCI Bus 2

PA 5

PCI Bus 1

Cisco 720x Series PCI Bridge

PA 6

PCI Bridge

PA 4

PCI Bridge

PA 2

I/O Controller Fast Ether

Midplane

PCMCIA

PCI PCI Bridge Bridge

SRAM ! NPE-100 Sys Ctrl GT 64010

Dual UART I/O Bus

NVRAM 601 1094_06F9_c4

Boot Flash

© 1999, Cisco Systems, Inc.

EEPROM

DRAM

CPU Bus

PCI Bus 0 Boot ROM

CPU NPE

R 4700 R 5000 Layer 2 Cache NPE-200 69

Router internals — 7

Shared memory — first generation (cont.)



Juniper M40



 

• Decoupling of control plane and forwarding plane —forwarding by a dedicated ASIC • 1998 — 40Gb/s • JunOS based on FreeBSD

Router internals — 8

Shared memory — first generation (cont.)

PIC: Physical Interface Card

Router internals — 9

Intelligent line cards — 2d generation • Ex.: Cisco 7500 • Line cards have some intelligence, write into each other’s memory

CPU

CPU

Bus

CPU CPU memory

CPU

• Limiting factor: 1 shared bus (needs to be N times faster than each of N interfaces) • Central processor dedicated only to control plane Distinct from Forwarding plane

Router internals — 10

Intelligent line cards — 2d generation (cont.) 601 1094_06F9_c4

71

© 1999, Cisco Systems, Inc.

 

7500 CiscoCIsco 75xx Series Dual UART

NVRAM

I/O Bus

Boot ROM

RSP

PCMCIA

Register FPGA Diag Bus FPGA

Boot Flash

Sys Ctrl ASICs

DRAM Layer 2 Cache



R 4600 R 4700 R 5000

CPU Bus MemD Ctrl ASICs

SRAM QA ASIC

Diag Bus Cy Bus 1

Cy Bus 0 IP/VIP

601 1094_06F9_c4

© 1999, Cisco Systems, Inc.

IP/VIP

Cy Bus Arbiter

IP/VIP

IP/VIP

72

Router internals — 11

Intelligent line cards — 2d generation (cont.) 

PCI Bridge 2

PCI Bridge 1

DRAM

Boot ROM SRAM PMA ASICs

DRAM Ctrl ASICs

 

R 4600 R 4700 R 5000

CPU Bus Packet Bus

PA

VIP PCI Bus 0

PA

PCI Bus 1



PCI Bus 2

Cisco 75xxProcessors Series—VIP Versatile Interface (1/interface)

CBus

I/O Ctrl ASIC

Layer 2 Cache

CYA ASICs

EEPROM

Diag Bus

Router internals — 12 601

Intelligent line cards + crossbar switch 3d generation • Ex. Cisco 7600, juniper T-series, HP ProCurve Switch 4200vl… • Crossbar switch: CPU

CPU

CPU

CPU

CPU

CPU

CPU

CPU

• Routing of N simultaneous packet (or cell) Router internals — 13

Head of line blocking

• Crossbar needs to be N times faster than each line or need one buffer / output on each input (i.e. one buffer per crosspoint) • What goes through the crossbar? ✓ ATM cells ✓ particles? (→ packet reassembly) ✓ packets Router internals — 14

Cisco router performance (packets/s)

Router 2500 2801 7200-NPE-G1 7600-dCEF720

Process switching 800 3000 79.000

Fast switching 4400 90.000 1.018.000 48.000.000 per slot

Router internals — 15

A step further

• Check Cisco CEF ( Cisco express forwarding) • Banyan switch • MPLS: packets carry an identifier of their processing

Router internals — 16

Routing table

• Static entries, routing protocols, ARP • Can be large! • Entries in use are cached (on interface cards, if applicable)  → the cache holds a small subset of know destinations

Router internals — 17

ry Trie

Longest match lookup — Routing table storage

y to represent prefixes is using a trie. A trie is a tree-based data structure allowing the • Source: Ruiz-Sanchez, M.A.; Biersack, E.W.; Dabbous, W., n a digital basis by using the bits of prefixes to direct the branching. Figure 7 shows ”Survey and taxonomy of IP address lookup algorithms,” Network, as at most two children) a setMar/Apr of prefixes of a forwarding table. IEEE , vol.15,representing no.2, pp.8-23, 2001 Prefixes a 0* b 01000* c 011* d 1* e 100* f 1100* g 1101* h 1110* i 1111*

0

1

a

d 1

0

1

0 c

0

1

0

1

0

e 0

1

0

1

f

g

h

i

0 b Router internals — 18

many exception prefixes exist. Note that with this scheme, in the worst case, the path times. In the case of thePath-compressed original Sklower’s scheme thetrie backtrack phase also needs to

e trie because non-contiguous masks are allowed. Prefixes a 0* b 01000* c 011* d 1* e 100* f 1100* g 1101* h 1110* i 1111*

1 0

3 0

b

1

a

d 1

2

0

c

1

3

e

1

0

4

4 0

1

f

g

0

h

1

i

Figure 9: A path-compressed trie • Useful for sparsely populated space. But many prefixes used in IPv4 matching prefix problem has been addressed by using data structu ently, the longest • Backtracking necessary : after reaching and finding that sed tries, like the BSD trie. Path-compression makese much sense out when theitbinary tr does not match, need to go back to d (for 101… for example) ut when the number of prefixes increases and the trie gets denser, using path compres

over, the principal disadvantage of path-compressed tries, as well as binary tries in g Router internals — 19

iginal prefix a, which now has been suppressed. Prefix d has been obtained in a similar wa prefix trie at internal nodes are expandedDisjoint or pushed down to the leaves of the trie, this technique has bee

hing by Srinivasan et al. [14]. Figure 11 shows the disjoint intervals of addresses that correspo prefix binary trie of figure 10.

xes * 1000* 11* *

Prefixes a 0* b 01000* c 011* d 1* e 100* f 1100* g 1101* h 1110* i 1111*

0

1

1

0

0

1

1

0

a! 0

1

0 c

0

e

1 a"

0 b

1

d! 0 f

1

0

1

g

h

i

1 a#

Figure 10: Disjoint-prefix binary trie

• Disjoint prefixes do not overlap

Router internals — 20

There are other techniques!

Router internals — 21

Sources

• S. Keshav; “An engineering approach to computer networking” • Cisco Router Architecture www.cisco.com/networkers/nw99_pres/601.pdf

• Ross & Kurose “Computer Networking” • …

Router internals — 22