Enterprise Campus 3.0 Architecture: Overview and Framework

Table Of Contents

Enterprise Campus Architecture and Design Introduction

Audience

Document Objectives

Introduction

The Enterprise Campus

Campus Architecture and Design Principles

Hierarchy

Access

Distribution

Core

Mapping the Control and Data Plane to the Physical Hierarchy

Modularity

Access-Distribution Block

Services Block

Resiliency

Flexibility

Campus Services

Non-Stop High Availability

Measuring Availability

Unified Communications Requirements

Tools and Approaches for Campus High Availability

Access and Mobility Services

Converged Wired and Wireless Campus Design

Campus Access Services

Application Optimization and Protection Services

Principles of Campus QoS Design

Network Resiliency and QoS

Virtualization Services

Campus Virtualization Mechanisms

Network Virtualization

Security Services

Infrastructure Security

Perimeter Access Control and Edge Security

Endpoint Security

Distributed Security—Defense in Depth

Operational and Management Services

Fault Management

Accounting and Performance

Configuration and Security

Evolution of the Campus Architecture


Note This document is the first part of an overall systems design guide. This document will become Chapter 1 of the overall design guide when the remaining chapters are completed.


Enterprise Campus Architecture and Design Introduction

This introductory section includes the following high-level sections to present the content coverage provided in this document:

Audience

Document Objectives

Introduction

The Enterprise Campus

Audience

This document is intended for network planners, engineers, and managers for enterprise customers who are building or intend to build a large-scale campus network and require an understanding of general design requirements.

Document Objectives

This document presents an overview of the campus network architecture and includes descriptions of various design considerations, topologies, technologies, configuration design guidelines, and other considerations relevant to the design of a highly available, full-service campus switching fabric. It is also intended to serve as a guide to direct readers to more specific campus design best practices and configuration examples for each of the specific design options.

Introduction

Over the last 50 years, businesses have achieved improving levels of productivity and competitive advantage through the use of communication and computing technology. The enterprise campus network has evolved over the last 20 years to become a key element in this business computing and communication infrastructure. The interrelated evolution of business and communications technology is not slowing, and the environment is currently undergoing another stage of that evolution. The emerging Human Network, as it has been termed by the media, illustrates a significant shift in the perception of, and the requirements and demands on, the campus network. The Human Network is collaborative, interactive, and focused on the real-time communications of the end user, whoever that user may be: a worker, a customer, a partner, anyone. The user experience on the network has become the critical determinant of success or failure of technology systems, whether in private or professional lives.

Web 2.0, collaborative applications, mash-ups, and the like all reflect a set of business and technology changes that are reshaping the requirements of our networking systems. An increased desire for mobility, the drive for heightened security, and the need to accurately identify and segment users, devices, and networks are all being driven by changes in the way businesses partner and work with other organizations. The list of requirements and challenges that the current generation of campus networks must address is highly diverse and includes the following:

Global enterprise availability.

Unified Communications, financial, medical, and other critical systems are driving the requirement for five nines (99.999 percent) availability and the improved convergence times necessary for real-time interactive applications.

Migration towards fewer centralized data repositories increases the need for network availability for all business processes.

Network change windows are shrinking or being eliminated as business operations adjust to globalization and operate 7x24x365.

Collaboration and real-time communication application use is growing.

The user experience is becoming a top priority for business communication systems.

As Unified Communications deployments increase, uptime becomes even more critical.

Continuing evolution of security threats.

Security threats continue to grow in number and complexity.

Distributed and dynamic application environments are bypassing traditional security chokepoints.

The need to adapt to change without forklift upgrades.

IT purchases face longer time-in-service and must be able to adapt to future as well as present business requirements.

Time and resources to implement new business applications are decreasing.

New network protocols and features are starting to appear (Microsoft is introducing IPv6 into the enterprise network).

Expectations and requirements for anywhere, anytime access to the network are growing.

The need for partner and guest access is increasing as business partnerships are evolving.

Increased use of portable devices (laptops and PDAs) is driving the demand for full featured and secure mobility services.

An increasing need to support multiple device types in diverse locations.

Next generation applications are driving higher capacity requirements.

Embedded rich media in documents.

Interactive high definition video.

Networks are becoming more complex.

Do-it-yourself integration can delay network deployment and increase overall costs.

Business risk mitigation requires validated system designs.

The adoption of advanced technologies (voice, segmentation, security, wireless) introduces specific requirements and changes to the base switching design and capabilities.
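To put the five nines requirement from the list above in concrete terms, the following minimal Python sketch converts an availability percentage into a yearly downtime budget. The function name and the 365-day year are assumptions made for illustration and are not part of the design guide.

```python
# Minutes in a non-leap year (assumption: leap years are ignored).
MINUTES_PER_YEAR = 365 * 24 * 60


def annual_downtime_minutes(availability_percent: float) -> float:
    """Return the maximum downtime per year, in minutes, for a target availability."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


# Compare three, four, and five nines.
for target in (99.9, 99.99, 99.999):
    print(f"{target}% -> {annual_downtime_minutes(target):.2f} minutes per year")
```

Under these assumptions, five nines leaves a little over five minutes of total downtime per year, which is why shrinking change windows and convergence times appear in the same list of requirements.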

This document is the first part of an overall systems design guide that addresses enterprise campus architectures using the latest advanced services technologies from Cisco and is based on best-practice design principles that have been tested in an enterprise systems environment. It introduces the key architectural components and services that are necessary to deploy a highly available, secure, and service-rich campus network. It also defines a reference design framework that provides the context for each of the specific design chapters—helping the network engineer understand how specific design topics fit into the overall architecture.

The Enterprise Campus

The enterprise campus is usually understood as that portion of the computing infrastructure that provides access to network communication services and resources to end users and devices spread over a single geographic location. It might span a single floor, a building, or even a large group of buildings spread over an extended geographic area. Some networks will have a single campus that also acts as the core or backbone of the network and provides interconnectivity between other portions of the overall network. The campus core can often interconnect the campus access, the data center, and WAN portions of the network. In the largest enterprises, there might be multiple campus sites distributed worldwide, with each providing both end user access and local backbone connectivity. From a technical or network engineering perspective, the concept of a campus has also been understood to mean the high-speed Layer-2 and Layer-3 Ethernet switching portions of the network outside of the data center. While all of these definitions or concepts of what a campus network is are still valid, they no longer completely describe the set of capabilities and services that comprise the campus network today.

The campus network, as defined for the purposes of the enterprise design guides, consists of the integrated elements that comprise the set of services used by a group of users and end-station devices that all share the same high-speed switching communications fabric. These include the packet-transport services (both wired and wireless), traffic identification and control (security and application optimization), traffic monitoring and management, and overall systems management and provisioning. These basic functions are implemented in such a way as to provide and directly support the higher-level services provided by the IT organization for use by the end user community. These functions include:

Non-Stop High Availability Services

Access and Mobility Services

Application Optimization and Protection Services

Virtualization Services

Security Services

Operational and Management Services

The later sections of this document provide an overview of each of these services and describe how they interoperate in a campus network. Before we look at the six services in more detail, it is useful to understand the major design criteria and design principles that shape the enterprise campus architecture. The design can be viewed from many aspects starting from the physical wiring plant, moving up through the design of the campus topology, and eventually addressing the implementation of the campus services. The order or manner in which all of these things are tied together to form a cohesive whole is determined by the use of a baseline set of design principles which, when applied correctly, provide for a solid foundation and a framework in which the upper layer services can be efficiently deployed.

Campus Architecture and Design Principles

Any successful architecture or system is based on a foundation of solid design theory and principles. Designing a campus network is no different from designing any large, complex system, such as a piece of software or even something as sophisticated as the space shuttle. The use of a guiding set of fundamental engineering principles serves to ensure that the campus design provides for the balance of availability, security, flexibility, and manageability required to meet current and future business and technological needs. The remainder of this campus design overview and related documents will leverage a common set of engineering and architectural principles: hierarchy, modularity, resiliency, and flexibility. Each of these principles is summarized in the brief sections that follow:

Hierarchy

Modularity

Resiliency

Flexibility

These are not independent principles. The successful design and implementation of an enterprise campus network requires an understanding of how each applies to the overall design and how each principle fits in the context of the others.

Hierarchy

A critical factor for the successful implementation of any campus network design is to follow good structured engineering guidelines. A structured system is based on two complementary principles: hierarchy and modularity. Any large complex system must be built using a set of modularized components that can be assembled in a hierarchical and structured manner. Dividing any task or system into components provides a number of immediate benefits. Each of the components or modules can be designed with some independence from the overall design and all modules can be operated as semi-independent elements providing for overall higher system availability—as well as for simpler management and operations. Computer programmers have leveraged this principle of hierarchy and modularity for many years. In the early days of software development, programmers built spaghetti code systems. These early programs were highly optimized and very efficient. As the programs became larger and they had to be modified or changed, software designers very quickly learned that the lack of isolation between various parts of the program or system meant that any small change could not be made without affecting the entire system. Early LAN-based computer networks were often developed following a similar approach. They all started as simple highly optimized connections between a small number of PCs, printers, and servers. As these LANs grew and became interconnected—forming the first generation of campus networks—the same challenges faced by the software developers became apparent to the network engineers. Problems in one area of the network very often impacted the entire network. Simple add and move changes in one area had to be carefully planned or they might affect other parts of the network. Similarly, a failure in one part of the campus quite often affected the entire campus network.

In the software development world, these sorts of system growth and complexity problems led to the development of structured programming design using modularized or subroutine-based systems. Each individual function or software module was written in such a way that it could be changed without having to change the entire program all at once. The design of campus networks has followed the same basic engineering approach as used by software engineers. By dividing the campus system into subsystems (or building blocks) and assembling them into a clear order, we achieve a higher degree of stability, flexibility, and manageability for the individual pieces of the campus and the campus as a whole.

In looking at how structured design rules should be applied to the campus, it is useful to look at the problem from two perspectives. First, what is the overall hierarchical structure of the campus and what features and functions should be implemented at each layer of the hierarchy? Second, what are the key modules or building blocks and how do they relate to each other and work in the overall hierarchy? Starting with the basics, the campus is traditionally defined as a three-tier hierarchical model comprising the core, distribution, and access layers as shown in Figure 1.

Figure 1 The Layers of the Campus Hierarchy

It is important to note that while the tiers do have specific roles in the design, there are no absolute rules for how a campus network is physically built. While it is true that many campus networks are constructed using three physical tiers of switches, this is not a strict requirement. In a smaller campus, the network might have two tiers of switches in which the core and distribution elements are combined in one physical switch, a collapsed distribution and core. On the other hand, a network may have four or more physical tiers of switches because the scale, wiring plant, and/or physical geography of the network might require that the core be extended. The important point is this: while the hierarchy of the network often defines the physical topology of the switches, they are not exactly the same thing. The key principle of the hierarchical design is that each element in the hierarchy has a specific set of functions and services that it offers and a specific role to play in the design.

Access

The access layer is the first tier or edge of the campus. It is the place where end devices (PCs, printers, cameras, and the like) attach to the wired portion of the campus network. It is also the place where devices that extend the network out one more level are attached; IP phones and wireless access points (APs) are the two prime examples of devices that extend connectivity out one more layer from the actual campus access switch. The wide variety of possible types of devices that can connect, and the various services and dynamic configuration mechanisms that are necessary, make the access layer one of the most feature-rich parts of the campus network. Table 1 lists examples of the types of services and capabilities that need to be defined and supported in the access layer of the network.

Table 1 Examples of Types of Service and Capabilities

| Service Requirements | Service Features |
| --- | --- |
| Discovery and Configuration Services | 802.1AF, CDP, LLDP, LLDP-MED |
| Security Services | IBNS (802.1X), CISF (port security, DHCP snooping, DAI, IPSG) |
| Network Identity and Access | 802.1X, MAB, Web-Auth |
| Application Recognition Services | QoS marking, policing, queuing, deep packet inspection (NBAR), etc. |
| Intelligent Network Control Services | PVST+, Rapid PVST+, EIGRP, OSPF, DTP, PAgP/LACP, UDLD, FlexLink, Portfast, UplinkFast, BackboneFast, LoopGuard, BPDUGuard, Port Security, RootGuard |
| Physical Infrastructure Services | Power over Ethernet |


The access layer provides the intelligent demarcation between the network infrastructure and the computing devices that leverage that infrastructure. As such it provides a security, QoS, and policy trust boundary. It is the first layer of defense in the network security architecture and the first point of negotiation between end devices and the network infrastructure. When looking at the overall campus design, the access switch provides the majority of these access-layer services and is a key element in enabling multiple campus services.

Distribution

The distribution layer in the campus design has a unique role in that it acts as a services and control boundary between the access and the core. Both access and core are essentially dedicated special purpose layers. The access layer is dedicated to meeting the functions of end-device connectivity and the core layer is dedicated to providing non-stop connectivity across the entire campus network. The distribution layer on the other hand serves multiple purposes. It is an aggregation point for all of the access switches and acts as an integral member of the access-distribution block providing connectivity and policy services for traffic flows within the access-distribution block. It is also an element in the core of the network and participates in the core routing design. Its third role is to provide the aggregation, policy control and isolation demarcation point between the campus distribution building block and the rest of the network. Going back to the software analogy, the distribution layer defines the data input and output between the subroutine (distribution block) and the mainline (core) of the program. It defines a summarization boundary for network control plane protocols (EIGRP, OSPF, Spanning Tree) and serves as the policy boundary between the devices and data flows within the access-distribution block and the rest of the network. In providing all these functions the distribution layer participates in both the access-distribution block and the core. As a result, the configuration choices for features in the distribution layer are often determined by the requirements of the access layer or the core layer, or by the need to act as an interface to both.

The function of the distribution layer is discussed in more detail in the description of the access-distribution block and the associated design sections.
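The summarization boundary described above can be illustrated with a short Python sketch that uses the standard `ipaddress` module. The subnet values and the `summarize` helper are hypothetical and chosen only for illustration; the sketch shows the arithmetic of collapsing the subnets inside one access-distribution block into a single prefix, not how a routing protocol such as EIGRP or OSPF is configured to advertise it.

```python
import ipaddress

# Hypothetical per-VLAN subnets inside one access-distribution block.
block_subnets = [
    ipaddress.ip_network("10.1.0.0/24"),
    ipaddress.ip_network("10.1.1.0/24"),
    ipaddress.ip_network("10.1.2.0/24"),
    ipaddress.ip_network("10.1.3.0/24"),
]


def summarize(subnets):
    """Collapse contiguous subnets into the fewest prefixes that cover them exactly."""
    return list(ipaddress.collapse_addresses(subnets))


# The core only needs to learn the summary, not each access subnet.
summary = summarize(block_subnets)
print(summary)
```

With a contiguous address plan like the one assumed here, the four access subnets collapse to one /22, so a change to an individual subnet inside the block does not need to be visible to the core.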

Core

The campus core is in some ways the simplest yet most critical part of the campus. It provides a very limited set of services and is designed to be highly available and operate in an always-on mode. In the modern business world, the core of the network must operate as a non-stop 7x24x365 service. The key design objectives for the campus core are based on providing the appropriate level of redundancy to allow for near immediate data-flow recovery in the event of any component (switch, supervisor, line card, or fiber) failure. The network design must also permit the occasional, but necessary, hardware and software upgrade/change to be made without disrupting any network applications. The core of the network should not implement any complex policy services, nor should it have any directly attached user/server connections. The core should also have the minimal control plane configuration combined with highly available devices configured with the correct amount of physical redundancy to provide for this non-stop service capability.

The campus core is the backbone that glues together all the elements of the campus architecture. It is that part of the network that provides for connectivity between end devices, computing, and data storage services located within the data center, and other areas and services within the network. It serves as the aggregator for all of the other campus blocks and ties together the campus with the rest of the network. One question that must be answered when developing a campus design is this: Is a distinct core layer required? In those environments where the campus is contained within a single building, or multiple adjacent buildings with the appropriate amount of fiber, it is possible to collapse the core into the two distribution switches as shown in Figure 2.

Figure 2 Collapsed Distribution and Core Campus

It is important to consider that in any campus design, even one that can physically be built with a collapsed distribution and core, the primary purpose of the core is to provide fault isolation and backbone connectivity. Isolating the distribution and core into two separate modules creates a clean delineation for change control between activities affecting end stations (laptops, phones, and printers) and those that affect the data center, WAN, or other parts of the network. A core layer also provides flexibility for adapting the campus design to meet physical cabling and geographical challenges. As an example, in a multi-building campus design like that shown in Figure 3, having a separate core layer allows for design solutions for cabling or other external constraints to be developed without compromising the design of the individual distribution blocks. If necessary, a separate core layer can use different transport technology, routing protocols, or switching hardware than the rest of the campus, providing for more flexible design options when needed.

Figure 3 Multi Building Campus

Implementing a separate core for the campus network also provides one additional specific advantage as the network grows: A separate core provides the ability to scale the size of the campus network in a structured fashion that minimizes overall complexity. It also tends to be the most cost effective solution.

As shown in Figure 4, as the size of the network grows and the number of interconnections required to tie the campus together grows, adding a core layer significantly reduces the overall design complexity. Note that in Figure 4, the bottom design is recommended, not the top.

Figure 4 Use of Campus Core Layer to Reduce Network Scaling Complexity
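The scaling argument behind Figure 4 can be approximated with simple link counting. The Python sketch below rests on two assumptions that are not taken from the guide: without a core, every distribution switch is directly connected to every other one, and with a core, every distribution switch has one uplink to each of two fully meshed core switches. The function names are hypothetical.

```python
def full_mesh_links(switches: int) -> int:
    """Links needed to connect every switch directly to every other switch."""
    return switches * (switches - 1) // 2


def core_design_links(distribution_switches: int, core_switches: int = 2) -> int:
    """Links needed when every distribution switch uplinks to every core switch.

    Assumes the core switches are also fully meshed with each other.
    """
    uplinks = distribution_switches * core_switches
    return uplinks + full_mesh_links(core_switches)


# Compare the two designs as the number of distribution switches grows.
for n in (4, 8, 16):
    print(n, full_mesh_links(n), core_design_links(n))
```

Under these assumptions, 8 distribution switches need 28 links in a full mesh but 17 with a two-switch core, and the gap widens as the campus grows because the mesh grows quadratically while the core design grows linearly.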

Having a dedicated core layer allows the campus to accommodate this growth without compromising the design of the distribution blocks, the data center, and the rest of the network. This is particularly important as the size of the campus grows either in number of distribution blocks, geographical area or complexity. In a larger, more complex campus, the core provides the capacity and scaling capability for the campus as a whole.

The question of when a separate physical core is necessary depends on multiple factors. The ability of a distinct core to allow the campus to solve physical design challenges is important. However, it should be remembered that a key purpose of having a distinct campus core is to provide scalability and to minimize the risk from (and simplify) moves, adds, and changes in the campus. In general, a network that requires routine configuration changes to the core devices does not yet have the appropriate degree of design modularization. As the network increases in size or complexity and changes begin to affect the core devices, it often points out design reasons for physically separating the core and distribution functions into different physical devices.

Mapping the Control and Data Plane to the Physical Hierarchy

Implementing hierarchy in the campus network is not just a matter of physical design. In order to achieve the desired level of fault and change isolation, the logical control plane design and the data flow design must also follow hierarchical design principles. Most importantly, mapping all three elements (physical connectivity, logical control plane, and data flows) together in the same hierarchical model is necessary to produce an optimal network implementation. From a physical perspective, the distribution layer provides the boundary between the access-distribution block and the core of the network. It provides the physical demarcation between the core infrastructure and the access-distribution blocks. It should also be the demarcation and summarization point between the core's control plane and the access-distribution block control plane. Having a summarized view of the connectivity and control plane within the access-distribution block allows the core and the remainder of the network to be managed and changed without constantly considering the specific internal details of the access-distribution block. The third aspect of the hierarchical design, how data traffic flows through the campus, is not configured in the network directly, but is a desirable property or goal of the design. As shown in Figure 5, the same link failure in three different switch configurations can result in three different traffic recovery paths, ranging from the best case, where traffic flowing upstream recovers to another upstream path, to the worst case, in which traffic must flow back down to a lower layer of the hierarchy in order to restore network connectivity.

Figure 5 Traffic Recovery in a Hierarchical Design

One of the advantages of the hierarchical design is that we can achieve a degree of specialization in each of the layers, but this specialization assumes certain network behavior. One of the assumptions or requirements that allows this specialization is that traffic is always going to flow in the same upstream or downstream hierarchical fashion (access to distribution to core). When we know that the alternative path for any traffic flow will follow the same hierarchical pattern as the original path, we can avoid making certain design decisions, such as ensuring the access layer can support extra traffic loads. Similarly, knowing that traffic always flows from the access layer through a distribution layer and then to the core, it is easier to implement consistent policy mechanisms in each layer. It reduces design complications when there is no need to consider the possibility of traffic flowing around or through a policy layer twice. Designing the hierarchy of the network to support consistent data flow behavior also has the effect of improving the network convergence time in the event of a failure. Equal-cost multi-path (ECMP) designs and other fully redundant configurations ensure these hierarchical data flows also provide for fast and deterministic convergence times compared with designs that are not fully meshed, as shown in the best case in Figure 5.
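The best-case recovery behavior of an ECMP design can be modeled in a few lines of Python. This is a conceptual sketch only: the SHA-256 hash, the uplink names, and the flow tuple are assumptions made for illustration, and real switches select paths in hardware with their own hashing schemes.

```python
import hashlib


def pick_uplink(flow: tuple, uplinks: list) -> str:
    """Map a flow to one of the currently available equal-cost uplinks."""
    if not uplinks:
        raise ValueError("no upstream path available")
    # A stable hash keeps every packet of the same flow on the same path.
    digest = hashlib.sha256(repr(flow).encode()).digest()
    index = int.from_bytes(digest[:4], "big") % len(uplinks)
    return uplinks[index]


# Hypothetical distribution switch with two equal-cost paths to the core.
uplinks = ["to-core-1", "to-core-2"]
flow = ("10.1.1.10", "10.2.2.20", 6, 49152, 443)  # src, dst, protocol, sport, dport

before = pick_uplink(flow, uplinks)

# Simulate failure of the link the flow was using: the flow is rehashed
# onto the surviving upstream path instead of being sent back down a layer.
surviving = [u for u in uplinks if u != before]
after = pick_uplink(flow, surviving)
print(before, "->", after)
```

The point of the model is that the alternate path is already known and is also an upstream path, so recovery keeps traffic inside the same hierarchical pattern.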

Modularity

The second of the two principles of structured design is modularity. The modules of the system are the building blocks that are assembled into the larger campus. The advantage of the modular approach is largely due to the isolation that it can provide. Failures that occur within a module can be isolated from the remainder of the network, providing for both simpler problem detection and higher overall system availability. Network changes, upgrades, or the introduction of new services can be made in a controlled and staged fashion, allowing greater flexibility in the maintenance and operation of the campus network. When a specific module no longer has sufficient capacity or is missing a new function or service, it can be updated or replaced by another module that has the same structural role in the overall hierarchical design. The campus network architecture is based on the use of two basic blocks or modules that are connected together via the core of the network:

Access-distribution block

Services block

The following sections introduce the underlying campus building blocks. For detailed design guidance, see the design document that addresses each specific module.

Access-Distribution Block

The access-distribution block (also referred to as the distribution block) is probably the most familiar element of the campus architecture. It is the fundamental component of a campus design. Properly designing the distribution block goes a long way to ensuring the success and stability of the overall architecture. The access-distribution block consists of two of the three hierarchical tiers within the multi-layer campus architecture: the access and distribution layers. While each of these layers has specific service and feature requirements, it is the network topology control plane design choices—such as routing and spanning tree protocols—that are central to determining how the distribution block glues together and fits within the overall architecture. There are currently three basic design choices for configuring the access-distribution block and the associated control plane:

Multi-tier

Routed access

Virtual switch

While all three of these designs use the same basic physical topology and cabling plant, there are differences in where the Layer-2 and Layer-3 boundaries exist, how the network topology redundancy is implemented, and how load balancing works, along with a number of other key differences between each of the design options. While a complete configuration description of each access-distribution block model can be found within the detailed design documents, the following provides a short description of each design option.

Multi-Tier Access-Distribution Block

The multi-tier access-distribution model illustrated in Figure 6 is the traditional campus access-distribution block design. All of the access switches are configured to run in Layer-2 forwarding mode and the distribution switches are configured to run both Layer-2 and Layer-3 forwarding. VLAN-based trunks are used to extend the subnets from the distribution switches down to the access layer. A default gateway protocol—such as HSRP or GLBP—is run on the distribution layer switches along with a routing protocol to provide upstream routing to the core of the campus. One version of spanning tree and the use of the spanning tree hardening features (such as Loopguard, Rootguard, and BPDUGuard) are configured on the access ports and switch-to-switch links as appropriate.

Figure 6 Multi-Tier Campus Access Distribution Block

The multi-tier design has two basic variations, as shown in Figure 7, that primarily differ only in the manner in which VLANs are defined. In the looped design, one or more VLANs are configured to span multiple access switches. As a result, each of these spanned VLANs has a spanning tree or Layer-2 looped topology. The other alternative, the V or loop-free design, follows the current best practice guidance for the multi-tier design and defines unique VLANs for each access switch. The removal of loops in the topology provides a number of benefits, including per-device uplink load balancing with the use of GLBP, a reduced dependence on spanning tree to provide for network recovery, reduction in the risk of broadcast storms, and the ability to avoid unicast flooding (and similar design challenges associated with non-symmetrical Layer-2 and Layer-3 forwarding topologies).

Figure 7 Two Major Variations of the Multi-Tier Distribution Block
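The difference between the looped and loop-free variations comes down to whether any VLAN is defined on more than one access switch. The following Python sketch checks a VLAN plan for that property; the switch names, VLAN numbers, and the `spanned_vlans` helper are hypothetical and used only for illustration.

```python
from collections import defaultdict


def spanned_vlans(vlans_by_switch: dict) -> dict:
    """Return the VLANs that appear on more than one access switch."""
    switches_by_vlan = defaultdict(set)
    for switch, vlans in vlans_by_switch.items():
        for vlan in vlans:
            switches_by_vlan[vlan].add(switch)
    # A VLAN present on two or more switches spans the distribution block.
    return {
        vlan: sorted(switches)
        for vlan, switches in switches_by_vlan.items()
        if len(switches) > 1
    }


# Hypothetical plans: unique VLANs per switch versus a shared data VLAN.
loop_free_plan = {"access-1": {10, 110}, "access-2": {20, 120}}
looped_plan = {"access-1": {10, 110}, "access-2": {10, 120}}

print(spanned_vlans(loop_free_plan))
print(spanned_vlans(looped_plan))
```

An empty result corresponds to the loop-free design; any VLAN reported by the check is one that spans access switches and therefore depends on spanning tree for its Layer-2 topology.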

The detailed design guidance for the multi-tier distribution block design can be found in the campus section of the CCO SRND site, http://www.cisco.com/go/srnd.

Routed Access Distribution Block

An alternative configuration to the traditional multi-tier distribution block model is one in which the access switch acts as a full Layer-3 routing node (providing both Layer-2 and Layer-3 switching) and the access-to-distribution Layer-2 uplink trunks are replaced with Layer-3 point-to-point routed links. This alternative configuration, in which the Layer-2/3 demarcation is moved from the distribution switch to the access switch, appears to be a major change to the design, but is actually simply an extension of the best practice multi-tier design. See Figure 8.

Figure 8 Routed Access Distribution Block Design

In the best practice multi-tier and routed access design, each access switch is configured with unique voice, data, and any other required VLANs. In the routed access design, the default gateway and root bridge for these VLANs are simply moved from the distribution switch to the access switch. Addressing for all end stations and for the default gateway remains the same. VLAN and specific port configuration remains unchanged on the access switch. Router interface configuration, access lists, IP helper, and any other configurations for each VLAN remain identical. However, these are now configured on the VLAN Switched Virtual Interface (SVI) defined on the access switch, instead of on the distribution switches. There are notable configuration changes associated with the move of the Layer-3 interface down to the access switch. It is no longer necessary to configure an HSRP or GLBP virtual gateway address, as the router interfaces for all the VLANs are now local. Similarly, with a single multicast router for each VLAN, it is unnecessary to tune PIM query intervals or to ensure the designated router is synchronized with the active HSRP gateway.

The routed access distribution block design has a number of advantages over the multi-tier design with its use of Layer-2 access to distribution uplinks. It offers common end-to-end troubleshooting tools (such as ping and traceroute), it uses a single control protocol (either EIGRP or OSPF), and it removes the need for features such as HSRP. While it is the appropriate design for many environments, it is not suitable for all environments, because it requires that no VLAN span multiple access switches. The detailed design guidance for the routed access distribution block design can be found in the campus section of the CCO SRND site, http://www.cisco.com/go/srnd.

Virtual Switch

The Virtual Switching System (VSS) distribution block design is a radical change from either the routed access or multi-tier designs. The introduction of the Cisco Catalyst 6500 VSS and Stackwise/Stackwise-Plus in the Cisco Catalyst 3750/3750E provides the opportunity to make a significant change to the way switch and link redundancy can be implemented. In the past, multiple access switches were connected to two redundant distribution switches and the configuration of the network control protocols (such as HSRP, 802.1D spanning tree, and EIGRP) determined the way in which the switches forwarded traffic over each of the uplinks and the network recovered in the event of a switch or link failure. With the introduction of the virtual switch concept, the distribution switch pair can now be configured to run as a single logical switch as shown in Figure 9. By converting the redundant physical distribution switches into a single logical switch, a significant change is made to the topology of the network. Rather than an access switch configured with two uplinks to two distribution switches (and needing a control protocol to determine which of the uplinks to use), the access switch now has a single multi-chassis Etherchannel (MEC) upstream link connected to a single logical distribution switch.

Figure 9 Virtual Switch Physical and Logical

The change from two independent uplinks to a single multi-chassis Etherchannel uplink has a number of advantages. See Figure 10. Load balancing of traffic and recovery from uplink failure now leverage Etherchannel capabilities. Traffic is load-balanced per flow, rather than per client or per subnet. In the event that one of the uplinks fails, the Etherchannel automatically redistributes all traffic to the remaining links in the uplink bundle rather than waiting for spanning tree, HSRP, or other protocol to converge. The ability to remove physical Layer-2 loops from the topology—and to no longer be dependent on spanning tree to provide for topology maintenance and link redundancy—results in a distribution block design that allows for subnets and VLANs to be spanned across multiple access switches (without the traditional challenges and limitations of a spanning tree-based Layer-2 design).

Figure 10 Virtual Switch vs. Spanning Tree Topology
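The per-flow load balancing and link-failure behavior described above can be illustrated with a minimal Python sketch. It is not the hashing algorithm used by any Catalyst platform: the function name `pick_link`, the link names, and the choice of a CRC32 over the source and destination addresses are assumptions made purely for illustration.

```python
import zlib

def pick_link(src_ip, dst_ip, active_links):
    """Map a flow to one member link of the bundle.

    Hypothetical hash: CRC32 of the address pair, modulo the number of
    active links. Real switches use their own hardware hash inputs.
    """
    if not active_links:
        raise ValueError("no active links in the bundle")
    key = f"{src_ip}->{dst_ip}".encode()
    return active_links[zlib.crc32(key) % len(active_links)]

# Eight illustrative flows from different clients to one server.
flows = [("10.1.1.%d" % host, "10.9.9.9") for host in range(1, 9)]

# Both uplinks of the multi-chassis bundle are up.
bundle = ["uplink-to-chassis-1", "uplink-to-chassis-2"]
before = {flow: pick_link(*flow, bundle) for flow in flows}

# One uplink fails: every flow is rehashed onto the surviving link
# without waiting for a control protocol to converge.
bundle_after_failure = ["uplink-to-chassis-2"]
after = {flow: pick_link(*flow, bundle_after_failure) for flow in flows}
```

Because the mapping depends only on the flow and on the list of active links, removing a failed link from that list is all that is required for traffic to move to the remaining members of the bundle.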

The ability to remove physical loops from the topology, and no longer be dependent on spanning tree, is one of the significant advantages of the virtual switch design. However, it is not the only difference. The virtual switch design allows for a number of fundamental changes to be made to the configuration and operation of the distribution block. By simplifying the network topology to use a single virtual distribution switch, many other aspects of the network design are either greatly simplified or, in some cases, no longer necessary. Features like HSRP or GLBP are no longer necessary because both switches act as one logical default gateway. Configuration for per-subnet or per-VLAN features such as access lists, IP helper, and others must be made only once, not replicated and kept in sync between two separate switches. Similarly, any switch configuration must be done only once and is synchronized across the redundant supervisors.


Note While the virtual switch design does remove the dependency on spanning tree for active topology maintenance, spanning tree should not be turned off. Spanning tree should remain configured as a backup resiliency mechanism.


The virtual switch is not limited to the campus distribution. A virtual switch can be used in any location in the campus design where it is desirable to replace the current control plane and hardware redundancy with the simplified topology offered by the use of a virtual switch. The virtual switch simplifies the network topology by reducing the number of devices as seen by the spanning tree or routing protocol. Where two or more nodes existed with multiple independent links connecting the topology, a virtual switch can replace portions of the network with a single logical node with fewer links. Figure 11 illustrates an extreme case in which an end-to-end, Layer-2 topology is being migrated from a fully redundant spanning tree-based topology to an end-to-end virtual switch-based network. Here, the topology is drastically simplified and all links are now actively forwarding with no spanning tree loops.

Figure 11 Use of the Virtual Switch Design in an End-to-End Layer-2 Topology

While the use of a virtual switch to simplify the campus topology can help address many design challenges, the overall design must follow the hierarchical design principles. The appropriate use of Layer-2 and Layer-3 summarization, security, and QoS boundaries all apply to a virtual switch environment. Most campus environments will gain the greatest advantages of a virtual switch in the distribution layer. For details on the design of the virtual switching distribution block see the upcoming virtual switch distribution block design, http://www.cisco.com/go/srnd.

Distribution Block Design Comparison

While each of the three access-distribution block designs provides a viable approach, there are advantages to the virtual switch and routed access designs over the traditional multi-tier approach. Simpler overall network configuration and operation, per flow upstream and downstream load balancing, and faster convergence are some of the differences between these newer design options and the traditional multi-tier approach. The selection of a specific design option for a given campus network is an important decision in the planning of a campus design. Table 2 provides an overview comparison of the three design options. Prior to making a final design decision, review detailed design descriptions provided by Cisco to ensure that all of the factors pertinent to your environment are considered.

Table 2 Comparison of Distribution Block Design Models

 
| | Multi-Tier Access | Routed Access | Virtual Switch |
|---|---|---|---|
| Access Distribution Control Plane Protocols | Spanning Tree (PVST+, Rapid-PVST+ or MST) | EIGRP or OSPF | PAgP, LACP |
| Spanning Tree | STP required for network redundancy and to prevent L2 loops | No [1] | No [2] |
| Network Recovery Mechanisms | Spanning Tree and FHRP (HSRP, GLBP, VRRP) | EIGRP or OSPF | Multi-Chassis Etherchannel (MEC) |
| VLAN spanning wiring closets | Supported (requires L2 spanning tree loops) | No | Supported |
| Layer 2/3 Demarcation | Distribution | Access | Distribution [3] |
| First Hop Redundancy Protocol | HSRP, GLBP, VRRP required | Not Required | Not Required |
| Access to Distribution Per Flow Load Balancing | No | Yes - ECMP | Yes - MEC |
| Convergence | 900 msec to 50 seconds (dependent on STP topology and FHRP tuning) | 50 to 600 msec | 50 to 600 msec [4] |
| Change Control | Dual distribution switch design requires manual configuration synchronization but allows for independent code upgrades and changes | Dual distribution switch design requires manual configuration synchronization but allows for independent code upgrades and changes | Single virtual switch auto-syncs the configuration between redundant hardware but does not currently allow independent code upgrades for individual member switches |

1 Neither the routed access nor virtual switch designs require STP to be configured to maintain the network topology. It is still recommended, and it is required to allow the use of features such as BPDU Guard on access ports.

2 Same as footnote 1.

3 With a virtual switch design, it is possible to configure a routed access layer, but this will affect the ability to span VLANs across wiring closets.

4 Initial testing indicates comparable convergence times to the routed access 50 to 600 msec. See the upcoming Virtual Switch Design Guide for final values.


Services Block

The services block is a relatively new element to the campus design. See Figure 12. As campus network planners begin to consider migration to dual stack IPv4/IPv6 environments, migrate to controller-based WLAN environments, and continue to integrate more sophisticated Unified Communications services, a number of real challenges lie ahead. It will be essential to integrate these services into the campus smoothly, while providing for the appropriate degree of operational change management and fault isolation and continuing to maintain a flexible and scalable design. As an example, IPv6 services can be deployed via an interim ISATAP overlay that allows IPv6 devices to tunnel over portions of the campus that are not yet native IPv6 enabled. Such an interim approach allows for a faster introduction of new services without requiring a network-wide, hot cutover.

Examples of functions recommended to be located in a services block include:

Centralized LWAPP wireless controllers

IPv6 ISATAP tunnel termination

Local Internet edge

Unified Communications services (Cisco Unified Communications Manager, gateways, MTP, and the like)

Policy gateways

Figure 12 Campus Services Block

The services block is not necessarily a single entity. There might be multiple services blocks depending on the scale of the network, the level of geographic redundancy required, and other operational and physical factors. The services block serves a central purpose in the campus design; it isolates or separates specific functions into dedicated services switches allowing for cleaner operational processes and configuration management.

Resiliency

While the principles of structured design and the use of modularity and hierarchy are integral to the design of campus networks, they are not sufficient to create a sustainable and scalable network infrastructure. Consider the software development analogy. In the software world, it is no longer sufficient for programs to merely generate the correct output given the correct input. In the same way, it is not enough that a campus network be seen as being complete solely because it correctly passes data from one point to another. As shown by the numerous security vulnerabilities exposed in software operating systems and programs in recent years, software designers are learning that to be correct is no longer enough. Systems must also be designed to resist failure under unusual or abnormal conditions. One of the simplest ways to break any system is to push the boundary conditions: find the edges of the system design and look for vulnerabilities. If you are trying to break a piece of software that accepts a range of input values from one to ten, you try giving it inputs of ten thousand, ten million, and so on to determine when and how it will break. If you are trying to break a network, follow a similar approach. Introduce a volume of traffic, a number of traffic flows, or another anomalous condition to find the vulnerabilities. Software engineers have become well aware of the problem and have adopted various approaches to solving it, including the use of bounds checking, assert checks, and increased modularization. Network engineers faced with a similar fundamental design challenge must also adapt network design strategies to produce a more resilient architecture.

What does it mean to create a resilient design in the context of the campus network? A basic feature of resiliency is the ability for the system to remain available for use under both normal and abnormal conditions. Normal conditions include such events as change windows and normal or expected traffic flows and traffic patterns. Abnormal conditions include hardware or software failures, extreme traffic loads, unusual traffic patterns, denial-of-service (DoS) events whether intentional or unintentional, and any other unplanned event. As illustrated in Figure 13, there are a number of approaches to providing resiliency including hardening the individual components, switches, and links in the network, adding throttle or rate limiting capabilities to software and hardware functions, providing explicit controls on the behavior of edge devices, and the use of instrumentation and management tools to provide feedback to the network operations teams.

Figure 13 Examples of Campus Resiliency Features

Resilient design is not a feature, nor is there a specific thing that you do in order to achieve it. As with hierarchy and modularity, resiliency is a basic principle that is made real through the use of many related features and design choices. The coordinated use of multiple features and the use of features to serve multiple purposes are aspects of resilient design. An example that illustrates this principle is the way in which an access port feature, such as port security, is used. Enabling port security on the access switch allows it to restrict which frames are permitted inbound from the client on an access port based on the source MAC address in the frame. When enabled, it can solve multiple problems, such as preventing certain man-in-the-middle and DoS flooding attacks, as well as mitigating Layer-2 (spanning tree) loops involving the access ports. Implementing port security provides an explicit bounds check on the number of end devices that should be attached to an end port. Every network is designed to support a specific number of devices on an edge port. By implementing an explicit rule that enforces that expected behavior, the network design achieves a higher degree of overall resiliency by preventing all of the potential problems that could happen if thousands of MAC addresses suddenly appeared on an edge port. By engineering the network both to do what you want it to do and to prevent it from doing what you do not want it to do, you decrease the likelihood of some unexpected event breaking or disrupting the network.
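The bounds-check idea behind port security can be sketched in a few lines of Python. This is a toy model, not switch behavior: the class name `AccessPort`, the limit of two addresses, and the drop-and-count violation handling are assumptions chosen for illustration.

```python
class AccessPort:
    """Toy model of an access port with a MAC address limit."""

    def __init__(self, max_macs):
        self.max_macs = max_macs   # expected number of end devices
        self.learned = set()       # source MACs accepted so far
        self.violations = 0        # frames dropped by the bounds check

    def receive(self, src_mac):
        """Return True if the frame is accepted, False if it is dropped."""
        if src_mac in self.learned:
            return True
        if len(self.learned) < self.max_macs:
            self.learned.add(src_mac)
            return True
        self.violations += 1
        return False

port = AccessPort(max_macs=2)  # for example, one phone and one PC

# Simulate thousands of distinct source MACs arriving on the edge port.
accepted = [
    port.receive(f"00:00:5e:00:{i >> 8:02x}:{i & 0xff:02x}")
    for i in range(5000)
]
```

Only the first two addresses are learned; every frame from a further address is rejected and counted, so the port never holds more state than the design intended.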

As the port security example illustrates, there are many cases where traditional security features and quality-of-service (QoS) features can and should be used not only to address security and QoS requirements, but also to improve the availability of the campus infrastructure as a whole. The principle of resiliency extends to the configuration of the control plane protocols (such as EIGRP, Rapid-PVST+, and UDLD) as well as the mechanisms used to provide switch or device level resiliency. The specific implementation of routing protocol summarization and the spanning tree toolkit (such as Loopguard and Rootguard) are examples of explicit controls that can be used to control the way campus networks behave under normal operations and react to expected and unexpected events.

Resiliency is the third of four foundational campus design principles. Just as the way in which we implement hierarchy and modularity are mutually interdependent, the way in which we achieve and implement resiliency is also tightly coupled to the overall design. Adding resiliency to the design might require the use of new features, but it is often just a matter of how we choose to implement our hierarchy and how we configure the basic Layer-2 and Layer-3 topologies.

Flexibility

In most enterprise business environments, campus networks are no longer new additions to the network. In general, campus networks have evolved through first and second generation build-out cycles and the expected lifecycle for campus networks has increased considerably, from three to five and, in some cases, seven years. At the same time, these networks have become larger and more complex, while the business environment and its underlying communication requirements continue to evolve. The result is that network designs must allow for an increasing degree of adaptability or flexibility. The ability to modify portions of the network, add new services, or increase capacity without going through a major fork-lift upgrade is a key consideration in the effectiveness of campus designs.

The structured hierarchical design inherently provides for a high degree of flexibility because it allows staged or gradual changes to each module in the network fairly independently of the others. Changes in core transport can be made independently of the distribution blocks. Changes in the design or capacity of the distribution layer can be implemented in a phased or incremental manner. Additionally, as a part of the overall hierarchical design, the introduction of the services block module into the architecture is specifically intended to address the need to implement services in a controlled fashion. This modularization of the overall design also applies to the selection of devices to fill each of the roles in the overall architecture. As the lifespan of a core, distribution, or access switch increases, it is necessary to consider how each will support and enable the continued evolution of functions required to support changing business requirements without wholesale hardware replacement.

There are a number of key areas where it is highly probable that networks will evolve over the next few years and existing designs should be adapted to incorporate the appropriate degree of flexibility into their designs to accommodate these potential changes. Key areas to consider include the following:

Control Plane Flexibility—The ability to support and allow migration between multiple routing, spanning tree, and other control protocols.

Forwarding Plane Flexibility—The ability to support the introduction and use of IPv6 as a parallel requirement alongside IPv4.

User Group Flexibility—The ability to virtualize the network forwarding capabilities and services within the campus fabric to support changes in administrative structure of the enterprise. This could involve acquisition, partnering, or outsourcing of business functions.

Traffic Management and Control Flexibility—Unified communications, collaborative business approaches, and software models continue to evolve, along with a trend toward increased growth in peer-to-peer traffic flows. These fundamental changes require campus designs that allow the deployment of the security, monitoring, and troubleshooting tools available to support these new traffic patterns.

Flexible Security Architecture—The high probability of changing traffic patterns and a continual increase in security threats as new applications and communications patterns develop will require a security architecture that can adapt to these changing conditions.

The ability to make evolutionary modifications to any campus is a practical business and operational necessity. Ensuring that the overall architecture provides for the optimal degree of flexibility possible will ensure that future business and technology requirements will be easier and more cost effective to implement.

Campus Services

The overall campus architecture is more than the fundamental hierarchical design discussed in Campus Architecture and Design Principles. While the hierarchical principles are fundamental to how to design a campus, they do not address the underlying questions about what a campus network does. What services should it provide to end users and devices? What are the expectations and parameters of those services? What functionality must be designed into each of the hierarchical layers? What must a campus network do in order to meet enterprise business and technical requirements? What a campus does or needs to provide can be categorized into six groups:

Non-Stop High Availability

Access and Mobility Services

Application Optimization and Protection Services

Virtualization Services

Security Services

Operational and Management Services

In the following sections, each of these services or service level requirements is introduced. More detailed discussions of each subject will be available in the specific campus design chapters.

Non-Stop High Availability

In many cases, the principal service requirement from the campus network is the availability of the network. The ability for devices to connect and for applications to function is dependent on the availability of the campus. Availability is not a new requirement and historically has been the primary service requirement for most campus designs. The metrics of what availability means and the requirements for how available the network must be have changed as a result of the growth in unified communications, high-definition video, and the overall increasing dependence on the network for all business processes.

Measuring Availability

Availability is traditionally measured using a number of metrics, including the percentage of time the network is available or the number of nines—such as five nines—of availability. The calculation of availability is based on a function of the mean time between failures (MTBF) of the components in the network and the mean time to repair (MTTR)—or how long it takes to recover from a failure. See Figure 14.

Figure 14 Availability Calculation

Improving availability is achieved by either increasing the MTBF (reducing the probability of something breaking) or decreasing the MTTR (reducing the time to recover from a failure) or both. In a network with a single device, this is all we need to consider: How reliable is the device? And how fast can we fix it if it breaks? In a network of more than one device, there are other factors that influence overall availability and our design choices.
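As a worked illustration of the single-device case, the sketch below uses the standard relationship availability = MTBF / (MTBF + MTTR). Figure 14 is not reproduced here, so treating this as the formula it shows is an assumption, and the MTBF and MTTR values are hypothetical.

```python
def availability(mtbf_hours, mttr_hours):
    """Fraction of time a component is available for use."""
    return mtbf_hours / (mtbf_hours + mttr_hours)

# Hypothetical switch that fails on average once every 50,000 hours.
four_hour_repair = availability(50_000, 4)    # spare on site
one_day_repair = availability(50_000, 24)     # next-day replacement

print(f"4-hour MTTR:  {four_hour_repair:.5%}")
print(f"24-hour MTTR: {one_day_repair:.5%}")
```

With the MTBF held constant, shortening the repair time is the only lever left, which is why the recovery time matters as much as the reliability of the hardware.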

A campus network is usually composed of multiple devices (switches), and the MTBF of the network as a whole is calculated from the MTBF of each device and whether or not the devices are redundant. In a network of three switches connected in series, with no redundancy, the network will break if any one of the three switches breaks. The overall network MTBF is a function of how likely it is that any one of the three will fail. In a network with redundant switches, or switches in parallel, the network will only break if both of the redundant switches fail. The calculations for the system MTBF are based on the probability that one switch in a non-redundant (serial) network breaks (Figure 15), or both switches in a redundant (parallel) design break (Figure 16).

Figure 15 MTBF Calculation with Serial Switches

Figure 16 MTBF Calculation with Parallel Switches
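The difference between the serial and parallel cases can be sketched with the standard composition rules for availability, assuming devices fail independently of one another. Figures 15 and 16 are not reproduced here, so these formulas are offered as the conventional approach rather than as the exact calculation in the figures; the per-switch availability of 0.999 is hypothetical.

```python
from math import prod

def serial(availabilities):
    """All devices must work: multiply the individual availabilities."""
    return prod(availabilities)

def parallel(availabilities):
    """Service survives unless every redundant device is down at once."""
    return 1 - prod(1 - a for a in availabilities)

switch = 0.999  # hypothetical availability of one switch

three_in_series = serial([switch] * 3)   # roughly 0.997
redundant_pair = parallel([switch] * 2)  # roughly 0.999999

print(f"three switches in series: {three_in_series:.6f}")
print(f"redundant pair:           {redundant_pair:.6f}")
```

Chaining devices in series lowers availability below that of any single device, while a redundant pair raises it well above, which is the effect the two figures contrast.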

In addition to changing the MTBF calculations, redundancy and how redundancy is used in a design also affect the MTTR for the network. See Figure 17. The time to restore service (data flows) in the network is based on the time it takes for the failed device to be replaced or for the network to recover data flows via a redundant path. The time it takes any operations team to replace a device is usually measured in hours or days rather than in minutes or seconds, and the impact on the availability of the network can be significant if the appropriate degree of device redundancy is missing from the design.

Figure 17 Impact of Network Redundancy on Overall Campus Reliability

The other commonly used metric for measuring availability is defects per million (DPM). While measuring the probability of failure of a network and establishing the service-level agreement (SLA) that a specific design is able to achieve is a useful tool, DPM takes a different approach. It measures the impact of defects on the service from the end user perspective. It is often a better metric for determining the availability of the network because it better reflects the user experience relative to event effects. DPM is calculated from the total affected user minutes for each event (the number of users affected multiplied by the duration of the event), compared to the total number of service minutes available during the period in question. You divide the sum of service downtime minutes by total service minutes and multiply by 1,000,000. See Figure 18.

Figure 18 Defects per Million Calculation
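The DPM calculation described above can be sketched as follows. The function name, the event list, and the campus size are hypothetical; the arithmetic follows the description in the text (affected user minutes divided by total service minutes, multiplied by 1,000,000).

```python
def defects_per_million(events, total_users, period_minutes):
    """events: iterable of (users_affected, outage_minutes) pairs."""
    affected_user_minutes = sum(users * minutes for users, minutes in events)
    total_user_minutes = total_users * period_minutes
    return affected_user_minutes / total_user_minutes * 1_000_000

# Hypothetical 30-day month on a 10,000-user campus.
events = [
    (500, 12),     # one closet down for 12 minutes: 6,000 user-minutes
    (10_000, 2),   # campus-wide event for 2 minutes: 20,000 user-minutes
]
dpm = defects_per_million(events, total_users=10_000,
                          period_minutes=30 * 24 * 60)

print(f"DPM for the month: {dpm:.1f}")
```

Note that the short campus-wide event contributes more to the result than the longer localized one, because DPM weights each outage by the number of users it touches.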

DPM is useful in that it is a measure of the observed availability and considers the impact to the end user as well as the network itself. Adding this user experience element to the question of campus availability is very important to understand and is becoming a more important part of the question of what makes a highly available or non-stop campus network. A five nines network, which has been considered the hallmark of excellent enterprise network design for many years, allows for up to five (5) minutes of outage or downtime per year. See Table 3.

Table 3 Availability, DPM and Downtime 

| Availability (Percent) | DPM | Downtime/Year (24x7x365) |
|---|---|---|
| 99.000 | 10,000 | 3 Days 15 Hours 36 Minutes |
| 99.500 | 5,000 | 1 Day 19 Hours 48 Minutes |
| 99.900 | 1,000 | 8 Hours 46 Minutes |
| 99.950 | 500 | 4 Hours 23 Minutes |
| 99.990 | 100 | 53 Minutes |
| 99.999 | 10 | 5 Minutes |
| 99.9999 | 1 | 0.5 Minutes |
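The values in Table 3 follow directly from the availability percentage. The short sketch below shows the arithmetic, assuming a 365-day year of continuous service; the function names are illustrative.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes of 24x7 service

def dpm_from_availability(percent):
    """Unavailable fraction expressed per million."""
    return (100 - percent) / 100 * 1_000_000

def downtime_minutes_per_year(percent):
    """Unavailable fraction expressed as minutes in one year."""
    return (100 - percent) / 100 * MINUTES_PER_YEAR

for percent in (99.0, 99.9, 99.99, 99.999):
    print(percent,
          round(dpm_from_availability(percent)),
          round(downtime_minutes_per_year(percent), 1))
```

For example, five nines leaves 0.001 percent of 525,600 minutes, or a little over five minutes of downtime per year.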


From a network operations perspective, achieving a maximum of five minutes of downtime over the year is a significant goal. However, as a single metric, it is not sufficient to characterize a network as meeting the availability requirements of the current and evolving business environments. DPM takes into consideration the measurement of the availability of the network from the user (or application) perspective and is a valuable tool to determine whether or not the network SLA is being met. Nonetheless, it is not a sufficient metric either. The third metric to be considered in the campus design is the maximum outage that any application or data stream will experience during a network failure. Network recovery time from the user (or application) perspective is the third critical design metric to consider when designing a campus network. Five minutes of outage experienced in the middle of a critical business event has a significant impact on the enterprise.

Unified Communications Requirements

Providing for high availability in a campus design requires consideration of three aspects:

What SLA can the design support (how many nines)?

Is the network meeting the SLA (DPM)?

What will the impact of any failure be on applications and user experience?

The first two are aggregated metrics of the operational integrity of a campus network and are used to determine the level of operational reliability of the network. The third consideration is a measure of business disruption: how disruptive to the business will any failure be? The choice of a metric for the third criterion has changed over time as the nature of the applications and the dependence on the network infrastructure have changed.

As enterprises migrate to VoIP and Unified Communications, what is considered acceptable availability must also be re-evaluated. The upper limit for acceptable network reconvergence (the MTTR) for a Unified Communications environment must take several key questions into account:

How fast must the network restore data flows before the loss becomes disruptive to an interactive voice or video session? When will your conversation be disrupted?

How fast must the network converge and restore data flows before someone hangs up on an active conversation due to dead air? How long will someone listen to the phone if they do not hear anything? How long will it be before the network appears broken?

How fast must the network converge to avoid call signalling failures, loss of dial tone, or resets triggered by loss of connection to the call agent (such as Cisco Unified Communications Manager, Cisco Unified SRST, or Cisco Unified Communications Manager Express)?

These metrics contain objective and subjective elements. In addition to defining when applications will fail, they also define what is disruptive to the employees and users of the network, what events will disrupt their ability to conduct business, and what events signify a failure of the network. As network-based communications become the norm for all aspects of personal and business life, the defining of metrics describing a working network is increasingly important and more restrictive.

While the metrics to evaluate subjective failure assessment are by definition subjective, they do have a basis in the common patterns of human communication. The amount of time that a person is willing to listen to dead air before deciding that the call (network) failed, causing the user to hang up, is variable, but tends to be in the 3-to-6 second range. The tolerance for data or bearer path loss in an RTP stream is much stricter. While the human ear can detect loss of sound in streaming audio down to 50 msec or less, the average interval that proves disruptive to a conversation is closer to 200 msec. The ability to fill lost phonetic information in a conversation and the threshold for what period of time constitutes a pause in speech (signalling it is someone else's turn to talk) are much longer than what the human ear can detect as lost sound. Loss of sound for periods of up to one second is recovered in normal speech patterns relatively easily, but beyond that it becomes disruptive to conversation and results in lost or failed communication. See Figure 19.

Figure 19 Comparative Measure of MTTR on Unified Communications

A campus that can restore RTP media streams in less time than it takes to disrupt an active business conversation is as much a design objective in a Unified Communications-enabled enterprise as is meeting a target of five nines of availability.


Note Voice and video are not the only applications with strict convergence requirements. Trading systems, health care, and other real-time applications might have just as strict or even more strict requirements for network recovery speed. Voice is used as a metric for the Cisco enterprise design guides because it is becoming a standard application in most enterprise networks and provides a common objective that all designs must meet as a minimum requirement.


Tools and Approaches for Campus High Availability

The approach taken in the ESE campus design guide to solving both the problem of ensuring five nines of availability and providing for the recovery times required by a Unified Communications-enabled campus is based on approaching the high-availability service problem from three perspectives:

Network resiliency

Device resiliency

Operational resiliency

This approach is based on an analysis of the major contributing factors of network downtime (as illustrated in Figure 20) and uses the principles of hierarchy, resiliency, and modularity, combined with the capabilities of the Cisco Catalyst switching family, to define a set of design recommendations.

Figure 20 Common Causes of Network Downtime

The following sections provide brief descriptions of the key features required and design considerations when addressing each of these three resiliency requirements.

Network Resiliency

Network resiliency is largely concerned with how the overall design implements topology redundancy, redundant links and devices, and how the control plane protocols (such as EIGRP, OSPF, PIM, and STP) are optimally configured to operate in that design. The use of physical redundancy is a critical part of ensuring the availability of the overall network. In the event of a component failure, having a redundant component means the overall network can continue to operate. The control plane capabilities of the campus provide the ability to manage the way in which the physical redundancy is leveraged, the network load balances traffic, the network converges, and the network is operated. The detailed recommendations for how to optimally configure the various control plane protocols are covered in the specific campus design guide, but the following basic principles can be applied in all situations:

Wherever possible, leverage the ability of the switch hardware to provide the primary detection and recovery mechanism for network failures (for example, use Multi-Chassis EtherChannel or Equal Cost Multi-Path for failure recovery). This ensures both a faster and a more deterministic failure recovery.

Implement a defense-in-depth approach to failure detection and recovery mechanisms. An example of this is configuring the UniDirectional Link Detection (UDLD) protocol, which uses a Layer-2 keep-alive to test that the switch-to-switch links are connected and operating correctly, and which acts as a backup to the native Layer-1 unidirectional link detection capabilities provided by the 802.3z and 802.3ae standards.

Ensure that the design is self-stabilizing. Use a combination of control plane modularization (such as route summarization) and software throttling (such as IP interface dampening) to ensure that any failures are isolated in their impact and that the control plane prevents any flooding or thrashing conditions from arising.

These principles are intended to be a complementary part of the overall structured modular design approach to the campus architecture and primarily serve to reinforce good resilient design practices.
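
To make the idea of software throttling more concrete, the following Python sketch models the general penalty-and-decay scheme that interface dampening mechanisms use: each flap adds a penalty, the penalty decays exponentially over time, and the interface is suppressed while the penalty is too high. This is a simplified model and not the Cisco IOS implementation. The class name and the parameter values (a 5-second half-life, a penalty of 1000 per flap, and thresholds of 2000 and 1000) are assumptions chosen for illustration, and the model omits any maximum suppress time.

```python
class InterfaceDampening:
    """Simplified exponential-decay dampening model (illustrative only)."""

    def __init__(self, half_life_s=5.0, reuse=1000.0, suppress=2000.0,
                 flap_penalty=1000.0):
        self.half_life_s = half_life_s    # time for the penalty to halve
        self.reuse = reuse                # unsuppress below this penalty
        self.suppress = suppress          # suppress at or above this penalty
        self.flap_penalty = flap_penalty  # penalty added per flap
        self.penalty = 0.0
        self.suppressed = False
        self._last_t = 0.0

    def _decay(self, now: float) -> None:
        """Apply exponential decay for the time elapsed since the last update."""
        if now < self._last_t:
            raise ValueError("time must not move backwards")
        elapsed = now - self._last_t
        self.penalty *= 0.5 ** (elapsed / self.half_life_s)
        self._last_t = now
        if self.suppressed and self.penalty < self.reuse:
            self.suppressed = False

    def flap(self, now: float) -> None:
        """Record one interface flap at time 'now' (seconds)."""
        self._decay(now)
        self.penalty += self.flap_penalty
        if self.penalty >= self.suppress:
            self.suppressed = True

    def is_suppressed(self, now: float) -> bool:
        """Report whether the interface is hidden from the routing protocols."""
        self._decay(now)
        return self.suppressed


if __name__ == "__main__":
    link = InterfaceDampening()
    for t in (0.0, 1.0, 2.0):   # three flaps in quick succession
        link.flap(t)
    for t in (2.0, 7.0, 12.0):
        print(f"t={t:>4.1f}s suppressed={link.is_suppressed(t)}")
```

In this model a single flap is tolerated, but repeated flaps within a few seconds push the penalty over the suppress threshold, and the interface is advertised again only after it has stayed stable long enough for the penalty to decay. That is the self-stabilizing behavior the principle describes: an unstable link is isolated instead of repeatedly triggering reconvergence.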

Device Resiliency

While a redundant network topology, featuring redundant links and switches, can help address many overall campus availability challenges, providing redundancy alone does not comprise a complete solution. Every campus design will have single points of failure and the overall availability of the network might be dependent on the availability of a single device. A prime example of this is the access layer. Every access switch represents a single point of failure for all of the attached devices. Ensuring the availability of the network services is often dependent on the resiliency of the individual devices.
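
The effect of a single point of failure can be illustrated with the standard series and parallel availability formulas. The Python sketch below assumes that component failures are independent, which real networks only approximate. The availability figures and function names are hypothetical values chosen for illustration, not measurements.

```python
from math import prod


def series_availability(components):
    """All components must work: availabilities multiply."""
    return prod(components)


def parallel_availability(components):
    """Any one component suffices: unavailabilities multiply."""
    return 1.0 - prod(1.0 - a for a in components)


if __name__ == "__main__":
    switch = 0.9995  # hypothetical availability of one switch

    # A redundant pair of distribution switches.
    distribution_pair = parallel_availability([switch, switch])

    # The attached device still depends on its single access switch.
    end_to_end = series_availability([switch, distribution_pair])

    print(f"redundant pair:        {distribution_pair:.7f}")
    print(f"access + distribution: {end_to_end:.7f}")
```

With these assumed figures, the redundant pair is far more available than either switch alone, yet the end-to-end result is still limited by the single access switch. This is why the resiliency of individual devices matters even in a fully redundant topology.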

Device resiliency, as with network resiliency, is achieved through a combination of the appropriate level of physical redundancy, device hardening, and supporting software features. Studies indicate that the most common failures in campus networks are Layer-1 failures in components such as power, fans, and fiber links. Diverse fiber paths with redundant links and line cards, combined with fully redundant power supplies and power circuits, are the most critical aspects of device resiliency. The use of redundant power supplies becomes even more critical in access switches with the introduction of Power over Ethernet (PoE) devices such as IP phones. Multiple devices are now dependent on the availability of the access switch and its ability to maintain the necessary level of power for all of the attached end devices. After physical failures, the most common cause of device outage is often related to the failure of supervisor hardware or software. Network outages caused by the loss or reset of a device following a supervisor failure can be addressed through the use of supervisor redundancy. Cisco Catalyst switches provide two mechanisms to achieve this additional level of redundancy:

Stateful switchover and non-stop forwarding (NSF/SSO) on the Cisco Catalyst 4500 and Cisco Catalyst 6500

Stackwise and Stackwise-Plus on the Cisco Catalyst 3750 and Cisco Catalyst 3750E

Both of these mechanisms provide for a hot active backup for the switching fabric and control plane—ensuring that both data forwarding and network control plane (featuring protocols such as EIGRP, OSPF, and STP) seamlessly recover (sub-second traffic loss) during any form of software or supervisor hardware crash.


Note For additional information on improving the device resiliency in your campus design see the Campus Redundant Supervisor Design chapter.


In addition to ensuring that each switch in the campus has the necessary level of physical hardware and software redundancy, it is also important to provide the appropriate protection for the switch's control plane. The multi-gigabit speeds of modern switching networks can overwhelm the capacity of any CPU. While most traffic in the campus network is forwarded in hardware and the CPU should only need to process control plane and other systems management traffic, the potential exists under certain failure conditions (or in the event of a malicious DoS attack) for the volume and type of traffic forwarded to overwhelm the CPU. In such events, unless the appropriate switch hardware architecture and controls are in place, the network as a whole can fail because the CPU is unable to process critical control plane (such as EIGRP and STP) and management (such as Telnet and SSH) traffic. The campus design addresses this type of problem through three approaches:

Limit the baseline control plane and CPU load on each switch through modular design, and provide control plane isolation between modules in the event that a failure does occur.

Reduce the probability of a flooding event through the reduction in the scope of the Layer-2 topology and the use of the spanning tree toolkit features to harden the spanning tree design.

Leverage the hardware CPU protection mechanisms and Control Plane Policing (CoPP) features of the Catalyst switches to limit and prioritize traffic forwarded to each switch CPU.

These three elements (physical redundancy to address Layer-1 physical failures, supervisor redundancy to provide for a non-stop forwarding (data) plane, and the hardening of the control plane through the combination of good design and hardware CPU protection capabilities) are the keys to ensuring the availability of the switches themselves and optimal uptime for the campus as a whole.
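
The idea of limiting and prioritizing CPU-bound traffic can be sketched with per-class token-bucket policers. The Python example below is a conceptual model only. It does not represent how Catalyst hardware implements CPU protection, and the class names, rates, and burst sizes are hypothetical.

```python
class TokenBucketPolicer:
    """Packet-rate policer: admit a packet only if a token is available."""

    def __init__(self, rate_pps: float, burst: int):
        self.rate_pps = rate_pps    # sustained packets per second allowed
        self.burst = burst          # maximum tokens the bucket can hold
        self.tokens = float(burst)  # start with a full bucket
        self._last_t = 0.0

    def allow(self, now: float) -> bool:
        """Return True if a packet arriving at 'now' (seconds) is admitted."""
        elapsed = max(0.0, now - self._last_t)
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate_pps)
        self._last_t = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


if __name__ == "__main__":
    # Hypothetical classes: routing traffic gets a generous policer,
    # everything else punted to the CPU gets a tight one.
    policers = {
        "routing": TokenBucketPolicer(rate_pps=1000, burst=200),
        "default": TokenBucketPolicer(rate_pps=100, burst=10),
    }

    # A flood of 5000 'default' packets and 50 routing packets in one instant.
    flood_admitted = sum(policers["default"].allow(0.0) for _ in range(5000))
    routing_admitted = sum(policers["routing"].allow(0.0) for _ in range(50))
    print(f"default admitted: {flood_admitted} of 5000")
    print(f"routing admitted: {routing_admitted} of 50")
```

Because each traffic class has its own bucket, a flood in one class exhausts only that class's tokens. The routing protocol packets are still admitted, so the control plane continues to operate during the event.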

Operational Resiliency

Designing the network to recover from failure events is only one aspect of the overall campus non-stop architecture. Business environments are continuing to move toward requiring true 7x24x365 availability.

It is becoming increasingly difficult to find a change window (a time when the network can be shut down for maintenance) given the globalization of business, the desire for always-on communications, and the movement from mainframe-based monolithic application systems to web- and Unified Communications-based systems.

The campus, which might form or be a part of the backbone of the enterprise network, must be designed to enable standard operational processes, configuration changes, and software and hardware upgrades without disrupting network services.

The ability to make changes, upgrade software, and replace or upgrade hardware in a production network is possible due to the implementation of network and device redundancy. By having dual active paths through redundant switches designed to converge in sub-second timeframes, it is possible to schedule an outage event on one element of the network and allow it to be upgraded and then brought back into service with minimal disruption to the network as a whole. The ability to upgrade individual devices without taking them out of service is similarly based on having internal component redundancy (such as with power supplies and supervisors) complemented by the system software capabilities. Two primary mechanisms exist to upgrade software in place in the campus:

Full-image In-Service Software Upgrade (ISSU) on the Cisco Catalyst 4500 leverages dual supervisors to allow for a full, in-place Cisco IOS upgrade (for example, moving from 12.2(37)SG1 to 12.2(40)SG). This leverages the NSF/SSO capabilities of the switch and provides for less than 200 msec of traffic loss during a full Cisco IOS upgrade.

Sub-system ISSU on the Cisco Catalyst 6500 leverages Cisco IOS modularity and the ability it provides to replace individual Cisco IOS components (such as routing protocols) without impacting the forwarding of traffic or other components in the system.

Having the ability to operate the campus as a non-stop system is dependent on the appropriate capabilities being designed in from the start. Network and device level redundancy, along with the necessary software control mechanisms, guarantee controlled and fast recovery of all data flows following any network failure, while concurrently providing the ability to proactively manage the non-stop infrastructure.

Access and Mobility Services

Of all the factors influencing change in the campus architecture, the growing expectation within the business community for a flexible work environment—providing anytime/anywhere network connectivity—is one of the most visible. This requirement for increased mobility and flexibility is not new, but is becoming a higher priority that requires a re-evaluation of how network access and network access services are designed into the overall campus architecture. The growth in demand for enhanced mobility—both wired and wireless—can be characterized by observing three loosely related trends:

The growth in the use of laptops and other portable devices, rather than desktop PCs, as the primary business tool.

The growth in the number of onsite partners, contractors, and other guests using the campus services. These users will most often leverage a combination of their own computing equipment (usually their corporate-provided laptop) and equipment provided by the host enterprise, such as phones and printers.

The growth in the number and type of devices connected to the campus network, such as VoIP phones, desktop video cameras, and security cameras.

The single thread that ties all of the requirements together is the need to cost-effectively move devices within the campus and have them associated with the correct network policies and services wherever they are connected. In order to achieve this level of access mobility, the campus network must ensure that the following access services are integrated into the overall campus architecture:

Ability to physically attach to the network and be associated with or negotiate the correct Layer-1 and Layer-2 network services—PoE, link speed and duplex, subnet (VLAN or SSID)

Ability to provide device identification and, where needed, perform network access authentication

Ability for the network to apply the desired QoS policies for the specific user, device or traffic flow (such as RTP streams)

Ability for the network to apply the desired security policies for the specific user or device

Ability for the network and device to determine and then register the location of the attaching device

Ability for the device to negotiate and register the correct end station parameters (such as DHCP), as well as register for any other necessary network services (such as Unified Communications presence and call agent services)
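
One way to picture these access services working together is as a lookup that ties the identity of the attaching device, and not the port it uses, to a consistent set of policies. The Python sketch below is a hypothetical illustration: the device classes, profile names, port strings, and the `attach` function are all assumptions introduced here and do not correspond to any Cisco feature or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessPolicy:
    """Services applied to a device when it attaches (illustrative fields)."""
    vlan: str
    qos_profile: str
    security_profile: str
    poe_required: bool


# Hypothetical policy table keyed by the identified device class.
POLICIES = {
    "ip-phone": AccessPolicy("voice", "trust-dscp", "phone-acl", True),
    "employee-laptop": AccessPolicy("data", "conditional-trust", "employee-acl", False),
    "guest": AccessPolicy("guest", "best-effort", "internet-only", False),
}


def attach(device_class: str, authenticated: bool, port: str):
    """Return the policy and a location record for a device attaching at 'port'."""
    # Unauthenticated or unknown devices fall back to the guest policy.
    if not authenticated or device_class not in POLICIES:
        device_class = "guest"
    policy = POLICIES[device_class]
    location = {"device_class": device_class, "port": port}
    return policy, location


if __name__ == "__main__":
    first, _ = attach("ip-phone", True, "access-sw1:port-1")
    moved, where = attach("ip-phone", True, "access-sw9:port-7")
    print(first == moved, where)
```

In this sketch the phone receives the same policy on either port, and only the recorded location changes. That is the behavior the list above asks of the campus: policies follow the device wherever it connects.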

The challenge for the campus architect is determining how to implement a design that meets this wide variety of requirements, including the need for various levels of mobility and the need for a cost-effective and flexible operations environment, while still providing the appropriate balance of security and availability expected in more traditional, fixed-configuration environments.

Converged Wired and Wireless Campus Design

One approach that is being used to address this growing need for more dynamic and flexible network access is the introduction of 802.11 wireless capabilities into the campus. While 802.11 can and does provide for easier roaming and can provide a cost-effective method to enhance network access, the implementation of wireless must be integrated into an overall campus architecture in order to provide for a consistent set of services and ease of movement for both highly mobile wireless devices and highly available wired devices. The integration of wired and wireless access methods into a common campus architecture is just the latest phase of network convergence. As illustrated in Figure 21 (moving from the bottom to the top), the enterprise network has gone through several phases of integration or convergence.

Figure 21 Evolution of the Converged Campus Networks

There are two key motivators that have been driving the network convergence process. The first is the ability for a converged network to reduce the operational costs of the overall enterprise by leveraging common systems and (more importantly) common operational support teams and processes. The second, and equally important, driver of convergence is the business advantage gained when previously isolated business processes can be more tightly integrated. The convergence of the voice, video, and data networks (as an example) has enabled the development of Unified Communications systems that are allowing businesses to more efficiently leverage all the various inter-personal communication tools. This next phase of integration, combining wired and wireless into a converged campus, is motivated by the sa