PCE : A High-Level Overview

PCE : A High-Level Overview

Introduction

The benefits of traffic engineering are very well known and hence it is ubiquitously present in most Service Provider networks. Providers use traffic engineered paths (LSPs) for optimization of network resources, QoS guarantees, fast reroute, etc. To achieve these functions in networks with multiple IGP Areas or multiple ASes (also called domains from [RFC 4655]), the LSP must cross area boundaries. However, the LSPs are limited to a single area due to limited visibility into the topology at the head end. Routing information is not distributed between domains due to scalability, confidentiality and trust issues.

Historically, three methods are available to setup LSPs across domain boundaries.

1. LSP Stitching

An end-to-end LSP is built using several smaller LSPs (TE segments) that are setup in different domains and stitched together at the domain boundary router. A 1:1 mapping is enforced between segments in different domains. The Purple LSP in the figure is an example of Stitched LSP that is crossing three domains. This method has various limitations. A primary limitation being - if a new LSP were to be setup between A and F, new TE segments must be created in each domain, because of 1:1 mapping. Hence, the amount of state created and maintained grows proportionally. This causes scaling issues which can be solved by LSP Nesting.

2. LSP Nesting

An end-to-end LSP is tunneled inside an LSP with per-domain scope as it crosses that domain, creating a hierarchy of LSPs. The LSP that hides the details of end-to-end LSPs (also called transit LSPs) is called the Forwarding Adjacency (FA). In the figure, the Green and Orange LSPs are tunneled through an FA LSP in the middle domain. Thus, nesting uses 1:N mapping between FA LSP and transit LSPs. Nesting brings the scaling advantages as no state needs to be created or maintained for transit LSPs. Nesting also provides admission control for transit LSPs. Nesting has few limitations - the bandwidth requirements for transit LSPs might not always be satisfied. Admission control/policing becomes necessary on FA LSPs.

3. Contiguous LSP

An end-to-end LSP is setup by hop-by-hop signaling. This method resembles the setup of a TE LSP within a single domain. This is indicated by Red LSP in the figure.

Regardless of what setup method is used, the path of the LSP must be computed. The focus of this article is Path Computation.

Path Computation

The scope of path computation is limited by the visibility that the path computing entity has into the topology. The path computing entity can be a router, or a server.

Per-Domain Path Computation

A way of setting up end-to-end LSP is to perform path computation separately within each domain. The head end LSR is configured with border routers (exit points) of the domains. The head end LSR signals the LSP after computing the path to the next border router. At the entrance of the domain, the path to the next border router is computed and added to the ERO. This is called ERO Expansion. In the figure, A is configured with B and D as loose hops for LSP to F, and A computes the path to B. When B receives RSVP PATH message, it computes the path to D (next loose hop in ERO), and adds to the ERO, and so on.

Although this method of path computation works, it has few limitations-

1. Exit points of every domain must be known beforehand.

2. In above description, B computes a path to D upon receiving PATH message from A. Now, suppose the path computed by B is B-C-E-D. However, if A had chose C as exit point, the path could be A-C-E-F. So, although B chooses the best path, it is not the most optimal path for A.

3. Also, if B could not find a path to D (due to link B-D bandwidth reserved by other LSPs, and link B-C being down), the LSP would not be created; even though going through C could result in LSP creation.

Note that some limitations described here can be overcome by Crankback signaling but it is very slow and complicated.

Path Computation Element Architecture

The Path Computation Element (PCE) architecture is defined in [RFC 4655]. The key components that make up PCE architecture are:

Path Computation Element (PCE)

The PCE is a network entity (eg. an application on a server, or a network node) that is capable of computing a network path or route based on a network graph and applying computational constraints. A PCE is able to compute the path of a TE LSP by operating on the TED and considering constraints applicable to the TE LSP service request. Paths that span multiple domains could be computed by a single PCE (Centralised Computation model) or multiple PCEs (Distributed Computation model). The path computed by PCE could be an explicit path with strict hops, or a combination of strict and loose hops.

Path Computation Client (PCC)

A PCC is any client application requesting a path computation to be performed by a PCE. This is typically an LSR or it can be an NMS.

Path Computation Element Protocol (PCEP)

A PCEP is a communication protocol used between PCC and PCE, and between PCEs defined in [RFC 5440]. PCEP is a TCP-based request/response communication protocol. The request can be a path computation request from PCC/PCE, and the response can be a set of computed paths, or a negative reply (with a reason) if the request cannot be satisfied.

PCE Functional Models

A PCE may participate in IGP to acquire TE information and construct the TED. It may retrieve TED information via out-of-band synchronization when it is not possible to participate in IGP.

A Composite PCE node is an LSR that also has PCE functionality. An LSR constructs TED by exchanging TE information with other nodes in the domain using IGP. Service requests for TE LSPs (which requires path computation) are received by the node and converted into signaling requests.

In External PCE node, the PCE functionality is external to the requesting element. The PCC requests path computation to an external PCE (this could be a Server or an LSR), and the PCE returns a path taking into consideration its local policy (if any).

It is possible that the external PCE may return a partial path that does not reach the intended destination, or the path is loose. The downstream LSR contacts another PCE with a service request to establish next hop(s) in the path. This can also be solved by co-operating PCEs so that the PCE consulted by the PCC makes a request to another PCE to help with the computation.

Discovering PCEs

PCC requests are targeted to a specific PCE, and hence the PCCs must have the knowledge of the location of PCEs. This can be achieved through local configuration on PCC (- management overhead, - not flexible) or IGP protocol-based discovery mechanism (+ achieved with IGP extensions). In case multiple PCEs are known to the PCC, the PCC selects an appropriate PCE for its purposes based on the capabilities advertised by PCEs (like a set of constraints that it can account for, support for P2MP trees, computational capacity, etc.). This allows for load-sharing between PCEs.

The PCCs discover PCEs via IGP extensions to OSPF and ISIS when PCC (as LSRs) and PCEs (as LSRs or NMSs) participate in IGP. This is the best mechanism as, ideally, the PCC must be able to automatically discover a set of PCEs and their capabilities. Additionally, it is valuable for PCCs to dynamically discover new PCEs, failed PCEs, or any modification to the PCE information.

OSPF Protocol Extensions for PCE Discovery

[RFC 5088] defines extensions to OSPFv2 and OSPFv3 for PCE discovery. It leverages Router Information LSA (Opaque Type of 4 and Opaque ID of 0) defined in [RFC 4970] to advertise the PCE location and capabilities within an area or an entire OSPF domain by defining new TLV called PCE Discovery (PCED) TLV type 6 (and various sub-TLVs within the PCED TLV) to be carried within Router Information LSA. The flooding scope is defined by the Opaque LSA type carrying the TLV.

The sub-TLVs defined are:

Sub-TLV Type 1 : PCE-ADDRESS (mandatory) - Carries IPv4 or IPv6 address of the PCE. This sub-TLV identifies the location of PCE.

Sub-TLV Type 2 : PATH-SCOPE (mandatory) - Indicates PCE's ability to compute or take part in the computation of paths for intra-area, inter-area, inter-AS or inter-layer TE LSPs.

Sub-TLV Type 3 : PCE-DOMAIN (optional) - Carries OSPF Area ID or AS number. This sub-TLV specifies a PCE Domain where the PCE has visibility and through which the PCE can compute paths.

Sub-TLV Type 4 : NEIG-PCE-DOMAIN (optional) - Carries neighbor OSPF Area ID or AS number. This sub-TLV specifies neighboring PCE domains towards which the PCE can compute paths.

Sub-TLV Type 5 : PCE-CAP-FLAGS (optional) - Indicates the PCE capabilities.

ISIS Protocol Extensions for PCE Discovery

[RFC 5089] defines extensions to ISIS for PCE discovery. It leverages ISIS Router Capability TLV (TLV type 242) defined in [RFC 4971]. [RFC 5089] defines PCED sub-TLV type 5 which is carried in ISIS Router Capability TLV.

The sub-TLVs for PCED sub-TLV are similar to sub-TLVs defined in [RFC 5088] and not covered here again.

Stateful versus Stateless PCEs

A PCE can either be stateful or stateless.

Stateful PCE : A stateful PCE maintains strict synchronization between the PCE and network states (topology and resource information), along with the set of computed paths. A stateful PCE utilizes information from TED as well as information about existing paths in the network when processing new requests from the PCC. Thus, a stateful PCE allows for optimal path computation and increased path computation success. For these reasons, a stateful PCE is attractive. However, the downside is - it requires reliable state synchronization mechanisms not just on PCE but also between several PCEs, with potentially significant control plane overhead.

Stateless PCE : A stateless PCE does not remember any computed path, and each set of requests is processed independently of each other.

Computation Algorithms

Again, per-domain path computation procedure is available with PCEs but face the same issues as mentioned earlier.

Backward Recursive Path Computation

[RFC 5441] describes a procedure to compute shortest constrained inter-domain TE LSPs. This procedure is called Backward Recursive Path Computation (BRPC). It makes few assumptions, some of which are:

    • Each PCE can compute any path across a domain
    • Each PCE knows a PCE for the neighboring domains
    • Destination domain is known

A BRPC procedure relies on co-operating PCEs. A PCC sends a request to a PCE in its domain. The request is forwarded between PCEs domain-by-domain, until the PCE responsible for the domain containing the LSP destination is reached. The PCE in the destination domain builds a tree of potential paths rooted at the destination (called Virtual Shortest Path Tree, VSPT), and passes it back to the previous PCE in a reply. Each PCE adds to the VSPT and passes it its previous PCE until it reaches the PCE in the source domain which uses it to select an end-to-end path and sends to the PCC.

The problem with BRPC is that destination domain must be known, or require a mechanism to distribute this information (BGP-LS, is the proposed solution and in draft phase in IETF).

Summary

PCE is a tool that can be used to improve and ease path computation, within a single domain and across domains. It hasn't been a huge success in packet networks, but has huge footprint in optical networks. It can be a great tool in SDN environments too.

References

[RFC 4655] : A PCE-Based Architecture

[RFC 5440] : PCE Communication Protocol

[RFC 5441] : BRPC

[RFC 4970] : Extensions to OSPF for Advertising Optional Router Capabilities

[RFC 4971] : ISIS Extensions for Advertising Router Information

[RFC 5088] : OSPF Extensions for PCE Discovery

[RFC 5089] : ISIS Extensions for PCE Discovery

Adrian Farrel's Presentation on PCE

Juniper Documentation