Requirements for hitless MPLS path segment monitoring

Telecom Italia
Via Reiss Romoli, 274
Torino 10148
Italy
Email: alessandro.dalessandro@telecomitalia.it

Huawei Technologies
Email: loa@mail01.huawei.com

NTT Communications
Email: satoshi.ueno@ntt.com

NTT
Email: arai.kaoru@lab.ntt.co.jp

NTT
Email: y.koike@vcd.nttbiz.com

One of the most important OAM capabilities for transport
network operation is fault localisation. An in-service,
on-demand segment monitoring function of a transport path
is indispensable, particularly when the service monitoring
function is activated only between end points. However,
the current segment monitoring approach defined for MPLS
(including the transport profile (MPLS-TP)) in RFC 6371
"Operations, Administration, and Maintenance Framework for
MPLS-Based Transport Networks" has drawbacks.
This document provides an analysis of the existing
MPLS-TP OAM mechanisms for the path segment monitoring
and provides requirements to guide the development of new
OAM tools to support a Hitless Path Segment Monitoring (HPSM).
According to the MPLS-TP OAM requirements in RFC 5860,
mechanisms MUST be available for alerting service providers of
faults or defects that affect their services. In addition,
to ensure that faults or service degradation can be localized,
operators need a function to diagnose the detected problem.
Using end-to-end monitoring for this purpose is insufficient in that
an operator will not be able to localize a fault or service degradation accurately.
A segment monitoring function that can focus on a specific
segment of a transport path and that can provide a detailed analysis is
indispensable to promptly and accurately localize the fault.
A path segment monitoring function has been defined to
perform this task for MPLS-TP. However, as noted in the MPLS-TP OAM Framework
RFC 6371, the current method for segment
monitoring of a transport path has implications that hinder its
use in operator networks.
This document, after elaborating on the problem statement for the path
segment monitoring function as it is currently defined, provides
requirements for an on-demand segment monitoring function without
traffic disruption. Further work is required to evaluate
how the proposed requirements match the current MPLS architecture and to identify
possible solutions.
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119.
HPSM - Hitless Path Segment Monitoring
LSP - Label Switched Path
LSR - Label Switching Router
ME - Maintenance Entity
MEG - Maintenance Entity Group
MEP - Maintenance Entity Group End Point
MIP - Maintenance Entity Group Intermediate Point
OTN - Optical Transport Network
TCM - Tandem connection monitoring
SPME - Sub-path Maintenance Element
To monitor (and to protect and/or manage) MPLS-TP network
segments, a Sub-Path Maintenance Element (SPME) function has
been defined in RFC 5921. The SPME
is defined between the edges of the segment of a transport path
that needs to be monitored, protected, or managed. An SPME is
created by stacking the shim header (MPLS header) according
to RFC 3031, and it is defined as the
segment where the header is stacked. OAM messages can be
initiated at the edge of the SPME and sent to the peer edge
of the SPME or to a MIP along the SPME by setting the TTL
value of the label stack entry (LSE) and the interface identifier
value at the corresponding hierarchical LSP level in the case of
a per-node model.
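The label stacking described above can be illustrated with a minimal sketch. It assumes the standard 32-bit label stack entry layout (20-bit label, 3-bit traffic class, 1-bit bottom-of-stack flag, 8-bit TTL). The function names, the original label value 100, and the TTL values are hypothetical and chosen only for illustration; 210 is one of the SPME label values mentioned for Figure 1.

```python
import struct
from typing import Dict, List


def encode_lse(label: int, tc: int, bottom: bool, ttl: int) -> bytes:
    """Pack one 4-byte label stack entry: label(20) | TC(3) | S(1) | TTL(8)."""
    if not (0 <= label < 2 ** 20 and 0 <= tc < 8 and 0 <= ttl < 256):
        raise ValueError("field out of range")
    word = (label << 12) | (tc << 9) | (int(bottom) << 8) | ttl
    return struct.pack("!I", word)


def decode_lse(data: bytes) -> Dict[str, object]:
    """Unpack the first 4 bytes of data into the fields of a label stack entry."""
    (word,) = struct.unpack("!I", data[:4])
    return {
        "label": word >> 12,
        "tc": (word >> 9) & 0x7,
        "bottom": bool((word >> 8) & 0x1),
        "ttl": word & 0xFF,
    }


def push_spme(stack: List[bytes], spme_label: int, ttl: int) -> List[bytes]:
    """Return a new stack with an SPME entry on top of the original entries.

    The TTL of the pushed entry selects how far along the SPME an OAM
    packet travels before it expires (peer edge or an intermediate MIP).
    """
    bottom = not stack  # only an entry pushed on an empty stack is bottom-of-stack
    return [encode_lse(spme_label, 0, bottom, ttl)] + stack


# Hypothetical original LSP label 100; SPME label 210 as in the Figure 1 description.
original_stack = [encode_lse(100, 0, True, 255)]
spme_stack = push_spme(original_stack, 210, 3)
```

In this sketch the stacked entry adds four bytes to every frame and hides the original label from the LSRs inside the segment, which is the behaviour that the problem statement below is concerned with.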
MPLS-TP segment monitoring should satisfy two network objectives
according to section 3.8 of RFC 6371:
(N1) The monitoring and maintenance of current transport
paths has to be conducted in-service without traffic disruption.
(N2) Segment monitoring must not modify the forwarding of the segment
portion of the transport path.
The SPME function that has been defined in RFC 5921 has the following drawbacks:
(P1) It increases network management complexity, because a new
sublayer and new MEPs and MIPs have to be configured for the SPME.
(P2) Original conditions of the path change.
(P3) The client traffic over a transport path is disrupted if
the SPME is configured on-demand.
Problem (P1) is related to the management of each additional sub-layer
required for segment monitoring in an MPLS-TP network. When an SPME is
applied to administer on-demand OAM functions in MPLS-TP networks, a rule for
operationally differentiating those SPMEs will be required at
least within an administrative domain. This forces operators to
implement at least one additional layer in the management systems that will
only be used for on-demand path segment monitoring.
From the perspective of operation, increasing the number of managed layers and managed
addresses/identifiers is not desirable in view of keeping the
management systems as simple as possible. Moreover, using the
currently defined methods, on-demand setting of SPMEs causes problems (P2)
and (P3) due to additional label stacking.
Problem (P2) arises from the fact that the exposed MPLS label value
and the MPLS frame length change when an SPME is added. The monitoring function
should monitor the status of the target segment or transport path without
changing any of its conditions. Changing the settings of the original shim header should not
be allowed because this change corresponds to creating a new segment
of the original transport path that differs from the original
one. When the conditions of the path
change, the measured values or observed data will also change, and
this may make the monitoring meaningless because the result of the
measurement would no longer reflect the performance of the connection
where the original fault or degradation occurred.
As an example, setting up an on-demand
SPME will result in the LSRs within the monitoring segment only
looking at the added (stacked) labels and not at the labels of the
original LSP. This means that problems stemming from incorrect
(or unexpected) treatment of labels of the original LSP by the nodes
within the monitored segment cannot be identified when setting up
SPME. This might include hardware problems during label look-up,
mis-configuration, etc. Therefore operators have to pay extra
attention to correctly setting and checking the label values of the
original LSP in the configuration. Of course, the reverse of this
situation is also possible, e.g., an incorrect or unexpected
treatment of SPME labels can result in false detection of a fault
where no problem existed originally.
Figure 1 shows an example of SPME settings. In the figure, "X" is the label
value of the original path expected at the tail-end of node D.
"210" and "220" are label values allocated for SPME. The label values of the
original path are modified as well as the values of the stacked labels.
As shown in Figure 1, an SPME changes both the length of MPLS frames and the
label value(s). In particular, performance monitoring measurements (e.g. Delay
Measurement and Packet Loss Measurement) are sensitive to these changes. As an example,
increasing the packet length may affect packet loss because of MTU settings, and modifying
the label stack may introduce packet loss, or it may fix packet loss, depending on the configuration
status; in either case the network conditions are modified. Such changes also influence packet delay, even if,
from a practical point of view, it is likely that only a few services will experience a practical impact.
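The effect of the extra label on frame length can be made concrete with a small sketch. It assumes that each stacked label adds one 4-byte label stack entry and that the link MTU (hypothetically 1500 bytes here) is counted over the label stack plus payload. The payload size and the function names are illustrative assumptions, not values taken from this document.

```python
LSE_BYTES = 4  # size of one MPLS label stack entry


def frame_length(payload_bytes: int, stack_depth: int) -> int:
    """Length of the labelled frame: payload plus one entry per stacked label."""
    return payload_bytes + stack_depth * LSE_BYTES


def fits_mtu(payload_bytes: int, stack_depth: int, mtu_bytes: int) -> bool:
    """True if the labelled frame can be forwarded without exceeding the MTU."""
    return frame_length(payload_bytes, stack_depth) <= mtu_bytes


# Hypothetical values: a 1496-byte payload on a link with a 1500-byte MTU.
before_spme = fits_mtu(1496, stack_depth=1, mtu_bytes=1500)  # original LSP label only
after_spme = fits_mtu(1496, stack_depth=2, mtu_bytes=1500)   # SPME label stacked on top
```

Under these assumptions the frame fits before the SPME is configured and no longer fits afterwards, so a loss measurement taken over the SPME would report a problem that the original path did not have.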
Problem (P3) can be avoided if the operator sets SPMEs in advance and
maintains them until the end of life of a transport path. But this does not support on-demand monitoring.
Furthermore, SPMEs cannot be set arbitrarily because
overlapping of path segments is limited to nesting relationships. As a result,
possible SPME configurations of segments of an original transport path are
limited due to the characteristic of the SPME shown in Figure 1, even if SPMEs
are pre-configured.
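The nesting restriction can be sketched as a simple check on pairs of segments. The sketch assumes that a segment is described by the positions of its two edge nodes along the transport path, and that two segments sharing only an edge node count as disjoint. Both the representation and the function name are hypothetical.

```python
from typing import Tuple

Segment = Tuple[int, int]  # (index of first node, index of last node) along the path


def spme_relation(a: Segment, b: Segment) -> str:
    """Classify two segments as 'disjoint', 'nested', or 'partial-overlap'."""
    (a_start, a_end), (b_start, b_end) = a, b
    if a_start >= a_end or b_start >= b_end:
        raise ValueError("a segment must span at least one hop")
    if a_end <= b_start or b_end <= a_start:
        return "disjoint"
    if (a_start <= b_start and b_end <= a_end) or (b_start <= a_start and a_end <= b_end):
        return "nested"
    return "partial-overlap"  # cannot be expressed with hierarchical label stacking


# Path A-B-C-D-E with node positions 0..4 (hypothetical example).
nested = spme_relation((0, 4), (1, 3))      # B-D inside A-E
adjacent = spme_relation((0, 2), (2, 4))    # A-C followed by C-E
crossing = spme_relation((0, 3), (1, 4))    # A-D and B-E overlap without nesting
```

Only the first two outcomes can be realized with SPMEs; the third corresponds to a combination of monitoring segments that label stacking cannot support.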
Although the make-before-break procedure in the survivability document
RFC 6372 supports configuration
for monitoring according to the framework document RFC 5921
without traffic disruption, the configuration of an SPME is
not possible without violating network objective (N2).
These concerns are described in section 3.8 of RFC 6371.
Additionally, the make-before-break approach typically relies on a control plane and requires additional
functionalities for a management system to properly support SPME creation and traffic
switching from the original transport path to the SPME.
As an example, the old and new transport resources (e.g. LSP tunnels) might compete with each other for resources that they have in common.
Depending on the availability of resources, this competition can cause admission control to prevent
the new LSP tunnel from being established, since this bandwidth accounting deviates from traditional (non-control-plane) management
system operation.
While SPMEs can be applied in any network context (single domain, multi domain, single carrier,
multi carrier, etc.), the main applications are in inter-carrier or
inter-domain segment monitoring where they are typically
pre-configured or pre-instantiated. SPME instantiates a hierarchical
path (introducing MPLS label stacking) through which OAM
packets can be sent. The SPME monitoring function is also mainly
important for protecting bundles of transport paths and carriers'
carrier solutions within an administrative domain.
The analogy for SPME in other transport technologies is Tandem Connection Monitoring (TCM), used in Optical
Transport Networks (OTN) and Ethernet transport networks, which supports on-demand monitoring without affecting the path.
For example, in OTN, TCM allows the insertion and removal of performance monitoring overhead within the frame at intermediate points
in the network. This is done such that the insertion and removal do not change the conditions of the path.
However, because the OAM overhead is part of the frame (designated overhead bytes), TCM is constrained to a
pre-defined number of monitoring segments.
To summarize: the problem statement is that the current sub-path
maintenance based on a hierarchical LSP (SPME) is problematic for
pre-configuration in terms of increasing the number of managed objects by layer
stacking and identifiers/addresses. An on-demand
configuration of SPME is one of the possible approaches for
minimizing the impact of these issues. However, the current
procedure is unfavourable because the on-demand configuration for
monitoring changes the condition of the original monitored
path. To avoid or minimize the impact of the drawbacks
discussed above, a more efficient approach is required for the
operation of an MPLS-TP transport network. A monitoring mechanism,
named Hitless Path Segment Monitoring (HPSM), supporting
on-demand path segment monitoring without traffic disruption is needed.
In the following sections, mandatory (M) and optional (O) requirements
for the Hitless Path Segment Monitoring function are listed.
HPSM would be an additional OAM tool that would not replace SPME. As such:
(M1) HPSM MUST be compatible with the usage of SPME.
(O1) HPSM SHOULD be applicable at the SPME layer too.
(M2) HPSM MUST support both the per-node and per-interface models as specified in RFC 6371.
One of the major problems of legacy SPME highlighted in Section 3 is
that it may not monitor the original path and it could
disrupt service traffic when set up on demand.
(M3) HPSM MUST NOT change the original conditions of the transport path
(e.g. must not change the length of MPLS frames, the exposed
label values, etc.).
(M4) HPSM MUST support on-demand provisioning without traffic disruption.
There may be a need to monitor multiple segments along a transport path simultaneously.
(M5) HPSM MUST support configuration of multiple monitoring segments along a transport path.
HPSM would apply mainly for on-demand diagnostic purposes.
With the currently defined approach, the most serious problem is that there is no way to locate
the degraded segment of a path without changing the conditions of the
original path. Therefore, as a first step, a single level, single segment
monitoring, not affecting the monitored path, is required for HPSM. A combination of
multi-level and simultaneous segments monitoring is the most powerful
tool for accurately diagnosing the performance of a transport path.
However, in the field, a single level, multiple segments approach would be less complex for management and operations.
(M6) HPSM MUST support single-level segment monitoring.
(O2) HPSM MAY support multi-level segment monitoring.
Figure 3 shows an example of multi-level HPSM.
There is a need for simultaneously using existing end-to-end proactive
monitoring and on-demand path segment monitoring. Normally, the on-demand path segment monitoring is
configured on a segment of a maintenance entity of a transport path.
In such an environment, on-demand single-level monitoring should be
performed without disrupting the pro-active monitoring of the
targeted end-to-end transport path to avoid affecting user traffic
performance monitoring.
(M7) HPSM MUST support the capability of being operated concurrently with, and independently of, the OAM functions operated on the end-to-end path.
The main objective of on-demand segment monitoring is to
diagnose fault locations. A possible realistic diagnostic
procedure is to fix one end point of a segment at the MEP of the
transport path under observation and to progressively change the length
of the segment. It is therefore possible to monitor the whole path step by step
with a granularity that depends on equipment implementations.
For example, Figure 5 shows the case where the granularity is at
interface level (i.e. monitoring is at each input interface and output interface
of each piece of equipment).
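The step-by-step procedure described above can be sketched as follows. The sketch assumes a hypothetical callback, `segment_ok`, that stands in for whatever on-demand measurement is run between two monitoring points and returns whether the result is acceptable; no such interface is defined by this document. The fault between C and D in the example is simulated.

```python
from typing import Callable, List, Optional, Tuple


def localize_fault(
    points: List[str],
    segment_ok: Callable[[str, str], bool],
) -> Optional[Tuple[str, str]]:
    """Keep the near end fixed at points[0] and extend the far end one point
    at a time; the first extension that fails brackets the fault."""
    for i in range(1, len(points)):
        if not segment_ok(points[0], points[i]):
            return (points[i - 1], points[i])
    return None  # every monitored segment looked healthy


# Simulated measurement: a segment fails if it covers the faulty span C-D.
path = ["A", "B", "C", "D", "E"]
faulty_span = ("C", "D")


def simulated_probe(start: str, end: str) -> bool:
    covers_fault = (
        path.index(start) <= path.index(faulty_span[0])
        and path.index(faulty_span[1]) <= path.index(end)
    )
    return not covers_fault


located = localize_fault(path, simulated_probe)
```

The entries of `points` may be nodes or individual interfaces, so the same loop applies whatever granularity the equipment supports.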
Another possible scenario is depicted in Figure 6. In this case, the
operator wants to diagnose a transport path starting at a transit
node, because the end nodes (A and E) are located at customer sites
and consist of small boxes supporting only a subset of
OAM functions. In this case, where the source entities of the
diagnostic packets are limited to the position of MEPs, on-demand
segment monitoring will be ineffective because not all the segments
can be diagnosed (e.g. segment monitoring HPSM 3 in Figure 6 is not
available and it is not possible to determine the fault location
exactly).
(M8) It SHALL be possible to provision HPSM on an arbitrary
segment of a transport path.
Node or link failures may occur while HPSM is active. In this case,
if no resiliency mechanism is set up on the subtended transport path,
there is no particular requirement for HPSM. If the
transport path is protected, the HPSM function may end up monitoring unintended segments.
The following examples are provided for clarification.
Protection scenario A is shown in Figure 7. In this scenario a
working LSP and a protection LSP are set up. HPSM is activated
between nodes A and E. When a fault occurs between nodes B and C,
the operation of HPSM is not affected by the protection switch and
continues on the active LSP path.
Protection scenario B is shown in Figure 8. The difference from
scenario A is that only a portion of the transport path is protected.
In this case, when a fault occurs between nodes B and C on the
working sub-path B-C-D, traffic will be switched to protection
sub-path B-G-H-D. Assuming that OAM packet termination depends only on
the TTL value of the MPLS label header, the target node of the HPSM
changes from E to D due to the difference in hop counts between the
working path route (A-B-C-D-E: 4 hops) and the protection path route
(A-B-G-H-D-E: 5 hops). In this case the operation of HPSM is affected.
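The change of target node in scenario B follows directly from the hop counts, as the sketch below shows. It assumes that the OAM packet is injected at the first node of the route and that the TTL is decremented by one at every subsequent node, so the packet terminates at the node where the TTL reaches zero. The function name is hypothetical.

```python
from typing import List


def ttl_termination_node(route: List[str], ttl: int) -> str:
    """Node at which a packet injected at route[0] with the given TTL expires."""
    if not 1 <= ttl < len(route):
        raise ValueError("TTL does not expire on this route")
    return route[ttl]


working_route = ["A", "B", "C", "D", "E"]          # 4 hops
protection_route = ["A", "B", "G", "H", "D", "E"]  # 5 hops

# HPSM between A and E is configured with a TTL matching the working route.
configured_ttl = len(working_route) - 1

target_before_switch = ttl_termination_node(working_route, configured_ttl)
target_after_switch = ttl_termination_node(protection_route, configured_ttl)
```

With the TTL left at 4, the packet reaches E on the working route but expires at D once traffic follows the protection sub-path.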
(M9) The HPSM SHOULD avoid monitoring an unintended
segment when one or more failures occur.
There are potentially different solutions to satisfy such a
requirement. A possible solution may be to suspend HPSM monitoring
until network restoration takes place. Another possible approach may
be to compare the node/interface ID in the OAM packet
with that of the node reached at TTL termination and, if they do not
match, to trigger a suspension of
HPSM monitoring through some means. The above approaches are valid in any circumstance,
for both protected and unprotected networks and LSPs.
These examples should not be taken to limit the design of a solution.
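Purely as an illustration of the identifier comparison mentioned above, and not as a proposed solution, the check could look like the following sketch. The `OamProbe` structure, its field names, and the returned action strings are hypothetical; the sketch also assumes TTL-based termination with a decrement of one per node.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class OamProbe:
    target_id: str  # hypothetical field: ID of the intended far-end node or interface
    ttl: int        # hop count configured for the intended segment


def deliver(route: List[str], probe: OamProbe) -> Tuple[str, str]:
    """Return the node where the probe's TTL expires and the action taken there."""
    if not 1 <= probe.ttl < len(route):
        raise ValueError("TTL does not expire on this route")
    node = route[probe.ttl]
    # Process the probe only if it expired at the intended target;
    # otherwise signal that monitoring of this segment should be suspended.
    action = "process" if node == probe.target_id else "suspend"
    return node, action


probe = OamProbe(target_id="E", ttl=4)
on_working = deliver(["A", "B", "C", "D", "E"], probe)
on_protection = deliver(["A", "B", "G", "H", "D", "E"], probe)
```

How the suspension would be signalled back to the source of the probe is left open here, as it is in the text.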
From a management perspective, increasing the number of managed layers and managed addresses/identifiers is not desirable in view of keeping the management systems as simple as possible.
(M10) HPSM SHOULD NOT be based on additional transport layers (e.g. hierarchical LSPs).
(M11) The same identifiers used for MIPs and/or MEPs SHOULD be applied to maintenance points for the HPSM when they are
instantiated in the same place along a transport path.
However, maintenance points for the HPSM may be different from the MIP and MEP functional components as
defined in the OAM framework document RFC 6371. Investigating potential solutions for
satisfying the proposed HPSM requirements might lead to proposals for new functional components that have to be
backward compatible with the MPLS architecture. Solutions are outside the scope of this document.
A maintenance point supporting the HPSM function has to
be able to generate and inject OAM packets. OAM functions that may be applicable for on-demand HPSM
are basically the on-demand performance monitoring functions that
are defined in the OAM framework document RFC 6371. The "on-demand" attribute is
typically temporary for maintenance operation.
(M12) HPSM MUST support Packet Loss and Packet Delay measurement.
This is because these
functions are normally only supported at the end points of a
transport path. If a defect occurs, it might be quite hard to locate
the defect or degradation point without using the segment monitoring
function. If an operator cannot locate or narrow down the cause of
the fault, it is quite difficult to take prompt actions to solve the
problem.
Other on-demand monitoring functions (e.g. Delay Variation
measurement) are desirable but not as necessary as the functions
mentioned above.
(O3) HPSM MAY support Packet Delay Variation measurement,
Throughput measurement, and other performance monitoring and fault management functions.
Support of out-of-service on-demand performance management functions
(e.g. Throughput measurement) is not required for HPSM.
A new hitless path segment monitoring (HPSM) mechanism is required to
provide on-demand segment monitoring without traffic disruption. It shall meet the
two network objectives described in section 3.8 of RFC 6371
and summarized in Section 3 of this document.
The mechanism should minimize the problems described in Section 3,
i.e. (P1), (P2) and (P3).
The solution for the on-demand segment monitoring without traffic disruption needs to
cover both the per-node model and the per-interface model specified
in RFC 6371.
The on-demand segment monitoring without traffic disruption solution needs to support
on-demand Packet Loss Measurement and Packet Delay Measurement
functions and optionally other performance monitoring and fault
management functions (e.g. Throughput measurement, Packet Delay variation
measurement, Diagnostic test, etc.).
Security is a significant requirement of MPLS Transport Profile.
The document provides a problem statement and requirements to guide the development of new
OAM tools to support Hitless Path Segment Monitoring. Such new tools
must follow the security considerations provided in OAM Requirements for
MPLS-TP in RFC 5860.
There are no requests for IANA actions in this document.
Note to the RFC Editor - this section can be removed before publication.
Manuel Paul
Deutsche Telekom AG
Email: manuel.paul@telekom.de
The authors would also like to thank Alexander Vainshtein, Dave
Allan, Fei Zhang, Huub van Helvoort, Malcolm Betts, Italo Busi,
Maarten Vissers, Jia He and Nurit Sprecher for their comments and
enhancements to the text.