Network Working Group                                          S. Vallin
Internet-Draft                                              M. Bjorklund
Intended status: Standards Track                                   Cisco
Expires: November 5, 2015                                    May 4, 2015


                           YANG Alarm Module
                   draft-vallin-alarm-yang-module-00

Abstract

   This YANG module defines an alarm interface for network devices.  It
   includes functions for alarm list management and notifications to
   inform management systems.  There are also RPCs to manage the
   operator state of an alarm and administrative alarm procedures.  The
   module carefully maps to relevant alarm standards.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on November 5, 2015.

Copyright Notice

   Copyright (c) 2015 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Requirements notation
   2.  Introduction
     2.1.  Terminology
   3.  Objectives
   4.  Background and Usability Requirements
   5.  Alarm Concepts
     5.1.  What is an Alarm?
     5.2.  What is an Alarm Type?
     5.3.  How are Resources Identified?
     5.4.  How are Alarm Instances Identified?
     5.5.  What is the Life-Cycle of an Alarm?
       5.5.1.  Resource Alarm Life-Cycle
       5.5.2.  Operator Alarm Life-cycle
       5.5.3.  Administrative Alarm Life-Cycle
   6.  Alarm Data Model
     6.1.  Alarm Control
       6.1.1.  Alarm Shelving
     6.2.  Alarm Inventory
     6.3.  Alarm Summary
     6.4.  The Alarm List
     6.5.  RPCs
     6.6.  Notifications
   7.  Alarm YANG Module
   8.  X.733 Alarm Mapping Data Model
   9.  X.733 Alarm Mapping YANG Module
   10. Security Considerations
   11. Acknowledgements
   12. References
     12.1.  Normative References
     12.2.  Informative References
   Appendix A.  Enterprise-specific Alarm-Types Example
   Appendix B.  Alarm Inventory Example
   Appendix C.  Alarm List Example
   Appendix D.  Alarm Shelving Example
   Appendix E.  X.733 Mapping Example
   Authors' Addresses

1.  Requirements notation

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

2.  Introduction

   This document defines a YANG [RFC6020] data model for alarm
   management.  The purpose is to define a standardised alarm interface
   for network devices that can be easily integrated into management
   applications.

   Alarm monitoring is a fundamental part of monitoring the network.
   Raw alarms from devices do not always tell the status of the network
   services or necessarily point to the root cause.  However, being able
   to feed alarms to the network management system in a standardised
   format is a starting point for performing higher-level network
   assurance tasks.

   The telecommunication domain has standardised the alarm interface in
   ITU-T X.733 [X.733].  This continued in mobile networks within the
   3GPP organisation [ALARMIRP].  Although SNMP is the dominant
   mechanism for monitoring devices, the IETF did not standardise an
   alarm MIB early on.  Instead, management systems interpreted the
   enterprise-specific traps per MIB and device to build the alarm
   list.  When the Alarm MIB [RFC3877] was finally published, it had to
   address the existence of enterprise traps and map these into alarms.
   This requirement led to a MIB that is not easy to use.

   This document defines a standardised YANG module for alarm
   management.  The design of the module is based on experience from
   using and implementing the above-mentioned alarm standards.

2.1.  Terminology

   The following terms are used within this document:

   o  System: the system that implements this YANG alarm module, the
      "NETCONF server", the "agent".  This corresponds to a network
      device or application which implements instrumentation for the
      alarms.

   o  Management System: the alarm management application that consumes
      the alarms, the "NETCONF client", the "manager", the "NMS/OSS".

   o  Alarm: An alarm signifies an undesirable state in a resource that
      requires corrective action.

   o  Alarm Type: An alarm type identifies a unique alarm state for a
      resource.  Alarm types are names that identify the state, such as
      'linkAlarm', 'jitterViolation', 'highDiskUtilization'.

   o  Alarm Instance: the current alarm state for a specific resource
      and alarm type.  For example (GigabitEthernet0/15, linkAlarm).

   o  Resource: a fine-grained identification of the alarming resource,
      for example: an interface, a process.

3.  Objectives

   The objectives for the design of the Alarm Module are:

   o  Simple to use.  If a device supports this module, it shall be
      straightforward to integrate it into a YANG-based alarm manager.

   o  View alarms as states on resources and not as discrete
      notifications.

   o  Clear definition of "alarm" in order to exclude general events
      that should not be forwarded as alarm notifications.

   o  Clear and precise identification of alarm types and alarm
      instances.

   o  A management system should be able to pull all available alarm
      types from a device, the "alarm inventory".  This makes it
      possible to prepare alarm operators with corresponding alarm
      instructions.

   o  Address alarm usability requirements.  While the IETF has not
      really addressed alarm management, telecom standards have
      addressed it purely from a protocol perspective.  The process
      industry has published several relevant standards addressing
      requirements for a useful alarm interface: [EEMUA], [ISA182].
      This alarm module defines usability requirements as well as a YANG
      data-model.

   o  Mapping to X.733, which is a requirement for many alarm systems.
      Still, keep some of the X.733 concepts out of the core model in
      order to make the model small and easy to understand.

4.  Background and Usability Requirements

   Common alarm problems and the causes of the problems are summarised
   in Table 1.  This summary is adapted to networking based on the ISA
   [ISA182] and EEMUA [EEMUA] standards.

   +------------------+--------------------------------+---------------+
   | Problem          | Cause                          | How this      |
   |                  |                                | module        |
   |                  |                                | addresses the |
   |                  |                                | cause         |
   +------------------+--------------------------------+---------------+
   | Alarms are       | "Nuisance" alarms (chattering  | Strict        |
   | generated which  | alarms and fleeting alarms),   | definition of |
   | are ignored by   | faulty hardware, redundant     | alarms        |
   | the operator.    | alarms, cascading alarms,      | requiring     |
   |                  | incorrect alarm settings,      | corrective    |
   |                  | alarms have not been           | response.     |
   |                  | rationalised, the alarms       | Alarm         |
   |                  | represent log information      | requirements  |
   |                  | rather than true alarms.       | in Table 2.   |
   |                  |                                |               |
   | When alarms      | Insufficient alarm response    | The alarm     |
   | occur, operators | procedures and not well        | inventory     |
   | do not know how  | defined alarm types            | lists all     |
   | to respond.      |                                | alarm types   |
   |                  |                                | and           |
   |                  |                                | corrective    |
   |                  |                                | actions.      |
   |                  |                                | Alarm         |
   |                  |                                | requirements  |
   |                  |                                | in Table 2.   |
   |                  |                                |               |
   | The alarm        | Nuisance alarms, stale alarms, | The alarm     |
   | display is full  | alarms from equipment not in   | definition    |
   | of alarms, even  | service.                       | and alarm     |
   | when there is    |                                | shelving.     |
   | nothing wrong.   |                                |               |
   |                  |                                |               |
   | During a         | Incorrect prioritization of    | State-based   |
   | failure,         | alarms. Not using advanced     | alarm model,  |
   | operators are    | alarm techniques (e.g. state-  | alarm rate    |
   | flooded with so  | based alarming).               | requirements  |
   | many alarms that |                                | in Table 3    |
   | they do not know |                                | and Table 4   |
   | which ones are   |                                |               |
   | the most         |                                |               |
   | important.       |                                |               |
   +------------------+--------------------------------+---------------+

                    Table 1: Alarm Problems and Causes

   Based upon the above problems, EEMUA gives the following definition
   of a good alarm:

   +----------------+--------------------------------------------------+
   | Characteristic | Explanation                                      |
   +----------------+--------------------------------------------------+
   | Relevant       | Not spurious or of low operational value         |
   |                |                                                  |
   | Unique         | Not duplicating another alarm                    |
   |                |                                                  |
   | Timely         | Not long before any response is needed or too    |
   |                | late to do anything                              |
   |                |                                                  |
   | Prioritised    | Indicating the importance that the operator      |
   |                | deals with the problem                           |
   |                |                                                  |
   | Understandable | Having a message which is clear and easy to      |
   |                | understand                                       |
   |                |                                                  |
   | Diagnostic     | Identifying the problem that has occurred        |
   |                |                                                  |
   | Advisory       | Indicative of the action to be taken             |
   |                |                                                  |
   | Focusing       | Drawing attention to the most important issues   |
   +----------------+--------------------------------------------------+

                    Table 2: Definition of a Good Alarm

   Vendors should rationalise all alarms according to the
   characteristics above.  Another crucial requirement is acceptable
   alarm rates.
   Vendors SHOULD make sure that they do not exceed the recommendations
   from EEMUA below:

   +------------------------------------+------------------------------+
   | Long Term Alarm Rate in Steady     | Acceptability                |
   | Operation                          |                              |
   +------------------------------------+------------------------------+
   | More than one per minute           | Very likely to be            |
   |                                    | unacceptable                 |
   |                                    |                              |
   | One per 2 minutes                  | Likely to be over-demanding  |
   |                                    |                              |
   | One per 5 minutes                  | Manageable                   |
   |                                    |                              |
   | Less than one per 10 minutes       | Very likely to be acceptable |
   +------------------------------------+------------------------------+

               Table 3: Acceptable Alarm Rates, Steady State

   +----------------------------+--------------------------------------+
   | Number of alarms displayed | Acceptability                        |
   | in 10 minutes following a  |                                      |
   | major network problem      |                                      |
   +----------------------------+--------------------------------------+
   | More than 100              | Definitely excessive and very likely |
   |                            | to lead to the operator to abandon   |
   |                            | the use of the alarm system.         |
   |                            |                                      |
   | 20-100                     | Hard to cope with                    |
   |                            |                                      |
   | Under 10                   | Should be manageable - but may be    |
   |                            | difficult if several of the alarms   |
   |                            | require a complex operator response. |
   +----------------------------+--------------------------------------+

                  Table 4: Acceptable Alarm Rates, Burst

   The numbers in Table 3 and Table 4 are the sum of all alarms for a
   network being managed from one alarm console.  So every individual
   device or NMS contributes to these numbers.

   Vendors SHOULD make sure that the following rules are used in
   designing the alarm interface:

   1.  Rationalize the alarms in the system to ensure that every alarm
       is necessary, has a purpose, and follows the cardinal rule - that
       it requires an operator response.  Each alarm should also adhere
       to the rules of Table 2.

   2.  Audit the quality of the alarms.  Talk with the operators about
       how well the alarm information supports them.  Do they know what
       to do in the event of an alarm?  Are they able to quickly
       diagnose the problem and determine the corrective action?

   3.  Analyze and benchmark the performance of the system and compare
       it to the recommended metrics in Table 3 and Table 4.  Start by
       identifying nuisance alarms and alarms that are standing in
       normal state and at startup.

5.  Alarm Concepts

   This section defines the fundamental concepts behind the data model.
   This section is rooted in the works of Wallin et al. [ALARMSEM].

5.1.  What is an Alarm?

   There are two misconceptions regarding alarms and alarm interfaces
   that are important to sort out.

   The first misconception is that alarms are mixed with events in
   general.  Alarms MUST correspond to an undesirable state that needs
   corrective action.  Many implementations of alarm interfaces do not
   adhere to this principle and just send events in general.  In order
   to qualify as an alarm, there must exist a corrective action.  If
   that is not true, it is an event that can go into logs.

   The other misconception is that the term alarm refers to the
   notification itself.  Rather, an alarm is a state of a resource in
   the device or application.  The alarm notifications report state
   changes of the alarm, such as alarm raise and alarm clear.

   Based upon the above, we will use the following alarm definition:

      An alarm signifies an undesirable state in a resource that
      requires corrective action.

      "One of the most important principles of alarm management is that
      an alarm requires an action.  This means that if the operator does
      not need to respond to an alarm (because unacceptable consequences
      do not occur), then it is not an alarm.  Following this cardinal
      rule will help eliminate many potential alarm management issues."
      [ISA182]

5.2.  What is an Alarm Type?

   One of the fundamental requirements stated in the previous section is
   that every alarm must have a corresponding corrective action.
   This means that every vendor should be able to prepare a list of
   available alarms and their corrective actions.  We use the term
   'alarm type' to refer to every possible alarm that could be active in
   the system.

   Alarm types are also fundamental in order to provide a state-based
   alarm list.  The alarm list correlates alarm state changes for the
   same alarm type and the same resource into one alarm.

   Different alarm interfaces use different mechanisms to define alarm
   types, ranging from simple error numbers to more advanced mechanisms
   like the X.733 triplet of event type, probable cause and specific
   problem.

   This document defines an alarm type with an alarm type id and an
   alarm type qualifier.

   The alarm type id is modelled as a YANG identity.  With YANG
   identities, new alarm types can be defined in a distributed fashion.
   YANG identities are hierarchical, which means that a hierarchy of
   alarm types can be defined.

   The primary goal for the alarm module has been to provide a simple
   but extensible mechanism.  YANG identities are a good mechanism for
   enumerated values that are easy to extend.  Identities are also
   hierarchical so that a hierarchy of alarm types can be defined if
   needed.

   This means that every possible alarm type that can appear in a system
   exists as a well defined hierarchical identity along with a
   description.  Tools can provide a list of possible alarms by parsing
   the YANG identities rather than reading user guides.

   Standards and vendors should define their own alarm type identities
   based on this definition.

   The use of YANG identities means that all possible alarms are
   identified at "design time".  This explicit declaration of alarm
   types makes it easier to allow for alarm qualification reviews and
   preparation of alarm actions and documentation.

   There are occasions where the alarm types are not known at design
   time.  Consider a system with digital inputs where the user of the
   system connects detectors to the inputs.  A configuration action then
   declares that, for example, certain connectors are fire alarms.  The
   drawback of this is that there is a big risk that alarm operators
   will receive alarm types as a surprise; they do not know how to
   resolve the problem since a defined alarm procedure does not
   necessarily exist.  In order to allow for dynamic addition of alarm
   types, the alarm module also allows for further qualification of the
   identity-based alarm type using a string.

   A common misunderstanding is that individual alarm notifications are
   alarm types.  This is not correct; e.g., "linkUp" and "linkDown" are
   two notifications reporting different states for the same alarm type,
   "linkAlarm".

   A vendor or standard can then define their own alarm-type hierarchy.
   The example below shows a hierarchy based on X.733 event types:

     import ietf-alarms {
       prefix al;
     }
     identity vendor-alarms {
       base al:alarm-identity;
     }
     identity communicationsAlarm {
       base vendor-alarms;
     }
     identity linkAlarm {
       base communicationsAlarm;
     }

   Alarm types can be abstract.  An abstract alarm type is used as a
   base for defining hierarchical alarm types.  Concrete alarm types are
   used for alarm states and appear in the alarm inventory.  There are
   two kinds of concrete alarm types:

   1.  The last subordinate identity in the 'alarm-type-id' hierarchy is
       concrete, for example:
       "alarm-identity.environmentalAlarm.smoke".  In this example
       alarm-identity and environmentalAlarm are abstract YANG
       identities, whereas "smoke" is a concrete YANG identity.

   2.  The YANG identity hierarchy is abstract and the concrete alarm
       type is defined by the dynamic alarm-type-qualifier string, for
       example: "alarm-identity.environmentalAlarm.externalDetector"
       with alarm-type-qualifier "smoke".
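
   From the management system's side, both kinds of concrete alarm type
   reduce to the same pair: an identity path plus a qualifier string
   that may be empty.  The following Python sketch is only an
   illustration of that idea and is not part of the YANG module.  The
   identity table BASES and the helper names are hypothetical; a real
   client would derive the table from the YANG modules it has loaded.

```python
from typing import Dict, Optional, Tuple

# Hypothetical identity table: identity name -> base identity name.
# The names mirror the examples used in this section.
BASES: Dict[str, Optional[str]] = {
    "alarm-identity": None,
    "environmentalAlarm": "alarm-identity",
    "smoke": "environmentalAlarm",
    "externalDetector": "environmentalAlarm",
}


def is_derived_from(identity: str, base: str) -> bool:
    """Return True if 'identity' is 'base' or is derived from it."""
    current: Optional[str] = identity
    while current is not None:
        if current == base:
            return True
        current = BASES.get(current)
    return False


def alarm_type_path(identity: str) -> str:
    """Render the identity hierarchy in the dotted form used above."""
    parts = []
    current: Optional[str] = identity
    while current is not None:
        parts.append(current)
        current = BASES.get(current)
    return ".".join(reversed(parts))


def concrete_alarm_type(identity: str, qualifier: str = "") -> Tuple[str, str]:
    """Return the (identity path, qualifier) pair naming a concrete type."""
    if not is_derived_from(identity, "alarm-identity"):
        raise ValueError(f"{identity!r} is not an alarm type identity")
    return (alarm_type_path(identity), qualifier)


# Kind 1: the leaf identity itself is concrete, the qualifier is empty.
kind1 = concrete_alarm_type("smoke")
# Kind 2: the identity is abstract, the qualifier makes the type concrete.
kind2 = concrete_alarm_type("externalDetector", "smoke")
```

   Treating the qualifier as an empty string in the first case keeps the
   two cases uniform, so the same pair can be used as a lookup key in
   either situation.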

   For example:

     // Alternative 1: concrete alarm type identity
     import ietf-alarms {
       prefix al;
     }
     identity environmentalAlarm {
       base al:alarm-identity;
       description "Abstract alarm type";
     }
     identity smoke {
       base environmentalAlarm;
       description "Concrete alarm type";
     }

     // Alternative 2: concrete alarm type qualifier
     import ietf-alarms {
       prefix al;
     }
     identity environmentalAlarm {
       base al:alarm-identity;
       description "Abstract alarm type";
     }
     identity externalDetector {
       base environmentalAlarm;
       description
         "Abstract alarm type, a run-time configuration
          procedure sets the type of alarm detected.  This will
          be reported in the alarm-type-qualifier.";
     }

5.3.  How are Resources Identified?

   It is of vital importance to be able to refer to the alarming
   resource.  This reference must be as fine-grained as possible.  If
   the alarming resource exists in the data-tree then an
   instance-identifier is used with the full path to the object.

   This module also allows for alternate naming if the alarming resource
   is not available in the data-tree.

5.4.  How are Alarm Instances Identified?

   A primary goal of this alarm module is to remove any ambiguity in how
   alarm notifications are mapped to an update of an alarm instance.
   X.733 and especially 3GPP were not really clear on this point.  This
   YANG alarm module states that the tuple (resource, alarm type
   identifier, alarm type qualifier) corresponds to the same alarm
   instance.  This means that alarm notifications for the same resource
   and same alarm type are matched to update the same alarm instance.
   These three leafs are therefore used as the key in the alarm list:

     list alarm {
       key "resource alarm-type-id alarm-type-qualifier";
       ...
     }

5.5.  What is the Life-Cycle of an Alarm?

   The alarm model clearly separates the resource alarm life-cycle from
   the operator and administrative life-cycles of an alarm.

   o  resource alarm life-cycle: the alarm instrumentation that controls
      alarm raise, clearance, and severity changes.

   o  operator alarm life-cycle: operators acting upon alarms with
      actions like acknowledgement and closing.  Closing an alarm
      implies that the operator considers the corrective action
      performed.

   o  administrative alarm life-cycle: deleting alarms, compressing
      alarm history.

5.5.1.  Resource Alarm Life-Cycle

   From a resource perspective an alarm can have the following
   life-cycle: raise, change severity, change severity, clear, being
   raised again etc.  Two important things to note:

   1.  Alarms are not deleted when they are cleared.  Deleting alarms is
       an administrative process.  The alarm module defines an rpc
       "purge-alarms" that deletes alarms.

   2.  Alarms are not cleared by operators; only the underlying
       instrumentation can clear an alarm.  Operators can close alarms.

   The YANG tree representation below illustrates the resource oriented
   life-cycle:

     +--ro alarm* [resource alarm-type-id alarm-type-qualifier]
        ...
        +--ro is-cleared               boolean
        +--ro last-status-change       yang:date-and-time
        +--ro last-perceived-severity  severity
        +--ro last-alarm-text          alarm-text
        +--ro status-change* [event-time]
           +--ro event-time          yang:date-and-time
           +--ro perceived-severity  severity
           +--ro alarm-text          alarm-text

   For every state change from the resource perspective a row is added
   to the 'status-change' list.  The last status values are also
   represented as leafs for the alarm.  Note well that the alarm
   severity does not include 'cleared'; alarm clearance is a flag.

   An alarm can therefore look like this: ((GigabitEthernet0/25,
   linkAlarm,""), false, T, major, "Interface GigabitEthernet0/25 down")

5.5.2.  Operator Alarm Life-cycle

   Operators can also act upon alarms using the set-operator-state rpc:

     +--ro alarm* [resource alarm-type-id alarm-type-qualifier]
        ...
        +--ro last-operator-state    operator-state {operator-actions}?
        +--ro last-operator?         string {operator-actions}?
        +--ro last-operator-text?    alarm-text {operator-actions}?
        +--ro last-operator-action?  yang:date-and-time
        |                            {operator-actions}?
        +--ro operator-action* [time] {operator-actions}?
           +--ro time      yang:date-and-time
           +--ro state     operator-state
           +--ro operator  string
           +--ro text?     string

   The operator state for an alarm can be: 'none', 'ack', 'closed'.
   Alarm deletion, the 'purge-alarms' rpc, can use this state as a
   criterion.

   A closed alarm is an alarm where the operator has performed any
   required corrective actions.  Closed alarms are good candidates for
   being deleted.

5.5.3.  Administrative Alarm Life-Cycle

   Deleting alarms from the alarm list is considered an administrative
   action.  This is supported by the 'purge-alarms' rpc.  The rpc takes
   a filter as input.  The filter can select alarms based on the
   operator and resource life-cycle, such as "all closed cleared alarms
   older than a time specification".

   Alarms can also be compressed.  This deletes the status-change list
   except for the last status change.

6.  Alarm Data Model

   Alarm shelving and operator actions are YANG features so that a
   device can select not to support these.

   The data-model has the following overall structure:

     +--rw alarms
        +--rw control
        |  +--rw max-alarm-history?      uint16
        |  +--rw notify-status-changes?  boolean
        |  +--rw shelved-alarms {alarm-shelving}?
        |     +--rw shelved-alarm* [shelf-name]
        |        +--rw shelf-name             string
        |        +--rw resource?              resource
        |        +--rw alarm-type-id?         alarm-type-id
        |        +--rw alarm-type-qualifier?  alarm-type-qualifier
        +--ro alarm-inventory
        |  +--ro alarm-type*
        |     +--ro alarm-type-id          alarm-type-id
        |     +--ro alarm-type-qualifier?  alarm-type-qualifier
        |     +--ro has-clear              union
        |     +--ro description            string
        +--ro summary* [severity]
        |  +--ro severity                 severity
        |  +--ro total?                   yang:gauge32
        |  +--ro cleared?                 yang:gauge32
        |  +--ro cleared-not-closed?      yang:gauge32
        |  |                              {operator-actions}?
        |  +--ro cleared-closed?          yang:gauge32
        |  |                              {operator-actions}?
        |  +--ro not-cleared-closed?      yang:gauge32
        |  |                              {operator-actions}?
        |  +--ro not-cleared-not-closed?  yang:gauge32
        |                                 {operator-actions}?
        +--ro alarm-list
           +--ro number-of-alarms?  yang:gauge32
           +--ro last-changed?      yang:date-and-time
           +--ro alarm* [resource alarm-type-id alarm-type-qualifier]
              +--ro resource                 resource
              +--ro alarm-type-id            alarm-type-id
              +--ro alarm-type-qualifier     alarm-type-qualifier
              +--ro alt-resource*            resource
              +--ro related-alarms*
              |  +--ro resource?              resource
              |  +--ro alarm-type-id?         alarm-type-id
              |  +--ro alarm-type-qualifier?  alarm-type-qualifier
              +--ro impacted-resources*      resource
              +--ro root-cause-resources*    resource
              +--ro is-cleared               boolean
              +--ro last-status-change       yang:date-and-time
              +--ro last-perceived-severity  severity
              +--ro last-alarm-text          alarm-text
              +--ro status-change* [event-time]
              |  +--ro event-time          yang:date-and-time
              |  +--ro perceived-severity  severity-with-clear
              |  +--ro alarm-text          alarm-text
              +--ro last-operator-state      operator-state
              |                              {operator-actions}?
              +--ro last-operator?           string
              |                              {operator-actions}?
              +--ro last-operator-text?      alarm-text
              |                              {operator-actions}?
              +--ro last-operator-action?    yang:date-and-time
              |                              {operator-actions}?
              +--ro operator-action* [time] {operator-actions}?
                 +--ro time      yang:date-and-time
                 +--ro state     operator-state
                 +--ro operator  string
                 +--ro text?     string

6.1.  Alarm Control

   The "notify-status-changes" leaf controls whether notifications are
   sent for all state changes, including severity changes and alarm text
   changes, or just for new and cleared alarms.

   Every alarm has a list of status changes, this is a circular list.
   The length of this list is controlled by "max-alarm-history".

6.1.1.  Alarm Shelving

   Alarm shelving is an important function in order for alarm management
   applications and operators to stop superfluous alarms.  Shelving
   implies that any alarms fulfilling the shelving criteria are
   ignored.

   The instrumentation MUST NOT update the alarm list or send any alarm
   notifications for alarms that match any shelving criteria.

   A device can select to not support the shelving feature.

6.2.  Alarm Inventory

   The alarm inventory represents all possible alarm types that may
   occur in the system.  A management system may use this to build alarm
   procedures.  The alarm inventory is relevant for several reasons:

   o  The system might not instrument all alarm type identities.

   o  The system has configured dynamic alarm types using the alarm
      qualifier.  The inventory makes it possible for the management
      system to discover these.

   Note that the mechanism whereby dynamic alarm types are added using
   the alarm type qualifier MUST populate this list.

6.3.  Alarm Summary

   The alarm summary list summarises alarms per severity; how many
   cleared, cleared and closed, and closed.

6.4.  The Alarm List

   The alarm list is a function from (resource, alarm type id, alarm
   type qualifier) to the current alarm state.

     +--ro alarm-list
        +--ro number-of-alarms?  yang:gauge32
        +--ro last-changed?      yang:date-and-time
        +--ro alarm* [resource alarm-type-id alarm-type-qualifier]
           +--ro resource                 resource
           +--ro alarm-type-id            alarm-type-id
           +--ro alarm-type-qualifier     alarm-type-qualifier
           +--ro alt-resource*            resource
           +--ro related-alarms*
           |  +--ro resource?              resource
           |  +--ro alarm-type-id?         alarm-type-id
           |  +--ro alarm-type-qualifier?  alarm-type-qualifier
           +--ro impacted-resources*      resource
           +--ro root-cause-resources*    resource
           +--ro is-cleared               boolean
           +--ro last-status-change       yang:date-and-time
           +--ro last-perceived-severity  severity
           +--ro last-alarm-text          alarm-text
           +--ro status-change* [event-time]
           |  +--ro event-time          yang:date-and-time
           |  +--ro perceived-severity  severity-with-clear
           |  +--ro alarm-text          alarm-text
           +--ro last-operator-state      operator-state
           |                              {operator-actions}?
           +--ro last-operator?           string
           |                              {operator-actions}?
           +--ro last-operator-text?      alarm-text
           |                              {operator-actions}?
           +--ro last-operator-action?    yang:date-and-time
           |                              {operator-actions}?
           +--ro operator-action* [time] {operator-actions}?
              +--ro time      yang:date-and-time
              +--ro state     operator-state
              +--ro operator  string
              +--ro text?     string

   Every alarm has three important states: the resource clearance state
   "is-cleared", the operator state "last-operator-state" and the
   severity "last-perceived-severity".

   In order to see the alarm history, the resource state changes are
   available in the "status-change" list and the operator history is
   available in the "operator-action" list.

6.5.  RPCs

   The alarm module supports several RPCs to manage the alarms:

   "purge-alarms":  delete alarms according to specific criteria, for
      example all cleared alarms older than a specific date.

   "compress-alarms":  compress the status-change list for the alarms.

   "set-operator-state":  change the operator state for an alarm, for
      example acknowledge.

6.6.  Notifications

   The alarm module supports a general notification to report alarm
   state changes.  It carries all relevant parameters for the alarm
   management application.

   There is also a notification to report that an operator changed the
   operator state on an alarm, like acknowledge.

     notifications:
        +---n alarm-notification
        |  +--ro resource               resource
        |  +--ro alarm-type-id          alarm-type-id
        |  +--ro alarm-type-qualifier?  alarm-type-qualifier
        |  +--ro alt-resource*          resource
        |  +--ro related-alarms*
        |  |  +--ro resource?              resource
        |  |  +--ro alarm-type-id?         alarm-type-id
        |  |  +--ro alarm-type-qualifier?  alarm-type-qualifier
        |  +--ro impacted-resources*    resource
        |  +--ro root-cause-resources*  resource
        |  +--ro event-time             yang:date-and-time
        |  +--ro perceived-severity     severity-with-clear
        |  +--ro alarm-text             alarm-text
        +---n operator-action {operator-actions}?
           +--ro resource               resource
           +--ro alarm-type-id          alarm-type-id
           +--ro alarm-type-qualifier?  alarm-type-qualifier
           +--ro time                   yang:date-and-time
           +--ro state                  operator-state
           +--ro operator               string
           +--ro text?                  string

7.  Alarm YANG Module

   file "ietf-alarms.yang"

   module ietf-alarms {
     namespace "urn:ietf:params:xml:ns:yang:ietf-alarms";
     prefix alarms;

     import ietf-yang-types {
       prefix yang;
     }

     organization
       "IETF NETMOD (NETCONF Data Modeling Language) Working Group";

     contact
       "WG Web:
        WG List:

        WG Chair: Thomas Nadeau

        WG Chair: Juergen Schoenwaelder

        Editor:   Stefan Vallin

        Editor:   Martin Bjorklund
       ";

     description
       "This module is an interface for managing alarms.  Main
        inputs to the module design are the 3GPP Alarm IRP and
        ITU-T X.733 alarm standards.

        Main features:

          * alarm list: a list of all alarms.  Cleared alarms stay
            in the list until explicitly removed.

          * operator actions on alarms: acknowledging and closing
            alarms.

          * administrative actions on alarms: purging alarms from
            the list according to specific criteria.

          * alarm inventory: a management application can read all
            alarm types implemented by the system.

          * alarm shelving: shelving (blocking) alarms according
            to specific criteria.

        This module uses a stateful view on alarms.  An alarm is a
        state for a specific resource.  An alarm type is a possible
        alarm state for a resource.  For example, the tuple
        ('linkAlarm', 'GigabitEthernet0/25') is an alarm of type
        'linkAlarm' on the resource 'GigabitEthernet0/25'.
     Alarm types are identified using YANG identities and an
     optional string-based qualifier.  The string-based qualifier
     allows for dynamic extension of the statically defined alarm
     types.  An alarm type identifies a possible alarm state and not
     the individual notifications.  For example, 'linkDown' and
     'linkUp' are two notifications referring to the same alarm
     type 'linkAlarm'.

     In this way there is no ambiguity about how alarm and alarm
     clear correlation should be performed: notifications reporting
     the same resource and alarm type are considered updates of the
     same alarm, such as clearing an active alarm or changing the
     severity of an active alarm.

     Severity and alarm text can be changed on an existing alarm.
     The above alarm example can therefore look like:

       (('linkAlarm', 'GigabitEthernet0/25'),
        warning,
        'interface down while interface admin state is up')

     There is a clear separation between updates on the alarm from
     the underlying resource, like clear, and updates from an
     operator, like acknowledging or closing an alarm:

       (('linkAlarm', 'GigabitEthernet0/25'),
        warning,
        'interface down while interface admin state is up',
        cleared,
        closed)

     Administrative actions like removing closed alarms older than a
     given time are supported.";

  revision 2015-05-04 {
    description
      "Initial revision.";
    reference
      "RFC XXXX: YANG Alarm Module";
  }

  /*
   * Features
   */

  feature operator-actions {
    description
      "This feature means that the system supports operator states
       on alarms.";
  }

  feature alarm-shelving {
    description
      "This feature means that the system supports shelving
       (filtering) of alarms.";
  }

  /*
   * Identities
   */

  identity alarm-identity {
    description
      "Base identity for alarm types.  A unique identification of
       the alarm, not including the resource.  Different resources
       can share alarm types.
       If a resource reports the same alarm type again, it is
       considered to be the same alarm.  The alarm type is a
       simplification of the different X.733 and 3GPP alarm IRP
       alarm correlation mechanisms and it allows for hierarchical
       extensions.

       A string-based qualifier can be used in addition to the
       identity in order to have different alarm types based on
       information not known at design-time, such as values in
       textual SNMP Notification var-binds.

       Standards and vendors can define sub-identities to clearly
       identify specific alarm types.

       This identity is abstract and shall not be used for alarms.";
  }

  /*
   * Common types
   */

  typedef resource {
    type union {
      type instance-identifier {
        require-instance false;
      }
      type yang:object-identifier;
      type string;
    }
    description
      "If the alarming resource is modelled in YANG, this type will
       be an instance-identifier.  If the resource is an SNMP
       object, the type will be an object-identifier.  If the
       resource is anything else, for example a distinguished name
       or a CIM path, this type will be a string.";
  }

  typedef alarm-text {
    type string {
      length "1..1024";
    }
    description
      "The string used to inform operators about the alarm.  This
       MUST contain enough information for an operator to be able
       to understand the problem and how to resolve it.  If this
       string contains structure, the format should be clearly
       documented so that programs are able to parse that
       information.";
  }

  typedef severity {
    type enumeration {
      enum indeterminate {
        value 2;
        description
          "Indicates that the severity level could not be
           determined.  This level SHOULD be avoided.";
      }
      enum minor {
        value 3;
        description
          "The 'minor' severity level indicates the existence of a
           non-service affecting fault condition and that
           corrective action should be taken in order to prevent a
           more serious (for example, service affecting) fault.
           Such a severity can be reported, for example, when the
           detected alarm condition is not currently degrading the
           capacity of the resource.";
      }
      enum warning {
        value 4;
        description
          "The 'warning' severity level indicates the detection of
           a potential or impending service affecting fault, before
           any significant effects have been felt.  Action should
           be taken to further diagnose (if necessary) and correct
           the problem in order to prevent it from becoming a more
           serious service affecting fault.";
      }
      enum major {
        value 5;
        description
          "The 'major' severity level indicates that a service
           affecting condition has developed and an urgent
           corrective action is required.  Such a severity can be
           reported, for example, when there is a severe
           degradation in the capability of the resource and its
           full capability must be restored.";
      }
      enum critical {
        value 6;
        description
          "The 'critical' severity level indicates that a service
           affecting condition has occurred and an immediate
           corrective action is required.  Such a severity can be
           reported, for example, when a resource becomes totally
           out of service and its capability must be restored.";
      }
    }
    description
      "The severity level of the alarm.";
    reference
      "ITU Recommendation X.733, 'Information Technology - Open
       Systems Interconnection - System Management: Alarm
       Reporting Function', 1992";
  }

  typedef severity-with-clear {
    type union {
      type enumeration {
        enum cleared {
          value 1;
          description
            "The alarm is cleared by the instrumentation.";
        }
      }
      type severity;
    }
    description
      "The severity level of the alarm including clear.  This is
       used only in state changes for an alarm.";
  }

  typedef operator-state {
    type enumeration {
      enum none {
        value 1;
        description
          "The alarm is not being taken care of.";
      }
      enum ack {
        value 2;
        description
          "The alarm is being taken care of.
           Corrective action not taken yet, or failed";
      }
      enum closed {
        value 3;
        description
          "Corrective action taken successfully.";
      }
    }
    description
      "Operator states on an alarm.  The 'closed' state indicates
       that an operator considers the alarm to be resolved.  This
       is separate from the resource alarm clear flag.";
  }

  /* Alarm type */

  typedef alarm-type-id {
    type identityref {
      base alarm-identity;
    }
    description
      "Identifies an alarm type.  The description of the alarm type
       id MUST indicate whether the alarm type is abstract or not.
       An abstract alarm type is used as a base for other alarm
       type ids and will not be used as a value for an alarm or be
       present in the alarm inventory.";
  }

  typedef alarm-type-qualifier {
    type string;
    description
      "If an alarm type cannot be fully specified at design-time by
       alarm-type-id, this string qualifier is used in addition to
       fully define a unique alarm type.

       The configuration of alarm qualifiers is considered to be
       part of the instrumentation and is out of scope for this
       module.";
  }

  /*
   * Groupings
   */

  grouping common-alarm-parameters {
    description
      "Common parameters for an alarm.

       This grouping is used both in the alarm list and in the
       notification representing an alarm state change.";

    leaf resource {
      type resource;
      mandatory true;
      description
        "The alarming resource.  See also 'alt-resource'.";
    }
    leaf alarm-type-id {
      type alarm-type-id;
      mandatory true;
      description
        "This leaf and the leaf 'alarm-type-qualifier' together
         provide a unique identification of the alarm type.";
    }
    leaf alarm-type-qualifier {
      type alarm-type-qualifier;
      description
        "This leaf is used when the 'alarm-type-id' leaf cannot
         uniquely identify the alarm type.
         Normally, this is not the case, and this leaf is the empty
         string.";
    }
    leaf-list alt-resource {
      type resource;
      description
        "Used if the alarming resource is available over other
         interfaces.  This field can contain, for example, SNMP
         OIDs, CIM paths or 3GPP distinguished names.";
    }
    list related-alarms {
      description
        "References to related alarms.  The reference is expressed
         as values for the alarm list and not leafrefs since the
         related alarm might have been removed from the alarm
         list.";
      // TODO: use YANG 1.1 leafref with require-instance false.
      // or use instance-identifier with require-instance false?
      leaf resource {
        type resource;
        description
          "The alarming resource for the related alarm.";
      }
      leaf alarm-type-id {
        type alarm-type-id;
        description
          "The alarm type identifier for the related alarm.";
      }
      leaf alarm-type-qualifier {
        type alarm-type-qualifier;
        description
          "The optional alarm qualifier for the related alarm.";
      }
    }
    leaf-list impacted-resources {
      type resource;
      description
        "Resources that might be affected by this alarm.";
    }
    leaf-list root-cause-resources {
      type resource;
      description
        "Resources that are candidates for causing the alarm.";
    }
  }

  grouping alarm-status-change-parameters {
    description
      "Parameters for an alarm state change.

       This grouping is used both in the alarm list's status-change
       list and in the notification representing an alarm state
       change.";

    leaf event-time {
      type yang:date-and-time;
      mandatory true;
      description
        "The time the status of the alarm changed.  The value
         represents the time the real alarm state change appeared
         in the resource and not when it was added to the alarm
         list.";
    }
    leaf perceived-severity {
      type severity-with-clear;
      mandatory true;
      description
        "The severity of the alarm as defined by X.733.
         Note that this may not be the original severity since the
         alarm may have changed severity.";
      reference
        "ITU Recommendation X.733, 'Information Technology - Open
         Systems Interconnection - System Management: Alarm
         Reporting Function', 1992";
    }
    leaf alarm-text {
      type alarm-text;
      mandatory true;
      description
        "A user friendly text describing the alarm state change.";
      reference
        "Additional Text from ITU Recommendation X.733,
         'Information Technology - Open Systems Interconnection -
         System Management: Alarm Reporting Function', 1992";
    }
  }

  grouping operator-parameters {
    description
      "This grouping defines parameters that can be changed by an
       operator";
    leaf time {
      type yang:date-and-time;
      mandatory true;
      description
        "Timestamp for operator action on alarm.";
    }
    leaf state {
      type operator-state;
      mandatory true;
      description
        "The operator's view of the alarm state.";
    }
    leaf operator {
      type string;
      mandatory true;
      description
        "The name of the operator that has acted on this alarm.";
    }
    leaf text {
      type string;
      description
        "Additional optional textual information provided by the
         operator.";
    }
  }

  /*
   * The /alarms data tree
   */

  container alarms {
    description
      "The top container for this module";

    container control {
      description
        "Configuration to control the alarm behaviour.";

      leaf max-alarm-history {
        type uint16;
        default 32;
        description
          "The status-change entries are kept in a circular list.
           When this number is exceeded, the oldest status change
           entry is automatically removed.  If the value is 0, the
           status change entries are accumulated indefinitely.";
      }
      leaf notify-status-changes {
        type boolean;
        default false;
        description
          "This leaf controls whether notifications are sent on all
           alarm status updates, e.g., updated perceived-severity
           or alarm-text.
           By default, notifications are only sent when a new alarm
           is raised, when an alarm is re-raised after being
           cleared, and when an alarm is cleared.";
      }

      container shelved-alarms {
        if-feature alarm-shelving;
        description
          "This list is used to shelve alarms.  The server will
           stop updating the alarm list and sending notifications
           for the shelved alarms.  Any alarms corresponding to the
           shelving criteria stay in the alarm list.  When a
           shelved alarm is deleted or changed, the server SHOULD
           update the alarm list to the current state.";

        list shelved-alarm {
          key shelf-name;
          leaf shelf-name {
            type string;
            description
              "A description of the shelved alarm.  It SHOULD
               include the reason for shelving this alarm";
          }
          description
            "Each entry defines the criteria for shelving alarms.";
          leaf resource {
            type resource;
            description
              "Shelve alarms for this resource.";
          }
          leaf alarm-type-id {
            type alarm-type-id;
            description
              "Shelve alarms for this alarm type identifier.";
          }
          leaf alarm-type-qualifier {
            type alarm-type-qualifier;
            description
              "Shelve alarms for this alarm type qualifier.";
          }
        }
      }
    }

    container alarm-inventory {
      config false;
      description
        "This list contains all possible alarm types for the
         system.  The list also tells if each alarm type has a
         corresponding clear state.  The inventory shall only
         contain concrete alarm types.";

      list alarm-type {
        description
          "An entry in this list defines a possible alarm.";
        leaf alarm-type-id {
          type alarm-type-id;
          mandatory true;
          description
            "The statically defined alarm type identifier for this
             possible alarm.";
        }
        leaf alarm-type-qualifier {
          type alarm-type-qualifier;
          description
            "The optionally dynamically defined alarm type
             identifier for this possible alarm.";
        }
        leaf has-clear {
          type union {
            type boolean;
          }
          mandatory true;
          description
            "This leaf tells the operator if the alarm will be
             cleared when the correct corrective action has been
             taken.
             Implementations SHOULD strive for detecting the
             cleared state for all alarm types.  If this leaf is
             true, the operator can monitor the alarm until it
             becomes cleared after the corrective action has been
             taken.  If this leaf is false, the operator needs to
             validate that the alarm is no longer active using
             other mechanisms.  Alarms can lack a corresponding
             clear due to missing instrumentation or because there
             is no logical corresponding clear state.";
        }
        leaf description {
          type string;
          mandatory true;
          description
            "A description of the possible alarm.  It SHOULD
             include information on possible underlying root causes
             and corrective actions.";
        }
      }
    }

    list summary {
      key severity;
      config false;
      description
        "A global summary of all alarms in the system.";
      leaf severity {
        type severity;
        description
          "Alarm summary for this severity level.";
      }
      leaf total {
        type yang:gauge32;
        description
          "Total number of alarms of this severity level.";
      }
      leaf cleared {
        type yang:gauge32;
        description
          "For this severity level, the number of alarms that are
           cleared.";
      }
      leaf cleared-not-closed {
        if-feature operator-actions;
        type yang:gauge32;
        description
          "For this severity level, the number of alarms that are
           cleared but not closed.";
      }
      leaf cleared-closed {
        if-feature operator-actions;
        type yang:gauge32;
        description
          "For this severity level, the number of alarms that are
           cleared and closed.";
      }
      leaf not-cleared-closed {
        if-feature operator-actions;
        type yang:gauge32;
        description
          "For this severity level, the number of alarms that are
           not cleared but closed.";
      }
      leaf not-cleared-not-closed {
        if-feature operator-actions;
        type yang:gauge32;
        description
          "For this severity level, the number of alarms that are
           not cleared and not closed.";
      }
    }

    container alarm-list {
      config false;
      description
        "The alarms in the system.";
      leaf number-of-alarms {
        type yang:gauge32;
        description
          "This object shows the total number of current alarms,
           i.e., the total number of entries in the alarm list.";
      }

      leaf last-changed {
        type yang:date-and-time;
        description
          "A timestamp when the active alarm list was last changed.
           The value can be used by a manager to initiate an alarm
           resynchronization procedure.";
      }

      list alarm {
        key "resource alarm-type-id alarm-type-qualifier";

        description
          "The list of alarms.  Each entry in the list holds one
           alarm for a given alarm type and resource, i.e., a
           device or managed object.  An alarm can be updated from
           the underlying device or by the user.  These changes are
           reflected in different lists below the corresponding
           alarm.";

        uses common-alarm-parameters;

        leaf is-cleared {
          type boolean;
          mandatory true;
          description
            "Indicates the clearance state of the alarm.  An alarm
             might toggle from active alarm to cleared alarm and
             back to active again.  This leaf reflects the
             perceived severity in the latest entry in the
             status-change list.";
        }

        leaf last-status-change {
          type yang:date-and-time;
          mandatory true;
          description
            "A timestamp when the status-change list was last
             changed.  This value equals the latest 'event-time'
             leaf in the status-change list.  The value can be used
             by a manager to read the last status change without
             iterating the status-change list below.";
        }

        leaf last-perceived-severity {
          type severity;
          mandatory true;
          description
            "The severity of the last status-change that reported a
             severity that is not equal to cleared.";
        }

        leaf last-alarm-text {
          type alarm-text;
          mandatory true;
          description
            "The alarm-text of the last status-change that reported
             a severity that is not equal to cleared.";
        }

        list status-change {
          key event-time;
          min-elements 1;
          description
            "A list of status change events for this alarm.

             This list is ordered according to the timestamps of
             alarm state changes.
             The last item corresponds to the latest state change.

             The following state changes create an entry in this
             list:
             - changed severity (warning, minor, major, critical)
             - clearance status, this also updates the is-cleared
               leaf
             - alarm text update";

          uses alarm-status-change-parameters;
        }

        leaf last-operator-state {
          if-feature operator-actions;
          type operator-state;
          mandatory true;
          description
            "The state of the alarm as set by the operator.  When
             the alarm is first raised by the instrumentation it
             has the 'none' state.  After the initial alarm raise,
             this leaf represents the state in the latest entry in
             the 'operator-action' list.  The 'closed' state
             indicates that the alarm is considered resolved by the
             operator.";
        }

        leaf last-operator {
          if-feature operator-actions;
          type string;
          description
            "The last operator that acted upon the alarm.";
        }

        leaf last-operator-text {
          if-feature operator-actions;
          type alarm-text;
          description
            "The text in the latest entry in the 'operator-action'
             list.";
        }

        leaf last-operator-action {
          if-feature operator-actions;
          type yang:date-and-time;
          description
            "A timestamp when the operator-action list was last
             changed.";
        }

        list operator-action {
          if-feature operator-actions;
          key time;
          description
            "This list is used by operators to indicate the state
             of human intervention on an alarm.  For example, if an
             operator has seen an alarm, the operator can add a new
             item to this list indicating that the alarm is
             acknowledged.";

          uses operator-parameters;
        }
      }
    }
  }

  /*
   * Operations
   */

  rpc compress {
    description
      "This operation requests the server to compress the alarm
       entry by removing the history of this alarm.
       The latest state change will be kept.";
    input {
      leaf alarm-type-id {
        type leafref {
          path "/alarms/alarm-list/alarm/alarm-type-id";
        }
        description
          "Compress alarms with this alarm-type-id.";
      }
      leaf alarm-type-qualifier {
        type leafref {
          path "/alarms/alarm-list/alarm[alarm-type-id=current()"
             + "/../alarm-type-id]/alarm-type-qualifier";
        }
        description
          "Compress the alarm with this alarm-type-qualifier.";
      }
      leaf resource {
        type leafref {
          path "/alarms/alarm-list/alarm[alarm-type-id=current()"
             + "/../alarm-type-id][alarm-type-qualifier=current()"
             + "/../alarm-type-qualifier]/resource";
        }
        description
          "Compress the alarm with this resource.";
      }
    }
    output {
      leaf result {
        type string;
        description
          "Information on the compress operation.";
      }
      leaf compressed-elements {
        type uint16;
        description
          "Number of removed entries";
      }
    }
  }

  rpc compress-alarms {
    description
      "This operation requests the server to compress the alarm
       entries by removing the history of each individual alarm.
       The latest state change will be kept.
       Note that no alarm entries as such are removed, only the
       history for each alarm.";
    output {
      leaf result {
        type string;
        description
          "Overall information on the compress rpc";
      }
      leaf compressed-elements {
        type uint16;
        description
          "Total number of compressed entries";
      }
    }
  }

  grouping filter-input {
    description
      "Grouping to specify a filter construct on alarm
       information.";
    leaf alarm-status {
      type enumeration {
        enum any {
          description
            "Ignore alarm clearance status";
        }
        enum cleared {
          description
            "Filter cleared alarms";
        }
        enum not-cleared {
          description
            "Filter not cleared alarms";
        }
      }
      mandatory true;
      description
        "The clearance status of the alarm.";
    }

    container older-than {
      presence "Age specification";
      description
        "Matches the 'last-status-change' leaf in the alarm.";
      choice age-spec {
        description
          "Filter using date and time age.";
        case seconds {
          leaf seconds {
            type uint16;
            description
              "Seconds part";
          }
        }
        case minutes {
          leaf minutes {
            type uint16;
            description
              "Minute part";
          }
        }
        case hours {
          leaf hours {
            type uint16;
            description
              "Hours part.";
          }
        }
        case days {
          leaf days {
            type uint16;
            description
              "Day part";
          }
        }
        case weeks {
          leaf weeks {
            type uint16;
            description
              "Week part";
          }
        }
      }
    }

    container severity {
      presence "Severity filter";
      choice sev-spec {
        description
          "Filter based on severity level.";
        leaf below {
          type severity;
          description
            "Severity less than this leaf.";
        }
        leaf is {
          type severity;
          description
            "Severity level equal to this leaf.";
        }
        leaf above {
          type severity;
          description
            "Severity level higher than this leaf.";
        }
      }
      description
        "Filter based on severity.";
    }

    container operator-state-filter {
      if-feature operator-actions;
      presence "Operator state filter";
      leaf state {
        type operator-state;
        description
          "Filter on operator state.";
      }
      leaf user {
        type string;
        description
          "Filter based
           on which operator.";
      }
      description
        "Filter based on operator state.";
    }
  }

  rpc set-operator-state {
    if-feature operator-actions;
    description
      "This is a means for the operator to indicate the level of
       human intervention on an alarm.";
    input {
      leaf resource {
        type leafref {
          path "/alarms/alarm-list/alarm[alarm-type-id=current()"
             + "/../alarm-type-id][alarm-type-qualifier=current()"
             + "/../alarm-type-qualifier]/resource";
        }
        description
          "Set operator state for alarm with this resource.";
      }
      leaf alarm-type-id {
        type leafref {
          path "/alarms/alarm-list/alarm/alarm-type-id";
        }
        description
          "Set operator state for alarm with this alarm type
           identifier.";
      }
      leaf alarm-type-qualifier {
        type leafref {
          path "/alarms/alarm-list/alarm[alarm-type-id=current()"
             + "/../alarm-type-id]/alarm-type-qualifier";
        }
        description
          "Set operator state for alarm with this alarm
           qualifier.";
      }
      leaf state {
        type operator-state;
        mandatory true;
        description
          "Set this operator state";
      }
      leaf text {
        type string;
        description
          "Additional optional textual information.";
      }
    }
  }

  rpc purge-alarms {
    description
      "This operation requests the server to delete entries from
       the alarm list according to the supplied criteria.
       Typically it can be used to delete alarms that are in closed
       operator state and older than a specified time.  The number
       of purged alarms is returned as an output parameter";
    input {
      uses filter-input;
    }
    output {
      leaf result {
        type string;
        description
          "Overall result for the purge rpc";
      }
      leaf purged-alarms {
        type uint16;
        description
          "Number of purged alarms.";
      }
    }
  }

  /*
   * Notifications
   */

  notification alarm-notification {
    description
      "This notification is used to report a state change for an
       alarm.
       The same notification is used for sending a newly raised
       alarm, a cleared alarm or changing the text and/or severity
       of an existing alarm.";

    uses common-alarm-parameters;
    uses alarm-status-change-parameters;
  }

  notification operator-action {
    if-feature operator-actions;
    description
      "This notification is used to report that an operator acted
       upon an alarm";

    leaf resource {
      type resource;
      mandatory true;
      description
        "The alarming resource.";
    }
    leaf alarm-type-id {
      type alarm-type-id;
      mandatory true;
      description
        "This leaf and the leaf 'alarm-type-qualifier' together
         provide a unique identification of the alarm type.";
    }
    leaf alarm-type-qualifier {
      type alarm-type-qualifier;
      description
        "This leaf is used when the 'alarm-type-id' leaf cannot
         uniquely identify the alarm type.  Normally, this is not
         the case, and this leaf is the empty string.";
    }
    uses operator-parameters;
  }
}

8.  X.733 Alarm Mapping Data Model

Many alarm management systems are based on the X.733 alarm standard.
This YANG module allows a mapping from alarm types to X.733
event-type and probable-cause.

The module augments the alarm inventory, the alarm list and the
alarm notification with X.733 parameters.

The module also supports a feature whereby the alarm manager can
configure the mapping.  This might be needed when the default
mapping provided by the system conflicts with other systems or is
otherwise considered unsuitable.

9.
X.733 Alarm Mapping YANG Module

file "ietf-alarms-x733.yang"

module ietf-alarms-x733 {
  namespace "urn:ietf:params:xml:ns:yang:ietf-alarms-x733";
  prefix x733;

  import ietf-alarms {
    prefix al;
  }

  organization
    "IETF NETMOD (NETCONF Data Modeling Language) Working Group";

  contact
    "WG Web:
     WG List:

     WG Chair: Thomas Nadeau

     WG Chair: Juergen Schoenwaelder

     Editor:   Stefan Vallin

     Editor:   Martin Bjorklund";

  description
    "This module augments the ietf-alarms module with X.733 mapping
     information.  The following is augmented with event type and
     probable cause:

       1) alarm inventory: every candidate alarm.

       2) alarm: every alarm in the system

       3) alarm notification: notifications indicating alarm state
          changes.

     As an optional feature, the module also allows the alarm
     management system to configure the mapping.";

  reference
    "ITU Recommendation X.733, 'Information Technology - Open
     Systems Interconnection - System Management: Alarm Reporting
     Function', 1992";

  revision 2015-05-04 {
    description
      "Initial revision.";
    reference
      "RFC XXXX: YANG Alarm Module";
  }

  feature configure-x733-mapping {
    description
      "The system can support configurable X.733 mapping from
       alarm-type to event-type and probable cause.";
  }

  typedef event-type {
    type enumeration {
      enum other {
        value 1;
        description
          "";
      }
      enum communicationsAlarm {
        value 2;
        description
          "An alarm of this type is principally associated with the
           procedures and/or processes required to convey
           information from one point to another";
      }
      enum qualityOfServiceAlarm {
        value 3;
        description
          "An alarm of this type is principally associated with a
           degradation in the quality of a service";
      }
      enum processingErrorAlarm {
        value 4;
        description
          "An alarm of this type is principally associated with a
           software or processing fault";
      }
      enum equipmentAlarm {
        value 5;
        description
          "An alarm of
           this type is principally associated with an equipment
           fault";
      }
      enum environmentalAlarm {
        value 6;
        description
          "An alarm of this type is principally associated with a
           condition relating to an enclosure in which the
           equipment resides.";
      }
      enum integrityViolation {
        value 7;
        description
          "";
      }
      enum operationalViolation {
        value 8;
        description
          "";
      }
      enum physicalViolation {
        value 9;
        description
          "";
      }
      enum securityServiceOrMechanismViolation {
        value 10;
        description
          "";
      }
      enum timeDomainViolation {
        value 11;
        description
          "";
      }
    }
    description
      "The event types as defined by X.733.  The use of the term
       'event' is a bit confusing.  In an alarm context these are
       top level alarm types.";
    reference
      "ITU Recommendation X.736, 'Information Technology - Open
       Systems Interconnection - System Management: Security Alarm
       Reporting Function', 1992";
  }

  augment "/al:alarms/al:alarm-inventory/al:alarm-type" {
    leaf event-type {
      type event-type;
      description
        "The alarm type has this X.733 event-type.";
    }
    leaf probable-cause {
      type uint32;
      description
        "The alarm type has this X.733 probable cause value.";
    }
    description
      "Augment X.733 mapping information to the alarm inventory.";
  }

  augment "/al:alarms/al:control" {
    description
      "Add X.733 mapping capabilities. ";
    list x733-mapping {
      if-feature configure-x733-mapping;
      key "alarm-type-id alarm-type-qualifier-match";
      description
        "This list allows a management application to control the
         X.733 mapping for all alarm types in the system.  Any entry
         in this list will allow the alarm manager to override the
         default X.733 mapping in the system and the final mapping
         will be shown in the alarm-inventory";

      leaf alarm-type-id {
        type al:alarm-type-id;
        description
          "Map the alarm type with this alarm type identifier.";
      }
      leaf alarm-type-qualifier-match {
        type string;
        description
          "A W3C regular expression that is used when mapping an
           alarm type and specific problem to X.733 parameters.";
      }
      leaf event-type {
        type event-type;
        mandatory true;
        description
          "The event type as defined in X.733/X.736.";
      }
      leaf probable-cause {
        type uint32;
        description
          "The probable cause for the alarm originally defined by
           X.733 and subsequent standards.  Due to the history of
           problems in maintaining a standardized probable cause,
           the probable cause is not unique.  A best effort mapping
           of the alarm to existing probable causes is used.";
      }
    }
  }

  /*
   * Add X.733 parameters to alarm and notification.
   */

  augment "/al:alarms/al:alarm-list/al:alarm" {
    description
      "Augment X.733 information to the alarm.";
    leaf event-type {
      type event-type;
      description
        "The X.733 event type for this alarm.";
    }
    leaf probable-cause {
      type uint32;
      description
        "The X.733 probable cause for this alarm.";
    }
  }

  augment "/al:alarm-notification" {
    description
      "Augment X.733 information to the alarm notification.";
    leaf event-type {
      type event-type;
      description
        "The X.733 event type for this alarm.";
    }
    leaf probable-cause {
      type uint32;
      description
        "The X.733 probable cause for this alarm.";
    }
  }
}

10.  Security Considerations

None.

11.  Acknowledgements

The authors wish to thank Viktor Leijon and Johan Nordlander for
their valuable input on forming the alarm model.

12.  References

12.1.
Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC6020]  Bjorklund, M., "YANG - A Data Modeling Language for the
              Network Configuration Protocol (NETCONF)", RFC 6020,
              October 2010.

12.2.  Informative References

   [ALARMIRP] 3GPP, "Telecommunication management; Fault Management;
              Part 2: Alarm Integration Reference Point (IRP):
              Information Service (IS)", 3GPP TS 32.111-2 3.4.0,
              March 2005.

   [ALARMSEM] Wallin, S., Leijon, V., Nordlander, J., and N. Bystedt,
              "The semantics of alarm definitions: enabling
              systematic reasoning about alarms", International
              Journal of Network Management, Volume 22, Issue 3,
              John Wiley and Sons, Ltd,
              http://dx.doi.org/10.1002/nem.800, March 2012.

   [EEMUA]    Engineering Equipment and Materials Users Association,
              "Alarm Systems: A Guide to Design, Management and
              Procurement", EEMUA Publication No. 191, 2nd edition,
              London, 2007.

   [ISA182]   International Society of Automation (ISA), "ANSI/
              ISA-18.2-2009 Management of Alarm Systems for the
              Process Industries", 2009.

   [RFC3877]  Chisholm, S. and D. Romascanu, "Alarm Management
              Information Base (MIB)", RFC 3877, September 2004.

   [X.733]    International Telecommunications Union, "Information
              Technology - Open Systems Interconnection - Systems
              Management: Alarm Reporting Function", ITU-T
              Recommendation X.733, 1992.

Appendix A.  Enterprise-specific Alarm-Types Example

This example shows how to define alarm-types in an enterprise
specific module.  In this case "xyz" has chosen to define top level
identities according to X.733 event types.
   module example-xyz-alarms {
     namespace "urn:example:xyz-alarms";
     prefix xyz-al;

     import ietf-alarms {
       prefix al;
     }

     identity xyz-alarms {
       base al:alarm-identity;
     }

     identity communicationsAlarm {
       base xyz-alarms;
     }
     identity qualityOfServiceAlarm {
       base xyz-alarms;
     }
     identity processingErrorAlarm {
       base xyz-alarms;
     }
     identity equipmentAlarm {
       base xyz-alarms;
     }
     identity environmentalAlarm {
       base xyz-alarms;
     }

     // communications alarms
     identity linkAlarm {
       base communicationsAlarm;
     }

     // QoS alarms
     identity highJitterAlarm {
       base qualityOfServiceAlarm;
     }
   }

Appendix B.  Alarm Inventory Example

   This example shows an alarm inventory with two alarm types: one
   defined only by its identifier, and another that is dynamically
   configured.  In the latter case, a digital input has been connected
   to a smoke detector; therefore, the 'alarm-type-qualifier' is set to
   "smoke-alarm" and the 'alarm-type-identity' to "environmentalAlarm".

      Link failure, operational state down but admin state up
      xyz-al:linkAlarm
      true

      Connected smoke detector to digital input
      xyz-al:environmentalAlarm
      smoke-alarm
      true

Appendix C.  Alarm List Example

   In this example, we show an alarm that has toggled [major, clear,
   major].  An operator has acknowledged the alarm.
      1
      2015-04-08T08:39:40.702544+00:00
      /dev:interface/FastEthernet[name='1/0']
      xyz-al:linkAlarm
      false
      1.3.6.1.2.1.2.2.1.1.17

      2015-04-08T08:39:40.000000+00:00
      major
      Link operationally down but administratively up

      2015-04-08T08:39:40.000000+00:00
      major
      Link operationally down but administratively up

      2015-04-08T08:30:00.000000+00:00
      cleared
      Link operationally up and administratively up

      2015-04-08T08:20:10.000000+00:00
      major
      Link operationally down but administratively up

      ack
      Will investigate, ticket TR764999

      2015-04-08T08:39:50.000000+00:00
      ack
      joe
      Will investigate, ticket TR764999

Appendix D.  Alarm Shelving Example

   This example shows how to shelve alarms.  We shelve alarms related
   to the smoke detectors, since they are being installed and tested.
   We also shelve all alarms from FastEthernet1/0.

      FE10
      /dev:interface/dev:FastEthernet[name='1/0']

      detectortest
      xyz-al:environmentalAlarm
      smoke-alarm

Appendix E.  X.733 Mapping Example

   This example shows how to map a dynamic alarm type
   (alarm-type-identity=environmentalAlarm,
   alarm-type-qualifier=smoke-alarm) to the corresponding X.733
   event-type and probable-cause parameters.

      xyz-al:environmentalAlarm
      smoke-alarm
      qualityOfServiceAlarm
      777

Authors' Addresses

   Stefan Vallin
   Cisco

   Email: svallin@cisco.com


   Martin Bjorklund
   Cisco

   Email: mbj@tail-f.com