Network Working Group I. Nadareishvili Internet-Draft January 15, 2018 Intended status: Informational Expires: July 19, 2018 Healtch Check Response Format for HTTP APIs draft-inadarei-api-health-check-00 Abstract This document proposes a "health check response" format for API HTTP clients. Note to Readers _RFC EDITOR: please remove this section before publication_ The issues list for this draft can be found at https://github.com/inadarei/rfc-healthcheck/issues [1]. The most recent draft is at https://inadarei.github.io/rfc- healthcheck/ [2]. Recent changes are listed at https://github.com/inadarei/rfc- healthcheck/commits/master [3]. See also the draft's current status in the IETF datatracker, at https://datatracker.ietf.org/doc/draft-inadarei-api-healthcheck/ [4]. Status of This Memo This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet- Drafts is at https://datatracker.ietf.org/drafts/current/. Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress." This Internet-Draft will expire on July 19, 2018. Nadareishvili Expires July 19, 2018 [Page 1] Internet-Draft Healtch Check Response Format for HTTP APIs January 2018 Copyright Notice Copyright (c) 2018 IETF Trust and the persons identified as the document authors. All rights reserved. This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License. Table of Contents 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2 1.1. Notational Conventions . . . . . . . . . . . . . . . . . 3 2. API Health Response . . . . . . . . . . . . . . . . . . . . . 3 3. Another subtitle . . . . . . . . . . . . . . . . . . . . . . 6 4. Final subtitle . . . . . . . . . . . . . . . . . . . . . . . 6 5. Security Considerations . . . . . . . . . . . . . . . . . . . 6 6. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 6 6.1. Media Type Registration . . . . . . . . . . . . . . . . . 6 7. References . . . . . . . . . . . . . . . . . . . . . . . . . 6 7.1. Normative References . . . . . . . . . . . . . . . . . . 6 7.2. Informative References . . . . . . . . . . . . . . . . . 7 7.3. URIs . . . . . . . . . . . . . . . . . . . . . . . . . . 7 Appendix A. Acknowledgements . . . . . . . . . . . . . . . . . . 7 Appendix B. Creating and Serving Health Responses . . . . . . . 7 Appendix C. Consuming Health Check Responses . . . . . . . . . . 8 Appendix D. Frequently Asked Questions . . . . . . . . . . . . . 8 D.1. Why not use (insert other health check format)? . . . . . 8 D.2. Why doesn't the format allow references or inheritance? . 8 Author's Address . . . . . . . . . . . . . . . . . . . . . . . . 9 1. Introduction Vast majority of modern APIs, that drive data to web and mobile applications use HTTP [RFC7230] as a transport protocol. The health and uptime of these APIs determine availability of the applications themselves. In distributed systems built with a number of APIs, understanding the health status of the APIs and making corresponding decisions, for failover or circuit-breaking, are essential for providing highly available solutions. Nadareishvili Expires July 19, 2018 [Page 2] Internet-Draft Healtch Check Response Format for HTTP APIs January 2018 There exists a wide variety of operational software that relies on the ability to read health check response of APIs. There is currently no standard for the health check output response, however, so most applications either rely on the basic level of information included in HTTP status codes [RFC7231] or use task-specific formats. Usage of task-specific or application-specific rformats creates significant challenges, disallowing any meaningful interoprerability across different implementations and between different tooling. Standardizing a format for health checks can provide any of a number of benefits, including: o Flexible deployment - since operational tooling and API clients can rely on rich, uniform format, they can be safely combined and substituted as needed. o Evolvability - new APIs, conforming to the standard, can safely be introduced in any environment and ecosystem that also conforms to the same standard, without costly coordination and testing requirements. This document defines a "health check" format using the JSON format [RFC7159] for APIs to use as a standard point for the health information they offer. Having a well-defined format for this purpose promotes good practice and tooling. 1.1. Notational Conventions The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119]. 2. API Health Response An API Health Response Format (or, interchangeably, "health check response") uses the format described in [RFC7159] and has the media type "application/vnd.health+json". *Note: this media type is not final, and will change before final publication.* Its content consists of a single mandatory root field and several optional fields: o status: (required) indicates whether the service status is acceptable or not. API publishers SHOULD use following values for the field: Nadareishvili Expires July 19, 2018 [Page 3] Internet-Draft Healtch Check Response Format for HTTP APIs January 2018 * "pass": healthy, * "fail": unhealthy, and * "warn": healthy, with some concerns. For "pass" and "warn" statuses HTTP response code in the 2xx - 3xx range MUST be used. for "fail" status HTTP response code in the 4xx - 5xx range MUST be used. In case of "warn" status, additional information SHOULD be provided, utilizing optional fields of the response. o serviceID: (optional) unique identifier of the service, in the application scope. o description: (optional) human-friendly description of the service. o memory: (optional) array of sizes for the currently utilized resident memory (in kilobytes) on each of the logical nodes backing the service. Logical node can be a physical server, VM, a container or any other logical unit that makes sense for service publisher. o cpu: (optional) array of cpu utiliation percentage on each of the logical nodes backing the service. Logical node can be a physical server, VM, a container or any other logical unit that makes sense for service publisher. o uptime: (optional) current uptime in seconds since the last restart o notes: (optional) array of notes relevant to current state of health o output: (optional) raw error output, in case of "fail" or "warn" states. This field SHOULD be omitted for "pass" state. o details: (optional) an array of objects optionally providing additional information regarding the various sub-components of the service. o links: (optional) an array of objects containing link relations and URIs [RFC3986] for external links that MAY contain more information about the health of the endpoint. Per web-linking standards [RFC5988] a link relationship SHOULD either be a common/ registered one or be indicated as a URI, to avoid name clashes. Nadareishvili Expires July 19, 2018 [Page 4] Internet-Draft Healtch Check Response Format for HTTP APIs January 2018 For example: GET /health HTTP/1.1 Host: example.org Accept: application/vnd.health+json HTTP/1.1 200 OK Content-Type: application/vnd.health+json Cache-Control: max-age=3600 Connection: close { "serviceID": "service:authz", "description": "health of authz service", "status": "pass", "memory": [4096, 1024, 3456], "cpu": [20, 40, 50], "uptime": "1209600", "notes": [""], "output": "", "details": [ { "id": "dfd6cf2b-1b6e-4412-a0b8-f6f7797a60d2", "name": "sub-component-X", "status": "pass", "value": "12313", "output": "" }, { "id": "3c1f048c-a4be-4aa2-83e6-2629073d19dc", "name": "sub-component-Y", "status": "warn", "value": "0920394", "output": "Close to capacity" } ], "links": [ {"rel": "about", "uri": "http://api.example.com/about/authz"}, { "rel": "http://api.example.com/rel/thresholds", "uri": "http://api.example.com/about/authz/thresholds" } ] } Nadareishvili Expires July 19, 2018 [Page 5] Internet-Draft Healtch Check Response Format for HTTP APIs January 2018 3. Another subtitle Lorem Ipsum 4. Final subtitle Lorem ipsum 5. Security Considerations Clients need to exercise care when reporting health information. Malicious actors could use this information for orchestrating attacks. In some cases the health check endpoints may need to be authenticated and institute role-based access control. 6. IANA Considerations 6.1. Media Type Registration TODO: application/vnd.health+json will be submitted for registration per [RFC6838] 7. References 7.1. Normative References [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, March 1997, . [RFC3986] Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform Resource Identifier (URI): Generic Syntax", STD 66, RFC 3986, DOI 10.17487/RFC3986, January 2005, . [RFC5226] Narten, T. and H. Alvestrand, "Guidelines for Writing an IANA Considerations Section in RFCs", RFC 5226, DOI 10.17487/RFC5226, May 2008, . [RFC5988] Nottingham, M., "Web Linking", RFC 5988, DOI 10.17487/RFC5988, October 2010, . [RFC7159] Bray, T., Ed., "The JavaScript Object Notation (JSON) Data Interchange Format", RFC 7159, DOI 10.17487/RFC7159, March 2014, . Nadareishvili Expires July 19, 2018 [Page 6] Internet-Draft Healtch Check Response Format for HTTP APIs January 2018 [RFC7234] Fielding, R., Ed., Nottingham, M., Ed., and J. Reschke, Ed., "Hypertext Transfer Protocol (HTTP/1.1): Caching", RFC 7234, DOI 10.17487/RFC7234, June 2014, . 7.2. Informative References [RFC6838] Freed, N., Klensin, J., and T. Hansen, "Media Type Specifications and Registration Procedures", BCP 13, RFC 6838, DOI 10.17487/RFC6838, January 2013, . [RFC7230] Fielding, R., Ed. and J. Reschke, Ed., "Hypertext Transfer Protocol (HTTP/1.1): Message Syntax and Routing", RFC 7230, DOI 10.17487/RFC7230, June 2014, . [RFC7231] Fielding, R., Ed. and J. Reschke, Ed., "Hypertext Transfer Protocol (HTTP/1.1): Semantics and Content", RFC 7231, DOI 10.17487/RFC7231, June 2014, . 7.3. URIs [1] https://github.com/inadarei/rfc-healthcheck/issues [2] https://inadarei.github.io/rfc-healthcheck/ [3] https://github.com/inadarei/rfc-healthcheck/commits/master [4] https://datatracker.ietf.org/doc/draft-inadarei-api-healthcheck/ Appendix A. Acknowledgements Thanks to Mike Amundsen, Erik Wilde, Justin Bachorik and Randall Randall for their suggestions and feedback. And to Mark Nottingham for blueprint for authoring RFCs easily. Appendix B. Creating and Serving Health Responses When making an health check endpoint available, there are a few things to keep in mind: o A health response endpoint is best located at a memorable and commonly-used URI, such as "health" because it will help self- discoverability by clients. Nadareishvili Expires July 19, 2018 [Page 7] Internet-Draft Healtch Check Response Format for HTTP APIs January 2018 o Health check responses can be personalized. For example, you could advertise different URIs, and/or different kinds of link relations, to afford different clients access to additional health check information. o Health check responses must be assigned a freshness lifetime (e.g., "Cache-Control: max-age=3600") so that clients can determine how long they could cache them, to avoid overly frequent fetching and unintended DDOS-ing of the service. o Custom link relation types, as well as the URIs for variables, should lead to documentation for those constructs. Appendix C. Consuming Health Check Responses Clients might use health check responses in a variety of ways. Note that the health check response is a "living" document; links from the health check response MUST NOT be assumed to be valid beyond the freshness lifetime of the health check response, as per HTTP's caching model [RFC7234]. As a result, clients ought to cache the health check response (as per [RFC7234]), to avoid fetching it before every interaction (which would otherwise be required). Likewise, a client encountering a 404 (Not Found) on a link is encouraged obtain a fresh copy of the health check response, to assure that it is up-to-date. Appendix D. Frequently Asked Questions D.1. Why not use (insert other health check format)? There are a fair number of existing health check formats. However, these formats have generally been optimised for particular use-cases, and less capable of fitting into general scenarios, optimized for interoperability. D.2. Why doesn't the format allow references or inheritance? Implementing them would add considerable complexity and the associated potential for errors (both in the specification and by its users). For the sake of interoperability and ease of implementation this specification doesn't attempt to create the most powerful format possible. Nadareishvili Expires July 19, 2018 [Page 8] Internet-Draft Healtch Check Response Format for HTTP APIs January 2018 Author's Address Irakli Nadareishvili Email: irakli@gmail.com URI: http://www.freshblurbs.com/ Nadareishvili Expires July 19, 2018 [Page 9]