Network Working Group J. Arkko Internet-Draft Ericsson Intended status: Informational October 2021 Expires: 28 April 2022 Data minimization draft-arkko-iab-data-minimization-principle-00 Abstract Communications security has been at the center of many security improvements in the Internet. The goal has been to ensure that communications are protected against outside observers and attackers. This memo suggests that this is no longer alone sufficient to cater for the security and privacy issues seen on the Internet today. For instance, it is often also necessary to protect against endpoints that are compromised, malicious, or whose interests simply do not align with the interests of users. While such protection is difficult, there are some measures that can be taken. It is particularly important that new technology and new deployments consider the role of data passed to various parties -- including the primary protocol participants -- and balance the information given to them considering their roles and possible compromise of the information. Status of This Memo This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet- Drafts is at https://datatracker.ietf.org/drafts/current/. Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress." This Internet-Draft will expire on 4 April 2022. Copyright Notice Copyright (c) 2021 IETF Trust and the persons identified as the document authors. All rights reserved. Arkko Expires 28 April 2022 [Page 1] Internet-Draft Data Minimization in Internet Architecture October 2021 This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/ license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License. Table of Contents 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2 2. Background . . . . . . . . . . . . . . . . . . . . . . . . . 3 3. Related work . . . . . . . . . . . . . . . . . . . . . . . . 3 4. Principles . . . . . . . . . . . . . . . . . . . . . . . . . 4 4.1. Scope of protocol exchanges . . . . . . . . . . . . . . . 4 4.2. Principle: Transmission is publication . . . . . . . . . 5 4.3. Principle: Build for eventual compromise . . . . . . . . 5 4.4. Principle: Data and recipient minimization . . . . . . . 6 4.4.1. Protocol design implications . . . . . . . . . . . . 6 4.4.2. Fingerprinting avoidance . . . . . . . . . . . . . . 6 5. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 7 6. Informative References . . . . . . . . . . . . . . . . . . . 7 Author's Address . . . . . . . . . . . . . . . . . . . . . . . . 9 1. Introduction Communications security has been at the center of many security improvements on the Internet. The goal has been to ensure that communications are protected against outside observers and attackers. This has been exemplified in many aspects of IETF efforts, in the threat models [RFC3552], concerns about surveillance [RFC7258], and the introduction of encryption in many protocols [RFC9000], [RFC7858], [RFC8484]. This memo suggests that current security and privacy issues on the Internet require even further action. For instance, it is often also necessary to protect against endpoints that are compromised, malicious, or whose interests simply do not align with the interests of users. While such protection is difficult, there are some measures that can be taken. It is particularly important that new technology and new deployments consider the role of data passed to various parties -- including the primary protocol participants -- and balance the information given to them considering their roles and possible compromise of the information. Arkko Expires 28 April 2022 [Page 2] Internet-Draft Data Minimization in Internet Architecture October 2021 2. Background The primary reason for having to go beyond communications security is success. Advances in protecting most of our communications with strong cryptographic has resulted in much improved security, but also highlight the need for addressing other, remaining issues. Particularly when adversaries have increased their pressure against other avenues of attack. New adversaries and risks have arisen, e.g., due to increasing amount of information stored in various Internet services, or with the services whose interests are not aligned with their users. In short, attacks are migrating towards the currently easier targets, which no longer necessarily include direct attacks on traffic flows. These have been discussed at length in, for instance, [RFC8980], [I-D.farrell-etm] [I-D.arkko-arch-internet-threat-model-guidance], [I-D.lazanski-smart-users-internet], and others. It is important that when it comes to basic Internet infrastructure, our technology addresses non-communications threats to the extent possible. It is particularly important to ensure that non- communications security related threats are properly understood for any new Internet technology. The sole consideration of communications security aspects in designing Internet protocols may lead to accidental or increased impact of security issues elsewhere. For instance, allowing a participant to unnecessarily collect or receive information may lead to a similar effect as described in [RFC8546] for protocols: over time, unnecessary information will get used with all the associated downsides, regardless of what deployment expectations there were during protocol design. 3. Related work Hardie [RFC8558] discusses path signals, i.e., messages to or from on-path elements to endpoints. In the past, path signals were often implicit, e.g., network nodes interpreting in a particular way transport protocol headers originally intended for end-to-end consumption. The document recommends a principle that implicit signals should be avoided and that explicit signals be used instead, and only when the signal's originator intends that it be used by the network elements on the path. Arkko, Kuhlewind, Pauly, and Hardie [I-D.arkko-iab-path-signals-collaboration] discuss the same topic, and extend the RFC 8558 principles with recommendations to ensure minimum set of parties, minimum information, and consent. Arkko Expires 28 April 2022 [Page 3] Internet-Draft Data Minimization in Internet Architecture October 2021 Thomson [I-D.thomson-tmi] discusses the role intermediaries in the Internet architecture, at different layers of the stack. For instance, a router is an intermediary, some parts of DNS infrastructure can be intermediaries, messaging gateways are intermediaries. Thomson discusses when intermediaries are or are not an appropriate tool, and presents a number of principles relating to the use of intermediaries, e.g., deliberate selection of protocol participants or limiting the capabilities or information exposure related to the intermediaries. Trammel and Kuehlewind [RFC8546] discuss the concept of a "wire image" of a protocol. This is an abstraction of the information available to an on-path non-participant in a networking protocol. It relates to the topic of non-participants interpreting information that is available to them in the "wire image" (or associated timing and other indirect information). The issues are largely the same even for participants. Even proper protocol participants may start to use information available to them, regardless of whether it was intended to that participant or simply relayed through them. 4. Principles This memo approaches the topic from the point of disclosing information to another participant in a protocol exchange. 4.1. Scope of protocol exchanges This memo does not limit what types of protocol exchanges can lead to information disclosure. The protocol exchanges may relate to: * The interaction of an endpoint with the network, e.g., information they provide in any network attachment process or the wire images of the packets sent via the network. * The interaction of an endpoint with intermediaries, in the meaning discussed by Thomson. * The interaction of an endpoint with a service, such as a website or social networking function. * End-to-end interactions between users, represented by applications running on their computers. It is also important to observe that information disclosure can appear in several ways: * Explicitly carried information, e.g., a data item in a message sent to a protocol participant. Note that the carried information Arkko Expires 28 April 2022 [Page 4] Internet-Draft Data Minimization in Internet Architecture October 2021 may appear at multiple layers in the protocol stack. For instance, both protocol participants and non-participants may observe lower layer information, such as topological network addresses. Such information can be collected, used, and perhaps misused or leaked. * Indirectly inferred information, such as message arrival times or patterns in the traffic flow. Information may also be obtained from fingerprinting the protocol participants, in an effort to identify unique endpoints or users. * Information gathered from a collaboration among several parties, e.g., websites and social media systems collaborating to identify visiting users [WP2021]. 4.2. Principle: Transmission is publication PRINCIPLE: Consider information passed to another party as a publication. Avoid passing information that should not be published. This principle applies even if the communications that carry that information are encrypted, as the party that received the communications and can decrypt them may use the information, e.g., because it has become or will later become compromised. 4.3. Principle: Build for eventual compromise PRINCIPLE: Build defenses to protect information, even when some component in a system is or becomes compromised. For instance, at the service side encryption of data at rest may assist in protecting information if an attacker gains access to the servers. Similarly, protecting data in use can prevent leakage in some cases, and regular purging of old information can limit the amount of leaked information. Protocols can ensure that perfect forward secrecy is provided, so that the damage resulting from a compromise of keying material has limited impact. On the client side, the client may trust that another party handles information appropriately, but take steps to ensure or verify that this is the case. For instance, as discussed above, the client can encrypt a message only to the actual final recipient, even if the server holds our message before it is delivered. In some case the client may also verify correct behavior, e.g., through confidential computing attestations. Arkko Expires 28 April 2022 [Page 5] Internet-Draft Data Minimization in Internet Architecture October 2021 4.4. Principle: Data and recipient minimization PRINCIPLE: Information should not be disclosed, stored, or routed in cleartext through services that do not absolutely need to have that information for the function they perform. This the "need to know" principle. Note that this principle applies at multiple layers in the stack. It is not just about intermediaries in the network and transport layers, but also intermediaries and services on the application layer. Information should only be passed between the "real ends" of a conversation, unless the information is necessary for a useful function in a service. For instance, a transport connection between two components of a system is not an end-to-end connection even if it encompasses all the protocol layers up to the application layer. It is not end-to-end, if the information or control function it carries extends beyond those components. For instance, just because an e-mail server can read the contents of an e-mail message do not make it a legitimate recipient of the e-mail. Typically, information can be classified in different categories, such as information needed for the function provided by a service (e.g., addressing information such as e-mail headers needed to find targeted destination) and information that should only be revealed to the targeted destination (such as e-mail message contents). 4.4.1. Protocol design implications An obvious implication of the above is that it is necessary to have mechanisms that allow secure communication and data object protection, that is not tied to a particular IP packet source and destination or a transport layer connection. These mechanisms also require associated key distribution and agreement facilities. 4.4.2. Fingerprinting avoidance Fingerprinting warrants a separate discussion. Internet technology has tended to move towards richer and more power mechanisms over time. For instance, full-functionality web and transport layer security stacks are now used for almost all purposes across the network. Arkko Expires 28 April 2022 [Page 6] Internet-Draft Data Minimization in Internet Architecture October 2021 This is of course good, and the performance, expressive power, and security improvements that came through these are much needed. Nevertheless, all protocol mechanisms come with some fingerprinting opportunities, and this tends to be easier the higher in the stack we are, given the wealth of options and algorithms in use. [Fingerprinting] and [AmIUnique] provide a good starting point for some of the issues, along with measurements about the accuracy of fingerprinting mechanisms and defenses against them. The general topic of ensuring that protocol mechanisms stays evolvable and workable is covered in [I-D.iab-use-it-or-lose-it]. But the associated methods for reducing fingerprinting possibilities probably deserve further study. [I-D.wood-pearg-website-fingerprinting] discusses one aspect of this. 5. Acknowledgements The author would like to thank the members of the IAB, and the participants of IETF SAAG WG, Model-T IAB program, and the 2019 IAB DEDR workshop that all discussed some aspects of these issues. The author would like to acknowledge the significant contributions of Stephen Farrell, Martin Thomson, Mark McFadden, Chris Wood, Dominique Lazanski, Eric Rescorla, Russ Housley, Robin Wilton, Mirja Kuehlewind, Tommy Pauly, Jaime Jimenez and Christian Huitema in discussions around this general problem area. 6. Informative References [AmIUnique] INRIA, ., "Am I Unique?", https://amiunique.org , 2020. [Fingerprinting] Laperdrix, P., Bielova, N., Baudry, B., and G. Avoine, "Browser Fingerprinting: A survey", arXiv:1905.01051v2 [cs.CR] 4 Nov 2019 , n.d.. [I-D.arkko-arch-internet-threat-model-guidance] Arkko, J. and S. Farrell, "Internet Threat Model Guidance", Work in Progress, Internet-Draft, draft-arkko- arch-internet-threat-model-guidance-00, 12 July 2021, . [I-D.arkko-iab-path-signals-collaboration] Arkko, J., Hardie, T., and T. Pauly, "Considerations on Application - Network Collaboration Using Path Signals", Work in Progress, Internet-Draft, draft-arkko-iab-path- Arkko Expires 28 April 2022 [Page 7] Internet-Draft Data Minimization in Internet Architecture October 2021 signals-collaboration-01, 25 October 2021, . [I-D.farrell-etm] Farrell, S., "We're gonna need a bigger threat model", Work in Progress, Internet-Draft, draft-farrell-etm-03, 6 July 2019, . [I-D.iab-use-it-or-lose-it] Thomson, M. and T. Pauly, "Long-term Viability of Protocol Extension Mechanisms", Work in Progress, Internet-Draft, draft-iab-use-it-or-lose-it-04, 12 October 2021, . [I-D.lazanski-smart-users-internet] Lazanski, D., "An Internet for Users Again", Work in Progress, Internet-Draft, draft-lazanski-smart-users- internet-00, 8 July 2019, . [I-D.thomson-tmi] Thomson, M., "Principles for the Involvement of Intermediaries in Internet Protocols", Work in Progress, Internet-Draft, draft-thomson-tmi-02, 6 July 2021, . [I-D.wood-pearg-website-fingerprinting] Goldberg, I., Wang, T., and C. A. Wood, "Network-Based Website Fingerprinting", Work in Progress, Internet-Draft, draft-wood-pearg-website-fingerprinting-00, 4 November 2019, . [RFC3552] Rescorla, E. and B. Korver, "Guidelines for Writing RFC Text on Security Considerations", BCP 72, RFC 3552, DOI 10.17487/RFC3552, July 2003, . [RFC7258] Farrell, S. and H. Tschofenig, "Pervasive Monitoring Is an Attack", BCP 188, RFC 7258, DOI 10.17487/RFC7258, May 2014, . Arkko Expires 28 April 2022 [Page 8] Internet-Draft Data Minimization in Internet Architecture October 2021 [RFC7858] Hu, Z., Zhu, L., Heidemann, J., Mankin, A., Wessels, D., and P. Hoffman, "Specification for DNS over Transport Layer Security (TLS)", RFC 7858, DOI 10.17487/RFC7858, May 2016, . [RFC8484] Hoffman, P. and P. McManus, "DNS Queries over HTTPS (DoH)", RFC 8484, DOI 10.17487/RFC8484, October 2018, . [RFC8546] Trammell, B. and M. Kuehlewind, "The Wire Image of a Network Protocol", RFC 8546, DOI 10.17487/RFC8546, April 2019, . [RFC8558] Hardie, T., Ed., "Transport Protocol Path Signals", RFC 8558, DOI 10.17487/RFC8558, April 2019, . [RFC8980] Arkko, J. and T. Hardie, "Report from the IAB Workshop on Design Expectations vs. Deployment Reality in Protocol Development", RFC 8980, DOI 10.17487/RFC8980, February 2021, . [RFC9000] Iyengar, J., Ed. and M. Thomson, Ed., "QUIC: A UDP-Based Multiplexed and Secure Transport", RFC 9000, DOI 10.17487/RFC9000, May 2021, . [WP2021] Fowler, Geoffrey A., "There's no escape from Facebook, even if you don't use it", Washington Post , August 2021. Author's Address Jari Arkko Ericsson Valitie 1B FI- Kauniainen Finland Email: jari.arkko@piuha.net Arkko Expires 28 April 2022 [Page 9]