Privacy Preserving Measurement                                M. Thomson
Internet-Draft                                                   Mozilla
Intended status: Standards Track                         18 October 2024
Expires: 21 April 2025


     Distributed Aggregation Protocol (DAP) Extensions for Improved
                  Application of Differential Privacy
                    draft-thomson-ppm-dap-dp-ext-00

Abstract

   The Distributed Aggregation Protocol (DAP) can be a key component of
   a system that provides differentially-private guarantees for
   participants.  Extensions to DAP are defined to support these
   guarantees.  This includes bindings of reports to specific options,
   so that the aggregation service can better implement privacy
   budgeting and replay protections.

About This Document

   This note is to be removed before publishing as an RFC.

   The latest revision of this draft can be found at
   https://martinthomson.github.io/dap-dp-ext/draft-thomson-ppm-dap-dp-
   ext.html.  Status information for this document may be found at
   https://datatracker.ietf.org/doc/draft-thomson-ppm-dap-dp-ext/.

   Discussion of this document takes place on the Privacy Preserving
   Measurement Working Group mailing list (mailto:ppm@ietf.org), which
   is archived at https://mailarchive.ietf.org/arch/browse/ppm/.
   Subscribe at https://www.ietf.org/mailman/listinfo/ppm/.

   Source for this draft and an issue tracker can be found at
   https://github.com/martinthomson/dap-dp-ext.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.






Thomson                   Expires 21 April 2025                 [Page 1]

Internet-Draft              DAP DP Extensions               October 2024


   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 21 April 2025.

Copyright Notice

   Copyright (c) 2024 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents (https://trustee.ietf.org/
   license-info) in effect on the date of publication of this document.
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document.  Code Components
   extracted from this document must include Revised BSD License text as
   described in Section 4.e of the Trust Legal Provisions and are
   provided without warranty as described in the Revised BSD License.

Table of Contents

   1.  Introduction
   2.  Conventions and Definitions
   3.  Late Task Binding
   4.  Scoping Extensions
     4.1.  Requester (Website) Identity
     4.2.  Report Partition
   5.  Privacy Budget Consumption
     5.1.  Privacy Budget Report Extension Format
     5.2.  Privacy Budget Usage
   6.  Security Considerations
   7.  IANA Considerations
   8.  References
     8.1.  Normative References
     8.2.  Informative References
   Acknowledgments
   Author's Address

1.  Introduction

   The Distributed Aggregation Protocol (DAP) [DAP] can be used as part
   of a differentially-private system.

   Differential privacy depends on being able to limit the contributions
   from participants.  The basic mechanism that DAP uses to cap
   contributions is anti-replay.  Aggregators are responsible for
   ensuring that the same report cannot be aggregated more than once.
   An honest participant will contribute a limited number of reports and
   can rely on at least one aggregator preventing that report from being
   used multiple times.  (The threat model does not seek to protect the
   privacy of a dishonest participant.)

   This basic anti-replay mechanism allows DAP to provide caps on
   contributions.  The resulting system is somewhat inflexible, which
   can limit the applicability of the protocol outside of the narrowly-
   defined usage modes in the basic specification.

   This document defines several report extensions to DAP that either
   enable greater flexibility or help constrain the flexibility allowed
   by other options.

      |  TODO: It would make sense for the corresponding extensions to
      |  be added to [TASKPROV].  However, that protocol does not
      |  include any provision for extensions.

2.  Conventions and Definitions

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in
   BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

3.  Late Task Binding

   DAP presently requires that a client be aware of the task that it is
   contributing to.  The identity of the task is bound to each report
   through the inclusion of the task ID in the call to the sharding
   function of the VDAF (Section 5.1 of [VDAF]).

   The late_binding report extension (codepoint 0xTBD) signals to
   aggregators that a report was not bound to a specific task when it
   was created.

   Late task binding might be useful when reports are collected by an
   intermediary.  The client that generates the report in this case
   might be unaware of how the report will ultimately be aggregated.
   This allows the intermediary to defer the creation of a task until it
   has determined the necessary parameters for the task.

   When sharding and protecting reports, the task ID is replaced with
   the fixed, 32-byte sequence of
   b13e8440f1cdb4da51eed3967e0a2652d27f5005bc35f751daf188b4b746708b (in
   hex).

      |  This is the output from SHA-256 [SHA2] when passed an ASCII-
      |  encoded [ASCII] input of 'no task_id'.
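
   The derivation described in the note above can be checked with a few
   lines of Python.  This is a minimal sketch and not part of the
   protocol: the names LATE_BINDING_TASK_ID and
   derive_placeholder_task_id are illustrative, and the constant is
   copied from the text of this section.

```python
import hashlib

# Placeholder task ID, copied verbatim from the text of this section.
LATE_BINDING_TASK_ID = bytes.fromhex(
    "b13e8440f1cdb4da51eed3967e0a2652d27f5005bc35f751daf188b4b746708b"
)


def derive_placeholder_task_id() -> bytes:
    """Hash the ASCII string 'no task_id' with SHA-256, as the note describes."""
    return hashlib.sha256("no task_id".encode("ascii")).digest()


if __name__ == "__main__":
    derived = derive_placeholder_task_id()
    # Report whether the derivation reproduces the constant in the text.
    print("derived:     ", derived.hex())
    print("matches text:", derived == LATE_BINDING_TASK_ID)
```

   A client that uses late binding would pass the 32-byte constant in
   place of the task ID wherever the task ID is normally used.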

   Enforcing anti-replay for a report that is not bound to a specific
   task is challenging.  An aggregator cannot constrain its search for
   duplicate reports to those that were submitted to the task.  This
   could greatly increase the cost of meeting anti-replay requirements.
   The intent with this extension is that additional constraints, such
   as one or more of the scoping extensions (see Section 4), will be
   used to make it more feasible for an aggregator to comply with anti-
   replay requirements.

4.  Scoping Extensions

   The DAP report extensions in this section might be used either to
   constrain the use of reports to tasks that are configured with
   matching values or to group reports for the purposes of detecting
   duplicates.

   Including additional scoping information can also ensure that reports
   do not get reused outside of their intended scope.

   This section defines report extensions that carry requester identity
   (Section 4.1) and report partition (Section 4.2).

4.1.  Requester (Website) Identity

   Reports might be requested by an entity that operates at a lower
   trust level than the entity that assembles the report.  The entity at
   the lower trust level might not have access to the information
   necessary to generate the report.

   The requester_identity report extension (codepoint 0xTBD) contains an
   encoding of the entity that requested the report be created.

   For example, an application could ask the operating system to
   generate a report using information that would normally be withheld
   from it.  Similarly, a website could ask a web browser to generate a
   report based on otherwise secret information.  In either case, the
   release of information for the report is conditional on it only
   being used by a specific aggregation service under terms that have
   been previously established with the aggregators.  Binding the
   report to the identity of the requester ensures that any use of the
   system can be accounted for as coming from that requester.

   The specific encoding used in this extension will depend on the
   application.  However, the use of a globally-unique identifier, such
   as an origin ([ORIGIN]) or serialized site ([SITE]), reduces the
   likelihood of name collisions.  A name collision might allow two
   requesters that share an aggregator to share and reuse each other's
   reports, or it might marginally increase the odds of reports being
   spuriously detected as duplicates.
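
   As an illustration of a globally-unique encoding, the following
   sketch derives an ASCII origin serialization from a URL.  The
   function name requester_identity and the choice of an ASCII origin
   string as the extension value are assumptions made for this example;
   this document leaves the encoding to the application.  The sketch
   also assumes that host names are already in ASCII form.

```python
from urllib.parse import urlsplit

# Default ports that are omitted when an origin is serialized.
DEFAULT_PORTS = {"http": 80, "https": 443}


def requester_identity(url: str) -> bytes:
    """Return an ASCII origin serialization for `url` (hypothetical encoding)."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = parts.hostname  # urlsplit lowercases the host and strips brackets
    if not scheme or host is None:
        raise ValueError("URL has no scheme or host")
    if ":" in host:
        host = f"[{host}]"  # restore brackets around an IPv6 literal
    origin = f"{scheme}://{host}"
    # Include the port only when it differs from the scheme default.
    if parts.port is not None and parts.port != DEFAULT_PORTS.get(scheme):
        origin += f":{parts.port}"
    return origin.encode("ascii")
```

   Two URLs from the same origin produce the same value, so reports
   requested by either are accounted for against the same requester.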

4.2.  Report Partition

   This extension allows a client to bind a report to an application-
   defined label.  This allows applications to partition reports and
   have each partition managed separately.

   The report_partition report extension (codepoint 0xTBD) contains an
   application-defined sequence of bytes.

   The use of this report extension allows aggregators to partition
   their state for tracking reports.  Duplicate reports only need to be
   tracked within a matching partition, whether duplicates are being
   detected within a task or across tasks.

   The selection of partition values might need to be coordinated with
   aggregators.  If partitions are used by aggregators, the amount of
   state the aggregator tracks is increased by the number of partitions.
   This represents an increase in total storage, in exchange for
   reducing the scope over which that storage needs to be consistent.

   An aggregator could constrain the values that are accepted for this
   extension, rejecting reports that lack the extension or have
   disallowed values.
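
   The partitioned tracking described in this section could be sketched
   as follows.  This is a minimal in-memory illustration, assuming that
   reports are identified by an opaque report ID; the class name
   PartitionedReplayTracker and its methods are hypothetical and a real
   aggregator would need durable, bounded storage.

```python
from collections import defaultdict


class PartitionedReplayTracker:
    """Tracks seen report IDs separately for each report_partition value."""

    def __init__(self, allowed_partitions=None):
        # None means that any partition value is accepted; this is an
        # assumption of the sketch, not a requirement of the extension.
        self._allowed = (
            None if allowed_partitions is None else set(allowed_partitions)
        )
        self._seen = defaultdict(set)

    def accept(self, partition, report_id: bytes) -> bool:
        """Return True if the report is new within its partition.

        `partition` is the extension value, or None if the report did
        not carry the extension.
        """
        if self._allowed is not None and partition not in self._allowed:
            return False  # extension missing or value disallowed
        seen = self._seen[partition]
        if report_id in seen:
            return False  # duplicate within this partition
        seen.add(report_id)
        return True
```

   Each partition has its own set of report IDs, so a lookup only
   touches the state for the partition named in the report.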

5.  Privacy Budget Consumption

   The gathering of reports can be modeled as the expenditure of privacy
   budget by a client.  That is, clients treat the creation of a report
   from private information as a limited release of information.

   Total privacy loss in this case is determined by the combination of
   two factors:

   *  How the report is aggregated.

   *  How many reports are produced.

   If aggregation includes the application of an appropriate
   differential privacy mechanism (that is, added noise; see [DWORK],
   [DAP-DP], and Section 7.5 of [DAP]), the client might rely on an
   understanding of that mechanism to model privacy loss.  However, such
   privacy loss might be based on an assumption that the client
   contributes just one report.  A complete model needs to consider the
   contributions of multiple reports.

   A privacy budgeting system provides additional flexibility.  Privacy
   loss associated with any task (or information release) can be
   adjusted to control the amount of noise that is added.  A budget
   might be specified in terms of a metric (like the epsilon parameter
   in (ε, δ)-differential privacy) that is expended with each
   information release.

   In one version of that model, a client is responsible for the
   management of any privacy budget.  Each report represents a logical
   information release, contingent on being sent for aggregation by
   aggregators that are trusted to apply an appropriate differential
   privacy mechanism with the appropriate level of noise.
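
   A client-managed budget of this kind could be sketched as a simple
   ledger.  The example assumes plain additive accounting, where the
   costs of successive reports sum, and tracks the budget in integer
   units of one-thousandth of an epsilon to match the privacy_budget
   extension in Section 5.1.  The class name PrivacyBudget and its
   methods are hypothetical.

```python
class PrivacyBudget:
    """Client-side ledger, in milli-epsilons, using additive accounting."""

    def __init__(self, total_milli_epsilon: int):
        if total_milli_epsilon < 0:
            raise ValueError("budget cannot be negative")
        self.remaining = total_milli_epsilon

    def spend(self, milli_epsilon: int) -> bool:
        """Deduct the cost of one report.

        Returns False, leaving the ledger unchanged, if the remaining
        budget does not cover the cost; the client would then decline
        to generate the report.
        """
        if milli_epsilon <= 0:
            raise ValueError("expenditure must be positive")
        if milli_epsilon > self.remaining:
            return False
        self.remaining -= milli_epsilon
        return True
```

   A client would call spend() before generating each report and
   produce the report only when the call succeeds.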

5.1.  Privacy Budget Report Extension Format

   The privacy_budget report extension (codepoint 0xTBD) encodes the
   amount of privacy budget that the client considers to be expended as
   a result of producing a report.

   The value of the extension is an encoding of the number of
   milli-epsilons of budget that are expended, using as many bytes as
   needed to encode the value in network byte order.  Each unit is
   one-thousandth of an epsilon (ε) as used in (ε, δ)-differential
   privacy.
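
   The encoding of the extension value could be sketched as follows.
   The function names are illustrative.  The text does not say how a
   value of zero is encoded; this sketch assumes a single zero byte.

```python
def encode_privacy_budget(milli_epsilon: int) -> bytes:
    """Encode a milli-epsilon count in the fewest big-endian bytes."""
    if milli_epsilon < 0:
        raise ValueError("privacy budget cannot be negative")
    # Assumption: zero is encoded as one zero byte rather than no bytes.
    length = max(1, (milli_epsilon.bit_length() + 7) // 8)
    return milli_epsilon.to_bytes(length, "big")


def decode_privacy_budget(data: bytes) -> int:
    """Decode a big-endian milli-epsilon count of any length."""
    return int.from_bytes(data, "big")
```

   For example, an expenditure of one epsilon is 1000 milli-epsilons,
   which encodes as the two bytes 03 e8 (in hex).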

      |  Note(1): Where the delta (δ) value is non-zero, and small
      |  epsilon increments can be expended, clients might also need to
      |  limit the number of reports to prevent the overall delta value
      |  from getting large.

      |  Note(2): A separate report extension could be defined to change
      |  the scale of this value or switch to a different unit, as
      |  necessary.

5.2.  Privacy Budget Usage

   An aggregator that is configured to apply a differential privacy
   mechanism can operate in one of two modes.  In the first, the privacy
   budget value is validated and reports that contain a value that is
   too small are rejected.  In the second, the minimum privacy budget
   value is used to determine the parameters for the differential
   privacy mechanism.

   In the first mode, aggregators each validate this parameter as part
   of validating each report.  The value in the report is compared with
   the value configured for the task.  A report that contains a value
   that is lower than the value configured for the task is the result of
   a client that expects that the aggregators will add more noise than
   the task configuration presently allows.  Aggregators MUST reject
   reports with a privacy budget value that is smaller than their
   configured privacy budget.

   Alternatively, aggregators could adjust the parameters of the
   differential privacy mechanism they use to match the smallest privacy
   budget that was included in reports.  For long-running tasks that
   produce multiple outputs over time, it is only necessary to ensure
   that each output contains noise that is based on the minimum budget
   expenditure of the reports that are included in that aggregate.
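
   The two modes could be sketched as follows.  The function names are
   hypothetical, and the use of the Laplace mechanism to turn a budget
   into a noise scale is an assumption made for illustration; this
   document does not select a differential privacy mechanism.

```python
def validate_budget(report_budget: int, configured_budget: int) -> bool:
    """First mode: accept a report only if its budget covers the task's."""
    return report_budget >= configured_budget


def effective_budget(report_budgets) -> int:
    """Second mode: the smallest budget among the reports in an aggregate."""
    budgets = list(report_budgets)
    if not budgets:
        raise ValueError("no reports to aggregate")
    return min(budgets)


def laplace_scale(milli_epsilon: int, sensitivity: float = 1.0) -> float:
    """Noise scale for a Laplace mechanism (assumed): sensitivity / epsilon."""
    if milli_epsilon <= 0:
        raise ValueError("budget must be positive")
    return sensitivity / (milli_epsilon / 1000)
```

   Under these assumptions, an aggregate over reports that carry
   budgets of 1000, 250, and 500 milli-epsilons would be noised for a
   budget of 250, which gives a Laplace scale of 4.0 for a sensitivity
   of 1.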

   This report extension can be used to protect reports that are
   conveyed from clients by untrusted entities, especially where those
   entities might be able to choose any task, as enabled by the
   late_binding report extension (Section 3).  This parameter ensures
   that the entity cannot direct reports to a task that has an
   inadequate differential privacy mechanism.

6.  Security Considerations

   The security considerations relevant to each extension are enumerated
   in the respective sections: Section 3, Section 4.1, Section 4.2, and
   Section 5.

   Use of DAP is subject to the security considerations of DAP
   (Section 7 of [DAP]) and the VDAF that is in use (Section 9 of
   [VDAF]).

7.  IANA Considerations

   Registrations for the defined report extensions need to be made, but
   this depends on the resolution of the TODO in Section 8.2.2 of [DAP].

8.  References

8.1.  Normative References

   [DAP]      Geoghegan, T., Patton, C., Pitman, B., Rescorla, E., and
              C. A. Wood, "Distributed Aggregation Protocol for Privacy
              Preserving Measurement", Work in Progress, Internet-Draft,
              draft-ietf-ppm-dap-12, 10 October 2024,
              <https://datatracker.ietf.org/doc/html/draft-ietf-ppm-dap-
              12>.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997,
              <https://www.rfc-editor.org/rfc/rfc2119>.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017, <https://www.rfc-editor.org/rfc/rfc8174>.

8.2.  Informative References

   [ASCII]    Cerf, V., "ASCII format for network interchange", STD 80,
              RFC 20, DOI 10.17487/RFC0020, October 1969,
              <https://www.rfc-editor.org/rfc/rfc20>.

   [DAP-DP]   Chen, J., McMillan, A., Patton, C., Talwar, K., and S.
              Wang, "Differential Privacy Mechanisms for DAP", Work in
              Progress, Internet-Draft, draft-wang-ppm-differential-
              privacy-00, 23 October 2023,
              <https://datatracker.ietf.org/doc/html/draft-wang-ppm-
              differential-privacy-00>.

   [DWORK]    Dwork, C. and A. Roth, "The Algorithmic Foundations of
              Differential Privacy", Now Publishers, Foundations and
              Trends® in Theoretical Computer Science vol. 9, no. 3-4,
              pp. 211-407, DOI 10.1561/0400000042, 2013,
              <https://doi.org/10.1561/0400000042>.

   [ORIGIN]   Barth, A., "The Web Origin Concept", RFC 6454,
              DOI 10.17487/RFC6454, December 2011,
              <https://www.rfc-editor.org/rfc/rfc6454>.

   [SHA2]     Eastlake 3rd, D. and T. Hansen, "US Secure Hash Algorithms
              (SHA and SHA-based HMAC and HKDF)", RFC 6234,
              DOI 10.17487/RFC6234, May 2011,
              <https://www.rfc-editor.org/rfc/rfc6234>.

   [SITE]     WHATWG, "HTML - Living Standard", 26 January 2021,
              <https://html.spec.whatwg.org/#site>.

   [TASKPROV] Wang, S. and C. Patton, "Task Binding and In-Band
              Provisioning for DAP", Work in Progress, Internet-Draft,
              draft-ietf-ppm-dap-taskprov-00, 9 October 2024,
              <https://datatracker.ietf.org/doc/html/draft-ietf-ppm-dap-
              taskprov-00>.

   [VDAF]     Barnes, R., Cook, D., Patton, C., and P. Schoppmann,
              "Verifiable Distributed Aggregation Functions", Work in
              Progress, Internet-Draft, draft-irtf-cfrg-vdaf-12, 4
              October 2024, <https://datatracker.ietf.org/doc/html/
              draft-irtf-cfrg-vdaf-12>.

Acknowledgments

   Roxana Geambesu noted that a binding to requester identity
   (Section 4.1) was an important component of a robust differential
   privacy system design.

Author's Address

   Martin Thomson
   Mozilla
   Email: mt@lowentropy.net