The ISP Column
A column on things Internet
 
 
Other Formats: PDF   TXT  
 


 

Who Uses Google's Public DNS?
November 2013


Geoff Huston

Much has been said about how Google uses the services they provide, including their mail service, their office productivity tools, file storage and similar services, as a means of gathering an accurate profile of each individual user of their services. The company has made a very successful business out of measuring users, and selling those metrics to advertisers. But can we measure Google as they undertake this activity? How many users avail themselves of their services? Perhaps that's a little ambitious at this stage, so maybe a slightly smaller scale may be better. Let's just look at one Google service. What I would like to describe here is the results of an extended effort to measure which of the world’s Internet user population are users of Google’s Public DNS Service.

How do folk get to use Google’s Public DNS service? One way is for end users to configure their systems to use Google’s Public DNS service by following the configuration instructions at https://developers.google.com/speed/public-dns/docs/using. Yes, its as simple as placing 8.8.8.8 into the locally configured list of DNS resolvers. Most of the time this should Just Work. Of course there are some times, perhaps where there is DNS traffic interception going on, that your efforts to use a particular DNS resolver might well be thwarted by the actions of these middleware interceptors that intercept your DNS query packets, and answer them via a local DNS resolver, who then spoofs the identity of Google’s DNS resolvers in their response back to you. However, for many users it does work exactly as intended. And it’s not just individuals who have taken advantage of this service. It’s also evident that larger networks and ISP platforms have also availed themselves of this Google service, and they appear to use Google’s DNS resolvers as DNS forwarders from their own DNS resolver services.

Can we actually measure the extent to which end users and networks make use of Google’s DNS services?

Why is this question of interest?

It’s worth noting that almost everything we do on the Internet starts with a domain name. Whether it’s browsing the web, Twitter, Facebook, videos, talking, or almost any other form of application, the universal way of connecting to a service is by resolving the service’s domain name to an IP address, then starting a communications session with the identified remote service point. If one could see the entire panoply of DNS queries in one single view, then one would have a comprehensive picture of what everybody is doing on the Internet, in real time. But it’s not really necessary to have such a comprehensive view. As any statistician could tell you, it is possible to construct a comprehensive picture of the Internet from a far smaller sample set. Google’s Public DNS service is apparently very successful. Many folk direct their queries to these Google servers. Sometimes its faster, sometimes its more complete, but whatever the reason, many users have gone down this path. So I’d like to look at just how much of the Internet’s user population has their DNS queries answered by Google. And from that measurement data maybe we could make a guess as to just how complete Google's DNS-derived picture of the Internet might look like.

Measurement Technique

We had started along this path by looking for something entirely different. We were interested to measure the extent to which users pass their DNS queries to DNS resolvers that perform DNSSEC validation. During this investigation, at the start of 2013, Google announced that they would be turning on DNSSEC validation for their resolvers. At that point we were interested to understand to what extent would this announcement alter the overall landscape of DNSSEC validation.

How can we measure what end users do? Broadly speaking, there are two forms of approach. The first is to instrument a service that is very widely used and conduct the measurement exercise from that platform. Of course the precondition here is to have access to a widely used service point and be permitted to add various forms of action scripts into that service environment. The second approach is to inject the measurement code into the user’s environment, and have the user perform the measurement test directly. We have been using this latter approach for some years now, based on using the online advertisement network as a means of gaining access to the user environment, and then using a simple script embedded in the ad to request the user to perform a fetch of a small set of URLs.

If we carefully construct the URLs that are presented to end users to fetch, then it is possible to expose a number of aspects of the user’s environment. The basic approach is to use three URLs, where the DNS names are respectively DNSSEC-signed, DNSSEC-signed with invalid signatures and are not DNSSEC-signed at all. These considerations relate to the use of unique URLs at each invocation of the experiment. It is critical to avoid the interception of these URL resolution and fetch tasks being trapped by intermediate DNS and WEB caches, as we need to intuit end user behaviour based on interactions we see at the authoritative DNS and web servers for the experiment’s URLs. That means that we use an environment that is a little unwise in a normal context, in that the DNS is served from a single authoritative name server, rather than the more typical configuration of 2 or perhaps 3 name servers. Also, we use a DNS name where we have quite deliberately broken the DNSSEC signature. And of course every name contains unique components, and we apply the DNSSEC signatures across these unique name components.

Analysing the Experimental Technique

The DNS is both simple and incredibly complicated. Its simple in so far as its a protocol where a client generates a query as a DNS packet, and sends it to a DNS resolver, and the DNS resolver returns the packet as an response. If the queried name exists, the resolver is expected to have placed the details of the answer into the answer section of the DNS packet. Otherwise, the response is expected to have an appropriate diagnostic code set. Question. Answer. Simple.

And the model of resolution can equally be presented in extremely simple terms. To answer the question, the resolver asks the DNS name server that is “authoritative” for the zone being queried, and passes that response back to thew resolver. Figure 1 shows the DNS in this naive manner.


Figure 1 - A Naive view of DNS Resolution

But of course this naïve view conveniently covers up a massive amount of detail and complexity in the DNS. Hidden behind this seemingly simple query/response interface is a distributed database of hundreds of millions of individual entries, served from a set of some tens of millions of DNS resolvers. Their interconnection is highly varied, and the resultant system behaviour is not only diverse, but at times completely perverse as well! A small snapshot of the internal structure of DNS resolvers is shown in Figure 2.


Figure 2 – Some forms of DNS Resolver query paths

However, this level of internal structure of the DNS is not directly visible. DNS queries have no “trail” information. When resolver A forwards a DNS query to resolver B, it has no ability to describe its motives in so doing. It cannot identify the client that triggered the original query, nor expose the internal logic that lead to the resolver generating this query nor the logic that selected this particular resolver as the resolver to receive this query.


Figure 3 – A Working Model of DNS Resolution

So when we want to ask: “How much of the Internet’s end user base uses Google’s Public DNS Service?” it probably makes some sense to explain how we will go about answering that question. The simplification we use here is look at the DNS world from the perspective of the authoritative name server, which tends to cut out all the intermediate DNS resolvers. What we are left with is, from the perspective of the authoritative name server, a “visible” DNS resolver (Figure 3). By itself, this simplification would still not really help. However, if we pass every client a completely unique DNS name to resolve, then so as long as we keep track of the association of clients to unique DNS names, we can associate each client with the visible resolver or resolver(s) that they use.


Figure 4 – Mapping Users to Visible Resolvers

Of course the client may have selected this resolver themselves, in which case they may be directly aware of which resolvers they use. Or they may be using a local access network, that has a local resolver that passes all its requests to a recursive forwarder who, in turn,… and so on. In this case the selection of the visible resolver may well be a choice that is completely opaque to the end user. So when we say that a client is “using a Google DNS service”, what we mean in this context is that the visible resolver that ultimately passes the DNS query to the authoritative name server is part of the Google Public DNS resolver collection.

Results

We ran an online advertisement with these URLS as fetch targets from the 9th to the 26th of May 2013. The experiment was invoked by 2,498,497 clients over this period. 92.8% of these experiments used visible resolvers that were not operated by Google. The other 7.2% of clients ultimately had their queries passed to the experiment’s authoritative name server from Google’s DNS servers.

That’s a very large number for a relatively recent service offering. The uptake of use of this service is certainly very impressive.

And if the logs of these DNS resolvers provide a useful insight as to the real time online activities of the end user population, then having a clear view of the activities of some 7% of the entire end user population of the Internet is a particularly valuable observational vantage point!

We can drill down into these numbers to get a bit more detail. The URL that is invalidly DNSSEC-signed returns a somewhat unhelpful error code back to the client, namely a SERVFAIL error, indicating some unspecified error on the part of the DNS server. All Google’s DNS servers perform DNSSEC validation, so they will return these codes to the DNS client who posed the query. If the client has alternate resolvers configured, then they will interpret this response as grounds to repeat the query to the alternate resolvers. From this behaviour we can see the number of clients who exclusively use Google’s Public DNS services, and those who have alternate servers. We observed that 5.3% of users exclusively used Google’s DNS servers, while 1.9% used Google in conjunction with other DNS resolvers.

Given that the end client’s IP address can also be geo-mapped into a country of origin with a reasonably level of certainty, its also possible to see if particular countries make significant use of Google’s service.

Rank CC Count ALL MIXED NOT Country 
VN 25,784 39.2% 2.8% 58.0% Vietnam 
JM 1,413 27.5% 0.5% 72.0% Jamaica 
GT 1,720 25.3% 11.1% 63.5% Guatemala 
BN 410 20.0% 36.1% 43.9% Brunei Darussalam 
ID 50,935 19.0% 5.4% 75.6% Indonesia 
LA 300 18.7% 7.3% 74.0% Lao People's Democratic Republic 
TR 47,816 18.3% 1.6% 80.0% Turkey 
HN 931 18.1% 18.9% 62.9% Honduras 
AZ 6,970 17.9% 31.2% 50.8% Azerbaijan 
10 TZ 297 16.2% 23.6% 60.3% United Republic of Tanzania 
11 NI 992 16.0% 40.1% 43.8% Nicaragua 
12 BO 1,295 16.0% 17.3% 66.7% Bolivia 
13 EG 34,719 14.9% 4.0% 81.1% Egypt 
14 GH 912 14.7% 6.7% 78.6% Ghana 
15 PS 2,779 14.2% 38.9% 46.8% Occupied Palestinian Territory 
16 IT 76,489 13.9% 0.5% 85.6% Italy 
17 DZ 7,397 13.2% 24.0% 62.8% Algeria 
18 BD 712 12.8% 9.4% 77.8% Bangladesh 
19 MY 32,041 12.1% 2.1% 85.8% Malaysia 
20 UA 25,124 11.9% 2.7% 85.4% Ukraine 

Table 1 – Use of Google’s Public DNS by Country – May 2013</p>  

The table shows the adoption of Google’s Public DNS by country. In terms of the relative level of penetration within countries this certainly shows that if Google’s aim through this facility was to provide services to the developing world, then this list is consistent with that overall objective, in so far as there is a relatively high level of representation here from such economies.

Further Results

Of course in June of this year Edward Snowden fled the United States, and released material relating to the until then covert eavesdropping activities of the National Security Agency of the United States. There has been some resulting public concern about the extent to which our online activities generate a rich plume of digital exhaust, and the extent to which others have been sniffing these fumes and generating accurate profiles of ourselves, not only as online users, but as consumers and as individuals. There is no published material whatsoever to assume that Google’s Public DNS service has been compromised in any way by such agency activity, but at the same time there is no undertakings by Google as to what use it makes of the DNS data generated by this free service, nor any undertakings that others may have had access to such data.

As Renesys reported at the end of October () Google’s DNS service has left Brazil, and the report suggests that this action by Google is in response to forthcoming Brazilian legislation that will require Internet companies operating in Brazil to store data about Brazilian users within Brazil.

Did the level of public use of Google’s Public DNS services change in response to these events?

We have re-run the same experiment in the ensuing months, and the picture is certainly not one of monotonically increasing up and to the right adoption of Google’s public DNS service. Numbers were at their lowest in August, when the stories of the Snowden revelations and their consequences appeared to be well covered throughout the world’s press. Since then the adoption rate has resumed its increase, and by November it appears that the level of use is back to where it was in May.

   All-Google  Mixed-Google  No-Google 
May-13  5.3% 1.9% 92.8% 
Jul-13  4.6% 2.1% 93.4% 
Aug-13  4.4% 2.1% 93.5% 
Sep-13  4.7% 2.1% 93.2% 
Oct-13  5.1% 2.2% 92.6% 
Nov-13  5.0% 2.4% 92.6% 

Table 2 – Use of Google’s Public DNS May – November 2013

If we compare the September ’13 numbers against the May ‘13 numbers we can derive a national table of those countries where the level of use of Google’s DNS service fell over that period, and those countries where it rose. Table 3 shows the top 20 list of countries where use fell over that period, and Table 4 shows a comparable list where this use increased.

Rank CC  Delta OFF MAY%  SEP % Country 
NI  37.7%  56.1%  18.3%  Nicaragua 
PS  22.7%  53.1%  30.4%  Occupied Palestinian Territory 
BO  21.5%  33.2%  11.7%  Bolivia 
BN  10.2%  56.1%  45.8%  Brunei Darussalam 
KE  8.2%  27.5%  19.2%  Kenya 
AL  6.4%  16.7%  10.3%  Albania 
LA  6.3%  26.0%  19.6%  Lao People's Democratic Republic 
MZ  6.3%  17.5%  11.2%  Mozambique 
PK  6.1%  18.2%  12.0%  Pakistan 
10  JM  5.3%  27.9%  22.6%  Jamaica 
11  TR  5.2%  19.9%  14.7%  Turkey 
12  AZ  5.1%  49.1%  43.9%  Azerbaijan 
13  TZ  4.9%  39.7%  34.7%  United Republic of Tanzania 
14  GT  3.5%  36.4%  32.9%  Guatemala 
15  BA  3.1%  9.0%  5.8%  Bosnia and Herzegovina 
16  SR  2.5%  5.0%  2.5%  Suriname 
17  IT  2.3%  14.4%  12.0%  Italy 
18  EG  2.2%  18.8%  16.6%  Egypt 
19  UG  2.1%  18.4%  16.3%  Uganda 
20  AF  2.1%  50.2%  48.1%  Afghanistan 

Table 3 – Falling Use of Google’s Public DNS: May to September ‘13

And the list where use increased over the same period:

Rank  CC  Delta ON MAY% SEP%  Country 
  1  KH  21.7%  9.5%  31.2%  Cambodia 
  2  TN  18.7%  4.3%  23.0%  Tunisia 
  3  EU  17.0%  8.2%  25.2%  European Union* 
  4  DZ  16.1%  37.1%  53.3%  Algeria 
  5  NG  15.7%  29.9%  45.7%  Nigeria 
  6  AM  15.1%  10.0%  25.2%  Armenia 
  7  MW  14.4%  24.7%  39.1%  Malawi 
  8  AW  9.1%  2.8%  11.9%  Aruba 
  9  BD  8.2%  22.1%  30.4%  Bangladesh 
 10  LK  8.2%  3.7%  11.9%  Sri Lanka 
 11  ZW  7.6%  22.1%  29.7%  Zimbabwe 
 12  GH  7.3%  21.3%  28.7%  Ghana 
 13  IQ  6.9%  22.0%  29.0%  Iraq 
 14  MV  6.5%  18.9%  25.5%  Maldives 
 15  BH  5.6%  7.9%  13.6%  Bahrain 
 16  MM  5.5%  11.4%  16.9%  Myanmar 
 17  PH  5.2%  7.0%  12.2%  Philippines 
 18  VN  5.1%  42.0%  47.1%  Vietnam 
 19 DO  4.3%  5.3%  9.6%  Dominican Republic 
 20 AR 4.0% 6.9% 11.0% Argentina 

* The EU entry is an anomaly - some resources in Europe are not geo-located to an individual country, but are listed as the EU region. This entry is not to be confused with the aggregation of all EU countries!

Table 4 – Rising Use of Google’s Public DNS: May to September ‘13

A similar picture can be drawn at the level of networks whose clients have their DNS queries directed to Google’s Public DNS service. Table 5 shows this for the top 20 such networks, using the originating AS as the network indicator, for September 2013.

Rank  AS Count  ALL  MIXED  NO  ASName 
45899 4,449 51.4%  2.0%  46.4%  VNPT-AS-VN VNPT Corp,VN,Vietnam 
7552 1,597 38.6%  1.8%  59.5%  VIETEL-AS-AP Vietel Corporation,VN,Vietnam 
18403 2,560 35.9%  1.1%  62.8%  FPT-AS-AP T, Technology,VN,Vietnam 
4230 505 29.1%  11.4%  59.4%  EMBRATEL-EMPRESA BRASILEIRA DE TELECOMUNIC, Brazil 
34296 440  26.1%  46.5%  27.2%  MILLENICOM-AS MILLENI.COM,DE,Germany 
17762 315  26.0%  22.8%  51.1%  HTIL-TTML-IN-AP Tata Teleservices Maharashtra Ltd,IN,India 
17974 7,162  25.1%  5.4%  69.4%  TELKOMNET-AS2-AP PT Telekomunikasi Indonesia,ID,Indonesia 
3549 529  22.5%  6.4%  71.0%  GBLX Global Crossing Ltd.,US,United States of America 
131090 583  19.9%  15.6%  64.4%  CAT-IDC-4BYTENET-AS-AP ,TH,Thailand 
10  131222 577  19.7%  24.9%  55.2%  MTS-INDIA-IN 334,Udyog Vihar,IN,India 
11  8452 6,612  19.3%  6.5%  74.1%  TE-AS TE-AS,EG,Egypt 
12  8517 452  19.2%  8.8%  71.9%  ULAKNET National Academic Network,TR, Turkey 
13  174 596  19.1%  3.8%  77.0%  COGENT Cogent/PSI, US,United States of America 
14  55824 558  18.6%  14.8%  66.4%  RSMANI-NKN-AS-AP National Knowledge Network,IN,India 
15  17451 455  18.4%  4.8%  76.7%  BIZNET-AS-AP BIZNET NETWORKS,ID,Indonesia 
16  9387 280  17.8%  37.1%  45.0%  AUGERE-PK AUGERE-Pakistan,PK,Pakistan 
17  36947 6,806  17.7%  36.2%  45.9%  ALGTEL-AS,DZ,Algeria 
18  14754 981  17.3%  5.4%  77.2%  Telgua,GT,Guatemala 
19  18101 1,009  16.9%  6.5%  76.5%  Reliance Communications.DAKC MUMBAI,IN,India 
20  20960 269  16.3%  6.6%  76.9%  TKTELEKOM-AS TK Telekom sp. z o.o.,PL,Poland 

Table 5 – Use of Google’s Public DNS by Network: September ‘13

And again its possible to look at those networks where the change in use has varied between May and September. The following two tables show the top 20 networks with falling and rising use.

Rank  AS  Delta OFF  May  Sep  AS Name 
17754  62.4%  75.9%  13.5%  EXCELL-AS Excellmedia,IN,India 
15975  46.9%  57.9%  10.9%  Hadara,PS,Occupied Palestinian Territory 
14754  14.7%  37.4%  22.7%  Telgua,GT,Guatemala 
38547  14.2%  38.9%  24.7%  WITRIBE PAKISTAN,PK,Pakistan 
10620  13.2%  16.2%  3.0%  Telmex Colombia S.A.,CO,Colombia 
45609  10.7%  12.4%  1.7%  BHARTI-AS Bharti Airtel,IN,India 
36423  7.6%  18.0%  10.3%  SAN-JUAN-CABLE,PR,Puerto Rico 
45595  7.4%  14.2%  6.8%  Pakistan Telecom Company,PK,Pakistan 
34984  6.9%  20.4%  13.4%  Tellcom Iletisim Hizmetleri,TR,Turkey 
10  47524  6.8%  19.3%  12.4%  TURKSAT-AS Turksat,TR,Turkey 
11  12978  6.8%  21.6%  14.8%  DOGAN-ONLINE,TR,Turkey 
12  4780  6.3%  21.5%  15.1%  SEEDNET Digital United Inc.,TW,Taiwan 
13  34569  6.2%  6.2%  0.0%  NETWORX-BG Networx-Bulgaria,BG,Bulgaria 
14  44957  5.5%  15.9%  10.4%  OPITEL Vodafone Omnitel N.V.,IT,Italy 
15  131090  5.4%  41.0%  35.5%  CAT-IDC-4BYTENET-AS-AP ,TH,Thailand 
16  47331  5.3%  18.6%  13.3%  TTNET TTNet A.S.,TR,Turkey 
17  8612  5.1%  14.2%  9.1%  TISCALI-IT Tiscali Italia S.P.A.,IT,Italy 
18  9498  4.7%  24.7%  20.0%  BBIL-AP BHARTI Airtel Ltd.,IN,India 
19  9121  4.7%  19.2%  14.5%  TTNET Turk Telekomunikasyon,TR,Turkey 
20  8452  4.4%  30.3%  25.8%  TE-AS TE-AS,EG,Egypt 

Table 6 – Falling Use of Google’s Public DNS by Network: May to September ‘13

Rank  AS  Delta ON  May  Sep  AS Name 
45356  64.7%  0.2%  64.9%  MOBITEL-LK,,LK,Sri Lanka 
2609  24.7%  5.4%  30.1%  Tunisia BackBone AS,TN,Tunisia 
6648  18.6%  26.8%  45.5%  Bayan Telecommunications,PH,Philippines 
36947  17.3%  36.6%  54.0%  ALGTEL-AS,DZ,Algeria 
131222  13.8%  30.9%  44.7%  MTS-INDIA, Udyog Vihar,IN,India 
16637  11.2%  6.2%  17.5%  MTNNS-AS,ZA,South Africa 
18403  9.0%  28.1%  37.1%  FPT-AS-AP,VN,Vietnam 
12066  7.8%  2.7%  10.5%  TRICOM,DO,Dominican Republic 
10029  7.8%  7.8%  15.6%  Citycomnetworks-As Citycom,IN,India 
10  11664  6.9%  12.7%  19.6%  Techtel LMDS,AR,Argentina 
11  10139  6.0%  1.4%  7.4%  Smart Broadband, Inc.,PH,Philippines 
12  6939  5.8%  5.6%  11.5%  Hurricane Electric, Inc.,US,United States 
13  8997  5.7%  8.7%  14.5%  Rostelecom,RU,Russian Federation 
14  4755  5.2%  14.1%  19.3%  TATA Communications,IN,India 
15  55824  5.0%  28.4%  33.5%  RSMANI, National Knowledge Net,IN,India 
16  45899  4.5%  49.0%  53.5%  VNPT-AS-VN VNPT Corp,VN,Vietnam 
17  6503  4.1%  4.4%  8.6%  Axtel, S.A.B. de C.V.,MX,Mexico 
18  10292  4.1%  2.5%  6.7%  CWJAM ASN-CWJAMAICA,JM,Jamaica 
19  7303  4.0%  5.5%  9.5%  Telecom Argentina S.A.,AR,Argentina 
20  9829  3.9%  4.4%  8.4%  BSNL National Internet Backbone,IN,India 

Table 7 – Rising Use of Google’s Public DNS by Network: May to September ‘13

Conclusions

There is no doubt in the value of Google’s public DNS service.

It’s a welcome step to see a DNS resolution service take DNS security seriously, and validate the responses that they pass back to their clients. It’s also a welcome step to see a very large scale DNS service operate using dual stack capabilities. The Google service operates with integrity and does not appear to filter the DNS in arbitrary ways. And it’s well engineered, so it’s fast and reliable. And it’s free. So these are all good reasons to use the service.

But of course TNSTAAFL, and while there is no specific information from Google as to how the p-DNS data might be used by the company, there is no doubt that a real time feed of the online activities of some 7% of the entire Internet user base is a rich vein of information, and this data stream could be added to the existing corporate information sets to add to the accuracy of the individual profiles that fuel their advertising business. Whether the same information is accessible to various US government agencies, and under what terms, is not something that appears to have been mentioned in the recent disclosures.

For some, this may be an acceptable tradeoff of some level of information about their online use in exchange for service. For others such an exchange may be a step too far. And for others the decision has been placed out of their hands, as their service provider may have decided to use Google’s service in any case. But in the morass of the other issues with the DNS, including the various forms of exploitation and attack, and the ongoing issues with the DNS being perverted to perform massive DOS attacks, the various forms of use of DNS-like names in differing contexts, new and old TLDs, colliding names, IDNs and every other topic that forms the universe of DNS discourse, its still really encouraging for me to see that there are still some folk are talking high quality DNS resolution performance seriously!

















Disclaimer

The views expressed are the authors' and not those of APNIC, unless APNIC is specifically identified as the author of the communication. APNIC will not be legally responsible in contract, tort or otherwise for any statement made in this publication.

About the Author

 
 
GEOFF HUSTON B.Sc., M.Sc., has been closely involved with the development of the Internet for many years, particularly within Australia, where he was responsible for the initial build of the Internet within the Australian academic and research sector. He is author of a number of Internet-related books, and has been active in the Internet Engineering Task Force for many years.

www.potaroo.net