Millions Affected: Unpacking the Cloudflare 1.1.1.1 DNS Outage of July 2025

On July 14, 2025, Cloudflare’s popular 1.1.1.1 DNS resolver suffered a 62-minute global outage due to an internal configuration error, not a cyberattack. This post examines what happened, the impact on Internet users worldwide, and the lessons for DNS resiliency and best practices.

On July 14, 2025, Cloudflare experienced a global outage of its 1.1.1.1 DNS resolver that lasted about 62 minutes. Because 1.1.1.1 is one of the most widely used public DNS servers (handling hundreds of billions of queries daily), the outage effectively cut off name resolution for millions of users worldwide. In practical terms, affected users suddenly found that “basically all Internet services were unavailable” until Cloudflare restored the service. Cloudflare quickly clarified that the outage was caused by an internal misconfiguration – not a cyberattack or BGP hijack – but the incident underscores how fragile Internet routing and DNS can be. In this article, we walk through the background of the 1.1.1.1 service, the timeline and root cause of the outage, its global impact, and key takeaways for DNS service reliability.

Background: Cloudflare’s 1.1.1.1 DNS Resolver

Cloudflare launched 1.1.1.1 in 2018 as a fast, privacy‑focused public DNS resolver. Over the years it has become extremely popular – by 2020 it was already handling over 200 billion DNS queries per day, making it the world’s second-largest public resolver behind Google. Like many CDN and DNS providers, Cloudflare uses anycast routing to serve traffic from data centers around the world. In anycast, Cloudflare announces the same IP prefix from many global locations, so a user’s DNS request is routed to the nearest available data center. This approach greatly improves performance and capacity, but it also means that changes to the advertised IP prefixes must be handled carefully: if the network withdraws those routes by mistake, the addresses become unreachable everywhere at once. In short, 1.1.1.1 is an anycast address; if Cloudflare stops announcing it, virtually no one in the world can reach it until routes are restored.
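To make that failure mode concrete, here is a deliberately tiny sketch of anycast route selection: several sites announce the same prefix, a client reaches whichever announcing site is nearest, and withdrawing the announcement everywhere leaves no reachable destination at all. The site names and metrics are invented for illustration and have nothing to do with Cloudflare’s real topology.

```python
# Toy model of anycast: the same prefix is announced from many sites, and a
# client's query lands at whichever announcing site is "closest" (lowest
# metric here). All names and numbers are invented, not Cloudflare's topology.
from __future__ import annotations

SITES = {"FRA": 12, "IAD": 85, "SIN": 190}   # site -> distance metric (e.g. latency in ms)


def pick_site(announcing: set[str]) -> str | None:
    """Return the closest site still announcing the prefix, or None if withdrawn."""
    candidates = {site: metric for site, metric in SITES.items() if site in announcing}
    if not candidates:
        return None                          # no announcements anywhere -> unreachable
    return min(candidates, key=candidates.get)


print(pick_site({"FRA", "IAD", "SIN"}))      # 'FRA': the nearest announcing site answers
print(pick_site({"IAD", "SIN"}))             # 'IAD': traffic shifts when one site withdraws
print(pick_site(set()))                      # None: global withdrawal, as on July 14
```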

What Happened on July 14, 2025

Cloudflare’s post‑mortem shows that the outage was triggered by a human error in its configuration system. On June 6, 2025, engineers made a change preparing a new Data Localization Suite (DLS) service (a feature that restricts services to certain regions). Unfortunately, that change inadvertently included the IP prefixes for 1.1.1.1 in the DLS configuration. Because the new DLS service was not yet live, this mistake had no effect at the time – no routes changed and no traffic was impacted (so no alarms went off). The misconfiguration simply sat dormant in the system.

The problem came on July 14, 2025. The Cloudflare team again updated the (still non-production) DLS setup, this time adding a test location to the service. That change triggered Cloudflare’s network to recompute its global routing configuration. Due to the earlier error, the 1.1.1.1 prefixes were accidentally bound to the inactive DLS service. In effect, Cloudflare’s control plane began treating the 1.1.1.1 address as if it only belonged in one offline location. This immediately withdrew the 1.1.1.1 prefixes from all production data centers worldwide.
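The mechanism can be illustrated with a deliberately simplified sketch: if a prefix is bound to a service whose only location is offline or pre-production, a global recomputation yields an empty announcement set for that prefix, which is exactly a worldwide withdrawal. The data structures, service names, and the placeholder 203.0.113.0/24 prefix below are invented for illustration; this is not Cloudflare’s actual control-plane logic.

```python
# Hypothetical sketch of a control plane that recomputes, for each prefix,
# which data centers should announce it. A prefix bound to a service whose
# only location is offline/pre-production ends up with an empty announcement
# set, i.e. a global withdrawal. All names and structures are invented.
SERVICES = {
    "public-resolver": {"locations": {"FRA", "IAD", "SIN"}, "online": True},
    "dls-preprod":     {"locations": {"TEST-LOC"},          "online": False},
}

PREFIX_BINDINGS = {
    "1.1.1.0/24":     "dls-preprod",      # the accidental binding introduced on June 6
    "1.0.0.0/24":     "dls-preprod",
    "203.0.113.0/24": "public-resolver",  # placeholder prefix bound to a healthy service
}


def recompute_announcements(bindings, services):
    """Map each prefix to the set of locations that should announce it."""
    plan = {}
    for prefix, service_name in bindings.items():
        service = services[service_name]
        # Only services whose locations are online get announced anywhere.
        plan[prefix] = service["locations"] if service["online"] else set()
    return plan


for prefix, locations in recompute_announcements(PREFIX_BINDINGS, SERVICES).items():
    print(prefix, "->", sorted(locations) or "WITHDRAWN EVERYWHERE")
```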

The sequence of events was fast but can be summarized:

  • 2025-07-14 21:48 UTC – Engineers apply the second configuration change. 1.1.1.1 routes are withdrawn globally (due to the June 6 error).
  • 2025-07-14 21:52 UTC – DNS requests to 1.1.1.1 begin failing worldwide. Users lose the ability to resolve domain names, causing broad Internet outages.
  • 2025-07-14 22:01 UTC – Monitoring systems detect a sudden drop in DNS traffic. An incident is declared and engineers begin troubleshooting.
  • 2025-07-14 22:20 UTC – Cloudflare reverts the faulty configuration. BGP prefixes for 1.1.1.1 are re-announced to the Internet.
  • 2025-07-14 22:54 UTC – Service is fully restored as routes propagate and cached configurations update. DNS query volumes return to normal levels.

These details come from Cloudflare’s incident report and confirm that no external actor caused the problem – it was entirely an internal misconfiguration.

Global Impact: DNS Queries Collapsed

With Cloudflare’s 1.1.1.1 routes gone from the global routing tables, all legitimate DNS traffic to those addresses instantly had no destination. Any user who had set their DNS server to 1.1.1.1 or its secondary (1.0.0.1) could no longer resolve any names. In practical terms, people all around the world discovered their Internet was “down” – websites wouldn’t load, email couldn’t resolve addresses, and so on. Cloudflare noted that “for many users… not being able to resolve names using the 1.1.1.1 Resolver meant that basically all Internet services were unavailable”.

Traffic graphs during the event show a steep drop in DNS queries. For example, as soon as the prefixes disappeared, queries over UDP, TCP, and DNS-over-TLS (DoT) to 1.1.1.1 collapsed to near zero. (Most systems use the IP address directly, so when the address becomes unreachable, the queries fail immediately.) Cloudflare’s metrics clearly captured this – almost all query traffic vanished after 21:52 UTC. At that point internal alerts fired, and engineers raced to diagnose why the resolver had gone dark.
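As an illustration of the kind of external check that surfaces such a collapse, below is a minimal probe loop that queries 1.1.1.1 at a fixed interval and alerts when the recent success rate falls below a threshold. It assumes the third-party dnspython package; the window size, threshold, and interval are arbitrary example values, not Cloudflare’s internal alerting rules.

```python
# Minimal resolver health probe: repeatedly query 1.1.1.1 and alert when the
# recent success rate collapses. Requires dnspython (pip install dnspython).
# The thresholds and intervals below are arbitrary example values.
import time
from collections import deque

import dns.exception
import dns.resolver

resolver = dns.resolver.Resolver(configure=False)
resolver.nameservers = ["1.1.1.1"]
resolver.lifetime = 2.0                     # give up on a query after 2 seconds

window = deque(maxlen=30)                   # remember the last 30 probe results


def probe(name="example.com"):
    """Return True if 1.1.1.1 answered an A query for the probe name."""
    try:
        resolver.resolve(name, "A")
        return True
    except dns.exception.DNSException:
        return False


while True:
    window.append(probe())
    success_rate = sum(window) / len(window)
    if len(window) == window.maxlen and success_rate < 0.5:
        print(f"ALERT: 1.1.1.1 success rate {success_rate:.0%} over the last 30 probes")
    time.sleep(10)
```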

Query rate (per second) for 1.1.1.1 by protocol during the outage. UDP, TCP and DoT traffic dropped to almost zero, whereas DNS-over-HTTPS traffic (via cloudflare-dns.com) stayed steady.

Interestingly, DNS-over-HTTPS (DoH) traffic was largely unaffected. That’s because most DoH clients use the domain cloudflare-dns.com rather than hard-coded IPs, and that domain continued to be served via different, unaffected IP prefixes. Similarly, some UDP DNS queries continued to work if they hit Cloudflare IPs outside the affected ranges. But traditional DNS traffic sent directly to 1.1.1.1 over UDP, TCP, or DoT effectively halted worldwide. This explains why some people found that DoH (or switching to another provider’s resolver, such as Google’s) still worked, while “normal” DNS lookups against 1.1.1.1 failed.
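The difference is easy to see from a client’s point of view. The sketch below sends the same query two ways: classic UDP DNS straight to the 1.1.1.1 address (the path that failed during the outage) and DNS-over-HTTPS via Cloudflare’s documented JSON endpoint at cloudflare-dns.com (the path that generally kept working). It assumes the third-party dnspython and requests libraries.

```python
# Compare a classic UDP lookup against 1.1.1.1 with a DoH lookup made via the
# cloudflare-dns.com hostname. During the outage the first path failed while
# the second generally kept working. Requires dnspython and requests.
import dns.message
import dns.query
import requests

NAME = "example.com"

# 1) Classic DNS: a UDP packet sent straight to the anycast IP address.
try:
    query = dns.message.make_query(NAME, "A")
    reply = dns.query.udp(query, "1.1.1.1", timeout=2)
    print("UDP via 1.1.1.1:", [rrset.to_text() for rrset in reply.answer])
except Exception as exc:
    print("UDP via 1.1.1.1 failed:", exc)

# 2) DNS-over-HTTPS: an HTTPS request to the cloudflare-dns.com hostname,
#    which is served from prefixes that were not withdrawn in this incident.
resp = requests.get(
    "https://cloudflare-dns.com/dns-query",
    params={"name": NAME, "type": "A"},
    headers={"accept": "application/dns-json"},
    timeout=5,
)
print("DoH via cloudflare-dns.com:", resp.json().get("Answer"))
```

Of course, a DoH client still has to reach cloudflare-dns.com in the first place; during this incident that hostname resolved to prefixes outside the affected ranges, which is why the DoH path kept working.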

Affected IP Prefixes

The outage impacted not just 1.1.1.1/32 but all of Cloudflare’s DNS resolver address space. In Cloudflare’s own words, “any traffic coming to Cloudflare via 1.1.1.1 Resolver services on these IPs was impacted”:

  • IPv4: 1.1.1.0/24, 1.0.0.0/24, 162.159.36.0/24, 162.159.46.0/24, 172.64.36.0/24, 172.64.37.0/24, 172.64.100.0/24, 172.64.101.0/24
  • IPv6: 2606:4700:4700::/48, 2606:54c1:13::/48, 2a06:98c1:54::/48

Any DNS query sent to those networks simply had nowhere to go. For context, these ranges cover not just the well-known 1.1.1.1 and 1.0.0.1 addresses, but also their IPv6 equivalents and other related Cloudflare DNS blocks. The result was broad – often global – impact, as confirmed by multiple independent outage monitors.
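To check whether a resolver address you operate or have configured fell inside these ranges, Python’s standard ipaddress module is enough, as in the short sketch below (the prefix list is exactly the one quoted above).

```python
# Check whether a given resolver IP falls inside the affected prefixes listed
# above, using only the Python standard library.
import ipaddress

AFFECTED = [
    "1.1.1.0/24", "1.0.0.0/24", "162.159.36.0/24", "162.159.46.0/24",
    "172.64.36.0/24", "172.64.37.0/24", "172.64.100.0/24", "172.64.101.0/24",
    "2606:4700:4700::/48", "2606:54c1:13::/48", "2a06:98c1:54::/48",
]
NETWORKS = [ipaddress.ip_network(prefix) for prefix in AFFECTED]


def was_affected(resolver_ip):
    """Return True if the address sits in any of the impacted prefixes."""
    address = ipaddress.ip_address(resolver_ip)
    return any(address in network for network in NETWORKS)


print(was_affected("1.1.1.1"))               # True
print(was_affected("1.0.0.1"))               # True
print(was_affected("2606:4700:4700::1111"))  # True
print(was_affected("8.8.8.8"))               # False (Google DNS, not Cloudflare)
```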

Misconfiguration, Not a Hack or Hijack

In the immediate aftermath, many observers wondered if this was the result of a cyber attack or an Internet routing hack (BGP hijack). After all, there had been notorious BGP incidents in the past that took 1.1.1.1 offline by malicious or accidental announcements. However, Cloudflare was clear that the outage was self-inflicted: “The root cause was an internal configuration error and not the result of an attack or a BGP hijack”. Security news sites echoed this point, noting that the outage was due to a “misconfigured system update from June 6, not a BGP attack”.

That said, a coincidental BGP event did occur during the outage, which briefly confused some analysts. Once Cloudflare’s routes disappeared, an unrelated network (Tata Communications, AS4755, in India) began announcing the 1.1.1.0/24 prefix as if it owned it. From the perspective of many ISPs, this looked like a classic prefix hijack: traffic destined for 1.1.1.1 would be directed to Tata’s routers. However, this was an effect, not a cause. Cloudflare engineers noted that “this BGP hijack was not the cause of the outage” – it simply became visible when Cloudflare’s own advertisements vanished. In other words, the world had no valid route for 1.1.1.1, and Tata’s announcement temporarily became the de facto route. Once Cloudflare reverted the configuration and began re-advertising its routes at around 22:20 UTC, Tata’s announcement was withdrawn and normal routing resumed.
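For operators who want to see this kind of situation for themselves, one option is to ask a public routing data service which origin AS is announcing a prefix. The sketch below queries RIPEstat’s public data API for 1.1.1.0/24; the endpoint is real, but the exact response fields should be verified against RIPEstat’s documentation rather than taken from this example.

```python
# Ask RIPEstat's public data API for the current routing status of
# 1.1.1.0/24, e.g. to see which origin AS(es) announce it. The field names
# printed here should be double-checked against RIPEstat's documentation.
import json

import requests

resp = requests.get(
    "https://stat.ripe.net/data/routing-status/data.json",
    params={"resource": "1.1.1.0/24"},
    timeout=10,
)
resp.raise_for_status()
data = resp.json().get("data", {})

# Print the parts of the payload that describe announcement and origins;
# any key that is not present simply comes back as null.
print(json.dumps(
    {key: data.get(key) for key in ("resource", "announced", "origins", "visibility")},
    indent=2,
))
```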

BGP route announcements for 1.1.1.0/24 during the incident. Cloudflare withdrew its route (blue), making the prefix unreachable. Shortly after, a Tata Communications route (red) briefly appeared, then Cloudflare re-advertised its route (blue) to restore service.

The key point is that no malicious actor took down 1.1.1.1; it was never an external attack. The change management error alone caused the global outage. (It is worth noting that Cloudflare’s 1.1.1.1 service has seen routing incidents before – e.g. the June 2024 1.1.1.1 hijack – but this time the trigger was internal.)

Restoration and Recovery

Once Cloudflare identified the problem, engineers rolled back the change and restored the correct routes. After the revert was initiated at 22:20 UTC, Cloudflare began re-announcing the withdrawn BGP prefixes, including the 1.1.1.0/24 block. Traffic levels jumped back almost immediately, with query volumes recovering to about 77% of normal as networks again had a valid route. (The remaining 23% of traffic took longer because some edge servers had lost their IP bindings and needed a manual update through the change management system.) By 22:54 UTC – roughly 1 hour and 2 minutes after the problem began – all prefixes were fully restored across all locations and DNS service returned to normal levels.

In total, the outage lasted 62 minutes, precisely as Cloudflare reported. Traffic graphs from external monitors (e.g. ThousandEyes) confirm this timeline of withdrawal, recovery, and full stabilization by 22:54.

Lessons Learned and Next Steps

Cloudflare treats this kind of incident very seriously and has already outlined steps to prevent a repeat. The core takeaway is that static, legacy configuration processes were the weak link. The erroneous update had passed peer review, but the deployment system applied it globally all at once. Cloudflare notes that its older system did not allow progressive rollouts – a common safety measure where changes are tested in stages, with health checks before full deployment. In this case, the flawed change immediately affected every data center.

As a result, Cloudflare plans to migrate away from the legacy hard‑coded IP model toward a more flexible “service topology” approach. This new model uses abstract definitions of a service’s regions rather than listing every IP prefix, making it easier to stage updates and catch mistakes early. Specifically, Cloudflare will “deprecate legacy systems” and adopt staged deployment methodologies with health monitoring, so that a future configuration error can be caught in a limited set of canary locations instead of being applied network-wide.
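What a staged rollout with health gating looks like in practice can be sketched in a few lines: apply the change to a small canary slice first, wait, verify health, then expand, aborting and rolling back on any failed check. The stage list, health check, and apply/rollback hooks below are placeholders, not Cloudflare’s deployment system.

```python
# Sketch of a staged (canary-first) rollout with health gating and rollback.
# Stage sizes, soak time, the health check, and the apply/rollback hooks are
# placeholders for whatever a real deployment system provides.
import time

STAGES = [
    ["canary-dc"],                        # one canary location first
    ["dc-eu-1", "dc-us-1"],               # then a small regional slice
    ["dc-eu-2", "dc-us-2", "dc-apac-1"],  # then the rest of the fleet
]


def apply_change(location):
    print(f"applying config to {location}")


def rollback(locations):
    print(f"rolling back {locations}")


def healthy(location):
    # Placeholder: a real check would look at query success rates, error
    # budgets, and route announcements for the location.
    return True


def staged_rollout():
    touched = []
    for stage in STAGES:
        for location in stage:
            apply_change(location)
            touched.append(location)
        time.sleep(1)                     # soak time before evaluating health
        if not all(healthy(location) for location in stage):
            rollback(touched)             # abort and revert everything touched so far
            return False
    return True


print("rollout succeeded:", staged_rollout())
```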

For DNS architects and network operators everywhere, the incident is a reminder: DNS is a critical dependency, and any single point of failure can have broad effects. Even a short disruption of a major resolver can make it look as if the whole Internet is offline. Best practices include using multiple DNS providers (or alternate resolvers) and monitoring traffic patterns, as sketched below. The good news is that Cloudflare’s 1.1.1.1 is just one of several public resolvers, and an outage like this – while painful – was resolved relatively quickly. (By comparison, past DNS incidents affecting other major providers, including Amazon’s Route 53, have also taken major services offline, underscoring that even top providers can err.)
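For clients, the simplest hedge against a single-resolver outage is to configure more than one independent upstream. A minimal fallback sketch using the third-party dnspython library follows; the particular resolvers and their ordering are just an example, not a recommendation.

```python
# Resolve a name by trying several independent public resolvers in order,
# falling through on timeouts or failures. Requires dnspython. The resolver
# list and its ordering are only an example; choose providers per your policy.
import dns.exception
import dns.resolver

UPSTREAMS = ["1.1.1.1", "8.8.8.8", "9.9.9.9"]    # Cloudflare, Google, Quad9


def resolve_with_fallback(name, rdtype="A"):
    """Return the answer records for `name`, trying each upstream in turn."""
    last_error = None
    for server in UPSTREAMS:
        resolver = dns.resolver.Resolver(configure=False)
        resolver.nameservers = [server]
        resolver.lifetime = 2.0                  # fail fast and move on
        try:
            answer = resolver.resolve(name, rdtype)
            return [record.to_text() for record in answer]
        except dns.exception.DNSException as exc:
            last_error = exc                     # try the next provider
    raise RuntimeError(f"all resolvers failed for {name}") from last_error


print(resolve_with_fallback("example.com"))
```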

Going forward, Cloudflare’s promised changes – automated checks, canarying, and rapid rollback capabilities – should greatly reduce the risk of such an incident recurring. DNS and network engineers should take note: even routine configuration updates can have outsized impact, so deployment hygiene and gradual rollouts are essential. As one security analyst summarized, this event “highlighted the complexity of managing anycast routing” and reinforced that careful change management is crucial.

The July 14, 2025 outage of Cloudflare’s 1.1.1.1 DNS resolver was a stark lesson in the importance of robust infrastructure practices. A simple human error in updating service topologies caused a global DNS outage, affecting millions and briefly making large portions of the Internet unreachable. Cloudflare’s analysis and public report make it clear there was no hack or external attack – just a cascading failure set off by a configuration slip-up. The incident underscores that in large-scale anycast networks, even small mistakes can have immediate worldwide consequences.
