DNS Change Management: A Strategy for Zero-Downtime Updates

DNS changes are deceptively simple to make and remarkably easy to get wrong. A single misconfigured record can take down a website, break email delivery, or disrupt services for hours. The difference between a smooth DNS update and an outage comes down to process. This guide provides a strategic framework for managing DNS changes with zero downtime.

Why DNS Changes Need a Process

DNS is cached at every layer of the internet. When you change a record, the old value persists in resolver caches worldwide until the TTL expires. This means DNS changes are not instant, they cannot be easily undone in a hurry, and mistakes have a blast radius that extends well beyond your own infrastructure.

The cost of ad-hoc changes

DNS changes made without a plan are a common cause of self-inflicted outages. Even experienced teams make mistakes when they skip the process. A five-minute change can create a multi-hour incident if you do not have a rollback path.

The DNS Change Management Framework

Every DNS change, no matter how small, should follow a structured process. The level of rigor scales with the risk of the change, but the basic steps remain the same.

Phase 1: Pre-Change Preparation

Before touching any DNS records, complete this checklist:

Document the current state

Record the exact current values of every record you plan to change. Include the record type, name, value, and TTL. Use dig or your DNS provider's API to capture this programmatically. This is your rollback reference.
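
As a minimal sketch of capturing this programmatically, the Python snippet below parses text in the format printed by `dig +noall +answer` into structured records that can be stored as the rollback reference. The `Record` type, the `parse_dig_answer` helper, and the sample values (taken from documentation-reserved address ranges) are illustrative assumptions, not part of any particular tool.

```python
from typing import NamedTuple


class Record(NamedTuple):
    name: str
    ttl: int
    rtype: str
    value: str


def parse_dig_answer(text: str) -> list[Record]:
    """Parse `dig +noall +answer` style output into Record tuples."""
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(";"):
            continue  # skip blank lines and dig comment lines
        # Fields: name, TTL, class, type; everything after that is the value.
        parts = line.split(None, 4)
        if len(parts) < 5:
            continue  # not an answer line
        name, ttl, _cls, rtype, value = parts
        records.append(Record(name, int(ttl), rtype, value))
    return records


# Hypothetical captured output; in practice this would come from running dig.
sample = """\
example.com.\t\t3600\tIN\tA\t192.0.2.10
example.com.\t\t3600\tIN\tMX\t10 mail.example.com.
"""
snapshot = parse_dig_answer(sample)
for record in snapshot:
    print(record)
```

Saving the parsed snapshot alongside the change ticket keeps the type, name, value, and TTL of every record in one place.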

Define the target state

Write down exactly what the records should look like after the change. Specify every field. Ambiguity here leads to errors during implementation.

Identify dependencies

Determine what services depend on the records you are changing. An A record change affects web traffic. An MX record change affects email. A TXT record change can break email authentication (SPF, DKIM, DMARC) or domain verification. Map the full impact.

Assess the risk level

Rate the change: low risk (adding a new subdomain), medium risk (changing an existing record's value), or high risk (changing NS records, migrating providers, modifying root domain records). Scale your process accordingly.
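
The rating scheme above can be sketched as a small function. This is only an approximation under stated assumptions: the `assess_risk` name and its parameters are hypothetical, and a provider migration cannot be detected from a single record, so that case still needs to be flagged by a person.

```python
def assess_risk(record_type: str, name: str, zone: str, is_new: bool) -> str:
    """Return 'low', 'medium', or 'high' for a single planned record change."""
    is_apex = name.rstrip(".").lower() == zone.rstrip(".").lower()
    if record_type.upper() == "NS" or is_apex:
        return "high"    # delegation changes and root-domain records
    if is_new:
        return "low"     # a new subdomain that nothing depends on yet
    return "medium"      # changing the value of an existing record


print(assess_risk("A", "blog.example.com", "example.com", is_new=True))
print(assess_risk("A", "www.example.com", "example.com", is_new=False))
print(assess_risk("MX", "example.com", "example.com", is_new=False))
```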

Plan the timing

Schedule changes during low-traffic periods when possible. Avoid making DNS changes on Fridays, before holidays, or before other planned maintenance windows.

Phase 2: TTL Management

TTL (Time to Live) management is the single most important technique for safe DNS changes. The TTL determines how long resolvers cache a record before checking for updates.

Lowering TTLs before a change:

The standard approach is to lower the TTL on records you plan to change well in advance of the actual change. This ensures that when you make the change, the old cached values expire quickly.

Scenario	Recommended TTL	When to Lower the TTL
Routine record update	300 seconds (5 min)	24 hours before change
Migration to new server	60 seconds (1 min)	48 hours before change
DNS provider migration	60 seconds (1 min)	48-72 hours before change
Emergency change	Cannot pre-lower	Accept delay equal to current TTL

Why 48 hours?

You need to lower the TTL at least one full current-TTL period before the planned change, because a resolver that cached the record just before you lowered the TTL will keep it for the full old TTL. If your current TTL is 86400 seconds (24 hours), lower it at least 24 hours in advance so that all caches have picked up the new, shorter TTL. An extra 24-hour buffer accounts for resolvers that do not strictly honor TTLs.
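
The lead-time arithmetic can be written out as a short calculation. The function name and the example change time below are assumptions chosen for illustration; the 24-hour default buffer mirrors the reasoning above.

```python
from datetime import datetime, timedelta


def latest_ttl_lowering_time(change_at: datetime,
                             current_ttl_seconds: int,
                             buffer_hours: int = 24) -> datetime:
    """Latest moment to lower the TTL: one full current TTL plus a buffer."""
    lead = timedelta(seconds=current_ttl_seconds) + timedelta(hours=buffer_hours)
    return change_at - lead


# Hypothetical change window: 10 January 2030 at 02:00.
change_at = datetime(2030, 1, 10, 2, 0)
deadline = latest_ttl_lowering_time(change_at, current_ttl_seconds=86400)
print(deadline)  # 2030-01-08 02:00:00, i.e. 48 hours before the change
```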

Raising TTLs after a change:

Once you have verified that the change is working correctly and you no longer need a fast rollback path, raise the TTLs back to their normal values. Higher TTLs reduce query volume to your authoritative servers and improve performance for end users.

Phase 3: Implementation

When it is time to make the change, follow these steps:

Verify TTLs have propagated

Before making the actual change, confirm that the lowered TTLs are being served by querying several public resolvers. If the old high TTL is still showing, wait longer.
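
One way to frame this check: a recursive resolver reports the remaining TTL of its cached copy, so any observed TTL above the lowered value suggests the old, high-TTL copy is still cached. The sketch below assumes you have already collected the observed TTLs (for example with dig); the function name and the observed numbers are made up for illustration.

```python
def resolvers_still_on_old_ttl(observed_ttls: dict[str, int],
                               lowered_ttl: int) -> list[str]:
    """Return resolvers whose cached TTL exceeds the lowered TTL."""
    return sorted(
        resolver
        for resolver, ttl in observed_ttls.items()
        if ttl > lowered_ttl
    )


# Hypothetical remaining TTLs observed at three public resolvers.
observed = {"8.8.8.8": 212, "1.1.1.1": 300, "9.9.9.9": 41873}
waiting_on = resolvers_still_on_old_ttl(observed, lowered_ttl=300)
print(waiting_on)  # ['9.9.9.9']
```

An empty result is a reasonable signal to proceed; a non-empty one means waiting longer.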

Make the change

Apply the new record values. If changing multiple records, consider whether they should be changed simultaneously or in sequence.

Verify the change at the source

Immediately query your authoritative name servers directly to confirm the new values are being served correctly.

Monitor propagation

Check the new values from multiple geographic locations and resolvers. Watch for inconsistencies that might indicate partial propagation or caching issues.
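
A simple way to spot inconsistencies is to group resolvers by the answer they return. The sketch below assumes the answers have already been gathered per resolver; the resolver labels, the `propagation_report` helper, and the addresses (documentation-reserved ranges) are hypothetical.

```python
def propagation_report(answers, old_values, new_values):
    """Group resolvers by whether they serve the new, old, or another answer."""
    report = {"updated": [], "stale": [], "unexpected": []}
    for resolver in sorted(answers):
        values = set(answers[resolver])
        if values == set(new_values):
            report["updated"].append(resolver)
        elif values == set(old_values):
            report["stale"].append(resolver)
        else:
            # Empty, partial, or unknown answers deserve a closer look.
            report["unexpected"].append(resolver)
    return report


observed = {
    "resolver-eu": ["198.51.100.20"],
    "resolver-us": ["192.0.2.10"],
    "resolver-ap": [],
}
report = propagation_report(observed, ["192.0.2.10"], ["198.51.100.20"])
print(report)
```

Stale entries are expected until the TTL runs out; entries under "unexpected" are the ones that warrant immediate investigation.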

Phase 4: Staged Rollouts

For high-risk changes, a staged rollout reduces the blast radius of any issues. There are several approaches:

Canary records: If you are migrating to a new server, set up a test subdomain (e.g., canary.example.com) pointing to the new target first. Verify it works correctly before changing the production records.

Weighted DNS: Some DNS providers support weighted routing, allowing you to send a percentage of traffic to the new target while keeping the rest on the old one. Gradually shift traffic as you build confidence.

Geographic staging: If your DNS provider supports geo-routing, roll out the change to one region first, monitor for issues, then expand to additional regions.
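
For weighted DNS, it can help to write the ramp down before starting. The following sketch produces the old/new weight pairs for each stage; the step percentages are an assumption for illustration, and how the weights are applied depends entirely on your DNS provider.

```python
def ramp_schedule(steps=(5, 25, 50, 100)):
    """Return (old_weight, new_weight) percentage pairs for each stage."""
    schedule = []
    previous = 0
    for new_pct in steps:
        if not previous < new_pct <= 100:
            raise ValueError("steps must strictly increase and stay within 1-100")
        schedule.append((100 - new_pct, new_pct))
        previous = new_pct
    return schedule


for old_weight, new_weight in ramp_schedule():
    print(f"old target: {old_weight}%  new target: {new_weight}%")
```

Each stage should only begin after the monitoring checks for the previous stage have passed.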

Phase 5: Monitoring During Changes

Active monitoring during and after a DNS change is not optional. You need to watch for:

Resolution failures: Are queries returning the expected results from all locations?
Increased error rates: Are your applications seeing more connection failures or timeouts?
Email delivery: If you changed MX or TXT records, is email still flowing correctly?
Certificate validation: If you changed A/AAAA records, does the new server present a valid TLS certificate for the hostname?
Third-party integrations: Are any services that depend on your DNS (CDNs, load balancers, SaaS tools) still functioning?

Phase 6: Rollback Plan

Every DNS change needs a rollback plan documented before the change begins. Your rollback plan should include:

Exact rollback values

The precise record values to restore, captured during Phase 1. Do not rely on memory.

Rollback trigger criteria

Define in advance what conditions warrant a rollback. For example: resolution failures exceeding 1%, application error rate spike, or email bounce rate increase.
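
Writing the trigger criteria as code removes ambiguity during the change window. In the sketch below, the 1% resolution-failure threshold comes from the example above, while treating a "spike" or "increase" as twice the baseline is purely an assumption; the metric names are hypothetical as well.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RollbackThresholds:
    max_resolution_failure_rate: float = 0.01  # 1%, from the example criteria
    error_rate_multiplier: float = 2.0         # assumed definition of a spike
    bounce_rate_multiplier: float = 2.0        # assumed definition of an increase


def rollback_reasons(metrics, baseline, thresholds=RollbackThresholds()):
    """Return the list of triggered criteria; an empty list means carry on."""
    reasons = []
    if metrics["resolution_failure_rate"] > thresholds.max_resolution_failure_rate:
        reasons.append("resolution failures above threshold")
    if metrics["app_error_rate"] > baseline["app_error_rate"] * thresholds.error_rate_multiplier:
        reasons.append("application error rate spike")
    if metrics["email_bounce_rate"] > baseline["email_bounce_rate"] * thresholds.bounce_rate_multiplier:
        reasons.append("email bounce rate increase")
    return reasons


baseline = {"app_error_rate": 0.002, "email_bounce_rate": 0.01}
current = {"resolution_failure_rate": 0.03, "app_error_rate": 0.003, "email_bounce_rate": 0.01}
print(rollback_reasons(current, baseline))
```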

Rollback timeline

How long will a rollback take to propagate? This depends on the TTL you set. With a 60-second TTL, rollback takes 1-2 minutes. With a 3600-second TTL, it takes up to an hour.

Responsible party

Who has the authority and access to execute the rollback? Ensure this person is available during the change window.

Phase 7: Post-Change Verification

After the change has propagated and been stable for a period:

Comprehensive resolution check

Verify the new records resolve correctly from at least five different geographic locations and using multiple public DNS resolvers.

Service validation

Confirm all dependent services are functioning: website loads, email delivers, APIs respond, third-party integrations work.

Raise TTLs

Once confident the change is correct and stable, raise TTLs back to production values (typically 3600-86400 seconds).

Document the change

Record what was changed, when, by whom, and any issues encountered. This documentation is invaluable for future changes and incident reviews.
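
A structured log entry makes these records easy to search later. The field names, the example values, and the JSON-lines format below are assumptions for illustration, not a required schema.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class ChangeRecord:
    record_name: str
    record_type: str
    old_value: str
    new_value: str
    changed_at: str                 # ISO 8601 timestamp
    changed_by: str
    issues: list[str] = field(default_factory=list)


entry = ChangeRecord(
    record_name="www.example.com",
    record_type="A",
    old_value="192.0.2.10",
    new_value="198.51.100.20",
    changed_at="2030-01-10T02:00:00Z",
    changed_by="ops-oncall",        # hypothetical operator name
)
log_line = json.dumps(asdict(entry), sort_keys=True)
print(log_line)
```

Appending one such line per change to a shared log gives incident reviews a single timeline to consult.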

Special Considerations

DNS Provider Migrations

Migrating from one DNS provider to another is one of the highest-risk DNS changes. The key steps:

Replicate all records in the new provider before changing NS records
Lower TTLs on all records at the old provider
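Lower the TTL on the zone's own NS records where your provider allows it (the TTL on the delegation in the parent zone is set by the registry and usually cannot be changed)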
Change NS records at the registrar
Keep the old provider active until propagation is complete
Monitor for at least 48-72 hours before decommissioning the old provider
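
Before changing NS records, the first step (replicating all records) can be checked by comparing exports from both providers. The sketch below assumes each export has already been reduced to (name, type, value) tuples; the helper names and sample records are hypothetical. SOA records and apex NS records are skipped because they are expected to differ between providers.

```python
def normalize(records, zone):
    """Lowercase names, drop trailing dots, and skip SOA and apex NS records."""
    apex = zone.lower().rstrip(".")
    cleaned = set()
    for name, rtype, value in records:
        name, rtype = name.lower().rstrip("."), rtype.upper()
        if rtype == "SOA" or (rtype == "NS" and name == apex):
            continue
        cleaned.add((name, rtype, value.strip()))
    return cleaned


def diff_zones(old_records, new_records, zone):
    """Report records missing from, or unexpectedly present in, the new zone."""
    old, new = normalize(old_records, zone), normalize(new_records, zone)
    return {
        "missing_in_new": sorted(old - new),
        "unexpected_in_new": sorted(new - old),
    }


old_provider = [
    ("example.com.", "NS", "ns1.old-provider.example."),
    ("example.com.", "A", "192.0.2.10"),
    ("example.com.", "MX", "10 mail.example.com."),
]
new_provider = [
    ("example.com", "NS", "ns1.new-provider.example."),
    ("EXAMPLE.com", "A", "192.0.2.10"),
]
result = diff_zones(old_provider, new_provider, "example.com")
print(result)
```

Here the MX record would be reported as missing, which is exactly the kind of gap that breaks email after a migration.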

Nameserver Changes

Changing NS records affects the entire zone. This is not a record-level change; it is a delegation change made through the registrar. NS changes can take 24-48 hours to fully propagate regardless of the TTLs in your own zone, because resolvers cache the delegation from the parent (TLD) zone, and the TTL on those records is set by the registry rather than by you.

Emergency Changes

Sometimes you need to make an urgent DNS change without the luxury of pre-lowering TTLs. In this case:

Accept that propagation will take as long as the current TTL
Make the change immediately at the authoritative server
Proactively flush caches where possible (some public resolvers, such as Google Public DNS and Cloudflare's 1.1.1.1, offer a cache-flush tool)
Communicate the expected timeline to stakeholders

Change Management Checklist

Use this quick checklist for every DNS change:

Before the change

Current state documented
Target state defined
Dependencies mapped
TTLs lowered
Rollback plan written
Change window scheduled
Monitoring in place

During the change

Records updated
Authoritative response verified
Propagation monitored
Services validated
Team informed

After the change

Full propagation confirmed
All services healthy
TTLs restored
Change documented
Rollback plan archived

A disciplined approach to DNS change management is the difference between seamless updates and preventable outages. The best time to build this process is before your next change.
