Field Note · Nov 2025 · 8 min read

Rebuilding an OT Network Without Downtime

Replacing or significantly upgrading OT infrastructure while keeping production running requires parallel systems, careful sequencing, and fallback plans at every step.

Cascadia OT Security

Founder · Managing Principal · CISSP · GICSP

[Figure: Attack timeline. T+0 Initial Access, T+12h Discovery, T+3d Lateral Move, T+14d OT Pivot, T+84d Detonation. Dwell time: 84 days.]

Major network upgrades (replacing aging switches, restructuring for segmentation, migrating to new firewall architectures) have historically required production downtime measured in days. Modern manufacturing cannot absorb that much downtime. The challenge is to execute complex network changes while production keeps running, validate the new infrastructure thoroughly before cutting over, and keep a rapid rollback plan ready in case anything goes wrong.

The key principle: build the new network alongside the old one, run both in parallel until you are confident the new one is stable, then migrate in the smallest possible increments. Each increment should be reversible without affecting production.

Parallel Infrastructure Phase

Deploy the new network infrastructure (new switches, firewalls, cabling, and IP addressing scheme) without touching the existing network. Commission it separately, using isolated test traffic. This lets you validate configuration details, failover scenarios, and performance characteristics before any production system depends on it.

Use this validation phase to catch configuration errors, identify protocol incompatibilities, and train your operations team on the new infrastructure. If something is broken, you discover it while production is running on the old network and you have time to fix it.
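One way to make part of this validation repeatable is a small reachability check run against test devices on the isolated network. The sketch below is illustrative only: the endpoint addresses (taken from the reserved documentation range) and the function names are assumptions, not part of any specific toolchain, and a successful TCP connection says nothing about whether the protocol behaves correctly on top of it.

```python
import socket

# Hypothetical test endpoints on the isolated new network.
# Addresses come from the 192.0.2.0/24 documentation range; replace with your own.
TEST_ENDPOINTS = [
    ("192.0.2.10", 502),    # Modbus/TCP simulator (assumed)
    ("192.0.2.11", 44818),  # EtherNet/IP simulator (assumed)
]


def probe_tcp(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def unreachable(endpoints, probe=probe_tcp):
    """Return the endpoints that failed the probe, in their original order."""
    return [(host, port) for host, port in endpoints if not probe(host, port)]
```

Calling unreachable(TEST_ENDPOINTS) after each configuration change gives a quick list of what to investigate. The probe argument is injectable so the logic can be exercised without touching a real network.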

Parallel Operation and Verification

Staged Migration and Rollback

Begin cutover with the lowest-impact production zone or process. Route that zone's traffic exclusively over the new network and let it run through at least two complete production cycles (48 hours for many facilities) without incident. If problems arise, fall back to the old network immediately. A single zone is much easier to debug than an entire facility.
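The soak rule for a migrated zone can be written down as a small decision function so that nobody has to argue about it at 3 a.m. This is a minimal sketch under stated assumptions: a 24-hour production cycle (so two cycles give the 48 hours mentioned above), a simple incident count as the only health signal, and hypothetical names throughout.

```python
from datetime import datetime, timedelta


def soak_decision(
    cutover_at: datetime,
    now: datetime,
    incident_count: int,
    cycle: timedelta = timedelta(hours=24),  # assumed production cycle length
    required_cycles: int = 2,
) -> str:
    """Decide what to do with a zone that is running on the new network.

    Returns "rollback" if any incident was recorded, "proceed" once the
    zone has run incident-free for the required number of cycles, and
    "hold" while the soak period is still in progress.
    """
    if incident_count > 0:
        return "rollback"
    if now - cutover_at >= cycle * required_cycles:
        return "proceed"
    return "hold"
```

For example, a zone cut over at 06:00 with no incidents returns "hold" at hour 47 and "proceed" at hour 48. Any incident returns "rollback" regardless of elapsed time. What counts as an incident is a site-specific judgment that this sketch does not try to capture.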

Once the first zone is stable, migrate the next lowest-impact zone. Continue zone by zone until all production is on the new network. If you reach a point where migration is too risky, stop and maintain parallel operation indefinitely. The cost of running parallel infrastructure is usually lower than the risk of an incomplete cutover.
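The zone-by-zone ordering can be sketched as a simple planning step. The impact scores, zone names, and threshold below are hypothetical. In practice the ranking would come from your own process risk assessment, and treating "too risky" as a single numeric cutoff is a simplification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Zone:
    name: str
    impact: int  # hypothetical score: 1 (lowest impact) to 5 (highest)


def plan_migration(zones, max_impact):
    """Split zones into a migration order and a stay-on-parallel list.

    Zones are ordered from lowest to highest impact. Zones whose impact
    exceeds max_impact are not migrated and remain on parallel operation.
    """
    ordered = sorted(zones, key=lambda z: z.impact)
    migrate = [z.name for z in ordered if z.impact <= max_impact]
    keep_parallel = [z.name for z in ordered if z.impact > max_impact]
    return migrate, keep_parallel


zones = [
    Zone("packaging", 2),
    Zone("utilities", 1),
    Zone("reactor", 5),
    Zone("filling", 3),
]
migrate, keep_parallel = plan_migration(zones, max_impact=3)
# migrate == ["utilities", "packaging", "filling"]
# keep_parallel == ["reactor"]
```

Each zone in the migration list would still pass through its own soak period before the next one starts. The plan only fixes the order.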

Decommissioning the old network is the final step. Do it only after the new network has been stable for at least a month and all rollback paths have been closed. Even then, keep the old equipment for at least six months as spare parts in case late failures occur.

If you're planning major network infrastructure changes, reach out to discuss migration strategy for your facility.

About the author

This article was written by the Cascadia OT Security practice, which advises Pacific Northwest data centers and manufacturers on industrial cybersecurity. For engagement inquiries, reach our practice team.
