Consider a 48-MW data center campus where the Building Management System (BMS) sits on the same network segments as corporate IT. Over 14 weeks, a program of work can move that BMS to a formally segmented, continuously monitored OT enclave — without a single moment of BMS downtime. This is the plan we bring to that engagement.
The facility profile in this playbook is illustrative. The sequence, controls, and change windows below reflect how we would run the program for a real operator; they are not drawn from any single prior client.
Typical starting state
The starting posture this playbook assumes, representative of the facilities we serve:
- Approximately 340 BMS devices: automatic transfer switches, CRAC units, chillers, UPS monitoring, leak detection, and fire panel integrations
- A supervisor running on a Windows Server that has not been patched in 14 months
- A single flat VLAN shared with physical security camera traffic and DCIM sensors
- Three integrator remote-access pathways, two using shared credentials
- Inter-VLAN routing to corporate subnets with no firewall filtering
Leadership is usually aware of the issues in general terms but has been unable to get the project scoped: the perceived risk of change appears to outweigh the risk of leaving the architecture in place. The playbook below is built for exactly that deadlock.
The operational constraint
A facility of this profile hosts tenant workloads with strict SLAs. Any BMS interruption that affects cooling or power delivery would breach those SLAs at substantial cost. The operator's explicit requirement is zero BMS downtime during transition. Secondarily, zero surprises — every change staged, reviewed, and reversible.
Approach
We would break the engagement into four phases.
Phase 1: Discovery (weeks 1–2)
Physical and logical inventory. We walk every IDF (intermediate distribution frame). We map every BMS device to its controller, its VLAN, its integrator, and its remote-access pathway. Where records and reality disagree, we trust reality and update the records. By the end of week 2, the operator typically has the first accurate BMS asset list the facility has ever had.
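Where helpful, we script the reconciliation. The sketch below shows the shape of that step. It assumes two CSV exports, one from the existing records and one from the walk-down and network scan; the file names and the mac/ip/vlan columns are illustrative placeholders, since real exports vary by BMS and DCIM vendor.

```python
import csv

# Hypothetical file names and column layout; real exports differ by vendor.
RECORD_FILE = "bms_records.csv"          # what the documentation says exists
DISCOVERED_FILE = "bms_discovered.csv"   # what the walk-down and scan found

def load_inventory(path):
    """Index devices by MAC address (or another stable identifier)."""
    with open(path, newline="") as f:
        return {row["mac"]: row for row in csv.DictReader(f)}

records = load_inventory(RECORD_FILE)
discovered = load_inventory(DISCOVERED_FILE)

# Devices on the wire that the records never mention: the usual surprises.
for mac in discovered.keys() - records.keys():
    d = discovered[mac]
    print(f"UNDOCUMENTED  {mac}  {d['ip']}  vlan={d['vlan']}")

# Documented devices that discovery could not find: stale records or dead gear.
for mac in records.keys() - discovered.keys():
    print(f"MISSING       {mac}  last recorded ip={records[mac]['ip']}")

# Where both exist but disagree, trust reality and flag the record for update.
for mac in records.keys() & discovered.keys():
    if records[mac]["vlan"] != discovered[mac]["vlan"]:
        print(f"VLAN DRIFT    {mac}  recorded={records[mac]['vlan']}"
              f" actual={discovered[mac]['vlan']}")
```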
Phase 2: Parallel-build the new enclave (weeks 3–6)
Rather than rearchitect in place, we build the new segmented network in parallel. New dedicated switching. New firewall enforcing an explicit allow-list between BMS and corporate. New jump host for integrator access with MFA, session recording, and per-integrator credentials. New supervisor host, patched and hardened.
None of this is connected to production yet. The entire new enclave is validated in a staging configuration against a small set of test BMS devices, either donated from the facility's spare inventory or sourced from the integrator.
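To make the allow-list idea concrete, here is a minimal sketch of how candidate rules can be expressed and sanity-checked before anything is pushed to the firewall. The addresses, ports, and rules are illustrative placeholders, not a recommended ruleset (BACnet/IP's well-known port is UDP 47808; everything else here is assumed), and actual rule syntax depends on the firewall vendor.

```python
import ipaddress

# Illustrative allow-list: (source net, destination net, protocol, port).
# Default is deny; anything not matched here is dropped and logged.
ALLOW = [
    ("10.20.0.0/24", "10.10.5.10/32", "udp", 47808),  # controllers -> supervisor (BACnet/IP)
    ("10.30.1.5/32", "10.10.5.10/32", "tcp", 443),    # jump host -> supervisor web UI
]

def is_allowed(src, dst, proto, port):
    """Return True only if a flow matches an explicit allow rule."""
    for rule_src, rule_dst, rule_proto, rule_port in ALLOW:
        if (ipaddress.ip_address(src) in ipaddress.ip_network(rule_src)
                and ipaddress.ip_address(dst) in ipaddress.ip_network(rule_dst)
                and proto == rule_proto and port == rule_port):
            return True
    return False

# A corporate host probing the supervisor over RDP falls through to deny.
assert not is_allowed("10.99.0.15", "10.10.5.10", "tcp", 3389)
assert is_allowed("10.20.0.44", "10.10.5.10", "udp", 47808)
```

Reviewing rules as data in staging, before they ever touch the enforcement point, is part of what makes the later cutover windows boring.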
Phase 3: Staged cutover (weeks 7–12)
BMS devices are migrated in small groups — typically 15 to 25 devices per change window, always during low-load hours, always with a documented rollback plan. Each migration is tested against the supervisor before the next group is scheduled. The integrator is briefed on each change and participates in cutover verification.
At 340 devices and 15 to 25 per window, expect roughly fifteen to twenty formal change windows over the six weeks. The overwhelming majority proceed as planned. A small number will exercise the rollback procedure, commonly triggered by unexpected dependencies on legacy Modbus bridges or undocumented integrations that do not surface during discovery. That is why the approach is staged and reversible: the rollback is the point, not a failure mode.
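For the arithmetic behind that estimate, and the shape of a window plan, a minimal sketch (the 20-device batch size, device identifiers, and rollback document paths are all illustrative):

```python
from dataclasses import dataclass

@dataclass
class ChangeWindow:
    """One staged migration: a device batch plus its rollback plan."""
    window_id: int
    devices: list        # device identifiers moving in this window
    rollback_doc: str    # path to the documented rollback procedure
    verified: bool = False  # set True only after supervisor-side checks pass

def plan_windows(device_ids, batch_size=20):
    """Chunk the Phase 1 asset list into fixed-size change windows."""
    return [
        ChangeWindow(i, device_ids[start:start + batch_size],
                     rollback_doc=f"rollback/window_{i}.md")
        for i, start in enumerate(range(0, len(device_ids), batch_size), start=1)
    ]

windows = plan_windows([f"bms-{n:03d}" for n in range(340)])
print(len(windows), "windows of up to 20 devices")  # -> 17 windows
```

At a batch size of 20, the 340-device list produced in Phase 1 yields 17 windows, roughly three per week across the six-week phase.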
Phase 4: Decommission and transfer (weeks 13–14)
The old flat VLAN is decommissioned. Shared integrator credentials are revoked. Final penetration testing against the new enclave is conducted from both inside and outside the boundary. Remaining findings are remediated before handoff.
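Part of that testing is mechanical: confirming that paths the allow-list denies are actually dead. A minimal sketch of the negative check, run from a corporate-side host; the supervisor address and port list are illustrative.

```python
import socket

SUPERVISOR = "10.10.5.10"  # illustrative enclave address
# Flows the allow-list should block from the corporate side:
# RDP, SMB, and Modbus/TCP toward the supervisor.
DENIED = [(SUPERVISOR, 3389), (SUPERVISOR, 445), (SUPERVISOR, 502)]

for host, port in DENIED:
    try:
        with socket.create_connection((host, port), timeout=3):
            print(f"FAIL: {host}:{port} reachable from corporate segment")
    except OSError:
        print(f"ok:   {host}:{port} blocked as expected")
```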
Final deliverables: the new architecture diagram, the firewall ruleset, the hardened supervisor baseline, the integrator access procedure, the incident response playbook for BMS events, and a training session for the operator's security and facilities teams.
Expected outcomes
- Zero BMS downtime during transition. Parallel-build plus staged cutover is designed to make SLA-affecting interruptions avoidable.
- BMS attack surface reduced on the order of 90%, as measured by pre- and post-engagement vulnerability scans, driven primarily by segmentation and supervisor hardening.
- Integrator remote access collapsed from three or more pathways to a single auditable one.
- Improved cyber insurance posture: segmented OT architectures of this kind are increasingly accepted by insurers as a compensating control.
- SOC 2 scope coverage for BMS, closing a gap that often surfaces during audit.
Why parallel-build
The parallel-build approach is deliberately expensive in equipment, but it is the only approach we recommend for production-critical OT transitions because it gives the operator a reversible path at every step. A rearchitect-in-place program saves capex and costs the facility its safety net.
The most common sources of cutover surprises are legacy Modbus bridges, industrial protocol translators, and undocumented vendor links that discovery misses. Thirty minutes with the facility's longest-tenured technician during Phase 1 will almost always surface them; that is part of our standard discovery checklist.
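The network-side complement to that conversation is a sweep for Modbus/TCP listeners (well-known port 502) on segments where none should exist. A minimal standard-library sketch, with the subnet as a placeholder; on live OT networks we probe serially, with short timeouts, inside an agreed window.

```python
import ipaddress
import socket

SUBNET = "10.20.0.0/24"  # illustrative: the segment under review
MODBUS_PORT = 502

for ip in ipaddress.ip_network(SUBNET).hosts():
    try:
        # A bare TCP connect is enough to flag a listener; no Modbus
        # request is sent, keeping the probe gentle on fragile devices.
        with socket.create_connection((str(ip), MODBUS_PORT), timeout=0.5):
            print(f"Modbus/TCP listener at {ip} -- check against the asset list")
    except OSError:
        pass  # no listener; move on
```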
If your facility has a BMS environment that you have been meaning to segment for years and have not been able to get scoped, let's have a conversation.
This article was written by the Cascadia OT Security practice, which advises Pacific Northwest data centers and manufacturers on industrial cybersecurity. For engagement inquiries, reach our practice team.