Work That Moved the Needle
Real-world projects that improved reliability, expanded coverage, and reduced operational toil at national scale.
SRE Observability Program
Challenge
Xplore's national core network lacked a unified observability strategy. Monitoring was fragmented across teams, alert noise was high, and there was no standard way to measure service health or incident response performance.
Approach
- Defined service level indicators (SLIs) and service level objectives (SLOs) for critical network services across the core, transport, and access layers
- Built Grafana dashboards providing customer-impacting service health views, blast radius context during incidents, and saturation trending
- Implemented structured alert tuning — deduplicating noisy signals, improving routing and escalation paths, and ensuring every alert was actionable
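As a rough illustration of how an SLI and SLO pair turns into something measurable, the sketch below computes an availability SLI and the remaining error budget for one window. The service name, the 99.9% target, and the event counts are hypothetical and are not figures from the program.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Slo:
    """A reliability target for one service (illustrative names only)."""
    service: str
    target: float  # e.g. 0.999 means 99.9% of events must be good


def availability_sli(good_events: int, total_events: int) -> float:
    """SLI as the fraction of good events; an idle window counts as healthy."""
    if total_events == 0:
        return 1.0
    return good_events / total_events


def error_budget_remaining(slo: Slo, good_events: int, total_events: int) -> float:
    """Fraction of the error budget left in the window (negative if overspent)."""
    allowed_bad = (1.0 - slo.target) * total_events
    actual_bad = total_events - good_events
    if allowed_bad == 0:
        # A 100% target (or an empty window) leaves no budget to spend.
        return 1.0 if actual_bad == 0 else float("-inf")
    return 1.0 - actual_bad / allowed_bad


if __name__ == "__main__":
    # Hypothetical service and probe counts for a single reporting window.
    slo = Slo(service="core-transport", target=0.999)
    sli = availability_sli(good_events=99_950, total_events=100_000)
    budget = error_budget_remaining(slo, good_events=99_950, total_events=100_000)
    print(f"{slo.service}: SLI={sli:.4%}, budget remaining={budget:.0%}")
```

A budget near zero or below it is the usual signal to slow down risky changes on that service until reliability recovers.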
Outcome
Measurably reduced MTTA and MTTR through better detection and faster triage. Improved alert actionability by reducing noise. Established a repeatable framework for onboarding new services into monitoring coverage.
Cisco ASR 32-to-64-bit Migration
Network Engineering
Challenge
Cisco announced end-of-support for 32-bit IOS-XR on ASR9K routers. Every core router across Xplore's North American network needed to be migrated to 64-bit — a fundamental OS change requiring careful planning to avoid service disruption.
Approach
- Inventoried all ASR9K routers across the national network and categorized by criticality and traffic load
- Developed a phased migration plan with detailed per-device runbooks, rollback procedures, and health validation checks
- Coordinated maintenance windows with NOC, engineering teams, and business stakeholders across multiple time zones
- Executed migrations during low-traffic windows with real-time monitoring and instant rollback capability
- Validated post-migration health using traffic counters, BGP session states, and customer-facing service metrics
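The post-migration validation step can be pictured as a diff between a snapshot taken before the window and one taken after. The sketch below is a minimal version under stated assumptions: the `HealthSnapshot` fields, the 80% traffic threshold, and the sample values are hypothetical, and how the snapshots are collected from the routers is left out entirely.

```python
from dataclasses import dataclass, field


@dataclass
class HealthSnapshot:
    """Hypothetical per-device snapshot; not a vendor schema."""
    bgp_sessions: dict[str, str] = field(default_factory=dict)  # neighbor -> state
    interfaces_up: set[str] = field(default_factory=set)
    traffic_bps: float = 0.0


def find_regressions(pre: HealthSnapshot, post: HealthSnapshot,
                     min_traffic_ratio: float = 0.8) -> list[str]:
    """List everything that was healthy before the change and is not healthy now."""
    problems = []
    # Only sessions that were Established beforehand are expected to return.
    for neighbor, state in sorted(pre.bgp_sessions.items()):
        if state == "Established" and post.bgp_sessions.get(neighbor) != "Established":
            problems.append(f"BGP session to {neighbor} not re-established")
    for name in sorted(pre.interfaces_up - post.interfaces_up):
        problems.append(f"interface {name} was up before and is down now")
    # The traffic threshold is an assumed heuristic, not a documented value.
    if pre.traffic_bps > 0 and post.traffic_bps < min_traffic_ratio * pre.traffic_bps:
        problems.append("traffic below expected share of the pre-migration level")
    return problems


if __name__ == "__main__":
    before = HealthSnapshot({"192.0.2.1": "Established"}, {"Te0/0/0/0"}, 4.0e9)
    after = HealthSnapshot({"192.0.2.1": "Active"}, {"Te0/0/0/0"}, 3.9e9)
    for line in find_regressions(before, after):
        print("ROLLBACK CANDIDATE:", line)
```

An empty result is the go signal; any entry is a reason to hold the window open or roll back.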
Outcome
Successfully migrated all ASR9K routers to 64-bit IOS-XR across North America with zero customer-impacting downtime. Unlocked access to newer features, security patches, and extended vendor support.
Nokia SR7750 5G Core Deployment
Network Engineering
Challenge
Xplore was expanding into Fiber and 5G fixed wireless services, which required a new core routing platform. The Nokia SR7750 had to integrate seamlessly into the existing Cisco-dominated MPLS backbone.
Approach
- Worked with vendor teams and platform engineering to plan and execute SR7750 deployment into primary and satellite data centers
- Configured MPLS/BGP peering with the existing ASR core routers to ensure seamless traffic flow
- Built monitoring coverage — onboarding the new platform into Grafana dashboards, alarm systems, and SNMP polling
- Created operational runbooks covering common failure scenarios, restart procedures, and escalation paths
- Validated end-to-end service delivery for Fiber and 5G subscribers through the new platform
Outcome
Nokia SR7750 core routers fully operational and integrated into the national MPLS backbone. Complete monitoring and operational documentation in place from day one. Platform now serves Fiber and 5G customers across multiple regions.
300+ WAN Router Upgrade Program
Network Operations
Challenge
The WAN aggregation layer had accumulated firmware drift across roughly 300 routers. Varying versions meant inconsistent behavior, unpatched vulnerabilities, and increased troubleshooting complexity.
Approach
- Audited all WAN aggregation routers to catalog current firmware versions, hardware models, and dependency chains
- Developed a standardized upgrade procedure with pre/post validation checks that could be executed consistently across all devices
- Grouped routers by region and criticality, creating a phased rollout schedule to limit blast radius
- Automated pre-check and post-check scripts to validate config integrity, interface states, and routing adjacency after each upgrade
- Coordinated with NOC for real-time monitoring during each maintenance window
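One way to picture the phased schedule is as a packing problem: upgrade the least critical routers first, and never touch more than a fixed number of routers in one region during the same window. The sketch below assumes a numeric criticality scale and a per-region cap, both hypothetical; the real schedule may have used different rules.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Router:
    name: str
    region: str
    criticality: int  # assumed scale: 1 = lowest impact, higher = more critical


def plan_waves(routers: list[Router], max_per_region: int) -> list[list[Router]]:
    """Build upgrade waves: lower criticality tiers first, capped per region per wave."""
    if max_per_region < 1:
        raise ValueError("max_per_region must be at least 1")

    by_tier: dict[int, list[Router]] = defaultdict(list)
    for router in routers:
        by_tier[router.criticality].append(router)

    waves: list[list[Router]] = []
    for tier in sorted(by_tier):
        tier_waves: list[list[Router]] = []
        for router in sorted(by_tier[tier], key=lambda r: (r.region, r.name)):
            # First-fit: reuse the earliest wave that still has room for this region.
            for wave in tier_waves:
                if sum(r.region == router.region for r in wave) < max_per_region:
                    wave.append(router)
                    break
            else:
                tier_waves.append([router])
        waves.extend(tier_waves)
    return waves


if __name__ == "__main__":
    # Hypothetical fleet sample.
    fleet = [
        Router("agg-mb-01", "MB", 1), Router("agg-mb-02", "MB", 1),
        Router("agg-on-01", "ON", 1), Router("agg-mb-core", "MB", 2),
    ]
    for number, wave in enumerate(plan_waves(fleet, max_per_region=1), start=1):
        print(f"wave {number}:", [r.name for r in wave])
```

Keeping the cap low limits how much of a region is in a changed state at once, which is what bounds the blast radius if a firmware build misbehaves.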
Outcome
All ~300 WAN aggregation routers upgraded to target firmware with minimal service impact. Standardized the fleet, reduced vulnerability exposure, and simplified future troubleshooting.
150+ POP Fixed Wireless Rollout
Infrastructure
Challenge
Rural Manitoba communities needed broadband connectivity, but traditional fiber deployment was not economically viable. LTE Fixed Wireless was selected as the delivery mechanism, requiring rapid POP deployment at scale.
Approach
- Performed site surveys and RF planning using Pathanal5 and Google Earth to optimize tower placement and coverage
- Provisioned PTP and PTMP fixed wireless backhaul links for each POP site
- Configured routing integration into the core network — BGP peering, VLAN assignments, and QoS policies
- Built out power, grounding, and environmental monitoring for each site
- Created standardized deployment checklists and acceptance testing procedures to ensure consistency across 150+ sites
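For a sense of the arithmetic behind backhaul link planning, the sketch below computes two textbook quantities: free-space path loss and the first Fresnel zone radius. It is not how the planning tools named above work internally, it ignores terrain, clutter, rain fade, and antenna gain, and the 10 km / 5 GHz link is a hypothetical example.

```python
import math

SPEED_OF_LIGHT_M_S = 299_792_458.0


def free_space_path_loss_db(distance_m: float, frequency_hz: float) -> float:
    """Free-space path loss in dB: 20 * log10(4 * pi * d * f / c)."""
    if distance_m <= 0 or frequency_hz <= 0:
        raise ValueError("distance and frequency must be positive")
    return 20.0 * math.log10(4.0 * math.pi * distance_m * frequency_hz / SPEED_OF_LIGHT_M_S)


def first_fresnel_radius_m(d1_m: float, d2_m: float, frequency_hz: float) -> float:
    """Radius of the first Fresnel zone at a point d1 from one end and d2 from the other."""
    if d1_m <= 0 or d2_m <= 0 or frequency_hz <= 0:
        raise ValueError("distances and frequency must be positive")
    wavelength_m = SPEED_OF_LIGHT_M_S / frequency_hz
    return math.sqrt(wavelength_m * d1_m * d2_m / (d1_m + d2_m))


if __name__ == "__main__":
    # Hypothetical 10 km point-to-point link at 5 GHz, evaluated at the midpoint.
    loss = free_space_path_loss_db(10_000, 5e9)
    radius = first_fresnel_radius_m(5_000, 5_000, 5e9)
    print(f"FSPL: {loss:.1f} dB, first Fresnel radius at midpoint: {radius:.1f} m")
```

The Fresnel radius is the practical input for tower height: obstructions that intrude far into that zone degrade the link even when the direct line of sight is clear.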
Outcome
150+ POP sites deployed and operational, providing LTE Fixed Wireless broadband to underserved rural communities across Manitoba. Established a repeatable deployment process that accelerated future site builds.
Legacy ISP Network Consolidation
Network Operations
Challenge
NetSet Communications acquired several smaller ISPs, each with their own equipment, configurations, and operational practices. These legacy networks were causing frequent customer-impacting outages due to outdated gear and inconsistent operations.
Approach
- Audited each acquired network — cataloging equipment, configurations, IP addressing, and customer circuits
- Designed a migration plan to move customers onto the NetSet backbone with minimal disruption
- Upgraded or replaced outdated equipment with standardized Juniper and Cisco platforms
- Re-addressed IP space, consolidated routing, and implemented consistent QoS and security policies
- Migrated customers in phases with rollback capability, validating service quality at each step
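Re-addressing is easier to audit when each legacy address maps deterministically to its replacement. The sketch below uses Python's standard `ipaddress` module to keep a host's offset when moving it from a legacy prefix into a new one. The prefixes and the offset-preserving rule are assumptions chosen for illustration, not the addressing plan that was actually used.

```python
import ipaddress


def readdress(old_ip: str, old_prefix: str, new_prefix: str) -> str:
    """Map a host from old_prefix into new_prefix, preserving its offset in the block."""
    old_net = ipaddress.ip_network(old_prefix)
    new_net = ipaddress.ip_network(new_prefix)
    addr = ipaddress.ip_address(old_ip)

    if old_net.version != new_net.version:
        raise ValueError("old and new prefixes must be the same IP version")
    if addr not in old_net:
        raise ValueError(f"{old_ip} is not inside {old_prefix}")
    if new_net.num_addresses < old_net.num_addresses:
        raise ValueError("new prefix is too small to hold the old block")

    offset = int(addr) - int(old_net.network_address)
    return str(new_net.network_address + offset)


if __name__ == "__main__":
    # Hypothetical legacy and backbone prefixes.
    legacy_hosts = ["192.168.1.1", "192.168.1.37", "192.168.1.254"]
    for host in legacy_hosts:
        print(host, "->", readdress(host, "192.168.1.0/24", "10.20.0.0/22"))
```

A mapping like this can be generated ahead of each migration phase and reversed for rollback, since the offset rule works in both directions.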
Outcome
Eliminated customer-impacting network interruptions caused by legacy infrastructure. Unified all acquired networks onto a single standardized backbone with consistent monitoring and operations.