Career | March 10, 2026 | 8 min read

From Network Analyst to SRE Team Lead: My 11-Year Journey

SRE, Career Growth, Network Engineering, Leadership

Eleven years ago, I was a junior network analyst in Brandon, Manitoba, configuring gateway services for enterprise customers and managing IP address databases. Today, I lead the SRE Observability & Service Assurance program at Xplore Inc., defining SLIs and SLOs for a national core network that spans coast to coast across Canada.

This is the story of that journey — not as a straight line, but as a series of moments where saying "yes" to hard problems opened doors I didn't know existed.

Starting at the Edge

My career began at NetSet Communications, a regional ISP in Manitoba. As a junior network analyst, my world was small — enterprise gateway configurations, CPE deployments, and managing the IP address database. But it was here that I fell in love with networking. There's something deeply satisfying about understanding how packets move, why a route converges the way it does, and what happens when it doesn't.

Within a year, I was designing L2-VPN and VPLS solutions for enterprise customers, deploying MPLS services, and learning how to think about networks as systems rather than collections of individual boxes.

Scaling Up: 150 POP Sites and a Provincial Rollout

The project that changed my trajectory was the LTE Fixed Wireless rollout across Manitoba. Over three years, I helped deploy more than 150 POP sites — from RF planning and radio provisioning to routing integration and operational handoff.

This wasn't just technical work. It was project management, vendor coordination, and building repeatable processes that could scale. I learned that deploying one site is engineering; deploying 150 is operations. And operations requires a completely different mindset — standardization, automation, checklists, and the discipline to do the boring things consistently.

Moving to National Scale

When NetSet was acquired by Xplore (then Xplornet), I moved into IP Operations — and my scope went from provincial to national. Suddenly I was working on a core network spanning multiple provinces, with data centers from coast to coast, and customer counts measured in hundreds of thousands.

The challenges scaled too. I migrated Cisco ASR 9000 routers from 32-bit to 64-bit IOS XR across all of North America. I deployed Nokia 7750 SR routers for the new Fiber and 5G core. I upgraded roughly 300 WAN aggregation routers network-wide. Every one of these projects required meticulous planning, because the blast radius of a mistake was no longer a few hundred customers; it was tens of thousands.

The Shift to SRE Thinking

The pivotal shift in my career wasn't a promotion — it was a change in how I thought about my work. Somewhere around 2023, I started reading about Site Reliability Engineering, and I realized that the best network operations teams weren't just reacting to incidents. They were measuring service health, defining reliability targets, and systematically reducing the gap between where they were and where they needed to be.

I started asking different questions:

  • "Would we know if this failed?" — Instead of assuming our monitoring was complete, I started testing detection paths and identifying silent failures.
  • "Is this alert actionable?" — Instead of accepting alert noise as normal, I started tuning thresholds and deduplicating signals.
  • "Why did this happen again?" — Instead of treating repeat incidents as bad luck, I started tracking recurring failures and driving corrective actions to completion.

These questions led me naturally from operations into SRE.
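
To make the second and third questions concrete, here is a minimal Python sketch of deduplicating alerts and surfacing repeat offenders. It is an illustration under assumptions, not the tooling we run: the `Alert` fields, the device and check names, and the 300-second suppression window are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Alert:
    device: str     # hypothetical device name, e.g. "core-rtr-1"
    check: str      # hypothetical check name, e.g. "bgp_session_down"
    timestamp: int  # seconds since epoch


def deduplicate(alerts, window_seconds=300):
    """Keep one alert per (device, check) per suppression window.

    An alert is dropped if the last *kept* alert with the same key
    fired less than window_seconds earlier.
    """
    last_kept = {}
    kept = []
    for alert in sorted(alerts, key=lambda a: a.timestamp):
        key = (alert.device, alert.check)
        previous = last_kept.get(key)
        if previous is None or alert.timestamp - previous >= window_seconds:
            kept.append(alert)
            last_kept[key] = alert.timestamp
    return kept


def recurring_failures(alerts, min_count=2):
    """Return the (device, check) pairs that fired at least min_count times."""
    counts = Counter((a.device, a.check) for a in alerts)
    return {key: n for key, n in counts.items() if n >= min_count}


if __name__ == "__main__":
    raw = [
        Alert("core-rtr-1", "bgp_session_down", 0),
        Alert("core-rtr-1", "bgp_session_down", 60),   # duplicate, suppressed
        Alert("core-rtr-1", "bgp_session_down", 400),  # new window, kept
        Alert("agg-rtr-7", "cpu_high", 10),
    ]
    paged = deduplicate(raw)
    print(len(paged), "alerts would page")
    print(recurring_failures(paged))
```

Whatever survives deduplication and still recurs is a candidate for a corrective action rather than another page.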

Leading the Observability Program

In February 2026, I stepped into my current role: Team Lead SRE, Observability & Service Assurance. The mandate was clear — build the observability program that Xplore's network needed.

This means defining SLIs and SLOs for critical services, building dashboards that give teams real-time context during incidents, tuning alerts so that every notification drives action, and establishing post-incident review processes that actually prevent repeat failures.
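
For readers new to the vocabulary, an SLI is a measured ratio of good events to total events, and an SLO is the target that ratio should meet. The sketch below shows the arithmetic behind an availability SLI and its remaining error budget. The function names, the probe counts, and the 99.9% target are assumptions chosen for illustration, not Xplore's actual definitions.

```python
def availability_sli(good_events, total_events):
    """Fraction of events that were good; treated as 1.0 when there is no traffic."""
    if total_events == 0:
        return 1.0
    return good_events / total_events


def error_budget_remaining(good_events, total_events, slo_target):
    """Fraction of the error budget still unspent for the measurement window.

    1.0 means nothing has been spent, 0.0 means the budget is exactly
    used up, and a negative value means the SLO has been breached.
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    if total_events == 0:
        return 1.0
    allowed_bad = total_events * (1.0 - slo_target)
    actual_bad = total_events - good_events
    return 1.0 - actual_bad / allowed_bad


if __name__ == "__main__":
    # Hypothetical window: 100,000 probes, 50 of them failed, 99.9% SLO.
    good, total, target = 99_950, 100_000, 0.999
    print(f"SLI: {availability_sli(good, total):.4%}")
    print(f"Budget remaining: {error_budget_remaining(good, total, target):.1%}")
```

In this hypothetical window, 50 failed probes against an allowance of about 100 leaves roughly half of the budget unspent. That kind of number gives a team a shared basis for deciding whether a risky change can go ahead.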

It also means mentoring. Some of the most rewarding work I do now is helping NOC analysts and junior SREs develop their troubleshooting instincts — teaching them to read telemetry, ask the right questions, and communicate clearly during high-pressure incidents.

What I've Learned

A few lessons from 11 years of building and running networks:

The best monitoring tells a story. A dashboard full of green dots isn't useful. A dashboard that shows you service health, trending, and context — that's what helps you make decisions at 3 AM.

Automation isn't optional. Every manual process that runs more than twice should be automated. Not because humans are unreliable, but because automation frees humans to work on harder problems.

Incidents are data. Every outage, every degradation, every near-miss is information about where your system is weak. The teams that learn from incidents get better. The teams that don't, keep getting paged.

Leadership is influence, not authority. As a team lead in an IC-track role, I don't have direct reports. I lead by setting standards, building tools, writing runbooks, and being the person people trust when things are on fire.

What's Next

I'm still early in this SRE chapter. There's a lot more to build — better service health reporting, more automation, deeper integration between observability and change management. And there are blog posts coming about all of it.

If you're a network engineer thinking about SRE, or an SRE wondering how telecom networks work, I hope this blog becomes useful to you. The intersection of networking and SRE is where I live now, and I think it's one of the most interesting places in infrastructure engineering.


Thanks for reading. If you want to connect, find me on LinkedIn.