Principal Network Architect at Microsoft

Required Skills

Network architecture, Ethernet, RDMA, congestion control, Python, Ansible, AI clusters, switch ASIC, optics

About the Role

The Principal Network Architect is responsible for the end-to-end network architecture of AI training and inference clusters at Microsoft. The role leads a team of engineers that designs, optimizes, and validates large-scale network fabrics for hyperscale cloud infrastructure.

Key Responsibilities

  • Own end-to-end network architecture for AI training/inference clusters
  • Lead and grow a high-performing team of engineers
  • Define scale-out/scale-up designs and network services
  • Drive congestion-control strategy and transport tuning
  • Evaluate and influence silicon & optics roadmaps

Required Skills & Qualifications

Must Have:

  • Master's Degree in Electrical Engineering, Computer Engineering, Mechanical Engineering, or a related field AND 9+ years of technical engineering experience, OR Bachelor's Degree AND 11+ years of experience
  • 10+ years designing and operating large-scale L2/L3 Ethernet fabrics for HPC/AI or hyperscale services
  • 5+ years of experience with Ethernet, RDMA/RoCEv2, congestion control, routing, and load balancing
  • 5+ years of experience with switch/NIC architecture and optics

Nice to Have:

  • Experience optimizing networks for AI collectives and distributed training systems
  • Familiarity with programmable data planes and NIC offloads
  • Depth in buffer management and queue disciplines
  • Experience with optics/PHY roadmaps and DC power/cooling constraints
  • Contributions to standards bodies/consortia
  • Proficiency in Python/Go and automation frameworks

Benefits & Perks

  • Industry-leading healthcare