About the role

Principal Software Engineer at Microsoft

Required Skills

Python, AI/ML, GPU, system design, LLM, cloud infrastructure, C++, reliability modeling

About the Role

The Principal Software Engineer role at Microsoft Azure AI/HPC focuses on designing and optimizing AI systems for training and inference workloads on hyperscale cloud infrastructure. Responsibilities include system design, reliability modeling, and deep LLM modeling to enhance software-hardware codesign. The position requires expertise in AI architectures, GPU planning, and cross-functional collaboration.

Key Responsibilities

  • Partners with stakeholders to determine user requirements for scenarios
  • Leads identification of dependencies and development of design documents
  • Mentors others to produce extensible and maintainable code
  • Drives project plans and work items with cross-product features expertise
  • Acts as Designated Responsible Individual (DRI) for system monitoring and mentoring

Required Skills & Qualifications

Must Have:

  • Bachelor's Degree in Computer Science or related technical field AND 6+ years technical engineering experience with coding in languages like C, C++, C#, Java, JavaScript, or Python OR equivalent experience
  • 5+ years of experience in system design
  • Ability to pass Microsoft Cloud Background Check upon hire and every two years
  • Deep knowledge of AI systems and architectures for training and inference across multi-vendor GPUs

Nice to Have:

  • Master's Degree in Computer Science or related technical field AND 8+ years technical engineering experience OR Bachelor's Degree AND 12+ years experience
  • Experience with deep LLM modeling and software-hardware codesign features

Benefits & Perks

  • Industry-leading healthcare