Principal Software Engineer at Microsoft
Required Skills
Python, AI/ML, GPU, system design, LLM, cloud infrastructure, C++, reliability modeling
About the Role
The Principal Software Engineer role at Microsoft Azure AI/HPC focuses on designing and optimizing AI systems for training and inference workloads on hyperscale cloud infrastructure. Responsibilities include system design, reliability modeling, and deep LLM modeling to enhance software-hardware codesign. The position requires expertise in AI architectures, GPU planning, and cross-functional collaboration.
Key Responsibilities
- Partners with stakeholders to determine user requirements for scenarios
- Leads identification of dependencies and development of design documents
- Mentors others to produce extensible and maintainable code
- Drives project plans and work items, applying expertise in cross-product features
- Acts as Designated Responsible Individual (DRI) for system monitoring and mentoring
Required Skills & Qualifications
Must Have:
- Bachelor's Degree in Computer Science or related technical field AND 6+ years technical engineering experience with coding in languages like C, C++, C#, Java, JavaScript, or Python OR equivalent experience
- 5+ years of experience in system design
- Ability to pass Microsoft Cloud Background Check upon hire and every two years
- Deep knowledge of AI systems and architectures for training and inference across multi-vendor GPUs
Nice to Have:
- Master's Degree in Computer Science or related technical field AND 8+ years technical engineering experience OR Bachelor's Degree AND 12+ years experience
- Experience with deep LLM modeling and software-hardware codesign features
Benefits & Perks
- Industry leading healthcare