Senior Software Engineer at Microsoft
Required Skills
C++, Python, AI/ML, high-performance computing, networking, distributed systems, Azure Cloud, software-defined networking, InfiniBand
About the Role
Senior Software Engineer role focused on designing and developing next-generation networking infrastructure for large-scale AI training in Azure Cloud. The position involves optimizing high-performance, low-latency communication frameworks for distributed AI systems and working at the intersection of AI and high-performance computing.
Key Responsibilities
- Design, develop, and optimize networking solutions for large-scale AI training infrastructure
- Architect and implement high-performance, low-latency communication frameworks for distributed systems
- Benchmark, analyze, and enhance scalability and reliability of networking systems for petabyte-scale data transfer
- Debug and resolve complex networking issues in large-scale, high-performance environments
- Act as Designated Responsible Individual (DRI) to monitor systems and guide other engineers
Required Skills & Qualifications
Must Have:
- Bachelor's Degree in Computer Science or related technical field AND 4+ years of technical engineering experience coding in languages including C, C++, C#, Java, JavaScript, or Python, OR equivalent experience
- 2+ years of experience with network virtualization, software-defined networking (SDN), or network performance tuning
- Ability to meet Microsoft security screening requirements including Microsoft Cloud Background Check
Nice to Have:
- Hands-on experience with networking technologies in AI-specific hardware (e.g., InfiniBand, RoCE, NVLink)
- Familiarity with AI accelerators such as GPUs (NVIDIA, AMD) or TPUs and their interaction with networking infrastructure
- Experience with telemetry and observability tools for network monitoring at scale
- Background in building scalable and fault-tolerant systems in large, distributed environments
Benefits & Perks
- Industry-leading healthcare