Sr. Software Development Engineer, ML Infrastructure Team at Annapurna Labs (U.S.) Inc.

Required Skills

Python, TypeScript, AWS, CI/CD, machine learning, high performance computing, Grafana, Athena, Linux

About the Role

This role is for a Senior Software Development Engineer on the ML Infrastructure team at AWS, focusing on building automation tools to ensure the performance and functionality of AWS ML and HPC technologies. Responsibilities include developing CI/CD automation, running benchmarks, and creating dashboards to monitor performance data. The engineer will work with technologies like Python, TypeScript, and AWS services to support AI offerings such as Trainium and Neuron.

Key Responsibilities

  • Build and maintain infrastructure for monitoring and reporting on large-scale testing workloads
  • Automate software delivery using CI/CD tools, Linux, and AWS products
  • Write Python code to deploy clusters and run ML/HPC benchmarks and applications
  • Create dashboards with AWS Managed Grafana and Athena to analyze performance data
  • Develop automatic alerting mechanisms for functional and performance regressions

Required Skills & Qualifications

Must Have:

  • 5+ years of non-internship professional software development experience
  • 5+ years of programming experience with at least one software programming language
  • 5+ years of experience leading design or architecture of systems
  • 5+ years of full software development life cycle experience including coding standards and testing

Nice to Have:

  • Bachelor's degree in computer science or equivalent

Benefits & Perks

  • Base pay ranging from $151,300 to $261,500/year depending on geographic market
  • Equity, sign-on payments, and other forms of compensation
  • Full range of medical, financial, and other benefits