Software Development Manager, SageMaker Distributed Training

26 Sep 2024

Vacancy expired!

DESCRIPTION

Job summary
Interested in Machine Learning? Amazon SageMaker is a fully managed Machine Learning platform that makes it easy to build ML models, manage them, and integrate them with custom applications for online predictions. SageMaker (https://aws.amazon.com/sagemaker/) takes away the heavy lifting normally associated with large-scale Machine Learning implementations, so that developers and scientists can focus on the truly creative work of modeling and solving the business problem at hand.

As an engineering leader, you will own innovation in the ML platform space, building compelling functionality for the Amazon SageMaker service. You will be responsible for leading a team of strong engineers in the design, development, testing, and deployment of distributed systems and big data solutions. A successful candidate will have an established background in developing distributed systems, strong technical ability, excellent project management skills, great communication skills, and the motivation to achieve results in a fast-paced environment.

Key job responsibilities
  • Own the overall systems development life cycle
  • Manage and execute against project plans and delivery commitments; manage the day-to-day activities of the engineering team within an Agile/Scrum environment
  • Manage departmental resources and staffing, mentor engineers, and build and maintain a best-in-class engineering team
  • Work closely with the engineers to architect and develop the best technical design and approach
  • Report on status of development, quality, operations, and system performance to management
  • Engage with customers and define the product roadmap
About the team
SageMaker is one of the fastest growing AWS services. While ML is no longer a new concept, applying ML in production to solve real-world problems at scale is still in its infancy. Today, most AI/ML projects don't see the light of day because of the challenges in productionizing the model. Even established ML teams face challenges in automating their entire ML workflows. Our mission is to build a collection of platform services and tools in SageMaker to help customers make their ML projects successful. We address challenges in ML much as DevOps has for software engineering over the past decade. You will lead the founding team in building an end-to-end ML pipeline platform in SageMaker. Your work will enable customers to build and run entire ML pipelines on Amazon SageMaker in a fully managed, highly scalable, and highly available manner.

BASIC QUALIFICATIONS

  • 7+ years of experience working directly within engineering teams
  • Experience partnering with product or program management teams
  • 3+ years of people management experience, managing engineers
  • 3+ years of experience architecting and designing new and current systems (architecture, design patterns, reliability, and scaling)

PREFERRED QUALIFICATIONS

  • Experience with large-scale distributed systems
  • Experience building and operating mission-critical, highly available (24x7) systems
  • Experience with Machine Learning
  • Experience building tools for data scientists or developers
  • Experience in preparing quality metrics and effectively engaging with stakeholders to set and drive quality goals

Amazon is committed to a diverse and inclusive workplace. Amazon is an equal opportunity employer and does not discriminate on the basis of race, national origin, gender, gender identity, sexual orientation, protected veteran status, disability, age, or other legally protected status. For individuals with disabilities who would like to request an accommodation, please visit https://www.amazon.jobs/en/disability/us.

  • ID: #45995039
  • State: Washington Seattle-tacoma 98101 Seattle-tacoma USA
  • City: Seattle-tacoma
  • Salary: USD TBD TBD
  • Job type: Permanent
  • Showed: 2022-09-26
  • Deadline: 2022-11-23
  • Category: Et cetera