Site Reliability Engineer

20 Mar 2025

Vacancy expired!

Your Opportunity

Our team is looking for an experienced Site Reliability Engineer who can lead multiple scrum teams at the same time. Ideal candidates thrive in a fast-paced team environment and have a strong passion for technology and innovation.

What you are good at

  • Develop and maintain tooling used for environment monitoring and task automation
  • Identify application reliability and availability improvements and build solutions to drive an improved experience
  • Analyze and establish efficient configurations for software and servers, DB connections, indexes, drivers, etc.
  • Coordinate with development teams, technical and non-technical partners, and clients to maintain broad knowledge of the dependencies of critical business transactions, including platforms, services, and tools
  • Monitor internal and vendor service level objectives (SLOs) and service level agreements (SLAs); identify and resolve SLO/SLA gaps
  • Serve as technical subject matter expert (SME) for cross-functional engineering teams
  • Assist with and troubleshoot systems-related issues and maintenance
  • Collaborate on maintaining services once they are live; measure and monitor availability, latency, and overall system health
  • Develop runbooks and build automation
  • Develop and maintain end-to-end (E2E) monitoring dashboards to support critical business transactions
  • Develop and maintain synthetic monitoring for critical business transactions using tools such as ThousandEyes
  • Practice sustainable incident response and blameless postmortems
  • Document and promote SRE standards and procedures
  • Develop and assist in deployment and rollback automation
  • Review release and deployment requirements
  • Build and set up automated tests
  • Communicate incidents to impacted stakeholders
  • Coach and mentor junior engineers and fellow practitioners

What you have

  • 5+ years of professional engineering experience developing, managing, or supporting distributed systems
  • 2+ years of SRE experience managing multi-cloud platforms preferred
  • Enterprise Cloud infrastructure experience e.g., AWS, Azure, Google Cloud Platform, Cloud Foundry
  • Experience with microservices architecture patterns
  • Proven track record of researching, understanding, and effectively applying Scalability and High Availability principles
  • Experience in developing and managing operations leveraging key event streaming, messaging, and DB services e.g., Cassandra, MQ/JMS/Kafka, Aurora, RDS, Cloud SQL, Bigtable, DynamoDB, Cloud Spanner, Kinesis, Cloud Pub/Sub, etc.
  • Experience working with containers e.g., Docker, Kubernetes, Cloud Foundry, etc.
  • Strong experience in using industry-standard monitoring tools e.g., AppDynamics, Dynatrace, Apica, Splunk, ELK, Fluentd, Prometheus, Kibana, Elasticsearch, Grafana, Nagios, Datadog, New Relic, Tempo, Loki, etc.
  • Schwab systems experience
  • Strong working knowledge of modern development technologies and tools e.g., Agile, CI/CD, Git, Jira and Confluence

Why work for us?

Own Your Tomorrow embodies everything we do! We are committed to helping our employees ignite their potential and achieve their dreams. Our employees get to play a central role in reinventing a multi-trillion-dollar industry, creating a better, more modern way to build and manage wealth.

Benefits: We offer a competitive and flexible package designed to help you make the most of your life at work and at home, today and in the future.

TD Ameritrade, a subsidiary of Charles Schwab, is an Equal Opportunity Employer. At TD Ameritrade we believe People Matter. We value diversity and believe that it goes beyond all protected classes, thoughts, ideas, and perspectives.

  • ID: #49503276
  • State: Texas (Austin 73301, USA)
  • City: Austin
  • Salary: USD $87,800 - $195,200 / Year
  • Job type: Permanent
  • Showed: 2023-03-20
  • Deadline: 2023-05-18
  • Category: Et cetera