Vacancy expired!
RESPONSIBILITIES
- Establish monitoring, tracing, logging, and alerting for shared platforms
- Define SLAs and SLOs and set up monitoring to ensure availability targets are being met
- Develop tools and workflows utilizing engineering best practices, such as infrastructure as code and CI/CD, to promote reliability and availability
- Collaborate with platform engineers and developers to improve operational stability and reliability
QUALIFICATIONS
- Bachelor's degree in computer science or a related field, or equivalent experience
- Proven work experience as a Site Reliability Engineer or in a similar role
- Expert in infrastructure as code (Terraform, Docker, Helm)
- Expert in monitoring tools such as Datadog or Dynatrace
- Cloud experience, preferably Azure
- Experience with container technologies (Docker and Kubernetes)
- Experience with configuration and administration of CI/CD pipelines, preferably using GitHub Actions
- Capable of writing comprehensive technical documentation and diagrams
- Working knowledge of Bash and shell scripting
- Understanding of the end-to-end application development lifecycle, from code commit to production deployment
- DevOps, reliability, and security mindset, including an understanding of production controls and change processes
- ID: #48955231
- State: Texas, USA (Dallas / Fort Worth, 75201)
- City: Dallas / Fort Worth
- Salary: Depends on Experience
- Job type: Contract
- Posted: 2023-02-01
- Deadline: 2023-03-25
- Category: Et cetera