Senior SRE Engineer - REMOTE in DALLAS

30 Nov 2024

Vacancy expired!

Join this growing healthcare company that is transforming the healthcare ecosystem for better, healthier outcomes for our patients. We also provide opportunities for personal and professional growth in technology skills, leadership, dedicated mentorship, and communication skills, among others. Work across development and systems teams to champion the adoption of modern reliability practices like SLOs, error budget policies, actionable alerts, incident retrospectives, chaos testing, and end-to-end ownership. Simply put, as an SRE, you will help build and operate fast and reliable systems that help people get jobs. Are you up for the challenge?

RESPONSIBILITIES
  • Work closely with software engineers to design, develop, and implement reliable, performant software that improves the stability, scalability, availability, and latency of our systems
  • Implement application/infrastructure observability solutions and perform maintenance to ensure desired application availability
  • Manage services in real time, including building golden-signal monitoring
  • Establish and negotiate SLOs and SLIs with the business, build alerting, and create playbooks and runbooks for services
  • Triage and decompose incidents to identify probable root causes through debugging code, operating networks, building hardware, or other techniques

BACKGROUND
  • Strong experience working on high data volume applications managed with modern IaC methodologies/tooling.
  • Solid experience building and running distributed systems in an AWS environment.
  • Strong experience building and deploying systems to production, including deployment, monitoring, scheduling, and load balancing.
  • Strong experience with container technologies and orchestration platforms (Docker, Kubernetes, Rancher, Cloud Foundry)
  • Strong experience managing and using CI/CD systems (Bamboo, Azure DevOps, Jenkins, CircleCI)
  • Good experience implementing a highly scalable, distributed CI/CD pipeline.
  • Good experience working with monitoring and observability tools (New Relic and OpsGenie)
  • Strong knowledge of RDS and Snowflake.
  • Strong knowledge of programming/scripting languages (Python, Bash, Groovy, Go) and IaC tooling (Terraform)
  • Knowledge of standard methodologies related to security, performance, and disaster recovery
  • Strong analytical skills for identifying performance bottlenecks in support of production issue resolution and root cause identification.

RedRiver offers benefits including Major Medical, Dental, Vision, LTD, and 401(k). More positions at: http://redriversystems.com/jobs/#/jobs RedRiver Systems is an Equal Opportunity Employer.

  • ID: #23651083
  • State: Texas, Dallas / Fort Worth 75201, USA
  • City: Dallas / Fort Worth
  • Salary: Depends on Experience
  • Job type: Permanent
  • Posted: 2021-11-30
  • Deadline: 2022-01-28
  • Category: Et cetera