Site Reliability Engineer

27 Mar 2024

Vacancy expired!

Detailed Job Description:
  • Design, develop, build and deploy Cloud Native Infrastructure for Enterprises
  • Expertise in handling incident management in lower and higher (production) environments
  • Analyze and improve the stability, reliability, performance and scalability of Cloud Native Platform Infrastructure to meet ever-increasing customer demands
  • Build the observability stack using a combination of open source and industry standard tools
  • Write code and apply engineering best practices and tools to automate operational tasks
  • Be responsible for the overall reliability and stability of Cloud native applications
  • Expertise in troubleshooting complex issues affecting performance and scaling.
  • Refactor existing code and service infrastructure to ensure scalability and reliability.
  • Identify process gaps and implement process improvements to increase operational efficiency.
  • Participate in the development of tools, systems and processes aimed at improving product supportability and overall support productivity.

Mandatory Skills:
  • Minimum of 8 years of work experience
  • Experience working as a developer and/or Site Reliability Engineer.
  • Experience with DevSecOps and Site Reliability practices
  • Knowledge of or experience with resilience experiments using tools such as Chaos Toolkit, Chaos Monkey, Gremlin and others
  • Experience with programming languages such as Python or Java
  • Proven track record of building, supporting and scaling a high-transaction 24x7 SaaS solution on any cloud platform (Azure, Google Cloud Platform or AWS preferred)
  • Experience with Security as it applies to infrastructure, systems and network engineering
  • Experience with distributed computing and distributed applications
  • Experience with infrastructure automation tools such as Terraform or Ansible, and with building, using and deploying containers.
  • Experience with containerization technologies such as Docker, Kubernetes
  • Experience with logging and monitoring tools such as Grafana, Prometheus, Sumo Logic, Cortex, Splunk
  • Experience with CI/CD tools such as Concourse, Jenkins, Bamboo, etc.
  • Experience with Agile development, DevOps models or similar methodologies
  • Graduate degree in Computer Science or equivalent engineering experience
  • Experience with source control, including pull requests, branching and merging (GitHub).
  • Experience with cloud security concepts and tools, e.g. Twistlock, DivvyCloud, Expel
  • Familiarity with OpenTracing/OpenTelemetry

Good to have Skills:
  • Good communication and presentation skills
  • Experience in Agile Methodology

  • ID: #49563305
  • State: Texas (Irving, TX 75014, USA)
  • City: Irving
  • Salary: Depends on Experience
  • Job type: Permanent
  • Showed: 2023-03-27
  • Deadline: 2023-05-23
  • Category: Et cetera