Site Reliability Engineer job vacancy

Vacancy expired!

Role: Site Reliability Engineer
Location: Alpharetta, GA
Experience: 10+ years
Mandatory: Google Cloud Platform

Site Reliability Engineer - SRE
Description:

We are looking for a Site Reliability Engineer to manage the end-to-end application and system stack and to work with one of the leading financial services organizations in the US.
Site Reliability Engineering (SRE) is a discipline that combines software and systems engineering for building and running large-scale, distributed, fault-tolerant systems. SRE ensures that internal and external services meet or exceed reliability and performance expectations. SRE is also an engineering approach to building and running production systems: engineering solutions to operational problems.
SREs are responsible for overall system operation and use a breadth of tools and approaches to solve a broad set of problems. Practices include limiting time spent on operational work, blameless postmortems, and proactive identification and prevention of potential outages.

Responsibilities: As a Site Reliability Engineer,

You will be part of the team that migrates and transforms on-prem applications and data centers to the public cloud (Google Cloud Platform)
You will engage in and improve the software development lifecycle - from inception and design, through development, deployment, operation and refinement
Develop and maintain large-scale infrastructure
Build out new infrastructure using infrastructure-as-code (IaC) practices
Own build tools and the CI/CD automation pipeline
Influence and design infrastructure, architecture, standards and methods for large-scale systems
Support services prior to production via infrastructure design, software platform development, load testing, capacity planning and launch reviews
Maintain services during deployment and in production by measuring and monitoring key performance and service level indicators including availability, latency, and overall system health
Automate system scalability and continually work to improve system resiliency, performance and efficiency
Investigate, diagnose, and resolve performance and reliability problems in a wide range of large-scale and high-throughput services
Collaborate with architects and application engineers to ensure applications are maintainable, scalable, and follow appropriate disaster recovery and high availability strategies
Contribute to the handbook, runbooks, and general documentation
Remediate tasks within corrective action plans through sustainable, preventative, and automated measures whenever possible

Requirements:

BS degree in Computer Science or related technical field, or equivalent job experience required
8+ years of experience in DevOps and SRE roles
4+ years of SRE experience in cloud environments
2+ years of experience developing and/or administering software in public cloud
Strong working knowledge of and hands-on experience with Google Cloud Platform (GCP)
Experience with DevOps practices, CI/CD pipelines, and build tools such as Jenkins
Experience managing Infrastructure as code via Terraform
2-4 years of experience in languages such as Python, Ruby, Bash, Java, Go, Perl, JavaScript and/or Node.js
Must have strong communication and interpersonal skills
Experience operating a production environment at high scale with an emphasis on availability and latency
Deep knowledge of container and orchestration tools such as Docker and Kubernetes
Familiarity with configuration management and deployment tools such as Chef and Octopus
Experience in software development in one or more of the following: C, C++, Java, Go, Perl and/or Python
Strong team player with a "can do" attitude, and the flexibility to jump in wherever needed
Demonstrable cross-functional knowledge with systems, storage, networking, security and databases
System administration skills, incl

ID: #40533583
State: Georgia Alpharetta 30009 Alpharetta USA
City: Alpharetta
Salary: USD TBD TBD
Job type: Contract
Showed: 2022-05-09
Deadline: 2022-07-05
Category: Et cetera