Vacancy expired!
- Site Reliability Engineering – DevOps
- AEM
- Java
- Azure
- Webservices
- API
- Responsible for toil reduction, implementing identified improvement opportunities, and handling minor enhancements and non-ticketed activities
- Prior experience in supporting web and mobile apps
- Basic knowledge of CDN (Akamai)
- Exposure to monitoring tools (APM, synthetic and log monitoring, etc.) and to Azure (or any other cloud)
- Unix & Scripting for Automation
- eCommerce experience (supporting web applications, etc.)
- Define and monitor service level metrics that include incident management KPIs such as MTTD, MTTR, MTBF, MTTF, unavailability rate, and incident count
- Create rules that optimize incident response based on metrics, streamline alert flows, and improve collaboration and communication across squads
- Proactively identify issues that might disrupt service in production
- Direct incoming service requests to the appropriate support groups or Jira tool
- Create and maintain alerts
- Handle change validation and change planning requests
- Assist business stakeholders in determining SLOs or adjusting threshold limits
- Manage demand and capacity, and correct SLI/SLO threshold limits
- Gather and analyze metrics from both operating systems and applications to assist in performance tuning and fault finding
- Partner with development teams to improve services through rigorous testing and release procedures
- Participate in system design consulting, platform management, and capacity planning
- Create sustainable systems and services through automation and uplifts
- Balance feature development speed and reliability using well-defined service level objectives and indicators (SLOs, SLIs)
- Debug production issues across services and levels of the stack
- ID: #43479894
- State: California Pleasanton 94566 Pleasanton USA
- City: Pleasanton
- Salary: Depends on Experience
- Job type: Contract
- Showed: 2022-06-22
- Deadline: 2022-08-19
- Category: Et cetera