- Strong hands-on SRE experience
- Experience supporting web-based (Java) applications
- APM tool experience
- Experience with application operations, cloud platforms, system uptime, system recovery, performance, latency, monitoring, and root cause analysis.
- 4-6+ years of experience as an automation and tooling engineer.
- Solid knowledge of and experience with scripting (Python / Bash) for Java/NodeJS runtime environments.
- Deep understanding of and experience with microservices, APIs, and web services.
- Strong hands-on experience developing applications using Java, NodeJS / AngularJS, Python, Go, etc.
- Experience with cloud-native applications, Docker, Kubernetes, etc.
- Experience writing clean, modular TypeScript code using external libraries or custom code.
- Experience with CI/CD pipelines using Jenkins and GitHub.
- Good to have: experience with tools such as BlueTriangle, writing Splunk queries, and monitoring tools such as Dynatrace.
- Excellent verbal and written communication skills.
- Prior experience in supporting web and mobile apps
- Basic knowledge of CDN (Akamai)
- Exposure to monitoring tools (APM, synthetic and log monitoring, etc.)
- Azure exposure (or any other cloud)
- Unix & Scripting for Automation
- eCommerce experience (supporting web applications, etc.)
- Responsible for toil reduction, implementing identified improvement opportunities, and handling minor enhancements and non-ticketed activities.
- Define and monitor service level metrics, including incident management KPIs such as MTTD, MTTR, MTBF, MTTF, unavailability rate, and incident count.
- Create rules to optimize incident response through metrics, streamlined alert flows, and collaboration and communication across squads.
- Proactively identify issues that might disrupt the service in production
- Address incoming service requests through the support groups/Jira tool
- Create and maintain alerts
- Handle change validation and change planning requests
- Assist business stakeholders in determining SLOs or adjusting threshold limits
- Manage demand and capacity, and correct SLI/SLO threshold limits
- Gather and analyze metrics from both operating systems and applications to assist in performance tuning and fault finding
- Partner with development teams to improve services through rigorous testing and release procedures
- Participate in system design consulting, platform management, and capacity planning
- Create sustainable systems and services through automation and uplifts
- Balance feature development speed and reliability with well-defined service level objectives and indicators (SLOs, SLIs)
- Debug production issues across services and levels of the stack.
- ID: #43687038
- State: California, USA (Pleasanton 94566)
- City: Pleasanton
- Salary: Depends on Experience
- Job type: Contract
- Posted: 2022-06-29
- Deadline: 2022-08-27
- Category: Et cetera