Senior Site Reliability Engineer

Details of the offer

Reports to: Project Lead
Experience: 5+ years
Start date: 1st August 2022

Responsibilities

Reduce toil, implement identified improvement opportunities, and handle minor enhancements and non-ticketed activity.
Define and monitor service level metrics, including reliability metrics such as MTTD, MTTR, MTBF, MTTF, unavailability rate, and incident count.
Create rules to optimize incident response through metrics, streamlined alert flows, and collaboration and communication across squads.
Proactively identify issues that might disrupt the service in production.
Route incoming service requests to their support groups or the Jira tool.
Create and maintain alerts.
Handle change validation and change planning-related requests.
Assist business stakeholders in determining SLOs or adjusting threshold limits.
Handle demand and capacity management, and correct SLI/SLO threshold limits.
Gather and analyze metrics from both infrastructure and applications to assist in bug fixing.
Engage in capacity planning and performance tuning exercises.
Partner with development teams to improve services through rigorous testing and release procedures.
Participate in system design consulting, platform management, and capacity planning.
Create sustainable systems and services through automation and uplifts.
Balance feature development speed and reliability with well-defined service level objectives and indicators (SLOs, SLIs).
Debug production issues across services and levels of the stack.

Required Skills and Qualifications

Bachelor's degree in computer science or another highly technical, scientific discipline.
Experience in AEM and web services/APIs.
Experience working with public clouds (minimum 3 years' experience is a must).
Experience with Git or other source control systems.
Experience using tools to create and manage CI (continuous integration) and CD (continuous delivery) pipelines.
Working knowledge of service level definitions and identifying KPIs.
Working knowledge of the TCP/IP stack, internet routing, and load balancing.
Experience with distributed storage technologies like NFS, HDFS, and Ceph.
Experience in observability strategy.

Delivery Model: Onsite
Job Type: Full Time
Job Location: Auckland

Source: Jobleads
