As a Site Reliability Engineer, you will help build a meaningful engineering discipline, combining software and systems to develop creative engineering solutions to operations problems. Much of our support and development will focus on existing systems, building infrastructure and reducing work through automation. You'll join a team of curious problem solvers with a diverse set of perspectives who are thinking big and taking risks. In this environment you'll take on relevant projects, backed by an organization that provides the support and mentorship you need to learn and grow. As an SRE, you'll focus on running better production applications and systems.
Responsibilities:
Design, code, debug, test, and deliver software to automate manual operational work
Troubleshoot minor incidents, facilitate blameless post-mortems and ensure permanent closure of incidents
Participate in the application or service development lifecycle through code contributions
Engage with tools and operations teams to address failure patterns and incidents.
Develop automation tools for efficient, noiseless alerting and for reducing toil and technical debt.
Conduct performance tests, and identify and document application optimization solutions.
Perform analytics on previous incidents and usage patterns to better predict issues and take proactive actions
Logging, metrics and alerting: manage and organize an on-call schedule connected to metrics and log events.
Participate in 24x7 support coverage as needed
Qualifications:
Bachelor's degree or equivalent experience in a software engineering discipline
Proficiency in at least one programming language (e.g. Python, Java, Go) with respect to designing, coding, testing, and software delivery.
Understanding of the software delivery lifecycle using Agile practices.
Expertise in application, data and infrastructure disciplines
Advanced knowledge of one or more infrastructure components (e.g. networking, storage, compute systems)
Capable of managing service level changes to a system or service.
Hands-on experience with deployment, monitoring, automation and ops analysis tools such as Prometheus, Elasticsearch, Grafana, Kibana, Splunk, Dynatrace Managed, UiPath, etc.
Software engineering experience using one or more object-oriented programming languages and/or scripting.
Proficiency in one or more technology domains; may be a cross-domain expert able to solve complex, mission-critical problems within a business or across the firm