SRE Principal Engineer
Important Information
Location: Hyderabad, Bangalore
The Software Site Reliability Engineer ensures that enterprise-wide systems are reliable, scalable, and performant by relentlessly measuring and improving environments. They lead and guide teams to implement new software and system capabilities, enhance code, and optimize processes and tools. Leveraging infrastructure and software engineering expertise, they build reliable solutions from inception or refactor legacy systems for improved reliability. Success is driven by data, customer satisfaction, and empowering teams to achieve excellence.
ESSENTIAL JOB FUNCTIONS AND RESPONSIBILITIES
Develop Software: Build highly available, reliable, and scalable cloud-based SaaS solutions
Drive Reliability: Lead efforts that enhance system reliability and operational efficiency
Design Guidance: Contribute to and guide the design of performant, reliable, and scalable capabilities
Produce Insights: Instrument systems & applications to deliver monitoring and alerting insights
Mentorship: Foster a culture of reliability by mentoring, training, and guiding teams in SRE principles and practices
Incident Management: Manage and lead incidents to restore service to customers quickly
Improve Systems: Lead root cause analysis sessions that mature software systems and teams
Proactive Resolution: Monitor and guide teams in anomaly detection and response.
Standards & Practices: Develop, publish, and guide the use of reliability and incident management practices
Technical Leadership: Lead projects, ensuring solutions align with organizational goals.
Improve Practice: Document engineering & operations case studies that refine SRE practices.
CI/CD Scale: Drive construction of platform reliability components usable by CI/CD delivery pipelines
Agile Practices: Participate in planning, prioritization, and breakdown of team deliverables.
KNOWLEDGE, SKILLS AND ABILITIES
Candidates must possess advanced proficiency in the following:
Technical
Design and delivery of highly reliable SaaS solutions hosted in AWS, Azure, OCI, or GCP
Software Development frameworks using Java, Spring Boot, .NET Core, MVC, JavaScript
Designing and delivering highly observed, reliable and recoverable enterprise event-driven systems
Observability and monitoring experience with OpenTelemetry, Datadog, and CloudWatch
Infrastructure, application and synthetic monitoring and alerting techniques and patterns
Institutionalization of application and system metrics with KPIs, SLIs and SLOs
Observable and reliable relational storage solutions with Postgres, MSQL, or similar
Observable and reliable non-relational database technologies and cloud storage like AWS S3
Observable and reliable containerized apps using Kubernetes, Argo CD, Helm, and Terraform
CI-powered performance and synthetic testing that augments shift-left testing strategies
CD experience using GitHub Actions, Terraform, Go, PowerShell and/or Python
Exposure to AI-assisted pair programming with GitHub Copilot or similar tools
Optimizing applications at scale for network, memory, and I/O performance
Interpersonal
Results-oriented and customer-focused, acting with urgency and purpose.
Ability to make data-driven decisions guided by commitment to customer outcomes.
Strong time management and cross-team partnership ensuring alignment in commitments.
Adaptive verbal and listening skills, being clear and concise while practicing empathy to foster trust and provide meaningful feedback.
Strong written and presentation skills, representing various viewpoints.
Passionate hunger for learning and applying emerging technologies.
Proven ability to identify the root cause of system issues and create/own remediation plans.
EDUCATION AND TRAINING
An undergraduate degree, preferably in Computer Science or a similar technical field.
7+ years of experience in a Site Reliability role.
4+ years of experience in a Software Development role in a production SaaS environment.
About Encora
Encora is a global company that offers Software and Digital Engineering solutions. Our practices include Cloud Services, Product Engineering & Application Modernization, Data & Analytics, Digital Experience & Design Services, DevSecOps, Cybersecurity, Quality Engineering, AI & LLM Engineering, among others.
At Encora, we hire professionals based solely on their skills and do not discriminate based on age, disability, religion, gender, sexual orientation, socioeconomic status, or nationality.