Currently, we are looking for a remote Lead Site Reliability Engineer to join our team.
Responsibilities
Work with development partners to shape the architecture, design, and implementations of new and existing systems to enhance their reliability, performance, efficiency, and scalability
Ensure all key services are measured and monitored, and that they raise alerts when needed
Automate deployment and configuration processes
Develop reliability tools and frameworks for use by all engineers
Share on-call duty for the most critical systems, and lead incident response and blameless post-mortem analysis and review
Drive efficiencies in systems and processes: capacity planning, configuration management, performance tuning, monitoring and root cause analysis
Requirements
5+ years of relevant experience
A dynamic personality with grit, drive, and a deep sense of ownership
BS or MS in Computer Science or a related technical discipline. Equivalent practical experience is a reasonable substitute
Expertise or deep working knowledge in cloud networking and next-generation security services
Working knowledge of Palo Alto Networks products (next-generation firewalls, Panorama, GlobalProtect VPN) and HashiCorp products (Vault, Terraform, etc.)
Expertise in infrastructure as code, automation, and orchestration
Working knowledge of Kubernetes, Terraform, Prometheus, Elastic, and Jenkins (or similar tools)
Well versed in multiple cloud platforms (AWS and Azure)
Good understanding of IAM (Identity and Access Management) in cloud
Good programming skills in one of C/C++, Java, JavaScript, Python or Go, and an ability to pick up new ones
A good understanding of large-scale distributed systems in practice, including multi-tier architectures, application security, monitoring and storage systems
Good understanding of the DevOps and SAFe/Scrum ways of working
We offer/Benefits