At NetApp, we have a history of helping customers turn challenges into business opportunities. That's because we bring new thinking to age-old problems, like how to use data most effectively in the most efficient possible way. As an Engineer with NetApp, you'll have the opportunity to work with modern cloud and container orchestration technologies in a production setting. You'll play an important role in scaling systems sustainably through automation and evolving them by pushing for changes to improve reliability and velocity.
About NetApp
NetApp is the intelligent data infrastructure company, turning a world of disruption into opportunity for every customer. No matter the data type, workload or environment, we help our customers identify and realize new business possibilities. And it all starts with our people.
If this sounds like something you want to be part of, NetApp is the place for you. You can help bring new ideas to life, approaching each challenge with fresh eyes. Of course, you won't be doing it alone. At NetApp, we're all about asking for help when we need it, collaborating with others, and partnering across the organization - and beyond.
Job summary
---------------
As a Cloud Infrastructure/Site Reliability Engineer, you will be operating at the intersection of development and operations. Your role will involve engaging in and enhancing the lifecycle of cloud services - from design through deployment, operation, and refinement. You will be responsible for maintaining these services by measuring and monitoring their availability, latency, and overall system health.
You will play a crucial role in sustainably scaling systems through automation and driving changes that improve reliability and velocity. As part of your responsibilities, you will administer cloud-based environments that support our SaaS/IaaS offerings, which are implemented on a microservices, container-based architecture (Kubernetes).
In addition, you will oversee a portfolio of customer-centric cloud services (SaaS/IaaS), ensuring their overall availability, performance, and security. You will work closely with both NetApp and cloud service provider teams, including those from Google, located across the globe in regions such as RTP, Reykjavik, Bangalore, Sunnyvale, Redmond, and more.
Due to the critical nature of the services we support, this position involves participation in a rotation-based on-call schedule as part of our global team. This role offers the opportunity to work in a dynamic, global environment, ensuring the smooth operation of vital cloud services. To be successful in this role, you should be a motivated self-starter and self-learner, possess strong problem-solving skills, and be someone who embraces challenges.
Job requirements
--------------------
Incident Response and Troubleshooting: Address and perform root cause analysis (RCA) of complex live production incidents and cross-platform issues involving OS, Networking, and Database in cloud-based SaaS/IaaS environments. Implement SRE best practices for effective resolution.
Monitoring, Analysis, and Infrastructure Maintenance: Continuously monitor, analyze, and measure system health, availability, and latency using tools like Prometheus, Stackdriver, Elasticsearch, Grafana, and SolarWinds. Develop strategies to enhance system and application performance, availability, and reliability. In addition, maintain and monitor the deployment and orchestration of servers, Docker containers, databases, and general backend infrastructure.
Document system knowledge as you acquire it, create runbooks, and ensure critical system information is readily accessible.
Security Management: Stay updated with security protocols and proactively identify, diagnose, and resolve complex security issues.
Automation and Efficiency: Identify tasks and areas where automation can be applied to achieve time efficiencies and risk reduction. Develop software for deployment automation, packaging, and monitoring visibility.
Issue Tracking and Resolution: Use Atlassian Jira, Google Buganizer, and Google IRM to track and resolve issues based on their priority.
Team Collaboration and Influence: Work in tandem with other Cloud Infrastructure Engineers and developers to ensure maximum performance, reliability, and automation of our deployments and infrastructure. Additionally, consult and influence developers on new feature development and software architecture to ensure scalability.
Debugging, Troubleshooting, and Advanced Support: Undertake debugging and troubleshooting of service bottlenecks throughout the entire software stack. Additionally, provide advanced tier 2 and 3 support for NetApp's Cloud Data Services solutions.
Directly influence the decisions and outcomes related to solution implementation: measure and monitor availability, latency, and overall system health.
Proficiency in Linux/Unix and CoreOS.
Demonstrated experience in scripting and infrastructure automation using tools such as Ansible, Python, Go or Ruby.
Deep working knowledge of Containers, Kubernetes, and Serverless computing implementation.
Familiarity with DevOps development methodologies.
Experience with distributed systems design patterns using tools such as Kubernetes.
Experience with cloud platforms such as AWS, Azure, or Google Cloud.
Education
-------------
8-12 years of experience is required.
A Bachelor of Science degree in Computer Science, a master's degree, or equivalent experience is required.
At NetApp, we embrace a hybrid working environment designed to strengthen connection, collaboration, and culture for all employees. This means that most roles will have some level of in-office and/or in-person expectations, which will be shared during the recruitment process.
Equal Opportunity Employer:
NetApp is firmly committed to Equal Employment Opportunity (EEO) and to compliance with all laws that prohibit employment discrimination based on age, race, color, gender, sexual orientation, gender identity, national origin, religion, disability, genetic information, pregnancy, and any other protected classification.
Why NetApp?
We are all about helping customers turn challenges into business opportunity. It starts with bringing new thinking to age-old problems, like how to use data most effectively to run better - but also to innovate. We tailor our approach to the customer's unique needs with a combination of fresh thinking and proven approaches.
We enable a healthy work-life balance. Our volunteer time off program is best in class, offering employees 40 hours of paid time off each year to volunteer with their favourite organizations. We provide comprehensive benefits, including health care, life and accident plans, emotional support resources for you and your family, legal services, and financial savings programs to help you plan for your future. We support professional and personal growth through educational assistance and provide access to various discounts and perks to enhance your overall quality of life.
If you want to help us build knowledge and solve big problems, let's talk.
Submitting an application
To ensure a streamlined and fair hiring process for all candidates, our team only reviews applications submitted through our company website. This practice allows us to track, assess, and respond to applicants efficiently. Emailing our employees, recruiters, or Human Resources personnel directly will not influence your application.
Our values
--------------
Put the customer at the center. Care for each other and our communities. Think and act like owners. Build belonging every day. Embrace a growth mindset.
Benefits
------------
Volunteer time off
40 hours of paid volunteer time each year.
Well-being
Employee Assistance Program, fitness, and mental health resources to help employees be their best.
Time away
Paid time off for vacation and to recharge.
Beware of fraud agents! Do not pay money to get a job.
MNCJobsIndia.com will not be responsible for any payment made to a third party. All Terms of Use are applicable.