Senior Systems Engineer II, SRE

Hyderabad, Telangana, India

Job Description

About Marriott:
Marriott Tech Accelerator is part of Marriott International, a global leader in hospitality. Marriott International, Inc. is a leading American multinational company that operates a vast array of lodging brands, including hotels and residential properties. It consists of over 30 well-known brands and nearly 8,900 properties situated in 141 countries and territories.
Role Title: Senior Systems Engineer II, SRE
Position Summary:
The Senior Systems Engineer - Site Reliability Engineering (SRE) is responsible for the reliability, scalability, and performance of mission-critical cloud and on-prem services that support millions of Marriott customers globally. This role involves overseeing incident management, driving automation efforts, and working closely with cross-functional teams to ensure alignment between SRE strategy and business objectives. The engineer partners closely with Product teams, Applications teams, Infrastructure, and the broader Applications and Infrastructure Delivery teams to develop key metrics and KPIs that improve application stability, availability, and performance. The ideal candidate will bring strong communication skills, collaborating with key stakeholders across the company to optimize cloud infrastructure and uphold the highest standards of operational excellence in a dynamic, fast-paced environment.
Job Responsibilities:

Ensure the reliability, availability, and performance of mission-critical cloud services, implementing best practices for monitoring, alerting, and incident management.
Oversee the management of high-severity incidents, driving quick resolution and post-incident analysis to identify root causes and prevent recurrence.
Drive the automation of operational processes and ensure systems can scale effectively to support growing user demand, optimizing cloud and on-prem infrastructure and resource usage.
Develop and execute the SRE strategy aligned with business goals, and communicate service health, reliability, and performance metrics to senior leadership and stakeholders.

Service Reliability & Automation:

Maintain availability and performance of critical systems.
Lead incident management and postmortem analysis.
Automate operations and optimize infrastructure for scalability.

Monitoring & Observability:

Develop KPIs and implement end-to-end monitoring across applications.
Create and maintain dashboards, alerts, and reports.
Ensure compliance with enterprise monitoring standards.

Collaboration & Communication:

Work with architecture, application, and infrastructure teams.
Coordinate with vendors to align tools with business goals.
Communicate system health and performance to leadership.

Project & Technical Leadership:

Plan and lead technical initiatives aligned with business goals.
Deliver timely reports and ensure operational excellence.
Provide expert consultation and technical mentoring across teams.

Skill and Experience:

6-8 years of experience in information technology process and/or technical project management, including:
2+ years of experience as a Site Reliability Engineer (SRE), building and managing highly available, mission-critical systems, with 3+ years of experience on public cloud, preferably AWS
Expertise in enterprise storage platforms (e.g., NetApp, Dell EMC, Isilon, Unity, Pure Storage, PURE Cloud Block Store)
Expertise in cloud storage platforms (e.g., EBS, S3, Azure Blob, AWS FSx, ONTAP FSx, etc.)
Deep knowledge of enterprise backup technologies (e.g., Commvault, Rubrik, Veeam, Veritas)
Deep knowledge of cloud-native backups (e.g., AWS Backup, Azure Backup, etc.)
Strong scripting skills (Python, Shell, PowerShell).
Familiarity with Infrastructure as Code (IaC) tools like Terraform and CloudFormation
Monitoring and observability experience using Prometheus, Grafana, ELK Stack, or similar.
Proven automation and programming experience in one or more of the following languages: Java, Python, Go, Perl, Bash
Deep understanding of SRE practices such as Service Level Objectives, Error Budgets, Toil Management, Observability & Monitoring, Blameless Postmortems, Incident Response Process, Capacity Planning
Exposure to cloud-native, relational, and NoSQL databases such as RDS, MySQL, PostgreSQL, Cassandra, or Couchbase is preferred.
Experience with deploying, monitoring, and troubleshooting large-scale, distributed applications in cloud environments such as AWS
Experience in vulnerability assessment, patching, and security compliance of infrastructure, storage, and backup
Experience in setting up disaster recovery (DR) using approved storage and backup technologies
Familiarity with security frameworks such as ISO 27001, SOC 2, PCI-DSS, and/or HIPAA
Experience working with SaaS, IaaS, and PaaS offerings
Ability to work with global teams located in the US and India
6+ years of experience in a technical discipline role, with experience in planning, implementing, and evaluating processes, systems, and/or initiatives
Broad technical acumen across multiple disciplines and applications, with a solid understanding of current technologies
Experience applying measurement processes / methods for assessing program outputs and outcomes or progress toward goals and objectives.
Extremely high level of analytical ability with complex problems
Ability to work across organizational boundaries, to help lead and influence change
Ability to command the process across all levels to ensure customer focus, including being assertive and self-starting
Demonstrated leadership experience in influencing and garnering alignment from external organizations
Ability to align change management strategies with projects
Skilled in conceptualizing creative solutions, documenting them, and presenting / selling them to senior management
Very high level of interpersonal skills to work effectively with others, motivate employees, and elicit work output in a team environment

Education and Certifications:

Undergraduate degree in Computer Science or related technical field or equivalent experience/certification

Work location: Hyderabad, India.
Work mode: Hybrid
Marriott's core values:
At Marriott, our core values make us who we are. We believe that success is never final. As we change and grow, the beliefs that are most important to us stay the same: putting people first, pursuing excellence, embracing change, acting with integrity, and serving our world. Being part of Marriott Tech Accelerator means being part of a proud history and a thriving culture.

Job Detail

Job Id: JD3890630
Industry: Not mentioned
Total Positions: 1
Job Type: Full Time
Salary: Not mentioned
Employment Status: Permanent
Job Location: Hyderabad, Telangana, India
Education: Not mentioned