Site Reliability Engineer

Mumbai, Maharashtra, India

https://www.mncjobsindia.com/company/morningstar

Job Description

Our Team:
Technology drives our business. Our team is made up of talented software engineers, infrastructure engineers, leaders and UX professionals. We care about technology as a craft and a differentiator. We bring our global products to market with a mix of software, cloud, data centers, infrastructure, design and grit.
Our Product Groups:
Individual Investor - building products like Morningstar.com and mobile apps for individuals like yourself
Institutional Investor - developing some of our flagship products like Morningstar Direct for institutional investors and our Advisor products for financial advisors
Workplace - this is where we build and provide our hosted digital advice platform for retirement plans, 401(k)s, etc. (what some call robo-advisors)
Data - this is the heart of Morningstar where all data is sourced, collected, transformed, calculated and distributed across the world
What You'll Do:
As a Site Reliability Engineer (SRE)/DevOps Engineer on our data and analytics team, you will work on the availability, automation, performance, efficiency, scaling, monitoring and emergency response of the core systems that store data at Morningstar. You will build a deep understanding of platforms, architecture, people, systems, and processes to both establish and continuously improve SLIs and SLOs for uptime, performance, deployment, monitoring, and troubleshooting.
Your Day to Day:

Maintain and support the product and data systems: proactively monitor events, investigate issues, analyze solutions, and drive problems through to resolution.
Develop tools and reporting as needed by projects and operations.
Work with product teams to define application hardening measures and identify opportunities for chaos engineering.
Use operational tools and monitoring platforms to gain in-depth knowledge, understanding, and ongoing monitoring of system availability, performance, and capacity.
Implement an alerting strategy that makes alerts actionable and unique.
Provide follow-through to ensure issues are resolved to satisfaction.
Contribute to continuous improvement and innovation within the team.
Bring a sense of ownership, initiative and drive.

Basic Qualifications:
Bachelor's degree or higher with some experience in a technical support role.
You have been working in technology for 0-2 years.
Responsibilities:

First-level support for data triage and issues

Support for other teams - all data consumers
Review data logs and manifests, and track the lineage of data changes.
Identify the causes of data changes and report them to the owners of those changes.
Understand the event framework and triage events in the audit DB.
Access Management

Includes entitlement access (EAMS)
Release management - deployment checklists

Support for data lake releases, dashboard changes, etc.
Coordinate with Data Lake DevOps in Mumbai around releases
Event and Incident management - Alerts and Incidents

RCA-like contributions, e.g., why did this data move?
Do we need to make changes to the pipes, QC checks, etc.?
Incident commander for P1/P2 incidents
Drive continuous improvement by assessing trends in metrics such as MTTA and MTTR
Monitoring, data thresholds/coverage checks

Build and monitor dashboards and alerts.
Contribute to the mechanical testing of changes (row counts, nulls, schema breaks, etc.)
Help with deep integration with the QC framework
Ops readiness checklists

Ensure adherence to standards, architecture diagrams, the dataset catalog, data contracts, logging standards, etc.
Dashboards for data status, workflow status for all data movements in all zones

Etleap will be building some dashboards out; we'll need to understand and outline any overlap here
Tools for viewing data for troubleshooting purposes, such as the XOI viewer

Same as above
Ad-hoc operations project coordination

Server maintenance and upgrades
Application and server log management
Disaster recovery planning and events
Security events and patching

Preferred Qualifications:
Experience in Python or other scripting languages
Experience with AWS (S3, SNS, SQS, DynamoDB, Glue, Lake Formation), Spark, and SQL
Experience with Linux and with the Parquet, Avro, and ORC file formats
Knowledge of monitoring tools and strategy, ideally VictorOps, New Relic, CloudWatch, and Splunk
Experience running incident post-mortems
Understanding of automated deployment processes using Terraform and Jenkins
Please include a cover letter describing your passion for engineering operations and for participating in building efficient, reliable systems.
Morningstar is an equal opportunity employer.
Morningstar's hybrid work environment gives you the opportunity to work remotely and collaborate in-person each week. We've found that we're at our best when we're purposely together on a regular basis, at least three days each week. A range of other benefits are also available to enhance flexibility as needs change. No matter where you are, you'll have tools and resources to engage meaningfully with your global colleagues.
Legal Entity: Morningstar India Private Ltd. (Delhi)


Job Detail

Job Id: JD3776243
Industry: Not mentioned
Total Positions: 1
Job Type: Full Time
Salary: Not mentioned
Employment Status: Permanent
Job Location: Mumbai, Maharashtra, India
Education: Not mentioned
Experience: Year
