Lead Dynatrace SME

TN, IN, India

Job Description

EPAM is a leading global provider of digital platform engineering and development services. We are committed to having a positive impact on our customers, our employees, and our communities. We embrace a dynamic and inclusive culture. Here you will collaborate with multi-national teams, contribute to a myriad of innovative projects that deliver the most creative and cutting-edge solutions, and have an opportunity to continuously learn and grow. No matter where you are located, you will join a dedicated, creative, and diverse community that will help you discover your fullest potential.


We are seeking a talented and experienced Lead Dynatrace SME to drive the implementation and optimization of observability solutions across dynamic, distributed systems. This role will play a pivotal part in designing advanced monitoring frameworks, enhancing incident response workflows, and ensuring high reliability and performance of critical systems.

Responsibilities



* Implement and manage Dynatrace observability solutions across distributed systems and environments
* Migrate dashboards, alerts, and telemetry from Grafana to Dynatrace, ensuring data consistency and performance visibility
* Design and configure telemetry ingestion pipelines using the Dynatrace toolset
* Develop and operationalize SLOs/SLIs and automated alerting frameworks aligned with business KPIs
* Deploy and fine-tune AI-driven anomaly detection and AIOps use cases to improve root-cause analysis and incident prevention
* Create the Order-ID Observability Dashboard for end-to-end visibility of order processing
* Collaborate with L2 and L3 support teams to extend observability coverage and enhance incident response workflows
* Integrate observability insights with ServiceNow and other ITSM tools for unified monitoring and ticket correlation
* Drive continuous improvement in MTTD, MTTR, and overall system resilience through proactive analysis and optimization
* Document observability architecture, dashboards, and operational runbooks in Confluence

Requirements



* 7+ years of experience in Site Reliability Engineering, Observability, or Monitoring roles
* Proven hands-on experience with Dynatrace (dashboards, Smartscape, Davis AI, alerting, tagging, SLOs)
* Solid understanding of AIOps platforms, event correlation, and anomaly detection concepts
* Familiarity with ServiceNow or similar ITSM systems for alert/ticket automation
* Experience building and maintaining observability in Azure environments
* Proficiency in scripting or automation (Python, PowerShell, or similar)
* Strong analytical, diagnostic, and problem-solving skills with attention to system reliability and performance
* English level of minimum B2 (Upper-Intermediate) for effective communication

We offer



* Opportunity to work on technical challenges that may have an impact across geographies
* Vast opportunities for self-development: online university, global knowledge-sharing opportunities, and learning opportunities through external certifications
* Opportunity to share your ideas on international platforms
* Sponsored Tech Talks & Hackathons
* Unlimited access to LinkedIn learning solutions
* Possibility to relocate to any EPAM office for short and long-term projects
* Focused individual development
* Benefit package:
  + Health benefits
  + Retirement benefits
  + Paid time off
  + Flexible benefits
* Forums to explore passions beyond work (CSR, photography, painting, sports, etc.)


Job Detail

  • Job Id
    JD4976023
  • Industry
    Not mentioned
  • Total Positions
    1
  • Job Type:
    Full Time
  • Salary:
    Not mentioned
  • Employment Status
    Permanent
  • Job Location
    TN, IN, India
  • Education
    Not mentioned
  • Experience
    Year