Sr. DevOps Engineer & Site Reliability Engineer Job in Talentxplore

Sr. DevOps Engineer & Site Reliability Engineer

TS, IN, India

Talentxplore


Job Description

Senior DevOps & Site Reliability Engineer (DevOps + SRE)

About the Role

We are seeking a highly experienced Senior DevOps & Site Reliability Engineer to support and scale our cloud-native, containerized IoT platform built on AWS. You will work closely with the Technical Manager to automate infrastructure, build CI/CD pipelines, manage large-scale deployments, and ensure the platform's reliability, security, and performance.

This role requires deep hands-on expertise in AWS, Docker/Kubernetes, serverless workflows, infrastructure automation, scripting (Python), and IoT-scale distributed systems reliability.

Key Responsibilities

DevOps Responsibilities

Design, implement, and maintain CI/CD pipelines using GitHub Actions, AWS CodePipeline, or GitLab CI.

Develop and automate deployment workflows following DevOps strategy and best practices.

Manage Docker containerization, including multi-stage builds, optimization, and image security.

Orchestrate containers using Kubernetes (EKS) or AWS ECS (Fargate/EC2).

Manage and optimize ECR for image storage and versioning.

Implement Infrastructure-as-Code using AWS CDK, Terraform, or CloudFormation.

Build automated workflows for backend, microservices, and IoT services deployment.

Support serverless architectures using AWS Lambda, Step Functions, EventBridge, etc.

Implement secure secrets management using AWS IAM, KMS, and Secrets Manager.

Handle configuration, environment management, and zero-downtime deployment strategies.

Site Reliability Engineering (SRE) Responsibilities

Build and maintain monitoring, logging, and tracing pipelines using CloudWatch, Grafana, Prometheus, X-Ray, and OpenTelemetry.

Define and implement SLIs, SLOs, error budgets, and reliability dashboards.

Ensure high availability, resilience, and performance of all production systems.

Conduct incident management, root cause analysis, and post-incident reviews.

Optimize cost, compute utilization, autoscaling policies, and failover strategies.

Implement cloud reliability patterns: circuit breakers, retries, throttling, and canary and blue-green deployments.

Manage production readiness, release safety, and operational excellence.

Required Skills & Qualifications

7+ years of experience in DevOps, SRE, or Cloud Infrastructure roles.

Deep hands-on experience with:

o Docker containerization & orchestration

o Kubernetes (EKS) and/or AWS ECS

o AWS ECR (image lifecycle management)

o AWS IoT Core, Lambda, API Gateway, VPC, S3, IAM, CloudWatch

Strong scripting experience, with Python expertise preferred (Bash is a plus).

Proficiency with GitHub for code management, automation, and CI/CD workflows.

Strong background in Infrastructure-as-Code: AWS CDK, Terraform, or CloudFormation.

Experience with reliability engineering frameworks, large-scale distributed systems, and HA/DR design.

Knowledge of serverless computing and event-driven architectures.

Strong understanding of cloud security, identity management, and compliance.

Job Type: Full-time

Pay: ₹2,000,000.00 - ₹3,000,000.00 per year

Work Location: In person


Job Detail

Job Id: JD5029443

Industry: Not mentioned

Total Positions: 1

Job Type: Full Time

Salary: Not mentioned

Employment Status: Permanent

Job Location: TS, IN, India

Education: Not mentioned

Experience: Year
