Sr. DevOps Engineer & Site Reliability Engineer

TS, IN, India

Job Description

Senior DevOps & Site Reliability Engineer (DevOps + SRE)



About the Role



We are seeking a highly experienced Senior DevOps & Site Reliability Engineer to support and scale our cloud-native, containerized IoT platform built on AWS. You will work closely with the Technical Manager to automate infrastructure, build CI/CD pipelines, manage large-scale deployments, and ensure the platform's reliability, security, and performance.

This role requires deep hands-on expertise in AWS, Docker/Kubernetes, serverless workflows, infrastructure automation, scripting (Python), and IoT-scale distributed systems reliability.

Key Responsibilities



DevOps Responsibilities



Design, implement, and maintain CI/CD pipelines using GitHub Actions, AWS CodePipeline, or GitLab CI.

Develop and automate deployment workflows following DevOps strategy and best practices.

Manage Docker containerization, including multi-stage builds, optimization, and image security.

Orchestrate containers using Kubernetes (EKS) or AWS ECS (Fargate/EC2).

Manage and optimize ECR for image storage and versioning.

Implement Infrastructure-as-Code using AWS CDK, Terraform, or CloudFormation.

Build automated workflows for backend, microservices, and IoT services deployment.

Support serverless architectures using AWS Lambda, Step Functions, EventBridge, etc.

Implement secure secrets management using AWS IAM, KMS, and Secrets Manager.

Handle configuration, environment management, and zero-downtime deployment strategies.

Site Reliability Engineering (SRE) Responsibilities



Build and maintain monitoring, logging, and tracing pipelines using CloudWatch, Grafana, Prometheus, X-Ray, and OpenTelemetry.

Define and implement SLIs, SLOs, error budgets, and reliability dashboards.

Ensure high availability, resilience, and performance of all production systems.

Conduct incident management, root cause analysis, and post-incident reviews.

Optimize cost, compute utilization, autoscaling policies, and failover strategies.

Implement cloud reliability patterns: circuit breakers, retries, throttling, and canary and blue-green deployments.

Manage production readiness, release safety, and operational excellence.

Required Skills & Qualifications



7+ years of experience in DevOps, SRE, or Cloud Infrastructure roles.

Deep hands-on experience with:

  o Docker containerization & orchestration
  o Kubernetes (EKS) and/or AWS ECS
  o AWS ECR (image lifecycle management)
  o AWS IoT Core, Lambda, API Gateway, VPC, S3, IAM, CloudWatch



Strong scripting experience: Python expertise preferred (Bash is a plus).

Proficiency with GitHub for code management, automation, and CI/CD workflows.

Strong background in Infrastructure-as-Code: AWS CDK, Terraform, or CloudFormation.

Experience with reliability engineering frameworks, large-scale distributed systems, and HA/DR design.

Knowledge of serverless computing and event-driven architectures.

Strong understanding of cloud security, identity management, and compliance.

Job Type: Full-time

Pay: ₹2,000,000.00 - ₹3,000,000.00 per year

Work Location: In person



Job Detail

  • Job Id: JD5029443
  • Total Positions: 1
  • Job Type: Full Time
  • Employment Status: Permanent
  • Job Location: TS, IN, India