Cloud Engineering / Site Reliability Engineering (SRE)
### Job Summary:
We are seeking a skilled and proactive Developer with hands-on experience in Azure Chaos Studio to join our cloud engineering team. The ideal candidate will be responsible for designing, implementing, and managing chaos engineering experiments to improve the resilience and reliability of our cloud-based applications and infrastructure.
### Key Responsibilities:
Design and execute chaos experiments using Azure Chaos Studio to simulate real-world outages and failures.
Collaborate with development, DevOps, and SRE teams to identify critical systems and define failure scenarios.
Analyze experiment results and provide actionable insights to improve system resilience.
Automate chaos testing as part of CI/CD pipelines.
Monitor and report on system behavior during and after chaos experiments.
Develop custom fault injection scripts and integrate them with other Azure services.
Stay updated with the latest features and best practices in Azure Chaos Studio and chaos engineering.
### Required Skills & Qualifications:
Proven experience with Azure Chaos Studio and the Azure ecosystem.
Strong programming/scripting skills in Python, PowerShell, or C#.
Experience with Azure DevOps, ARM templates, or Bicep.
Familiarity with resilience engineering, fault tolerance, and disaster recovery principles.
Experience with monitoring tools like Azure Monitor, Application Insights, or Log Analytics.
Understanding of microservices architecture and distributed systems.
Excellent problem-solving and communication skills.