ASAP
Contact: Qualified candidates are invited to submit their resumes to info@walsharp.com for immediate consideration.
About Walsharp Technologies
Walsharp Technologies is a multi-domain, product-based technology company building secure, scalable, and intelligent software platforms using cloud-native architecture, AI/ML, and modern engineering practices. We design, develop, and own our products end-to-end, with a strong focus on reliability, security, and compliance by design.
Our flagship healthcare AI platform, VEROXON, is engineered to meet enterprise reliability, performance, and healthcare compliance standards, supporting mission-critical digital health systems globally.
Role Overview
We are seeking a Platform Reliability Engineer to ensure the availability, performance, resilience, and operational excellence of the VEROXON healthcare AI platform. This role blends SRE, DevOps, and systems engineering practices, with a strong emphasis on healthcare-grade uptime, observability, and incident response.
The ideal candidate will partner closely with DevOps, Cloud Security, Architecture, and Product Engineering teams to build systems that are reliable by default and resilient under scale.
Key Responsibilities
Ensure high availability and reliability of the VEROXON platform
Define and manage SLIs, SLOs, and error budgets
Design and improve monitoring, alerting, and observability
Proactively identify and mitigate reliability risks
Participate in and lead incident response and postmortems
Improve platform resilience, scalability, and fault tolerance
Automate operational workflows to reduce toil
Collaborate with DevOps to improve CI/CD reliability
Support Kubernetes-based production environments
Optimize system performance and capacity planning
Ensure operational readiness for healthcare compliance
Maintain runbooks, dashboards, and operational documentation
Required Skills & Experience
Core Reliability & SRE Experience
6-10 years of experience in SRE, Platform Engineering, or Reliability Engineering
Experience supporting enterprise or SaaS platforms
Strong understanding of distributed systems
Cloud & Infrastructure
Hands-on experience with AWS / Azure / GCP
Experience managing production cloud environments
Knowledge of networking, load balancing, and failover strategies
Kubernetes & Containers
Strong experience with Kubernetes in production
Experience with Docker, Helm, and cluster troubleshooting
Understanding of autoscaling and workload optimization
Observability & Monitoring
Experience with Prometheus, Grafana, ELK, CloudWatch, or similar
Ability to design actionable alerts and dashboards
Experience correlating metrics, logs, and traces
Automation & Scripting
, and audit readiness
Understanding of operational logging and audit trails
Incident Management & Collaboration
Experience leading or participating in production incidents
Strong root cause analysis and post-incident learning
Ability to work with cross-functional teams under pressure
Why Join Walsharp Technologies
Ensure reliability of a mission-critical healthcare AI platform
Work on large-scale, distributed, compliant systems
Influence platform reliability strategy from the ground up
Collaborate with senior engineering and architecture leaders
Build systems where uptime, trust, and patient impact matter
Job Types: Full-time, Permanent
Pay: ₹153,301.63 - ₹1,031,976.91 per year
Benefits:
Flexible schedule
Food provided
Health insurance
Leave encashment
Paid sick time
Paid time off
Provident Fund
Work Location: In person