Improve the reliability, quality, and time to market of our suite of products and applications
Define suitable system metrics as SLOs and SLIs, and set up observability mechanisms to track them
Define an error budget for each SLO
Define the strategy for, and set up, a high-availability, load-balancer-based architecture
Drive a metrics-driven culture and software delivery process, using data to measure overall system quality and reliability
Balance feature development speed against reliability using well-defined service level objectives
Provide primary operational support and engineering for products and applications
Partner with solution architects and development teams to improve service reliability
Participate in system design, infrastructure management, and capacity planning
Participate in automating operational tasks and reducing toil
Provide automation solutions for performance management, disaster recovery, monitoring, and observability
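
As an illustration of how an error budget follows from an SLO, here is a minimal sketch for a request-based availability SLO. The function names, the 99.9% target, and the request counts are assumptions chosen for the example, not figures from this role or any specific product.

```python
import math


def allowed_failures(slo_target: float, total_requests: int) -> int:
    """Return how many requests may fail in the window without breaching the SLO.

    slo_target is the required success ratio, e.g. 0.999 for "99.9% of requests succeed".
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    # Round before flooring so floating-point noise does not shift the result.
    return math.floor(round((1.0 - slo_target) * total_requests, 6))


def budget_remaining(slo_target: float, total_requests: int, failed_requests: int) -> float:
    """Return the fraction of the error budget still unspent (negative if overspent)."""
    allowed = allowed_failures(slo_target, total_requests)
    if allowed == 0:
        # No failures are permitted at this volume: the budget is either intact or gone.
        return 1.0 if failed_requests == 0 else 0.0
    return 1.0 - failed_requests / allowed


if __name__ == "__main__":
    # Hypothetical 30-day window: 1,000,000 requests against a 99.9% availability SLO.
    print(allowed_failures(0.999, 1_000_000))
    print(budget_remaining(0.999, 1_000_000, 250))
```

A team might use the remaining fraction to decide whether to keep shipping features or to prioritize reliability work once the budget runs low.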