Application Lead

HR, IN, India

Job Description

Project Role :

Application Lead

Project Role Description :

Lead the effort to design, build and configure applications, acting as the primary point of contact.


Must have skills :

Site Reliability Engineering

Good to have skills :

Google Cloud Data Services, Microsoft Azure Analytics Services

Minimum 12 year(s) of experience is required

Educational Qualification :

15 years full time education



Job Title: SRE and Automation Architect
Location: [Insert Location or Remote]
Experience Level: 10+ Years
Employment Type: Full-Time

Job Summary:

We are looking for a seasoned Site Reliability Engineering (SRE) and Automation Architect to lead the design and implementation of highly available, reliable, and automated platforms and operations. The ideal candidate will bridge the gap between development and operations, driving infrastructure automation, observability, resiliency engineering, and SRE best practices at scale across multi-cloud and hybrid environments. This role requires deep technical expertise in cloud platforms (Azure/AWS/GCP), CI/CD pipelines, IaC, SLO/SLI implementation, and incident management automation.

Key Responsibilities:

Platform Reliability & Architecture:
  • Architect highly available, resilient, and self-healing systems and services.
  • Define and implement SLOs, SLIs, error budgets, and performance benchmarks across platforms.
  • Drive observability standards including logging, metrics, and distributed tracing.

Automation Strategy:
  • Lead design and implementation of end-to-end automation across infrastructure provisioning, configuration management, CI/CD pipelines, and incident response.
  • Build reusable IaC modules using tools like Terraform, Ansible, Pulumi, or Bicep.
  • Automate environment creation, scaling, patching, and compliance using scripts and DevOps toolchains.

DevOps & CI/CD:
  • Architect and maintain CI/CD pipelines using Azure DevOps, GitHub Actions.
  • Ensure secure and reliable software deployments by implementing automated testing, canary deployments, blue-green strategies, and rollback automation.

Monitoring & Incident Response:
  • Define standards for monitoring, alerting, and incident management using tools like Prometheus, Grafana, ELK, Datadog, Splunk, or Azure Monitor.
  • Build auto-remediation runbooks and event-driven workflows using platforms like StackStorm, Azure Logic Apps, PagerDuty, or OpsGenie.
  • Facilitate blameless post-mortems and continuous improvement processes.

Security, Compliance & Cost Optimization:
  • Integrate security checks and policy-as-code into automation and deployment pipelines (e.g., with OPA, Sentinel, or Azure Policy).
  • Optimize cost through right-sizing, autoscaling, and usage-based automation.

Collaboration & Leadership:
  • Act as the SRE and automation thought leader across development, infrastructure, and operations teams.
  • Mentor engineers and advocate for modern SRE principles such as Toil Reduction, Error Budgeting, and Release Engineering.
  • Collaborate with architecture teams to align reliability with business and technical goals.

Required Skills & Experience:
  • 10+ years of experience in infrastructure, DevOps, or SRE roles, with at least 3 years in an architect-level role
  • Deep expertise in cloud platforms: Azure (preferred), AWS, or GCP
  • Strong experience with IaC (Terraform, ARM/Bicep, Ansible) and automation scripting (Python, Bash, PowerShell)
  • Hands-on experience with CI/CD tools and container orchestration (Kubernetes, Helm, Istio)
  • Proven ability to design and manage high-availability and disaster recovery strategies
  • Strong observability experience with APM tools, log aggregation, and distributed tracing
  • Knowledge of incident response automation and auto-remediation frameworks

Preferred Qualifications:
  • Certified: Azure DevOps Expert, GCP SRE, AWS DevOps Engineer, or Kubernetes Administrator (CKA)
  • Experience with GitOps tools like Flux or ArgoCD
  • Familiarity with Service Meshes, Chaos Engineering (e.g., Chaos Monkey, Litmus)
  • Understanding of FinOps, Cloud Governance, and Security Automation

Soft Skills:
  • Strategic mindset with attention to detail
  • Excellent problem-solving and analytical skills
  • Strong communication and documentation skills
  • Passion for automation, scalability, and improving developer productivity







Job Detail

  • Job Id
    JD4135049
  • Industry
    Not mentioned
  • Total Positions
    1
  • Job Type:
    Full Time
  • Salary:
    Not mentioned
  • Employment Status
    Permanent
  • Job Location
    HR, IN, India
  • Education
    Not mentioned
  • Experience
    Year