Alvarez & Marsal (A&M) is a global consulting firm with over 10,000 entrepreneurial, action- and results-oriented professionals in over 40 countries. We take a hands-on approach to solving our clients' problems and helping them reach their potential. Our culture celebrates independent thinkers and doers who positively impact our clients and shape our industry. The collaborative environment and engaging work, guided by A&M's core values of Integrity, Quality, Objectivity, Fun, Personal Reward, and Inclusive Diversity, are why our people love working at A&M.
The Team
Our DTS team covers the full breadth of Technology Consulting and M&A services, including:
Technology M&A and Strategy - Assisting clients in managing the technology aspects and business enablement of complex M&A, integrations, and carve-outs, as well as post-deal value creation
Technology Consulting - End-to-end technology advisory for clients, including technology roadmap development, platform/cloud/data advisory, and transformation excellence for digital transformations
Data & AI services - Helping clients harness the power of data and cutting-edge analytics to drive intelligent decision-making and transform businesses, and developing GenAI and Agentic AI solutions that create real business value for clients through process re-invention
How you will contribute
The Lead / DevOps Platform Engineer is a foundational role responsible for enabling reliable, secure, scalable, and cost-governed delivery of AI, Machine Learning, and Generative AI solutions across the enterprise. This role owns the platform layer that sits beneath AI applications, covering cloud infrastructure, CI/CD pipelines, MLOps/LLMOps automation, observability, security, and cost controls. The role ensures that AI solutions do not remain experimental but are production-ready, repeatable, auditable, and operable at scale. This role exists to eliminate risks and provide a stable platform backbone for AI and data teams to innovate safely and efficiently.
Key Responsibilities
1. AI Platform & Cloud Architecture
Own and evolve cloud platform architecture supporting AI, ML, and GenAI workloads across all environments
Design platforms for model training, fine-tuning, high-availability inference, batch and event-driven pipelines, and long-running or agent-based workflows
Ensure platforms are cloud-native, modular, extensible, and aligned with enterprise architecture standards
Enable multi-cloud portability (Azure, AWS, GCP) through abstraction of cloud dependencies
Partner with GenAI & Data Architects to align platform capabilities with RAG pipelines, agent orchestration, and data platform architectures
2. CI/CD & Automation
Design and implement end-to-end CI/CD pipelines for applications, data pipelines, ML models, and GenAI prompts
Standardize environment promotion with automated testing, approvals, rollback, and release controls
Integrate pipelines with source control, artifact repositories, model registries, and prompt repositories
Implement progressive delivery patterns such as blue-green deployments, canary releases, and feature flags
Embed security scans, quality gates, and compliance checks directly into CI/CD workflows
3. Infrastructure as Code & Environment Standardization
Define and enforce Infrastructure-as-Code standards using Terraform, ARM/Bicep, and cloud SDKs
Automate provisioning of compute, storage, networking, Kubernetes clusters, and AI platform services
Ensure environments are reproducible, version-controlled, auditable, and free from configuration drift
4. Observability, Reliability & SRE Practices
Design and implement end-to-end observability including metrics, logs, and distributed tracing
Define and monitor SLIs and SLOs for AI, data, and platform services
Design for high availability, fault tolerance, and disaster recovery
Lead incident response, root-cause analysis, and post-incident reviews
Drive continuous reliability improvements using operational metrics
5. Cost Management & FinOps
Implement FinOps practices for AI and data platforms
Track and optimize infrastructure usage, cost per inference, and GenAI token consumption
Establish cost guardrails including budgets, alerts, auto-scaling, and shutdown policies
Partner with architects and business stakeholders to balance accuracy, latency, scale, and cost
6. Security, Governance & Compliance
Embed security-by-design into platform architecture and delivery pipelines
Implement IAM, secrets management, encryption, network segmentation, and secure connectivity
Enable audit logging, traceability, and governance for model execution, prompt usage, and data access
Support internal and external audits, penetration testing, and compliance reviews
7. MLOps / LLMOps Enablement
Enable and operate MLOps and LLMOps platforms covering training, serving, monitoring, versioning, and rollback
Support automated evaluation, retraining, drift detection, and performance degradation alerts
Ensure platforms support experimentation without compromising production stability
8. Collaboration & Leadership
Collaborate with GenAI & Data Architects, AI Engineers, Backend and Frontend Engineers, Security, QA, and Delivery teams
Participate in Agile ceremonies, release planning, and roadmap discussions
Provide technical leadership and mentoring to DevOps and platform engineers
Define platform standards, documentation, and best practices
Act as a trusted advisor to leadership on scalability, risk, and cost
Qualifications
Bachelor's degree in Computer Science, Engineering, or a related discipline
Master's degree preferred
Relevant certifications strongly desired (Azure/AWS/GCP Architect or DevOps, Kubernetes, Terraform)
8-12+ years of experience in DevOps, Platform Engineering, Cloud Infrastructure, or SRE roles
Proven experience designing and operating enterprise-scale, production platforms
Hands-on experience supporting AI/ML and GenAI workloads in regulated or security-conscious environments
Deep expertise in at least one major cloud platform (Azure, AWS, or GCP)
Strong experience with CI/CD, Infrastructure as Code, Kubernetes, and containerized workloads
Proven experience implementing observability, reliability engineering, and incident management practices
Strong understanding of cloud security, governance, and compliance requirements
Hands-on experience with cloud cost optimization and FinOps practices
Proven ability to lead and mentor platform teams and communicate effectively with executive stakeholders
Your journey at A&M
We recognize that our people are the driving force behind our success, which is why we prioritize an employee experience that fosters each person's unique professional and personal development. Our robust performance development process promotes continuous learning, rewards your contributions, and fosters a culture of meritocracy. With top-notch training and on-the-job learning opportunities, you can acquire new skills and advance your career. We prioritize your well-being, providing benefits and resources to support you on your personal journey. Our people consistently highlight the growth opportunities, our unique, entrepreneurial culture, and the fun we have together as their favorite aspects of working at A&M. The possibilities are endless for high-performing and passionate professionals.
Inclusive Diversity
-----------------------
A&M's entrepreneurial culture celebrates independent thinkers and doers who can positively impact our clients and shape our industry. The collaborative environment and engaging work--guided by A&M's core values of Integrity, Quality, Objectivity, Fun, Personal Reward, and Inclusive Diversity--are the main reasons our people love working at A&M. Inclusive Diversity means we embrace diversity, and we foster inclusiveness, encouraging everyone to bring their whole self to work each day. It runs through how we recruit, develop employees, conduct business, support clients, and partner with vendors. It is the A&M way.
Equal Opportunity Employer
------------------------------
It is Alvarez & Marsal's practice to provide and promote equal opportunity in employment, compensation, and other terms and conditions of employment without discrimination because of race, color, creed, religion, national origin, ancestry, citizenship status, sex or gender, gender identity or gender expression (including transgender status), sexual orientation, marital status, military service and veteran status, physical or mental disability, family medical history, genetic information or other protected medical condition, political affiliation, or any other characteristic protected by and in accordance with applicable laws.
Please note that as per A&M policy, we do not accept unsolicited resumes from third-party recruiters unless such recruiters are engaged to provide candidates for a specified opening. Any employment agency, person or entity that submits an unsolicited resume does so with the understanding that A&M will have the right to hire that applicant at its discretion without any fee owed to the submitting employment agency, person or entity.