Country/Region: IN
Requisition ID: 28350
Work Model:
Position Type:
Salary Range:
Location: INDIA - MUMBAI - CRISIL
Title:
Technical Lead-ML Development
=============================================
Description:
Area(s) of responsibility
-----------------------------
What You'll Do:
Develop and manage efficient MLOps pipelines tailored for Large Language Models, automating the deployment and lifecycle management of models in production.
Deploy, scale, and monitor LLM inference services across cloud-native environments using Kubernetes, Docker, and other container orchestration frameworks.
Optimize LLM serving infrastructure for latency, throughput, and cost, including hardware acceleration setups with GPUs or TPUs.
Build and maintain CI/CD pipelines specifically for ML workflows, enabling automated validation and seamless rollouts of continuously updated language models.
Implement comprehensive monitoring, logging, and alerting systems (e.g., Prometheus, Grafana, ELK stack) to track model performance, resource utilization, and system health.
Collaborate cross-functionally with ML research and data science teams to operationalize fine-tuned models, prompt engineering experiments, and multi-agent LLM workflows.
Handle integration of LLMs with APIs and downstream applications, ensuring reliability, security, and compliance with data governance standards.
Evaluate, select, and incorporate the latest model-serving frameworks and tooling (e.g., Hugging Face Inference API, NVIDIA Triton Inference Server).
Troubleshoot complex operational issues that affect model availability or degrade model performance, implementing fixes and preventive measures.
Stay up to date with emerging trends in LLM deployment, optimization techniques such as quantization and distillation, and evolving MLOps best practices.
What We're Looking For:
Experience & Skills:
3 to 5 years of professional experience in Machine Learning Operations or ML Infrastructure engineering, including experience deploying and managing large-scale ML models.
Proven expertise in containerization and orchestration technologies such as Docker and Kubernetes, with a track record of deploying ML/LLM models in production.
Strong proficiency in programming with Python and scripting languages such as Bash for workflow automation.
Hands-on experience with cloud platforms (AWS, Google Cloud Platform, Azure), including compute resources (EC2, Google Kubernetes Engine), storage, and ML services.
Solid understanding of serving models using frameworks like Hugging Face Transformers or OpenAI APIs.
Experience building and maintaining CI/CD pipelines tuned to ML lifecycle workflows (evaluation, deployment).
Familiarity with performance optimization techniques such as batching, quantization, and mixed-precision inference specifically for large-scale transformer models.
Expertise in monitoring and logging technologies (Prometheus, Grafana, ELK Stack, Fluentd) to ensure production-grade observability.
Knowledge of GPU/TPU infrastructure setup, scheduling, and cost-optimization strategies.
Strong problem-solving skills with the ability to troubleshoot infrastructure and deployment issues swiftly and efficiently.
Effective communication and collaboration skills to work with cross-functional teams in a fast-paced environment.
Educational Background:
Bachelor's or Master's degree from premier Indian institutes (IITs, IISc, NITs, BITS, IIITs etc.) in:
Computer Science, or
Any Engineering discipline, or
Mathematics or related quantitative fields.