Role Title: AI Platform Engineer
Location: Bangalore (in person at the office when required)
Part of the GenAI COE Team
Key Responsibilities
Platform Development and Evangelism:
Build scalable AI platforms that are customer-facing.
Evangelize the platform with customers and internal stakeholders.
Ensure platform scalability, reliability, and performance to meet business
needs.
Machine Learning Pipeline Design:
Design ML pipelines for experiment management, model management, feature
management, and model retraining.
Implement A/B testing of models.
Design APIs for model inferencing at scale.
Proven expertise with MLflow, SageMaker, Vertex AI, and Azure AI.
LLM Serving and GPU Architecture:
Serve as an SME in LLM serving paradigms.
Possess deep knowledge of GPU architectures.
Expertise in distributed training and serving of large language models.
Proficient in model- and data-parallel training using frameworks like DeepSpeed
and serving frameworks like vLLM.
Model Fine-Tuning and Optimization:
Demonstrate proven expertise in model fine-tuning and optimization
techniques.
Achieve better latencies and accuracies in model results.
Reduce training and resource requirements for fine-tuning LLMs and LVMs.
LLM Models and Use Cases:
Have extensive knowledge of different LLM models.
Provide insights on the applicability of each model based on use cases.
Proven experience in delivering end-to-end solutions from engineering to
production for specific customer use cases.
DevOps and LLMOps Proficiency:
Proven expertise in DevOps and LLMOps practices.
Knowledgeable in Kubernetes, Docker, and container orchestration.
Deep understanding of LLM orchestration frameworks like Flowise, Langflow,
and LangGraph.
Skill Matrix
LLM: Hugging Face OSS LLMs, GPT, Gemini, Claude, Mixtral, Llama
LLM Ops: MLflow, LangChain, LangGraph, Langflow, Flowise, LlamaIndex, SageMaker,
AWS Bedrock, Vertex AI, Azure AI
Databases/Data Warehouses: DynamoDB, Cosmos DB, MongoDB, RDS, MySQL,
PostgreSQL, Aurora, Spanner, Google BigQuery
Cloud Knowledge: AWS/Azure/GCP
DevOps (Knowledge): Kubernetes, Docker, Fluentd, Kibana, Grafana, Prometheus
Cloud Certifications (Bonus): AWS Certified Solutions Architect - Professional,
AWS Certified Machine Learning - Specialty, Azure Solutions Architect Expert
Programming Languages: Proficient in Python, SQL, and JavaScript
Job Type: Full-time
Work Location: In person