Principal LLM & MLOps Engineer

IN, India

Job Description

We are looking for a senior engineer who specializes in LLM systems, prompt engineering, and agentic application deployment, combined with strong MLOps and cloud platform engineering experience.


You will design, deploy, and scale Generative AI models, retrieval-augmented generation (RAG) pipelines, and autonomous agent frameworks on OCI.


In this role, you'll work closely with data scientists, platform architects, and research teams to build production-grade AI systems, including:


  • LLM fine-tuning and adaptation
  • Prompt and prompt-chain optimization
  • Multi-agent orchestration frameworks
  • Automated evaluation and guardrail systems
  • Model and data drift monitoring and continuous retraining workflows

You will be a key contributor to defining our AI platform architecture, ensuring operational scale, efficiency, security, and reliability.



Your responsibilities will include:



LLM & Agentic Development:



  • Design, evaluate, and optimize prompts, prompt chains, and agent behaviors.
  • Build and deploy RAG systems, vector search pipelines, and knowledge-grounding layers.
  • Develop agent orchestration workflows using frameworks like LangChain, LlamaIndex, Guidance, or AG2.
  • Integrate LLMs with external tools, APIs, and internal business systems.

LLMOps & Platform Engineering:



  • Deploy and host open-source and proprietary LLMs on OCI (e.g., GPT, Llama, Mistral, Grok).
  • Implement automated evaluation frameworks to measure truthfulness, relevance, safety, latency, and cost.
  • Manage fine-tuning, LoRA adaptation, or embedding model selection.

Data Pipeline & Quality:



  • Build pipelines that ensure data freshness, traceability, and semantic relevance for downstream LLM tasks.
  • Use data validation frameworks (e.g., Great Expectations, Evidently) to detect drift or knowledge degradation.

Observability, Monitoring & Cost Optimization:



  • Track LLM system performance, token usage, latency, and operational anomalies.
  • Implement model guardrails, safety layers, and automated fallback behavior.

Collaboration & Mentorship:



  • Work directly with Data Science and Product to translate domain problems into LLM and agent architectures.
  • Mentor engineers and scientists on LLM deployment, prompt strategy, and evaluation methods.
  • Work closely with architects, product teams, data engineers, and other stakeholders to deliver end-to-end AI solutions that address business needs.

Technical Skills:



  • Strong Python engineering background.
  • Experience with LLMs, RAG pipelines, or agent frameworks (LangChain, LlamaIndex, Haystack, AG2, etc.).
  • Hands-on cloud infrastructure experience (OCI, AWS, GCP, or Azure).
  • Experience with vector databases (e.g., Chroma, Pinecone, Weaviate, Milvus, pgvector).
  • Experience with Kubernetes, Docker, and CI/CD automation.

Nice to Have:



  • Experience fine-tuning or adapting LLMs (e.g., LoRA, QLoRA, RLHF, supervised fine-tuning).
  • Experience with prompt evaluation and automated testing frameworks (e.g., RAGAS, TruLens, DeepEval).
  • Experience deploying microservices architectures in production environments.

Qualifications:


  • 8+ years of experience in software engineering, machine learning engineering, or platform engineering, with at least 2 years focused on ML/AI systems in production.
  • Hands-on experience developing or deploying Large Language Model (LLM) systems, including prompt engineering, RAG pipelines, agent-based workflows, or LLM fine-tuning.
  • Strong proficiency in Python and experience with one or more LLM/agent frameworks (e.g., LangChain, LlamaIndex, Haystack, Guidance, AG2).
  • Experience designing and operating cloud-native ML systems on OCI, AWS, GCP, or Azure.
  • Proficiency with Kubernetes, Docker, and CI/CD pipelines for deploying and scaling services.
  • Experience with data workflow orchestration (e.g., Airflow, Prefect, Dagster) and data validation frameworks (e.g., Great Expectations, Evidently).
  • Strong understanding of vector databases (e.g., Pinecone, Weaviate, Milvus, Chroma, Postgres + pgvector).
  • Demonstrated ability to build and maintain production monitoring, alerting, and observability dashboards (e.g., Prometheus, Grafana).
  • Excellent communication and collaboration skills, with the ability to mentor and lead technical discussions.
  • Bachelor's or master's degree in computer science, engineering, or a related field, or equivalent practical experience.



Job Detail

  • Job Id
    JD4664112
  • Industry
    Not mentioned
  • Total Positions
    1
  • Job Type:
    Full Time
  • Salary:
    Not mentioned
  • Employment Status
    Permanent
  • Job Location
    IN, India
  • Education
    Not mentioned
  • Experience
    Year