Design, build, and optimize RAG pipelines for enterprise AI applications.
Develop automated prompt engineering frameworks using reinforcement and evaluation feedback loops.
Fine-tune foundation models for domain-specific use cases, ensuring optimal accuracy and latency.
Optimize token usage to reduce inference cost while maintaining semantic precision and relevance.
Collaborate with AI Ops and MLOps teams to deploy optimized models into production environments.
Continuously benchmark and improve model performance through data-driven evaluation methods.
Maintain documentation of prompt, model, and dataset experiments for reproducibility and governance.
Required Technical Skills:
LLM Optimization & Performance
Experience optimizing LLM inference performance across frameworks like vLLM, Hugging Face Transformers, or OpenAI APIs.
Hands-on experience with quantization, pruning, distillation, and memory-efficient fine-tuning techniques (LoRA, QLoRA, PEFT).
Knowledge of model evaluation frameworks (HELM, EvalHarness, DeepEval) for performance benchmarking.
RAG Architecture & Tuning
Expertise in building Retrieval-Augmented Generation (RAG) pipelines using LangChain or LlamaIndex.
Experience with vector databases such as Pinecone, Milvus, FAISS, or Weaviate.
Ability to tune retrieval embeddings, chunking strategies, and ranking for contextual relevance.
Prompt Engineering & Automation
Design, test, and optimize dynamic prompt templates for complex reasoning and domain-specific tasks.
Develop automated prompt tuning pipelines using AI-assisted evaluation and reinforcement feedback.
Implement prompt caching and adaptive prompt selection strategies to enhance inference performance.
Fine-Tuning & Adaptation
Implement supervised fine-tuning (SFT), reinforcement learning from human feedback (RLHF), and domain adaptation methods.
Use fine-tuning frameworks like Hugging Face Trainer, PEFT, or DeepSpeed.
Manage model versioning, dataset curation, and experiment tracking using MLflow or Weights & Biases.
Token Usage & Cost Optimization
Analyze and optimize token utilization through summarization, instruction compression, and dynamic truncation strategies.
Implement intelligent token budgeting and API cost monitoring systems.
Leverage AI-based analytics tools to optimize inference efficiency and output quality.
AI Infrastructure & Tooling
Familiarity with cloud-based AI infrastructure (AWS, Azure, GCP, or OCI) for large-scale model operations.
Experience with containerization, GPU orchestration, and deployment automation pipelines.
Proficiency with Python, PyTorch, TensorFlow, and modern LLM SDKs.
Required Skills & Experience:
5+ years of experience in AI/ML or NLP engineering roles.
Strong understanding of transformer architectures and attention mechanisms.
Hands-on experience with LLM fine-tuning and inference optimization.
Proficiency in building and deploying RAG systems in production environments.
Experience with vector databases, embeddings, and retrieval frameworks.
Proficient in Python and modern deep learning libraries such as PyTorch or TensorFlow.
Familiarity with model evaluation, cost analysis, and performance profiling tools.
Drivestream's Employee Benefits
----------------------------------------
Remuneration
----------------
Drivestream offers competitive pay and attracts a diverse community of skilled individuals. We recognize the value of investing in our talent.
Medical, Disability and Life Insurance
------------------------------------------
We provide an array of coverage options, including full medical, full dental, and vision plans; employee life insurance; long-term and short-term disability (LTD and STD) coverage; a flexible spending account; and employee accidental death and dismemberment coverage.
Leave Benefits
------------------
Drivestream's generous paid leave programs feature vacation/paid time off (PTO), holiday leave, and bereavement leave.
Professional Development
----------------------------
Our training and development programs include traditional classroom training; online courses in leadership, communication, and project planning; strategic planning and management programs; and a professional society membership incentive.
Work/Life Programs
----------------------
Drivestream offers Work-Life Integration options that help individuals manage their personal and professional responsibilities. Options include work from home and telecommuting, day care, flexible spending accounts, internal job transfer and career mobility, and health and wellness programs.
Community Involvement
-------------------------
Drivestream believes in supporting community and philanthropic activities that allow our employees to engage in outreach and educational programs.
Awards Programs
-------------------
Drivestream recognizes and rewards our staff through various annual awards programs.
Retirement Benefits
-----------------------
Drivestream offers complete 401(k) plans and an annual profit-sharing contribution.