to help us build intelligent, scalable AI assistants and agent-based systems, similar to ChatGPT. This role is ideal for someone passionate about large language models (LLMs), conversational AI, and backend architecture.
You will play a critical role in designing, developing, and optimizing end-to-end AI solutions that power multi-tenant environments, secure contextual handling, vector search, and real-time interactions using advanced AI tooling and vector databases such as Qdrant and Milvus.
Key Responsibilities
Design & develop conversational AI agents using cutting-edge LLMs (GPT, Claude, Mistral, etc.)
Architect and implement scalable backend infrastructure for multi-tenant use cases
Build secure, role-based access systems with isolated contextual memory per organization/user
Integrate LLM pipelines with APIs, data stores, and vector databases (Qdrant, Milvus)
Optimize vector search & embedding workflows for speed, scalability, and accuracy
Fine-tune and prompt-engineer LLMs for domain-specific use cases
Implement chat session handling, long-term memory, and feedback loops
Manage context windows, optimize token usage, and apply caching strategies for efficient LLM use
Collaborate with DevOps on CI/CD pipelines, monitoring, and cloud scalability
Stay up to date with the latest developments in LangChain, LlamaIndex, embeddings, and AI infrastructure
Required Skills & Qualifications
Strong proficiency in Python with hands-on AI/ML development experience
Experience with OpenAI API, LangChain, or LlamaIndex
Proven expertise with vector databases (Qdrant, Milvus): indexing, querying, embedding optimization
Familiarity with FastAPI / Flask for API development & deployment
Solid understanding of