Job Description

Position Overview


We are looking for an AI Data Engineer with 2+ years of hands-on experience in data engineering, vector databases, LLM/AI integrations, and Python-based automation. The ideal candidate should have strong knowledge of OpenAI, AWS Bedrock, embeddings, Flowise, prompt engineering, and building scalable RAG (Retrieval-Augmented Generation) pipelines.

Key Responsibilities:

AI & LLM Integrations

  • Integrate and optimize OpenAI GPT models and AWS Bedrock models (Claude, Titan, etc.) for production workflows.
  • Build and maintain Retrieval-Augmented Generation (RAG) systems using vector search.
  • Develop prompt engineering strategies, structured prompts, and dynamic contextual prompts.
  • Implement LLM orchestration using Flowise, LangChain, or LlamaIndex.

Data Engineering & Pipelines

  • Design, build, and maintain ETL/ELT pipelines for structured & unstructured data.
  • Develop ingestion workflows for PDFs, docs, images, and text for LLM training and retrieval.
  • Implement data cleaning, transformation, preprocessing, chunking, and embedding generation.
  • Handle large-scale data pipelines that feed AI models and vector databases.

Vector Database Engineering

  • Work with Pinecone, Qdrant, Milvus, Weaviate, and Chroma to store and retrieve embeddings.
  • Optimize vector indexes, similarity search, metadata filtering, and document-versioning logic.
  • Manage vector schema design and vector DB performance tuning.

Python Development & Automation

  • Build Python-based microservices, APIs (FastAPI/Flask), and automation scripts.
  • Create backend functions to handle AI requests, data ingestion, embeddings, and retrieval logic.
  • Integrate with cloud storage, messaging queues, and external APIs.

Cloud & DevOps

  • Deploy AI and data pipelines on AWS (Lambda, S3, DynamoDB, EC2, API Gateway).
  • Manage secrets, IAM roles, scalability, and cloud resource optimization.
  • Containerize workloads using Docker and work with CI/CD workflows (GitHub/GitLab).

Cross-functional Collaboration

  • Work alongside AI engineers, backend teams, data scientists, and product managers.
  • Document workflows, maintain internal knowledge bases, and support debugging across teams.

Required Skills & Qualifications

  • Bachelor's degree in Computer Science, Data Science, Engineering, or a related field.
  • 2+ years of experience in data engineering, AI, or ML-focused development.
  • Strong Python skills (FastAPI, Flask, Pandas, NumPy, AsyncIO).
  • Experience with OpenAI GPT models, AWS Bedrock, embeddings, and tokenization.
  • Strong understanding of data preprocessing for LLMs: chunking, cleaning, vectorization.
  • Hands-on experience with vector databases: Pinecone, Qdrant, Milvus, Weaviate, Chroma.
  • Practical experience with Flowise, LangChain, or LlamaIndex.
  • Knowledge of prompt engineering and optimizing LLM responses.
  • Experience with SQL & NoSQL databases.
  • Familiarity with API integrations, backend workflows, and cloud-based pipelines.
  • Understanding of CI/CD workflows, version control (Git), and containerization (Docker).

Nice-to-Have

  • Experience with MLOps tools and model monitoring.
  • Exposure to model fine-tuning or supervised generation training.
  • Familiarity with Airflow, Prefect, or cloud-native workflow orchestrators.
  • Hands-on experience with parallel processing or distributed pipelines.

Soft Skills

  • Strong analytical thinking and problem-solving capability.
  • Clear communication and documentation.
  • Ability to work in fast-paced, agile environments.
  • Quick learner with deep curiosity about AI/ML technologies.

Job Types: Full-time, Permanent

Pay: ₹300,000.00 - ₹420,000.00 per year

Work Location: In person


Job Detail

  • Job Id
    JD4746477
  • Industry
    Not mentioned
  • Total Positions
    1
  • Job Type
    Full Time
  • Salary
    Not mentioned
  • Employment Status
    Permanent
  • Job Location
    PB, IN, India
  • Education
    Not mentioned
  • Experience
    Year