Data Engineering Manager – Web Crawling & Pipeline Architecture

Remote, IN, India

Job Description


Aimleap





India (Remote)





Posted on November 29, 2025



AIMLEAP is Hiring:


----------------------

Data Engineering Manager - Web Crawling & Pipeline Architecture



Experience: 7 to 12 Years

Location: Remote / Bangalore

Engagement: Full-time

Positions: 2

Qualification: B.E / B.Tech / M.Tech / MCA / Computer Science / IT

Industry: IT / Data / AI / E-commerce / FinTech / Healthcare

Notice Period: Immediate


What We Are Looking For:



- Proven experience leading data engineering teams, with strong ownership of web crawling systems and pipeline architecture.

- Expertise in designing, building, and optimizing scalable data pipelines, preferably using workflow orchestration tools such as Airflow or Celery.

- Hands-on proficiency in Python and SQL for data extraction, transformation, processing, and storage.

- Experience working with cloud platforms such as AWS, GCP, or Azure for data infrastructure, deployments, and pipeline operations.

- Deep understanding of web crawling frameworks, proxy rotation, anti-bot strategies, session handling, and compliance with global data collection standards (GDPR/CCPA-safe crawling).

- Strong expertise in AI-driven automation, including integrating AI agents or frameworks like Crawl4ai into scraping, validation, and pipeline workflows.

Responsibilities:



- Lead and mentor data engineering and web crawling teams, ensuring high-quality delivery and adherence to best practices.

- Architect, implement, and optimize scalable data pipelines that support high-volume data ingestion, transformation, and storage.

- Build and maintain robust crawling systems using modern frameworks, handling IP rotation, throttling, and dynamic content extraction.

- Establish pipeline orchestration using Airflow, Celery, or similar distributed processing technologies.

- Define and enforce data quality, validation, and security measures across all data flows and pipelines.

- Collaborate with product, engineering, and analytics teams to translate data requirements into scalable technical solutions.

- Develop monitoring, logging, and performance metrics to ensure high availability and reliability of data systems.

- Oversee cloud-based deployments, cost optimization, and infrastructure improvements on AWS/GCP/Azure.

- Integrate AI agents or LLM-based automation for tasks such as error resolution, data validation, enrichment, and adaptive crawling.

Qualifications:



- Bachelor's or master's degree in Engineering, Computer Science, or a related field.

- 7-12 years of relevant experience in data engineering, pipeline design, or large-scale web crawling systems.

- Strong expertise in Python, SQL, and modern data processing practices.

- Experience working with Airflow, Celery, or similar workflow automation tools.

- Solid understanding of proxy systems, anti-bot techniques, and scalable crawler architecture.

- Hands-on experience with cloud data platforms (AWS/GCP/Azure).

- Experience with AI/LLM frameworks (Crawl4ai, LangChain, LlamaIndex, AutoGen, OpenAI, or similar).

- Strong analytical, architectural, and leadership skills.

About Us:


-------------


AIMLEAP is an ISO 9001:2015 and ISO/IEC 27001:2013 certified global technology consulting and service provider offering Digital IT, AI-augmented Data Solutions, Automation, and Research & Analytics Services.


AIMLEAP has been recognized as a 'Great Place to Work'. With a focus on AI and an automation-first approach, our services include end-to-end IT application management, Mobile App Development, Data Management, Data Mining Services, Web Data Scraping, Self-serving BI reporting solutions, Digital Marketing, and Analytics solutions.


We started in 2012 and have successfully delivered projects in IT and digital transformation, automation-driven data solutions, and digital marketing for more than 750 fast-growing companies in the USA, Europe, New Zealand, Australia, Canada, and more.


- ISO 9001:2015 and ISO/IEC 27001:2013 certified

- Served 750+ customers

- 12+ Years of industry experience

- 98% Client Retention

- Great Place to Work Certified

- Global Delivery Centers in the USA, Canada, India & Australia.



Job Detail

  • Job Id: JD4820573
  • Industry: Not mentioned
  • Total Positions: 1
  • Job Type: Full Time
  • Salary: Not mentioned
  • Employment Status: Permanent
  • Job Location: Remote, IN, India
  • Education: Not mentioned