Senior Data Science Engineer

HR, IN, India

Job Description

As a Senior Data Science Engineer in IOL's Data Team, you will lead the development of advanced predictive models to power a smart caching layer for our B2B hospitality marketplace. Handling an unprecedented scale of data (2 billion searches, 1 billion price verifications, and 100 million bookings daily), you will design machine learning solutions to predict search patterns and prefetch data from third-party (3P) suppliers, reducing their infrastructure load and improving system reliability. This role demands deep expertise in big data, machine learning, and distributed systems, as well as the ability to architect scalable, data-driven solutions in a fast-paced environment.

The Challenge



IOL operates a high-traffic B2B marketplace that matches hotel room supply with demand. Our platform processes:

Searches: 2 billion daily queries for hotel prices based on hotel ID, room type, check-in date, length of stay, and party size.

Price Verifications: 1 billion daily checks to confirm pricing.

Bookings: 100 million daily bookings.
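
For a rough sense of scale, the daily volumes above can be converted to average per-second rates. This is a back-of-envelope sketch that assumes traffic is spread evenly over 24 hours, which real traffic is not, so peak rates will be higher. The variable and function names are illustrative only.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds

# Daily volumes as stated in the posting.
daily_volumes = {
    "searches": 2_000_000_000,
    "price_verifications": 1_000_000_000,
    "bookings": 100_000_000,
}


def average_rate_per_second(daily_count: int) -> float:
    """Average events per second, assuming a perfectly even daily spread."""
    return daily_count / SECONDS_PER_DAY


for name, count in daily_volumes.items():
    print(f"{name}: ~{average_rate_per_second(count):,.0f} per second")
```

Searches alone average roughly 23,000 requests per second under this assumption, which is the load a cache layer in front of 3P suppliers would need to absorb.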

Key Responsibilities

Predictive Modeling: Design and implement machine learning models to predict high-demand search patterns based on historical data (e.g., hotel IDs, room types, dates, and party sizes).

Big Data Processing: Develop scalable data pipelines to process and analyze massive datasets (2 billion searches daily) using distributed computing frameworks.

Smart Caching Layer: Architect and optimize a predictive cache prefetcher that proactively populates the cache cluster (Redis) with high-value data during 3P off-peak hours.

Data Analysis: Leverage Elasticsearch and ES Searches Log to extract insights from search patterns, seasonal trends, and user behavior.

Model Optimization: Continuously refine predictive models to handle the massive permutations of search parameters, ensuring high accuracy and low latency.

Collaboration: Work with the Data Team, platform engineers, and 3P proxy teams to integrate models into the existing architecture (Load Balancer, API Gateway, Service Router, Cache Cluster).

Performance Monitoring: Monitor cache hit/miss ratios, model accuracy, and system performance, using tools like Cache Stats Collector to drive optimization.

Scalability: Ensure models and pipelines scale horizontally to handle increasing data volumes and traffic spikes.

Innovation: Stay updated on advancements in machine learning, big data, and distributed systems, proposing novel approaches to enhance predictive capabilities.
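
To make the prefetching and monitoring responsibilities above concrete, here is a minimal sketch of how prefetch candidates might be chosen and how a cache hit ratio is computed. It is an illustration under stated assumptions, not IOL's design: the `SearchKey` fields follow the search parameters listed in this posting, every function name is hypothetical, and simple frequency counting stands in for the predictive model, which the posting does not specify.

```python
from collections import Counter
from typing import NamedTuple


class SearchKey(NamedTuple):
    """One search permutation (fields taken from the posting; names are hypothetical)."""
    hotel_id: str
    room_type: str
    check_in: str   # ISO date string
    nights: int
    party_size: int


def top_prefetch_candidates(history: list[SearchKey], k: int) -> list[SearchKey]:
    """Return the k most frequently searched keys (a stand-in for a real model)."""
    return [key for key, _ in Counter(history).most_common(k)]


def hit_ratio(hits: int, misses: int) -> float:
    """Fraction of lookups served from cache; 0.0 when there were no lookups."""
    total = hits + misses
    return hits / total if total else 0.0


# Tiny made-up search log, for illustration only.
history = [
    SearchKey("H1", "double", "2025-07-01", 2, 2),
    SearchKey("H1", "double", "2025-07-01", 2, 2),
    SearchKey("H2", "suite", "2025-07-03", 1, 1),
    SearchKey("H1", "double", "2025-07-01", 2, 2),
    SearchKey("H2", "suite", "2025-07-03", 1, 1),
    SearchKey("H3", "twin", "2025-08-10", 3, 4),
]

prefetch_set = set(top_prefetch_candidates(history, k=2))

# Replay the same log against the prefetched set (in-sample, so optimistic).
hits = sum(1 for key in history if key in prefetch_set)
misses = len(history) - hits
print(f"hit ratio: {hit_ratio(hits, misses):.2f}")
```

In a real deployment the selected keys would be fetched from 3P suppliers during their off-peak hours and written to the Redis cache cluster, and the hit ratio would be measured on later traffic rather than on the log used to pick the keys.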

Required Skills & Qualifications



Education: Master's or Ph.D. in Data Science, Computer Science, Statistics, or a related field.

Experience:
o 7+ years of experience in data science, with a focus on machine learning and predictive modeling.
o 5+ years of hands-on experience processing and analyzing big data sets (terabyte-scale or larger) in distributed environments.
o Proven track record of building and deploying machine learning models in production for high-traffic systems.

Technical Skills:
o Deep expertise in machine learning frameworks (e.g., TensorFlow, PyTorch, Scikit-learn) and algorithms (e.g., regression, clustering, time-series forecasting, neural networks).
o Extensive experience with big data technologies (e.g., Apache Spark, Hadoop, Kafka) for distributed data processing.
o Proficiency in Elasticsearch for search and analytics, including querying and indexing large datasets.
o Strong programming skills in Python, with experience in data science libraries (e.g., Pandas, NumPy, Dask).
o Familiarity with Redis or similar in-memory data stores for caching.
o Knowledge of cloud platforms (e.g., AWS, Azure, GCP) for deploying and scaling data pipelines.
o Experience with SQL and NoSQL databases (e.g., PostgreSQL, MongoDB) for data extraction and transformation.
o Proficiency in designing and optimizing data pipelines for high-throughput, low-latency systems.

Problem-Solving: Exceptional ability to tackle complex problems, such as handling massive permutations of search parameters and predicting trends in dynamic datasets.

Communication: Strong written and verbal communication skills to collaborate with cross-functional teams and present insights to stakeholders.

Work Style: Self-motivated, proactive, and able to thrive in a fast-paced, innovative environment.

Preferred Skills



Experience in the hospitality or travel industry, particularly with search or booking systems.

Familiarity with real-time data streaming and event-driven architectures (e.g., Apache Kafka, Flink).

Knowledge of advanced time-series forecasting techniques for seasonal and cyclical data.

Exposure to reinforcement learning or online learning for dynamic model adaptation.

Experience optimizing machine learning models for resource-constrained environments (e.g., edge devices or low-latency systems).

Job Type: Full-time

Pay: ₹1,500,000.00 - ₹2,400,000.00 per year

Schedule: Day shift

Application Question(s):

How many years of work experience do you have with Data Science?
How many years of hands-on experience do you have with processing and analyzing big data sets (terabyte-scale or larger) in distributed environments?
How many years of experience do you have with big data technologies?
How many years of experience do you have in data science, with a focus on machine learning and predictive modeling?
What is your current CTC?
What is your expected CTC?
What is your current notice period?
Do you have strong programming skills in Python, with experience in data science libraries (e.g., Pandas, NumPy, Dask)?

Work Location: In person



Job Detail

  • Job Id
    JD3721615
  • Total Positions
    1
  • Job Type:
    Contract
  • Employment Status
    Permanent
  • Job Location
    HR, IN, India