Data Analyst

KA, IN, India

Job Description

Dayforce is a global human capital management (HCM) company headquartered in Toronto, Ontario, and Minneapolis, Minnesota, with operations across North America; Europe, the Middle East, and Africa (EMEA); and the Asia Pacific Japan (APJ) region.

Our award-winning Cloud HCM platform offers a unified solution database and continuous calculation engine, driving efficiency, productivity and compliance for the global workforce.

Our brand promise, Makes Work Life Better(TM), reflects our commitment to employees, customers, partners, and communities globally.



About the opportunity

We are looking for a Data Analyst to join the team that powers search, AI Assistant, and AI agents in our Dayforce product. This role is central to ensuring that the data behind our AI experiences is clean, trustworthy, well-organized, and ready for use.

You will work with large and complex datasets, own data quality and governance for key domains, and enable data-driven decisions across the product and engineering teams. The ideal candidate has strong analytical skills, deep hands-on experience with data preparation and management, and a passion for turning messy, real-world data into reliable, usable assets.

Your impact will be visible across the full data lifecycle, from ingestion and cleanup, through modeling and documentation, to reporting and insight generation. You will collaborate closely with product managers, software engineers, data scientists, and business stakeholders to ensure that the right data is available, accurate, and actionable.



What you'll get to do

Data Annotation & Labeling: Annotate, tag, and label large volumes of data (such as text, images, or audio) according to predefined guidelines to create "ground truth" datasets for machine learning. Ensure labels are accurate and consistent and meet quality standards. This includes reviewing and correcting labels, performing quality assurance checks, and refining the labeling process over time.

Data Augmentation & Validation: Apply data augmentation techniques to increase the diversity and volume of training data, such as generating synthetic examples or transforming existing data, while maintaining data integrity. In collaboration with cross-functional teams across Dayforce, including Workforce Management (WFM), Payroll, Scheduling, Learning, and other product areas, perform data validation and error-checking routines to detect anomalies or inconsistencies. This ensures that datasets used for model training are accurate, representative, and free of issues that could negatively impact model performance.

Data & Training Pipeline Automation: Design, implement, and own end-to-end data pipelines that move data from raw sources through labeling, validation, and preprocessing into training-ready datasets. This includes writing and maintaining Python-based automation using Databricks to ingest, clean, label, version, and store data in well-structured formats that ensure reproducibility and traceability. You will also own the automation of model training workflows, ensuring that newly labeled or updated datasets seamlessly trigger retraining jobs. In this role, you will monitor pipeline execution, troubleshoot data- and pipeline-related failures, and work closely with ML engineers to define clean interfaces between data pipelines and training systems, while keeping model design and evaluation out of scope.

Workflow Improvement: Continuously evaluate and improve the data labeling workflow. Provide feedback on labeling tools and processes to increase efficiency, for example by suggesting better annotation tools or semi-automated labeling approaches. You may help develop simple utilities or scripts to assist annotators (e.g., automation for repetitive labeling tasks or active learning integration to prioritize labeling the most informative data). You will also document guidelines, edge cases, and best practices for the labeling process, and ensure knowledge transfer and training for colleagues assisting with annotation.

Skills and experience we value

Experience: 2+ years of relevant experience in roles such as data labeling/annotation, data quality, or data engineering. Proven track record of working with large datasets and following detailed data annotation guidelines. Mid-level understanding of machine learning data needs (e.g., basics of supervised learning and why consistent labeling matters).

Technical Skills: Proficiency in Python for data manipulation and scripting automation (pandas, NumPy, etc.). Experience with data processing platforms or notebooks such as Databricks (or similar tools like Jupyter, Spark) to handle big data workflows. Familiarity with data labeling tools and the ability to quickly learn new annotation software. Comfortable using version control (Git) and, ideally, data versioning tools for datasets.

Data Management: Solid understanding of data management best practices, including data cleaning, validation, and augmentation techniques. Ability to implement data quality checks and troubleshoot data issues in a pipeline. Familiarity with the concept of data versioning and reproducible data pipelines.

Attention to Detail: Excellent attention to detail and a methodical approach to tasks. You must be able to maintain high accuracy in labeling data, catching inconsistencies or errors (your meticulous work will directly affect model outcomes). An eye for consistency and patience for repetitive tasks when necessary are essential traits for this role.

Organization & Independence: Strong organizational skills to manage and prioritize multiple datasets, versions, and pipeline tasks. Ability to work independently and take initiative in improving processes. We expect you to be a self-starter who can manage the end-to-end data prep workflow with minimal supervision.

Communication: Good communication and collaboration skills. Capable of documenting guidelines clearly and discussing requirements or issues with the engineering team. You should be comfortable providing feedback and raising questions when instructions are unclear, as well as mentoring junior data labelers or coordinating with any external labeling support if needed.

What would make you really stand out

Education: Bachelor's degree in Computer Science, Data Science, Information Systems, or a related field is preferred (or equivalent practical experience).

MLOps/Automation: Experience with MLOps or pipeline automation tools. For example, familiarity with ML workflow orchestration (such as MLflow, Airflow, or Databricks ML pipelines) and continuous integration/continuous deployment (CI/CD) practices for data or models. Experience setting up automated training pipelines in a cloud environment is a strong plus.

Advanced Tools: Exposure to data versioning tools, data augmentation libraries, or active learning frameworks. Experience with any auto-labeling techniques or using AI to assist labeling is a bonus.

Quality Focus: Experience in a data quality or data curation role. Past involvement in setting up labeling quality assurance processes (such as consensus labeling, review workflows, or calibration sessions) is beneficial, as it shows you know how to maintain high annotation standards at scale.

What's in it for you

Dayforce is fueled by the diversity of our talented employees. We are an equal opportunity employer and consider and embrace ALL individuals and what makes them unique. We believe our employees should be happy and healthy, with peace of mind and a sense of fulfillment.

We encourage individuals to apply based on their passions.

Dayforce encourages personal and professional growth. We offer excellent time away from work programs, comprehensive wellness initiatives and recognition through competitive pay and benefits.

With a commitment to community impact, including volunteer days and our charity, Dayforce Cares, we provide opportunities for you to thrive both in your career and personal life. Our focus is not just on your job but on supporting you to be the best version of yourself.

Fraudulent Recruiting

Beware of fraudulent recruiting. Legitimate Dayforce contacts will use an @dayforce.com email address. We do not request money, checks, equipment orders, or sensitive personal data during the recruitment process. If you have been asked for any of the above, or believe you have been contacted by someone posing as a Dayforce employee, please refer to our fraudulent recruiting statement found here: https://www.dayforce.com/be-aware-of-recruiting-fraud

Dayforce actively monitors all job applications to ensure authenticity. Submissions determined to be fraudulent or misleading will be removed from the recruitment process.



Job Detail

  • Job Id
    JD5105481
  • Industry
    Not mentioned
  • Total Positions
    1
  • Job Type
    Full Time
  • Salary
    Not mentioned
  • Employment Status
    Permanent
  • Job Location
    KA, IN, India
  • Education
    Not mentioned
  • Experience
    Year