Job Description

Key Responsibilities

  • Design, build, and maintain scalable data pipelines for batch and real-time processing using tools such as Apache Airflow, dbt, or Apache Spark.
  • Develop robust ETL/ELT workflows to ingest, clean, transform, and load data from diverse sources (APIs, databases, files, streams).
  • Work with stakeholders to understand their data needs and translate business requirements into technical solutions.
  • Ensure data is accurate, timely, and accessible to downstream consumers (BI tools, ML models, applications).
  • Collaborate with data architects and engineers to build a modern data stack on cloud-native platforms (e.g., AWS, GCP, Azure).
  • Monitor and optimize data pipelines for performance, scalability, and cost-efficiency.
  • Implement data quality, validation, and observability frameworks to detect and resolve issues proactively.
  • Maintain clear documentation of pipelines, data flows, and architecture.
  • Support data compliance, governance, and security policies.

Required Qualifications


Technical Skills

  • Strong programming skills in Python or Scala for data engineering tasks.
  • Experience with ETL/ELT tools and orchestration frameworks (e.g., Airflow, dbt, Luigi, Kedro).
  • Proficiency in SQL for data manipulation and modeling.
  • Experience with big data and distributed processing technologies (e.g., Spark, Kafka, Flink).
  • Familiarity with data warehousing solutions (e.g., Snowflake, BigQuery, Redshift).
  • Hands-on experience with cloud platforms and their data services (e.g., AWS Glue, GCP Dataflow, Azure Data Factory).
  • Experience with version control (Git), CI/CD pipelines, and containerized environments (Docker, Kubernetes).

Soft Skills

  • Strong problem-solving and debugging skills.
  • Excellent communication and documentation abilities.
  • Ability to work independently and to collaborate across cross-functional teams.
  • Strong attention to detail and commitment to data quality.

Preferred Qualifications

  • Experience with real-time data streaming and event-driven architectures.
  • Familiarity with data cataloging and lineage tools (e.g., Amundsen, DataHub).
  • Knowledge of MLOps or experience integrating with ML pipelines is a plus.
  • Experience in industries such as e-commerce, finance, or healthcare is a bonus.

Education & Experience

  • Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
  • 2 to 5 years of experience in data engineering or pipeline development roles.

About Virtusa

Teamwork, quality of life, professional and personal development: values that Virtusa is proud to embody. When you join us, you join a global team of 27,000 people that cares about your growth, one that seeks to provide you with exciting projects and opportunities, and the chance to work with state-of-the-art technologies throughout your career with us.

Great minds, great potential: it all comes together at Virtusa. We value collaboration and the team environment of our company, and seek to provide great minds with a dynamic place to nurture new ideas and foster excellence.

Virtusa was founded on principles of equal opportunity for all and therefore does not discriminate on the basis of race, religion, color, sex, gender identity, sexual orientation, age, non-disqualifying physical or mental disability, national origin, veteran status, or any other basis covered by applicable law. All employment decisions are based on qualifications, merit, and business need.

Job Detail

  • Job Id: JD4197422
  • Industry: Not mentioned
  • Total Positions: 1
  • Job Type: Full Time
  • Salary: Not mentioned
  • Employment Status: Permanent
  • Job Location: AP, IN, India
  • Education: Not mentioned
  • Experience: 2 to 5 years