- 5-8 years of experience designing and building data pipelines using
  Apache Spark, Databricks or equivalent big data frameworks.
- Hands-on expertise with streaming and messaging systems such as
  Apache Kafka (publish/subscribe architecture), Confluent Cloud,
  RabbitMQ or Azure Event Hubs. Experience creating producers,
  consumers and topics and integrating them into downstream
  processing (see the Kafka sketch after this list).
- Deep understanding of relational databases and change data capture
  (CDC). Proficiency in SQL Server, Oracle or other RDBMSs; experience
  capturing change events using Debezium or native CDC tools and
  transforming them for downstream consumption (see the CDC sketch
  after this list).
- Proficiency in programming languages such as Python, Scala or
  Java, and solid knowledge of SQL for data manipulation and
  transformation (see the PySpark SQL sketch after this list).
- Cloud platform expertise. Experience with Azure or AWS services for
  data storage, compute and orchestration (e.g., ADLS, S3, Azure
  Data Factory, AWS Glue, Airflow, DBX, DLT); see the Airflow sketch
  after this list.
- Data modelling and warehousing. Knowledge of data lakehouse
  architectures, Delta Lake, partitioning strategies and performance
  optimisation (see the Delta Lake sketch after this list).
- Version control and DevOps. Familiarity with Git and CI/CD
pipelines; ability to automate deployment and manage
infrastructure as code.
- Strong problem-solving and communication skills. Ability to work
  with cross-functional teams and articulate complex technical
  concepts to non-technical stakeholders.
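For illustration, a minimal publish/subscribe sketch of the producer and
consumer work described above, written with the kafka-python client; the
broker address, topic name and consumer group are assumptions:

```python
# Minimal Kafka publish/subscribe sketch (kafka-python client);
# broker address, topic and group id are placeholders.
import json
from kafka import KafkaProducer, KafkaConsumer

BOOTSTRAP = "localhost:9092"   # assumed broker address
TOPIC = "orders"               # hypothetical topic

# Producer: publish a JSON-encoded event to the topic.
producer = KafkaProducer(
    bootstrap_servers=BOOTSTRAP,
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)
producer.send(TOPIC, {"order_id": 42, "status": "CREATED"})
producer.flush()

# Consumer: read events from the topic and hand them to downstream processing.
consumer = KafkaConsumer(
    TOPIC,
    bootstrap_servers=BOOTSTRAP,
    group_id="downstream-processor",   # hypothetical consumer group
    auto_offset_reset="earliest",
    value_deserializer=lambda v: json.loads(v.decode("utf-8")),
)
for message in consumer:               # blocks and polls; fine for a sketch
    print(message.value)               # downstream processing would go here
```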
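A sketch of flattening a Debezium change event for downstream consumption;
it assumes the standard Debezium envelope (before/after/op/ts_ms), and the
source field names are hypothetical:

```python
# Flatten a Debezium change event into a single row for downstream use.
import json

def flatten_change_event(raw: bytes) -> dict:
    event = json.loads(raw)
    payload = event.get("payload", event)   # some converters unwrap the schema
    op = payload.get("op")                   # "c"=create, "u"=update, "d"=delete, "r"=snapshot read
    # For deletes the new image is null, so fall back to the old row image.
    row = payload.get("before") if op == "d" else payload.get("after")
    return {
        **(row or {}),
        "_op": op,
        "_source_ts_ms": payload.get("ts_ms"),
    }
```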
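A small PySpark example of using SQL for data manipulation and
transformation; the paths, table and column names are hypothetical:

```python
# PySpark SQL transformation sketch: aggregate daily revenue per customer.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("transform-example").getOrCreate()

orders = spark.read.parquet("/data/raw/orders")   # assumed source path
orders.createOrReplaceTempView("orders")

daily_revenue = spark.sql("""
    SELECT customer_id,
           CAST(order_ts AS DATE) AS order_date,
           SUM(amount)            AS revenue
    FROM orders
    GROUP BY customer_id, CAST(order_ts AS DATE)
""")

daily_revenue.write.mode("overwrite").parquet("/data/curated/daily_revenue")
```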
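A minimal Airflow orchestration sketch; the DAG id, schedule and task
bodies are assumptions, not a prescribed pipeline:

```python
# Two-step extract/load DAG scheduled daily.
from datetime import datetime
from airflow import DAG
from airflow.operators.python import PythonOperator

def extract():
    print("pull data from the source system")   # placeholder task body

def load():
    print("write data to the lake")              # placeholder task body

with DAG(
    dag_id="example_ingest",          # hypothetical DAG id
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    load_task = PythonOperator(task_id="load", python_callable=load)
    extract_task >> load_task         # load runs after extract succeeds
```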
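A sketch of a partitioned Delta Lake write with PySpark, assuming a Spark
session with the Delta extensions available (for example on Databricks);
paths and column names are hypothetical:

```python
# Write a Delta table partitioned by event date.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("delta-example").getOrCreate()

events = spark.read.json("/data/raw/events")        # assumed source

(events
    .withColumn("event_date", F.to_date("event_ts"))
    .write
    .format("delta")
    .mode("overwrite")
    .partitionBy("event_date")                      # one partition per day
    .save("/data/lakehouse/events"))

# Performance optimisation such as OPTIMIZE / ZORDER would typically be
# run separately, e.g. on Databricks:
# spark.sql("OPTIMIZE delta.`/data/lakehouse/events` ZORDER BY (customer_id)")
```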