Azure Databricks, Azure Data Factory (ADF), and PySpark. The ideal candidate should have hands-on experience building scalable data pipelines, integrating diverse data sources, and delivering end-to-end data engineering solutions in the Azure ecosystem. This role involves working with structured, semi-structured, and unstructured data sources, including SAP, SharePoint, JSON, XML, APIs, and files from Azure Blob Storage.
Key Responsibilities
Design, develop, and maintain data pipelines using Azure Databricks, ADF, and PySpark (a minimal sketch follows this list).
Ingest, transform, and process data from multiple sources, including SAP, SharePoint, APIs, JSON, XML, and unstructured files.
Implement efficient data ingestion from Azure Blob Storage and other Azure-native services.
Collaborate with stakeholders to understand business requirements and translate them into scalable technical solutions.
Optimize performance of PySpark scripts and Databricks workflows for large-scale data processing (see the second sketch after this list).
Ensure data quality, governance, and security compliance throughout the pipeline lifecycle.
Support integration of data into downstream systems for analytics and reporting.
Troubleshoot data pipeline issues and provide timely resolutions.
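
To make the first responsibility concrete, here is a minimal PySpark sketch of the kind of pipeline described: reading JSON files from Azure Blob Storage, applying a simple transformation, and writing a Delta table. The storage account, container, paths, and column names are hypothetical placeholders, not details from this posting.

```python
# Minimal ingestion sketch: Blob Storage JSON -> PySpark transform -> Delta.
# All names below (storage account, container, columns, output path) are
# illustrative assumptions.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("blob-ingest-example").getOrCreate()

# Hypothetical Blob Storage path using the wasbs:// scheme; in practice the
# account key or a SAS token would be configured on the cluster or via ADF.
source_path = "wasbs://raw@examplestorage.blob.core.windows.net/orders/*.json"

raw_df = spark.read.json(source_path)

# Example transformation: parse dates and drop records missing a key field.
clean_df = (
    raw_df
    .withColumn("order_date", F.to_date("order_date"))
    .filter(F.col("order_id").isNotNull())
)

# Persist as a Delta table for downstream analytics (Delta Lake is available
# by default on Databricks clusters).
clean_df.write.format("delta").mode("overwrite").save("/mnt/curated/orders")
```

In practice a job like this would be parameterized and orchestrated by an ADF pipeline or a Databricks workflow rather than run ad hoc.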
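The optimization responsibility is equally code-level work. This second sketch shows two common PySpark tuning techniques on Databricks: broadcasting a small lookup table to avoid a shuffle-heavy join, and partitioning output by a frequently filtered column. Table paths and column names are again illustrative assumptions.

```python
# Optimization sketch: broadcast join + partitioned Delta output.
# Paths and column names are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("join-optimization-example").getOrCreate()

orders = spark.read.format("delta").load("/mnt/curated/orders")
regions = spark.read.format("delta").load("/mnt/curated/regions")  # small lookup table

# Broadcast the small side so each executor joins locally instead of
# shuffling the large table across the cluster.
enriched = orders.join(F.broadcast(regions), on="region_id", how="left")

# Partition the output by a commonly filtered column so downstream reads
# can prune files instead of scanning the full table.
(enriched
    .write.format("delta")
    .mode("overwrite")
    .partitionBy("order_date")
    .save("/mnt/curated/orders_enriched"))
```
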
Required Skills & Qualifications
3-7 years of hands-on experience in Data Engineering.
Strong expertise in:
Azure Databricks (PySpark, Delta Lake, notebooks, cluster management).