to design and implement high-performance data pipelines, ensuring the smooth flow of large datasets across systems. In this role, you will leverage your expertise in Databricks, PySpark, and ETL design to build efficient data solutions. You will also contribute to optimizing database performance and ensuring that data systems meet both scalability and security standards.
Key Responsibilities:
* Develop, optimize, and maintain scalable ETL pipelines for processing large datasets using Databricks and PySpark.
* Apply advanced techniques such as chunking and partitioning to handle large file volumes efficiently (see the sketch after this list).
* Tune and optimize databases for better performance and storage efficiency.
* Collaborate with cross-functional teams to ensure the architecture meets business requirements and data quality standards.
* Design and implement solutions with an eye toward scalability and long-term performance.
* Work with Azure services, with exposure to Azure Function Apps, Azure Data Factory, and Cosmos DB (preferred).
* Communicate effectively with stakeholders to align on project goals and deliverables.
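For illustration only (not an additional requirement of this posting): a minimal PySpark sketch of the partitioned, chunked ETL style described in the responsibilities above. All paths, column names, and the date-based partition key are hypothetical placeholders.

```python
# Minimal PySpark ETL sketch (hypothetical paths and column names).
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

# On Databricks a SparkSession is usually provided as `spark`; getOrCreate()
# simply reuses it when run inside a notebook.
spark = SparkSession.builder.appName("partitioned-etl-sketch").getOrCreate()

# Read a large raw dataset (placeholder path).
events = spark.read.parquet("/mnt/raw/events")

# Derive a date column to use as the partition key.
events = events.withColumn("event_date", F.to_date("event_timestamp"))

# Repartition by the key so each output partition is written together,
# which limits shuffle overhead and small-file problems.
events = events.repartition("event_date")

# Write partitioned Parquet; downstream jobs can then read a single day's
# "chunk" instead of scanning the entire dataset.
(events.write
    .mode("overwrite")
    .partitionBy("event_date")
    .parquet("/mnt/curated/events"))
```

Chunking can follow the same pattern: filtering the read to one partition value at a time (for example, `events.where(F.col("event_date") == "2024-01-01")`) bounds how much data each run touches.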
Key Skills and Experience:
* Databricks (advanced-level skills in data engineering workflows)
* PySpark (intermediate-level skills for processing big data)
* Strong ETL design skills, particularly in partitioning, chunking, and database tuning for large datasets
* Azure experience (a plus, including Function Apps, Cosmos DB, and Azure Data Factory)
* Excellent communication skills for effective collaboration with teams and stakeholders