Design, develop, and maintain scalable and reliable data pipelines and platforms on Google Cloud Platform to support enterprise analytics and reporting.
Collaborate with business and technical teams to understand data requirements and translate them into efficient data engineering solutions.
Build data ingestion frameworks using Confluent Kafka and integrate data from multiple cloud and on-premises sources.
Implement ELT and ETL pipelines using PySpark, SQL, Dataform, DBT, and Dataproc for efficient data transformation and processing.
Design and manage data models, schemas, and data warehousing structures optimized for BigQuery performance and scalability.
Ensure data quality, consistency, and integrity through effective data validation and monitoring mechanisms.
Implement data governance practices including data lineage, metadata management, privacy, and compliance.
Develop automated monitoring and alerting mechanisms to ensure reliability, availability, and performance of data pipelines.
Collaborate with cross-functional teams to enable downstream reporting, analytics, and AI/ML model consumption.
Follow best practices for version control, documentation, and CI/CD workflows using Git and infrastructure-as-code tools.
Provide operational support for data pipelines and proactively troubleshoot performance issues.
Continuously optimize data processing, query execution, and resource utilization in GCP environments.
Qualifications
Strong programming skills in Python and PySpark for large-scale data processing.
Proficiency in SQL for data manipulation, analysis, and performance tuning.
Experience with Dataform, Dataproc, and BigQuery for data pipeline development and orchestration.
Hands-on experience with Kafka and Confluent for real-time data streaming.
Knowledge of Cloud Scheduler and Dataflow for automation and workflow management.
Familiarity with DBT, Machine Learning, and AI concepts is an advantage.
Understanding of Data Governance principles and implementation practices.
Experience using Git for version control and CI/CD pipelines for automated deployments.
Working knowledge of Infrastructure as Code (IaC) for cloud resource management and automation.
This job posting will remain open for a minimum of 72 hours and on an ongoing basis until filled.
Job: Information Technology
Primary Location: India-Karnataka-Bengaluru
Schedule: Full-time
Travel: No
Req ID: 254883
Job Hire Type: Experienced