Required Experience: 5-8 Years
Skills: Data Engineering, GCP, BigQuery
Key Responsibilities:
Data Architecture & Engineering:
• Design and implement scalable data architectures leveraging BigQuery, Iceberg, Starburst, and Trino.
• Develop robust, high-performance ETL/ELT pipelines to process structured and unstructured data.
• Optimize SQL queries and data processing workflows for efficient analytics and reporting.
Cloud & Big Data Infrastructure:
• Build and maintain data pipelines and storage solutions using Google Cloud Platform (GCP) and BigQuery.
• Implement best practices for data governance, security, and compliance within cloud-based environments.
• Optimize data ingestion, storage, and query performance for high-volume and high-velocity datasets.
Data Processing & Analytics:
• Leverage Apache Iceberg for large-scale data lake management and transactional processing.
• Utilize Starburst and Trino for distributed query processing and federated data access.
• Develop strategies for data partitioning, indexing, and caching to enhance performance.
Collaboration & Integration:
• Work closely with data scientists, analysts, and business stakeholders to understand data needs and requirements.
• Collaborate with DevOps and platform engineering teams to implement CI/CD pipelines and infrastructure-as-code for data workflows.
• Integrate data from multiple sources, ensuring data integrity and accuracy across systems.
Performance Optimization & Monitoring:
• Monitor, troubleshoot, and optimize data pipelines for efficiency, scalability, and reliability.
• Implement data quality frameworks and automated validation checks to ensure consistency.
• Utilize monitoring tools and performance metrics to proactively identify bottlenecks and optimize queries.
Qualifications:
Education:
Bachelor's or Master's degree in Computer Science, Data Engineering, Information Systems, or a related field.
Experience:
• 5+ years of experience in data engineering, with expertise in SQL, BigQuery, and GCP.
• Strong experience with Apache Iceberg, Starburst, and Trino for large-scale data processing.
• Proven track record of designing and optimizing ETL/ELT pipelines and cloud-based data workflows.
Technical Skills:
• Proficiency in SQL, including query optimization and performance tuning.
• Experience working with BigQuery, Google Cloud Storage (GCS), and GCP data services.
• Knowledge of data lakehouse architectures, data warehousing, and distributed query engines.
• Hands-on experience with Apache Iceberg for managing large-scale transactional datasets.
• Expertise in Starburst and Trino for federated queries and cross-platform data access.
• Familiarity with Python, Java, or Scala for data pipeline development.
• Experience with Terraform, Kubernetes, or Airflow for data pipeline automation and orchestration.
Preferred Skills:
• Understanding of machine learning data pipelines and real-time data processing.
• Experience with data governance, security, and compliance best practices.
• Exposure to Kafka, Pub/Sub, or other streaming data technologies.
• Familiarity with CI/CD pipelines for data workflows and infrastructure-as-code.
Job Type: Full-time
Application Question(s):
Do you have the must-have skills: Data Engineering, GCP, BigQuery?
Experience:
Data Engineer: 5 years (Required)
Work Location: Remote