Big Data Engineer

2 to 3 Years    Pune (Maharashtra)

Job Description

An engineer/analyst with strong engineering/science comprehension and strong mathematical, statistical, and analytical skills
Willing to travel to customer sites as project requirements dictate; the deployment location depends on the target customer
Expertise in big data processing/movement and in architecting data governance
Designing and implementing highly performant data ingestion pipelines from multiple sources using Apache Spark and/or Azure Databricks
Delivering and presenting proofs of concept for key technology components to project stakeholders
Developing scalable and reusable frameworks for ingesting geospatial data sets
Integrating the end-to-end data pipeline to take data from source systems to target data repositories, ensuring the quality and consistency of data is maintained at all times
Working with event-based / streaming technologies to ingest and process data
Working with other members of the project team to support delivery of additional project components (API interfaces, Search)
Evaluating the performance and applicability of multiple tools against customer requirements
Working within an Agile delivery / DevOps methodology to deliver proof of concept and production implementation in iterative sprints.
Strong knowledge of Data Management principles
Experience in building ETL / data warehouse transformation processes
Direct experience of building data pipelines using Azure Data Factory and Apache Spark (preferably Databricks)
Microsoft Azure Big Data Architecture certification.
Hands on experience designing and delivering solutions using the Azure Data Analytics platform (Cortana Intelligence Platform) including Azure Storage, Azure SQL Data Warehouse, Azure Data Lake, Azure Cosmos DB, Azure Stream Analytics
Experience with Apache Kafka / Nifi for use with streaming data / event-based data
Experience with other Open Source big data products from the Hadoop ecosystem (incl. Hive, Pig, Impala)
Experience with Open Source non-relational / NoSQL data repositories (incl. MongoDB, Cassandra, Neo4J)
Experience with Visual Studio Team Services, Chef, Puppet, or Terraform
Must have worked on Databricks Cloud / Azure Databricks / Databricks Delta for at least 2 years
Highly experienced in Big Data solutions, particularly Hadoop and Spark
Good understanding of Lambda Architecture
Advanced expertise in at least one of the programming languages Java / Python / Scala
Experience with integration of data from multiple data sources
Knowledge of building real-time data processing solutions using Kafka and Spark Streaming
Fair knowledge of NoSQL databases such as MongoDB, HBase, and Cassandra
Fair knowledge of any RDBMS and of data warehousing concepts
Must have experience translating the business vision into an ETL/ELT solution architecture
Identifying solutions to key technical challenges in projects concerning Data Integration.
Experienced in implementing the following tools across platforms: data ingestion (e.g. Spark, Storm), data scheduling (Apache), data processing (Hadoop YARN), data search & indexing (Solr), data storage (Hadoop HDFS, Apache HBase, Kudu), and cluster & configuration management (Apache Ambari)
Effective data mining and analysis to build performance models and decision-support frameworks such as predictive/diagnostic analytics
Knowledge of database technologies such as SQL, MySQL, MongoDB, and MariaDB
Proficient with statistical process data and time-series/time-stamped data (DOE, manufacturing / Historian)
Knowledge of creating data visualizations in tools such as Superset, Tableau, and Zeppelin
Knowledge of big data tools like Hive, HBase, Zookeeper or Pig
Hands-on experience with at least one deployable solution, such as Seeq, Azure, or AWS, to deliver analytical solutions
Innovative in presenting client demonstrations (case studies), upskilling the client-side team (digitization), and training the centralized team on developed use cases
Knowledge of development and statistical programming languages such as R, Scala, PySpark, RSpark, or Python
Handling data from multiple sources and formats: sensors, logs, structured data from an RDBMS, video, text, etc.
Ready to relocate per the client's requirements, whether on-site or off-site, and to travel or make site visits as needed
Influence stakeholders from various disciplines and across different levels of seniority across the organization.
Experienced in understanding problems, collecting data, establishing facts and drawing valid conclusions
Provide expertise in deploying high-end technological initiatives and digital solutions such as Historian, data analytics (including partner platforms), BI, and workflow automation tools
Collaborate closely with the Service Delivery and Marketing teams
Successful track record of leading and delivering projects, large programs, and applications across companies and domains
Results oriented leadership and management skills with strong communication, people management and problem-solving skills
Self-starter, quick learner, experience in working in multi-cultural environment across different geographies
Education: Any Graduate
Industry: Other




Job Detail

  • Job Id
    JD2900606
  • Industry
    Not mentioned
  • Total Positions
    1
  • Job Type
    Full Time
  • Salary
    Not mentioned
  • Employment Status
    Permanent
  • Job Location
    Pune (Maharashtra)
  • Education
    Not mentioned
  • Experience
    2 to 3 Years