Lead Assistant Manager

Ahmedabad, Gujarat, India

Job Description:

Consultant II – Data Engineering Profile

Role and Responsibilities:

  • Execute and manage large-scale ETL processes to support the development and publishing of reports, data marts, and predictive models.
  • Build ETL pipelines in Spark, Python, and Hive that process transaction- and account-level data and standardize data fields across various data sources (see the first sketch after this list).
  • Build and maintain high-performing ETL processes, including data quality and testing, aligned across technology, internal reporting, and other functional teams.
  • Create data dictionaries, set up and monitor data validation alerts, and execute periodic jobs such as performance dashboards and predictive-model scoring for client deliverables.
  • Define and build technical/data documentation under a code version control system (e.g., Git); ensure data accuracy, integrity, and consistency.
  • Find opportunities to create, automate, and scale repeatable financial and statistical analyses.
  • Collaborate with data engineering teams across regions on the production and maintenance of clients' key data assets.
  • Build the right data engineering governance and practices to ensure sustainable and scalable processes.
  • Design, build, and deploy advanced analytics models aimed at improving clients' fraud risk strategies, using Hadoop technology stacks and programming languages such as Hive, PySpark, Python, Spark, and shell.
  • Design, build, and deploy an anomaly detection tool to identify suspicious transaction behavior of accounts (see the second sketch after this list).
  • Develop a monitoring and alert mechanism for the anomalies using shell scripts.
  • Design, build, and deploy self-serve reporting tools across business functions and clients.
  • Design complex algorithms and apply machine learning and statistical methods to large datasets for reporting, predictive, and prescriptive modeling.
  • Develop and implement coding best practices using Hadoop/Hive, Python, and PySpark.
  • Collaborate with offshore and onshore teams and effectively communicate status, issues, and risks daily.
  • Review and propose new standards for naming, describing, managing, modeling, cleansing, enriching, transforming, moving, storing, searching, and delivering all data products within the enterprise.
  • Analyze existing and future data requirements, including data volumes, data growth, data types, latency requirements, data quality, volatility of source systems, and analytic workload requirements.
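
To make the pipeline bullets above concrete, a minimal PySpark sketch of the field-standardization step follows. All table and column names (raw_txns_a, raw_txns_b, std_txns, acct_id, and so on) are hypothetical placeholders for illustration, not artifacts from the actual role.

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("txn_standardize").getOrCreate()

    # Two hypothetical raw sources with differing schemas.
    src_a = spark.table("raw_txns_a").select(
        F.col("acct_id").alias("account_id"),
        F.col("txn_amt").cast("decimal(18,2)").alias("amount"),
        F.to_date("txn_dt", "yyyy-MM-dd").alias("txn_date"),
    )
    src_b = spark.table("raw_txns_b").select(
        F.col("account_number").alias("account_id"),
        F.col("amount_usd").cast("decimal(18,2)").alias("amount"),
        F.to_date("posted_on", "MM/dd/yyyy").alias("txn_date"),
    )

    # Union on column names and drop rows that fail basic quality checks
    # before publishing the standardized fact table.
    txns = src_a.unionByName(src_b).filter(
        F.col("account_id").isNotNull() & (F.col("amount") > 0)
    )
    txns.write.mode("overwrite").saveAsTable("std_txns")

unionByName keeps the merge robust to column ordering, and the null/positive-amount filter stands in for the data quality checks the bullets describe.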
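
A second sketch illustrates one plausible statistical baseline for the anomaly detection bullet: flagging transactions more than three standard deviations above an account's own mean. It assumes the hypothetical std_txns table from the previous sketch; the 3-sigma rule is an illustrative choice, not the method the role prescribes.

    from pyspark.sql import SparkSession, Window
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("txn_anomaly").getOrCreate()

    w = Window.partitionBy("account_id")
    scored = (
        spark.table("std_txns")
        .withColumn("mu", F.mean("amount").over(w))
        .withColumn("sigma", F.stddev("amount").over(w))
        # stddev is null for single-transaction accounts, so those rows
        # are never flagged.
        .withColumn(
            "is_anomaly",
            F.col("sigma").isNotNull()
            & (F.col("amount") > F.col("mu") + 3 * F.col("sigma")),
        )
    )
    scored.filter("is_anomaly").write.mode("overwrite").saveAsTable("txn_anomalies")

The flagged table could then feed the shell-script alerting mentioned above, e.g., a periodic job that sends a summary whenever txn_anomalies grows.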
Candidate Profile:

Preferred Qualifications:
  • Required: Hadoop/Hive, Python, Spark
  • 2+ years of hands-on experience working with big data platforms such as Cloudera, Hortonworks, or MapR
  • Python and ML modeling experience is a plus
  • Experience using the Agile approach to deliver solutions
  • Experience handling large and complex data in big data environments
  • Experience designing and developing complex data products and transformation routines
  • Experience working in the financial services and risk analytics domain, a plus
  • Strong record of achievement, solid analytical ability, and an entrepreneurial hands-on approach to work
  • Outstanding written and verbal communication skills
  • BA/BS/B.Tech minimum educational requirement, with at least 3-5 years of work experience
  • Lead experience building data engineering pipelines for large big data platforms, data warehouses, and data lakes with Hadoop technologies
  • Working knowledge of the Hadoop ecosystem and associated technologies (e.g., Apache Spark, Hive, Python, Presto, Airflow, and Pandas)
  • Strong problem-solving capabilities and the ability to quickly propose feasible solutions and effectively communicate strategy and risk-mitigation approaches to leadership
Technical Qualifications:
  • Strong experience creating large-scale data engineering pipelines, data-based decision making, and quantitative analysis
  • Strong experience with code version control systems such as Git and job automation tools such as Apache Airflow; good knowledge of CI/CD pipelines is desirable
  • Advanced experience writing and optimizing efficient SQL queries with Python, Hive, and Scala, handling large datasets in big data environments
  • Experience with complex, high-volume, multi-dimensional data, as well as machine learning models based on unstructured, structured, and streaming datasets
  • Experience with SQL for extracting, aggregating, and processing big data pipelines using Hadoop, EMR, and NoSQL databases
  • Experience creating and supporting production software/systems and a proven track record of identifying and resolving performance bottlenecks in production systems
  • Experience with Unix/shell or Python scripting and exposure to scheduling tools such as Oozie and Airflow (see the sketch after this list)
  • Exposure to stream-processing systems such as Apache Storm and Spark Streaming
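
To ground the scheduling bullet above, here is a minimal Airflow sketch (assuming Airflow 2.4 or newer) that chains the two hypothetical jobs from the earlier sketches into a nightly run. The DAG id, schedule, and script paths are invented for illustration.

    from datetime import datetime

    from airflow import DAG
    from airflow.operators.bash import BashOperator

    with DAG(
        dag_id="daily_txn_scoring",
        start_date=datetime(2024, 1, 1),
        schedule="0 2 * * *",  # nightly at 02:00
        catchup=False,
    ) as dag:
        # Refresh the standardized transactions table, then flag anomalies.
        standardize = BashOperator(
            task_id="standardize_txns",
            bash_command="spark-submit /opt/jobs/standardize_txns.py",
        )
        detect = BashOperator(
            task_id="detect_anomalies",
            bash_command="spark-submit /opt/jobs/detect_anomalies.py",
        )
        standardize >> detect

The >> operator sets the dependency, so detect_anomalies runs only after standardize_txns succeeds.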

EXL Service



Job Detail

  • Job Id: JD3200182
  • Industry: Not mentioned
  • Total Positions: 1
  • Job Type: Full Time
  • Salary: Not mentioned
  • Employment Status: Permanent
  • Job Location: Ahmedabad, Gujarat, India
  • Education: Not mentioned
  • Experience: Year