Essential Duties and Responsibilities:
Create and maintain optimal data pipeline architecture.
Assemble large, complex data sets that meet functional and non-functional business requirements.
Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL and scripting tools.
Ensure that data transformations are well documented, follow best practices, and are understood by business users and developers.
Understand the relationships across business information and units of data; reverse engineer data from various source systems.
Technical Skills:
Working with cloud services, data lake storage, Hadoop, Spark, Python/Scala, Hive, HDFS, SQL and NoSQL databases.
Knowledge of creating batch or real-time data pipelines with on-premises or cloud services, ETL tools, Kafka, etc.
Performance optimization of complex ETL mappings for relational and non-relational workloads.
Hands-on experience with Unix shell scripting or PowerShell.
Develop data warehouse models, ensuring the data design follows the prescribed reference architecture framework while reflecting appropriate business rules across the conceptual, logical, and physical models.
Knowledge of setting up code versioning with GitHub and CI/CD pipelines.
Experience with data pipeline and workflow management tools: Azkaban, Luigi, Airflow
Experience with big data tools: Hadoop, Spark or Kafka
Educational background:
Mandatory (if any): Graduate Degree in Computer Science or equivalent
Desirable (if any): Strong exposure to programming languages (Python/Scala, SQL)
Preferred Experience (Industry/therapy/function no. of years): 1-3 years
Knowledge of big data in the pharmaceutical field
Product thinking, capable of assessing requirements from a user's standpoint
Expertia AI Technologies