Established in 1833, McKesson is a US Fortune 10 global leader in healthcare supply chain management solutions, retail pharmacy, healthcare technology, community oncology, and specialty care. We partner with life sciences companies, manufacturers, providers, pharmacies, governments, and other healthcare organizations to help provide the right medicines, medical products, and healthcare services to the right patients at the right time, safely and cost-effectively.
Based in Bangalore, India, McKesson Compile's data is a comprehensive, fully linked system of record for the US healthcare market, with intelligence on 2M+ healthcare professionals (HCPs) and over 800K facilities. Compile's data includes high-capture medical and pharmacy claims and closed-capture Medicare claims (100%), along with best-in-class provider affiliations and customer master data.
At McKesson we deliver careers with purpose and potential. Our focus on better health starts with creating an inclusive environment with strong values where you can build a fulfilling career. You can count on us to provide you with resources and opportunities to grow and be your best, while contributing to our pursuit of improving lives.
About Us
At Compile (a McKesson company), we're transforming fragmented healthcare data into powerful intelligence that drives real-world impact -- from mapping patient journeys to optimizing go-to-market strategies for life sciences.
We're building a modern, scalable, and secure data platform that powers data products across the organization. As a Principal Engineer, you'll be the hands-on technical leader driving the design and development of this foundational platform.
If you're passionate about clean architecture, distributed systems, and solving real-world data challenges -- especially in healthcare -- this is your opportunity to make a deep impact.
What You'll Do
- Architect and lead development of a reusable, scalable data platform framework
- Design robust ETL/ELT pipelines for structured and semi-structured healthcare data
- Build APIs and internal tools using Django, focused on performance and maintainability
- Use Prefect for orchestration, and Ray or Spark for distributed compute
- Leverage Databricks for testing and validation of data pipelines (not for primary compute)
- Enforce data quality, observability, and reliability using Metaplane or similar tools
- Integrate and manage data across Postgres, Snowflake, and Snowflake Shares
- Optimize for scalability and performance in a cloud-native Azure environment
- Mentor engineers and collaborate with product, data, and platform teams
Tech Stack
- Languages & Frameworks: Python (Django, FastAPI), SQL
- Orchestration & Compute: Prefect, Ray, Apache Spark
- Data Storage & Modeling: Postgres, Snowflake, dbt, Snowflake Shares
- Cloud Platform: Azure (Blob Storage, Data Factory, Azure Functions)
- Testing & CI/CD: Pytest, GitHub Actions, Databricks (for test pipelines)