Senior Linux/Systems Engineer Job in FlexAI

Senior Linux/Systems Engineer

KA, IN, India


Job Description

Join FlexAI:

FlexAI is at the forefront of revolutionizing AI computing by reengineering infrastructure at the system level. Our groundbreaking architecture, combined with sophisticated software intelligence, abstraction, and an orchestration layer, allows developers to leverage a diverse array of compute, resulting in efficient, more reliable computing at a fraction of the cost. We are seeking a skilled and experienced Senior Linux/Systems Engineer.

Founded by Brijesh Tripathi, who brings experience from Nvidia, Apple, Tesla, Intel and Zoox, FlexAI is not just building a product - we're shaping the future of AI. Our teams are strategically distributed across Paris, Silicon Valley, and Bangalore, united by a shared mission: to deliver more compute with less complexity.

If you're passionate about shaping the future of artificial intelligence, driving innovation, and contributing to a sustainable and inclusive AI ecosystem, FlexAI is the place for you!

Position Overview:

We are looking for a Senior Linux/Systems Engineer to design, build, and operate bare-metal AI/HPC GPU clusters. You'll own platform bring-up (UEFI/BIOS, bootloaders, OS), kernel/device enablement, low-level networking (RoCEv2/InfiniBand), GPU/accelerator stack readiness, and repeatable automation for provisioning and compliance. This role suits someone who enjoys getting their hands dirty in firmware, kernel, and PCIe, and then scaling that knowledge with Ansible/Python.

What you'll do:

Platform & Boot Enablement:

Own server bring-up: UEFI/BIOS configuration, Secure Boot/TPM/Measured Boot, GRUB, PXE/iPXE flows. Integrate and automate BMC/IPMI/Redfish workflows for out-of-band provisioning and fleet management.

OS & Kernel Engineering:

Build, customize, and harden Ubuntu images (cloud-init, Debos) and tune systemd/init for low-latency, high-throughput workloads. Diagnose and fix kernel/user-space issues using perf, ftrace, eBPF/bpftrace; configure NUMA, IRQ affinity, cgroups/namespaces.

PCIe/Driver Enablement:

Validate PCIe topologies and features (ACS/ARI/ATS), SR-IOV, IOMMU/VFIO; bring up NIC/GPU drivers and firmware. Root-cause device initialization and performance regressions across kernel, drivers, and userspace.

Provisioning & Automation at Scale:

Author idempotent Ansible playbooks/roles; implement Python/Pytest test harnesses for pre/post-provision validation. Build golden images and repeatable pipelines for server provisioning, configuration drift detection, and remediation.

GPU/Accelerator & HPC Stack Readiness:

Enable NVIDIA CUDA/NCCL/GPUDirect RDMA and AMD ROCm; validate multi-GPU/multi-node performance. Stand up and tune NCCL/UCX, MPI (OpenMPI), torchrun/PyTorch for distributed training workloads.

Containers & Build Tooling:

Build and maintain minimal, reproducible Docker images and docker-compose environments for CI and validation. Use C/Go/Python, Make/CMake, and CI (GitHub Actions/GitLab CI) to publish and maintain validation and automation tools.

High-Performance Networking:

Configure and tune RoCEv2 and/or InfiniBand fabrics; validate rdma-core/libibverbs paths end-to-end. Optimize congestion control, MTU/jumbo frames, NUMA/RSS/IRQ steering for consistent throughput/latency.

Security & Compliance:

Apply CIS hardening baselines; maintain Secure Boot policy, measured boot attestations, and patch compliance. Implement access controls and auditability across firmware, OS, and cluster automation.

What you'll need to be successful:

Educational Background:

Bachelor's or Master's degree in Computer Science, Software Engineering, or a related field.

Technical Skills:

Platform/Boot: UEFI/BIOS, GRUB, Secure Boot, PXE/iPXE, BMC/IPMI/Redfish.
OS/Kernel: Linux (Ubuntu), systemd/init, eBPF, perf/ftrace/bpftrace, cgroups, namespaces, NUMA, IRQ affinity.
Drivers/PCIe: PCIe fundamentals (ACS/ARI/ATS), SR-IOV, VFIO, IOMMU, NIC/GPU drivers.
Provisioning/Automation: Ansible, Python, Pytest, Debos, cloud-init.
Containers: Docker, docker-compose.
Build/Dev: C, Python, Go (optional), Make, CMake, CI (GitHub Actions/GitLab CI).
Networking (HPC): RoCEv2, InfiniBand, libibverbs/rdma-core, NCCL/UCX, MPI (OpenMPI).
GPU/Accel: NVIDIA (CUDA, NCCL, GPUDirect RDMA), AMD ROCm.
Security/Compliance: CIS hardening, Secure Boot, TPM/Measured Boot.

Professional Experience:

7+ years in Linux systems engineering, including kernel/userspace debugging and performance tuning.
Proven ownership of bare-metal server bring-up and fleet-scale provisioning via Ansible/Python.
Hands-on experience with PCIe device enablement (SR-IOV/VFIO/IOMMU) and NIC/GPU driver stacks.
Demonstrated success enabling multi-GPU/multi-node training over RoCEv2 or InfiniBand.
Track record of building reproducible OS images and container artifacts for production use.

Soft Skills:

Ability to mentor peers, partner with researchers/ML engineers, and influence cross-functional roadmaps. Clear, concise documentation habits; you turn tribal knowledge into automation and runbooks.

Preferred Qualifications:

Experience in cloud-based AI solutions and infrastructure. Familiarity with performance benchmarking and optimization. Knowledge of modern development practices and Agile methodologies.

What we offer:

A competitive salary and benefits package, tailored to recognize your dedication and contributions.
The opportunity to collaborate with leading experts in AI and cloud computing, learning from the best and the brightest, fostering continuous growth.
An environment that values innovation, collaboration, and mutual respect.
Support for personal and professional development, empowering you with the tools and resources to elevate your skills and leave a lasting impact.
A pivotal role in the AI revolution, shaping the technologies that power the innovations of tomorrow.

Offices:

Our teams are strategically distributed across three continents--Europe, North America, and Asia--united by a shared mission: to deliver more compute with less complexity.

Paris - HQ
San Francisco (Bay Area) - US office
Bangalore - India office

Apply NOW!

You've seen what this role entails. Now we want to hear from you! Does this opportunity align with your aspirations? If you're even slightly curious, we encourage you to apply - it could be the start of something extraordinary!

At FlexAI, we believe diverse teams are the most innovative teams. We're committed to creating an inclusive environment where everyone feels valued, and we proudly offer equal opportunities regardless of gender, sexual orientation, origin, disabilities, veteran status, or any other facets of your identity that make you uniquely you.


Job Detail:

Job Id: JD4664369
Industry: Not mentioned
Total Positions: 1
Job Type: Full Time
Salary: Not mentioned
Employment Status: Permanent
Job Location: KA, IN, India
Education: Not mentioned
