AI Systems Engineer

AI Systems Engineer - GPU/ROCm/CUDA | ML Frameworks Optimization

TS, IN, India

SEMI LEAF

Job Description

Job Title: AI Systems Engineer - GPU/ROCm/CUDA | ML Frameworks Optimization

Location: Hyderabad

Experience: 3-6 years (Mid-Senior)

We are looking for a passionate and experienced AI Systems Engineer to join our team, work on next-generation machine learning technologies, and optimize performance across AMD GPU accelerators. This role involves low-level GPU programming, custom ML kernel development, and working with state-of-the-art inference engines.

Key Responsibilities:

Develop and optimize custom Deep Learning GPU kernels using ROCm/CUDA or shader languages

Support and enhance ML model deployment on Linux platforms

Optimize performance of ROCm drivers and inferencing engines for AI/ML workloads

Collaborate closely with internal hardware/software teams to support next-gen GPU accelerators

Profile, debug, and improve performance of GPU kernels and AI model pipelines

Contribute to designing and implementing new AI technologies and workflows

Required Skills & Qualifications:

BS/MS in Computer Science, Electrical Engineering, or equivalent

Strong programming skills in C/C++ and Python

Solid experience working with Linux CLI, bash scripting, or PowerShell

Hands-on experience with Python ML libraries such as PyTorch and Transformers

Knowledge of writing high-performance ML kernels using Triton, JAX, or similar

Experience with debugging tools like gdb and valgrind, and profiling tools such as nsys and rocprof

Familiarity with AI inferencing runtimes such as vLLM, Ollama, llama.cpp, or SGLang

Understanding of GPU and PC architecture and x86/x64 instruction sets

Experience developing with ROCm, CUDA, or shader programming

Nice to Have:

Knowledge of x86 Assembly

Contributions to open-source ML/DL performance libraries

Exposure to compiler optimization techniques for GPU code

What We Offer:

Work on cutting-edge GPU technologies and ML systems

Exposure to performance-critical AI workloads

Collaborative and research-oriented environment

Competitive compensation and career growth opportunities

Apply:

If you are looking for a job change, share your updated resume with vagdevi@semi-leaf.com

Job Type: Full-time

Pay: Up to ₹3,000,000.00 per year

Experience:

Deep Learning GPU kernels using ROCm/CUDA: 2 years (Required)

Programming skills in C/C++, Python: 1 year (Required)

Python ML libraries such as PyTorch, Transformers: 1 year (Required)

Developing with ROCm, CUDA: 1 year (Required)

Work Location: In person

Speak with the employer

+91 7483459258

Job Detail

Job Id: JD3982798
Industry: Not mentioned
Total Positions: 1
Job Type: Full Time
Salary: Not mentioned
Employment Status: Permanent
Job Location: TS, IN, India
Education: Not mentioned
