AI Platform Engineer - Python (FastAPI, Flask), Go/Node.js Job at Guardian Management Services

PY, IN, India

Job Description

Job Specification: AI Platform Engineer

About the Role

We are seeking an AI Platform Engineer to build and scale the infrastructure that powers our production AI services. You will take cutting-edge models, ranging from speech recognition (ASR) to large language models (LLMs), and deploy them as highly available, developer-friendly APIs.

You will be the bridge between the R&D team, who train models, and the applications that consume them. This means developing robust APIs, deploying and optimizing models on Triton Inference Server (or similar frameworks), and ensuring real-time, scalable inference.

Responsibilities

- API Development
  - Design, build, and maintain production-ready APIs for speech, language, and other AI models.
  - Provide SDKs and documentation to enable easy developer adoption.
- Model Deployment
  - Deploy models (ASR, LLM, and others) using Triton Inference Server or similar systems.
  - Optimize inference pipelines for low-latency, high-throughput workloads.
- Scalability & Reliability
  - Architect infrastructure for handling large-scale, concurrent inference requests.
  - Implement monitoring, logging, and auto-scaling for deployed services.
- Collaboration
  - Work with research teams to productionize new models.
  - Partner with application teams to deliver AI functionality seamlessly through APIs.
- DevOps & Infrastructure
  - Automate CI/CD pipelines for models and APIs.
  - Manage GPU-based infrastructure in cloud or hybrid environments.

Requirements

- Core Skills
  - Strong programming experience in Python (FastAPI, Flask) and/or Go/Node.js for API services.
  - Hands-on experience with model deployment using Triton Inference Server, TorchServe, or similar.
  - Familiarity with both ASR frameworks and LLM frameworks (Hugging Face Transformers, TensorRT-LLM, vLLM, etc.).
- Infrastructure
  - Experience with Docker, Kubernetes, and managing GPU-accelerated workloads.
  - Deep knowledge of real-time inference systems (REST, gRPC, WebSockets, streaming).
  - Cloud experience (AWS, GCP, Azure).
- Bonus
  - Experience with model optimization (quantization, distillation, TensorRT, ONNX).
  - Exposure to MLOps tools for deployment and monitoring.

Job Types: Full-time, Permanent

Pay: From ₹50,000.00 per month

Experience: 3 years of total work experience (preferred)
Work Location: In person

Job Detail

Job Id: JD4647309
Industry: Not mentioned
Total Positions: 1
Job Type: Full Time
Salary: Not mentioned
Employment Status: Permanent
Job Location: PY, IN, India
Education: Not mentioned
Experience: Year
