We aim to bring about a new paradigm in medical image diagnostics: intelligent, holistic, ethical, explainable, and patient-centric. We're looking for innovative problem-solvers who empathize with clinicians and patients, understand business problems, and can design and deliver reliable, intelligent products.
Key Responsibilities
Manage Linux servers, GPU nodes (drivers/CUDA/MIG), and networked storage for AI training/inference.
Operate PACS/XNAT, DICOM routing (AE Titles, TLS), and high-throughput data movement (DICOM/NIfTI, 100 GB to TB scale).
Provision and maintain MQ brokers (RabbitMQ/ActiveMQ/Kafka): clustering, TLS, HA, backups.
Monitoring & alerting: Prometheus/Grafana; ship logs to ELK/OpenSearch/Loki; SIEM integration and alerts.
Security & access: SSO/IdP, RBAC, MFA, PAM; IAM guardrails; cert/PKI management with automated rotation.
Patch & vulnerability mgmt: CIS baselines, kernel/driver updates, EDR; scheduled windows and remediation SLAs.
Backups & DR: datasets, model artifacts, PostgreSQL/MongoDB; RPO/RTO targets; quarterly restore drills.
Config management: Ansible/SSM/Salt for baseline images, desired-state, and drift detection.
Network services: DNS/DHCP/NTP, segmentation/VPC/VLAN, VPN/WireGuard, egress/WAF policies.
Storage ops: RAID/ZFS, snapshots/quotas, S3/MinIO lifecycle & versioning, cost-aware tiering.
Compliance & audit: HIPAA/GDPR/ISO-27001 controls, immutable audit logs, retention policies.
Change/incident mgmt: runbooks, on-call, postmortems; coordinate maintenance windows.
Support AI teams: capacity planning (CPU/GPU/RAM/IOPS), license servers, specialized imaging environments.
Skills and Qualifications (Required)
3+ years of Linux systems administration in high-availability environments.
GPU servers: NVIDIA CUDA/drivers; Docker runtime fundamentals.
Networking fundamentals (firewalls, VPNs, load balancers); storage (NAS/S3).
Monitoring/logging: Prometheus, Grafana, ELK/OpenSearch.
Message queues: RabbitMQ/ActiveMQ/Kafka basics (clustering, failover).
Cloud familiarity (AWS/GCP/Azure), IAM and cost-aware provisioning.
Knowledge of cybersecurity best practices and compliance frameworks (HIPAA, GDPR, ISO 27001).
Security & compliance: CIS hardening, patch mgmt, EDR, audit/retention.
Config management (Ansible/SSM/Salt) and scripting (Bash/Python).
Preferred
PACS or XNAT administration; DICOM routers; imaging data movers.
SaltStack/systemd health checks; log stacks (Loki/Tempo).
Scripting (Bash, Python) for automation and troubleshooting.
Exposure to database administration (PostgreSQL, MongoDB).
Vault/Secrets Manager; OPA/Gatekeeper awareness.
Education
BE/B.Tech or equivalent experience.
Location & Work Setup
On-site in Gurugram; on-premises work if required.
Job Type: Full-time
Pay: ₹800,000.00 - ₹1,000,000.00 per year
Application Question(s):
How many years of Linux production server administration experience do you have?
Have you managed GPU servers and CUDA drivers?
Which monitoring and logging tools have you used?
How many years of production experience with Docker and Kubernetes do you have?
Do you have experience with configuration automation?
Work Location: In person
Speak with the employer
+91 8126334433