For more than three decades, Aeris has been a trusted cellular IoT leader enabling the biggest IoT programs and opportunities across Automotive, Utilities and Energy, Fleet Management and Logistics, Medical Devices, and Manufacturing. Our IoT technology expertise serves a global ecosystem of 7,000 enterprise customers, 30 mobile network operator partners, and 90 million IoT devices across the world. Aeris powers today's connected smart world with innovative technologies and borderless connectivity that simplify management, enhance security, optimize performance, and drive growth.
Role Overview
As an Engineer - Infrastructure, you will design, deploy, and troubleshoot Linux systems, hypervisors, VMs (with live migration), Kubernetes clusters, and OpenStack environments. You are the SME ensuring the availability, reliability, security, and migration of the infrastructure stack, including secure SSH access and enterprise virtualization.
Location: Noida (work from office, 5 days per week)
Responsibilities
Design, implement, and manage complex infrastructure: Linux systems, hypervisors (OLVM, Proxmox, KVM, VMware), virtual machines, Kubernetes clusters, and OpenStack clouds.
Configure, secure, and troubleshoot sshd (Secure Shell Daemon), manage SSH keys, and enforce secure remote access policies on Linux hosts.
Perform live migration of VMs including setup and operational management within OLVM and other hypervisors.
Oversee VM lifecycle: provisioning, migration, resource allocation, backup, and fault resolution.
Monitor and automate infrastructure for reliability and scalability.
Ensure Linux system hardening (patching, security, auditing) and manage advanced system/network performance issues.
Deploy, scale, and secure Kubernetes clusters for container workloads.
Architect, operate, and troubleshoot OpenStack services and integrations.
Lead incident management (L3/expert escalations) and technical mentoring.
Document infrastructure standards and procedures.
Implement monitoring, alerting, and backup solutions.
Collaborate with cross-functional teams for infrastructure automation and CI/CD integration.
Drive compliance with security, regulatory, and operational controls.
Skills Required
A. Operating Systems
Expert administration of Red Hat, Oracle Linux, Ubuntu, CentOS, SUSE, etc.
Deep understanding and troubleshooting of sshd.
Install, configure, tune, and secure /etc/ssh/sshd_config.
SSH key management, user access control, two-factor authentication, and auditing. Handling root and user SSH policies, port forwarding, and proxies.
Linux boot and internals: kernel, systemd, SELinux, PAM, disk partitions, LVM/RAID, ZFS
Automation and scripting: Bash, Python, Ansible
OS optimization, security hardening, backup/restore, disaster recovery
Experience with Windows Server upgrades
B. Hypervisors
Advanced management of OLVM (Oracle Linux Virtualization Manager), Proxmox, KVM, VMware, Xen, and others
VM provisioning, storage pools, network bridges, VM snapshot and backup
Live migration of VMs across Oracle Linux KVM hosts
Security policies, templates, cluster setup, performance tuning in OLVM
VM lifecycle, clustering, resource automation, live migration, network configuration
Integration with storage and backups, fault finding/remediation
C. Kubernetes
Cluster deployment, scaling, backup/restore, and upgrades (kubeadm, kops, Rancher, etc.)
Deep knowledge of Kubernetes architecture: API, etcd, controller-manager, kubelet, and networking (CNI plugins)
Advanced experience creating, modifying, and troubleshooting Pods, including multi-container and ephemeral containers
Pod scheduling, affinity/anti-affinity, taints/tolerations, node selectors, and pod priorities
Health checks with liveness, readiness, and startup probes for pods
Pod resource management: understanding resource requests, limits, and best practices for optimizing resource usage
Managing pod disruption budgets (PDBs) to ensure high availability during maintenance
Handling pod lifecycle events (init containers, hooks, restarts, termination, graceful shutdowns)
Securing pods with Pod Security Admission and Pod Security Standards (or legacy PodSecurityPolicies on clusters older than Kubernetes 1.25), SecurityContexts, and namespaces
Troubleshooting pod networking issues, DNS integration, inter-pod communication (Services, NetworkPolicies)
Managing pod logs, events, and debugging (kubectl exec, logs, describe)
Volume management for pods (PersistentVolumeClaims, ephemeral volumes, projected volumes)
Rolling updates, canary deployments, and managing pod availability during application upgrades
Hands-on experience with Operators and Custom Resource Definitions (CRDs) for advanced pod management
Application deployment using Helm, Kustomize, and manifest authoring
Cluster monitoring and alerting: Prometheus, Grafana, ELK stack
Disaster recovery planning: etcd and cluster state backups, restore procedures
Security: RBAC, secrets, service accounts, role bindings, pod identity
Integrating Kubernetes with CI/CD pipelines (Jenkins, GitLab, ArgoCD)
Pod autoscaling (HPA/VPA), node autoscaling, and performance optimization
D. OpenStack
Architectural mastery of core OpenStack services (Nova, Neutron, Cinder, Swift, Keystone, Glance, Horizon), including service dependencies and message queues.
Automated deployment and upgrade management using tools like Ansible, Kolla, TripleO, and Packstack for installing, configuring, and updating OpenStack clusters.
Advanced Nova compute operations including VM provisioning, resizing, live migration, host aggregates, resource optimization, and troubleshooting.
Neutron networking design and troubleshooting: building complex network topologies (provider/self-service networks, VLAN/VXLAN), managing routers, security groups, and resolving L2/L3 issues.
Cinder block storage administration with multi-backend setups (Ceph, NFS, iSCSI), volume/snapshot management, performance tuning, and backups.
Keystone identity, access, and RBAC management -- project isolation, integration with LDAP/AD/SAML, token security, and API endpoint security.
Glance image service management -- secure image creation, registration, distribution, snapshotting, and integration with automation pipelines.
High availability and disaster recovery design -- HA clustering of controllers/services (Pacemaker/Corosync), database/message bus HA, failover, and DR strategy/testing.
Monitoring, telemetry, and logging with Ceilometer, Monasca, or Prometheus/Grafana to enable real-time metrics, alerting, and advanced troubleshooting.
OpenStack API and automation expertise -- scripting with OpenStack CLI/REST API, orchestration with Heat, and integration with external and hybrid cloud workflows.
Minimum Qualifications
Bachelor's degree in Computer Science, Engineering, or a related field (or equivalent experience)
5+ years managing enterprise Linux systems and VMs (L3)
Direct hands-on experience with OLVM in production
3+ years with Kubernetes and/or OpenStack clusters
Proven experience with live migration in OLVM, Proxmox, or VMware
Preferred Certifications
Red Hat Certified Engineer (RHCE)
Oracle Linux Certified Implementation Specialist (for OLVM)
Certified Kubernetes Administrator (CKA)
OpenStack Administrator Certification
VMware Certified Professional (VCP)
Soft Skills
Strong troubleshooting and analytical skills
Excellent communication and documentation practices
Collaborative, proactive, and adaptable