Engineer/Sr. Engineer, Infrastructure

Year    UP, IN, India

Job Description

About Aeris Communications Inc.





For more than three decades, Aeris has been a trusted cellular IoT leader enabling the biggest IoT programs and opportunities across Automotive, Utilities and Energy, Fleet Management and Logistics, Medical Devices, and Manufacturing. Our IoT technology expertise serves a global ecosystem of 7,000 enterprise customers, 30 mobile network operator partners, and 90 million IoT devices across the world. Aeris powers today's connected smart world with innovative technologies and borderless connectivity that simplify management, enhance security, optimize performance, and drive growth.

Role Overview




As an Engineer, Infrastructure, you will design, deploy, and troubleshoot Linux systems, hypervisors, VMs (including live migration), Kubernetes clusters, and OpenStack environments. You are the subject-matter expert (SME) responsible for the availability, reliability, security, and migration of the infrastructure stack, including secure SSH access and enterprise virtualization.

Location

Noida (work from office, 5 days per week)

Responsibilities



  • Design, implement, and manage complex infrastructure: Linux systems, hypervisors (OLVM, Proxmox, KVM, VMware), virtual machines, Kubernetes clusters, and OpenStack clouds.
  • Configure, secure, and troubleshoot sshd (Secure Shell Daemon), manage SSH keys, and enforce secure remote access policies on Linux hosts.
  • Perform live migration of VMs, including setup and operational management within OLVM and other hypervisors.
  • Oversee the VM lifecycle: provisioning, migration, resource allocation, backup, and fault resolution.
  • Monitor and automate infrastructure for reliability and scalability.
  • Ensure Linux system hardening (patching, security, auditing) and manage advanced system/network performance issues.
  • Deploy, scale, and secure Kubernetes clusters for container workloads.
  • Architect, operate, and troubleshoot OpenStack services and integrations.
  • Lead incident management (L3/expert escalations) and technical mentoring.
  • Document infrastructure standards and procedures.
  • Implement monitoring, alerting, and backup solutions.
  • Collaborate with cross-functional teams on infrastructure automation and CI/CD integration.
  • Drive compliance with security, regulatory, and operational controls.

Skills Required



A. Operating Systems



  • Expert administration of Red Hat, Oracle Linux, Ubuntu, CentOS, SUSE, etc.
  • Deep understanding and troubleshooting of sshd.
  • Install, configure, tune, and secure /etc/ssh/sshd_config.
  • SSH key management, user access control, two-factor authentication, and auditing.
  • Handling root and user SSH policies, port forwarding, and proxies.
  • Linux boot/internals: kernel, systemd, SELinux, PAM, disk partitions, LVM/RAID, ZFS
  • Automation and scripting: Bash, Python, Ansible
  • OS optimization, security hardening, backup/restore, disaster recovery
  • Experience with Windows Server upgrades

B. Hypervisors



  • Advanced management of OLVM (Oracle Linux Virtualization Manager), Proxmox, KVM, VMware, Xen, and others
  • VM provisioning, storage pools, network bridges, VM snapshot and backup
  • Live migration of VMs across Oracle Linux KVM hosts
  • Security policies, templates, cluster setup, performance tuning in OLVM
  • VM lifecycle, clustering, resource automation, live migration, network configuration
  • Integration with storage and backups, fault finding/remediation

C. Kubernetes



  • Cluster deployment, scaling, backup/restore, and upgrades (kubeadm, kops, Rancher, etc.)
  • Deep knowledge of Kubernetes architecture: API, etcd, controller-manager, kubelet, and networking (CNI plugins)
  • Advanced experience creating, modifying, and troubleshooting Pods, including multi-container and ephemeral containers
  • Pod scheduling, affinity/anti-affinity, taints/tolerations, node selectors, and pod priorities
  • Health checks with liveness, readiness, and startup probes for pods
  • Pod resource management: understanding resource requests, limits, and best practices for optimizing resource usage
  • Managing pod disruption budgets (PDBs) to ensure high availability during maintenance
  • Handling pod lifecycle events (init containers, hooks, restarts, termination, graceful shutdowns)
  • Securing pods with pod security policies (PSPs), SecurityContexts, and namespaces
  • Troubleshooting pod networking issues, DNS integration, inter-pod communication (Services, NetworkPolicies)
  • Managing pod logs, events, and debugging (kubectl exec, logs, describe)
  • Volume management for pods (PersistentVolumeClaims, ephemeral volumes, projected volumes)
  • Rolling updates, canary deployments, and managing pod availability during application upgrades
  • Hands-on experience with Operators and Custom Resources (CRDs) for advanced pod management
  • Application deployment using Helm, Kustomize, and manifest authoring
  • Cluster monitoring and alerting: Prometheus, Grafana, ELK stack
  • Disaster recovery planning: etcd and cluster state backups, restore procedures
  • Security: RBAC, secrets, service accounts, role bindings, pod identity
  • Integrating Kubernetes with CI/CD pipelines (Jenkins, GitLab, ArgoCD)
  • Cluster autoscaling (HPA/VPA), node autoscaling, and performance optimization

D. OpenStack



  • Architectural mastery of core OpenStack services (Nova, Neutron, Cinder, Swift, Keystone, Glance, Horizon), including service dependencies and message queues.
  • Automated deployment and upgrade management using tools like Ansible, Kolla, TripleO, and Packstack for installing, configuring, and updating OpenStack clusters.
  • Advanced Nova compute operations, including VM provisioning, resizing, live migration, host aggregates, resource optimization, and troubleshooting.
  • Neutron networking design and troubleshooting: building complex network topologies (provider/self-service networks, VLAN/VXLAN), managing routers and security groups, and resolving L2/L3 issues.
  • Cinder block storage administration with multi-backend setups (Ceph, NFS, iSCSI), volume/snapshot management, performance tuning, and backups.
  • Keystone identity, access, and RBAC management: project isolation, integration with LDAP/AD/SAML, token security, and API endpoint security.
  • Glance image service management: secure image creation, registration, distribution, snapshotting, and integration with automation pipelines.
  • High availability and disaster recovery design: HA clustering of controllers/services (Pacemaker/Corosync), database/message bus HA, failover, and DR strategy/testing.
  • Monitoring, telemetry, and logging with Ceilometer, Monasca, or Prometheus/Grafana to enable real-time metrics, alerting, and advanced troubleshooting.
  • OpenStack API and automation expertise: scripting with the OpenStack CLI/REST API, orchestration with Heat, and integration with external and hybrid cloud workflows.

Minimum Qualifications



  • Bachelor's degree in Computer Science, Engineering, or a related field (or equivalent experience)
  • 5+ years managing enterprise Linux systems and VMs (L3)
  • Direct hands-on experience with OLVM in production
  • 3+ years with Kubernetes and/or OpenStack clusters
  • Proven experience with live migration in OLVM, Proxmox, or VMware

Preferred Certifications



  • Red Hat Certified Engineer (RHCE)
  • Oracle Linux Certified Implementation Specialist (for OLVM)
  • Certified Kubernetes Administrator (CKA)
  • OpenStack Administrator Certification
  • VMware Certified Professional (VCP)

Soft Skills



  • Strong troubleshooting and analytical skills
  • Excellent communication and documentation practices
  • Collaborative, proactive, and adaptable



Job Detail

  • Job Id
    JD5010315
  • Industry
    Not mentioned
  • Total Positions
    1
  • Job Type:
    Full Time
  • Salary:
    Not mentioned
  • Employment Status
    Permanent
  • Job Location
    UP, IN, India
  • Education
    Not mentioned
  • Experience
    Year