Maintain end-to-end life-cycle management of RHEL servers, including provisioning, installation, software packaging, patching, planned & unplanned maintenance, service configuration and integration with our monitoring platform.
Develop & continuously enhance tools, utilities, reports & frameworks that assist production support, operational processes and re-engineering efforts.
Work closely with Cloud Engineering to enable development of end-to-end automated platforms.
Maintain the health and hygiene of Linux servers.
Contribute towards API gateway-related deliverables & proactively move towards serverless infrastructure.
Contribute towards developing a holistic front end for our core infrastructure services, initially meant to give our team operational visibility, while also providing application teams with some frequently needed information.
Handle independent assignments in troubleshooting, problem diagnosis and problem resolution for one or more technologies.
Proactively monitor the stability and performance of various technologies within area of expertise and drive appropriate corrective action before an incident or problem occurs.
Actively collaborate with fellow members of the team and contractors/vendors on bridge calls to prevent or resolve incidents/problems in an expeditious manner.
Recommend, deploy and document strategies and solutions for problems/incidents based upon comprehensive and thoughtful analysis of business goals, objectives, requirements and existing technologies.
Independently identify key issues, patterns and deviations during the analysis.
Participate in, and provide input to, the continual refinement of processes, policies and best practices to ensure the highest possible performance and availability of technologies.
Create, maintain and update documentation including troubleshooting guides, procedure/support manuals, and communication plans.
Continuously develop specialized knowledge and technical subject matter expertise by remaining apprised of industry trends, the direction of emerging technologies, and their potential value to the business.
Contribute towards development of operational reporting including daily health check reports, capacity/performance reports, and incident/problem reports.
Data Collection, Tracking & Analysis
Use a variety of data collection techniques and systems to collect technology operations performance data.
Analyze the collected data to draw accurate conclusions regarding performance, trends and issues (current and/or potential).
Develop tools & utilities to improve compliance with defined SLAs/OLAs.
Monitor consumption/usage metrics to understand trends to assist in the effective management of vendor partners (as applicable).
Perform trend analysis to identify the cause of performance and/or usage issues.
Continuous Improvement
Work with application teams to determine the impact of application changes on the monitors configured for an application and determine if any changes or additions are required.
Assist teams in identifying monitoring requirements and implementing the appropriate monitors to achieve the desired results.
Use experience, expertise and data analysis to collaborate with the manager and team members in identifying corrective actions to increase efficiency, improve performance and meet or exceed targets.