Your Title: Site Reliability Engineering (SRE) Lead
Job Location: Chennai - India
Our Department: Mobility (Transportation)
What You Will Do
As a Site Reliability Engineer (SRE), you will play a pivotal role in ensuring the robustness, stability, and efficiency of our systems. Your primary focus will be on designing and implementing solutions that enhance the reliability of our technology stack. From proactive monitoring to incident response, you will contribute to a culture of operational excellence. Collaborating closely with development teams, you will integrate reliability into the software development lifecycle. Automation will be at the core of your approach, as you streamline operational tasks to achieve scalable and dependable services. Join us in shaping the future of our technological landscape, where your expertise in observability, automation, and collaboration will be instrumental in maintaining the highest standards of system reliability.
Primary Responsibilities:
Implement and enhance robust monitoring solutions for proactive system observation
Collaborate in the design and implementation of resilient and scalable system architectures
Conduct reliability analyses, identifying areas for continuous improvement
Actively participate in incident resolution, minimizing downtime and impact on users
Conduct post-mortem analyses to understand and address the root causes of incidents
Collaborate on scalability testing and optimization of system performance
Identify opportunities to enhance the overall efficiency and responsiveness of the systems
Integrate security practices throughout development and operations
Contribute to the implementation of incident response measures for security incidents
Develop and maintain detailed documentation of operational processes
Facilitate training sessions for development teams on operational and reliability best practices
What Skills & Experience You Should Bring
Bachelor's degree in Electrical Engineering, Computer Engineering, or a related field.
Minimum 5 years of proven experience in SRE or related roles.
Solid automation skills using tools such as Ansible, Puppet, or Chef.
Expertise in working with Terraform (mandatory).
Proficiency in at least one cloud platform (AWS, Azure, or Google Cloud), with the ability to optimize cloud architectures.
Experience with configuration management tools such as Ansible, Chef, Puppet, Terraform, or Salt.
Experience leading responses to complex incidents with quick problem resolution.
Experience collaborating with development teams to design highly available distributed systems.
Practical knowledge of tools such as Prometheus, Grafana, the ELK Stack, Kubernetes, and Docker.
Excellent communication skills to collaborate effectively and present solutions clearly and concisely.
About Trimble Transportation Division
Trimble Transportation is in the business of optimizing the movement of freight by providing shippers and carriers with the mobility, enterprise, and visibility software tools they need to run their businesses more efficiently. As the leading provider of Transportation Management Software (TMS), Asset Management Software (AMS), and Fleet Management Software (FMS), we are devoted to propelling companies in the trucking industry toward increased efficiency, lower costs, and optimized operations.
Trimble’s Inclusiveness Commitment
We believe in celebrating our differences. That is why our diversity is our strength. To us, that means actively participating in opportunities to be inclusive. Diversity, Equity, and Inclusion have guided our current success while also moving our desire to improve. We actively seek to add members to our community who represent our customers and the places we live and work.
We have programs in place to make sure our people are seen, heard, and welcomed and most importantly that they know they belong, no matter who they are or where they are coming from.