Senior Site Reliability Developer | Oase

Bengaluru, Karnataka, India

Job Description


Solve complex problems related to infrastructure cloud services and build automation to prevent problem recurrence. Design, write, and deploy software to improve the availability, scalability, and efficiency of Oracle products and services. Develop designs, architectures, standards, and methods for large-scale distributed systems. Facilitate service capacity planning and demand forecasting, software performance analysis, and system tuning.

Work with the Site Reliability Engineering (SRE) team on the shared full-stack ownership of a collection of services and/or technology areas. Understand the end-to-end configuration, technical dependencies, and overall behavioral characteristics of production services. Take responsibility for the design and delivery of the mission-critical stack, with a focus on security, resiliency, scale, and performance. Hold authority for end-to-end performance and operability. Partner with development teams in defining and implementing improvements in service architecture. Articulate technical characteristics of services and technology areas and guide development teams to engineer and add premier capabilities to the Oracle Cloud service portfolio. Understand and communicate the scale, capacity, security, performance attributes, and requirements of the service and technology stack. Demonstrate a clear understanding of automation and orchestration principles. Act as the ultimate escalation point for complex or critical issues that have not yet been documented as Standard Operating Procedures (SOPs). Use a deep understanding of service topologies and their dependencies to troubleshoot issues and define mitigations. Understand and explain the effect of product architecture decisions on distributed systems. Bring professional curiosity and a desire to develop a deep understanding of services and technologies.

A BS or MS in Computer Science, or equivalent. Identifies solutions using knowledge of server hardware and software configuration, networking, standard internet services, scripting languages, cloud computing patterns, and technology security and compliance. Experience running large-scale, customer-facing web services. Identifies solutions using an understanding of load balancing technologies and experience with development in programming languages, databases and big data stores, and container technologies. Work involves defining and documenting the technical architecture of complex and highly scalable products. A minimum of 5 years of experience running large-scale, customer-facing web services.

Role: Site Reliability Engineer

Who are we?

The Service Excellence team at Oracle Analytics Cloud (OAC) is on the verge of transforming the development paradigms at the 42-year-old software giant. With the world moving towards the Cloud, Oracle is at the forefront with a tremendous portfolio of Cloud offerings. However, this transformation happens not just at the product level, but also in the process of developing, deploying, and operating these products in the Cloud. Using a combination of cutting-edge technologies, continuous process improvements, and innovative business transformation methodologies, a small group of us is blazing the trail on the Service Excellence philosophy.

Who are we looking for?

The candidate will work with the skilled, highly motivated Oracle Analytics Service Excellence (OASE) team, which embraces an agile work style. You will work alongside a software development team within the greater OAC organization, where you will support existing features in the cloud as well as new operational processes, automation, and content. You will play a key role in improving the processes supporting the OAC services, so that the service functions more and more autonomously over time.

Roles and Responsibilities

  • Perform DevOps activities to support customers, engineers, and processes through our release cycles as well as production
  • Participate in a follow-the-sun model for 24x7 support of OAC services
  • Respond to incidents, own them and drive to completion, participate in root cause analysis
  • Document various processes & runbooks; update existing processes
  • Execute, with excellence, delivery of interim patches and hotfixes as required
  • Work with various teams to take ownership of and resolve service failures and outages
  • Monitor metrics and develop ways to improve the CI and CD tools utilized by the team
  • Follow all best practices and procedures as established by the company
  • Mentor and train other engineers and seek to continually improve processes
  • Other duties as assigned

General Qualifications

The candidate must have knowledge and experience with:
  • A BS or MS in Computer Science, or equivalent
  • Providing cloud networking, infrastructure, and service support, configuration, operations, tools, and processes
  • Networking and TCP/IP fundamentals, and services such as DNS and HTTP
  • Linux/Unix system administration including system level knowledge of Linux on OCI Gen 2, creating and executing scripts
  • Methodical approaches to troubleshooting and solving complex technical problems
  • Producing documentation in support of developed work (KBs, run books, help guides)
  • Utilizing agile methodologies
  • Communicating effectively in a team environment
  • Working with remote, global teams as well as individuals
  • Working independently and in a self-directed manner
  • Working extended weekday and weekend shifts as required for on-call, after-hours upgrades, and other duties as assigned

Preferred Qualifications

An ideal candidate will have the following skill sets:
  • 5+ years of experience running large-scale, customer-facing web services.
  • Oracle Cloud Infrastructure (OCI), AWS, Azure, or GCP compute, storage, and network operational experience.
  • Programming and scripting languages (Python, Bash, JavaScript; additional experience with PHP, Groovy, Java, and/or Go is a plus).
  • Using configuration management and automation tools such as Ansible, Puppet, or Chef in CI/CD workflows.
  • Containers and orchestration (Docker, Kubernetes, and docker-compose).
  • Oracle database, MySQL (experience with MS SQL and/or NoSQL is a plus).
  • Issue tracking and collaboration (Jira and Confluence).


Job Detail

  • Job Id
    JD2869944
  • Industry
    Not mentioned
  • Total Positions
    1
  • Job Type:
    Full Time
  • Salary:
    Not mentioned
  • Employment Status
    Permanent
  • Job Location
    Bengaluru, Karnataka, India
  • Education
    Not mentioned
  • Experience
    Year