Department Overview
The Communications Service Reliability Engineering (SRE) team is an integral part of Oracle’s Communications Global Business Unit, which provides Communications Cloud SaaS solutions. The SRE team provides the technical and operational support link for SaaS solutions between Level 1 support, IaaS and PaaS operations teams, and SaaS development teams for Oracle Corporation. The SRE team is responsible for the overall health, performance, availability, and reliability of the Communications Cloud SaaS services operating in the Oracle Cloud (PaaS/IaaS) environment. The team’s mission, goals, and objectives are based on advocating and implementing Agile, DevOps, automation, and Site Reliability Engineering philosophies and best practices to achieve excellence in operating Communications SaaS services.
Position Description: About the Job…
The Communications Service Reliability Engineer (SaaSSRE) position supports SaaS operations teams providing 24x7x365 coverage and may require working in rotational shifts, including on weekends and public holidays. The primary responsibility is supporting daily operations and maintenance of critical Communications SaaS offerings deployed in production Cloud Platform (IaaS and PaaS) and Database environments. SaaSSRE members play a key role as operational systems and process administrators/owners. SaaSSRE members continually exercise and grow diverse skills to comply with and execute SRE industry best practices. Finally, SaaSSRE members support SaaS application development projects as Operational Advisers/SMEs, executing production readiness activities and managing/executing development projects for operational/non-functional capabilities.
Roles and Responsibilities: What You'll Do…
- Perform day-to-day operational activities to keep production SaaS Communications Services offerings in adherence to Service Level Agreements and Objectives (SLA/SLO) for reliability and availability, and to organizational Operational Level Agreements (OLA).
- Perform duties as defined by Incident, Change, Service Request, and Problem Management processes.
- Participate in the investigation, documentation, and resolution of issues affecting SaaS Services in the production environment.
- Using a data-driven process/mindset, author technical content to support the incident response process (e.g. postmortem/root cause analysis) and develop interim solutions to prevent issues/problems or resolve them quickly the next time they occur.
- When not working on operations, you will be working on software engineering tasks such as the design and development of systems that increase reliability and scalability and reduce operational overhead through automation and orchestration principles.
- Interact with colleagues on technical and non-functional topics related to the operation and support of SaaS Services.
- Quickly master technical and functional areas to perform the duties assigned.
- Take on ongoing assignments to learn cutting-edge cloud computing technologies, tools, and services.
Desired Qualifications: What the Perfect Candidate Will Have
- Strong experience programming in a high-level language such as Python, Java, Golang, Groovy, Bash scripting, or JavaScript.
- Strong experience with monitoring and analytics tools/applications, such as ELK, Kibana, Grafana, Prometheus, Nagios, and Oracle Enterprise Manager/Management Cloud.
- Strong Experience with Microservice architecture and containerization technologies (e.g. Docker)
- Strong Experience with APIs, REST, JSON, XML as well as other common standards, data structures and protocols commonly used in creating internet and cloud services.
- Strong Experience with continuous integration, continuous delivery/deployment (CI/CD) tools (e.g. GitLab-CI, Jenkins)
- Strong experience with DevOps and Agile methodologies and practices.
- Strong experience with ITIL and Site Reliability Engineering practices
- Experience with web/cloud-based software version control tools such as GitLab, GitHub, Git, Artifactory, Docker Registries
- Experience with cloud orchestration tools (e.g. Kubernetes, Chef, Ansible, Puppet).
- Experience with networking technologies (routing, switching, IP addressing, DNS, Load balancers, etc.)
- Experience with Relational Databases, SQL language, and Database Management tools (Oracle, MySQL).
- 4-year Degree with a technical major (e.g. Computer Science, Systems Engineering, Engineering, IT, etc.)
- Technical problem-solving capabilities and the ability to break down complex problems across multiple domains.
- Experience with operations/support processes and tools such as: Incident Management, Change Management, Ticketing Systems, Service Level Agreements, Escalation processes.
- Experience working with major cloud platform(s): Oracle Cloud Infrastructure (OCI), Microsoft Azure, Google Cloud Platform or AWS – any certification(s) a plus.
- Excellent communication skills – verbal and written English
- Motivated to learn multiple cutting-edge technologies in the cloud industry
Detailed Description and Job Requirements
Responsible for our production infrastructure, including the servers and services which support our growing client base as well as designing and implementing highly scalable environments. This Engineer works with other teams in the organization and provides infrastructure solutions for their needs. Understands client systems and applications, networking, infrastructure, data centers, web tools and technologies, databases and Cloud, Big Data, Enterprise Resource Planning (ERP), and more.
Design new scalable solutions for a fast-changing infrastructure environment with complex needs in fields like configuration deployments, monitoring, and logging. Perform deep drill-down analysis into performance bottlenecks and provide necessary fixes. Bring in new ideas, and change, evolve, improve, and simplify the production infrastructure. Work closely with our development and research teams and provide customer-friendly solutions and support. Responsible for working on the design, development, and/or deployment of enterprise supporting systems.
BS degree or equivalent experience relevant to functional area. Suggested majors include Computer Science or Mathematics. Working knowledge of software development tools, methodologies, and programming languages. Experience working with external or internal customers to implement large-scale solutions, business process architecture, application system design, and implementation. Design and implementation of Infrastructure as a Service (IaaS), Platform as a Service (PaaS), and Software as a Service (SaaS) solutions using a variety of cloud platform services. Highly technical and analytical, possessing significant implementation and operations experience. Identifies solutions by drawing on experience in application or server architecture and networking. A minimum of 5 years of experience in application or server architecture and networking, or related experience.