We are strengthening the team responsible for the operational health (availability, performance, manageability) of Oracle Cloud Infrastructure Block Storage. OCI Block Volumes provide an industry first among hyperscale cloud vendors: a performance auto-tuning feature that dynamically scales performance as demand changes.
In this role you will solve complex problems related to infrastructure cloud services and build automation to prevent problem recurrence. You will also work to improve the availability, scalability, and efficiency of Oracle products and services.
* Guide and mentor junior team members and drive projects end to end.
* Monitor our service and proactively debug operational issues.
* Work with internal and external teams to diagnose performance issues.
* Test systems, including those used for performance and scalability testing.
* Improve the efficiency of deployment processes across a fast-growing number of regions.
* Participate in our on-call rotation and resolve complex distributed-system issues through debugging, communication, and collaboration with multiple SRE teams across OCI.
* Improve our operational capabilities by developing runbooks and alarms, and by building tools and documentation that enable customers to self-diagnose problems.
* Deploy our service in new regions and help to automate this process.
Basic Qualifications:
* 5+ years of relevant experience in the IT industry in a Linux-based environment
* Familiarity with Linux shell scripting and ideally Python
* Proficiency with Linux-based build and analysis tools (e.g., make, scons/cons, bazel)
* Familiarity with CI/CD environments
* Familiarity with Agile Development
* Proficiency with commonly used networking protocols such as TCP/IP
* Familiarity with Docker containers
* Familiarity with databases, NoSQL systems, storage, and distributed persistence technologies
* Troubleshooting and performance-tuning skills
Preferred Qualifications:
* Degree in Computer Science or a related engineering field
* Cloud technology certification