About this role
Capital One's mission is to change banking for good by bringing humanity, ingenuity and simplicity to banking. Sitting at the core of these values is our Software Engineering department, whose primary role is to be an effective enabler of Capital One’s ambitions.
We are keen to add a Senior Site Reliability Engineering Manager (SSREM) to our Nottingham-based SRE organisation. The role's primary focus is to provide effective leadership as we evolve and mature site reliability practices for the benefit of our cloud applications and their customers. The successful candidate will be a leader of leaders with custodianship of application services across 5+ SRE teams.
We’re looking for an experienced professional whose technical background allows effective challenge and support of teams managing primarily Java-based applications running in a dynamic Infrastructure as Code (IaC) AWS cloud environment. They will have a proven ability to lead, inspire, include, empower, coach and develop their teams to deliver challenging outcomes in the pursuit of business, functional and personal goals.
The successful applicant will lead by example and build strong, valuable relationships across the SRE org and with wider tech and business stakeholders. They can face ambiguity and make sense of complexity and, importantly, communicate it to people at varying levels of seniority and experience. They work with autonomy where needed and as part of an effective team, constantly seeking creative opportunities to drive continuous improvement that all can benefit from.
We are proud of who we are and what we do, and this is an exciting time to join Capital One as we look ahead to what the future holds.
What you’ll do
* Lead a cross-functional group of reliability engineering teams (sourced from internal associates and preferred third-party vendors) in applying Site Reliability Engineering principles to applications developed in-house.
* Optimise and reduce operational overheads through observability and service automation.
* Identify growth opportunities for your manager-level direct reports and help them achieve their technical, business and personal goals.
* Work closely with peer senior manager people leader(s) and staff engineer(s) within the SRE organisation as a key member of the SRE leadership team.
* Lead the definition and tracking of Service Level Objectives (SLOs) to measure service availability, in collaboration with service, product and engineering communities.
* Collaborate with product and engineering senior managers to ensure delivery and reliability outcomes are mutually agreed and achieved.
* Oversee the management of services that are performant, reliable, scalable and secure.
* Establish a framework and culture that ensures continuous improvement of platform health, compliance and resiliency.
* Work with senior stakeholders to mature the concept of Site Reliability Engineering within the UK Capital One Tech organisation.
What we’re looking for
* Proven experience in leading engineering teams to achieve business goals.
* Technical leadership coupled with a passion for software engineering and operational processes.
* Strong background in software/system engineering and architecture within the cloud.
* Strong background in, or appreciation of, observability principles, techniques and toolsets.
* Demonstrable knowledge of the software development lifecycle within a cloud-based environment.
* Demonstrable knowledge of developing and managing RESTful API services written in a modern OO language such as Java or Python.
* Technical aptitude and passion for understanding complex distributed systems.
* Proven ability to partner effectively across engineering to maximise inner-sourcing opportunities and reduce waste.
* Ability to manage workloads efficiently and apply strong organisational skills within an agile environment.
* Excellent interpersonal and stakeholder management skills with the ability to work with multi-vendor technical and business teams.