Lead site reliability engineer

London

Board Intelligence Limited

Site reliability engineer

€150,000 - €200,000 a year

Posted: 24 March

Offer description

The Opportunity

Are you interested in making a difference? Do you want to work for a tech-for-good company whose reason for being is to help all boards and leadership teams to be a powerful driver of performance and a force for good? Board Intelligence is on a mission to bring kindness and success together and to drive companies to think about what matters. We work with over 30,000 Chairs, CEOs, and board members to embed the discipline of focus into their organisations, and we’re helping a new board every day to focus on what matters. We are in it for the long term. Come join us on this journey.

As our Lead Site Reliability Engineer (SRE), you'll be leading an existing team that ensures the availability, performance, security, and reliability of our platform and core services. You will take the lead on key technical projects, mentor and guide the team, and ensure our systems meet the needs of our users.

Reliability Engineering at Board Intelligence

The SRE team provides the highest standards of availability, scalability, performance, and security for our SaaS environments across multiple cloud vendors and our private cloud infrastructure. Your team will deliver enabling infrastructure, pipelines, and tooling to support product development. Through collaboration with security, product development, and commercial teams you'll ensure the future suitability of our infrastructure, whilst setting standards and methodologies for engineering work and proactively monitoring our platform and responding to incidents.

Key Responsibilities

* Lead and mentor a team of SREs, fostering a collaborative and high-performing environment.
* Manage key technical projects, ensuring timely delivery and adherence to quality standards.
* Maintain a strong technical understanding of our systems and contribute to their development and maintenance.
* Improve the security posture of our infrastructure and applications.
* Ensure the reliability and stability of our platform.
* Contribute to the design and implementation of a scalable, multi-tenant architecture.
* Implement and maintain monitoring solutions and build automation to reduce toil.
* Participate in on-call duties.

What experience and skills might you have

We prefer to hire the best talent, whether or not they are familiar with all of the tools that we use. We don’t need you to be familiar with everything on this list, but experience in some or all of these areas will be useful, and a willingness to dive in and learn the others is essential.

* Proven experience leading and mentoring SRE or DevOps teams, with strong delegation, communication, and collaboration skills.
* Extensive experience managing and maintaining on-premises infrastructure.
* Deep understanding of cloud-native architectures and experience managing infrastructure solutions.
* Expertise in IaC (Terraform), configuration management tools, and CI/CD pipelines.
* Strong understanding of security best practices and experience implementing security controls.

Desirable skills would be:

* Experience with service mesh technologies.
* Familiarity with co-located physical infrastructure.
* Experience with database administration.
* Knowledge of Ruby, Java, or Go.

Engineering at Board Intelligence

Everyone says it, but in our case it’s true: each member of our engineering team is amazing in their own right, but together they are what brings our product to life.

We’re very proud of the team we’ve built – there are around 50 of us in Product and Tech now after growing quickly in 2023/24. We have ambitious plans to further improve our ways of engineering and to continue to enable boards to ‘see what matters’. You’ll play a big role in helping us achieve this in 2025/26 and beyond.

Tech Stack

Our applications are written in Ruby (with Rails) or Java. Client-side web apps are written in React, and some services in Clojure, Java, and Go.

Our platform consists of:

* Multiple Kubernetes Clusters for container orchestration.
* Apache Kafka and Redis, along with Postgres, for event messaging.
* Postgres for data storage.
* OpenStack Swift for object storage.
* Juniper & Cisco networking devices.
* A number of internally written tools for managing the platform, written in Go.

We run our own physical infrastructure, co-located in three datacentres across the UK. We also run a public cloud production environment on GCP for one of our products, and we’re moving towards more public cloud for production and pre-production environments and pipelines.

Benefits

* Competitive salary & pension scheme.
* Personal performance bonus.
* 26 days holiday each calendar year.
* Bupa health & dental cover.
* Group life insurance.
* EAP; AIG Smart Health and Bereavement Counselling & Probate Helpline.
* Regular training & development, mini MBA series, lunch & learns.
* Cycle to work scheme.
* Competitive parental policies.
* Gym membership discounts.
* Monthly company socials.
