About Beeks
Beeks Group is a leading managed cloud provider exclusively within the agile and fast-moving financial services and capital markets sector. Our Infrastructure-as-a-Service (IaaS) model is optimised for low-latency private cloud compute, connectivity and analytics, providing the flexibility to deploy and connect to exchanges, trading venues and public cloud for a true hybrid cloud experience.
Founded in 2011, Beeks Group is listed on the AIM market of the London Stock Exchange and has grown every year since. Beeks Group now employs over 100 team members across the globe and has an international network of over thirty data centres.
We have a fantastic opportunity for a Site Reliability Engineer to join us at our unique Head Office in Renfrew, which includes our state-of-the-art gym with weekly circuit training, a personal trainer and yoga classes, as well as the Beeks Bar and a weekly masseuse to help you unwind!
About the role
As part of the newly formed Site Reliability Engineering team at Beeks, you will work closely with other teams throughout the business to foster the adoption of SRE practices and methodologies. In particular, you will drive improvements in reliability through the design and implementation of automation, and lead a shift from classical monitoring towards observability. You will also assist the existing NOC-based on-call team by acting as a senior point of escalation for incident management.
Key Responsibilities
* Champion the adoption of SRE culture and practices in other teams throughout the business
* Identify opportunities for the implementation of automation and improved tooling
* Enhance service reliability by maintaining and improving monitoring and alerting systems
* Take part in the product design lifecycle to advocate for best practices around reliability
* Be involved in the incident management process, working closely with the existing NOC-based on-call team when incidents occur
* Proactively identify areas where the application of SRE methodologies could lead to improvements in reliability and efficiency
Required Qualifications and Skills
* Proven track record of success as an SRE or in a related role (DevOps, Infrastructure Engineer, etc.)
* Highly experienced with modern monitoring and observability solutions (Prometheus/Grafana, ELK stack, or third-party hosted solutions such as Datadog or New Relic)
* Highly proficient with automation and orchestration platforms (Ansible, Chef, Puppet, etc.)
* Fluent in a programming or scripting language (preferably Python)
* Experienced in the use of CI/CD tools (e.g. Bitbucket, Jenkins)
Desired Skills
* Bachelor's or Master's degree in Computer Science, Engineering, or a related field
* Formal training or certifications in SRE concepts or related disciplines
* Experience with installation and support of Kubernetes container hosting platforms
* Familiarity with maintaining websites and applications implemented using Django
* Experienced in the use of Atlassian Jira for project and change management
What We Can Offer You
Compensation & Benefits
* A competitive salary
* A unique and highly rewarding Share Options scheme
* Highly competitive pension scheme
* EV salary exchange scheme
* Life assurance cover
* Investment in Training
* Family cover Private Health Insurance
Lifestyle
* Hybrid working (3 days in the office, 2 days at home)
* Flexible work hours
* 33 days annual leave
This full-time position is available only to candidates who have full Right to Work in the UK.
We are an equal opportunity employer.