Our client, a forward-thinking SaaS company, is seeking a skilled Site Reliability Engineer (SRE) to join their team and enhance the reliability, scalability, and performance of their infrastructure.
This role will involve applying core SRE principles to ensure system availability, troubleshooting critical issues, and collaborating across teams to optimize production systems.
Key Responsibilities:
Apply SRE principles (SLI/SLO/SLA) to improve system reliability and eliminate toil.
Build, maintain, and evolve SLO/SLI baselines for networks, systems, and applications.
Collaborate with product teams for go/no-go planning, validation, and testing of new services/products.
Analyze data and ensure the integrity of systems to optimize production performance.
Troubleshoot and resolve business-affecting issues, working closely with internal teams.
Implement best practices for system reliability and operational workflows.
Lead incident response, perform root cause analysis (RCA), and contribute to blameless post-mortems.
Qualifications:
5 years of experience with cloud/web/CDN infrastructure.
Proficiency in Python and Go; C/C++ experience a plus.
Strong knowledge of Linux systems and network protocols (TCP, UDP, DNS, TLS/SSL, HTTP).
Experience with Prometheus, Grafana, GitLab, Jenkins, and CI/CD practices.
Familiarity with big data technologies (Redis, Elasticsearch, Kafka) and container management (Docker, Kubernetes).
Strong collaboration, communication, and documentation skills.
This is a great opportunity to be part of a rapidly growing company and take a key role in scaling and maintaining mission-critical systems.
If you have a passion for SRE and want to help shape the future of SaaS infrastructure, we encourage you to apply.
For a confidential chat about support, cyber, and SRE roles, feel free to reach out.
I specialize in these areas with over 18 years of experience and work with exclusive clients.
I'd be happy to discuss how I can assist you.
Skills: SRE, Site Reliability Engineer
Benefits: Work From Home