Job Title: Site Reliability Platform Engineer
About Luupli:
Luupli is a social media app that has equity, diversity, and equality at its heart. We believe that social media can be a force for good, and we are committed to creating a platform that maximizes the value that creators and businesses can gain from it, while making a positive impact on society and the planet. Our team is made up of passionate and dedicated individuals who are committed to making Luupli a success.
Role Description:
We are seeking a talented and experienced Site Reliability Engineer (SRE) to join our team. As an SRE, you will play a crucial role in ensuring the reliability, scalability, and performance of our cloud-based infrastructure and services, primarily hosted on AWS. If you have a passion for problem-solving, a deep understanding of AWS services, hands-on experience with Terraform, and proficiency in scripting with Python or Bash, we invite you to apply for this exciting opportunity.
Role and Responsibilities:
1. Infrastructure Design and Automation:
- Collaborate with software engineering and operations teams to design, build, and maintain cloud-based infrastructure using AWS and Terraform.
- Implement and enhance infrastructure-as-code (IaC) practices using Terraform to ensure reproducibility and scalability of infrastructure components.
2. Monitoring and Incident Management:
- Develop and maintain monitoring solutions to proactively identify performance bottlenecks, system outages, and other potential issues.
- Participate in incident response and root cause analysis efforts to drive continuous improvement and prevent future incidents.
3. Reliability and Performance Optimization:
- Optimise system performance, reliability, and cost efficiency through continuous monitoring, performance tuning, and capacity planning.
- Identify opportunities to automate manual processes and improve system resilience.
4. Scripting and Automation:
- Utilise Python or Bash scripting to create and maintain automation tools for various operational tasks and deployments.
- Implement and improve continuous integration and continuous deployment (CI/CD) pipelines.
5. Security and Compliance:
- Collaborate with security teams to implement best practices for securing cloud infrastructure and services.
- Ensure compliance with relevant industry standards and regulations.
6. Deployment and Release Management:
- Support CI/CD pipelines for application deployments and updates.
- Contribute to the design and implementation of deployment strategies that promote zero-downtime releases.
7. Documentation and Knowledge Sharing:
- Maintain clear and up-to-date documentation for infrastructure configurations, processes, and incident resolution procedures.
- Participate in knowledge sharing with team members to enhance overall expertise and skill sets.
Requirements:
1. Education and Experience:
- Bachelor's degree in Computer Science, Engineering, or a related field (or equivalent practical experience).
- Proven experience as a Site Reliability Engineer or in a similar role.
2. Technical Skills:
- Extensive experience with Amazon Web Services (AWS) and its core services (EC2, S3, RDS, IAM, etc.).
- Strong proficiency in infrastructure-as-code (IaC) tools, with a focus on Terraform.
- Proficient in scripting with Python or Bash for automation and operational tasks.
- Solid understanding of networking principles and protocols.
- Knowledge of CI/CD pipelines and related tools.
3. Problem-Solving and Analytical Abilities:
- Ability to diagnose and resolve complex technical issues in a fast-paced environment.
- Analytical mindset to proactively identify potential system weaknesses and performance bottlenecks.
4. Collaboration and Communication:
- Strong teamwork and collaboration skills to work effectively with cross-functional teams.
- Excellent verbal and written communication skills.
Compensation:
This is an equity-only position, offering a unique opportunity to gain a stake in a rapidly growing company and contribute directly to its success.