We are seeking a highly motivated and results-oriented Junior Site Reliability Engineer (SRE) to join our team. As an SRE, you will play a critical role in ensuring the reliability, performance, and scalability of our mission-critical applications and infrastructure. The role will see you learning and embracing multiple tenets of SRE. You will be responsible for designing, implementing, and maintaining systems that deliver a seamless experience for our users.
Responsibilities:
* Work in our SRE team to design, implement and maintain monitoring solutions using tools like Grafana, Kibana, Prometheus and AppDynamics.
* Contribute to projects and sprints that firm up SLOs/SLIs for Developer pipeline applications, ensuring high availability and performance.
* Proactively identify bottlenecks in application performance, bring your findings to Sprint stand-up, and help implement solutions.
* Write Python/JavaScript scripts to automate and streamline operational tasks.
* Take ownership of problems and work with the larger team to drive resolution. A bias for action is preferred, and upskilling is encouraged.
* Analyze system logs and metrics to identify the root cause of issues.
* Effectively communicate technical details to both technical and non-technical audiences.
* Work collaboratively with development, operations, and other teams to ensure alignment and smooth execution.
* Contribute to a culture of continuous improvement and knowledge sharing.
As an SRE, you will:
* Gain exposure to the latest technologies in use at a market leader like Citi, including Tekton, Harness and OpenShift.
* Gain hands-on experience with Gen AI, developing use cases from idea inception to product delivery.
* Work with agile tools to manage tasks.
* Learn, from the ground up, how to build observability for our supported applications, and be part of the team that designs SLOs/SLIs to improve application performance.
Skills Required:
* Proficiency in Python or JavaScript, or a willingness to learn.
* A good understanding of the software development lifecycle and pipeline management. Working experience is an added advantage.
* Basic understanding of observability principles and SLOs/SLIs.
* Experience in using monitoring tools like Grafana, Kibana, Prometheus, and AppDynamics.
* Basic working knowledge of data visualization tools like Tableau.
* Understanding of Agile concepts and related processes.
* Working or theoretical knowledge of OpenShift, Tekton and Harness pipelines.
* Excellent problem-solving and analytical skills.
* Strong communication and collaboration skills.
Job Family Group:
Technology
Job Family:
Applications Support
Time Type:
Full time