Summary
About the Company: Join a leading global financial services firm where technologists and investment professionals collaborate to drive innovation and operational excellence. The team focuses on building and maintaining cutting-edge systems for high-performing, fast-evolving investment strategies.
About the Role: As a Site Reliability Engineer (SRE), you will leverage your expertise in software and systems engineering to design, build, and maintain robust infrastructure. Your role involves reducing system complexity, improving performance, and ensuring smooth operation of mission-critical applications from development to production. The SRE team has diverse technical experience, including production engineering, software quality metrics, chaos engineering, networking, UNIX internals, and cloud architecture. You will drive automation, optimization, incident response, and support other engineering teams while collaborating with key stakeholders to create reliable, resilient systems.
Responsibilities:
* Manage the full lifecycle of applications at a business level.
* Ensure high availability, reliability, and performance of critical systems.
* Automate repetitive tasks and resolve complex system issues.
* Investigate new solutions to enhance system operations.
* Develop green-field engineering solutions based on root cause analysis.
* Lead incident response, driving both tactical fixes and long-term improvements.
* Promote SRE principles across engineering teams and the organization.
* Collaborate with industry leaders and top-tier engineers.
Qualifications:
* Bachelor’s degree in Computer Science, a related technical field, or equivalent professional experience.
* Strong foundation in computer science principles, including data structures, algorithms, and distributed systems.
* Proficiency in at least one modern programming language (Python preferred).
* Experience with software development tools and best practices (testing, version control, CI/CD).
* Strong database knowledge (SQL experience is a plus).
* Web UI development skills (JavaScript, CSS, React) are beneficial.
* Excellent communication skills and a passion for solving challenging technical problems.
* An entrepreneurial spirit with adaptability to new technologies and evolving requirements.
* A drive for learning and applying novel solutions to hard problems.
* Only candidates with strong tenure and a consistent employment history will be considered.
Additional Qualifications:
* Cloud Platforms: Familiarity with AWS, Google Cloud, or Azure is highly beneficial.
* Containerization and Orchestration: Experience with Docker and Kubernetes for managing containerized applications.
* Monitoring and Logging Tools: Proficiency with Prometheus, Grafana, the ELK stack, or Splunk.
* Security Best Practices: Understanding of security principles and best practices in system and network administration.
* Automation Tools: Experience with automation tools such as Ansible, Terraform, or Chef.
* Incident Management: Strong skills in incident management and root cause analysis to quickly resolve issues and prevent recurrence.