Senior Site Reliability Engineer (GCP, Kubernetes)
About the role:
Take charge of ensuring our data-intensive infrastructure is robust, secure, scalable, and optimized for exceptional performance, delivering the best experience for our customers. As an SRE, you’ll champion best practices across teams, shaping the future of our technological landscape.
Help us build an innovative platform that enables seamless, real-time transactions, including instant fund transfers between our app and exchanges. Your work will focus on solving complex engineering challenges that bridge traditional and crypto financial services, driving cost-effective, 24/7 access to digital finance.
Join us remotely: our work is 100% online, so you can be located anywhere in or near the CET time zone. The position is full-time.
About us:
BeOne is a next-generation neobank that redefines how individuals and businesses manage money by blending traditional and digital finance. Our platform offers multi-currency accounts, ultra-low fees, real-time global payments, and robust financial tools, all within an intuitive, refined interface.
Our bold vision is to become the largest regulated funds and data transfer network for both retail and business customers. We empower users with financial freedom, security, and efficiency, whether for personal finances, business operations, or global investments.
In this role, you will:
* Participate in defining and leading the SRE vision and strategy, ensuring alignment with business objectives and engineering priorities.
* Architect, develop, and maintain infrastructure on GCP and GKE, covering both high-level and low-level design, with performance at every layer and security, availability, and reliability at the core.
* Develop automated solutions for system reliability, capacity planning, and incident response to minimize manual intervention.
* Cooperate with engineering and product teams to design and implement highly available and fault-tolerant systems.
* Participate in improving Service Level Objectives, Service Level Indicators, and error budgets to enhance system reliability.
* Work towards increased compliance with applicable frameworks and regulations (DORA, SOC 2, ISO 27001, GDPR).
* Document the solutions you implement.
* Influence and mentor engineering teams on SRE principles, DevOps culture, and best practices.
* Keep up with industry trends, leveraging new tools, frameworks, and methodologies to consistently enhance system reliability.
* Maintain the right balance between a high level of security and the comfort and flexibility teams need to work effectively.
* Participate in daily stand-ups and planning meetings.
What we expect from you:
* 5+ years in a DevOps, SRE, or similar role in the FinTech domain.
* Strong experience in managing platforms autonomously, with a focus on risk assessment and decision-making.
* Proficiency in at least one programming language: Python, Go, C++, or Java.
* Strong Linux administration skills (Debian/Ubuntu).
* Solid grasp of LAN/WAN networking, firewalls, proxy servers, load balancers, and protocols (HTTP(S), DNS, SSH, TCP/IP, REST).
* Hands-on experience with Docker containerization.
* Familiarity with CI/CD systems and version control.
* Expertise in Kubernetes and Helm.
* Experience with public cloud platforms (GCP, AWS, or Azure).
* Proven ability to implement redundancy and disaster recovery scenarios.
* Track record in scaling high-efficiency production systems.
* Proficiency with observability tools (e.g., Prometheus, Grafana, Grafana Mimir, OpenTelemetry).
* Strong written and spoken English (B2 level or higher).
Nice to Have:
* Experience with Argo CD and Argo Rollouts.
* Familiarity with technologies such as Kafka, Redis, Nginx, Apache HTTP Server, OpenVPN, and NATS.
* Knowledge of logging tools (Kibana, Fluentd, Elasticsearch).
* Expertise in configuring, managing, and optimizing large PostgreSQL databases.
* Understanding of SSO and Okta technologies.
* Self-motivated, accountable, and capable of working independently.
* An interest in finance, trading, and crypto.
Why it’s worth a try - advantages of working at ICEO:
* Remote-first company - we enable you to work from anywhere in the world.
* Flexible working hours - we understand the challenges of juggling personal and professional lives. That is why we have core working hours between 11 am and 3 pm CET, offering you the opportunity to choose when you work outside of those hours.
* 38 days PTO - you have 38 days of paid time off per year, so that you can recharge and relax.
* Learning & development - opportunity to grow through internal and external learning and development programs.
* A modern technical stack with an emphasis on quality.
Recruitment Process:
* Screening with Talent Acquisition Partner.
* First interview with the Hiring Manager.
* Technical Challenge Interview with DevOps Team.
Want to know more?
* Take a look at our profile on Clutch and find out what our clients say about us.
* Visit our website and check who we have helped to succeed.