The Role
We are looking for a stunning new colleague to join the organisation driving the transformation of Addepar’s Production Engineering team towards a platform that enables high-level declarative infrastructure orchestration and operations. This platform closely integrates our Compute, Network, and Storage control planes, allowing us to evolve blueprints for highly efficient, fast-to-iterate services tailored to various product areas within the company while shielding our developers from the nuances of the underlying infrastructure.
The ideal candidate will play a leading role in implementing and maintaining the administration of Addepar's production infrastructure. They will combine experience leading innovative solutions across functional teams with hands-on development experience in AWS/cloud, Linux/Unix, networking, scripting, containerisation, Kubernetes, Terraform, information security, debugging, and monitoring/observability to design, deploy, monitor, and automate all operational aspects of Addepar's platform.
Must-have Skills
1. Recent and proficient experience with Java, Python, Go, or similar.
2. Recent and proficient experience with Terraform and infrastructure as code (IaC).
3. Experience building & operating highly reliable distributed systems in a cloud environment.
4. Passion for technology, pragmatic thinking, and the ability to jump into an ambiguous area and break down complex problems.
What You’ll Do
1. Use Kubernetes (k8s) to maintain and operationalise container infrastructure.
2. Design, build, and maintain automated CI/CD pipelines using Jenkins, ArgoCD, AWS CodeBuild/CodePipeline, GitHub Actions, or similar.
3. Deploy and maintain Kubernetes and related technologies as part of application deployments to various clusters.
4. Use Terraform to develop, operationalise, and evangelise infrastructure as code for scaling the Addepar platform across regions.
5. Operationalise and evangelise application and infrastructure upgrades/patches.
6. Gain deep application-level knowledge to communicate infrastructure requirements and constraints to developers, QA, and management, and implement dashboards for cost and inventory management.
7. Monitor and troubleshoot our infrastructure or application stack using logging and monitoring tools.
8. Collaborate with cross-functional teams to identify and resolve application or infrastructure issues.
9. Work with engineering and operations teams to improve, document, and establish processes and broadly improve the operability and security of our systems.
10. Participate in the on-call rotation and contribute to resolving incidents.
11. As a Senior Engineer, mentor more junior engineers and contribute to the engineering culture of the SRE team.
Who You Are
1. Ideally, you'll have a Bachelor's or graduate degree in Computer Science or a related field.
2. Extensive experience in the SRE/DevOps/Systems Engineer field.
3. Cloud infrastructure fundamentals (we use AWS).
4. Strong programming/scripting skills in various common languages (we use Python [boto3], Bash, and general UNIX tools; Java is a plus).
5. Broad and deep experience with any applied aspect of UNIX/BSD/Linux internals (we use Ubuntu).
6. Containerisation experience with Kubernetes (we use kOps, EKS, and ECS).
7. Networking fundamentals, including IPv4 and IPv6 (AWS VPC is a plus).
8. Demonstrable experience with infrastructure-as-code tools such as Terraform.
9. Experience with monitoring and alerting tools such as Prometheus, Grafana, Sentry, Sumo Logic, or AWS cloud-native tools.
10. Good interpersonal skills to collaborate with multi-functional teams.
11. Demonstrable experience writing systems automation tooling is a plus (if you have open source code to share we're happy to discuss).
12. Experience administering large-scale databases such as Aurora MySQL and MongoDB is a plus.
13. Experience with upgrading and patching vendor tools is a plus.
14. Exposure to industry practices in financial services is a plus.