At Ripjar, we help governments and organisations automate the detection, investigation, and monitoring of threats from criminal activity.
Ripjar originally spun out from GCHQ and now has 140 staff based across Cheltenham, Bristol, London and Canberra, as well as a smaller presence in the USA. We have two successful, inter-related products: Labyrinth Screening and Labyrinth Intelligence. Labyrinth Screening allows companies to monitor their customers or suppliers for entities that they aren’t allowed to or do not want to do business with (for ethical or environmental reasons). Labyrinth Intelligence empowers organisations to perform deep investigations into varied datasets to find interesting patterns and relationships.
Data infuses everything Ripjar does. We work with a wide variety of datasets of all scales, including an always-growing archive of 10 billion news articles in (nearly!) every language in the world going back over 30 years, sanctions and watchlist data provided by governments, plus 250 million organisations and ownership data from global corporate registries.
This is a great time to join a growing group of highly talented technologists and data scientists who are building products that solve real-world issues and are changing the way criminal activities are detected and prevented.
Team Mission
The data science team, which sits within the engineering team, enables the delivery of high-quality data science products and software to a variety of environments through technical skills, process implementation and software management, anchored in a continuous innovation culture.
What you'll be doing
We're looking for an experienced, highly motivated Data Scientist to support the research, development, and ongoing maintenance of Ripjar's analytics and data products. You will carry out data analysis tasks to develop Ripjar’s understanding of relevant data and will develop, evaluate and deploy machine learning models that integrate with Ripjar's software products and data processing pipelines. You will be working with language models, machine learning tools and large-scale distributed clusters. This role is well suited to a Data Scientist with a strength in computing and engineering who, as well as deriving insights, is keen to deploy data science products and keep improving them through iteration.
You will have a strong technical and theoretical background, and be proficient in at least one programming language, preferably Python. You will have a good understanding of machine learning and large-scale data analysis, and will be comfortable working with complex data at scale.
Some recent developments Ripjar’s data scientists have been involved with:
* AI Risk Profiles - Whitepaper – Entity Resolution across tens of millions of news articles
* Profile Summaries (generated using LLMs) – Using LLMs to summarise news articles linking an entity to financial crime (or other similar risks)
Key Tasks:
* Carry out data analysis tasks to develop Ripjar’s understanding of relevant data.
* Make use of Ripjar’s large-scale data processing and analysis infrastructure to analyse data sets in order to identify patterns and to produce statistical outputs to support the development of new analytics and models.
* Develop and evaluate machine learning models to enhance Ripjar’s software and data products.
* Integrate these models into our software and consider the lifecycle and practical use of each model.
* Work with Ripjar's Data Engineers and engineering teams to support the scaling up and integration of new analytics and models into Ripjar's products and data processing pipelines.
* Produce statistical tests and summarise test outputs.
* Document analytics, models and test methodologies.
* Provide support to stakeholders in understanding analytics, models and test results.
* Support and maintain your models in production.
Requirements
Key Skills
We value diversity of experience and thought and recognise successful candidates may not tick all the following boxes. If you think you have something to offer, then we'd love to chat to you and hear how you would contribute to this role.
* A good understanding of machine learning and experience training and evaluating machine learning models.
* Experience integrating data science models into products, testing them, and maintaining them long term.
* Proficiency using Natural Language Processing techniques for solving problems, ideally including Large Language Models.
* Proficiency in Python, particularly with machine learning and data science libraries such as PyTorch, scikit-learn, numpy and scipy.
* Good communication and interpersonal skills.
* Experience working with large-scale data processing systems such as Spark and Hadoop.
* Experience in software development in agile environments and an understanding of the software development lifecycle.
* Experience using or implementing ML Operations approaches is valuable.
* Working knowledge of statistics and experience with producing data visualisations.
Benefits
Why we think you’ll enjoy it here:
* Base Salary of up to £70,000 per year DOE
* 25 days annual leave, rising to 30 days after 5 years of service
* Hybrid working option for employees
* Life assurance
* Company Share Scheme
* Private Family Healthcare
* Employee Assistance Programme
* Company contributions to your pension
* Enhanced maternity/paternity pay
* The latest tech including a top of the range MacBook Pro
* Offices equipped with well-stocked pantries with food, snacks and drinks when in the office
Ripjar's Commitment to Diversity
“Diversity is essential in the way we operate. Having people from different backgrounds, genders and experiences ensures that we make decisions with a truly global perspective. Diversity gives us strength in our technology, analysis and relationships.” - Maria Cox, Head of People Operations