PhD Studentship: Learning Caching Strategies for Dynamic Workloads on Graph Databases
As the volume and complexity of data have grown, particularly in recent years, database systems have begun to evolve along different paths for different use cases. While relational databases have traditionally been popular, graph databases have more recently risen to prominence. A graph database uses nodes (vertices) and relationships (edges) to create a graph (network) that represents entities and the associations between them.
Data caching plays a central role in maintaining low latency and high throughput of data access by ensuring that data items likely to be accessed in the near future are available in main memory. However, most existing caching strategies employed by graph databases do not take graph topology into account when determining which items should be kept in the cache. They also do not adapt well to dynamic changes in the graph topology and the query workload.
In this project, we plan to investigate how the topology of a graph can be exploited to optimise the caching strategy, with the aim of improving hit rate and overall throughput. We will also explore how machine learning techniques can be used to dynamically learn an optimal caching policy based on the current topological context and data access patterns.
The successful candidate will work in the Distributed Systems and Concurrency group at the University of Surrey and benefit from close collaboration with Neo4j, a world leader in graph database technology. The supervisory team has a strong publication record in systems, distributed systems and concurrency, programming languages, and formal verification, including publications at flagship venues such as OSDI, ATC, EuroSys, VLDB, and PODC.
Seniority Level
* Internship
Employment Type
* Full-time
Job Function
* Research
* Analyst
* Information Technology
Industries
* Higher Education