OpenAccess Week: GraphDB and ContentService

Since version 57, we have provided our data in a Neo4j graph database, which reduces the complexity of the represented knowledgebase and allows more straightforward access to our content. Neo4j’s query language, Cypher, allows queries to be written in a more intuitive way, and the move to the graph database reduced the average response time per query by 93% (Fabregat et al., 2018).
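
To give a flavour of what such a query can look like, here is a minimal sketch in Python that sends a Cypher query to a locally running copy of the graph database. The node labels (Pathway, ReactionLikeEvent), the hasEvent relationship and the stId and displayName properties are assumptions about the graph schema and should be checked against the data model documentation; the function name and connection details are hypothetical. It assumes the official neo4j Python driver is installed.

```python
# Sketch only: labels, relationship types and property names in the query
# are assumptions about the graph schema, not taken from the article.
CONTAINED_REACTIONS_QUERY = """
MATCH (p:Pathway {stId: $st_id})-[:hasEvent*]->(r:ReactionLikeEvent)
RETURN DISTINCT r.stId AS reaction_id, r.displayName AS reaction_name
ORDER BY reaction_id
"""


def contained_reactions(uri, user, password, st_id):
    """Return the reactions found below a pathway as a list of dicts.

    `uri`, `user` and `password` describe a Neo4j instance that you run
    yourself (hypothetical values, e.g. "bolt://localhost:7687").
    """
    # Imported lazily so this module loads even without the driver installed.
    from neo4j import GraphDatabase

    driver = GraphDatabase.driver(uri, auth=(user, password))
    try:
        with driver.session() as session:
            # $st_id is passed as a query parameter rather than being
            # concatenated into the Cypher text.
            result = session.run(CONTAINED_REACTIONS_QUERY, st_id=st_id)
            return [record.data() for record in result]
    finally:
        driver.close()
```

A call such as contained_reactions("bolt://localhost:7687", "neo4j", "<password>", "<pathway stable identifier>") would then list every reaction reachable from that pathway through any number of hasEvent steps, which is the kind of traversal that tends to require several joins in a relational schema.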

The graph database also benefits Reactome in other ways, such as (i) enabling complex data quality assessment (QA) queries and (ii) supporting software that requires data pattern analysis, e.g. reaction classification. QA queries are executed during each quarterly release to identify instances that need to be checked and corrected by Reactome curators, ensuring that high-quality content is delivered to the end user. The reaction classifier project, for example, uses Cypher to formalise the concepts presented in Jupe et al. (2014) and generates a series of reports that help curators classify the reactions in Reactome.

The Content Service is an easy-to-use API, based on the Representational State Transfer (REST) architectural style, that provides access to the Reactome knowledgebase. Its methods are grouped according to their functionality. For instance, expanding the pathways group reveals a set of methods that provide specific information about pathways, such as the contained Events or the participating PhysicalEntities.
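
As an illustration of how the pathway methods mentioned above might be called from a script, here is a minimal sketch using only the Python standard library. The base URL and the two endpoint paths are assumptions and should be verified against the Content Service documentation; the helper names are hypothetical.

```python
import json
from urllib.parse import quote
from urllib.request import Request, urlopen

# Assumed public base URL of the Content Service.
BASE_URL = "https://reactome.org/ContentService"


def contained_events_url(pathway_id, base_url=BASE_URL):
    """Build the (assumed) URL that lists the Events contained in a pathway."""
    return f"{base_url}/data/pathway/{quote(pathway_id, safe='')}/containedEvents"


def participating_entities_url(event_id, base_url=BASE_URL):
    """Build the (assumed) URL that lists an Event's participating PhysicalEntities."""
    return (
        f"{base_url}/data/participants/"
        f"{quote(event_id, safe='')}/participatingPhysicalEntities"
    )


def fetch_json(url, timeout=30):
    """Send a GET request asking for JSON and return the decoded response."""
    request = Request(url, headers={"Accept": "application/json"})
    with urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

With network access, fetch_json(contained_events_url("<pathway stable identifier>")) would return the decoded JSON for that pathway. Keeping URL construction separate from the request makes it easy to inspect or log the address before anything is sent.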