The Reactome Graph Database models the Reactome knowledgebase as an interconnected graph database.

More Info

At the cellular level, life is a network of molecular reactions. In Reactome, these processes are systematically described in molecular detail to generate an ordered network of molecular transformations (Fabregat et al. 2015). This amounts to millions of interconnected terms that naturally form a graph of biological knowledge. The Reactome Graph provides an intuitive way to retrieve data and to interpret and analyse pathway knowledge.

Retrieving, and especially analysing, such complex data becomes tedious when using relational databases. Queries across the pathway knowledgebase are composed of a number of expensive join operations, resulting in poor performance and a hard-to-maintain project. Because of their schema-based approach, relational databases are limited in how information is stored and are therefore difficult to adapt to new requirements. To overcome these problems, the Reactome database is imported into Neo4j, creating one large interconnected graph. Graph database technology is an effective tool for modelling highly connected data.

Storing Reactome data in this form has many benefits. No denormalisation is required, so data can be stored in its natural form. Nodes in the vicinity of a starting point can be traversed quickly, allowing the user not only to retrieve data but also to perform fast analysis of these neighbour networks. Thus, knowledge that was previously unavailable due to the limitations of relational data storage can now be retrieved.
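
To make the idea of traversing the vicinity of a starting node concrete, here is a minimal Python sketch over a toy in-memory graph. The node names are made up for illustration, and this is not how Neo4j works internally; it only shows that a neighbourhood is reached by following edges outward from a node, without any join step.

```python
from collections import deque

# Toy adjacency list; every node name here is hypothetical.
TOY_GRAPH = {
    "Pathway A": ["Reaction 1", "Reaction 2"],
    "Reaction 1": ["Protein X", "Protein Y"],
    "Reaction 2": ["Protein Y", "Complex Z"],
    "Protein X": [],
    "Protein Y": [],
    "Complex Z": ["Protein X"],
}


def neighbourhood(graph, start, max_depth):
    """Return {node: depth} for every node within max_depth hops of start."""
    depths = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if depths[node] == max_depth:
            continue  # do not expand beyond the requested radius
        for neighbour in graph.get(node, []):
            if neighbour not in depths:
                depths[neighbour] = depths[node] + 1
                queue.append(neighbour)
    return depths
```

Calling neighbourhood(TOY_GRAPH, "Pathway A", 1) returns the start node and its two direct neighbours; raising max_depth widens the neighbour network that is collected.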

To make it easy to access and benefit from the graph database, we have developed GraphCore, an open source library implemented in Java. This project uses Spring Data Neo4j, which provides automatic object graph mapping on top of Neo4j and integrates tightly with other parts of the Spring Framework used across the project.

Get Started

Download

Our Graph Database is available in our download data section. It is possible to use it in your local environment by following these steps:

  1. Download and install Neo4j.
    1. We strongly recommend Neo4j version 3.5.X because of version compatibility issues.
    2. Untar/unzip the Neo4j tar/zip file.
  2. Download the Graph Database for the latest data release.
  3. Install the Graph Database (Mac/Linux users).
    1. Extract the downloaded reactome.graph.db file.
    2. Move the extracted folder to /path/to/neo4j/data/databases/
      1. If a graph.db folder already exists there, remove or rename it first.
    3. Rename the moved folder to graph.db if it has a different name.
    4. Start Neo4j: /path/to/neo4j/bin/neo4j start
  4. Install the Graph Database (Windows users).
    1. Please see the instructions here.
  5. If the standard procedure has been followed, the graph database should be accessible via the Neo4j browser on your localhost. More instructions are available in the Neo4j operations tutorial, specifically the sections “file locations” and “restoring a backup”.
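
The Mac/Linux installation steps above can be sketched in Python as follows. This is a minimal sketch, not an official installer: the function name install_graph_db and the timestamped backup name are assumptions made for illustration, and the archive is assumed to have been extracted already. Stop Neo4j before running it.

```python
import shutil
import time
from pathlib import Path


def install_graph_db(extracted_dir, neo4j_home):
    """Move an extracted graph folder to <neo4j_home>/data/databases/graph.db.

    An existing graph.db is renamed with a timestamp suffix rather than
    deleted. Returns the path of the installed folder.
    """
    source = Path(extracted_dir)
    if not source.is_dir():
        raise FileNotFoundError(f"extracted folder not found: {source}")

    databases = Path(neo4j_home) / "data" / "databases"
    if not databases.is_dir():
        raise FileNotFoundError(f"not a Neo4j databases folder: {databases}")

    target = databases / "graph.db"
    if target.exists():
        # Hypothetical backup naming scheme: keep the old database around.
        backup = databases / f"graph.db.bak-{int(time.time())}"
        target.rename(backup)

    # Moving under the name graph.db covers both the move and rename steps.
    shutil.move(str(source), str(target))
    return target
```

After a call such as install_graph_db("/path/to/extracted/graph.db", "/path/to/neo4j"), Neo4j still has to be started with bin/neo4j start as in the list above.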

Want to use Neo4j 4.x.x?

If you have trouble installing Neo4j 3.5.X and decide to use Neo4j 4.x.x instead, you may run into an unavailable database error: "Neo4j cannot be started because the database files require upgrading and upgrades are disabled in the configuration. Please set 'dbms.allow_upgrade' to 'true' in neo4j.conf and try again." In that case, set dbms.allow_upgrade to true in neo4j.conf and start Neo4j again.
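
If you prefer to script the configuration change, the following Python sketch shows one way to do it. The helper name enable_upgrade is hypothetical; it works on the text of neo4j.conf and either rewrites the first existing dbms.allow_upgrade line (commented out or not) or appends the setting when none is present.

```python
import re

# Matches an active or commented-out dbms.allow_upgrade line.
_UPGRADE_LINE = re.compile(
    r"^[ \t]*#?[ \t]*dbms\.allow_upgrade[ \t]*=.*$", re.MULTILINE
)


def enable_upgrade(conf_text):
    """Return neo4j.conf text with dbms.allow_upgrade set to true."""
    line = "dbms.allow_upgrade=true"
    if _UPGRADE_LINE.search(conf_text):
        # Rewrite only the first occurrence; everything else is untouched.
        return _UPGRADE_LINE.sub(line, conf_text, count=1)
    # No such line yet: append it, adding a newline first if needed.
    separator = "" if conf_text.endswith("\n") or not conf_text else "\n"
    return f"{conf_text}{separator}{line}\n"
```

To apply it, read neo4j.conf into a string, pass it through enable_upgrade, and write the result back. Keeping a copy of the original file first is a sensible precaution.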

Docker

If you are comfortable working with Docker, you can build a Docker image that contains Neo4j and a Reactome graph database.

Troubleshooting

Unable to access graph.db on Mac/Linux
DatabaseUnavailable:

Database "graph.db" is unavailable, its status is "offline."

There may be a few reasons for being unable to access the graph database. Some areas you can check include:

  1. Ensure the "graph.db" folder is accessible under your Neo4j installation folder. It should be located at "/path/to/neo4j/data/databases/graph.db".
  2. Ensure the user and group owner are recursively set to "neo4j:adm" using the command "chown -R neo4j:adm /path/to/neo4j".
  3. Depending on how you are using Neo4j, the database may need to be started using the "cypher-shell" or a configuration value may need to be changed to allow access to the database.
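
The first two checks above can be automated with a short Python sketch. The function name check_graph_db is hypothetical, and unlike the chown command in the list it inspects only the graph.db folder, not the whole Neo4j installation. Ownership is checked only when an expected owner or group is passed in.

```python
from pathlib import Path


def check_graph_db(neo4j_home, owner=None, group=None):
    """Return a list of problems found; an empty list means none."""
    problems = []
    graph_db = Path(neo4j_home) / "data" / "databases" / "graph.db"
    if not graph_db.is_dir():
        problems.append(f"missing folder: {graph_db}")
        return problems

    if owner is None and group is None:
        return problems  # ownership check not requested

    # Walk the folder itself and everything below it.
    for path in [graph_db, *graph_db.rglob("*")]:
        try:
            actual_owner, actual_group = path.owner(), path.group()
        except (KeyError, NotImplementedError):
            # Unknown uid/gid, or a platform without owner lookup.
            problems.append(f"cannot resolve owner of {path}")
            continue
        wrong_owner = owner is not None and actual_owner != owner
        wrong_group = group is not None and actual_group != group
        if wrong_owner or wrong_group:
            problems.append(f"{path} is owned by {actual_owner}:{actual_group}")
    return problems
```

For example, check_graph_db("/path/to/neo4j", owner="neo4j", group="adm") lists every path under graph.db whose ownership differs from the neo4j:adm value recommended above.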

If you still face any problems, please email us with the following information to help debug the problem:

  1. The version of Neo4j you are using
  2. Your operating system (Mac, Linux or Windows)
  3. The software you are using to run and access Neo4j 
    1. Neo4j Docker image vs. Neo4j community edition installed to your system
    2. Neo4j Desktop vs. Web Browser vs. Cypher-Shell
  4. Path of the Neo4j installation, if installed locally

Great! You now have your own copy of the current version of the Reactome data content in your instance of Neo4j, so let’s see how you can take advantage of it, either with direct queries to the graph database or using our GraphCore Java library.

Querying the Reactome Graph Database directly

The Neo4j browser offers a convenient interface for submitting your own queries to the graph database. We recommend using this platform for your first interaction with the Reactome Graph Database to see how easy it is to use the Cypher query language.

Please refer to our extracting pathway participating molecules tutorial for an introduction to using Cypher to query the Reactome Graph Database.
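
Cypher queries can also be sent from a script. The sketch below assumes the official neo4j Python driver is installed in a version compatible with your Neo4j server, and that the database listens on the default Bolt address; the connection details, function names and the query itself are illustrative choices, not part of the Reactome tooling. The query is deliberately schema-independent: it counts nodes per label combination, which is a reasonable first look at an unfamiliar graph.

```python
# Count nodes per label combination, largest groups first.
LABEL_COUNT_QUERY = (
    "MATCH (n) "
    "RETURN labels(n) AS labels, count(*) AS total "
    "ORDER BY total DESC LIMIT $limit"
)


def fetch_label_counts(uri, user, password, limit=10):
    """Run LABEL_COUNT_QUERY and return a list of (labels, total) tuples."""
    # Imported here so the helpers below work without the driver installed.
    from neo4j import GraphDatabase

    driver = GraphDatabase.driver(uri, auth=(user, password))
    try:
        with driver.session() as session:
            result = session.run(LABEL_COUNT_QUERY, limit=limit)
            return [(record["labels"], record["total"]) for record in result]
    finally:
        driver.close()


def format_counts(rows):
    """Render (labels, total) rows as right-aligned text lines."""
    return [f"{total:>10}  {':'.join(labels)}" for labels, total in rows]


# Example usage (hypothetical credentials, adjust to your installation):
#   rows = fetch_label_counts("bolt://localhost:7687", "neo4j", "your-password")
#   print("\n".join(format_counts(rows)))
```

The same query text can be pasted into the Neo4j browser with $limit replaced by a number, which makes it easy to compare the scripted result with the interactive one.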

API

The API for the Reactome GraphCore Java library is available on our GitHub repository.

Resources

Tutorial: Extracting participating molecules using the Graph Database.

To learn more about our graph database, have a look at our relevant publication entitled Reactome graph database: Efficient access to complex pathway data.

Cite Us!