The Reactome Graph Database models the Reactome knowledgebase as an interconnected graph database.

More Info

At the cellular level, life is a network of molecular reactions. In Reactome, these processes are systematically described in molecular detail to generate an ordered network of molecular transformations (Fabregat et al. 2015). This amounts to millions of interconnected terms that naturally form a graph of biological knowledge. The Reactome Graph provides an intuitive way to retrieve data as well as to interpret and analyse pathway knowledge.

Retrieving, and especially analysing, such complex data becomes tedious when using relational databases. Queries across the pathway knowledgebase are composed of a number of expensive join operations, resulting in poor performance and a hard-to-maintain project. Due to their schema-based approach, relational databases are limited in how information is stored and are thus difficult to scale for new requirements. To overcome these problems, the Reactome database is imported into Neo4j, creating one large interconnected graph. Graph database technology is an effective tool for modelling highly connected data.

Storing Reactome data in this form has many benefits. No denormalisation is required, so data can be stored in its natural form. Nodes in the vicinity of a starting point can be traversed quickly, allowing the user not only to retrieve data but also to perform fast analysis of these neighbour networks. Thus, knowledge that was previously unavailable due to the limitations of relational data storage can now be retrieved.

To make it easy to access and benefit from the graph database, we have developed GraphCore, an open source library implemented in Java. This project uses Spring Data Neo4j, which provides automatic object graph mapping on top of Neo4j and integrates tightly with other parts of the Spring Framework used across the project.

Get Started

Download

To run our graph database on your personal computer, please choose the installation option that suits your needs. Our team recommends Neo4j Desktop, but we know that many developers prefer to work from the terminal.

1. Neo4j Desktop

If you would like to use Neo4j Desktop, we have created a dedicated page for it. Please see the instructions here.

2. Neo4j Community manual installation

Our Graph Database is available in our download data section. It is possible to use it in your local environment by following these steps:

  1. Download and install Neo4j version 4.
    1. Untar/unzip Neo4j tar/zip file.
  2. Download the Graph Database for the latest data release.
  3. Install the Graph Database for Mac/Linux users.
    1. Extract the Reactome.graph.db after downloading.
    2. Move graph.db folder to /path/to/neo4j/data/databases/
      1. If graph.db already exists, remove it or rename it.
    3. Rename the folder to graph.db.
    4. Configure Neo4j
      1. Edit the Neo4j configuration file at /path/to/neo4j/conf/neo4j.conf
      2. We recommend having all these settings:
        1. dbms.default_database=graph.db
        2. dbms.recovery.fail_on_missing_files=false
        3. unsupported.dbms.tx_log.fail_on_corrupted_log_files=false
    5. Start Neo4j: /path/to/neo4j/bin/neo4j start
  4. Install the Graph Database for Windows users
    1. Please see the instructions here.
  5. If the standard procedure has been followed, the graph database should be accessible via the Neo4j browser on your localhost. More instructions are available in the Neo4j operations tutorial, specifically the sections “file locations” and “restoring a backup”.
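
The configuration step above can also be scripted. The following is a minimal sketch, assuming you prefer to apply the three recommended settings programmatically rather than by hand. The helper name apply_settings and the sample configuration text are hypothetical; only the three setting names and values come from the steps above.

```python
# Recommended settings, copied from the configuration step above.
RECOMMENDED_SETTINGS = {
    "dbms.default_database": "graph.db",
    "dbms.recovery.fail_on_missing_files": "false",
    "unsupported.dbms.tx_log.fail_on_corrupted_log_files": "false",
}


def apply_settings(conf_text, settings=RECOMMENDED_SETTINGS):
    """Return conf_text with every key in `settings` set exactly once.

    The first line that defines a key (commented out or not) is replaced
    in place, later duplicates of that key are dropped, and keys that do
    not occur at all are appended at the end. Other lines are untouched.
    """
    written = set()
    out = []
    for line in conf_text.splitlines():
        # Strip a leading '#' so that commented-out defaults match too.
        key = line.strip().lstrip("#").strip().split("=", 1)[0].strip()
        if key in settings:
            if key not in written:
                out.append(f"{key}={settings[key]}")
                written.add(key)
        else:
            out.append(line)
    for key, value in settings.items():
        if key not in written:
            out.append(f"{key}={value}")
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    # Hypothetical excerpt of a neo4j.conf file, for illustration only.
    sample = "#dbms.default_database=neo4j\ndbms.memory.heap.initial_size=512m\n"
    print(apply_settings(sample), end="")
```

To use it on a real installation, read /path/to/neo4j/conf/neo4j.conf, pass its contents to the function, and write the result back after keeping a copy of the original file.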

Want to use Neo4j 4.x.x?

If you have trouble using Neo4j 3.5.X with our data, we strongly recommend upgrading your Neo4j version to 4.X.X and following the instructions above.

Docker

If you are comfortable working with Docker, you can build a Docker image that contains Neo4j and a Reactome graph database.

Troubleshooting

Not able to access graph.db in Mac/Linux
DatabaseUnavailable:

Database "`graph.db`" is unavailable, its status is "offline."

There may be a few reasons for being unable to access the graph database. Some areas you can check include:

  1. Ensure the "graph.db" folder is accessible under your Neo4j installation folder. It should be located at "/path/to/neo4j/data/databases/graph.db".
  2. Ensure the user and group owner are recursively set to "neo4j:adm" using the command "chown -R neo4j:adm /path/to/neo4j".
  3. Depending on how you are using Neo4j, the database may need to be started using the "cypher-shell" or a configuration value may need to be changed to allow access to the database.
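
The first two checks above can be automated. The sketch below assumes a local Mac/Linux installation; the function name check_graph_db and its parameters are hypothetical, while the folder layout and the "neo4j:adm" ownership come from the list above. It only reports problems and changes nothing on disk.

```python
import tempfile
from pathlib import Path


def check_graph_db(neo4j_home, db_name="graph.db",
                   expected_owner="neo4j", expected_group="adm"):
    """Return a list of human-readable problems with the database folder.

    Pass expected_owner=None to skip the ownership check.
    """
    db_dir = Path(neo4j_home) / "data" / "databases" / db_name
    problems = []
    if not db_dir.is_dir():
        problems.append(f"missing folder: {db_dir}")
        return problems
    if not any(db_dir.iterdir()):
        problems.append(f"empty folder: {db_dir}")
    if expected_owner is not None:
        try:
            owner, group = db_dir.owner(), db_dir.group()
        except (NotImplementedError, KeyError):
            # Not supported on this platform, or the id has no name entry.
            problems.append(f"could not determine owner of {db_dir}")
        else:
            if (owner, group) != (expected_owner, expected_group):
                problems.append(
                    f"owner is {owner}:{group}, "
                    f"expected {expected_owner}:{expected_group}"
                )
    return problems


if __name__ == "__main__":
    # Demonstrate on a throwaway directory instead of a real installation.
    with tempfile.TemporaryDirectory() as home:
        print(check_graph_db(home, expected_owner=None))
```

This sketch inspects only the top-level database folder, whereas the chown command above applies ownership recursively, so a clean result does not rule out ownership problems in nested files.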

If you still face any problems, please send us an email with the following information to help debug the problem:

  1. The version of Neo4j you are using
  2. The operating system you are using (Mac, Linux, or Windows)
  3. The software you are using to run and access Neo4j 
    1. Neo4j Docker image vs. Neo4j community edition installed to your system
    2. Neo4j Desktop vs. Web Browser vs. Cypher-Shell
  4. Path of the Neo4j installation, if installed locally

Great! Now you have your own copy of the current version of the Reactome data content in your instance of Neo4j, so let’s see how you can take advantage of it, either with direct queries to the graph database or using our GraphCore Java library.

Querying the Reactome Graph Database directly

The Neo4j browser offers a nice interface for submitting your own queries to the graph database. We recommend using this platform for your first interaction with the Reactome Graph Database to see how easy it is to use the Cypher query language.

Please refer to our extracting pathway participating molecules tutorial to introduce yourself to using Cypher to query the Reactome Graph Database.
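
Queries can also be sent from a script rather than the Neo4j browser. The following is a minimal sketch that prepares a Cypher query for the Neo4j 4.x HTTP transaction endpoint using only the Python standard library. Several things here are assumptions and not taken from this page: the Pathway label and the stId and displayName properties, the placeholder identifier, the default port 7474, and the credentials. Adjust them to your own installation.

```python
import base64
import json
import urllib.request

# Assumed data model: a Pathway label with stId and displayName properties.
QUERY = "MATCH (p:Pathway {stId: $stId}) RETURN p.displayName AS name"


def build_request(query, parameters, database="graph.db",
                  host="http://localhost:7474",
                  user="neo4j", password="neo4j"):
    """Build (but do not send) a POST request that runs one Cypher statement."""
    url = f"{host}/db/{database}/tx/commit"
    body = json.dumps(
        {"statements": [{"statement": query, "parameters": parameters}]}
    ).encode("utf-8")
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Accept": "application/json",
            "Authorization": f"Basic {token}",
        },
    )


def run(request):
    """Send the request to a running Neo4j instance and decode the JSON reply."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


if __name__ == "__main__":
    # "R-HSA-109581" is a placeholder stable identifier for illustration.
    request = build_request(QUERY, {"stId": "R-HSA-109581"})
    print(request.full_url)
```

Passing the identifier as a query parameter rather than formatting it into the query string avoids quoting problems. Calling run(request) requires Neo4j to be started with the Reactome database loaded.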

API

The API for the Reactome GraphCore Java library is available on our GitHub repository.

Resources

Tutorial: Extracting participating molecules using the Graph Database.

To learn more about our graph database, have a look at our relevant publication entitled Reactome graph database: Efficient access to complex pathway data.

Cite Us!