The Reactome Graph Database models the Reactome knowledgebase as an interconnected graph database.

More Info

At the cellular level, life is a network of molecular reactions. In Reactome, these processes are systematically described in molecular detail to generate an ordered network of molecular transformations (Fabregat et al. 2015). This amounts to millions of interconnected terms that naturally form a graph of biological knowledge. The Reactome Graph provides an intuitive way to retrieve data and to interpret and analyse pathway knowledge.

Retrieving, and especially analysing, such complex data becomes tedious when using relational databases. Queries across the pathway knowledgebase are composed of a number of expensive join operations, resulting in poor performance and a hard-to-maintain project. Because of their schema-based approach, relational databases are limited in how information is stored and are therefore difficult to adapt to new requirements. To overcome these problems, the Reactome database is imported into Neo4j, creating one large interconnected graph. Graph database technology is an effective tool for modelling highly connected data.

Storing Reactome data in this form has many benefits. No denormalisation is required, so data can be stored in its natural form. Nodes in the vicinity of a starting point can be traversed quickly, allowing the user not only to retrieve data but also to perform fast analysis of these neighbour networks. Thus, knowledge that was previously unavailable due to the limitations of relational data storage can now be retrieved.

To make it easy to access and benefit from the graph database, we have developed GraphCore, an open source library implemented in Java. This project uses Spring Data Neo4j, which provides automatic object graph mapping on top of Neo4j and integrates tightly with other parts of the Spring Framework used across the project.

Get Started


To run our graph database on your personal computer, please choose the installation option that suits your needs. Our team recommends Neo4j Desktop, but we can't compete with developers and their love for the terminal.

1. Docker

If you are comfortable working with Docker, you can use a Docker image that contains Neo4j and the Reactome graph database:

  1. Download and install Docker on your desktop.
  2. Pull a Docker image that contains Neo4j and the Reactome graph database.
  3. Run the Docker image:

docker run -p 7474:7474 -p 7687:7687 -e NEO4J_dbms_memory_heap_maxSize=8g reactome/graphdb:${VERSION}

  4. Go to localhost:7474 in your browser.

You now have a running Docker container with Neo4j and the Reactome graph database. You can write and submit your own queries using Cypher. Please refer to our extracting pathway participating molecules tutorial for an introduction to using Cypher to query the Reactome Graph Database.
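The `docker run` command above leaves `${VERSION}` for you to fill in. As a minimal sketch, the helper below assembles the same command for a tag you supply. `reactome_docker_cmd` and its arguments are illustrative names, not part of the Reactome image, and the tag you pass must be one that actually exists for `reactome/graphdb`. Port 7474 serves the Neo4j browser over HTTP, and 7687 is the Bolt port used by drivers and cypher-shell.

```shell
#!/bin/sh
# Sketch: print the `docker run` command from this guide for a chosen image tag.
# reactome_docker_cmd is a hypothetical helper name, not a Reactome or Docker tool.
reactome_docker_cmd() {
  tag="$1"          # image tag to substitute for ${VERSION}
  heap="${2:-8g}"   # heap size; defaults to the 8g used in the guide
  if [ -z "$tag" ]; then
    echo "usage: reactome_docker_cmd <tag> [heap]" >&2
    return 1
  fi
  printf 'docker run -p 7474:7474 -p 7687:7687 -e NEO4J_dbms_memory_heap_maxSize=%s reactome/graphdb:%s\n' \
    "$heap" "$tag"
}
```

Printing the command rather than running it lets you inspect it first. Once it looks right, it can be executed with `eval "$(reactome_docker_cmd <tag>)"`.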

2. Neo4j Desktop

If you would like to use Neo4j Desktop, we have created a dedicated page for it. Please see the instructions here.

3. Neo4j Community manual installation

If you have trouble using Neo4j 3.5.X with our data, we strongly recommend upgrading your Neo4j version to 4.X.X and following the instructions above.

Our Graph Database is available in our download data section. It is possible to use it in your local environment by following these steps:

  1. Download and install Neo4j V4.
    1. Untar/unzip the Neo4j tar/zip file.
  2. Download the Graph Database for the latest data release.
  3. Install the Graph Database (Mac/Linux users).
    1. Extract Reactome.graph.db after downloading.
    2. Move the graph.db folder to /path/to/neo4j/data/databases/
      1. If graph.db already exists there, remove it or rename it first.
    3. Rename the folder to graph.db.
    4. Configure Neo4j
      1. Edit the Neo4j configuration file at /path/to/neo4j/conf/neo4j.conf
      2. We recommend having all these settings:
        1. dbms.default_database=graph.db
        2. dbms.recovery.fail_on_missing_files=false
        3. unsupported.dbms.tx_log.fail_on_corrupted_log_files=false
    5. Start Neo4j: /path/to/neo4j/bin/neo4j start
  4. Install the Graph Database (Windows users).
    1. Please see the instructions here.
  5. If the standard procedure has been followed, the graph database should be accessible via the Neo4j browser on localhost. More instructions are available in the Neo4j operations tutorial, specifically the sections “file locations” and “restoring a backup”.
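The configuration step above can be scripted. The following is a minimal sketch, assuming a properties-style neo4j.conf that already exists. `set_conf` and `configure_reactome_v4` are hypothetical helper names, not Neo4j tools. Each setting replaces an existing assignment of the same key (or a commented-out `#key=` line), so running it twice does not create duplicates.

```shell
#!/bin/sh
# Sketch: set key=value in a properties-style file such as neo4j.conf.
# set_conf is a hypothetical helper name, not a Neo4j tool.
set_conf() {
  key="$1"; value="$2"; file="$3"
  tmp="$(mktemp)" || return 1
  # Keep every line that does not start with "key=" or "#key=".
  awk -v k="$key=" 'index($0, k) != 1 && index($0, "#" k) != 1' "$file" > "$tmp"
  printf '%s=%s\n' "$key" "$value" >> "$tmp"
  # Write back in place so the file keeps its owner and permissions.
  cat "$tmp" > "$file"
  rm -f "$tmp"
}

# Apply the three settings recommended in this guide to the given neo4j.conf.
configure_reactome_v4() {
  conf="$1"
  set_conf dbms.default_database graph.db "$conf"
  set_conf dbms.recovery.fail_on_missing_files false "$conf"
  set_conf unsupported.dbms.tx_log.fail_on_corrupted_log_files false "$conf"
}
```

A typical call would be `configure_reactome_v4 /path/to/neo4j/conf/neo4j.conf`, run before starting Neo4j.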

You can also restore a graph database dump file to Neo4j Community if you have Java 11 installed on your local machine.

  1. Download and install Neo4j V4.
    1. Untar/unzip the Neo4j tar/zip file.
  2. Download the Graph Database Dump file.
  3. Run /path/to/neo4j/bin/neo4j-admin load --force --from=/path/to/reactome.graphdb.dump --database=graph.db
  4. Start Neo4j: /path/to/neo4j/bin/neo4j start
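Since the dump restore depends on Java 11, it can help to check the installed version first. This is a minimal sketch. `java_major` and `installed_java_version` are illustrative names, and the parsing assumes the usual `version "x.y.z"` line that `java -version` prints to stderr.

```shell
#!/bin/sh
# Sketch: extract the major Java version from a version string such as
# "11.0.19" or the legacy "1.8.0_292" form. java_major is an illustrative name.
java_major() {
  v="$1"
  case "$v" in
    1.*) v="${v#1.}" ;;   # legacy scheme: 1.8.0_292 means Java 8
  esac
  # Keep only the leading digits (drops ".0.19", "_292", "-ea" and similar).
  echo "${v%%[!0-9]*}"
}

# Read the quoted version from `java -version`, which prints to stderr.
installed_java_version() {
  java -version 2>&1 | sed -n 's/.*version "\([^"]*\)".*/\1/p' | head -n 1
}
```

For example, `[ "$(java_major "$(installed_java_version)")" = "11" ] || echo "Java 11 not found"` reports when a different version (or none) is on the PATH.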

4. Using Neo4j V5

Reactome data dumps are in Neo4j version 4 format. To use them with Neo4j version 5, the Reactome database needs to be downloaded as a dump and converted. The following instructions are for a CLI installation of Neo4j (Linux):

  1. Download and install any 5.* version of Neo4j.
  2. Download the Reactome database dump.
  3. cd to the Neo4j installation directory and run all of the commands below from there.
  4. cp <PATH OF DOWNLOADED DUMP> reactome.dump
  5. ./bin/neo4j-admin database migrate --force-btree-indexes-to-range reactome
  6. There should now be a new directory under data/databases.
  7. Configure Neo4j:
    1. We recommend having all these settings in conf/neo4j.conf:
      1. initial.dbms.default_database=reactome
      2. db.recovery.fail_on_missing_files=false
      3. unsupported.dbms.tx_log.fail_on_corrupted_log_files=false
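The command steps above can be collected into one script. This is a sketch under stated assumptions, not the definitive procedure. `run` and `migrate_reactome_dump` are illustrative names. The `database load` line is an assumption that does not appear in the steps above: in Neo4j 5 a dump is normally loaded with `neo4j-admin database load`, which looks for `<database>.dump` under `--from-path`, and that would explain why the dump is first copied to `reactome.dump`. By default the script only prints the commands.

```shell
#!/bin/sh
# Sketch of the Neo4j 5 conversion steps. Run from the Neo4j installation directory.
# DRY_RUN=1 (the default here) only prints each command; set DRY_RUN=0 to execute.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

migrate_reactome_dump() {
  dump="$1"   # path of the downloaded Reactome dump
  run cp "$dump" reactome.dump
  # ASSUMPTION: this load step is not in the original instructions.
  run ./bin/neo4j-admin database load --from-path=. reactome
  run ./bin/neo4j-admin database migrate --force-btree-indexes-to-range reactome
}
```

Calling `migrate_reactome_dump /path/to/downloaded.dump` with the default settings lists the three commands so they can be reviewed before being run with `DRY_RUN=0`.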


Not able to access graph.db on Mac/Linux

Database "`graph.db`" is unavailable, its status is "offline."

There may be a few reasons for being unable to access the graph database. Some areas you can check include:

  1. Ensure the "graph.db" folder is accessible under your Neo4j installation folder. It should be located at "/path/to/neo4j/data/databases/graph.db".
  2. Ensure the user and group owner are recursively set to "neo4j:adm" using the command "chown -R neo4j:adm /path/to/neo4j".
  3. Depending on how you are using Neo4j, the database may need to be started using the "cypher-shell" or a configuration value may need to be changed to allow access to the database.
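The first check in the list can be automated. The helper below is a minimal sketch. `check_graph_db` is an illustrative name, and it only verifies that the folder exists and is readable by the current user. It does not inspect the neo4j:adm ownership from the second check.

```shell
#!/bin/sh
# Sketch: confirm the graph.db folder is where Neo4j expects it.
# check_graph_db is a hypothetical helper; pass your Neo4j installation path.
check_graph_db() {
  neo4j_home="$1"
  db_dir="$neo4j_home/data/databases/graph.db"
  if [ ! -d "$db_dir" ]; then
    echo "missing: $db_dir"
    return 1
  fi
  if [ ! -r "$db_dir" ] || [ ! -x "$db_dir" ]; then
    echo "not accessible by $(id -un): $db_dir"
    return 1
  fi
  echo "ok: $db_dir"
}
```

For example, `check_graph_db /path/to/neo4j` prints the path it looked for. If it reports ok but the database is still offline, ownership and configuration are the next things to check.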

If you still face any problems, please send us an email with the following information to help debug the problem:

  1. The version of Neo4j you are using
  2. The operating system you are using (Mac, Linux, or Windows)
  3. The software you are using to run and access Neo4j 
    1. Neo4j Docker image vs. Neo4j community edition installed to your system
    2. Neo4j Desktop vs. Web Browser vs. Cypher-Shell
  4. Path of the Neo4j installation, if installed locally

Great! Now you have your own copy of the current version of the Reactome data content in your instance of Neo4j, so let’s see how you can take advantage of it, either with direct queries to the graph database or using our GraphCore Java library.

Directly querying the Reactome Graph Database

The Neo4j browser offers a nice interface for submitting your own queries to the graph database. We recommend using this platform for your first interaction with the Reactome Graph Database to see how easy it is to use the Cypher query language.

Please refer to our extracting pathway participating molecules tutorial to introduce yourself to using Cypher to query the Reactome Graph Database.


The API for the Reactome GraphCore Java library is available on our GitHub repository.


Tutorial: Extracting participating molecules using the Graph Database.

To learn more about our graph database, have a look at our relevant publication entitled Reactome graph database: Efficient access to complex pathway data.

Cite Us!