EnteroBase Internal Structure (Website)

Persistent data in EnteroBase is split across several PostgreSQL databases. The website databases store user information and strain-centric information. NServ stores information pertaining to typing schemes, such as alleles, sequences and sequence types (STs). For more information about NServ, see EnteroBase Database Structure (NServ).

System Database Structure

The system database (Figure 1) stores user information, user preferences, and uploaded reads (before they are processed). Users' sharing settings with their ‘buddies’ are stored across UserBuddies (which defines who is a buddy) and BuddyPermissionTags (which defines buddy permissions).

../_images/system_database.png

Figure 1: System database schematic
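
The split between UserBuddies and BuddyPermissionTags can be illustrated with a small sketch. Only the two table names come from the description above. Every column name, the users table, and the permission values are assumptions made for illustration, and SQLite stands in for PostgreSQL so that the example runs without a server.

```python
import sqlite3

# Hypothetical columns throughout: only the table names UserBuddies and
# BuddyPermissionTags are taken from the documentation.
SCHEMA = """
CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT NOT NULL);
CREATE TABLE user_buddies (
    id INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL REFERENCES users(id),   -- owner of the data
    buddy_id INTEGER NOT NULL REFERENCES users(id)   -- user it is shared with
);
CREATE TABLE buddy_permission_tags (
    id INTEGER PRIMARY KEY,
    buddy_link_id INTEGER NOT NULL REFERENCES user_buddies(id),
    species TEXT NOT NULL,   -- assumed: permissions are granted per database
    field TEXT NOT NULL      -- assumed: name of the permission being granted
);
"""


def buddy_permissions(conn, owner, buddy):
    """Return the (species, field) permissions that owner has granted to buddy."""
    return conn.execute(
        """
        SELECT t.species, t.field
        FROM buddy_permission_tags AS t
        JOIN user_buddies AS b ON b.id = t.buddy_link_id
        JOIN users AS o ON o.id = b.user_id
        JOIN users AS g ON g.id = b.buddy_id
        WHERE o.username = ? AND g.username = ?
        ORDER BY t.species, t.field
        """,
        (owner, buddy),
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.executemany(
    "INSERT INTO users (id, username) VALUES (?, ?)", [(1, "alice"), (2, "bob")]
)
# One row says who is a buddy of whom ...
conn.execute("INSERT INTO user_buddies (id, user_id, buddy_id) VALUES (1, 1, 2)")
# ... and separate rows say what that buddy is allowed to do.
conn.executemany(
    "INSERT INTO buddy_permission_tags (buddy_link_id, species, field) VALUES (?, ?, ?)",
    [(1, "senterica", "view_strains"), (1, "senterica", "edit_metadata")],
)
print(buddy_permissions(conn, "alice", "bob"))
```

Keeping the buddy relationship and the permissions in separate tables means that one sharing link can carry any number of permissions, and the relationship is directional: in this sketch bob has been granted access by alice, but not the reverse.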

Species Database Structure

Each species in EnteroBase has its own database, and all species databases share an identical table structure (Figure 2A). Fields in certain tables, namely Strains, differ slightly depending on the metadata required for each species. For instance, provision is made for ribotype in the Clostridioides database (Figure 2B).

../_images/species_database.png

Figure 2: Species database schematic. A) Generic table structure for each species database. B) Strain table fields specific to a given species.
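
One way to picture "identical structure, species-specific Strains fields" is a shared field list plus per-database extras. The sketch below is illustrative only: the common field names and the database key "clostridioides" are assumptions, and only the ribotype field for Clostridioides is taken from the text above.

```python
# Hypothetical common metadata fields shared by every species database.
COMMON_STRAIN_FIELDS = ("strain", "country", "collection_year")

# Extra Strains fields per database. Only "ribotype" comes from the
# documentation; the key "clostridioides" is an assumed database name.
SPECIES_STRAIN_FIELDS = {
    "clostridioides": ("ribotype",),
}


def strain_fields(database):
    """Return the Strains fields for a database: common fields plus any extras."""
    return COMMON_STRAIN_FIELDS + SPECIES_STRAIN_FIELDS.get(database, ())


print(strain_fields("clostridioides"))
print(strain_fields("senterica"))
```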

Versioning in EnteroBase

EnteroBase maintains an internal log of all changes made to database records, particularly for data pertaining to strains, sequenced read traces, assemblies and genotyping. When one of these records is modified, the current record is stamped with a version number, the time of the modification and the user who made the change. The previous state of the row is saved verbatim in an archive table. This provides a precise audit log of all changes in the database (Figure 2A).
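
The archive-on-update pattern described above can be sketched as follows. This is a minimal illustration, not EnteroBase's implementation: the table and column names (strains, strains_archive, version, lastmodified, lastmodifiedby) are assumptions, and SQLite is used in place of PostgreSQL so that the example is self-contained.

```python
import sqlite3
from datetime import datetime, timezone

conn = sqlite3.connect(":memory:")
# Hypothetical tables: the archive table mirrors the live table column for column.
conn.executescript("""
CREATE TABLE strains (
    id INTEGER PRIMARY KEY,
    strain TEXT,
    version INTEGER NOT NULL DEFAULT 1,
    lastmodified TEXT,
    lastmodifiedby TEXT
);
CREATE TABLE strains_archive (
    id INTEGER,
    strain TEXT,
    version INTEGER,
    lastmodified TEXT,
    lastmodifiedby TEXT
);
""")


def update_strain(conn, record_id, new_name, user):
    """Archive the current row verbatim, then stamp the live row with a new version."""
    with conn:  # commit both statements together, or roll both back on error
        conn.execute(
            "INSERT INTO strains_archive SELECT * FROM strains WHERE id = ?",
            (record_id,),
        )
        conn.execute(
            "UPDATE strains SET strain = ?, version = version + 1, "
            "lastmodified = ?, lastmodifiedby = ? WHERE id = ?",
            (new_name, datetime.now(timezone.utc).isoformat(), user, record_id),
        )


conn.execute(
    "INSERT INTO strains (id, strain, lastmodifiedby) VALUES (1, 'example', 'alice')"
)
update_strain(conn, 1, "example-renamed", "bob")

current = conn.execute(
    "SELECT strain, version, lastmodifiedby FROM strains WHERE id = 1"
).fetchone()
archived = conn.execute(
    "SELECT strain, version, lastmodifiedby FROM strains_archive"
).fetchall()
print(current)
print(archived)
```

Running both statements in one transaction matters for the audit trail: if the update fails, no orphan archive row is left behind, and if the archive insert fails, the live row is not changed.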

Public API Structure

The API is implemented through the Flask web framework. A live demo of the API is available at http://enterobase.warwick.ac.uk/api/v2.0/swagger-ui . There are three generic classes that each specify how to handle requests for the following:

  1. A single record (/api/v2.0/{database}/schemes/{barcode}),
  2. Multiple records (/api/v2.0/{database}/schemes/),
  3. Requests that have to be fetched internally from NServ (e.g. sequence types).
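
The first two URL patterns above can be assembled with a small helper. This is a sketch for illustration: the function name endpoint_url and the barcode value are hypothetical, while the API root and the senterica/schemes path are taken from this page.

```python
from urllib.parse import quote

API_ROOT = "http://enterobase.warwick.ac.uk/api/v2.0"


def endpoint_url(database, endpoint, barcode=None):
    """Build a URL for multiple records, or for a single record if a barcode is given."""
    parts = [API_ROOT, quote(database), quote(endpoint)]
    if barcode is not None:
        parts.append(quote(barcode))
    return "/".join(parts)


# Multiple records
print(endpoint_url("senterica", "schemes"))
# A single record ("EXAMPLE_BARCODE" is a placeholder, not a real barcode)
print(endpoint_url("senterica", "schemes", "EXAMPLE_BARCODE"))
```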

Each API endpoint, e.g. ‘Schemes’, which is accessible through URLs like http://enterobase.warwick.ac.uk/api/v2.0/senterica/schemes, maps to a Resource class that defines specific behaviours for processing different HTTP requests (GET, POST, PUT etc.) (Figure 3). Each Resource class in turn has a Schema class that defines validation rules for API parameters, rules for mapping values to the correct database field, and how to represent the final output (Figure 4).

../_images/api_request.png

Figure 3: Basic interaction of API classes, using ‘Strains’ as an example.

../_images/api_schema.png

Figure 4: Attributes of the schema for different API endpoints. These usually map to internal database fields.
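
The division of labour between a Resource class and its Schema class can be sketched in plain Python. This is not the EnteroBase code and deliberately avoids Flask so that it runs on its own: the class names, the parameter-to-field mapping, the status-code handling and the sample row are all assumptions for illustration.

```python
class SchemesSchema:
    """Validates API parameters and maps them to database fields and back."""

    # Hypothetical mapping: API parameter name -> internal database field.
    FIELD_MAP = {"scheme_name": "name", "barcode": "barcode"}

    def load(self, params):
        """Validate incoming parameters and translate them to database fields."""
        unknown = set(params) - set(self.FIELD_MAP)
        if unknown:
            raise ValueError(f"Unknown parameters: {sorted(unknown)}")
        return {self.FIELD_MAP[key]: value for key, value in params.items()}

    def dump(self, row):
        """Represent a database row using API names, dropping internal fields."""
        reverse = {db: api for api, db in self.FIELD_MAP.items()}
        return {reverse[key]: value for key, value in row.items() if key in reverse}


class SchemesResource:
    """Defines how each HTTP method is handled for the endpoint."""

    schema = SchemesSchema()
    methods = ("GET", "POST", "PUT")

    def __init__(self, rows):
        self.rows = rows  # stand-in for a database query

    def get(self, params):
        filters = self.schema.load(params)
        matches = [
            row for row in self.rows
            if all(row.get(field) == value for field, value in filters.items())
        ]
        return [self.schema.dump(row) for row in matches]

    def dispatch(self, method, params):
        """Route a request to the handler for its HTTP method."""
        handler = getattr(self, method.lower(), None) if method.upper() in self.methods else None
        if handler is None:
            return 405, {"error": f"{method} not supported"}
        try:
            return 200, handler(params)
        except ValueError as exc:
            return 400, {"error": str(exc)}


# Sample row with made-up values; internal_id is never exposed by the schema.
rows = [{"name": "example_scheme", "barcode": "EXAMPLE_1", "internal_id": 7}]
resource = SchemesResource(rows)
print(resource.dispatch("GET", {"scheme_name": "example_scheme"}))
print(resource.dispatch("GET", {"bogus": 1}))
print(resource.dispatch("POST", {}))
```

Keeping validation and field mapping in the schema means the resource only decides what each HTTP method does, so a new endpoint mostly needs a new field map rather than new request-handling logic.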