Setting Up a New Database Structure

The active databases are specified in the ACTIVE_DATABASES variable. This is a dictionary in which each database name is a key that points to an array containing the following:

  • The name of the genus - this is important as it is used to retrieve the appropriate records from the SRA and to check whether assemblies belong to the correct taxon
  • The URL of the database
  • A boolean showing whether the database is public (True) or private (False)
  • A number ?
  • The three letter code identifying the database


'senterica': [
              'postgresql://%s:%s@%s/senterica'%(USER, PASS, POSTGRES_SERVER),
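As a rough illustration of how such an entry could be read, the sketch below unpacks one array into named fields. It assumes the array follows the order of the list above, which the real configuration may not do. The database name, genus, URL, number and code are all hypothetical placeholders, and the field names of DatabaseEntry are invented for readability.

```python
from collections import namedtuple

# Illustrative field names; the order follows the bullet list above (assumption).
DatabaseEntry = namedtuple(
    "DatabaseEntry", ["genus", "url", "public", "number", "code"]
)

# Hypothetical entry with placeholder values, not a real configuration.
ACTIVE_DATABASES = {
    "exampledb": [
        "Examplegenus",                                    # genus name
        "postgresql://user:password@localhost/exampledb",  # database URL
        True,                                              # public (True) or private (False)
        0,                                                 # number; its meaning is not documented here
        "EXA",                                             # three-letter code
    ],
}


def describe(name):
    """Return a one-line summary of the named database entry."""
    entry = DatabaseEntry(*ACTIVE_DATABASES[name])
    visibility = "public" if entry.public else "private"
    return f"{name}: genus={entry.genus}, {visibility}, code={entry.code}"


print(describe("exampledb"))
```

Unpacking into a named tuple avoids relying on positional indexes such as entry[2] elsewhere in the code.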

Adding a Column to the strains Table

  • Add the column to the actual strains and strains_archive tables in the database
  • Add the column to the Strains and StrainsArchive classes in the SQLAlchemy models located at entero/databases/<database_name>/
class Strains
class StrainsArchive
  • Add the column description to the data_param table. If it corresponds to metadata in the SRA, fill in the sra_field with the appropriate JSON path, e.g. Sample,Metadata,Species

You can retrospectively add data to the new column using the update_sra_fields script.

Example: Adding Geographic Details

tabname name sra_field order nested_order label datatype groupname
strains geographic_details Sample,Metadata,geography details, 5,9,text,Location
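The database-side steps for this example can be sketched roughly as follows. This is only an illustration: an in-memory SQLite database stands in for the PostgreSQL database, the table definitions are simplified and hypothetical, and only the data_param columns whose values are unambiguous in the row above are filled in.

```python
import sqlite3

# In-memory SQLite stands in for the real PostgreSQL database (assumption).
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Simplified, hypothetical versions of the existing tables.
cur.execute("CREATE TABLE strains (id INTEGER PRIMARY KEY, strain TEXT)")
cur.execute("CREATE TABLE strains_archive (id INTEGER PRIMARY KEY, strain TEXT)")
cur.execute(
    "CREATE TABLE data_param (tabname TEXT, name TEXT, datatype TEXT, groupname TEXT)"
)

# Add the new column to both the strains and strains_archive tables.
for table in ("strains", "strains_archive"):
    cur.execute(f"ALTER TABLE {table} ADD COLUMN geographic_details TEXT")

# Describe the new column in data_param (subset of columns only).
cur.execute(
    "INSERT INTO data_param (tabname, name, datatype, groupname) VALUES (?, ?, ?, ?)",
    ("strains", "geographic_details", "text", "Location"),
)
conn.commit()


def column_names(table):
    """Return the column names of a table using SQLite's table_info pragma."""
    return [row[1] for row in cur.execute(f"PRAGMA table_info({table})")]


print(column_names("strains"))
print(column_names("strains_archive"))
```

On the real database the same ALTER TABLE statements would be run against PostgreSQL, and the data_param row would also carry the sra_field, order, nested_order and label values.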

class Strains(Base,mod.Strains):
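A minimal sketch of the model-side change is shown below. It is not the project's actual model code: StrainsMixin is a hypothetical stand-in for the shared mod.Strains base class, the id and strain columns are invented, and the Text column type is assumed from the "text" datatype in the data_param row. It requires SQLAlchemy 1.4 or later.

```python
from sqlalchemy import Column, Integer, Text
from sqlalchemy.orm import declarative_base

Base = declarative_base()


class StrainsMixin:
    """Hypothetical stand-in for the shared mod.Strains base class."""
    id = Column(Integer, primary_key=True)
    strain = Column(Text)


class Strains(Base, StrainsMixin):
    __tablename__ = "strains"
    # New column; the name must match the column added to the strains table.
    geographic_details = Column(Text)


class StrainsArchive(Base, StrainsMixin):
    __tablename__ = "strains_archive"
    # The archive model needs the same column as Strains.
    geographic_details = Column(Text)


print(Strains.__table__.c.keys())
```

Declaring the column on both classes keeps the archive model in step with the main one, which mirrors the requirement to alter both tables in the database.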