Setting Up a New Database Structure
The active databases are specified in the ACTIVE_DATABASES variable in config.py. This variable is a dictionary in which each key is the name of a database and each value is an array containing the following, in order:
- The name of the genus. This is important because it is used to retrieve the appropriate records from the SRA and to check whether assemblies belong to the correct taxon
- The URL of the database
- A boolean showing whether the database is public (True) or private (False)
- A number ?
- The three-letter code identifying the database
e.g.
'senterica': [
    'Salmonella',
    'postgresql://%s:%s@%s/senterica' % (USER, PASS, POSTGRES_SERVER),
    True,
    1,
    'SAL'
]
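As a minimal sketch of how such an entry might be read, the snippet below defines a dictionary of the same shape and unpacks one entry. The credential values, the `describe_database` helper, and the names given to the unpacked fields are placeholders chosen for illustration; the purpose of the fourth element is not documented here, so it is simply called `number`.

```python
# Placeholder credentials; in config.py these would be the real settings.
USER = "user"
PASS = "secret"
POSTGRES_SERVER = "localhost"

ACTIVE_DATABASES = {
    "senterica": [
        "Salmonella",
        "postgresql://%s:%s@%s/senterica" % (USER, PASS, POSTGRES_SERVER),
        True,
        1,
        "SAL",
    ],
}


def describe_database(name):
    """Return the entry for `name` as a dict with illustrative field names."""
    genus, url, is_public, number, code = ACTIVE_DATABASES[name]
    return {
        "genus": genus,
        "url": url,
        "is_public": is_public,
        "number": number,  # meaning not documented in this section
        "code": code,
    }


info = describe_database("senterica")
print(info["genus"], info["code"])
```

Unpacking by position like this fails loudly with a ValueError if an entry has the wrong number of elements, which can help catch a malformed entry early.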
Adding a Column to the strains Table
- Add the column to the actual strains and strains_archive tables in the database
- Add the column to the Strains and StrainsArchive classes in the SQLAlchemy models located at entero/databases/<database_name>/models.py
class Strains:
    # base classes and existing columns omitted
    new_column = Column("new_column", String(100))

class StrainsArchive:
    # base classes and existing columns omitted
    new_column = Column("new_column", String(100))
- Add the column description to the data_param table. If the column corresponds to metadata in the SRA, fill in sra_field with the appropriate JSON path, e.g. Sample,Metadata,Species
You can retrospectively add data to the column using the script update_sra_fields.
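To illustrate what a comma-separated sra_field such as Sample,Metadata,Species might refer to, here is a small sketch that walks a nested dictionary one key at a time. The record layout and the `resolve_sra_field` helper are hypothetical; how update_sra_fields actually resolves the path is not shown in this document.

```python
def resolve_sra_field(record, sra_field):
    """Follow a comma-separated key path through nested dicts.

    Returns None if any key along the path is missing.
    """
    current = record
    for key in sra_field.split(","):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


# Hypothetical SRA record, for illustration only.
record = {"Sample": {"Metadata": {"Species": "Salmonella enterica"}}}

species = resolve_sra_field(record, "Sample,Metadata,Species")
missing = resolve_sra_field(record, "Sample,Metadata,Country")
print(species, missing)
```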
Example: Adding Geographic Details
tabname name sra_field order nested_orderlabel datatype groupname
strains geographic_details Sample,Metadata,geography details, 5,9,text,Location
class Strains(Base, mod.Strains):
    geographic_details = Column("geographic_details", String(100))
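The first step, adding the column to both tables, is not spelled out in this example. A sketch of the matching statements follows. It runs against an in-memory SQLite database purely as a stand-in so that the snippet is self-contained; the real databases are PostgreSQL. The VARCHAR(100) type is assumed from the String(100) in the model, and the `add_column_statements` helper is hypothetical.

```python
import sqlite3

TABLES = ("strains", "strains_archive")


def add_column_statements(column, sql_type):
    """Build one ALTER TABLE statement per table that needs the column."""
    return [
        "ALTER TABLE %s ADD COLUMN %s %s" % (table, column, sql_type)
        for table in TABLES
    ]


statements = add_column_statements("geographic_details", "VARCHAR(100)")

# Stand-in database with minimal versions of the two tables.
conn = sqlite3.connect(":memory:")
for table in TABLES:
    conn.execute("CREATE TABLE %s (id INTEGER PRIMARY KEY)" % table)
for statement in statements:
    conn.execute(statement)

# Collect column names to confirm the new column exists in both tables.
columns = {}
for table in TABLES:
    rows = conn.execute("PRAGMA table_info(%s)" % table).fetchall()
    columns[table] = [row[1] for row in rows]
conn.close()

print(statements)
```

Keeping the table names in one tuple makes it harder to update strains and forget strains_archive.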