SNP Projects

Required Schemes

To carry out SNP analysis, two schemes are required in the schemes table of the species database. Data is then stored in the assembly_lookup table to prevent duplication of jobs.

  • snp_calls - Allows the gff file containing the SNPs to be stored for each assembly/reference combination
    • description snp_calls
    • name SNP Calls
    • param {"display":"none"}
  • ref_masker - Allows the gff file containing the repeat regions for an assembly to be stored
    • description ref_masker
    • name Repeat Finder
    • param {"display":"none"}
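
As an illustration of the two scheme entries listed above, the following minimal sketch holds them as Python dictionaries and reports which ones are absent from a set of rows. The names REQUIRED_SCHEMES and missing_schemes are hypothetical, and the dictionaries are only an assumed stand-in for rows of the schemes table, not the actual database layer.

```python
import json

# Hypothetical in-memory stand-in for the two rows described above;
# the real rows live in the schemes table of the species database.
REQUIRED_SCHEMES = [
    {"description": "snp_calls", "name": "SNP Calls", "param": '{"display":"none"}'},
    {"description": "ref_masker", "name": "Repeat Finder", "param": '{"display":"none"}'},
]


def missing_schemes(existing_rows):
    """Return descriptions of required schemes that are absent from existing_rows."""
    present = {row["description"] for row in existing_rows}
    return [s["description"] for s in REQUIRED_SCHEMES if s["description"] not in present]


# The param value has to parse as JSON, which is why straight quotes matter.
for scheme in REQUIRED_SCHEMES:
    assert json.loads(scheme["param"]) == {"display": "none"}
```

Calling missing_schemes with the rows already present for a species would list whichever of the two schemes still needs to be added.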
{
 "microreact_data":
    {
      "url":"http:\/\/microreact.org\/project\/r1GfN0ufg",
      "short_id":"r1GfN0ufg"
    },
 "matrix":"\/share_space\/interact\/outputs\/1794\/1794930\/test_mr_snp_1.aln.matrix.gz",
 "parent_workspace":"2698",
 "snp_count":
    {
       "77580":1297,
       "69837":2771,
       ......,
       "7021":"ESC_FA7021AA_AS"
    },
 "processing_matrix":"true",
 "processing":"true",
 "ref_id":57021,
 "profiles":"\/share_space\/snps\/ecoli\/55\/2699\/profiles.json",
 "snp_fasta":"\/share_space\/snps\/ecoli\/55\/2699\/snps.fasta",
 "raxml_tree":"\/share_space\/interact\/outputs\/1794\/1794931\/test_mr_snp_1.rooted.nwk",
 "assembly_ids":[69396,81488,25374,19187,77580,77672,22907,26316,69835,69837],
 "complete":"true",
 "data_file":"/path_to_data/33.json"
}
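
A record like the one above could be read as in the sketch below. It uses a trimmed copy of the example (the elided entries are left out so that it parses), and the helper name flag is hypothetical. Two assumptions are made from the example alone: status fields such as complete hold the string "true" rather than a JSON boolean, and the keys of snp_count are assembly IDs written as strings.

```python
import json

# Trimmed copy of the record above; elided entries are omitted so it parses.
raw = r'''
{
  "processing": "true",
  "complete": "true",
  "ref_id": 57021,
  "snp_fasta": "\/share_space\/snps\/ecoli\/55\/2699\/snps.fasta",
  "snp_count": {"77580": 1297, "69837": 2771},
  "assembly_ids": [77580, 69837]
}
'''

record = json.loads(raw)


def flag(record, key):
    """Return True when a status field holds the string "true" (assumed encoding)."""
    return record.get(key) == "true"


# snp_count keys are strings, so convert each assembly ID before the lookup.
counts = {aid: record["snp_count"].get(str(aid)) for aid in record["assembly_ids"]}

# The JSON escape \/ decodes to a plain forward slash.
snp_fasta_path = record["snp_fasta"]
```

Comparing against the string "true" avoids the pitfall that any non-empty string, including "false", is truthy in Python.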
  • The file referenced by data_file contains JSON in the following format
    • isolate A list of all metadata, in the same format as MS Trees. The ID key refers to the node in the nwk tree, which is the assembly barcode
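
The link between the isolate list and the tree nodes can be sketched as a lookup keyed on ID. This assumes each entry in isolate is an object carrying an ID key; the Name field, its value, and the helper metadata_for are hypothetical and used only for illustration. The barcode is the one that appears in the example record.

```python
import json

# Hypothetical data_file content; only the "isolate" key and the "ID" field
# come from the description above, the "Name" field is made up.
data = json.loads(
    '{"isolate": [{"ID": "ESC_FA7021AA_AS", "Name": "example_strain"}]}'
)

# Index the metadata by ID so a tree node label (assembly barcode) maps to its entry.
by_node = {entry["ID"]: entry for entry in data["isolate"]}


def metadata_for(node_label):
    """Return the metadata entry for a tree node label, or None if there is none."""
    return by_node.get(node_label)
```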