SNP Projects

SNPs are called using assembled genomes.

Creating an SNP Project

https://bitbucket.org/repo/Xyayxn/images/823204470-snp_creation.png

Like MS Trees, SNP Projects can be created either as stand alone projects or as part of a Workspace from the main search page, but you need to be logged in.

Stand Alone Project

To create a stand alone project, make sure all the strains you want are in the table. You can copy and paste strains between searches using Workspace -> Copy Selected and Workspace -> Paste Strains (1). Click the SNP icon (2) and a dialog will appear in which you can add a suitable name (3) and pick a reference genome from a drop-down (4). It is best to pick a complete genome, as these are usually a single contig and well annotated. You can also create an SNP project from a subset of the strains in the table by selecting the strains you want and checking ‘Selected Strains Only’ in the dialog. Once you are ready, press submit.

Project Linked to a Workspace

First create a Workspace. Once it is created, an ‘Analysis’ toolbar should appear in the main menu. Choose ‘Create SNP Project’ from the sub-menu items and a dialog box should appear as above. From here the process is exactly the same as for a stand alone project (follow the instructions above). The only difference is that the project will be connected to the Workspace.

Loading A Project

SNP projects usually take a while to run. You can follow the progress of the SNP calling with ‘Show My Jobs’. The four jobs that need to complete for an SNP Project are:

  • refMasker Finds repeat regions in the reference, which are ignored for SNP calls
  • refMapper Maps the contigs of all the query strains against the reference and works out SNPs
  • refMapperMatrix Combines all the SNPs and filters out those that are in repeat regions or in regions that are missing in 10% or more of the query strains
  • matrix_phylogeny Uses the SNP profiles to create a RAxML tree
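
The filtering performed by refMapperMatrix can be illustrated with a minimal sketch. This is not EnteroBase's implementation: the function name `filter_snp_sites`, the data layout (a dict from reference position to per-strain base calls, with `None` marking a missing region) and the representation of repeats as inclusive intervals are all assumptions made for the example. Only the two filtering rules (repeat regions, and sites missing in 10% or more of the query strains) come from the description above.

```python
def filter_snp_sites(calls, repeat_regions, n_strains, max_missing=0.10):
    """Keep SNP positions outside repeats and present in enough strains.

    calls: {position: {strain: base or None}}  (hypothetical layout)
    repeat_regions: list of (start, end) inclusive intervals on the reference
    n_strains: total number of query strains in the project
    """
    kept = {}
    for pos, by_strain in calls.items():
        # Drop sites that fall inside a masked repeat region.
        if any(start <= pos <= end for start, end in repeat_regions):
            continue
        # A strain is missing if its call is None or it has no entry at all.
        missing = sum(1 for base in by_strain.values() if base is None)
        missing += n_strains - len(by_strain)
        # Drop sites missing in 10% or more of the query strains.
        if missing / n_strains >= max_missing:
            continue
        kept[pos] = by_strain
    return kept


# Illustrative input with ten made-up strains s0..s9.
strains = [f"s{i}" for i in range(10)]
calls = {
    50: {s: "A" for s in strains},                            # inside a repeat
    120: {s: ("G" if s == "s0" else "A") for s in strains},   # passes both filters
    300: {**{s: "T" for s in strains}, "s3": None},           # missing in 1 of 10
}
kept = filter_snp_sites(calls, repeat_regions=[(40, 60)], n_strains=10)
print(sorted(kept))
```

In this example only position 120 survives: position 50 lies in the masked interval and position 300 is missing in exactly 10% of the strains.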

Once all four jobs are complete your project can be loaded. You will be informed if you try to load a project that is not yet complete.

Stand Alone Projects

  • Click the load Workspace panel on the top right of the database home page
  • On the main search page, use the menu Workspace -> load
  • Go directly to the tree link e.g. enterobase.warwick.ac.uk/phylo_tree?tree_id=xxxx. If the project is not public and you are not logged in, you will be asked to log in

Projects Connected to a Workspace

  • First load the Workspace and then choose Analysis -> load SNP Project, or click on the Project in the summary dialog
  • Go directly to the link e.g. enterobase.warwick.ac.uk/phylo_tree?tree_id=xxxx

Visualizing SNPs

When you load an SNP project, two windows should open (be sure that your browser allows popups from this site). [One of the windows](EnteroBase%20SNPs%20JBrowse) is an instance of [JBrowse], which allows you to visualize the SNPs with respect to the reference. The reference will be annotated by the [prokka_annotation] pipeline plus any MLST schemes that have been called and, if it is a complete genome, the [GenBank annotation] as well. Each of the strains in the project will have a track (left hand panel) showing its SNPs. In the future it is hoped to display where the contigs of each strain mapped to the reference and which regions were masked for repeats.

Working with the Trees

General Manipulations

  • Moving the tree Dragging outside of the tree will move the tree in the direction of the drag.
  • Resizing the tree Use the mouse wheel to zoom in or out, with the zoom centred on the current mouse position. You can also zoom in and out using the magnifying glass icons on the left panel.
  • Selecting nodes Nodes can be selected by pressing shift and dragging the mouse. All nodes in the rectangle formed will become selected, as indicated by the connected branches turning red. Alternatively, branches can be clicked (in selection mode) and the branch and all its children will become selected. Further selections are added to the current selection, and all selections can be cleared with the ‘Clear Selection’ button at the bottom of the left hand panel.
  • Saving Layout The layout of the tree

Branches

  • X scale This slider will either expand or collapse the tree with respect to the X axis
  • Log Scale Branch lengths will be scaled to the log of their actual lengths
  • Max X Distance Sets the maximum branch length (in native units). Type the value in the input and press go. Branches longer than the maximum will be shortened and appear dotted. This is useful if some branches are disproportionately longer than all the others e.g. an outlier
  • Y Scale This slider will either expand or collapse the tree with respect to the Y axis
  • Branch labels Check this box if you need the lengths of the branches displayed
  • Font Size This slider adjusts the font size of the branch labels
  • Branch Thickness This slider adjusts the thickness of the branches
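
To make the effect of ‘Log Scale’ and ‘Max X Distance’ concrete, here is a small sketch of how a drawn branch length could be derived from the actual one. It is an illustration only and does not reflect the viewer's actual code: the function name `display_length`, the use of log(1 + x) (chosen here so that zero-length branches stay at zero) and the order in which the two options are applied are all assumptions.

```python
import math


def display_length(length, max_length=None, log_scale=False):
    """Return (drawn_length, dotted) for a single branch.

    length: actual branch length in native units
    max_length: optional cap, as set by 'Max X Distance'
    log_scale: whether 'Log Scale' is switched on
    """
    dotted = False
    # Branches longer than the maximum are shortened and drawn dotted.
    if max_length is not None and length > max_length:
        length = max_length
        dotted = True
    # Assumption: log(1 + x) rather than log(x), so a zero length maps to zero.
    if log_scale:
        length = math.log(1.0 + length)
    return length, dotted


print(display_length(250.0, max_length=100.0))  # capped and dotted
print(display_length(5.0, max_length=100.0))    # unchanged
```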

Leaves

  • Markers Check box which controls whether the markers on each leaf are displayed
  • Marker Size Adjusts the size of the markers
  • Aligned Specifies whether the markers and leaf labels should be right-aligned
  • Leaf Labels Specifies whether the leaf labels should be shown
  • Font Size This slider adjusts the size of the leaf text
  • Label Text Controls whether the leaf text should be the strain name or the category being displayed

Layout

  • Horizontal/Radial Specifies the layout type of the tree
  • Create Sub tree If you select a portion of the tree (by clicking a branch) and then create a subtree, only that portion of the tree will be displayed. This is useful if you have a branch with many nodes that are all of similar length and cannot be distinguished
  • Show Whole Tree This will restore the whole tree if you are viewing a subtree

Click Mode

Normally when you click a branch, the branch and all its descendants will become selected. However, you can change the click mode to ‘Swap Branches’, which will swap the order of the two child branches. In ‘Collapse Branches’ mode, clicking a branch will collapse it. If leaf markers are displayed, the marker size will be log proportional to the number of strains in the collapsed branch and the marker will become a pie chart showing the proportion of each value. You will also get the option to name the branch. Clicking on a collapsed branch will expand it again.
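
The marker drawn for a collapsed branch can be pictured with the following sketch, which computes a size that grows with the log of the number of strains and the pie-chart fractions for the displayed category. The function name `collapsed_marker`, the `base_size` parameter, the exact scaling formula and the sample category values are assumptions for illustration; the viewer's real formula is not documented here.

```python
import math
from collections import Counter


def collapsed_marker(values, base_size=4.0):
    """Return (marker_size, pie_fractions) for a collapsed branch.

    values: the displayed category value of each strain in the branch
    base_size: size used for a branch holding a single strain (assumed)
    """
    if not values:
        raise ValueError("a collapsed branch must contain at least one strain")
    n = len(values)
    # Assumed scaling: the size grows with the natural log of the strain count.
    size = base_size * (1.0 + math.log(n))
    # Each pie slice is the fraction of strains sharing a value.
    counts = Counter(values)
    fractions = {value: count / n for value, count in counts.items()}
    return size, fractions


# Hypothetical category values for four strains in one collapsed branch.
size, fractions = collapsed_marker(["human", "human", "poultry", "human"])
print(round(size, 2), fractions)
```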

Downloading Results

You can download either the Newick format (.nwk) file describing the tree or the matrix. The matrix is a tab-delimited file containing all the nucleotide calls at the variant positions. In this file strains are identified by barcode, so a map linking each barcode to its strain name is also available.
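
Once both files are downloaded, the barcodes in the matrix can be swapped for strain names with a few lines of Python. The sketch below is hedged: the exact column layout of the two files is not described here, so it assumes a two-column tab-delimited map (barcode, strain name) and a matrix whose header row holds the barcodes. The barcodes, strain names and helper functions `load_barcode_map` and `rename_header` are placeholders invented for the example.

```python
import csv
import io


def load_barcode_map(handle):
    """Read a two-column tab-delimited map: barcode, strain name (assumed layout)."""
    reader = csv.reader(handle, delimiter="\t")
    return {row[0]: row[1] for row in reader if len(row) >= 2}


def rename_header(header, barcode_to_name):
    """Replace barcodes in the header with strain names; leave other columns alone."""
    return [barcode_to_name.get(column, column) for column in header]


# Placeholder content standing in for the downloaded files.
map_text = "BARCODE_1\tstrain_one\nBARCODE_2\tstrain_two\n"
matrix_text = "position\tBARCODE_1\tBARCODE_2\n120\tA\tG\n"

barcode_to_name = load_barcode_map(io.StringIO(map_text))
rows = list(csv.reader(io.StringIO(matrix_text), delimiter="\t"))
rows[0] = rename_header(rows[0], barcode_to_name)
print(rows[0])
```

With real downloads, replace the `io.StringIO` objects with files opened using `open(path, newline="")` and adjust the column handling to match the actual layout.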

Deleting A Tree

  • Stand alone Projects can be deleted from the Workspace dialog, which can be accessed either from the database home page (click the top right panel) or from the main search page (using Workspace -> Load). Click on the Project you want to delete and press Delete WS.
  • Projects can be deleted from a Workspace by clicking the edit button in the Workspace Summary dialog. Each project in the list should now have a cross by it. Remove projects by clicking the cross and then pressing Update.