JBrowse In Enterobase

Installing JBrowse

Installing the executables

Installation instructions can be found at https://jbrowse.org/install/.

wget https://github.com/GMOD/jbrowse/releases/download/1.12.4-release/JBrowse-1.12.4.zip
unzip JBrowse-1.12.4.zip
cd JBrowse-1.12.4/
./setup.sh
cd bin
wget http://hgdownload.soe.ucsc.edu/admin/exe/linux.x86_64/wigToBigWig
chmod 755 wigToBigWig

This is sufficient for the server to process genomes and format them for JBrowse.

Enabling the JBrowse Server

If using nginx, the installation directory needs to be mapped with the following in the nginx config:

location /entero_jbrowse {
     alias /path/to/JBrowse;
}

The index in the main directory needs to be altered by adding the following include.

<script type="text/javascript" src="src/jbrowse_utilities.js"></script>

The following lines also need to be added to the main script:

if (queryParams.extra_include){
     config.include.push(queryParams.data+"/"+queryParams.extra_include+".json");
}
if(queryParams.highlight_genes){
     JBrowseUtilities.highlight_genes = queryParams.highlight_genes.split(",");
}

Finally, any plugins need to be added to the main config object:

plugins: ["GCContent","NeatCanvasFeatures","SNPViewer"]

Place the jbrowse_utilities.js file (see this repository) in the src directory. Also create a folder called entero_data in the main installation directory. This folder needs to be linked to the jbrowse_repository (see below), where all the formatted files that JBrowse requires to display a genome will be deposited.

Source Data

JBrowse will load data from the entero_data folder, which should have been created in the main installation directory. It needs to be linked to the actual location of the data, which must be accessible to the instances of Enterobase that format the data (currently /share_space/jbrowse). The location is specified in the jbrowse_repository variable in entero/jbrowse/utilities.

Data for each assembly is stored in the following structure within the jbrowse_repository folder: DB Code/First Four symbols of Barcode (after the DB code)/Barcode

e.g. ESC_HA6306AA_AS would be in ESC/HA63/ESC_HA6306AA_AS
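
The directory layout can be sketched in Python as follows. This is only an illustration: the function name assembly_data_dir is hypothetical, and it assumes the final directory is the full barcode, as in the URL examples elsewhere on this page (e.g. entero_data/CLO/BA36/CLO_BA3684AA_AS).

```python
from pathlib import PurePosixPath


def assembly_data_dir(barcode, repository="/share_space/jbrowse"):
    """Return the data directory for a barcode such as 'CLO_BA3684AA_AS'.

    Hypothetical helper: assumes the layout
    <repository>/<DB code>/<first four symbols after the DB code>/<barcode>.
    """
    db_code, _, rest = barcode.partition("_")
    if not db_code or len(rest) < 4:
        raise ValueError(f"unexpected barcode format: {barcode!r}")
    return PurePosixPath(repository) / db_code / rest[:4] / barcode


print(assembly_data_dir("CLO_BA3684AA_AS"))
```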

Formatting The Assembly

An assembly can be formatted for viewing in JBrowse by running the script [make_jbrowse_annotation](Maintenance%20Scripts#markdown-header-make_jbrowse_annotation) in Enterobase. This script will create all the necessary files in the appropriate directory. The actual methods are in Jbrowse.utilities.py, which has two variables, jbrowse_bin and jbrowse_repository. As already mentioned, jbrowse_repository is the root directory where all the formatted files will be placed (/share_space/jbrowse). jbrowse_bin should point to the directory containing the executables for formatting the data, and so depends on where JBrowse was installed. During installation, JBrowse places all the necessary programs in the bin directory, except wigToBigWig, which is needed for generating the quality annotations and can be found [here](http://hgdownload.soe.ucsc.edu/admin/exe/).
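
Before formatting, it can be useful to confirm that jbrowse_bin contains the programs it should. The following is a minimal sketch, not part of Enterobase: missing_executables is a hypothetical helper, and the default tool list contains only wigToBigWig because that is the one executable named on this page. Pass in whichever other program names your installation relies on.

```python
import os


def missing_executables(jbrowse_bin, tools=("wigToBigWig",)):
    """Return the names in `tools` that are absent or not executable.

    Hypothetical check; `jbrowse_bin` is the directory that the
    jbrowse_bin variable is expected to point to.
    """
    missing = []
    for name in tools:
        path = os.path.join(jbrowse_bin, name)
        if not (os.path.isfile(path) and os.access(path, os.X_OK)):
            missing.append(name)
    return missing
```

For example, missing_executables("/path/to/JBrowse/bin") returns an empty list when wigToBigWig has been downloaded and made executable as described in the installation section.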

Steps In Formatting

make_jbrowse_from_genome in entero/cell_app/tasks.py will carry out the following steps:

  • A sequence index will be built creating the sequence track.
  • If it is a ‘complete’ assembly the genbank annotation will be downloaded and made into a track.
  • If it is a local assembly, a qualities track will be made from the fastq file.
  • All ‘schemes’ will be made into tracks.
  • Prokka annotation, if present, will be made into a track.
  • If a ref_masker file is associated with the assembly (which shows repeat regions ignored in SNP calling), this too will be made into a track.
  • Information for all tracks, including the GCRatio track, is placed in the trackList.json file.
  • In the other_data json column in the assembly table, "can_view":"true" is added, which indicates that a JBrowse annotation is available for the assembly.
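
The decision logic in the steps above can be sketched as follows. This is an illustration under stated assumptions rather than the code in tasks.py: the function names, the dictionary keys (complete, local, schemes, has_prokka, has_ref_masker) and the track labels are all hypothetical. Only the "can_view":"true" flag comes from the description above.

```python
import json


def planned_tracks(assembly):
    """List the tracks to build for an assembly described by a plain dict.

    All keys and track labels here are hypothetical placeholders.
    """
    tracks = ["sequence"]                    # sequence index is always built
    if assembly.get("complete"):
        tracks.append("genbank")             # downloaded GenBank annotation
    if assembly.get("local"):
        tracks.append("qualities")           # made from the fastq file
    tracks.extend(assembly.get("schemes", []))
    if assembly.get("has_prokka"):
        tracks.append("prokka")
    if assembly.get("has_ref_masker"):
        tracks.append("ref_masker")          # repeat regions ignored in SNP calling
    return tracks


def mark_viewable(other_data_json):
    """Add "can_view": "true" to the JSON text of the other_data column."""
    data = json.loads(other_data_json) if other_data_json else {}
    data["can_view"] = "true"
    return json.dumps(data)
```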

Displaying An Assembly

To display an assembly, it first needs to be formatted (see above). Then point the browser to the following URL: jbrowse_root?data=/path/to/data

e.g. http://137.205.123.127/entero_jbrowse/?data=entero_data/CLO/BA36/CLO_BA3684AA_AS

Various parameters can be passed in the GET request; see [here](http://gmod.org/wiki/JBrowse_Configuration_Guide#Controlling_JBrowse_with_the_URL_Query_String). In addition, there are two Enterobase-specific parameters: extra_include and highlight_genes.

By default, the tracks available are those in the trackList.json file in the data directory. However, under certain circumstances you may want extra tracks, and these should be specified in the extra_include parameter. For example, when viewing an SNP project you want the SNPs as well as all the default tracks for the reference, which can be achieved by specifying the config containing the SNP track:-

http://<server>/entero_jbrowse?data=entero_data/SAL/MA21/SAL_MA2182AA_AS&extra_include=snps_9433&tracks=snp_9433

The highlight_genes parameter should contain the names of the genes (comma delimited) that will be highlighted (coloured red) in the appropriate track. If loc=highlighted_gene_name is also added, JBrowse will open displaying the specified gene.

http://<server>/entero_jbrowse/?data=entero_data/SAL/XA13/SAL_XA1357AA_AS&tracks=wgMLST&highlight_genes=SESA_RS21645,CFSAN002050_RS14600,SESA_RS21635,SESA_RS21630,CFSAN002050_RS14575,SESA_RS21650&loc=SESA_RS21645
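
A URL of this form can be assembled programmatically. The sketch below uses only the query parameters described above (data, tracks, extra_include, highlight_genes, loc). The function name jbrowse_url and the example.org host are assumptions for illustration.

```python
from urllib.parse import urlencode


def jbrowse_url(root, data, tracks=None, extra_include=None,
                highlight_genes=None, loc=None):
    """Build a JBrowse URL from the parameters described on this page.

    Hypothetical helper; `root` is the address mapped to /entero_jbrowse.
    """
    params = {"data": data}
    if extra_include:
        params["extra_include"] = extra_include
    if tracks:
        params["tracks"] = ",".join(tracks)
    if highlight_genes:
        params["highlight_genes"] = ",".join(highlight_genes)
    if loc:
        params["loc"] = loc
    # Keep "/" and "," unescaped so the URL looks like the examples above.
    return root.rstrip("/") + "/?" + urlencode(params, safe="/,")


print(jbrowse_url(
    "http://example.org/entero_jbrowse",
    "entero_data/SAL/XA13/SAL_XA1357AA_AS",
    tracks=["wgMLST"],
    highlight_genes=["SESA_RS21645", "SESA_RS21635"],
    loc="SESA_RS21645",
))
```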

However in practice <enterobase_root>/view_jbrowse_annotation is used to display annotations, with the following parameters:-

  • locus: The gene to be initially displayed (optional)
  • highlighted_genes: A comma delimited list of loci to highlight (optional)
  • barcode: The assembly barcode
  • database: The name of the database

e.g.

enterobase.warwick.ac.uk/view_jbrowse_annotation?barcode=SALXA1357AA_AA&database=senterica&locus=SESA_RS2163&highlight_genes=SESA_RS21645,CFSAN002050_RS14600,SESA_RS21635,SESA_RS21630,CFSAN002050_RS14575,SESA_RS21650

This way, if the genome has not been formatted, it will be sent off for formatting and the page will display a waiting symbol. The page then polls the backend every 10 seconds and, when the genome is ready, swaps its href to the JBrowse page.
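
The polling behaviour can be illustrated with a short Python sketch. The real page does this in the browser; the code below only shows the pattern. The function name wait_until_ready, the is_ready callback and the max_polls limit are assumptions. Only the 10 second interval comes from the description above.

```python
import time


def wait_until_ready(is_ready, interval=10.0, max_polls=60, sleep=time.sleep):
    """Call is_ready() every `interval` seconds until it returns True.

    Returns True once the genome is reported ready, or False after
    max_polls attempts (a hypothetical safety limit).
    """
    for attempt in range(max_polls):
        if is_ready():
            return True
        if attempt < max_polls - 1:
            sleep(interval)  # wait before asking the backend again
    return False
```

Here is_ready stands in for whatever request checks whether the annotation exists. Passing sleep as a parameter makes the loop easy to test without actually waiting.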

Displaying SNPs

SNPs are displayed using [this](https://github.com/martinSergeant/jbrowse_snp_viewer) plugin. When an SNP project completes, a VCF file is created, which, along with the tree, is added to the reference assembly’s data directory. A config file pointing to the SNP track is also added, and this needs to be specified in the extra_include parameter (see above).

Use of Celery

Calls to format genomes for JBrowse are passed to Celery, so this needs to be running on the server (see [here](jobs#markdown-header-running-celery)). Alternatively, the request can be passed to a server that has Celery running by adding the following to the nginx config:-

location /view_jbrowse {
        rewrite ^/(view_jbrowse.*) /$1 break;
        proxy_pass http://karmo.lnx.warwick.ac.uk:8000;
        proxy_set_header Host $host;
        add_header X-Proxy-Upstream $upstream_addr;
        proxy_set_header X-Real-IP  $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
}