JBrowse In Enterobase
=====================
Installing JBrowse
------------------
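**Installing the executables**

Installation instructions can be found at https://jbrowse.org/install/. The
commands below download and set up JBrowse 1.12.4, then add the UCSC
**wigToBigWig** executable to its bin directory.
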
.. code-block:: bash

    wget https://github.com/GMOD/jbrowse/releases/download/1.12.4-release/JBrowse-1.12.4.zip
    unzip JBrowse-1.12.4.zip
    cd JBrowse-1.12.4/
    ./setup.sh
    cd bin
    wget http://hgdownload.soe.ucsc.edu/admin/exe/linux.x86_64/wigToBigWig
    chmod 755 wigToBigWig

This is sufficient for the server to process genomes and format them for
JBrowse.

**Enabling the JBrowse Server**

If using nginx, the installation directory needs to be mapped with the
following in the nginx config.

.. code-block:: nginx

    location /entero_jbrowse {
        alias /path/to/JBrowse;
    }

The index in the main directory needs to be altered by adding the following include.
.. code-block:: javascript
Also, the following lines need to be added to the main script.

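.. code-block:: javascript

    // Load an additional track config (e.g. an SNP track) from the data directory
    if (queryParams.extra_include) {
        config.include.push(queryParams.data + "/" + queryParams.extra_include + ".json");
    }
    // Store the comma-delimited gene names that should be highlighted
    if (queryParams.highlight_genes) {
        JBrowseUtilities.highlight_genes = queryParams.highlight_genes.split(",");
    }
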
Finally, any plugins need to be added to the main config object.

.. code-block:: javascript

    plugins: ["GCContent","NeatCanvasFeatures","SNPViewer"]

In the src directory, place the jbrowse_utilities.js file (see this
repository). Also, in the main installation directory, create a folder named
entero_data. It needs to be linked to the jbrowse_repository (see below),
where all the formatted files that JBrowse requires to display a genome will
be deposited.

Source Data
-----------
JBrowse loads data from the entero_data folder, which should have been
created in the main installation directory. It needs to be linked to the
actual location of the data, which must be accessible to the instances of
Enterobase that format the data (currently */share_space/jbrowse*). The
location is specified in the **jbrowse_repository** variable in
*entero/jbrowse/utilities*.

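
The link between entero_data and the repository can be made by hand or with a
small helper. The following is a minimal sketch, assuming a symbolic link is
wanted. ``link_entero_data`` is a hypothetical name and is not part of
Enterobase, and both paths in the usage note are placeholders.

.. code-block:: python

    import os


    def link_entero_data(jbrowse_dir, repository):
        """Create jbrowse_dir/entero_data as a symlink to repository.

        Hypothetical helper: leaves an existing entero_data entry untouched.
        """
        link = os.path.join(jbrowse_dir, "entero_data")
        if not os.path.lexists(link):
            os.symlink(repository, link)
        return link

For example, ``link_entero_data("/path/to/JBrowse", "/share_space/jbrowse")``
would make the repository visible under the JBrowse installation.
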
Data for each assembly is stored within the jbrowse_repository folder using
the structure ``DB Code/First Four Symbols of Barcode/Barcode``, e.g.
ESC_HA6306AA_AS would be in ESC/HA63/HA6306AA_AS.

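
As a minimal sketch, the directory for a barcode could be derived as follows.
``assembly_data_dir`` is a hypothetical helper, not a function from
Enterobase. It assumes the final directory is the full barcode, as in the URL
examples under Displaying An Assembly (e.g. CLO/BA36/CLO_BA3684AA_AS). If the
final directory omits the database prefix, as in the ESC example above, the
last path component would need adjusting.

.. code-block:: python

    import posixpath


    def assembly_data_dir(repository, barcode):
        """Return the data directory for a barcode such as CLO_BA3684AA_AS."""
        # Split the database code from the rest of the barcode
        db_code, rest = barcode.split("_", 1)
        # DB code / first four symbols after the prefix / full barcode (assumed)
        return posixpath.join(repository, db_code, rest[:4], barcode)
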
Formatting The Assembly
-----------------------
An assembly can be formatted for viewing in JBrowse by running the script
`make_jbrowse_annotation <Maintenance%20Scripts#markdown-header-make_jbrowse_annotation>`__
in Enterobase. This script will create all the necessary files in the
appropriate directory. The actual methods are present in
Jbrowse.utilities.py, which has two variables, **jbrowse_bin** and
**jbrowse_repository**. As already mentioned, **jbrowse_repository** is the
root directory where all the formatted files will be placed
(*/share_space/jbrowse*). **jbrowse_bin** should point to the directory
containing the executables for formatting the data, which depends on where
JBrowse was installed. During installation, JBrowse places all the necessary
programs in the bin directory, except **wigToBigWig**, which is needed for
generating the quality annotations and can be found
`here <http://hgdownload.soe.ucsc.edu/admin/exe/>`__.

**Steps In Formatting**

*make_jbrowse_from_genome* in *entero/cell_app/tasks.py* carries out the
following steps:

* A sequence index will be built, creating the sequence track.
* If it is a 'complete' assembly, the GenBank annotation will be downloaded and made into a track.
* If it is a local assembly, a qualities track will be made from the FASTQ file.
* All 'schemes' will be made into tracks.
* Prokka annotation, if present, will be made into a track.
* If a ref_masker file is associated with the assembly (which shows repeat regions ignored in SNP calling), this too will be made into a track.
* Information for all tracks, as well as for the GCRatio track, is placed in the trackList.json file.
* In the other_data json column in the assembly table, "can_view":"true" is added, which indicates that a JBrowse annotation is available for the assembly.

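
The final step can be illustrated with a short sketch. It assumes the
other_data value is available as a JSON string. ``mark_viewable`` is a
hypothetical name, and reading from and writing to the assembly table is
deliberately left out.

.. code-block:: python

    import json


    def mark_viewable(other_data_json):
        """Return other_data with "can_view": "true" added (hypothetical helper)."""
        # An empty or missing column is treated as an empty JSON object
        data = json.loads(other_data_json) if other_data_json else {}
        # The flag is stored as the string "true", as described above
        data["can_view"] = "true"
        return json.dumps(data)
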
Displaying An Assembly
----------------------
To display an assembly, it first needs to be formatted (see above). Then
point the browser to a URL of the form ``jbrowse_root?data=/path/to/data``,
e.g.

``http://137.205.123.127/entero_jbrowse/?data=entero_data/CLO/BA36/CLO_BA3684AA_AS``

Various parameters can be passed in the GET request; see the
`JBrowse Configuration Guide <http://gmod.org/wiki/JBrowse_Configuration_Guide#Controlling_JBrowse_with_the_URL_Query_String>`__.
In addition, there are two Enterobase-specific parameters, **extra_include**
and **highlight_genes**.

By default, the tracks available are those in the trackList.json file in the
data directory. Under certain circumstances you may want extra tracks to
view, and these should be specified in the **extra_include** parameter. For
example, when viewing an SNP project you may want the SNPs as well as all the
default tracks for the reference, which can be achieved by specifying the
config containing the SNP track:

``http:///entero_jbrowse?data=entero_data/SAL/MA21/SAL_MA2182AA_AS&extra_include=snps_9433&tracks=snp_9433``

The **highlight_genes** parameter should contain the names of the genes
(comma delimited) that will be highlighted (coloured red) in the appropriate
track. By also adding loc=highlighted_gene_name, JBrowse will open displaying
the gene specified:

``http:///entero_jbrowse/?data=entero_data/SAL/XA13/SAL_XA1357AA_AS&tracks=wgMLST&highlight_genes=SESA_RS21645,CFSAN002050_RS14600,SESA_RS21635,SESA_RS21630,CFSAN002050_RS14575,SESA_RS21650&loc=SESA_RS21645``

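
URLs of this kind can also be assembled programmatically. The following is a
minimal sketch using only the Python standard library. ``jbrowse_url`` and
its argument names are hypothetical; only the query parameters themselves
(data, extra_include, tracks, highlight_genes, loc) come from the description
above.

.. code-block:: python

    from urllib.parse import urlencode


    def jbrowse_url(jbrowse_root, data_path, tracks=(), extra_include=None,
                    highlight_genes=(), loc=None):
        """Build a JBrowse URL with the optional Enterobase parameters."""
        params = {"data": data_path}
        if extra_include:
            params["extra_include"] = extra_include
        if tracks:
            params["tracks"] = ",".join(tracks)
        if highlight_genes:
            params["highlight_genes"] = ",".join(highlight_genes)
        if loc:
            params["loc"] = loc
        # Keep "/" and "," unescaped so the URL stays readable
        return jbrowse_root + "?" + urlencode(params, safe="/,")

For example, passing ``tracks=["wgMLST"]``, a list of gene names as
``highlight_genes`` and one of those names as ``loc`` gives a URL with the
same parameters as the example above.
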
However, in practice */view_jbrowse_annotation* is used to display
annotations, with the following parameters:

* **locus** The gene to be initially displayed (optional)
* **highlighted_genes** A comma delimited list of loci to highlight (optional)
* **barcode** The assembly barcode
* **database** The name of the database

e.g.

``enterobase.warwick.ac.uk/view_jbrowse_annotation?barcode=SALXA1357AA_AA&database=senterica&locus=SESA_RS2163&highlight_genes=SESA_RS21645,CFSAN002050_RS14600,SESA_RS21635,SESA_RS21630,CFSAN002050_RS14575,SESA_RS21650``

This way, if the genome has not been formatted, it will be sent off for
formatting and the page will display a waiting symbol. The page will then
poll the backend every 10 seconds and, when the genome is ready, will swap
its href to the JBrowse page.

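
The polling itself is done by the page in the browser. Purely to illustrate
the pattern, here is a minimal Python sketch. ``wait_until_ready``, the
``is_ready`` callback and the ``max_attempts`` limit are all assumptions and
are not taken from Enterobase.

.. code-block:: python

    import time


    def wait_until_ready(is_ready, interval=10, max_attempts=30, sleep=time.sleep):
        """Call is_ready() every `interval` seconds until it returns True.

        Returns True once ready, or False after max_attempts checks.
        """
        for _ in range(max_attempts):
            if is_ready():
                return True
            # Wait before asking the backend again
            sleep(interval)
        return False

The ``sleep`` argument is only there so the delay can be replaced when the
function is exercised without actually waiting.
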
Displaying SNPs
---------------
SNPs are displayed using the
`jbrowse_snp_viewer <https://github.com/martinSergeant/jbrowse_snp_viewer>`__
plugin. When an SNP project is complete, a VCF file is created, which along
with the tree is added to the reference assembly's data directory. A config
file pointing to the SNP track is also added, and this needs to be specified
in the extra_include parameter (see above).

Use of Celery
-------------
Calls to format genomes for JBrowse are passed to Celery, so this needs to be
running on the server (see `here <jobs#markdown-header-running-celery>`__).
Alternatively, requests can be passed to a server that has Celery running by
adding the following to the nginx config:

.. code-block:: nginx

    location /view_jbrowse {
        rewrite ^/(view_jbrowse.*) /$1 break;
        proxy_pass http://karmo.lnx.warwick.ac.uk:8000;
        proxy_set_header Host $host;
        add_header X-Proxy-Upstream $upstream_addr;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }