Top level links:

  • [Main top level page for all documentation](Home)
  • [EnteroBase Features](Home)
  • [Registering on EnteroBase and logging in](Enterobase%20website)
  • [Tutorials](Tutorials)
  • [Using the API](About%20the%20API)
  • [About the underlying pipelines and other internals](EnteroBase%20Backend%20Pipeline)
  • [How schemes in EnteroBase work](About%20EnteroBase%20Schemes)
  • [FAQ](FAQ)

# RCatch #

[TOC]

# Overview #

RCatch implements the automated downloading of SRAs.

To accomplish this, RCatch provides a standalone RESTful service that responds to HTTP requests for downloading SRAs from NCBI. RCatch also implements uploading SRAs to the S3 interface in [CLIMB].

RCatch is written in Python and uses [Flask] to offer web-based APIs, PostgreSQL to store information, and [ZeroMQ] as a broker between them.

Because of occasional communication breakdowns, RCatch tries five protocols in succession for downloading short reads before giving up (with an ERROR message). These protocols are:

  1. FASTQ/GZIP file from [ENA/EBI], using [Aspera].
  2. FASTQ/BZIP2 file from [DRA/DDBJ], using [Aspera].
  3. FASTQ/GZIP file from [ENA/EBI], using FTP.
  4. SRA file from [SRA/NCBI][Short Read Archives (SRAs)], using [Aspera].
  5. SRA file from [SRA/NCBI][Short Read Archives (SRAs)], using FTP.
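
The fallback behaviour described above can be sketched as a simple loop. This is only an illustration, not RCatch's actual code: the function `download_with_fallback` and the `downloaders` mapping are hypothetical names, and each downloader is assumed to return a local file path or raise an exception on failure. The protocol names are taken from the default order listed for the source method.

```python
from typing import Callable, Dict, Sequence

# Protocol names in the documented default order.
DEFAULT_ORDER = ("ENA", "DRA", "ENA-FTP", "SRA", "SRA-FTP")


def download_with_fallback(
    run: str,
    downloaders: Dict[str, Callable[[str], str]],
    order: Sequence[str] = DEFAULT_ORDER,
) -> str:
    """Try each protocol in turn and return the first downloaded file path.

    `downloaders` is a hypothetical mapping from protocol name to a callable
    that returns a local path, or raises an exception if the download fails.
    """
    errors = []
    for name in order:
        try:
            return downloaders[name](run)
        except Exception as exc:  # any failure moves on to the next protocol
            errors.append(f"{name}: {exc}")
    # Every protocol failed: give up with an error, as described above.
    raise RuntimeError(
        f"ERROR: all protocols failed for {run}: " + "; ".join(errors)
    )
```

Passing a different `order` mimics restricting or reordering the protocols.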

RCatch automatically reformats FASTQ/BZIP2 files or SRA files into FASTQ/GZIP files after downloading.
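
As an illustration of the FASTQ/BZIP2 to FASTQ/GZIP step, the sketch below re-compresses a file using only the Python standard library. It is an assumption about how such a conversion could be done, not a description of RCatch's internals; the function name `bzip2_to_gzip` is hypothetical, and converting SRA files (which needs a separate tool) is not shown.

```python
import bz2
import gzip
import shutil


def bzip2_to_gzip(src_path: str, dst_path: str) -> None:
    """Re-compress a BZIP2 file as GZIP, streaming so the whole file
    never has to fit in memory."""
    with bz2.open(src_path, "rb") as src, gzip.open(dst_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
```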

# API #

## RCatch URI ##

The RCatch URI used in the examples below depends on the configuration of the system on which RCatch runs.

## Downloading short reads ##

The get method is used to download short reads. The example below downloads the short reads with accession codes ERR036000 and ERR036001:

http://<RCatch Host>/ET/RCatch/get?run=ERR036000,ERR036001

Another example of downloading short reads that also controls the priority is provided below:

http://<RCatch Host>/ET/RCatch/get?run=ERR036002,ERR036003,ERR036004&priority=-1

The lower the number, the higher the priority. By default priority=0.
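
A request URL of this form can be assembled programmatically. The helper below is a minimal sketch: `build_get_url` is a hypothetical name, and the host `rcatch.example.org` in the usage note is a placeholder for your own RCatch host.

```python
from typing import Optional, Sequence
from urllib.parse import urlencode


def build_get_url(
    base: str, runs: Sequence[str], priority: Optional[int] = None
) -> str:
    """Build a get URL for one or more run accession codes."""
    params = {"run": ",".join(runs)}
    if priority is not None:
        params["priority"] = str(priority)
    # safe="," keeps the comma-separated accession list unescaped.
    return f"{base}/ET/RCatch/get?{urlencode(params, safe=',')}"
```

For example, `build_get_url("http://rcatch.example.org", ["ERR036002", "ERR036003", "ERR036004"], priority=-1)` yields a URL ending in `/ET/RCatch/get?run=ERR036002,ERR036003,ERR036004&priority=-1`.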

## Delete downloaded short reads ##

An example of deleting downloaded short reads is provided below which deletes short reads with accession codes ERR036002, ERR036003 and ERR036004:

http://<RCatch Host>/ET/RCatch/del?run=ERR036002,ERR036003,ERR036004

## Priority of tasks ##

Below is an example which changes the priority of an existing task, for the short read with accession code ERR036000, to priority=2:

http://<RCatch Host>/ET/RCatch/priority?run=ERR036000&priority=2
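
To illustrate what a priority change means for ordering, the sketch below sorts tasks so that lower numbers come first. It is not RCatch's scheduler: the function `processing_order` is hypothetical, and breaking ties by submission order is an assumption, since the documentation does not say how equal priorities are handled.

```python
from typing import Dict, List


def processing_order(priorities: Dict[str, int]) -> List[str]:
    """Return run accessions ordered by priority (lowest number first).

    sorted() is stable, so runs with equal priority keep the order in
    which they were added (an assumed tie-break).
    """
    return sorted(priorities, key=lambda run: priorities[run])


tasks = {"ERR036000": 0, "ERR036001": 0, "ERR036002": -1}
before = processing_order(tasks)  # ERR036002 first, then ERR036000, ERR036001

tasks["ERR036000"] = 2            # same effect as priority?run=ERR036000&priority=2
after = processing_order(tasks)   # ERR036000 now comes last
```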

## Controlling the choice of downloading protocols ##

It is possible to control the choice of downloading protocols using the source method. Below is an example which changes the downloading protocols (i.e. do not download from DRA, and try SRA before ENA):

http://<RCatch Host>/ET/RCatch/source?sites=SRA,SRA-FTP,ENA,ENA-FTP

The default order of downloading protocols is: ENA,DRA,ENA-FTP,SRA,SRA-FTP.
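
A client may want to check a protocol list before sending it. The sketch below validates the names against the five protocols listed in the default order and builds the source URL. The helper `build_source_url` is hypothetical, and whether RCatch itself rejects unknown or repeated names is not documented, so the client-side checks are an assumption.

```python
from typing import Sequence

# The five protocol names from the documented default order.
KNOWN_SITES = ("ENA", "DRA", "ENA-FTP", "SRA", "SRA-FTP")


def build_source_url(base: str, sites: Sequence[str]) -> str:
    """Build a source URL after checking the protocol names."""
    unknown = [s for s in sites if s not in KNOWN_SITES]
    if unknown:
        raise ValueError(f"unknown protocol(s): {', '.join(unknown)}")
    if len(set(sites)) != len(sites):
        raise ValueError("each protocol may be listed only once")
    return f"{base}/ET/RCatch/source?sites={','.join(sites)}"
```

Calling it with `["SRA", "SRA-FTP", "ENA", "ENA-FTP"]` reproduces the example above, which skips DRA and tries SRA before ENA.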

[CLIMB]: http://www.climb.ac.uk/ "external link"
[Flask]: http://flask.pocoo.org/ "external link"
[ZeroMQ]: http://zeromq.org/ "external link"
[Aspera]: http://asperasoft.com/ "external link"
[ENA/EBI]: http://www.ebi.ac.uk/ena "external link"
[DRA/DDBJ]: http://trace.ddbj.nig.ac.jp/dra/index_e.html "external link"
[Short Read Archives (SRAs)]: http://www.ncbi.nlm.nih.gov/sra "external link"