Analysis objects - Workspaces

Analysis objects (sometimes called workspaces) are items in Enterobase such as trees and custom columns or views. Confusingly, a workspace proper (essentially just a list of strains) is itself one kind of analysis object.

Data for an analysis is stored in the user_preferences table. The id of the entry in this table is the handle for the analysis; it is used in URLs and programmatically, e.g. get_analysis_object(23233)

Description in the Config

All analysis objects need to be described in the ANALYSIS_TYPES dictionary of the top-level config. Each entry can contain the following keys:

  • label The human readable label
  • icon The icon used for the analysis type when listing workspaces
  • shareable If True, the object can be shared
  • parameters A dictionary of parameters which describe the analysis, in the format parameter_name:parameter_label. This way the label can be changed without affecting any of the code.
  • Class The class used to manipulate the analysis. The class should be in entero/ExtraFuncs/workspace and inherit from the base Analysis class
  • url The URL pattern for analysis types that can be shown in a stand-alone web page. It can include <database>, which is replaced by the name of the database in the final URL, and <id>, which is replaced by the id of the analysis being shown, e.g. species/<database>/snp_project/<id>
  • job_required If True, a job is required to create the analysis. The object’s data should have a ‘complete’ or ‘failed’ tag once the job has finished
  • create_from_search If True, the analysis can be created from the main search page, e.g. Trees

Other parameters specific to an individual analysis type can also be added.
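
As a rough illustration, an entry in ANALYSIS_TYPES might look like the sketch below. Only the url pattern comes from the text above; the type name, label, icon, parameter names, class name and the database name used in the call are assumptions made up for the example, and build_analysis_url is a hypothetical helper rather than an Enterobase function.

```python
# Hypothetical ANALYSIS_TYPES entry: every value except the url pattern is invented.
ANALYSIS_TYPES = {
    "snp_project": {
        "label": "SNP Project",
        "icon": "snp_project.png",
        "shareable": True,
        "parameters": {"ref_id": "Reference Strain"},
        "Class": "SNPProject",
        "url": "species/<database>/snp_project/<id>",
        "job_required": True,
        "create_from_search": True,
    }
}


def build_analysis_url(analysis_type, database, analysis_id):
    """Fill the <database> and <id> placeholders of a type's url pattern."""
    pattern = ANALYSIS_TYPES[analysis_type]["url"]
    return pattern.replace("<database>", database).replace("<id>", str(analysis_id))


print(build_analysis_url("snp_project", "senterica", 23233))
```

Keeping the placeholders in the config means the same pattern can serve every database and every analysis id without any per-type URL code.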

user_preferences database table

Analysis objects such as workspaces and user preferences are stored in the user_preferences table within the common entero database. A single table covers all the species within an Enterobase instance. The id and user_id columns are self-explanatory. The data column contains, in JSON format, everything describing the analysis. All analysis objects have the following:

  • date_created The date the analysis type was created
  • date_modified The date the analysis type was last modified
  • data
    • description A short description of the analysis
    • links A dictionary of link_name:link_href
  • complete ‘true’ if the analysis job is complete
  • failed ‘true’ if the analysis job has failed
  • job_id The id of the analysis job
  • name The name of the workspace
  • type The type of workspace
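
A minimal sketch of how the job flags might be read once the data column has been parsed. It assumes the flags live in the parsed data dictionary (as the job_required description suggests) and hold the string 'true'; the sample values, the link and the job_state helper are hypothetical.

```python
import json


def job_state(data):
    """Return 'failed', 'complete' or 'pending' from a parsed analysis data dict."""
    if data.get("failed") == "true":
        return "failed"
    if data.get("complete") == "true":
        return "complete"
    return "pending"


# Hypothetical content of the data column, stored as a JSON string.
raw = json.dumps({
    "description": "Example tree",
    "links": {"help": "https://example.org/help"},
    "job_id": 101,
    "complete": "true",
})

print(job_state(json.loads(raw)))
```

Checking failed before complete is a design choice of this sketch, so that a record carrying both flags is reported as failed.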

Simple workspaces

Workspaces are stored in the user_preferences table of the system database, with all associated data held as JSON in the data column, in the following format:

{
    grid_params:{....}
    ,data:{....}
}

The data entry comprises the following:

  • experimental_data The name (description) of the scheme currently displayed
  • strain_ids A list of all strain Ids
  • sort_order A list of the columns used for sorting, each value being a list containing the name of the table with the sort column (‘main’ for the strain table), the name of the column and either 1 (ascending) or 0 (descending)
  • current_page The index of the current page
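
The sort_order format can be applied with a chain of stable sorts, as in the sketch below. It assumes the first entry in sort_order is the primary sort column; the row layout (one dictionary per table name), the column names and the apply_sort_order helper are hypothetical and only serve to illustrate the format.

```python
def apply_sort_order(rows, sort_order):
    """Sort rows by several columns using stable sorts, lowest priority first."""
    result = list(rows)
    for table, column, direction in reversed(sort_order):
        # direction is 1 for ascending and 0 for descending
        result.sort(key=lambda row: row[table][column], reverse=(direction == 0))
    return result


# Hypothetical workspace record following the format described above.
workspace = {
    "grid_params": {},
    "data": {
        "experimental_data": "example scheme",
        "strain_ids": [11, 12, 13],
        "sort_order": [["main", "country", 1], ["main", "year", 0]],
        "current_page": 0,
    },
}

# Hypothetical rows: strain table values are held under the 'main' key.
rows = [
    {"id": 11, "main": {"country": "UK", "year": 2001}},
    {"id": 12, "main": {"country": "France", "year": 1999}},
    {"id": 13, "main": {"country": "UK", "year": 2010}},
]

ordered = apply_sort_order(rows, workspace["data"]["sort_order"])
print([row["id"] for row in ordered])  # [12, 13, 11]
```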

Some of the various types of simple workspaces include:

  • main_workspace Used when “Workspace/Save as” is used to save a specific set of strains
  • grid_layout Saves a user’s customised layouts of the main and experimental grids. The “name” column stores the type of grid whose layout is being saved.

Storage of a user’s workspace list

The list of workspaces associated with a user is stored in a workspace of type workspace_folders.

The format is a dictionary of folders, each describing the workspaces it contains, its subfolders and its text. Every such dictionary has a Root folder with an id of RN. The general structure is:

{
    "folder_id":{
        "id":"folder_id",
        "workspaces":[id1,id2,....],
        "children":["folder2","folder3",.....],
        "text":"folder_name"
    },
    .........
}

For example, the following folder tree

Root
│
└───Project 1
│    big snp tree \\id is 3
│    all typhi \\id 365
│
└───Project 2
      small tree \\id 673

would have the following structure

{
    "RN":{
        "id":"RN",
        "workspaces":[],
        "children":["j1","j2"],
        "text":"Root"
    },
    "j1":{
        "id":"j1",
        "workspaces":[3,365],
        "children":[],
        "text":"Project 1"
    },
    "j2":{
        "id":"j2",
        "workspaces":[673],
        "children":[],
        "text":"Project 2"
    }
}
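
A structure like this can be traversed recursively from the RN folder. The sketch below uses the example data above; the walk helper is hypothetical and not part of Enterobase.

```python
folders = {
    "RN": {"id": "RN", "workspaces": [], "children": ["j1", "j2"], "text": "Root"},
    "j1": {"id": "j1", "workspaces": [3, 365], "children": [], "text": "Project 1"},
    "j2": {"id": "j2", "workspaces": [673], "children": [], "text": "Project 2"},
}


def walk(folders, folder_id="RN", depth=0):
    """Yield (depth, folder_text, workspace_ids) for a folder and its subfolders."""
    folder = folders[folder_id]
    yield depth, folder["text"], folder["workspaces"]
    for child_id in folder["children"]:
        yield from walk(folders, child_id, depth + 1)


# Print an indented view of the folder tree with the workspace ids in each folder.
for depth, text, workspace_ids in walk(folders):
    print("    " * depth + text, workspace_ids)
```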

A user’s individual folders are refreshed each time they are requested via /species/get_user_workspaces (get_user_workspaces in entero.species.views). This method retrieves the folder structure and then removes any workspaces that have been deleted, as well as any workspaces shared with the user that have since been deleted or unshared. Any workspaces that are new are added to the root folder.
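
The clean-up step can be pictured roughly as follows. This is only a sketch of the behaviour described above, not the actual get_user_workspaces code: sync_folders is a hypothetical helper, and accessible_ids stands in for whatever list of currently valid workspace ids (owned or shared) the real method works from.

```python
def sync_folders(folders, accessible_ids, root_id="RN"):
    """Drop workspace ids that are no longer accessible and file new ones under the root.

    The folders dictionary is modified in place and also returned.
    """
    accessible = set(accessible_ids)
    seen = set()
    for folder in folders.values():
        kept = [ws_id for ws_id in folder["workspaces"] if ws_id in accessible]
        folder["workspaces"] = kept
        seen.update(kept)
    # Anything accessible that no folder mentions yet is new, so it goes to the root.
    folders[root_id]["workspaces"].extend(sorted(accessible - seen))
    return folders


folders = {
    "RN": {"id": "RN", "workspaces": [], "children": ["j1", "j2"], "text": "Root"},
    "j1": {"id": "j1", "workspaces": [3, 365], "children": [], "text": "Project 1"},
    "j2": {"id": "j2", "workspaces": [673], "children": [], "text": "Project 2"},
}

# Workspace 365 has been deleted and workspace 900 is new.
sync_folders(folders, [3, 673, 900])
print(folders["RN"]["workspaces"], folders["j1"]["workspaces"])  # [900] [3]
```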

This does not happen for public folders; instead, workspaces are removed from them when deleted and added to them in make_public.

The structure of the buddies folders cannot be altered and their ids are not stored; the structure is created on the fly, with each user name acting as a folder containing any shared folders.

Folder Structure for storing associated workspace data

The folder structures for public and individual workspaces are stored under the directory defined by BASE_WORKSPACE_DIRECTORY in config.py, normally /share_space/workspace/. This contains data for both individuals’ and public workspaces. The directory structure is

where <userid>=0 for public workspaces