
Galaxy Tool XML File

The XML file for a Galaxy tool, generally referred to as the "tool config file" or "wrapper", serves a number of purposes. First, it lays out the user interface for the tool (e.g. form fields, text, help, etc.). Second, it provides the glue that links your tool to Galaxy by telling Galaxy how to invoke it, what options to pass, and what files it will produce as output. As you read this document, take some time to browse the various tool configs (files with a .xml extension) in the ~/tools subdirectories of your local Galaxy instance.

Pay attention to the following when creating a new tool:

  1. Make sure your XML is valid - Improper XML will most likely cause Galaxy to not load your tool. The easiest way to validate your XML is to open the XML file itself in Firefox, which will either parse the file and display it, or show the error and its location in large letters. There are also numerous XML validators available online.

  2. Don't forget to restart Galaxy - Galaxy loads and parses XML at run-time, which means you'll have to restart it after updating any XML files. The same does not apply if you only update an executable.

  3. Use the -file_strandCol options - Using interval files is more of a pain than using BED files, because the column locations are variable. But by including command line options in your executable and passing them the "automagic" column variables, you can easily handle interval formats. Plus, since BED formats are treated internally as intervals, you don't have to worry about figuring out which one your program is being passed. Everything will be provided as an interval file. Read more about this in the <command> tag set section below.

  4. Make sure your parameter names match your command-line variables - Galaxy will populate your command line options from the parameters selected by the user when the tool executes. Form field values are mapped to parameters in the command line by name, such that form field "name" <-> $name in the command line.

  5. Provide tool tips and other help - These are useful for those who will use your tool. The help section provides information on how to use the tool.

  6. Properly use quotes around placeholders - This is especially important for data inputs/filenames, where a space may exist in the filename (e.g. because Galaxy's database/files directory structure contains spaces). Instead of e.g. $input_file, write "${input_file}".

A Galaxy tool's config file consists of a subset of the following XML tag sets - each of these is described in detail in the following sections.
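
To see how these tag sets fit together, here is a minimal sketch of a complete tool config. The tool id, the parameter names, and the use of the standard tac command are assumptions made for illustration only; they do not come from an existing Galaxy tool.

```xml
<!-- Hypothetical tool: writes the lines of a text dataset in reverse order. -->
<tool id="reverse_lines1" name="Reverse lines" version="1.0.0">
    <description>of a text dataset</description>
    <!-- Placeholders are quoted so that paths containing spaces survive. -->
    <command>tac "${input}" > "${out_file1}"</command>
    <inputs>
        <!-- The name "input" matches $input in the command line. -->
        <param name="input" type="data" format="txt" label="Dataset to reverse"/>
    </inputs>
    <outputs>
        <!-- The name "out_file1" matches $out_file1 in the command line. -->
        <data name="out_file1" format="txt"/>
    </outputs>
    <help>Writes the lines of the input dataset in reverse order.</help>
</tool>
```

Each name used with a dollar sign in the command corresponds to the name attribute of a parameter or output, which is the mapping described in item 4 of the checklist above.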

Details of XML tag sets


<tool> tag set

The outer-most tag set

attribute | values | details | required | example
id | a string | Must be unique across all tools; should be lowercase and contain only letters, numbers, and underscores. It allows for tool versioning and metrics of the number of times a tool is used, among other things. | yes | id="sort1"
name | a string | This string is displayed as a hyperlink in the tool menu. | yes | name="Sort"
version | a string | Defaults to "1.0.0" if it is not included in the tag. It allows for tool versioning and should be changed with each new version of the tool. | no | version="1.0.1"
hidden | true, false | Allows tools to be loaded upon server startup, but not displayed in the tool menu. | no | hidden="true"
tool_type | data_source | Allows certain framework functionality to be performed on certain types of tools. This is currently only used in "data_source" tools, but will undoubtedly be used with other tools in the future. | no | tool_type="data_source"
URL_method | get, post | Only if "tool_type" attribute value is "data_source" - defines the HTTP request method to use when communicating with an external data source application (the default is "get"). | no | URL_method="post"
workflow_compatible | true, false | Default is true. | no | workflow_compatible="false"

Example

The following is an example that contains most of the attributes described above.

    <tool id="ucsc_table_direct1" name="UCSC Main" version="1.0.0" hidden="false" tool_type="data_source" URL_method="post">



<description> tag set

The content of this tag set is displayed in the tool menu immediately following the hyperlink for the tool (based on the "name" attribute of the <tool> tag set described above).

Example

    <description>table browser</description>



<version_command> tag set

Specifies the command to be run in order to get the tool's version string. The resulting value will be found in the "Info" field of the history dataset. For example:

    <version_command>tophat -version</version_command>


<command> tag set

This tag specifies how Galaxy should invoke the tool's executable, passing its required input parameter values (the command line specification links the parameters supplied in the form with the actual tool executable). Any word inside it starting with a dollar sign ($) will be treated as a variable whose values can be acquired from one of three sources: parameters, metadata, or output files. After the substitution of variables with their values, the content is interpreted with Cheetah and finally given to the interpreter specified in the corresponding attribute (if any).

attribute | values | details | required | example
interpreter | python, perl, bash, etc. | This attribute defines the programming language in which the tool's executable file is written. Any language can be used (tools can be written in Python, C, Perl, Java, etc.). The executable file must be in the same directory as the XML file. If this attribute is not specified, the tag content should be a Bash command calling executable(s) available in the $PATH. | no (unless executable is interpreted) | interpreter="python"

Example

The following uses a compiled executable (see the various tool config files in emboss_5 tools for examples of using compiled executables).

    <command>backtranseq -sequence $input1 -outfile $out_file1 -cfile $cfile -osformat2 $out_format1 -auto</command>

Example

The following uses an interpreted executable. The values of the $<variables> (e.g. $input) are acquired from form field parameters in the tool form (see ~/tools/filters/sorter.xml for an example of using an interpreted executable). The file sorter.py must be in the same directory as the XML file.

    <command interpreter="python">sorter.py -i $input -o $out_file1 -cols $column -order $order -style $style</command>

Example

The values of the ${<variables>} (e.g., ${input.metadata.chromCol}) are acquired from the metadata associated with the objects selected as the values of each of the respective form field parameters in the tool form. Accessing this information is generally enabled by the following feature components:

  • A set of "metadata information" is defined for each supported data type (see the MetadataElement objects in the various data type classes in ~/lib/galaxy/datatypes).

  • The DatasetFilenameWrapper class in the ~/lib/galaxy/tools/__init__.py code file wraps a Metadata Collection to return Metadata parameters wrapped according to the Metadata spec.

Galaxy also fills in a few reserved variables automatically. Note the use of the reserved parameter name GALAXY_DATA_INDEX_DIR in the following command; it points to the ~/tool-data directory.

    <command interpreter="python">
        extract_genomic_dna.py $input $out_file1 -1 ${input.metadata.chromCol},${input.metadata.startCol},${input.metadata.endCol},${input.metadata.strandCol} -d $dbkey -o $out_format -g ${GALAXY_DATA_INDEX_DIR}
    </command>

Reserved Variables

Galaxy provides a few pre-defined variables which can be used in your command line, even though they don't appear in your tool's parameters.

name | description
$__tool_directory__ | The directory the tool currently resides in (new in 15.03)
$__new_file_path__ | config/galaxy.ini new_file_path value
$__tool_data_path__ | config/galaxy.ini tool_data_path value
$__root_dir__ | Top-level Galaxy source directory made absolute via os.path.abspath()
$__datatypes_config__ | config/galaxy.ini datatypes_config value
$__user_id__ | The user's numeric ID (id column of galaxy_user table in the database)
$__user_email__ | User's email address
$__app__ | The galaxy.app.UniverseApplication instance, which gives access to all other configuration file variables (e.g. $__app__.config.output_size_limit). Should be used as a last resort; may go away in future releases.
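
A reserved variable is used like any other $variable in the command line. The sketch below assumes a script named sorter.py that sits next to the tool config and locates it through $__tool_directory__ instead of relying on the interpreter attribute; the names input and out_file1 are placeholders for a tool's own parameter and output names.

```xml
<!-- Sketch only: assumes sorter.py is stored in the same directory as this tool config. -->
<command>python "$__tool_directory__/sorter.py" -i "${input}" -o "${out_file1}"</command>
```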

Additional runtime properties are available as environment variables. They should be escaped with a backslash ( \ ) when appearing in command or config_file elements, so that they are left for the shell to expand rather than being treated as Cheetah variables.

name | description
\${GALAXY_SLOTS:-4} | Number of cores/threads allocated by the job runner or resource manager to the tool for the given job (here 4 is the default number of threads to use if running via a custom runner that does not configure GALAXY_SLOTS, or in an older Galaxy runtime).
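
As an illustration of the escaping rule, the following sketch passes the allocated thread count to a program. The executable my_aligner and its -t, -i, and -o options are hypothetical; only the \${GALAXY_SLOTS:-4} expression comes from the table above.

```xml
<!-- "my_aligner" and its options are hypothetical. The backslash keeps Cheetah from
     substituting GALAXY_SLOTS, so the shell expands it when the job runs. -->
<command>my_aligner -t \${GALAXY_SLOTS:-4} -i "${input}" -o "${out_file1}"</command>
```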


<inputs> tag set

Consists of all tag sets that define the tool's input parameters. Each <param> tag within the <inputs> tag set maps to a command line parameter within the <command> tag set described above.


<repeat> tag set

This is a container for any tag sets that can be contained within the <inputs> tag set. When it is used, the tool form lets the user add any number of additional sets of the contained parameters (an "Add new <title>" button is displayed on the tool form). See the xy_plot.xml tool config for an example. The parameter sets contained within the <repeat> tag can be retrieved by enumerating over $<name_of_repeat_tag_set> in Cheetah code. Each iteration yields the rank and an object holding the parameters of that repeat block. To fetch a value from the object, use $object.<name_of_param>. See the Cheetah code below.

attribute | values | details | required | example
name | a string | The name of the repeat section | yes | name="series"
title | a string | The title of the repeat section, which will be displayed on the tool form | yes | title="Series"
help | a string | Rendered on the tool form just below the title to provide help information | no | help="Add your series here"
min | an integer | The minimum number of repeat units | no | min="1"
max | an integer | The maximum number of repeat units | no | max="5"
default | an integer | The default number of repeat units | no | default="1"

Example

This part is contained in the <inputs> tag set.

    <repeat name="series" title="Series">
        <param name="input" type="data" format="tabular" label="Dataset"/>
        <param name="xcol" type="data_column" data_ref="input" label="Column for x axis"/>
        <param name="ycol" type="data_column" data_ref="input" label="Column for y axis"/>
    </repeat>


This Cheetah code can be used in the <command> tag set or the <configfile> tag set.

    #for $i, $s in enumerate( $series )
        rank_of_series=$i
        input_path=${s.input.file_name}
        x_column=${s.xcol}
        y_column=${s.ycol}
    #end for
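
The optional min, max, and default attributes listed in the attribute table for this tag set bound how many blocks the user can add. A minimal sketch, reusing the series example and assuming that between one and five series are wanted:

```xml
<!-- The form starts with one series block; the user may add up to five in total. -->
<repeat name="series" title="Series" min="1" max="5" default="1" help="Add your series here">
    <param name="input" type="data" format="tabular" label="Dataset"/>
    <param name="xcol" type="data_column" data_ref="input" label="Column for x axis"/>
    <param name="ycol" type="data_column" data_ref="input" label="Column for y axis"/>
</repeat>
```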



<conditional> tag set

See ~/tools/maf/interval2maf.xml for an example of how to use this tag set. This is a container for conditional parameters in the tool (it must contain <when> tag sets) - the command line is wrapped in an if-else statement.

attribute | values | details | required | example
name | any string | The name of the conditional parameter | yes | name="maf_source_type"

Example

Select the alignment target database (a Galaxy cached genome build or a dataset in the history). Note the different input variables in the command lines.

    <command interpreter="python">
        #if $source.source_select=="database"
            blat_wrapper.py 0 $source.dbkey $input_query $output1 $iden $tile_size $one_off
        #else
            blat_wrapper.py 1 $source.input_target $input_query $output1 $iden $tile_size $one_off
        #end if
    </command>

    <conditional name="source">
        <param name="source_select" type="select" label="Target source">
            <option value="database">Genome Build</option>
            <option value="input_ref">Your Upload File</option>
        </param>
        <when value="database">
            <param name="dbkey" type="genomebuild" label="Genome" />
        </when>
        <when value="input_ref">
            <param name="input_target" type="data" format="fasta" label="Reference sequence" />
        </when>
    </conditional>



<when> tag set

Contained within the <conditional> tag set - each <when> tag set contains a set of input parameters, and the conditional variables are usually defined within <option> tag sets.

attribute | values | details | required | example
value | a possible conditional value | This tag set will be used when the value of the containing conditional parameter equals this attribute value | yes | value="user"

Example

This example shows how to choose the MAF source file, either locally cached data or a history item (there are two options: "cached" or "user"). If the user selects "Alignments in Your History", a variable of type "data" will be generated. If the user selects "Locally Cached Alignments", a drop-down selection menu will be generated according to entries contained in the file "maf_index.loc", which is stored in the configured tool_data_path directory.

    <command>
        #if $maf_source_type.maf_source == "user"
            your_program $maf_source_type.maf_file
        #else
            your_program $maf_source_type.maf_identifier
        #end if
    </command>

    <inputs>
        <conditional name="maf_source_type">
            <param name="maf_source" type="select" label="MAF Source">
                <option value="cached" selected="true">Locally Cached Alignments</option>
                <option value="user">Alignments in Your History</option>
            </param>
            <when value="user">
                <param name="maf_file" type="data" format="maf" label="MAF File" />
            </when>
            <when value="cached">
                <param name="maf_identifier" type="select" label="MAF Type" >
                    <options from_file="maf_index.loc">
                        <column name="name" index="0"/>
                        <column name="value" index="1"/>
                    </options>
                </param>
            </when>
        </conditional>
    </inputs>



<param> tag set

Contained within the <inputs> tag set - each of these specifies a field that will be displayed on the tool form. Ultimately, the values of these form fields will be passed as the command line parameters to the tool's executable.

attribute | values | details | required | example
name | a string | Attribute values must map to each command line parameter name. "Reserved" names are: REDIRECT_URL, DATA_URL, GALAXY_URL. | yes | name="input"
type | see below | The list of supported parameter types is in the parameter_types dictionary in ~/lib/galaxy/tools/parameters/basic.py. | yes | type="data"
optional | true, false | If "false", parameter must have a value. Defaults to "false". | no | optional="true"
label | a string | The attribute value will be displayed on the tool page as the label of the form field | no | label="Sort Query"
help | a string | Rendered on the tool form just below the associated field to provide information about the field | no | help="No data? See tip below"

In addition, there are several more attributes that are only used in combination with specific values of the type attribute.

type Attribute Values and Dependent Attributes

The type attribute specifies what kind of parameter it is.  The list of supported parameter types is in the parameter_types dictionary in ~/lib/galaxy/tools/parameters/basic.py.

type="text"

Free form text; parameter appears as a text box.

Dependent attributes

attribute | values | details | required | example
size | an appropriate number | Only if "type" attribute value is "text". To create a multi-line text box, add an 'area="True"' attribute to the param tag. | no | size="4"

Example

Sometimes you need labels for data or graph axes, chart titles, etc. This can be done using a text field. The following will create a text box 30 characters wide with the default value of "V1".

    <param name="xlab" size="30" type="text" value="V1" label="Label for x axis"/>

Example

    <param name="foo" type="text" area="True" size="5x25" />

type="integer" and type="float"

Whole number and real number, respectively.

Dependent attributes

attribute | values | details | required | example
value | a string | Default value for the form field | only for type="integer" and type="float" | value="0"
min | a number | Minimum parameter value; only valid when type is "integer" or "float" | no | min="0"
max | a number | Maximum parameter value; only valid when type is "integer" or "float" | no | max="600000"

Example

The following will create a text box 4 characters wide with the default value of 1 and will restrict values entered to integers:

    <param name="region_size" size="4" type="integer" value="1" label="flanking regions of size" />
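
The min and max attributes work the same way for type="float". A small sketch with a hypothetical cutoff parameter (the name, label, and bounds are made up for illustration):

```xml
<!-- Hypothetical parameter: a real number restricted to the range 0 to 1, defaulting to 0.05. -->
<param name="pvalue_cutoff" type="float" value="0.05" min="0" max="1" label="P-value cutoff" />
```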

type="boolean"

A True / False value

Dependent attributes

attribute | values | details | required | example
checked | yes, true, on | Only if "type" attribute value is "boolean" | no | checked="true"
truevalue | a string | Only if "type" attribute value is "boolean" | no | truevalue="-p"
falsevalue | a string | Only if "type" attribute value is "boolean" | no | falsevalue="-q"
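
The three attributes are typically combined so that the checkbox turns into a command line flag. In the sketch below, the parameter name, the label, and the program my_tool are hypothetical; the assumption is that $print_mode expands to the truevalue string when the box is checked and to the falsevalue string otherwise.

```xml
<!-- Hypothetical checkbox, checked by default. -->
<param name="print_mode" type="boolean" checked="true" truevalue="-p" falsevalue="-q" label="Print progress" />

<!-- Expands to: my_tool -p ...  when checked, or: my_tool -q ...  when unchecked. -->
<command>my_tool $print_mode "${input}" > "${out_file1}"</command>
```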

type="data"

A dataset from the current history. Multiple formats may be specified for the parameter, separated by commas (e.g. format="sam,bam").

Dependent attributes

attribute | values | details | required | example
format | a string | Only if "type" attribute value is "data" or "data_collection" - the list of supported data formats is contained in the ~/config/datatypes_conf.xml.sample file. Use the file extension. | no | format="tabular"
multiple | true, false | Only if "type" attribute value is "data" | no | multiple="true"

Example

The following will find all "coordinate interval files" contained within the current history and dynamically populate a select list with them. If they are selected, their destination and internal file name will be passed to the appropriate command line variable. Presto! Automatic temporary file management.

    <param name="interval_file" type="data" format="interval">
      <label>near intervals in</label>
    </param>

Example

    <param format="sam,bam" name="bamOrSamFile" type="data" label="Alignments in BAM or SAM format" help="The set of aligned reads." />
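
With multiple="true", the user can select several datasets for one parameter. The following is a sketch under the assumption that Galaxy exposes the selection as a list that Cheetah can iterate over; the parameter and output names are hypothetical.

```xml
<!-- Hypothetical parameter accepting any number of tabular datasets. -->
<param name="input_files" type="data" format="tabular" multiple="true" label="Datasets to concatenate" />

<!-- Each selected dataset contributes one quoted path to the cat command. -->
<command>
    cat
    #for $dataset in $input_files
        "${dataset}"
    #end for
    > "${out_file1}"
</command>
```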

type="select"

...

Dependent attributes

attribute | values | details | required | example
data_ref | attribute value of the input dataset | Only if "type" attribute value is "select" - used with select lists whose options are dynamically generated based on certain metadata attributes of the dataset upon which this parameter depends (usually but not always the tool's input dataset) | no | data_ref="input"
display | checkboxes, radio | Only if "type" attribute value is "select" - render a select list as a set of check boxes or radio buttons. Defaults to a drop-down menu select list. | no | display="checkboxes"
multiple | true, false | Only if "type" attribute value is "select" - render a multi-select list | no | multiple="true"

Example (multiple=false)

The following will create a select list containing the options "Downstream" and "Upstream". Depending on the selection, a "d" or "u" value will be passed to the $upstream_or_down variable on the command line.

    <param name="upstream_or_down" type="select" label="Get">
      <option value="u">Upstream</option>
      <option value="d">Downstream</option>
    </param>

Example (multiple=true)

The following will create a checkbox list allowing the user to select "Downstream", "Upstream", both, or neither. Depending on the selection, the value of $upstream_or_down will be "d", "u", "u,d", or "".

    <param name="upstream_or_down" type="select" label="Get" multiple="true" display="checkboxes">
      <option value="u">Upstream</option>
      <option value="d">Downstream</option>
    </param>

type="data_column"

...

Dependent attributes

attribute | values | details | required | example
force_select | true, false | Only if "type" attribute value is "data_column" - force user to select an option in the list | no | force_select="true"
numerical | true, false | Only if "type" attribute value is "data_column" - builds a dynamically generated select list of numerical or string data | no | numerical="true"
use_header_names | true, false | Only if "type" attribute value is "data_column" - assumes first row of data_ref is a header and builds the select list with these values rather than c1 ... cN | no | use_header_names="true"
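
A data_column parameter is tied to a dataset parameter through data_ref, as in the <repeat> example earlier. A minimal sketch with hypothetical names, assuming the input is a tabular dataset whose first row is a header:

```xml
<!-- The dataset whose columns are offered for selection. -->
<param name="input" type="data" format="tabular" label="Dataset" />

<!-- Offers the columns of "input", labelled with the header row instead of c1 ... cN. -->
<param name="score_col" type="data_column" data_ref="input" numerical="true" use_header_names="true" label="Score column" />
```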

type="drill_down"

...

Dependent attributes

attribute | values | details | required | example
hierarchy | exact, recurse | Only if "type" attribute value is "drill_down" | no | hierarchy="recurse"

type="data_collection"

Dependent attributes

attribute | values | details | required | example
format | a string | Only if "type" attribute value is "data" or "data_collection" - the list of supported data formats is contained in the ~/config/datatypes_conf.xml.sample file. Use the file extension. | no | format="tabular"
collection_type | a string | Only if "type" attribute value is "data_collection" - restricts the kind of collection that can be consumed by this parameter (e.g. paired, list:paired, list). | no | collection_type="paired"

Example

The following will create a parameter that only accepts paired FASTQ files.

    <param name="inputs" type="data_collection" collection_type="paired" label="Input FASTQs" format="fastq">
    </param>

type="color"

(New as of 15.03 release.) ...

type="genomebuild"

...

type="hidden"

...

type="hidden_data"

...

type="baseurl"

...

type="file"

...

type="ftpfile"

...

type="library_data"

...



<validator> tag set

See the annotation_profiler tool for an example of how to use this tag set. This tag set is contained within the <param> tag set - it applies a validator to the containing parameter.

attribute | values | details | required | example
type | expression, regex, in_range, length, metadata, unspecified_build, no_options, empty_field, dataset_metadata_in_file, dataset_metadata_in_data_table, dataset_ok_validator | The list of supported validators is in the validator_types dictionary in ~/lib/galaxy/tools/parameters/validation.py | yes | type="dataset_metadata_in_file"
message | any string | The error message displayed on the tool form if validation fails | no | message="Sequences are not currently available for the specified build"
filename | the name of a file stored locally | The file contains values for validation | no | filename="alignseq.loc"
metadata_name | a valid metadata attribute name | The metadata attribute name | no | metadata_name="dbkey"
metadata_column | a number | The column index in the file containing the values for validation | no | metadata_column="0"
line_startswith | a string | Lines in the file being used for validation start with this attribute value | no | line_startswith="seq"
min | a number | Only when the "type" attribute value is "in_range" - the minimum number allowed | no | min="0"
max | a number | Only when the "type" attribute value is "in_range" - the maximum number allowed | no | max="50"

Example

    <param format="bed" name="input1" type="data" label="Send this dataset to EpiGRAPH">
        <validator type="unspecified_build" />
    </param>

Example

The genome build of the dataset must be stored in Galaxy clusters, and the name of the genome ("dbkey") must be one of the values in column 1 of the file alignseq.loc (with columns counted from 0, this is the field that follows the leading "seq" on each line).

    <validator type="dataset_metadata_in_file" filename="alignseq.loc" metadata_name="dbkey" metadata_column="1" message="Sequences are not currently available for the specified build." split=" " line_startswith="seq" />

Example

Paths/names that downstream tools use in filenames may not contain "..".

    <validator type="expression" message="No two dots (..) allowed">'..' not in value</validator>
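
The in_range validator uses the min and max attributes from the attribute table for this tag set. A minimal sketch; the parameter name, label, and message are hypothetical:

```xml
<!-- Hypothetical integer parameter that is rejected unless it lies between 0 and 50. -->
<param name="percent" type="integer" value="10" label="Percentage of reads to sample">
    <validator type="in_range" min="0" max="50" message="Value must be between 0 and 50" />
</param>
```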



<option> tag set

See ~/tools/filters/sorter.xml for an example of how to use this tag set. This tag set is optionally contained within the <param> tag when the "type" attribute value is "select" ( used for statically generated select lists ).

attribute | values | details | required | example
value | a string | The value to be passed in the command line | yes | value="0"
selected | true | The option selected as the default when the form is initially displayed | no | selected="true"

Example

    <param name="col" type="select" label="From">
        <option value="0" selected="true">Column 1 / Sequence name</option>
        <option value="1">Column 2 / Source</option>
        <option value="2">Column 3 / Feature</option>
        <option value="6">Column 7 / Strand</option>
        <option value="7">Column 8 / Frame</option>
    </param>



<options> tag set

See ~/tools/extract/liftOver_wrapper.xml for an example of how to use this tag set. This tag set is optionally contained within the <param> tag when the "type" attribute value is "select" or "data" ( used for dynamically generated select lists ). This tag set dynamically creates a list of options whose values can be obtained from a predefined file stored locally or a dataset selected from the current history.

attribute | values | details | required | example
from_dataset | the attribute name of the input dataset in the tool config | The options for the select list are dynamically obtained from the input dataset selected for the tool from the current history | no | from_dataset="input1"
from_file | the name of a file contained in the configured tool_data_path directory | The options for the select list are dynamically obtained from a file | no | from_file="alignseq.loc"
from_data_table | the name of a table named in tool_data_table_conf.xml | The options for the select list are dynamically obtained from a file specified in tool_data_table_conf.xml | no | from_data_table="bowtie_indexes"
from_parameter | a valid parameter name | The options for the select list are dynamically obtained from a parameter | no | from_parameter="tool.app.datatypes_registry.upload_file_formats"

Example

Select a database that is pre-formatted and cached in Galaxy clusters. When a new dataset is available, it will be added to the local file named "blastdb.loc" and included in the options of the select list. For a local instance, the file ("blastdb.loc" or "alignseq.loc") must be stored in the configured tool_data_path directory. In this example, the option names and values are taken from column 0 of the file.

    <param name="source_select" type="select" display="radio" label="Choose target database">
        <options from_file="blastdb.loc">
            <column name="name" index="0"/>
            <column name="value" index="0"/>
        </options>
    </param>

Example

Show all of the species that are available in the dataset selected for the parameter named "input1".

    <param name="species1" type="select" label="When Species" multiple="false">
        <options>
            <filter type="data_meta" ref="input1" key="species" />
        </options>
    </param>

Example

Select datasets that are available in both the dataset selected for the parameter named "input1" and the binned_scores.loc file locally stored in the configured tool_data_path directory.

    <param name="datasets" type="select" label="Available datasets" display="radio">
        <options from_file="binned_scores.loc">
            <column name="name" index="1"/>
            <column name="value" index="2"/>
            <column name="dbkey" index="0"/>
            <filter type="data_meta" ref="input1" key="dbkey" column="0" />
        </options>
    </param>

from_data_table

Basically, there are 3 steps to using from_data_table:

  1. Modify tool_data_table_conf.xml to specify the bowtie data table and the column types in the loc file. It should look something like this:

    <table name="bowtie_indexes">
      <columns>name, value</columns>
      <file path="tool-data/bowtie_indices.loc" />
    </table>

     When defining column names in data tables, avoid special characters (e.g. hyphens). This allows simplified access to additional fields for a parameter value when e.g. building the command line; for example, if a 'path' value needs to be accessed, it can be done with the syntax ${param.fields.path}.

  2. Create/modify the loc file to correspond with the column types specified in tool_data_table_conf.xml. In this example the loc file doesn't have to be changed, although the specification is going to change to <columns>value, dbkey, name, path</columns>.

  3. Modify the Bowtie wrapper to use the data table instead of the loc file directly, replacing this:

    <options from_file="bowtie_indices.loc">
      <column name="value" index="1" />
      <column name="name" index="0" />
    </options>

     with this:

    <options from_data_table="bowtie_indexes"/>

Example

Select a reference genome that is indexed for bowtie. (see Bowtie wrapper)

    <param name="index" type="select" label="Select a reference genome" help="if your genome of interest is not listed - contact Galaxy team">
        <options from_data_table="bowtie_indexes"/>
    </param>
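
The ${param.fields.path} syntax mentioned above can then be used in the command line. The sketch below assumes the data table declares <columns>value, dbkey, name, path</columns>; the program my_aligner, its options, and the reads parameter are hypothetical.

```xml
<!-- Select list populated from the "bowtie_indexes" data table. -->
<param name="index" type="select" label="Select a reference genome">
    <options from_data_table="bowtie_indexes"/>
</param>

<!-- Reads the "path" column of the selected row rather than the option value itself. -->
<command>my_aligner -x "${index.fields.path}" -r "${reads}" > "${out_file1}"</command>
```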



<column> tag set

Optionally contained within an <options> tag set - displays columns of values from a file stored locally or a dataset in the current history.

attribute | values | details | required | example
name | a string | The valid name of the desired column | yes | name="value"
index | a number | The index of the column in the referenced file or history item | yes | index="0"

Example

Show options from the dataset in the current history that has been selected as the value of the parameter named "input1".

    <options from_dataset="input1">
        <column name="name" index="0"/>
        <column name="value" index="0"/>
    </options>



<filter> tag set

Optionally contained within an <options> tag set - filter out values obtained from a locally stored file or a dataset in the current history.

attribute | values | details | required | example
type | data_meta, param_value, static_value, unique_value, multiple_splitter, attribute_value_splitter, add_value, remove_value, sort_by | The list of valid filter types is contained in the "filter_types" dictionary in the ~/lib/galaxy/tools/parameters/dynamic_options.py file | yes | type="data_meta"
name | a string | The name of the filter | no | name="dbkey"
column | a number | The column index (starting from 0) within the file that contains the values to be filtered | no | column="1"
ref | a string | The attribute name of the reference file or input dataset | no | ref="input1"
key | a string | For type "data_meta", the name of the metadata key of ref to filter by | no | key="species"
multiple | true, false | For types "data_meta" and "remove_value", whether option values are multiple. Columns will be split by separator. Defaults to "false" | no | multiple="true"
separator | a string | For types "data_meta", "multiple_splitter" and "remove_value", the column separator of the reference file or dataset. Defaults to "," | no | separator=";"

Example

Show only the options from the file "liftOver.loc" whose column 0 matches the "dbkey" metadata of the dataset selected for the parameter named "input".

    <param name="to_dbkey" type="select" label="To">
        <options from_file="liftOver.loc">
            <column name="name" index="1"/>
            <column name="value" index="2"/>
            <column name="dbkey" index="0"/>
            <filter type="data_meta" ref="input" key="dbkey" column="0" />
        </options>
    </param>

Example

Show the options contained in the file "encode_datasets.loc", which is locally stored in the configured tool_data_path directory, keeping only the rows whose "encode_group" column is "ALD" and whose "dbkey" column is "hg17".

    <options from_file="encode_datasets.loc">
        <column name="name" index="2"/>
        <column name="value" index="3"/>
        <column name="dbkey" index="1"/>
        <column name="encode_group" index="0"/>
        <column name="uid" index="3"/>
        <filter type="static_value" name="encode_group" value="ALD" column="0"/>
        <filter type="static_value" name="dbkey" value="hg17" column="1"/>
    </options>



<request_param_translation> tag set

See ~/tools/data_source/ucsc_tablebrowser.xml for an example of how to use this tag set. This tag set is used only in "data_source" tools ( the "tool_type" attribute value is "data_source" ). This tag set is contained within the <param> tag set - it contains a set of <request_param> tags.

<request_param> tag set

Contained within the <request_param_translation> tag set ( used only in "data_source" tools ) - the external data source application may send back parameter names like "GENOME" which must be translated to "dbkey" in Galaxy.

attribute

values

details

required

example

galaxy_name

URL, dbkey, organism, table, description, name, info, data_type

Each of these maps directly to a remote_name value

yes

galaxy_name="URL"

remote_name

a string *

* The string representing the name of the parameter in the remote data source

yes

remote_name="URL"

missing

a string

The default value to use for galaxy_name if the remote_name parameter is not included in the request

yes

missing=""

Example

    <request_param_translation>
        <request_param galaxy_name="organism" remote_name="org" missing="unknown species" />
    </request_param_translation>



<append_param> tag set

Optionally contained within the <request_param> tag set if galaxy_name="URL" - some remote data sources ( e.g., Gbrowse, Biomart ) send parameters back to Galaxy in the initial response that must be added to the value of "URL" prior to Galaxy sending the secondary request to the remote data source via URL.

attribute

values

details

required

example

separator

a string *

* The text to use to join the requested parameters together

yes

separator="&amp;"

first_separator

a string *

* The text to use to join the request_param parameters to the first requested parameter

no

first_separator="?"

join

a string *

* The text to use to join the param name to its value

yes

join="="


<value> tag set

Contained within the <append_param> tag set - allows for appending a param name / value pair to the value of URL.

attribute

values

details

required

example

name

a string *

* Any valid HTTP request parameter name. The name / value pair must be received from the remote data source and will be appended to the value of URL as something like "&_export=1"

yes

name="_export"

missing

a string *

* Must be a valid HTTP request parameter value

yes

missing="1"

Example

    <request_param_translation>
        <request_param galaxy_name="URL" remote_name="URL" missing="">
            <append_param separator="&amp;" first_separator="?" join="=">
                <value name="_export" missing="1" />
            </append_param>
        </request_param>
    </request_param_translation>



<value_translation> tag set

Optionally contained within the <request_param> tag set - the parameter value received from a remote data source may be named differently in Galaxy, and this tag set allows for the value to be appropriately translated.


<value> tag set

Contained within the <value_translation> tag set - allows for changing the data type value to something supported by Galaxy.

attribute

values

details

required

example

galaxy_value

a supported data type *

* The target value. e.g. For setting data format: the list of supported data formats is contained in the ~/config/datatypes_conf.xml.sample file

yes

galaxy_value="tabular"

remote_value

a string *

* The value supplied by the remote data source application

yes

remote_value="selectedFields"

Example

    <request_param_translation>
        <request_param galaxy_name="data_type" remote_name="hgta_outputType" missing="bed" >
            <value_translation>
                <value galaxy_value="tabular" remote_value="primaryTable" />
            </value_translation>
        </request_param>
    </request_param_translation>



<sanitizer> tag set

See ~/tools/filters/grep.xml for an example of how to use this tag set. This tag set is used to replace the basic parameter sanitization with custom directives. This tag set is contained within the <param> tag set - it contains a set of <valid> and <mapping> tags.

property

values

details

required

example

sanitize

True or False

is this parameter sanitized

no, default is True

<sanitizer sanitize="True"/>

invalid_char

string

character to replace invalid characters with

no, default is X

<sanitizer invalid_char="~"/>

<valid> tag set

Contained within the <sanitizer> tag set. Used to specify a list of allowed characters. Contains <add> and <remove> tags.

property

values

details

required

example

initial

string

initial characters to allow

no, default is string.letters + string.digits +" -=_.()/+*^,:?!"

<valid initial="none">

<add> and <remove> tag set

Contained within the <valid> tag set. Used to add or remove individual characters or preset lists of characters. Character must not be allowed as a valid input for the mapping to occur. Preset lists include default and none as well as those available from Python string constants (e.g. string.printable).

attribute

values

details

required

example

preset

none, default or available from string

Add or Remove these characters from the list of valid characters

no

<add preset="string.printable"/> or <remove preset="string.printable"/>

value

a character

A character to add or remove from the list of valid characters

no

<remove value="&quot;"/> or <add value="&quot;"/>

Example

    <param name="mystring" type="text" label="Say something interesting">
        <sanitizer invalid_char="">
            <valid initial="string.letters,string.digits">
                <add value="_" />
            </valid>
        </sanitizer>
    </param>

<mapping> tag set

Contained within the <sanitizer> tag set. Used to specify a mapping of disallowed character to replacement string. Contains <add> and <remove> tags.

property

values

details

required

example

initial

string

initial character mapping

no, default is galaxy.util.mapped_chars

<mapping initial="none">

<add> and <remove> tag set

Contained within the <mapping> tag set. Used to add or remove individual characters or preset lists of characters. The character must not be allowed as a valid input for the mapping to occur. Preset lists include default and none as well as those available from string.* (e.g. string.printable).

attribute

values

details

required

example

source

a character

Replace all occurrences of this character with the string of target

no

<add source="&quot;" target="\&quot;"/> or <remove source="&quot;"/>

target

a string

Replace all occurrences of source with this string

no

<add source="&quot;" target="\&quot;"/>

Example

    <sanitizer>
        <valid initial="string.printable">
            <remove value="&apos;"/>
        </valid>
        <mapping initial="none">
            <add source="&apos;" target=""/>
        </mapping>
    </sanitizer>



<configfiles> tag set

See ~/tools/maf/maf_filter.xml for an example of how this tag set is used in a tool. This tag set is a container for <configfile> tag sets - it defines an additional configuration section.


<configfile> tag set

This tag set is contained within the <configfiles> tag set. It allows for the creation of a temporary file for file-based parameter transfer.

attribute

values

details

required

example

name

a string *

* This value is the parameter name of the configuration file

yes

name="maf_filter_file"

filename

a string

Specify the name of the configuration file to create

no

filename="your_template.yaml"

Example

The following is taken from the ~/tools/plotting/xy_plot.xml tool config.

    <configfiles>
        <configfile name="script_file">
          ## Setup R error handling to go to stderr
          options( show.error.messages=F, error = function () { cat( geterrmessage(), file=stderr() ); q( "no", 1, F ) } )
          ## Determine range of all series in the plot
          xrange = c( NULL, NULL )
          yrange = c( NULL, NULL )
          #for $i, $s in enumerate( $series )
              s${i} = read.table( "${s.input.file_name}" )
              x${i} = s${i}[,${s.xcol}]
              y${i} = s${i}[,${s.ycol}]
              xrange = range( x${i}, xrange )
              yrange = range( y${i}, yrange )
          #end for
          ## Open output PDF file
          pdf( "${out_file1}" )
          ## Dummy plot for axis / labels
          plot( NULL, type="n", xlim=xrange, ylim=yrange, main="${main}", xlab="${xlab}", ylab="${ylab}" )
          ## Plot each series
          #for $i, $s in enumerate( $series )
              #if $s.series_type['type'] == "line"
                  lines( x${i}, y${i}, lty=${s.series_type.lty}, lwd=${s.series_type.lwd}, col=${s.series_type.col} )
              #elif $s.series_type.type == "points"
                  points( x${i}, y${i}, pch=${s.series_type.pch}, cex=${s.series_type.cex}, col=${s.series_type.col} )
              #end if
          #end for
          ## Close the PDF file
          devname = dev.off()
        </configfile>
    </configfiles>
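
The configfile is only useful once its generated path reaches the tool executable. The fragment below is a minimal sketch of that link, assuming the configfile's "name" value is available as a command line variable in the same way as other parameters. The tool id, the "title" parameter, and the direct Rscript invocation are hypothetical; the real xy_plot tool may invoke R through its own wrapper.

```xml
<tool id="configfile_sketch" name="Configfile sketch">
    <!-- $script_file is assumed to expand to the path of the temporary
         file generated from the configfile named "script_file" below -->
    <command>Rscript $script_file</command>
    <inputs>
        <param name="title" type="text" value="demo" label="Title"/>
    </inputs>
    <outputs>
        <data format="txt" name="out_file1"/>
    </outputs>
    <configfiles>
        <configfile name="script_file">
            ## Write the title chosen by the user into the output dataset
            cat( "${title}", file="${out_file1}" )
        </configfile>
    </configfiles>
</tool>
```

Passing a generated script this way keeps long or structured parameter sets off the command line itself.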



<outputs> tag set

Container tag set for the <data> tag set. The files created by tools as a result of their execution are named by Galaxy. You specify the number and type of your output files using the contained <data> tags. You must pass them to your tool executable on the command line using variables, just like the parameters described in the previous sections.


<data> tag set

This tag set is contained within the <outputs> tag set, and it defines the output data description for the files resulting from the tool's execution. The value of the attribute "label" can be acquired from input parameters or metadata in the same way that the command line parameters are ( discussed in the <command> tag set section above ).

attribute

values

details

required

example

name

a string *

* This attribute name must match the attribute name of the command line parameter defined for the output

yes

name="output1"

format

a supported data type

This is the data type of the output file. It can be one of the supported data types ( e.g., "tabular" ) or the format of the tool's input dataset ( e.g., format="input" )

yes

format="fasta"

format_source

name of an input data parameter

This sets the data type of the output file to be the same format as that of a tool input dataset. Useful when there are multiple inputs to match.

no

format_source="input2"

metadata_source

value of the input data parameter

This copies the metadata information from the tool's input dataset. This is particularly useful for interval data types where the order of the columns is not set.

no

metadata_source="input"

label

a string

This will be the label of the history item for the output data set. The string can include structure like ${<some param name>.<some attribute>} , as discussed for command line parameters in the <command> tag set section above.

no

label="Blat on ${database.value_label} "

from_work_dir

a string

Relative path to a file produced by the tool in its working directory. Output's contents are set to this file's contents.

no

from_work_dir="tool_output_file.txt"

hidden

True, true, False, false

Whether to hide dataset in the history view.

no

hidden="True"

Example

The following will create a dataset in the history panel whose data type is the same as that of the input dataset selected for the tool.

    <outputs>
        <data format="input" name="out_file1" metadata_source="input"/>
    </outputs>

Example

The following will create datasets in the history panel, setting the output data type to be the same as that of an input dataset named by the "format_source" attribute. Note that a conditional name is not included, so 2 separate conditional blocks should not contain parameters with the same name.

    <inputs>
        <!-- fasta may be an aligned fasta that subclasses Fasta -->
        <param name="fasta" type="data" format="fasta" label="fasta - Sequences"/>
        <conditional name="qual">
            <param name="add" type="select" label="Trim based on a quality file?" help="">
                <option value="no">no</option>
                <option value="yes">yes</option>
            </param>
            <when value="no"/>
            <when value="yes">
                <!-- qual454, qualsolid, qualillumina -->
                <param name="qfile" type="data" format="qual" label="qfile - a quality file"/>
            </when>
        </conditional>
    </inputs>
    <outputs>
        <data format_source="fasta" name="trim_fasta" label="${tool.name} on ${on_string}: trim.fasta"/>
        <data format_source="qfile" name="trim_qual" label="${tool.name} on ${on_string}: trim.qual">
            <filter>(qual['add'] == 'yes')</filter>
        </data>
    </outputs>

Example

The following will create a variable called $out_file1 with data type "pdf".

    <outputs>
        <data format="pdf" name="out_file1" />
    </outputs>

Example

Assume that the tool includes an input parameter named "database" which is a select list ( e.g., assume the following inputs ):

    <inputs>
        <param format="tabular" name="input" type="data" label="Input stuff"/>
        <param type="select" name="database" label="Database">
            <option value="alignseq.loc">Human (hg18)</option>
            <option value="faseq.loc">Fly (dm3)</option>
        </param>
    </inputs>

Assume that the user selects the first option in the $database select list. Then the following will ensure that the tool produces a tabular data set whose associated history item has the label "Blat on Human (hg18)".

    <outputs>
        <data format="input" name="output" label="Blat on ${database.value_label}" />
    </outputs>
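
The "from_work_dir" and "hidden" attributes from the table above have no example of their own, so here is a minimal sketch. The executable my_tool, its --input flag, and the file tool_log.txt are hypothetical; the tool is assumed to write both files into its working directory rather than to paths supplied by Galaxy.

```xml
<!-- my_tool is hypothetical: it is assumed to write tool_output_file.txt
     and tool_log.txt into its working directory -->
<command>my_tool --input $input</command>
<outputs>
    <!-- The contents of tool_output_file.txt become the contents of this output -->
    <data format="tabular" name="output" from_work_dir="tool_output_file.txt" />
    <!-- The log is collected the same way but hidden in the history view -->
    <data format="txt" name="log" from_work_dir="tool_log.txt" hidden="True" />
</outputs>
```

This suits executables that insist on fixed output file names and cannot be told where to write.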



<change_format> tag set

See ~/tools/extract/extract_genomic_dna.xml for an example of how this tag set is used in a tool. This tag set is optionally contained within the <data> tag set and is the container tag set for the following <when> tag set.


<when> tag set ( change_format )

If the input parameter named by "input" has the specified "value", the data type of the output dataset is changed to the specified "format".

attribute

values

details

required

example

input

a string *

* This value must be the attribute name of the desired input parameter

yes

input="out_format"

value

a string *

* This value is the value of the input parameter named by "input" that triggers the format change

yes

value="interval"

format

a string *

* This value must be a supported data type

yes

format="interval"

Example

Assume that your tool config includes the following select list parameter structure:

    <param name="out_format" type="select" label="Output data type">
        <option value="fasta">FASTA</option>
        <option value="interval">Interval</option>
    </param>

Then whenever the user selects the "interval" option from the select list, the following structure in your tool config will override the format="fasta" setting in the <data> tag set with format="interval".

    <outputs>
        <data format="fasta" name="out_file1">
            <change_format>
                <when input="out_format" value="interval" format="interval" />
            </change_format>
        </data>
    </outputs>



<discover_datasets> tag set

See Multiple Output Files#Number_of_Output_datasets_cannot_be_determined_until_tool_run for tips to use multiple datasets including discovered datasets. This tag set is optionally contained within the <data> tag set.

Code examples can be found at: ~/test/functional/tools/multi_output.xml, ~/test/functional/tools/multi_output_assign_primary.xml, and ~/test/functional/tools/multi_output_configured.xml
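
As a rough sketch of what such a declaration can look like: the fragment below assumes the "pattern", "directory", and "visible" attributes and the "__designation_and_ext__" named pattern as they are understood to appear in the test tools listed above. The output name and the directory name are hypothetical, so check the attribute names against those files before relying on them.

```xml
<outputs>
    <data format="txt" name="report">
        <!-- Assumed behavior: after the job finishes, files in the
             "split_outputs" subdirectory of the working directory whose
             names fit the pattern are added to the history as datasets -->
        <discover_datasets pattern="__designation_and_ext__" directory="split_outputs" visible="true" />
    </data>
</outputs>
```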


<actions> tag set

The <actions> tag set in the Bowtie wrapper is used in lieu of the deprecated <code> tag to set the dbkey of the output dataset. In bowtie_wrapper.xml (see below), according to the first action block, if refGenomeSource.genomeSource is "indexed" (not "history"), then it will assign the dbkey of the output file to be the same as that of the reference file. It does this by looking through the loc file, ignoring comment lines (those starting with #), finding the line whose column 1 holds the value selected in the index dropdown box, and using the dbkey in column 0 of that line.

If refGenomeSource.genomeSource is "history", it resorts to default behavior for Galaxy, which is that the output is assigned the same value as the first input that has a dbkey specified.

The second block would not be needed for most cases - it is required here to handle the specific case of a small reference file we use for functional testing. It says that if the dbkey has been set to "equCab2chrM" (that is what the <filter type="metadata_value"... column="1" /> tag checks), then it should be changed to "equCab2" (that is what the <option type="from_param" ... column="0" ...> tag does).

Example

    <actions>
        <conditional name="refGenomeSource.genomeSource">
            <when value="indexed">
                <action type="metadata" name="dbkey">
                    <option type="from_file" name="bowtie_indices.loc" column="0" offset="0">
                        <filter type="param_value" column="0" value="#" compare="startswith" keep="False"/>
                        <filter type="param_value" ref="refGenomeSource.index" column="1"/>
                    </option>
                </action>
            </when>
        </conditional>
        <!-- Special casing equCab2chrM to equCab2 -->
        <action type="metadata" name="dbkey">
            <option type="from_param" name="refGenomeSource.genomeSource" column="0" offset="0">
                <filter type="insert_column" column="0" value="equCab2chrM"/>
                <filter type="insert_column" column="0" value="equCab2"/>
                <filter type="metadata_value" ref="output" name="dbkey" column="1" />
            </option>
        </action>
    </actions>



<tests> tag set

Container tag set for the <test> tag sets. Any number of tests can be included, and each test is wrapped within separate <test> tag sets. Functional tests are executed via the ~/run_functional_tests.sh shell script. See Admin/Tools/Writing Tests.


<test> tag set

This tag set contains the necessary parameter values for executing the tool via the functional test framework.

Example

The following two tests will execute the ~/tools/filters/sorter.xml tool. Notice the way that the tool's inputs and outputs are defined.

    <tests>
        <test>
            <param name="input" value="1.bed"/>
            <param name="column" value="1"/>
            <param name="order" value="ASC"/>
            <param name="style" value="num"/>
            <output name="out_file1" file="sort1_num.bed"/>
        </test>
        <test>
            <param name="input" value="7.bed"/>
            <param name="column" value="1"/>
            <param name="order" value="ASC"/>
            <param name="style" value="alpha"/>
            <output name="out_file1" file="sort1_alpha.bed"/>
        </test>
    </tests>

Example

Test the execution of the MAF-to-FASTA converter ( ~/tools/maf/maf_to_fasta.xml ).

    <tests>
        <test>
            <param name="input1" value="3.maf" ftype="maf"/>
            <param name="species" value="canFam1"/>
            <param name="fasta_type" value="concatenated"/>
            <output name="out_file1" file="cf_maf2fasta_concat.dat" ftype="fasta"/>
        </test>
    </tests>

Example

This test demonstrates verifying specific properties about a test output instead of directly comparing it to another file. Here the file attribute is not specified and instead a series of assertions is made about the output.

    <test>
        <param name="input" value="maf_stats_interval_in.dat" />
        <param name="lineNum" value="99999"/>
        <output name="out_file1">
            <assert_contents>
                <has_text text="chr7" />
                <not_has_text text="chr8" />
                <has_text_matching expression="1274\d+53" />
                <has_line_matching expression=".*\s+127489808\s+127494553" />
                <!-- &#009; is XML escape code for tab -->
                <has_line line="chr7&#009;127471195&#009;127489808" />
                <has_n_columns n="3" />
            </assert_contents>
        </output>
    </test>



<param> tag set (functional tests)

This tag set defines the tool's input parameters for executing the tool via the functional test framework.

attribute

values

details

required

example

name

name of an input parameter

This value must match the name of the associated input parameter.

yes

name="input1"

value

a legal value of an input parameter

This value must be one of the legal values that can be assigned to an input parameter. ( Note: If a select option starts with '-' the test value should be preceded by a '+', e.g. <option value="-snp" would be: <param value="+-snp" )

yes

value="3.maf"

ftype

data type of the input file *

* This attribute name should be included only with the parameter that defines the input dataset for the tool. If this attribute name is not included, the functional test framework will attempt to determine the data type for the input dataset using the data type sniffers.

no

ftype="maf"

Example

The following defines the four input values that are passed to the ~/tools/filters/sorter.xml tool via the functional test framework.

    <param name="input" value="7.bed"/>
    <param name="column" value="1"/>
    <param name="order" value="ASC"/>
    <param name="style" value="alpha"/>



<output> tag set (functional tests)

This tag set defines the variable that names the output dataset for the functional test framework. The functional test framework will execute the tool using the parameters defined in the <param> tag sets and generate a temporary file, which will either be compared with the file named in the "file" attribute value or checked against assertions made by a child assert_contents tag to verify that the tool is functionally correct.

attribute

values

details

required

example

name

parameter name of the output file

This value is the same as the value of the "name" attribute of the <data> tag set contained within the tool's <outputs> tag set.

yes

name="outfile_1"

file

file name

This value is the name of the output file stored in the ~/test-data directory which will be used to compare the results of executing the tool via the functional test framework

yes, unless a child <assert_contents> tag is used instead

file="cf_maf2fasta_concat.dat"


<assert_contents> tag set (functional tests)

This tag set defines a sequence of checks or assertions to run against an output dataset for the functional test framework. This tag requires no attributes, but child tags should be used to define the assertions to make about the output dataset. The functional test framework makes it easy to extend Galaxy with such tags. The following table summarizes many of the default assertion tags that come with Galaxy; examples of each can be found below.

tag

description

example

has_text

Asserts the specified text appears in the output.

<has_text text="chr7" />

not_has_text

Asserts the specified text does not appear in the output.

<not_has_text text="chr8" /> 

has_text_matching

Asserts text matching the specified regular expression (expression) appears in the output.

<has_text_matching expression="1274\d+53" />

has_line_matching

Asserts a line matching the specified regular expression (expression) appears in the output.

<has_line_matching expression=".*\s+127489808\s+127494553" />

has_n_columns

Asserts tabular output contains the specified number (n) of columns.

<has_n_columns n="3" />

is_valid_xml

Asserts the output is a valid XML file.

<is_valid_xml />

has_element_with_path

Asserts the XML output contains at least one element (or tag) with the specified XPath-like path.

<has_element_with_path path="BlastOutput_param/Parameters/Parameters_matrix" />

has_n_elements_with_path

Asserts the XML output contains the specified number (n) of elements (or tags) with the specified XPath-like path

<has_n_elements_with_path n="9" path="BlastOutput_iterations/Iteration/Iteration_hits/Hit/Hit_num" />

element_text_is

Asserts the text of the XML element with the specified XPath-like path is the specified text.

<element_text_is path="BlastOutput_program" text="blastp" />

element_text_matches

Asserts the text of the XML element with the specified XPath-like path matches the regular expression defined by expression.

<element_text_matches path="BlastOutput_version" expression="BLASTP\s+2\.2.*" />

attribute_is

Asserts the XML attribute for the element (or tag) with the specified XPath-like path is the specified text.

<attribute_is path="outerElement/innerElement1" attribute="foo" text="bar" />

attribute_matches

Asserts the XML attribute for the element (or tag) with the specified XPath-like path matches the regular expression specified by expression

<attribute_matches path="outerElement/innerElement2" attribute="foo2" expression="bar\d+" />

element_text

This tag allows the developer to recursively specify additional assertions as child elements about just the text contained in the element specified by the XPath-like path.

<element_text path="BlastOutput_iterations/Iteration/Iteration_hits/Hit/Hit_def"><not_has_text text="EDK72998.1" /></element_text>

Example

    <output name="out_file1">
        <assert_contents>
            <has_text text="chr7" />
            <not_has_text text="chr8" />
            <has_text_matching expression="1274\d+53" />
            <has_line_matching expression=".*\s+127489808\s+127494553" />
            <!-- &#009; is XML escape code for tab -->
            <has_line line="chr7&#009;127471195&#009;127489808" />
            <has_n_columns n="3" />
        </assert_contents>
    </output>

Example

    <output name="out_file1">
        <assert_contents>
            <is_valid_xml />
            <has_element_with_path path="BlastOutput_param/Parameters/Parameters_matrix" />
            <has_n_elements_with_path n="9" path="BlastOutput_iterations/Iteration/Iteration_hits/Hit/Hit_num" />
            <element_text_matches path="BlastOutput_version" expression="BLASTP\s+2\.2.*" />
            <element_text_is path="BlastOutput_program" text="blastp" />
            <element_text path="BlastOutput_iterations/Iteration/Iteration_hits/Hit/Hit_def">
                <not_has_text text="EDK72998.1" />
                <has_text_matching expression="ABK[\d\.]+" />
            </element_text>
        </assert_contents>
    </output>

Example

    <output name="out_file1">
        <assert_contents>
            <attribute_is path="outerElement/innerElement1" attribute="foo" text="bar" />
            <attribute_matches path="outerElement/innerElement2" attribute="foo2" expression="bar\d+" />
        </assert_contents>
    </output>


<page> tag set

This tag set is deprecated and not recommended for new tools. In older tools, if you needed to split your interface over multiple pages, you could do so by wrapping each page with a <page></page> tag and putting them in order in the XML file.

To create a two-page interface:

    <page>
        <!-- tag sets for the first page -->
    </page>

    <page>
        <!-- tag sets for the second page -->
    </page>



<code> tag set

Deprecated - do not use this unless absolutely necessary. This tag set provides detailed control of the way the tool is executed. This (optional) code can be deployed in a separate file in the same directory as the tool's config file. These hooks are being replaced by new tool config features and methods in the ~/lib/galaxy/tools/__init__.py code file.

attribute

values

details

required

example

file

a string *

This value is the name of the executable code file, and is called in the exec_before_process(), exec_before_job(), exec_after_process() and exec_after_job() methods.

yes

file="extract_genomic_dna_code.py"

Example

The following is taken from the ~/tools/new_operations/coverage.xml tool config.

    <code file="operation_filter.py"/>



<requirements> tag set

See ~/tools/extract/phastOdds/phastOdds_tool.xml for an example of how this tag set is used in a tool. This is a container tag set for the <requirement> tag set described below.


<requirement> tag set

This tag set is contained within the <requirements> tag set. Third party programs or modules that the tool depends upon (and which are not distributed with Galaxy) are included in this tag set. The intention is that when Galaxy starts it can check whether the required programs or modules are available, and if not this tool will not be loaded. The Galaxy Tool Shed uses package requirements as part of the dependency management, see Tool Shed dependency management.

attribute

values

details

required

example

type

package, set_environment

This value defines the type of third-party module required by this tool

yes

type="package"

version

string

Required for package type requirements

no

version="0.1.18"

Note: Earlier versions of this page also listed 'python-module' and 'binary' as possible values for the 'type' attribute, but the current version of Galaxy appears to just ignore those requirement types. See Tool Shed dependency management on how to add such dependencies.

Example

This example shows a tool that requires the samtools 0.1.18 package via the Tool Shed (see Tool Shed dependency management).

    <requirements>
        <requirement type="package" version="0.1.18">samtools</requirement>
    </requirements>

Example

This example shows a tool that requires R version 2.15.1. The tool_dependencies.xml file should contain matching declarations for Galaxy to actually install the R runtime.

    <requirements>
        <requirement type="set_environment">R_SCRIPT_PATH</requirement>
        <requirement type="package" version="2.15.1">R</requirement>
    </requirements>
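
For orientation, the matching Tool Shed declarations for the two requirements above might look roughly like the sketch below. The element and attribute names follow the Tool Shed dependency format as best understood and should be treated as assumptions; the install actions are deliberately left as a placeholder. See Tool Shed dependency management for the authoritative syntax.

```xml
<?xml version="1.0"?>
<tool_dependency>
    <!-- Matches <requirement type="package" version="2.15.1">R</requirement> -->
    <package name="R" version="2.15.1">
        <install version="1.0">
            <actions>
                <!-- download and build steps for R would go here -->
            </actions>
        </install>
    </package>
    <!-- Matches <requirement type="set_environment">R_SCRIPT_PATH</requirement> -->
    <set_environment version="1.0">
        <environment_variable name="R_SCRIPT_PATH" action="set_to">$REPOSITORY_INSTALL_DIR</environment_variable>
    </set_environment>
</tool_dependency>
```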



<stdio>, <regex>, and <exit_code> tag sets

Tools write the bulk of useful data to datasets, but they can also write messages to standard I/O (stdio) channels known as standard output (stdout) and standard error (stderr). Both stdout and stderr are typically written to the executing program's console or terminal. Previous versions of Galaxy checked stderr for execution errors - if any text showed up on stderr, then the tool's execution was marked as failed. However, many tools write messages to stderr that are not errors, and using stderr allows programs to redirect other interesting messages to a separate file. Programs may also exit with codes that indicate success or failure. One convention is for programs to return 0 on success and a non-zero exit code on failure.

Galaxy currently supports using regular expressions to scan stdout and stderr, and it also allows exit codes to be scanned for ranges. The <stdio> tag has two subtags, <regex> and <exit_code>, to define regular expressions and exit code processing, respectively. They are defined below. If a tool does not have any valid <regex> or <exit_code> tags, then Galaxy will use the previous technique for finding errors - any text on stderr indicates an error, and neither stdout nor the tool's exit code will be checked.

Note the order in which exit codes and regular expressions are applied and how the processing stops. Exit code rules are applied before regular expression rules. The rationale is that exit codes are more clearly defined and are easier to check computationally, so they are applied first. (Feedback is welcome on this; the author has considered eliminating this constraint.) Exit code rules are applied in the order in which they appear in the tool's configuration file, and regular expressions are also applied in the order in which they appear in the tool's configuration file. However, once a rule is triggered that causes a fatal error, no further rules are checked.
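The ordering rules above can be illustrated with a small sketch. The descriptions and the matched text are hypothetical. Although the <regex> rule is listed first in the file, the <exit_code> rules are still checked before it.

```xml
<stdio>
    <!-- Checked last, even though it is listed first: regular expression rules run after exit code rules. -->
    <regex match="out of memory" source="stderr" level="fatal" description="The tool ran out of memory" />
    <!-- Checked first, in the order listed. -->
    <exit_code range="1" level="warning" description="Finished with warnings" />
    <exit_code range="2:" level="fatal" description="Tool failed" />
</stdio>
```

With this configuration, an exit code of 2 triggers the fatal exit code rule, so the regular expression is never checked. An exit code of 1 only registers a warning, so processing continues on to the regular expression.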

Exit code ranges and output regular expressions are defined below.

<exit_code> tag set

Tools may use exit codes to indicate specific execution errors. Many programs use 0 to indicate success and non-zero exit codes to indicate errors. Galaxy allows each tool to specify exit codes that indicate errors. Each <exit_code> tag defines a range of exit codes, and each range can be associated with a description of the error (e.g., "Out of Memory", "Invalid Sequence File") and an error level. The description just describes the condition and can be anything. The error level is either a warning or a fatal error. A warning means that stderr will be updated with the error's description. A fatal error means that the tool's execution will be marked as having an error and the workflow will stop. Note that, if the error level is not supplied, then a fatal error is assumed to have occurred.

The exit code's range can be any consecutive group of integers. More advanced ranges, such as noncontiguous ranges, are currently not supported. Ranges can be specified in the form "m:n", where m is the start integer and n is the end integer. If ":n" is specified, then the exit code will be compared against all integers less than or equal to n. If "m:" is used, then the exit code will be compared against all integers greater than or equal to m. If the exit code matches, then the error level is applied and the error's description is added to stderr. If a tool's exit code does not match any of the supplied <exit_code> tags' ranges, then no errors are applied to the tool's execution.

Note that most Unix and Linux variants only support non-negative integers from 0 to 255 for exit codes. If an exit code falls outside the range 0 to 255, the usual convention is to use only the lower 8 bits for the exit code. The only known exception is if a job is broken into subtasks using the tasks runner and one of those tasks is stopped with a POSIX signal. (Note that signals should be used as a last resort for terminating processes.) In those cases, the task will receive -1 times the signal number. For example, suppose that a job uses the tasks runner and 8 tasks are created for the job. If one of the tasks hangs, then a sysadmin may choose to send the "kill" signal, SIGKILL, to the process. In that case, the task (and its job) will exit with an exit code of -9. More on POSIX signals can be found at http://en.wikipedia.org/wiki/Unix_signal as well as in the man pages on "signal".
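For tools run through the tasks runner, the negative exit codes described above could be caught with an open-ended ":n" range. This is only a sketch, and the descriptions are illustrative placeholders.

```xml
<stdio>
    <!-- ":-1" matches every exit code less than or equal to -1, e.g. -9 for a task stopped with SIGKILL. -->
    <exit_code range=":-1" level="fatal" description="Task was terminated by a signal" />
    <!-- "1:" matches every exit code greater than or equal to 1. -->
    <exit_code range="1:" level="fatal" description="Tool exited with a non-zero status" />
</stdio>
```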

The <exit_code> tag's supported attributes are as follows:

  • range: This indicates the range of exit codes to check. The range can be one of the following:

    • n: the exit code will only be compared to n;

    • [m:n]: the exit code must be greater than or equal to m and less than or equal to n;

    • [m:]: the exit code must be greater than or equal to m;

    • [:n]: the exit code must be less than or equal to n.

  • level: This indicates the error level of the exit code. The level can have one of two values:

    • warning: If an exit code falls in the given range, then a description of the error will be added to the beginning of stderr. A warning-level error will not cause the tool to fail.

    • fatal: If an exit code falls in the given range, then a description of the error will be added to the beginning of stderr. A fatal-level error will cause the tool to fail. If no level is specified, then the fatal level is assumed.

  • description: This is an optional description of the error that corresponds to the exit code.

The following is an example of the <exit_code> tag:

<stdio>
    <exit_code range="2"   level="fatal"   description="Out of Memory" />
    <exit_code range="3:5" level="warning" description="Low disk space" />
    <exit_code range="6:"  level="fatal"   description="Bad input dataset" />
</stdio>

If the tool returns 0 or 1, then the tool will not be marked as having an error. If the exit code is 2, then the tool will fail with the description "Out of Memory" added to stderr. If the tool returns 3, 4, or 5, then the tool will not be marked as having failed, but "Low disk space" will be added to stderr. Finally, if the tool returns any number greater than or equal to 6, then the description "Bad input dataset" will be added to stderr and the tool will be marked as having failed.

<regex> tag set

A regular expression defines a pattern of characters. The patterns include the following:

  • GCTA, which matches on the fixed string "GCTA";
  • [abcd], which matches on the characters a, b, c, or d;
  • [CG]{12}, which matches on 12 consecutive characters that are C or G;
  • a.*z, which matches on the character "a", followed by 0 or more characters of any type, followed by a "z";
  • ^X, which matches the letter X at the beginning of a string;
  • Y$, which matches the letter Y at the end of a string.

There are many more possible regular expressions. A reference to all supported regular expressions can be found under Python Regular Expression Syntax.

A regular expression includes the following attributes:

  • source: This tells whether the regular expression should be matched against stdout, stderr, or both. If this attribute is missing or is incorrect, then both stdout and stderr will be checked. The source can be one of the following values:

    • stdout: the regular expression will be applied to stdout;

    • stderr: the regular expression will be applied to stderr;

    • both: the regular expression will be applied to both stderr and stdout (which is the default case).

  • match: This is the regular expression that will be used to match against stdout and/or stderr. If the <regex> tag does not contain the match attribute, then the <regex> tag will be ignored. The regular expression can be any valid Python regular expression. All regular expressions are matched case-insensitively. For example, if match contains the regular expression "actg", then the regular expression will match against "actg", "ACTG", "AcTg", and so on. Also note that, if double quotes (") are to be used in the match attribute, then the value &quot; can be used in place of double quotes. Likewise, if single quotes (') are to be used in the match attribute, then the value &apos; can be used if necessary.

  • level: This works very similarly to the <exit_code> tag, except that, when a regular expression matches against its source, the description is added to the beginning of the source. For example, if stdout matches on a regular expression, then the regular expression's description is added to the beginning of stdout (instead of stderr). The level can be log, warning or fatal as described below.

    • log and warning: If the regular expression matches against its source input (i.e., stdout and/or stderr), then a description of the error will be added to the beginning of the source, prepended with either 'Log:' or 'Warning:'. A log-level/warning-level error will not cause the tool to fail.

    • fatal: If the regular expression matches against its source input, then a description of the error will be added to the beginning of the source. A fatal-level error will cause the tool to fail. If no level is specified, then the fatal level is assumed.

  • description: Just like its <exit_code> counterpart, this is an optional description of the regular expression that has matched.

The following is an example of regular expressions that may be used:

<stdio>
    <regex match="low space"
           source="both"
           level="warning"
           description="Low space on device" />
    <regex match="error"
           source="stdout"
           level="fatal"
           description="Unknown error encountered" />
    <regex match="[CG]{12}"
           description="Fatal error - CG island 12 nts long found" />
    <regex match="^Branch A"
           level="warning"
           description="Branch A was taken in execution" />
</stdio>

The regular expression matching proceeds as follows. First, if either stdout or stderr matches on "low space", then a warning is registered. For example, if stdout contains the string "---LOW SPACE---", then the string "Warning: Low space on device" is added to the beginning of stdout. The same applies to stderr if it contains the string "low space". Since only a warning could have occurred, the processing continues.

Next, the regular expression "error" is matched only against stdout. If stdout contains the string "error" regardless of its capitalization, then a fatal error has occurred and the processing stops. In that case, stdout would be prepended with the string "Fatal: Unknown error encountered". Note that, if stderr contained "error", "ERROR", or "ErRor" then it would not matter - stderr was not being scanned.

If the second regular expression did not match, then the third regular expression is checked. The third regular expression does not contain an error level, so an error level of "fatal" is assumed. The third regular expression also does not contain a source, so both stdout and stderr are checked. The third regular expression looks for 12 consecutive "C"s or "G"s in any order and in uppercase or lowercase. If stdout contained "cgccGGCCcGGcG" or stderr contained "CCCCCCgggGGG", then the regular expression would match, the tool would be marked with a fatal error, and the stream that contained the 12-nucleotide CG island would be prepended with "Fatal: Fatal error - CG island 12 nts long found".

Finally, if the tool did not match any of the fatal errors, then the fourth regular expression is checked. Since no source is specified, both stdout and stderr are checked. If "Branch A" is at the beginning of stdout or stderr, then a warning will be registered and the source that contained "Branch A" will be prepended with the warning "Warning: Branch A was taken in execution".
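The quoting rule for the match attribute can be sketched as follows. Both the message being matched and the description are hypothetical. The point is only that a double quote inside a double-quoted XML attribute is written as &quot;.

```xml
<stdio>
    <!-- Matches stderr text such as: file "reads.fasta" not found -->
    <regex match="file &quot;.*&quot; not found"
           source="stderr"
           level="fatal"
           description="An input file could not be found" />
</stdio>
```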


<help> tag set

This tag set includes all of the necessary details of how to use the tool. This tag set should be included as the last tag set in the tool config. Tool help is written in reStructuredText. Included here is only an overview of a subset of features. For more information see http://docutils.sourceforge.net/docs/ref/rst/restructuredtext.html.

tag                                                        details
.. class:: warningmark                                     a yellow warning symbol
.. class:: infomark                                        a blue information symbol
.. image:: path-of-the-file.png :height: 500 :width: 600   insert a png file of height 500 and width 600 at this position
**bold**                                                   bold
*italic*                                                   italic
*                                                          list
-                                                          list
::                                                         paragraph
-----                                                      a horizontal line

Example

Show a warning sign to remind users that this tool accepts FASTA format files only, followed by an example of the query sequence and a figure.

<help>

.. class:: warningmark

**TIP** This tool requires *fasta* format.

----

**Example**

Query sequence::

    >seq1
    ATCG...

.. image:: my_figure.png
    :height: 500
    :width: 600

</help>

<citations> tag set

Tool files may declare one citations element. Each citations element can contain one or more citation tag elements - each of which specifies tool citation information using either a DOI or a BibTeX entry.

These citations will appear at the bottom of the tool form in a formatted way, but the user will also have the option to select raw BibTeX for copying and pasting. Likewise, the history menu includes an option allowing users to aggregate all such citations across an analysis into a single list.

BibTeX entries for citations annotated with DOIs will be fetched by Galaxy from http://dx.doi.org/ and cached.

<citations>
    <!-- Example of annotating a citation using a DOI. -->
    <citation type="doi">10.1093/bioinformatics/btq281</citation>

    <!-- Example of annotating a citation using a BibTeX entry. -->
    <citation type="bibtex">@ARTICLE{Kim07aninterior-point,
    author = {Seung-jean Kim and Kwangmoo Koh and Michael Lustig and Stephen Boyd and Dimitry Gorinevsky},
    title = {An interior-point method for large-scale l1-regularized logistic regression},
    journal = {Journal of Machine Learning Research},
    year = {2007},
    volume = {8},
    pages = {1519-1555}
    }</citation>
</citations>

For more implementation information, see the pull request adding this feature. For more examples of how to add citations to tools, check out the following changesets adding them to the NCBI BLAST+ suite, phenotype association tools, MAF suite, and MACS2 suite.

This feature was added in the August 2014 release of Galaxy. Tools annotated with citations will work in older releases of Galaxy, but no citation information will be available to the end user.

Reusing Repeated Configuration Elements

Frequently, tools may require the same XML fragments to be repeated within a file (for instance similar conditional branches, repeated options, etc.) or between tools in the same repository (for instance, nearly all of the GATK tools contain the same standard options). As of the April 1st, 2013 release of Galaxy, a macro system has been implemented to begin to address this problem.

Direct XML Macros

The following examples are taken from Pull Request 129, the initial implementation. Prior to the inclusion of macros, the tophat2 wrapper defined several outputs, each of which had the following identical actions block associated with it:

<actions>
  <conditional name="refGenomeSource.genomeSource">
    <when value="indexed">
      <action type="metadata" name="dbkey">
        <option type="from_data_table" name="tophat2_indexes" column="1" offset="0">
          <filter type="param_value" column="0" value="#" compare="startswith" keep="False"/>
          <filter type="param_value" ref="refGenomeSource.index" column="0"/>
        </option>
      </action>
    </when>
    <when value="history">
      <action type="metadata" name="dbkey">
        <option type="from_param" name="refGenomeSource.ownFile" param_attribute="dbkey" />
      </action>
    </when>
  </conditional>
</actions>

To reuse this actions definition, a macros section was first defined in the tophat2_wrapper.xml file.

<tool>
   ...
   <macros>
     <xml name="dbKeyActions">
       <actions><!-- Whole big example above. -->
         ....
       </actions>
     </xml>
   </macros>

With this in place, each output data element can include this block using the expand XML element as follows:

<data format="bed" name="insertions" label="${tool.name} on ${on_string}: insertions" from_work_dir="tophat_out/insertions.bed">
    <expand macro="dbKeyActions" />
</data>
<data format="bed" name="deletions" label="${tool.name} on ${on_string}: deletions" from_work_dir="tophat_out/deletions.bed">
    <expand macro="dbKeyActions" />
</data>
<data format="bed" name="junctions" label="${tool.name} on ${on_string}: splice junctions" from_work_dir="tophat_out/junctions.bed">
    <expand macro="dbKeyActions" />
</data>
<data format="bam" name="accepted_hits" label="${tool.name} on ${on_string}: accepted_hits" from_work_dir="tophat_out/accepted_hits.bam">
    <expand macro="dbKeyActions" />
</data>

This reduced the size of the XML file by dozens of lines and reduces the long-term maintenance burden associated with copied and pasted code.

Imported Macros

The macros element described above can also contain any number of import elements. This allows a directory or repository of tool XML files to contain shared macro definitions that can be used by any number of actual tool files in that directory or repository.

Revisiting the tophat example, all three tophat wrappers (tophat_wrapper.xml, tophat_color_wrapper.xml, and tophat2_wrapper.xml) share some common functionality. To reuse XML elements between these files, a tophat_macros.xml file was added to that directory. The following block is a simplified version of that macros file's contents:

<macros>
  <xml name="own_junctionsConditional">
    <conditional name="own_junctions">
      <param name="use_junctions" type="select" label="Use Own Junctions">
        <option value="No">No</option>
        <option value="Yes">Yes</option>
      </param>
      <when value="Yes">
        <conditional name="gene_model_ann">
          <param name="use_annotations" type="select" label="Use Gene Annotation Model">
            <option value="No">No</option>
            <option value="Yes">Yes</option>
          </param>
          <when value="No" />
          <when value="Yes">
            <param format="gtf,gff3" name="gene_annotation_model" type="data" label="Gene Model Annotations" help="TopHat will use the exon records in this file to build a set of known splice junctions for each gene, and will attempt to align reads to these junctions even if they would not normally be covered by the initial mapping."/>
          </when>
        </conditional>
        <expand macro="raw_juncsConditional" />
        <expand macro="no_novel_juncsParam" />
      </when>
      <when value="No" />
    </conditional> <!-- /own_junctions -->
  </xml>
  <xml name="raw_juncsConditional">
    <conditional name="raw_juncs">
      <param name="use_juncs" type="select" label="Use Raw Junctions">
        <option value="No">No</option>
        <option value="Yes">Yes</option>
      </param>
      <when value="No" />
      <when value="Yes">
        <param format="interval" name="raw_juncs" type="data" label="Raw Junctions" help="Supply TopHat with a list of raw junctions. Junctions are specified one per line, in a tab-delimited format. Records look like: [chrom] [left] [right] [+/-] left and right are zero-based coordinates, and specify the last character of the left sequenced to be spliced to the first character of the right sequence, inclusive."/>
      </when>
    </conditional>
  </xml>
  <xml name="no_novel_juncsParam">
    <param name="no_novel_juncs" type="select" label="Only look for supplied junctions">
      <option value="No">No</option>
      <option value="Yes">Yes</option>
    </param>
  </xml>
</macros>

Any tool definition in that directory can use the macros contained therein once imported as shown below.

<tool>
  ...
  <macros>
    <import>tophat_macros.xml</import>
  </macros>
  ...
  <inputs>
    <expand macro="own_junctionsConditional" />
    ...

This example also demonstrates that macros may themselves expand macros. Due to a bug in the original implementation, however, this only works to a depth of 1, i.e. a macro may not expand a macro that itself expands a macro. Pull Request #140 (not merged into Galaxy) fixes this.

Parameterizing XML Macros

In some cases, tools may contain similar but not identical definitions. Some parameterization can be performed by declaring expand elements with child elements and expanding them in the macro definition with a yield element.

For instance, previously the tophat wrapper contained the following definition:

<conditional name="refGenomeSource">
  <param name="genomeSource" type="select" label="Will you select a reference genome from your history or use a built-in index?" help="Built-ins were indexed using default options">
    <option value="indexed">Use a built-in index</option>
    <option value="history">Use one from the history</option>
  </param>
  <when value="indexed">
    <param name="index" type="select" label="Select a reference genome" help="If your genome of interest is not listed, contact the Galaxy team">
      <options from_data_table="tophat_indexes_color">
        <filter type="sort_by" column="2"/>
        <validator type="no_options" message="No indexes are available for the selected input dataset"/>
      </options>
    </param>
  </when>
  <when value="history">
    <param name="ownFile" type="data" format="fasta" metadata_name="dbkey" label="Select the reference genome" />
  </when>  <!-- history -->
</conditional>  <!-- refGenomeSource -->

and the tophat2 wrapper contained the highly analogous definition:

<conditional name="refGenomeSource">
  <param name="genomeSource" type="select" label="Will you select a reference genome from your history or use a built-in index?" help="Built-ins were indexed using default options">
    <option value="indexed">Use a built-in index</option>
    <option value="history">Use one from the history</option>
  </param>
  <when value="indexed">
    <param name="index" type="select" label="Select a reference genome" help="If your genome of interest is not listed, contact the Galaxy team">
      <options from_data_table="tophat2_indexes_color">
        <filter type="sort_by" column="2"/>
        <validator type="no_options" message="No indexes are available for the selected input dataset"/>
      </options>
    </param>
  </when>
  <when value="history">
    <param name="ownFile" type="data" format="fasta" metadata_name="dbkey" label="Select the reference genome" />
  </when>  <!-- history -->
</conditional>  <!-- refGenomeSource -->

These blocks differ only in the from_data_table attribute on the options element. To capture this pattern, tophat_macros.xml contains the following macro definition:

<xml name="refGenomeSourceConditional">
  <conditional name="refGenomeSource">
    <param name="genomeSource" type="select" label="Use a built in reference genome or own from your history" help="Built-ins genomes were created using default options">
      <option value="indexed" selected="True">Use a built-in genome</option>
      <option value="history">Use a genome from history</option>
    </param>
    <when value="indexed">
      <param name="index" type="select" label="Select a reference genome" help="If your genome of interest is not listed, contact the Galaxy team">
        <yield />
      </param>
    </when>
    <when value="history">
      <param name="ownFile" type="data" format="fasta" metadata_name="dbkey" label="Select the reference genome" />
    </when>  <!-- history -->
  </conditional>  <!-- refGenomeSource -->
</xml>

Notice the yield statement in lieu of an options declaration. This allows the nested options element to be declared when expanding the macro. The following expand declarations have replaced the original conditional elements.

<expand macro="refGenomeSourceConditional">
  <options from_data_table="tophat_indexes">
    <filter type="sort_by" column="2"/>
    <validator type="no_options" message="No genomes are available for the selected input dataset"/>
  </options>
</expand>

<expand macro="refGenomeSourceConditional">
  <options from_data_table="tophat2_indexes">
    <filter type="sort_by" column="2"/>
    <validator type="no_options" message="No genomes are available for the selected input dataset"/>
  </options>
</expand>

Macro Token

You can define a reusable text token with the token element:

<token name="@IS_PART_OF_VCFLIB@">is a part of VCFlib toolkit developed by Erik Garrison (https://github.com/ekg/vcflib).</token>

and then reference the token elsewhere in the file like this:

Vcfallelicprimitives @IS_PART_OF_VCFLIB@
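As a sketch of how the pieces might fit together, the token can be declared inside the macros element and then referenced from the help text. This assumes the token lives in the tool's own macros section. The tool id and name below are placeholders, and the rest of the tool definition is omitted.

```xml
<tool id="vcfallelicprimitives" name="VcfAllelicPrimitives">
    <macros>
        <!-- The token name is wrapped in @ signs so that it stands out from ordinary text. -->
        <token name="@IS_PART_OF_VCFLIB@">is a part of VCFlib toolkit developed by Erik Garrison (https://github.com/ekg/vcflib).</token>
    </macros>
    <!-- inputs, outputs, and command omitted -->
    <help>
Vcfallelicprimitives @IS_PART_OF_VCFLIB@
    </help>
</tool>
```

After token replacement, the help text would read "Vcfallelicprimitives is a part of VCFlib toolkit developed by Erik Garrison (https://github.com/ekg/vcflib)."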

Miscellaneous tips and tricks

If you need to label a dataset with its real name (as displayed in the history):

<outputs>
   <data name="output" format="tabular" label="SEQLEN of ${input1.name}" />
</outputs>

If you need to find out the data type of a dataset (here exemplified by putting it in a configfile):

<configfiles>
   <configfile name="parametersfile">
      input1_datatype == ${input1.ext}
   </configfile>
</configfiles>

If you want to use XSD to validate your tool XML you can try the following project by Jean-Frédéric: https://github.com/JeanFred/Galaxy-XSD