Support

Galaxy has many options for getting and giving help.

Before posting a question to one of the Mailing Lists, we ask that you please first review the support resources summarized here and search with our Custom Tools to see if the same question has come up in the past.

If you have help to offer other Galaxy users, please dive right in and reply to questions on the Mailing Lists, submit tools to the Tool Shed, and/or add your expertise into our Wiki! Galaxy is a community of scientists and developers working together and contributions are most welcome.


Using Galaxy

Learning Hub

See our Learning hub for key coverage of Galaxy user interface concepts, data, and tools. Watch the short Learn screencast for a learning resource overview.

Screencasts

Screencast videos demonstrate the step-by-step for a range of topics. Some are quick. Some are not. All are packed with tips and methods usable across analysis workflows.

Tutorials

Tutorials embedded inside of Galaxy. Created by Galaxy's scientists, these are packed with example datasets, histories, visualizations, and workflows to import and experiment with.

FAQs

FAQs are always under active development. Both scientific and technical users are encouraged to add in their own tips and best practices for new and existing data types, tools, workflows, set-up, and administration.

Custom Searches

Finally, there are the Galaxy Custom Searches. The MailingList search finds all related prior Q & A from the galaxy-user and galaxy-dev mailing lists. The UseGalaxy search finds all online resources for information about using Galaxy. This includes this wiki, tool help and shared Galaxy objects at UseGalaxy.org, and Mailing Lists.

Mailing Lists

Galaxy has two public mailing lists for questions, one private mailing list for bug reports, and one announcement mailing list. Please do not post questions through the Galaxy Issue Board; these will only be redirected to the mailing lists. Manage subscriptions and learn more about these lists at the Mailing Lists home page. See also:

Galaxy Issue Board

Galaxy Issue Board

The Galaxy Project uses Trello for issue tracking and feature requests. The Galaxy Issue Board supports issue creation, commenting, and voting on issues.

About Galaxy

Galaxy is an open, web-based platform for accessible, reproducible, and transparent computational biomedical research.

In addition to using the public Galaxy server, you can also install your own instance of Galaxy, or create an instance of Galaxy on the cloud. Another option is to use one of the ever-increasing number of Public Galaxy Servers hosted by other organizations.

The Galaxy Framework is, at the highest level, a set of reusable software components. Learn more about Galaxy's Architecture.





Help! Common Solutions


Galaxy is designed to have a simplified tool interface while still retaining maximum functionality. Many tools have "most common" default parameter settings with "full parameter" options for those who need them. Documentation on the tool form itself covers parameters and the expected input dataset format, along with links to publications and 3rd party web sites/support.

Finding a tool

In the left tool panel, click on the top Options menu to open the tool search. Type in a tool name or data type. Shorter keywords find more choices.

Loading data

Data is loaded using the tools in the Get Data tool group. To load your own local data or data from another source, use the tool Get Data → Upload - watch the screencast to see how it works.

  • Each file loaded creates one dataset in the history.
  • The maximum size limit is 50G (uncompressed).
  • Most individual file compression formats are supported, but .tar archives are not.

Get Data → Upload methods:

  • Load by "browsing" for a local file. Only good for very small datasets (the limit is < 2G, but this method often works best for much smaller files). If you are having problems with this method, try FTP.

  • Load using an HTTP URL or FTP URL.

  • Load a few lines of plain text.
  • Load using FTP, either from the command line or with a desktop client.

Upload tips

  • If your data quota is at its limit, no new data can be loaded. Disk usage and quotas are reported at User → Preferences when logged in.

  • Password protected data will require a special URL format. Ask the data source. Double check that it is publicly accessible.

  • Use FTP, not SFTP. Check with local admin if not sure.

  • No HTML content. The loading error generated may state this. Remove HTML fields from your dataset before loading into Galaxy or omit HTML fields from the query if importing from a data source (such as Biomart).

  • Compression types .gz/.gzip, .bz/.bzip, .bz2/.bzip2, and single-file .zip are supported.

  • Only the first file in any compressed archive will load as a dataset.

  • Data must be < 50G (uncompressed) to be successfully uploaded and added as a dataset to a history, from any source.

  • Is the problem the dataset format or the assigned datatype? Can this be corrected by editing the datatype or converting formats? See Learn/Managing Datasets for help or watch the screencast above for a how-to example.

  • Problems in the first step working with your loaded data? It may not have uploaded completely. If you used an FTP client, the transfer message will indicate if a load was successful or not and can often restart interrupted loads. This makes FTP a great choice for slower connections, even when loading small files.
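
Several of the tips above can be checked locally before starting an upload. The following is a minimal sketch, not part of Galaxy: the function name `upload_warnings` and the constants are hypothetical, "G" is assumed to mean GiB, and the on-disk size of a compressed file underestimates its uncompressed size.

```python
import os

# Assumed limits, taken from the tips above; "G" is treated as GiB here.
MAX_DATASET_BYTES = 50 * 1024**3
MAX_BROWSER_BYTES = 2 * 1024**3
SUPPORTED_COMPRESSION = (".gz", ".gzip", ".bz", ".bzip", ".bz2", ".bzip2", ".zip")
TAR_SUFFIXES = (".tar", ".tar.gz", ".tar.bz2", ".tgz")


def upload_warnings(path):
    """Return a list of warnings for a local file (hypothetical helper)."""
    warnings = []
    name = os.path.basename(path).lower()
    size = os.path.getsize(path)

    if name.endswith(TAR_SUFFIXES):
        warnings.append("tar archives are not supported; upload files individually")
    if size > MAX_DATASET_BYTES:
        warnings.append("file exceeds the 50G dataset limit")
    elif size >= MAX_BROWSER_BYTES:
        warnings.append("too large for browser upload; use FTP")

    if not name.endswith(SUPPORTED_COMPRESSION):
        # Only sniff uncompressed files; compressed bytes hide any HTML markup.
        with open(path, "rb") as handle:
            head = handle.read(2048).lower()
        if b"<html" in head or b"<!doctype html" in head:
            warnings.append("file looks like HTML; remove HTML content before loading")
    return warnings
```

An empty list only means that none of these particular checks fired; it does not guarantee that the upload or the datatype assignment will succeed.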


Dataset status and how jobs execute

When a tool is executed, one or more new datasets are added to a history. The same is true when a workflow is executed. If using the public Main Galaxy instance, the most effective strategy when running jobs on the shared resource is to start jobs (or workflows), and then leave them alone to execute until completion.

How does the processing of tool jobs actually work?

  • The color of a dataset designates the current status of the underlying job.

    • green - the job completed successfully

      • The resulting data is ready to be used in visualizations, available as input to tools, can be downloaded, or utilized for any other downstream purpose.
    • yellow - the job is in progress

      • If you are using the public Main Galaxy instance, this job is running on one of our clusters. Different types of tools send jobs to different clusters appropriate for the requirements of each tool. Some tools are more compute intensive than others, and significant resources are dedicated to job processing. Jobs have up to 72 hours to complete; if they run longer than this, they will fail with a "wall-time" error and turn red. Examining tool parameters is the first option: less sensitive parameters may give an equally acceptable result while using fewer resources. If that is not appropriate or does not succeed, a CloudMan Galaxy or Local Galaxy with sufficient resources may be the solution.

    • grey - the job is waiting to run

      • If you are using the public Main Galaxy instance, this job is queued, waiting for an opening on the appropriate cluster. It is very important to allow queued jobs to remain queued, and to not delete/re-run them. If re-run, this not only moves the new job back to the end of the queue, effectively lengthening the wait time to execute, but if done repeatedly, the volume of "executing deleted" jobs can create additional work processes in the history as these are cleared away, using up resources, and can cause additional delays.

    • red - the job has failed

      • There can be many reasons for this, see the next section, Error from tools for details.

    • blue-purple with moving arrow - (applies to "Get Data -> Upload File" tool only) - the job is queuing or running

      • The job may run immediately, or may turn grey if the server is busy, meaning that guidelines for grey jobs apply, and these grey datasets should never be deleted/re-run, for the same reasons explained above.

      • An upload job that seems to run in the blue-purple state for a very long time generally indicates that the file being loaded is too large for the method used (specifically, a browsed-file upload) and FTP should be used instead. This is the only active job that should be deleted under normal usage, as it will never complete (no file over 2G will ever load via file browser upload).
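
The color guide above can be condensed into a small lookup. This is only an illustrative summary: the colors and advice come from the list above, while the state labels, `STATUS_BY_COLOR`, and `advise` are hypothetical names chosen for this sketch.

```python
# Colors and advice are taken from the list above; the state labels,
# dictionary, and function name are illustrative only.
STATUS_BY_COLOR = {
    "green": ("ok", "Use the dataset: visualize, download, or pass to tools."),
    "yellow": ("running", "Leave the job alone until it completes."),
    "grey": ("queued", "Leave the job queued; do not delete or re-run it."),
    "red": ("error", "Check the input format, re-run once, then report the error."),
    "blue-purple": ("uploading", "Wait; delete only a browser upload of a file over 2G."),
}


def advise(color):
    """Return a one-line recommendation for a dataset color."""
    state, action = STATUS_BY_COLOR[color.lower()]
    return f"{state}: {action}"


print(advise("grey"))
```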

Shared and Published data

Have you been asked to share a history? Or has someone shared a workflow with you but you're not sure where to find it? Or maybe you just want to find out more about how publishing your work in Galaxy can be used to support your next publication? Watch the how to Share and Publish screencast and read more here.

Error from tools

Dataset format problems are the #1 reason that tools fail. Most likely this problem was introduced during the initial data upload. Double check the dataset against Galaxy's datatypes or external specifications. In many cases, the format issues can be corrected using a creative combination of Galaxy's tools.

Troubleshooting tool errors

  • Verify the size/number of lines or md5sum between the source and Galaxy. Use Line/Word/Character count of a dataset or Secure Hash / Message Digest on a dataset to do this.

  • Look at the end of your file. Is it complete? Are there extra empty lines? Use Select last lines from a dataset with the default 10 to check.

  • Check errors that come from tools such as the FASTQ Groomer. Many tools report the exact problem with exact instructions for corrections.

  • Is the format to specification? Is it recognized by Galaxy? By the target tool or display application? Check against the Galaxy Datatypes list.

    • Are you using a Custom Reference Genome? Have you tried the quick Troubleshooting tips on the wiki?

    • Note: not all formats are outlined in detail as they are common types or derived from a particular source. Read the target tool help, ask the tool authors, or even just google for the most current specification.
  • Is the problem the dataset format or the assigned datatype? Can this be corrected by editing the datatype or converting formats? Often a combination of tools can correct a formatting problem, if the rest of the file is intact (completely loaded).

  • Is the problem a scientific or technical problem? Also see #Interpreting scientific results to decide.

    • Example NGS: Mapping tools: On the tool form itself is a short list of help plus links to publications and the tool author's documentation and/or website. If you are having trouble with Bowtie, look on this tool's form for more information, including a link to this website: http://bowtie-bio.sourceforge.net/index.shtml.

    • Example NGS: RNA Analysis tools: See the galaxy-rna-seq-analysis-exercise tutorial and transcriptome-analysis-faq. If these do not address the problem, then contacting the tool authors is the next step at: mailto:tophat.cufflinks@gmail.com.

    • Example NGS: SAM Tools tools: SAMTools requires that all input files be to specification (Learn/Datatypes) and that the same exact reference genome is used for all steps. Double checking the format is the first check. Double checking that the same exact version of the reference genome is used is the second check. The last check is that the number of jobs and the size of data on disk are under quota. Problems with this set of tools are rarely caused by other issues.

  • Tools for fixing/converting/modifying a dataset will often include the datatype name. Use the tool search to locate candidate tools, likely in tool groups Text Manipulation, Convert Formats, or NGS: QC and manipulation.

  • The most commonly used tools for investigating problems with upload, format and making corrections are:
    • TIP: use the Tool search in top left panel to find tools by keyword

    • Edit Attributes form, found by clicking a dataset's pencil icon

    • Convert Format tool group

    • Select first lines from a dataset

    • Select last lines from a dataset

    • Line/Word/Character count of a dataset

    • Secure Hash / Message Digest on a dataset

    • FASTQ Groomer

    • FastQC

    • Tabular to FASTQ, FASTQ to Tabular

    • Tabular to FASTA, FASTA to Tabular

    • FASTA Width formatter

    • Text Manipulation tool group

    • Filter and Sort tool group
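
The first two checks in the list above (comparing size or md5sum with the source, and inspecting the end of the file) can also be run on the local copy before or after upload, then compared with the output of the corresponding Galaxy tools. A minimal sketch using only the Python standard library; the function names are hypothetical. Note that a final line without a newline is counted as a line here, which can differ by one from tools that count newline characters.

```python
import hashlib
from collections import deque


def file_summary(path, tail=10):
    """Return (md5 hex digest, line count, last `tail` lines) for a local file."""
    md5 = hashlib.md5()
    line_count = 0
    last_lines = deque(maxlen=tail)  # keeps only the most recent `tail` lines
    with open(path, "rb") as handle:
        for line in handle:
            md5.update(line)
            line_count += 1
            last_lines.append(line)
    return md5.hexdigest(), line_count, list(last_lines)


def trailing_blank_lines(last_lines):
    """Count empty or whitespace-only lines at the end of the file."""
    count = 0
    for line in reversed(last_lines):
        if line.strip():
            break
        count += 1
    return count
```

If the digest or line count differs between the source file and the dataset in Galaxy, the upload was most likely incomplete and should be repeated.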

Tool doesn't recognize dataset

Usually a simple datatype assignment incompatibility between the dataset and the tool. Expected input datatype format is explained on the Tool form itself under the parameter settings. Convert Format or modify the datatype using the dataset's pencil icon to reach the Edit Attributes form.

Dataset special cases

  • If the required input is a FASTQ datatype, and the data is a newly uploaded FASTQ file, run FASTQ Groomer as a first step, then continue with your analysis.

    • If you are certain that the quality scores are already scaled to Sanger Phred+33 (the result of an Illumina 1.8+ pipeline), the datatype ".fastqsanger" can be directly assigned. Click the pencil icon to reach the Edit Attributes form. In the center panel, click on the "Datatype" tab (3rd), enter the datatype ".fastqsanger", and save. Metadata will assign, then the dataset can be used.

    • If you are not sure what type of FASTQ data you have, see the help directly on the FASTQ Groomer tool, optionally run FASTQ Groomer on a sample of your data, plus run FastQC (the output report will note the quality score type interpreted by the tool).

    • If your data is FASTA, but you want to use tools that require FASTQ input, then use the tool NGS: QC and manipulation -> Combine FASTA and QUAL. This tool will create "placeholder" quality scores that fit your data. On the output, click the pencil icon to reach the Edit Attributes form. In the center panel, click on the "Datatype" tab (3rd), enter the datatype ".fastqsanger", and save. Metadata will assign, then the dataset can be used.

  • If the required input is a Tabular datatype, other datatypes that are in a specialized tabular format, such as .bed, .interval, or .txt, can often be directly reassigned to tabular format. Click the pencil icon to reach the Edit Attributes form. In the center panel, using tabs to navigate, change the datatype (3rd tab) and save, then label columns (1st tab) and save. Metadata will assign, then the dataset can be used.

  • If the required input is a BED or Interval datatype, the reverse (.tab → .bed, .tab → .interval) may be possible using a combination of Text Manipulation tools, to create a dataset that matches the BED or Interval datatype specifications.
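
For the FASTQ case above, a rough local check of the quality score encoding is possible by looking at the range of characters in the quality lines. The sketch below is a heuristic only and not how FASTQ Groomer or FastQC work internally; it assumes four-line FASTQ records, and the function name and thresholds are choices made for this illustration.

```python
def guess_quality_offset(fastq_lines, max_records=1000):
    """Guess the Phred offset from the range of quality characters.

    Returns 33, 64, or None when the range is ambiguous. Assumes
    four-line FASTQ records (no wrapped sequence or quality lines).
    """
    lowest, highest = None, None
    for index, line in enumerate(fastq_lines):
        if index // 4 >= max_records:
            break
        if index % 4 != 3:  # the quality string is the 4th line of each record
            continue
        codes = [ord(ch) for ch in line.rstrip("\r\n")]
        if not codes:
            continue
        lowest = min(codes) if lowest is None else min(lowest, min(codes))
        highest = max(codes) if highest is None else max(highest, max(codes))
    if lowest is None:
        return None
    if lowest < 59:   # characters below ';' cannot occur with a +64 offset
        return 33
    if highest > 74:  # characters above 'J' exceed typical Illumina 1.8+ output
        return 64
    return None
```

A result of 33 suggests the data may already be in the Sanger scale; when the result is 64 or None, running FASTQ Groomer as described above remains the safer route.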

Reference genomes

Using the same exact reference genome for all steps in an analysis is often mandatory to obtain accurate results.

  • How can I tell if I have a reference genome mismatch problem?

    • There isn't one single error that points to this problem. But, if you are running a tool for the first time using a newly uploaded genome, and an error occurs or more likely simply unexpected results are produced - double checking the reference genome would be a good choice.

  • When moving between instances, what can be done to mitigate the risk of using the wrong assembly?

  • How do I load a reference genome?

    • Use FTP - details are here... and troubleshooting help is here...

    • If your genome is small (bacterial, etc.), using it as a Custom Reference Genome is the quickest way to get it into Galaxy and to start using it with tools.

Reporting tool errors

  • If running a tool on the public Galaxy server (i.e., http://usegalaxy.org) is resulting in an error (the dataset is red), and you can't determine the root cause from the error message or input format checks:

    • Re-run the job to eliminate transitory cluster issues.
    • Report the problem using the dataset's bug icon. Do not submit an error for the first failure, but leave it undeleted in your history for reference.

    • IMPORTANT: Get the quickest resolution by leaving all of the input and output datasets in the analysis thread leading up to the error in your history undeleted until we have written you back. Use Options → Show Deleted Datasets and click dataset links to undelete to recover error datasets before reporting the problem, if necessary.

      • Example: Error with Cufflinks? Leave the ungroomed + groomed FASTQ, Bowtie/Tophat SAM, optional GTF + custom genome, and Cufflinks datasets undeleted.

    • Include in the bug report what checks confirmed that data format was not an issue
    • Anything else you feel is relevant to the error
  • We do our best to respond to bug reports as soon as possible.
  • Please send all email as reply-all as we work to resolve the error. The galaxy-bugs address we will be corresponding from is internal to the Galaxy team only and we work together to resolve reported problems.

  • If you have resolved the issue, a reply to the bug report to let us know is appreciated.

Interpreting scientific results

A double check against the tool help and documentation is the first step. If the tool was developed by a 3rd party, they are likely the best experts for detailed questions. Tool forms have links to documentation/authors.

Tools on the Test server

  • Tools on Test will have little to no support help offered.

  • Test tool errors reported as bug reports (#Error from tools) are considered low priority and may not receive a reply.

  • General feedback & discussion threads (instead of questions requiring a reply from the Galaxy team) are welcomed at the development mailing list.

  • Exceptions are possible. Sometimes community users help to test-drive new functionality. If you are interested in this type of testing for a particular tool, contact us on the development mailing list.

Tools on the Main server


Custom reference genome

Often the quickest way to get your analysis going is to load a custom genome for your own use. Simply upload the FASTA file using FTP and use it as the "reference genome from the history" (wording can vary slightly between tools, but most have this option). Read more about how to set up a Custom Genome ...

Best Practices

  • Make sure the reference genome is in FASTA format and is completely loaded (see Trouble loading data above).

  • Use the same custom genome for all the steps in your analysis that require a reference genome. Don't switch or the data may not align between your files in downstream steps.
  • TIP: To modify a dataset to have an unassigned reference genome, use the pencil icon to "Edit Attributes". On the form, for the attribute Database/Build:, set the genome to be " unspecified (?) ", and submit. Any prior assignments will be removed.

Quick genome access

  • If your genome is small (bacterial, etc.), using it as a Custom Reference Genome is the quickest way to get it into Galaxy and to start using it with tools.

  • Obtain a FASTA version, load it using FTP, and use it from your history with tools.

Tools on the Main server

  • Example: Fetch Sequences: Extract Genomic DNA

    • Start by loading the custom reference genome in FASTA format into your history as a dataset, using FTP if the dataset is over 2G in size.

    • Load or create an appropriate Interval, BED, or GFF coordinate dataset into the same history.

    • On the Extract Genomic DNA tool form, you will use the options:

      • "Source for Genomic Data:" as "History"
      • next, for the new menu option "Using reference file", select the fasta dataset of your target genome from your active history
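
To make the example above concrete, here is a minimal sketch of what extracting sequence from a FASTA reference with BED coordinates involves. It is not the Extract Genomic DNA tool's implementation; the function names are hypothetical, the whole genome is held in memory (so it only suits small genomes), and BED coordinates are treated as 0-based with an exclusive end.

```python
def read_fasta(lines):
    """Parse FASTA lines into {sequence name: sequence string}."""
    sequences, name, chunks = {}, None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                sequences[name] = "".join(chunks)
            # Use the first word of the header as the sequence name.
            name, chunks = line[1:].split()[0], []
        else:
            chunks.append(line)
    if name is not None:
        sequences[name] = "".join(chunks)
    return sequences


COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def extract_bed_interval(sequences, chrom, start, end, strand="+"):
    """Return the sequence for a BED interval (0-based start, exclusive end)."""
    piece = sequences[chrom][start:end]
    if strand == "-":
        piece = piece.translate(COMPLEMENT)[::-1]  # reverse complement
    return piece
```

The lookup by `chrom` fails with a `KeyError` when the names in the coordinate dataset do not match the FASTA headers, which is a common symptom of a reference genome mismatch.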




Public mailing list Q & A discussions



Searching prior Q & A

Still need help not covered by the tool help, the Learning Hub, a Screencast, a Tutorial, or an FAQ?

  • Start with a search in our mailing list archives to see if this question has come up before.

  • If you have a development topic to discuss, your data/tool situation has not come up before, and/or troubleshooting has failed (including at least one re-run, as explained in Error from tools above), send us an email.


Note: If your question is about an error on Main for a job failure, start by reviewing the troubleshooting help for Tool Errors. If data input and the job error message don't resolve the issue, please use the tool error submission form from the red error dataset, instead of starting a public mailing list discussion thread (do not delete error datasets). Read more ...

What to include in a question

  1. Where you are using Galaxy: Main, other public, local, or cloud instance

  2. End-user questions from Test are generally not sent/supported - Test is for breaking

  3. If a local or cloud instance, the distribution or galaxy-central hg pull #
  4. If on Main, date/time the initial and re-run jobs were executed

  5. If there is an example/issue, exact steps to reproduce
  6. What troubleshooting steps (if a problem is being reported) you have tested out
  7. If on Main, you may be asked for a shared history link. Use Options → Share or Publish, generate the link, and email it directly back off-list. Note the dataset #'s you have questions about.

  8. IMPORTANT: Get the quickest answer for data questions by leaving all of the input and output datasets in the analysis thread in your shared history undeleted until we have written you back. Use Options → Show Deleted Datasets and click dataset links to undelete to recover datasets if necessary

  9. Always reply-all unless sharing a private link



Starting a scientific, data, or tool usage thread

  • Gather the information listed in "What to include in a question" above
  • Send an email to mailto:galaxy-user@bx.psu.edu

  • Subscribing to the Galaxy User List is recommended for researchers

  • Discussion threads are open to the entire community and the Galaxy team to answer
  • Always reply-all unless sharing a private link



Starting a technical tool, local/cloud instance, or development thread

  • Gather the information listed in "What to include in a question" above
  • Send an email to mailto:galaxy-dev@bx.psu.edu

  • Subscribing to the Galaxy Development List is recommended for tool developers and instance administrators

  • Discussion threads are open to the entire community and the Galaxy team to answer
  • Always reply-all unless sharing a private link






Reporting a software bug


If you think you've seen a bug (not an "Error from tools" ), please report it to the Galaxy Development List.

Please do not report a new usage problem through the Galaxy Issue Board unless you are fairly certain the problem is in the software and that it can't be remedied quickly. Sending a question to the galaxy-dev@bx.psu.edu mailing list for review is most often the best first pass if there is a problem. The great part about this method is that if the problem is usage, we can help solve it; if you are looking for a wrapper or help with a wrapper, the community can offer quick feedback; and if there really is a serious issue with Galaxy itself, it might apply globally, and we can often just fix it right away.


Bug or Error from tools? Sometimes it is hard to tell. If you are on the public Main instance and ran a tool that produced a red error dataset, then you will probably want to start by reporting this as a Tool Error, but add comments about your suspicions of a bug if there is something odd about the job failure.

What to include in a bug report

  1. Where you are using Galaxy (Main, local, or cloud instance).

  2. Bug reports from Test are generally not sent

  3. If a local or cloud instance, the distribution or galaxy-central hg pull #
  4. The date/time the bug was detected
  5. Exact steps to reproduce the issue
  6. What troubleshooting steps (if any) you have tested out
  7. If you can reproduce on Main, you may be asked to send in a tool error report or share a history link. Use Options → Share or Publish, generate the link, and email it directly back off-list. Note the problem dataset #'s.

  8. IMPORTANT: If data is involved, leave all of the related datasets in the analysis thread leading up to the bug in your history undeleted until we have written you back. Use Options → Show Deleted Datasets and click dataset links to undelete to recover error datasets before reporting a bug if necessary.

  9. Always reply-all unless sharing a private link