Before posting a question to one of the Mailing Lists, we ask that you please first review the support resources summarized here and search with our Custom Tools to see if the same question has come up in the past.
If you have help to offer other Galaxy users, please dive right in and reply to questions on the Mailing Lists, submit tools to the Tool Shed, and/or add your expertise into our Wiki! Galaxy is a community of scientists and developers working together and contributions are most welcome.
Contents
- Using Galaxy
- Help! Common Solutions
- Public mailing list Q & A discussions
- Reporting a software bug
Using Galaxy
Learning Hub
See our Learning hub for key coverage of Galaxy user interface concepts, data, and tools.
Quickies
Screencast videos demonstrate the step-by-step for a range of topics. Some are quick. Some are not. All are packed with tips and methods usable across analysis workflows.
Tutorials
Tutorials embedded inside of Galaxy. Created by Galaxy's scientists, these are packed with example datasets, histories, visualizations, and workflows to import and experiment with.
FAQs
FAQs are always under active development. Both scientific and technical users are encouraged to add in their own tips and best practices for new and existing data types, tools, workflows, set-up, and administration.
Custom Searches
Finally, there are the Galaxy Custom Searches. The MailingList search finds all related prior Q & A from the galaxy-user and galaxy-dev mailing lists. The UseGalaxy search finds all online resources for information about using Galaxy. This includes this wiki, tool help and shared Galaxy objects at UseGalaxy.org, and Mailing Lists.
Mailing Lists
Galaxy has two public mailing lists for questions, one private mailing list for bug reports, and one announcement mailing list. Please do not post questions through the Galaxy Issue Board; these will only be redirected to the mailing lists. Manage subscriptions and learn more about these lists at the Mailing Lists home page. See also:
Galaxy Issue Board
The Galaxy Project uses Trello for issue tracking and feature requests. The Galaxy Issue Board supports issue creation, commenting, and voting on issues.
About Galaxy
Galaxy is an open, web-based platform for accessible, reproducible, and transparent computational biomedical research.
In addition to using the public Galaxy server, you can also install your own instance of Galaxy, or create an instance of Galaxy on the cloud. Another option is to use one of the ever-increasing number of Public Galaxy Servers hosted by other organizations.
The Galaxy Framework is, at the highest level, a set of reusable software components. Learn more about Galaxy's Architecture.
Help! Common Solutions
Galaxy is designed to have a simplified tool interface while still retaining maximum functionality. Many tools have "most common" default parameter settings with "full parameter" options for those who need them. Documentation on the tool form itself covers the parameters and the expected input dataset format, and links to publications and 3rd party web sites/support.
Finding a tool
In the left tool panel, click on the top Options menu to open the tool search. Type in a tool name or data type. Shorter keywords find more choices.
Trouble loading data
Data is loaded using the tool Get Data → Upload
- Load from one of the listed data sources.
Load by "browsing" for a local file. Only good for very small datasets (< 2G, but usually smaller).
Load using an HTTP or FTP URL by pasting it in.
Load using FTP, either from the command line or with a desktop client, as explained in the link.
Upload tips
Data quota is at limit, so no new data can be loaded. Disk usage and quotas are reported at User → Preferences when logged in.
Password protected data will require a special URL format. Ask the data source. Double check that it is publicly accessible.
Use FTP, not SFTP. Check with local admin if not sure.
No HTML content. The loading error generated may state this. Remove HTML fields from your dataset before loading into Galaxy or omit HTML fields from the query if importing from a data source (such as Biomart).
Compression types .gz/.gzip, .bz/.bzip, .bz2/.bzip2, and single-file .zip are supported.
Only the first file in any compressed archive will load as a dataset.
Data must be < 50G to be uploaded from any source.
Is the problem the dataset format or the assigned datatype? Can this be corrected by editing the datatype or converting formats? See Learn/Managing Datasets for help.
Problems in the first step working with your loaded data? It may not have uploaded completely. If you used an FTP client, the transfer message will indicate if a load was successful or not and can often restart interrupted loads.
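Several of the upload tips above can be checked on your own computer before a transfer is started. The sketch below is only an illustration: the helper name precheck_upload is hypothetical and is not part of Galaxy, the limits are copied from the tips above with "G" read as GiB (an assumption), and the HTML test only looks at the start of the file.

```python
import os

# Limits taken from the upload tips; "G" is interpreted as GiB (an assumption).
BROWSER_LIMIT = 2 * 1024**3
UPLOAD_LIMIT = 50 * 1024**3

# Leading "magic" bytes of the compression formats named in the tips.
MAGIC = {b"\x1f\x8b": "gzip", b"BZh": "bzip2", b"PK\x03\x04": "zip"}


def precheck_upload(path):
    """Return a list of warnings for a local file (hypothetical helper)."""
    warnings = []
    size = os.path.getsize(path)
    if size >= UPLOAD_LIMIT:
        warnings.append("file is 50G or larger and cannot be uploaded")
    elif size >= BROWSER_LIMIT:
        warnings.append("file is 2G or larger: use FTP, not the browser")

    # Only the first kilobyte is inspected, so this is a quick screen,
    # not a full validation of the content.
    with open(path, "rb") as handle:
        head = handle.read(1024)

    for magic, name in MAGIC.items():
        if head.startswith(magic):
            warnings.append(f"{name} compressed: only the first file will load")
            return warnings

    lowered = head.lstrip().lower()
    if lowered.startswith((b"<!doctype html", b"<html")):
        warnings.append("content looks like HTML, which will not load")
    return warnings
```

An empty list means none of these particular checks fired; it does not guarantee that the upload will succeed.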
Error from tools
Dataset format problems are the #1 reason that tools fail. Most likely this problem was introduced during the initial data upload. Double check the dataset against Galaxy's datatypes or external specifications. In many cases, the format issues can be corrected using a creative combination of Galaxy's tools.
Troubleshooting tool errors
Verify the size/number of lines or md5sum between the source and Galaxy. Use Line/Word/Character count of a dataset or Secure Hash / Message Digest on a dataset to do this.
Look at the end of your file. Is it complete? Are there extra empty lines? Use Select last lines from a dataset with the default 10 to check.
Check errors that come from tools such as the FASTQ Groomer. Many tools report the exact problem with exact instructions for corrections.
Is the format to specification? Is it recognized by Galaxy? By the target tool or display application? Check against the Galaxy Datatypes list.
Are you using a Custom Reference Genome? Have you tried the quick Troubleshooting tips on the wiki?
- Note: not all formats are outlined in detail as they are common types or derived from a particular source. Read the target tool help, ask the tool authors, or even just google for the most current specification.
Is the problem the dataset format or the assigned datatype? Can this be corrected by editing the datatype or converting formats? Often a combination of tools can correct a formatting problem, if the rest of the file is intact (completely loaded).
Is the problem a scientific or technical problem? Also see #Interpreting scientific results to decide.
Example NGS: Mapping tools: On the tool form itself is a short list of help plus links to publications and the tool author's documentation and/or website. If you are having trouble with Bowtie, look on this tool's form for more information, including a link to this website: http://bowtie-bio.sourceforge.net/index.shtml.
Example NGS: RNA Analysis tools: See the galaxy-rna-seq-analysis-exercise tutorial and transcriptome-analysis-faq. If these do not address the problem, then contacting the tool authors is the next step at: mailto:tophat.cufflinks@gmail.com.
Example NGS: SAM Tools tools: SAMTools requires that all input files be to specification (Learn/Datatypes) and that the same exact reference genome is used for all steps. Double checking the format is the first check. Double checking that the same exact version of the reference genome is used is the second check. The last double check is that the number of jobs and size of data on disk is under quota. Problems with this set of tools are rarely caused by other issues.
Tools for fixing/converting/modifying a dataset will often include the datatype name. Use the tool search to locate candidate tools, likely in tool groups Text Manipulation, Convert Formats, or NGS: QC and manipulation.
- The most commonly used tools for investigating problems with upload, format and making corrections are:
TIP: use the Tool search in top left panel to find tools by keyword
Edit Attributes form, found by clicking a dataset's pencil icon
Convert Format tool group
Select first lines from a dataset
Select last lines from a dataset
Line/Word/Character count of a dataset
Secure Hash / Message Digest on a dataset
FASTQ Groomer
FastQC
Tabular to FASTQ, FASTQ to Tabular
Tabular to FASTA, FASTA to Tabular
FASTA Width formatter
Text Manipulation tool group
Filter and Sort tool group
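The first two troubleshooting checks above (compare size or md5sum between the source and Galaxy, and look at the end of the file) need matching numbers from the source copy of the file. A minimal sketch for producing them locally follows; the function name summarize is an assumption made for illustration, and a final line without a trailing newline is counted as a line here, which may differ by one from tools that count newline characters.

```python
import hashlib


def summarize(path):
    """Return (md5 hex digest, line count, trailing empty line count)."""
    digest = hashlib.md5()
    lines = 0
    trailing_empty = 0
    with open(path, "rb") as handle:
        for line in handle:
            # Feeding the file line by line yields the same digest as
            # hashing the whole file at once, without loading it in memory.
            digest.update(line)
            lines += 1
            if line.strip():
                trailing_empty = 0   # a non-empty line resets the run
            else:
                trailing_empty += 1  # count the current run of empty lines
    return digest.hexdigest(), lines, trailing_empty
```

Compare the digest and line count with the output of Secure Hash / Message Digest on a dataset and Line/Word/Character count of a dataset in Galaxy. A mismatch suggests an incomplete or altered upload, and a non-zero third value points to extra empty lines at the end of the file.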
Tool doesn't recognize dataset
Usually a simple datatype assignment incompatibility between the dataset and the tool. The expected input datatype is explained on the tool form itself under the parameter settings. Use Convert Format, or modify the datatype using the dataset's pencil icon to reach the Edit Attributes form.
Dataset special cases
If the required input is a FASTQ datatype, and the data is a newly uploaded FASTQ file, run FASTQ Groomer as a first step, then continue with your analysis.
If you are certain that the quality scores are already scaled to Sanger Phred+33 (the result of an Illumina 1.8+ pipeline), the datatype ".fastqsanger" can be directly assigned. Click the pencil icon to reach the Edit Attributes form. In the center panel, click on the "Datatype" tab (3rd), enter the datatype ".fastqsanger", and save. Metadata will assign, then the dataset can be used. If you are not sure what type of FASTQ data you have, see the help directly on the FASTQ Groomer tool, optionally run FASTQ Groomer on a sample of your data, and run FastQC (the output report will note the quality score type interpreted by the tool).
If your data is FASTA, but you want to use tools that require FASTQ input, then use the tool NGS: QC and manipulation → Combine FASTA and QUAL. This tool will create "placeholder" quality scores that fit your data. On the output, click the pencil icon to reach the Edit Attributes form. In the center panel, click on the "Datatype" tab (3rd), enter the datatype ".fastqsanger", and save. Metadata will assign, then the dataset can be used.
If the required input is a Tabular datatype, other datatypes that are in a specialized tabular format, such as .bed, .interval, or .txt, can often be directly reassigned to tabular format. Click the pencil icon to reach the Edit Attributes form. In the center panel, using tabs to navigate, change the datatype (3rd tab) and save, then label columns (1st tab) and save. Metadata will assign, then the dataset can be used. If the required input is a BED or Interval datatype, the reverse (.tab → .bed, .tab → .interval) may be possible using a combination of Text Manipulation tools, to create a dataset that matches the BED or Interval datatype specifications.
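If you are unsure whether a FASTQ file is already Sanger Phred+33, a rough local check of the quality characters can help before you decide between direct assignment of ".fastqsanger" and running FASTQ Groomer. The sketch below is a common heuristic, not the method used by FASTQ Groomer or FastQC. The function name and the thresholds (characters below ';' imply offset 33, characters above 'J' suggest offset 64) are assumptions stated here for illustration, and the code assumes plain four-line records.

```python
import itertools


def guess_quality_offset(path, max_records=10000):
    """Guess the Phred offset from the quality lines of a FASTQ file.

    Assumes four-line records (no wrapped sequence or quality lines).
    """
    lowest, highest = 255, 0
    with open(path, "rb") as handle:
        # Every fourth line, starting at index 3, is a quality string.
        quality_lines = itertools.islice(handle, 3, max_records * 4, 4)
        for line in quality_lines:
            codes = line.rstrip(b"\r\n")
            if codes:
                lowest = min(lowest, min(codes))
                highest = max(highest, max(codes))
    if lowest < 59:
        return "phred33"   # characters below ';' do not occur with offset 64
    if highest > 74:
        return "phred64"   # characters above 'J' suggest offset 64
    return "ambiguous"     # range fits both encodings, or no data was read
```

A result of "ambiguous" or "phred64" is a signal to run FASTQ Groomer rather than to reassign the datatype by hand.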
Reference genomes
Using the same exact reference genome for all steps in an analysis is often mandatory to obtain accurate results.
How can I tell if I have a reference genome mismatch problem?
There isn't one single error that points to this problem. But if you are running a tool for the first time with a newly uploaded genome and an error occurs or, more likely, unexpected results are produced, double checking the reference genome is a good choice.
When moving between instances, what can be done to mitigate the risk of using the wrong assembly?
When moving between a Galaxy CloudMan AMI and the public Main Galaxy instance, just make sure the database name is the same. If the assigned database name is the same, the content of the reference genome is the same.
When moving between a local Galaxy and the public Main Galaxy instance, there are a few choices:
Consider using our version of commonly used genomes, available through our rsync server
Make your own indexes, but check against our rsync indexes to be aware of differences
When there are differences, use your own genome as a Custom Reference Genome with tools
When moving between a local Galaxy and a Galaxy CloudMan AMI, the same guidelines as immediately above (for Main) apply, since Main and the CloudMan AMI are based on the same content.
How do I load a reference genome?
Use FTP - details are here... and troubleshooting help is here...
If your genome is small (bacterial, etc.), using it as a Custom Reference Genome is the quickest way to get it into Galaxy and to start using it with tools.
Reporting tool errors
If running a tool on the public Galaxy server (i.e., http://usegalaxy.org) is resulting in an error (the dataset is red), and you can't determine the root cause from the error message or input format checks:
- Re-run the job to eliminate transitory cluster issues.
Report the problem using the dataset's bug icon. Do not submit an error for the first failure, but leave it undeleted in your history for reference. IMPORTANT: Get the quickest resolution by leaving all of the input and output datasets in the analysis thread leading up to the error in your history undeleted until we have written you back. Use Options → Show Deleted Datasets and click dataset links to undelete to recover error datasets before reporting the problem, if necessary.
Example: Error with Cufflinks? Leave the ungroomed + groomed FASTQ, Bowtie/Tophat SAM, optional GTF + custom genome, and Cufflinks datasets undeleted.
- Include in the bug report what checks confirmed that data format was not an issue
- Anything else you feel is relevant to the error
- We do our best to respond to bug reports as soon as possible.
Please send all email as reply-all as we work to resolve the error. The galaxy-bugs address we will be corresponding from is internal to the Galaxy team only and we work together to resolve reported problems.
- If you have resolved the issue, a reply to the bug report to let us know is appreciated.
Interpreting scientific results
A double check against the tool help and documentation is the first step. If the tool was developed by a 3rd party, they are likely the best experts for detailed questions. Tool forms have links to documentation/authors.
Tools on the Test server
Tools on Test will have little to no support help offered.
Test tool errors reported as bug reports (#Error from tools) are considered low priority and may not receive a reply.
General feedback & discussion threads (instead of questions requiring a reply from the Galaxy team) are welcomed at the development mailing list.
Exceptions are possible. Sometimes community users help to test-drive new functionality. If you are interested in this type of testing for a particular tool, contact us on the development mailing list.
Tools on the Main server
Example → RNA-seq analysis tools.
GTF dataset formats can widely vary in the 9th attribute column. Cufflinks and the other tools in this set expect a certain format for full functionality. GFF3 datasets have a specific structure, including a unique "ID" attribute. The details are explained in the file format specifications and at the tool's websites: http://cufflinks.cbcb.umd.edu/faq.html, http://cufflinks.cbcb.umd.edu/ & http://tophat.cbcb.umd.edu/.
Using the same reference genome for all steps is very important. Even small differences in chromosome/scaffold names can result in errors. Double check that the naming between the reference genome and any other inputs such as SAM/BAM and GTF datasets all use the same naming conventions. See our FAQ for more help if this is suspected to be the root cause of an error.
Confirming data sources using gffread locally, before loading data into Galaxy, can be one way to discover where problems are.
- Read the recent publication from the tool authors
Trapnell, et al. Differential gene and transcript expression analysis of RNA-seq experiments with TopHat and Cufflinks Nature Protocols doi:10.1038/nprot.2012.016
If the tool form help, tutorial, FAQ, or tool author's web site/publication do not address the question or problem, then contacting the tool authors is the next step at: mailto:tophat.cufflinks@gmail.com.
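The chromosome/scaffold naming check described above can also be done locally before loading data. The following sketch compares the sequence names in a reference FASTA with the first column of a GTF file. The function names are assumptions made for illustration, and the sketch takes the FASTA identifier to be the header text up to the first whitespace and skips only '#' comment lines in the GTF.

```python
def fasta_names(path):
    """Collect identifiers: header text after '>' up to the first whitespace."""
    names = set()
    with open(path) as handle:
        for line in handle:
            if line.startswith(">"):
                fields = line[1:].split()
                if fields:
                    names.add(fields[0])
    return names


def gtf_names(path):
    """Collect the sequence names found in the first GTF column."""
    names = set()
    with open(path) as handle:
        for line in handle:
            if line.strip() and not line.startswith("#"):
                names.add(line.rstrip("\n").split("\t", 1)[0])
    return names


def missing_from_reference(fasta_path, gtf_path):
    """Return GTF sequence names that have no match in the FASTA file."""
    return sorted(gtf_names(gtf_path) - fasta_names(fasta_path))
```

A non-empty result such as ['1', '2'] against a reference that uses 'chr1' and 'chr2' is the kind of small naming difference that can cause errors or silently empty results.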
Custom reference genome
Often the quickest way to get your analysis going is to load a custom genome for your own use. Simply upload the FASTA file using FTP and use it as the "reference genome from the history" (wording can vary slightly between tools, but most have this option). Read more about how to set up a Custom Genome ...
Best Practices
Make sure the reference genome is in FASTA format and is completely loaded (see Trouble loading data above).
- Use the same custom genome for all the steps in your analysis that require a reference genome. Don't switch or the data may not align between your files in downstream steps.
TIP: To modify a dataset to have an unassigned reference genome, use the pencil icon to "Edit Attributes". On the form, for the attribute Database/Build:, set the genome to be " unspecified (?) ", and submit. Any prior assignments will be removed.
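Before uploading a custom genome, a quick structural check of the FASTA file can catch problems that would otherwise surface later as tool errors. The sketch below is an illustration only: the function name check_fasta and the specific checks (empty lines, sequence before the first header, headers without an identifier, duplicate identifiers, records with no sequence) are assumptions about commonly seen problems, not a list of what Galaxy or any particular tool enforces.

```python
def check_fasta(path):
    """Return a list of format problems found in a FASTA file (empty if none)."""
    problems = []
    seen = set()
    current = None       # identifier of the record being read
    has_sequence = True  # whether the current record has any sequence lines
    with open(path) as handle:
        for number, raw in enumerate(handle, start=1):
            line = raw.rstrip("\r\n")
            if not line.strip():
                problems.append(f"line {number}: empty line")
            elif line.startswith(">"):
                # Close the previous record before starting a new one.
                if current is not None and not has_sequence:
                    problems.append(f"record {current!r} has no sequence")
                fields = line[1:].split()
                current = fields[0] if fields else ""
                if not current:
                    problems.append(f"line {number}: header without identifier")
                elif current in seen:
                    problems.append(f"line {number}: duplicate identifier {current!r}")
                seen.add(current)
                has_sequence = False
            elif current is None:
                problems.append(f"line {number}: sequence before first '>' header")
            else:
                has_sequence = True
    if current is not None and not has_sequence:
        problems.append(f"record {current!r} has no sequence")
    return problems
```

An empty list only means that these structural checks passed. It says nothing about whether the sequence content matches the genome build used elsewhere in the analysis.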
Quick genome access
If your genome is small (bacterial, etc.), using it as a Custom Reference Genome is the quickest way to get it into Galaxy and to start using it with tools.
Obtain a FASTA version, load it using FTP, and use it from your history with tools.
Tools on the Main server
Example → Fetch Sequences: Extract Genomic DNA
Start by loading the custom reference genome in FASTA format into your history as a dataset, using FTP if the dataset is over 2G in size.
Load or create an appropriate Interval, BED, or GFF coordinate dataset into the same history.
On the Extract Genomic DNA tool form, you will use the options:
- "Source for Genomic Data:" as "History"
next, for the new menu option "Using reference file:", select the fasta dataset of your target genome from your active history
Public mailing list Q & A discussions
Searching prior Q & A
Still need help not covered by the tool help, the Learning Hub, a Screencast, a Tutorial, or an FAQ?
Start with a search in our mailing list archives to see if this question has come up before.
If you have a development topic to discuss, your data/tool situation has not come up before, and/or troubleshooting has failed (including at least one re-run, as explained in Error from tools above), send us an email.
Note: If your question is about an error on Main for a job failure, start by reviewing the troubleshooting help for Tool Errors. If data input and the job error message don't resolve the issue, please use the tool error submission form from the red error dataset, instead of starting a public mailing list discussion thread (do not delete error datasets). Read more ...
What to include in a question
Where you are using Galaxy: Main, other public, local, or cloud instance
End-user questions from Test are generally not sent/supported - Test is for breaking
- If a local or cloud instance, the distribution or galaxy-central hg pull #
If on Main, date/time the initial and re-run jobs were executed
- If there is an example/issue, exact steps to reproduce
- What troubleshooting steps (if a problem is being reported) you have tested out
If on Main, you may be asked for a shared history link. Use Options → Share or Publish, generate the link, and email it directly back off-list. Note the dataset #'s you have questions about.
IMPORTANT: Get the quickest answer for data questions by leaving all of the input and output datasets in the analysis thread in your shared history undeleted until we have written you back. Use Options → Show Deleted Datasets and click dataset links to undelete to recover datasets if necessary
Always reply-all unless sharing a private link
Starting a scientific, data, or tool usage thread
- Gather information "What to include in a question" above
Send an email to mailto:galaxy-user@bx.psu.edu
Subscribing to the Galaxy User List is recommended for researchers
- Discussion threads are open to the entire community and the Galaxy team to answer
Always reply-all unless sharing a private link
Starting a technical tool, local/cloud instance, or development thread
- Gather information "What to include in a question" above
Send an email to mailto:galaxy-dev@bx.psu.edu
Subscribing to the Galaxy Development List is recommended for tool developers and instance administrators
- Discussion threads are open to the entire community and the Galaxy team to answer
Always reply-all unless sharing a private link
Reporting a software bug
If you think you've seen a bug (not an "Error from tools" ), please report it to the Galaxy Development List.
Please do not report a new usage problem through the Galaxy Issue Board unless you are fairly certain the problem is software and that it can't be remedied quickly. Sending a question to the galaxy-dev@bx.psu.edu mailing list for review is most often the best first pass if there is a problem. The great part about this method is that if the problem is usage, we can help solve the problem; if the problem is that you are looking for a wrapper or help with a wrapper, the community can offer quick feedback; and if there really is a serious issue with Galaxy itself, it might apply globally, and we often can just fix it right away.
Bug or Error from tools? Sometimes it is hard to tell. If you are on the public Main instance, and ran a tool that produced a red error dataset, then you will probably want to start by reporting this as a Tool Error, but add in comments about your suspicions of a bug if there is something odd about the job failure.
What to include in a bug report
Where you are using Galaxy (Main, local, or cloud instance).
Bug reports from Test are generally not sent
- If a local or cloud instance, the distribution or galaxy-central hg pull #
- The date/time the bug was detected
- Exact steps to reproduce the issue
- What troubleshooting steps (if any) you have tested out
If you can reproduce on Main, you may be asked to send in a tool error report or share a history link. Use Options → Share or Publish, generate the link, and email it directly back off-list. Note the problem dataset #'s.
IMPORTANT: If data is involved, leave all of the related datasets in the analysis thread leading up to the bug in your history undeleted until we have written you back. Use Options → Show Deleted Datasets and click dataset links to undelete to recover error datasets before reporting a bug if necessary.
Always reply-all unless sharing a private link





