About Galaxy on the Cloud
Because data often becomes available sporadically, individuals and labs may need to process widely varying amounts of data over time. This variability in data volume translates into variable requirements for the compute resources used to process the data. So that users need not purchase and maintain those resources, or wait a long time for data processing jobs to complete, the Galaxy Team has enabled Galaxy to be instantiated on cloud computing infrastructures, primarily Amazon Elastic Compute Cloud (EC2). An instance of Galaxy on the Cloud behaves just like a local instance of Galaxy, except that it offers the benefits of cloud computing: resource availability, scalability, and a pay-as-you-go resource ownership model.
Because a pre-configured Galaxy on the Cloud is simple to launch, you can acquire and start as many instances of Galaxy as are needed to process your data, without setting up Galaxy yourself. Each instance can be dynamically scaled as a virtual cluster in the cloud, which takes only minutes. Once the need for compute resources subsides, those instances can be shut down. With this model, you pay only for the resources you need and use.
When to use Galaxy on the Cloud
The following is a non-exhaustive list of scenarios in which it is beneficial to use Galaxy on the Cloud:
- You do not want to spend time setting up a Galaxy instance
- You need to customize a Galaxy instance with new tools or genome reference data
- You have variable or high requirements for compute or storage resources
Getting started with Galaxy on the Cloud
To start your own Galaxy on the Cloud cluster, see the Getting Started page. This page describes concepts and points to key features of using Galaxy on the Cloud.
Determining the size of your cloud cluster
Cloud computing allows your cloud cluster to be variable in size and capacity. See this page for some guidelines on how to decide what is right for you.
Customizing your cloud cluster
If you are interested in running your own version of Galaxy and/or tools on the cloud while utilizing all the automation and functionality provided by CloudMan, this page explains how to do it.
A note about costs
Amazon EC2 is a pay-as-you-go service; all you need to use it is a valid credit card. Current rates are listed on the Amazon EC2 pricing page.
To see how much using the Amazon cloud might cost, you can use the AWS cost calculator. When calculating the total cost, account for the EBS data volumes associated with your cluster in addition to the EC2 instances. Each Galaxy cluster has two EBS volumes: your data volume (you choose its size when setting up the cluster, say 100 GB to begin with) and the genomics indices volume (600 GB). Note that the indices volume can be greatly reduced if you don't need all the genome data; contact us about how to do this.
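The cost arithmetic described above can be sketched in a few lines of Python. This is only a rough illustration, not a replacement for the AWS cost calculator: the function name, its parameters, and the rates used in the example are hypothetical placeholders (they are not real Amazon prices), and the sketch assumes that compute is billed per instance-hour and EBS storage per GB-month. Only the 100 GB and 600 GB volume sizes come from the text.

```python
def estimate_cluster_cost(
    instance_hourly_usd: float,    # hypothetical EC2 rate per instance-hour
    instance_count: int,           # number of instances in the cluster
    hours_running: float,          # hours the instances are running
    ebs_usd_per_gb_month: float,   # hypothetical EBS rate per GB-month
    storage_months: float = 1.0,   # how long the EBS volumes are kept
    data_volume_gb: float = 100,   # user data volume (example size from the text)
    indices_volume_gb: float = 600,  # genomics indices volume (size from the text)
) -> dict:
    """Return a rough compute/storage/total cost breakdown in USD."""
    values = (
        instance_hourly_usd, instance_count, hours_running,
        ebs_usd_per_gb_month, storage_months, data_volume_gb, indices_volume_gb,
    )
    if any(v < 0 for v in values):
        raise ValueError("all inputs must be non-negative")

    # Compute cost: rate x number of instances x hours running.
    compute_usd = instance_hourly_usd * instance_count * hours_running

    # Storage cost: both EBS volumes are billed for as long as they exist,
    # whether or not the instances are running.
    total_gb = data_volume_gb + indices_volume_gb
    storage_usd = total_gb * ebs_usd_per_gb_month * storage_months

    return {
        "compute_usd": compute_usd,
        "storage_usd": storage_usd,
        "total_usd": compute_usd + storage_usd,
    }


if __name__ == "__main__":
    # Placeholder rates chosen for easy arithmetic, NOT actual AWS prices.
    estimate = estimate_cluster_cost(
        instance_hourly_usd=0.5,
        instance_count=2,
        hours_running=40,
        ebs_usd_per_gb_month=0.125,
    )
    print(estimate)
```

Keeping storage separate from compute makes one point visible: shutting down instances stops the compute charge, but the volumes keep accruing cost until they are deleted.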
- AMI: ami-b45e59de
Name: Galaxy-CloudMan-1457720469 (active dates: 2016-03-24 -> present)
Note that the current AMI is a machine image that provides the environment required to run CloudMan; its release date does not indicate the most recent update or version of either CloudMan or Galaxy. The versions of those tools can be seen, and automatically updated with the Update button on the CloudMan Admin page, once an instance has been started (we are also looking into a more explicit way of making this information available).
- AMI: ami-d5246abf
Name: Galaxy-CloudMan-1449500413 (active dates: 2015-12-18 -> 2016-03-24)
- AMI: ami-d1c77fba
Name: Galaxy-CloudMan-1440625733 (active dates: 2015-09-03 -> 2015-12-18)
- AMI: ami-a7dbf6ce
Name: Galaxy CloudMan 2.3 (active dates: 2014-01-07 -> 2015-09-03)
- AMI: ami-118bfc78
Name: 861460482541/Galaxy CloudMan 2.0
- AMI: ami-da58aab3
Name: 861460482541/galaxy-cloudman-2011-03-22
- AMI: ami-9a7485f3
Name: 861460482541/galaxy-cloudman-2010-01-12
- AMI: ami-228a7e4b
Name: 115971652512/galaxy-cloudman-2010-10-08
- AMI: ami-ed03ed84
Name: 115971652512/galaxy-2010-04-20_2
Note that the AMI ami-561bc93f (072133624695/galaxy-cloudman-2012-02-26) is of unknown origin and is not supported.
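Before launching from an AMI ID found elsewhere, it can help to check it against the list above. The following is a minimal sketch of such a check; the AMI IDs and names are copied from this page, while the function name `ami_status` and the status labels are hypothetical and exist only for illustration.

```python
# AMI listed on this page with active dates ending in "present".
CURRENT_AMI = "ami-b45e59de"

# All AMIs listed on this page as Galaxy CloudMan images (ID -> name).
LISTED_AMIS = {
    "ami-b45e59de": "Galaxy-CloudMan-1457720469",
    "ami-d5246abf": "Galaxy-CloudMan-1449500413",
    "ami-d1c77fba": "Galaxy-CloudMan-1440625733",
    "ami-a7dbf6ce": "Galaxy CloudMan 2.3",
    "ami-118bfc78": "861460482541/Galaxy CloudMan 2.0",
    "ami-da58aab3": "861460482541/galaxy-cloudman-2011-03-22",
    "ami-9a7485f3": "861460482541/galaxy-cloudman-2010-01-12",
    "ami-228a7e4b": "115971652512/galaxy-cloudman-2010-10-08",
    "ami-ed03ed84": "115971652512/galaxy-2010-04-20_2",
}

# AMI flagged on this page as being of unknown origin and not supported.
UNSUPPORTED_AMIS = {
    "ami-561bc93f": "072133624695/galaxy-cloudman-2012-02-26",
}


def ami_status(ami_id: str) -> str:
    """Classify an AMI ID as 'current', 'previous', 'unsupported', or 'unknown'."""
    if ami_id in UNSUPPORTED_AMIS:
        return "unsupported"
    if ami_id == CURRENT_AMI:
        return "current"
    if ami_id in LISTED_AMIS:
        return "previous"
    # Not mentioned on this page at all.
    return "unknown"


if __name__ == "__main__":
    for candidate in ("ami-b45e59de", "ami-a7dbf6ce", "ami-561bc93f"):
        print(candidate, ami_status(candidate))
```

The lookup only reflects this page at the time of writing; it does not query Amazon, so an ID reported as "unknown" is simply one that is not listed here.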
Citing and Publications
If Galaxy on the Cloud has been significant to a project that has led to an academic publication, please acknowledge the contribution by citing the following paper:
- Afgan E., Baker D., Coraor N., Goto H., Paul I.M., Makova K.D., Nekrutenko A., Taylor J., "Harnessing cloud computing with Galaxy Cloud," Nature Biotechnology, Vol 29, Issue 11, 2011.
For a complete list of publications and presentations linked to CloudMan and Galaxy on the Cloud, see this page.