May 2014 Galaxy Update

Welcome to the May 2014 Galaxy Update, a monthly summary of what is going on in the Galaxy community. Galaxy Updates complement the Galaxy Development News Briefs which accompany new Galaxy releases and focus on Galaxy code updates.

Galaxy Biostar

Note: Galaxy's support forum has moved to help.galaxyproject.org.

Galaxy has teamed up with Biostar to create Galaxy Biostar, a Galaxy user support forum.

Galaxy Biostar is a space where researchers using Galaxy come together and share both scientific advice and practical tool help. Whether on usegalaxy.org, a CloudMan instance, or any other Galaxy (public or local), if you have something to say about Using Galaxy, this is the place to do it.

Current integration with usegalaxy.org

  • The whole history of the galaxy-user@bx.psu.edu mailing list was imported into Galaxy Biostar. Your prior posts are automatically claimed when you log in!
  • If you access Galaxy Biostar from usegalaxy.org (Menu: Help → Galaxy Biostar) you will be automatically logged in. A Galaxy Biostar account will be created for you if it did not previously exist. To obtain this account’s password please use the password reset feature of Galaxy Biostar.
  • When you have a question, search Galaxy Biostar directly from any Galaxy tool page.

Read more about how to get started on the Biostar wiki page or on Galaxy Biostar itself.

Roll-out phase

  • Galaxy Biostar is available at biostar.usegalaxy.org and will be our primary avenue for end-user support.
  • The galaxy-user@bx.psu.edu mailing list will continue to be supported during the transition. Starting now, however, please use the Galaxy Biostar forum to ask all questions about using Galaxy.
  • Please do not double post to both Galaxy Biostar and galaxy-user@bx.psu.edu.
  • Send us feedback in this Biostar post to tell us what you think. We care.

What’s next

Galaxy Biostar was launched on April 23. We hope you like the change and look forward to any feedback you may have.

New Papers

47 papers were added to the Galaxy CiteULike Group in April.

The new papers were tagged in 14 different areas (the most diverse month we've had):

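| # | Tag | # | Tag | # | Tag | # | Tag |
|---|-----|---|-----|---|-----|---|-----|
| 3 | Cloud | 2 | Project | 5 | Tools | 5 | UsePublic |
| 1 | HowTo | 1 | RefPublic | - | UseCloud | 3 | Visualization |
| 1 | IsGalaxy | 1 | Reproducibility | 2 | UseLocal | 12 | Workbench |
| 21 | Methods | 1 | Shared | 9 | UseMain | | |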

Events

GCC2014: June 30 - July 2, Baltimore

The 2014 Galaxy Community Conference (GCC2014) will be held June 30 through July 2, at the Homewood Campus of Johns Hopkins University, in Baltimore, Maryland, United States.

Early Registration Closes May 23

Early registration closes this month. Early registration saves more than 70% on registration costs, and Training Day registration is an additional 55% off if you register for both at the same time. This is by far the most affordable option, with early registration fees starting at less than $50 per day. When you register you can also reserve lodging at Charles Commons, a very affordable housing option in the same building as the conference.

Training Day is an opportunity to learn about all things Galaxy, including using Galaxy, deploying and managing Galaxy, extending Galaxy, and Galaxy internals. There are 5 parallel tracks, each with 3 sessions, and each session is two and a half hours long. That's 15 sessions and over 37 hours of workshop material.



Keynote Speaker: Steven Salzberg

We are pleased to announce that Steven Salzberg will be the keynote speaker at GCC2014. Steven is a Professor of Medicine, Biostatistics, and Computer Science at the Johns Hopkins University School of Medicine where he is also Director of the Center for Computational Biology at the McKusick-Nathans Institute of Genetic Medicine. Steven has made many prominent contributions to open source software, including several of the most popular tools used on Galaxy Platforms. Recently he was awarded the 2013 Benjamin Franklin Award for Open Access in the Life Sciences, and the 2012 Balles Prize in Critical Thinking for his science column at Forbes.

Steve's GCC2014 talk will be on "Transcriptomes and Exomes: Computational Challenges of NGS Data."

Galaxy Hackathon at GCC2014

The very first Galaxy Project Hackathon will be a three-day event taking place at Johns Hopkins on June 28-30, immediately preceding GCC2014.

Do you have a feature you've always wanted to implement? Just want to hack on Galaxy (or CloudMan!) with other folks? The Galaxy Hackathon will be a great opportunity to meet and work closely with other community and Galaxy Team members over the course of three days, culminating in some really great improvements and new features to show off at the Galaxy Community Conference afterward.

Participation in the hackathon itself is completely free, but space is limited. If you would like to participate, please book both your lodging and hackathon seat at EventBrite.

To help organize ideas and people into more concrete projects, we've also set up a hackathon-specific Trello board, and we'd love for everyone to start using it. The board is public and open to commentary and voting, but to create new cards you’ll need to be added as a member, so please note the instructions on the board for that.

Finally, we are very happy to have Curoverse on board as the exclusive Peta level sponsor of the hackathon. If you know of any other group that might be interested in sponsoring at the Giga level, please let us know.

Abstracts and Program

The deadlines for both oral and poster presentations were in April. Oral presentation submitters have been contacted and we've heard back from over half of them. We will continue to update the Talk Abstracts page as we hear from the rest. If you submitted a poster abstract, you will be notified at the end of this week whether your poster was accepted. We'll start posting those abstracts online then too.

The conference is still accepting late abstract submissions. These will not be considered for constructing the initial list of accepted abstracts, but will be reviewed as cancellations occur and space frees up (and we have always had a few cancellations).

Look for a more detailed draft program to be posted later this month, once we have heard from all presenters.

Sponsorships and Exhibitors

We are happy to have the Center for Epigenetics and the Department of Biological Chemistry, both of Johns Hopkins University, as GCC2014 Sponsors.

There are still Silver and Bronze sponsorships available for GCC2014 and Giga sponsorships for the Hackathon. Please contact the Organizers if your organization would like to help sponsor these events.

In 2014 we are also adding non-sponsor exhibit spaces in addition to the sponsor exhibits. This will significantly increase the size of the exhibit floor. Please contact the Organizers if your organization would like to have an exhibit space at GCC2014.

UK May 2014 Galaxy Tour

A Galaxy Tour is happening in the United Kingdom in early May 2014. If you are anywhere close to Norwich or Edinburgh, then it might be worth your while to attend an event.

First, there will be a talk on Scaling Galaxy for Big Data at the NGS Data after the Gold Rush meeting, being held 6-7 May at The Genome Analysis Centre (TGAC) in Norwich. This will be followed by a hands-on Introduction to Galaxy Workshop on 9 May, also at TGAC.

After that, there will be three events in Edinburgh the following week. Monday 12 May starts with a hands-on Introduction to Galaxy Workshop at the University of Edinburgh in the morning, followed in the afternoon by a Galaxy Project Update talk at the 5th Edinburgh Bioinformatics Meeting, also at the University. Finally, on Tuesday 13 May, there will be an all-day hands-on Galaxy Workshop at the Institute of Genetics and Molecular Medicine (IGMM) at Western General Hospital.

You must be affiliated with the University of Edinburgh or the IGMM to register for either of those workshops, but all other events are open to anyone.

Other Events

And don't worry if you are not near Norwich or Edinburgh in May. There are at least 17 other Galaxy-related events in the next 70 days in Norway, France, online, Croatia, Thailand, Canada, the US, the Netherlands, and Australia. Also see the Galaxy Events Google Calendar for details on other events of interest to the community.

| Date | Topic/Event | Venue/Location | Contact |
|------|-------------|----------------|---------|
| April 29 - May 1 | Two tutorials and at least five talks | BioIT World, Boston, Massachusetts, United States | See tutorial and talk list |
| May 6-7 | Scaling Galaxy for Big Data | NGS Data after the Gold Rush, TGAC, Norwich, United Kingdom | Dave Clements |
| May 9 | Introduction to Galaxy Workshop | The Genome Analysis Centre (TGAC), Norwich, United Kingdom | |
| May 12 | Galaxy Workshop | University of Edinburgh, Edinburgh, UK | |
| May 12 | Galaxy Project Update | 5th Edinburgh Bioinformatics Meeting, University of Edinburgh, Edinburgh, UK | |
| May 13 | Galaxy Workshop | Institute of Genetics and Molecular Medicine (IGMM), Edinburgh, UK | |
| May 12-14 | Short course on RNA-seq and ChIP-seq | University of Bergen, Bergen, Norway | David Fredman |
| May 16 | Galaxy Initiation | Formation en Bioinformatique Plateforme ABiMS, Station Biologique de Roscoff, France | Christophe Caron |
| May 19 | Initiation au traitement et à l'analyse des données métabolomiques sur la plateforme scientifique web Galaxy IFB-MetaboHUB | 8e Journées Scientifiques du RFMF, Lyon, France | Réseau Français de Métabolomique et Fluxomique |
| May 19-23 | Installing Galaxy Workshop (application deadline: April 28) | GMOD Online Training | Carl Eberhard, Amelia Ireland |
| May 27 | Enabling Data Analysis with Galaxy CloudMan workshop | MIPRO 2014, Opatija, Croatia | Enis Afgan |
| June 2-3 | Open (and Big) Data – the next challenge | 1st Asian Conference on Open Access Scholarly Publishing, Bangkok, Thailand | Scott Edmunds |
| June 9-10 | Informatics on High Throughput Sequencing Data Workshop | Toronto, Canada | Francis Ouellette |
| June 10 | Initiation à l’utilisation de Galaxy | Cycle "Bioinformatique par la pratique" 2014, INRA Jouy-en-Josas, France | Véronique Martin, Sophie Schbath |
| June 11 | Analyse primaire de données issues de séquenceurs nouvelle génération sous Galaxy | | |
| June 16 | Initiation à l’utilisation de Galaxy | Cycle "Bioinformatique par la pratique" 2014, INRA Jouy-en-Josas, France | Véronique Martin, Sophie Schbath |
| June 17 | Analyse primaire de données issues de séquenceurs nouvelle génération sous Galaxy | | |
| June 16-20 | Using Galaxy for Analysis of High Throughput Sequence Data | UC Davis, California, United States | UC Davis Bioinformatics Training |
| June 23-24 | Galaxy : initiation à la phylogénie | Formation en Bioinformatique Plateforme ABiMS, Station Biologique de Roscoff, France | Christophe Caron |
| June 28-30 | Galaxy Hackathon | Homewood Campus of Johns Hopkins University, Baltimore, Maryland, United States | Organizers |
| June 30 - July 2 | 2014 Galaxy Community Conference (GCC2014) | Homewood Campus of Johns Hopkins University, Baltimore, Maryland, United States | Organizers |
| July 7-9 | NBIC/BioSB RNA-seq data analysis course | Leiden, the Netherlands | NBIC/BioSB |
| July 10 | An Introduction to Galaxy with the Genomics Virtual Lab | Post-GSA 2014 Workshop, Sydney, Australia | Mark Crowe |

Who's Hiring

The Galaxy is expanding! Please help it grow.

Got a Galaxy-related opening? Send it to outreach@galaxyproject.org and we'll put it in the Galaxy News feed and include it in next month's update.


New Public Servers

Two public Galaxy servers were added to the published list in April:

Globus Genomics Proteomics

SunLab

  • Links: SunLab Galaxy Server
  • Domain/Purpose: Provides access to computational tools developed by Fengzhu Sun's group at University of Southern California, notably tools for local similarity analysis (LSA).
  • Comments:
  • User Support: Email (lxia AT usc DOT edu)
  • Quotas: "Due to the limited computational resources, we refer users not using the tools developed by SunLab to the main public Galaxy site. We also encourage user applying SunLab tools to large data sets to install their standalone version of the specific tools, or install this version of Galaxy server with SunLab tools integrated."
  • Sponsor(s): The SunLab at the University of Southern California.

Galaxy Distributions

April 14, 2014 Galaxy Distribution

*Trackster **deep coverage** view*

**[News Brief](http://wiki.galaxyproject.org/DevNewsBriefs/2014-04-14)** **Highlights:**
  • Visualization framework and Trackster display enhancements
  • Tool Shed upgrades for repos, installs, tests, and docs
  • Over 100 genomes with new content on our rsync server
  • UI unification of design plus expanded dataset action access
  • API functionality additions including new job control/admin abilities
  • More features for admin functions, config options, and job controls
  • 18 new community contributions incorporated (big thanks!)

Get Galaxy:
  • getgalaxy.org
  • galaxy-dist.readthedocs.org
  • bitbucket.org/galaxy/galaxy-dist
  • New install: $ hg clone https://bitbucket.org/galaxy/galaxy-dist#stable
  • Upgrade: $ hg pull, then $ hg update release_2014.04.14

CloudMan and BioBlend

BioBlend 0.4.3 was released on April 11, 2014.

The most recent version of CloudMan was released in January 2014.

Galaxy Community Hubs

The Community Log Board and Deployment Catalog Galaxy community hubs were launched in December. If you have a deployment or an experience you want to share, please publish it.


Galaxy ToolShed

New Public Tool Sheds

DTL ToolShed

The Dutch Techcentre for Life Sciences (DTL) has made its Galaxy ToolShed publicly available. The DTL ToolShed has almost 70 tools in it, from ANNOVAR to VCF-2-VariantList. This ToolShed was originally started at NBIC.

Galaxy Project ToolShed New Repositories

  • infernal: Infernal ("INFERence of RNA ALignment") searches DNA sequence databases for RNA structure and sequence similarities
  • msa_datatypes: Galaxy applicable data formats for Multiple Sequence Alignments
  • taxonomy_krona_chart: convert metagenomic profiling results into zoomable pie chart using Krona
  • deeptools_workflows: deepTools workflows to visualize large datasets in a meaningful way
  • mosaik2: reference-guided aligner for next-generation sequencing technologies.
  • suite_gops_1_0: Metarepository for the gops tool suite - will install the gops tool suite
  • suite_gatk_1_4: A suite of Galaxy utilities associated with version 1.4 of the GATK package.
  • join: Join the intervals of two datasets side-by-side
  • compute_q_values: Compute q-values based on multiple simultaneous tests p-values
  • charts: Enable advanced visualization options in Galaxy Charts, a visualization plugin for Galaxy
  • concat: Concatenate two datasets into one dataset
  • merge: Merge the overlapping intervals of a dataset
  • coverage: Coverage of a set of intervals on second set of intervals
  • basecoverage: count total bases covered by a set of intervals
  • intersect: Intersect the intervals of two datasets
  • flanking_features: Fetch closest non-overlapping feature for every interval
  • subtract: Subtract the intervals of two datasets
  • quality_filter: filter nucleotides in every alignment block of MAF file based on quality/PHRED scores
  • rcve: Compute RCVE (Relative Contribution to Variance) for all possible variable subsets
  • microsats_mutability: Estimate microsatellite mutability by specified attributes
  • partialr_square: Compute partial R square
  • linear_regression: uses R 'lm' function to perform linear regression
  • getindels_2way: Fetch Indels from pairwise alignments
  • getindelrates_3way: Estimate Indel Rates for 3-way alignments
  • cluster: Cluster the intervals of a dataset
  • complement: Complement intervals of a dataset
  • subtract_query: Subtract Whole Dataset from another dataset
  • featurecounter: find the coverage of intervals in the first dataset on intervals in the second dataset
  • logistic_regression_vif: Perform Logistic Regression with vif
  • maf_cpg_filter: Mask CpG/non-CpG sites from MAF file
  • get_flanks: find the upstream and/or downstream flanking region(s)
  • count_covariates: Count Covariates on BAM files
  • depth_of_coverage: Depth of Coverage on BAM files at different levels of partitioning and aggregation
  • substitutions: Fetch substitutions from pairwise alignments
  • microsats_alignment_level: Extract Orthologous Microsatellites from pair-wise alignments
  • variant_combine: Combines VCF records from different sources; supports full merges & set unions
  • best_regression_subsets: use regsubsets R function for regression subset selection
  • variant_filtration: Filter variant calls using user-selectable, parameterizable criteria
  • windowsplitter: splits intervals into smaller intervals based on the specified window-size and type
  • variants_validate: Validates a variants file.
  • variant_select: Select Variants from VCF files
  • table_recalibration: Second pass in a two-pass BAM processing step, doing a by-read traversal
  • realigner_target_creator: Realigner Target Creator for use in local realignment
  • variant_eval: General tool for variant evaluation (% in dbSNP, genotype concordance, Ti/Tv ratios, ...)
  • variant_recalibrator: learns a Gaussian mixture model over variant annotations and evaluates the variant
  • unified_genotyper: Variant caller which unifies approaches of several disparate callers
  • substitution_rates: Estimate substitution rates for non-coding regions using Jukes-Cantor JC69 model
  • variant_annotator: Annotate variant calls with context information.
  • weightedaverage: Assign weighted-average of the values of features overlapping an interval
  • tables_arithmetic_operations: Arithmetic Operations on tables
  • print_reads: Dynamically merge multiple BAM files, resulting in merged output sorted in coordinate order
  • variant_apply_recalibration: Cut vcf to get novel FDR levels specified during Variant Recalibration
  • indel_realigner: local realignment of reads based on misalignments due to the presence of indels
  • analyze_covariates: Create collapsed recal csv files, call R to plot residual error vs covariates
  • data_manager_gatk_picard_index_builder: Generate GATK-sorted Picard indexes
  • chartskit: Enables advanced visualization options in Galaxy Charts
  • column_join: Join tabular files

Other News