February 2015 Galaxy Update

Welcome to the February Galaxy Update, a summary of what is going on in the Galaxy community. Galaxy Updates complement the Galaxy Development News Briefs which accompany new Galaxy releases and focus on Galaxy code updates.

New Papers

In January, 55 new papers referencing, using, extending, and implementing Galaxy were added to the Galaxy CiteULike Group.

The new papers were related to:

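| # | Tag | # | Tag | # | Tag | # | Tag |
|---|---|---|---|---|---|---|---|
| 1 | Cloud | 2 | Project | 7 | Tools | 9 | UsePublic |
| 2 | HowTo | 1 | RefPublic | 1 | UseCloud | - | Visualization |
| 1 | IsGalaxy | 2 | Reproducibility | 2 | UseLocal | 6 | Workbench |
| 36 | Methods | 4 | Shared | 7 | UseMain | | |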

Events

GCC2015: 6-8 July, Norwich UK

Sponsor GCC2015

The 2015 Galaxy Community Conference (GCC2015) is the Galaxy community's annual gathering of users, developers, and administrators. Previous GCCs have drawn over 200 participants, and we expect that to happen again in 2015. GCC2015 is being hosted by The Sainsbury Lab in Norwich, UK, immediately before BOSC and ISMB/ECCB in Dublin.

Early Registration opens in February

Early registration (save heaps) and abstract submission will open later in February. If you work in data-intensive life science research, then it is hard to find a meeting more relevant than GCC2015. We look forward to seeing you there.

Training Day Topic Voting Closes TODAY

Vote now!

Voting on what topics should be offered at the GCC2015 Training Day closes today, 30 January.

Topics for the GCC2015 Training Day are selected by you, the Galaxy Community. There are nominated topics spanning from basic usage to advanced deployment. No matter what you do with Galaxy, there are topics for you to choose from. Your vote will determine the topics that are offered, which topics should be offered more than once, and which ones should not be scheduled at the same time. Your vote matters.

The Training Day schedule, including instructors, will be published before early registration opens in February.

About the GCC2015 Training Day:

The 2015 Galaxy Community Conference (GCC2015) will start on 6 July with a Training Day featuring parallel tracks, each with several multi-hour workshops. There will be at least one complete track about using Galaxy for biological research, and at least one full track on deploying and managing Galaxy instances.

Sponsorships

GenomeWeb

We are delighted to have GenomeWeb as a returning Silver Sponsor for GCC2015.

GenomeWeb is an independent online news organization based in New York. Since 1997, GenomeWeb has served the global community of scientists, technology professionals, and executives who use and develop the latest advanced tools in molecular biology research and molecular diagnostics. Our editorial mission is to cover the scientific and economic ecosystem spurred by the advent of high-throughput genome sequencing. We operate the largest online newsroom focused on advanced molecular research tools in order to provide our readers with exclusive news and in-depth analysis of this rapidly evolving market.

Globus Genomics

And we are delighted to announce that Globus Genomics is also a returning Silver Sponsor for GCC2015.

Globus Genomics is an integrated solution for Next Gen Sequencing analysis utilizing technologies in big data management and big data analysis. Globus Genomics combines big data management capabilities of Globus Online with the Galaxy framework and high throughput computing capabilities on Amazon Web Services. With Globus Genomics researchers can easily transfer large data sets and analyze the data in Galaxy. Globus Genomics is a non-profit service offering delivered and maintained by the Globus team at the Computation Institute, University of Chicago.

The Genome Analysis Centre (TGAC)

Finally, please welcome The Genome Analysis Centre (TGAC) as a first-time Silver Sponsor for GCC!

The Genome Analysis Centre (TGAC) is a world-class research institute focusing on the development of genomics and computational biology. TGAC is based on the Norwich Research Park and receives strategic funding from the Biotechnology and Biological Sciences Research Council (BBSRC) as well as from other research funders. TGAC offers a state-of-the-art DNA sequencing facility, unique in its operation of multiple complementary technologies for data generation. The Institute hosts one of the largest computing hardware facilities dedicated to life science research in Europe. It is also actively involved in developing novel platforms to provide access to computational tools and processing capacity for academic and industrial users, and in promoting applications of computational bioscience. Additionally, the Institute is a member of the Galaxy Training Network, offers a training programme through courses and workshops, and has an Education and Public Engagement programme targeting schools, teachers and the general public.

Call for Sponsors

The 2015 Galaxy Community Conference (GCC2015) is now accepting sponsorships. Your organisation can play a prominent part in the Galaxy community by sponsoring GCC2015. Sponsorship is an excellent way to raise your organisation’s visibility.

Several sponsorship levels are available, including two levels of premier sponsorship that include presentations. Premier sponsorships are limited, however, so you are encouraged to act soon.

Please let the organisers know if you are interested in helping make this event a success.

19 February GalaxyAdmins Meetup

The first GalaxyAdmins meetup of 2015 will happen online on Thursday, 19 February. GalaxyAdmins is a special interest group for Galaxy community members who are responsible for Galaxy installations.

A summary of both the user and admin/developer 2014 Galaxy Community Questionnaires will be presented by Dave Clements, a Galaxy Project Update will be offered by a Galaxy Team member, and Hans-Rudolf Hotz will lead a short discussion on GalaxyAdmins direction.

Thanks to everyone for letting us know what dates and times worked best for you. This time and day of the week worked for a remarkable 91% of respondents.

January Baltimore Area Galaxy Meetup Report

The first Baltimore Area Galaxy meetup was sold out. It drew the right mix of current Galaxy users, people who want to learn how to use Galaxy, and students in biology and computer science programs. It was good to see academics unwind with a few beers and talk about their ideas for using Galaxy. The idea behind the meetup was to build a local community of Galaxy users and to get to know one another personally. This should in turn foster collaborations and bring better feedback to the Galaxy Team.

The second meetup will be hosted in February, and we hope to have a slightly more focused agenda: to talk about one or a few of the topics raised during the course of the month. There is now a Baltimore newsletter (if you are interested), which goes out to the members who signed up for the first meetup. We hope to have more people in the coming months, and to grow a stronger Galaxy community in Baltimore.

Enis Afgan & Nitesh Turaga

Other Events

There are upcoming events in 5 countries on 4 continents. See the Galaxy Events Google Calendar for details on other events of interest to the community.

| Date | Topic/Event | Venue/Location | Contact |
|---|---|---|---|
| February 9-13 | Analyse bioinformatique de séquences sous Galaxy [GTN] | Montpellier, France | J.F. Dufayard |
| February 16-18 | Accessible and Reproducible Large-Scale Analysis with Galaxy | Genome and Transcriptome Analysis, part of Molecular Medicine Tri-Conference, San Francisco, California, United States | James Taylor |
| | Large-Scale NGS data Analysis on Amazon Web Services Using Globus Genomics | Genomics & Sequencing Data Integration, Analysis and Visualization, part of Molecular Medicine Tri-Conference, San Francisco, California, United States | Ravi Madduri |
| | iReport: An Integrative “omics” Reporting and Visualisation Platform | | Andrew Stubbs |
| February 19 | February 2015 GalaxyAdmins Web Meetup | Online | Hans-Rudolf Hotz, Dave Clements |
| March 2 | RNA-Seq analysis using Galaxy [GTN] | QFAB, University of Queensland, St Lucia, Australia | Mark Crowe |
| March 3-5 | Développement et intégration d’applications sous Galaxy [GTN] | La Chapelle Sur Erdre, France | Frederique Malipier |
| March 23-26 | RNA-Seq and ChIP-Seq Analysis with Galaxy [GTN] | UC Davis Bioinformatics Core, Davis, California, United States | UC Davis Bioinformatics |
| April 14-15 | GlobusWorld 2015 | Chicago, Illinois, United States | Globus Outreach |
| April 20-21 | Workshop: Extended RNA-Seq analysis [GTN] | QFAB, University of Queensland, St Lucia, Australia | Mark Crowe |
| April 28 | Galaxy Workshop Tokyo 2015 | RCAST, The University of Tokyo, Japan | Ryota Yamanaka |
| July 6-8 | 2015 Galaxy Community Conference (GCC2015) [GTN] | The Sainsbury Lab, Norwich, United Kingdom | Galaxy Outreach |

[GTN] designates a training event offered by GTN member(s).

Who's Hiring

Please Help! Yes you!

The Galaxy is expanding! Please help it grow.

Got a Galaxy-related opening? Send it to outreach@galaxyproject.org and we'll put it in the Galaxy News feed and include it in next month's update.


New Public Servers

Several new public Galaxy servers were added in January.

RiboGalaxy

RiboGalaxy provides on-line tools for the analysis and visualization of ribo-seq data obtained with the ribosome profiling technique. It is a freely available web server for processing and analysing ribosome profiling (ribo-seq) data with the visualization functionality provided by GWIPS-viz. RiboGalaxy provides a compact suite of tools specifically tailored for the alignment and visualization of ribo-seq and corresponding mRNA-seq data. Users can take advantage of the published workflows on RiboGalaxy which reduce the multi-step alignment process to a minimum of inputs.

RiboGalaxy has its own dedicated Help page. Please post any questions you may have on our RiboGalaxy Forum. Anonymous users can use RiboGalaxy. However, uploading and processing datasets larger than 2GB, and using advanced features such as published workflows, require the user to be registered on RiboGalaxy.

RiboGalaxy is supported by Science Foundation Ireland.

CardioVascular Research Grid (CVRG)

The CardioVascular Research Grid (CVRG) project provides CVRG Galaxy for secure seamless access to study data and analysis tools. Users can transfer data to CVRG Galaxy at high speed using Globus Connect. CVRG Galaxy provides a wide range of analysis algorithms, including Physionet algorithms for ECG analysis, and stored workflows that simplify the process of data analysis. CVRG Galaxy also has tools that help users run their analyses faster by using multiple processors on the Amazon Elastic Compute Cloud. CVRG Galaxy can annotate data by retrieving ontology terms from the Bioportal ontology server.

There is a CVRG Galaxy Wiki page for support. An account is required to use CVRG Galaxy, and anyone can create one.

"The CVRG project is supported by the National Heart Lung & Blood Institute. The project is based at the Institute for Computational Medicine at the Johns Hopkins University, Department of Biomedical Informatics at Vanderbilt University Medical Center, The College of Computing and Informatics at UNC Charlotte, The Center for Comprehensive Informatics at Emory University, The College of Engineering and Applied Sciences at Stony Brook University, and the Computation Institute at The University of Chicago."

Center for Phage Technology (CPT)

The Center for Phage Technology (CPT) Galaxy Server includes 50 additional tools, and PAUSE (Pile-Up Analysis Using Starts & Ends) V1 and V2 tool sets. There are also several published workflows demonstrating PAUSE for both paired and single-end reads.

An account is required, and anyone can create an account. See this FAQ.

CPT Galaxy is sponsored by the Center for Phage Technology (CPT), Texas A&M University.

Galaxy Community Hubs

  • Galaxy Training Network
  • Galaxy Community Log Board
  • Galaxy Deployment Catalog

Share your training resources and experience now.

Ruđer Bošković Institute (RBI)

New GTN Member: Ruđer Bošković Institute

We are pleased to welcome the Ruđer Bošković Institute in Zagreb, Croatia as the newest member of the Galaxy Training Network (GTN). Dobrodošli!

New Releases

Galaxy 2015-01-13 Distribution

Complete News Brief

Security

Several critical security vulnerabilities were recently discovered by Bartlomiej Balcerek and Mateusz Stahl at the Wroclaw Centre for Networking and Supercomputing. This stable Galaxy release contains fixes for those vulnerabilities. The Galaxy Team strongly encourages Galaxy server administrators to update their Galaxy servers immediately.

IPython Integration

Thanks to the awesome work of community members Björn Grüning and Helena Rasche, Galaxy now features integration with the popular IPython project. The Galaxy-IPython project has been merged into Galaxy core and made into a generic plugin framework of interactive environments based on Docker.

Tool Form Upgrade (for Beta Testing)

In Galaxy's development branch, the basic tool form has been redesigned and modernized to address certain limitations in Galaxy's responsiveness when working with longer forms containing multiple parameter choices. This new tool form will become the default with the next release, but we hope tool authors and power users will enable it and provide feedback during this release cycle, to ensure it is working well when it becomes the default.

Get The Distribution

getgalaxy.org
galaxy-dist.readthedocs.org
bitbucket.org/galaxy/galaxy-dist

new:
$ hg clone https://bitbucket.org/galaxy/galaxy-dist#stable

upgrade:
$ hg pull
$ hg update latest_2015.01.13

Planemo 0.2.0

Version 0.2.0 of Planemo was released in January. Planemo is a set of command-line utilities to assist in building tools for the Galaxy project. Updates included:

  • Improvements to the way Planemo loads its own copy of Galaxy modules to prevent various conflicts when launching Galaxy from Planemo. (Pull Request 56)
  • Allow setting various test output options in ~/.planemo.yml and disabling JSON output.
  • More experimental Brew and Tool Shed options that should not be considered part of Planemo's stable API. See this presentation for more details.
  • Fix project_init for BSD tar (thanks to Nitesh Turaga for the bug report).
  • Documentation fixes for tool linting command (thanks to Nicola Soranzo).

BioBlend

BioBlend v0.5.2 was released in October. BioBlend is a Python library for interacting with CloudMan and the Galaxy API. In January the BioBlend source was moved to the galaxyproject GitHub account.

CloudMan and blend4j

New versions of CloudMan and blend4j were released in August.


Galaxy ToolShed

ToolShed Contributions

Galaxy Project ToolShed Repos

Here are new contributions for the past month.

In no particular order:

New Tools

  • From rnateam:

    • mirdeep2: Identification of novel and known miRNAs. MiRDeep2 is a software package for identification of novel and known miRNAs in deep sequencing data. Furthermore, it can be used for miRNA expression profiling across samples.
    • mirdeep2_mapper: MiRDeep2 Mapper, process and map reads to a reference genome. The mapper module is designed as a tool to process deep sequencing reads and/or map them to the reference genome. The module works in sequence space, and can process or map data that is in sequence FASTA format. A number of the functions of the mapper module are implemented specifically with Solexa/Illumina data in mind.
    • mirdeep2_quantifier: Fast quantitation of reads mapping to known miRBase precursors. The module maps the deep sequencing reads to predefined miRNA precursors and thereby determines the expression of the corresponding miRNAs. First, the predefined mature miRNA sequences are mapped to the predefined precursors. Optionally, predefined star sequences can be mapped to the precursors too. In this way, the mature and star sequences in the precursors are determined. Second, the deep sequencing reads are mapped to the precursors. The number of reads falling into an interval 2nt upstream and 5nt downstream of the mature/star sequence is determined.
  • From hammock:

    • hammock: Cluster large amounts of short peptide sequences and generate MSAs of resulting clusters.
  • From iuc:

    http://snpeff.sourceforge.net/SnpSift.html#geneSets
    SnpEff and SnpSift are developed by Pablo Cingolani at http://snpeff.sourceforge.net/
    Repository-Maintainer: Björn Grüning, Jim Johnson, Nicola Soranzo
    Repository-Development: https://github.com/galaxy-iuc/tool_shed/

    • snpsift_dbnsfp_generic: snpEff SnpSift dbnsfp tool that can use any dbnsfp-like annotation data. Annotates variants on genes using a tabular set of annotation values such as those from dbNSFP, an integrated database of human functional predictions from multiple algorithms (SIFT, Polyphen2, LRT and MutationTaster, PhyloP and GERP++, etc.)
      This tool determines the available annotations from the input, so it can be used for organisms other than human or for annotation values other than those available from dbNSFP.
      http://snpeff.sourceforge.net/SnpSift.html#dbNSFP
      SnpEff and SnpSift are developed by Pablo Cingolani at http://snpeff.sourceforge.net/
      Repository-Maintainer: Björn Grüning, Jim Johnson, Nicola Soranzo
      Repository-Development: https://github.com/galaxy-iuc/tool_shed/
  • From amawla:

    • edger: Empirical analysis of digital gene expression data. GVL. PeterMac. Estimates differential gene expression for short read sequence counts using methods appropriate for count data.
      Performs digital differential gene expression analysis between groups (e.g., a treatment and a control). Biological replicates provide information about experimental variability required for reliable inference.
      Designed for easier use in Galaxy.
      R package requirements:

      • edgeR
      • limma

    Contributors: Monica Britton, Blythe Durbin-Johnson, Joseph Fass, Nikhil Joshi, Alex Mawla

  • From jjohnson:

    • gffread: cufflinks gffread filters and/or converts GFF3/GTF2 records and can produce cDNA, CDS, and peptide FASTA sequences.
  • From mvdbeek:

    • add_input_name_as_column: Add input name as column on an existing tabular file. Retrieves the history name of the input dataset and adds it as the last column. Very useful when working with dataset collections.
  • From peterjc:

    • seq_filter_by_mapping (v0.0.2, fixed some error messages): Filter sequencing reads using SAM/BAM mapping files. This tool is a short Python script (using Biopython library functions) which divides a FASTA, FASTQ, or SFF file in two: the sequences that map and the sequences that do not, according to the given SAM/BAM file(s).
      Example uses include mapping of FASTQ reads against a known contaminant in order to remove reads prior to a de novo assembly.
      See https://github.com/peterjc/pico_galaxy/tree/master/tools/seq_filter_by_mapping
  • From devteam:

    • bamtools_filter: Utility for filtering BAM files. It is based on the BAMtools suite of tools by Derek Barnett (https://github.com/pezmaster31/bamtools).
      GitHub repo for this collection of tools can be found here.
  • From bigrna:

    • gpsrna: Plant small RNA analysis toolkit, microRNA identification and quantification, multiple distinct small RNAs identification, small interacting RNAs identification and quantification.
  • From fubar:

    • tool_factory_2: Updated version of the tool factory (initial commit of code in the iuc GitHub repository). Includes arbitrary parameters and multiple inputs. The cost is more complex parameter parsing, but examples are provided. Now allows selecting any installed interpreter tool shed package or using the system default!
  • From nikhil-joshi:

    • sam2counts_edger: Takes SAM files and outputs a table of counts that can be used with edgeR.

Tool Suites

  • From rnateam:

    • suite_mirdeep_2_0: The suite of tool wrappers for mirDeep2. MiRDeep2 is a software package for identification of novel and known miRNAs in deep sequencing data. Furthermore, it can be used for miRNA expression profiling across samples.
  • From devteam:

Data Managers

Packages / Tool Dependency Definitions

  • From biopython:

    • package_biopython_1_65: Uploaded with NumPy 1.9 dependency. Downloads and compiles version 1.65 of the Biopython package. The Biopython Project is an international association of developers of freely available Python tools for computational molecular biology.
      http://www.biopython.org
  • From iuc:

  • From devteam:

    • package_bamtools_2_3_0_2d7685d2ae: bamtools, a collection of utilities for processing of BAM files. Binary files in this package are compiled from source code version 2d7685d2ae. This is a package dependency for tools relying on the bamtools toolkit developed by Derek Barnett (https://github.com/pezmaster31/bamtools). This package is distributed as x86_64 binaries only. These binaries should work on any of our stated supported Linux platforms other than RHEL/CentOS 5.
  • From malbuquerque:

    • package_seqtk_1_0_0: Builds Seqtk version 1.0.0. "Seqtk is a fast and lightweight tool for processing sequences in the FASTA or FASTQ format. It seamlessly parses both FASTA and FASTQ files which can also be optionally compressed by gzip." (Heng Li, Broad Institute)
    • package_bamtools_2_3_0: BamTools - a collection of utilities for processing of bam files - for all architectures
    • package_cmake_3_1_0: Builds CMake version 3.1.0 - Cross Platform Makefile Generator
    • package_ucsc_user_apps_310: Builds most UCSC user applications. Only those tools which successfully build without using mysql_config are included, specifically only those in the kent/src/utils directory (those publicly available).

Other News

And a Huge Thank You!

The Galaxy Team would like to thank everyone who took the time to respond to the project's call for support. Your responses will be extremely helpful with our renewal.