
Data files used in various teaching exercises based on mouse mm9 chr19

  • Used in "Using Galaxy to Perform Large-Scale Interactive Data Analysis: A live supplement" (June 2012 update), published in Current Protocols in Bioinformatics (pending release)

    • Using Galaxy to Perform Large-Scale Interactive Data Analysis - Publication, June 2012 CPB
    • Complete Live Supplemental - Page, Galaxy /Main
    • Loading Data and Understanding Datatypes - Screencast, Protocol 2
    • Calling Peaks for ChIP-seq Data - Screencast, Protocol 3
  • Other

Mammalian Promoter Database Files

MPromDB
These data come from the [Mammalian Promoter Database (MPromDB)](http://mpromdb.wistar.upenn.edu/) at the [Wistar Institute](http://www.wistar.org). MPromDB "is a curated database that strives to annotate gene promoters identified from ChIP-Seq experiment results. The long term goal of this database is to provide an integrated resource for mammalian gene transcriptional regulation and epigenetics." It is produced and supported by the [Davuluri Lab](http://bioinformatics.wistar.upenn.edu/davuluri), and the Galaxy Team wishes to thank them for allowing us to use these data.

Restrictions

MPromDB is a public resource. However, it does have a few restrictions:

  1. You must be a registered user to download data files.
  2. Downloaded data files may be used only for non-commercial purposes.

The files here are a small subset of those available for download from MPromDB. We ask you to honor MPromDB's use restrictions when working with them.

Data

File Description
[MM9.chr19.AnnotatedPromotersWithTissueRNAP2Density.txt](ATTACHMENT_URLMM9.chr19.AnnotatedPromotersWithTissueRNAP2Density.txt)

Mouse ENCODE Files

These files are from the "Transcription Factor Binding Sites by ChIP-seq from ENCODE/Stanford/Yale" mouse ChIP-seq experiment in the ENCODE project. The data were generated and analyzed by the labs of Michael Snyder at Stanford University and Sherman Weissman at Yale University. The exact datasets used can be found at:

  • ftp://hgdownload.cse.ucsc.edu/goldenPath/mm9/encodeDCC/wgEncodeSydhTfbs/

    • wgEncodeSydhTfbsMelCtcfDmso20IggyaleRawDataRep2.fastq.gz - tags dataset
    • wgEncodeSydhTfbsMelInputDmso20IggyaleRawData.fastq.gz - control dataset (note: this is a correction from the control data source listed in the publication)

The original files from ENCODE were too large to use in teaching examples, so they have been reduced to contain only the data that correspond to chromosome 19 (the shortest mouse autosome).

These files were created by, well, cheating. We first processed the entire dataset, mapping it to the mm9 genome (source: UCSC). We then went back and extracted from the original datasets only those records that eventually mapped to chromosome 19.
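The extraction step can be sketched roughly as follows. This is only an illustration, not the procedure the Galaxy Team actually ran: it assumes the mapper produced SAM-format output and that read names in the SAM file match the first token of each FASTQ header. The function names `chr19_read_names` and `filter_fastq` are hypothetical.

```python
def chr19_read_names(sam_lines, chrom="chr19"):
    """Collect the names of reads that mapped to `chrom` in SAM-format lines."""
    names = set()
    for line in sam_lines:
        if line.startswith("@"):
            continue  # skip SAM header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 4:
            continue  # not a complete alignment record
        qname, flag, rname = fields[0], int(fields[1]), fields[2]
        if flag & 0x4:
            continue  # FLAG bit 0x4 marks an unmapped read
        if rname == chrom:
            names.add(qname)
    return names


def filter_fastq(fastq_lines, keep_names):
    """Yield the lines of each 4-line FASTQ record whose read name is in keep_names."""
    record = []
    for line in fastq_lines:
        record.append(line)
        if len(record) == 4:
            # Header looks like "@name optional description"; keep only the name.
            parts = record[0][1:].split()
            name = parts[0] if parts else ""
            if name in keep_names:
                yield from record
            record = []
```

Both functions accept any iterable of lines, so open file handles for the mapped output and the original FASTQ can be passed in directly, and the yielded lines written to a new, smaller FASTQ file.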

These data are also available as Galaxy dataset objects in Protocol 2 of "Using Galaxy to Perform Large-Scale Interactive Data Analysis: A live supplement" (June 2012 update), published in Current Protocols in Bioinformatics (pending release).

File Description
[MouseChipSeqControlChr19.fastq](ATTACHMENT_URLMouseChipSeqControlChr19.fastq)
[MouseChipSeqRep1Chr19.fastq](ATTACHMENT_URLMouseChipSeqRep1Chr19.fastq)