PX174 Track Settings
 
PX174 tracks (All Isolate Alignments tracks)

Description

This set of tracks displays (a) the ALE assembly validation tracks and (b) the depth of coverage, both generated by aligning raw Illumina reads from C. elegans wild isolate PX174 to the CB4856 reference genome. Four tracks are present:

  1. ALE Depth, evaluating the evenness of sequencing depth (accounting for GC bias).
  2. ALE Insert, quantifying how well the insert sizes of mapped reads match the distribution expected from the sequencing library.
  3. ALE Placement, describing how well reads agree with the assembly given their probabilistic placement by the aligner.
  4. Coverage, the raw sequencing coverage.

The three ALE tracks were made using alignments to the CB4856 reference genome generated with SMALT (H. Ponstingl, see References), while the raw coverage track was made using alignments generated with phaster (P. Green, unpublished).

Display Conventions and Configuration

All four tracks are displayed in WIG format. Tracks generated with ALE are scaled from -10 to 10 by default, so that regions with scores consistently lower than -10 are easy to see. In regions of high agreement between PX174 and CB4856, all three ALE scores deviate only slightly and very sporadically from 0.

Sequencing of each wild isolate in the Million Mutation Project targeted a coverage depth of approximately 30x. The default vertical viewing range of the coverage track has been set to 100x so that regions of greater depth remain visible without auto-scaling in repetitive regions. These parameters can be adjusted as desired in the Track Settings.
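As an illustration of what "consistently lower than -10" can mean in practice, the following minimal Python sketch scans a list of per-base scores and reports runs of consecutive positions that fall below a threshold. The function name `low_score_regions`, the `min_length` parameter, and the in-memory list of scores are assumptions made for this example; they are not part of ALE or of the browser.

```python
def low_score_regions(scores, threshold=-10.0, min_length=3):
    """Return (start, end) pairs, 1-based and inclusive, for runs of at
    least `min_length` consecutive positions scoring below `threshold`.

    `scores` is a hypothetical list holding one score per base.
    """
    regions = []
    run_start = None
    for position, score in enumerate(scores, start=1):
        if score < threshold:
            if run_start is None:
                run_start = position  # a new low-scoring run begins here
        else:
            # The run (if any) ended at the previous position.
            if run_start is not None and position - run_start >= min_length:
                regions.append((run_start, position - 1))
            run_start = None
    # Close a run that extends to the end of the sequence.
    if run_start is not None and len(scores) - run_start + 1 >= min_length:
        regions.append((run_start, len(scores)))
    return regions


example_scores = [0, -0.1, -12, -15, -11, 0, -20, 0, -11, -12, -13]
regions = low_score_regions(example_scores)
```

Requiring a minimum run length filters out isolated single-base dips, which matches the idea of looking for consistent rather than sporadic deviations.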

Methods

To generate ALE assembly annotations, we first mapped reads from C. elegans isolate PX174 to the CB4856 reference sequence using SMALT version 0.7.0.1. The resulting BAM alignments were used as input for ALE version 20130717, which output statistically defined Depth, Insert, and Placement scores for each base in the assembly.

To generate the raw coverage of PX174 mapped to CB4856, paired-end Illumina reads were converted to calf format and aligned to the reference genome with phaster. A maximum gapped indel size of 300 bp was allowed, as in the Million Mutation Project, and coverage depth was extracted from the merged BAM file using the samtools "mpileup" command.
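The coverage extraction step can be sketched as follows. This is a minimal Python example, not the pipeline actually used for this track: it assumes the text output of samtools "mpileup" (tab-separated columns: sequence name, 1-based position, reference base, depth, read bases, base qualities) and converts it to variableStep wiggle lines. The function name `mpileup_to_wig` and the sample lines are hypothetical.

```python
def mpileup_to_wig(lines):
    """Convert mpileup text lines into variableStep wiggle lines.

    Only columns 1 (sequence name), 2 (1-based position) and 4 (depth)
    are used. A new declaration line is emitted whenever the sequence
    name changes.
    """
    wig_lines = []
    current_chrom = None
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        fields = line.split("\t")
        chrom, position, depth = fields[0], int(fields[1]), int(fields[3])
        if chrom != current_chrom:
            wig_lines.append(f"variableStep chrom={chrom}")
            current_chrom = chrom
        wig_lines.append(f"{position} {depth}")
    return wig_lines


# Hypothetical mpileup lines, for illustration only.
example_pileup = [
    "chrI\t100\tA\t3\t...\tIII",
    "chrI\t101\tC\t2\t..\tII",
    "chrII\t5\tG\t1\t.\tI",
]
wig = mpileup_to_wig(example_pileup)
```

Positions with no aligned reads may be absent from mpileup output, which is why the sketch uses variableStep rather than fixedStep wiggle.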

Credits

Please feel free to contact Owen Thompson with any questions and/or concerns regarding this or other tracks.

The raw data used to generate the ALE scores and depth of coverage displayed here are available from the Sequence Read Archive (SRA) as experiment SRX219154.

Thanks to the ALE team at the Joint Genome Institute and Cornell University for their contribution to assembly annotation. Thanks also to Hannes Ponstingl for his alignment algorithm SMALT, and to Phil Green for his alignment algorithm phaster. Thanks also to Heng Li for his work creating samtools.

References

Clark SC, Egan R, Frazier PI, Wang Z. ALE: a generic assembly likelihood evaluation framework for assessing the accuracy of genome and metagenome assemblies. Bioinformatics. 2013;29(4):435-43. PMID 23303509.

Li H, Handsaker B, Wysoker A, et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics. 2009;25(16):2078-9. PMID 19505943.

Ponstingl H. SMALT - Sequence Mapping and Alignment Tool. 2012 May 11. Retrieved from http://www.sanger.ac.uk/resources/software/smalt.

Thompson O, Edgley M, Strasbourger P, et al. The million mutation project: a new approach to genetics in Caenorhabditis elegans. Genome Res. 2013;23(10):1749-62. PMID 23800452.