Package picard.analysis
Class CollectWgsMetrics
java.lang.Object
  picard.cmdline.CommandLineProgram
    picard.analysis.CollectWgsMetrics
Direct Known Subclasses: CollectRawWgsMetrics, CollectWgsMetricsWithNonZeroCoverage
@DocumentedFeature public class CollectWgsMetrics extends CommandLineProgram
Computes a number of metrics that are useful for evaluating coverage and performance of whole genome sequencing experiments. Two algorithms are available for computing these metrics: default and fast. The fast algorithm is enabled by the USE_FAST_ALGORITHM option. It works better for regions of a BAM file with coverage of at least 10 reads per locus; for lower coverage, the two algorithms perform the same.
-
-
Nested Class Summary
static class CollectWgsMetrics.CollectWgsMetricsIntervalArgumentCollection
protected static class CollectWgsMetrics.WgsMetricsCollector
-
Field Summary
List<Double> ALLELE_FRACTION
boolean COUNT_UNPAIRED
int COVERAGE_CAP
boolean INCLUDE_BQ_HISTOGRAM
File INPUT
protected IntervalArgumentCollection intervalArgumentCollection
protected File INTERVALS
int LOCUS_ACCUMULATION_CAP
int MINIMUM_BASE_QUALITY
int MINIMUM_MAPPING_QUALITY
File OUTPUT
int READ_LENGTH
int SAMPLE_SIZE
long STOP_AFTER
File THEORETICAL_SENSITIVITY_OUTPUT
boolean USE_FAST_ALGORITHM
-
Fields inherited from class picard.cmdline.CommandLineProgram
COMPRESSION_LEVEL, CREATE_INDEX, CREATE_MD5_FILE, GA4GH_CLIENT_SECRETS, MAX_ALLOWABLE_ONE_LINE_SUMMARY_LENGTH, MAX_RECORDS_IN_RAM, QUIET, REFERENCE_SEQUENCE, referenceSequence, specialArgumentsCollection, SYNTAX_TRANSITION_URL, TMP_DIR, USE_JDK_DEFLATER, USE_JDK_INFLATER, VALIDATION_STRINGENCY, VERBOSITY
-
-
Constructor Summary
CollectWgsMetrics()
-
Method Summary
protected int doWork()
    Do the work after command line has been parsed.
protected WgsMetrics generateWgsMetrics(htsjdk.samtools.util.IntervalList intervals, htsjdk.samtools.util.Histogram<Integer> highQualityDepthHistogram, htsjdk.samtools.util.Histogram<Integer> unfilteredDepthHistogram, double pctExcludedByAdapter, double pctExcludedByMapq, double pctExcludedByDupes, double pctExcludedByPairing, double pctExcludedByBaseq, double pctExcludedByOverlap, double pctExcludedByCapping, double pctTotal, int coverageCap, htsjdk.samtools.util.Histogram<Integer> unfilteredBaseQHistogram, int theoreticalHetSensitivitySampleSize)
protected long getBasesExcludedBy(CountingFilter filter)
    If INTERVALS is specified, this will count bases beyond the interval list when the read overlaps the intervals and extends beyond the edge.
protected AbstractWgsMetricsCollector getCollector(int coverageCap, htsjdk.samtools.util.IntervalList intervals)
    Creates an AbstractWgsMetricsCollector implementation according to the USE_FAST_ALGORITHM value.
protected htsjdk.samtools.util.IntervalList getIntervalsToExamine()
    Gets the intervals over which we will calculate metrics.
protected htsjdk.samtools.util.AbstractLocusIterator getLocusIterator(htsjdk.samtools.SamReader in)
    Creates an AbstractLocusIterator implementation according to the USE_FAST_ALGORITHM value.
protected htsjdk.samtools.SAMFileHeader getSamFileHeader()
    This method should only be called after getSamReader() is called.
protected htsjdk.samtools.SamReader getSamReader()
    Gets the SamReader from which records will be examined.
protected IntervalArgumentCollection makeIntervalArgumentCollection()
protected boolean requiresReference()
-
Methods inherited from class picard.cmdline.CommandLineProgram
checkRInstallation, customCommandLineValidation, getCommandLine, getCommandLineParser, getCommandLineParserForArgs, getDefaultHeaders, getFaqLink, getMetricsFile, getPGRecord, getStandardUsagePreamble, getStandardUsagePreamble, getVersion, hasWebDocumentation, instanceMain, instanceMainWithExit, makeReferenceArgumentCollection, parseArgs, setDefaultHeaders, useLegacyParser
-
Field Detail
-
INPUT
@Argument(shortName="I", doc="Input SAM/BAM/CRAM file.") public File INPUT
-
OUTPUT
@Argument(shortName="O", doc="Output metrics file.") public File OUTPUT
-
MINIMUM_MAPPING_QUALITY
@Argument(shortName="MQ", doc="Minimum mapping quality for a read to contribute coverage.") public int MINIMUM_MAPPING_QUALITY
-
MINIMUM_BASE_QUALITY
@Argument(shortName="Q", doc="Minimum base quality for a base to contribute coverage. N bases will be treated as having a base quality of negative infinity and will therefore be excluded from coverage regardless of the value of this parameter.") public int MINIMUM_BASE_QUALITY
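The two thresholds above can be read as a pair of filters applied before a base is counted. The following is a minimal sketch in plain Java, not Picard's implementation; the class name `CoverageFilterSketch`, the method `countsTowardCoverage`, and the use of `>=` for the comparisons are assumptions made for illustration.

```java
final class CoverageFilterSketch {
    /**
     * Hypothetical helper: returns true if a base would contribute to coverage
     * under the MINIMUM_MAPPING_QUALITY and MINIMUM_BASE_QUALITY rules as documented.
     */
    static boolean countsTowardCoverage(char base, int baseQuality, int mappingQuality,
                                        int minBaseQuality, int minMappingQuality) {
        // The whole read is skipped if its mapping quality is below the minimum.
        if (mappingQuality < minMappingQuality) {
            return false;
        }
        // N bases are treated as having a quality of negative infinity,
        // so they are excluded whatever the base-quality threshold is.
        if (base == 'N' || base == 'n') {
            return false;
        }
        return baseQuality >= minBaseQuality;
    }

    public static void main(String[] args) {
        System.out.println(countsTowardCoverage('A', 30, 60, 20, 20));
        System.out.println(countsTowardCoverage('N', 40, 60, 0, 0));
    }
}
```

Note that the N check comes before the base-quality comparison, so an N is excluded even when the minimum base quality is 0.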
-
COVERAGE_CAP
@Argument(shortName="CAP", doc="Treat positions with coverage exceeding this value as if they had coverage at this value (but calculate the difference for PCT_EXC_CAPPED).") public int COVERAGE_CAP
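To make the capping rule concrete, here is a small sketch of the arithmetic it describes. It is an illustration only: `CoverageCapSketch` and its methods are hypothetical names, and how Picard converts the excess into the PCT_EXC_CAPPED fraction is not specified on this page.

```java
final class CoverageCapSketch {
    /** Depth as recorded: anything above the cap is treated as being at the cap. */
    static int cappedDepth(int rawDepth, int coverageCap) {
        return Math.min(rawDepth, coverageCap);
    }

    /** Total bases above the cap across positions (the "difference" the doc mentions). */
    static long basesExcludedByCap(int[] rawDepths, int coverageCap) {
        long excluded = 0;
        for (int depth : rawDepths) {
            excluded += Math.max(0, depth - coverageCap);
        }
        return excluded;
    }

    public static void main(String[] args) {
        int[] depths = {10, 300, 260};
        System.out.println(cappedDepth(300, 250));
        System.out.println(basesExcludedByCap(depths, 250));
    }
}
```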
-
LOCUS_ACCUMULATION_CAP
@Argument(doc="At positions with coverage exceeding this value, completely ignore reads that accumulate beyond this value (so that they will not be considered for PCT_EXC_CAPPED). Used to keep memory consumption in check, but could create bias if set too low") public int LOCUS_ACCUMULATION_CAP
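The difference from COVERAGE_CAP is that reads beyond the accumulation cap are dropped entirely, so they are not available to be counted as capped. A rough sketch of that interaction follows, assuming the accumulation cap is applied first and the coverage cap second; the class and method names are hypothetical and this is not Picard's code.

```java
final class LocusAccumulationSketch {
    /** Reads retained at a locus once the accumulation cap is applied. */
    static long readsKept(long readsAtLocus, int locusAccumulationCap) {
        return Math.min(readsAtLocus, locusAccumulationCap);
    }

    /** Reads ignored completely; these never reach the coverage-cap accounting. */
    static long readsIgnored(long readsAtLocus, int locusAccumulationCap) {
        return Math.max(0L, readsAtLocus - locusAccumulationCap);
    }

    /** Of the retained reads, how many exceed the coverage cap (assumed order of operations). */
    static long excessOverCoverageCap(long readsAtLocus, int locusAccumulationCap, int coverageCap) {
        return Math.max(0L, readsKept(readsAtLocus, locusAccumulationCap) - coverageCap);
    }

    public static void main(String[] args) {
        System.out.println(readsIgnored(500_000L, 100_000));
        System.out.println(excessOverCoverageCap(500_000L, 100_000, 250));
    }
}
```

Under these assumptions, a very low accumulation cap shrinks the pool of reads that can be attributed to capping, which is one way the bias mentioned in the argument description could arise.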
-
STOP_AFTER
@Argument(doc="For debugging purposes, stop after processing this many genomic bases.") public long STOP_AFTER
-
INCLUDE_BQ_HISTOGRAM
@Argument(doc="Determines whether to include the base quality histogram in the metrics file.") public boolean INCLUDE_BQ_HISTOGRAM
-
COUNT_UNPAIRED
@Argument(doc="If true, count unpaired reads, and paired reads with one end unmapped") public boolean COUNT_UNPAIRED
-
SAMPLE_SIZE
@Argument(doc="Sample Size used for Theoretical Het Sensitivity sampling. Default is 10000.", optional=true) public int SAMPLE_SIZE
-
intervalArgumentCollection
@ArgumentCollection protected IntervalArgumentCollection intervalArgumentCollection
-
THEORETICAL_SENSITIVITY_OUTPUT
@Argument(doc="Output for Theoretical Sensitivity metrics.", optional=true) public File THEORETICAL_SENSITIVITY_OUTPUT
-
ALLELE_FRACTION
@Argument(doc="Allele fraction for which to calculate theoretical sensitivity.", optional=true) public List<Double> ALLELE_FRACTION
-
USE_FAST_ALGORITHM
@Argument(doc="If true, fast algorithm is used.") public boolean USE_FAST_ALGORITHM
-
READ_LENGTH
@Argument(doc="Average read length in the file. Default is 150.", optional=true) public int READ_LENGTH
-
INTERVALS
protected File INTERVALS
-
-
Method Detail
-
requiresReference
protected boolean requiresReference()
- Overrides:
- requiresReference in class CommandLineProgram
-
makeIntervalArgumentCollection
protected IntervalArgumentCollection makeIntervalArgumentCollection()
- Returns:
- An interval argument collection to be used for this tool. Subclasses can override this to provide an argument collection with alternative arguments or argument annotations.
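The override hook described here is a factory-method pattern. The sketch below shows the shape of it with stand-in types only; `IntervalArgs`, `Tool`, and `RequiredIntervalsTool` are hypothetical and do not correspond to Picard classes.

```java
final class IntervalArgsOverrideSketch {
    /** Stand-in for an interval argument collection. */
    interface IntervalArgs {
        String describe();
    }

    /** Stand-in for the base tool: it builds its argument collection through an overridable factory. */
    static class Tool {
        protected IntervalArgs makeIntervalArgumentCollection() {
            return () -> "optional INTERVALS";
        }

        // Initialized via the factory, so a subclass override takes effect here.
        final IntervalArgs intervalArgs = makeIntervalArgumentCollection();
    }

    /** A subclass supplying alternative arguments by overriding only the factory. */
    static class RequiredIntervalsTool extends Tool {
        @Override
        protected IntervalArgs makeIntervalArgumentCollection() {
            return () -> "required INTERVALS";
        }
    }

    public static void main(String[] args) {
        System.out.println(new Tool().intervalArgs.describe());
        System.out.println(new RequiredIntervalsTool().intervalArgs.describe());
    }
}
```

Because the factory runs during base-class initialization, an override written this way should not depend on fields of the subclass.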
-
getSamReader
protected htsjdk.samtools.SamReader getSamReader()
Gets the SamReader from which records will be examined. This will also set the header so that it is available in
-
doWork
protected int doWork()
Description copied from class: CommandLineProgram
Do the work after command line has been parsed. RuntimeExceptions may be thrown by this method and are reported appropriately.
- Specified by:
- doWork in class CommandLineProgram
- Returns:
- program exit status.
-
getIntervalsToExamine
protected htsjdk.samtools.util.IntervalList getIntervalsToExamine()
Gets the intervals over which we will calculate metrics.
-
getSamFileHeader
protected htsjdk.samtools.SAMFileHeader getSamFileHeader()
This method should only be called after getSamReader() is called.
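The ordering requirement can be pictured with a small stand-in class in which the header is only populated as a side effect of opening the reader. This is a sketch under assumptions: the types are plain strings rather than htsjdk objects, and throwing `IllegalStateException` on an early call is an illustrative choice, not documented Picard behavior.

```java
final class ReaderThenHeaderSketch {
    // Stands in for the SAMFileHeader; stays null until the reader has been obtained.
    private String header;

    /** Stand-in for getSamReader(): opening the reader also records the header. */
    String getSamReader() {
        header = "@HD\tVN:1.6";
        return "reader";
    }

    /** Stand-in for getSamFileHeader(): only meaningful after getSamReader(). */
    String getSamFileHeader() {
        if (header == null) {
            throw new IllegalStateException("call getSamReader() first");
        }
        return header;
    }

    public static void main(String[] args) {
        ReaderThenHeaderSketch tool = new ReaderThenHeaderSketch();
        tool.getSamReader();
        System.out.println(tool.getSamFileHeader());
    }
}
```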
-
generateWgsMetrics
protected WgsMetrics generateWgsMetrics(htsjdk.samtools.util.IntervalList intervals, htsjdk.samtools.util.Histogram<Integer> highQualityDepthHistogram, htsjdk.samtools.util.Histogram<Integer> unfilteredDepthHistogram, double pctExcludedByAdapter, double pctExcludedByMapq, double pctExcludedByDupes, double pctExcludedByPairing, double pctExcludedByBaseq, double pctExcludedByOverlap, double pctExcludedByCapping, double pctTotal, int coverageCap, htsjdk.samtools.util.Histogram<Integer> unfilteredBaseQHistogram, int theoreticalHetSensitivitySampleSize)
-
getBasesExcludedBy
protected long getBasesExcludedBy(CountingFilter filter)
If INTERVALS is specified, this will count bases beyond the interval list when the read overlaps the intervals and extends beyond the edge. Ideally INTERVALS should only include regions that have hard edges without reads that could extend beyond the boundary (such as a whole contig).
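A worked example of the overcount may help. The sketch below compares the full length of a read with the part that falls inside an interval, using 1-based inclusive coordinates; the class and method names are hypothetical, and it assumes the filter counts the bases of the whole read, which is what the note above implies.

```java
final class IntervalEdgeSketch {
    /** Number of bases in a read spanning [readStart, readEnd], inclusive. */
    static int readLength(int readStart, int readEnd) {
        return readEnd - readStart + 1;
    }

    /** Number of the read's bases that fall inside [ivStart, ivEnd], inclusive. */
    static int basesInsideInterval(int readStart, int readEnd, int ivStart, int ivEnd) {
        int overlap = Math.min(readEnd, ivEnd) - Math.max(readStart, ivStart) + 1;
        return Math.max(0, overlap);
    }

    public static void main(String[] args) {
        // A read at 95..105 overlapping an interval at 100..200.
        int whole = readLength(95, 105);
        int inside = basesInsideInterval(95, 105, 100, 200);
        System.out.println("counted=" + whole + " inside=" + inside + " beyond=" + (whole - inside));
    }
}
```

With an interval that covers a whole contig, the two numbers coincide, which is why the note recommends intervals with hard edges.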
-
getLocusIterator
protected htsjdk.samtools.util.AbstractLocusIterator getLocusIterator(htsjdk.samtools.SamReader in)
Creates an AbstractLocusIterator implementation according to the USE_FAST_ALGORITHM value.
- Parameters:
- in - inner SamReader
- Returns:
- if USE_FAST_ALGORITHM is enabled, returns an EdgeReadIterator implementation; otherwise the default algorithm is used and a SamLocusIterator is returned.
-
getCollector
protected AbstractWgsMetricsCollector getCollector(int coverageCap, htsjdk.samtools.util.IntervalList intervals)
Creates an AbstractWgsMetricsCollector implementation according to the USE_FAST_ALGORITHM value.
- Parameters:
- coverageCap - the maximum depth/coverage to consider.
- intervals - the intervals over which metrics are collected.
- Returns:
- if USE_FAST_ALGORITHM is enabled, returns a FastWgsMetricsCollector implementation; otherwise the default algorithm is used and a CollectWgsMetrics.WgsMetricsCollector is returned.
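Both getLocusIterator and getCollector follow the same pattern: one boolean flag selects between two implementations of a common abstraction. A minimal sketch of that selection, using stand-in types rather than the real Picard and htsjdk classes (the interface `Collector` and both implementations below are hypothetical and only carry names for illustration):

```java
final class AlgorithmSelectionSketch {
    /** Stand-in for the common abstraction (AbstractWgsMetricsCollector in the real API). */
    interface Collector {
        String name();
    }

    static final class DefaultCollector implements Collector {
        @Override
        public String name() {
            return "WgsMetricsCollector";
        }
    }

    static final class FastCollector implements Collector {
        @Override
        public String name() {
            return "FastWgsMetricsCollector";
        }
    }

    /** Chooses the implementation from the flag, mirroring the documented Returns clause. */
    static Collector getCollector(boolean useFastAlgorithm) {
        return useFastAlgorithm ? new FastCollector() : new DefaultCollector();
    }

    public static void main(String[] args) {
        System.out.println(getCollector(true).name());
        System.out.println(getCollector(false).name());
    }
}
```

Callers depend only on the abstraction, so the rest of the tool does not need to know which algorithm was selected.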
-
-