Package htsjdk.variant.vcf
Class AbstractVCFCodec
- java.lang.Object
  - htsjdk.tribble.AbstractFeatureCodec<T,LineIterator>
    - htsjdk.tribble.AsciiFeatureCodec<VariantContext>
      - htsjdk.variant.vcf.AbstractVCFCodec
- All Implemented Interfaces:
FeatureCodec<VariantContext,LineIterator>, NameAwareCodec
public abstract class AbstractVCFCodec extends AsciiFeatureCodec<VariantContext> implements NameAwareCodec
-
Field Summary
Fields
Modifier and Type    Field    Description
protected Map<String,List<Allele>>
alleleMap
protected boolean
doOnTheFlyModifications
If true, then we'll magically fix up VCF headers on the fly when we read them in
protected HashMap<String,List<String>>
filterHash
protected String[]
genotypeParts
protected VCFHeader
header
protected int
lineNo
protected String[]
locParts
static int
MAX_ALLELE_SIZE_BEFORE_WARNING
protected String
name
protected static int
NUM_STANDARD_FIELDS
protected String[]
parts
protected String
remappedSampleName
If non-null, we will replace the sample name read from the VCF header with this sample name.
protected Map<String,String>
stringCache
static boolean
validate
protected VCFHeaderVersion
version
protected boolean
warnedAboutNoEqualsForNonFlag
-
Constructor Summary
Constructors
Modifier    Constructor    Description
protected
AbstractVCFCodec()
-
Method Summary
Modifier and Type    Method    Description
static boolean
canDecodeFile(String potentialInput, String MAGIC_HEADER_LINE)
LazyGenotypesContext.LazyData
createGenotypeMap(String str, List<Allele> alleles, String chr, int pos)
create a genotype map
VariantContext
decode(String line)
decode the line into a feature (VariantContext)
Feature
decodeLoc(String line)
the fast decode function
void
disableOnTheFlyModifications()
Forces all VCFCodecs to not perform any on the fly modifications to the VCF header of VCF records.
protected void
generateException(String message)
protected static void
generateException(String message, int lineNo)
VCFAltHeaderLine
getAltHeaderLine(String headerLineString, VCFHeaderVersion sourceVersion)
Create and return a VCFAltHeaderLine object from a header line string that conforms to the sourceVersion
protected String
getCachedString(String str)
Return a cached copy of the supplied string.
VCFHeader
getHeader()
VCFMetaHeaderLine
getMetaHeaderLine(String headerLineString, VCFHeaderVersion sourceVersion)
Create and return a VCFMetaHeaderLine object from a header line string that conforms to the sourceVersion
String
getName()
get the name of this codec
VCFPedigreeHeaderLine
getPedigreeHeaderLine(String headerLineString, VCFHeaderVersion sourceVersion)
Create and return a VCFPedigreeHeaderLine object from a header line string that conforms to the sourceVersion
VCFSampleHeaderLine
getSampleHeaderLine(String headerLineString, VCFHeaderVersion sourceVersion)
Create and return a VCFSampleHeaderLine object from a header line string that conforms to the sourceVersion
TabixFormat
getTabixFormat()
Define the tabix format for the feature, used for indexing.
VCFHeaderVersion
getVersion()
protected static Allele
oneAllele(String index, List<Allele> alleles)
create an allele from an index and a list of alleles
protected static List<Allele>
parseAlleles(String ref, String alts, int lineNo)
parse out the alleles
protected abstract List<String>
parseFilters(String filterString)
parse the filter string, first checking whether it was already parsed in a previous call
protected static List<Allele>
parseGenotypeAlleles(String GT, List<Allele> alleles, Map<String,List<Allele>> cache)
parse genotype alleles from the genotype string
protected VCFHeader
parseHeaderFromLines(List<String> headerStrings, VCFHeaderVersion version)
create a VCF header from a set of header record lines
protected static Double
parseQual(String qualString)
parse out the qual value
void
setName(String name)
set the name of this codec
void
setRemappedSampleName(String remappedSampleName)
Replaces the sample name read from the VCF header with the remappedSampleName.
VCFHeader
setVCFHeader(VCFHeader newHeader, VCFHeaderVersion newVersion)
Explicitly set the VCFHeader on this codec.
-
Methods inherited from class htsjdk.tribble.AsciiFeatureCodec
close, decode, isDone, makeIndexableSourceFromStream, makeSourceFromStream, readActualHeader, readHeader
-
Methods inherited from class htsjdk.tribble.AbstractFeatureCodec
decodeLoc, getFeatureType
-
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
-
Methods inherited from interface htsjdk.tribble.FeatureCodec
canDecode, getPathToDataFile
-
Field Detail
-
MAX_ALLELE_SIZE_BEFORE_WARNING
public static final int MAX_ALLELE_SIZE_BEFORE_WARNING
-
NUM_STANDARD_FIELDS
protected static final int NUM_STANDARD_FIELDS
- See Also:
- Constant Field Values
-
header
protected VCFHeader header
-
version
protected VCFHeaderVersion version
-
validate
public static boolean validate
-
parts
protected String[] parts
-
genotypeParts
protected String[] genotypeParts
-
locParts
protected final String[] locParts
-
name
protected String name
-
lineNo
protected int lineNo
-
warnedAboutNoEqualsForNonFlag
protected boolean warnedAboutNoEqualsForNonFlag
-
doOnTheFlyModifications
protected boolean doOnTheFlyModifications
If true, then we'll magically fix up VCF headers on the fly when we read them in
-
remappedSampleName
protected String remappedSampleName
If non-null, we will replace the sample name read from the VCF header with this sample name. This feature works only for single-sample VCFs.
-
-
Method Detail
-
parseFilters
protected abstract List<String> parseFilters(String filterString)
Parses the filter string, first checking whether it has already been parsed by a previous call.
- Parameters:
filterString
- the string to parse
- Returns:
- the filters applied, as a list
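The "check whether we already parsed it" behaviour described above can be illustrated with a small cache keyed by the raw FILTER string. This is a minimal sketch using only the Java standard library, not the htsjdk implementation: the class name `FilterCacheSketch`, the treatment of `.` and `PASS` as an empty list, and the semicolon separator are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a concrete codec's filter parsing; not htsjdk code.
class FilterCacheSketch {
    // Maps a raw FILTER column value to the list parsed from it.
    private final Map<String, List<String>> filterCache = new HashMap<>();

    List<String> parseFilters(String filterString) {
        // Assumption: "." (missing) and "PASS" both yield an empty list here.
        if (filterString.equals(".") || filterString.equals("PASS")) {
            return Collections.emptyList();
        }
        // Reuse the list produced by an earlier call with the same string.
        List<String> cached = filterCache.get(filterString);
        if (cached != null) {
            return cached;
        }
        // Assumption: multiple filter names are separated by semicolons.
        List<String> parsed = Collections.unmodifiableList(
                new ArrayList<>(Arrays.asList(filterString.split(";"))));
        filterCache.put(filterString, parsed);
        return parsed;
    }
}
```

Because many records in a file tend to share the same FILTER value, returning the cached list avoids re-splitting the string and re-allocating a list for every record.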
-
parseHeaderFromLines
protected VCFHeader parseHeaderFromLines(List<String> headerStrings, VCFHeaderVersion version)
Creates a VCF header from a set of header record lines.
- Parameters:
headerStrings
- a list of strings that represent all the ## and # entries
- Returns:
- a VCFHeader object
-
getHeader
public VCFHeader getHeader()
- Returns:
- the header that was either explicitly set on this codec, or read from the file. May be null. The returned value should not be modified.
-
getVersion
public VCFHeaderVersion getVersion()
- Returns:
- the version number that was either explicitly set on this codec, or read from the file. May be null.
-
setVCFHeader
public VCFHeader setVCFHeader(VCFHeader newHeader, VCFHeaderVersion newVersion)
Explicitly set the VCFHeader on this codec. This will overwrite the header read from the file and the version state stored in this instance; conversely, reading the header from a file will overwrite whatever is set here.
- Parameters:
newHeader
newVersion
- Returns:
- the actual header for this codec. The returned header may not be identical to the header argument since the header lines may be "repaired" (i.e., rewritten) if doOnTheFlyModifications is set.
- Throws:
TribbleException
- if the requested header version is not compatible with the existing version
-
getAltHeaderLine
public VCFAltHeaderLine getAltHeaderLine(String headerLineString, VCFHeaderVersion sourceVersion)
Create and return a VCFAltHeaderLine object from a header line string that conforms to the sourceVersion.
- Parameters:
headerLineString
- VCF header line being parsed without the leading "##ALT="
sourceVersion
- the VCF header version of the source from which the header line was retrieved. The resulting header line object should be valid for this header version.
- Returns:
- a VCFAltHeaderLine object
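To make the shape of the `headerLineString` argument concrete: with the leading "##ALT=" removed, what remains is an angle-bracketed list of key=value pairs. The sketch below splits such a string into an ordered map using only the Java standard library. It is an illustration under stated assumptions, not how htsjdk builds a VCFAltHeaderLine: the class `HeaderLineSketch` and method `parseFields` are hypothetical, and escaped quotes inside values are not handled.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not part of htsjdk.
class HeaderLineSketch {
    /** Splits "<K1=V1,K2="quoted, value">" into an insertion-ordered map. */
    static Map<String, String> parseFields(String value) {
        if (!value.startsWith("<") || !value.endsWith(">")) {
            throw new IllegalArgumentException("Expected <...> but got: " + value);
        }
        String body = value.substring(1, value.length() - 1);
        Map<String, String> fields = new LinkedHashMap<>();
        StringBuilder key = new StringBuilder();
        StringBuilder val = new StringBuilder();
        boolean inValue = false;   // true after '=' until the next unquoted ','
        boolean inQuotes = false;  // true between double quotes
        for (int i = 0; i < body.length(); i++) {
            char c = body.charAt(i);
            if (c == '"') {
                inQuotes = !inQuotes;          // quotes delimit a value and are dropped
            } else if (c == '=' && !inValue && !inQuotes) {
                inValue = true;                // key finished, value starts
            } else if (c == ',' && !inQuotes) {
                fields.put(key.toString(), val.toString());
                key.setLength(0);
                val.setLength(0);
                inValue = false;               // next pair starts
            } else if (inValue) {
                val.append(c);
            } else {
                key.append(c);
            }
        }
        if (key.length() > 0) {
            fields.put(key.toString(), val.toString());
        }
        return fields;
    }
}
```

The same splitting applies to the "##PEDIGREE=", "##META=" and "##SAMPLE=" payloads handled by the sibling methods, since they share the angle-bracketed key=value layout.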
-
getPedigreeHeaderLine
public VCFPedigreeHeaderLine getPedigreeHeaderLine(String headerLineString, VCFHeaderVersion sourceVersion)
Create and return a VCFPedigreeHeaderLine object from a header line string that conforms to the sourceVersion.
- Parameters:
headerLineString
- VCF header line being parsed without the leading "##PEDIGREE="
sourceVersion
- the VCF header version of the source from which the header line was retrieved. The resulting header line object should be valid for this header version.
- Returns:
- a VCFPedigreeHeaderLine object
-
getMetaHeaderLine
public VCFMetaHeaderLine getMetaHeaderLine(String headerLineString, VCFHeaderVersion sourceVersion)
Create and return a VCFMetaHeaderLine object from a header line string that conforms to the sourceVersion.
- Parameters:
headerLineString
- VCF header line being parsed without the leading "##META="
sourceVersion
- the VCF header version of the source from which the header line was retrieved. The resulting header line object should be valid for this header version.
- Returns:
- a VCFMetaHeaderLine object
-
getSampleHeaderLine
public VCFSampleHeaderLine getSampleHeaderLine(String headerLineString, VCFHeaderVersion sourceVersion)
Create and return a VCFSampleHeaderLine object from a header line string that conforms to the sourceVersion.
- Parameters:
headerLineString
- VCF header line being parsed without the leading "##SAMPLE="
sourceVersion
- the VCF header version of the source from which the header line was retrieved. The resulting header line object should be valid for this header version.
- Returns:
- a VCFSampleHeaderLine object
-
decodeLoc
public Feature decodeLoc(String line)
The fast decode function.
- Parameters:
line
- the line of text for the record
- Returns:
- a feature (not guaranteed to be complete) that has the correct start and stop
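The idea behind a location-only decode is to split just the leading columns of a record and skip the expensive INFO and genotype parsing. The following is a minimal sketch of that idea with the Java standard library; it is not the htsjdk implementation. The `LocSketch` and `Loc` names are hypothetical, and the stop position is assumed to be derived from the REF allele length alone, which ignores cases such as symbolic alleles.

```java
// Hypothetical sketch of a location-only decode; not htsjdk code.
class LocSketch {
    /** Minimal location record: contig plus 1-based inclusive start and end. */
    static final class Loc {
        final String contig;
        final int start;
        final int end;

        Loc(String contig, int start, int end) {
            this.contig = contig;
            this.start = start;
            this.end = end;
        }
    }

    static Loc decodeLoc(String line) {
        // Header and comment lines carry no location.
        if (line.startsWith("#")) {
            return null;
        }
        // Split off CHROM, POS, ID, REF and ALT only; the limit of 6 leaves the
        // remainder of the line unsplit in the last array element.
        String[] parts = line.split("\t", 6);
        if (parts.length < 5) {
            throw new IllegalArgumentException("Too few columns in record: " + line);
        }
        int start = Integer.parseInt(parts[1]);
        // Assumption: the record spans exactly the bases of the REF allele.
        int end = start + parts[3].length() - 1;
        return new Loc(parts[0], start, end);
    }
}
```

For a record whose REF is `GTC` at position 1234567, this yields the interval 1234567 to 1234569 without touching the QUAL, FILTER, INFO or sample columns.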
-
decode
public VariantContext decode(String line)
Decodes the line into a feature (VariantContext).
- Specified by:
decode in class AsciiFeatureCodec<VariantContext>
- Parameters:
line
- the line
- Returns:
- a VariantContext
- See Also:
AsciiFeatureCodec.decode(htsjdk.tribble.readers.LineIterator)
-
getName
public String getName()
Gets the name of this codec.
- Specified by:
getName in interface NameAwareCodec
- Returns:
- the name that was set on this codec
-
setName
public void setName(String name)
Sets the name of this codec.
- Specified by:
setName in interface NameAwareCodec
- Parameters:
name
- new name
-
getCachedString
protected String getCachedString(String str)
Return a cached copy of the supplied string.
- Parameters:
str
- string
- Returns:
- interned string
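Returning a cached copy means that equal strings read from different records end up sharing one object, which reduces memory when the same values repeat across a large file. A minimal sketch of such a cache, assuming a plain map like the `stringCache` field listed above, could look as follows; the class name `StringCacheSketch` is hypothetical and this is not the htsjdk code.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of per-codec string interning; not htsjdk code.
class StringCacheSketch {
    // Maps each distinct string value to the first instance seen with that value.
    private final Map<String, String> stringCache = new HashMap<>();

    String getCachedString(String str) {
        // putIfAbsent returns the previously stored instance, or null if none existed.
        String previous = stringCache.putIfAbsent(str, str);
        return previous != null ? previous : str;
    }
}
```

Unlike `String.intern()`, a map held by the codec instance is released together with the codec.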
-
oneAllele
protected static Allele oneAllele(String index, List<Allele> alleles)
Creates an allele from an index and a list of alleles.
- Parameters:
index
- the index
alleles
- the alleles
- Returns:
- an Allele
-
parseGenotypeAlleles
protected static List<Allele> parseGenotypeAlleles(String GT, List<Allele> alleles, Map<String,List<Allele>> cache)
Parses genotype alleles from the genotype string.
- Parameters:
GT
- GT string
alleles
- list of possible alleles
cache
- cache of alleles for GT
- Returns:
- the allele list for the GT string
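The relationship between the GT string, the list of possible alleles and the cache can be sketched as below. This is an illustration with plain strings standing in for Allele objects, not the htsjdk implementation: the class name `GenotypeAlleleSketch`, the `NO_CALL` placeholder and the use of a regular expression to split on `/` and `|` are assumptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch with String alleles; not htsjdk code.
class GenotypeAlleleSketch {
    // Placeholder used when a genotype position is the missing value ".".
    static final String NO_CALL = ".";

    static List<String> parseGenotypeAlleles(String gt, List<String> alleles,
                                             Map<String, List<String>> cache) {
        // The same GT string (for example "0/1") recurs for many samples.
        List<String> cached = cache.get(gt);
        if (cached != null) {
            return cached;
        }
        // Allele indices are separated by '/' (unphased) or '|' (phased).
        String[] indices = gt.split("[/|]");
        List<String> result = new ArrayList<>(indices.length);
        for (String index : indices) {
            // Index 0 is the reference allele, 1..n are the alternates.
            result.add(index.equals(".") ? NO_CALL : alleles.get(Integer.parseInt(index)));
        }
        cache.put(gt, result);
        return result;
    }
}
```

The cache is only valid for one record, because the indices in a GT string refer to that record's own allele list; a caller would clear it before moving to the next record.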
-
parseQual
protected static Double parseQual(String qualString)
Parses out the qual value.
- Parameters:
qualString
- the quality string
- Returns:
- the quality value as a Double
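In VCF, the QUAL column holds a Phred-scaled quality, or `.` when the value is missing. The sketch below shows one way to parse that column and to convert a Phred score to a log10 error probability. It is a simplified illustration, not the htsjdk implementation: the class `QualSketch` is hypothetical, and returning `null` for a missing value is an assumption (a real codec may return a sentinel value instead).

```java
// Hypothetical sketch of QUAL parsing; not htsjdk code.
class QualSketch {
    /** Returns the Phred-scaled QUAL, or null when the column is the missing value ".". */
    static Double parseQual(String qualString) {
        if (qualString.equals(".")) {
            return null; // assumption: null marks a missing quality
        }
        return Double.valueOf(qualString);
    }

    /** Converts a Phred-scaled quality Q = -10 * log10(p) into log10(p). */
    static double phredToLog10Error(double phred) {
        return phred / -10.0;
    }
}
```

For example, a QUAL of 30 corresponds to a log10 error probability of -3, that is, an error probability of 0.001.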
-
parseAlleles
protected static List<Allele> parseAlleles(String ref, String alts, int lineNo)
Parses out the alleles.
- Parameters:
ref
- the reference base
alts
- a string of alternates to break into alleles
lineNo
- the line number for this record
- Returns:
- a list of alleles
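Breaking the ALT column into alleles amounts to splitting on commas and placing the reference allele first. The sketch below shows that step with strings standing in for Allele objects. It is an illustration only, not the htsjdk implementation: the class `AlleleListSketch` is hypothetical, and treating `.` as "no alternate alleles" and rejecting empty entries are assumptions about sensible behaviour.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch with String alleles; not htsjdk code.
class AlleleListSketch {
    static List<String> parseAlleles(String ref, String alts, int lineNo) {
        if (ref.isEmpty()) {
            throw new IllegalArgumentException("Empty REF allele at line " + lineNo);
        }
        List<String> alleles = new ArrayList<>();
        alleles.add(ref); // the reference allele always comes first (index 0)
        // Assumption: "." in the ALT column means there are no alternate alleles.
        if (!alts.equals(".")) {
            // A limit of -1 keeps trailing empty strings so "A," is rejected below.
            for (String alt : alts.split(",", -1)) {
                if (alt.isEmpty()) {
                    throw new IllegalArgumentException("Empty ALT allele at line " + lineNo);
                }
                alleles.add(alt);
            }
        }
        return alleles;
    }
}
```

The line number is used only to make error messages point at the offending record.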
-
createGenotypeMap
public LazyGenotypesContext.LazyData createGenotypeMap(String str, List<Allele> alleles, String chr, int pos)
Creates a genotype map.
- Parameters:
str
- the string
alleles
- the list of alleles
- Returns:
- a mapping of sample name to genotype object
-
disableOnTheFlyModifications
public final void disableOnTheFlyModifications()
Forces all VCFCodecs to not perform any on the fly modifications to the VCF header of VCF records. Useful primarily for raw comparisons, such as when comparing raw VCF records.
-
setRemappedSampleName
public void setRemappedSampleName(String remappedSampleName)
Replaces the sample name read from the VCF header with the remappedSampleName. Works only for single-sample VCFs; attempting to perform sample name remapping for multi-sample VCFs will produce an Exception.
- Parameters:
remappedSampleName
- replacement sample name for the sample specified in the VCF header
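The single-sample restriction can be illustrated on the `#CHROM` header line itself: a single-sample VCF has eight fixed columns, a FORMAT column and exactly one sample column. The sketch below rewrites that last column and rejects any other layout. It is a standalone illustration using the Java standard library, not the htsjdk implementation; the class `SampleRemapSketch`, its method name and the exception type are hypothetical.

```java
// Hypothetical sketch of single-sample name remapping; not htsjdk code.
class SampleRemapSketch {
    // CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO, then FORMAT and one sample.
    private static final int SINGLE_SAMPLE_COLUMN_COUNT = 10;

    static String remapHeaderLine(String chromLine, String remappedSampleName) {
        // A null replacement name means remapping is disabled.
        if (remappedSampleName == null) {
            return chromLine;
        }
        String[] columns = chromLine.split("\t");
        if (columns.length != SINGLE_SAMPLE_COLUMN_COUNT) {
            throw new IllegalStateException(
                    "Sample name remapping requires exactly one sample column, found "
                            + columns.length + " columns in total");
        }
        columns[columns.length - 1] = remappedSampleName;
        return String.join("\t", columns);
    }
}
```

With more than one sample column there is no unambiguous target for the replacement name, which is why the multi-sample case is rejected rather than guessed at.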
-
generateException
protected void generateException(String message)
-
generateException
protected static void generateException(String message, int lineNo)
-
getTabixFormat
public TabixFormat getTabixFormat()
Description copied from interface: FeatureCodec
Define the tabix format for the feature, used for indexing. Default implementation throws an exception. Note that only AsciiFeatureCodec could read tabix files as defined in AbstractFeatureReader.getFeatureReader(String, String, FeatureCodec, boolean, java.util.function.Function, java.util.function.Function)
- Specified by:
getTabixFormat in interface FeatureCodec<VariantContext,LineIterator>
- Returns:
- the format to use with tabix
-
-