Package picard.sam

Class RevertSam


  • @DocumentedFeature
    public class RevertSam
    extends CommandLineProgram
    Reverts a SAM file by optionally restoring original quality scores and by removing all alignment information.

    This tool removes or restores certain properties of the SAM records, including alignment information. It can be used to produce an unmapped BAM (uBAM) from a previously aligned BAM. It can also restore the original quality scores of a BAM file that has already undergone base quality score recalibration (BQSR), provided the original qualities were retained in the OQ tag during recalibration.

    Usage Examples

    Output to a single file

     java -jar picard.jar RevertSam \
          I=input.bam \
          O=reverted.bam
     

    Output by read group into multiple files with sample map

     java -jar picard.jar RevertSam \
          I=input.bam \
          OUTPUT_BY_READGROUP=true \
          OUTPUT_MAP=reverted_bam_paths.tsv
     

    Output by read group with no output map

     java -jar picard.jar RevertSam \
          I=input.bam \
          OUTPUT_BY_READGROUP=true \
          O=/write/reverted/read/group/bams/in/this/dir
     
    This will output BAM files (the format can be overridden with the OUTPUT_BY_READGROUP_FILE_FORMAT option).
    Note: If the program fails due to a SAM validation error, consider setting the VALIDATION_STRINGENCY option to LENIENT or SILENT if the failures are expected to be obviated by the reversion process (e.g. invalid alignment information will be obviated when the REMOVE_ALIGNMENT_INFORMATION option is used).
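
For the sample-map usage example above, the OUTPUT_MAP file is described as a tab-separated file with two columns, READ_GROUP_ID and OUTPUT. The following is a minimal sketch of how such a file's contents might be generated. It assumes the file starts with a header row naming those two columns; the class name `OutputMapSketch`, the read group IDs, and the output paths are hypothetical and not part of Picard.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Picard: builds the text of an OUTPUT_MAP file.
class OutputMapSketch {

    /** Returns a two-column, tab-separated table with an assumed header row. */
    static String buildOutputMap(Map<String, String> readGroupToPath) {
        StringBuilder sb = new StringBuilder("READ_GROUP_ID\tOUTPUT\n");
        for (Map.Entry<String, String> entry : readGroupToPath.entrySet()) {
            sb.append(entry.getKey()).append('\t').append(entry.getValue()).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Illustrative read group IDs and paths only.
        Map<String, String> map = new LinkedHashMap<>();
        map.put("rg1", "/tmp/reverted/rg1.bam");
        map.put("rg2", "/tmp/reverted/rg2.bam");
        System.out.print(buildOutputMap(map));
    }
}
```

The printed text could be saved as reverted_bam_paths.tsv and passed via OUTPUT_MAP together with OUTPUT_BY_READGROUP=true.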
    • Field Detail

      • INPUT

        @Argument(shortName="I",
                  doc="The input SAM/BAM/CRAM file to revert the state of.")
        public PicardHtsPath INPUT
      • OUTPUT

        @Argument(mutex="OUTPUT_MAP",
                  shortName="O",
                  doc="The output SAM/BAM/CRAM file to create, or an output directory if OUTPUT_BY_READGROUP is true.")
        public File OUTPUT
      • OUTPUT_MAP

        @Argument(mutex="OUTPUT",
                  shortName="OM",
                  doc="Tab separated file with two columns, READ_GROUP_ID and OUTPUT, providing file mapping only used if OUTPUT_BY_READGROUP is true.")
        public File OUTPUT_MAP
      • OUTPUT_BY_READGROUP

        @Argument(shortName="OBR",
                  doc="When true, outputs each read group in a separate file.")
        public boolean OUTPUT_BY_READGROUP
      • RESTORE_HARDCLIPS

        @Argument(shortName="RHC",
                  doc="When true, restores reads and qualities of records with hard-clips containing XB and XQ tags.")
        public boolean RESTORE_HARDCLIPS
      • OUTPUT_BY_READGROUP_FILE_FORMAT

        @Argument(shortName="OBRFF",
                  doc="When using OUTPUT_BY_READGROUP, the output file format can be set to a certain format.")
        public RevertSam.FileType OUTPUT_BY_READGROUP_FILE_FORMAT
      • SORT_ORDER

        @Argument(shortName="SO",
                  doc="The sort order to create the reverted output file with.")
        public htsjdk.samtools.SAMFileHeader.SortOrder SORT_ORDER
      • RESTORE_ORIGINAL_QUALITIES

        @Argument(shortName="OQ",
                  doc="True to restore original qualities from the OQ field to the QUAL field if available.")
        public boolean RESTORE_ORIGINAL_QUALITIES
      • REMOVE_DUPLICATE_INFORMATION

        @Argument(doc="Remove duplicate read flags from all reads.  Note that if this is false and REMOVE_ALIGNMENT_INFORMATION==true,  the output may have the unusual but sometimes desirable trait of having unmapped reads that are marked as duplicates.")
        public boolean REMOVE_DUPLICATE_INFORMATION
      • REMOVE_ALIGNMENT_INFORMATION

        @Argument(doc="Remove all alignment information from the file.")
        public boolean REMOVE_ALIGNMENT_INFORMATION
      • ATTRIBUTE_TO_CLEAR

        @Argument(doc="When removing alignment information, the set of optional tags to remove.")
        public List<String> ATTRIBUTE_TO_CLEAR
      • SANITIZE

        @Argument(doc="WARNING: This option is potentially destructive. If enabled will discard reads in order to produce a consistent output BAM. Reads discarded include (but are not limited to) paired reads with missing mates, duplicated records, records with mismatches in length of bases and qualities. This option can only be enabled if the output sort order is queryname and will always cause sorting to occur.")
        public boolean SANITIZE
      • MAX_DISCARD_FRACTION

        @Argument(doc="If SANITIZE=true and more than MAX_DISCARD_FRACTION of the reads are discarded due to sanitization, then the program will exit with an Exception instead of exiting cleanly. The output BAM will still be valid.")
        public double MAX_DISCARD_FRACTION
      • KEEP_FIRST_DUPLICATE

        @Argument(doc="If SANITIZE=true keep the first record when we find more than one record with the same name for R1/R2/unpaired reads respectively. For paired end reads, keeps only the first R1 and R2 found respectively, and discards all unpaired reads. Duplicates do not refer to the duplicate flag in the FLAG field, but instead reads with the same name.")
        public boolean KEEP_FIRST_DUPLICATE
      • SAMPLE_ALIAS

        @Argument(doc="The sample alias to use in the reverted output file.  This will override the existing sample alias in the file and is used only if all the read groups in the input file have the same sample alias.",
                  shortName="ALIAS",
                  optional=true)
        public String SAMPLE_ALIAS
      • LIBRARY_NAME

        @Argument(doc="The library name to use in the reverted output file.  This will override the existing library name in the file and is used only if all the read groups in the input file have the same library name.",
                  shortName="LIB",
                  optional=true)
        public String LIBRARY_NAME
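
The interaction between SANITIZE and MAX_DISCARD_FRACTION described above can be illustrated with a small sketch. This is not Picard's implementation: the class name `DiscardFractionSketch`, the strict greater-than comparison, and the exception type are assumptions made for illustration.

```java
// Hypothetical sketch, not Picard code: fail when too many reads were discarded.
class DiscardFractionSketch {

    /**
     * Throws if the fraction of discarded reads exceeds the allowed maximum.
     * Assumes a strict comparison and an IllegalStateException; Picard's actual
     * comparison and exception type may differ.
     */
    static void checkDiscardFraction(long discarded, long total, double maxDiscardFraction) {
        if (total == 0) {
            return; // nothing was read, so there is nothing to check
        }
        double fraction = (double) discarded / total;
        if (fraction > maxDiscardFraction) {
            throw new IllegalStateException(
                "Discarded " + discarded + " of " + total
                    + " reads, which exceeds the allowed fraction " + maxDiscardFraction);
        }
    }

    public static void main(String[] args) {
        checkDiscardFraction(1, 100, 0.05); // within the limit, returns normally
        System.out.println("discard fraction within limit");
    }
}
```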
    • Constructor Detail

      • RevertSam

        public RevertSam()
    • Method Detail

      • customCommandLineValidation

        protected String[] customCommandLineValidation()
        Enforce that output ordering is queryname when sanitization is turned on since it requires a queryname sort.
        Overrides:
        customCommandLineValidation in class CommandLineProgram
        Returns:
        null if command line is valid. If command line is invalid, returns an array of error messages to be written to the appropriate place.
      • doWork

        protected int doWork()
        Description copied from class: CommandLineProgram
        Do the work after the command line has been parsed. RuntimeExceptions may be thrown by this method and are reported appropriately.
        Specified by:
        doWork in class CommandLineProgram
        Returns:
        program exit status.
      • revertSamRecord

        public void revertSamRecord(htsjdk.samtools.SAMRecord rec)
        Takes an individual SAMRecord and applies the set of changes/reversions to it that have been requested by program level options.
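
The contract documented for customCommandLineValidation (null when the command line is valid, otherwise an array of error messages, with sanitization requiring a queryname sort) can be sketched in isolation as follows. This is a simplified stand-in, not the actual method: the class `ValidationSketch`, its local `SortOrder` enum, and the error message text are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not Picard code: mirrors the documented validation contract.
class ValidationSketch {

    // Local stand-in for htsjdk.samtools.SAMFileHeader.SortOrder.
    enum SortOrder { unsorted, queryname, coordinate }

    /** Returns null if the options are consistent, otherwise an array of error messages. */
    static String[] validate(boolean sanitize, SortOrder sortOrder) {
        List<String> errors = new ArrayList<>();
        if (sanitize && sortOrder != SortOrder.queryname) {
            errors.add("SORT_ORDER must be queryname when SANITIZE is enabled.");
        }
        return errors.isEmpty() ? null : errors.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String[] errors = validate(true, SortOrder.coordinate);
        if (errors != null) {
            for (String message : errors) {
                System.err.println(message);
            }
        }
    }
}
```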