Package picard.sam

Class SplitSamByNumberOfReads


  • @DocumentedFeature
    public class SplitSamByNumberOfReads
    extends CommandLineProgram

    Splits the input queryname-sorted or query-grouped SAM/BAM/CRAM file and writes it into multiple BAM files, each with an approximately equal number of reads. The sort order is retained within each output BAM, and if the BAMs are concatenated in order (output files are named numerically), the order of the reads will match the original BAM. The input is traversed twice unless TOTAL_READS_IN_INPUT is provided.
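
The splitting behaviour described above can be illustrated with a minimal, self-contained sketch. It is not Picard's implementation: reads are reduced to their query names, the class and method names are hypothetical, and the rule that a chunk is only closed at a query-name boundary is an assumption drawn from the description (reads sharing a queryname stay together, so chunk sizes are only approximately equal).

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only; names and details are assumptions, not Picard code. */
final class SplitByReadCountSketch {

    /**
     * Splits query-grouped read names into at most nFiles chunks of roughly
     * ceil(total / nFiles) reads, never separating reads that share a name.
     */
    static List<List<String>> split(List<String> readNames, int nFiles) {
        if (nFiles < 1) {
            throw new IllegalArgumentException("nFiles must be >= 1");
        }
        // Target size per chunk; the total is known up front here, which
        // corresponds to supplying TOTAL_READS_IN_INPUT or doing a counting pass.
        int readsPerFile = (int) Math.ceil(readNames.size() / (double) nFiles);

        List<List<String>> chunks = new ArrayList<>();
        List<String> current = new ArrayList<>();
        String previousName = null;

        for (String name : readNames) {
            boolean chunkFull = current.size() >= readsPerFile;
            boolean startsNewGroup = !name.equals(previousName);
            // Assumption: never open more than nFiles chunks in total.
            boolean mayOpenAnother = chunks.size() < nFiles - 1;
            if (chunkFull && startsNewGroup && mayOpenAnother) {
                chunks.add(current);
                current = new ArrayList<>();
            }
            current.add(name);
            previousName = name;
        }
        if (!current.isEmpty()) {
            chunks.add(current);
        }
        return chunks;
    }
}
```

Because each chunk is filled in input order, concatenating the chunks in index order reproduces the original read order. A chunk may overshoot the target while it waits for the current queryname group to end, which can leave fewer reads (or no chunk at all) for the last files.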

    • Field Detail

      • INPUT

        @Argument(doc="Input SAM/BAM/CRAM file to split",
                  shortName="I")
        public File INPUT
      • SPLIT_TO_N_READS

        @Argument(shortName="N_READS",
                  doc="Split to have approximately N reads per output file. The actual number of reads per output file will vary by no more than the number of output files * (the maximum number of reads with the same queryname - 1).",
                  mutex="SPLIT_TO_N_FILES")
        public int SPLIT_TO_N_READS
      • SPLIT_TO_N_FILES

        @Argument(shortName="N_FILES",
                  doc="Split to N files.",
                  mutex="SPLIT_TO_N_READS")
        public int SPLIT_TO_N_FILES
      • TOTAL_READS_IN_INPUT

        @Argument(shortName="TOTAL_READS",
                  doc="Total number of reads in the input file. If this is not provided, the input will be read twice, the first time to get a count of the total reads.",
                  optional=true)
        public long TOTAL_READS_IN_INPUT
      • OUTPUT

        @Argument(shortName="O",
                  doc="Directory in which to output the split BAM files.")
        public File OUTPUT
      • OUT_PREFIX

        @Argument(shortName="OUT_PREFIX",
                  doc="Output files will be named <OUT_PREFIX>_N.EXT, where N enumerates the output file and EXT is the same as that of the input.")
        public String OUT_PREFIX
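
How the arguments above might combine can be sketched as follows. This is a hedged illustration, not Picard's code: the class and method names are hypothetical, the rounding used to turn N_READS into a file count is an assumption, and so is the numbering of N (starting at 1, without zero-padding). The source only states the `<OUT_PREFIX>_N.EXT` pattern and that SPLIT_TO_N_READS and SPLIT_TO_N_FILES are mutually exclusive.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only; names, rounding, and numbering are assumptions. */
final class SplitPlanSketch {

    /** Number of output files implied by N_FILES or N_READS; exactly one must be positive. */
    static int numberOfFiles(long totalReads, int nFiles, int nReads) {
        if ((nFiles > 0) == (nReads > 0)) {
            throw new IllegalArgumentException(
                    "Specify exactly one of SPLIT_TO_N_FILES and SPLIT_TO_N_READS");
        }
        if (nFiles > 0) {
            return nFiles;
        }
        // Ceiling division, with at least one output file.
        return (int) Math.max(1, (totalReads + nReads - 1) / nReads);
    }

    /** Builds names following the <OUT_PREFIX>_N.EXT pattern (N counted from 1 here). */
    static List<String> outputNames(String outPrefix, String ext, int fileCount) {
        List<String> names = new ArrayList<>();
        for (int n = 1; n <= fileCount; n++) {
            names.add(outPrefix + "_" + n + "." + ext);
        }
        return names;
    }
}
```

For example, under these assumptions 10 reads with N_READS=3 would imply 4 output files.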
    • Constructor Detail

      • SplitSamByNumberOfReads

        public SplitSamByNumberOfReads()
    • Method Detail

      • doWork

        protected int doWork()
        Description copied from class: CommandLineProgram
        Do the work after the command line has been parsed. RuntimeExceptions may be thrown by this method and are reported appropriately.
        Specified by:
        doWork in class CommandLineProgram
        Returns:
        program exit status.
      • customCommandLineValidation

        protected String[] customCommandLineValidation()
        Description copied from class: CommandLineProgram
        Put any custom command-line validation in an override of this method. clp is initialized at this point and can be used to print usage and access argv. Any options set by the command-line parser can be validated.
        Overrides:
        customCommandLineValidation in class CommandLineProgram
        Returns:
        null if the command line is valid. If the command line is invalid, returns an array of error messages to be written to the appropriate place.
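
The return contract above (null when valid, otherwise an array of messages) can be sketched in isolation. The class name and the specific checks are hypothetical and are not taken from Picard's source; only the null-or-array convention comes from the description.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch of the null-or-errors contract; the checks themselves are assumptions. */
final class ValidationSketch {

    /** Returns null if the arguments look valid, otherwise an array of error messages. */
    static String[] validate(int splitToNReads, int splitToNFiles, long totalReadsInInput) {
        List<String> errors = new ArrayList<>();
        if (totalReadsInInput < 0) {
            errors.add("TOTAL_READS_IN_INPUT must not be negative.");
        }
        if (splitToNReads <= 0 && splitToNFiles <= 0) {
            errors.add("Either SPLIT_TO_N_READS or SPLIT_TO_N_FILES must be positive.");
        }
        // null, not an empty array, signals a valid command line.
        return errors.isEmpty() ? null : errors.toArray(new String[0]);
    }
}
```

Collecting every message before returning lets all problems be reported in one run instead of stopping at the first one.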