Class POIOLE2TextExtractor

    • Field Detail

      • document

        protected POIDocument document
        The POIDocument that this extractor has open.
    • Constructor Detail

      • POIOLE2TextExtractor

        public POIOLE2TextExtractor(POIDocument document)
        Creates a new text extractor for the given document.
        Parameters:
        document - The POIDocument to use in this extractor.
      • POIOLE2TextExtractor

        protected POIOLE2TextExtractor(POIOLE2TextExtractor otherExtractor)
        Creates a new text extractor, using the same document as another text extractor. Normally only used by properties extractors.
        Parameters:
        otherExtractor - the extractor whose document is to be used.
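
The two constructors above describe a pattern in which a secondary extractor reuses the document already held by a primary one. The following is a minimal sketch of that pattern only. `Doc`, `BaseExtractor`, and `PropertiesExtractor` are hypothetical stand-ins written for illustration; they are not the POI classes and assume nothing about how POI implements this internally.

```java
public class SharedDocumentSketch {

    /** Hypothetical stand-in for POIDocument; it only carries a title. */
    static class Doc {
        final String title;

        Doc(String title) {
            this.title = title;
        }
    }

    /** Hypothetical stand-in for the extractor base class. */
    static class BaseExtractor {
        protected Doc document;

        public BaseExtractor(Doc document) {
            this.document = document;
        }

        // Copy-style constructor: reuses the other extractor's document
        // instead of opening a second one.
        protected BaseExtractor(BaseExtractor other) {
            this.document = other.document;
        }
    }

    /** Hypothetical properties extractor built on top of a main extractor. */
    static class PropertiesExtractor extends BaseExtractor {
        PropertiesExtractor(BaseExtractor mainExtractor) {
            super(mainExtractor);
        }

        String getText() {
            return "Title = " + document.title;
        }
    }

    /** Returns true when both extractors hold the very same document object. */
    public static boolean sharesDocument() {
        BaseExtractor main = new BaseExtractor(new Doc("Report"));
        PropertiesExtractor props = new PropertiesExtractor(main);
        return props.document == main.document;
    }

    /** Builds a properties extractor from a main extractor and reads the shared title. */
    public static String demo() {
        BaseExtractor main = new BaseExtractor(new Doc("Report"));
        return new PropertiesExtractor(main).getText();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "Title = Report"
    }
}
```

Keeping the second constructor protected limits this sharing to subclasses, which matches the note that it is normally only used by properties extractors.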
    • Method Detail

      • getDocSummaryInformation

        public DocumentSummaryInformation getDocSummaryInformation()
        Returns the document summary information metadata for the document.
        Returns:
        The Document Summary Information or null if it could not be read for this document.
      • getSummaryInformation

        public SummaryInformation getSummaryInformation()
        Returns the summary information metadata for the document.
        Returns:
        The Summary information for the document or null if it could not be read for this document.
      • getMetadataTextExtractor

        public POITextExtractor getMetadataTextExtractor()
        Returns an HPSF powered text extractor for the document properties metadata, such as title and author.
        Specified by:
        getMetadataTextExtractor in class POITextExtractor
        Returns:
        an instance of POITextExtractor that can extract metadata.
      • getRoot

        public DirectoryEntry getRoot()
        Returns the underlying DirectoryEntry of this document.
        Returns:
        the DirectoryEntry that is associated with the POIDocument of this extractor.
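
Because getDocSummaryInformation and getSummaryInformation are documented as returning null when the metadata could not be read, callers may want to guard each access. The helper below is one possible sketch using only the Java standard library. `readOrDefault` is a hypothetical name, not part of POI, and the demo uses a plain String in place of a real metadata object so that it runs without POI on the classpath.

```java
import java.util.Optional;
import java.util.function.Function;

public class NullSafeMetadata {

    /**
     * Reads one property from a metadata object that may be null.
     * Returns the fallback when the object itself, or the property
     * returned by the getter, is null.
     */
    public static <T> String readOrDefault(T info, Function<T, String> getter, String fallback) {
        return Optional.ofNullable(info).map(getter).orElse(fallback);
    }

    public static void main(String[] args) {
        // With POI available, a call might look like this (assumption,
        // not verified against a specific POI version):
        //   readOrDefault(extractor.getSummaryInformation(),
        //                 SummaryInformation::getTitle, "(untitled)");

        // Stand-in demo: a null String plays the role of missing metadata.
        String missing = null;
        System.out.println(readOrDefault(missing, String::trim, "(untitled)")); // prints "(untitled)"
        System.out.println(readOrDefault(" Report ", String::trim, "(untitled)")); // prints "Report"
    }
}
```

Wrapping the nullable return in Optional covers both failure cases in one expression: the summary object being absent and an individual property being unset.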