Package org.apache.poi.extractor
Class POIOLE2TextExtractor
- java.lang.Object
  - org.apache.poi.extractor.POITextExtractor
    - org.apache.poi.extractor.POIOLE2TextExtractor
- All Implemented Interfaces:
java.io.Closeable, java.lang.AutoCloseable
- Direct Known Subclasses:
EventBasedExcelExtractor, ExcelExtractor, HPSFPropertiesExtractor, OutlookTextExtactor, PowerPointExtractor, PublisherTextExtractor, VisioTextExtractor, Word6Extractor, WordExtractor
public abstract class POIOLE2TextExtractor extends POITextExtractor
Common parent for OLE2-based text extractors of POI documents, such as .doc and .xls. You will typically find the implementation of a given format's text extractor under org.apache.poi.[format].extractor.
- See Also:
ExcelExtractor, PowerPointExtractor, VisioTextExtractor, WordExtractor
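Because this class is abstract, it is normally used through one of the subclasses listed above. The following is a minimal sketch, assuming the POI main jar is on the classpath and that ExcelExtractor lives in org.apache.poi.hssf.extractor, as the package convention above suggests. The class name OleExtractorDemo and the sheet and cell contents are made up for illustration.

```java
import java.io.IOException;

import org.apache.poi.extractor.POIOLE2TextExtractor;
import org.apache.poi.hssf.extractor.ExcelExtractor;
import org.apache.poi.hssf.usermodel.HSSFWorkbook;

public class OleExtractorDemo {

    // Builds a tiny in-memory .xls workbook and returns its extracted text.
    static String extractText() throws IOException {
        HSSFWorkbook workbook = new HSSFWorkbook();
        workbook.createSheet("Demo").createRow(0).createCell(0).setCellValue("hello");

        // Hold the extractor through the common OLE2 parent type;
        // try-with-resources works because it implements Closeable.
        try (POIOLE2TextExtractor extractor = new ExcelExtractor(workbook)) {
            return extractor.getText();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(extractText());
    }
}
```

Typing the variable as POIOLE2TextExtractor keeps the calling code independent of the concrete format, so the same code path can in principle serve a WordExtractor or PowerPointExtractor.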
-
-
Field Summary
Fields
Modifier and Type | Field | Description
protected POIDocument | document | The POIDocument that's open
-
Constructor Summary
Constructors
Modifier | Constructor | Description
protected | POIOLE2TextExtractor(POIOLE2TextExtractor otherExtractor) | Creates a new text extractor, using the same document as another text extractor.
public | POIOLE2TextExtractor(POIDocument document) | Creates a new text extractor for the given document.
-
Method Summary
Modifier and Type | Method | Description
DocumentSummaryInformation | getDocSummaryInformation() | Returns the document information metadata for the document.
POIDocument | getDocument() | Returns the underlying POIDocument.
POITextExtractor | getMetadataTextExtractor() | Returns an HPSF powered text extractor for the document properties metadata, such as title and author.
DirectoryEntry | getRoot() | Returns the underlying DirectoryEntry of this document.
SummaryInformation | getSummaryInformation() | Returns the summary information metadata for the document.
Methods inherited from class org.apache.poi.extractor.POITextExtractor
close, getText, setFilesystem
-
-
-
-
Field Detail
-
document
protected POIDocument document
The POIDocument that's open
-
-
Constructor Detail
-
POIOLE2TextExtractor
public POIOLE2TextExtractor(POIDocument document)
Creates a new text extractor for the given document.
- Parameters:
document - The POIDocument to use in this extractor.
-
POIOLE2TextExtractor
protected POIOLE2TextExtractor(POIOLE2TextExtractor otherExtractor)
Creates a new text extractor, using the same document as another text extractor. Normally only used by properties extractors.
- Parameters:
otherExtractor - the extractor whose document is to be used
-
-
Method Detail
-
getDocSummaryInformation
public DocumentSummaryInformation getDocSummaryInformation()
Returns the document information metadata for the document.
- Returns:
The Document Summary Information, or null if it could not be read for this document.
-
getSummaryInformation
public SummaryInformation getSummaryInformation()
Returns the summary information metadata for the document.
- Returns:
The Summary Information for the document, or null if it could not be read for this document.
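Both summary getters above may return null, so callers need a guard before reading individual properties. The sketch below shows one way to do that, assuming the POI main jar is on the classpath. SummaryInfoDemo, titleOf, companyOf, demoTitle, and the fallback strings are hypothetical names chosen for illustration; the workbook is created in memory only so the example is self-contained.

```java
import java.io.IOException;

import org.apache.poi.extractor.POIOLE2TextExtractor;
import org.apache.poi.hpsf.DocumentSummaryInformation;
import org.apache.poi.hpsf.SummaryInformation;
import org.apache.poi.hssf.extractor.ExcelExtractor;
import org.apache.poi.hssf.usermodel.HSSFWorkbook;

public class SummaryInfoDemo {

    // Returns the title, or a fallback when the summary information
    // (or the title property itself) is missing.
    static String titleOf(POIOLE2TextExtractor extractor) {
        SummaryInformation si = extractor.getSummaryInformation();
        if (si == null || si.getTitle() == null) {
            return "(no title)";
        }
        return si.getTitle();
    }

    // Same guard for the document summary information.
    static String companyOf(POIOLE2TextExtractor extractor) {
        DocumentSummaryInformation dsi = extractor.getDocSummaryInformation();
        if (dsi == null || dsi.getCompany() == null) {
            return "(no company)";
        }
        return dsi.getCompany();
    }

    // Creates a workbook with a title set, then reads it back via the extractor.
    static String demoTitle() throws IOException {
        HSSFWorkbook workbook = new HSSFWorkbook();
        workbook.createInformationProperties(); // a new workbook has no property sets yet
        workbook.getSummaryInformation().setTitle("Quarterly report");
        try (POIOLE2TextExtractor extractor = new ExcelExtractor(workbook)) {
            return titleOf(extractor);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(demoTitle());
    }
}
```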
-
getMetadataTextExtractor
public POITextExtractor getMetadataTextExtractor()
Returns an HPSF powered text extractor for the document properties metadata, such as title and author.
- Specified by:
getMetadataTextExtractor in class POITextExtractor
- Returns:
an instance of POITextExtractor that can extract metadata.
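When the properties are wanted as plain text rather than as typed values, the metadata extractor can be asked for its text directly. This is a rough sketch under the same assumptions as before (POI main jar on the classpath, in-memory workbook); MetadataTextDemo, metadataText, and the author value are illustrative names only, and the exact layout of the returned text is not specified here.

```java
import java.io.IOException;

import org.apache.poi.extractor.POIOLE2TextExtractor;
import org.apache.poi.extractor.POITextExtractor;
import org.apache.poi.hssf.extractor.ExcelExtractor;
import org.apache.poi.hssf.usermodel.HSSFWorkbook;

public class MetadataTextDemo {

    // Sets an author on a new workbook and returns the metadata as text.
    static String metadataText() throws IOException {
        HSSFWorkbook workbook = new HSSFWorkbook();
        workbook.createInformationProperties(); // create empty property sets first
        workbook.getSummaryInformation().setAuthor("Jane Doe");

        try (POIOLE2TextExtractor extractor = new ExcelExtractor(workbook)) {
            // The metadata extractor shares the main extractor's document.
            POITextExtractor metadata = extractor.getMetadataTextExtractor();
            return metadata.getText();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(metadataText());
    }
}
```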
-
getRoot
public DirectoryEntry getRoot()
Returns the underlying DirectoryEntry of this document.
- Returns:
the DirectoryEntry that is associated with the POIDocument of this extractor.
-
getDocument
public POIDocument getDocument()
Returns the underlying POIDocument.
- Specified by:
getDocument in class POITextExtractor
- Returns:
the underlying POIDocument
-
-